High-performance computing (HPC) clusters anywhere [part 2]
HPC clusters can now be deployed almost anywhere. What started with research-focused supercomputing clusters moved on to dedicated HPC clusters, and, thanks to continual improvements in technology, running HPC clusters in the cloud has grown hugely in popularity in recent years. Some organisations even combine these options, running a dedicated, cost-optimised local cluster and bursting into the cloud as needed, taking advantage of hybrid cloud methodologies for clustered computing. And with computing power, use cases, and needs ever growing, some are even turning to running their HPC clusters on the edge.
HPC in the cloud
Advancements in cloud computing, networking, and storage now make it possible to run HPC workloads in the cloud at scales rarely seen before. Many of the public cloud providers have specialised resources with deep foundations in the HPC solution space, available for consumption by organisations of all sizes. Cloud computing has been key in delivering HPC to organisations that need to burst or scale beyond what is reasonable with a dedicated cluster, or that just want a small experimental cluster to get started with HPC without the infrastructure investment a private cluster requires. It also provides resources for experimentation or testing that might not be available in an organisation's dedicated cluster, for example GPUs, FPGAs, or other architectures that are still in the early phase of adoption.
Amazon Web Services
AWS has been one of the key players in driving innovation in public cloud services for HPC. Their implementation of the AWS Nitro System was key to eliminating virtualisation overhead and enabling direct access to the underlying host hardware. This drove down latency and increased performance, both vital to running HPC clusters and workloads in a public cloud. To deliver on the inter-node communication demands of HPC workloads, they introduced the Elastic Fabric Adapter, which reduces latency and increases performance for workloads that communicate across nodes and require a high-performance interconnect. To cover the storage needs of HPC users, Amazon added a specialised storage offering based on Lustre, called Amazon FSx for Lustre. Alongside that, they have scheduling offerings such as AWS ParallelCluster and AWS Batch.
Microsoft Azure
Azure is a key player in driving HPC in the public cloud and provides strong instance types built on traditional HPC technologies such as InfiniBand, which provides RDMA functionality for optimal latency and performance. They also have instance types with a reduced number of cores exposed to the workload, catering to workloads that are limited primarily by memory bandwidth rather than by available cores. They even have an offering that delivers supercomputers as a service, their Cray solution, along with HPC-focused storage in the form of Cray ClusterStor.
Google Cloud
Google Cloud Platform offers pre-configured HPC VMs that are well documented for the user. They take a very documentation-driven approach to enabling HPC workloads, with clear guides on everything from MPI workloads to HPC images, giving users clear, practical information on how to get the most out of GCP for HPC workloads.
Oracle Cloud Infrastructure
Oracle was an early player in the enablement of HPC in public clouds. They take a bare-metal approach to HPC in the public cloud, offering bare metal instance types with ultra-low-latency RDMA networking, delivering a solution close to what one might expect from a dedicated private HPC cluster.
Dedicated private HPC clusters
Dedicated private clusters remain a solid option for those looking to optimise for cost and control, or to meet very specific data ownership or security requirements. Plenty of solutions exist that give users cloud-like management of local on-premise resources. The main challenge with private HPC clusters is the high upfront investment and the expertise required. This can be mitigated by working with Canonical and its partners, which gives you access to expert knowledge and solutions that make adoption more feasible, and can deliver savings in total cost even though an upfront investment is still required.
Hybrid HPC
Hybrid usage of local and public cloud resources is very popular in the HPC space, giving users the cost optimisation and control of on-premise resources along with the extreme scalability of public cloud clusters. By its nature, hybrid cloud usage carries the challenges of both public and private cloud clusters while also bringing out many of the potential benefits of both. In a way, it delivers a complementary solution where the negatives of one are mitigated by the positives of the other. The main additional challenge of such a setup is increased complexity, but overall it can bring greater resiliency. With solutions in both public and private cloud, Canonical can help you simplify that complexity.
HPC on the edge
Many HPC workloads, especially those that require real-time processing or are extremely latency-sensitive, are now being deployed on the edge. These often run on small clusters, or even on a single, very focused machine referred to as a high-performance computer (HPC). Despite sharing an abbreviation with high-performance computing, this refers to a single high-performance computer rather than a cluster. HPC on the edge can be seen in anything from telco edge deployments for 5G to automotive, where a high-performance computer might reside in a car to process very product-focused workloads such as image recognition or interpreting LiDAR data for autonomous vehicles. The main challenge with HPC on the edge is that deployment and maintenance are complicated by limited access, which makes managing the solution over its lifetime a hurdle. Thankfully, we offer solutions that can solve that problem.
HPC with Canonical
Our solutions, such as Ubuntu, are available for high-performance computing needs across clouds, where you have access to an almost unlimited pool of computational resources. We offer solutions such as Charmed OpenStack, which lets you run your own cost-optimised cloud delivering the most value for performance with full sovereignty, and MAAS for those looking for the ultimate cloud experience in bare metal cluster management, delivering the ultimate performance and flexibility. Or you can have any combination of these and go with a hybrid cloud strategy. Whatever your requirements, and no matter the size of the computation, Canonical has the solutions for you. Contact us and we’ll help you map out your needs.
Summary
This blog has introduced the key public cloud players in the HPC cluster space and how they are driving innovation and evolution in HPC solutions, along with an introduction to hybrid clouds and clusters and HPC on the edge.
If you are interested in more information, take a look at the previous blog in the series, “What is High-performance computing (HPC)?”, see how Scania is mastering multi-cloud for HPC systems with Juju, or dive into some of our other HPC content.
In the next blog, we will go into the history of HPC and Supercomputing where we will cover how it all started and how it developed into the HPC we see today.
What is High-performance computing (HPC)? [part 1]
High-performance computing is the practice of combining computational resources so they can be used as a single resource. The combined resources are often referred to as a supercomputer or a compute cluster. This is done to deliver the computational intensity needed to process complex computational workloads and applications at high speed and in parallel. Those workloads require computing power and performance that is often beyond the capabilities of a typical desktop computer or workstation.
What are HPC clusters?
HPC clusters are often made up of a group of servers, generally referred to as compute nodes. Some clusters are as small as a few nodes, while others have hundreds or even thousands of compute nodes, all connected over a network so they can work together to solve advanced computational workloads. These networks often use high-speed interconnects or other low-latency networking solutions to reduce communication latency. The workloads often have significant storage requirements, either in terms of data size or throughput, so it is common to deploy both a high-performance storage solution, often referred to as scratch and used for in-flight computation storage, and a general-purpose solution for user data, applications, and archival. To use all these resources as effectively as possible and in parallel, a message passing interface, generally referred to as MPI, is often used.
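Real MPI programs need an MPI implementation (such as Open MPI) and a cluster to run on, so as a rough, self-contained sketch of the same scatter/compute/gather message-passing pattern, here is a toy example using only Python's standard library. It runs worker processes on a single machine rather than compute nodes over an interconnect, so it is an analogy for the pattern, not MPI itself:

```python
from multiprocessing import Process, Pipe

def _worker(conn, rank):
    # Each "node" receives its slice of the data, computes a
    # partial result, and sends it back over its pipe -- the
    # same send/receive pattern MPI programs use across nodes.
    chunk = conn.recv()
    conn.send(sum(chunk))
    conn.close()

def parallel_sum(data, nprocs=4):
    """Scatter `data` across processes, then gather the partial sums."""
    conns, procs = [], []
    for rank in range(nprocs):
        parent, child = Pipe()
        p = Process(target=_worker, args=(child, rank))
        p.start()
        parent.send(data[rank::nprocs])  # round-robin scatter
        conns.append(parent)
        procs.append(p)
    total = sum(conn.recv() for conn in conns)  # gather
    for p in procs:
        p.join()
    return total

if __name__ == "__main__":
    print(parallel_sum(list(range(100))))  # prints 4950
```

In a real MPI program the scatter, compute, and gather steps would be collective operations running across nodes, with the interconnect carrying the messages instead of local pipes.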
The workloads generally run in batches and are managed by a batch scheduler. This scheduler is a vital component of HPC clusters: it keeps track of the available resources of the cluster, queues workloads when there are not sufficient computational resources, and efficiently assigns workloads to resources as they become available.
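Production schedulers such as Slurm or PBS are far more sophisticated, but the core queueing logic described above can be sketched in a few lines of Python. This is a toy illustration under simplified assumptions (cores as the only resource, first-in-first-out ordering), not a real scheduler:

```python
from collections import deque

class BatchScheduler:
    """Toy sketch of a batch scheduler's core logic: track free
    resources, queue jobs that do not fit, and start queued jobs
    as resources are released."""

    def __init__(self, total_cores):
        self.free_cores = total_cores
        self.queue = deque()   # jobs waiting for resources
        self.running = {}      # job name -> cores held

    def submit(self, name, cores):
        if cores <= self.free_cores:
            self._start(name, cores)
        else:
            self.queue.append((name, cores))  # not enough cores: wait

    def _start(self, name, cores):
        self.free_cores -= cores
        self.running[name] = cores

    def complete(self, name):
        # Release the job's cores, then start queued jobs that now fit.
        self.free_cores += self.running.pop(name)
        while self.queue and self.queue[0][1] <= self.free_cores:
            self._start(*self.queue.popleft())

sched = BatchScheduler(total_cores=8)
sched.submit("sim-a", 6)   # starts immediately, 2 cores left free
sched.submit("sim-b", 4)   # queued: only 2 cores free
sched.complete("sim-a")    # frees 6 cores, so sim-b starts
print(sorted(sched.running))  # prints ['sim-b']
```

Real schedulers add priorities, backfilling, fair-share policies, and multi-dimensional resources (memory, GPUs, licences), but the keep-track, queue, and assign cycle is the same.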
HPC clusters and solutions are available anywhere and can be deployed on-premise, in the cloud, as a hybrid solution combining both, or even on the edge.
What are the main use cases of HPC?
HPC is used to solve some of the most advanced and toughest computational problems we have today. These problems exist in all types of environments, such as science, engineering, or business. Thanks to HPC, computational fluid dynamics (CFD) workloads can simulate the flow of fluids to solve problems in numerous fields, such as aerodynamics, industrial system design, weather simulation, and many more. High-performance data analytics (HPDA) is the combination of HPC and data analytics, where parallel processing and powerful analytics are used to analyse huge data sets at incredible speeds. This is used for anything from real-time data analysis and high-frequency stock trading to some of the highly complex analytics problems found in scientific research. Large computational clusters are also used to render whole movies or create visual effects for certain scenes. Genome processing and sequencing is another field that needs HPC, due to the huge data sets that are analysed and interpreted to figure out hereditary conditions or other medical anomalies. HPC can even be used to solve large-scale logistics and supply problems such as those found in retail. Whatever the need, HPC has the ability to solve it.
Summary
This blog has introduced you to high-performance computing, the components that make up HPC clusters, and how they are used to solve the toughest computational problems we have today. We have also covered some of the many ways high-performance computing is used in industry, with a brief overview of computational fluid dynamics and high-performance data analytics.
If you are interested in more information, take a look at how Scania is mastering multi-cloud for HPC systems with Juju, dive into our blog on data centre automation for HPC, or explore some of our other HPC content.
In the next blog, we will be highlighting some of the many ways an HPC cluster can be deployed.