HPC Handbook: Parallel Computing Primer
November 2, 2015
Editor’s Note: This is an excerpt of Chapter 3 from The Design Engineer’s High-Performance Computing Handbook. Download the complete version here.
Few technology advances have had quite the same impact on design software as the advent of parallelization or parallel processing. By taking advantage of the power of multiple cores and multiple CPUs and GPUs (locally or on multiple servers or cloud resources), engineers and designers have been able to quickly and cost-effectively wrangle very large simulation and rendering tasks that previously would have required outsourcing and many hours (or days) of delays.
Parallelism leverages concurrency to gain better performance. Parallel computing divides these very large simulation problems into smaller pieces and solves each piece simultaneously on multiple cores, processors or computers. As a result, you can get your answer sooner (lower latency) or find more answers in the same amount of time (higher throughput).
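To make that concrete, here is a minimal sketch (not from the handbook) of the divide-and-solve pattern using Python's standard multiprocessing module. The solve_chunk function and its sum-of-squares workload are illustrative stand-ins for a real simulation kernel:

```python
from multiprocessing import Pool

def solve_chunk(chunk):
    """Stand-in for the per-piece computation (e.g., one mesh partition)."""
    return sum(x * x for x in chunk)

if __name__ == "__main__":
    data = list(range(1_000_000))
    n_workers = 4
    size = len(data) // n_workers
    # Split one large problem into smaller pieces...
    chunks = [data[i * size:(i + 1) * size] for i in range(n_workers)]
    chunks[-1].extend(data[n_workers * size:])  # fold in leftover elements

    # ...solve the pieces simultaneously, then combine the partial results.
    with Pool(n_workers) as pool:
        partials = pool.map(solve_chunk, chunks)
    print(sum(partials))
```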
“If you are handling and processing a lot of data, you are ripe for using parallelism,” says James Reinders, parallel programming evangelist at Intel.
Parallelism provides an opportunity to direct the multiple functions of a computer system in a fashion similar to conducting an orchestra. “How can the system use multiple resources in concert, rather than one at a time? You have to think about how you are using the total computer system,” Reinders says.
This is different from multi-threading, in which a single CPU or core executes multiple processes or threads concurrently by splitting data and tasks into sub-tasks on shared memory. On a multi-core CPU with Hyper-Threading, for example, several dozen threads can run concurrently across the cores, each working through its own sequence of tasks. “Multi-threading is for in-process computation, while parallel computing is for out-of-process computations,” says Silvina Grad-Freilich, MathWorks’ senior manager for parallel computing and deployment products.
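The distinction Grad-Freilich draws can be seen in a few lines of standard Python. This is an illustrative sketch, not MathWorks code, and the function and variable names in it are hypothetical:

```python
import threading
from multiprocessing import Process, Queue

counts = [0, 0]  # threads share this list directly (shared memory)

def threaded_work(slot):
    counts[slot] += 1  # in-process: mutate shared state

def process_work(q):
    q.put("done")  # out-of-process: separate memory, explicit message

if __name__ == "__main__":
    # Two threads in one process share the same address space.
    threads = [threading.Thread(target=threaded_work, args=(i,))
               for i in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print(counts)  # -> [1, 1]

    # A child process gets its own memory; results come back via a Queue.
    q = Queue()
    p = Process(target=process_work, args=(q,))
    p.start()
    print(q.get())  # -> done
    p.join()
```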
Parallel processing can be enabled via modern CPUs or GPUs. The CPU is good at loading and holding data in its caches so that the same data can be reused over and over again. GPUs excel at performing the same (or similar) computation on many different pieces of data at once.
“GPU accelerators complement CPUs to provide the best app performance for end users,” says Will Ramey, senior product manager of accelerated computing at NVIDIA.
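As a rough illustration of the data-parallel pattern GPUs excel at, the sketch below applies one arithmetic expression to every element of a large array. It assumes the third-party CuPy library and an NVIDIA GPU are available; NumPy runs the identical expression on the CPU for comparison:

```python
import numpy as np
import cupy as cp  # assumption: CuPy is installed and a GPU is present

x_cpu = np.linspace(0.0, 1.0, 10_000_000)
y_cpu = 3.0 * x_cpu * x_cpu + 2.0 * x_cpu + 1.0   # CPU: cores + SIMD units

x_gpu = cp.asarray(x_cpu)                          # copy data to GPU memory
y_gpu = 3.0 * x_gpu * x_gpu + 2.0 * x_gpu + 1.0   # GPU: thousands of threads,
                                                   # same operation, different data
assert np.allclose(y_cpu, cp.asnumpy(y_gpu))
```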
Faster, Better Results
With software applications that have been designed to take advantage of parallelism, engineers can complete complex tasks much faster.
“With parallelized software, our engineering customers can really achieve enhanced engineering productivity by accelerating simulation throughput,” says Wim Slagter, lead product manager for high performance computing at ANSYS. “It also helps them to make more efficient product development decisions.”
Engineers can also achieve higher fidelity insight into product performance that couldn’t be gained any other way. “Parallelized software capabilities allow engineers to simulate larger models and more complex models, so that more accurate design decisions can be made throughout the design or development cycle,” Slagter adds.
“If the simulation runs faster, then you can do more iterations of exploring parameters of the design space much faster,” Ramey says. “You can refine those designs faster, which results in better products.”
Where Core Count Matters
Where this approach to computation matters most in design is typically in simulation and rendering involving large amounts of data. Computational fluid dynamics (CFD) simulations, for example, benefit from higher core counts. At a large aerospace company, for instance, huge simulations depend on throwing large numbers of cores at the computations.
“The applications that benefit the most are those that have computations heavy enough that they can benefit from breaking the data into chunks,” says Grad-Freilich. “You need to make the right decision for the technology that you use based on your actual application needs.”
However, not every task within an application can be parallelized. Companies may not see the speed-up they expect if there are lengthy sequential processes involved in the solution. “For example, an application may have a sequential part that runs before and after the other computational work, so the overall speedup of the application is less than linear because of the serial nature of the beginning and end processes,” Grad-Freilich says.
That limit on acceleration is explained by Amdahl’s Law, which caps the theoretical speed-up of a parallel application by the fraction of work that must run serially. Because some processes are inherently serial, there is a point beyond which adding more parallelism yields no further gain. Where that point falls depends on the type of work involved.
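The formula behind Amdahl’s Law is short enough to state directly: if a fraction p of the work can run in parallel on n processors, the speedup is at most 1 / ((1 - p) + p / n). A few lines of Python (an illustrative sketch, not from the handbook) show how quickly that bound saturates:

```python
def amdahl_speedup(p, n):
    """Theoretical speedup with parallel fraction p on n processors."""
    return 1.0 / ((1.0 - p) + p / n)

# Even with 95% of the work parallelized, speedup can never exceed
# 1 / 0.05 = 20x, no matter how many cores are added.
for n in (4, 16, 128, 1024):
    print(n, round(amdahl_speedup(0.95, n), 1))
# 4 -> 3.5, 16 -> 9.1, 128 -> 17.4, 1024 -> 19.6
```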
There are software applications that don’t benefit from parallelism, or that stop benefiting beyond a certain core count. “Mechanical simulations, for example, do not scale up to tens of thousands of cores,” Slagter says. “We’ve shown in our latest release that it scales up to 128 cores for a whole suite of different benchmarks.”
“If you have data that must be operated on sequentially, and there’s only a small amount of computation for each piece of data, that application would not benefit from parallelization,” Ramey adds. “The good news is that when smart programmers get engaged, the algorithms in applications that are fundamentally serial can often be redesigned so they can run in parallel.”
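A classic example of the redesign Ramey describes is summation: a left-to-right running total carries a serial dependency from step to step, but because addition is associative the same result can be computed as a balanced tree of independent pairwise sums. The sketch below (illustrative, not from the article) compares the two forms; in a real parallel implementation, each tree level would execute simultaneously:

```python
def serial_sum(xs):
    total = 0
    for x in xs:          # step i depends on step i-1: inherently serial
        total += x
    return total

def tree_sum(xs):
    while len(xs) > 1:
        if len(xs) % 2:   # pad odd-length levels with a neutral element
            xs = xs + [0]
        # every pairwise sum at this level is independent of the others,
        # so one whole level could run in parallel
        xs = [xs[i] + xs[i + 1] for i in range(0, len(xs), 2)]
    return xs[0]

data = list(range(1, 101))
assert serial_sum(data) == tree_sum(data) == 5050
```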
Modern design workstations typically have multiple processors, which provide HPC-like capabilities on the desktop. Software of all types has been designed to take better advantage of those capabilities.
“We keep making our data problems bigger and bigger,” Reinders says. “Because of that, computers can get more use from parallelism as time goes on. As long as we use more and more data, then parallelism will be required.”
The majority of engineering and design software has been architected to take advantage of parallelism. Some software providers have struggled with building tools to take advantage of this approach because of inexperience or because the software was originally designed when single-core was the standard.
“The question becomes how much does this perturb the way the program was written originally,” Reinders says. “There are a lot of applications that were written when there was just one core in a machine. This can affect the architecture of the application in a profound way.”
Developing software that takes advantage of these capabilities also requires sustained HPC software development effort to effectively leverage the hardware. “The software should be able to support simulation resources where they are located,” Slagter says. “Certifying remote software solutions is important, along with expertise and support from HPC partners like Intel, NVIDIA, Hewlett Packard, etc., so you can make sure that you have an optimized reference architecture for the software and good support.”
About the Author
DE Editors
DE’s editors contribute news and new product announcements to Digital Engineering.
Press releases may be sent to them via [email protected].