Storage for High Performance Computing
You can accomplish your work faster if you pay attention to your hard disk.
March 2, 2009
By Peter Varhol
Most desktop operating systems, including Windows, let you adjust the size of virtual memory on a hard disk. Giving yourself more virtual memory lets you run more applications, but you will probably also incur more disk accesses.
Hard disks have traditionally been the slowest part of a computer system, often by a factor of 1,000 or more compared to memory and the CPU. This affects the speed with which an application launches, and if the application is loading large data files, the process slows substantially.
But there’s still more to the performance problem than file loading. It’s been the computer industry’s dirty little secret that today’s high-performance applications are more often disk-bound than CPU- or memory-bound. That is a consequence of the nature of multitasking operating systems such as Windows and Linux. While more memory always helps application performance, the problem doesn’t go away.
However, with faster hard disks on the market and greater use of solid-state storage, storage is gradually enabling higher performance in both launching and running complex applications. It benefits engineers by enabling ever-more complex applications to run faster, cutting at least some amount of time off the running of the software tools of the trade. And it can cost little or nothing to achieve this improvement in performance.
All modern operating systems use what’s termed “virtual memory” or “swap space” on the disk. In Windows, if you look in the System Properties utility in the Control Panel, you’ll find that you can define the amount of disk space to allocate for that purpose.
Operating systems let you load and run more applications at the same time than the computer has physical memory to hold. They do so by keeping only a portion of each application in memory at one time, known as the “working set.” The rest of the application resides on the disk and is “swapped” into memory when that code or data is needed. So, in practice, anyone using a computer is constantly accessing the hard disk.
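As a quick illustration of how much a system actually leans on its swap space, a short script can report current usage. This is a minimal sketch that assumes the third-party psutil package is installed; the numbers it prints are whatever your operating system reports, and it is not tied to Windows’ System Properties dialog.

```python
import psutil  # third-party package: pip install psutil

# Report physical memory and swap (page file) usage as the OS sees it.
ram = psutil.virtual_memory()
swap = psutil.swap_memory()

print(f"RAM : {ram.used / 2**30:5.1f} GB used of {ram.total / 2**30:5.1f} GB")
print(f"Swap: {swap.used / 2**30:5.1f} GB used of {swap.total / 2**30:5.1f} GB "
      f"({swap.percent:.0f}% full)")

# Sustained, heavy swap usage suggests the working sets of your running
# applications exceed physical memory, so the disk is doing extra work.
```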
Further, it’s easy to figure out the performance hit you take when you do so. In the early 1990s, disks typically spun at 3,600 rpm, which resulted in an average rotational latency of 8.3 milliseconds. Today, a standard disk typically rotates at 5,400 rpm, so it completes one rotation every 1/5,400 of a minute, or about 11 milliseconds. On average, the disk has to rotate halfway around before the desired data passes under the read head, so the average latency is on the order of five to six milliseconds.
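The arithmetic behind those figures is simple enough to write out. The snippet below uses only the rotational speeds quoted in this article and computes the average rotational latency, which is the time for half a revolution.

```python
# Average rotational latency: on average, half a revolution must pass
# under the read head before the requested sector arrives.
def avg_rotational_latency_ms(rpm):
    ms_per_revolution = 60_000.0 / rpm  # 60,000 ms in a minute
    return ms_per_revolution / 2.0

for rpm in (3600, 5400, 7200, 10_000):
    print(f"{rpm:>6} rpm: {avg_rotational_latency_ms(rpm):4.1f} ms average rotational latency")
```

Running it gives 8.3 ms at 3,600 rpm, 5.6 ms at 5,400 rpm, 4.2 ms at 7,200 rpm, and 3.0 ms at 10,000 rpm, which are the figures used throughout this article.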
Because memory accesses and CPU clock cycles are measured in nanoseconds and microseconds rather than milliseconds, both are far faster than the hard disk. So in many cases the hard disk is the bottleneck, and improving hard-disk performance makes a much larger impact than just about anything else you can do for your computer. There are incremental improvements in the electronics and caching surrounding disk drives and controllers today, but the real value is in speeding up the rotation of the hard disks themselves.
Faster Hard Disks Are Here
Hard disks are essentially the only part of a computer that relies on mechanically driven moving parts. Fast as they are, they are painfully slow compared to the rest of the computer. As stated, most hard drives in standard systems today spin at 5,400 rpm. We are now seeing an increase in hard-disk rotational speeds, but while these increases look impressive on the surface, they still don’t begin to match the performance of the computer’s electronics.
However, higher rotational speeds translate directly into application performance. High-performance disks have a rotational speed of 7,200 rpm, which works out to an average rotational latency of about 4.2 milliseconds. The very fastest disks today rotate at 10,000 rpm, giving them a latency of three milliseconds.
Other factors also play into disk performance, such as the amount of cache on the disk controller and elsewhere in the system’s data path. Caching algorithms work on the premise of locality: an application is likely to reuse code or data it accessed recently, or to need data stored in the same area of the disk. So hard-disk controllers keep such data in significantly faster cache storage, frequently built from solid-state chips or even dynamic random access memory (DRAM). The result is that many disk accesses can be served from the cache, if the cache is big enough.
It makes sense that the bigger the cache, the better. The more the cache can hold, the more likely it is that the code or data the application needs will already be there. But cache capacity is limited, and its contents are frequently evicted and reloaded; whenever the needed data is not in the cache, it must come from the disk, slowing overall performance.
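To make the locality premise concrete, here is a toy simulation of a least-recently-used (LRU) cache, one common caching policy; the article doesn’t say which policies real disk controllers use, so treat this purely as an illustration. With an access pattern that favors a small “hot” set of blocks, most requests end up served from the cache.

```python
from collections import OrderedDict
import random

# Toy LRU cache: counts how many block requests are served from cache
# versus how many would have to go to the disk.
class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()
        self.hits = 0
        self.misses = 0

    def access(self, block):
        if block in self.blocks:
            self.blocks.move_to_end(block)       # mark as most recently used
            self.hits += 1
        else:
            self.misses += 1
            self.blocks[block] = True
            if len(self.blocks) > self.capacity:
                self.blocks.popitem(last=False)  # evict least recently used

cache = LRUCache(capacity=64)
hot = list(range(50))  # a small "working set" of disk blocks
for _ in range(10_000):
    # 90% of requests hit the hot set (temporal locality); 10% are random.
    block = random.choice(hot) if random.random() < 0.9 else random.randint(0, 9999)
    cache.access(block)

print(f"served from cache: {cache.hits}, went to disk: {cache.misses}")
```

With this 90/10 access pattern, the cache serves the large majority of requests, and only the misses would ever touch the platters.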
Solid-State Storage
A solid-state drive (SSD) is a data storage device that uses solid-state memory to store applications and other persistent data, just as a rotating hard disk would. Fundamentally, it consists of nonvolatile memory chips, which retain their stored data even while the system is turned off. Conceptually, they are similar to the chips used in USB flash drives.
You generally can’t get as much solid-state storage as you can with a rotating hard disk. That has little to do with technical limitations and more to do with cost. While nonvolatile solid-state storage has come down in price in recent years, it still costs significantly more per GB than equivalent disk storage, and that will probably remain the case for a long time to come.
Also, don’t count on an SSD to be as fast as your system’s DRAM. Solid-state storage is far faster than a rotating disk, but its access times still fall well short of main memory.
That said, SSD capacities are rapidly growing. Micron Technology (Boise, ID), for example, is currently sampling a 320 GB chip set. While computer engineers still have to fashion a storage architecture around those chips, it won’t be a matter of building an entire hard drive and controller. Expect to see SSD capacities in this range available on computers in the near future.
Getting Optimal Disk Performance
Other characteristics also factor into disk performance. For example, files on the disk can become fragmented as the operating system looks for enough disk space to store them. The OS prefers to store files contiguously on disk, so that only one access is needed in order to get the entire file. However, as files and applications are saved, written to, and deleted, that’s frequently not possible.
This can mean that the disk has to make several accesses to load an application or to retrieve and save files, effectively doubling or tripling the time it takes to perform common tasks. That performance hit often goes unnoticed unless you have some idea of the level of file fragmentation on the computer.
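A rough model shows where the doubling or tripling comes from: every extra fragment costs roughly one more seek plus one more rotational delay before the drive can stream data again. All the numbers below are illustrative assumptions, not measurements of any particular drive.

```python
# Illustrative model of fragmented file reads (assumed, typical-looking numbers).
SEEK_MS = 9.0               # assumed average seek time
LATENCY_MS = 5.6            # average rotational latency at 5,400 rpm (see above)
TRANSFER_MS_PER_MB = 15.0   # assumed sustained transfer cost (~67 MB/s)

def read_time_ms(file_mb, fragments):
    # One seek plus one rotational delay per fragment, plus the raw transfer time.
    return fragments * (SEEK_MS + LATENCY_MS) + file_mb * TRANSFER_MS_PER_MB

for fragments in (1, 2, 5, 20):
    t = read_time_ms(5, fragments)  # a 5 MB file, e.g. an application library or data file
    print(f"{fragments:>2} fragment(s): {t:5.0f} ms to read a 5 MB file")
```

Even in this simple model, a badly fragmented small file takes several times longer to read than a contiguous one.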
Fortunately, Windows has a built-in disk defragmenter, hidden in the Accessories menu. You can also buy commercial disk defragmenters that can show you the state of your disk, and also run the defragmenter continuously during computer use. The advantage here is that the defragmenter can move files around in storage and piece them together so that in just about all cases, loading your applications and files will take just a single disk access apiece.
Taking into account everything a power user can do to speed up hard-drive performance, here is a prescription for improving your overall system performance; a simple timing sketch after the list shows one way to check whether scattered disk reads are hurting you.
Get the hard disk with the highest rotational speed. More than anything else, a hard disk with the highest possible rotational speed is going to make your applications run faster.
Make sure you have a large disk cache. The more cache in the data path, whether on the drive itself or in system memory, the more requests can be served without touching the platters, and it will almost certainly improve your overall performance.
Defragment your disk often. No matter how fast your disk is, if the read arm has to visit multiple locations to gather file fragments, performance will suffer. Either run the Windows disk defragmenter every week, or get one that runs continuously.
Keep an eye on solid-state storage. In time, most computer storage is likely to be solid-state. If the prices and capacities available meet your requirements, go for this alternative over a traditional hard disk.
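As a sanity check after making any of these changes, the sketch below times sequential versus scattered reads of one large file. The file path is a placeholder; point it at any existing file, ideally one larger than your RAM so the operating system’s file cache doesn’t hide the disk’s behavior.

```python
import os, random, time

# Hypothetical path: replace with a real, large file on the disk you want to test.
PATH = "big_test_file.bin"
BLOCK = 64 * 1024  # read in 64 KB chunks

size = os.path.getsize(PATH)
offsets = list(range(0, size - BLOCK, BLOCK))

def timed_read(order):
    with open(PATH, "rb") as f:
        start = time.perf_counter()
        for off in order:
            f.seek(off)
            f.read(BLOCK)
        return time.perf_counter() - start

sequential = timed_read(offsets)   # read the file front to back
random.shuffle(offsets)
scattered = timed_read(offsets)    # read the same blocks in random order

print(f"sequential: {sequential:.2f} s   scattered: {scattered:.2f} s")
# On a rotating disk the scattered pass is typically several times slower;
# on an SSD the gap is far smaller.
```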
No matter what your budget for a high-performance computer, you can get a faster machine through judicious selection of mass storage. In some cases, such as adding cache or using an SSD, it will cost incrementally more than a standard configuration, but if you’re sitting there for long periods listening to the disk churn, you’ll agree that a few more dollars are worth it.
More Info:
Micron Technology
Boise, ID
Microsoft
Redmond, WA
For more information on this topic, please visit deskeng.com.
Peter Varhol has been involved with software development and systems management for many years. Send comments about this column to DE-Editors@deskeng.com.