Hardware Performance Theory

By Roger Keays, 10 May 2008

Blog-o'clock!! Hooray (*dances*). Hmmm... I like to know about hardware. You can keep your soldering iron to yourself, but feel free to tell me all about the specs and architecture :) . I also like things which are light, fast and running at their optimal capacity. So I made this little high-level overview of typical computer hardware systems and how their components affect the overall performance of the system. Hope you enjoy!

1. Processor architecture (e.g. Intel Pentium 4)

A CPU is just like a fancy calculator, but with more registers and more operations. A simple calculator only has one register because you can only work on one number at a time. Modern CPUs might have as many as 128 registers and many, many operations. The brands differ from each other in what operations they offer, although it's pretty difficult to say whether or not this is going to make any difference to you in the big picture. To a first approximation, a single CPU can only do one operation at a time.

2. Number of processing units (e.g. 1)

New computers normally have more than one processor which allows multiple operations to be done simultaneously. This is particularly useful for multitasking or threaded applications. The processing units can be squeezed onto a single chip (multicore) or you can have them on separate chips (multiprocessor).

3. Processor clock speed (e.g. 3 GHz)

This is how many clock cycles the CPU runs per second, which gives a rough idea of how many operations it can do per second. With your calculator you would be pushing it to reach 1 operation per second. A 3 GHz CPU ticks 3000 million times per second. It might be worth noting that, depending on your use cases, you probably won't often need to do this much work. Most desktop CPUs are idle a great deal of the time.
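To put some numbers on that, here is a small Python sketch that turns a clock speed into ticks per second and the length of one tick. The function names are made up for illustration, and treating one tick as one operation is a simplification.

```python
def cycles_per_second(clock_ghz):
    """Clock speed in GHz converted to clock ticks per second."""
    return clock_ghz * 1_000_000_000


def nanoseconds_per_cycle(clock_ghz):
    """Length of a single clock tick in nanoseconds."""
    return 1.0 / clock_ghz


if __name__ == "__main__":
    print(cycles_per_second(3))      # 3000 million ticks per second
    print(nanoseconds_per_cycle(3))  # about a third of a nanosecond
```

So at 3 GHz each tick lasts about a third of a nanosecond, which is why waiting on slower components matters so much in the sections that follow.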

4. Processor cache sizes (e.g. 1MB/2MB)

When the data the CPU needs isn't in its registers, it must fetch it from RAM. This is comparatively slow, so the CPU can cache the most used data from the RAM. Most CPUs have 2 or 3 caches of different sizes and speeds. Multicore CPUs usually share the same on-chip cache and multiprocessor systems may share an off-chip cache.
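One rough way to see why a cache helps is to work out the average access time for a given hit rate. The sketch below uses a simple weighted average. The function name and the latencies (1 ns for cache, 100 ns for RAM) are illustrative assumptions, not measurements of any real CPU.

```python
def average_access_ns(hit_rate, cache_ns, ram_ns):
    """Average access time: cache hits are fast, misses go out to RAM."""
    return hit_rate * cache_ns + (1.0 - hit_rate) * ram_ns


if __name__ == "__main__":
    # Assumed latencies: 1 ns for a cache hit, 100 ns for a trip to RAM.
    for hit_rate in (0.0, 0.5, 0.9, 1.0):
        print(hit_rate, average_access_ns(hit_rate, 1.0, 100.0))
```

Under these assumed numbers, even a 90% hit rate leaves the average dominated by the trips to RAM, which is one reason CPUs stack several cache levels.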

5. Front-side bus speed (e.g. 800 MHz)

The Front-side bus links the CPU to the rest of the components. The speed of the bus determines how fast it can transfer data to and from these components. You don't usually measure the bandwidth of the bus, but its speed will determine the maximum bandwidth of the RAM.

6. RAM bandwidth (e.g. 3200 MB/s)

This is how fast data can be transferred from RAM to CPU registers. The faster the RAM is, the less time the CPU spends idle. You would never saturate this bandwidth unless you were doing some very weird data processing, so it might be more useful to think in terms of s/MB rather than MB/s, but hardware vendors obviously don't agree.
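Flipping the units around is easy enough to do yourself. This is a minimal Python sketch with hypothetical function names. It assumes the transfer runs at the full advertised bandwidth, which real transfers rarely do.

```python
def seconds_per_mb(bandwidth_mb_per_s):
    """Invert a bandwidth in MB/s to get the time taken per megabyte."""
    return 1.0 / bandwidth_mb_per_s


def transfer_seconds(size_mb, bandwidth_mb_per_s):
    """Best-case time to move size_mb megabytes at the given bandwidth."""
    return size_mb / bandwidth_mb_per_s


if __name__ == "__main__":
    print(seconds_per_mb(3200))        # time to move 1 MB at 3200 MB/s
    print(transfer_seconds(100, 3200)) # time to move 100 MB at 3200 MB/s
```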

7. RAM capacity (e.g. 2GB)

When data which is not in RAM needs to be accessed, it must be fetched from disk, which is typically at least 100 times slower than reading from RAM (and far worse when the disk has to seek), so the more data you can fit into RAM the better. Good operating systems will always use spare RAM to cache data from the disk too.

8. Number of RAM channels (e.g. 2)

Most motherboards have several slots to add RAM. Some of them can access these slots simultaneously if configured properly. For example, a dual channel controller can access two sticks of RAM at the same time which roughly doubles your bandwidth.

9. RAM configuration (e.g. one stick)

If you mix RAM sticks with different specs you can get mixed results. Usually the overall bandwidth is that of the slowest stick.
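The last two sections can be rolled into one back-of-the-envelope model: every stick runs at the speed of the slowest one, and each channel in use adds that much bandwidth again. This is a simplified Python sketch with a made-up function name. Real memory controllers are more complicated, so treat the results as a best case.

```python
def effective_ram_bandwidth(stick_mb_per_s, channels):
    """Idealized RAM bandwidth in MB/s.

    stick_mb_per_s: list with the rated bandwidth of each stick.
    channels: how many channels the memory controller supports.
    """
    slowest = min(stick_mb_per_s)
    # You can't use more channels than you have sticks.
    channels_used = min(channels, len(stick_mb_per_s))
    return slowest * channels_used


if __name__ == "__main__":
    print(effective_ram_bandwidth([3200, 3200], channels=2))  # matched pair
    print(effective_ram_bandwidth([3200, 2100], channels=2))  # mixed pair
    print(effective_ram_bandwidth([3200], channels=2))        # one stick
```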

10. HDD Bus type (e.g. SATA)

This is how the hard disk is connected to the motherboard (and therefore other components). The different types of bus have different maximum bandwidths:

  • USB: 1.5 MB/s
  • USB2: 60 MB/s
  • ATA/IDE: up to 133 MB/s
  • SCSI: up to 640 MB/s
  • SATA: up to 600 MB/s (6 Gbit/s)

Unfortunately, the disks themselves can't read and write at anywhere near the speeds of SATA, which is used in most new systems.
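In other words, the sustained rate you actually see is capped by whichever is slower, the bus or the disk. Here is a small Python sketch of that idea using some of the bus figures from the list above. The function name and the 50 MB/s disk are hypothetical.

```python
# Maximum bus bandwidths in MB/s, copied from the list above.
BUS_MAX_MB_PER_S = {
    "USB": 1.5,
    "USB2": 60,
    "ATA/IDE": 133,
    "SCSI": 640,
}


def effective_disk_rate(bus, disk_internal_mb_per_s):
    """Sustained rate is capped by the slower of the bus and the disk."""
    return min(BUS_MAX_MB_PER_S[bus], disk_internal_mb_per_s)


if __name__ == "__main__":
    # Hypothetical disk that reads 50 MB/s from its platters.
    for bus in BUS_MAX_MB_PER_S:
        print(bus, effective_disk_rate(bus, 50))
```

With a disk like that, only the original USB bus holds it back. On every faster bus the disk itself is the limit.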

11. Disk cache size (e.g. 8 MB)

Disk access is slow, but can be avoided by using a small fast cache on the disk. These caches tend to be so small compared to the total disk capacity, though, that having a bigger cache wouldn't give you a proportional performance improvement. Transfers from the disk cache give you the disk's fastest possible external transfer rate (e.g. 400 MB/s).

12. Disk transfer method (e.g. udma5)

There are several methods the hardware can use to shift the bits around internally. Programmable I/O (PIO) shifts data via the CPU to RAM, and Direct Memory Access (DMA) moves data directly to RAM. uDMA is the same as DMA but twice as fast. Each method has a different theoretical maximum bandwidth from 3 MB/s (PIO0) to 100 MB/s (uDMA5).

13. Disk internal transfer rate (e.g. 50 MB/s)

How fast the disk controller can read and write from the actual disk is the internal transfer rate. It is a function of the physical characteristics of the drive such as the spin rate. Internal disk transfer is normally one of the slowest parts of the system.

14. Disk arrangement (e.g. RAID0)

Two disks of the same type can be connected in parallel using RAID0 to double your effective transfer rates. You can also connect multiple independent disks without RAID in various ways. Depending on how you connect them, this could cause them both to run at the speed of the slowest drive.
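As a rough model of striping, you can assume every disk in the array is held to the pace of the slowest one and that the rates add up across disks. This Python sketch uses a made-up function name and ignores controller overhead, so it gives a best-case figure only. Extending it to mismatched disks is my assumption, not something a RAID controller guarantees.

```python
def raid0_rate(disk_mb_per_s):
    """Idealized RAID0 transfer rate in MB/s for a list of disk rates."""
    # Each stripe is spread over all disks, so the slowest disk sets the pace.
    return len(disk_mb_per_s) * min(disk_mb_per_s)


if __name__ == "__main__":
    print(raid0_rate([50, 50]))  # two matched disks
    print(raid0_rate([50, 40]))  # mismatched pair, limited by the slower disk
```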

15. Other

The two other main areas relating to performance are network performance and software performance. It doesn't matter how many zillion operations per second your CPU can do, or how many gigabytes of data you can shift around the machine per second if you can only download data from the 'Net at 4kb/s. Equally, software powers the whole setup and can instruct the hardware to do whatever it pleases. It's good to have software which does sensible things with your hardware.

Sometime later I might blog about network and software performance, and also some tools you can use to measure performance. Should be fun.

About Roger Keays

Roger Keays is an artist, an engineer, and a student of life. He has no fixed address and has left footprints on 40-something different countries around the world. Roger is addicted to surfing. His other interests are music, psychology, languages, the proper use of semicolons, and finding good food.

Leave a Comment

Please visit https://rogerkeays.com/blog/hardware-performance-theory to add your comments.