Multiprocessor performance on a single processor
The Intel Xeon processor introduces a new technology called Hyper-Threading (HT) that, to the operating system, makes a single processor behave like two logical processors. When enabled, the technology allows the processor to execute multiple threads simultaneously, in parallel within each processor, which can yield significant performance improvement. We set out to quantify just how much improvement you can expect to see.
The current Linux symmetric multiprocessing (SMP) kernel at both the 2.4 and 2.5 versions was made aware of Hyper-Threading, and performance speed-up had been observed in multithreaded benchmarks (see Resources later in this article for articles with more details).
This article gives the results of our investigation into the effects of Hyper-Threading (HT) on the Linux SMP kernel. It compares the performance of a Linux SMP kernel that was aware of Hyper-Threading to one that was not. The system under test was a multithreading-enabled, single-CPU Xeon. The benchmarks used in the study covered areas within the kernel that could be affected by Hyper-Threading, such as the scheduler, low-level kernel primitives, the file server, the network, and threaded support.
The results on Linux kernel 2.4.19 show Hyper-Threading technology could improve multithreaded applications by 30%. Current work on Linux kernel 2.5.32 may provide performance speed-up as much as 51%.
Intel's Hyper-Threading Technology enables two logical processors on a single physical processor by replicating, partitioning, and sharing the resources within the Intel NetBurst microarchitecture pipeline.
Replicated resources create copies of the resources for the two threads:
- All per-CPU architectural states
- Instruction pointers, renaming logic
- Some smaller resources (such as return stack predictor, ITLB, etc.)
Partitioned resources divide the resources between the executing threads:
- Several buffers (Re-Order Buffer, Load/Store Buffers, queues, etc.)
Shared resources make use of the resources as needed between the two executing threads:
- Out-of-Order execution engine
Typically, each physical processor has a single architectural state on a single processor core to service threads. With HT, each physical processor has two architectural states on a single core, making the physical processor appear as two logical processors to service threads. The system BIOS enumerates each architectural state on the physical processor. Since Hyper-Threading-aware operating systems take advantage of logical processors, those operating systems have twice as many resources to service threads.