Hyper-Threading Speeds Linux

By Duc Vianney, Ph.D. - 2003-12-31

Hyper-Threading support in the Xeon processor

The Xeon processor is the first to implement Simultaneous Multi-Threading (SMT) in a general-purpose processor. (See Resources for more information on the Xeon family of processors.) To execute two threads on a single physical processor, the processor maintains the context of multiple threads simultaneously, which allows the scheduler to dispatch two potentially independent threads concurrently.

The operating system (OS) schedules and dispatches threads of code to each logical processor as it would in an SMP system. When a thread is not dispatched, the associated logical processor is kept idle.

When a thread is scheduled and dispatched to a logical processor, LP0, Hyper-Threading technology uses the processor resources needed to execute that thread.

When a second thread is scheduled and dispatched on the second logical processor, LP1, resources are replicated, divided, or shared as necessary to execute the second thread. Each logical processor makes selections at points in the pipeline to control and process the threads. As each thread finishes, the operating system idles the unused logical processor, freeing resources for the one that is still running.

The OS schedules and dispatches threads to each logical processor, just as it would in a dual-processor or multi-processor system. As the system schedules and introduces threads into the pipeline, resources are utilized as necessary to process two threads.
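
Because the OS sees each logical processor as a separate CPU, it can be useful to check which logical processors share a physical package. The following is a minimal sketch, assuming a Linux x86 system whose /proc/cpuinfo reports "processor" and "physical id" fields. The function name group_logical_cpus and the SAMPLE text are made up for illustration; the sample is not output captured from a real machine.

```python
from collections import defaultdict


def group_logical_cpus(cpuinfo_text):
    """Map each 'physical id' to the logical 'processor' numbers found on it."""
    packages = defaultdict(list)
    current = None  # logical processor number of the stanza being read
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank line separating two processor stanzas
        key, value = key.strip(), value.strip()
        if key == "processor":
            current = int(value)
        elif key == "physical id" and current is not None:
            packages[int(value)].append(current)
    return dict(packages)


# Hypothetical excerpt: two physical packages, each exposing two logical CPUs.
SAMPLE = """\
processor   : 0
physical id : 0

processor   : 1
physical id : 0

processor   : 2
physical id : 1

processor   : 3
physical id : 1
"""

if __name__ == "__main__":
    print(group_logical_cpus(SAMPLE))  # {0: [0, 1], 1: [2, 3]}
```

On a real system, the same function could be given the contents of /proc/cpuinfo instead of SAMPLE. A package that lists more than one logical processor may indicate Hyper-Threading, multiple cores, or both, so the grouping alone does not distinguish between them.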

Hyper-Threading support in Linux kernel 2.4

Under the Linux kernel, a Hyper-Threaded processor with two virtual processors is treated as a pair of real physical processors. As a result, the scheduler that handles SMP should be able to handle Hyper-Threading as well. The support for Hyper-Threading in Linux kernel 2.4.x began with 2.4.17 and includes the following enhancements:

128-byte lock alignment
Spin-wait loop optimization
Non-execution based delay loops
Detection of Hyper-Threading-enabled processors and starting the logical processors as if the machine were SMP
Serialization in the MTRR and Microcode Update drivers, as they affect shared state
Scheduler optimization, when the system is idle, to prioritize scheduling on a physical processor before scheduling on a logical processor
Offset user stack to avoid 64K aliasing
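
The scheduler optimization in the list above can be pictured with a small model. This is only an illustrative sketch, not the kernel's code: the function pick_cpu, the idle map, and the sibling_of map are hypothetical names, and the model assumes exactly two logical processors per physical processor. It prefers an idle logical processor whose sibling is also idle, and falls back to one that shares a physical processor with a busy sibling.

```python
def pick_cpu(idle, sibling_of):
    """Choose a logical CPU for a newly runnable thread.

    idle:       dict mapping logical CPU number -> True if it is idle
    sibling_of: dict mapping logical CPU number -> its sibling on the
                same physical processor
    Returns the chosen CPU number, or None if no logical CPU is idle.
    """
    fallback = None
    for cpu in sorted(idle):
        if not idle[cpu]:
            continue
        if idle[sibling_of[cpu]]:
            return cpu  # the whole physical processor is idle: best choice
        if fallback is None:
            fallback = cpu  # idle, but would share resources with a busy sibling
    return fallback


if __name__ == "__main__":
    siblings = {0: 1, 1: 0, 2: 3, 3: 2}
    # CPU 0 is busy, so CPU 1 would share a physical processor with it;
    # CPUs 2 and 3 form a fully idle physical processor.
    print(pick_cpu({0: False, 1: True, 2: True, 3: True}, siblings))
```

The design choice being modeled is that two threads placed on separate physical processors do not compete for shared execution resources, whereas two threads on the logical processors of one physical processor do.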

First published by IBM developerWorks