This paper proposes and evaluates software techniques that increase
register file utilization for simultaneous multithreading (SMT)
processors. SMT processors require large register files to hold multiple
thread contexts that can issue instructions, out of order, every cycle. By
supporting better inter-thread sharing and management of physical registers,
an SMT processor can reduce the number of registers required and can improve
performance for a given register file size.
Our techniques specifically target register deallocation. While out-of-order
processors with register renaming can readily determine when a new
physical register must be allocated, they are limited in determining when
physical registers can be deallocated. We propose architectural extensions
that permit the compiler and operating system to (1) free registers
immediately upon their last use, and (2) free registers allocated to idle
thread contexts. Our results, based on detailed instruction-level
simulations of an SMT processor, show that these techniques can increase
performance significantly for register-intensive, multithreaded programs.
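
To illustrate why deallocation timing matters, the following is a minimal sketch of technique (1), not the simulator used in this work. It assumes a toy model in which each instruction is renamed and then committed in program order. The names `Instr`, `last_use`, and `peak_registers` are hypothetical. The conventional policy is assumed to free a physical register only when a later instruction redefining the same architectural register commits. The alternative policy frees a register as soon as an instruction marked by the compiler as its last reader commits. Out-of-order issue, speculation, and idle thread contexts are not modeled.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Instr:
    dest: Optional[str] = None          # architectural register written, if any
    srcs: Tuple[str, ...] = ()          # architectural registers read
    last_use: Tuple[str, ...] = ()      # sources marked dead after this instruction (assumed compiler hint)


def peak_registers(program, free_at_last_use):
    """Peak number of live physical registers in a toy in-order rename model."""
    mapping = {}      # architectural register -> physical register
    live = set()      # physical registers currently allocated
    next_phys = 0
    peak = 0
    for ins in program:
        # Rename: read source mappings before the destination is remapped.
        src_phys = {r: mapping[r] for r in ins.srcs if r in mapping}
        old = None
        if ins.dest is not None:
            old = mapping.get(ins.dest)
            mapping[ins.dest] = next_phys
            live.add(next_phys)
            next_phys += 1
        peak = max(peak, len(live))
        # Commit: the previous mapping of the destination is always released.
        if old is not None:
            live.discard(old)
        # Commit: optionally release sources marked as last uses.
        if free_at_last_use:
            for r in ins.last_use:
                phys = src_phys.get(r)
                if phys is not None:
                    live.discard(phys)
                    if mapping.get(r) == phys:
                        del mapping[r]
    return peak


# A chain of short-lived values, each read exactly once.
program = [
    Instr("r1"),
    Instr("r2"),
    Instr("r3", ("r1", "r2"), ("r1", "r2")),
    Instr("r4", ("r3",), ("r3",)),
    Instr("r5", ("r4",), ("r4",)),
]

print("conventional:", peak_registers(program, free_at_last_use=False))
print("last-use:    ", peak_registers(program, free_at_last_use=True))
```

In this example no architectural register is ever redefined, so the conventional policy holds all five physical registers, while freeing at last use keeps the peak at three.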