With the rapid development of VLSI technology, processor speed has increased dramatically over the past decade, while memory speed has improved at a much slower pace. As long as DRAM remains the most cost-effective high-density technology for main memory, the speed gap between the processor and the memory will continue to widen. At the same time, application workloads are becoming more data-intensive and memory-demanding. As a result, memory access latency is a major performance bottleneck for data-intensive applications.
We are conducting several research projects to improve memory performance. Our objectives are to understand the memory access patterns of different application types, and to exploit data locality at different levels of the memory hierarchy. We are developing, evaluating, and implementing software/hardware methods from different system aspects: application programming, runtime software support, and memory architecture designs.
Technology Transfers and Impact Based on Our Research
Processor Cache Designs
We have proposed and evaluated different cache designs to increase cache hit rates and to reduce power consumption.
Representative Publications:
Memory Architecture Designs
We have proposed and evaluated different architecture designs to reduce cache and memory access latencies.
Representative Publications:
Runtime Software Support to Exploit Cache Locality
Exploiting the cache locality of parallel programs at runtime is complementary to compiler optimization. It is particularly important for applications with dynamic memory access patterns, which cannot be fully analyzed at compile time. We have developed and implemented a memory-layout oriented method to exploit cache locality of parallel loops at runtime on Symmetric Multi-Processor (SMP) systems.
Representative Publications:
Cache-effective Algorithm Designs and Implementations
Some commonly used algorithms, such as sorting, matrix multiplication, matrix transpose, and bit reversal, are highly sensitive to memory performance. Because these algorithms are often used repeatedly as fundamental subroutines in scientific and commercial applications, memory-effective methods and their architecture-dependent implementations are useful program libraries for high-performance programming at the user level.
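As an illustration of why matrix transpose is sensitive to memory performance, the following is a minimal sketch of loop tiling (blocking), a standard cache-oriented technique. It is not taken from the publications of this project. The function name `transpose_blocked` and the default tile size of 32 are assumptions chosen for illustration; in practice the tile size would be tuned to the cache of the target machine. Python is used only to show the access pattern; a production library would implement the same loop structure in a compiled language.

```python
def transpose_blocked(a, n_rows, n_cols, block=32):
    """Transpose a row-major n_rows x n_cols matrix stored in a flat list.

    The matrix is processed in block x block tiles, so that the rows read
    from `a` and the rows written to `out` for one tile stay within a small
    memory region. `block` is a hypothetical tuning parameter.
    """
    out = [0] * (n_rows * n_cols)
    for i0 in range(0, n_rows, block):          # tile start along rows
        i_end = min(i0 + block, n_rows)         # clip the last partial tile
        for j0 in range(0, n_cols, block):      # tile start along columns
            j_end = min(j0 + block, n_cols)
            for i in range(i0, i_end):
                row_base = i * n_cols           # offset of row i in `a`
                for j in range(j0, j_end):
                    # element (i, j) of the input becomes (j, i) of the output
                    out[j * n_rows + i] = a[row_base + j]
    return out
```

A naive transpose reads one matrix by rows and writes the other by columns, so one of the two access streams has a stride of a full row and makes poor use of each cache line. Tiling confines both streams to a small tile at a time, which is the kind of data-locality improvement the paragraph above refers to.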
Representative Publications: