Key points are not available for this paper at this time.
Microprocessor execution speeds are improving at a rate of 50%-80% per year while DRAM access times are improving at a much lower rate of 5%-10% per year. Computer systems are rapidly approaching the point at which overall system performance is determined not by the speed of the CPU but by the memory system speed. We present a high performance memory system architecture that overcomes the growing speed disparity between high performance microprocessors and current generation DRAMs. A novel prediction and prefetching technique is combined with a distributed cache architecture to build a high performance memory system. We use a table based prediction scheme with a prediction cache to prefetch data from the on-chip DRAM array to an on-chip SRAM prefetch buffer. By prefetching data we are able to hide the large latency associated with DRAM access and cycle times. Our experiments show that with a small (32 KB) prediction cache we can get an effective main memory access time that is close to the access time of larger secondary caches.
Alexander et al. (Mon,) studied this question.