Support for virtual memory is a key feature of modern systems-on-chip (SoCs). Traditional address translation schemes use one translation lookaside buffer (TLB) entry to translate one virtual page to one physical page. To improve TLB utilization, various TLB coalescing schemes merge multiple pages into a single TLB entry. TLB coalescing can be further enhanced by compressing multiple blocks into a group of page-table entries (PTEs). However, previous compression schemes neither fully exploit system characteristics nor utilize PTE resources efficiently. To address these limitations, we propose the clustered range-compressed page table (cRCPT), which combines compression and clustering techniques to improve memory system performance. By exploiting potential spatial locality, the proposed scheme can improve address translation efficiency for high-bandwidth I/O devices.
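The general idea of coalescing mentioned above, merging several page mappings into one entry, can be illustrated with a minimal sketch. It is not the cRCPT scheme itself; it only assumes that a run of pages contiguous in both virtual and physical address space can be described by one (base, base, length) entry. The names `RangeEntry`, `coalesce`, and `translate` are hypothetical and chosen for illustration.

```python
from typing import Dict, List, NamedTuple


class RangeEntry(NamedTuple):
    """One coalesced entry covering `length` consecutive pages (illustrative)."""
    vpn_base: int  # first virtual page number in the run
    ppn_base: int  # physical page number that vpn_base maps to
    length: int    # number of contiguous pages covered


def coalesce(page_table: Dict[int, int]) -> List[RangeEntry]:
    """Merge mappings that are contiguous in both virtual and physical space."""
    entries: List[RangeEntry] = []
    for vpn in sorted(page_table):
        ppn = page_table[vpn]
        if entries:
            last = entries[-1]
            # Extend the current run only if both page numbers continue it.
            if (vpn == last.vpn_base + last.length
                    and ppn == last.ppn_base + last.length):
                entries[-1] = last._replace(length=last.length + 1)
                continue
        entries.append(RangeEntry(vpn, ppn, 1))
    return entries


def translate(entries: List[RangeEntry], vpn: int) -> int:
    """Look up a virtual page number in the coalesced entries."""
    for e in entries:
        if e.vpn_base <= vpn < e.vpn_base + e.length:
            return e.ppn_base + (vpn - e.vpn_base)
    raise KeyError(vpn)


# Seven page mappings with two contiguous runs and one isolated page.
page_table = {0: 100, 1: 101, 2: 102, 3: 103, 8: 50, 9: 51, 10: 200}
entries = coalesce(page_table)
```

In this example the seven mappings collapse into three entries, so a TLB holding range entries would cover the same pages with fewer slots. How much is saved depends entirely on how contiguous the physical allocation is, which is the spatial locality the abstract refers to.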
Tran et al. (Thu,) studied this question.