Optimizing an application's performance for a given microarchitecture has become painfully difficult. Growing microarchitecture complexity, workload diversity, and the unmanageable volume of data produced by performance tools all add to the optimization challenge. At the same time, resource and time constraints are getting tighter with recently emerged segments, which further calls for accurate and prompt analysis methods. The insights from our analysis method guide a proposal for a novel performance counters architecture that can determine the true bottlenecks of a general out-of-order processor. Unlike other approaches, the method is low-cost and already featured in in-production systems: it requires just eight simple new performance events to be added to a traditional PMU. It is comprehensive, with no restriction to a predefined set of performance issues. It also accounts for granular bottlenecks in superscalar cores that earlier approaches missed.
Ahmad Yasin (Sat,) studied this question.
synapsesocial.com/papers/6a02e098c8c4199b329e2fed — DOI: https://doi.org/10.1109/ispass.2014.6844459
Ahmad Yasin
Intel (United States)