Testing a scalability bottleneck requires a large system to generate sufficient load, and researchers usually do not have access to one. To address this problem, this paper extrapolates the workload that a bottleneck node would receive at a large scale. The key observation motivating our approach is that large-scale systems often repeat the behaviors they exhibit at small scales: they run a job more times, run more nodes of the same type, or run more iterations of the same loop. Following this observation, we record a node's workload at several small scales and extrapolate it to a large scale. Toward this goal, we have developed PatternMiner, a semi-automatic tool that identifies how workload patterns change with scale. We have tested our method on the HDFS NameNode and YARN's Resource Manager. Our evaluation shows that PatternMiner is able to predict 98% of the workloads for the NameNode and 83% of the workloads for the Resource Manager. Furthermore, by replaying the extrapolated workload, we are able to emulate a cluster of up to 60,000 nodes with only 8 physical machines to evaluate the NameNode and the Resource Manager.
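The record-then-extrapolate idea can be illustrated with a minimal sketch. This is not PatternMiner's algorithm, which the abstract does not describe; it only assumes, for illustration, that the count of each request type grows linearly with the number of nodes, and it fits that trend from small-scale recordings. The function names, the request-type names, and all numbers below are hypothetical.

```python
from typing import Dict, List, Tuple

# (scale, observed request count) pairs recorded at small cluster sizes.
Samples = List[Tuple[int, float]]


def fit_linear(points: Samples) -> Tuple[float, float]:
    """Ordinary least-squares fit of count = slope * scale + intercept."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two recorded scales")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in points)
    if var_x == 0:
        raise ValueError("recorded scales must differ")
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


def extrapolate_workload(recorded: Dict[str, Samples],
                         target_scale: int) -> Dict[str, float]:
    """Predict the per-request-type count at target_scale.

    Assumption: each request type scales linearly with node count.
    Real workloads may scale differently (constant, quadratic, ...).
    """
    predicted = {}
    for request_type, samples in recorded.items():
        slope, intercept = fit_linear(samples)
        predicted[request_type] = slope * target_scale + intercept
    return predicted


# Hypothetical recordings at 10, 20, and 40 nodes.
recorded = {
    "heartbeat": [(10, 100.0), (20, 200.0), (40, 400.0)],
    "block_report": [(10, 15.0), (20, 25.0), (40, 45.0)],
}
prediction = extrapolate_workload(recorded, target_scale=1000)
for request_type, count in prediction.items():
    print(f"{request_type}: {count:.0f} requests at 1000 nodes")
```

The predicted counts could then drive a load generator against the single node under test, so that the node sees large-scale load without the large cluster being present.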
Shi et al. (Sat,) studied this question.