Abstract: This work proposes a hierarchical approach to reduce the training time of task-based routines by reusing previously obtained autotuning information. This approach has been integrated into a working prototype of Chameleon, a dense linear algebra software whose tile-based routines are executed on the available computational resources by means of a runtime system. The results show that this approach provides a high degree of scalability to the entire self-optimization process, achieving a reduction in training time of up to 80% and an appropriate selection of values for the adjustable parameters.
Cámara et al. studied this question.