Many Neural Architecture Search (NAS) algorithms rely on direct estimates, such as “accuracy” after extensive training, to evaluate architectures. However, the search space usually contains tens or hundreds of thousands of candidate architectures, and each one requires significant GPU time to obtain its “accuracy”. We therefore propose the Universal Training-Related Estimate (UTRE), which is measured from the rate at which the training loss falls and the time cost of each iteration. Unlike “accuracy”, which is computed from binary outcomes (correct or incorrect), UTRE is a naturally softer estimate because it draws on information from the training phase. This property lets UTRE substantially reduce time costs under an aggressive strategy while maintaining strong performance. In NASBench-201, the early-stopping “accuracy(es)” reaches a Spearman correlation of 0.8 on ImageNet-16 at a cost of 665 GPU seconds, whereas UTRE reaches the same correlation in only 10.8 GPU seconds, a roughly 62x speedup. However, NAS algorithms still need “accuracy” to determine the actual performance of the architectures they find. The proposed UTRE-NAS therefore uses UTRE for coarse sorting within the search space, followed by fine sorting using “accuracy”. The architecture discovered by UTRE-NAS achieves 45.21% accuracy on ImageNet-16 at a total cost of 18.2k GPU seconds, surpassing K-NAS’s 45.05% at 40k GPU seconds. In a further, more time-consuming experiment, the architecture discovered by UTRE-NAS reaches 46.86% accuracy, outperforming WeakNAS’s 46.79% while consuming less time.
Zhang et al. (Thu,) studied this question.
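The coarse-then-fine sorting described in the abstract can be illustrated with a minimal Python sketch. The abstract does not state the exact UTRE formula, so `loss_drop_rate` below is a hypothetical stand-in (loss decrease per second of training) and not the authors' definition; the function names, the `keep` parameter, and the toy numbers are likewise assumptions made for illustration only.

```python
def loss_drop_rate(losses, seconds_per_iter):
    """Hypothetical cheap proxy (NOT the paper's UTRE formula):
    average training-loss decrease per second over a few iterations."""
    if len(losses) < 2 or seconds_per_iter <= 0:
        raise ValueError("need at least two loss values and a positive iteration time")
    elapsed = (len(losses) - 1) * seconds_per_iter
    return (losses[0] - losses[-1]) / elapsed


def coarse_to_fine(candidates, proxy_fn, accuracy_fn, keep):
    """Rank all candidates with a cheap proxy, then pick the best of the
    top `keep` candidates using the expensive accuracy estimate."""
    shortlist = sorted(candidates, key=proxy_fn, reverse=True)[:keep]
    return max(shortlist, key=accuracy_fn)


# Toy search space: name -> (short loss trace, seconds per iteration, accuracy).
# All values are made up for the example.
toy_space = {
    "a": ([2.3, 1.9, 1.6], 0.01, 0.44),
    "b": ([2.3, 2.2, 2.1], 0.01, 0.30),
    "c": ([2.3, 1.8, 1.5], 0.02, 0.41),
    "d": ([2.3, 2.0, 1.8], 0.01, 0.45),
}


def toy_proxy(name):
    losses, seconds_per_iter, _ = toy_space[name]
    return loss_drop_rate(losses, seconds_per_iter)


def toy_accuracy(name):
    # Stands in for full training and evaluation, which is the costly step.
    return toy_space[name][2]


best = coarse_to_fine(list(toy_space), toy_proxy, toy_accuracy, keep=2)
print(best)
```

In this sketch the expensive `toy_accuracy` call is made only for the `keep` shortlisted candidates, which is the source of the time savings a two-stage scheme aims for; how UTRE-NAS actually sizes the shortlist is not specified in the abstract.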