To safely integrate deep learning or AI components into safety- and mission-critical systems, systematic verification techniques are required. In particular, industry needs test case generation methods to assess the quality and robustness of these deep learning components. The TAEFuzz paper Ji25 focuses specifically on automatic robustness testing. Its main aim is to systematically explore the input space to find adversarial examples. An adversarial example is an input to a learning component to which small perturbations have been added so that the component produces a completely different outcome, e.g. a different classification. The existence of such an example shows that the learning component is not robust with respect to these small perturbations. TAEFuzz uses evolutionary test case generation techniques and the concept of transferable adversarial examples to find these small perturbations effectively. The approach is applied to test AI components used in image classification Ji25. A similar approach has also been applied to black-box speech recognition systems Ca23.
Ji et al. Ji25 studied this question.
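To make the notion of an evolutionary search for adversarial examples concrete, the following is a minimal sketch and not the TAEFuzz algorithm itself. It assumes a toy linear binary classifier in place of a deep network, a simple hill-climbing evolutionary loop, and an L-infinity bound `epsilon` on the perturbation. All names (`classify`, `evolve_adversarial`, `epsilon`, `pop_size`) are hypothetical and chosen only for illustration.

```python
import random


def score(x, weights, bias):
    """Raw decision score of a toy linear model (assumed stand-in for a DNN)."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def classify(x, weights, bias):
    """Return class 1 if the score is positive, otherwise class 0."""
    return 1 if score(x, weights, bias) > 0 else 0


def evolve_adversarial(x, weights, bias, epsilon,
                       pop_size=20, generations=200, seed=0):
    """Search for a perturbation delta with |delta_i| <= epsilon that changes
    the predicted class of x. Returns delta, or None if none was found."""
    rng = random.Random(seed)
    original = classify(x, weights, bias)
    # Fitness is the score margin in favour of the original class;
    # a smaller value means the input is closer to being misclassified.
    sign = 1.0 if original == 1 else -1.0

    def perturb(delta):
        return [xi + di for xi, di in zip(x, delta)]

    def fitness(delta):
        return sign * score(perturb(delta), weights, bias)

    best = [0.0] * len(x)
    best_fit = fitness(best)
    for _ in range(generations):
        for _ in range(pop_size):
            # Mutate the current best candidate and clip to the epsilon ball.
            child = [max(-epsilon, min(epsilon, d + rng.gauss(0.0, epsilon / 4)))
                     for d in best]
            child_fit = fitness(child)
            if child_fit < best_fit:
                best, best_fit = child, child_fit
        if classify(perturb(best), weights, bias) != original:
            return best  # adversarial example found
    return None


# Hypothetical toy input and model parameters.
x0 = [1.0, 1.0]
w, b = [1.0, -0.5], -0.3
delta = evolve_adversarial(x0, w, b, epsilon=0.2)
```

A returned `delta` witnesses non-robustness of the toy model at `x0` for the chosen `epsilon`. A result of `None` only means that this search found nothing within its budget; it is not a proof of robustness.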