In adversarial training, how each layer is trained may affect both the adversarial robustness and the clean accuracy of the resulting model. We focus on the BatchNorm layers and study their unique role in adversarial training. Using a partial adversarial (pre-)training methodology, we investigate how different optimization strategies for the BatchNorm layers affect adversarial robustness and how they interact with other model design choices.
Zeise et al. studied this question.