Fine-grained vehicle orientation estimators are widely reported to achieve strong in-domain accuracy, yet performance degrades substantially when models are applied across datasets. The relative contributions of visual domain shift and annotation label incompatibility to this degradation remain poorly understood. A controlled cross-dataset benchmark was conducted on two publicly available datasets, Car Full View (CFV) and Freiburg Static Cars 52 v1.1 (UnsupCar), using a fixed ConvNeXt-Small predictor while varying the training source, test target, and image preprocessing strategy. All conditions were evaluated with five-fold cross-validation, with folds split at the vehicle-instance level. Annotation label incompatibility was identified as the dominant source of transfer error: correcting the angular convention mismatch in UnsupCar orientation labels reduced cross-dataset circular mean absolute error (CMAE) by approximately 3.5–4.5°. Crop protocol was a similarly large factor: a mismatch between training and test crops raised CMAE into the 9–12° range. Square cropping with mirrored boundary padding was the most robust preprocessing strategy across both in-domain and cross-dataset conditions. After label harmonization, a residual transfer gap of approximately 2° remained, with a consistent directional asymmetry favoring the UnsupCar-to-CFV transfer direction. Joint training on both harmonized datasets achieved the best-balanced performance (3.77° on CFV; 5.38° on UnsupCar). These results demonstrate that instance-level splitting, explicit label harmonization, and consistent crop definition are necessary preconditions for credible cross-dataset vehicle orientation evaluation.
Pasaulis et al. (Thu,) studied this question.
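To make the metric and the harmonization step concrete, the following is a minimal sketch, assuming CMAE is the mean of the shortest angular distance between predicted and ground-truth orientation in degrees. The exact definition used in the benchmark and the actual CFV/UnsupCar convention mapping are not given here, so the function names and the `offset_deg` and `flip` parameters are hypothetical illustrations rather than the authors' procedure.

```python
import numpy as np


def circular_abs_error(pred_deg, true_deg):
    """Shortest absolute angular difference in degrees, always in [0, 180]."""
    pred = np.asarray(pred_deg, dtype=float)
    true = np.asarray(true_deg, dtype=float)
    diff = (pred - true) % 360.0           # wrap the difference into [0, 360)
    return np.minimum(diff, 360.0 - diff)  # take the shorter way around the circle


def cmae(pred_deg, true_deg):
    """Circular mean absolute error over a batch of angles (assumed definition)."""
    return float(np.mean(circular_abs_error(pred_deg, true_deg)))


def harmonize(angle_deg, offset_deg=0.0, flip=False):
    """Map angles between two labeling conventions.

    Hypothetical mapping: optionally reverse the rotation direction, then
    shift the zero reference. The real offset and direction must be taken
    from the datasets' documentation.
    """
    angle = np.asarray(angle_deg, dtype=float)
    if flip:
        angle = -angle
    return (angle + offset_deg) % 360.0
```

For example, `cmae([350.0], [10.0])` treats the two angles as 20° apart rather than 340°, which is why a plain absolute difference would overstate errors near the wrap-around point. Applying `harmonize` to one dataset's labels before training or evaluation is the kind of step the abstract refers to as label harmonization.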