What question did this study set out to answer?

The study aims to improve chip-leg defect detection on embedded AOI hardware using a scene-specific MobileNetV3-DETR model.

April 12, 2026 | Open Access

MobileNetV3-DETR with locality-biased decoding for chip-leg defects on embedded AOI

Key Points

Developed a MobileNetV3-DETR tailored for dense chip-leg defect inspection
Implemented task-aligned FPN pruning that keeps only pyramid levels P3–P5, matched to the empirical defect-size distribution
Introduced lattice-aware relative positional encoding that biases attention toward plausible row/column offsets
Used defect-prior adaptive Top-K sparse attention to allocate the decoder's attention budget under a device-aware cap Kmax
Employed defect-aware assignment (D-AA) during training to improve recall of rare defects
Achieved 89.5% mAP@50 and 55.25% mAP@[0.5:0.95]
Delivered approximately 60 ms per image on Jetson Nano (TensorRT FP16, 512 × 512, batch = 1)
Reduced errors on the hardest defect pairs, as confirmed by confusion-matrix analysis
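
The abstract describes D-AA as re-weighting the Hungarian classification term by smoothed class priors. The sketch below illustrates one way such a re-weighting could behave; it is not the authors' formula. The inverse-frequency weighting with exponent `gamma`, the mean-1 normalisation, and the function names are all assumptions made for illustration. The box terms of a DETR-style matching cost are omitted, and a brute-force search stands in for the Hungarian algorithm on a toy-sized problem.

```python
from itertools import permutations

def smoothed_class_weights(class_counts, gamma=0.5, eps=1e-6):
    """Inverse-frequency weights tempered by gamma (assumed form), normalised to mean 1."""
    total = sum(class_counts)
    priors = [c / total for c in class_counts]
    raw = [(1.0 / (p + eps)) ** gamma for p in priors]
    mean = sum(raw) / len(raw)
    return [r / mean for r in raw]

def weighted_class_cost(pred_probs, gt_labels, weights):
    """cost[q][g] = -w[c_g] * p_q(c_g): classification-only matching cost, re-weighted per class."""
    return [[-weights[c] * probs[c] for c in gt_labels] for probs in pred_probs]

def match(cost):
    """Brute-force minimum-cost assignment (stand-in for Hungarian matching).

    Assumes at least as many queries as ground truths.
    Returns a list where entry g is the query index assigned to ground truth g.
    """
    n_q, n_g = len(cost), len(cost[0])
    best, best_cost = None, float("inf")
    for perm in permutations(range(n_q), n_g):
        total = sum(cost[q][g] for g, q in enumerate(perm))
        if total < best_cost:
            best, best_cost = perm, total
    return list(best)

# Toy example: class 0 is common, class 2 is rare (hypothetical counts).
weights = smoothed_class_weights([900, 90, 10])
probs = [[0.7, 0.0, 0.3],   # query 0
         [0.4, 0.0, 0.2]]   # query 1
gt = [0, 2]                 # one common-class and one rare-class ground truth

print(match(weighted_class_cost(probs, gt, [1.0, 1.0, 1.0])))  # uniform weights
print(match(weighted_class_cost(probs, gt, weights)))          # prior-aware weights
```

In this toy case the uniform cost gives query 0 to the common-class target, while the prior-aware cost hands the query with the stronger rare-class score to the rare target. Because the re-weighting only touches the training-time matching cost, the inference graph is unchanged, which is consistent with the abstract's claim.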

Abstract

We present a scene-specific MobileNetV3-DETR for chip-leg defect inspection, where targets are tiny, dense, and arranged on a regular lattice. Instead of stacking generic tricks, we formalize three complementary mechanisms: (A1) a task-aligned FPN pruning criterion that selects the minimal pyramid levels P3–P5 to match the empirical defect-size distribution; (A2) a lattice-aware relative positional encoding that biases attention toward physically plausible row/column offsets; and (A3) a defect-prior adaptive Top-K sparse attention that allocates the decoder’s attention budget by local response quantiles with a device-aware cap Kmax. In training, a defect-aware assignment (D-AA) re-weights the Hungarian classification term by smoothed class priors, improving recall of rare, safety-critical defects without changing the inference graph. Under a unified embedded protocol on Jetson Nano (TensorRT FP16, 512 × 512, batch = 1), the model uses ~7.2 M parameters and ~29.9 GFLOPs, reaches 89.5% mAP@50 and an average of 55.25% mAP@[0.5:0.95], and delivers ≈ 60 ms per image. Ablations demonstrate independent gains from A1–A3, while confusion-matrix analysis confirms reduced errors on the hardest pairs (e.g., bent_leg vs. damaged_leg; dirty vs. scratches), indicating an improved accuracy–latency balance for inline AOI deployment.
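
To make mechanism A3 more concrete, here is a minimal sketch of quantile-driven Top-K sparse attention with a hard cap. The abstract does not give the exact rule, so the mapping from a query's response quantile to its budget, the `k_min` floor, and the function names are assumptions chosen for illustration only.

```python
import math

def adaptive_k(responses, k_min, k_max):
    """Give each query a key budget in [k_min, k_max] from its empirical response quantile.

    Assumed rule: budget grows linearly with the query's rank fraction among all responses.
    """
    n = len(responses)
    order = sorted(range(n), key=lambda i: responses[i])
    ks = [0] * n
    for rank, i in enumerate(order):
        q = rank / (n - 1) if n > 1 else 1.0
        ks[i] = min(k_max, k_min + round(q * (k_max - k_min)))
    return ks

def topk_sparse_attention(scores, k):
    """Softmax restricted to the k largest scores; every other key gets weight 0."""
    k = max(1, min(k, len(scores)))
    keep = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]
    m = max(scores[j] for j in keep)  # subtract the max for numerical stability
    exps = {j: math.exp(scores[j] - m) for j in keep}
    z = sum(exps.values())
    return [exps[j] / z if j in exps else 0.0 for j in range(len(scores))]

# Three queries with hypothetical defect-prior responses; cap the budget at 6 keys.
budgets = adaptive_k([0.1, 0.9, 0.5], k_min=2, k_max=6)
print(budgets)  # [2, 6, 4]

# Attention weights for one query that is allowed only 2 of its 4 keys.
print(topk_sparse_attention([1.0, 3.0, 2.0, 0.0], k=2))
```

Queries with a weak defect response attend to few keys and strong ones to more, while `k_max` bounds the worst-case cost per query. That bound is what a device-aware cap would need to guarantee on a latency-constrained target such as the Jetson Nano.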
