March 3, 2026 | Open Access

Vision–Language Models for Transmission Line Fault Detection: A New Approach for Grid Reliability and Optimization

Key Points

Aims to improve the reliability of transmission line fault detection without retraining the large model end to end.
Uses a Florence-2 vision-language model as the base recognizer.
Introduces a subclass-aware fusion scheme for stable decision-making.
Applies Power-Line Focus Then Crop normalization so that thin conductors and small fittings remain visible.
Adds a corridor geo prior to suppress detections outside the corridor.
Evaluates all methods under a shared preprocessing and scoring pipeline.
Shows higher accuracy on thin and low-contrast faults in unseen regions.
Reduces false alarms outside the right-of-way.
Improves score calibration in the confidence range used for triage.
Keeps throughput and memory usage suitable for UAVs and edge devices.

Abstract

Reliable fault detection along transmission corridors is essential for preventing small defects from developing into long outages and costly emergency operations. This study aims to improve the field reliability of an open-vocabulary vision-language backbone without retraining the large model end to end. The work focuses on four operational fault classes in multi-region corridor imagery collected during routine inspections and uses a Florence-2 vision-language model as the base recognizer. On top of this backbone, three domain-specific components are introduced. A subclass-aware fusion scheme keeps probability mass within the active parent concept so that insulator icing and conductor icing produce stable, action-oriented decisions. A Power-Line Focus Then Crop normalization combines an attention-guided corridor window with isotropic resizing so that thin conductors and small fittings remain visible in the processed image. A corridor geo prior reduces scores as the distance from the mapped centerline increases, thereby suppressing detections that lie outside the corridor. All methods are evaluated under a shared preprocessing and scoring pipeline in both training-free and parameter-efficient tuning modes. Experiments on unseen regions show higher accuracy for thin and low-contrast faults, fewer false alarms outside the right-of-way, and improved score calibration in the confidence range used for triage, while keeping throughput and memory usage suitable for unmanned aerial vehicles and substation edge devices.
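To make two of the components described above more concrete, the following is a minimal sketch and not the authors' implementation. The abstract states only that the geo prior lowers scores with distance from the centerline and that the fusion keeps probability mass inside the active parent concept. Everything else is an assumption for illustration: the function names, the Gaussian falloff, the `half_width_m` and `sigma_m` parameters, and the softmax over subclass scores.

```python
import math


def corridor_geo_prior(score, distance_m, half_width_m=30.0, sigma_m=15.0):
    """Down-weight a detection score by its distance from the mapped centerline.

    Assumption: scores within the corridor half-width are left unchanged and
    decay with a Gaussian falloff beyond it. The functional form and the
    default values are hypothetical.
    """
    excess = max(0.0, distance_m - half_width_m)
    return score * math.exp(-0.5 * (excess / sigma_m) ** 2)


def subclass_aware_fusion(parent_probs, subclass_scores):
    """Spread each parent's probability over that parent's own subclasses.

    parent_probs:    {parent: probability}
    subclass_scores: {parent: {subclass: raw score}}
    Returns {label: probability}. The subclass probabilities of a parent sum
    to the parent's probability, so no mass leaks to another parent.
    Assumption: a softmax over raw subclass scores is used for the split.
    """
    fused = {}
    for parent, p_parent in parent_probs.items():
        scores = subclass_scores.get(parent, {})
        if not scores:
            # A parent without subclasses keeps its probability as-is.
            fused[parent] = p_parent
            continue
        top = max(scores.values())  # subtract the max for numerical stability
        exps = {name: math.exp(s - top) for name, s in scores.items()}
        total = sum(exps.values())
        for name, e in exps.items():
            fused[name] = p_parent * e / total
    return fused


if __name__ == "__main__":
    fused = subclass_aware_fusion(
        {"icing": 0.8, "other": 0.2},
        {"icing": {"insulator icing": 2.0, "conductor icing": 1.0}},
    )
    print(fused)
    print(corridor_geo_prior(0.9, distance_m=60.0))
```

In this sketch, a detection 60 m from the centerline is scored lower than one 45 m away, and the two icing subclasses always share exactly the probability assigned to their parent.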
