What question did this study set out to answer?

The study asks how to improve oriented object detection in remote sensing images when a fixed query budget becomes redundant because the number of objects varies widely from scene to scene.

May 11, 2026 | Open Access

DQA-DETR: Dynamic Query Aggregation for Oriented Object Detection in Remote Sensing Images

Key Points

Aims to improve oriented object detection in remote sensing images by reducing the query redundancy caused by variation in object quantity.
Proposes DQA-DETR, a transformer-based framework for dynamic query aggregation.
Incorporates three modules: an aggregation center predictor, an aggregation center selector, and a query aggregator.
Evaluates performance on the DOTA v1.0 and v1.5 datasets.
Improves detection performance compared to prior methods.
Reduces query redundancy in both sparse and dense scenes.
Maintains the end-to-end advantages of the original DETR framework.

Abstract

Oriented object detection in remote sensing images faces challenges, such as arbitrary object orientations and uneven distribution of object quantities. Although detection transformer (DETR)-based methods have recently achieved promising progress in remote sensing object detection, they still face challenges when using a fixed query budget in scenes where object quantities vary significantly due to large-scale imaging and complex spatial distributions. A fixed query budget can be poorly matched to such variability, producing redundant queries in sparse scenes and many high-quality negatives under one-to-one matching, which may hinder optimization efficiency and degrade performance. To address this issue, we propose a transformer-based dynamic query aggregation detection framework (DQA-DETR) to alleviate query redundancy in remote sensing scenarios. DQA-DETR incorporates three key modules: aggregation center predictor, aggregation center selector, and query aggregator, which are responsible for predicting the required number of representative queries, selecting high-quality aggregation centers, and aggregating semantically related queries via a multihead attention mechanism, respectively. This strategy adaptively adjusts the number of queries involved in matching according to the object quantity in each image, effectively reducing redundancy and enhancing the model’s adaptability in both sparse and dense scenes. Extensive experiments on the DOTA v1.0 and v1.5 datasets demonstrate that DQA-DETR improves detection performance, while maintaining the end-to-end advantages of DETR, exhibiting strong competitiveness among state-of-the-art methods.
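The three-stage flow described in the abstract (predict a query count, select aggregation centers, aggregate related queries with multihead attention) can be illustrated with a minimal, dependency-free sketch. This is not the authors' implementation: the function names, the top-k selection rule, the absence of learned projection weights, and the toy inputs are all assumptions made for illustration, and the predicted count is passed in directly rather than produced by a learned predictor.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def select_centers(scores, k):
    """Hypothetical selector: indices of the k highest-scoring queries."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]

def aggregate(queries, center_idx, num_heads):
    """Multihead dot-product attention without learned projections.

    Each selected center attends over all queries; per head, the output is
    a softmax-weighted average of the matching slice of every query.
    """
    dim = len(queries[0])
    assert dim % num_heads == 0, "embedding size must divide evenly into heads"
    head_dim = dim // num_heads
    aggregated = []
    for c in center_idx:
        merged = []
        for h in range(num_heads):
            lo, hi = h * head_dim, (h + 1) * head_dim
            q = queries[c][lo:hi]
            logits = [
                sum(a * b for a, b in zip(q, key[lo:hi])) / math.sqrt(head_dim)
                for key in queries
            ]
            weights = softmax(logits)
            merged.extend(
                sum(weights[j] * queries[j][lo + d] for j in range(len(queries)))
                for d in range(head_dim)
            )
        aggregated.append(merged)
    return aggregated

def dynamic_query_aggregation(queries, scores, predicted_count, num_heads=2):
    """Reduce a fixed query set to `predicted_count` aggregated queries.

    `predicted_count` stands in for the output of the aggregation center
    predictor (assumed given here); it is clamped to [1, len(queries)].
    """
    k = max(1, min(predicted_count, len(queries)))
    centers = select_centers(scores, k)
    return aggregate(queries, centers, num_heads)

# Toy example: six 4-dimensional queries, a sparse scene needing only two.
queries = [
    [1.0, 0.0, 0.0, 1.0],
    [0.9, 0.1, 0.0, 1.0],
    [0.0, 1.0, 1.0, 0.0],
    [0.1, 0.9, 1.0, 0.1],
    [0.5, 0.5, 0.5, 0.5],
    [0.0, 0.0, 0.0, 0.0],
]
scores = [0.9, 0.4, 0.8, 0.3, 0.2, 0.1]
sparse = dynamic_query_aggregation(queries, scores, predicted_count=2)
print(len(sparse), len(sparse[0]))  # 2 4
```

In this sketch only the aggregated queries would go on to one-to-one matching, so a sparse image contributes two candidates instead of six, while a dense image with a larger predicted count keeps more of them.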
