What question did this study set out to answer?

The aim is to improve the precision of action localization and recognition in untrimmed videos.

February 17, 2026Open Access

FreqAct: Frequency-Guided Hierarchical Feature Integration for Action Detection

Key Points

The aim is to improve the precision of action localization and recognition in untrimmed videos.
Proposed FreqAct framework for multi-frequency feature fusion.
Used low-frequency modulation to stabilize segment representations.
Employed high-frequency enhancement to preserve boundary cues.
Introduced boundary-aware regression loss for learning at action boundaries.
Applied intra-segment consistency regularization for coherent predictions.
FreqAct outperforms state-of-the-art temporal action detection methods.
Demonstrated enhanced localization accuracy in experiments on THUMOS14 and ActivityNet1.3.

Abstract

Temporal action detection (TAD) aims to localize and recognize action instances in untrimmed videos, and serves as a key component in practical intelligent electronic systems such as smart video surveillance and real-time human–machine interaction. In these scenarios, accurate temporal localization is essential for reliable event understanding and downstream decision-making in edge computing and real-time streaming scenarios. To handle long video durations and diverse action dynamics, existing methods typically rely on hierarchical temporal feature integration to capture multi-scale contextual information. However, such integration often leads to intra-segment inconsistency and boundary ambiguity, as indiscriminate temporal smoothing across scales degrades segment coherence and blurs critical boundary cues. In this work, we propose FreqAct, a multi-frequency feature fusion framework that explicitly models complementary low-frequency and high-frequency temporal components within hierarchical representations. Specifically, low-frequency modulation suppresses undesired temporal fluctuations to stabilize segment-level representations, while high-frequency enhancement preserves boundary-sensitive cues essential for precise localization. Furthermore, we introduce a boundary-aware regression loss to emphasize learning at action boundaries and an intra-segment consistency regularization to encourage coherent predictions within each action instance. Extensive experiments on THUMOS14 and ActivityNet1.3 demonstrate that FreqAct outperforms state-of-the-art TAD methods, further highlighting its effectiveness and practical potential for real-world electronics applications.

FreqAct: Frequency-Guided Hierarchical Feature Integration for Action Detection

Key Points

Abstract

Cite This Study