Tactical intention recognition of aerial targets is critical for battlefield decision-making, yet existing supervised deep learning approaches are limited by the extreme scarcity of labeled training data in military domains, where annotation is constrained by security concerns and the high cost of expert labeling. This paper addresses these challenges by introducing the Unsupervised Momentum Contrast and Transformer-Driven Intent Classification Architecture (UMC-TransDICA), a novel framework that trains on unlabeled time-series data through unsupervised momentum contrast learning. Beyond the conventional combination of momentum contrast learning, Transformer encoders, and KNN classification, our architecture contributes three key methodological innovations. First, we propose a Markov chain-based positive sample generation method that employs a learned diffusion-denoising process to synthesize semantically consistent yet feature-diverse positive samples, replacing conventional stochastic augmentations that introduce semantic distortions in time series. Second, we introduce a Bi-directional InfoNCE (Bi-InfoNCE) loss function that jointly optimizes query-to-key and key-to-query similarities and captures temporal dynamics better than standard unidirectional contrastive losses. Third, we integrate these innovations into a streamlined framework that eliminates the need for fine-tuning on labeled data, using KNN classification directly in the learned embedding space during testing. Experiments demonstrate that UMC-TransDICA achieves 92.33% accuracy on 7-class tactical intention classification, outperforming supervised baselines while requiring only unlabeled pre-training data and a small labeled test set.
Ablation studies validate the effectiveness of each component: the Markov chain-based positive sample generation significantly improves accuracy over random augmentation, and Bi-InfoNCE loss contributes an additional 1.8% improvement over unidirectional InfoNCE. These results establish the practical viability of unsupervised contrastive learning for military intent recognition and offer a generalizable approach for scenarios with limited labeled data.
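To make the two classification-side ideas above concrete, the following is a minimal NumPy sketch of a symmetric (query-to-key plus key-to-query) InfoNCE loss and of KNN voting in an embedding space. It is an illustration under stated assumptions, not the paper's implementation: the abstract does not give the exact Bi-InfoNCE formulation, so the sketch assumes in-batch negatives, cosine similarity, equal weighting of the two directions, and a temperature of 0.07, and it omits the momentum encoder and key queue. The function names bi_info_nce and knn_predict are hypothetical.

```python
import numpy as np

def _l2_normalize(x, eps=1e-12):
    # Row-wise L2 normalization so dot products become cosine similarities.
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)

def _log_softmax(logits):
    # Numerically stable row-wise log-softmax.
    shifted = logits - logits.max(axis=1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

def bi_info_nce(queries, keys, temperature=0.07):
    """Symmetric InfoNCE (assumed form): row i of `queries` and row i of
    `keys` are a positive pair; all other rows in the batch are negatives."""
    q = _l2_normalize(queries)
    k = _l2_normalize(keys)
    logits = q @ k.T / temperature          # shape (batch, batch)
    idx = np.arange(len(q))
    loss_q2k = -_log_softmax(logits)[idx, idx].mean()    # query -> key
    loss_k2q = -_log_softmax(logits.T)[idx, idx].mean()  # key -> query
    return 0.5 * (loss_q2k + loss_k2q)      # equal weighting is an assumption

def knn_predict(ref_emb, ref_labels, test_emb, k=5):
    """Majority vote among the k most cosine-similar labeled reference
    embeddings; `ref_labels` must be a non-negative integer array."""
    sims = _l2_normalize(test_emb) @ _l2_normalize(ref_emb).T
    nearest = np.argsort(-sims, axis=1)[:, :k]
    votes = ref_labels[nearest]
    return np.array([np.bincount(row).argmax() for row in votes])
```

In this sketch the loss is unchanged when the roles of queries and keys are swapped, which is the sense in which it is bi-directional; a unidirectional InfoNCE would keep only the query-to-key term.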
Song et al. (Thu,) studied this question.