This study proposes a malware detection framework that integrates static and dynamic analysis and processes both modalities jointly in a unified graph neural network architecture. Control-flow and data-dependency features are extracted from the binary disassembly, and timestamped system call sequences are captured in a sandbox environment. Both feature types are encoded as heterogeneous relation graphs, and a two-branch network processes the static topology with graph convolutional layers and the dynamic sequence with graph attention layers. A feature fusion module then combines the two branches to produce the classification decision. On the EMBER, VirusShare, and CIC-MalMem benchmark datasets, the framework achieved an accuracy above 95%, which is 4 to 7 percentage points higher than the single-modality baselines. Recall on previously unseen malware families remained above 92%, and detection took less than 50 milliseconds per sample. Ablation experiments confirmed that the static features provide robustness against packer-based obfuscation, while the dynamic temporal attributes improve recognition of metamorphic malware. The current system remains limited against sandbox-evasion (anti-sandbox) techniques. Future work could combine reinforcement learning to adjust the sandbox analysis depth dynamically and introduce contrastive learning to improve the discriminative power of the graph embeddings.
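The two-branch design described above can be illustrated with a minimal, dependency-free Python sketch. It is not the authors' implementation: the abstract does not give layer sizes, the pooling method, or the form of the fusion module, so the single-head attention layer, mean pooling, concatenation fusion, sigmoid output, toy graphs, and random untrained weights below are all assumptions chosen for brevity. The sketch also uses plain (homogeneous) adjacency matrices rather than the heterogeneous relation graphs the study describes.

```python
import math
import random

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def gcn_layer(adj, feats, weight):
    """One graph convolution: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]  # >= 1 because of the self-loops
    norm = [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]
    out = matmul(matmul(norm, feats), weight)
    return [[max(0.0, v) for v in row] for row in out]

def gat_layer(adj, feats, weight, attn):
    """One single-head graph attention layer with self-loops (no output activation)."""
    n = len(adj)
    h = matmul(feats, weight)
    out = []
    for i in range(n):
        nbrs = [j for j in range(n) if adj[i][j] or j == i]
        scores = []
        for j in nbrs:
            e = sum(a * v for a, v in zip(attn, h[i] + h[j]))  # a^T [h_i || h_j]
            scores.append(e if e > 0 else 0.2 * e)  # LeakyReLU, slope 0.2
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]  # numerically stable softmax
        total = sum(exps)
        alphas = [x / total for x in exps]
        out.append([sum(al * h[j][k] for al, j in zip(alphas, nbrs))
                    for k in range(len(h[0]))])
    return out

def mean_pool(node_feats):
    """Average node embeddings into one graph-level vector (assumed readout)."""
    return [sum(col) / len(node_feats) for col in zip(*node_feats)]

def classify(static_graph, dynamic_graph, params):
    """Static branch (GCN) + dynamic branch (GAT), fused by concatenation (assumed)."""
    s = mean_pool(gcn_layer(static_graph["adj"], static_graph["x"], params["w_gcn"]))
    d = mean_pool(gat_layer(dynamic_graph["adj"], dynamic_graph["x"],
                            params["w_gat"], params["attn"]))
    fused = s + d  # list concatenation of the two graph embeddings
    logit = sum(w * v for w, v in zip(params["w_out"], fused)) + params["b_out"]
    return 1.0 / (1.0 + math.exp(-logit))

# Hypothetical toy inputs: untrained random weights and hand-made graphs.
rng = random.Random(0)

def rand_matrix(rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

hidden = 3
params = {
    "w_gcn": rand_matrix(2, hidden),
    "w_gat": rand_matrix(2, hidden),
    "attn": [rng.uniform(-0.5, 0.5) for _ in range(2 * hidden)],
    "w_out": [rng.uniform(-0.5, 0.5) for _ in range(2 * hidden)],
    "b_out": 0.0,
}
# Static graph: 3 basic blocks in a chain, 2 made-up features per block.
static_graph = {"adj": [[0, 1, 0], [1, 0, 1], [0, 1, 0]],
                "x": [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]}
# Dynamic graph: 4 system calls in sequence, features = (call id, time delta), made up.
dynamic_graph = {"adj": [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]],
                 "x": [[0.1, 0.0], [0.4, 0.2], [0.4, 0.5], [0.9, 0.1]]}

score = classify(static_graph, dynamic_graph, params)
print(f"malware probability (untrained toy weights): {score:.3f}")
```

Because the weights are random, the printed score carries no meaning; the sketch only shows how the two graph views flow through separate branches before being fused. A practical version would likely use a graph learning library, multiple stacked layers, and trained parameters.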
Jingyu Tang (Fri,) studied this question.