The growing fragmentation of digital evidence in modern computing environments poses significant challenges for digital forensic analysis. Data is often deleted, overwritten, or distributed across heterogeneous platforms, which limits the effectiveness of traditional forensic tools that rely on intact files and deterministic rules. This work addresses a key limitation of current forensic methodologies: the scarcity of learning-based approaches capable of identifying patterns in fragmented and incomplete digital evidence. We propose PatternMiner, a hybrid deep learning framework that integrates Convolutional Neural Networks (CNNs), Long Short-Term Memory (LSTM) networks, and Transformer encoders. The framework combines byte-level content fragments with contextual metadata, such as timestamps and file permissions, enabling multimodal inference from fragmented data while explicitly excluding label-derived features to prevent leakage. PatternMiner is evaluated on established forensic benchmark datasets, including Digital Corpora and AFF4 forensic containers, which simulate realistic fragmentation scenarios. All experiments follow an explicit leakage-controlled evaluation protocol with group-aware data partitioning, so that reported performance reflects generalization to unseen data. Under complete input conditions, the framework achieves an accuracy of 92.1% and a macro-averaged F1-score of 92.1%. The model also remains resilient to degraded and partially corrupted inputs, including truncation, byte removal, shifting, and fragment reordering. These findings indicate that PatternMiner captures structural and contextual patterns in fragmented data, providing a practical step toward more reliable, data-driven forensic analysis.
By combining multimodal learning with rigorous evaluation practices, the proposed framework contributes to developing scalable and generalizable solutions for modern digital forensic environments.
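To make the evaluation protocol concrete, the following is a minimal Python sketch of group-aware partitioning and of the four input degradations named above (truncation, byte removal, shifting, and fragment reordering). It is an illustration under stated assumptions, not the PatternMiner implementation: all function names and parameters are hypothetical, the grouping key is assumed to be a source identifier such as a disk image or file, and shifting is assumed to be a circular byte shift, which the text does not specify.

```python
import random
from collections import defaultdict


def group_aware_split(samples, test_fraction=0.2, seed=0):
    """Split (group_id, fragment) pairs so that no group spans both sets.

    Assumption: group_id identifies the fragment's source (e.g. a file or image).
    """
    groups = defaultdict(list)
    for group_id, fragment in samples:
        groups[group_id].append(fragment)
    ids = sorted(groups)
    random.Random(seed).shuffle(ids)
    n_test = max(1, int(len(ids) * test_fraction))
    test_ids = set(ids[:n_test])
    train = [(g, f) for g in ids if g not in test_ids for f in groups[g]]
    test = [(g, f) for g in ids if g in test_ids for f in groups[g]]
    return train, test


def truncate(fragment: bytes, keep: float) -> bytes:
    """Keep only the leading fraction `keep` of the fragment."""
    return fragment[: int(len(fragment) * keep)]


def remove_bytes(fragment: bytes, drop: float, rng: random.Random) -> bytes:
    """Drop each byte independently with probability `drop`."""
    return bytes(b for b in fragment if rng.random() >= drop)


def shift(fragment: bytes, offset: int) -> bytes:
    """Circularly shift the fragment left by `offset` bytes (assumed semantics)."""
    if not fragment:
        return fragment
    offset %= len(fragment)
    return fragment[offset:] + fragment[:offset]


def reorder(fragment: bytes, block: int, rng: random.Random) -> bytes:
    """Cut the fragment into fixed-size blocks and shuffle their order."""
    blocks = [fragment[i:i + block] for i in range(0, len(fragment), block)]
    rng.shuffle(blocks)
    return b"".join(blocks)
```

In a setup like this, the degradations would be applied only to held-out fragments at evaluation time, so that robustness is measured on groups the model has never seen.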
Sanjalawe et al. (Sat,) studied this question.