The preservation of Intangible Cultural Heritage (ICH) is hampered by the difficulty of managing large volumes of unstructured digital audio. Existing Music Information Retrieval (MIR) systems often underperform in this domain because they are optimised for commercial music. This paper evaluates feature extraction for seven traditional Portuguese instruments using 1734 field recordings from the “A Música Portuguesa a Gostar Dela Própria” (MPAGDP) archive. We apply strict session-based stratification to prevent data leakage and compare the performance of YAMNet, VGGish, and OpenL3 embeddings. OpenL3 provides the highest average timbral resolution (mean frame-level accuracy: 91.13% ± 3.31; Macro F1: 0.884 ± 0.031), but the results reveal a significant performance trade-off between models. Quantitative stress tests show that OpenL3 is the most resilient to synthetic Gaussian noise (78.5% accuracy at 5 dB SNR), making it well suited to high-precision archival digitisation. Conversely, YAMNet excels in uncontrolled fieldwork with vocal interference (94.7% accuracy), offering a more robust filter for non-musical semantic noise. An architectural ablation study further supports the choice of a Dense MLP classifier, showing that simpler heads outperform sequential models (LSTM or Transformer) in these low-resource contexts. These findings offer a flexible technical roadmap: OpenL3 is recommended for institutional repositories requiring maximum resolution, while YAMNet is better suited to mobile devices or environments with high vocal overlap, providing a robust solution for preserving regional musical memory.
• A novel audio dataset of traditional Portuguese instruments stratified by recording session.
• Comparison of YAMNet, VGGish, and OpenL3 embeddings for heritage audio classification.
• OpenL3 achieves the best performance (mean 91.13% accuracy) in discerning similar string instruments across eight randomised seeds.
• A statistically robust session-based split methodology, validated over eight independent iterations, that prevents the data leakage common in small folk music datasets.
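To illustrate what a session-based split involves, the following is a minimal sketch and not the authors' implementation. It assumes each clip is described by a hypothetical `(session_id, label)` pair, that every class has at least two recording sessions, and that a 20% test fraction is acceptable; the instrument names are placeholders. Whole sessions are assigned to either the training or the test subset, so frames from one recording session can never appear on both sides.

```python
import random
from collections import defaultdict


def session_split(records, test_fraction=0.2, seed=0):
    """Split (session_id, label) records so that no session spans both subsets.

    Sessions are sampled per class, so every class is represented in the
    test subset. Assumes each class has at least two sessions.
    """
    sessions_by_class = defaultdict(set)
    for session_id, label in records:
        sessions_by_class[label].add(session_id)

    rng = random.Random(seed)
    test_sessions = set()
    for label in sorted(sessions_by_class):
        sessions = sorted(sessions_by_class[label])
        rng.shuffle(sessions)
        n_test = max(1, round(len(sessions) * test_fraction))
        test_sessions.update(sessions[:n_test])

    train = [r for r in records if r[0] not in test_sessions]
    test = [r for r in records if r[0] in test_sessions]
    return train, test


# Hypothetical toy data: 3 placeholder classes, 5 sessions each, 3 clips per session.
records = [
    (f"{label}_session{s}", label)
    for label in ("instr_a", "instr_b", "instr_c")
    for s in range(5)
    for _ in range(3)
]

# Repeat the split over eight seeds and check that sessions never overlap.
for seed in range(8):
    train, test = session_split(records, test_fraction=0.2, seed=seed)
    train_sessions = {session_id for session_id, _ in train}
    test_sessions = {session_id for session_id, _ in test}
    assert train_sessions.isdisjoint(test_sessions)
```

A random clip-level split of the same data would typically place clips from one session in both subsets, which is the leakage the session-level grouping avoids.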
González et al. (Sat,) studied this question.
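The 5 dB SNR stress test mentioned in the abstract relies on mixing Gaussian noise into a signal at a fixed signal-to-noise ratio. A minimal sketch of that standard construction follows; it is an assumption about how such a test can be set up, not the authors' code. The noise variance is chosen as the signal power divided by 10^(SNR/10). The function name, the 440 Hz test tone, and the 16 kHz sample rate are illustrative choices.

```python
import math
import random


def add_gaussian_noise(signal, snr_db, seed=0):
    """Return a copy of `signal` with white Gaussian noise at the target SNR (dB)."""
    rng = random.Random(seed)
    signal_power = sum(x * x for x in signal) / len(signal)
    # SNR_dB = 10 * log10(P_signal / P_noise)  =>  P_noise = P_signal / 10^(SNR_dB / 10)
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in signal]


# Illustrative input: one second of a 440 Hz sine tone sampled at 16 kHz.
sample_rate = 16000
clean = [math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(sample_rate)]
noisy = add_gaussian_noise(clean, snr_db=5.0)

# Verify the SNR actually achieved by measuring the power of the added noise.
noise = [y - x for x, y in zip(clean, noisy)]
clean_power = sum(x * x for x in clean) / len(clean)
measured_noise_power = sum(n * n for n in noise) / len(noise)
measured_snr_db = 10 * math.log10(clean_power / measured_noise_power)
```

Because the noise is random, the measured SNR only approximates the target; for a clip of this length the deviation is a small fraction of a decibel.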