This dataset contains data for testing machine translation and topic classification in Piedmontese. It is based on FLORES+ (NLLB Team et al., 2024) and SIB-200: A Simple, Inclusive, and Big Evaluation Dataset for Topic Classification in 200+ Languages and Dialects (Adelani et al., EACL 2024).
Vico et al. (Thu,) studied this question.