This Data Descriptor introduces a multilingual news dataset about Ukraine spanning 2022–2025. It contains 120,617 articles gathered from publicly available online news sources and organized to support research on the information environment surrounding the war. The dataset includes article metadata and text fields, as well as thematic labels that help researchers study how key issues are discussed over time and across outlets. It is intended to support research on information environments and misleading narratives, as well as the analysis of trends in media coverage. The collection can also be used to develop and evaluate natural language processing methods for text classification, topic analysis, and comparative studies of reporting across languages. By providing a large, structured corpus focused on a high-impact geopolitical context, the dataset enables reproducible experiments and offers a practical foundation for researchers and practitioners interested in media analysis, disinformation studies, and information resilience.
Lipianina-Honcharenko et al. studied this question.