What type of study is this?

This is an experimental study.

September 19, 2025 · Open Access

Digital innovations in historical climatology: Classifying weather and climatic extremes and their impacts on societies using machine learning on written documents

Key Points

Applying machine learning to historical climatology enables faster classification of weather extremes described in written sources.
Results show that digital tools can improve resource efficiency, with notable differences in accuracy, runtime, and memory demands among the methods compared.
Utilizing the tambora.org corpus, the study emphasizes modern data extraction techniques for large, unstructured textual data.
Digital innovations, particularly AI, may revolutionize how researchers interpret and classify climate data from historical documents.

Abstract

This article explores how digital innovations – particularly machine learning and natural language processing – can streamline and enhance workflows in historical climatology. Traditionally reliant on time-consuming manual analysis of historical documents, the field now benefits from modern digital tools at each research stage, from source discovery to publication. Focusing on classifying large, unstructured textual data, the study examines methods ranging from manual keyword searches and Bayesian models to advanced large language models. Using the tambora.org corpus, it extracts and categorizes references to weather extremes like thunderstorms and heavy rainfall and their impacts on mobility. The paper compares these approaches in terms of accuracy, resource demands such as runtime performance and memory, and their ability to interpret historical language. It argues that digital methods – especially AI – can transform the extraction and classification of climate data from historical texts, offering significant advantages by assisting researchers in historical climatology.
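
To make the contrast between two of the method families named above more concrete, here is a minimal sketch in Python of a keyword search baseline next to a small multinomial naive Bayes classifier. It is an illustration under stated assumptions, not the paper's implementation: the keyword lists, the label names (`thunderstorm`, `heavy_rain`), and the training sentences are all invented, and nothing here reflects the actual tambora.org data format.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical keyword lists; not taken from the paper or from tambora.org.
KEYWORDS = {
    "thunderstorm": {"thunder", "lightning", "thunderstorm"},
    "heavy_rain": {"rain", "downpour", "flood"},
}


def tokenize(text):
    """Lowercase a text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def keyword_classify(text):
    """Return the label whose keyword set matches the most tokens, or None."""
    tokens = tokenize(text)
    scores = {
        label: sum(tok in words for tok in tokens)
        for label, words in KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for tok in tokenize(text):
                if tok in self.vocab:  # ignore words never seen in training
                    score += math.log((counts[tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy records, for illustration only.
TRAIN = [
    ("violent thunder and lightning struck the church tower", "thunderstorm"),
    ("a thunderstorm with lightning set the barn on fire", "thunderstorm"),
    ("continuous rain flooded the roads and stopped the carts", "heavy_rain"),
    ("a great downpour made the bridge impassable", "heavy_rain"),
]

model = NaiveBayes().fit(*zip(*TRAIN))
```

Calling `keyword_classify("Lightning struck the mill")` and `model.predict("the roads were flooded after the rain")` exercises both approaches. The keyword baseline only matches exact tokens (here "flooded" does not match the keyword "flood"), while the Bayesian model learns word weights from labelled examples. To compare resource demands in a similar spirit, the standard library's `time.perf_counter` and `tracemalloc` could be wrapped around each call.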
