What question did this study set out to answer?

The aim is to improve lexicostatistical methods to better understand historical divergence and language contact in Ainu dialects.

April 22, 2026 · Open Access

From Classification to History: Examining New Lexicostatistical Methodologies in Using the Data in Ainu Dialects

Key Points

Aims to improve lexicostatistical methods to better understand historical divergence and language contact in Ainu dialects.
Develops a novel data-preparation technique for lexicostatistics.
Conducts a homogeneity analysis on two datasets: one with lexical-item data only, and another combining lexical-item data with regularity data.
Visualizes results through 3D interactive graphs to show linguistic distributions.
The combined dataset indicates historical language contact among Asahikawa, Nayoro, and Soya dialects.
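Lexical-item data reveals a distinct north–south division in Sakhalin dialects: the northern group resembles the Saru-Chitose dialects of southwestern Hokkaido, and the southern group resembles northeastern Hokkaido dialects.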
Demonstrates a previously unrecognized A–B–A geolinguistic distribution in Sakhalin.

Abstract

This study proposes a novel data-preparation method for lexicostatistics that may help uncover historical divergence and language contact in Ainu dialects. It addresses a methodological limitation of conventional approaches, which count recurring regularities redundantly and thereby introduce statistical bias into previous lexical-item data. Our method first systematically extracts almost all potential regularity data. It then extracts lexical-item data as the linguistic information not captured by the regularity data, which prevents the artificial inflation of specific patterns seen in previous data. Our revised homogeneity analysis is performed on two datasets: one consisting solely of lexical-item data and another combining lexical-item data with regularity data. The quantification results for Ainu dialects are visualized as 3D interactive graphs using HTML5. The visualization of the combined dataset positions the Asahikawa, Nayoro, and Soya dialects near the coordinate origin, which represents the “average” characteristics, suggesting historical language contact and mixture in these dialects. Conversely, the visualization of the lexical-item data reveals a clear north–south division in Sakhalin dialects: the northern group is similar to the Saru-Chitose dialects of southwestern Hokkaido, and the southern group is similar to northeastern Hokkaido Ainu dialects. This demonstrates an A–B–A geolinguistic distribution that had not been recognized before our analyses. These findings show that our framework can integrate geolinguistic and historical-linguistic perspectives: the former aligns with the lexical-item data in our data-preparation method, and the latter corresponds to the regularity data. Thus, our data-preparation and quantification methods will shift the focus of lexicostatistics from classification back to history, its original interest.
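The two-step data preparation described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes each observation has already been tagged with the recurring regularity it instantiates, if any. The record layout, the function names `prepare` and `indicator_matrix`, and the placeholder labels (`D1`, `c1`, `v1`, `R1`) are all hypothetical, and the homogeneity analysis itself is not shown. The sketch only shows how counting each regularity once per dialect, and treating the remaining observations as lexical-item data, yields the two indicator matrices (lexical-only and combined) that such an analysis could take as input.

```python
from collections import defaultdict

def prepare(records):
    """Split observations into regularity features and residual lexical-item features.

    Each record is (concept, dialect, variant, rule); rule is None when no
    recurring regularity accounts for the variant. All names are hypothetical.
    """
    regularity = defaultdict(set)  # dialect -> regularities, each stored once
    lexical = defaultdict(set)     # dialect -> (concept, variant) not covered by a rule
    for concept, dialect, variant, rule in records:
        if rule is not None:
            # Step 1: a recurring regularity is recorded once per dialect,
            # no matter how many words exhibit it.
            regularity[dialect].add(("reg", rule))
        else:
            # Step 2: whatever the regularities do not explain becomes
            # lexical-item data.
            lexical[dialect].add(("lex", concept, variant))
    return dict(regularity), dict(lexical)

def indicator_matrix(*feature_maps):
    """Build a dialect-by-feature 0/1 matrix from one or more feature maps."""
    dialects = sorted({d for fm in feature_maps for d in fm})
    merged = {
        d: set().union(*(fm.get(d, set()) for fm in feature_maps))
        for d in dialects
    }
    columns = sorted(set().union(*merged.values()))
    rows = [[int(c in merged[d]) for c in columns] for d in dialects]
    return dialects, columns, rows

# Placeholder observations, not real Ainu data.
records = [
    ("c1", "D1", "v1", "R1"),
    ("c2", "D1", "v1", "R1"),  # same regularity again: adds no extra weight
    ("c3", "D1", "v1", None),
    ("c1", "D2", "v2", None),
    ("c2", "D2", "v1", "R2"),
    ("c3", "D2", "v1", None),
]

regularity, lexical = prepare(records)
lex_only = indicator_matrix(lexical)              # dataset 1: lexical items only
combined = indicator_matrix(lexical, regularity)  # dataset 2: lexical items + regularities
```

In this toy input, regularity `R1` appears in two words of `D1` but contributes a single column entry, which is the kind of redundant counting the abstract says conventional approaches suffer from.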
