The relationship between naming traditions and geography is a complex nexus encompassing historical, sociocultural, and genetic dimensions. While qualitative scholarship has long explored these facets, novel quantitative methods allow latent geospatial patterns to be discovered at scale. This study applies computational techniques to a dataset of Türkiye's most prevalent baby names to identify distinct spatiotemporal practices and sociocultural dynamics. We extend our previous demographic analysis framework to investigate naming trends across both geospatial and temporal scales. Using unsupervised machine learning, specifically clustering analysis with K-means, K-medoids, Gaussian mixture models (GMM), and spectral clustering, we demonstrate that name distributions are not stochastic but exhibit patterns driven by geographic heterogeneity. Furthermore, by integrating clustering with principal component analysis (PCA) on provincial name distributions, we show that names with similar cultural connotations (e.g., nature-related names) cluster together, revealing "geo-cultural embeddings": a novel representation of how cultural preferences are spatially encoded in naming practices. Finally, to highlight the utility of naming data as a proxy for broader social science research, we validate a significant association between these identified cultural clusters and regional political preferences.
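The clustering-plus-PCA pipeline described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the study's implementation: the province-by-name share matrix is synthetic (81 rows, matching the number of provinces in Türkiye, with three made-up regional groups), the number of clusters and all parameter values are arbitrary, and scikit-learn is an assumed tool choice. K-medoids is omitted because it is not part of scikit-learn itself.

```python
import numpy as np
from sklearn.cluster import KMeans, SpectralClustering
from sklearn.decomposition import PCA
from sklearn.metrics import adjusted_rand_score
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Synthetic stand-in for the real data (assumption): 81 provinces x 15 names,
# where each row holds the share of births given each name in that province.
# Three hypothetical regional groups each favour a different block of 5 names.
n_groups, per_group, n_names = 3, 27, 15
true_groups = np.repeat(np.arange(n_groups), per_group)
shares = np.empty((n_groups * per_group, n_names))
for g in range(n_groups):
    alpha = np.ones(n_names)
    alpha[g * 5:(g + 1) * 5] = 10.0  # names this group prefers
    shares[true_groups == g] = rng.dirichlet(alpha * 5.0, size=per_group)

# Project provincial name distributions onto two principal components.
embedding = PCA(n_components=2).fit_transform(shares)

# Cluster the embedded provinces with three of the methods named in the text.
models = {
    "kmeans": KMeans(n_clusters=3, n_init=10, random_state=0),
    "gmm": GaussianMixture(n_components=3, random_state=0),
    "spectral": SpectralClustering(
        n_clusters=3, affinity="nearest_neighbors", n_neighbors=10, random_state=0
    ),
}
labels = {name: model.fit_predict(embedding) for name, model in models.items()}

# Compare each clustering with the synthetic ground-truth groups.
for name, lab in labels.items():
    print(name, adjusted_rand_score(true_groups, lab))
```

With real data, the rows of `shares` would come from provincial name counts normalised to proportions, and the cluster labels could then be mapped back onto provinces to inspect their geographic layout.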
Emre Öner Tartan studied this question.