We investigate the linguistic complexity and emotional valence of popular song lyrics across English (n=1491), Spanish (n=307), and German (n=225), using an analytical corpus of 2023 tracks drawn from 2113 deduplicated tracks on Spotify’s weekly Top 200 charts (2019–2021). Transformer-based sentiment analysis is combined with complexity-science tools to characterize both the affective content and the structural organization of commercially successful lyrics. A multilingual BERT model reveals a mild negative skew across all three languages (63.7% negative overall); the 1.003-point English–German gap observed under the English-centric VADER lexicon collapses to 0.127 points under BERT, indicating that earlier cross-linguistic sentiment differences are largely measurement artifacts. Word frequency distributions follow Zipf’s law in all three languages (R²>0.96), with English steepest (α=1.409) and German shallowest (α=1.181). Detrended fluctuation analysis indicates persistent long-range correlations (H≈0.66–0.76; none of the 50 shuffled surrogates exceeded the observed values), and multifractal singularity spectra are statistically indistinguishable across languages once corpus size is controlled (all pairwise Mann–Whitney p>0.13). Streaming counts within the Top 200 are concentrated (German Gini=0.556) but, given the truncated single-snapshot sample, are reported as within-chart descriptors rather than population-level scaling.
Khanipour et al. studied this question.
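Two of the descriptors quoted in the abstract, the Zipf exponent α with its R² and the Gini coefficient of streaming counts, can be illustrated with a minimal sketch. The abstract does not say how α was fitted, so this sketch assumes ordinary least squares on log rank versus log frequency, a common but simple choice that can differ from maximum-likelihood estimates. The function names `zipf_fit` and `gini` are hypothetical and are not taken from the study.

```python
import math
from collections import Counter


def zipf_fit(tokens):
    """Estimate a Zipf exponent alpha and R^2 from a token list.

    Assumption: OLS fit of log(frequency) on log(rank); the source does
    not state its fitting procedure, so this is only one plausible choice.
    """
    freqs = sorted(Counter(tokens).values(), reverse=True)
    if len(freqs) < 2:
        raise ValueError("need at least two distinct tokens")
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    # R^2 is undefined when every frequency is identical (syy == 0).
    r2 = (sxy * sxy) / (sxx * syy) if syy > 0 else float("nan")
    return -slope, r2  # Zipf's law: frequency ~ rank ** (-alpha)


def gini(values):
    """Gini coefficient of non-negative values (0 = equal, near 1 = concentrated)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total <= 0:
        raise ValueError("need at least one value and a positive total")
    # Standard rank-weighted form: G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


if __name__ == "__main__":
    # Toy inputs only; these are not data from the study.
    lyrics = "la la la love love you you you you baby".split()
    print(zipf_fit(lyrics))
    print(gini([120, 95, 40, 12, 5]))
```

Because the chart is truncated at 200 entries, a Gini value computed this way describes concentration within the chart only, which matches the abstract's caution against reading it as population-level scaling.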