Background: Autism spectrum disorder (ASD) is characterized by impairments in emotion recognition and regulation. While natural language processing (NLP) offers a systematic approach to estimating emotions from clinical narratives, the latent dimensional structure of these estimated emotions in ASD remains insufficiently characterized using methodologically rigorous frameworks.

Objective: This exploratory study aimed to characterize the dimensional structure of model-estimated emotions in adolescents with ASD using a fine-tuned Japanese BERT model, while explicitly accounting for domain shift and compositional data constraints.

Methods: A Japanese BERT model (tohoku-nlp/bert-large-japanese-v2) was fine-tuned on the WRIME dataset (social media posts). The model was applied to 1,239 clinical sessions from 14 adolescents with ASD to generate eight-dimensional emotion probability vectors. To address the compositional nature of softmax outputs and the non-independence of repeated sessions, we performed a centered log-ratio (CLR) transformation followed by within-patient centering. Principal component analysis (PCA), with patient-level bootstrapping, was conducted to identify robust dimensions. In-domain validation was performed using 150 manually annotated clinical snippets.

Results: The model achieved 81.3% accuracy on the out-of-domain test set, though in-domain validation revealed performance variability across emotion categories (e.g., higher reliability for sadness than for trust). PCA identified two primary dimensions: PC1 (57.8% variance), characterized by a dominant sadness-driven internalizing axis, and PC2 (14.2% variance), representing a contrast between anticipation and aversive emotions (disgust/anger). These dimensions remained stable across bootstrap resampling.

Conclusions: This study demonstrates that transformer-based NLP, when combined with rigorous compositional data analysis, can elucidate latent emotional structures within clinical narratives. However, these patterns reflect a model-mediated representation space influenced by clinical documentation practices and domain-specific model characteristics. While providing a novel quantitative framework for psychiatric NLP, our findings emphasize the necessity of in-domain validation and cautious clinical interpretation.
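The preprocessing and analysis steps named in the Methods (CLR transformation, within-patient centering, PCA, patient-level bootstrapping) can be illustrated with a minimal NumPy sketch. This is not the authors' code: the function names, the `eps` clipping constant, the use of absolute cosine similarity between PC1 loadings as a stability measure, and the synthetic Dirichlet data are all assumptions made for illustration only.

```python
import numpy as np

def clr(probs, eps=1e-6):
    """Centered log-ratio transform of probability rows (eps is an assumed floor)."""
    p = np.clip(probs, eps, None)
    p = p / p.sum(axis=1, keepdims=True)      # re-close the composition after clipping
    logp = np.log(p)
    return logp - logp.mean(axis=1, keepdims=True)

def center_within_patient(x, patient_ids):
    """Subtract each patient's own mean vector from that patient's sessions."""
    out = np.empty_like(x)
    for pid in np.unique(patient_ids):
        mask = patient_ids == pid
        out[mask] = x[mask] - x[mask].mean(axis=0)
    return out

def pca(x, n_components=2):
    """PCA via SVD; returns loadings (rows) and explained-variance ratios."""
    xc = x - x.mean(axis=0)
    _, s, vt = np.linalg.svd(xc, full_matrices=False)
    var = s ** 2
    return vt[:n_components], (var / var.sum())[:n_components]

def bootstrap_pc1(probs, patient_ids, n_boot=200, seed=0):
    """Resample whole patients with replacement; compare PC1 to the full-sample PC1."""
    rng = np.random.default_rng(seed)
    patients = np.unique(patient_ids)
    ref, _ = pca(center_within_patient(clr(probs), patient_ids))
    sims = []
    for _ in range(n_boot):
        sample = rng.choice(patients, size=len(patients), replace=True)
        idx = np.concatenate([np.flatnonzero(patient_ids == p) for p in sample])
        comps, _ = pca(center_within_patient(clr(probs[idx]), patient_ids[idx]))
        sims.append(abs(comps[0] @ ref[0]))   # sign of a component is arbitrary
    return np.array(sims)

# Synthetic stand-in data (NOT study data): 14 patients, 20 sessions each, 8 emotions.
rng = np.random.default_rng(42)
patient_ids = np.repeat(np.arange(14), 20)
probs = rng.dirichlet(np.ones(8), size=patient_ids.size)

z = center_within_patient(clr(probs), patient_ids)
loadings, explained = pca(z)
stability = bootstrap_pc1(probs, patient_ids, n_boot=20)
```

Resampling at the patient level rather than the session level keeps all sessions from one adolescent together, which is one way to respect the non-independence of repeated sessions that the Methods mention.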
Minoru KANNO
Fukushima University
Yuka Yoshida
Yamagata University
Momoko Fujihashi
Yamagata University
Cureus
KANNO et al. studied this question.
synapsesocial.com/papers/69dc89183afacbeac03eae04 — DOI: https://doi.org/10.7759/cureus.106830