This paper investigates the human-like communication abilities of modern language models, comparing several open-source and proprietary systems. As LLMs are increasingly deployed in socially interactive roles, ranging from digital companions to mental health support tools, their ability to engage users naturally and expressively has become a critical yet underexplored dimension of evaluation. Traditional benchmarks tend to emphasize accuracy or reasoning and fail to capture the nuanced, subjective traits that define human conversation. To address this gap, seven LLMs were tested in both short and sustained dialogues, and five human raters scored their responses with a multi-trait rubric covering five human-oriented communication traits: naturalness, empathy, creativity, adaptability, and humor/personality. Results show significant variation across systems, with some models matching or exceeding human performance in specific areas. LLaMA 3.2 emerged as a standout, occasionally outperforming human responses in personality and creativity. These findings suggest that conversational quality may depend more on tuning and stylistic freedom than on model scale alone.
Ruixiang Liu
Transactions on Computer Science and Intelligent Systems Research
Ruixiang Liu studied this question.
www.synapsesocial.com/papers/68af55ccad7bf08b1eadc2a4 — DOI: https://doi.org/10.62051/yygprz73