This paper evaluates the potential of large language models (LLMs) such as GPT to accurately identify verbal emotional cues and perform qualitative coding, with specific emphasis on political communication and discourse in Japan. It assesses ChatGPT’s capability to distinguish messages containing derogatory expressions during deliberations in the Japanese Parliament and to classify the targets of these expressions. Furthermore, the study compares the analytical techniques and results produced by ChatGPT with those obtained through human coding, to ascertain the potential advantages and limitations of AI-assisted qualitative data analysis within the context of political discourse. The findings show partial convergence in coding outcomes, especially when derogatory cues are explicit and targets are clearly specified. We interpret this convergence as output-level similarity under defined conditions rather than as evidence of human-like interpretive calibration, contextual grounding, or metacognitive monitoring.
Kinoshita et al. (Sun,) studied this question.