Abstract

Online discussions are often filled with hostile and uncivil comments. While human and automated moderation have been employed to reduce incivility, such approaches are criticized for limiting free expression. This study tested (a) whether artificial intelligence (AI)-mediated communication could encourage self-moderation while preserving human agency and (b) whether the results of self-moderation are perceived by others. In Study 1 (N = 421), Korean adults read a contentious online discussion thread and contributed responses after receiving either an AI-generated sentiment score or general feedback. Results showed that 20.67% of participants (n = 87) revised their original comment, and providing a real-time AI sentiment score increased the revision likelihood compared to general feedback. Among revisers, lower AI scores predicted more positive sentiment change. These effects held regardless of participants’ preexisting favorability toward AI. Study 2 (N = 348) asked third-party observers, unaware of the feedback intervention, to evaluate comments and discussions from Study 1. Revised comments were perceived as more positive in sentiment. Discussions containing comments revised following AI score feedback were perceived as having lower conflict intensity compared to those with unrevised comments, whereas this difference did not emerge for general feedback. The findings highlight AI’s potential in promoting user-driven self-moderation without external enforcement and its cascading positive effects on third-party observers.
Shin et al. studied this question.