What question did this study set out to answer?

This study develops a Korean-language model that detects claims needing fact-checking and combines it with a large language model (LLM) and retrieval-augmented generation (RAG) to build an automated fact-checking system.

May 9, 2026

Development of a Korean Check-Worthy Claim Detection Model and RAG-Based Automated Fact-Checking System : Application to Presidential Candidate Debates

Key Points

Developed a check-worthiness detection model through cross-lingual transfer learning (translating the CLEF CheckThat! Lab 2024 English dataset) and fine-tuning KcELECTRA.
Built the automated system on LangGraph, a Python package, as a staged workflow: sentence segmentation, check-worthiness judgment, multi-source web search, LLM-based final verdict, and result integration.
Analyzed 3,919 sentences from the 21st presidential election candidate debates and identified 499 (12.7%) as needing fact-checking.
Achieved an accuracy of 89.31% and an F1 score of 82.65% on the detection model's test set.
Demonstrated 92.86% recall in detecting claims to verify, measured against 56 fact-checks by JTBC journalists.
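
The reported figures can be cross-checked with a few lines of arithmetic. The sketch below is not from the study; it recomputes the F1 score from the precision (92.05%) and recall (75.00%) given in the abstract, and the share of flagged sentences from the raw counts. The counts 52 of 56 and 41 of 52 are assumptions: they are the whole-number ratios consistent with the reported 92.86% detection recall and 78.85% verdict accuracy, not figures stated by the author.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def pct(part: int, whole: int, digits: int = 2) -> float:
    """Share of `part` in `whole`, as a rounded percentage."""
    return round(100 * part / whole, digits)


# Detection model, test set: precision and recall as reported in the abstract.
f1 = round(100 * f1_score(0.9205, 0.7500), 2)    # 82.65

# Debate analysis: 499 of 3,919 sentences flagged for fact-checking.
share = pct(499, 3919, digits=1)                 # 12.7

# Assumed counts (see note above), chosen to match the reported percentages.
detect_recall = pct(52, 56)                      # 92.86
verdict_acc = pct(41, 52)                        # 78.85

# The four verdict shares reported in the abstract should cover all flagged sentences.
verdict_total = round(34.7 + 14.4 + 7.0 + 43.9, 1)   # 100.0

print(f1, share, detect_recall, verdict_acc, verdict_total)
```

The gap between precision and recall shows that the detection model is conservative: it rarely flags a sentence wrongly but misses a quarter of the check-worthy sentences in the test set.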

Abstract

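This study developed a Korean check-worthiness detection model, combined it with large language model (LLM) and retrieval-augmented generation (RAG) technology to build an automated fact-checking system, and applied the system to the candidate debates of the 21st presidential election. The detection model was developed through cross-lingual transfer learning (using translation) on the CLEF CheckThat! Lab 2024 English dataset and fine-tuning based on KcELECTRA; on the test set it achieved an accuracy of 89.31%, an F1 score of 82.65%, a precision of 92.05%, and a recall of 75.00%. The automated system, built on LangGraph (a Python package), implements a staged workflow of sentence segmentation, judgment of the need for verification, multi-source web search (DuckDuckGo, Korean Wikipedia, Serper.dev), LLM-based final judgment, and result integration, and applies source-credibility filtering and a six-level verdict scale. In the analysis of debate statements, 499 of the 3,919 sentences (12.7%) were judged to need fact-checking; the final verdicts were 'true' or 'mostly true' for 34.7%, 'false' or 'mostly false' for 14.4%, 'half true, half false' for 7.0%, and 'judgment withheld' for 43.9%. Compared with 56 fact-checks by JTBC journalists, the system showed a recall of 92.86% in detecting claims to verify, and on the 52 cases in common it achieved an accuracy of 78.85% and a weighted F1 score of 79.04% in the final verdicts. The study's significance is that it extends English-centered research on automated fact-checking to Korean, advances prior domestic research, which had stopped at claim detection, into an integrated system that connects evidence collection and factual judgment, and releases the developed model to lay a foundation for follow-up research.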
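
The staged workflow described in the abstract can be illustrated with a minimal plain-Python sketch. This is not the author's implementation and does not use LangGraph; it only shows how the five stages hand data to each other. Every name here (`Result`, `run_pipeline`, the three stub functions) is hypothetical, the stubs stand in for the KcELECTRA classifier, the web search, and the LLM, and the English verdict labels are translations of the six labels named in the abstract.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, List, Optional

# Six verdict labels, translated from the abstract (the wording is an assumption).
VERDICTS = ("True", "Mostly true", "Half true, half false",
            "Mostly false", "False", "Judgment withheld")


@dataclass
class Result:
    sentence: str
    check_worthy: bool
    evidence: List[str] = field(default_factory=list)
    verdict: Optional[str] = None


def split_sentences(text: str) -> List[str]:
    """Stage 1: naive split after sentence-final punctuation (placeholder)."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def run_pipeline(text: str,
                 classify: Callable[[str], bool],
                 search: Callable[[str], List[str]],
                 judge: Callable[[str, List[str]], str]) -> List[Result]:
    """Stages 2-5: classify each sentence, gather evidence, judge, collect."""
    results = []
    for sentence in split_sentences(text):
        result = Result(sentence, classify(sentence))
        if result.check_worthy:
            result.evidence = search(sentence)
            result.verdict = judge(sentence, result.evidence)
            if result.verdict not in VERDICTS:
                raise ValueError(f"unknown verdict: {result.verdict!r}")
        results.append(result)
    return results


# Hypothetical stubs, for demonstration only.
def stub_classify(sentence: str) -> bool:
    return any(ch.isdigit() for ch in sentence)   # stands in for the classifier


def stub_search(sentence: str) -> List[str]:
    return []                                     # stands in for web search


def stub_judge(sentence: str, evidence: List[str]) -> str:
    # Stands in for the LLM; with no evidence it withholds judgment.
    return "Judgment withheld" if not evidence else "Half true, half false"


demo = run_pipeline(
    "Unemployment fell to 3 percent. Thank you for having me.",
    classify=stub_classify, search=stub_search, judge=stub_judge,
)
for r in demo:
    print(r.check_worthy, r.verdict, "|", r.sentence)
```

Passing the three stages in as functions keeps the control flow separate from the models, so a real classifier, search client, or LLM call could replace a stub without changing `run_pipeline`.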

Authors

Jong-Hyuk Lee

Journals

Korean Journal of Journalism & Communication Studies
