Risk disclosures play a crucial role in investors' decision‐making. However, extracting relevant variables from unstructured financial text is a nontrivial challenge. In this paper, we propose RiskBERT, a large language model (LLM) trained on financial texts and risk knowledge graphs and designed specifically for risk factor extraction. By incorporating both finance and risk knowledge, RiskBERT significantly improves the extraction of risk factors from financial texts. We evaluate RiskBERT on a labeled risk factor dataset comprising 119,153 sentences from 2,400 Chinese A‐listed companies and compare it against other LLMs and automated text analysis algorithms on risk type classification. Our findings show that RiskBERT outperforms the alternative models, particularly when the training sample is small. Moreover, we find that RiskBERT yields estimates of risk informativeness in annual reports that are at least 4.5% higher than those derived from other models. These results highlight the value of RiskBERT as a tool for extracting risk factors and enhancing risk analysis in the finance and accounting domains.
Liu et al. studied this question.