Topic analysis helps identify knowledge flows and emerging trends in scientific research. Traditional models such as Latent Dirichlet Allocation (LDA) generate interpretable topic-term distributions but lack deep semantic representation, whereas pretrained language models such as SciBERT encode rich semantics but offer limited topic interpretability. To address this gap, this study proposes an integrated LDA and SciBERT model for topic analysis. First, the LDA model identifies the terms of each topic, capturing the underlying statistical information of topic-term associations. Second, the SciBERT model retrieves semantically similar words for these terms, complementing the statistical topic information with enriched semantic context and reducing semantic information loss; this facilitates the extraction of fine-grained topic features and enhances interpretability. Third, a popularity index and a relevance index are proposed to analyze topic characteristics and domain evolution from both static and dynamic perspectives. Empirical results on network science data show that the proposed model produces semantically rich topics that clarify the diverse applications and key issues in network science, and that it reveals an evolutionary trend of the domain from development to maturity. This research can help researchers better understand topic analysis within their own disciplines and promote innovation and the exchange of scientific knowledge.
Wang et al. studied this question.
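The second step of the pipeline, in which each LDA topic term is complemented with semantically similar words, can be illustrated with a minimal sketch. It is not the authors' implementation: the hand-written three-dimensional vectors in `EMBEDDINGS` are hypothetical stand-ins for SciBERT embeddings, the use of cosine similarity as the similarity measure is an assumption, and the names `cosine`, `expand_topic`, and `lda_topic_terms` are introduced here purely for illustration.

```python
import math

# Hypothetical toy vectors standing in for SciBERT embeddings (assumption:
# in practice these would be produced by the pretrained model).
EMBEDDINGS = {
    "network": [0.9, 0.1, 0.0],
    "graph": [0.85, 0.15, 0.05],
    "community": [0.2, 0.9, 0.1],
    "cluster": [0.25, 0.85, 0.15],
    "epidemic": [0.1, 0.2, 0.9],
    "contagion": [0.15, 0.25, 0.85],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def expand_topic(topic_terms, embeddings, top_k=1):
    """Map each topic term to its top_k most similar words outside the topic."""
    in_topic = set(topic_terms)
    expanded = {}
    for term in topic_terms:
        if term not in embeddings:
            continue  # no vector available for this term; skip it
        scored = [
            (cosine(embeddings[term], vec), word)
            for word, vec in embeddings.items()
            if word not in in_topic
        ]
        scored.sort(reverse=True)  # highest similarity first
        expanded[term] = [word for _, word in scored[:top_k]]
    return expanded


# Assumed output of the LDA step: the top terms of one topic.
lda_topic_terms = ["network", "community"]
expanded = expand_topic(lda_topic_terms, EMBEDDINGS, top_k=1)
print(expanded)
```

With these toy vectors, "network" is paired with "graph" and "community" with "cluster". A real pipeline would replace the lookup table with vectors from the pretrained model and the two-term list with the actual LDA output.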