Finding relevant topics and extracting useful information from large corpora has been challenging for academics. Topic modeling, a machine learning technique, has emerged as an alternative approach for discovering the underlying semantic structure of large, unstructured collections of documents. Our objectives are to identify the topics covered in the corpus, group the documents by topic, show the development of research across different aspects of library and information science (LIS), and demonstrate the application and use of theories from other domains in LIS. We apply Latent Dirichlet Allocation (LDA) with several open-source tools, including Gensim, Jupyter Notebook, ASReview, and OpenRefine, to extract key topics from titles and abstracts. The results of this study are summarized in four main sets: identification of specific topics, word clouds, trends in subjects, and the use and application of theories in this domain. The model may help policymakers, funding agencies, and the government understand the current and future state of research and take corrective action to address gaps in the literature on expert systems and applications. It may also help library professionals, classificationists, and researchers identify relevant topics in long, unstructured texts and reduce information overload by filtering out irrelevant research documents.
Roy et al. studied this question.
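To illustrate how LDA assigns the words of titles and abstracts to topics, the following is a minimal sketch and not the pipeline used in the study. The study relies on Gensim; to stay dependency-free, this sketch swaps in a small collapsed Gibbs sampler written with the Python standard library only, which is a different inference method from the one Gensim uses. The toy records in DOCS, the function names fit_lda and top_words, and the values of num_topics, alpha, beta, and iterations are all assumptions chosen for illustration.

```python
import random
from collections import Counter

# Hypothetical toy records standing in for titles and abstracts.
DOCS = [
    "library classification cataloguing metadata library",
    "metadata cataloguing library classification records",
    "neural network training deep learning model",
    "deep learning model neural network accuracy",
]


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def fit_lda(docs, num_topics=2, alpha=0.1, beta=0.01, iterations=200, seed=0):
    """Fit LDA by collapsed Gibbs sampling; hyperparameters are illustrative."""
    rng = random.Random(seed)
    tokens = [tokenize(d) for d in docs]
    vocab_size = len({w for doc in tokens for w in doc})

    doc_topic = [[0] * num_topics for _ in tokens]       # topic counts per document
    topic_word = [Counter() for _ in range(num_topics)]  # word counts per topic
    topic_total = [0] * num_topics                       # token counts per topic
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(tokens):
        z_doc = []
        for w in doc:
            z = rng.randrange(num_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assignments.append(z_doc)

    for _ in range(iterations):
        for d, doc in enumerate(tokens):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assignments[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Resample its topic from the collapsed conditional distribution.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1

    return doc_topic, topic_word


def top_words(topic_word, topic, n=3):
    """Return up to n of the most frequent words assigned to a topic."""
    return [w for w, c in topic_word[topic].most_common(n) if c > 0]


doc_topic, topic_word = fit_lda(DOCS)
for k in range(len(topic_word)):
    print(f"topic {k}:", top_words(topic_word, k))
for d, counts in enumerate(doc_topic):
    print(f"document {d}: topic counts {counts}")
```

The per-topic word lists are the kind of output from which topic labels and word clouds could be derived, and the per-document topic counts are what allow documents to be grouped by dominant topic. On a real corpus, a maintained implementation such as Gensim's would normally be preferred, together with stop-word removal and a principled choice of the number of topics.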