As the volume of text data grows, topic modeling has become an important tool in natural language processing. It is applied across many domains, from scientific research articles to social media posts and newspaper headlines, to discover the latent themes in a corpus. This article provides a comprehensive review of topic modeling techniques from their origin to the present. It reviews the strengths and weaknesses of techniques such as non-negative matrix factorization, latent Dirichlet allocation, latent semantic analysis, probabilistic latent semantic analysis, Top2Vec, and BERTopic. A concise summary of recent studies in healthcare, bioinformatics, scientific research articles, social media platforms, and legal domains is also presented. Quantitative and qualitative evaluation metrics are discussed to better characterize the performance of these techniques. Finally, a brief discussion of existing challenges and prospects offers researchers insight into further advancements in topic modeling.
Kumari et al. studied this question.