In today's era of enormous text data, topic modeling is emerging as a revolutionary tool in natural language processing. From scientific research articles to social media posts and newspaper headlines, topic modeling is employed across several domains to discover the latent themes in a corpus. This article provides a comprehensive review of topic modeling techniques from their origin to the present. The effectiveness of different techniques, such as non‐negative matrix factorization, latent Dirichlet allocation, latent semantic analysis, probabilistic latent semantic analysis, Top2Vec, and BERTopic, is reviewed to highlight their strengths and weaknesses. A concise summary of recent studies in healthcare, bioinformatics, scientific research articles, social media platforms, and legal domains is presented. Quantitative and qualitative evaluation metrics are discussed to better understand the performance of these techniques. Finally, a brief discussion of existing challenges and prospects is included, giving researchers insight into further advancements in topic modeling.
Kumari et al. studied this question.
synapsesocial.com/papers/68a35ef30a429f79733281a1 — DOI: https://doi.org/10.1002/aisy.202400528
Pratima Kumari
University of California, San Francisco
Sachin Kadian
North Carolina State University
Mehek Vora
North Carolina State University
Advanced Intelligent Systems
Dr. B. R. Ambedkar National Institute of Technology Jalandhar