Introduction
One of the most promising avenues for integrating artificial intelligence (AI) into medicine is the examination, evaluation, and characterization of pathological slides. The use of large language models (LLMs), an increasingly popular class of AI model, in pathology applications remains unexplored. This study investigates the histological image recognition capabilities of the multimodal models Gemini 1.5 Flash, ChatGPT-4o, and Claude 3.5 Sonnet and assesses their suitability for clinical or medical education use.

Methods
The models were evaluated using 300 digital histology images derived from the University of South Florida Morsani College of Medicine Virtual Microscopy database, with a prompt designed to ascertain each model's ability to identify the tissue type and the plane of sectioning used. The images included the three subtypes in both longitudinal and transverse planes of sectioning. Standard machine learning metrics (precision, recall, accuracy, and F1 score) were used to evaluate each model's performance.

Results
In the prediction of tissue type, OpenAI's ChatGPT had the highest metrics, with an F1 score of 0.772, while Gemini produced an F1 score of 0.460 and Claude an F1 score of 0.380. In the prediction of plane of sectioning, Claude produced the highest F1 score at 0.472, while ChatGPT produced 0.396 and Gemini 0.344.

Conclusion
Overall, the results indicate that ChatGPT is the most effective of the three models at identifying tissue type. However, its inaccuracy in identifying the plane of sectioning, where Claude outperformed it, shows that its overall accuracy across varying tissue samples must improve before it can reliably supplement medical education or clinical use.
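As an illustration of the metrics named in the Methods, the following is a minimal sketch of how accuracy, precision, recall, and F1 score can be computed for a multi-class task such as tissue-type prediction. It is not the authors' code. The abstract does not state how per-class scores were averaged, so macro-averaging is an assumption here, and the labels "A", "B", and "C" and the toy predictions are hypothetical placeholders, not study data.

```python
from collections import Counter


def safe_div(numerator, denominator):
    """Divide, returning 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall, and F1.

    Macro-averaging (unweighted mean over classes) is an assumption;
    the abstract does not say which averaging scheme was used.
    """
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class gains a false positive
            fn[truth] += 1   # true class gains a false negative

    precisions, recalls, f1_scores = [], [], []
    for label in labels:
        precision = safe_div(tp[label], tp[label] + fp[label])
        recall = safe_div(tp[label], tp[label] + fn[label])
        precisions.append(precision)
        recalls.append(recall)
        f1_scores.append(safe_div(2 * precision * recall, precision + recall))

    n_labels = len(labels)
    return {
        "accuracy": safe_div(sum(tp.values()), len(y_true)),
        "precision": safe_div(sum(precisions), n_labels),
        "recall": safe_div(sum(recalls), n_labels),
        "f1": safe_div(sum(f1_scores), n_labels),
    }


# Hypothetical toy labels for illustration only (not data from the study).
y_true = ["A", "A", "B", "B", "C", "C"]
y_pred = ["A", "B", "B", "B", "C", "A"]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Under macro-averaging every class contributes equally to the final score regardless of how many images it has; a weighted average would instead scale each class by its image count.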
Parth Shah
Pondicherry University
David J. Boughanem
University of South Florida
John Michael Templeton
University of South Florida
Cureus
Shah et al. studied this question.
synapsesocial.com/papers/68af5407ad7bf08b1eadaf06 — DOI: https://doi.org/10.7759/cureus.90103