August 22, 2024 | Open Access

Assessing the Proficiency of LLMs with Various Tasks and Evaluators

Abstract

Previous studies have typically given Large Language Models (LLMs) only one or two tasks and relied on a small number of evaluators from a single domain to judge the LLMs' answers. We assessed the proficiency of four LLMs on eight tasks and had the 32 results evaluated by 17 evaluators from diverse domains, demonstrating the importance of varied tasks and evaluators when assessing LLMs.

Kim et al. studied this question.

synapsesocial.com/papers/68e5b4f8b6db64358754e3da https://doi.org/10.3233/shti240473