This study focuses on the efficiency of current large language models (LLMs). A review of several papers on the topic shows that the efficiency limitations of current LLMs are significant problems that warrant attention from academia. The study then surveys progress on addressing these issues, explains each solution, and identifies what further development each solution needs. It examines three systems, USER-LLM, OPTIMA, and Infinite-LLM, each of which targets efficiency problems in LLMs and offers some benefit in easing them. Experimental results show that each system still has issues to be resolved in further research. The study explains the main efficiency problems in current LLMs and provides direction for further research. With more research on efficiency, computational costs and response times should decrease, improving real-time decision-making.
Taowen Qian (Thu,) studied this question.