Rapid advances in deep learning have led to the emergence of large language models (LLMs), which play a central role in understanding and processing natural language. These models can capture the structural complexity of language and generate human-like responses, and they have been applied successfully in interactive applications such as chatbots. However, the high licensing fees and intensive hardware requirements of commercial language models have increased interest in open-source alternatives. In this study, the open-source models DeepSeek-R1 and LLaMA 3.1 are used to achieve high throughput with lower resource requirements. To improve their performance, the models are fine-tuned using the LoRA (low-rank adaptation) method, and a knowledge-retrieval-based architecture is integrated to improve performance further. The resulting models are compared with the commercial Claude 3.7 model on two separate datasets based on ChatGPT and Gemini. The results show that open-source models with fewer parameters, given appropriate fine-tuning and knowledge retrieval support, can compete with large-scale commercial models. Open-source models combined with retrieval-augmented generation therefore stand out as an effective alternative in hardware-constrained environments with limited resources.
Karakaş et al. (Mon,) studied this question.
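
The abstract's claim that LoRA fine-tuning lowers resource requirements can be illustrated with a small parameter-count sketch. It is not taken from the study: the layer size (4096 x 4096) and rank (8) are hypothetical values chosen only for illustration, and the function names are invented here. The sketch relies on the standard LoRA formulation, in which a frozen weight matrix W0 of shape d_out x d_in receives a trainable low-rank update B A, with B of shape d_out x r and A of shape r x d_in.

```python
def full_finetune_params(d_out: int, d_in: int) -> int:
    """Trainable weights when the whole d_out x d_in matrix is updated."""
    return d_out * d_in


def lora_params(d_out: int, d_in: int, rank: int) -> int:
    """Trainable weights for a LoRA update B @ A.

    B has shape (d_out, rank) and A has shape (rank, d_in);
    the original matrix W0 stays frozen and is not counted.
    """
    return rank * (d_out + d_in)


def lora_fraction(d_out: int, d_in: int, rank: int) -> float:
    """Share of the layer's weights that LoRA actually trains."""
    return lora_params(d_out, d_in, rank) / full_finetune_params(d_out, d_in)


if __name__ == "__main__":
    # Hypothetical layer size and rank, not values reported in the study.
    d_out, d_in, rank = 4096, 4096, 8
    print("full fine-tuning:", full_finetune_params(d_out, d_in))
    print("LoRA:", lora_params(d_out, d_in, rank))
    print("fraction trained:", lora_fraction(d_out, d_in, rank))
```

Under these assumed values, LoRA trains well under one percent of the weights in the layer, which is the kind of saving that makes fine-tuning feasible on constrained hardware.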