March 3, 2026

Chinese ethnic minority book classification by large language models within CLC

Key Points

Both DeepSeek-v3 and ChatGPT-4o classified Chinese ethnic minority books well at the broad-category level, but accuracy declined for both as classification codes grew more specific.
DeepSeek-v3 exceeded 80% accuracy on broad categories and reached an overall accuracy of 40.78% with title and abstract and 33.50% with title only, while ChatGPT-4o remained below 6% overall.
Evaluation involved prompt engineering and metrics like accuracy, granularity, and error-type, which informed further improvement strategies.
Findings underscore the need for enhancing AI capabilities and collaboration between humans and machines in knowledge organization.

Abstract

Purpose: This study evaluates the capabilities and limitations of large language models (LLMs) in classifying Chinese ethnic minority books under the Chinese Library Classification (CLC) scheme.

Design/methodology/approach: A test collection of Chinese ethnic minority bibliographic records was constructed, and prompt engineering was used to compare the classification performance of DeepSeek-v3 and ChatGPT-4o under two input scenarios: “title + abstract” and “title only.” Using evaluation metrics that cover accuracy, granularity and error-type analysis, the study systematically evaluates the performance differences between the models, diagnoses the causes of errors and proposes improvement strategies.

Findings: Both models performed well in the broad-category classification of Chinese ethnic minority books, with DeepSeek-v3 exceeding 80% accuracy. Incorporating abstracts further improved accuracy and prompted longer, more detailed classification codes. However, accuracy declined for both models as classification codes grew more specific. DeepSeek-v3 significantly outperformed ChatGPT-4o, achieving an overall accuracy of 40.78% with abstracts and 33.50% without, while ChatGPT-4o remained below 6%. Based on the classification error analysis, the study proposes improvements in classification system design, model capability enhancement and human–artificial intelligence (AI) collaboration to guide practical improvements in organizing ethnic minority resources.

Originality/value: Combining librarianship, ethnography and artificial intelligence, this study is the first to compare the classification ability of different LLMs for Chinese ethnic minority books. It reveals cultural limitations in knowledge organization systems, identifies the “capability threshold” of LLMs in cultural context processing and establishes an empirical basis for developing culturally aware AI governance frameworks.
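
The abstract names accuracy, granularity and error-type analysis as evaluation metrics but does not define them. The following is a minimal Python sketch of how such metrics could be computed for hierarchical CLC codes. It assumes that the broad category is the first character of a code, that granularity can be approximated by code length, and that errors can be bucketed by prefix relationship. The function names, error labels and sample codes are all hypothetical and are not taken from the study.

```python
from collections import Counter


def broad_category(code: str) -> str:
    """Top-level CLC class, assumed here to be the first character of the code."""
    return code.strip()[:1].upper()


def outcome(predicted: str, gold: str) -> str:
    """Label one prediction with a hypothetical error type."""
    p, g = predicted.strip().upper(), gold.strip().upper()
    if not p:
        return "missing"
    if p == g:
        return "exact"
    if broad_category(p) != broad_category(g):
        return "wrong_broad_category"
    if g.startswith(p):
        return "too_coarse"       # correct branch, but stops too early
    if p.startswith(g):
        return "too_specific"     # correct branch, but goes deeper than the gold code
    return "wrong_subclass"       # same broad category, different branch


def evaluate(pairs):
    """Summarize (predicted, gold) code pairs into accuracy and granularity figures."""
    if not pairs:
        raise ValueError("pairs must not be empty")
    n = len(pairs)
    outcomes = Counter(outcome(p, g) for p, g in pairs)
    # Every outcome except these two shares the gold code's broad category.
    broad_hits = n - outcomes["missing"] - outcomes["wrong_broad_category"]
    # Granularity proxy: number of characters in the predicted code, ignoring dots.
    lengths = [len(p.strip().replace(".", "")) for p, _ in pairs]
    return {
        "broad_accuracy": broad_hits / n,
        "exact_accuracy": outcomes["exact"] / n,
        "mean_code_length": sum(lengths) / n,
        "outcomes": dict(outcomes),
    }


# Illustrative (predicted, gold) pairs; not data from the study.
sample = [
    ("K281.4", "K281.4"),
    ("K28", "K281.4"),
    ("K892.3", "K281.4"),
    ("H2", "K281.4"),
    ("K281.41", "K281.4"),
]
report = evaluate(sample)
print(report)
```

Separating broad-category accuracy from exact-match accuracy makes it possible to see the pattern the abstract describes, where a model places a book in the right top-level class but loses accuracy as the code becomes more specific.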


Cite This Study

Jia et al. studied this question.

synapsesocial.com/papers/69a75d49c6e9836116a270b5 https://doi.org/10.1108/el-04-2025-0161
