As large language models (LLMs) are applied more widely, their potential ethical risks and social impacts have attracted increasing attention. To address the scarcity of Chinese values-alignment datasets, this paper constructs the Values Alignment Dataset (VAD), a Chinese dataset designed for instruction fine-tuning of LLMs. The dataset draws on three sources: manually constructed instruction data, synthetic data generated through data augmentation, and optimized translations of open-source data. The manually constructed instruction data is built around topics aligned with the ethics, morals, and value principles of Chinese society, and is rigorously reviewed. The synthetic corpus follows the Evol-Instruct method and is generated by depth and breadth evolution of the CValues dataset. The translated corpus is based on the SafeRLHF dataset, from which high-quality Q&A pairs are selected, translated, and refined. For quality control, the dataset applies the Instruction-Following Difficulty (IFD) filtering method: the IFD score compares the model's loss when generating the response with and without the instruction context, and samples with low IFD scores are removed. A subsequent manual review ensures that the data complies with the values corpus standard and the Chinese national context. The final dataset contains 29,000 high-quality samples stored in JSON format, each with five attributes: instruction, input, output, source, and categorization. The categorization spans three levels: person, social, and nation. VAD has significant application value in areas such as ethics, socially sensitive issues, and cultural diversity. It supports fine-tuning LLMs for values alignment in complex social contexts, reduces the generation of harmful content, and contributes to the safety research and practical application of Chinese large language models.
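The record layout and the IFD filtering step described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the IFD score is the ratio of the response loss with instruction context to the response loss without it, which is how the IFD method is usually defined; the abstract itself only says the two losses are compared. The threshold, the loss values, the field contents, and the names `Sample`, `ifd_score`, and `filter_by_ifd` are all hypothetical, and in practice the losses would come from a language model evaluated separately.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Sample:
    # The five attributes named in the abstract; the values used below are invented.
    instruction: str
    input: str
    output: str
    source: str
    categorization: str


def ifd_score(loss_with_instruction: float, loss_without_instruction: float) -> float:
    """Assumed IFD form: response loss with instruction / response loss without it."""
    if loss_without_instruction <= 0:
        raise ValueError("loss_without_instruction must be positive")
    return loss_with_instruction / loss_without_instruction


def filter_by_ifd(scored, threshold):
    """Keep samples whose IFD score is at least `threshold`.

    `scored` is an iterable of (Sample, loss_with, loss_without) triples.
    The threshold is a hypothetical parameter; the abstract does not state one.
    """
    kept = []
    for sample, loss_with, loss_without in scored:
        if ifd_score(loss_with, loss_without) >= threshold:
            kept.append(sample)
    return kept


# Placeholder records and loss values, made up purely for illustration.
scored_samples = [
    (Sample("Q1", "", "A1", "manual", "person"), 1.8, 2.0),
    (Sample("Q2", "", "A2", "synthetic", "social"), 0.4, 2.0),
    (Sample("Q3", "", "A3", "translated", "nation"), 1.5, 2.5),
]

kept = filter_by_ifd(scored_samples, threshold=0.5)
print(json.dumps([asdict(s) for s in kept], ensure_ascii=False, indent=2))
```

With these placeholder numbers the second record has the lowest score and is the only one dropped. A low score means the instruction made the response much easier to generate than it was without context, which is the kind of sample the filtering step removes.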
Li et al. studied this question.