February 20, 2024Open Access

Learning to Check: Unleashing Potentials for Self-Correction in Large Language Models

Key Points

Key points are not available for this paper at this time.

Abstract

Large language models (LLMs) have made significant strides in reasoning capabilities, with ongoing efforts to refine their reasoning through self-correction. However, recent studies suggest that self-correction can be limited or even counterproductive without external accurate knowledge, raising questions about the limits and effectiveness of self-correction. In this paper, we aim to enhance LLM's self-checking capabilities by meticulously designing training data, thereby improving the accuracy of self-correction. We conduct a detailed analysis of error types in mathematical reasoning and develop a tailored prompt, termed ``Step CoT Check''. Then we construct a checking-correction dataset for training models. After integrating the original CoT data and checking-correction data for training, we observe that models could improve their self-checking capabilities, thereby enhancing their self-correction capacity and eliminating the need for external feedback or ground truth labels to ascertain the endpoint of correction. We compare the performance of models fine-tuned with the ``Step CoT Check'' prompt against those refined using other promps within the context of checking-correction data. The ``Step CoT Check'' outperforms the other two check formats in model with lager parameters, providing more precise feedback thus achieving a higher rate of correctness. For reproducibility, all the datasets and codes are provided in https: //github. com/bammt/Learn-to-check.

Read Full Paperexternally

اسأل الذكاء الاصطناعي

Bookmark

View Full Paper

Cite This Study

Che et al. (Tue,) studied this question.

synapsesocial.com/papers/68e786ffb6db6435876f9c1a https://doi.org/https://doi.org/10.48550/arxiv.2402.13035

اسأل الذكاء الاصطناعي

Bookmark

View Full Paper