An enhanced retrieval method for Retrieval-Augmented Generation (RAG) is presented that aims to improve answer accuracy by integrating semantic keywords into retrieval. For each document chunk, a large language model (LLM) generates representative semantic keywords; these are converted into embeddings and indexed alongside the embedding of the original chunk. During retrieval, both query-keyword and query-chunk similarities contribute to the ranking. When recall is low, an auxiliary step has the LLM scan the documents sequentially and extract additional relevant keywords. This dynamic keyword expansion improves the alignment between queries and relevant content, addressing limitations of traditional retrieval that relies on chunk embeddings alone.
Kai et al. (Mon,) studied this question.
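The retrieval step described above can be illustrated with a minimal sketch. The abstract does not specify how keyword and chunk similarities are combined or how low recall is detected, so the following are assumptions made only for illustration: a weighted sum with a hypothetical weight `alpha`, the maximum over a chunk's keyword similarities as the keyword score, and a hypothetical `min_score` threshold that triggers keyword expansion. The `expand` callback stands in for the LLM pass that extracts additional keywords; all names here are hypothetical, and embeddings are assumed to be precomputed vectors.

```python
import math
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Sequence

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


@dataclass
class IndexedChunk:
    text: str
    chunk_vec: Vector                       # embedding of the original chunk
    keyword_vecs: List[Vector] = field(default_factory=list)  # keyword embeddings


def score(query_vec: Vector, chunk: IndexedChunk, alpha: float = 0.5) -> float:
    """Assumed combination: alpha * chunk similarity + (1 - alpha) * best keyword similarity."""
    chunk_sim = cosine(query_vec, chunk.chunk_vec)
    keyword_sim = max((cosine(query_vec, k) for k in chunk.keyword_vecs), default=0.0)
    return alpha * chunk_sim + (1.0 - alpha) * keyword_sim


def retrieve(
    query_vec: Vector,
    index: List[IndexedChunk],
    top_k: int = 3,
    alpha: float = 0.5,
    min_score: float = 0.0,
    expand: Optional[Callable[[IndexedChunk], List[Vector]]] = None,
) -> List[IndexedChunk]:
    """Rank chunks by combined score; optionally expand keywords when the best score is low."""
    def rank() -> List[IndexedChunk]:
        return sorted(index, key=lambda c: score(query_vec, c, alpha), reverse=True)

    ranked = rank()
    # Assumed low-recall test: the best combined score falls below min_score.
    low_recall = not ranked or score(query_vec, ranked[0], alpha) < min_score
    if expand is not None and low_recall:
        for chunk in index:                 # sequential scan over the documents
            chunk.keyword_vecs.extend(expand(chunk))
        ranked = rank()
    return ranked[:top_k]
```

With `alpha = 1.0` the sketch reduces to embedding-only retrieval over chunks, which makes it easy to compare the two rankings on the same index.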