This work presents an NLP-assisted framework for requirements traceability in software engineering, with a focus on recovering trace links between software requirements and related artifacts such as test cases or design documents. Establishing and maintaining traceability links is a critical but costly activity in many software projects, and manual approaches often leave links incomplete or outdated. The proposed framework integrates both lexical and semantic similarity models, including TF-IDF and sentence-level embeddings, to generate ranked candidate trace links along with confidence scores. It supports reproducible evaluation using standard information retrieval metrics such as precision, recall, and F1-score, enabling systematic comparison of different traceability approaches. The framework is designed to support human-in-the-loop validation rather than fully automated traceability, emphasizing transparency and practical applicability. It is intended as a research and experimentation platform for studying NLP-based requirements traceability and may be extended in future work to support trace link maintenance and explainability.
Aravindh R Rajendran studied this question.
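To make the lexical part of the pipeline concrete, the following is a minimal sketch of TF-IDF-based trace link ranking and of the evaluation metrics named above. It is not the framework's own code: the function names (`rank_links`, `precision_recall_f1`), the smoothed IDF formula, the use of raw cosine similarity as the confidence score, and the example requirements and test cases are all assumptions made for illustration. The sentence-embedding model is omitted; it would replace `tfidf_vectors` with dense vectors while keeping the ranking and evaluation steps unchanged.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict term -> weight) per document.

    Uses a smoothed IDF, ln((1 + n) / (1 + df)) + 1 (an assumed variant).
    """
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    df = Counter(term for toks in tokenized for term in set(toks))
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    return [{t: f * idf[t] for t, f in Counter(toks).items()} for toks in tokenized]


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def rank_links(requirements, artifacts, top_k=1):
    """For each requirement, return the top_k (artifact_id, score) pairs.

    The score is the cosine similarity, used here as a stand-in confidence.
    """
    req_ids, art_ids = list(requirements), list(artifacts)
    texts = [requirements[r] for r in req_ids] + [artifacts[a] for a in art_ids]
    vecs = tfidf_vectors(texts)
    req_vecs = dict(zip(req_ids, vecs[: len(req_ids)]))
    art_vecs = dict(zip(art_ids, vecs[len(req_ids):]))
    links = {}
    for r in req_ids:
        scored = [(a, cosine(req_vecs[r], art_vecs[a])) for a in art_ids]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        links[r] = scored[:top_k]
    return links


def precision_recall_f1(predicted, gold):
    """Compare a set of predicted (req, artifact) links against a gold set."""
    tp = len(predicted & gold)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical example data, invented for this sketch.
requirements = {
    "REQ-1": "The system shall allow users to reset their password via email",
    "REQ-2": "The system shall export reports as PDF files",
}
artifacts = {
    "TC-1": "Verify that a password reset email is sent to the user",
    "TC-2": "Verify that a report can be exported as a PDF file",
}
gold = {("REQ-1", "TC-1"), ("REQ-2", "TC-2")}

links = rank_links(requirements, artifacts, top_k=1)
predicted = {(r, a) for r, ranked in links.items() for a, _ in ranked}
for r, ranked in links.items():
    print(r, [(a, round(score, 3)) for a, score in ranked])
print("precision, recall, F1:", precision_recall_f1(predicted, gold))
```

In a human-in-the-loop setting, the ranked list would be shown to an analyst with its scores rather than accepted automatically; raising `top_k` trades precision for recall, which is the kind of comparison the evaluation metrics are meant to support.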