Computer science literature indexed in Scopus and Web of Science has grown at double-digit annual rates, making comprehensive manual synthesis infeasible for individual researchers. Bibliometric workflows partially address this problem but rarely yield the interpretive depth needed to characterize a field’s accomplishments or gaps. This paper introduces a two-phase reproducible workflow integrating bibliometric science mapping (Phase 1) with structured thematic content analysis (Phase 2), implemented in R using the bibliometrix package. Phase 1 clusters publications by keyword co-occurrence; these clusters serve as the sampling frame for purposive selection of representative papers, which undergo deductive-inductive thematic coding in Phase 2. Thematic coding of this type typically requires dual-coder reliability checks; a single-coder test-retest procedure replaces that requirement. Applied to 648 AI-FinTech publications (2017-2026), the workflow identifies four thematic clusters and achieves test-retest agreement of κ = 0.82 without a second coder. Regulatory compliance gaps and AI-blockchain integration opportunities, invisible to bibliometric analysis alone, emerged only through thematic coding. A single researcher completes the process in approximately 22 active working hours without dedicated infrastructure.

• Integrates bibliometric science mapping with structured thematic content analysis into a single reproducible R-based workflow applicable to any computer science sub-field.
• Links Phase 1 cluster outputs to Phase 2 sampling via an explicit allocation formula, replacing ad hoc paper selection with a principled, data-driven decision rule.
• Enables single-author reliability verification via a test-retest procedure (κ ≥ 0.80), removing the dual-coder requirement as a practical barrier for PhD researchers.
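
The test-retest check can be illustrated with a minimal sketch of Cohen's κ computed between two coding passes made by the same researcher on the same text segments. This is not the paper's implementation, which is R-based. The function name `cohen_kappa` and the code labels in the example are hypothetical and chosen only for illustration. Only the standard definition κ = (p_o − p_e) / (1 − p_e) is assumed, where p_o is the observed agreement and p_e is the agreement expected by chance from the marginal code frequencies.

```python
from collections import Counter


def cohen_kappa(first_pass, second_pass):
    """Cohen's kappa for two equal-length sequences of categorical codes."""
    if len(first_pass) != len(second_pass) or not first_pass:
        raise ValueError("passes must be non-empty and of equal length")
    n = len(first_pass)
    # Observed agreement: share of segments given the same code in both passes.
    p_o = sum(a == b for a, b in zip(first_pass, second_pass)) / n
    # Chance agreement, from the marginal code frequencies of each pass.
    counts_1, counts_2 = Counter(first_pass), Counter(second_pass)
    p_e = sum(counts_1[code] * counts_2[code] for code in counts_1) / (n * n)
    if p_e == 1.0:
        # Both passes use a single identical code; kappa is undefined here.
        # Returning 1.0 is a convention adopted for this sketch only.
        return 1.0
    return (p_o - p_e) / (1 - p_e)


# Hypothetical codes assigned to ten segments in two passes by one coder.
pass_1 = ["REG", "REG", "BLK", "BLK", "REG", "BLK", "REG", "REG", "BLK", "BLK"]
pass_2 = ["REG", "REG", "BLK", "BLK", "REG", "BLK", "REG", "BLK", "BLK", "BLK"]
kappa = cohen_kappa(pass_1, pass_2)
print(f"kappa = {kappa:.2f}")
```

In this toy example nine of ten segments receive the same code in both passes (p_o = 0.9) and the marginals give p_e = 0.5, so κ comes out at 0.80, which sits exactly at the κ ≥ 0.80 threshold named in the highlights.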
Thanh-Cong Truong studied this question.