What question did this study set out to answer?

The research aims to create an interpretable framework for assessing candidates during interviews by integrating multiple data sources.

April 3, 2026 · Open Access

Multi-Modal Method for Candidate Interview Assessment Based on Computer Vision and Large Language Models

Key Points

The research aims to create an interpretable framework for assessing candidates during interviews by integrating multiple data sources.
Proposes a multi-modal framework combining nonverbal behavior, LLM-based verbal analysis, and Big Five personality traits.
Utilizes computer vision to extract features from video interviews.
Aggregates constructs into a composite Top Potential Score to assess executive abilities.
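Experiments show significant differentiation of top candidates from others, with a Cliff's delta of 0.91 for the composite Top Potential Score.
A permutation p-value of 0.0002 indicates that this separation is unlikely to be due to chance.
Rank-based evaluation yields 100% recall of executive candidates in the top 20% of rated applications, and leave-one-out analysis verifies robustness.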
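
To make the rank-based recall figure concrete, here is a minimal sketch of how recall within the top fraction of ranked candidates could be computed. It is not the authors' code: the function name `recall_in_top_fraction`, the rounding rule for the cutoff, and the example scores are all assumptions made for illustration.

```python
import math


def recall_in_top_fraction(scores, is_positive, fraction=0.2):
    """Share of positive candidates that land in the top `fraction` by score.

    Assumption: the cutoff is ceil(fraction * n) candidates, at least one.
    The summary does not say how the paper rounds the cutoff.
    """
    if len(scores) != len(is_positive):
        raise ValueError("scores and is_positive must have the same length")
    positives = [i for i, flag in enumerate(is_positive) if flag]
    if not positives:
        raise ValueError("at least one positive candidate is required")

    n_top = max(1, math.ceil(fraction * len(scores)))
    # Rank candidate indices from highest to lowest score.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    top = set(ranked[:n_top])
    return sum(1 for i in positives if i in top) / len(positives)


# Hypothetical composite scores for ten candidates; two are labelled executives.
example_scores = [0.91, 0.40, 0.88, 0.35, 0.52, 0.47, 0.30, 0.61, 0.44, 0.58]
example_labels = [True, False, True, False, False, False, False, False, False, False]
example_recall = recall_in_top_fraction(example_scores, example_labels, fraction=0.2)
```

With these made-up numbers, both labelled candidates fall inside the top two of ten, so the recall is 1.0. A recall of 100% only says that no labelled candidate was ranked below the cutoff; it says nothing about how many unlabelled candidates share the top fraction.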

Abstract

Candidate interview assessment is primarily reliant on subjective human judgment, while existing AI-based methods rely on end-to-end predictions with no psychometric basis. In this paper, we propose an interpretable multi-modal framework that combines nonverbal behavior, LLM-based verbal analysis, and Big Five personality traits into three theory-based constructs: professional-cognitive competence, observed leadership behavior, and leadership disposition. The proposed method utilizes computer vision and large language models to extract features from video interviews. Rather than targeting predictive accuracy, the proposed method prioritizes construct validity and transparent aggregation under severe label scarcity. The proposed method aggregates the constructs into a Top Potential Score that reflects the executive abilities of the candidate. Experiments on the method show its ability to significantly differentiate top candidates from others (Cliff’s delta = 0.91 for the composite Top Potential Score, permutation p = 0.0002). Leave-one-out analysis verifies robustness, while rank-based evaluation yields 100% recall of executive candidates in the top 20% of rated applications. The findings justify the use of the proposed multi-modal method as an interpretable decision-support tool for candidate interview assessment.
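
For readers unfamiliar with the two statistics quoted in the abstract, the following is a minimal sketch of Cliff's delta and a two-sided permutation test built on it. It illustrates the general technique only and makes no claim about the authors' implementation: the function names, the number of permutations, the add-one p-value correction, and the example scores are assumptions.

```python
import random


def cliffs_delta(xs, ys):
    """Cliff's delta: P(x > y) - P(x < y) over all (x, y) pairs, in [-1, 1]."""
    greater = sum(1 for x in xs for y in ys if x > y)
    less = sum(1 for x in xs for y in ys if x < y)
    return (greater - less) / (len(xs) * len(ys))


def permutation_p_value(xs, ys, n_permutations=5000, seed=0):
    """Two-sided permutation p-value for |Cliff's delta|.

    Group labels are shuffled at random; the p-value is the share of
    shuffles whose |delta| is at least as large as the observed one.
    Assumption: the add-one correction keeps the estimate above zero.
    """
    rng = random.Random(seed)
    observed = abs(cliffs_delta(xs, ys))
    pooled = list(xs) + list(ys)
    n_x = len(xs)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        if abs(cliffs_delta(pooled[:n_x], pooled[n_x:])) >= observed:
            hits += 1
    return (hits + 1) / (n_permutations + 1)


# Hypothetical composite scores, not data from the paper.
top_group = [0.90, 0.85, 0.80]
other_group = [0.70, 0.60, 0.82, 0.50, 0.40]
delta = cliffs_delta(top_group, other_group)
p_value = permutation_p_value(top_group, other_group)
```

A delta close to 1 means that almost every top candidate outscores almost every other candidate, which is how the reported value of 0.91 can be read. Because the statistic depends only on pairwise ordering, it suits small samples and scores with no assumed distribution.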

Cite This Study

Kassab et al. studied this question.

synapsesocial.com/papers/69cf5cd15a333a821460a6cd https://doi.org/10.3390/bdcc10040106
