Organic photovoltaics (OPVs) represent one of the most dynamic fields in renewable energy research because of their potential for lightweight, flexible, and low-cost solar energy conversion. In recent years, the number of reported donor–acceptor (D/A) materials has increased exponentially, creating a vast chemical design space that is no longer manageable through conventional trial-and-error experimentation. Machine learning (ML) has emerged as a transformative tool to accelerate material discovery and performance prediction, capable of relating molecular structure and device fabrication to photovoltaic efficiency with unprecedented speed and scalability. However, the predictive accuracy of ML models depends critically on the selection of descriptors, the numerical quantities that translate molecular or device information into computable features. This review summarizes the conceptual framework and recent progress in descriptor-based ML studies for OPVs. We discuss data generation and pretreatment, the classification and interpretation of descriptors, and the construction and evaluation of predictive models. The effects of intrinsic, electronic, and optical descriptors are analyzed in detail. Special attention is given to feature-selection techniques such as SHapley Additive exPlanations (SHAP) and Pearson correlation analysis, which provide interpretability and quantitative insight. Finally, we highlight emerging directions, including morphology-aware descriptors derived from microscopy, multimodal data sets integrating theoretical and experimental data, and the coupling of ML with generative and physics-informed frameworks. Together, these advances are redefining how structure–property–performance relationships are understood and exploited in organic solar cell research.
Mao et al. studied this question.
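To make the Pearson correlation analysis mentioned in the abstract concrete, the following is a minimal sketch of how descriptors might be screened against a target such as power conversion efficiency. It is an illustration under stated assumptions, not the procedure used in the review: the function names `pearson_r` and `rank_descriptors`, the descriptor names, and all numerical values are hypothetical and chosen only for demonstration.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations; the 1/n factors cancel.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def rank_descriptors(descriptors, target):
    """Return (name, r) pairs sorted by |r| against the target, strongest first."""
    scores = {name: pearson_r(values, target) for name, values in descriptors.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Hypothetical, made-up values for five devices (NOT data from the review).
descriptors = {
    "homo_lumo_gap_eV": [1.9, 1.7, 1.6, 1.5, 1.4],
    "molecular_weight": [820.0, 790.0, 905.0, 760.0, 880.0],
}
pce_percent = [6.1, 8.0, 9.2, 10.1, 11.3]

ranking = rank_descriptors(descriptors, pce_percent)
for name, r in ranking:
    print(f"{name}: r = {r:+.3f}")
```

A Pearson coefficient captures only linear association, so a ranking of this kind is best read as a first-pass screen. Model-based attributions such as SHAP can additionally reflect nonlinear effects and interactions between descriptors.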