What question did this study set out to answer?

To analyze three decades of empirical software engineering research to identify flaws in statistical methods and assess expert capability to address these flaws.

March 2, 2026

A Critical Reflection on the State of Data Analysis in Empirical Software Engineering

Key points

To analyze three decades of empirical software engineering research to identify flaws in statistical methods and assess expert capability to address these flaws.
Conducted a large-scale literature survey of over 27,000 empirical SE papers.
Categorized studies using a large language model into methodologically adequate and inadequate.
Selected 28 primary studies for expert evaluation in a focus group-based workshop.
Revealed widespread misuse of statistical methods in empirical SE studies.
Experts struggled to identify and correct statistical flaws, raising concerns about methodological rigor.

Abstract

Context. Empirical Software Engineering drives innovation in SE through qualitative and quantitative studies. Since the 2006 Dagstuhl seminar, concerns about methodological rigor persist. Recent studies have highlighted misconceptions in statistical practices in ESE, yet their impact on the field’s progress and verifiability remains uninvestigated.

Aim. To analyze three decades of SE research to uncover flaws in statistical methods used for data analysis in empirical studies. Next, to observe the capability of current empirical software engineering experts to identify and address these issues.

Method. We conducted a large-scale literature survey, collecting over 27,000 empirical SE papers. Using a Large Language Model (LLM), we categorized studies into methodologically adequate and not adequate categories, and selected 28 primary studies (14 from each category) for expert evaluation via a focus group-based workshop.

Results. Our findings reveal widespread misuse of statistical methods in empirical SE studies. Additionally, experts often struggle to detect these flaws and provide proper corrections, raising concerns about methodological rigor in the field.

Conclusions. This study highlights the risks of perpetuating statistical misconceptions and advocates for a critical reform in the approach to data analysis in ESE. We advocate for developing frameworks that foster methodological awareness and rigor.
