March 3, 2026

Position paper: What is a flaky test?

Key Points

Flaky tests are usually defined as tests that pass or fail inconsistently, but this definition lacks nuance.
The paper introduces specific examples of non-deterministic tests, highlighting the grey area between faults and design flaws.
A formal model of flakiness using transition systems is developed to address the complexity of non-determinism.
Imperative developers struggle to specify non-deterministic applications accurately, which complicates test design.

Abstract

In this position paper, we argue that the commonly accepted definition of a "flaky test", as an execution that may non-deterministically pass and fail, is inadequate. We support our claim through several examples illustrating some of the nuances of non-deterministic tests: some clearly useful (e.g., property-based tests), a few blatantly incorrect (i.e., design flaws), and a large grey area where it is unclear whether we are facing a design flaw (incorrect specification) or a software fault (incorrect implementation). Moving toward actionable criteria for tackling flakiness in test suites, we develop a formal model of flakiness based on transition systems. Our formalization work touches upon the core conceptual challenge of flaky tests: imperative developers struggle to account for every source of non-determinism in their own test code, and also struggle to give accurate specifications of their applications, which are non-deterministic in ever-surprising ways. We blame this state of affairs on the crushing complexity of the programming languages misused as specification languages (i.e., to write tests) and hint at alternatives, which may inspire the Flaky Test community.
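
To make the commonly accepted definition concrete, the following is a minimal Python sketch of a test whose verdict depends on a source of non-determinism, here reduced to an explicit random seed. It illustrates only the definition the abstract calls inadequate, not the paper's transition-system model. The names `shuffled_is_sorted`, `verdicts`, and `is_flaky` are hypothetical and do not come from the paper.

```python
import random


def shuffled_is_sorted(seed: int) -> bool:
    """Hypothetical test body: passes only if the shuffle leaves the list sorted."""
    rng = random.Random(seed)  # the seed stands in for any hidden source of non-determinism
    items = [1, 2, 3]
    rng.shuffle(items)
    return items == [1, 2, 3]


def verdicts(test, seeds):
    """Collect the set of distinct verdicts (True = pass, False = fail) over the seeds."""
    return {test(seed) for seed in seeds}


def is_flaky(test, seeds) -> bool:
    """Under the common definition, a test is flaky if it can both pass and fail."""
    return len(verdicts(test, seeds)) > 1


def always_passes(seed: int) -> bool:
    """A deterministic test: its verdict ignores the seed."""
    return True
```

Under this reading, `is_flaky(shuffled_is_sorted, range(200))` would be expected to report flakiness, while `is_flaky(always_passes, range(200))` would not. The sketch cannot say whether the varying verdict reflects an incorrect specification or an incorrect implementation, which is the grey area the abstract describes.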
