May 16, 2026 · Open Access

CodeStream: A Dataset of Iterative Programming Submissions with Sequential Verdict Traces and Attempt Histories

Key Points

The dataset aims to facilitate research on programming behavior and learning analytics by documenting iterative submissions and their evaluations.
It comprises 5,482 submissions from 202 undergraduate computer science students, collected during supervised problem-solving sessions.
An automated assessment platform evaluated each submission and recorded its verdict trace and attempt order.
The submissions cover 46 programming problems and are written in C, C++, or Java, with full submission histories preserved.
The dataset supports analysis of programming behavior and error correction patterns.
It also enables the study of learning progression and sequential decision-making in programming contexts.

Abstract

Programming learning environments generate rich interaction data through iterative code submissions and automated evaluation processes. This article presents CodeStream, a dataset of programming submissions collected from undergraduate computer science students during supervised problem-solving sessions using an automated assessment platform. The dataset contains 5,482 submissions from 202 users across 46 programming problems written in C, C++, and Java. Each submission record includes source code, programming language, final evaluation verdict, attempt order, and sequential verdict traces generated during test case evaluation. A linked problem-level component provides problem descriptions and associated evaluation test cases. The dataset preserves temporal relationships between users, problems, and attempts, enabling reconstruction of submission histories and analysis of iterative problem-solving behavior. CodeStream supports research in educational data mining, learning analytics, automated feedback systems, code analysis, and programming behavior modeling. Its attempt-level structure is particularly suitable for studying error correction patterns, learning progression, and sequential decision-making in novice programming contexts.
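
As a rough illustration of the attempt-level structure described in the abstract, the sketch below groups submission records by user and problem and orders them by attempt number to reconstruct submission histories. It is a minimal sketch only: the field names (`user_id`, `problem_id`, `attempt`, `verdict_trace`, and so on) and the verdict labels `"AC"` and `"WA"` are assumptions made for illustration, not the dataset's actual schema or file format.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Submission:
    # Hypothetical field names; the released files may label these differently.
    user_id: str
    problem_id: str
    attempt: int                      # attempt order within one (user, problem) pair
    language: str                     # "C", "C++", or "Java"
    verdict: str                      # final evaluation verdict (labels assumed)
    verdict_trace: Tuple[str, ...]    # per-test-case verdicts, in evaluation order
    source_code: str = ""             # omitted in the sample below for brevity


def build_histories(
    submissions: List[Submission],
) -> Dict[Tuple[str, str], List[Submission]]:
    """Group submissions by (user, problem) and sort each group by attempt order."""
    histories: Dict[Tuple[str, str], List[Submission]] = defaultdict(list)
    for sub in submissions:
        histories[(sub.user_id, sub.problem_id)].append(sub)
    for attempts in histories.values():
        attempts.sort(key=lambda sub: sub.attempt)
    return dict(histories)


def attempts_until_accepted(
    history: List[Submission], accepted: str = "AC"
) -> Optional[int]:
    """Return the attempt number of the first accepted submission, or None."""
    for sub in history:
        if sub.verdict == accepted:
            return sub.attempt
    return None


# Made-up records, deliberately out of order, to exercise the grouping.
sample = [
    Submission("u1", "p1", 2, "C", "AC", ("AC", "AC")),
    Submission("u1", "p1", 1, "C", "WA", ("AC", "WA")),
    Submission("u2", "p1", 1, "Java", "WA", ("WA",)),
]
histories = build_histories(sample)
```

Keying histories on the (user, problem) pair keeps each iterative problem-solving episode separate, so per-episode measures such as attempts until the first accepted verdict can be computed directly with `attempts_until_accepted(histories[("u1", "p1")])`.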
