This study investigates the application of Data Science and Machine Learning techniques to predict the directional behavior of Petrobras (PETR4) stock returns, classifying them as positive (upward) or negative (downward). A structured machine learning pipeline is developed to efficiently organize and process historical market data, starting from data acquisition and proceeding through rolling window segmentation with distinct development and production phases.

Within each rolling window, input features and the target variable are computed, followed by a development stage that includes model training and testing. Several classification models are evaluated, including Decision Trees, Logistic Regression, Gradient Boosting, and K-Nearest Neighbors. The models are compared based on classification performance metrics, and the best-performing model in each window is selected for deployment in the production phase, where it generates out-of-sample predictions.

This rolling window framework is applied over the period from 2010 to 2023, allowing for a realistic simulation of model deployment under evolving market conditions. The results are systematically recorded and analyzed, demonstrating the effectiveness and robustness of the proposed methodology. Despite the inherent challenges of financial return prediction, the majority of the models achieved classification accuracy above 50%, with an average accuracy of approximately 53%, indicating statistically meaningful predictive power in a highly stochastic environment.
Priantti Bruno studied this question.
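
The rolling window procedure described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the study: the lagged-return features, the window lengths, the helper names (`make_dataset`, `rolling_windows`, `run_pipeline`), and the two deliberately trivial stand-in models. In practice the stand-ins would be replaced by real classifiers such as the Decision Tree, Logistic Regression, Gradient Boosting, and K-Nearest Neighbors models named above.

```python
from typing import Callable, Dict, Iterator, List, Sequence, Tuple

# A trainer takes training features and labels and returns a predictor.
Predictor = Callable[[Sequence[float]], int]
Trainer = Callable[[List[List[float]], List[int]], Predictor]


def make_dataset(returns: Sequence[float], n_lags: int) -> Tuple[List[List[float]], List[int]]:
    """Features: the previous n_lags returns. Target: 1 if the next return is positive, else 0."""
    X, y = [], []
    for t in range(n_lags, len(returns)):
        X.append(list(returns[t - n_lags:t]))
        y.append(1 if returns[t] > 0 else 0)
    return X, y


def rolling_windows(n: int, train: int, test: int, prod: int) -> Iterator[Tuple[range, range, range]]:
    """Yield (train, test, production) index ranges, advancing by the production length."""
    start = 0
    while start + train + test + prod <= n:
        a = start + train
        b = a + test
        yield range(start, a), range(a, b), range(b, b + prod)
        start += prod


def accuracy(predict: Predictor, X: List[List[float]], y: List[int]) -> float:
    """Fraction of samples whose predicted direction matches the true direction."""
    return sum(predict(x) == t for x, t in zip(X, y)) / len(y)


# Hypothetical stand-in models, used only to keep the sketch dependency-free.
def majority_class(X: List[List[float]], y: List[int]) -> Predictor:
    label = 1 if 2 * sum(y) >= len(y) else 0
    return lambda x: label


def momentum(X: List[List[float]], y: List[int]) -> Predictor:
    # Predict "up" when the most recent return was positive.
    return lambda x: 1 if x[-1] > 0 else 0


def run_pipeline(
    returns: Sequence[float],
    models: Dict[str, Trainer],
    n_lags: int = 3,
    train: int = 20,
    test: int = 5,
    prod: int = 5,
) -> List[Tuple[str, float, float]]:
    """Per window: train all models, pick the best on the test slice, score it in production."""
    X, y = make_dataset(returns, n_lags)
    results = []
    for tr, te, pr in rolling_windows(len(y), train, test, prod):
        X_tr, y_tr = [X[i] for i in tr], [y[i] for i in tr]
        X_te, y_te = [X[i] for i in te], [y[i] for i in te]
        X_pr, y_pr = [X[i] for i in pr], [y[i] for i in pr]
        fitted = {name: fit(X_tr, y_tr) for name, fit in models.items()}
        scores = {name: accuracy(p, X_te, y_te) for name, p in fitted.items()}
        best = max(scores, key=scores.get)  # development-phase selection
        results.append((best, scores[best], accuracy(fitted[best], X_pr, y_pr)))
    return results
```

Calling `run_pipeline(returns, {"majority_class": majority_class, "momentum": momentum})` on a sequence of daily returns gives one tuple per window: the selected model's name, its test accuracy, and its out-of-sample production accuracy. Because the production slice always lies strictly after the training and test slices, no future information leaks into model selection, which is the property that makes the simulation of deployment realistic.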