This project explores the theoretical foundations and practical implementation of feedforward neural networks. A simple network was implemented and trained on the MNIST data set using stochastic gradient descent with backpropagation. Training performance was analyzed using two cost functions, cross-entropy and quadratic, together with L2 regularization and limited hyperparameter tuning. As expected, careful selection of the learning rate significantly improved convergence. Furthermore, L2 regularization proved effective at reducing overfitting and improving validation accuracy, leading to a final classification accuracy of 96.4% with the cross-entropy cost function. The results also show that theoretical expectations may require adjustment in practice.
Lejerkrans et al. studied this question.
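To make the quantities named in the abstract concrete, the following is a minimal sketch of the two cost functions, the L2 penalty, and a single stochastic gradient descent update with weight decay. It is not the project's implementation: it assumes the standard textbook forms, sigmoid-style output activations in (0, 1), and a penalty applied to weights only. All names (`quadratic_cost`, `cross_entropy_cost`, `l2_penalty`, `sgd_step`, `eta`, `lmbda`, `n`) are hypothetical.

```python
import math


def quadratic_cost(a, y):
    """Quadratic cost for one example: 0.5 * sum_j (a_j - y_j)^2."""
    return 0.5 * sum((aj - yj) ** 2 for aj, yj in zip(a, y))


def cross_entropy_cost(a, y, eps=1e-12):
    """Cross-entropy cost for one example.

    Assumes each output activation a_j lies in (0, 1), e.g. a sigmoid output.
    """
    total = 0.0
    for aj, yj in zip(a, y):
        aj = min(max(aj, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= yj * math.log(aj) + (1.0 - yj) * math.log(1.0 - aj)
    return total


def l2_penalty(weights, lmbda, n):
    """L2 term (lmbda / (2 * n)) * sum of squared weights.

    Assumption: biases are excluded and n is the training-set size.
    """
    return 0.5 * (lmbda / n) * sum(w * w for w in weights)


def sgd_step(weights, grads, eta, lmbda, n):
    """One SGD update with L2 weight decay.

    grads is assumed to be the mini-batch-averaged gradient of the
    unregularized cost, so the update is
    w <- (1 - eta * lmbda / n) * w - eta * grad.
    """
    decay = 1.0 - eta * lmbda / n
    return [decay * w - eta * g for w, g in zip(weights, grads)]
```

In this form, setting `lmbda` to zero recovers plain stochastic gradient descent, which makes it easy to compare regularized and unregularized runs under the same learning rate `eta`.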