July 31, 2024 · Open Access

Convergence of batch gradient training-based smoothing L1 regularization via adaptive momentum for feedforward neural networks

Abstract

Artificial neural networks are extensively employed in numerous fields, including intelligent data processing, pattern identification, and feature extraction [1-3]. For feedforward neural networks (FFNN), a popular learning algorithm is backpropagation (BP) [4, 5], which is widely used in neural network training. Gradient descent training (GDT) is an optimization algorithm that minimizes the loss function [6], while BP computes the gradients of the loss function with respect to the model parameters. BP is therefore a key component of GDT, as it provides the gradients required to update the model parameters. There are two main, widely used modes for implementing the GDT algorithm: the batch approach and the online approach. In batch learning, the network weights are adjusted once, after all training samples have been presented. By contrast, online (incremental) learning is a variant of the conventional gradient approach that updates the weights after each training sample is presented [7]. For FFNN training, GDT is a popular and straightforward learning technique. Deterministic convergence results for neural network gradient algorithms, both batch and online, have been proven in previous studies [8, 9].

Overfitting is the most common problem that affects networks during training. It occurs when the function class grows considerably larger than the dataset [10] or when the variables are inconsistent. Many methods are used to avoid this phenomenon, including regularization [11], pruning [12], early stopping [13], data augmentation [14], and ensembling [15]. In general, regularization is an effective strategy that can boost stability, encourage feature sharing, and lessen overfitting, thereby enhancing model performance. Other forms of regularization include weight decay [16], matrix regularization [17], and weight elimination [18].
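The difference between the batch and online modes of gradient descent training can be illustrated with a minimal sketch. The example below is not the method studied in this paper: it assumes a one-dimensional linear model with a squared-error loss, and the toy data, learning rate, epoch count, and function names (`gradient`, `train_batch`, `train_online`) are hypothetical choices made only for illustration.

```python
# Toy data generated by y = 2x + 1 (an assumption for illustration only).
SAMPLES = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]


def gradient(w, b, x, y):
    """Gradient of the squared error 0.5 * (w*x + b - y)**2 w.r.t. (w, b)."""
    err = w * x + b - y
    return err * x, err


def train_batch(samples, lr=0.05, epochs=2000):
    """Batch mode: one weight update per full pass over the training set."""
    w, b = 0.0, 0.0
    n = len(samples)
    for _ in range(epochs):
        gw = gb = 0.0
        for x, y in samples:
            dw, db = gradient(w, b, x, y)
            gw += dw  # accumulate; weights stay fixed during the pass
            gb += db
        w -= lr * gw / n  # single update using the averaged gradient
        b -= lr * gb / n
    return w, b


def train_online(samples, lr=0.05, epochs=2000):
    """Online (incremental) mode: one weight update per training sample."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in samples:
            dw, db = gradient(w, b, x, y)
            w -= lr * dw  # update immediately after each sample
            b -= lr * db
    return w, b


if __name__ == "__main__":
    print("batch :", train_batch(SAMPLES))
    print("online:", train_online(SAMPLES))
```

On this noise-free toy problem both modes should approach w = 2 and b = 1. The only structural difference is where the weight update sits: outside the inner loop over samples (batch) or inside it (online).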
