The effect of data preprocessing on the learning ability of artificial neural networks was investigated with regard to the impact of distributing the input vectors uniformly with respect to the output categories in the training data set. The analyses were performed for neural networks dedicated to (1) dairy cow culling classification and (2) milk yield prediction. The two types of neural network used for culling classification were backpropagation and learning vector quantization. For yield prediction, backpropagation was used. The study was repeated with several architectures for both types of network. Preprocessing of data did not have a large impact on the general performance of the networks, but did affect the results for each output category. The effects were more pronounced in the categories containing less frequent events, for which the results always improved. For the categories with a larger number of records, balancing the data degraded the results. The respective improvements and degradation of the results occurred for both prediction and classification, with the two types of neural networks, and with all architectures tested. However, the magnitude of the effects varied with the type of neural network and with the architecture. The results of this study indicate that, in general, the distribution of outputs influences the learning process of neural networks for both types of application. The results also suggest that the types of output distribution required for the training of neural nets may depend on the specifics of each problem.
Lacroix et al. studied this question.
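
The balancing step described in the abstract, distributing the training records uniformly across output categories, can be illustrated with a minimal sketch. The abstract does not say how the authors balanced their data, so the approach here (random undersampling of every category down to the size of the rarest one) is an assumption, as are the function name `balance_by_category`, the toy labels `kept` and `culled`, and the 90/10 split.

```python
import random
from collections import Counter, defaultdict


def balance_by_category(samples, labels, seed=0):
    """Return (samples, labels) with the same number of records per category.

    Assumption: balancing is done by random undersampling to the size of the
    smallest category. The original study may have used a different scheme.
    """
    rng = random.Random(seed)

    # Group the input vectors by their output category.
    by_label = defaultdict(list)
    for x, y in zip(samples, labels):
        by_label[y].append(x)

    # Every category is cut down to the size of the rarest one.
    target = min(len(group) for group in by_label.values())

    balanced = []
    for y, group in by_label.items():
        for x in rng.sample(group, target):
            balanced.append((x, y))

    # Shuffle so that categories are interleaved in the training set.
    rng.shuffle(balanced)
    xs = [x for x, _ in balanced]
    ys = [y for _, y in balanced]
    return xs, ys


# Hypothetical toy data: 90 frequent records and 10 rare ones.
labels = ["kept"] * 90 + ["culled"] * 10
samples = [[float(i)] for i in range(100)]

bal_x, bal_y = balance_by_category(samples, labels)
counts = Counter(bal_y)
print(counts)
```

Undersampling discards records from the frequent category, which is one plausible reason a balanced training set could degrade results for categories with many records while helping the rare ones, in line with the pattern the abstract reports.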