Abstract The efficient analysis of digital mammograms plays an important role in the early detection of breast cancer and can lead to a higher recovery rate. The mammogram classification process can be divided into two steps: first, the presence of abnormalities is determined and, second, the nature of the lesion is identified. This second step of a computer-aided diagnosis system is essential for selecting the best treatment for the patient and achieving the highest chance of recovery. Feature extraction is crucial for identifying informative characteristics that differentiate between benign and malignant lesions. In the present paper, we compare texture and shape features with regard to their performance in classifying the type of lesion. Furthermore, we propose combining features extracted from the same tissue but from a different perspective of the mammogram. Our study also addresses the issue of outliers in the dataset used. Four classification methods (Decision Trees, Random Forest, Perceptron, and Multilayer Perceptron) are used to evaluate the quality of the different extracted features. Computational experiments are performed on the Digital Database for Screening Mammography, and the results show that shape features outperform texture features for all four classification methods. In addition, when outliers are excluded from the dataset, the system achieves 90.01% test accuracy using shape features, genetic algorithm-based feature selection, and the Random Forest classifier.
Bajcsi et al. studied this question.