Abstract
This paper presents a comparative analysis of the effectiveness of various texture and geometric features in the diagnosis of breast masses on mammograms. An improved machine learning-based framework was developed for this study. The proposed system was tested on 106 full-field digital mammography images from the INbreast dataset, containing a total of 115 breast mass lesions. Individual features and various combinations of the computed texture and geometric features were evaluated by their contribution to classification accuracy. Four state-of-the-art filter-based feature selection algorithms (Relief-F, Pearson correlation coefficient, neighborhood component analysis, and term variance) were employed to select the top 20 most discriminative features. The Relief-F algorithm outperformed the other feature selection algorithms in terms of classification results, reporting 85.2% accuracy, 82.0% sensitivity, and 88.0% specificity. Further simulations then reduced the 20 features obtained using Relief-F to a subset of the nine most discriminative features. The classification performance of six state-of-the-art machine learning classifiers, namely k-nearest neighbor (k-NN), support vector machine, decision tree, Naive Bayes, random forest, and ensemble tree, was investigated. The best classification results (accuracy = 90.4%, sensitivity = 92.0%, specificity = 88.0%) were obtained with the k-NN classifier using k = 5 neighbors and squared inverse distance weighting. The key finding is the identification of the nine most discriminative features out of a pool of 125 texture and geometric features: FD26 (Fourier descriptor), Euler number, solidity, mean, FD14, FD13, periodicity, skewness, and contrast. The results indicate that the selected nine features can be used for the classification of breast masses in mammograms.
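
The best-performing configuration reported above, k-NN with k = 5 and squared inverse distance weighting, can be illustrated with a minimal Python sketch. It assumes Euclidean distance and a vote weight of 1/d² per neighbor; the abstract states neither the distance metric nor the software used. The function name `knn_predict`, the two-dimensional feature vectors, and the class labels are hypothetical and serve only as toy data, not values from the INbreast experiments.

```python
import math
from collections import defaultdict


def knn_predict(train_X, train_y, query, k=5, eps=1e-12):
    """Classify `query` by a squared-inverse-distance-weighted k-NN vote.

    Assumption: Euclidean distance and weight 1 / d**2 per neighbor;
    `eps` only guards against division by zero for exact matches.
    """
    # Distance from the query to every training sample, paired with its label.
    dists = [(math.dist(x, query), label) for x, label in zip(train_X, train_y)]
    # Keep the k nearest samples.
    dists.sort(key=lambda pair: pair[0])
    votes = defaultdict(float)
    for d, label in dists[:k]:
        # Closer neighbors contribute quadratically more to the vote.
        votes[label] += 1.0 / (d * d + eps)
    # Return the label with the largest accumulated weight.
    return max(votes, key=votes.get)


# Hypothetical toy data: two benign samples near the origin, three malignant far away.
toy_X = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.0, 6.0), (6.0, 5.0)]
toy_y = ["benign", "benign", "malignant", "malignant", "malignant"]

near_origin = knn_predict(toy_X, toy_y, (0.05, 0.0), k=5)
near_cluster = knn_predict(toy_X, toy_y, (5.2, 5.1), k=5)
```

With k = 5 every toy sample votes, so an unweighted majority would always return "malignant" (three samples against two). The 1/d² weighting lets the two nearby benign samples dominate for a query close to the origin, which is the behavior the weighting scheme is meant to provide.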