Abstract
Breast cancer is increasingly common among women, so identifying abnormalities early and treating them appropriately can greatly reduce the risk posed by the disease. Breast cancer data typically have many features, and these often include nonlinear as well as linear structure. To address this, a kernel null space algorithm is proposed for breast cancer anomaly detection. First, a kernel function maps all normal samples nonlinearly into a high-dimensional space. Next, a null space transformation reduces the within-class scatter to zero, so that the entire class is represented in the null space by its mean. Finally, a test sample is judged abnormal or not according to its distance from that mean. The algorithm greatly reduces computational complexity and speeds up breast cancer detection. In simulation experiments on the UCI breast cancer database, the F1-scores obtained with different kernel functions and different anomaly thresholds are compared. The results vary with both the kernel function and the threshold, and with a Gaussian kernel the F1-score reaches 0.9627, which indicates that the kernel null space algorithm is effective for breast cancer anomaly detection.
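The three steps in the abstract (kernel mapping, collapsing the normal class to a single point, scoring by distance) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a simplified one-class formulation in which the origin of the kernel feature space serves as the reference point, so the null direction is obtained by solving K·alpha = 1. The function names, `gamma`, and the small `ridge` term added for numerical stability are all hypothetical choices made for this illustration.

```python
import numpy as np


def gaussian_kernel(A, B, gamma):
    """Gaussian (RBF) kernel matrix: k(a, b) = exp(-gamma * ||a - b||^2)."""
    sq_dist = (
        np.sum(A ** 2, axis=1)[:, None]
        + np.sum(B ** 2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))


def fit_one_class_null_space(X_normal, gamma=1.0, ridge=1e-8):
    """Find a feature-space direction on which all normal samples coincide.

    A direction w = sum_i alpha_i * phi(x_i) projects training sample j to
    (K @ alpha)[j]. If K @ alpha is constant, the within-class scatter of the
    projections is zero, so the whole class collapses to one target value.
    Assumption: the origin of the feature space (projection 0) acts as the
    reference point that keeps the direction non-trivial.
    """
    K = gaussian_kernel(X_normal, X_normal, gamma)
    n = K.shape[0]
    # Solve (K + ridge * I) alpha = 1; the ridge is an assumed stabilizer.
    alpha = np.linalg.solve(K + ridge * np.eye(n), np.ones(n))
    # Scale so that the direction w has unit norm in feature space.
    alpha /= np.sqrt(alpha @ K @ alpha)
    # The (near-identical) projection shared by all normal samples.
    target = float(np.mean(K @ alpha))
    return alpha, target


def anomaly_score(X_test, X_normal, alpha, target, gamma=1.0):
    """Distance between each test projection and the normal-class target."""
    projections = gaussian_kernel(X_test, X_normal, gamma) @ alpha
    return np.abs(projections - target)
```

A test sample would then be flagged as abnormal when its score exceeds a chosen anomaly threshold, for example `anomaly_score(X_test, X_normal, alpha, target, gamma) > threshold`. The threshold and kernel width must be tuned on data, which is consistent with the abstract's observation that results depend on both.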
Authors
韩笑
毕波
唐锦萍
曹莉
HAN Xiao; BI Bo; TANG Jin-ping; CAO Li (School of Mathematics and Statistics, Northeast Petroleum University, Daqing 163318, China; School of Public Health, Hainan Medical College, Haikou 571101, China; School of Data Science and Technology, Heilongjiang University, Harbin 150080, China)
Source
《计算机技术与发展》
2022, No. 1, pp. 165-169 (5 pages)
Computer Technology and Development
Funding
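National Natural Science Foundation of China (11701159)
2020 Hainan Provincial Basic and Applied Basic Research Program (Natural Science) High-Level Talent Project (820RC649)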
Keywords
breast cancer
anomaly detection
kernel null space method
kernel function
anomaly threshold