Abstract
Among the thousands of known rare diseases, approximately 30% to 40% of patients present with facial abnormalities, which makes facial feature recognition an effective way to predict rare disease risk and type with machine learning. The scarcity of rare disease samples, however, poses challenges for machine learning. In this study, Dlib was used to extract 68 facial key points from each patient, and the distances between 35 points and the cosines of the angles between 26 line segments were computed as facial geometric features. Feature selection retained the 10 most important of these features, which were fused with the 128-dimensional vector provided by Dlib to form a new 138-dimensional facial feature vector. Classification models were then built on the facial data for DiGeorge syndrome, Down syndrome, and Williams syndrome published on the NIH website. The test results indicate that the model fused with geometric features outperforms the model using only deep learning features in classification performance. The method offers better interpretability and is suitable for early screening and classification of rare diseases in small-sample scenarios, providing an effective new approach to rare disease prediction.
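The feature construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the landmark index pairs, the segment pairs, and the indices of the selected features below are hypothetical placeholders, since the abstract does not list them, and the landmarks and the 128-dimensional vector are synthetic stand-ins for what Dlib would return. The sketch only shows how point-to-point distances and segment angle cosines might be computed and then concatenated with a 128-dimensional embedding to give a 138-dimensional vector.

```python
import math

def distance(p, q):
    """Euclidean distance between two 2D points."""
    return math.hypot(q[0] - p[0], q[1] - p[1])

def angle_cosine(a, b, c, d):
    """Cosine of the angle between segment a->b and segment c->d."""
    ux, uy = b[0] - a[0], b[1] - a[1]
    vx, vy = d[0] - c[0], d[1] - c[1]
    norm = math.hypot(ux, uy) * math.hypot(vx, vy)
    if norm == 0.0:
        return 0.0  # degenerate segment: no defined angle, fall back to 0
    return (ux * vx + uy * vy) / norm

def geometric_features(landmarks, point_pairs, segment_pairs):
    """Distances first, then angle cosines, in the order given."""
    feats = [distance(landmarks[i], landmarks[j]) for i, j in point_pairs]
    feats += [
        angle_cosine(landmarks[a], landmarks[b], landmarks[c], landmarks[d])
        for (a, b), (c, d) in segment_pairs
    ]
    return feats

def fuse(embedding, geometric, selected_idx):
    """Append the selected geometric features to the embedding."""
    return list(embedding) + [geometric[i] for i in selected_idx]

# Hypothetical index choices (assumptions, not taken from the paper).
POINT_PAIRS = [(36, 45), (39, 42), (31, 35), (48, 54), (27, 33), (0, 16), (8, 57)]
SEGMENT_PAIRS = [
    ((36, 39), (42, 45)),
    ((48, 54), (36, 45)),
    ((27, 30), (30, 33)),
    ((0, 8), (8, 16)),
    ((31, 35), (48, 54)),
]
SELECTED = [0, 1, 2, 3, 4, 5, 7, 8, 9, 11]  # placeholder for the top-10 selection

# Synthetic stand-ins for Dlib output: 68 landmarks and a 128-d vector.
landmarks = [(float(i % 10), float(i // 10)) for i in range(68)]
embedding = [0.0] * 128

geometric = geometric_features(landmarks, POINT_PAIRS, SEGMENT_PAIRS)
fused = fuse(embedding, geometric, SELECTED)
```

In a real pipeline the feature selection step would determine which indices go into SELECTED, and raw pixel distances would likely need some normalization across images; neither detail is specified in the abstract, so both are left open here.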
Authors
喻为栋 (YU Weidong)
徐青云 (XU Qingyun)
刘雷 (LIU Lei)
Institute of Intelligent Medicine, Fudan University, Shanghai 200030, China
Source
China Digital Medicine (《中国数字医学》)
2024, Issue 10, pp. 20-27 (8 pages)
Keywords
Rare diseases
Facial feature recognition
Machine learning
Geometric features
Classification modeling