大数据癌症风险预测系统 (cited by: 5)

Big Data Cancer Risk Prediction System
Abstract: According to the Chinese Anti-Cancer Association, about 90% of early-stage cancers show no obvious symptoms, so 80% of cancer patients are already at an intermediate or advanced stage when diagnosed. Detecting cancer early could save more than one million lives. The purpose of this research is to build a system that predicts cancer risk at an early stage with the help of big data value-extraction techniques. A total of 486,394 people were included in the study: 40,217 cancer patients and 446,177 healthy people undergoing routine physical examinations. The data analyzed comprised demographic, CBC (Complete Blood Count), CMP (Comprehensive Metabolic Panel), lipid, and urinalysis data, 48 data points in total. Both logistic analysis and discriminant analysis were used to identify significant factors and to build seven cancer risk prediction models, with the significance level set at p < 0.05. SAS was the primary statistical analysis tool, and all data were drawn from an MS SQL database. The results showed that CBC, CMP, lipid, and urinalysis data can significantly distinguish cancer patients from healthy people and can be used to build cancer risk prediction models that identify high-risk individuals; the average accuracy of the models was 95.5%. The seven models were then verified on data from 120,008 people collected from January to July 2014 (9,931 cancer patients and 110,077 healthy people), with a verification accuracy of over 95%. This research shows that routine blood and urine test results can be used to predict cancer risk at an early stage.
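To make the modelling step in the abstract concrete, the following is a minimal sketch of a logistic risk model fitted on one cohort and checked on a separate one. It is not the authors' SAS implementation. The data are synthetic, and the two "lab markers", the 10% case rate, the size of the shift between groups, the learning rate, and the 0.5 decision threshold are all assumptions chosen only for illustration. The fit uses plain batch gradient descent rather than any procedure named in the source.

```python
import math
import random


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def make_synthetic(n: int, seed: int):
    """Generate n hypothetical people with two standardized lab markers.

    Assumption: about 10% are cases, and both markers are shifted by +2
    for cases. Real blood and urine data would not look like this.
    """
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        is_case = 1 if rng.random() < 0.1 else 0
        shift = 2.0 if is_case else 0.0
        rows.append([rng.gauss(shift, 1.0), rng.gauss(shift, 1.0)])
        labels.append(is_case)
    return rows, labels


def predict_risk(w, b, x) -> float:
    """Predicted probability of being a case for one feature vector."""
    return sigmoid(b + sum(wi * xi for wi, xi in zip(w, x)))


def fit_logistic(rows, labels, lr: float = 0.5, epochs: int = 200):
    """Fit weights and intercept by batch gradient descent on the log loss."""
    n, n_features = len(rows), len(rows[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for x, y in zip(rows, labels):
            err = predict_risk(w, b, x) - y
            for j in range(n_features):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def accuracy(w, b, rows, labels, threshold: float = 0.5) -> float:
    """Share of people whose thresholded risk matches their true label."""
    hits = sum(
        int((predict_risk(w, b, x) >= threshold) == bool(y))
        for x, y in zip(rows, labels)
    )
    return hits / len(rows)


if __name__ == "__main__":
    # Fit on one synthetic cohort, then check on a second, independent one,
    # loosely mirroring the build-then-verify design described above.
    train_rows, train_labels = make_synthetic(1000, seed=0)
    val_rows, val_labels = make_synthetic(1000, seed=1)
    w, b = fit_logistic(train_rows, train_labels)
    print("validation accuracy:", accuracy(w, b, val_rows, val_labels))
```

Checking the fitted model on a cohort it was not trained on is what gives the reported accuracy its meaning. The study did the same at far larger scale, with 48 data points per person instead of two.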
Source: World Journal of Complex Medicine (《世界复合医学》), 2015, Issue 1, pp. 63-67 (5 pages)
Keywords: big data; early cancer prediction; routine blood test; blood chemistry; urinalysis
