
Design of data analysis software based on random forest algorithm (cited by: 8)
Abstract: Random forest is a popular machine learning method that has been widely applied in biomedicine and bioinformatics. To address inherent characteristics of medical datasets, such as high feature dimensionality, this paper designs a medical data analysis software system based on random forest. The system is Web-based. On the client side, Java is used to receive the data and parameters submitted by users and to display the analysis results. On the server side, R executes the machine learning algorithms and performs the data analysis. The system has a user-friendly interface and is simple to operate. It allows the random forest algorithm to be invoked from the Web to analyze clinical medical data, and it can easily be extended to call other machine learning methods.
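The client/server split described in the abstract could be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not say how the Java side hands work to R, so the sketch assumes the simplest route, launching `Rscript` as a child process. The class name `RandomForestJob`, the script name `rf_analysis.R`, and the argument order are hypothetical. `ntree` and `mtry` are the usual tuning parameters of R's `randomForest` package, which the assumed script would pass on.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.List;

/**
 * Hypothetical sketch: the Java tier collects a dataset path and parameters,
 * then hands them to an R script that is assumed to call randomForest().
 */
final class RandomForestJob {
    private final String csvPath;      // uploaded dataset (assumed CSV)
    private final String targetColumn; // column to predict
    private final int ntree;           // number of trees in the forest
    private final int mtry;            // features tried at each split

    RandomForestJob(String csvPath, String targetColumn, int ntree, int mtry) {
        if (csvPath == null || csvPath.isEmpty()) {
            throw new IllegalArgumentException("csvPath must not be empty");
        }
        if (targetColumn == null || targetColumn.isEmpty()) {
            throw new IllegalArgumentException("targetColumn must not be empty");
        }
        if (ntree <= 0 || mtry <= 0) {
            throw new IllegalArgumentException("ntree and mtry must be positive");
        }
        this.csvPath = csvPath;
        this.targetColumn = targetColumn;
        this.ntree = ntree;
        this.mtry = mtry;
    }

    /** Builds the argument vector; each value is its own argument, so no shell quoting is involved. */
    List<String> toCommand(String scriptPath) {
        return List.of("Rscript", "--vanilla", scriptPath,
                csvPath, targetColumn,
                Integer.toString(ntree), Integer.toString(mtry));
    }

    /** Runs the script and returns its combined output. Requires R to be installed and on the PATH. */
    String run(String scriptPath) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(toCommand(scriptPath))
                .redirectErrorStream(true)
                .start();
        String output = new String(process.getInputStream().readAllBytes(), StandardCharsets.UTF_8);
        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException("Rscript exited with code " + exitCode + ": " + output);
        }
        return output;
    }

    public static void main(String[] args) {
        // Only prints the command that would be executed; it does not start R.
        RandomForestJob job = new RandomForestJob("clinical.csv", "diagnosis", 500, 3);
        System.out.println(String.join(" ", job.toCommand("rf_analysis.R")));
    }
}
```

Keeping the command construction separate from process execution means the parameter handling can be checked without an R installation. Switching to another machine learning method, as the abstract suggests is possible, would then amount to pointing the same call at a different script.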
Authors: ZHOU Yi (周屹), FENG Zhaoxiang (冯兆祥), BAI Xizhuo (白熙卓), JIA Zhiyi (贾子一), DAI Yangyang (戴洋洋), SHENG Xinyu (盛鑫宇). College of Computer Science and Technology, Heilongjiang Institute of Technology, Harbin 150050, China
Source: Journal of Heilongjiang Institute of Technology (《黑龙江工程学院学报》), CAS, 2017, No. 3, pp. 38-41 (4 pages)
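Funding: Heilongjiang Province College Student Innovation Training Program (201611802087); Heilongjiang Province College Student Entrepreneurship Training Program (201611802098); National Natural Science Foundation of China (20154424)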
Keywords: machine learning; data mining; random forest; Java language; R language