Retrieving Disease-Related Functional Information through Text Mining

Retrieving Gene Functional Information through Text Mining

Abstract: Positional candidate approaches and genome-wide association studies have localized the causative genes of many human genetic diseases to one or more chromosomal regions. Computationally reducing the many genes in such a region to a number that is practical for experimental analysis is an important strategy for speeding up disease gene identification. Most existing disease gene prediction methods start from the annotation information of genes already known to be associated with a disease. Many diseases, however, still have no known causative genes or related functional annotations, so these methods cannot be applied to them. For such diseases, a new method is proposed that retrieves disease-related functional information by mining a biomedical literature database in combination with a functional annotation database for human gene products (proteins). Genes for diseases that lack known causative genes can then be predicted from the mined functional information.
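The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea, under stated assumptions: abstracts that mention a disease are scanned for gene symbols, the functional annotation terms of those genes are tallied into a disease profile, and candidate genes from a chromosomal region are ranked by their overlap with that profile. The function names, the co-mention heuristic, the scoring rule, and all gene symbols, term labels, and abstracts below are hypothetical placeholders, not the authors' method or real data.

```python
import re
from collections import Counter

# Hypothetical toy corpus standing in for literature-database abstracts.
ABSTRACTS = [
    "GENEA is mutated in patients with toy syndrome",
    "GENEB expression is altered in toy syndrome",
    "GENEC regulates the cell cycle in yeast",
]

# Hypothetical gene -> functional annotation terms (placeholder labels,
# not real Gene Ontology identifiers).
GENE_TO_GO = {
    "GENEA": ["GO:t1", "GO:t2"],
    "GENEB": ["GO:t1"],
    "GENEC": ["GO:t3"],
    "GENED": ["GO:t1", "GO:t3"],
    "GENEE": ["GO:t3"],
}


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def disease_go_profile(abstracts, gene_to_go, disease_term):
    """Count annotation terms of genes co-mentioned with the disease.

    Assumption: a gene is linked to the disease when its symbol and every
    word of the disease name occur in the same abstract.
    """
    disease_tokens = tokenize(disease_term)
    go_counts = Counter()
    for text in abstracts:
        tokens = tokenize(text)
        if not disease_tokens <= tokens:
            continue  # abstract does not mention the disease
        for gene, go_terms in gene_to_go.items():
            if gene.lower() in tokens:
                go_counts.update(go_terms)
    return go_counts


def rank_candidates(candidates, gene_to_go, go_counts):
    """Score each candidate by the summed profile counts of its terms."""
    scores = {
        gene: sum(go_counts[term] for term in gene_to_go.get(gene, ()))
        for gene in candidates
    }
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    profile = disease_go_profile(ABSTRACTS, GENE_TO_GO, "toy syndrome")
    print(rank_candidates(["GENED", "GENEE"], GENE_TO_GO, profile))
```

Neither candidate gene appears in the toy abstracts; they are ranked only through annotation terms shared with the mined profile, which is the situation the abstract targets (diseases without known causative genes). A real pipeline would need proper gene name recognition and a statistical weighting of terms, both of which are outside this sketch.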

Authors: Yuan Fang (袁芳), Zhou Yanhong (周艳红), Wang Jia (王佳)

Affiliation: Hubei Key Laboratory of Bioinformatics and Molecular Imaging, Huazhong University of Science and Technology

Source: 《微计算机信息》 (Control & Automation), 2009, No. 36, pp. 1-3 (3 pages)

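Funding: National Natural Science Foundation of China (90608020), project "Bioinformatics Analysis and Prediction of Genes Related to Human Genetic Diseases", applicant Zhou Yanhong; Ministry of Education (NCET-06-0651), project "Development and Application of a Bioinformatics Platform for Gene Discovery and Analysis", applicant Zhou Yanhong.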

Keywords: disease gene prediction; Gene Ontology (GO); text mining

Classification: Q343.1 [Biology: Genetics]
