Abstract
Objective: To mine disease-gene-drug associations from the large body of biomedical literature and thereby provide a basis for precision medicine and targeted therapy. Methods: Entities were recognized by combining dictionaries with rules; disease-gene-drug associations were identified using co-occurrence analysis and association rules; for sentences containing associated entities, the SemRep tool was used to extract the semantic relationships between the entities; and the association graph was built in R. Results: With lung cancer as the search condition and the literature retrieved from PubMed as the corpus, the approach obtained the associations, semantic relationships, and related information for lung cancer-related genes and drugs. Conclusion: Combining dictionaries with rules effectively improves the accuracy of entity relation extraction and clearly outperforms recognition based on dictionaries alone.
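The co-occurrence and association-rule step named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy lexicons, the example sentences, the whole-token dictionary lookup, and the function names `find_entities` and `cooccurrence_rules` are all assumptions made for illustration. It scores each gene-drug pair by sentence-level support (share of sentences containing both) and confidence (share of the gene's sentences that also mention the drug).

```python
from collections import Counter
from itertools import product

# Hypothetical toy dictionaries; the paper's real lexicons are not given here.
GENES = {"EGFR", "KRAS", "ALK"}
DRUGS = {"gefitinib", "erlotinib", "crizotinib"}


def find_entities(sentence, lexicon):
    """Dictionary lookup: return lexicon terms that appear as whole tokens."""
    tokens = {t.strip(".,;:()").lower() for t in sentence.split()}
    return {term for term in lexicon if term.lower() in tokens}


def cooccurrence_rules(sentences, min_support=0.0, min_confidence=0.0):
    """Score gene -> drug rules by sentence-level support and confidence."""
    n = len(sentences)
    gene_count, pair_count = Counter(), Counter()
    for s in sentences:
        genes = find_entities(s, GENES)
        drugs = find_entities(s, DRUGS)
        gene_count.update(genes)
        # Every gene/drug pair in the same sentence counts as one co-occurrence.
        pair_count.update(product(genes, drugs))
    rules = {}
    for (gene, drug), c in pair_count.items():
        support = c / n                    # fraction of sentences with both
        confidence = c / gene_count[gene]  # fraction of gene sentences with drug
        if support >= min_support and confidence >= min_confidence:
            rules[(gene, drug)] = (support, confidence)
    return rules


# Invented example sentences, for illustration only.
SENTENCES = [
    "EGFR mutations predict response to gefitinib in lung cancer.",
    "Gefitinib and erlotinib target EGFR.",
    "ALK rearrangement is treated with crizotinib.",
    "KRAS mutations are common in lung cancer.",
]
RULES = cooccurrence_rules(SENTENCES)
```

In practice the thresholds `min_support` and `min_confidence` would be raised above zero to filter out incidental co-mentions before any semantic analysis of the remaining sentences.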
Authors
翟菊叶
叶泽坤
杨枢
刘长青
ZHAI Ju-ye; YE Ze-kun; YANG Shu; LIU Chang-qing (Bengbu Medical College, Bengbu 233030; Fudan University, Shanghai 200433; Hefei University of Technology, Hefei 230026, China)
Source
Journal of Xinyu University (《新余学院学报》)
2018, No. 2, pp. 1-5 (5 pages)
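Funding
National Natural Science Foundation of China project "In vivo functional establishment and regulatory mechanisms of combined-induction Ci PSCs-NSC and DA neuron precursor transplantation for treating a Parkinson's disease pig model" (81771381)
Anhui Provincial Natural Science General Research Project for Universities "Research on named entity recognition and entity relation extraction in electronic medical records" (KJ2015B076by)
Anhui Universities Key Humanities and Social Sciences Project "Study of the association between mobile phone dependence and physical and mental health among university students in Anhui Province, based on statistical analysis and multidimensional association rule mining" (SK2017A0182)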
Keywords
biomedical
literature mining
lung cancer
gene
drug
relationship identification