Automatic hyponymy extracting method based on symptom components (cited by: 1)
Abstract Because hyponymy relations between symptoms have strong structural characteristics, an automatic hyponymy extraction method based on symptom components is proposed. First, inspection of symptom entities shows that a symptom can be segmented into eight kinds of components, such as atomic symptom words and modifiers, and that the sequence of components follows certain rules. Then, a lexical analysis system and a Conditional Random Field (CRF) model are used to segment symptoms and label their components. Finally, relation extraction between symptoms is treated as a classification problem: symptom constitution features, dictionary features and general features serve as the input features, models are trained with several classification algorithms, and each relation between symptoms is classified as hyponymy or non-hyponymy. The experimental results show that the best performance is obtained with the Support Vector Machine (SVM) algorithm using all three feature types together, with precision, recall and F1-measure reaching 82.68%, 82.13% and 82.40%, respectively. On this basis, the proposed extraction algorithm was used to extract 20,619 hyponymy relations and to build a symptom knowledge base with hyponymy relations.
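The final classification step can be illustrated with a minimal sketch. It assumes each symptom has already been segmented and labelled, and it computes a few structural features for a candidate (hypernym, hyponym) pair. The labels `ATOM` and `MOD`, the three feature names, and the English example tokens are hypothetical stand-ins chosen for illustration; the abstract does not list the actual eight component types or the exact features. The helper `f1` only checks that the reported F1 value agrees with the reported precision and recall.

```python
from typing import Dict, List, Tuple

# A labelled symptom is a list of (token, component label) pairs.
# "ATOM" and "MOD" are hypothetical labels for atomic symptom words and modifiers.
Labelled = List[Tuple[str, str]]


def is_subsequence(short: List[str], long: List[str]) -> bool:
    """Return True if `short` occurs in `long` in order (gaps allowed)."""
    it = iter(long)
    return all(tok in it for tok in short)


def pair_features(hyper: Labelled, hypo: Labelled) -> Dict[str, float]:
    """Illustrative structural features for a candidate (hypernym, hyponym) pair."""
    hyper_tokens = [tok for tok, _ in hyper]
    hypo_tokens = [tok for tok, _ in hypo]
    hyper_atoms = {tok for tok, lab in hyper if lab == "ATOM"}
    hypo_atoms = {tok for tok, lab in hypo if lab == "ATOM"}
    return {
        # Do the two symptoms share an atomic symptom word?
        "same_atom": float(bool(hyper_atoms & hypo_atoms)),
        # Is the candidate hypernym contained, in order, in the hyponym?
        "hyper_in_hypo": float(is_subsequence(hyper_tokens, hypo_tokens)),
        # How many more components does the candidate hyponym have?
        "extra_components": float(len(hypo) - len(hyper)),
    }


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Hypothetical example: "pain" as a candidate hypernym of "severe abdominal pain".
hyper = [("pain", "ATOM")]
hypo = [("severe", "MOD"), ("abdominal", "MOD"), ("pain", "ATOM")]
feats = pair_features(hyper, hypo)
reported_f1 = f1(82.68, 82.13)
```

Feature dictionaries of this kind could be vectorized and passed to any of the classifiers named in the keywords (SVM, decision tree, naive Bayes); which encoding the authors used is not stated here.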
Source Journal of Computer Applications (《计算机应用》), CSCD, Peking University Core Journal, 2017, No. 10, pp. 2999-3005 (7 pages)
Funding National 863 Program of China (2015AA020107); National Science and Technology Support Program of China (2015BAH12F01-05)
Keywords hyponymy; symptom component; Conditional Random Field (CRF); relationship classification; Support Vector Machine (SVM); decision tree; Naive Bayes (NB)