Abstract
To address the difficulty of extracting relations from open-domain Chinese text on the Internet, a new relation extraction method is proposed. To ease the difficulty of extracting relation triples, a new method for constructing relation triples based on attributes and concept instances is presented. The large set of concept-instance relation triples it extracts contains not only many explicit relation triples but also some implicit ones. On this basis, because the constructed relation triples contain noise and errors, a co-training method based on the Adaboost iterative algorithm is used to optimize the relation extraction model. Experiments on real text from encyclopedia entries in the university category show that, compared with similar relation extraction methods, the proposed method achieves better extraction performance in terms of recall and F-measure.
Authors
王旭阳
姜喜秋
WANG Xuyang, JIANG Xiqiu (College of Computer and Communication, Lanzhou University of Technology, Lanzhou 730050, China)
Source
《吉林大学学报(信息科学版)》
CAS
2017, Issue 4, pp. 430-437 (8 pages)
Journal of Jilin University(Information Science Edition)
Funding
Supported by the National Natural Science Foundation of China (61563030)