Abstract
Research on Chinese comparative sentences has concentrated mostly on linguistics, whereas the use of machine learning to identify comparative sentences has only just begun. Drawing on the basic principle of association rule mining algorithms, a comparative sentence identification method based on an associated feature vocabulary is proposed. The method treats a word together with its part of speech as a single basic element, defines how core words and dependent words in the feature vocabulary are associated, and uses a Support Vector Machine (SVM) classifier to identify comparative sentences. The experimental results show that the method identifies Chinese comparative sentences effectively and achieves good results in precision, recall and F-measure.
Source
Journal of Computer Applications (《计算机应用》)
CSCD
Peking University Core Journal (北大核心)
2013, Issue 6, pp. 1591-1594 (4 pages)
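Funding
National Natural Science Foundation of China (60873247)
National Social Science Foundation of China (12BXW040)
Science and Technology Innovation Program of the Ministry of Public Security (2011YYCXSDST057)
Natural Science Foundation of Shandong Province (ZR2012FM038, ZR2011FM030)
Science and Technology Development Program of Shandong Province (2012GGB01194)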
Keywords
comparative sentence identification
text classification
Chinese comparative pattern database
class sequential rule
associated feature vocabulary