Abstract
Comparison is a common way of expressing which of several things is better, or how they are alike or different. Automatically identifying comparative sentences and extracting the elements being compared is a novel and practically valuable topic in language information processing, particularly in sentiment analysis. Because comparative sentences and comparative elements are interdependent (each presupposes the other), we treat comparative sentence identification and comparative element extraction as a single task and accomplish both simultaneously. Based on a semantic classification of words, we construct a lexicon system consisting of a domain lexicon, a sentiment lexicon, a mark lexicon, and a common lexicon; based on a semantic classification of Chinese comparative sentences, we then build a rule base for comparative sentence identification and comparative element extraction. On the test corpus released by the Fourth Chinese Opinion Analysis Evaluation (COAE2012), the proposed system achieves promising experimental (evaluation) results.
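The idea of handling identification and extraction in one pass can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy lexicon entries, the single rule (subject, mark, object, result), and the names `find_first` and `extract_comparison` are hypothetical and are not taken from the paper, whose actual dictionaries and rule base are not given in the abstract. The common lexicon is omitted for brevity.

```python
from typing import Dict, List, Optional

# Hypothetical toy lexicons (illustrative entries only, not the paper's dictionaries).
DOMAIN_LEXICON: List[str] = ["诺基亚", "苹果", "屏幕", "电池"]
SENTIMENT_LEXICON: List[str] = ["清晰", "耐用", "好", "差"]
MARK_LEXICON: List[str] = ["不如", "比"]  # comparative marks, tried in this order


def find_first(text: str, lexicon: List[str]) -> Optional[str]:
    """Return the lexicon entry that occurs earliest in text, or None."""
    hits = [(text.find(word), word) for word in lexicon if word in text]
    return min(hits)[1] if hits else None


def extract_comparison(sentence: str) -> Optional[Dict[str, str]]:
    """Apply one assumed rule: <subject> <mark> <object> <result>.

    A sentence counts as comparative only if all elements are found,
    so identification and extraction happen in the same step.
    """
    for mark in MARK_LEXICON:
        pos = sentence.find(mark)
        if pos == -1:
            continue
        left, right = sentence[:pos], sentence[pos + len(mark):]
        subject = find_first(left, DOMAIN_LEXICON)
        obj = find_first(right, DOMAIN_LEXICON)
        result = find_first(right, SENTIMENT_LEXICON)
        if subject and obj and result:
            return {"subject": subject, "mark": mark,
                    "object": obj, "result": result}
    return None  # no rule matched: treated as non-comparative


if __name__ == "__main__":
    print(extract_comparison("苹果的屏幕比诺基亚清晰"))
    print(extract_comparison("苹果的屏幕很清晰"))
```

In this sketch a sentence without a mark word, or with a mark word but missing elements, is simply rejected, which is one way to read the interdependence the abstract describes; a realistic rule base would need many more patterns and word segmentation.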
Source
《中文信息学报》
CSCD
Peking University Core Journals (北大核心)
2014, No. 3, pp. 136-141, 149 (7 pages in total)
Journal of Chinese Information Processing
Funding
Key Project of the 12th Five-Year Plan of the State Language Commission (ZDI125-3)
Keywords
semantic classification
lexicons and rules
comparative sentence identification
comparative element extraction