摘要
任务旨在对带有情感的文本数据进行浓缩、提炼进而产生文本所表达的关于情感意见的摘要,用以帮助用户更好地阅读、理解情感文本的内容。该文主要研究多文档的文本情感摘要问题,重点针对网络上存在的同一个产品的多个评论进行摘要抽取。在情感文本中,情感相关性是一个重要的特点,该文将充分考虑情感信息对文本情感摘要的重要影响。同时,对于评论语料,质量高的评论或者说可信度高的评论可以帮助用户更好的了解评论中所评价的对象。因此,该文将充分考虑评论质量对文本情感摘要的影响。并且为了进行关于文本情感摘要的研究,该文收集并标注了一个基于产品评论的英文多文档文本情感摘要语料库。实验证明,情感信息和评论质量能够帮助多文档文本情感摘要,提高摘要效果。
Opinion summarization aims to concentrate and refine the text data so as to generate a summary of the text regarding the expressed opinion. It helps users reading and understanding the content of the opinion text. This study focuses on multi-document opinion summarization where the main task is to generate a summary given amounts of reviews towards the same product. Opinion relevance is an important feature for opinion text, which is considered in our opinion summarization method. Meanwhile,users can better understand the objects that mentioned in the re- views by the help of high quality reviews or high credibility reviews, which is also considered in our method. We further collect and annotate an English multi-document corpus on product reviews. Empirical studies on the corpus demonstrate that incorporating opinion and quality information is effective for multi -document opinion summariza- tion.
出处
《中文信息学报》
CSCD
北大核心
2015年第4期33-39,共7页
Journal of Chinese Information Processing
基金
国家自然科学基金(61003155
60873150)
模式识别国家重点实验室开放课题基金资助项目
关键词
情感摘要
多文档
评论质量
opinion summarization
multi-document
reviews quality