Abstract
Multi-document automatic summarization is an important research direction in natural language processing. However, evaluation methods for multi-document summaries remain limited in variety and lack a unified standard. To address these problems, this paper tentatively proposes a set of criteria for the information coverage of multi-document summaries. The criteria involve several key parameters: an improved BLEU parameter (improved recall), the coverage of effective words from the original documents, and the coverage of high-frequency words. Experiments show that the criteria accurately reflect the strengths and weaknesses of a summarization system with respect to information coverage, and that the results are close to those of human evaluation.
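The abstract names three coverage parameters but does not give their formulas. The sketch below is only one plausible reading, not the paper's definitions: it assumes the recall-oriented BLEU variant is a clipped n-gram recall against a reference summary, that "effective words" are the non-stopword vocabulary of the source documents, and that "high-frequency words" are the top-k most frequent effective words. All function names, the stopword set, and the `top_k` parameter are hypothetical.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def ngram_recall(summary, reference, n=1):
    """Clipped n-gram recall: matched reference n-grams / all reference n-grams.

    Assumption: stands in for the 'improved BLEU (recall)' parameter.
    """
    ref = ngrams(reference, n)
    summ = ngrams(summary, n)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    matched = sum(min(count, summ[gram]) for gram, count in ref.items())
    return matched / total


def effective_word_coverage(summary, documents, stopwords):
    """Share of distinct non-stopword source words that appear in the summary."""
    source = {w for doc in documents for w in doc if w not in stopwords}
    if not source:
        return 0.0
    return len(source & set(summary)) / len(source)


def high_frequency_coverage(summary, documents, stopwords, top_k=10):
    """Share of the top_k most frequent non-stopword source words in the summary."""
    counts = Counter(w for doc in documents for w in doc if w not in stopwords)
    top = [w for w, _ in counts.most_common(top_k)]
    if not top:
        return 0.0
    present = set(summary)
    return sum(1 for w in top if w in present) / len(top)


if __name__ == "__main__":
    docs = [["the", "cat", "sat", "on", "the", "mat"],
            ["the", "cat", "ate", "fish"]]
    stop = {"the", "on"}
    summary = ["cat", "ate", "fish"]
    reference = ["the", "cat", "ate", "fish"]
    print(ngram_recall(summary, reference, n=1))
    print(effective_word_coverage(summary, docs, stop))
    print(high_frequency_coverage(summary, docs, stop, top_k=1))
```

The inputs are pre-tokenized lists; for Chinese text, a word segmenter would be needed before these functions apply. How the three scores are combined into a single judgment is not stated in the abstract and is left open here.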
Source
Computer Engineering and Applications (《计算机工程与应用》)
CSCD
Peking University Core Journal (北大核心)
2007, Issue 2, pp. 180-183 (4 pages)
Keywords
BLEU
high-frequency word coverage
effective word coverage
recall