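Abstract
This paper introduces several key techniques for an automatic summarization system for dialogue text, including genre identification, identification of dialogue information units, and the linking of questions to their answers. Coherence is an important measure of summary quality. Because dialogue text is inherently interactive, the coherence of a summary often spans the contributions of different speakers and takes the form of question-answer pairs. The paper designs a method for automatically identifying this local coherence: it first identifies all question sentences, then identifies the answer sentences corresponding to each question to form question-answer pairs, and finally selects sentences from these pairs according to heuristic rules to generate the summary. Experimental results show that the method achieves relatively high identification accuracy and greatly improves the coherence of dialogue-text summaries without losing summary information content.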
Automatic summarization of spoken dialogues is a relatively new area. This paper proposes several key techniques: (1) detection of spoken dialogues; (2) detection and linking of cross-speaker information units (question-answer pairs). Because dialogues are interactive, local regions of coherence often stretch across different speakers. An approach to automatically detect those regions of local coherence is presented. First, all questions are detected. Second, the answers corresponding to each question are detected to form question-answer pairs. Finally, sentences are extracted from the question-answer pairs to compose the summary. Experimental results show that the approach achieves high detection accuracy and significantly increases summary fluency without compromising informativeness.
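The three steps described in the abstract (detect questions, link each question to an answer from another speaker, then select sentences from the pairs) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the English cue words, the Jaccard word-overlap score standing in for sentence similarity, the look-ahead `window`, and the heuristic that keeps both halves of a pair whenever one half is already selected. The function names are hypothetical.

```python
import re

# Assumed question cues; the abstract does not list the paper's actual features.
QUESTION_CUES = {"who", "what", "when", "where", "why", "how", "which",
                 "do", "does", "did", "is", "are", "can", "could", "will", "would"}


def words(text):
    """Lowercased word tokens of an utterance."""
    return re.findall(r"[a-z']+", text.lower())


def is_question(text):
    """Step 1 (assumed rule): ends with '?' or starts with a cue word."""
    w = words(text)
    return text.strip().endswith("?") or (bool(w) and w[0] in QUESTION_CUES)


def similarity(a, b):
    """Jaccard word overlap, a stand-in for the paper's sentence similarity."""
    ta, tb = set(words(a)), set(words(b))
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def link_pairs(dialogue, window=3):
    """Step 2: pair each question with the most similar non-question
    utterance by a different speaker within the next `window` turns."""
    pairs = []
    for i, (speaker, text) in enumerate(dialogue):
        if not is_question(text):
            continue
        candidates = [
            (j, t)
            for j, (s, t) in enumerate(dialogue[i + 1:i + 1 + window], start=i + 1)
            if s != speaker and not is_question(t)
        ]
        if not candidates:
            continue
        # Highest similarity wins; ties go to the earlier utterance.
        best = max(candidates, key=lambda c: (similarity(text, c[1]), -c[0]))
        pairs.append((i, best[0]))
    return pairs


def summarize(dialogue, selected):
    """Step 3 (assumed heuristic): if either half of a question-answer
    pair is already selected, keep both halves so the summary stays coherent."""
    keep = set(selected)
    for q, a in link_pairs(dialogue):
        if q in keep or a in keep:
            keep.update((q, a))
    return [dialogue[i] for i in sorted(keep)]
```

A dialogue here is a list of `(speaker, utterance)` tuples, and `selected` holds the indices chosen by some upstream content-based extractor. Keeping pairs together is one plausible way to raise coherence without dropping already selected content, which is the trade-off the abstract reports.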
Source
《计算机仿真》
CSCD
2005, Issue 5, pp. 226-230 (5 pages)
Computer Simulation
Funding
Supported by the 863 Program (Grant No. 2002AA119050).
Keywords
Dialogue text (对话文本)
Coherence (连贯性)
Question-answer pairs (问答对)
Sentence relatedness (语句相关度)
Spoken dialogs
Fluency
Question-answer pairs
Sentence similarity