
A Summarization Unit Selecting Method Based on Cloud Model
Abstract This paper proposes a summarization unit selection method based on the cloud model. The cloud model accounts for both the randomness and the fuzziness of summarization units, with the aim of improving the performance of a query-focused multi-document summarization system. First, the relevance between each summarization unit and the query is computed: the relevance scores between the unit and each query word are treated as cloud drops, and the uncertainty of the resulting cloud is used to identify the units that are genuinely relevant to the query. Next, the query-relevance result is adjusted by importance within the document set: the similarities between a sentence and each of the other sentences are treated as cloud drops, and the numerical characteristics of the cloud are used to compute sentence importance, so that the selected sentences cover as much of the document set's content as possible instead of answering the query from only one aspect. The method was evaluated on a large-scale public English corpus and was entered in the TAC (Text Analysis Conference) 2010 evaluation, where it obtained satisfactory results.
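The abstract does not give the formulas used to turn cloud drops into a score, so the following is only a minimal sketch of one common way to estimate a cloud's three numerical characteristics (expectation Ex, entropy En, hyper-entropy He) from a set of drops: the backward cloud generator without certainty degrees. The function name `backward_cloud`, the sample relevance values, and the choice to clamp a negative variance difference to zero are assumptions made for illustration, not details taken from the paper.

```python
import math
from statistics import mean, variance


def backward_cloud(drops):
    """Estimate (Ex, En, He) from cloud drops, without certainty degrees.

    Hypothetical helper: `drops` could be, for example, the relevance scores
    between one summarization unit and each query word.
    """
    if len(drops) < 2:
        raise ValueError("need at least two cloud drops")
    ex = mean(drops)                                   # expectation Ex
    # Entropy En from the first-order absolute central moment.
    en = math.sqrt(math.pi / 2) * mean(abs(x - ex) for x in drops)
    s2 = variance(drops, ex)                           # sample variance
    # Hyper-entropy He; clamping at zero is an assumption of this sketch.
    he = math.sqrt(max(s2 - en * en, 0.0))
    return ex, en, he


# Made-up scores: one unit evenly related to all four query words,
# another related to only two of them.
even_unit = backward_cloud([0.5, 0.5, 0.5, 0.5])
uneven_unit = backward_cloud([1.0, 0.0, 1.0, 0.0])
print(even_unit, uneven_unit)
```

Both sample units have the same Ex, but the second has a larger En, which is the kind of uncertainty signal the abstract describes using to separate units that match the whole query from units that match only part of it. How the paper combines Ex, En, and He into a final score is not stated here.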
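Author Chen Jinguang (陈劲光)
Source Journal of Chinese Information Processing (《中文信息学报》), CSCD, PKU Core, 2016, No. 5, pp. 187-194, 202 (9 pages)
Funding Ministry of Education Humanities and Social Sciences General Project (13YJCZH013); Huzhou University Humanities and Social Sciences Pre-research Project (KY27015A)
Keywords cloud model; automatic summarization; query-focused multi-document summarization; uncertainty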
