A Summarization Unit Selection Method Based on the Cloud Model (一种基于云模型的文摘单元选取方法研究)


Abstract: This paper proposes a summarization unit selection method based on the cloud model. The cloud model accounts for both the randomness and the fuzziness of summarization units, with the aim of improving query-focused multi-document summarization. First, relevance to the query is computed: the relevance scores between a summarization unit and each query word are treated as cloud drops, and the uncertainty of the resulting cloud is used to identify units that are genuinely relevant to the query, so that a more relevant unit receives a higher score. Next, the query-relevance result is adjusted by importance within the document set: the similarities between a sentence and every other sentence in the set are treated as cloud drops, and the numerical characteristics of the resulting cloud are used to compute sentence importance. This favors sentences that cover as much of the document set's content as possible and avoids answers that address only one aspect of the query. The method was evaluated on a large-scale public English benchmark corpus and was also entered in the TAC (Text Analysis Conference) 2010 evaluation, with satisfactory results.
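Both steps described in the abstract reduce a set of cloud drops to the numerical characteristics of a cloud. A minimal Python sketch follows, assuming the standard backward cloud generator for the expectation Ex, entropy En, and hyper-entropy He. The abstract does not state the paper's scoring formula, so `unit_score` and its `penalty` parameter are hypothetical illustrations, not the author's method.

```python
import math
from statistics import fmean


def backward_cloud(drops):
    """Estimate (Ex, En, He) from cloud drops.

    Uses the standard backward cloud generator that needs only the
    drop values, not their certainty degrees.
    """
    n = len(drops)
    if n < 2:
        raise ValueError("need at least two cloud drops")
    # Ex: sample mean of the drops.
    ex = fmean(drops)
    # En: derived from the first-order absolute central moment.
    en = math.sqrt(math.pi / 2) * fmean(abs(x - ex) for x in drops)
    # Sample variance with the n - 1 denominator.
    var = sum((x - ex) ** 2 for x in drops) / (n - 1)
    # He: clamp at zero because var - En^2 can be negative on small samples.
    he = math.sqrt(max(0.0, var - en ** 2))
    return ex, en, he


def unit_score(drops, penalty=1.0):
    """Hypothetical score (an assumption, not the paper's formula).

    Rewards a high expected relevance and penalizes uncertainty,
    so evenly relevant units outrank unevenly relevant ones.
    """
    ex, en, he = backward_cloud(drops)
    return ex - penalty * (en + he)
```

Under this assumed score, a unit that is evenly relevant to every query word ranks above one with the same mean relevance concentrated on only some of the words: `unit_score([0.5, 0.5, 0.5, 0.5])` exceeds `unit_score([0.9, 0.1, 0.9, 0.1])`. The same `backward_cloud` call could be applied to sentence-to-sentence similarities for the document-set importance step.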

Author: Chen Jinguang (陈劲光)

Affiliation: School of Teacher Education, Huzhou University (湖州师范学院教师教育学院)

Source: Journal of Chinese Information Processing (《中文信息学报》), indexed in CSCD and the Peking University Core Journals list, 2016, No. 5, pp. 187-194, 202 (9 pages)

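Funding: Ministry of Education Humanities and Social Sciences General Project (13YJCZH013); Huzhou University Humanities and Social Sciences Pre-research Project (KY27015A)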

Keywords: cloud model; query-focused multi-document summarization; uncertainty

Classification code: TP311.13 [Automation and Computer Technology: Computer Software and Theory]


