Research on an Improved Automatic Abstracting Algorithm Based on Optimised Sentence Similarity Calculation (cited by: 3)

ON AUTOMATIC ABSTRACTING ALGORITHM BASED ON OPTIMISED SENTENCES SIMILARITY CALCULATION
Abstract: Sentence similarity calculation plays an important role in automatic question answering, machine translation, information retrieval, and automatic abstracting. This paper first summarises existing methods for calculating sentence similarity, then proposes an optimised method that jointly considers keyword features, semantic features, sentence-structure features, and sentence-length features. Building on sentence weight computation and optimised redundancy handling, it implements an improved automatic abstracting algorithm. Experiments on the DUC evaluation corpora indicate that the algorithm is effective at improving summary quality. Finally, the paper discusses open problems in automatic abstracting research and points out research trends in the field.
Source: Computer Applications and Software (《计算机应用与软件》), indexed in CSCD and the Peking University Core Journals list, 2013, No. 9, pp. 160-162, 182 (4 pages)
Funding: Key Projects of Basic and Frontier Technology Research, Henan Provincial Department of Science and Technology (112300410266, 112300410262)
Keywords: sentence similarity; automatic abstracting; sentence weight computation; redundancy resolution
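
The abstract names the ingredients of the method (a similarity score combining several sentence features, sentence weighting, and redundancy handling) but not its formulas. The following is a minimal Python sketch of that general pipeline, not the paper's algorithm: the feature definitions, the `WEIGHTS` values, the `max_redundancy` threshold, and all function names are assumptions chosen for illustration, and the semantic feature is left out because it would need an external lexical resource.

```python
# Illustrative sketch only: the abstract does not give the paper's formulas,
# so every feature definition and weight below is an assumption.
import re


def tokenize(sentence):
    """Lowercase word tokens; a stand-in for real word segmentation."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def keyword_sim(a, b):
    """Jaccard overlap of the two token sets (keyword feature)."""
    sa, sb = set(a), set(b)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def order_sim(a, b):
    """Jaccard overlap of adjacent-word pairs; a crude proxy for sentence form."""
    ba, bb = set(zip(a, a[1:])), set(zip(b, b[1:]))
    if not ba or not bb:
        return 0.0
    return len(ba & bb) / len(ba | bb)


def length_sim(a, b):
    """Ratio of the shorter token count to the longer one."""
    if not a or not b:
        return 0.0
    return min(len(a), len(b)) / max(len(a), len(b))


# Hypothetical weights; the semantic feature is omitted in this sketch.
WEIGHTS = {"keyword": 0.5, "order": 0.3, "length": 0.2}


def sentence_sim(s1, s2):
    """Weighted linear combination of the three feature similarities."""
    a, b = tokenize(s1), tokenize(s2)
    return (WEIGHTS["keyword"] * keyword_sim(a, b)
            + WEIGHTS["order"] * order_sim(a, b)
            + WEIGHTS["length"] * length_sim(a, b))


def summarize(sentences, k=2, max_redundancy=0.5):
    """Weight each sentence by its total similarity to the others, then pick
    up to k sentences greedily, skipping any candidate that is too similar
    to a sentence already chosen (redundancy handling)."""
    n = len(sentences)
    weight = [
        sum(sentence_sim(sentences[i], sentences[j]) for j in range(n) if j != i)
        for i in range(n)
    ]
    ranked = sorted(range(n), key=lambda i: weight[i], reverse=True)
    chosen = []
    for i in ranked:
        if all(sentence_sim(sentences[i], sentences[j]) <= max_redundancy
               for j in chosen):
            chosen.append(i)
        if len(chosen) == k:
            break
    # Return the selected sentences in their original document order.
    return [sentences[i] for i in sorted(chosen)]
```

For example, given two near-duplicate sentences and one unrelated sentence, `summarize(..., k=2)` keeps the higher-weighted of the near-duplicates and the unrelated sentence, since the second near-duplicate exceeds the redundancy threshold. A real system for Chinese text would replace `tokenize` with a word segmenter and tune the weights on evaluation data such as the DUC corpora mentioned in the abstract.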