In the era of Big Data, we face the inevitable and challenging problem of "information overload". To alleviate this problem, effective automatic text summarization techniques are needed to extract key information quickly and efficiently from huge amounts of text. In this paper, we propose a hybrid method of extractive text summarization based on deep learning and graph ranking algorithms (ETSDG). In this method, a pre-trained deep learning model is designed to yield useful sentence embeddings. Given the association between sentences in raw documents, a traditional LexRank algorithm with fine-tuning is adopted in ETSDG. To improve the performance of the extractive text summarization method, we further integrate the traditional LexRank algorithm with deep learning. Test results on the DUC2004 data set show that ETSDG performs better on ROUGE metrics than certain benchmark methods.
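The combination of sentence embeddings and graph ranking described in this abstract can be illustrated with a minimal sketch. It is not the ETSDG implementation: the abstract does not name the pre-trained model or the fine-tuning details, so the embeddings are taken as given input, and the names `cosine`, `lexrank`, `summarize`, and the `threshold` and `damping` values are assumptions chosen for illustration. The ranking step is a plain thresholded LexRank-style power iteration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two embedding vectors (0.0 for zero vectors)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def lexrank(embeddings, threshold=0.1, damping=0.85, iters=100, tol=1e-8):
    """Score sentences by centrality in a similarity graph.

    threshold and damping are illustrative defaults, not values from the paper.
    """
    n = len(embeddings)
    # Build the sentence graph: keep an edge only if similarity >= threshold.
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                s = cosine(embeddings[i], embeddings[j])
                sim[i][j] = s if s >= threshold else 0.0
    # Row-normalise into a transition matrix; isolated sentences jump uniformly.
    trans = []
    for row in sim:
        total = sum(row)
        trans.append([x / total for x in row] if total > 0 else [1.0 / n] * n)
    # Power iteration with damping, as in PageRank.
    scores = [1.0 / n] * n
    for _ in range(iters):
        new = [
            (1 - damping) / n
            + damping * sum(scores[i] * trans[i][j] for i in range(n))
            for j in range(n)
        ]
        delta = sum(abs(a - b) for a, b in zip(new, scores))
        scores = new
        if delta < tol:
            break
    return scores


def summarize(sentences, embeddings, k=2):
    """Return the k highest-scoring sentences in their original order."""
    scores = lexrank(embeddings)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]


# Toy embeddings (hypothetical): three related sentences and one outlier.
toy_sentences = ["a", "b", "c", "d"]
toy_embeddings = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.8, 0.2, 0.0], [0.0, 0.0, 1.0]]
toy_scores = lexrank(toy_embeddings)
toy_summary = summarize(toy_sentences, toy_embeddings, k=2)
```

In practice the toy vectors would be replaced by the output of whatever sentence encoder is used; the graph construction and ranking steps stay the same.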
Sentence similarity computing plays an important role in machine question-answering systems, machine-translation systems, information retrieval, and automatic abstracting systems. This article first summarizes several methods for calculating the similarity between sentences and then proposes a new method that takes several factors into consideration, including critical words, semantic information, sentential form, and sentence length. On this basis, an automatic abstracting system based on the LexRank algorithm is implemented. We made several improvements in both sentence weight computing and redundancy resolution. The system described in this article can handle single-document and multi-document summarization in both English and Chinese. In evaluations on two corpora, our system produced better summaries to a certain degree. We also show that our system is quite insensitive to noise in the data that may result from imperfect topical clustering of documents. Finally, existing problems and development trends in automatic summarization technology are discussed.
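A multi-factor sentence similarity and a redundancy filter of the kind mentioned in this abstract could look roughly like the sketch below. The abstract gives no formulas, so everything here is an assumption: the factor definitions, the weights in `combined_similarity`, the tiny `STOPWORDS` set, and the `max_sim` cutoff are hypothetical, and the semantic-information factor is left out because it would need an external lexical resource. Redundancy resolution is shown as a simple greedy threshold filter, not the authors' procedure.

```python
from difflib import SequenceMatcher

# Hypothetical, deliberately tiny stopword list for the example.
STOPWORDS = {"the", "a", "an", "on", "of", "in", "and", "to", "is"}


def keyword_overlap(a, b):
    """Jaccard overlap of non-stopword tokens (stand-in for 'critical words')."""
    ka = {w for w in a.lower().split() if w not in STOPWORDS}
    kb = {w for w in b.lower().split() if w not in STOPWORDS}
    if not ka or not kb:
        return 0.0
    return len(ka & kb) / len(ka | kb)


def form_similarity(a, b):
    """Token-sequence match ratio (rough proxy for 'sentential form')."""
    return SequenceMatcher(None, a.lower().split(), b.lower().split()).ratio()


def length_ratio(a, b):
    """Ratio of the shorter to the longer sentence length, in tokens."""
    la, lb = len(a.split()), len(b.split())
    return min(la, lb) / max(la, lb) if max(la, lb) else 0.0


def combined_similarity(a, b, w_kw=0.5, w_form=0.3, w_len=0.2):
    """Weighted sum of the three factors; the weights are illustrative only."""
    return (
        w_kw * keyword_overlap(a, b)
        + w_form * form_similarity(a, b)
        + w_len * length_ratio(a, b)
    )


def select_non_redundant(sentences, scores, k, max_sim=0.6):
    """Greedily pick up to k sentences by score, skipping near-duplicates."""
    order = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    chosen = []
    for i in order:
        if len(chosen) == k:
            break
        if all(combined_similarity(sentences[i], sentences[j]) < max_sim
               for j in chosen):
            chosen.append(i)
    return sorted(chosen)  # indices in original document order


docs = [
    "the cat sat on the mat",
    "the cat sat on the mat",
    "stock prices fell sharply today",
]
picked = select_non_redundant(docs, scores=[0.9, 0.8, 0.5], k=2)
```

The `scores` argument would normally come from a ranking step such as LexRank; here it is hard-coded so the filter can be seen in isolation. Whitespace tokenization is also a simplification and would not work for Chinese text without a word segmenter.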