Abstract
Unlike traditional multi-document summarization methods, which build a summary by adding sentences, this paper proposes a two-stage sentence selection approach that generates the summary by deleting sentences from a candidate sentence set. The two stages are the acquisition of a candidate sentence set and the optimal selection of sentences. In the first stage, the candidate sentence set is obtained by a redundancy-based sentence selection approach. In the second stage, sentences are deleted from the candidate set according to their contribution to the whole set until the specified summary length is reached. On a test corpus, the ROUGE scores of the summaries generated by the proposed approach, compared with those of the traditional sentence selection method, demonstrate its validity. The influence of the choice of token in the two-stage sentence selection approach on the quality of the generated summaries is also analyzed.
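The two stages described in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify the redundancy measure or the contribution score, so everything concrete here is an assumption made for illustration: whitespace-separated lowercase words as tokens, Jaccard overlap as the redundancy test, sentence-frequency-weighted token coverage as the contribution measure, and the hypothetical names `candidate_set`, `prune`, and `summarize`.

```python
from collections import Counter


def tokens(sentence):
    # Assumed token unit: lowercase whitespace-separated words.
    return set(sentence.lower().split())


def jaccard(a, b):
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def candidate_set(sentences, max_sim=0.5):
    """Stage 1 (assumed form): keep a sentence only if it is not too
    similar to any sentence already kept."""
    kept = []
    for s in sentences:
        if all(jaccard(tokens(s), tokens(k)) < max_sim for k in kept):
            kept.append(s)
    return kept


def coverage(sentences, weights):
    # Total weight of the distinct tokens covered by the given sentences.
    covered = set().union(*(tokens(s) for s in sentences))
    return sum(weights[t] for t in covered)


def prune(candidates, weights, max_words):
    """Stage 2 (assumed form): repeatedly delete the sentence whose removal
    costs the least coverage until the summary fits the length limit."""
    summary = list(candidates)
    while summary and sum(len(s.split()) for s in summary) > max_words:
        drop = max(
            range(len(summary)),
            key=lambda i: coverage(summary[:i] + summary[i + 1:], weights),
        )
        del summary[drop]
    return summary


def summarize(sentences, max_words, max_sim=0.5):
    # Token weight = number of input sentences containing the token.
    weights = Counter(t for s in sentences for t in tokens(s))
    return prune(candidate_set(sentences, max_sim), weights, max_words)
```

Calling `summarize(sentences, max_words=100)` on a list of sentence strings returns the pruned candidate set in its original order. Changing `tokens` (for example, to character n-grams) is the place where a different token choice would be explored in this sketch.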
Funding
the National Natural Science Foundation of China (No. 60575041)
the High Technology Research and Development Program of China (No. 2006AA01Z150).