Text summarization aims to generate a concise version of the original text. The longer the summary, the more detail it retains from the original text; the appropriate length depends on the intended use. Generating summaries of a desired length is therefore a vital step in putting summarization research into practice. To solve this problem, we propose a new method that integrates the desired summary length into the encoder-decoder model for abstractive text summarization. The length parameter is integrated into the encoding phase at each self-attention step, and into the decoding process by tracking the remaining length, which is used when computing head attention during generation and as a length embedding added to the word embeddings. We conducted experiments with the proposed model on two data sets, Cable News Network (CNN) Daily and NEWSROOM, with different desired output lengths. The results show the proposed model's effectiveness compared with related studies.
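The decoder-side idea in this abstract (a remaining-length signal added to the word embeddings) can be sketched as below. This is only an illustration under stated assumptions, not the authors' implementation: the table sizes, the random initialization, and the names `make_table`, `remaining_lengths`, and `decoder_inputs` are all hypothetical, and the encoder-side integration and the attention computation are not shown because the abstract does not specify them.

```python
import random


def make_table(num_rows, dim, seed):
    """Build a toy embedding table (random values stand in for learned weights)."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(num_rows)]


def remaining_lengths(num_steps, desired_len):
    """Remaining token budget at each decoding step, floored at zero."""
    return [max(desired_len - step, 0) for step in range(num_steps)]


def decoder_inputs(token_ids, desired_len, word_table, length_table):
    """Add a remaining-length embedding to each word embedding.

    Assumption: the remaining length indexes a separate embedding table and
    is clamped to the table size; the paper may encode it differently.
    """
    inputs = []
    budget = remaining_lengths(len(token_ids), desired_len)
    for token_id, remaining in zip(token_ids, budget):
        row = min(remaining, len(length_table) - 1)
        word_vec = word_table[token_id]
        length_vec = length_table[row]
        inputs.append([w + l for w, l in zip(word_vec, length_vec)])
    return inputs


# Hypothetical usage: vocabulary of 5 tokens, lengths 0..3, embedding size 3.
word_table = make_table(5, 3, seed=0)
length_table = make_table(4, 3, seed=1)
vectors = decoder_inputs([2, 4, 1], desired_len=2,
                         word_table=word_table, length_table=length_table)
```

Because the length signal counts down to zero, the decoder sees at every step how much of the budget is left, which is what lets a model learn to end the summary near the requested length.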
Although contrastive move analysis of article abstracts has been a research focus, few studies address abstracts of natural science articles. To fill this gap, this study, based on the IMRD model, focuses on aquatic biology abstracts and contrasts those written by native English speakers with those written by Chinese authors. Combining quantitative and qualitative analysis, it reveals their differences and similarities in terms of the frequency of different moves, sentence length, and the significance of move length. These similarities and differences can be explained by the face culture of China, differences in language proficiency, and the common conventions of academic abstracts.
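This study examines how different text segmentation methods affect summary quality when a pre-trained Pegasus model is used for long-text summarization. Two hundred academic papers on the STM32 microcontroller were collected from CNKI as experimental texts, and four segmentation methods were compared: sliding window, sentence segmentation, paragraph segmentation, and sliding window plus sentence segmentation. The generated summaries were evaluated with ROUGE (Recall-Oriented Understudy for Gisting Evaluation) metrics, and the results were analyzed in detail. In terms of summary quality, paragraph segmentation performed best, with ROUGE-1, ROUGE-2, and ROUGE-L scores of 30.85, 7.60, and 20.15 respectively, slightly above sentence segmentation and significantly better than sentence segmentation plus sliding window. The study aims to give researchers and developers practical experience and insight into long-text summarization.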
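Two of the segmentation strategies named in this abstract, sliding window and sentence segmentation, can be sketched as follows. This is a minimal illustration, not the study's code: the function names, the window and stride values, and the use of whitespace-separated words as a stand-in for model tokens are all assumptions (Chinese text in particular would need a real tokenizer).

```python
def sliding_window_chunks(tokens, window, stride):
    """Split a token list into overlapping windows of at most `window` tokens."""
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    if not tokens:
        return []
    chunks = []
    start = 0
    while True:
        chunks.append(tokens[start:start + window])
        if start + window >= len(tokens):
            break  # the last window already reaches the end of the text
        start += stride
    return chunks


def sentence_chunks(sentences, max_words):
    """Greedily pack whole sentences into chunks of at most `max_words` words.

    A single sentence longer than `max_words` is kept intact as its own chunk.
    """
    chunks, current, count = [], [], 0
    for sentence in sentences:
        n_words = len(sentence.split())
        if current and count + n_words > max_words:
            chunks.append(" ".join(current))
            current, count = [], 0
        current.append(sentence)
        count += n_words
    if current:
        chunks.append(" ".join(current))
    return chunks


# Hypothetical usage: each chunk would be summarized separately and the
# partial summaries joined afterwards.
windows = sliding_window_chunks(list(range(10)), window=4, stride=3)
packed = sentence_chunks(["a b c.", "d e.", "f g h i."], max_words=5)
```

The sliding window may cut a sentence in the middle but repeats context across chunks, while sentence packing keeps sentence boundaries intact at the cost of uneven chunk sizes.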
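Traditional text summarization methods, such as summary generation models built on recurrent neural networks and the Encoder-Decoder framework, suffer from insufficient parallelism or poor handling of long-term dependencies, as well as limited accuracy and fluency in the generated summaries. To address this, a dynamic word embedding summarization method is proposed. Based on an improved Transformer model, the method introduces prior knowledge at the text preprocessing stage: ELMo (Embeddings from Language Models) dynamic word vectors serve as the word representations of the training text, and each is concatenated with the sentence vector of the sentence containing that word to form the input text matrix. The matrix is fed into the Encoder to produce a fixed-length text vector representation, which the Decoder then decodes into the target summary. Experiments use ROUGE scores as the evaluation metric, and comparisons with other methods show that the proposed method generates summaries with higher accuracy and fluency.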
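The input construction described in this abstract (each word vector joined with the vector of its containing sentence) can be sketched as below. This is an illustration only: the toy vectors stand in for ELMo word vectors and for sentence vectors, whose exact computation the abstract does not give, and `build_input_matrix` is a hypothetical name.

```python
def build_input_matrix(word_vectors, sentence_ids, sentence_vectors):
    """Concatenate each word vector with the vector of its containing sentence.

    word_vectors:     one vector per word (stand-in for ELMo output)
    sentence_ids:     for each word, the index of the sentence it belongs to
    sentence_vectors: one vector per sentence (assumed to be given)
    """
    if len(word_vectors) != len(sentence_ids):
        raise ValueError("need exactly one sentence id per word vector")
    return [
        list(word_vec) + list(sentence_vectors[sent_id])
        for word_vec, sent_id in zip(word_vectors, sentence_ids)
    ]


# Hypothetical usage: three words, the first two in sentence 0, the last in sentence 1.
matrix = build_input_matrix(
    word_vectors=[[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]],
    sentence_ids=[0, 0, 1],
    sentence_vectors=[[0.1], [0.2]],
)
```

Each row of the resulting matrix has the word dimension plus the sentence dimension, so every token carries sentence-level context into the Encoder.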
Funding: Vietnam National Foundation for Science and Technology Development (NAFOSTED), Grant Number 102.05-2020.26.