Abstract
Intelligence collection in the information age is no longer constrained by the limits of traditional collection methods, but the wide range of available sources yields more data than can be processed manually. For current intelligence, which consists of highly specialized, long, multi-topic texts, this paper proposes an automatic summarization method based on topic clustering. The knowledge and semantic relationships contained in a knowledge graph are used to enrich the semantic information in sentence vectors, which are then clustered. The clustering results are further optimized based on topic features. Finally, the similarities between sentences within each topic are calculated, and the most representative sentence in each topic is selected to form the summary. The proposed method has two significant advantages. First, sentence vectors with enhanced semantic information cluster better than their traditional counterparts. Second, using keyword features rather than topic models makes the method faster without reducing accuracy.
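The cluster-then-select pipeline described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the authors' implementation: the function names (`embed`, `cosine`, `cluster`, `summarize`) are hypothetical, a toy bag-of-words vector stands in for the knowledge-graph-enhanced sentence vectors, a small deterministic k-means stands in for the paper's clustering step, and the topic-feature optimization step is omitted.

```python
import math
from collections import Counter


def embed(sentence, vocab):
    """Toy bag-of-words vector (assumption: stands in for enhanced sentence vectors)."""
    counts = Counter(sentence.lower().split())
    return [float(counts[w]) for w in vocab]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def cluster(vectors, k, iters=10):
    """Small cosine-based k-means, seeded with the first k vectors (deterministic)."""
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iters):
        # Assign each sentence to the most similar centroid.
        labels = [max(range(k), key=lambda c: cosine(v, centroids[c]))
                  for v in vectors]
        # Recompute each centroid as the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def summarize(sentences, k):
    """Pick, per cluster, the sentence most similar to the rest of its cluster."""
    vocab = sorted({w for s in sentences for w in s.lower().split()})
    vectors = [embed(s, vocab) for s in sentences]
    labels = cluster(vectors, k)
    picked = []
    for c in range(k):
        idx = [i for i, lab in enumerate(labels) if lab == c]
        if not idx:
            continue
        best = max(idx, key=lambda i: sum(cosine(vectors[i], vectors[j]) for j in idx))
        picked.append(best)
    # Keep the selected sentences in their original document order.
    return [sentences[i] for i in sorted(picked)]


sentences = [
    "the fleet moved ships north",
    "the budget funds new radar",
    "ships of the fleet sailed north",
    "new radar funds approved in budget",
]
summary = summarize(sentences, k=2)
```

With these four toy sentences and `k=2`, the sketch returns one sentence about the fleet and one about the radar budget. In a realistic setting, the `embed` function would presumably be replaced by vectors from a pre-trained model enriched with knowledge-graph information, as the abstract describes.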
Authors
YAO Yi (姚奕)
YANG Fan (杨帆)
DU Xiaoming (杜晓明)
YUAN Qingbo (袁清波)
Affiliation: Department of Command and Control Engineering College, Army Engineering University of PLA, Nanjing 210007, China
Source
National Defense Technology (《国防科技》)
2022, No. 3, pp. 76-83 (8 pages)
Funding
Military Scientific Research Fund (JY2019C078).
Keywords
current intelligence
text summarization
knowledge graph
pre-trained model