Abstract
Identifying and analyzing dynamically evolving web content is one of the main ways for people to obtain useful information quickly, and it has become a key problem in urgent need of a solution. Dynamic multi-document summarization builds on temporal information: starting from the dynamic evolution of web content, it analyzes document sets on the same topic from different time periods and, by identifying the differences in their information content, models how the information evolves. Building on the classic manifold ranking approach, this paper proposes a dynamic manifold ranking model. The model incorporates not only the importance of the information but also its relationship to historical information and its temporal features, so that the summarization system itself becomes dynamic. The model was tested on the Update Task corpus of the Text Analysis Conference (TAC) 2008 and achieved good experimental results.
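For readers unfamiliar with the classic manifold ranking that the abstract builds on, the following is a minimal sketch of that baseline only. It assumes a precomputed symmetric sentence-similarity matrix `W` and a prior vector `y` that marks the topic node; the function name `manifold_rank`, the damping value `alpha=0.6`, and the toy data are illustrative assumptions, not taken from the paper. The paper's dynamic extension (historical-relation and temporal features) is not specified in this abstract and is not reproduced here.

```python
import math


def manifold_rank(W, y, alpha=0.6, tol=1e-9, max_iter=1000):
    """Classic manifold ranking: iterate f <- alpha * S f + (1 - alpha) * y.

    W: symmetric non-negative similarity matrix (list of lists).
    y: prior scores, e.g. 1.0 for the topic node and 0.0 elsewhere.
    alpha: hypothetical damping value in (0, 1); not from the paper.
    """
    n = len(W)
    # Remove self-loops so a node cannot reinforce its own score.
    W = [[0.0 if i == j else float(W[i][j]) for j in range(n)] for i in range(n)]
    # Symmetric normalization: S = D^(-1/2) W D^(-1/2).
    deg = [sum(row) for row in W]
    inv_sqrt = [1.0 / math.sqrt(d) if d > 0 else 0.0 for d in deg]
    S = [[inv_sqrt[i] * W[i][j] * inv_sqrt[j] for j in range(n)] for i in range(n)]

    f = [float(v) for v in y]
    for _ in range(max_iter):
        new_f = [
            alpha * sum(S[i][j] * f[j] for j in range(n)) + (1 - alpha) * y[i]
            for i in range(n)
        ]
        # Stop once the largest per-node change falls below the tolerance.
        if max(abs(a - b) for a, b in zip(new_f, f)) < tol:
            return new_f
        f = new_f
    return f


# Toy example (made-up similarities): node 0 is the topic description,
# nodes 1-3 are candidate sentences.
W = [
    [0.0, 0.8, 0.5, 0.1],
    [0.8, 0.0, 0.6, 0.1],
    [0.5, 0.6, 0.0, 0.2],
    [0.1, 0.1, 0.2, 0.0],
]
y = [1.0, 0.0, 0.0, 0.0]
scores = manifold_rank(W, y)
ranking = sorted(range(1, len(scores)), key=lambda i: scores[i], reverse=True)
print(ranking)
```

In this baseline, score propagates from the topic node through the similarity graph, so sentences closely tied to the topic rank higher. One plausible way to add update-style behaviour would be to adjust `y` or `W` using similarity to earlier document sets and time stamps, but that is a guess about the general approach, not a description of the authors' model.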
Source
Computer Technology and Development (《计算机技术与发展》)
2018, No. 3, pp. 26-31 (6 pages)
Funding
Fundamental Research Funds for the Central Universities (2572014CB26)
Natural Science Foundation of Heilongjiang Province (F2015037)
Keywords
dynamic multi-document summarization
dynamic evolution
difference analysis
similarity
overall centroid optimization