A Keywords Extraction Method for Public Safety Domain Texts Based on Deep Reinforcement Learning
(Original title: 基于深度强化学习的公共安全领域文本关键词抽取方法)
Abstract As government-affairs big data develops rapidly in China, making full use of the large volume of unlabeled policy and official-document text in the public safety domain, and effectively extracting the key information from it, is important for improving urban safety governance. This paper therefore proposes a keyword extraction model for public safety domain texts based on deep reinforcement learning. The model labels text content quickly in an unsupervised manner, with the aim of improving users' ability to retrieve public safety documents and events. A log-sum norm regularization term serves as the sparsity constraint in the model's loss function, guiding the policy network toward a policy that retains important words and discards unimportant ones. In addition, a training method with a variable mini-batch size is designed: setting different mini-batch sizes controls how difficult the learning task is for the policy network, which improves its generalization ability. Performance comparisons show that the model outperforms traditional unsupervised keyword extraction methods on the test-set keyword extraction task.
Authors GAO Yuxuan (高誉轩); SUN Lijuan (孙丽娟); DING Hongxin (丁洪鑫); XIONG Ziqi (熊子奇). Affiliations: Chengdu River and Lake Protection and Smart Water Service Center, Chengdu 610072, China; CETC Big Data Research Institute Co., Ltd., Guiyang 550022, China; National Engineering Research Center of Big Data Application to the Improvement of Governance Capacity, Guiyang 550022, China
Source Industrial Construction (《工业建筑》), 2024, No. 2, pp. 155-160 (6 pages)
Funding National Key Research and Development Program of China (2023YFC3806001).
Keywords deep reinforcement learning; keyword extraction; log-sum norm; public safety big data
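
The abstract names a log-sum norm term as the sparsity constraint but does not give its exact form. The sketch below illustrates one common form of the log-sum penalty, sum_i log(1 + |p_i| / eps), applied to per-word keep-probabilities. The function names, the `eps` value, and the example vectors are assumptions for illustration, not the authors' implementation. It compares two vectors with the same L1 mass to show why a log-sum term favours keeping a few words firmly over keeping many words weakly.

```python
import math


def log_sum_penalty(keep_probs, eps=0.1):
    """Log-sum sparsity penalty: sum_i log(1 + |p_i| / eps).

    `eps` is a hypothetical smoothing constant; smaller values make the
    penalty behave more like a count of non-zero entries.
    """
    return sum(math.log(1.0 + abs(p) / eps) for p in keep_probs)


def l1_penalty(keep_probs):
    """Plain L1 penalty, shown only for comparison."""
    return sum(abs(p) for p in keep_probs)


# Two hypothetical keep-probability vectors over four words.
# Both have the same L1 mass, so an L1 term cannot tell them apart.
dense = [0.5, 0.5, 0.5, 0.5]   # every word kept half-heartedly
sparse = [1.0, 1.0, 0.0, 0.0]  # two words kept, two discarded

print("L1      dense vs sparse:", l1_penalty(dense), l1_penalty(sparse))
print("log-sum dense vs sparse:", log_sum_penalty(dense), log_sum_penalty(sparse))
```

Under these assumptions the log-sum value is lower for the sparse vector, so adding such a term to a policy network's loss would push it toward discarding unimportant words outright rather than shrinking all keep-probabilities evenly.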