Topic Extraction and Evolution for Scientific Literature Based on Hierarchical Probabilistic Topic Model (基于层次概率主题模型的科技文献主题发现及演化)

Cited by: 31
Abstract: Automatically mining the topics of scientific literature and identifying how those topics change helps researchers keep up with the latest developments in their fields. Because the topics of scientific literature are diverse and highly dynamic, this paper analyzes concrete methods for topic extraction and evolution. It builds on the hierarchical probabilistic topic model hLDA, uses Gibbs sampling to estimate the model parameters, and filters topic words by mutual information in order to extract high-quality topic words. Finally, it applies pre-/post-discretized analysis to study how topics evolve over time. The experimental results confirm that the proposed topic extraction and evolution method is feasible and effective.
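Author: Wang Ping (王平)
Source: Library and Information Service (《图书情报工作》), CSSCI, PKU Core Journal; 2014, No. 22, pp. 70-77 (8 pages)
Funding: One of the research outcomes of the National Natural Science Foundation of China Young Scientists Fund project "A Credibility Evaluation Model for Microblog Topics under Multi-Factor Fusion and an Empirical Study" (Grant No. 71303179)
Keywords: topic extraction; topic evolution; hierarchical probabilistic topic model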
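
The abstract names two steps that are easy to illustrate in isolation: filtering topic words by mutual information, and post-discretized analysis of topic evolution. The sketch below is only an illustration under stated assumptions, not the paper's implementation. The abstract does not give the exact mutual-information formula, so the score used here (each word-topic pair's contribution to I(W; T), computed from token-assignment counts such as those a Gibbs sampler produces) is an assumption. The function names `mutual_information_scores`, `top_words`, and `post_discretized_trend` are hypothetical. Post-discretized analysis is taken in its usual sense: topics are fitted once on the whole corpus, and documents are grouped by time slice afterwards.

```python
import math
from collections import defaultdict


def mutual_information_scores(topic_word_counts):
    """Score each (topic, word) pair by its contribution to I(W; T).

    topic_word_counts maps topic -> {word: number of tokens of that word
    assigned to the topic}. The scoring formula is an assumption; the
    abstract does not specify the one used in the paper.
    """
    total = sum(c for words in topic_word_counts.values() for c in words.values())
    topic_totals = {t: sum(words.values()) for t, words in topic_word_counts.items()}
    word_totals = defaultdict(int)
    for words in topic_word_counts.values():
        for word, count in words.items():
            word_totals[word] += count

    scores = {}
    for topic, words in topic_word_counts.items():
        scores[topic] = {}
        for word, count in words.items():
            if count == 0:
                continue  # a zero count contributes nothing to the sum
            p_wt = count / total
            p_w = word_totals[word] / total
            p_t = topic_totals[topic] / total
            scores[topic][word] = p_wt * math.log(p_wt / (p_w * p_t))
    return scores


def top_words(scores, k):
    """Keep the k highest-scoring words per topic (ties broken alphabetically)."""
    return {
        topic: sorted(ws, key=lambda w: (-ws[w], w))[:k]
        for topic, ws in scores.items()
    }


def post_discretized_trend(doc_years, doc_topic_props):
    """Average per-document topic proportions within each year.

    Assumes the topic proportions come from one model fitted on the
    whole corpus; the time slicing happens only after fitting.
    """
    sums, counts = {}, {}
    for year, props in zip(doc_years, doc_topic_props):
        acc = sums.setdefault(year, [0.0] * len(props))
        for i, p in enumerate(props):
            acc[i] += p
        counts[year] = counts.get(year, 0) + 1
    return {year: [s / counts[year] for s in acc] for year, acc in sorted(sums.items())}
```

A word that is spread evenly across topics, such as a function word, gets a score near zero and drops out of the top list, while a word concentrated in one topic scores high there. Pre-discretized analysis would instead split the corpus by time slice first and fit a separate model per slice, which this sketch does not cover.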


