
Research of Multilingual Author-Topic Model for Profiling Researcher Interests (cited by: 5)
Abstract Against the backdrop of globalization, automatically mining latent topics from massive multilingual collections of scientific literature and accurately profiling researchers' interests is a key problem in moving information services toward knowledge services, and one of the key technologies of cross-language information retrieval. Current methods for profiling researchers' interests mostly rely on literature in a single language and therefore do not apply to multilingual datasets. Building on the author-topic model and the multilingual topic model, this paper proposes the joint author-topic (JointAT) model, which profiles author interests from multilingual datasets, and gives a Gibbs sampling method for estimating the parameters of the JointAT model. Experimental results show that the JointAT model generalizes better than the author-topic (AT) model.
Authors Li Yan; Liu Zhihui; Gao Yingfan (Institute of Scientific and Technical Information of China, Beijing 100038)
Source Journal of the China Society for Scientific and Technical Information (《情报学报》; indexed in CSSCI, CSCD, and the Peking University Core Journals list), 2020, No. 6, pp. 601-608 (8 pages)
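Funding Fundamental Research Funds for Central-Level Public Welfare Research Institutes, project "Construction of a Listed-Company Annual Report Database and Development of a Service System" (ZD2019-09); Innovation Research Fund of the Institute of Scientific and Technical Information of China, Youth Project "Methods for Identifying and Visualizing the Technology Topics of Listed Companies" (QN2019-12).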
Keywords topic model; multilingual author-topic model; research interests; Gibbs sampling
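
The abstract names the author-topic (AT) model as the baseline and Gibbs sampling as the estimation method, but this record does not reproduce the JointAT sampler itself. As a rough illustration only, the following is a minimal sketch of collapsed Gibbs sampling for the standard single-language AT model, not the paper's JointAT procedure. The function name `at_gibbs`, the hyperparameter defaults, and the toy corpus are all assumptions made for this example.

```python
import numpy as np

def at_gibbs(docs, doc_authors, n_authors, n_topics, vocab_size,
             alpha=0.5, beta=0.01, n_iter=200, seed=0):
    """Collapsed Gibbs sampler for the standard author-topic model (a sketch).

    docs        : list of documents, each a list of word ids
    doc_authors : list of author-id lists, one per document
    Returns (theta, phi): author-topic and topic-word distributions.
    """
    rng = np.random.default_rng(seed)
    at = np.zeros((n_authors, n_topics))    # author-topic counts
    tw = np.zeros((n_topics, vocab_size))   # topic-word counts
    assign = []

    # Random initialisation: each token gets one of its document's authors
    # and a random topic.
    for words, authors in zip(docs, doc_authors):
        doc_assign = []
        for w in words:
            a = authors[rng.integers(len(authors))]
            k = rng.integers(n_topics)
            at[a, k] += 1
            tw[k, w] += 1
            doc_assign.append((a, k))
        assign.append(doc_assign)

    for _ in range(n_iter):
        for d, (words, authors) in enumerate(zip(docs, doc_authors)):
            cand = np.asarray(authors)
            for i, w in enumerate(words):
                # Remove the token's current assignment from the counts.
                a, k = assign[d][i]
                at[a, k] -= 1
                tw[k, w] -= 1

                # Joint conditional over (author, topic) pairs, restricted
                # to the authors of this document.
                p_topic = (at[cand] + alpha) / (
                    at[cand].sum(axis=1, keepdims=True) + n_topics * alpha)
                p_word = (tw[:, w] + beta) / (
                    tw.sum(axis=1) + vocab_size * beta)
                p = (p_topic * p_word).ravel()
                idx = rng.choice(p.size, p=p / p.sum())

                # Decode the flat index back into (author, topic).
                a = authors[idx // n_topics]
                k = idx % n_topics
                at[a, k] += 1
                tw[k, w] += 1
                assign[d][i] = (a, k)

    theta = (at + alpha) / (at.sum(axis=1, keepdims=True) + n_topics * alpha)
    phi = (tw + beta) / (tw.sum(axis=1, keepdims=True) + vocab_size * beta)
    return theta, phi

# Toy corpus (hypothetical): two authors, four word ids, two topics.
docs = [[0, 1, 0, 1], [2, 3, 2, 3], [0, 1, 2, 3]]
doc_authors = [[0], [1], [0, 1]]
theta, phi = at_gibbs(docs, doc_authors, n_authors=2, n_topics=2,
                      vocab_size=4, n_iter=20)
```

Each row of `theta` can be read as one author's interest profile over topics. A multilingual variant would presumably keep a separate topic-word table per language while sharing the author-topic table, but that is a guess at the general idea rather than a description of the JointAT model.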
