
S&T Literature Hybrid Recommendation Algorithm Based on Generalized Semantic Similarity (cited by: 1)
Abstract: This paper studies similarity measurement for scientific and technical (S&T) literature from two perspectives, connotation and extension. First, from the connotation perspective, it builds on character-level matching of literature feature words and applies a generalization method that matches the keywords of a candidate document against the keywords of the current document as well as their parent and child keywords. Second, from the extension perspective, it takes the characteristics of S&T literature into account and introduces co-citation into the similarity measure. Finally, it defines a hybrid similarity (HS) from the keyword generalization similarity and the co-citation relatedness, and uses HS to rank and recommend candidate S&T documents. Theoretical analysis and experimental data indicate that the algorithm can, to some extent, avoid missing S&T documents whose feature words differ in characters but are semantically similar.
Source: Information Studies: Theory & Application (情报理论与实践), CSSCI, PKU Core Journal, 2013, No. 2, pp. 96-99, 103 (5 pages)
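Funding: This work is an outcome of the Ministry of Education Humanities and Social Sciences Research Youth Fund project "Research on Several Problems of S&T Literature Recommendation Systems" (Grant No. 09YJC870001) and the Ministry of Education Humanities and Social Sciences Research Planning Fund project "Research on User Privacy Protection in Enterprise Data Outsourcing Services in Cloud Computing Environments" (Grant No. 12YJA630136).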
Keywords: S&T literature; semantic relationship; similarity measurement; algorithm
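The abstract describes the method only in outline, so the following is a minimal sketch of how such a hybrid score could be assembled, not the authors' formulas. Everything concrete here is an assumption made for illustration: the `PARENT` keyword hierarchy, the partial credit of 0.5 for a parent/child match, the Jaccard-style normalization of co-citation, and the weight `alpha` that mixes the two components.

```python
from typing import Dict, Set

# Hypothetical keyword hierarchy (child -> parent); not taken from the paper.
PARENT: Dict[str, str] = {
    "latent semantic analysis": "text mining",
    "topic model": "text mining",
}


def generalized_match(a: str, b: str, partial: float = 0.5) -> float:
    """Return 1.0 for identical keywords, `partial` when one keyword is the
    direct parent or child of the other, and 0.0 otherwise.
    The partial-credit value is an assumption."""
    if a == b:
        return 1.0
    if PARENT.get(a) == b or PARENT.get(b) == a:
        return partial
    return 0.0


def keyword_similarity(current: Set[str], candidate: Set[str]) -> float:
    """Average, over the candidate's keywords, of the best generalized match
    against the current document's keywords."""
    if not current or not candidate:
        return 0.0
    best = (max(generalized_match(c, k) for k in current) for c in candidate)
    return sum(best) / len(candidate)


def cocitation_relatedness(citers_a: Set[str], citers_b: Set[str]) -> float:
    """Share of citing papers that cite both documents (Jaccard overlap).
    This normalization is an assumption."""
    union = citers_a | citers_b
    return len(citers_a & citers_b) / len(union) if union else 0.0


def hybrid_similarity(kw_sim: float, cocite: float, alpha: float = 0.5) -> float:
    """Weighted mix of the two components; `alpha` is a hypothetical weight."""
    return alpha * kw_sim + (1.0 - alpha) * cocite


# Toy example: the candidate shares one keyword exactly and one via a
# parent/child relation, so plain string matching alone would undercount it.
current_kw = {"text mining", "recommendation"}
candidate_kw = {"latent semantic analysis", "recommendation"}
kw = keyword_similarity(current_kw, candidate_kw)
cc = cocitation_relatedness({"p1", "p2", "p3"}, {"p2", "p3", "p4"})
hs = hybrid_similarity(kw, cc)
print(kw, cc, hs)
```

Candidates would then be sorted by `hs` in descending order. In this toy setup, exact string matching alone would score the keyword component lower, since "latent semantic analysis" and "text mining" share no characters; the parent/child lookup is what recovers that relation.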

