Abstract
[Objective] To remedy the shortcomings of traditional methods for mining potential cooperation relationships and to improve mining performance. [Methods] After analyzing the shortcomings of the simple calculation method, the minimum value calculation method, and the traditional TFIDF algorithm, the paper proposes an improved TFIDF algorithm and applies it to potential cooperation relationship mining. [Results] The sample data are papers on the application of information science research methods published over the past five years in 19 library and information science journals listed in the "Chinese Core Journal of Peking University Directory (2012 Edition)". On this sample, the simple calculation method and the minimum value calculation method are strongly influenced by author productivity, and the potential cooperation relationships found by the traditional TFIDF algorithm are difficult to convert into actual cooperation, whereas the improved TFIDF algorithm performs markedly better on this criterion. [Limitations] The improved TFIDF algorithm does not consider how the order of authors on a paper affects potential cooperation relationships. [Conclusions] A comparison and evaluation of the four sets of mining results shows that the improved TFIDF algorithm is more scientific, more effective, and of greater practical value than the traditional methods.
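The abstract names the baseline methods without defining them, so the following is only a minimal sketch of how two of them are commonly read in author-keyword coupling, not the paper's own formulation. It assumes that the minimum value method sums the smaller keyword frequency over the keywords two authors share, and that the traditional TFIDF approach weights each keyword by the standard tf × log(N/df) before comparing authors by cosine similarity. The `profiles` data and all function names are hypothetical, and the improved algorithm is deliberately not sketched because the abstract does not describe it.

```python
import math
from collections import Counter

# Hypothetical toy data (not from the paper): author -> keyword frequencies.
profiles = {
    "A": Counter({"citation analysis": 3, "co-word analysis": 1}),
    "B": Counter({"citation analysis": 1, "content analysis": 2}),
    "C": Counter({"co-word analysis": 2, "content analysis": 1}),
}

def min_coupling(p, q):
    """Assumed minimum-value coupling: sum of the smaller frequency per shared keyword."""
    return sum(min(p[k], q[k]) for k in p.keys() & q.keys())

def tfidf_vectors(profiles):
    """Standard TF-IDF: raw frequency times log(N / number of authors using the keyword)."""
    n = len(profiles)
    df = Counter(k for p in profiles.values() for k in p)
    return {
        author: {k: f * math.log(n / df[k]) for k, f in p.items()}
        for author, p in profiles.items()
    }

def cosine(u, v):
    """Cosine similarity between two sparse keyword-weight vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0

vectors = tfidf_vectors(profiles)
for a, b in [("A", "B"), ("A", "C"), ("B", "C")]:
    print(a, b, min_coupling(profiles[a], profiles[b]), round(cosine(vectors[a], vectors[b]), 3))
```

Under these assumptions the minimum-value score grows with raw keyword counts, which is one way to see why such a measure would be sensitive to author productivity, while the cosine over TF-IDF weights is normalized by each author's vector length.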
Source
《现代图书情报技术》
CSSCI
Peking University Core Journal
2014, No. 10, pp. 84-92 (9 pages)
New Technology of Library and Information Service
Keywords
Improved TFIDF algorithm
Potential cooperation relationship
Data mining
Coupling analysis