Abstract
Given the vast volume of scientific literature, how to assess the academic quality and credibility of papers, authors, and research institutions has attracted wide attention. Among the many criteria for credibility, authority is a key indicator that takes priority. Studying and quantitatively evaluating the scientific strength and authority of academic entities such as papers, authors, and institutions is therefore of considerable practical significance. This paper builds a heterogeneous network model from the citation, co-authorship, and cooperation relationships among three types of entities: papers, authors, and institutions. On this basis, it proposes a hybrid random walk algorithm, Co-AcademicRank, that quantitatively computes the authority of papers, authors, and institutions simultaneously, and implements a distributed version of the algorithm on MapReduce. Experiments on a dataset from Information Science and Library Science compare Co-AcademicRank with PageRank and Co-Ranking; the results support the model's effectiveness and accuracy and indicate that it outperforms both baselines. The experiments also compare the algorithm's running time in a single-machine environment and on the Hadoop platform, demonstrating the efficiency and stability of the distributed algorithm.
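To make the idea of ranking several entity types on one heterogeneous network concrete, the following is a minimal sketch, not the Co-AcademicRank algorithm itself, whose transition rules the abstract does not spell out. It assumes a single random walk with uniform restart over the union of citation, authorship, co-authorship, affiliation, and cooperation edges. The function names, the damping value 0.85, the unit edge weights, and the toy data are all illustrative assumptions.

```python
from collections import defaultdict


def heterogeneous_rank(nodes, edges, damping=0.85, iters=200, tol=1e-12):
    """Random walk with uniform restart on a typed graph (illustrative only).

    nodes: dict of node id -> entity type ("paper", "author", "institution").
    edges: iterable of directed (src, dst, weight) triples.
    Returns a dict of node id -> score; the scores sum to 1.
    """
    out = defaultdict(list)
    for src, dst, w in edges:
        out[src].append((dst, w))

    n = len(nodes)
    score = {v: 1.0 / n for v in nodes}
    for _ in range(iters):
        nxt = {v: (1.0 - damping) / n for v in nodes}  # restart mass
        dangling = 0.0
        for v in nodes:
            links = out.get(v)
            if not links:
                dangling += score[v]  # node without out-edges
                continue
            total = sum(w for _, w in links)
            for dst, w in links:
                nxt[dst] += damping * score[v] * w / total
        # Spread the mass of dangling nodes uniformly so total mass stays 1.
        for v in nodes:
            nxt[v] += damping * dangling / n
        delta = sum(abs(nxt[v] - score[v]) for v in nodes)
        score = nxt
        if delta < tol:
            break
    return score


def by_type(nodes, score, kind):
    """List the nodes of one entity type, highest score first."""
    return sorted((v for v, t in nodes.items() if t == kind),
                  key=lambda v: -score[v])


def undirected(pairs):
    """Expand symmetric relations into two directed unit-weight edges."""
    return [(a, b, 1.0) for a, b in pairs] + [(b, a, 1.0) for a, b in pairs]


# Toy data, invented for illustration.
nodes = {"p1": "paper", "p2": "paper", "p3": "paper",
         "a1": "author", "a2": "author",
         "i1": "institution", "i2": "institution"}
edges = [("p2", "p1", 1.0), ("p3", "p1", 1.0), ("p3", "p2", 1.0)]  # citations
edges += undirected([("p1", "a1"), ("p2", "a1"), ("p2", "a2"), ("p3", "a2")])  # authorship
edges += undirected([("a1", "a2")])                # co-authorship
edges += undirected([("a1", "i1"), ("a2", "i2")])  # affiliation
edges += undirected([("i1", "i2")])                # cooperation

score = heterogeneous_rank(nodes, edges)
for kind in ("paper", "author", "institution"):
    print(kind, by_type(nodes, score, kind))
```

Because every iteration only sums contributions along out-edges, the inner loop maps naturally onto a map step (emit a share of each node's score to its neighbours) and a reduce step (sum the shares per node), which is the general shape a MapReduce implementation of such a walk would take.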
Source
《情报学报》
CSSCI
PKU Core Journal (北大核心)
2014, Issue 8, pp. 872-882 (11 pages)
Journal of the China Society for Scientific and Technical Information
Funding
National Natural Science Foundation of China (71303179)
China Postdoctoral Science Foundation, Sixth Batch of Special Grants (2013T60749)