Subject Topic Analysis through Community Detection in Coauthorship Network: Taking International Statistical Journals as an Example
Abstract With the arrival of the era of big science, research collaboration has become increasingly common. To understand the current mainstream research topics and the patterns of collaboration among researchers, so that researchers can gain a better grasp of subject topics, form more efficient collaborative groups, raise research output, and advance the discipline, this paper studies subject topics in depth, taking international statistical journals as an example. We first construct a co-authorship network and analyze its basic properties, then extract its core network and analyze the structure of its connected components, and finally use the ECV method and the regularized spectral clustering algorithm to determine the number of communities in the first and second largest connected components and to partition them into communities. The results show that research collaboration in statistics is increasingly common and that the co-authorship network has a clear community structure. Combining paper information with author attributes, we obtain 29 distinct subject topics and find cross-collaboration between different communities as well as a blending of different subject topics within the same community. In addition, regarding collaboration patterns, we find that scholars who share a subject topic or a research institution are more likely to collaborate, and that scholars in the same community publish in noticeably similar journals.

Scientific researchers are an important force in promoting the development of disciplines. In the era of big science, more and more scholars tend to cooperate in research, and scientific research collaboration is becoming increasingly common. Through collaboration, scholars can complement each other's strengths and avoid duplication of research. The collaboration among researchers can be represented as network data, and the application of complex network analysis has led to significant advances in understanding complex systems across various fields. Therefore, in order to understand the current mainstream research topics and the collaboration patterns among researchers, this study collects information on 66,460 papers published in 44 statistical journals from 2001 to 2018 and builds a co-authorship network. We find that in recent years statisticians have increasingly tended to collaborate in publishing papers. In addition, nearly one-fifth of statisticians have only one collaborator in the network. Professor N. Balakrishnan of McMaster University has the largest number of collaborators and has published the most papers in the 44 statistical journals.

Many large networks have a core-periphery structure, and so does the co-authorship network. We therefore extract its core network, which contains 1,158 nodes and 15,464 edges and has 34 connected components. The largest connected component has 356 authors, accounting for 30.7% of the authors in the core network. The second largest connected component has 172 authors, accounting for 14.9% of the authors in the core network. Each of the remaining connected components has fewer than 100 authors. We then analyze the first and second largest connected components in detail.

Complex networks have three common characteristics: the small-world property, the scale-free property, and community structure. Community structure means that nodes in the network tend to aggregate into groups. Community detection is a particularly important research area, as it identifies groups of nodes that exhibit specific patterns of interaction within a network. We conduct community detection on the first and second largest connected components. Many community detection algorithms require the number of communities to be set in advance, but this number is difficult to know in a real network. Using cross-validation to select the number of communities automatically is a breakthrough in the field of community detection. This paper adopts the edge cross-validation (ECV) method to determine the number of communities and the tuning parameters, and then uses the regularized spectral clustering algorithm to detect the communities of the co-authorship network. The core co-authorship network is divided into 62 communities. Authors in the same community cooperate relatively closely, while authors belonging to different communities cooperate relatively little, although there is some cross-collaboration between different communities.

We then analyze the characteristics of the 62 communities from three perspectives based on the authors' attributes. The first is the research field. We find 29 different research fields, including biostatistics, variable selection, maximum likelihood, and many others. We specifically show the research fields of the communities in the largest connected component, whose authors cover a broad range of research topics; some communities focus on more than one field. The second is the journal. There is an obvious similarity in the journals in which authors from the same community publish their papers. We take Community 6 and Community 18 as examples for detailed analysis. Authors in Community 6, who mainly study survival analysis in biostatistics, are more inclined to publish in Bioinformatics, Biostatistics, Biometrics, and other biostatistics journals. Authors in Community 18, who mainly focus on variable selection, are more inclined to publish in top statistical journals such as Journal of the American Statistical Association, Biometrics, and Annals of Statistics. The third is the author's affiliation. The affiliations of authors in the same community are also clearly clustered. Again taking Community 6 and Community 18 as examples, we find that many of the statisticians in Community 6 are from the University of North Carolina Chapel Hill and the Fred Hutchinson Cancer Center, while Community 18 has a higher number of statisticians from the University of North Carolina and the University of North Carolina Chapel Hill.

Most existing studies of co-authorship networks of statisticians focus on papers in the four major statistical journals (Annals of Statistics, Biometrika, Journal of the American Statistical Association, and Journal of the Royal Statistical Society Series B-Statistical Methodology). By comparison, the data used in this paper cover a much broader range, and the conclusions are richer. The methods used to determine the number of communities in previous studies are more subjective. The method adopted in this paper is more objective and general, and it can be extended to collaboration networks in other fields.
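The first steps described in the abstract (building a co-authorship network, counting each author's collaborators, and measuring connected components) can be illustrated with a minimal sketch. It uses only the Python standard library and a hypothetical toy list of papers; the function names and data are assumptions made for illustration and are not taken from the paper. Core-network extraction, ECV, and regularized spectral clustering are not shown.

```python
from collections import defaultdict
from itertools import combinations


def build_coauthorship_network(papers):
    """Return {author: set of coauthors} from a list of per-paper author lists."""
    graph = defaultdict(set)
    for authors in papers:
        unique = sorted(set(authors))
        for a in unique:
            graph[a]  # register the author even if the paper is single-authored
        for a, b in combinations(unique, 2):
            graph[a].add(b)
            graph[b].add(a)
    return dict(graph)


def connected_components(graph):
    """Return connected components as sets of authors, largest first."""
    seen = set()
    components = []
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        component = set()
        while stack:  # iterative depth-first search
            node = stack.pop()
            component.add(node)
            for neighbor in graph[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        components.append(component)
    return sorted(components, key=len, reverse=True)


# Hypothetical toy data: each inner list is the author list of one paper.
papers = [
    ["A", "B"],
    ["B", "C"],
    ["A", "C", "D"],
    ["E", "F"],
    ["G"],
]

network = build_coauthorship_network(papers)
components = connected_components(network)

# Number of distinct collaborators per author (node degree).
num_collaborators = {author: len(nbrs) for author, nbrs in network.items()}

# Share of all authors that falls in each component, largest first.
shares = [len(c) / len(network) for c in components]
```

On real data the same component shares would correspond to figures such as the 30.7% and 14.9% reported for the two largest components of the core network.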
Authors Yan Zhang; Rui Pan; Kuangnan Fang (School of Economics, Xiamen University; School of Statistics and Mathematics, Central University of Finance and Economics)
Source Quarterly Journal of Economics and Management (《经济管理学刊》), 2023, No. 2, pp. 219-240 (22 pages)
Funding This research was supported by the General Program of the National Natural Science Foundation of China (11971504).
Keywords Research Collaboration; Subject Topic; Coauthorship Network; Community Detection