Abstract
[Purpose/significance] Identifying research hotspots in scientific literature helps researchers accurately grasp disciplinary development trends and frontier issues, and provides a theoretical basis for research policy and personnel training. [Method/process] This paper introduces a document-keyword two-mode network and designs a comprehensive keyword influence model that considers time factors, document citation relationships, keyword position order, keyword frequency, and the association between documents and keywords. The Node2vec network representation learning model maps the nodes of the co-occurrence network to vectors. The silhouette coefficient is then used to evaluate four clustering algorithms, including K-means and agglomerative hierarchical clustering, and to select the best one. Finally, hotspot topics are identified by combining the clustering results with the comprehensive keyword influence. [Result/conclusion] Experiments on journal literature in the field of digital humanities show that the method identifies the field's frontier hotspots well.
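The algorithm-selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `silhouette_score` and `pick_best`, the toy 2-D vectors (standing in for Node2vec embeddings), and the two candidate labelings (standing in for the outputs of different clustering algorithms) are all assumptions made for illustration. The sketch computes the mean silhouette coefficient for each candidate labeling and keeps the one with the highest score.

```python
import math


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points, using Euclidean distance."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("the silhouette coefficient needs at least two clusters")

    total = 0.0
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # a singleton cluster contributes s(i) = 0 by convention
        # a: mean distance from point i to the other members of its own cluster
        a = sum(math.dist(points[i], points[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance from point i to the members of any other cluster
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        if denom > 0:
            total += (b - a) / denom
    return total / len(points)


def pick_best(points, candidates):
    """Return the name of the labeling with the highest silhouette, plus all scores."""
    scores = {name: silhouette_score(points, labs) for name, labs in candidates.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical toy vectors standing in for Node2vec keyword embeddings.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]

# Hypothetical labelings, as two different clustering algorithms might produce.
candidates = {
    "algorithm_a": [0, 0, 0, 1, 1, 1],  # separates the two visible groups
    "algorithm_b": [0, 1, 0, 1, 0, 1],  # mixes the two groups
}

best, scores = pick_best(points, candidates)
print(best, scores)
```

In practice the candidate labelings would come from running each clustering algorithm on the learned node vectors. The comparison is only meaningful when every algorithm is scored on the same vectors with the same distance measure.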
Source
Information Studies: Theory & Application (《情报理论与实践》)
CSSCI
Peking University Core Journal (北大核心)
2022, No. 11, pp. 107-114 (8 pages)
Keywords
research hotspots
recognition method
two-mode network
Node2vec
clustering algorithm
digital humanities