Research on the Construction Methods of Altmetrics Datasets in China
Abstract [Purpose/significance] This study explores methods for constructing altmetrics datasets in China, providing a data foundation for altmetric analysis that serves domestic social needs. [Method/process] From the perspective of the data source as the mentioning subject, a dataset construction method based on entity recognition is proposed. From the perspective of academic outputs as the mentioned object, a dataset construction method based on term retrieval is proposed. Both methods are then tested empirically and compared. [Result/conclusion] The empirical results show that both the entity recognition-based and the term retrieval-based methods for constructing Chinese altmetrics datasets are feasible. In constructing the Zhihu altmetrics dataset, 72,000 posts were collected, and the academic papers they mention were identified using regular expressions and deep learning methods, with an F1 score above 80%. In constructing the WeChat altmetrics dataset, term retrieval yielded more than 170,000 WeChat mentions of 65,500 CSSCI journal articles, with a relative coverage rate of WeChat mentions of nearly 70%. A multi-perspective comparison shows that the two basic construction methods complement each other and suit different data requirements in altmetric analysis.
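The abstract reports regular-expression-based identification of mentioned papers and an F1 score, but gives no patterns or evaluation details. The following is only a minimal sketch of that idea: the two patterns (a DOI-like string and a title enclosed in 《》 marks) and the helper names `extract_candidates` and `f1_score` are illustrative assumptions, not the authors' actual rules or code, and the deep learning step is not represented at all.

```python
from __future__ import annotations

import re

# Hypothetical patterns for illustration only; the paper's real rules are not
# given in the abstract.
DOI_PATTERN = re.compile(r"\b10\.\d{4,9}/[^\s,;,;。)）]+")
TITLE_PATTERN = re.compile(r"《([^《》]+)》")


def extract_candidates(post: str) -> set[str]:
    """Return DOI-like strings and 《》-quoted titles found in a post."""
    # Strip a trailing full stop that belongs to the sentence, not the DOI.
    dois = {m.group(0).rstrip(".") for m in DOI_PATTERN.finditer(post)}
    titles = {m.group(1) for m in TITLE_PATTERN.finditer(post)}
    return dois | titles


def f1_score(predicted: set[str], gold: set[str]) -> float:
    """Harmonic mean of precision and recall over two sets of mentions."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)
```

In this sketch, a post's extracted candidates would be compared against manually labeled mentions with `f1_score`; how the authors matched candidates to actual papers is not stated in the abstract.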
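Authors Yu Houqiang (余厚强); Liang Yian (梁以安) (School of Information Management, Sun Yat-Sen University, Guangzhou, Guangdong 510006)
Source Information Studies: Theory & Application (《情报理论与实践》), PKU Core Journal, 2024, No. 8, pp. 171-179 (9 pages)
Funding An outcome of the National Natural Science Foundation of China General Program project "Research on the Data Identification Mechanisms and Key Analysis Methods of Altmetrics in China" (Grant No. 72274227) and the Ministry of Education Humanities and Social Sciences Research Planning Fund project "Research on Evaluating the Societal Impact of University Research by Integrating Altmetric Analysis" (Grant No. 22YJA870016).
Keywords altmetrics; altmetric data; identification of academic outputs; entity recognition; term retrieval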