Abstract
[Purpose/Significance] Traditional citation analysis based on citation frequency treats all citations equally and therefore cannot distinguish among different kinds of citations. Automatically classifying citation texts from different perspectives with machine learning and natural language processing can reveal the deeper citation relations among papers. [Method/Process] This paper first explores methods for the automatic classification of citation texts, building classifiers for citation function and for citation sentiment with both traditional machine learning and deep learning techniques. On this basis, citation content analysis is carried out on two corpora: 1,738 scientific papers in computer science, and 4,132 papers citing a single highly cited paper. [Result/Conclusion] Citation function and citation sentiment are correlated to some degree, and both show a clear positional distribution within citing papers; citing papers from different disciplines differ significantly in the function and sentiment of their citations to the same paper.
Authors
欧石燕
凌洪飞
Ou Shiyan; Ling Hongfei (School of Information Management, Nanjing University, Nanjing 210023)
Source
《图书情报工作》
CSSCI
PKU Core (Peking University core journal list)
2022, Issue 16, pp. 125-136 (12 pages)
Library and Information Service
Funding
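One of the research outputs of the National Social Science Fund of China Key Project "Research on Semantic Publishing of Scholarly Literature Content Based on Linked Data and Its Applications" (Grant No. 17ATQ001).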
Keywords
automatic classification of citation texts
citation content analysis
citation function
citation sentiment