
Research on Automatic Extraction of Research Method Sentences from the Full Text of Academic Papers (cited by: 15)

Methodological and Automatic Sentence Extraction from Academic Article's Full-text
Abstract Research methods are an essential part of the scientific literature: they are the methods, tools, means, or techniques used to solve problems in a discipline. Research methods are usually described at the sentence level. Collecting the research method sentences scattered across the scientific literature can help researchers quickly find suitable research methods. According to who uses the method, research method sentences are further divided into method-used sentences and method-cited sentences. A method-used sentence describes a research method used in the paper itself; a method-cited sentence describes a research method used in earlier work. This study uses several neural-network-based sentence classification models to extract research method sentences from the full text of scientific articles. In the word-vector representation layer, two word-vector models are used: BERT and word2vec. In the feature selection layer, three networks are used: a convolutional neural network (CNN), a bidirectional long short-term memory network (BiLSTM), and an attention mechanism network. In addition, two model training schemes are used: a single-level structure and a two-level structure. The experimental results show that the BERT-based BiLSTM model with the single-level structure achieves the best performance among the models compared. The study then extracts research method sentences from papers published in the Journal of the China Society for Scientific and Technical Information and analyzes their distribution. The analysis finds that the journal has increasingly emphasized the development of theory in information science and the construction of a theoretical system for the discipline.
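The abstract names a single-level and a two-level training structure but does not define them. The sketch below illustrates one plausible reading, offered as an assumption rather than the authors' implementation: a single-level model assigns one of three labels in a single step, while a two-level model first separates method sentences from other sentences and then separates method-used from method-cited sentences. The keyword cues and every function name here are hypothetical stand-ins for trained neural classifiers such as the BERT-based BiLSTM.

```python
from typing import Callable

# Labels follow the abstract: method-used, method-cited, and everything else.
USED, CITED, OTHER = "method_used", "method_cited", "non_method"

# Hypothetical keyword cues; they stand in for trained neural classifiers.
METHOD_CUES = ("method", "model", "algorithm", "classifier")
CITATION_CUES = ("et al.", "proposed by", "[")


def toy_is_method(sentence: str) -> bool:
    """Stand-in for a binary method / non-method classifier."""
    s = sentence.lower()
    return any(cue in s for cue in METHOD_CUES)


def toy_is_cited(sentence: str) -> bool:
    """Stand-in for a binary method-cited / method-used classifier."""
    s = sentence.lower()
    return any(cue in s for cue in CITATION_CUES)


def toy_three_way(sentence: str) -> str:
    """Stand-in for a single three-class classifier."""
    if not toy_is_method(sentence):
        return OTHER
    return CITED if toy_is_cited(sentence) else USED


def single_level(sentence: str, three_way: Callable[[str], str]) -> str:
    """Single-level structure (assumed): one model picks among all labels."""
    return three_way(sentence)


def two_level(
    sentence: str,
    level1: Callable[[str], bool],
    level2: Callable[[str], bool],
) -> str:
    """Two-level structure (assumed): filter method sentences, then split them."""
    if not level1(sentence):
        return OTHER
    return CITED if level2(sentence) else USED


if __name__ == "__main__":
    examples = [
        "We use a BiLSTM model to classify sentences.",
        "Kim et al. proposed a CNN classifier for this task.",
        "The journal was founded decades ago.",
    ]
    for text in examples:
        print(single_level(text, toy_three_way),
              two_level(text, toy_is_method, toy_is_cited))
```

With real models, the two structures would differ in how errors propagate: a mistake at the first level of the two-level structure cannot be corrected at the second, whereas the single-level model makes one joint decision.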
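Authors Zhang Yingyi (张颖怡); Zhang Chengzhi (章成志) (Department of Information Management, School of Economics and Management, Nanjing University of Science & Technology, Nanjing 210094)
Source Journal of the China Society for Scientific and Technical Information (《情报学报》), indexed in CSSCI, CSCD, and the Peking University Core Journals list; 2020, No. 6, pp. 640-650 (11 pages)
Funding Major Project of the National Social Science Fund of China, "Research on the Disciplinary Construction of Information Science and the Future Development Path of Information Work" (17ZDA291).
Keywords methodological sentence extraction; information extraction; deep learning; BERT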
