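Based on the WOS (Web of Science) and CNKI (China National Knowledge Infrastructure) databases, a bibliometric analysis was conducted on papers published from 1990 to 2021 in the field of biodiversity research in the Yangtze River basin. The analysis covered publication counts, citations, journal distribution, research institutions, and researchers, and used the visualization tools VOSViewer and Ucinet-Netdraw to examine institutional and author collaboration as well as research hotspots. The results show that annual publication output rose overall; Ecological Indicators and Acta Ecologica Sinica (《生态学报》) were the main publishing journals in the WOS and CNKI databases, respectively, while Global Change Biology and Acta Ecologica Sinica were the top journals in the field; the Chinese Academy of Fishery Sciences and the Chinese Academy of Sciences were the core research institutions; Liu Kai (刘凯) and Wang Ding (王丁) made the largest contributions to the field; and the research hotspots in the CNKI and WOS literature were, respectively, the genetic diversity of aquatic organisms and species genetic diversity based on microsatellite molecular markers.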
The exponential growth of literature is constraining researchers' access to comprehensive information in related fields. While natural language processing (NLP) may offer an effective solution to literature classification, it remains hindered by the lack of labelled datasets. In this article, we introduce a novel method for generating literature classification models through semi-supervised learning, which can generate labelled datasets iteratively with limited human input. We apply this method to train NLP models for classifying literature related to several research directions, i.e., battery, superconductor, topological material, and artificial intelligence (AI) in materials science. The trained NLP 'battery' model, applied to a larger dataset distinct from the training and testing datasets, achieves an F1 score of 0.738, which indicates the accuracy and reliability of this scheme. Furthermore, our approach demonstrates that even with insufficient data, the not-yet-well-trained model in the first few cycles can identify the relationships among different research fields and facilitate the discovery and understanding of interdisciplinary directions.
Funding: Funded by the Informatization Plan of Chinese Academy of Sciences (Grant No. CASWX2021SF-0102), the National Key R&D Program of China (Grant Nos. 2022YFA1603903, 2022YFA1403800, and 2021YFA0718700), the National Natural Science Foundation of China (Grant Nos. 11925408, 11921004, and 12188101), and the Chinese Academy of Sciences (Grant No. XDB33000000).
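The iterative labelling idea described in the abstract above can be illustrated with a generic self-training loop. This is only a minimal sketch under stated assumptions, not the authors' implementation: the tiny naive Bayes classifier, the function names (`train`, `predict_proba`, `self_train`, `f1_score`), the confidence `threshold`, and the toy texts are all hypothetical, and the human-review step mentioned in the abstract is replaced here by a simple confidence cutoff.

```python
from collections import Counter
import math


def train(labelled):
    """Fit a tiny multinomial naive Bayes model on (text, label) pairs; label is 0 or 1."""
    word_counts = {0: Counter(), 1: Counter()}
    class_counts = Counter()
    for text, label in labelled:
        class_counts[label] += 1
        word_counts[label].update(text.lower().split())
    vocab = set(word_counts[0]) | set(word_counts[1])
    return word_counts, class_counts, vocab


def predict_proba(model, text):
    """Return the estimated P(label == 1 | text), using Laplace smoothing."""
    word_counts, class_counts, vocab = model
    total = sum(class_counts.values())
    log_scores = {}
    for label in (0, 1):
        n_words = sum(word_counts[label].values())
        score = math.log((class_counts[label] + 1) / (total + 2))
        for word in text.lower().split():
            if word in vocab:  # words never seen in training are ignored
                score += math.log(
                    (word_counts[label][word] + 1) / (n_words + len(vocab))
                )
        log_scores[label] = score
    # Normalise the two log scores into a probability.
    top = max(log_scores.values())
    e0 = math.exp(log_scores[0] - top)
    e1 = math.exp(log_scores[1] - top)
    return e1 / (e0 + e1)


def self_train(seed, unlabelled, threshold=0.9, max_cycles=5):
    """Grow the labelled set by adding confident predictions each cycle (assumed scheme)."""
    labelled = list(seed)
    pool = list(unlabelled)
    for _ in range(max_cycles):
        model = train(labelled)
        confident, remaining = [], []
        for text in pool:
            p = predict_proba(model, text)
            if p >= threshold:
                confident.append((text, 1))
            elif p <= 1 - threshold:
                confident.append((text, 0))
            else:
                remaining.append(text)
        if not confident:  # nothing new to learn from, so stop early
            break
        labelled.extend(confident)
        pool = remaining
    return train(labelled), labelled


def f1_score(y_true, y_pred):
    """F1 = harmonic mean of precision and recall for the positive class (1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

In a setting closer to the one the abstract describes, the confident predictions would presumably be checked by a person before being added to the labelled set, and the final model would be scored with F1 on a held-out dataset.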