Funding: This paper is an interim result of “A Study on Quantitative Linguistics: Contemporary Chinese Language” (11&ZD188), a major project sponsored by the National Social Science Fund of China and implemented by Zhejiang University’s “Big Data + Language Laws and Cognition” innovation team under the auspices of the Fundamental Research Funds for the Central Universities.
Abstract: This paper presents the methodology and trends of linguistic research in the era of big data. We begin with a discussion of the role of linguists in the information society and illustrate the opportunities and challenges linguists currently face. After highlighting the significance of authentic data for linguistic research, we argue that language is a complex adaptive system driven by humans. Then, from the perspective of the philosophy of science, we introduce the research paradigm of quantitative linguistics through several cases. Finally, we discuss how China’s linguistic research will benefit from the data-intensive approach in terms of scientific rigor and internationalization.
Abstract: In the era of big data, national strategies and the rapid development of computing and storage technologies bring opportunities and challenges to library data services. Based on a survey of the literature on scientific data services in university libraries in the United States, the development of these services is analyzed from three aspects: service types, service modes, and service contents. The author also proposes ways to respond to these opportunities and challenges from five aspects (policy support, stronger publicity, self-learning, self-positioning, and reliance on embedded subject librarians) in order to promote the development of library scientific data services.
Funding: Supported by an Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (2020-0-01592); Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education (2019R1F1A1058548).
Abstract: The growing collection of scientific data in various web repositories is referred to as Scientific Big Data, as it fulfills the four “V’s” of Big Data: volume, variety, velocity, and veracity. This phenomenon has created new opportunities for startups; for instance, extracting pertinent research papers from enormous knowledge repositories with innovative methods has become an important task for researchers and entrepreneurs. Traditionally, the content of papers is compared to list the relevant papers in a repository. This conventional method produces a long list of papers that is often impossible to interpret productively, so there is a pressing need for a novel approach that uses the available data intelligently. The primary element of the scientific knowledge base is the research article, which consists of logical sections such as the Abstract, Introduction, Related Work, Methodology, Results, and Conclusion. This study uses these logical sections because they hold significant potential for finding relevant papers. Comprehensive experiments were performed to determine the role of a logical-section-based term indexing method in improving the quality of results (i.e., retrieving relevant papers). To that end, we proposed, implemented, and evaluated a logical-section-based content comparison method against a standard term-indexing method. The section-based approach outperformed the standard content-based approach in identifying relevant documents across all classified topics of computer science. Overall, the proposed approach extracted 14% more relevant results from the entire dataset. Because the experimental results suggest that a finer-grained content similarity technique improves the quality of results, the proposed approach lays a foundation for knowledge-based startups.
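The contrast between whole-content and section-based comparison can be illustrated with a minimal sketch. It assumes plain term-frequency vectors, cosine similarity, and equal weights per section; the abstract states none of these choices, and the names `SECTIONS`, `content_similarity`, and `section_similarity` are hypothetical, not taken from the paper.

```python
import math
import re
from collections import Counter

# Hypothetical subset of the logical sections named in the abstract.
SECTIONS = ("abstract", "introduction", "methodology", "results")


def term_counts(text):
    """Lowercase the text and count alphabetic terms."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def content_similarity(p, q):
    """Baseline: treat each paper as one bag of terms over all its text."""
    return cosine(term_counts(" ".join(p.values())),
                  term_counts(" ".join(q.values())))


def section_similarity(p, q, sections=SECTIONS):
    """Compare like with like: score each section pair, then average.

    Equal section weights are an assumption made for this sketch.
    """
    scores = [cosine(term_counts(p.get(s, "")), term_counts(q.get(s, "")))
              for s in sections]
    return sum(scores) / len(scores)


# Two toy papers that share every term, but in different sections.
query = {"abstract": "image segmentation network",
         "methodology": "survey questionnaire"}
candidate = {"abstract": "survey questionnaire",
             "methodology": "image segmentation network"}

whole = content_similarity(query, candidate)
by_section = section_similarity(query, candidate)
```

In this toy case the whole-content score is close to 1 because both papers use the same vocabulary, while the section-based score is 0 because no section shares a term with its counterpart. That is the kind of false match a section-aware comparison is meant to filter out.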
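Abstract: [Purpose/Significance] This paper summarizes and reviews the research hotspots and evolutionary trends of personal information protection in China and abroad in the era of big data, aiming to provide reference and insight for research in related fields. [Method/Process] Using bibliometrics and scientific knowledge mapping on the CNKI and Web of Science databases, with ITGInsight as the main tool, supplemented by scientometric and knowledge-network analysis software such as Gephi, Excel, and SATI, the paper analyzes the hotspot distribution, topic evolution, and research content of personal information protection research in China and abroad in the big data field. [Result/Conclusion] In the era of big data, research topics related to personal information protection in China and abroad are widely distributed, follow relatively complex evolutionary patterns, and show marked trends of change. Future research needs to consider technology, law, policy, and other factors together in order to build a more comprehensive and systematic personal information protection system.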
Funding: Supported in part by the National Key Research and Development Program under Grant [2017YFB0504201]; the National Natural Science Foundation of China under Grant Nos. [42071316, 61473286 and 401201460]; the Open Fund of State Key Laboratory of Remote Sensing Science under Grant No. [OFSLRSS201919]; the Fundamental Research Funds for the Central Universities under Grant No. [B200202008].
Abstract: In recent years, the rapid development of Earth observation technology has produced rapid growth in remote sensing big data, posing serious challenges for effective and efficient processing and analysis. Meanwhile, there has been a massive rise in deep-learning-based algorithms for remote sensing tasks, providing a major opportunity for remote sensing big data. In this article, we first summarize the features of remote sensing big data. Then, following the pipeline of remote sensing tasks, we give a detailed technical review of how deep learning has been applied to the processing and analysis of remote sensing data, including geometric and radiometric processing, cloud masking, data fusion, object detection and extraction, land-use/cover classification, change detection, and multitemporal analysis. Finally, we discuss technical challenges and outline directions for future research in deep-learning-based applications for remote sensing big data.