In bioinformatics applications, the examination of microarray data has received significant interest for diagnosing diseases. Microarray gene expression data are characterized by a massive search space, which poses a primary challenge in selecting appropriate genes. Microarray data classification incorporates multiple disciplines such as bioinformatics, machine learning (ML), data science, and pattern classification. This paper designs an optimal deep neural network based microarray gene expression classification (ODNN-MGEC) model for bioinformatics applications. The proposed ODNN-MGEC technique first normalizes the data to a uniform scale. An improved fruit fly optimization (IFFO) based feature selection technique is then used to reduce the high dimensionality of the biomedical data. A deep neural network (DNN) model is applied to classify the microarray gene expression data, and the hyperparameters of the DNN model are tuned using the Symbiotic Organisms Search (SOS) algorithm. The use of the IFFO and SOS algorithms paves the way for maximum gene expression classification performance. To examine the improved outcomes of the ODNN-MGEC technique, a wide-ranging experimental analysis is carried out on benchmark datasets. An extensive comparison with recent approaches demonstrates the enhanced outcomes of the ODNN-MGEC technique in terms of different measures.
In this editorial, we comment on the current development and deployment of data science in intensive care units (ICUs). Data in ICUs can be classified into qualitative and quantitative data, with different technologies needed to translate and interpret them. Data science, in the form of artificial intelligence (AI), should find the right interaction between physicians, data, and algorithms. For individual patients and physicians, sepsis and mechanical ventilation are two important areas where AI has been extensively studied. However, major risks of bias, lack of generalizability, and poor clinical value remain. AI deployment in ICUs should be emphasized more to facilitate AI development. For ICU management, AI has huge potential to transform resource allocation. The coronavirus disease 2019 pandemic has provided opportunities to establish such systems, which should be investigated further. Ethical concerns must be addressed when designing such AI.
The increasing dependence on data highlights the need for a detailed understanding of its behavior, including the challenges involved in processing and evaluating it. However, current research lacks a comprehensive structure for measuring the worth of data elements, which hinders effective navigation of the changing digital environment. This paper aims to fill this research gap by introducing the concept of "data components." It proposes a graph-theoretic representation model that gives a clear mathematical definition and demonstrates the superiority of data components over traditional processing methods. Additionally, the paper introduces an information measurement model that provides a way to calculate the information entropy of data components and establish their increased informational value. The paper also assesses the value of information, suggesting a pricing mechanism based on its significance. In conclusion, this paper establishes a robust framework for understanding and quantifying the value of implicit information in data, laying the groundwork for future research and practical applications.
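As a rough illustration of what "information entropy" of a collection of data values means, the sketch below computes the standard Shannon entropy of the empirical distribution of a list of values. It is not the paper's information measurement model, which the abstract does not specify; the function name `shannon_entropy` and the sample values are assumptions made purely for illustration.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of `values`.

    Hypothetical helper for illustration only; the abstract does not
    describe how the paper itself measures entropy.
    """
    counts = Counter(values)
    total = len(values)
    # H = -sum(p * log2(p)) over the observed relative frequencies p.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Two equally likely symbols carry one bit per observation;
# a constant column carries none.
balanced = shannon_entropy(["a", "b", "a", "b"])
constant = shannon_entropy(["a", "a", "a", "a"])
```

Under this generic measure, a data element whose values vary more evenly carries more information per observation than one that is nearly constant.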
To research and analyze the construction of application-oriented undergraduate data science and big data technology courses, the professional development characteristics of universities and enterprises should be taken into consideration, the development trend of the big data industry should be scrutinized, and professional application-oriented talents should be cultivated in line with job requirements. This paper expounds the demand for capacity-building professional development in application-oriented undergraduate data science and big data technology courses, analyzes the current situation of professional development, and puts forward strategies in the hope of providing a reference for capacity-building professional development.
With the ongoing advancements in sensor networks and data acquisition technologies across systems such as manufacturing, aviation, and healthcare, data-driven vibration control (DDVC) has attracted broad interest from both the industrial and academic communities. Input shaping (IS), a simple and effective feedforward method, is in great demand in DDVC methods. It convolves the desired input command with an impulse sequence without requiring parametric dynamics or the closed-loop system structure, thereby suppressing the residual vibration separately. Based on a thorough investigation of state-of-the-art DDVC methods, this survey makes the following efforts: 1) introducing the IS theory and typical input shapers; 2) categorizing recent progress in DDVC methods; 3) summarizing commonly adopted metrics for DDVC; and 4) discussing the engineering applications and future trends of DDVC. In doing so, this study provides a systematic and comprehensive overview of existing DDVC methods from design to optimization perspectives, aiming to promote future research on this emerging and vital issue.
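The convolution at the heart of input shaping can be sketched with the classic two-impulse Zero Vibration (ZV) shaper. This is a textbook shaper, not a method taken from the survey, and it assumes that one vibration mode's natural frequency and damping ratio have already been estimated (for example, from measured response data). The function names, the sample time, and the numeric values are illustrative assumptions.

```python
import numpy as np


def zv_shaper(omega_n, zeta, dt):
    """Discrete impulse sequence of a two-impulse ZV shaper.

    omega_n: estimated natural frequency in rad/s (assumed known)
    zeta:    estimated damping ratio, 0 <= zeta < 1 (assumed known)
    dt:      sample time of the command signal in seconds
    """
    omega_d = omega_n * np.sqrt(1.0 - zeta**2)           # damped frequency
    k = np.exp(-zeta * np.pi / np.sqrt(1.0 - zeta**2))
    a1, a2 = 1.0 / (1.0 + k), k / (1.0 + k)              # amplitudes sum to 1
    delay_steps = int(round(np.pi / omega_d / dt))       # half damped period
    impulses = np.zeros(delay_steps + 1)
    impulses[0] = a1
    impulses[-1] += a2
    return impulses


def shape_command(command, impulses):
    """Convolve the desired command with the shaper's impulse sequence."""
    return np.convolve(command, impulses)[: len(command)]


# Illustrative values: a 1 Hz lightly damped mode and a unit step command.
step = np.ones(2000)
shaper = zv_shaper(omega_n=2.0 * np.pi, zeta=0.05, dt=0.001)
shaped = shape_command(step, shaper)
```

The shaped command reaches the same final value as the original step but arrives in two stages, the second delayed by half a damped period. For the modeled mode, this timing is what cancels the residual vibration.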
I provide some science and reflections from my experiences working in geophysics, along with connections to computational and data sciences, including recent developments in machine learning. I highlight several individuals and groups who have influenced me, both through direct collaborations and through ideas and insights that I have learned from. While my reflections are rooted in geophysics, they should also be relevant to other computational scientific and engineering fields. I also provide some thoughts for young applied scientists and engineers.
In the big data era, the education of big data majors is undergoing profound teaching reform and innovation. With the increasing role of big data technology in analysis and decision-making, updating and expanding the teaching content of big data majors has become particularly important. Modern enterprises have put forward new and higher demands for big data talent, which include not only traditional data analysis skills but also knowledge of data visualization and information technology. To address these challenges, big data education needs to reform and innovate in the development and use of teaching content, methods, and resources. This paper proposes teaching models and reform methods for big data majors and analyzes the corresponding teaching reforms and innovations needed to meet the requirements of the new development of big data majors. The traditional classroom teaching method is no longer sufficient to meet students' learning needs, and more dynamic and interactive teaching methods, such as case studies, flipped classrooms, and project-based learning, are becoming increasingly essential. These teaching methods can more effectively cultivate students' practical skills and independent thinking while allowing them to learn advanced knowledge in a real big data environment. In addition, the paper discusses the construction of big data processing and analysis platforms, as well as innovative teaching management and evaluation systems to improve teaching quality.
Big data has had significant impacts on our lives, economies, academia, and industries over the past decade. The current questions are: What is the future of big data? What era do we live in? This article addresses these questions by looking at meta as an operation and argues that we are living in the era of big intelligence, through an analysis that proceeds from meta(big data) to big intelligence. More specifically, this article analyzes big data from an evolutionary perspective. The article gives an overview of data, information, knowledge, and intelligence (DIKI) and reveals their relationships. After analyzing meta as an operation, this article explores Meta(DIKI) and its relationship. It reveals 5 Bigs consisting of big data, big information, big knowledge, big intelligence, and big analytics. Applying meta to the 5 Bigs, this article infers that Big Data 4.0 = meta^4(big data) = big intelligence. This article analyzes how intelligent big analytics support big intelligence. The proposed approach might facilitate the research and development of big data, big data analytics, business intelligence, artificial intelligence, and data science.
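After in-depth discussion and repeated deliberation by experts in China and abroad, with the consent of its supervising and sponsoring organizations, and with formal approval from the State Administration of Press, Publication, Radio, Film and Television (approval document 新广出审[2015]1187号), the Chinese Journal of Library and Information Science (《中国文献情报(英)》, CJLIS), sponsored by the National Science Library, Chinese Academy of Sciences, will be formally renamed the Journal of Data and Information Science (《数据与情报科学学报(英)》, JDIS) starting in 2016. As the only English-language academic journal in library and information science in China, CJLIS has, since its founding in 2008, made it its mission to publish high-quality research papers that meet international standards and to advance the discipline of library and information science in China. By soliciting excellent manuscripts, upholding academic standards, promoting open access, and strictly controlling the review process, it has won full recognition from the field and wide…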
Funding: Supported by the Ministry of Science and Technology "National Science and Technology Platform Program" (2005DKA31800).
Abstract: Scientific data are very important resources for innovative research in all scientific disciplines. The Ministry of Science and Technology (MOST) of China has launched a comprehensive platform program to support scientific innovation, and the agricultural science database construction and sharing project is one of the activities under this MOST-supported program. This paper briefly describes the achievements of the Agricultural Science Data Center Project.
Funding: This work was supported by the National Science Foundation [grant numbers 1526520 to AK and 0711456 to PL].
Abstract: The availability and quantity of remotely sensed and terrestrial geospatial data sets are on the rise. Historically, these data sets have been analyzed and queried on 2D desktop computers; however, immersive technologies, and specifically immersive virtual reality (iVR), allow for the integration, visualization, analysis, and exploration of these 3D geospatial data sets. iVR can deliver remote and large-scale geospatial data sets to the laboratory, providing embodied experiences of field sites across the earth and beyond. We describe a workflow for the ingestion of geospatial data sets and the development of an iVR workbench, and present their application to an experience of Iceland's Thrihnukar volcano, where we: (1) combined satellite imagery with terrain elevation data to create a basic reconstruction of the physical site; (2) used terrestrial LiDAR data to provide a georeferenced point cloud model of the magmatic-volcanic system, as well as the LiDAR intensity values for the identification of rock types; and (3) used Structure-from-Motion (SfM) to construct a photorealistic point cloud of the volcano's interior. The workbench provides tools for the direct manipulation of the georeferenced data sets, including scaling, rotation, and translation, and a suite of geometric measurement tools, including length, area, and volume. Future developments will be informed by an ongoing user study that formally evaluates the workbench's mature components in the context of fieldwork and analysis activities.
Abstract: Due to the recent explosion of big data, our society has been rapidly going through digital transformation and entering a new world with numerous eye-opening developments. These new trends impact society and future jobs, and thus student careers. At the heart of this digital transformation is data science, the discipline that makes sense of big data. With many rapidly emerging digital challenges ahead of us, this article discusses perspectives on iSchools' opportunities and suggestions in data science education. We argue that iSchools should empower their students with "information computing" disciplines, which we define as the ability to solve problems and create values, information, and knowledge using tools in application domains. As specific approaches to enforcing information computing disciplines in data science education, we suggest the three foci of user-based, tool-based, and application-based. These three foci will serve to differentiate the data science education of iSchools from that of computer science or business schools. We present a layered Data Science Education Framework (DSEF) with building blocks that include the three pillars of data science (people, technology, and data), computational thinking, data-driven paradigms, and data science lifecycles. Data science courses built on top of this framework should thus be executed with user-based, tool-based, and application-based approaches. This framework will help our students think about data science problems from a big-picture perspective and foster appropriate problem-solving skills in conjunction with broad perspectives of data science lifecycles. We hope the DSEF discussed in this article will help fellow iSchools in their design of new data science curricula.
Abstract: Purpose: The purpose of the paper is to provide a framework for addressing the disconnect between metadata and data science. Data science cannot progress without metadata research. This paper takes steps toward advancing the synergy between metadata and data science, and identifies pathways for developing a more cohesive metadata research agenda in data science. Design/methodology/approach: This paper identifies factors that challenge metadata research in the digital ecosystem, defines metadata and data science, and presents the concepts of big metadata, smart metadata, and metadata capital as part of a metadata lingua franca connecting to data science. Findings: The "utilitarian nature" and "historical and traditional views" of metadata are identified as two intersecting factors that have inhibited metadata research. Big metadata, smart metadata, and metadata capital are presented as part of a metadata lingua franca to help frame research in the data science research space. Research limitations: There are additional, intersecting factors to consider that likely inhibit metadata research, and other significant metadata concepts to explore. Practical implications: The immediate contribution of this work is that it may elicit response, critique, revision, or, more significantly, motivate research. The work presented can encourage more researchers to consider the significance of metadata as a research-worthy topic within data science and the larger digital ecosystem. Originality/value: Although metadata research has not kept pace with other data science topics, there is little attention directed to this problem. This is surprising, given that metadata is essential for data science endeavors. This examination synthesizes original and prior scholarship to provide new grounding for metadata research in data science.
Funding: Project supported by the National Key R&D Program of China (Grant No. 2016YFB0700503), the National High Technology Research and Development Program of China (Grant No. 2015AA03420), the Beijing Municipal Science and Technology Project, China (Grant No. D161100002416001), the National Natural Science Foundation of China (Grant No. 51172018), and Kennametal Inc.
Abstract: Since its launch in 2011, the Materials Genome Initiative (MGI) has drawn the attention of researchers from academia, government, and industry worldwide. As one of the three tools of the MGI, the use of materials data has, for the first time, emerged as an extremely significant approach in materials discovery. Data science has been applied in different disciplines as an interdisciplinary field to extract knowledge from data. The concept of materials data science has been used to demonstrate its application in materials science. To explore its potential as an active research branch in the big data era, a three-tier system has been put forward to define the infrastructure for the classification, curation, and knowledge extraction of materials data.
Funding: The authors extend their appreciation to the Deanship of Scientific Research at King Khalid University for funding this work under Grant Number (RGP 1/147/42), www.kku.edu.sa. This research was funded by the Deanship of Scientific Research at Princess Nourah bint Abdulrahman University through the Fast-Track Path of Research Funding Program.
Abstract: In the present digital era, data science exploits artificial intelligence (AI) techniques to help those who start and run small and medium-sized enterprises (SMEs) have an impact and develop their businesses. Data science integrates the conventions of econometrics with the technological elements of data science. It makes use of machine learning (ML) and predictive and prescriptive analytics to effectively understand financial data and solve related problems. Smart technologies allow SMEs to get smarter with their processes and operate efficiently. At the same time, an effective tool is needed to help small and medium-sized enterprises forecast business failure as well as financial crisis. AI has become a familiar tool for many businesses because it concentrates on the design of intelligent decision-making tools to solve particular real-time problems. With this motivation, this paper presents a new AI-based optimal functional link neural network (FLNN) based financial crisis prediction (FCP) model for SMEs. The proposed model involves preprocessing, feature selection, classification, and parameter tuning. At the initial stage, the financial data of the enterprises are collected and preprocessed to enhance the quality of the data. Next, a novel chaotic grasshopper optimization algorithm (CGOA) based feature selection technique is applied for the optimal selection of features. A functional link neural network (FLNN) model is then employed to classify the feature-reduced data. Finally, the efficiency of the FLNN model is improved by the use of the cat swarm optimizer (CSO) algorithm. A detailed experimental validation is carried out on the Polish dataset to verify the performance of the presented model. The experimental studies demonstrated that the CGOA-FLNN-CSO model achieved maximum prediction accuracies of 98.830%, 92.100%, and 95.220% on the applied Polish dataset Year I-III, respectively.
Abstract: There has long been discussion about the distinctions between library science, information science, and informatics, and about how these areas differ from and overlap with computer science. Today the term data science is emerging, generating excitement and questions about how it relates to and differs from these other areas of study.
Abstract: Introduction. Within the field of scientometrics, which involves quantitative studies of science, the citation analysis specialism counts citations between academic papers in order to help evaluate the impact of the cited work (Moed, 2006).
Abstract: This paper reviews literature pertaining to the development of data science as a discipline, current issues with data bias and ethics, and the role that the discipline of information science may play in addressing these concerns. Information science research and researchers have much to offer data science, owing to their background as transdisciplinary scholars who apply human-centered and social-behavioral perspectives to issues within natural science disciplines. Information science researchers have already contributed to a humanistic approach to data ethics within the literature, and an emphasis on data science within information schools all but ensures that this literature will continue to grow in coming decades. This review article serves as a reference for the history, current progress, and potential future directions of data ethics research within the corpus of information science literature.
Funding: Supported by Kyungpook National University Research Fund, 2020.
Abstract: The rise or fall of the stock markets directly affects investors’ interest and loyalty. Therefore, it is necessary to measure the performance of stocks in the market in advance to protect assets from significant losses. In the proposed study, six supervised machine learning (ML) strategies and deep learning (DL) models with long short-term memory (LSTM) were deployed for thorough analysis and measurement of the performance of technology stocks. Under discussion are Apple Inc. (AAPL), Microsoft Corporation (MSFT), Broadcom Inc., Taiwan Semiconductor Manufacturing Company Limited (TSM), NVIDIA Corporation (NVDA), and Avigilon Corporation (AVGO). The datasets were taken from the Yahoo Finance API from 06-05-2005 to 06-05-2022 (seventeen years) with 4280 samples. As already noted, multiple studies have addressed this problem using linear regression, support vector machines, deep long short-term memory (LSTM), and many other models. In this research, the Hidden Markov Model (HMM) outperformed the other employed machine learning ensembles, tree-based models, the ARIMA (Auto Regressive Integrated Moving Average) model, and long short-term memory, with a robust mean accuracy score of 99.98. Other statistical analyses and measurements for machine learning ensemble algorithms, the Long Short-Term Model, and ARIMA were also carried out to further investigate the performance of advanced models for forecasting time series data. Thus, the proposed research found HMM to be the best model, with LSTM the second-best model that performed well in all aspects. The developed model is recommended for early measurement of technology stock performance, supporting investment or withdrawal decisions based on the future rise or fall of stocks in smart environments.
Funding: The authors extend their appreciation to the Deanship of Scientific Research at King Khalid University for funding this work under grant number (RGP 2/42/43). This work was supported by Taif University Researchers Supporting Program (project number: TURSP-2020/200), Taif University, Saudi Arabia.
Abstract: In bioinformatics applications, the examination of microarray data has received significant interest for diagnosing diseases. Microarray gene expression data are characterized by a massive search space, which poses a primary challenge for the appropriate selection of genes. Microarray data classification incorporates multiple disciplines such as bioinformatics, machine learning (ML), data science, and pattern classification. This paper designs an optimal deep neural network based microarray gene expression classification (ODNN-MGEC) model for bioinformatics applications. The proposed ODNN-MGEC technique first normalizes the data to a uniform scale. Next, an improved fruit fly optimization (IFFO) based feature selection technique is used to reduce the high dimensionality of the biomedical data. A deep neural network (DNN) model is then applied to classify the microarray gene expression data, and the hyperparameters of the DNN model are tuned using the Symbiotic Organisms Search (SOS) algorithm. The use of the IFFO and SOS algorithms paves the way for maximum gene expression classification outcomes. To examine the improved outcomes of the ODNN-MGEC technique, a wide-ranging experimental analysis is carried out on benchmark datasets. The extensive comparison with recent approaches demonstrates the enhanced outcomes of the ODNN-MGEC technique in terms of different measures.
Abstract: In this editorial, we comment on the current development and deployment of data science in intensive care units (ICUs). Data in ICUs can be classified into qualitative and quantitative data, with different technologies needed to translate and interpret them. Data science, in the form of artificial intelligence (AI), should find the right interaction between physicians, data, and algorithms. For individual patients and physicians, sepsis and mechanical ventilation have been two important areas where AI has been extensively studied. However, major risks of bias, lack of generalizability, and poor clinical value remain. AI deployment in ICUs should be emphasized more to facilitate AI development. For ICU management, AI has huge potential to transform resource allocation. The coronavirus disease 2019 pandemic has given opportunities to establish such systems, which should be investigated further. Ethical concerns must be addressed when designing such AI.
Funding: Supported by the EU H2020 Research and Innovation Program under the Marie Sklodowska-Curie Grant Agreement (Project-DEEP, Grant number: 101109045); the National Key R&D Program of China with Grant number 2018YFB1800804; the National Natural Science Foundation of China (Nos. NSFC 61925105 and 62171257); the Tsinghua University-China Mobile Communications Group Co., Ltd. Joint Institute; and the Fundamental Research Funds for the Central Universities, China (No. FRF-NP-20-03).
Abstract: The increasing dependence on data highlights the need for a detailed understanding of its behavior, encompassing the challenges involved in processing and evaluating it. However, current research lacks a comprehensive structure for measuring the worth of data elements, hindering effective navigation of the changing digital environment. This paper aims to fill this research gap by introducing the innovative concept of “data components.” It proposes a graph-theoretic representation model that presents a clear mathematical definition and demonstrates the superiority of data components over traditional processing methods. Additionally, the paper introduces an information measurement model that provides a way to calculate the information entropy of data components and establish their increased informational value. The paper also assesses the value of information, suggesting a pricing mechanism based on its significance. In conclusion, this paper establishes a robust framework for understanding and quantifying the value of implicit information in data, laying the groundwork for future research and practical applications.
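As a point of reference for the entropy calculation mentioned above, here is a minimal sketch of the standard Shannon entropy of an empirical distribution. It is not the paper's measurement model: the abstract does not define how data components are represented, so treating a component as a plain list of observed values, and the function name `shannon_entropy`, are assumptions made for illustration.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy, in bits, of the empirical distribution of `values`.

    H = -sum(p_i * log2(p_i)), where p_i is the relative frequency of each
    distinct value. An empty input yields 0.
    """
    counts = Counter(values)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical data component: a column of categorical observations.
uniform_bits = shannon_entropy(["a", "b", "a", "b"])   # two equally likely values
constant_bits = shannon_entropy(["a", "a", "a", "a"])  # a single repeated value
```

A component whose values are all identical carries no information under this measure, while one with evenly spread values carries the most, which is the intuition behind pricing data by its informational value.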
Abstract: To conduct research and analysis on the construction of application-oriented undergraduate data science and big data technology courses, the professional development characteristics of universities and enterprises should be taken into consideration, the development trend of the big data industry should be scrutinized, and professional application-oriented talents should be cultivated in line with job requirements. This paper expounds the demand for capacity-building professional development in application-oriented undergraduate data science and big data technology courses, analyzes the current situation of professional development, and puts forward strategies in the hope of providing a reference for capacity-building professional development.
Funding: Supported by the National Natural Science Foundation of China (62272078).
Abstract: With the ongoing advancements in sensor networks and data acquisition technologies across various systems such as manufacturing, aviation, and healthcare, data-driven vibration control (DDVC) has attracted broad interest from both the industrial and academic communities. Input shaping (IS), a simple and effective feedforward method, is in great demand in DDVC methods. It convolves the desired input command with an impulse sequence without requiring parametric dynamics or the closed-loop system structure, thereby suppressing the residual vibration separately. Based on a thorough investigation of state-of-the-art DDVC methods, this survey makes the following contributions: 1) introducing the IS theory and typical input shapers; 2) categorizing recent progress in DDVC methods; 3) summarizing commonly adopted metrics for DDVC; and 4) discussing the engineering applications and future trends of DDVC. By doing so, this study provides a systematic and comprehensive overview of existing DDVC methods from design to optimization perspectives, aiming to promote future research on this emerging and vital issue.
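The convolution at the heart of input shaping can be sketched with the classical two-impulse zero-vibration (ZV) shaper, one of the typical input shapers in the IS literature. The sketch assumes that an estimate of the natural frequency `omega_n` and damping ratio `zeta` of the vibratory mode is available, for instance identified from measured data; the function names and the sampled step command are hypothetical and not taken from the survey.

```python
import math


def zv_shaper(omega_n, zeta):
    """Return the ZV shaper as a list of (time, amplitude) impulses.

    K = exp(-zeta*pi / sqrt(1 - zeta^2)); the impulses sit at t = 0 and at half
    the damped period, with amplitudes 1/(1+K) and K/(1+K), which sum to 1.
    """
    root = math.sqrt(1.0 - zeta ** 2)
    omega_d = omega_n * root              # damped natural frequency
    k = math.exp(-zeta * math.pi / root)
    return [(0.0, 1.0 / (1.0 + k)), (math.pi / omega_d, k / (1.0 + k))]


def shape_command(command, dt, impulses):
    """Discrete convolution of a sampled command with the impulse sequence."""
    max_shift = max(round(t / dt) for t, _ in impulses)
    shaped = [0.0] * (len(command) + max_shift)
    for t, amplitude in impulses:
        shift = round(t / dt)             # impulse time in whole samples
        for i, u in enumerate(command):
            shaped[i + shift] += amplitude * u
    return shaped


# Hypothetical undamped mode with omega_n = pi rad/s, sampled every 0.5 s.
impulses = zv_shaper(omega_n=math.pi, zeta=0.0)
shaped = shape_command([1.0, 1.0, 1.0, 1.0], dt=0.5, impulses=impulses)
```

Because the command here is a finite sample window, the full convolution also shows the trailing ramp-down; for a held step command only the staircase at the start matters. The shaped command reaches its final value half a damped period later than the original, which is the delay paid for cancelling the residual vibration.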
Abstract: I provide some science and reflections from my experiences working in geophysics, along with connections to computational and data sciences, including recent developments in machine learning. I highlight several individuals and groups who have influenced me, both through direct collaborations and through ideas and insights that I have learned from them. While my reflections are rooted in geophysics, they should also be relevant to other computational scientific and engineering fields. I also provide some thoughts for young applied scientists and engineers.
Funding: Teaching Reform Project of Beijing Union University “Exploration of Teaching Reform of Big Data Analysis and Visualization Course under the Background of New Engineering” (JJ2024Y025).
Abstract: In the big data era, the education of big data majors is undergoing profound teaching reform and innovation. With the increasing role of big data technology in analysis and decision-making, updating and expanding the teaching content of big data majors has become particularly important. Modern enterprises have put forward new and higher demands for big data talents, which include not only traditional data analysis skills but also knowledge of data visualization and information technology. To address these challenges, big data education needs to reform and innovate in the development and utilization of teaching content, methods, and resources. This paper proposes teaching models and reform methods for big data majors and analyzes corresponding teaching reforms and innovations to meet the requirements of the new development of big data majors. The traditional classroom teaching method is no longer sufficient to meet the learning needs of students, and more dynamic and interactive teaching methods, such as case studies, flipped classrooms, and project-based learning, are becoming increasingly essential. These innovative teaching methods can more effectively cultivate students’ practical skills and independent thinking while allowing them to better learn advanced knowledge in a real big data environment. In addition, the paper discusses the construction of big data processing and analysis platforms, as well as innovative teaching management and evaluation systems to improve teaching quality.
Funding: This research is supported partially by the Papua New Guinea Science and Technology Secretariat (PNGSTS) under the project grant No. 1-3962 PNGSTS.
Abstract: Big data has had significant impacts on our lives, economies, academia, and industries over the past decade. The current questions are: What is the future of big data? What era do we live in? This article addresses these questions by looking at meta as an operation and argues that we are living in the era of big intelligence through analyzing from meta(big data) to big intelligence. More specifically, this article analyzes big data from an evolutionary perspective. The article overviews data, information, knowledge, and intelligence (DIKI) and reveals their relationships. After analyzing meta as an operation, this article explores Meta(DIKI) and its relationship. It reveals 5 Bigs consisting of big data, big information, big knowledge, big intelligence, and big analytics. Applying meta on 5 Bigs, this article infers that 4 Big Data 4.0 = meta(big data) = big intelligence. This article analyzes how intelligent big analytics support big intelligence. The proposed approach in this research might facilitate the research and development of big data, big data analytics, business intelligence, artificial intelligence, and data science.
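Abstract: After in-depth discussion and repeated review by experts in China and abroad, with the consent of its supervising and sponsoring organizations, and with the formal approval of the State Administration of Press, Publication, Radio, Film and Television (approval document Xin Guang Chu Shen [2015] No. 1187), the Chinese Journal of Library and Information Science (《中国文献情报(英)》, CJLIS), sponsored by the National Science Library, Chinese Academy of Sciences, will be officially renamed the Journal of Data and Information Science (《数据与情报科学学报(英)》, JDIS) starting in 2016. As the only English-language academic journal in China in the field of library and information science, CJLIS has, since its founding in 2008, taken as its mission the publication of high-level academic research papers that meet international standards and the advancement of library and information science as a discipline in China. By soliciting excellent manuscripts, upholding academic standards, promoting open access, and strictly controlling the review process, it has won the full recognition of the field and broad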