Funding: The research leading to the results presented in this paper has received funding from the European Union Seventh Framework Programme (FP7-2012-NMP-ICT-FoF) under Grant Agreement No. 314364.
Abstract: Purpose – Data science is the study of the generalizable extraction of knowledge from data. It comprises a variety of components and builds on methods and concepts from many domains, including mathematics, probability models, machine learning, statistical learning, computer programming, data engineering, pattern recognition and learning, visualization and data warehousing, with the aim of extracting value from data. The purpose of this paper is to provide an overview of open source (OS) data science tools and to propose a classification scheme that can be used to study OS data science software. Design/methodology/approach – The proposed classification scheme is based on general characteristics, project activity, operational characteristics and data mining characteristics. The authors then use the proposed scheme to examine 70 identified OS software tools. From this, the authors provide insight into the current status of OS data science tools and identify the state-of-the-art tools. Findings – The features of the 70 OS tools are recorded against the criteria of the four groups of characteristics: general characteristics, project activity, operational characteristics and data mining characteristics. Interesting results emerged from the analysis of these features and are reported here. Originality/value – The contribution of this survey is the development of a new classification scheme for examining and studying OS data science tools. In parallel, this study provides an overview of existing OS data science tools.
Funding: This research is funded by the EU Seventh Framework Programme under Grant Agreement No. 314364.
Abstract: Discrete event simulation (DES) is a well-established decision support tool for modeling workflows in the manufacturing industry. However, a number of practical and financial obstacles deter the use of this technology in industry. One of the main weaknesses of operating DES is the cost of collecting and mapping input data from different enterprise data resources into a DES model. Another issue is the cost of integrating simulation applications with other manufacturing applications. These barriers hinder the automated input of data into DES models and, as a result, deter the use of real-time DES in manufacturing. This review presents existing research studies in the literature that address the above issues and, in parallel, demonstrates the concepts already implemented. The scope of this review is to provide an overview of the input data phase, focusing on its automation, and to motivate researchers to re-examine this phase by highlighting future research directions.