Abstract: Education is undergoing a transformation driven by big data and cloud technologies, which play a central role in the construction of smart campuses. Big data technology is widely applied to student learning behavior analysis, teaching resource management, campus safety monitoring, and decision support, improving both the quality of education and management efficiency. Cloud computing technology supports the integration, distribution, and optimal use of educational resources through cloud resource sharing, virtual classrooms, intelligent campus management systems, and Infrastructure-as-a-Service (IaaS) models, which reduce costs and increase flexibility. This paper discusses the practical application of big data and cloud computing technologies in smart campuses, shows how these technologies contribute to smart campus development, and lays a foundation for future innovation in education models.
Funding: This work was supported by the National Natural Science Foundation of China (61871058).
Abstract: A smart grid is the evolved form of the power grid, integrating sensing, communication, computing, monitoring, and control technologies. These technologies make the power grid reliable, efficient, and economical. However, this smartness also increases the volume of data in the smart grid. Big data offers attractive techniques for processing and analyzing smart grid data so that its full benefits can be obtained. This paper presents and simulates a framework for using big data computing techniques in the smart grid. The framework comprises four layers: (i) a data source layer, (ii) a data transmission layer, (iii) a data storage and computing layer, and (iv) a data analysis layer. As a proof of concept, the framework is simulated using a dataset from three cities in the Pakistan region and two cloud-based data centers. The results are analyzed with respect to four parameters: (i) a heavily loaded data center, (ii) the impact of peak hours, (iii) high network delay, and (iv) low network delay. The presented framework may help the power grid achieve reliability, sustainability, and cost-efficiency for both users and service providers.
Abstract: To address the lag of traditional storage technology in mass data storage and management, an application platform for big data is designed and built on an integrated Hadoop and data warehouse platform, which makes data convenient to manage and use. To break through the bottleneck at the master node, a storage system with better performance is designed by introducing cloud computing technology. It adopts a master-slave distribution pattern in which network access follows the nearest-node principle, which reduces the burden of every access going through the single master node. A file block update strategy and a fault recovery mechanism are also provided to resolve the management bottleneck of traditional storage systems in data update and fault recovery, offering feasible technical solutions for big data storage management.
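The access pattern described in the abstract above can be illustrated with a minimal sketch. It assumes that the access principle means routing each read to the live storage node with the smallest network distance, so that requests do not all pass through the master. The `StorageNode` class, the `hops` field, and the `pick_nearest` function are hypothetical names introduced for illustration only; the paper's actual design is not given in the abstract.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class StorageNode:
    """A node holding a replica of a file block (hypothetical model)."""
    name: str
    hops: int           # assumed network distance from the client
    alive: bool = True  # False once the node has failed


def pick_nearest(nodes: Iterable[StorageNode]) -> StorageNode:
    """Return the closest live node; ties are broken by name for determinism."""
    candidates = [node for node in nodes if node.alive]
    if not candidates:
        raise RuntimeError("no live storage node holds this block")
    return min(candidates, key=lambda node: (node.hops, node.name))


replicas = [
    StorageNode("master", hops=5),
    StorageNode("slave-a", hops=2),
    StorageNode("slave-b", hops=1, alive=False),  # failed node is skipped
]
print(pick_nearest(replicas).name)
```

Skipping failed nodes is the simplest possible stand-in for fault recovery; a real system would also re-replicate the blocks the failed node held.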
Funding: Supported by the Natural Science Foundation of China (No. 52062027), the Key Research and Development Project of Gansu Province (No. 22YF7GA142), the Soft Science Special Project of Gansu Basic Research Plan (No. 22JR4ZA035), the Gansu Provincial Science and Technology Major Special Project-Enterprise Innovation Consortium Project (No. 22ZD6GA010 and No. 21ZD3GA002), and the Lanzhou Jiaotong University Basic Research Top Talents Training Program (No. 2022JC02).
Abstract: As an open-source cloud computing platform, Hadoop is extensively employed in a variety of sectors because of its high dependability, high scalability, and considerable benefits in processing and analyzing massive amounts of data. Consequently, to derive valuable insights from transportation big data, it is essential to leverage the Hadoop big data platform for analysis and mining. To summarize the latest research progress on the application of Hadoop to transportation big data, we conducted a comprehensive review of 98 relevant articles published from 2012 to the present. First, a bibliometric analysis was performed using VOSviewer software to identify the evolution trend of keywords. Second, we introduced the core components of Hadoop. Subsequently, we systematically reviewed the 98 articles, identified the latest research progress, and classified the main application scenarios of Hadoop and its optimization framework. Based on our analysis, we identified the research gaps and future work in this area. Our review highlights that Hadoop has played a significant role in transportation big data research over the past decade. Specifically, the focus has been on transportation infrastructure monitoring, taxi operation management, travel feature analysis, traffic flow prediction, transportation big data analysis platforms, traffic event monitoring and status discrimination, license plate recognition, and shortest-path computation. Additionally, the optimization framework of Hadoop has been studied in two main areas: the optimization of the computational model of Hadoop and the optimization of Hadoop combined with Spark. Several research results have been achieved in the field of transportation big data. However, there is little systematic research on the core technology of Hadoop, and the breadth and depth of the integration of Hadoop and transportation big data are not sufficient. In the future, Hadoop may be combined with other big data frameworks that process real-time data sources, such as Storm and Flink, to improve the real-time processing and analysis of transportation big data. At the same time, research on multi-source heterogeneous transportation big data remains a key focus. Improving existing big data technology to enable the analysis and even compression of transportation big data can lead to new breakthroughs for intelligent transportation.
Abstract: With recent advancements in computer technologies, the amount of data available is increasing day by day. However, excessive amounts of data create great challenges for users. Meanwhile, cloud computing services provide a powerful environment for storing large volumes of data. They eliminate various requirements, such as dedicated space and the maintenance of expensive computer hardware and software. Handling big data is a time-consuming task that requires large computational clusters to ensure successful data storage and processing. In this work, the definition, classification, and characteristics of big data are discussed, along with various cloud services, such as Microsoft Azure, Google Cloud, Amazon Web Services, International Business Machines (IBM) Cloud, Hortonworks, and MapR. A comparative analysis of various cloud-based big data frameworks is also performed. Various research challenges are defined in terms of distributed database storage, data security, heterogeneity, and data visualization.
Abstract: In recent years, the widespread use of electronic services and social networks has produced large volumes of information that, besides being large, contains many types of content, such as videos, photos, and text. Because of its high volume and lack of specificity, this information cannot be handled by traditional relational databases, and modern solutions are needed to process it at adequate speed. Solutions for storing big data must address how data is stored for processing and accessed in memory, network communication, and the features required of a distributed system. This paper presents the advantages and challenges of big data together with its special features and characteristics, introduces the technologies in use, studies storage methods, and outlines research opportunities for future work.
Funding: Supported by the cooperation project "Research on Green Cloud IDC Resource Scheduling" with ZTE Corporation.
Abstract: MapReduce is a programming model for processing large data sets, and Hadoop is the most popular open-source implementation of MapReduce. To achieve high performance, up to 190 Hadoop configuration parameters must be manually tuned. This is not only time-consuming but also error-prone. In this paper, we propose a new performance model based on random forest, a recently developed machine-learning algorithm. The model, called RFMS, is used to predict the performance of a Hadoop system according to the system's configuration parameters. RFMS is created from 2000 distinct fine-grained performance observations with different Hadoop configurations. We test RFMS against the measured performance of representative workloads from the Hadoop Micro-benchmark suite. The results show that the prediction accuracy of RFMS reaches 95% on average and up to 99%. This new, highly accurate prediction model can be used to automatically optimize the performance of Hadoop systems.
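For readers unfamiliar with the MapReduce programming model named in the abstract above, the following is a minimal single-process sketch of its three phases (map, shuffle, reduce) applied to word counting. It is an illustration of the general model only, not Hadoop's API and not the RFMS model from the paper; all function names here are hypothetical.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Iterator, List, Tuple

Pair = Tuple[str, int]


def map_phase(records: Iterable[str],
              mapper: Callable[[str], Iterable[Pair]]) -> Iterator[Pair]:
    """Apply the mapper to every input record and stream out key/value pairs."""
    for record in records:
        yield from mapper(record)


def shuffle(pairs: Iterable[Pair]) -> Dict[str, List[int]]:
    """Group all values by key, as the framework does between map and reduce."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups: Dict[str, List[int]],
                 reducer: Callable[[str, List[int]], int]) -> Dict[str, int]:
    """Collapse each key's list of values into a single result."""
    return {key: reducer(key, values) for key, values in groups.items()}


def word_count_mapper(line: str) -> Iterator[Pair]:
    # Emit (word, 1) for every whitespace-separated word, case-insensitively.
    for word in line.lower().split():
        yield word, 1


def word_count_reducer(word: str, counts: List[int]) -> int:
    return sum(counts)


def run_word_count(lines: Iterable[str]) -> Dict[str, int]:
    pairs = map_phase(lines, word_count_mapper)
    return reduce_phase(shuffle(pairs), word_count_reducer)


print(run_word_count(["big data", "Big cloud data"]))
```

In a real cluster the map and reduce calls run in parallel on different machines and the shuffle moves data over the network, which is where many of the tunable configuration parameters come into play.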
Abstract: Big data refers to the massive amounts and varieties of structured and unstructured information generated by social networking sites, biomedical equipment, financial companies, the internet and websites, scientific sensors, agricultural engineering sources, and so on. This huge amount of data cannot be processed using traditional data processing systems and technologies. Big data analytics is the process of examining information and patterns in huge data sets. Hence, the process needs a system architecture for data collection, transmission, storage, processing and analysis, and visualization. In this paper, we review the background and future aspects of big data. We first introduce the history, background, and related technologies of big data. We focus on big data system architecture and on the phases and classes of big data analytics. Then we present an open-source big data framework to address some of the big data challenges. Finally, we discuss different applications of big data with some examples.
Funding: Supported by the National Natural Science Foundation of China (NSFC) [grant numbers 41501383, 91438203], the China Postdoctoral Science Foundation [grant number 2014M562006], the Natural Science Foundation of Hubei Province [grant number 2015CFB330], and the Fundamental Research Funds for the Central Universities [grant number 2042016kf0163].
Abstract: The development of global informatization and its integration with industrialization signals that human society has entered the big data era. This article covers seven new characteristics of Geomatics (i.e., ubiquitous sensors, multi-dimensional dynamics, integration via networking, full automation in real time, from sensing to recognition, crowdsourcing and volunteered geographic information, and service-oriented science), and puts forward the corresponding critical technical challenges in the construction of integrated space-air-ground geospatial networks. Through the discussions outlined in this paper, we propose a new development stage of Geomatics entitled 'Connected Geomatics,' which is defined as a multi-disciplinary science and technology that uses systematic approaches and integrates methods of spatio-temporal data acquisition, information extraction, network management, knowledge discovery, and spatial sensing and recognition, as well as intelligent location-based services pertaining to any physical objects and human activities on the earth. It is envisioned that the advancement of Geomatics will make a great contribution to sustainable human development.