Traditional machine-learning algorithms are struggling to handle the exceedingly large amount of data being generated by the internet. In real-world applications, there is an urgent need for machine-learning algorithms that can handle large-scale, high-dimensional text data. Cloud computing involves the delivery of computing and storage as a service to a heterogeneous community of recipients, and it has recently aroused much interest in industry and academia. Most previous work on cloud platforms focuses only on parallel algorithms for structured data. In this paper, we focus on the parallel implementation of web-mining algorithms and develop a parallel web-mining system that includes a parallel web crawler; parallel text extract, transform and load (ETL) and modeling; and parallel text mining and application subsystems. The complete system enables a variety of real-world web-mining applications for mass data.
With the rapid development of wireless sensor networks (WSNs), demand for the limited radio frequency spectrum has risen sharply, so assigning WSN frequencies scientifically and efficiently has become a popular topic. To improve the frequency utilization rate in WSNs, a spectrum management system combined with cloud computing technology should be considered. From the optimization point of view, studies of dynamic spectrum management can be divided into three kinds of methods: Nash equilibrium, social utility maximization, and competitive economy equilibrium. In this paper, we propose a genetic-algorithm-based approach to allocate the power spectrum dynamically. The objective is to maximize the sum of individual Shannon utilities, taking background interference and crosstalk into account. Compared to the approach in [1], the experimental results show that our approach achieves a better balance between efficiency and effectiveness.
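The abstract does not state the objective in closed form. As a minimal sketch, assuming the form commonly used in the dynamic spectrum management literature, the sum of Shannon utilities can be evaluated as follows. Here each user's utility on a channel is the natural log of one plus its power divided by background interference plus crosstalk from the other users. The function name, the argument layout, and the use of natural logarithms are illustrative assumptions, not details taken from the paper.

```python
import math


def sum_shannon_utility(power, noise, crosstalk):
    """Sum of Shannon utilities over all users and channels (assumed form).

    power[i][k]        -- power user i allocates to channel k (hypothetical layout)
    noise[i][k]        -- background interference seen by user i on channel k (> 0)
    crosstalk[i][j][k] -- crosstalk gain from user j to user i on channel k
    """
    n_users = len(power)
    total = 0.0
    for i in range(n_users):
        for k in range(len(power[i])):
            # Background interference plus crosstalk from every other user.
            interference = noise[i][k] + sum(
                crosstalk[i][j][k] * power[j][k]
                for j in range(n_users)
                if j != i
            )
            total += math.log(1.0 + power[i][k] / interference)
    return total
```

A genetic algorithm such as the one the abstract describes would use a function of this kind as its fitness measure, scoring candidate power allocations that respect each user's power budget.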
One of the major scientific challenges and societal concerns is making informed decisions to ensure sustainable groundwater availability in the face of deep uncertainties. A major computational requirement associated with this is on-demand computing for risk analysis to support timely decisions. This paper presents a scientific modeling service called 'ModflowOnAzure', which enables large-scale ensemble runs of groundwater flow models to be easily executed in parallel in the Windows Azure cloud. Several technical issues were addressed, including the conjunctive use of desktop tools in MATLAB to avoid license issues in the cloud, integration of Dropbox with Azure for improved usability and 'Drop-and-Compute', and automated file exchanges between the desktop and the cloud. Two scientific use cases that achieved significant computational speedup with this service are presented. One case is from Arizona, where six plausible alternative conceptual models and a streamflow stochastic model are used to evaluate the impacts of different groundwater pumping scenarios. The other case is from Texas, where a global sensitivity analysis is performed on a regional groundwater availability model. Both cases yield uncertainty analysis results that can be used to assist groundwater planning and sustainability studies.
Funding: supported by the National Natural Science Foundation of China (Nos. 61175052, 60975039, 61203297, 60933004, 61035003), the National High-tech R&D Program of China (863 Program) (No. 2012AA011003), and the ZTE research fund of the Parallel Web Mining project.