Abstract: The rise of big data has led to new demands for machine learning (ML) systems to learn complex models, with millions to billions of parameters, that promise adequate capacity to digest massive datasets and offer powerful predictive analytics (such as high-dimensional latent features, intermediate representations, and decision functions) thereupon. Running ML algorithms at such scales, on a distributed cluster with tens to thousands of machines, often requires significant engineering effort, and one might fairly ask whether such engineering truly falls within the domain of ML research. Taking the view that "big" ML systems can benefit greatly from ML-rooted statistical and algorithmic insights, and that ML researchers should therefore not shy away from such systems design, we discuss a series of principles and strategies distilled from our recent efforts on industrial-scale ML solutions. These principles and strategies span a continuum from application, to engineering, to theoretical research and development of big ML systems and architectures, with the goal of understanding how to make them efficient, generally applicable, and supported with convergence and scaling guarantees. They concern four key questions that traditionally receive little attention in ML research: How can an ML program be distributed over a cluster? How can ML computation be bridged with inter-machine communication? How can such communication be performed? What should be communicated between machines? By exposing underlying statistical and algorithmic characteristics that are unique to ML programs and not typically seen in traditional computer programs, and by dissecting successful cases to reveal how we have harnessed these principles to design and develop both high-performance distributed ML software and general-purpose ML frameworks, we present opportunities for ML researchers and practitioners to further shape and enlarge the area that lies between ML and systems.
Funding: Supported by the National Natural Science Foundation of China (No. 61401496).
Abstract: To reduce the resource consumption of parallel computation systems, a static task scheduling optimization method based on a hybrid genetic algorithm is proposed and validated; it shortens the schedule length of parallel tasks with precedence constraints. First, a global optimization model and its constraints are formulated to describe the static task scheduling problem in heterogeneous distributed computing systems (HeDCSs). Second, the genetic population is encoded as a matrix and used to search the total available time span of the processors, and a simulated annealing algorithm is then introduced to improve convergence speed and to overcome the tendency of the traditional genetic algorithm to become trapped in local minima. Finally, extensive experiments show that, compared with existing scheduling algorithms such as dynamic level scheduling (DLS), heterogeneous earliest finish time (HEFT), and longest dynamic critical path (LDCP), the proposed approach not only decreases the task schedule length but also achieves the maximal resource utilization of the parallel computation system.
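The idea of combining a genetic algorithm with a simulated annealing acceptance step can be illustrated with a minimal Python sketch. This is not the authors' implementation: it assumes a plain per-task processor assignment vector instead of the matrix encoding mentioned in the abstract, ignores communication costs, and uses a small hypothetical task graph and cost table. The names `makespan`, `hybrid_ga_sa`, and all parameter values are illustrative assumptions.

```python
import math
import random

def makespan(assign, order, preds, cost):
    """Schedule length when task t runs on processor assign[t].

    `order` must be a topological order of the task graph; communication
    costs are ignored (a simplifying assumption of this sketch).
    """
    finish = {}
    proc_free = [0.0] * len(cost[0])
    for t in order:
        p = assign[t]
        ready = max((finish[q] for q in preds[t]), default=0.0)
        start = max(ready, proc_free[p])
        finish[t] = start + cost[t][p]
        proc_free[p] = finish[t]
    return max(finish.values())

def hybrid_ga_sa(order, preds, cost, pop_size=20, generations=60,
                 t0=10.0, cooling=0.95, seed=0):
    """Genetic search in which a child replaces its parent by the Metropolis rule."""
    rng = random.Random(seed)
    n_tasks, n_procs = len(cost), len(cost[0])

    def fit(a):
        return makespan(a, order, preds, cost)

    pop = [[rng.randrange(n_procs) for _ in range(n_tasks)]
           for _ in range(pop_size)]
    best = min(pop, key=fit)
    temp = t0
    for _ in range(generations):
        nxt = []
        for parent in pop:
            mate = min(rng.sample(pop, 2), key=fit)   # tournament selection
            cut = rng.randrange(1, n_tasks)           # one-point crossover
            child = parent[:cut] + mate[cut:]
            child[rng.randrange(n_tasks)] = rng.randrange(n_procs)  # mutation
            delta = fit(child) - fit(parent)
            # Simulated annealing step: always accept an improvement, and
            # accept a worse child with probability exp(-delta / temp).
            if delta <= 0 or rng.random() < math.exp(-delta / temp):
                nxt.append(child)
            else:
                nxt.append(parent)
        pop = nxt
        best = min(pop + [best], key=fit)  # keep the best schedule seen so far
        temp *= cooling
    return best, fit(best)

# Hypothetical example: 5 tasks, 2 heterogeneous processors.
preds = {0: [], 1: [0], 2: [0], 3: [1, 2], 4: [3]}
cost = [[2, 3], [4, 2], [3, 5], [2, 2], [1, 3]]  # cost[task][processor]
order = [0, 1, 2, 3, 4]

best_assign, best_len = hybrid_ga_sa(order, preds, cost)
print(best_assign, best_len)
```

Accepting some worse children early, while the temperature is high, is what lets the search leave a local minimum; as the temperature cools the loop behaves more like a greedy genetic algorithm.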
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 41471305, 41405036, and 41301653), the Sichuan Youth Science Foundation (Grant No. 2015JQ0037), and the Chongqing Meteorological Bureau Open Fund (Grant No. KFJJ-201402).
Abstract: Water vapor plays a crucial role in atmospheric processes that act over a wide range of temporal and spatial scales, from global climate to micrometeorology. Determining the distribution of water vapor in the atmosphere and how it changes is therefore very important. Although atmospheric scientists have developed a variety of widely used methods for measuring precipitable water vapor (PWV) from remote sensing data, retrieving PWV over land from a single kind of satellite measurement has limitations. In this paper, a new algorithm is proposed for retrieving PWV over land by combining different kinds of remote sensing data, and it is designed to work under cloudy conditions. The PWV retrieval algorithm based on near-infrared data achieves high precision but is better suited to clear-sky conditions. The 23.5 GHz microwave remote sensing data is sensitive to water vapor and powerful in cloud-covered areas because its longer wavelengths permit viewing into and through the atmosphere. Therefore, the PWV retrieval results from near-infrared data and indices that combine the water-vapor-sensitive microwave bands are regressed to generate the equation for PWV retrieval over cloud-covered areas. The algorithm developed in this paper has the potential to detect PWV under all weather conditions and makes an excellent complement to PWV retrieved from near-infrared data. Different surface types exert different depolarization effects on surface emissions, which increases the complexity of the algorithm; MODIS surface classification data was used to account for this influence. Compared with GPS results, the root mean square error of our algorithm is 8 mm for cloud-covered areas. Regional consistency was found between the results from MODIS and our algorithm. Our algorithm yields reasonable results over cloud-covered surfaces, where MODIS cannot be used to retrieve PWV.
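The regression and validation steps described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the abstract does not give the form of the microwave index or of the regression equation, so the sketch assumes a single index and an ordinary least-squares straight line, and every number below is a synthetic placeholder, not data from the paper. The names `fit_line` and `rmse` are hypothetical helpers.

```python
import math

def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def rmse(pred, ref):
    """Root mean square error between predictions and reference values."""
    return math.sqrt(sum((p - r) ** 2 for p, r in zip(pred, ref)) / len(ref))

# Synthetic clear-sky samples (placeholders): a microwave index and the
# PWV (mm) retrieved from near-infrared data at the same locations.
clear_index = [0.1, 0.2, 0.3, 0.4]
clear_nir_pwv = [10.0, 20.0, 30.0, 40.0]
slope, intercept = fit_line(clear_index, clear_nir_pwv)

# Apply the fitted equation where clouds block the near-infrared retrieval,
# then compare against synthetic GPS-derived PWV (mm).
cloudy_index = [0.25, 0.35]
cloudy_pwv = [slope * v + intercept for v in cloudy_index]
gps_pwv = [27.0, 32.0]
error_mm = rmse(cloudy_pwv, gps_pwv)
print(slope, intercept, error_mm)
```

The fit is trained only on clear-sky samples, where the near-infrared retrieval is available, and is evaluated only on cloudy samples, which mirrors how the abstract reports its error against GPS for cloud-covered areas.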