In economics, buyers and sellers are usually the main sides in a market. Game theory can perfectly model decisions behind each “player” and calculate an outcome that benefits both sides. However, the use of game theory is not limited to economics. In this paper, I will introduce the mathematical model of general-sum games, solutions and theorems surrounding game theory, and its real-life applications in many different scenarios.
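As a small illustration of a general-sum game, the sketch below enumerates the pure-strategy Nash equilibria of a two-player game given as two payoff tables. The function name and the prisoner's-dilemma payoffs are hypothetical and are not taken from the paper; mixed-strategy equilibria are not covered.

```python
from itertools import product

def pure_nash_equilibria(payoff_a, payoff_b):
    """Return all (row, col) cells where neither player gains by deviating alone."""
    n_rows, n_cols = len(payoff_a), len(payoff_a[0])
    equilibria = []
    for r, c in product(range(n_rows), range(n_cols)):
        # Row player cannot do better by switching rows while the column is fixed.
        row_best = all(payoff_a[r][c] >= payoff_a[i][c] for i in range(n_rows))
        # Column player cannot do better by switching columns while the row is fixed.
        col_best = all(payoff_b[r][c] >= payoff_b[r][j] for j in range(n_cols))
        if row_best and col_best:
            equilibria.append((r, c))
    return equilibria

# Hypothetical prisoner's dilemma: index 0 = cooperate, index 1 = defect.
PAYOFF_A = [[-1, -3], [0, -2]]   # row player's payoffs
PAYOFF_B = [[-1, 0], [-3, -2]]   # column player's payoffs

print(pure_nash_equilibria(PAYOFF_A, PAYOFF_B))
```

With these assumed payoffs the only pure equilibrium is the cell where both players defect.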
In order to improve the generalization ability of binary decision trees, a new learning algorithm, the MMDT algorithm, is presented. Based on statistical learning theory, the generalization performance of binary decision trees is analyzed, and an assessment rule is proposed. Under the direction of the assessment rule, the MMDT algorithm is implemented. The algorithm maps training examples from the original space to a high-dimensional feature space and constructs a decision tree in it. In the feature space, a new decision node splitting criterion, the max-min rule, is used, and the margin of each decision node is maximized using a support vector machine to improve the generalization performance. Experimental results show that the new learning algorithm is much superior to others such as C4.5 and OC1.
Forecasting the movement of the stock market has long been an attractive topic. This paper implements different statistical learning models to predict the movement of the S&P 500 index. The S&P 500 index is influenced by other important financial indexes across the world, such as commodity prices and financial technical indicators. This paper systematically investigates four supervised learning models, namely Logistic Regression, Gaussian Discriminant Analysis (GDA), Naive Bayes and Support Vector Machine (SVM), in forecasting the S&P 500 index. After several optimization experiments on features and models, especially SVM kernel selection and feature selection for the different models, this paper concludes that an SVM model with a Radial Basis Function (RBF) kernel can achieve an accuracy rate of 62.51% for the future market trend of the S&P 500 index.
Forecasting stock returns is extremely challenging in general, and this task becomes even more difficult given the turbulent nature of the Chinese stock market. We address the stock selection process as a statistical learning problem and build cross-sectional forecast models to select individual stocks in the Shanghai Composite Index. Decile portfolios are formed according to rankings of the forecasted future cumulative returns. The equity market-neutral portfolio, formed by buying the top decile portfolio and selling short the bottom decile portfolio, exhibits superior performance to, and a low correlation with, the Shanghai Composite Index. To make our strategy more useful to practitioners, we evaluate the proposed stock selection strategy's performance by allowing only long positions and by investing only in A-share stocks, to incorporate the restrictions in the Chinese stock market. The long-only strategies still generate robust and superior performance compared to the Shanghai Composite Index. A close examination of the coefficients of the features provides more insights into the changes in market dynamics from period to period.
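The decile construction described in this abstract can be sketched as follows. This is a minimal illustration only: the function name, the ticker labels and the forecast values are hypothetical, and the paper's forecast models, weighting scheme and trading restrictions are not reproduced.

```python
def decile_long_short(forecasts):
    """Split stocks into a long leg (top decile) and a short leg (bottom decile).

    forecasts: dict mapping a ticker to its forecasted cumulative return.
    """
    # Rank tickers from the highest to the lowest forecasted return.
    ranked = sorted(forecasts, key=forecasts.get, reverse=True)
    # One decile holds a tenth of the universe (at least one stock).
    size = max(1, len(ranked) // 10)
    return ranked[:size], ranked[-size:]

# Hypothetical universe of 20 stocks with made-up forecasts.
example = {f"S{i:02d}": i / 100 for i in range(20)}
long_leg, short_leg = decile_long_short(example)
print(long_leg, short_leg)
```

A long-only variant, as evaluated in the abstract, would simply hold the first list and ignore the second.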
Statistical relational learning constructs statistical models from relational databases, combining relational learning and statistical learning. Its strong capability and special properties have made statistical relational learning one of the important areas in machine learning research. In this paper, the general concepts and characteristics of statistical relational learning are presented first. Then some major branches of this newly emerging field are discussed, including logic and rule-based approaches, frame and object-oriented approaches, and functional programming-based approaches. After that, several methods of applying rough sets in statistical relational learning are described, such as gRS-ILP and VPRSILP. Finally, some applications of statistical relational learning are briefly introduced, and some future directions of statistical relational learning and the application of rough sets in this area are pointed out.
Nitrogen (N) monitoring is essential in nurseries to ensure the production of high-quality seedlings. Near-infrared spectroscopy (NIRS) is an instantaneous, nondestructive method to monitor N. Spectral data such as NIRS can also provide the basis for developing a new vegetation spectral index (VSI). Here, we evaluated whether NIRS combined with statistical modeling can accurately detect early variations in N concentration in leaves of young plants of Annona emarginata and developed a new VSI for this task. Plants were grown in a hydroponics system with 0, 2.75, 5.5 or 11 mM N for 45 days. Then we measured gas exchange, chlorophyll a fluorescence, and pigments in leaves; analyzed complete leaf nutrients; and recorded spectral data for leaves at 966 to 1685 nm using NIRS. With a statistical learning approach, the dimensionality of the spectral data was reduced, then models were generated using two classes (N deficiency, N) or four classes (0, 2.75, 5.5, 11 mM N). The best combination of techniques for dimensionality reduction and classification, respectively, was stepwise regression (PROC STEPDISC) and linear discriminant function. It was possible to detect N deficiency in seedling leaves with 100% precision, and the four N concentrations with 93.55% accuracy, before photosynthetic damage to the plant occurred. Thus, NIRS combined with statistical modeling of multidimensional data is effective for detecting N variations in seedling leaves of A. emarginata.
Online banking fraud occurs whenever a criminal can seize accounts and transfer funds from an individual's online bank account. Successfully preventing this requires the detection of as many fraudsters as possible, without producing too many false alarms. This is a challenge for machine learning owing to the extremely imbalanced data and complexity of fraud. In addition, classical machine learning methods must be extended, minimizing expected financial losses. Finally, fraud can only be combated systematically and economically if the risks and costs in payment channels are known. We define three models that overcome these challenges: machine learning-based fraud detection, economic optimization of machine learning results, and a risk model to predict the risk of fraud while considering countermeasures. The models were tested utilizing real data. Our machine learning model alone reduces the expected and unexpected losses in the three aggregated payment channels by 15% compared to a benchmark consisting of static if-then rules. Optimizing the machine-learning model further reduces the expected losses by 52%. These results hold with a low false positive rate of 0.4%. Thus, the risk framework of the three models is viable from a business and risk perspective.
As the new generation of artificial intelligence (AI) continues to evolve, weather big data and statistical machine learning (SML) technologies complement each other and are deeply integrated to significantly improve the processing and forecasting accuracy of fishery weather. Accurate fishery weather services play a crucial role in fishery production, serving as a great safeguard for economic benefits and personal safety, enabling fishermen to carry out fishery production better, and contributing to the sustainable development of the fishery industry. The objective of this paper is to offer an understanding of the present state of research and development in SML technology for simulating and forecasting fishery weather. Specifically, we analyze the current state of research and technical features of SML in weather and summarize the applications of SML in simulation and forecasting of fishery weather, which mainly include three aspects: fishery weather scenario generation, fishery weather forecasting, and fishery extreme weather warning. We also illustrate the main technical means and principles of SML technology. Finally, we summarize the most advanced SML fields and provide an outlook on their application value in the field of fishery weather.
Deficiencies of applying the traditional least squares support vector machine (LS-SVM) to time series online prediction were specified. According to the property of the kernel function matrix and using the recursive calculation of block matrices, a new time series online prediction algorithm based on an improved LS-SVM was proposed. The historical training results were fully utilized and the computing speed of LS-SVM was enhanced. Then, the improved algorithm was applied to time series online prediction. Based on the operational data provided by the Northwest Power Grid of China, the method was used in the transient stability prediction of an electric power system. The results show that, compared with the calculation time of the traditional LS-SVM (75-1 600 ms), that of the proposed method in different time windows is 40-60 ms, and the prediction accuracy (normalized root mean squared error) of the proposed method is above 0.8. So the improved method is better than the traditional LS-SVM and more suitable for time series online prediction.
The basic principles of the Support Vector Machine (SVM) are introduced in this paper. A specific process to establish an SVM prediction model is given. To improve the precision of coal reserve estimation, a support vector machine method, based on statistical learning theory, is put forward. The SVM model was trained and tested using the existing exploration and exploitation data of the Chencun mine of the Yima bureau as the input data. Then coal reserves within a particular region were calculated. These calculated results and the actual results of the exploration block were compared. The maximum relative error was 10.85%, within the scope of acceptable error limits. The results show that the SVM coal reserve calculation method is reliable. This method is simple, practical and valuable.
Statistical learning theory is a theory for small-sample statistics, and the support vector machine is a new machine learning method based on it. The support vector machine not only has solved certain problems in many learning methods, such as small samples, overfitting, high dimensionality and local minima, but also has a higher generalization (forecasting) ability than that of artificial neural networks. The strong earthquakes in the Chinese mainland are related to a certain extent to the intensive seismicity along the main plate boundaries in the world; however, the relation is nonlinear. In the paper, we have studied this unclear relation by the support vector machine method for the purpose of forecasting strong earthquakes in the Chinese mainland.
This paper provides an introduction to the support vector machine, a new kernel-based technique introduced in statistical learning theory and structural risk minimization, then presents a modeling-control framework based on SVM. Finally, a numerical experiment is carried out to demonstrate the proposed approach's correctness and effectiveness.
Polynomial-time randomized algorithms were constructed to approximately solve optimal robust performance controller design problems in a probabilistic sense, and a rigorous mathematical justification of the approach was given. The randomized algorithms here were based on a property from statistical learning theory known as (uniform) convergence of empirical means (UCEM). It is argued that in order to assess the performance of a controller as the plant varies over a pre-specified family, it is better to use the average performance of the controller as the objective function to be optimized, rather than its worst-case performance. The approach is illustrated to be efficient through an example.
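To make the idea of average performance over a plant family concrete, the sketch below estimates the mean cost of one fixed controller by random sampling, with the sample size taken from the standard Hoeffding bound for costs in [0, 1]. This is an assumption-laden illustration, not the paper's algorithm: the bound covers a single fixed controller, whereas UCEM-based design over a whole controller family generally needs larger samples, and the toy plant and cost function are hypothetical.

```python
import math
import random

def hoeffding_sample_size(epsilon, delta):
    """Samples needed so that one empirical mean of a [0, 1]-valued cost is
    within epsilon of its expectation with probability at least 1 - delta."""
    return math.ceil(math.log(2.0 / delta) / (2.0 * epsilon ** 2))

def empirical_mean_cost(cost, sample_plant, n_samples, rng):
    """Average the cost of one fixed controller over randomly drawn plants."""
    return sum(cost(sample_plant(rng)) for _ in range(n_samples)) / n_samples

# Hypothetical toy setup: the "plant" is a single uncertain gain in [1, 2],
# and the controller counts as failing (cost 1) when the gain exceeds 1.5.
def sample_gain(rng):
    return rng.uniform(1.0, 2.0)

def toy_cost(gain):
    return 1.0 if gain > 1.5 else 0.0

n = hoeffding_sample_size(epsilon=0.1, delta=0.05)
estimate = empirical_mean_cost(toy_cost, sample_gain, n, random.Random(0))
print(n, estimate)
```

In this toy setup the true average cost is 0.5, so the printed estimate should land close to that value.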
An ensemble prediction model of solar proton events (SPEs), combining the information of solar flares and coronal mass ejections (CMEs), is built. In this model, solar flares are parameterized by the peak flux, the duration and the longitude. In addition, CMEs are parameterized by the width, the speed and the measurement position angle. The importance of each parameter for the occurrence of SPEs is estimated by the information gain ratio. We find that the CME width and speed are more informative than the flare’s peak flux and duration. As the physical mechanism of SPEs is not very clear, a hidden naive Bayes approach, which is a probability-based calculation method from the field of machine learning, is used to build the prediction model from the observational data. As is known, SPEs originate from solar flares and/or shock waves associated with CMEs. Hence, we first build two base prediction models using the properties of solar flares and CMEs, respectively. Then the outputs of these models are combined to generate the ensemble prediction model of SPEs. The ensemble prediction model incorporating the complementary information of solar flares and CMEs achieves better performance than each base prediction model taken separately.
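The information gain ratio used to rank the parameters can be sketched as below, in its usual form of information gain divided by the split information of the feature. The sketch assumes each parameter has already been discretized into categories (continuous quantities such as peak flux would need binning first), and the feature values and labels are made up for illustration.

```python
from collections import Counter
from math import log2

def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(values)
    return -sum((n / total) * log2(n / total) for n in Counter(values).values())

def gain_ratio(feature_values, labels):
    """Information gain of the feature about the labels, divided by the
    entropy of the feature itself (the split information)."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Expected label entropy remaining after splitting on the feature.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    split_info = entropy(feature_values)
    return info_gain / split_info if split_info > 0 else 0.0

# Hypothetical binned data: CME width category versus SPE occurrence (1 or 0).
width = ["wide", "wide", "narrow", "narrow"]
spe = [1, 1, 0, 0]
print(gain_ratio(width, spe))
```

A feature that separates the classes perfectly, as in this toy data, scores 1, while a feature unrelated to the labels scores 0.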
A multi-layer adaptive parameter optimization algorithm is developed for improving least squares support vector machines (LS-SVM), and a military aircraft life-cycle-cost (LCC) intelligent estimation model is proposed based on the improved LS-SVM. The intelligent cost estimation process is divided into three steps in the model. In the first step, a cost-drive-factor set needs to be selected, which is significant for cost estimation. In the second step, military aircraft training samples with costs and the cost-drive-factor set are obtained by the LS-SVM. Then the model can be used for new type aircraft cost estimation. Chinese military aircraft costs are estimated in the paper. The results show that the costs estimated by the new model are closer to the true costs than those of the traditionally used methods.
A method to compress the training dataset of a Support Vector Machine (SVM), based on the characteristics of the Support Vector Machine, is proposed. First, the distance between the units in the two training datasets is computed, and then the samples that lie far from the hyperplane are discarded in order to compress the training dataset. The time spent in training the SVM with the training dataset compressed by the method is shortened considerably. The result of the experiment shows that the algorithm is effective.
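One way to read this compression idea is sketched below: since the separating hyperplane is unknown before training, the distance from each sample to its nearest opposite-class sample is used as a stand-in for closeness to the boundary, and only the closest fraction of each class is kept. This proxy, the `keep_fraction` parameter and the sample points are assumptions made for illustration; the abstract does not specify the paper's actual distance rule.

```python
from math import dist

def compress_training_set(positives, negatives, keep_fraction=0.5):
    """Keep, per class, the samples nearest to the opposite class.

    positives, negatives: non-empty lists of equal-length coordinate tuples.
    """
    def keep_closest(own, other):
        # Sort by distance to the nearest opposite-class sample, a rough
        # proxy for how close a point lies to the (unknown) hyperplane.
        scored = sorted(own, key=lambda p: min(dist(p, q) for q in other))
        n_keep = max(1, round(len(own) * keep_fraction))
        return scored[:n_keep]

    return keep_closest(positives, negatives), keep_closest(negatives, positives)

# Hypothetical one-dimensional layout embedded in 2-D.
pos = [(1.0, 0.0), (5.0, 0.0)]
neg = [(-1.0, 0.0), (-5.0, 0.0)]
print(compress_training_set(pos, neg))
```

The pairwise distance computation is quadratic in the number of samples, so this sketch only pays off when SVM training itself is the more expensive step.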
New energy integration and flexible demand response make smart grid operation scenarios complex and changeable, which brings challenges to network planning. If every possible scenario is considered, the solution to the planning can become extremely time-consuming and difficult. This paper introduces statistical machine learning (SML) techniques to carry out multi-scenario based probabilistic power flow calculations and describes their application to the stochastic planning of distribution networks. The proposed SML includes linear regression, probability distribution, Markov chain, isoprobabilistic transformation, maximum likelihood estimator, stochastic response surface and center point method. Based on the above SML model, capricious weather, photovoltaic power generation, thermal load, power flow and uncertainty programming are simulated. Taking a 33-bus distribution system as an example, this paper compares the stochastic planning model based on SML with the traditional models published in the literature. The results verify that the proposed model greatly improves planning performance while meeting accuracy requirements. The case study also considers a realistic power distribution system operating under stressed conditions.
The development of distributed renewable energy, such as photovoltaic power and wind power generation, makes the energy system cleaner, and is of great significance in reducing carbon emissions. However, weather can affect distributed renewable energy power generation, and the uncertainty of output brings challenges to uncertainty planning for distributed renewable energy. Energy systems with high penetration of distributed renewable energy involve the high-dimensional, nonlinear dynamics of large-scale complex systems, and the optimal solution of the uncertainty model is a difficult problem. From the perspective of statistical machine learning, the theory of planning of distributed renewable energy systems under uncertainty is reviewed and some key technologies are put forward for applying advanced artificial intelligence to distributed renewable power uncertainty planning.
This brief paper reports a hybrid algorithm we developed recently to solve the global optimization problems of multimodal functions by combining the advantages of two powerful population-based metaheuristics, differential evolution (DE) and particle swarm optimization (PSO). In the hybrid, denoted DEPSO, each individual in one generation chooses its evolution method, DE or PSO, in a statistical learning way. The choice depends on the relative success ratio of the two methods in a previous learning period. The proposed DEPSO is compared with its PSO and DE parents, two advanced DE variants (one of which is suggested by the originators of DE), two advanced PSO variants (one of which is acknowledged as a recent standard by the PSO community), and also a previous DEPSO. Benchmark tests demonstrate that the DEPSO is more competent for the global optimization of multimodal functions due to its high optimization quality.
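The method-selection step described here can be sketched as a probability derived from the two success ratios of the previous learning period. This is one plausible reading of "relative success ratio"; the function names, the counting of successes and the fallback to 0.5 when no successes were recorded are assumptions, and the DE and PSO update rules themselves are omitted.

```python
import random

def de_probability(de_success, de_trials, pso_success, pso_trials):
    """Probability of picking DE, proportional to its share of the success ratios."""
    de_ratio = de_success / de_trials if de_trials else 0.0
    pso_ratio = pso_success / pso_trials if pso_trials else 0.0
    total = de_ratio + pso_ratio
    # With no recorded successes, fall back to an even split (an assumption).
    return de_ratio / total if total > 0 else 0.5

def choose_method(p_de, rng=random):
    """Draw the evolution method for one individual."""
    return "DE" if rng.random() < p_de else "PSO"

# Hypothetical learning period: DE improved 30 of 100 individuals, PSO 10 of 100.
p = de_probability(30, 100, 10, 100)
rng = random.Random(42)
picks = [choose_method(p, rng) for _ in range(10)]
print(p, picks)
```

Each individual would call `choose_method` once per generation, and the counts would be refreshed at the end of every learning period.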
We examined the neural correlates of the statistical learning of orthographic-semantic connections in Chinese adult learners. Visual event-related potentials (ERPs) were recorded while participants were exposed to a sequence of artificial logographic characters containing semantic radicals carrying low, moderate, or high levels of semantic consistency. The behavioral results showed that the mean accuracy of participants' recognition of previously exposed characters was 63.1%, which was significantly above chance level (50%), indicating the statistical learning of the regularities of semantic radicals. The ERP data revealed a temporal sequence of the neural process of statistical learning of orthographic-semantic connections, and different brain indexes were found to be associated with this processing, i.e., a clear N170-P200-N400 pattern. For N170, larger negative amplitudes were evoked by the high and moderate consistency than by the low consistency. For P200, the mean amplitudes elicited by the moderate and low consistency were larger than those elicited by the high consistency. In contrast, a larger N400 amplitude was observed for the low than for the moderate and high consistency, and a more negative amplitude was elicited by the moderate than by the high consistency. We propose that the initial potential shifts (N170 and P200) may reflect orthographic or graphic form identification, while the later component (N400) may be associated with semantic information analysis.
Funding (NIRS nitrogen-monitoring study): a scholarship from CAPES (Coordenação de Aperfeiçoamento de Pessoal de Nível Superior), Brazil (award number: 001), for the first author.
Funding (online banking fraud study): [...] from any funding agency in the public, commercial, or not-for-profit sectors.
Funding (fishery weather SML review): the National Natural Science Foundation of China under Grant 52007193 and the 2115 Talent Development Program of China Agricultural University.
Funding (LS-SVM online prediction study): Project (SGKJ[200301-16]) supported by the State Grid Corporation of China.
Funding: Project 072400430420 supported by the Natural Science Foundation of Henan Province
Abstract: The basic principles of the Support Vector Machine (SVM) are introduced in this paper. A specific process to establish an SVM prediction model is given. To improve the precision of coal reserve estimation, a support vector machine method, based on statistical learning theory, is put forward. The SVM model was trained and tested using the existing exploration and exploitation data of the Chencun mine of the Yima bureau as the input data. Then coal reserves within a particular region were calculated. These calculated results and the actual results of the exploration block were compared. The maximum relative error was 10.85%, within the scope of acceptable error limits. The results show that the SVM coal reserve calculation method is reliable. This method is simple, practical and valuable.
Funding: Joint Seismological Science Foundation of China (104090)
Abstract: Statistical learning theory addresses small-sample statistics, and the support vector machine is a new machine learning method based on it. The support vector machine not only solves certain problems found in many learning methods, such as small samples, overfitting, high dimensionality and local minima, but also has a higher generalization (forecasting) ability than artificial neural networks. The strong earthquakes in the Chinese mainland are related to a certain extent to the intensive seismicity along the main plate boundaries in the world; however, the relation is nonlinear. In this paper, we study this unclear relation with the support vector machine method for the purpose of forecasting strong earthquakes in the Chinese mainland.
Abstract: This paper provides an introduction to the support vector machine (SVM), a new kernel-based technique introduced in statistical learning theory and structural risk minimization, and then presents a modeling-control framework based on SVM. Finally, a numerical experiment is carried out to demonstrate the correctness and effectiveness of the proposed approach.
Abstract: Polynomial-time randomized algorithms were constructed to approximately solve optimal robust performance controller design problems in a probabilistic sense, and a rigorous mathematical justification of the approach was given. The randomized algorithms were based on a property from statistical learning theory known as (uniform) convergence of empirical means (UCEM). It is argued that in order to assess the performance of a controller as the plant varies over a pre-specified family, it is better to use the average performance of the controller as the objective function to be optimized, rather than its worst-case performance. An example illustrates the efficiency of the approach.
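As a rough illustration of the empirical-mean idea in the abstract above, the sketch below estimates the average cost of one fixed controller by sampling plants at random, and sizes the sample with the standard Hoeffding bound for costs in [0, 1]. The scalar-gain plant family, the cost function, and all names are hypothetical and are not taken from the paper; the bound shown covers a single controller only, not the uniform (UCEM) guarantee over a whole controller family.

```python
import math
import random

def hoeffding_sample_size(eps, delta):
    """Smallest N with 2*exp(-2*N*eps**2) <= delta, for costs bounded in [0, 1]."""
    return math.ceil(math.log(2.0 / delta) / (2.0 * eps ** 2))

def empirical_mean_cost(cost, sample_plant, n, rng):
    """Average cost of one fixed controller over n randomly sampled plants."""
    return sum(cost(sample_plant(rng)) for _ in range(n)) / n

# Hypothetical toy family: the plant is a scalar gain g drawn uniformly from [0.5, 1.5].
def sample_plant(rng):
    return rng.uniform(0.5, 1.5)

# Hypothetical cost in [0, 1]: deviation of the gain from its nominal value 1.0.
def cost(g):
    return min(1.0, abs(g - 1.0))

n = hoeffding_sample_size(eps=0.05, delta=0.01)
rng = random.Random(0)
estimate = empirical_mean_cost(cost, sample_plant, n, rng)
print(n, round(estimate, 3))
```

For this toy family the true average cost is 0.25, so the estimate should fall within 0.05 of that value with probability at least 0.99.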
Funding: Supported by the Young Researcher Grant of National Astronomical Observatories, Chinese Academy of Sciences, the National Basic Research Program of China (973 Program, Grant No. 2011CB811406), and the National Natural Science Foundation of China (Grant Nos. 10733020, 10921303, 11003026 and 11078010)
Abstract: An ensemble prediction model of solar proton events (SPEs), combining the information of solar flares and coronal mass ejections (CMEs), is built. In this model, solar flares are parameterized by the peak flux, the duration and the longitude. In addition, CMEs are parameterized by the width, the speed and the measurement position angle. The importance of each parameter for the occurrence of SPEs is estimated by the information gain ratio. We find that the CME width and speed are more informative than the flare’s peak flux and duration. As the physical mechanism of SPEs is not very clear, a hidden naive Bayes approach, which is a probability-based calculation method from the field of machine learning, is used to build the prediction model from the observational data. As is known, SPEs originate from solar flares and/or shock waves associated with CMEs. Hence, we first build two base prediction models using the properties of solar flares and CMEs, respectively. Then the outputs of these models are combined to generate the ensemble prediction model of SPEs. The ensemble prediction model incorporating the complementary information of solar flares and CMEs achieves better performance than each base prediction model taken separately.
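The information gain ratio mentioned in the abstract above can be sketched for a discrete feature as information gain divided by the entropy of the feature itself. The toy data and feature names below are made up for illustration, and the discretization of continuous parameters such as CME speed is assumed to have been done beforehand; the paper's own procedure may differ.

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())

def gain_ratio(feature, labels):
    """Information gain of `feature` about `labels`, divided by the feature's own entropy."""
    n = len(labels)
    groups = {}
    for f, y in zip(feature, labels):
        groups.setdefault(f, []).append(y)
    # Conditional entropy of the labels given the feature value.
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    split_info = entropy(feature)
    return info_gain / split_info if split_info > 0 else 0.0

# Hypothetical toy observations: 1 = SPE occurred, 0 = no SPE.
spe = [1, 1, 0, 0]
cme_speed = ["fast", "fast", "slow", "slow"]       # separates the classes perfectly
flare_duration = ["long", "short", "long", "short"]  # carries no information here

print(gain_ratio(cme_speed, spe), gain_ratio(flare_duration, spe))
```

A feature that splits the events cleanly scores near 1, while one that is independent of the outcome scores 0, which is the sense in which one parameter is "more informative" than another.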
Abstract: A multi-layer adaptive optimizing parameters algorithm is developed for improving least squares support vector machines (LS-SVM), and a military aircraft life-cycle-cost (LCC) intelligent estimation model is proposed based on the improved LS-SVM. The intelligent cost estimation process is divided into three steps in the model. In the first step, a cost-drive-factor needs to be selected, which is significant for cost estimation. In the second step, military aircraft training samples within costs and cost-drive-factor set are obtained by the LS-SVM. Then the model can be used for new type aircraft cost estimation. Chinese military aircraft costs are estimated in the paper. The results show that the costs estimated by the new model are closer to the true costs than those of the traditionally used methods.
Funding: The National Natural Science Foundation of China (60503024, 50634010)
Abstract: A method to compress the training dataset of the Support Vector Machine (SVM), based on the characteristics of the SVM, is proposed. First, the distance between the units in the two training datasets is computed, and then the samples that lie far from the hyperplane are discarded in order to compress the training dataset. The time spent training the SVM with the compressed training dataset is markedly shortened. The experimental results show that the algorithm is effective.
Funding: Supported by the National Natural Science Foundation of China under Grant 52007193 and the 2115 Talent Development Program of China Agricultural University.
Abstract: New energy integration and flexible demand response make smart grid operation scenarios complex and changeable, which brings challenges to network planning. If every possible scenario is considered, solving the planning problem can become extremely time-consuming and difficult. This paper introduces statistical machine learning (SML) techniques to carry out multi-scenario based probabilistic power flow calculations and describes their application to the stochastic planning of distribution networks. The proposed SML includes linear regression, probability distribution, Markov chain, isoprobabilistic transformation, maximum likelihood estimator, stochastic response surface and center point method. Based on the above SML model, capricious weather, photovoltaic power generation, thermal load, power flow and uncertainty programming are simulated. Taking a 33-bus distribution system as an example, this paper compares the stochastic planning model based on SML with the traditional models published in the literature. The results verify that the proposed model greatly improves planning performance while meeting accuracy requirements. The case study also considers a realistic power distribution system operating under stressed conditions.
Funding: Supported by the State Key Laboratory of Alternate Electrical Power System with Renewable Energy Sources under Grant No. LAPS21016, the National Natural Science Foundation of China under Grant 52007193, and the 2115 Talent Development Program of China Agricultural University.
Abstract: The development of distributed renewable energy, such as photovoltaic power and wind power generation, makes the energy system cleaner and is of great significance in reducing carbon emissions. However, weather can affect distributed renewable energy power generation, and the uncertainty of output brings challenges to uncertainty planning for distributed renewable energy. Energy systems with high penetration of distributed renewable energy involve the high-dimensional, nonlinear dynamics of large-scale complex systems, and the optimal solution of the uncertainty model is a difficult problem. From the perspective of statistical machine learning, the theory of planning of distributed renewable energy systems under uncertainty is reviewed, and some key technologies are put forward for applying advanced artificial intelligence to distributed renewable power uncertainty planning.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 60374069) and the Foundation of the Key Laboratory of Complex Systems and Intelligent Science, Institute of Automation, Chinese Academy of Sciences (Grant No. 20060104)
Abstract: This brief paper reports a hybrid algorithm we developed recently to solve the global optimization problems of multimodal functions by combining the advantages of two powerful population-based metaheuristics, differential evolution (DE) and particle swarm optimization (PSO). In the hybrid, denoted DEPSO, each individual in one generation chooses its evolution method, DE or PSO, in a statistical learning way. The choice depends on the relative success ratio of the two methods in a previous learning period. The proposed DEPSO is compared with its PSO and DE parents, two advanced DE variants (one of which is suggested by the originators of DE), two advanced PSO variants (one of which is acknowledged as a recent standard by the PSO community), and also a previous DEPSO. Benchmark tests demonstrate that the DEPSO is more competent for the global optimization of multimodal functions due to its high optimization quality.
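The selection step described in the abstract above might be sketched as follows. The abstract only says that the choice depends on the relative success ratio of the two methods over a previous learning period; the specific rule used here (probability of picking DE equal to its success ratio divided by the sum of both ratios, with a 50/50 fallback when neither method has succeeded) and all function names are assumptions for illustration, not the authors' exact formula.

```python
import random

def success_ratio(successes, trials):
    """Fraction of trials in the learning period that improved the individual."""
    return successes / trials if trials > 0 else 0.0

def de_probability(de_successes, de_trials, pso_successes, pso_trials):
    """Assumed rule: probability of choosing DE from the two relative success ratios."""
    r_de = success_ratio(de_successes, de_trials)
    r_pso = success_ratio(pso_successes, pso_trials)
    total = r_de + r_pso
    return 0.5 if total == 0 else r_de / total

def choose_method(p_de, rng):
    """Each individual independently picks its evolution method for this generation."""
    return "DE" if rng.random() < p_de else "PSO"

# Hypothetical learning period: DE improved 30 of 100 trials, PSO 10 of 100.
p_de = de_probability(30, 100, 10, 100)
rng = random.Random(1)
choices = [choose_method(p_de, rng) for _ in range(1000)]
print(p_de, choices.count("DE"))
```

With these made-up counts, roughly three quarters of the individuals would evolve by DE in the next generation, and the counts would be refreshed after each learning period.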
Funding: Supported, in part, by the General Research Fund of the Hong Kong Government Research Grant Council (17609518), the Early Career Scheme of the Hong Kong Grants Council (28606419), and the National Natural Science Foundation of China (31600903).
Abstract: We examined the neural correlates of the statistical learning of orthographic-semantic connections in Chinese adult learners. Visual event-related potentials (ERPs) were recorded while participants were exposed to a sequence of artificial logographic characters containing semantic radicals carrying low, moderate, or high levels of semantic consistency. The behavioral results showed that the mean accuracy of participants’ recognition of previously exposed characters was 63.1%, which was significantly above chance level (50%), indicating the statistical learning of the regularities of semantic radicals. The ERP data revealed a temporal sequence in the neural process of statistical learning of orthographic-semantic connections, and different brain indexes were found to be associated with this processing, i.e., a clear N170-P200-N400 pattern. For N170, larger negative amplitudes were evoked by the high and moderate consistency than by the low consistency. For P200, the mean amplitudes elicited by the moderate and low consistency were larger than those elicited by the high consistency. In contrast, a larger N400 amplitude was observed in the low than in the moderate and high consistency, and a more negative amplitude was elicited by the moderate than by the high consistency. We propose that the initial potential shifts (N170 and P200) may reflect orthographic or graphic form identification, while the later component (N400) may be associated with semantic information analysis.