In economics, buyers and sellers are usually the two main sides of a market. Game theory can model the decisions behind each “player” and calculate an outcome that benefits both sides. However, the use of game theory is not limited to economics. In this paper, I introduce the mathematical model of the general-sum game, solutions and theorems surrounding game theory, and its real-life applications in many different scenarios.
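As a quick illustration of the general-sum model, the sketch below checks best responses in a 2x2 bimatrix game. The payoff matrices (a Prisoner's Dilemma) and the helper function are invented for this example, not taken from the paper:

```python
import numpy as np

# Hypothetical 2x2 general-sum game (Prisoner's Dilemma payoffs):
# rows = player 1's strategies, cols = player 2's strategies.
# A[i, j] = payoff to player 1, B[i, j] = payoff to player 2.
A = np.array([[3, 0],
              [5, 1]])
B = np.array([[3, 5],
              [0, 1]])

def pure_nash_equilibria(A, B):
    """Return all (i, j) where each player's strategy is a best response."""
    eqs = []
    for i in range(A.shape[0]):
        for j in range(A.shape[1]):
            p1_best = A[i, j] >= A[:, j].max()   # player 1 cannot gain by deviating
            p2_best = B[i, j] >= B[i, :].max()   # player 2 cannot gain by deviating
            if p1_best and p2_best:
                eqs.append((i, j))
    return eqs

equilibria = pure_nash_equilibria(A, B)
print(equilibria)  # the mutual-defection outcome (1, 1)
```

The same best-response check extends to any bimatrix game, though mixed-strategy equilibria require linear-programming or support-enumeration methods.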
Forecasting the movement of the stock market is a long-standing, attractive topic. This paper implements different statistical learning models to predict the movement of the S&P 500 index. The S&P 500 index is influenced by other important financial indexes across the world, such as commodity prices, and by financial technical indicators. This paper systematically investigates four supervised learning models, including Logistic Regression, Gaussian Discriminant Analysis (GDA), Naive Bayes, and Support Vector Machine (SVM), in forecasting the S&P 500 index. After several optimization experiments on features and models, especially SVM kernel selection and per-model feature selection, this paper concludes that an SVM model with a Radial Basis Function (RBF) kernel can achieve an accuracy of 62.51% for the future market trend of the S&P 500 index.
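A minimal sketch of the paper's best-performing configuration, an RBF-kernel SVM classifying next-period direction. The features here are synthetic stand-ins; the paper's actual index and indicator data are not reproduced:

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-in for the paper's features (e.g., lagged index returns,
# commodity prices, technical indicators).
X = rng.normal(size=(500, 4))
y = (X[:, 0] + 0.5 * X[:, 1] + 0.2 * rng.normal(size=500) > 0).astype(int)  # 1 = up

X_train, X_test = X[:400], X[400:]
y_train, y_test = y[:400], y[400:]

# RBF-kernel SVM, with feature standardization before the kernel.
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
clf.fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)
print(f"test accuracy: {accuracy:.2f}")
```

On real market data the attainable accuracy is far lower (the paper reports 62.51%), because the signal-to-noise ratio of financial returns is much worse than in this toy setup.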
Forecasting stock returns is extremely challenging in general, and the task becomes even more difficult given the turbulent nature of the Chinese stock market. We address the stock selection process as a statistical learning problem and build cross-sectional forecast models to select individual stocks in the Shanghai Composite Index. Decile portfolios are formed according to rankings of the forecasted future cumulative returns. The equity-market-neutral portfolio, formed by buying the top decile portfolio and selling short the bottom decile portfolio, exhibits superior performance relative to, and a low correlation with, the Shanghai Composite Index. To make our strategy more useful to practitioners, we evaluate the proposed stock selection strategy's performance by allowing only long positions, and by investing only in A-share stocks to incorporate the restrictions of the Chinese stock market. The long-only strategies still generate robust, superior performance compared with the Shanghai Composite Index. A close examination of the coefficients of the features provides further insight into the changes in market dynamics from period to period.
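The decile construction can be sketched as follows; the forecasts and realized returns are randomly generated stand-ins, not results from the paper's models:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)

# Hypothetical forecasted cumulative returns for 100 stocks (stand-in data).
forecasts = pd.Series(rng.normal(0, 0.05, size=100),
                      index=[f"stock_{i:03d}" for i in range(100)])

# Rank stocks and assign them to 10 decile portfolios (0 = lowest forecast).
deciles = pd.qcut(forecasts.rank(method="first"), 10, labels=False)

top = forecasts.index[(deciles == 9).to_numpy()]     # long leg: top decile
bottom = forecasts.index[(deciles == 0).to_numpy()]  # short leg: bottom decile

# Realized returns (stand-in); the market-neutral return is long top, short bottom.
realized = pd.Series(rng.normal(0.01, 0.03, size=100), index=forecasts.index)
long_short = realized[top].mean() - realized[bottom].mean()
print(f"{len(top)} long, {len(bottom)} short, spread {long_short:.4f}")
```

In a real backtest, portfolios are rebalanced each period and the long-short spread is compounded over time, with transaction costs and short-selling constraints applied.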
We examined the neural correlates of the statistical learning of orthographic-semantic connections in Chinese adult learners. Visual event-related potentials (ERPs) were recorded while participants were exposed to a sequence of artificial logographic characters containing semantic radicals carrying low, moderate, or high levels of semantic consistency. The behavioral results showed that the mean accuracy of participants' recognition of previously exposed characters was 63.1%, significantly above the chance level (50%), indicating statistical learning of the regularities of semantic radicals. The ERP data revealed a temporal sequence in the neural processing of the statistical learning of orthographic-semantic connections, and different brain indexes were found to be associated with this processing, i.e., a clear N170-P200-N400 pattern. For the N170, larger negative amplitudes were evoked by the high- and moderate-consistency conditions than by the low-consistency condition. For the P200, the mean amplitudes elicited by the moderate- and low-consistency conditions were larger than those for the high-consistency condition. In contrast, a larger N400 amplitude was observed for the low-consistency than for the moderate- and high-consistency conditions, and a more negative amplitude was elicited by the moderate- than by the high-consistency condition. We propose that the initial potential shifts (N170 and P200) may reflect orthographic or graphic form identification, while the later component (N400) may be associated with semantic information analysis.
To synthesize real-time, realistic facial animation, we present an effective algorithm that combines image- and geometry-based methods for facial animation simulation. Considering the numerous motion units in the expression coding system, we present a novel simplified motion unit based on the basic facial expressions, and construct the corresponding basic actions for a head model. As image features are difficult to obtain using the performance-driven method, we develop an automatic image feature recognition method based on statistical learning, and a semi-automatic expression image labeling method with rotation-invariant face detection, which improve the accuracy and efficiency of expression feature identification and training. After facial animation redirection, each basic action weight needs to be computed and mapped automatically. We apply the blend shape method to construct and train the corresponding expression database for each basic action, and adopt the least squares method to compute the corresponding control parameters for facial animation. Moreover, we pre-integrate the diffuse and specular light distributions using a physically based method to improve the plausibility and efficiency of facial rendering. Our work simplifies the facial motion unit, optimizes the statistical training and recognition processes for facial animation, solves for the expression parameters, and simulates the subsurface scattering effect in real time. Experimental results indicate that our method is effective, efficient, and suitable for computer animation and interactive applications.
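The least-squares step for the blend shape control parameters can be sketched as follows, with made-up matrix sizes standing in for a real head model's vertex data:

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical blend shape setup: each column of B is one basic action's
# vertex-displacement vector (n_vertices * 3, flattened); sizes are invented.
n_coords, n_actions = 300, 6
B = rng.normal(size=(n_coords, n_actions))

# A target expression synthesized from known weights plus capture noise.
true_w = np.array([0.8, 0.0, 0.3, 0.0, 0.5, 0.1])
target = B @ true_w + 0.01 * rng.normal(size=n_coords)

# Least-squares solve for the control parameters (blend shape weights).
w, residuals, rank, _ = np.linalg.lstsq(B, target, rcond=None)
print(np.round(w, 2))
```

In practice the solve is often constrained (weights clamped to [0, 1]) and run per frame against tracked feature positions rather than full vertex displacements.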
Statistical relational learning constructs statistical models from relational databases, combining relational learning and statistical learning. Its strong capabilities and special properties have made statistical relational learning one of the important areas of machine learning research. In this paper, the general concepts and characteristics of statistical relational learning are presented first. Then some major branches of this newly emerging field are discussed, including logic- and rule-based approaches, frame- and object-oriented approaches, and functional-programming-based approaches. After that, several methods of applying rough sets in statistical relational learning are described, such as gRS-ILP and VPRSILP. Finally, some applications of statistical relational learning are briefly introduced, and some future directions of statistical relational learning and of the application of rough sets in this area are pointed out.
Nitrogen (N) monitoring is essential in nurseries to ensure the production of high-quality seedlings. Near-infrared spectroscopy (NIRS) is an instantaneous, nondestructive method to monitor N. Spectral data such as NIRS can also provide the basis for developing a new vegetation spectral index (VSI). Here, we evaluated whether NIRS combined with statistical modeling can accurately detect early variations in N concentration in leaves of young plants of Annona emarginata, and we developed a new VSI for this task. Plants were grown in a hydroponics system with 0, 2.75, 5.5, or 11 mM N for 45 days. We then measured gas exchange, chlorophyll a fluorescence, and pigments in leaves; analyzed complete leaf nutrients; and recorded spectral data for leaves at 966 to 1685 nm using NIRS. With a statistical learning approach, the dimensionality of the spectral data was reduced, and models were then generated using two classes (N deficiency, N) or four classes (0, 2.75, 5.5, 11 mM N). The best combination of techniques for dimensionality reduction and classification, respectively, was stepwise regression (PROC STEPDISC) and a linear discriminant function. It was possible to detect N deficiency in seedling leaves with 100% precision, and the four N concentrations with 93.55% accuracy, before photosynthetic damage to the plant occurred. Thus, NIRS combined with statistical modeling of multidimensional data is effective for detecting N variations in seedling leaves of A. emarginata.
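A hedged sketch of the classification stage: a linear discriminant function applied to dimensionality-reduced spectra, with synthetic two-class data standing in for the real leaf measurements (the paper's stepwise reduction step is not reproduced):

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(1)

# Stand-in for dimensionality-reduced NIR spectra: 3 retained bands per leaf,
# 40 N-deficient and 40 N-sufficient samples (real spectra span 966-1685 nm).
X_def = rng.normal(loc=[0.20, 0.50, 0.10], scale=0.05, size=(40, 3))
X_suf = rng.normal(loc=[0.35, 0.40, 0.25], scale=0.05, size=(40, 3))
X = np.vstack([X_def, X_suf])
y = np.array([0] * 40 + [1] * 40)  # 0 = N deficiency, 1 = sufficient N

# Linear discriminant function, the classifier the study found to work best.
lda = LinearDiscriminantAnalysis().fit(X, y)
train_acc = lda.score(X, y)
print(f"training accuracy: {train_acc:.2f}")
```

With real spectra, accuracy should be reported on held-out samples (cross-validation), as training accuracy is optimistic.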
Online banking fraud occurs whenever a criminal can seize accounts and transfer funds from an individual's online bank account. Successfully preventing this requires the detection of as many fraudsters as possible without producing too many false alarms. This is a challenge for machine learning owing to the extremely imbalanced data and the complexity of fraud. In addition, classical machine learning methods must be extended to minimize expected financial losses. Finally, fraud can only be combated systematically and economically if the risks and costs in payment channels are known. We define three models that overcome these challenges: machine learning-based fraud detection, economic optimization of machine learning results, and a risk model that predicts the risk of fraud while considering countermeasures. The models were tested on real data. Our machine learning model alone reduces the expected and unexpected losses in the three aggregated payment channels by 15% compared with a benchmark consisting of static if-then rules. Optimizing the machine learning model further reduces the expected losses by 52%. These results hold at a low false positive rate of 0.4%. Thus, the risk framework of the three models is viable from a business and risk perspective.
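One simple way to couple a fraud score with expected financial losses, in the spirit of the economic optimization described above, is to pick the decision threshold that minimizes expected cost. The score distributions and per-case costs below are invented for illustration, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(3)

# Stand-in fraud scores: ~1% fraud rate, fraudsters score higher on average.
n = 10_000
is_fraud = rng.random(n) < 0.01
scores = np.where(is_fraud, rng.beta(5, 2, n), rng.beta(2, 5, n))

# Assumed per-case costs: a missed fraud loses far more than a false alarm.
COST_FN = 1000.0  # average loss when a fraud slips through
COST_FP = 10.0    # cost of investigating a legitimate transaction

def expected_loss(threshold):
    flagged = scores >= threshold
    fn = np.sum(is_fraud & ~flagged)   # frauds not flagged
    fp = np.sum(~is_fraud & flagged)   # legitimate transactions flagged
    return fn * COST_FN + fp * COST_FP

thresholds = np.linspace(0.0, 1.0, 101)
losses = np.array([expected_loss(t) for t in thresholds])
best_t = thresholds[losses.argmin()]
print(f"loss-minimizing threshold: {best_t:.2f}")
```

The cost-minimizing threshold is typically far from 0.5 under severe class imbalance, which is why an accuracy-optimal classifier is not loss-optimal.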
Statistical learning theory is a theory of small-sample statistics, and the support vector machine is a machine learning method based on it. The support vector machine not only addresses problems that affect many learning methods, such as small samples, overfitting, high dimensionality, and local minima, but also has higher generalization (forecasting) ability than artificial neural networks. Strong earthquakes in the Chinese mainland are related, to a certain extent, to the intense seismicity along the main plate boundaries of the world; however, the relation is nonlinear. In this paper, we study this unclear relation with the support vector machine method for the purpose of forecasting strong earthquakes in the Chinese mainland.
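A small-sample SVM workflow can be sketched with leave-one-out cross-validation, which suits scarce data; the features and labels below are synthetic stand-ins, not seismicity data:

```python
import numpy as np
from sklearn.model_selection import LeaveOneOut, cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(5)

# Small synthetic sample (30 cases), mimicking the small-sample regime where
# statistical learning theory motivates SVMs; features stand in for
# plate-boundary seismicity indicators, labels for strong-quake occurrence.
X = rng.normal(size=(30, 3))
y = (X[:, 0] + X[:, 1] ** 2 > 1).astype(int)  # a nonlinear rule, as in the paper

# Leave-one-out cross-validation: a natural fit when data are scarce.
clf = SVC(kernel="rbf", C=1.0, gamma="scale")
loo_acc = cross_val_score(clf, X, y, cv=LeaveOneOut()).mean()
print(f"leave-one-out accuracy: {loo_acc:.2f}")
```

Leave-one-out makes maximal use of every observation, at the cost of fitting the model once per sample.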
New energy integration and flexible demand response make smart grid operation scenarios complex and changeable, which brings challenges to network planning. If every possible scenario is considered, solving the planning problem can become extremely time-consuming and difficult. This paper introduces statistical machine learning (SML) techniques to carry out multi-scenario probabilistic power flow calculations and describes their application to the stochastic planning of distribution networks. The proposed SML toolbox includes linear regression, probability distributions, Markov chains, isoprobabilistic transformation, maximum likelihood estimation, stochastic response surfaces, and the center point method. Based on this SML model, capricious weather, photovoltaic power generation, thermal load, power flow, and uncertainty programming are simulated. Taking a 33-bus distribution system as an example, this paper compares the SML-based stochastic planning model with traditional models published in the literature. The results verify that the proposed model greatly improves planning performance while meeting accuracy requirements. The case study also considers a realistic power distribution system operating under stressed conditions.
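One of the listed building blocks, a Markov chain for capricious weather feeding a Monte Carlo estimate of photovoltaic output, can be sketched as follows; the transition probabilities and per-state PV factors are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(11)

# Two-state Markov chain for weather (0 = sunny, 1 = overcast); made-up rates.
P = np.array([[0.8, 0.2],
              [0.4, 0.6]])

def simulate_weather(days, p=P):
    """Sample a weather-state path of the given length, starting sunny."""
    states = [0]
    for _ in range(days - 1):
        states.append(rng.choice(2, p=p[states[-1]]))
    return np.array(states)

# Monte Carlo sketch of probabilistic PV output over many 30-day scenarios:
# assumed capacity factor 1.0 when sunny, 0.3 when overcast.
scenarios = np.array([
    np.where(simulate_weather(30) == 0, 1.0, 0.3).mean()
    for _ in range(500)
])
print(f"PV capacity factor: mean {scenarios.mean():.2f}, "
      f"5th-95th pct [{np.quantile(scenarios, 0.05):.2f}, "
      f"{np.quantile(scenarios, 0.95):.2f}]")
```

The resulting scenario distribution, rather than a single forecast, is what feeds a probabilistic power flow calculation.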
The development of distributed renewable energy, such as photovoltaic and wind power generation, makes the energy system cleaner and is of great significance in reducing carbon emissions. However, weather affects distributed renewable power generation, and the resulting output uncertainty brings challenges to planning for distributed renewable energy. Energy systems with high penetration of distributed renewables involve the high-dimensional, nonlinear dynamics of large-scale complex systems, and finding the optimal solution of the uncertainty model is a difficult problem. From the perspective of statistical machine learning, the theory of planning distributed renewable energy systems under uncertainty is reviewed, and some key technologies are put forward for applying advanced artificial intelligence to distributed renewable power uncertainty planning.
With the advent of the Internet of Things (IoT), the need to study new materials and devices for various applications is increasing. Traditionally, we build compact models for transistors on the basis of physics, but physical models are expensive and take a very long time to adjust for non-ideal effects. When the intended applications of a novel device are uncertain or its manufacturing process is immature, deriving generalized, accurate physical models is very strenuous, whereas statistical modeling is a potential alternative because of its data-oriented nature and fast implementation. In this paper, a classical statistical regression method, LASSO, is used to model the I-V characteristics of a CNT-FET, and a pseudo-PMOS inverter simulation based on the trained model is implemented in Cadence. The normalized relative mean square prediction error of the trained model against experimental sample data, together with the simulation results, shows that the model is acceptable for digital circuit static simulation. Such a modeling methodology can extend to general devices.
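A hedged sketch of the LASSO modeling step: fitting a sparse polynomial to I-V-style data. The device equation below is a generic square-law stand-in, not measured CNT-FET data:

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(9)

# Stand-in I-V data for a transistor-like device (not real CNT-FET measurements):
# drain current as a smooth function of gate and drain voltages plus small noise.
vg = rng.uniform(0.0, 1.2, 400)
vd = rng.uniform(0.0, 1.2, 400)
i_d_uA = np.maximum(vg - 0.4, 0.0) ** 2 * vd + 1e-3 * rng.normal(size=400)  # microamps

X = np.column_stack([vg, vd])

# LASSO on polynomial features: the L1 penalty prunes unneeded terms, yielding a
# compact data-driven model in place of a hand-derived physical one.
model = make_pipeline(PolynomialFeatures(degree=4),
                      Lasso(alpha=1e-4, max_iter=50_000))
model.fit(X, i_d_uA)
r2 = model.score(X, i_d_uA)
print(f"training R^2: {r2:.3f}")
```

Working in microamps keeps the regularization strength well-scaled relative to the targets; with raw ampere values, `alpha` would need rescaling.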
Data scarcity is a major obstacle to high-resolution mapping of permafrost on the Tibetan Plateau (TP). This study produces a new permafrost stability distribution map for the 2010s (2005-2015) derived from the predicted mean annual ground temperature (MAGT) at the depth of zero annual amplitude (10-25 m), by integrating remotely sensed freezing degree-days and thawing degree-days, snow cover days, leaf area index, soil bulk density, high-accuracy soil moisture data, and in situ MAGT measurements from 237 boreholes on the TP, using an ensemble learning method that employs a support vector regression model on distance-blocked, resampled training data with 200 repetitions. Validation of the new permafrost map indicates that it is probably the most accurate of all currently available maps. This map shows that the total area of permafrost on the TP, excluding glaciers and lakes, is approximately 115.02 (105.47-129.59) × 10^4 km^2. The areas corresponding to the very stable, stable, semi-stable, transitional, and unstable types are 0.86 × 10^4, 9.62 × 10^4, 38.45 × 10^4, 42.29 × 10^4, and 23.80 × 10^4 km^2, respectively. This new map provides a fundamental baseline for engineering planning and design, ecosystem management, and evaluation of future permafrost change on the TP.
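The resampling-based ensemble of support vector regression models can be sketched as follows, with synthetic stand-in borehole features and far fewer repetitions than the paper's 200:

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(13)

# Stand-in borehole data (not the paper's 237 TP boreholes): features such as
# freezing/thawing degree-days, snow days, and LAI; target is MAGT in deg C.
n = 200
X = rng.normal(size=(n, 4))
magt = -1.5 + 0.8 * X[:, 0] - 0.5 * X[:, 1] + 0.1 * rng.normal(size=n)

# Ensemble of SVR models, each trained on a resampled subset, then averaged,
# echoing the paper's resampling scheme (20 repetitions here for speed; the
# paper also blocks the resampling by distance, which is omitted).
preds = []
for _ in range(20):
    idx = rng.choice(n, size=int(0.8 * n), replace=False)  # resampled subset
    svr = SVR(kernel="rbf", C=10.0).fit(X[idx], magt[idx])
    preds.append(svr.predict(X))
ensemble_pred = np.mean(preds, axis=0)

rmse = np.sqrt(np.mean((ensemble_pred - magt) ** 2))
print(f"ensemble RMSE: {rmse:.3f} deg C")
```

Averaging over resampled fits reduces the variance of the prediction, which matters most when, as here, training data are scarce relative to the area being mapped.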
Mobility data based on global positioning system (GPS) tracking have been widely used in many areas, such as analyzing travel patterns, investigating transport safety and efficiency, and evaluating travel impacts. Transport modes are essential factors in understanding mobility within the transport system, and a significant number of algorithms have therefore been tested for transport mode detection. However, no conclusive recommendations can be drawn regarding which method should be used, and the evaluation of algorithm performance has not been discussed systematically in the current literature. This paper aims to provide an in-depth review of the methods applied in transport mode detection based on GPS tracking data. The performances of the reviewed methods are then compared and evaluated to provide guidance in choosing algorithms for transport mode detection based on GPS tracking data. The results indicate that the majority of current studies rely on supervised learning for transport mode detection. Many of the reviewed methods first require manual dataset labeling, which can introduce major drawbacks such as inefficiency and human error. It was also found that deep learning approaches have the potential to deal with large amounts of unlabeled raw GPS data and to increase the accuracy and efficiency of transport mode detection.
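A minimal sketch of rule-based transport mode detection from GPS speed features; the thresholds and mode set are illustrative assumptions, not drawn from any reviewed study:

```python
import numpy as np

def extract_features(speeds_mps):
    """Per-trip features: mean speed, max speed, mean |acceleration| (1 Hz fixes assumed)."""
    s = np.asarray(speeds_mps, dtype=float)
    accel = np.abs(np.diff(s))
    return np.array([s.mean(), s.max(), accel.mean()])

def classify_mode(features):
    """Toy decision rule on speed features; thresholds in m/s are invented."""
    mean_v, max_v, _ = features
    if max_v < 3.0:
        return "walk"
    if max_v < 8.0:
        return "bike"
    return "motorized"

walk_trip = [1.1, 1.3, 1.2, 1.4, 1.2]   # stand-in 1 Hz speed trace
car_trip = [5.0, 12.0, 16.0, 14.0, 9.0]
print(classify_mode(extract_features(walk_trip)))
print(classify_mode(extract_features(car_trip)))
```

Supervised methods in the reviewed literature replace the hand-set thresholds with a classifier trained on labeled trips, which is exactly where the manual-labeling burden discussed above arises.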
IP geolocation determines the geographical location of Internet hosts by their IP addresses. It is widely used in targeted advertising, online fraud detection, cyber-attack attribution, and so on, and has gained more attention in recent years as more and more physical devices are connected to cyberspace. Most geolocation methods cannot achieve good accuracy for devices with few landmarks nearby. In this paper, we propose a novel geolocation approach based on common routers as secondary landmarks (Common Routers-based Geolocation, CRG). We discover a large number of common routers through topology discovery among web server landmarks. We use statistical learning to study the localized (delay, hop)-distance correlation and locate these common routers. We determine the accurate positions of the common routers and convert them into secondary landmarks to improve the feasibility of our geolocation system in areas where landmarks are sparsely distributed. We manage to improve the geolocation accuracy and decrease the maximum geolocation error compared with one of the state-of-the-art geolocation methods. At the end of this paper, we discuss why our method is effective and outline future research.
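The localized delay-distance learning step can be sketched as a simple regression; the measurements below are synthetic stand-ins, not real probe data:

```python
import numpy as np

rng = np.random.default_rng(17)

# Stand-in landmark measurements: round-trip delay in ms versus great-circle
# distance in km, with an assumed localized linear trend (~100 km per ms)
# plus measurement noise.
dist_km = rng.uniform(50, 2000, 120)
delay_ms = 5.0 + dist_km / 100.0 + rng.normal(0, 2.0, 120)

# Fit the localized delay-distance correlation by least squares.
slope, intercept = np.polyfit(delay_ms, dist_km, 1)

# Estimate the distance to a target from a new delay measurement.
target_delay = 12.0
est_dist = slope * target_delay + intercept
print(f"estimated distance: {est_dist:.0f} km")
```

In a full system such distance estimates from several vantage points are combined (e.g., by multilateration) to produce a position, and hop counts provide a second correlated feature.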
Funding (ERP study of orthographic-semantic statistical learning): supported in part by the General Research Fund of the Hong Kong Government Research Grant Council (17609518), the Early Career Scheme of the Hong Kong Grants Council (28606419), and the National Natural Science Foundation of China (31600903).
Funding (facial animation study): supported by the 2013 Annual Beijing Technological and Cultural Fusion for Demonstrated Base Construction and Industrial Nurture (No. Z131100000113007) and the National Natural Science Foundation of China (Nos. 61202324, 61271431, and 61271430).
Funding (NIRS nitrogen study): a scholarship from Capes (Coordenação de Aperfeiçoamento de Pessoal de Nível Superior), Brazil (award number 001), for the first author.
Funding (online banking fraud study): this research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.
Funding (strong earthquake forecasting study): Joint Seismological Science Foundation of China (104090).
基金supported by the National Natural Science Foundation of China under Grant 52007193 and The 2115 Talent Development Program of China Agricultural University.
Abstract: New energy integration and flexible demand response make smart grid operation scenarios complex and changeable, which brings challenges to network planning. If every possible scenario is considered, the planning solution can become extremely time-consuming and difficult. This paper introduces statistical machine learning (SML) techniques to carry out multi-scenario probabilistic power flow calculations and describes their application to the stochastic planning of distribution networks. The proposed SML includes linear regression, probability distributions, Markov chains, isoprobabilistic transformation, maximum likelihood estimation, stochastic response surfaces, and the center point method. Based on this SML model, capricious weather, photovoltaic power generation, thermal load, power flow, and uncertainty programming are simulated. Taking a 33-bus distribution system as an example, this paper compares the SML-based stochastic planning model with traditional models published in the literature. The results verify that the proposed model greatly improves planning performance while meeting accuracy requirements. The case study also considers a realistic power distribution system operating under stressed conditions.
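The probabilistic power flow idea can be reduced to a toy Monte Carlo sketch: sample weather-driven photovoltaic output and load from assumed distributions, propagate them through a (here trivially simple) flow quantity, and estimate the probability of exceeding a limit. The distributions, parameters, and feeder limit below are illustrative assumptions, not values from the paper's 33-bus study.

```python
import numpy as np

rng = np.random.default_rng(2)

# Monte Carlo sketch of one probabilistic power-flow quantity on a single feeder.
n = 50_000
pv = rng.beta(2, 5, n) * 3.0        # PV injection in MW, weather-driven (assumed)
load = rng.normal(2.5, 0.4, n)      # thermal load in MW (assumed)
net_flow = load - pv                # net power drawn through the feeder

limit = 3.2                         # hypothetical feeder thermal limit, MW
p_violation = np.mean(np.abs(net_flow) > limit)
mean_flow = net_flow.mean()
print(mean_flow, p_violation)
```

A full SML pipeline replaces the brute-force sampling with the surrogate techniques listed in the abstract (stochastic response surfaces, isoprobabilistic transformation) to get the same distributional answers far more cheaply.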
基金supported by the State Key Laboratory of Alternate Electrical Power System with Renewable Energy Sources under Grant No.LAPS21016the National Natural Science Foundation of China under Grant 52007193the 2115 Talent Development Program of China Agricultural University.
Abstract: The development of distributed renewable energy, such as photovoltaic and wind power generation, makes the energy system cleaner and is of great significance in reducing carbon emissions. However, weather affects distributed renewable power generation, and the uncertainty of its output brings challenges to planning under uncertainty. Energy systems with a high penetration of distributed renewable energy involve the high-dimensional, nonlinear dynamics of large-scale complex systems, and finding the optimal solution of the uncertainty model is difficult. From the perspective of statistical machine learning, this paper reviews the theory of planning distributed renewable energy systems under uncertainty and puts forward some key technologies for applying advanced artificial intelligence to this problem.
Abstract: With the advent of the Internet of Things (IoT), the need to study new materials and devices for various applications is increasing. Traditionally, compact models for transistors are built on the basis of physics, but physical models are expensive and need a long time to adjust for non-ideal effects. When the application of a novel device is not yet certain or its manufacturing process is not mature, deriving generalized, accurate physical models is very strenuous, whereas statistical modeling is a potential alternative because of its data-oriented nature and fast implementation. In this paper, a classical statistical regression method, LASSO, is used to model the I-V characteristics of a CNT-FET, and a pseudo-PMOS inverter simulation based on the trained model is implemented in Cadence. The normalized relative mean square prediction error of the trained model against experimental sample data, together with the simulation results, shows that the model is acceptable for static digital circuit simulation, and the modeling methodology can extend to general devices.
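The abstract's modeling step can be sketched as follows: expand the terminal voltage into polynomial features, fit an L1-penalized (LASSO) regression to measured current, and report a normalized relative mean square error. The cubic ground truth, noise level, and penalty strength are illustrative assumptions standing in for the paper's CNT-FET measurements.

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(3)

# Synthetic I-V samples standing in for measured CNT-FET data (assumed form).
v = rng.uniform(0.0, 1.2, 200).reshape(-1, 1)
i_true = 0.8 * v**3 - 0.3 * v**2 + 0.1 * v
i_meas = i_true + rng.normal(0, 0.005, v.shape)

# Polynomial basis + L1 penalty: LASSO keeps only the useful terms.
X = PolynomialFeatures(degree=6, include_bias=False).fit_transform(v)
model = Lasso(alpha=1e-4, max_iter=50_000).fit(X, i_meas.ravel())

# Normalized relative mean square prediction error, as in the abstract.
pred = model.predict(X)
nrmse = np.mean((pred - i_meas.ravel()) ** 2) / np.mean(i_meas**2)
print(nrmse)
```

The sparsity of the fitted coefficients is the practical payoff: a compact polynomial I-V model that a circuit simulator can evaluate quickly, without a physics-based derivation.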
基金supported by the Strategic Priority Research Program of the Chinese Academy of Sciences(Grant No.XDA19070204)the National Natural Science Foundation of China(Grant Nos.42071421,41630856)。
Abstract: Data scarcity is a major obstacle to high-resolution mapping of permafrost on the Tibetan Plateau (TP). This study produces a new permafrost stability distribution map for the 2010s (2005–2015) derived from the predicted mean annual ground temperature (MAGT) at the depth of zero annual amplitude (10–25 m) by integrating remotely sensed freezing degree-days and thawing degree-days, snow cover days, leaf area index, soil bulk density, high-accuracy soil moisture data, and in situ MAGT measurements from 237 boreholes on the TP, using an ensemble learning method that employs a support vector regression model based on distance-blocked resampled training data with 200 repetitions. Validation indicates that the new permafrost map is probably the most accurate of all currently available maps. It shows that the total area of permafrost on the TP, excluding glaciers and lakes, is approximately 115.02 (105.47–129.59)×10^4 km^2. The areas of the very stable, stable, semi-stable, transitional, and unstable types are 0.86×10^4, 9.62×10^4, 38.45×10^4, 42.29×10^4, and 23.80×10^4 km^2, respectively. This new map is of fundamental importance as a baseline for engineering planning and design, ecosystem management, and evaluation of future permafrost change on the TP.
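The ensemble described above can be sketched as repeated support vector regressions on resampled training sets with averaged predictions. For brevity this sketch uses a plain bootstrap rather than the paper's distance-blocked resampling, 20 repetitions rather than 200, and synthetic features standing in for the borehole predictors.

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(4)

# Synthetic stand-in for borehole data: features (e.g. freezing/thawing
# degree-days, snow days) predicting mean annual ground temperature (MAGT).
n = 150
X = rng.normal(size=(n, 4))
y = X @ np.array([1.0, -0.5, 0.3, 0.0]) + rng.normal(0, 0.1, n)

# Ensemble of SVR models on resampled training sets, predictions averaged.
preds = []
for _ in range(20):
    idx = rng.integers(0, n, n)                    # bootstrap resample
    m = SVR(kernel="rbf", C=10.0).fit(X[idx], y[idx])
    preds.append(m.predict(X))
ensemble_pred = np.mean(preds, axis=0)

rmse = np.sqrt(np.mean((ensemble_pred - y) ** 2))
print(rmse)
```

Averaging over resampled fits reduces the variance of any single SVR, which matters precisely in the data-scarce regime the abstract emphasizes.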
基金the financial supported by the Swedish Energy Agency (project no. 46068-1)
Abstract: Mobility data based on global positioning system (GPS) tracking have been widely used in many areas, such as analyzing travel patterns, investigating transport safety and efficiency, and evaluating travel impacts. Transport modes are essential factors in understanding mobility within the transport system, and a significant number of algorithms have been tested for transport mode detection. However, no conclusive recommendations exist regarding which method should be used, and the performance of these algorithms has not been evaluated systematically in the current literature. This paper provides an in-depth review of the methods applied to transport mode detection based on GPS tracking data. The performance of the reviewed methods is then compared and evaluated to provide guidance in choosing algorithms. The results indicate that the majority of current studies are based on supervised learning. Many of the reviewed methods first require manual dataset labeling, which has major drawbacks such as inefficiency and human error. It was also found that deep learning approaches have the potential to deal with large amounts of unlabeled raw GPS data and to increase the accuracy and efficiency of transport mode detection.
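A minimal sketch of the supervised approach the review finds most common: derive speed features from GPS traces and train a classifier on manually labeled trips. The feature values and the three-mode setup are illustrative assumptions, not data from any reviewed study.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(5)

# Synthetic GPS-derived trip features; values are illustrative, not calibrated.
def make_trips(mean_speed, speed_std, label, n=60):
    feats = np.column_stack([
        rng.normal(mean_speed, 1.0, n),   # mean trip speed (m/s)
        rng.normal(speed_std, 0.3, n),    # speed variability
    ])
    return feats, np.full(n, label)

walk_X, walk_y = make_trips(1.4, 0.5, 0)    # walking
bike_X, bike_y = make_trips(4.5, 1.0, 1)    # cycling
car_X, car_y = make_trips(12.0, 3.0, 2)     # driving

X = np.vstack([walk_X, bike_X, car_X])
y = np.concatenate([walk_y, bike_y, car_y])

clf = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
acc = clf.score(X, y)
print(acc)
```

The manual-labeling bottleneck the review criticizes is visible even here: every trip needs a ground-truth mode label before the classifier can be trained at all.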
Abstract: IP geolocation determines the geographical location of Internet hosts from their IP addresses. It is widely used in targeted advertising, online fraud detection, cyber-attack attribution, and so on, and has gained much more attention in recent years as more and more physical devices are connected to cyberspace. Most geolocation methods cannot achieve good accuracy for devices with few landmarks nearby. In this paper, we propose a novel geolocation approach based on common routers as secondary landmarks (Common Routers-based Geolocation, CRG). We discover a large number of common routers through topology discovery among web server landmarks, use statistical learning to study the localized (delay, hop)-distance correlation, and locate these common routers accurately, converting them into secondary landmarks that improve the feasibility of our geolocation system in areas where landmarks are sparsely distributed. We improve geolocation accuracy and decrease the maximum geolocation error compared to one of the state-of-the-art geolocation methods. At the end of the paper, we discuss why the method is effective and outline directions for future research.
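The localized delay-distance learning step can be sketched with a simple linear fit: calibrate a delay-to-distance model from landmark probes, then use it to bound the distance of a router observed at a given delay. The delay model, noise level, and measured delay below are synthetic assumptions (the paper's method also uses hop counts, omitted here for brevity).

```python
import numpy as np

rng = np.random.default_rng(6)

# Synthetic landmark probes: distance to each landmark and the measured delay.
dist_km = rng.uniform(10, 800, 50)                    # landmark distances (km)
delay_ms = 0.02 * dist_km + 5 + rng.normal(0, 2, 50)  # assumed delay model + noise

# Least-squares fit of distance as a function of delay (localized correlation).
slope, intercept = np.polyfit(delay_ms, dist_km, 1)

# Estimate the distance of a common router observed at a hypothetical delay.
observed_delay = 12.0  # ms
est_dist = slope * observed_delay + intercept
print(est_dist)
```

Once such routers are positioned, they serve as the secondary landmarks that densify coverage in regions where web-server landmarks are sparse.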