Abstract: COVID-19 pandemic restrictions limited all social activities to curtail the spread of the virus. Among the sectors most affected were schools, colleges, and universities. The education systems of entire nations shifted to online education during this time. Many shortcomings of Learning Management Systems (LMSs) in supporting online education were detected, which spawned research into Artificial Intelligence (AI) based tools that the research community is developing to improve the effectiveness of LMSs. This paper presents a detailed survey of the different enhancements to LMSs, led by key advances in AI, that improve the real-time and non-real-time user experience. The AI-based enhancements proposed for LMSs start from the Application layer and Presentation layer, in the form of flipped classroom models for an efficient learning environment and appropriately designed UI/UX for efficient utilization of LMS utilities and resources, including AI-based chatbots. Session layer enhancements are also required, such as AI-based online proctoring and user authentication using biometrics. These extend to the Transport layer to support real-time, rate-adaptive encrypted video transmission for user security/privacy and satisfactory operation of the AI algorithms. Support is also needed from the Networking layer for IP-based geolocation features, the Virtual Private Network (VPN) feature, and Software-Defined Networks (SDN) for optimum Quality of Service (QoS). Finally, the non-real-time user experience is enhanced by other AI-based enhancements such as plagiarism detection algorithms and data analytics.
Abstract: Zambia, like many other countries in most African regions, is still grappling with the dynamics of harnessing technology for the betterment of higher education. The onset of the COVID-19 pandemic tested the preparedness of Zambian Higher Education Institutions (HEIs) to harness technology for pedagogical activities. While countries worldwide switched to electronic learning during the pandemic, the same could not be said for Zambian HEIs, which struggled to conduct pedagogical activities on learning management platforms. This study investigated the factors affecting the implementation and assessment of learning management systems (LMSs) in Zambia’s HEIs. Focusing on four key assessment areas of an LMS, namely 1) system features, 2) compliance with regulatory standards, 3) quality of service, and 4) technology acceptance, this article proposes a model for assessing learning management systems in Zambian HEIs. To test the proposed model, a software tool was also developed.
Funding: Supported by the Royal Academy of Engineering and the Office of the Chief Science Adviser for National Security under the UK Intelligence Community Postdoctoral Research Fellowship programme.
Abstract: Safety-critical control is often trained in a simulated environment to mitigate risk. Subsequent migration of the biased controller requires further adjustments. In this paper, an experience inference human-behavior learning approach is proposed to solve the migration problem of optimal controllers applied to real-world nonlinear systems. The approach is inspired by the complementary properties exhibited by the hippocampus, the neocortex, and the striatum learning systems located in the brain. The hippocampus defines a physics-informed reference model of the real-world nonlinear system for experience inference, and the neocortex is the adaptive dynamic programming (ADP) or reinforcement learning (RL) algorithm that ensures optimal performance of the reference model. This optimal performance is inferred to the real-world nonlinear system by means of an adaptive neocortex/striatum control policy that forces the nonlinear system to behave as the reference model. Stability and convergence of the proposed approach are analyzed using Lyapunov stability theory. Simulation studies are carried out to verify the approach.
Funding: Supported in part by the National Natural Science Foundation of China under Grant U22B2005 and Grant 62372462.
Abstract: The rapid expansion of artificial intelligence (AI) applications has raised significant concerns about user privacy, prompting the development of privacy-preserving machine learning (ML) paradigms such as federated learning (FL). FL enables the distributed training of ML models while keeping data on local devices, thus addressing the privacy concerns of users. However, challenges arise from the heterogeneous nature of mobile client devices, partial engagement in training, and non-independent and identically distributed (non-IID) data, leading to performance degradation and optimization objective bias in FL training. With the development of 5G/6G networks and the integration of cloud computing and edge computing resources, globally distributed cloud computing resources can be effectively utilized to optimize the FL process. Choosing the parameter server through a selection mechanism does not increase the monetary cost and reduces the network latency overhead, and it also balances the objectives of communication optimization and low-engagement mitigation, which cannot be achieved simultaneously in the single-server frameworks of existing works. In this paper, we propose the FedAdaSS algorithm, an adaptive parameter server selection mechanism designed to optimize the training efficiency in each round of FL training by selecting the most appropriate server as the parameter server. Our approach leverages the flexibility of cloud computing resources and allows organizers to strategically select servers for data broadcasting and aggregation, thus improving training performance while maintaining cost efficiency. The FedAdaSS algorithm estimates the utility of client systems and servers and incorporates an adaptive random reshuffling strategy that selects the optimal server in each round of the training process. Theoretical analysis confirms the convergence of FedAdaSS under strong convexity and L-smoothness assumptions, and comparative experiments within the FLSim framework demonstrate a reduction in training round-to-accuracy of 12%–20% compared to the Federated Averaging (FedAvg) with random reshuffling method under a single server. Furthermore, FedAdaSS effectively mitigates the performance loss caused by low client engagement, reducing the loss indicator by 50%.
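The FedAvg baseline named in this abstract aggregates client models with a sample-weighted average on the parameter server. The following is a minimal sketch of that aggregation step only, not of FedAdaSS or its server selection; the function name `fedavg` and the flat-list model representation are assumptions made for illustration.

```python
from typing import List, Sequence


def fedavg(client_weights: Sequence[Sequence[float]],
           num_samples: Sequence[int]) -> List[float]:
    """Sample-weighted average of client parameter vectors (FedAvg aggregation).

    client_weights[k] is the flat parameter vector of client k (hypothetical layout);
    num_samples[k] is the number of local training samples on client k.
    """
    if len(client_weights) != len(num_samples) or not client_weights:
        raise ValueError("need one sample count per client and at least one client")
    total = sum(num_samples)
    dim = len(client_weights[0])
    aggregated = [0.0] * dim
    for weights, n in zip(client_weights, num_samples):
        for i in range(dim):
            # Each client contributes in proportion to its share of the data.
            aggregated[i] += weights[i] * n / total
    return aggregated


if __name__ == "__main__":
    # Two clients: the second holds three times as much data as the first.
    print(fedavg([[1.0, 2.0], [3.0, 4.0]], [1, 3]))
```

Clients with more local data pull the global model further toward their parameters, which is one reason low client engagement can bias the aggregate.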
Abstract: The dramatic improvement of information and communication technology (ICT) has driven an evolution in learning management systems (LMSs). The rapid growth in LMSs has caused users to demand more advanced, automated, and intelligent services. This paper discusses how Artificial Intelligence and Machine Learning techniques are adopted to fulfill users’ needs in a social learning management system named “CourseNetworking”. The paper explains how machine learning contributed to developing an intelligent agent called “Rumi” as a personal assistant in the CourseNetworking platform to add personalization, gamification, and more dynamics to the system. This paper aims to introduce machine learning to traditional learning platforms and to guide developers working in the LMS field in benefiting from advanced technologies by offering customized services.
Funding: Supported by the National Natural Science Foundation of China (Nos. 12172186 and 11772166).
Abstract: The objective of dynamical system learning tasks is to forecast the future behavior of a system by leveraging observed data. However, such systems can sometimes exhibit rigidity due to significant variations in component parameters or the presence of slow and fast variables, leading to challenges in learning. To overcome this limitation, we propose a multiscale differential-algebraic neural network (MDANN) method that utilizes Lagrangian mechanics and incorporates multiscale information for dynamical system learning. The MDANN method consists of two main components: the Lagrangian mechanics module and the multiscale module. The Lagrangian mechanics module embeds the system in Cartesian coordinates, adopts a differential-algebraic equation format, and uses Lagrange multipliers to impose constraints explicitly, simplifying the learning problem. The multiscale module converts high-frequency components into low-frequency components using radial scaling to learn subprocesses with large differences in velocity. Experimental results demonstrate that the proposed MDANN method effectively improves the learning of dynamical systems under rigid conditions.
Abstract: Second language acquisition cannot be understood without addressing the interaction between language and cognition. Cognitive theory can be extended to describe learning strategies as complex cognitive skills. Theoretical developments in Anderson’s production systems cover a broader range of behavior than other theories, including comprehension and production of oral and written texts as well as comprehension, problem solving, and verbal learning. Thus Anderson’s cognitive theory can serve as a rationale for learning strategy studies in second language acquisition.
Funding: The National Natural Science Foundation of China (60775047, 60402024).
Abstract: The support vector machine (SVM) is a novel machine learning method that can approximate nonlinear functions with arbitrary accuracy. Setting parameters well is crucial for SVM learning results and generalization ability, yet there is currently no systematic, general method for parameter selection. In this article, SVM parameter selection for function approximation is regarded as a compound optimization problem, and a mutative scale chaos optimization algorithm is employed to search for optimal parameter values. The chaos optimization algorithm is an effective method for global optimization, and the mutative scale chaos algorithm can improve the search efficiency and accuracy. Several simulation examples show the sensitivity of the SVM parameters and demonstrate the superiority of the proposed method for nonlinear function approximation.
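The idea of a mutative scale chaos search can be illustrated with a small sketch: a chaotic sequence explores the full parameter range, and the range is then shrunk around the best point found so far. The code below is only an assumed, simplified reading of that idea. It uses the logistic map as the chaos generator and a single shrink step; the abstract does not state the paper's actual map, schedule, or objective, so `chaos_search`, its arguments, and the quadratic toy objective are all hypothetical.

```python
from typing import Callable, Tuple


def logistic_map(x: float) -> float:
    """One step of the fully chaotic logistic map on (0, 1)."""
    return 4.0 * x * (1.0 - x)


def chaos_search(objective: Callable[[float], float],
                 low: float, high: float,
                 x0: float = 0.3141,
                 coarse_iters: int = 2000,
                 fine_iters: int = 2000,
                 shrink: float = 0.1) -> Tuple[float, float]:
    """Minimize a 1-D objective with a two-stage (coarse, then narrowed) chaos search."""
    x = x0
    best_p, best_val = low, objective(low)

    # Stage 1: chaotic exploration of the whole interval [low, high].
    for _ in range(coarse_iters):
        x = logistic_map(x)
        p = low + (high - low) * x
        val = objective(p)
        if val < best_val:
            best_p, best_val = p, val

    # Stage 2 ("mutative scale"): shrink the interval around the current best.
    half_width = shrink * (high - low) / 2.0
    lo2 = max(low, best_p - half_width)
    hi2 = min(high, best_p + half_width)
    for _ in range(fine_iters):
        x = logistic_map(x)
        p = lo2 + (hi2 - lo2) * x
        val = objective(p)
        if val < best_val:
            best_p, best_val = p, val

    return best_p, best_val


if __name__ == "__main__":
    # Toy stand-in for a validation error as a function of one SVM parameter.
    best, value = chaos_search(lambda c: (c - 3.0) ** 2, 0.0, 10.0)
    print(best, value)
```

In an SVM setting the objective would be a validation error evaluated at candidate parameter values, and the search would run over several parameters at once rather than a single scalar.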
Funding: This project was supported by the National Natural Science Foundation of China (No. 60174021) and the Key Project of Tianjin Natural Science Foundation (No. 010115).
Abstract: A relaxation least squares-based learning algorithm for neural networks is proposed. It not only has a fast convergence rate but also requires less computation. It is therefore suitable for cases in which the network is large but the amount of training data is very limited. It has been used in converting furnace process modelling, and impressive results have been obtained.
Abstract: In this paper, two theorems are proved for zero cost function (or precise I/O mapping) training algorithms for three-layered feedforward neural networks. Two training algorithms based on the Moore-Penrose pseudoinverse (MPPI) matrix, together with corresponding structure design guidelines, are also proposed.
Funding: The Graduate Starting Seed Fund of Northwestern Polytechnical University (No. Z200760) and the Innovation Fund of Northwestern Polytechnical University.
Abstract: Existing manifold learning algorithms use Euclidean distance to measure the proximity of data points. However, in high-dimensional space, Minkowski metrics are no longer stable because the ratio of the distances from a given query to its nearest and farthest neighbors is almost unity. This degrades the performance of manifold learning algorithms when they are applied to dimensionality reduction of high-dimensional data. We introduce a new distance function named shrinkage-divergence-proximity (SDP) to manifold learning, which is meaningful in any high-dimensional space. An improved locally linear embedding (LLE) algorithm named SDP-LLE is proposed in light of the theoretical result. Experiments are conducted on a hyperspectral data set and an image segmentation data set. Experimental results show that the proposed method can efficiently reduce the dimensionality while achieving higher classification accuracy.
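The instability of Euclidean distance described in this abstract can be checked numerically. The sketch below is an illustration under assumed conditions (uniform random points in the unit cube, a fixed seed, 200 points), not an implementation of SDP or SDP-LLE. It computes the ratio of the nearest to the farthest distance from a query point and shows that the ratio moves toward 1 as the dimension grows.

```python
import math
import random


def nearest_farthest_ratio(dim: int, n_points: int = 200, seed: int = 0) -> float:
    """Ratio d_min / d_max of Euclidean distances from one query to random points."""
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    distances = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        distances.append(math.dist(query, point))
    return min(distances) / max(distances)


if __name__ == "__main__":
    for d in (2, 20, 500):
        # The ratio grows toward 1 as d increases: neighbors become hard to tell apart.
        print(d, round(nearest_farthest_ratio(d), 3))
```

When the ratio is close to 1, the nearest neighbors selected by LLE are barely closer than arbitrary points, which is the motivation the abstract gives for replacing the distance function.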
Funding: This work was supported by the National Natural Science Foundation of China under Grant No. 60276037.
Abstract: A novel method for quantitative analysis of multi-component gas mixture concentrations, based on the support vector machine (SVM) and spectroscopy, is proposed. Through the kernel function, the severely overlapped and nonlinear spectral data are mapped into a high-dimensional space, while the high-dimensional data can still be processed in the original space. Several factors, such as the kernel function, the wavelength range, and the penalty coefficient, are discussed. The method is applied to the quantitative analysis of natural gas component concentrations, and the maximal deviation of the component concentrations is 2.28%.
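The statement that high-dimensional data "can be processed in the original space" is the kernel trick: the kernel value computed from the original inputs equals an inner product in the mapped space. A small check with a degree-2 polynomial kernel on 2-D inputs follows; the kernel choice and all function names are assumptions for illustration, since the abstract does not say which kernel the authors finally used.

```python
import math
from typing import Sequence, Tuple


def poly2_kernel(x: Sequence[float], y: Sequence[float]) -> float:
    """Degree-2 homogeneous polynomial kernel, evaluated in the original 2-D space."""
    return (x[0] * y[0] + x[1] * y[1]) ** 2


def feature_map(x: Sequence[float]) -> Tuple[float, float, float]:
    """Explicit 3-D feature map whose inner product reproduces poly2_kernel."""
    return (x[0] ** 2, math.sqrt(2.0) * x[0] * x[1], x[1] ** 2)


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(p * q for p, q in zip(a, b))


if __name__ == "__main__":
    x, y = (1.0, 2.0), (3.0, -1.0)
    # Both numbers agree up to rounding: the kernel never builds the 3-D vectors.
    print(poly2_kernel(x, y), dot(feature_map(x), feature_map(y)))
```

For kernels such as the Gaussian RBF the mapped space is infinite-dimensional, so evaluating the kernel in the original space is the only practical route.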
Abstract: The silk moth (Bombyx mori) exhibits efficient Chemical Plume Tracing (CPT), which is ideal for biomimetics. However, there is insufficient quantitative understanding of its CPT behavior. We propose a hierarchical classification method to segment its natural CPT locomotion and to build its inverse model for detecting stimulus input. This provides the basis for quantitative analysis. The Gaussian mixture model with the expectation-maximization algorithm is used first for unsupervised classification to decompose CPT locomotion data into Gaussian density components that represent a set of quantified elemental motions. A heuristic behavioral rule is used to categorize these components and eliminate components that describe the same motion. Then, the echo state network is used for supervised classification to evaluate segmented elemental motions and to compare CPT locomotion among different moths. In this case, categorized elemental motions are used as the training data to estimate stimulus time. We successfully built the inverse CPT behavioral model of the silk moth to detect stimulus input with good accuracy. The quantitative analysis indicates that silk moths exhibit behavioral singularity and time dependency in their CPT locomotion, which is dominated by its singularity.
Funding: ARO W911NF1810334, NSF under EPCN 1935389, and the National Renewable Energy Laboratory (NREL).
Abstract: Several decades ago, Profs. Sean Meyn and Lei Guo were postdoctoral fellows at ANU, where they shared an interest in recursive algorithms. It seems fitting to celebrate Lei Guo’s 60th birthday with a review of the ODE Method and its recent evolution, with a focus on the following themes: The method has been regarded as a technique for algorithm analysis. It is argued that this viewpoint is backwards: the original stochastic approximation method was surely motivated by an ODE, and tools for analysis came much later (based on establishing robustness of Euler approximations). The paper presents a brief survey of recent research in machine learning that shows the power of algorithm design in continuous time, followed by careful approximation to obtain a practical recursive algorithm. While these methods are usually presented in a stochastic setting, this is not a prerequisite. In fact, recent theory shows that rates of convergence can be dramatically accelerated by applying techniques inspired by quasi-Monte Carlo. Subject to conditions, the optimal rate of convergence can be obtained by applying the averaging technique of Polyak and Ruppert. The conditions are not universal, but theory suggests alternatives to achieve acceleration. The theory is illustrated with applications to gradient-free optimization and policy gradient algorithms for reinforcement learning.
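The averaging technique of Polyak and Ruppert mentioned here can be sketched on the simplest stochastic approximation problem, estimating a mean from noisy samples. The recursion is an Euler-style discretization of the ODE dθ/dt = μ − θ with step size n^(−ρ), and the returned average is the running mean of the iterates. The toy problem, the value ρ = 0.7, and the function name are assumptions chosen for illustration and are not taken from the paper.

```python
import random
from typing import Callable, Tuple


def sa_with_averaging(sample: Callable[[], float],
                      theta0: float = 0.0,
                      n_iters: int = 20000,
                      rho: float = 0.7) -> Tuple[float, float]:
    """Stochastic approximation with Polyak-Ruppert averaging.

    Returns (last iterate, running average of all iterates).
    """
    theta = theta0
    running_avg = 0.0
    for n in range(1, n_iters + 1):
        step = 1.0 / n ** rho                    # slowly decaying step size, 0.5 < rho < 1
        theta += step * (sample() - theta)       # noisy Euler step of d(theta)/dt = mu - theta
        running_avg += (theta - running_avg) / n  # incremental mean of the iterates
    return theta, running_avg


if __name__ == "__main__":
    rng = random.Random(0)
    last, averaged = sa_with_averaging(lambda: rng.gauss(2.0, 1.0))
    print(last, averaged)
```

The larger step size keeps the raw iterate responsive but noisy; averaging the iterates is what recovers the slower-decaying noise level associated with the optimal rate, under the conditions the abstract alludes to.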