For n-qubit stochastic open quantum systems, based on the Lyapunov stability theorem and LaSalle's invariance principle, a pure-state switching control based on online estimated state feedback (OQST-SFC) is proposed to realize the state transition to a target pure state, including eigenstates and superposition states. The proposed switching control consists of a constant control and a control law designed by the Lyapunov method, in which the Lyapunov function is the state distance of the system. The constant control drives the system state from an initial state into the convergence domain containing only the target state, and the Lyapunov-based control then makes the state continue to converge to the target state. At the same time, continuous weak measurement of the quantum system and the quantum state tomography method based on the online alternating direction method of multipliers (QST-OADM) are used to obtain the system information and estimate the quantum state, which serves as the input of the controller. In this way, the pure-state feedback switching control method based on online estimated state feedback is realized for an n-qubit stochastic open quantum system. The complete derivation of the n-qubit QST-OADM algorithm is given. Through rigorous theoretical proof and analysis, convergence conditions ensuring that any initial state of the quantum system converges to the target pure state are given. The proposed control method is applied to a 2-qubit stochastic open quantum system in numerical simulation experiments. Four possible position cases between the initial estimated state and that of the controlled system are studied and discussed, and the performance of the state transition under each case is analyzed.
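As a toy illustration of the switching idea in the abstract above, the sketch below drives a single-qubit Bloch vector to the eigenstate target +z. Here V(r) = 1 - r_z plays the role of the state-distance Lyapunov function; a constant control escapes the antipodal state, where the Lyapunov control vanishes. The closed-system Bloch dynamics, gains, and thresholds are all illustrative assumptions, not the paper's n-qubit stochastic open-system model or its QST-OADM estimator.

```python
import numpy as np

def simulate_switching(r0, k=2.0, u_const=1.0, omega0=1.0,
                       v_switch=1.9, dt=0.005, T=100.0):
    """Drive a single-qubit Bloch vector r toward the target +z.

    V(r) = 1 - r_z is the 'state distance' Lyapunov function.
    Phase 1: constant control while V >= v_switch (escapes the
    antipodal state, where the Lyapunov control is zero).
    Phase 2: Lyapunov control u = k * r_y, which yields
    dV/dt = -k * r_y**2 <= 0, so V is non-increasing.
    """
    r = np.array(r0, dtype=float)
    zhat = np.array([0.0, 0.0, 1.0])
    xhat = np.array([1.0, 0.0, 0.0])
    for _ in range(int(T / dt)):
        V = 1.0 - r[2]
        u = u_const if V >= v_switch else k * r[1]
        # Bloch equation: dr/dt = (omega0 * z_hat + u * x_hat) x r
        r += dt * np.cross(omega0 * zhat + u * xhat, r)
        r /= np.linalg.norm(r)   # re-project onto the Bloch sphere
    return r

r_final = simulate_switching([0.0, 0.0, -1.0])   # start at the antipode
```

By LaSalle's principle the only invariant set of the Lyapunov phase with r_z > -0.9 is the north pole, which is why the constant control is only needed to leave the antipodal region.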
An optimal tracking control problem for a class of nonlinear systems with guaranteed performance and asymmetric input constraints is discussed in this paper. The control policy is implemented by an adaptive dynamic programming (ADP) algorithm under two event-based triggering mechanisms. It is often challenging to design an optimal control law due to the system deviation caused by asymmetric input constraints. First, a prescribed performance control technique is employed to keep the tracking errors within predetermined boundaries. Subsequently, considering the asymmetric input constraints, a discounted non-quadratic cost function is introduced. Moreover, to reduce controller updates, an event-triggered control law is developed for the ADP algorithm. After that, to further reduce the complexity of controller design, this work is extended to a self-triggered case, relaxing the need for continuous signal monitoring by hardware devices. By employing the Lyapunov method, the uniform ultimate boundedness of all signals is proved. Finally, a simulation example on a mass-spring-damper system subject to asymmetric input constraints is provided to validate the effectiveness of the proposed control scheme.
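The event-triggering mechanism described above can be sketched on a scalar plant: the feedback gain is only re-evaluated when a relative state-error threshold fires, so the actuator receives far fewer updates than there are simulation steps. The plant, gain, and threshold values are illustrative assumptions, not the paper's constrained nonlinear system.

```python
def event_triggered_run(a=1.0, b=1.0, K=3.0, sigma=0.3,
                        x0=1.0, dt=0.001, T=5.0):
    """Scalar plant dx/dt = a*x + b*u. The feedback u = -K*x is only
    recomputed when the relative trigger |x - x_event| > sigma*|x|
    fires; between events the last control value is held constant."""
    x, x_event = x0, x0
    u = -K * x_event
    n_updates = 1
    for _ in range(int(T / dt)):
        if abs(x - x_event) > sigma * abs(x):
            x_event = x              # event: sample the state
            u = -K * x_event         # and refresh the control
            n_updates += 1
        x += dt * (a * x + b * u)    # plant evolves with held input
    return x, n_updates

x_final, n_updates = event_triggered_run()
```

With sigma = 0.3 the held-input error stays small enough that the closed loop still contracts, while only a few dozen control updates occur over 5000 integration steps.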
Reinforcement learning (RL) has roots in dynamic programming and is called adaptive/approximate dynamic programming (ADP) within the control community. This paper reviews recent developments in ADP along with RL and their applications to various advanced control fields. First, the background of the development of ADP is described, emphasizing the significance of regulation and tracking control problems. Some effective offline and online algorithms for ADP/adaptive critic control are presented, where the main results for discrete-time and continuous-time systems are surveyed, respectively. Then, the research progress on adaptive critic control under the event-triggered framework and under uncertain environments is discussed, where event-based design, robust stabilization, and game design are reviewed. Moreover, extensions of ADP for addressing control problems in complex environments have attracted enormous attention. The ADP architecture is revisited from the perspective of data-driven and RL frameworks, showing how they significantly advance the ADP formulation. Finally, several typical control applications of RL and ADP are summarized, particularly in the fields of wastewater treatment processes and power systems, followed by some general prospects for future research. Overall, this comprehensive survey of ADP and RL for advanced control applications demonstrates their remarkable potential in the artificial intelligence era; they also play a vital role in promoting environmental protection and industrial intelligence.
Aimed at infinite-horizon optimal control problems for discrete time-varying nonlinear systems, a new iterative adaptive dynamic programming algorithm, the discrete-time time-varying policy iteration (DTTV) algorithm, is developed in this paper. The iterative control law is designed to update the iterative value function, which approximates the optimal performance index function. The admissibility of the iterative control law is analyzed. The results show that the iterative value function is non-increasingly convergent to the optimal solution of the Bellman equation. To implement the algorithm, neural networks are employed and a new implementation structure is established, which avoids solving the generalized Bellman equation in each iteration. Finally, the optimal control laws for torsional pendulum and inverted pendulum systems are obtained by the DTTV policy iteration algorithm, where the mass and pendulum bar length are permitted to be time-varying parameters. The effectiveness of the developed method is illustrated by numerical results and comparisons.
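The policy-iteration loop underlying the algorithm above (evaluate the current policy, then improve it greedily) can be sketched on the simplest possible instance, a scalar time-invariant discrete-time LQR problem; the time-varying and neural-network aspects of the paper are omitted here, and all plant and cost parameters are illustrative.

```python
def policy_iteration(a=1.1, b=1.0, q=1.0, r=1.0, K0=0.5, iters=30):
    """Policy iteration for the scalar discrete-time LQR problem
    x[k+1] = a*x + b*u with stage cost q*x^2 + r*u^2 and linear policy
    u = -K*x. The initial K0 must be admissible: |a - b*K0| < 1."""
    K = K0
    for _ in range(iters):
        ac = a - b * K
        # policy evaluation: closed-form solution of the Lyapunov equation
        P = (q + r * K * K) / (1.0 - ac * ac)
        # policy improvement: greedy gain for the evaluated value x'Px
        K = a * b * P / (r + b * b * P)
    return P, K

P, K = policy_iteration()
```

At convergence P satisfies the discrete algebraic Riccati equation P = q + a^2*P - (a*b*P)^2 / (r + b^2*P), which is the scalar Bellman-equation optimum the abstract refers to.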
In emerging applications such as industrial control and autonomous driving, end-to-end deterministic quality-of-service (QoS) transmission guarantees have become an urgent problem. Internet congestion control algorithms are essential to application performance. However, existing congestion control schemes follow the best-effort principle of data transmission without perceiving application QoS requirements. To enable data delivery within application QoS constraints, we leverage an online learning mechanism to design Crimson, a novel congestion control algorithm in which each sender continuously observes the gap between its current performance and a pre-defined QoS target, and adapts its sending rate accordingly. Across many emulation environments and real-world experiments, the proposed scheme efficiently balances the trade-offs between throughput, delay, and loss rate. Crimson also achieves consistent performance over a wide range of QoS constraints under diverse network scenarios.
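The online-learning loop described above (observe the gap between achieved and target performance, then adjust the rate) can be illustrated with a deliberately simplified probing controller. This is a toy stand-in, not Crimson itself: the utility function, capacity model, and probe step are all invented for the sketch.

```python
def online_rate_control(capacity=100.0, r0=20.0, eps=0.05, rounds=100):
    """Toy QoS-aware rate controller: probe a slightly higher and a
    slightly lower rate, score each with a utility that rewards
    throughput but penalizes exceeding capacity (standing in for the
    delay/loss QoS gap), and move in the better direction."""
    def utility(rate):
        over = max(0.0, rate - capacity)
        return rate - 5.0 * over        # heavy penalty for QoS violation

    rate = r0
    for _ in range(rounds):
        if utility(rate * (1 + eps)) >= utility(rate * (1 - eps)):
            rate *= (1 + eps)           # higher probe won: speed up
        else:
            rate *= (1 - eps)           # lower probe won: back off
    return rate

final_rate = online_rate_control()
```

The rate climbs while it is below capacity and then oscillates in a narrow band around the capacity, which is the qualitative behavior an online congestion controller aims for.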
In this paper, the single-input rule modules (SIRMs) dynamically connected fuzzy inference model is first used to stabilize a double inverted pendulum system. Then, multiobjective particle swarm optimization (MOPSO) is implemented to optimize the fuzzy controller parameters so as to simultaneously decrease the distance error of the cart and the sum of the angle errors of the pendulums. The feasibility and efficiency of the obtained Pareto front are assessed in comparison with results reported in the literature and obtained from other algorithms. Finally, Java programming with applets is utilized to simulate the stability of the nonlinear system and demonstrate internet-based control.
Traditionally, offline optimization of power systems has been acceptable due to largely predictable loads and reliable generation. The increasing penetration of fluctuating renewable generation, together with internet-of-things devices allowing fine-grained controllability of loads, has diminished the applicability of offline optimization in the power systems domain and redirected attention to online optimization methods. However, online optimization is a broad topic that can be applied in and motivated by different settings, operated on different time scales, and built on different theoretical foundations. This paper reviews the various types of online optimization techniques used in the power systems domain and aims to clarify the distinctions between the most common ones. In particular, we introduce and compare four distinct techniques covering the breadth of online optimization in power systems: optimization-guided dynamic control, feedback optimization for single-period problems, Lyapunov-based optimization, and online convex optimization techniques for multi-period problems. Lastly, we recommend some potential future directions for online optimization in the power systems domain.
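Of the four families named above, online convex optimization is the easiest to show in a few lines: the standard online gradient descent algorithm plays against a sequence of losses and is judged by regret against the best fixed decision in hindsight. The quadratic losses and step-size schedule below are textbook choices, not drawn from any specific power-system formulation in the paper.

```python
import random

def ogd_regret(T=2000, seed=0):
    """Online gradient descent on losses f_t(x) = (x - theta_t)^2 with
    theta_t drawn i.i.d. from [0, 1]. Returns the average regret versus
    the best fixed point in hindsight (the mean of the theta_t)."""
    rng = random.Random(seed)
    thetas = [rng.random() for _ in range(T)]
    x, loss_alg = 0.0, 0.0
    for t, theta in enumerate(thetas, start=1):
        loss_alg += (x - theta) ** 2
        grad = 2.0 * (x - theta)
        x -= grad / (t ** 0.5)          # step size eta_t = 1/sqrt(t)
        x = min(1.0, max(0.0, x))       # project back onto [0, 1]
    best = sum(thetas) / T              # best fixed action in hindsight
    loss_best = sum((best - th) ** 2 for th in thetas)
    return (loss_alg - loss_best) / T

avg_regret = ogd_regret()
```

The O(sqrt(T)) regret guarantee of OGD means the average regret vanishes as T grows, which is exactly the property that makes OCO attractive for multi-period dispatch problems.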
Based on adaptive dynamic programming (ADP), the fixed-point tracking control problem is solved by a value iteration (VI) algorithm. First, a class of discrete-time (DT) nonlinear systems with disturbance is considered. Second, the convergence of the VI algorithm is established: it is proven that the iterative cost function converges precisely to the optimal value, and that the control input and disturbance input also converge to their optimal values. Third, a novel analysis of the admissible range of the discount factor is presented, where the cost function serves as a Lyapunov function. Finally, neural networks (NNs) are employed to approximate the cost function, the control law, and the disturbance law. Simulation examples illustrate the effectiveness of the proposed method.
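The discounted value-iteration scheme above can be sketched on a scalar linear-quadratic instance, where the iterative cost function reduces to a single number P_k. Starting from the zero value function, the sequence is monotonically non-decreasing and converges to the fixed point of the discounted Bellman recursion; the plant, cost, and discount values are illustrative assumptions.

```python
def value_iteration(a=1.0, b=1.0, q=1.0, r=1.0, gamma=0.95, iters=200):
    """Discounted value iteration for the scalar system x[k+1] = a*x + b*u
    with stage cost q*x^2 + r*u^2 and discount gamma, starting from the
    zero value function P_0 = 0."""
    P, history = 0.0, [0.0]
    for _ in range(iters):
        # greedy gain for the current value estimate x'Px
        K = gamma * a * b * P / (r + gamma * b * b * P)
        # Bellman backup with the greedy policy u = -K*x
        P = q + r * K * K + gamma * P * (a - b * K) ** 2
        history.append(P)
    return P, history

P, history = value_iteration()
```

The recorded history makes the monotone, convergent behavior claimed for VI directly checkable.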
In this paper, an online optimal distributed learning algorithm is proposed to solve the leader-synchronization problem of nonlinear multi-agent differential graphical games. Each player approximates its optimal control policy using single-network approximate dynamic programming (ADP), where only one critic neural network (NN) is employed instead of the typical actor-critic structure composed of two NNs. The proposed distributed weight tuning laws for the critic NNs guarantee stability in the sense of uniform ultimate boundedness (UUB) and convergence of the control policies to the Nash equilibrium. By introducing novel distributed local operators in the weight tuning laws, the requirement for initial stabilizing control policies is removed. Furthermore, the overall closed-loop system stability is guaranteed by Lyapunov stability analysis. Finally, simulation results show the effectiveness of the proposed algorithm.
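The graphical structure of the leader-synchronization problem above can be sketched with single-integrator followers and a pinned leader: each agent acts only on its local neighborhood error. This shows the distributed synchronization setup, not the critic-NN learning or the game-theoretic optimality; the graph, gains, and states are invented for the sketch.

```python
import numpy as np

def leader_sync(T=60.0, dt=0.01):
    """Single-integrator followers tracking a constant-state leader over
    a directed ring. adj[i][j] = 1 means follower i receives follower
    j's state; pin[i] = 1 means follower i also observes the leader."""
    x0 = 1.5                                   # leader state
    x = np.array([0.0, -1.0, 2.0, 0.5])        # four followers
    adj = np.array([[0, 1, 0, 0],
                    [0, 0, 1, 0],
                    [0, 0, 0, 1],
                    [1, 0, 0, 0]], dtype=float)
    pin = np.array([1.0, 0.0, 0.0, 0.0])       # only agent 0 sees leader
    for _ in range(int(T / dt)):
        # local neighborhood error, as in graphical-game formulations
        e = adj @ x - adj.sum(axis=1) * x + pin * (x0 - x)
        x += dt * e                            # control u_i = e_i
    return x, x0

x, x0 = leader_sync()
```

Because the leader's information reaches every follower through the pinned agent and the ring, all follower states converge to the leader's state.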
A stochastic optimal control strategy for partially observable nonlinear quasi-Hamiltonian systems is proposed. The optimal control forces consist of two parts. The first part is determined by the conditions under which the stochastic optimal control problem of a partially observable nonlinear system is converted into that of a completely observable linear system. The second part is determined by solving the dynamical programming equation derived by applying the stochastic averaging method and the stochastic dynamical programming principle to the completely observable linear control system. The response of the optimally controlled quasi-Hamiltonian system is predicted by solving the averaged Fokker-Planck-Kolmogorov equation associated with the optimally controlled, completely observable linear system and by solving the Riccati equation for the estimation error of the system states. An example is given to illustrate the procedure and effectiveness of the proposed control strategy.
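The last step above, solving a Riccati equation for the state-estimation error, can be sketched in its simplest scalar form: the filtering Riccati ODE integrated to its steady state. The coefficients are illustrative, not taken from the paper's quasi-Hamiltonian example.

```python
def riccati_error(a=-1.0, W=1.0, H=1.0, V=1.0, P0=1.0, dt=0.001, T=10.0):
    """Integrate the scalar filtering Riccati equation
        dP/dt = 2*a*P + W - (H**2 / V) * P**2
    for the state-estimation error variance P of a linear system with
    drift a, process noise intensity W, and observation gain/noise H, V."""
    P = P0
    for _ in range(int(T / dt)):
        P += dt * (2.0 * a * P + W - (H * H / V) * P * P)
    return P

P_ss = riccati_error()
```

With a = -1, W = H = V = 1 the steady state solves P^2 + 2P - 1 = 0, i.e. P = sqrt(2) - 1, which the integration reproduces.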
Nonlinear loads in the power distribution system cause non-sinusoidal currents and voltages with harmonic components. Shunt active filters (SAF) with current-controlled voltage source inverters (CCVSI) are usually used to obtain balanced and sinusoidal source currents by injecting compensation currents. However, CCVSI with traditional controllers have limited transient and steady-state performance. In this paper, we propose an adaptive dynamic programming (ADP) controller with online learning capability to improve transient response and harmonics. The proposed controller works alongside existing proportional-integral (PI) controllers to efficiently track the reference currents in the d-q domain, generating adaptive control actions that compensate the PI controller. The proposed system was simulated under different nonlinear (three-phase full-wave rectifier) load conditions, and its performance was compared with the traditional approach. Simulation results without the traditional PI-control-based power inverter are also included for reference. The online-learning-based ADP controller not only reduced average total harmonic distortion by 18.41%, but also outperformed traditional PI controllers during transients.
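The PI baseline that the ADP compensator above works alongside can be sketched as reference-current tracking for a single R-L branch, a common simplification of one axis of the d-q current loop. The plant and gain values are illustrative assumptions, and the ADP compensation term is deliberately omitted.

```python
def pi_current_tracking(R=1.0, L=0.01, i_ref=5.0,
                        kp=10.0, ki=500.0, dt=1e-5, T=0.2):
    """PI tracking of a d-axis reference current for an R-L branch,
    L * di/dt = -R*i + v. This is only the PI baseline on top of which
    the paper's ADP compensator would act."""
    i, integ = 0.0, 0.0
    for _ in range(int(T / dt)):
        e = i_ref - i
        integ += ki * e * dt       # integral action removes offset
        v = kp * e + integ         # PI control voltage
        i += dt * (-R * i + v) / L
    return i

i_final = pi_current_tracking()
```

The integrator drives the steady-state current error to zero; an ADP term would add a learned correction on top of v to improve the transient.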
In this paper, a data-based fault tolerant control (FTC) scheme is investigated for unknown continuous-time (CT) affine nonlinear systems with actuator faults. First, a neural network (NN) identifier based on particle swarm optimization (PSO) is constructed to model the unknown system dynamics. Utilizing the estimated system states, a particle-swarm-optimized critic neural network (PSOCNN) is employed to solve the Hamilton-Jacobi-Bellman equation (HJBE) more efficiently. Then, a data-based FTC scheme, which consists of the NN identifier and a fault compensator, is proposed to achieve actuator fault tolerance. The stability of the closed-loop system under actuator faults is guaranteed by the Lyapunov stability theorem. Finally, simulations demonstrate the effectiveness of the developed method.
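The PSO search engine used above for both the identifier and the critic can be shown in its plain form on a toy cost. The sphere function and parameter settings below are standard illustrative choices, not the paper's NN training objective.

```python
import random

def pso_minimize(dim=2, n_particles=30, iters=200, seed=1):
    """Minimal particle swarm optimization on the sphere function
    f(x) = sum(x_i^2): the same search mechanism the paper applies to
    NN weights, demonstrated here on a toy cost."""
    rng = random.Random(seed)
    f = lambda x: sum(v * v for v in x)
    pos = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                 # personal bests
    gbest = min(pbest, key=f)[:]                # global best
    w, c1, c2 = 0.7, 1.5, 1.5                   # inertia and attraction
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                vel[i][d] = (w * vel[i][d]
                             + c1 * rng.random() * (pbest[i][d] - pos[i][d])
                             + c2 * rng.random() * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            if f(pos[i]) < f(pbest[i]):
                pbest[i] = pos[i][:]
                if f(pos[i]) < f(gbest):
                    gbest = pos[i][:]
    return gbest, f(gbest)

gbest, best_val = pso_minimize()
```

With these conventional coefficients (inertia 0.7, cognitive/social weights 1.5) the swarm contracts onto the minimum at the origin.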
In this paper, an adaptive dynamic programming (ADP) strategy is investigated for discrete-time nonlinear systems with unknown dynamics subject to input saturation. To save communication resources between the controller and the actuators, stochastic communication protocols (SCPs) are adopted to schedule the control signal; the closed-loop system is therefore essentially a protocol-induced switching system. A neural network (NN)-based identifier with a robust term is exploited to approximate the unknown nonlinear system, and a set of switch-based updating rules with an additional tunable parameter for the NN weights is developed with the help of gradient descent. By virtue of a novel Lyapunov function, a sufficient condition is proposed to achieve stability of both the system identification errors and the update dynamics of the NN weights. Then, an offline value-iteration ADP algorithm is proposed to solve the optimal control of protocol-induced switching systems with saturation constraints, and its convergence is established by mathematical induction. Furthermore, an actor-critic NN scheme is developed to approximate the control law and the proposed performance index function in the ADP framework, and the stability of the closed-loop system is analyzed via Lyapunov theory. Finally, numerical simulation results demonstrate the effectiveness of the proposed control scheme.
This paper presents a neighborhood optimal trajectory online correction algorithm that accounts for terminal time variation, and investigates its application range. First, the motion model of midcourse guidance is established, and the online trajectory correction-regenerating strategy is introduced. Second, based on neighborhood optimal control theory, the correction algorithm is derived by adding the consideration of terminal time variation to the traditional neighborhood optimal trajectory correction method. Third, the Monte Carlo simulation method is used to analyze the application range of the algorithm, which provides a basis for dividing the application domains of the online correction algorithm and the online regeneration algorithm for midcourse guidance trajectories. Finally, the simulation results show that the algorithm has high real-time performance and that the online corrected trajectory meets the requirements of terminal constraint changes. The application range of the algorithm is obtained through Monte Carlo simulation.
Random vector functional link (RVFL) networks belong to a class of single-hidden-layer neural networks in which some parameters are randomly selected. Their network structure, which contains direct links between inputs and outputs, is unique; stability analysis and real-time performance are two difficulties of control systems based on neural networks. In this paper, combining the advantages of RVFL with the ideas of the online sequential extreme learning machine (OS-ELM) and the initial-training-free online extreme learning machine (ITFOELM), a novel online learning algorithm named the initial-training-free online random vector functional link algorithm (ITF-ORVFL) is investigated for training RVFL networks. The link vector of an RVFL network can be analytically determined from sequentially arriving data by ITF-ORVFL with a high learning speed, and the stability of nonlinear systems based on this learning algorithm is analyzed. The experimental results indicate that the proposed ITF-ORVFL is effective in coping with nonparametric uncertainty.
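The RVFL structure above (random hidden layer plus direct input-output links, with only the output link vector trained) can be sketched with an OS-ELM-style recursive least squares update on streaming data. The sin(x) target, network sizes, and initialization are illustrative; this is not the exact ITF-ORVFL initialization scheme.

```python
import numpy as np

def rvfl_online(n_hidden=50, n_train=400, seed=0):
    """RVFL network (random hidden layer + direct input-output links)
    whose output link vector is trained from a data stream by recursive
    least squares, in the spirit of OS-ELM. Fits y = sin(x)."""
    rng = np.random.default_rng(seed)
    w = rng.uniform(-1.0, 1.0, n_hidden)    # random, fixed hidden weights
    b = rng.uniform(-1.0, 1.0, n_hidden)    # random, fixed hidden biases

    def features(x):
        # enhancement nodes plus the direct link [1, x]
        return np.concatenate(([1.0, x], np.tanh(w * x + b)))

    d = n_hidden + 2
    beta = np.zeros(d)                      # trainable output link vector
    P = np.eye(d) * 1e3                     # RLS inverse-covariance
    for _ in range(n_train):
        x = rng.uniform(-np.pi, np.pi)
        phi = features(x)
        gain = P @ phi / (1.0 + phi @ P @ phi)
        beta = beta + gain * (np.sin(x) - phi @ beta)
        P = P - np.outer(gain, phi @ P)
    xs = np.linspace(-np.pi, np.pi, 200)
    preds = np.array([features(v) @ beta for v in xs])
    return float(np.sqrt(np.mean((preds - np.sin(xs)) ** 2)))

rmse = rvfl_online()
```

Because only the linear output vector is updated, each sample costs a single rank-one RLS step, which is what gives this family of algorithms its high learning speed.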
Rather than maintaining the classic teaching approach, a growing number of schools use blended learning in higher education. The traditional method of teaching focuses on the results of students' progress. However, many student activities are now recorded by online programming learning platforms. In this paper, we focus on student behavior when completing an online open-ended programming task. First, we conduct statistical analysis to examine student behavior on the basis of test times and completion time. By combining these two factors, we then classify student behavior into four types using the k-means algorithm. The results are useful for teachers to enhance their understanding of student learning and for students to know their learning styles in depth. The findings are also valuable for re-designing the learning platform.
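The two-feature k-means grouping described above can be sketched directly. The synthetic (test_count, completion_minutes) records and the deterministic center seeding below are invented for the demo; the paper's real data and preprocessing are not reproduced.

```python
import random

def kmeans(points, k=4, iters=30):
    """Plain k-means in two dimensions. Centers are seeded with evenly
    spaced data points to keep the demo deterministic."""
    step = max(1, len(points) // k)
    centers = [points[i] for i in range(0, step * k, step)]

    def nearest(p):
        return min(range(k), key=lambda c: (p[0] - centers[c][0]) ** 2
                                           + (p[1] - centers[c][1]) ** 2)

    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:                       # assignment step
            clusters[nearest(p)].append(p)
        for j, cl in enumerate(clusters):      # update step
            if cl:
                centers[j] = (sum(p[0] for p in cl) / len(cl),
                              sum(p[1] for p in cl) / len(cl))
    return [nearest(p) for p in points]

# synthetic (test_count, completion_minutes) records, four behavior types
random.seed(42)
data = [(random.gauss(mx, 1.0), random.gauss(my, 2.0))
        for mx, my in [(3, 20), (3, 60), (15, 20), (15, 60)]
        for _ in range(25)]
labels = kmeans(data)
```

With well-separated groups the algorithm recovers the four behavior types exactly, mirroring the four-type classification in the paper.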
BACKGROUND: During the coronavirus disease 2019 (COVID-19) pandemic, traditional teaching methods were disrupted and online teaching became a new topic in education reform and informatization. In this context, it is important to investigate the necessity and effectiveness of online teaching methods for medical students. This study explored stomatology education in China to evaluate the development of, and challenges facing, the use of massive open online courses (MOOCs) for oral medicine education during the pandemic. AIM: To investigate the current situation and challenges facing stomatology education in China, and to assess the necessity and effectiveness of online teaching methods among medical students. METHODS: Online courses were developed and offered on personal computers and mobile terminals. Behavioral analysis and formative assessments were conducted to evaluate the learning status of students. RESULTS: Most learners completed the MOOCs and achieved better results. Course behavior analysis and student surveys indicated that students enjoyed the learning experience. However, the development of oral-medicine MOOCs during the COVID-19 pandemic faced significant challenges. CONCLUSION: This study provides insights into the potential of MOOCs to support online professional learning and future teaching innovation, but emphasizes the need for careful design and positive feedback to ensure their success.
To overcome the large time delay in measuring the hardness of mixed rubber, rheological parameters were used to predict the hardness. A novel Q-based model updating strategy was proposed as a universal platform to track time-varying properties. Using a few selected support samples to update the model, the strategy can dramatically reduce storage cost and overcome the adverse influence of low signal-to-noise-ratio samples. Moreover, it can be applied to any statistical process monitoring system without drastic changes, which is practical for industrial use. As examples, the Q-based strategy was integrated with three popular algorithms (partial least squares (PLS), recursive PLS (RPLS), and kernel PLS (KPLS)) to form novel regression algorithms: QPLS, QRPLS, and QKPLS, respectively. Applications to predicting mixed rubber hardness at a large-scale tire plant in east China support the theoretical considerations.
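The selective-update idea above (use a residual-based statistic to decide which streaming samples are worth keeping for a model update) can be sketched on a drifting scalar model. The squared prediction residual below is only a stand-in for the Q statistic, and the gated gradient step is a simplification, not the paper's QPLS/QRPLS/QKPLS algorithms.

```python
import random

def q_gated_update(n=600, tau=0.0025, eta=0.2, seed=3):
    """Track a drifting scalar model y = w_true * x with a gated online
    update: a sample only updates the model when its squared prediction
    residual (standing in for the Q statistic) exceeds tau, so most
    samples are discarded and storage stays small."""
    rng = random.Random(seed)
    w_true, w_est = 1.0, 1.0
    used = 0
    for _ in range(n):
        w_true += 1.0 / n                  # slow drift from 1.0 to 2.0
        x = rng.uniform(0.5, 1.5)
        e = w_true * x - w_est * x         # prediction residual
        if e * e > tau:                    # Q-style gate: support sample
            w_est += eta * e * x           # gradient step on (y - w*x)^2
            used += 1
    return w_est, w_true, used

w_est, w_true, used = q_gated_update()
```

Only the samples whose residual exceeds the gate are used, yet the estimate still tracks the drifting parameter, which is the storage-saving behavior the strategy targets.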
Funding: supported by the National Natural Science Foundation of China (62473354).
Funding: supported in part by the National Natural Science Foundation of China (62033003, 62003093, 62373113, U23A20341, U21A20522) and the Natural Science Foundation of Guangdong Province, China (2023A1515011527, 2022A1515011506).
Funding: supported in part by the National Natural Science Foundation of China (62222301, 62073085, 62073158, 61890930-5, 62021003), the National Key Research and Development Program of China (2021ZD0112302, 2021ZD0112301, 2018YFC1900800-5), and the Beijing Natural Science Foundation (JQ19013).
Funding: supported in part by the Fundamental Research Funds for the Central Universities (2022JBZX024) and in part by the National Natural Science Foundation of China (61872037, 61273167).
Funding: supported by the National Natural Science Foundation of China under Grants 62132009 and 61872211.
Abstract: In this paper, the single-input rule modules (SIRMs) dynamically connected fuzzy inference model is first used to stabilize a double inverted pendulum system. Then, multiobjective particle swarm optimization (MOPSO) is implemented to optimize the fuzzy controller parameters in order to simultaneously decrease the distance error of the cart and the sum of the angle errors of the pendulums. The feasibility and efficiency of the proposed Pareto front are assessed in comparison with results reported in the literature and obtained from other algorithms. Finally, Java applet programming is utilized to simulate the stability of the nonlinear system and explain the internet-based control.
Funding: Supported by the National High Technology Research and Development Program of China (863 Program) (2006AA04Z183), the National Natural Science Foundation of China (60621001, 60534010, 60572070, 60774048, 60728307), and the Program for Changjiang Scholars and Innovative Research Groups of China (60728307, 4031002).
Funding: Supported by the National Natural Science Foundation of China (62103265), the "ChenGuang Program" of the Shanghai Education Development Foundation and the Shanghai Municipal Education Commission of China (20CG11), and the Young Elite Scientists Sponsorship Program of the China Association for Science and Technology (CAST).
Abstract: Traditionally, offline optimization of power systems has been acceptable due to largely predictable loads and reliable generation. The increasing penetration of fluctuating renewable generation, together with internet-of-things devices allowing fine-grained controllability of loads, has diminished the applicability of offline optimization in the power systems domain and redirected attention to online optimization methods. However, online optimization is a broad topic that can be applied in and motivated by different settings, operated on different time scales, and built on different theoretical foundations. This paper reviews the various types of online optimization techniques used in the power systems domain and aims to make clear the distinctions between the most common techniques. In particular, we introduce and compare four distinct techniques covering the breadth of online optimization in the power systems domain: optimization-guided dynamic control, feedback optimization for single-period problems, Lyapunov-based optimization, and online convex optimization for multi-period problems. Lastly, we recommend potential future directions for online optimization in the power systems domain.
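Of the four families surveyed, online convex optimization is the easiest to sketch: the decision maker plays an action, observes that round's loss gradient, and steps against it. The toy tracking problem below is only a stand-in for a multi-period dispatch problem; the drifting target and step size are illustrative.

```python
import numpy as np

def online_gradient_step(x, grad, eta):
    """One round of online gradient descent: x_{t+1} = x_t - eta * grad f_t(x_t)."""
    return x - eta * grad

# Track a slowly drifting setpoint theta_t under losses f_t(x) = 0.5*||x - theta_t||^2,
# whose gradient at x is simply (x - theta_t).
x = np.zeros(2)
eta = 0.3
for t in range(200):
    theta = np.array([1.0, -2.0]) * (1.0 - np.exp(-t / 20.0))  # drifting target
    x = online_gradient_step(x, x - theta, eta)
```

Because each round's gradient is revealed only after the decision, regret bounds for such schemes are stated against the best fixed (or slowly moving) decision in hindsight.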
Funding: Supported in part by the National Natural Science Foundation of China (61873300, 61722312) and in part by the Fundamental Research Funds for the Central Universities (FRF-GF-17-B45).
Abstract: Based on adaptive dynamic programming (ADP), the fixed-point tracking control problem is solved by a value iteration (VI) algorithm. First, a class of discrete-time (DT) nonlinear systems with disturbance is considered. Second, the convergence of the VI algorithm is established: it is proven that the iterative cost function converges precisely to the optimal value, and that the control input and disturbance input also converge to their optimal values. Third, a novel analysis of the admissible range of the discount factor is presented, where the cost function serves as a Lyapunov function. Finally, neural networks (NNs) are employed to approximate the cost function, the control law, and the disturbance law. Simulation examples illustrate the effective performance of the proposed method.
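The value-iteration scheme (repeat the Bellman backup with a discount factor until the cost function reaches a fixed point) can be illustrated in tabular form. The paper uses neural-network approximation and includes a disturbance input, both of which this minimal sketch omits; the two-state example is invented.

```python
import numpy as np

def value_iteration(P, R, gamma=0.9, tol=1e-10):
    """Tabular VI: V <- max_a [ R[s,a] + gamma * E_{s'}[V(s')] ] until a fixed point."""
    V = np.zeros(R.shape[0])
    while True:
        Q = R + gamma * np.einsum('ast,t->sa', P, V)  # Q[s,a]
        V_new = Q.max(axis=1)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new
        V = V_new

# A tiny 2-state, 2-action example: transition tensor P[a, s, s'], rewards R[s, a].
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.1, 0.9], [0.7, 0.3]]])
R = np.array([[1.0, 0.0],
              [0.0, 2.0]])
V = value_iteration(P, R)
```

With a discount factor gamma in (0, 1) the backup is a contraction, which is what guarantees the geometric convergence that the paper's discount-factor analysis refines.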
Abstract: In this paper, an online optimal distributed learning algorithm is proposed to solve the leader-synchronization problem of nonlinear multi-agent differential graphical games. Each player approximates its optimal control policy using single-network approximate dynamic programming (ADP), in which only one critic neural network (NN) is employed instead of the typical actor-critic structure composed of two NNs. The proposed distributed weight tuning laws for the critic NNs guarantee stability in the sense of uniform ultimate boundedness (UUB) and convergence of the control policies to the Nash equilibrium. By introducing novel distributed local operators into the weight tuning laws, initial stabilizing control policies are no longer required. Furthermore, the overall closed-loop system stability is guaranteed by Lyapunov stability analysis. Finally, simulation results show the effectiveness of the proposed algorithm.
Funding: Project supported by the National Natural Science Foundation of China (No. 10332030), the Special Fund for Doctor Programs in Institutions of Higher Learning of China (No. 20020335092), and the Zhejiang Provincial Natural Science Foundation (No. 101046), China.
Abstract: A stochastic optimal control strategy for partially observable nonlinear quasi-Hamiltonian systems is proposed. The optimal control forces consist of two parts. The first part is determined by the conditions under which the stochastic optimal control problem of a partially observable nonlinear system is converted into that of a completely observable linear system. The second part is determined by solving the dynamical programming equation derived by applying the stochastic averaging method and the stochastic dynamical programming principle to the completely observable linear control system. The response of the optimally controlled quasi-Hamiltonian system is predicted by solving the averaged Fokker-Planck-Kolmogorov equation associated with the optimally controlled completely observable linear system and by solving the Riccati equation for the estimation error of the system states. An example is given to illustrate the procedure and effectiveness of the proposed control strategy.
Abstract: Nonlinear loads in the power distribution system cause non-sinusoidal currents and voltages with harmonic components. Shunt active filters (SAF) with current-controlled voltage source inverters (CCVSI) are usually used to obtain balanced and sinusoidal source currents by injecting compensation currents. However, CCVSI with traditional controllers have limited transient and steady-state performance. In this paper, we propose an adaptive dynamic programming (ADP) controller with online learning capability to improve transient response and harmonics. The proposed controller works alongside existing proportional-integral (PI) controllers to efficiently track the reference currents in the d-q domain, generating adaptive control actions that compensate the PI controller. The proposed system was simulated under different nonlinear (three-phase full-wave rectifier) load conditions, and its performance was compared with the traditional approach. Simulation results without the traditional PI-controlled power inverter are also included for reference. The online-learning-based ADP controller not only reduced the average total harmonic distortion by 18.41%, but also outperformed traditional PI controllers during transients.
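The idea of a learned term working alongside a fixed PI loop can be sketched generically: the PI tracker produces the baseline action, and an additive compensation term is adapted online from the tracking error. The simple integral-style adaptation below stands in for the paper's ADP critic; the first-order plant, gains, and class name are all hypothetical.

```python
class PIWithAdaptiveCompensation:
    """PI tracking loop plus an online-adapted additive correction term."""

    def __init__(self, kp, ki, lr):
        self.kp, self.ki, self.lr = kp, ki, lr
        self.integral = 0.0
        self.comp = 0.0   # learned compensation action

    def step(self, ref, meas, dt):
        err = ref - meas
        self.integral += err * dt
        self.comp += self.lr * err * dt   # adapt toward zero residual error
        return self.kp * err + self.ki * self.integral + self.comp

# Track a constant reference on a toy first-order plant x' = -x + u (Euler steps).
ctrl = PIWithAdaptiveCompensation(kp=2.0, ki=1.0, lr=0.5)
x, dt = 0.0, 0.01
for _ in range(3000):
    u = ctrl.step(1.0, x, dt)
    x += dt * (-x + u)
```

The compensation term absorbs residual error the fixed PI gains leave behind, which is the role the abstract assigns to the ADP controller's adaptive actions.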
Funding: Supported in part by the National Natural Science Foundation of China (61533017, 61973330, 61773075, 61603387), the Early Career Development Award of SKLMCCS (20180201), and the State Key Laboratory of Synthetical Automation for Process Industries (2019-KF-23-03).
Abstract: In this paper, a data-based fault-tolerant control (FTC) scheme is investigated for unknown continuous-time (CT) affine nonlinear systems with actuator faults. First, a neural network (NN) identifier based on particle swarm optimization (PSO) is constructed to model the unknown system dynamics. By utilizing the estimated system states, the particle-swarm-optimized critic neural network (PSOCNN) is employed to solve the Hamilton-Jacobi-Bellman equation (HJBE) more efficiently. Then, a data-based FTC scheme, which consists of the NN identifier and a fault compensator, is proposed to achieve actuator fault tolerance. The stability of the closed-loop system under actuator faults is guaranteed by the Lyapunov stability theorem. Finally, simulations demonstrate the effectiveness of the developed method.
Funding: Supported in part by the Australian Research Council Discovery Early Career Researcher Award (DE200101128) and the Australian Research Council (DP190101557).
Abstract: In this paper, an adaptive dynamic programming (ADP) strategy is investigated for discrete-time nonlinear systems with unknown nonlinear dynamics subject to input saturation. To save communication resources between the controller and the actuators, stochastic communication protocols (SCPs) are adopted to schedule the control signal, and therefore the closed-loop system is essentially a protocol-induced switching system. A neural network (NN)-based identifier with a robust term is exploited to approximate the unknown nonlinear system, and a set of switch-based updating rules with an additional tunable parameter for the NN weights is developed with the help of gradient descent. By virtue of a novel Lyapunov function, a sufficient condition is proposed to achieve stability of both the system identification errors and the update dynamics of the NN weights. Then, an offline value-iterative ADP algorithm is proposed to solve the optimal control of protocol-induced switching systems with saturation constraints, and its convergence is discussed in depth via mathematical induction. Furthermore, an actor-critic NN scheme is developed to approximate the control law and the proposed performance index function in the ADP framework, and the stability of the closed-loop system is analyzed in view of Lyapunov theory. Finally, numerical simulation results demonstrate the effectiveness of the proposed control scheme.
Funding: Supported by the National Natural Science Foundation of China (61873278, 62173339).
Abstract: This paper presents a neighborhood optimal trajectory online correction algorithm that accounts for terminal time variation, and investigates its application range. Firstly, the motion model of midcourse guidance is established, and the online trajectory correction-regeneration strategy is introduced. Secondly, based on neighborhood optimal control theory, a neighborhood optimal trajectory online correction algorithm considering terminal time variation is proposed by adding the consideration of terminal time variation to the traditional neighborhood optimal trajectory correction method. Thirdly, the Monte Carlo simulation method is used to analyze the application range of the algorithm, which provides a basis for dividing the application domains of the online correction algorithm and the online regeneration algorithm for midcourse guidance trajectories. Finally, simulation results show that the algorithm has high real-time performance and that the online corrected trajectory can meet the requirements of terminal constraint changes. The application range of the algorithm is obtained through Monte Carlo simulation.
Funding: Supported by the Ministry of Science and Technology of China (2018AAA0101000, 2017YFF0205306, WQ20141100198) and the National Natural Science Foundation of China (91648117).
Abstract: Random vector functional link (RVFL) networks belong to a class of single-hidden-layer neural networks in which some parameters are randomly selected. Their network structure, which contains direct links between inputs and outputs, is unique, and stability analysis and real-time performance are two difficulties of control systems based on neural networks. In this paper, combining the advantages of RVFL with the ideas of the online sequential extreme learning machine (OS-ELM) and the initial-training-free online extreme learning machine (ITF-OELM), a novel online learning algorithm named the initial-training-free online random vector functional link algorithm (ITF-ORVFL) is investigated for training RVFL networks. The link vector of the RVFL network can be analytically determined from sequentially arriving data by ITF-ORVFL at a high learning speed, and the stability of nonlinear systems based on this learning algorithm is analyzed. The experimental results indicate that the proposed ITF-ORVFL is effective in coping with nonparametric uncertainty.
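The RVFL structure and its sequential training can be sketched as follows: hidden-layer parameters are drawn randomly and frozen, the feature vector concatenates the raw inputs (the direct links) with the hidden outputs, and the output (link) weights are updated by recursive least squares as samples arrive. This mirrors the OS-ELM-style idea rather than the exact ITF-ORVFL equations; the sizes and the toy target function are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
d, L = 2, 20                       # input dimension, hidden nodes (illustrative sizes)
W = rng.normal(size=(d, L))        # random, fixed hidden weights
b = rng.normal(size=L)             # random, fixed hidden biases

def rvfl_features(x):
    """Direct input links concatenated with the random hidden layer (RVFL structure)."""
    return np.concatenate([x, np.tanh(x @ W + b)])

# Recursive least squares on the output link vector beta, one sample at a time.
m = d + L
P = np.eye(m) * 1e3                # inverse correlation matrix, large initial value
beta = np.zeros(m)
for _ in range(500):
    x = rng.uniform(-1.0, 1.0, size=d)
    y = np.sin(x[0]) + 0.5 * x[1]  # toy target function
    h = rvfl_features(x)
    k = P @ h / (1.0 + h @ P @ h)  # RLS gain
    beta += k * (y - h @ beta)
    P -= np.outer(k, h @ P)
```

Each update is a closed-form rank-one correction, which is what gives sequential schemes of this kind their high learning speed.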
Funding: Supported by the National Grand R&D Plan (Grant No. 2016YFB1000805) and the National Natural Science Foundation of China (Grant Nos. 61702534, 61432020, 61472430, 61502512).
Abstract: Rather than maintaining the classic teaching approach, a growing number of schools use blended learning systems in higher education. The traditional method of teaching focuses on the results of students' progress; however, many student activities are now recorded by online programming learning platforms. In this paper, we focus on student behavior when completing an online open-ended programming task. First, we conduct statistical analysis to examine student behavior on the basis of the number of test attempts and the completion time. Combining these two factors, we then classify student behavior into four types using the k-means algorithm. The results are useful for teachers to enhance their understanding of student learning and for students to know their learning styles in depth. The findings are also valuable for re-designing the learning platform.
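The two-feature clustering step can be sketched with a plain k-means implementation: each student is a point (number of test attempts, completion time), and four centroids partition the behaviors. The synthetic data below is illustrative only; the paper's real platform logs are not reproduced here.

```python
import numpy as np

def kmeans(X, k, iters=100):
    """Lloyd's k-means with farthest-point initialization."""
    C = [X[0]]
    for _ in range(k - 1):                  # spread the initial centroids apart
        d2 = np.min(((X[:, None] - np.array(C)[None]) ** 2).sum(-1), axis=1)
        C.append(X[int(d2.argmax())])
    C = np.array(C, dtype=float)
    for _ in range(iters):
        d2 = ((X[:, None, :] - C[None, :, :]) ** 2).sum(-1)
        labels = d2.argmin(axis=1)          # assign each point to nearest centroid
        C_new = np.array([X[labels == j].mean(axis=0) for j in range(k)])
        if np.allclose(C_new, C):
            break
        C = C_new
    return labels

# Synthetic (test attempts, completion time in minutes) for four behavior types:
# few/fast, few/slow, many/fast, many/slow.
rng = np.random.default_rng(0)
centers = np.array([[5.0, 10.0], [5.0, 50.0], [30.0, 10.0], [30.0, 50.0]])
X = np.vstack([c + rng.normal(scale=1.0, size=(20, 2)) for c in centers])
labels = kmeans(X, 4)
```

With well-separated groups like these, the four recovered clusters match the four generated behavior types; in practice the two features would need scaling before clustering.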
Funding: Supported by the National Natural Science Foundation of China, No. 31870971, and the Zhejiang Medical and Health Science and Technology Plan, No. 2023KY155.
Abstract: BACKGROUND: During the coronavirus disease 2019 (COVID-19) pandemic, traditional teaching methods were disrupted and online teaching became a new topic in education reform and informatization. In this context, it is important to investigate the necessity and effectiveness of online teaching methods for medical students. This study explored stomatology education in China to evaluate the development of, and challenges facing, the use of massive open online courses (MOOCs) for oral medicine education during the pandemic. AIM: To investigate the current situation and challenges facing stomatology education in China, and to assess the necessity and effectiveness of online teaching methods among medical students. METHODS: Online courses were developed and offered on personal computers and mobile terminals. Behavioral analysis and formative assessments were conducted to evaluate the learning status of students. RESULTS: Most learners completed the MOOCs and achieved better results. Course behavior analysis and student surveys indicated that students enjoyed the learning experience. However, the development of oral-medicine MOOCs during the COVID-19 pandemic faced significant challenges. CONCLUSION: This study provides insights into the potential of MOOCs to support online professional learning and future teaching innovation, but emphasizes the need for careful design and positive feedback to ensure their success.
Abstract: To overcome the large time delay in measuring the hardness of mixed rubber, rheological parameters were used to predict the hardness. A novel Q-based model updating strategy was proposed as a universal platform to track time-varying properties. By using a few selected support samples to update the model, the strategy dramatically reduces storage cost and overcomes the adverse influence of low signal-to-noise-ratio samples. Moreover, it can be applied to any statistical process monitoring system without drastic changes, which is practical for industrial use. As examples, the Q-based strategy was integrated with three popular algorithms (partial least squares (PLS), recursive PLS (RPLS), and kernel PLS (KPLS)) to form the novel regression algorithms QPLS, QRPLS, and QKPLS, respectively. Applications to predicting mixed-rubber hardness at a large-scale tire plant in east China confirm the theoretical considerations.
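The abstract does not spell out the selection rule; one common reading of a "Q-based" update is to score each incoming sample by its squared prediction error (the Q statistic) under the current latent-variable model, and keep only samples whose Q exceeds a control limit as support samples for the next model update. The PCA stand-in, threshold, and rank-one process below are all hypothetical.

```python
import numpy as np

def q_statistic(x, mean, loadings):
    """Q (squared prediction error) of sample x under a latent-variable model."""
    xc = x - mean
    residual = xc - loadings.T @ (loadings @ xc)   # part the model cannot explain
    return float(residual @ residual)

# Fit a one-component model on historical data from a rank-one process.
rng = np.random.default_rng(1)
direction = np.array([1.0, 2.0, -1.0])
X = rng.normal(size=(100, 1)) * direction + 0.01 * rng.normal(size=(100, 3))
mean = X.mean(axis=0)
_, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
loadings = Vt[:1]                   # retained loading vector

# Stream new samples; only those the model cannot explain become support samples.
threshold = 0.05                    # hypothetical control limit
support = []
for _ in range(50):
    x = rng.normal() * direction + 0.01 * rng.normal(size=3)
    if q_statistic(x, mean, loadings) > threshold:
        support.append(x)

drifted = np.array([2.0, 0.0, 0.0])  # off-model sample: large Q, would trigger an update
```

In-model samples score a negligible Q and are discarded, which is how such a scheme keeps the support set, and hence the storage cost, small.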