Reinforcement learning (RL) has roots in dynamic programming and is called adaptive/approximate dynamic programming (ADP) within the control community. This paper reviews recent developments in ADP along with RL and its applications to various advanced control fields. First, the background of the development of ADP is described, emphasizing the significance of regulation and tracking control problems. Some effective offline and online algorithms for ADP/adaptive critic control are presented, where the main results for discrete-time systems and continuous-time systems are surveyed, respectively. Then, the research progress on adaptive critic control based on the event-triggered framework and under uncertain environments is discussed, where event-based design, robust stabilization, and game design are reviewed. Moreover, extensions of ADP for addressing control problems in complex environments have attracted enormous attention. The ADP architecture is revisited from the perspective of data-driven and RL frameworks, showing how they significantly promote ADP formulation. Finally, several typical control applications of RL and ADP are summarized, particularly in the fields of wastewater treatment processes and power systems, followed by some general prospects for future research. Overall, the comprehensive survey on ADP and RL for advanced control applications demonstrates their remarkable potential in the artificial intelligence era. In addition, it also plays a vital role in promoting environmental protection and industrial intelligence.
This paper studies the problem of optimal parallel tracking control for continuous-time general nonlinear systems. Unlike existing optimal state feedback control, the control input of the optimal parallel control is introduced into the feedback system. However, because the control input is introduced into the feedback system, optimal state feedback control methods cannot be applied directly. To address this problem, an augmented system and an augmented performance index function are first proposed. Thus, the general nonlinear system is transformed into an affine nonlinear system. The difference between the optimal parallel control and the optimal state feedback control is analyzed theoretically. It is proven that the optimal parallel control with the augmented performance index function can be seen as the suboptimal state feedback control with the traditional performance index function. Moreover, an adaptive dynamic programming (ADP) technique is utilized to implement the optimal parallel tracking control, using a critic neural network (NN) to approximate the value function online. The stability analysis of the closed-loop system is performed using Lyapunov theory, and the tracking error and NN weight errors are shown to be uniformly ultimately bounded (UUB). Also, the optimal parallel controller guarantees the continuity of the control input when there are finite jump discontinuities in the reference signals. Finally, the effectiveness of the developed optimal parallel control method is verified in two cases.
This paper presents a new design approach to achieve decentralized optimal control of high-dimensional complex singular systems with dynamic uncertainties. Based on the robust adaptive dynamic programming (robust ADP) method, controllers for solving the optimal control problem of singular systems are designed. The proposed algorithm works well when the system model is not exactly known but the input and output data can be measured. The policy iteration of each controller uses only its own state and input information for learning and does not need to know the whole system dynamics. Simulation results on the New England 10-machine 39-bus test system show the effectiveness of the designed controller.
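The policy iteration mentioned in the abstract above alternates between evaluating the current controller and improving it. The sketch below shows that loop on a hypothetical scalar discrete-time linear-quadratic problem. It is model-based and centralized, so it only illustrates the iteration structure and is not the data-driven decentralized robust ADP scheme of the paper; the function name and all parameter values are assumptions made for illustration.

```python
def policy_iteration(a=1.2, b=1.0, q=1.0, r=1.0, k0=1.0, iters=50):
    """Policy iteration for x[t+1] = a*x[t] + b*u[t] with cost sum(q*x^2 + r*u^2).

    The policy is u = -k*x. The initial gain k0 must be stabilizing,
    i.e. |a - b*k0| < 1. All values here are hypothetical.
    """
    k = k0
    p = None
    for _ in range(iters):
        closed_loop = a - b * k
        if abs(closed_loop) >= 1.0:
            raise ValueError("current gain is not stabilizing")
        # Policy evaluation: solve p = q + r*k^2 + (a - b*k)^2 * p for p.
        p = (q + r * k * k) / (1.0 - closed_loop * closed_loop)
        # Policy improvement: minimize r*u^2 + p*(a*x + b*u)^2 over u.
        k = a * b * p / (r + b * b * p)
    return k, p


if __name__ == "__main__":
    gain, value_coeff = policy_iteration()
    print("feedback gain:", gain, "value coefficient:", value_coeff)
```

At convergence the value coefficient p satisfies the scalar discrete-time Riccati equation, which gives a simple way to check the result.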
A number of considerations should be taken into account in the dynamic control of robot manipulators, which are highly complex nonlinear systems. In this article, we provide a detailed presentation of the mechanical and electrical implications of robots equipped with DC motor actuators. This model takes into account all nonlinear aspects of the system. Then, we develop computational algorithms for optimal control based on dynamic programming. The robot's trajectory must be predefined, but the performance criteria and constraints applying to the system are not limited, and we may adapt them freely to the robot and the task being studied. As an example, a manipulator arm with 3 degrees of freedom is analyzed.
This paper introduces a self-learning control approach based on approximate dynamic programming. Dynamic programming was introduced by Bellman in the 1950s for solving optimal control problems of nonlinear dynamical systems. Due to its high computational complexity, the applications of dynamic programming have been limited to simple and small problems. The key step in finding approximate solutions to dynamic programming is to estimate the performance index. The optimal control signal can then be determined by minimizing (or maximizing) the performance index. Artificial neural networks are very efficient tools for representing the performance index in dynamic programming. This paper assumes the use of neural networks for estimating the performance index and for generating optimal control signals, thereby achieving optimal control through self-learning.
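The two steps described in the abstract above (estimate the performance index, then pick the control that minimizes it) can be sketched with tabular value iteration. In this minimal sketch a lookup table stands in for the neural network approximator, and the six-state chain, its costs, and all names are hypothetical choices made for illustration, not taken from the paper.

```python
N_STATES = 6        # hypothetical chain of states 0..5; state 0 is the goal
ACTIONS = (-1, +1)  # move one state left or right


def step(state, action):
    """Deterministic transition, clipped to the ends of the chain."""
    return min(max(state + action, 0), N_STATES - 1)


def stage_cost(state, action):
    """Unit cost per step everywhere except at the goal state."""
    return 0.0 if state == 0 else 1.0


def value_iteration(tol=1e-9, max_iter=1000):
    """Iterate the Bellman equation J(s) = min_a [cost(s, a) + J(step(s, a))]."""
    J = [0.0] * N_STATES
    for _ in range(max_iter):
        J_new = [
            min(stage_cost(s, a) + J[step(s, a)] for a in ACTIONS)
            for s in range(N_STATES)
        ]
        if max(abs(x - y) for x, y in zip(J, J_new)) < tol:
            return J_new
        J = J_new
    return J


def greedy_policy(J):
    """Control signal that minimizes the estimated performance index."""
    return [
        min(ACTIONS, key=lambda a: stage_cost(s, a) + J[step(s, a)])
        for s in range(N_STATES)
    ]


if __name__ == "__main__":
    J = value_iteration()
    print(J, greedy_policy(J))
```

The table grows with the number of states, which is the computational burden the abstract refers to; replacing it with a trained function approximator is what makes the approach approximate.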
A good hybrid vehicle control strategy can not only meet the power requirements of the vehicle but also effectively save fuel and reduce emissions. In this paper, the construction of model predictive control (MPC) for a hybrid electric vehicle is proposed. The solving process and the use of a reference trajectory are discussed for the application of MPC based on a dynamic programming algorithm. The simulation of the hybrid electric vehicle is carried out under a specific working condition. The simulation results show that the control strategy can effectively reduce fuel consumption when the torque of the engine and motor is reasonably distributed, which verifies the effectiveness of the control strategy.
Due to the complexity of the thickness and shape synthetical adjustment system and the difficulty of building a mathematical model, a thickness and shape synthetical adjustment scheme for a DC mill based on dynamic neuro-fuzzy control was put forward, and a self-organizing fuzzy control model was established. The structure of the network can be optimized dynamically. During learning, the network can automatically adjust its structure according to the specific problem and make its structure optimal. The input and output of the network are fuzzy sets, and the trained network can carry out the composite relation, that is, the fuzzy inference. To decrease the off-line training time of the BP network, the fuzzy sets are encoded. The simulation results indicate that the self-organizing fuzzy control based on a dynamic neural network is better than traditional decoupling PID control.
A Dynamic Programming (DP) algorithm is used to find the optimal trajectories under the Beijing cycle for the power management of a synergic electric system (SES) composed of a battery and a supercapacitor. Feasible rules are derived from analyzing the optimal trajectories, and they have the highest contribution to the Hybrid Electric Vehicle (HEV). Methods for obtaining the best performance are also deduced. Using the new rule-based power management strategy derived from the optimal results, it is easy to demonstrate the effectiveness of the new strategy in further improving the fuel economy of the synergic hybrid system.
An optimal tracking control problem for a class of nonlinear systems with guaranteed performance and asymmetric input constraints is discussed in this paper. The control policy is implemented by an adaptive dynamic programming (ADP) algorithm under two event-based triggering mechanisms. It is often challenging to design an optimal control law due to the system deviation caused by asymmetric input constraints. First, a prescribed performance control technique is employed to keep the tracking errors within predetermined boundaries. Subsequently, considering the asymmetric input constraints, a discounted non-quadratic cost function is introduced. Moreover, in order to reduce controller updates, an event-triggered control law is developed for the ADP algorithm. After that, to further reduce the complexity of controller design, this work is extended to a self-triggered case, relaxing the need for continuous signal monitoring by hardware devices. By employing the Lyapunov method, the uniform ultimate boundedness of all signals is proved. Finally, a simulation example on a mass–spring–damper system subject to asymmetric input constraints is provided to validate the effectiveness of the proposed control scheme.
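The idea of reducing controller updates through event triggering, as described in the abstract above, can be sketched on a scalar linear system: the control is recomputed only when the gap between the current state and the last sampled state crosses a threshold. The system, the feedback gain, and the relative-plus-absolute threshold below are assumptions chosen for illustration; they are not the triggering condition or the ADP controller derived in the paper.

```python
def simulate_event_triggered(a=1.0, b=1.0, k=3.0, sigma=0.2, eps=1e-3,
                             x0=1.0, dt=0.01, steps=1000):
    """Euler simulation of dx/dt = a*x + b*u with u = -k * (last sampled x).

    The sampled state is refreshed only when
        |x - x_held| > sigma * |x| + eps
    (a hypothetical triggering rule). Returns the final state and the
    number of controller updates, counting the initial sample.
    """
    x = x0
    x_held = x0
    updates = 1
    for _ in range(steps):
        if abs(x - x_held) > sigma * abs(x) + eps:
            x_held = x          # event: sample the state, update the control
            updates += 1
        u = -k * x_held         # control is held constant between events
        x += dt * (a * x + b * u)
    return x, updates


if __name__ == "__main__":
    x_final, n_updates = simulate_event_triggered()
    print("final state:", x_final, "updates:", n_updates, "of 1000 steps")
```

The small constant eps keeps the rule from firing at every step as the state approaches zero, at the price of convergence to a small neighborhood of the origin rather than to the origin itself, which is in the spirit of the uniform ultimate boundedness result stated in the abstract.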
Approximate dynamic programming (ADP) is a general and effective approach for solving optimal control and estimation problems by adapting to uncertain and nonconvex environments over time.
The residential energy scheduling of solar energy is an important research area of the smart grid. On the demand side, factors such as household loads, storage batteries, the outside public utility grid, and renewable energy resources are combined as a nonlinear, time-varying, indefinite, and complex system, which is difficult to manage or optimize. Many nations have already applied residential real-time pricing to balance the burden on their grids. In order to enhance the electricity efficiency of the residential microgrid, this paper presents an action-dependent heuristic dynamic programming (ADHDP) method to solve the residential energy scheduling problem. The highlights of this paper are listed below. First, weather-type classification is adopted to establish three types of programming models based on the features of solar energy. In addition, the priorities of different energy resources are set to reduce the loss of electrical energy in transmission. Second, three ADHDP-based neural networks, which can update themselves during application, are designed to manage the flows of electricity. Third, simulation results show that the proposed scheduling method effectively reduces the total electricity cost and improves the load balancing process. The comparison with the particle swarm optimization algorithm further shows that the present method has a promising effect on energy management to save cost.
An adaptive weighted stereo matching algorithm with multilevel and bidirectional dynamic programming based on ground control points (GCPs) is presented. To decrease time complexity without losing matching precision, a multilevel search scheme is used: the coarse matching is processed in the typical disparity space image, while the fine matching is processed in the disparity-offset space image. In the upper level, GCPs are obtained by an enhanced volumetric iterative algorithm enforcing the mutual constraint and the threshold constraint. Under the supervision of the highly reliable GCPs, a bidirectional dynamic programming framework is employed to solve the inconsistency in the optimization path. In the lower level, to reduce running time, the disparity-offset space is proposed to efficiently obtain the dense disparity image. In addition, an adaptive dual support-weight strategy, which considers photometric and geometric information, is presented to aggregate the matching cost. Further, a post-processing algorithm based on a dual threshold can improve disparity results in areas with depth discontinuities and occlusions, where missing stereo information is substituted from surrounding regions. To demonstrate the effectiveness of the algorithm, we present two groups of experimental results for four widely used standard stereo data sets, including a discussion of performance and a comparison with other methods, which show that the algorithm is not only fast but also significantly improves the efficiency of holistic optimization.
An essential characteristic of 4th Generation (4G) wireless networks is the integration of various heterogeneous wireless access networks. This paper considers network selection for both admission and handoff strategy problems in a heterogeneous 3G/WLAN network. A novel dynamic programming algorithm is proposed that takes heterogeneous network characteristics, user mobility, and different service types into account. The specificity of our approach is that it puts the situations in a new model and makes decisions in stages over different states. Simulation results validate that the proposed scheme achieves better new call blocking and handoff dropping probability performance than traditional schemes while ensuring quality of service (QoS) for both real-time and data connections.
This paper is concerned with the relationship between the maximum principle and dynamic programming in zero-sum stochastic differential games. Under the assumption that the value function is sufficiently smooth, relations among the adjoint processes, the generalized Hamiltonian function, and the value function are given. A portfolio optimization problem under model uncertainty in the financial market is discussed to show the applications of our result.
An identification problem is considered in which inaccurate measurements of the dynamics on a time interval are given. The model has the form of ordinary differential equations that are linear with respect to the unknown parameters. A new approach is presented to solve the identification problem in the framework of optimal control theory. A numerical algorithm based on the dynamic programming method is suggested to identify the unknown parameters. Simulation results are presented.
In this paper, a novel adaptive Fault-Tolerant Control (FTC) strategy is proposed for non-minimum phase Hypersonic Vehicles (HSVs) that are affected by actuator faults and parameter uncertainties. The strategy is based on the output redefinition method and Adaptive Dynamic Programming (ADP). The intelligent FTC scheme consists of two main parts: a basic fault-tolerant and stable controller and an ADP-based supplementary controller. In the basic FTC part, an output redefinition approach is designed to make the zero dynamics stable with respect to the new output. Then, the Ideal Internal Dynamic (IID) is obtained using an optimal bounded inversion approach, and a tracking controller is designed for the new output to realize output tracking of the non-minimum phase HSV system. For the ADP-based compensation control part, Action-Dependent Heuristic Dynamic Programming (ADHDP) with an actor-critic learning structure is utilized to further optimize the tracking performance of the HSV control system. Finally, simulation results are provided to verify the effectiveness and efficiency of the proposed FTC algorithm.
To address the immaturity of decision-making methods for the regulation of winter heating in greenhouses, this study proposes a solution to the greenhouse winter heating regulation problem using a dynamic programming algorithm. A mathematical model that includes indoor environmental state variables, optimization decision variables, and outdoor random variables was established. The objective is to keep the temperature close to the expected value while keeping energy consumption low. The model predicts the control solution by considering the cost function over the next 10 steps. A two-stage planning method was used to optimize the state at each moment step by step. The temperature control strategy model was obtained by using a regression algorithm to fit the relationship between indoor temperature, outdoor temperature, and heating time after optimization. Based on a typical Internet of Things (IoT) structure, the greenhouse control system was designed to apply the optimal control according to feedback from the current environment. Testing and verification showed that the optimized control method could stabilize the temperature near the target value. Compared to threshold control (threshold interval of 2.0 °C) under similar weather conditions, the optimized control method reduced the temperature fluctuation range by 0.9 °C and saved 7.83 kW·h of electricity, which is about 14.56% of the total experimental electricity consumption. This shows that the dynamic programming method is feasible for environmental regulation in actual greenhouse production, and further research can expand the decision variables and policy models to achieve more comprehensive, scientific, and precise regulation.
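The 10-step look-ahead described in the abstract above can be sketched as a finite-horizon backward recursion over a discretized temperature. This is a minimal deterministic sketch: the temperature grid, the target, the one-degree-per-step heating and cooling model, and the energy weight are all hypothetical, and the outdoor random variables and regression step of the study are left out.

```python
TEMPS = range(10, 26)   # hypothetical indoor temperature grid, in °C
TARGET = 18             # hypothetical expected temperature, in °C
HORIZON = 10            # look-ahead length in steps, as in the abstract
ENERGY_WEIGHT = 0.5     # hypothetical cost of one step with the heater on


def next_temp(temp, heater_on):
    """Toy model: +1 °C per step when heating, -1 °C otherwise, clipped to the grid."""
    nxt = temp + 1 if heater_on else temp - 1
    return min(max(nxt, TEMPS[0]), TEMPS[-1])


def plan(start_temp):
    """Backward recursion over HORIZON steps, then a forward rollout."""
    cost_to_go = {t: 0.0 for t in TEMPS}
    policy = []  # policy[i][t] is the best action at stage i and temperature t
    for _ in range(HORIZON):
        new_cost, stage_policy = {}, {}
        for t in TEMPS:
            best = None
            for u in (0, 1):  # heater off / on
                nt = next_temp(t, u)
                c = (nt - TARGET) ** 2 + ENERGY_WEIGHT * u + cost_to_go[nt]
                if best is None or c < best[0]:
                    best = (c, u)
            new_cost[t], stage_policy[t] = best
        cost_to_go = new_cost
        policy.insert(0, stage_policy)
    actions, t = [], start_temp
    for stage_policy in policy:
        u = stage_policy[t]
        actions.append(u)
        t = next_temp(t, u)
    return actions, t


if __name__ == "__main__":
    print(plan(12))
```

In a receding-horizon setting only the first action of the plan would be applied before replanning from the newly measured temperature.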
This paper proposes an optimal output feedback tracking control scheme for the quadrotor unmanned aerial vehicle (UAV) attitude system with unmeasured angular velocities and model uncertainties. First, a neural network (NN) is used to approximate the model uncertainties. Then, an NN velocity observer is established to estimate the unmeasured angular velocities. Further, a quadrotor output feedback attitude optimal tracking controller is designed, which consists of an adaptive controller designed by the backstepping method and an optimal compensation term designed by adaptive dynamic programming. All signals in the closed-loop system are proved to be bounded. Finally, a numerical simulation example shows that the quadrotor attitude tracking scheme is effective and feasible.
Funding: supported in part by the National Natural Science Foundation of China (62222301, 62073085, 62073158, 61890930-5, 62021003), the National Key Research and Development Program of China (2021ZD0112302, 2021ZD0112301, 2018YFC1900800-5), and the Beijing Natural Science Foundation (JQ19013).
Funding: supported in part by the National Key Research and Development Program of China (2018AAA0101502, 2018YFB1702300), in part by the National Natural Science Foundation of China (61722312, 61533019, U1811463, 61533017), and in part by the Intel Collaborative Research Institute for Intelligent and Automated Connected Vehicles.
Funding: supported in part by the National Natural Science Foundation of China (61473070, 61433004, 61627809) and the SAPI Fundamental Research Funds (2018ZCX22).
Funding: supported by the National Science Foundation (U.S.A.) under Grant ECS-0355364.
Funding: supported by the National High Technology Research and Development Program of China (863 Program) (2006AA04Z183), the National Natural Science Foundation of China (60621001, 60534010, 60572070, 60774048, 60728307), and the Program for Changjiang Scholars and Innovative Research Groups of China (60728307, 4031002).
Funding: this work was supported by the youth backbone teachers training program of Henan colleges and universities under Grant No. 2016ggjs-287, the project of science and technology of Henan province under Grant Nos. 172102210124 and 202102210269, and the Key Scientific Research projects in Colleges and Universities in Henan (Grant No. 18B460003).
Funding: supported in part by the National Natural Science Foundation of China (62033003, 62003093, 62373113, U23A20341, U21A20522) and the Natural Science Foundation of Guangdong Province, China (2023A1515011527, 2022A1515011506).
Funding: supported in part by the National Natural Science Foundation of China (61533017, U1501251, 61374105, 61722312).
Funding: Supported in part by the National Natural Science Foundation of China (61533017, 61273140, 61304079, 61374105, 61379099, 61233001), the Fundamental Research Funds for the Central Universities (FRF-TP-15-056A3), and the Open Research Project from SKLMCCS (20150104).
Funding: Supported by the National Natural Science Foundation of China (No. 60605023, 60775048) and the Specialized Research Fund for the Doctoral Program of Higher Education (No. 20060141006).
Abstract: An adaptive weighted stereo matching algorithm with multilevel and bidirectional dynamic programming based on ground control points (GCPs) is presented. To decrease time complexity without losing matching precision, a multilevel search scheme is used: coarse matching is performed in the typical disparity space image, while fine matching is performed in the disparity-offset space image. In the upper level, GCPs are obtained by an enhanced volumetric iterative algorithm that enforces the mutual constraint and the threshold constraint. Under the supervision of the highly reliable GCPs, a bidirectional dynamic programming framework is employed to resolve inconsistency in the optimization path. In the lower level, to reduce running time, a disparity-offset space is proposed to efficiently obtain the dense disparity image. In addition, an adaptive dual support-weight strategy that considers photometric and geometric information is presented to aggregate the matching cost. Further, a post-processing step based on a dual threshold algorithm improves the disparity results in areas with depth discontinuities and occlusions, where missing stereo information is substituted from surrounding regions. To demonstrate the effectiveness of the algorithm, two groups of experimental results are presented for four widely used standard stereo data sets, including a discussion of performance and a comparison with other methods. The results show that the algorithm is not only fast but also significantly improves the efficiency of holistic optimization.
Funding: Supported by the National Natural Science Foundation and Civil Aviation Administration of China (No. 61071105).
Abstract: An essential characteristic of 4th Generation (4G) wireless networks is the integration of various heterogeneous wireless access networks. This paper considers network selection for both admission and handoff strategy problems in a heterogeneous 3G/WLAN network. A novel dynamic programming algorithm is proposed that takes heterogeneous network characteristics, user mobility, and different service types into account. The distinguishing feature of the approach is that it casts these situations in a new model and makes decisions in stages over different states. Simulation results show that the proposed scheme achieves better new-call blocking and handoff dropping probabilities than traditional schemes while ensuring quality of service (QoS) for both real-time and data connections.
Abstract: This paper is concerned with the relationship between the maximum principle and dynamic programming in zero-sum stochastic differential games. Under the assumption that the value function is sufficiently smooth, relations among the adjoint processes, the generalized Hamiltonian function, and the value function are given. A portfolio optimization problem under model uncertainty in the financial market is discussed to illustrate the application of the result.
Abstract: An identification problem is considered in which inaccurate measurements of the dynamics on a time interval are given. The model has the form of ordinary differential equations that are linear with respect to the unknown parameters. A new approach is presented to solve the identification problem in the framework of optimal control theory. A numerical algorithm based on the dynamic programming method is suggested to identify the unknown parameters. Simulation results are presented.
Funding: Supported in part by the Science Center Program of the National Natural Science Foundation of China (62373189, 62188101, 62020106003) and the Research Fund of the State Key Laboratory of Mechanics and Control for Aerospace Structures, China.
Abstract: In this paper, a novel adaptive Fault-Tolerant Control (FTC) strategy is proposed for non-minimum phase Hypersonic Vehicles (HSVs) affected by actuator faults and parameter uncertainties. The strategy is based on the output redefinition method and Adaptive Dynamic Programming (ADP). The intelligent FTC scheme consists of two main parts: a basic fault-tolerant and stable controller and an ADP-based supplementary controller. In the basic FTC part, an output redefinition approach is designed to make the zero dynamics stable with respect to the new output. Then, the Ideal Internal Dynamic (IID) is obtained using an optimal bounded inversion approach, and a tracking controller is designed for the new output to realize output tracking of the non-minimum phase HSV system. For the ADP-based compensation control part, Action-Dependent Heuristic Dynamic Programming (ADHDP) with an actor-critic learning structure is used to further optimize the tracking performance of the HSV control system. Finally, simulation results are provided to verify the effectiveness and efficiency of the proposed FTC algorithm.
Funding: Supported by the National Key Research and Development Program (Grant No. 2021YFE0103000), the National Key Research and Development Program (Grant No. 2022YFD1900400), and the Ningxia Hui Autonomous Region Key Research and Development Programme (Grant No. 2022BBF02026).
Abstract: To address the immaturity of decision-making methods for regulating winter heating in greenhouses, this study proposes a solution based on a dynamic programming algorithm. A mathematical model that includes indoor environmental state variables, optimization decision variables, and outdoor random variables was established. The objective is to keep the temperature close to the expected value while keeping energy consumption low. The model predicts the control solution by considering the cost function over the next 10 steps. A two-stage planning method was used to optimize the state at each moment step by step. The temperature control strategy model was obtained by using a regression algorithm to fit the relationship between indoor temperature, outdoor temperature, and heating time after optimization. Based on a typical Internet of Things (IoT) structure, a greenhouse control system was designed to apply the optimal control according to feedback from the current environment. Testing and verification showed that the optimized control method could stabilize the temperature near the target value. Compared with threshold control (threshold interval of 2.0 °C) under similar weather conditions, the optimized control method reduced the temperature fluctuation range by 0.9 °C and saved 7.83 kW·h of electricity, about 14.56% of the total experimental electricity consumption. This shows that the dynamic programming method is feasible for environmental regulation in actual greenhouse production. Further research can extend the decision variables and policy models to achieve more comprehensive, scientific, and precise regulation.
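As a rough illustration of planning over a 10-step horizon with dynamic programming, the sketch below runs backward induction on a temperature grid to choose a heater on/off schedule. It is not the model from the study above: the first-order thermal model, the constants `leak`, `gain`, and `energy_weight`, the 0.5 °C grid, and the function name `plan_heating` are all hypothetical, and the outdoor temperature is treated as constant rather than random.

```python
def plan_heating(t_in, t_out, target=18.0, horizon=10,
                 leak=0.1, gain=1.5, energy_weight=0.5):
    """Backward dynamic programming for a heater on/off schedule.

    Hypothetical model: T_next = T + leak * (t_out - T) + gain * u, u in {0, 1}.
    Stage cost: squared deviation from target plus an energy penalty.
    """
    step_size = 0.5
    grid = [step_size * i for i in range(81)]        # 0.0 .. 40.0 degrees C

    def snap(temp):
        # Clamp to the grid range and return the nearest grid index.
        temp = min(max(temp, grid[0]), grid[-1])
        return int(round(temp / step_size))

    def step(temp, u):
        return temp + leak * (t_out - temp) + gain * u

    cost_to_go = [0.0] * len(grid)                   # zero terminal cost
    policy = []                                      # policy[k][i]: action at step k, grid[i]
    for _ in range(horizon):                         # iterate from last step to first
        new_cost = [0.0] * len(grid)
        new_policy = [0] * len(grid)
        for i, temp in enumerate(grid):
            best = None
            for u in (0, 1):
                nxt = step(temp, u)
                cost = (nxt - target) ** 2 + energy_weight * u + cost_to_go[snap(nxt)]
                if best is None or cost < best:
                    best, new_policy[i] = cost, u
            new_cost[i] = best
        cost_to_go = new_cost
        policy.insert(0, new_policy)

    # Roll the policy forward from the measured indoor temperature.
    schedule, temp = [], t_in
    for k in range(horizon):
        u = policy[k][snap(temp)]
        schedule.append(u)
        temp = step(temp, u)
    return schedule, temp


schedule, final_temp = plan_heating(t_in=10.0, t_out=0.0)
print(schedule, round(final_temp, 2))
```

In a receding-horizon setting, only the first action of the schedule would be applied before re-planning from the newly measured temperature; that choice is likewise an assumption of this sketch.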
Funding: Supported in part by the National Natural Science Foundation of China under Grants 52301418, 51939001, and 61976033.
Abstract: This paper proposes an optimal output feedback tracking control scheme for the attitude system of a quadrotor unmanned aerial vehicle (UAV) with unmeasured angular velocities and model uncertainties. First, a neural network (NN) is used to approximate the model uncertainties. Then, an NN velocity observer is established to estimate the unmeasured angular velocities. Further, an output feedback optimal attitude tracking controller is designed for the quadrotor, consisting of an adaptive controller designed by the backstepping method and an optimal compensation term designed by adaptive dynamic programming. All signals in the closed-loop system are proved to be bounded. Finally, a numerical simulation example shows that the quadrotor attitude tracking scheme is effective and feasible.