7 articles found
1. Direct heuristic dynamic programming based on an improved PID neural network (Cited by: 2)
Authors: Jian SUN, Feng LIU, Jennie SI, Shengwei MEI. 《控制理论与应用(英文版)》 (Control Theory and Applications, English Edition), EI, 2012, No. 4, pp. 497-503 (7 pages)
In this paper, an improved PID-neural network (IPIDNN) structure is proposed and applied to the critic and action networks of direct heuristic dynamic programming (DHDP). As one of the online learning algorithms of approximate dynamic programming (ADP), DHDP has demonstrated its applicability to large state and control problems. Theoretically, the DHDP algorithm requires access to full state feedback in order to obtain solutions to the Bellman optimality equation. Unfortunately, it is not always possible to access all the states in a real system. This paper proposes a solution by suggesting an IPIDNN configuration to construct the critic and action networks to achieve output feedback control. Since this structure can estimate the integrals and derivatives of measurable outputs, more system states are utilized and thus better control performance is expected. Compared with the traditional PIDNN, this configuration is flexible and easy to expand. Based on this structure, a gradient descent algorithm for this IPIDNN-based DHDP is presented. Convergence issues are addressed within a single learning time step and for the entire learning process. Some important insights are provided to guide the implementation of the algorithm. The proposed learning controller has been applied to a cart-pole system to validate the effectiveness of the structure and the algorithm.
Keywords: Approximate dynamic programming (ADP); Direct heuristic dynamic programming (DHDP); Improved PID neural network (IPIDNN)
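The abstract above notes that the structure estimates integrals and derivatives of measurable outputs so that more state information reaches the critic and action networks. The following is only a minimal sketch of that idea, not the paper's IPIDNN: it assumes a scalar output sampled at a fixed step, a rectangular-rule integral, and a backward-difference derivative. The class name `PIDFeatures` and its fields are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class PIDFeatures:
    """Hypothetical helper: turns a scalar output y into (P, I, D) features."""

    dt: float                          # sampling period (assumed constant)
    integral: float = 0.0              # running rectangular-rule integral of y
    previous: Optional[float] = None   # last sample, used for the derivative

    def step(self, y: float) -> Tuple[float, float, float]:
        # Integral estimate: accumulate y * dt.
        self.integral += y * self.dt
        # Derivative estimate: backward difference, zero on the first sample.
        if self.previous is None:
            derivative = 0.0
        else:
            derivative = (y - self.previous) / self.dt
        self.previous = y
        return y, self.integral, derivative
```

Under these assumptions, the three returned values could serve as the input vector of an output-feedback critic or action network in place of the unmeasured states.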
2. Heuristic dynamic programming-based learning control for discrete-time disturbed multi-agent systems
Authors: Yao Zhang, Chaoxu Mu, Yong Zhang, Yanghe Feng. Control Theory and Technology, EI, CSCD, 2021, No. 3, pp. 339-353 (15 pages)
Owing to extensive applications in many fields, the synchronization problem has been widely investigated in multi-agent systems. The synchronization for multi-agent systems is a pivotal issue, which means that under the designed control policy, the output of systems or the state of each agent can be consistent with the leader. The purpose of this paper is to investigate a heuristic dynamic programming (HDP)-based learning tracking control for discrete-time multi-agent systems to achieve synchronization while considering disturbances in systems. Besides, due to the difficulty of solving the coupled Hamilton–Jacobi–Bellman equation analytically, an improved HDP learning control algorithm is proposed to realize the synchronization between the leader and all following agents, which is executed by an action-critic neural network. The action and critic neural networks are utilized to learn the optimal control policy and cost function, respectively, by means of introducing an auxiliary action network. Finally, two numerical examples and a practical application of mobile robots are presented to demonstrate the control performance of the HDP-based learning control algorithm.
Keywords: Multi-agent systems; Heuristic dynamic programming (HDP); Learning control; Neural network; Synchronization
3. A novel adaptive heuristic dynamic programming-based algorithm for aircraft confrontation games
Authors: Yi Mao, Zhijie Chen, Yi Yang, Yuxin Hu. Fundamental Research, CAS, 2021, No. 6, pp. 792-799 (8 pages)
Intelligent confrontation has become a vital technology for future air combat. Confrontation games between a penetrating aircraft and an intercepting aircraft are essential for modern air combat. In addition, the performance indexes of both the interceptor and penetrator must be considered. Traditional methods only solve one side's guidance problem without considering the intelligence of the opponent. In this paper, an adaptive heuristic dynamic programming-based algorithm is proposed for aircraft confrontation games. This algorithm constructs a heuristic dynamic programming model for both confrontation aircraft and then updates the critic and action network parameters using the dynamic confrontation state information. Numerical simulations indicate that the proposed algorithm can optimize the guidance law for both the interceptor and penetrator and is therefore superior to traditional proportional navigation methods.
Keywords: Aircraft confrontation; Optimal control; Neural network; Adaptive heuristic dynamic programming; Flight simulation
4. HEURISTIC MODELING FOR A DYNAMIC AND GOAL PROGRAMMING IN PRODUCTION PLANNING OF CONTINUOUS MANUFACTURING SYSTEMS (Cited by: 2)
Authors: JAHAN A, ABDOLSHAH M. Chinese Journal of Mechanical Engineering, SCIE, EI, CAS, CSCD, 2007, No. 5, pp. 110-113 (4 pages)
At first sight, it seems that advanced operations research is used less in continuous production systems than in mass production, batch production and job shop systems. A comprehensive evaluation shows, however, that advanced operations research techniques can be used very widely in continuous production systems in developing countries, because of initially inadequate plant layout, stage-by-stage development of production lines, the purchase of second-hand machinery from various countries, and the plurality of customers. A case of production system planning is proposed for a chemical company in which almost all of the above-mentioned conditions are present. The goals and constraints in this problem are as follows: (1) Minimizing deviation from customers' requirements. (2) Maximizing the profit. (3) Minimizing the frequency of changes in formula production. (4) Minimizing the inventory of final products. (5) Balancing the production sections with regard to production rate. (6) Limitation in inventory of raw material. In the present situation, various techniques such as goal programming, linear programming and dynamic programming can be used. Dynamic production programming problems are divided into two categories: one with limited production capacity and another with unlimited production capacity. For the first category, a systematic and acceptable solution has not been presented yet. Therefore, an innovative method is used to convert the dynamic situation to a zero-one model. Finally, the problem is converted to a goal programming model with non-linear constraints and solved with the GRG algorithm.
Keywords: heuristic model; dynamic programming; Goal programming; production planning
5. Residential Energy Scheduling for Variable Weather Solar Energy Based on Adaptive Dynamic Programming (Cited by: 15)
Authors: Derong Liu, Yancai Xu, Qinglai Wei, Xinliang Liu. IEEE/CAA Journal of Automatica Sinica, SCIE, EI, CSCD, 2018, No. 1, pp. 36-46 (11 pages)
The residential energy scheduling of solar energy is an important research area of smart grid. On the demand side, factors such as household loads, storage batteries, the outside public utility grid and renewable energy resources are combined together as a nonlinear, time-varying, indefinite and complex system, which is difficult to manage or optimize. Many nations have already applied residential real-time pricing to balance the burden on their grid. In order to enhance electricity efficiency of the residential micro grid, this paper presents an action dependent heuristic dynamic programming (ADHDP) method to solve the residential energy scheduling problem. The highlights of this paper are listed below. First, the weather-type classification is adopted to establish three types of programming models based on the features of the solar energy. In addition, the priorities of different energy resources are set to reduce the loss of electrical energy transmissions. Second, three ADHDP-based neural networks, which can update themselves during applications, are designed to manage the flows of electricity. Third, simulation results show that the proposed scheduling method has effectively reduced the total electricity cost and improved the load balancing process. The comparison with the particle swarm optimization algorithm further proves that the present method has a promising effect on energy management to save cost.
Keywords: Action dependent heuristic dynamic programming; adaptive dynamic programming; control strategy; residential energy management; smart grid
6. Adaptive Multi-Step Evaluation Design With Stability Guarantee for Discrete-Time Optimal Learning Control (Cited by: 2)
Authors: Ding Wang, Jiangyu Wang, Mingming Zhao, Peng Xin, Junfei Qiao. IEEE/CAA Journal of Automatica Sinica, SCIE, EI, CSCD, 2023, No. 9, pp. 1797-1809 (13 pages)
This paper is concerned with a novel integrated multi-step heuristic dynamic programming (MsHDP) algorithm for solving optimal control problems. It is shown that, initialized by the zero cost function, MsHDP can converge to the optimal solution of the Hamilton-Jacobi-Bellman (HJB) equation. Then, the stability of the system is analyzed using control policies generated by MsHDP. Also, a general stability criterion is designed to determine the admissibility of the current control policy. That is, the criterion is applicable not only to traditional value iteration and policy iteration but also to MsHDP. Further, based on the convergence and the stability criterion, the integrated MsHDP algorithm using immature control policies is developed to accelerate learning efficiency greatly. Besides, actor-critic is utilized to implement the integrated MsHDP scheme, where neural networks are used to evaluate and improve the iterative policy as the parameter architecture. Finally, two simulation examples are given to demonstrate that the learning effectiveness of the integrated MsHDP scheme surpasses those of other fixed or integrated methods.
Keywords: Adaptive critic; artificial neural networks; Hamilton-Jacobi-Bellman (HJB) equation; multi-step heuristic dynamic programming; multi-step reinforcement learning; optimal control
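The abstract above states that the multi-step scheme converges to the optimal cost when started from the zero cost function. The sketch below illustrates that behavior on a toy problem only; it is not the paper's integrated MsHDP and uses no neural networks. It assumes a scalar linear system x_{k+1} = a·x_k + b·u_k with stage cost q·x² + r·u², so every iterate has the form V_i(x) = p_i·x² and only the coefficient p_i needs tracking. The function name and parameters are hypothetical.

```python
def multi_step_value_iteration(a, b, q, r, steps, iterations):
    """Sketch: multi-step value iteration for a scalar linear-quadratic problem.

    Tracks p_i where V_i(x) = p_i * x**2, starting from V_0 = 0.
    """
    p = 0.0  # zero initial cost function
    for _ in range(iterations):
        # Policy improvement: greedy feedback u = -gain * x with respect to V_i.
        gain = a * b * p / (r + b * b * p)
        closed_loop = a - b * gain          # x_{k+1} = closed_loop * x_k
        stage = q + r * gain * gain         # stage cost coefficient under this policy
        # Multi-step evaluation: `steps` stage costs plus the old terminal value.
        accumulated = sum(stage * closed_loop ** (2 * j) for j in range(steps))
        p = accumulated + closed_loop ** (2 * steps) * p
    return p
```

With `steps=1` the update reduces to the ordinary Riccati value-iteration recursion. For a = b = q = r = 1 the fixed point solves p² − p − 1 = 0, so the iterate should approach the golden ratio.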
7. Robotic Knee Tracking Control to Mimic the Intact Human Knee Profile Based on Actor-Critic Reinforcement Learning (Cited by: 2)
Authors: Ruofan Wu, Zhikai Yao, Jennie Si, He (Helen) Huang. IEEE/CAA Journal of Automatica Sinica, SCIE, EI, CSCD, 2022, No. 1, pp. 19-30 (12 pages)
We address a state-of-the-art reinforcement learning (RL) control approach to automatically configure robotic prosthesis impedance parameters to enable end-to-end, continuous locomotion intended for transfemoral amputee subjects. Specifically, our actor-critic based RL provides tracking control of a robotic knee prosthesis to mimic the intact knee profile. This is a significant advance from our previous RL based automatic tuning of prosthesis control parameters, which has centered on regulation control with a designer prescribed robotic knee profile as the target. In addition to presenting the tracking control algorithm based on direct heuristic dynamic programming (dHDP), we provide a control performance guarantee including the case of constrained inputs. We show that our proposed tracking control possesses several important properties, such as weight convergence of the learning networks, Bellman (sub)optimality of the cost-to-go value function and control input, and practical stability of the human-robot system. We further provide a systematic simulation of the proposed tracking control using a realistic human-robot system simulator, the OpenSim, to emulate how the dHDP enables level ground walking, walking on different terrains and at different paces. These results show that our proposed dHDP based tracking control is not only theoretically suitable, but also practically useful.
Keywords: Automatic tracking of intact knee; configuration of robotic knee prosthesis; direct heuristic dynamic programming (dHDP); reinforcement learning control