For multi-agent reinforcement learning in Markov games, knowledge extraction and sharing are key research problems. State list extracting means to calculate the optimal shared state path from state trajectories with c...For multi-agent reinforcement learning in Markov games, knowledge extraction and sharing are key research problems. State list extracting means to calculate the optimal shared state path from state trajectories with cycles. A state list extracting algorithm checks cyclic state lists of a current state in the state trajectory, condensing the optimal action set of the current state. By reinforcing the optimal action selected, the action policy of cyclic states is optimized gradually. The state list extracting is repeatedly learned and used as the experience knowledge which is shared by teams. Agents speed up the rate of convergence by experience sharing. Competition games of preys and predators are used for the experiments. The results of experiments prove that the proposed algorithms overcome the lack of experience in the initial stage, speed up learning and improve the performance.展开更多
As to the fact that it is difficult to obtain analytical form of optimal sampling density and tracking performance of standard particle probability hypothesis density(P-PHD) filter would decline when clustering algori...As to the fact that it is difficult to obtain analytical form of optimal sampling density and tracking performance of standard particle probability hypothesis density(P-PHD) filter would decline when clustering algorithm is used to extract target states,a free clustering optimal P-PHD(FCO-P-PHD) filter is proposed.This method can lead to obtainment of analytical form of optimal sampling density of P-PHD filter and realization of optimal P-PHD filter without use of clustering algorithms in extraction target states.Besides,as sate extraction method in FCO-P-PHD filter is coupled with the process of obtaining analytical form for optimal sampling density,through decoupling process,a new single-sensor free clustering state extraction method is proposed.By combining this method with standard P-PHD filter,FC-P-PHD filter can be obtained,which significantly improves the tracking performance of P-PHD filter.In the end,the effectiveness of proposed algorithms and their advantages over other algorithms are validated through several simulation experiments.展开更多
基金supported by the National Natural Science Foundation of China (61070143 61173088)
文摘For multi-agent reinforcement learning in Markov games, knowledge extraction and sharing are key research problems. State list extracting means to calculate the optimal shared state path from state trajectories with cycles. A state list extracting algorithm checks cyclic state lists of a current state in the state trajectory, condensing the optimal action set of the current state. By reinforcing the optimal action selected, the action policy of cyclic states is optimized gradually. The state list extracting is repeatedly learned and used as the experience knowledge which is shared by teams. Agents speed up the rate of convergence by experience sharing. Competition games of preys and predators are used for the experiments. The results of experiments prove that the proposed algorithms overcome the lack of experience in the initial stage, speed up learning and improve the performance.
文摘As to the fact that it is difficult to obtain analytical form of optimal sampling density and tracking performance of standard particle probability hypothesis density(P-PHD) filter would decline when clustering algorithm is used to extract target states,a free clustering optimal P-PHD(FCO-P-PHD) filter is proposed.This method can lead to obtainment of analytical form of optimal sampling density of P-PHD filter and realization of optimal P-PHD filter without use of clustering algorithms in extraction target states.Besides,as sate extraction method in FCO-P-PHD filter is coupled with the process of obtaining analytical form for optimal sampling density,through decoupling process,a new single-sensor free clustering state extraction method is proposed.By combining this method with standard P-PHD filter,FC-P-PHD filter can be obtained,which significantly improves the tracking performance of P-PHD filter.In the end,the effectiveness of proposed algorithms and their advantages over other algorithms are validated through several simulation experiments.