Call for Papers: Journal of Control Theory and Applications, Special Issue on Approximate Dynamic Programming and Reinforcement Learning
Journal of Control Theory and Applications (English edition), EI, 2010, No. 2, p. 257 (1 page)
Approximate dynamic programming (ADP) is a general and effective approach for solving optimal control and estimation problems by adapting to uncertain and nonconvex environments over time.
Keywords: Call for papers; Journal of Control Theory and Applications; Special issue on approximate dynamic programming and reinforcement learning
Feature Selection and Feature Learning for High-dimensional Batch Reinforcement Learning: A Survey (cited by: 2)
Authors: De-Rong Liu, Hong-Liang Li, Ding Wang. International Journal of Automation and Computing, EI, CSCD, 2015, No. 3, pp. 229-242 (14 pages)
Tremendous amounts of data are being generated and saved in many complex engineering and social systems every day. It is both significant and feasible to use this big data to make better decisions with machine learning techniques. In this paper, we focus on batch reinforcement learning (RL) algorithms for discounted Markov decision processes (MDPs) with large discrete or continuous state spaces, aiming to learn the best possible policy given a fixed amount of training data. Batch RL algorithms with handcrafted feature representations work well for low-dimensional MDPs. However, for many real-world RL tasks, which often involve high-dimensional state spaces, it is difficult or even infeasible to use feature engineering methods to design features for value function approximation. To cope with high-dimensional RL problems, the desire to obtain data-driven features has led to much work on incorporating feature selection and feature learning into traditional batch RL algorithms. In this paper, we provide a comprehensive survey of automatic feature selection and unsupervised feature learning for high-dimensional batch RL. Moreover, we present recent theoretical developments on applying statistical learning to establish finite-sample error bounds for batch RL algorithms based on weighted Lp norms. Finally, we derive some future directions in the research of RL algorithms, theories and applications.
Keywords: intelligent control, reinforcement learning, adaptive dynamic programming, feature selection, feature learning, big data