Funding: Project supported by the National Natural Science Foundation of China (No. 62376280).
Abstract: Reinforcement learning (RL) has become a dominant decision-making paradigm and has achieved notable success in many real-world applications. Notably, deep neural networks play a crucial role in unlocking RL’s potential in large-scale decision-making tasks. Inspired by the recent major success of the Transformer in natural language processing and computer vision, researchers have overcome numerous bottlenecks by combining the Transformer with RL for decision-making. This paper presents a multi-angle systematic survey of various Transformer-based RL (TransRL) models applied in decision-making tasks, including basic models, advanced algorithms, representative implementation instances, typical applications, and known challenges. Our work aims to provide insights into problems that inherently arise with current RL approaches and examines how they can be addressed with better TransRL models. To our knowledge, we are the first to present a comprehensive review of recent Transformer research developments in RL for decision-making. We hope that this survey provides a comprehensive review of TransRL models and inspires the RL community in its pursuit of future directions. To keep track of the rapid TransRL developments in decision-making domains, we summarize the latest papers and their open-source implementations at https://github.com/williamyuanv0/Transformer-in-Reinforcement-Learning-for-Decision-Making-A-Survey.
Funding: Supported by the China Geological Survey under grant number [DD20160342], the China MOST-ESA Dragon Project-4 under grant number [32365], and the National Science Foundation of China (NSFC) under grant numbers [41590852, 41001264].
Abstract: COSMO-SkyMed is a constellation of four X-band high-resolution radar satellites with a minimum revisit period of 12 hours. These satellites can obtain ascending and descending synthetic aperture radar (SAR) images with very similar periods for use in the three-dimensional (3D) inversion of glacier velocities. In this paper, based on ascending and descending COSMO-SkyMed data acquired at nearly the same time, the surface velocity of the Yiga Glacier, located in Jiali County, Tibet, China, is estimated in four directions using an offset tracking technique during the periods of 16 January to 3 February 2017 and 1 February to 19 February 2017. Through the geometrical relationships between the measurements and the SAR images, the least squares method is used to retrieve the 3D components of the glacier surface velocity in the eastward, northward and upward directions. The results show that the offset tracking technique, applied to COSMO-SkyMed images, can derive the true 3D velocity of a glacier’s surface. During the two periods, the Yiga Glacier had a stable velocity, and the maximum surface velocity, 2.4 m/d, was observed in the middle portion of the glacier, which corresponds to the location of the steepest slope.
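The least squares step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a common right-looking SAR geometry in which each pass yields one slant-range and one azimuth offset, so ascending and descending passes together give four observations of three unknowns (east, north, up). The heading and incidence angles, the sign convention, and the function names `design_rows` and `solve_least_squares` are all hypothetical choices made for illustration.

```python
import math

def design_rows(heading_deg, incidence_deg):
    """Return [range_row, azimuth_row] mapping (east, north, up) velocity
    to the slant-range and azimuth components for one pass.
    Assumed convention: heading clockwise from north, right-looking sensor,
    slant-range motion positive toward the satellite."""
    a = math.radians(heading_deg)
    t = math.radians(incidence_deg)
    range_row = [-math.sin(t) * math.cos(a), math.sin(t) * math.sin(a), math.cos(t)]
    azimuth_row = [math.sin(a), math.cos(a), 0.0]
    return [range_row, azimuth_row]

def solve_least_squares(A, b):
    """Solve min ||A x - b|| through the normal equations (A^T A) x = A^T b,
    using Gaussian elimination with partial pivoting."""
    n = len(A[0])
    ata = [[sum(row[i] * row[j] for row in A) for j in range(n)] for i in range(n)]
    atb = [sum(row[i] * bi for row, bi in zip(A, b)) for i in range(n)]
    M = [ata[i] + [atb[i]] for i in range(n)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("geometry is rank deficient")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

# Hypothetical geometry: ascending and descending passes (degrees).
A = design_rows(-10.0, 35.0) + design_rows(190.0, 40.0)

# Synthetic "true" velocity in m/d (east, north, up), used only to
# generate four consistent observations for the demonstration.
v_true = [1.0, -0.5, -0.2]
obs = [sum(a_ij * v_j for a_ij, v_j in zip(row, v_true)) for row in A]

v_est = solve_least_squares(A, obs)
print("estimated (east, north, up) velocity in m/d:", v_est)
```

With noise-free synthetic observations the estimate should reproduce the input velocity up to rounding error. With real offset-tracking measurements the four observations are generally inconsistent, and the same solver returns the velocity that minimizes the sum of squared residuals.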