Funding: Supported by the Guangdong Basic and Applied Basic Research Foundation (2024A1515011936) and the National Natural Science Foundation of China (62320106008).
Abstract: The concept of reward is fundamental in reinforcement learning, with a wide range of applications in the natural and social sciences. Seeking an interpretable reward for decision-making that largely shapes the system's behavior has always been a challenge in reinforcement learning. In this work, we explore a discrete-time reward for reinforcement learning in continuous time and action spaces that represent many phenomena captured by applying physical laws. We find that the discrete-time reward leads to the extraction of the unique continuous-time decision law and improves computational efficiency by dropping the integrator operator that appears in classical results with integral rewards. We apply this finding to solve output-feedback design problems in power systems. The results reveal that our approach removes the intermediate stage of identifying dynamical models. Our work suggests that the discrete-time reward is efficient in the search for the desired decision law, providing a computational tool to understand and modify the behavior of large-scale engineering systems using the optimal learned decision.
Abstract: This paper is a sequel to Kageyama et al. [1], in which a Markov-type hybrid process was constructed and the corresponding discounted total reward was characterized by a recursive equation. The objective of this paper is to formulate a hybrid decision process and to establish the existence and characterization of optimal policies.
Funding: This work is supported in part by the Major Fundamental Research of Natural Science Foundation of Shandong Province under Grant ZR2019ZD05, the Joint Fund for Smart Computing of Shandong Natural Science Foundation under Grant ZR2020LZH013, the Scientific Research Platform and Projects of Department of Education of Guangdong Province under Grant 2019GKQNCX121, and the Intelligent Perception and Computing Innovation Platform of the Shenzhen Institute of Information Technology under Grant PT2019E001.
Abstract: Contribution: This paper designs a learning and training platform that can systematically help radiologists learn automated medical image analysis technology. The platform can help radiologists master deep learning theories and medical applications such as the three-dimensional medical decision support system, and strengthen the teaching practice of deep learning courses in hospitals, so that doctors can better understand deep learning and improve the efficiency of auxiliary diagnosis. Background: In recent years, deep learning has been widely used in academia, industry, and medicine. An increasing number of companies are recruiting large numbers of professionals in the field of deep learning. A growing number of colleges and universities also offer deep learning courses to help radiologists learn automated medical image analysis techniques. For now, however, there is no practical training platform that can help radiologists learn automated medical image analysis systematically. Application Design: The platform proposes the basic learning, model combat, business application (BMR) concept, including a learning guidance system and an assessment training system, which together constitute a closed-loop learning guidance mode of "learning-assessment-training-learning". Findings: The survey results show that most radiologists met their learning expectations by using this platform. The platform can help radiologists master deep learning techniques quickly, comprehensively, and firmly.
Funding: Financial support from the National Natural Science Foundation of China (No. 71704178), the Beijing Excellent Talent Program (No. 2017000020124G133), and the Fundamental Research Funds for the Central Universities (Nos. 2021YQNY07 and 2021YQNY01).
Abstract: China has implemented both quantitative and policy incentives for renewable energy development since 2019 and is currently in a policy transition stage. The implementation of renewable portfolio standards (RPSs) is difficult due to the interests of multiple stakeholders, including power generation enterprises, power grid companies, power users, local governments, and the central government. Based on China’s RPS policy and power system reform documents, this research sorted out the core game decision problems of China’s renewable energy industry and established a conceptual game decision model of the industry from the perspective of local governments, power generation enterprises, and power grid companies. The results reveal that for local governments, the probability of meeting the earnings quota or punishments for not reaching quota completion are the major determinants of active participation in quota supervision. For power grid firms, the willingness to accept renewable electricity quotas depends on the additional cost of receiving renewable electricity and on governmental incentives. It is reasonable, from a theoretical perspective, to implement the RPS policy on the power generation side. Electricity reform will help clarify the electricity price system and increase the transparency of the quota implementation process. Policy implications are suggested to achieve sustainable development of the renewable energy industry from price incentives and quantity delivery.