Funding: Supported by the Guangdong Basic and Applied Basic Research Foundation (2024A1515011936) and the National Natural Science Foundation of China (62320106008).
Abstract: The concept of reward is fundamental in reinforcement learning, with a wide range of applications in the natural and social sciences. Seeking an interpretable reward for decision-making, one that largely shapes the system's behavior, has long been a challenge in reinforcement learning. In this work, we explore a discrete-time reward for reinforcement learning in continuous time and action spaces, which represent many phenomena captured by physical laws. We find that the discrete-time reward leads to the extraction of the unique continuous-time decision law and improves computational efficiency by dropping the integral operator that appears in classical results with integral rewards. We apply this finding to solve output-feedback design problems in power systems. The results reveal that our approach removes the intermediate stage of identifying dynamical models. Our work suggests that the discrete-time reward is efficient in the search for the desired decision law, providing a computational tool to understand and modify the behavior of large-scale engineering systems through the learned optimal decision.
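The contrast the abstract draws, between a classical integral reward and a discrete-time (sampled) reward for a continuous-time system, can be illustrated with a minimal toy sketch. The scalar dynamics, cost weights, and fixed policy below are all illustrative assumptions, not taken from the paper; the sketch only shows that evaluating the stage cost at sampling instants removes the fine-grained integration from the learning-relevant loop while approximating the same return.

```python
# Toy scalar system dx/dt = a*x + b*u with quadratic stage cost
# r(x, u) = q*x**2 + R*u**2. All names and values are illustrative.
a, b, q, R = -1.0, 1.0, 1.0, 1.0
dt = 0.01          # fine simulation step
T = 2.0            # horizon

def step(x, u):
    """One Euler step of the continuous-time dynamics."""
    return x + dt * (a * x + b * u)

def integral_return(x0, policy):
    """Classical integral reward: numerically integrate r over [0, T]."""
    x, total = x0, 0.0
    for _ in range(int(T / dt)):
        u = policy(x)
        total += dt * (q * x**2 + R * u**2)
        x = step(x, u)
    return total

def sampled_return(x0, policy, sample_every=10):
    """Discrete-time reward: evaluate r only at sampling instants,
    scaled by the sampling period -- no integrator over every step."""
    x, total, h = x0, 0.0, sample_every * dt
    for k in range(int(T / dt)):
        u = policy(x)
        if k % sample_every == 0:
            total += h * (q * x**2 + R * u**2)
        x = step(x, u)
    return total

policy = lambda x: -0.5 * x   # a fixed linear feedback for comparison
J_int = integral_return(1.0, policy)
J_smp = sampled_return(1.0, policy)
print(J_int, J_smp)  # close for a sufficiently fine sampling period
```

For the stable closed loop here, the two returns agree to within the sampling error, which is the sense in which the sampled reward can stand in for the integral one.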
Funding: Supported by the National Natural Science Foundation of China (No. 61105057) and the Ph.D. Foundation of Jiangsu University of Science and Technology (Nos. 35301002 and 35211104).
Abstract: Many skewed cancer gene expression datasets have appeared in the post-genomic era. Extracting differentially expressed genes or constructing decision rules from these skewed datasets with traditional algorithms seriously underestimates the performance of the minority class, leading to inaccurate diagnosis in clinical trials. This paper presents a skewed gene selection algorithm that introduces a weighted metric into the gene selection procedure. The extracted genes are paired as decision rules to distinguish the two classes, and these decision rules are then integrated into an ensemble learning framework by majority voting to recognize test examples, thus avoiding tedious data normalization and classifier construction. Mining and integrating a few reliable decision rules gave higher, or at least comparable, classification performance compared with many traditional class-imbalance learning algorithms on four benchmark imbalanced cancer gene expression datasets.
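The pipeline the abstract describes, a class-weighted gene-ranking metric followed by simple decision rules combined by majority vote, can be sketched on synthetic data. This is an illustrative toy, not the paper's algorithm: the weighted score, the synthetic data, and the single-gene threshold rules (standing in for the paper's paired rules) are all assumptions made for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic imbalanced expression data: 40 majority vs 8 minority samples.
n_maj, n_min, n_genes = 40, 8, 50
X = np.vstack([rng.normal(0.0, 1.0, (n_maj, n_genes)),
               rng.normal(1.5, 1.0, (n_min, n_genes))])
y = np.array([0] * n_maj + [1] * n_min)

def weighted_score(g):
    """Mean difference over a *weighted* pooled spread, giving both
    classes equal weight regardless of their sample counts."""
    x0, x1 = X[y == 0, g], X[y == 1, g]
    return abs(x1.mean() - x0.mean()) / (0.5 * x0.std() + 0.5 * x1.std() + 1e-9)

# Select the top genes under the weighted metric.
top = sorted(range(n_genes), key=weighted_score, reverse=True)[:5]

def rule_predict(g, x):
    """Single-gene threshold rule: midpoint between the class means."""
    thr = 0.5 * (X[y == 0, g].mean() + X[y == 1, g].mean())
    return int(x[g] > thr)

def ensemble_predict(x):
    """Majority vote over the per-gene rules."""
    votes = [rule_predict(g, x) for g in top]
    return int(sum(votes) > len(votes) / 2)

preds = np.array([ensemble_predict(x) for x in X])
minority_recall = (preds[y == 1] == 1).mean()
print(minority_recall)
```

The point of the weighted score is that an unweighted pooled statistic would be dominated by the 40 majority samples; weighting the two classes equally keeps genes that separate the minority class near the top of the ranking, which is what the voting rules then exploit.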