5 articles found
EXISTENCE OF OPTIMAL POLICY FOR TIME NON-HOMOGENEOUS DISCOUNTED MARKOVIAN DECISION PROGRAMMING
1
Authors: 郭世贞, 董泽清. Acta Mathematicae Applicatae Sinica (SCIE, CSCD), 1990, No. 4, pp. 295-307 (13 pages)
In this paper we discuss discrete, time non-homogeneous discounted Markovian decision programming, where the state space and all action sets are countable. Supposing that the optimum value function is finite, we give necessary and sufficient conditions for the existence of an optimal policy. Supposing that the absolute mean of rewards is relatively bounded, we also give necessary and sufficient conditions for the existence of an optimal policy.
Keywords: existence of optimal policy; time non-homogeneous discounted Markovian decision programming; MDP
MARKOVIAN DECISION PROGRAMMING WITH RECURSIVE VECTOR-REWARD
2
Authors: 刘建庸, 刘克. Acta Mathematicae Applicatae Sinica (SCIE, CSCD), 1990, No. 2, pp. 158-165 (8 pages)
In this paper, we discuss Markovian decision programming with recursive vector-reward and give an algorithm to find optimal policies. We prove that: (1) there is a Markovian optimal policy for the nonstationary case; (2) there is a stationary optimal policy for the stationary case.
Keywords: Markovian decision programming with recursive vector-reward
Price-Based Residential Demand Response Management in Smart Grids: A Reinforcement Learning-Based Approach (cited by: 2)
3
Authors: Yanni Wan, Jiahu Qin, Xinghuo Yu, Tao Yang, Yu Kang. IEEE/CAA Journal of Automatica Sinica (SCIE, EI, CSCD), 2022, No. 1, pp. 123-134 (12 pages)
This paper studies price-based residential demand response management (PB-RDRM) in smart grids, in which non-dispatchable and dispatchable loads (including general loads and plug-in electric vehicles (PEVs)) are both involved. The PB-RDRM is composed of a bi-level optimization problem, in which the upper-level dynamic retail pricing problem aims to maximize the profit of a utility company (UC) by selecting optimal retail prices (RPs), while the lower-level demand response (DR) problem aims to minimize the comprehensive cost of loads by coordinating their energy consumption behavior. The challenges here are mainly two-fold: 1) the uncertainty of energy consumption and RPs; 2) the flexible PEVs' temporally coupled constraints, which make it impossible to directly develop a model-based optimization algorithm to solve the PB-RDRM. To address these challenges, we first model the dynamic retail pricing problem as a Markovian decision process (MDP), and then employ a model-free reinforcement learning (RL) algorithm to learn the optimal dynamic RPs of the UC according to the loads' responses. Our proposed RL-based DR algorithm is benchmarked against two model-based optimization approaches (i.e., the distributed dual decomposition-based (DDB) method and the distributed primal-dual interior (PDI)-based method), which require exact load and electricity price models. The comparison results show that, compared with the benchmark solutions, our proposed algorithm can not only adaptively decide the RPs through on-line learning processes, but also achieve larger social welfare within an unknown electricity market environment.
Keywords: demand response management (DRM); Markovian decision process (MDP); Monte Carlo simulation; reinforcement learning (RL); smart grid
Determining Cost-Efficiency and Regulation Policy for Water Transfer of Lake Balaton by Stochastic Dynamic Programming
4
Authors: Balint Muzelak, Laszlo Koncsos. Journal of Environmental Science and Engineering (B), 2012, No. 5, pp. 586-592 (7 pages)
The present research is based upon a comprehensive survey which discusses the barely tolerable water level of Lake Balaton between 2000 and 2003. The low water level of this extreme period caused considerable problems for recreation. Our goal was to investigate the possible water transfer policies and the water level regulation policy of Lake Balaton by applying the dynamic programming of Markov chains. This iteration supports the cost-benefit analysis of different scenarios and also provides information about the best water governing policy. As a basis of our scientific analysis, Markov chains were created using an ARMA (autoregressive moving average) synthetic data generator. A profit was assigned to each transition probability for the economic analysis. In our case the profit was negative, because it estimates the harmful effects of the low water level, based on the calculated willingness-to-pay for improving the water quality of Lake Balaton. In addition, the profit includes the cost of the different water supplement scenarios. Implemented as a computer program, the method proved to be an efficient tool to support the cost-benefit analysis of water supplement scenarios. The result highlights the importance of further climate change monitoring. The calculation confirmed water transfer to be cost-effective, yet scenarios with less ecological risk are also effective, and thus preferable.
Keywords: climate change; cost-benefit analysis; Markovian decision process; Thomas-Fiering ARMA model; water transfer
Critical links detection in stochastic networks: application to the transport networks
5
Authors: Mourad Guettiche, Hamamache Kheddouci. International Journal of Intelligent Computing and Cybernetics (EI), 2019, No. 1, pp. 42-69 (28 pages)
Purpose: The purpose of this paper is to study a multiple-origin-multiple-destination variant of the dynamic critical nodes detection problem (DCNDP) and the dynamic critical links detection problem (DCLDP) in stochastic networks. DCNDP and DCLDP consist of identifying the subset of nodes and links, respectively, whose deletion maximizes the stochastic shortest paths between all origin-destination pairs in the graph modeling the transport network. The identification of such nodes (or links) helps to better control road traffic and to plan the measures needed to avoid congestion. Design/methodology/approach: A Markovian decision process is used to model the shortest path problem under dynamic traffic conditions. Effective algorithms to determine the critical nodes (links) while considering the dynamicity of the traffic network are provided. Also, sensitivity analysis toward capacity reduction for critical links is studied. Moreover, the complexity of the underlying algorithms is analyzed, and the computational efficiency resulting from decomposing the network into communities is highlighted. Findings: The numerical results demonstrate that the use of the dynamic shortest path (time dependency) as a metric has a significant impact on the identification of critical nodes/links, and the experiments conducted on real-world networks highlight the importance of sensitive links for dynamically detecting critical links and elaborating smart transport plans. Research limitations/implications: The research in this paper also revealed several challenges, which call for future investigations. First, the authors have restricted their experimentation to a small network where the only focus is on the model behavior, in the absence of historical data. The authors intend to extend this study to very large networks using real data. Second, the authors have considered only congestion to assess the network's criticality; future research on this topic may include other factors, mainly vulnerability. Practical implications: Taking the dynamic and stochastic nature of the problem into account in the modeling yields effective tools for real-time control of transportation networks. This leads to the design of optimized smart transport plans, particularly in disaster management, to improve emergency evacuation efficiency. Originality/value: The paper provides a novel approach to solve critical nodes/links detection problems. In contrast to the majority of research works in the literature, the proposed model considers dynamicity and betweenness while taking into account the stochastic aspect of transport networks. This enables the approach to guide the traffic and analyze transport networks mainly under disaster conditions, in which networks become highly dynamic.
Keywords: critical links; critical nodes; Markovian decision process; transport networks