Optimal policy for controlling two-server queueing systems with jockeying
1
Authors: LIN Bing, LIN Yuchen, BHATNAGAR Rohit. Journal of Systems Engineering and Electronics. SCIE, EI, CSCD. 2022, No. 1, pp. 144-155 (12 pages)
This paper studies the optimal policy for joint control of admission, routing, service, and jockeying in a queueing system consisting of two exponential servers in parallel. Jobs arrive according to a Poisson process. Upon each arrival, an admission/routing decision is made, and the accepted job is routed to one of the two servers, each of which is associated with a queue. After each service completion, a server has the option of serving a job from its own queue, serving a jockeying job from the other queue, or staying idle. The system performance includes the revenues from accepted jobs, the costs of holding jobs in queues, the service costs, and the job jockeying costs. To maximize the total expected discounted return, we formulate a Markov decision process (MDP) model for this system. The value iteration method is employed to characterize the optimal policy as a hedging point policy. Numerical studies verify the structure of the hedging point policy, which is convenient for implementing control actions in practice.
Keywords: queueing system jockeying optimal policy Markov decision process (MDP) dynamic programming
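The value iteration method named in the abstract above can be illustrated on a toy problem. The following is a minimal sketch, assuming a generic finite discounted MDP; it is not the paper's two-server model, and the names `value_iteration`, `P`, `R`, and the two-state example are hypothetical.

```python
def value_iteration(P, R, gamma=0.9, tol=1e-10, max_iter=10_000):
    """Return (V, policy) for a finite discounted MDP.

    P[a][s][t]: probability of moving from state s to state t under action a.
    R[a][s]:    expected one-step reward for taking action a in state s.
    Both inputs are hypothetical illustrations, not the paper's model.
    """
    n_actions, n_states = len(P), len(P[0])

    def q_value(s, a, V):
        # One-step lookahead: immediate reward plus discounted expected value.
        return R[a][s] + gamma * sum(P[a][s][t] * V[t] for t in range(n_states))

    V = [0.0] * n_states
    for _ in range(max_iter):
        # Bellman optimality backup applied to every state.
        V_new = [max(q_value(s, a, V) for a in range(n_actions))
                 for s in range(n_states)]
        delta = max(abs(x - y) for x, y in zip(V_new, V))
        V = V_new
        if delta < tol:
            break
    # Greedy policy with respect to the converged value function.
    policy = [max(range(n_actions), key=lambda a: q_value(s, a, V))
              for s in range(n_states)]
    return V, policy


# Toy example: action 0 stays put, action 1 moves to state 1.
P = [
    [[1.0, 0.0], [0.0, 1.0]],  # action 0
    [[0.0, 1.0], [0.0, 1.0]],  # action 1
]
R = [
    [1.0, 3.0],  # action 0: reward 1 in state 0, reward 3 in state 1
    [0.0, 3.0],  # action 1: reward 0 in state 0, reward 3 in state 1
]
V, policy = value_iteration(P, R, gamma=0.5)
```

In this toy case the values converge to roughly 3 for state 0 and 6 for state 1, and the greedy action in state 0 is to move. A structural result such as a hedging point policy would be read off from how the greedy action changes across queue-length states in the actual model.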
Distributive Disturbance and Optimal Policy in Stochastic Control Model
2
Authors: 汪红初, 胡适耕, 张学清. Journal of Southwest Jiaotong University (English Edition). 2006, No. 4, pp. 408-414 (7 pages)
To investigate the equilibrium relationships between the volatility of capital and income, taxation, and ance in a stochastic control model, the uniqueness of the solution to this model was proved by the method of dynamic programming, with distributive disturbance and elastic labor supply introduced into the model. Furthermore, the effects of two types of shocks on labor-leisure choice, economic growth rate, and welfare were numerically analyzed, and the optimal tax policy was then derived.
Keywords: Stochastic optimization Dynamic programming Bellman equation Macroeconomic equilibrium Optimal policy
SIMPLE COMPUTING OF THE CUSTOMER LIFETIME VALUE: A FIXED LOCAL-OPTIMAL POLICY APPROACH (Cited by: 1)
3
Authors: Julio B. Clempner, Alexander S. Poznyak. Journal of Systems Science and Systems Engineering. SCIE, EI, CSCD. 2014, No. 4, pp. 439-459 (21 pages)
In this paper, we present a new method for finding a fixed local-optimal policy for computing the customer lifetime value. The method is developed for a class of ergodic controllable finite Markov chains. We propose an approach based on a non-converging state-value function that fluctuates (increases and decreases) between states of the dynamic process. We prove that it is possible to represent that function in a recursive format using a one-step-ahead fixed-optimal policy. Then, we provide an analytical formula for the numerical realization of the fixed local-optimal strategy. We also present a second approach to the same problem, based on linear programming, that implements the c-variable method to make the problem computationally tractable. Finally, we show that these two approaches are related: after a finite number of iterations, our proposed approach converges to the same result as the linear programming method. We also present a non-traditional approach for ergodicity verification. The validity of the proposed methods is demonstrated theoretically and by simulated credit-card marketing experiments that compute the customer lifetime value for both an optimization and a game theory approach.
Keywords: Customer lifetime value optimization optimal policy method linear programming ergodic controllable Markov chains asynchronous games
EXISTENCE OF OPTIMAL POLICY FOR TIME NON-HOMOGENEOUS DISCOUNTED MARKOVIAN DECISION PROGRAMMING
4
Authors: 郭世贞, 董泽清. Acta Mathematicae Applicatae Sinica. SCIE, CSCD. 1990, No. 4, pp. 295-307 (13 pages)
In this paper we discuss discrete-time, non-homogeneous discounted Markovian decision programming, where the state space and all action sets are countable. Suppose that the optimum value function is finite. We give the necessary and sufficient conditions for the existence of an optimal policy. Suppose that the absolute mean of rewards is relatively bounded. We also give the necessary and sufficient conditions for the existence of an optimal policy.
Keywords: EXISTENCE OF OPTIMAL POLICY FOR TIME NON-HOMOGENEOUS DISCOUNTED MARKOVIAN DECISION PROGRAMMING MDP
OPTIMAL HARVESTING POLICY FOR INSHORE-OFFSHORE FISHERY MODEL WITH IMPULSIVE DIFFUSION (Cited by: 7)
5
Authors: 董玲珍, 陈兰荪, 孙丽华. Acta Mathematica Scientia. SCIE, CSCD. 2007, No. 2, pp. 405-412 (8 pages)
This article studies the inshore-offshore fishery model with impulsive diffusion. The existence and global asymptotic stability of both the trivial periodic solution and the positive periodic solution are obtained. The complexity of this system is also analyzed. Moreover, the optimal harvesting policy is given for the inshore subpopulation, including the maximum sustainable yield and the corresponding harvesting effort.
Keywords: Impulsive diffusion inshore-offshore fishery model global asymptotic stability periodic solution optimal harvesting policy
RECURSIVE UTILITY, PRODUCTIVE GOVERNMENT EXPENDITURE AND OPTIMAL FISCAL POLICY (Cited by: 1)
6
Authors: Wang Haijun, Hu Shigeng, Zhang Xueqing. Applied Mathematics (A Journal of Chinese Universities). SCIE, CSCD. 2005, No. 3, pp. 277-288 (12 pages)
This paper employs a stochastic endogenous growth model, extended to the case of a recursive utility function that can disentangle intertemporal substitution from risk aversion, to analyze productive government expenditure and optimal fiscal policy, with particular stress on the importance of factor income. First, the explicit solutions of the central planner's stochastic optimization problem are derived. The growth-maximizing and welfare-maximizing government expenditure policies are obtained, and whether they conflict or coincide depends upon intertemporal substitution. Second, the explicit solutions of the representative individual's stochastic optimization problem, which permits capital income and labor income to be taxed separately, are derived. It is found that the effect of risk on growth depends crucially on the degree of risk aversion, the intertemporal elasticity of substitution, and the capital income share. Finally, a flexible optimal tax policy that can be internally adjusted to a certain extent is derived, and it is found that the distribution of factor income plays an important role in designing the optimal tax policy.
Keywords: endogenous growth recursive utility productive government expenditure optimal fiscal policy
Optimal switching policy for performance enhancement of distributed parameter systems based on event-driven control (Cited by: 1)
7
Authors: 穆文英, 崔宝同, 楼旭阳, 李纹. Chinese Physics B. SCIE, EI, CAS, CSCD. 2014, No. 7, pp. 211-217 (7 pages)
This paper aims to improve the performance of a class of distributed parameter systems through the optimal switching of actuators and controllers based on event-driven control. It is assumed that, among the multiple available actuators, only one can receive the control signal and be activated over an unfixed time interval, while the other actuators remain dormant. After incorporating a state observer into the event generator, the event-driven control loop and the minimum inter-event time are ultimately bounded. Based on the event-driven state feedback control, the time intervals of unfixed length can be obtained. The optimal switching policy is based on finite horizon linear quadratic optimal control at the beginning of each time subinterval. A simulation example demonstrates the effectiveness of the proposed policy.
Keywords: distributed parameter systems optimal switching policy event-driven
Analysis of a POMDP Model for an Optimal Maintenance Problem with Multiple Imperfect Repairs
8
Author: Nobuyuki Tamura. American Journal of Operations Research. 2023, No. 6, pp. 133-146 (14 pages)
I consider a system whose deterioration follows a discrete-time and discrete-state Markov chain with an absorbing state. When the system is put into practice, I may select operation (wait), imperfect repair, or replacement at each discrete-time point. The true state of the system is not known when it is operated. Instead, the system is monitored after operation and some incomplete information concerned with the deterioration is obtained for decision making. Since there are multiple imperfect repairs, I can select one option from them when the imperfect repair is preferable to operation and replacement. To express this situation, I propose a POMDP model and theoretically investigate the structure of an optimal maintenance policy minimizing a total expected discounted cost for an unbounded horizon. Then two stochastic orders are used for the analysis of our problem.
Keywords: Partially Observable Markov Decision Process Imperfect Repair Stochastic Order Monotone Property Optimal Maintenance Policy
Optimal quasi-periodic maintenance policies for two-unit series system (Cited by: 2)
9
Authors: 高文科, 张志胜, 周一帆, 甘淑媛. Journal of Southeast University (English Edition). EI, CAS. 2013, No. 4, pp. 450-455 (6 pages)
To investigate the effects of various random factors on the preventive maintenance (PM) decision-making of one type of two-unit series system, an optimal quasi-periodic PM policy is introduced. Assume that, in the model, PM is perfect for unit 1 and is only a mechanical service for unit 2. PM activity is randomly performed according to a dynamic PM plan distributed in each implementation period. A replacement is determined by the competing results of unplanned and planned replacements. The unplanned replacement is triggered by a catastrophic failure of unit 2, and the planned replacement is executed when the PM number reaches the threshold N. Through modeling and analysis, a solution algorithm for an optimal implementation period and the PM number is given, and the optimal process and parametric sensitivity are provided by a numerical example. Results show that the implementation period should be decreased as soon as possible under the condition of meeting the needs of practice, which can increase mean operating time and decrease the long-run cost rate.
Keywords: maintenance policy optimization quasi-periodic preventive maintenance two-unit series system
THE OPTIMAL STRATEGY FOR INSURANCE COMPANY UNDER THE INFLUENCE OF TERMINAL VALUE (Cited by: 3)
10
Authors: 刘伟, 袁海丽, 胡亦钧. Acta Mathematica Scientia. SCIE, CSCD. 2011, No. 3, pp. 1077-1090 (14 pages)
This paper considers a model of an insurance company that is allowed to invest in a risky asset and to purchase proportional reinsurance. The objective is to find the policy that maximizes the expected total discounted dividend pay-out until the time of bankruptcy plus the terminal value of the company, under a liquidity constraint. We find the solution of this problem by solving the problem with zero terminal value. We also analyze the influence of terminal value on the optimal policy.
Keywords: proportional reinsurance terminal value optimal policy HJB equation
A generalized geometric process based repairable system model with bivariate policy
11
Authors: MA Ning, YE Jimin, WANG Junyuan. Journal of Systems Engineering and Electronics. SCIE, EI, CSCD. 2021, No. 3, pp. 631-641 (11 pages)
The maintenance model of a simple repairable system is studied. We assume that there are two types of failure, namely type Ⅰ failure (repairable failure) and type Ⅱ failure (irreparable failure). Whenever a type Ⅰ failure occurs, the system is repaired immediately, which is failure repair (FR). Between the (n-1)th and the nth FR, the system is supposed to be preventively repaired (PR) when the consecutive working time of the system reaches λ^(n-1) T, where λ and T are specified values. Further, we assume that the system goes on working when the repair is finished and is replaced at the occurrence of the Nth type Ⅰ failure or the first type Ⅱ failure, whichever occurs first. In practice, the system degrades with the increasing number of repairs. That is, the consecutive working time of the system forms a decreasing generalized geometric process (GGP), whereas the successive repair time forms an increasing GGP. A simple bivariate policy (T, N) repairable model is introduced based on GGP. The alternative searching method is used to minimize the cost rate function C(N, T), and the optimal (T, N)* is obtained. Finally, numerical cases are applied to demonstrate the reasonability of this model.
Keywords: renewal reward theorem generalized geometric process (GGP) average cost rate optimal policy replacement
Optimal Static Partition Configuration in ARINC653 System (Cited by: 4)
12
Authors: Sheng-Lin Gui, Lei Luo, Sen-Sen Tang, Yang Meng. Journal of Electronic Science and Technology. CAS. 2011, No. 4, pp. 373-378 (6 pages)
ARINC653 systems, which have been widely used in the avionics industry, are an important class of safety-critical applications. Partitions are the core concept in the ARINC653 system architecture. Because of partitions, the system designer must statically allocate adequate time slots to each partition in the design phase. Although some time slot allocation policies could be borrowed from task scheduling policies, no existing literature gives an optimal allocation policy. In this paper, we present a partition configuration policy and prove that it is optimal in the sense that if this policy fails to configure adequate time slots for each partition, then no other policy can either. Then, by simulation, we show the effects of different partition configuration policies on the time slot allocation of partitions and on task response time, respectively.
Keywords: ARINC653 earliest-next release time first policy optimal partition configuration policy real-time systems
Optimal production lot sizing model in a supply chain with periodically fixed demand considering learning effect (Cited by: 1)
13
Authors: 熊中楷, SHEN Tiesong. Journal of Chongqing University. CAS. 2002, No. 2, pp. 86-88 (3 pages)
This paper presents an optimal production model for a manufacturer in a supply chain with a fixed demand at a fixed interval, taking into account the learning effect on production capacity. An algorithm is employed to find the optimal delay time for production and the production time sequentially. It is found that the optimal delay time for production and the production time are not static but vary with time. It is important for a manufacturer to schedule production so as to prevent facilities and workers from idling.
Keywords: learning curve capacity expansion supply chain optimal production policy
STUDY ON THE OPTIMIZATION OF TRANSPORT CONTROL POLICY IN COMMUNICATION NETWORK (Cited by: 1)
14
Authors: Fan Shuyan, Han Weizhan, Lu Ran. Journal of Electronics (China). 2010, No. 2, pp. 261-266 (6 pages)
In communication networks with a policy-based Transport Control on-Demand (TCoD) function, the transport control policies have a great impact on network effectiveness. To evaluate and optimize the transport policies in a communication network, a policy-based TCoD network model is given and a comprehensive evaluation index system of network effectiveness is put forward from both the network application and the handling mechanism perspectives. A TCoD network prototype system based on Asynchronous Transfer Mode/Multi-Protocol Label Switching (ATM/MPLS) is introduced and some experiments are performed on it. The prototype system is evaluated and analyzed with the comprehensive evaluation index system. The results show that the index system can be used to judge whether the communication network meets the application requirements, and can provide references for optimizing the transport policies so as to improve communication network effectiveness.
Keywords: Communication network Comprehensive evaluation index system Network Application Effectiveness (NAE) Transport Control on-Demand (TCoD) Policy optimization
The Dragon-shape Strategy of China's Regional Economic Development and Policy Analysis (Cited by: 1)
15
Authors: Jiankun Song, Wenjie Zhang. Chinese Business Review. 2004, No. 7, pp. 50-53 (4 pages)
According to this paper, the dragon-shape strategy is the optimized option of China's future strategy with respect to the geographic distribution of regional economy.
Keywords: geographic distribution optimization of strategy mode of policy
The Harvesting Optimal Problem of Richards Model
16
Author: ZENG Zhi-jun. Chinese Quarterly Journal of Mathematics. CSCD. 2013, No. 3, pp. 366-375 (10 pages)
In this paper, the exploitation of a single population modelled by the Richards model is studied. Choosing the maximum annual-sustainable yield as the management objective, we investigate the optimal harvesting policies for the autonomous and periodic exploited Richards model. Further, when the functions in the exploited Richards model are stably bounded functions, we study the ultimately optimal harvesting policy and obtain the corresponding average limiting maximum sustainable yield.
Keywords: Richards model stably bounded function ultimately optimal harvesting policy
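As a rough worked illustration of a maximum sustainable yield in the autonomous case only, the following assumes one common form of the harvested Richards equation with constant effort. The symbols r, K, θ, and E are assumptions for this sketch and need not match the paper's notation, its periodic setting, or its stably bounded coefficients.

```latex
% Assumed autonomous harvested Richards model (illustrative notation):
\frac{dx}{dt} = r x\left[1 - \left(\frac{x}{K}\right)^{\theta}\right] - E x,
\qquad 0 < E < r.

% Positive equilibrium and the sustainable yield at that equilibrium:
x^{*}(E) = K\left(1 - \frac{E}{r}\right)^{1/\theta},
\qquad
Y(E) = E\,x^{*}(E).

% Setting dY/dE = 0 gives 1 - E/r = E/(\theta r), hence
E^{*} = \frac{r\theta}{1+\theta},
\qquad
x^{*}(E^{*}) = \frac{K}{(1+\theta)^{1/\theta}},
\qquad
Y_{\max} = \frac{r\theta K}{(1+\theta)^{1+1/\theta}}.
```

For θ = 1 the model reduces to the logistic case and the expressions give the familiar E* = r/2, x* = K/2, and a maximum yield of rK/4.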
Policy Optimization Study Based on Evolutionary Learning
17
Authors: 刘素平, 丁永生. Journal of Donghua University (English Edition). EI, CAS. 2009, No. 6, pp. 621-624 (4 pages)
To achieve an intelligent and automated self-management network, dynamic policy configuration and selection are needed. A given policy suits only a particular network environment; if the environment changes, the policy no longer fits. Policy-based management should therefore have a similar "natural selection" process, in which useful policies are retained and policies that have lost their effectiveness are eliminated. A policy optimization method based on evolutionary learning was proposed. Policies are ranked by their shooting times: a policy with high shooting times is given higher priority, a policy with a low rate is given lower priority, and a policy with no shooting over a long period becomes dormant. Thus survival of the fittest is realized, and the degree of self-learning in policy management is improved.
Keywords: policy-based management evolution learning policy optimization
B-Spline-Based Curve Fitting to Cam Pitch Curve Using Reinforcement Learning (Cited by: 1)
18
Authors: Zhiwei Lin, Tianding Chen, Yingtao Jiang, Hui Wang, Shuqin Lin, Ming Zhu. Intelligent Automation & Soft Computing. SCIE. 2023, No. 5, pp. 2145-2164 (20 pages)
Directly applying the B-spline interpolation function to process plate cams in a computer numerical control (CNC) system may produce verbose tool-path codes and unsmooth trajectories. This paper is devoted to addressing the problem of B-spline fitting for cam pitch curves. Considering that the B-spline curve needs to meet the motion law of the follower to approximate the pitch curve, we use the radial error to quantify the effects of the fitting B-spline curve and the pitch curve. The problem thus boils down to a difficult global optimization problem: find the numbers and positions of the control points or data points of the B-spline curve such that the cumulative radial error between the fitting curve and the original curve is minimized. This paper tackles the problem with a double deep Q-network (DDQN) reinforcement learning (RL) algorithm with data-point traceability. Specifically, the RL environment, action set, and current state set are designed to facilitate the search for the data points, along with the design of the reward function and the initialization of the neural network. The experimental results show that when the angle division value of the action set is fixed, the proposed algorithm can maximize the number of data points of the B-spline curve and accurately place these data points in the right positions, with the minimum average radial error. Our work establishes the theoretical foundation for studying spline fitting using the RL method.
Keywords: B-spline fitting radial error DDQN RL algorithm global optimal policy
On optimal charging scheduling for electric vehicles with wind power generation
19
Authors: Junjie Wu, Qing-Shan Jia. Fundamental Research. CAS, CSCD. 2024, No. 4, pp. 951-960 (10 pages)
We consider the scheduling of battery charging of electric vehicles (EVs) integrated with renewable power generation. The increasing adoption of EVs and the development of renewable energies make this research important. Optimizing the charging schedule is challenging because of the large action space, the multi-stage decision making, and the high uncertainty. Solving this problem is time-consuming when the scale of the system is large, so a practical and efficient method for scheduling the charging of EVs is urgently needed. The contribution of this work is threefold. First, we provide a sufficient condition under which the charging of EVs can be completely self-sustained by distributed generation. An algorithm is proposed to obtain the optimal charging policy when the sufficient condition holds. Second, the scenario in which the supply of renewable power generation is deficient is investigated. We prove that when the renewable generation is deterministic, there exists an optimal policy that follows the modified least laxity and longer remaining processing time first (mLLLP) rule. Third, we provide an adaptive rule-based algorithm that efficiently obtains a near-optimal charging policy in general situations. We test the proposed algorithm by numerical experiments. The results show that it performs better than other existing rule-based methods.
Keywords: Electric vehicle Charging scheduling Wind power Optimal policy Renewable energy
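The abstract above names a least-laxity rule with a longer-remaining-processing-time tie-break but does not spell out the modification in mLLLP. The following is a minimal sketch, assuming plain least-laxity-first ordering with that tie-break; the `EV` class, its fields, and `charging_order` are hypothetical names, and this is not the paper's algorithm.

```python
from dataclasses import dataclass


@dataclass
class EV:
    name: str
    remaining: float  # charging time still needed (hours), assumed known
    deadline: float   # departure time (hours), assumed known


def laxity(ev: EV, now: float) -> float:
    # Laxity is the slack left if charging started right now.
    return ev.deadline - now - ev.remaining


def charging_order(evs: list[EV], now: float) -> list[EV]:
    # Least laxity first; ties go to the EV with the longer remaining time.
    return sorted(evs, key=lambda ev: (laxity(ev, now), -ev.remaining))


fleet = [EV("a", 2.0, 6.0), EV("b", 3.0, 5.0), EV("c", 1.0, 3.0)]
order = [ev.name for ev in charging_order(fleet, now=0.0)]
```

Here vehicles b and c both have a laxity of 2 hours, so b, which needs more charging time, is ranked ahead of c, and a comes last with a laxity of 4 hours.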
Task assignment in ground-to-air confrontation based on multiagent deep reinforcement learning (Cited by: 3)
20
Authors: Jia-yi Liu, Gang Wang, Qiang Fu, Shao-hua Yue, Si-yuan Wang. Defence Technology (防务技术). SCIE, EI, CAS, CSCD. 2023, No. 1, pp. 210-219 (10 pages)
The scale of ground-to-air confrontation task assignment is large, and it must deal with many concurrent task assignments and random events. When existing task assignment methods are applied to ground-to-air confrontation, they handle complex tasks inefficiently and produce interaction conflicts in multiagent systems. To address these problems, this study proposes a multiagent architecture based on one general agent with multiple narrow agents (OGMN) to reduce task assignment conflicts. Considering the slow speed of traditional dynamic task assignment algorithms, this paper proposes the proximal policy optimization for task assignment of general and narrow agents (PPO-TAGNA) algorithm. Based on the idea of the optimal assignment strategy algorithm and combined with the training framework of deep reinforcement learning (DRL), the algorithm adds a multihead attention mechanism and a stage reward mechanism to the bilateral band clipping PPO algorithm to solve the problem of low training efficiency. Finally, simulation experiments are carried out in the digital battlefield. The multiagent architecture based on OGMN combined with the PPO-TAGNA algorithm obtains higher rewards faster and has a higher win ratio. By analyzing agent behavior, the efficiency, superiority, and rationality of resource utilization of this method are verified.
Keywords: Ground-to-air confrontation Task assignment General and narrow agents Deep reinforcement learning Proximal policy optimization (PPO)
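For orientation, the clipped surrogate objective of standard PPO is shown below. This is the widely used baseline form, given here as background only; the bilateral band clipping variant, the attention mechanism, and the stage reward described in the abstract are not reproduced, and the symbols follow the usual PPO convention rather than the paper's notation.

```latex
% Probability ratio between the new and old policies:
r_t(\theta) = \frac{\pi_{\theta}(a_t \mid s_t)}{\pi_{\theta_{\mathrm{old}}}(a_t \mid s_t)}

% Clipped surrogate objective, with advantage estimate \hat{A}_t
% and clipping range \epsilon:
L^{\mathrm{CLIP}}(\theta) =
\hat{\mathbb{E}}_t\!\left[
  \min\!\Big( r_t(\theta)\,\hat{A}_t,\;
  \operatorname{clip}\big(r_t(\theta),\, 1-\epsilon,\, 1+\epsilon\big)\,\hat{A}_t \Big)
\right]
```

The minimum removes any incentive to push the ratio outside the interval [1-ε, 1+ε], which keeps each policy update close to the policy that collected the data.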