This paper studies the optimal policy for joint control of admission, routing, service, and jockeying in a queueing system consisting of two exponential servers in parallel. Jobs arrive according to a Poisson process. Upon each arrival, an admission/routing decision is made, and the accepted job is routed to one of the two servers, each of which is associated with a queue. After each service completion, a server has the option of serving a job from its own queue, serving a jockeying job from the other queue, or staying idle. The system performance measure comprises the revenues from accepted jobs, the costs of holding jobs in queues, the service costs, and the job jockeying costs. To maximize the total expected discounted return, we formulate a Markov decision process (MDP) model for this system. The value iteration method is employed to characterize the optimal policy as a hedging point policy. Numerical studies verify the structure of the hedging point policy, which is convenient for implementing control actions in practice.
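As a rough illustration (not taken from the paper), a minimal value-iteration sketch for a truncated two-queue admission/routing MDP is given below. All parameters, the cost structure, and the uniformization setup are hypothetical, and the sketch omits the service and jockeying controls that the paper also optimizes.

```python
import numpy as np

# Hypothetical rates and costs for a two-server admission/routing sketch.
LAM, MU1, MU2 = 1.0, 0.8, 0.6       # arrival rate and the two service rates
R, H1, H2 = 5.0, 1.0, 1.2           # admission reward, holding costs per unit time
BETA, CAP = 0.1, 20                 # discount rate, queue truncation level
UNIF = LAM + MU1 + MU2              # uniformization constant

V = np.zeros((CAP + 1, CAP + 1))
for _ in range(5000):
    Vn = np.empty_like(V)
    for i in range(CAP + 1):
        for j in range(CAP + 1):
            # arrival decision: reject, or admit and route to queue 1 / queue 2
            arr = max(V[i, j],
                      R + V[min(i + 1, CAP), j],
                      R + V[i, min(j + 1, CAP)])
            # service completions (self-loop when a queue is empty)
            dep1 = V[max(i - 1, 0), j]
            dep2 = V[i, max(j - 1, 0)]
            Vn[i, j] = (-(H1 * i + H2 * j)
                        + LAM * arr + MU1 * dep1 + MU2 * dep2) / (BETA + UNIF)
    if np.max(np.abs(Vn - V)) < 1e-8:
        V = Vn
        break
    V = Vn

# The maximizer in the arrival branch of each state (i, j) gives the optimal
# admission/routing action; monotone switching levels in (i, j) correspond to
# the hedging-point structure characterized in the paper.
```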
To investigate the equilibrium relationships between the volatility of capital and income, taxation, and ance in a stochastic control model, the uniqueness of the solution to this model was proved by using the method of dynamic programming, with distributive disturbance and elastic labor supply introduced. Furthermore, the effects of two types of shocks on labor-leisure choice, the economic growth rate, and welfare were analyzed numerically, and the optimal tax policy was then derived.
In this paper, we present a new method for finding a fixed local-optimal policy for computing the customer lifetime value. The method is developed for a class of ergodic controllable finite Markov chains. We propose an approach based on a non-converging state-value function that fluctuates (increases and decreases) between states of the dynamic process. We prove that it is possible to represent that function in a recursive format using a one-step-ahead fixed-optimal policy. Then, we provide an analytical formula for the numerical realization of the fixed local-optimal strategy. We also present a second approach, based on linear programming, to solve the same problem; it implements the c-variable method to make the problem computationally tractable. Finally, we show that these two approaches are related: after a finite number of iterations, our proposed approach converges to the same result as the linear programming method. We also present a non-traditional approach for ergodicity verification. The validity of the proposed methods is demonstrated both theoretically and by simulated credit-card marketing experiments that compute the customer lifetime value under both an optimization and a game-theoretic approach.
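For background, the occupation-measure linear program commonly associated with the c-variable formulation for ergodic controllable finite Markov chains can be written as follows; the paper's exact formulation may differ.

```latex
\max_{c\ge 0}\ \sum_{s\in S}\sum_{a\in A} c(s,a)\,r(s,a)
\quad\text{s.t.}\quad
\sum_{a\in A} c(s',a)=\sum_{s\in S}\sum_{a\in A} p(s'\mid s,a)\,c(s,a)\ \ \forall s'\in S,
\qquad
\sum_{s\in S}\sum_{a\in A} c(s,a)=1 ,
```

where the variables c(s, a) are stationary state-action frequencies and a stationary policy is recovered as π(a | s) = c(s, a) / Σ_{a'} c(s, a').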
In this paper we discuss discrete-time non-homogeneous discounted Markovian decision programming, where the state space and all action sets are countable. Assuming that the optimum value function is finite, we give necessary and sufficient conditions for the existence of an optimal policy. Assuming that the absolute mean of the rewards is relatively bounded, we also give necessary and sufficient conditions for the existence of an optimal policy.
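In the notation commonly used for such non-homogeneous discounted models (a standard form, not quoted from the paper), the optimum value functions satisfy

```latex
V_n^{*}(s)=\sup_{a\in A_n(s)}\Big\{r_n(s,a)+\beta\sum_{s'\in S}p_n(s'\mid s,a)\,V_{n+1}^{*}(s')\Big\},
\qquad n=0,1,2,\dots ,
```

and the existence of an optimal policy hinges on whether these suprema can be attained (or suitably approximated) along the states actually visited, which is what the paper's conditions characterize.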
This article studies an inshore-offshore fishery model with impulsive diffusion. The existence and global asymptotic stability of both the trivial periodic solution and the positive periodic solution are obtained. The complexity of this system is also analyzed. Moreover, the optimal harvesting policy is given for the inshore subpopulation, including the maximum sustainable yield and the corresponding harvesting effort.
This paper employs a stochastic endogenous growth model, extended to the case of a recursive utility function that can disentangle intertemporal substitution from risk aversion, to analyze productive government expenditure and optimal fiscal policy, with particular stress on the importance of factor income. First, the explicit solutions of the central planner's stochastic optimization problem are derived; the growth-maximizing and welfare-maximizing government expenditure policies are obtained, and whether they conflict or coincide depends upon intertemporal substitution. Second, the explicit solutions of the representative individual's stochastic optimization problem, which permits capital income and labor income to be taxed separately, are derived, and it is found that the effect of risk on growth crucially depends on the degree of risk aversion, the intertemporal elasticity of substitution, and the capital income share. Finally, a flexible optimal tax policy that can be internally adjusted to a certain extent is derived, and it is found that the distribution of factor income plays an important role in designing the optimal tax policy.
This paper aims to improve the performance of a class of distributed parameter systems through the optimal switching of actuators and controllers based on event-driven control. It is assumed that, among the available actuators, only one actuator can receive the control signal and be activated over an unfixed time interval, while the other actuators remain dormant. After incorporating a state observer into the event generator, the event-driven control loop and the minimum inter-event time are shown to be ultimately bounded. Based on the event-driven state feedback control, time intervals of unfixed length can be obtained. The optimal switching policy is based on finite-horizon linear quadratic optimal control at the beginning of each time subinterval. A simulation example demonstrates the effectiveness of the proposed policy.
I consider a system whose deterioration follows a discrete-time, discrete-state Markov chain with an absorbing state. Once the system is put into operation, I may select operation (wait), imperfect repair, or replacement at each discrete time point. The true state of the system is not known while it is operated. Instead, the system is monitored after operation, and some incomplete information concerning the deterioration is obtained for decision making. Since there are multiple imperfect repairs, one of them can be selected whenever imperfect repair is preferable to operation and replacement. To express this situation, I propose a POMDP model and theoretically investigate the structure of an optimal maintenance policy minimizing the total expected discounted cost over an unbounded horizon. Two stochastic orders are then used for the analysis of the problem.
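A sketch of the Bayesian belief update that underlies such a POMDP analysis is given below; it is a generic finite-state filter with hypothetical transition and observation matrices, not the paper's specific deterioration model.

```python
import numpy as np

def belief_update(b, action, obs, P, O):
    """One Bayes-filter step for a finite-state POMDP.

    b: current belief over deterioration states, shape (S,)
    P: P[action] is the S x S transition matrix under `action`
    O: O[action] is the S x Z observation matrix (rows indexed by next state)
    Returns the posterior belief after taking `action` and observing `obs`.
    """
    predicted = b @ P[action]                # prior over the next state
    unnorm = predicted * O[action][:, obs]   # weight by observation likelihood
    return unnorm / unnorm.sum()

# Hypothetical 3-state example: 0 (good), 1 (degraded), 2 (failed/absorbing).
P = {"wait": np.array([[0.8, 0.15, 0.05],
                       [0.0, 0.70, 0.30],
                       [0.0, 0.00, 1.00]])}
O = {"wait": np.array([[0.9, 0.1],
                       [0.4, 0.6],
                       [0.1, 0.9]])}
b0 = np.array([1.0, 0.0, 0.0])
print(belief_update(b0, "wait", obs=1, P=P, O=O))
```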
To investigate the effects of various random factors on the preventive maintenance (PM) decision-making for a type of two-unit series system, an optimal quasi-periodic PM policy is introduced. In the model, PM is assumed to be perfect for unit 1 and only a mechanical service for unit 2. PM activity is performed randomly according to a dynamic PM plan distributed over each implementation period. Replacement is determined by the competition between unplanned and planned replacements: the unplanned replacement is triggered by a catastrophic failure of unit 2, and the planned replacement is executed when the number of PMs reaches the threshold N. Through modeling and analysis, a solution algorithm for the optimal implementation period and PM number is given, and the optimization process and parametric sensitivity are illustrated by a numerical example. Results show that the implementation period should be decreased as far as practical needs allow, which increases the mean operating time and decreases the long-run cost rate.
This paper considers a model of an insurance company that is allowed to invest in a risky asset and to purchase proportional reinsurance. The objective is to find the policy that maximizes the expected total discounted dividend pay-out until the time of bankruptcy and the terminal value of the company, under a liquidity constraint. We find the solution of this problem by solving the problem with zero terminal value. We also analyze the influence of the terminal value on the optimal policy.
The maintenance model of a simple repairable system is studied. We assume that there are two types of failure, namely type Ⅰ failures (repairable) and type Ⅱ failures (irreparable). Whenever a type Ⅰ failure occurs, the system is repaired immediately, which is a failure repair (FR). Between the (n-1)th and the nth FR, the system is preventively repaired (PR) when its consecutive working time reaches λ^(n-1)T, where λ and T are specified values. Further, we assume that the system goes on working when a repair is finished and is replaced at the occurrence of the Nth type Ⅰ failure or the first type Ⅱ failure, whichever occurs first. In practice, the system degrades with the increasing number of repairs; that is, the consecutive working times of the system form a decreasing generalized geometric process (GGP), whereas the successive repair times form an increasing GGP. A simple bivariate policy (T, N) repairable model is introduced based on the GGP. The alternative searching method is used to minimize the cost rate function C(N, T), and the optimal (T, N)* is obtained. Finally, numerical cases are given to demonstrate the reasonability of this model.
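The alternative (alternating) search over the bivariate policy can be sketched as follows. The cost-rate function here is only a placeholder: the paper's C(N, T) comes from a renewal-reward argument built on the GGP parameters and failure distributions.

```python
import numpy as np

def cost_rate(N, T):
    # Placeholder for the long-run cost rate C(N, T); replace with the
    # renewal-reward expression derived from the generalized geometric process.
    return 1.0 / T + 0.05 * T + 0.3 / N + 0.02 * N

def alternating_search(N_max=50, T_grid=np.linspace(0.5, 50.0, 500), iters=20):
    """Alternately optimize T for fixed N, then N for fixed T, until (N, T) stabilizes."""
    N, T = 1, T_grid[0]
    for _ in range(iters):
        T_new = T_grid[np.argmin([cost_rate(N, t) for t in T_grid])]
        N_new = min(range(1, N_max + 1), key=lambda n: cost_rate(n, T_new))
        if N_new == N and np.isclose(T_new, T):
            break
        N, T = N_new, T_new
    return N, T, cost_rate(N, T)

print(alternating_search())   # optimal (N, T)* and its cost rate for the placeholder
```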
ARINC653 systems, which have been widely used in the avionics industry, are an important class of safety-critical applications. Partitions are the core concept in the ARINC653 system architecture. Because of the existence of partitions, the system designer must statically allocate adequate time slots to each partition in the design phase. Although some time slot allocation policies could be borrowed from task scheduling policies, no existing literature gives an optimal allocation policy. In this paper, we present a partition configuration policy and prove that it is optimal in the sense that if this policy fails to configure adequate time slots for each partition, then so does every other policy. Then, by simulation, we show the effects of different partition configuration policies on the time slot allocation of partitions and on task response time, respectively.
This paper presents an optimal production model for a manufacturer in a supply chain with a fixed demand at a fixed interval, taking into account the learning effect on production capacity. An algorithm is employed to find the optimal delay time for production and the production time sequentially. It is found that the optimal delay time for production and the production time are not static but dynamic, varying with time. It is important for a manufacturer to schedule production so as to prevent facilities and workers from idling.
In communication networks with a policy-based Transport Control on-Demand (TCoD) function, the transport control policies have a great impact on network effectiveness. To evaluate and optimize the transport policies in a communication network, a policy-based TCoD network model is given, and a comprehensive evaluation index system for network effectiveness is put forward from both the network application and handling mechanism perspectives. A TCoD network prototype system based on Asynchronous Transfer Mode/Multi-Protocol Label Switching (ATM/MPLS) is introduced, and some experiments are performed on it. The prototype system is evaluated and analyzed with the comprehensive evaluation index system. The results show that the index system can be used to judge whether the communication network meets the application requirements, and can provide references for the optimization of the transport policies so as to improve communication network effectiveness.
According to this paper, the dragon-shape strategy is the optimal option for China's future strategy with respect to the geographic distribution of its regional economy.
In this paper, the exploitation of a single population modelled by the Richards model is studied. Choosing the maximum annual sustainable yield as the management objective, we investigate the optimal harvesting policies for the autonomous and the periodic exploited Richards models. Further, when the functions in the exploited Richards model are stably bounded functions, we study the ultimately optimal harvesting policy and obtain the corresponding average limiting maximum sustainable yield.
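For the autonomous, constant-parameter case, the Richards growth law under harvesting and the stock level that maximizes the sustainable yield can be written explicitly; this is the standard calculation that the paper's periodic and bounded-coefficient cases generalize.

```latex
\frac{dx}{dt}=r x\left[1-\left(\frac{x}{K}\right)^{\theta}\right]-h ,
\qquad
x^{*}=K\,(1+\theta)^{-1/\theta},
\qquad
h_{\max}=r K\,\frac{\theta}{1+\theta}\,(1+\theta)^{-1/\theta},
```

where h is the harvesting rate and h_max, obtained by maximizing r x [1 - (x/K)^θ] over x, is the maximum sustainable yield.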
In order to achieve an intelligent and automated self-management network, dynamic policy configuration and selection are needed. A given policy suits only a particular network environment; if the environment changes, that policy may no longer be suitable. Therefore, policy-based management should also have a similar "natural selection" process: useful policies are retained, and policies that have lost their effectiveness are eliminated. A policy optimization method based on evolutionary learning is proposed. According to how often policies are triggered, the priority of a policy with a high trigger count is raised, a policy with a low trigger rate is given lower priority, and a policy that has not been triggered for a long time becomes dormant. Thus survival of the fittest among policies is realized, and the degree of self-learning in policy management is improved.
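A toy sketch of the priority-adjustment and dormancy rule described above; the field names, boost/decay increments, and dormancy threshold are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class ManagedPolicy:
    name: str
    priority: float = 1.0
    hit_count: int = 0        # how many times the policy has been triggered
    idle_rounds: int = 0      # consecutive evaluation rounds without a trigger
    dormant: bool = False

def evolve(policies, hits, boost=0.2, decay=0.1, dormancy_limit=10):
    """Raise the priority of frequently triggered policies, lower the priority of
    rarely triggered ones, and put long-idle policies into a dormant state."""
    for p in policies:
        if hits.get(p.name, 0) > 0:
            p.hit_count += hits[p.name]
            p.idle_rounds = 0
            p.priority += boost * hits[p.name]
            p.dormant = False
        else:
            p.idle_rounds += 1
            p.priority = max(0.0, p.priority - decay)
            if p.idle_rounds >= dormancy_limit:
                p.dormant = True
    return sorted(policies, key=lambda p: p.priority, reverse=True)

pols = [ManagedPolicy("qos-reroute"), ManagedPolicy("legacy-filter")]
pols = evolve(pols, hits={"qos-reroute": 3})   # "qos-reroute" now outranks the idle policy
```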
Directly applying the B-spline interpolation function to process plate cams in a computer numerical control (CNC) system may produce verbose tool-path codes and unsmooth trajectories. This paper is devoted to the problem of B-spline fitting for cam pitch curves. Considering that the B-spline curve needs to meet the motion law of the follower to approximate the pitch curve, we use the radial error to quantify the deviation between the fitting B-spline curve and the pitch curve. The problem thus boils down to a difficult global optimization problem: finding the number and positions of the control points or data points of the B-spline curve such that the cumulative radial error between the fitting curve and the original curve is minimized. This problem is attacked in this paper with a double deep Q-network (DDQN) reinforcement learning (RL) algorithm with data-point traceability. Specifically, the RL environment, action set, and current state set are designed to facilitate the search for the data points, along with the design of the reward function and the initialization of the neural network. The experimental results show that, when the angle division value of the action set is fixed, the proposed algorithm can maximize the number of data points of the B-spline curve and accurately place these data points at the right positions, with the minimum average radial error. Our work establishes a theoretical foundation for studying spline fitting using RL methods.
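The double deep Q-network update the algorithm builds on computes its bootstrap target by letting the online network choose the next action and the target network evaluate it. The generic sketch below shows only that target computation; the paper's state, action, and reward designs for data-point placement are not reproduced.

```python
import numpy as np

def ddqn_targets(rewards, next_states, dones, q_online, q_target, gamma=0.99):
    """Double DQN bootstrap targets.

    q_online(s) and q_target(s) return arrays of Q-values over the action set;
    here they stand in for the online and target neural networks.
    """
    targets = np.empty(len(rewards))
    for k, (r, s_next, done) in enumerate(zip(rewards, next_states, dones)):
        if done:
            targets[k] = r
        else:
            a_star = int(np.argmax(q_online(s_next)))            # action selection: online net
            targets[k] = r + gamma * q_target(s_next)[a_star]    # action evaluation: target net
    return targets
```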
We consider the scheduling of battery charging for electric vehicles (EVs) integrated with renewable power generation. The increasing adoption of EVs and the development of renewable energies lend importance to this research. Optimizing the charging schedule is challenging because of the large action space, the multi-stage decision making, and the high uncertainty, and solving the problem is time-consuming when the scale of the system is large; a practical and efficient method to properly schedule the charging of EVs is therefore needed. The contribution of this work is threefold. First, we provide a sufficient condition under which the charging of EVs can be completely self-sustained by distributed generation, and propose an algorithm that obtains the optimal charging policy when the sufficient condition holds. Second, the scenario in which the supply of renewable power generation is deficient is investigated; we prove that when the renewable generation is deterministic, there exists an optimal policy that follows the modified least laxity and longer remaining processing time first (mLLLP) rule. Third, we provide an adaptive rule-based algorithm that obtains a near-optimal charging policy efficiently in general situations. We test the proposed algorithm by numerical experiments. The results show that it performs better than other existing rule-based methods.
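The rule's name suggests ordering vehicles primarily by least laxity and breaking ties in favour of longer remaining charging time; the sketch below follows that reading and does not reproduce the paper's specific modification.

```python
def mlllp_order(evs, now):
    """Order EVs for charging: least laxity first, ties broken by longer remaining
    charging time first. Each EV is a dict with 'deadline' and 'remaining'
    (remaining charging time); laxity = time to deadline minus remaining work."""
    def key(ev):
        laxity = (ev["deadline"] - now) - ev["remaining"]
        return (laxity, -ev["remaining"])
    return sorted(evs, key=key)

fleet = [{"id": 1, "deadline": 8.0, "remaining": 3.0},
         {"id": 2, "deadline": 6.0, "remaining": 2.0},
         {"id": 3, "deadline": 6.0, "remaining": 4.0}]
print([ev["id"] for ev in mlllp_order(fleet, now=0.0)])  # EV 3 first (smallest laxity)
```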
The scale of ground-to-air confrontation task assignment is large, and many concurrent task assignments and random events need to be handled. When existing task assignment methods are applied to ground-to-air confrontation, they are inefficient in dealing with complex tasks and suffer interaction conflicts in multiagent systems. To reduce task assignment conflicts, this study proposes a multiagent architecture based on one general agent with multiple narrow agents (OGMN). Considering the slow speed of traditional dynamic task assignment algorithms, this paper proposes the proximal policy optimization for task assignment of general and narrow agents (PPO-TAGNA) algorithm. The algorithm is based on the idea of the optimal assignment strategy algorithm, is combined with the training framework of deep reinforcement learning (DRL), and adds a multi-head attention mechanism and a stage reward mechanism to the bilateral band clipping PPO algorithm to solve the problem of low training efficiency. Finally, simulation experiments are carried out on a digital battlefield. The multiagent architecture based on OGMN combined with the PPO-TAGNA algorithm obtains higher rewards faster and has a higher win ratio. By analyzing agent behavior, the efficiency, superiority, and rationality of resource utilization of this method are verified.
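For reference, the clipped surrogate objective that PPO, and hence PPO-TAGNA, optimizes is, in its standard form,

```latex
L^{\mathrm{CLIP}}(\theta)=\mathbb{E}_t\!\left[\min\!\Big(r_t(\theta)\,\hat{A}_t,\ \operatorname{clip}\big(r_t(\theta),\,1-\epsilon,\,1+\epsilon\big)\,\hat{A}_t\Big)\right],
\qquad
r_t(\theta)=\frac{\pi_\theta(a_t\mid s_t)}{\pi_{\theta_{\mathrm{old}}}(a_t\mid s_t)} ,
```

to which the paper adds a multi-head attention encoder and a stage reward mechanism; those additions are not formalized here.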