Abstract
As intelligent services account for a rapidly growing share of the industrial internet of things (IIoT), their data dependency and decomposability make computing resources harder to schedule. In this paper, we propose an intelligent service computing framework. The framework takes as its optimization goal the long-term reward of edge service providers, which are key participants in the IIoT; this reward depends on service delay and computing cost. Because data deployment and service offloading are updated at different frequencies, the framework uses double-timescale reinforcement learning. In the small-timescale strategy, services frequently run concurrently and differ in service time, which blurs the relationship between reward and action. To solve this fuzzy reward problem, we propose a reward mapping-based reinforcement learning (RMRL) algorithm, which enables the agent to learn the relationship between reward and action more clearly. The large-timescale strategy adopts an improved Monte Carlo tree search (MCTS) algorithm to improve the learning speed. Simulation results show that the proposed strategy learns faster than popular reinforcement learning algorithms such as double deep Q-network (DDQN) and dueling deep Q-network (dueling-DQN), and that it also increases the reward by 14%.
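The fuzzy reward problem can be illustrated with a minimal sketch. It is not the RMRL algorithm itself, whose details the abstract does not give; it only shows the general idea of crediting a delayed reward to the action that launched the service, rather than to whichever action happened to be most recent. The class `RewardMapper`, its methods, the `cost_weight` parameter, and the reward form (negative weighted sum of delay and cost) are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class RewardMapper:
    """Hypothetical helper: maps each delayed reward back to its originating action."""
    cost_weight: float = 1.0                      # assumed trade-off between delay and cost
    pending: dict = field(default_factory=dict)   # service_id -> (state, action)
    replay: list = field(default_factory=list)    # (state, action, reward) tuples

    def dispatch(self, service_id, state, action):
        # Remember which offloading decision launched this service.
        self.pending[service_id] = (state, action)

    def complete(self, service_id, delay, cost):
        # Credit the reward to the originating action, not the latest one.
        state, action = self.pending.pop(service_id)
        reward = -(delay + self.cost_weight * cost)  # assumed reward form
        self.replay.append((state, action, reward))
        return reward


mapper = RewardMapper(cost_weight=0.5)
mapper.dispatch("svc-a", state=(0,), action=1)   # long-running service
mapper.dispatch("svc-b", state=(1,), action=0)   # shorter service, started later
mapper.complete("svc-b", delay=1.0, cost=2.0)    # finishes first
mapper.complete("svc-a", delay=4.0, cost=1.0)    # finishes second
print(mapper.replay)
```

Even though the two services overlap and finish out of order, each stored transition pairs a reward with the state and action that produced it, which is the kind of unambiguous training signal a reward-mapping approach aims for.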
Funding
Supported by the National Natural Science Foundation of China (No. 62171051).