摘要
In the fifth generation(5G)wireless system,a closed-loop power control(CLPC)scheme based on deep Q learning network(DQN)is introduced to intelligently adjust the transmit power of the base station(BS),which can improve the user equipment(UE)received signal to interference plus noise ratio(SINR)to a target threshold range.However,the selected power control(PC)action in DQN is not accurately matched the fluctuations of the wireless environment.Since the experience replay characteristic of the conventional DQN scheme leads to a possibility of insufficient training in the target deep neural network(DNN).As a result,the Q-value of the sub-optimal PC action exceed the optimal one.To solve this problem,we propose the improved DQN scheme.In the proposed scheme,we add an additional DNN to the conventional DQN,and set a shorter training interval to speed up the training of the DNN in order to fully train it.Finally,the proposed scheme can ensure that the Q value of the optimal action remains maximum.After multiple episodes of training,the proposed scheme can generate more accurate PC actions to match the fluctuations of the wireless environment.As a result,the UE received SINR can achieve the target threshold range faster and keep more stable.The simulation results prove that the proposed scheme outperforms the conventional schemes.