Abstract
Device-to-device (D2D) communication, also called terminal pass-through technology, is one of the most effective technologies for 5G mobile communications. It lets devices communicate directly with each other under the control of a base station, without the base station forwarding the traffic. Applying D2D communication to cellular networks can increase system capacity, improve spectrum efficiency, raise the data transmission rate, and reduce the base station load. To address co-channel interference between D2D users and cellular users, this paper proposes an efficient resource allocation algorithm based on Q-learning, in which multiple D2D users act as multi-agent learners. The system throughput is determined from the learned Q-value table for the corresponding state, and the action with the maximum Q value is obtained through dynamic power control of the D2D users. The Q-learning process requires neither knowledge of the mutual interference between the D2D users and the base station nor exact channel state information, and a symmetric data transmission mechanism is adopted. The proposed algorithm maximizes the system throughput by controlling the transmit power of the D2D users while guaranteeing the quality of service of the cellular users. Simulation results show that the proposed algorithm effectively improves system performance compared with existing algorithms.
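The power-control idea summarized above can be illustrated with a minimal sketch of independent tabular Q-learning, in which each D2D pair is an agent that picks a transmit power level. This is not the paper's algorithm: the function name `run_q_learning`, the two-state model (cellular QoS met or not), the reward (system throughput, or a fixed penalty when the cellular SINR falls below `sinr_min`), the channel gains, and all numeric values are assumptions chosen only for illustration. Interference between D2D pairs is ignored for brevity.

```python
import math
import random


def run_q_learning(n_d2d=2, power_levels=(0.01, 0.05, 0.1), steps=2000,
                   alpha=0.1, gamma=0.9, epsilon=0.1, sinr_min=30.0, seed=0):
    """Illustrative multi-agent Q-learning for D2D power control (all values assumed)."""
    rng = random.Random(seed)
    n_actions = len(power_levels)

    # Hypothetical fixed channel gains (linear scale), not taken from the paper.
    g_dd = [rng.uniform(0.5, 1.0) for _ in range(n_d2d)]    # D2D transmitter -> its own receiver
    g_db = [rng.uniform(0.01, 0.05) for _ in range(n_d2d)]  # D2D transmitter -> base station
    g_cd = [rng.uniform(0.01, 0.05) for _ in range(n_d2d)]  # cellular user -> D2D receiver
    g_cb, p_c, noise = 1.0, 0.2, 1e-3                       # cellular link gain, power, noise

    # One Q table per agent: 2 states (0 = QoS violated, 1 = QoS met) x power levels.
    q = [[[0.0] * n_actions for _ in range(2)] for _ in range(n_d2d)]
    state = 1

    for _ in range(steps):
        # Each agent selects a power level with an epsilon-greedy policy.
        actions = []
        for i in range(n_d2d):
            if rng.random() < epsilon:
                actions.append(rng.randrange(n_actions))
            else:
                row = q[i][state]
                actions.append(row.index(max(row)))
        p = [power_levels[a] for a in actions]

        # Cellular SINR at the base station and total system throughput (bit/s/Hz).
        sinr_c = p_c * g_cb / (noise + sum(pi * g for pi, g in zip(p, g_db)))
        rate_d = sum(math.log2(1 + pi * g / (noise + p_c * gc))
                     for pi, g, gc in zip(p, g_dd, g_cd))
        throughput = math.log2(1 + sinr_c) + rate_d

        # Shared reward: throughput if the cellular QoS holds, otherwise a penalty.
        next_state = 1 if sinr_c >= sinr_min else 0
        reward = throughput if next_state == 1 else -1.0

        # Standard Q-learning update, applied independently by every agent.
        for i, a in enumerate(actions):
            target = reward + gamma * max(q[i][next_state])
            q[i][state][a] += alpha * (target - q[i][state][a])
        state = next_state

    # Greedy power level of each agent in the "QoS met" state.
    best = [power_levels[q[i][1].index(max(q[i][1]))] for i in range(n_d2d)]
    return q, best


if __name__ == "__main__":
    _, best_powers = run_q_learning()
    print("Greedy power per D2D pair:", best_powers)
```

Note that the agents in this sketch learn only from the observed reward and the QoS state, so no explicit interference model or exact channel state information enters the update itself; the gains are used only to simulate the environment.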