63 articles found
Unleashing the Power of Multi-Agent Reinforcement Learning for Algorithmic Trading in the Digital Financial Frontier and Enterprise Information Systems
1
Authors: Saket Sarin, Sunil K. Singh, Sudhakar Kumar, Shivam Goyal, Brij Bhooshan Gupta, Wadee Alhalabi, Varsha Arya | Source: Computers, Materials & Continua (SCIE, EI), 2024, No. 8, pp. 3123-3138 (16 pages)
In the rapidly evolving landscape of today's digital economy, Financial Technology (Fintech) emerges as a transformative force, propelled by the dynamic synergy between Artificial Intelligence (AI) and Algorithmic Trading. Our in-depth investigation delves into the intricacies of merging Multi-Agent Reinforcement Learning (MARL) and Explainable AI (XAI) within Fintech, aiming to refine Algorithmic Trading strategies. Through meticulous examination, we uncover the nuanced interactions of AI-driven agents as they collaborate and compete within the financial realm, employing sophisticated deep learning techniques to enhance the clarity and adaptability of trading decisions. These AI-infused Fintech platforms harness collective intelligence to unearth trends, mitigate risks, and provide tailored financial guidance, fostering benefits for individuals and enterprises navigating the digital landscape. Our research holds the potential to revolutionize finance, opening doors to fresh avenues for investment and asset management in the digital age. Additionally, our statistical evaluation yields encouraging results, with metrics such as Accuracy = 0.85, Precision = 0.88, and F1 Score = 0.86, reaffirming the efficacy of our approach within Fintech and emphasizing its reliability and innovative prowess.
Keywords: Neurodynamic Fintech; multi-agent reinforcement learning; algorithmic trading; digital financial frontier
MADDPG-D2: An Intelligent Dynamic Task Allocation Algorithm Based on Multi-Agent Architecture Driven by Prior Knowledge
2
Authors: Tengda Li, Gang Wang, Qiang Fu | Source: Computer Modeling in Engineering & Sciences (SCIE, EI), 2024, No. 9, pp. 2559-2586 (28 pages)
Aiming at the problems of low solution accuracy and high decision pressure when a single agent faces large-scale dynamic task allocation (DTA) and a high-dimensional decision space, this paper draws on deep reinforcement learning (DRL) theory and proposes an improved Multi-Agent Deep Deterministic Policy Gradient algorithm (MADDPG-D2), based on a multi-agent architecture with a dual experience replay pool and dual noise, to improve the efficiency of DTA. The algorithm builds on the traditional Multi-Agent Deep Deterministic Policy Gradient (MADDPG) algorithm. It introduces a double noise mechanism to enlarge the action exploration space in the early stage of training, and a double experience pool to improve the data utilization rate. At the same time, to accelerate the training of the agents and to solve the cold-start problem, prior knowledge is applied to the training of the algorithm. Finally, the MADDPG-D2 algorithm is compared and analyzed on a digital battlefield of ground and air confrontation. The experimental results show that agents trained by the MADDPG-D2 algorithm have higher win rates and average rewards, utilize resources more reasonably, and better handle the high-dimensional decision space that is difficult for traditional single-agent algorithms. The MADDPG-D2 algorithm based on multi-agent architecture proposed in this paper has certain superiority and rationality in DTA.
Keywords: deep reinforcement learning; dynamic task allocation; intelligent decision-making; multi-agent system; MADDPG-D2 algorithm
Multi-User MmWave Beam Tracking via Multi-Agent Deep Q-Learning (Cited by: 1)
3
Authors: MENG Fan, HUANG Yongming, LU Zhaohua, XIAO Huahua | Source: ZTE Communications, 2023, No. 2, pp. 53-60 (8 pages)
Beamforming is significant for millimeter wave multi-user massive multi-input multi-output systems. Meanwhile, the overhead cost of channel state information and beam training is considerable, especially in dynamic environments. To reduce the overhead cost, we propose a multi-user beam tracking algorithm using a distributed deep Q-learning method. With online learning of users' moving trajectories, the proposed algorithm learns to scan a beam subspace to maximize the average effective sum rate. Considering practical implementation, we model the continuous beam tracking problem as a non-Markov decision process and thus develop a simplified training scheme of deep Q-learning to reduce the training complexity. Furthermore, we propose a scalable state-action-reward design for scenarios with different users and antenna numbers. Simulation results verify the effectiveness of the designed method.
Keywords: multi-agent deep Q-learning; centralized training and distributed execution; mmWave communication; beam tracking; scalability
Computation Tree Logic Model Checking of Multi-Agent Systems Based on Fuzzy Epistemic Interpreted Systems
4
Authors: Xia Li, Zhanyou Ma, Zhibao Mian, Ziyuan Liu, Ruiqi Huang, Nana He | Source: Computers, Materials & Continua (SCIE, EI), 2024, No. 3, pp. 4129-4152 (24 pages)
Model checking is an automated formal verification method to verify whether epistemic multi-agent systems adhere to property specifications. Although there is an extensive literature on qualitative properties such as safety and liveness, there is still a lack of quantitative and uncertain property verifications for these systems. In uncertain environments, agents must make judicious decisions based on subjective epistemic. To verify epistemic and measurable properties in multi-agent systems, this paper extends fuzzy computation tree logic by introducing epistemic modalities and proposing a new Fuzzy Computation Tree Logic of Knowledge (FCTLK). We represent fuzzy multi-agent systems as distributed knowledge bases with fuzzy epistemic interpreted systems. In addition, we provide a transformation algorithm from fuzzy epistemic interpreted systems to fuzzy Kripke structures, as well as transformation rules from FCTLK formulas to Fuzzy Computation Tree Logic (FCTL) formulas. Accordingly, we transform the FCTLK model checking problem into the FCTL model checking problem. This enables the verification of FCTLK formulas by using the fuzzy model checking algorithm of FCTL without additional computational overheads. Finally, we present correctness proofs and complexity analyses of the proposed algorithms. Additionally, we further illustrate the practical application of our approach through an example of a train control system.
Keywords: model checking; multi-agent systems; fuzzy epistemic interpreted systems; fuzzy computation tree logic; transformation algorithm
A new accelerating algorithm for multi-agent reinforcement learning (Cited by: 1)
5
Authors: 张汝波, 仲宇, 顾国昌 | Source: Journal of Harbin Institute of Technology (New Series) (EI, CAS), 2005, No. 1, pp. 48-51 (4 pages)
In multi-agent systems, joint-action must be employed to achieve cooperation because the evaluation of the behavior of an agent often depends on the other agents' behaviors. However, joint-action reinforcement learning algorithms suffer from a slow convergence rate because of the enormous learning space produced by joint-action. In this article, a prediction-based reinforcement learning algorithm is presented for multi-agent cooperation tasks, which requires all agents to learn to predict the probabilities of actions that other agents may execute. A multi-robot cooperation experiment is run to test the efficacy of the new algorithm, and the experiment results show that the new algorithm can achieve the cooperation policy much faster than the primitive reinforcement learning algorithm.
Keywords: distributed reinforcement learning; accelerating algorithm; machine learning; multi-agent system
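As a rough illustration of the prediction-based idea in the abstract of entry 5, the sketch below estimates another agent's action probabilities from observed counts and uses them to collapse a joint-action value table into per-action expected values. This is not the paper's algorithm: the function names, the count-based estimator, and the toy table are all assumptions made for illustration.

```python
# Hypothetical sketch: predict another agent's action distribution and use it
# to score one's own actions, instead of searching the full joint-action space.

def predicted_action_probs(counts):
    """Empirical estimate of the other agent's action probabilities (assumed estimator)."""
    total = sum(counts)
    return [c / total for c in counts]

def expected_q(joint_q, other_probs):
    """Average Q[own][other] over the predicted distribution of the other agent."""
    return [sum(q * p for q, p in zip(row, other_probs)) for row in joint_q]

# Toy joint-action table: rows are own actions, columns are the other agent's actions.
joint_q = [[1.0, 0.0],
           [0.0, 2.0]]

probs = predicted_action_probs([3, 1])   # other agent chose action 0 three times, action 1 once
values = expected_q(joint_q, probs)      # expected value of each own action
best_action = max(range(len(values)), key=values.__getitem__)
```

With these toy numbers the agent prefers action 0, because the other agent is predicted to play its action 0 most of the time.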
Step-coordination Algorithm of Traffic Control Based on Multi-agent System (Cited by: 1)
6
Authors: Hai-Tao Zhang, Fang Yu, Wen Li | Source: International Journal of Automation and Computing (EI), 2009, No. 3, pp. 308-313 (6 pages)
Aiming at the deficiency of conventional traffic control methods, this paper proposes a new method based on multi-agent technology for traffic control. Different from many existing methods, this paper distinguishes traffic control on the basis of the agent technology from conventional traffic control methods. The composition and structure of a multi-agent system (MAS) is first discussed. Then, the step-coordination strategies of intersection-agent, segment-agent, and area-agent are put forward. The advantages of the algorithm are demonstrated by a simulation study.
Keywords: traffic control; coordination algorithm; multi-agent system (MAS); traffic control system; agent
A New Algorithm for Resource Constraint Project Scheduling Problem Based on Multi-Agent Systems (Cited by: 1)
7
Authors: 何曙光, 齐二石, 李钢 | Source: Transactions of Tianjin University (EI, CAS), 2003, No. 4, pp. 348-352 (5 pages)
The resource constrained project scheduling problem (RCPSP) and a decision-making model based on multi-agent systems (MAS) and general equilibrium marketing are proposed. An algorithm leading to the resource allocation decision involved in RCPSP has also been developed. This algorithm can be used in the multi-project scheduling field as well. Finally, an illustration is given.
Keywords: resource constrained project scheduling problem; multi-agent systems; general equilibrium market; algorithm
A Vision-based Robotic Navigation Method Using an Evolutionary and Fuzzy Q-Learning Approach
8
Authors: Roberto Cuesta-Solano, Ernesto Moya-Albor, Jorge Brieva, Hiram Ponce | Source: Journal of Artificial Intelligence and Technology, 2024, No. 4, pp. 363-369 (7 pages)
The paper presents a fuzzy Q-learning (FQL) and optical flow-based autonomous navigation approach. The FQL method takes decisions in an unknown environment and without mapping, using motion information and through a reinforcement signal into an evolutionary algorithm. The reinforcement signal is calculated by estimating the optical flow densities in areas of the camera to determine whether they are "dense" or "thin", which has a relationship with the proximity of objects. The results obtained show that the present approach improves the rate of learning compared with a method with a simple reward system and without the evolutionary component. The proposed system was implemented in a virtual robotics system using the CoppeliaSim software and in communication with Python.
Keywords: CoppeliaSim; evolutionary algorithm; fuzzy Q-learning; optical flow; reinforced learning; vision-based control navigation
Currency-based Iterative Multi-Agent Bidding Mechanism Based on Genetic Algorithm
9
Authors: M K LIM, Z ZHANG | Source: Journal of Xiamen University (Natural Science) (CAS, CSCD, PKU Core), 2002, No. S1, p. 113- (1 page)
This paper introduces a multi-agent system which integrates process planning and production scheduling, in order to increase the flexibility of manufacturing systems in coping with rapid changes in dynamic market and dealing with internal uncertainties such as machine breakdown or resources shortage. This system consists of various autonomous agents, each of which has the capability of communicating with one another and making decisions based on its knowledge and, if necessary, on information provided by other agents. Machine agents, which represent the machines, play an important role in the system in that they negotiate with each other to bid for jobs. An iterative bidding mechanism is proposed to facilitate the process of job assignment to machines and handle the negotiation between agents. This mechanism enables near optimal process plans and production schedules to be produced concurrently, so that dynamic changes in the market can be coped with at a minimum cost, and the utilisation of manufacturing resources can be optimised. In addition, a currency scheme with currency-like metrics is proposed to encourage or prohibit machine agents to put forward their bids for the jobs announced. The values of the metrics are adjusted iteratively so as to obtain an integrated plan and schedule which result in the minimum total production cost while satisfying products' due dates. To deal with the optimisation problem, i.e. to what degree and how the currencies should be adjusted in each iteration, a genetic algorithm (GA) is developed. Comparisons are made between the GA approach and the simulated annealing (SA) optimisation technique.
Keywords: agile manufacturing; multi-agent systems; genetic algorithm; simulated annealing
A MULTI-AGENT LOCAL-LEARNING ALGORITHM UNDER GROUP ENVIROMENT
10
Authors: Jiang Daoping, Yin Yixin, Ban Xiaojuan, Meng Xiangsong | Source: Journal of Electronics (China), 2009, No. 2, pp. 229-236 (8 pages)
In this paper, a local-learning algorithm for multi-agent is presented based on the fact that an individual agent performs local perception and local interaction under group environment. As for individual-learning, the agent adopts a greedy strategy to maximize its reward when interacting with the environment. In group-learning, local interaction takes place between each two agents. A local-learning algorithm to choose and modify agents' actions is proposed to improve the traditional Q-learning algorithm, respectively in the situations of zero-sum games and general-sum games with unique equilibrium or multi-equilibrium. This local-learning algorithm is proved to be convergent, and its computation complexity is lower than that of Nash-Q. Additionally, a grid-game test indicates that, by using this local-learning algorithm, the local behaviors of agents can spread globally.
Keywords: multi-agent learning; game theory; Nash-Q; local-learning algorithm
Necessary and Sufficient Conditions for Consensus in Third Order Multi-Agent Systems (Cited by: 9)
11
Authors: Chi Huang, Guisheng Zhai, Gesheng Xu | Source: IEEE/CAA Journal of Automatica Sinica (SCIE, EI, CSCD), 2018, No. 6, pp. 1044-1053 (10 pages)
We deal with a consensus control problem for a group of third order agents which are networked by digraphs. Assuming that the control input of each agent is constructed based on weighted difference between its states and those of its neighbor agents, we aim to propose an algorithm on computing the weighting coefficients in the control input. The problem is reduced to designing Hurwitz polynomials with real or complex coefficients. We show that by using Hurwitz polynomials with complex coefficients, a necessary and sufficient condition can be obtained for designing the consensus algorithm. Since the condition is both necessary and sufficient, we provide a kind of parametrization for all the weighting coefficients achieving consensus. Moreover, the condition is a natural extension to second order consensus, and is reasonable and practical due to its comparatively decreased computation burden. The result is also extended to the case where communication delay exists in the control input.
Keywords: communication delay; consensus algorithms; graph Laplacians; Hurwitz polynomials; third order multi-agent systems
LMI Consensus Condition for Discrete-time Multi-agent Systems (Cited by: 6)
12
Authors: Magdi S. Mahmoud, Gulam Dastagir Khan | Source: IEEE/CAA Journal of Automatica Sinica (SCIE, EI, CSCD), 2018, No. 2, pp. 509-513 (5 pages)
This paper examines a consensus problem in multi-agent discrete-time systems, where each agent can exchange information only with its neighbor agents. A decentralized protocol is designed for each agent to steer all agents to the same vector. The design condition is expressed in the form of a linear matrix inequality. Finally, a simulation example is presented and a comparison is made to demonstrate the effectiveness of the developed methodology.
Keywords: consensus algorithms; discrete-time systems; linear matrix inequalities; multi-agent systems
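To make the notion of neighbor-only consensus in entry 12 concrete, here is a minimal sketch of the textbook discrete-time averaging protocol, in which each agent moves toward its neighbors' states. It is not the LMI-designed protocol of the paper: the scalar states, the three-agent path graph, and the step size `eps` are assumptions chosen for illustration (the paper steers agents to a common vector).

```python
# Hypothetical sketch of a standard discrete-time consensus update:
#   x_i(k+1) = x_i(k) + eps * sum_{j in N_i} (x_j(k) - x_i(k))
# Each agent uses only the states of its own neighbors.

def consensus_step(x, neighbors, eps):
    """One synchronous update of all agents' scalar states."""
    return [xi + eps * sum(x[j] - xi for j in neighbors[i])
            for i, xi in enumerate(x)]

# Undirected path graph 0 - 1 - 2 (assumed topology).
neighbors = {0: [1], 1: [0, 2], 2: [1]}

x = [1.0, 2.0, 6.0]        # initial states; their average is 3.0
for _ in range(200):
    x = consensus_step(x, neighbors, eps=0.3)  # eps below 1 / (max degree) = 0.5
```

On an undirected connected graph with a sufficiently small step size, this update preserves the average of the states, so all agents approach 3.0 here.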
A single-task and multi-decision evolutionary game model based on multi-agent reinforcement learning (Cited by: 3)
13
Authors: MA Ye, CHANG Tianqing, FAN Wenhui | Source: Journal of Systems Engineering and Electronics (SCIE, EI, CSCD), 2021, No. 3, pp. 642-657 (16 pages)
In the evolutionary game of the same task for groups, the changes in game rules, personal interests, the crowd size, and external supervision cause uncertain effects on individual decision-making and game results. In the Markov decision framework, a single-task multi-decision evolutionary game model based on multi-agent reinforcement learning is proposed to explore the evolutionary rules in the process of a game. The model can improve the result of an evolutionary game and facilitate the completion of the task. First, based on the multi-agent theory, to solve the existing problems in the original model, a negative feedback tax penalty mechanism is proposed to guide the strategy selection of individuals in the group. In addition, in order to evaluate the evolutionary game results of the group in the model, a calculation method of the group intelligence level is defined. Secondly, the Q-learning algorithm is used to improve the guiding effect of the negative feedback tax penalty mechanism. In the model, the selection strategy of the Q-learning algorithm is improved and a bounded rationality evolutionary game strategy is proposed based on the rule of evolutionary games and the consideration of the bounded rationality of individuals. Finally, simulation results show that the proposed model can effectively guide individuals to choose cooperation strategies which are beneficial to task completion and stability under different negative feedback factor values and different group sizes, so as to improve the group intelligence level.
Keywords: multi-agent reinforcement learning; evolutionary game; Q-learning
Multi-agent System Optimized Reconfiguration of Shipboard Power System (Cited by: 3)
14
Authors: 兰海, 肖云云, 张利军 | Source: Journal of Marine Science and Application, 2010, No. 3, pp. 334-339 (6 pages)
Reconfigurability of the electrical network in a shipboard power system (SPS) after its failure is central to the restoration of power supply and improves survivability of an SPS. The navigational process creates a sequence of different operating conditions. The priority of some loads differs in changing operating conditions. After analyzing characteristics of a typical SPS, a model was developed using a grade III switchboard and an environmental prioritizing agent (EPA) algorithm. This algorithm was chosen as it is logically and physically decentralized as well as multi-agent oriented. The EPA algorithm was used to decide on the dynamic load priority, then it selected the means to best meet the maximum power supply load. The simulation results showed that higher priority loads were the first to be restored. The system satisfied all necessary constraints, demonstrating the effectiveness and validity of the proposed method.
Keywords: shipboard power system; multi-agent system; network reconfiguration; environment priority agent algorithm
MARVEL: Multi-Agent Reinforcement Learning for VANET Delay Minimization (Cited by: 2)
15
Authors: Chengyue Lu, Zihan Wang, Wenbo Ding, Gang Li, Sicong Liu, Ling Cheng | Source: China Communications (SCIE, CSCD), 2021, No. 6, pp. 1-11 (11 pages)
In urban Vehicular Ad hoc Networks (VANETs), high mobility of the vehicular environment and frequently changing network topology call for a low delay end-to-end routing algorithm. In this paper, we propose a Multi-Agent Reinforcement Learning (MARL) based decentralized routing scheme, where the inherent similarity between the routing problem in VANET and the MARL problem is exploited. The proposed routing scheme models the interaction between vehicles and the environment as a multi-agent problem in which each vehicle autonomously establishes the communication channel with a neighbor device regardless of the global information. Simulation performed in the 3GPP Manhattan mobility model demonstrates that our proposed decentralized routing algorithm achieves less than 45.8 ms average latency and high stability of 0.05% average failure rate with varying vehicle capacities.
Keywords: VANET; multi-agent RL; delay minimization; routing algorithm
Multidisciplinary design optimization for air-condition production system based on multi-agent technique (Cited by: 2)
16
Authors: 杨海东, 鄂加强, 屈挺 | Source: Journal of Central South University (SCIE, EI, CAS), 2012, No. 2, pp. 527-536 (10 pages)
In order to guarantee the overall production performance of the multiple departments in an air-condition production industry, a multidisciplinary design optimization model for the production system is established based on the multi-agent technology. Local operation models for departments of plan, marketing, sales, purchasing, as well as production and warehouse are formulated into individual agents, and their respective local objectives are collectively formulated into a multi-objective optimization problem. Considering the coupling effects among the correlated agents, the optimization process is carried out based on a self-adaptive chaos immune optimization algorithm with mutative scale. The numerical results indicate that the proposed multi-agent optimization model truly reflects the actual situations of the air-condition production system. The proposed multi-agent based multidisciplinary design optimization method can help companies enhance their income ratio and profit by about 33% and 36%, respectively, and reduce the total cost by about 1.8%.
Keywords: multi-agent system; production operation; multidisciplinary optimization; self-adaptive chaos optimization; immune optimization algorithm
QMCR: A Q-Learning-Based Multi-Hop Cooperative Routing Protocol for Underwater Acoustic Sensor Networks (Cited by: 2)
17
Authors: Yougan Chen, Kaitong Zheng, Xing Fang, Lei Wan, Xiaomei Xu | Source: China Communications (SCIE, CSCD), 2021, No. 8, pp. 224-236 (13 pages)
Routing plays a critical role in data transmission for underwater acoustic sensor networks (UWSNs) in the internet of underwater things (IoUT). Traditional routing methods suffer from high end-to-end delay, limited bandwidth, and high energy consumption. With the development of artificial intelligence and machine learning algorithms, many researchers apply these new methods to improve the quality of routing. In this paper, we propose a Q-learning-based multi-hop cooperative routing protocol (QMCR) for UWSNs. Our protocol can automatically choose nodes with the maximum Q-value as forwarders based on distance information. Moreover, we combine cooperative communications with the Q-learning algorithm to reduce network energy consumption and improve communication efficiency. Experimental results show that the running time of the QMCR is less than one-tenth of that of the artificial fish-swarm algorithm (AFSA), while the routing energy consumption is kept at the same level. Due to the extremely fast speed of the algorithm, the QMCR is a promising method of routing design for UWSNs, especially when the network suffers from the extremely dynamic underwater acoustic channels of the real ocean environment.
Keywords: Q-learning algorithm; routing; internet of underwater things; underwater acoustic communication; multi-hop cooperative communication
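The forwarder selection described in entry 17 (choosing the neighbor with the maximum Q-value) can be sketched with the standard tabular Q-learning update. This is an illustrative sketch only, not the QMCR protocol itself: the node names, the reward value, and the learning parameters `alpha` and `gamma` are hypothetical, and the paper's distance-based reward design is not reproduced.

```python
# Hypothetical sketch: greedy next-hop choice plus the standard Q-learning update
#   Q <- Q + alpha * (reward + gamma * max_next_Q - Q)

def choose_forwarder(q_values):
    """Return the neighbor whose current Q-value is largest."""
    return max(q_values, key=q_values.get)

def q_update(q_old, reward, best_next_q, alpha=0.5, gamma=0.9):
    """Standard tabular Q-learning update for one (state, action) entry."""
    return q_old + alpha * (reward + gamma * best_next_q - q_old)

# Q-values a sender holds for its candidate next-hop neighbors (made-up numbers).
q_values = {"n1": 0.2, "n2": 0.7, "n3": 0.5}

forwarder = choose_forwarder(q_values)
# Assumed feedback after forwarding: reward 1.0, and the forwarder's best onward Q-value is 0.4.
q_values[forwarder] = q_update(q_values[forwarder], reward=1.0, best_next_q=0.4)
```

With these numbers the sender picks `n2`, and its entry rises from 0.7 to 0.7 + 0.5 × (1.0 + 0.9 × 0.4 − 0.7) = 1.03.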
Conflict-Free Routing Scheduling of OHTs Based on Multi-agent Intelligent Control System Framework (Cited by: 1)
18
Authors: 周炳海, 王翥, 郑雯 | Source: Journal of Donghua University (English Edition) (EI, CAS), 2012, No. 6, pp. 484-488 (5 pages)
Overhead-hoist-transporters (OHTs) have become the most appropriate tools to transport wafer lots between inter-bay and intra-bay in united layouts of automated material handling systems (AMHSs) in 300 mm semiconductor wafer fabrications. To obtain a conflict-free scheduling solution, an intelligent multi-agent-based control system framework was built to support the AMHSs, and corresponding algorithms and rules were proposed to implement cooperation among agents. On this basis, a time-constraint-based heuristic scheduling algorithm was presented to support the routing decision agent in searching the conflict-free shortest path. In the construction of the algorithm, the conflicted intervals of the k-shortest-route were identified with the time window theory. The most available path was chosen with an objective of the minimum completion time. The back tracking method was combined to finish the routing scheduling. Finally, experiments of the proposed method were simulated. The results show that the multi-agent framework is suitable and the proposed scheduling algorithm is feasible and valid.
Keywords: overhead-hoist-transporter (OHT); automated material handling system (AMHS); scheduling; conflict-free; multi-agent system; algorithm
Resource Allocation and Power Control Policy for Device-to-Device Communication Using Multi-Agent Reinforcement Learning
19
Authors: Yifei Wei, Yinxiang Qu, Min Zhao, Lianping Zhang, F. Richard Yu | Source: Computers, Materials & Continua (SCIE, EI), 2020, No. 6, pp. 1515-1532 (18 pages)
Device-to-Device (D2D) communication is a promising technology that can reduce the burden on cellular networks while increasing network capacity. In this paper, we focus on the channel resource allocation and power control to improve the system resource utilization and network throughput. Firstly, we treat each D2D pair as an independent agent. Each agent makes decisions based on the local channel state information observed by itself. The multi-agent Reinforcement Learning (RL) algorithm is proposed for our multi-user system. We assume that the D2D pairs do not possess any information on the availability and quality of the resource block to be selected, so the problem is modeled as a stochastic non-cooperative game. Hence, each agent becomes a player and they make decisions together to achieve global optimization. Thereby, the multi-agent Q-learning algorithm based on game theory is established. Secondly, in order to accelerate the convergence rate of multi-agent Q-learning, we consider a power allocation strategy based on the Fuzzy C-means (FCM) algorithm. The strategy first groups the D2D users by FCM, treats each group as an agent, and then performs the multi-agent Q-learning algorithm to determine the power for each group of D2D users. The simulation results show that the Q-learning algorithm based on multi-agent can improve the throughput of the system. In particular, FCM can greatly speed up the convergence of the multi-agent Q-learning algorithm while improving system throughput.
Keywords: D2D communication; resource allocation; power control; multi-agent Q-learning; fuzzy C-means
Adaptive Co-evolution Model for Multi-agent System
20
Authors: 张向锋, 丁永生, 梁朝霞 | Source: Journal of Donghua University (English Edition) (EI, CAS), 2010, No. 2, pp. 123-126 (4 pages)
It is important to harmonize effectively the behaviors of the agents in the multi-agent system (MAS) to complete the solution process. The co-evolution computing techniques, inspired by natural selection and genetics, are usually used to solve these problems. Based on learning and evolution mechanisms of the biological systems, an adaptive co-evolution model was proposed in this paper. Inner-population, inter-population, and community learning operators were presented. The adaptive co-evolution algorithm (ACEA) was designed in detail. Some simulation experiments were done to evaluate the performance of the ACEA. The results show that the ACEA is more effective and feasible than the genetic algorithm to solve the optimization problems.
Keywords: adaptive co-evolution algorithm; multi-agent system; learning; evolution; genetic algorithm