This paper researches some problems in complex formation for multi-agents,in which two matrices are proposed to record the formation.The pattern matrix is used to describe the pattern of the formation;meanwhile,the lo...This paper researches some problems in complex formation for multi-agents,in which two matrices are proposed to record the formation.The pattern matrix is used to describe the pattern of the formation;meanwhile,the location matrix is used to record the location of each agent.Thus,all desired positions of each agent will be obtained by geometrical relationship on the basis of two matrices above.In addition a self-adaptation flocking algorithm is proposed to control all agents to form a desired formation and avoid obstacles.The main idea is as follows:agents will form a desired formation through the method of formation control when far away from obstacles;otherwise,agents will freely fly to pass through the area of obstacles.In the simulation,three scenarios are designed to verify the effectiveness of our method.The results show that our method also can be applied in three dimensions.All agents will form a stable formation and keep the same velocity at last.展开更多
This paper deals with the distributed consensus problem of high-order multi-agent systems with nonlinear dynamics subject to external disturbances. The network topology is assumed to be a fixed undirected graph. Some ...This paper deals with the distributed consensus problem of high-order multi-agent systems with nonlinear dynamics subject to external disturbances. The network topology is assumed to be a fixed undirected graph. Some sufficient conditions are derived, under which the consensus can be achieved with a prescribed norm bound. It is shown that the parameter matrix in the consensus algorithm can be designed by solving two linear matrix inequalities (LMIs). In particular, if the nonzero eigenvalues of the laplacian matrix ac-cording to the network topology are identical, the parameter matrix in the consensus algorithm can be de-signed by solving one LMI. A numerical example is given to illustrate the proposed results.展开更多
Avatars, as promising digital representations and service assistants of users in Metaverses, can enable drivers and passengers to immerse themselves in 3D virtual services and spaces of UAV-assisted vehicular Metavers...Avatars, as promising digital representations and service assistants of users in Metaverses, can enable drivers and passengers to immerse themselves in 3D virtual services and spaces of UAV-assisted vehicular Metaverses. However, avatar tasks include a multitude of human-to-avatar and avatar-to-avatar interactive applications, e.g., augmented reality navigation,which consumes intensive computing resources. It is inefficient and impractical for vehicles to process avatar tasks locally. Fortunately, migrating avatar tasks to the nearest roadside units(RSU)or unmanned aerial vehicles(UAV) for execution is a promising solution to decrease computation overhead and reduce task processing latency, while the high mobility of vehicles brings challenges for vehicles to independently perform avatar migration decisions depending on current and future vehicle status. To address these challenges, in this paper, we propose a novel avatar task migration system based on multi-agent deep reinforcement learning(MADRL) to execute immersive vehicular avatar tasks dynamically. Specifically, we first formulate the problem of avatar task migration from vehicles to RSUs/UAVs as a partially observable Markov decision process that can be solved by MADRL algorithms. We then design the multi-agent proximal policy optimization(MAPPO) approach as the MADRL algorithm for the avatar task migration problem. To overcome slow convergence resulting from the curse of dimensionality and non-stationary issues caused by shared parameters in MAPPO, we further propose a transformer-based MAPPO approach via sequential decision-making models for the efficient representation of relationships among agents. Finally, to motivate terrestrial or non-terrestrial edge servers(e.g., RSUs or UAVs) to share computation resources and ensure traceability of the sharing records, we apply smart contracts and blockchain technologies to achieve secure sharing management. Numerical results demonstrate that the proposed approach outperforms the MAPPO approach by around 2% and effectively reduces approximately 20% of the latency of avatar task execution in UAV-assisted vehicular Metaverses.展开更多
In this paper,the bipartite consensus problem is studied for a class of uncertain high-order nonlinear multi-agent systems.A signed digraph is presented to describe the collaborative and competitive interactions among...In this paper,the bipartite consensus problem is studied for a class of uncertain high-order nonlinear multi-agent systems.A signed digraph is presented to describe the collaborative and competitive interactions among agents.For each agent with lower triangular structure,a time-varying gain compensator is first designed by relative output information of neighboring agents.Subsequently,a distributed controller with dynamic event-triggered mechanism is proposed to drive the bipartite consensus error to zero.It is worth noting that an internal dynamic variable is introduced in triggering function,which plays an essential role in excluding the Zeno behavior and reducing energy consumption.Furthermore,the dynamic event-triggered control protocol is developed for upper triangular multi-agent systems to realize the bipartite consensus without Zeno behavior.Finally,simulation examples are provided to illustrate the effectiveness of the presented results.展开更多
Efficient exploration in complex coordination tasks has been considered a challenging problem in multi-agent reinforcement learning(MARL). It is significantly more difficult for those tasks with latent variables that ...Efficient exploration in complex coordination tasks has been considered a challenging problem in multi-agent reinforcement learning(MARL). It is significantly more difficult for those tasks with latent variables that agents cannot directly observe. However, most of the existing latent variable discovery methods lack a clear representation of latent variables and an effective evaluation of the influence of latent variables on the agent. In this paper, we propose a new MARL algorithm based on the soft actor-critic method for complex continuous control tasks with confounders. It is called the multi-agent soft actor-critic with latent variable(MASAC-LV) algorithm, which uses variational inference theory to infer the compact latent variables representation space from a large amount of offline experience.Besides, we derive the counterfactual policy whose input has no latent variables and quantify the difference between the actual policy and the counterfactual policy via a distance function. This quantified difference is considered an intrinsic motivation that gives additional rewards based on how much the latent variable affects each agent. The proposed algorithm is evaluated on two collaboration tasks with confounders, and the experimental results demonstrate the effectiveness of MASAC-LV compared to other baseline algorithms.展开更多
This paper is concerned with consensus of a secondorder linear time-invariant multi-agent system in the situation that there exists a communication delay among the agents in the network.A proportional-integral consens...This paper is concerned with consensus of a secondorder linear time-invariant multi-agent system in the situation that there exists a communication delay among the agents in the network.A proportional-integral consensus protocol is designed by using delayed and memorized state information.Under the proportional-integral consensus protocol,the consensus problem of the multi-agent system is transformed into the problem of asymptotic stability of the corresponding linear time-invariant time-delay system.Note that the location of the eigenvalues of the corresponding characteristic function of the linear time-invariant time-delay system not only determines the stability of the system,but also plays a critical role in the dynamic performance of the system.In this paper,based on recent results on the distribution of roots of quasi-polynomials,several necessary conditions for Hurwitz stability for a class of quasi-polynomials are first derived.Then allowable regions of consensus protocol parameters are estimated.Some necessary and sufficient conditions for determining effective protocol parameters are provided.The designed protocol can achieve consensus and improve the dynamic performance of the second-order multi-agent system.Moreover,the effects of delays on consensus of systems of harmonic oscillators/double integrators under proportional-integral consensus protocols are investigated.Furthermore,some results on proportional-integral consensus are derived for a class of high-order linear time-invariant multi-agent systems.展开更多
This paper examines the bipartite consensus problems for the nonlinear multi-agent systems in Lurie dynamics form with cooperative and competitive communication between different agents. Based on the contraction theor...This paper examines the bipartite consensus problems for the nonlinear multi-agent systems in Lurie dynamics form with cooperative and competitive communication between different agents. Based on the contraction theory, some new conditions for the nonlinear Lurie multi-agent systems reaching bipartite leaderless consensus and bipartite tracking consensus are presented. Compared with the traditional methods, this approach degrades the dimensions of the conditions, eliminates some restrictions of the system matrix, and extends the range of the nonlinear function. Finally, two numerical examples are provided to illustrate the efficiency of our results.展开更多
This paper studies the problem of time-varying formation control with finite-time prescribed performance for nonstrict feedback second-order multi-agent systems with unmeasured states and unknown nonlinearities.To eli...This paper studies the problem of time-varying formation control with finite-time prescribed performance for nonstrict feedback second-order multi-agent systems with unmeasured states and unknown nonlinearities.To eliminate nonlinearities,neural networks are applied to approximate the inherent dynamics of the system.In addition,due to the limitations of the actual working conditions,each follower agent can only obtain the locally measurable partial state information of the leader agent.To address this problem,a neural network state observer based on the leader state information is designed.Then,a finite-time prescribed performance adaptive output feedback control strategy is proposed by restricting the sliding mode surface to a prescribed region,which ensures that the closed-loop system has practical finite-time stability and that formation errors of the multi-agent systems converge to the prescribed performance bound in finite time.Finally,a numerical simulation is provided to demonstrate the practicality and effectiveness of the developed algorithm.展开更多
This paper investigates the problem of global/semi-global finite-time consensus for integrator-type multi-agent sys-tems.New hyperbolic tangent function-based protocols are pro-posed to achieve global and semi-global ...This paper investigates the problem of global/semi-global finite-time consensus for integrator-type multi-agent sys-tems.New hyperbolic tangent function-based protocols are pro-posed to achieve global and semi-global finite-time consensus for both single-integrator and double-integrator multi-agent systems with leaderless undirected and leader-following directed commu-nication topologies.These new protocols not only provide an explicit upper-bound estimate for the settling time,but also have a user-prescribed bounded control level.In addition,compared to some existing results based on the saturation function,the pro-posed approach considerably simplifies the protocol design and the stability analysis.Illustrative examples and an application demonstrate the effectiveness of the proposed protocols.展开更多
Wind-photovoltaic(PV)-hydrogen-storage multi-agent energy systems are expected to play an important role in promoting renewable power utilization and decarbonization.In this study,a coordinated operation method was pr...Wind-photovoltaic(PV)-hydrogen-storage multi-agent energy systems are expected to play an important role in promoting renewable power utilization and decarbonization.In this study,a coordinated operation method was proposed for a wind-PVhydrogen-storage multi-agent energy system.First,a coordinated operation model was formulated for each agent considering peer-to-peer power trading.Second,a coordinated operation interactive framework for a multi-agent energy system was proposed based on the theory of the alternating direction method of multipliers.Third,a distributed interactive algorithm was proposed to protect the privacy of each agent and solve coordinated operation strategies.Finally,the effectiveness of the proposed coordinated operation method was tested on multi-agent energy systems with different structures,and the operational revenues of the wind power,PV,hydrogen,and energy storage agents of the proposed coordinated operation model were improved by approximately 59.19%,233.28%,16.75%,and 145.56%,respectively,compared with the independent operation model.展开更多
In the rapidly evolving landscape of today’s digital economy,Financial Technology(Fintech)emerges as a trans-formative force,propelled by the dynamic synergy between Artificial Intelligence(AI)and Algorithmic Trading...In the rapidly evolving landscape of today’s digital economy,Financial Technology(Fintech)emerges as a trans-formative force,propelled by the dynamic synergy between Artificial Intelligence(AI)and Algorithmic Trading.Our in-depth investigation delves into the intricacies of merging Multi-Agent Reinforcement Learning(MARL)and Explainable AI(XAI)within Fintech,aiming to refine Algorithmic Trading strategies.Through meticulous examination,we uncover the nuanced interactions of AI-driven agents as they collaborate and compete within the financial realm,employing sophisticated deep learning techniques to enhance the clarity and adaptability of trading decisions.These AI-infused Fintech platforms harness collective intelligence to unearth trends,mitigate risks,and provide tailored financial guidance,fostering benefits for individuals and enterprises navigating the digital landscape.Our research holds the potential to revolutionize finance,opening doors to fresh avenues for investment and asset management in the digital age.Additionally,our statistical evaluation yields encouraging results,with metrics such as Accuracy=0.85,Precision=0.88,and F1 Score=0.86,reaffirming the efficacy of our approach within Fintech and emphasizing its reliability and innovative prowess.展开更多
As an important mechanism in multi-agent interaction,communication can make agents form complex team relationships rather than constitute a simple set of multiple independent agents.However,the existing communication ...As an important mechanism in multi-agent interaction,communication can make agents form complex team relationships rather than constitute a simple set of multiple independent agents.However,the existing communication schemes can bring much timing redundancy and irrelevant messages,which seriously affects their practical application.To solve this problem,this paper proposes a targeted multiagent communication algorithm based on state control(SCTC).The SCTC uses a gating mechanism based on state control to reduce the timing redundancy of communication between agents and determines the interaction relationship between agents and the importance weight of a communication message through a series connection of hard-and self-attention mechanisms,realizing targeted communication message processing.In addition,by minimizing the difference between the fusion message generated from a real communication message of each agent and a fusion message generated from the buffered message,the correctness of the final action choice of the agent is ensured.Our evaluation using a challenging set of Star Craft II benchmarks indicates that the SCTC can significantly improve the learning performance and reduce the communication overhead between agents,thus ensuring better cooperation between agents.展开更多
This article studies the effective traffic signal control problem of multiple intersections in a city-level traffic system.A novel regional multi-agent cooperative reinforcement learning algorithm called RegionSTLight...This article studies the effective traffic signal control problem of multiple intersections in a city-level traffic system.A novel regional multi-agent cooperative reinforcement learning algorithm called RegionSTLight is proposed to improve the traffic efficiency.Firstly a regional multi-agent Q-learning framework is proposed,which can equivalently decompose the global Q value of the traffic system into the local values of several regions Based on the framework and the idea of human-machine cooperation,a dynamic zoning method is designed to divide the traffic network into several strong-coupled regions according to realtime traffic flow densities.In order to achieve better cooperation inside each region,a lightweight spatio-temporal fusion feature extraction network is designed.The experiments in synthetic real-world and city-level scenarios show that the proposed RegionS TLight converges more quickly,is more stable,and obtains better asymptotic performance compared to state-of-theart models.展开更多
文摘This paper researches some problems in complex formation for multi-agents,in which two matrices are proposed to record the formation.The pattern matrix is used to describe the pattern of the formation;meanwhile,the location matrix is used to record the location of each agent.Thus,all desired positions of each agent will be obtained by geometrical relationship on the basis of two matrices above.In addition a self-adaptation flocking algorithm is proposed to control all agents to form a desired formation and avoid obstacles.The main idea is as follows:agents will form a desired formation through the method of formation control when far away from obstacles;otherwise,agents will freely fly to pass through the area of obstacles.In the simulation,three scenarios are designed to verify the effectiveness of our method.The results show that our method also can be applied in three dimensions.All agents will form a stable formation and keep the same velocity at last.
文摘This paper deals with the distributed consensus problem of high-order multi-agent systems with nonlinear dynamics subject to external disturbances. The network topology is assumed to be a fixed undirected graph. Some sufficient conditions are derived, under which the consensus can be achieved with a prescribed norm bound. It is shown that the parameter matrix in the consensus algorithm can be designed by solving two linear matrix inequalities (LMIs). In particular, if the nonzero eigenvalues of the laplacian matrix ac-cording to the network topology are identical, the parameter matrix in the consensus algorithm can be de-signed by solving one LMI. A numerical example is given to illustrate the proposed results.
基金supported in part by NSFC (62102099, U22A2054, 62101594)in part by the Pearl River Talent Recruitment Program (2021QN02S643)+9 种基金Guangzhou Basic Research Program (2023A04J1699)in part by the National Research Foundation, SingaporeInfocomm Media Development Authority under its Future Communications Research Development ProgrammeDSO National Laboratories under the AI Singapore Programme under AISG Award No AISG2-RP-2020-019Energy Research Test-Bed and Industry Partnership Funding Initiative, Energy Grid (EG) 2.0 programmeDesCartes and the Campus for Research Excellence and Technological Enterprise (CREATE) programmeMOE Tier 1 under Grant RG87/22in part by the Singapore University of Technology and Design (SUTD) (SRG-ISTD-2021- 165)in part by the SUTD-ZJU IDEA Grant SUTD-ZJU (VP) 202102in part by the Ministry of Education, Singapore, through its SUTD Kickstarter Initiative (SKI 20210204)。
文摘Avatars, as promising digital representations and service assistants of users in Metaverses, can enable drivers and passengers to immerse themselves in 3D virtual services and spaces of UAV-assisted vehicular Metaverses. However, avatar tasks include a multitude of human-to-avatar and avatar-to-avatar interactive applications, e.g., augmented reality navigation,which consumes intensive computing resources. It is inefficient and impractical for vehicles to process avatar tasks locally. Fortunately, migrating avatar tasks to the nearest roadside units(RSU)or unmanned aerial vehicles(UAV) for execution is a promising solution to decrease computation overhead and reduce task processing latency, while the high mobility of vehicles brings challenges for vehicles to independently perform avatar migration decisions depending on current and future vehicle status. To address these challenges, in this paper, we propose a novel avatar task migration system based on multi-agent deep reinforcement learning(MADRL) to execute immersive vehicular avatar tasks dynamically. Specifically, we first formulate the problem of avatar task migration from vehicles to RSUs/UAVs as a partially observable Markov decision process that can be solved by MADRL algorithms. We then design the multi-agent proximal policy optimization(MAPPO) approach as the MADRL algorithm for the avatar task migration problem. To overcome slow convergence resulting from the curse of dimensionality and non-stationary issues caused by shared parameters in MAPPO, we further propose a transformer-based MAPPO approach via sequential decision-making models for the efficient representation of relationships among agents. Finally, to motivate terrestrial or non-terrestrial edge servers(e.g., RSUs or UAVs) to share computation resources and ensure traceability of the sharing records, we apply smart contracts and blockchain technologies to achieve secure sharing management. Numerical results demonstrate that the proposed approach outperforms the MAPPO approach by around 2% and effectively reduces approximately 20% of the latency of avatar task execution in UAV-assisted vehicular Metaverses.
基金This work was supported by the National Natural Science Foundation of China(Nos.61973189,62073190)the Research Fund for the Taishan Scholar Project of Shandong Province of China(No.ts20190905)the Natural Science Foundation of Shandong Province of China(No.ZR2020ZD25).
文摘In this paper,the bipartite consensus problem is studied for a class of uncertain high-order nonlinear multi-agent systems.A signed digraph is presented to describe the collaborative and competitive interactions among agents.For each agent with lower triangular structure,a time-varying gain compensator is first designed by relative output information of neighboring agents.Subsequently,a distributed controller with dynamic event-triggered mechanism is proposed to drive the bipartite consensus error to zero.It is worth noting that an internal dynamic variable is introduced in triggering function,which plays an essential role in excluding the Zeno behavior and reducing energy consumption.Furthermore,the dynamic event-triggered control protocol is developed for upper triangular multi-agent systems to realize the bipartite consensus without Zeno behavior.Finally,simulation examples are provided to illustrate the effectiveness of the presented results.
基金supported in part by the National Natural Science Foundation of China (62136008,62236002,61921004,62173251,62103104)the “Zhishan” Scholars Programs of Southeast Universitythe Fundamental Research Funds for the Central Universities (2242023K30034)。
文摘Efficient exploration in complex coordination tasks has been considered a challenging problem in multi-agent reinforcement learning(MARL). It is significantly more difficult for those tasks with latent variables that agents cannot directly observe. However, most of the existing latent variable discovery methods lack a clear representation of latent variables and an effective evaluation of the influence of latent variables on the agent. In this paper, we propose a new MARL algorithm based on the soft actor-critic method for complex continuous control tasks with confounders. It is called the multi-agent soft actor-critic with latent variable(MASAC-LV) algorithm, which uses variational inference theory to infer the compact latent variables representation space from a large amount of offline experience.Besides, we derive the counterfactual policy whose input has no latent variables and quantify the difference between the actual policy and the counterfactual policy via a distance function. This quantified difference is considered an intrinsic motivation that gives additional rewards based on how much the latent variable affects each agent. The proposed algorithm is evaluated on two collaboration tasks with confounders, and the experimental results demonstrate the effectiveness of MASAC-LV compared to other baseline algorithms.
基金supported in part by the National Natural Science Foundation of China (NSFC)(61703086, 61773106)the IAPI Fundamental Research Funds (2018ZCX27)
文摘This paper is concerned with consensus of a secondorder linear time-invariant multi-agent system in the situation that there exists a communication delay among the agents in the network.A proportional-integral consensus protocol is designed by using delayed and memorized state information.Under the proportional-integral consensus protocol,the consensus problem of the multi-agent system is transformed into the problem of asymptotic stability of the corresponding linear time-invariant time-delay system.Note that the location of the eigenvalues of the corresponding characteristic function of the linear time-invariant time-delay system not only determines the stability of the system,but also plays a critical role in the dynamic performance of the system.In this paper,based on recent results on the distribution of roots of quasi-polynomials,several necessary conditions for Hurwitz stability for a class of quasi-polynomials are first derived.Then allowable regions of consensus protocol parameters are estimated.Some necessary and sufficient conditions for determining effective protocol parameters are provided.The designed protocol can achieve consensus and improve the dynamic performance of the second-order multi-agent system.Moreover,the effects of delays on consensus of systems of harmonic oscillators/double integrators under proportional-integral consensus protocols are investigated.Furthermore,some results on proportional-integral consensus are derived for a class of high-order linear time-invariant multi-agent systems.
基金Project supported by the National Natural Science Foundation of China(Grant No.62363005)the Jiangxi Provincial Natural Science Foundation(Grant Nos.20161BAB212032 and 20232BAB202034)the Science and Technology Research Project of Jiangxi Provincial Department of Education(Grant Nos.GJJ202602 and GJJ202601)。
文摘This paper examines the bipartite consensus problems for the nonlinear multi-agent systems in Lurie dynamics form with cooperative and competitive communication between different agents. Based on the contraction theory, some new conditions for the nonlinear Lurie multi-agent systems reaching bipartite leaderless consensus and bipartite tracking consensus are presented. Compared with the traditional methods, this approach degrades the dimensions of the conditions, eliminates some restrictions of the system matrix, and extends the range of the nonlinear function. Finally, two numerical examples are provided to illustrate the efficiency of our results.
基金the National Natural Science Foundation of China(62203356)Fundamental Research Funds for the Central Universities of China(31020210502002)。
文摘This paper studies the problem of time-varying formation control with finite-time prescribed performance for nonstrict feedback second-order multi-agent systems with unmeasured states and unknown nonlinearities.To eliminate nonlinearities,neural networks are applied to approximate the inherent dynamics of the system.In addition,due to the limitations of the actual working conditions,each follower agent can only obtain the locally measurable partial state information of the leader agent.To address this problem,a neural network state observer based on the leader state information is designed.Then,a finite-time prescribed performance adaptive output feedback control strategy is proposed by restricting the sliding mode surface to a prescribed region,which ensures that the closed-loop system has practical finite-time stability and that formation errors of the multi-agent systems converge to the prescribed performance bound in finite time.Finally,a numerical simulation is provided to demonstrate the practicality and effectiveness of the developed algorithm.
基金supported by the National Natural Science Foundation of China(62073019)。
文摘This paper investigates the problem of global/semi-global finite-time consensus for integrator-type multi-agent sys-tems.New hyperbolic tangent function-based protocols are pro-posed to achieve global and semi-global finite-time consensus for both single-integrator and double-integrator multi-agent systems with leaderless undirected and leader-following directed commu-nication topologies.These new protocols not only provide an explicit upper-bound estimate for the settling time,but also have a user-prescribed bounded control level.In addition,compared to some existing results based on the saturation function,the pro-posed approach considerably simplifies the protocol design and the stability analysis.Illustrative examples and an application demonstrate the effectiveness of the proposed protocols.
基金supported by the Key Research and Development Program of Jiangsu Provincial Department of Science and Technology(BE2020081).
文摘Wind-photovoltaic(PV)-hydrogen-storage multi-agent energy systems are expected to play an important role in promoting renewable power utilization and decarbonization.In this study,a coordinated operation method was proposed for a wind-PVhydrogen-storage multi-agent energy system.First,a coordinated operation model was formulated for each agent considering peer-to-peer power trading.Second,a coordinated operation interactive framework for a multi-agent energy system was proposed based on the theory of the alternating direction method of multipliers.Third,a distributed interactive algorithm was proposed to protect the privacy of each agent and solve coordinated operation strategies.Finally,the effectiveness of the proposed coordinated operation method was tested on multi-agent energy systems with different structures,and the operational revenues of the wind power,PV,hydrogen,and energy storage agents of the proposed coordinated operation model were improved by approximately 59.19%,233.28%,16.75%,and 145.56%,respectively,compared with the independent operation model.
基金This project was funded by Deanship of Scientific Research(DSR)at King Abdulaziz University,Jeddah underGrant No.(IFPIP-1127-611-1443)the authors,therefore,acknowledge with thanks DSR technical and financial support.
文摘In the rapidly evolving landscape of today’s digital economy,Financial Technology(Fintech)emerges as a trans-formative force,propelled by the dynamic synergy between Artificial Intelligence(AI)and Algorithmic Trading.Our in-depth investigation delves into the intricacies of merging Multi-Agent Reinforcement Learning(MARL)and Explainable AI(XAI)within Fintech,aiming to refine Algorithmic Trading strategies.Through meticulous examination,we uncover the nuanced interactions of AI-driven agents as they collaborate and compete within the financial realm,employing sophisticated deep learning techniques to enhance the clarity and adaptability of trading decisions.These AI-infused Fintech platforms harness collective intelligence to unearth trends,mitigate risks,and provide tailored financial guidance,fostering benefits for individuals and enterprises navigating the digital landscape.Our research holds the potential to revolutionize finance,opening doors to fresh avenues for investment and asset management in the digital age.Additionally,our statistical evaluation yields encouraging results,with metrics such as Accuracy=0.85,Precision=0.88,and F1 Score=0.86,reaffirming the efficacy of our approach within Fintech and emphasizing its reliability and innovative prowess.
文摘As an important mechanism in multi-agent interaction,communication can make agents form complex team relationships rather than constitute a simple set of multiple independent agents.However,the existing communication schemes can bring much timing redundancy and irrelevant messages,which seriously affects their practical application.To solve this problem,this paper proposes a targeted multiagent communication algorithm based on state control(SCTC).The SCTC uses a gating mechanism based on state control to reduce the timing redundancy of communication between agents and determines the interaction relationship between agents and the importance weight of a communication message through a series connection of hard-and self-attention mechanisms,realizing targeted communication message processing.In addition,by minimizing the difference between the fusion message generated from a real communication message of each agent and a fusion message generated from the buffered message,the correctness of the final action choice of the agent is ensured.Our evaluation using a challenging set of Star Craft II benchmarks indicates that the SCTC can significantly improve the learning performance and reduce the communication overhead between agents,thus ensuring better cooperation between agents.
基金supported by the National Science and Technology Major Project (2021ZD0112702)the National Natural Science Foundation (NNSF)of China (62373100,62233003)the Natural Science Foundation of Jiangsu Province of China (BK20202006)。
文摘This article studies the effective traffic signal control problem of multiple intersections in a city-level traffic system.A novel regional multi-agent cooperative reinforcement learning algorithm called RegionSTLight is proposed to improve the traffic efficiency.Firstly a regional multi-agent Q-learning framework is proposed,which can equivalently decompose the global Q value of the traffic system into the local values of several regions Based on the framework and the idea of human-machine cooperation,a dynamic zoning method is designed to divide the traffic network into several strong-coupled regions according to realtime traffic flow densities.In order to achieve better cooperation inside each region,a lightweight spatio-temporal fusion feature extraction network is designed.The experiments in synthetic real-world and city-level scenarios show that the proposed RegionS TLight converges more quickly,is more stable,and obtains better asymptotic performance compared to state-of-theart models.