Funding: National Natural Science Foundation of China, Grant/Award Number: 61872171; The Belt and Road Special Foundation of the State Key Laboratory of Hydrology-Water Resources and Hydraulic Engineering, Grant/Award Number: 2021490811.
Abstract: Multi-agent reinforcement learning relies on reward signals to guide the policy networks of individual agents. However, in high-dimensional continuous spaces, the non-stationary environment can provide outdated experiences that hinder convergence, resulting in ineffective training for multi-agent systems. To tackle this issue, a novel reinforcement learning scheme, Mutual Information Oriented Deep Skill Chaining (MioDSC), is proposed; it generates an optimised cooperative policy by incorporating intrinsic rewards based on mutual information to improve exploration efficiency. These rewards encourage agents to diversify their learning by taking actions that increase the mutual information between their actions and the environment state. In addition, MioDSC generates cooperative policies using the options framework, allowing agents to learn and reuse complex action sequences and accelerating the convergence of multi-agent learning. MioDSC was evaluated in the multi-agent particle environment and the StarCraft multi-agent challenge at varying difficulty levels. The experimental results demonstrate that MioDSC outperforms state-of-the-art methods and remains robust and stable across a variety of multi-agent system tasks.
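To make the idea of mutual-information-oriented intrinsic rewards concrete, the following minimal sketch shows one common way to lower-bound the mutual information between an agent's action and the resulting state with a learned classifier, and to add that bound to the extrinsic reward. This is not the authors' implementation; the class name MIRewardShaper, the inverse-model parameterisation and the weight beta are assumptions.

```python
# Minimal sketch, not the authors' code: MIRewardShaper, q_net and beta
# are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MIRewardShaper(nn.Module):
    """Learns log q(a | s'); its gap to the policy's log p(a) is a
    variational lower bound on I(action; next state)."""
    def __init__(self, state_dim, n_actions, hidden=64):
        super().__init__()
        self.q_net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, n_actions),
        )

    def intrinsic_reward(self, next_state, action, log_p_action):
        # log q(a | s') from the learned inverse model
        log_q = F.log_softmax(self.q_net(next_state), dim=-1)
        log_q_a = log_q.gather(-1, action.unsqueeze(-1)).squeeze(-1)
        # E[log q(a | s') - log p(a)] lower-bounds I(A; S') (Barber-Agakov bound)
        return (log_q_a - log_p_action).detach()

def shaped_reward(extrinsic, intrinsic, beta=0.1):
    # total reward used to update each agent's policy network
    return extrinsic + beta * intrinsic
```

In such a scheme the intrinsic term rewards actions whose effect on the state is identifiable, which is one way to encourage the diversified exploration the abstract describes.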
Abstract: Aiming at flexible manufacturing systems with multiple machining and assembly units, a new scheduling algorithm is proposed that decomposes the assembly structure of the products, yielding simple scheduling sub-problems and forming the corresponding agents. The importance and restrictions of each agent are then considered to obtain an ordering of the sub-problems based on cooperative game theory. Following this order, each sub-problem is scheduled according to dispatching rules, and near-optimal schedules that satisfy the restrictions are obtained. Experimental results verify the effectiveness of the proposed scheduling algorithm.
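As a rough illustration of the decompose-order-dispatch flow described above, a sketch in Python might look like the following. The data layout, the importance/restriction score and all names such as SubProblem are assumptions for illustration, not the paper's definitions.

```python
# Illustrative sketch only: decompose the assembly structure into
# sub-problems, order them by an importance/restriction score (a stand-in
# for the cooperative-game ordering), then dispatch each with a simple rule.
from dataclasses import dataclass

@dataclass
class SubProblem:
    name: str
    operations: list          # (job, processing_time) pairs
    importance: float = 1.0   # assumed weight of this agent
    restriction: float = 0.0  # assumed tightness of its constraints

def decompose(assembly_tree):
    """Flatten the product's assembly structure into sub-problems."""
    return [SubProblem(node["name"], node["ops"],
                       node.get("importance", 1.0),
                       node.get("restriction", 0.0))
            for node in assembly_tree]

def order_subproblems(subproblems):
    # More important / more restricted agents are scheduled first.
    return sorted(subproblems,
                  key=lambda sp: sp.importance + sp.restriction,
                  reverse=True)

def dispatch(subproblem):
    # Rule-based scheduling: shortest processing time first.
    return sorted(subproblem.operations, key=lambda op: op[1])

def schedule(assembly_tree):
    return [(sp.name, dispatch(sp))
            for sp in order_subproblems(decompose(assembly_tree))]
```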
Funding: Supported by an Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. RS-2022-00155885, Artificial Intelligence Convergence Innovation Human Resources Development (Hanyang University ERICA)); supported by the National Natural Science Foundation of China under Grant No. 61971264; and by the National Natural Science Foundation of China/Research Grants Council Collaborative Research Scheme under Grant No. 62261160390.
Abstract: Owing to the fading characteristics of wireless channels and the burstiness of data traffic, designing effective congestion-control algorithms for ad-hoc networks remains an open challenge. In this paper, we focus on congestion control that minimizes network transmission delay through flexible power control. To solve the congestion problem effectively, we propose a distributed cross-layer scheduling algorithm empowered by graph-based multi-agent deep reinforcement learning. The algorithm adaptively adjusts the transmit power in real time using only local information (i.e., channel state information and queue length) and local communication (i.e., information exchanged with neighbors). Moreover, the training complexity of the algorithm is low because cooperation is restricted to a local region via a graph attention network. In the evaluation, we show that our algorithm reduces the transmission delay of data flows under severe signal interference and drastically changing channel states, and we demonstrate its adaptability and stability across different topologies. The method is general and can be extended to various types of topologies.
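The per-node decision described in the abstract (observe the local channel state and queue length, attend over messages exchanged with neighbors, and pick a transmit-power level) can be sketched as follows. This is a hedged illustration: LocalPowerPolicy, the two-feature observation and the single attention head are assumptions, not the authors' network.

```python
# Hedged sketch of a local, attention-based power-control policy.
import torch
import torch.nn as nn
import torch.nn.functional as F

class LocalPowerPolicy(nn.Module):
    def __init__(self, obs_dim=2, hidden=32, power_levels=5):
        super().__init__()
        self.embed = nn.Linear(obs_dim, hidden)
        self.attn = nn.Linear(2 * hidden, 1)    # pairwise attention score
        self.head = nn.Linear(2 * hidden, power_levels)

    def forward(self, own_obs, neighbor_obs):
        # own_obs: (obs_dim,), neighbor_obs: (n_neighbors, obs_dim)
        h_i = self.embed(own_obs)                        # own embedding
        h_j = self.embed(neighbor_obs)                   # neighbor embeddings
        pair = torch.cat([h_i.expand_as(h_j), h_j], -1)
        alpha = F.softmax(self.attn(pair).squeeze(-1), dim=0)
        agg = (alpha.unsqueeze(-1) * h_j).sum(0)         # attention-weighted aggregate
        logits = self.head(torch.cat([h_i, agg], -1))
        return torch.distributions.Categorical(logits=logits)

# Example: observation = [channel_state, queue_length] for this node.
policy = LocalPowerPolicy()
dist = policy(torch.tensor([0.7, 3.0]),
              torch.tensor([[0.2, 5.0], [0.9, 1.0]]))
power_index = dist.sample()   # discrete transmit-power level
```

Because the policy only consumes its own observation and its neighbors' messages, it can run in a fully distributed fashion, which is the property the abstract emphasises.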
Funding: Supported by the High Technology Research and Development Program of China.
Abstract: A soccer robot system (HIT 1) was built to participate in MIROSOT_China99, held at Harbin Institute of Technology. The robot soccer game is a complex robot application that incorporates a real-time vision system, robot control, wireless communication, and the coordination of multiple robots. In this paper, we present the design, hardware architecture, and software architecture of our distributed multi-robot system.
Abstract: The increasing trend toward dematerialization and digitalization has prompted a surge in the adoption of IT service providers, which offer cost-effective alternatives to traditional local services. Consequently, cloud services have become prevalent across various industries. While these services offer undeniable benefits, they face significant threats, particularly concerning the sensitivity of the data they handle. Many existing mathematical models struggle to accurately depict the complex scenarios of cloud systems. In response to this challenge, this paper proposes a behavioral model of ransomware propagation within such environments. In this model, each component of the environment is defined as an agent responsible for monitoring the propagation of malware. Given the distinct characteristics and criticality of these agents, the impact of malware can vary significantly. Attack scenarios are constructed from real-world vulnerabilities documented as Common Vulnerabilities and Exposures (CVEs) in the National Vulnerability Database. Defender actions are guided by an Intrusion Detection System (IDS) guideline. This research aims to provide a comprehensive framework for understanding and addressing ransomware threats in cloud systems. By leveraging an agent-based approach and real-world vulnerability data, our model offers valuable insights into detection and mitigation strategies for safeguarding sensitive cloud-based assets.
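A toy rendering of the agent-based propagation step may help fix ideas. Every name, probability and data field below (ComponentAgent, ids_detect_prob, the CVE matching rule) is an assumption chosen for illustration, not the paper's actual model.

```python
# Toy sketch of agent-based ransomware propagation in a cloud topology.
import random

class ComponentAgent:
    def __init__(self, name, criticality, cves, neighbors=()):
        self.name = name
        self.criticality = criticality    # impact weight of this asset
        self.cves = set(cves)             # e.g. {"CVE-2021-34527"}
        self.neighbors = list(neighbors)  # directly reachable components
        self.infected = False
        self.quarantined = False

def propagate(agents, exploited_cves, ids_detect_prob=0.4):
    """One simulation step: infected agents try to compromise neighbors
    unless the IDS-guided defender quarantines the target first."""
    newly_infected = []
    for agent in agents:
        if not agent.infected or agent.quarantined:
            continue
        for nbr in agent.neighbors:
            if nbr.infected or nbr.quarantined:
                continue
            if nbr.cves & exploited_cves:            # attacker holds a matching exploit
                if random.random() < ids_detect_prob:
                    nbr.quarantined = True           # defender action per IDS guideline
                else:
                    newly_infected.append(nbr)
    for nbr in newly_infected:
        nbr.infected = True
    # current impact, weighted by how critical each infected asset is
    return sum(a.criticality for a in agents if a.infected)
```

Iterating this step over time yields a propagation trace whose criticality-weighted impact can be compared across detection and mitigation settings, which is the kind of analysis the abstract targets.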