Journal Articles
1,254 articles found
Investigation of nano-talc as a filling material and a reinforcing agent in high density polyethylene (HDPE) (Cited by: 1)
1
Authors: CHEN Nanchun, MA Lei, ZHANG Tao. Rare Metals, SCIE EI CAS CSCD, 2006, Suppl. 1, pp. 422-425 (4 pages)
An experiment on producing a high density polyethylene (HDPE) nano-composite filled with 4 wt.% talc is presented. Acting as a filler and reinforcing agent in the HDPE, talc powder sized at around 5 μm was surface-treated with an aluminum diethylene glycol dinitrate coupling agent before being added to the HDPE. Analyses of the reinforced HDPE nano-composite show significant improvement in its mechanical properties, including tensile strength (>26 MPa), elongation at break (<1.1%), flexural strength (>22 MPa), and friction coefficient (<0.11). The results demonstrate that, after surface treatment, talc is a promising filling material and reinforcing agent for HDPE nano-composites.
Keywords: HDPE, talc, filling material, reinforcing agent, nano-composite, mechanical properties
A UAV Cooperative Planning Method Based on Multi-Agent Deep Reinforcement Learning
2
Authors: 王娜, 马利民, 姜云春, 宗成国. 《计算机应用与软件》, PKU Core, 2024, No. 9, pp. 83-89, 96 (8 pages)
Human-machine cooperative control is an important mode of multi-UAV mission planning. Considering the need for a shared interpretation of the multi-UAV task environment and for consistent policy control, a UAV cooperative planning method based on multi-agent deep reinforcement learning is proposed. Based on task knowledge and behavior states, a task planner built on task-allocation agents is constructed to generate the interdependencies of human-machine interaction; a deep reinforcement learning method is designed to obtain the optimal policy for group behavior and the cooperative control scheme, and a mixed-initiative behavior selection mechanism is used to evaluate the learned policies. Experimental results show that, in a human-machine interaction case, the proposed method yields good global joint actions for the group through deep reinforcement learning, and outperforms the deterministic policy gradient method in both learning speed and stability. In comparisons across the following, autonomous, and mixed-initiative modes, it also controls UAV flight paths and tasks well, providing an intelligent decision basis for UAV swarm task execution.
Keywords: agent planning, deep reinforcement learning, UAV cooperative planning, mixed-initiative behavior
Knowledge transfer in multi-agent reinforcement learning with incremental number of agents (Cited by: 4)
3
Authors: LIU Wenzhang, DONG Lu, LIU Jian, SUN Changyin. Journal of Systems Engineering and Electronics, SCIE EI CSCD, 2022, No. 2, pp. 447-460 (14 pages)
In this paper, the reinforcement learning method for cooperative multi-agent systems (MAS) with an incremental number of agents is studied. Existing multi-agent reinforcement learning approaches deal with a MAS with a specific number of agents and can learn well-performing policies. However, if the number of agents increases, the previously learned policies may not perform well in the new scenario. The new agents need to learn from scratch to find optimal policies with the others, which may slow down the learning speed of the whole team. To solve this problem, we propose a new algorithm that takes full advantage of the historical knowledge learned before and transfers it from the previous agents to the new agents. Since the previous agents have been trained well in the source environment, they are treated as teacher agents in the target environment; correspondingly, the new agents are called student agents. To enable the student agents to learn from the teacher agents, we first modify the input nodes of the teacher agents' networks to adapt to the current environment. Then, the teacher agents take the observations of the student agents as input, and output advised actions and values as supervising information. Finally, the student agents combine the reward from the environment with the supervising information from the teacher agents, and learn optimal policies with modified loss functions. By taking full advantage of the teacher agents' knowledge, the search space for the student agents is reduced significantly, which accelerates the learning speed of the whole system. The proposed algorithm is verified in several multi-agent simulation environments, and its efficiency is demonstrated by the experimental results.
Keywords: knowledge transfer, multi-agent reinforcement learning (MARL), new agents
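The abstract above describes a student objective that mixes the environment's reward signal with supervising information from a teacher. A minimal sketch of that idea follows; the function name, the weighting term `beta`, and the exact loss form are illustrative assumptions, not taken from the paper:

```python
def student_loss(q_student, q_teacher, action, reward, q_next, gamma=0.9, beta=0.5):
    """Toy student objective: squared TD error from the environment plus a
    distillation penalty pulling the student's Q-values toward the teacher's."""
    td_target = reward + gamma * max(q_next)           # one-step bootstrap target
    td_error = td_target - q_student[action]           # environment signal
    distill = sum((qs - qt) ** 2                       # teacher supervision signal
                  for qs, qt in zip(q_student, q_teacher)) / len(q_student)
    return td_error ** 2 + beta * distill

# With teacher and student in agreement and a consistent target, the loss is zero.
print(student_loss([0.0, 1.0], [0.0, 1.0], 1, 1.0, [0.0, 0.0]))  # 0.0
```

Lowering `beta` over training would let the student rely less on the teacher as its own estimates improve.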
Task assignment in ground-to-air confrontation based on multiagent deep reinforcement learning (Cited by: 3)
4
Authors: Jia-yi Liu, Gang Wang, Qiang Fu, Shao-hua Yue, Si-yuan Wang. Defence Technology, SCIE EI CAS CSCD, 2023, No. 1, pp. 210-219 (10 pages)
The scale of ground-to-air confrontation task assignment is large, and many concurrent assignments and random events must be handled. When existing task assignment methods are applied to ground-to-air confrontation, efficiency on complex tasks is low and interaction conflicts arise in multiagent systems. This study proposes a multiagent architecture based on one general agent with multiple narrow agents (OGMN) to reduce task assignment conflicts. Considering the slow speed of traditional dynamic task assignment algorithms, this paper proposes the proximal policy optimization for task assignment of general and narrow agents (PPO-TAGNA) algorithm. Based on the idea of the optimal assignment strategy algorithm and combined with a deep reinforcement learning (DRL) training framework, the algorithm adds a multihead attention mechanism and a stage reward mechanism to the bilateral band clipping PPO algorithm to address low training efficiency. Finally, simulation experiments are carried out on a digital battlefield. The OGMN-based multiagent architecture combined with the PPO-TAGNA algorithm obtains higher rewards faster and has a higher win ratio. Analysis of agent behavior verifies the efficiency, superiority, and rationality of resource utilization of this method.
Keywords: ground-to-air confrontation, task assignment, general and narrow agents, deep reinforcement learning, proximal policy optimization (PPO)
Incorporation of Perception-based Information in Robot Learning Using Fuzzy Reinforcement Learning Agents
5
Authors: ZHOU Changjiu, MENG Qingchun, GUO Zhongwen, QU Weifen, YIN Bo. Journal of Ocean University of Qingdao, 2002, No. 1, pp. 93-100 (8 pages)
Robot learning in unstructured environments has proved to be an extremely challenging problem, mainly because of the many uncertainties always present in the real world. Human beings, on the other hand, seem to cope very well with uncertain and unpredictable environments, often relying on perception-based information. Furthermore, human beings can utilize perception to guide their learning toward those parts of the perception-action space that are actually relevant to the task. We therefore conducted research aimed at improving robot learning through the incorporation of both perception-based and measurement-based information. To this end, a fuzzy reinforcement learning (FRL) agent is proposed in this paper. Based on a neural-fuzzy architecture, different kinds of information can be incorporated into the FRL agent to initialise its action network, critic network, and evaluation feedback module so as to accelerate learning. By making use of the global optimisation capability of genetic algorithms (GAs), a GA-based FRL (GAFRL) agent is presented to solve the local minima problem in traditional actor-critic reinforcement learning. Conversely, with the prediction capability of the critic network, GAs can perform a more effective global search. Different GAFRL agents are constructed and verified using the simulation model of a physical biped robot. The simulation analysis shows that the biped learning rate for dynamic balance can be improved by incorporating perception-based information on biped balancing and walking evaluation. The biped robot can find application in ocean exploration, detection, sea rescue, and military maritime activity.
Keywords: robot learning, reinforcement learning agents, neural-fuzzy systems, genetic algorithms, biped robot
Exploring Local Chemical Space in De Novo Molecular Generation Using Multi-Agent Deep Reinforcement Learning (Cited by: 2)
6
Author: Wei Hu. Natural Science, 2021, No. 9, pp. 412-424 (13 pages)
Single-agent reinforcement learning (RL) is commonly used to learn how to play computer games, in which the agent makes one move before making the next in a sequential decision process. Recently, single agents have also been employed in the design of molecules and drugs. While a single agent is a good fit for computer games, it has limitations when used in molecule design: its sequential learning makes it impossible to modify or improve previous steps while working on the current step. In this paper, we propose applying a multi-agent RL approach to molecule research, which can optimize all sites of a molecule simultaneously. To demonstrate the validity of our approach, we chose the chemical compound Favipiravir and explored its local chemical space. Favipiravir is a broad-spectrum inhibitor of viral RNA polymerase, and is one of the compounds currently in SARS-CoV-2 (COVID-19) clinical trials. Our experiments revealed the collaborative learning of a team of deep RL agents, as well as the learning of individual agents, in the exploration of Favipiravir. In particular, our multi-agents discovered not only the molecules near Favipiravir in chemical space, but also the learnability of each site in the string representation of Favipiravir, information critical for understanding the underlying mechanisms that support machine learning of molecules.
Keywords: multi-agent reinforcement learning, actor-critic, molecule design, SARS-CoV-2, COVID-19
Multi-Agent Deep Reinforcement Learning for Cross-Layer Scheduling in Mobile Ad-Hoc Networks
7
Authors: Xinxing Zheng, Yu Zhao, Joohyun Lee, Wei Chen. China Communications, SCIE CSCD, 2023, No. 8, pp. 78-88 (11 pages)
Due to the fading characteristics of wireless channels and the burstiness of data traffic, how to deal with congestion in ad-hoc networks with effective algorithms remains open and challenging. In this paper, we focus on enabling congestion control to minimize network transmission delays through flexible power control. To solve the congestion problem effectively, we propose a distributed cross-layer scheduling algorithm empowered by graph-based multi-agent deep reinforcement learning. The transmit power is adaptively adjusted in real time by our algorithm based only on local information (i.e., channel state information and queue length) and local communication (i.e., information exchanged with neighbors). Moreover, the training complexity of the algorithm is low thanks to regional cooperation based on a graph attention network. In the evaluation, we show that our algorithm can reduce the transmission delay of data flows under severe signal interference and drastically changing channel states, and we demonstrate its adaptability and stability in different topologies. The method is general and can be extended to various types of topologies.
Keywords: ad-hoc network, cross-layer scheduling, multi-agent deep reinforcement learning, interference elimination, power control, queue scheduling, actor-critic methods, Markov decision process
Effect of Silane Coupling Agent Concentration on Interfacial Properties of Basalt Fiber Reinforced Composites
8
Author: Takao Ota. 《材料科学与工程(中英文A版)》, 2023, No. 2, pp. 36-42 (7 pages)
The purpose of this study is to investigate the effect of the concentration of silane coupling solution on the tensile strength of basalt fiber and the interfacial properties of basalt fiber reinforced polymer composites. The surface treatment of basalt fibers was carried out using an aqueous alcohol solution method, with 3-methacryloxypropyl trimethoxysilane at 0.5 wt.%, 1 wt.%, 2 wt.%, 4 wt.%, and 10 wt.%. Basalt monofilament tensile tests were carried out to investigate the variation in strength with coupling agent concentration, and the microdroplet test was performed to examine the effect of concentration on the interfacial strength of the composites. A film formed on the surface of basalt fiber treated with the silane coupling agent solution. The tensile strength of basalt fiber increased because the damaged fiber surface was repaired by the film of silane coupling agent. The film was effective not only for surface protection of the basalt fiber but also for improving the strength of the fiber-matrix interface. However, surface treatment with a high-concentration silane coupling agent solution adversely affects the mechanical properties of the composites, because it degrades their interfacial strength.
Keywords: natural mineral fiber reinforced composites, basalt fiber, silane coupling agent, interface, fiber/matrix bond
Advances in Multi-Agent Reinforcement Learning from the Perspectives of Competition and Cooperation
9
Authors: 田小禾, 李伟, 许铮, 刘天星, 戚骁亚, 甘中学. 《计算机应用与软件》, PKU Core, 2024, No. 4, pp. 1-15 (15 pages)
With substantial progress in deep learning and reinforcement learning research, multi-agent reinforcement learning has become a general approach to large-scale, complex sequential decision problems. To advance the field, this survey collects and summarizes recent research from the perspectives of competition and cooperation. It first introduces single-agent reinforcement learning; it then presents the basic theoretical frameworks of multi-agent reinforcement learning, Markov games and extensive-form games, with emphasis on classical algorithms and recent progress in the competitive, cooperative, and mixed settings; finally, it discusses the core challenge of multi-agent reinforcement learning, environmental non-stationarity, and summarizes and looks ahead at approaches to it through an example.
Keywords: deep learning, reinforcement learning, multi-agent reinforcement learning, environmental non-stationarity
Multi-Agent Reinforcement Learning Algorithm Based on Action Prediction
10
Authors: 童亮, 陆际联. Journal of Beijing Institute of Technology, EI CAS, 2006, No. 2, pp. 133-137 (5 pages)
Multi-agent reinforcement learning algorithms are studied. A prediction-based multi-agent reinforcement learning algorithm is presented for a multi-robot cooperation task. A multi-robot cooperation experiment based on a multi-agent inverted pendulum was conducted to test the efficiency of the new algorithm; the results show that it achieves the cooperation strategy much faster than the primitive multi-agent reinforcement learning algorithm.
Keywords: multi-agent system, reinforcement learning, action prediction, robot
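As a rough illustration of the action-prediction idea in the entry above, the sketch below keeps a frequency model of a teammate's action per state and learns tabular Q-values over joint actions. The class and method names are illustrative; the paper's actual algorithm may differ:

```python
from collections import defaultdict

class PredictiveQAgent:
    """Tabular Q-learning over (state, my_action, teammate_action) triples,
    with a simple frequency model to predict the teammate's next action."""
    def __init__(self, alpha=0.5, gamma=0.9):
        self.alpha, self.gamma = alpha, gamma
        self.q = defaultdict(float)       # (state, my_a, their_a) -> value
        self.counts = defaultdict(int)    # (state, their_a) -> observations

    def predict(self, state, their_actions):
        # most frequently observed teammate action in this state
        return max(their_actions, key=lambda a: self.counts[(state, a)])

    def update(self, state, my_a, their_a, reward, next_state, actions):
        self.counts[(state, their_a)] += 1
        best_next = max(self.q[(next_state, m, t)] for m in actions for t in actions)
        key = (state, my_a, their_a)
        self.q[key] += self.alpha * (reward + self.gamma * best_next - self.q[key])

agent = PredictiveQAgent()
agent.update("s", 0, 1, 1.0, "s2", [0, 1])   # one reinforcement step
agent.update("s", 0, 1, 1.0, "s2", [0, 1])
print(agent.predict("s", [0, 1]))            # teammate action 1 was seen most often
```

In a real system the frequency model would typically be replaced by a learned predictor conditioned on richer history.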
Knowledge Reasoning Method Based on Deep Transfer Reinforcement Learning: DTRLpath
11
Authors: Shiming Lin, Ling Ye, Yijie Zhuang, Lingyun Lu, Shaoqiu Zheng, Chenxi Huang, Ng Yin Kwee. Computers, Materials & Continua, SCIE EI, 2024, No. 7, pp. 299-317 (19 pages)
In recent years, with the continuous development of deep learning and knowledge graph reasoning methods, more and more researchers have shown great interest in improving knowledge graph reasoning by inferring missing facts. By searching paths on the knowledge graph and making fact and link predictions based on these paths, deep-learning-based reinforcement learning (RL) agents achieve good performance and interpretability; deep-reinforcement-learning-based knowledge reasoning has therefore rapidly emerged as a hot research topic. However, even in a small, fixed knowledge graph reasoning action space, there are still a large number of invalid actions, which often interrupt an RL agent's walk and significantly decrease the success rate of path mining. To improve the success rate of RL agents in the early stages of path search, this article proposes a knowledge reasoning method based on Deep Transfer Reinforcement Learning paths (DTRLpath). Before supervised pre-training and retraining, a pre-task of searching for valid actions in a single step is added: the RL agent is first trained on the pre-task to improve its ability to find valid actions, and the trained agent is then transferred to the target reasoning task for path search training, which improves its success rate in finding target task paths. Comparative experiments on the FB15K-237 and NELL-995 datasets show that the proposed method significantly improves the success rate of path search and outperforms similar methods in most reasoning tasks.
Keywords: intelligent agent, knowledge graph reasoning, reinforcement learning, transfer learning
Interfacial reinforcement of core-shell HMX@energetic polymer composites featuring enhanced thermal and safety performance
12
Authors: Binghui Duan, Hongchang Mo, Bojun Tan, Xianming Lu, Bozhou Wang, Ning Liu. Defence Technology, SCIE EI CAS CSCD, 2024, No. 1, pp. 387-399 (13 pages)
The weak interface interaction and solid-solid phase transition have long been a conundrum for 1,3,5,7-tetranitro-1,3,5,7-tetraazacyclooctane (HMX)-based polymer-bonded explosives (PBX). A two-step strategy was proposed to address the problem: pretreating HMX with a polyalcohol bonding agent to introduce —OH groups on the surface, then in-situ coating with a nitrate-ester-containing polymer. Two types of energetic polyether, glycidyl azide polymer (GAP) and nitrate-modified GAP (GNP), were grafted onto the HMX crystal via isocyanate addition reactions bridged through a neutral polymeric bonding agent (NPBA) layer. The morphology and structure of the HMX-based composites were characterized in detail and the core-shell structure was validated. The grafted polymers clearly enhanced the adhesion force between HMX crystals and the fluoropolymer (F2314) binder. Owing to the interfacial reinforcement among the components, the two HMX-based composites exhibited remarkable increases in phase transition peak temperature of 10.2°C and 19.6°C, respectively, with no more than 1.5% shell content. Furthermore, the impact and friction sensitivity of the composites decreased significantly as a result of the barrier produced by the grafted polymers. These findings improve the prospects for interface design of energetic composites aimed at solving weak-interface and safety concerns.
Keywords: HMX crystals, polyalcohol bonding agent, energetic polymer, core-shell structure, interfacial reinforcement
Cooperative Multi-Agent Reinforcement Learning with Constraint-Reduced DCOP
13
Authors: Yi Xie, Zhongyi Liu, Zhao Liu, Yijun Gu. Journal of Beijing Institute of Technology, EI CAS, 2017, No. 4, pp. 525-533 (9 pages)
Cooperative multi-agent reinforcement learning (MARL) is an important topic in artificial intelligence, in which distributed constraint optimization (DCOP) algorithms have been widely used to coordinate the actions of multiple agents. However, dense communication among agents limits the practicability of DCOP algorithms. In this paper, we propose a novel DCOP algorithm that addresses this communication problem by reducing constraints. The contributions of this paper are threefold: (1) it is proved that removing constraints can effectively reduce the communication burden of DCOP algorithms; (2) a criterion is provided to identify insignificant constraints whose elimination does not greatly affect the performance of the whole system; (3) a constraint-reduced DCOP algorithm is proposed that adopts a variant of the spectral clustering algorithm to detect and eliminate insignificant constraints. Our algorithm reduces the communication burden of the benchmark DCOP algorithm while keeping its overall performance unaffected. The performance of the constraint-reduced DCOP algorithm is evaluated on four configurations of cooperative sensor networks, and the effectiveness of communication reduction is verified by comparisons between the constraint-reduced DCOP and the benchmark DCOP.
Keywords: reinforcement learning, cooperative multi-agent system, distributed constraint optimization (DCOP), constraint-reduced DCOP
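To make the constraint-reduction idea above concrete, here is a deliberately simplified sketch that ranks constraints by an importance weight and drops the weakest share of them. The paper itself uses a spectral-clustering criterion to decide which constraints are insignificant, so the simple ranking rule below is only an illustrative stand-in:

```python
def reduce_constraints(constraints, keep_ratio=0.5):
    """constraints: {(agent_i, agent_j): weight}. Keep the highest-|weight|
    fraction of constraints and drop the rest to cut communication."""
    ranked = sorted(constraints, key=lambda edge: abs(constraints[edge]), reverse=True)
    keep = max(1, int(len(ranked) * keep_ratio))  # always keep at least one edge
    return {edge: constraints[edge] for edge in ranked[:keep]}

graph = {("a", "b"): 5.0, ("b", "c"): 0.1, ("a", "c"): 2.0, ("c", "d"): 0.05}
print(sorted(reduce_constraints(graph)))  # [('a', 'b'), ('a', 'c')]
```

Every dropped edge removes one pairwise message exchange per DCOP iteration, which is the communication saving the paper quantifies.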
Effect of Shrinkage Reducing Agent and Steel Fiber on the Fluidity and Cracking Performance of Ultra-High Performance Concrete
14
Authors: Yong Wan, Li Li, Jiaxin Zou, Hucheng Xiao, Mengdi Zhu, Ying Su, Jin Yang. Fluid Dynamics & Materials Processing, EI, 2024, No. 9, pp. 1941-1956 (16 pages)
Due to the low water-cement ratio of ultra-high-performance concrete (UHPC), fluidity and shrinkage cracking are key aspects determining its performance and durability. In this study, the effects of different types of cementitious materials, chemical shrinkage-reducing agents (SRA), and steel fiber (SF) were assessed. Compared with M2-UHPC and M3-UHPC, M1-UHPC was found to have better fluidity and shrinkage-cracking performance. Different SRA incorporation methods and dosages, and different SF types and aspect ratios, were also examined. The incorporation of SRA and SF decreased the fluidity of UHPC: internal SRA at 1% (NSRA-1%), external SRA at 1% (WSRA-1%), STS-0.22, and STE-0.7 reduced fluidity by 3.3%, 8.3%, 9.2%, and 25%, respectively. However, SRA and SF improved shrinkage-cracking performance: NSRA-1% and STE-0.7 reduced the shrinkage of UHPC by 40% and 60% and increased crack resistance by 338% and 175%, respectively. In addition, SF made the microstructure of UHPC more compact, increasing the 28-day compressive and flexural strengths by 26.9% and 19.9%, respectively.
Keywords: ultra-high performance concrete, chemical shrinkage reducing agent, steel fiber, shrinkage cracking, repair and reinforcement
A Cross-Layer Protection Method for Power Communication Networks Based on Multi-Agent Reinforcement Learning
15
Author: 陈毅龙. 《自动化技术与应用》, 2024, No. 10, pp. 112-115 (4 pages)
To address the low data transmission success rate, long transmission delay, and high overhead of existing methods, a cross-layer protection method for power communication networks based on multi-agent reinforcement learning is designed. A multi-agent reinforcement learning algorithm is first used to set the network's multipath protocol and control the data-receiving capability of network nodes; a cross-layer security architecture is then constructed, with a corresponding network model as the basis for cross-layer protection; finally, the model is solved by the penalty function method to ensure highly reliable solutions, and the cross-layer algorithm is optimized according to the results, yielding the protection method. Experimental results show that the proposed method markedly improves the packet delivery rate, shortens transmission delay, and reduces overhead.
Keywords: multi-agent reinforcement learning, cross-layer protection, penalty function, packet transmission delay
Joint Optimization of Distributed Intelligent Computation Offloading and Service Caching for DAG Tasks
16
Authors: 李云, 南子煜, 姚枝秀, 夏士超, 鲜永菊. 《中山大学学报(自然科学版)(中英文)》, CAS, PKU Core, 2025, No. 1, pp. 71-82 (12 pages)
A directed acyclic graph (DAG) task offloading and resource optimization problem is formulated, aiming to minimize system energy consumption under constraints such as maximum tolerable delay. Since computation requests in the network are highly dynamic and complete system state information is difficult to obtain, the multi-agent deep deterministic policy gradient (MADDPG) algorithm is used to find the optimal policy. Compared with existing task offloading algorithms, MADDPG reduces average system energy consumption by 14.2% to 40.8% and raises the local cache hit rate by 3.7% to 4.1%.
Keywords: mobile edge computing, multi-agent deep reinforcement learning, computation offloading, resource allocation, service caching
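MADDPG, named in the entry above, trains one centralized critic per agent; each critic regresses onto a one-step TD target built from that agent's reward and a target critic's value of the next joint state. A minimal sketch of just that target computation (variable names and the episode-termination handling are illustrative assumptions, not taken from the paper):

```python
def maddpg_td_targets(rewards, next_target_qs, dones, gamma=0.95):
    """Per-agent TD targets y_i = r_i + gamma * Q'_i(s', a') * (1 - done_i),
    where Q'_i is agent i's target critic evaluated on the next joint state."""
    return [r + gamma * q * (1.0 - d)
            for r, q, d in zip(rewards, next_target_qs, dones)]

# Two agents: the first keeps bootstrapping, the second's episode just ended.
print(maddpg_td_targets([1.0, 0.5], [2.0, 3.0], [0.0, 1.0]))  # approximately [2.9, 0.5]
```

Each critic is then fit to its target by minimizing the squared error, while each actor ascends its own critic's value.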
Research Status and Development Trends of Reinforcement Learning in Multi-Agent Systems (Cited by: 12)
17
Authors: 赵志宏, 高阳, 骆斌, 陈世福. 《计算机科学》, CSCD, PKU Core, 2004, No. 3, pp. 23-27 (5 pages)
This paper surveys and discusses the research status, key techniques, open problems, and development trends of reinforcement learning and its application in multi-agent systems, attempting to identify current research priorities and directions. The main contents include: (1) the framework of reinforcement learning; (2) several representative reinforcement learning methods; (3) applications and problems of reinforcement learning in multi-agent systems. Finally, the challenges of applying reinforcement learning in multi-agent systems are discussed.
Keywords: artificial intelligence, multi-agent systems, meta-game theory, reinforcement learning algorithms, POMDP model
A Reinforcement-Learning-Based Agent Negotiation Model in AODE (Cited by: 14)
18
Authors: 王立春, 高阳, 陈世福. 《南京大学学报(自然科学版)》, CAS CSCD, PKU Core, 2001, No. 2, pp. 135-141 (7 pages)
AODE is an agent-oriented intelligent system development environment developed by the authors. Its reinforcement-learning-based agent negotiation model uses a Markov decision process to describe system state transitions and a continuous decision process to describe agent negotiation within a given system state, applying reinforcement learning to the negotiation process. The model can describe multi-agent negotiation in dynamic environments; when all agents adopt the meta-game Q-learning algorithm, the system obtains the optimal negotiation solution for a dynamic negotiation environment.
Keywords: multi-agent systems, reinforcement learning, agent negotiation model, AODE, intelligent system development environment, negotiation strategy
Adaptability Simulation of Command-and-Control Agents Based on Reinforcement Learning (Cited by: 8)
19
Authors: 李志强, 胡晓峰, 张斌, 董忠林. 《系统仿真学报》, EI CAS CSCD, PKU Core, 2005, No. 11, pp. 2801-2804 (4 pages)
Using machine learning techniques from artificial intelligence to endow intelligent agents in war simulation systems with adaptability is a fundamental part of research on war complexity based on CAS theory. Facing the complex, dynamic environment of a war system, traditional supervised learning methods cannot satisfy the real-time learning requirements of intelligent agents, whereas reinforcement learning adapts well to such dynamic, unknown environments. This paper applies reinforcement learning to model and simulate the adaptability of command-and-control agents in war systems. Experimental results show that reinforcement learning satisfies the requirement of unsupervised, online, real-time learning for command-and-control agents, providing a sound modeling approach for the adaptability mechanisms of intelligent agents in war simulation systems.
Keywords: adaptability, reinforcement learning, command and control, agent
An Agent-Team-Based Reinforcement Learning Model and Its Application (Cited by: 31)
20
Authors: 蔡庆生, 张波. 《计算机研究与发展》, EI CSCD, PKU Core, 2000, No. 9, pp. 1087-1093 (7 pages)
Multi-agent learning has received considerable attention in recent years. Building on the single-agent reinforcement learning algorithm Q-learning, this paper proposes an agent-team-based reinforcement learning model. Its most distinctive feature is the introduction of a lead agent as the protagonist of team learning, with the whole team learning through role switching of the lead agent. A concrete application model is designed for the simulated robot soccer domain, Q-learning is extended in several respects, and experiments are conducted.
Keywords: agent team, robot soccer, reinforcement learning model, artificial intelligence