Journal Articles
25 articles found
1. Coordinated Traffic Signal Control Based on Improved Multi-Agent Nash Q-Learning
Authors: 苏港, 叶宝林, 姚青, 陈滨, 张一嘉. 《软件工程》, 2024, No. 10, pp. 43-49 (7 pages)
Abstract: To optimize regional traffic signal timing plans and improve regional traffic efficiency, this paper proposes a regional coordinated traffic signal control method based on improved multi-agent Nash Q-Learning. First, a discretization encoding scheme partitions the road into cells, converting continuous state information into discrete form. Second, a Long Short-Term Memory (LSTM) module is integrated into the algorithm to mine more hidden information from the state data and enrich the state entries in the Q-table. Finally, simulations based on the microscopic traffic simulator SUMO (Simulation of Urban Mobility) show that, compared with the original Nash Q-Learning traffic signal control method, the proposed method reduces average vehicle waiting time by 11.5%, 16.2%, and 10.0% under low, medium, and high traffic volumes, respectively; average queue length by 9.1%, 8.2%, and 7.6%; and average number of stops by 18.3%, 16.1%, and 10.0%. The results demonstrate better control performance.
Keywords: regional coordinated traffic signal control; Markov decision; multi-agent Nash Q-Learning; LSTM; SUMO
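As background for the Nash Q-Learning entry above: the single-agent tabular Q-learning update that the multi-agent variant extends can be sketched as follows. The toy chain environment, states, and hyperparameters here are illustrative only, not the paper's SUMO setup.

```python
import random

# Minimal single-agent tabular Q-learning on a toy 4-state chain.
# All states, actions, rewards and hyperparameters are illustrative.
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.2
STATES, ACTIONS, GOAL = range(4), (0, 1), 3
Q = {(s, a): 0.0 for s in STATES for a in ACTIONS}

def step(s, a):
    """Toy deterministic environment: action 1 advances toward the goal."""
    s2 = min(s + a, GOAL)
    return s2, (1.0 if s2 == GOAL else 0.0)

random.seed(0)
for _ in range(500):
    s = 0
    while s != GOAL:
        # epsilon-greedy action selection
        if random.random() < EPSILON:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda x: Q[(s, x)])
        s2, r = step(s, a)
        # Q-learning update: move Q(s,a) toward r + gamma * max_a' Q(s',a')
        Q[(s, a)] += ALPHA * (r + GAMMA * max(Q[(s2, a2)] for a2 in ACTIONS) - Q[(s, a)])
        s = s2

print({k: round(v, 2) for k, v in Q.items()})
```

At convergence the values along the optimal path approach 1, γ, γ², … back from the goal, which is the fixed point of the update above.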
2. Fuzzy Q-learning Algorithm for Dual-Aircraft Path Planning to Cooperatively Detect Targets by Passive Radars (Cited by: 6)
Authors: Xiang Gao, Yangwang Fang, Youli Wu. 《Journal of Systems Engineering and Electronics》 (SCIE, EI, CSCD), 2013, No. 5, pp. 800-810 (11 pages)
Abstract: The problem of passive detection discussed in this paper involves searching for and locating an aerial emitter by dual aircraft using passive radars. To improve detection probability and accuracy, a fuzzy Q-learning algorithm for dual-aircraft flight path planning is proposed. The passive detection task model of the dual aircraft is set up based on a partition of the target active radar's radiation area. The problem is formulated as a Markov decision process (MDP) by using fuzzy theory to generalize the state space and by properly defining the transition functions, action space, and reward function. Details of the path planning algorithm are presented. Simulation results indicate that the algorithm provides adaptive strategies for the dual aircraft to control their flight paths when detecting a non-maneuvering or maneuvering target.
Keywords: Markov decision process (MDP); fuzzy Q-learning; dual-aircraft coordination; path planning; passive detection
3. Research on Q-Learning-Based Service Routing Allocation for Regional Distribution Networks
Authors: 赵志军, 金军. 《计算技术与自动化》, 2021, No. 1, pp. 104-108 (5 pages)
Abstract: Traditional service routing allocation methods for distribution networks suffer from excessive link occupancy, which leads to high packet loss. To address this, a Q-Learning-based service routing allocation method for regional distribution networks is designed. Performance indicators of service routes are partitioned in the conventional way, and their constraint values are computed from the routing constraints to determine the optimal transmission path. The Bellman equation is used to iteratively compute and update Q-values across the distribution network; combining node and network service indicators, the Q-Learning algorithm then computes the risk balance degree of the regional network. By repeatedly permuting the routing order of the VNFs, the task is converted into a TSP routing problem, finally yielding the routing allocation matrix and completing service routing allocation. Experimental results show that, compared with traditional methods, the Q-Learning-based method has lower link occupancy and effectively reduces packet loss during service data forwarding.
Keywords: Q-Learning algorithm; service routing; Bellman equation; optimal transmission path; risk balance degree; routing allocation
4. A Deep Reinforcement Learning-Based Technique for Optimal Power Allocation in Multiple Access Communications
Authors: Sepehr Soltani, Ehsan Ghafourian, Reza Salehi, Diego Martín, Milad Vahidi. 《Intelligent Automation & Soft Computing》, 2024, No. 1, pp. 93-108 (16 pages)
Abstract: For many years, researchers have explored model-driven power allocation (PA) algorithms in wireless networks with multiple-user communications under interference. Nowadays, data-driven machine learning methods have become quite popular for analyzing wireless communication systems, among which deep reinforcement learning (DRL) plays a significant role in solving optimization problems under constraints. To this end, this paper investigates the PA problem in a k-user multiple access channel (MAC), where k transmitters (e.g., mobile users) each aim to send an independent message to a common receiver (e.g., a base station) over wireless channels. A deep Q network (DQN) is first trained with a deep Q-learning (DQL) algorithm in a simulation environment using offline learning. The DQN is then used with real data in online training for the PA problem, maximizing the sum rate subject to the source power constraint. Simulation results indicate that the proposed DQN method achieves a better sum rate than available approaches such as fractional programming (FP) and weighted minimum mean squared error (WMMSE). Additionally, across different user densities, the proposed DQN outperforms the benchmark algorithms, verifying good generalization over wireless multi-user communication systems.
Keywords: deep reinforcement learning; deep Q-learning; multiple access channel; power allocation
5. Autonomous Path Planning for Search-and-Rescue Robots Based on Q-learning
Authors: 褚晶, 邓旭辉, 岳颀. 《南京航空航天大学学报》 (CAS, CSCD, PKU Core), 2024, No. 2, pp. 364-374 (11 pages)
Abstract: When man-made or natural disasters strike, rapidly deploying search-and-rescue robots under extreme conditions is key to saving lives. To complete rescue missions, such robots must autonomously plan paths in continuous, dynamic, unknown environments to reach the rescue target. This paper proposes a sensor configuration scheme for search-and-rescue robots and applies Q-learning algorithms based on both a Q-table and a neural network to achieve autonomous control, solving the path planning problem of avoiding static and dynamic obstacles in unknown environments. Balancing exploration and exploitation during training is one of the challenges of reinforcement learning; building on greedy search and Boltzmann search, this paper proposes a hybrid optimization method that dynamically selects between the two search strategies. MATLAB simulations show that the proposed method is feasible and effective: a robot with the proposed sensor configuration responds well to environmental changes, reaching the target position while successfully avoiding static and dynamic obstacles.
Keywords: search-and-rescue robot; path planning; sensor configuration; Q-learning; neural network
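The greedy and Boltzmann search strategies that the abstract above combines can be sketched as follows. The Q-values, ε, and temperature below are illustrative placeholders; the paper's dynamic switching rule is not reproduced here.

```python
import math
import random

# Two standard exploration policies: epsilon-greedy and Boltzmann (softmax).
# Q-values, epsilon and temperature are illustrative placeholders.
def epsilon_greedy(q_values, epsilon=0.1):
    """With probability epsilon pick a uniformly random action, else the greedy one."""
    if random.random() < epsilon:
        return random.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])

def boltzmann(q_values, temperature=0.5):
    """Sample an action with probability proportional to exp(Q / T)."""
    prefs = [math.exp(q / temperature) for q in q_values]
    total = sum(prefs)
    r, acc = random.random() * total, 0.0
    for a, p in enumerate(prefs):
        acc += p
        if r <= acc:
            return a
    return len(q_values) - 1  # guard against floating-point round-off

random.seed(1)
q = [0.2, 1.0, 0.4]
print(epsilon_greedy(q), boltzmann(q))
```

ε-greedy explores uniformly regardless of Q-value gaps, while Boltzmann exploration biases exploration toward actions that already look promising; a hybrid scheme like the one in the paper tries to get the benefits of both.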
6. Path Planning for Underground Coal Mine Mobile Robots Based on the Q-learning Algorithm (Cited by: 4)
Authors: 徐学东. 《煤炭技术》 (CAS, PKU Core), 2013, No. 2, pp. 105-106 (2 pages)
Abstract: Planning a robot's path under the environmental uncertainty of an underground coal mine is a difficult problem. This paper proposes a Q-learning-based path planning method for mobile robots, aiming to help improve the obstacle-avoidance capability of rescue robots.
Keywords: coal mine; robot; Q-learning; path planning
7. Q-Learning-Based Pesticide Contamination Prediction in Vegetables and Fruits (Cited by: 1)
Authors: Kandasamy Sellamuthu, Vishnu Kumar Kaliappan. 《Computer Systems Science & Engineering》 (SCIE, EI), 2023, No. 4, pp. 715-736 (22 pages)
Abstract: Pesticides have become increasingly necessary in modern agricultural production, yet they have unforeseeable long-term impacts on human well-being and the ecosystem. Due to a shortage of basic pesticide exposure awareness, farmers typically apply pesticides very close to harvest. Pesticide residues in foods, particularly fruits and vegetables, are a significant concern for farmers, merchants, and especially consumers. Residual concentrations are usually far below the maximum allowable limits, with only a few samples exceeding the restrictions for such pesticides in food; even so, warnings about this level of pesticide use in farming are needed. Previous technologies failed to forecast the large number of pesticides that are dangerous to people, necessitating improved detection and early-warning systems. A novel methodology for verifying the status and evaluating the level of pesticides in commonly consumed vegetables and fruits is identified, named the Hybrid Chronic Multi-Residual Framework (HCMF). The harmful level of applied pesticide residues contaminating agro products is predicted using a Q-Learning-based recurrent neural network, and the predicted contamination levels are analyzed using Complex Event Processing (CEP) over the given spatial and sequential data. The analysis results help minimize and better target pesticide use in the field while ensuring the safety of farmers and consumers. The technique is implemented in a Python environment, with the proposed model achieving 98.57% accuracy and a training loss of 0.30.
Keywords: pesticide contamination; complex event processing; recurrent neural network; Q-learning; multi-residual level; contamination level
8. Adaptive Kernel Firefly Algorithm Based Feature Selection and Q-Learner Machine Learning Models in Cloud
Authors: I. Mettildha Mary, K. Karuppasamy. 《Computer Systems Science & Engineering》 (SCIE, EI), 2023, No. 9, pp. 2667-2685 (19 pages)
Abstract: CC (Cloud Computing) networks are distributed and dynamic, as signals appear, disappear, or lose significance. MLTs (Machine Learning Techniques) train on datasets that are sometimes inadequate in sample size for inferring information. DevMLOps (Development Machine Learning Operations), a dynamic strategy used for automatic selection and tuning of MLTs, yields significant performance differences, but it has many disadvantages, including the need for continual training, more samples and training time for feature selection, and increased classification execution times. RFEs (Recursive Feature Eliminations) are computationally expensive because they traverse each feature without considering correlations between features. This problem can be overcome by wrappers, which select better features by accounting for test and train datasets. The aim of this paper is to use DevQLMLOps for automated tuning and selection based on orchestration and messaging between containers. The proposed AKFA (Adaptive Kernel Firefly Algorithm) selects features for CNM (Cloud Network Monitoring) operations. The AKFA methodology is demonstrated on the CNSD (Cloud Network Security Dataset), with satisfactory results in the performance metrics used: precision, recall, F-measure, and accuracy.
Keywords: cloud analytics; machine learning; ensemble learning; distributed learning; clustering; classification; auto selection; auto tuning; decision feedback; cloud DevOps; feature selection; wrapper feature selection; Adaptive Kernel Firefly Algorithm (AKFA); Q-learning
9. Day-Ahead Market Bidding Strategies of Generation Companies Based on the Q-Learning Algorithm (Cited by: 8)
Authors: 王帅. 《能源技术经济》, 2010, No. 3, pp. 34-39 (6 pages)
Abstract: Electricity market simulation can be used to study how market rules and market structure affect price formation, as well as the dynamic behavior of market participants. This paper builds a preliminary multi-agent model that simulates generation companies' bidding strategies in the day-ahead market, with each agent using the Q-Learning algorithm to optimize its own strategy. The selection of the exploration parameter in the reinforcement learning algorithm is improved so that the program explores new actions with high probability in the early stage, avoiding premature convergence to a local optimum. The construction of the stepwise bidding curve is also improved, reducing the computational load and increasing computation speed.
Keywords: electricity market; pricing; intelligent agent; Q-Learning algorithm; bidding strategy
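The improved exploration-parameter selection described above (high exploration early, decaying later to avoid premature convergence to a local optimum) is commonly realized as a decay schedule. A minimal sketch follows; the start, floor, and decay constants are illustrative, not taken from the paper.

```python
# Decaying exploration schedule: explore broadly early, exploit later.
# The start/floor/decay constants are illustrative placeholders.
EPS_START, EPS_END, DECAY = 1.0, 0.05, 0.995

def epsilon_at(episode):
    """Exponentially decayed exploration probability, floored at EPS_END."""
    return max(EPS_END, EPS_START * DECAY ** episode)

print(round(epsilon_at(0), 3), round(epsilon_at(500), 3), round(epsilon_at(2000), 3))
```

The floor EPS_END keeps a small amount of exploration active indefinitely, so the agent can still react if the other bidders' strategies change late in the simulation.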
10. Robot Soccer Action Selection Based on Q-learning (Cited by: 2)
Authors: 刘新宇, 洪炳熔. 《Journal of Harbin Institute of Technology (New Series)》 (EI, CAS), 2001, No. 3, pp. 212-214 (3 pages)
Abstract: This paper studies robot soccer action selection based on Q-learning. The robots learn to activate particular behaviors given their current situation and reward signal. Neural networks are adopted to implement Q-learning, for their generalization properties and limited computer memory requirements.
Keywords: robot soccer; action selection; Q-learning; neural network
11. An Improved AODV Stable Routing Method for UAV Ad Hoc Networks Based on the Q-learning Algorithm (Cited by: 2)
Authors: 李海滨, 唐晓刚, 常继红, 吴署光, 周尚辉, 王梦阳. 《现代电子技术》, 2023, No. 6, pp. 91-97 (7 pages)
Abstract: To address the network performance degradation caused by high-speed node movement and drastic topology changes in small UAV swarm networks, a link-stability-aware routing protocol for UAV ad hoc networks (LN-AODV) is proposed on the basis of the Ad hoc On-demand Distance Vector (AODV) protocol. First, link lifetime and the degree of neighbor-node change are combined by weighting to select stable paths, mitigating the increased transmission delay and decreased delivery ratio that arise under dynamically changing topologies. Then, the Q-learning algorithm adaptively adjusts the Hello message interval, allowing the protocol to trade routing control overhead against topology-sensing sensitivity according to the perceived degree of topology change. Simulation results show that, compared with AODV, the proposed method improves end-to-end delay, packet delivery ratio, routing overhead, and throughput by 7.56%, 2.58%, 17.39%, and 2.62%, respectively. It suits UAV ad hoc networks with high-speed nodes and offers a useful reference for wireless ad hoc network research and applications.
Keywords: UAV ad hoc network; routing protocol; AODV; Q-learning; stable path; topology awareness; Hello message; neighbor node
12. Sound-Imitation Bird-Repelling Strategy for Transmission Lines Based on an Improved Q-learning Algorithm (Cited by: 1)
Authors: 柯杰龙, 张羽, 朱朋辉, 黄炽坤, 吴可廷. 《南京信息工程大学学报(自然科学版)》 (CAS, PKU Core), 2022, No. 5, pp. 579-586 (8 pages)
Abstract: Increasingly frequent bird activity poses a serious threat to the safe operation of transmission lines, and existing sound-imitation bird-repelling devices, lacking intelligence, cannot repel birds effectively over the long term. To solve this problem, this paper proposes a sound-imitation bird-repelling strategy based on an improved Q-learning algorithm. First, to evaluate the repelling effect of each audio clip, fuzzy theory is applied to quantify birds' behavior after hearing a clip into different reaction types. Then, single-clip repelling experiments are designed and the effect of each clip is recorded, yielding initial weight values that provide an experimental basis for the device's audio selection. To make the computed audio weights better match the experimental results, the weight formula of the CRITIC (Criteria Importance Through Intercriteria Correlation) method is optimized. Finally, the experimentally obtained audio weights are used to improve the Q-learning algorithm, and comparison experiments against other sound-imitation strategies show that the improved strategy outperforms the other three: it converges quickly, repels birds stably, and reduces birds' habituation.
Keywords: sound-imitation audio; bird-repelling effect; fuzzy theory; Q-learning algorithm; bird-repelling strategy
13. Q-learning-Based Utility-Maximizing Resource Allocation Strategy Generation for Electric Power Communication Networks (Cited by: 3)
Authors: 谢小军, 潘子春, 吴非. 《自动化技术与应用》, 2018, No. 4, pp. 44-48, 53 (6 pages)
Abstract: With the rapid growth of smart grid services, demand for electric power communication network resources keeps increasing. To satisfy as many service demands as possible while improving resource utilization, and thereby raise user satisfaction, this paper builds a resource allocation model for power communication networks and proposes a Q-learning-based algorithm for generating utility-maximizing resource allocation strategies. Simulation experiments show that the algorithm converges quickly; compared with static and dynamic resource allocation algorithms, it achieves higher utility for power services while maintaining high resource utilization, meeting more service demands and improving user satisfaction.
Keywords: electric power communication network; network virtualization; resource allocation; Q-learning
14. Collaborative Multi-Agent Reinforcement Learning Based on Experience Propagation (Cited by: 5)
Authors: Min Fang, Frans C.A. Groen. 《Journal of Systems Engineering and Electronics》 (SCIE, EI, CSCD), 2013, No. 4, pp. 683-689 (7 pages)
Abstract: For multi-agent reinforcement learning in Markov games, knowledge extraction and sharing are key research problems. State list extracting means calculating the optimal shared state path from state trajectories that contain cycles. A state list extracting algorithm checks the cyclic state lists of the current state in the state trajectory, condensing the optimal action set of that state. By reinforcing the optimal action selected, the action policy for cyclic states is optimized gradually. The extracted state lists are repeatedly learned and used as experience knowledge shared by the team, and agents speed up convergence through this experience sharing. Competition games between preys and predators are used for the experiments. The results prove that the proposed algorithms overcome the lack of experience in the initial stage, speed up learning, and improve performance.
Keywords: multi-agent; Q-learning; state list extracting; experience sharing
15. Local Path Planning Method of the Self-propelled Model Based on Reinforcement Learning in Complex Conditions
Authors: Yi Yang, Yongjie Pang, Hongwei Li, Rubo Zhang. 《Journal of Marine Science and Application》, 2014, No. 3, pp. 333-339 (7 pages)
Abstract: Conducting hydrodynamic and physical motion simulation tests with a large-scale self-propelled model under actual wave conditions is an important means of researching the environmental adaptability of ships. During navigation tests of the self-propelled model, the complex environment, including various port facilities, navigation facilities, and nearby ships, must be considered carefully, because in such dense environments the impact of sea waves and winds on the model is particularly significant. To improve the security of the self-propelled model, this paper introduces Q-learning combined with chaotic ideas for the model's collision avoidance, improving the reliability of local path planning. Simulation and sea test results show that this algorithm is a good solution for collision avoidance of the self-propelled model under the interference of sea winds and waves, with good adaptability.
Keywords: self-propelled model; local path planning; Q-learning; obstacle avoidance; reinforcement learning
16. Efficient Temporal Difference Learning with Adaptive λ
Authors: 毕金波, 吴沧浦. 《Journal of Beijing Institute of Technology》 (EI, CAS), 1999, No. 3, pp. 251-257 (7 pages)
Abstract: Aim: To find a more efficient learning method based on temporal difference learning for delayed reinforcement learning tasks. Methods: A Q-learning algorithm based on truncated TD(λ), with adaptive schemes for selecting the λ value, addressed to absorbing Markov decision processes, was presented and implemented on computers. Results and Conclusion: Simulations on shortest-path search problems show that using an adaptive λ in Q-learning based on TTD(λ) can speed up its convergence.
Keywords: dynamic programming; delayed reinforcement learning; absorbing Markov decision processes; temporal difference learning; Q-learning
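As background for the adaptive-λ entry above: a minimal TD(λ) value-estimation sketch with accumulating eligibility traces and a fixed λ on a toy random-walk chain. The environment and constants are illustrative; the paper's truncated TTD(λ) and adaptive-λ schemes are not reproduced here.

```python
import random

# TD(lambda) with accumulating eligibility traces on a 5-state random walk
# (exit left: reward 0; exit right: reward 1). Constants are illustrative.
ALPHA, GAMMA, LAM = 0.1, 1.0, 0.8
N = 5                          # non-terminal states 0..4
V = [0.0] * N                  # state-value estimates

random.seed(0)
for _ in range(2000):
    s, e = 2, [0.0] * N        # start mid-chain, reset traces each episode
    while True:
        s2 = s + random.choice([-1, 1])
        terminal = s2 < 0 or s2 >= N
        r = 1.0 if s2 >= N else 0.0
        # one-step TD error, bootstrapping only from non-terminal successors
        delta = r + (0.0 if terminal else GAMMA * V[s2]) - V[s]
        e[s] += 1.0            # accumulating trace for the visited state
        for i in range(N):     # propagate the error back along the trace
            V[i] += ALPHA * delta * e[i]
            e[i] *= GAMMA * LAM
        if terminal:
            break
        s = s2

print([round(v, 2) for v in V])
```

For this chain the true values are (i + 1) / 6 for state i; the trace parameter λ controls how far each TD error is propagated back along the recently visited states, which is the quantity the paper's adaptive scheme tunes.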
17. Reinforcement-Learning-Based Adaptive MAC Protocol for FANETs
Authors: 闫涛, 赵一帆, 高明虎, 陈虎, 唐嘉宁. 《计算机工程与设计》 (PKU Core), 2024, No. 9, pp. 2613-2619 (7 pages)
Abstract: A single preset medium access control (MAC) protocol can hardly satisfy the diverse service demands of a flying ad hoc network (FANET). This paper proposes a Q-Learning-based adaptive MAC protocol for FANETs (FQL-AMAC). Two baseline protocols are jointly controlled, and the one with clearly better quality of service (QoS) is automatically selected and switched to according to current network conditions. Since optimizing a single network performance metric only reaches a local QoS optimum, the entropy weight method is used to fuse throughput and delay into a composite performance metric for the reward function, approaching the global QoS optimum. Experimental results show that FQL-AMAC effectively selects the best protocol to run, outperforming existing protocols in throughput, delay, and composite performance.
Keywords: flying ad hoc network; medium access control; diverse services; adaptive selection; Q-learning; entropy weight method; composite performance
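The entropy weight method used above to fuse throughput and delay into one composite metric can be sketched as follows: criteria whose values vary more across samples carry more information and receive larger weights. The sample measurements below are illustrative placeholders, not data from the paper.

```python
import math

# Entropy weight method: criteria with more variation across samples
# get larger weights. Sample values below are illustrative only.
def entropy_weights(matrix):
    """matrix[i][j] = value of (benefit-type) criterion j for sample i."""
    n, m = len(matrix), len(matrix[0])
    raw = []
    for j in range(m):
        col_sum = sum(row[j] for row in matrix)
        probs = [row[j] / col_sum for row in matrix]
        # normalized Shannon entropy of the criterion's value distribution
        entropy = -sum(p * math.log(p) for p in probs if p > 0) / math.log(n)
        raw.append(1.0 - entropy)   # divergence degree: higher = more informative
    total = sum(raw)
    return [w / total for w in raw]

# columns: throughput (varies a lot), delay score (nearly constant)
samples = [[10.0, 5.0], [40.0, 5.1], [90.0, 4.9]]
w = entropy_weights(samples)
print([round(x, 3) for x in w])
```

Here the throughput column, which varies widely, dominates the weights, while the nearly constant delay score contributes little; the weighted sum of normalized metrics then serves as the composite reward.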
18. A UAV Swarm Cooperative Pursuit Method Based on Improved Game Learning (Cited by: 1)
Authors: 刘菁, 华翔, 张金金. 《西安工业大学学报》 (CAS), 2023, No. 3, pp. 277-286 (10 pages)
Abstract: For the problem of a UAV swarm cooperatively encircling a single intelligent target, this paper proposes a cooperative pursuit method based on improved game learning. From the kinematic relationship between the swarm and the target, a cooperative pursuit model combining game theory with the Apollonius circle is established. Exploiting both the cooperation among swarm members and the pursuit-evasion game between the two sides, the greedy factor is dynamically adjusted based on the Q-Learning algorithm and the learned mean reward to build and refine the state-action matrix. The Nash equilibrium of the payoff matrix is then solved from the state-action matrix, completing the swarm's cooperative encirclement of the single target. The results show that with this method the average rewards obtained by the pursuing UAVs are 48%, 32.4%, and 50.8% higher than with the traditional Q-Learning algorithm, and the average number of steps needed to complete the pursuit decreases by 58.7%; the method encircles a single target effectively and with better timeliness.
Keywords: UAV swarm; cooperative pursuit; game theory; Apollonius circle; Q-Learning
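The Apollonius circle used in the pursuit model above is the locus of points whose distances to the pursuer and the evader stand in a fixed ratio (the speed ratio). A minimal sketch with illustrative positions and speeds, derived from |X − P| = k·|X − E| with k ≠ 1:

```python
import math

# Apollonius circle: locus of points X with |X - P| = k * |X - E|,
# where k = pursuer_speed / evader_speed. Positions/speeds are illustrative.
def apollonius_circle(p, e, k):
    """Return (center, radius) of the circle |X - P| = k * |X - E|, k != 1."""
    k2 = k * k
    # expanding |X-P|^2 = k^2 |X-E|^2 and completing the square gives:
    cx = (p[0] - k2 * e[0]) / (1 - k2)
    cy = (p[1] - k2 * e[1]) / (1 - k2)
    r2 = cx * cx + cy * cy \
        - (p[0] ** 2 + p[1] ** 2 - k2 * (e[0] ** 2 + e[1] ** 2)) / (1 - k2)
    return (cx, cy), math.sqrt(r2)

# pursuer at the origin, evader one unit away, pursuer twice as fast
center, radius = apollonius_circle(p=(0.0, 0.0), e=(1.0, 0.0), k=2.0)
print(center, round(radius, 4))  # -> (1.3333..., 0.0) 0.6667
```

Every point on this circle is reached by pursuer and evader in equal time, so the circle bounds the region the evader can still reach first; overlapping the circles of several pursuers is what enables the encirclement in the paper's model.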
19. Implementation of the Agent Decision Module in Intelligent Electricity Market Simulation (Cited by: 14)
Authors: 陈皓勇, 杨彦, 张尧, 王野平, 荆朝霞, 陈青松. 《电力系统自动化》 (EI, CSCD, PKU Core), 2008, No. 20, pp. 22-26 (5 pages)
Abstract: Under day-ahead trading, bidding strategies are especially important for generation companies pursuing long-term maximum profit. The strategies they use are usually too complex to model with traditional game-theoretic methods. The Q-learning algorithm from reinforcement learning in artificial intelligence is an adaptive learning method that lets an agent learn from the experience gained through continuous interaction with its environment, making it suitable for intelligent electricity market simulation. This paper adds a Q-learning-based bidding decision module for generation company agents to AMES, an open-source intelligent simulation platform for electricity markets, and runs simulations on a 5-node test system. Experimental results show that agents making bidding decisions with Q-learning can simulate the economic characteristics of generation companies well, and under identical conditions exhibit stronger exploration capability than the original VRE-learning algorithm in AMES.
Keywords: intelligent agent simulation; bidding strategy; electricity auction market; Q-learning algorithm; VRE-learning algorithm
20. Research on Path Planning and Task Scheduling for Multiple AGVs (Cited by: 10)
Authors: 于会群, 王意乐, 黄贻海. 《上海电力大学学报》 (CAS), 2022, No. 1, pp. 89-93, 97 (6 pages)
Abstract: Automated sorting warehouses involve large numbers of sorting tasks and require multiple automated guided vehicles (AGVs) to assist workers in rapid sorting. To improve efficiency, with AGV battery levels guaranteed, the empty-load travel time and idle rate of the AGVs are taken as optimization objectives; collisions among the AGVs are analyzed for conflicts, and an improved Q-learning algorithm generates conflict-free transport paths. For the joint optimization of multi-AGV paths and scheduling, an improved genetic algorithm is proposed that selects individuals through elitism and roulette-wheel selection and evolves with adaptive crossover and mutation operators. Finally, simulations verify the effectiveness of the algorithm.
Keywords: multiple AGVs; path planning and task scheduling; Q-learning algorithm; improved genetic algorithm