Search Results - 维普 (VIP) Chinese Journal Service Platform


6 articles found


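1. Reinforcement Learning: Principles, Algorithms, and Applications (强化学习原理、算法及应用); Cited by: 19; Authors: Huang Bingqiang (黄炳强), Cao Guangyi (曹广益), Wang Zhanquan (王占全); Source: 《河北工业大学学报》 (Journal of Hebei University of Technology), CAS, 2006, No. 6, pp. 34-38 (5 pages); Abstract: Reinforcement learning (RL) grew out of animal learning theory. It requires no prior knowledge, acquires knowledge through continual interaction with the environment, and selects actions autonomously, which gives it the capacity for autonomous learning; it has received wide attention in behavior learning for autonomous robots. This paper surveys the basic principles of reinforcement learning, its various algorithms...; Keywords: reinforcement learning, TD algorithm, Q-learning, R-learning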

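2. A Zeroth-Level Classifier System Based on an Average-Reward Reinforcement Learning Algorithm (基于平均奖赏强化学习算法的零阶分类元系统); Cited by: 1; Authors: Zang Zhaoxiang (臧兆祥), Li Zhao (李昭), Wang Junying (王俊英), Dan Zhiping (但志平); Source: 《计算机工程与应用》 (Computer Engineering and Applications), CSCD, PKU Core (北大核心), 2016, No. 21, pp. 14-20, 48 (8 pages); Abstract: The zeroth-level classifier system (ZCS), a genetics-based machine learning technique, has shown practical value in solving multi-step learning problems. However, the standard ZCS uses discounted-reward reinforcement learning, which makes it hard to adapt to a wider range of...; Keywords: average-reward reinforcement learning, R-learning algorithm, learning classifier system (LCS), zeroth-level classifier system (ZCS), multi-step learning problems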

3. Incremental Multi Step R Learning; Authors: Hu Guanghua (胡光华), Wu Cangpu (吴沧浦); Source: 《Journal of Beijing Institute of Technology》, EI, CAS, 1999, No. 3, pp. 245-250 (6 pages); Abstract: Aim: To investigate the model-free multi-step average-reward reinforcement learning algorithm. Methods: By combining the R-learning algorithms with the temporal difference learning (TD(λ) learning) algorithm...; Keywords: reinforcement learning, average reward, R learning, Markov decision processes, temporal difference learning

4. Hierarchical annotation method for metal corrosion detection of power equipment; Authors: Zhang Baili, Cao Yong, Zhang Pei, Zhang Zhao, He Yina, Zhong Mingjun; Source: 《Journal of Southeast University (English Edition)》, EI, CAS, 2021, No. 4, pp. 350-355 (6 pages); Abstract: To solve the ambiguity and uncertainty in the labeling process of power equipment corrosion datasets, a novel hierarchical annotation method (HAM) is proposed. Firstly, large boxes are used to label a large area covering t...; Keywords: deep learning, Faster R-CNN, YOLOv5, object detection, hierarchical annotation

5. The Application of TPR method in English Classroom of primary school; Author: Yuan Xinhua; Source: 《International English Education Research》, 2014, No. 8, pp. 57-60 (4 pages); Abstract: This study is based on the conclusion demonstrated in Asher's studies that display oral practice with actions brings considerable effectiveness. TPR would be an appropriate and effective teaching method that will promot...; Keywords: total Physical Response strategy teaching children English Elementary English education. Language Action

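6. An Average-Reward Reinforcement Learning Algorithm Combined with Tile Coding (一种结合Tile Coding的平均奖赏强化学习算法); Authors: Wang Weiwei (王巍巍), Chen Xingguo (陈兴国), Gao Yang (高阳); Source: 《模式识别与人工智能》 (Pattern Recognition and Artificial Intelligence), EI, CSCD, PKU Core (北大核心), 2008, No. 4, pp. 446-452 (7 pages); Abstract: Average-reward reinforcement learning is an important class of undiscounted optimality frameworks within reinforcement learning, and most existing work addresses discrete domains. This paper attempts to combine average-reward reinforcement learning algorithms with function approximation to handle continuous state spaces and, following the change of state domain, correspondingly modifies in R-learning and G-learning the...; Keywords: reinforcement learning, Markov decision process (MDP), R-learning, G-learning, average reward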

No.	Title	Authors	Source	Year	Citations
1	Reinforcement Learning: Principles, Algorithms, and Applications	黄炳强, 曹广益, 王占全	《河北工业大学学报》 CAS	2006	19
2	A Zeroth-Level Classifier System Based on an Average-Reward Reinforcement Learning Algorithm	臧兆祥, 李昭, 王俊英, 但志平	《计算机工程与应用》 CSCD, PKU Core	2016	1
3	Incremental Multi Step R Learning	胡光华, 吴沧浦	《Journal of Beijing Institute of Technology》 EI, CAS	1999	0
4	Hierarchical annotation method for metal corrosion detection of power equipment	Zhang Baili, Cao Yong, Zhang Pei, Zhang Zhao, He Yina, Zhong Mingjun	《Journal of Southeast University (English Edition)》 EI, CAS	2021	0
5	The Application of TPR method in English Classroom of primary school	Yuan Xinhua	《International English Education Research》	2014	0
6	An Average-Reward Reinforcement Learning Algorithm Combined with Tile Coding	王巍巍, 陈兴国, 高阳	《模式识别与人工智能》 EI, CSCD, PKU Core	2008	0
