Journal Articles
9 articles found
1. Comparison of Students' Academic Performance in Mathematics Between Online and Offline Learning
Authors: Serkan Kaymak, Aliyeva Kalamkas. Economics World, 2021, No. 4, pp. 173-177 (5 pages).
The coronavirus has affected many areas of life, especially in the field of education. With the beginning of the pandemic, the transition to online learning began, which affected the development of students and teachers in terms of using innovative technologies and programs, such as Zoom, Webex, Discord, Google Meet, Moodle, EDX, Coursera, www.examus.network, etc. In this regard, many teachers are wondering whether the online method of teaching is as effective as the offline method. In this article, we focused on finding out whether there is a significant difference in student performance between online and offline modes of learning in the study of mathematics. 58 students were in a group where they studied online, and 58 students were in a group where they studied offline. The study involved first-year college students of Jambyl Innovative Higher College (JICH) in Taraz, Kazakhstan. The final control work was carried out at the end of week 18, which tested all areas covered by the topic in both groups. The average scores of students studying offline were compared with the average of students studying online. To avoid confusion, the researchers also conducted and analyzed an independent t-test. The results showed that there is a significant difference in the academic performance of students who study online and offline. The offline teaching method has proven to be more effective for improving students' understanding and comprehension of mathematics topics.
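The group comparison described in this abstract (two groups of 58 students, mean final scores, an independent t-test) can be sketched as follows; the score distributions below are hypothetical stand-ins, not the study's data:

```python
import numpy as np

def independent_t_test(a, b):
    """Pooled-variance independent two-sample t-test.

    Returns the t statistic for two groups, as used to compare mean
    final scores of the online and offline groups.
    """
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    na, nb = len(a), len(b)
    # Pooled variance assumes equal population variances in both groups.
    sp2 = ((na - 1) * a.var(ddof=1) + (nb - 1) * b.var(ddof=1)) / (na + nb - 2)
    return (a.mean() - b.mean()) / np.sqrt(sp2 * (1 / na + 1 / nb))

rng = np.random.default_rng(0)
offline_scores = rng.normal(78, 10, size=58)  # hypothetical score distributions,
online_scores = rng.normal(71, 10, size=58)   # not the study's data
t_stat = independent_t_test(offline_scores, online_scores)
```

A |t| larger than the critical value at the chosen significance level would indicate a significant difference between the two groups, as the abstract reports.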
Keywords: online learning, offline learning, achievement in mathematics
2. Boundary Data Augmentation for Offline Reinforcement Learning
Authors: SHEN Jiahao, JIANG Ke, TAN Xiaoyang. ZTE Communications, 2023, No. 3, pp. 29-36 (8 pages).
Offline reinforcement learning (ORL) aims to learn a rational agent purely from behavior data, without any online interaction. One of the major challenges encountered in ORL is the problem of distribution shift, i.e., the mismatch between the knowledge of the learned policy and the reality of the underlying environment. Recent works usually handle this in an overly pessimistic manner, avoiding out-of-distribution (OOD) queries as much as possible, but this can hurt the robustness of the agents at unseen states. In this paper, we propose a simple but effective method to address this issue. The key idea of our method is to enhance the robustness of the new policy learned offline by weakening its confidence in highly uncertain regions. We propose to find those regions by simulating them with modified Generative Adversarial Nets (GANs), such that the generated data not only follow the same distribution as the old experience but are also very difficult to deal with, with regard to the behavior policy or some other reference policy. We then use this information to regularize the ORL algorithm to penalize overconfident behavior in these regions. Extensive experiments on several publicly available offline RL benchmarks demonstrate the feasibility and effectiveness of the proposed method.
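The key idea, weakening value confidence in highly uncertain regions, can be illustrated as a penalized Bellman target; the additive penalty form, the `uncertainty` scores, and all names below are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def penalized_q_target(rewards, next_q, uncertainty, gamma=0.99, beta=1.0):
    """Bellman target with an uncertainty penalty (illustrative).

    `uncertainty` scores (e.g. from a generator flagging hard-to-handle
    states near the data distribution) reduce the bootstrapped value,
    weakening the learned policy's confidence in those regions.
    """
    return rewards + gamma * (next_q - beta * uncertainty)

targets = penalized_q_target(
    rewards=np.array([1.0, 0.5]),
    next_q=np.array([10.0, 10.0]),
    uncertainty=np.array([0.0, 5.0]),  # second state lies in an uncertain region
)
```

The second target is pulled down by the penalty, so the critic stops rewarding transitions into regions the generated data marked as uncertain.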
Keywords: offline reinforcement learning, out-of-distribution state, robustness, uncertainty
3. Deep reinforcement learning based multi-level dynamic reconfiguration for urban distribution network: a cloud-edge collaboration architecture (Cited by: 1)
Authors: Siyuan Jiang, Hongjun Gao, Xiaohui Wang, Junyong Liu, Kunyu Zuo. Global Energy Interconnection (EI, CAS, CSCD), 2023, No. 1, pp. 1-14 (14 pages).
With the construction of the power Internet of Things (IoT), communication between smart devices in urban distribution networks has been gradually moving towards high speed, high compatibility, and low latency, which provides reliable support for reconfiguration optimization in urban distribution networks. Thus, this study proposed a deep reinforcement learning based multi-level dynamic reconfiguration method for urban distribution networks in a cloud-edge collaboration architecture to obtain a real-time optimal multi-level dynamic reconfiguration solution. First, the multi-level dynamic reconfiguration method was discussed, covering the feeder, transformer, and substation levels. Subsequently, the multi-agent system was combined with the cloud-edge collaboration architecture to build a deep reinforcement learning model for multi-level dynamic reconfiguration in an urban distribution network. The cloud-edge collaboration architecture can effectively support the multi-agent system in the "centralized training and decentralized execution" operation mode and improve the learning efficiency of the model. Thereafter, for the multi-agent system, this study adopted a combination of offline and online learning to endow the model with the ability to automatically optimize and update its strategy. In the offline learning phase, a multi-agent conservative Q-learning (MACQL) algorithm based on Q-learning was proposed to stabilize the learning results and reduce the risk of the subsequent online learning phase. In the online learning phase, a multi-agent deep deterministic policy gradient (MADDPG) algorithm based on policy gradients was proposed to explore the action space and update the experience pool. Finally, the effectiveness of the proposed method was verified through a simulation analysis of a real-world 445-node system.
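The offline phase builds on conservative Q-learning; a minimal single-agent sketch of the CQL-style regularizer (illustrative names and values, not the paper's MACQL algorithm) is:

```python
import numpy as np

def cql_penalty(q_all_actions, q_data_actions, alpha=1.0):
    """Conservative Q-learning style regularizer (illustrative).

    Pushes down a log-sum-exp over all actions' Q-values while pushing up
    the Q-values of actions actually taken in the offline data, keeping
    the critic pessimistic about actions unseen in the dataset.
    """
    logsumexp = np.log(np.exp(q_all_actions).sum(axis=1))
    return alpha * (logsumexp - q_data_actions).mean()

q_table = np.array([[1.0, 2.0, 0.5],
                    [0.2, 0.1, 0.3]])  # Q-values of 3 actions in 2 states
q_taken = np.array([2.0, 0.3])         # Q of the logged (in-data) actions
penalty = cql_penalty(q_table, q_taken)
```

Adding this penalty to the ordinary Bellman loss yields the conservative critic that makes the later online (MADDPG-style) phase less risky to start from.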
Keywords: cloud-edge collaboration architecture, multi-agent deep reinforcement learning, multi-level dynamic reconfiguration, offline learning, online learning
4. A Practical Reinforcement Learning Framework for Automatic Radar Detection
Authors: YU Junpeng, CHEN Yiyu. ZTE Communications, 2023, No. 3, pp. 22-28 (7 pages).
At present, the parameters of radar detection rely heavily on manual adjustment and empirical knowledge, resulting in low automation. Traditional manual adjustment methods cannot meet the requirements of modern radars for high efficiency, high precision, and high automation. Therefore, it is necessary to explore a new intelligent radar control learning framework and technology to improve the capability and automation of radar detection. Reinforcement learning is popular in decision task learning, but the shortage of samples in radar control tasks makes it difficult to meet the requirements of reinforcement learning. To address the above issues, we propose a practical radar operation reinforcement learning framework, and integrate offline reinforcement learning and meta-reinforcement learning methods to alleviate the sample requirements of reinforcement learning. Experimental results show that our method can automatically perform as humans do in radar detection with real-world settings, thereby promoting the practical application of reinforcement learning in radar operation.
Keywords: meta-reinforcement learning, radar detection, reinforcement learning, offline reinforcement learning
5. Offline Reinforcement Learning with Constrained Hybrid Action Implicit Representation Towards Wargaming Decision-Making
Authors: Liwei Dong, Ni Li, Guanghong Gong, Xin Lin. Tsinghua Science and Technology (SCIE, EI, CAS, CSCD), 2024, No. 5, pp. 1422-1440 (19 pages).
Reinforcement Learning (RL) has emerged as a promising data-driven solution for wargaming decision-making. However, two domain challenges still exist: (1) dealing with discrete-continuous hybrid wargaming control and (2) accelerating RL deployment with rich offline data. Existing RL methods fail to handle these two issues simultaneously, so we propose a novel offline RL method targeting the hybrid action space. A new constrained action representation technique is developed to build a bidirectional mapping between the original hybrid action space and a latent space in a semantically consistent way. This allows learning a continuous latent policy with offline RL, with better exploration feasibility and scalability, and reconstructing it back into the needed hybrid policy. Critically, a novel offline RL optimization objective with adaptively adjusted constraints is designed to balance the alleviation and generalization of out-of-distribution actions. Our method demonstrates superior performance and generality across different tasks, particularly in typical realistic wargaming scenarios.
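The bidirectional hybrid-action mapping can be illustrated with a trivial hand-coded stand-in; the paper learns this mapping with a constrained model, so the one-hot encode/decode below only demonstrates the round-trip property, and all names are hypothetical:

```python
import numpy as np

N_DISCRETE = 4  # hypothetical number of discrete action types

def encode(discrete, continuous):
    """Map a hybrid action (discrete id + continuous params) to one latent vector."""
    one_hot = np.zeros(N_DISCRETE)
    one_hot[discrete] = 1.0
    return np.concatenate([one_hot, np.asarray(continuous, dtype=float)])

def decode(latent):
    """Reconstruct the hybrid action from the latent vector."""
    discrete = int(np.argmax(latent[:N_DISCRETE]))
    return discrete, latent[N_DISCRETE:]

z = encode(2, [0.5, -1.0])   # hybrid action -> continuous latent point
d, c = decode(z)             # latent point -> hybrid action, losslessly here
```

Because every latent point decodes to a valid hybrid action, a purely continuous offline RL policy can be trained in the latent space and mapped back, which is the structural trick the abstract describes.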
Keywords: offline reinforcement learning (RL), wargaming, decision-making, hybrid action space
6. OSCAR: OOD State-Conservative Offline Reinforcement Learning for Sequential Decision Making
Authors: Yi Ma, Chao Wang, Chen Chen, Jinyi Liu, Zhaopeng Meng, Yan Zheng, Jianye Hao. CAAI Artificial Intelligence Research, 2023, No. 1, pp. 91-101 (11 pages).
Offline reinforcement learning (RL) is a data-driven learning paradigm for sequential decision making. Mitigating the overestimation of values originating from out-of-distribution (OOD) states, induced by the distribution shift between the learning policy and the previously collected offline dataset, lies at the core of offline RL. To tackle this problem, some methods underestimate the values of states given by learned dynamics models, or of state-action pairs with actions sampled from policies different from the behavior policy. However, since these generated states or state-action pairs are not guaranteed to be OOD, staying conservative on them may adversely affect the in-distribution ones. In this paper, we propose an OOD state-conservative offline RL method (OSCAR), which addresses this limitation by explicitly generating reliable OOD states located near the manifold of the offline dataset, and then designs a conservative policy evaluation approach that combines the vanilla Bellman error with a regularization term that only underestimates the values of these generated OOD states. In this way, we prevent the value errors of OOD states from propagating to in-distribution states through value bootstrapping and policy improvement. We also theoretically prove that the proposed conservative policy evaluation approach is guaranteed to underestimate the values of OOD states. OSCAR, along with several strong baselines, is evaluated on the offline decision-making benchmark D4RL and the autonomous driving benchmark SMARTS. Experimental results show that OSCAR outperforms the baselines on a large portion of the benchmarks and attains the highest average return, substantially outperforming existing offline RL methods.
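A minimal sketch of OSCAR-style conservative policy evaluation follows, assuming a Gaussian perturbation as a stand-in for the learned near-manifold OOD-state generator; the names, noise scale, and toy critic are all illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)

def conservative_eval_loss(q_fn, states, targets, ood_noise=0.1, beta=0.5):
    """Vanilla Bellman error plus a term that pushes down Q at
    perturbed (near-manifold, assumed-OOD) states.

    The Gaussian perturbation stands in for OSCAR's learned OOD-state
    generator; only the generated states are penalized, so in-distribution
    values are left untouched.
    """
    bellman = np.mean((q_fn(states) - targets) ** 2)
    ood_states = states + rng.normal(0.0, ood_noise, size=states.shape)
    regularizer = np.mean(q_fn(ood_states))  # underestimate generated OOD values
    return bellman + beta * regularizer

q_fn = lambda s: s.sum(axis=1)            # toy linear critic
states = np.array([[1.0, 2.0], [3.0, 4.0]])
targets = q_fn(states)                     # zero Bellman error for the toy critic
loss = conservative_eval_loss(q_fn, states, targets)
```

Minimizing this loss lowers the critic only at the generated OOD states, which is how the method keeps value errors from bootstrapping back into the dataset's support.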
Keywords: offline reinforcement learning, out-of-distribution, decision making
7. Offline Pre-trained Multi-agent Decision Transformer (Cited by: 2)
Authors: Linghui Meng, Muning Wen, Chenyang Le, Xiyun Li, Dengpeng Xing, Weinan Zhang, Ying Wen, Haifeng Zhang, Jun Wang, Yaodong Yang, Bo Xu. Machine Intelligence Research (EI, CSCD), 2023, No. 2, pp. 233-248 (16 pages).
Offline reinforcement learning leverages previously collected offline datasets to learn optimal policies with no need to access the real environment. Such a paradigm is also desirable for multi-agent reinforcement learning (MARL) tasks, given the combinatorially increased interactions among agents and with the environment. However, in MARL, the paradigm of offline pre-training with online fine-tuning has not been studied, and no datasets or benchmarks for offline MARL research are available. In this paper, we facilitate the research by providing large-scale datasets and using them to examine the usage of the decision transformer in the context of MARL. We investigate the generalization of MARL offline pre-training in the following three aspects: 1) between single agents and multiple agents, 2) from offline pre-training to online fine-tuning, and 3) to multiple downstream tasks with few-shot and zero-shot capabilities. We start by introducing the first offline MARL dataset with diverse quality levels based on the StarCraft II environment, and then propose the novel architecture of the multi-agent decision transformer (MADT) for effective offline learning. MADT leverages the transformer's ability for sequence modelling and integrates it seamlessly with both offline and online MARL tasks. A significant benefit of MADT is that it learns generalizable policies that can transfer between different types of agents under different task scenarios. On the StarCraft II offline dataset, MADT outperforms state-of-the-art offline reinforcement learning (RL) baselines, including BCQ and CQL. When applied to online tasks, the pre-trained MADT significantly improves sample efficiency and enjoys strong performance in both few-shot and zero-shot cases. To the best of our knowledge, this is the first work that studies and demonstrates the effectiveness of offline pre-trained models in terms of sample efficiency and generalizability enhancements for MARL.
Keywords: pre-training model, multi-agent reinforcement learning (MARL), decision making, transformer, offline reinforcement learning
8. On Learning Adaptive Service Compositions (Cited by: 1)
Authors: Ahmed Moustafa. Journal of Systems Science and Systems Engineering (SCIE, EI, CSCD), 2021, No. 4, pp. 465-481 (17 pages).
Service composition is an important and effective technique that enables atomic services to be combined together to form a more powerful service, i.e., a composite service. With the pervasiveness of the Internet and the proliferation of interconnected computing devices, it is essential that service composition embraces an adaptive service provisioning perspective. Reinforcement learning has emerged as a powerful tool to compose and adapt Web services in open and dynamic environments. However, the most common applications of reinforcement learning algorithms are relatively inefficient in their use of interaction experience data, which may affect the stability of the learning process when deployed to cloud environments. In particular, they make just one learning update for each interaction experience. This paper introduces a novel approach that aims to achieve greater data efficiency by saving the experience data and using it in aggregate to make updates to the learned policy. The proposed approach devises an offline learning scheme for cloud service composition in which the online learning task is transformed into a series of supervised learning tasks. A set of algorithms is proposed under this scheme in order to facilitate and empower efficient service composition in the cloud under various policies and different scenarios. The results of our experiments show the effectiveness of the proposed approach for composing and adapting cloud services, especially under dynamic environment settings, compared to its online learning counterparts.
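Turning online learning into a series of supervised tasks over saved experience resembles batch fitted Q-iteration, where each sweep regresses onto targets computed from the whole dataset; the tabular sketch below is illustrative, not the paper's algorithms:

```python
import numpy as np

def fitted_q_iteration(transitions, n_states, n_actions, gamma=0.9, iters=200):
    """Batch (offline) Q-learning sketch.

    Each sweep turns the saved experience into one supervised-style
    regression target, instead of a single online update per interaction.
    `transitions` is a list of (state, action, reward, next_state) tuples,
    e.g. logged service-invocation outcomes (hypothetical here).
    """
    q = np.zeros((n_states, n_actions))
    for _ in range(iters):
        new_q = q.copy()
        for s, a, r, s2 in transitions:   # aggregate use of all saved data
            new_q[s, a] = r + gamma * q[s2].max()
        q = new_q
    return q

# Two states, two actions; action 1 in state 0 reaches the rewarding outcome.
data = [(0, 0, 0.0, 0), (0, 1, 1.0, 1), (1, 0, 0.0, 0), (1, 1, 0.0, 0)]
q = fitted_q_iteration(data, n_states=2, n_actions=2)
policy = q.argmax(axis=1)
```

Because every sweep reuses all logged transitions, the same experience contributes to many updates, which is the data-efficiency gain the abstract claims over one-update-per-interaction online learning.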
Keywords: service composition, reinforcement learning, cloud services, offline learning
9. Deep Reinforcement Learning with Fuse Adaptive Weighted Demonstration Data
Authors: Baofu Fang, Taifeng Guo. 国际计算机前沿大会会议论文集, 2022, No. 1, pp. 163-177 (15 pages).
Traditional multi-agent deep reinforcement learning has difficulty obtaining rewards, converges slowly, and struggles to achieve effective cooperation among agents in the pre-training period, due to the large joint state space and sparse action rewards. Therefore, this paper discusses the role of demonstration data in multi-agent systems and proposes a multi-agent deep reinforcement learning algorithm with adaptive weighted fusion of demonstration data. The algorithm sets the weights according to performance and uses the importance sampling method to bridge the deviation in the mixed sampled data, combining the expert data obtained in the simulation environment with a distributed multi-agent reinforcement learning algorithm to solve the difficult problem of global exploration and improve the convergence speed of the algorithm. The results in the RoboCup2D soccer simulation environment show that the algorithm improves the ability of the agents to hold and shoot the ball, enabling them to achieve a higher goal-scoring rate and faster convergence relative to demonstration policies and mainstream multi-agent reinforcement learning algorithms.
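The importance-sampling correction for mixed demonstration and agent data can be sketched as clipped per-sample probability ratios; the probabilities and the clip value below are illustrative assumptions:

```python
import numpy as np

def importance_weights(agent_probs, behavior_probs, clip=10.0):
    """Per-sample importance ratios pi(a|s) / mu(a|s).

    These correct the bias introduced by mixing expert demonstrations
    with the agent's own replay data: actions over-represented by the
    behavior (demonstration) policy are down-weighted, under-represented
    ones up-weighted. Clipping bounds the variance of the update.
    """
    w = np.asarray(agent_probs, dtype=float) / np.asarray(behavior_probs, dtype=float)
    return np.clip(w, 0.0, clip)

w = importance_weights(agent_probs=[0.6, 0.1, 0.9],
                       behavior_probs=[0.3, 0.5, 0.05])
```

Multiplying each sampled transition's loss by its weight yields an update that behaves as if the data had been drawn from the current policy, which is what makes mixing demonstration data into the replay buffer sound.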
Keywords: multi-agent deep reinforcement learning, exploration, offline reinforcement learning, importance sampling