13 articles found
Learning to Branch in Combinatorial Optimization With Graph Pointer Networks
1
Authors: Rui Wang, Zhiming Zhou, Kaiwen Li, Tao Zhang, Ling Wang, Xin Xu, Xiangke Liao. IEEE/CAA Journal of Automatica Sinica (SCIE, EI, CSCD), 2024, Issue 1, pp. 157-169 (13 pages)
Traditional expert-designed branching rules in branch-and-bound (B&B) are static, often failing to adapt to diverse and evolving problem instances. Crafting these rules is labor-intensive, and may not scale well with complex problems. Given the frequent need to solve varied combinatorial optimization problems, leveraging statistical learning to auto-tune B&B algorithms for specific problem classes becomes attractive. This paper proposes a graph pointer network model to learn the branching rules. Graph features, global features and historical features are designated to represent the solver state. The graph neural network processes graph features, while the pointer mechanism assimilates the global and historical features to finally determine the variable on which to branch. The model is trained to imitate the expert strong branching rule by a tailored top-k Kullback-Leibler divergence loss function. Experiments on a series of benchmark problems demonstrate that the proposed approach significantly outperforms the widely used expert-designed branching rules. It also outperforms state-of-the-art machine-learning-based branch-and-bound methods in terms of solving speed and search tree size on all the test instances. In addition, the model can generalize to unseen instances and scale to larger instances.
Keywords: branch-and-bound (B&B); combinatorial optimization; deep learning; graph neural network; imitation learning
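The abstract above mentions a tailored top-k Kullback-Leibler divergence loss but does not spell out its form. The following is a minimal sketch of one plausible reading, not the authors' definition: it assumes the loss is KL(expert || model) computed only over the k candidate variables that the expert (strong branching) scores highest, with both score vectors renormalized by a softmax over that subset. The names `top_k_kl_loss`, `expert_scores`, and `model_logits` are hypothetical.

```python
import math


def log_softmax(scores):
    """Numerically stable log-softmax over a list of floats."""
    m = max(scores)
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - log_z for s in scores]


def top_k_kl_loss(expert_scores, model_logits, k):
    """KL(expert || model) restricted to the expert's k best candidates.

    Assumption: both distributions are softmax-renormalized over the
    selected subset; the paper's actual loss may differ.
    """
    if len(expert_scores) != len(model_logits):
        raise ValueError("score lists must have the same length")
    if not 1 <= k <= len(expert_scores):
        raise ValueError("k must be between 1 and the number of candidates")
    # Indices of the k candidates with the highest expert score.
    top = sorted(range(len(expert_scores)),
                 key=lambda i: expert_scores[i], reverse=True)[:k]
    log_p = log_softmax([expert_scores[i] for i in top])
    log_q = log_softmax([model_logits[i] for i in top])
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))
```

Under this reading, the loss is zero when the model reproduces the expert's relative scores on the top-k candidates, and it ignores how the model ranks the remaining variables.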
NOMA-Based Energy-Efficient Task Scheduling in Vehicular Edge Computing Networks: A Self-Imitation Learning-Based Approach (cited by 8)
2
Authors: Peiran Dong, Zhaolong Ning, Rong Ma, Xiaojie Wang, Xiping Hu, Bin Hu. China Communications (SCIE, CSCD), 2020, Issue 11, pp. 1-11 (11 pages)
Mobile Edge Computing (MEC) is promising to alleviate the computation and storage burdens for terminals in wireless networks. The huge energy consumption of MEC servers challenges the establishment of smart cities and their service time powered by rechargeable batteries. In addition, the Orthogonal Multiple Access (OMA) technique cannot utilize limited spectrum resources fully and efficiently. Therefore, Non-Orthogonal Multiple Access (NOMA)-based energy-efficient task scheduling among MEC servers for delay-constrained mobile applications is important, especially in highly dynamic vehicular edge computing networks. The various movement patterns of vehicles lead to unbalanced offloading requirements and different load pressure for MEC servers. Self-Imitation Learning (SIL)-based Deep Reinforcement Learning (DRL) has emerged as a promising machine learning technique to break through obstacles in various research fields, especially in time-varying networks. In this paper, we first introduce related MEC technologies in vehicular networks. Then, we propose an energy-efficient approach for task scheduling in vehicular edge computing networks based on DRL, with the purpose of both guaranteeing the task latency requirement for multiple users and minimizing total energy consumption of MEC servers. Numerical results demonstrate that the proposed algorithm outperforms other methods.
Keywords: NOMA; energy-efficient scheduling; vehicular edge computing; imitation learning
Deep Imitation Learning for Autonomous Vehicles Based on Convolutional Neural Networks (cited by 10)
3
Authors: Parham M. Kebria, Abbas Khosravi, Syed Moshfeq Salaken, Saeid Nahavandi. IEEE/CAA Journal of Automatica Sinica (EI, CSCD), 2020, Issue 1, pp. 82-95 (14 pages)
Providing autonomous systems with an effective quantity and quality of information from a desired task is challenging. In particular, autonomous vehicles must have a reliable vision of their workspace to robustly accomplish driving functions. Speaking of machine vision, deep learning techniques, and specifically convolutional neural networks, have been proven to be the state-of-the-art technology in the field. As these networks typically involve millions of parameters and elements, designing an optimal architecture for deep learning structures is a difficult task which is globally under investigation by researchers. This study experimentally evaluates the impact of three major architectural properties of convolutional networks, including the number of layers, filters, and filter size on their performance. In this study, several models with different properties are developed, equally trained, and then applied to an autonomous car in a realistic simulation environment. A new ensemble approach is also proposed to calculate and update weights for the models regarding their mean squared error values. Based on design properties, performance results are reported and compared for further investigations. Surprisingly, the number of filters itself does not largely affect the performance efficiency. As a result, proper allocation of filters with different kernel sizes through the layers introduces a considerable improvement in the performance. Achievements of this study will provide the researchers with a clear clue and direction in designing optimal network architectures for deep learning purposes.
Keywords: autonomous vehicles; convolutional neural networks; deep learning; imitation learning
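The abstract above describes an ensemble whose model weights are calculated and updated from mean squared error values, without giving the rule. As an illustration only, the sketch below assumes inverse-MSE weighting (lower error, larger weight), which is a common choice and not necessarily the scheme used in the paper. The function names and the scalar predictions (for example, a steering command per model) are hypothetical.

```python
def inverse_mse_weights(mse_values, eps=1e-8):
    """Return normalized weights proportional to 1 / MSE.

    Assumption: inverse-error weighting; eps avoids division by zero
    for a model with zero validation error.
    """
    inverse = [1.0 / (m + eps) for m in mse_values]
    total = sum(inverse)
    return [v / total for v in inverse]


def ensemble_predict(predictions, weights):
    """Weighted average of one scalar prediction per model."""
    return sum(w * p for w, p in zip(weights, predictions))
```

Recomputing the weights whenever fresh error estimates arrive gives the "update" step; a model whose error grows is automatically down-weighted.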
Knowledge Distillation for Mobile Edge Computation Offloading
4
Authors: CHEN Haowei, ZENG Liekang, YU Shuai, CHEN Xu. ZTE Communications, 2020, Issue 2, pp. 40-48 (9 pages)
Edge computation offloading allows mobile end devices to execute compute-intensive tasks on edge servers. End devices can decide whether the tasks are offloaded to edge servers, cloud servers or executed locally according to current network condition and devices' profiles in an online manner. In this paper, we propose an edge computation offloading framework based on deep imitation learning (DIL) and knowledge distillation (KD), which assists end devices to quickly make fine-grained decisions to optimize the delay of computation tasks online. We formalize a computation offloading problem into a multi-label classification problem. Training samples for our DIL model are generated in an offline manner. After the model is trained, we leverage KD to obtain a lightweight DIL model, by which we further reduce the model's inference delay. Numerical experiments show that the offloading decisions made by our model not only outperform those made by other related policies in latency metric, but also have the shortest inference delay among all policies.
Keywords: mobile edge computation offloading; deep imitation learning; knowledge distillation
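To make the knowledge distillation step in the abstract above concrete, here is a minimal sketch of a distillation loss for a multi-label classifier. It is an assumption-laden illustration, not the paper's formulation: it combines binary cross-entropy against the hard labels with binary cross-entropy against the teacher's temperature-softened sigmoid outputs. The names `distillation_loss`, `temperature`, and `alpha` are hypothetical.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def bce(target, prob, eps=1e-12):
    """Binary cross-entropy for one label; target may be soft (in [0, 1])."""
    prob = min(max(prob, eps), 1.0 - eps)
    return -(target * math.log(prob) + (1.0 - target) * math.log(1.0 - prob))


def distillation_loss(student_logits, teacher_logits, hard_labels,
                      temperature=2.0, alpha=0.5):
    """alpha * hard-label loss + (1 - alpha) * soft-target loss.

    Assumption: per-label sigmoid outputs (multi-label setting), with
    teacher and student logits both divided by the same temperature.
    """
    n = len(student_logits)
    hard = sum(bce(y, sigmoid(s))
               for y, s in zip(hard_labels, student_logits)) / n
    soft = sum(bce(sigmoid(t / temperature), sigmoid(s / temperature))
               for t, s in zip(teacher_logits, student_logits)) / n
    return alpha * hard + (1.0 - alpha) * soft
```

A small student network trained on this loss can inherit the teacher's per-label confidence as well as its decisions, which is the usual motivation for distilling into a lightweight model.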
Imitation Learning Based Real-time Decision-making of Microgrid Economic Dispatch Under Multiple Uncertainties
5
Authors: Wei Dong, Fan Zhang, Meng Li, Xiaolun Fang, Qiang Yang. Journal of Modern Power Systems and Clean Energy (SCIE, EI, CSCD), 2024, Issue 4, pp. 1183-1193 (11 pages)
The intermittency of renewable energy generation, variability of load demand, and stochasticity of market price bring about direct challenges to optimal energy management of microgrids. To cope with these different forms of operation uncertainties, an imitation learning based real-time decision-making solution for microgrid economic dispatch is proposed. In this solution, the optimal dispatch trajectories obtained by solving the optimization problem using historical deterministic operation patterns are demonstrated as the expert samples for imitation learning. To improve the generalization performance of imitation learning and the expressive ability of uncertain variables, a hybrid model combining unsupervised and supervised learning is utilized. The denoising autoencoder based unsupervised learning model is adopted to enhance the feature extraction of operation patterns. Furthermore, the long short-term memory network based supervised learning model is used to efficiently characterize the mapping between the input space, composed of the extracted operation patterns and system state variables, and the output space, composed of the optimal dispatch trajectories. The numerical simulation results demonstrate that under various operation uncertainties, the operation cost achieved by the proposed solution is close to the minimum theoretical value. Compared with the traditional model predictive control method and basic clone imitation learning method, the operation cost of the proposed solution is reduced by 6.3% and 2.8%, respectively, over a test period of three months.
Keywords: energy management; imitation learning; data-driven decision; economic dispatch
Robot learning from demonstration for path planning: A review (cited by 7)
6
Authors: XIE ZongWu, ZHANG Qi, JIANG ZaiNan, LIU Hong. Science China (Technological Sciences) (SCIE, EI, CAS, CSCD), 2020, Issue 8, pp. 1325-1334 (10 pages)
Learning from demonstration (LfD) is an appealing method of helping robots learn new skills. Numerous papers have presented methods of LfD with good performance in robotics. However, complicated robot tasks that need to carefully regulate path planning strategies remain unanswered. Contact or non-contact constraints in specific robot tasks make the path planning problem more difficult, as the interaction between the robot and the environment is time-varying. In this paper, we focus on the path planning of complex robot tasks in the domain of LfD and give a novel perspective for classifying imitation learning and inverse reinforcement learning. This classification is based on constraints and obstacle avoidance. Finally, we summarize these methods and present promising directions for robot application and LfD theory.
Keywords: learning from demonstration; path planning; imitation learning; inverse reinforcement learning; obstacle avoidance
A Data-driven Method for Fast AC Optimal Power Flow Solutions via Deep Reinforcement Learning (cited by 8)
7
Authors: Yuhao Zhou, Bei Zhang, Chunlei Xu, Tu Lan, Ruisheng Diao, Di Shi, Zhiwei Wang, Wei-Jen Lee. Journal of Modern Power Systems and Clean Energy (SCIE, EI, CSCD), 2020, Issue 6, pp. 1128-1139 (12 pages)
With the increasing penetration of renewable energy, power grid operators are observing both fast and large fluctuations in power and voltage profiles on a daily basis. Fast and accurate control actions derived in real time are vital to ensure system security and economics. To this end, solving alternating current (AC) optimal power flow (OPF) with operational constraints remains an important yet challenging optimization problem for secure and economic operation of the power grid. This paper adopts a novel method to derive fast OPF solutions using a state-of-the-art deep reinforcement learning (DRL) algorithm, which can greatly assist power grid operators in making rapid and effective decisions. The presented method adopts imitation learning to generate initial weights for the neural network (NN), and a proximal policy optimization algorithm to train and test stable and robust artificial intelligence (AI) agents. Training and testing procedures are conducted on the IEEE 14-bus and the Illinois 200-bus systems. The results show the effectiveness of the method with significant potential for assisting power grid operators in real-time operations.
Keywords: alternating current (AC) optimal power flow (OPF); deep reinforcement learning (DRL); imitation learning; proximal policy optimization
Joint Entity and Event Extraction with Generative Adversarial Imitation Learning (cited by 11)
8
Authors: Tongtao Zhang, Heng Ji, Avirup Sil. Data Intelligence, 2019, Issue 2, pp. 99-120 (22 pages)
We propose a new framework for entity and event extraction based on generative adversarial imitation learning, an inverse reinforcement learning method using a generative adversarial network (GAN). We assume that instances and labels yield to various extents of difficulty and the gains and penalties (rewards) are expected to be diverse. We utilize discriminators to estimate proper rewards according to the difference between the labels committed by the ground-truth (expert) and the extractor (agent). Our experiments demonstrate that the proposed framework outperforms state-of-the-art methods.
Keywords: information extraction; event extraction; imitation learning; generative adversarial network
Mapless navigation for UAVs via reinforcement learning from demonstrations (cited by 1)
9
Authors: YANG JiaNan, LU ShengAo, HAN MingHao, LI YunPeng, MA YuTing, LIN ZeFeng, LI HaoWei. Science China (Technological Sciences) (SCIE, EI, CAS, CSCD), 2023, Issue 5, pp. 1263-1270 (8 pages)
This paper is concerned with the problems of mapless navigation for unmanned aerial vehicles in scenarios with limited sensor accuracy and computing capability. A novel learning-based algorithm called soft actor-critic from demonstrations (SACfD) is proposed, integrating reinforcement learning with imitation learning. Specifically, the maximum entropy reinforcement learning framework is introduced to enhance the exploration capability of the algorithm, upon which the paper explores a way to sufficiently leverage demonstration data to significantly accelerate the convergence rate while improving policy performance reliably. Further, the proposed algorithm enables an implementation of mapless navigation for unmanned aerial vehicles, and experimental results show that it outperforms the existing algorithms.
Keywords: autonomous navigation; reinforcement learning; imitation learning; path planning
Reinforcement learning building control approach harnessing imitation learning (cited by 3)
10
Authors: Sourav Dey, Thibault Marzullo, Xiangyu Zhang, Gregor Henze. Energy and AI, 2023, Issue 4, pp. 60-72 (13 pages)
Reinforcement learning (RL) has shown significant success in sequential decision making in fields like autonomous vehicles, robotics, marketing and gaming industries. This success has attracted attention to the RL control approach for building energy systems, which are becoming complicated due to the need to optimize for multiple, potentially conflicting, goals like occupant comfort, energy use and grid interactivity. However, for real-world applications, RL has several drawbacks, like requiring large training data and time, and unstable control behavior during the early exploration process, making it infeasible for direct application to building control tasks. To address these issues, an imitation learning approach is utilized herein where the RL agent starts with a policy transferred from accepted rule-based policies and heuristic policies. This approach is successful in reducing the training time, preventing the unstable early exploration behavior and improving upon an accepted rule-based policy; all of these make RL a more practical control approach for real-world applications in the domain of building controls.
Keywords: reinforcement learning; building controls; imitation learning; artificial intelligence
GACS: Generative Adversarial Imitation Learning Based on Control Sharing (cited by 1)
11
Authors: Huaiwei SI, Guozhen TAN, Dongyu LI, Yanfei PENG. Journal of Systems Science and Information (CSCD), 2023, Issue 1, pp. 78-93 (16 pages)
Generative adversarial imitation learning (GAIL) directly imitates the behavior of experts from human demonstration instead of designing explicit reward signals like reinforcement learning. Meanwhile, GAIL overcomes the defects of traditional imitation learning by using a generative adversarial network framework and shows excellent performance in many fields. However, GAIL directly acts on immediate rewards, a feature that is reflected in the value function after a period of accumulation. Thus, when faced with complex practical problems, the learning efficiency of GAIL is often extremely low and the policy may be slow to learn. One way to solve this problem is to directly guide the action (policy) in the agents' learning process, such as the control sharing (CS) method. This paper combines reinforcement learning and imitation learning and proposes a novel GAIL framework called generative adversarial imitation learning based on control sharing policy (GACS). GACS learns model constraints from expert samples and uses adversarial networks to guide learning directly. The actions are produced by adversarial networks and are used to optimize the policy and effectively improve learning efficiency. Experiments in the autonomous driving environment and the real-time strategy game Breakout show that GACS has better generalization capabilities, imitates the behavior of experts more efficiently, and can learn better policies relative to other frameworks.
Keywords: generative adversarial imitation learning; reinforcement learning; control sharing; deep reinforcement learning
Heterogeneous multi-player imitation learning
12
Authors: Bosen Lian, Wenqian Xue, Frank L. Lewis. Control Theory and Technology (EI, CSCD), 2023, Issue 3, pp. 281-291 (11 pages)
This paper studies imitation learning in nonlinear multi-player game systems with heterogeneous control input dynamics. We propose a model-free data-driven inverse reinforcement learning (RL) algorithm for a learner to find the cost functions of an N-player Nash expert system given the expert's states and control inputs. This allows us to address the imitation learning problem without prior knowledge of the expert's system dynamics. To achieve this, we provide a basic model-based algorithm that is built upon RL and inverse optimal control. This serves as the foundation for our final model-free inverse RL algorithm, which is implemented via neural network-based value function approximators. Theoretical analysis and simulation examples verify the methods.
Keywords: imitation learning; inverse reinforcement learning; heterogeneous multi-player games; data-driven model-free control
Learning the optimal state-feedback via supervised imitation learning
13
Authors: Dharmesh Tailor, Dario Izzo. Astrodynamics (CSCD), 2019, Issue 4, pp. 361-374 (14 pages)
Imitation learning is a control design paradigm that seeks to learn a control policy reproducing demonstrations from expert agents. By substituting expert demonstrations for optimal behaviours, the same paradigm leads to the design of control policies closely approximating the optimal state-feedback. This approach requires training a machine learning algorithm (in our case deep neural networks) directly on state-control pairs originating from optimal trajectories. We have shown in previous work that, when restricted to low-dimensional state and control spaces, this approach is very successful in several deterministic, non-linear problems in continuous-time. In this work, we refine our previous studies using as a test case a simple quadcopter model with quadratic and time-optimal objective functions. We describe in detail the best learning pipeline we have developed, that is able to approximate via deep neural networks the state-feedback map to a very high accuracy. We introduce the use of the softplus activation function in the hidden units of neural networks showing that it results in a smoother control profile whilst retaining the benefits of rectifiers. We show how to evaluate the optimality of the trained state-feedback, and find that already with two layers the objective function reached and its optimal value differ by less than one percent. We later consider also an additional metric linked to the system asymptotic behaviour: the time taken to converge to the policy's fixed point. With respect to these metrics, we show that improvements in the mean absolute error do not necessarily correspond to better policies.
Keywords: optimal control; deep learning; imitation learning; G&CNET
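The softplus activation mentioned in the abstract above has the standard definition softplus(x) = ln(1 + e^x), a smooth approximation of the rectifier max(0, x). The sketch below shows a numerically stable way to evaluate it next to the rectifier for comparison; the function names are illustrative and nothing here is taken from the paper's code.

```python
import math


def relu(x):
    """Rectifier: max(0, x), with a kink at x = 0."""
    return max(0.0, x)


def softplus(x):
    """Stable evaluation of ln(1 + exp(x)).

    Uses the identity ln(1 + e^x) = max(x, 0) + ln(1 + e^(-|x|)),
    which avoids overflow for large positive x.
    """
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))
```

Softplus tracks the rectifier closely away from the origin but is differentiable everywhere, which is consistent with the smoother control profile reported in the abstract.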