Funding: Project supported by the National Natural Science Foundation of China (No. 61836011), the Youth Innovation Promotion Association of the Chinese Academy of Sciences (No. 2018497), and the GPU cluster built by the MCC Lab of Information Science and Technology Institution, USTC, China.
Abstract: Multi-agent reinforcement learning is difficult to apply in practice, partly because of the gap between simulated and real-world scenarios. One reason for the gap is that simulated systems assume that agents work normally all the time, while in practice one or more agents may unexpectedly "crash" during the coordination process due to inevitable hardware or software failures. Such crashes destroy the cooperation among agents and lead to performance degradation. In this work, we present a formal conceptualization of a cooperative multi-agent reinforcement learning system with unexpected crashes. To enhance the robustness of the system to crashes, we propose a coach-assisted multi-agent reinforcement learning framework that introduces a virtual coach agent to adjust the crash rate during training. We have designed three coaching strategies (fixed crash rate, curriculum learning, and adaptive crash rate) and a re-sampling strategy for our coach agent. To our knowledge, this work is the first to study unexpected crashes in a multi-agent system. Extensive experiments on grid-world and StarCraft II micromanagement tasks demonstrate the efficacy of the adaptive strategy compared with the fixed crash rate and curriculum learning strategies. The ablation study further illustrates the effectiveness of our re-sampling strategy.
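The three coaching strategies named in the abstract can be illustrated with a minimal sketch. The abstract does not give the update rules, so everything here is an assumption made for illustration: the function and class names, the linear curriculum schedule, the performance threshold, and the step size `delta` are hypothetical and are not taken from the paper. The re-sampling strategy is not sketched because the abstract does not describe it.

```python
import random


def fixed_rate(step, total_steps, rate=0.1):
    """Fixed strategy: the crash rate never changes during training."""
    return rate


def curriculum_rate(step, total_steps, max_rate=0.3):
    """Curriculum strategy (assumed linear schedule): ramp from 0 to max_rate."""
    return max_rate * min(step / total_steps, 1.0)


class AdaptiveCoach:
    """Adaptive strategy (assumed rule): raise the crash rate only when the
    agents' recent performance reaches a threshold."""

    def __init__(self, threshold=0.8, delta=0.05, max_rate=0.5):
        self.threshold = threshold  # hypothetical performance level to reach
        self.delta = delta          # hypothetical increment of the crash rate
        self.max_rate = max_rate    # hypothetical upper bound on the crash rate
        self.rate = 0.0

    def update(self, performance):
        # Make training harder only once the current difficulty is handled.
        if performance >= self.threshold:
            self.rate = min(self.rate + self.delta, self.max_rate)
        return self.rate


def sample_crashes(n_agents, rate, rng):
    """Decide independently for each agent whether it crashes this episode."""
    return [rng.random() < rate for _ in range(n_agents)]


if __name__ == "__main__":
    rng = random.Random(0)
    coach = AdaptiveCoach()
    for performance in (0.9, 0.5, 0.85):
        rate = coach.update(performance)
        print(rate, sample_crashes(4, rate, rng))
```

In this sketch the fixed and curriculum schedules depend only on the training step, while the adaptive coach reacts to measured performance, which is the distinction the abstract draws between the strategies.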
Funding: Supported by the National Natural Science Foundation of China (No. 60736043).
Abstract: Video adaptation is a promising technique to bridge the gap between network status, device capabilities, and user preferences in pervasive media applications. However, conventional adaptation frameworks based on either transcoding or multiple pre-transcoding cannot accommodate large numbers of users with diversified applications. This paper introduces an intermediate video description called "Intermedia", which consists of multi-level video signal components, such as texture, motion, and rate control information, as well as some semantic features, such as structural characteristics and Region Of Interest (ROI) information. It is generated off-line and stored in the video server or media gateway. Intermedia is then used to design a novel video adaptation system. The proposed adaptation system quickly generates the required bit stream from Intermedia with very low complexity to fulfill a series of specific adaptation requirements, e.g., bitrate conversion, temporal/spatial resolution reduction, video summarization, ROI browsing, and some multi-level adaptations involving both signal-level and semantic-level adaptation. The satisfactory performance of such a system demonstrates the effectiveness and efficiency of the proposed video adaptation framework.