Funding: Supported by the National Natural Science Foundation of China (NSFC) [grant numbers 62172377, 61872205], the Shandong Provincial Natural Science Foundation [grant number ZR2019MF018], and the Startup Research Foundation for Distinguished Scholars (No. 202112016).
Abstract: The cloud platform has limited defense resources and cannot fully protect the edge servers used to process crowd sensing data in the Internet of Things. To guarantee the network's overall security, we present a network defense resource allocation method based on multi-armed bandits that maximizes the network's overall benefit. First, we propose a method for dynamically setting node defense resource thresholds to obtain the defender (attacker) benefit function of the edge servers (nodes) and distribution. Second, we design a defense resource sharing mechanism for neighboring nodes to obtain the defense capability of the nodes. Subsequently, we use the decomposability and Lipschitz continuity of the defender's total expected utility to reduce the difference between the utility's discrete and continuous arms, and we analyze this difference theoretically. Finally, experimental results show that the method maximizes the defender's total expected utility and reduces the difference between the discrete and continuous arms of the utility.
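To illustrate why Lipschitz continuity can bound the gap between discrete and continuous arms, consider a simplified one-dimensional case. This is a sketch under stated assumptions, not the paper's analysis: the utility $U$, the constant $L$, the interval $[0,1]$, and the uniform grid of $K$ arms are all hypothetical choices introduced here.

```latex
% Assumption: U : [0,1] -> R is L-Lipschitz, i.e. |U(x) - U(y)| <= L |x - y|.
% Assumption: the discrete arms are the K grid midpoints x_k = (2k - 1) / (2K), k = 1, ..., K.
\begin{align*}
  \text{For any } x^\ast \in [0,1] \text{ there is a } k \text{ with }
    |x^\ast - x_k| &\le \frac{1}{2K}, \\
  \text{so}\quad U(x^\ast) - \max_{1 \le j \le K} U(x_j)
    \;\le\; U(x^\ast) - U(x_k) &\le \frac{L}{2K}.
\end{align*}
```

Under these assumptions, the loss from restricting the defender to discrete arms shrinks as the grid is refined.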
Abstract: The frequent rebellions in Northern Manchuria during the Third Revolutionary War occurred in the special context of the struggle between the Kuomintang (KMT) and the Communist Party of China (CPC) for Northeast China after the victory of the Anti-Japanese War. The rebellions reached their peak during the KMT's attack on Northeast China, followed by a second wave after the defeat in the Defensive Battle of Siping. They tended to disappear after the downfall of the Jiang Pengfei Group. In addition to the blind recruitment by the CPC emphasized in traditional narratives, the instigation of the KMT, the traditional mutiny of the old army, the limitations of the early work of the Northeast Anti-Japanese United Army, the early activities of the KMT, and the regional conflicts between local and foreign forces are also important reasons for the concentration of rebellions.
Funding: Supported in part by the National Natural Science Foundation of China under Grants 61101113, 61372089, and 61201198; the Beijing Natural Science Foundation under Grants 4132007, 4132015, and 4132019; and the Research Fund for the Doctoral Program of Higher Education of China under Grant 20111103120017.
Abstract: In mobile cloud computing (MCC) systems, both the mobile access network and the cloud computing network are heterogeneous, with diverse configurations of hardware, software, architecture, resources, etc. In such heterogeneous mobile cloud (HMC) networks, both radio and cloud resources can become the system bottleneck, so schemes that manage these resources separately and independently may severely hinder system performance. In this paper, we aim to design the network as an integration of the mobile access part and the cloud computing part, utilizing the inherent heterogeneity to meet the diverse quality of service (QoS) requirements of tenants. Furthermore, we propose a novel QoS-aware cross-network radio and cloud resource management scheme for HMC networks, with the objective of maximizing tenant revenue while satisfying the QoS requirements. The proposed scheme is formulated as a restless bandits problem, whose "indexability" property guarantees low complexity together with scalable and distributed characteristics. Extensive simulation results demonstrate the significant performance improvement of the proposed scheme over existing ones.
Funding: Supported by the National Natural Science Foundation of China (71701209, 71771216), the Shaanxi Natural Science Foundation (2019JQ-250), and the China Post-doctoral Fund (2019M653962).
Abstract: To cope with the increasing threat of ballistic missiles (BMs) within a shorter reaction time, the shooting policy of the layered defense system needs to be optimized. The main decision-making problem in shooting optimization is how to choose the next BM to shoot at according to the previous engagements and their results, thus maximizing the expected return from BMs killed or minimizing the cost of BM penetration. Motivated by this, this study aims to determine an optimal shooting policy for a two-layer missile defense (TLMD) system. The paper considers a scenario in which the TLMD system wishes to shoot at a collection of BMs one at a time and to maximize the return obtained from BMs killed before the system's demise. To provide a policy analysis tool, the paper develops a general model for shooting decision-making in which the shooting engagements are described as a discounted-reward Markov decision process. The index shooting policy is a strategy that effectively balances the shooting returns against the risk that the defense mission fails. The numerical results show that the index policy outperforms a range of competitors, especially in terms of the mean return and the mean number of BMs killed.
Funding: This work is funded by the Deanship of Scientific Research (DSR), University of Jeddah, under Grant No. UJ-22-DR-1.
Abstract: The static nature of cyber defense systems gives attackers sufficient time to explore and further exploit the vulnerabilities of information technology systems. In this paper, we investigate a problem in which multiagent systems sensing and acting in an environment contribute to adaptive cyber defense. We present a learning strategy that enables multiple agents to learn optimal policies using multiagent reinforcement learning (MARL). Our proposed approach is inspired by the multiarmed bandit (MAB) learning technique, in which multiple agents either cooperate in decision making or work independently. We study a MAB approach in which defenders visit a system multiple times in an alternating fashion to maximize their rewards and protect their system. We find that this game can be modeled from an individual player's perspective as a restless MAB problem. We obtain further results when the MAB takes the form of a pure birth process, such as a myopic optimal policy, and we identify environments that offer the incentives required for cooperation in multiplayer projects.
Funding: Supported in part by the Shanghai Pujiang Program under Grant No. 21PJ1402600; in part by the Natural Science Foundation of Chongqing, China, under Grant No. CSTB2022NSCQ-MSX0375; in part by the Song Shan Laboratory Foundation under Grant No. YYJC022022007; in part by the Zhejiang Provincial Natural Science Foundation of China under Grant LGJ22F010001; in part by the National Key Research and Development Program of China under Grant 2020YFA0711301; and in part by the National Natural Science Foundation of China under Grant 61922049.
Abstract: In this paper, we investigate the minimization of age of information (AoI), a metric that measures information freshness, at the network edge with unreliable wireless communications. In particular, we consider a set of users transmitting status updates, which each user collects randomly over time, to an edge server through unreliable orthogonal channels. This raises a natural question: with random status update arrivals and unknown channel conditions, can we devise an intelligent scheduling policy that matches users to channels so as to stabilize the queues of all users while minimizing the average AoI? To answer it, we define a bipartite graph and formulate a dynamic edge activation problem with stability constraints. We then propose an online matching-while-learning algorithm (MatL) and discuss its implementation for wireless scheduling. Finally, simulation results demonstrate that MatL reliably learns the channel states and manages the users' buffers to deliver fresher information at the edge.
Funding: Supported by the Fundamental Research Funds for the Central Universities (2020ZDPYMS26), the National Natural Science Foundation of China (62071472, 51734009), the Natural Science Foundation of Jiangsu Province (BK20210489, BK20200650), the China Postdoctoral Science Foundation (2019M660133), the Future Network Scientific Research Fund Project (FNSRFP-2021-YB-12), and the Program for "Industrial IoT and Emergency Collaboration" Innovative Research Team in CUMT (No. 2020ZY002).
Abstract: Timely information updates are critical for real-time monitoring and control applications in the Internet of Things (IoT). In this paper, we consider a multi-antenna cellular IoT for state updates in which a base station (BS) collects information from randomly distributed IoT nodes through time-varying channels. Specifically, multiple IoT nodes are allowed to transmit their state updates simultaneously in a spatially multiplexed manner. Inspired by age of information (AoI), we introduce a novel concept, age of transmission (AoT), for scenarios in which the BS cannot obtain the generation time of the packets waiting to be transmitted. The deadline-constrained AoT-optimal scheduling problem is formulated as a restless multi-armed bandit (RMAB) problem. First, we prove the indexability of the scheduling problem and derive the Whittle index in closed form. Then, the interference graph and its complementary graph are constructed to represent the interference between pairs of nodes, and complete subgraphs are detected in the complementary graph to avoid inter-node interference. Next, an AoT-optimal scheduling strategy based on the Whittle index and complete subgraph detection is proposed. Finally, numerous simulations are conducted to verify the performance of the proposed strategy.
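The idea of combining a per-node index with interference avoidance can be sketched as follows. This is a minimal greedy heuristic under stated assumptions, not the paper's complete subgraph detection procedure: the function name `schedule_nodes`, the `index` values (stand-ins for Whittle indices, whose closed form is not given here), and the `max_streams` limit are all hypothetical. Any set it returns is pairwise non-interfering, which means it forms a complete subgraph in the complementary graph.

```python
def schedule_nodes(index, interference, max_streams):
    """Greedily pick mutually non-interfering nodes with the largest index values.

    index: dict mapping node -> priority value (hypothetical stand-in for a Whittle index)
    interference: iterable of (u, v) pairs of nodes that interfere with each other
    max_streams: assumed maximum number of nodes served in one slot
    """
    # Build an adjacency map of the interference graph.
    conflicts = {node: set() for node in index}
    for u, v in interference:
        conflicts[u].add(v)
        conflicts[v].add(u)

    chosen = []
    # Visit nodes from highest to lowest index; break ties by node name.
    for node in sorted(index, key=lambda n: (-index[n], n)):
        if len(chosen) == max_streams:
            break
        # Keep the node only if it interferes with none of the nodes already chosen.
        if all(node not in conflicts[c] for c in chosen):
            chosen.append(node)
    return chosen
```

A greedy pass like this is cheap but does not guarantee the maximum total index; an exhaustive search over complete subgraphs would be needed for that.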
Funding: Supported by the National Natural Science Foundation of China (NSFC) (62102232, 62122042, 61971269) and the Natural Science Foundation of Shandong Province (ZR2021QF064).
Abstract: As a combination of edge computing and artificial intelligence, edge intelligence has become a promising technique that provides its users with a series of fast, precise, and customized services. In edge intelligence, when learning agents are deployed on the edge side, data aggregation from the end side to the designated edge devices is an important research topic. Considering the differing importance of end devices, this paper studies the weighted data aggregation problem in a single-hop end-to-edge communication network. First, to ensure that all end devices with various weights are treated fairly in data aggregation, a distributed end-to-edge cooperative scheme is proposed. Then, to handle the massive contention on the wireless channel caused by end devices, a multi-armed bandit (MAB) algorithm is designed to help the end devices find their most appropriate update rates. Unlike traditional data aggregation approaches, incorporating the MAB gives our algorithm higher efficiency in data aggregation. Through theoretical analysis, we show that the efficiency of our algorithm is asymptotically optimal. Comparative experiments with previous works are also conducted to show the strength of our algorithm.
Abstract: Communication in the millimeter-wave (mmWave) band, i.e., 30 to 300 GHz, is characterized by short-range transmissions and the use of antenna beamforming (BF). Thus, multiple mmWave access points (APs) should be installed to fully cover a target environment with gigabits per second (Gbps) connectivity. However, inter-beam interference prevents maximizing the sum rates of the established concurrent links. In this paper, a reinforcement learning (RL) approach is proposed for enabling mmWave concurrent transmissions by finding beam directions that maximize the long-term average sum rates of the concurrent links. Specifically, the problem is formulated as a multiplayer multiarmed bandit (MAB), where the mmWave APs act as players aiming to maximize their achievable rewards, i.e., data rates, and the arms to play are the available beam directions. In this setup, a selfish concurrent multiplayer MAB strategy is advocated. Four MAB algorithms, namely ϵ-greedy, upper confidence bound (UCB), Thompson sampling (TS), and the exponential weight algorithm for exploration and exploitation (EXP3), are examined by employing them in each AP to selfishly enhance its beam selection based only on its own previous observations. After a few rounds of interaction, the mmWave APs learn how to select concurrent beams that enhance the overall system performance. The proposed MAB-based mmWave concurrent BF shows performance comparable to the optimal solution.
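As an illustration of how a single AP might learn a beam direction from its own observations, the following is a minimal sketch of the standard UCB1 rule. It is not the paper's implementation: the function names, the `reward_fn` callback, and the example rates are hypothetical, and rewards are assumed to be normalised to the range [0, 1].

```python
import math
import random


def ucb1_select(counts, means, t):
    """Return the arm (beam index) chosen by UCB1 at round t, with t starting at 1."""
    # Play every arm once before relying on confidence bounds.
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    # Empirical mean plus an exploration bonus that shrinks as an arm is played more.
    scores = [m + math.sqrt(2.0 * math.log(t) / n) for m, n in zip(means, counts)]
    return max(range(len(scores)), key=scores.__getitem__)


def run_ucb1(reward_fn, n_arms, rounds):
    """Run UCB1 for a number of rounds; reward_fn(arm) returns the observed reward."""
    counts = [0] * n_arms
    means = [0.0] * n_arms
    for t in range(1, rounds + 1):
        arm = ucb1_select(counts, means, t)
        reward = reward_fn(arm)
        counts[arm] += 1
        # Incremental update of the running mean for the played arm.
        means[arm] += (reward - means[arm]) / counts[arm]
    return counts, means


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical success probabilities standing in for normalised data rates per beam.
    true_rates = [0.2, 0.5, 0.8, 0.4]
    counts, means = run_ucb1(lambda a: float(rng.random() < true_rates[a]), 4, 2000)
    print(counts)
```

In the selfish multiplayer setting described above, each AP would run its own independent copy of such a learner, with the other APs' beam choices affecting the rewards it observes.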