Abstract: In the context of edge computing environments in general and the metaverse in particular, federated learning (FL) has emerged as a distributed machine learning paradigm that allows multiple users to collaboratively train a shared machine learning model locally, eliminating the need to upload raw data to a central server. It is perhaps the only training paradigm that preserves the privacy of user data, which is essential for computing environments as personal as the metaverse. However, the originally proposed FL architecture does not scale to the large number of user devices in the metaverse community. To mitigate this problem, hierarchical federated learning (HFL) has been introduced as a general distributed learning paradigm, inspiring a number of research works. In this paper, we present several types of HFL architectures, with a special focus on the three-layer client-edge-cloud HFL architecture, which is most pertinent to the metaverse due to its delay-sensitive nature. We also examine works that take advantage of the natural layered organization of three-layer client-edge-cloud HFL to tackle some of the most challenging problems in FL within the metaverse. Finally, we outline some future research directions of HFL in the metaverse.
Funding: Sponsored in part by the National Key R&D Program of China under Grant No. 2020YFB1806605; the National Natural Science Foundation of China under Grant Nos. 62022049, 62111530197, and 61871254; OPPO; supported by the Fundamental Research Funds for the Central Universities under Grant No. 2022JBXT001.
Abstract: Federated learning (FL) is a distributed machine learning (ML) framework in which several clients cooperatively train an ML model by exchanging model parameters without directly sharing their local data. In FL, the limited number of participants for model aggregation and the communication latency are two major bottlenecks. Hierarchical federated learning (HFL), with a cloud-edge-client hierarchy, can leverage the large coverage of cloud servers and the low transmission latency of edge servers. There is growing research interest in implementing FL in vehicular networks because intelligent vehicles require timely ML training. However, the limited number of participants in vehicular networks and vehicle mobility degrade the performance of FL training. In this context, HFL, which stands out for lower latency, wider coverage, and more participants, is promising in vehicular networks. In this paper, we begin with the background and motivation of HFL and the feasibility of implementing HFL in vehicular networks. Then, the architecture of HFL is illustrated. Next, we clarify new issues in HFL and review several existing solutions. Furthermore, we introduce some typical use cases in vehicular networks as well as our initial efforts on implementing HFL in vehicular networks. Finally, we conclude with future research directions.
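The cloud-edge-client hierarchy described above can be illustrated with a minimal sketch of two-tier aggregation. It assumes sample-count-weighted averaging at both tiers (in the style of FedAvg), which the abstract does not specify; the names `weighted_average`, `hierarchical_aggregate`, and the toy `edges` data are hypothetical.

```python
from typing import Dict, List, Tuple

Weights = List[float]
Update = Tuple[Weights, int]  # (model parameters, number of local samples)


def weighted_average(updates: List[Update]) -> Update:
    """Average parameter vectors, weighting each by its sample count."""
    total = sum(n for _, n in updates)
    dim = len(updates[0][0])
    avg = [sum(w[i] * n for w, n in updates) / total for i in range(dim)]
    return avg, total


def hierarchical_aggregate(edges: Dict[str, List[Update]]) -> Weights:
    """Aggregate clients at each edge server, then edge models at the cloud."""
    # Edge tier: each edge server averages only its own clients (assumed rule).
    edge_models = [weighted_average(clients) for clients in edges.values()]
    # Cloud tier: the cloud averages edge models, weighted by samples covered.
    cloud_model, _ = weighted_average(edge_models)
    return cloud_model


# Toy data: two edge servers, three clients, two-parameter models.
edges = {
    "edge_a": [([1.0, 2.0], 10), ([3.0, 4.0], 30)],
    "edge_b": [([5.0, 6.0], 60)],
}
global_model = hierarchical_aggregate(edges)
```

With sample-count weights at both tiers, the cloud result equals a flat weighted average over all clients; the hierarchy changes where communication happens, not the aggregate itself, under this assumed rule.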
Funding: Supported by the National Natural Science Foundation of China (61303108); the Suzhou Key Industries Technological Innovation-Prospective Applied Research Project (SYG201804); a project funded by the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD); and the Fundamental Research Funds for the Central Universities, JLU (93K172020K25).
Abstract: In reinforcement learning, an agent may explore ineffectively when dealing with sparse-reward tasks in which finding a reward point is difficult. To solve this problem, we propose an algorithm called hierarchical deep reinforcement learning with automatic sub-goal identification via computer vision (HADS), which takes advantage of hierarchical reinforcement learning to alleviate the sparse-reward problem and improves the efficiency of exploration through a sub-goal mechanism. HADS uses a computer vision method to identify sub-goals automatically for hierarchical deep reinforcement learning. Because not all sub-goal points are reachable, a mechanism is proposed to remove unreachable sub-goal points so as to further improve the performance of the algorithm. HADS uses contour recognition to identify sub-goals from the state image: some salient states in the state image may be recognized as sub-goals, while those that are not will be removed based on prior knowledge. Our experiments verified the effectiveness of the algorithm.
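The idea of discarding unreachable sub-goal candidates can be sketched on a toy grid. This is not the HADS mechanism, which the abstract says relies on contour recognition and prior knowledge; here a breadth-first search over a hypothetical character grid (`#` marks walls) is swapped in purely to illustrate the filtering step. All names are assumptions.

```python
from collections import deque
from typing import List, Set, Tuple

Cell = Tuple[int, int]  # (row, column)


def reachable_cells(grid: List[str], start: Cell) -> Set[Cell]:
    """Return every free cell reachable from start via 4-connected moves."""
    rows, cols = len(grid), len(grid[0])
    seen = {start}
    queue = deque([start])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] != "#" and (nr, nc) not in seen:
                seen.add((nr, nc))
                queue.append((nr, nc))
    return seen


def filter_subgoals(grid: List[str], start: Cell, candidates: List[Cell]) -> List[Cell]:
    """Keep only the candidate sub-goals the agent could actually reach."""
    ok = reachable_cells(grid, start)
    return [cell for cell in candidates if cell in ok]


# A wall in column 2 separates the right-hand column from the start cell.
grid = [
    "..#.",
    "..#.",
    "..#.",
]
candidates = [(0, 1), (2, 0), (1, 3)]
kept = filter_subgoals(grid, (0, 0), candidates)
```

In this toy layout the candidate at `(1, 3)` lies behind the wall and is dropped, while the other two are kept.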
Funding: Supported by the National Natural Science Foundation of China (62003021, 91212304).
Abstract: The guidance strategy is a critical factor in determining the striking effect of a missile operation. A novel guidance law is presented by exploiting deep reinforcement learning (DRL) with a hierarchical deep deterministic policy gradient (DDPG) algorithm. The reward functions are constructed to minimize the line-of-sight (LOS) angle rate and to avoid the threat posed by opposing obstacles. To attenuate chattering of the acceleration, a hierarchical reinforcement learning structure and an improved reward function with an action penalty are put forward. The simulation results validate that a missile using the proposed method can hit the target successfully and keep away from the threatened areas effectively.
Funding: Financial support provided by the Major Project of Yunnan Science and Technology under Project No. 202302AE09002003, entitled “Research on the Integration of Key Technologies in Smart Agriculture.”
Abstract: This study proposes a weighted sampling hierarchical classification learning method based on an efficient backbone network model to address the high cost, low accuracy, and long processing time of traditional tea disease recognition methods. The method enhances feature extraction by conducting hierarchical classification learning based on the EfficientNet model, effectively alleviating the impact of high similarity between tea diseases on the model's classification performance. To better handle the small number and uneven distribution of tea disease samples, this study introduces a weighted sampling scheme to optimize data processing, which not only alleviates the overfitting caused by too few samples but also balances the probability of drawing samples from imbalanced classes. The experimental results show that the proposed method was effective in identifying both healthy tea leaves and four common leaf diseases of tea (tea algal spot disease, tea white spot disease, tea anthracnose disease, and tea leaf blight disease). After the “weighted sampling hierarchical classification learning method” was applied to train 7 different efficient backbone networks, most of their accuracies improved. The EfficientNet-B1 model proposed in this study achieved an accuracy of 99.21% after adopting this learning method, which is higher than EfficientNet-B2 (98.82%) and MobileNet-V3 (98.43%). In addition, to better apply the results of identifying tea diseases, this study developed a mini-program that runs on WeChat. Users can quickly obtain accurate identification results and the corresponding disease descriptions and prevention methods through simple operations. This intelligent tool for identifying tea diseases can serve as an auxiliary tool for farmers, consumers, and related scientific researchers and has certain practical value.
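The abstract does not give the exact weighting rule, so the following is only a sketch of one common choice: each sample is weighted by the inverse of its class frequency, so that every class is drawn with roughly equal probability. The function names, the label strings, and the 90/10 split are hypothetical.

```python
import random
from collections import Counter
from typing import List, Sequence


def inverse_frequency_weights(labels: Sequence[str]) -> List[float]:
    """Weight each sample by 1 / (size of its class); assumed rule, not the paper's."""
    counts = Counter(labels)
    return [1.0 / counts[y] for y in labels]


def sample_indices(labels: Sequence[str], k: int, seed: int = 0) -> List[int]:
    """Draw k sample indices with replacement using the class-balancing weights."""
    rng = random.Random(seed)
    weights = inverse_frequency_weights(labels)
    return rng.choices(range(len(labels)), weights=weights, k=k)


# Toy imbalanced dataset: 90 healthy leaves, 10 diseased leaves.
labels = ["healthy"] * 90 + ["leaf_blight"] * 10
weights = inverse_frequency_weights(labels)
drawn = sample_indices(labels, k=10_000)
share_minority = sum(labels[i] == "leaf_blight" for i in drawn) / len(drawn)
```

Under this rule the total weight of each class is 1, so the minority class makes up about half of the drawn samples even though it is only a tenth of the data. Sampling with replacement repeats minority images, which is why such schemes are usually paired with data augmentation.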
Funding: The National Natural Science Foundation of China (No. 61673265); the National Key Research and Development Program (No. 2020YFC1512203); the Shanghai Commercial Aircraft System Engineering Joint Research Fund (No. CASEF-2022-Z05).
Abstract: Based on the option-critic algorithm, a new adversarial algorithm named deterministic policy network with option architecture is proposed to improve an agent's performance against an opponent with a fixed offensive algorithm. An option network is introduced in the upper-level design, which generates an activated signal from defensive and offensive strategies according to the current situation. The lower-level executive layer then determines the interactive action under the guidance of the activated signal, and the values of both the activated signal and the interactive action are evaluated together by the critic structure. This method effectively relaxes the requirement of a semi-Markov decision process and simplifies the network structure by eliminating the termination probability layer. The experimental results show that the new algorithm switches neatly between offensive and defensive strategy styles and acquires more reward from the environment than the classical deep deterministic policy gradient algorithm.
Funding: National Key Research and Development Program of China, Grant/Award Number: 2022YFF0904303; Beijing Science and Technology Planning Project, Grant/Award Number: Z221100006322003; National Natural Science Foundation of China, Grant/Award Number: 61932003.
Abstract: Computer-aided design (CAD) software continues to be a crucial tool in digital twin applications and manufacturing, facilitating the design of various products. We present a novel CAD generation method, an agent that constructs CAD sequences containing sketch-and-extrude modelling operations efficiently and with high quality. Starting from the sketch and extrusion operation sequences, we utilise a transformer encoder to encode them into different disentangled codebooks that represent their distribution properties while considering their correlations. Then, a combination of auto-regressive and non-autoregressive samplers is trained to sample the code for CAD sequence construction. Extensive experiments demonstrate that our model generates diverse and high-quality CAD models. We also show some cases of real digital twin applications and indicate that our generated model can be used as the data source for the digital twin platform, exhibiting designers' potential.
Funding: The authors appreciate the financial support of the National Natural Science Foundation of China (Grant No. 61703041) and the technological innovation program of Beijing Institute of Technology (2021CX11006).
Abstract: Because overtaking by intelligent vehicles is usually a complex process, a safe and efficient automated overtaking system (AOS) is vital to avoid accidents caused by driver error. Existing AOSs rarely consider the longitudinal reactions of the overtaken vehicle (OV) during overtaking. This paper proposes a novel AOS based on hierarchical reinforcement learning, in which the longitudinal reaction is given by a data-driven social preference estimation. The AOS incorporates two modules that function in different overtaking phases. The first module, based on a semi-Markov decision process and motion primitives, is built for motion planning and control. The second module, based on a Markov decision process, is designed to enable vehicles to make proper decisions according to the social preference of the OV. The proposed AOS and its modules are verified experimentally on realistic overtaking data. The test results show that the proposed AOS can realize safe and effective overtaking in scenes built from realistic data and can flexibly adjust lateral driving behavior and lane-changing position when the OVs have different social preferences.
Funding: Supported by the National Basic Research Program of China (2013CB329603); the National Natural Science Foundation of China (61375058, 71231002); the China Mobile Research Fund (MCM 20130351); the Ministry of Education of China and the Special Co-Construction Project of Beijing Municipal Commission of Education.
Abstract: The option framework is a promising method for discovering hierarchical structure in reinforcement learning (RL) to accelerate learning. The key to option discovery is how an agent can autonomously find useful subgoals among the trails it has passed through. By analyzing the agent's actions along these trails, useful heuristics can be found: not only does the agent pass through subgoals more frequently, but its effective actions are also restricted at subgoals. As a consequence, subgoals can be regarded as the best-matching action-restricted states on the paths. In the grid-world environment, the concept of the unique-direction value, which reflects the action-restricted property, is introduced to find the best-matching action-restricted states. The unique-direction-value (UDV) approach is used to form options autonomously, both offline and online. Experiments show that the approach can find subgoals correctly, and that Q-learning with options found in either the offline or the online process can accelerate learning significantly.
Funding: Project supported by the National Natural Science Foundation of China (No. 61379074) and the Zhejiang Provincial Natural Science Foundation of China (Nos. LZ12F02003 and LY15F020035).
Abstract: In some image classification tasks, the similarity between categories varies, and samples are often misclassified into highly similar categories. To distinguish highly similar categories, more specific features are required so that the classifier can improve its classification performance. In this paper, we propose a novel two-level hierarchical feature learning framework based on the deep convolutional neural network (CNN), which is simple and effective. First, the deep feature extractors of the different levels are trained using a transfer learning method that fine-tunes the pre-trained deep CNN model on the new target dataset. Second, the general feature extracted from all the categories and the specific feature extracted from highly similar categories are fused into a single feature vector. Then the final feature representation is fed into a linear classifier. Finally, experiments using the Caltech-256, Oxford Flower-102, and Tasmania Coral Point Count (CPC) datasets demonstrate the strong expressive ability of the deep features resulting from two-level hierarchical feature learning. Our proposed method effectively increases classification accuracy in comparison with flat multi-class classification methods.
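The fusion step described above can be illustrated with a small sketch. It assumes that fusion is plain concatenation and that the classifier is a bias-plus-dot-product linear layer; the abstract does not state either detail, and the toy feature values, weight matrix, and function names are all hypothetical.

```python
from typing import List, Sequence


def fuse(general: Sequence[float], specific: Sequence[float]) -> List[float]:
    """Fuse the two feature levels by concatenation (assumed fusion rule)."""
    return list(general) + list(specific)


def linear_scores(x: Sequence[float], weights: Sequence[Sequence[float]],
                  bias: Sequence[float]) -> List[float]:
    """Compute one score per class: dot(weights[j], x) + bias[j]."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def predict(x: Sequence[float], weights: Sequence[Sequence[float]],
            bias: Sequence[float]) -> int:
    """Return the index of the highest-scoring class."""
    scores = linear_scores(x, weights, bias)
    return max(range(len(scores)), key=scores.__getitem__)


# The general feature cannot separate the two similar classes (0.5 vs 0.5);
# the specific feature, learned only on those similar classes, breaks the tie.
general = [0.5, 0.5]
specific = [0.9, 0.1]
fused = fuse(general, specific)

weights = [
    [1.0, 0.0, 1.0, 0.0],  # class 0
    [0.0, 1.0, 0.0, 1.0],  # class 1
]
bias = [0.0, 0.0]
scores = linear_scores(fused, weights, bias)
label = predict(fused, weights, bias)
```

In practice the two feature vectors would come from fine-tuned CNN extractors rather than hand-written lists; the sketch only shows how the fused vector feeds the linear classifier.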