Funding: Supported by the National Natural Science Foundation of China (No. 61871283).
Abstract: The Autonomous Underwater Glider (AUG) is a prevalent type of underwater intelligent internet vehicle and occupies a dominant position in industrial applications, in which path planning is an essential problem. Due to the complexity and variability of the ocean, accurate environment modeling and flexible path planning algorithms are pivotal challenges. Traditional models mainly rely on mathematical functions, which are neither complete nor reliable, and most existing path planning algorithms depend on the environment and lack flexibility. To overcome these challenges, we propose a path planning system for underwater intelligent internet vehicles. It applies digital twins and sensor data to map the real ocean environment into a virtual digital space, which provides a comprehensive and reliable environment for path simulation. We design a value-based reinforcement learning path planning algorithm and explore the optimal network structure parameters. The path simulation is controlled by a closed-loop model integrated into the terminal vehicle through edge computing. The integration of state inputs enriches the learning of the neural network and helps to improve generalization and flexibility, while the task-related reward function promotes rapid convergence of training. The experimental results show that our reinforcement-learning-based path planning algorithm has great flexibility and can effectively adapt to a variety of ocean conditions.
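The abstract does not give implementation details; as a rough sketch of a value-based path planning loop of this general kind, the snippet below trains a tabular Q-learning agent on a small grid "ocean" with a task-related reward (goal bonus, obstacle penalty, per-step cost). The grid layout, reward values, and hyperparameters are illustrative assumptions, not the authors' code.

```python
# Hypothetical sketch: value-based RL path planning on a small grid "ocean".
# Environment layout, reward shaping, and hyperparameters are illustrative assumptions.
import random

GRID = 8
OBSTACLES = {(2, 3), (3, 3), (4, 5), (6, 2)}      # assumed hazardous cells
START, GOAL = (0, 0), (7, 7)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]      # up, down, left, right

def step(state, action):
    """Apply one move and return (next_state, task-related reward, done)."""
    nx, ny = state[0] + action[0], state[1] + action[1]
    if not (0 <= nx < GRID and 0 <= ny < GRID):
        return state, -1.0, False                 # leaving the area is penalized
    if (nx, ny) in OBSTACLES:
        return state, -5.0, False                 # obstacle penalty
    if (nx, ny) == GOAL:
        return (nx, ny), 10.0, True               # large reward on reaching the goal
    return (nx, ny), -0.1, False                  # small step cost favors short paths

Q = {}                                            # tabular action-value function
alpha, gamma, epsilon = 0.1, 0.95, 0.1

def q(s, a):
    return Q.get((s, a), 0.0)

for episode in range(2000):
    state = START
    for _ in range(200):                          # step cap per episode
        if random.random() < epsilon:
            a = random.randrange(len(ACTIONS))
        else:
            a = max(range(len(ACTIONS)), key=lambda k: q(state, k))
        nxt, reward, done = step(state, ACTIONS[a])
        best_next = max(q(nxt, k) for k in range(len(ACTIONS)))
        Q[(state, a)] = q(state, a) + alpha * (reward + gamma * best_next - q(state, a))
        state = nxt
        if done:
            break
```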
Abstract: Intelligent penetration testing is of great significance for improving the security of information systems, and the critical issue is the planning of penetration test paths. Given the difficulty for attackers to obtain complete network information in realistic network scenarios, Reinforcement Learning (RL) is a promising approach to discovering the optimal penetration path under incomplete information about the target network. Existing RL-based methods are challenged by the sizeable discrete action space, which leads to difficulties in convergence, and most of them still rely on expert knowledge. To address these issues, this paper proposes a penetration path planning method based on reinforcement learning with episodic memory. First, the penetration testing problem is formally described in terms of reinforcement learning. To speed up training without specific prior knowledge, the proposed algorithm is the first to introduce episodic memory to store previously experienced advantageous strategies. Furthermore, the method offers an exploration strategy based on episodic memory to guide the agent's learning. This design makes full use of historical experience to reduce blind exploration and improve planning efficiency. Finally, comparison experiments are carried out against existing RL-based methods. The results show that the proposed method has better convergence performance and reduces the running time by more than 20%.
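The paper's exact data structures are not described in the abstract; the following is a minimal, assumption-laden sketch of how an episodic memory of advantageous strategies might bias exploration in a Q-learning loop: exploratory moves preferentially replay actions that previously led to high returns from the current state instead of being drawn uniformly at random. All function names and parameters are hypothetical.

```python
# Hypothetical sketch of episodic-memory-guided exploration for an RL agent.
# episodic_memory maps a state to the best (return, action) pair observed from that state.
import random

def choose_action(state, Q, episodic_memory, n_actions, epsilon=0.2, use_memory_prob=0.7):
    """Epsilon-greedy choice, but exploratory moves prefer remembered good actions."""
    if random.random() >= epsilon:
        return max(range(n_actions), key=lambda a: Q.get((state, a), 0.0))
    if state in episodic_memory and random.random() < use_memory_prob:
        return episodic_memory[state][1]          # replay an advantageous action
    return random.randrange(n_actions)            # otherwise explore blindly

def update_memory(episodic_memory, trajectory, gamma=0.95):
    """After an episode, store each state's best discounted return and the action taken."""
    G = 0.0
    for state, action, reward in reversed(trajectory):
        G = reward + gamma * G
        if state not in episodic_memory or G > episodic_memory[state][0]:
            episodic_memory[state] = (G, action)
```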
Abstract: With the arrival of the big data era, the modern higher education model has undergone radical changes, and higher requirements have been put forward for the data literacy of college teachers. This paper first analyzes the connotation of teacher data literacy and then reviews the status quo and dilemmas of teachers' data literacy in applied universities. It proposes enhancing teachers' data literacy from the perspective of organizational learning: by building a digital culture, creating a data-driven teaching environment, and constructing an interdisciplinary learning community, the theory and practice of datafication can be further promoted inside and outside the organization, ultimately improving the quality of teaching.
Funding: Supported by the National Natural Science Foundation (61601491), the Natural Science Foundation of Hubei Province (2018CFC865), and the China Postdoctoral Science Foundation Funded Project (2016T45686).
Abstract: To solve the path following control problem for unmanned surface vehicles (USVs), a control method based on deep reinforcement learning (DRL) with long short-term memory (LSTM) networks is proposed. A distributed proximal policy optimization (DPPO) algorithm, a modified actor-critic reinforcement learning algorithm, is adapted to improve controller performance over repeated trials. The LSTM network structure is introduced to handle the strong temporal correlation in USV control. In addition, a specially designed path dataset, including straight and curved paths, is established to simulate various sailing scenarios so that the reinforcement learning controller can obtain as much handling experience as possible. Extensive numerical simulation results demonstrate that the proposed method achieves better control performance on missions involving complex maneuvers than a controller trained with limited scenarios and can potentially be applied in practice.
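As a hedged sketch only (the paper's network architecture and hyperparameters are not reproduced here), the module below shows one plausible way an LSTM-based actor-critic head for a path-following controller could be laid out in PyTorch, consuming a short history of USV observations. Input/hidden sizes and the action parameterization are assumptions.

```python
# Hypothetical PyTorch sketch of an LSTM actor-critic head for a USV path-following controller.
# Observation/action dimensions and the Gaussian policy are illustrative assumptions.
import torch
import torch.nn as nn

class LSTMActorCritic(nn.Module):
    def __init__(self, obs_dim=8, hidden_dim=64, action_dim=2):
        super().__init__()
        self.lstm = nn.LSTM(obs_dim, hidden_dim, batch_first=True)   # captures temporal correlation
        self.actor_mean = nn.Linear(hidden_dim, action_dim)          # e.g. rudder and thrust commands
        self.log_std = nn.Parameter(torch.zeros(action_dim))
        self.critic = nn.Linear(hidden_dim, 1)                       # state-value estimate

    def forward(self, obs_seq, hidden=None):
        # obs_seq: (batch, time, obs_dim) window of recent USV observations
        out, hidden = self.lstm(obs_seq, hidden)
        last = out[:, -1, :]
        dist = torch.distributions.Normal(self.actor_mean(last), self.log_std.exp())
        value = self.critic(last)
        return dist, value, hidden

# Usage sketch: sample actions for a batch of observation windows.
net = LSTMActorCritic()
dist, value, _ = net(torch.zeros(4, 10, 8))
action = dist.sample()
```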
Funding: Funded by Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2022R66), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.
Abstract: Unmanned Aerial Vehicles (UAVs), or drones, introduced for military applications are gaining popularity in several other fields, such as security and surveillance, due to their ability to perform repetitive and tedious tasks in hazardous environments. Their increased demand has created the requirement for UAVs to traverse independently through three-dimensional (3D) flight environments containing various obstacles, a problem that has been efficiently addressed by metaheuristics in past literature. However, no single optimization algorithm can solve all kinds of optimization problems effectively, so there is a pressing need to integrate metaheuristics for general applicability. To address this issue, this paper introduces a novel reinforcement-learning-controlled Grey Wolf Optimisation-Archimedes Optimisation Algorithm (QGA), which is exhaustively validated first on 22 benchmark functions and then used to obtain an optimum, collision-free flyable path for UAVs in a three-dimensional environment. The performance of the developed QGA has been compared against various metaheuristics. The simulation results reveal that the QGA acquires a feasible and effective flyable path more efficiently in complicated environments.
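The QGA's integration details are not spelled out in the abstract; one plausible reading is that a learned value table decides, per iteration, whether a Grey Wolf or an Archimedes update is applied to the population. The fragment below sketches only that selection logic, with the two update operators left as stubs; every name and parameter is an assumption.

```python
# Hypothetical sketch of RL-controlled selection between two metaheuristic update operators.
# gwo_update / aoa_update are placeholders for the Grey Wolf and Archimedes operators.
import random

def gwo_update(population, fitness):   # placeholder for the Grey Wolf Optimisation step
    return population

def aoa_update(population, fitness):   # placeholder for the Archimedes Optimisation step
    return population

def rl_controlled_search(population, evaluate, iterations=100, alpha=0.1, gamma=0.9, epsilon=0.1):
    operators = [gwo_update, aoa_update]
    Q = [0.0, 0.0]                                 # one value per operator (single abstract state)
    best = min(evaluate(x) for x in population)
    for _ in range(iterations):
        a = random.randrange(2) if random.random() < epsilon else max((0, 1), key=lambda k: Q[k])
        fitness = [evaluate(x) for x in population]
        population = operators[a](population, fitness)
        new_best = min(min(evaluate(x) for x in population), best)
        reward = best - new_best                   # reward = improvement of the best cost found
        Q[a] += alpha * (reward + gamma * max(Q) - Q[a])
        best = new_best
    return population, best

# Usage sketch on a toy 1-D problem (the stub operators make this a no-op search).
pop = [[random.uniform(-5, 5)] for _ in range(10)]
print(rl_controlled_search(pop, evaluate=lambda x: x[0] ** 2, iterations=20)[1])
```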
Funding: Supported by the Special Funds for Basic Research of Central Universities (D5000220240) and the Special Funds for Education and Teaching Reform in 2023 (06410-23GZ230102).
Abstract: Software testing courses are characterized by strong practicality, comprehensiveness, and diversity. Because of the differences among students and the need to design personalized solutions for their specific requirements, the design of existing software testing courses fails to meet the demand for personalized learning. Knowledge graphs, with their rich semantics and good visualization effects, have a wide range of applications in education. In response to the failure of current software testing courses to support personalized learning, this paper offers a learning path recommendation based on knowledge graphs to provide personalized learning paths for students.
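The paper's graph schema is not described here; as a minimal, hypothetical illustration of the idea, the snippet below treats course knowledge points as nodes with prerequisite edges and recommends a personalized path by topologically ordering only the concepts a student has not yet mastered. The concepts and edges are invented examples.

```python
# Hypothetical sketch: personalized learning path from a prerequisite knowledge graph.
# The knowledge points and edges are invented examples, not the paper's actual graph.
from graphlib import TopologicalSorter

# prerequisites[concept] = set of concepts that must be learned first
prerequisites = {
    "unit testing": {"testing basics"},
    "integration testing": {"unit testing"},
    "test automation": {"unit testing", "scripting"},
    "performance testing": {"integration testing"},
}

def recommend_path(prerequisites, mastered):
    """Return the remaining concepts in an order that respects prerequisites."""
    order = TopologicalSorter(prerequisites).static_order()
    return [c for c in order if c not in mastered]

print(recommend_path(prerequisites, mastered={"testing basics", "scripting"}))
```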
Funding: Supported in part by the National Key Research and Development Program of China under Grant 2018YFB1802303 and in part by the Zhejiang Provincial Natural Science Foundation of China under Grant LQ20F010010.
Abstract: Recently, a generalized successive cancellation list (SCL) decoder implemented with a shifted-pruning (SP) scheme, namely the SCL-SP-ω decoder, was presented for polar codes; it is able to shift the pruning window at most ω times during each SCL re-decoding attempt to prevent the correct path from being eliminated. The candidate positions for applying the SP scheme are selected by a shifting metric based on the probability that elimination occurs. However, the number of exponential/logarithm operations involved in the SCL-SP-ω decoder grows linearly with the number of information bits and the list size, which leads to high computational complexity. In this paper, we present a detailed analysis of the SCL-SP-ω decoder in terms of decoding performance and complexity, which reveals that the choice of the shifting metric is essential for improving the decoding performance and reducing the number of re-decoding attempts simultaneously. We then introduce a simplified metric derived from the path metric (PM) domain, and a custom-tailored deep learning (DL) network is further designed to enhance the efficiency of the proposed simplified metric. Both proposed metrics are free of transcendental functions and hence are more hardware-friendly than the existing metrics. Simulation results show that the proposed DL-aided metric provides the best error correction performance in comparison with the state of the art.
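The paper's simplified metric is not reproduced in the abstract; purely as an illustrative stand-in for a PM-domain selection rule, the sketch below ranks information-bit positions by the gap between the best discarded path metric and the worst surviving path metric (a small gap suggesting a risky pruning decision) and returns the ω most suspicious positions. The score, the data layout, and the numbers are assumptions.

```python
# Hypothetical sketch of choosing shifted-pruning candidate positions from path-metric data.
# The gap-based score is an illustrative stand-in, not the paper's simplified metric.

def select_shift_positions(surviving_pms, best_discarded_pm, omega):
    """
    surviving_pms[i]     : path metrics of the L surviving paths after information bit i
    best_discarded_pm[i] : smallest path metric among the paths pruned at information bit i
    Returns the omega positions where the pruned path came closest to surviving,
    i.e. where the correct path is most likely to have been eliminated.
    """
    scores = []
    for i, pms in enumerate(surviving_pms):
        gap = best_discarded_pm[i] - max(pms)     # small gap -> marginal pruning decision
        scores.append((gap, i))
    scores.sort()                                  # ascending gap = highest risk first
    return [i for _, i in scores[:omega]]

# Usage sketch with made-up numbers for a list size of 4 and omega = 2.
print(select_shift_positions(
    surviving_pms=[[0.1, 0.4, 0.9, 1.2], [0.2, 0.5, 0.7, 1.0], [0.3, 0.6, 1.1, 1.5]],
    best_discarded_pm=[1.3, 1.05, 3.0],
    omega=2,
))
```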
Funding: Supported by the National Natural Science Foundation of China (60874040).
Abstract: The problem of passive detection discussed in this paper involves searching for and locating an aerial emitter with two aircraft using passive radars. To improve the detection probability and accuracy, a fuzzy Q-learning algorithm for dual-aircraft flight path planning is proposed. The passive detection task model of the dual aircraft is set up based on the partition of the target active radar's radiation area. The problem is formulated as a Markov decision process (MDP) by using fuzzy theory to generalize the state space and by properly defining the transition functions, action space, and reward function. Details of the path planning algorithm are presented. Simulation results indicate that the algorithm can provide adaptive strategies for the dual aircraft to control their flight paths to detect a non-maneuvering or maneuvering target.
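The paper's fuzzy partitioning of the state space is not reproduced here; as a loose sketch under assumed triangular membership functions over a one-dimensional state, the code below distributes a temporal-difference correction over the fuzzy rules activated by a continuous state, which is the core of a fuzzy Q-learning update. Rule centres, widths, and hyperparameters are illustrative assumptions.

```python
# Hypothetical sketch of one fuzzy Q-learning update over a 1-D continuous state.
# Membership functions, rule centres, and hyperparameters are illustrative assumptions.

CENTRES = [0.0, 0.25, 0.5, 0.75, 1.0]     # centres of triangular fuzzy sets over the state

def memberships(x, width=0.25):
    """Triangular membership degree of state x in each fuzzy set, normalized to sum to 1."""
    raw = [max(0.0, 1.0 - abs(x - c) / width) for c in CENTRES]
    total = sum(raw) or 1.0
    return [m / total for m in raw]

def fuzzy_q_value(Q, x, action):
    """Q-value of (x, action) as a membership-weighted blend of the rules' values."""
    return sum(m * Q[i][action] for i, m in enumerate(memberships(x)))

def fuzzy_q_update(Q, x, action, reward, x_next, alpha=0.1, gamma=0.95):
    """Distribute the TD error over the rules activated by state x, weighted by membership."""
    n_actions = len(Q[0])
    best_next = max(fuzzy_q_value(Q, x_next, a) for a in range(n_actions))
    td_error = reward + gamma * best_next - fuzzy_q_value(Q, x, action)
    for i, m in enumerate(memberships(x)):
        Q[i][action] += alpha * m * td_error

# Usage sketch: 5 fuzzy rules x 3 discrete actions.
Q = [[0.0] * 3 for _ in CENTRES]
fuzzy_q_update(Q, x=0.3, action=1, reward=1.0, x_next=0.35)
```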