Abstract: Aim: To investigate the model-free multi-step average-reward reinforcement learning algorithm. Methods: By combining the R-learning algorithm with the temporal difference learning (TD(λ)) algorithm for average-reward problems, a novel incremental algorithm, called R(λ) learning, was proposed. Results and Conclusion: The proposed algorithm is a natural extension of Q(λ) learning, the multi-step discounted-reward reinforcement learning algorithm, to the average-reward case. Simulation results show that R(λ) learning with intermediate λ values achieves a significant performance improvement over simple R-learning.
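The abstract does not state the update rules, so the following is only a minimal tabular sketch of how R-learning might be combined with eligibility traces. The function name `r_lambda_step`, the step sizes `beta` and `alpha`, the accumulating traces, and the Watkins-style trace reset after an exploratory action are all assumptions made for illustration, not the authors' published algorithm.

```python
from collections import defaultdict


def r_lambda_step(R, trace, rho, s, a, r, s_next, actions,
                  beta=0.1, alpha=0.01, lam=0.5):
    """Apply one tabular update in place and return the new average-reward estimate.

    R and trace are dicts keyed by (state, action); rho estimates the average reward.
    """
    best_next = max(R[(s_next, b)] for b in actions)
    best_here = max(R[(s, b)] for b in actions)
    was_greedy = R[(s, a)] == best_here  # assumption: greedy test before the update

    # Average-adjusted temporal-difference error (no discount factor).
    delta = r - rho + best_next - R[(s, a)]

    trace[(s, a)] += 1.0  # accumulating eligibility trace
    for key in list(trace):
        R[key] += beta * delta * trace[key]
        # Decay by lambda only; cut all traces after an exploratory action.
        trace[key] = lam * trace[key] if was_greedy else 0.0

    if was_greedy:
        rho += alpha * (r - rho + best_next - best_here)
    return rho


R = defaultdict(float)
trace = defaultdict(float)
rho = r_lambda_step(R, trace, 0.0, s=0, a=0, r=1.0, s_next=1,
                    actions=[0, 1], beta=0.5, alpha=0.1, lam=0.5)
```

With `lam=0` the traces vanish after every step and the sketch reduces to a one-step R-learning update, which is the baseline the abstract compares against.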
Funding: Project (HgdJG00401D04) supported by the National 921 Manned Space Project Foundation of China; Project (SKLRS200803B) supported by the Self-Planned Task Foundation of the State Key Laboratory of Robotics and System (HIT) of China; Project (CDAZ98502211) supported by China's "World Class University (985)" Project Foundation; Project (50975055) supported by the National Natural Science Foundation of China.
Abstract: To obtain direct solutions for a parallel manipulator in real time without divergence, a modified global Newton-Raphson (MGNR) algorithm was proposed for forward kinematics analysis of a six-degree-of-freedom (DOF) parallel manipulator. Based on the geometrical frame of the parallel manipulator, the highly nonlinear kinematics equations were derived analytically. The MGNR algorithm was developed for these nonlinear equations based on Taylor expansion and Newton-Raphson iteration. The MGNR procedure was programmed in Matlab/Simulink and compiled to a real-time computer with Microsoft Visual Studio .NET for implementation. The performance of the MGNR algorithm for the 6-DOF parallel manipulator was analyzed and confirmed. Applying the MGNR algorithm, the real generalized pose of the moving platform is solved from the given set of actuator positions. The theoretical analysis and numerical results indicate that the presented method reaches a numerically convergent solution in less than 1 ms with high accuracy (1×10⁻⁹ m in linear motion and 1×10⁻⁹ rad in angular motion), even when the initial guess is far from the root.
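The abstract does not describe how MGNR modifies the basic iteration, so the sketch below swaps in a plain Newton-Raphson step with step halving, one common way to keep the iteration from diverging when the initial guess is poor. It uses a planar toy problem (locating a point from its distances to two anchors, a loose analogue of leg-length constraints) rather than the 6-DOF equations; `damped_newton`, `residual`, and `jacobian` are hypothetical names.

```python
import math


def residual(p, anchors, lengths):
    """Squared-distance constraints: zero when p is at the given lengths."""
    x, y = p
    return [(x - ax) ** 2 + (y - ay) ** 2 - L ** 2
            for (ax, ay), L in zip(anchors, lengths)]


def jacobian(p, anchors):
    x, y = p
    return [[2 * (x - ax), 2 * (y - ay)] for ax, ay in anchors]


def norm(v):
    return math.sqrt(sum(c * c for c in v))


def damped_newton(p, anchors, lengths, tol=1e-12, max_iter=50):
    """Newton-Raphson for a 2x2 system with step halving (assumes a nonsingular Jacobian)."""
    for _ in range(max_iter):
        f = residual(p, anchors, lengths)
        if norm(f) < tol:
            break
        (a, b), (c, d) = jacobian(p, anchors)
        det = a * d - b * c
        # Solve J * step = -f by Cramer's rule.
        dx = (-f[0] * d + b * f[1]) / det
        dy = (-a * f[1] + c * f[0]) / det
        # Halve the step until the residual norm decreases.
        t = 1.0
        while t > 1e-8:
            trial = (p[0] + t * dx, p[1] + t * dy)
            if norm(residual(trial, anchors, lengths)) < norm(f):
                break
            t *= 0.5
        p = (p[0] + t * dx, p[1] + t * dy)
    return p


anchors = [(0.0, 0.0), (1.0, 0.0)]
lengths = [math.sqrt(1.25), math.sqrt(1.25)]
x, y = damped_newton((3.0, 2.0), anchors, lengths)
```

Starting from (3, 2), the iteration settles on the solution with positive y, (0.5, 1.0); the mirror solution (0.5, -1.0) also satisfies the constraints, which mirrors the multiple-solution issue of forward kinematics.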
Funding: Supported by the Innovative Research Groups of the National Natural Science Foundation of China (No. 51621092), the National Basic Research Program of China ("973" Program, No. 2013CB035904), and the National Natural Science Foundation of China (No. 51439005).
Abstract: During the storehouse surface rolling construction of a core rockfill dam, the spreading thickness of the dam face is an important factor that affects the construction quality of the dam storehouse rolling surface and the overall quality of the entire dam. Currently, spreading thickness is monitored and controlled during dam construction by manual sampling checks after spreading, which makes it difficult to monitor the entire dam storehouse surface. In this paper, we present an in-depth study based on real-time monitoring and control theory of storehouse surface rolling construction and obtain the rolling compaction thickness by analyzing the construction track of the rolling machine. By comparison, the traditional method can only analyze the rolling thickness of the dam storehouse surface after it has been compacted and cannot determine the thickness of the dam storehouse surface in real time. To solve these problems, our system monitors the construction progress of the leveling machine and employs a real-time spreading thickness monitoring model based on the K-nearest neighbor algorithm. Taking the LHK core rockfill dam in Southwest China as an example, we performed real-time monitoring of the spreading thickness and conducted real-time interactive queries regarding the spreading thickness. This approach provides a new method for controlling the spreading thickness of the core rockfill dam storehouse surface.
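The abstract names the K-nearest neighbor algorithm but not how it is applied, so the following is a hedged sketch of one plausible use: estimating the surface elevation at a query location from the k nearest positioning samples, then taking the difference between the current and previous layers as the spreading thickness. The sample format `(x, y, z)`, the function names, and the plain averaging are assumptions for illustration only.

```python
import math


def knn_elevation(samples, query, k=3):
    """Average the elevation z of the k samples nearest to query = (x, y)."""
    nearest = sorted(samples, key=lambda s: math.dist(s[:2], query))[:k]
    return sum(z for _, _, z in nearest) / len(nearest)


def spreading_thickness(current, previous, query, k=3):
    """Thickness estimate: current-layer elevation minus previous-layer elevation."""
    return knn_elevation(current, query, k) - knn_elevation(previous, query, k)


# Hypothetical positioning samples (x, y, elevation) in metres.
previous_layer = [(0, 0, 10.0), (1, 0, 10.0), (0, 1, 10.0), (5, 5, 12.0)]
current_layer = [(0, 0, 10.8), (1, 0, 10.8), (0, 1, 10.8), (5, 5, 12.0)]
thickness = spreading_thickness(current_layer, previous_layer, (0.2, 0.2))
```

For the hypothetical samples above, the three nearest points on each layer differ by 0.8 m, so the estimate at (0.2, 0.2) is about 0.8 m.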
Abstract: Aim: To find a more efficient learning method based on temporal difference learning for delayed reinforcement learning tasks. Methods: A Q-learning algorithm based on truncated TD(λ) (TTD(λ)), with adaptive schemes for selecting the value of λ and aimed at absorbing Markov decision processes, was presented and implemented on computers. Results and Conclusion: Simulations on shortest-path search problems show that using an adaptive λ in Q-learning based on TTD(λ) can speed up convergence.
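As a hedged illustration of the truncated TD(λ) idea, the sketch below computes a truncated λ-return backwards over a short window of stored experience. For a Q-learning variant, each entry of `next_values` would be the greedy value max_a Q(s', a) of the corresponding successor state. The function name and argument layout are assumptions, and the adaptive λ selection scheme from the abstract is not reproduced because the abstract does not describe it.

```python
def truncated_lambda_return(rewards, next_values, gamma=0.9, lam=0.5):
    """Truncated lambda-return for the first step of an m-step window.

    rewards[k] is the reward of step k in the window and next_values[k] is the
    value estimate of the state reached after step k.
    """
    z = next_values[-1]  # bootstrap from the last state in the window
    for r, v in zip(reversed(rewards), reversed(next_values)):
        # Blend the longer return z with the one-step bootstrap value v.
        z = r + gamma * (lam * z + (1.0 - lam) * v)
    return z


one_step = truncated_lambda_return([1.0, 1.0], [2.0, 3.0], gamma=0.5, lam=0.0)
full = truncated_lambda_return([1.0, 1.0], [2.0, 3.0], gamma=0.5, lam=1.0)
```

With `lam=0.0` the result is the ordinary one-step target (2.0 here), and with `lam=1.0` it is the two-step discounted return plus the bootstrapped tail (2.25 here); intermediate values interpolate between the two.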
Funding: National Key Basic Research Program of China (973 Program) under Grants No. 2012CB315802 and No. 2013CB329102; National Natural Science Foundation of China under Grants No. 61171102 and No. 61132001; New Generation Broadband Wireless Mobile Communication Network Key Projects for Science and Technology Development under Grant No. 2011ZX03002-002-01; Beijing Nova Program under Grant No. 2008B50; and Beijing Higher Education Young Elite Teacher Project under Grant No. YETP0478.
Abstract: Unlike the general online social network (OSN), the location-based mobile social network (LMSN), which seamlessly integrates mobile computing and social computing technologies, has unique characteristics of temporal, spatial, and social correlation. Recommending friends instantly based on the current location of users in the real world has become increasingly popular in LMSNs. However, existing friend recommendation methods, which rely on the topological structure of a social network or on non-topological information such as similar user profiles, cannot adequately support making friends instantly in the real world. In this article, we analyze users' check-in behavior in a real LMSN site named Gowalla. Based on this analysis, we present an approach for recommending friends instantly to LMSN users that simultaneously considers real-time physical location proximity, offline behavior similarity, and friendship network information in the virtual community. This approach effectively bridges the gap between the offline behavior of users in the real world and online friendship network information in the virtual community. Finally, we use the real user check-in dataset of Gowalla to verify the effectiveness of our approach.
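The abstract lists three signals (location proximity, offline behavior similarity, friendship network information) without giving the scoring formula. The sketch below shows one hypothetical way to combine them: an exponentially decaying distance term plus Jaccard overlaps of visited venues and of friend sets, mixed with fixed weights. The weights, the 1 km distance scale, the dictionary fields, and the function names are all assumptions, not the authors' model.

```python
import math


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) pairs in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def jaccard(x, y):
    """Overlap of two sets, 0.0 when both are empty."""
    union = x | y
    return len(x & y) / len(union) if union else 0.0


def friend_score(user, cand, weights=(0.4, 0.3, 0.3), scale_km=1.0):
    """Weighted mix of proximity, check-in overlap, and shared friends."""
    proximity = math.exp(-haversine_km(user["loc"], cand["loc"]) / scale_km)
    behavior = jaccard(user["venues"], cand["venues"])
    network = jaccard(user["friends"], cand["friends"])
    return weights[0] * proximity + weights[1] * behavior + weights[2] * network


me = {"loc": (30.27, -97.74), "venues": {"cafe", "gym"}, "friends": {"u1", "u2"}}
near = {"loc": (30.27, -97.74), "venues": {"cafe", "gym"}, "friends": {"u1", "u2"}}
far = {"loc": (31.27, -97.74), "venues": {"cafe", "gym"}, "friends": {"u1", "u2"}}
ranked = sorted([far, near], key=lambda c: friend_score(me, c), reverse=True)
```

In this toy data the two candidates differ only in location, so the nearby one ranks first; in practice the weights would need to be tuned against real check-in data.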
Funding: Supported by the key project of the National Natural Science Foundation of China (NSFC No. 50739002); the National Science Council of Taibei of China (NSC 97-2625-M-019-001); and the Open Research Fund Program of the State Key Laboratory of Hydraulics and River Engineering, Sichuan University, China (No. 1001). Financial support from the above organizations is fully acknowledged.
Abstract: Floods are one of the most common natural hazards occurring all around the world. However, the origins of a flood and its possible magnitude in a given region remain unclear. This lack of understanding is particularly acute in mountainous regions with large degrees in Sichuan Province, China, where runoff is seldom measured. The nature of streamflow in a region is related to the temporal and spatial distribution of rainfall quantity and to watershed geomorphology. The geomorphologic characteristics are the channel network and surrounding landscape, which transform the rainfall input into an output hydrograph at the outlet of the watershed. Given the geomorphologic properties of the watershed, the hydrological response function can theoretically be determined hydraulically without using any recorded data from past rainfall or runoff events. In this study, a kinematic-wave-based geomorphologic instantaneous unit hydrograph (KW-GIUH) model was adopted and verified to estimate runoff in ungauged areas. Two mountain watersheds in Sichuan, the Yingjing River watershed and the Tianquan River watershed, were selected as study sites. The geomorphologic factors of the two watersheds were obtained by using a digital elevation model (DEM) based on the topographic database obtained from NASA's Shuttle Radar Topography Mission. The model was tested on the two watersheds at both gauged and ungauged sites. Comparison between the simulated and observed hydrographs for a number of rainstorms at the gauged sites indicated the potential of the KW-GIUH model as a useful tool for runoff analysis in these regions. Moreover, to simulate possible concentrated rainstorms that could result in serious flooding in these areas, synthetic rainfall hyetographs were adopted as input to the KW-GIUH model to obtain the flow hydrographs at two ungauged sites for different return period conditions. Hydroeconomic analysis can be performed in the future to select the optimum design return period for determining the flood control work.
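To illustrate how a unit hydrograph turns a rainfall hyetograph into a flow hydrograph, the sketch below convolves effective rainfall with a discrete unit hydrograph. It does not derive the unit hydrograph from kinematic-wave travel times and geomorphology as the KW-GIUH model does; the ordinates used here are made-up placeholder values and `direct_runoff` is a hypothetical name.

```python
def direct_runoff(effective_rain, unit_hydrograph):
    """Discrete convolution: flow[n] = sum over m of rain[m] * uh[n - m]."""
    n = len(effective_rain) + len(unit_hydrograph) - 1
    flow = [0.0] * n
    for i, p in enumerate(effective_rain):
        for j, u in enumerate(unit_hydrograph):
            flow[i + j] += p * u
    return flow


# Placeholder inputs: two rainfall pulses and a three-ordinate unit hydrograph
# whose ordinates sum to 1, so total runoff volume equals total effective rain.
flow = direct_runoff([1.0, 2.0], [0.5, 0.3, 0.2])
```

For these placeholder inputs the result is approximately [0.5, 1.3, 0.8, 0.4], and the ordinates sum to the total effective rainfall of 3.0.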