Policy evaluation (PE) is a critical sub-problem in reinforcement learning: it estimates the value function for a given policy and can be used for policy improvement. However, current PE methods still have limitations, such as low sample efficiency and local convergence, especially on complex tasks. In this study, a novel PE algorithm called Least-Squares Truncated Temporal-Difference learning (LST2D) is proposed. In LST2D, an adaptive truncation mechanism is designed that takes advantage of both the fast convergence of Least-Squares Temporal Difference learning and the asymptotic convergence of Temporal Difference (TD) learning. Two feature pre-training methods are then used to improve the approximation ability of LST2D. Furthermore, an Actor-Critic algorithm based on LST2D and pre-trained feature representations (ACLPF) is proposed, in which LST2D is integrated into the critic network to improve learning-prediction efficiency. Comprehensive simulation studies were conducted on four robotic tasks, and the results illustrate the effectiveness of LST2D. The proposed ACLPF algorithm outperformed DQN, ACER and PPO in terms of sample efficiency and stability, which demonstrates that LST2D can be applied to online learning control problems by incorporating it into the actor-critic architecture.
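The contrast that LST2D builds on, least-squares TD converging from few samples versus incremental TD converging asymptotically, can be illustrated on a toy problem. The sketch below is not the LST2D algorithm and does not implement its adaptive truncation; it assumes a hypothetical two-state Markov reward process with one-hot features, and all names (`TRANSITIONS`, `lstd`, `td0`) are illustrative.

```python
# Toy comparison of LSTD and TD(0) policy evaluation (illustrative only;
# this is NOT LST2D and omits its adaptive truncation mechanism).
# Assumed problem: state 0 -> state 1 with reward 1, state 1 -> state 0
# with reward 0, discount 0.5. True values: V(0) = 4/3, V(1) = 2/3.

GAMMA = 0.5
TRANSITIONS = [(0, 1.0, 1), (1, 0.0, 0)]  # (state, reward, next_state)


def lstd(samples, gamma):
    """Solve A w = b with A = sum phi(s)(phi(s) - gamma*phi(s'))^T, b = sum phi(s)*r."""
    a = [[0.0, 0.0], [0.0, 0.0]]
    b = [0.0, 0.0]
    for s, r, s_next in samples:
        a[s][s] += 1.0           # phi(s) phi(s)^T for one-hot features
        a[s][s_next] -= gamma    # -gamma * phi(s) phi(s')^T
        b[s] += r
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    # Cramer's rule for the 2x2 system.
    w0 = (b[0] * a[1][1] - a[0][1] * b[1]) / det
    w1 = (a[0][0] * b[1] - a[1][0] * b[0]) / det
    return [w0, w1]


def td0(samples, gamma, step_size, sweeps):
    """Incremental TD(0): v(s) += step_size * (r + gamma*v(s') - v(s))."""
    v = [0.0, 0.0]
    for _ in range(sweeps):
        for s, r, s_next in samples:
            v[s] += step_size * (r + gamma * v[s_next] - v[s])
    return v


v_lstd = lstd(TRANSITIONS, GAMMA)                      # one pass over the data
v_td_short = td0(TRANSITIONS, GAMMA, 0.1, sweeps=10)   # still far from the target
v_td_long = td0(TRANSITIONS, GAMMA, 0.1, sweeps=2000)  # close to the target
```

On this deterministic toy chain, LSTD recovers the true values from a single pass over the two transitions, whereas TD(0) with a step size of 0.1 needs many sweeps to get close. This is the kind of trade-off an adaptive switch between the two regimes would aim to exploit.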
Temporal relation computation is one of the tasks in extracting temporal arguments from events, and it is also the ultimate goal of temporal information processing. However, temporal relation computation based on machine learning requires a great deal of manual annotation and the exploration of more features from the discourse. To address the shortcomings of current temporal relation computation between two events, this paper proposes a two-stage machine learning method for temporal relation computation (TSMLTRC). The first stage obtains the main temporal attributes of each event by classification learning. The second stage computes the event temporal relations in the discourse, using the results of the first stage as basic features together with some new linguistic characteristics. Experiments show that, compared with the artificial golden rule, the computational efficiency of the first stage is much higher, and that the F1-score of event temporal relations computed by combining multiple features may be increased to 85.8% in the second stage.
Aim: To find a more efficient learning method based on temporal difference learning for delayed reinforcement learning tasks. Methods: A Q-learning algorithm based on truncated TD(λ), with adaptive schemes for selecting the value of λ and aimed at absorbing Markov decision processes, was presented and implemented on computers. Results and Conclusion: Simulations on shortest-path search problems show that using an adaptive λ in Q-learning based on TTD(λ) can speed up its convergence.
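The truncated TD(λ) idea can be sketched through its update target, the λ-return cut off after m steps. The function below uses the standard backward recursion for that target. It is a sketch under assumptions: it does not reproduce the paper's adaptive λ selection scheme, and the names (`truncated_lambda_return`, `rewards`, `next_values`) are hypothetical.

```python
def truncated_lambda_return(rewards, next_values, gamma, lam):
    """Truncated lambda-return over m = len(rewards) steps.

    rewards[k]     : reward received at step t + k
    next_values[k] : current value estimate of the state reached at step t + k + 1
    Backward recursion: G = r_k + gamma * ((1 - lam) * V_{k+1} + lam * G),
    started from G = V_m, the estimate at the truncation point.
    """
    if len(rewards) != len(next_values) or not rewards:
        raise ValueError("rewards and next_values must be non-empty and equal in length")
    g = next_values[-1]
    for r, v_next in zip(reversed(rewards), reversed(next_values)):
        g = r + gamma * ((1.0 - lam) * v_next + lam * g)
    return g


# Illustrative numbers only (not taken from the paper).
rewards = [1.0, 2.0, 3.0]
next_values = [10.0, 20.0, 30.0]
one_step = truncated_lambda_return(rewards, next_values, gamma=0.5, lam=0.0)
m_step = truncated_lambda_return(rewards, next_values, gamma=0.5, lam=1.0)
```

With `lam=0.0` the target reduces to the one-step TD target (here 1 + 0.5 × 10 = 6.0); with `lam=1.0` it becomes the full 3-step return bootstrapped at the truncation point (here 6.5). Intermediate λ values interpolate between the two, which is the quantity an adaptive scheme would tune.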
The effect of Batroxobin on the expression of neural cell adhesion molecule (NCAM) in left temporal ischemic rats with spatial memory disorder was investigated by means of the Morris water maze and immunohistochemical methods. The results showed that the mean reaction time and distance of temporal ischemic rats in searching for a goal were significantly longer than those of sham-operated rats, and that NCAM expression in the left temporal ischemic region was significantly increased. However, the mean reaction time and distance of Batroxobin-treated rats were shorter, and they used normal strategies more often and earlier than the ischemic rats. The number of NCAM immunoreactive cells in Batroxobin-treated rats was greater than in the ischemic group. In conclusion, Batroxobin can improve spatial memory disorder in temporal ischemic rats, and the regulation of NCAM expression is probably related to the neuroprotective mechanism.
The effect of Batroxobin on spatial memory disorder in left temporal ischemic rats and on the expression of HSP32 and HSP70 was investigated with the Morris water maze and immunohistochemical methods. The results showed that the mean reaction time and distance of temporal ischemic rats in searching for a goal were significantly longer than those of the sham-operated rats, and that HSP32 and HSP70 expression in the left temporal ischemic region was significantly increased compared with the sham-operated rats. However, the mean reaction time and distance of the Batroxobin-treated rats were shorter, and they used normal strategies more often and earlier than the ischemic rats. The number of HSP32 and HSP70 immunoreactive cells in Batroxobin-treated rats was also lower than in the ischemic group. In conclusion, Batroxobin can improve spatial memory disorder in temporal ischemic rats, and the down-regulation of HSP32 and HSP70 expression is probably related to the attenuation of ischemic injury.
The medial temporal lobe (MTL) has been assigned a central role in human episodic memory and learning. Evidence for this comes from PET and fMRI studies as well as lesion studies. This study aimed to compare the effect of atrophy across repeated trials of a supraspan test of memory. The study included patients with Alzheimer's Disease, Mild Cognitive Impairment, and Subjective Memory Disorders as well as Controls (n = 116). The supraspan test used was the Rey Auditory Verbal Learning Test (RAVLT). Extreme groups with high (Stanine 6 - 9) and low (Stanine 1 - 4) intracranial proportions (IP) of MTL were compared across the five trials of the RAVLT. The rate of learning was significantly higher among subjects with high MTL IP than among those with low MTL IP in both hemispheres. The rate of list learning did not differ with education or age and, interestingly, the list learning rates among subjects with high and low Lateral Temporal Lobe IPs were almost the same. The hemispheric differences in learning rate were small and not significant. Results are discussed in terms of hippocampal involvement in the associative processes necessary for supraspan list learning.
Background: Sepsis, a potentially fatal inflammatory disease triggered by infection, carries significant health implications worldwide. Timely detection is crucial, as sepsis can rapidly escalate if left undetected. Recent advancements in deep learning (DL) offer powerful tools to address this challenge. Aim: This study proposed a hybrid CNNBDLSTM, a combination of a convolutional neural network (CNN) with a bi-directional long short-term memory (BDLSTM) model, to predict sepsis onset. The proposed model provides a robust framework that capitalizes on the complementary strengths of both architectures, resulting in more accurate and timelier predictions. Method: The proposed sepsis prediction method uses temporal feature extraction to delineate six distinct time frames before the onset of sepsis. These time frames adhere to the Sepsis-3 standard requirement, which incorporates 12-h observation windows preceding sepsis onset. All models were trained using the Medical Information Mart for Intensive Care III (MIMIC-III) dataset, from which 61,522 patients with 40 clinical variables obtained from the IoT medical environment were sourced. The confusion matrix, the area under the receiver operating characteristic curve (AUCROC), the accuracy, the precision, the F1-score, and the recall were used to evaluate the models. Result: The CNNBDLSTM model demonstrated superior performance compared to the benchmark and other models, achieving an AUCROC of 99.74% and an accuracy of 99.15% one hour before sepsis onset. These results indicate that the CNNBDLSTM model is highly effective in predicting sepsis onset, particularly within one hour of onset. Implication: The results could assist practitioners in increasing the patient's chance of survival one hour before sepsis onset.
Accurate wind power forecasting is critical for system integration and stability as reliance on renewable energy grows. Traditional approaches frequently struggle with complex data and non-linear relationships. This article presents a novel approach to hybrid ensemble learning that is based on rigorous requirements engineering concepts. The approach identifies significant parameters influencing forecasting accuracy by evaluating real-time Modern-Era Retrospective Analysis for Research and Applications (MERRA2) data from several European wind farms using in-depth stakeholder research and requirements elicitation. Ensemble learning is used to develop a robust model, while a temporal convolutional network handles time-series complexities and data gaps. The ensemble-temporal neural network is enhanced by providing different input parameters, including training layers, hidden and dropout layers, and activation and loss functions. The proposed framework is further analyzed by comparing it with state-of-the-art forecasting models in terms of Root Mean Squared Error (RMSE) and Mean Absolute Error (MAE). The energy efficiency performance indicators showed that the proposed model achieves error reductions of approximately 16.67%, 28.57%, and 81.92% for MAE, and 38.46%, 17.65%, and 90.78% for RMSE, for MERRA Wind farms 1, 2, and 3, respectively, compared to other existing methods. These quantitative results show the effectiveness of the proposed model, with MAE values ranging from 0.0010 to 0.0156 and RMSE values ranging from 0.0014 to 0.0174. This work highlights the effectiveness of requirements engineering in wind power forecasting, leading to enhanced forecast accuracy and grid stability, ultimately paving the way for more sustainable energy solutions.
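The error-reduction percentages quoted above can be read as relative reductions against a baseline model. The abstract does not state the formula, so the sketch below assumes the usual definition, 100 × (baseline − proposed) / baseline. The function names and the toy forecasts are hypothetical and are not data from the study.

```python
import math


def mae(actual, predicted):
    """Mean Absolute Error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root Mean Squared Error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def error_reduction_pct(baseline_error, proposed_error):
    """Relative error reduction in percent (assumed definition)."""
    return 100.0 * (baseline_error - proposed_error) / baseline_error


# Made-up toy series, for illustration only.
actual = [1.0, 2.0, 3.0, 4.0]
baseline_forecast = [1.5, 2.5, 2.5, 3.5]   # off by 0.5 everywhere
proposed_forecast = [1.0, 2.0, 3.0, 3.0]   # exact except one error of 1.0

mae_reduction = error_reduction_pct(mae(actual, baseline_forecast),
                                    mae(actual, proposed_forecast))
rmse_reduction = error_reduction_pct(rmse(actual, baseline_forecast),
                                     rmse(actual, proposed_forecast))
```

In this toy case the MAE drops from 0.5 to 0.25 while the RMSE stays at 0.5, because RMSE penalizes the single large error more heavily. This is one reason the MAE and RMSE reduction percentages for the same model can differ substantially.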
Universal lesion detection (ULD) methods for computed tomography (CT) images play a vital role in modern clinical medicine and intelligent automation. It is well known that single 2D CT slices lack spatial-temporal characteristics and contextual information compared to 3D CT blocks. However, 3D CT blocks require significantly more hardware resources during the learning phase. Therefore, efficiently exploiting the temporal correlation and spatial-temporal features of 2D CT slices is crucial for ULD tasks. In this paper, we propose a ULD network with enhanced temporal correlation for this purpose, named TCE-Net. The designed TCE module is applied to enrich the discriminative feature representation of multiple sequential CT slices. In addition, we employ multi-scale feature maps to facilitate the localization and detection of lesions of various sizes. Extensive experiments conducted on the DeepLesion benchmark demonstrate that this method achieves 66.84% and 78.18% for FS@0.5 and FS@1.0, respectively, outperforming the compared state-of-the-art methods.
Video salient object detection (VSOD) aims at locating the most attractive objects in a video by exploring the spatial and temporal features. VSOD poses a challenging task in computer vision, as it involves processing complex spatial data that is also influenced by temporal dynamics. Despite the progress made in existing VSOD models, they still struggle in scenes of great background diversity within and between frames. Additionally, they encounter difficulties related to accumulated noise and high time consumption during the extraction of temporal features over a long-term duration. We propose a multi-stream temporal enhanced network (MSTENet) to address these problems. It investigates saliency cues collaboration in the spatial domain with a multi-stream structure to deal with the great background diversity challenge. A straightforward yet efficient approach for temporal feature extraction is developed to avoid the accumulated noise and reduce time consumption. The distinction between MSTENet and other VSOD methods stems from its incorporation of both foreground supervision and background supervision, facilitating enhanced extraction of collaborative saliency cues. Another notable differentiation is the innovative integration of spatial and temporal features, wherein the temporal module is integrated into the multi-stream structure, enabling comprehensive spatial-temporal interactions within an end-to-end framework. Extensive experimental results demonstrate that the proposed method achieves state-of-the-art performance on five benchmark datasets while maintaining a real-time speed of 27 fps (Titan XP). Our code and models are available at https://github.com/RuJiaLe/MSTENet.
Latent variable models can effectively determine the condition of essential rotating machinery without needing labeled data. These models analyze vibration data via an unsupervised learning strategy. Temporal preservation is necessary to obtain an informative latent manifold for the fault diagnosis task. In a temporal-preserving context, two approaches exist for developing a condition-monitoring methodology: offline and online. The training modes available for latent variable models are the same two. While many traditional methods use offline training, online training can dynamically adjust the latent manifold, possibly leading to better fault signature extraction from the vibration data. This study explores online training using temporal-preserving latent variable models. Within online training, there are two main methods: one focuses on reconstructing the data and the other on interpreting the data components. Both are considered in order to evaluate how they diagnose faults over time. Using two experimental datasets, the study confirms that models from both training modes can detect changes in machinery health and identify faults even under varying conditions. Importantly, the complementarity of offline and online models is emphasized, confirming their versatility in fault diagnostics. Understanding the implications of the training approach and the available model formulations is crucial for further research in latent variable model-based fault diagnostics.
Aim: To investigate the model-free multi-step average reward reinforcement learning algorithm. Methods: By combining R-learning algorithms with temporal difference learning (TD(λ) learning) algorithms for average reward problems, a novel incremental algorithm, called R(λ) learning, was proposed. Results and Conclusion: The proposed algorithm is a natural extension of Q(λ) learning, the multi-step discounted reward reinforcement learning algorithm, to the average reward case. Simulation results show that R(λ) learning with intermediate λ values yields a significant performance improvement over simple R-learning.
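For orientation, the simple R-learning baseline that R(λ) extends can be sketched as a single tabular update. This is one common textbook-style variant of one-step R-learning, written under assumptions: it has no eligibility traces (so it is not R(λ)), presentations differ on whether the average-reward estimate is updated from pre-update or post-update values, and the names (`r_learning_step`, `r_values`, `rho`) are hypothetical.

```python
def r_learning_step(r_values, rho, s, a, reward, s_next, beta, alpha):
    """One tabular R-learning update; returns the new average-reward estimate rho.

    r_values : dict mapping state -> list of relative action values (updated in place)
    rho      : current estimate of the average reward per step
    beta     : step size for the action values
    alpha    : step size for rho
    """
    max_here = max(r_values[s])
    max_next = max(r_values[s_next])
    was_greedy = r_values[s][a] == max_here
    # Action-value update, measured relative to the average reward rho.
    r_values[s][a] += beta * (reward - rho + max_next - r_values[s][a])
    # rho is adjusted only after greedy actions (this variant uses pre-update values).
    if was_greedy:
        rho += alpha * (reward - rho + max_next - max_here)
    return rho


# Illustrative two-state, two-action table initialized to zero.
r_values = {0: [0.0, 0.0], 1: [0.0, 0.0]}
rho = r_learning_step(r_values, 0.0, s=0, a=0, reward=1.0, s_next=1, beta=0.5, alpha=0.1)
```

After this single step the updated entry is `r_values[0][0] = 0.5` and the average-reward estimate is `rho = 0.1`. A multi-step method such as R(λ) would additionally propagate the same error to recently visited state-action pairs.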
In this work, we combined model-based reinforcement learning (MBRL) and model-free reinforcement learning (MFRL) to stabilize a biped robot (NAO robot) on a rotating platform, where the angular velocity of the platform is unknown to the proposed learning algorithm and treated as an external disturbance. Nonparametric Gaussian processes normally require a large number of training data points to deal with the discontinuity of the estimated model. Although some improved methods, such as probabilistic inference for learning control (PILCO), do not require an explicit global model because the actions are obtained by directly searching the policy space, overfitting and a lack of model complexity may still result in a large deviation between the prediction and the real system. Besides, none of these approaches consider the data error and measurement noise during the training process and test process, respectively. We propose hierarchical Gaussian process (GP) models, containing two layers of independent GPs, from which the physically continuous probability transition model of the robot is obtained. Owing to the physically continuous estimation, the algorithm overcomes the overfitting problem with a guaranteed model complexity, and the amount of training data is also reduced. The policy for any given initial state is generated automatically by minimizing the expected cost according to the predefined cost function and the obtained probability distribution of the state. Furthermore, a novel Q(λ)-based MFRL scheme is employed to improve the policy. Simulation results show that the proposed RL algorithm is able to balance the NAO robot on a rotating platform, and that it is capable of adapting to a platform with varying angular velocity.
The iterated prisoner's dilemma (IPD) is an ideal model for analyzing interactions between agents in complex networks. It has attracted wide interest in the development of novel strategies since the success of tit-for-tat in Axelrod's tournament. This paper studies a new adaptive strategy for the IPD in different complex networks, where agents can learn and adapt their strategies through reinforcement learning. A temporal difference learning method is applied in designing the adaptive strategy to optimize the decision-making process of the agents. Previous studies indicated that mutual cooperation rarely emerges in the IPD. Therefore, three examples based on a square lattice network and a scale-free network are provided to show two features of the adaptive strategy. First, mutual cooperation can be achieved by a group of adaptive agents in a scale-free network, and once evolution has converged to mutual cooperation, it is unlikely to shift. Second, the adaptive strategy can earn a better payoff than other strategies in the square network. The analytical properties are discussed to verify the evolutionary stability of the adaptive strategy.
In this paper, we summarize recent progress in deep learning based acoustic models, together with the motivation and insights behind the surveyed techniques. We first discuss models such as recurrent neural networks (RNNs) and convolutional neural networks (CNNs) that can effectively exploit variable-length contextual information, and their various combinations with other models. We then describe models that are optimized end-to-end, with emphasis on feature representations learned jointly with the rest of the system, the connectionist temporal classification (CTC) criterion, and the attention-based sequence-to-sequence translation model. We further illustrate robustness issues in speech recognition systems, and discuss acoustic model adaptation, speech enhancement and separation, and robust training strategies. We also cover modeling techniques that lead to more efficient decoding and discuss possible future directions in acoustic model research.
In this paper, we investigate a spectrum-sensing system in the presence of a satellite, where the satellite works as a sensing node. Because the conventional energy detection method is sensitive to noise uncertainty, a temporal convolutional network (TCN) based spectrum-sensing method is designed to eliminate the effect of the noise uncertainty and improve the performance of spectrum sensing, relying on offline training and online detection stages. Specifically, in the offline training stage, spectrum data captured by the satellite is sent to the TCN deployed on the gateway for training. In the online detection stage, the well-trained TCN is used to perform real-time spectrum sensing, which can improve spectrum-sensing performance by exploiting the temporal features. Simulation results demonstrate that the proposed method achieves a higher probability of detection than conventional energy detection (ED), the convolutional neural network (CNN), and the deep neural network (DNN). Furthermore, the proposed method has a lower computational complexity than the CNN and the DNN.
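The sensitivity of conventional energy detection to noise uncertainty can be seen in a few lines. The sketch below is a generic energy detector, not the paper's TCN method or its simulation setup. It assumes the decision threshold is set as a multiple of an estimated noise power, and the names (`energy_statistic`, `energy_detect`, `threshold_factor`) and sample values are hypothetical.

```python
def energy_statistic(samples):
    """Average energy of real or complex baseband samples."""
    return sum(abs(x) ** 2 for x in samples) / len(samples)


def energy_detect(samples, noise_power, threshold_factor):
    """Declare the channel occupied if the energy exceeds threshold_factor * noise_power."""
    return energy_statistic(samples) > threshold_factor * noise_power


# Made-up received samples with an average energy of 2.0.
received = [1 + 1j, -1 + 1j, 1 - 1j, -1 - 1j]

# Same samples, two different noise-power estimates: the decision flips.
decision_nominal = energy_detect(received, noise_power=1.0, threshold_factor=1.5)
decision_higher_noise_estimate = energy_detect(received, noise_power=1.5, threshold_factor=1.5)
```

Because the threshold scales with the assumed noise power, an error in that estimate changes the decision for identical received samples (occupied with an estimate of 1.0, idle with 1.5). A learned detector that exploits temporal structure aims to reduce this dependence.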
The success of intelligent transportation systems relies heavily on accurate traffic prediction, in which the question of how to model the underlying spatial-temporal information in traffic data has come under the spotlight. Most existing frameworks use separate modules for modeling spatial and temporal correlations. However, this stepwise pattern may limit the effectiveness and efficiency of spatial-temporal feature extraction and cause important information to be overlooked in some steps. Furthermore, modeling based on a given spatial adjacency graph (e.g., one derived from the geodesic distance or approximate connectivity) lacks sufficient guidance from prior information and may not reflect the actual interaction between nodes. To overcome these limitations, our paper proposes a spatial-temporal graph synchronous aggregation (STGSA) model to extract the localized and long-term spatial-temporal dependencies simultaneously. Specifically, a tailored graph aggregation method in the vertex domain is designed to extract spatial and temporal features in one graph convolution process. In each STGSA block, we devise a directed temporal correlation graph to represent the localized and long-term dependencies between nodes, and the potential temporal dependence is further fine-tuned by an adaptive weighting operation. Meanwhile, we construct an elaborate spatial adjacency matrix to represent the road sensor graph by considering both physical distance and node similarity in a data-driven manner. Then, inspired by the multi-head attention mechanism, which can jointly emphasize information from different representation subspaces, we construct a multi-stream module based on the STGSA blocks to capture global information. It projects the embedding input repeatedly with multiple different channels. Finally, the predicted values are generated by stacking several multi-stream modules. Extensive experiments are conducted on six real-world datasets, and numerical results show that the proposed STGSA model significantly outperforms the benchmarks.
Accurate prediction of future events brings great benefits and reduces losses for society in many domains, such as civil unrest, pandemics, and crimes. A knowledge graph is a general language for describing and modeling complex systems. Different types of events continually occur, and they are often related to historical and concurrent events. In this paper, we formalize future event prediction as a temporal knowledge graph reasoning problem. Most existing studies either conduct reasoning on static knowledge graphs or assume that knowledge graphs of all timestamps are available during the training process. As a result, they cannot effectively reason over temporal knowledge graphs and predict events happening in the future. To address this problem, some recent works learn to infer future events based on historical event-based temporal knowledge graphs. However, these methods do not comprehensively consider the latent patterns and influences behind historical events and concurrent events simultaneously. This paper proposes a new graph representation learning model, namely the Recurrent Event Graph ATtention Network (RE-GAT), based on a novel historical and concurrent events attention-aware mechanism that models the event knowledge graph sequence recurrently. More specifically, RE-GAT uses an attention-based historical events embedding module to encode past events, and employs an attention-based concurrent events embedding module to model the associations of events at the same timestamp. A translation-based decoder module and a learning objective are developed to optimize the embeddings of entities and relations. We evaluate the proposed method on four benchmark datasets. Extensive experimental results demonstrate the superiority of the RE-GAT model compared with various baselines, which shows that our method can more accurately predict what events are going to happen.
BACKGROUND: At present, most studies use the clinic memory scale to evaluate learning memory ability, and the results are influenced by differences in the measurement conditions of individuals. OBJECTIVE: To study the correlation between regional cerebral blood flow (rCBF) perfusion and learning memory function in specific brain regions of patients with cerebral infarction in the convalescent period, and to try to find a method that can quantitatively evaluate learning ability. DESIGN: Case observation and correlation analysis. SETTINGS: Shandong Institute for Behavioral Medicine; the Affiliated Hospital of Jining Medical College. PARTICIPANTS: A total of 70 patients with cerebral infarction admitted to the Department of Neurology, Jining Medical College, between January 2004 and December 2005 were involved. The patients, 58 male and 12 female, had a mean age of (52±3) years, and all were right-handed. They all met the diagnostic criteria established by the Fourth National Conference on Cerebrovascular Disease, and cerebral infarction was confirmed by skull CT or MRI. Informed consent for the examined items was obtained from all patients and their relatives. METHODS: When the patients were in the convalescent period, their learning and memory ability was measured with the "clinic memory scale (set A)". The 18 patients whose total score was over 100 were assigned to the good learning memory function group; the 23 patients whose total score was less than 70 were assigned to the poor learning memory function group. The rCBF of the hippocampus, nucleus amygdalae, temporal cortex and prefrontal lobe was measured by single photon emission computed tomography (SPECT) and compared between the two groups. The total scores of the 18 patients with good learning memory and the 23 patients with poor learning memory were taken as the dependent variable Y, and their rCBFs of the hippocampus, nucleus amygdalae, temporal cortex and prefrontal lobe, respectively, as the independent variable X for linear correlation analysis. MAIN OUTCOME MEASURES: Correlation between rCBF in different brain regions and learning memory ability in patients with cerebral infarction. RESULTS: ① The rCBF of the hippocampus, nucleus amygdalae, temporal cortex and prefrontal cortex in the good learning memory function group was significantly higher than in the poor learning memory function group (P < 0.05). ② In the good learning memory function group, the rCBF of the hippocampus, nucleus amygdalae, temporal cortex and prefrontal cortex was significantly positively correlated with memory scale scores (r = 0.961, 0.926, 0.954, 0.907, P < 0.05), as it was in the poor learning memory function group (r = 0.979, 0.976, 0.991, 0.953, P < 0.05). CONCLUSION: The rCBF of the hippocampus, nucleus amygdalae, temporal cortex and prefrontal cortex of patients with cerebral infarction is significantly positively correlated with memory scale scores. Predicting the learning memory ability of patients by quantitative determination of rCBF provides a quantitative and objective method for evaluating learning memory ability.
Funding (LST2D policy evaluation study): Joint Funds of the National Natural Science Foundation of China, Grant/Award Number: U21A20518; National Natural Science Foundation of China, Grant/Award Numbers: 62106279, 61903372.
Funding: Project supported by the National Natural Science Foundation of China (Grant No. 60975033), the Basic Scientific Research Project of International Centre for Bamboo and Rattan (Grant No. 1632009006), and the Shanghai Leading Academic Discipline Project (Grant No. J50103).
Abstract: Temporal relation computation is one of the tasks in extracting temporal arguments from events, and it is also the ultimate goal of temporal information processing. However, temporal relation computation based on machine learning requires a large amount of manual annotation and the exploration of more features from discourse. To address the shortcomings of current temporal relation computation between two events, this paper proposes a two-stage machine learning method for temporal relation computation (TSMLTRC). The first stage obtains the main temporal attributes of each event by classification learning. The second stage computes the temporal relation between events in the discourse, using the results of the first stage as basic features together with some new linguistic characteristics. Experiments show that, compared with the artificial golden rule, the computational efficiency of the first stage is much higher, and that the F1-score of event temporal relations computed by combining multiple features may be increased to 85.8% in the second stage.
Abstract: Aim: To find a more efficient learning method based on temporal difference learning for delayed reinforcement learning tasks. Methods: A Q-learning algorithm based on truncated TD(λ) with adaptive schemes for selecting the value of λ, addressed to absorbing Markov decision processes, was presented and implemented on computers. Results and Conclusion: Simulations on shortest-path search problems show that using an adaptive λ in Q-learning based on TTD(λ) can speed up its convergence.
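The truncated TD(λ) idea named in the abstract above can be illustrated with a short sketch. The abstract does not give the update equations, so the code uses the generic recursive form of a λ-return truncated after m steps, z_k = r_k + γ[(1 − λ)V(s_{k+1}) + λ z_{k+1}], bootstrapping from the value estimate at the last step. The function name, the argument names and the fixed (non-adaptive) λ are assumptions made for illustration, not the authors' algorithm.

```python
def truncated_lambda_return(rewards, next_values, gamma, lam):
    """Truncated lambda-return for the first state of an m-step window.

    rewards[k]     : reward observed at step k of the window (k = 0..m-1)
    next_values[k] : current value estimate of the state reached after step k
    gamma          : discount factor
    lam            : trace parameter lambda in [0, 1] (fixed here; hypothetical
                     stand-in for the adaptive scheme mentioned in the abstract)
    """
    if not rewards or len(rewards) != len(next_values):
        raise ValueError("rewards and next_values must be non-empty and equal in length")

    # Last step of the window: plain one-step bootstrapped target.
    z = rewards[-1] + gamma * next_values[-1]

    # Walk backwards, mixing the bootstrapped value and the longer return.
    for r, v_next in zip(reversed(rewards[:-1]), reversed(next_values[:-1])):
        z = r + gamma * ((1.0 - lam) * v_next + lam * z)
    return z
```

With λ = 0 the result reduces to the one-step TD target r_0 + γV(s_1); with λ = 1 it becomes the m-step return that bootstraps only at the end of the window.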
Abstract: The effect of Batroxobin on the expression of neural cell adhesion molecule (NCAM) in left temporal ischemic rats with spatial memory disorder was investigated by means of the Morris water maze and immunohistochemical methods. The results showed that the mean reaction time and distance of temporal ischemic rats in searching for a goal were significantly longer than those of sham-operated rats, and at the same time NCAM expression in the left temporal ischemic region was significantly increased. However, the mean reaction time and distance of Batroxobin-treated rats were shorter, and they used normal strategies more often and earlier than the ischemic rats. The number of NCAM-immunoreactive cells in Batroxobin-treated rats was greater than that in the ischemic group. In conclusion, Batroxobin can improve the spatial memory disorder of temporal ischemic rats, and the regulation of NCAM expression is probably related to the neuroprotective mechanism.
Abstract: The effect of Batroxobin on the spatial memory disorder of left temporal ischemic rats and on the expression of HSP32 and HSP70 was investigated with the Morris water maze and immunohistochemical methods. The results showed that the mean reaction time and distance of temporal ischemic rats in searching for a goal were significantly longer than those of sham-operated rats, and at the same time HSP32 and HSP70 expression in the left temporal ischemic region was significantly increased compared with the sham-operated rats. However, the mean reaction time and distance of the Batroxobin-treated rats were shorter, and they used normal strategies more often and earlier than the ischemic rats. The number of HSP32- and HSP70-immunoreactive cells in Batroxobin-treated rats was also lower than that in the ischemic group. In conclusion, Batroxobin can improve the spatial memory disorder of temporal ischemic rats, and the down-regulation of HSP32 and HSP70 expression is probably related to the attenuation of ischemic injury.
Abstract: The medial temporal lobe (MTL) has been assigned a central role in human episodic memory and learning. Evidence for this comes from PET and fMRI studies as well as lesion studies. This study aimed to compare the effect of atrophy across repeated trials of a supraspan test of memory. The study included patients with Alzheimer's Disease, Mild Cognitive Impairment, and Subjective Memory Disorders as well as Controls (n = 116). The supraspan test used was the Rey Auditory Verbal Learning Test (RAVLT). Extreme groups with high (Stanine 6-9) and low (Stanine 1-4) intracranial proportions (IP) of MTL were compared across the five trials of the RAVLT. Subjects with high MTL IP had a significantly higher rate of learning than those with low MTL IP in both hemispheres. There was no difference in the rate of list learning due to education or age and, interestingly, the list learning rates of subjects with high and low Lateral Temporal Lobe IPs were very similar. The hemispheric differences in learning rate were small and not significant. Results are discussed in terms of hippocampal involvement in the associative processes necessary for supraspan list learning.
Funding: Deputyship for Research & Innovation, Ministry of Education in Saudi Arabia, which funded this research work through Project Number RI-44-0214.
Abstract: Background: Sepsis, a potentially fatal inflammatory disease triggered by infection, carries significant health implications worldwide. Timely detection is crucial, as sepsis can rapidly escalate if left undetected. Recent advancements in deep learning (DL) offer powerful tools to address this challenge. Aim: This study proposes a hybrid CNNBDLSTM, a combination of a convolutional neural network (CNN) with a bi-directional long short-term memory (BDLSTM) model, to predict sepsis onset. The proposed model provides a robust framework that capitalizes on the complementary strengths of both architectures, resulting in more accurate and timelier predictions. Method: The sepsis prediction method proposed here uses temporal feature extraction to delineate six distinct time frames before the onset of sepsis. These time frames adhere to the Sepsis-3 standard requirement, which incorporates 12-h observation windows preceding sepsis onset. All models were trained using the Medical Information Mart for Intensive Care III (MIMIC-III) dataset, which sourced 61,522 patients with 40 clinical variables obtained from the IoT medical environment. The confusion matrix, the area under the receiver operating characteristic curve (AUCROC), the accuracy, the precision, the F1-score, and the recall were used to evaluate the models. Result: The CNNBDLSTM model demonstrated superior performance compared to the benchmark and other models, achieving an AUCROC of 99.74% and an accuracy of 99.15% one hour before sepsis onset. These results indicate that the CNNBDLSTM model is highly effective in predicting sepsis onset, particularly within one hour of onset. Implication: The results could help practitioners increase the patient's chance of survival by acting one hour before sepsis onset.
Abstract: Accurate wind power forecasting is critical for system integration and stability as reliance on renewable energy grows. Traditional approaches frequently struggle with complex data and non-linear connections. This article presents a novel approach to hybrid ensemble learning that is based on rigorous requirements engineering concepts. The approach finds significant parameters influencing forecasting accuracy by evaluating real-time Modern-Era Retrospective Analysis for Research and Applications (MERRA2) data from several European wind farms using in-depth stakeholder research and requirements elicitation. Ensemble learning is used to develop a robust model, while a temporal convolutional network handles time-series complexities and data gaps. The ensemble-temporal neural network is enhanced by providing different input parameters, including training layers, hidden and dropout layers, and activation and loss functions. The proposed framework is further analyzed by comparison with state-of-the-art forecasting models in terms of Root Mean Squared Error (RMSE) and Mean Absolute Error (MAE). The energy efficiency performance indicators showed that the proposed model achieves error reductions of approximately 16.67%, 28.57%, and 81.92% for MAE, and 38.46%, 17.65%, and 90.78% for RMSE, for MERRA wind farms 1, 2, and 3, respectively, compared to other existing methods. These quantitative results show the effectiveness of the proposed model, with MAE values ranging from 0.0010 to 0.0156 and RMSE values ranging from 0.0014 to 0.0174. This work highlights the effectiveness of requirements engineering in wind power forecasting, leading to enhanced forecast accuracy and grid stability and ultimately paving the way for more sustainable energy solutions.
Funding: Taishan Young Scholars Program of Shandong Province; Key Development Program for Basic Research of Shandong Province (ZR2020ZD44).
Abstract: Universal lesion detection (ULD) methods for computed tomography (CT) images play a vital role in modern clinical medicine and intelligent automation. It is well known that single 2D CT slices lack spatial-temporal characteristics and contextual information compared to 3D CT blocks. However, 3D CT blocks require significantly more hardware resources during the learning phase. Therefore, efficiently exploiting the temporal correlation and spatial-temporal features of 2D CT slices is crucial for ULD tasks. In this paper, we propose a ULD network with enhanced temporal correlation for this purpose, named TCE-Net. The designed TCE module is applied to enrich the discriminative feature representation of multiple sequential CT slices. In addition, we employ multi-scale feature maps to facilitate the localization and detection of lesions of various sizes. Extensive experiments on the DeepLesion benchmark demonstrate that this method achieves 66.84% and 78.18% for FS@0.5 and FS@1.0, respectively, outperforming the compared state-of-the-art methods.
Funding: Funded by the Natural Science Foundation of China (NSFC) under Grant No. 62203192.
Abstract: Video salient object detection (VSOD) aims at locating the most attractive objects in a video by exploring spatial and temporal features. VSOD is a challenging task in computer vision, as it involves processing complex spatial data that is also influenced by temporal dynamics. Despite the progress made by existing VSOD models, they still struggle in scenes with great background diversity within and between frames. They also suffer from accumulated noise and high time consumption when extracting temporal features over a long duration. We propose a multi-stream temporal enhanced network (MSTENet) to address these problems. It investigates the collaboration of saliency cues in the spatial domain with a multi-stream structure to deal with the challenge of great background diversity. A straightforward yet efficient approach to temporal feature extraction is developed to avoid accumulated noise and reduce time consumption. MSTENet differs from other VSOD methods in that it incorporates both foreground supervision and background supervision, which facilitates enhanced extraction of collaborative saliency cues. Another notable difference is the integration of spatial and temporal features: the temporal module is integrated into the multi-stream structure, enabling comprehensive spatial-temporal interactions within an end-to-end framework. Extensive experimental results demonstrate that the proposed method achieves state-of-the-art performance on five benchmark datasets while maintaining a real-time speed of 27 fps (Titan XP). Our code and models are available at https://github.com/RuJiaLe/MSTENet.
Abstract: Latent variable models can effectively determine the condition of essential rotating machinery without needing labeled data. These models analyze vibration data via an unsupervised learning strategy. Temporal preservation is necessary to obtain an informative latent manifold for the fault diagnosis task. In a temporal-preserving context, two approaches exist to develop a condition-monitoring methodology: offline and online. The same two training modes are available for latent variable models. While many traditional methods use offline training, online training can dynamically adjust the latent manifold, possibly leading to better fault signature extraction from the vibration data. This study explores online training using temporal-preserving latent variable models. Within online training, there are two main methods: one focuses on reconstructing the data and the other on interpreting the data components. Both are considered in order to evaluate how they diagnose faults over time. Using two experimental datasets, the study confirms that models from both training modes can detect changes in machinery health and identify faults even under varying conditions. Importantly, the complementarity of offline and online models is emphasized, confirming their versatility in fault diagnostics. Understanding the implications of the training approach and the available model formulations is crucial for further research in latent variable model-based fault diagnostics.
Abstract: Aim: To investigate model-free multi-step average-reward reinforcement learning algorithms. Methods: By combining R-learning with temporal difference learning (TD(λ) learning) for average-reward problems, a novel incremental algorithm, called R(λ) learning, was proposed. Results and Conclusion: The proposed algorithm is a natural extension of Q(λ) learning, the multi-step discounted-reward reinforcement learning algorithm, to the average-reward case. Simulation results show that R(λ) learning with intermediate λ values yields a significant performance improvement over simple R-learning.
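As a point of reference for the abstract above, the sketch below shows one update of simple one-step tabular R-learning, the baseline that R(λ) learning is said to extend. It is not the R(λ) algorithm itself, whose trace-based update is not given in the abstract. The function name, the table layout and the choice to test greediness before the value update are assumptions made for illustration.

```python
def r_learning_step(q, rho, s, a, r, s_next, alpha, beta):
    """One update of simple tabular R-learning (average-reward setting).

    q      : list of lists, q[state][action], updated in place
    rho    : current estimate of the average reward per step
    alpha  : step size for the action values
    beta   : step size for the average-reward estimate
    Returns the updated rho.
    """
    # Assumption: the action counts as greedy if it was maximal before the update.
    was_greedy = q[s][a] == max(q[s])

    # Average-reward TD error: the reward is measured relative to rho,
    # and there is no discount factor.
    td_error = r - rho + max(q[s_next]) - q[s][a]
    q[s][a] += alpha * td_error

    # The average reward is adjusted only on greedy (non-exploratory) steps.
    if was_greedy:
        rho += beta * (r - rho + max(q[s_next]) - max(q[s]))
    return rho
```

A multi-step variant in the spirit of R(λ) would replace the single-entry update of `q[s][a]` with an update spread over recently visited state-action pairs through eligibility traces.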
Abstract: In this work, we combine model-based reinforcement learning (MBRL) and model-free reinforcement learning (MFRL) to stabilize a biped robot (NAO robot) on a rotating platform, where the angular velocity of the platform is unknown to the proposed learning algorithm and is treated as an external disturbance. Nonparametric Gaussian processes normally require a large number of training data points to deal with the discontinuity of the estimated model. Although some improved methods, such as probabilistic inference for learning control (PILCO), do not require an explicit global model because the actions are obtained by directly searching the policy space, overfitting and a lack of model complexity may still result in a large deviation between the prediction and the real system. Moreover, none of these approaches consider the data error during training or the measurement noise during testing. We propose a hierarchical Gaussian process (GP) model, containing two layers of independent GPs, from which a physically continuous probability transition model of the robot is obtained. Because the estimation is physically continuous, the algorithm overcomes the overfitting problem with a guaranteed model complexity, and the amount of training data is also reduced. The policy for any given initial state is generated automatically by minimizing the expected cost according to the predefined cost function and the obtained probability distribution of the state. Furthermore, a novel Q(λ)-based MFRL scheme is employed to improve the policy. Simulation results show that the proposed RL algorithm is able to balance the NAO robot on a rotating platform and can adapt to a platform with varying angular velocity.
Funding: Supported by the National Natural Science Foundation (NNSF) of China (61603196, 61503079, 61520106009, 61533008); the Natural Science Foundation of Jiangsu Province of China (BK20150851); China Postdoctoral Science Foundation (2015M581842); Jiangsu Postdoctoral Science Foundation (1601259C); Nanjing University of Posts and Telecommunications Science Foundation (NUPTSF) (NY215011); Priority Academic Program Development of Jiangsu Higher Education Institutions; the open fund of Key Laboratory of Measurement and Control of Complex Systems of Engineering, Ministry of Education (MCCSE2015B02); and the Research Innovation Program for College Graduates of Jiangsu Province (CXLX1309).
Abstract: The iterated prisoner's dilemma (IPD) is an ideal model for analyzing interactions between agents in complex networks. It has attracted wide interest in the development of novel strategies since the success of tit-for-tat in Axelrod's tournament. This paper studies a new adaptive strategy for the IPD in different complex networks, where agents can learn and adapt their strategies through reinforcement learning. A temporal difference learning method is applied in designing the adaptive strategy to optimize the decision-making process of the agents. Previous studies indicated that mutual cooperation is hard to achieve in the IPD. Therefore, three examples based on a square lattice network and a scale-free network are provided to show two features of the adaptive strategy. First, mutual cooperation can be achieved by a group of adaptive agents in a scale-free network, and once evolution has converged to mutual cooperation, it is unlikely to shift. Second, the adaptive strategy can earn a better payoff than other strategies in the square lattice network. The analytical properties are discussed to verify the evolutionary stability of the adaptive strategy.
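The game underlying the abstract above can be sketched in a few lines. The code plays an iterated prisoner's dilemma between two fixed strategies, tit-for-tat and always-defect; it does not implement the paper's temporal-difference adaptive strategy or any network structure. The payoff values 5/3/1/0 are the conventional ones and are an assumption here, since the abstract does not state them; all function names are hypothetical.

```python
# Conventional payoffs (assumed): key = (move of A, move of B), value = (payoff A, payoff B).
# "C" = cooperate, "D" = defect.
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}


def tit_for_tat(opponent_history):
    """Cooperate first, then repeat the opponent's previous move."""
    return opponent_history[-1] if opponent_history else "C"


def always_defect(opponent_history):
    """Defect regardless of what the opponent has done."""
    return "D"


def play(strategy_a, strategy_b, rounds):
    """Play the iterated game and return the total payoffs (A, B)."""
    moves_a, moves_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        # Each strategy sees only the opponent's past moves.
        a = strategy_a(moves_b)
        b = strategy_b(moves_a)
        pay_a, pay_b = PAYOFF[(a, b)]
        score_a += pay_a
        score_b += pay_b
        moves_a.append(a)
        moves_b.append(b)
    return score_a, score_b
```

Over five rounds, `play(tit_for_tat, always_defect, 5)` gives tit-for-tat 4 points and the defector 9, while two tit-for-tat players earn 15 each, which shows why sustained mutual cooperation pays better than mutual defection even though a single defection is tempting.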
Abstract: In this paper, we summarize recent progress in deep learning based acoustic models and the motivation and insights behind the surveyed techniques. We first discuss models such as recurrent neural networks (RNNs) and convolutional neural networks (CNNs) that can effectively exploit variable-length contextual information, and their various combinations with other models. We then describe models that are optimized end-to-end, with emphasis on feature representations learned jointly with the rest of the system, the connectionist temporal classification (CTC) criterion, and the attention-based sequence-to-sequence translation model. We further illustrate robustness issues in speech recognition systems, and discuss acoustic model adaptation, speech enhancement and separation, and robust training strategies. We also cover modeling techniques that lead to more efficient decoding and discuss possible future directions in acoustic model research.
Funding: The National Science Foundation of China (No. 91738201, 61971440); the Jiangsu Province Basic Research Project (No. BK20192002); the China Postdoctoral Science Foundation (No. 2018M632347); and the Natural Science Research of Higher Education Institutions of Jiangsu Province (No. 18KJB510030).
Abstract: In this paper, we investigate a spectrum-sensing system in the presence of a satellite, where the satellite works as a sensing node. Because the conventional energy detection method is sensitive to noise uncertainty, a temporal convolutional network (TCN) based spectrum-sensing method is designed to eliminate the effect of noise uncertainty and improve spectrum-sensing performance, relying on an offline training stage and an online detection stage. Specifically, in the offline training stage, spectrum data captured by the satellite is sent to the TCN deployed on the gateway for training. In the online detection stage, the trained TCN is used to perform real-time spectrum sensing, which improves spectrum-sensing performance by exploiting temporal features. Simulation results demonstrate that the proposed method achieves a higher probability of detection than conventional energy detection (ED), the convolutional neural network (CNN), and the deep neural network (DNN). Furthermore, the proposed method has a lower computational complexity than the CNN and the DNN.
Funding: Partially supported by the National Key Research and Development Program of China (2020YFB2104001).
Abstract: The success of intelligent transportation systems relies heavily on accurate traffic prediction, in which the modeling of the underlying spatial-temporal information in traffic data has come under the spotlight. Most existing frameworks use separate modules to model spatial and temporal correlations. However, this stepwise pattern may limit the effectiveness and efficiency of spatial-temporal feature extraction and cause important information to be overlooked in some steps. Furthermore, modeling based on a given spatial adjacency graph (e.g., one derived from the geodesic distance or approximate connectivity) lacks sufficient guidance from prior information and may not reflect the actual interaction between nodes. To overcome these limitations, our paper proposes a spatial-temporal graph synchronous aggregation (STGSA) model to extract localized and long-term spatial-temporal dependencies simultaneously. Specifically, a tailored graph aggregation method in the vertex domain is designed to extract spatial and temporal features in one graph convolution process. In each STGSA block, we devise a directed temporal correlation graph to represent the localized and long-term dependencies between nodes, and the potential temporal dependence is further fine-tuned by an adaptive weighting operation. Meanwhile, we construct an elaborate spatial adjacency matrix to represent the road sensor graph by considering both physical distance and node similarity in a data-driven manner. Then, inspired by the multi-head attention mechanism, which can jointly emphasize information from different representation subspaces, we construct a multi-stream module based on the STGSA blocks to capture global information. It projects the embedding input repeatedly with multiple different channels. Finally, the predicted values are generated by stacking several multi-stream modules. Extensive experiments are conducted on six real-world datasets, and numerical results show that the proposed STGSA model significantly outperforms the benchmarks.
Funding: Supported by the National Natural Science Foundation of China under grant U19B2044 and the National Key Research and Development Program of China (2021YFC3300500).
Abstract: Accurate prediction of future events brings great benefits and reduces losses for society in many domains, such as civil unrest, pandemics, and crimes. The knowledge graph is a general language for describing and modeling complex systems. Different types of events continually occur, and they are often related to historical and concurrent events. In this paper, we formalize future event prediction as a temporal knowledge graph reasoning problem. Most existing studies either conduct reasoning on static knowledge graphs or assume that the knowledge graphs of all timestamps are available during the training process. As a result, they cannot effectively reason over temporal knowledge graphs and predict events happening in the future. To address this problem, some recent works learn to infer future events based on historical event-based temporal knowledge graphs. However, these methods do not comprehensively consider the latent patterns and influences behind historical events and concurrent events simultaneously. This paper proposes a new graph representation learning model, namely the Recurrent Event Graph ATtention Network (RE-GAT), based on a novel historical and concurrent events attention-aware mechanism that models the event knowledge graph sequence recurrently. More specifically, RE-GAT uses an attention-based historical events embedding module to encode past events, and employs an attention-based concurrent events embedding module to model the associations of events at the same timestamp. A translation-based decoder module and a learning objective are developed to optimize the embeddings of entities and relations. We evaluate the proposed method on four benchmark datasets. Extensive experimental results demonstrate the superiority of the RE-GAT model over various baselines, which indicates that our method can more accurately predict which events are going to happen.
Funding: Grant from the Bureau of Science and Technology of Jining City, No. 2004JH006.
Abstract: BACKGROUND: At present, the clinic memory scale is used to evaluate learning and memory ability in most studies, and results are influenced by differences in the measurement conditions of individuals. OBJECTIVE: To study the correlation between regional cerebral blood flow (rCBF) perfusion and learning and memory function in specific brain regions of patients with cerebral infarction in the convalescent period, and to try to find a method that can quantitatively evaluate learning ability. DESIGN: Case observation and correlation analysis. SETTINGS: Shandong Institute for Behavioral Medicine; the Affiliated Hospital of Jining Medical College. PARTICIPANTS: A total of 70 patients with cerebral infarction admitted to the Department of Neurology, Jining Medical College between January 2004 and December 2005 were involved. The patients, 58 male and 12 female, had a mean age of (52±3) years and were all right-handed. They all met the diagnostic criteria instituted by the Fourth National Conference on Cerebrovascular Disease, and cerebral infarction was confirmed by cranial CT or MRI. Informed consent for the examinations was obtained from all patients and their relatives. METHODS: When the patients were in the convalescent period, their learning and memory ability was measured with the "clinic memory scale (set A)". The 18 patients whose total score was over 100 formed the good learning and memory function group; the 23 patients whose total score was less than 70 formed the poor learning and memory function group. The rCBF of the hippocampus, nucleus amygdalae, temporal cortex and prefrontal lobe of patients in the two groups was measured by single photon emission computed tomography (SPECT) and compared. The total scores of the 18 patients with good learning and memory function and the 23 patients with poor learning and memory function were taken as the dependent variable Y, and their rCBF values in the hippocampus, nucleus amygdalae, temporal cortex and prefrontal lobe were each taken as the independent variable X for linear correlation analysis. MAIN OUTCOME MEASURES: Correlation between rCBF in different brain regions and learning and memory ability in patients with cerebral infarction. RESULTS: ① The rCBF of the hippocampus, nucleus amygdalae, temporal cortex and prefrontal cortex in the good learning and memory function group was significantly higher than in the poor learning and memory function group (P < 0.05). ② In the good learning and memory function group, rCBF of the hippocampus, nucleus amygdalae, temporal cortex and prefrontal cortex was significantly positively correlated with memory scale scores (r = 0.961, 0.926, 0.954, 0.907, P < 0.05), as it was in the poor learning and memory function group (r = 0.979, 0.976, 0.991, 0.953, P < 0.05). CONCLUSION: The rCBF of the hippocampus, nucleus amygdalae, temporal cortex and prefrontal cortex of patients with cerebral infarction is significantly positively correlated with memory scale scores. Quantitative determination of rCBF provides a quantitative and objective method for evaluating the learning and memory ability of patients.