For critical engineering systems such as aircraft and aerospace vehicles, accurate Remaining Useful Life (RUL) prediction not only saves cost but, more importantly, is of great significance in ensuring system reliability and preventing disaster. RUL is affected not only by a system's intrinsic deterioration, but also by the operational conditions under which the system operates. This paper proposes an RUL prediction approach to estimate the mean RUL of a continuously degrading system that operates under dynamic operational conditions and is subject to condition monitoring at short, equidistant intervals. The dynamic nature of the operational conditions is described by a discrete-time Markov chain, and their influence on the degradation signal is quantified by degradation rates and signal jumps in the degradation model. The novelty of the proposed approach lies in formulating the RUL prediction problem in a semi-Markov decision process framework, by which the system mean RUL can be obtained by solving a limited number of equations. To extend the use of the proposed approach in real applications, different failure standards for different operational conditions are also considered. The application and effectiveness of the approach are illustrated with a turbofan engine dataset and a comparison with existing results for the same dataset.
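As a rough illustration of how a mean RUL can come out of a small set of linear equations, the sketch below computes the mean number of monitoring intervals until failure for a toy discrete-time Markov chain. It is not the paper's semi-Markov decision process formulation: the function name `mean_time_to_failure`, the three-state chain, and its transition probabilities are all hypothetical.

```python
def mean_time_to_failure(P, failed):
    """Solve m_i = 1 + sum_{j working} P[i][j] * m_j for every working state i.

    P is a row-stochastic transition matrix (list of lists); `failed` is the
    set of absorbing failure states. Returns {state: mean steps to failure}.
    """
    working = [i for i in range(len(P)) if i not in failed]
    n = len(working)
    # Augmented system (I - Q | 1), where Q restricts P to the working states.
    A = [
        [(1.0 if r == c else 0.0) - P[working[r]][working[c]] for c in range(n)] + [1.0]
        for r in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        pivot_value = A[col][col]
        A[col] = [v / pivot_value for v in A[col]]
        for r in range(n):
            if r != col:
                factor = A[r][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return {working[r]: A[r][n] for r in range(n)}


# Hypothetical chain: 0 = healthy, 1 = degraded, 2 = failed (absorbing).
P = [
    [0.9, 0.1, 0.0],
    [0.0, 0.8, 0.2],
    [0.0, 0.0, 1.0],
]
mean_rul = mean_time_to_failure(P, failed={2})
print(mean_rul)  # mean number of monitoring intervals to failure, per state
```

With one equation per working state, the size of the system grows with the number of discretized degradation levels and operating conditions rather than with the length of the monitoring history.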
This paper considers a first passage model for discounted semi-Markov decision processes with denumerable states and nonnegative costs. The criterion to be optimized is the expected discounted cost incurred during a first passage time to a given target set. We first construct a semi-Markov decision process under a given semi-Markov decision kernel and a policy. Then, using a minimum nonnegative solution approach, we prove that the value function satisfies the optimality equation and that an optimal (or ε-optimal) stationary policy exists under suitable conditions. We further give some properties of optimal policies. In addition, a value iteration algorithm for computing the value function and optimal policies is developed, and an example is given. Finally, it is shown that our model is an extension of the first passage models for both discrete-time and continuous-time Markov decision processes.
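The following is a minimal sketch of value iteration for a first passage discounted-cost problem, written for a finite toy model rather than the denumerable-state setting of the paper. It assumes each state-action pair carries a one-step cost, an expected discount factor `beta` over the sojourn time (taken here to be independent of the next state), and a transition distribution. The model layout, the function name, and the numbers are hypothetical, and the iteration starts from zero in the spirit of a minimum nonnegative solution.

```python
def first_passage_value_iteration(model, target, tol=1e-10, max_iter=10_000):
    """model[state][action] = (cost, beta, {next_state: prob}).

    Target states have value 0 and need no actions. Returns (V, policy).
    """
    def q_value(V, state, action):
        cost, beta, transitions = model[state][action]
        return cost + beta * sum(p * V[nxt] for nxt, p in transitions.items())

    V = {s: 0.0 for s in model}  # start from the zero function
    for _ in range(max_iter):
        new_V = {
            s: 0.0 if s in target else min(q_value(V, s, a) for a in model[s])
            for s in model
        }
        delta = max(abs(new_V[s] - V[s]) for s in model)
        V = new_V
        if delta < tol:
            break
    policy = {
        s: min(model[s], key=lambda a: q_value(V, s, a))
        for s in model
        if s not in target
    }
    return V, policy


# Hypothetical two-state example: "A" is the only non-target state.
model = {
    "A": {
        "slow": (1.0, 0.9, {"A": 0.5, "T": 0.5}),
        "fast": (2.0, 0.9, {"T": 1.0}),
    },
    "T": {},  # target state
}
V, policy = first_passage_value_iteration(model, target={"T"})
print(V, policy)
```

In this toy model the "slow" action satisfies V = 1 + 0.45 V, which gives V = 1/0.55, and that is cheaper than the cost of 2 for "fast".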
This paper investigates the Borel state space semi-Markov decision process (SMDP) with the criterion of expected total rewards in a semi-Markov environment. It describes a system that behaves like an SMDP except that the system is influenced by its environment, which is modeled by a semi-Markov process. We transform the SMDP in a semi-Markov environment into an equivalent discrete-time Markov decision process under the condition that rewards are all positive or all negative, and obtain the optimality equation and some of its properties.
An alpha-uniformized Markov chain is defined by the concept of equivalent infinitesimal generator for a semi-Markov decision process (SMDP) with both average and discounted criteria. According to the relations between their performance measures and performance potentials, the optimization of an SMDP can be realized by simulating the chain. For the critic model of neuro-dynamic programming (NDP), a neuro-policy iteration (NPI) algorithm is presented, and a performance error bound is given that accounts for the approximation error and improvement error in each iteration step. The obtained results may be extended to Markov systems and have wide applicability. Finally, a numerical example is provided.
Markov modeling of HIV/AIDS progression was done under the assumption that the state holding time (waiting time) had a constant hazard. This paper discusses the properties of the hazard function of the Exponential distribution and its modifications, namely the parametric proportional hazards (PH) and accelerated failure time (AFT) models, and their effectiveness in modeling the state holding time in Markov modeling of HIV/AIDS progression with and without risk factors. Patients were categorized by gender and age, with female gender as the baseline. Data simulated using R software were fitted to each model, and the model parameters were estimated. The estimated P and Z values were then used to test the null hypothesis that the state waiting time data followed an Exponential distribution. The model identification criteria Akaike information criterion (AIC), Bayesian information criterion (BIC), log-likelihood (LL), and R² were used to evaluate the performance of the models. For the Survival Regression model, P and Z values supported non-rejection of the null hypothesis for mixed gender without interaction, and supported its rejection for mixed gender with an interaction term and for males aged 50-60 years. Both values supported non-rejection of the null hypothesis in the rest of the age groups. For male gender with interaction, both P and Z values supported rejection in all age groups except the 20-30 years group. For the Cox Proportional Hazard and AFT models, both P and Z values supported non-rejection of the null hypothesis across all age groups. The P-values for the three models supported different decisions for and against the null hypothesis, with the AFT and Cox values supporting similar decisions in most of the age groups.
Among the models considered, the regression model provided a superior fit based on the AIC, BIC, LL, and R² model identification criteria. This was particularly evident in age and gender subgroups where the data exhibited non-proportional hazards and violated the assumptions required for the Cox Proportional Hazard model. Moreover, the simplicity of the regression model, along with its ability to capture essential state transitions without overfitting, made it a more appropriate choice.
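To make the constant-hazard assumption and the information criteria concrete, here is a small sketch that fits an Exponential distribution to fully observed holding times by maximum likelihood and reports LL, AIC, and BIC. It ignores censoring and covariates, does not reproduce the study's R analysis, and uses made-up holding times; the function name `exponential_fit` is hypothetical.

```python
import math


def exponential_fit(times):
    """Maximum-likelihood fit of an Exponential(rate) model to uncensored times."""
    n = len(times)
    total = sum(times)
    rate = n / total                         # MLE of the constant hazard
    loglik = n * math.log(rate) - rate * total
    k = 1                                    # one free parameter (the rate)
    return {
        "rate": rate,
        "loglik": loglik,
        "aic": 2 * k - 2 * loglik,
        "bic": k * math.log(n) - 2 * loglik,
    }


# Hypothetical state holding times (arbitrary time units).
holding_times = [1.0, 2.0, 3.0]
fit = exponential_fit(holding_times)
print(fit)
```

Lower AIC and BIC indicate a better trade-off between fit and complexity, so competing holding-time models fitted to the same data can be ranked by these values.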
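To address the problem that the congestion control algorithms of the transmission control protocol (TCP) fail to meet the quality requirements of video transmission, a congestion control algorithm for video transmission based on a semi-Markov decision process is proposed. First, to overcome the poor real-time performance of current video quality assessment methods based on peak signal-to-noise ratio, a no-reference video quality assessment method that can run online is designed. Second, based on video quality feedback from the receiver, congestion control is modeled as a semi-Markov decision process, and the adjustment policy for the congestion control parameters is obtained by solving this model. Simulation results show that, compared with typical existing congestion control algorithms, the proposed algorithm not only has better TCP friendliness but also effectively improves the subjective and objective quality of the decoded video sequences.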
Funding: supported by the National Natural Science Foundation of China (No. 71701008).
Funding: supported by the Natural Science Foundation of China (No. 60874004, 60736028) and the Guangdong Province Universities and Colleges Pearl River Scholar Funded Scheme (2010).