Funding: This research was supported by the National Natural Science Foundation of China (Grant No. 42050102), the Postgraduate Education Reform Project of Jiangsu Province (Grant No. SJCX22_0343), and the Dou Wanchun Expert Workstation of Yunnan Province (No. 202205AF150013).
Abstract: With the rapid development of information technology, the electronification of medical records has gradually become a trend. In China, the population base is huge and the supporting medical institutions are numerous, and this reality drives the conversion of paper medical records to electronic medical records. Electronic medical records are the basis for establishing a smart hospital and an important guarantee for achieving medical intelligence, and the massive amount of electronic medical record data is also an important data set for research in the medical field. However, electronic medical records contain a large amount of private patient information and must be desensitized before they are used as open resources. To solve this problem, this paper proposes data masking for Chinese electronic medical records based on named entity recognition. First, the text is vectorized to satisfy the input format required by the model. Second, because input sentences vary in length and the contextual relationship between sentences is not negligible, a neural network model for named entity recognition based on bidirectional long short-term memory (BiLSTM) with conditional random fields (CRF) is constructed. Finally, the data masking operation is performed on the named entity recognition results, mainly using regular expression filtering encryption and principal component analysis (PCA) word vector compression and replacement. In addition, comparison experiments with the hidden Markov model (HMM), the LSTM-CRF model, and the BiLSTM model are conducted. The experimental results show that the proposed method achieves 92.72% accuracy, 92.30% recall, and a 92.51% F1 score, giving higher accuracy than the other models.
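The masking step described above can be illustrated with a minimal Python sketch. It is not the paper's implementation: it assumes the recognizer has already produced BIO tags for each character, it replaces entities with a fill character rather than encrypting them, and it leaves out the PCA word vector replacement entirely. The names `mask_entities`, `mask_patterns`, `ID_RE`, and `PHONE_RE`, as well as the two patterns themselves, are hypothetical.

```python
import re

# Hypothetical patterns for two identifier types; a real system would tune these.
ID_RE = re.compile(r"(?<!\d)\d{17}[\dXx](?!\d)")   # 18-character resident ID number
PHONE_RE = re.compile(r"(?<!\d)1\d{10}(?!\d)")      # 11-digit mobile number


def mask_entities(tokens, tags, fill="*"):
    """Replace every token whose BIO tag is not 'O' with fill characters."""
    return [fill * len(tok) if tag != "O" else tok
            for tok, tag in zip(tokens, tags)]


def mask_patterns(text, fill="*"):
    """Mask identifiers that a regular expression can catch without a model."""
    text = ID_RE.sub(lambda m: fill * len(m.group()), text)
    # Keep the first three digits of a phone number and hide the rest.
    return PHONE_RE.sub(lambda m: m.group()[:3] + fill * 8, text)


# Assumed recognizer output: one BIO tag per character of the sentence.
tokens = list("患者张三入院")
tags = ["O", "O", "B-PER", "I-PER", "O", "O"]
masked = "".join(mask_entities(tokens, tags))
```

Running the tag-based pass first and the regular expression pass second would let the patterns catch structured identifiers that a recognizer misses, although the order actually used in the paper is not stated.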
Funding: Technology Foundation of Guizhou Province, China (No. QianKeHeJZi[2015]2064); Scientific Research Foundation for Advanced Talents in Guizhou Institute of Technology and Science, China (No. XJGC20150106); Joint Foundation of Guizhou Province, China (No. QianKeHeLHZi[2015]7105).
Abstract: Masked data are system failure data in which the exact component causing the system failure may be unknown. In this paper, a mathematical description of general masked data was presented in the context of software reliability engineering. Furthermore, a general masked-based additive non-homogeneous Poisson process (NHPP) model was considered to analyze component reliability. However, the difficulty with the masked-based additive model lies in estimating its parameters. A maximum likelihood estimation procedure was derived to estimate the parameters. Finally, a numerical example was given to illustrate the applicability of the proposed model, and the immune particle swarm optimization (IPSO) algorithm was used to maximize the log-likelihood function.
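To make the idea of a masked-data likelihood concrete, the following Python sketch evaluates a log-likelihood for an additive NHPP. It rests on assumptions that the abstract does not state: each component follows a Goel-Okumoto NHPP, masking is independent of the true cause, and a failure whose cause is only known to lie in a candidate set contributes the summed intensity of that set. The function names are hypothetical, and the paper's own likelihood may differ.

```python
import math


def go_intensity(t, a, b):
    """Goel-Okumoto failure intensity a*b*exp(-b*t) (assumed component model)."""
    return a * b * math.exp(-b * t)


def go_mean(t, a, b):
    """Goel-Okumoto mean value function a*(1 - exp(-b*t))."""
    return a * (1.0 - math.exp(-b * t))


def masked_loglik(params, failures, T):
    """Log-likelihood of masked failure data on the window [0, T].

    params   : list of (a, b) pairs, one per component
    failures : list of (time, candidate_set); candidate_set holds the indices
               of the components that may have caused that failure
    """
    ll = 0.0
    for t, candidates in failures:
        # A masked failure contributes the summed intensity of its candidates.
        ll += math.log(sum(go_intensity(t, *params[j]) for j in candidates))
    # Expected number of failures over [0, T], summed over all components.
    ll -= sum(go_mean(T, a, b) for a, b in params)
    return ll
```

An optimizer such as the IPSO algorithm mentioned in the abstract would then search over `params` to maximize this function; any general-purpose optimizer could stand in for it in a quick experiment.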
Abstract: In general, simple subsystems such as series or parallel subsystems are integrated to produce a complex hybrid system. The reliability of a system is determined by the reliability of its constituent components. It is often extremely difficult or impossible to obtain specific information about the component that caused the system to fail. Unknown failure causes are instances in which the actual cause of system failure is unknown. On the other hand, thanks to current advanced technology based on computers, automation, and simulation, products have become highly dependable, and as a result, obtaining failure data for testing such exceptionally reliable items has become a very costly and time-consuming procedure. Therefore, because of its capacity to produce adequate failure data in a short period of time, accelerated life testing (ALT) is the most utilized approach in the field of product reliability and life testing. Based on progressively hybrid censored (PrHC) data from a three-component parallel-series hybrid system that failed owing to unknown causes, this paper investigates the challenging problem of parameter estimation and reliability assessment under a step-stress partially accelerated life test (SSPALT). Component failures are assumed to follow a power linear hazard rate (PLHR) distribution, which can be used when the failure rate displays linear, decreasing, increasing, or bathtub patterns. The tampered random variable (TRV) model is used to reflect the effect of the high stress level applied to induce early failure data. The maximum likelihood estimation (MLE) approach is used to estimate the parameters of the PLHR distribution and the acceleration factor. A variance-covariance matrix (VCM) is then obtained to construct approximate confidence intervals (ACIs). In addition, studentized bootstrap confidence intervals (ST-B CIs) are constructed and compared with the ACIs in terms of their respective interval lengths (ILs). Moreover, a simulation study is conducted to demonstrate the performance of the estimation procedures and the methodology discussed in this paper. Finally, real failure data from the air conditioning systems of an airplane are used to further illustrate the performance of the suggested estimation technique.
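The TRV model named above is usually written as a simple transformation of the normal-use lifetime: a unit that survives the stress-change time has its remaining life divided by the acceleration factor. The Python sketch below shows that transformation only. The symbols `tau` (stress-change time) and `beta` (acceleration factor) and the function name are assumptions for illustration, and the sample lifetimes are made up rather than drawn from a PLHR distribution.

```python
def trv_lifetime(t, tau, beta):
    """Lifetime observed under step-stress partial acceleration.

    t    : lifetime the unit would have had at normal stress
    tau  : time at which the stress is raised
    beta : acceleration factor (> 1 shortens the remaining life)
    """
    if beta <= 0:
        raise ValueError("beta must be positive")
    if t <= tau:
        return t                      # failed before the stress change
    return tau + (t - tau) / beta     # remaining life is compressed by beta


# Illustrative normal-use lifetimes (not data from the paper).
normal_use = [0.8, 2.0, 5.0, 11.0]
observed = [trv_lifetime(t, tau=2.0, beta=3.0) for t in normal_use]
```

Units that fail before `tau` are unaffected, which is why the test is called partially accelerated.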
Funding: Supported by the National Natural Science Foundation of China (70471057).
Abstract: We consider a series system of two independent and non-identical components whose lifetimes follow different Burr XII distributions. The maximum likelihood and Bayes estimators of the parameters of the system's components are obtained from masked system life test data. The conclusion is that the Bayes estimates are better than the maximum likelihood estimates in the sense of having smaller mean squared errors.
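As an illustration of how masked life test data enter the likelihood for such a series system, the Python sketch below uses the two-parameter Burr XII form with survival function (1 + t^c)^(-k). This parameterization, the assumption that masking is independent of the true cause, and all function names are assumptions made here and may not match the paper.

```python
import math


def burr_pdf(t, c, k):
    """Burr XII density c*k*t**(c-1) * (1 + t**c)**(-(k+1)) for t > 0."""
    return c * k * t ** (c - 1) * (1.0 + t ** c) ** (-(k + 1))


def burr_sf(t, c, k):
    """Burr XII survival function (1 + t**c)**(-k)."""
    return (1.0 + t ** c) ** (-k)


def series_masked_loglik(params, data):
    """Log-likelihood for a series system with possibly masked causes.

    params : [(c1, k1), (c2, k2)], one pair per component
    data   : list of (failure_time, candidate_set) with candidates in {0, 1}
    """
    ll = 0.0
    for t, candidates in data:
        term = 0.0
        for j in candidates:
            # Component j fails at t while every other component survives t.
            others = math.prod(burr_sf(t, *p)
                               for i, p in enumerate(params) if i != j)
            term += burr_pdf(t, *params[j]) * others
        ll += math.log(term)
    return ll
```

A fully masked observation, with candidate set `{0, 1}`, simply contributes the density of the system lifetime, so it carries information about both components but cannot separate them on its own.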
Funding: Supported by the National Natural Science Foundation of China (71401134, 71571144, 71171164) and the Program of International Cooperation and Exchanges in Science and Technology Funded by Shaanxi Province (2016KW-033).
Abstract: Under Type-II progressively hybrid censoring, this paper discusses statistical inference and optimal design for a step-stress partially accelerated life test of a hybrid system in the presence of masked data. It is assumed that the lifetimes of the components in the hybrid system follow independent and identical modified Weibull distributions. The maximum likelihood estimates (MLEs) of the unknown parameters, the acceleration factor, and the reliability indexes are derived using the Newton-Raphson algorithm. The asymptotic variance-covariance matrix and the approximate confidence intervals are obtained from the normal approximation to the asymptotic distribution of the MLEs of the model parameters. Moreover, two bootstrap confidence intervals are constructed using the parametric bootstrap method. The optimal time for changing stress levels is determined under the D-optimality and A-optimality criteria. Finally, a Monte Carlo simulation study is carried out to illustrate the proposed procedures.