This study presents a kinematic calibration method for an exoskeletal inertial motion capture (EI-MoCap) system that accounts for random colored noise such as gyroscopic drift. In this method, the geometric parameters are first calibrated by the traditional calibration method. Then, to calibrate the parameters affected by the random colored noise, the expectation maximization (EM) algorithm is introduced. Because the EM iterations start from the geometric parameters calibrated by the traditional method, fewer iterations are needed and the efficiency of the proposed method on embedded systems is improved. The performance of the proposed method is compared with that of the traditional calibration method, and its feasibility is verified on the EI-MoCap system. The simulation and the experiment show that motion capture precision improves by 16.79% and 7.16%, respectively, compared with the traditional calibration method.
Intrusion detection is the process of examining information about system activities or data to detect malicious behavior or unauthorized activity. Most intrusion detection systems (IDS) use the K-means clustering technique because of its linear complexity and fast computation. Nonetheless, its naive use of the mean data value as the cluster center is a major drawback. Two circular clusters with different radii can be centered at the same mean, and K-means cannot handle this case because the mean values of the clusters are nearly identical. It also fails when the clusters are not spherical. To overcome this issue, a new hybrid model is proposed that integrates expectation-maximization (EM) clustering using a Gaussian mixture model (GMM) with a naïve Bayes classifier. In this model, the GMM gives more flexibility than K-means in terms of cluster covariance. It also uses probability functions and soft clustering, so a single data point can belong to multiple clusters. In a GMM, the cluster shape is defined by two parameters: the mean and the standard deviation. With these two parameters, a cluster can take any kind of elliptical shape. EM-GMM is used to cluster data, based on data activity, into the corresponding category.
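The soft clustering described in the abstract above can be illustrated with a minimal sketch. It is not the authors' hybrid model: it assumes one-dimensional data, two components, and a fixed number of iterations, and the names `normal_pdf` and `fit_gmm_1d` are hypothetical. Each point receives a responsibility (a probability of membership) for every component, and each component keeps its own variance, which is the flexibility K-means lacks.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_1d(data, n_components=2, n_iter=100):
    """Fit a 1-D Gaussian mixture with EM (illustrative sketch only)."""
    n = len(data)
    lo, hi = min(data), max(data)
    # Assumed initialization: means spread evenly over the data range,
    # every component starting from the overall variance.
    means = [lo + (hi - lo) * (k + 0.5) / n_components for k in range(n_components)]
    overall_mean = sum(data) / n
    overall_var = sum((x - overall_mean) ** 2 for x in data) / n
    variances = [overall_var] * n_components
    weights = [1.0 / n_components] * n_components

    for _ in range(n_iter):
        # E step: soft assignment of each point to each component.
        resp = []
        for x in data:
            joint = [w * normal_pdf(x, m, v)
                     for w, m, v in zip(weights, means, variances)]
            total = sum(joint)
            resp.append([j / total for j in joint])

        # M step: re-estimate weight, mean, and variance per component.
        for k in range(n_components):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var_k = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / nk
            variances[k] = max(var_k, 1e-6)  # guard against a collapsing component

    return weights, means, variances


if __name__ == "__main__":
    rng = random.Random(0)
    sample = [rng.gauss(0.0, 1.0) for _ in range(200)] + \
             [rng.gauss(5.0, 1.0) for _ in range(200)]
    print(fit_gmm_1d(sample))
```

In a multivariate setting the per-component variance becomes a covariance matrix, which is what lets a component take an elliptical shape.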
Classical survival analysis assumes that all subjects will experience the event of interest, but in some cases a portion of the population may never encounter the event. These survival methods further assume independent survival times, which is not valid for honey bees, which live in nests. The study introduces a semi-parametric marginal proportional hazards mixture cure (PHMC) model with an exchangeable correlation structure, using generalized estimating equations for survival data analysis. The model was tested on clustered, right-censored bee survival data with a cured fraction, in which two bee species were exposed to different entomopathogens to test the effect of the entomopathogens on survival. The Expectation-Solution algorithm is used to estimate the parameters. The study notes a weak positive association between cure statuses (ρ1=0.0007) and between survival times for uncured bees (ρ2=0.0890), emphasizing their importance. The odds of being uncured are higher for A. mellifera than for M. ferruginea. A. mellifera is more susceptible to the entomopathogens icipe 7, icipe 20, and icipe 69. The Cox-Snell residuals show that the proposed semi-parametric PH model generally fits the data well compared with a model that assumes an independent correlation structure. Thus, the semi-parametric marginal proportional hazards mixture cure model is a parsimonious model for correlated bee survival data.
Compositional data, such as relative information, is a crucial aspect of machine learning and related fields. It is typically recorded as closed data, that is, data that sum to a constant such as 100%. The statistical linear model is the most widely used technique for identifying hidden relationships between underlying random variables of interest. However, data quality is a significant challenge in machine learning, especially when data are missing. When estimating linear regression parameters, which are useful for tasks such as future prediction and partial-effects analysis of independent variables, maximum likelihood estimation (MLE) is the method of choice. However, many datasets contain missing observations, and recovering them can be costly and time-consuming. To address this issue, the expectation-maximization (EM) algorithm has been suggested for situations that involve missing data. The EM algorithm iteratively finds maximum likelihood or maximum a posteriori (MAP) estimates of parameters in statistical models that depend on unobserved variables or data. Using the current parameter estimate as input, the expectation (E) step constructs the expected log-likelihood function. The maximization (M) step then finds the parameters that maximize the expected log-likelihood determined in the E step. This study examined how well the EM algorithm performed on a simulated compositional dataset with missing observations, using both ordinary least squares and robust least squares regression. The efficacy of the EM algorithm was compared with two alternative imputation techniques, k-Nearest Neighbor (k-NN) and mean imputation, in terms of Aitchison distances and covariance.
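The alternation between the E step and the M step can be sketched on a much simpler problem than the compositional regression studied above. The sketch assumes a bivariate Gaussian in which x is fully observed and some y values are missing (marked `None`); the function name `em_bivariate_missing` and the toy data are hypothetical and are not taken from the study.

```python
def em_bivariate_missing(xs, ys, n_iter=200):
    """EM estimates for a bivariate Gaussian when some y values are missing.

    xs is fully observed; entries of ys may be None (missing).
    Returns (mu_x, mu_y, s_xx, s_xy, s_yy).
    """
    n = len(xs)
    observed = [y for y in ys if y is not None]

    # x is complete, so its mean and variance never change.
    mu_x = sum(xs) / n
    s_xx = sum((x - mu_x) ** 2 for x in xs) / n

    # Assumed starting point: moments of the observed y values, zero covariance.
    mu_y = sum(observed) / len(observed)
    s_yy = sum((y - mu_y) ** 2 for y in observed) / len(observed)
    s_xy = 0.0

    for _ in range(n_iter):
        slope = s_xy / s_xx
        cond_var = max(s_yy - slope * s_xy, 0.0)  # Var(y | x) under current estimate

        # E step: expected sufficient statistics, filling each missing y
        # with its conditional mean and second moment given x.
        sum_y = sum_yy = sum_xy = 0.0
        for x, y in zip(xs, ys):
            if y is None:
                y_hat = mu_y + slope * (x - mu_x)
                y_sq = y_hat ** 2 + cond_var
            else:
                y_hat = y
                y_sq = y * y
            sum_y += y_hat
            sum_yy += y_sq
            sum_xy += x * y_hat

        # M step: maximize the expected log-likelihood, which for a Gaussian
        # reduces to moment matching on the expected statistics.
        mu_y = sum_y / n
        s_yy = sum_yy / n - mu_y ** 2
        s_xy = sum_xy / n - mu_x * mu_y

    return mu_x, mu_y, s_xx, s_xy, s_yy


if __name__ == "__main__":
    xs = [float(i) for i in range(10)]
    ys = [2.0 * x + 1.0 for x in xs]
    for i in (2, 5, 8):
        ys[i] = None  # pretend these responses were never recorded
    mu_x, mu_y, s_xx, s_xy, s_yy = em_bivariate_missing(xs, ys)
    print("estimated slope:", s_xy / s_xx, "estimated mean of y:", mu_y)
```

Unlike mean imputation, the E step fills each gap with a value that depends on the observed x, so the relationship between the variables is preserved in the estimates.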
A new parallel expectation-maximization (EM) algorithm is proposed for large databases. Its purpose is to accelerate the EM algorithm. As a well-known algorithm for estimation in generic statistical problems, the EM algorithm has been widely used in many domains, but it often requires significant computational resources. More elaborate methods are therefore needed to handle databases with a large number of records or high dimensionality. The parallel EM algorithm is based on partial E-steps, which retain the standard convergence guarantee of EM, and it takes full advantage of parallel computation. Applied to large databases, the algorithm achieved a speedup of about 2.6 over the standard EM algorithm. The running time decreases nearly linearly as the number of processors increases.
The quality of a synthetic aperture radar (SAR) image degrades when there are multiple imaging projection planes (IPPs) and multiple overlapping ship targets, which in turn affects the performance of target classification and recognition. To address this issue, a method for extracting overlapping ship targets via the expectation maximization (EM) algorithm is proposed. First, the scatterers of the ship targets are obtained by a target detection technique. Then, the EM algorithm is applied to extract the scatterers of a single ship target with a single IPP. Afterwards, a novel image amplitude estimation approach is proposed, with which the radar image of a single target with a single IPP can be generated. The proposed method accomplishes IPP selection and target separation in the image domain, which improves image quality and preserves as much target information as possible. Results on simulated and real measured data demonstrate the effectiveness of the proposed method.
Automatic image annotation has been an active research topic in computer vision and pattern recognition for decades. A two-stage automatic image annotation method based on a Gaussian mixture model (GMM) and a random walk model (abbreviated GMM-RW) is presented. First, a GMM fitted by the rival penalized expectation maximization (RPEM) algorithm is used to estimate the posterior probability of each annotation keyword. Subsequently, a random walk over the constructed label similarity graph further mines the potential correlations among the candidate annotations to obtain refined results, which plays a crucial role in semantic-based image retrieval. The contributions of this work are threefold. First, the GMM is used to capture the initial semantic annotations; in particular, the RPEM algorithm trains the model and can determine the number of GMM components automatically. Second, a label similarity graph is constructed from a weighted linear combination of label similarity and the visual similarity of the images associated with the corresponding labels, which helps avoid polysemy and synonymy during image annotation. Third, the random walk over the constructed label graph further refines the candidate annotation set generated by the GMM. Experiments on the standard Corel5k dataset show that GMM-RW is significantly more effective and efficient than several state-of-the-art methods for automatic image annotation.
A fuzzy modeling method for complex systems is studied. The notion of a general stochastic neural network (GSNN) is presented, and a new modeling method is given that combines the modified Takagi and Sugeno (MTS) fuzzy model with a first-order GSNN. Parameter estimation and model selection procedures are given using the expectation-maximization (EM) algorithm. The method avoids the shortcomings of other methods such as the backpropagation (BP) algorithm, which remains difficult to apply directly when the number of parameters is large without fine tuning and subjective tinkering. Finally, a simulated example demonstrates the effectiveness of the method.
In this paper, we use the boosting method to introduce a new learning algorithm for Gaussian Mixture Models (GMMs) called adapted Boosted Mixture Learning (BML). The method can rectify existing problems in conventional techniques for estimating GMM parameters, due in part to a new mixing-up strategy that increases the number of Gaussian components. The discriminative splitting idea is applied to Gaussian mixture densities, followed by learning via the introduced method. The GMM classifier is then applied to distinguish between healthy infants and those presenting a selected set of medical conditions. Each group includes both full-term and premature infants. The cry pattern for each pathological condition is created using the adapted BML method and a 13-dimensional Mel-Frequency Cepstral Coefficients (MFCCs) feature vector. The test results show that, in a multi-pathological classification task, the introduced method for training GMMs performs better than the reference system, a traditional method based on random splitting and EM-based re-estimation.
Cyber losses, measured as the number of records breached in cyber incidents, commonly feature a significant portion of zeros as well as distinct characteristics for mid-range and large losses, which makes it hard to model the whole range of losses with a standard loss distribution. We tackle this modeling problem by proposing a three-component spliced regression model that can simultaneously model zero, moderate, and large losses and account for heterogeneous effects in the mixture components. To apply the proposed model to the Privacy Rights Clearinghouse (PRC) data breach chronology, we segment geographical groups using unsupervised cluster analysis, and we use a covariate-dependent probability to model zero losses, finite mixture distributions for the moderate body, and an extreme value distribution for large losses to capture the heavy-tailed nature of the loss data. Parameters and coefficients are estimated using the Expectation-Maximization (EM) algorithm. Combined with our frequency model for data breaches (a generalized linear mixed model), aggregate loss distributions are investigated, and applications to cyber insurance pricing and risk management are discussed.
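To address the low accuracy of time-varying channel estimation in current generalized frequency division multiplexing (GFDM) systems, a joint channel estimation and symbol detection algorithm for GFDM systems based on sparse Bayesian learning is proposed. Specifically, a GFDM multiple-response signal model with interference-free pilot insertion is adopted. Within the sparse Bayesian learning framework, the expectation-maximization (EM) algorithm is combined with Kalman filtering and smoothing to obtain the maximum likelihood estimate of the block time-varying channel. GFDM symbol detection is then performed using the estimated channel state information, and the accuracy of both channel estimation and symbol detection is improved step by step through iterative processing. Simulation results show that the proposed algorithm achieves bit error rate performance close to that obtained with perfect channel state information, converges quickly, and is highly robust to Doppler shift.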
Funding (EI-MoCap kinematic calibration study): supported by the National Natural Science Foundation of China (61503392).
Funding (parallel EM algorithm study): supported by the National Natural Science Foundation of China (79990584).
Funding (SAR ship target extraction study): supported by the National Science Fund for Distinguished Young Scholars (62325104).
Funding (GMM-RW image annotation study): supported by the National Basic Research Program of China (No. 2013CB329502), the National Natural Science Foundation of China (No. 61202212), the Special Research Project of the Educational Department of Shaanxi Province of China (No. 15JK1038), and the Key Research Project of Baoji University of Arts and Sciences (No. ZK16047).
Funding (fuzzy modeling with GSNN study): supported by the National Natural Science Foundation of China (51507015, 61773402, 61540037, 71271215, 61233008, 51425701, 70921001, 51577014), the Natural Science Foundation of Hunan Province (2015JJ3008), the Key Laboratory of Renewable Energy Electric-Technology of Hunan Province (2014ZNDL002), and the Hunan Province Science and Technology Program (2015NK3035).