Journal literature: 89 articles found
Modeling Cyber Loss Severity Using a Spliced Regression Distribution with Mixture Components
1
Author: Meng Sun. Open Journal of Statistics, 2023, No. 4, pp. 425-452 (28 pages)
Cyber losses, measured by the number of records breached in cyber incidents, commonly feature a significant portion of zeros and distinct characteristics for mid-range and large losses, which makes it hard to model the whole range of losses with a standard loss distribution. We tackle this modeling problem by proposing a three-component spliced regression model that can simultaneously model zeros, moderate losses and large losses, and can account for heterogeneous effects in the mixture components. To apply the proposed model to the Privacy Rights Clearinghouse (PRC) data breach chronology, we segment geographical groups using unsupervised cluster analysis, and use a covariate-dependent probability to model zero losses, finite mixture distributions for the moderate body, and an extreme value distribution for large losses to capture the heavy-tailed nature of the loss data. Parameters and coefficients are estimated using the Expectation-Maximization (EM) algorithm. Combined with our frequency model for data breaches (a generalized linear mixed model), aggregate loss distributions are investigated, and applications to cyber insurance pricing and risk management are discussed.
Keywords: Cyber Risk; Data Breach; Spliced Regression Model; Finite Mixture Distribution; Cluster Analysis; Expectation-Maximization Algorithm; Extreme Value Theory
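A three-component spliced density of the kind this abstract describes can be sketched as follows. This is a generic textbook form, not the paper's exact specification: the splicing threshold $u$, the body weight $\pi_b$, and the symbols $p_0$ (zero-loss probability), $f_b$, $F_b$ (finite mixture density and CDF for the body) and $f_t$, $F_t$ (heavy-tailed density and CDF) are assumptions introduced here for illustration.

```latex
f(y \mid x) =
\begin{cases}
p_0(x), & y = 0,\\[4pt]
\bigl(1 - p_0(x)\bigr)\,\pi_b\,\dfrac{f_b(y \mid x)}{F_b(u \mid x)}, & 0 < y \le u,\\[8pt]
\bigl(1 - p_0(x)\bigr)\,(1 - \pi_b)\,\dfrac{f_t(y)}{1 - F_t(u)}, & y > u.
\end{cases}
```

The truncation terms $F_b(u \mid x)$ and $1 - F_t(u)$ make each continuous piece integrate to its weight, so the total probability is $p_0 + (1 - p_0)\pi_b + (1 - p_0)(1 - \pi_b) = 1$.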
Predicting carbon storage of mixed broadleaf forests based on the finite mixture model incorporating stand factors,site quality,and aridity index
2
Authors: Yanlin Wang, Dongzhi Wang, Dongyan Zhang, Qiang Liu, Yongning Li. Forest Ecosystems (SCIE, CSCD), 2024, No. 3, pp. 276-286 (11 pages)
The diameter distribution function (DDF) is a crucial tool for accurately predicting stand carbon storage (CS). The current key issue, however, is how to construct a high-precision DDF based on stand factors, site quality, and aridity index to predict stand CS in multi-species mixed forests with complex structures. This study used data from 70 survey plots for mixed broadleaf Populus davidiana and Betula platyphylla forests in the Mulan Rangeland State Forest, Hebei Province, China, to construct the DDF based on maximum likelihood estimation (MLE) and the finite mixture model (FMM). Ordinary least squares (OLS), linear seemingly unrelated regression (LSUR), and back propagation neural network (BPNN) were used to investigate the influences of stand factors, site quality, and aridity index on the shape and scale parameters of the DDF and to predict the stand CS of mixed broadleaf forests. The results showed that FMM accurately described the stand-level diameter distribution of the mixed P. davidiana and B. platyphylla forests, whereas the Weibull function constructed by MLE was more accurate in describing species-level diameter distribution. The combined variable of quadratic mean diameter (Dq), stand basal area (BA), and site quality improved the accuracy of the shape parameter models of FMM; the combined variable of Dq, BA, and the De Martonne aridity index improved the accuracy of the scale parameter models. Compared to OLS and LSUR, the BPNN had higher accuracy in the re-parameterization process of FMM. OLS, LSUR, and BPNN overestimated the CS of P. davidiana but underestimated the CS of B. platyphylla in the large diameter classes (DBH ≥ 18 cm). BPNN accurately estimated stand- and species-level CS, but it was more suitable for estimating stand-level CS than species-level CS, thereby providing a scientific basis for the optimization of stand structure and the assessment of carbon sequestration capacity in mixed broadleaf forests.
Keywords: Weibull function; Finite mixture model; Linear seemingly unrelated regression; Back propagation neural network; Carbon storage
Application of a Novel Method for Machine Performance Degradation Assessment Based on Gaussian Mixture Model and Logistic Regression (cited by: 3)
3
Authors: LIU Wenbin, ZHONG Xin, LEE Jay, LIAO Linxia, ZHOU Min. Chinese Journal of Mechanical Engineering (SCIE, EI, CAS, CSCD), 2011, No. 5, pp. 879-884 (6 pages)
The currently prevalent machine performance degradation assessment techniques estimate a machine's current condition by recognizing indications of failure features, which requires complete data collected under different conditions. However, failure data are always hard to acquire, which makes those techniques hard to apply. In this paper, a novel method that does not need failure history data is introduced. Wavelet packet decomposition (WPD) is used to extract features from raw signals, principal component analysis (PCA) is used to reduce feature dimensions, and a Gaussian mixture model (GMM) is then applied to approximate the feature space distributions. A single-channel confidence value (SCV) is calculated from the overlap between the GMM of the monitored condition and that of the normal condition, and indicates the performance of a single channel. Furthermore, a multi-channel confidence value (MCV), which can be regarded as the overall performance index across channels, is calculated via logistic regression (LR), which also completes the task of decision-level sensor fusion. Both SCV and MCV can serve as the basis for proactive maintenance measures that prevent machine breakdown. The method has been used to assess the performance of the turbine of a centrifugal compressor in a Petro-China factory, and the result shows that it completes this task effectively. The proposed method has engineering significance for machine performance degradation assessment.
Keywords: performance degradation assessment; Gaussian mixture model; logistic regression; proactive maintenance; sensor fusion
Heteroscedastic Laplace mixture of experts regression models and applications
4
Authors: WU Liu-cang, ZHANG Shu-yu, LI Shuang-shuang. Applied Mathematics (A Journal of Chinese Universities) (SCIE, CSCD), 2021, No. 1, pp. 60-69 (10 pages)
Mixture of Experts (MoE) regression models are widely studied in statistics and machine learning for modeling heterogeneity in data for regression, clustering and classification. The Laplace distribution is one of the most important statistical tools for analyzing thick-tailed data. Laplace Mixture of Linear Experts (LMoLE) regression models are based on the Laplace distribution, which is more robust. Analogous to modelling the variance parameter in a homogeneous population, in this paper we propose and study a novel class of models, heteroscedastic Laplace mixture of experts regression models, to analyze heteroscedastic data coming from a heterogeneous population. The issues of maximum likelihood estimation are addressed. In particular, a Minorization-Maximization (MM) algorithm for estimating the regression parameters is developed. Properties of the estimators of the regression coefficients are evaluated through Monte Carlo simulations. Results from the analysis of two real data sets are presented.
Keywords: mixture of experts regression models; heteroscedastic mixture of experts regression models; Laplace distribution; MM algorithm
Variable Selection in Finite Mixture of Time-Varying Regression Models
5
Authors: Jing Liu, Wanzhou Ye. Advances in Pure Mathematics, 2020, No. 3, pp. 101-113 (13 pages)
In this paper, we study the regression problem for time series data from heterogeneous populations on the basis of the finite mixture regression (FMR) model. We propose two finite mixture time-varying regression models to solve it. A regularization method for variable selection in these models is proposed, which combines appropriate penalty functions with an l2 penalty. A block-wise minorization-maximization (MM) algorithm is used for maximum penalized log quasi-likelihood estimation of these models. The procedure is illustrated with simulations and with an application to the behavior of urban vehicular traffic in the city of São Paulo from 14 to 18 December 2009, which shows that the proposed models outperform the FMR models.
Keywords: mixture regression models; GARCH; Block-Wise MM algorithm; LASSO; SCAD
Selecting the Quantity of Models in Mixture Regression
6
Authors: Dawei Lang, Wanzhou Ye. Advances in Pure Mathematics, 2016, No. 8, pp. 555-563 (9 pages)
Mixture regression is a regression problem with mixed data: among the observations, some data come from one model while others come from other models. EM or other algorithms can be used to solve this problem only once the number of models is assumed to be given. In this paper we propose an information criterion for the mixture regression model. In data simulations, our criterion performs better than ordinary information criteria at choosing the correct number of models.
Keywords: Mixture Regression; Model-Based Clustering; Information Criterion; AIC; BIC
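The idea of fitting mixture regressions by EM for several candidate numbers of models and then comparing an information criterion can be sketched as below. This is a minimal illustration, not the criterion proposed in the paper: it uses the standard BIC, a single regressor, Gaussian errors, and a heuristic start (splitting points by residual rank from one pooled line). The function names and the demo data are hypothetical.

```python
import math
import random


def _log_normal_pdf(y, mean, var):
    """Log density of N(mean, var) at y."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (y - mean) ** 2 / var)


def _weighted_line(x, y, w):
    """Weighted least squares for y = a + b*x; returns (a, b, residual variance)."""
    sw = max(sum(w), 1e-12)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx if sxx > 0 else 0.0
    a = my - b * mx
    var = sum(wi * (yi - a - b * xi) ** 2 for wi, xi, yi in zip(w, x, y)) / sw
    return a, b, max(var, 1e-8)  # floor keeps the variance away from zero


def fit_mixture_regression(x, y, k, iters=100):
    """EM for a k-component mixture of lines; returns (params, log-likelihood).

    Each entry of params is (mixing weight, intercept, slope, variance).
    """
    n = len(x)
    # Heuristic start (an assumption of this sketch): fit one pooled line,
    # then split the points into k equal groups by residual rank.
    a0, b0, _ = _weighted_line(x, y, [1.0] * n)
    order = sorted(range(n), key=lambda i: y[i] - a0 - b0 * x[i])
    resp = [[0.0] * k for _ in range(n)]
    for rank, i in enumerate(order):
        resp[i][rank * k // n] = 1.0
    params, loglik = [], float("-inf")
    for _ in range(iters):
        # M-step: weighted least squares per component.
        params = []
        for j in range(k):
            w = [resp[i][j] for i in range(n)]
            a, b, var = _weighted_line(x, y, w)
            params.append((max(sum(w) / n, 1e-12), a, b, var))
        # E-step: responsibilities via log-sum-exp for numerical stability.
        loglik = 0.0
        for i in range(n):
            logs = [math.log(p) + _log_normal_pdf(y[i], a + b * x[i], v)
                    for p, a, b, v in params]
            m = max(logs)
            lse = m + math.log(sum(math.exp(l - m) for l in logs))
            loglik += lse
            resp[i] = [math.exp(l - lse) for l in logs]
    return params, loglik


def bic(loglik, k, n):
    """Standard BIC: k intercepts, k slopes, k variances, k-1 free weights."""
    return -2.0 * loglik + (4 * k - 1) * math.log(n)


def make_demo_data(n=200, seed=0):
    """Two parallel lines (intercepts 0 and 10, slope 2) with Gaussian noise."""
    rng = random.Random(seed)
    x = [5.0 * i / (n - 1) for i in range(n)]
    y = [(0.0 if i % 2 == 0 else 10.0) + 2.0 * xi + rng.gauss(0.0, 0.5)
         for i, xi in enumerate(x)]
    return x, y
```

Calling `fit_mixture_regression` for k = 1, 2, 3 and keeping the k with the smallest `bic` value gives a simple selection rule; EM only finds a local optimum, so in practice several starting points are usually tried.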
Variable Selection for Robust Mixture Regression Model with Skew Scale Mixtures of Normal Distributions
7
Authors: Tingzhu Chen, Wanzhou Ye. Advances in Pure Mathematics, 2022, No. 3, pp. 109-124 (16 pages)
In this paper, we propose a robust mixture regression model based on the skew scale mixtures of normal distributions (RMR-SSMN), which can better accommodate asymmetric, heavy-tailed and contaminated data. For the variable selection problem, a penalized likelihood approach is proposed with a new combined penalty function that balances the SCAD and l2 penalties. An adjusted EM algorithm is presented to obtain parameter estimates of RMR-SSMN models at a faster convergence rate. As simulations show, our mixture models are more robust than general FMR models, and the new combined penalty function outperforms SCAD for variable selection. Finally, the proposed methodology and algorithm are applied to a real data set and achieve reasonable results.
Keywords: Robust Mixture Regression Model; Skew Scale Mixtures of Normal Distributions; EM Algorithm; SCAD Penalty
Dynamic soft sensor development based on Gaussian mixture regression for fermentation processes (cited by: 9)
8
Authors: Congli Mei, Yong Su, Guohai Liu, Yuhan Ding, Zhiling Liao. Chinese Journal of Chemical Engineering (SCIE, EI, CAS, CSCD), 2017, No. 1, pp. 116-122 (7 pages)
Dynamic soft sensors based on a single Gaussian process regression (GPR) model have been developed for fermentation processes. However, the limitations of single regression models for multiphase/multimode fermentation processes may result in large prediction errors and a complex soft sensor. Therefore, a dynamic soft sensor based on Gaussian mixture regression (GMR) was proposed to overcome these problems. Two structure parameters, the number of Gaussian components and the order of the model, are crucial to the soft sensor model. To achieve a simple and effective soft sensor, an iterative strategy was proposed to optimize the two structure parameters simultaneously. For comparison, the proposed dynamic GMR soft sensor and the existing dynamic GPR soft sensor were both used to estimate biomass concentration in a penicillin simulation process and an industrial erythromycin fermentation process. Results show that the proposed dynamic GMR soft sensor has higher prediction accuracy and is more suitable for dynamic multiphase/multimode fermentation processes.
Keywords: Dynamic modeling; Process systems; Instrumentation; Gaussian mixture regression; Fermentation processes
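The core prediction step of Gaussian mixture regression can be sketched as follows: given a Gaussian mixture fitted to the joint distribution of an input x and an output y, the prediction is the conditional mean of y given x. This is a static, scalar-input illustration only. The paper's dynamic soft sensor also chooses a model order (lagged inputs) and the number of components, which this sketch does not cover. The function name and the tuple layout of `components` are assumptions made here.

```python
import math


def gmr_predict(x, components):
    """Conditional mean E[y | x] under a joint Gaussian mixture over (x, y).

    components: list of (weight, mean_x, mean_y, var_x, cov_xy) tuples.
    The variance of y is not needed for the conditional mean.
    """
    log_gates = []
    cond_means = []
    for weight, mean_x, mean_y, var_x, cov_xy in components:
        # Log of weight * N(x; mean_x, var_x): how strongly this component claims x.
        log_gates.append(
            math.log(weight)
            - 0.5 * (math.log(2.0 * math.pi * var_x) + (x - mean_x) ** 2 / var_x)
        )
        # Conditional mean of y given x within this component.
        cond_means.append(mean_y + cov_xy / var_x * (x - mean_x))
    # Normalise the gates with the max-shift trick to avoid underflow.
    shift = max(log_gates)
    gates = [math.exp(g - shift) for g in log_gates]
    total = sum(gates)
    return sum(g / total * m for g, m in zip(gates, cond_means))
```

Each component contributes a local linear predictor, and the gates blend them smoothly, which is why the approach suits processes that switch between operating phases.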
A Fast Iteration Method for Mixture Regression Problem
9
Authors: Dawei Lang, Wanzhou Ye. Journal of Applied Mathematics and Physics, 2015, No. 9, pp. 1100-1107 (8 pages)
In this paper, we propose a Fast Iteration Method for solving the mixture regression problem, which can be treated as model-based clustering. Compared to the EM algorithm, the proposed method is faster and more flexible, and can solve mixture regression problems with different error distributions (e.g. Laplace and t distributions). Extensive numerical experiments show that the proposed method performs better on random simulations and real data.
Keywords: Mixture Regression Problem; Fast Iteration Method; Model-Based Clustering
A skew–normal mixture of joint location, scale and skewness models (cited by: 1)
10
Authors: LI Hui-qiong, WU Liu-cang, YI Jie-yi. Applied Mathematics (A Journal of Chinese Universities) (SCIE, CSCD), 2016, No. 3, pp. 283-295 (13 pages)
Normal mixture regression models are among the most important statistical tools for analyzing data from a heterogeneous population. For data sets with asymmetric outcomes, the skew-normal distribution has been shown over the last two decades to be beneficial in various theoretical and applied problems. In this paper, we propose and study a novel class of models, a skew-normal mixture of joint location, scale and skewness models, to analyze heteroscedastic skew-normal data coming from a heterogeneous population. The issues of maximum likelihood estimation are addressed. In particular, an Expectation-Maximization (EM) algorithm for estimating the model parameters is developed. Properties of the estimators of the regression coefficients are evaluated through Monte Carlo experiments. Results from the analysis of a real Body Mass Index (BMI) data set are presented.
Keywords: mixture regression models; mixture of joint location, scale and skewness models; EM algorithm; maximum likelihood estimation; skew-normal mixtures
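Short-Term Probabilistic Forecasting of Rural Distributed Photovoltaic Power Generation Based on an Improved INFO-CNN-QRGRU Model
11
Authors: 王俊, 邱爽, 鞠丹阳, 谢易澎, 张楠楠, 王慧. 《沈阳农业大学学报》 (CAS, CSCD, PKU Core), 2024, No. 4, pp. 490-502 (13 pages)
With the advance of the "dual carbon" goals, the share of clean energy has grown substantially and distributed photovoltaic (PV) generation has developed rapidly in rural China, but its randomness and intermittency pose major challenges to renewable energy accommodation and grid stability. PV power forecasting can ease the accommodation problem to some extent and reduce the impact of PV instability on the grid. To improve the accuracy of PV power forecasting, a probabilistic PV forecasting method is proposed in which an improved weighted mean of vectors algorithm optimizes a CNN-QRGRU network. First, the ReliefF algorithm is used to select feature variables. On this basis, Gaussian mixture model (GMM) clustering divides the weather into three types: sunny, sunny turning cloudy, and overcast or rainy. The processed data are fed into a CNN-GRU model, whose hyperparameters are tuned with the weighted mean of vectors (INFO) optimization algorithm. Quantile regression (QR) is combined with the INFO-CNN-GRU model to obtain the conditional distribution of PV power, and kernel density estimation is used to obtain a probability density function from the conditional distribution, completing the probabilistic forecast. Using data from an actual PV power station, the proposed INFO optimization algorithm is compared with several traditional optimization algorithms, and the results show that INFO optimizes better. The probabilistic forecasts built on it provide more useful information than point forecasts and have greater practical value.
Keywords: PV output; Gaussian mixture model clustering; gated recurrent unit; weighted mean of vectors algorithm; quantile regression; probabilistic forecasting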
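Quantile regression, as used in the abstract above, trains a model by minimizing the pinball (check) loss at a chosen quantile level. The sketch below shows only that loss, not the paper's CNN-QRGRU network; the function name `pinball_loss` is an assumption made here.

```python
def pinball_loss(y_true, y_pred, tau):
    """Mean pinball loss at quantile level tau (0 < tau < 1).

    Under-predictions are weighted by tau and over-predictions by 1 - tau,
    so minimising the loss pushes y_pred towards the tau-quantile of y_true.
    """
    total = 0.0
    for y, q in zip(y_true, y_pred):
        diff = y - q
        total += tau * diff if diff >= 0 else (tau - 1.0) * diff
    return total / len(y_true)
```

For a constant prediction the loss is minimized at an empirical quantile: with the values 1, 2, 3, 4, 100 and tau = 0.5, the best constant among the data points is the median, 3, and the outlier 100 has no pull on it.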
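A Prediction Model for the Risk of Returning to Poverty from the Perspective of Households Lifted Out of Poverty
12
Authors: 李光辉, 姜泽琴, 冯姝. 《凯里学院学报》, 2024, No. 3, pp. 81-92 (12 pages)
Using the 2021 assistance ledger data for households lifted out of poverty, provided by the Kaili City Rural Revitalization Bureau, statistical models for measuring risk are built. First, a mixture (mixture-experiment) polynomial model is used to build an annual income prediction model. Second, a logistic regression model is used to build a prediction model for the risk of returning to poverty, and it is combined with machine learning algorithms such as SVM to obtain a linear classification model for the "three categories of households". Then, a nonparametric regression model relating evaluation scores to annual per capita income is built. The multi-model analysis provides technical support for grassroots screening of the risk of returning to poverty.
Keywords: rural revitalization; mixture (mixture-experiment) model; logistic regression; nonparametric regression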
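Evaluation of Water Resources Carrying Capacity in the Shiyang River Basin Based on Water Metabolism and Water Cycle Theory
13
Authors: 贾玉博, 杨宏伟, 粟晓玲, 褚江东, 徐吉海. 《水资源保护》 (EI, CAS, CSCD, PKU Core), 2024, No. 5, pp. 86-94, 157 (10 pages)
Based on water metabolism and water cycle theory, an evaluation index system for water resources carrying capacity was built with five subsystems: input, consumption, vitality, regulation, and output. Weights were determined by combining the analytic network process and the entropy weight method through least squares, and the water resources carrying capacity of the Shiyang River Basin for 2011-2020 was comprehensively evaluated with a variable fuzzy set model. A Gaussian mixture regression model was coupled with three interpretable machine learning methods to quantify the influence of each index on carrying capacity, and the relationships between the indices and carrying capacity were explored at global and local scales. The results show that: from 2011 to 2020 the basin's carrying capacity fluctuated with an overall improving trend but remained close to overload, with the score rising from 3.79 in 2011 to 4.18 in 2013 and then falling to 3.23 in 2020; the Gaussian mixture regression model handles high-dimensional, small-sample carrying capacity index data well; agricultural irrigation water use per unit area, the wastewater treatment and reuse rate, the ecological environment water use rate, the water resources development and utilization rate, the water yield modulus, and the groundwater exploitation rate are the main factors influencing the basin's carrying capacity; at the global scale, carrying capacity has a nonlinear relationship with the main factors and varies non-monotonically with them, while at the local scale the main factors mostly inhibited carrying capacity during 2011-2015 and gradually shifted to promoting it during 2016-2020; although the basin's carrying capacity has improved, management of water resources development and utilization still needs strengthening and the groundwater exploitation rate needs to be reduced.
Keywords: water resources carrying capacity; variable fuzzy set; Gaussian mixture regression model; water metabolism and water cycle theory; Shiyang River Basin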
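Analysis of Tram Driving Characteristics on Inter-Station Sections Based on Wheel-Rail Positioning Data
14
Authors: 童文聪, 滕靖, 李君羡, 姚幸, 张中杰. 《同济大学学报(自然科学版)》 (EI, CAS, CSCD, PKU Core), 2024, No. 3, pp. 416-426 (11 pages)
To analyze the section speed and reliability characteristics of trams under manual driving, operating indicators for the acceleration, cruising, and braking phases and for intersections are calculated from wheel-rail positioning data, and the mechanisms by which manual driving decisions affect each indicator are analyzed. A multi-factor regression model and a probability distribution model of section running speed are also built. The results show that, because manual driving behaves like fuzzy control, drivers cannot accelerate and decelerate fully; the terminal speed and the braking coefficient together account for 57% of the contribution to section running speed, making them the focus for optimizing driving behavior; and section running speed follows a Gaussian mixture distribution (Gaussian mixture model, GMM) and frequently falls outside common green-wave bandwidths, which is an important cause of the line's low travel-time reliability.
Keywords: tram; driving behavior; speed characteristics; multiple linear regression; Gaussian mixture distribution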
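Prediction of the Air Void Content of Porous Asphalt Mixtures Based on a GPR Model
15
Authors: 马志鹏, 章启月, 张泽霖, 肖一帆, 邓学耀, 刘祥. 《科技创新与应用》, 2024, No. 30, pp. 52-54, 59 (4 pages)
The air void content of a porous asphalt mixture is one of the key indicators affecting its drainage function and pavement performance. To allow the air void content to be determined quickly, this study takes the passing rates at different sieve sizes of the mixture gradation and the asphalt-aggregate ratio as independent variables, extracts characteristic parameters through correlation analysis, and builds a Gaussian process regression (GPR) model to predict the air void content of PAC-13 porous asphalt mixture. The prediction accuracy of the GPR model is compared with that of multiple linear regression, AdaBoost, and random forest. The results show that the GPR model using the passing rates of the 4.75, 2.36, 1.18, 0.6, 0.3, 0.15, and 0.075 mm sieves and the asphalt-aggregate ratio as parameters is accurate, with a linear fit coefficient of 0.95; compared with multiple linear regression, AdaBoost, and random forest, the GPR model is relatively better suited to predicting the air void content of porous asphalt mixtures.
Keywords: road engineering; porous asphalt mixture; air void content; Gaussian process regression; prediction model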
A novel nonparametric mixture model for the detection pattern of COVID-19 on Diamond Princess cruise
16
Authors: Huijuan Ma, Jing Qin, Fang Chen, Yong Zhou. Statistical Theory and Related Fields (CSCD), 2023, No. 1, pp. 85-96 (12 pages)
The outbreak of COVID-19 on the Diamond Princess cruise ship has attracted much attention. Motivated by the PCR testing data on the Diamond Princess, we propose a novel nonparametric cure mixture model to investigate the detection pattern. It combines a logistic regression for the probability of susceptible subjects with a nonparametric distribution for the detection of infected individuals. Maximum likelihood estimators are proposed and are shown to be consistent and asymptotically normal. Simulation studies demonstrate that the proposed approach is appropriate for practical use. Finally, we apply the proposed method to the PCR testing data on the Diamond Princess to show its practical utility.
Keywords: Cure model; logistic regression; maximum likelihood estimator; mixture
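Construction of a Rolling Bearing Performance Degradation Indicator Based on PCA Multi-Model Fusion
17
Authors: 蒋丽英, 郭濠, 李贺, 刘明昆, 张雷鸣. 《沈阳航空航天大学学报》, 2024, No. 1, pp. 54-60 (7 pages)
A health indicator built from a single model can describe rolling bearing degradation only from that model's own "single angle", which is a limitation. To address this, a method for constructing a rolling bearing health indicator based on principal component analysis (PCA) multi-model fusion is proposed. The method builds single-model health indicators with a support vector data description (SVDD) model, an auto-associative kernel regression (AAKR) model, and a Gaussian mixture model (GMM), fuses the three indicators by PCA, and takes the first principal component as a health indicator (SAG-HI) that contains "multi-angle" degradation information. Experimental results show that, compared with each single-model indicator, the grey confidence level between SAG-HI and the bearing's maintained reliability reaches 98.38%; its correlation, monotonicity, and robustness are also the best; and envelope spectrum analysis verifies that it detects the onset of early faults accurately and promptly.
Keywords: rolling bearing; support vector data description; auto-associative kernel regression; Gaussian mixture model; principal component analysis; performance degradation indicator; multi-model fusion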
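A Prediction Model for Mixed Oil Length in Batch Transportation of Product Oil Pipelines Fusing Mechanism Knowledge and the Gaussian Mixture Regression Algorithm (cited by: 4)
18
Authors: 袁子云, 刘刚, 陈雷, 邵伟明, 张钰晗. 《中国石油大学学报(自然科学版)》 (EI, CAS, CSCD, PKU Core), 2023, No. 2, pp. 123-128 (6 pages)
Mixed oil forms during batch (sequential) transportation in product oil pipelines, and accurately predicting the mixed oil length is important for cutting oil batches. Mechanism models of mixed oil length suffer from limited accuracy and heavy numerical computation. Global prediction models currently built with machine learning algorithms ignore the multimodal nature of real operating conditions, which limits their accuracy, and directly applying the Gaussian mixture regression algorithm to identify data modes cannot accurately represent the complex nonlinear relationships among variables. Existing mechanism formulas are combined with the Gaussian mixture regression algorithm to build a local modeling algorithm that incorporates mechanism knowledge, and the predictions of different models are compared on a real data set of mixed oil lengths from batch transportation in product oil pipelines. The results show that combining mechanism knowledge with local modeling effectively represents the functional relationships among variables, and that the new model has a clear advantage in prediction accuracy.
Keywords: product oil pipeline; mixed oil length; local modeling; Gaussian mixture regression; mechanism-data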
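Variable Selection for Mode Regression Models Based on Mixed Skew-Normal Data (cited by: 1)
19
Authors: 曾鑫, 吴刘仓, 句媛媛. 《工程数学学报》 (CSCD, PKU Core), 2023, No. 3, pp. 381-397 (17 pages)
Variable selection for finite mixture of regression (FMR) models is often used in statistical modeling. Existing research on FMR models focuses mainly on the case where the regression errors follow a normal distribution, an assumption that is unsuitable for asymmetric data. For skewed data, the mode is more representative than the mean. This paper introduces a variable selection method for mode regression models based on mixed skew-normal data, and proves the consistency of the variable selection method and the oracle property of the parameter estimators. To estimate the model parameters, an improved EM (Expectation-Maximization) algorithm is proposed. Simulation studies and a real-data analysis further illustrate the effectiveness of the proposed model and variable selection method.
Keywords: mixed skew-normal data; mode regression model; variable selection; EM algorithm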
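Parameter Estimation for the Skew-Normal Mode Mixture of Experts Model with Responses Missing at Random (cited by: 1)
20
Authors: 鲁钰, 吴刘仓, 王格格. 《应用数学》 (PKU Core), 2023, No. 2, pp. 474-486 (13 pages)
Missing data are the most common of the many factors that affect data quality. Handling missing data improperly directly undermines the reliability of the results and defeats the purpose of the analysis. For skew-normal data missing at random, this paper studies parameter estimation for the skew-normal mode mixture of experts model. By combining mode regression imputation with clustering, a stratified mode regression imputation method is proposed. The paper then compares three machine learning imputation methods (support vector machine, random forest, and neural network imputation) with three statistical imputation methods (stratified mean imputation, mode regression imputation, and stratified mode regression imputation) in handling missing data. Monte Carlo simulations and a real-data analysis show that stratified mode regression imputation performs best.
Keywords: missing skew-normal data; mode mixture of experts model; support vector machine imputation; random forest imputation; BP neural network imputation; stratified mode regression imputation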