This paper investigates the sample size needed for the Ordinary Least Squares (OLS) estimator to remain usable when multicollinearity is present among the exogenous variables of a linear regression model. A regression model with a constant term (β0) and two independent variables (with regression coefficients β1 and β2, respectively) that exhibit multicollinearity was considered. A Monte Carlo study of 1000 trials was conducted at eight levels of multicollinearity (0, 0.25, 0.5, 0.7, 0.75, 0.8, 0.9 and 0.99) and eight sample sizes (10, 20, 40, 80, 100, 150, 250 and 500). At each specification, the true regression coefficients were set to unity, while 1.5, 2.0 and 2.5 were taken as the hypothesized values. The power of the test was obtained at every multicollinearity level for each of these sample sizes. The results indicate that, whether or not the hypothesized values depart strongly from the true values, once the multicollinearity level is very high (i.e. 0.99), the sample size required for error-free estimation or inference must be greater than five hundred.
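The simulation design described in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the function name `rejection_rate`, the standard normal regressors and errors, the 5% significance level, and the use of a normal critical value in place of Student's t (a rough approximation at small sample sizes) are all assumptions made for illustration. The sketch estimates how often the hypothesis β1 = hypothesized value is rejected when the true coefficients are all one.

```python
import numpy as np
from statistics import NormalDist


def rejection_rate(n, rho, beta_hyp, trials=1000, alpha=0.05, seed=0):
    """Monte Carlo estimate of how often H0: beta1 = beta_hyp is rejected.

    n        : sample size
    rho      : correlation between the two regressors (multicollinearity level)
    beta_hyp : hypothesized value of beta1 (the true value is 1)
    """
    rng = np.random.default_rng(seed)
    # Assumption: normal critical value instead of Student's t with n - 3 df.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    cov = np.array([[1.0, rho], [rho, 1.0]])
    beta_true = np.array([1.0, 1.0, 1.0])  # beta0, beta1, beta2 set to unity

    rejections = 0
    for _ in range(trials):
        # Two correlated regressors plus a constant column.
        x = rng.multivariate_normal(np.zeros(2), cov, size=n)
        X = np.column_stack([np.ones(n), x])
        # Assumption: standard normal error term.
        y = X @ beta_true + rng.standard_normal(n)

        # OLS estimate and the standard error of beta1.
        xtx_inv = np.linalg.inv(X.T @ X)
        beta_hat = xtx_inv @ X.T @ y
        resid = y - X @ beta_hat
        sigma2 = resid @ resid / (n - 3)
        se_b1 = np.sqrt(sigma2 * xtx_inv[1, 1])

        if abs((beta_hat[1] - beta_hyp) / se_b1) > z_crit:
            rejections += 1
    return rejections / trials


if __name__ == "__main__":
    for rho in (0.0, 0.9, 0.99):
        for n in (10, 100, 500):
            print(rho, n, rejection_rate(n, rho, beta_hyp=1.5))
```

Looping over the eight multicollinearity levels and eight sample sizes listed in the abstract would give a grid of estimated rejection rates comparable in structure to the one the study reports, although the exact numbers depend on the assumptions above.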
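To further improve the accuracy and speed of FTIR spectroscopy for the quantitative analysis of a seven-component gas mixture of methane, ethane, propane, isobutane, n-butane, isopentane and n-pentane, whose characteristic absorption spectra overlap severely, a new infrared spectral quantitative analysis method is proposed that couples Kernel Partial Least Squares (KPLS) feature extraction with a Support Vector Regression Machine (SVR). First, KPLS is used to extract features from the FTIR spectra of the seven-component mixture; the extracted feature components are then used as the input to the SVR to build a quantitative analysis model of the gas mixture. Quantitative analysis of standard gas mixtures shows that the KPLS-SVR model predicts more accurately than an SVR model without feature extraction, and that the prediction time is halved. The study shows that KPLS can effectively extract the nonlinear features implicit between the FTIR spectral data of the gas mixture and its component concentrations, effectively remove noise from the spectral data, and greatly reduce the data dimensionality. Coupled with SVR, it improves both the accuracy and the speed of infrared spectral analysis and is an effective method for infrared spectral quantitative analysis.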