
A bootstrap-based outlier detection method

Cited by: 1
Abstract: Detecting outliers in a data set remains an important problem in multivariate calibration, and finding a widely applicable method is still an open task for chemometricians. This paper introduces a relatively new bootstrap-based outlier detection method. The method takes the internally Studentized residuals as its index, uses the bootstrap to estimate the relevant quantities, and uses the jackknife-after-bootstrap to evaluate those estimates. It does not require the residuals of the regression model to follow a normal distribution, so it is applicable to outlier detection in most regression analyses. The method was validated on near-infrared (NIR) spectral data of tobacco and corn samples. After the outlying samples detected by the method were removed, the prediction error of the PLS model decreased by 15%, a better result than that obtained with the Studentized residual-leverage method or with robust partial least squares (robust SIMPLS). A simulation study was also carried out on the corn NIR data: a number of outliers were introduced into the data and the method was applied to detect them. The results show that the method performs better than the other two methods when outliers make up less than 10% of all samples. The bootstrap-based outlier detection method is therefore an effective method.
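The abstract names the ingredients (internally Studentized residuals, bootstrap resampling, jackknife-after-bootstrap) but not the exact procedure, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes an ordinary least-squares model as a stand-in for PLS, and it assumes a quantile-based flagging rule: for each sample, only the bootstrap resamples that happen to omit that sample are reused to build a reference distribution of Studentized residuals, and the sample is flagged if its own residual falls outside the central `1 - alpha` range. The function names, `n_boot`, and `alpha` are hypothetical.

```python
import numpy as np

def internal_studentized_residuals(X, y):
    """Internally Studentized residuals of an OLS fit with intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    Q, _ = np.linalg.qr(A)                  # thin QR; hat matrix is Q @ Q.T
    h = np.sum(Q ** 2, axis=1)              # leverages (diagonal of the hat matrix)
    e = y - Q @ (Q.T @ y)                   # ordinary residuals
    n, p = A.shape
    s = np.sqrt(e @ e / (n - p))            # residual standard deviation
    return e / (s * np.sqrt(1.0 - h))

def jab_outliers(X, y, n_boot=500, alpha=0.01, seed=0):
    """Flag samples via a jackknife-after-bootstrap reference distribution (assumed rule)."""
    rng = np.random.default_rng(seed)
    n = len(y)
    r_full = internal_studentized_residuals(X, y)
    # Ordinary bootstrap: resample rows with replacement and refit each time.
    idx = rng.integers(0, n, size=(n_boot, n))
    boot_r = np.array([internal_studentized_residuals(X[b], y[b]) for b in idx])
    flags = np.zeros(n, dtype=bool)
    for i in range(n):
        # Jackknife-after-bootstrap: keep only resamples that do not contain sample i.
        keep = ~(idx == i).any(axis=1)
        if not keep.any():
            continue
        lo, hi = np.quantile(boot_r[keep].ravel(), [alpha / 2, 1 - alpha / 2])
        flags[i] = (r_full[i] < lo) or (r_full[i] > hi)
    return flags, r_full

# Synthetic demonstration (not the tobacco or corn data): one planted outlier.
rng = np.random.default_rng(42)
X = rng.normal(size=(60, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(size=60)
y[7] += 12.0                                # planted outlier at index 7
flags, r_full = jab_outliers(X, y)
print("flagged indices:", np.flatnonzero(flags))
```

Because the cutoffs come from empirical quantiles of the resampled residuals rather than from a t distribution, the rule does not rely on normally distributed residuals, which matches the motivation stated in the abstract.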
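Source: Computers and Applied Chemistry (《计算机与应用化学》); indexed in CAS, CSCD, and the Peking University Core Journals list; 2010, No. 11, pp. 1476-1480 (5 pages)
Funding: National Natural Science Foundation of China (20875106); Guangdong Natural Science Foundation (9151027501000003); China Tobacco Guangdong Industrial Co., Ltd. (No. I05XM-QK[2008]017)
Keywords: bootstrap; outlier detection; jackknife-after-bootstrap; partial least squares; tobacco; corn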
