Funding: Supported by the National Natural Science Foundation of China (No. NSFC 41204101), the Open Projects Fund of the State Key Laboratory of Oil and Gas Reservoir Geology and Exploitation (No. PLN201733), the Youth Innovation Promotion Association of the Chinese Academy of Sciences (No. 2015051), and the Open Projects Fund of the Natural Gas and Geology Key Laboratory of Sichuan Province (No. 2015trqdz03).
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 20575031 and 20775036) and the Ph.D. Programs Foundation of the Ministry of Education (MOE) of China (Grant No. 20050055001).
Abstract: An outlier detection method is proposed for near-infrared spectral analysis. The underlying idea is that, in random test (Monte Carlo) cross-validation, the probability that an outlier appears in good models with a smaller prediction residual error sum of squares (PRESS), or in bad models with a larger PRESS, should differ clearly from that of normal samples. The method first builds a large number of PLS models using random test cross-validation, then sorts the models by PRESS, and finally identifies outliers from the cumulative probability of each sample across the sorted models. To validate the method, four data sets were investigated: three published data sets and a large data set of tobacco lamina. The proposed method proved to be highly efficient and accurate compared with the conventional leave-one-out (LOO) cross-validation method.
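The procedure described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and rests on several assumptions: ordinary least squares (`np.linalg.lstsq`) stands in for PLS to keep the example short, the cumulative-probability analysis is reduced to a simple difference of frequencies between the worst and best models, and the function name `mccv_outlier_scores`, its parameters, and the synthetic data are all hypothetical.

```python
import numpy as np

def mccv_outlier_scores(X, y, n_models=500, train_frac=0.7, tail_frac=0.2, seed=0):
    """Score each sample by how much more often it falls in the test set of the
    worst (highest-PRESS) models than in the test set of the best ones."""
    rng = np.random.default_rng(seed)
    n = len(y)
    n_train = int(round(train_frac * n))
    A = np.column_stack([X, np.ones(n)])           # design matrix with intercept
    in_test = np.zeros((n_models, n), dtype=bool)  # test-set membership per model
    press = np.empty(n_models)
    for m in range(n_models):
        idx = rng.permutation(n)                   # one random train/test split
        train, test = idx[:n_train], idx[n_train:]
        # Assumption: least squares replaces the PLS model of the abstract.
        coef, *_ = np.linalg.lstsq(A[train], y[train], rcond=None)
        resid = y[test] - A[test] @ coef
        press[m] = np.sum(resid ** 2)              # PRESS of this split
        in_test[m, test] = True
    order = np.argsort(press)                      # models sorted from best to worst
    k = max(1, int(round(tail_frac * n_models)))
    freq_best = in_test[order[:k]].mean(axis=0)
    freq_worst = in_test[order[-k:]].mean(axis=0)
    return freq_worst - freq_best

# Synthetic demonstration: 40 samples, 3 predictors, one corrupted response.
data_rng = np.random.default_rng(1)
X = data_rng.normal(size=(40, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * data_rng.normal(size=40)
y[5] += 10.0                                       # hypothetical outlier at index 5
scores = mccv_outlier_scores(X, y)
print(int(np.argmax(scores)))
```

A sample whose score is far from zero behaves differently from the rest under resampling, which is the signal the abstract describes. Because every model is fitted on a random split rather than by leaving out one sample at a time, a single outlier cannot hide behind the other samples in every fit.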
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 60773040 and 10973021).
Abstract: This paper presents a novel spectroscopic method for searching for supernova candidates in massive sets of galaxy spectra, which is expected to be applied to the Large Sky Area Multi-Object Fiber Spectroscopic Telescope (LAMOST). The method has five main steps. The first step is spectral preprocessing, including removing spectral noise with a wavelet transform, de-redshifting the spectra, etc. The second step is decomposition of each galaxy spectrum into a galaxy component and a supernova component, from which the Supernova Statistical Characterization Vector (SNSCV) of the spectrum is calculated. The third step reduces the number of samples in the galaxy spectral datasets according to the SNSCV of each spectrum and applies an outlier detection algorithm based on the Local Outlier Factor (LOF) to obtain the preliminary selection of spectra. The fourth step is template matching by cross-correlation; the matched results give the secondary selection of spectra. Finally, the final supernova candidates are chosen manually by checking for the spectral features characteristic of a supernova. With this method, thirty-six supernova candidates have been detected in a dataset of 294843 galaxy spectra from the Sloan Digital Sky Survey Data Release 7. Nine of these objects are detected here for the first time, and the other twenty-seven have been reported in other publications (fifteen of which were first detected and reported by us). The twenty-four new supernova candidates comprise twenty type Ia, three type Ic, and one type II supernova candidates.
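The LOF-based selection in the third step can be sketched as follows. This is only an illustration of the standard Local Outlier Factor computation, not the paper's pipeline: the abstract does not define the SNSCV, so the two-dimensional feature vectors below are hypothetical stand-ins, and the function name `local_outlier_factor` and the neighbourhood size `k` are assumptions. The sketch also assumes no duplicate points, which would make a reachability density infinite.

```python
import numpy as np

def local_outlier_factor(X, k=5):
    """Return the LOF score of every row of X; scores well above 1 suggest outliers."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Pairwise Euclidean distances; a point is never its own neighbour.
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)
    nn = np.argsort(d, axis=1)[:, :k]        # indices of the k nearest neighbours
    rows = np.arange(n)
    k_dist = d[rows, nn[:, -1]]              # distance to the k-th neighbour
    # reach_dist(p, o) = max(k_dist(o), d(p, o)) for every neighbour o of p
    reach = np.maximum(k_dist[nn], d[rows[:, None], nn])
    lrd = 1.0 / reach.mean(axis=1)           # local reachability density
    return lrd[nn].mean(axis=1) / lrd        # mean neighbour density over own density

# Hypothetical feature vectors: 50 ordinary spectra plus one unusual spectrum.
feat_rng = np.random.default_rng(0)
features = np.vstack([feat_rng.normal(size=(50, 2)), [[8.0, 8.0]]])
lof = local_outlier_factor(features, k=5)
print(int(np.argmax(lof)))
```

Spectra with the highest LOF scores would pass to the later template-matching and manual inspection steps. A density-based score suits this kind of search because supernova-bearing spectra are rare and need not lie far from the bulk of the data in any single feature.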