
Stellar Spectral Outliers Detection Based on Isomap
Abstract: Finding misclassified spectra among massive sets of already-classified spectra has long been a key problem for astronomical data processing experts. We found that the Isomap algorithm performs well on this problem. By comparing the performance of Isomap with that of principal component analysis (PCA), we found that (1) Isomap projects spectra with similar features into nearby regions and spectra with different features into distant regions, while PCA may project spectra with different features into nearby regions; (2) most of the outliers given by Isomap are easy to identify and are binary stars of high scientific value, while the outliers given by PCA are difficult to identify, are mostly not binary stars, and are of little scientific value. Thus, Isomap has a clear advantage over PCA in mining spectral outliers. Since the data used in the experiments are spectra of nine M-type spectral subtypes from the ninth data release of the Sloan Digital Sky Survey (SDSS DR9), Isomap can quickly find the spectra misclassified by the SDSS pipeline and can help improve the accuracy of existing spectral classification algorithms. Furthermore, since most of the spectra misclassified by the SDSS pipeline are binary stars, Isomap can also improve the efficiency of finding binary stars of high scientific value. Although the experimental results show that Isomap is more sensitive to noise than PCA and performs worse on spectra with low signal-to-noise ratios, this disadvantage does not affect the application of Isomap to spectral mining, because the spectral types of low signal-to-noise spectra are difficult to determine in any case, even manually.
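Finding (1) of the abstract can be illustrated on toy data. The following is a minimal sketch, not the authors' pipeline: it uses a synthetic 2-D spiral in place of SDSS spectra, a from-scratch NumPy Isomap (nearest-neighbour graph, shortest paths, classical MDS), and an assumed neighbourhood size of 4. The function names `isomap` and `pca` and all parameter values are illustrative assumptions.

```python
import numpy as np


def pairwise_distances(X):
    """Euclidean distance matrix for the rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def isomap(X, n_neighbors=4, n_components=1):
    """Minimal Isomap: kNN graph -> shortest paths -> classical MDS."""
    n = X.shape[0]
    D = pairwise_distances(X)
    # Keep only edges to each point's nearest neighbours (symmetrised).
    G = np.full((n, n), np.inf)
    nearest = np.argsort(D, axis=1)[:, 1:n_neighbors + 1]  # column 0 is the point itself
    rows = np.repeat(np.arange(n), n_neighbors)
    cols = nearest.ravel()
    G[rows, cols] = D[rows, cols]
    G[cols, rows] = D[cols, rows]
    np.fill_diagonal(G, 0.0)
    # Floyd-Warshall: geodesic distances along the neighbourhood graph.
    for k in range(n):
        G = np.minimum(G, G[:, k:k + 1] + G[k:k + 1, :])
    if np.isinf(G).any():
        raise ValueError("neighbourhood graph is disconnected; increase n_neighbors")
    # Classical MDS: double-centre the squared distances, take top eigenvectors.
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (G ** 2) @ J
    eigvals, eigvecs = np.linalg.eigh(B)
    order = np.argsort(eigvals)[::-1][:n_components]
    return eigvecs[:, order] * np.sqrt(np.maximum(eigvals[order], 0.0))


def pca(X, n_components=1):
    """Project centred data onto its leading principal axes."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T


# Toy stand-in for spectra: points along a spiral (a 1-D manifold in 2-D).
t = np.linspace(np.pi, 4 * np.pi, 80)
spiral = np.column_stack((t * np.cos(t), t * np.sin(t)))

# Position of each point along the curve (cumulative chord length).
steps = np.linalg.norm(np.diff(spiral, axis=0), axis=1)
arc = np.concatenate(([0.0], np.cumsum(steps)))

iso_1d = isomap(spiral, n_neighbors=4)[:, 0]
pca_1d = pca(spiral)[:, 0]

# How well does each 1-D embedding preserve position along the manifold?
iso_corr = abs(np.corrcoef(arc, iso_1d)[0, 1])
pca_corr = abs(np.corrcoef(arc, pca_1d)[0, 1])
print(f"|corr| with position along curve: Isomap={iso_corr:.3f}, PCA={pca_corr:.3f}")
```

On this toy curve the Isomap coordinate tracks position along the manifold, whereas the linear PCA projection folds different turns of the spiral onto overlapping values, which is the kind of behaviour the abstract describes for spectra with different features. How outliers are then scored in the embedded space is not specified in the abstract and is left out here.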
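Source: Spectroscopy and Spectral Analysis (《光谱学与光谱分析》), 2014, No. 1, pp. 267-273 (7 pages); indexed in SCIE, EI, CAS, CSCD, and the Peking University Core Journals list
Funding: Supported by the National Natural Science Foundation of China (11078013)
Keywords: Manifold learning algorithm; Isomap algorithm; Principal component analysis (PCA); Data mining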

