Spectral Classification of M-Type Stars Based on Ensemble Tree Models

Cited by: 1
Abstract: In the Hertzsprung-Russell diagram, M giants sit at the top of the red giants and are the brightest stars that evolve from Sun-like main-sequence stars. Studying M giants is crucial for understanding the Milky Way, and the Galactic halo in particular. At low and medium resolution, M giant spectra are often confused with M dwarf spectra because their distinguishing features are weak and affected by noise. Previous studies typically used the CaH2+CaH3 vs. TiO5 molecular index diagram to select M giant candidates and then confirmed them by visual inspection. This approach relies on only three giant-related molecular band indices and ignores the other spectral features that identify M giants, so noise contamination of the indices can cause misclassification. Visual inspection of large numbers of spectra is also time-consuming, its quality depends on the inspector's experience, and its reliability cannot be guaranteed. Between the start of its pilot survey in 2011 and June 2017, LAMOST released spectra of more than 9 million celestial objects. The latest data release, DR5, contains 520,000 M-type stellar spectra, which calls for an automatic, accurate, and effective method to separate the M-type subsamples of different luminosity classes. This study uses four ensemble tree models, Random Forest, GBDT, XGBoost, and LightGBM, to build luminosity classifiers that distinguish M giants from M dwarfs. The test accuracies of the four classifiers are 97.23%, 98%, 98.05%, and 98.32%, respectively. The experiments show that LightGBM achieves higher accuracy, shorter training time, and higher classification efficiency than the other three models. Analysis of the important features learned by the classifiers shows that the ensemble tree algorithms effectively extract and express the structural features that separate M giants from M dwarfs. These features include not only the wavelength positions of atomic lines and molecular absorption bands but also their adjacent pseudo-continuum, which is consistent with the feature wavelengths and pseudo-continuum regions traditionally needed to compute the indices. Compared with traditional classification methods, ensemble tree models classify with a combination of tens or hundreds of important spectral features rather than only a few, which avoids the misclassification that arises when a single noise-sensitive feature is relied on. The results show that ensemble tree algorithms have significant advantages for M giant identification and can fully replace the traditional discrimination method that uses only the CaH and TiO indices. This work gives LAMOST an effective method for processing massive numbers of celestial spectra efficiently and accurately. As the LAMOST survey continues, the accumulated M giant and M dwarf samples will provide an important data basis for studying the structure and evolution of the Milky Way.
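The traditional index-based screening that the abstract contrasts with the ensemble tree classifiers can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: each index is taken to be the ratio of the mean flux in a feature window to the mean flux in a pseudo-continuum window, and the window limits in `BANDS` are assumed values quoted from commonly cited definitions that should be checked against the literature before use. The function names and the toy spectrum are hypothetical.

```python
from statistics import mean

# Band windows in Angstroms: (feature_lo, feature_hi, continuum_lo, continuum_hi).
# ASSUMED values for illustration only; verify against the literature before use.
BANDS = {
    "CaH2": (6814.0, 6846.0, 7042.0, 7046.0),
    "CaH3": (6960.0, 6990.0, 7042.0, 7046.0),
    "TiO5": (7126.0, 7135.0, 7042.0, 7046.0),
}


def mean_flux(wavelengths, fluxes, lo, hi):
    """Mean flux of the pixels whose wavelength lies in [lo, hi]."""
    selected = [f for w, f in zip(wavelengths, fluxes) if lo <= w <= hi]
    if not selected:
        raise ValueError(f"no pixels between {lo} and {hi} Angstroms")
    return mean(selected)


def band_index(wavelengths, fluxes, name):
    """Ratio of mean flux in the feature window to that in the pseudo-continuum."""
    f_lo, f_hi, c_lo, c_hi = BANDS[name]
    feature = mean_flux(wavelengths, fluxes, f_lo, f_hi)
    continuum = mean_flux(wavelengths, fluxes, c_lo, c_hi)
    return feature / continuum


def molecular_indices(wavelengths, fluxes):
    """Compute the three band indices plus the combined CaH2+CaH3 value."""
    idx = {name: band_index(wavelengths, fluxes, name) for name in BANDS}
    idx["CaH2+CaH3"] = idx["CaH2"] + idx["CaH3"]
    return idx


def toy_flux(w):
    """Hypothetical spectrum: flat continuum of 1.0 with three artificial bands."""
    if 6814 <= w <= 6846:
        return 0.5
    if 6960 <= w <= 6990:
        return 0.75
    if 7126 <= w <= 7135:
        return 0.25
    return 1.0


# Toy wavelength grid from 6800 to 7199 Angstroms, one pixel per Angstrom.
wavelengths = [6800.0 + i for i in range(400)]
fluxes = [toy_flux(w) for w in wavelengths]

indices = molecular_indices(wavelengths, fluxes)
print(indices)
```

On the toy spectrum the indices simply reproduce the band depths that were put in. A classifier of the kind described in the abstract would instead take many flux values across the spectrum as input, so a single noisy window has less influence on the result.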
Authors: WANG Jing; YI Zhen-ping; YUE Li-li; DONG Hui-fen; PAN Jing-chang; BU Yu-de (School of Mechanical, Electrical & Information Engineering, Shandong University, Weihai, Weihai 264209, China; School of Mathematics and Statistics, Shandong University, Weihai, Weihai 264209, China)
Source: Spectroscopy and Spectral Analysis (《光谱学与光谱分析》), 2019, Issue 7, pp. 2288-2292 (5 pages). Indexed in SCIE, EI, CAS, CSCD, and the Peking University Core Journals list.
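Funding: Supported by the National Natural Science Foundation of China (11603014, 11603012) and the Shandong University Young Scholars Future Program (2016WHWLJH09).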
Keywords: M giants; Ensemble tree; Spectral classification; Feature extraction