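Abstract
Research on pulsar recognition has been confined to conventional classification algorithms and lacks approaches tailored to the task. To address this, the paper examines the characteristics of pulsar datasets and explores how their intrinsic features relate to other research fields. It identifies a connection between pulsar data and long-tail distributions, examines the consistency of their characteristics, and for the first time treats the pulsar data distribution as a special case of a long-tail distribution. Starting from the optimized training strategies used in long-tail visual recognition, it proposes a pulsar recognition algorithm based on a decoupled training strategy. Because it relies on decoupled training, the algorithm is simple and efficient to apply and is more portable. Validation on datasets shows that the algorithm effectively improves the decision boundary: recall on the HTRU_bands and HTRU_ints datasets improves by 11.8% and 13%, respectively, over the comparison methods, making it a cost-effective recognition algorithm.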
Research on pulsar recognition has largely been confined to conventional classification algorithms, with approaches falling into two main categories: adapting the data to the model and adapting the model to the data. As a result, it lacks methods tailored to the problem. To address this limitation, this study examines the characteristics of pulsar datasets, explores how their intrinsic features relate to other research fields, and finds a connection between pulsar data and long-tail distributions. The consistency of pulsar data with the long-tail distribution is explored and, for the first time, the pulsar data distribution is treated as a special case of a long-tail distribution. On this basis, drawing on the optimized training strategies of long-tail visual recognition, a pulsar recognition algorithm based on a decoupled training strategy is proposed. Traditional algorithms in pulsar recognition mainly improve the model or the data; by contrast, the proposed algorithm works at the training level, adopting a decoupled training strategy that is simple and efficient to apply and more portable. Training is divided into two stages. The first stage jointly trains the whole pulsar recognition model using instance-balanced sampling. The second stage freezes the feature extraction network obtained in the first stage and fine-tunes the classifier. Several fine-tuning strategies were used, including class-balanced training of the classifier and nearest-neighbor search on normalized features. Validation on multiple datasets shows that fine-tuning the classifier separately effectively improves the decision boundary and raises recall and other metrics, making it a cost-effective improvement.
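The two sampling regimes and the nearest-neighbor fine-tuning option mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract gives no formulas, so the sampling rule p_j = n_j^q / sum_i n_i^q (q = 1 for instance-balanced, q = 0 for class-balanced) is borrowed from the general decoupled-training literature on long-tail recognition, and the function names, class counts, and toy feature vectors are all hypothetical.

```python
import math


def sampling_probs(class_counts, q):
    """Per-class sampling probability p_j = n_j**q / sum_i n_i**q.

    Assumed convention (not from the abstract): q=1 gives instance-balanced
    sampling (stage 1), q=0 gives class-balanced sampling (stage 2).
    """
    weights = [n ** q for n in class_counts]
    total = sum(weights)
    return [w / total for w in weights]


def l2_normalize(vec):
    """Scale a feature vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def class_means(features, labels):
    """Mean of the L2-normalized features of each class."""
    sums, counts = {}, {}
    for feat, label in zip(features, labels):
        feat = l2_normalize(feat)
        if label not in sums:
            sums[label] = [0.0] * len(feat)
            counts[label] = 0
        sums[label] = [s + x for s, x in zip(sums[label], feat)]
        counts[label] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}


def ncm_predict(feature, means):
    """Return the class whose mean is nearest to the normalized feature."""
    feat = l2_normalize(feature)

    def sq_dist(label):
        return sum((a - b) ** 2 for a, b in zip(feat, means[label]))

    return min(means, key=sq_dist)


# Hypothetical class sizes: 9000 non-pulsar candidates, 100 pulsars.
counts = [9000, 100]
stage1 = sampling_probs(counts, q=1)  # dominated by the head class
stage2 = sampling_probs(counts, q=0)  # both classes equally likely

# Toy "frozen backbone" features: label 0 = non-pulsar, 1 = pulsar.
means = class_means([[1.0, 0.0], [2.0, 0.0], [0.0, 3.0]], [0, 0, 1])
prediction = ncm_predict([0.2, 5.0], means)
```

With instance-balanced sampling the rare pulsar class is drawn in proportion to its frequency, while class-balanced sampling draws both classes equally often, which is the kind of rebalancing that can shift a classifier's decision boundary toward better recall on the rare class.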
Authors
尹乾
车润琪
杨如意
郑新
YIN Qian; CHE Runqi; YANG Ruyi; ZHENG Xin (School of Artificial Intelligence, Beijing Normal University, Beijing 100875, China; School of Information Science and Technology, University of Jinan, Jinan 250002, China)
Source
《光学技术》
CAS
CSCD
PKU Core (Peking University Core Journals)
2023, No. 6, pp. 680-684, 698 (6 pages)
Optical Technique
Funding
National Natural Science Foundation of China, Joint Fund for Astronomy (U2031136).
Keywords
pulsar recognition algorithm
long-tail distribution
decoupling training