Abstract
High-dimensional feature data usually contain redundant and irrelevant features. Relief, a traditional feature selection algorithm, is widely used because of its high stability and computational efficiency. However, its selection results are subject to randomness, and on datasets with strong dependencies between features, such as collinearity, the results may be inaccurate. Building on a study of feature selection methods, this paper proposes L-ACO, a method based on LightGBM and the ant colony algorithm. L-ACO uses LightGBM feature importance as the heuristic information in the ant colony path search, and uses the Pearson correlation coefficient between features to adjust the pheromone concentration so as to better control feature correlation. Experiments show that L-ACO reduces the number of features, lowers feature redundancy, and improves algorithm performance while maintaining classification accuracy.
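The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the update rules, so the transition probability (pheromone^alpha × importance^beta), the evaporation rate `rho`, and the redundancy-scaled deposit are assumptions, as are the names `pearson`, `l_aco_sketch`, and the user-supplied `score` callback. The `importance` vector is assumed to come from a fitted LightGBM model (for example the `feature_importances_` attribute of `LGBMClassifier`); it is passed in as plain numbers here so that the sketch needs only the standard library.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0 or vb == 0:
        return 0.0  # constant column: treat as uncorrelated
    return cov / math.sqrt(va * vb)


def l_aco_sketch(columns, importance, score, k, n_ants=10, n_iter=20,
                 alpha=1.0, beta=1.0, rho=0.1, seed=0):
    """Select k feature indices with a simplified ant colony search.

    columns    : list of feature columns (each a list of numbers)
    importance : heuristic value per feature (assumed: LightGBM importance)
    score      : callable(subset) -> float, higher is better (assumed)
    """
    n = len(columns)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of features")
    rng = random.Random(seed)
    # Absolute Pearson correlation between every pair of features.
    corr = [[abs(pearson(columns[i], columns[j])) for j in range(n)]
            for i in range(n)]
    eta = [imp + 1e-9 for imp in importance]  # heuristic, kept positive
    tau = [1.0] * n                           # pheromone per feature
    best_subset, best_score = None, float("-inf")

    for _ in range(n_iter):
        iter_best, iter_score = None, float("-inf")
        for _ in range(n_ants):
            chosen = []
            while len(chosen) < k:
                cand = [j for j in range(n) if j not in chosen]
                weights = [(tau[j] ** alpha) * (eta[j] ** beta) for j in cand]
                chosen.append(rng.choices(cand, weights=weights)[0])
            s = score(chosen)
            if s > iter_score:
                iter_best, iter_score = chosen, s
        if iter_score > best_score:
            best_subset, best_score = sorted(iter_best), iter_score

        # Evaporate, then deposit on the iteration-best subset. The deposit
        # is scaled down for features that are strongly correlated with the
        # other selected features (assumed form of the Pearson adjustment).
        tau = [(1.0 - rho) * t for t in tau]
        for j in iter_best:
            others = [corr[j][m] for m in iter_best if m != j]
            redundancy = sum(others) / len(others) if others else 0.0
            tau[j] += max(iter_score, 0.0) * (1.0 - redundancy)

    return best_subset, best_score
```

In practice `score` would typically be a cross-validated classification accuracy on the chosen columns. Scaling the deposit by one minus the mean absolute correlation means that a feature which merely duplicates another selected feature gains little pheromone, which is one plausible way to discourage redundant subsets.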
Authors
别春洋
陶贻勇
Bie Chunyang; Tao Yiyong (School of Computer Science and Engineering, Anhui University of Science & Technology, Huainan 232001, China)
Source
Modern Computer (《现代计算机》)
2024, No. 4, pp. 34-38 (5 pages)