Abstract
[Objective] When machine learning is used to diagnose the refrigerant charge of train air-conditioning systems, many candidate features are usually available, and choosing among them is a dilemma: too many features make the algorithm expensive to run, while too few prevent the model from learning the fault information and degrade diagnostic performance. PCA (principal component analysis) is a commonly used method for this step, and MRE (Maximum Rayleigh Entropy) is a less frequently cited alternative; both incur high resource costs when the number of features is large. This study aims to reduce that cost and to compare the two methods. [Method] The classical SVM (support vector machine) and K-means clustering models are used as the basis for comparison. To further reduce resource costs, the MRE dimensionality reduction is constructed from a small sample of historical data rather than from the entire sample, which constitutes an improvement to the algorithm. MRE (maximization of the generalized Rayleigh entropy) dimensionality reduction based on the small historical sample is compared with the commonly used PCA dimensionality reduction in terms of F1 score, accuracy, and time cost for both the K-means clustering and SVM models. [Result & Conclusion] The time cost of fault diagnosis and detection for the SVM using MRE is only 3% of that of the SVM trained on the original data. For both K-means clustering and SVM, the test accuracy on data projected by MRE is closer to 100% than that of the other models.
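The abstract does not give the formulas behind MRE, so the following is only a minimal sketch under a stated assumption: that maximizing the generalized Rayleigh entropy means finding directions w that maximize the Fisher-type ratio (wᵀ S_b w) / (wᵀ S_w w), where S_b and S_w are the between-class and within-class scatter matrices. The data are synthetic, all function names are hypothetical, and a nearest-centroid classifier stands in for the SVM and K-means models to keep the example NumPy-only. The projection is fitted on a small labeled sample, while PCA is fitted on the full training set.

```python
import numpy as np

def rayleigh_projection(X, y, n_components, reg=1e-6):
    """Directions maximizing (w^T S_b w) / (w^T S_w w), fitted on (X, y)."""
    d = X.shape[1]
    mean_all = X.mean(axis=0)
    S_w = reg * np.eye(d)          # small ridge keeps S_w invertible
    S_b = np.zeros((d, d))
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        S_w += (Xc - mc).T @ (Xc - mc)
        diff = (mc - mean_all)[:, None]
        S_b += len(Xc) * (diff @ diff.T)
    # Generalized eigenproblem S_b w = lambda S_w w, solved as eig(S_w^{-1} S_b).
    eigvals, eigvecs = np.linalg.eig(np.linalg.solve(S_w, S_b))
    order = np.argsort(eigvals.real)[::-1][:n_components]
    return eigvecs[:, order].real

def pca_projection(X, n_components):
    """Top principal directions via SVD of the centered data."""
    centered = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(centered, full_matrices=False)
    return Vt[:n_components].T

def nearest_centroid_accuracy(Z_fit, y_fit, Z_test, y_test):
    """Stand-in classifier (not the SVM or K-means of the paper)."""
    classes = np.unique(y_fit)
    centroids = np.stack([Z_fit[y_fit == c].mean(axis=0) for c in classes])
    dists = np.linalg.norm(Z_test[:, None, :] - centroids[None, :, :], axis=2)
    return float(np.mean(classes[dists.argmin(axis=1)] == y_test))

def make_data(rng, n_per_class):
    """Synthetic data: 2 informative low-variance features, 8 noisy ones."""
    means = np.array([[-2.0, -1.0], [0.0, 2.0], [2.0, -1.0]])
    X_parts, y_parts = [], []
    for label, m in enumerate(means):
        signal = m + 0.5 * rng.standard_normal((n_per_class, 2))
        noise = 5.0 * rng.standard_normal((n_per_class, 8))
        X_parts.append(np.hstack([signal, noise]))
        y_parts.append(np.full(n_per_class, label))
    return np.vstack(X_parts), np.concatenate(y_parts)

rng = np.random.default_rng(0)
X_train, y_train = make_data(rng, 300)
X_test, y_test = make_data(rng, 300)
X_small, y_small = make_data(rng, 20)   # assumed "small historical sample"

W_mre = rayleigh_projection(X_small, y_small, n_components=2)
W_pca = pca_projection(X_train, n_components=2)

acc_mre = nearest_centroid_accuracy(X_train @ W_mre, y_train, X_test @ W_mre, y_test)
acc_pca = nearest_centroid_accuracy(X_train @ W_pca, y_train, X_test @ W_pca, y_test)
print(f"Rayleigh-quotient projection accuracy: {acc_mre:.3f}")
print(f"PCA projection accuracy:               {acc_pca:.3f}")
```

On this toy data the class information sits in low-variance features, so the unsupervised PCA directions tend to follow the noise while the supervised projection recovers the informative plane from only 20 samples per class. This illustrates why such a comparison can favor the Rayleigh-type criterion; it does not reproduce the paper's data or results.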
Authors
ZHANG Mengyuan; WANG Zhao; BAO Chao; CHEN Huanxin; ZHANG Jianxin; CHENG Henda (China-EU Institute for Clean and Renewable Energy at Huazhong University of Science & Technology, 430074, Wuhan, China; Guangzhou Dinghan Railway Vehicles Equipment Co., Ltd., 510260, Guangzhou, China; Guangzhou Metro Group Co., Ltd., 510330, Guangzhou, China; School of Energy and Power Engineering, Huazhong University of Science and Technology, 430074, Wuhan, China)
Source
Urban Mass Transit (城市轨道交通研究)
Peking University Core Journal
2024, No. 10, pp. 255-259 (5 pages)
Funding
National Natural Science Foundation of China (51876070).
Keywords
train air-conditioning
refrigerant charge amount
small sample
maximum Rayleigh entropy (generalized Rayleigh entropy)