Abstract
To improve both the recognition rate and the real-time performance of pipeline anomaly detection, this paper proposes an ensemble classification method that combines semi-supervised K-means clustering with C4.5 decision trees, tuned by Tabu search. A cost-sensitive function is introduced into the Tabu search to select the feature subset with the best classification performance and the best ensemble weights, which improves recognition of the minority class in imbalanced data. The semi-supervised K-means step first partitions the sample features into k clusters; a supervised C4.5 decision tree is then trained within each cluster to refine its decision boundaries by learning the subgroups inside it. Cascading K-means and C4.5 in this way alleviates the class-imbalance problem and improves the accuracy of pipeline detection. Three rules are proposed for integrating the final K-means and C4.5 decisions: the weighted sum rule, the nearest consensus rule, and the nearest-neighbor rule. Experimental results verify the effectiveness of the algorithm, which achieves high classification accuracy in detecting abnormal pipeline conditions.
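The cost-sensitive Tabu search over feature subsets can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the cost values `c_fn` and `c_fp`, the single-bit-flip neighborhood, the tabu tenure, the toy data, and the nearest-centroid classifier, which stands in for the K-means and C4.5 cascade. The cost is also evaluated on the training data itself to keep the example short.

```python
import random

def cost_sensitive_error(y_true, y_pred, c_fn=5.0, c_fp=1.0):
    """Average misclassification cost; label 1 is the minority (anomaly) class.
    The cost values are illustrative assumptions."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 0:
            total += c_fn  # missed anomaly, penalized more heavily
        elif t == 0 and p == 1:
            total += c_fp  # false alarm
    return total / len(y_true)

def nearest_centroid_predict(X, y, mask):
    """Stand-in classifier (not the paper's cascade): assign each sample to the
    closer class centroid, using only the features selected by mask."""
    idx = [j for j, keep in enumerate(mask) if keep]
    if not idx:
        return [0] * len(X)  # no features selected: predict the majority class
    centroids = {}
    for label in (0, 1):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [sum(r[j] for r in rows) / len(rows) for j in idx]
    preds = []
    for x in X:
        dist = {lab: sum((x[j] - c) ** 2 for j, c in zip(idx, cen))
                for lab, cen in centroids.items()}
        preds.append(0 if dist[0] <= dist[1] else 1)
    return preds

def tabu_feature_search(X, y, n_iter=30, tenure=2):
    """Tabu search over boolean feature masks, minimizing the cost-sensitive error."""
    n = len(X[0])

    def score(mask):
        return cost_sensitive_error(y, nearest_centroid_predict(X, y, mask))

    current = [True] * n
    best, best_cost = current[:], score(current)
    tabu = []  # indices of recently flipped features
    for _ in range(n_iter):
        candidates = []
        for j in range(n):
            neighbor = current[:]
            neighbor[j] = not neighbor[j]  # flip one feature in or out
            cost = score(neighbor)
            # Aspiration: a tabu move is allowed if it beats the best cost so far.
            if j not in tabu or cost < best_cost:
                candidates.append((cost, j, neighbor))
        if not candidates:
            break
        # Take the best admissible neighbor, even if it is worse than current.
        cost, j, current = min(candidates, key=lambda c: (c[0], c[1]))
        tabu.append(j)
        if len(tabu) > tenure:
            tabu.pop(0)
        if cost < best_cost:
            best, best_cost = current[:], cost
    return best, best_cost

# Toy imbalanced data: feature 0 is informative, features 1 and 2 are noise.
random.seed(0)
X = [[random.gauss(0, 1), random.gauss(0, 5), random.gauss(0, 5)] for _ in range(40)]
X += [[random.gauss(4, 1), random.gauss(0, 5), random.gauss(0, 5)] for _ in range(8)]
y = [0] * 40 + [1] * 8
mask, cost = tabu_feature_search(X, y)
print(mask, cost)
```

Weighting a missed anomaly more heavily than a false alarm biases the search toward feature subsets that serve the minority class, which is the role the abstract assigns to the cost-sensitive function. The same loop could in principle also search over ensemble weights by extending the solution encoding.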
Source
《模式识别与人工智能》
EI
CSCD
Peking University Core Journals (北大核心)
2013, No. 1, pp. 83-89 (7 pages)
Pattern Recognition and Artificial Intelligence
Funding
Key Program of the National Natural Science Foundation of China (No. 60935001)