Abstract
In many real-world applications, the nature of data streams makes it difficult to obtain class labels for all of the data. To address the classification of data streams with incomplete class labels, this paper first analyzes how the labeled data set affects the classification error of semi-supervised classification algorithms based on the cluster assumption. Then, drawing on this error analysis and on the characteristics of data streams, it proposes an ensemble classification algorithm, semi-supervised data stream ensemble classifiers under the cluster assumption (SSDSEC), and discusses how to set the weights of the individual classifiers. Finally, simulation experiments are used to verify the effectiveness of the proposed algorithm.
Source
Bulletin of Science and Technology (《科技通报》)
Peking University Core Journal
2014, Issue 1, pp. 117-122 (6 pages)
Keywords
data stream
classifier
water quality environment