Abstract
For multi-label classification, this paper uses the k-nearest-neighbor idea to construct a new dataset that describes each test sample through the class labels of its neighboring samples, and builds multi-label classification algorithms on this new dataset with regression models. First, for each label, the distance-aware k nearest neighbors of the test samples are computed, which yields a new dataset of samples over the label set. Second, linear regression and Logistic regression are applied to the new dataset, giving multi-label classification algorithms based on the samples' k-nearest-neighbor data. To further exploit the information in the original data, the Markov boundary of each label with respect to the original attributes is considered and combined with the features of the new dataset to build a new regression model, giving a multi-label classification algorithm that accounts for the Markov boundary. Experimental results show that the proposed methods outperform commonly used multi-label learning algorithms.
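The two-stage idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the inverse-distance weighting, the Euclidean metric, the least-squares fit, the 0.5 threshold, the toy data, and the function names `knn_label_features`, `fit_linear` and `predict` are all assumptions made for illustration, and the Markov boundary step is omitted.

```python
import numpy as np


def knn_label_features(X_train, Y_train, X_query, k=3, exclude_self=False):
    """Build one new feature per label: the distance-weighted share of the
    k nearest training samples that carry that label (weighting is assumed)."""
    # Pairwise Euclidean distances, shape (n_query, n_train).
    d = np.linalg.norm(X_query[:, None, :] - X_train[None, :, :], axis=2)
    if exclude_self:
        # Only meaningful when X_query is X_train: a sample is not its own neighbor.
        np.fill_diagonal(d, np.inf)
    idx = np.argsort(d, axis=1)[:, :k]           # indices of the k nearest neighbors
    nd = np.take_along_axis(d, idx, axis=1)      # their distances
    w = 1.0 / (nd + 1e-8)                        # inverse-distance weights (assumed)
    w /= w.sum(axis=1, keepdims=True)            # normalize weights per query
    # Y_train[idx] has shape (n_query, k, n_labels); sum the weighted votes over k.
    return (w[:, :, None] * Y_train[idx]).sum(axis=1)


def fit_linear(Z, Y):
    """Least-squares linear regression from neighbor features Z to labels Y."""
    A = np.hstack([Z, np.ones((len(Z), 1))])     # append an intercept column
    W, *_ = np.linalg.lstsq(A, Y, rcond=None)
    return W


def predict(W, Z, threshold=0.5):
    """Threshold the regression output to obtain a 0/1 label matrix."""
    A = np.hstack([Z, np.ones((len(Z), 1))])
    return (A @ W >= threshold).astype(int)


# Toy data (hypothetical): two well-separated clusters, two labels.
X_train = np.array([[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]], dtype=float)
Y_train = np.array([[1, 0], [1, 0], [1, 0], [0, 1], [0, 1], [0, 1]])
X_test = np.array([[0.2, 0.1], [5.5, 5.2]])

Z_train = knn_label_features(X_train, Y_train, X_train, k=2, exclude_self=True)
Z_test = knn_label_features(X_train, Y_train, X_test, k=2)
W = fit_linear(Z_train, Y_train)
pred = predict(W, Z_test)
```

Replacing `fit_linear` with a per-label logistic model would give the Logistic regression variant mentioned in the abstract; the neighbor-feature construction would stay the same under these assumptions.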
Source
《计算机工程与应用》
CSCD
Peking University Core Journal
2018, No. 6, pp. 135-142 (8 pages)
Computer Engineering and Applications
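Funding
National Natural Science Foundation of China (No. 11501435)
Xi'an Science and Technology Plan Project (No. CXY1441(2))
Xi'an Polytechnic University Graduate Innovation Fund (No. CX201726)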