摘要
基于L2范数度量的k平面聚类(k-Plane Clustering,k PC)设计思想,本文提出了一种采用L1范数度量的聚类算法。由于在平面更新步骤中,所导出的优化问题是非凸的,文中给出了一种求解方法,即将非凸问题转化为有限个子集上的凸问题,为避免求解多个优化问题导致训练时间过长问题,本文还设计了一种新的优选策略,有限个子集的搜索任务可在线性时间内完成。本文所提出的方法只需要求解k个线性规划,而不再是k PC的求解特征值问题。在人工和UCI数据集上的实验结果表明:基于L1范数平面聚类算法的训练和测试时间更短,且在大多数数据集上均表现出了更好的聚类性能。
Inspiring by the k-plane clustering(k PC)on L2 norm metrics,a L1 norm clustering algorithm is proposed by introducing L1 metric into clustering,which is termed as L1 k PC(k-plane clustering using L1 norm). The plane-updating of L1 k PC can be characterized by a nonconvex optimization problem. An alternative strategy is provided to conquer such non-convexity. That is,the nonconvex problem can be transformed into a series of convex problems on a finite number of subsets. Meanwhile,in order to avoid solving multiple optimization problems on individual subsets thereby resulting in heavy training burden,a search strategy is also provided to seek suitable subset and this search task can be completed in a linear time.Thus the foresaid optimization problem only needs to solve k linear programming instead of solving the k eigenvalue problems in k PC. Experimental results on artificial and UCI datasets show that the proposed method has less training and testing time-consume,and comparable or even better clustering performance on the majority data sets.
作者
杨红鑫
杨绪兵
寇振宇
业巧林
张福全
许等平
YANG Hongxin;YANG Xubing;KOU Zhenyu;YE Qiaolin;ZHANG Fuquan;XU Dengping(College of Information Science and Technology,Nanjing Forestry University,Nanjing,210037,China;State Forestry Administration Survey Planning Institute,Beijing,100714,China)
出处
《南京航空航天大学学报》
EI
CAS
CSCD
北大核心
2019年第5期681-686,共6页
Journal of Nanjing University of Aeronautics & Astronautics
基金
国家自然科学基金(61472186,50375057)资助项目
江苏省自然科学基金(BK20161527,BK20171453)资助项目
江苏省研究生科研与实践创新计划(SJKY19_0907)资助项目
关键词
L1范数
凸问题
平面聚类
线性规划
L1 norm
convex problem
plane clustering
linear programming