Abstract
To address the problem of mining line patterns in non-linear datasets, this paper proposes a regression algorithm based on density-weighted Expectation Maximization (EM) and a split-merge strategy. Building on the finite mixture model, the algorithm represents each line pattern with a point-direction equation and introduces grid density into the EM procedure as an adjusting weight, which effectively reduces the likelihood that the regression falls into a local optimum. A split-merge strategy is then introduced so that the algorithm can handle the connectivity problem and still obtain correct results when the specified number of patterns to mine differs from the true number of line patterns. Experimental results show that the algorithm is insensitive to the specified number of patterns and correctly extracts the line patterns of datasets in noisy environments.
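The abstract gives no formulas, so the following is only a minimal sketch of what a density-weighted EM for a mixture of lines might look like, not the paper's implementation. It assumes 2-D data, a Gaussian model on the orthogonal distance to each line, a point-direction representation (anchor point plus unit direction), and per-point weights taken as the point count of the enclosing grid cell. The function names `grid_density_weights` and `fit_lines_em`, the parameters `cell`, `n_iter`, and `seed`, and the exact weighting scheme are all hypothetical. The split-merge step is not shown.

```python
import numpy as np


def grid_density_weights(points, cell=1.0):
    """Weight each point by the number of points sharing its grid cell.

    Assumption: this is one plausible reading of "grid density"; weights are
    rescaled to have mean 1.
    """
    cells = np.floor(np.asarray(points, float) / cell).astype(int)
    _, inverse, counts = np.unique(
        cells, axis=0, return_inverse=True, return_counts=True
    )
    w = counts[inverse.ravel()].astype(float)
    return w / w.mean()


def fit_lines_em(points, k, weights=None, n_iter=50, seed=0):
    """Fit k lines (anchor point + unit direction) with a weighted EM loop.

    Returns (anchors, dirs, resp): anchors and dirs have shape (k, 2),
    resp has shape (n, k) and holds the soft assignments.
    """
    points = np.asarray(points, float)
    n = len(points)
    w = np.ones(n) if weights is None else np.asarray(weights, float)

    # Random soft assignment as a starting point (hypothetical choice).
    rng = np.random.default_rng(seed)
    resp = rng.random((n, k)) + 1e-3
    resp /= resp.sum(axis=1, keepdims=True)

    anchors = np.empty((k, 2))
    dirs = np.empty((k, 2))
    var = np.empty(k)
    resid = np.empty((n, k))

    for _ in range(n_iter):
        # M-step: weighted total least squares per component.
        g = resp * w[:, None]              # density-adjusted responsibilities
        mass = g.sum(axis=0) + 1e-12
        mix = mass / mass.sum()
        for j in range(k):
            anchors[j] = g[:, j] @ points / mass[j]      # weighted centroid
            c = points - anchors[j]
            cov = (g[:, j, None] * c).T @ c / mass[j]
            _, vecs = np.linalg.eigh(cov)                # ascending eigenvalues
            dirs[j] = vecs[:, -1]                        # principal direction
            # Signed orthogonal distance to the line (2-D cross product).
            resid[:, j] = c[:, 0] * dirs[j, 1] - c[:, 1] * dirs[j, 0]
            var[j] = max((g[:, j] * resid[:, j] ** 2).sum() / mass[j], 1e-6)

        # E-step: Gaussian likelihood of the orthogonal distance, in log space.
        log_p = np.log(mix) - 0.5 * np.log(var) - 0.5 * resid ** 2 / var
        log_p -= log_p.max(axis=1, keepdims=True)
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)

    return anchors, dirs, resp
```

A call such as `fit_lines_em(pts, k=2, weights=grid_density_weights(pts, cell=0.5))` would run the weighted variant; passing `weights=None` falls back to plain EM, which makes it easy to compare the two on the same data.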
Source
Journal of Electronics & Information Technology (《电子与信息学报》)
EI
CSCD
Peking University Core Journals
2012, Issue 5, pp. 1162-1167 (6 pages)
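Funding
National Natural Science Foundation of China (61005032)
Natural Science Foundation of Liaoning Province (20102062)
Shenyang Science Plan Project (F10-147-9-00)
Fundamental Research Funds for the Central Universities (N100604018)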
Keywords
Data mining
Line pattern
Expectation Maximization (EM)
Grid density
Split-merge strategy