Abstract
High-dimensional data features occupy a large amount of space, which lowers mining accuracy and data integrity. To address this problem, this paper proposes a dimensionality-reduction mining technique for high-dimensional data based on dimension extension and rearrangement. First, the influence of high-dimensional data structure on data mining is clarified, and the data are preprocessed by transforming them into a specific interval. Second, singular value decomposition is used to select the first dimension. The Pearson correlation coefficient is then used to compute the similarity between dimensions and to build a similarity result matrix; the second dimension is found from the first, and the process repeats in the same way to achieve dimension extension and rearrangement. Next, the high-dimensional data are projected onto a low-dimensional subspace through a certain combination, which reduces the data dimension. Finally, a dimensionality-reduction mining model for high-dimensional data is built through data clustering and feature compression transformation. Simulation results show that the proposed method improves mining accuracy, reduces time consumption, and greatly improves data integrity.
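The abstract names the steps but not their exact formulas, so the following Python sketch is only one plausible reading and not the authors' implementation. It assumes min-max scaling into [0, 1] as the "specific interval". It takes the first dimension to be the column with the largest absolute loading on the leading right singular vector. It orders the remaining columns greedily by absolute Pearson correlation with the previously chosen column, and it uses a truncated SVD projection as the low-dimensional mapping. All function names are hypothetical, and the clustering and feature compression stage is not covered.

```python
import numpy as np


def preprocess(X):
    """Min-max scale each column into [0, 1] (assumed 'specific interval')."""
    X = np.asarray(X, dtype=float)
    lo = X.min(axis=0)
    span = X.max(axis=0) - lo
    span[span == 0] = 1.0  # constant columns are mapped to 0
    return (X - lo) / span


def first_dimension(X):
    """Assumed rule: column with the largest absolute loading on the
    leading right singular vector of the centered data."""
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return int(np.argmax(np.abs(vt[0])))


def rearrange_dimensions(X):
    """Greedy reordering: start from the SVD-selected column, then keep
    appending the unused column most correlated with the last one chosen."""
    n_dims = X.shape[1]
    # Similarity matrix of absolute Pearson correlations between columns.
    sim = np.nan_to_num(np.abs(np.corrcoef(X, rowvar=False)))
    order = [first_dimension(X)]
    remaining = sorted(set(range(n_dims)) - set(order))
    while remaining:
        last = order[-1]
        nxt = max(remaining, key=lambda j: sim[last, j])
        order.append(nxt)
        remaining.remove(nxt)
    return order, sim


def project(X, k):
    """Project centered data onto its top-k right singular vectors."""
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ vt[:k].T
```

With a samples-by-dimensions array `X`, a typical call sequence would be `Xs = preprocess(X)`, then `order, sim = rearrange_dimensions(Xs)`, then `Z = project(Xs[:, order], k)` for a chosen target dimension `k`. The truncated SVD projection itself does not depend on column order, so in this sketch the rearrangement would matter mainly to any order-sensitive step that follows.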
Authors
DENG Hui (邓慧)
TAN Le-ting (谭乐婷)
Southwest Petroleum University, Nanchong, Sichuan 637000, China
Source
Computer Simulation (《计算机仿真》)
Peking University Core Journal (北大核心)
2022, Issue 6, pp. 434-438 (5 pages)
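Funding
Nanchong City-University Science and Technology Strategic Cooperation Project "Research on Dimensionality Reduction of High-Dimensional Data Based on Extended Cloud Computing" (18SXHZ0027).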
Keywords
Dimension extension rearrangement
Dimension reduction processing
Data mining
Singular value decomposition