Abstract
High-dimensional data clusters are easily disturbed by the external environment, which produces large numbers of abnormal clusters, and existing approaches mine these abnormal clusters inefficiently and with large errors. To address these problems, this paper proposes an intelligent mining method for abnormal clusters in high-dimensional data based on locality-sensitive hashing (LSH). The LSH algorithm is analyzed and used to measure the similarity of abnormal clusters in high-dimensional data, and a vector space model is then introduced to realize intelligent mining of those clusters. Experimental results show that, compared with a traditional machine learning method and a weighted fast clustering method, the proposed method achieves substantially higher accuracy and recall, which indicates that it has practical value.
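The abstract does not say which hash family or anomaly criterion the paper uses. As a rough illustration of the general idea only, the sketch below assumes random-hyperplane LSH (which approximates cosine similarity between vectors) and treats points that fall into sparsely populated hash buckets as candidate anomalies. All function names, parameters, and the bucket-size threshold are hypothetical and are not taken from the paper.

```python
import random
from collections import defaultdict


def make_hyperplanes(n_bits, dim, seed=0):
    """Draw n_bits random hyperplanes in dim-dimensional space (assumed hash family)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def lsh_signature(vec, hyperplanes):
    """One bit per hyperplane: 1 if vec lies on its non-negative side, else 0."""
    return tuple(
        1 if sum(h_i * v_i for h_i, v_i in zip(h, vec)) >= 0 else 0
        for h in hyperplanes
    )


def bucketize(vectors, hyperplanes):
    """Group vector indices by signature; similar directions tend to collide."""
    buckets = defaultdict(list)
    for idx, vec in enumerate(vectors):
        buckets[lsh_signature(vec, hyperplanes)].append(idx)
    return dict(buckets)


def sparse_bucket_members(buckets, min_size):
    """Hypothetical anomaly rule: indices in buckets with fewer than min_size members."""
    return sorted(
        i for members in buckets.values() if len(members) < min_size for i in members
    )


if __name__ == "__main__":
    planes = make_hyperplanes(n_bits=8, dim=3)
    data = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [-1.0, -2.0, -3.0]]
    groups = bucketize(data, planes)
    print(sparse_bucket_members(groups, min_size=2))
```

In this toy run the first two vectors point in the same direction and share a bucket, while the third points the opposite way and lands alone, so only it is reported. A practical system would typically use several hash tables and tune the signature length and threshold to the data.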
Author
王劭博
WANG Shaobo (Shenzhen Guodian Technology Communication Co., Ltd., Beijing 102299, China)
Source
《信息与电脑》
2022, No. 7, pp. 207-209 (3 pages)
Information & Computer
Keywords
high-dimensional data
abnormal cluster
mining
locality-sensitive hashing
distance