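Abstract: An outlier detection algorithm based on Markov chains (outlier detection algorithms based on Markov chain, MRKFOD) is proposed. The algorithm treats the underlying dataset as a weighted undirected graph in which each data object is a node and each weighted edge represents the similarity between two nodes. The resulting adjacency matrix is used as the transition probability matrix of a Markov chain, the principal eigenvector of that matrix is computed, and each node's component of the principal eigenvector is taken as the outlier degree of the corresponding data object. Experimental results show that, compared with other high-dimensional outlier mining algorithms, the algorithm improves markedly in both efficiency and the number of dimensions it can handle effectively.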
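The pipeline described in this abstract (similarity graph, transition matrix, principal eigenvector, per-node score) can be illustrated with a minimal sketch. The abstract does not specify the similarity measure, the normalisation, or the eigenvector solver, so everything concrete below is an assumption: a Gaussian kernel on Euclidean distance with a hypothetical `sigma` parameter, no self-loops, row-normalisation of the adjacency matrix, and plain power iteration. The function names are illustrative and are not taken from the paper.

```python
import math

def similarity_matrix(points, sigma=1.0):
    """Weighted undirected graph: Gaussian similarity (an assumed kernel), no self-loops."""
    n = len(points)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            w[i][j] = w[j][i] = math.exp(-d2 / (2.0 * sigma ** 2))
    return w

def transition_matrix(w):
    """Row-normalise the adjacency matrix so that each row sums to 1."""
    n = len(w)
    p = []
    for row in w:
        total = sum(row)
        # Fall back to a uniform row if a node has no measurable similarity to any other.
        p.append([x / total for x in row] if total > 0 else [1.0 / n] * n)
    return p

def stationary_distribution(p, tol=1e-12, max_iter=10000):
    """Power iteration for the principal left eigenvector (pi = pi P)."""
    n = len(p)
    pi = [1.0 / n] * n
    for _ in range(max_iter):
        nxt = [sum(pi[i] * p[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    return pi

# Toy data (made up for illustration): five nearby points and one distant point.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (0.5, 0.5), (4.0, 4.0)]
scores = stationary_distribution(transition_matrix(similarity_matrix(points)))
least_visited = min(range(len(scores)), key=scores.__getitem__)
print(scores, least_visited)
```

In this sketch a node that is only weakly similar to the rest of the data receives a small stationary probability, so the smallest score marks the most likely outlier. The abstract does not say in which direction the eigenvector value should be read as an outlier degree, so that interpretation is also an assumption.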
Abstract: An algorithm, Clustering Algorithm Based On Sparse Feature Vector (CABOSFV), is proposed for high-dimensional clustering of binary sparse data. The algorithm compresses the data effectively by means of a "Sparse Feature Vector", which greatly reduces the data scale, and obtains the clustering result with only one data scan. Both theoretical analysis and empirical tests show that CABOSFV has low computational complexity. The algorithm finds clusters in large high-dimensional datasets efficiently and handles noise effectively.
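The idea of summarising each cluster by a compact vector and assigning objects in a single scan can be sketched as follows. The abstract does not define the Sparse Feature Vector, so the summary used here is an assumed form and not necessarily the paper's definition: it stores the object count, the attributes equal to 1 in every object, and the attributes equal to 1 in only some objects. The dissimilarity formula, the `threshold` parameter, and all names are likewise hypothetical.

```python
from dataclasses import dataclass

@dataclass
class ClusterSummary:
    count: int         # number of objects summarised
    shared: frozenset  # attributes equal to 1 in every object
    mixed: frozenset   # attributes equal to 1 in some objects but not all

def merge(a, b):
    """Combine two summaries without revisiting the underlying objects."""
    shared = a.shared & b.shared
    mixed = (a.shared | b.shared | a.mixed | b.mixed) - shared
    return ClusterSummary(a.count + b.count, shared, mixed)

def difference(s):
    """Assumed dissimilarity: mixed attributes per object per shared attribute."""
    if not s.shared:
        return float("inf")
    return len(s.mixed) / (s.count * len(s.shared))

def one_pass_cluster(objects, threshold):
    """Assign each object to the best-fitting cluster in a single scan."""
    summaries, labels = [], []
    for obj in objects:
        single = ClusterSummary(1, frozenset(obj), frozenset())
        best_idx, best_merged, best_diff = None, None, None
        for idx, summary in enumerate(summaries):
            merged = merge(summary, single)
            diff = difference(merged)
            if diff <= threshold and (best_diff is None or diff < best_diff):
                best_idx, best_merged, best_diff = idx, merged, diff
        if best_idx is None:
            # No existing cluster stays within the threshold: start a new one.
            summaries.append(single)
            labels.append(len(summaries) - 1)
        else:
            summaries[best_idx] = best_merged
            labels.append(best_idx)
    return labels, summaries

# Each object is the set of attribute indices whose value is 1 (toy data).
objects = [{0, 1, 2}, {0, 1, 3}, {0, 1, 2, 3}, {7, 8}, {7, 8, 9}]
labels, summaries = one_pass_cluster(objects, threshold=0.5)
print(labels)
```

Because a merged summary is computed from two summaries alone, each object is read only once and the raw data never has to be kept in memory, which is the property the abstract attributes to the one-scan design.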