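Abstract: The cumulative local outlier factor (C_LOF) algorithm can effectively handle concept drift in data streams and overcome the masquerading problem in outlier detection, but its time complexity is high when processing high-dimensional data. To reduce this cost, a cumulative local outlier factor algorithm based on projection indexed nearest neighbors (PINN_C_LOF) is proposed. A sliding window maintains the active data points; when new data arrive and old data expire, the projection indexed nearest neighbor (PINN) method is used to incrementally update the neighbors of the affected data points in the window. Experimental results show that, when detecting outliers in high-dimensional streaming data, PINN_C_LOF has markedly lower time complexity than C_LOF while maintaining detection accuracy.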
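The sliding-window and projected-neighbor steps named in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class `ProjectedWindow` and the parameters `proj_dim` and `c` are hypothetical names, a brute-force scan of the projected points stands in for a real spatial index, neighbors are recomputed on demand rather than updated incrementally, and the score is a plain LOF rather than the cumulative C_LOF score.

```python
import math
import random
from collections import deque


def make_projection(dim, proj_dim, rng):
    """Gaussian random projection matrix (proj_dim rows, dim columns)."""
    scale = 1.0 / math.sqrt(proj_dim)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(dim)] for _ in range(proj_dim)]


def project(matrix, point):
    return tuple(sum(w * x for w, x in zip(row, point)) for row in matrix)


def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class ProjectedWindow:
    """Sliding window answering approximate k-NN queries via a projection (hypothetical)."""

    def __init__(self, dim, proj_dim, size, k, c=3, seed=0):
        self.matrix = make_projection(dim, proj_dim, random.Random(seed))
        # Each entry is (point, projected point); the oldest entry expires first.
        self.window = deque(maxlen=size)
        self.k, self.c = k, c

    def add(self, point):
        point = tuple(point)
        self.window.append((point, project(self.matrix, point)))

    def knn(self, point):
        """Return [(distance, neighbor)] for the k nearest window points."""
        point = tuple(point)
        q = project(self.matrix, point)
        others = [(p, pp) for p, pp in self.window if p != point]
        # Step 1: cheap candidate search in the low-dimensional projected space.
        others.sort(key=lambda item: dist(item[1], q))
        candidates = others[: self.c * self.k]
        # Step 2: refine the candidates with exact distances in the original space.
        exact = sorted((dist(p, point), p) for p, _ in candidates)
        return exact[: self.k]

    def lof(self, point):
        """Plain LOF of `point` relative to the current window contents."""
        def lrd(x):
            nbrs = self.knn(x)
            # Reachability distance: max(k-distance of neighbor, distance to neighbor).
            reach = [max(d, self.knn(n)[-1][0]) for d, n in nbrs]
            return len(reach) / max(sum(reach), 1e-12)

        nbrs = self.knn(point)
        return sum(lrd(n) for _, n in nbrs) / (len(nbrs) * lrd(point))


rng = random.Random(1)
win = ProjectedWindow(dim=20, proj_dim=5, size=50, k=5)
for _ in range(60):  # the first 10 points expire from the window
    win.add([rng.gauss(0.0, 1.0) for _ in range(20)])
print(win.lof([0.0] * 20), win.lof([8.0] * 20))
```

The candidate factor `c` trades speed for accuracy: a larger value examines more projected candidates before the exact re-ranking, so the approximate neighbors are more likely to match the true ones.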
Funding: Supported by the National Outstanding Youth Science Foundation of China (No. 60025308) and the Key Technologies R&D Program in the 10th Five-year Plan (No. 2001BA204B07)
Abstract: Data reconciliation can reduce the corruption of process data caused by measurement noise, but outliers caused by process peaks or unmeasured disturbances smear the reconciled results. Based on an analysis of the limitations of conventional outlier detection algorithms, this paper proposes a modified outlier detection method for dynamic data reconciliation (DDR). In the modified method, the outliers of each variable are identified individually and the weight of that variable is modified accordingly. The modified method can therefore use more of the information in the normal data and can efficiently reduce the effect of outliers. Simulation of a continuous stirred tank reactor (CSTR) process verifies the effectiveness of the proposed algorithm.
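The idea of flagging outliers per variable and adjusting only that variable's weight can be illustrated with a small Python sketch. It is a simplified illustration under stated assumptions, not the paper's method: it reconciles a single time step against one linear balance instead of solving a dynamic problem, and the detection rule (deviation from the median of recent history), the weight formula, and the names `variable_weights` and `reconcile` are all hypothetical.

```python
from statistics import median


def variable_weights(history, y, sigma, c=3.0):
    """Per-variable weights: 1 for normal values, below 1 for suspected outliers.

    Assumption: a value is suspect when it deviates from the median of that
    variable's recent history by more than c standard deviations.
    """
    weights = []
    for i, (yi, si) in enumerate(zip(y, sigma)):
        dev = abs(yi - median(row[i] for row in history))
        weights.append(1.0 if dev <= c * si else c * si / dev)
    return weights


def reconcile(y, sigma, a, weights):
    """Weighted least squares subject to the single balance sum(a_i * x_i) = 0."""
    # A smaller weight inflates the variance, so that variable absorbs more correction.
    v = [s * s / w for s, w in zip(sigma, weights)]
    residual = sum(ai * yi for ai, yi in zip(a, y))
    denom = sum(ai * ai * vi for ai, vi in zip(a, v))
    return [yi - vi * ai * residual / denom for yi, vi, ai in zip(y, v, a)]


# Hypothetical flow balance: inlet = outlet 1 + outlet 2.
a = [1.0, -1.0, -1.0]
sigma = [0.1, 0.1, 0.1]
history = [[10.0, 6.0, 4.0], [10.1, 6.1, 3.9], [9.9, 5.9, 4.1]]
y = [10.0, 9.0, 4.0]  # the second measurement contains a spike

w = variable_weights(history, y, sigma)
x = reconcile(y, sigma, a, w)
print(w, x)
```

Because only the suspect variable is down-weighted, most of the balance correction lands on it, while the other measurements stay close to their measured values and continue to contribute their information.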