Funding: National Natural Science Foundation of China (No. 60674088); Shandong Education Committee 2007 Scientific Research Development Plan (No. J07WJ20)
Abstract: A fundamental problem in whole-sequence matching and subsequence matching is how to represent time series. Over the last decade, many high-level representations of time series have been proposed for data mining, each involving a trade-off between accuracy and compactness. This paper proposes a novel time series representation, the Grid Minimum Bounding Rectangle (GMBR), which is based on the Minimum Bounding Rectangle (MBR) and applies a binary idea to it. Experiments were performed on both synthetic and real data sequences to evaluate the proposed method. They show that the method prunes 69%-92% of irrelevant sequences.
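The abstract does not specify how GMBR is constructed, so the following is only a minimal sketch of the plain MBR pruning idea that it builds on, not the authors' method. It assumes equal-length sequences, fixed-length segments, and Euclidean distance; the function names (`mbr_segments`, `mbr_lower_bound`, `range_search`) and the parameters `seg_len` and `eps` are hypothetical.

```python
import math

def mbr_segments(series, seg_len):
    """Summarise each run of seg_len points by its (min, max) value range."""
    return [(min(series[i:i + seg_len]), max(series[i:i + seg_len]))
            for i in range(0, len(series), seg_len)]

def mbr_lower_bound(query, boxes, seg_len):
    """Lower bound on the Euclidean distance between query and the boxed series.

    Each query point contributes only the gap to the value range of the
    segment it falls in, which can never exceed its true pointwise distance.
    """
    total = 0.0
    for t, q in enumerate(query):
        lo, hi = boxes[t // seg_len]
        gap = max(lo - q, q - hi, 0.0)
        total += gap * gap
    return math.sqrt(total)

def euclidean(a, b):
    """Exact Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def range_search(query, database, seg_len, eps):
    """Return (matches, pruned_count) for sequences within eps of query."""
    matches, pruned = [], 0
    for name, series in database.items():
        # Assumption: in practice the boxes would be precomputed and indexed.
        boxes = mbr_segments(series, seg_len)
        if mbr_lower_bound(query, boxes, seg_len) > eps:
            pruned += 1          # cannot match, so skip the exact distance
            continue
        if euclidean(query, series) <= eps:
            matches.append(name)
    return matches, pruned

database = {"a": [1, 2, 3, 4], "b": [10, 11, 12, 13], "c": [1, 2, 3, 5]}
print(range_search([1, 2, 3, 4], database, seg_len=2, eps=1.5))
# prints (['a', 'c'], 1)
```

Because the bound never exceeds the true distance, pruning on it cannot discard a real match; the fraction of sequences pruned depends on the data and on the segment length.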
Funding: Supported by 973 National R&D Items (G1998030413) and the Centurial Project of CAS
Abstract: This letter presents a real-time C-V (Characteristic-Vector) clustering algorithm for handling the large volume of action data collected dynamically from a web site. The algorithm uses the concept of the C-V to denote characteristics, and it adopts two-valued [0,1] input and a self-defined vigilance parameter to design the clustering architecture. The Vector Degree of Matching (VDM) plays a key role in the algorithm: it determines the magnitude of the typical characteristics. Stability analysis confirms that the classifications have a reliably hierarchical structure as the vigilance parameter shifts from 0.1 to 0.99. This non-linear relation between the vigilance parameter and the upper limit on classifications helps mine representative classifications of net users according to the actual web resources; the administering system can then map them to the web resource space to carry out intelligent configuration effectively and rapidly.
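The abstract does not define the VDM or the update rule, so the sketch below only illustrates the general idea of vigilance-controlled clustering of binary vectors in a single pass. The match measure (the share of an input's active bits that the cluster prototype also has) and the bitwise-AND prototype update are assumptions borrowed from ART1-style clustering, not the letter's algorithm; `match_degree` and `cluster` are hypothetical names.

```python
def match_degree(x, prototype):
    """Share of the active bits in x that are also active in prototype.

    Assumed stand-in for the letter's Vector Degree of Matching.
    """
    active = sum(x)
    if active == 0:
        return 0.0
    return sum(a & b for a, b in zip(x, prototype)) / active

def cluster(vectors, vigilance):
    """Assign each binary vector to a cluster in one pass over the data."""
    prototypes, labels = [], []
    for x in vectors:
        best, best_score = None, 0.0
        for k, p in enumerate(prototypes):
            score = match_degree(x, p)
            # A cluster is eligible only if it passes the vigilance test.
            if score >= vigilance and score > best_score:
                best, best_score = k, score
        if best is None:
            # No cluster is close enough: open a new one seeded with x.
            prototypes.append(list(x))
            labels.append(len(prototypes) - 1)
        else:
            # Keep only the bits shared by the prototype and the new member.
            prototypes[best] = [a & b for a, b in zip(x, prototypes[best])]
            labels.append(best)
    return labels, prototypes

actions = [[1, 1, 0, 0], [1, 1, 1, 0], [0, 0, 1, 1], [0, 0, 0, 1]]
print(cluster(actions, vigilance=0.6)[0])  # prints [0, 0, 1, 1]
print(cluster(actions, vigilance=0.9)[0])  # prints [0, 1, 2, 2]
```

In this toy run, raising the vigilance from 0.6 to 0.9 splits the first cluster, which illustrates the kind of dependence of the number of classes on the vigilance parameter that the abstract describes.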