The problem of scalable classification by clustering in large databases was discussed. A clustering-based classification method first generates clusters using clustering algorithms. To classify a newly arriving data point, it finds the κ nearest clusters of that point as neighbors and assigns the point to the dominant class of these neighbors. Existing algorithms incorporated class information in making clustering decisions and produced pure clusters (each cluster associated with only one class). We presented hybrid cluster-based algorithms, which produce clusters by unsupervised clustering and allow each cluster to be associated with multiple classes. Experimental results show that hybrid cluster-based algorithms outperform pure ones in both classification accuracy and training speed.
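The hybrid scheme in the abstract on clustering-based classification can be sketched in a few lines. This is a minimal illustration, not the authors' algorithm: it assumes plain k-means as the unsupervised clusterer, a deterministic farthest-first seeding, and a vote that pools the class counts of the κ nearest clusters. The function names (`farthest_first`, `kmeans`, `fit`, `classify`) and the toy data are hypothetical.

```python
import math
from collections import Counter


def farthest_first(points, n_clusters):
    """Deterministic seeding (an assumption): start at points[0], then keep
    adding the point farthest from all centers chosen so far."""
    centers = [points[0]]
    while len(centers) < n_clusters:
        centers.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        )
    return centers


def nearest_center(p, centers):
    """Index of the center closest to p."""
    return min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))


def kmeans(points, n_clusters, n_iter=20):
    """Unsupervised k-means: class labels are never consulted here."""
    centers = farthest_first(points, n_clusters)
    for _ in range(n_iter):
        groups = [[] for _ in centers]
        for p in points:
            groups[nearest_center(p, centers)].append(p)
        # New center = mean of the members; an empty cluster keeps its center.
        centers = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centers[i]
            for i, g in enumerate(groups)
        ]
    return centers


def fit(points, labels, n_clusters):
    """Cluster without labels, then record the class mix of every cluster.
    A cluster may hold several classes, which is what makes it 'hybrid'."""
    centers = kmeans(points, n_clusters)
    counts = [Counter() for _ in centers]
    for p, y in zip(points, labels):
        counts[nearest_center(p, centers)][y] += 1
    return centers, counts


def classify(x, centers, counts, k=1):
    """Pool the class counts of the k nearest clusters and return the
    dominant class (None if those clusters are all empty)."""
    order = sorted(range(len(centers)), key=lambda i: math.dist(x, centers[i]))
    votes = Counter()
    for i in order[:k]:
        votes.update(counts[i])
    return votes.most_common(1)[0][0] if votes else None


# Toy data: the first cluster ends up with a mixed class distribution.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
labels = ["a", "a", "a", "b", "b", "b", "b", "b"]
centers, counts = fit(points, labels, n_clusters=2)
print(classify((0.2, 0.3), centers, counts, k=1))
```

Because only cluster centers and per-cluster class counts are kept, classification cost depends on the number of clusters rather than on the number of training points, which is the scalability argument the abstract makes.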
A good voice-band signal classification not only enables the safe application of speech coding techniques and the implementation of a Digital Signal Interpolation (DSI) system, but also facilitates network administration and planning by providing accurate voice-band traffic analysis. A new method is proposed to detect and classify the various voice-band signals present on the General Switched Telephone Network (GSTN). The method combines simple base classifiers through the AdaBoost algorithm. The conventional features for voice-band data classification are combined and optimized by the AdaBoost algorithm and the spectral subtraction method. Experiments show the simplicity, effectiveness, efficiency, and flexibility of the method.
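To make "a combination of simple base classifiers through the AdaBoost algorithm" concrete, here is a minimal sketch of discrete AdaBoost with one-feature threshold rules (decision stumps) as the base classifiers. The stump form, the function names, and the toy one-dimensional data are assumptions for illustration; the abstract does not say which base classifiers or voice-band features the authors used.

```python
import math


def train_stump(X, y, w):
    """Return (weighted_error, feature, threshold, sign) for the best rule
    'predict sign if x[feature] <= threshold, otherwise -sign'."""
    best = None
    for f in range(len(X[0])):
        for thr in sorted({x[f] for x in X}):
            for sign in (1, -1):
                pred = [sign if x[f] <= thr else -sign for x in X]
                err = sum(wi for wi, p, yi in zip(w, pred, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, f, thr, sign)
    return best


def stump_predict(stump, x):
    _, f, thr, sign = stump
    return sign if x[f] <= thr else -sign


def adaboost(X, y, n_rounds=3):
    """Discrete AdaBoost for labels in {-1, +1}."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(n_rounds):
        stump = train_stump(X, y, w)
        # Clamp the error so the logarithm below stays finite.
        err = min(max(stump[0], 1e-10), 1 - 1e-10)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        # Raise the weight of misclassified samples, lower the rest.
        w = [
            wi * math.exp(-alpha * yi * stump_predict(stump, xi))
            for wi, xi, yi in zip(w, X, y)
        ]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble


def adaboost_predict(ensemble, x):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(alpha * stump_predict(s, x) for alpha, s in ensemble)
    return 1 if score >= 0 else -1


# Toy data: no single threshold separates the classes, three stumps do.
X = [[1], [2], [3], [4], [5], [6]]
y = [1, 1, -1, -1, 1, 1]
model = adaboost(X, y, n_rounds=3)
print([adaboost_predict(model, x) for x in X])
```

Each round reweights the training samples so that the next stump concentrates on the points the current ensemble gets wrong; the final decision is a weighted vote.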
In this research article, we analyze a multimedia data mining and classification algorithm based on database optimization techniques. High-performance application requirements of various kinds keep emerging, which makes parallel computer architectures increasingly valued, but the development of the corresponding software systems lags far behind that of the hardware; this is especially obvious in the field of database applications. Multimedia mining differs from low-level multimedia processing: the former focuses on extracting patterns from huge multimedia collections, whereas the latter focuses on understanding or extracting specific features from a single multimedia object. Our research provides a new paradigm for the methodology, which will be meaningful and necessary.
Clustering categorical data, an integral part of data mining, has attracted much attention recently. In this paper, the authors formally define the categorical data clustering problem as an optimization problem from the viewpoint of cluster ensembles, and apply a cluster ensemble approach to clustering categorical data. Experimental results on real datasets show that this approach achieves better clustering accuracy than existing categorical data clustering algorithms.
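One way to read "the viewpoint of cluster ensemble" is that each categorical attribute already partitions the records (records sharing a value fall in the same group), so a consensus over those partitions yields a clustering. The sketch below illustrates that reading with a co-association matrix and a simple threshold-based consensus. Both the reading and the consensus rule are assumptions; the abstract does not state the authors' objective function or algorithm, and the names and data are hypothetical.

```python
def co_association(records):
    """sim[i][j] = fraction of attributes on which records i and j agree,
    i.e. the share of per-attribute partitions that put them together."""
    n, m = len(records), len(records[0])
    return [
        [sum(a == b for a, b in zip(records[i], records[j])) / m for j in range(n)]
        for i in range(n)
    ]


def consensus_clusters(records, threshold=0.5):
    """Link two records when more than `threshold` of the partitions agree,
    then return connected components as cluster labels (union-find)."""
    sim = co_association(records)
    n = len(records)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if sim[i][j] > threshold:
                parent[find(j)] = find(i)

    # Relabel component roots as 0, 1, 2, ... in order of first appearance.
    ids = {}
    return [ids.setdefault(find(i), len(ids)) for i in range(n)]


records = [
    ("red", "small", "round"),
    ("red", "small", "oval"),
    ("blue", "large", "square"),
    ("blue", "large", "round"),
]
print(consensus_clusters(records))
```

The threshold controls how many attributes must agree before two records are merged; a full cluster ensemble method would instead optimize an explicit agreement measure across the partitions.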
Rapid developments in telecommunications, sensor data, financial applications, data stream analysis, and similar fields have increased the rate of data arrival, which makes data mining a vital process. Data analysis consists of different tasks, among which data stream classification faces more challenges than the other commonly used techniques. Because classification is a continuous process, it requires a design that can adapt the classification model to concept change, that is, to changes in the boundaries between classes. Hence, we design a novel fuzzy classifier, known as THRFuzzy, to classify new incoming data streams. Rough set theory, together with a tangential holoentropy function, helps in designing the dynamic classification model. The classification approach uses kernel fuzzy c-means (FCM) clustering to generate the rules and the tangential holoentropy function to update the membership function. The performance of the proposed THRFuzzy method is verified on three datasets (skin segmentation, localization, and breast cancer) using accuracy and time as metrics, and is compared with that of the HRFuzzy and adaptive k-NN classifiers. The experimental results show that the THRFuzzy classifier gives better classification results than the existing classifiers, achieving the highest accuracy in the least time.
Funding: supported by proposal No. OSD/BCUD/392/197, Board of Colleges and University Development, Savitribai Phule Pune University, Pune.
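For readers unfamiliar with fuzzy c-means, which the THRFuzzy abstract uses in kernel form to generate rules, the sketch below shows the standard (non-kernel) FCM membership computation for a single point. It is background only: the kernel distance and the tangential holoentropy update described in the abstract are not reproduced here, and the function name and fuzzifier default `m=2.0` are assumptions.

```python
import math


def fcm_memberships(x, centers, m=2.0):
    """Standard FCM membership of point x in each cluster:
    u_i = 1 / sum_k (d_i / d_k) ** (2 / (m - 1)), with d_i = ||x - c_i||.
    Requires m > 1. Memberships are non-negative and sum to 1."""
    d = [math.dist(x, c) for c in centers]
    if 0.0 in d:
        # x coincides with one or more centers: split membership among them.
        hits = d.count(0.0)
        return [1.0 / hits if di == 0.0 else 0.0 for di in d]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((di / dk) ** power for dk in d) for di in d]


# A point three times closer to the first center than to the second.
print(fcm_memberships((0.0, 0.0), [(1.0, 0.0), (3.0, 0.0)]))
```

Unlike a hard assignment, every point keeps a graded membership in every cluster, which is what lets a fuzzy classifier adjust smoothly when class boundaries drift.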