Rapid developments in telecommunication, sensor data, financial applications, data stream analysis, and related fields have increased the rate of data arrival, making data mining a vital process. Data analysis consists of different tasks, among which data stream classification faces more challenges than other commonly used techniques. Even though classification is a continuous process, it requires a design that can adapt the classification model to concept change or to changes in the boundaries between classes. Hence, we design a novel fuzzy classifier, THRFuzzy, to classify newly arriving data streams. Rough set theory together with a tangential holoentropy function supports the design of the dynamic classification model. The approach uses kernel fuzzy c-means (FCM) clustering to generate the rules and the tangential holoentropy function to update the membership function. The performance of THRFuzzy is evaluated on three datasets (skin segmentation, localization, and breast cancer) using accuracy and time as metrics, and is compared with the HRFuzzy and adaptive k-NN classifiers. The experimental results indicate that the THRFuzzy classifier achieves higher accuracy in less time than the existing classifiers.
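The kernel FCM step named in the abstract above can be illustrated with a minimal sketch of one membership computation. The abstract does not specify the kernel or the update equations, so everything here is an assumption: a Gaussian kernel, the kernel-induced distance 1 - K(x, v), and the hypothetical names `gaussian_kernel`, `kfcm_memberships`, `sigma`, and `m`. The tangential holoentropy update used by THRFuzzy is not reproduced.

```python
import math


def gaussian_kernel(x, v, sigma=1.0):
    """Gaussian (RBF) kernel between a point x and a cluster center v."""
    squared = sum((a - b) ** 2 for a, b in zip(x, v))
    return math.exp(-squared / (2.0 * sigma ** 2))


def kfcm_memberships(point, centers, m=2.0, sigma=1.0, eps=1e-12):
    """Fuzzy memberships of one point in each cluster.

    Assumed formulation (not taken from the paper): with a Gaussian kernel,
    the kernel-induced distance to center v is proportional to 1 - K(x, v),
    and the membership is proportional to that distance raised to the
    power -1 / (m - 1), where m > 1 is the fuzzifier.
    """
    # Clamp to eps so a point lying exactly on a center does not divide by zero.
    distances = [max(1.0 - gaussian_kernel(point, c, sigma), eps) for c in centers]
    weights = [d ** (-1.0 / (m - 1.0)) for d in distances]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    centers = [(0.0, 0.0), (3.0, 3.0)]
    # A point near the first center receives most of its membership there.
    print(kfcm_memberships((0.1, 0.0), centers))
```

In a rule-based fuzzy classifier, memberships of this kind are the quantities that a later update step would adjust as new stream data arrives.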
Handwritten character recognition is one of the most important methods in applications such as searching historical manuscripts, forensic search, bank check reading, mail sorting, and transcription of books and handwritten notes. Common difficulties in character recognition arise from differing writing styles, orientation angles, size variation (in length and height), etc. This study presents a classification model that uses a hybrid classifier for character recognition, combining a holoentropy-enabled decision tree (HDT) and a deep neural network (DNN). In feature extraction, local gradient features, comprising a histogram-oriented Gabor feature and a grid-level feature, are extracted together with grey-level co-occurrence matrix (GLCM) features. The extracted features are then concatenated to encode shape, color, texture, local, and statistical information, and are applied to the hybrid classifier to recognize the characters in the image. In the experimental analysis, a recognition accuracy of 96% is achieved. This suggests that the proposed model provides a more accurate character recognition rate than the character recognition techniques used in the literature.
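To make the GLCM features mentioned in the abstract above more concrete, here is a minimal sketch of a grey-level co-occurrence matrix and two statistics commonly derived from it. The abstract does not say which offsets, grey-level quantization, or statistics the study uses, so the horizontal offset, the choice of contrast and angular second moment, and the names `glcm`, `contrast`, and `angular_second_moment` are assumptions made for illustration.

```python
def glcm(image, levels, dx=1, dy=0):
    """Normalized co-occurrence matrix for pixel pairs offset by (dx, dy).

    `image` is a list of rows of integer grey levels in range(levels).
    """
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    total = sum(sum(row) for row in counts)
    if total == 0:
        raise ValueError("offset leaves no pixel pairs inside the image")
    return [[n / total for n in row] for row in counts]


def contrast(p):
    """Sum of (i - j)^2 * p[i][j]: large when neighbors differ strongly."""
    return sum((i - j) ** 2 * p[i][j] for i in range(len(p)) for j in range(len(p)))


def angular_second_moment(p):
    """Sum of squared entries: large when few grey-level pairs dominate."""
    return sum(value ** 2 for row in p for value in row)


if __name__ == "__main__":
    tiny = [[0, 0, 1],
            [0, 0, 1],
            [2, 2, 2]]
    p = glcm(tiny, levels=3)
    print(contrast(p), angular_second_moment(p))
```

Statistics like these would form one part of the concatenated feature vector; the Gabor and grid-level features are separate and are not sketched here.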
Purpose – Document retrieval has become a hot research topic over the past few years and has received increasing attention for browsing and synthesizing information from different documents. The purpose of this paper is to develop an effective document retrieval method that reduces the time the navigator needs to retrieve the whole document based on the contents, themes, and concepts of documents. Design/methodology/approach – This paper introduces an incremental learning approach for text categorization using a Monarch Butterfly optimization–FireFly optimization based Neural Network (MB–FF based NN). Initially, feature extraction is carried out on the pre-processed data using Term Frequency–Inverse Document Frequency (TF–IDF) and holoentropy to find the keywords of the document. Then, cluster-based indexing is performed using the MB–FF algorithm, and finally, document retrieval is done by a matching process with the modified Bhattacharyya distance measure. In the MB–FF based NN, the weights of the NN are chosen using the MB–FF algorithm. Findings – The effectiveness of the proposed MB–FF based NN is shown by a precision of 0.8769, recall of 0.7957, F-measure of 0.8143, and accuracy of 0.7815. Originality/value – The experimental results show that the proposed MB–FF based NN is useful to companies that have a large workforce across the country.
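The TF–IDF weighting and distance-based matching described in the abstract above can be sketched as follows. This is an illustration under stated assumptions, not the paper's method: the abstract does not define its modified distance, so the standard Bhattacharyya distance is swapped in, the smoothed IDF formula is one common variant, and the names `tfidf_vectors` and `bhattacharyya_distance` are hypothetical. The holoentropy weighting and the MB–FF indexing are not reproduced.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """TF-IDF vectors over a shared vocabulary (documents must be non-empty)."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({term for tokens in tokenized for term in tokens})
    n_docs = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed IDF (an assumption): stays positive for terms found in every document.
    idf = {t: math.log((1 + n_docs) / (1 + doc_freq[t])) + 1.0 for t in vocab}
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append([tf[t] / len(tokens) * idf[t] for t in vocab])
    return vocab, vectors


def bhattacharyya_distance(u, v):
    """Standard Bhattacharyya distance between two non-negative weight vectors.

    Each vector is first normalized to sum to 1 so it can be read as a
    discrete distribution over the vocabulary.
    """
    total_u, total_v = sum(u), sum(v)
    coefficient = sum(
        math.sqrt((a / total_u) * (b / total_v)) for a, b in zip(u, v)
    )
    if coefficient == 0.0:
        return math.inf  # no shared terms at all
    return -math.log(min(coefficient, 1.0))  # clamp guards against rounding


if __name__ == "__main__":
    docs = [
        "fuzzy classifier data stream",
        "fuzzy document retrieval",
        "handwritten character recognition",
    ]
    _, vectors = tfidf_vectors(docs)
    query = vectors[0]
    # Rank the other documents by distance to the query (smaller is closer).
    print(sorted(range(1, len(docs)), key=lambda i: bhattacharyya_distance(query, vectors[i])))
```

A retrieval system along these lines would return the documents with the smallest distance to the query vector.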
Funding: Supported by proposal No. OSD/BCUD/392/197, Board of Colleges and University Development, Savitribai Phule Pune University, Pune.