Funding: This work has been partially sponsored by the Hungarian National Scientific Fund under contract OTKA 129374 and by the Research & Development Operational Program for the project “Modernization and Improvement of Technical Infrastructure for Research and Development of J. Selye University in the Fields of Nanotechnology and Intelligent Space”, ITMS 26210120042, co-funded by the European Regional Development Fund.
Abstract: Emotion detection from text is a challenging problem in text analytics. Opinion mining researchers are focusing on emotion detection applications because these have received considerable attention from the online community, including users and business organizations, as a means of collecting and interpreting public emotions. However, most existing work on emotion detection has used less efficient machine learning classifiers with limited datasets, resulting in degraded performance. To address this issue, this work evaluates the performance of different machine learning classifiers on a benchmark emotion dataset. The experimental results report the performance of the classifiers in terms of evaluation metrics such as precision, recall, and F-measure. Finally, the classifier with the best performance is recommended for emotion classification.
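The evaluation metrics named in this abstract can be illustrated with a minimal sketch. The abstract does not give the dataset or the labels, so the emotion labels and predictions below are invented toy values, and the function name per_class_prf is hypothetical. The sketch computes one-vs-rest precision, recall, and F-measure (F1) for a single class.

```python
def per_class_prf(y_true, y_pred, label):
    """Return (precision, recall, F1) for one class, treated one-vs-rest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)  # true positives
    fp = sum(1 for t, p in pairs if t != label and p == label)  # false positives
    fn = sum(1 for t, p in pairs if t == label and p != label)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy labels, not taken from the benchmark dataset in the abstract.
y_true = ["joy", "anger", "joy", "fear", "joy", "anger"]
y_pred = ["joy", "joy", "joy", "fear", "anger", "anger"]

for emotion in ("joy", "anger", "fear"):
    print(emotion, per_class_prf(y_true, y_pred, emotion))
```

Averaging the per-class values (macro-averaging) is one common way to summarize a multi-class emotion classifier; the abstract does not say which averaging scheme the authors used.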
Funding: This work was supported by a National Research Foundation of Korea grant funded by the Korean Government (MSIT) (NRF-2020R1A2B5B02002478).
Abstract: Cardiovascular disease is among the top five fatal diseases worldwide. Early prediction and detection are therefore crucial, because they allow proper and necessary measures to be taken at earlier stages. Machine learning (ML) techniques are used to assist healthcare providers in diagnosing heart disease more accurately. This study employed three boosting algorithms, namely gradient boosting, XGBoost, and AdaBoost, to predict heart disease. The dataset contained heart disease-related clinical features and was sourced from the publicly available UCI ML repository. Exploratory data analysis was performed to characterize the data samples using descriptive and inferential statistics. Specifically, outliers were identified and replaced using the interquartile range, and missing values were detected and replaced by imputation. Results were recorded before and after the data preprocessing techniques were applied. Of all the algorithms, gradient boosting achieved the highest accuracy, 92.20%, for the proposed model. The proposed model also yielded better precision, recall, and F1-score with gradient boosting. It attained better prediction performance than existing works and could be applied, using transfer learning, to other diseases that share common features.
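The preprocessing steps mentioned in this abstract can be sketched as follows. The abstract does not state which imputation method or which replacement value the authors used, so median imputation, the 1.5 x IQR fences, and replacement of outliers by the median are all assumptions here, as are the function names and the toy cholesterol-like values.

```python
import statistics


def impute_median(values):
    """Replace missing entries (None) with the median of the observed values.
    Median imputation is an assumption; the abstract only says 'imputation'."""
    observed = [v for v in values if v is not None]
    med = statistics.median(observed)
    return [med if v is None else v for v in values]


def replace_outliers_iqr(values, k=1.5):
    """Replace values outside [Q1 - k*IQR, Q3 + k*IQR] with the median.
    The factor k=1.5 and the median replacement are assumptions."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    med = statistics.median(values)
    return [med if (v < low or v > high) else v for v in values]


# Toy feature column with one missing value and one extreme value.
raw = [200, 210, None, 190, 205, 600, 195]
cleaned = replace_outliers_iqr(impute_median(raw))
print(cleaned)
```

Imputing before the outlier step keeps the quartile computation free of missing values; the order used in the original study is not given in the abstract.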
Abstract: Objective: Technological advances have led to drastic changes in daily life, and particularly in healthcare, where traditional diagnosis methods are being replaced by technology-oriented models and paper-based patient healthcare records by digital files. Using the latest technology and data mining techniques, we aimed to develop an automated clinical decision support system (CDSS) to improve patient prognoses and healthcare delivery. Our proposed approach placed a strong emphasis on improvements that meet patient, parent, and physician expectations. We developed a flexible framework to identify hepatitis, dermatological conditions, hepatic disease, and autism in adults and to provide results to patients as recommendations. The novelty of this CDSS lies in its integration of rough set theory (RST) and machine learning (ML) techniques to improve the accuracy and effectiveness of clinical decision-making. Methods: Data were collected from various web-based resources. Standard preprocessing techniques were applied to encode categorical features, perform min-max scaling, and remove null and duplicate entries. The most prevalent feature value in the class and the standard deviation were used to fill missing categorical and continuous feature values, respectively. A rough set approach was applied for feature selection, to remove highly redundant and irrelevant features. Then, various ML techniques, including K nearest neighbors (KNN), linear support vector machine (LSVM), radial basis function support vector machine (RBF SVM), decision tree (DT), random forest (RF), and Naive Bayes (NB), were employed to analyze four publicly available benchmark medical datasets of different types from the UCI repository and Kaggle. The model was implemented in Python, and various validity metrics, including precision, recall, F1-score, and root mean square error (RMSE), were applied to measure its performance. Results: Features were selected using an RST approach and examined by RF analysis. The important features of hepatitis, dermatology conditions, hepatic disease, and autism determined by RST and by RF exhibited 92.85%, 90.90%, 100%, and 80% similarity, respectively. Selected features were stored as electronic health records, and the ML classifiers (KNN, LSVM, RBF SVM, DT, RF, and NB) were applied to classify patients with hepatitis, dermatology conditions, hepatic disease, and autism. In the last phase, the performance of the proposed classifiers was compared with that of existing state-of-the-art methods using various validity measures. RF was found to be the best approach for adult screening of: hepatitis, with accuracy 88.66%, precision 74.46%, recall 75.17%, F1-score 74.81%, and RMSE 0.244; dermatology conditions, with accuracy 97.29%, precision 96.96%, recall 96.96%, F1-score 96.96%, and RMSE 0.173; hepatic disease, with accuracy 91.58%, precision 81.76%, recall 81.82%, F1-score 81.79%, and RMSE 0.193; and autism, with accuracy 100%, precision 100%, recall 100%, F1-score 100%, and RMSE 0.064. Conclusion: The overall performance of our proposed framework suggests that it could assist medical experts in more accurately identifying and diagnosing patients with hepatitis, dermatology conditions, hepatic disease, and autism.
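Two of the quantities named in this abstract, min-max scaling and RMSE, are simple enough to sketch directly. This is not the authors' implementation; the function names and toy values are hypothetical, and mapping a constant column to zeros is a choice made here to avoid division by zero.

```python
import math


def min_max_scale(values):
    """Rescale values linearly to [0, 1]: (v - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant column: map everything to 0.0 (an assumption of this sketch).
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def rmse(y_true, y_pred):
    """Root mean square error between two equal-length numeric sequences."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


print(min_max_scale([10, 20, 30]))
print(rmse([1, 0, 1, 1], [1, 0, 0, 1]))
```

With binary labels encoded as 0 and 1, RMSE equals the square root of the misclassification rate, which is one way to read the small RMSE values reported alongside high accuracy.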
Abstract: Smartphones, particularly Android devices, are used by billions of people around the world. This growth attracts mobile botnet attacks. A mobile botnet is a network of interconnected nodes operated through a command and control (C&C) channel to spread malicious activity. At present, mobile botnets are used to launch distributed denial-of-service (DDoS) attacks, steal sensitive data, gain remote access, and generate spam. Consequently, various approaches have been described in the literature to detect mobile botnet attacks using static or dynamic analysis. In this paper, a novel hybrid model is proposed that combines static and dynamic methods and relies on machine learning to detect Android botnet applications. The results are evaluated using machine learning classifiers. The Random Forest (RF) classifier outperforms the other ML techniques, i.e., Naïve Bayes (NB), Support Vector Machine (SVM), and Simple Logistic (SL). The proposed framework achieved 97.48% accuracy in detecting botnet applications. Finally, some future research directions on botnet attack detection are highlighted for the community.
Abstract: Modern leather industries are focused on producing high-quality leather products to sustain market competitiveness. However, various leather defects are introduced during different stages of the manufacturing process, such as material handling, tanning, and dyeing. Manual inspection of leather surfaces is subjective and inconsistent; hence, machine vision systems have been widely adopted for the automated inspection of leather defects. It is necessary to develop suitable image processing algorithms to localize leather defects such as folding marks, growth marks, grain off, loose grain, and pinholes, because of the ambiguous texture pattern and small size of the affected regions of the leather. This paper presents a deep learning neural network-based approach for automatic localization and classification of leather defects using a machine vision system. In this work, popular convolutional neural networks are trained using leather images of different leather defects, and a class activation mapping technique is used to locate the region of interest for each class of leather defect. Convolutional neural networks such as GoogLeNet, SqueezeNet, and ResNet are found to provide better classification accuracy than the state-of-the-art neural network architectures, and the results are presented.
Abstract: Finger Knuckle Print (FKP) biometrics play a vital role in establishing security for real-time environments. The success of human authentication depends on high speed and accuracy. This paper proposes an integrated approach to personal authentication using texture-based FKP recognition in the multiresolution domain. FKP images are rich in texture patterns, and many texture patterns have recently been proposed for biometric feature extraction. Hence, it is essential to review whether Local Binary Patterns or their variants perform well for FKP recognition. In this paper, feature extraction in the multiresolution domain based on Local Directional Pattern (LDP), Local Derivative Ternary Pattern (LDTP), and Local Texture Description Framework based Modified Local Directional Pattern (LTDF_MLDN) is evaluated with Nearest Neighbor and Extreme Learning Machine (ELM) classifiers for FKP recognition. Experiments were conducted on the PolyU database. The results show that LDTP in the Contourlet domain achieves promising performance. They also show that the soft classifier performs better than the hard classifier.
Funding: This work was supported by the State Grid Technology Project “Research on Interaction between Large-scale Electric Vehicles and Power Grid and Charging Safety Protection Technology” (5418-202071490A-0-0-00) from State Grid Corporation of China.
Abstract: The continuous increase in electric vehicles is driving the large-scale deployment of distributed charging piles. It is crucial to guarantee the normal operation of charging piles, which makes diagnosing charging-pile faults important. Existing fault-diagnosis approaches are based on physical fault data such as mechanical log data and sensor data streams. However, there are other types of fault data that these approaches cannot use for diagnosis. This paper aims to fill this gap and considers 8 types of fault data for diagnosis: physical installation error fault, charging-pile mechanical fault, charging-pile program fault, user personal fault, signal fault (offline), pile compatibility fault, charging platform fault, and other faults. We aim to find out how to combine existing feature-extraction and machine learning techniques to improve diagnosis by conducting experiments on a realistic dataset. Four word embedding models are investigated for feature extraction from fault data: N-gram, GloVe, Word2vec, and BERT. Moreover, we classify the word embedding results using 10 machine learning classifiers: Random Forest (RF), Support Vector Machine, K-Nearest Neighbor, Multilayer Perceptron, Recurrent Neural Network, AdaBoost, Gradient Boosted Decision Tree, Decision Tree, Extra Tree, and VOTE. Compared with the original fault record dataset, a paraphrasing-based data augmentation method improves classification accuracy by up to 10.40%. Our extensive experimental results reveal that the RF classifier combined with the GloVe embedding model achieves the best accuracy with acceptable training time. In addition, we discuss the interpretability of RF and GloVe.
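Of the four feature-extraction models listed in this abstract, the N-gram model is the simplest to illustrate. The sketch below builds word n-gram counts from a short fault record; the whitespace tokenization, the lowercase normalization, the function names, and the sample record are assumptions made for illustration and are not taken from the paper's dataset.

```python
from collections import Counter


def word_ngrams(text, n=2):
    """Return the list of word n-grams of a text (lowercased, whitespace-split)."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_counts(text, n=2):
    """Count how often each word n-gram occurs; usable as a sparse feature vector."""
    return Counter(word_ngrams(text, n))


# Invented fault record, for illustration only.
record = "Charging pile offline after restart"
print(word_ngrams(record))
print(ngram_counts(record, n=1))
```

Counts like these would then be passed to a classifier such as Random Forest; dense embeddings such as GloVe or BERT replace the sparse counts with learned vectors.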