Cross entropy is a measure used in machine learning and deep learning to assess the difference between predicted and actual probability distributions. In this study, we propose cross entropy as a performance evaluation metric for image classifier models and apply it to the CT image classification of lung cancer. A convolutional neural network is employed as the deep neural network (DNN) image classifier, with the residual network (ResNet) 50 chosen as the DNN architecture. The image data used comprise a lung CT image set. Two classification models are built from datasets with varying amounts of data, and lung cancer is categorized into four classes using 10-fold cross-validation. Furthermore, we employ t-distributed stochastic neighbor embedding to visually explain the data distribution after classification. Experimental results demonstrate that cross entropy is a highly useful metric for evaluating the reliability of image classifier models. For a more comprehensive evaluation of model performance, combining it with other evaluation metrics is considered essential.
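To make the metric in the abstract above concrete, the following is a minimal sketch of cross entropy for a four-class classifier. It is not the authors' code: the function names, the probability vectors, and the labels are all hypothetical, chosen only to show that two models with the same accuracy can have very different cross entropy.

```python
import math


def cross_entropy(true_dist, pred_dist, eps=1e-12):
    """Cross entropy H(p, q) = -sum_i p_i * log(q_i), in nats."""
    if len(true_dist) != len(pred_dist):
        raise ValueError("distributions must have the same length")
    # Clamp q_i away from zero so log() is always defined.
    return -sum(p * math.log(max(q, eps)) for p, q in zip(true_dist, pred_dist))


def mean_cross_entropy(labels, predictions, num_classes=4):
    """Average cross entropy over samples given integer class labels."""
    total = 0.0
    for label, probs in zip(labels, predictions):
        one_hot = [1.0 if k == label else 0.0 for k in range(num_classes)]
        total += cross_entropy(one_hot, probs)
    return total / len(labels)


# Hypothetical outputs of two models on the same two samples.
# Both models pick the correct class (same accuracy), but with
# very different confidence.
labels = [0, 1]
confident = [[0.97, 0.01, 0.01, 0.01], [0.01, 0.97, 0.01, 0.01]]
hesitant = [[0.40, 0.30, 0.20, 0.10], [0.30, 0.40, 0.20, 0.10]]

ce_confident = mean_cross_entropy(labels, confident)
ce_hesitant = mean_cross_entropy(labels, hesitant)
print(f"confident model: {ce_confident:.4f} nats")
print(f"hesitant model:  {ce_hesitant:.4f} nats")
```

A lower value means the predicted distribution sits closer to the true labels, which is why cross entropy can separate models that accuracy alone would rank as equal.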
Malware attacks on Windows machines pose significant cybersecurity threats, necessitating effective detection and prevention mechanisms. Supervised machine learning classifiers have emerged as promising tools for malware detection. However, there remains a need for comprehensive studies that compare the performance of different classifiers specifically for Windows malware detection. Addressing this gap can provide valuable insights for enhancing cybersecurity strategies. While numerous studies have explored malware detection using machine learning techniques, there is a lack of systematic comparison of supervised classifiers for Windows malware detection. Understanding the relative effectiveness of these classifiers can inform the selection of optimal detection methods and improve overall security measures. This study aims to bridge the research gap by conducting a comparative analysis of supervised machine learning classifiers for detecting malware on Windows systems. The objectives are: investigating the performance of various classifiers, such as Gaussian Naïve Bayes, K Nearest Neighbors (KNN), Stochastic Gradient Descent Classifier (SGDC), and Decision Tree, in detecting Windows malware; evaluating the accuracy, efficiency, and suitability of each classifier for real-world malware detection scenarios; identifying the strengths and limitations of different classifiers to provide insights for cybersecurity practitioners and researchers; and offering recommendations for selecting the most effective classifier for Windows malware detection based on empirical evidence. The study employs a structured methodology consisting of several phases: exploratory data analysis, data preprocessing, model training, and evaluation. Exploratory data analysis involves understanding the dataset’s characteristics and identifying preprocessing requirements. Data preprocessing includes cleaning, feature encoding, dimensionality reduction, and optimization to prepare the data for training. Model training utilizes various supervised classifiers, and their performance is evaluated using metrics such as accuracy, precision, recall, and F1 score. The study’s outcomes comprise a comparative analysis of supervised machine learning classifiers for Windows malware detection. Results reveal the effectiveness and efficiency of each classifier in detecting different types of malware. Additionally, insights into their strengths and limitations provide practical guidance for enhancing cybersecurity defenses. Overall, this research contributes to advancing malware detection techniques and bolstering the security posture of Windows systems against evolving cyber threats.
Biometric recognition refers to the identification of individuals through their unique behavioral features (e.g., fingerprint, face, and iris). We need distinguishing characteristics to identify people, such as fingerprints, which are world-renowned as the most reliable method to identify people. The recognition of fingerprints has become a standard procedure in forensics, and different techniques are available for this purpose. Most current techniques pay little attention to image enhancement and rely on high-dimensional features to generate classification models. Therefore, we propose an effective fingerprint classification method for classifying a fingerprint image as authentic or altered, since criminals and hackers routinely change their fingerprints to generate fake ones. To improve fingerprint classification accuracy, the proposed method uses the most effective texture features and classifiers. Discriminant Analysis (DCA) and Gaussian Discriminant Analysis (GDA) are employed as classifiers, with Histogram of Oriented Gradient (HOG) and Segmentation-based Feature Texture Analysis (SFTA) feature vectors as inputs. The performance of the classifiers is determined by assessing a range of feature sets, and the most accurate results are reported. The proposed method is tested using the Sokoto Coventry Fingerprint Dataset (SOCOFing). SOCOFing includes 6,000 fingerprint images collected from 600 African people whose fingerprints were taken ten times. Three distinct degrees of obliteration, central rotation, and z-cut have been applied to obtain synthetically altered replicas of the genuine fingerprints. The proposed method achieved a classification accuracy reaching 99%. The experimental results indicate that the proposed method for fingerprint classification is feasible and effective. The experiments also showed that the proposed SFTA-based GDA method outperformed state-of-the-art approaches in feature dimension and classification accuracy.
One of the most common types of threats to the digital world is malicious software. It is of great importance to detect and prevent existing and new malware before it damages information assets. Machine learning approaches are used effectively for this purpose. In this study, we present a model in which supervised and unsupervised learning algorithms are used together. Clustering is used to enhance the prediction performance of the supervised classifiers. The aim of the proposed model is to make predictions in the shortest possible time with high accuracy and F1 score. In the first stage of the model, the data are clustered with the k-means algorithm. In the second stage, the prediction is made with the combination of classifiers that has the best prediction performance for the related cluster. While choosing the best classifiers for the given clusters, triple combinations of ten machine learning algorithms (kernel support vector machine, k-nearest neighbor, naive Bayes, decision tree, random forest, extra gradient boosting, categorical boosting, adaptive boosting, extra trees, and gradient boosting) are used. The selected triple classifier combination is positioned in two tiers. The prediction time of the model is improved by positioning the classifier with the highest prediction time in the second tier. Clustering before classification is shown to improve prediction performance on the Blue Hexagon Open Dataset for Malware Analysis (BODMAS), Elastic Malware Benchmark for Empowering Researchers (EMBER) 2018, and Kaggle malware detection datasets. The model has 99.74% accuracy and 99.77% F1 score for the BODMAS dataset, 99.04% accuracy and 98.63% F1 score for the Kaggle malware detection dataset, and 96.77% accuracy and 96.77% F1 score for the EMBER 2018 dataset. In addition, the tiered positioning of classifiers shortened the average prediction time by 76.13% for the BODMAS dataset and 95.95% for the EMBER 2018 dataset. The proposed method’s prediction performance is better than that of the other studies in the literature that use the BODMAS and EMBER 2018 datasets.
To improve the performance of the multiple classifier system, a new method of feature-decision level fusion is proposed based on knowledge discovery. In the new method, the base classifiers operate on different feature spaces and their types depend on different measures of between-class separability. The uncertainty measures corresponding to each output of each base classifier are induced from the established decision tables (DTs) in the form of mass functions in the Dempster-Shafer theory (DST). Furthermore, an effective fusion framework is built at the feature-decision level on the basis of a generalized rough set model and the DST. The experiment for the classification of hyperspectral remote sensing images shows that the performance of the classification can be improved by the proposed method compared with that of plurality voting (PV).
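The abstract above expresses each base classifier's uncertainty as a mass function in Dempster-Shafer theory. As an illustration only, the sketch below shows the standard Dempster rule of combination for two mass functions. It does not reproduce the paper's rough-set-based fusion framework, and the class names and mass values are hypothetical.

```python
from itertools import product


def combine_dempster(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function is a dict mapping a frozenset of class labels
    (a subset of the frame of discernment) to its mass.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            # Mass assigned to contradictory hypotheses.
            conflict += mass_a * mass_b
    if 1.0 - conflict < 1e-12:
        raise ValueError("sources are in total conflict and cannot be combined")
    # Renormalize so the remaining masses sum to one.
    return {subset: mass / (1.0 - conflict) for subset, mass in combined.items()}


# Hypothetical outputs of two base classifiers over the classes {"water", "crop"}.
WATER = frozenset({"water"})
CROP = frozenset({"crop"})
EITHER = WATER | CROP  # mass left on the whole frame expresses ignorance

m_spectral = {WATER: 0.6, CROP: 0.1, EITHER: 0.3}
m_texture = {WATER: 0.5, CROP: 0.3, EITHER: 0.2}

fused = combine_dempster(m_spectral, m_texture)
for subset, mass in fused.items():
    print(sorted(subset), round(mass, 4))
```

Unlike plurality voting, this rule lets a classifier leave part of its belief uncommitted (the `EITHER` entry), so an unsure classifier pulls the fused result less strongly than a confident one.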
This study investigated the efficiency of learning the Chinese numeral classifiers by L2 Chinese learners by means of an alignment-oriented task. Participants were a total of 96 intermediate learners of L2 Chinese, who were randomly assigned to two experimental groups and one control group, with each group consisting of 32 participants. The continuation task used in this study consisted of a picture-based Chinese text depicting a room with an array of objects, which necessitates the use of classifiers. The two experimental groups were both required to first read the text and then write to describe their own rooms in comparison with the one in the text. One group was instructed to use the classifiers from the text as much as possible in their writing, whereas the other was not required to do so. Participants in the control group were first given the picture to look at in the absence of the text and then asked to describe their own rooms. The results showed that the continuation task significantly enhanced participants’ retention of the Chinese numeral classifiers, suggesting that the alignment-based approach is an effective way to learn difficult linguistic categories such as the Chinese classifiers.
Predicting stock price movements is a challenging task for academicians and practitioners. In particular, forecasting price movements in emerging markets seems to be more elusive because they are usually more volatile, are often accompanied by thin trading volumes, and are more susceptible to manipulation than mature markets. Technical analysis of stocks and commodities has become a science on its own; quantitative methods and techniques have been applied by many practitioners to forecast price movements. Lagging and sometimes leading technical indicators provide rich quantitative tools for traders and investors in their attempt to gain an advantage when making investment or trading decisions. Artificial Neural Networks (ANN) have been used widely in predicting stock prices because of their capability in capturing the non-linearity that often exists in price movements. Recently, Polynomial Classifiers (PC) have been applied to various recognition and classification applications and showed favorable results in terms of recognition rates and computational complexity as compared to ANN. In this paper, we present two prediction models for predicting securities’ prices. The first model was developed using back propagation feed forward neural networks. The second model was developed using polynomial classifiers (PC), as a first-time application of PC to stock price prediction. The inputs to both models were identical, and both models were trained and tested on the same data. The study was conducted on the Dubai Financial Market as an emerging market and applied to two of the market’s leading stocks. In general, both models achieved very good results in terms of mean absolute error percentage. Both models show an average error of around 1.5% when predicting the next-day price, an average error of 2.5% when predicting the second-day price, and an average error of 4% when predicting the third-day price.
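The error figures in the abstract above are given as a mean absolute error percentage. Assuming this corresponds to the usual mean absolute percentage error (the abstract does not give the formula), a minimal sketch looks as follows. The prices are hypothetical and are not taken from the study.

```python
def mean_absolute_percentage_error(actual, predicted):
    """Return the mean of |actual - predicted| / |actual|, as a percentage."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("actual values must be non-zero")
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical closing prices and next-day forecasts.
actual_prices = [10.0, 20.0, 40.0]
forecast_prices = [10.5, 19.0, 40.0]

mape = mean_absolute_percentage_error(actual_prices, forecast_prices)
print(f"MAPE: {mape:.2f}%")
```

Because each error is scaled by the actual price, the measure allows stocks trading at different price levels to be compared on the same footing.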
In various application areas of pattern recognition, combining multiple classifiers is regarded as a new method for achieving a substantial gain in system performance. This paper discusses the properties of classifier diversity and its applications. The paper also presents a novel method for combining multiple classifiers based on diversity. Fusion strategies are discussed to provide a basis for combining classifiers. These combination strategies are experimentally tested on an online handwritten Chinese character recognition system, and their effectiveness is assessed.
Wind energy is considered an alternative renewable energy source due to its low operating cost when compared with other sources. The wind turbine is an essential system used to convert kinetic energy into electrical energy. Wind turbine blades, in particular, require a competitive condition inspection approach, as they are a significant component of the wind turbine system, accounting for around 20-25 percent of the total turbine cost. The main objective of this study is to differentiate between various blade faults that affect the wind turbine blade under operating conditions, using a machine learning approach based on histogram features. In this study, blade bend, hub-blade loose connection, blade erosion, pitch angle twist, and blade cracks were simulated on the blade. This problem is formulated as a machine learning problem consisting of three phases, namely feature extraction, feature selection, and feature classification. Histogram features are extracted from vibration signals, and feature selection is carried out using the J48 decision tree algorithm. Feature classification is performed using 15 tree classifiers. The results of the machine learning classifiers are compared with respect to their accuracy percentage, and a better model is suggested for real-time monitoring of a wind turbine blade.
On the semantic web, data interoperability and ontology heterogeneity are becoming ever more important issues. To resolve these problems, multiple classification methods can be used to learn the matching between ontologies. The paper uses a general statistical classification method to discover category features in data instances and uses the first-order learning algorithm FOIL to exploit the semantic relations among data instances. When using a multistrategy learning approach, a central problem is the evaluation of multistrategy classifiers. The goal and the conditions of using multistrategy classifiers within ontology matching are different from those for general text classification. This paper describes a combination rule for multiple classifiers called the Best Outstanding Champion, which is suitable for heterogeneous ontology mapping. Based on the prediction results of the individual methods, the rule can accumulate the correct matches of each single classifier. The experiments show that the approach achieves high accuracy on real-world domains.
The rise of fake news on social media has had a detrimental effect on society. Numerous performance evaluations of classifiers that can detect fake news have previously been undertaken by researchers in this area. In this study, we first assessed the performance of 14 different classifiers. Secondly, we looked at how soft voting and hard voting classifiers performed when combining distinct individual classifiers. Finally, heuristics are used to create 9 models of stacking classifiers. The F1 score, precision, recall, and accuracy have all been used to assess performance. Models 6 and 7 achieved the best accuracy of 96.13 while having a larger computational complexity. For benchmarking purposes, other individual classifiers are also tested.
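The difference between the hard and soft voting mentioned in the abstract above can be shown with a small sketch. This is an illustration under assumed inputs, not the study's setup: the three probability vectors and the class names are hypothetical.

```python
from collections import Counter


def hard_vote(predicted_labels):
    """Majority vote over predicted labels; ties go to the label seen first."""
    return Counter(predicted_labels).most_common(1)[0][0]


def soft_vote(probabilities, classes):
    """Average the class probabilities of all classifiers and take the argmax."""
    count = len(probabilities)
    averaged = [sum(p[k] for p in probabilities) / count for k in range(len(classes))]
    best = max(range(len(classes)), key=averaged.__getitem__)
    return classes[best]


classes = ["real", "fake"]
# Hypothetical class probabilities from three classifiers for one article.
# Two classifiers lean weakly towards "real"; one is very sure it is "fake".
probabilities = [[0.55, 0.45], [0.60, 0.40], [0.05, 0.95]]

# Each classifier's hard label is its most probable class.
labels = [classes[max(range(len(classes)), key=p.__getitem__)] for p in probabilities]

hard_result = hard_vote(labels)
soft_result = soft_vote(probabilities, classes)
print("hard voting:", hard_result)
print("soft voting:", soft_result)
```

Hard voting counts only the winning label of each classifier, while soft voting also weighs how confident each classifier is, so the two schemes can disagree on the same inputs.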
Fuel injectors are considered an important component of combustion engines. Operational weakness can lead to complete machine malfunction, decreasing reliability and leading to loss of production. To overcome these circumstances, various condition monitoring techniques can be applied. The application of acoustic signals is common in the field of fault diagnosis of rotating machinery. Advanced signal processing is utilized for the construction of features that are specialized in detecting fuel injector faults. A performance comparison between novelty detection algorithms in the form of one-class classifiers is presented. The one-class classifiers that were tested included One-Class Support Vector Machine (OCSVM) and One-Class Self Organizing Map (OCSOM). The acoustic signals of fuel injectors in different operational conditions were processed for feature extraction. Features from all the signals were used as input to the one-class classifiers. The one-class classifiers were trained only with healthy fuel injector conditions and compared with new experimental data which belonged to different operational conditions that were not included in the training set so as to contribute to generalization. The results present the effectiveness of one-class classifiers for detecting faults in fuel injectors.
In this paper, polynomial fuzzy neural network classifiers (PFNNCs) are proposed by means of density fuzzy c-means and L2-norm regularization. The overall design of PFNNCs is realized by means of fuzzy rules that come in the form of three parts, namely the premise part, the consequence part, and the aggregation part. The premise part is developed by density fuzzy c-means, which helps determine the apex parameters of the membership functions, while the consequence part is realized by means of two types of polynomials, linear and quadratic. L2-norm regularization, which can alleviate the overfitting problem, is exploited to estimate the parameters of the polynomials, which construct the aggregation part. Experimental results on several data sets demonstrate that the proposed classifiers show higher classification accuracy in comparison with some other classifiers reported in the literature.
Background: Human-machine dialog generation is an essential topic of research in the field of natural language processing. Generating high-quality, diverse, fluent, and emotional conversation is a challenging task. Based on continuing advancements in artificial intelligence and deep learning, new methods have come to the forefront in recent times. In particular, the end-to-end neural network model provides an extensible conversation generation framework that has the potential to enable machines to understand semantics and automatically generate responses. However, neural network models come with their own set of questions and challenges. The basic conversational model framework tends to produce universal, meaningless, and relatively "safe" answers. Methods: Based on generative adversarial networks (GANs), a new emotional dialog generation framework called EMC-GAN is proposed in this study to address the task of emotional dialog generation. The proposed model comprises one generative model and three discriminative models. The generator is based on the basic sequence-to-sequence (Seq2Seq) dialog generation model, and the aggregate discriminative model for the overall framework consists of a basic discriminative model, an emotion discriminative model, and a fluency discriminative model. The basic discriminative model distinguishes generated fake sentences from real sentences in the training corpus. The emotion discriminative model evaluates whether the emotion conveyed via the generated dialog agrees with a pre-specified emotion, and directs the generative model to generate dialogs that correspond to the category of the pre-specified emotion. Finally, the fluency discriminative model assigns a score to the fluency of the generated dialog and guides the generator to produce more fluent sentences. Results: Based on the experimental results, this study confirms the superiority of the proposed model over similar existing models with respect to emotional accuracy, fluency, and consistency. Conclusions: The proposed EMC-GAN model is capable of generating consistent, smooth, and fluent dialog that conveys pre-specified emotions, and exhibits better performance with respect to emotional accuracy, consistency, and fluency compared to its competitors.
In this paper, a visual feature space based on the long horizontals, the long verticals, and the radicals is given. An adaptive combination of classifiers, whose coefficients vary with the input pattern, is also proposed. Experiments show that the approach is promising for character recognition in video sequences.
Human gait recognition has emerged in recent years as a supportive biometric technique that identifies people through the way they walk. Gait recognition in model-free approaches faces challenges such as speed variation, clothing variation, illumination changes, and view angle variations, which result in a reduced recognition rate. The proposed algorithm selects an exhaustive set of angles from head to toe of a person, along with the height and width of the same subject. The experiments were conducted using silhouettes with view angle variation and clothing variation. The recognition rate is improved to 91% using a support vector machine classifier. The proposed method is evaluated using CASIA Gait Dataset B (Institute of Automation, Chinese Academy of Sciences, China). Experimental results demonstrate that the proposed technique shows promising results using state-of-the-art classifiers.
The turbo air classifier is widely used powder classification equipment in a variety of fields. The flow field characteristics of the turbo air classifier are an important basis for improving its structural design. The flow field characteristics of the rotor cage in turbo air classifiers were investigated under different operating conditions by laser Doppler velocimeter (LDV), and a measure for diminishing the axial velocity is proposed. The investigation results show that the tangential velocity of the air flow inside the rotor cage is different from the rotary speed of the rotor cage at the same measurement point due to the influences of both the negative pressure at the exit and the rotation of the rotor cage. The tangential velocity of the air flow decreases as the radius decreases when the rotor cage's rotary speed is low. In contrast, the tangential velocity of the air flow increases as the radius decreases when the rotor cage's rotary speed is high. Meanwhile, the vortex inside the rotor cage is found to occur near the pressure side of the blade when the rotor cage's rotary speed is less than the tangential velocity of the air flow. Conversely, the vortex is found to occur near the blade suction side once the rotor cage's rotary speed is higher than the tangential velocity of the air flow. Inside the rotor cage, the axial velocity cannot be disregarded and is largely determined by the distance between the measurement point and the exit.
The classification performance of model coal mill classifiers with different bottom incoming flow inlets was experimentally and numerically studied. The flow field adjacent to two neighboring impeller blades was measured using the particle image velocimetry technique. The results showed that the flow field adjacent to two neighboring blades with the swirling inlet was significantly different from that with the non-swirling inlet. With the swirling inlet, there was a vortex located between two neighboring blades, while with the non-swirling inlet, the vortex was attached to the blade tip. The vorticity of the vortex with the non-swirling inlet was much lower than that with the swirling inlet. The classifier with the non-swirling inlet demonstrated a larger cut size than that with the swirling inlet when the impeller was stationary (~0 r·min⁻¹). As the impeller rotational speed increased, the cut size of the cases with non-swirling and swirling inlets both decreased, and the one with the non-swirling inlet decreased more dramatically. The values of the cut size of the two classifiers were close to each other at a high impeller rotational speed (≥120 r·min⁻¹). The overall separation efficiency of the classifier with the non-swirling inlet was lower than that with the swirling inlet, and monotonically increased as the impeller rotational speed increased. With the swirling inlet, the overall separation efficiency first increased with the impeller rotational speed and then decreased when the rotational speed was above 120 r·min⁻¹, and the variation trend of the separation efficiency was more moderate. As the initial particle concentration increased, the cut sizes of both swirling and non-swirling inlet cases decreased first and then barely changed. At a low initial particle concentration (< 0.04 kg·m⁻³), the classifier with the swirling inlet had a larger cut size than that with the non-swirling inlet.
The rapid growth of multimedia content necessitates powerful technologies to filter, classify, index, and retrieve video documents more efficiently. However, the essential bottleneck of image and video analysis is the semantic gap: low-level features extracted by computers often fail to coincide with the high-level concepts interpreted by humans. In this paper, we present a generic scheme for the detection of video semantic concepts based on machine learning over multiple visual features. Various global and local low-level visual features are systematically investigated, and a kernel-based learning method equips the concept detection system to explore the potential of these features. We then combine the different features and sub-systems through both classifier-level and kernel-level fusion, which contributes to a more robust system. Our proposed system is tested on the TRECVID dataset. The resulting Mean Average Precision (MAP) score is much better than the benchmark performance, which shows that our concept detection engine provides a generic model and performs well on both object and scene type concepts.
This study modeled the effects of structural and dimensional manipulations on the hydrodynamic behavior of a bench vertical current classifier. A computational fluid dynamics (CFD) approach was used as the modeling method, and turbulent intensity and fluid velocity were applied as system responses to predict the overflow cut size variations. These investigations showed that the cut size would decrease with increasing diameter and height of the separation column and cone section depth, due to the decrease of turbulent intensity and fluid velocity. As the size of the discharge gate increases, the overflow cut size would decrease because the fluid streams freely out of the column. The overflow cut size was significantly larger in the downward-fed classifier than in the one fed by an upward fluid stream. In addition, reshaping the angular overflow outlet weir into a curved form prevented the stream from returning inside and, consequently, an unselective decrease in cut size.
Funding: This research work is supported by Princess Nourah bint Abdulrahman University Researchers Supporting Project Number (PNURSP2024R411), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.
Abstract: Malware attacks on Windows machines pose significant cybersecurity threats, necessitating effective detection and prevention mechanisms. Supervised machine learning classifiers have emerged as promising tools for malware detection. However, there remains a need for comprehensive studies that compare the performance of different classifiers specifically for Windows malware detection. Addressing this gap can provide valuable insights for enhancing cybersecurity strategies. While numerous studies have explored malware detection using machine learning techniques, there is a lack of systematic comparison of supervised classifiers for Windows malware detection. Understanding the relative effectiveness of these classifiers can inform the selection of optimal detection methods and improve overall security measures. This study aims to bridge the research gap by conducting a comparative analysis of supervised machine learning classifiers for detecting malware on Windows systems. The objectives are: investigating the performance of various classifiers, such as Gaussian Naïve Bayes, K Nearest Neighbors (KNN), Stochastic Gradient Descent Classifier (SGDC), and Decision Tree, in detecting Windows malware; evaluating the accuracy, efficiency, and suitability of each classifier for real-world malware detection scenarios; identifying the strengths and limitations of different classifiers to provide insights for cybersecurity practitioners and researchers; and offering recommendations for selecting the most effective classifier for Windows malware detection based on empirical evidence. The study employs a structured methodology consisting of several phases: exploratory data analysis, data preprocessing, model training, and evaluation. Exploratory data analysis involves understanding the dataset's characteristics and identifying preprocessing requirements. Data preprocessing includes cleaning, feature encoding, dimensionality reduction, and optimization to prepare the data for training. Model training utilizes various supervised classifiers, and their performance is evaluated using metrics such as accuracy, precision, recall, and F1 score. The study's outcomes comprise a comparative analysis of supervised machine learning classifiers for Windows malware detection. Results reveal the effectiveness and efficiency of each classifier in detecting different types of malware. Additionally, insights into their strengths and limitations provide practical guidance for enhancing cybersecurity defenses. Overall, this research contributes to advancing malware detection techniques and bolstering the security posture of Windows systems against evolving cyber threats.
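The four evaluation metrics named in the abstract can be computed from a classifier's predictions as in the sketch below. It assumes a binary malware/benign labelling with 1 marking malware; the function name `binary_metrics` and the label vectors are hypothetical and are not taken from the study.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for a binary task, from raw label lists."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Hypothetical ground truth and predictions (1 = malware, 0 = benign).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Running the same function on the predictions of each candidate classifier gives a like-for-like comparison table of the kind the study describes.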
Abstract: Biometric recognition refers to the identification of individuals through their unique behavioral features (e.g., fingerprint, face, and iris). Identifying people requires distinguishing characteristics such as fingerprints, which are widely regarded as the most reliable means of identification. The recognition of fingerprints has become a standard procedure in forensics, and different techniques are available for this purpose. Most current techniques pay little attention to image enhancement and rely on high-dimensional features to generate classification models. Therefore, we propose an effective fingerprint classification method for classifying a fingerprint image as authentic or altered, since criminals and hackers routinely change their fingerprints to generate fake ones. To improve fingerprint classification accuracy, the proposed method uses the most effective texture features and classifiers. Discriminant Analysis (DCA) and Gaussian Discriminant Analysis (GDA) are employed as classifiers, with Histogram of Oriented Gradient (HOG) and Segmentation-based Feature Texture Analysis (SFTA) feature vectors as inputs. The performance of the classifiers is determined by assessing a range of feature sets, and the most accurate results are retained. The proposed method is tested using the Sokoto Coventry Fingerprint Dataset (SOCOFing). The SOCOFing project includes 6,000 fingerprint images collected from 600 African people whose fingerprints were taken ten times. Three distinct degrees of obliteration, central rotation, and z-cut have been applied to obtain synthetically altered replicas of the genuine fingerprints. The proposed method achieved a classification accuracy of up to 99%. The experimental results indicate that the proposed method for fingerprint classification is feasible and effective. The experiments also showed that the proposed SFTA-based GDA method outperformed state-of-the-art approaches in feature dimension and classification accuracy.
Abstract: One of the most common types of threats to the digital world is malicious software. It is of great importance to detect and prevent existing and new malware before it damages information assets. Machine learning approaches are used effectively for this purpose. In this study, we present a model in which supervised and unsupervised learning algorithms are used together. Clustering is used to enhance the prediction performance of the supervised classifiers. The aim of the proposed model is to make predictions in the shortest possible time with high accuracy and F1 score. In the first stage of the model, the data are clustered with the k-means algorithm. In the second stage, the prediction is made with the combination of classifiers that has the best prediction performance for the related cluster. While choosing the best classifiers for the given clusters, triple combinations of ten machine learning algorithms (kernel support vector machine, k-nearest neighbor, naive Bayes, decision tree, random forest, extra gradient boosting, categorical boosting, adaptive boosting, extra trees, and gradient boosting) are used. The selected triple classifier combination is positioned in two tiers. The prediction time of the model is improved by positioning the classifier with the highest prediction time in the second tier. Clustering before classification is shown to improve prediction performance on the Blue Hexagon Open Dataset for Malware Analysis (BODMAS), Elastic Malware Benchmark for Empowering Researchers (EMBER) 2018, and Kaggle malware detection datasets. The model has 99.74% accuracy and 99.77% F1 score for the BODMAS dataset, 99.04% accuracy and 98.63% F1 score for the Kaggle malware detection dataset, and 96.77% accuracy and 96.77% F1 score for the EMBER 2018 dataset. In addition, the tiered positioning of classifiers shortened the average prediction time by 76.13% for the BODMAS dataset and 95.95% for the EMBER 2018 dataset. The proposed method's prediction performance is better than that of the other studies in the literature that use the BODMAS and EMBER 2018 datasets.
Abstract: To improve the performance of the multiple classifier system, a new method of feature-decision level fusion is proposed based on knowledge discovery. In the new method, the base classifiers operate on different feature spaces and their types depend on different measures of between-class separability. The uncertainty measures corresponding to each output of each base classifier are induced from the established decision tables (DTs) in the form of mass functions in the Dempster-Shafer theory (DST). Furthermore, an effective fusion framework is built at the feature-decision level on the basis of a generalized rough set model and the DST. An experiment on the classification of hyperspectral remote sensing images shows that the proposed method improves classification performance compared with plurality voting (PV).
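As background on how mass functions from two base classifiers can be fused in DST, the sketch below implements Dempster's rule of combination. This is an assumption for illustration only: the abstract does not state which combination rule the proposed framework uses, and the function name `dempster_combine`, the class names, and the mass values are all hypothetical.

```python
def dempster_combine(m1, m2):
    """Combine two mass functions (dict: frozenset of classes -> mass) with Dempster's rule."""
    combined = {}
    conflict = 0.0
    for set_a, mass_a in m1.items():
        for set_b, mass_b in m2.items():
            intersection = set_a & set_b
            if intersection:
                combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
            else:
                # Mass assigned to contradictory hypotheses is treated as conflict.
                conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("Sources are in total conflict; Dempster's rule is undefined.")
    # Renormalise so that the remaining masses sum to one.
    return {focal: mass / (1.0 - conflict) for focal, mass in combined.items()}

# Hypothetical outputs of two base classifiers over the classes {"A", "B"}.
A, B, AB = frozenset({"A"}), frozenset({"B"}), frozenset({"A", "B"})
m1 = {A: 0.6, AB: 0.4}   # classifier 1 leans towards A, with some ignorance
m2 = {B: 0.5, AB: 0.5}   # classifier 2 leans towards B, with some ignorance
fused = dempster_combine(m1, m2)
print(fused)
```

Mass left on the full set `{"A", "B"}` represents ignorance rather than a vote, which is what distinguishes this style of fusion from plurality voting.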
Abstract: This study investigated how efficiently L2 Chinese learners learn Chinese numeral classifiers by means of an alignment-oriented task. Participants were a total of 96 intermediate learners of L2 Chinese, who were randomly assigned to two experimental groups and one control group, with each group consisting of 32 participants. The continuation task used in this study consisted of a picture-based Chinese text depicting a room with an array of objects, which necessitates the use of classifiers. The two experimental groups were both required to first read the text and then write a description of their own rooms in comparison with the one in the text. One group was instructed to use the classifiers from the text as much as possible in their writing, whereas the other was not required to do so. Participants in the control group were first given the picture to look at in the absence of the text and then asked to describe their own rooms. The results showed that the continuation task significantly enhanced participants' retention of the Chinese numeral classifiers, suggesting that the alignment-based approach is an effective way to learn difficult linguistic categories such as the Chinese classifiers.
Abstract: Predicting stock price movements is a challenging task for academicians and practitioners. In particular, forecasting price movements in emerging markets seems to be more elusive because these markets are usually more volatile, are often accompanied by thin trading volumes, and are more susceptible to manipulation than mature markets. Technical analysis of stocks and commodities has become a science of its own; quantitative methods and techniques have been applied by many practitioners to forecast price movements. Lagging and sometimes leading technical indicators provide rich quantitative tools for traders and investors in their attempt to gain an advantage when making investment or trading decisions. Artificial Neural Networks (ANN) have been used widely in predicting stock prices because of their capability to capture the non-linearity that often exists in price movements. Recently, Polynomial Classifiers (PC) have been applied to various recognition and classification applications and showed favorable results in terms of recognition rates and computational complexity as compared to ANN. In this paper, we present two models for predicting securities' prices. The first model was developed using back-propagation feed-forward neural networks. The second model was developed using polynomial classifiers (PC), as a first application of PC to stock price prediction. The inputs to both models were identical, and both models were trained and tested on the same data. The study was conducted on the Dubai Financial Market as an emerging market and applied to two of the market's leading stocks. In general, both models achieved very good results in terms of mean absolute error percentage. Both models show an average error of around 1.5% when predicting the next-day price, 2.5% when predicting the second-day price, and 4% when predicting the third-day price.
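The error figures above are reported as a mean absolute error percentage. A minimal sketch of that calculation follows, assuming the common definition (the mean of |actual - predicted| / |actual|, times 100); the abstract does not spell out the exact formula, and the function name and prices here are hypothetical.

```python
def mean_absolute_percentage_error(actual, predicted):
    """Mean of |actual - predicted| / |actual|, expressed as a percentage."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("Inputs must be non-empty and of equal length.")
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)

# Hypothetical closing prices and next-day predictions (illustrative values only).
actual_prices = [10.0, 20.0, 40.0]
predicted_prices = [10.1, 19.6, 40.8]
mape = mean_absolute_percentage_error(actual_prices, predicted_prices)
print(f"{mape:.2f}%")
```

Because each error is scaled by the actual price, the measure allows stocks trading at different price levels to be compared on the same footing.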
Abstract: In various application areas of pattern recognition, combining multiple classifiers is regarded as a new method for achieving a substantial gain in system performance. This paper discusses the properties of classifier diversity and its applications. The paper also presents a novel method for combining multiple classifiers based on diversity. Fusion strategies are discussed to provide a basis for combining classifiers. These combination strategies are experimentally tested on an online handwritten Chinese character recognition system and their effectiveness is assessed.
Abstract: Wind energy is considered an alternative renewable energy source due to its low operating cost when compared with other sources. The wind turbine is an essential system used to change kinetic energy into electrical energy. Wind turbine blades, in particular, require a competitive condition inspection approach, as the blade is a significant component of the wind turbine system that accounts for around 20-25 percent of the total turbine cost. The main objective of this study is to differentiate between various blade faults that affect the wind turbine blade under operating conditions using a machine learning approach based on histogram features. In this study, blade bend, hub-blade loose connection, blade erosion, pitch angle twist, and blade cracks were simulated on the blade. The problem is formulated as a machine learning problem consisting of three phases, namely feature extraction, feature selection, and feature classification. Histogram features are extracted from vibration signals, and feature selection is carried out using the J48 decision tree algorithm. Feature classification is performed using 15 tree classifiers. The results of the machine learning classifiers are compared with respect to their accuracy percentage, and a better model is suggested for real-time monitoring of a wind turbine blade.
Abstract: On the semantic web, data interoperability and ontology heterogeneity are becoming ever more important issues. To resolve these problems, multiple classification methods can be used to learn the matching between ontologies. The paper uses a general statistical classification method to discover category features in data instances and uses the first-order learning algorithm FOIL to exploit the semantic relations among data instances. When using a multistrategy learning approach, a central problem is the evaluation of multistrategy classifiers. The goal and the conditions of using multistrategy classifiers within ontology matching are different from those for general text classification. This paper describes a combination rule for multiple classifiers called the Best Outstanding Champion, which is suitable for heterogeneous ontology mapping. Based on the prediction results of the individual methods, the rule accumulates the correct matches of each individual classifier. The experiments show that the approach achieves high accuracy on real-world domains.
Abstract: The rise of fake news on social media has had a detrimental effect on society. Researchers in this area have previously undertaken numerous performance evaluations of classifiers that can detect fake news. In this study, we first assessed the performance of 14 different classifiers. Secondly, we looked at how soft voting and hard voting classifiers performed when combining distinct individual classifiers. Finally, heuristics were used to create 9 models of stacking classifiers. The F1 score, precision, recall, and accuracy were all used to assess performance. Models 6 and 7 achieved the best accuracy of 96.13, at the cost of a larger computational complexity. For benchmarking purposes, other individual classifiers were also tested.
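The difference between hard and soft voting can be sketched as follows. This is an illustration under assumed inputs rather than the study's implementation: the function names `hard_vote` and `soft_vote`, the class encoding (0 = real, 1 = fake), and the probabilities are hypothetical.

```python
from collections import Counter

def hard_vote(predicted_labels):
    """Majority vote over the class labels predicted by each classifier."""
    return Counter(predicted_labels).most_common(1)[0][0]

def soft_vote(probabilities):
    """Pick the class with the highest probability averaged over all classifiers."""
    n_classes = len(probabilities[0])
    averaged = [sum(p[c] for p in probabilities) / len(probabilities)
                for c in range(n_classes)]
    return max(range(n_classes), key=lambda c: averaged[c])

# Hypothetical class probabilities [P(real), P(fake)] from three classifiers for one article.
probabilities = [[0.45, 0.55],
                 [0.45, 0.55],
                 [0.95, 0.05]]
# Each classifier's hard label is its most probable class.
labels = [max(range(2), key=lambda c: p[c]) for p in probabilities]

hard_result = hard_vote(labels)
soft_result = soft_vote(probabilities)
print(hard_result, soft_result)
```

In this made-up case the two schemes disagree: two weakly confident classifiers outvote the third under hard voting, while soft voting lets the third classifier's high confidence decide.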
Abstract: Fuel injectors are considered an important component of combustion engines. Operational weakness can lead to complete machine malfunction, decreasing reliability and causing loss of production. To overcome these circumstances, various condition monitoring techniques can be applied. The application of acoustic signals is common in the field of fault diagnosis of rotating machinery. Advanced signal processing is utilized for the construction of features that are specialized in detecting fuel injector faults. A performance comparison between novelty detection algorithms in the form of one-class classifiers is presented. The one-class classifiers that were tested included One-Class Support Vector Machine (OCSVM) and One-Class Self Organizing Map (OCSOM). The acoustic signals of fuel injectors in different operational conditions were processed for feature extraction. Features from all the signals were used as input to the one-class classifiers. The one-class classifiers were trained only with healthy fuel injector conditions and compared with new experimental data belonging to operational conditions that were not included in the training set, so as to contribute to generalization. The results demonstrate the effectiveness of one-class classifiers for detecting faults in fuel injectors.
Funding: This work was supported in part by the National Natural Science Foundation of China under Grant 61673295, by the Natural Science Foundation of Tianjin under Grant 18JCYBJC85200, and by the National College Students' innovation and entrepreneurship project under Grant 201710060041.
Abstract: In this paper, polynomial fuzzy neural network classifiers (PFNNCs) are proposed by means of density fuzzy c-means and L2-norm regularization. The overall design of PFNNCs is realized by means of fuzzy rules that consist of three parts, namely the premise part, the consequence part, and the aggregation part. The premise part is developed with density fuzzy c-means, which helps determine the apex parameters of the membership functions, while the consequence part is realized by means of two types of polynomials, linear and quadratic. L2-norm regularization, which can alleviate the overfitting problem, is exploited to estimate the parameters of the polynomials, which construct the aggregation part. Experimental results on several data sets demonstrate that the proposed classifiers show higher classification accuracy than some other classifiers reported in the literature.
Abstract: Background: Human-machine dialog generation is an essential topic of research in the field of natural language processing. Generating high-quality, diverse, fluent, and emotional conversation is a challenging task. Based on continuing advancements in artificial intelligence and deep learning, new methods have come to the forefront in recent times. In particular, the end-to-end neural network model provides an extensible conversation generation framework that has the potential to enable machines to understand semantics and automatically generate responses. However, neural network models come with their own set of questions and challenges. The basic conversational model framework tends to produce universal, meaningless, and relatively "safe" answers. Methods: Based on generative adversarial networks (GANs), a new emotional dialog generation framework called EMC-GAN is proposed in this study to address the task of emotional dialog generation. The proposed model comprises a generative model and three discriminative models. The generator is based on the basic sequence-to-sequence (Seq2Seq) dialog generation model, and the aggregate discriminative model for the overall framework consists of a basic discriminative model, an emotion discriminative model, and a fluency discriminative model. The basic discriminative model distinguishes generated fake sentences from real sentences in the training corpus. The emotion discriminative model evaluates whether the emotion conveyed via the generated dialog agrees with a pre-specified emotion, and directs the generative model to generate dialogs that correspond to the category of the pre-specified emotion. Finally, the fluency discriminative model assigns a score to the fluency of the generated dialog and guides the generator to produce more fluent sentences. Results: Based on the experimental results, this study confirms the superiority of the proposed model over similar existing models with respect to emotional accuracy, fluency, and consistency. Conclusions: The proposed EMC-GAN model is capable of generating consistent, smooth, and fluent dialog that conveys pre-specified emotions, and exhibits better performance with respect to emotional accuracy, consistency, and fluency compared to its competitors.
Abstract: In this paper, a visual feature space based on the long horizontals, the long verticals, and the radicals is given. An adaptive combination of classifiers, whose coefficients vary with the input pattern, is also proposed. Experiments show that the approach is promising for character recognition in video sequences.
Abstract: Human gait recognition has emerged in recent years as a supportive biometric technique that identifies people through the way they walk. Gait recognition in model-free approaches faces challenges such as speed variation, clothing variation, illumination changes, and view angle variations, which result in a reduced recognition rate. The proposed algorithm selects the exhaustive angles from head to toe of a person, along with the height and width of the same subject. The experiments were conducted using silhouettes with view angle variation and clothing variation. The recognition rate is improved to 91% using a support vector machine classifier. The proposed method is evaluated using the CASIA Gait Dataset B (Institute of Automation, Chinese Academy of Sciences, China). Experimental results demonstrate that the proposed technique shows promising results using state-of-the-art classifiers.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 50474035).
Abstract: The turbo air classifier is powder classification equipment widely used in a variety of fields. The flow field characteristics of the turbo air classifier are an important basis for improving its structural design. The flow field characteristics of the rotor cage in turbo air classifiers were investigated under different operating conditions by laser Doppler velocimeter (LDV), and a measure for diminishing the axial velocity is proposed. The investigation results show that the tangential velocity of the air flow inside the rotor cage differs from the rotary speed of the rotor cage at the same measurement point, due to the influences of both the negative pressure at the exit and the rotation of the rotor cage. The tangential velocity of the air flow decreases as the radius decreases when the rotor cage's rotary speed is low. In contrast, the tangential velocity of the air flow increases as the radius decreases when the rotor cage's rotary speed is high. Meanwhile, the vortex inside the rotor cage is found to occur near the pressure side of the blade when the rotor cage's rotary speed is less than the tangential velocity of the air flow. Conversely, the vortex is found to occur near the blade suction side once the rotor cage's rotary speed is higher than the tangential velocity of the air flow. Inside the rotor cage, the axial velocity cannot be disregarded and is largely determined by the distance between the measurement point and the exit.
Funding: Financial support from the National Key Technologies R&D Program of China (2018YFF0216002).
Abstract: The classification performance of model coal mill classifiers with different bottom incoming flow inlets was experimentally and numerically studied. The flow field adjacent to two neighboring impeller blades was measured using the particle image velocimetry technique. The results showed that the flow field adjacent to two neighboring blades with the swirling inlet was significantly different from that with the non-swirling inlet. With the swirling inlet, there was a vortex located between two neighboring blades, while with the non-swirling inlet, the vortex was attached to the blade tip. The vorticity of the vortex with the non-swirling inlet was much lower than that with the swirling inlet. The classifier with the non-swirling inlet demonstrated a larger cut size than that with the swirling inlet when the impeller was stationary (~0 r·min⁻¹). As the impeller rotational speed increased, the cut size decreased for both the non-swirling and swirling inlets, and the decrease was more dramatic for the non-swirling inlet. The cut sizes of the two classifiers were close to each other at a high impeller rotational speed (≥120 r·min⁻¹). The overall separation efficiency of the classifier with the non-swirling inlet was lower than that with the swirling inlet, and increased monotonically as the impeller rotational speed increased. With the swirling inlet, the overall separation efficiency first increased with the impeller rotational speed and then decreased when the rotational speed was above 120 r·min⁻¹, and the variation of the separation efficiency was more moderate. As the initial particle concentration increased, the cut sizes of both the swirling and non-swirling inlet cases first decreased and then barely changed. At a low initial particle concentration (<0.04 kg·m⁻³), the classifier with the swirling inlet had a larger cut size than that with the non-swirling inlet.
Funding: This paper was supported by the collaborative Research Project SEV under Grant No. 01100474 between Beijing University of Posts and Telecommunications and France Telecom R&D Beijing; the National Natural Science Foundation of China under Grant No. 90920001; and the Graduate Innovation Fund of SICE, BUPT, 2011.
Abstract: The rapid growth of multimedia content necessitates powerful technologies to filter, classify, index, and retrieve video documents more efficiently. However, the essential bottleneck of image and video analysis is the semantic gap: low-level features extracted by computers often fail to coincide with high-level concepts interpreted by humans. In this paper, we present a generic scheme for the detection of video semantic concepts based on machine learning over multiple visual features. Various global and local low-level visual features are systematically investigated, and a kernel-based learning method equips the concept detection system to explore the potential of these features. We then combine the different features and sub-systems through both classifier-level and kernel-level fusion, which contributes to a more robust system. Our proposed system is tested on the TRECVID dataset. The resulting Mean Average Precision (MAP) score is much better than the benchmark performance, which shows that our concept detection engine develops a generic model and performs well on both object and scene type concepts.
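For readers unfamiliar with the MAP score mentioned above, the sketch below computes average precision per concept and then averages across concepts. It uses the textbook, non-interpolated definition and assumes each ranked list contains every relevant shot; the exact variant used in the TRECVID evaluation may differ. The function names, concept names, and relevance lists are hypothetical.

```python
def average_precision(ranked_relevance):
    """Average precision for one ranked list; entries are truthy where the item is relevant."""
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(ranked_relevance, start=1):
        if is_relevant:
            hits += 1
            # Precision measured at the rank of each relevant item.
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0

def mean_average_precision(ranked_lists):
    """Mean of the per-concept average precision values."""
    return sum(average_precision(r) for r in ranked_lists) / len(ranked_lists)

# Hypothetical ranked results for two concepts (1 = shot contains the concept).
results_by_concept = {
    "car": [1, 0, 1, 0],
    "beach": [0, 1],
}
map_score = mean_average_precision(list(results_by_concept.values()))
print(round(map_score, 4))
```

Because precision is sampled at every relevant hit, the score rewards systems that rank relevant shots near the top rather than merely retrieving them somewhere in the list.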
Funding: Financially supported by the INVENTIVE Mineral Processing Research Center of Iran.
Abstract: This study modeled the effects of structural and dimensional manipulations on the hydrodynamic behavior of a bench-scale vertical current classifier. A computational fluid dynamics (CFD) approach was used as the modeling method, and turbulent intensity and fluid velocity were applied as system responses to predict the overflow cut size variations. These investigations showed that the cut size would decrease with increasing diameter and height of the separation column and cone section depth, due to the decrease in turbulent intensity and fluid velocity. As the size of the discharge gate increases, the overflow cut size would decrease because the fluid streams freely out of the column. The overflow cut size was significantly larger in the downward-fed classifier than in the classifier fed by an upward fluid stream. In addition, reshaping the weir of the angular overflow outlet into a curved form prevented the stream from returning inside and, consequently, an unselective decrease in cut size.