This study explores the area of Author Profiling (AP) and its importance in several industries, including forensics, security, marketing, and education. A key component of AP is the extraction of useful information from text, with an emphasis on the writers' ages and genders. To improve the accuracy of AP tasks, the study develops an ensemble model dubbed ABMRF that combines AdaBoostM1 (ABM1) and Random Forest (RF). The work uses an extensive methodology that involves text-message dataset preprocessing, model training, and assessment. Several machine learning (ML) algorithms are compared on age and gender classification, including Composite Hypercube on Random Projection (CHIRP), Decision Trees (J48), Naïve Bayes (NB), K-Nearest Neighbor, AdaBoostM1, NB-Updatable, RF, and ABMRF. The findings demonstrate that ABMRF consistently outperforms the competition, with a gender classification accuracy of 71.14% and an age classification accuracy of 54.29%. Additional metrics, including precision, recall, F-measure, the Matthews Correlation Coefficient (MCC), and accuracy, support ABMRF's strong performance in age and gender profiling tasks. The study demonstrates the usefulness of ABMRF as an ensemble model for author profiling and highlights its possible applications in marketing, law enforcement, and education. The results emphasize the effectiveness of ensemble approaches in enhancing author-profiling accuracy, particularly for age and gender identification.
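As a rough illustration of how a boosting-plus-bagging ensemble of this kind could be assembled, the sketch below combines an AdaBoostM1-style booster with a Random Forest through soft voting in scikit-learn. The soft-voting rule, TF-IDF features, and placeholder data are assumptions; the paper's actual ABMRF combination scheme may differ.

```python
# Hedged sketch: an AdaBoostM1-style booster and a Random Forest combined
# by averaging class probabilities (soft voting). The combination rule and
# TF-IDF text features are assumptions, not the paper's exact ABMRF design.
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier, VotingClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline

texts = ["example sms one", "example sms two"]   # placeholder text messages
labels = ["male", "female"]                      # placeholder gender labels

ensemble = VotingClassifier(
    estimators=[
        ("abm1", AdaBoostClassifier(n_estimators=100)),   # AdaBoostM1 analogue
        ("rf", RandomForestClassifier(n_estimators=100)),
    ],
    voting="soft",   # average predicted class probabilities across members
)
model = make_pipeline(TfidfVectorizer(), ensemble)
model.fit(texts, labels)
print(model.predict(["another example sms"]))
```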
Text classification is an essential task for many applications in the Natural Language Processing domain. It can be applied in many fields, such as Information Retrieval, Knowledge Extraction, and Knowledge Modeling. Despite the importance of this task, Arabic text classification tools still suffer from many problems and remain incapable of handling the increasing volume of Arabic content that circulates on the web or resides in large databases. This paper introduces a novel machine learning-based approach that exclusively uses hybrid (stylistic and semantic) features. First, we clean the Arabic documents and translate them to English using translation tools. The semantic features are then automatically extracted from the translated documents using an existing database of English topics. In addition, the model automatically extracts from the textual content a set of stylistic features, such as word and character frequencies and punctuation. We therefore obtain three types of features: semantic, stylistic, and hybrid. Using each type of feature in turn, we performed an in-depth comparison of nine well-known machine learning models to evaluate our approach on a standard Arabic corpus. The results show that the Neural Network outperforms the other models and performs well with hybrid features (F1-score of 0.88).
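A minimal sketch of the stylistic-feature side is given below: word and character frequencies plus a punctuation ratio, as the abstract describes. The specific statistics chosen are illustrative assumptions; the paper's full stylistic feature list is not reproduced here.

```python
# Hedged sketch of stylistic feature extraction: word/character counts and
# a punctuation ratio. The exact feature set is an assumption.
import string

def stylistic_features(text: str) -> dict:
    words = text.split()
    n_chars = len(text)
    return {
        "n_words": len(words),
        "n_chars": n_chars,
        "avg_word_len": sum(len(w) for w in words) / max(len(words), 1),
        "punct_ratio": sum(c in string.punctuation for c in text) / max(n_chars, 1),
    }

print(stylistic_features("An example sentence, with some punctuation!"))
```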
A complex Laboratory Developed Test (LDT) is a clinical test developed within a single laboratory. It is typically configured from many feature constraints drawn from clinical repositories, which are part of the existing Laboratory Information Management System (LIMS). Although these clinical repositories are automated, and support for managing patient information with LDT test results is integrated within the existing LIMS, support for configuring LDT designs is still unavailable, even in standard LIMS packages. The manual configuration of LDTs is a complex process and can generate configuration inconsistencies, because many constraints between features can remain unsatisfied. It is a risky process and can lead patients to undergo unnecessary treatments. We propose an optimized solution (opt-LDT) based on Genetic Algorithms to automate the configuration and resolve inconsistencies in LDTs. Opt-LDT encodes LDT configuration as an optimization problem and generates a consistent configuration that satisfies the feature constraints. We tested and validated opt-LDT for a local secondary-care hospital in a real healthcare environment. Our results, averaged over ten runs, show that opt-LDT resolves 90% of inconsistencies while taking between 6 and 6.5 s per configuration. Moreover, positive feedback from clinicians, gathered through a subjective questionnaire on the performance, acceptability, and efficiency of opt-LDT, motivates us to present our results for regulatory approval.
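The sketch below shows one way a genetic algorithm can search for a feature configuration that satisfies boolean constraints, in the spirit of opt-LDT. The constraint encoding ("feature a requires feature b"), population size, crossover, and mutation rate are all illustrative assumptions, not the paper's actual design.

```python
# Hedged sketch of a GA minimizing constraint violations over a bit-vector
# feature configuration. Constraints and GA parameters are assumptions.
import random

N_FEATURES = 8
# Example constraints: (a, b) meaning "feature a requires feature b".
REQUIRES = [(0, 1), (2, 3), (4, 1)]

def violations(cfg):
    return sum(cfg[a] and not cfg[b] for a, b in REQUIRES)

def evolve(pop_size=50, generations=100):
    pop = [[random.randint(0, 1) for _ in range(N_FEATURES)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=violations)            # fewer violations = fitter
        if violations(pop[0]) == 0:
            return pop[0]                   # consistent configuration found
        survivors = pop[: pop_size // 2]
        children = []
        while len(children) < pop_size - len(survivors):
            p1, p2 = random.sample(survivors, 2)
            cut = random.randrange(1, N_FEATURES)   # one-point crossover
            child = p1[:cut] + p2[cut:]
            if random.random() < 0.1:               # occasional mutation
                child[random.randrange(N_FEATURES)] ^= 1
            children.append(child)
        pop = survivors + children
    return min(pop, key=violations)

print(evolve())
```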
Electrical load forecasting is crucial for the planning and operation of electrical power systems. Both buildings' electrical load demand and meteorological datasets may contain hidden patterns that need to be investigated to reveal their potential impact on load forecasting. In this study, meteorological data are analyzed through different data mining techniques with the aim of predicting the electrical load demand of a factory located in Riyadh, Saudi Arabia. The factory load and meteorological data used in this study were recorded hourly between 2016 and 2017 and were provided by King Abdullah City for Atomic and Renewable Energy and the Saudi Electricity Company for a site located in Riyadh. After applying data pre-processing techniques to prepare the data, different machine learning algorithms, namely Artificial Neural Network and Support Vector Regression (SVR), are applied and compared to predict the factory load. In addition, to select the optimal set of features, 13 different feature combinations are investigated. The outcomes of this study emphasize the importance of selecting the optimal set of features, as additional features may add complexity to the learning process. Finally, the SVR algorithm with six features provides the most accurate predictions of the factory load.
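For orientation, the sketch below trains an SVR on six features to predict hourly load, mirroring the six-feature SVR the study found most accurate. The synthetic data, feature meanings, and hyperparameters are assumptions.

```python
# Hedged sketch: SVR on six (synthetic) meteorological/temporal features.
# Data and hyperparameters are assumptions, not the study's setup.
import numpy as np
from sklearn.svm import SVR
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
X = rng.random((200, 6))            # e.g., temperature, humidity, wind, hour, ...
y = 100 + 50 * X[:, 0] + rng.normal(0, 2, 200)   # synthetic hourly load (kW)

model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0))
model.fit(X[:150], y[:150])
print("MAE:", np.abs(model.predict(X[150:]) - y[150:]).mean())
```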
Depression is a mental disorder that may cause physical disorders or lead to death. It has a strong impact on the socio-economic life of a person; therefore, its effective and timely detection is needed. In addition to speech and gait, facial expressions carry valuable clues to depression. This study proposes a depression detection system based on facial expression analysis. Facial features are used for depression detection with a Support Vector Machine (SVM) and a Convolutional Neural Network (CNN). We extracted micro-expressions using the Facial Action Coding System (FACS), as Action Units (AUs) correlated with sadness, disgust, and contempt are indicative of depression. A CNN-based model is also proposed to automatically classify depressed subjects from images or videos in real time. Experiments were performed on a dataset obtained from Bahawal Victoria Hospital, Bahawalpur, Pakistan, labeled according to the Patient Health Questionnaire depression scale (PHQ-8) to infer the mental condition of each patient. The experiments revealed 99.9% validation accuracy for the proposed CNN model, while the extracted features achieved 100% accuracy with the SVM. Moreover, the results demonstrate the superiority of the reported approach over state-of-the-art methods.
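A minimal sketch of the SVM stage is shown below: a classifier over FACS Action Unit intensity vectors. The 17-dimensional AU vectors and the synthetic labels are placeholders; the study's actual AU extraction pipeline and labels are not reproduced here.

```python
# Hedged sketch: SVM over Action Unit intensity vectors. The AU dimension
# and the synthetic depressed/non-depressed labels are assumptions.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(1)
X = rng.random((100, 17))                  # 17 AU intensities per face (assumption)
y = (X[:, 0] + X[:, 3] > 1.0).astype(int)  # synthetic depression labels

clf = SVC(kernel="rbf").fit(X[:80], y[:80])
print("held-out accuracy:", clf.score(X[80:], y[80:]))
```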
A tremendous number of vendor invoices are generated in the corporate sector. To automate the manual data entry in payable documents, highly accurate Optical Character Recognition (OCR) is required. This paper proposes an end-to-end OCR system that performs both localization and recognition and serves as a single unit to automate the processing of payable documents such as cheques and cash disbursements. For text localization, the maximally stable extremal region (MSER) method is used, which extracts word or digit chunks from an invoice. Each chunk is then passed to a deep learning model that performs text recognition. The deep learning model combines convolutional neural networks and long short-term memory (LSTM): the convolutional layers extract features, which are fed to the LSTM. The model integrates feature extraction, sequence modeling, and transcription into a unified network. It handles sequences of unconstrained length, independent of character segmentation or horizontal scale normalization. Furthermore, it applies to both lexicon-free and lexicon-based text recognition, and it produces a comparatively small model that can be deployed in practical applications. The overall superior performance in the experimental evaluation demonstrates the usefulness of the proposed model. The model is thus generic and can be used for other, similar recognition scenarios.
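The localization stage can be approximated with OpenCV's MSER detector, as sketched below: each detected bounding box yields a word or digit chunk that would be passed on to the CNN+LSTM recognizer (omitted here). The input file name is an assumption.

```python
# Hedged sketch of MSER-based text localization with OpenCV. The recognition
# stage (CNN + LSTM) is omitted; each crop would be fed to it.
import cv2

img = cv2.imread("invoice.png")            # assumed input path
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

mser = cv2.MSER_create()
regions, bboxes = mser.detectRegions(gray)
for (x, y, w, h) in bboxes:
    chunk = gray[y:y + h, x:x + w]         # word/digit chunk for the recognizer
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 1)

cv2.imwrite("localized.png", img)
```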
The advent of the COVID-19 pandemic has adversely affected the entire world and created high demand for techniques that remotely manage crowd-related tasks. Video surveillance and crowd management using video analysis techniques have significantly impacted today's research, and numerous applications have been developed in this domain. This research proposes an anomaly detection technique applied to Umrah videos in the Kaaba during the COVID-19 pandemic through sparse crowd analysis. Managing the Kaaba rituals is crucial, since the crowd gathers from around the world and requires proper analysis during the pandemic. The Umrah videos are analyzed, and a system is devised that can track and monitor the crowd flow in the Kaaba. The crowd in these videos is sparse due to the pandemic, and we have developed a technique to track the dominant crowd flow and detect any object (person) moving in a direction contrary to the major flow. Abnormal movement is detected by creating histograms of the vertical and horizontal flows and applying thresholds to identify the non-majority flow. Our algorithm aims to analyze the crowd through video surveillance and detect any abnormal activity in time, to maintain a smooth crowd flow in the Kaaba during the pandemic.
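The flow-histogram idea can be sketched as follows: compute dense optical flow between consecutive frames, take the histogram peaks of the horizontal and vertical components as the majority direction, and flag motion deviating strongly from it. The frame file names, bin count, and the deviation threshold are assumptions.

```python
# Hedged sketch: dense optical flow, flow-component histograms, and a
# threshold on deviation from the majority flow. Thresholds are assumptions.
import cv2
import numpy as np

prev = cv2.cvtColor(cv2.imread("frame0.png"), cv2.COLOR_BGR2GRAY)
curr = cv2.cvtColor(cv2.imread("frame1.png"), cv2.COLOR_BGR2GRAY)

flow = cv2.calcOpticalFlowFarneback(prev, curr, None,
                                    0.5, 3, 15, 3, 5, 1.2, 0)
fx, fy = flow[..., 0], flow[..., 1]

# Majority direction from the histogram peak of each flow component.
hx, edges_x = np.histogram(fx, bins=16)
hy, edges_y = np.histogram(fy, bins=16)
major_x = 0.5 * (edges_x[hx.argmax()] + edges_x[hx.argmax() + 1])
major_y = 0.5 * (edges_y[hy.argmax()] + edges_y[hy.argmax() + 1])

# Pixels moving far from the majority flow are flagged as anomalous.
deviation = np.hypot(fx - major_x, fy - major_y)
anomalous = deviation > 2.0 * deviation.std()   # threshold is an assumption
print("anomalous pixels:", int(anomalous.sum()))
```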
A Bloom filter (BF) is a space- and time-efficient probabilistic technique for answering membership queries. However, the traditional BF faces two main problems. First, a large number of false positives can return wrong content when the data is queried. Second, a large BF becomes a bottleneck for query speed and consumes a large amount of memory. To address these two issues, this article proposes the check-bits concept. From the implementation perspective, before saving a content value in the BF, we obtain its binary representation and take some of its bits, which we call the check bits. These bits are stored in a separate array that points to the same locations as the BF. Finally, the content value is stored in the BF based on the hash function values. Before retrieval of data from the BF, the reverse process ensures that even if the hash functions produce the same output for different contents, the check bits prevent retrieval from depending on the hash output alone, which reduces false positives. In the experimental evaluation, we reduce false positives by more than 50%. In our proposed approach, false positives can still occur, but only if both the hash functions and the check bits produce the same values for a particular content. Since the chances of this are small, we obtain a reduction of approximately more than 50% in false positives in all cases. We believe that the proposed approach advances the state of the art and opens new research directions.
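The sketch below captures the described mechanism in its simplest form: alongside each filter bit, a few low-order bits of the inserted value are stored, and a query must match both. The filter size, number of hash functions, and the 4-bit choice are assumptions; this simplified version also overwrites check bits on colliding insertions, a case the paper's scheme would need to handle to avoid false negatives.

```python
# Hedged sketch of a Bloom filter with check bits: each set position also
# stores the value's low-order bits; a query must match both. Sizes and the
# 4-bit choice are assumptions. Note: overwriting check bits on collision is
# a simplification that can cause false negatives under heavy load.
import hashlib

M, K, CHECK_BITS = 1024, 3, 4

bits = [0] * M
checks = [None] * M          # check-bit array, parallel to the filter

def _positions(value: int):
    for i in range(K):
        h = hashlib.sha256(f"{i}:{value}".encode()).hexdigest()
        yield int(h, 16) % M

def insert(value: int):
    cb = value & ((1 << CHECK_BITS) - 1)   # low-order bits as check bits
    for p in _positions(value):
        bits[p] = 1
        checks[p] = cb

def query(value: int) -> bool:
    cb = value & ((1 << CHECK_BITS) - 1)
    return all(bits[p] and checks[p] == cb for p in _positions(value))

insert(42)
print(query(42), query(43))   # True, almost certainly False
```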
Currently, mobile communication is one of the most widely used means of communication. Nevertheless, it is quite challenging for a telecommunication company to attract new customers. The recent introduction of mobile number portability has also aggravated the problem of customer churn. Companies need to identify, in advance, the customers who could potentially churn to competitors. In the telecommunication industry, such identification can be based on call detail records. This research presents an extensive experimental study of various deep learning models, namely a 1D convolutional neural network (CNN), a recurrent neural network (RNN), and a deep neural network (DNN), for churn prediction. We use a mobile telephony churn prediction dataset obtained from customers-dna.com, containing data for around 100,000 individuals, of whom 86,000 are non-churners and 14,000 are churned customers. The imbalanced data are handled using undersampling and oversampling. The accuracy for the CNN, RNN, and DNN is 91%, 93%, and 96%, respectively. Furthermore, the DNN achieved a ROC score of 99%.
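One way to realize the undersampling step mentioned above is random undersampling of the majority class, sketched below with NumPy. The feature matrix is synthetic; only the 86,000/14,000 split mirrors the dataset described in the abstract.

```python
# Hedged sketch: random undersampling of the majority (non-churn) class.
# The features are synthetic; the class split mirrors the abstract.
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((100_000, 20))
y = np.r_[np.zeros(86_000, dtype=int), np.ones(14_000, dtype=int)]  # 0 = non-churn

minority = np.flatnonzero(y == 1)
majority = rng.choice(np.flatnonzero(y == 0), size=minority.size, replace=False)
idx = rng.permutation(np.r_[minority, majority])

X_bal, y_bal = X[idx], y[idx]
print(X_bal.shape, np.bincount(y_bal))    # balanced 14k / 14k
```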
There are over 200 different varieties of dates fruit in the world, and interestingly, every single variety has specific features that differ from the others. In recent years, sorting, separating, and arranging fruit in automated industries, in fruit businesses, and more specifically in the dates business have inspired many research directions. This paper focuses on the detection and recognition of dates using computer vision and machine learning. Our experimental setup compares a classical machine learning approach with a deep learning approach for nine classes of dates fruit. The classical machine learning models include the Bayesian network, Support Vector Machine, Random Forest, and Multi-Layer Perceptron (MLP), while a Convolutional Neural Network is used for the deep learning approach. The feature set includes Color Layout features, the Fuzzy Color and Texture Histogram, Gabor filtering, and the Pyramid Histogram of Oriented Gradients (PHOG). The fusion of various features is also extensively explored in this paper. The MLP achieves the highest detection performance among the classical models, with an F-measure of 0.938. Moreover, deep learning shows better accuracy than the classical machine learning algorithms; in fact, deep learning achieved 2% more accurate results than the MLP and the Random Forest. We also show that classical machine learning can reach classification performance close to that of deep learning through optimized tuning and a good feature set.
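Feature fusion of the kind explored here is commonly done by concatenating per-image descriptor vectors before classification, as sketched below. The descriptor dimensions and random stand-in data are assumptions; the real pipeline would extract Color Layout, FCTH, Gabor, and PHOG descriptors from the date images.

```python
# Hedged sketch of early feature fusion: concatenate two descriptor vectors
# per image and classify with an MLP. Dimensions and data are assumptions.
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(2)
color_feats = rng.random((90, 33))    # stand-in for Color Layout features
phog_feats = rng.random((90, 40))     # stand-in for PHOG features
X = np.hstack([color_feats, phog_feats])   # fusion by concatenation
y = rng.integers(0, 9, 90)                 # nine date classes

clf = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500).fit(X, y)
print(clf.score(X, y))
```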
Neural conversation models play a leading role in the increasingly popular construction of conversational agents. A common criticism of these systems is that they seldom understand or use the conversation data efficiently. Advances in deep learning have increased the use of neural models for dialogue modeling. In recent years, deep learning (DL) models have achieved significant success in various tasks, and many dialogue systems also employ DL techniques. The primary issues in building a dialogue system are acquiring insight into natural language, providing comprehension, and assessing the conversation. In this paper, we focus on DL-based dialogue systems. The problem addressed in this work is dialogue management, which determines how the framework responds once it recognizes the needs of the user. The dataset used in this research is extracted from movie dialogues. The models implemented are a seq2seq model, Transformers, and GPT, together with word embeddings and standard NLP preprocessing. The results obtained after implementation show that all three models produce accurate results. In the modern world, demand for dialogue systems is higher than ever; therefore, it is essential to take the necessary steps to build effective dialogue systems.
Taxonomy is generated to effectively organize and access large volumes of data. A taxonomy is a way of representing the concepts that exist in data, and it needs to continuously evolve to reflect changes in the data. Existing automatic taxonomy generation techniques do not handle the evolution of data; therefore, the generated taxonomies do not truly represent the data. The evolution of data can be handled either by regenerating the taxonomy from scratch or by allowing the taxonomy to evolve incrementally whenever changes occur in the data. The former approach is not economical in terms of time and resources. The proposed taxonomy incremental evolution (TIE) algorithm is a novel attempt to handle data that evolve over time. It serves as a layer over an existing clustering-based taxonomy generation technique and allows an existing taxonomy to evolve incrementally. The algorithm was evaluated on research articles selected from the computing domain. It was found that the taxonomy evolved with the algorithm required considerably less time and had better quality per unit time than a taxonomy regenerated from scratch.
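To give a flavor of the incremental idea, the sketch below routes a new document vector into the most similar existing taxonomy node instead of re-clustering everything. The cosine-similarity routing, centroid update, and 0.3 threshold are assumptions for illustration, not the TIE algorithm's actual rules.

```python
# Hedged sketch of incremental taxonomy evolution: assign a new document to
# the closest node by cosine similarity, or spawn a new node. All rules and
# thresholds here are assumptions, not TIE itself.
import numpy as np

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)

taxonomy = {                      # node -> centroid of its documents
    "machine learning": np.array([0.9, 0.1, 0.0]),
    "databases": np.array([0.1, 0.9, 0.2]),
}

def insert_document(vec, threshold=0.3):
    best = max(taxonomy, key=lambda n: cosine(vec, taxonomy[n]))
    if cosine(vec, taxonomy[best]) >= threshold:
        taxonomy[best] = 0.5 * (taxonomy[best] + vec)   # update centroid
        return best
    taxonomy["new-topic"] = vec                         # spawn a new node
    return "new-topic"

print(insert_document(np.array([0.8, 0.2, 0.1])))
```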