With the rapid growth of internet usage, a new situation has been created that enables the practice of bullying. Cyberbullying has increased over the past decade, and it has the same adverse effects as face-to-face bullying, such as anger, sadness, anxiety, and fear. With the anonymity people get on the internet, they tend to be more aggressive and express their emotions freely without considering the effects, which can be a reason for the increase in cyberbullying and is the main motive behind the current study. This study presents a thorough background of cyberbullying and the techniques used to collect, preprocess, and analyze the datasets. Moreover, a comprehensive review of the literature was conducted to identify research gaps and effective techniques and practices in cyberbullying detection across various languages, and it was deduced that there is significant room for improvement for the Arabic language. As a result, the current study focuses on the investigation of shortlisted machine learning algorithms in natural language processing (NLP) for the classification of Arabic datasets duly collected from Twitter (also known as X). In this regard, support vector machine (SVM), Naive Bayes (NB), Random Forest (RF), Logistic Regression (LR), Bootstrap aggregating (Bagging), Gradient Boosting (GBoost), Light Gradient Boosting Machine (LightGBM), Adaptive Boosting (AdaBoost), and eXtreme Gradient Boosting (XGBoost) were shortlisted and investigated due to their effectiveness on similar problems. Finally, the scheme was evaluated using well-known performance measures such as accuracy, precision, recall, and F1-score. Consequently, XGBoost exhibited the best performance with 89.95% accuracy, which is promising compared to the state of the art.
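The TF-IDF-plus-classifier comparison described above can be sketched as follows; the tweets, labels, and model settings here are hypothetical placeholders (English toy data, default hyperparameters), not the study's Arabic Twitter dataset or tuned models.

```python
# Minimal sketch of a TF-IDF + classifier-comparison pipeline of the kind the
# abstract describes. All data below is an illustrative toy stand-in.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.pipeline import make_pipeline

texts = ["you are awful and stupid", "have a great day friend",
         "everyone hates you loser", "lovely weather and kind people",
         "stupid awful loser", "kind great lovely"]
labels = [1, 0, 1, 0, 1, 0]  # 1 = bullying, 0 = benign (toy labels)

# Each candidate model is wrapped in the same TF-IDF front end.
candidates = {
    "LR": LogisticRegression(),
    "NB": MultinomialNB(),
    "GBoost": GradientBoostingClassifier(),
}
for name, clf in candidates.items():
    pipe = make_pipeline(TfidfVectorizer(), clf)
    pipe.fit(texts, labels)
    print(name, pipe.score(texts, labels))
```

In practice each candidate would be scored on a held-out split with accuracy, precision, recall, and F1 rather than training accuracy.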
Recognizing handwritten characters remains a critical and formidable challenge within the realm of computer vision. Although considerable strides have been made in enhancing English handwritten character recognition through various techniques, deciphering Arabic handwritten characters is particularly intricate. This complexity arises from the diverse array of writing styles among individuals, coupled with the various shapes that a single character can take when positioned differently within document images, rendering the task more perplexing. In this study, a novel segmentation method for Arabic handwritten scripts is suggested. This work aims to locate the local minima of the vertical and diagonal word-image densities to precisely identify the segmentation points between the cursive letters. The proposed method starts by pre-processing the word image without affecting its main features, then calculates the directional pixel density of the word image by scanning it vertically and from angles of 30° to 90°, counting the pixel density from all directions to address the problem of overlapping letters, which is common in many people's Arabic handwriting. Local minima and thresholds are also determined to identify the ideal segmentation area. The proposed technique is tested on samples obtained from two datasets: a self-curated image dataset and the IFN/ENIT dataset. The results demonstrate that the proposed method achieves a significant improvement in the proportion of correct cursive segmentation, reaching 92.96% on our dataset and 89.37% on the IFN/ENIT dataset.
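The projection-profile idea behind this segmentation method can be illustrated as follows: column ink densities of a binarized word image are scanned for local minima below a threshold, which mark candidate cut points between letters. The tiny synthetic "image" and the threshold are hypothetical, and the paper additionally scans diagonal directions (30° to 90°) to handle overlapping letters, which this vertical-only sketch omits.

```python
# Vertical projection-profile segmentation sketch (illustrative only).
import numpy as np

def vertical_cut_points(img, thresh=1):
    """Return column indices whose ink density is a local minimum <= thresh."""
    density = img.sum(axis=0)  # vertical projection profile
    cuts = []
    for x in range(1, len(density) - 1):
        if density[x] <= thresh and density[x] <= density[x - 1] \
                and density[x] <= density[x + 1]:
            cuts.append(x)
    return cuts

# Two 3-column "letters" separated by a blank column at x = 3.
img = np.array([[1, 1, 1, 0, 1, 1, 1],
                [1, 0, 1, 0, 1, 0, 1],
                [1, 1, 1, 0, 1, 1, 1]])
print(vertical_cut_points(img))  # → [3]
```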
Handwritten character recognition (HCR) involves identifying characters in images, documents, and various sources such as forms, surveys, questionnaires, and signatures, and transforming them into a machine-readable format for subsequent processing. Successfully recognizing complex and intricately shaped handwritten characters remains a significant obstacle. The use of convolutional neural networks (CNNs) in recent developments has notably advanced HCR, leveraging the ability to extract discriminative features from extensive sets of raw data. Because of the absence of pre-existing datasets for the Kurdish language, we created a Kurdish handwritten dataset called KurdSet. The dataset consists of Kurdish characters, digits, texts, and symbols, was collected from 1,560 participants, and contains 45,240 characters. In this study, we chose characters only from our dataset and utilized it for handwritten character recognition. The study also utilizes various models, including InceptionV3, Xception, DenseNet121, and a custom CNN model. To show the performance of the KurdSet dataset, we compared it to the Arabic handwritten character recognition dataset (AHCD), applying the models to both datasets. Additionally, the performance of the models is evaluated using test accuracy, which measures the percentage of correctly classified characters in the evaluation phase. All models performed well in the training phase; DenseNet121 exhibited the highest accuracy among the models, achieving 99.80% on the Kurdish dataset, while the Xception model achieved 98.66% on the Arabic dataset.
Handwritten character recognition is considered challenging compared with recognizing machine-printed characters due to the variety of human writing styles. Arabic is morphologically rich, and its characters have a high degree of similarity. The Arabic language includes 28 characters, and each character has up to four shapes according to its location in the word (at the beginning, in the middle, at the end, or isolated). This paper proposes 12 CNN architectures for recognizing handwritten Arabic characters. The proposed architectures were derived from popular CNN architectures, such as VGG, ResNet, and Inception, to make them applicable to recognizing character-size images. The experimental results on three well-known datasets showed that the proposed architectures significantly enhanced the recognition rate compared to the baseline models. The experiments showed that data augmentation improved the models' accuracies on all tested datasets, and the proposed model outperformed most existing approaches. The best achieved results were 93.05%, 98.30%, and 96.88% on the HIJJA, AHCD, and AIA9K datasets, respectively.
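The VGG-, ResNet-, and Inception-derived architectures above all stack the same convolution-and-pooling primitives; as a rough aid to intuition only (not the paper's 12 architectures), the forward pass of one conv + ReLU + max-pool stage can be sketched in plain NumPy:

```python
# NumPy sketch of a single conv + ReLU + 2x2 max-pool stage, the building
# block that character-size CNN architectures stack. Illustrative only.
import numpy as np

def conv2d(img, kernel):
    kh, kw = kernel.shape
    h, w = img.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = (img[i:i + kh, j:j + kw] * kernel).sum()
    return out

def maxpool2x2(fmap):
    h, w = fmap.shape[0] // 2 * 2, fmap.shape[1] // 2 * 2
    return fmap[:h, :w].reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

img = np.arange(25, dtype=float).reshape(5, 5)  # stand-in 5x5 "character"
edge = np.array([[1.0, -1.0]])                  # toy horizontal edge kernel
fmap = np.maximum(conv2d(img, edge), 0)         # convolution + ReLU
print(maxpool2x2(fmap).shape)  # → (2, 2)
```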
Spices are defined as any aromatic condiment of plant origin used to alter the flavor and aroma of foods. Besides flavor and aroma, many spices have antioxidant activity, mainly related to the presence in cloves of phenolic compounds such as flavonoids, terpenoids, and eugenol. In turn, the most common uses of gum arabic are in the form of powder for addition to soft drink syrups, cuisine, and baked goods, specifically to stabilize the texture of products, increase the viscosity of liquids, and promote the leavening of baked products (e.g., cakes). Both eugenol, extracted from cloves, and gum arabic, extracted from the hardened sap of two species of the Acacia tree, are dietary constituents routinely consumed virtually throughout the world. Both are also widely used medicinally to inhibit oxidative stress and genotoxicity. The prevention arm of the study included groups Ia, IIa, IIIa, IVa, V, VI, VII, and VIII. Once a week for 20 weeks, the controls received saline s.c. while the experimental groups received DMH at 20 mg/kg s.c. During the same period and for an additional 9 weeks, the animals received either water, 10% GA, EUG, or 10% GA + EUG by gavage. The treatment arm of the study included groups Ib, IIb, IIIb, IVb, IX, X, XI, and XII. Once a week for 20 weeks, the controls received saline s.c. while the experimental groups received DMH at 20 mg/kg s.c. During the subsequent 9 weeks, the animals received either water, 10% GA, EUG, or 10% GA + EUG by gavage. The novelty of this study is the investigation of their use alone and together for the prevention and treatment of experimental colorectal carcinogenesis induced by dimethylhydrazine.
Our results show that the combined use of 10% gum arabic and eugenol was effective, exerting an antioxidant action in the colon, reducing oxidative stress in all colon segments, and preventing and treating genotoxicity in all colon segments. Furthermore, their joint administration reduced the number of aberrant crypts and the number of aberrant crypt foci (ACF) in the distal segment and the entire colon, as well as the number of ACF with at least 5 crypts in the entire colon. Thus, our results also demonstrate the synergistic effects of 10% gum arabic together with eugenol (from cloves), with antioxidant, antigenotoxic, and anticarcinogenic actions (prevention and treatment) at the doses and durations studied, in the colon of rats submitted to colorectal carcinogenesis induced by dimethylhydrazine.
Handwritten character recognition has become one of the most challenging research matters. Many studies have been presented for recognizing the letters of various languages, but the availability of Arabic handwritten character databases has been confined. Almost a quarter of a billion people worldwide write and speak Arabic, and many historical books and files of vital value to many Arab nations are written in Arabic. Recently, Arabic handwritten character recognition (AHCR) has grabbed attention and become a difficult topic for pattern recognition and computer vision (CV). Therefore, this study develops a fireworks optimization with deep learning-based AHCR (FWODL-AHCR) technique. The major intention of the FWODL-AHCR technique is to recognize the distinct handwritten characters of the Arabic language. It initially pre-processes the handwritten images to improve their quality. Then, a RetinaNet-based deep convolutional neural network is applied as a feature extractor to produce feature vectors. Next, the deep echo state network (DESN) model is utilized to classify handwritten characters. Finally, the FWO algorithm is exploited as a hyperparameter tuning strategy to boost recognition performance. A series of simulations were performed to exhibit the enhanced performance of the FWODL-AHCR technique. The comparison study portrayed the supremacy of the FWODL-AHCR technique over other approaches, with 99.91% and 98.94% accuracy on the Hijja and AHCD datasets, respectively.
Gum Arabic (GA) from Acacia senegal var. kerensis has been approved as an emulsifier, stabilizer, thickener, and encapsulator in the food processing industry. Chia mucilage, on the other hand, has been approved for use as a fat and egg yolk mimic. However, both chia mucilage and gum Arabic are underutilized locally in Kenya; thus, marginal reports have been published despite their potential to alter the functional properties of food products. In this study, the potential use of chia mucilage and gum Arabic was evaluated in the development of an eggless fat-reduced mayonnaise (FRM). The mayonnaise substitute was prepared by replacing eggs and partially substituting sunflower oil with chia mucilage at 15%, 30%, 45%, and 60% levels and gum Arabic at 3%, while reducing the oil levels to 15%, 30%, 45%, and 60%. The effects of different concentrations of oil and chia mucilage on the physicochemical properties (pH, emulsion stability, moisture content, protein, carbohydrate, fat, calories, ash, and titratable acidity, using AOAC methods) and on the sensory properties, covering both consumer acceptability and quantitative descriptive analysis, were evaluated and compared to a control with eggs and 75% sunflower oil. The results indicated that all fat-reduced mayonnaises had significantly lower energy (down to 493 kcal/100 g) and 20% fat content but a higher moisture content of 0.74, compared with the control's 784 kcal/100 g, 77% fat, and 0.39 moisture. These differences increased with increasing substitution levels of chia mucilage, which also affected pH, carbohydrate, and protein. There was no significant difference in ash content between the fat-reduced mayonnaises and the control.
Sensory evaluation demonstrated that mayonnaises substituted with chia seed mucilage and gum Arabic were accepted. All parameters were positively correlated with overall acceptability, with flavor having the strongest correlation (r = 0.78). Loadings from principal component analysis (PCA) of 16 sensory attributes of mayonnaise showed that over 66% of the variation in sensory attributes was explained by the first six principal components. This study shows good potential for chia mucilage and gum Arabic to be used as fat and egg mimetics and stabilizers, respectively, in mayonnaise with functional properties.
Dough improvers are substances with functional characteristics used in the baking industry to enhance dough properties. Currently, the baking industry faces increasing demand for natural ingredients owing to growing consumer awareness, thus contributing to the rising demand for natural hydrocolloids. Gum Arabic from Acacia senegal var. kerensis is a natural gum exhibiting excellent water-binding and emulsification capacity. However, very little has been reported on how it affects the rheological properties of wheat dough. The aim of this study was therefore to determine the rheological properties of wheat dough with partial additions of gum Arabic as an improver. Six treatments were analyzed, comprising flour-gum blends prepared by adding gum Arabic to wheat flour at different levels (1%, 2%, and 3%), plain wheat flour (negative control), and commercial bread flour and commercial chapati flour (positive controls). The rheological properties were determined using a Brabender Farinograph, Brabender Extensograph, and Brabender Viscograph. Results showed that the addition of gum Arabic had a significant (p < 0.05) effect on dough rheology relative to the plain flour and the commercial bread and chapati flours. These findings support the need to utilize gum Arabic from Acacia senegal var. kerensis as a dough improver.
This study investigated the perceptions of English educators and supervisors in Jeddah Governorate regarding the process of teaching English to elementary students. A survey was conducted with a sample of 94 educators and 10 supervisors. The data indicate that respondents considered English instruction at the elementary level essential for expanding children's perspectives, improving academic performance, and promoting international involvement. The main advantages cited are the development of English language skills and the promotion of early education. Although not as easily noticeable, the disadvantages include potential negative impacts on an individual's proficiency in Arabic and their sense of national identity. The highlighted challenges encompass insufficient teacher training, student reluctance towards English, limited resources, and school disparities. The proposed techniques focused on prioritizing English instructors' training, ensuring the use of appropriate content, utilizing technology, and promoting awareness among students and educators. The current research found various obstacles to teaching English at the elementary stage. To overcome these obstacles, it will be essential to enhance teacher competencies, develop efficient teaching methods, obtain the backing of stakeholders, assign adequate resources, and carry out continuous evaluations. Further research can also contribute to a better understanding of how early English learning impacts Arabic identity and proficiency.
In recent years, the usage of social networking sites has considerably increased in the Arab world. It has empowered individuals to express their opinions, especially in politics. Furthermore, various organizations that operate in Arab countries have embraced social media in their day-to-day business activities at different scales, which is attributed to business owners' understanding of social media's importance for business development. However, Arabic morphology is complicated to process, owing to nearly 10,000 roots and more than 900 patterns that act as the basis for verbs and nouns. Hate speech on online social networking sites has become a worldwide issue that reduces the cohesion of civil societies. Against this background, the current study develops a Chaotic Elephant Herd Optimization with Machine Learning for Hate Speech Detection (CEHOML-HSD) model for the Arabic language. The presented CEHOML-HSD model majorly concentrates on identifying and categorising Arabic text as hate speech or normal. To attain this, the CEHOML-HSD model follows the sub-processes discussed herewith. At the initial stage, the CEHOML-HSD model undergoes data pre-processing with the help of the TF-IDF vectorizer. Secondly, the Support Vector Machine (SVM) model is utilized to detect and classify hate speech texts written in the Arabic language. Lastly, the CEHO approach, developed by combining chaotic functions with the classical EHO algorithm, is employed for fine-tuning the parameters involved in the SVM; the design of the CEHO algorithm for parameter tuning shows the novelty of the work. A widespread experimental analysis was executed to validate the enhanced performance of the proposed CEHOML-HSD approach. The comparative study outcomes established the supremacy of the proposed CEHOML-HSD model over other approaches.
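The TF-IDF, SVM, and parameter-tuning stages of such a pipeline can be mimicked with an ordinary grid search standing in for the metaheuristic optimizer; the texts, labels, and parameter grid below are hypothetical toy stand-ins, not the study's data or its CEHO algorithm.

```python
# TF-IDF -> SVM -> hyperparameter tuning sketch; a plain grid search plays the
# role that a metaheuristic (e.g., CEHO) plays in the abstract. Toy data only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

texts = ["I despise you and your kind", "what a wonderful community",
         "you people are vermin", "thanks for the helpful answer",
         "despise vermin kind", "wonderful helpful thanks"] * 2
labels = [1, 0, 1, 0, 1, 0] * 2  # 1 = hate speech, 0 = normal (toy labels)

pipe = Pipeline([("tfidf", TfidfVectorizer()), ("svm", SVC())])
grid = GridSearchCV(pipe, {"svm__C": [0.1, 1, 10],
                           "svm__kernel": ["linear", "rbf"]}, cv=2)
grid.fit(texts, labels)
print(grid.best_params_, grid.best_score_)
```

A population-based optimizer differs from grid search mainly in how it proposes the next parameter candidates, not in how each candidate is evaluated.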
The COVID-19 pandemic caused significant disruptions in the field of education worldwide, including in the United Arab Emirates. Teachers and students had to adapt to remote learning and virtual classrooms, leading to various challenges in maintaining educational standards. The sudden transition to remote teaching could have a negative impact on students' reading abilities, especially in the Arabic language. To gain insight into the unique challenges encountered by Arabic language teachers in the UAE, a survey was conducted to explore their assessment of teaching quality, student-teacher interaction, and learning outcomes amidst the COVID-19 pandemic. The results of the survey revealed a significant decline in students' reading abilities and identified several major issues in online Arabic language teaching. These issues included limited interaction between students and teachers, challenges in monitoring students' class participation and performance, and challenges in effectively assessing students' reading skills. The results also demonstrated other challenges faced by Arabic language teachers, including a lack of preparedness, a lack of subscriptions to relevant platforms, and a lack of resources for online learning. Several solutions to these challenges are proposed, including reevaluating the balance between depth and breadth in the curriculum, integrating language skills into the curriculum more effectively, providing more comprehensive teacher professional development, implementing student grouping strategies, utilizing retired and expert teachers in specific content areas, allocating time for interventions, and improving support from both teachers and parents to ensure the quality of online learning.
Nowadays, the usage of social media platforms is rapidly increasing, and rumours and false information are also on the rise, especially among Arab nations. This false information is harmful to society and individuals, so blocking and detecting the spread of fake news in Arabic becomes critical. Several artificial intelligence (AI) methods, including contemporary transformer techniques such as BERT, have been used to detect fake news; thus, fake news in Arabic is identified by utilizing AI approaches. This article develops a new hunter-prey optimization with hybrid deep learning-based fake news detection (HPOHDL-FND) model for an Arabic corpus. The HPOHDL-FND technique undergoes extensive data pre-processing steps to transform the input data into a useful format. Besides, the HPOHDL-FND technique utilizes a long short-term memory recurrent neural network (LSTM-RNN) model for fake news detection and classification. Finally, the hunter-prey optimization (HPO) algorithm is exploited for optimal adjustment of the hyperparameters of the LSTM-RNN model. The performance of the HPOHDL-FND technique was validated using two Arabic datasets. The outcomes exemplified better performance than other existing techniques, with maximum accuracy of 96.57% and 93.53% on the Covid19Fakes and satirical datasets, respectively.
This study aims to review the latest contributions to Arabic Optical Character Recognition (OCR) during the last decade, which helps interested researchers know the existing techniques and extend or adapt them accordingly. The study describes the characteristics of the Arabic language, the different types of OCR systems, the different stages of an Arabic OCR system, the researchers' contributions at each step, and the evaluation metrics for OCR. The study reviews the existing datasets for Arabic OCR and their characteristics. Additionally, this study implemented some pre-processing and segmentation stages of Arabic OCR. The study compares the performance of the existing methods in terms of recognition accuracy; in addition to researchers' OCR methods, commercial and open-source systems are used in the comparison. The Arabic language is morphologically rich and written cursively, with dots and diacritics above and under the characters. Most of the existing approaches in the literature were evaluated on isolated characters or isolated words under a controlled environment, and few approaches were tested on page-level scripts. Some comparative studies show that the accuracy of the existing commercial Arabic OCR systems is low, under 75% for printed text, and further improvement is needed. Moreover, most of the current approaches are offline OCR systems, and there is no remarkable contribution to online OCR systems.
Text classification is an essential task for many applications related to the Natural Language Processing domain. It can be applied in many fields, such as Information Retrieval, Knowledge Extraction, and Knowledge Modeling. Despite the importance of this task, Arabic text classification tools still suffer from many problems and remain incapable of responding to the increasing volume of Arabic content that circulates on the web or resides in large databases. This paper introduces a novel machine learning-based approach that exclusively uses hybrid (stylistic and semantic) features. First, we clean the Arabic documents and translate them to English using translation tools. Consequently, the semantic features are automatically extracted from the translated documents using an existing database of English topics. Besides, the model automatically extracts from the textual content a set of stylistic features such as word and character frequencies and punctuation. We therefore obtain three types of features: semantic, stylistic, and hybrid. Using a different feature type each time, we performed an in-depth comparison study of nine well-known machine learning models to evaluate our approach on a standard Arabic corpus. The obtained results show that the Neural Network outperforms the other models and provides good performance using the hybrid features (F1-score = 0.88).
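Stylistic features of the kind mentioned above (word and character frequencies, punctuation) can be computed directly from the raw text; the particular feature names below are illustrative, not the paper's exact feature set.

```python
# Sketch of simple stylistic feature extraction from a text string.
import string

def stylistic_features(text):
    words = text.split()
    return {
        "n_chars": len(text),
        "n_words": len(words),
        "avg_word_len": sum(len(w) for w in words) / max(len(words), 1),
        "n_punct": sum(ch in string.punctuation for ch in text),
        "n_digits": sum(ch.isdigit() for ch in text),
    }

print(stylistic_features("Breaking: 3 new models, all open-source!"))
```

Each document's feature dictionary can then be vectorized and concatenated with the semantic features to form the hybrid representation.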
Arabic is one of the world's oldest languages, characterized by rich and complicated grammatical formats. Furthermore, Arabic morphology can be perplexing, because nearly 10,000 roots and 900 patterns form the basis for verbs and nouns. The Arabic language comprises distinct variations utilized in a community and in particular situations. Social media sites are a medium for expressing opinions and for social phenomena like racism, hatred, offensive language, and all kinds of verbal violence. Such conduct does not impact particular nations, communities, or groups only, but extends beyond such areas into people's everyday lives. This study introduces an Improved Ant Lion Optimizer with Deep Learning Driven Offensive and Hate Speech Detection (IALODL-OHSD) on Arabic cross-corpora. The presented IALODL-OHSD model mainly aims to detect and classify offensive/hate speech expressed on social media. In the IALODL-OHSD model, a three-stage process is performed, namely pre-processing, word embedding, and classification. Primarily, data pre-processing is performed to transform the Arabic social media text into a useful format. In addition, the word2vec word embedding process is utilized to produce word embeddings. The attention-based cascaded long short-term memory (ACLSTM) model is utilized for the classification process. Finally, the IALO algorithm is exploited as a hyperparameter optimizer to boost classifier results. To illustrate the result analysis of the IALODL-OHSD model, a detailed set of simulations was performed. The extensive comparison study portrayed the enhanced performance of the IALODL-OHSD model over other approaches.
Sentiment Analysis (SA) is one of the Machine Learning (ML) techniques that has been investigated by several researchers in recent years, especially due to the evolution of novel data collection methods focused on social media. In the literature, it has been reported that SA data has been created for the English language in excess of any other language. It is challenging to perform SA on Arabic Twitter data owing to the informal nature and rich morphology of the Arabic language. Earlier studies on SA for Arabic Twitter focused mostly on automatic extraction of features from the text. Neural word embedding has been employed in the literature, since it is less labor-intensive than manual feature engineering; however, by ignoring the context of sentiment, most word-embedding models capture only the syntactic information of words. The current study presents a new Dragonfly Optimization with Deep Learning Enabled Sentiment Analysis for Arabic Tweets (DFODL-SAAT) model. The aim of the presented DFODL-SAAT model is to distinguish the sentiments of opinions tweeted in the Arabic language. At first, data cleaning and pre-processing steps are performed to convert the input tweets into a useful format. In addition, the TF-IDF model is exploited as a feature extractor to generate the feature vectors. Besides, the attention-based bidirectional long short-term memory (ABLSTM) technique is applied for the identification and classification of sentiments. At last, the hyperparameters of the ABLSTM model are optimized using the DFO algorithm. The performance of the proposed DFODL-SAAT model was validated using a benchmark dataset, and the outcomes were investigated under different aspects. The experimental outcomes highlight the superiority of the DFODL-SAAT model over recent approaches.
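The TF-IDF weighting used as the feature extractor above can be computed by hand on a toy corpus (the documents are placeholders): tf is the term's relative frequency in a document, idf = log(N / df), and the weight is their product; real toolkits add smoothing variants on top of this basic form.

```python
# Hand-rolled TF-IDF on a toy corpus, the classic unsmoothed formulation.
import math

docs = [["good", "service", "good"], ["bad", "service"], ["good", "day"]]
N = len(docs)
df = {}  # document frequency of each term
for doc in docs:
    for term in set(doc):
        df[term] = df.get(term, 0) + 1

def tfidf(term, doc):
    tf = doc.count(term) / len(doc)     # relative term frequency
    idf = math.log(N / df[term])        # inverse document frequency
    return tf * idf

print(round(tfidf("good", docs[0]), 4))  # → 0.2703 (frequent, moderate idf)
print(round(tfidf("bad", docs[1]), 4))   # → 0.5493 (rare, high idf)
```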
Aspect-based sentiment analysis (ABSA) is a fine-grained process. Its fundamental subtasks are aspect term extraction (ATE) and aspect polarity classification (APC), and these subtasks are dependent and closely related. However, most existing works on Arabic ABSA address them separately, assume that aspect terms are pre-identified, or use a pipeline model. Pipeline solutions design different models for each task, and the output from the ATE model is used as the input to the APC model, which may result in error propagation among the different steps because APC is affected by ATE errors. These methods are impractical for real-world scenarios, where the ATE task is the base task for APC and its result impacts the accuracy of APC. Thus, in this study, we focused on a multi-task learning model for Arabic ATE and APC in which the model is jointly trained on the two subtasks simultaneously in a single model. This paper integrates the multi-task model, namely Local Context Focus-Aspect Term Extraction and Polarity Classification (LCF-ATEPC), and the Arabic Bidirectional Encoder Representations from Transformers (AraBERT) as a shared layer for Arabic contextual text representation. The LCF-ATEPC model is based on multi-head self-attention and a local context focus (LCF) mechanism to capture the interactive information between an aspect and its context. Moreover, data augmentation techniques are proposed based on state-of-the-art augmentation techniques (word embedding substitution with constraints and contextual embedding (AraBERT)) to increase the diversity of the training dataset. This paper examined the effect of data augmentation on the multi-task model for Arabic ABSA. Extensive experiments were conducted on the original and combined datasets (merging the original and augmented datasets). Experimental results demonstrate that the proposed multi-task model outperformed existing APC techniques. Superior results were obtained by AraBERT and LCF-ATEPC with a fusion layer (AR-LCF-ATEPC-Fusion) and the proposed data augmentation word embedding-based method (FastText) on the combined dataset.
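The word-substitution augmentation idea can be illustrated in a drastically simplified form: the paper constrains substitutions with FastText/AraBERT embeddings, whereas here a hypothetical hand-written synonym table plays that role.

```python
# Toy word-substitution augmentation; the synonym table is a hypothetical
# stand-in for embedding-nearest-neighbour lookups.
import random

SYNONYMS = {"great": ["excellent", "fine"], "food": ["meal", "dish"]}

def augment(tokens, p=0.5, seed=0):
    """Replace each known token with a synonym with probability p."""
    rng = random.Random(seed)
    out = []
    for tok in tokens:
        if tok in SYNONYMS and rng.random() < p:
            out.append(rng.choice(SYNONYMS[tok]))
        else:
            out.append(tok)
    return out

print(augment(["the", "food", "was", "great"]))
```

Generated variants are then merged with the originals, which is exactly the "combined dataset" setting evaluated above.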
Despite the extensive effort to improve intelligent educational tools for smart learning environments, automatic Arabic essay scoring remains a big research challenge. The nature of the writing style of the Arabic language makes the problem even more complicated. This study designs, implements, and evaluates an automatic Arabic essay scoring system. The proposed system starts by pre-processing the student answer and model answer dataset using data cleaning and natural language processing tasks. It then comprises two main components: the grading engine and the adaptive fusion engine. The grading engine employs string-based and corpus-based similarity algorithms separately. After that, the adaptive fusion engine prepares students' scores to be delivered to different feature selection algorithms, such as Recursive Feature Elimination and Boruta. Then, machine learning algorithms such as Decision Tree, Random Forest, AdaBoost, Lasso, Bagging, and K-Nearest Neighbors are employed to improve the suggested system's efficiency. The experimental results for the grading engine showed that the Extracting DIStributionally similar words using CO-occurrences similarity measure achieved the best correlation values. Furthermore, in the adaptive fusion engine, the Random Forest algorithm outperforms all other machine learning algorithms using the (80%–20%) splitting method on the original dataset. It achieves 91.30%, 94.20%, 0.023, 0.106, and 0.153 in terms of Pearson's Correlation Coefficient, Willmott's Index of Agreement, Mean Square Error, Mean Absolute Error, and Root Mean Square Error metrics, respectively.
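The grading-engine idea, scoring a student answer by its similarity to a model answer, can be sketched as follows; the answers and the bag-of-words cosine similarity are illustrative simplifications of the string- and corpus-based measures the system employs.

```python
# Similarity-based grading sketch: cosine similarity between bag-of-words
# vectors of a student answer and a model answer. Toy answers only.
import math

def bow(text):
    counts = {}
    for w in text.lower().split():
        counts[w] = counts.get(w, 0) + 1
    return counts

def cosine(a, b):
    dot = sum(a[w] * b.get(w, 0) for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

model_answer = "photosynthesis converts light energy into chemical energy"
students = ["photosynthesis converts light into chemical energy",
            "plants are green"]
scores = [cosine(bow(s), bow(model_answer)) for s in students]
print([round(s, 2) for s in scores])  # first answer scores far higher
```

In the full system, several such similarity scores per answer feed the fusion engine, which learns to combine them into a final grade.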
Aspect-Based Sentiment Analysis (ABSA) on Arabic corpora has recently become an active research topic. ABSA refers to a fine-grained Sentiment Analysis (SA) task that focuses on the extraction of the conferred aspects and the identification of the respective sentiment polarity from the provided text. Most of the prevailing Arabic ABSA techniques depend heavily on tedious feature engineering and pre-processing tasks and utilize external sources such as lexicons. In the literature on Arabic language text analysis, authors have made use of regular Machine Learning (ML) techniques that rely on a group of rare sources and tools for processing and analyzing Arabic-language content, such as lexicons. However, an important challenge in this domain is the unavailability of sufficient and reliable resources. Against this background, the current study introduces a new Battle Royale Optimization with Fuzzy Deep Learning for Arabic Aspect-Based Sentiment Classification (BROFDL-AASC) technique. The aim of the presented BROFDL-AASC model is to detect and classify sentiments in the Arabic language. In the presented BROFDL-AASC model, data pre-processing is performed first to convert the input data into a useful format. Besides, the BROFDL-AASC model includes a Discriminative Fuzzy-based Restricted Boltzmann Machine (DFRBM) model for the identification and categorization of sentiments. Furthermore, the BRO algorithm is exploited for optimal fine-tuning of the hyperparameters related to the DFRBM model. This scenario establishes the novelty of the current study. The performance of the proposed BROFDL-AASC model was validated, and the outcomes demonstrate the supremacy of the BROFDL-AASC model over other existing models.
Abstract: With the rapid growth of internet usage, a new situation has been created that enables the practice of bullying. Cyberbullying has increased over the past decade, and it has the same adverse effects as face-to-face bullying, like anger, sadness, anxiety, and fear. With the anonymity people get on the internet, they tend to be more aggressive and express their emotions freely without considering the effects, which can be a reason for the increase in cyberbullying and is the main motive behind the current study. This study presents a thorough background of cyberbullying and the techniques used to collect, preprocess, and analyze the datasets. Moreover, a comprehensive review of the literature has been conducted to identify research gaps and effective techniques and practices in cyberbullying detection in various languages, and it was deduced that there is significant room for improvement in the Arabic language. As a result, the current study focuses on the investigation of shortlisted machine learning algorithms in natural language processing (NLP) for the classification of Arabic datasets duly collected from Twitter (also known as X). In this regard, Support Vector Machine (SVM), Naive Bayes (NB), Random Forest (RF), Logistic Regression (LR), Bootstrap Aggregating (Bagging), Gradient Boosting (GBoost), Light Gradient Boosting Machine (LightGBM), Adaptive Boosting (AdaBoost), and eXtreme Gradient Boosting (XGBoost) were shortlisted and investigated due to their effectiveness in similar problems. Finally, the scheme was evaluated with well-known performance measures such as accuracy, precision, recall, and F1-score. Consequently, XGBoost exhibited the best performance with 89.95% accuracy, which is promising compared to the state-of-the-art.
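Naive Bayes, one of the shortlisted classifiers, can be illustrated with a minimal from-scratch sketch on a bag-of-words representation. The toy English tokens, labels, and Laplace smoothing choice below are illustrative assumptions, not the paper's actual Arabic data or pipeline:

```python
from collections import Counter
import math

def train_multinomial_nb(docs, labels, alpha=1.0):
    """Train a multinomial Naive Bayes model on tokenized documents."""
    classes = set(labels)
    vocab = {tok for doc in docs for tok in doc}
    priors, cond = {}, {}
    for c in classes:
        c_docs = [d for d, l in zip(docs, labels) if l == c]
        priors[c] = math.log(len(c_docs) / len(docs))
        counts = Counter(tok for d in c_docs for tok in d)
        total = sum(counts.values())
        # Laplace-smoothed log-likelihood of each vocabulary token given class c
        cond[c] = {t: math.log((counts[t] + alpha) / (total + alpha * len(vocab)))
                   for t in vocab}
    return priors, cond, vocab

def predict_nb(model, doc):
    """Pick the class with the highest posterior log-score."""
    priors, cond, vocab = model
    scores = {c: priors[c] + sum(cond[c][t] for t in doc if t in vocab)
              for c in priors}
    return max(scores, key=scores.get)

# Toy training data (hypothetical tokens, standing in for preprocessed tweets)
docs = [["you", "are", "stupid"], ["have", "a", "nice", "day"],
        ["stupid", "idiot"], ["nice", "work", "friend"]]
labels = ["bully", "normal", "bully", "normal"]
model = train_multinomial_nb(docs, labels)
print(predict_nb(model, ["stupid", "loser"]))  # → bully
```

In practice the study's full pipeline would vectorize the tweets (for example with TF-IDF) and compare all nine shortlisted classifiers; this sketch shows only the probabilistic core of one of them.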
Abstract: Recognizing handwritten characters remains a critical and formidable challenge within the realm of computer vision. Although considerable strides have been made in enhancing English handwritten character recognition through various techniques, deciphering Arabic handwritten characters is particularly intricate. This complexity arises from the diverse array of writing styles among individuals, coupled with the various shapes that a single character can take when positioned differently within document images, rendering the task more perplexing. In this study, a novel segmentation method for Arabic handwritten scripts is suggested. This work aims to locate the local minima of the vertical and diagonal word-image densities to precisely identify the segmentation points between the cursive letters. The proposed method starts with pre-processing the word image without affecting its main features, then calculates the directional pixel density of the word image by scanning it vertically and from angles 30° to 90° to count the pixel density from all directions and address the problem of overlapping letters, which is common in many people's Arabic handwriting. Local minima and thresholds are also determined to identify the ideal segmentation area. The proposed technique is tested on samples obtained from two datasets: a self-curated image dataset and the IFN/ENIT dataset. The results demonstrate that the proposed method achieves a significant improvement in the proportion of correct cursive segmentations, reaching 92.96% on our dataset and 89.37% on the IFN/ENIT dataset.
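The vertical (90°) scan described above can be sketched as follows: compute the column-wise ink density of a binary word image and keep the local minima at or below a threshold as candidate segmentation points. This is a simplified single-angle sketch on a toy image; the threshold, the minimum test, and the diagonal scans (30° to 90°) of the actual method are not reproduced here:

```python
def vertical_density(img):
    """Column-wise count of ink pixels (1s) in a binary word image."""
    return [sum(col) for col in zip(*img)]

def segmentation_points(density, threshold):
    """Columns whose density is a local minimum at or below the threshold."""
    points = []
    for x in range(1, len(density) - 1):
        if (density[x] <= threshold
                and density[x] <= density[x - 1]
                and density[x] <= density[x + 1]):
            points.append(x)
    return points

# Toy 4-row image: two dense strokes joined by a thin stroke at column 3
img = [
    [1, 1, 0, 0, 0, 1, 1],
    [1, 1, 0, 1, 0, 1, 1],
    [1, 1, 0, 1, 0, 1, 1],
    [1, 1, 0, 0, 0, 1, 1],
]
d = vertical_density(img)                    # [4, 4, 0, 2, 0, 4, 4]
print(segmentation_points(d, threshold=1))   # → [2, 4]
```

The low-density columns on either side of the thin connecting stroke are flagged as candidate cut points, which is the intuition behind using projection minima between cursive letters.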
Abstract: Handwritten character recognition (HCR) involves identifying characters in images, documents, and various sources such as forms, surveys, questionnaires, and signatures, and transforming them into a machine-readable format for subsequent processing. Successfully recognizing complex and intricately shaped handwritten characters remains a significant obstacle. The use of convolutional neural networks (CNNs) in recent developments has notably advanced HCR, leveraging the ability to extract discriminative features from extensive sets of raw data. Because of the absence of pre-existing datasets in the Kurdish language, we created a Kurdish handwritten dataset called KurdSet. The dataset consists of Kurdish characters, digits, texts, and symbols, was collected from 1560 participants, and contains 45,240 characters. In this study, only the characters from our dataset were used for handwritten character recognition. The study also utilizes various models, including InceptionV3, Xception, DenseNet121, and a custom CNN model. To show the performance of the KurdSet dataset, we compared it to the Arabic handwritten character recognition dataset (AHCD) by applying the models to both datasets. Additionally, the performance of the models is evaluated using test accuracy, which measures the percentage of correctly classified characters in the evaluation phase. All models performed well in the training phase; DenseNet121 exhibited the highest accuracy, achieving 99.80% on the Kurdish dataset, and the Xception model achieved 98.66% on the Arabic dataset.
Abstract: Handwritten character recognition is considered more challenging than machine-printed character recognition due to the different human writing styles. Arabic is morphologically rich, and its characters have a high similarity. The Arabic language includes 28 characters, and each character has up to four shapes according to its location in the word (at the beginning, middle, or end, or isolated). This paper proposes 12 CNN architectures for recognizing handwritten Arabic characters. The proposed architectures were derived from popular CNN architectures, such as VGG, ResNet, and Inception, to make them applicable to recognizing character-size images. The experimental results on three well-known datasets showed that the proposed architectures significantly enhanced the recognition rate compared to the baseline models. The experiments also showed that data augmentation improved the models' accuracies on all tested datasets. The proposed model outperformed most of the existing approaches, with best results of 93.05%, 98.30%, and 96.88% on the HIJJA, AHCD, and AIA9K datasets, respectively.
Abstract: Spices are defined as any aromatic condiment of plant origin used to alter the flavor and aroma of foods. Besides flavor and aroma, many spices have antioxidant activity, mainly related, in cloves, to the presence of phenolic compounds such as flavonoids, terpenoids, and eugenol. In turn, the most common uses of gum arabic are in the form of powder for addition to soft drink syrups, cuisine, and baked goods, specifically to stabilize the texture of products, increase the viscosity of liquids, and promote the leavening of baked products (e.g., cakes). Both eugenol, extracted from cloves, and gum arabic, extracted from the hardened sap of two species of the Acacia tree, are dietary constituents routinely consumed virtually throughout the world. Both are also widely used medicinally to inhibit oxidative stress and genotoxicity. The prevention arm of the study included groups Ia, IIa, IIIa, IVa, V, VI, VII, and VIII. Once a week for 20 weeks, the controls received saline s.c. while the experimental groups received DMH at 20 mg/kg s.c. During the same period and for an additional 9 weeks, the animals received either water, 10% GA, EUG, or 10% GA + EUG by gavage. The treatment arm of the study included groups Ib, IIb, IIIb, IVb, IX, X, XI, and XII. Once a week for 20 weeks, the controls received saline s.c. while the experimental groups received DMH at 20 mg/kg s.c. During the subsequent 9 weeks, the animals received either water, 10% GA, EUG, or 10% GA + EUG by gavage. The novelty of this study is the investigation of their use alone and together for the prevention and treatment of experimental colorectal carcinogenesis induced by dimethylhydrazine. Our results show that the combined use of 10% gum arabic and eugenol was effective, with antioxidant action in the colon, reducing oxidative stress and preventing and treating genotoxicity in all colon segments. Furthermore, their joint administration reduced the number of aberrant crypts and the number of aberrant crypt foci (ACF) in the distal segment and the entire colon, as well as the number of ACF with at least 5 crypts in the entire colon. Thus, our results also demonstrate the synergistic effects of 10% gum arabic together with eugenol (from cloves), with antioxidant, antigenotoxic, and anticarcinogenic actions (prevention and treatment), at the doses and durations studied, in the colon of rats submitted to colorectal carcinogenesis induced by dimethylhydrazine.
Funding: Princess Nourah bint Abdulrahman University Researchers Supporting Project Number (PNURSP2022R263), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia; and the Deanship of Scientific Research at Umm Al-Qura University, which supported this work by Grant Code 22UQU4340237DSR39.
Abstract: Handwritten character recognition has become a challenging research problem, and many studies have addressed the recognition of letters in various languages. The availability of Arabic handwritten character databases, however, remains limited. Almost a quarter of a billion people worldwide write and speak Arabic, and many historical books and documents of Arab nations, a vital data source, are written in Arabic. Recently, Arabic handwritten character recognition (AHCR) has grabbed attention and has become a difficult topic for pattern recognition and computer vision (CV). Therefore, this study develops a fireworks optimization with deep learning-based AHCR (FWODL-AHCR) technique. The major intention of the FWODL-AHCR technique is to recognize the distinct handwritten characters of the Arabic language. It initially pre-processes the handwritten images to improve their quality. Then, a RetinaNet-based deep convolutional neural network is applied as a feature extractor to produce feature vectors. Next, the deep echo state network (DESN) model is utilized to classify the handwritten characters. Finally, the FWO algorithm is exploited as a hyperparameter tuning strategy to boost recognition performance. Various simulations were performed in series to exhibit the enhanced performance of the FWODL-AHCR technique. The comparison study portrayed the supremacy of the FWODL-AHCR technique over other approaches, with 99.91% and 98.94% accuracy on the Hijja and AHCD datasets, respectively.
Abstract: Gum Arabic (GA) from Acacia senegal var. kerensis has been approved as an emulsifier, stabilizer, thickener, and encapsulator in the food processing industry. Chia mucilage, on the other hand, has been approved for use as a fat and egg-yolk mimic. However, both chia mucilage and gum Arabic are underutilized locally in Kenya, and few reports have been published despite their potential to alter functional properties in food products. In this study, the potential use of chia mucilage and gum Arabic was evaluated in the development of an eggless fat-reduced mayonnaise (FRM). The mayonnaise substitute was prepared by replacing eggs and partially substituting sunflower oil with chia mucilage at 15%, 30%, 45%, and 60% levels and gum Arabic at 3%, while reducing the oil levels to 15%, 30%, 45%, and 60%. The effects of different concentrations of oil and chia mucilage on the physicochemical properties (e.g., pH, emulsion stability, moisture content, protein, carbohydrate, fat, calories, ash, and titratable acidity, determined using AOAC methods) and on the sensory properties, covering both consumer acceptability and quantitative descriptive analysis, were evaluated and compared with a control containing eggs and 75% sunflower oil. The results indicated that all fat-reduced mayonnaises had significantly lower energy (493 kcal/100 g) and fat content (20%) but higher moisture content (0.74) than the control (784 kcal/100 g, 77% fat, and 0.39 moisture). These differences increased with increasing substitution levels of chia mucilage, which affected pH, carbohydrate, and protein. There was no significant difference in ash content between the fat-reduced mayonnaises and the control. Sensory evaluation demonstrated that mayonnaises substituted with chia seed mucilage and gum Arabic were accepted. All parameters were positively correlated with overall acceptability, with flavor having the strongest correlation (r = 0.78). Loadings from principal component analysis (PCA) of 16 sensory attributes of mayonnaise showed that over 66% of the variation in sensory attributes was explained by the first six principal components. This study shows good potential for chia mucilage and gum Arabic to be used as fat and egg mimetics and stabilizers, respectively, in mayonnaise with functional properties.
Abstract: Dough improvers are substances with functional characteristics used in the baking industry to enhance dough properties. Currently, the baking industry faces increasing demand for natural ingredients owing to growing consumer awareness, which contributes to the rising demand for natural hydrocolloids. Gum Arabic from Acacia senegal var. kerensis is a natural gum exhibiting excellent water-binding and emulsification capacity. However, very little has been reported on how it affects the rheological properties of wheat dough. The aim of this study was, therefore, to determine the rheological properties of wheat dough with partial additions of gum Arabic as an improver. Six treatments were analyzed: flour-gum blends prepared by adding gum Arabic to wheat flour at different levels (1%, 2%, and 3%), plain wheat flour (negative control), and commercial bread flour and commercial chapati flour (positive controls). The rheological properties were determined using a Brabender Farinograph, Brabender Extensograph, and Brabender Viscograph. Results showed that addition of gum Arabic significantly (p chapati. These findings support the need to utilize gum Arabic from Acacia senegal var. kerensis as a dough improver.
Abstract: This study investigated the perceptions of English educators and supervisors in Jeddah Governorate regarding the process of teaching English to elementary students. A survey was conducted with a sample of 94 educators and 10 supervisors. The data indicate that respondents considered English instruction at the elementary level essential for expanding students' perspectives, improving academic performance, and promoting international involvement. The main advantages cited are the development of English language skills and the promotion of early education. The disadvantages, although less easily noticeable, include potential negative impacts on an individual's proficiency in Arabic and sense of national identity. The highlighted challenges encompass insufficient teacher training, student reluctance towards English, limited resources, and school disparities. The proposed techniques focused on prioritizing English teachers' training, ensuring the use of appropriate content, utilizing technology, and promoting awareness among students and educators. The current research found various obstacles to teaching English at the elementary stage. To overcome these obstacles, it will be essential to enhance teacher competencies, develop efficient teaching methods, secure the backing of stakeholders, assign adequate resources, and carry out continuous evaluations. Further research can also contribute to a better understanding of how early English learning impacts Arabic identity and proficiency.
Funding: Princess Nourah bint Abdulrahman University Researchers Supporting Project Number (PNURSP2024R263), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia. This study is also supported via funding from Prince Sattam bin Abdulaziz University, Project Number (PSAU/2024/R/1445).
Abstract: In recent years, the usage of social networking sites has considerably increased in the Arab world. It has empowered individuals to express their opinions, especially in politics. Furthermore, various organizations operating in Arab countries have embraced social media in their day-to-day business activities at different scales, which is attributed to business owners' understanding of social media's importance for business development. However, Arabic morphology is complicated to process due to the availability of nearly 10,000 roots and more than 900 patterns that act as the basis for verbs and nouns. Hate speech on online social networking sites has become a worldwide issue that reduces the cohesion of civil societies. Against this background, the current study develops a Chaotic Elephant Herd Optimization with Machine Learning for Hate Speech Detection (CEHOML-HSD) model for the Arabic language. The presented CEHOML-HSD model concentrates on identifying and categorizing Arabic text into hate speech and normal speech. To attain this, the CEHOML-HSD model follows several sub-processes. At the initial stage, the CEHOML-HSD model performs data pre-processing with the help of a TF-IDF vectorizer. Secondly, the Support Vector Machine (SVM) model is utilized to detect and classify hate speech texts written in Arabic. Lastly, the CEHO approach, developed by combining chaotic functions with the classical EHO algorithm, is employed for fine-tuning the parameters of the SVM. The design of the CEHO algorithm for parameter tuning shows the novelty of the work. A widespread experimental analysis was executed to validate the enhanced performance of the proposed CEHOML-HSD approach, and the comparative study outcomes established its supremacy over other approaches.
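The TF-IDF vectorization used in the pre-processing stage can be sketched from scratch as below. A real system would use a library vectorizer, and the IDF variant here is one common choice, not necessarily the paper's; the toy English documents stand in for preprocessed Arabic text:

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """TF-IDF weights for a list of tokenized documents.

    tf = term count / document length; idf = log(N / df) + 1.
    """
    n = len(docs)
    # Document frequency: in how many documents each token appears
    df = Counter(tok for doc in docs for tok in set(doc))
    idf = {t: math.log(n / df[t]) + 1 for t in df}
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: (tf[t] / len(doc)) * idf[t] for t in tf})
    return vectors

docs = [["hate", "speech", "is", "harmful"],
        ["kind", "speech", "is", "welcome"]]
vecs = tfidf_vectors(docs)
# "speech" appears in both documents, so it is down-weighted relative to "hate"
print(vecs[0]["hate"] > vecs[0]["speech"])  # → True
```

The resulting sparse weight vectors are what a downstream classifier such as the SVM would consume.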
Abstract: The COVID-19 pandemic caused significant disruptions in the field of education worldwide, including in the United Arab Emirates. Teachers and students had to adapt to remote learning and virtual classrooms, leading to various challenges in maintaining educational standards. The sudden transition to remote teaching could have a negative impact on students' reading abilities, especially in the Arabic language. To gain insight into the unique challenges encountered by Arabic language teachers in the UAE, a survey was conducted to explore their assessment of teaching quality, student-teacher interaction, and learning outcomes amidst the COVID-19 pandemic. The results of the survey revealed a significant decline in students' reading abilities and identified several major issues in online Arabic language teaching. These issues included limited interaction between students and teachers, challenges in monitoring students' class participation and performance, and difficulties in effectively assessing students' reading skills. The results also demonstrated other challenges faced by Arabic language teachers, including a lack of preparedness, a lack of subscriptions to relevant platforms, and a lack of resources for online learning. Several solutions to these challenges are proposed, including reevaluating the balance between depth and breadth in the curriculum, integrating language skills into the curriculum more effectively, providing more comprehensive teacher professional development, implementing student grouping strategies, utilizing retired and expert teachers in specific content areas, allocating time for interventions, and improving support from both teachers and parents to ensure the quality of online learning.
Funding: The authors extend their appreciation to the Deanship of Scientific Research at King Khalid University for funding this work through the Small Groups Project under Grant Number (120/43); Princess Nourah bint Abdulrahman University Researchers Supporting Project Number (PNURSP2022R281), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia. The authors would also like to thank the Deanship of Scientific Research at Umm Al-Qura University for supporting this work by Grant Code (22UQU4331004DSR32).
Abstract: Nowadays, the usage of social media platforms is rapidly increasing, and rumours or false information are also rising, especially among Arab nations. This false information is harmful to society and individuals, so blocking and detecting the spread of fake news in Arabic becomes critical. Several artificial intelligence (AI) methods, including contemporary transformer techniques such as BERT, have been used to detect fake news. This article develops a new hunter-prey optimization with hybrid deep learning-based fake news detection (HPOHDL-FND) model for an Arabic corpus. The HPOHDL-FND technique undergoes extensive data pre-processing steps to transform the input data into a useful format. Besides, the HPOHDL-FND technique utilizes a long short-term memory recurrent neural network (LSTM-RNN) model for fake news detection and classification. Finally, the hunter-prey optimization (HPO) algorithm is exploited for optimal modification of the hyperparameters of the LSTM-RNN model. The performance of the HPOHDL-FND technique was validated on two Arabic datasets. The outcomes exemplified better performance than other existing techniques, with maximum accuracies of 96.57% and 93.53% on the Covid19Fakes and satirical datasets, respectively.
Abstract: This study reviews the latest contributions to Arabic Optical Character Recognition (OCR) during the last decade, helping interested researchers learn the existing techniques and extend or adapt them accordingly. The study describes the characteristics of the Arabic language, different types of OCR systems, the different stages of an Arabic OCR system, the researchers' contributions at each stage, and the evaluation metrics for OCR. The study reviews the existing datasets for Arabic OCR and their characteristics, and additionally implements some pre-processing and segmentation stages of Arabic OCR. The study compares the performance of the existing methods in terms of recognition accuracy; in addition to researchers' OCR methods, commercial and open-source systems are included in the comparison. The Arabic language is morphologically rich and written cursively, with dots and diacritics above and below the characters. Most of the existing approaches in the literature were evaluated on isolated characters or isolated words under a controlled environment, and few approaches were tested on page-level scripts. Some comparative studies show that the accuracy of the existing commercial Arabic OCR systems is low, under 75% for printed text, and further improvement is needed. Moreover, most of the current approaches are offline OCR systems, and there is no remarkable contribution to online OCR systems.
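One of the standard OCR evaluation metrics mentioned above is recognition accuracy, commonly reported through the character error rate (CER): the edit distance between the recognized text and the ground truth, divided by the reference length. A minimal sketch of this standard metric (not tied to any specific system reviewed):

```python
def levenshtein(a, b):
    """Edit distance between two strings via dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,               # deletion
                           cur[j - 1] + 1,            # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

def char_error_rate(reference, hypothesis):
    """CER = edit distance / reference length."""
    return levenshtein(reference, hypothesis) / len(reference)

# One substitution (n→m) and one deleted "i": 2 edits over 11 characters
print(char_error_rate("recognition", "recogmtion"))  # ≈ 0.18
```

Word error rate is computed the same way over token lists instead of characters.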
Abstract: Text classification is an essential task for many applications in the Natural Language Processing domain. It can be applied in many fields, such as Information Retrieval, Knowledge Extraction, and Knowledge Modeling. Despite the importance of this task, Arabic text classification tools still suffer from many problems and remain incapable of handling the increasing volume of Arabic content that circulates on the web or resides in large databases. This paper introduces a novel machine learning-based approach that exclusively uses hybrid (stylistic and semantic) features. First, we clean the Arabic documents and translate them into English using translation tools. The semantic features are then automatically extracted from the translated documents using an existing database of English topics. Besides, the model automatically extracts from the textual content a set of stylistic features, such as word and character frequencies and punctuation. We therefore obtain three types of features: semantic, stylistic, and hybrid. Using each feature type in turn, we performed an in-depth comparison study of nine well-known machine learning models to evaluate our approach on a standard Arabic corpus. The obtained results show that the Neural Network outperforms the other models and provides good performance using hybrid features (F1-score = 0.88).
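Stylistic features such as word and character frequencies and punctuation can be extracted with a few lines of code. The particular feature names and statistics below are illustrative assumptions, not the paper's exact feature set:

```python
import string

def stylistic_features(text):
    """Simple stylistic profile: length, word, and punctuation statistics."""
    words = text.split()
    n_chars = len(text)
    n_punct = sum(ch in string.punctuation for ch in text)
    return {
        "n_words": len(words),
        "n_chars": n_chars,
        # Average word length, ignoring attached punctuation
        "avg_word_len": sum(len(w.strip(string.punctuation))
                            for w in words) / max(len(words), 1),
        "punct_ratio": n_punct / max(n_chars, 1),
    }

feats = stylistic_features("Well, this is a short example!")
print(feats["n_words"], round(feats["punct_ratio"], 3))
```

Each document is thus mapped to a fixed-length numeric vector that can be concatenated with the semantic features to form the hybrid representation.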
Funding: Princess Nourah bint Abdulrahman University Researchers Supporting Project Number (PNURSP2022R263), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia. The authors would like to thank the Deanship of Scientific Research at Umm Al-Qura University for supporting this work by Grant Code 22UQU4340237DSR43.
Abstract: Arabic is one of the world's oldest languages, characterized by rich and complicated grammatical forms. Furthermore, Arabic morphology can be perplexing because nearly 10,000 roots and 900 patterns form the basis for verbs and nouns. The Arabic language consists of distinct variations utilized within a community and in particular situations. Social media sites are a medium for expressing opinions and for social phenomena like racism, hatred, offensive language, and all kinds of verbal violence. Such conduct does not impact particular nations, communities, or groups only; it extends beyond such areas into people's everyday lives. This study introduces an Improved Ant Lion Optimizer with Deep Learning Driven Offensive and Hate Speech Detection (IALODL-OHSD) on Arabic cross-corpora. The presented IALODL-OHSD model mainly aims to detect and classify offensive/hate speech expressed on social media. In the IALODL-OHSD model, a three-stage process is performed, namely pre-processing, word embedding, and classification. Primarily, data pre-processing is performed to transform the Arabic social media text into a useful format. In addition, the word2vec word embedding process is utilized to produce word embeddings. The attention-based cascaded long short-term memory (ACLSTM) model is utilized for the classification process. Finally, the IALO algorithm is exploited as a hyperparameter optimizer to boost classifier results. To illustrate the results of the IALODL-OHSD model, a detailed set of simulations was performed. The extensive comparison study portrayed the enhanced performance of the IALODL-OHSD model over other approaches.
Funding: The authors are thankful to the Deanship of Scientific Research at Najran University for funding this work under the National Research Priorities funding program, code number NU/NRP/SERC/11/3.
Abstract: Sentiment Analysis (SA) is one of the Machine Learning (ML) techniques that has been investigated by several researchers in recent years, especially due to the evolution of novel data collection methods focused on social media. In the literature, it has been reported that SA data has been created for English in excess of any other language. Performing SA on Arabic Twitter data is challenging owing to the informal nature and rich morphology of the Arabic language. Earlier studies of SA for Arabic Twitter focused mostly on automatic extraction of features from the text. Neural word embedding has been employed in the literature, since it is less labor-intensive than manual feature engineering. By ignoring the context of sentiment, most word-embedding models capture only the syntactic information of words. The current study presents a new Dragonfly Optimization with Deep Learning Enabled Sentiment Analysis for Arabic Tweets (DFODL-SAAT) model. The aim of the presented DFODL-SAAT model is to distinguish the sentiments of opinions tweeted in the Arabic language. First, data cleaning and pre-processing steps are performed to convert the input tweets into a useful format. In addition, the TF-IDF model is exploited as a feature extractor to generate the feature vectors. Besides, an attention-based bidirectional long short-term memory (ABLSTM) technique is applied for the identification and classification of sentiments. At last, the hyperparameters of the ABLSTM model are optimized using the DFO algorithm. The performance of the proposed DFODL-SAAT model was validated using a benchmark dataset, and the outcomes were investigated under different aspects. The experimental outcomes highlight the superiority of the DFODL-SAAT model over recent approaches.
Abstract: Aspect-based sentiment analysis (ABSA) is a fine-grained process. Its fundamental subtasks are aspect term extraction (ATE) and aspect polarity classification (APC), and these subtasks are dependent and closely related. However, most existing works on Arabic ABSA address them separately, assume that aspect terms are pre-identified, or use a pipeline model. Pipeline solutions design different models for each task, and the output of the ATE model is used as the input to the APC model, which may result in error propagation among steps because APC is affected by ATE errors. These methods are impractical for real-world scenarios, where the ATE task is the base task for APC and its result impacts the accuracy of APC. Thus, in this study, we focused on a multi-task learning model for Arabic ATE and APC in which the model is jointly trained on the two subtasks simultaneously in a single model. This paper integrates the multi-task model, namely Local Context Focus-Aspect Term Extraction and Polarity Classification (LCF-ATEPC), and Arabic Bidirectional Encoder Representations from Transformers (AraBERT) as a shared layer for Arabic contextual text representation. The LCF-ATEPC model is based on multi-head self-attention and a local context focus (LCF) mechanism to capture the interactive information between an aspect and its context. Moreover, data augmentation techniques are proposed based on state-of-the-art augmentation techniques (word embedding substitution with constraints and contextual embedding (AraBERT)) to increase the diversity of the training dataset. This paper examined the effect of data augmentation on the multi-task model for Arabic ABSA. Extensive experiments were conducted on the original and combined datasets (merging the original and augmented datasets). Experimental results demonstrate that the proposed multi-task model outperformed existing APC techniques. Superior results were obtained by AraBERT and LCF-ATEPC with a fusion layer (AR-LCF-ATEPC-Fusion) and the proposed data augmentation word embedding-based method (FastText) on the combined dataset.
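The word-embedding substitution idea behind the augmentation method above can be sketched with a toy example. Everything here is a stand-in: the tiny three-dimensional embedding table replaces a real FastText model, and the 0.8 similarity threshold is an invented constraint, not the paper's setting.

```python
# Toy sketch of word-embedding substitution with a similarity constraint.
# A real implementation would look up neighbours in a 300-d FastText model.
import numpy as np

embeddings = {  # made-up 3-d vectors standing in for FastText embeddings
    "ممتاز": np.array([0.90, 0.10, 0.00]),
    "رائع":  np.array([0.85, 0.15, 0.05]),   # near-synonym of "ممتاز"
    "سيئ":   np.array([-0.90, 0.20, 0.10]),
}

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def augment(tokens, threshold=0.8):
    """Replace a token with its most similar neighbour when the cosine
    similarity clears the threshold (the substitution 'constraint')."""
    out = []
    for tok in tokens:
        best, best_sim = tok, threshold
        if tok in embeddings:
            for cand, vec in embeddings.items():
                if cand != tok:
                    sim = cosine(embeddings[tok], vec)
                    if sim > best_sim:
                        best, best_sim = cand, sim
        out.append(best)
    return out

print(augment(["الخدمة", "ممتاز"]))  # out-of-vocabulary words are kept as-is
```

Each augmented sentence produced this way adds lexical variety to the training set while the similarity constraint keeps the label plausible.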
Abstract: Despite the extensive effort to improve intelligent educational tools for smart learning environments, automatic Arabic essay scoring remains a big research challenge. The nature of the writing style of the Arabic language makes the problem even more complicated. This study designs, implements, and evaluates an automatic Arabic essay scoring system. The proposed system starts with pre-processing the student answer and model answer dataset using data cleaning and natural language processing tasks. Then, it comprises two main components: the grading engine and the adaptive fusion engine. The grading engine employs string-based and corpus-based similarity algorithms separately. After that, the adaptive fusion engine prepares students' scores to be delivered to different feature selection algorithms, such as Recursive Feature Elimination and Boruta. Then, machine learning algorithms such as Decision Tree, Random Forest, AdaBoost, Lasso, Bagging, and K-Nearest Neighbors are employed to improve the suggested system's efficiency. The experimental results in the grading engine showed that the DISCO measure (Extracting DIStributionally similar words using CO-occurrences) achieved the best correlation values. Furthermore, in the adaptive fusion engine, the Random Forest algorithm outperforms all other machine learning algorithms using the (80%-20%) splitting method on the original dataset. It achieves 91.30%, 94.20%, 0.023, 0.106, and 0.153 in terms of Pearson's Correlation Coefficient, Willmott's Index of Agreement, Mean Square Error, Mean Absolute Error, and Root Mean Square Error metrics, respectively.
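A minimal stand-in for the string-based arm of the grading engine can be sketched with a character-level similarity ratio between a student answer and the model answer. The example answers and the 0-10 score scale are illustrative placeholders, and `SequenceMatcher` stands in for the paper's string-based similarity algorithms.

```python
# Sketch: score a student answer against the model answer by string
# similarity, one of the two similarity families the grading engine uses.
from difflib import SequenceMatcher

def string_similarity_score(student_answer, model_answer, max_score=10):
    """Scale the 0..1 similarity ratio to an illustrative 0..max_score mark."""
    ratio = SequenceMatcher(None, student_answer, model_answer).ratio()
    return round(ratio * max_score, 2)

model = "الخلية هي الوحدة الأساسية لبناء الكائن الحي"
student = "الخلية هي الوحدة الأساسية للكائن الحي"

score = string_similarity_score(student, model)
print(score)  # partial-credit score on the 0..10 scale
```

In the full system such per-answer scores would then be passed to the adaptive fusion engine, whose feature selection and regression models refine the final mark.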
Funding: The authors are grateful to the Taif University Researchers Supporting Project Number (TURSP-2020/36), Taif University, Taif, Saudi Arabia.
Abstract: Automatic Speaker Identification (ASI) involves distinguishing an audio stream associated with numerous speakers' utterances. Some common aspects, such as framework differences, the overlapping of different sound events, and the presence of various sound sources during recording, make the ASI task much more complicated and complex. This research proposes a deep learning model to improve the accuracy of the ASI system and reduce the model training time under limited computation resources. In this research, the performance of the transformer model is investigated. Seven audio features, namely chromagram, Mel-spectrogram, tonnetz, Mel-Frequency Cepstral Coefficients (MFCCs), delta MFCCs, delta-delta MFCCs, and spectral contrast, are extracted from the ELSDSR, CSTR-VCTK, and Ar-DAD datasets. The evaluation of various experiments demonstrates that the best performance was achieved by the proposed transformer model using all seven audio features on all datasets. For ELSDSR, CSTR-VCTK, and Ar-DAD, the highest attained accuracies are 0.99, 0.97, and 0.99, respectively. The experimental results reveal that the proposed technique can achieve the best performance for ASI problems.
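One way the seven frame-level feature streams can be combined into a single fixed-length input vector is sketched below. The frame count and per-feature dimensions (e.g. 40 MFCCs, 12 chroma bins) are placeholder choices, and random arrays stand in for real features, which a pipeline like the one above would extract from audio with a library such as librosa.

```python
# Sketch: pool seven frame-level audio feature matrices into one clip-level
# vector. Shapes and values are placeholders, not the paper's configuration.
import numpy as np

rng = np.random.default_rng(0)
n_frames = 120  # placeholder number of analysis frames for one clip

features = {  # name -> (n_frames, dim) frame-level feature matrix
    "chromagram": rng.normal(size=(n_frames, 12)),
    "mel_spectrogram": rng.normal(size=(n_frames, 64)),
    "tonnetz": rng.normal(size=(n_frames, 6)),
    "mfcc": rng.normal(size=(n_frames, 40)),
    "delta_mfcc": rng.normal(size=(n_frames, 40)),
    "delta2_mfcc": rng.normal(size=(n_frames, 40)),
    "spectral_contrast": rng.normal(size=(n_frames, 7)),
}

# Mean-pool each feature over time, then concatenate into one vector that a
# transformer (or any other classifier) can consume per clip.
clip_vector = np.concatenate([m.mean(axis=0) for m in features.values()])
print(clip_vector.shape)  # 12+64+6+40+40+40+7 = 209 dimensions
```

Mean pooling is only one option; a transformer could equally attend over the raw frame sequence rather than a pooled summary.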
Funding: Princess Nourah bint Abdulrahman University Researchers Supporting Project Number (PNURSP2022R281), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia. The authors would like to thank the Deanship of Scientific Research at Umm Al-Qura University for supporting this work by Grant Code 22UQU4340237DSR52.
Abstract: Aspect-Based Sentiment Analysis (ABSA) on Arabic corpora has become an active research topic in recent days. ABSA refers to a fine-grained Sentiment Analysis (SA) task that focuses on the extraction of the conferred aspects and the identification of the respective sentiment polarity from the provided text. Most of the prevailing Arabic ABSA techniques heavily depend upon dreary feature-engineering and pre-processing tasks and utilize external sources such as lexicons. In the literature concerning Arabic text analysis, the authors made use of regular Machine Learning (ML) techniques that rely on a group of rare sources and tools. These sources were used for processing and analyzing Arabic-language content, like lexicons. However, an important challenge in this domain is the unavailability of sufficient and reliable resources. Against this background, the current study introduces a new Battle Royale Optimization with Fuzzy Deep Learning for Arabic Aspect Based Sentiment Classification (BROFDL-AASC) technique. The aim of the presented BROFDL-AASC model is to detect and classify sentiments in the Arabic language. In the presented BROFDL-AASC model, data pre-processing is performed first to convert the input data into a useful format. Besides, the BROFDL-AASC model includes a Discriminative Fuzzy-based Restricted Boltzmann Machine (DFRBM) model for the identification and categorization of sentiments. Furthermore, the BRO algorithm is exploited for optimal fine-tuning of the hyperparameters related to the DFRBM model. This scenario establishes the novelty of the current study. The performance of the proposed BROFDL-AASC model was validated, and the outcomes demonstrate the supremacy of the BROFDL-AASC model over other existing models.
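The hyperparameter fine-tuning idea that BRO performs in this abstract can be illustrated with a much simpler stand-in: a plain random search over one hyperparameter of an off-the-shelf classifier. The toy dataset, the logistic regression model, and the log-scale search range are all illustrative substitutes, not the paper's DFRBM or BRO.

```python
# Sketch: random-search hyperparameter tuning as a simple stand-in for the
# BRO metaheuristic. Data, model, and search space are illustrative only.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=200, n_features=10, random_state=0)
rng = np.random.default_rng(0)

best_c, best_acc = None, -1.0
for _ in range(10):                 # each draw plays the role of one search agent
    c = 10 ** rng.uniform(-3, 2)    # sample regularization strength on a log scale
    acc = cross_val_score(
        LogisticRegression(C=c, max_iter=1000), X, y, cv=3
    ).mean()
    if acc > best_acc:              # keep the best-scoring configuration
        best_c, best_acc = c, acc

print(best_c, best_acc)
```

A population-based optimizer such as BRO explores the same kind of search space, but guides successive candidates using the fitness of earlier ones instead of sampling independently.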