Amid the increasingly severe global natural eco-environment, it is necessary to build a natural ecological civilization by constructing an ecological civilization discourse. Against this background, this study compiles a corpus of natural ecological discourse in the English translation of Xi Jinping: The Governance of China (hereafter Xi). Using WordSmith and AntConc, the study explores the linguistic features of the ecological discourse in the English translation of Xi along several dimensions: high-frequency words, keywords, word collocations, and concordance lines. The study aims to analyze the concepts of and attitudes towards natural ecology, so as to provide valuable insights for the construction of China's discourse on natural ecological civilization. It found that the natural ecological discourse in the English translation of Xi is ecologically beneficial.
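The wordlist step of such an analysis can be sketched in a few lines of Python. This is only an illustrative stand-in for what WordSmith or AntConc compute (their keyness statistics, dispersion measures, and concordances are far richer), and the sample text is invented:

```python
from collections import Counter
import re

def frequency_list(text, top_n=5):
    """Rank word forms by raw frequency, as a corpus wordlist tool would."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(tokens).most_common(top_n)

sample = ("ecological civilization requires ecological discourse; "
          "ecological discourse shapes civilization")
print(frequency_list(sample, 3))
# [('ecological', 3), ('civilization', 2), ('discourse', 2)]
```

Keyword extraction would then compare such a frequency list against a reference corpus using a keyness statistic such as log-likelihood.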
Using the Corpus of Contemporary American English as the source data, this paper carries out a corpus-based behavioral profile study of four near-synonymous adjectives (serious, severe, grave, and grievous), focusing on their register and the types of nouns they each modify. Although they share a core meaning, these adjectives vary in formality and usage patterns. The fine-grained usage differences identified here complement the current inadequacies in descriptions of these adjectives. The study also reaffirms the effectiveness of the corpus-based behavioral profile approach for examining synonym differences.
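The noun-modification dimension of a behavioral profile can be sketched as below, assuming adjective-noun pairs have already been extracted from parsed corpus hits; the pairs here are invented toy data, not COCA output:

```python
from collections import defaultdict, Counter

# Toy adjective-noun bigrams standing in for parsed corpus hits.
pairs = [("serious", "problem"), ("serious", "injury"),
         ("severe", "weather"), ("severe", "injury"),
         ("grave", "concern"), ("grievous", "wound")]

# One behavioral-profile dimension: which nouns each adjective modifies.
profile = defaultdict(Counter)
for adj, noun in pairs:
    profile[adj][noun] += 1

print(dict(profile["serious"]))
# {'problem': 1, 'injury': 1}
```

A full behavioral profile would tag each hit with many more ID features (register, syntactic slot, noun semantic class) before comparing the resulting distributions.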
Nowadays, the usage of social media platforms is rapidly increasing, and rumours and false information are also rising, especially among Arab nations. This false information is harmful to society and individuals, so blocking and detecting the spread of fake news in Arabic becomes critical. Several artificial intelligence (AI) methods, including contemporary transformer techniques such as BERT, have been used to detect fake news. This article develops a new hunter-prey optimization with hybrid deep learning-based fake news detection (HPOHDL-FND) model on an Arabic corpus. The HPOHDL-FND technique applies extensive data pre-processing steps to transform the input data into a useful format. It then utilizes a long short-term memory recurrent neural network (LSTM-RNN) model for fake news detection and classification. Finally, the hunter-prey optimization (HPO) algorithm is exploited for optimal modification of the hyperparameters of the LSTM-RNN model. The performance of the HPOHDL-FND technique was validated on two Arabic datasets. The outcomes showed better performance than other existing techniques, with maximum accuracies of 96.57% and 93.53% on the Covid19Fakes and satirical datasets, respectively.
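For orientation, a bag-of-words Naive Bayes baseline of the kind such detectors are compared against can be written in pure Python. This is a hedged illustration, not the HPOHDL-FND model, and the four training headlines are invented:

```python
import math
from collections import Counter, defaultdict

class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing over whitespace tokens."""
    def fit(self, docs, labels):
        self.counts = defaultdict(Counter)   # per-class word counts
        self.totals = Counter()              # per-class token totals
        self.priors = Counter(labels)        # class frequencies
        self.vocab = set()
        for doc, y in zip(docs, labels):
            for w in doc.split():
                self.counts[y][w] += 1
                self.totals[y] += 1
                self.vocab.add(w)

    def predict(self, doc):
        n, v = sum(self.priors.values()), len(self.vocab)
        best, best_lp = None, float("-inf")
        for y in self.priors:
            lp = math.log(self.priors[y] / n)
            for w in doc.split():
                lp += math.log((self.counts[y][w] + 1) / (self.totals[y] + v))
            if lp > best_lp:
                best, best_lp = y, lp
        return best

nb = NaiveBayes()
nb.fit(["vaccine cures everything instantly", "officials confirm new policy",
        "miracle drug secret revealed", "report details budget vote"],
       ["fake", "real", "fake", "real"])
print(nb.predict("secret miracle cures"))
# fake
```

The paper's LSTM-RNN instead learns sequence-sensitive representations, and HPO tunes its hyperparameters rather than relying on fixed smoothing constants.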
Daily newspapers publish a tremendous amount of information disseminated through the Internet. Freely available and easily accessible large online repositories are not indexed and are in an unprocessable format. The major hindrance in developing and evaluating existing or new monolingual text-in-image systems is that this content is not linked and indexed, and there is no method to reuse online news images because of the unavailability of standardized benchmark corpora, especially for South Asian languages. A corpus is a vital resource for developing and evaluating text-in-image news-reuse systems in general, and specifically for the Urdu language. The lack of indexing, primarily semantic indexing, of daily news items makes them impracticable for querying; even the most straightforward search facilities do not support these unindexed news resources. Our study addresses this gap by associating and marking newspaper images in one of the most widely spoken but under-resourced languages, Urdu. The present work proposes a method to build a benchmark corpus of news in image form by introducing a web crawler. The corpus is then semantically linked and annotated with daily news items. Two techniques are proposed for image annotation, free annotation and fixed cross-examination annotation, of which the second achieved higher accuracy. A news ontology was built in Protégé using the Web Ontology Language (OWL), and the annotations were indexed under it. An application linked with Protégé was also built so that readers and journalists have an interface to query news items directly. Similarly, news items linked together provide complete coverage and bring different opinions to a single location for readers to analyse themselves.
An obvious challenge in named entity recognition is the construction of entity data sets of this kind. Although some research has been conducted on entity database construction, most of it targets Wikipedia, and a minority targets structured entities such as people, locations, and organizational nouns in news text. This paper focuses on the identification of scientific entities in English-language literature, using carbonate platforms in sedimentology as an example. Firstly, since literature in key disciplines is often written by multidisciplinary experts, this paper designs a literature content extraction method that can deal with complex text structures. Secondly, based on the extracted content, we formalize the entity extraction task (lexicon- and lexical-based entity extraction). Furthermore, to test the accuracy of entity extraction, three currently popular recognition methods are chosen to perform entity detection. Experiments show that the entity data set provided by the lexicon- and lexical-based entity extraction method is of significant assistance to the named entity recognition task. This work presents a pilot study of entity extraction from complexly structured, specialized English literature on carbonate platforms.
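A lexicon-based extraction pass of the sort described can be sketched as greedy longest-match lookup over tokens; the mini-lexicon and sentence below are illustrative, not the paper's actual resources:

```python
def lexicon_tag(tokens, lexicon):
    """Greedy longest-match lookup of multi-word terms in a domain lexicon.

    Returns (matched term, start index, end index) triples.
    """
    entities, i = [], 0
    while i < len(tokens):
        # Try the longest candidate span first, shrinking toward a single token.
        for j in range(len(tokens), i, -1):
            span = " ".join(tokens[i:j]).lower()
            if span in lexicon:
                entities.append((span, i, j))
                i = j
                break
        else:
            i += 1
    return entities

lexicon = {"carbonate platform", "reef", "lagoon facies"}
tokens = "The carbonate platform rimmed a shallow lagoon facies belt".split()
print(lexicon_tag(tokens, lexicon))
# [('carbonate platform', 1, 3), ('lagoon facies', 6, 8)]
```

Lexical rules (suffix patterns, head-noun cues) would extend this beyond exact dictionary hits; the resulting silver annotations then feed the NER models under test.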
Text classification or categorization is the procedure of automatically tagging a textual document with its most related labels or classes. When the number of labels is limited to one, the task becomes single-label text categorization. Like English texts, Arabic texts contain unstructured information; to make them understandable to machine learning (ML) techniques, the text is transformed into numerical representations. In recent times, the dominant methods for natural language processing (NLP) tasks have been recurrent neural networks (RNNs), in particular long short-term memory (LSTM), and convolutional neural networks (CNNs). Deep learning (DL) models are currently used to derive a massive number of deep text features for optimal performance in distinct domains such as text detection, medical image analysis, and so on. This paper introduces a Modified Dragonfly Optimization with Extreme Learning Machine for Text Representation and Recognition (MDFO-EMTRR) model on an Arabic corpus. The presented MDFO-EMTRR technique mainly concentrates on the recognition and classification of Arabic text. To achieve this, the MDFO-EMTRR technique encompasses data pre-processing to transform the input data into a compatible format. Next, the ELM model is utilized for the representation and recognition of the Arabic text. At last, the MDFO algorithm is exploited for optimal tuning of the parameters of the ELM method, thereby accomplishing enhanced classifier results. The experimental analysis of the MDFO-EMTRR system was performed on benchmark datasets and attained a maximum accuracy of 99.74%.
Currently, video captioning models based on an encoder-decoder mainly rely on a single video input source. The content of such captioning is limited, since few studies have employed external corpus information to guide generation, which is not conducive to accurate description and understanding of video content. To address this issue, a novel video captioning method guided by a sentence retrieval generation network (ED-SRG) is proposed in this paper. First, a ResNeXt network model, an efficient convolutional network for online video understanding (ECO) model, and a long short-term memory (LSTM) network model are integrated to construct an encoder-decoder, which extracts the 2D features, 3D features, and object features of video data respectively. These features are decoded to generate textual sentences that conform to the video content for sentence retrieval. Then, a sentence-transformer network model is employed to retrieve sentences in an external corpus that are semantically similar to those textual sentences, and candidate sentences are screened out through similarity measurement. Finally, a novel GPT-2 network model is constructed based on the GPT-2 network structure. The model introduces a designed random selector that randomly selects predicted words with high probability in the corpus, guiding the generation of textual sentences more in line with natural human language expression. The proposed method is compared with several existing works by experiments. The results show that the indicators BLEU-4, CIDEr, ROUGE_L, and METEOR are improved by 3.1%, 1.3%, 0.3%, and 1.5% on the public dataset MSVD, and by 1.3%, 0.5%, 0.2%, and 1.9% on the public dataset MSR-VTT, respectively. The proposed method can thus generate video captioning with richer semantics than several state-of-the-art approaches.
Sentiment analysis (SA) of the Arabic language has become important despite scarce annotated corpora and confined sources, and Arabic affect analysis is now an active research area. Still, the Arabic language lacks adequate language resources for enabling SA tasks, and it faces challenges in natural language processing (NLP) because of its structural complexity, history, and distinct cultures; it has received less effort than other languages. This paper develops a Multi-verse Optimization with Deep Reinforcement Learning Enabled Affect Analysis (MVODRL-AA) model for an Arabic corpus. The presented MVODRL-AA model majorly concentrates on identifying and classifying affects or emotions occurring in the Arabic corpus. Firstly, the MVODRL-AA model performs data pre-processing and word embedding, using an n-gram model to generate the word embeddings. A deep Q-learning network (DQLN) model is then exploited to identify and classify affect in the Arabic corpus. At last, the MVO algorithm is used as a hyperparameter tuning approach to adjust the hyperparameters of the DQLN model, showing the novelty of the work. A series of simulations was carried out to exhibit the promising performance of the MVODRL-AA model; the outcomes illustrate its improvement over other approaches, with an accuracy of 99.27%.
Natural Language Processing (NLP) for the Arabic language has gained much significance in recent years. The most commonly utilized NLP task is text classification, whose main intention is to apply Machine Learning (ML) approaches to automatically classify textual files into one or more pre-defined categories. In ML, the first and most crucial step is identifying an appropriate large dataset to train and test the method; Deep Learning (DL), one of the trending ML techniques, needs huge volumes of varied data for training to yield the best outcomes. Against this background, the current study designs a new Dice Optimization with a Deep Hybrid Boltzmann Machine-based Arabic Corpus Classification (DODHBM-ACC) model. The presented DODHBM-ACC model primarily relies on several stages of pre-processing and the word2vec word embedding process. For Arabic text classification, the DHBM technique is utilized; it is a hybrid of the Deep Boltzmann Machine (DBM) and the Deep Belief Network (DBN) and has the advantage of learning the decisive intention of the classification process. To adjust the hyperparameters of the DHBM technique, the Dice Optimization Algorithm (DOA) is exploited. The experimental analysis establishes the superior performance of the proposed DODHBM-ACC model over other recent approaches.
Background: The relationship between microRNA (miRNA) expression patterns and tumor mutation burden (TMB) in uterine corpus endometrial carcinoma (UCEC) was investigated in this study. Methods: The UCEC dataset from The Cancer Genome Atlas (TCGA) database was used to identify miRNAs whose expression differs between high-TMB and low-TMB sample sets. The total sample set was divided into a training set and a test set. TMB levels were predicted using miRNA-based signature classifiers developed by Lasso Cox regression, and the test set was used to validate the classifier. The study investigated the relationship between the miRNA-based signature classifier and three immune checkpoint molecules (programmed cell death protein 1 [PD-1], programmed cell death ligand 1 [PD-L1], and cytotoxic T lymphocyte-associated antigen 4 [CTLA-4]), and functional enrichment analysis was performed on the classifier's miRNAs. Results: We identified 27 differentially expressed miRNAs for the miRNA-based signature. For predicting the TMB level, the 27-miRNA-based signature classifier had accuracies of 0.8689 in the training cohort, 0.8276 in the test cohort, and 0.8524 in the total cohort. The classifier correlated negatively with PD-1 and positively with PD-L1 and CTLA-4. Based on the miRNA profiling described above, we validated the expression levels of 9 miRNAs in clinical samples by quantitative reverse transcription PCR (qRT-PCR); four of them were highly expressed, and many cancer-related and immune-associated biological processes were linked to these 27 miRNAs. Thus, the developed miRNA-based signature classifier was correlated with TMB levels and could also predict TMB levels in UCEC samples. Conclusion: This study investigated the relationship between a miRNA-based signature classifier and TMB levels in uterine corpus endometrial carcinoma. It is the first study to confirm this relationship in clinical samples, which may provide further supporting evidence for immunotherapy of endometrial cancer.
Sentiment Analysis (SA) of natural language text is not only a challenging process but also one that gains significance in various Natural Language Processing (NLP) applications, including education (to improve learning and teaching processes), marketing strategies, customer trend prediction, and the stock market. Various researchers have applied lexicon-based approaches, Machine Learning (ML) techniques, and so on to conduct SA for multiple languages, for instance English and Chinese. Given the increased popularity of Deep Learning (DL) models, the current study uses diverse configuration settings of a Convolutional Neural Network (CNN) model and conducts SA for Hindi movie reviews. The study introduces an Effective Improved Metaheuristics with Deep Learning-Enabled Sentiment Analysis for Movie Reviews (IMDLSA-MR) model. The presented IMDLSA-MR technique first applies different levels of pre-processing to convert the input data into a compatible format. Besides, the Term Frequency-Inverse Document Frequency (TF-IDF) model is exploited to generate word vectors from the pre-processed data. The Deep Belief Network (DBN) model is utilized to analyse and classify the sentiments. Finally, the improved Jellyfish Search Optimization (IJSO) algorithm is utilized for optimal fine-tuning of the hyperparameters of the DBN model, which shows the novelty of the work. Different experimental analyses were conducted to validate the better performance of the proposed IMDLSA-MR model; the comparative study highlights its enhanced performance over recent DL models, with a maximum accuracy of 98.92%.
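The TF-IDF word-vector step can be sketched in plain Python; this is a smoothed illustrative variant, and the toy reviews are invented rather than taken from the paper's Hindi dataset:

```python
import math
from collections import Counter

def tfidf(docs):
    """Term frequency x smoothed inverse document frequency, per document."""
    n_docs = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(doc.split()))          # document frequency per term
    vectors = []
    for doc in docs:
        tf = Counter(doc.split())
        total = sum(tf.values())
        vectors.append({w: (c / total) * math.log((1 + n_docs) / (1 + df[w]))
                        for w, c in tf.items()})
    return vectors

reviews = ["great movie great plot", "terrible movie", "great acting"]
vecs = tfidf(reviews)
# In the second review, rare "terrible" outweighs common "movie".
```

Each dictionary in `vecs` is one sparse document vector; a classifier such as the paper's DBN would consume a dense matrix built from these weights.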
Currently, individuals use online social media such as Facebook or Twitter to share their thoughts and emotions. Detecting emotions on social networking sites is useful in several applications in social welfare, commerce, public health, and so on. Emotion is expressed in several ways, such as facial and speech expressions, gestures, and written text. Emotion recognition in a text document is a content-based classification problem that draws on notions from the deep learning (DL) and natural language processing (NLP) domains. This article proposes a Deer Hunting Optimization with Deep Belief Network Enabled Emotion Classification (DHODBN-EC) model for English Twitter data. The presented DHODBN-EC model aims to examine the existence of distinct emotion classes in tweets. At the introductory level, the DHODBN-EC technique pre-processes the tweets at different levels. Besides, the word2vec feature extraction process is applied to generate the word embeddings. For emotion classification, the DHODBN-EC model utilizes the DBN model, which helps determine distinct emotion class labels. Lastly, the DHO algorithm is leveraged for optimal hyperparameter adjustment of the DBN technique. An extensive range of experimental analyses demonstrates the enhanced performance of the DHODBN-EC approach; a comprehensive comparison study exhibits its improvement over other approaches, with an increased accuracy of 96.67%.
The term 'corpus' refers to a huge volume of structured datasets containing machine-readable texts generated in a natural communicative setting. The explosion of social media has permitted individuals to spread data freely with minimal examination and filtering. Because of this, the old problem of fake news has resurfaced and become an important concern due to its negative impact on the community. To manage the spread of fake news, automatic recognition approaches have been investigated using Artificial Intelligence (AI) and Machine Learning (ML) techniques. ML approaches have been applied quite effectively to medicinal text classification tasks, but a huge human effort is still required to generate labelled training data. The recent progress of Deep Learning (DL) methods seems a promising way to tackle difficult Natural Language Processing (NLP) tasks, especially fake news detection, and an automatic text classifier is highly helpful for unlocking social media data. The current research article focuses on the design of an Optimal Quad Channel Hybrid Long Short-Term Memory-based Fake News Classification (QCLSTM-FNC) approach, which aims to identify and differentiate fake news from actual news. To attain this, the proposed QCLSTM-FNC approach follows two methods: a data pre-processing method and a GloVe-based word embedding process. Besides, the QCLSTM model is utilized for classification, and to boost its classification results, a Quasi-Oppositional Sandpiper Optimization (QOSPO) algorithm is utilized to fine-tune the hyperparameters. The proposed QCLSTM-FNC approach was experimentally validated against a benchmark dataset and successfully outperformed all other existing DL models under different measures.
Aspect-Based Sentiment Analysis (ABSA) on Arabic corpora has become an active research topic in recent days. ABSA refers to a fine-grained Sentiment Analysis (SA) task that focuses on extracting the conferred aspects and identifying the respective sentiment polarity in the provided text. Most prevailing Arabic ABSA techniques depend heavily on dreary feature engineering and pre-processing tasks and utilize external sources such as lexicons. In the literature on Arabic text analysis, authors have made use of regular Machine Learning (ML) techniques that rely on a group of rare sources and tools for processing and analyzing Arabic-language content, such as lexicons. However, an important challenge in this domain is the unavailability of sufficient and reliable resources. Against this background, the current study introduces a new Battle Royale Optimization with Fuzzy Deep Learning for Arabic Aspect-Based Sentiment Classification (BROFDL-AASC) technique. The aim of the presented BROFDL-AASC model is to detect and classify sentiments in the Arabic language. In the presented model, data pre-processing is performed first to convert the input data into a useful format. Besides, the BROFDL-AASC model includes a Discriminative Fuzzy-based Restricted Boltzmann Machine (DFRBM) model for the identification and categorization of sentiments. Furthermore, the BRO algorithm is exploited for optimal fine-tuning of the hyperparameters of the DFRBM model. This establishes the novelty of the current study. The performance of the proposed BROFDL-AASC model was validated, and the outcomes demonstrate its supremacy over other existing models.
Gender analysis of Twitter can reveal significant socio-cultural differences between female and male users. Efforts have been made to analyze and automatically infer gender from content in more commonly spoken languages, but limited work has been undertaken for Arabic; most research targets English, with the least effort devoted to non-English languages. Arabic demographic inference such as gender is relatively uncommon for social networking users, especially on Twitter. Therefore, this study aims to design an optimal marginalized stacked denoising autoencoder for gender identification on Arabic Twitter (OMSDAE-GIAT) model. The presented OMSDAE-GIAT technique mainly concentrates on the identification and classification of gender in Twitter data. To attain this, the OMSDAE-GIAT model first performs data pre-processing and word embedding. Next, the MSDAE model is exploited to classify gender into two classes, namely male and female. In the final stage, the OMSDAE-GIAT technique uses an enhanced bat optimization algorithm (EBOA) for the parameter tuning process, showing the novelty of our work. The performance of the OMSDAE-GIAT model was inspected on an Arabic corpus dataset and measured under distinct metrics. The comparison study reports the enhanced performance of the OMSDAE-GIAT model over other recent approaches.
Taking the transitivity system of Halliday's Systemic Functional Linguistics as its theoretical framework, this paper selects The Economist's series of reports on ChatGPT as its corpus, uses the transitivity annotation software UAM Corpus Tool 3.3 to annotate the news reports and analyze their constituents, and explores the deeper discourse meaning of the reports. The study mainly answers three questions: 1) How are transitivity participants distributed in the selected news corpus? 2) How are the six transitivity process types distributed in the corpus? 3) How are circumstantial elements distributed in the corpus? The paper reveals the attitudes hidden behind news reports from Western countries, which should help readers understand the attitudes behind such reports and the meanings they imply.
The difficulty of learning English relates not only to interest but also to the correctness of learning methods. In English teaching especially, a comprehensive and in-depth mastery of vocabulary can improve learners' English proficiency, help them acquire English knowledge better, and raise their level of cross-cultural communication. Applying a corpus to English classroom vocabulary teaching can provide more educational space for vocabulary teaching, enrich teaching methods, and at the same time help students learn vocabulary and lay a foundation for learning the English language. To this end, this article first describes the important role of corpus applications in vocabulary learning in English classroom teaching. Secondly, it discusses the difficulties of vocabulary learning and the factors that affect learning quality. Finally, to enhance students' learning effect and improve the teaching level, several learning strategies are formulated to continuously highlight the practicality of the corpus.
The discrimination of synonyms has always been one of the great challenges for English learners. Taking assessment and evaluation as examples, this study analyses the similarities and differences between the two words, as well as their usage, from the perspectives of frequency, stylistics, collocation, and semantic prosody with the help of the British National Corpus, and demonstrates the importance of corpus retrieval tools in synonym discrimination. Furthermore, the paper gives some suggestions for English learners and teachers in English vocabulary teaching.
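The collocation step of such a comparison can be sketched as a window-based collocate count; the sentence below is invented, but a concordancer query against the BNC returns the same kind of table for each node word:

```python
from collections import Counter

def collocates(tokens, node, window=2):
    """Count words within +/-window positions of each occurrence of the node."""
    hits = Counter()
    for i, tok in enumerate(tokens):
        if tok == node:
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    hits[tokens[j]] += 1
    return hits

text = ("risk assessment informs policy while programme evaluation "
        "measures outcomes and risk assessment guides teaching").split()
print(collocates(text, "assessment", 1))
# Counter({'risk': 2, 'informs': 1, 'guides': 1})
```

Comparing such collocate tables for assessment versus evaluation (ideally with an association measure like MI or log-likelihood on top of the raw counts) is what surfaces the semantic-prosody differences the study describes.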
We use many devices in our daily life to communicate with others. In this modern world, people use email, Facebook, Twitter, and many other social network sites to exchange information. People lose valuable time misspelling and retyping, and some are reluctant to type long sentences because they run into unnecessary words or grammatical issues. Word-predictive systems therefore help all people exchange textual information more quickly, easily, and comfortably. These systems predict the next most probable words and let users choose the needed word from the suggestions. Word prediction can help the writer by predicting the next word and helping complete the sentence correctly. This research aims to forecast the most suitable next word to complete a sentence for any given context. We have worked on the Bangla language and present a process that can predict the next most probable and proper words and suggest a complete sentence using the predicted words. A GRU-based RNN has been used on an N-gram dataset to develop the proposed model. We collected a large dataset from multiple sources in the Bangla language and compared the model with other approaches such as LSTM and Naive Bayes; the suggested approach provides better accuracy than the others. Here, the Unigram model provides 88.22%, the Bi-gram model 99.24%, the Tri-gram model 97.69%, and the 4-gram and 5-gram models 99.43% and 99.78% average accuracy, respectively. We believe our proposed method will have a profound impact on Bangla search engines.
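As a point of reference, the Bi-gram variant of such a predictor reduces to a maximum-likelihood lookup table. The sketch below (with invented toy sentences) shows that idea; the paper's GRU-based RNN instead learns the conditional distribution rather than counting it:

```python
from collections import Counter, defaultdict

class BigramPredictor:
    """Maximum-likelihood bigram model: suggest the most frequent followers."""
    def __init__(self):
        self.next_counts = defaultdict(Counter)

    def train(self, sentences):
        for sentence in sentences:
            words = sentence.split()
            for a, b in zip(words, words[1:]):
                self.next_counts[a][b] += 1   # count observed followers of a

    def predict(self, word, k=1):
        """Return the k most probable next words after `word`."""
        return [w for w, _ in self.next_counts[word].most_common(k)]

model = BigramPredictor()
model.train(["i like mangoes", "i like tea", "we like tea", "i am here"])
print(model.predict("like"))
# ['tea']
```

Higher-order n-grams extend the key from one word to a tuple of the previous n-1 words, at the cost of sparser counts, which is where a learned model like a GRU gains its advantage.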
This study takes the image of China's military as constructed in the British media's military coverage of China as its object of study, and creates two reference corpora for comparative research, which are used to better analyse and compare the ideologies hidden in the British media's coverage of China's military. From the analysis of the corpus, it is concluded that, owing to China's rapid military and economic development, Western countries have developed a sense of crisis and have used media reports to interfere in China's foreign relations and mislead public thinking.
文摘Amid the increasingly severe global natural eco-environment,it is necessary to build a natural ecological civilization by constructing an ecological civilization discourse.Against this background,this study compiles a corpus of natural ecological discourse in the English translation of Xi Jinping:The Governance of China(short for Xi).By using Wordsmith and AntConc,this study explores the linguistic features of the ecological discourse in English translation of Xi in the following dimensions,including high-frequency words,keywords,word collocations,concordance lines.This study aims to analyze the concepts and attitudes towards natural ecology,so as to provide certain valuable insights for the construction of China’s discourse on natural ecological civilization.The study found that the natural ecology discourse involving in the English translation of Xi turned out to be ecologically beneficial.
Abstract: Using the Corpus of Contemporary American English as the source data, this paper carries out a corpus-based behavioral profile study of four near-synonymous adjectives (serious, severe, grave, and grievous), focusing on their registers and the types of nouns they each modify. Although they share a core meaning, these adjectives differ in formality and usage patterns. The fine-grained usage differences identified here address gaps in existing descriptions of these adjectives. Furthermore, the study reaffirms the effectiveness of the corpus-based behavioral profile approach for examining differences between synonyms.
Funding: The authors extend their appreciation to the Deanship of Scientific Research at King Khalid University for funding this work through the Small Groups Project under Grant Number (120/43); Princess Nourah bint Abdulrahman University Researchers Supporting Project Number (PNURSP2022R281), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia. The authors would like to thank the Deanship of Scientific Research at Umm Al-Qura University for supporting this work by Grant Code: (22UQU4331004DSR32).
Abstract: Nowadays, the usage of social media platforms is rapidly increasing, and rumours and false information are also rising, especially among Arab nations. This false information is harmful to society and individuals, so blocking and detecting the spread of fake news in Arabic has become critical. Several artificial intelligence (AI) methods, including contemporary transformer techniques such as BERT, have been used to detect fake news. This article develops a new hunter-prey optimization with hybrid deep learning-based fake news detection (HPOHDL-FND) model for the Arabic corpus. The HPOHDL-FND technique applies extensive data pre-processing steps to transform the input data into a useful format. Besides, it utilizes a long short-term memory recurrent neural network (LSTM-RNN) model for fake news detection and classification. Finally, the hunter-prey optimization (HPO) algorithm is exploited for optimal adjustment of the hyperparameters of the LSTM-RNN model. The performance of the HPOHDL-FND technique was validated on two Arabic datasets. The outcomes show better performance than other existing techniques, with maximum accuracies of 96.57% and 93.53% on the Covid19Fakes and satirical datasets, respectively.
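The abstract does not specify the internals of the hunter-prey optimization algorithm, so as a generic stand-in the following sketch shows the hyperparameter-tuning step as a random search over a toy validation loss. The search space, objective, and parameter names are hypothetical, not taken from the paper:

```python
import random

def random_search(objective, space, n_iter=500, seed=42):
    """Minimize `objective` by sampling hyperparameters uniformly from `space`.
    A generic stand-in for metaheuristics such as hunter-prey optimization."""
    rng = random.Random(seed)
    best_params, best_score = None, float("inf")
    for _ in range(n_iter):
        params = {k: rng.uniform(lo, hi) for k, (lo, hi) in space.items()}
        score = objective(params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Toy "validation loss" with a known optimum at lr=0.1, dropout=0.3.
def toy_loss(p):
    return (p["lr"] - 0.1) ** 2 + (p["dropout"] - 0.3) ** 2

space = {"lr": (0.0, 1.0), "dropout": (0.0, 1.0)}
params, loss = random_search(toy_loss, space)
```

In the paper, the objective would be the LSTM-RNN's validation loss and the sampling strategy would follow the HPO predator/prey update rules rather than uniform sampling.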
Funding: King Saud University through Researchers Supporting Project Number (RSP-2021/387), King Saud University, Riyadh, Saudi Arabia.
Abstract: Daily newspapers publish a tremendous amount of information disseminated through the Internet. Freely available and easily accessible large online repositories are unindexed and in an unprocessable format. The major hindrance to developing and evaluating existing or new monolingual text-in-image systems is that this material is neither linked nor indexed. There is no method to reuse online news images because of the unavailability of standardized benchmark corpora, especially for South Asian languages. A corpus is a vital resource for developing and evaluating text-in-image systems that reuse local news in general, and specifically for the Urdu language. The lack of indexing, primarily semantic indexing, of daily news items makes them impracticable to query; even the most straightforward search facilities do not support these unindexed news resources. Our study addresses this gap by associating and marking newspaper images in one of the widely spoken but under-resourced languages, Urdu. The present work proposes a method to build a benchmark corpus of news in image form by introducing a web crawler. The corpus is then semantically linked and annotated with daily news items. Two techniques are proposed for image annotation, free annotation and fixed cross-examination annotation; the second technique achieved higher accuracy. A news ontology was built in Protégé using the Web Ontology Language (OWL), and the annotations were indexed under it. An application linked with Protégé was also built so that readers and journalists have an interface for querying news items directly. Similarly, news items linked together provide complete coverage and bring different opinions together in a single location for readers to analyse themselves.
Funding: Supported by the National Natural Science Foundation of China under Grant No. 42050102, the National Science Foundation of China (Grant No. 62001236), and the Natural Science Foundation of the Jiangsu Higher Education Institutions of China (Grant No. 20KJA520003).
Abstract: A challenging problem in named entity recognition is the construction of entity datasets. Although some research has been conducted on entity database construction, the majority of it targets Wikipedia, or structured entities such as people, locations, and organizational nouns in the news. This paper focuses on the identification of scientific entities in English literature, taking carbonate platforms in sedimentology as an example. Firstly, given that literature in key disciplines is likely to be written by multidisciplinary experts, this paper designs a literature content extraction method that can deal with complex text structures. Secondly, based on the extracted content, we formalize the entity extraction task as lexicon- and lexical-based entity extraction. Furthermore, to test the accuracy of entity extraction, three currently popular recognition methods are chosen to perform entity detection. Experiments show that the entity dataset provided by the lexicon- and lexical-based extraction method is of significant assistance for the named entity recognition task. This study presents a pilot study of entity extraction from complexly structured, specialized English literature on carbonate platforms.
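A minimal sketch of the lexicon-based entity extraction step might look as follows. The lexicon entries and labels are invented for illustration; they are not the paper's actual resources:

```python
import re

LEXICON = {  # hypothetical domain lexicon entries
    "carbonate platform": "LANDFORM",
    "platform": "LANDFORM",
    "reef": "LANDFORM",
    "dolomite": "LITHOLOGY",
}

def extract_entities(text):
    """Return (surface form, label, start) for each lexicon match,
    matching longer entries first so 'carbonate platform' beats 'platform'."""
    hits = []
    for term in sorted(LEXICON, key=len, reverse=True):
        for m in re.finditer(r"\b" + re.escape(term) + r"\b", text.lower()):
            # Skip a match nested inside an already-accepted longer match.
            if not any(s <= m.start() < e for _, _, s, e in hits):
                hits.append((term, LEXICON[term], m.start(), m.end()))
    return sorted((t, lab, s) for t, lab, s, _ in hits)

ents = extract_entities(
    "The isolated carbonate platform contains dolomite and a reef margin.")
```

A real system would combine such lexicon matches with lexical features (capitalization, suffixes, context words) fed to a statistical recognizer.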
Funding: Princess Nourah bint Abdulrahman University Researchers Supporting Project Number (PNURSP2022R263), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia; the Deanship of Scientific Research at Umm Al-Qura University for supporting this work by Grant Code: 22UQU4340237DSR35.
Abstract: Text classification or categorization is the procedure of automatically tagging a textual document with its most related labels or classes. When the number of labels is limited to one, the task becomes single-label text categorization. Arabic texts, like English texts, contain unstructured information; to be understandable to machine learning (ML) techniques, the text is transformed into and represented by numerical values. In recent times, the dominant methods for natural language processing (NLP) tasks are recurrent neural networks (RNNs), in particular long short-term memory (LSTM), and convolutional neural networks (CNNs). Deep learning (DL) models are currently used to derive a massive number of deep text features for optimal performance in distinct domains such as text detection, medical image analysis, and so on. This paper introduces a Modified Dragonfly Optimization with Extreme Learning Machine for Text Representation and Recognition (MDFO-EMTRR) model for the Arabic corpus. The presented MDFO-EMTRR technique mainly concentrates on the recognition and classification of Arabic text. To achieve this, the MDFO-EMTRR technique encompasses data pre-processing to transform the input data into a compatible format. Next, the ELM model is utilized for the representation and recognition of the Arabic text. At last, the MDFO algorithm is exploited for optimal tuning of the parameters of the ELM method, thereby accomplishing enhanced classifier results. The experimental analysis of the MDFO-EMTRR system was performed on benchmark datasets and attained a maximum accuracy of 99.74%.
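The extreme learning machine (ELM) at the core of the MDFO-EMTRR pipeline fixes a random hidden layer and solves only for the output weights in closed form. The following is a self-contained sketch on a toy problem, using pure Python and regularized least squares in place of a linear-algebra library; the dataset and all sizes are illustrative:

```python
import math
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def train_elm(X, y, hidden=12, ridge=1e-6, seed=0):
    """ELM: random tanh hidden layer, output weights by ridge least squares."""
    rng = random.Random(seed)
    W = [[rng.uniform(-1, 1) for _ in X[0]] for _ in range(hidden)]
    b = [rng.uniform(-1, 1) for _ in range(hidden)]
    def features(x):
        return [math.tanh(sum(wi * xi for wi, xi in zip(w, x)) + bj)
                for w, bj in zip(W, b)]
    H = [features(x) for x in X]
    # beta = (H^T H + ridge*I)^-1 H^T y
    G = [[sum(H[k][i] * H[k][j] for k in range(len(H)))
          + (ridge if i == j else 0.0) for j in range(hidden)]
         for i in range(hidden)]
    t = [sum(H[k][i] * y[k] for k in range(len(H))) for i in range(hidden)]
    beta = solve(G, t)
    def predict(x):
        s = sum(hi * bi for hi, bi in zip(features(x), beta))
        return 1 if s >= 0 else -1
    return predict

# Toy demonstration: ELM fits XOR, which no linear model can.
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = [-1, 1, 1, -1]
predict = train_elm(X, y)
```

Only the output weights are learned, which is why ELM training is a single linear solve; in the paper, the MDFO metaheuristic would tune quantities such as the hidden-layer size.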
Funding: Supported in part by the National Natural Science Foundation of China under Grants 62273272 and 61873277; in part by the China Postdoctoral Science Foundation under Grant 2020M673446; in part by the Key Research and Development Program of Shaanxi Province under Grant 2023-YBGY-243; and in part by the Youth Innovation Team of Shaanxi Universities.
Abstract: Current video captioning models based on an encoder-decoder mainly rely on a single video input source. The content of such captioning is limited, since few studies employ external corpus information to guide generation, which is not conducive to accurate description and understanding of video content. To address this issue, a novel video captioning method guided by a sentence retrieval generation network (ED-SRG) is proposed in this paper. First, a ResNeXt network model, an efficient convolutional network for online video understanding (ECO) model, and a long short-term memory (LSTM) network model are integrated to construct an encoder-decoder, which is utilized to extract the 2D features, 3D features, and object features of video data, respectively. These features are decoded to generate textual sentences that conform to the video content, for use in sentence retrieval. Then, a sentence-transformer network model is employed to retrieve sentences in an external corpus that are semantically similar to these textual sentences, and candidate sentences are screened out through similarity measurement. Finally, a novel GPT-2 network model is constructed based on the GPT-2 network structure. The model introduces a designed random selector to randomly select predicted words with high probability in the corpus, which guides the generation of textual sentences more in line with natural human language. The proposed method is compared with several existing works in experiments. The results show that the indicators BLEU-4, CIDEr, ROUGE_L, and METEOR are improved by 3.1%, 1.3%, 0.3%, and 1.5% on the public dataset MSVD, and by 1.3%, 0.5%, 0.2%, and 1.9% on the public dataset MSR-VTT, respectively. It can be seen that the proposed method can generate video captioning with richer semantics than several state-of-the-art approaches.
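The retrieval step above uses a sentence-transformer's dense embeddings; as a simplified stand-in, the following sketch retrieves the most similar corpus sentence by cosine similarity over bag-of-words vectors. The corpus and query sentences are invented for illustration:

```python
import math
from collections import Counter

def bow(sentence):
    """Bag-of-words vector as a word -> count mapping."""
    return Counter(sentence.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, corpus, k=1):
    """Return the k corpus sentences most similar to the query."""
    q = bow(query)
    return sorted(corpus, key=lambda s: cosine(q, bow(s)), reverse=True)[:k]

corpus = [
    "a man rides a horse across the field",
    "a chef prepares pasta in the kitchen",
    "two dogs play with a ball",
]
best = retrieve("a man is riding a horse", corpus, k=1)[0]
```

A dense-embedding retriever works the same way structurally: embed, score by cosine similarity, keep the top k; embeddings simply capture semantic rather than surface overlap.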
Funding: Princess Nourah bint Abdulrahman University Researchers Supporting Project Number (PNURSP2022R263), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia. The authors would like to thank the Deanship of Scientific Research at Umm Al-Qura University for supporting this work by Grant Code: 22UQU4340237DSR38.
Abstract: Sentiment analysis (SA) of the Arabic language has become important despite scarce annotated corpora and confined sources. Arabic affect analysis has become an active research area, but the Arabic language still lacks adequate language resources for enabling SA tasks. Arabic thus faces challenges in natural language processing (NLP) tasks because of its structural complexity, history, and distinct cultures, and it has received less effort than other languages. This paper develops a Multi-Verse Optimization with Deep Reinforcement Learning Enabled Affect Analysis (MVODRL-AA) model for the Arabic corpus. The presented MVODRL-AA model majorly concentrates on identifying and classifying affects, or emotions, that occur in the Arabic corpus. Firstly, the MVODRL-AA model performs data pre-processing and word embedding, using an n-gram model to generate the word embeddings. A deep Q-learning network (DQLN) model is then exploited to identify and classify the affects in the Arabic corpus. At last, the MVO algorithm is used as a hyperparameter tuning approach to adjust the hyperparameters of the DQLN model, showing the novelty of the work. A series of simulations were carried out to exhibit the promising performance of the MVODRL-AA model. The simulation outcomes illustrate the superiority of the MVODRL-AA method over other approaches, with an accuracy of 99.27%.
Funding: Princess Nourah bint Abdulrahman University Researchers Supporting Project Number (PNURSP2022R263), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia. The authors would like to thank the Deanship of Scientific Research at Umm Al-Qura University for supporting this work by Grant Code: (22UQU4310373DSR53).
Abstract: Natural language processing (NLP) for the Arabic language has gained much significance in recent years. The most commonly utilized NLP task is text classification, whose main intention is to apply machine learning (ML) approaches to automatically classify textual files into one or more pre-defined categories. In ML approaches, the first and most crucial step is identifying an appropriate large dataset on which to train and test the method. One trending ML technique, deep learning (DL), needs huge volumes of diverse datasets for training to yield the best outcomes. Against this background, the current study designs a new Dice Optimization with Deep Hybrid Boltzmann Machine-based Arabic Corpus Classification (DODHBM-ACC) model. The presented DODHBM-ACC model primarily relies upon several stages of pre-processing and the word2vec word embedding process. For Arabic text classification, the DHBM technique is utilized. This technique is a hybrid of the Deep Boltzmann Machine (DBM) and the Deep Belief Network (DBN), and it has the advantage of learning the decisive intention of the classification process. To adjust the hyperparameters of the DHBM technique, the Dice Optimization Algorithm (DOA) is exploited in this study. Experimental analysis was conducted to establish the superior performance of the proposed DODHBM-ACC model; the outcomes confirm its better performance over other recent approaches.
Funding: The National Natural Science Foundation of China (81803877, 82104705); the Natural Science Foundation of Guangdong Province of China (2017A030310178); the Guangdong Sci-Tech Commissioner (20211800500322); the China Postdoctoral Science Foundation (2020M682817); the Guangdong Basic and Applied Basic Research Foundation (2020A1515110651, 2020B1515120063); the Guangdong Medical Science and Technology Research Foundation (A2021476); the Traditional Chinese Medicine Research Project of the Guangdong Province Traditional Chinese Medicine Bureau (20221256); and the Dongguan Social Technology Development Fund (202050715001207).
Abstract: Background: The relationship between microRNA (miRNA) expression patterns and tumor mutation burden (TMB) in uterine corpus endometrial carcinoma (UCEC) was investigated in this study. Methods: The UCEC dataset from The Cancer Genome Atlas (TCGA) database was used to identify the miRNAs that are differentially expressed between high-TMB and low-TMB sample sets. The total sample set was divided into a training set and a test set. TMB levels were predicted using a miRNA-based signature classifier developed by Lasso Cox regression, with the test set used to validate the classifier. This study also investigated the relationship between the miRNA-based signature classifier and three immune checkpoint molecules (programmed cell death protein 1 [PD-1], programmed cell death ligand 1 [PD-L1], and cytotoxic T lymphocyte-associated antigen 4 [CTLA-4]), and functional enrichment analysis was performed on the miRNAs in the classifier. Results: We identified 27 differentially expressed miRNAs for the miRNA-based signature. For predicting TMB level, the 27-miRNA-based signature classifier achieved accuracies of 0.8689 in the training cohort, 0.8276 in the test cohort, and 0.8524 in the total cohort. The classifier correlated negatively with PD-1 and positively with PD-L1 and CTLA-4. Based on the miRNA profiling described above, we validated the expression levels of 9 of these miRNAs in clinical samples by quantitative reverse transcription PCR (qRT-PCR); four were highly expressed. Many cancer-related and immune-associated biological processes were linked to these 27 miRNAs. Thus, the developed miRNA-based signature classifier correlated with TMB levels and could predict TMB levels in UCEC samples. Conclusion: In this study, we investigated the relationship between a miRNA-based signature classifier and TMB levels in uterine corpus endometrial carcinoma. Further, this is the first study to confirm their relationship in clinical samples, which may provide further evidence in support of immunotherapy for endometrial cancer.
Funding: Princess Nourah bint Abdulrahman University Researchers Supporting Project Number (PNURSP2023R161), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia. The authors would like to thank the Deanship of Scientific Research at Umm Al-Qura University for supporting this work by Grant Code: 22UQU4340237DSR51.
Abstract: Sentiment analysis (SA) of natural language text is not only a challenging process but also gains significance in various natural language processing (NLP) applications. SA is utilized in various domains, namely education (to improve learning and teaching processes), marketing strategy, customer trend prediction, and the stock market. Various researchers have applied lexicon-based approaches, machine learning (ML) techniques, and so on to conduct SA for multiple languages, for instance English and Chinese. Owing to the increased popularity of deep learning models, the current study used diverse configuration settings of the convolutional neural network (CNN) model and conducted SA on Hindi movie reviews. The current study introduces an Effective Improved Metaheuristics with Deep Learning (DL)-Enabled Sentiment Analysis for Movie Reviews (IMDLSA-MR) model. The presented IMDLSA-MR technique initially applies different levels of pre-processing to convert the input data into a compatible format. Besides, the term frequency-inverse document frequency (TF-IDF) model is exploited to generate word vectors from the pre-processed data. The deep belief network (DBN) model is utilized to analyse and classify the sentiments. Finally, the improved Jellyfish Search Optimization (IJSO) algorithm is utilized for optimal fine-tuning of the hyperparameters of the DBN model, which shows the novelty of the work. Different experimental analyses were conducted to validate the better performance of the proposed IMDLSA-MR model. The comparative study outcomes highlight the enhanced performance of the proposed IMDLSA-MR model over recent DL models, with a maximum accuracy of 98.92%.
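The TF-IDF weighting mentioned above can be sketched directly. This minimal version uses raw term frequency and idf = ln(N/df), one common variant among several; the toy documents are illustrative:

```python
import math
from collections import Counter

def tfidf(docs):
    """Map each tokenized doc to {term: tf * idf} with idf = ln(N / df)."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return [{t: c * math.log(n / df[t]) for t, c in Counter(doc).items()}
            for doc in docs]

docs = [["great", "movie", "great", "acting"],
        ["boring", "movie"],
        ["great", "plot"]]
vectors = tfidf(docs)
```

Note the effect: "acting" appears in only one document, so it outweighs "movie", which appears in two; terms appearing in every document would get weight zero under this idf variant.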
Funding: Princess Nourah bint Abdulrahman University Researchers Supporting Project Number (PNURSP2022R281), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia; the Deanship of Scientific Research at Umm Al-Qura University for supporting this work by Grant Code: (22UQU4340237DSR61).
Abstract: Currently, individuals use online social media, namely Facebook or Twitter, to share their thoughts and emotions. Detection of emotions on social networking sites finds use in several applications in social welfare, commerce, public health, and so on. Emotion is expressed by several means, like facial and speech expressions, gestures, and written text. Emotion recognition in a text document is a content-based classification problem that combines notions from the deep learning (DL) and natural language processing (NLP) domains. This article proposes a Deer Hunting Optimization with Deep Belief Network Enabled Emotion Classification (DHODBN-EC) model for English Twitter data. The presented DHODBN-EC model aims to examine the existence of distinct emotion classes in tweets. At the introductory level, the DHODBN-EC technique pre-processes the tweets at different levels. Besides, the word2vec feature extraction process is applied for the word embedding process. For emotion classification, the DHODBN-EC model utilizes the DBN model, which helps determine distinct emotion class labels. Lastly, the DHO algorithm is leveraged for optimal hyperparameter adjustment of the DBN technique. An extensive range of experimental analyses was executed to demonstrate the enhanced performance of the DHODBN-EC approach. A comprehensive comparison study exhibited the improvement of the DHODBN-EC model over other approaches, with an increased accuracy of 96.67%.
Funding: Princess Nourah bint Abdulrahman University Researchers Supporting Project Number (PNURSP2022R281), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia. The authors would like to thank the Deanship of Scientific Research at Umm Al-Qura University for supporting this work by Grant Code: (22UQU4331004DSR41).
Abstract: The term 'corpus' refers to a huge volume of structured datasets containing machine-readable texts generated in a natural communicative setting. The explosion of social media has permitted individuals to spread information freely, with minimal examination and filtering. Because of this, the old problem of fake news has resurfaced, and it has become an important concern due to its negative impact on the community. To manage the spread of fake news, automatic recognition approaches have been investigated using artificial intelligence (AI) and machine learning (ML) techniques. ML approaches have been applied to text classification tasks and have performed quite effectively, but a huge human effort is still required to generate labelled training data. The recent progress of deep learning (DL) methods seems a promising way to tackle difficult natural language processing (NLP) tasks, especially fake news detection, and an automatic text classifier is highly helpful for unlocking social media data. The current research article focuses on the design of an Optimal Quad-Channel Hybrid Long Short-Term Memory-based Fake News Classification (QCLSTM-FNC) approach, which aims to identify and differentiate fake news from actual news. To attain this, the proposed QCLSTM-FNC approach follows two methods: a data pre-processing method and a GloVe-based word embedding process. Besides, the QCLSTM model is utilized for classification. To boost the classification results of the QCLSTM model, a Quasi-Oppositional Sandpiper Optimization (QOSPO) algorithm is utilized to fine-tune the hyperparameters. The proposed QCLSTM-FNC approach was experimentally validated against a benchmark dataset and successfully outperformed all other existing DL models under different measures.
Funding: Princess Nourah bint Abdulrahman University Researchers Supporting Project Number (PNURSP2022R281), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia. The authors would like to thank the Deanship of Scientific Research at Umm Al-Qura University for supporting this work by Grant Code: 22UQU4340237DSR52.
Abstract: Aspect-based sentiment analysis (ABSA) on Arabic corpora has become an active research topic in recent days. ABSA refers to a fine-grained sentiment analysis (SA) task that focuses on extracting the discussed aspects and identifying the sentiment polarity of each from the provided text. Most prevailing Arabic ABSA techniques depend heavily on dreary feature-engineering and pre-processing tasks and utilize external sources such as lexicons. In the literature on Arabic text analysis, authors have made use of regular machine learning (ML) techniques that rely on a group of rare sources and tools, such as lexicons, for processing and analyzing Arabic-language content. However, an important challenge in this domain is the unavailability of sufficient and reliable resources. Against this background, the current study introduces a new Battle Royale Optimization with Fuzzy Deep Learning for Arabic Aspect-Based Sentiment Classification (BROFDL-AASC) technique. The aim of the presented BROFDL-AASC model is to detect and classify sentiments in the Arabic language. In the presented BROFDL-AASC model, data pre-processing is performed first to convert the input data into a useful format. Besides, the BROFDL-AASC model includes a Discriminative Fuzzy-based Restricted Boltzmann Machine (DFRBM) model for the identification and categorization of sentiments. Furthermore, the BRO algorithm is exploited for optimal fine-tuning of the hyperparameters of the DFRBM model. This establishes the novelty of the current study. The performance of the proposed BROFDL-AASC model was validated, and the outcomes demonstrate its supremacy over other existing models.
Funding: Princess Nourah bint Abdulrahman University Researchers Supporting Project Number (PNURSP2022R263), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia. The authors would like to thank the Deanship of Scientific Research at Umm Al-Qura University for supporting this work by Grant Code: 22UQU4310373DSR55.
Abstract: Gender analysis of Twitter can reveal significant socio-cultural differences between female and male users. Efforts have been made to analyze and automatically infer gender from content in more commonly spoken languages, but limited work has been undertaken for Arabic: most research addresses English, with the least effort devoted to non-English languages. Demographic inference such as gender identification is relatively uncommon for Arabic social networking users, especially on Twitter. Therefore, this study aims to design an Optimal Marginalized Stacked Denoising Autoencoder for Gender Identification on Arabic Twitter (OMSDAE-GIAT) model. The presented OMSDAE-GIAT technique mainly concentrates on the identification and classification of gender in Twitter data. To attain this, the OMSDAE-GIAT model carries out initial stages of data pre-processing and word embedding. Next, the MSDAE model is exploited for the classification of gender into two classes, namely male and female. In the final stage, the OMSDAE-GIAT technique uses an enhanced bat optimization algorithm (EBOA) for the parameter tuning process, showing the novelty of the work. The performance of the OMSDAE-GIAT model was validated on an Arabic corpus dataset, and the results were measured under distinct metrics. The comparison study shows the enhanced performance of the OMSDAE-GIAT model over other recent approaches.
Funding: This paper is the research result of the school-level scientific research project of Chengdu College of Arts and Sciences, "Research on the Interpretation of Chinese Cultural Images and Translation Teaching Strategies in Accordance With Internationalization Strategy" (Fund Project No.: WLYB2022066), and of the China Private Education Association's 2022 annual planning topic "Research on the Application of Corpus in English Teaching in Private Colleges and Universities" (Fund Project No.: CANFZG22201).
Abstract: The difficulty of learning English relates not only to interest but also to the correctness of learning methods. In English teaching especially, a comprehensive and in-depth mastery of vocabulary can improve learners' English proficiency, help them acquire English knowledge better, and raise their level of cross-cultural communication. The application of corpora in English classroom vocabulary teaching can provide more educational space for vocabulary teaching and enrich teaching methods, while also making it easier for students to learn vocabulary and laying a foundation for learning the English language. To this end, this article first describes the important role of corpus applications in vocabulary learning in English classroom teaching. Secondly, it discusses the difficulties of vocabulary learning and the factors that affect the quality of learning. Finally, to enhance students' learning effects and improve the teaching level, several learning strategies are formulated to continually highlight the practicality of the corpus.
Abstract: The discrimination of synonyms has always been one of the great challenges for English learners. Taking assessment and evaluation as examples, this study analyses the similarities and differences between the two words, as well as their usage, from the perspectives of frequency, stylistics, collocation, and semantic prosody with the help of the British National Corpus, and demonstrates the importance of corpus retrieval tools in synonym discrimination. Furthermore, this paper gives some suggestions for English learners and teachers regarding English vocabulary teaching.
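The collocation analysis described in corpus studies like the two above can be approximated with a simple window count over tokens, which is essentially what a concordancer reports for a node word. The sample sentence is invented:

```python
from collections import Counter

def collocates(tokens, node, window=2):
    """Count words co-occurring with `node` within +/- `window` tokens."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == node:
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            counts.update(t for t in tokens[lo:hi] if t != node)
    return counts

tokens = ("the assessment of student learning and the evaluation "
          "of teaching require careful assessment criteria").split()
colls = collocates(tokens, "assessment", window=2)
```

Corpus tools then rank such counts with association measures (e.g., mutual information or log-likelihood) against overall word frequencies; the raw window count here is the shared first step.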
Abstract: We use many devices in our daily life to communicate with others. In this modern world, people use email, Facebook, Twitter, and many other social networking sites to exchange information. People lose valuable time to misspelling and retyping, and some people are unhappy typing long sentences because they run into unnecessary words or grammatical issues. Word-predictive systems therefore help people exchange textual information more quickly, easily, and comfortably. These systems predict the next most probable words and let users choose the needed word from the suggestions; word prediction can thus help a writer complete a sentence correctly. This research aims to forecast the most suitable next word to complete a sentence for any given context, working on the Bangla language. We present a process that can predict the next most probable and proper words and suggest a complete sentence using the predicted words. A GRU-based RNN trained on an n-gram dataset is used to develop the proposed model. We collected a large Bangla-language dataset from multiple sources and compared the approach to others that have been used, such as LSTM and Naive Bayes; the suggested approach provides better accuracy than the others. Here, the unigram model achieves 88.22% average accuracy, the bigram model 99.24%, the trigram model 97.69%, and the 4-gram and 5-gram models 99.43% and 99.78%, respectively. We believe that our proposed method can make a profound impact on Bangla search engines.
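The n-gram next-word prediction underlying the model above can be illustrated with a tiny bigram counter. The toy corpus is invented, and the paper's actual predictor is a GRU-based RNN rather than raw counts:

```python
from collections import Counter, defaultdict

def train_bigram(corpus):
    """Map each word to a Counter of the words observed to follow it."""
    model = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.split()
        for prev, nxt in zip(words, words[1:]):
            model[prev][nxt] += 1
    return model

def predict_next(model, word, k=1):
    """Return the k most probable next words after `word`."""
    return [w for w, _ in model[word].most_common(k)]

corpus = [
    "i like to read books",
    "i like to play",
    "we like to read",
]
model = train_bigram(corpus)
```

Higher-order n-grams condition on longer histories in the same way; a GRU replaces the count table with a learned hidden state, which handles unseen histories far better.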
Abstract: This study takes the image of China's military as constructed in the British media's military coverage of China as its object of study, and creates two reference corpora for comparative research, the better to analyse and compare the ideologies hidden in the British media's coverage of China's military. From the analysis of the corpus, it is concluded that, owing to China's rapid military and economic development, Western countries have developed a sense of crisis and have used media reports to undermine China's foreign relations and mislead public thinking.