Abstract: An N-gram Chinese language model incorporating linguistic rules is presented. By constructing an element lattice, rule information is incorporated into the statistical framework. To facilitate the hybrid modeling, novel methods such as MI-based rule evaluation, weighted rule quantification, and element-based n-gram probability approximation are presented. A dynamic Viterbi algorithm is adopted to search for the best path in the lattice. To strengthen the model, transformation-based error-driven rule learning is adopted. Applying the proposed model to Chinese Pinyin-to-character conversion achieves high performance in accuracy, flexibility, and robustness simultaneously. Tests show that the correct rate reaches 94.81%, compared with 90.53% using a bi-gram Markov model alone. Many long-distance dependencies and recursive structures in language can be processed effectively.
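As a rough illustration of the best-path search mentioned in the abstract above, the following minimal sketch runs a Viterbi search over a lattice of candidate characters scored by a bigram model. It does not reproduce the paper's rule-weighted element lattice; the function names (`viterbi`, `toy_logp`) and the probability table `TOY` are hypothetical and the numbers are made up for illustration.

```python
import math

def viterbi(lattice, bigram_logp, start="<s>"):
    """Return the highest-scoring path through a candidate lattice.

    lattice: list of positions, each a list of candidate tokens.
    bigram_logp(prev, cur): log P(cur | prev); a hypothetical scoring function.
    """
    # best maps each token at the current position to
    # (score of the best path ending in that token, the path itself).
    best = {start: (0.0, [])}
    for candidates in lattice:
        nxt = {}
        for cur in candidates:
            # Pick the predecessor that maximizes the accumulated log-probability.
            score, path = max(
                ((s + bigram_logp(prev, cur), p) for prev, (s, p) in best.items()),
                key=lambda t: t[0],
            )
            nxt[cur] = (score, path + [cur])
        best = nxt
    return max(best.values(), key=lambda t: t[0])[1]

# Toy bigram probabilities (invented values, not taken from the paper).
TOY = {
    ("<s>", "中"): 0.6, ("<s>", "钟"): 0.4,
    ("中", "文"): 0.7, ("中", "闻"): 0.1,
    ("钟", "文"): 0.1, ("钟", "闻"): 0.2,
}

def toy_logp(prev, cur):
    # Unseen pairs get a tiny floor probability instead of zero.
    return math.log(TOY.get((prev, cur), 1e-6))

# Candidate characters for the Pinyin sequence "zhong wen".
lattice = [["中", "钟"], ["文", "闻"]]
result = viterbi(lattice, toy_logp)
```

With these toy values the search selects 中 followed by 文, since that pair has the largest product of bigram probabilities among the four possible paths.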
Funding: The results and knowledge included herein were obtained with support from the following institutional grant: Internal Grant Agency of the Faculty of Economics and Management, Czech University of Life Sciences Prague, Grant No. 2023A0004, “Text Segmentation Methods of Historical Alphabets in OCR Development” (https://iga.pef.czu.cz/). Funds were granted to T. Novák, A. Hamplová, O. Svojše, and A. Veselý from the author team.
Abstract: This study presents a single-class and multi-class instance segmentation approach applied to ancient Palmyrene inscriptions, employing two state-of-the-art deep learning algorithms, namely YOLOv8 and Roboflow 3.0. The goal is to contribute to the preservation and understanding of historical texts, showcasing the potential of modern deep learning methods in archaeological research. Our research culminates in several key findings and scientific contributions. We comprehensively compare the performance of YOLOv8 and Roboflow 3.0 in the context of Palmyrene character segmentation; this comparative analysis focuses mainly on the strengths and weaknesses of each algorithm. We also created and annotated an extensive dataset of Palmyrene inscriptions, a crucial resource for further research in the field. The dataset is used for training and evaluating the segmentation models. We employ comparative evaluation metrics to quantitatively assess the segmentation results, ensuring the reliability and reproducibility of our findings, and we present custom visualization tools for predicted segmentation masks. Our study advances the state of the art in semi-automatic reading of Palmyrene inscriptions and establishes a benchmark for future research. The availability of the Palmyrene dataset and the insights into algorithm performance contribute to the broader understanding of historical text analysis.
Funding: Princess Nourah bint Abdulrahman University Researchers Supporting Project Number (PNURSP2022R263), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia; the Deanship of Scientific Research at Umm Al-Qura University, for supporting this work by Grant Code 22UQU4340237DSR39.
Abstract: Handwritten character recognition has become a challenging research topic. Many studies have addressed the recognition of letters in various languages, but the availability of Arabic handwritten character databases has been limited. Almost a quarter of a billion people worldwide write and speak Arabic, and many historical books and files written in Arabic represent a vital data set for many Arab nations. Recently, Arabic handwritten character recognition (AHCR) has attracted attention and has become a difficult topic in pattern recognition and computer vision (CV). Therefore, this study develops a fireworks optimization with deep learning-based AHCR (FWODL-AHCR) technique. The major intention of the FWODL-AHCR technique is to recognize the distinct handwritten characters of the Arabic language. It first pre-processes the handwritten images to improve their quality. Then, a RetinaNet-based deep convolutional neural network is applied as a feature extractor to produce feature vectors. Next, the deep echo state network (DESN) model is used to classify the handwritten characters. Finally, the FWO algorithm is used as a hyperparameter tuning strategy to boost recognition performance. A series of simulations was performed to demonstrate the enhanced performance of the FWODL-AHCR technique. The comparison study showed the superiority of the FWODL-AHCR technique over other approaches, with 99.91% and 98.94% on the Hijja and AHCD datasets, respectively.
Abstract: A good language model is essential to a postprocessing algorithm for recognition systems. In the past, researchers have presented various language models, such as character-based language models, word-based language models, syntactic-rule language models, hybrid models, etc. The word N-gram model is, so far, an effective and efficient model, but one has to address the problem of data sparseness in establishing the model. Katz and Kneser et al., respectively, presented effective remedies for this challenging problem. In this study, we propose an improvement to their methods by incorporating Chinese language-specific information or Chinese word class information into the system.
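To make the data sparseness problem in the abstract above concrete, here is a minimal sketch of a smoothed bigram model. It uses interpolated absolute discounting, a deliberately simplified stand-in that is neither Katz back-off nor Kneser-Ney smoothing, and it does not include the word class information the paper proposes. The function names and the discount value `d=0.5` are assumptions chosen for illustration.

```python
from collections import Counter

def train(tokens):
    """Count unigrams and adjacent-word bigrams in a token list."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    return unigrams, bigrams

def prob(w, v, unigrams, bigrams, d=0.5):
    """Estimate P(w | v) with interpolated absolute discounting.

    A fixed discount d is subtracted from every seen bigram count, and the
    freed probability mass is redistributed according to the unigram model.
    """
    total = sum(unigrams.values())
    p_uni = unigrams[w] / total
    # Number of times v occurs as a bigram history.
    hist = sum(c for (a, _), c in bigrams.items() if a == v)
    if hist == 0:
        # Unseen history: fall back entirely to the unigram estimate.
        return p_uni
    # Number of distinct words observed after v.
    distinct = sum(1 for (a, _) in bigrams if a == v)
    lam = d * distinct / hist
    return max(bigrams[(v, w)] - d, 0) / hist + lam * p_uni

tokens = "a b a c a b".split()
unigrams, bigrams = train(tokens)
# The bigram ("a", "a") never occurs, yet it still receives some probability.
p_unseen = prob("a", "a", unigrams, bigrams)
p_total = sum(prob(w, "a", unigrams, bigrams) for w in unigrams)
```

The unseen bigram gets a small nonzero probability while the conditional distribution over the vocabulary still sums to one, which is the basic property any remedy for sparse counts needs to preserve.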
Abstract: In recent years, handwriting recognition analysis for indigenous languages has gained significant interest among research communities. Recent developments in artificial intelligence (AI), natural language processing (NLP), and computational linguistics (CL) have proved useful in the analysis of regional low-resource languages. Automatic lexical task participation can be extended to various NLP applications. This is apparent from the availability of effective machine recognition models and open-access handwritten databases. Arabic is a commonly spoken Semitic language, written from right to left with the cursive Arabic alphabet. Arabic handwritten character recognition (HCR) is a crucial process in optical character recognition. In this view, this paper presents an effective Computational Linguistics with Deep Learning based Handwriting Recognition and Speech Synthesizer (CLDL-THRSS) for Indigenous Language. The presented CLDL-THRSS model involves two stages of operation, namely automated handwriting recognition and speech recognition. First, the automated handwriting recognition procedure involves preprocessing, segmentation, feature extraction, and classification. A Capsule Network (CapsNet) based feature extractor is employed for the recognition of handwritten Arabic characters. For optimal hyperparameter tuning, the cuckoo search (CS) optimization technique is used to tune the parameters of the CapsNet method. In addition, a deep neural network with hidden Markov model (DNN-HMM) is employed for the automatic speech synthesizer. To validate the performance of the proposed CLDL-THRSS model, a detailed experimental validation is carried out and the outcomes are examined in terms of different measures. The experimental outcomes indicated that the CLDL-THRSS technique outperformed the compared methods.
Funding: The authors extend their appreciation to the Deanship of Scientific Research at King Khalid University for funding this work through the Large Groups Project under grant number (168/43); Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2022R263), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia. The authors would like to thank the Deanship of Scientific Research at Umm Al-Qura University for supporting this work by Grant Code (22UQU4340237DSR32). The author would like to thank the Deanship of Scientific Research at Shaqra University for supporting this work.
Abstract: The recognition of Arabic characters is a crucial task in the fields of computer vision and Natural Language Processing. Major complications in recognizing handwritten text include distortion and pattern variability, so the feature extraction process is a significant task in NLP models. If the features are selected automatically, adequate data for accurately predicting the character classes might be unavailable, while a large number of features usually creates difficulties due to high dimensionality. Against this background, the current study develops a Sailfish Optimizer with Deep Transfer Learning-Enabled Arabic Handwriting Character Recognition (SFODTL-AHCR) model. The proposed SFODTL-AHCR model primarily focuses on identifying the handwritten Arabic characters in the input image. To attain this objective, the model pre-processes the input image using the Histogram Equalization approach. The Inception with ResNet-v2 model examines the pre-processed image to produce the feature vectors. The Deep Wavelet Neural Network (DWNN) model is used to recognize the handwritten Arabic characters. Finally, the SFO algorithm is used to fine-tune the parameters of the DWNN model to attain better performance. The performance of the proposed SFODTL-AHCR model was validated using a series of images, and extensive comparative analyses were conducted. The proposed method achieved a maximum accuracy of 99.73%. The outcomes indicated the superiority of the proposed SFODTL-AHCR model over other approaches.
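The Histogram Equalization pre-processing step named in the abstract above can be sketched in a few lines. This is a generic textbook version for an 8-bit grayscale image stored as a list of rows, written in plain Python; it is not the authors' implementation, and the function name `equalize` and the sample image are hypothetical. It assumes a non-empty image with pixel values below `levels`.

```python
def equalize(image, levels=256):
    """Histogram-equalize a grayscale image given as a list of rows of ints."""
    flat = [p for row in image for p in row]
    # Histogram of intensity values.
    hist = [0] * levels
    for p in flat:
        hist[p] += 1
    # Cumulative distribution of the histogram.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)
    cdf_min = next(c for c in cdf if c > 0)
    n = len(flat)
    if n == cdf_min:
        # Constant image: nothing to stretch, return a copy.
        return [row[:] for row in image]
    # Map each intensity so the darkest used level goes to 0
    # and the brightest used level goes to levels - 1.
    scale = (levels - 1) / (n - cdf_min)
    lut = [max(0, round((c - cdf_min) * scale)) for c in cdf]
    return [[lut[p] for p in row] for row in image]

# A tiny low-contrast sample image (values invented for illustration).
sample = [[50, 50], [100, 150]]
equalized = equalize(sample)
```

The mapping stretches the narrow input range across the full 0 to 255 range while preserving the ordering of intensities, which is why it is a common way to improve contrast before feature extraction.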
Funding: This work is an outcome of the Sichuan University project “A Preliminary Study on Online Chinese Character Teaching Strategies for Teaching Chinese as a Foreign Language During the COVID-19 Pandemic,” Project No. 2022 Self-Research-Overseas 008.
Abstract: The expanding role of the Chinese language in international communications has become increasingly prominent as China’s comprehensive national power continues to grow, leading to a significant rise in the number of Chinese language learners. Since online teaching is not limited by time and space, its application is widespread. For beginners in the Chinese language, Chinese characters are both a priority and a challenge. The “Chinese Character Classification,” also known as the “Six Writings,” is the earliest systematic theory of Chinese character structures, and teaching Chinese characters in categories based on the “Chinese Character Classification” is a method that fits the cognition of beginners. In order to teach Chinese characters in a targeted way, and based on the collection and analysis of common errors in Chinese characters among beginners, this paper proposes that (a) the intuitive method can be applied to teach pictographic characters, indicative characters, and associative compound characters in online teaching; (b) the inductive-deductive method of “basic characters to new characters” can be applied to the teaching of pictophonetic characters and associative compound characters; and (c) the learning of character patterns should be approached in a whole-part-whole process, while importance should be attached to the frequency effect, with a view to facilitating the online learning of Chinese characters for beginners. The aim of this paper is to provide some practical implications for the online teaching of Chinese characters to foreigners.
Abstract: This article describes a multiyear initiative of a multilingual, multicultural international school that has come to adopt and internalize character development as part of its identity. That is, character education has been treated as a central tenet and core value that permeates the school and binds the community. It has not been regarded as a supplemental or enhancement project, but rather as integral to the general educational program. Built from a principled framework with sound theoretical backing, the infusion of character education at this international school has resulted in the crafting of new standards and the introduction of teacher and student self-assessment tools. In that vein, in this article, we share how the school has come to embrace character development and has forged personalized ways for stakeholders, including teachers and multilingual learners, to engage in improving teaching and learning.
Abstract: Recognizing the signs and fonts of a prehistoric language is a fairly difficult job that requires special tools. This requirement makes the processing time excessive and the computation difficult and tedious. This paper presents a technique for recognizing ancient south Indian languages by applying an Artificial Neural Network (ANN) combined with the Opposition based Grey Wolf Optimization Algorithm (OGWA). It identifies the prehistoric language, signs, and fonts. It is apparent that the arbitrarily generated weights linking the various layers of the ANN play a significant role in its performance. To determine these weights adaptively, this paper applies several optimization algorithms to the ANN system: Opposition based Grey Wolf Optimization, Particle Swarm Optimization, and Grey Wolf Optimization. The performance results show that the proposed ANN-OGWO technique achieves superior accuracy over the other techniques. The accuracy of OGWO is 94.89% in test case 1 and 92.34% in test case 2. On average, OGWO achieves 5.8% higher accuracy than ANN-GWO, 10.1% higher than ANN-PSO, and 22.1% higher than the conventional ANN technique.
Abstract: Language and gender is an important topic in sociolinguistic studies. This paper aims to analyze the differences between male language and female language in "Friends" on the basis of an understanding of the relevant theories and studies about language and gender.
Abstract: Learning Chinese characters is one of the top challenges for novice non-Chinese-speaking learners. A comparison between DCI (Delayed Character Introduction) and ICI (Immediate Character Introduction) is offered. Further, arguments for and against each are discussed from the perspectives of feasibility, target language environment, pinyin dependence, and compressed teaching time. DCI is considered more suitable for novice Chinese learners, especially adults, and is more a theoretical suggestion than a practical pedagogy.