RUSAS: Roman Urdu Sentiment Analysis System
1
Authors: Kazim Jawad, Muhammad Ahmad, Majdah Alvi, Muhammad Bux Alvi. 《Computers, Materials & Continua》, SCIE, EI, 2024, Issue 4, pp. 1463-1480 (18 pages)
Sentiment analysis, a field of Natural Language Processing (NLP), attempts to analyze and identify the sentiments in opinionated text data. People share their judgments, reactions, and feedback on the internet using various languages. Urdu is one of them, and it is frequently used worldwide. Urdu-speaking people prefer to communicate on social media in Roman Urdu (RU), an English scripting style with the Urdu language dialect. Researchers have developed versatile lexical resources for feature-rich, comprehensive languages, but limited linguistic resources are available to facilitate the sentiment classification of Roman Urdu. This effort encompasses extracting subjective expressions in Roman Urdu and determining the implied polarity of the opinionated text. The primary sources of the dataset are Daraz (an e-commerce platform), Google Maps, and manual effort. The contributions of this study include a Bilingual Roman Urdu Language Detector (BRULD) and a Roman Urdu Spelling Checker (RUSC). These integrated modules accept the user input, detect the text language, correct the spellings, categorize the sentiments, and return the input sentence's orientation with a sentiment intensity score. The developed system gradually gains strength with each input. The results show that the language detector gives an accuracy of 97.1% on a closed-domain dataset, with an overall sentiment classification accuracy of 94.3%.
Keywords: Roman Urdu sentiment analysis; Roman Urdu language detector; Roman Urdu spelling checker; Flask
Translation of English Language into Urdu Language Using LSTM Model
2
Authors: Sajadul Hassan Kumhar, Syed Immamul Ansarullah, Akber Abid Gardezi, Shafiq Ahmad, Abdelaty Edrees Sayed, Muhammad Shafiq. 《Computers, Materials & Continua》, SCIE, EI, 2023, Issue 2, pp. 3899-3912 (14 pages)
English-to-Urdu machine translation is still in its infancy and lacks simple translation methods that provide motivating and adequate English-to-Urdu translation. To make knowledge available to the masses, there should be mechanisms and tools in place that make things understandable by translating from a source language to a target language in an automated fashion. Machine translation has achieved this goal with encouraging results. When decoding the source text into the target language, the translator checks all the characteristics of the text. To achieve machine translation, rule-based, computational, hybrid, and neural machine translation approaches have been proposed to automate the work. In this research work, a neural machine translation approach is employed to translate English text into Urdu: a Long Short-Term Memory (LSTM) encoder-decoder model is used. The steps required to perform the translation task include preprocessing, tokenization, grammar and sentence structure analysis, word embeddings, training data preparation, encoder-decoder models, and output text generation. The results show that the model used in the research work performs better in translation. The results were evaluated using bilingual research metrics and showed that the test and training data yielded the highest score for sequences with an effective length of ten (10).
Keywords: machine translation; Urdu language; word embedding
Validation of LittleEARS questionnaire in Hindi language
3
Authors: Praveen Prakash, S. Lakshmi, Adithya Sreedhar, Arena Varan Mathur, Sreeraj Konadath. 《Journal of Otology》, CSCD, 2023, Issue 2, pp. 71-78 (8 pages)
Background: Subjective measures of auditory development are as important as objective measures for obtaining a realistic picture of hearing status in infants and toddlers. Objectives: The objectives of the current study were to translate the LittleEARS questionnaire into Hindi and validate it, to calculate its psychometric properties and establish a regression curve of the scores obtained as a function of age, and to calculate its inter-test and test-retest reliability. The secondary objectives were to compare the scores obtained by children with normal hearing and those with hearing impairment, and to plot a regression curve of the total scores obtained by the hearing-impaired children as a function of the duration of auditory training attended since the first fitting of their device. Materials and methods: The procedures involved conventional translation, reverse translation, and content validation before administering the questionnaire. The translated version was administered to parents of 59 children with normal hearing and 41 children with hearing impairment. Results: The finalized version had good reliability and internal consistency, with a Cronbach's alpha of 0.96. The mean scores obtained by the children with normal hearing showed a progressive pattern as a function of age. Conclusion: The LittleEARS questionnaire has been successfully translated into Hindi and validated, with excellent validity and reliability, and can be used for screening and early identification of hearing impairment and for evaluating the outcome of audiological treatment approaches.
Keywords: LittleEARS auditory questionnaire; Hindi language; auditory outcome measures; screening tools; preverbal auditory checklist
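The abstract above reports internal consistency as a Cronbach's alpha value. As a minimal sketch of how that statistic is commonly computed, assuming the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), the following Python function may help. The function name `cronbach_alpha` and the toy scores are illustrative assumptions, not the study's data or code.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a table of questionnaire scores.

    scores: list of respondents, each a list of k item scores.
    Uses sample variances throughout (an assumption of this sketch).
    """
    k = len(scores[0])
    if k < 2 or len(scores) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    # Variance of each item (column) across respondents.
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Variance of each respondent's total score.
    total_var = variance([sum(row) for row in scores])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Toy data (hypothetical): three respondents, three perfectly consistent items.
toy_scores = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
alpha = cronbach_alpha(toy_scores)
```

Values close to 1 indicate that the items vary together; the toy data are perfectly consistent by construction and say nothing about the questionnaire itself.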
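Chinese-Urdu Neural Machine Translation Incorporating Urdu Part-of-Speech Sequence Prediction
4
Authors: 陈欢欢, 王剑, Muhammad Naeem Ul Hassan. 《计算机工程与科学》, CSCD, PKU Core, 2024, Issue 3, pp. 518-524 (7 pages)
Many research teams have studied machine translation for the low-resource languages of South and Southeast Asia in depth. For Urdu, the official language of Pakistan, however, targeted research on Chinese-Urdu machine translation is very scarce, owing to the shortage of data resources and the large gap between Urdu and Chinese. To address this, a Transformer-based Chinese-Urdu neural machine translation model that incorporates Urdu part-of-speech sequences is proposed. First, a Transformer is used to predict the part-of-speech sequence of the target language, Urdu. The prediction of the translation model and the prediction of the part-of-speech sequence model are then combined for joint prediction, thereby integrating linguistic knowledge into the translation model. Experiments on an existing small-scale Chinese-Urdu dataset show that the proposed method improves the BLEU score by 0.13 over the baseline model, a fairly clear gain.
Keywords: Transformer; neural machine translation; Urdu; part-of-speech sequence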
Sentiment Analysis of Low-Resource Language Literature Using Data Processing and Deep Learning
5
Authors: Aizaz Ali, Maqbool Khan, Khalil Khan, Rehan Ullah Khan, Abdulrahman Aloraini. 《Computers, Materials & Continua》, SCIE, EI, 2024, Issue 4, pp. 713-733 (21 pages)
Sentiment analysis, a crucial task in discerning emotional tones within text, plays a pivotal role in understanding public opinion and user sentiment across diverse languages. While numerous scholars conduct sentiment analysis in widely spoken languages such as English, Chinese, Arabic, Roman Arabic, and more, grappling with resource-poor languages such as Urdu remains a challenge. Urdu is a uniquely crafted language, characterized by a script that amalgamates elements from diverse languages, including Arabic, Parsi, Pashtu, Turkish, Punjabi, Saraiki, and more. Urdu literature, characterized by distinct character sets and linguistic features, presents an additional hurdle due to the lack of accessible datasets, rendering sentiment analysis a formidable undertaking. The limited availability of resources has fueled increased interest among researchers, prompting a deeper exploration into Urdu sentiment analysis. This research is dedicated to Urdu language sentiment analysis, employing sophisticated deep learning models on an extensive dataset categorized into five labels: Positive, Negative, Neutral, Mixed, and Ambiguous. The primary objective is to discern sentiments and emotions within the Urdu language, despite the absence of well-curated datasets. To tackle this challenge, the initial step involves the creation of a comprehensive Urdu dataset by aggregating data from various sources such as newspapers, articles, and social media comments. Subsequent to this data collection, a thorough process of cleaning and preprocessing is implemented to ensure the quality of the data. The study leverages two well-known deep learning models, namely Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN), for both training and evaluating sentiment analysis performance. Additionally, the study explores hyperparameter tuning to optimize the models' efficacy. Evaluation metrics such as precision, recall, and the F1-score are employed to assess the effectiveness of the models. The research findings reveal that RNN surpasses CNN in Urdu sentiment analysis, gaining a significantly higher accuracy rate of 91%. This result accentuates the exceptional performance of RNN, solidifying its status as a compelling option for conducting sentiment analysis tasks in the Urdu language.
Keywords: Urdu sentiment analysis; convolutional neural networks; recurrent neural network; deep learning; natural language processing; neural networks
Offline Urdu Nastaleeq Optical Character Recognition Based on Stacked Denoising Autoencoder (Cited by: 2)
6
Authors: Ibrar Ahmad, Xiaojie Wang, Ruifan Li, Shahid Rasheed. 《China Communications》, SCIE, CSCD, 2017, Issue 1, pp. 146-157 (12 pages)
Offline Urdu Nastaleeq text recognition has long been a serious problem due to the script's highly cursive nature. To avoid character segmentation problems, many researchers are shifting focus towards segmentation-free, ligature-based recognition approaches. The majority of the prevalent ligature-based recognition systems rely heavily on hand-engineered feature extraction techniques. However, such techniques are more error prone and may often lead to a loss of useful information that can hardly be captured later by any manual features. Most of the prevalent Urdu Nastaleeq text recognition systems were trained and tested on small sets. This paper proposes the use of a stacked denoising autoencoder for automatic feature extraction directly from the raw pixel values of ligature images. Such deep learning networks have not been applied to the recognition of Urdu text thus far. Different stacked denoising autoencoders have been trained on 178573 ligatures with 3732 classes from the un-degraded (noise-free) UPTI (Urdu Printed Text Image) data set. Subsequently, the trained networks are validated and tested on degraded versions of the UPTI data set. The experimental results demonstrate accuracies in the range of 93% to 96%, which are better than those of existing Urdu OCR systems for such a large dataset of ligatures.
Keywords: offline printed ligature recognition; Urdu Nastaleeq; denoising autoencoder; deep learning; classification
Sentiment Analysis of Roman Urdu on E-Commerce Reviews Using Machine Learning (Cited by: 1)
7
Authors: Bilal Chandio, Asadullah Shaikh, Maheen Bakhtyar, Mesfer Alrizq, Junaid Baber, Adel Sulaiman, Adel Rajab, Waheed Noor. 《Computer Modeling in Engineering & Sciences》, SCIE, EI, 2022, Issue 6, pp. 1263-1287 (25 pages)
Sentiment analysis has been widely studied for various languages such as English and French. However, Roman Urdu sentiment analysis still requires more attention from peer researchers due to the lack of off-the-shelf Natural Language Processing (NLP) solutions. The primary objective of this study is to investigate diverse machine learning methods for the sentiment analysis of Roman Urdu data, which is very informal in nature and needs to be lexically normalized. To mitigate this challenge, we propose a fine-tuned Support Vector Machine (SVM) powered by a Roman Urdu stemmer. In our proposed scheme, the corpus data is initially cleaned to remove anomalies from the text. After initial pre-processing, each user review is stemmed. The input text is transformed into a feature vector using the bag-of-words model. Subsequently, the SVM is used to classify and detect user sentiment. Our proposed scheme is based on a dictionary-based Roman Urdu stemmer, which is aimed at standardizing the text so as to minimize the level of complexity. The efficacy of our proposed model is also empirically evaluated with diverse experimental configurations, so as to fine-tune the hyper-parameters and achieve superior performance. Moreover, a series of experiments is conducted on diverse machine learning and deep learning models to compare their performance with our proposed model. We also introduce the largest dataset on Roman Urdu, i.e., the Roman Urdu e-commerce dataset (RUECD), which contains 26K+ user reviews annotated by a group of experts. The experiments show that the newly generated dataset is quite challenging and requires more attention from peer researchers for Roman Urdu sentiment analysis.
Keywords: sentiment analysis; Roman Urdu; machine learning; SVM
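The pipeline described in the abstract above (stem each review with a dictionary-based stemmer, then build a bag-of-words feature vector for a classifier) can be sketched roughly as follows. This is only an illustration under stated assumptions: the tiny `STEM_DICT`, the sample reviews, and all function names are hypothetical and are not taken from the paper or from RUECD, and the SVM classification step is left out.

```python
from collections import Counter

# Hypothetical stem dictionary mapping Roman Urdu spelling variants to one
# canonical form; a real stemmer would use a much larger curated lexicon.
STEM_DICT = {
    "achha": "acha",
    "acha": "acha",
    "bahut": "bohat",
    "bohot": "bohat",
}


def stem(token):
    """Return the canonical form if the token is in the dictionary."""
    return STEM_DICT.get(token, token)


def tokenize(text):
    """Lowercase, split on whitespace, and stem every token."""
    return [stem(tok) for tok in text.lower().split()]


def build_vocabulary(docs):
    """Sorted list of all distinct stemmed tokens in the corpus."""
    return sorted({tok for doc in docs for tok in tokenize(doc)})


def vectorize(doc, vocab):
    """Bag-of-words count vector for one document over the vocabulary."""
    counts = Counter(tokenize(doc))
    return [counts[word] for word in vocab]


# Hypothetical reviews written with different spellings of the same words.
reviews = ["Bohot achha phone", "bahut acha"]
vocab = build_vocabulary(reviews)
vectors = [vectorize(r, vocab) for r in reviews]
```

Because both spellings of each word map to one stem, the two reviews share feature dimensions; the resulting count vectors are the kind of input a classifier such as an SVM would then be trained on.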
Roman Urdu News Headline Classification Empowered with Machine Learning (Cited by: 2)
8
Authors: Rizwan Ali Naqvi, Muhammad Adnan Khan, Nauman Malik, Shazia Saqib, Tahir Alyas, Dildar Hussain. 《Computers, Materials & Continua》, SCIE, EI, 2020, Issue 11, pp. 1221-1236 (16 pages)
Roman Urdu has been used for text messaging over the Internet for years, especially in the Indo-Pak subcontinent. People from the subcontinent may speak the same Urdu language, but they might use different scripts for writing. Communication on social media in Urdu written with Roman characters is now considered the most typical standard of communication in the Indian subcontinent, which makes it a rich source of information. English text classification is a solved problem, but there have been only a few efforts in the past to examine the rich information supply of Roman Urdu. This is due to the numerous complexities involved in processing Roman Urdu data, including the non-availability of a tagged corpus, the lack of a set of rules, and the lack of standardized spellings. A large amount of Roman Urdu news data is available on mainstream news websites and on social media websites such as Facebook and Twitter, but meaningful information can only be extracted if the data is in a structured format. We have developed a Roman Urdu news headline classifier, which helps to classify news into relevant categories on which further analysis and modeling can be done. The classifier assigns news to five categories (health, business, technology, sports, international). First, we develop the news dataset using scraping tools; then, after preprocessing, we compare the results of different machine learning algorithms: Logistic Regression (LR), Multinomial Naïve Bayes (MNB), Long Short-Term Memory (LSTM), and Convolutional Neural Network (CNN). After this, we use a phonetic algorithm to control lexical variation and test news from different websites. The preliminary results suggest that a more accurate classification can be accomplished by monitoring noise inside the data and by classifying the news. After applying the above-mentioned machine learning algorithms, the results show that the Multinomial Naïve Bayes classifier gives the best accuracy, 90.17%, which is due to the noise of lexical variation.
Keywords: Roman Urdu; news headline classification; long short-term memory; recurrent neural network; logistic regression; multinomial naïve Bayes; random forest; k neighbor; gradient boosting classifier
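The abstract above mentions using a phonetic algorithm to control lexical variation but does not say which one. As a rough illustration of the general idea only, the sketch below maps spelling variants to a shared key by collapsing repeated letters and dropping vowels after the first letter. This toy key is an assumption made for illustration; it is neither Soundex nor the paper's algorithm, and the function names and sample words are hypothetical.

```python
from collections import defaultdict


def phonetic_key(word):
    """Toy phonetic key: collapse repeated letters, then drop every vowel
    after the first letter. Illustrative only, not a standard algorithm."""
    collapsed = []
    for ch in word.lower():
        # Keep letters only, skipping immediate repeats such as "oo" or "hh".
        if ch.isalpha() and (not collapsed or collapsed[-1] != ch):
            collapsed.append(ch)
    if not collapsed:
        return ""
    head, tail = collapsed[0], collapsed[1:]
    return head + "".join(c for c in tail if c not in "aeiou")


def group_variants(words):
    """Group words that share the same toy phonetic key."""
    groups = defaultdict(list)
    for word in words:
        groups[phonetic_key(word)].append(word)
    return dict(groups)


# Hypothetical Roman Urdu spelling variants.
variant_groups = group_variants(["bohat", "bahut", "khubsurat", "khoobsurat"])
```

Replacing each token by its key before feature extraction would let a classifier treat such variants as one word, at the cost of occasionally merging unrelated words that happen to share a key.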
Urdu Ligature Recognition System:An Evolutionary Approach
9
Authors: Naila Habib Khan, Awais Adnan, Abdul Waheed, Mahdi Zareei, Abdallah Aldosary, Ehab Mahmoud Mohamed. 《Computers, Materials & Continua》, SCIE, EI, 2021, Issue 2, pp. 1347-1367 (21 pages)
Cursive text recognition for Arabic script-based languages like Urdu is extremely complicated due to the script's diverse and complex characteristics. Evolutionary approaches like genetic algorithms have been used in the past for various optimization and pattern recognition tasks, reporting exceptional results. The proposed Urdu ligature recognition system uses a genetic algorithm for optimization and recognition. Overall, the proposed recognition system comprises pre-processing, segmentation, feature extraction, hierarchical clustering, classification rules, and genetic algorithm optimization and recognition. The pre-processing stage removes noise from the sentence images, whereas in segmentation the sentences are segmented into ligature components. Fifteen features are extracted from each of the segmented ligature images. Intra-feature hierarchical clustering is then applied, which results in clustered data. Next, classification rules are used to represent the clustered data. The genetic algorithm performs optimization using multi-level sorting of the clustered data to improve the classification rules used for recognition of Urdu ligatures. Experiments conducted on the benchmark UPTI dataset for the proposed Urdu ligature recognition system yield promising results, achieving a recognition rate of 96.72%.
Keywords: classification rules; genetic algorithm; intra-feature hierarchical clustering; ligature recognition; Urdu script
Recognition of Urdu Handwritten Alphabet Using Convolutional Neural Network (CNN)
10
Authors: Gulzar Ahmed, Tahir Alyas, Muhammad Waseem Iqbal, Muhammad Usman Ashraf, Ahmed Mohammed Alghamdi, Adel A. Bahaddad, Khalid Ali Almarhabi. 《Computers, Materials & Continua》, SCIE, EI, 2022, Issue 11, pp. 2967-2984 (18 pages)
Handwritten character recognition systems are used in every field of life nowadays, including shopping malls, banks, educational institutes, etc. Urdu is the national language of Pakistan, and it is the fourth most spoken language in the world. However, it is still challenging to recognize Urdu handwritten characters owing to their cursive nature. Our paper presents a Convolutional Neural Network (CNN) model for Urdu handwritten alphabet recognition (UHAR) of offline and online characters. Our research contributes an Urdu handwritten dataset (aka UHDS) to empower future work in this field. For offline systems, optical readers are used for extracting the alphabets, while diagonal-based extraction methods are implemented in online systems. Moreover, our research tackles the lack of comprehensive and standard Urdu alphabet datasets for research in the area of Urdu text recognition. To this end, we collected 1000 handwritten samples for each alphabet, a total of 38000 samples, from writers aged 12 to 25 to train our CNN model using online and offline mediums. Subsequently, we carried out detailed experiments for character recognition, as detailed in the results. The proposed CNN model outperformed previously published approaches.
Keywords: Urdu handwritten text recognition; handwritten dataset; convolutional neural network; artificial intelligence; machine learning; deep learning
Negative Polarity Item “Renhe” in Chinese and “Koī” in Hindi
11
Authors: GAO Xirui, GAO Wencheng. 《Sino-US English Teaching》, 2019, Issue 11, pp. 466-475 (10 pages)
This paper conducts a comparative analysis of the negative polarity item “renhe” in Chinese and “koī” in Hindi. In terms of licensing conditions, it is found that both “renhe” and “koī” can be licensed by negative sentences, yes-no interrogative sentences, A-not-A interrogative sentences, and the antecedent clause of a conditional. Both “renhe” in Chinese and “koī” in Hindi are strong negative polarity items (NPIs). The NPI “renhe” can be focalized by adding “ye” or “dou”; in this case, the modified noun phrase is moved from the right to the left of the negative marker, reinforcing the negative effect. The NPI “koī” can also be focalized by adding the modal particle “hī”, but the modified noun phrase is not moved, with the “koī(-bhī)…hī” collocation reinforcing the negative effect.
Keywords: negative polarity item (NPI); “renhe” in Chinese; “koī” in Hindi; comparative study
Support Vector Machine Based Handwritten Hindi Character Recognition and Summarization
12
Authors: Sunil Dhankhar, Mukesh Kumar Gupta, Fida Hussain Memon, Surbhi Bhatia, Pankaj Dadheech, Arwa Mashat. 《Computer Systems Science & Engineering》, SCIE, EI, 2022, Issue 10, pp. 397-412 (16 pages)
In today's digital era, text may be in the form of images. This research aims to deal with this problem by recognizing such text using the support vector machine (SVM). A lot of work has been done on handwritten character recognition for English, but very little on the under-resourced Hindi language. A method is developed for identifying Hindi characters that uses morphology, edge detection, histograms of oriented gradients (HOG), and SVM classes for summary creation. SVM rank employs the summary to extract essential phrases based on paragraph position, phrase position, numerical data, inverted commas, sentence length, and keyword features. The primary goal of the SVM optimization function is to reduce the number of features by eliminating unnecessary and redundant ones. The second goal is to maintain or improve the classification system's performance. The experiment included news articles from various genres, such as Bollywood, politics, and sports. The proposed method's accuracy for Hindi character recognition is 96.97%, which is good compared with baseline approaches, and system-generated summaries are compared to human summaries. The evaluated results show a precision of 72% at a compression ratio of 50% and a precision of 60% at a compression ratio of 25%; in comparison to state-of-the-art methods, this is a decent result.
Keywords: support vector machine (SVM); optimization; precision; Hindi character recognition; optical character recognition (OCR); automatic summarization; compression ratio
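Hindi Translation and Cross-Cultural Interpretation of the View of Life in the Tao Te Ching
13
Author: 田克萍. 《南亚东南亚研究》, 2023, Issue 3, pp. 96-110, 156, 157 (17 pages)
The Tao Te Ching's intense concern with the theme of life resonates with traditional Indian culture and philosophy, which holds that all things have a spirit. This is probably an important reason why the Tao Te Ching has become the most widely and most deeply translated Chinese classic in India. Among the multilingual translations of the Tao Te Ching circulating in India, those in Hindi, an official language, are especially noteworthy. Five Indian translators have produced five Hindi translations. In translating and interpreting three themes closely bound up with life, namely Laozi's views of life and death, of action, and of the body, the Hindi translators, influenced by their differing knowledge structures, cultural backgrounds, and readers' horizons of expectation, went through different processes of creative treason, fusion, dialogue, transformation, and extension. Amid these differences there is a shared tendency to interpret Laozi's thought through the framework of Indian culture by way of geyi (concept matching). This tendency allowed Laozi's view of life to be reconstructed in the Indian cultural space as the Tao Te Ching passed from classical Chinese into modern Hindi. In other words, when Laozi's view of life was reborn in India in Hindi dress, it radiated both Chinese and Indian wisdom; it can be described as a product of the blending of the two cultures' genes and of a willing, mutual approach between Chinese and Indian culture.
Keywords: Tao Te Ching; Hindi; translation and introduction; view of life; cross-cultural interpretation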
Neural Machine Translation Models with Attention-Based Dropout Layer
14
Authors: Huma Israr, Safdar Abbas Khan, Muhammad Ali Tahir, Muhammad Khuram Shahzad, Muneer Ahmad, Jasni Mohamad Zain. 《Computers, Materials & Continua》, SCIE, EI, 2023, Issue 5, pp. 2981-3009 (29 pages)
In bilingual translation, attention-based Neural Machine Translation (NMT) models are used to achieve synchrony between input and output sequences and the notion of alignment. NMT models have obtained state-of-the-art performance for several language pairs. However, there has been little work exploring useful architectures for Urdu-to-English machine translation. We conducted extensive Urdu-to-English translation experiments using Long Short-Term Memory (LSTM), Bidirectional Recurrent Neural Networks (Bi-RNN), Statistical Recurrent Unit (SRU), Gated Recurrent Unit (GRU), Convolutional Neural Network (CNN), and Transformer models. Experimental results show that Bi-RNN and LSTM with an attention mechanism, trained iteratively with a scalable data set, make precise predictions on unseen data. The trained models yielded competitive results, achieving 62.6% and 61% accuracy and 49.67 and 47.14 BLEU scores, respectively. From a qualitative perspective, the translation of the test sets was examined manually, and it was observed that the trained models tend to produce repetitive output more frequently. The attention scores produced by Bi-RNN and LSTM showed clear alignment, while GRU showed incorrect translation of words, poor alignment, and lack of a clear structure. Therefore, we considered refining the attention-based models by defining an additional attention-based dropout layer. Attention dropout fixes alignment errors and minimizes translation errors at the word level. After empirical demonstration and comparison with their counterparts, we found improvement in the quality of the resulting translation system and a decrease in the perplexity and over-translation score. The ability of the proposed model was evaluated using Arabic-English and Persian-English datasets as well. We empirically concluded that adding an attention-based dropout layer helps improve GRU, SRU, and Transformer translation and is considerably more efficient in translation quality and speed.
Keywords: natural language processing; neural machine translation; word embedding; attention; perplexity; selective dropout; regularization; Urdu; Persian; Arabic; BLEU
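Research Progress in Hindi Natural Language Processing
15
Authors: 王连喜, 林楠铠, 蒋盛益, 邓致妍. 《中文信息学报》, CSCD, PKU Core, 2023, Issue 5, pp. 53-69 (17 pages)
Compared with Western languages, Hindi is a low-resource language of the Southeast Asian region. Owing to the lack of corresponding corpora, annotation standards, and computational models, Hindi natural language processing has so far received little attention, and cutting-edge methods from research on widely used languages do not transfer well. Based on a literature survey and bibliometric analysis, this paper reviews research progress in Hindi natural language processing in basic resource construction, part-of-speech tagging, named entity recognition, syntactic parsing, word sense disambiguation, information retrieval, machine translation, sentiment analysis, and automatic summarization. Finally, it identifies problems and challenges the field may face and looks ahead to future trends.
Keywords: Hindi; natural language processing; low-resource language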
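An Analysis of Indian Nationalist Consciousness in the Second Half of the 19th Century: The Case of Bharatendu's Plays
16
Author: 姜景奎. 《南亚东南亚研究》, 2023, Issue 6, pp. 121-134, 157 (15 pages)
Britain began occupying and ruling India in the early 18th century and had achieved complete colonization by the mid-19th century. As British colonization deepened, nationalist consciousness emerged in India in the second half of the 19th century. Given India's particular history, this consciousness was not singular. First, because of the actual presence of the British colonizers, a nationalist consciousness arose in which the people of the subcontinent as a whole opposed the British; it reflected the opposition between the people of the subcontinent and the colonizers. Second, because Islamic peoples had historically occupied and ruled the subcontinent, a Hindu nationalist consciousness also arose in this period, reflecting the opposition between Hindu and Muslim communities on the subcontinent. Third, because of the still earlier arrival and occupation by the Aryans, an Aryan nationalist consciousness also appeared in the second half of the 19th century, reflecting the opposition between Aryan and Dravidian communities. All three kinds of nationalist consciousness persist in contemporary India: Indian nationalist consciousness is reflected in the phenomenon of "decolonization"; Hindu nationalist consciousness is reflected in the work of Hindu organizations and the current ruling party, the Bharatiya Janata Party, on "building a Hindu nation-state"; and Aryan nationalist consciousness is reflected in the antagonistic words and actions between northern and southern communities since independence. Bharatendu was one of the most famous Hindi writers and social activists of the second half of the 19th century and is known as the father of modern Hindi drama. His plays reflect precisely these three kinds of nationalist consciousness; they are a complete and concentrated narrative of Indian nationalist consciousness in the second half of the 19th century and are highly representative.
Keywords: Indian nationalism; Hindu nationalism; Aryan nationalism; Hindi drama; Bharatendu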
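Kabir Texts and the Early Modernization of Indian Culture
17
Author: 张忞煜. 《国家现代化建设研究》, 2023, Issue 2, pp. 144-160 (17 pages)
Since the European colonization of India, the establishment of colonial rule has been regarded as the beginning of India's modernization. Although research on early modern Indian history, especially economic history, has reconsidered the narrative of modernization centered on European colonization, insufficient research on the modernization of indigenous culture in the precolonial era still constrains the study of India's precolonial modernization. This article focuses on early modern Hindi texts associated with the weaver Kabir. By re-examining their ideas in the dual context of the development of the early modern Indian textile industry and the mystical literary tradition of Indian textile workers, it looks for historical traces of a modern intellectual culture, bearing the imprint of a new era, that grew out of the precolonial Indian urban textile industry. The aim is to step outside the single "global modernization narrative" that the capitalist world system shaped through colonial knowledge systems as it formed a "center-periphery" structure, and to rethink the early modernization practices of a world of plural civilizations.
Keywords: early modern Indian history; Indian modernization; indigenous modernity; Hindi literature; Kabir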
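An Analysis of Language and Ethnic Relations in Pakistan (Cited by: 10)
18
Authors: 满在江, 谢妍, 艾佳. 《徐州师范大学学报(哲学社会科学版)》, PKU Core, 2011, Issue 3, pp. 16-20 (5 pages)
Pakistan is a multilingual country with two official languages. Urdu, one of them, is spoken as a mother tongue by very few people, yet it is very widely used as a language of communication. The reasons for this are closely tied to Islamic national identity. The establishment of Urdu as the national language was an inevitable result of Muslim nationalist consciousness, and it is also related to the language policy adopted by the ruling class to maintain national unity. That English and Urdu are Pakistan's two official languages is a result of colonial rule, while the fact that Urdu has so far been unable to replace English completely is closely linked to social development, especially the globalization of English. The former conflict between Urdu and Bengali and the eventual secession of East Pakistan further show the important role language plays in the awakening of national consciousness. Urdu is a product of Pakistani national consciousness; at the same time, as a symbol of Islamic national identity, it has promoted and strengthened national cohesion and, to some extent, eased the disagreements and conflicts over national unity and religious belief in a multi-ethnic, multilingual country.
Keywords: Pakistan; Urdu; national identity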
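An Eye-Movement Study of the Effect of Word Segmentation on Reading in Hindi-English Bilinguals (Cited by: 2)
19
Authors: 白学军, 郭志英, 曹玉肖, 顾俊娟, 闫国利, 臧传丽. 《心理科学》, CSSCI, CSCD, PKU Core, 2012, Issue 3, pp. 544-549 (6 pages)
With 25 Hindi-English bilinguals as participants and an EyeLink 2000 eye tracker, this study examined the effect of word segmentation on reading in Hindi-English bilinguals. In Experiment 1 participants read Hindi sentences, and in Experiment 2 they read English sentences. There were three word segmentation conditions: the normal condition, an unspaced condition with alternating words shaded gray, and an unspaced condition. The results showed that: (1) when Hindi-English bilinguals read their two official languages, spaces play a positive role, and removing them clearly affects reading; (2) in the unspaced-with-shading and unspaced conditions, the drop in reading speed was clearly larger for English than for Hindi, which indicates that the effect of spaces is constrained by the characteristics of the language.
Keywords: word segmentation; Hindi; English; eye movement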
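Cultural Connotations and Translation of Animal Words in Hindi and Chinese (Cited by: 9)
20
Author: 任飞. 《解放军外国语学院学报》, PKU Core, 2003, Issue 1, pp. 77-81 (5 pages)
Both Hindi and Chinese contain a large number of animal-related words, many of which have gradually acquired distinctive metaphorical meanings. Analyzing the cultural connotations of Hindi and Chinese animal words, comparing their metaphorical meanings, and discussing translation methods helps improve translation accuracy and promotes cultural exchange between China and India.
Keywords: Hindi; Chinese; animal words; metaphorical meaning; translation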