Journal Articles
9 articles found
1. Cross-Lingual Non-Ferrous Metals Related News Recognition Method Based on CNN with A Limited Bi-Lingual Dictionary (Cited by: 2)
Authors: Xudong Hong, Xiao Zheng, Jinyuan Xia, Linna Wei. Computers, Materials & Continua, SCIE/EI, 2019, No. 2, pp. 379-389 (11 pages)
To acquire non-ferrous metals related news from the internet of different countries, we propose a cross-lingual non-ferrous metals related news recognition method based on a CNN with a limited bilingual dictionary. Firstly, considering the lack of language resources related to non-ferrous metals, we use a limited bilingual dictionary and CCA to learn cross-lingual word vectors and to represent news in different languages uniformly. Then, to improve recognition, we use a variant of the CNN to learn recognition features and construct the recognition model. The experimental results show that our proposed method achieves better results.
Keywords: Non-ferrous metal; CNN; cross-lingual text classification; word vector
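The core idea above is representing news in different languages in one shared space before classification. The sketch below replaces the paper's CCA-learned word vectors and CNN with a much simpler stand-in (a bilingual-dictionary projection onto English pivot tokens plus cosine scoring against a topic centroid); the dictionary entries, vocabulary, and topic vector are all invented for illustration.

```python
from math import sqrt

# Hypothetical limited Chinese -> English bilingual dictionary. The paper
# learns CCA-aligned word vectors; here we simply map both languages onto
# shared English pivot tokens to illustrate the uniform representation idea.
BI_DICT = {"铜": "copper", "铝": "aluminum", "价格": "price", "足球": "football"}

def to_pivot_tokens(tokens, bi_dict):
    """Map tokens of either language onto the shared pivot vocabulary."""
    return [bi_dict.get(t, t) for t in tokens]

def bow(tokens, vocab):
    """Bag-of-words count vector over a fixed vocabulary."""
    return [tokens.count(w) for w in vocab]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu, nv = sqrt(sum(a * a for a in u)), sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

VOCAB = ["copper", "aluminum", "price", "football", "match"]
# Toy "non-ferrous metals" topic centroid (stand-in for the trained CNN).
TOPIC = bow(["copper", "aluminum", "price"], VOCAB)

zh_news = to_pivot_tokens(["铜", "价格"], BI_DICT)  # Chinese headline
en_news = ["football", "match"]                      # English headline

zh_score = cosine(bow(zh_news, VOCAB), TOPIC)
en_score = cosine(bow(en_news, VOCAB), TOPIC)
```

Despite different source languages, both headlines are scored in the same space, which is the property the limited dictionary buys.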
2. Knowledge-Enhanced Bilingual Textual Representations for Cross-Lingual Semantic Textual Similarity
Authors: Hsuehkuan Lu, Yixin Cao, Hou Lei, Juanzi Li. 《国际计算机前沿大会会议论文集》, 2019, No. 1, pp. 436-440 (5 pages)
Joint learning of words and entities is advantageous to various NLP tasks, while most prior work focuses on the single-language setting. Cross-lingual representation learning has received considerable attention recently, but is still restricted by the availability of parallel data. In this paper, a method is proposed to jointly embed texts and entities on comparable data. In addition to evaluation on public semantic textual similarity datasets, a task (cross-lingual text extraction) is proposed to assess the similarities between texts and contribute to this dataset. The results show that the proposed method outperforms cross-lingual representation methods that use parallel data on cross-lingual tasks, and achieves competitive results on monolingual tasks.
Keywords: Text and knowledge representations; cross-lingual representations; cross-lingual semantic textual similarity
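Why does jointly embedding texts and entities help on comparable (non-parallel) data? Two articles in different languages share no surface words, but can link to the same knowledge-base entities. The sketch below illustrates that intuition with a blend of word overlap and entity overlap; the field names, placeholder entity IDs, and the 0.5 blend weight are illustrative, not the paper's actual embedding model.

```python
def jaccard(a, b):
    """Jaccard overlap of two collections treated as sets."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def similarity(text_a, text_b, alpha=0.5):
    """Blend surface word overlap with overlap of linked entity IDs."""
    word_sim = jaccard(text_a["words"], text_b["words"])
    entity_sim = jaccard(text_a["entities"], text_b["entities"])
    return alpha * word_sim + (1 - alpha) * entity_sim

# Comparable article pair: no shared surface words across languages,
# but both mention the same (placeholder) entity.
en = {"words": ["jaguar", "speed"], "entities": ["jaguar_animal"]}
zh = {"words": ["美洲豹", "速度"], "entities": ["jaguar_animal"]}
unrelated = {"words": ["jaguar", "car"], "entities": ["jaguar_car"]}
```

Word overlap alone would rate the cross-lingual pair at zero; the shared entity recovers the semantic match.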
3. Enhancing low-resource cross-lingual summarization from noisy data with fine-grained reinforcement learning
Authors: Yuxin Huang, Huailing Gu, Zhengtao Yu, Yumeng Gao, Tong Pan, Jialong Xu. Frontiers of Information Technology & Electronic Engineering, SCIE/EI/CSCD, 2024, No. 1, pp. 121-134 (14 pages)
Cross-lingual summarization (CLS) is the task of generating a summary in a target language from a document in a source language. Recently, end-to-end CLS models have achieved impressive results using large-scale, high-quality datasets typically constructed by translating monolingual summary corpora into CLS corpora. However, due to the limited performance of low-resource language translation models, translation noise can seriously degrade the performance of these models. In this paper, we propose a fine-grained reinforcement learning approach to address low-resource CLS based on noisy data. We introduce the source language summary as a gold signal to alleviate the impact of the translated noisy target summary. Specifically, we design a reinforcement reward by calculating the word correlation and word missing degree between the source language summary and the generated target language summary, and combine it with cross-entropy loss to optimize the CLS model. To validate the performance of our proposed model, we construct Chinese-Vietnamese and Vietnamese-Chinese CLS datasets. Experimental results show that our proposed model outperforms the baselines in terms of both the ROUGE score and BERTScore.
Keywords: Cross-lingual summarization; low-resource language; noisy data; fine-grained reinforcement learning; word correlation; word missing degree
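The reward described above can be sketched concretely. The paper's exact formulas are not given in the abstract, so `word_correlation`, `word_missing_degree`, the reward combination, and the mixing weight `lam` below are simplified stand-ins that only illustrate the shape of the idea (reward high overlap with the clean source-side summary, penalize missing content, and mix with cross-entropy).

```python
def word_correlation(reference, generated):
    """Fraction of generated words also in the (source-language) reference
    summary -- a precision-style correlation proxy."""
    ref = set(reference)
    return sum(w in ref for w in generated) / len(generated) if generated else 0.0

def word_missing_degree(reference, generated):
    """Fraction of reference words absent from the generated summary."""
    gen = set(generated)
    return sum(w not in gen for w in reference) / len(reference) if reference else 0.0

def reward(reference, generated):
    return word_correlation(reference, generated) - word_missing_degree(reference, generated)

def training_loss(ce_loss, reference, generated, lam=0.5):
    """Combine cross-entropy with the negated reward; `lam` is a guessed weight."""
    return (1 - lam) * ce_loss + lam * (1.0 - reward(reference, generated))

ref = ["copper", "price", "rose"]       # clean source-language summary
good = ["copper", "price", "rose"]      # faithful generated summary
noisy = ["copper", "banana"]            # summary corrupted by translation noise
```

A noisy hypothesis both loses correlation and raises the missing degree, so the combined loss pushes the model back toward the gold source-side signal.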
4. Investigation of Knowledge Transfer Approaches to Improve the Acoustic Modeling of Vietnamese ASR System (Cited by: 4)
Authors: Danyang Liu, Ji Xu, Pengyuan Zhang, Yonghong Yan. IEEE/CAA Journal of Automatica Sinica, SCIE/EI/CSCD, 2019, No. 5, pp. 1187-1195 (9 pages)
It is well known that automatic speech recognition (ASR) is a resource-consuming task. It takes a sufficient amount of data to train a state-of-the-art deep neural network acoustic model. For some low-resource languages where scripted speech is difficult to obtain, data sparsity is the main problem that limits the performance of speech recognition systems. In this paper, several knowledge transfer methods are investigated to overcome the data sparsity problem with the help of high-resource languages. The first is a pre-training and fine-tuning (PT/FT) method, in which the parameters of hidden layers are initialized with a well-trained neural network. Secondly, progressive neural networks (Prognets) are investigated. With the help of lateral connections in the network architecture, Prognets are immune to the forgetting effect and superior in knowledge transfer. Finally, bottleneck features (BNF) are extracted using cross-lingual deep neural networks and serve as enhanced features to improve the performance of the ASR system. Experiments are conducted on a low-resource Vietnamese dataset. The results show that all three methods yield significant gains over the baseline system, and the Prognets acoustic model performs best. Further improvements can be obtained by combining the Prognets model and bottleneck features.
Keywords: Bottleneck feature (BNF); cross-lingual; automatic speech recognition (ASR); progressive neural networks (Prognets); model transfer learning
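The PT/FT idea in this abstract is mechanically simple: copy the hidden-layer parameters from a well-trained high-resource model and re-initialize only the output layer, whose targets differ in the new language. The toy parameter dictionaries and layer names below are invented for illustration; real acoustic models would use a deep learning framework rather than nested lists.

```python
import random

def init_with_pretrained(source_params, target_output_dim, seed=0):
    """PT/FT sketch: copy hidden-layer weights from a source-language model,
    re-initialize only the output layer for the target language's units."""
    rng = random.Random(seed)
    # Deep-copy every layer except the output layer.
    target = {name: [row[:] for row in w]
              for name, w in source_params.items() if name != "output"}
    hidden_dim = len(source_params["output"][0])
    # Fresh, small random output weights for the new target inventory.
    target["output"] = [[rng.gauss(0.0, 0.01) for _ in range(hidden_dim)]
                        for _ in range(target_output_dim)]
    return target

# Toy "well-trained" source acoustic model: two hidden layers + output layer.
source = {"h1": [[1.0, 2.0]], "h2": [[3.0, 4.0]], "output": [[5.0, 6.0]]}
target = init_with_pretrained(source, target_output_dim=3)
```

Fine-tuning then proceeds on the (small) target-language data, starting from these transferred hidden layers instead of a random initialization.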
5. Multi-Level Cross-Lingual Attentive Neural Architecture for Low Resource Name Tagging (Cited by: 2)
Authors: Xiaocheng Feng, Lifu Huang, Bing Qin, Ying Lin, Heng Ji, Ting Liu. Tsinghua Science and Technology, SCIE/EI/CAS/CSCD, 2017, No. 6, pp. 633-645 (13 pages)
Neural networks have been widely used for English name tagging and have delivered state-of-the-art results. However, for low resource languages, due to the limited resources and lack of training data, taggers tend to have lower performance in comparison to English. In this paper, we tackle this challenging issue by incorporating multi-level cross-lingual knowledge as attention into a neural architecture, which guides low resource name tagging to achieve better performance. Specifically, we regard entity type distribution as language independent and use bilingual lexicons to bridge cross-lingual semantic mapping. Then, we jointly apply word-level cross-lingual mutual influence and entity-type level monolingual word distributions to enhance low resource name tagging. Experiments on three languages demonstrate the effectiveness of this neural architecture: for Chinese, Uzbek, and Turkish, we are able to yield significant improvements in name tagging over all previous baselines.
Keywords: Name tagging; deep learning; recurrent neural network; cross-lingual information extraction
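The key premise above is that entity type distributions are language independent, so a bilingual lexicon can carry them into the low-resource language as attention signals. The sketch below shows only that bridging step with softmax-normalized scores; the prior values and lexicon entries are invented, and the paper's actual architecture is a trained neural attention, not a fixed lookup.

```python
from math import exp

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    es = [exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

# Hypothetical language-independent prior: how strongly an English pivot
# word signals a PERSON entity (values made up for illustration).
PERSON_PRIOR = {"president": 2.0, "said": 0.5, "the": 0.0}
LEXICON = {"总统": "president", "说": "said"}  # bilingual lexicon bridge

def attention_weights(tokens):
    """Score each low-resource token via its translation's type prior,
    then normalize with softmax."""
    scores = [PERSON_PRIOR.get(LEXICON.get(t, t), 0.0) for t in tokens]
    return softmax(scores)

weights = attention_weights(["总统", "说", "the"])
```

The tagger can then attend most to tokens whose translations carry strong entity-type evidence, even with little target-language training data.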
6. The Application of the Comparable Corpora in Chinese-English Cross-Lingual Information Retrieval
Authors: Du Lin (杜林), Zhang Yibo (张毅波), Sun Le (孙乐), Sun Yufang (孙玉芳). Journal of Computer Science & Technology, SCIE/EI/CSCD, 2001, No. 4, pp. 351-358 (8 pages)
This paper proposes a novel Chinese-English Cross-Lingual Information Retrieval (CECLIR) model, PME, in which a bilingual dictionary and comparable corpora are used to translate the query terms. The proximity and mutual information of the term-pairs in the Chinese and English comparable corpora are employed not only to resolve translation ambiguities but also to perform query expansion so as to deal with out-of-vocabulary issues in CECLIR. The evaluation results show that the query precision of the PME algorithm is about 84.4% of that of monolingual information retrieval.
Keywords: Cross-lingual information retrieval; comparable corpus; mutual information; query expansion
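Mutual information over comparable corpora resolves translation ambiguity by asking which candidate translation co-occurs most strongly with the other query terms. A minimal pointwise mutual information (PMI) version, with invented co-occurrence counts, might look like this; the paper's PME model also uses proximity, which is omitted here.

```python
from math import log2

def pmi(pair_count, count_a, count_b, total):
    """Pointwise mutual information of a term pair from co-occurrence counts."""
    p_ab = pair_count / total
    p_a, p_b = count_a / total, count_b / total
    return log2(p_ab / (p_a * p_b)) if p_ab > 0 else float("-inf")

# Disambiguating a translation of 银行 ("bank"): does the other query term
# ("interest rate") co-occur more with the finance sense or the river sense?
# All counts below are invented for illustration.
TOTAL = 10_000
candidates = {
    "bank":  pmi(pair_count=50, count_a=200, count_b=300, total=TOTAL),
    "shore": pmi(pair_count=1,  count_a=200, count_b=150, total=TOTAL),
}
best = max(candidates, key=candidates.get)
```

The same statistic, applied to strongly associated terms, also suggests expansion terms for out-of-vocabulary query words.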
7. Cross-Language Information Extraction and Auto Evaluation for OOV Term Translations
Authors: Jian Qu, Le Minh Nguyen, Akira Shimazu. China Communications, SCIE/CSCD, 2016, No. 12, pp. 277-296 (20 pages)
OOV term translation plays an important role in natural language processing. Although many researchers have endeavored to solve the OOV term translation problem, no existing method offers definition or context information for OOV terms, and none focuses on cross-language definition retrieval for OOV terms. Moreover, it has always been difficult to evaluate the correctness of an OOV term translation without domain-specific knowledge and correct references. Our English definition ranking method differentiates the types of OOV terms and applies different methods for translation extraction. It also extracts multilingual context information and monolingual definitions of OOV terms. In addition, we propose a novel cross-language definition retrieval system for OOV terms, as well as an auto re-evaluation method to assess the correctness of OOV translations and definitions. Our methods achieve high performance compared with existing methods.
Keywords: Term translation; multilingual information retrieval; definition extraction; cross-lingual definition extraction; auto re-evaluation
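Definition ranking for an OOV term can be pictured as scoring each candidate definition against the term's observed context. The overlap-count scorer below is a crude stand-in for the paper's ranking method, and the example term, context words, and definitions are all invented.

```python
def rank_definitions(context, definitions):
    """Rank candidate definitions by how many context words they share."""
    ctx = set(context)
    scored = [(len(ctx & set(d.lower().split())), d) for d in definitions]
    scored.sort(key=lambda p: p[0], reverse=True)
    return [d for _, d in scored]

# Hypothetical OOV term "BERT", with context words gathered from
# retrieved pages mentioning it.
context = ["language", "model", "pretraining", "transformer"]
definitions = [
    "a pretrained transformer language model",
    "a character from sesame street",
]
ranked = rank_definitions(context, definitions)
```

With multilingual context, the same scoring would be applied after mapping context words through a dictionary, which is what makes the retrieval cross-language.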
8. Towards Realizing Mandarin-Tibetan Bi-lingual Emotional Speech Synthesis with Mandarin Emotional Training Corpus
Authors: Peiwen Wu, Hongwu Yang, Zhenye Gan. 《国际计算机前沿大会会议论文集》, 2017, No. 2, pp. 29-32 (4 pages)
This paper presents a method of hidden Markov model (HMM)-based Mandarin-Tibetan bi-lingual emotional speech synthesis by speaker adaptive training with a Mandarin emotional speech corpus. A one-speaker Tibetan neutral speech corpus, a multi-speaker Mandarin neutral speech corpus and a multi-speaker Mandarin emotional speech corpus are first employed to train a set of mixed-language average acoustic models of the target emotion by using speaker adaptive training. Then a one-speaker Mandarin neutral speech corpus or a one-speaker Tibetan neutral speech corpus is adopted to obtain a set of speaker-dependent acoustic models of the target emotion by using the speaker adaptation transformation. The Mandarin or Tibetan emotional speech is finally synthesized from the corresponding speaker-dependent acoustic models of the target emotion. Subjective tests show that the average emotional mean opinion score is 4.14 for Tibetan and 4.26 for Mandarin. The average mean opinion score is 4.16 for Tibetan and 4.28 for Mandarin. The average degradation opinion score is 4.28 for Tibetan and 4.24 for Mandarin. Therefore, the proposed method can synthesize both Tibetan and Mandarin speech with high naturalness and emotional expression using only a Mandarin emotional training speech corpus.
Keywords: Mandarin-Tibetan cross-lingual emotional speech synthesis; hidden Markov model (HMM); speaker adaptive training; Mandarin-Tibetan cross-lingual speech synthesis
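The speaker adaptation transformation mentioned above maps an average-voice model toward a target speaker's acoustics. Real HMM synthesis uses MLLR/CMLLR-style linear transforms estimated per regression class; the sketch below collapses that to a single global mean offset, purely to make the direction of the operation concrete, with toy two-dimensional "features".

```python
def adapt_means(average_means, adaptation_frames):
    """Shift the average-voice model's Gaussian means by the global offset
    between the adaptation data and the model -- a heavily simplified
    stand-in for MLLR-style speaker adaptation."""
    n_dim = len(average_means[0])
    model_centroid = [sum(m[d] for m in average_means) / len(average_means)
                      for d in range(n_dim)]
    data_centroid = [sum(f[d] for f in adaptation_frames) / len(adaptation_frames)
                     for d in range(n_dim)]
    bias = [dc - mc for dc, mc in zip(data_centroid, model_centroid)]
    return [[m[d] + bias[d] for d in range(n_dim)] for m in average_means]

avg = [[0.0, 0.0], [2.0, 2.0]]    # average-voice Gaussian means (toy)
frames = [[1.0, 3.0], [3.0, 5.0]]  # target speaker's adaptation features (toy)
adapted = adapt_means(avg, frames)
```

The emotional coloring learned from the multi-speaker Mandarin corpus survives this shift, which is why only a neutral corpus of the target speaker (or language) is needed.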
9. Multimodal Pretraining from Monolingual to Multilingual
Authors: Liang Zhang, Ludan Ruan, Anwen Hu, Qin Jin. Machine Intelligence Research, EI/CSCD, 2023, No. 2, pp. 220-232 (13 pages)
Multimodal pretraining has made convincing achievements in various downstream tasks in recent years. However, since the majority of existing works construct models based on English, their applications are limited by language. In this work, we address this issue by developing models with multimodal and multilingual capabilities. We explore two types of methods to extend a multimodal pretraining model from monolingual to multilingual. Specifically, we propose a pretraining-based model named multilingual multimodal pretraining (MLMM), and two generalization-based models named multilingual CLIP (M-CLIP) and multilingual acquisition (MLA). In addition, we further extend the generalization-based models to incorporate the audio modality and develop the multilingual CLIP for vision, language, and audio (CLIP4VLA). Our models achieve state-of-the-art performance on multilingual vision-text retrieval, visual question answering, and image captioning benchmarks. Based on the experimental results, we discuss the pros and cons of the two types of models and their potential practical applications.
Keywords: Multilingual pretraining; multimodal pretraining; cross-lingual transfer; multilingual generation; cross-modal retrieval
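The cross-modal retrieval benchmark mentioned above reduces, at inference time, to ranking text embeddings by similarity to an image embedding in a shared space, regardless of the caption's language. The sketch below shows that CLIP-style retrieval step with hand-made three-dimensional vectors; the embeddings are toy values, not outputs of any of the models named in the abstract.

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))

def retrieve(image_vec, captions):
    """Rank captions by cosine similarity to the image embedding."""
    return sorted(captions, key=lambda c: cosine(image_vec, captions[c]),
                  reverse=True)

image = [0.9, 0.1, 0.0]  # toy image embedding
captions = {
    "a cat on a mat":   [0.80, 0.20, 0.10],  # English caption
    "一只猫在垫子上":    [0.85, 0.15, 0.05],  # Chinese caption, same meaning
    "a red sports car": [0.00, 0.10, 0.90],
}
ranked = retrieve(image, captions)
```

Because both the English and Chinese cat captions land near the cat image in the shared space, retrieval works across languages without translating the query.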