Abstract
Joint learning of words and entities benefits various NLP tasks, yet most existing work focuses on the monolingual setting. Cross-lingual representation learning has received considerable attention recently, but it remains restricted by the availability of parallel data. In this paper, a method is proposed to jointly embed texts and entities using comparable data. In addition to evaluation on public semantic textual similarity datasets, a new task (cross-lingual text extraction) is proposed to assess similarities between texts, and a dataset is contributed for it. Experiments show that the proposed method outperforms cross-lingual representation methods that rely on parallel data on cross-lingual tasks, and achieves competitive results on monolingual tasks.
Source
《国际计算机前沿大会会议论文集》
2019, Issue 1, pp. 436-440 (5 pages)
International Conference of Pioneering Computer Scientists, Engineers and Educators(ICPCSEE)