Abstract: Query translation mining is a key technique in cross-language information retrieval and in knowledge acquisition for machine translation. For better performance, queries are classified into transliterated and non-transliterated words using a transliterated word identification model, and are then channeled to different mining processes. This paper presents a pilot study of query classification, based on supervised classification and linguistic heuristics, aimed at improving translation mining performance. Person name identification achieves a precision of over 97%. Transliterated word translation mining shows satisfactory performance.
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 61672368, 61373097, 61672367, 61331011), the Research Foundation of the Ministry of Education and China Mobile (MCM20150602), and the Natural Science Foundation of Jiangsu (BK20151222).
Abstract: We study implicit discourse relation detection, one of the most challenging tasks in discourse analysis. We focus on ambiguous implicit discourse relations, an imperceptible linguistic phenomenon that is therefore difficult to identify and eliminate. In this paper, we first define a novel task, implicit discourse relation disambiguation (IDRD). Second, we propose a focus-sensitive relation disambiguation model that affirms a relation as truly correct when it is triggered by focal sentence constituents. In addition, we develop a topic-driven focus identification method and a relation search system (RSS) to support relation disambiguation. Finally, we improve current relation detection systems by using the disambiguation model. Experiments on the Penn Discourse Treebank (PDTB) show promising improvements.