
Vietnamese-English Cross Language Query Post-Translation Expansion Based on All-Weighted Positive and Negative Association Patterns Mining (cited by: 11)
Abstract: Topic drift and word mismatch are difficult problems in natural language processing, and combining text mining with information retrieval can help to address them. This paper therefore proposes a Vietnamese-English cross language (VECL) query post-translation expansion algorithm based on all-weighted positive and negative association pattern mining. The algorithm uses a new method for computing the support and correlation degree of all-weighted positive and negative itemsets, together with a pattern evaluation framework, to mine positive and negative association patterns related to the original query terms from the set of user relevance feedback documents drawn from the VECL first-pass retrieval results. Expansion terms are then extracted from these patterns to carry out VECL query post-translation expansion. Compared with existing cross language query expansion algorithms based on pseudo relevance feedback and on weighted association pattern mining, the proposed algorithm effectively reduces query topic drift and word mismatch and improves cross language information retrieval performance. The pattern mining method can also be applied to recommender systems to improve their accuracy.
Authors: HUANG Ming-xuan (黄名选); JIANG Cao-qing (蒋曹清) (Guangxi Key Laboratory Cultivation Base of Cross-border E-commerce Intelligent Information Processing, Guangxi University of Finance and Economics, Nanning, Guangxi 530003, China; School of Information and Statistics, Guangxi University of Finance and Economics, Nanning, Guangxi 530003, China)
Source: Acta Electronica Sinica (《电子学报》), 2018, No. 12, pp. 3029-3036 (8 pages). Indexed in EI, CAS, CSCD, and the Peking University Core Journals list.
Funding: National Natural Science Foundation of China (No. 61762006, No. 61662003, No. 61262028)
Keywords: natural language processing; information retrieval; text mining; pattern mining; query expansion; recommender system
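The abstract describes the pipeline only in outline: compute support and correlation for weighted itemsets over the feedback documents, mine positive and negative patterns related to the query terms, and take expansion terms from those patterns. The following is a minimal sketch of that general idea and not the paper's algorithm. The formulas are assumptions chosen for illustration, since the abstract does not give the authors' definitions: `weighted_support` sums term weights over documents containing the whole itemset, `lift` is ordinary document-level lift standing in for the correlation degree, and all function names, thresholds, and sample data are hypothetical.

```python
from itertools import chain


def weighted_support(itemset, docs, total_weight):
    """Assumed definition: sum of the itemset's term weights over the
    documents that contain every term, divided by the total weight."""
    hit = sum(
        sum(doc[t] for t in itemset)
        for doc in docs
        if all(t in doc for t in itemset)
    )
    return hit / total_weight


def lift(a, b, docs):
    """Document-level lift, used here as a stand-in correlation measure."""
    n = len(docs)
    p_a = sum(a in d for d in docs) / n
    p_b = sum(b in d for d in docs) / n
    p_ab = sum(a in d and b in d for d in docs) / n
    return p_ab / (p_a * p_b) if p_a and p_b else 0.0


def expansion_terms(query_terms, docs, min_sup=0.1):
    """Return (term, support) pairs from positive patterns, after dropping
    any term that is negatively associated with some query term."""
    total_weight = sum(sum(d.values()) for d in docs)
    vocab = set(chain.from_iterable(docs)) - set(query_terms)
    positive, negative = {}, set()
    for q in query_terms:
        for t in vocab:
            corr = lift(q, t, docs)
            if corr > 1.0:  # positive association q -> t
                sup = weighted_support((q, t), docs, total_weight)
                if sup >= min_sup:
                    positive[t] = max(positive.get(t, 0.0), sup)
            elif corr < 1.0:  # negative association q -> not t
                negative.add(t)
    return sorted(
        ((t, s) for t, s in positive.items() if t not in negative),
        key=lambda pair: pair[1],
        reverse=True,
    )


# Hypothetical feedback documents: each maps a term to its weight.
docs = [
    {"retrieval": 0.9, "query": 0.8, "expansion": 0.7},
    {"retrieval": 0.6, "expansion": 0.5, "feedback": 0.4},
    {"retrieval": 0.7, "expansion": 0.6},
    {"football": 0.9, "match": 0.8},
    {"football": 0.7, "feedback": 0.3},
]
result = expansion_terms(["retrieval"], docs, min_sup=0.1)
print([term for term, _ in result])
```

On this toy data the terms that co-occur with "retrieval" more often than chance are kept, while "feedback", "football", and "match" fall into the negative set and are excluded, which is the role the abstract assigns to negative patterns in limiting topic drift.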
