
A Supervised Learning Approach to Search of Definitions (Cited by: 1)

Abstract: This paper addresses the issue of search of definitions. Specifically, for a given term, we are to find its definition candidates and rank the candidates according to their likelihood of being good definitions. This is in contrast to the traditional methods of either generating a single combined definition or outputting all retrieved definitions. Definition ranking is essential for such tasks. A specification for judging the goodness of a definition is given. In the specification, a definition is categorized into one of three levels: good definition, indifferent definition, or bad definition. Methods of performing definition ranking are also proposed in this paper, which formalize the problem as either classification or ordinal regression. We employ SVM (Support Vector Machines) as the classification model and Ranking SVM as the ordinal regression model, and use them to rank definition candidates according to their likelihood of being good definitions. Features for constructing the SVM and Ranking SVM models are defined, which represent the characteristics of the term, the definition candidate, and their relationship. Experimental results indicate that SVM and Ranking SVM can significantly outperform baseline methods such as heuristic rules, the conventional information retrieval method Okapi, or SVM regression. This holds both when the answers are paragraphs and when they are sentences. Experimental results also show that SVM or Ranking SVM models trained in one domain can be adapted to another domain, indicating that generic models for definition ranking can be constructed.
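The ordinal-regression formulation mentioned in the abstract can be illustrated with a minimal sketch. It assumes a linear Ranking SVM trained by subgradient descent on a pairwise hinge loss, with preference pairs built only among candidates of the same term. The three features, the toy data, the term names, and the hyperparameters are hypothetical and are not taken from the paper.

```python
import random

# Three ordinal levels from the specification: good > indifferent > bad.
GOOD, INDIFFERENT, BAD = 2, 1, 0

# Hypothetical features per candidate:
# [term appears at start, contains an "is a" cue, relative length].
training = {
    "term_a": [([1.0, 1.0, 0.2], GOOD),
               ([1.0, 0.0, 0.5], INDIFFERENT),
               ([0.0, 0.0, 0.9], BAD)],
    "term_b": [([1.0, 1.0, 0.1], GOOD),
               ([0.0, 1.0, 0.6], INDIFFERENT),
               ([0.0, 0.0, 0.8], BAD)],
}

def preference_pairs(data):
    """Return difference vectors x_i - x_j for same-term pairs with label_i > label_j."""
    pairs = []
    for candidates in data.values():
        for xi, yi in candidates:
            for xj, yj in candidates:
                if yi > yj:
                    pairs.append([a - b for a, b in zip(xi, xj)])
    return pairs

def train_ranking_svm(pairs, dim, epochs=200, lr=0.1, reg=0.01, seed=0):
    """Subgradient descent on reg/2 * ||w||^2 + hinge(1 - w . d) over the pairs."""
    rng = random.Random(seed)
    pairs = list(pairs)
    w = [0.0] * dim
    for _ in range(epochs):
        rng.shuffle(pairs)
        for d in pairs:
            margin = sum(wk * dk for wk, dk in zip(w, d))
            violated = margin < 1.0  # the preferred candidate is not ahead by the margin
            for k in range(dim):
                grad = reg * w[k] - (d[k] if violated else 0.0)
                w[k] -= lr * grad
    return w

def rank(w, candidates):
    """Sort (name, features) tuples by descending linear score."""
    def score(item):
        return sum(wk * xk for wk, xk in zip(w, item[1]))
    return sorted(candidates, key=score, reverse=True)

w = train_ranking_svm(preference_pairs(training), dim=3)

# Unseen candidates for a new term, again with made-up feature values.
new_candidates = [
    ("candidate_bad", [0.0, 0.0, 0.7]),
    ("candidate_mid", [1.0, 0.0, 0.4]),
    ("candidate_good", [1.0, 1.0, 0.3]),
]
ranked = rank(w, new_candidates)
print([name for name, _ in ranked])
```

In the classification formulation, the same kind of linear score would instead come from a binary SVM separating good definitions from the rest, and candidates would be sorted by that score.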
Source: Journal of Computer Science & Technology (计算机科学技术学报, English edition); indexed in SCIE, EI, CSCD; 2006, No. 3, pp. 439-449 (11 pages)
Keywords: definition search, text mining, web mining, web search

