Funding: The National Natural Science Foundation of China (No. 61003112, 61170181); the Open Research Fund of State Key Laboratory for Novel Software Technology of China (No. KFKT2010B02); the Key Project of Natural Science Research for Anhui Colleges of China (No. KJ2011A048).
Abstract: A new method for combining features via importance-inhibition analysis (IIA) is described to obtain more effective feature combinations in learning question classification. Features are combined based on the inhibition among features as well as the importance of individual features. Experimental results on a Chinese question set show that the IIA method yields a gradual increase in both average and maximum accuracy across all feature combinations, and that overall it improves substantially on the importance analysis (IA) method. Moreover, the IIA method reaches the same highest accuracy as the exhaustive method, and further improves the performance of question classification.
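The abstract does not define "importance" or "inhibition" formally, so the following Python sketch is only one possible reading. It assumes that importance is the accuracy of a classifier trained on a single feature, and that two features inhibit each other when using them together scores lower than the better of the two alone. The function names, the greedy selection rule, and the toy accuracy table are all hypothetical illustrations, not the authors' procedure or data.

```python
from itertools import combinations
from typing import Callable, FrozenSet, List, Sequence, Set

# Assumed interface: maps a feature set to a classification accuracy.
Evaluate = Callable[[FrozenSet[str]], float]


def importance_ranking(features: Sequence[str], evaluate: Evaluate) -> List[str]:
    """Order features by single-feature accuracy, best first (assumed 'importance')."""
    return sorted(features, key=lambda f: evaluate(frozenset([f])), reverse=True)


def inhibiting_pairs(features: Sequence[str], evaluate: Evaluate) -> Set[FrozenSet[str]]:
    """Pairs whose joint accuracy is below the better single feature (assumed 'inhibition')."""
    pairs: Set[FrozenSet[str]] = set()
    for a, b in combinations(features, 2):
        best_single = max(evaluate(frozenset([a])), evaluate(frozenset([b])))
        if evaluate(frozenset([a, b])) < best_single:
            pairs.add(frozenset([a, b]))
    return pairs


def combine_features(features: Sequence[str], evaluate: Evaluate) -> List[str]:
    """Greedily add features in importance order, skipping any feature
    that inhibits one that has already been selected."""
    inhibited = inhibiting_pairs(features, evaluate)
    selected: List[str] = []
    for f in importance_ranking(features, evaluate):
        if all(frozenset([f, s]) not in inhibited for s in selected):
            selected.append(f)
    return selected


# Made-up accuracies for illustration only; a real run would train and
# evaluate a question classifier for each feature set.
TOY_ACCURACY = {
    frozenset(["word"]): 0.80,
    frozenset(["pos"]): 0.70,
    frozenset(["ner"]): 0.60,
    frozenset(["word", "pos"]): 0.78,  # below "word" alone, so treated as inhibition
    frozenset(["word", "ner"]): 0.85,
    frozenset(["pos", "ner"]): 0.72,
}


def toy_evaluate(feature_set: FrozenSet[str]) -> float:
    return TOY_ACCURACY[feature_set]


if __name__ == "__main__":
    print(combine_features(["word", "pos", "ner"], toy_evaluate))
```

Under these assumptions the sketch needs only single-feature and pairwise evaluations, whereas an exhaustive search would evaluate every subset of features.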
Funding: Microsoft Research Asia Internet Services in Academic Research Fund (No. FY07-RES-OPP-116); the Science and Technology Development Program of Tianjin (No. 06YFGZGX05900).
Abstract: To improve question answering (QA) performance on real-world web data sets, a new set of question classes and a general answer re-ranking model are defined. Using a pre-defined dictionary and grammatical analysis, the question classifier brings both semantic and grammatical information into information retrieval and machine learning methods in the form of various training features, including the question word, the main verb of the question, the dependency structure, the position of the main auxiliary verb, the main noun of the question, and the top hypernym of the main noun. The QA query results are then re-ranked using the question class information. Experiments show that the classifier accurately classifies questions in real-world web data sets, and that the QA results improve noticeably after re-ranking. These results indicate that applications such as QA built upon real-world web data sets perform better when both semantic and grammatical information are used.
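The re-ranking step can be illustrated with a minimal Python sketch. The abstract does not specify the re-ranking model, so everything here is an assumption made for illustration: the `Candidate` structure, the class labels, the question-word lookup that stands in for the trained classifier, and the additive `boost`. It shows only the general idea of promoting candidate answers whose type matches the predicted question class.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    text: str
    score: float       # original retrieval score
    answer_type: str   # hypothetical type label, e.g. "PERSON" or "DATE"


# Hypothetical stand-in for the trained classifier: it uses only the
# question word, one of the features listed in the abstract.
QUESTION_WORD_TO_CLASS = {"who": "PERSON", "when": "DATE", "where": "LOCATION"}


def classify_question(question: str) -> str:
    """Predict a question class from the first word of the question."""
    words = question.strip().lower().split()
    first = words[0] if words else ""
    return QUESTION_WORD_TO_CLASS.get(first, "OTHER")


def rerank(question: str, candidates: List[Candidate], boost: float = 0.5) -> List[Candidate]:
    """Re-order candidates, adding `boost` to those matching the question class."""
    question_class = classify_question(question)

    def adjusted(c: Candidate) -> float:
        return c.score + (boost if c.answer_type == question_class else 0.0)

    return sorted(candidates, key=adjusted, reverse=True)


if __name__ == "__main__":
    results = [
        Candidate("in 1969", 0.9, "DATE"),
        Candidate("Neil Armstrong", 0.6, "PERSON"),
    ]
    for c in rerank("Who walked on the moon first?", results):
        print(c.text)
```

A classifier in the spirit of the abstract would replace `classify_question` with a model trained on the richer features it lists, such as the main verb, the dependency structure, and the top hypernym of the main noun.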