Funding: Supported by the National Natural Science Foundation of China under Grants No.60863011, No.61175068, No.61100205, and No.60873001; the Fundamental Research Funds for the Central Universities under Grant No.2009RC0212; the National Innovation Fund for Technology-Based Firms under Grant No.11C26215305905; and the Open Fund of the Software Engineering Key Laboratory of Yunnan Province under Grant No.2011SE14.
Abstract: For the complex questions of a Chinese question answering system, we propose an answer extraction method that combines discourse structure features. The method uses the relevance between questions and answers to learn to rank the answers. Firstly, the method analyses questions to generate a query string, and then submits the query string to search engines to retrieve relevant documents. Secondly, the method segments the retrieved documents and identifies the most relevant candidate answers; in addition, it uses the rhetorical relations of rhetorical structure theory to determine the inherent relationships between paragraphs or sentences and to generate the candidate answer paragraphs or sentences. Thirdly, we construct the answer ranking model, extract five feature groups, and adopt the Ranking Support Vector Machine (SVM) algorithm to train the ranking model. Finally, the method re-ranks the answers with the trained model and finds the optimal answers. Experiments show that the proposed method, combined with discourse structure features, can effectively improve the answer extraction accuracy and the quality of non-factoid answers. The Mean Reciprocal Rank (MRR) of the answer extraction reaches 69.53%.
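To make the ranking step concrete, the following is a minimal sketch of pairwise answer re-ranking in the Ranking SVM style, implemented here with scikit-learn's LinearSVC on feature-difference vectors. It is not the authors' implementation: the five-dimensional feature vectors, relevance labels, and candidate counts below are hypothetical placeholders standing in for the paper's five feature groups.

```python
# Sketch of pairwise answer re-ranking (Ranking-SVM style), assuming
# each candidate answer is already represented by a fixed-length
# feature vector. All values here are illustrative, not from the paper.
import numpy as np
from itertools import combinations
from sklearn.svm import LinearSVC

def pairwise_transform(X, y, qid):
    """Turn per-candidate features into pairwise difference vectors.

    X   : (n_candidates, n_features) feature matrix
    y   : graded relevance labels (higher = better answer)
    qid : question id per candidate; pairs are formed only within a question
    """
    X_pairs, y_pairs = [], []
    for i, j in combinations(range(len(y)), 2):
        if qid[i] != qid[j] or y[i] == y[j]:
            continue
        X_pairs.append(X[i] - X[j])
        y_pairs.append(1 if y[i] > y[j] else -1)
    return np.asarray(X_pairs), np.asarray(y_pairs)

# Toy data: 4 candidate answers for one question, each summarized by
# 5 scores (one hypothetical score per feature group).
X = np.array([
    [0.9, 0.2, 0.7, 0.1, 0.5],
    [0.4, 0.8, 0.3, 0.6, 0.2],
    [0.7, 0.5, 0.6, 0.4, 0.9],
    [0.1, 0.3, 0.2, 0.8, 0.4],
])
y = np.array([2, 0, 1, 0])        # relevance labels
qid = np.array([1, 1, 1, 1])      # all candidates belong to question 1

X_pairs, y_pairs = pairwise_transform(X, y, qid)
ranker = LinearSVC(C=1.0).fit(X_pairs, y_pairs)

# At answer time, candidates are re-ranked by the learned linear weights.
scores = X @ ranker.coef_.ravel()
print("candidate order (best first):", np.argsort(-scores))
```

In this pairwise formulation, learning which of two candidates should rank higher reduces to a binary classification over difference vectors, which is the standard way a Ranking SVM is trained.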
Funding: The National Natural Science Foundation of China (No.61502095).
Abstract: Aiming at the relation linking task for question answering over a knowledge base, especially the multi-relation linking task for complex questions, a relation linking approach based on the multi-attention recurrent neural network (RNN) model is proposed, which works for both simple and complex questions. First, the vector representations of questions are learned by the bidirectional long short-term memory (Bi-LSTM) model at the word and character levels, and named entities in questions are labeled by the conditional random field (CRF) model. Candidate entities are generated based on a dictionary, the disambiguation of candidate entities is realized based on predefined rules, and named entities mentioned in questions are linked to entities in the knowledge base. Next, questions are classified into simple or complex questions by a machine learning method. Starting from the identified entities, one-hop relations are collected in the knowledge base as candidate relations for simple questions, and two-hop relations are collected as candidates for complex questions. Finally, the multi-attention Bi-LSTM model is used to encode questions and candidate relations, compare their similarity, and return the candidate relation with the highest similarity as the result of relation linking. It is worth noting that the Bi-LSTM model with one attention layer is adopted for simple questions, and the Bi-LSTM model with two attention layers is adopted for complex questions. The experimental results show that, based on the effective entity linking method, the Bi-LSTM model with the attention mechanism improves the relation linking effectiveness for both simple and complex questions, outperforming existing relation linking methods based on graph algorithms or linguistic understanding.
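As a rough illustration of the final step, the sketch below encodes a question and each candidate relation with a shared Bi-LSTM plus a single attention layer and ranks candidates by cosine similarity, in the spirit of the simple-question case described above. It is written in PyTorch and is not the authors' implementation; the vocabulary size, hidden dimensions, and random token ids are made-up placeholders.

```python
# Sketch of scoring (question, candidate relation) pairs with a
# Bi-LSTM encoder and one attention layer; illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentiveBiLSTMEncoder(nn.Module):
    def __init__(self, vocab_size, embed_dim=100, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.bilstm = nn.LSTM(embed_dim, hidden_dim,
                              batch_first=True, bidirectional=True)
        self.attn = nn.Linear(2 * hidden_dim, 1)   # single attention layer

    def forward(self, token_ids):
        h, _ = self.bilstm(self.embed(token_ids))   # (B, T, 2H)
        weights = F.softmax(self.attn(h), dim=1)    # attention over time steps
        return (weights * h).sum(dim=1)             # (B, 2H) pooled vector

def relation_score(encoder, question_ids, relation_ids):
    """Cosine similarity between encoded question and candidate relation."""
    q = encoder(question_ids)
    r = encoder(relation_ids)
    return F.cosine_similarity(q, r, dim=-1)

# Toy usage: pick the candidate relation most similar to the question.
encoder = AttentiveBiLSTMEncoder(vocab_size=1000)
question = torch.randint(1, 1000, (1, 12))     # fake question token ids
candidates = torch.randint(1, 1000, (3, 6))    # 3 fake candidate relations
scores = relation_score(encoder, question.expand(3, -1), candidates)
print("best candidate relation index:", scores.argmax().item())
```

For complex questions, the abstract describes a second attention layer; one natural reading is that the two hops of a candidate relation path are attended to separately before similarity is computed, but that detail is an interpretation rather than something stated in the text above.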