Abstract
Question-answer pairs from social media can supply answers to automatic question answering systems, but some of these answers are of low quality, so methods for evaluating answer quality are worth studying. Existing evaluation methods ignore question-type features and apply a single evaluation method to all types of questions. This paper therefore proposes a hierarchical classification model. First, the question type is analyzed. Then four categories of features are extracted: textual features, non-textual features, language translatability, and the number of links in the answer. Because the influence of a feature on classification varies with the question type, logistic regression is used to evaluate answer quality separately for each question type, and the experiments gave good results. Finally, the main features affecting answer quality for each question type are analyzed.
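The two-level idea in the abstract (determine the question type first, then score the answer with a logistic regression model specific to that type) can be illustrated with a minimal sketch. Everything concrete here is an assumption for illustration: the question-type labels ("factual", "opinion"), the order and scaling of the four feature values, the toy training data, and the plain gradient-descent fit, which stands in for whatever logistic regression implementation the authors used. The abstract does not specify any of these.

```python
import math
from collections import defaultdict

# Hypothetical feature order: [text, non_text, translatability, link_count],
# each assumed to be pre-scaled to the range 0..1.

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def train_logreg(X, y, lr=0.5, epochs=2000):
    """Fit weights (bias first) by batch gradient descent on the log loss."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w

def predict_proba(w, x):
    # Probability that the answer is high quality under weights w.
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))

def train_per_type(samples):
    """Train one model per question type.

    samples: iterable of (question_type, features, label) with label 0 or 1.
    """
    grouped = defaultdict(lambda: ([], []))
    for qtype, feats, label in samples:
        grouped[qtype][0].append(feats)
        grouped[qtype][1].append(label)
    return {qtype: train_logreg(X, y) for qtype, (X, y) in grouped.items()}

def is_high_quality(models, qtype, feats, threshold=0.5):
    # Second level of the hierarchy: use the model for this question type.
    return predict_proba(models[qtype], feats) >= threshold

# Toy data (invented): link count separates "factual" answers,
# while the text feature separates "opinion" answers.
samples = [
    ("factual", [0.5, 0.5, 0.5, 1.0], 1),
    ("factual", [0.6, 0.4, 0.5, 0.8], 1),
    ("factual", [0.5, 0.5, 0.5, 0.0], 0),
    ("factual", [0.6, 0.4, 0.5, 0.1], 0),
    ("opinion", [0.9, 0.5, 0.5, 0.5], 1),
    ("opinion", [0.8, 0.5, 0.5, 0.4], 1),
    ("opinion", [0.1, 0.5, 0.5, 0.5], 0),
    ("opinion", [0.2, 0.5, 0.5, 0.4], 0),
]
models = train_per_type(samples)
```

Comparing the learned weights across `models["factual"]` and `models["opinion"]` is one simple way to see which feature matters most for each question type, which mirrors the feature analysis mentioned at the end of the abstract.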
Source
Computer Science (《计算机科学》)
CSCD
Peking University Core Journals
2016, Issue 1, pp. 94-97, 102 (5 pages)
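Funding
Open Project of the State Key Laboratory of Software Engineering, Wuhan University (SKLSE2012-09-30)
Supported by Shanxi Province natural science funding (2013011015-2)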
Keywords
Hierarchical classification model
Question types
Answer quality evaluation
Feature analysis