Abstract
Machine learning is widely used in natural language processing, and community question answering opens up a new and interesting research direction. In traditional question answering (QA), studying user interaction behavior with classification algorithms and analyzing the modes of interaction can support the development of user interaction and of the related post structures. Against this background, this paper studies the corpus provided by the SemEval semantic evaluation competition, classifies the answers to questions using KNN, random forest, and other classification methods, and analyzes the classification results. The experimental results show that GBRT (gradient boosted regression trees) and random forest achieve the best classification performance.
Author
孙熙然
SUN Xi-ran (China Electronic Technology Corporation, Chengdu 610030, China)
Source
《电脑知识与技术》
2021, No. 12, pp. 195-197 (3 pages)
Computer Knowledge and Technology
Keywords
answer classification
natural language processing
machine learning
random forest
k-nearest neighbor (KNN) algorithm