A Review on Chatbot Assessment: Indicators, Methods and Application

Cited by: 1

Abstract [Purpose/Significance] This paper systematically analyzes the current state of chatbot application and assessment in China and abroad, identifies problems in chatbot assessment work and further application scenarios, and aims to promote chatbot assessment and application. [Method/Process] Web of Science and the China National Knowledge Infrastructure (CNKI) served as the main data sources, supplemented by Panda Academic, Google Scholar, and Baidu Academic. A total of 662 research papers were selected as the original sample; after flow chart analysis, 66 valid papers were included for full-text analysis. Using induction, research on chatbot assessment was grouped into three aspects: assessment indicators, assessment methods, and assessment applications. [Result/Conclusion] Research on assessment indicators centers on function, usage, and user experience, but no standard evaluation indicator system for chatbots has yet been established. Assessment methods fall into subjective and objective evaluation; the choice of methods is relatively narrow, and cross-method comprehensive evaluation that could bridge the gap between human factors and technical factors is lacking. Assessment applications are concentrated in education, medical care, and mental health, while assessment in government management and social services remains to be explored. Finally, the paper offers a reference for domestic research in three respects: accelerating the formation of an indicator system for chatbot assessment research; broadening application fields and scenario modes and achieving cross-platform linkage; and strengthening ethical governance norms for chatbots.

Authors: Ren Mudan (任牡丹); Geng Qian (耿骞); Wu Yirong (吴义熔) (School of Government Management, Beijing Normal University, Beijing 100875; Institute of Advanced Studies in Humanities and Social Sciences, Beijing Normal University, Zhuhai 519087)

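Affiliations: School of Government Management, Beijing Normal University; Institute of Advanced Studies in Humanities and Social Sciences, Beijing Normal University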

Source: Library and Information Service (《图书情报工作》), indexed in CSSCI and the Peking University Core Journals list, 2023, Issue 22, pp. 140-148 (9 pages)

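Funding: This work is one of the research outputs of the National Social Science Fund of China project "Research on Electronic Medical Record Data Quality Based on Data Semantics" (基于数据语义的电子病历数据质量研究; Project No. 20BTQ066).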

Keywords: chatbot; ChatGPT; assessment research; index system; assessment method

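Classification: TN912.3 [Electronics and Telecommunications: Communication and Information Systems]; G25 [Culture and Science: Library Science]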