
Large Language Model-Generated Text Detection Based on Linguistic Feature Ensemble Learning
Abstract The rapid development of large language models (LLMs) has brought great convenience to daily life and work, but it has also created challenges for individuals and society. Detectors that can identify LLM-generated text are therefore urgently needed. To achieve both strong detection performance and good generalization, this paper proposes EBF Detection, an LLM-generated text detection method based on linguistic feature ensemble learning. EBF Detection combines a fine-tuned pre-trained language model with higher-order natural language statistical features and uses a decision mechanism to detect LLM-generated text. Experimental results show that EBF Detection achieves an average detection accuracy of 98.72% on in-domain data and 96.79% on out-of-domain data.
Authors XIANG Hui (项慧); XUE Yunhao (薛鋆豪); HAO Lingxin (郝玲昕) (School of Cyberspace, Hangzhou Dianzi University, Hangzhou 310018, China)
Source Netinfo Security (《信息网络安全》), CSCD, Peking University Core Journal, 2024, No. 7, pp. 1098-1109 (12 pages)
Funding National Natural Science Foundation of China [61772162]; Key Research and Development Program of Zhejiang Province [2023C03198].
Keywords large language model; LLM-generated text detection; ensemble learning; linguistic feature
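
Illustrative sketch The abstract says that EBF Detection fuses a fine-tuned pre-trained language model with statistical features through a decision mechanism, but it does not specify that mechanism. The Python sketch below shows one common way such a fusion could work, a weighted soft vote over per-detector probabilities. It is an assumption for illustration, not the authors' method: the names `DetectorScore` and `decide`, the weights, and the 0.5 threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass
class DetectorScore:
    """Output of one base detector (hypothetical structure, not from the paper)."""
    name: str
    prob_llm: float      # estimated probability that the text is LLM-generated
    weight: float = 1.0  # relative trust placed in this detector


def decide(scores: Sequence[DetectorScore], threshold: float = 0.5) -> Tuple[str, float]:
    """Weighted soft vote: average the probabilities, then apply a threshold.

    This stands in for the unspecified decision mechanism; it is an assumption.
    """
    if not scores:
        raise ValueError("at least one detector score is required")
    total_weight = sum(s.weight for s in scores)
    if total_weight <= 0:
        raise ValueError("detector weights must sum to a positive value")
    combined = sum(s.weight * s.prob_llm for s in scores) / total_weight
    label = "llm" if combined >= threshold else "human"
    return label, combined


# Example: a confident language-model detector and a less certain
# statistical-feature detector, trusted equally.
example_scores = [
    DetectorScore(name="fine_tuned_plm", prob_llm=0.9),
    DetectorScore(name="statistical_features", prob_llm=0.4),
]
example_label, example_prob = decide(example_scores)
```

In this sketch a disagreement between the two detectors is resolved by the weighted average, so raising the weight of one detector shifts the final decision toward its judgment.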
