Feature Analysis of Chinese Academic Content Generated by ChatGPT: Taking the Field of Intelligence as an Example

Cited by: 1
Abstract: Academic writing is one of the main applications of ChatGPT. This paper takes core journal articles in information science (情报学, rendered as "intelligence" in the journal's English title) as its research object. Working at three levels (word, sentence, and full text), it uses text processing methods such as part-of-speech tagging and n-grams to compare article introductions produced by ChatGPT with those written by humans. It then treats the question of whether academic content was generated by ChatGPT as a binary classification task, runs text classification experiments with Naive Bayes, Support Vector Machine, and Random Forest algorithms, and uses the SHAP method to analyze the importance of text-structure features. The study finds that ChatGPT is comparatively weak at describing factual information tied to specific dates and at citing policy documents or research reports. The introductions it generates cluster within a narrow range of lengths, and its writing is more "rule-following" than that of human authors. Plagiarism detection tools generally cannot accurately assess the originality of ChatGPT-generated content, but classification models can distinguish fairly easily whether an introduction was generated by ChatGPT. Average sentence length, lexical diversity, and text length are the text-structure features that most influence the classification results.
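The three text-structure features that the abstract names as most influential (average sentence length, lexical diversity, and text length) can be illustrated with a small sketch. The abstract does not give the paper's exact definitions, so everything here is an assumption made for illustration: lexical diversity is taken to be a type-token ratio, tokens are single CJK characters or Latin/digit runs rather than the output of a real word segmenter, and the function names `split_sentences`, `tokenize`, `structural_features`, and `ngram_counts` are hypothetical.

```python
import re
from collections import Counter


def split_sentences(text):
    # Assumption: sentences end at Chinese/English terminal punctuation.
    # The ASCII period is left out so decimals such as "3.5" are not split.
    parts = re.split(r"[。！？!?；;]+", text)
    return [p.strip() for p in parts if p.strip()]


def tokenize(text):
    # Assumption: one token per CJK character, one token per Latin/digit run.
    # A real study would likely use a Chinese word segmenter instead.
    return re.findall(r"[A-Za-z0-9]+|[\u4e00-\u9fff]", text)


def structural_features(text):
    # Compute the three features named in the abstract for one text.
    sentences = split_sentences(text)
    tokens = tokenize(text)
    n_tokens = len(tokens)
    return {
        "text_length": n_tokens,
        "avg_sentence_length": n_tokens / len(sentences) if sentences else 0.0,
        # Type-token ratio: distinct tokens divided by total tokens.
        "lexical_diversity": len(set(tokens)) / n_tokens if n_tokens else 0.0,
    }


def ngram_counts(tokens, n=2):
    # Count contiguous n-grams over the token sequence.
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


if __name__ == "__main__":
    sample = "学术写作很重要。学术写作很难！"
    print(structural_features(sample))
    print(ngram_counts(tokenize(sample), n=2).most_common(3))
```

A feature dictionary like this, computed once per introduction, is the kind of input that a binary classifier and a feature-importance method such as SHAP could then work on. That pairing is a reading of the abstract, not a reproduction of the authors' pipeline.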
Authors: GUO Xin (郭鑫); WANG Yibo (王一博); WANG Jimin (王继民)
Source: Library Tribune (《图书馆论坛》), PKU Core Journal, 2024, No. 3, pp. 134-143 (10 pages)
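Funding: Research output of the National Social Science Fund of China Key Project "Research on Key Issues and Platform Construction for Unified Discovery of Open Scientific Datasets" (Grant No. 20ATQ007).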
Keywords: ChatGPT; academic writing; information science (情报学); text classification; plagiarism detection