
Instruction Tuning of LLM for Unified Information Extraction in Mental Health Domain
Abstract Information extraction aims to extract key information from text. Information extraction ability in the mental health domain reflects how well a language model understands natural language about human mental health. Improving a language model's domain-specific extraction ability can also provide an important knowledge source for AI mental health services. However, Chinese instruction datasets for mental health information extraction are currently very scarce, which limits the development of related research and applications. To address this, under the guidance of psychology experts, this paper prompts ChatGPT to generate sample instances and, through designed generation instructions and data augmentation, constructs a unified mental health information extraction instruction dataset of 5641 entries covering three basic extraction tasks: named entity recognition, relation extraction, and event extraction. The dataset is intended to fill the gap in Chinese instruction datasets for mental health information extraction. It is then used for parameter-efficient fine-tuning of a large language model (LLM). Performance comparisons with baseline models and human evaluation results show that, after effective instruction tuning, an LLM can perform the unified information extraction tasks in the mental health domain.
Authors CAI Zijie; FANG Hui; LIU Jianhua; XU Ge; LONG Yunfei (School of Computer Science and Mathematics, Fujian University of Technology, Fuzhou, Fujian 350118, China; Fujian Provincial Key Laboratory of Big Data Mining and Applications, Fuzhou, Fujian 350118, China; College of Computer and Control Engineering, Minjiang University, Fuzhou, Fujian 350108, China; Fujian Mental Health Human-Computer Interaction Technology Research Center, Fuzhou, Fujian 350108, China; School of Computer Science and Electronic Engineering, University of Essex, Colchester CO4 3SQ, UK)
Source Journal of Chinese Information Processing (《中文信息学报》), indexed in CSCD and the Peking University Core Journals list, 2024, No. 8, pp. 112-127 (16 pages)
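Funding Science and Technology Innovation 2030 "New Generation Artificial Intelligence" Major Project (2022ZD0116308); Natural Science Foundation of Fujian Province (2023J01349); Fujian Provincial Innovation Fund Project (2022C0022); Minjiang University Talent Introduction Science and Technology Pre-research Projects (MJY23033, MJY21032).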
Keywords information extraction; mental health; large language model; instruction tuning