Research on University Basic Knowledge Question-Answering Using Low-Rank Encoding to Optimize Large Language Model
Abstract In the field of higher education, foundational knowledge question-answering (QA) systems play a crucial role in enhancing students' academic performance and in the equitable distribution of educational resources. In recent years, QA techniques based on machine reading comprehension and text similarity matching have been built atop pre-trained language models, yet when handling complex natural language questions they still suffer from limited answer quality and accuracy, owing to bottlenecks such as insufficient training data and restricted model generalization. This research addresses the dual objective of reducing resource consumption while improving the performance and accuracy of basic knowledge QA systems in university settings. To this end, a low-rank encoding fine-tuning method for large language models is proposed for the university fundamental knowledge domain. The method uses low-rank encoding to reduce the memory and GPU-memory consumption of large language models during both training and inference, and applies their generative capabilities to the university's basic knowledge QA domain, improving the quality, accuracy, and response speed of everyday queries. By freezing the weights of the large pre-trained model, injecting university-specific knowledge into the pre-trained layers of the original Transformer architecture, and adding a QA optimization module to constrain the accuracy of generative outputs, the approach significantly reduces the number of trainable parameters for downstream tasks while largely preserving the original model's generative language ability, and achieves superior performance and accuracy in the university fundamental knowledge domain.
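The low-rank fine-tuning with frozen pre-trained weights described in the abstract follows the general low-rank adaptation pattern: the base weight matrices stay fixed and only small rank-r update matrices are trained. The paper's exact adapter placement and its QA optimization module are not reproduced on this page, so the sketch below is only a minimal PyTorch illustration of the general technique; the class name LowRankLinear and the hyperparameters r and alpha are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of low-rank adaptation with frozen base weights (a general
# LoRA-style technique, not the paper's exact method). Assumes PyTorch.
import torch
import torch.nn as nn

class LowRankLinear(nn.Module):
    """Wraps a frozen pre-trained linear layer with a trainable low-rank update:
    y = W x + (alpha / r) * B A x, where A is (r x d_in) and B is (d_out x r)."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        self.base.weight.requires_grad_(False)        # freeze pre-trained weights
        if self.base.bias is not None:
            self.base.bias.requires_grad_(False)
        self.lora_A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, r))  # zero init: update starts at 0
        self.scale = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scale * (x @ self.lora_A.T @ self.lora_B.T)

# Usage: wrap one projection layer and check the trainable-parameter savings,
# which is where the memory/GPU-memory reduction claimed in the abstract comes from.
base = nn.Linear(4096, 4096)
adapted = LowRankLinear(base, r=8)
trainable = sum(p.numel() for p in adapted.parameters() if p.requires_grad)
total = sum(p.numel() for p in adapted.parameters())
print(f"trainable: {trainable} / total: {total}")  # ~65K trainable of ~16.8M total
```

Because only lora_A and lora_B receive gradients, optimizer state and gradient memory scale with the rank r rather than with the full weight matrices, which is consistent with the abstract's claim of reduced training-time consumption while preserving the frozen model's generative ability.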
Authors LUO Shijie; JIN Rize; HAN Shuzhen (Office of the Cyberspace Affairs, Tiangong University, Tianjin 300387, China; School of Software, Tiangong University, Tianjin 300387, China)
Source Journal of Frontiers of Computer Science and Technology (《计算机科学与探索》; CSCD; Peking University Core Journal), 2024, No. 8, pp. 2156-2168 (13 pages)
Funding National Natural Science Foundation of China (61806142); Tianjin Science and Technology Bureau Project (19PTZWHZ00020); China Academic Degrees and Graduate Education Society Project (2020MSA50); Industry-University Cooperative Education Project (202102084059)
Keywords generative language model; fundamental knowledge question-answering; large language model; Transformer; freezing model weights