Abstract
Bias is pervasive in human society and is typically conveyed through natural language. Traditional bias research focused mainly on static word embedding models, but as natural language processing technology has evolved, attention has gradually shifted to pre-trained models with stronger contextual processing capabilities. As a further development of pre-trained models, large language models have been widely deployed across many applications owing to their remarkable performance and broad prospects; nevertheless, they may still capture social biases from unprocessed training data and propagate those biases to downstream tasks. Biased large language model systems can cause adverse social impacts and other potential harms, so bias in large language models urgently requires in-depth study. This paper discusses the origins of bias in natural language processing and analyzes and summarizes the development of bias evaluation and mitigation methods, from word embedding models to today's large language models, aiming to provide valuable references for future research.
Authors
Xu Lei; Hu Yahao; Pan Zhisong (College of Command & Control Engineering, Army Engineering University of PLA, Nanjing 210007, China)
Source
《计算机应用研究》
CSCD
Peking University Core Journal
2024, No. 10, pp. 2881-2892 (12 pages)
Application Research of Computers
Funding
Supported by the National Natural Science Foundation of China (62076251).
Keywords
natural language processing
word embedding
pre-trained model
large language model
bias