Abstract
Traditional machine learning and natural language processing techniques struggle with short social media texts such as Weibo posts, where sparse information degrades both text representation and classification performance. This study therefore applies prompt tuning to the Weibo stance detection task. Prompt tuning draws on the knowledge embedded in pre-trained language models to identify more accurately the stance a text takes toward a specific topic. First, the Weibo stance detection data are augmented by back-translation, expanding the training set from 3,000 to 12,000 instances. Next, prompts are designed from the Weibo text and its corresponding topic. These prompts are intended to guide the attention mechanism of the pre-trained language model toward the text segments most relevant to stance detection, thereby improving the model's ability to recognize stance in Weibo texts. To validate the effectiveness of prompt tuning for this task, experiments are conducted on the NLPCC 2016 Chinese Weibo stance dataset. Compared with the best baseline method, the prompt-tuning-based method improves five evaluation metrics by 0.6% to 6%. The results indicate that prompt tuning has considerable potential for Weibo stance detection and provide a useful reference for future studies.
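The two preparation steps named in the abstract, back-translation augmentation and prompt construction, can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the template wording, the label words in `VERBALIZER`, the three pivot languages (chosen only because one original plus three variants gives the reported fourfold growth from 3,000 to 12,000), and the `back_translate` callable, which stands in for whatever translation system is actually used.

```python
from typing import Callable, Dict, List, Tuple

Example = Tuple[str, str, str]  # (weibo_text, topic, stance_label)

MASK = "[MASK]"

# Hypothetical verbalizer: maps a label word the model may predict at the
# masked position to a stance label. These words are illustrative assumptions.
VERBALIZER: Dict[str, str] = {"支持": "FAVOR", "反对": "AGAINST", "中立": "NONE"}


def augment(
    examples: List[Example],
    back_translate: Callable[[str, str], str],
    pivots: List[str],
) -> List[Example]:
    """Keep each example and add one back-translated copy per pivot language.

    `back_translate(text, pivot)` is a placeholder for any routine that
    translates `text` into `pivot` and back; the stance label is unchanged.
    """
    out: List[Example] = []
    for text, topic, label in examples:
        out.append((text, topic, label))
        for pivot in pivots:
            out.append((back_translate(text, pivot), topic, label))
    return out


def build_prompt(text: str, topic: str) -> str:
    """Wrap a post and its topic in a cloze-style template (assumed wording).

    Two mask slots are used because each assumed label word has two characters.
    """
    return f"{text} 这条微博对“{topic}”的立场是{MASK}{MASK}。"


def predict_label(word_scores: Dict[str, float]) -> str:
    """Pick the stance whose label word received the highest score."""
    best_word = max(VERBALIZER, key=lambda w: word_scores.get(w, float("-inf")))
    return VERBALIZER[best_word]
```

In use, `augment` would be called once on the training split, `build_prompt` would be applied to every example before it is passed to a masked language model, and `predict_label` would turn the model's scores for the label words into a stance. The model itself is left out of the sketch on purpose, since the abstract does not name it.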
Authors
蒲秋梅
李辅德
PU Qiu-mei; LI Fu-de (Key Laboratory of Ethnic Language Intelligent Analysis and Security Governance of MOE, Minzu University of China, Beijing 100081, China)
Source
《中国电子科学研究院学报》
2024, Issue 4, pp. 340-349 (10 pages)
Journal of China Academy of Electronics and Information Technology
Funding
Supported by the National Social Science Fund of China (20BGL251).