Abstract
In Tibetan information processing, text classification automatically assigns Tibetan documents to predefined categories, which makes it valuable for applications such as information retrieval and news recommendation. Traditional text classification methods depend on complex feature engineering and yield unsatisfactory classification results. With the rapid development of deep learning, deep-learning-based methods have become the main research trend in Tibetan text classification. This paper proposes a text classification method based on bidirectional LSTM (Bi-LSTM). In experiments on a Tibetan text classification dataset, the proposed algorithm improves precision, recall, and F1 score by 2.56%, 1.87%, and 1.75%, respectively.
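The abstract reports results as precision, recall, and F1 score. As a reminder of how these metrics are typically computed for a multi-class classifier, here is a minimal sketch in plain Python. It assumes macro-averaging (an unweighted mean over classes); the abstract does not say which averaging scheme the paper uses, and the function name `macro_prf` and the labels `"news"` and `"sport"` are hypothetical, chosen only for illustration.

```python
from collections import Counter


def macro_prf(y_true, y_pred):
    """Return macro-averaged (precision, recall, F1).

    Assumption: macro-averaging over every label seen in y_true or y_pred.
    The paper's actual averaging scheme is not stated in the abstract.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    labels = sorted(set(y_true) | set(y_pred))
    if not labels:
        raise ValueError("at least one sample is required")

    # Per-class counts of true positives, false positives, false negatives.
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # true class t was missed

    precisions, recalls, f1s = [], [], []
    for label in labels:
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        precision = tp[label] / p_den if p_den else 0.0
        recall = tp[label] / r_den if r_den else 0.0
        pr_sum = precision + recall
        f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Hypothetical two-class example (labels are illustrative only).
y_true = ["news", "news", "sport", "sport"]
y_pred = ["news", "sport", "sport", "sport"]
precision, recall, f1 = macro_prf(y_true, y_pred)
```

In this toy example one "news" document is misclassified as "sport", so "news" has perfect precision but only 50% recall, while "sport" has perfect recall but lower precision. Macro-averaging weights both classes equally regardless of their size.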
Authors
Suonan Duojie (索南多杰); Guanque Duojie (官却多杰); La Majie (拉玛杰); Gongbao Jiayang (公保加羊)
Hainan Prefecture Tibetan Information Technology Research Center, Gonghe 813099, Qinghai, China
Source
Qinghai Science and Technology (《青海科技》)
2023, Issue 3, pp. 192-196 (5 pages)
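Funding
Qinghai Province Key R&D and Transformation Program, Special Project for the Transformation of Scientific and Technological Achievements: "Optimization and Integration of the 'Yunzang' (云藏) Efficient Crawler and Retrieval System" (2020-GX-164).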