A rating system for the sensitive content in the Yi-language web-pages
(Original title: 面向彝文网页的敏感内容分级系统研究)
Abstract: With the rapid development of the Internet and of Yi-language informatization, Yi-language websites carry a large amount of sensitive information, which seriously affects public-opinion information security in China's border areas. Compared with Chinese and English, however, information technology for the Yi language still lags behind. Because the Yi language has a complex structure and is used across a wide range of environments, techniques for Yi-language information collection and word segmentation are not yet mature, which poses a great challenge to supervising sensitive content in foreign-related Yi-language web pages. To support the safe dissemination of Yi-language online information and the stability of public opinion, we propose a sensitive-word filtering algorithm and a content sensitivity rating model for Yi text and, building on a self-developed Yi-language web crawler and word segmentation techniques, construct an algorithm model and a demonstration system for rating sensitive content in Yi-language web pages. Compared with similar public-opinion analysis systems for ethnic minority languages, the system not only identifies and filters sensitive words but also rates sensitive content and traces the source addresses of sensitive content. In a manual evaluation, the system achieved 48% accuracy in rating sensitive content and an 80% recognition rate for sensitive words.
Authors: WANG Qing (王清); LI Bing-ze (李炳泽); WANG Jia-mei (王嘉梅) (Yunnan Province for Minority Language Information Processing Engineering Research Center, Yunnan Minzu University, Kunming 650504, China; Wenshan University, Wenshan 663000, China)
Source: Journal of Yunnan Minzu University: Natural Sciences Edition (《云南民族大学学报(自然科学版)》), CAS, 2019, No. 2, pp. 177-185 (9 pages)
Funding: Research Fund of the State Language Commission (WT125-61); Yunnan Minzu University Graduate Innovation Fund (2018YJCXS152)
Keywords: Yi-language network; sensitive information; content rating; public opinion analysis
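
The abstract names three functions (sensitive-word identification and filtering, content rating, and source-address tracing) but does not say how they are implemented. The Python sketch below is only an illustration of how such a pipeline could be wired together; it is not the authors' algorithm. It assumes the page text has already been segmented into tokens. The lexicon entries, weights, thresholds, and the names `LEXICON`, `LEVELS`, `PageRating`, `rate_page`, and `filter_tokens` are all hypothetical, and placeholder tokens stand in for real Yi words.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical lexicon: token -> severity weight (placeholder tokens, not real Yi words).
LEXICON = {"term_a": 3, "term_b": 2, "term_c": 1}

# Hypothetical score thresholds, checked from the highest level down.
LEVELS = [(5, "high"), (2, "medium"), (1, "low")]


@dataclass
class PageRating:
    url: str          # source address, kept so a rated page can be traced back
    hits: List[str]   # sensitive tokens found, in page order
    score: int        # sum of the weights of all hits
    level: str        # "none", "low", "medium", or "high"


def rate_page(url: str, tokens: List[str]) -> PageRating:
    """Rate one already-segmented page by summing lexicon weights."""
    hits = [t for t in tokens if t in LEXICON]
    score = sum(LEXICON[t] for t in hits)
    level = "none"
    for threshold, name in LEVELS:
        if score >= threshold:
            level = name
            break
    return PageRating(url=url, hits=hits, score=score, level=level)


def filter_tokens(tokens: List[str], mask: str = "*") -> List[str]:
    """Replace each sensitive token with a mask of the same length."""
    return [mask * len(t) if t in LEXICON else t for t in tokens]
```

A weighted-sum score with fixed thresholds is the simplest possible rating rule. A real system would depend heavily on the quality of the segmenter and the lexicon, which this sketch takes as given.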