Abstract
Existing sensitive word recognition methods cannot reach the global optimum, which results in low recall and precision. To address this, a sensitive word identification method for fusion media platforms based on a fuzzy genetic algorithm is proposed. A fusion media platform architecture is deployed and the sensitive word population is initialized on it; the fitness of each individual is calculated, the cumulative probability of each individual being selected is computed, and sensitive word individuals are selected accordingly. After the uniform crossover and non-uniform mutation of individuals in the fuzzy genetic algorithm, an optimized criterion for judging the qualitative structure of sensitive words is applied to complete the qualitative structure processing. A word sensitivity judgment formula is constructed and sensitivity levels are defined, which eliminates the influence of text length on the degree of word sensitivity. A set of position information is determined, the positions of sensitive words are classified, and the degree of sensitivity is determined. Experimental results show that the recall and precision of the method stabilize, both above 90%, indicating accurate identification.
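The abstract names three genetic operators (selection by cumulative probability, uniform crossover, non-uniform mutation) but gives neither the encoding nor the formulas. The following is a minimal sketch of those operators under stated assumptions, not the authors' implementation: individuals are assumed to be lists of real-valued genes in [0, 1], the non-uniform mutation follows the commonly used Michalewicz-style shrinking step, and all function and parameter names are hypothetical. The fuzzy adjustment of the algorithm's parameters is not shown, because the abstract does not describe it.

```python
import random
from itertools import accumulate


def roulette_select(population, fitness, rng):
    """Pick one individual with probability proportional to its fitness,
    using the cumulative selection probabilities (roulette wheel)."""
    total = sum(fitness)
    cumulative = list(accumulate(f / total for f in fitness))
    r = rng.random()
    for individual, threshold in zip(population, cumulative):
        if r < threshold:
            return individual
    return population[-1]  # guard against floating-point round-off


def uniform_crossover(a, b, rng, swap_prob=0.5):
    """Swap each gene position between two parents independently."""
    child_a, child_b = list(a), list(b)
    for i in range(len(child_a)):
        if rng.random() < swap_prob:
            child_a[i], child_b[i] = child_b[i], child_a[i]
    return child_a, child_b


def nonuniform_mutation(genes, generation, max_generations, rng,
                        low=0.0, high=1.0, rate=0.1, shape=2.0):
    """Assumed Michalewicz-style mutation: the step toward a bound
    shrinks to zero as generation approaches max_generations."""
    out = list(genes)
    for i, x in enumerate(out):
        if rng.random() < rate:
            exponent = (1.0 - generation / max_generations) ** shape
            shrink = 1.0 - rng.random() ** exponent
            if rng.random() < 0.5:
                out[i] = x + (high - x) * shrink  # move toward upper bound
            else:
                out[i] = x - (x - low) * shrink   # move toward lower bound
    return out
```

In a full loop, `roulette_select` would be called twice per pair of parents, the pair passed to `uniform_crossover`, and each child passed to `nonuniform_mutation` with the current generation index; the fitness function itself would have to come from the paper's sensitivity criterion.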
Authors
陈佐虎
刘少博
彭振国
张珍芬
陈丽
CHEN Zuohu; LIU Shaobo; PENG Zhenguo; ZHANG Zhenfen; CHEN Li (Gansu Tongxing Intelligent Technology Development Co., Ltd., Lanzhou 730050, China; Internet Division of State Grid Gansu Electric Power Company, Lanzhou 730050, China)
Source
《电子设计工程》
2023, Issue 14, pp. 187-190, 195 (5 pages in total)
Electronic Design Engineering
Keywords
fuzzy genetic algorithm
fusion media platform
sensitive word identification
crossover
mutation