Abstract
Keyword combination strategies are an efficient means of filtering spam such as spam SMS, spam MMS, and RCS messages. Current strategies are mainly written and maintained by hand, and because a large volume of spam must be analyzed, the workload is heavy. This paper proposes an AI-based method for generating strategies automatically, which assists human analysts in analyzing spam and producing keyword combination strategies, thereby greatly reducing the manual effort required. Specifically, the words in a spam message are arranged into a keyword matrix according to specific rules, and the matrix is fed into a classifier based on a two-dimensional convolutional neural network for training, which turns keyword extraction into a convolution operation over the keyword matrix. Through training, the convolutional network automatically extracts keyword combination features that are strongly indicative of a class. While classifying an arbitrary message, the method extracts the convolution window that maximizes the network's activation value, which yields the keyword combination strategy best suited to that message. Experiments show that the keyword combination strategies generated by the algorithm achieve good precision and recall.
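The window-selection idea in the abstract can be illustrated with a toy sketch. It is not the authors' implementation: the abstract does not state the matrix layout rules, so a plain row-by-row fill is assumed, and the hypothetical `word_score` table stands in for the responses that a trained convolutional filter would supply. The function names, the 2 x 2 window, and the sample words are likewise assumptions made for illustration.

```python
# Toy sketch: locate the convolution window with the highest activation
# in a keyword matrix. Layout rule, scores and window size are assumptions.

def build_keyword_matrix(words, width):
    """Arrange words row by row into a matrix of the given width (assumed rule)."""
    padded = words + [None] * (-len(words) % width)  # pad the last row with None
    return [padded[i:i + width] for i in range(0, len(padded), width)]

def best_window(matrix, word_score, k=2):
    """Slide a k x k window over the matrix and return (activation, words)
    for the window with the largest activation.

    word_score is a hypothetical stand-in for learned filter weights.
    """
    rows, cols = len(matrix), len(matrix[0])
    best = (float("-inf"), [])
    for r in range(rows - k + 1):
        for c in range(cols - k + 1):
            window = [matrix[r + i][c + j] for i in range(k) for j in range(k)]
            # Sum of per-word scores followed by a ReLU-style clamp at zero.
            activation = max(0.0, sum(word_score.get(w, 0.0) for w in window))
            if activation > best[0]:
                best = (activation, [w for w in window if w is not None])
    return best

words = ["win", "free", "prize", "hello", "click", "link", "today", "meeting", "now"]
scores = {"win": 0.9, "free": 0.8, "prize": 0.7, "click": 0.6, "link": 0.5}
matrix = build_keyword_matrix(words, width=3)
activation, combo = best_window(matrix, scores)
print(combo)
```

In this toy setup the words in the winning window form the candidate keyword combination for the message. In a trained network the scores would come from the learned convolution kernels, not from a hand-written table.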
Authors
杜刚
朱艳云
张晨
常潇
杜雪涛
DU Gang; ZHU Yan-yun; ZHANG Chen; CHANG Xiao; DU Xue-tao (China Mobile Group Design Institute Co., Ltd., Beijing 100080, China)
Source
《电信工程技术与标准化》
2020, Issue 2, pp. 11-16 (6 pages)
Telecom Engineering Technics and Standardization
Keywords
artificial intelligence
convolutional network
keyword extraction