Abstract
This paper uses a decipherment approach based on the forward maximum matching algorithm to segment words, and uses word frequency to correct and control errors. The main steps are: 1) compose ciphertext that meets the requirements; 2) decrypt it by substitution; 3) segment the text into words using a lexicon while simulating errors; 4) construct sentences. The ciphertext is first written according to coding principles, with random errors injected at the same time. The analyzed results are then saved to a text file, and part of the processed output is compared with the original plaintext. The error rate is found to be low, so the resulting model is considered fairly reliable. The main contribution is a reasonably reliable, broadly applicable forward maximum matching segmentation model based on character frequency, on which further models are built to solve the problem.
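The lexicon-based segmentation step can be illustrated with a minimal sketch of forward maximum matching. It assumes the lexicon is a plain Python set and that an unmatched character is emitted on its own. The function name `forward_max_match`, the toy lexicon, and the single-character fallback are illustrative assumptions, not the paper's implementation, and the word-frequency error correction described in the abstract is not shown.

```python
def forward_max_match(text, lexicon, max_len=None):
    """Segment an unspaced string by greedy forward maximum matching."""
    if max_len is None:
        # Longest lexicon entry bounds the window; default of 1 covers an empty lexicon.
        max_len = max((len(w) for w in lexicon), default=1)
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until the lexicon matches.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            # Assumed fallback: a single unmatched character becomes its own token.
            if size == 1 or candidate in lexicon:
                words.append(candidate)
                i += size
                break
    return words


toy_lexicon = {"the", "cat", "sat", "on", "mat"}  # hypothetical word list
print(forward_max_match("thecatsatonthemat", toy_lexicon))
# → ['the', 'cat', 'sat', 'on', 'the', 'mat']
```

Because the match is greedy, adding a longer entry such as "them" to the lexicon would change the split of "...onthemat", which is one reason a segmenter of this kind is usually paired with some form of error control.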
Source
Computer & Digital Engineering (《计算机与数字工程》)
2016, No. 5, pp. 924-928, 965 (6 pages)
Keywords
English word segmentation
character frequency
dictionary optimization
substitution decryption
forward maximum matching