Abstract
This paper analyzes why garbled characters appear on web pages and proposes obtaining a page's character encoding according to the HTML specification. When the encoding cannot be obtained reliably in this way, a statistical method based on the occurrence frequency of commonly used characters is used to detect it.
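The two-stage approach described in the abstract can be sketched roughly as follows. This is only an illustrative Python sketch, not the paper's implementation: the function names, the regular expression, the candidate encoding list, and the tiny table of frequent characters are all assumptions, since the abstract gives no such details. The first stage reads only a `<meta>` charset declaration (an HTTP `Content-Type` header or a byte order mark could also be consulted); the second stage decodes the bytes with each candidate encoding and keeps the one that yields the largest share of common characters.

```python
import re

# Hypothetical helper names; the abstract does not describe the paper's actual procedure.
# Matches both <meta charset="..."> and <meta http-equiv=... content="...; charset=...">.
META_CHARSET = re.compile(
    rb'<meta[^>]+charset\s*=\s*["\']?\s*([A-Za-z0-9_.:-]+)', re.IGNORECASE
)

# Tiny illustrative sample of frequent Chinese characters (assumption);
# a practical table would be far larger and weighted by frequency.
COMMON_CHARS = set("的一是不了人我在有他这中大来上国个到说们")


def declared_charset(raw):
    """Return the charset declared in a <meta> tag in the first 1024 bytes, or None."""
    match = META_CHARSET.search(raw[:1024])
    return match.group(1).decode("ascii").lower() if match else None


def guess_by_frequency(raw, candidates=("utf-8", "gb18030", "big5")):
    """Pick the candidate whose decoding has the largest share of common characters."""
    best, best_score = None, -1.0
    for name in candidates:
        text = raw.decode(name, errors="ignore")
        if not text:
            continue  # nothing decodable under this candidate
        score = sum(ch in COMMON_CHARS for ch in text) / len(text)
        if score > best_score:  # ties keep the earlier candidate
            best, best_score = name, score
    return best


def detect_encoding(raw):
    """Use the declared charset if present, otherwise fall back to frequency statistics."""
    return declared_charset(raw) or guess_by_frequency(raw)
```

A caller would pass the raw response bytes, for example `detect_encoding(page_bytes)`, and decode the page with the returned name. A declared charset is trusted as-is in this sketch; checking it against the statistical guess would be a natural extension.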
Source
Journal of Zhongkai University of Agriculture and Engineering (《仲恺农业工程学院学报》)
CAS
2009, No. 3, pp. 41-43, 48 (4 pages in total)
Keywords
character encoding
detection