Abstract
This paper briefly analyzes the characteristics of Web page data and designs a preprocessing pipeline for statistical analysis tailored to those characteristics. Each processing step is tested with several different algorithms in order to find the best solution. The experiments show that reducing I/O operations, increasing processing granularity, and making appropriate use of a lexicon improve both running speed and accuracy.
Source
《现代图书情报技术》
CSSCI
Peking University Core Journals (北大核心)
2007, No. 3, pp. 69-72 (4 pages)
New Technology of Library and Information Service