Abstract
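With the development of big data technology, online data monitoring and data mining have become research hotspots in computer and information science. Segmenting and mining Web hotspot data improves the ability to track information hotspots and to classify Web data. Traditional approaches rely on unstructured data mining algorithms, which cannot accurately locate Web hotspot data or mine it layer by layer. This paper proposes a Web hotspot data mining algorithm based on semi-structured segmentation. Features are segmented using semi-structured data, and differential evolution is carried out on the basis of superior gene positions so that the optimization curve gradually flattens, while comparison scripts run in parallel on multiple nodes. With semi-structured segmentation, Web hotspot feature mining achieves adaptive optimization and yields the allocation factor of the Web hotspot data, which improves mining performance. Simulation results show that the algorithm achieves good efficiency and accuracy and improves the adaptive optimization capability of Web hotspot data mining.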
With the development of big data technology, online data monitoring and data mining have become active research topics in computer and information science. Segmenting and mining Web hotspot data improves the tracking of information hotspots and the classification of Web data. Traditional approaches use unstructured data mining algorithms, which cannot accurately locate Web hotspot data or mine it in layers. This paper proposes a Web hotspot data mining algorithm based on semi-structured segmentation. Features are segmented using semi-structured data, and differential evolution based on superior gene positions makes the optimization curve gradually flatten. Comparison scripts run in parallel on multiple nodes; the code maps unstructured data to data blocks so that the data can be stored in a relational database model, and the distribution factor of the Web hotspot data is obtained, which improves mining performance. The simulation results show that the algorithm is efficient and accurate and that it improves the adaptive optimization ability of Web hotspot data mining.
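The abstract does not spell out how the differential evolution step works. As a rough illustration only, the following minimal sketch implements the classic DE/rand/1/bin scheme. It does not reproduce the paper's variant based on superior gene positions. The function name `differential_evolution`, the parameters `f` and `cr`, and the `sphere` test objective are all assumptions made for this example. The returned `history` records the best score per generation, which is the kind of optimization curve the abstract describes as flattening.

```python
import random


def differential_evolution(objective, bounds, pop_size=20, f=0.5, cr=0.9,
                           generations=100, seed=0):
    """Classic DE/rand/1/bin minimizer (illustrative, not the paper's variant)."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial population inside the search bounds.
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    scores = [objective(ind) for ind in pop]
    history = []  # best score after each generation

    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct individuals, all different from i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(dim)  # guarantees at least one mutated gene
            trial = []
            for j in range(dim):
                if rng.random() < cr or j == j_rand:
                    value = pop[a][j] + f * (pop[b][j] - pop[c][j])
                    lo, hi = bounds[j]
                    value = min(max(value, lo), hi)  # clip to bounds
                else:
                    value = pop[i][j]
                trial.append(value)
            # Greedy selection: keep the trial only if it is no worse.
            trial_score = objective(trial)
            if trial_score <= scores[i]:
                pop[i], scores[i] = trial, trial_score
        history.append(min(scores))

    best = min(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best], history


def sphere(x):
    """Hypothetical test objective: sum of squares, minimum 0 at the origin."""
    return sum(v * v for v in x)
```

Calling `differential_evolution(sphere, [(-5.0, 5.0)] * 3)` returns the best vector, its score, and the per-generation curve. Because selection is greedy, that curve never increases, so it levels off as the population converges.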
Source
《科技通报》
Peking University Core Journal (北大核心)
2015, No. 4, pp. 115-117 (3 pages)
Bulletin of Science and Technology