Abstract
With the rapid development of information technology and networks, a wealth of data resources has accumulated online, and the network can help us obtain the data we need. Network data is characterized by large volume, many platforms, rapid growth, and varied content. As network resources grow ever richer, however, finding the data one needs in such a vast repository becomes increasingly difficult. Although a good deal of shared reference data has become available in recent years, many practical applications still require web crawlers to fetch pages and collect information. Data mining can uncover information that intuition alone cannot reveal, and can even lead to counterintuitive conclusions. Given today's much larger data volumes, the information obtained through mining has greater value and significance.
Source
Journal of Yantai Nanshan University (《烟台南山学院学报》)
2017, No. 2, pp. 59-61 (3 pages)