Abstract
Open source communities, which comprise collaborative development communities and knowledge sharing communities, hold a vast amount of open source data resources. Obtaining these data accurately and efficiently from numerous open source communities with widely varying page structures is a prerequisite for comprehensive analysis and deep correlation of open source data resources. This paper describes the research process behind a Web data extraction method for open source communities and achieves accurate extraction of Web data from them.
Source
Modern Computer (《现代计算机》)
2017, Issue 3, pp. 27-29, 39 (4 pages in total)
Keywords
Open Source Community
Web Data Extraction
Collaborative Development Community
Knowledge Sharing Community