Abstract
As online resources continue to grow, the Internet has become the main channel through which people obtain information. Uyghur has developed rapidly in fields such as language information processing and Web applications. This paper describes the working principle of web crawlers and the strategies used by focused crawlers. On that basis, and drawing on related research on Uyghur information extraction, it studies the structure and strategy of a web crawler for Uyghur text, with the aim of providing large-scale corpora for building the web page database of a Uyghur search engine and for research on Uyghur online public opinion analysis.
Source
《新疆师范大学学报(自然科学版)》
2014, No. 4, pp. 75-78 (4 pages)
Journal of Xinjiang Normal University (Natural Sciences Edition)