Abstract
Because university administrative systems are fragmented and institutions tend toward protectionism, the employment information published on individual university websites is largely separate, even isolated, and universities rarely cooperate to share it. To connect these information silos and give graduates and employers an unobstructed channel of communication, this paper studies and implements an employment information search engine for university graduates. Starting from the four basic requirements of a search engine (web spider, Chinese tokenizer, indexer, and search service), the system is divided into four functional modules. The paper describes the algorithm and principles of the employment information web spider, which is implemented with multithreading and regular expressions. Building on Lucene.Net, it analyzes the library's algorithms, implements Chinese word segmentation and verifies its effectiveness, and implements the indexing algorithm. Matched terms are highlighted in the search results. Results obtained after information extraction and segmentation preprocessing indicate that the system is accurate.
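The pipeline named in the abstract (parallel page parsing with regular expressions, Chinese word segmentation, indexing, and hit highlighting) can be illustrated with a small sketch. It is not the paper's implementation, which is built on Lucene.Net. It is written in Python for brevity, uses hypothetical in-memory pages instead of real HTTP requests, and assumes forward maximum matching against a toy lexicon as the segmentation method, since the abstract does not say which method the paper uses. All names (`PAGES`, `LEXICON`, `parse`, `segment`, `build_index`, `highlight`) are illustrative.

```python
import re
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory "pages" standing in for fetched HTML (no network access).
PAGES = {
    "a.html": '<a href="b.html">下一页</a>重庆招聘软件工程师',
    "b.html": '<a href="a.html">上一页</a>北京招聘教师',
}
LINK_RE = re.compile(r'href="([^"]+)"')   # outgoing links
TAG_RE = re.compile(r"<[^>]+>")           # HTML tags to strip

# Toy lexicon (assumption); a real system would load a large dictionary.
LEXICON = {"重庆", "北京", "招聘", "软件", "工程师", "软件工程师", "教师"}
MAX_WORD_LEN = max(len(word) for word in LEXICON)


def parse(url):
    """Return the URL, its outgoing links, and its tag-free text."""
    html = PAGES[url]
    return url, LINK_RE.findall(html), TAG_RE.sub("", html)


def segment(text):
    """Forward maximum matching: take the longest lexicon word at each position."""
    tokens, i = [], 0
    while i < len(text):
        for size in range(min(MAX_WORD_LEN, len(text) - i), 0, -1):
            piece = text[i:i + size]
            # Fall back to a single character when no lexicon word matches.
            if size == 1 or piece in LEXICON:
                if piece.strip():
                    tokens.append(piece)
                i += size
                break
    return tokens


def build_index(urls):
    """Parse pages in worker threads, then map each term to the URLs containing it."""
    index = defaultdict(set)
    with ThreadPoolExecutor(max_workers=4) as pool:
        for url, _links, text in pool.map(parse, urls):
            for token in segment(text):
                index[token].add(url)
    return index


def highlight(text, term):
    """Wrap every occurrence of the hit term in <em> tags."""
    return text.replace(term, f"<em>{term}</em>")
```

Calling `build_index(["a.html", "b.html"])` yields an inverted index in which a query term such as `招聘` maps to both pages, and `highlight` marks the matched term in a result snippet. A production spider would also follow the extracted links, deduplicate URLs, and handle fetch errors, all of which this sketch leaves out.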
Author
阮昆
RUAN Kun (Enrollment and Employment Department of CQUPT, Chongqing 400065, China)
Source
《电脑知识与技术》
2013, Issue 5, pp. 3081-3085 (5 pages)
Computer Knowledge and Technology
Keywords
Employment information web spider
Search engine
Chinese word segmentation
Index algorithm
Lucene.Net