Abstract
This paper studies web page classification in Uyghur information retrieval. It examines the preprocessing of Uyghur text, the extraction of feature phrases from documents, and the construction of an information retrieval model, and proposes a retrieval method that incorporates web page classification and phrase extraction. Web pages are classified with a KNN-based method. Experiments show that the design suits the characteristics of the Uyghur language and improves query precision in a Uyghur information retrieval system, so that the returned results better match users' needs.
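As a rough illustration of the KNN-based classification step named in the abstract, the following minimal sketch labels a page by majority vote among its k most similar labeled pages. The abstract does not give the paper's features, similarity measure, or value of k, so the term-frequency vectors, cosine similarity, whitespace tokenizer, and the names `tokenize`, `cosine`, and `knn_classify` are all assumptions made for illustration. The toy English documents are placeholders standing in for preprocessed Uyghur text.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: plain whitespace tokenization. The paper's Uyghur
    # preprocessing and phrase extraction are not reproduced here.
    return text.lower().split()


def cosine(a, b):
    # Cosine similarity between two term-frequency Counters.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_classify(query, labeled_docs, k=3):
    # Score every labeled document against the query, keep the k most
    # similar ones, and return the most frequent label among them.
    q = Counter(tokenize(query))
    scored = sorted(
        ((cosine(q, Counter(tokenize(text))), label) for text, label in labeled_docs),
        key=lambda pair: pair[0],
        reverse=True,
    )
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]


# Hypothetical training set; labels and texts are illustrative only.
train = [
    ("football match goal team", "sports"),
    ("team wins match league", "sports"),
    ("bank interest rate market", "finance"),
    ("stock market price bank", "finance"),
]

print(knn_classify("team goal league", train, k=3))
print(knn_classify("bank market rate", train, k=3))
```

In a retrieval setting, the predicted category could be used to restrict or re-rank results for a query; how the paper combines classification with retrieval is not specified in the abstract.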
Source
《电脑知识与技术》
2011, No. 1, pp. 192-193 (2 pages)
Computer Knowledge and Technology
Funding
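National Natural Science Foundation of China (61063022)
Key Project of the Scientific Research Program of Higher Education Institutions of the Xinjiang Uygur Autonomous Region (XJEDU2006113)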
Keywords
Uyghur web pages
web page preprocessing
web page classification
text classification