
Web Page Classification and Its Application in Uyghur Information Retrieval (cited by: 2)

The Research of Automated Text classification in the Uyghur Information Retrieval
Abstract: This paper studies web page classification in Uyghur information retrieval. It discusses the preprocessing of Uyghur text, the extraction of characteristic phrases from documents, and the construction of an information retrieval model, and it proposes a retrieval method that incorporates web page classification and phrase extraction. Web pages are classified with a KNN-based method. Experiments show that the method suits the characteristics of the Uyghur language and improves query precision in a Uyghur information retrieval system, so that the returned results better match users' retrieval needs.
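Source: Computer Knowledge and Technology (《电脑知识与技术》), 2011, No. 1, pp. 192-193 (2 pages)
Funding: National Natural Science Foundation of China (61063022); Key Scientific Research Program of Universities of the Xinjiang Uyghur Autonomous Region (XJEDU2006113)
Keywords: Uyghur web pages; web page preprocessing; web page classification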
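The abstract names KNN-based web page classification but gives no details of the authors' implementation. The following is only a minimal sketch of the general technique, assuming whitespace tokenization, raw term-frequency vectors, cosine similarity, and similarity-weighted voting. The function names, the toy English training pages, and the category labels are all hypothetical and are not taken from the paper; real Uyghur text would first need the preprocessing and phrase extraction the abstract mentions.

```python
import math
from collections import Counter


def term_vector(text):
    """Build a term-frequency vector (assumes whitespace tokenization)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def knn_classify(query, labeled_pages, k=3):
    """Return the category chosen by the k most similar labeled pages.

    Each neighbor votes with a weight equal to its similarity to the query.
    Returns None when there are no labeled pages.
    """
    q = term_vector(query)
    scored = sorted(
        ((cosine(q, term_vector(text)), label) for text, label in labeled_pages),
        key=lambda pair: pair[0],
        reverse=True,
    )
    votes = Counter()
    for similarity, label in scored[:k]:
        votes[label] += similarity
    return votes.most_common(1)[0][0] if votes else None


# Hypothetical toy training set (illustrative only, not data from the paper).
TRAINING_PAGES = [
    ("football match goal team league", "sports"),
    ("team wins the cup final match", "sports"),
    ("stock market bank shares price", "finance"),
    ("bank raises interest rate market", "finance"),
]

predicted = knn_classify("the team scored a goal in the match", TRAINING_PAGES)
print(predicted)
```

In a retrieval setting, a predicted category like this could be used to restrict or re-rank the candidate pages for a query, which is one plausible reading of how classification is meant to raise query precision.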

