Abstract
Text categorization is the task of automatically determining the category to which a text belongs, based on its content, under a given classification scheme. In contrast to current text categorization techniques, the statistical semantic method describes the relationships among semantic elements and defines quantities such as the affinity between semantic elements and the looseness of a set of semantic elements. Based on these definitions, a method for selecting keyword sets is given, and the resulting keyword sets are used to construct a keyword-set tree. Classification is then carried out by mapping the word set of a text of unknown category onto the keyword-set tree.
Source
《昆明冶金高等专科学校学报》
CAS
2004, No. 4, pp. 11-15 (5 pages)
Journal of Kunming Metallurgy College