Abstract
A method for constructing a text categorization model for network information is proposed. Based on the characteristics of network datagrams, keywords are extracted as the classification feature terms, and keyword frequency statistics are used to build the text categorization model. Classification experiments were carried out with both the nearest-neighbor and the K-nearest-neighbor decision rules. The results show that K-nearest-neighbor classification outperforms nearest-neighbor classification.
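The pipeline summarized in the abstract (keyword frequency vectors classified by a nearest-neighbor or K-nearest-neighbor vote) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the abstract does not state which similarity measure was used, so cosine similarity is an assumption here, and the function names `tf_vector`, `cosine_similarity`, and `knn_classify` are hypothetical.

```python
from collections import Counter
import math


def tf_vector(tokens, vocabulary):
    """Count how often each vocabulary keyword occurs in the token list."""
    counts = Counter(tokens)
    return [counts[term] for term in vocabulary]


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero vector shares no keywords with anything
    return dot / (norm_a * norm_b)


def knn_classify(query, training, k=1):
    """Label `query` by majority vote among its k most similar samples.

    `training` is a list of (vector, label) pairs. With k=1 this is the
    plain nearest-neighbor rule. Vote ties go to the label encountered
    first, that is, the one belonging to the more similar sample.
    """
    ranked = sorted(
        training,
        key=lambda item: cosine_similarity(query, item[0]),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]
```

Calling `knn_classify(query, training, k=1)` applies the nearest-neighbor decision, while a larger `k` lets several similar samples vote, which is the comparison the abstract reports.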
Source
《上海理工大学学报》
EI
CAS
PKU Core Journal
2005, No. 1, pp. 83-86 (4 pages)
Journal of University of Shanghai for Science and Technology
Keywords
text categorization
nearest neighbor
K-nearest neighbor
feature extraction