To improve on traditional classification methods, such as vector space model (VSM)-based methods, which suffer from high computational complexity and poor scalability, a new classification method called IER is presented, based on two new concepts: interdependence and equivalent radius. In IER, attributes are selected according to their interdependence values, and the classification rule is based on the equivalent radius and the center of gravity. Algorithm analysis shows that IER is well suited to classifying large numbers of samples, with higher scalability and lower computational complexity. Experiments on classifying Chinese texts show that IER outperforms the k-nearest neighbor (kNN) and classification based on the center of classes (CCC) methods, so IER can be used online to automatically classify large numbers of samples while maintaining higher precision and recall.
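The abstract does not define the equivalent radius or interdependence, so the following is only a minimal sketch of the general idea of a rule built on a class center of gravity and a per-class radius. It assumes, purely for illustration, that the radius of a class is the mean Euclidean distance of its samples to their centroid, and that a sample is assigned to the class with the smallest radius-normalized distance. The function names `fit` and `classify` are hypothetical, and attribute selection by interdependence is not modeled.

```python
import math
from collections import defaultdict


def fit(samples, labels):
    """Compute a center of gravity and an assumed radius for each class."""
    groups = defaultdict(list)
    for x, y in zip(samples, labels):
        groups[y].append(x)

    model = {}
    for label, vectors in groups.items():
        dim = len(vectors[0])
        # Center of gravity: component-wise mean of the class samples.
        center = [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]
        # Assumption (not from the source): radius is the mean distance
        # from the class samples to the center of gravity.
        radius = sum(math.dist(v, center) for v in vectors) / len(vectors)
        model[label] = (center, radius)
    return model


def classify(model, x, eps=1e-12):
    """Return the label with the smallest radius-normalized distance to x."""
    def score(label):
        center, radius = model[label]
        # eps avoids division by zero for a class with a single sample.
        return math.dist(x, center) / (radius + eps)

    return min(model, key=score)


# Toy data: class "a" is spread out, class "b" is compact.
samples = [[0, 0], [0, 2], [2, 0], [2, 2],
           [10, 10], [10, 11], [11, 10], [11, 11]]
labels = ["a", "a", "a", "a", "b", "b", "b", "b"]
model = fit(samples, labels)
print(classify(model, [6, 6]))
```

Under these assumptions, classifying one sample costs one distance computation per class rather than one per training sample as in kNN, which is consistent with the scalability claim in the abstract. In the toy data, the point `[6, 6]` is closer to the center of `"b"` in raw distance but is assigned to `"a"` because `"a"` has the larger radius.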
Funding: Research supported by the NNSF of China (No. 10271048), the Science and Technology Commission of Shanghai Municipality (No. 04JC14031), and the NSF of Shanghai (No. 05ZR14046).
Funding: The National Natural Science Foundation of China (No. 70501024, 70501022) and the Humanity & Social Science Research Program of the Ministry of Education of China (No. 05JC870013).