Abstract: Setting aside increasingly complicated algorithms, this paper presents a new classification algorithm based on single-category concept matching. It also introduces a method for finding such concepts, which is essential to the algorithm. Experimental results show that the algorithm improves classification precision and speeds up classification to some extent.
Abstract: The classification algorithm is one of the key techniques affecting the performance of an automatic text classification system, and it plays an important role in automatic classification research. This paper comparatively analyzes k-NN, VSM, and the hybrid classification algorithm presented by our research group. Some 2000 Internet news items provided by ChinaInfoBank are used in the experiment. The results show that the hybrid algorithm presented by our group outperforms the other two algorithms.
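The abstract names k-NN and VSM as the two baselines but does not spell them out. The following is a minimal sketch of both under stated assumptions: whitespace tokenization, raw term-frequency vectors, cosine similarity, and a VSM classifier that compares a document against one summed vector per category. The function names and the toy English training set are hypothetical and stand in for the ChinaInfoBank news corpus; the group's hybrid algorithm is not shown because the abstract does not describe it.

```python
import math
from collections import Counter, defaultdict


def tf_vector(text):
    """Term-frequency vector of a whitespace-tokenized text (assumed tokenization)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(w * b.get(t, 0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def knn_classify(doc, train, k=3):
    """k-NN: majority vote among the k training documents most similar to doc."""
    q = tf_vector(doc)
    ranked = sorted(train, key=lambda item: cosine(q, tf_vector(item[0])), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def vsm_classify(doc, train):
    """VSM: pick the category whose summed term vector is most similar to doc."""
    category_vectors = defaultdict(Counter)
    for text, label in train:
        category_vectors[label].update(tf_vector(text))
    q = tf_vector(doc)
    return max(category_vectors, key=lambda label: cosine(q, category_vectors[label]))


# Hypothetical toy training set, for illustration only.
TRAIN = [
    ("stock market shares fall", "finance"),
    ("bank raises interest rates", "finance"),
    ("market rates and bank shares", "finance"),
    ("team wins football match", "sports"),
    ("player scores in final match", "sports"),
    ("coach praises football team", "sports"),
]

if __name__ == "__main__":
    print(knn_classify("bank shares rise in market", TRAIN))
    print(vsm_classify("football team wins final", TRAIN))
```

The practical difference is cost at classification time: k-NN compares the document with every training document, whereas the VSM variant compares it with one vector per category.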
Funding: The National Natural Science Foundation of China (No. 60082003)
Abstract: Web pages contain richer content than pure text, such as hyperlinks, HTML tags, and metadata, so Web page categorization differs from pure text categorization. For Chinese Internet news pages, a practical algorithm is proposed for extracting subject concepts from a Web page without a thesaurus. After these category-subject concepts are incorporated into the knowledge base, Web pages are classified by the hybrid algorithm, using an experimental corpus extracted from Xinhua net. Experimental results show that categorization performance improves when Web page features are used.
Funding: Supported by the National Excellent Doctoral Dissertation Foundation (No. 200041) and the National Key Basic Research and Development (973) Program of China (No. G2002cb312205)
Abstract: Fuzzy clustering has been used widely in pattern recognition, image processing, and data analysis. An improved fuzzy clustering algorithm was developed based on the conventional fuzzy c-means (FCM) method to obtain better clustering results. The update equations for the memberships and the cluster centers are derived from the alternating optimization algorithm. Two fuzzy scattering matrices in the objective function ensure compactness between data points and cluster centers and also strengthen the separation between cluster centers in terms of a novel separable criterion. The properties of the clustering algorithm are shown to improve on those of the FCM method. Numerical simulations show that the clustering algorithm gives more accurate clustering results than the FCM method.
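For reference, the conventional FCM baseline that the abstract builds on can be sketched as below. This is the standard method only, assuming Euclidean distance, fuzzifier m = 2, and caller-supplied initial centers; it does not include the paper's two fuzzy scattering matrices or its separable criterion, which the abstract does not specify. The function name `fcm` and its parameters are hypothetical.

```python
import math


def fcm(points, centers, m=2.0, iters=50, tol=1e-8, eps=1e-9):
    """Conventional fuzzy c-means by alternating optimization.

    points  : list of equal-length tuples
    centers : initial cluster centers (list of tuples), supplied by the caller
    Returns (centers, u), where u[i][k] is the membership of point k in cluster i.
    """
    centers = [tuple(c) for c in centers]
    power = 2.0 / (m - 1.0)
    n, c, dim = len(points), len(centers), len(points[0])
    u = []
    for _ in range(iters):
        # Membership update: u_ik = 1 / sum_j (d_ik / d_jk)^(2/(m-1)).
        # Distances are clamped at eps to avoid division by zero.
        dist = [[max(math.dist(v, x), eps) for x in points] for v in centers]
        u = [[1.0 / sum((dist[i][k] / dist[j][k]) ** power for j in range(c))
              for k in range(n)]
             for i in range(c)]

        # Center update: v_i = sum_k u_ik^m x_k / sum_k u_ik^m.
        new_centers = []
        for i in range(c):
            w = [u[i][k] ** m for k in range(n)]
            total = sum(w)
            new_centers.append(tuple(
                sum(w[k] * points[k][d] for k in range(n)) / total
                for d in range(dim)))

        # Stop when no center moves more than tol.
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, u


if __name__ == "__main__":
    # Hypothetical toy data: two well-separated groups of three points.
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    final_centers, memberships = fcm(data, [(0, 0), (10, 10)])
    print(final_centers)
```

Each iteration alternates between the two closed-form updates, which is the alternating optimization scheme the abstract refers to; the memberships of every point sum to 1 across clusters.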