Funding: The National Natural Science Foundation of China (No. 60473045), the Technology Research Project of Hebei Province (No. 05213573), and the Research Plan of Education Office of Hebei Province (No. 2004406).
Abstract: The conventional fuzzy class-association method classifies new texts by repeatedly scanning the classifier, which is inefficient. To address this problem, a new approach to text categorization based on the FCR-tree (fuzzy classification rules tree) is proposed. The compactness of the FCR-tree saves significant space when storing a large set of rules in which many words are repeated. Unlike ordinary classification rules, fuzzy classification rules contain not only words but also the fuzzy sets corresponding to the frequencies with which words appear in texts. The construction and structure of an FCR-tree therefore differ from those of a CR-tree. To reduce the difficulty of FCR-tree construction and rule retrieval, multiple k-FCR-trees are built. When classifying a new text, the paths of sub-trees led by words that do not appear in the text need not be searched, which reduces the number of rules traversed. Experimental results show that the proposed approach clearly outperforms the conventional method in efficiency.
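The abstract does not give the FCR-tree's actual node layout, so the following is only a minimal sketch of the general idea: rules that share leading (word, fuzzy set) items share a path, and lookup skips any sub-tree led by a word absent from the text. The names `Node`, `insert_rule`, `count_nodes`, and `matching_labels`, the sample rules, and the simplification that each word in a text carries a single fuzzy label are all assumptions made for illustration, not details from the paper.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    # Maps a (word, fuzzy_label) item to the child node reached through it.
    children: dict = field(default_factory=dict)
    # Class labels of the rules whose last item ends at this node.
    labels: list = field(default_factory=list)


def insert_rule(root, items, label):
    """Insert one rule; items are sorted so that shared prefixes merge."""
    node = root
    for item in sorted(items):
        node = node.children.setdefault(item, Node())
    node.labels.append(label)


def count_nodes(node):
    """Count all nodes in the tree, including the root."""
    return 1 + sum(count_nodes(child) for child in node.children.values())


def matching_labels(node, text_terms):
    """Collect labels of rules whose items all occur in text_terms.

    text_terms maps each word of the text to one fuzzy label (an assumed
    simplification). Sub-trees led by a word that is missing from the text,
    or that carries a different fuzzy label, are never visited.
    """
    found = list(node.labels)
    for (word, fuzzy_label), child in node.children.items():
        if text_terms.get(word) == fuzzy_label:
            found.extend(matching_labels(child, text_terms))
    return found


# Hypothetical rules: a list of (word, fuzzy set) items and a class label.
rules = [
    ([("goal", "high"), ("match", "low")], "sports"),
    ([("goal", "high"), ("team", "high")], "sports"),
    ([("market", "high")], "finance"),
]

root = Node()
for items, label in rules:
    insert_rule(root, items, label)

print(count_nodes(root))
print(matching_labels(root, {"goal": "high", "team": "high"}))
```

In this sketch the two "sports" rules share the `("goal", "high")` node, which is the kind of saving the abstract attributes to repeated words, and a text without the word "market" never descends into that sub-tree.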
Funding: Supported by the National Natural Science Foundation of China (No. 61301245, U1533104).
Abstract: To improve the efficiency of learning the triangular membership functions (TMFs) used in mining fuzzy association rules (FARs) in dynamic databases, a single-pass fuzzy c-means (SPFCM) algorithm is combined with the real-coded CHC genetic model to learn the TMFs incrementally. The cluster centers produced by SPFCM are taken as the midpoints of the TMFs. The CHC population is generated randomly according to the cluster centers and the constraint conditions among the TMFs. A new population for incremental learning is then composed of the best chromosomes stored during the first genetic process and chromosomes generated from the cluster centers adjusted by SPFCM. Experiments on real datasets show that the proposed approach converges to a solution in fewer generations than the existing batch learning approach, while the quality of the resulting TMFs is comparable. Compared with the existing incremental learning strategy, the proposed approach is superior in both TMF quality and time cost.
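To make the role of the cluster centers concrete, here is a minimal sketch of a triangular membership function whose midpoint is a cluster center. The abstract states only that the centers serve as midpoints; placing each triangle's feet at the neighboring centers is an assumption made here for illustration, since in the paper the remaining parameters are learned by the CHC genetic model. The function names and sample centers are hypothetical.

```python
def triangular_membership(x, left, mid, right):
    """Membership of x in a triangle with feet left/right and peak at mid.

    Assumes left < mid < right.
    """
    if x <= left or x >= right:
        return 0.0
    if x <= mid:
        return (x - left) / (mid - left)
    return (right - x) / (right - mid)


def tmfs_from_centers(centers):
    """Build one (left, mid, right) triangle per cluster center.

    Assumption for illustration only: each foot sits on the neighboring
    center, and the outermost feet mirror the gap to the nearest neighbor.
    Requires at least two distinct centers.
    """
    cs = sorted(centers)
    tmfs = []
    for i, mid in enumerate(cs):
        left = cs[i - 1] if i > 0 else mid - (cs[1] - mid)
        right = cs[i + 1] if i < len(cs) - 1 else mid + (mid - cs[-2])
        tmfs.append((left, mid, right))
    return tmfs


# Hypothetical cluster centers, e.g. as produced by a clustering step.
tmfs = tmfs_from_centers([10, 20, 40])
memberships = [triangular_membership(15, *t) for t in tmfs]
print(tmfs)
print(memberships)
```

With these feet, a value lying between two centers belongs partly to each neighboring fuzzy set, and moving a center (as an incremental clustering update would) shifts the peak of the corresponding triangle.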