Funding: Project (No. 60082003) supported by the National Natural Science Foundation of China
Abstract: This paper presents an improved term frequency/inverse document frequency (TF-IDF) approach that uses confidence, support, and characteristic words to enhance the recall and precision of text classification. Synonyms defined by a lexicon are processed in the improved TF-IDF approach. We discuss and analyze in detail the relationship among confidence, recall, and precision. Experiments in the science and technology domain gave promising results: the new TF-IDF approach improves the precision and recall of text classification compared with the conventional TF-IDF approach.
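For orientation, the conventional TF-IDF baseline and the association-rule notions of support and confidence can be sketched as below. This is a minimal illustration, not the paper's improved method: the abstract does not say how confidence, support, and characteristic words enter the weighting, so the sketch only computes standard definitions. The function names `tf_idf` and `support_confidence`, the toy documents, and the class labels are all hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document (conventional TF-IDF)."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            # Relative term frequency times log inverse document frequency.
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


def support_confidence(docs, labels, word, cls):
    """Support and confidence of the rule 'word -> cls' over labeled documents."""
    # Labels of the documents that contain the word.
    with_word = [label for doc, label in zip(docs, labels) if word in doc]
    both = sum(1 for label in with_word if label == cls)
    support = both / len(docs)
    confidence = both / len(with_word) if with_word else 0.0
    return support, confidence


# Hypothetical toy corpus, for illustration only.
docs = [
    ["laser", "optics", "beam"],
    ["laser", "surgery", "patient"],
    ["patient", "clinic", "surgery"],
]
labels = ["physics", "medicine", "medicine"]

weights = tf_idf(docs)
support, confidence = support_confidence(docs, labels, "laser", "physics")
```

In this toy corpus "laser" occurs in two of three documents, so it receives a lower IDF than "optics", and the rule "laser -> physics" holds for only one of the two documents containing the word. A word with high confidence for a single class is a plausible candidate for what the abstract calls a characteristic word, though that reading is an assumption.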
Funding: Supported by the Science Foundation of Shanghai Archive Bureau (0215)
Abstract: This paper builds on two existing theories of automatic indexing of thematic knowledge concepts. A prohibited-word (stop-word) table with position information is designed, and an improved Maximum Matching-Minimum Backtracking method is studied. The paper also examines an improved indexing algorithm and its application technology, based on rules and a thematic concept word table.
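As background, plain forward maximum matching, the family of dictionary-based segmentation methods that the improved method refines, can be sketched as follows. This is not the paper's Maximum Matching-Minimum Backtracking algorithm: the abstract gives no details of the backtracking step or of how position information is stored in the prohibited-word table, so both are omitted. The function name `forward_max_match`, the lexicon, the stop-word set, and the Chinese example string are hypothetical.

```python
def forward_max_match(text, lexicon, max_len=4):
    """Segment text greedily, taking the longest lexicon entry at each position."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until the lexicon matches.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            # A single character is always accepted as a fallback token.
            if size == 1 or candidate in lexicon:
                tokens.append(candidate)
                i += size
                break
    return tokens


# Hypothetical thematic word table and prohibited-word (stop-word) table.
lexicon = {"自动", "标引", "自动标引", "算法"}
stop_words = {"的"}

tokens = forward_max_match("自动标引的算法", lexicon)
# Drop prohibited words so only candidate index terms remain.
index_terms = [t for t in tokens if t not in stop_words]
```

Here the longest entry "自动标引" wins over its shorter parts "自动" and "标引", and the prohibited word "的" is filtered out before the remaining tokens are used as index terms.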
Abstract: To improve automatic retrieval of English vocabulary, an automatic classification method based on association rules is proposed that takes into account the distribution of semantic attributes in English vocabulary. A storage model for English vocabulary data is constructed, and a two-element linguistic feature function is built to describe the directionality of English lexical retrieval scheduling. A classification decision model is constructed from the contextual relations of English vocabulary, the association-rule features of the vocabulary are extracted, and an adaptive learning method is used to classify the vocabulary automatically. Simulation results show that the method performs well, with a low classification error rate, high retrieval precision, and small computational overhead.