Abstract: Although researchers in the ATC field have done a wide range of work based on SVM, almost all existing approaches rely on empirical model selection. Attempts at automatic model selection in practical, large-scale text classification systems have been limited. In this paper, we propose a new model selection algorithm that utilizes the DDAG learning architecture. This architecture yields a new large-scale text classifier with very good performance. Experimental results show that the proposed algorithm is efficient and has the necessary generalization capability when handling large-scale multi-class text classification tasks.
Funding: Supported by the National Natural Science Foundation of China (No. 60575041) and the High Technology Research and Development Program of China (No. 2006AA01Z150).
Abstract: In contrast to the traditional method in multi-document summarization, which builds a summary by adding sentences, a two-stage sentence selection approach is proposed that generates the summary by deleting sentences from a candidate sentence set. The two stages are the acquisition of a candidate sentence set and the optimum selection of sentences. In the first stage, the candidate sentence set is obtained by a redundancy-based sentence selection approach. In the second stage, sentences are deleted from the candidate sentence set according to their contribution to the whole set until the specified summary length is reached. On a test corpus, the ROUGE values of the summaries produced by the proposed approach, compared with those of the traditional sentence selection method, support its validity. The influence of the token chosen in the two-stage sentence selection approach on the quality of the generated summaries is also analyzed.
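The two-stage procedure described in this abstract can be illustrated with a minimal Python sketch. The abstract does not give the scoring functions, so everything concrete here is an assumption made for illustration: whitespace tokenization, token-frequency weights, Jaccard overlap as the redundancy test, and "contribution" measured as the weighted token coverage lost when a sentence is removed. The function names (`candidate_set`, `prune`, `summarize`) and the parameters `max_overlap` and `size` are hypothetical and are not taken from the paper.

```python
from collections import Counter


def tokens(sentence):
    """Lowercased whitespace tokens; a stand-in for a real tokenizer."""
    return sentence.lower().split()


def overlap(a, b):
    """Jaccard overlap between the token sets of two sentences."""
    ta, tb = set(tokens(a)), set(tokens(b))
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0


def coverage(sentences, weights):
    """Total weight of the distinct tokens covered by a set of sentences."""
    covered = set()
    for s in sentences:
        covered.update(tokens(s))
    return sum(weights[t] for t in covered)


def candidate_set(sentences, weights, max_overlap=0.5, size=4):
    """Stage 1 (assumed form): greedily keep high-scoring, non-redundant sentences."""
    ranked = sorted(sentences, key=lambda s: coverage([s], weights), reverse=True)
    chosen = []
    for s in ranked:
        if len(chosen) == size:
            break
        # Skip a sentence that is too similar to one already chosen.
        if all(overlap(s, c) <= max_overlap for c in chosen):
            chosen.append(s)
    return chosen


def prune(candidates, weights, max_words):
    """Stage 2 (assumed form): delete the least-contributing sentence until the summary fits."""
    summary = list(candidates)
    while summary and sum(len(tokens(s)) for s in summary) > max_words:
        total = coverage(summary, weights)

        def loss(i):
            # Contribution of sentence i = coverage lost when it is removed.
            return total - coverage(summary[:i] + summary[i + 1:], weights)

        drop = min(range(len(summary)), key=loss)
        del summary[drop]
    return summary


def summarize(sentences, max_words, size=4):
    """Run both stages with token-frequency weights computed over all input sentences."""
    weights = Counter(t for s in sentences for t in tokens(s))
    return prune(candidate_set(sentences, weights, size=size), weights, max_words)
```

Calling `summarize(sentences, max_words=10)` first builds an oversized candidate set and then shrinks it by deletion, which is the reverse of the additive selection the abstract contrasts it with. The actual contribution measure and length criterion used by the authors may differ from this sketch.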