WordNet-based lexical semantic classification for text corpus analysis
Authors: Long Jun (龙军), Wang Luda (王鲁达), Li Zude (李祖德), Zhang Zuping (张祖平), Yang Liu (杨柳). Journal of Central South University (SCIE, EI, CAS, CSCD), 2015, No. 5, pp. 1833-1840 (8 pages)
Many text classification methods depend on statistical term measures to implement document representation. Such document representations ignore the lexical semantic content of terms and the distilled mutual information, which leads to text classification errors. This work proposed a document representation method, the WordNet-based lexical semantic VSM, to solve the problem. Using WordNet, the method constructed a data structure of semantic-element information to characterize lexical semantic content, and adjusted EM modeling to disambiguate word stems. Then, in the lexical-semantic space of the corpus, a lexical-semantic eigenvector for document representation was built by calculating the weight of each synset, and was applied to the widely recognized NWKNN algorithm. On the Reuters-21578 text corpus and a version of it adjusted by lexical replacement, the experimental results show that the lexical-semantic eigenvector performs better than the TF-IDF-based term-statistic eigenvector in both F1 measure and scale of dimension. Because the method produces document representation eigenvectors, it has wide prospects for classification applications in text corpus analysis.
Keywords: document representation; lexical semantic content; classification; eigenvector
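The idea summarized in the abstract, representing a document by weighted synsets rather than by surface terms, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: `TOY_SYNSETS` is a hypothetical hand-written lexicon standing in for WordNet lookup plus disambiguation, the weights are plain relative frequencies, and `nwknn_predict` is a simplified class-weighted kNN vote in the spirit of neighbor-weighted kNN, whose exact formulation in the paper may differ.

```python
import math
from collections import Counter

# Hypothetical toy lexicon: word -> synset identifier. A real system would
# look candidates up in WordNet and disambiguate among them.
TOY_SYNSETS = {
    "car": "car.n.01", "automobile": "car.n.01",
    "buy": "buy.v.01", "purchase": "buy.v.01",
    "bank": "bank.n.01", "loan": "loan.n.01",
}

def term_vector(tokens):
    """Relative term frequencies (a term-statistic baseline, without IDF)."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()}

def synset_vector(tokens):
    """Relative synset frequencies; words missing from the lexicon are skipped."""
    counts = Counter(TOY_SYNSETS[t] for t in tokens if t in TOY_SYNSETS)
    total = sum(counts.values())
    return {s: c / total for s, c in counts.items()} if total else {}

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def nwknn_predict(query, train, k=3, exponent=2.0):
    """Class-weighted kNN vote over (vector, label) pairs.

    Larger classes get smaller weights, so frequent classes do not
    dominate the vote (assumed weighting, for illustration only).
    """
    sizes = Counter(label for _, label in train)
    smallest = min(sizes.values())
    class_weight = {c: (n / smallest) ** (-1.0 / exponent) for c, n in sizes.items()}
    neighbours = sorted(train, key=lambda item: cosine(query, item[0]), reverse=True)[:k]
    scores = Counter()
    for vec, label in neighbours:
        scores[label] += class_weight[label] * cosine(query, vec)
    return max(scores, key=scores.get)

doc_a = ["buy", "car"]
doc_b = ["purchase", "automobile"]  # doc_a after lexical replacement

# Term vectors share no dimension; synset vectors coincide.
print(cosine(term_vector(doc_a), term_vector(doc_b)))
print(cosine(synset_vector(doc_a), synset_vector(doc_b)))

train = [
    (synset_vector(["buy", "car"]), "autos"),
    (synset_vector(["car", "car"]), "autos"),
    (synset_vector(["bank", "loan"]), "finance"),
]
print(nwknn_predict(synset_vector(doc_b), train, k=2))
```

In this toy setting, replacing every word with a synonym leaves the synset vector unchanged while the term vector loses all overlap, which is the kind of effect the lexically replaced version of the corpus is meant to probe.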