Abstract
Feature extraction is a research focus in text mining, information retrieval, natural language processing (NLP), text sentiment analysis, and online public opinion analysis. It is a key component of a text mining system, and the performance of text feature extraction is an important measure of text categorization results. This paper summarizes feature selection algorithms from two perspectives and analyzes improvements and innovations to common feature extraction algorithms made in China and abroad. Finally, in view of the factors that affect feature extraction, it points out issues that should be considered in practical applications.
Authors
XU Guan-hua (徐冠华)
ZHAO Jing-xiu (赵景秀)
YANG Hong-ya (杨红亚)
LIU Shuang (刘爽)
School of Information Science and Engineering, Qufu Normal University, Rizhao 276800, China
Source
Software Guide (《软件导刊》)
2018, Issue 5, pp. 13-18 (6 pages)
Keywords
feature extraction
distance measure
information measure