Abstract
With the rapid development of big data, a huge and geometrically growing volume of information is generated every day, a large part of which is text. Text classification is the task of using a computer to assign texts to categories according to a given set of rules or a classification scheme, and it is widely used in natural language processing. This paper describes TF-IDF and text classification, and implements text classification based on TF-IDF.
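To make the idea of TF-IDF-based text classification concrete, the following is a minimal sketch in Python. It is not the implementation from the paper, which the abstract does not specify. It assumes a common weighting, tf(t, d) × log(N / df(t)), and a nearest-neighbor classifier by cosine similarity. The function names, the toy documents, and the labels are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def vectorize(doc, idf):
    """Map a tokenized document to a sparse {term: tf * idf} vector.

    Terms that never occurred in the training corpus are ignored.
    """
    counts = Counter(doc)
    return {t: (c / len(doc)) * idf[t] for t, c in counts.items() if t in idf}


def tf_idf_vectors(docs):
    """Compute idf = log(N / df) over the corpus and vectorize every document."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    idf = {term: math.log(n / count) for term, count in df.items()}
    return [vectorize(doc, idf) for doc in docs], idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors (0.0 if either is empty)."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def classify(doc, train_vectors, labels, idf):
    """Return the label of the training document most similar to doc."""
    query = vectorize(doc, idf)
    best = max(range(len(train_vectors)),
               key=lambda i: cosine(query, train_vectors[i]))
    return labels[best]


# Hypothetical toy corpus: already tokenized, two documents per class.
train_docs = [
    ["stock", "market", "price"],
    ["market", "trade", "bank"],
    ["match", "team", "goal"],
    ["team", "coach", "score"],
]
labels = ["finance", "finance", "sports", "sports"]

vectors, idf = tf_idf_vectors(train_docs)
print(classify(["bank", "price", "market"], vectors, labels, idf))  # finance
print(classify(["goal", "coach"], vectors, labels, idf))            # sports
```

Terms that appear in fewer training documents receive a larger weight, so a rare word such as "coach" influences the match more than a word shared across several documents. For Chinese text, a word segmentation step would be needed before this sketch applies, since it expects documents that are already split into tokens.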
Author
石凤贵
SHI Feng-gui (Department of Software Engineering, Ma'anshan Normal College, Anhui 243041)
Source
《现代计算机》
2020, No. 6, pp. 51-54, 75 (5 pages in total)
Modern Computer
Funding
Scientific Research Project of the Anhui Provincial Department of Education (No. KJ2017A852).