Abstract
The Support Vector Machine (SVM) is a new machine learning method proposed by Vapnik et al. Built on the structural risk minimization principle, constrained quadratic optimization, and kernel space theory, it handles several common machine learning problems well, such as model selection, overfitting, nonlinearity, and the curse of dimensionality, which makes it well suited to text classification. This paper discusses the advantages of SVM in text classification and the new problems that arise there, examines existing active learning methods, and proposes an improved active learning algorithm that achieves good classification performance with a small labeled training set. The method is particularly advantageous in domains where class labels are hard to obtain in large numbers or where labeling samples is costly, and it suits fast-changing Internet applications.
Source
《软件导刊》
2006, No. 12, pp. 26-28 (3 pages)
Software Guide