Abstract
A patent summary is a condensed description of a patent. Segmenting a summary by content makes it possible to locate the corresponding patent more accurately. Because patent summaries are short and carry no explicit markers between different content units, traditional text segmentation methods cannot be applied. This paper recasts patent summary segmentation as a sentence classification problem and applies classification algorithms to solve it. By analyzing how different classification algorithms and different features perform on this problem, the paper verifies that segmenting patent summaries through sentence classification is feasible.
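The idea of treating segmentation as sentence classification can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not name the classifiers, the sentence categories, or the training data used in the paper, so the naive Bayes model, the labels `field`, `structure`, and `effect`, and the toy sentences below are hypothetical. The sketch also uses plain word counts, whereas the paper's keywords indicate that part-of-speech features were studied.

```python
import math
from collections import Counter, defaultdict
from itertools import groupby


def tokenize(sentence):
    # Lowercased whitespace tokens; a stand-in for real word or POS features.
    return sentence.lower().replace(",", " ").replace(".", " ").split()


class NaiveBayes:
    """Tiny multinomial naive Bayes with add-one smoothing (illustrative only)."""

    def fit(self, sentences, labels):
        self.label_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for sentence, label in zip(sentences, labels):
            self.word_counts[label].update(tokenize(sentence))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, sentence):
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.label_counts.items():
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            score = math.log(count / total)  # log prior
            for word in tokenize(sentence):
                score += math.log((self.word_counts[label][word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def segment(sentences, model):
    """Label each sentence, then merge adjacent sentences that share a label."""
    labeled = [(model.predict(s), s) for s in sentences]
    return [(label, [s for _, s in group])
            for label, group in groupby(labeled, key=lambda pair: pair[0])]


# Hypothetical training sentences and labels (not from the paper).
train = [
    ("The invention relates to a water pump", "field"),
    ("The utility model discloses a bicycle brake", "field"),
    ("The pump comprises a housing and a rotor", "structure"),
    ("The brake comprises a lever connected to a cable", "structure"),
    ("The device has the advantages of low cost", "effect"),
    ("The structure is simple and the cost is low", "effect"),
]
model = NaiveBayes().fit([s for s, _ in train], [l for _, l in train])

# A hypothetical patent summary, already split into sentences.
summary = [
    "The invention discloses a lamp",
    "The lamp comprises a housing and a bulb",
    "The bulb is connected to a base",
    "The lamp has the advantages of simple structure and low cost",
]
segments = segment(summary, model)
for label, group in segments:
    print(label, group)
```

In this sketch, segment boundaries fall wherever the predicted label changes between adjacent sentences, so no explicit boundary markers in the text are needed.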
Source
《山东大学学报(理学版)》
CAS
CSCD
PKU Core (Peking University Core Journals)
2012, No. 5, pp. 68-72, 77 (6 pages in total)
Journal of Shandong University(Natural Science)
Keywords
patent summary
text segmentation
sentence unit
classification algorithm
part of speech