Abstract
Sentence-level semantic representation is a pressing open problem in natural language processing and a major factor limiting its deeper application. Based on the characteristics of Chinese text, this paper abandons the earlier view that treats semantics and syntax separately and proposes the concept of the semantic chunk (语义组块, rendered by the authors as "semantic clustering unit"). It builds a model that automatically extracts Chinese semantic chunks using a deep learning method based on the deep belief network. The model takes each noun in a sentence as a core and combines it with the words before and after it to form Chinese semantic chunks. Extraction models are then built with three methods, a neural network, a support vector machine, and a deep belief network, and three groups of experiments are carried out. The results show that, on high-dimensional, large-scale data, the deep belief network extracts chunks more effectively than the support vector machine and the neural network.
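The noun-centered combination step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `candidate_chunks`, the `window` parameter, and the convention that noun tags start with "n" are all assumptions made for illustration. The sketch only enumerates candidate spans around each noun; a classifier (neural network, support vector machine, or deep belief network) would then have to decide which candidates are real semantic chunks.

```python
from typing import List, Tuple

Token = Tuple[str, str]  # (word, part-of-speech tag)


def candidate_chunks(tokens: List[Token], window: int = 1) -> List[Tuple[int, Tuple[str, ...]]]:
    """Enumerate candidate chunks centered on each noun.

    Assumption: a tag starting with "n" marks a noun.
    Each candidate is (noun_index, words), where words spans up to
    `window` tokens to the left and right of the noun.
    """
    candidates = []
    for i, (_, tag) in enumerate(tokens):
        if not tag.startswith("n"):
            continue  # only nouns act as chunk cores
        for left in range(window + 1):
            for right in range(window + 1):
                start, end = i - left, i + right
                if start < 0 or end >= len(tokens):
                    continue  # span would fall outside the sentence
                words = tuple(w for w, _ in tokens[start:end + 1])
                candidates.append((i, words))
    return candidates


if __name__ == "__main__":
    # Hypothetical pre-tagged sentence; the tags are illustrative only.
    sentence = [("深度", "a"), ("学习", "n"), ("模型", "n")]
    for core, words in candidate_chunks(sentence, window=1):
        print(core, "".join(words))
```

Widening `window` grows the candidate set quickly, which is one way such a setup ends up with high-dimensional input for the downstream classifier.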
Source
《计算机应用研究》 (Application Research of Computers)
CSCD
Peking University Core Journals (北大核心)
2018, No. 2, pp. 396-399 (4 pages)
Funding
Supported by the National Natural Science Foundation of China (Grant Nos. 61462027, 61363072)
Keywords
semantic representation (语义表述)
deep belief net (深度信念网络)
deep learning (深度学习)
Chinese semantic clustering unit (中文语义组块)