Abstract
Text mining rests on statistical analysis of the terms in a text. Typically, a basic measure such as the frequency of a term (a word or phrase) is computed to judge how important that term is in a document. Statistical analysis alone, however, may not capture the exact meaning a term carries in its sentence. To address this problem, this paper proposes a new concept-based model framework that can efficiently find significant matching and related concepts between documents.
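As an illustration of the baseline measure the abstract refers to, the following is a minimal sketch of term frequency in Python. It is not the concept-based framework proposed in the paper. The function name `term_frequency`, the simple lowercase-letter tokenizer, and the sample sentence are all assumptions made for this example.

```python
import re
from collections import Counter


def term_frequency(text: str) -> dict[str, float]:
    """Return each token's count divided by the total number of tokens.

    Hypothetical helper for illustration only; tokenization is a naive
    split on runs of ASCII letters after lowercasing.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    total = len(tokens)
    if total == 0:
        return {}
    counts = Counter(tokens)
    return {term: count / total for term, count in counts.items()}


# Assumed sample text: "bank" appears twice with two different senses.
doc = "The bank raised rates. The river bank flooded."
tf = term_frequency(doc)
print(tf["bank"])   # both senses are counted as the same term
print(tf["river"])
```

In this sample, the financial "bank" and the river "bank" are pooled into one count, which is the kind of lost sentence-level meaning the abstract describes as a limitation of purely statistical analysis.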
Author
WEI Shuang (魏爽), Sanya University, Sanya 572000, Hainan
Source
Computer & Telecommunication (《电脑与电信》), 2018, No. 3, pp. 46-48 (3 pages)
Keywords
concept model
data mining
text clustering
enhanced mining