Abstract
To improve the completeness of concept extraction for an electrochemical machining (ECM) process knowledge ontology, this paper builds a semi-automatic domain term extraction model. Drawing on ideas from statistical analysis and data mining, the model introduces an N-Word algorithm for extracting multi-word phrases among domain terms; the 3-Word setting gave the best phrase-formation performance. To improve the accuracy of the extracted terms, candidates were filtered by mutual information (MI) and absolute word frequency, yielding 2137 terms. After further term correction and synonym merging, 1894 standardized domain concepts were obtained. The model meets the needs of term extraction in the ECM field, improves the domain coverage of the terms, and helps ensure the accuracy of ontology construction.
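The filtering step described above can be illustrated with a minimal sketch. The abstract does not specify the N-Word algorithm or its thresholds, so everything here is an assumption made for illustration: the function names (`candidate_ngrams`, `pmi`, `extract_terms`), the cutoffs `min_freq` and `min_mi`, the use of pre-tokenized input, and the generalization of pointwise mutual information to n words as the log ratio of the joint probability to the product of the word probabilities.

```python
import math
from collections import Counter


def candidate_ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def pmi(ngram, ngram_counts, unigram_counts, total_ngrams, total_unigrams):
    """Pointwise mutual information of an n-gram (assumed n-word form)."""
    p_joint = ngram_counts[ngram] / total_ngrams
    p_independent = 1.0
    for word in ngram:
        p_independent *= unigram_counts[word] / total_unigrams
    return math.log2(p_joint / p_independent)


def extract_terms(sentences, n=3, min_freq=2, min_mi=1.0):
    """Keep n-grams that pass both the frequency and the MI threshold.

    `sentences` is a list of token lists. The thresholds are hypothetical
    defaults, not values taken from the paper.
    """
    unigram_counts = Counter()
    ngram_counts = Counter()
    for tokens in sentences:
        unigram_counts.update(tokens)
        ngram_counts.update(candidate_ngrams(tokens, n))

    total_unigrams = sum(unigram_counts.values())
    total_ngrams = sum(ngram_counts.values())

    terms = {}
    for ngram, freq in ngram_counts.items():
        if freq < min_freq:  # absolute word-frequency filter
            continue
        score = pmi(ngram, ngram_counts, unigram_counts,
                    total_ngrams, total_unigrams)
        if score >= min_mi:  # mutual-information filter
            terms[ngram] = (freq, score)
    return terms


if __name__ == "__main__":
    corpus = [
        ["pulse", "electrochemical", "machining", "improves", "accuracy"],
        ["pulse", "electrochemical", "machining", "reduces", "stray", "corrosion"],
        ["the", "tool", "cathode", "feeds", "slowly"],
    ]
    for ngram, (freq, score) in extract_terms(corpus).items():
        print(" ".join(ngram), freq, round(score, 2))
```

With `n=3` the sketch corresponds loosely to the 3-Word setting; for Chinese text, a word segmentation step would be needed before the token lists are built, and the later term correction and synonym merging stages are not covered here.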
Source
Software Engineering and Applications (《软件工程与应用》)
2022, No. 6, pp. 1554-1560 (7 pages)