Abstract
Obtaining frequent government affairs terms and term clusters is a prerequisite for building a practical government affairs ontology and for the in-depth development and use of government information resources. This paper proposes an Apriori-based method for discovering frequent government affairs terms and term clusters, and reports related experiments on government training documents. The method first uses typical government documents to construct a government affairs dictionary and to obtain a database of government affairs sentences. On that basis, the Apriori algorithm is applied to discover the frequent government affairs itemsets and term clusters in the sentence database.
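The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy dictionary, the sample sentences, the substring-based term matching, and the names `sentences_to_transactions` and `apriori` are all assumptions made for illustration. Each sentence is treated as one transaction containing the dictionary terms it mentions, and a textbook level-wise Apriori search then finds the term sets that co-occur in at least `min_support` sentences.

```python
from itertools import combinations


def sentences_to_transactions(sentences, dictionary):
    """Map each sentence to the set of dictionary terms it contains.

    Assumption: plain substring matching stands in for whatever
    segmentation the paper actually uses.
    """
    return [frozenset(t for t in dictionary if t in s) for s in sentences]


def apriori(transactions, min_support):
    """Return {itemset: support count} for every frequent itemset."""
    # Level 1: count individual terms.
    counts = {}
    for transaction in transactions:
        for item in transaction:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        candidates = set()
        # Join step: merge frequent (k-1)-itemsets into k-itemsets.
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) != k:
                continue
            # Prune step: every (k-1)-subset must itself be frequent.
            if all(frozenset(sub) in current
                   for sub in combinations(union, k - 1)):
                candidates.add(union)
        # Count support of the surviving candidates.
        counts = {c: sum(1 for t in transactions if c <= t)
                  for c in candidates}
        current = {s: c for s, c in counts.items() if c >= min_support}
        frequent.update(current)
        k += 1
    return frequent


# Hypothetical toy data, not taken from the paper.
dictionary = {"budget", "audit", "tax", "permit"}
sentences = [
    "the budget audit covers tax revenue",
    "annual budget and audit report",
    "tax audit of the budget office",
    "permit application form",
]

transactions = sentences_to_transactions(sentences, dictionary)
frequent = apriori(transactions, min_support=2)
```

In this sketch the frequent itemsets of size two or more play the role of term clusters: they are groups of dictionary terms that repeatedly appear in the same sentence. The support threshold is an absolute sentence count here; a relative threshold would work the same way after dividing by the number of sentences.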
Source
《计算机工程与设计》
CSCD
Peking University Core Journal
2007, Issue 24, pp. 5942-5944, 5968 (4 pages)
Computer Engineering and Design
Funding
National Natural Science Foundation of China (Grant No. 70573103)
Keywords
government affair frequent terms
term clusters
Apriori algorithm
government affair ontology
ontology construction