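Abstract
Industrial production of biochemical products requires a suitable production environment, but because the production process is complex, identifying suitable conditions is difficult. Data mining looks for regularities in existing data: it can find association patterns in historical data and thereby identify production-environment conditions that favor the decision target. Addressing the data characteristics of production in biochemical enterprises, this paper analyzes their production data using association rule mining. Taking into account the data model required by most current association rule mining algorithms, it focuses on the relationship between environmental factors and environmental-factor data items, and proposes methods for splitting raw data indicators into data items and for merging the split data items into decision targets. Because the decision targets in biochemical production are fixed, it proposes an optimized association rule mining algorithm for the case of a fixed decision item, which can quickly mine the frequent itemsets of interest. On this basis, an application system providing data preprocessing (splitting of environmental indicators), association knowledge discovery, and result generation was developed, and preliminary tests and analysis were carried out; the system's output can help enterprises optimize the production environment. The study shows that analyzing biochemical enterprise data with association rule mining is effective.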
An optimized production environment is a key issue in industrial biochemical production. Because the mechanisms of biochemical production are complex, identifying favorable environmental conditions is very difficult. A large amount of data has accumulated over years of industrial production, and association rule mining, a data mining technique, makes it possible to find valuable rules in that data that may improve production efficiency and quality. By analyzing the features of biochemical production datasets, this paper proposes a method of separating and merging environmental indexes and decision indexes to generate a ready-to-use dataset. Based on the generated dataset, an optimized algorithm for extracting strong association rules is suggested. The proposed algorithm targets association rules with predetermined decision items, and the computational complexity is thereby greatly reduced. To further explain how the mined rules can be applied in the biochemical industry, their possible use in selecting variables for experiment and laboratory design is introduced. The results show that association rule mining can be applied to environmental condition control in industrial production. A software prototype was built to mine association rules from an available dataset, and it proved successful.
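The two steps summarized above, splitting numeric environmental indexes into discrete items and mining rules whose consequent is a fixed decision item, can be illustrated with a minimal sketch. This is not the paper's algorithm, which the abstract does not spell out; the function names, cut points, and sample records are all hypothetical. The only idea it borrows is that, with the decision item fixed, candidate antecedents need only be drawn from records that already contain that item.

```python
from itertools import combinations

def to_item(name, value, cut_points):
    """Map a numeric index to a discrete item such as 'temp=1' (bin number)."""
    bin_index = sum(value >= c for c in cut_points)
    return f"{name}={bin_index}"

def rules_for_target(transactions, target, min_support, min_confidence, max_len=2):
    """Return rules X -> target as (antecedent, support, confidence) tuples."""
    n = len(transactions)
    # Only records containing the target can support a rule X -> target,
    # so candidate items are collected from those records alone.
    with_target = [t - {target} for t in transactions if target in t]
    candidates = sorted(set().union(*with_target))
    rules = []
    for k in range(1, max_len + 1):
        for combo in combinations(candidates, k):
            x = frozenset(combo)
            joint = sum(x <= t for t in with_target)  # records with X and target
            if joint == 0 or joint / n < min_support:
                continue
            base = sum(x <= t for t in transactions)  # records with X
            confidence = joint / base
            if confidence >= min_confidence:
                rules.append((x, joint / n, confidence))
    return rules

# Hypothetical records: (temperature, pH, whether the yield was high)
records = [(30.5, 6.9, True), (31.0, 7.1, True), (36.0, 7.0, False), (30.0, 5.5, False)]
transactions = [
    {to_item("temp", t, [35.0]), to_item("ph", p, [6.5]),
     "yield=high" if high else "yield=low"}
    for t, p, high in records
]
rules = rules_for_target(transactions, "yield=high", min_support=0.5, min_confidence=0.9)
for antecedent, support, confidence in rules:
    print(sorted(antecedent), "-> yield=high", support, confidence)
```

On this toy data the only rule that passes both thresholds combines the low-temperature bin with the higher pH bin. A practical implementation would add Apriori-style pruning rather than enumerating every combination.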
Source
《计算机与应用化学》
CAS
CSCD
Peking University Core Journals (北大核心)
2010, Issue 9, pp. 1252-1256 (5 pages)
Computers and Applied Chemistry
Keywords
association rule mining
system development
biochemical enterprise
decision target
association rule, data mining, bio-chemical enterprise, decision