To equip data-driven dynamic chemical process models with strong interpretability, we develop a light attention–convolution–gate recurrent unit (LACG) architecture with three sub-modules (a basic module, a brand-new light attention module, and a residue module) that are specially designed to learn the general dynamic behavior, transient disturbances, and other input factors of chemical processes, respectively. Combined with a hyperparameter optimization framework, Optuna, the effectiveness of the proposed LACG is tested by distributed control system data-driven modeling experiments on the discharge flowrate of an actual deethanization process. The LACG model provides significant advantages in prediction accuracy and model generalization compared with other models, including the feedforward neural network, convolution neural network, long short-term memory (LSTM), and attention-LSTM. Moreover, compared with the simulation results of a deethanization model built using Aspen Plus Dynamics V12.1, the LACG parameters are demonstrated to be interpretable, and more details on the variable interactions can be observed from the model parameters in comparison with the traditional interpretable model attention-LSTM. This contribution enriches interpretable machine learning knowledge and provides a reliable method with high accuracy for actual chemical process modeling, paving a route to intelligent manufacturing.
This paper proposes an integrative framework for network-structured analytic network process (ANP) modeling. The underlying rationales include: 1) creating the measuring items for the complex decision problems; 2) applying factor analysis to reduce the complex measuring items into fewer constructs; 3) employing a Bayesian network classifier technique to discover the causal directions among constructs; 4) using partial least squares path modeling to test the causal relationships among the items and constructs. The proposed framework is applied to knowledge discovery in a case study of high-tech companies' enterprise resource planning (ERP) benefits and satisfaction in Hsinchu Science Park, Taiwan. The results show that the proposed framework for ANP modeling can reach a satisfactory level of convergent reliability and validity. Based on the findings, pragmatic implications for ERP vendors are discussed. This study sheds new light on the long-neglected, yet critical, issue of decision structures and knowledge discovery for ANP modeling.
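1 Introduction. Knowledge discovery in databases (KDD) has emerged in recent years with the development of database and artificial intelligence technologies. It is the high-level process of extracting credible, novel, effective, and human-understandable patterns from large amounts of data. It mainly uses machine learning algorithms or statistical methods for knowledge learning, and the knowledge-learning stage of KDD is generally called data mining. Data mining is a very important processing step in KDD, and people often use the two terms without distinction. Generally speaking, "data mining" is more common in engineering applications, while "knowledge discovery in databases" is more common in research. Research on KDD aims to apply the results of knowledge discovery to real data processing and to support scientific decision-making. It is precisely because of this.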
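Education informatization and the big data strategy have become matters of national will, and discovering new knowledge or updating existing knowledge through data mining is one of the most desirable products of computer information processing. After clarifying the scope of knowledge discovery and data mining (KDDM), and on the basis of a review and comprehensive analysis of research on KDDM process models in Europe and the United States, this paper summarizes five main characteristics of KDDM process models: interdisciplinarity, diversity of applications, exploratory nature, iterative process, and uncertainty of goals and results. From these it derives four lessons for applying and implementing KDDM engineering practice in education, and it offers four recommendations for the application of KDDM in education.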
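Protein secondary structure prediction is widely recognized as an internationally difficult problem in bioinformatics. Building on extended research into the knowledge discovery theory based on inner cognitive mechanism (KDTICM) and on the knowledge discovery in database* (KDD*) model, this paper proposes SAC (structural association classification), a multi-class classification algorithm based on structural sequences that can effectively address protein secondary structure prediction. By refining the knowledge base with a support threshold, the algorithm achieves a prediction accuracy above 85%. With this algorithm at its core, a protein secondary structure prediction model, the compound pyramid model, is constructed. Experiments show that its prediction accuracy exceeds 80% on the RS126, CB513I, and LP datasets, surpassing the currently known mainstream international level.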
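The idea of keeping only classification rules whose support clears a threshold can be illustrated with a minimal sketch. This is not the SAC algorithm from the abstract, whose details are not given here; it is a generic class-association-rule toy. The sliding-window patterns, the `H`/`C` (helix/coil) labels, the function names `mine_rules` and `predict`, and the `window`, `min_support`, and `default` parameters are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def mine_rules(sequence, labels, window=3, min_support=2):
    """Collect pattern -> label rules whose support is at least min_support.

    A pattern is the residue window centred on position i; its label is the
    structure label at i. Support is the count of the most frequent label.
    """
    half = window // 2
    counts = defaultdict(Counter)
    for i in range(half, len(sequence) - half):
        pattern = sequence[i - half:i + half + 1]
        counts[pattern][labels[i]] += 1

    rules = {}
    for pattern, label_counts in counts.items():
        label, support = label_counts.most_common(1)[0]
        if support >= min_support:  # prune weakly supported rules
            confidence = support / sum(label_counts.values())
            rules[pattern] = (label, support, confidence)
    return rules


def predict(rules, sequence, window=3, default="C"):
    """Label each position with its matching rule, else the default label."""
    half = window // 2
    out = []
    for i in range(len(sequence)):
        if i < half or i >= len(sequence) - half:
            out.append(default)  # no full window at the sequence ends
            continue
        rule = rules.get(sequence[i - half:i + half + 1])
        out.append(rule[0] if rule else default)
    return "".join(out)


# Toy training data (hypothetical, not taken from RS126 or any real dataset).
train_sequence = "AAAGAAAG"
train_labels = "HHHCHHHC"

rules = mine_rules(train_sequence, train_labels, window=3, min_support=2)
predicted = predict(rules, "GAAAG", window=3)
print(rules)
print(predicted)
```

Raising `min_support` shrinks the rule base to patterns seen more often, which trades coverage for reliability; positions with no surviving rule fall back to the default label.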
Funding: Support provided by the National Natural Science Foundation of China (22122802, 22278044, and 21878028), the Chongqing Science Fund for Distinguished Young Scholars (CSTB2022NSCQ-JQX0021), and the Fundamental Research Funds for the Central Universities (2022CDJXY-003).