The present article outlines progress made in designing an intelligent information system for automatic management and knowledge discovery in large numeric and scientific databases, with a validating application to th...The present article outlines progress made in designing an intelligent information system for automatic management and knowledge discovery in large numeric and scientific databases, with a validating application to the CAST-NEONS environmental databases used for ocean modeling and prediction. We describe a discovery-learning process (Automatic Data Analysis System) which combines the features of two machine learning techniques to generate sets of production rules that efficiently describe the observational raw data contained in the database. Data clustering allows the system to classify the raw data into meaningful conceptual clusters, which the system learns by induction to build decision trees, from which are automatically deduced the production rules.展开更多
Since the early 1990, significant progress in database technology has provided new platform for emerging new dimensions of data engineering. New models were introduced to utilize the data sets stored in the new genera...Since the early 1990, significant progress in database technology has provided new platform for emerging new dimensions of data engineering. New models were introduced to utilize the data sets stored in the new generations of databases. These models have a deep impact on evolving decision-support systems. But they suffer a variety of practical problems while accessing real-world data sources. Specifically a type of data storage model based on data distribution theory has been increasingly used in recent years by large-scale enterprises, while it is not compatible with existing decision-support models. This data storage model stores the data in different geographical sites where they are more regularly accessed. This leads to considerably less inter-site data transfer that can reduce data security issues in some circumstances and also significantly improve data manipulation transactions speed. The aim of this paper is to propose a new approach for supporting proactive decision-making that utilizes a workable data source management methodology. The new model can effectively organize and use complex data sources, even when they are distributed in different sites in a fragmented form. At the same time, the new model provides a very high level of intellectual management decision-support by intelligent use of the data collections through utilizing new smart methods in synthesizing useful knowledge. The results of an empirical study to evaluate the model are provided.展开更多
An integrated solution for discovery of literature information knowledge is proposed. The analytic model of literature Information model and discovery of literature information knowledge are illustrated. Practical ill...An integrated solution for discovery of literature information knowledge is proposed. The analytic model of literature Information model and discovery of literature information knowledge are illustrated. Practical illustrative example for discovery of literature information knowledge is given.展开更多
It is common in industrial construction projects for data to be collected and discarded without being analyzed to extract useful knowledge. A proposed integrated methodology based on a five-step Knowledge Discovery in...It is common in industrial construction projects for data to be collected and discarded without being analyzed to extract useful knowledge. A proposed integrated methodology based on a five-step Knowledge Discovery in Data (KDD) model was developed to address this issue. The framework transfers existing multidimensional historical data from completed projects into useful knowledge for future projects. The model starts by understanding the problem domain, industrial construction projects. The second step is analyzing the problem data and its multiple dimensions. The target dataset is the labour resources data generated while managing industrial construction projects. The next step is developing the data collection model and prototype data ware-house. The data warehouse stores collected data in a ready-for-mining format and produces dynamic On Line Analytical Processing (OLAP) reports and graphs. Data was collected from a large western-Canadian structural steel fabricator to prove the applicability of the developed methodology. The proposed framework was applied to three different case studies to validate the applicability of the developed framework to real projects data.展开更多
A database stores data in order to provide the user with information. However, how a database may achieve this is not always clear. The main reason for this seems that we, who are in the database community, have not f...A database stores data in order to provide the user with information. However, how a database may achieve this is not always clear. The main reason for this seems that we, who are in the database community, have not fully understood and therefore clearly defined the notion of “the information that data in a database carry”, in other words, “the information content of data”. As a result, databases’ capability is limited in terms of answering queries, especially, when users explore information beyond the scope of data stored in a database, the database normally cannot provide it. The underlying reason of the problem is that queries are answered based on a direct match between a query and data (up to aggregations of the data). We observe that this is because the information that data carry is seen as exactly the data per se. To tackle this problem, we propose the notion of information content inclusion relation, and show that it formulates the intuitive notion of the “information content of data” and then show how this notion may be used for the derivation of information from data in a database.展开更多
Structural choice is a significant decision having an important influence on structural function, social economics, structural reliability and construction cost. A Case Based Reasoning system with its retrieval part c...Structural choice is a significant decision having an important influence on structural function, social economics, structural reliability and construction cost. A Case Based Reasoning system with its retrieval part constructed with a KDD subsystem, is put forward to make a decision for a large scale engineering project. A typical CBR system consists of four parts: case representation, case retriever, evaluation, and adaptation. A case library is a set of parameterized excellent and successful structures. For a structural choice, the key point is that the system must be able to detect the pattern classes hidden in the case library and classify the input parameters into classes properly. That is done by using the KDD Data Mining algorithm based on Self Organizing Feature Maps (SOFM), which makes the whole system more adaptive, self organizing, self learning and open.展开更多
This paper elaborate the emergence of human information database and the important role it plays in the various industries of economic development. It also interpret the primary human information database of current d...This paper elaborate the emergence of human information database and the important role it plays in the various industries of economic development. It also interpret the primary human information database of current domestic and abroad and analysis it's classification characteristic, Besides, this papers further explains how to make use of human information database and how to make the database to play its due value. In the end, the prospect of our country's body information database has been set forth, using relatively mature foreign database to improve Chinese body information database.展开更多
在KDD(knowledge discovery in database)中,对所发现的知识进行评价是一个很重要的环节.提出了一种针对KDD中因果关联规则的自动评价方法.该评价方法采用了全新的、有效的知识表示方法(语言场和语言值结构)和推理机制(因果关系定性推...在KDD(knowledge discovery in database)中,对所发现的知识进行评价是一个很重要的环节.提出了一种针对KDD中因果关联规则的自动评价方法.该评价方法采用了全新的、有效的知识表示方法(语言场和语言值结构)和推理机制(因果关系定性推理机制),并且具有通用性和交互性的特征.给出了此评价方法的理论依据和构造过程,并提供了相应的算法.通过对具体实例的运行检验,证明了此评价方法的有效性.通过与相关工作的比较,证明了其先进性.展开更多
文摘The present article outlines progress made in designing an intelligent information system for automatic management and knowledge discovery in large numeric and scientific databases, with a validating application to the CAST-NEONS environmental databases used for ocean modeling and prediction. We describe a discovery-learning process (Automatic Data Analysis System) which combines the features of two machine learning techniques to generate sets of production rules that efficiently describe the observational raw data contained in the database. Data clustering allows the system to classify the raw data into meaningful conceptual clusters, which the system learns by induction to build decision trees, from which are automatically deduced the production rules.
文摘Since the early 1990, significant progress in database technology has provided new platform for emerging new dimensions of data engineering. New models were introduced to utilize the data sets stored in the new generations of databases. These models have a deep impact on evolving decision-support systems. But they suffer a variety of practical problems while accessing real-world data sources. Specifically a type of data storage model based on data distribution theory has been increasingly used in recent years by large-scale enterprises, while it is not compatible with existing decision-support models. This data storage model stores the data in different geographical sites where they are more regularly accessed. This leads to considerably less inter-site data transfer that can reduce data security issues in some circumstances and also significantly improve data manipulation transactions speed. The aim of this paper is to propose a new approach for supporting proactive decision-making that utilizes a workable data source management methodology. The new model can effectively organize and use complex data sources, even when they are distributed in different sites in a fragmented form. At the same time, the new model provides a very high level of intellectual management decision-support by intelligent use of the data collections through utilizing new smart methods in synthesizing useful knowledge. The results of an empirical study to evaluate the model are provided.
文摘An integrated solution for discovery of literature information knowledge is proposed. The analytic model of literature Information model and discovery of literature information knowledge are illustrated. Practical illustrative example for discovery of literature information knowledge is given.
文摘It is common in industrial construction projects for data to be collected and discarded without being analyzed to extract useful knowledge. A proposed integrated methodology based on a five-step Knowledge Discovery in Data (KDD) model was developed to address this issue. The framework transfers existing multidimensional historical data from completed projects into useful knowledge for future projects. The model starts by understanding the problem domain, industrial construction projects. The second step is analyzing the problem data and its multiple dimensions. The target dataset is the labour resources data generated while managing industrial construction projects. The next step is developing the data collection model and prototype data ware-house. The data warehouse stores collected data in a ready-for-mining format and produces dynamic On Line Analytical Processing (OLAP) reports and graphs. Data was collected from a large western-Canadian structural steel fabricator to prove the applicability of the developed methodology. The proposed framework was applied to three different case studies to validate the applicability of the developed framework to real projects data.
文摘A database stores data in order to provide the user with information. However, how a database may achieve this is not always clear. The main reason for this seems that we, who are in the database community, have not fully understood and therefore clearly defined the notion of “the information that data in a database carry”, in other words, “the information content of data”. As a result, databases’ capability is limited in terms of answering queries, especially, when users explore information beyond the scope of data stored in a database, the database normally cannot provide it. The underlying reason of the problem is that queries are answered based on a direct match between a query and data (up to aggregations of the data). We observe that this is because the information that data carry is seen as exactly the data per se. To tackle this problem, we propose the notion of information content inclusion relation, and show that it formulates the intuitive notion of the “information content of data” and then show how this notion may be used for the derivation of information from data in a database.
文摘Structural choice is a significant decision having an important influence on structural function, social economics, structural reliability and construction cost. A Case Based Reasoning system with its retrieval part constructed with a KDD subsystem, is put forward to make a decision for a large scale engineering project. A typical CBR system consists of four parts: case representation, case retriever, evaluation, and adaptation. A case library is a set of parameterized excellent and successful structures. For a structural choice, the key point is that the system must be able to detect the pattern classes hidden in the case library and classify the input parameters into classes properly. That is done by using the KDD Data Mining algorithm based on Self Organizing Feature Maps (SOFM), which makes the whole system more adaptive, self organizing, self learning and open.
文摘This paper elaborate the emergence of human information database and the important role it plays in the various industries of economic development. It also interpret the primary human information database of current domestic and abroad and analysis it's classification characteristic, Besides, this papers further explains how to make use of human information database and how to make the database to play its due value. In the end, the prospect of our country's body information database has been set forth, using relatively mature foreign database to improve Chinese body information database.
文摘在KDD(knowledge discovery in database)中,对所发现的知识进行评价是一个很重要的环节.提出了一种针对KDD中因果关联规则的自动评价方法.该评价方法采用了全新的、有效的知识表示方法(语言场和语言值结构)和推理机制(因果关系定性推理机制),并且具有通用性和交互性的特征.给出了此评价方法的理论依据和构造过程,并提供了相应的算法.通过对具体实例的运行检验,证明了此评价方法的有效性.通过与相关工作的比较,证明了其先进性.