In this paper, we propose a knowledge-based rule management system for data cleaning. This system combines features of both rule-based systems and rule-based data cleaning frameworks. The advantages of our system are threefold. First, it proposes a strong and unified rule form, based on first-order structure, that permits the representation and management of all types of rules and their quality via a set of characteristics. Second, it increases the quality of the rules, which in turn conditions the quality of data cleaning. Third, it uses an appropriate knowledge acquisition process, which is the weakest task in current rule- and knowledge-based systems. Since several research works have shown that data cleaning is driven by domain knowledge rather than by data, we have identified and analyzed the properties that distinguish knowledge and rules from data in order to better determine the main components of the proposed system. To illustrate our system, we also present a first experiment with a case study in the health sector, where we demonstrate how the system improves data quality. The autonomy, extensibility, and platform independence of the proposed rule management system facilitate its incorporation into any system concerned with data quality management.
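To make the abstract's central idea concrete, here is a minimal sketch of a rule-based cleaning step. The rule structure, field names, and repair action are hypothetical illustrations, not the paper's actual unified rule form: each rule pairs a first-order-style condition over a record with a repair.

```python
# Minimal sketch of rule-based data cleaning (hypothetical structure,
# not the paper's actual rule form): each rule pairs a condition over a
# record with a repair action.

def make_rule(name, condition, repair):
    """Bundle a named cleaning rule from a predicate and a repair function."""
    return {"name": name, "condition": condition, "repair": repair}

def apply_rules(record, rules):
    """Apply every rule whose condition holds; return the cleaned record."""
    cleaned = dict(record)
    for rule in rules:
        if rule["condition"](cleaned):
            cleaned = rule["repair"](cleaned)
    return cleaned

# Example rule for a health-sector record: implausible ages become missing.
rules = [
    make_rule(
        "plausible_age",
        condition=lambda r: not (0 <= r.get("age", -1) <= 120),
        repair=lambda r: {**r, "age": None},
    )
]

print(apply_rules({"patient_id": 7, "age": 430}, rules))
# {'patient_id': 7, 'age': None}
```

A real system would of course also track rule quality characteristics and provenance, as the paper describes; this sketch only shows the condition/repair mechanics.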
A Corporate Performance Management (CPM) system is an information system used to collect, analyze, and visualize key performance indicators (KPIs) to support business operations and, especially, strategic decisions. CPM systems display KPIs in the form of scorecards and dashboards so that executives can track and evaluate corporate performance. The quality of the information shown in the KPIs is crucial for executives to make the right decisions. It is therefore important that executives be able to retrieve not only the KPIs but also the quality of those KPIs before using them in strategic decisions. The objectives of this study were to determine the role of the CPM system in organizations, the current state of data and information quality, problems and perspectives regarding data quality, and the data quality maturity stage of the organizations. Survey research was used in this study: a questionnaire was sent to 477 corporations listed on the Stock Exchange of Thailand (SET) in January 2011, and forty-nine questionnaires were returned. The results show that about half of the organizations have implemented CPM systems. Most organizations are confident in the information in their CPM system, but information quality issues are commonly found. Frequent problems regarding information quality are information that is out of date, information not ready by the time of use, inaccuracy, and incompleteness. The quality dimensions of most concern, and the most frequently assessed, were security, accuracy, completeness, and validity. When asked to prioritize, respondents ranked accuracy, timeliness, completeness, security, and validity as the most important dimensions, in that order. In addition, most organizations are concerned about data governance management and have deployed such measures. This study showed that most organizations are at level 4 of Gartner's data governance maturity model, in which data governance is recognized and managed but not yet effective.
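The quality dimensions the respondents ranked highest can be operationalized as simple metrics over KPI records. The following sketch is illustrative only (field names, thresholds, and data are assumptions, not from the survey) and scores a data set on completeness, timeliness, and validity:

```python
from datetime import date

# Illustrative scoring of a KPI data set on three of the quality
# dimensions the survey respondents ranked highest. Field names,
# thresholds, and data are hypothetical.

def completeness(records, required):
    """Share of records with all required fields present and non-null."""
    ok = sum(all(r.get(f) is not None for f in required) for r in records)
    return ok / len(records)

def timeliness(records, as_of, max_age_days):
    """Share of records updated within the allowed window."""
    ok = sum((as_of - r["updated"]).days <= max_age_days for r in records)
    return ok / len(records)

def validity(records, field, predicate):
    """Share of records whose field satisfies a domain rule."""
    ok = sum(predicate(r.get(field)) for r in records)
    return ok / len(records)

records = [
    {"kpi": "revenue", "value": 1200.0, "updated": date(2011, 1, 28)},
    {"kpi": "churn",   "value": None,   "updated": date(2010, 11, 2)},
]
as_of = date(2011, 1, 31)
print(completeness(records, ["kpi", "value"]))        # 0.5
print(timeliness(records, as_of, max_age_days=30))    # 0.5
print(validity(records, "value", lambda v: v is not None and v >= 0))  # 0.5
```

Displaying such scores next to each KPI would let executives see the quality of an indicator before relying on it, as the study recommends.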
A set of closed-loop PDCA (planning-implementing-checking-improvement) normative standards for the quality management system is applied to promote the standardization, normalization, and institutionalization of grass-roots meteorological observations, to further improve the availability and timeliness of data services, and to improve the quality and efficiency of meteorological observations.
We advance here a novel methodology for robust, intelligent biometric information management, with inferences and predictions made using randomness and complexity concepts. Intelligence refers to learning, adaptation, and functionality; robustness refers to the ability to handle incomplete and/or corrupt adversarial information, on one side, and image and/or device variability, on the other. The proposed methodology is model-free and non-parametric. It draws support from discriminative methods using likelihood ratios to link biometrics and forensics at the conceptual level. At the modeling and implementation level, it further links the Bayesian framework, statistical learning theory (SLT) using transduction and semi-supervised learning, and information theory (IT) using mutual information. The key concepts supporting the proposed methodology are (a) local estimation to facilitate learning and prediction using both labeled and unlabeled data; (b) similarity metrics based on regularity of patterns, randomness deficiency, and Kolmogorov complexity (similar to MDL), using strangeness/typicality and ranking of p-values; and (c) the Cover-Hart theorem on the asymptotic performance of k-nearest neighbors approaching the optimal Bayes error. Several topics in biometric inference and prediction are described here using an integrated approach: (1) multi-level and multi-layer data fusion, including quality and multi-modal biometrics; (2) score normalization and revision theory; (3) face selection and tracking; and (4) identity management. The approach includes transduction and boosting for ranking and sequential fusion/aggregation, respectively, on one side, and active learning and change/outlier/intrusion detection realized using information gain and martingales, respectively, on the other. The proposed methodology can be mapped to additional types of information beyond biometrics.
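The strangeness/p-value mechanism the abstract invokes can be sketched in a few lines. This is a generic illustration of transductive (conformal) prediction with the classic k-NN strangeness ratio, not the authors' implementation; the one-dimensional data and k are assumptions:

```python
# Sketch of the strangeness / p-value idea from transductive (conformal)
# prediction: a test example's p-value is the fraction of examples at
# least as "strange" as it is. Strangeness is the classic k-NN ratio of
# same-class to other-class distances. Data and k are illustrative.

def strangeness(x, label, data, k=1):
    """k-NN strangeness: small when x sits close to its own class."""
    same = sorted(abs(x - v) for v, y in data if y == label and v != x)
    other = sorted(abs(x - v) for v, y in data if y != label)
    return sum(same[:k]) / max(sum(other[:k]), 1e-12)

def p_value(test_alpha, calibration_alphas):
    """Rank the test strangeness among the calibration scores."""
    ge = sum(a >= test_alpha for a in calibration_alphas)
    return (ge + 1) / (len(calibration_alphas) + 1)

data = [(1.0, "a"), (1.2, "a"), (5.0, "b"), (5.3, "b")]
alphas = [strangeness(x, y, data) for x, y in data]

# A point near class "a" gets low strangeness, hence a high p-value for "a".
print(p_value(strangeness(1.1, "a", data), alphas))  # 1.0
```

Low p-values, conversely, are exactly the signal the abstract's outlier/intrusion detection relies on: a sample stranger than everything seen so far.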
Quality traceability plays an essential role in assembling and welding offshore platform blocks. Improving the welding quality traceability system is conducive to improving the durability of the offshore platform and the process level of the offshore industry. Currently, quality management remains at the stage of primary informatization, and there is a lack of effective tracking and recording of welding quality data. When welding defects are encountered, it is difficult to rapidly and accurately determine the root cause of the problem from complex and scattered quality data. In this paper, a composite welding quality traceability model for the offshore platform block construction process is proposed. It contains a quality early-warning method based on long short-term memory (LSTM) and a quality data backtracking query optimization algorithm. By training the early-warning model and implementing the query optimization algorithm, the quality traceability model can assist enterprises in rapidly identifying and locating quality problems. Furthermore, the model and the quality traceability algorithm are checked against cases from actual working conditions. The verification analyses suggest that the proposed early-warning model for welding quality and the algorithm for optimizing backtracking queries are effective and can be applied to the actual construction process.
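As a greatly simplified stand-in for the paper's LSTM-based early warning (no neural network here), the sketch below flags a warning when the moving average of a monitored welding parameter drifts outside a tolerance band. The window size, band limits, and readings are all assumed for illustration:

```python
from collections import deque

# Simplified stand-in for an LSTM-based early-warning step: raise a
# warning whenever the windowed mean of a monitored welding parameter
# leaves a tolerance band. Window and limits are illustrative.

def early_warnings(series, window, low, high):
    """Yield (index, mean) whenever the windowed mean leaves [low, high]."""
    buf = deque(maxlen=window)
    for i, value in enumerate(series):
        buf.append(value)
        if len(buf) == window:
            mean = sum(buf) / window
            if not (low <= mean <= high):
                yield i, mean

# Welding current readings (A) drifting upward past tolerance;
# warnings fire at indices 5 and 6.
current = [200, 202, 201, 203, 215, 222, 230]
print(list(early_warnings(current, window=3, low=195, high=210)))
```

An LSTM replaces the fixed threshold with a learned prediction of the next readings, but the surrounding plumbing (stream in, warning out, backtrack on alarm) is the same.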
This paper introduces the basic functions and features of ArcGIS Data Reviewer and its application to quality checking of forestry geographic information vector data. The methods and steps are described in detail with examples, covering spatial-relationship checks such as duplicate and overlapping polygons, gaps between polygons, multi-part features, sliver (narrow) polygons, acute-angle vertices, and omitted features, as well as logical-consistency checks between attribute fields. It can serve as a reference for the use of this software module.
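One of the geometric checks mentioned above, sliver (narrow) polygon detection, can be approximated with the isoperimetric thinness ratio 4πA/P² (1.0 for a circle, near 0 for a long thin polygon). This is a generic sketch, not ArcGIS Data Reviewer's actual algorithm, and the threshold is an assumption:

```python
import math

# Generic sliver-polygon check via the isoperimetric thinness ratio
# 4*pi*A / P^2 (not Data Reviewer's actual algorithm; threshold assumed).

def polygon_area_perimeter(points):
    """Shoelace area and perimeter of a simple polygon (closed ring)."""
    area = perim = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        area += x1 * y2 - x2 * y1
        perim += math.hypot(x2 - x1, y2 - y1)
    return abs(area) / 2.0, perim

def is_sliver(points, threshold=0.1):
    """Flag polygons whose thinness ratio falls below the threshold."""
    area, perim = polygon_area_perimeter(points)
    return 4 * math.pi * area / perim ** 2 < threshold

square = [(0, 0), (10, 0), (10, 10), (0, 10)]
strip  = [(0, 0), (100, 0), (100, 0.5), (0, 0.5)]
print(is_sliver(square), is_sliver(strip))  # False True
```

Attribute-level logical checks, by contrast, are plain predicates over field values, in the spirit of the rule-based cleaning approach described earlier in this listing.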
Funding: Ministry of Industry and Information Technology of the People's Republic of China [Grant No. 2018473].