Data-mining techniques have been developed to turn data into useful, task-oriented knowledge. Most algorithms for mining association rules identify relationships among transactions using binary values and find rules at a single concept level. Extracting multilevel association rules from transaction databases is a common data-mining task. This paper proposes a multilevel fuzzy association rule mining model for extracting implicit knowledge that is stored as quantitative values in transactions. To this end, it uses a different support value at each level and a different membership function for each item. By integrating fuzzy-set concepts, data-mining technologies and a multiple-level taxonomy, the method finds fuzzy association rules in transaction data sets. It adopts a top-down, progressively deepening approach to derive large itemsets and uses fuzzy boundaries instead of sharp boundary intervals. A comparison with previous methods in simulation shows that the proposed method maintains higher precision, that the mined rules are closer to reality, and that it can mine association rules at different levels according to the user's preference.
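The idea of a per-item membership function combined with a per-level minimum support can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the two-level taxonomy, the transactions, the triangular membership parameters, and the thresholds in `MIN_SUPPORT`. Fuzzy support is computed as the mean membership degree over all transactions, which is one common definition.

```python
def triangular(x, a, b, c):
    """Triangular membership function: 0 at a and c, 1 at the peak b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

# Hypothetical taxonomy (item -> category) and quantitative transactions.
TAXONOMY = {"apple": "fruit", "pear": "fruit", "milk": "dairy"}
TRANSACTIONS = [
    {"apple": 4, "milk": 2},
    {"pear": 6, "milk": 1},
    {"apple": 2},
    {"milk": 5},
]
# Hypothetical minimum support per taxonomy level (level 1 = categories).
MIN_SUPPORT = {1: 0.3, 2: 0.2}

def roll_up(transaction):
    """Aggregate item quantities to their parent category."""
    totals = {}
    for item, qty in transaction.items():
        category = TAXONOMY[item]
        totals[category] = totals.get(category, 0) + qty
    return totals

def fuzzy_support(transactions, name, params):
    """Mean membership degree of `name` over all transactions."""
    total = sum(triangular(t.get(name, 0), *params) for t in transactions)
    return total / len(transactions)

# Top-down, progressively deepening: descend to the item level only
# when the parent category is frequent at its own level.
level1 = [roll_up(t) for t in TRANSACTIONS]
fruit_high = fuzzy_support(level1, "fruit", (2, 6, 10))
apple_medium = None
if fruit_high >= MIN_SUPPORT[1]:
    apple_medium = fuzzy_support(TRANSACTIONS, "apple", (1, 3, 5))
```

With these toy values the category "fruit" passes the level-1 threshold, so the item "apple" is examined against the lower level-2 threshold.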
The distribution of the sampling data affects the completeness of the rule base, so extrapolating missing rules is very difficult. Based on data mining, a self-learning method is developed for identifying a fuzzy model and extrapolating missing rules by means of a confidence measure and an improved gradient descent method. The proposed approach can not only identify the fuzzy model, update its parameters and determine the optimal output fuzzy sets simultaneously, but also resolve the uncontrollability problem caused by regions that the data do not cover. Simulation results on the classical truck backer-upper control problem verify the effectiveness and accuracy of the proposed approach.
The fight against fraud and trafficking is a fundamental mission of customs. The conditions for carrying out this mission depend both on the evolution of economic issues and on the behaviour of the actors in charge of its implementation. As part of the customs clearance process, customs are nowadays confronted with an increasing volume of goods as international trade develops. Automated risk management is therefore required to limit intrusive control. In this article, we propose an unsupervised classification method to extract knowledge rules from a database of customs offences in order to identify abnormal behaviour resulting from customs control. The idea is to apply the Apriori principle, based on frequent patterns, to a database of customs offences in customs procedures, to uncover potential association rules between a customs operation and an offence, and so to extract knowledge governing the occurrence of fraud. This mass of often heterogeneous and complex data generates new needs that knowledge extraction methods must be able to meet. The assessment of infringements inevitably requires a proper identification of the risks. The approach is an original one based on data mining that builds association rules in two steps: first, search for frequent patterns (support >= minimum support); then, from the frequent patterns, produce association rules (confidence >= minimum confidence). The simulations carried out highlighted three main kinds of association rule: forecasting rules, targeting rules and neutral rules, with the Lift measure introduced as a third indicator of rule relevance. The confidence for the first two kinds of rule was set to at least 50%.
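The two-step procedure described above (frequent patterns first, then rules filtered by confidence, with lift as a third indicator) can be sketched as follows. This is a small illustration under stated assumptions, not the authors' implementation: the records in `RECORDS` are invented, the item names are hypothetical, and encoding "no offence" as the item `"none"` is a choice made only for this example.

```python
from itertools import combinations

# Invented customs records: each is a set of attribute values.
RECORDS = [
    {"import", "textiles", "undervaluation"},
    {"import", "textiles", "undervaluation"},
    {"import", "electronics", "none"},
    {"export", "textiles", "none"},
    {"import", "textiles", "misclassification"},
]

def support(itemset, records):
    """Fraction of records that contain every item of `itemset`."""
    return sum(itemset <= r for r in records) / len(records)

def frequent_itemsets(records, min_support):
    """Step 1: level-wise search for itemsets with support >= min_support."""
    frequent = {}
    size = 1
    candidates = [frozenset([i]) for i in sorted(set().union(*records))]
    while candidates:
        level = {c: support(c, records) for c in candidates}
        level = {c: s for c, s in level.items() if s >= min_support}
        frequent.update(level)
        size += 1
        survivors = sorted(set().union(*level))
        # Apriori principle: keep a candidate only if all of its
        # (size - 1)-subsets were frequent at the previous level.
        candidates = [
            frozenset(c)
            for c in combinations(survivors, size)
            if all(frozenset(s) in level for s in combinations(c, size - 1))
        ]
    return frequent

def association_rules(frequent, min_confidence):
    """Step 2: emit (lhs, rhs, support, confidence, lift) tuples."""
    rules = []
    for itemset, supp in frequent.items():
        for k in range(1, len(itemset)):
            for lhs in map(frozenset, combinations(sorted(itemset), k)):
                rhs = itemset - lhs
                confidence = supp / frequent[lhs]
                if confidence >= min_confidence:
                    lift = confidence / frequent[rhs]
                    rules.append((lhs, rhs, supp, confidence, lift))
    return rules

frequent = frequent_itemsets(RECORDS, min_support=0.4)
rules = association_rules(frequent, min_confidence=0.5)
```

A lift above 1 indicates that the left-hand side makes the right-hand side more likely than its base rate, a lift near 1 indicates independence, and a lift below 1 indicates a negative association.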
The amount of data available for decision making has increased tremendously in the age of the digital economy. Decision makers who fail to handle the data proficiently may make incorrect decisions and thereby harm their business. The task of extracting and classifying useful information efficiently and effectively from huge amounts of data is therefore of special importance. In this paper, we consider that the attributes of the data may be either crisp or fuzzy. By examining suitable partial data, segments with different classes are formed; a multithreaded computation is then performed to generate crisp rules where possible; finally, a fuzzy partition technique is employed to handle the fuzzy attributes for classification. The rules generated in classifying the overall data can be used to gain more knowledge from the data collected.
Quantitative attributes are partitioned into several fuzzy sets using the fuzzy c-means algorithm. Fuzzy c-means can reflect the actual distribution of the data, and fuzzy sets soften the partition boundaries. We then improve the search technique of the Apriori algorithm and present an algorithm for mining fuzzy association rules. As databases grow larger and larger, a better approach is to mine fuzzy association rules in parallel. In the parallel mining algorithm, quantitative attributes are partitioned into several fuzzy sets using a parallel fuzzy c-means algorithm. A Boolean parallel algorithm is improved to discover frequent fuzzy attribute sets, and the fuzzy association rules that reach at least the minimum confidence are generated on all processors. Experiments on a distributed network of linked PCs and workstations show that the parallel mining algorithm has good scaleup, sizeup and speedup. Finally, we discuss the application of fuzzy association rules to classification. The example shows that a classification system based on fuzzy association rules is more accurate than two popular classification methods, C4.5 and CBA.
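How fuzzy c-means turns a quantitative attribute into fuzzy sets with soft boundaries can be shown with a minimal one-dimensional sketch. It is a plain sequential version of the standard algorithm, not the parallel variant from the paper; the sample values, the initial centers and the fuzzifier `m = 2` are assumptions chosen for illustration, and the sketch assumes the initial centers are distinct.

```python
def memberships(x, centers, m=2.0):
    """Membership degrees of value x in each cluster (they sum to 1)."""
    distances = [abs(x - c) for c in centers]
    if 0.0 in distances:
        # x coincides with a center: full membership there.
        hits = [1.0 if d == 0.0 else 0.0 for d in distances]
        return [h / sum(hits) for h in hits]
    inverse = [d ** (-2.0 / (m - 1.0)) for d in distances]
    total = sum(inverse)
    return [v / total for v in inverse]

def fuzzy_c_means_1d(values, centers, m=2.0, iterations=100, tol=1e-9):
    """Alternate membership and center updates until centers stabilise."""
    centers = list(centers)
    for _ in range(iterations):
        u = [memberships(x, centers, m) for x in values]
        updated = []
        for j in range(len(centers)):
            weights = [row[j] ** m for row in u]
            weighted_sum = sum(w * x for w, x in zip(weights, values))
            updated.append(weighted_sum / sum(weights))
        shift = max(abs(a - b) for a, b in zip(updated, centers))
        centers = updated
        if shift < tol:
            break
    return centers, [memberships(x, centers, m) for x in values]

# Hypothetical attribute values (for example, purchase quantities).
VALUES = [1.0, 2.0, 3.0, 20.0, 21.0, 22.0]
centers, degrees = fuzzy_c_means_1d(VALUES, centers=[0.0, 25.0])
```

Each row of `degrees` gives the degree to which one value belongs to the "low" and "high" fuzzy sets; these degrees can then replace the 0/1 item indicators when computing the support of a fuzzy attribute set.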
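Fuzzy classification association rules (FCARs) are a special kind of fuzzy association rule, and mining them is essential for building rule-based classification models. When traditional association rule mining algorithms are used to mine FCARs, the result may contain many redundant rules, and when the classes in the data set are imbalanced, the number of rules mined for minority classes drops sharply, possibly to zero. To address these problems, an FCAR mining algorithm, FSFCS Based FCAR-Miner (Feature Selection and Fuzzy Category Support-Fuzzy Lift Framework Based FCAR-Miner), is proposed. It builds on feature selection and the Fuzzy Category Support-Fuzzy Lift Framework (FCS-FLF) and mines FCARs iteratively from the fuzzy membership matrix. Experimental results on several class-imbalanced data sets show that, compared with other algorithms, FSFCS Based FCAR-Miner avoids generating large numbers of redundant rules and also adapts to class imbalance, so that the numbers of rules for the different classes do not differ drastically.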
Funding: This project was supported by the State Science & Technology Pursuing Project (2001BA204B01) of China and the Foundation for University Key Teacher by the Ministry of Education of China.
Funding: Supported by the National Key Basic Research Program 973 (2002CB312000), the National Natural Science Funds for Distinguished Young Scholar (60425206), and the Advanced Armament Research Project (51406020105JB8103).