
Statistical Measures of the Interestingness of Association Patterns in Transaction Databases

Statistical Measure on Interesting of Association Patterns in Retail Database
Abstract: Association pattern mining is an important branch of data mining research; it aims to discover associations or correlations among itemsets. However, the traditional mining approach based on the support-confidence framework has two shortcomings. First, it generates too many patterns, both frequent itemsets and rules. Second, some of the mined rules are uninteresting to the user, useless, or even misleading. It is therefore necessary to prune such useless patterns effectively during mining. This paper introduces chi-squared analysis into the correlation measurement of patterns: using the chi-squared test to measure the correlation among itemsets, and between the antecedent and the consequent of a rule, is an effective pruning method. Analysis of the results shows that adding the chi-squared test on top of the support measure effectively prunes uncorrelated patterns, which reduces the number of frequent itemsets and rules and markedly shrinks the search space of the algorithm.
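The abstract does not give the paper's formulas or thresholds, so the following is only a minimal sketch of how a chi-squared check between a rule's antecedent A and consequent B might look, assuming a standard 2x2 contingency table built from support counts. The function names, the example counts, and the critical value 3.841 (the 0.05 significance level at one degree of freedom) are illustrative assumptions, not details taken from the paper.

```python
def chi_squared_2x2(n, count_a, count_b, count_ab):
    """Pearson chi-squared statistic for items A and B over n transactions.

    count_a, count_b: transactions containing A, B (support counts).
    count_ab: transactions containing both A and B.
    """
    # Observed 2x2 table: rows = A present / absent, columns = B present / absent.
    observed = [
        [count_ab, count_a - count_ab],
        [count_b - count_ab, n - count_a - count_b + count_ab],
    ]
    row_totals = [count_a, n - count_a]
    col_totals = [count_b, n - count_b]

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count if A and B were statistically independent.
            expected = row_totals[i] * col_totals[j] / n
            if expected == 0:
                # Degenerate table (an item occurs in no or in every transaction):
                # treat as no evidence of correlation.
                return 0.0
            chi2 += (observed[i][j] - expected) ** 2 / expected
    return chi2


def is_correlated(n, count_a, count_b, count_ab, critical_value=3.841):
    """Keep a pattern only if its statistic reaches the assumed critical value."""
    return chi_squared_2x2(n, count_a, count_b, count_ab) >= critical_value


# Hypothetical counts, not data from the paper.
print(is_correlated(10000, 6000, 7500, 4000))  # dependent pair, kept
print(is_correlated(100, 50, 50, 25))          # exactly independent pair, pruned
```

Note that the statistic only indicates whether A and B are dependent, not the direction of the dependence. In the first hypothetical pair the observed co-occurrence (4000) is below the expected count under independence (4500), so the pair is negatively correlated even though the statistic is large; a pruning step would likely need to compare observed and expected counts as well.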
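Authors: 徐勇 (Xu Yong), 朱其祥 (Zhu Qixiang)
Source: Modern Computer (《现代计算机》), 2005, No. 11, pp. 21-24 (4 pages)
Funding: Natural Science Research Project of Anhui Provincial Higher Education Institutions (2005KJ305ZC)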

