Abstract
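This paper presents an algorithm for synthesizing artificial experimental data. The algorithm uses several probability models to simulate characteristics such as transaction length, potentially strong itemset length, and itemset frequency, mimicking the transaction database of a large supermarket. It generates datasets of varying size and characteristics, which are used to test the running-time curves and scalability of algorithms.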
This paper develops algorithms for synthetic data generation based on the mathematical model presented by the IBM Almaden Research Center. Synthetic data generation is the testing foundation of association rule research. To verify that a new algorithm improves on existing ones in performance and scalability, the authors need to compare existing and new algorithms on datasets that differ in size and in potentially strong itemset size. The research makes this comparison realizable, and the results are relatively satisfactory.
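As a rough illustration of the kind of generator the abstract describes, the sketch below draws transaction lengths and pattern lengths from Poisson distributions and assigns each potentially strong itemset an exponentially distributed weight that controls how often it is picked. These distribution choices, the function names (`poisson`, `make_patterns`, `generate_transactions`), and all default parameters are assumptions made for illustration; the abstract does not state the paper's actual models or parameters.

```python
import math
import random


def poisson(rng, mean):
    """Draw a Poisson-distributed integer using Knuth's multiplication method."""
    limit = math.exp(-mean)
    k, p = 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


def make_patterns(rng, n_items, n_patterns, avg_pattern_len):
    """Build the pool of potentially strong itemsets and their pick weights."""
    patterns = []
    for _ in range(n_patterns):
        # Assumed model: pattern length is Poisson, clamped to [1, n_items].
        size = min(n_items, max(1, poisson(rng, avg_pattern_len)))
        patterns.append(rng.sample(range(n_items), size))
    # Assumed model: pattern frequency follows an exponential weight.
    weights = [rng.expovariate(1.0) for _ in patterns]
    return patterns, weights


def generate_transactions(n_transactions, avg_txn_len, n_items=1000,
                          n_patterns=50, avg_pattern_len=4, seed=0):
    """Generate a synthetic market-basket database as sorted item-id lists."""
    rng = random.Random(seed)
    patterns, weights = make_patterns(rng, n_items, n_patterns, avg_pattern_len)
    transactions = []
    for _ in range(n_transactions):
        # Assumed model: target transaction length is Poisson, at least 1.
        target = min(n_items, max(1, poisson(rng, avg_txn_len)))
        txn = set()
        # Bounded number of picks so the loop ends even if the patterns
        # cannot supply enough distinct items to reach the target.
        for _ in range(4 * target):
            if len(txn) >= target:
                break
            txn.update(rng.choices(patterns, weights=weights, k=1)[0])
        transactions.append(sorted(txn))
    return transactions
```

Varying `n_transactions`, `avg_txn_len`, and `avg_pattern_len` yields datasets of different sizes and characteristics, which is the knob a scalability experiment would turn. A transaction may slightly exceed its target length because whole patterns are added at a time.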
Source
《吉林工业大学自然科学学报》
CSCD
2000, No. 3, pp. 47-50 (4 pages)
Natural Science Journal of Jilin University of Technology
Funding
Supported by the National Natural Science Foundation of China (69873019)