Abstract
The original transaction matrix is reduced by compressing identical transactions into a single row. Multiplying the transaction matrix by the candidate itemset-item matrix yields the candidate itemset-transaction matrix, from which the support count of each candidate itemset is obtained. In the join step, pruning is applied in advance to reduce the number of frequent itemsets that participate in the join. An improved Apriori algorithm based on a Boolean mapping matrix is designed and implemented, and it is applied to the mining and analysis of hospital diagnosis and treatment data. The experimental results show that the risk factors for gestational diabetes mellitus mined by the algorithm are: age ≥ 35 years, body mass index (BMI) ≥ 30, number of pregnancies ≥ 3, number of induced labors ≥ 3, and number of births ≥ 3.
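The matrix-based support counting described in the abstract can be illustrated with a minimal sketch. It is not the author's implementation: the function names `compress` and `support_counts`, the toy data, and the containment rule (a transaction contains a candidate when their Boolean dot product equals the candidate's size) are assumptions made for illustration. Identical transactions are merged into one row that carries a weight, so each distinct row is tested only once.

```python
from collections import Counter

def compress(transactions, items):
    """Map each transaction to a Boolean row and merge identical rows.

    Returns the distinct rows and, for each, how many transactions it stands for.
    """
    rows = Counter(
        tuple(1 if item in t else 0 for item in items) for t in transactions
    )
    return list(rows.keys()), list(rows.values())

def support_counts(rows, weights, candidates, items):
    """Support count of each candidate itemset via Boolean dot products.

    Assumption: a row contains a candidate when the dot product of the two
    Boolean vectors equals the number of items in the candidate.
    """
    counts = []
    for cand in candidates:
        cand_row = [1 if item in cand else 0 for item in items]
        total = 0
        for row, weight in zip(rows, weights):
            dot = sum(a * b for a, b in zip(cand_row, row))
            if dot == len(cand):
                total += weight  # a compressed row counts once per merged transaction
        counts.append(total)
    return counts

# Hypothetical toy data, not taken from the paper.
items = ["a", "b", "c"]
transactions = [{"a", "b"}, {"a", "b"}, {"a", "c"}, {"a", "b", "c"}, {"b"}]
candidates = [frozenset({"a", "b"}), frozenset({"a", "c"}), frozenset({"b", "c"})]

rows, weights = compress(transactions, items)
counts = support_counts(rows, weights, candidates, items)
print(counts)
```

Here the five transactions collapse to four distinct rows, and the two identical `{"a", "b"}` transactions are counted through a weight of 2 rather than by scanning the row twice.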
Author
黄嘉欣
Huang Jiaxin (School of Computer, Electronics and Information, Guangxi University, Nanning 530004; Department of Information Engineering, Liuzhou Maternity and Child Healthcare Hospital, Liuzhou 545001)
Source
《现代计算机》
2021, No. 25, pp. 14-19 (6 pages)
Modern Computer