Abstract
To make effective use of the parallel processing capabilities of the Hadoop framework on cloud platforms, this paper analyzes and improves Apriori, a traditional association rule algorithm in big data mining, and proposes an improved data mining algorithm based on the MapReduce parallel model that is suited to the analysis and application of medical big data. First, a Boolean arrangement is used to optimize how transaction data are stored in the database, which effectively reduces the number of database scans. Then, association rule optimization is used to reduce the redundant subsets generated by the Apriori algorithm. To verify the effectiveness of the improved algorithm, experiments are carried out on historical medical data. The simulation results show that, compared with the traditional Apriori algorithm, the proposed algorithm runs more efficiently and offers good reliability and validity.
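The two optimizations named in the abstract can be illustrated with a minimal single-machine sketch. It is not the paper's implementation: reading the "Boolean arrangement" as one bit column per item is an assumption, the function names and sample data are hypothetical, and the MapReduce distribution step is not shown.

```python
from itertools import combinations


def build_bit_columns(transactions):
    """Scan the transactions once; map each item to an int whose bit i is set
    when transaction i contains that item (a Boolean column stored as bits)."""
    columns = {}
    for row, items in enumerate(transactions):
        for item in set(items):
            columns[item] = columns.get(item, 0) | (1 << row)
    return columns


def support(itemset, columns, n_rows):
    """Count transactions containing every item: AND the columns, count set bits."""
    mask = (1 << n_rows) - 1
    for item in itemset:
        mask &= columns[item]
    return bin(mask).count("1")


def frequent_itemsets(transactions, min_support):
    """Level-wise Apriori over the bit columns; the raw data is read only once."""
    columns = build_bit_columns(transactions)
    n_rows = len(transactions)
    current = [frozenset([item]) for item in sorted(columns)
               if support([item], columns, n_rows) >= min_support]
    result = {}
    k = 1
    while current:
        for itemset in current:
            result[itemset] = support(itemset, columns, n_rows)
        current_set = set(current)
        candidates = set()
        for a, b in combinations(current, 2):
            union = a | b
            # Keep a (k+1)-candidate only if all of its k-subsets are frequent,
            # which discards redundant candidates before any counting.
            if len(union) == k + 1 and all(
                frozenset(sub) in current_set for sub in combinations(union, k)
            ):
                candidates.add(union)
        current = [c for c in candidates
                   if support(c, columns, n_rows) >= min_support]
        k += 1
    return result
```

Calling `frequent_itemsets(records, min_support)` on a list of item lists returns a dictionary from each frequent itemset to its support count. Because every count comes from bitwise ANDs over the columns, the original transactions are scanned only while the columns are built.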
Author
姜广坤
Guang-kun JIANG (Dalian Ocean University, Dalian 116300, China)
Source
《机床与液压》
Peking University Core Journal (北大核心)
2018, No. 18, pp. 163-168 (6 pages)
Machine Tool & Hydraulics
Funding
Liaoning Province Education Department Scientific Planning Project(JG15EB019)
Liaoning Province Higher Education Academic Research Project of the 12th Five-Year Plan(GHJT201502107)