Abstract
In recent years, user portraits have been widely applied as an effective big data tool in Internet industries such as e-commerce and social networks. For traditional enterprises, however, the data usually has few dimensions and is scattered across multiple information systems, which makes it difficult to obtain accurate results with general-purpose methods. To address this problem, the article proposes a user portrait method based on an optimized K-means clustering algorithm: it uses the K-means++ initial cluster center optimization to improve clustering accuracy and the Mini Batch K-means small-batch optimization to improve convergence speed, combining the two strongly complementary techniques to improve the algorithm's analysis and processing capability. Experimental results on enterprise data and public data sets show that, compared with the classic K-means algorithm, the method improves speed and accuracy by about 150 times and 20%, respectively.
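The combination described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`kmeanspp_init`, `minibatch_kmeans`), the parameters, and the per-center learning rate of 1/count are assumptions chosen for illustration, using only the Python standard library. K-means++ seeding picks well-spread initial centers, and the mini-batch loop then refines them from small random samples instead of the full data set.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(p, centers):
    """Index of the center closest to point p."""
    return min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))

def kmeanspp_init(points, k, rng):
    """K-means++ seeding: each new center is drawn with probability
    proportional to its squared distance from the nearest chosen center."""
    centers = [list(rng.choice(points))]
    while len(centers) < k:
        d2 = [min(sq_dist(p, c) for c in centers) for p in points]
        if sum(d2) == 0:
            # Every point coincides with a center; fall back to uniform choice.
            centers.append(list(rng.choice(points)))
        else:
            centers.append(list(rng.choices(points, weights=d2, k=1)[0]))
    return centers

def minibatch_kmeans(points, k, batch_size=32, n_iter=100, seed=0):
    """Mini-batch K-means started from K-means++ centers (illustrative only)."""
    rng = random.Random(seed)
    centers = kmeanspp_init(points, k, rng)
    counts = [0] * k  # number of points each center has absorbed so far
    for _ in range(n_iter):
        batch = rng.sample(points, min(batch_size, len(points)))
        # Assign the whole batch first, then update the centers.
        assigned = [nearest(p, centers) for p in batch]
        for p, j in zip(batch, assigned):
            counts[j] += 1
            eta = 1.0 / counts[j]  # per-center learning rate (assumed)
            centers[j] = [(1 - eta) * c + eta * x
                          for c, x in zip(centers[j], p)]
    return centers
```

Calling `minibatch_kmeans(points, k=2)` on a list of numeric tuples returns `k` center coordinates. Each iteration touches only `batch_size` points, which is where the speed gain over full-batch K-means would come from, while the seeding step targets the accuracy side.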
Source
《科技创新与应用》
2022, Issue 18, pp. 18-21 (4 pages)
Technology Innovation and Application