
E-commerce Commodity Entity Recognition Algorithm Based on Big Data
Abstract: Existing e-commerce commodity entity recognition algorithms cannot cope with the multi-source, heterogeneous nature of e-commerce big data. This paper therefore studies a commodity entity recognition algorithm for the big data setting. The algorithm uses the Map-Reduce execution engine of Hadoop, a distributed processing infrastructure, to handle the big data workload. In the Map stage, identical values under the same schema relationship are merged. In the Reduce stage, each input value is compared with the equivalent-value sets formed so far, and equivalent attribute/value nodes are merged, which normalizes the attributes and values. The merged equivalent-value sets are then represented as a graph whose vertices are the distinct entities and whose edges are the similarity relationships between them. An entity partitioning algorithm based on graph clustering contracts commodity entity nodes according to their neighbor information and partitions the graph into entity clusters, each corresponding to one unified entity, which completes the entity recognition. Experimental results show that the algorithm effectively recognizes e-commerce commodity entities in the big data setting, with a recognition accuracy of 99.82% at a data volume of 2000 GB.
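The pipeline described in the abstract can be illustrated with a minimal single-process sketch. It is not the paper's implementation: it runs in plain Python rather than on Hadoop, the `normalize` rule, the `min_shared` threshold, and the sample records are hypothetical, and the final step uses simple connected components (union-find) as a stand-in for the paper's neighbor-contraction graph clustering, whose details the abstract does not give.

```python
from collections import defaultdict


def normalize(value):
    # Hypothetical normalization rule: lowercase and collapse whitespace.
    return " ".join(value.lower().split())


def map_phase(records):
    # Map: emit ((attribute, normalized value), record id) so that identical
    # values under the same attribute end up under the same key.
    for rec_id, attrs in records.items():
        for attr, value in attrs.items():
            yield (attr, normalize(value)), rec_id


def reduce_phase(pairs):
    # Reduce: collect record ids per key, one equivalent-value set per key.
    groups = defaultdict(set)
    for key, rec_id in pairs:
        groups[key].add(rec_id)
    return groups


def build_edges(groups, min_shared=2):
    # Similarity edge between two records that share at least `min_shared`
    # equivalent attribute values (assumed similarity criterion).
    shared = defaultdict(int)
    for ids in groups.values():
        ordered = sorted(ids)
        for i, a in enumerate(ordered):
            for b in ordered[i + 1:]:
                shared[(a, b)] += 1
    return [pair for pair, count in shared.items() if count >= min_shared]


def cluster(nodes, edges):
    # Stand-in clustering: connected components via union-find.
    parent = {n: n for n in nodes}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    clusters = defaultdict(list)
    for n in sorted(nodes):
        clusters[find(n)].append(n)
    return sorted(clusters.values())


# Hypothetical product records from different sources.
records = {
    "r1": {"brand": "Acme", "model": "X100 "},
    "r2": {"brand": "ACME", "model": "x100"},
    "r3": {"brand": "acme", "model": "Z9"},
    "r4": {"brand": "Bolt", "model": "Z9"},
}

groups = reduce_phase(map_phase(records))
edges = build_edges(groups)
entity_clusters = cluster(records.keys(), edges)
print(entity_clusters)  # [['r1', 'r2'], ['r3'], ['r4']]
```

Here `r1` and `r2` agree on both brand and model after normalization, so they fall into one entity cluster, while `r3` and `r4` each share only one value with any other record and stay separate. On a real cluster the map and reduce functions would run as distributed tasks keyed on the same (attribute, value) pairs.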
Author: WANG Yuling (王玉玲) (School of Aeronautical Management Engineering, Xi’an Aeronautical Polytechnic Institute, Xi’an 710089, China)
Source: Microcomputer Applications (《微型电脑应用》), 2021, No. 6, pp. 80-83 (4 pages)
Funding: 2019 Scientific Research Program of Xi’an Aeronautical Polytechnic Institute (19XHSK-005).
Keywords: big data background; e-commerce; commodity entity; recognition algorithm
