Abstract
To improve the fusion of scientific and technological achievement data, a fusion method based on big data analysis is proposed. Achievement metadata are extracted from HTML web pages and combined into achievement records; the records are then structured and a DOM tree is built to extract the target scientific and technological achievement data. The target data are preprocessed to make better use of storage space. The MapReduce programming model is combined with a Hermite orthogonal basis feedforward neural network to process the preprocessed data in parallel and to classify and fuse them, and multiple reduce functions merge the results of all subsets so that the final fusion result is obtained quickly. Experimental results show that the method preserves the completeness and accuracy of the extracted data and fuses scientific and technological achievement data efficiently.
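The extraction and merge steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the record layout, the use of `<meta name="..." content="...">` tags as the metadata source, the `title` grouping key, and every function name below are assumptions made for illustration. The Hermite orthogonal basis network used for classification is not reproduced; a plain key-based grouping stands in for it, and the map and reduce steps run in-process rather than on a MapReduce cluster.

```python
from functools import reduce
from html.parser import HTMLParser


class MetaExtractor(HTMLParser):
    """Collects <meta name="..." content="..."> pairs from one HTML page."""

    def __init__(self):
        super().__init__()
        self.record = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name, content = attr.get("name"), attr.get("content")
        if name and content:
            self.record[name.lower()] = content.strip()


def extract_record(html):
    """Build one achievement record (a dict) from a page's metadata."""
    parser = MetaExtractor()
    parser.feed(html)
    return parser.record


def map_partition(pages):
    """Map step: extract records from one partition, grouped by title.

    Grouping on a case-folded title is a hypothetical stand-in for the
    paper's neural-network classification.
    """
    partial = {}
    for html in pages:
        record = extract_record(html)
        key = record.get("title", "").casefold()
        if key:
            partial.setdefault(key, {}).update(record)
    return partial


def reduce_partials(left, right):
    """Reduce step: merge two partial results, filling in missing fields."""
    merged = {key: dict(fields) for key, fields in left.items()}
    for key, fields in right.items():
        target = merged.setdefault(key, {})
        for name, value in fields.items():
            target.setdefault(name, value)
    return merged


def fuse(partitions):
    """Run the map step on each partition, then fold the partial results."""
    return reduce(reduce_partials, map(map_partition, partitions), {})
```

Calling `fuse` with a list of partitions, each a list of HTML strings, returns one dictionary keyed by case-folded title. When two pages describe the same achievement, the fields found first are kept and fields missing from the earlier record are filled in from the later one.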
Authors
运晨超
黄毅臣
赵微
薛璐璐
杨亮
YUN Chenchao; HUANG Yichen; ZHAO Wei; XUE Lulu; YANG Liang (Economic Technology Institute, State Grid Jibei Electric Power Company Limited, Beijing 100038, China; Beijing Bowanghuake Technology Co., Ltd., Beijing 100045, China)
Source
《微型电脑应用》
2022, No. 4, pp. 113-116 (4 pages)
Microcomputer Applications
Keywords
big data analysis
scientific and technological achievement
data fusion
neural network
MapReduce
parallel processing