Abstract
Traditional entity recognition methods mainly target small data sets and focus on the accuracy of the results. To address this limitation in a big data setting, this paper proposes a learning-based entity recognition method built on the Hadoop platform and the MapReduce framework. Based on an analysis of the MapReduce workflow, the method runs a machine-learning-based algorithm and processes the data sets in parallel to identify the data entities. Experimental results show that the method improves the effectiveness of entity recognition, delivers good processing performance, and meets the demand for recognizing entities in massive data sets.
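The abstract does not describe the algorithm itself, so the following is only a rough sketch of how a map/shuffle/reduce pass over records could be organized, not the authors' method. It assumes that entity recognition here means finding records that refer to the same entity. The blocking key (first letter of the name), the string-similarity rule standing in for a trained classifier, the threshold of 0.8, and all function names are hypothetical, and the phases are simulated in plain Python rather than run on Hadoop.

```python
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations


def map_record(record_id, name):
    """Map phase: emit (blocking_key, record).

    The key is the first letter of the name (a hypothetical choice),
    so that only records sharing a key are compared later.
    """
    key = name.strip().lower()[:1]
    return key, (record_id, name)


def shuffle(pairs):
    """Shuffle phase: group mapped values by key, as the framework would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def is_match(a, b, threshold=0.8):
    """Stand-in for a learned classifier (assumption): a similarity threshold."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


def reduce_block(records):
    """Reduce phase: compare all pairs within one block and keep the matches."""
    return [
        (id_a, id_b)
        for (id_a, a), (id_b, b) in combinations(records, 2)
        if is_match(a, b)
    ]


def resolve(records):
    """Run map, shuffle, and reduce in sequence over (id, name) records."""
    groups = shuffle(map_record(rid, name) for rid, name in records)
    matches = []
    for key in sorted(groups):
        matches.extend(reduce_block(groups[key]))
    return matches


records = [(1, "Jon Smith"), (2, "John Smith"), (3, "Jane Doe"), (4, "Mary Jones")]
print(resolve(records))  # prints [(1, 2)]
```

Because each block is reduced independently, the blocks could in principle be processed in parallel on separate nodes, which is the property the abstract relies on for large data sets.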
Source
Journal of Qilu University of Technology (《齐鲁工业大学学报》)
2016, Issue 5, pp. 55-58 (4 pages)
Funding
Shandong Province Science and Technology Development Program (2014GGX101052)