Abstract
As manufacturing develops, every industry generates large volumes of engineering data during production, and modern engineering work requires that keyword searches over this data return results quickly and accurately. ElasticSearch can be used to retrieve engineering data, but its performance leaves room for optimization. To address this, the paper studies the underlying principles of ElasticSearch in depth and optimizes three aspects: index creation, index sharding, and index segment merging. First, the ElasticSearch tokenizer is modified and configured with a custom dictionary. Second, an index sharding strategy based on cluster node performance and index data size is proposed. Finally, the timing of index segment merging is tuned according to node performance. Experiments on the retrieval of subway engineering data show that the improved method increases both the write and query performance of ElasticSearch.
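The abstract does not give the details of the sharding strategy, so the following is only a minimal sketch of what a strategy driven by index data size and node performance could look like. The function names, the 30 GB per-shard target, and the numeric node weights are all assumptions made for illustration and are not taken from the paper. The code is planning arithmetic only and makes no Elasticsearch API calls; `number_of_shards` and `number_of_replicas` are the standard index settings keys.

```python
import math


def suggest_shard_count(index_size_gb, target_shard_gb=30.0):
    """Hypothetical heuristic: enough primary shards to keep each near target_shard_gb."""
    if index_size_gb < 0 or target_shard_gb <= 0:
        raise ValueError("sizes must be non-negative and the target must be positive")
    return max(1, math.ceil(index_size_gb / target_shard_gb))


def allocate_shards(shard_count, node_weights):
    """Split shard_count across nodes in proportion to an assumed performance weight.

    Uses the largest-remainder method so the per-node counts always sum to shard_count.
    """
    total = sum(node_weights.values())
    if not node_weights or total <= 0:
        raise ValueError("at least one node with a positive weight is required")
    exact = {node: shard_count * w / total for node, w in node_weights.items()}
    alloc = {node: int(share) for node, share in exact.items()}
    leftover = shard_count - sum(alloc.values())
    # Hand the remaining shards to the nodes with the largest fractional share.
    by_fraction = sorted(exact, key=lambda n: exact[n] - alloc[n], reverse=True)
    for node in by_fraction[:leftover]:
        alloc[node] += 1
    return alloc


def index_settings(shard_count, replicas=1):
    """Build a create-index settings body using standard Elasticsearch keys."""
    return {"settings": {"index": {"number_of_shards": shard_count,
                                   "number_of_replicas": replicas}}}


if __name__ == "__main__":
    shards = suggest_shard_count(100)  # 100 GB index, assumed 30 GB per shard
    plan = allocate_shards(shards, {"node-a": 2, "node-b": 1, "node-c": 1})
    print(shards, plan, index_settings(shards))
```

Because the number of primary shards is fixed when an index is created, a heuristic like this would have to run before index creation, with the data size estimated in advance.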
Authors
XU Xian-hui (许贤慧); WANG Shu-ying (王淑营); ZENG Wen-qu (曾文驱)
School of Computer and Artificial Intelligence, Southwest Jiaotong University, Chengdu 611731, China; Guangzhou Metro Design & Research Institute Co. Ltd., Guangzhou 510010, China
Source
Computer and Modernization (《计算机与现代化》)
2022, No. 2, pp. 79-84, 119 (7 pages in total)
Funding
National Key Research and Development Program of China (2017YFB1201102).