Abstract
Software development produces document data in many formats, such as Word and PDF documents, and these documents are stored in different data sources, such as file systems, MySQL databases, and Git repositories. With multiple data sources, there is as yet no unified way to retrieve both structured and unstructured data across them. This paper proposes a Lucene-based full-text retrieval system for multi-source data that uses XML configuration files to define the index model for each data source, making the system's indexes configurable. The system provides a convenient, unified way to search data from multiple sources, addressing the problem of unified retrieval across multiple data sources.
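As a rough illustration of what an XML-configured index model might look like, the sketch below parses a small configuration into per-source field definitions. The record does not give the paper's actual schema, so every element name, attribute name, and class here (`indexConfig`, `source`, `field`, `stored`, `analyzed`, `SourceModel`, `FieldModel`) is an assumption made for illustration. It uses Python's standard library, not Lucene itself.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field

# Hypothetical configuration: element and attribute names are illustrative
# assumptions, not the schema used in the paper.
CONFIG_XML = """
<indexConfig>
  <source name="docs" type="filesystem">
    <field name="path" stored="true" analyzed="false"/>
    <field name="content" stored="false" analyzed="true"/>
  </source>
  <source name="issues" type="mysql">
    <field name="id" stored="true" analyzed="false"/>
    <field name="title" stored="true" analyzed="true"/>
  </source>
</indexConfig>
"""


@dataclass
class FieldModel:
    """One indexed field: whether its value is stored and/or tokenized."""
    name: str
    stored: bool
    analyzed: bool


@dataclass
class SourceModel:
    """Index model for a single data source."""
    name: str
    type: str
    fields: list = field(default_factory=list)


def _flag(element, attr):
    # Treat a missing attribute as "false".
    return element.get(attr, "false").lower() == "true"


def load_index_models(xml_text):
    """Parse the configuration into one SourceModel per <source> element."""
    root = ET.fromstring(xml_text.strip())
    models = []
    for src in root.findall("source"):
        model = SourceModel(name=src.get("name"), type=src.get("type"))
        for f in src.findall("field"):
            model.fields.append(
                FieldModel(f.get("name"), _flag(f, "stored"), _flag(f, "analyzed"))
            )
        models.append(model)
    return models


if __name__ == "__main__":
    for m in load_index_models(CONFIG_XML):
        print(m.name, m.type, [f.name for f in m.fields])
```

In a design like this, adding a new data source would mean adding a `<source>` entry instead of changing indexing code, which is one plausible reading of the configurability the abstract describes.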
Authors
QIU Min-ming (邱敏明), REN Hong-min (任洪敏), GU Li-jun (顾利军)
College of Information Engineering, Shanghai Maritime University, Shanghai 201306
Source
Modern Computer (《现代计算机》)
2018, No. 15, pp. 88-92 (5 pages)