Abstract
On the basis of NoSQL database theory, NoSQL databases are classified by application scenario into three types: those oriented to high-performance reads and writes, those oriented to documents, and those oriented to distributed computing. The advantages and disadvantages of six representative products of these three types are compared. In light of the big-data processing requirements of the integrated analysis system for railway real-name ticketing information, Cassandra, a NoSQL database oriented to distributed computing, is selected. A technical architecture for the system is proposed on the basis of Cassandra, and inverted indices are designed to construct the query strategy and query process for real-name travel information. Performance tests verify the high availability of NoSQL database technology in big-data query and analysis, indicating that it can overcome the bottlenecks in query performance, scalability, and investment cost that traditional relational databases and data warehouses encounter in practice.
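The abstract does not give the index layout, so the following is only a minimal in-memory sketch of the general inverted-index idea, not the system's actual Cassandra schema. It assumes a primary table keyed by ticket ID and an index keyed by passenger ID; the class `TicketStore`, its methods, and all field names are hypothetical.

```python
from collections import defaultdict


class TicketStore:
    """Toy stand-in for a primary table plus an inverted index (hypothetical)."""

    def __init__(self):
        # Primary table: ticket_id -> ticket record.
        self.tickets = {}
        # Inverted index: passenger_id -> set of ticket_ids.
        self.by_passenger = defaultdict(set)

    def put(self, ticket_id, passenger_id, train, date):
        # If the ticket is being overwritten, drop its stale index entry first.
        old = self.tickets.get(ticket_id)
        if old is not None:
            self.by_passenger[old["passenger_id"]].discard(ticket_id)
        self.tickets[ticket_id] = {
            "ticket_id": ticket_id,
            "passenger_id": passenger_id,
            "train": train,
            "date": date,
        }
        self.by_passenger[passenger_id].add(ticket_id)

    def trips_for(self, passenger_id):
        # Step 1: read the index row. Step 2: fetch each referenced ticket.
        ticket_ids = sorted(self.by_passenger.get(passenger_id, ()))
        return [self.tickets[t] for t in ticket_ids]


store = TicketStore()
store.put("T1", "P1", "G101", "2014-01-01")
store.put("T2", "P2", "D305", "2014-01-02")
store.put("T3", "P1", "G102", "2014-01-05")
trips = store.trips_for("P1")
```

The lookup touches one index row and then only the tickets it references, rather than scanning every ticket. In a distributed store the index would typically be a separate table that the application keeps in step with the primary table on every write.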
Source
China Railway Science (《中国铁道科学》)
EI
CAS
CSCD
Peking University Core Journals (北大核心)
2014, No. 1, pp. 135-141 (7 pages)
Funding
Industry Service Technology Innovation Project of the China Academy of Railway Sciences (1151DZ1003)