Unified identity authentication has become a basic information service that colleges and universities provide for teachers and students. Security, stability, high concurrency, and easy maintenance are our requirements for a unified identity authentication system. Based on practical work experience at China University of Geosciences (Beijing), this paper proposes a high-availability scheme for a CAS-based unified identity authentication system, composed of multiple CAS servers, Nginx for load balancing, and Redis as a cache database. The scheme has been put into practice at China University of Geosciences (Beijing) with good results, and it offers a practical reference for other universities.
High availability is critical for business systems. First, an instance of a business system for pharmacies, OPENSTOCK, is introduced, including both the client and server sides. Second, a solution for the high availability of this system is given in detail, including its design and implementation. The essentials of this solution are the scope of system information, the system parameter tables of service status, the scheduling strategies for load balancing, and the methods for acquiring system parameters and detecting service states. The proposed solution is scalable and application oriented, and it supports load balancing for high performance and fault tolerance for high reliability. The application system has been deployed and verified in practice, and the features described in this paper have been achieved.
Despite the rapid evolution in all aspects of computer technology, both computer hardware and software are prone to numerous failure conditions. In this paper, we analyze the characteristics of a computer system and the methods of constructing a system, and propose a communication link management model supporting high availability for network applications, which greatly increases the availability of those applications. We then elaborate on the heartbeat or service detection, fail-over, service take-over, switchback, and error recovery processes of the model. In constructing the communication link, we implement link management and service take-over under high availability requirements, discuss the states and state transitions involved in building the communication link between hosts, and describe the message transfer and the starting of timers. Finally, we apply the designed high availability system to a network billing system and show how the system was constructed and implemented; it satisfied the system requirements. Key words: high availability; service take-over; fail-over; communication link management. CLC number: TP 393. Foundation item: Supported by the National Natural Science Foundation of China (60132030). Biography: Luo Juan (1974-), female, Ph.D. candidate, research direction: high speed network and information security.
With the development of high-speed railways in China, more than 2000 high-speed trains will be put into use. The safety and efficiency of railway transportation are increasingly important. We have designed a high availability quadruple vital computer (HAQVC) system based on an analysis of the architectures of the traditional double 2-out-of-2 system and the 2-out-of-3 system. The HAQVC system offers high availability and safety, with prominent characteristics such as a brand-new internal architecture, high efficiency, a reliable data interaction mechanism, and an operation state change mechanism. The hardware of the vital CPU is based on ARM7 with the real-time embedded safe operating system (ES-OS). A Markov modeling method is designed to evaluate the reliability, availability, maintainability, and safety (RAMS) of the system. In this paper, we demonstrate that the HAQVC system is more reliable than the all voting triple modular redundancy (AVTMR) system and the double 2-out-of-2 system. Thus, the design can be used for a specific application system, such as an airplane or high-speed railway system.
We conceptualize bioresource upgrading for sustainable energy, environment, and biomedicine, with a focus on circular economy, sustainability, and carbon neutrality using high availability and low utilization biomass (HALUB). We highlight energy-efficient technologies for sustainable energy and material recovery and applications. The technologies of thermochemical conversion (TC), biochemical conversion (BC), electrochemical conversion (EC), and photochemical conversion (PTC) are summarized for HALUB. Microalgal biomass could contribute to a biofuel HHV of 35.72 MJ kg^-1 and a total benefit of $749 per ton of biomass via TC. The specific surface area of biochar reached 3000 m^2 g^-1 via pyrolytic carbonization of waste bean dregs. Lignocellulosic biomass can be effectively converted into bio-stimulants and biofertilizers via BC with a conversion efficiency of more than 90%. In addition, lignocellulosic biomass can contribute to a current density of 672 mA m^-2 via EC. Bioresources can be synthesized with 100% selectivity via electrocatalysis through EC and PTC. Machine learning, techno-economic analysis, and life cycle analysis are essential to the various upgrading approaches for HALUB. Sustainable biomaterials, sustainable living materials, and technologies for biomedical and multifunctional applications such as nano-catalysis, microfluidics, micro/nanomotors, and beyond are also highlighted. New techniques and systems for the complete conversion and utilization of HALUB for new energy and materials are further discussed.
Highly security-critical systems should provide continuous service. We present a new Robust Disaster Recovery System Model (RDRSM). By strengthening safe communications, RDRSM guarantees secure and reliable command of disaster recovery. Its self-supervision capability can monitor the integrity and security of the disaster recovery system itself. Through a 2D and 3D real-time visual platform provided by GIS, GPS, and RS, the model makes the use, management, and maintenance of the disaster recovery system easier. RDRSM has the notable features of security, robustness, and controllability, and it can be applied to highly security-critical environments such as e-government and banking. Guided by RDRSM, an important e-government disaster recovery system has been constructed successfully, and the feasibility of the model has been verified in practice. We especially emphasize the significance of some components of the model, such as risk assessment, disaster recovery planning, system supervision, and robust communication support.
A redundant data path (RDP) device driver for Windows 2000 is proposed and implemented to provide automatic, transparent failover and load balancing across multiple SCSI or fibre paths between the hosts and the storage subsystems. The RDP driver is implemented as a filter driver on top of the traditional disk device driver, so it is completely transparent to the upper file system and the lower-level physical device. I/O requests bound for the disk device are routed first to the RDP driver, which then calls the disk driver to perform them. RDP detects path failures and reroutes all subsequent I/O traffic through the surviving paths. I/O requests are distributed across different physical paths to achieve maximum throughput. The multi-layered device driver approach significantly reduces implementation overhead, improves portability, and does not require any changes to the OS or the on-disk data layout. The RDP driver keeps applications running under path fault conditions and improves disk I/O performance.
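The failover and load-balancing behaviour described above can be illustrated with a minimal user-space sketch. It is not the RDP driver's actual logic: the abstract does not say how paths are chosen, so round-robin is an assumption here, and the class name `PathSelector` and its methods are hypothetical.

```python
class PathSelector:
    """Round-robin over healthy paths; failed paths are skipped (illustrative only)."""

    def __init__(self, paths):
        self.paths = list(paths)        # fixed order of all configured paths
        self.healthy = set(self.paths)  # paths currently believed usable
        self._next = 0                  # index of the next path to try

    def mark_failed(self, path):
        # Called when an I/O on this path fails; later I/O avoids it.
        self.healthy.discard(path)

    def mark_recovered(self, path):
        # Called when a previously failed path is usable again.
        if path in self.paths:
            self.healthy.add(path)

    def pick(self):
        # Return the next healthy path in round-robin order.
        if not self.healthy:
            raise RuntimeError("no surviving path")
        for _ in range(len(self.paths)):
            path = self.paths[self._next]
            self._next = (self._next + 1) % len(self.paths)
            if path in self.healthy:
                return path
        raise RuntimeError("no surviving path")


selector = PathSelector(["path_a", "path_b", "path_c"])
selector.mark_failed("path_b")
print([selector.pick() for _ in range(4)])  # path_b never appears
```

Spreading requests over every healthy path gives the load balancing, and dropping a failed path from the healthy set gives the failover; a real driver would also need to retry the in-flight request that exposed the failure.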
Data items are usually replicated in modern distributed data stores to obtain high performance and availability. However, availability-consistency and latency-consistency trade-offs exist in data replication, so system designers tend to choose weak consistency models, such as eventual consistency, which may result in stale reads. Since stale data items may lead to serious application semantic problems, we consider how to increase the probability of data recency, which provides a uniform view of recent versions of data items for all clients. In this work, we propose HARP, a framework that can enhance the data recency of eventually consistent distributed data stores in an efficient and highly available way. By detecting possible stale reads, with or without failures, HARP performs reread operations to eliminate stale results only when needed, based on our analysis of the write/read processes. We also present solutions for dealing with some practical anomalies in HARP, including delayed, reordered, and dropped messages and clock drift, and show how to extend HARP to multiple datacenters. Finally, we implement HARP based on Cassandra, and the experiments show that HARP can effectively eliminate stale reads with a low overhead (less than 6.9%) compared with the original eventually consistent Cassandra.
Currently, accelerator control systems adopt a distributed architecture and are developed with integration tools such as EPICS, TANGO, and SCADA. Digital controllers based on FPGAs and DSPs are widely used in accelerator controls, and the embedded EPICS IOC is a hot topic. On the software side, laboratories have built their own software development environments, and the open source tools Eclipse and Abeans also serve software development. High availability research is a challenge in the control world. The paper describes accelerator controls and the progress of related technologies.
LinuxDirector is a connection director that supports load balancing among multiple Internet servers and can be used to build scalable Internet services based on clusters of servers. LinuxDirector extends the TCP/IP stack of the Linux kernel to support three IP load balancing techniques: VS/NAT, VS/TUN, and VS/DR. Four scheduling algorithms have been implemented to assign connections to different servers. Scalability is achieved by transparently adding or removing a node in the cluster. High availability is provided by detecting node or daemon failures and reconfiguring the system appropriately. This paper describes the design and implementation of LinuxDirector and presents several of its features, including scalability, high availability, and connection affinity.
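The abstract does not name the four scheduling algorithms, so the following is only an illustrative sketch of one common policy for assigning connections, weighted least-connection, and is not taken from LinuxDirector's source. The function name `pick_server` and the dictionary layout are hypothetical.

```python
def pick_server(servers):
    """Pick the server with the lowest active-connections-to-weight ratio.

    `servers` maps a server name to (active_connections, weight).
    A weight of 0 is treated here as "do not schedule" (an assumption).
    """
    candidates = {name: v for name, v in servers.items() if v[1] > 0}
    if not candidates:
        raise ValueError("no schedulable server")
    # A lower conns/weight ratio means the server is less loaded
    # relative to its capacity.
    return min(candidates, key=lambda n: candidates[n][0] / candidates[n][1])


cluster = {"web1": (10, 1), "web2": (15, 2), "web3": (3, 0)}
print(pick_server(cluster))  # web2: 15/2 = 7.5 is lower than 10/1
```

Removing a failed node from the dictionary, or setting its weight to 0, is enough to stop new connections from reaching it, which is the reconfiguration step the abstract attributes to failure detection.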
The Hong Kong Observatory (HKO) provides a low-level turbulence alerting service for the Hong Kong International Airport (HKIA) through the windshear and turbulence warning system (WTWS). In the WTWS, turbulence intensities along the flight paths of the airport are estimated from correlation equations established between surface anemometer data and turbulence data collected by research aircraft before the opening of the airport. The research aircraft data are not available on a day-to-day basis. Remote sensing meteorological instruments, such as Doppler light detection and ranging (LIDAR) and radar, may be used to provide direct measurements of turbulence intensities over the runway corridors. The performance of LIDAR- and radar-based turbulence intensity data is studied in this paper using actual turbulence intensity measurements made on 423 commercial jets for a typical case of terrain-induced turbulence associated with a typhoon. With tuning of the relative operating characteristic (ROC) curve between hit rate and false alarm rate, the LIDAR-based turbulence intensity measurement performs better than the anemometer-based estimation of the WTWS for turbulence intensity at moderate level or above. On the other hand, the radar-based measurement does not perform as well as the WTWS. Combining LIDAR- and radar-based measurements gives performance slightly better than the WTWS, mainly because of the contribution from the LIDAR-based measurement. As a result, the LIDAR-based turbulence intensity measurement could be used to replace the anemometer-based estimate in non-rainy weather conditions. Further enhancements of radar-based turbulence intensity measurement in rain would be necessary.
State machine replication has been widely used in modern cluster-based database systems. Most commonly deployed configurations adopt a Raft-like consensus protocol, in which a single strong leader replicates the log to the followers. Since followers can handle read requests and many real workloads are read-intensive, the recovery speed of a crashed follower may significantly affect throughput. Unlike traditional database recovery, the recovering follower needs to repair its local log first. The original Raft protocol takes many network round trips to compare logs between the leader and the crashed follower. To reduce network round trips, one optimization is to truncate the follower's uncertain log entries behind the latest local commit point and then fetch all committed log entries directly from the leader in one round trip. However, if the commit point is not persisted, the recovering follower has to get the whole log from the leader. In this paper, we propose an accurate and efficient log repair (AELR) algorithm for follower recovery. AELR is more robust and resilient to follower failure, and it needs only one network round trip to fetch the least number of log entries for follower recovery. This approach is implemented in the open source database system OceanBase. We show experimentally that a system adopting AELR performs well in terms of recovery time.
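The truncate-and-fetch optimization that the abstract describes as the baseline (not AELR itself, whose details are not given here) can be sketched as follows. Logs are modelled as plain Python lists, the commit point as a count of entries known to be committed, and the function name `repair_follower_log` is hypothetical.

```python
def repair_follower_log(follower_log, persisted_commit_count, leader_committed_log):
    """Repair a recovering follower's log in a single simulated round trip.

    follower_log: entries found on the follower's disk after the crash.
    persisted_commit_count: number of leading entries known committed,
        or None if the commit point was not persisted.
    leader_committed_log: the leader's committed entries.
    Returns (repaired_log, number_of_entries_fetched).
    """
    # Without a persisted commit point, nothing local can be trusted,
    # so the whole log must be fetched from the leader.
    commit = 0 if persisted_commit_count is None else persisted_commit_count
    kept = follower_log[:commit]            # truncate uncertain entries
    fetched = leader_committed_log[commit:]  # one fetch from the leader
    return kept + fetched, len(fetched)


log, fetched = repair_follower_log(["a", "b", "x", "y"], 2, ["a", "b", "c", "d", "e"])
print(log, fetched)
```

The second case in the abstract corresponds to calling the function with `persisted_commit_count=None`, where every committed entry has to be transferred; reducing that transfer is the gap the abstract says AELR addresses.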
The evolution of computer networks has gone through several major steps, and the research focus has changed at each step, from ARPANET to OSI/RM, then HSN (high speed network) and HPN (high performance network). During this evolution, computer networks, represented by the Internet, have made great progress and gained unprecedented success. However, with the appearance and intensification of tussle, along with the three difficult problems of modern networks (service customizing, resource control, and user management), the traditional Internet and its architecture no longer meet the requirements of the next generation network. The current Internet must therefore evolve into the next generation network. To provide guidance for research on the next generation network, this paper first analyzes some dilemmas facing the current Internet and its architecture, and then surveys recent influential research and progress in computer networks and related areas, including new generation network architecture, network resource control technologies, network management and security, distributed computing and middleware, wireless/mobile networks, new generation network services and applications, and foundational theories of network modeling. Finally, this paper concludes that research on the next generation network should pay more attention to the high availability network and its corresponding architecture, key theories, and supporting technologies.
Modern in-memory databases (IMDBs) can support highly concurrent online transaction processing (OLTP) workloads and generate massive transactional logs per second. Quorum-based replication protocols such as Paxos and Raft have been widely used in distributed databases to offer higher availability and fault tolerance. However, replicating an IMDB is non-trivial because the high transaction rate brings new challenges. First, the leader node in quorum replication should adapt to varying transaction arrival rates and to the processing capability of follower nodes. Second, followers must replay logs to catch up with the state of the leader in a highly concurrent setting in order to reduce the visibility gap. Third, modern databases are often built on a cluster of commodity machines connected by low-end networks, in which network anomalies often happen. In this case, performance is significantly affected because the follower node falls into a long-duration exception handling process (e.g., fetching lost logs from the leader). To this end, we build QuorumX, an efficient and stable quorum-based replication framework for IMDBs under heavy OLTP workloads. QuorumX combines critical-path-based batching and pipeline batching to provide an adaptive log propagation scheme that achieves stable and high performance in various settings. Further, we propose a safe and coordination-free log replay scheme to minimize the visibility gap between the leader and follower IMDBs. We also carefully design the follower node's processing to alleviate the influence of an unreliable network on replication performance. Our evaluation results with YCSB, TPC-C, and a realistic microbenchmark demonstrate that QuorumX achieves performance close to asynchronous primary-backup replication and can always provide a stable service with data consistency and a low-level visibility gap.
Funding (HAQVC system paper): Project No. 2009BAG12A05, supported by the National Key Technology R&D Program of China.
Funding (HALUB paper): support from Harvard/MIT; support funded by the National Research Foundation (NRF), Prime Minister's Office, Singapore, under its Campus for Research Excellence and Technological Enterprise (CREATE) program, Grant Number R-706-001-102-281; and funding support from Harbin Institute of Technology, China, Grant Number FRFCU5710053121.
Funding (RDRSM paper): Supported by the 10th Five Year High-Tech Research and Development Plan of China (2002AA1Z67101).
Funding (HARP paper): This work was supported partly by the National High-tech Research and Development Program (863 Program) of China (2015AA01A202), and partly by the National Natural Science Foundation of China (Grant Nos. 61370057 and 61421003).
Funding: This research was supported in part by the National Key R&D Program of China (2018YFB1003303) and the National Natural Science Foundation of China (Grant Nos. 61432006, 61732014 and 61972149).
Abstract: State machine replication has been widely used in modern cluster-based database systems. Most commonly deployed configurations adopt a Raft-like consensus protocol, in which a single strong leader replicates the log to the followers. Since followers can handle read requests and many real workloads are read-intensive, the recovery speed of a crashed follower may significantly affect throughput. Unlike traditional database recovery, the recovering follower first needs to repair its local log. The original Raft protocol takes many network round trips to compare the logs of the leader and the crashed follower. To reduce network round trips, one optimization is to truncate the follower's uncertain log entries behind the latest local commit point and then fetch all committed log entries directly from the leader in one round trip. However, if the commit point is not persisted, the recovering follower has to get the whole log from the leader. In this paper, we propose an accurate and efficient log repair (AELR) algorithm for follower recovery. AELR is more robust and resilient to follower failure, and it needs only one network round trip to fetch the least number of log entries for follower recovery. This approach is implemented in the open source database system OceanBase. We experimentally show that the system adopting AELR performs well in terms of recovery time.
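The sketch below illustrates only the baseline optimization described above (truncate after the local commit point, then fetch committed entries in one round trip). It is not the AELR algorithm, whose details the abstract does not give. Logs are modeled as plain Python lists, commit indices as counts of committed entries, and all names are hypothetical.

```python
def repair_follower_log(follower_log, follower_commit_index,
                        leader_log, leader_commit_index):
    """Repair a recovering follower's log with a single fetch from the leader.

    follower_commit_index is the number of entries the follower knows to be
    committed, or None if the commit point was not persisted before the crash.
    Returns (repaired_log, number_of_entries_fetched).
    """
    if follower_commit_index is None:
        # Commit point lost: no local entry can be trusted, refetch everything.
        keep = 0
    else:
        keep = min(follower_commit_index, len(follower_log))
    # Drop the uncertain entries that follow the local commit point.
    trusted = follower_log[:keep]
    # One round trip: fetch every committed entry the follower is missing.
    fetched = leader_log[keep:leader_commit_index]
    return trusted + fetched, len(fetched)
```

The second return value makes the cost visible: with a persisted commit point only the missing suffix is transferred, whereas a lost commit point forces the whole committed log to be sent, which is the weakness the abstract says AELR addresses.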
Funding: Supported in part by the National Natural Science Foundation of China under Grants No. 90604003 and No. 90604004, and by the National Grand Fundamental Research 973 Program of China under Grant No. 2003CB314801.
Abstract: The evolution of computer networks has gone through several major steps, and the research focus has shifted at each step, from ARPANET to OSI/RM, and then to HSN (high speed network) and HPN (high performance network). During this evolution, computer networks, represented by the Internet, have made great progress and gained unprecedented success. However, with the appearance and intensification of tussle, along with the three difficult problems of modern networks (service customizing, resource control and user management), the traditional Internet and its architecture no longer meet the requirements of the next generation network, toward which the current Internet must evolve. To provide useful guidance for research on the next generation network, this paper first analyzes some dilemmas facing the current Internet and its architecture, and then surveys recent influential research and progress in computer networks and related areas, including new generation network architecture, network resource control technologies, network management and security, distributed computing and middleware, wireless/mobile networks, new generation network services and applications, and foundational theories of network modeling. Finally, this paper concludes that research on the next generation network should pay more attention to high availability networks and the corresponding architecture, key theories and supporting technologies.
Funding: This work was partially supported by the National Key R&D Program of China (2018YFB1003404), NSFC (Grant Nos. 61972149, 61977026), and the ECNU Academic Innovation Promotion Program for Excellent Doctoral Students.
Abstract: Modern in-memory databases (IMDBs) can support highly concurrent on-line transaction processing (OLTP) workloads and generate massive transactional logs per second. Quorum-based replication protocols such as Paxos or Raft have been widely used in distributed databases to offer higher availability and fault tolerance. However, replicating an IMDB is non-trivial because the high transaction rate brings new challenges. First, the leader node in quorum replication should adapt to varying transaction arrival rates and to the processing capability of the follower nodes. Second, followers must replay logs to catch up with the state of the leader in a highly concurrent setting in order to reduce the visibility gap. Third, modern databases are often built on a cluster of commodity machines connected by low-configuration networks, in which network anomalies often happen. In this case, performance is significantly affected because the follower node falls into a long-duration exception handling process (e.g., fetching lost logs from the leader). To this end, we build QuorumX, an efficient and stable quorum-based replication framework for IMDBs under heavy OLTP workloads. QuorumX combines critical-path-based batching and pipeline batching to provide an adaptive log propagation scheme that achieves stable, high performance in various settings. Further, we propose a safe and coordination-free log replay scheme to minimize the visibility gap between the leader and follower IMDBs. We also carefully design the follower node's processing to alleviate the influence of the unreliable network on replication performance. Our evaluation with YCSB, TPC-C and a realistic microbenchmark demonstrates that QuorumX achieves performance close to asynchronous primary-backup replication and can always provide a stable service with data consistency and a low visibility gap.