Funding: Supported by the National Natural Science Foundation of China (60473023) and the National Innovation Foundation for Small Technology-Based Firms (04C26214201280).
Abstract: This paper addresses the problem of improving backup and recovery performance by compressing redundancies in large disk-based backup systems. We analyze several general compression algorithms and evaluate their scalability and applicability. We investigate how redundant data is distributed across the whole system and propose a multi-resolution distributed compression algorithm that can detect duplicated data at file-level, block-level, or byte-level granularity to reduce redundancy in the backup environment. To accelerate recovery, we propose a synthetic backup solution that stores data in a recovery-oriented way and can compose the final data on the back-end backup server. Experiments show that this algorithm can greatly reduce bandwidth consumption, save storage cost, and shorten backup and recovery time. We implemented these technologies in our product, the H-info backup system, which achieves a compression ratio of over 10x in both network utilization and data storage during backup.
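As an illustration of duplicate detection at more than one granularity, the following is a minimal sketch and not the H-info implementation. It assumes SHA-256 digests and a fixed 4096-byte block size, neither of which is stated in the abstract, and it covers only the file-level and block-level checks; byte-level detection is omitted. The class and method names are hypothetical.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed fixed block size; the abstract does not specify one


class DedupStore:
    """Toy store that checks for duplicates at file level, then at block level."""

    def __init__(self):
        self.files = {}   # file digest -> ordered list of block digests
        self.blocks = {}  # block digest -> block bytes

    def put(self, data: bytes):
        """Store data; return (file digest, number of new bytes actually stored)."""
        file_digest = hashlib.sha256(data).hexdigest()
        if file_digest in self.files:
            # File-level hit: the whole file is already stored.
            return file_digest, 0
        stored = 0
        recipe = []
        for offset in range(0, len(data), BLOCK_SIZE):
            block = data[offset:offset + BLOCK_SIZE]
            block_digest = hashlib.sha256(block).hexdigest()
            if block_digest not in self.blocks:
                # Block-level miss: keep the new block.
                self.blocks[block_digest] = block
                stored += len(block)
            recipe.append(block_digest)
        self.files[file_digest] = recipe
        return file_digest, stored

    def get(self, file_digest: str) -> bytes:
        """Rebuild a file from its block recipe."""
        return b"".join(self.blocks[d] for d in self.files[file_digest])
```

In this sketch, storing the same file twice adds no new bytes, and storing a file that shares some blocks with an earlier one adds only the blocks that differ.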
Funding: Supported by the National Natural Science Foundation of China (60473023) and the National Innovation Foundation for Small Technology-Based Firms (04C26214201280).
Abstract: This paper describes a method for building a hot snapshot copy based on the Windows file system (HSCF). The architecture and running mechanism of HSCF are discussed after a comparison with other online backup technologies. HSCF, based on a file system filter driver, protects computer data and ensures its integrity and consistency in three steps: access to open files, synchronization, and copy-on-write. Its strategies for improving system performance are analyzed, including priority setting, incremental snapshots, and load balancing. HSCF is a new kind of snapshot technology that solves the data integrity and consistency problem in online backup, and it differs from storage-level snapshots and Open File Solution.
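The copy-on-write step can be illustrated with a minimal in-memory sketch. This is a generic copy-on-write snapshot under simplifying assumptions (a volume modeled as a list of blocks, one snapshot at a time), not the HSCF filter driver itself; the class and method names are hypothetical.

```python
class CowVolume:
    """Toy block volume with a single copy-on-write snapshot."""

    def __init__(self, blocks):
        self.blocks = list(blocks)  # live data
        self.saved = None           # block index -> contents at snapshot time

    def take_snapshot(self):
        """Start a snapshot; no data is copied until a block is overwritten."""
        self.saved = {}

    def write(self, index, data):
        """Overwrite a block, preserving its snapshot-time contents first."""
        if self.saved is not None and index not in self.saved:
            self.saved[index] = self.blocks[index]
        self.blocks[index] = data

    def read_snapshot(self, index):
        """Read a block as it was when the snapshot was taken."""
        if self.saved is None:
            raise RuntimeError("no snapshot has been taken")
        return self.saved.get(index, self.blocks[index])
```

A backup process reading through `read_snapshot` sees a consistent point-in-time view even while `write` continues to change the live blocks.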