Funding: Project (IRT0725) supported by the Changjiang Innovative Group of the Ministry of Education, China.
Abstract: Data deduplication, as a compression method, has been widely used in most backup systems to improve bandwidth and space efficiency. As the volume of data to be backed up explodes, the two main challenges in data deduplication are the CPU-intensive chunking and hashing work and the I/O-intensive disk-index access latency. Since the CPU-intensive work has been vastly parallelized and sped up by multi-core and many-core processors, the I/O latency is likely to become the bottleneck in data deduplication. To alleviate the challenge of I/O latency in multi-core systems, a multi-threaded deduplication (Multi-Dedup) architecture was proposed. The main idea of Multi-Dedup is to use parallel deduplication threads to hide the I/O latency. A prefix-based concurrent index was designed to maintain the internal consistency of the deduplication index with low synchronization overhead. In addition, a collisionless cache array was designed to preserve locality and similarity within the parallel threads. In experiments on various real-world datasets, Multi-Dedup achieves 3-5 times performance improvement when incorporated with the locality-based ChunkStash and local-similarity-based SiLo methods. Moreover, Multi-Dedup dramatically decreases the synchronization overhead and achieves 1.5-2 times performance improvement compared with traditional lock-based synchronization methods.
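The prefix-based concurrent index is only described at a high level above. A minimal sketch of the general idea — routing each chunk fingerprint to one of several independent map partitions by a prefix of its hash, so that parallel deduplication threads rarely contend — might look like the following; the class and method names are illustrative assumptions, not the authors' API.

```java
import java.util.concurrent.ConcurrentHashMap;

// Illustrative sketch: a fingerprint index partitioned by the first character of the
// chunk hash, so parallel deduplication threads mostly touch disjoint partitions.
public class PrefixPartitionedIndex {
    private final ConcurrentHashMap<String, Long>[] partitions;

    @SuppressWarnings("unchecked")
    public PrefixPartitionedIndex(int partitionCount) {
        partitions = new ConcurrentHashMap[partitionCount];
        for (int i = 0; i < partitionCount; i++) {
            partitions[i] = new ConcurrentHashMap<>();
        }
    }

    private ConcurrentHashMap<String, Long> partitionFor(String fingerprint) {
        // Route by a prefix of the fingerprint (here: its first character).
        return partitions[fingerprint.charAt(0) % partitions.length];
    }

    // Returns the existing chunk location if the fingerprint is a duplicate,
    // otherwise records the new location and returns null.
    public Long putIfAbsent(String fingerprint, long location) {
        return partitionFor(fingerprint).putIfAbsent(fingerprint, location);
    }
}
```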
Funding: Sponsored by the National Defence SciTech Key Lab Foundation (51457040204BQ0102).
Abstract: In order to improve the real-time performance of real-time HLA (high level architecture) in applications with a massive data communication volume, multi-thread processing was adopted: a thread pool structure was introduced into the system, and different threads were used to handle the corresponding message queues and respond to different message requests. Furthermore, an allocation strategy of semi-complete priority deprivation was adopted, which reduces the thread-switching cost and processing burden in the system while ensuring that high-priority message requests can be responded to in time, and thus improves the system's overall performance. The design and experimental results indicate that the proposed method can greatly improve the real-time performance of HLA in distributed system applications.
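The dispatching scheme described above — dedicated threads serving per-priority message queues — can be sketched roughly as follows; the queue layout and class names are hypothetical, not the paper's implementation.

```java
import java.util.concurrent.*;

// Illustrative sketch: each priority level gets its own queue and worker thread,
// so high-priority message requests are not stuck behind low-priority ones.
public class PriorityMessageDispatcher {
    private final BlockingQueue<Runnable>[] queues;
    private final ExecutorService workers;

    @SuppressWarnings("unchecked")
    public PriorityMessageDispatcher(int priorityLevels) {
        queues = new BlockingQueue[priorityLevels];
        workers = Executors.newFixedThreadPool(priorityLevels);
        for (int p = 0; p < priorityLevels; p++) {
            queues[p] = new LinkedBlockingQueue<>();
            final int level = p;
            workers.submit(() -> {
                try {
                    while (true) {
                        queues[level].take().run();   // one dedicated thread per queue
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
    }

    // Message handlers are submitted to the queue of their priority level.
    public void submit(int priority, Runnable messageHandler) {
        queues[priority].offer(messageHandler);
    }
}
```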
Funding: Supported by NSC under Grant Nos. NSC 100-2218-E-009-009MY3 and NSC 100-2218-E-009-010-MY3.
Abstract: mc2llvm is a process-level ARM-to-x86 binary translator developed in our lab over the past several years. Previously, it was able to emulate only single-threaded programs. We extend mc2llvm to emulate multi-threaded programs. Our main task is to reconstruct its architecture for multi-threaded programs: register mapping, code cache management, and address mapping in mc2llvm have all been modified. In addition, to further speed up emulation, we collect hot paths and aggressively optimize and generate code for them at run time, using additional threads to alleviate the overhead. Thus, when the same hot path is walked through again, the corresponding optimized native code is executed instead. In our experiments, our system is 8.8X faster than QEMU (quick emulator) on average when emulating the specified benchmarks with 8 guest threads.
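The run-time hot-path optimization is specific to mc2llvm's internals, but the general pattern — counting executions of a path and handing it to a background thread for optimization once it becomes hot, so later executions run the optimized code — can be sketched as below; the names and the threshold are illustrative assumptions, not mc2llvm's actual code.

```java
import java.util.concurrent.*;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative sketch: count path executions and, once a threshold is crossed,
// optimize the path on a background thread; later executions use the optimized code.
public class HotPathManager {
    private static final int HOT_THRESHOLD = 1000;   // hypothetical hotness threshold
    private final ConcurrentHashMap<Long, AtomicInteger> counters = new ConcurrentHashMap<>();
    private final ConcurrentHashMap<Long, Runnable> optimizedCode = new ConcurrentHashMap<>();
    private final ExecutorService optimizer = Executors.newSingleThreadExecutor();

    public void execute(long pathId, Runnable interpreted,
                        java.util.function.LongFunction<Runnable> optimize) {
        Runnable optimized = optimizedCode.get(pathId);
        if (optimized != null) {
            optimized.run();                          // hot path already optimized
            return;
        }
        int count = counters.computeIfAbsent(pathId, k -> new AtomicInteger()).incrementAndGet();
        if (count == HOT_THRESHOLD) {
            // Optimize asynchronously so the emulation thread is not stalled.
            optimizer.submit(() -> optimizedCode.put(pathId, optimize.apply(pathId)));
        }
        interpreted.run();
    }
}
```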
Funding: This research is funded by the Open Foundation for the University Innovation Platform in Hunan Province, grant number 16K013; the Hunan Provincial Natural Science Foundation of China, grant number 2017JJ2016; the 2016 Science Research Project of the Hunan Provincial Department of Education, grant number 16C0269; the National Students Innovation and Entrepreneurship Training Program project "Accurate crawler design and implementation with a data cleaning function", grant number 201811532010; and open projects, grant numbers 20181901CRP03, 20181901CRP04, and 20181901CRP05. This research work is implemented at the 2011 Collaborative Innovation Center for Development and Utilization of Finance and Economics Big Data Property, Universities of Hunan Province.
Abstract: Web crawlers are an important part of modern search engines. With the development of the times, data has exploded and humans have entered a "big data era". For example, Wikipedia carries knowledge from all over the world, records the real-time news that occurs every day, and provides users with a good database; however, because of the large amount of data, searching it puts a lot of pressure on users. At present, single-threaded crawling can no longer meet the requirements of text crawling. In order to improve the performance and versatility of single-threaded crawlers, a high-speed multi-threaded web crawler is designed to crawl a hyper-large-scale network text database. Multi-threaded crawling uses multiple threads to process web pages in parallel, combining breadth-first and depth-first algorithms to control the crawl. The practice project implements, in the Python language, a multi-threaded crawling method for the hyper-large-scale text database of Wikipedia books; the project was inspired by an article on Wikipedia in the Big Data Digest WeChat public account.
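The project itself is written in Python; purely to illustrate the multi-threaded crawling pattern described above (worker threads sharing a URL frontier and a visited set), a rough sketch in Java might look like this, with the page fetch left as a stub.

```java
import java.util.*;
import java.util.concurrent.*;

// Illustrative sketch of a multi-threaded crawl frontier: worker threads take URLs
// from a shared queue, record visited pages, and enqueue newly discovered links.
public class CrawlerSketch {
    private final BlockingQueue<String> frontier = new LinkedBlockingQueue<>();
    private final Set<String> visited = ConcurrentHashMap.newKeySet();

    // Stub: a real crawler would download the page and extract its links here.
    private List<String> fetchLinks(String url) {
        return Collections.emptyList();
    }

    public void crawl(String seedUrl, int threads) throws InterruptedException {
        frontier.add(seedUrl);
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < threads; i++) {
            pool.submit(() -> {
                String url;
                try {
                    while ((url = frontier.poll(2, TimeUnit.SECONDS)) != null) {
                        if (!visited.add(url)) continue;          // already seen
                        for (String link : fetchLinks(url)) {
                            if (!visited.contains(link)) frontier.offer(link);
                        }
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.HOURS);
    }
}
```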
Abstract: In this paper, we conduct research on Java multi-thread programming and its further development tendency. Multithreading mechanisms can run several tasks at the same time and make programs run more efficiently, and they can also overcome limitations of traditional programming language design; the key to such designs is the realization of thread synchronization. Multithreading is a mechanism that allows concurrent execution of multiple instruction streams in a program; each instruction stream is called a thread, and the threads are independent of each other. A thread is also known as a lightweight process; it has independent execution and process control. Our research starts from the analysis of the corresponding mechanisms to enhance performance, which is innovative and meaningful.
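As a concrete illustration of the mechanisms discussed above — starting multiple threads and synchronizing their access to shared state — a minimal Java example is given below.

```java
// Minimal illustration: two threads increment a shared counter, with a
// synchronized method providing the thread synchronization discussed above.
public class CounterDemo {
    private int count = 0;

    private synchronized void increment() {
        count++;                       // only one thread may execute this at a time
    }

    public static void main(String[] args) throws InterruptedException {
        CounterDemo demo = new CounterDemo();
        Runnable task = () -> {
            for (int i = 0; i < 100_000; i++) demo.increment();
        };
        Thread t1 = new Thread(task);  // each instruction stream is a thread
        Thread t2 = new Thread(task);
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        System.out.println("Final count: " + demo.count);   // 200000
    }
}
```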
Funding: Partly supported by the Public Geological Survey Project (No. 201011039), the National High Technology Research and Development Project of China (No. 2007AA06Z134), and the 111 Project under the Ministry of Education and the State Administration of Foreign Experts Affairs, China (No. B07011).
Abstract: This paper presents a reasonable gridding-parameters extraction method for setting the optimal interpolation nodes in the gridding of scattered observed data. The method can extract optimized gridding parameters based on the distribution of features in the raw data. Modeling analysis proves that the distortion caused by gridding can be greatly reduced when using such parameters. We also present some improved technical measures that use human-machine interaction and multi-thread parallel technology to address inadequacies in traditional gridding software. On the basis of these methods, we have developed software that can grid scattered data through a graphic interface. Finally, a comparison of different gridding parameters on field magnetic data from Jilin Province, North China, demonstrates the superiority of the proposed method in eliminating distortions and enhancing gridding efficiency.
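The paper's gridding-parameter extraction and interpolation scheme is not detailed here; the sketch below only illustrates the multi-thread parallel aspect, distributing grid rows across threads with a simple inverse-distance weighting that stands in for the actual interpolation method.

```java
import java.util.stream.IntStream;

// Illustrative sketch: grid rows are interpolated in parallel, one row per task,
// using simple inverse-distance weighting (a stand-in, not the paper's algorithm).
public class ParallelGridder {
    public static double[][] grid(double[][] points, int rows, int cols,
                                  double x0, double y0, double dx, double dy) {
        double[][] grid = new double[rows][cols];
        IntStream.range(0, rows).parallel().forEach(r -> {
            for (int c = 0; c < cols; c++) {
                double gx = x0 + c * dx, gy = y0 + r * dy;
                double num = 0, den = 0;
                for (double[] p : points) {            // p = {x, y, value}
                    double d2 = (p[0] - gx) * (p[0] - gx) + (p[1] - gy) * (p[1] - gy);
                    double w = 1.0 / (d2 + 1e-12);
                    num += w * p[2];
                    den += w;
                }
                grid[r][c] = num / den;
            }
        });
        return grid;
    }
}
```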
Funding: Supported by the National Natural Science Foundation of China (61233014), the China Postdoctoral Science Foundation (2012M5210711, 20123218110031), and the National Natural Science Major International Cooperation Projects (61161120323).
Abstract: Inspired by the unique structure of insect compound eyes, a multi-channel image acquisition system is designed to photograph a cylindrical panorama of its surroundings with one shot. The hardware consists of an embedded ARM system and an array of 16 micro image sensors. The system achieves synchronization of captured photos within 10 ms, as well as 10 frame/s video capture. The software architecture includes the TCP/IP protocol, video capture procedures in "Poll/Read" or "video streaming" modes, thread-pool monitoring with multi-threading mutexes, synchronization control with the "event", "mutex signal", and "critical region" functions, and a synthetic image algorithm characterized by portability, modularity, and remote transmission. The panoramic imaging system is expected to serve as a vision sensor for mobile robotics.
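The synchronization described above runs on the embedded ARM system with events, mutexes, and critical regions; a rough Java analogue of the capture synchronization — triggering one thread per sensor and assembling the panorama only after all frames have arrived — is sketched below, with the frame read replaced by a stand-in.

```java
import java.util.concurrent.*;

// Rough analogue of the capture synchronization: 16 sensor threads each grab a
// frame, and a latch ensures the panorama is assembled only after all frames arrive.
public class SynchronizedCapture {
    public static void main(String[] args) throws InterruptedException {
        int sensors = 16;
        byte[][] frames = new byte[sensors][];
        CountDownLatch allCaptured = new CountDownLatch(sensors);
        ExecutorService pool = Executors.newFixedThreadPool(sensors);

        for (int i = 0; i < sensors; i++) {
            final int id = i;
            pool.submit(() -> {
                frames[id] = new byte[640 * 480];   // stand-in for a real sensor read
                allCaptured.countDown();
            });
        }

        allCaptured.await();                        // wait for every channel
        System.out.println("All " + sensors + " frames captured; stitching panorama...");
        pool.shutdown();
    }
}
```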
Abstract: The problems of current highly redundant flight control systems are analyzed in this paper. Our study gives methods of utilizing other information to reduce physical components while still meeting the reliability requirements of the flight control system. The strategies presented in this paper mainly include information redundancy, multi-threading, time redundancy, and geometry-space redundancy. Analysis and simulation show that these non-hardware-based methods can reduce the required level of system hardware and thus reduce system complexity, weight, space, cost, and R&D (research and development) time.
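Of the listed strategies, time redundancy can be illustrated with a small sketch: the same computation is executed several times and a result is accepted only when a majority of runs agree. The code below is a generic illustration of that idea, not the paper's implementation.

```java
import java.util.function.Supplier;

// Illustrative sketch of time redundancy: run the same computation three times
// and accept a value only if a majority of runs agree.
public class TimeRedundancy {
    public static <T> T majorityOfThree(Supplier<T> computation) {
        T a = computation.get();
        T b = computation.get();
        T c = computation.get();
        if (a.equals(b) || a.equals(c)) return a;
        if (b.equals(c)) return b;
        throw new IllegalStateException("No majority: transient fault suspected");
    }

    public static void main(String[] args) {
        Supplier<Double> sensorAverage = () -> 0.5 * (3.2 + 3.4);  // stand-in computation
        Double result = majorityOfThree(sensorAverage);
        System.out.println("Voted result: " + result);
    }
}
```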
Funding: The authors extend their appreciation to the Deanship of Scientific Research at Imam Mohammad Ibn Saud Islamic University for funding and supporting this work through the Graduate Student Research Support Program.
Abstract: The last decade witnessed a rapid increase in multimedia and other applications that require transmitting and protecting huge amounts of data streams simultaneously. For such applications, a high-performance cryptosystem is compulsory to provide the necessary security services. The elliptic curve cryptosystem (ECC) has been introduced as a considerable option. However, the usual sequential implementation of ECC and the standard elliptic curve (EC) form cannot achieve the required performance level. Moreover, the widely used hardware implementation of ECC is a costly option and may not be affordable. This research aims to develop a high-performance parallel software implementation of ECC. To achieve this, many experiments were performed to examine several factors affecting ECC performance, including the projective coordinates, the scalar multiplication algorithm, the elliptic curve (EC) form, and the parallel implementation. ECC performance was analyzed using these factors in order to tune them and select the best choices to increase the speed of the cryptosystem. Experimental results illustrate that the parallel Montgomery ECC implementation using homogeneous projection achieves the highest performance level, since it scored the shortest time delay for ECC computations. In addition, the results show that the NAF algorithm consumes less time to perform encryption and scalar multiplication operations in comparison with the Montgomery ladder and binary methods. The Java multi-threading technique was adopted to implement ECC computations in parallel. The proposed multi-threaded Montgomery ECC implementation significantly improves the performance level compared to previously presented parallel and sequential implementations.
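The NAF (non-adjacent form) recoding mentioned above is a standard technique that reduces the number of nonzero digits in the scalar and hence the number of point additions in scalar multiplication. A self-contained sketch of NAF recoding, independent of the paper's full ECC implementation, is given below.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

// Standard non-adjacent form (NAF) recoding of a scalar: each digit is -1, 0, or 1,
// and no two consecutive digits are nonzero, which reduces point additions in
// elliptic-curve scalar multiplication.
public class NafRecoding {
    public static List<Integer> naf(BigInteger k) {
        List<Integer> digits = new ArrayList<>();
        while (k.signum() > 0) {
            int d = 0;
            if (k.testBit(0)) {                                       // k is odd
                d = 2 - k.mod(BigInteger.valueOf(4)).intValue();      // d in {-1, 1}
                k = k.subtract(BigInteger.valueOf(d));
            }
            digits.add(d);                                            // least significant digit first
            k = k.shiftRight(1);
        }
        return digits;
    }

    public static void main(String[] args) {
        System.out.println(naf(BigInteger.valueOf(7)));   // [-1, 0, 0, 1], i.e. 8 - 1
    }
}
```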
Abstract: Scalability is one of the most important quality attributes of software-intensive systems, because it maintains effective performance under large, fluctuating, and sometimes unpredictable workloads. To achieve scalability, the thread pool system (TPS), also known as an executor service, has been used extensively as a middleware service in software-intensive systems. TPS optimization is a challenging problem that determines the optimal size of the thread pool dynamically at runtime. In the case of a distributed TPS (DTPS), another issue is load balancing between the available set of TPSs running at backend servers. Existing DTPSs are overloaded either due to an inappropriate TPS optimization strategy at the backend servers or an improper load balancing scheme that cannot quickly recover from an overload; consequently, the performance of the software-intensive system suffers. Thus, in this paper, we propose a new DTPS that follows collaborative round-robin load balancing, which has the effect of a double-edged sword. On the one hand, it effectively performs load balancing (in an overload situation) among the available TPSs through a fast overload-recovery procedure that decelerates the load on the overloaded TPSs down to their capacities and shifts the remaining load towards other gracefully running TPSs. On the other hand, its robust load-deceleration technique, applied to an overloaded TPS, sets an appropriate upper bound on the thread pool size, because the pool size in each TPS is kept equal to the request rate on it, hence dynamically optimizing the TPS. We evaluated the proposed system against state-of-the-art DTPSs using a client-server based simulator and found that our system outperformed them by sustaining smaller response times.
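The collaborative round-robin load balancing with overload recovery is described above only at a high level; a simplified sketch of the dispatch side — cycling over backend thread pools and skipping any pool whose queue has reached a (hypothetical) capacity — could look like the following.

```java
import java.util.List;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.atomic.AtomicInteger;

// Simplified sketch: requests are dispatched round-robin across backend thread pools;
// a pool whose queue has reached its capacity is treated as overloaded and skipped,
// shifting the remaining load toward gracefully running pools.
public class RoundRobinBalancer {
    private final List<ThreadPoolExecutor> pools;
    private final int queueCapacity;                 // hypothetical per-pool capacity
    private final AtomicInteger next = new AtomicInteger();

    public RoundRobinBalancer(List<ThreadPoolExecutor> pools, int queueCapacity) {
        this.pools = pools;
        this.queueCapacity = queueCapacity;
    }

    public void dispatch(Runnable request) {
        for (int attempts = 0; attempts < pools.size(); attempts++) {
            int index = Math.floorMod(next.getAndIncrement(), pools.size());
            ThreadPoolExecutor pool = pools.get(index);
            if (pool.getQueue().size() < queueCapacity) {    // not overloaded
                pool.execute(request);
                return;
            }
        }
        throw new RejectedExecutionException("All thread pools are overloaded");
    }
}
```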
Abstract: Multi-frame coding is supported by the emerging H.264 standard and is important for enhancing both coding efficiency and error robustness. In this paper, error-resilient schemes for H.264 based on multi-frame coding were investigated. Error-robust H.264 video transmission schemes were introduced for applications with and without a feedback channel. The experimental results demonstrate the effectiveness of the proposed schemes.