This paper presents the design and implementation of a low power digital signal processor (THUCIDSP-1 ) targeting at application for cochlear implants. Multi-level low power strategies including algorithm optimizati...This paper presents the design and implementation of a low power digital signal processor (THUCIDSP-1 ) targeting at application for cochlear implants. Multi-level low power strategies including algorithm optimization, operand isolation, clock gating and memory partitioning are adopted in the processor design to reduce the power consumption. Experimental results show that the complexity of the Continuous Interleaved Sampling (CIS) algorithm is reduced by more than 80 % and the power dissipation of the hardware alone is reduced by about 25% with the low power methods. The THUCIDSP-1 prototype, fabricated in 0.18-μm standard CMOS process, consumes only 1.91 mW when executing the CIS algorithm at 3 MHz.展开更多
The Distributed Shared Memory(DSM)architecture is widely used in today’s computer design to mitigate the ever-widening processing-memory gap,and it inevitably exhibits Non-Uniform Memory Access(NUMA)to shared-memory ...The Distributed Shared Memory(DSM)architecture is widely used in today’s computer design to mitigate the ever-widening processing-memory gap,and it inevitably exhibits Non-Uniform Memory Access(NUMA)to shared-memory parallel applications.Failure to adapt to the NUMA effect can significantly downgrade application performance,especially on today’s manycore platforms with tens to hundreds of cores.However,traditional approaches such as first-touch and memory policy fall short in false page-sharing,fragmentation,or ease of use.In this paper,we propose a partitioned shared-memory approach that allows multithreaded applications to achieve full NUMA-awareness with only minor code changes and develop an accompanying NUMA-aware heap manager which eliminates false page-sharing and minimizes fragmentation.Experiments on a 256-core cc-NUMA computing node show that the proposed approach helps applications to adapt to NUMA with only minor code changes and improves the performance of typical multithreaded scientific applications by up to 4.3 folds with the increased use of cores.展开更多
A partition checkpoint strategy based on data segment priority is presented to meet the timing constraints of the data and the transaction in embedded real-time main memory database systems(ERTMMDBS) as well as to r...A partition checkpoint strategy based on data segment priority is presented to meet the timing constraints of the data and the transaction in embedded real-time main memory database systems(ERTMMDBS) as well as to reduce the number of the transactions missing their deadlines and the recovery time.The partition checkpoint strategy takes into account the characteristics of the data and the transactions associated with it;moreover,it partitions the database according to the data segment priority and sets the corresponding checkpoint frequency to each partition for independent checkpoint operation.The simulation results show that the partition checkpoint strategy decreases the ratio of trans-actions missing their deadlines.展开更多
基金Supported by the National Natural Science Foundation of China (No. 60475018)
文摘This paper presents the design and implementation of a low power digital signal processor (THUCIDSP-1 ) targeting at application for cochlear implants. Multi-level low power strategies including algorithm optimization, operand isolation, clock gating and memory partitioning are adopted in the processor design to reduce the power consumption. Experimental results show that the complexity of the Continuous Interleaved Sampling (CIS) algorithm is reduced by more than 80 % and the power dissipation of the hardware alone is reduced by about 25% with the low power methods. The THUCIDSP-1 prototype, fabricated in 0.18-μm standard CMOS process, consumes only 1.91 mW when executing the CIS algorithm at 3 MHz.
基金supported by the National Key Research and Development Program of China(No.2016YFB0201300)。
文摘The Distributed Shared Memory(DSM)architecture is widely used in today’s computer design to mitigate the ever-widening processing-memory gap,and it inevitably exhibits Non-Uniform Memory Access(NUMA)to shared-memory parallel applications.Failure to adapt to the NUMA effect can significantly downgrade application performance,especially on today’s manycore platforms with tens to hundreds of cores.However,traditional approaches such as first-touch and memory policy fall short in false page-sharing,fragmentation,or ease of use.In this paper,we propose a partitioned shared-memory approach that allows multithreaded applications to achieve full NUMA-awareness with only minor code changes and develop an accompanying NUMA-aware heap manager which eliminates false page-sharing and minimizes fragmentation.Experiments on a 256-core cc-NUMA computing node show that the proposed approach helps applications to adapt to NUMA with only minor code changes and improves the performance of typical multithreaded scientific applications by up to 4.3 folds with the increased use of cores.
基金Supported by the National Natural Science Foundation of China (60673128)
文摘A partition checkpoint strategy based on data segment priority is presented to meet the timing constraints of the data and the transaction in embedded real-time main memory database systems(ERTMMDBS) as well as to reduce the number of the transactions missing their deadlines and the recovery time.The partition checkpoint strategy takes into account the characteristics of the data and the transactions associated with it;moreover,it partitions the database according to the data segment priority and sets the corresponding checkpoint frequency to each partition for independent checkpoint operation.The simulation results show that the partition checkpoint strategy decreases the ratio of trans-actions missing their deadlines.