Journal Articles
6 articles found
1. Design of a memory polynomial predistorter for wideband envelope tracking amplifiers (Cited by: 5)
Authors: Jing Zhang, Songbai He, Lu Gan. Journal of Systems Engineering and Electronics (SCIE, EI, CSCD), 2011, No. 2, pp. 193-199 (7 pages)
Efficiency and linearity of the microwave power amplifier are critical elements for mobile communication systems. A memory polynomial baseband predistorter based on an indirect learning architecture is presented for improving the linearity of an envelope tracking (ET) amplifier with application to a wireless transmitter. To deal with the large peak-to-average ratio (PAR) problem, a clipping procedure for the input signal is employed. The system performance is then verified by simulation. For a single-carrier wideband code division multiple access (WCDMA) signal with 16-quadrature amplitude modulation (16-QAM), about 2% improvement in the error vector magnitude (EVM) is achieved at an average output power of 45.5 dBm and a gain of 10.6 dB, with an adjacent channel leakage ratio (ACLR) of -64.55 dBc at a 5 MHz frequency offset. Moreover, a three-carrier WCDMA signal and a third-generation (3G) long term evolution (LTE) signal are used as test signals to demonstrate the performance of the proposed linearization scheme under signals of different bandwidths.
Keywords: envelope tracking, memory polynomial predistorter, indirect learning architecture, power amplifier, memory effects
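The memory polynomial model and the PAR-reduction clipping step described in the abstract can be sketched as below. The coefficient layout and the envelope limit are illustrative assumptions, not values from the paper, and the indirect-learning coefficient estimation is omitted.

```python
import numpy as np

def memory_polynomial(x, coeffs):
    """Baseband memory polynomial: y[n] = sum_{k,q} a[k,q] * x[n-q] * |x[n-q]|^k.

    x      : complex baseband samples, shape (N,)
    coeffs : a[k, q] for nonlinearity order k+1 and memory depth q, shape (K, Q)
    """
    K, Q = coeffs.shape
    y = np.zeros_like(x, dtype=complex)
    for q in range(Q):
        xq = np.roll(x, q)   # delayed copy x[n-q]
        xq[:q] = 0           # zero the samples wrapped around by np.roll
        for k in range(K):
            y += coeffs[k, q] * xq * np.abs(xq) ** k
    return y

def clip_par(x, max_amp):
    """Clip the envelope to max_amp while preserving phase (reduces PAR)."""
    amp = np.abs(x)
    scale = np.where(amp > max_amp, max_amp / np.maximum(amp, 1e-12), 1.0)
    return x * scale
```

With identity coefficients (`coeffs = [[1.0]]`) the predistorter passes the signal through unchanged; the indirect learning loop would adapt `coeffs` so that the cascade of predistorter and amplifier is linear.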
2. Fault-Tolerant Design of Spaceborne Mass Memory System
Authors: 张宇宁, 常亮, 杨根庆, 李华旺. Transactions of Tianjin University (EI, CAS), 2010, No. 1, pp. 17-21 (5 pages)
A fault-tolerant spaceborne mass memory architecture is presented based entirely on commercial-off-the-shelf components. The highly modularized and scalable memory kernel supports the hierarchical design and is well suited to a redundancy structure. Error correcting code (ECC) and periodical scrubbing are used to deal with bit errors induced by single event upset. For 8-bit-wide devices, the parallel Reed-Solomon (10, 8) codec can perform coder/decoder calculations in one clock cycle, achieving a data rate of several Gb/s.
Keywords: fault-tolerant, memory architecture, data integrity, parallel Reed-Solomon codec
3. Approximate Similarity-Aware Compression for Non-Volatile Main Memory
Authors: 陈章玉, 华宇, 左鹏飞, 孙园园, 郭云程. Journal of Computer Science & Technology (SCIE, EI, CSCD), 2024, No. 1, pp. 63-81 (19 pages)
Image bitmaps, i.e., data containing pixels and visual perception, have been widely used in emerging applications for pixel operations while consuming large amounts of memory space and energy. Compared with legacy DRAM (dynamic random access memory), non-volatile memories (NVMs) are suitable for bitmap storage due to their salient features of high density and intrinsic durability. However, writing NVMs suffers from higher energy consumption and latency compared with read accesses. Existing precise or approximate compression schemes in NVM controllers show limited performance for bitmaps due to the irregular data patterns and variance in bitmaps. We observe pixel-level similarity when writing bitmaps due to the analogous contents in adjacent pixels. By exploiting the pixel-level similarity, we propose SimCom, an approximate similarity-aware compression scheme in the NVM module controller, to efficiently compress data for each write access on the fly. The idea behind SimCom is to compress continuous similar words into pairs of base words with runs. The storage costs for small runs are further mitigated by reusing the least significant bits of base words. SimCom adaptively selects an appropriate compression mode for various bitmap formats, thus achieving an efficient trade-off between quality and memory performance. We implement SimCom on GEM5/zsim with NVMain and evaluate the performance with real-world image/video workloads. Our results demonstrate the efficacy and efficiency of SimCom with an efficient quality-performance trade-off.
Keywords: approximate computing, data compression, memory architecture, non-volatile memory
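The core base-word-plus-run idea behind SimCom can be sketched as follows. The similarity test (an absolute-difference threshold) and the integer word representation are simplifying assumptions of mine; the actual scheme operates on cache-line writes in the NVM controller and additionally reuses base-word bits to store small runs.

```python
def compress_similar(words, tolerance):
    """Collapse runs of similar words into (base_word, run_length) pairs.

    Two words count as 'similar' if their absolute difference is within
    `tolerance` -- a stand-in for SimCom's pixel-level similarity test.
    """
    pairs = []
    for w in words:
        if pairs and abs(w - pairs[-1][0]) <= tolerance:
            base, run = pairs[-1]
            pairs[-1] = (base, run + 1)   # extend the current run
        else:
            pairs.append((w, 1))          # start a new run with w as base
    return pairs

def decompress(pairs):
    """Approximate inverse: every word in a run is rebuilt from its base."""
    out = []
    for base, run in pairs:
        out.extend([base] * run)
    return out
```

Decompression is lossy by design: words inside a run are approximated by the base word, which is acceptable for pixel data where adjacent values differ only slightly.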
4. Efficient and flexible memory architecture to alleviate data and context bandwidth bottlenecks of coarse-grained reconfigurable arrays (Cited by: 2)
Authors: YANG Chen, LIU Lei Bo, YIN Shou Yi, WEI Shao Jun. Science China (Physics, Mechanics & Astronomy) (SCIE, EI, CAS), 2014, No. 12, pp. 2214-2227 (14 pages)
The computational capability of a coarse-grained reconfigurable array (CGRA) can be significantly restrained by data and context memory bandwidth bottlenecks. Traditionally, two methods have been used to resolve this problem. One method loads the context into the CGRA at run time. This method occupies very little on-chip memory but induces very large latency, which leads to low computational efficiency. The other method adopts a multi-context structure. This method loads the context into the on-chip context memory at the boot phase. Broadcasting the pointer of a set of contexts changes the hardware configuration on a cycle-by-cycle basis. The size of the context memory induces a large area overhead in multi-context structures, which results in major restrictions on application complexity. This paper proposes a Predictable Context Cache (PCC) architecture to address the above context issues by buffering the context inside a CGRA. In this architecture, context is dynamically transferred into the CGRA. Utilizing a PCC significantly reduces the on-chip context memory, and the complexity of the applications running on the CGRA is no longer restricted by its size. Data preloading is the most frequently used approach to hide input data latency and speed up data transmission for the data bandwidth issue. Rather than fundamentally reducing the amount of input data, the transferred data and computations are processed in parallel. However, the data preloading method cannot work efficiently because data transmission becomes the critical path as the reconfigurable array scale increases. This paper also presents a Hierarchical Data Memory (HDM) architecture as a solution to the efficiency problem. In this architecture, high internal bandwidth is provided to buffer both reused input data and intermediate data. The HDM architecture relieves the external memory from the data transfer burden, so the performance is significantly improved. As a result of using PCC and HDM, experiments running mainstream video decoding programs achieved performance improvements of 13.57%-19.48% with a reasonable memory size. Therefore, 1080p@35.7fps H.264 high-profile video decoding can be achieved on the PCC and HDM architecture at a 200 MHz working frequency. Further, the size of the on-chip context memory no longer restricts complex applications, which are efficiently executed on the PCC and HDM architecture.
Keywords: memory architecture, CGRA, context cache, cache prefetch, data memory
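The idea of buffering only a small working set of configuration contexts on-chip, as in the PCC, can be illustrated with a simple LRU cache. The capacity, the miss accounting, and the dictionary-backed off-chip store are illustrative assumptions; the paper's design additionally predicts and prefetches contexts, which is omitted here.

```python
from collections import OrderedDict

class ContextCache:
    """Tiny LRU buffer for CGRA configuration contexts.

    On a hit the context is served from on-chip storage; on a miss it is
    fetched from off-chip memory and the least-recently-used entry is evicted.
    """
    def __init__(self, capacity):
        self.capacity = capacity
        self.store = OrderedDict()  # context_id -> context words, LRU order
        self.misses = 0

    def fetch(self, context_id, off_chip):
        if context_id in self.store:
            self.store.move_to_end(context_id)   # mark most-recently-used
        else:
            self.misses += 1
            if len(self.store) >= self.capacity:
                self.store.popitem(last=False)   # evict the LRU entry
            self.store[context_id] = off_chip[context_id]
        return self.store[context_id]
```

The point of the PCC is that application complexity is bounded by the off-chip context store, not by the (small) on-chip `capacity`; a good prediction scheme keeps `misses` off the critical path.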
5. A compact PE memory for vision chips
Authors: 石匆, 陈哲, 杨杰, 吴南健, 王志华. Journal of Semiconductors (EI, CAS, CSCD), 2014, No. 9, pp. 104-110 (7 pages)
This paper presents a novel compact memory in the processing element (PE) for single-instruction multiple-data (SIMD) vision chips. The PE memory is constructed with 8×8 register cells, where one latch in the slave stage is shared by eight latches in the master stage. The memory supports simultaneous read and write at the same address in one clock cycle. Its compact area of 14.33 μm²/bit promises a higher integration level for the processor. A prototype chip with a 64×64 PE array is fabricated in a UMC 0.18 μm CMOS technology. Five types of PE memory cell structure are designed and compared. The testing results demonstrate that the proposed PE memory architecture satisfies the requirements of the vision chip in high-speed real-time vision applications, such as 1000 fps edge extraction.
Keywords: vision chip, PE memory architecture, SIMD, edge extraction
6. A Study on Modeling and Optimization of Memory Systems
Authors: Jason Liu, Pedro Espina, Xian-He Sun. Journal of Computer Science & Technology (SCIE, EI, CSCD), 2021, No. 1, pp. 71-89 (19 pages)
Accesses Per Cycle (APC), Concurrent Average Memory Access Time (C-AMAT), and Layered Performance Matching (LPM) are three memory performance models that consider both data locality and memory access concurrency. The APC model measures the throughput of a memory architecture and therefore reflects the quality of service (QoS) of a memory system. The C-AMAT model provides a recursive expression for the memory access delay and therefore can be used to identify potential bottlenecks in a memory hierarchy. The LPM method transforms a global memory system optimization into localized optimizations at each memory layer by matching the data access demands of the applications with the underlying memory system design. These three models were proposed separately through prior efforts. This paper reexamines the three models under one coherent mathematical framework. More specifically, we present a new memory-centric view of data accesses. We divide the memory cycles at each memory layer into four distinct categories and use them to recursively define the memory access latency and concurrency along the memory hierarchy. This new perspective offers new insights with a clear formulation of memory performance considering both locality and concurrency. Consequently, the performance model can be easily understood and applied in engineering practices. As such, the memory-centric approach helps establish a unified mathematical foundation for model-driven performance analysis and optimization of contemporary and future memory systems.
Keywords: performance modeling, performance optimization, memory architecture, memory hierarchy, concurrent average memory access time
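The contrast between the classic locality-only AMAT and a concurrency-aware variant in the spirit of C-AMAT can be sketched for a single memory layer. The symbols and the single-layer simplification are mine, not the paper's recursive formulation; the key idea carried over is that each latency term is discounted by the number of accesses overlapped with it.

```python
def amat(hit_time, miss_rate, miss_penalty):
    """Classic average memory access time: captures locality only."""
    return hit_time + miss_rate * miss_penalty

def c_amat(hit_time, hit_concurrency, pure_miss_rate, pure_miss_penalty,
           miss_concurrency):
    """Concurrency-aware access time, single-layer sketch.

    Hit latency is divided by the average number of concurrent hits, and
    only 'pure' misses (those not hidden behind other accesses) contribute,
    divided by the concurrency available during miss handling.
    """
    return (hit_time / hit_concurrency
            + (pure_miss_rate * pure_miss_penalty) / miss_concurrency)
```

For example, a layer with a 1-cycle hit, 10% miss rate, and 100-cycle penalty looks like 11 cycles per access under AMAT, but far less once hit and miss concurrency are accounted for, which is why throughput-style metrics such as APC rank memory systems differently than AMAT does.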