Abstract
Computer vision (CV) is widely expected to be the next major emerging application area, and many heterogeneous architectures for computer vision have appeared. However, a heterogeneous architecture requires large amounts of data to be transferred between its different structures, and the resulting long data transfer delay becomes the main bottleneck limiting the processing speed of computer vision applications. To reduce data transfer delay and accelerate computer vision applications, a clustered data-driven array processor is proposed. A three-level pipelined processing element is designed that supports a two-buffer data flow interface and 8-bit, 16-bit, and 32-bit subword parallel computation. To accelerate transcendental function computation, a four-way shared pipelined transcendental function accelerator is also designed, based on a Y-intercept adjusted piecewise linear segment algorithm. A distributed shared memory structure based on unified addressing is employed as well. To verify the efficiency of the architecture, several image processing algorithms are implemented on it. The proposed architecture has been implemented on a Xilinx ZC706 development board, and the same circuitry has been synthesized using SMIC 130 nm CMOS technology. The circuitry can run at 100 MHz, and its area is 26.58 mm².
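The abstract names a Y-intercept adjusted piecewise linear segment algorithm but does not spell it out. The following is a minimal software sketch of one common reading of that idea, not the authors' hardware design: each segment starts as the chord through its endpoints, and its intercept is then shifted so that the positive and negative errors are balanced. The function names, the segment count, the sampling density, and the choice of `exp` on [0, 1] are all assumptions made for illustration.

```python
import math

def build_segments(f, lo, hi, n_segments, adjust=True, samples=64):
    """Return a table of (slope, intercept) pairs, one per equal-width segment.

    Assumption: 'intercept adjustment' here means shifting the chord so the
    sampled error is centred on zero within the segment.
    """
    width = (hi - lo) / n_segments
    table = []
    for i in range(n_segments):
        a = lo + i * width
        b = a + width
        slope = (f(b) - f(a)) / width      # chord through the segment endpoints
        intercept = f(a) - slope * a
        if adjust:
            # Sample the chord error and shift the intercept to balance it.
            errs = [f(x) - (slope * x + intercept)
                    for x in (a + k * width / samples for k in range(samples + 1))]
            intercept += (max(errs) + min(errs)) / 2
        table.append((slope, intercept))
    return table

def evaluate(table, lo, hi, x):
    """Look up the segment containing x and evaluate its line."""
    n = len(table)
    idx = min(int((x - lo) / (hi - lo) * n), n - 1)
    slope, intercept = table[idx]
    return slope * x + intercept

def max_abs_error(f, table, lo, hi, points=1000):
    """Largest absolute approximation error over an evenly spaced grid."""
    return max(abs(f(x) - evaluate(table, lo, hi, x))
               for x in (lo + k * (hi - lo) / points for k in range(points + 1)))

plain = build_segments(math.exp, 0.0, 1.0, 8, adjust=False)
adjusted = build_segments(math.exp, 0.0, 1.0, 8, adjust=True)
print(max_abs_error(math.exp, plain, 0.0, 1.0))
print(max_abs_error(math.exp, adjusted, 0.0, 1.0))
```

For a convex function such as `exp`, the chord error has a single sign within each segment, so centring it roughly halves the worst-case error at no extra cost per evaluation, which still needs only one multiply and one add. A hardware version would presumably store the table in fixed point, which this sketch does not model.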
Authors
Shan Rui
Deng Junyong
Jiang Lin
Zhu Yun
Wu Haoyue
He Feilong
Shan Rui; Deng Junyong; Jiang Lin; Zhu Yun; Wu Haoyue; He Feilong (School of Electronic and Engineering, Xi’an University of Posts and Telecommunications, Xi’an 710121, P.R. China; Integrated Circuit Laboratory, Xi’an University of Science and Technology, Xi’an 710054, P.R. China)
Funding
National Natural Science Foundation of China (No. 61802304, 61834005, 61772417, 61634004, 61602377)
Shaanxi Provincial Co-ordination Innovation Project of Science and Technology (No. 2016KTZDGY02-04-02)
Shaanxi Provincial Key R&D Plan (No. 2017GY-060)
Shaanxi International Science and Technology Cooperation Program (No. 2018KW-006)