Funding: Supported by the National Natural Science Foundation of China (No. 61802304, 61834005, 61772417, 61602377) and the Shaanxi Province Key R&D Plan (No. 2021GY-029).
Abstract: Deep learning algorithms are widely used in computer vision, natural language processing, and other fields. However, as deep learning models keep growing, their demands on storage and computing performance keep rising, and processors based on the von Neumann architecture increasingly show significant shortcomings such as high power consumption and long latency. To alleviate this problem, large-scale processing systems are shifting from a traditional computing-centric model to a data-centric model. This paper proposes a near-memory computing array architecture based on a shared buffer to improve system performance. The architecture supports instructions that integrate storage and computation, reducing data movement between the processor and main memory, and data reuse further improves the processing speed of the algorithm. The proposed architecture is verified and tested through a parallel implementation of the convolutional neural network (CNN) algorithm. Experimental results show that, at a frequency of 110 MHz, the calculation speed of a single convolution operation increases by 66.64% on average compared with a CNN architecture that performs parallel calculations on a field programmable gate array (FPGA). The processing speed of the whole convolution layer improves by 8.81% compared with a reconfigurable array processor that does not support near-memory computing.
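Abstract: The luma interpolation algorithm for fractional-pixel motion estimation in high efficiency video coding (HEVC) is computationally heavy, highly redundant, and difficult to switch flexibly between different coding blocks. To address these problems, a dynamically reconfigurable implementation of the fractional-pixel interpolation algorithm with a high data reuse rate is proposed. Interpolation is performed adaptively on the reference pixel blocks surrounding a coding unit (CU) according to its scale and size, yielding the coding mode and motion vector of the optimal prediction unit. Experimental results show that, compared with a dedicated hardware implementation of fractional-pixel interpolation, the method switches flexibly between coding blocks while reducing the number of reference pixel reads by 43.8% and hardware resource consumption by 18.5%.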
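The abstract does not spell out the interpolation itself, so the following is only a minimal sketch of one-dimensional luma interpolation using the 8-tap filter coefficients defined in the HEVC standard. The function name `interpolate_row`, the edge-clamping policy, and the single-stage rounding are assumptions made for illustration; they are not the reconfigurable implementation the paper proposes, and the standard's two-stage 2D process with higher intermediate precision is omitted.

```python
# HEVC 8-tap luma filter coefficients (each set sums to 64),
# indexed by fractional position in quarter-pixel units.
LUMA_TAPS = {
    1: (-1, 4, -10, 58, 17, -5, 1, 0),    # quarter-pixel position
    2: (-1, 4, -11, 40, 40, -11, 4, -1),  # half-pixel position
    3: (0, 1, -5, 17, 58, -10, 4, -1),    # three-quarter-pixel position
}


def interpolate_row(row, x, frac, bit_depth=8):
    """Interpolate the sample at horizontal position x + frac/4.

    `row` holds integer-position luma samples. This helper is hypothetical
    and handles the horizontal direction only.
    """
    if frac == 0:
        return row[x]  # integer position: no filtering needed
    acc = 0
    for k, coeff in enumerate(LUMA_TAPS[frac]):
        # The taps cover positions x-3 .. x+4; clamp to replicate edge pixels.
        idx = min(max(x - 3 + k, 0), len(row) - 1)
        acc += coeff * row[idx]
    # Normalize by 64 with rounding, then clip to the valid sample range.
    value = (acc + 32) >> 6
    return min(max(value, 0), (1 << bit_depth) - 1)
```

Because the filter window spans eight neighbouring samples, adjacent output positions share seven of their eight reference pixels, which is the kind of overlap a data-reuse scheme can exploit.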
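Abstract: [Objective] To study the effects of biochar addition on soil aggregate distribution, aggregate stability, and crop yield in irrigated wheat fields, to clarify how soil and crops respond to the fertility-building effect of biochar, and to provide a theoretical basis for improving soil structure and establishing a rational fertilization regime in irrigated wheat fields. [Methods] A split-plot design was used. Nitrogen fertilizer (pure N) was applied at two levels, 0 and 150 kg hm^(-2), and under each nitrogen level biochar was applied at four levels, 0, 10, 20, and 30 t hm^(-2). In a 2-year fixed-site field experiment (2018 to 2020), the contents of soil aggregates in different size fractions were obtained by dry sieving, and soil aggregate stability indices and spring wheat yield were compared. [Results] Compared with no biochar, biochar addition significantly increased the contents of soil aggregates in the >5 mm and 2 to 5 mm size fractions (P<0.05), by 10.2% to 29.2% and 8.3% to 10.2%, respectively. The increases in the mean weight diameter and geometric mean diameter of soil aggregates were most significant at a biochar rate of 20 t hm^(-2) (P<0.05), at 21.4% and 32.3%, respectively, relative to the treatment without biochar. Biochar combined with nitrogen fertilizer increased spring wheat yield more than biochar alone. The geometric mean diameter of soil aggregates was significantly and positively correlated with spring wheat yield. [Conclusion] Biochar application significantly promoted the formation of soil macroaggregates and improved their stability in irrigated wheat fields, which helps improve the soil and raise spring wheat yield. Under the conditions of this experiment, soil aggregate stability was highest with biochar alone at 20 t hm^(-2), and yield was highest with 20 t hm^(-2) biochar combined with 150 kg hm^(-2) pure N, a 42.7% increase in spring wheat yield over the control.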
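The abstract reports mean weight diameter and geometric mean diameter without giving formulas. As a hedged sketch, the commonly used definitions are MWD = Σ x_i w_i and GMD = exp(Σ w_i ln x_i / Σ w_i), where x_i is the mean diameter of size fraction i and w_i is its mass fraction. The function names below are hypothetical, and the study may have used a variant of these formulas.

```python
import math


def mean_weight_diameter(mean_diams_mm, mass_fractions):
    """MWD: mass-weighted arithmetic mean of the fraction diameters (mm)."""
    total = sum(mass_fractions)  # normalize in case fractions do not sum to 1
    return sum(d * w for d, w in zip(mean_diams_mm, mass_fractions)) / total


def geometric_mean_diameter(mean_diams_mm, mass_fractions):
    """GMD: mass-weighted geometric mean of the fraction diameters (mm)."""
    total = sum(mass_fractions)
    log_sum = sum(w * math.log(d) for d, w in zip(mean_diams_mm, mass_fractions))
    return math.exp(log_sum / total)
```

Both functions take the mean diameter of each sieve fraction and the corresponding mass fraction from dry sieving; larger values indicate a greater share of large aggregates.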