Abstract
In text detection, the non-maximum suppression (NMS) algorithm incurs high latency, which degrades overall real-time performance. To address this, a post-processing acceleration method is proposed. A candidate-box classification method based on position range replaces the up-front sorting step, reducing computational complexity. The intersection-over-union (IoU) calculation is simplified through multiple scalings, and a full-coverage constraint is added to reduce the redundant small candidate boxes that scaling introduces. The judgment conditions are reused and a computing unit with a three-stage pipeline is designed, further reducing computing latency. Experimental results show that the accelerator deployed on a Zynq-XC7Z020 consumes 3.28 W, and that its computing performance is 67 times and 38 times higher than that of CPU implementations of NMS and LANMS, respectively.
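For orientation, the following is a minimal Python sketch of the conventional sort-based greedy NMS that the abstract treats as the baseline, extended with a simple full-coverage check. It is not the authors' accelerator or their simplified IoU formula. The axis-aligned box layout `(x1, y1, x2, y2, score)`, the function names, and the 0.5 threshold are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical box layout for this sketch: (x1, y1, x2, y2, score).
Box = Tuple[float, float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Standard intersection over union of two axis-aligned boxes."""
    iw = min(a[2], b[2]) - max(a[0], b[0])
    ih = min(a[3], b[3]) - max(a[1], b[1])
    if iw <= 0 or ih <= 0:
        return 0.0  # no overlap
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def fully_covers(outer: Box, inner: Box) -> bool:
    """True when `inner` lies entirely inside `outer` (assumed reading of
    the full-coverage constraint)."""
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and outer[2] >= inner[2] and outer[3] >= inner[3])


def greedy_nms(boxes: List[Box], iou_threshold: float = 0.5) -> List[Box]:
    """Baseline NMS: sort by score, then keep a box only if it neither
    overlaps a kept box too much nor is fully covered by one."""
    kept: List[Box] = []
    for cand in sorted(boxes, key=lambda b: b[4], reverse=True):
        if all(iou(cand, k) <= iou_threshold and not fully_covers(k, cand)
               for k in kept):
            kept.append(cand)
    return kept


boxes = [
    (0, 0, 10, 10, 0.9),    # highest score, kept
    (1, 1, 11, 11, 0.8),    # heavy overlap with the first box
    (20, 20, 30, 30, 0.7),  # separate region, kept
    (2, 2, 4, 4, 0.6),      # low IoU, but fully inside the first box
]
result = greedy_nms(boxes)
```

The small box in the last entry has a low IoU with the first box, so a plain IoU threshold alone would keep it; the coverage check is what removes it. The up-front `sorted` call is the step the abstract says is replaced by position-range classification.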
Authors
TU Cheng-li (屠程力)
CHEN Zhang-jin (陈章进)
QIAO Dong (乔栋)
Microelectronics Research and Development Center, Shanghai University, Shanghai 200444, China; Computing Center, Shanghai University, Shanghai 200444, China
Source
Computer Engineering and Design (《计算机工程与设计》)
Peking University Core Journal (北大核心)
2023, No. 9, pp. 2837-2843 (7 pages)
Funding
National Natural Science Foundation of China (61674100).
Keywords
text detection
non-maximum suppression
pipeline processing
post processing
hardware acceleration
intersection over union
detection boxes