期刊文献+

zk-SNARK中数论变换的硬件加速方法研究 被引量:2

Hardware Acceleration of Number Theoretic Transform in zk-SNARK
下载PDF
导出
摘要 简洁非交互式零知识证明能够生成长度固定的证明并快速进行验证,极大地推动了零知识证明在数字签名、区块链及分布式存储等领域的应用。但其证明的生成过程极其耗时且需要被频繁调用,其中数论变换是证明生成过程的主要运算之一。然而现有的通用数论变换硬件加速方法难以满足其在简洁非交互式零知识证明中大规模、高位宽的要求。针对该问题,提出一种数论变换多级流水硬件计算架构。针对高位宽计算需求对高位模运算进行优化,设计了低时延蒙哥马利模乘单元;为了加速大规模计算,通过二维子任务划分将大规模数论变换任务划分为小规模独立子任务,并通过消除数据依赖实现了子任务间计算流水;在子任务多轮蝶形运算之间采用数据重排机制,有效缓解了访存需求并实现了不同步长蝶形运算间的计算流水。所提出的数论变换计算架构可以根据现场可编程门阵列(FPGA)片上资源灵活扩展,方便部署在不同规模的FPGA上以获得最大加速效果。所提出的硬件架构使用高层次综合(HLS)开发并基于OpenCL框架在AMD Xilinx Alveo U50实现了整套异构加速系统。实验结果表明,相比于PipeZK中的数论变换加速模块,该方法获得了1.95倍的加速比;在运行当前主流的简洁非交互式零知识证明开源项目bellman时,相比于AMD Ryzen 95900X单核及12核分别获得了27.98倍和1.74倍的加速比,并分别获得了6.9倍、6倍的能效提升。 The proof in zk-SNARK has a fixed length and can be verified quickly,promoting the application of zero-knowledge proof in areas such as digital signature,blockchain,distributed storage,and outsourced computing.How-ever,the generation of proofs is time-consuming and frequently used.As a result,NTT(number theoretic trans-form),one of the most time-consuming parts in proof-generation,needs to be accelerated significantly.However,the existing general NTT hardware acceleration methods cannot meet the requirements of large-bitwidth and large-scale in zk-SNARK.To address this issue,this paper proposes a highly pipelined architecture for NTT.First of all,large-bitwidth modular arithmetic is optimized and low-latency Montgomery modular multiplication hardware unit is de-signed.And then,the large-scale NTT tasks are divided into smaller sub-tasks through two-dimensional partitioning,which improves the parallelism of NTT computation and eliminates the data dependence among sub-tasks,thus reali-zing the pipeline among sub-tasks.Finally,the“data reordering”technique is introduced among multiple rounds of butterfly operations in a sub-task,which effectively alleviates the memory access requirements,thus realizing the bottom-level pipeline in each sub-task,among butterfly operations with different step sizes.This architecture can be flexibly scaled to different scales of FPGAs.The accelerator is prototyped on the AMD-Xilinx Alveo U50 card(UltraScale+XCU50 FPGA).To balance computing efficiency and flexibility,the OpenCL equipped with high-level synthesis(HLS)is used to implement the system.The evaluation results show that the NTT module performs 1.95 times faster than the one in PipeZK and the accelerator achieves 27.98 and 1.74 times speedup,6.9 and 6 times ener-gy efficiency improvement than AMD Ryzen 95900X respectively,when it is integrated into the well-known ZKP open-source project,bellman.
作者 赵海旭 柴志雷 花鹏程 王锋 丁冬 ZHAO Haixu;CHAI Zhilei;HUA Pengcheng;WANG Feng;DING Dong(School of Artificial Intelligence and Computer Science,Jiangnan University,Wuxi,Jiangsu 214122,China;School of Internet of Things Engineering,Jiangnan University,Wuxi,Jiangsu 214122,China;Jiangsu Provincial Engineering Laboratory of Pattern Recognition and Computational Intelligence,Wuxi,Jiangsu 214122,China)
出处 《计算机科学与探索》 CSCD 北大核心 2024年第2期538-552,共15页 Journal of Frontiers of Computer Science and Technology
基金 国家自然科学基金(61972180) 江苏省模式识别与计算智能工程实验室项目。
关键词 现场可编程门阵列(FPGA) 简洁非交互式零知识证明(zk-SNARK) 模乘 数论变换 硬件加速 field programmable gate array(FPGA) zero-knowledge succinct non-interactive arguments of knowl-edge(zk-SNARK) modular multiplication number theoretic transform hardware acceleration
  • 相关文献

参考文献7

二级参考文献21

  • 1Koblitz N. A Course in Number Theory and Cryptography[M]. Heidelberg: Springer-Verlag, 1994.
  • 2Miller V. Uses of Elliptic Curves in Cryptography[C]//Advances in Cryptology CRYPTO'85, Lecture Notes in Computer Science. Heidelberg: Springer, 1986: 417-426.
  • 3Namal S, Georgantas K, Gurtov A. Lightweight Authentication and Key Management on 802.11 with Elliptic Curve Cryptography[C]//IEEE Wireless Communications and Networking Conference. Piscataway: IEEE, 2013: 1830-1835.
  • 4Kodali R K, Budwal H S. High Performance Scalar Multiplication for ECC[C]//International Conference on Computer Communication and Informatics. Piscataway: IEEE, 2013: 1-4.
  • 5Lopez J, Dahab R. Improved Algorithms for Elliptic Curve Arithmetic in GF(2<sup>n</sup>)[C]//Lecture Notes In Computer Science: 1556. Heidelberg: Springer, 1988: 201-212.
  • 6Lu C Y, Jen S M, Laih C S. A General Framework of Side-Channel Atomicity for Elliptic Curve Scalar Multiplication[J]. IEEE Transactions on Computers, 2013, 62(3): 428-438.
  • 7Lopez J, Dahab R. Fast Multiplication on Elliptic Curves over GF (2m) without Pre Computation[C]//Workshop on Cryptographic Hardware and Embedded Systems. Heidelberg: Springer, 1999: 316-327.
  • 8Certicom Research. Standards for Efficient Cryptography[S]. SEC 2: Recommended Elliptic Curve Domain Parameters, 2000.
  • 9丁勇.一种用椭圆曲线密码构建的传感网络密钥管理方案[J].西安电子科技大学学报,2008,35(4):739-742. 被引量:5
  • 10吴金红,曹建,赵岩.基于FPGA的OFDM改进调制解调器设计[J].计算机测量与控制,2010,18(12):2815-2817. 被引量:2

共引文献71

同被引文献8

引证文献2

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部