摘要
简洁非交互式零知识证明(zk-SNARK)由于具备证明验证过程简捷快速的优点,已在加密货币等众多领域得到广泛应用。但其证明生成过程所需计算仍复杂耗时,影响了进一步的应用拓展。针对zk-SNARK证明生成过程中的主要计算瓶颈——数论变换(NTT),提出了一种基于GPU的NTT计算加速方法PreNTT。首先,提出了基于预计算的NTT并行计算方法,利用预计算与旋转因子次幂算法优化,减少NTT并行计算开销,并结合动态预计算,进一步提高NTT计算效率。其次,通过“动态自适应计算核调度”,可以根据NTT输入规模自适应地分配GPU片上资源,提升了大规模NTT任务的计算能效。然后,通过核外整体数据混洗和核内局部数据混洗相结合的方式,避免了访存冲突。最后,使用CUDA多流技术执行数据传输和计算过程,对预计算时间进行了有效隐藏。实验结果表明:基于PreNTT实现的zk-SNARK系统,与目前业界最先进的系统Bellperson相比,NTT模块运行时间获得了全规模最低1.7倍的加速比,最高加速比为9倍。PreNTT能够有效提高NTT算法并行度,降低zk-SNARK运算时间开销。
Zero-knowledge succinct non-interactive proofs(zk-SNARK)have found extensive applications in various fields,including cryptocurrencies,due to their swift and efficient proof verification process.However,the computational intensity of the proof generation process poses a significant challenge,particularly at the number theoretic transform(NTT)stage.This paper proposed a GPU-based acceleration method for NTT computations,named PreNTT,to address this bottleneck.The method employed precomputation and optimization of twiddle factor powers to reduce the parallel computation overhead in NTT.It also introduced dynamic precomputation to enhance the efficiency of these computations.The algorithm made use of dynamic adaptive kernel scheduling,which allocated GPU resources on-chip according to the NTT input size,thereby boosting the computational efficiency for large-scale tasks.Additionally,the approach combined external global data shuffling with internal local data shuffling to avoid memory access conflicts.The use of CUDA multi-stream technology allowed for effective concealment of precomputation times during data transfer and computation processes.Experimental results indicate that the zk-SNARK system utilizing PreNTT achieves a speed-up ratio ranging from 1.7x to 9x in NTT module running times compared to Bellperson,the industry-leading system.PreNTT effectively increases the parallelism of the NTT algorithm and reduces the computational overhead in zk-SNARK operations.
作者
丁冬
李正权
柴志雷
Ding Dong;Li Zhengquan;Chai Zhilei(School of Internet of Things Engineering,Jiangnan University,Wuxi Jiangsu 214122,China;School of Artificial Intelligence&Computer Science,Jiangnan University,Wuxi Jiangsu 214122,China;State Key Laboratory of Networking&Switching Technology,Beijing University of Posts&Telecommunications,Beijing 100876,China)
出处
《计算机应用研究》
CSCD
北大核心
2024年第10期3059-3067,共9页
Application Research of Computers
基金
国家自然科学基金资助项目(61972180)
北京邮电大学网络与交换技术全国重点实验室开放课题资助项目(SKLNST-2023-1-13)。
关键词
简洁非交互式零知识证明
数论变换
GPU
并行计算
加速
zero-knowledge succinct non-interactive argument of knowledge
number theoretic transformation
GPU
parallel computing
acceleration