NTT架构研究及其FPGA硬件优化实现被引量：1

NTT Architecture Research and Its FPGA Hardware Optimization Implementation

下载PDF

导出

摘要量子计算机的发展对现有公钥体系的影响是实质性的.在众多后量子密码流派中,格基密码方案因其安全性、高效性等优良特点而成为了主流技术路线.密码算法在各实现平台的运行效率是后量子算法评估进程中的重要指标.FPGA(Field Programmable Gate Array)具有并行性架构,是密码体系实际部署时的重要硬件平台.近年来,后量子算法的硬件优化实现研究吸引了越来越多的关注.针对格基密码算法中计算复杂度最高、耗时最长的操作——环上多项式乘法计算,本文对主流的加速技术——数论变换技术(Number-Theoretic Transform,NTT)进行了系统研究,根据参数不同将其分为标准NTT(Standard-NTT,SNTT)、删减NTT(Truncated-NTT,TNTT)、混合NTT(Hybrid-NTT,HNTT).为了提高适用性,设计了一种统一型、可常数时间执行、支持多参数的硬件电路,支持三种NTT架构,结合Karatsuba技巧,将对应系数相乘中乘法次数减少20%.针对NTT的核心操作——蝴蝶变换,设计了一种紧凑型、低时延的电路结构,可实现正向NTT、对应系数相乘、逆向NTT功能,同时支持CooleyTukey和Gentleman-Sande两种结构;提出一种新的系数访存模式,采用“交叉存储型”结构,使用双口BRAM将计算后的系数“对角交叉”变换位置后存储在原地址,利用双Bank存储模式满足蝴蝶变换单元的数据吞吐量;对Barrett约减算法进行修改,提出一种适用于多模值的模约减硬件单元,用加法和移位代替其中的乘法操作,减少乘法器资源的消耗.本方案使用Verilog HDL语言在Xilinx公司Artix-7系列XC7A200TFBG484-2型号FPGA芯片上进行了纯硬件优化实现.与当前研究相比,本设计在面积上减少12%-62.6%,时间上提升5.5%-49.8%.此外,我们还对高吞吐量的并行设计进行了优化实现,优化后的二并行、四并行架构时间上分别加速两倍、四倍,面积消耗仅为单结构的1.45倍、2.58倍. The development of quantum computers has a substantial impact on the existing public key systems.The security of traditional public key encryption schemes,such as RSA and ECC algorithm,is based on the most difficult mathematical problems such as large integer factoring and discrete logarithm problems.In 1994,Shor proposed that these mathematical problems can be solved in polynomial time on a quantum computer.IBM and Microsoft engineers expect largescale quantum computers to emerge in the next 15-20 years.It is urgent to research and design quantum-resistant public key cryptographic algorithms and protocols.Post-quantum cryptography(PQC)refers to cryptography that can resist quantum computer attacks.Among many postquantum cryptographic schemes,lattice-based cryptography has attracted extensive attention due to its security,efficiency and other advantages.In July 2022,the National Institute of Standards and Technology(NIST)announced four algorithms that will serve as the future post-quantum cryptography standard.Three of them are lattice-based cryptographic schemes.Among the winning algorithms in the Chinese national cryptographic algorithm competition,lattice-based algorithms account for the majority,such as Aigis,AKCN-MLWE,etc.FPGA(Field Programmable Gate Array)is an important hardware platform for practical deployment,with a parallel architecture.There has been widespread interest in researching and designing dedicated hardware unit for post-quantum cryptographic algorithms on FPGAs.The most complex and timeconsuming operation in lattice-based cryptography is polynomial multiplication over the rings.Normally,most schemes use number theoretic transform(NTT)technology to reduce computational complexity and accelerate polynomial multiplication.In this paper,we studied NTT technologies used in the mainstream lattice-based cryptographic schemes and typed them as Standard-NTT(SNTT),Truncated-NTT(TNTT)and Hybrid-NTT(HNTT)according to different parameters.In order to improve its applicability,a unified,constant-time executable,multi-parameter supported hardware architecture was designed,which supports all three NTT architectures.Combining the Karatsuba technique,the number of multiplication times during point-wise multiplication was reduced by 20%.A compact and low latency unit has been designed for the core operation of NTT,butterfly transform,which can perform forward NTT,point-wise multiplication,and reverse NTT functions.It supports both Cooley Tukey and Gentleman Sande structures.To avoid data conflicts,we designed a unique cross-storage memory access mode,using two banks to store polynomial coefficients.After each layer of NTT,polynomial coefficients are cross-swapped before being stored in BRAM.The coefficients in the NTT calculation process are carried out in the specific domain,where the intermediate results require modular reduction.There are two popular algorithms in the cryptographic engineering field,Montgomery reduction and Barrett reduction.Both of them require two multiplications which are expensive for hardware design.For that,we have designed a dedicated reduction unit that replaces multiplications with addition/shift operations and is suitable for multiple moduli.The proposed hardware design code is written in Verilog HDL,synthesized,placed,and routed using Xilinx Vivado tools on the Xilinx Artix-7 FPGA XC7A200TFBG484-2 chip.Compared to the state-of-art,this design reduces area consumption by 12%-62.6%,and speeds up by 5.5%-49.8%.We also designed a high-performance parallel framework to increase data throughput.Compared to the single structure,the optimized two-parallel and four-parallel architectures speed up twice and four times respectively,with only 1.45 and 2.58 times area consumption.

作者赵旭阳梁志闯胡跃耿合详赵运磊 ZHAO Xu-Yang;LIANG Zhi-Chuang;HU Yue;GENG He-Xiang;ZHAO Yun-Lei(School of Computer Science,Fudan University,Shanghai 200433;State Key Laboratory of Cryptology,Beijing 100000)

机构地区复旦大学计算机科学技术学院密码科学技术国家重点实验室

出处《计算机学报》 EI CAS CSCD 北大核心 2023年第12期2670-2686,共17页 Chinese Journal of Computers

基金国家重点研发计划(2022YFB2701600,2017YFB0802000) 国家自然科学基金(61972094)资助。

关键词后量子密码格基密码算法多项式乘法数论变换 FPGA 硬件实现 post-quantum cryptography lattice-based algorithm polynomial multiplication number theoretic transform FPGA hardware design

分类号 TP309 [自动化与计算机技术—计算机系统结构]

引文网络
相关文献

同被引文献2

1李延斌,朱嘉杰,唐明,张焕国.面向格密码的能耗分析攻击技术[J].计算机学报,2023,46(2):331-352. 被引量：1
2陈韬,李慧琴,李伟,南龙梅,杜怡然.面向格基后量子密码算法的可重构多项式乘法架构[J].电子与信息学报,2023,45(9):3380-3392. 被引量：1

引证文献1

1周清雷,韩贺茹,李斌,刘宇航.面向格密码的可配置基-4 NTT硬件优化与实现[J].通信学报,2024,45(10):163-179.

1陈杰.应用量子加密,保障信息安全[J].信息化建设,2023(9):63-64. 被引量：1
2姜利娟.数学模型在服装设计中的应用策略研究[J].化纤与纺织技术,2023,52(9):179-181. 被引量：2
3周全.在强国复兴伟业中践行青春力量的使命担当[J].时代青年（视点）,2023(10):40-41.
4白野,何德彪,罗敏,杨智超,彭聪.一种针对SM2数字签名算法的攻击方案[J].密码学报,2023,10(4):823-835.
5翟子萌,张洪,吴钰清,龚岚,徐建江,周行涛,周峰,洪佳旭.国产液脉动系统治疗睑板腺功能障碍的临床应用评价[J].中华实验眼科杂志,2023,41(5):442-449.
6时尚新闻[J].时尚芭莎,2022(18):54-54.
7孙奥,何银,李海波,张云蓓.后量子密码发展综述[J].信息安全与通信保密,2023(9):27-35. 被引量：2
8张子建,李洪照,吴跃国,韩通,陈奔,张岩.创新“减算法”研究,缩短二次再热机组汽轮机冲转时长[J].企业管理,2022(S02):54-55.
9崔劼.基于FPGA的卷积神经网络图像识别算法研究[J].无线互联科技,2023,20(18):159-161.
10盛志豪,刘展杰,熊晓明.基于二进制域的SM2数字签名算法的硬件实现[J].集成电路应用,2023,40(9):34-35.

计算机学报

2023年第12期

浏览历史

内容加载中请稍等...

NTT架构研究及其FPGA硬件优化实现被引量：1

同被引文献2

引证文献1

相关作者

相关机构

相关主题

浏览历史

NTT架构研究及其FPGA硬件优化实现 被引量：1

同被引文献2

引证文献1

相关作者

相关机构

相关主题

浏览历史

NTT架构研究及其FPGA硬件优化实现被引量：1