AKCN-MLWE算法AVX2高效实现被引量：1

High-Speed AVX2 Implementation of AKCN-MLWE

下载PDF

导出

摘要随着量子计算机的快速发展,经典密码系统面临巨大的威胁.Shor算法可以在量子计算机上多项式时间内分解大整数和求解离散对数,而这两类问题分别对应经典公钥密码系统中的RSA和椭圆曲线密码(ECC)所依赖的困难问题,因此可以抵御量子计算攻击的后量子密码近年来受到广泛的研究.格密码是后量子密码中最为高效且拓展性强的一类密码算法,在未来会逐步替代传统公钥密码算法(RSA、ECC等).256位高级向量扩展(AVX2)指令集是英特尔64位处理器中普遍支持的一类单指令多数据(SIMD)指令集,可用于并行计算.但是,由于格密码结构复杂,在支持AVX2指令集的英特尔64位处理器上难以对格密码方案进行高适配的深度优化.AKCN-MLWE算法是我国自主设计的基于模格上容错学习(MLWE)问题的格密码密钥封装(KEM)方案,是中国密码学会举办的公钥密码算法竞赛第二轮的获奖算法.本文基于256位高级向量扩展(AVX2)指令集设计了针对AKCN-MLWE算法的高效实现方案,包括以下几个关键优化点:针对多项式乘法,本文结合最优的数论变换(NTT)算法,将NTT的最后一层转换为线性多项式并使用Karatsuba算法进行加速计算,大幅提升计算效率的同时减少了预计算表的空间占用;针对取模运算,本文结合了Barrett约减算法和蒙哥马利约减算法的优势,同时采用延迟约减技术降低取模次数;本文针对所有多项式运算均实现了高度并行化,设计了针对多项式压缩与解压缩的并行算法,进一步提升了实现效率.本文设计的AKCN-MLWE算法AVX2高效实现方案在八核Intel Core i9-9880H处理器上仅需不到0.04 ms即可完成一次完整的KEM(包括密钥生成、密钥封装和密钥解封装),相比于参考实现提升8.84倍,其中密钥生成提升7.07倍,密钥封装提升7.90倍,密钥解封装算法提升11.78倍.本文提出的AKCN-MLWE算法AVX2实现方案在相近经典安全强度下性能优于美国国家标准技术研究所(NIST)后量子密码标准化进程第二轮中众多格密码方案(Kyber、NewHope和Saber等).同时,本文设计的部分优化方案可用于提升Kyber、NewHope等格密码方案的性能. As quantum computers developing rapidly,classical cryptosystems face a huge threat.Shor’s algorithm can factor large integers and solve discrete logarithms in polynomial time on quantum computers,and these two types of problems correspond to the difficult problems relied on by RSA and elliptic curve cryptography(ECC)in classical public-key cryptosystems,respectively,so post-quantum cryptography that can withstand quantum computing attacks has been widely studied in recent years.Post-quantum cryptography is a collection of public-key cryptosystems which run on classical computers and are considered to be resistant to quantum attacks.Lattice-based cryptography was proposed as the most efficient and scalable type of post-quantum cryptography to resist quantum computer attacks,which will replace classical public-key cryptosystems(RSA,ECC,etc.)in the near future.The 256-bit Advanced Vector Extensions(AVX2)instruction set is a class of Single Instruction Multiple Data(SIMD)instruction set commonly supported in Intel 64-bit processors for parallel computing.However,due to the complex structure of lattice-based cryptography,it is difficult to perform deep optimization for lattice-based cryptographic schemes with a high degree of adaptability on Intel 64-bit processors that support the AVX2 instruction set.The AKCN-MLWE scheme is a lattice-based key encapsulation mechanism(KEM)scheme proposed by a Chinese research team,which is based on the module learning with errors(MLWE)problem.The scheme is also one of the winning algorithms of the public-key cryptographic algorithm competition organized by the Chinese Association for Cryptologic Research.In this paper,we present a high-speed implementation of the AKCN-MLWE scheme using the 256-bit advanced vector extension(AVX2)instruction set provided by Intel 64-bit processors.Firstly,we pick the state-of-the-art number theory transformation(NTT)algorithm,which reduces one layer of both forward and inverse NTT and transforms one-dimensional vectors in the NTT domain into linear polynomials.Therefore,we tweak the Karatsuba algorithm to calculate the linear polynomial multiplications in the NTT domain,which significantly improves the computational efficiency and reduces the space occupation of the precomputed table.Furthermore,we combine the advantages of the Barrett reduction and the Montgomery reduction and further use the lazy reduction technique to reduce the number of times that modular operations are called.Finally,we achieve a high degree of parallelization for all polynomial operations and design a parallel algorithm for polynomial compression and decompression,which further improves the efficiency of our AVX2 implementation of AKCN-MLWE.Our high-speed AVX2 implementation of AKCN-MLWE can complete a full key encapsulation mechanism(KEM,including key generation,key encapsulation,and key decapsulation)in less than 0.04 milliseconds on an 8-core Intel Core i9-9880H processor,which is 8.84 times faster than the reference implementation,including 7.07 times faster key generation,7.90 times faster key encapsulation and 11.78 times faster key decapsulation.Our high-speed AVX2 implementation of AKCN-MLWE outperforms many lattice-based cryptographic schemes(Kyber,NewHope,and Saber,etc.)in the second round of the National Institute of Standards and Technology(NIST)post-quantum cryptography standardization process at similar classical security strength.Meanwhile,some of the optimization techniques in this article can be used to improve the performance of Kyber,NewHope,and other lattice-based cryptographic schemes.

作者杨昊刘哲黄军浩沈诗羽赵运磊 YANG Hao;LIU Zhe;HUANG Jun-Hao;SHEN Shi-Yu;ZHAO Yun-Lei(College of Computer Science and Technology,Nanjing University of Aeronautics and Astronautics,Nanjing 211106;School of Computer Science,Fudan University,Shanghai 200433)

机构地区南京航空航天大学计算机科学与技术学院密码科学技术国家重点实验室复旦大学计算机科学技术学院

出处《计算机学报》 EI CAS CSCD 北大核心 2021年第12期2560-2572,共13页 Chinese Journal of Computers

基金科技部重点研发计划(2020AAA0107703) 国家自然科学基金重点项目(62132008) 国家自然科学基金青年项目(61802180) 江苏省自然科学基金(BK20180421) “十三五”国家密码发展基金重点项目(MMJJ20180105) 河南省网络密码技术重点实验室开放课题(LNCT2019-A03)资助。

关键词后量子密码格密码高级向量扩展数论变换模约减多项式运算 post-quantum cryptography lattice-based cryptography advanced vector extensions number theory transform modular reduction polynomial operation

分类号 TP391 [自动化与计算机技术—计算机应用技术]

引文网络
相关文献

同被引文献2

1李子臣,谢婷,张卷美.基于RLWE问题的后量子口令认证密钥交换协议[J].电子学报,2021,49(2):260-267. 被引量：13
2王小云,刘明洁.格密码学研究[J].密码学报,2014,1(1):13-27. 被引量：45

引证文献1

1张晓涵,程池,余天润.对密钥不匹配攻击的进一步理论分析--以NTRU-HRSS为例[J].电子学报,2023,51(4):1081-1092.

1李晓涵,张昆龙.格论中的一些反例[J].中央民族大学学报（自然科学版）,2021,30(1):16-20.
2Sergei Skuzovatov.Nature and(in-)coherent metamorphic evolution of subducted continental crust in the Neoproterozoic accretionary collage of SW Mongolia[J].Geoscience Frontiers,2021,12(3):78-103.
3苏晓明,石豪达,包·阿迪亚,马智慧.基于中间估计器的非线性广义马尔可夫跳变系统的故障估计[J].辽宁师范大学学报（自然科学版）,2021,44(3):309-316. 被引量：1
4陆旭凡,胡海根,邢明杰.基于LLVM的RISC-V向量扩展栈帧布局优化[J].计算机系统应用,2021,30(11):27-32. 被引量：1
5袁轶,施沩,梁甜甜,季铮铮.新型电力营销智慧管控平台的设计与研究[J].电气自动化,2021,43(6):90-93. 被引量：2
6林志佳,椰汁脆皮(图).中国何时诞生“英伟达”?[J].计算机应用文摘,2021(19):75-75. 被引量：1
7伊士曼50%再生材料用于iPhone 13手机壳,性能媲美聚碳酸酯[J].橡塑技术与装备,2021,47(24):33-33.
8李浩然,王磊.基于申威1621处理器的BLAS一级函数优化[J].计算机系统应用,2021,30(7):246-252. 被引量：1
9伯婷.智慧城市下现代城市规划设计发展方向[J].智能城市,2021,7(19):105-106. 被引量：5
10Arm新技术助力汽车产业拥抱软件定义的未来[J].单片机与嵌入式系统应用,2021,21(11):9-9.

计算机学报

2021年第12期

浏览历史

内容加载中请稍等...

AKCN-MLWE算法AVX2高效实现被引量：1

同被引文献2

引证文献1

相关作者

相关机构

相关主题

浏览历史

AKCN-MLWE算法AVX2高效实现 被引量：1

同被引文献2

引证文献1

相关作者

相关机构

相关主题

浏览历史

AKCN-MLWE算法AVX2高效实现被引量：1