A 334µW 0.158mm2 ASIC for Post-Quantum Key-Encapsulation Mechanism Saber with Low-latency Striding Toom-Cook Multiplication Extended Version

Archisman Ghosh; Jose Maria Bermudo Mera; Angshuman Karmakar; Debayan Das; Santosh Ghosh; Ingrid Verbauwhede; Shreyas Sen

Paper 2023/678


Archisman Ghosh, Purdue University West Lafayette

Jose Maria Bermudo Mera, PQShield Ltd

Angshuman Karmakar, Indian Institute of Technology Kanpur

Debayan Das, Indian Institute of Science Bangalore

Santosh Ghosh, Intel (United States)

Ingrid Verbauwhede, KU Leuven

Shreyas Sen, Purdue University West Lafayette

Abstract

The hard mathematical problems that underpin the security of our current public-key cryptography (RSA, ECC) will be broken if and when a quantum computer appears, rendering these schemes ineffective in the quantum era. Lattice-based cryptography is a novel approach to public-key cryptography which, according to the mathematical investigation so far, resists attacks from quantum computers. By choosing a module learning with errors (MLWE) algorithm as the next standard, the National Institute of Standards and Technology (NIST) follows this approach. Polynomial multiplication is the central bottleneck in lattice-based cryptography. Because public-key cryptography is mostly used to establish shared secret keys, the focus is on compact area and on the power and energy budget, and to a lesser extent on throughput or latency. While most other work focuses on optimizing number theoretic transform (NTT) based multiplication, in this paper we highly optimize a Toom-Cook based multiplier. We demonstrate that a memory-efficient striding Toom-Cook with lazy interpolation results in a highly compact, low-power implementation that additionally enables a very regular memory access scheme. To demonstrate its efficiency, we integrate this multiplier into a post-quantum accelerator for Saber, one of the four NIST finalists. Algorithmic innovation to reduce active memory, timely clock gating, and a shift-add multiplier help achieve 38\% less power than the state-of-the-art PQC core, 4$\times$ less memory, a 36.8\% reduction in multiplier energy, and a 118$\times$ reduction in active power with respect to the state-of-the-art Saber accelerator (not silicon verified). The accelerator occupies $0.158\,\mathrm{mm}^2$ of active area, the lowest reported to date despite process disadvantages with respect to the state-of-the-art designs.
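To make the multiplication bottleneck concrete, the following is a minimal Python sketch of a textbook 4-way Toom-Cook product in Saber's ring $\mathbb{Z}_q[x]/(x^{256}+1)$ with $q = 2^{13}$. It is not the striding variant with lazy interpolation described in the abstract, and it does not model the hardware: it splits the operands into contiguous blocks, interpolates with exact rational arithmetic, and uses the evaluation points $-3,\dots,3$, all of which are assumptions made for illustration. The function names are hypothetical.

```python
from fractions import Fraction
import random

N, Q = 256, 1 << 13  # Saber ring Z_q[x]/(x^N + 1) with q = 2^13

def schoolbook(a, b):
    """Plain O(n^2) product of two coefficient lists over the integers."""
    c = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            c[i + j] += ai * bj
    return c

def reduce_ring(c):
    """Fold a product of length <= 2N-1 into Z_q[x]/(x^N + 1), using x^N = -1."""
    r = [0] * N
    for i, ci in enumerate(c):
        if i < N:
            r[i] += ci
        else:
            r[i - N] -= ci
    return [x % Q for x in r]

def vandermonde_inverse(points):
    """Invert V[r][c] = points[r]**c by Gauss-Jordan elimination over the rationals."""
    n = len(points)
    M = [[Fraction(t) ** c for c in range(n)] +
         [Fraction(int(r == i)) for i in range(n)]
         for r, t in enumerate(points)]
    for col in range(n):
        piv = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[piv] = M[piv], M[col]
        pv = M[col][col]
        M[col] = [x / pv for x in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                f = M[r][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [row[n:] for row in M]

def toom_cook(a, b, k=4):
    """k-way Toom-Cook over the integers; assumes len(a) == len(b) and k divides it."""
    m = len(a) // k                      # block length
    points = list(range(-(k - 1), k))    # 2k-1 distinct points (illustrative choice)

    def evaluate(p, t):
        # Treat the k blocks of p as coefficients of a degree-(k-1) polynomial in y.
        return [sum(p[i * m + j] * t ** i for i in range(k)) for j in range(m)]

    # 2k-1 small products replace one large product.
    prods = [schoolbook(evaluate(a, t), evaluate(b, t)) for t in points]
    vinv = vandermonde_inverse(points)
    c = [0] * (2 * len(a) - 1)
    for i, row in enumerate(vinv):       # i-th coefficient (in y) of the product
        for j in range(2 * m - 1):
            w = sum(coef * prod[j] for coef, prod in zip(row, prods))
            c[i * m + j] += int(w)       # exact: w is an integer-valued Fraction
    return c

# Usage: compare against schoolbook multiplication on random inputs.
random.seed(0)
pub = [random.randrange(Q) for _ in range(N)]       # full-width coefficients
sec = [random.randrange(-4, 5) for _ in range(N)]   # small signed coefficients
assert reduce_ring(toom_cook(pub, sec)) == reduce_ring(schoolbook(pub, sec))
```

With $k = 4$, the sketch replaces one $256 \times 256$ coefficient product with seven $64 \times 64$ products, which is the source of the savings that Toom-Cook multipliers build on.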

Metadata

Available format(s): PDF
Category: Public-key cryptography
Publication info: Published elsewhere. IEEE Journal of Solid-State Circuits
DOI: 10.1109/JSSC.2023.3253425
Keywords: Post-quantum cryptography, striding Toom-Cook, lazy interpolation, memory-efficient, energy-efficient architecture
Contact author(s): ghosh69 @ purdue edu
das60 @ purdue edu
santosh ghosh @ intel com
shreyas @ purdue edu
History: 2023-05-17: last of 2 revisions; 2023-05-12: received
Short URL: https://ia.cr/2023/678
License: CC BY-NC-SA

BibTeX

@misc{cryptoeprint:2023/678,
      author = {Archisman Ghosh and Jose Maria Bermudo Mera and Angshuman Karmakar and Debayan Das and Santosh Ghosh and Ingrid Verbauwhede and Shreyas Sen},
      title = {A {334µW} 0.158mm2 {ASIC} for Post-Quantum Key-Encapsulation Mechanism Saber with Low-latency Striding Toom-Cook Multiplication Extended Version},
      howpublished = {Cryptology {ePrint} Archive, Paper 2023/678},
      year = {2023},
      doi = {10.1109/JSSC.2023.3253425},
      url = {https://eprint.iacr.org/2023/678}
}