Paper 2023/1893

BOLT: Privacy-Preserving, Accurate and Efficient Inference for Transformers

Qi Pang, Carnegie Mellon University
Jinhao Zhu, University of California, Berkeley
Helen Möllering, Technical University of Darmstadt
Wenting Zheng, Carnegie Mellon University
Thomas Schneider, Technical University of Darmstadt
Abstract

The advent of transformers has brought about significant advancements in traditional machine learning tasks. However, their pervasive deployment has raised concerns about the potential leakage of sensitive information during inference. Existing approaches using secure multiparty computation (MPC) face limitations when applied to transformers due to the extensive model size and resource-intensive matrix-matrix multiplications. In this paper, we present BOLT, a privacy-preserving inference framework for transformer models that supports efficient matrix multiplications and nonlinear computations. Combined with our novel machine learning optimizations, BOLT reduces the communication cost by 10.91x. Our evaluation on diverse datasets demonstrates that BOLT maintains comparable accuracy to floating-point models and achieves 4.8-9.5x faster inference across various network settings compared to the state-of-the-art system.
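To make the communication bottleneck mentioned above concrete, below is a minimal, hypothetical sketch (Python with NumPy) of two-party secret-shared matrix multiplication using a precomputed Beaver matrix triple. This is a generic textbook construction, not BOLT's actual protocol; the toy modulus, matrix dimensions, and variable names are illustrative assumptions. It shows that every secure matrix product requires the parties to exchange two masked matrices (E and F), so communication grows with the matrix sizes, which is the kind of cost that BOLT's matrix-multiplication and ML optimizations target.

import numpy as np

# Toy modulus for additive secret sharing (illustrative only).
P = 2**16
rng = np.random.default_rng(0)

def share(m):
    # Split a matrix into two additive shares modulo P.
    r = rng.integers(0, P, size=m.shape, dtype=np.int64)
    return r, (m - r) % P

def open_shares(s0, s1):
    # Reconstruct a shared matrix; in a real protocol this step is network traffic.
    return (s0 + s1) % P

# Secret inputs: e.g., client activations X and server weights Y (hypothetical sizes).
X = rng.integers(0, P, size=(4, 8), dtype=np.int64)
Y = rng.integers(0, P, size=(8, 5), dtype=np.int64)
X0, X1 = share(X)
Y0, Y1 = share(Y)

# Preprocessed Beaver matrix triple: random A, B and C = A @ B, all secret-shared.
A = rng.integers(0, P, size=X.shape, dtype=np.int64)
B = rng.integers(0, P, size=Y.shape, dtype=np.int64)
C = (A @ B) % P
A0, A1 = share(A)
B0, B1 = share(B)
C0, C1 = share(C)

# Online phase: each party masks its shares; the masked matrices E and F are opened,
# i.e., exchanged between the parties (communication proportional to the matrix sizes).
E = open_shares((X0 - A0) % P, (X1 - A1) % P)   # E = X - A
F = open_shares((Y0 - B0) % P, (Y1 - B1) % P)   # F = Y - B

# Local computation of shares of Z = X @ Y; only one party adds the public E @ F term.
Z0 = (C0 + E @ B0 + A0 @ F + E @ F) % P
Z1 = (C1 + E @ B1 + A1 @ F) % P

assert np.array_equal(open_shares(Z0, Z1), (X @ Y) % P)

In a transformer, such products occur in every attention and feed-forward layer, which is why reducing their cost (and that of the nonlinear functions between them) dominates the overall inference time in MPC-based approaches.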

Metadata
Available format(s)
PDF
Category
Cryptographic protocols
Publication info
Published elsewhere. Minor revision. IEEE S&P 2024
Keywords
secure multi-party computation, homomorphic encryption, secure machine learning inference, transformer
Contact author(s)
qipang @ cmu edu
jinhao zhu @ berkeley edu
moellering @ encrypto cs tu-darmstadt de
wenting @ cmu edu
schneider @ encrypto cs tu-darmstadt de
History
2024-07-06: last of 5 revisions
2023-12-09: received
Short URL
https://ia.cr/2023/1893
License
Creative Commons Attribution
CC BY

BibTeX

@misc{cryptoeprint:2023/1893,
      author = {Qi Pang and Jinhao Zhu and Helen Möllering and Wenting Zheng and Thomas Schneider},
      title = {{BOLT}: Privacy-Preserving, Accurate and Efficient Inference for Transformers},
      howpublished = {Cryptology {ePrint} Archive, Paper 2023/1893},
      year = {2023},
      url = {https://eprint.iacr.org/2023/1893}
}