Paper 2023/1893
BOLT: Privacy-Preserving, Accurate and Efficient Inference for Transformers
Abstract
The advent of transformers has brought about significant advancements in traditional machine learning tasks. However, their pervasive deployment has raised concerns about the potential leakage of sensitive information during inference. Existing approaches using secure multiparty computation (MPC) face limitations when applied to transformers due to the extensive model size and resource-intensive matrix-matrix multiplications. In this paper, we present BOLT, a privacy-preserving inference framework for transformer models that supports efficient matrix multiplications and nonlinear computations. Combined with our novel machine learning optimizations, BOLT reduces the communication cost by 10.91x. Our evaluation on diverse datasets demonstrates that BOLT maintains comparable accuracy to floating-point models and achieves 4.8-9.5x faster inference across various network settings compared to the state-of-the-art system.
Metadata
- Category
- Cryptographic protocols
- Publication info
- Published elsewhere. Minor revision. IEEE S&P 2024
- Keywords
- secure multi-party computation, homomorphic encryption, secure machine learning inference, transformer
- Contact author(s)
- qipang @ cmu edu
- jinhao zhu @ berkeley edu
- moellering @ encrypto cs tu-darmstadt de
- wenting @ cmu edu
- schneider @ encrypto cs tu-darmstadt de
- History
- 2024-07-06: last of 5 revisions
- 2023-12-09: received
- Short URL
- https://ia.cr/2023/1893
- License
- CC BY
BibTeX
@misc{cryptoeprint:2023/1893,
      author = {Qi Pang and Jinhao Zhu and Helen Möllering and Wenting Zheng and Thomas Schneider},
      title = {{BOLT}: Privacy-Preserving, Accurate and Efficient Inference for Transformers},
      howpublished = {Cryptology {ePrint} Archive, Paper 2023/1893},
      year = {2023},
      url = {https://eprint.iacr.org/2023/1893}
}