Abstract
Privacy-preserving machine learning (PPML) solutions are gaining widespread popularity. Among these, many rely on homomorphic encryption (HE) that offers confidentiality of the model and the data, but at the cost of large latency and memory requirements. Pruning neural network (NN) parameters improves latency and memory in plaintext ML but has little impact if directly applied to HE-based PPML.
We introduce a framework called HE-PEx that comprises new pruning methods, on top of a packing technique called tile tensors, for reducing the latency and memory of PPML inference. HE-PEx uses permutations to prune additional ciphertexts, and expansion to recover inference loss. We demonstrate the effectiveness of our methods for pruning fully-connected and convolutional layers in NNs on PPML tasks, namely, image compression, denoising, and classification, with autoencoders, multilayer perceptrons (MLPs) and convolutional neural networks (CNNs).
We implement and deploy our networks atop a framework called HElayers, observing a 10–35% improvement in inference speed and a 17–35% decrease in memory requirement over the unpruned network, corresponding to 33–65% fewer ciphertexts, all within a 2.5% degradation in inference accuracy. Compared to the state-of-the-art pruning technique for PPML, our techniques generate networks with 70% fewer ciphertexts, on average, for the same degradation limit.
Notes
1. Gl/L1/Neu and Gl/L1/Chan were not evaluated, since PyTorch limits the scope of global pruning to unstructured methods only.
References
Aharoni, E., Drucker, N., Ezov, G., Shaul, H., Soceanu, O.: Complex encoded tile tensors: accelerating encrypted analytics. IEEE Secur. Priv. 20, 35–43 (2021)
Aharoni, E., et al.: HELayers: a tile tensors framework for large neural networks on encrypted data. PoPETs (2023)
Akavia, A., Vald, M.: On the privacy of protocols based on CPA-secure homomorphic encryption. Cryptology ePrint Archive (2021)
Albrecht, M., et al.: Homomorphic encryption security standard. Technical report, HomomorphicEncryption.org, November 2018
Baruch, M., et al.: Sensitive tuning of large scale CNNs for E2E secure prediction using homomorphic encryption. arXiv preprint arXiv:2304.14836 (2023)
Baruch, M., Drucker, N., Greenberg, L., Moshkowich, G.: A methodology for training homomorphic encryption friendly neural networks. In: Zhou, J., et al. (eds.) ACNS 2022. LNCS, vol. 13285, pp. 536–553. Springer, Cham (2022). https://doi.org/10.1007/978-3-031-16815-4_29
Blalock, D., Gonzalez Ortiz, J.J., Frankle, J., Guttag, J.: What is the state of neural network pruning? Proc. Mach. Learn. Syst. 2, 129–146 (2020)
Boemer, F., Costache, A., Cammarota, R., Wierzynski, C.: NGraph-HE2: a high-throughput framework for neural network inference on encrypted data. In: Proceedings of the 7th ACM Workshop on Encrypted Computing & Applied Homomorphic Cryptography, WAHC 2019. Association for Computing Machinery (2019)
Cai, Y., Zhang, Q., Ning, R., Xin, C., Wu, H.: Hunter: HE-friendly structured pruning for efficient privacy-preserving deep learning. In: Proceedings of the 2022 ACM on Asia Conference on Computer and Communications Security (CCS) (2022)
Cheon, J.H., Kim, A., Kim, M., Song, Y.: Homomorphic encryption for arithmetic of approximate numbers. In: Takagi, T., Peyrin, T. (eds.) ASIACRYPT 2017. LNCS, vol. 10624, pp. 409–437. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-70694-8_15
Chou, E., Beal, J., Levy, D., Yeung, S., Haque, A., Fei-Fei, L.: Faster CryptoNets: leveraging sparsity for real-world encrypted inference. arXiv preprint (2018)
EU General Data Protection Regulation: Regulation (EU) 2016/679 Directive 95/46/EC (General Data Protection Regulation). Official Journal of the European Union 119 (2016). http://data.europa.eu/eli/reg/2016/679/oj
Gong, Y., et al.: A privacy-preserving-oriented DNN pruning and mobile acceleration framework. In: Proceedings of the 2020 on Great Lakes Symposium on VLSI (2020)
Gottemukkula, V.: Polynomial activation functions (2019). https://openreview.net/pdf?id=rkxsgkHKvH
Gunraj, H., Wang, L., Wong, A.: COVIDNet-CT: a tailored deep convolutional neural network design for detection of COVID-19 cases from chest CT images. Front. Med. 7, 608525 (2020)
Han, S., Mao, H., Dally, W.J.: Deep compression: compressing deep neural networks with pruning, trained quantization and Huffman coding. In: 4th International Conference on Learning Representations, ICLR 2016 (2016)
Juvekar, C., Vaikuntanathan, V., Chandrakasan, A.: GAZELLE: a low latency framework for secure neural network inference. In: 27th USENIX Security Symposium, pp. 1651–1669. USENIX Association, Baltimore, August 2018
Krizhevsky, A., Sutskever, I., Hinton, G.: ImageNet classification with deep convolutional neural networks. Neural Inf. Process. Syst. 25 (2012)
Lehmkuhl, R., Mishra, P., Srinivasan, A., Popa, R.A.: Muse: secure inference resilient to malicious clients. In: USENIX Security 2021. USENIX Association (2021)
Lou, Q., Jiang, L.: HEMET: a homomorphic-encryption-friendly privacy-preserving mobile neural network architecture. In: Proceedings of the 38th International Conference on Machine Learning, vol. 139, pp. 7102–7110. PMLR (2021)
Luo, J.H., Wu, J., Lin, W.: ThiNet: a filter level pruning method for deep neural network compression. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 5058–5066 (2017)
Microsoft: SEAL (release 3.5). https://github.com/Microsoft/SEAL
Netzer, Y., Wang, T., Coates, A., Bissacco, A., Wu, B., Ng, A.Y.: Reading digits in natural images with unsupervised feature learning (2011)
Paillier, P.: Public-key cryptosystems based on composite degree residuosity classes. In: Stern, J. (ed.) EUROCRYPT 1999. LNCS, vol. 1592, pp. 223–238. Springer, Heidelberg (1999). https://doi.org/10.1007/3-540-48910-X_16
Riazi, M.S., Samragh, M., Chen, H., Laine, K., Lauter, K., Koushanfar, F.: XONN: XNOR-based oblivious deep neural network inference. In: 28th USENIX Security Symposium, pp. 1501–1518. USENIX Association, Santa Clara, August 2019
The HEBench Organization: HEBench (2022). https://hebench.github.io/
Wang, C., Zhang, G., Grosse, R.: Picking winning tickets before training by preserving gradient flow. arXiv preprint arXiv:2002.07376 (2020)
Wang, J., Jin, C., Meftah, S., Aung, K.M.M.: Popcorn: Paillier meets compression for efficient oblivious neural network inference (2021)
Yang, Y., Kuppannagari, S.R., Kannan, R., Prasanna, V.K.: FPGA accelerator for homomorphic encrypted sparse convolutional neural network inference. In: 2022 IEEE 30th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM), pp. 1–9 (2022)
Yuan, X., Zhang, L.: Membership inference attacks and defenses in neural network pruning. In: 31st USENIX Security Symposium. USENIX Association (2022)
A Appendix
1.1 A.1 Additional Evaluation
Selection of P4E prune\(^\textbf{pack}\) Threshold. We justify our choice of threshold for the prune\(^\textrm{pack}\) step of P4E (the percentage of zeros in a tile above which the tile is pruned) by running a sweep and reporting the results in Fig. 12, for our AE-Compressor with CIFAR-10. The behavior is similar for other networks and datasets. We observe that large threshold values do not introduce enough new zero tiles to yield a meaningful improvement over P3E. At the other extreme, overly small threshold values remove weights too aggressively, degrading test loss. We select 93.8% here, but posit that our results can be improved further by tuning this value as a hyperparameter.
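As a concrete illustration of this thresholding rule, the sketch below (plain NumPy, not the HElayers implementation; function and parameter names are our own) zeroes out any weight tile whose fraction of zero entries meets the threshold, so that the corresponding ciphertext can later be skipped:

```python
import numpy as np

def prune_zero_tiles(weight, tile_shape, threshold=0.938):
    """Sketch of the prune-pack step: fully zero any tile whose fraction of
    zero entries is at least `threshold`, so the whole tile (and hence its
    ciphertext) can be skipped at inference time. Returns the pruned matrix
    and the number of tiles that became all-zero."""
    rows, cols = weight.shape
    tr, tc = tile_shape
    pruned = weight.copy()
    zero_tiles = 0
    for r in range(0, rows, tr):
        for c in range(0, cols, tc):
            tile = pruned[r:r + tr, c:c + tc]  # view into `pruned`
            if (tile == 0).mean() >= threshold:
                tile[:] = 0  # in-place write through the view
                zero_tiles += 1
    return pruned, zero_tiles
```

Sweeping `threshold` over a validation set, as in Fig. 12, trades off the number of newly zeroed tiles against the test-loss degradation caused by removing the few surviving weights inside nearly-zero tiles.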
Comparison of Permutation Algorithms. We compare different permutation algorithms for the permute step in HE-PEx.
Figure 13 shows the results for AlexNet with Lc/L1/Wei pruning of 97.5% of the weights, the same scenario discussed in Sect. 6.3. As expected, “Random Reordering” for the row-column co-permutation provides little to no benefit, given the sheer number of possible permutations of the entire network. Interestingly, k-Means alone improves tile sparsity by less than 10%. Moreover, the “Grouped Hamming” variant (Sect. 4.2) does not yield a significant benefit for the smaller tile shapes; however, as the \(64\times 64\) case shows, the grouped Hamming distance is critical for clustering zeros together at large tile shapes. Finally, balancing the clusters in k-Means has the largest impact, as it brings the first dimension of the tile shape into the optimization algorithm. In summary, our permutation algorithm variant improves tile sparsity by \(\sim \)4–1,000\(\%\) for this network.
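To make the goal of the permute step concrete, the toy sketch below (our own simplification, not the HE-PEx algorithm) reorders the rows of a pruning mask so that rows with similar zero patterns become adjacent, increasing the number of all-zero tiles. The actual method co-permutes rows and columns using balanced k-means with a grouped Hamming distance; here we only permute rows, using a lexicographic sort of the zero patterns as a cheap stand-in:

```python
import numpy as np

def permute_rows_by_pattern(mask):
    """Return a row ordering that groups rows with identical or similar
    zero patterns, so all-zero rows (and near-duplicates) land in the
    same tile-row band. Stand-in for balanced k-means clustering."""
    # lexsort uses the last key as primary, so reverse the columns to
    # sort rows lexicographically by their pattern, zeros first
    return np.lexsort(mask.T[::-1])

def count_zero_tiles(mask, tile_shape):
    """Count tiles of `tile_shape` that contain no nonzero entry."""
    tr, tc = tile_shape
    n = 0
    for r in range(0, mask.shape[0], tr):
        for c in range(0, mask.shape[1], tc):
            if not mask[r:r + tr, c:c + tc].any():
                n += 1
    return n
```

For example, a mask whose zero rows are interleaved with dense rows has no zero tiles, but after applying the returned ordering the zero rows are contiguous and entire tiles vanish. In the full scheme, a row permutation in one layer must of course be compensated by a matching permutation of the adjacent layer's weights so the network's function is unchanged.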
1.2 A.2 Experimental Setup
Our experiments involving the methods in Sect. 4.1 were run on a cluster of systems equipped with Tesla A100 GPUs and Xeon Gold 6258R CPUs, using PyTorch 1.11.0 accelerated with CUDA 11.6. For the end-to-end evaluation with HElayers [2], we used a 44-core Xeon E5-2699 CPU @2.20 GHz machine with 750 GB of memory. We observed that our pruned networks did not fully utilize all cores and that performance saturated at around 8 cores; for a fair comparison, we therefore limited the system to 8 cores in all of our experiments. We use a modified HElayers [2] with the CKKS implementation in SEAL [23], targeting 128-bit security, and report the average of 10 runs.
Network Configurations and Model Hyperparameters. Table 2 shows details of the activation functions used in each network, the network architecture, and training hyperparameters for each of the networks and datasets in Sect. 5.
Enhancing HElayers. We integrate HE-PEx with HElayers [2] and show a flowchart for P4E in Fig. 14; this can be extended as needed to our other strategies. The flowchart illustrates a grid search, with {thres_all} and {tile_shapes} being pre-selected sets of values. The output of the framework is a deployment-ready model that meets an objective function based on accuracy, latency, throughput, and memory. A local search strategy may instead be used to navigate the state space with fewer training and permutation rounds, which are the most time-consuming parts of this flow. The zero-tile-skipping logic is programmed into HElayers by associating a zero_flag bit with each ciphertext, which is set during the HE Enc procedure (Sect. 3.1).
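The zero_flag mechanism can be sketched as follows (a minimal illustration in plain Python; the types and function names are our own, not the HElayers API). The flag is attached at encryption time, while the plaintext is still visible, and homomorphic operations consult it to skip work on known-zero tiles:

```python
from dataclasses import dataclass

@dataclass
class Tile:
    """Stand-in for a ciphertext holding one packed tile. `zero_flag`
    records that the underlying plaintext tile was all zeros."""
    data: list
    zero_flag: bool = False

def encrypt_tile(plain):
    # The flag is set during encryption (HE Enc), the only point at
    # which the plaintext contents are known.
    return Tile(data=list(plain), zero_flag=all(v == 0 for v in plain))

def he_mul_accumulate(acc, a, b):
    """Elementwise multiply-accumulate of two tiles into `acc`, skipping
    the expensive homomorphic operation entirely when either operand is
    a known-zero tile (its product contributes nothing to the sum)."""
    if a.zero_flag or b.zero_flag:
        return acc  # no HE multiplication issued
    return [x + u * v for x, u, v in zip(acc, a.data, b.data)]
```

Since pruned tiles never change during inference, the flag leaks nothing beyond the (public) pruned model structure, and every skipped tile saves one ciphertext multiplication and addition.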
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Aharoni, E. et al. (2024). Efficient Pruning for Machine Learning Under Homomorphic Encryption. In: Tsudik, G., Conti, M., Liang, K., Smaragdakis, G. (eds) Computer Security – ESORICS 2023. ESORICS 2023. Lecture Notes in Computer Science, vol 14347. Springer, Cham. https://doi.org/10.1007/978-3-031-51482-1_11
Print ISBN: 978-3-031-51481-4
Online ISBN: 978-3-031-51482-1