Abstract
Privacy-preserving machine learning (PPML) solutions are gaining widespread popularity. Among these, many rely on homomorphic encryption (HE) that offers confidentiality of the model and the data, but at the cost of large latency and memory requirements. Pruning neural network (NN) parameters improves latency and memory in plaintext ML but has little impact if directly applied to HE-based PPML.
We introduce a framework called HE-PEx that comprises new pruning methods, on top of a packing technique called tile tensors, for reducing the latency and memory of PPML inference. HE-PEx uses permutations to prune additional ciphertexts, and expansion to recover inference loss. We demonstrate the effectiveness of our methods for pruning fully-connected and convolutional layers in NNs on PPML tasks, namely, image compression, denoising, and classification, with autoencoders, multilayer perceptrons (MLPs) and convolutional neural networks (CNNs).
We implement and deploy our networks atop a framework called HElayers, observing a 10–35% improvement in inference speed and a 17–35% decrease in memory requirement over the unpruned network, corresponding to 33–65% fewer ciphertexts, all within a 2.5% degradation in inference accuracy. Compared to the state-of-the-art pruning technique for PPML, our techniques generate networks with 70% fewer ciphertexts, on average, for the same degradation limit.
Notes
1. Gl/L1/Neu and Gl/L1/Chan were not evaluated, since PyTorch limits the scope of global pruning to unstructured methods only.
References
Aharoni, E., Drucker, N., Ezov, G., Shaul, H., Soceanu, O.: Complex encoded tile tensors: accelerating encrypted analytics. IEEE Secur. Priv. 20, 35–43 (2021)
Aharoni, E., et al.: HELayers: a tile tensors framework for large neural networks on encrypted data. PoPETs (2023)
Akavia, A., Vald, M.: On the privacy of protocols based on CPA-secure homomorphic encryption. Cryptology ePrint Archive (2021)
Albrecht, M., et al.: Homomorphic encryption security standard. Technical report, HomomorphicEncryption.org, November 2018
Baruch, M., et al.: Sensitive tuning of large scale CNNs for E2E secure prediction using homomorphic encryption. arXiv preprint arXiv:2304.14836 (2023)
Baruch, M., Drucker, N., Greenberg, L., Moshkowich, G.: A methodology for training homomorphic encryption friendly neural networks. In: Zhou, J., et al. (eds.) ACNS 2022. LNCS, vol. 13285, pp. 536–553. Springer, Cham (2022). https://doi.org/10.1007/978-3-031-16815-4_29
Blalock, D., Gonzalez Ortiz, J.J., Frankle, J., Guttag, J.: What is the state of neural network pruning? Proc. Mach. Learn. Syst. 2, 129–146 (2020)
Boemer, F., Costache, A., Cammarota, R., Wierzynski, C.: NGraph-HE2: a high-throughput framework for neural network inference on encrypted data. In: Proceedings of the 7th ACM Workshop on Encrypted Computing & Applied Homomorphic Cryptography, WAHC 2019. Association for Computing Machinery (2019)
Cai, Y., Zhang, Q., Ning, R., Xin, C., Wu, H.: Hunter: HE-friendly structured pruning for efficient privacy-preserving deep learning. In: Proceedings of the 2022 ACM on Asia Conference on Computer and Communications Security (CCS) (2022)
Cheon, J.H., Kim, A., Kim, M., Song, Y.: Homomorphic encryption for arithmetic of approximate numbers. In: Takagi, T., Peyrin, T. (eds.) ASIACRYPT 2017. LNCS, vol. 10624, pp. 409–437. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-70694-8_15
Chou, E., Beal, J., Levy, D., Yeung, S., Haque, A., Fei-Fei, L.: Faster CryptoNets: leveraging sparsity for real-world encrypted inference. arXiv preprint (2018)
EU General Data Protection Regulation: Regulation (EU) 2016/679 Directive 95/46/EC (General Data Protection Regulation). Official Journal of the European Union 119 (2016). http://data.europa.eu/eli/reg/2016/679/oj
Gong, Y., et al.: A privacy-preserving-oriented DNN pruning and mobile acceleration framework. In: Proceedings of the 2020 on Great Lakes Symposium on VLSI (2020)
Gottemukkula, V.: Polynomial activation functions (2019). https://openreview.net/pdf?id=rkxsgkHKvH
Gunraj, H., Wang, L., Wong, A.: COVIDNet-CT: a tailored deep convolutional neural network design for detection of COVID-19 cases from chest CT images. Front. Med. 7, 608525 (2020)
Han, S., Mao, H., Dally, W.J.: Deep compression: compressing deep neural networks with pruning, trained quantization and Huffman coding. In: 4th International Conference on Learning Representations, ICLR 2016 (2016)
Juvekar, C., Vaikuntanathan, V., Chandrakasan, A.: GAZELLE: a low latency framework for secure neural network inference. In: 27th USENIX Security Symposium, pp. 1651–1669. USENIX Association, Baltimore, August 2018
Krizhevsky, A., Sutskever, I., Hinton, G.: ImageNet classification with deep convolutional neural networks. Neural Inf. Process. Syst. 25 (2012)
Lehmkuhl, R., Mishra, P., Srinivasan, A., Popa, R.A.: Muse: secure inference resilient to malicious clients. In: USENIX Security 2021. USENIX Association (2021)
Lou, Q., Jiang, L.: HEMET: a homomorphic-encryption-friendly privacy-preserving mobile neural network architecture. In: Proceedings of the 38th International Conference on Machine Learning, vol. 139, pp. 7102–7110. PMLR (2021)
Luo, J.H., Wu, J., Lin, W.: ThiNet: a filter level pruning method for deep neural network compression. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 5058–5066 (2017)
Microsoft: SEAL (release 3.5). https://github.com/Microsoft/SEAL
Netzer, Y., Wang, T., Coates, A., Bissacco, A., Wu, B., Ng, A.Y.: Reading digits in natural images with unsupervised feature learning (2011)
Paillier, P.: Public-key cryptosystems based on composite degree residuosity classes. In: Stern, J. (ed.) EUROCRYPT 1999. LNCS, vol. 1592, pp. 223–238. Springer, Heidelberg (1999). https://doi.org/10.1007/3-540-48910-X_16
Riazi, M.S., Samragh, M., Chen, H., Laine, K., Lauter, K., Koushanfar, F.: XONN: XNOR-based oblivious deep neural network inference. In: 28th USENIX Security Symposium, pp. 1501–1518. USENIX Association, Santa Clara, August 2019
The HEBench Organization: HEBench (2022). https://hebench.github.io/
Wang, C., Zhang, G., Grosse, R.: Picking winning tickets before training by preserving gradient flow. arXiv preprint arXiv:2002.07376 (2020)
Wang, J., Jin, C., Meftah, S., Aung, K.M.M.: Popcorn: Paillier meets compression for efficient oblivious neural network inference (2021)
Yang, Y., Kuppannagari, S.R., Kannan, R., Prasanna, V.K.: FPGA accelerator for homomorphic encrypted sparse convolutional neural network inference. In: 2022 IEEE 30th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM), pp. 1–9 (2022)
Yuan, X., Zhang, L.: Membership inference attacks and defenses in neural network pruning. In: 31st USENIX Security Symposium. USENIX Association (2022)
A Appendix
1.1 A.1 Additional Evaluation
Selection of P4E prune\(^\textbf{pack}\) Threshold. We justify our choice of threshold for the prune\(^\textrm{pack}\) step of P4E (the percentage of zeros in a tile above which the tile is pruned) by running a sweep and reporting the results in Fig. 12, for our AE-Compressor with CIFAR-10. The behavior is similar for other networks and datasets. We observe that large threshold values do not introduce enough new zero tiles to yield a meaningful improvement over P3E. At the other extreme, overly small threshold values remove weights too aggressively, degrading test loss. We select 93.8% here, but posit that our results can be improved further by tuning this value as a hyperparameter.
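As a concrete illustration of this thresholding rule, the sketch below (plain NumPy, not the HElayers implementation; function and parameter names are our own) zeroes out any weight tile whose fraction of zero entries meets the threshold, so that the corresponding ciphertext can later be skipped:

```python
import numpy as np

def prune_zero_tiles(weight, tile_shape, threshold=0.938):
    """Sketch of the prune-pack step: fully zero any tile whose fraction of
    zero entries is at least `threshold`, so the whole tile (and hence its
    ciphertext) can be skipped at inference time. Returns the pruned matrix
    and the number of tiles that became all-zero."""
    rows, cols = weight.shape
    tr, tc = tile_shape
    pruned = weight.copy()
    zero_tiles = 0
    for r in range(0, rows, tr):
        for c in range(0, cols, tc):
            tile = pruned[r:r + tr, c:c + tc]  # view into `pruned`
            if (tile == 0).mean() >= threshold:
                tile[:] = 0  # in-place write through the view
                zero_tiles += 1
    return pruned, zero_tiles
```

Sweeping `threshold` over a validation set, as in Fig. 12, trades off the number of newly zeroed tiles against the test-loss degradation caused by removing the few surviving weights inside nearly-zero tiles.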
Comparison of Permutation Algorithms. We compare different permutation algorithms for the permute step in HE-PEx.
Figure 13 shows the results for AlexNet with Lc/L1/Wei pruning of 97.5% of the weights, the same scenario discussed in Sect. 6.3. As expected, “Random Reordering” for the row-column co-permutation provides little to no benefit, given the sheer number of possible permutations of the entire network. Interestingly, k-Means alone improves tile sparsity by less than 10%. Moreover, the “Grouped Hamming” variant (Sect. 4.2) does not yield a significant benefit for the smaller tile shapes; however, as the \(64\times 64\) case shows, the grouped Hamming distance is critical for clustering zeros together at large tile shapes. Finally, balancing the clusters in k-Means has the largest impact, as it brings the first dimension of the tile shape into the optimization algorithm. In summary, our permutation algorithm variant improves tile sparsity by \(\sim \)4–1,000\(\%\) for this network.
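To make the goal of the permute step concrete, the toy sketch below (our own simplification, not the HE-PEx algorithm) reorders the rows of a pruning mask so that rows with similar zero patterns become adjacent, increasing the number of all-zero tiles. The actual method co-permutes rows and columns using balanced k-means with a grouped Hamming distance; here we only permute rows, using a lexicographic sort of the zero patterns as a cheap stand-in:

```python
import numpy as np

def permute_rows_by_pattern(mask):
    """Return a row ordering that groups rows with identical or similar
    zero patterns, so all-zero rows (and near-duplicates) land in the
    same tile-row band. Stand-in for balanced k-means clustering."""
    # lexsort uses the last key as primary, so reverse the columns to
    # sort rows lexicographically by their pattern, zeros first
    return np.lexsort(mask.T[::-1])

def count_zero_tiles(mask, tile_shape):
    """Count tiles of `tile_shape` that contain no nonzero entry."""
    tr, tc = tile_shape
    n = 0
    for r in range(0, mask.shape[0], tr):
        for c in range(0, mask.shape[1], tc):
            if not mask[r:r + tr, c:c + tc].any():
                n += 1
    return n
```

For example, a mask whose zero rows are interleaved with dense rows has no zero tiles, but after applying the returned ordering the zero rows are contiguous and entire tiles vanish. In the full scheme, a row permutation in one layer must of course be compensated by a matching permutation of the adjacent layer's weights so the network's function is unchanged.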
1.2 A.2 Experimental Setup
Our experiments involving the methods in Sect. 4.1 were run on a cluster of systems equipped with Tesla A100 GPUs and Xeon Gold 6258R CPUs, using PyTorch 1.11.0 accelerated with CUDA 11.6. For the end-to-end evaluation with HElayers [2], we used a 44-core Xeon E5-2699 CPU @2.20 GHz machine with 750 GB of memory. We observed that our pruned networks did not fully utilize all cores and that performance saturated at around 8 cores; for a fair comparison, we therefore limited the system to 8 cores in all of our experiments. We use a modified HElayers [2] with the CKKS implementation in SEAL [23], targeting 128-bit security, and report the average of 10 runs.
Network Configurations and Model Hyperparameters. Table 2 shows details of the activation functions used in each network, the network architecture, and training hyperparameters for each of the networks and datasets in Sect. 5.
Enhancing HElayers. We integrate HE-PEx with HElayers [2] and show a flowchart for P4E in Fig. 14; this can be extended as needed to our other strategies. The flowchart illustrates a grid search, with {thres_all} and {tile_shapes} being pre-selected sets of values. The output of the framework is a deployment-ready model that meets an objective function based on accuracy, latency, throughput, and memory. A local search strategy may instead be used to navigate the state space with fewer training and permutation rounds, which are the most time-consuming parts of this flow. The zero-tile-skipping logic is programmed into HElayers by associating a zero_flag bit with each ciphertext, which is set during the HE Enc procedure (Sect. 3.1).
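The zero_flag mechanism can be sketched as follows (a minimal illustration in plain Python; the types and function names are our own, not the HElayers API). The flag is attached at encryption time, while the plaintext is still visible, and homomorphic operations consult it to skip work on known-zero tiles:

```python
from dataclasses import dataclass

@dataclass
class Tile:
    """Stand-in for a ciphertext holding one packed tile. `zero_flag`
    records that the underlying plaintext tile was all zeros."""
    data: list
    zero_flag: bool = False

def encrypt_tile(plain):
    # The flag is set during encryption (HE Enc), the only point at
    # which the plaintext contents are known.
    return Tile(data=list(plain), zero_flag=all(v == 0 for v in plain))

def he_mul_accumulate(acc, a, b):
    """Elementwise multiply-accumulate of two tiles into `acc`, skipping
    the expensive homomorphic operation entirely when either operand is
    a known-zero tile (its product contributes nothing to the sum)."""
    if a.zero_flag or b.zero_flag:
        return acc  # no HE multiplication issued
    return [x + u * v for x, u, v in zip(acc, a.data, b.data)]
```

Since pruned tiles never change during inference, the flag leaks nothing beyond the (public) pruned model structure, and every skipped tile saves one ciphertext multiplication and addition.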
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Aharoni, E. et al. (2024). Efficient Pruning for Machine Learning Under Homomorphic Encryption. In: Tsudik, G., Conti, M., Liang, K., Smaragdakis, G. (eds) Computer Security – ESORICS 2023. ESORICS 2023. Lecture Notes in Computer Science, vol 14347. Springer, Cham. https://doi.org/10.1007/978-3-031-51482-1_11
Print ISBN: 978-3-031-51481-4
Online ISBN: 978-3-031-51482-1