skip to main content
10.1145/3578360.3580273acmconferencesArticle/Chapter ViewAbstractPublication PagesccConference Proceedingsconference-collections
research-article
Open access

RL4ReAl: Reinforcement Learning for Register Allocation

Published: 17 February 2023 Publication History

Abstract

We aim to automate decades of research and experience in register allocation, leveraging machine learning. We tackle this problem by embedding a multi-agent reinforcement learning algorithm within LLVM, training it with the state of the art techniques. We formalize the constraints that precisely define the problem for a given instruction-set architecture, while ensuring that the generated code preserves semantic correctness. We also develop a gRPC based framework providing a modular and efficient compiler interface for training and inference. Our approach is architecture independent: we show experimental results targeting Intel x86 and ARM AArch64. Our results match or out-perform the heavily tuned, production-grade register allocators of LLVM.

References

[1]
Paul Almasan, José Suárez-Varela, Arnau Badia-Sampera, Krzysztof Rusek, Pere Barlet-Ros, and Albert Cabellos-Aparicio. 2020. Deep Reinforcement Learning meets Graph Neural Networks: exploring a routing optimization use case. arxiv:1910.07421.
[2]
Ethem Alpaydin. 2010. Introduction to Machine Learning (2nd ed.). The MIT Press. isbn:026201243X
[3]
Andrew W. Appel and Lal George. 2001. Optimal Spilling for CISC Machines with Few Registers. In PLDI ’01. Association for Computing Machinery, New York, NY, USA. 243–253. isbn:1581134142 https://doi.org/10.1145/378795.378854
[4]
Amir H. Ashouri, Andrea Bignoli, Gianluca Palermo, Cristina Silvano, Sameer Kulkarni, and John Cavazos. 2017. MiCOMP: Mitigating the Compiler Phase-Ordering Problem Using Optimization Sub-Sequences and Machine Learning. ACM Trans. Archit. Code Optim., 14, 3 (2017), Article 29, Sept., 28 pages. issn:1544-3566 https://doi.org/10.1145/3124452
[5]
Rajkishore Barik, Christian Grothoff, Rahul Gupta, Vinayaka Pandit, and Raghavendra Udupa. 2006. Optimal Bitwise Register Allocation Using Integer Linear Programming. In Languages and Compilers for Parallel Computing, 19th International Workshop, LCPC 2006, New Orleans, LA, USA, November 2-4, 2006. Revised Papers, George Almási, Calin Cascaval, and Peng Wu (Eds.) (Lecture Notes in Computer Science, Vol. 4382). Springer, 267–282. https://doi.org/10.1007/978-3-540-72521-3_20
[6]
Tal Ben-Nun, Alice Shoshana Jakobovits, and Torsten Hoefler. 2018. Neural Code Comprehension: A Learnable Representation of Code Semantics. In Proceedings of the 32Nd International Conference on Neural Information Processing Systems (NIPS’18). Curran Associates Inc., USA. 3589–3601. http://dl.acm.org/citation.cfm?id=3327144.3327276
[7]
Antoine Bordes, Nicolas Usunier, Alberto Garcia-Durán, Jason Weston, and Oksana Yakhnenko. 2013. Translating Embeddings for Modeling Multi-relational Data. In Proceedings of the 26th International Conference on Neural Information Processing Systems - Volume 2 (NIPS’13). Curran Associates Inc., USA. 2787–2795. http://dl.acm.org/citation.cfm?id=2999792.2999923
[8]
Florent Bouchez, Alain Darte, Christophe Guillon, and Fabrice Rastello. 2007. Register Allocation: What Does the NP-Completeness Proof of Chaitin et al. Really Prove? Or Revisiting Register Allocation: Why and How. In Languages and Compilers for Parallel Computing, George Almási, Călin Caşcaval, and Peng Wu (Eds.). Springer Berlin Heidelberg, Berlin, Heidelberg. 283–298. isbn:978-3-540-72521-3 https://doi.org/10.1007/978-3-540-72521-3_21
[9]
Preston Briggs, Keith D. Cooper, and Linda Torczon. 1994. Improvements to Graph Coloring Register Allocation. ACM Trans. Program. Lang. Syst., 16, 3 (1994), may, 428–455. issn:0164-0925 https://doi.org/10.1145/177492.177575
[10]
Philip Brisk, Foad Dabiri, Jamie Macbeth, and Majid Sarrafzadeh. 2005. Polynomial time graph coloring register allocation. In 14th International Workshop on Logic and Synthesis.
[11]
Gregory J. Chaitin, Marc A. Auslander, Ashok K. Chandra, John Cocke, Martin E. Hopkins, and Peter W. Markstein. 1981. Register allocation via coloring. Computer Languages, 6, 1 (1981), 47–57. issn:0096-0551 https://doi.org/10.1016/0096-0551(81)90048-5
[12]
Chia-Ming Chang, Chien-Ming Chen, and Chung-Ta King. 1997. Using integer linear programming for instruction scheduling and register allocation in multi-issue processors. Computers & Mathematics with Applications, 34, 9 (1997), 1–14. issn:0898-1221 https://doi.org/10.1016/S0898-1221(97)00184-3
[13]
Wei-Yu Chen, Guei-Yuan Lueh, Pratik Ashar, Kaiyu Chen, and Buqi Cheng. 2018. Register Allocation for Intel Processor Graphics. In Proceedings of the 2018 International Symposium on Code Generation and Optimization (CGO 2018). Association for Computing Machinery, New York, NY, USA. 352–364. isbn:9781450356176 https://doi.org/10.1145/3168806
[14]
Kyunghyun Cho, Bart van Merriënboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, and Yoshua Bengio. 2014. Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP). Association for Computational Linguistics, Doha, Qatar. 1724–1734. https://doi.org/10.3115/v1/D14-1179
[15]
Fred C. Chow and John L. Hennessy. 1990. The Priority-Based Coloring Approach to Register Allocation. ACM Trans. Program. Lang. Syst., 12, 4 (1990), oct, 501–536. issn:0164-0925 https://doi.org/10.1145/88616.88621
[16]
LLVM Community and LLVM Developers’ conference. 2014. PBQP Register Allocation. https://llvm.org/devmtg/2014-10/Slides/PBQP-update-and-in-the-wild.pdf
[17]
Chris Cummins, Zacharias V. Fisches, Tal Ben-Nun, Torsten Hoefler, Michael F P O’Boyle, and Hugh Leather. 2021. ProGraML: A Graph-based Program Representation for Data Flow Analysis and Compiler Optimizations. In Proceedings of the 38th International Conference on Machine Learning, Marina Meila and Tong Zhang (Eds.) (Proceedings of Machine Learning Research, Vol. 139). PMLR, 2244–2253.
[18]
Chris Cummins, Bram Wasti, Jiadong Guo, Brandon Cui, Jason Ansel, Sahir Gomez, Somya Jain, Jia Liu, Olivier Teytaud, Benoit Steiner, Yuandong Tian, and Hugh Leather. 2022. CompilerGym: Robust, Performant Compiler Optimization Environments for AI Research. In Proceedings of the 20th IEEE/ACM International Symposium on Code Generation and Optimization (CGO ’22). IEEE Press, 92–105. isbn:9781665405843 https://doi.org/10.1109/CGO53902.2022.9741258
[19]
Ron Cytron, Jeanne Ferrante, Barry K. Rosen, Mark N. Wegman, and F. Kenneth Zadeck. 1991. Efficiently Computing Static Single Assignment Form and the Control Dependence Graph. ACM Trans. Program. Lang. Syst., 13, 4 (1991), oct, 451–490. issn:0164-0925 https://doi.org/10.1145/115372.115320
[20]
Dibyendu Das, Shahid Asghar Ahmad, and Venkataramanan Kumar. 2020. Deep Learning-based Approximate Graph-Coloring Algorithm for Register Allocation. In 2020 IEEE/ACM 6th Workshop on the LLVM Compiler Infrastructure in HPC (LLVM-HPC) and Workshop on Hierarchical Parallelism for Exascale Computing (HiPar). 23–32. https://doi.org/10.1109/LLVMHPCHiPar51896.2020.00008
[21]
Grigori Fursin, Yuriy Kashnikov, Abdul Wahid Memon, Zbigniew Chamski, Olivier Temam, Mircea Namolaru, Elad Yom-Tov, Bilha Mendelson, Ayal Zaks, Eric Courtois, Francois Bodin, Phil Barnard, Elton Ashton, Edwin Bonilla, John Thomson, Christopher K. I. Williams, and Michael O’Boyle. 2011. Milepost GCC: Machine Learning Enabled Self-tuning Compiler. International Journal of Parallel Programming, 39, 3 (2011), 01 Jun, 296–327. issn:1573-7640 https://doi.org/10.1007/s10766-010-0161-2
[22]
M. R. Garey and David S. Johnson. 1979. Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman. isbn:0-7167-1044-7
[23]
gRPC. [n. d.]. gRPC Remote Procedure Calls. https://grpc.io [Online; accessed 29-Aug-2022]
[24]
Sebastian Hack, Daniel Grund, and Gerhard Goos. 2006. Register Allocation for Programs in SSA-Form. In Compiler Construction, Alan Mycroft and Andreas Zeller (Eds.). Springer Berlin Heidelberg, Berlin, Heidelberg. 247–262. isbn:978-3-540-33051-6 https://doi.org/10.1007/11688839_20
[25]
Ameer Haj-Ali, Nesreen K. Ahmed, Ted Willke, Yakun Sophia Shao, Krste Asanovic, and Ion Stoica. 2020. NeuroVectorizer: End-to-End Vectorization with Deep Reinforcement Learning. In Proceedings of the 18th ACM/IEEE International Symposium on Code Generation and Optimization (CGO 2020). Association for Computing Machinery, New York, NY, USA. 242–255. isbn:9781450370479 https://doi.org/10.1145/3368826.3377928
[26]
Lang Hames and Bernhard Scholz. 2006. Nearly Optimal Register Allocation with PBQP. In Modular Programming Languages, David E. Lightfoot and Clemens Szyperski (Eds.). Springer Berlin Heidelberg, Berlin, Heidelberg. 346–361. isbn:978-3-540-40928-1 https://doi.org/10.1007/11860990_21
[27]
Jiayi Huang, Md. Mostofa Ali Patwary, and Gregory F. Diamos. 2019. Coloring Big Graphs with AlphaGoZero. CoRR, abs/1902.10162 (2019), arXiv:1902.10162. arxiv:1902.10162
[28]
Shalini Jain, Yashas Andaluri, S. VenkataKeerthy, and Ramakrishna Upadrasta. 2022. POSET-RL: Phase ordering for Optimizing Size and Execution Time using Reinforcement Learning. In International IEEE Symposium on Performance Analysis of Systems and Software, ISPASS 2022, Singapore, May 22-24, 2022. IEEE, 121–131. https://doi.org/10.1109/ISPASS55109.2022.00012
[29]
S O Jakob. 2011. Greedy Register Allocation in LLVM 3.0. http://blog.llvm.org/2011/09/greedy-register-allocation-in-llvm-30.html
[30]
Dmitry Kalashnikov, Alex Irpan, Peter Pastor, Julian Ibarz, Alexander Herzog, Eric Jang, Deirdre Quillen, Ethan Holly, Mrinal Kalakrishnan, and Vincent Vanhoucke. 2018. Qt-opt: Scalable deep reinforcement learning for vision-based robotic manipulation. arXiv preprint arXiv:1806.10293.
[31]
Minsu Kim, Jeong-Keun Park, and Soo-Mook Moon. 2021. Irregular Register Allocation for Translation of Test-Pattern Programs. ACM Trans. Archit. Code Optim., 18, 1 (2021), Article 5, dec, 23 pages. issn:1544-3566 https://doi.org/10.1145/3427378
[32]
Minsu Kim, Jeong-Keun Park, and Soo-Mook Moon. 2022. Solving PBQP-Based Register Allocation Using Deep Reinforcement Learning. In Proceedings of the 20th IEEE/ACM International Symposium on Code Generation and Optimization (CGO ’22). IEEE Press, 230–241. isbn:9781665405843 https://doi.org/10.1109/CGO53902.2022.9741272
[33]
B Ravi Kiran, Ibrahim Sobh, Victor Talpaert, Patrick Mannion, Ahmad A. Al Sallab, Senthil Yogamani, and Patrick Pérez. 2021. Deep Reinforcement Learning for Autonomous Driving: A Survey. IEEE Transactions on Intelligent Transportation Systems, 1–18. https://doi.org/10.1109/TITS.2021.3054625
[34]
Krzysztof Kuchcinski. 2003. Constraints-Driven Scheduling and Resource Assignment. ACM Trans. Des. Autom. Electron. Syst., 8, 3 (2003), jul, 355–383. issn:1084-4309 https://doi.org/10.1145/785411.785416
[35]
Chris Lattner and Vikram Adve. 2004. LLVM: a compilation framework for lifelong program analysis & transformation. In International Symposium on Code Generation and Optimization, 2004. CGO 2004. 75–86. https://doi.org/10.1109/CGO.2004.1281665
[36]
Linux-Foundation. [n. d.]. Performance Counters for Linux. https://perf.wiki.kernel.org/index.php/Main_Page [Online; accessed 29-Aug-2022]
[37]
LLVM. [n. d.]. LLVM Machine Code Analyzer. https://llvm.org/docs/CommandGuide/llvm-mca.html [Online; accessed 29-Aug-2022]
[38]
Roberto Castañeda Lozano, Mats Carlsson, Frej Drejhammar, and Christian Schulte. 2012. Constraint-Based Register Allocation and Instruction Scheduling. In Principles and Practice of Constraint Programming, Michela Milano (Ed.). Springer Berlin Heidelberg, Berlin, Heidelberg. 750–766. isbn:978-3-642-33558-7 https://doi.org/10.1007/978-3-642-33558-7_54
[39]
Rahim Mammadli, Ali Jannesari, and Felix Wolf. 2020. Static Neural Compiler Optimization via Deep Reinforcement Learning. In The Sixth Workshop on the LLVM Compiler Infrastructure in HPC. 1–11. https://doi.org/10.1109/LLVMHPCHiPar51896.2020.00006
[40]
Charith Mendis, Alex Renda, Dr.Saman Amarasinghe, and Michael Carbin. 2019. Ithemal: Accurate, Portable and Fast Basic Block Throughput Estimation using Deep Neural Networks. In Proceedings of the 36th International Conference on Machine Learning, Kamalika Chaudhuri and Ruslan Salakhutdinov (Eds.) (Proceedings of Machine Learning Research, Vol. 97). PMLR, 4505–4515. https://proceedings.mlr.press/v97/mendis19a.html
[41]
Charith Mendis, Cambridge Yang, Yewen Pu, Dr.Saman Amarasinghe, and Michael Carbin. 2019. Compiler Auto-Vectorization with Imitation Learning. In Advances in Neural Information Processing Systems, H. Wallach, H. Larochelle, A. Beygelzimer, F. d' Alché-Buc, E. Fox, and R. Garnett (Eds.). 32, Curran Associates, Inc. https://proceedings.neurips.cc/paper/2019/file/d1d5923fc822531bbfd9d87d4760914b-Paper.pdf
[42]
Santosh G. Nagarakatte and R. Govindarajan. 2007. Register Allocation and Optimal Spill Code Scheduling in Software Pipelined Loops Using 0-1 Integer Linear Programming Formulation. In Compiler Construction, Shriram Krishnamurthi and Martin Odersky (Eds.). Springer Berlin Heidelberg, Berlin, Heidelberg. 126–140. isbn:978-3-540-71229-9 https://doi.org/10.1007/978-3-540-71229-9_9
[43]
Massimiliano Poletto and Vivek Sarkar. 1999. Linear Scan Register Allocation. ACM Trans. Program. Lang. Syst., 21, 5 (1999), Sept., 895–913. issn:0164-0925 https://doi.org/10.1145/330249.330250
[44]
Louis-Noël Pouchet and Tomofumi Yuki. 2018. PolyBench 4.2 Benchmarks. http://sourceforge.net/projects/polybench/
[45]
Fernando Magno Quintão Pereira and Jens Palsberg. 2008. Register Allocation by Puzzle Solving. SIGPLAN Not., 43, 6 (2008), jun, 216–226. issn:0362-1340 https://doi.org/10.1145/1379022.1375609
[46]
John Schulman, Filip Wolski, Prafulla Dhariwal, Alec Radford, and Oleg Klimov. 2017. Proximal policy optimization algorithms. arXiv preprint arXiv:1707.06347.
[47]
Yongwon Shin and Hyojin Sung. 2021. Hybrid Register Allocation with Spill Cost and Pattern Guided Optimization. In Languages and Compilers for Parallel Computing: 34th International Workshop, LCPC 2021, Newark, DE, USA, October 13–14, 2021, Revised Selected Papers. Springer-Verlag, Berlin, Heidelberg. 33–49. isbn:978-3-030-99371-9 https://doi.org/10.1007/978-3-030-99372-6_3
[48]
David Silver, Julian Schrittwieser, Karen Simonyan, Ioannis Antonoglou, Aja Huang, Arthur Guez, Thomas Hubert, Lucas Baker, Matthew Lai, and Adrian Bolton. 2017. Mastering the game of go without human knowledge. nature, 550, 7676 (2017), 354–359. https://doi.org/10.1038/nature24270
[49]
Douglas Simon, John Cavazos, Christian Wimmer, and Sameer Kulkarni. 2013. Automatic Construction of Inlining Heuristics Using Machine Learning. In Proceedings of the 2013 IEEE/ACM International Symposium on Code Generation and Optimization (CGO) (CGO ’13). IEEE Computer Society, USA. 1–12. isbn:9781467355247 https://doi.org/10.1109/CGO.2013.6495004
[50]
Yulei Sui, Xiao Cheng, Guanqin Zhang, and Haoyu Wang. 2020. Flow2Vec: Value-Flow-Based Precise Code Embedding. Proc. ACM Program. Lang., 4, OOPSLA (2020), Article 233, nov, 27 pages. https://doi.org/10.1145/3428301
[51]
Ondrej Sykora, Phitchaya Mangpo Phothilimthana, Charith Mendis, and Amir Yazdanbakhsh. 2022. GRANITE: A Graph Neural Network Model for Basic Block Throughput Estimation. https://doi.org/10.48550/ARXIV.2210.03894
[52]
Mircea Trofin, Yundi Qian, Eugene Brevdo, Zinan Lin, Krzysztof Choromanski, and David Li. 2021. MLGO: a Machine Learning Guided Compiler Optimizations Framework. CoRR, abs/2101.04808 (2021), arXiv:2101.04808. arxiv:2101.04808
[53]
S. VenkataKeerthy, Rohit Aggarwal, Shalini Jain, Maunendra Sankar Desarkar, Ramakrishna Upadrasta, and Y. N. Srikant. 2020. IR2Vec: LLVM IR Based Scalable Program Embeddings. ACM Trans. Archit. Code Optim., 17, 4 (2020), Article 32, Dec., 27 pages. issn:1544-3566 https://doi.org/10.1145/3418463
[54]
Tiago Cariolano de Souza Xavier, George Souza Oliveira, Ewerton Daniel de Lima, and Anderson Faustino da Silva. 2012. A Detailed Analysis of the LLVM’s Register Allocators. In 2012 31st International Conference of the Chilean Computer Science Society. 190–198. https://doi.org/10.1109/SCCC.2012.29

Cited By

View all
  • (2024)Two-Step Register Allocation for Implementing Single-Path Code2024 IEEE 27th International Symposium on Real-Time Distributed Computing (ISORC)10.1109/ISORC61049.2024.10551362(1-12)Online publication date: 22-May-2024
  • (2024)BitEA: BitVertex Evolutionary Algorithm to Enhance Performance for Register AllocationIEEE Access10.1109/ACCESS.2024.344659612(115497-115514)Online publication date: 2024
  • (2022)Reinforcement Learning assisted Loop Distribution for Locality and Vectorization2022 IEEE/ACM Eighth Workshop on the LLVM Compiler Infrastructure in HPC (LLVM-HPC)10.1109/LLVM-HPC56686.2022.00006(1-12)Online publication date: Nov-2022

Recommendations

Comments

Information & Contributors

Information

Published In

cover image ACM Conferences
CC 2023: Proceedings of the 32nd ACM SIGPLAN International Conference on Compiler Construction
February 2023
249 pages
ISBN:9798400700880
DOI:10.1145/3578360
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 17 February 2023

Permissions

Request permissions for this article.

Check for updates

Badges

Author Tags

  1. Register Allocation
  2. Reinforcement Learning

Qualifiers

  • Research-article

Funding Sources

Conference

CC '23
Sponsor:

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)451
  • Downloads (Last 6 weeks)75
Reflects downloads up to 15 Sep 2024

Other Metrics

Citations

Cited By

View all
  • (2024)Two-Step Register Allocation for Implementing Single-Path Code2024 IEEE 27th International Symposium on Real-Time Distributed Computing (ISORC)10.1109/ISORC61049.2024.10551362(1-12)Online publication date: 22-May-2024
  • (2024)BitEA: BitVertex Evolutionary Algorithm to Enhance Performance for Register AllocationIEEE Access10.1109/ACCESS.2024.344659612(115497-115514)Online publication date: 2024
  • (2022)Reinforcement Learning assisted Loop Distribution for Locality and Vectorization2022 IEEE/ACM Eighth Workshop on the LLVM Compiler Infrastructure in HPC (LLVM-HPC)10.1109/LLVM-HPC56686.2022.00006(1-12)Online publication date: Nov-2022

View Options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Get Access

Login options

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media