
An analysis of dynamic page placement on a NUMA multiprocessor

Published: 01 June 1992

Abstract

The class of NUMA (nonuniform memory access time) shared memory architectures is becoming increasingly important with the desire for larger scale multiprocessors. In such machines, the placement and movement of code and data are crucial to performance. The operating system can play a role in managing placement through the policies and mechanisms of the virtual memory subsystem. In this paper, we develop an analytic model of memory system performance of a Local/Remote NUMA architecture based on approximate mean-value analysis techniques. The model assumes that a simple workload model based on a few parameters can often provide insight into the general behavior of real applications. The model is validated against experimental data obtained with the DUnX operating system kernel for the BBN GP1000 while running a synthetic workload. The results of this validation show that in general, model predictions are quite good, though in some cases the model fails to include the effect of unexpected behaviors in the implementation. Experiments investigate the effectiveness of dynamic multiple-copy page placement. We investigate the cost of incorrect policy decisions by introducing different percentages of policy error and measuring their effect on performance.
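The abstract names approximate mean-value analysis as the modeling technique but gives no equations. The sketch below shows one common variant, the Schweitzer fixed-point approximation for a closed single-class queueing network, applied to a toy Local/Remote setting. It is an illustration under stated assumptions, not the model developed in the paper: the function names, the parameters (`local_fraction`, `compute_time`, `mem_service`, `remote_extra`), and the choice to treat the extra remote latency as a pure delay are all hypothetical.

```python
def schweitzer_amva(demands, think_time, n_customers, tol=1e-9, max_iter=10_000):
    """Schweitzer approximate MVA for a closed, single-class network.

    demands      -- service demand at each queueing station
    think_time   -- total delay (no queueing) per cycle
    n_customers  -- number of circulating customers (here: processors)
    Returns (throughput, per-station residence times, per-station queue lengths).
    """
    k = len(demands)
    queue = [n_customers / k] * k  # initial guess: customers spread evenly
    for _ in range(max_iter):
        # An arriving customer sees roughly (N-1)/N of the mean queue length.
        resp = [d * (1.0 + (n_customers - 1) / n_customers * q)
                for d, q in zip(demands, queue)]
        throughput = n_customers / (think_time + sum(resp))
        new_queue = [throughput * r for r in resp]  # Little's law per station
        if max(abs(a - b) for a, b in zip(new_queue, queue)) < tol:
            return throughput, resp, new_queue
        queue = new_queue
    raise RuntimeError("approximate MVA did not converge")


def numa_reference_rate(n_procs, local_fraction, compute_time,
                        mem_service, remote_extra):
    """Toy Local/Remote model (hypothetical, not the paper's).

    Assumes one memory module per processor, references spread evenly over
    modules in aggregate, and remote references paying an extra fixed
    latency that is modeled as added delay rather than a queue.
    """
    demands = [mem_service / n_procs] * n_procs
    think = compute_time + (1.0 - local_fraction) * remote_extra
    throughput, _, _ = schweitzer_amva(demands, think, n_procs)
    return throughput


if __name__ == "__main__":
    for p in (0.5, 0.9):
        rate = numa_reference_rate(8, p, compute_time=1.0,
                                   mem_service=0.5, remote_extra=4.0)
        print(f"local fraction {p:.1f}: {rate:.3f} references per time unit")
```

In this sketch a placement policy only changes `local_fraction`, so comparing reference rates across values of that parameter gives a rough sense of how much good placement could matter under the assumed latencies.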



Published In

ACM SIGMETRICS Performance Evaluation Review  Volume 20, Issue 1
June 1992
260 pages
ISSN:0163-5999
DOI:10.1145/149439
SIGMETRICS '92/PERFORMANCE '92: Proceedings of the 1992 ACM SIGMETRICS joint international conference on Measurement and modeling of computer systems
June 1992
267 pages
ISBN:0897915070
DOI:10.1145/133057
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

