On The Reproducibility and Scalability of Extreme Scale Applications

Shi, Justin

arXiv:1405.4464v1 (cs)

[Submitted on 18 May 2014 (this version), latest version 22 Jul 2014 (v3)]

Abstract: For dedicated small-scale computing environments, replicated code and data are sufficient to reproduce prior results, provided the processing platform remains compatible with both the code and the data. Experience indicates that this compatibility often deteriorates quickly because of rapid technology advances. The current mainstream explicit parallel programming paradigm not only couples parallel programs closely to the processing platform in use at the time of programming, but also builds growing instabilities into those programs, making reproducibility impossible at larger scales even if the processing platform changes only in size. This impossibility is also linked to our inability to quantify the scalability of parallel applications.
As technology developments will inevitably change future data processing methods and environments, explicit parallel paradigms and virtual-circuit-based APIs should be considered counterproductive to scientific research and big data processing applications, especially in light of the reproducibility of large-scale cloud computing applications.
This position paper details why and how this long-standing problem could be overcome. Preliminary computational results are reported in support of the proposed position.

Comments:	WSSSPE 2014
Subjects:	Distributed, Parallel, and Cluster Computing (cs.DC)
ACM classes:	C.1.4
Cite as:	arXiv:1405.4464 [cs.DC]
	(or arXiv:1405.4464v1 [cs.DC] for this version)
	https://doi.org/10.48550/arXiv.1405.4464

Submission history

From: Justin Shi
[v1] Sun, 18 May 2014 06:29:02 UTC (401 KB)
[v2] Mon, 21 Jul 2014 17:58:52 UTC (478 KB)
[v3] Tue, 22 Jul 2014 03:32:48 UTC (478 KB)
