Abstract
Systematic testing, first demonstrated in small, specialized cases 15 years ago, has matured sufficiently for large-scale systems developers to begin to put it into practice. With actual deployment come new, pragmatic challenges to the usefulness of the techniques. In this paper we are concerned with scaling dynamic partial order reduction, a key technique for mitigating the state space explosion problem, to very large clusters. In particular, we present a new approach for distributed dynamic partial order reduction. Unlike previous work, our approach is based on a novel exploration algorithm that 1) enables trading space complexity for parallelism, 2) achieves efficient load-balancing through time-slicing, 3) provides for fault tolerance, which we consider a mandatory aspect of scalability, 4) scales to more than a thousand parallel workers, and 5) is guaranteed to avoid redundant exploration of overlapping portions of the state space.
This research was sponsored by the U.S. Army Research Office under grant number W911NF0910273. The authors are also thankful to Google for providing its hardware and software infrastructure for the evaluation presented in this paper. Further, we also thank the members and companies of the PDL Consortium. The views and conclusions contained in this document are those of the author and should not be interpreted as representing the official policies, either expressed or implied, of any sponsoring institution, the U.S. government or any other entity.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
Similar content being viewed by others
References
Agha, G.: Actors: A Model of Concurrent Computation in Distributed Systems. MIT Press, Cambridge (1986)
Barnat, J., Brim, L., Češka, M., Ročkai, P.: DiVinE: Parallel Distributed Model Checker (Tool paper). In: HiBi/PDMC 2010, pp. 4–7. IEEE (2010)
Bucur, S., Ureche, V., Zamfir, C., Candea, G.: Parallel symbolic execution for automated real-world software testing. In: EuroSys 2011, pp. 183–198 (2011)
Chang, F., Dean, J., Ghemawat, S., Hsieh, W.C., Wallach, D.A., Burrows, M., Chandra, T., Fikes, A., Gruber, R.E.: BigTable: A distributed storage system for structured data. In: OSDI 2006, pp. 205–218 (2006)
Flanagan, C., Godefroid, P.: Dynamic Partial Order Reduction for Model Checking Software. SIGPLAN Not. 40(1), 110–121 (2005)
Ghemawat, S., Gobioff, H., Leung, S.: The Google file system. SIGOPS Oper. Syst. Rev. 37(5), 29–43 (2003)
Godefroid, P.: Partial-Order Methods for the Verification of Concurrent Systems - An Approach to the State-Explosion Problem. Springer (1996)
Godefroid, P.: Model Checking for Programming Languages using VeriSoft. In: POPL 1997, pp. 174–186. ACM (1997)
Gueta, G., Flanagan, C., Yahav, E., Sagiv, M.: Cartesian Partial-Order Reduction. In: Bošnački, D., Edelkamp, S. (eds.) SPIN 2007. LNCS, vol. 4595, pp. 95–112. Springer, Heidelberg (2007)
Guo, H., Wu, M., Zhou, L., Hu, G., Yang, J., Zhang, L.: Practical software model checking via dynamic interface reduction. In: SOSP 2011, pp. 265–278. ACM, New York (2011)
Holzmann, G.J., Joshi, R., Groce, A.: Swarm Verification Techniques. IEEE Transactions on Software Engineering 37, 845–857 (2011)
Killian, C.E., Anderson, J.W., Jhala, R., Vahdat, A.: Life, Death, and the Critical Transition: Finding Liveness Bugs in Systems Code. In: NSDI 2007 (2007)
Lamport, L.: Time, Clocks, and the Ordering of Events in a Distributed System. Commun. ACM 21(7), 558–565 (1978)
Musuvathi, M., Qadeer, S., Ball, T., Basler, G., Nainar, P.A., Neamtiu, I.: Finding and Reproducing Heisenbugs in Concurrent Programs. In: OSDI 2008, pp. 267–280 (2008)
Simsa, J., Bryant, R., Gibson, G., Hickey, J.: Efficient Exploratory Testing of Concurrent Systems. CMU-PDL Technical Report, 113 (November 2011)
Simsa, J., Gibson, G., Bryant, R.: dBug: Systematic Evaluation of Distributed Systems. In: SSV 2010 (2010)
Stern, U., Dill, D.L.: Parallelizing the MurPhi Verifier. Formal Methods in System Design 18(2), 117–129 (2001)
Vakkalanka, S.S., Sharma, S., Gopalakrishnan, G., Kirby, R.M.: ISP: A tool for model checking MPI programs. In: PPoPP 2008, pp. 285–286 (2008)
Yang, J., Chen, T., Wu, M., Xu, Z., Liu, X., Lin, H., Yang, M., Long, F., Zhang, L., Zhou, L.: MoDist: Transparent Model Checking of Unmodified Distributed Systems. In: NSDI 2009, pp. 213–228 (April 2009)
Yang, Y., Chen, X., Gopalakrishnan, G.: Inspect: A Runtime Model Checker for Multithreaded C Programs. University of Utah Tech. Report, UUCS-08-004 (2008)
Yang, Y., Chen, X., Gopalakrishnan, G., Kirby, R.M.: Distributed Dynamic Partial Order Reduction Based Verification of Threaded Software. In: Bošnački, D., Edelkamp, S. (eds.) SPIN 2007. LNCS, vol. 4595, pp. 58–75. Springer, Heidelberg (2007)
Yang, Y., Chen, X., Gopalakrishnan, G., Kirby, R.M.: Efficient Stateful Dynamic Partial Order Reduction. In: Havelund, K., Majumdar, R. (eds.) SPIN 2008. LNCS, vol. 5156, pp. 288–305. Springer, Heidelberg (2008)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Simsa, J., Bryant, R., Gibson, G., Hickey, J. (2013). Scalable Dynamic Partial Order Reduction. In: Qadeer, S., Tasiran, S. (eds) Runtime Verification. RV 2012. Lecture Notes in Computer Science, vol 7687. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35632-2_4
Download citation
DOI: https://doi.org/10.1007/978-3-642-35632-2_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-35631-5
Online ISBN: 978-3-642-35632-2
eBook Packages: Computer ScienceComputer Science (R0)