Sound Regular Expression Semantics for Dynamic Symbolic Execution of JavaScript

Loring, Blake; Mitchell, Duncan; Kinder, Johannes

doi:10.1145/3314221.3314645

arXiv:1810.05661 (cs)

[Submitted on 10 Oct 2018 (v1), last revised 13 Mar 2020 (this version, v4)]

Abstract: Existing support for regular expressions in automated test generation or verification tools is lacking. Common aspects of regular expression engines found in mainstream programming languages, such as backreferences or greedy matching, are commonly ignored or imprecisely approximated, leading to poor test coverage or failed proofs. In this paper, we present the first complete strategy to faithfully reason about regular expressions in the context of symbolic execution, focusing on the operators found in JavaScript. We model regular expression operations using string constraints and classical regular expressions and use a refinement scheme to address the problem of matching precedence and greediness. Our survey of over 400,000 JavaScript packages from the NPM software repository shows that one fifth make use of complex regular expression features. We implemented our model in a dynamic symbolic execution engine for JavaScript and evaluated it on over 1,000 Node.js packages containing regular expressions, demonstrating that the strategy is effective and can increase line coverage of programs by up to 30%.
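The two engine features the abstract singles out, greedy matching and backreferences, can be observed directly in a JavaScript engine. The following is a minimal sketch for illustration only; it is not the paper's encoding, and the helper name `captures` is hypothetical. It shows that the contents of capture groups depend on matching precedence, and that a backreference accepts a language no classical regular expression can describe.

```typescript
// Return the full match and all capture groups, or null if there is no match.
// Unmatched optional groups are undefined at runtime; map them to "".
function captures(re: RegExp, input: string): string[] | null {
  const m = re.exec(input);
  return m === null ? null : m.map((g) => g ?? "");
}

// Greedy quantifier: the first group takes as many 'a's as it can.
const greedy = captures(/^(a*)(a*)$/, "aaa"); // ["aaa", "aaa", ""]

// Lazy quantifier: the first group takes as few 'a's as it can.
const lazy = captures(/^(a*?)(a*)$/, "aaa"); // ["aaa", "", "aaa"]

// Backreference: \1 must repeat exactly what group 1 captured,
// so only strings of the form ww (w over {a, b}, non-empty) match.
const backrefHit = /^([ab]+)\1$/.test("abab"); // true
const backrefMiss = /^([ab]+)\1$/.test("abba"); // false
```

Both patterns in the first pair accept exactly the same strings, yet they assign different values to the capture groups. A model that only tracks language membership therefore cannot predict what a program reads from its capture groups.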

Comments:	This arXiv version (v4) contains fixes for some typographical errors of the PLDI'19 version (the numbering of indices in Section 4.1 and the example in Section 4.3)
Subjects:	Programming Languages (cs.PL)
Cite as:	arXiv:1810.05661 [cs.PL]
	(or arXiv:1810.05661v4 [cs.PL] for this version)
	https://doi.org/10.48550/arXiv.1810.05661
Journal reference:	Proc. ACM SIGPLAN Conf. Programming Language Design and Implementation (PLDI), pp. 425-438, ACM, 2019
Related DOI:	https://doi.org/10.1145/3314221.3314645

Submission history

From: Johannes Kinder
[v1] Wed, 10 Oct 2018 15:51:34 UTC (96 KB)
[v2] Tue, 11 Dec 2018 12:21:30 UTC (81 KB)
[v3] Tue, 7 May 2019 12:16:25 UTC (103 KB)
[v4] Fri, 13 Mar 2020 16:08:14 UTC (106 KB)
