research-article

Open access

Precisely Extracting Complex Variable Values from Android Apps

Authors:

Marc Miltenberger,

Steven ArztAuthors Info & Claims

ACM Transactions on Software Engineering and Methodology, Volume 33, Issue 5

Article No.: 121, Pages 1 - 56

https://doi.org/10.1145/3649591

Published: 04 June 2024 Publication History

PDF eReader

Abstract

Millions of users nowadays rely on their smartphones to process sensitive data through apps from various vendors and sources. Therefore, it is vital to assess these apps for security vulnerabilities and privacy violations. Information such as to which server an app connects through which protocol, and which algorithm it applies for encryption, are usually encoded as variable values and arguments of API calls. However, extracting these values from an app is not trivial. The source code of an app is usually not available, and manual reverse engineering is cumbersome with binary sizes in the tens of megabytes. Current automated tools, however, cannot retrieve values that are computed at runtime through complex transformations.

In this article, we present ValDroid, a novel static analysis tool for automatically extracting the set of possible values for a given variable at a given statement in the Dalvik byte code of an Android app. We evaluate ValDroid against existing approaches (JSA, Violist, DroidRA, Harvester, BlueSeal, StringHound, IC3, and COAL) on benchmarks and 794 real-world apps. ValDroid greatly outperforms existing tools. It provides an average F¹ score of more than 90%, while only requiring 0.1 s per value on average. For many data types including Network Connections and Dynamic Code Loading, its recall is more than twice the recall of the best existing approaches.

1 Introduction

Many users nowadays rely on smartphones and apps for sensitive tasks such as banking, e-mail, shopping, or access to their company’s internal network and services. With the “Internet of Things” movement, more and more “smart” gadgets have entered the market that are also controlled via mobile apps. Users can now, for example, unlock doors through a click or swipe on their phone, be it at home, in the car, or in the company. While these new features increase the users’ convenience, they also require the mobile devices and the software running on them (i.e., the apps) to be trustworthy and secure. The car control app, for example, can create movement profiles of the user, and the door unlocking app might allow unauthorized parties access to the house. Such behavior need not necessarily be malicious but can also be caused by programming mistakes and security vulnerabilities.

Therefore, popular app stores such as Google Play vet all uploaded apps before making them available to customers [34]. Security teams at larger organizations perform their own app analysis. However, app developers usually do not provide the source code. Analysts must therefore gain an understanding of the app on the binary alone, usually with the help of static and dynamic analysis tools [3, 13, 27, 35]. Extracting variable values is a crucial building block for these approaches. Method calls may, for example, show that an app connects to a server or encrypts data. The call arguments, however, specify the concrete target server and the concrete cryptographic algorithm being used. Without this information, neither tools nor human analysts can judge whether a data transmission is acceptable or whether a chosen cipher is appropriate. Furthermore, Android apps are usually built from numerous components that interact through a mechanism called Intent.¹ An Intent is a message from one component to another, which can either be in the same app or in a different one. The operating system resolves the recipient of each Intent using the concrete values inside the Intent, such as a component name string, or the type of the data being transmitted. Therefore, tools and humans alike must know these values to correctly follow such data transmissions.

Statically extracting such variable values is not trivial, because they are usually not available as constants inside the bytecode. Instead, a Uniform Resource Locator (URL), for example, is often dynamically generated from a base URL denoting the target server, and individual Application Programming Interface (APIs) that denote paths on that server. To correctly reconstruct the final URL, the analyzer must emulate the semantics of the string manipulation and URL building APIs invoked by the app. Intents, however, are complex multi-value objects that contain several individual strings. A single-value analyzer such as JSA [10] that extracts individual strings without context, cannot capture the semantics of the object. Therefore, such an analysis must assume that all combinations of these strings are potentially valid Intents. In reality, only few of these combinations are actually valid, which leads to a high false-positive rate of such tools, as we show in Section 8.3. IC3 [31] and COAL [32] perform composite constant propagation to address the issue of complex objects but only discover a few correct Intents in practice, as we show in our experiments on large real-world apps.

Loops inside the value computation also pose significant challenges to the analysis. To the best of our knowledge, no static tool for extracting variable values from Android apps or Java code is able to precisely deal with variable values altered in loops, such as strings being generated from iterator data inside a loop. Most approaches either completely ignore loops, unroll them a constant number of times, or use string widening to compute overapproximated regular expressions instead of precise strings. Such techniques do not correctly capture the semantics of, e.g., iterating over a Java collection and appending data from each collection element to a string. In contrast, our approach executes loops with the semantics of the original program to ensure the computation of correct values. This involves respecting loop conditions (i.e., $\texttt {if}$ conditionals in loops).

Furthermore, values may deliberately be obfuscated to prevent reverse engineering and protect the intellectual property of the developer. A popular technique is to encrypt the value using a static key that is computed at runtime from inputs that, albeit hard coded, are deliberately spread across the program and stored in non-intuitive formats. A static analysis tool must be able to derive the correct key such that it can undo the obfuscation and decrypt the original values. Although these values are not constants, the outcome of the code that reconstructs them at runtime is deterministic and usually does not require any user input. We call such values pseudo-constants.

Given the challenges static analyzers face on value extraction, analysts can resort to dynamic analyses. Since the app computes the final value at runtime, the value can directly be extracted at the call site where it is finally used through either instrumentation [1] or function hooking [11]. While trivial in theory, these approaches require the code that computes the value to actually be executed, leading to the well-known code coverage problem. Current UI automation approaches only exercise a minority of all code inside an app [9]. Combined static/dynamic approaches such as Harvester [35] can circumvent this problem by first extracting the relevant code and then executing this code in isolation. Such approaches face significant scalability issues, as we show in Section 8.5, making them unsuitable for large real-world apps.

In this article, we present ValDroid, a static analysis tool for accurately extracting variable values for complex objects from Android applications in Dalvik bytecode, even in the presence of loops and obfuscation. ValDroid is able to reconstruct not only individual values such as primitives or strings but also complex objects such as Android Intents built from individual components. ValDroid precisely captures the semantics of loops over well-defined objects such as iterators. It uses program slicing to ensure only necessary parts of the analyzed program are considered. After slicing, ValDroid simulates the effects of the statements in the slice to compute the values. ValDroid focuses on pseudo constants, which are values computed in a deterministic fashion and without user-input. As such, it can reason about concrete values and does not require symbolic values, a major contributing factor to its scalability. As we show in Section 8, ValDroid achieves a recall of more than 80% and a precision of more than 95% on average. For many categories of values, such as Intents, external programs launched by an app, or network connections, ValDroid’s recall is considerably higher compared to the second-best related approach. ValDroid requires only 0.1 s per value on average, while most other approaches take multiple seconds as we show in our evaluation. In summary, this article presents the following original contributions:

(1)

the ValDroid tool for statically reconstructing complex variable values from Android Dalvik bytecode apps,

(2)

an evaluation of ValDroid on 246 benchmarks taken from JSA [10], 794 real-world apps taken from the Google Play Store, and 864 malware apps from four different sources, and

(3)

a comparison between ValDroid and previous work (JSA [10], Violist [22], DroidRA [24], IC3 [31], Harvester [35], BlueSeal [44], StringHound [14] and COAL [32]) as well as Soot’s [2] standard analysis on our set of real-world apps.

The remainder of this article is structured as follows. We discuss related work in Section 2. In Section 3, we present common challenges involved when performing value analyses. In Section 4, we present the technical details of our approach. In Section 5, we discuss a representative example that is used to illustrate our approach, followed by details of our implementation in Section 6. In Section 7, we show properties of the algorithm and core improvements compared to related work and proceed with an extensive evaluation in Section 8. Subsequently, we present the limitations of ValDroid in Section 9 and conclude the article in Section 10.

2 Related Work

Various approaches for extracting values from Android apps or Java programs are already known in the literature. R-Droid [4] is conceptually closest to our approach, as it also computes backward slices from a given point of interest. It then applies a value analysis to each slice through copy propagation and expression evaluation similar to our forward execution step. In contrast to our approach, R-Droid handles neither loops nor complex multi-value data objects such as Android Intents. Lack of loop support hinders the analysis in case of string obfuscation. We would have liked to measure R-Droid’s accuracy against our approach, but no implementation is available.

Shannon et al. [37] present the tool Haderach, which uses automata-based representations for abstracting strings during forward symbolic execution. It injects symbolic class implementations of String, StringBuffer, and StringBuilder via instrumentation. For $\texttt {if}$ conditions, it executes paths from the beginning for each conditional branch. For each path condition, Haderach generates a witness for the induced constraints by the path condition. Haderach outputs a finite state automata containing possible strings and limits loops to a definable number of iterations. Unfortunately, Haderach is not available for experimentation.

JSA [10] models string operations as automata transitions and compute a regular expression that captures all possible strings accepted in the language of the automaton. Since control flow is not present in these automata, JSA is not path sensitive.

Violist [22] is a string analysis tool for Java programs and Android apps. It first applies a region-based analysis to identify the code segments and loops that compute a value of interest and then uses an interpreter to compute actual string values. Violist can model strings as values through loop unrolling or as automata similar to JSA. The loop unrolling method does not consider loop conditions. Instead, it unrolls the loop a certain number of times, which can be specified by the user. The values of each loop iteration are saved into variable-specific sets. As the loop semantics are not consistent with the original program, it can thus deliver incorrect results. The string analysis built into BlueSeal by Vecchio et al. [44] is also based on backward slicing and abstract interpretation, but the article does not give much details on the algorithm, in particular the handling of conditionals. COAL [32] introduced a language to model complex data structures, which are used for static constant propagation. It is “interprocedural, context-sensitive, field-sensitive but not object-sensitive” [30]. In contrast, ValDroid is object sensitive. While the COAL language supports storing data supplied via method calls such as setters, more complex operations such as string concatenation are not represented in the language. Therefore, the semantics of $\texttt {StringBuilder.append}$ are not specified in a COAL model but are hardcoded in the Java code to support this API call. This operation could only be encoded in the COAL language itself if the COAL language supported string manipulation operations. Complex operations generally cannot be represented in the current version of the COAL language itself without extending the language for this special use case. Rather than using a limited model language, transformers used in ValDroid are all Java code and as such can model the semantics more closely to the original implementation.

COAL uses a widening approach for loops and outputs regular expressions instead of concrete values. This means that COAL does usually not deliver precise results for when string decryption routines or other complex decodings are involved. In many of these cases, COAL only deduces a generic .* regular expression.

IC3 [31] is a tool specific for computing Android Intents and is based on the COAL language. Furthermore, IC3’s string analysis handles aliasing introduced by API calls such as $\texttt {StringBuilder.append}$ with a trick called “alias workaround” in Reference [31]. As the result value and the base value alias after the statement, and this is not captured by a def-use analysis, they replace the returned variable with the base variable and also replace all uses of the original returned variable with the base variable. In case of branching this approach leads to false results, since in the original code the uses of the returned value could have more than one definition.

Choi et al. [8] propose a string analysis tool based on abstract interpretation and heuristic widening. The implementation is not available for comparison. It uses a widening approach similar to JSA, to which it adds support for context sensitivity and field variables. It widens over loops and discards loop and if conditionals. We tried to contact the authors to request the prototype but were unable to reach them via e-mail.

Automata-based approaches usually have problems to get precise values in case of string encryption, where loop conditions and string computations on concrete strings are relevant. Usually, these approaches discard loop conditions and as such assume the loop can be repeated an arbitrary number of times, which results in less accurate string values.

DroidRA [24] focuses on resolving reflective method calls inside Android apps and is based on COAL. Harvester [35] is a combined static and dynamic approach. It first computes a slice with all statements that compute a given value of interest. It then executes this slice on an Android device or emulator. A dynamic approach has non-trivial additional setup time and is bound by the speed of the device or emulator. Harvester generates a separate dex file for each slice to allow multiple slices to be executed in a single run, resulting in large apps that take a long time to install on a device. To get a more complete picture, several devices and environments would have to be tested, since applications can exhibit different behaviors based on the device type and may even employ emulator checks. Due to imprecise slicing, in practice, exceptions may occur on dynamic runs, which are not relevant for the value of interest. This results in a termination of the path and no values are being computed. ValDroid ignores such problems during path simulation. In case a specific value cannot be computed, this might not pose a problem in case the slicer was not precise enough. If there is no value in the machine state at the statement of interest at the end, then no value is reported.

StringHound [14] is an approach similar to Harvester. The main difference that it runs slices on the JVM, which is fast but poses two problems: First, the JVM does not have an adequate model of the Android platform, rendering the computation of any values that need the Android API impossible. Therefore, it could not resolve any Intents, since these are part of the Android API. Furthermore, executing the sliced code as-is on bare metal may pose a security risk in case of untrusted applications. ValDroid does not have this problem, since it only models non-critical API methods. Tamiflex [6] resolves reflective call sites in static callgraphs by injecting actual class and method names encountered at runtime.

OLSA [45] is an approach inspired by and similar to JSA. It is faster but more imprecise than JSA. It does not support arrays, data structures and class fields. We asked the authors for the implementation, but it was not possible for legal approval reasons.

Grech et al. [15] propose an approach to solve reflection string analysis via graph coloring. It assumes that the substrings of names of members accessed via reflection are present as string constants. They use a graph coloring algorithm to determine which of the string constants can be merged without losing their ability to be used as reflection operator selectors. Rather than computing concrete string values or finite automata, the approach reduces the number of possible string constants (of all string constants in the analyzed application) used for reflective calls. In case of string obfuscations, the assumption that string constants are part of the members name of reflective calls does not hold.

Chopped Symbolic Execution [41] is an approach introduced by Trabish et al., which uses symbolic execution in an effective way to reach a given statement and allows the user to manually specify uninteresting parts to be skipped. Since the execution usually starts at the beginning of the program, executing all possible paths to the statement of interest would lead to path explosion. Therefore, skipping the manually specified uninteresting parts lead to a reduction of path combinations. Upon reaching a skipped part of the code, the approach clones the execution state, which are called snapshot states. When side effects of the skipped paths are needed, the algorithm needs to execute paths based on the snapshot states. Slicing is used to compute only necessary parts of skipped parts.

Symcretic Execution [12] was proposed by Dinges et al. and introduces symcretic execution, a combination of symbolic backwards execution and concrete forward execution, which is used to infer targeted inputs to cover specific branches or statements. First, it leverages symbolic backward execution to find a path from the given target to a program entry point. Second, it begins forward execution upon reaching an entry point. The authors acknowledge that slicing can help to reduce the number of paths, but it remained as part of future work.

In contrast to both Reference [41] and Reference [12], ValDroid works backwards and skips methods automatically when they are not needed due to the semantics of slicing. Since ValDroid aims to compute only a subset of values, ValDroid stops if it infers that no further slicing is necessary. Like Harvester, ValDroid usually does not start the execution at an entry point of the application, as it does not need an executable slice. It can start the execution as soon as the slicing criterion is fulfilled. Even uninitialized variables that would lead to a crash during traditional execution can be ignored by ValDroid.

Jaffar et al. [17] proposed a path-sensitive backward slicing approach. In the absence of loops, it can produce “exact slices for loop-free programs.” For loops, it uses approximate loop invariants to produce a sound, albeit not necessarily perfectly precise slice. It leverages symbolic execution to exclude dependencies on infeasible paths. Functions are inlined, since the analysis is intraprocedural, and issues such as context sensitivity is not considered in this approach. Slices produced by ValDroid do not need to be minimal, since imprecision only lead to a longer runtime, but do not lead to imprecisions of the reported values. ValDroid respects object and context sensitivity and does not report spurious results due to imprecisions. Producing better slices comes with an increased complexity and thus by itself can take a longer time.

Existing security analysis tools such as SCanDroid [13] rely on approximations to compute prefixes of string values of interest. SCanDroid first computes the subgraph that contains all operations on a given value of interest and then applies a value analysis on top of WALA’s intraprocedural flow solver. SLOG [46] uses constraint solvers to generate possible attack inputs for vulnerabilities in the context of web applications, such as Structured Query Language (SQL)-injection vulnerabilities. For potential vulnerable SQL statements in the code, it computes string constraint formulas using a dependency graph and checks the intersection between the formulas and the given attack pattern (regular expression). If it not empty, then a counterexample containing a possible attack scenario is computed.

ComDroid [7] extracts Intents but does not perform a multi-variate analysis, i.e., does not distinguish between changes to different fields of the same object made along different control flow paths. FlowDroid [3] relies on a simple interprocedural constant propagation based on Soot’s built-in transformers for code optimization. Soot [2] is an analysis framework that is capable of reading in Android applications and Java bytecode. It uses Jimple as its intermediate representation and performs a simple intraprocedural constant propagation as part of its input processing.

Amandroid [47] relies on string constants for ICC resolving and assumes that every string is possible if an ICC value is computed at runtime. HybriDroid [20] contains its own string analysis based on backwards slicing and explicit library models. In contrast to ValDroid, it performs a fixpoint iteration on all string operations in the slice. Lisper et al. [26] propose an on-demand static backward slicing algorithm, which outputs a program slice, i.e., a set of program statements, while our slicing algorithm produces single paths. Furthermore, their approach has to be augmented with API library models to produce viable slices of Dalvik bytecode.

The loop handling proposed by Park and Ryu [33] extends k-Control-Flow Analyses (CFA) with a Loop-Sensitive Analysis (LSA) framework for general-purpose analyses based on abstract interpretation. It does loop unrolling and tries to determine the correct amount of loop unrollings. They “proved that LSA is more precise or at least as precise as k-CFA” and implemented the approach in a JavaScript analyzer. However, as shown by Rival et al. [36], LSA has problems with loops in case of complex conditions.

Other research focuses on verifying whether a string computed by a program satisfies a given constraint [39]. Existing work on integrating string analysis into SMT solvers [25, 42, 48] follows a similar direction. More recent approaches aim to supplement traditional constraint solving with search-driven techniques [40]. While this research is relevant for detecting error patterns such as cross-site scripting or SQL injection, our focus is on recovering the precise value.

3 Challenges

In this section, we present various challenges that a value analysis approach faces on complex real-world apps. The first column of Table 1 lists the challenges we have identified. For each pair of approach and challenge, the table reports on whether the approach handles the respective challenge. We find that only ValDroid handles all of the challenges and therefore improves over the state of the art. The table focuses its comparison on the approaches against which we empirically evaluate ValDroid in Section 8. We deliberately omit existing work for which no artifact is available, as the performance and accuracy of such approaches cannot be evaluated in a fair comparison. As we show in Section 8, ValDroid offers significantly better precision and recall than existing approaches.

Table 1.

Type	Soot	JSA	Hv	Viol	DRA	BS	SH	IC3	COAL	VD
Precise Loop handling	✗	~ $^{1}$	✓	~ $^{2}$	~ $^{1}$	~ $^{1}$	✓	~ $^{1}$	~ $^{1}$	✓
Collections	✗	✗	✓	✗	✗	✗	✓	✗	✗	✓
Precise complex objects	✗	✗⁶	✓	✗⁶	✗⁶	✗⁶	✓	✗⁶	✗⁶	✓
Precise arrays	✗	✗ $^{3}$	✓	✗ $^{4}$	✗ $^{4}$	✗ $^{4}$	✓	✗ $^{4}$	✗	✓
Android APIs	✗	✗	✓	✗	✓	✗	✗ $^{5}$	✓	✓	✓
Reflective calls	✗	✗	✓	✗	✗	✗	✓	✗	✗	✓
Static or Hybrid	S	S	H	S	S	S	H	S	S	S

Table 1. A Comparison of Value Analysis Approaches with Respect to the Identified Challenges

“✓” denotes that the respective approach supports code with the particular type of challenge. “✗” marks that the approach does not support correct handling of such code. “~” denotes that the approach supports this type of challenge with some restrictions. S/H is used to show whether a tool is static or hybrid (static & dynamic).

$^{1}$ Uses the widening approach and generates regular expressions (using wildcards) out of loops, which may be more imprecise than concrete values.

$^{2}$ Can use the widening approach or loop unrolling until a specified maximum without consideration for the loop condition.

$^{3}$ Does not handle dynamic indices.

$^{4}$ Arrays are overapproximated.

$^{5}$ While StringHound supports JVM APIs, it does not support any Android APIs, since it runs on the JVM.

$^{6}$ These tools overapproximate field aliasing, leading to spurious results.

Harvester [35] and ValDroid are the only tools that handle loops, collections, complex objects, and arrays precisely as well as offer support for most important Android APIs. Since Harvester is a dynamic approach, it requires setting up a real-world device or using an emulator. To get good results, Harvester requires its slicer to deliver bug-free programs, i.e., ones that do not crash due to runtime exceptions. If the app crashes, then paths are aborted early and values are missed. Further, running untrusted apps or malware using Harvester is discouraged, since the sliced application runs with the original permissions of the program. Further, Harvester generates one dex file per sliced path, which is a relatively slow process.

Similarly, StringHound [14] slices for values of interest and runs the slices. In contrast to Harvester, StringHound runs slices derived from Android apps on the JVM. Consequently, Android APIs are inherently unsupported. Additionally, StringHound relies on the Java Security Manager to restrict the security implications of running untrusted code in the JVM. The security manager has been deprecated as of Java 17 partly due to continued vulnerabilities [16].

ValDroid in contrast works statically and does not require an elaborate setup. Dangerous APIs are modeled such that they can only access data from within the app. File access APIs, for example, cannot read from arbitrary files of the analysis computer. Instead, they can only query a file system model that contains files packaged within the APK file itself, e.g,. the Android asset directory.

Further, most approaches handle loops using the widening approach. In some cases, e.g., when string encryption is used in applications, these approaches tend to either overapproximate or give wrong values, since exact loop handling is required to determine the precise values. Too few or too many loop iterations would not result in producing the correct value. ValDroid, however, handles loops precisely.

We next discuss the individual challenges in more detail.

3.1 Loop Handling

In some cases, we found that Strings have been obfuscated with tools like DexGuard. DexGuard instruments a string decryption routine into the application, which uses loops to decrypt string values at runtime. Existing approaches approximate loops using the widening approach (or optionally loop unrolling). This technique yields overly generic regular expressions using wildcards as we show in Section 8. For example, in case of string encryption algorithms that decrypt a string character by character, a widening approach leads to an overly generic regular expression. Not only the correct string is described by the regular expression but also wrong character combinations.

ValDroid, however, handles loops precisely by evaluating the loop condition. We acknowledge that dynamic execution as applied by Harvester and StringHound also handle loops precisely.

3.2 Collections

An effective value analysis should be able to handle complex objects such as lists and maps, which are commonly used to store data. This requires the analysis to corectly process calls to the Java collections API and possibly other popular collection libraries such as Guava.

ValDroid models behavior of frequently used library methods. Harvester supports collections for which it has a slicing model. For example, there is a specification file in Harvester that says that a call $\texttt {ArrayList.add}$ must be retained during backwards slicing if the base object is already in the slice. The other static approaches do not support collections.

3.3 Precise Complex Objects Handling

Complex objects hold their state using multiple fields, requiring a multivariate analysis for optimal precision. Assume that a class has fields A and B with two instances that hold values $A=a_1$ , $B=b_1$ and $A=a_2$ , $B=b_2$ . An imprecise analysis would return the spurious combinations $A=a_1$ , $B=b_2$ and $A=a_2$ , $B=b_1$ . This is exactly the case for all static tools in Table 1 except ValDroid.

ValDroid is object and field sensitive during its simulation phase and precisely captures complex objects. The simulation phase is similar to an execution trace on a program slice and mimics the source program’s behavior with respect to the value under analysis.

3.4 Arrays

Obfuscators often rely on arrays for replacement-based runtime decryption. Characters in a string are replaced with other strings from an array based on indices computed at runtime. A precise analysis must therefore be able to distinguish between the different elements in an array.

ValDroid slices arrays and their elements if it determines that they may alias with the value of interest or a dependency thereof. During the forward execution phase, the concrete array indexes can be resolved and thus the handling is equivalent to the original program. Many other static approaches either overapproximate all arrays, i.e., disregard the array index entirely, or only support arrays with static indices.

3.5 Android API Support

Android apps often depend on Android-specific APIs. For example, some apps load strings or parts thereof from resource files using using $\texttt {Resources.getString}$ . Tools designed for Java applications do not model or support Android APIs. Since ValDroid was designed with Android applications in mind, it models the most important Android APIs.

StringHound, however, does not support Android and runs the code on the JVM. For resolving dependencies to Android APIs, it uses a stubbed android.jar with no actual implementation. This is also noted in the StringHound paper: “Methods that have to return a value return the type’s default value (e.g., null or 0)” [14].

3.6 Reflective Calls

Obfuscators such as DexGuard replace ordinary method calls with reflective method calls. The strings that denote the target class and methods are decrypted at runtime using constant values. If an analysis cannot handle reflection, then it operates on an incomplete callgraph and may therefore miss values. Only Harvester, StringHound, and ValDroid handle reflective calls properly.

ValDroid uses the same approach as Harvester [35], which, in turn, was inspired by Tamiflex [6]. ValDroid first constructs a preliminary callgraph. It then uses its value finder to obtain the targets of reflective method calls to augment the call graph with edges to the resolved methods. ValDroid iterates this process until no more callgraph edges are found.

Generally, tools such as Harvester can also be used as a pre-analysis to replace reflective call sites with direct calls before running a value analysis that does not support reflection. However, as we show, ValDroid achieves better precision and recall for reflective calls than Harvester.

3.7 Encounter Rates

Table 2 shows how often we have encountered a challenge during slicing. The data were computed on the real-world apps from RQ2 in the evaluation Section 8.3. Since the slice may be imprecise and encompass some statements that might not be relevant (e.g., due to imprecision in the alias analysis), we removed the statements that did not contribute to the computation of the final value. This removal was done as follows. We ran the slicing normally. In the simulation phase, we recorded the program state for each step. We then performed a fixed point iteration running backwards on the recorded program states to construct a set of relevant values. All variables that were never accessed during simulation were removed. All statements that did not read or write relevant variables in the respective program state were removed as well. The resulting slice only contained the relevant statements on which we performed the counting for Table 2.

Table 2.

Type	Encountered
Precise Loop handling	4,137
Collections	2,023
Precise complex objects	15,970
Precise arrays	9,012
Android APIs	6,306
Reflective calls	250

Table 2. How Often the Challenges Were Encountered Using Slicing

According to these results, the handling of precise complex objects using fields is important for many values. Handling arrays precisely is also a requirement for many values. Harvester, StringHound, and ValDroid are the only approaches that discern different array elements, with ValDroid being the only static one.

4 Approach

In this section, we explain the ValDroid approach in detail. We first give a high-level overview to ease the understanding of the following subsections before focusing on the individual algorithms that comprise ValDroid.

4.1 Approach Overview

Figure 1 shows an overview of the most important components of ValDroid. ValDroid takes two inputs: the Android application to analyze and a set of methods of interest. A method of interest is represented by the method signature as well as a set of parameter indices. ValDroid looks in the code of the given Android app for invocations of methods of interest and searches for the respective parameter values.

Fig. 1.

The ValDroid approach consists of two phases: slicing ① and simulation ②. The slicer identifies all statements in the code that contribute to the computation of a given variable of interest. It then translates these statements into mathematical functions that simulate the effect of the respective statement on the state of the program. Applying these per-statement functions in the right order to an initial state computes a final state that includes the concrete value of the variable of interest. This value is equivalent to the value that would be computed by running the program’s code on a machine initialized to the same initial state.

Note that ValDroid statically translates statements to functions and then evaluates these functions as a mathematical operation. ValDroid operates on the Jimple intermediate representation [43] that uses a limited set of instructions. Complex Java instructions are split up into multiple Jimple instructions, which makes implementing an analysis easier. Soot [2] is used to transform the Dalvik bytecode, which is used by Android apps, to Jimple [5]. Note that we have used collision-free naming of variables in the article for clarity reasons. The actual ValDroid implmentation operates on Java objects from the Jimple AST, i.e., a variable is uniquely identified by the object identity of the corresponding $\texttt {JimpleLocal}$ object in memory, regardless of the name of the local. The same applies to fields ( $\texttt {SootField}$ ) and methods ( $\texttt {SootMethod}$ ). Consequently, ValDroid does not depend on the input code to use collision-free naming.

To translate method calls to the Android library, which includes basic Java library classes such as $\texttt {StringBuilder}$ , ValDroid utilizes library models ③. Library models provide instructions on how to simulate an API call for which there is no code in the app under analysis. Further, a library model contains instructions for slicing, i.e., whether the presence of the method call has side effects on the base object or a parameter. For example, a call to $\texttt {StringBuilder.append}$ has a side effect on the base object of the call site. A call to $\texttt {System.arrayCopy}$ has a side effect on the array passed as the third parameter (the destination array). Furthermore, library models encode information about alias relationships that the actual implementation of the callee introduces. Currently, these models are written by hand, but usually just call the original Java implementation of the same method. In Section 8.7, we show that relatively few classes need to be modelled to achieve a high recall.

4.2 Definitions

Definition 1.

We define $\texttt {V} := \mathbb {V}_p \cup \lbrace \epsilon \rbrace$ with $\texttt {V}_p$ being the set of all variables in the program.² The special variable $\epsilon$ represents a placeholder element, which will be described later in this section. Analogously, we define $\mathbb {F}:= \mathbb {F}_p \cup \lbrace \epsilon \rbrace$ with $\texttt {F}_p$ being the set of all (Java) class fields in the program. $\mathbb {L}$ is the set of all possible values a variable or field can take. In our implementation, values implement the same interface and mimic data objects used in the implementation. For Intents, we use class objects with separate fields for the action, categories, and so on. For primitive types and Strings, we use a simple wrapper containing a simple value. We support all primitive types used by the JVM. Array values are represented by an array list. Supported collection classes use the actual implementation and may contain any supported value represented by the common value interface. StringBuilder/StringBuffer values hold an actual StringBuilder instance and behave like the Java object.

Definition 2.

A machine state $\texttt {M}$ is a function $M : \mathbb {V} \times \mathbb {F} \rightarrow \mathbb {L}$ . The function argument is a pair of base variable and field. Throughout this article, we use angular brackets to denote tuples. For example, a field access $\texttt {a.f}$ is modeled as $\langle a, f\rangle$ . Local variables (without a field dereference) are modeled as $\langle b, \epsilon \rangle$ . Static fields (without a base variable) are modeled as $\langle \epsilon , f\rangle$ . The $\epsilon$ in these cases specifies that the respective part (base variable or field) is absent.

Definition 3.

A state transformer $\lambda$ is a function $\texttt {M} \rightarrow \mathbb {M}$ .

Definition 4.

A pseudo constant is either

—

a constant value such as a primitive value or a string constant OR

—

computed as a result of a deterministic function using pseudo constants as inputs.

Note that $\texttt {L}$ contains not only values of primitive type but also complex objects such as Android Intents. State transformers model the effects that statements have on machine states. Conceptionally, a sequence of statements can be translated to a sequence of state transformers. Consider the following example with two variables: a = 3; b = a + 5. The effect of the first two statements can be modeled via the two following transformers $\lambda _1$ and $\lambda _2$ , mapping a machine state $t \in \mathbb {M}$ to a new machine state, which contains the effect of the respective transformer:

$\lambda _1 = t \mapsto {\left\lbrace \begin{array}{ll} \langle v, f \rangle \mapsto 3 & if\, \langle v, f \rangle = \langle a, \epsilon \rangle \\ \langle v, f \rangle \mapsto t(\langle v, f \rangle) & otherwise\\ \end{array}\right.}$

$\lambda _2 = t \mapsto {\left\lbrace \begin{array}{ll} \langle v, f \rangle \mapsto t(\langle a, \epsilon \rangle) + 5 & if\, \langle v, f \rangle = \langle b, \epsilon \rangle \\ \langle v, f \rangle \mapsto t(\langle v, f \rangle) & otherwise. \\ \end{array}\right.}$

The state transformer $\lambda _1$ models that variable a maps to the constant value 3 and that all other variables and fields keep their respective original values, which is exactly the equivalent of a = 3. The transformer $\lambda _2$ models that variable b receives the value of variable a plus 5, which is the equivalent of the second statement. Composing these function in the right order simulates the effect of the entire method and yields the final machine state, $m_t = [\lambda _2 \circ \lambda _1](m_0)$ , based on some initial machine state $m_0$ . A single concrete value can then be obtained by evaluating the final machine state for that variable. For variable b, for example, the concrete value is $m_t(b) = 8$ . In the context of this article, we call elements of $\texttt {M}$ a path, because each element is a composition of transformers (e.g., $\lambda _2 \circ \lambda _1$ ) and simulates the effects of a program path. In case of a single transformer, a path can be obtained using the identity function (i.e., $\lambda \circ id$ ).

4.3 Simulation

Algorithm 1 presents the starting point of the ValDroid approach in the Search method. This method first slices the code to obtain the paths, i.e., a set of compositions of transformers. It then calls Run to simulate each path. More precisely, the Run method applies the composition of lambda functions to an initial state. In the initial state, all variables are associated with the top symbol, resembling an undefined value. The final result of Search is the set containing all values of all machine states for the given variable of interest. In other words, Search returns all possible values along all control flow paths for the variable of interest.

4.4 Slicing

In this section, we discuss ValDroid’s slicing approach in more detail. Note that the slicing already performs the translation from code statements to transformers. The output of the slice function (see Algorithm 2) is a set of paths. There may be more than one path in case there are multiple control flow paths through the original code.

Slicing is based on an interprocedural control flow graph such as the one constructed by FlowDroid [3]. As inputs, the slice function takes a statement s from where to start slicing, a set of variables $S_v \subseteq \mathbb {V} \times \mathbb {F}$ of interest, and an (initially empty) current path. Technically, the empty path contains a single transformer, which applies the identity function to all variables in the machine state. To simplify the presentation of the article, we will discuss loop handling separately (see Section 4.8) and thus first assume loop-free code. The slice function calls the Apply method (Algorithm 5) to translate the current statement into a transformer. It then continues with the predecessors of the current statement. The calls to Map in line 21 and line 22 (defined in Algorithm 3) are only relevant in the interprocedural case described in Section 4.6. They are identity functions in the intraprocedural case. When there are multiple predecessors, slicing continues along all of these control flow paths with an independent copy of the current path for each predecessor. While this may theoretically lead to path explosion, our evaluation shows that ValDroid performs well in practice.

Special handling is required in case a predecessor statement is a condition. In that case, the Slice method calls itself recursively not only for the current variables of interest but also for the variables on which the condition depends. Without loss of generality, we assume a binary comparison and search back for the two operands. Note that the algorithm can cope with object and primitive types. In the former case, referential comparison was used in the source code of the application such as obj1 == obj2 with obj1 and obj2 being objects. As such, we can compare the actual values of the operands by reference as well when simulating the path. ValDroid tracks object references by using special values when instances are created at allocation statements. Note that the semantic comparison of objects, e.g., via $\texttt {equals}$ , can be reduced to a comparison of primitives: $\texttt {if (a.equals(b))}$ , is expressed in Dalvik bytecode as $\texttt {c = a.equals(b); if (c == true)}$ . More complex conditions such as combined conditions using $\texttt {\&\&}$ and $\texttt {|}$ | are treated similarly, using temporary variables and a primitive comparison in the $\texttt {if}$ condition. Field accesses introduce dedicated statements for that purpose as well, so that field conditions do not need to be handled in $\texttt {if}$ conditions or computation statements, since values from fields are stored in local variables before usage. This is a direct consequence of the Dalvik bytecode, in which $\texttt {if}$ conditions are evaluated using this technique.

Once the machine states are available, ValDroid evaluates each machine state for the operands of the condition using the Run method. We observe that the then branch of the condition is only feasible if there is at least one machine state on which the condition evaluates to true. The inverse applies to the else branch of the condition. Therefore, ValDroid evaluates these variables for each machine state and simulates the comparison. It only continues the slicing if the current statement (which is in either the then or the else branch) is actually possible with respect to at least one machine state (line 19). This prevents some spurious values from being computed, which do not resemble runtime values. The abortion of inconsistent paths may also decrease computation time.

4.5 Converting Statements to Transformers

The Apply method as shown in Algorithm 5 converts a single statement to a transformer. The core idea is to encode the effect the current statement has on the variables of interest as a transformation on machine states. The algorithm composes the existing path r with the respective new transformer $\lambda$ . Recall that machine states are functions on variables and fields and that transformers are functions on machine states.

Method Apply also updates the set of variables of interest $S_v$ . In the case of an assignment a = b with variable a being of interest, for example, the slicer must add b to the set $S_v$ and remove a. If the right side of the assignment is a constant, then the left side is removed from the set of variables of interest without adding the right side. In other words, the set $S_v$ keeps track of variables that are required but not fully resolved to constants yet. In line 8 of the Apply function, ValDroid handles arithmetic and logic operations (concrete operation abstracted through $\diamond$ ). For these operations, the operands are added to $S_v$ , while the variable that stores the result is removed. The state transformer emulates the operation. Field access handling is performed in lines 11 and following first for instance fields and then for static fields (starting at line 17).

Listing 1.

Listing 2.

The approach is capable of discriminating different class instances. Consider a user-code class C with a field f, which is used in Listing 1 to save a value. The analyst invokes ValDroid to obtain the value passed to the $\texttt {send}$ method in Line 5. The approach needs to distinguish the object instances $\texttt {o1}$ and $\texttt {o2}$ to obtain the correct result $\texttt {p1}$ . An object-insensitive approach may compute the wrong value $\texttt {p2}$ as a result of the analysis.

To understand how ValDroid handles this challenge, we must analyze the how the example is structured on the bytecode level.³ The object allocation is split in two statements, as shown in Listing 2. In the first statement, an empty object instance is created. The constructor call to the <init> method is treated as a normal method invocation. When the slicer encounters the $\texttt {new}$ -statement, it adds a transformer to the path so that in the execution phase a new instance placeholder is instantiated and saved in the machine state for that variable (line 25 and following in the Apply function). The instance placeholder works like any other value. Therefore, even before any field in o1 or o2 is assigned any value, the two objects with their separate allocation sites are already modeled as separate entities in the machine state.

Note that this approach works to discriminate object instances even if the instances are passed around in the program. API classes such as android.util.Pair are handled in the same way. The $\texttt {Pair}$ class saves a pair of objects as two separate fields that can be accessed from user code directly. The $\texttt {create}$ method of the $\texttt {Pair}$ class, which is a factory method for creating a pair, is handled by a library model. That model emulates the behavior of the API call in the Android framework and saves the passed parameters as the corresponding fields of that particular instance in the machine state. Library models are further explained in Section 4.7.

4.6 Interprocedural Slicing

ValDroid performs interprocedural slicing based on an interprocedural control flow graph, such as the one generated by FlowDroid. When the slicer encounters a method call, it recursively calls Slice on all return statements of all possible callees. If the caller statement is an assignment, then the Map function (Algorithm 3) is used to map the left side of the caller statement to the return value in the callee. The Map function is called in line 21 in Algorithm 2 in case the left side of the assignment is needed in the slice (i.e., the left side is present in the input set of variables of interest). We have omitted the implementation, since it is rather trivial and technical. In general, it maps the current values of interest $S_v$ between callers and callees, i.e., different function contexts. The slicer also adds a transformer that maps the variables in the machine state into the scope of the callee (i.e., translates variables from caller-side call arguments to callee-side formal parameters). Note that this mapping only copies values, but keeps the full state, even for variables that are local to the caller. This ensures that such variables are not lost from the machine state upon returning from the callee. For simplicity in the presentation of this article, we assume a collision-free naming of variables across the whole program to avoid clashes between variables in different methods. The implementation can handle such cases and does not depend on variable names.

When the slicer arrives at the beginning of a method, it continues the slicing at all possible call sites with the respective mapping (see line 21 in Algorithm 2). That means that the $\texttt {this}$ -Object and parameters are mapped from the callee to the caller. Note that we present the algorithms in a context-insensitive manner, while the actual implementation is based on the call strings approach [38]. Integrating call strings into the algorithms is straightforward but would hinder readability on the core contributions of this article. For virtual and interface invocations, the implementation introduces a special state transformer that checks whether the type of the value of the base object is compatible with the method’s declaring class, i.e., whether the virtual invocation call would actually call the callee at runtime. We use a virtual dispatch model, which mimics the behavior of the JVM. Note that FlowDroid’s callgraph, which is based on the Soot Pointer Analysis Research Kit (SPARK) callgraph algorithm, is context insensitive. Its points-to set for a given variable merges all possible types of that variable over all paths. ValDroid fixes this imprecision by checking the concrete type of the class instance, which was propagated along the current path, during the simulation phase. The $\texttt {this}$ object becomes part of the set of sliced variables, which gets propagated to the caller. If the callee does not correspond to the type of the base object, then the path is aborted. This improves the recall and runtime characteristics of the algorithm. In case the runtime type of the base object could not be determined using the machine state, it assumes that the callee is a viable candidate and continues the execution. Note that ValDroid also checks the class hierarchy, since subclasses inherit the non-private implementations of their parent class unless they define a the method with a compatible signature.

The call graph algorithm of FlowDroid takes care of generating a dummy main method that contains call edges to event handlers. It also models the lifecycle of the Android app by emulating the interaction between the Android framework and the app. Building on this callgraph allows ValDroid to reconstruct values that are computed across various lifecycle methods and event handlers. The alias algorithm uses points-to-sets generated by SPARK, which leverages FlowDroid’s dummy main method. SPARK is flow and context insensitive. Other aliasing algorithms might offer better performance by being more precise; we plan to explore that in the future.

In the implementation, the machine state contains stack frames. To reduce the complexity of the presented algorithms, this detail has been omitted in the pseudocode. Special transformers that push and pop stackframes are placed in case of method exits and enters respectively during backwards slicing. For all call sites, a push stackframe transformer is placed on the path. During the simulation phase (which happens forward), a stackframe is placed onto the stack and local variables that are visible in the callee are mapped to the new stackframe when entering the method. When leaving the method, the stackframe is removed again and the variables that are visible in the caller are mapped back into the scope of the caller. In the pseudo code, we only show virtual invocations for simplification reasons. The implementation can cope with all invocation types.

4.7 Library Models

User code often references libraries, some of which are only available on the target device and therefore cannot be analyzed together with the app, such as the Android API. To handle such library functions f, ValDroid relies on a repository of manually implemented transformers $\lambda (\delta _f)$ called summaries. Note that summaries are not necessary for libraries that are compiled into the app, such as the Android support library, since these libraries are app code from the perspective of ValDroid.

Definition 5.

$\Delta (s)$ is the set of pre-defined transformers that capture the semantics of the method invoked by statement s. The transformers $\lambda (\delta _f) \in \Delta (s)$ in the sets $\Delta _{base}(s)$ , $\Delta _{param_i}(s)$ , $\Delta _{return}(s)$ capture the effects of s on the base object, the ith parameter, and the return value, respectively.

For each call site without a callee available to the analyzer (such as calls to the Android or Java API), method Apply in Algorithm 5 invokes ApplyLibraryModel (in Algorithm 6). For each value of interest referenced by the current statement s, it identifies the corresponding transformer $\lambda (\delta _f)$ , which is appended to the current path. API calls may again depend on other variables, e.g., the first parameter of the call. Such dependencies must be added to the values of interest v before continuing the slicing.

Definition 6.

$\theta (s, \lambda _f)$ is the set of variables/fields of interest on which the transformer $\lambda _f$ depends when applied to statement s.

Note that, when applying a summary, the slicing continues with the intra-procedural predecessor of the call site, since there is no callee in the user code. We show an example on how we implemented library models in Section 6.

4.8 Loops

In Android bytecode and in Jimple, loops are not an entity on their own but are rather built using a combination of conditional and unconditional jumps. Classical loop-unrolling, which many other tools employ, unrolls the loop a constant number of times. Since the number of loop iterations can often not be determined in advance, the number of loop unrolling iterations is arbitrary and may not reflect the actual semantics of the program. Furthermore, the number of iterations that is required to compute the correct value may be different for different values of interest.

ValDroid therefore avoids constant loop unrolling. Instead, ValDroid skips the loop in the slicing phase but places a special transformer called the loop emulator on the path. The loop emulator is capable of executing the loop in its entirety until the execution leaves the loop due to the loop condition no longer being met. This means, that in contrast to classic loop-unrolling behaviors, ValDroid only executes the loop as long as the loop condition is fulfilled, which is precisely the semantics of the original program. To ensure termination in case such a state is never reached, e.g., because the termination condition cannot be evaluated statically, ValDroid can be configured with a maximum loop iteration count. Only if this cutoff is triggered, ValDroid degrades to traditional loop unrolling. In our evaluation (see Section 8.8), a cutoff was triggered in only 0.97% of all searches.

Loop detection.

For ValDroid, statements in loops are handled differently than statements that are not within a loop. To detect loops and their corresponding statements we employ a strongly connected components (SCC) analysis on the statement graph of the sliced method. Using SCC, we can compute entry points to the loop as well as statements that take part in the loop. In case of nested loops, we only retain the strongly connected components of the outer loops (which include all statements of the inner loops) and discard results about the inner loops. The emulation of the inner loops are handled implicitly by the emulator. Therefore, a statement in a loop may only be part of exactly one SCC. For example, Figure 2 shows an outer loop counting from 0 to 200 and an inner loop counting from 0 to 300. For this sample, we assume that $r3 is the searched value at $\fbox{1}$ . The statement is sliced normally. When the next statement in the intraprocedural CFG is processed ( $\fbox{2}$ ), ValDroid detects that this statement is part of a loop, since the SCC analysis reports two SCCs for that statement. SCC 1 contains all statements of SCC 2. A correct emulation of SCC 1 leads to a correct emulation of SCC 2. Therefore, only SCC 1 is used, and information about SCC 2 is irrelevant and is discarded. In the remaining part of the section, we do not distinguish between the inner and the outer loop and always mean SCC 1. ValDroid determines further that the loop is relevant, since it has an effect on the value in $r3 and thus cannot be skipped. If this was not the case, then the slicer would continue at $\fbox{3}$ .

Fig. 2.

The SCC analysis is an on-demand analysis on the sliced method, and its results as well as the set of affected variables of the loops are cached, since they are independent of the slicing criterion. This approach reduces the slicing time in case multiple values of interest are requested.

In case the slicer detects that the current statement to slice is part of a loop (line 10 in Algorithm 2), the special loop handling approach takes place. Currently, our approach adds all variables that are read and all fields that are dereferenced within the loop to the slicing criterion and assumes that these values are needed. In the example, these variables needed from before the loop are $\$r0$ and $\$r3$ . This approach trades slicing performance for slice size, i.e., it leads to a larger slice in comparison to precisely slicing the loop body. In the evaluation, we show that this approach does not have a significant negative effect on the recall of the overall value analysis in practice. Improving the loop slicer to be more precise while retaining the performance that ValDroid currently offers therefore remains as future work. Though the slicer skips the loop for now, it takes note of the current statement in that path. This statement is considered the exit statement. Since our slicing algorithm works backwards, the exit statement is the last statement outside of the loop. The exit statement in the sample is $\fbox{1}$ . ValDroid adds the loop emulator to the path, and the slicing continues for each entry point of the loop.

The loop emulator is an ordinary state transformer and thus receives an input machine state that corresponds to the machine state right before the loop execution. After inserting the loop transformer, the slicer continues at statement $\fbox{3}$ . The loop emulator starts emulating the loop at the entry point of the loop. Note that emulating a statement only involves using the corresponding state transformer for that statement. State transformers do not need any special handling for this forward emulation. The loop emulator executes the complete loop body and respects all if conditionals within the loop. Thus, loop termination conditions are emulated as well. In case the emulator would run a statement out of the respective loop it checks whether the next statement of the rest of the path is the exit statement. If these statements are not identical, then the path is aborted, since otherwise a value along an infeasible path would be computed. Loops can only be left at the exit statement determined during slicing.

Statement of interest within the loop.

A special case occurs if the statement of interest is within the loop. This is the case when the statement of interest is $\fbox{4}$ . Rather than computing a value in a loop and passing it to the statement of interest afterwards, the statement of interest is triggered occasionally during the execution of the loop. In this case, the exit statement determined by the slicer is part of the loop itself. The loop emulator calls the remaining state transformers in the path each time the emulator reaches the statement of interest. This corresponds to the original behavior of the program, as the statement of interest would also be called in every loop iteration during normal execution.

Precise virtual invocations.

For handling potentially recursive method calls, the emulator utilizes and updates the stackframes present in the machine state. When an application method is invoked, the emulator respects the virtual/interface dispatch semantics of the Dalvik virtual machine. Similarly to callgraph construction, the type of the base object must be known to precisely handle virtual method calls [21]. ValDroid tracks the allocation sites of all sliced values as part of the machine state. When the executor encounters a virtual method call, it queries the type of the base object from the current machine state. In case the actual type is known, the emulator computes the precise method being called. Otherwise, e.g., because no allocation site for the base object was encountered, the emulator utilizes the static call graph to determine possible call targets. This may, for example, happen in case the base object is instantiated, in a factory method outside of the application code.

There may be situations in which the executor is not able to simulate a statement. Required values may be missing values, a method call may reference a method outside the application code for which no summary exists, or the call may be have an ambiguous call target. In such a situation, ValDroid attempts to approximate the computation by skipping the respective statement and executing the next. While this may yield an incorrect value, the value might still provide some insight for a human analyst, e.g., in case of a file path in which an intermediate directory is missing, but the rest is correct. In case a condition for loop termination cannot be determined, the path is aborted to avoid infinite loop simulation.

Normal $\texttt {if}$ conditions.

Conditionals, which are not part of a loop, can be sliced normally without the catch-all behavior described for loops. Still, ValDroid offers a configurable tradeoff between recall and performance for such $\texttt {if}$ statements. In the most precise configuration, ValDroid slices the criterion. During simulation, ValDroid checks whether the criterion is fulfilled in the current machine state. If so, then the simulation continues with the then block of the conditional. If not, then it continues with the else block. The machine state is used to determine which block is feasible on the current path. In case the criterion cannot be determined, e.g., because there is no corresponding entry in the machine state, both blocks are simulated.

Conclusions.

The loop handling explained in this section allows ValDroid to process loops without knowing in advance how many loop iterations are needed to compute the requested value. Instead, upon reaching this part of the code in the simulation phase, the emulator can determine when to terminate the loop by simulating the loop condition similar to the semantics at execution time in the original program. With this approach, ValDroid improves significantly over current techniques based on constant loop unrolling.

Slicing all loop conditions is very precise but leads to a lot of control-flow dependencies that must be sliced as well. In practice, many control statements are independent of the values of interest. Therefore, the performance of ValDroid may be impacted severely. This property of control-flow dependencies is well known in the literature [18]. To avoid this performance bottleneck, ValDroid does not slice criteria of conditionals by default, i.e., does not add them as dependencies during slicing. However, in the execution phase of the algorithm, these conditionals are still evaluated. In case the machine state happens to contain values for the criterion variable, the executor selects the applicable block (then or else) with which to continue. If the criterion cannot be evaluated, then ValDroid simulates both blocks as possible suffixes to the current path.

5 Demonstration Example

In this section, we use an example to demonstrate how ValDroid operates. The code in Listing 3 was taken from a popular Turkish medical app and slightly simplified. The method sendToInternet takes data from an Android bundle, which is essentially a key/value map, and constructs an HTTP GET request with this data as parameters. A mapping from key k to value v therefore becomes k=v in the generated URL, e.g., https://a.com/text?k=v. This request is then sent to a remote server in line 26. Since this method may also be used for security-sensitive requests with, e.g., login credentials as in the example, analysts need to know the final request string that is transferred to the remote server.

Listing 3.

This example is complex, because it contains loops and stateful complex objects. The input is an Android bundle, not a primitive object or string. The iterator’s next() method returns a different object for each loop iteration. This value is then used to manipulate the state of the StringBuilder in place (i.e., without constructing a new StringBuilder or overwriting the variable) on lines 20ff. The append method returns the base object, thereby introducing an alias. Further, values are passed between multiple methods (onClick and sendToInternet) using the method call in line 4, requiring an inter-procedural analysis. The value of interest is the contents of the StringBuilder after the last loop iteration (line 26). A simple constant propagation cannot handle such cases, while our approach delivers the correct value “https://a.com/put?key=secret” for $\texttt {s}$ at the sink in line 26.

ValDroid operates on the Jimple [43] intermediate representation language provided by the Soot library [2]. Jimple is a three-address code generated from the Dalvik bytecode of the app. In Jimple, at most one complex operation can be performed per statement. Complex operations include field read/writes, array element read/writes, and binary operand computations. Statements with nested complex operations get split into multiple Jimple statements with temporary variables for the intermediate results. Note that arbitrary Dalvik code can be translated to Jimple, and although Jimple introduces more statements and temporary variables, the transformation is semantics preserving. The appendix contains the Jimple code for our demonstration example from Listing 3, slightly abbreviated for legibility. Constants and local variables are not considered complex and an arbitrary number of these simple values can appear inside a statement. Our algorithms in ValDroid are built on these language constraints, i.e., only consider a single complex operation per statement. Note that $\texttt {if}$ statements may represent loop-headers as well as if conditions. A loop is built from an $\texttt {if}$ statements and a $\texttt {goto}$ statement. For the sake of simplicity, we omit the handling of switch statements. ValDroid supports switch statements, but for the brevity of the article we assume without loss of generality that $\texttt {switch}$ statements are replaced with corresponding $\texttt {if}$ statements.

Fig. 3.

5.1 Slicing and Running the Demonstration Example

In this section, we briefly describe how ValDroid computes the value of variable s before line 26 in Listing 3. The slightly shortened representation of the interprocedural control-flow graph is shown in Figure 3 and is used in this section to demonstrate certain aspects of the approach. The sets of variables annotated on the dotted arrows show the worklist of our slicer (Section 4.6), i.e., the variables that contribute to the computation of the requested values.

Slicing.

The slicer starts at ③, which is the statement of interest. Variable $r15 is the value of interest and is the only element in the slicing set. The slicer then encounters ④, which is a call to a method in the StringBuilder class from the Java Standard Library. Since this method is not part of the application code, it requires a summary. The summary for computing the return value of the call is appended to the path as function composition. The dependencies $\theta (s, \delta _f)$ defined in the summary are added to the variables of interest with $S_v=\lbrace \langle \$r0, \epsilon \rangle \rbrace$ , which is the base object.

The next predecessor statement ⑤ is within a loop. Therefore, the state transformer handling this loop (②) is appended to the path and the used variables within the loop are added to $S_V$ . The statement represented by ⑤ is used as the exit statement of the loop. The slice continues right before the start of the loop (at statement ⑥)).

The slicer continues until the start of the method is reached. After slicing statement ⑦, the set of sliced variables is mapped to each caller. In this case, there is only the caller ① according to the interprocedural CFG. The Map function maps the local variables $p0 and $this1 from sendToInternet to the variables within onClick at ①, $b and $this0, respectively. Upon reaching the start of the method, the path is complete. It is then passed to the Run method of the algorithm for final simulation.

Running.

The Run method evaluates each path (in this case the one and only path). At first, it executes the transformer to create a new bundle object as well as the transformer emulating the effects of $\texttt {putString}$ . Afterwards a push-stacktrace-transformer is executed, which adds a new stackframe for the $\texttt {sendToInternet}$ method with the value mapped from the actual parameter $\$b$ to the formal parameter variable of the callee $\$p0$ . The execution proceeds normally until the loop emulator transformer ② is reached. Since the machine state contains all relevant values, the loop emulator transformer executes according to the original semantics of the loop within the program. After the first iteration, the $\texttt {if}$ condition at ⑤ becomes false and the loop emulator transformer determines that the next executed statement would be ④ and thus outside of the loop. ④ is the correct exit statement, since the rest of the path starts with the transformer of statement ④. Thus, the path is continued normally. The last transformer at ③ returns the computed value of $\$r15$ , which is the string constant ”https://a.com/put?key=secret ”.

6 Implementation

In this section, we describe implementation-specific aspects of ValDroid.

6.1 Reflection

As briefly noted in Section 3.6, ValDroid transparently handles reflective method calls similarly to Harvester [35]. It starts with a preliminary callgraph that ignores reflection. Based on this callgraph, it resolves the concrete string values passed to the Java reflection APIs using its normal Slice method. If the target class and method names are available, then ValDroid replaces the reflective method calls with direct ones and updates its callgraph accordingly. Finally, it repeats the analysis with the original configuration, i.e., the values of interest the analyst originally requested. This second analysis stage can then rely on a complete callgraph that is not challenged by reflection. In case of nested reflection, i.e., reflective method calls that again call the Java reflection API, ValDroid can be configured to iterate its analysis and rewriting of reflective method calls, until only normal method calls remain.

6.2 Inter-Component Calls

Using inter-component communication (ICC) calls, can invoke code in other components by sending messages called Intents to these components. For a static analysis, Intent passing therefore leads to additional edges in the callgraph. ValDroid handles such cases similarly to reflective method calls. In a first step, it extracts the concrete values of the Intents that are sent inside the app. It then matches these Intents against the component declarations in the app’s manifest file to find the correct recipient. In an approach similar to ICCTA [23], ValDroid rewrites the app to include direct method calls in addition to the Intent API calls. With these normal method invokes in place, a standard callgraph algorithm such as SPARK [21] can update its data structures with the cross-component callgraph edges. Afterwards, ValDroid iterates its analysis with the original configuration on the rewritten app based only on normal method calls.

6.3 Path Explosion

On complex apps, the slicer may run for a long time when processing all transitive dependencies of a given value of interest. As we found, long paths are often a result of over-approximations in the analysis rather than a necessity of the app under analysis. Such imprecision can, for example, stem from the callgraph algorithm on which ValDroid relies. If this algorithm returns more callees for a given site than necessary, then ValDroid must slice all of them (including their callees in turn), which can exponentially increase the analysis time. We therefore limit the maximum number of allowed backwards steps in the slicer as well as the number of different paths that may be created for one slicing criterion. As we show in our evaluation (see Section 8.8), paths have less than 100 steps on average. As such we set the cutoff to 100, which has no significant effect on ValDroid in practice.

The implementation does not compute the path for each requested variable individually but uses a mutable machine state containing all sliced values. Machine states are implemented by maps from local variables/instance fields/static fields to values. These states are passed along state transformers, which change the contents within the states.

6.4 Library Models

In our implementation, all values, including primitive constants, arrays and complex values such as Intents, StringBuilders, and so on, are represented by Java classes, which implement the $\texttt {Value}$ interface. Listing 4 shows a simplified excerpt of the implementation. The $\texttt {Value}$ interface provides a $\texttt {getAs}$ method, which receives an arbitrary Java class as parameter. The value instance should try to convert itself into an instance of the given class or return $\texttt {null}$ otherwise. As an example, this can be used to convert computed constants to Java primitives or Strings.

A machine state is implemented as a class containing a list of stackframes and maps, for instance, and static field values as well as values of local variables.

In ValDroid, library models are implemented as Java classes. Transformers are classes that contain a $\texttt {process}$ method performing operations on the current machine state. In many cases, we can leverage the original Java implementation of certain classes in the library, but Android-specific methods are implemented in models. The $\texttt {Get}$ -class is used to model all get-methods ( $\texttt {getBoolean}$ , $\texttt {getFloat}$ , $\texttt {getInt}$ , $\texttt {getLong}$ , $\texttt {getString}$ and $\texttt {getStringSet}$ ) of the $\texttt {SharedPreferences}$ class. Albeit the return type and the second parameter, which denotes the default value, are different depending on the concrete method, the same model can be used. We use this transformer as abn example to show how transformers are implemented in ValDroid.

When the effects of the method $\texttt {get*}$ should be applied to the current state in the path, the transformer’s $\texttt {process}$ method is called. The $\texttt {Get.process}$ method obtains the base object at the current simulation state (which also has a reference to the statement this transformer was created for, i.e., the statement in the user code containing the $\texttt {get*}$ call). The $\texttt {getParam}$ and $\texttt {getBase}$ methods available to all transformers identify the formal parameter at the requested index and the base object of the respective call. These can then be used to query the respective variable value from the machine state. The lookup is performed against the machine state associated with the current stack frame, which contains all variables in the scope of the callee. Recall that in Jimple, invocation statements may only reference local variables or constants and that more complex inputs such as field references must be stored in a temporary variable beforehand.

In the case of $\texttt {SharedPreferences.Get}$ , the library method is commonly called as part of an assignment where the return value is written into a variable. The state transformer must model the semantics of the assignment by associating the result with the correct variable in the machine state of the current stack frame. The ValDroid framework provides the $\texttt {setReturn}$ method for this task. If the code under analysis does not use the return value, i.e., the call is not an assignment, then $\texttt {setReturn}$ does nothing.

The example transformer $\texttt {SharedPreferenceValue}$ internally uses a map from String to Value to handle arbitrary objects as values. Since only Strings are allowed as keys in $\texttt {SharedPreference}$ , the model class also uses strings as keys only. Note that the contents of $\texttt {SharedPreferenceValue}$ ’s values map is mutable and as such is suitable to hold the state of the shared preferences object across multiple statements in the simulation phase. Since in ValDroid, values are represented by concrete instances, checking for referential equality is simple. Calls in user code to $\texttt {hashCode}$ and $\texttt {equals}$ are handled by a transformer that calls the respective methods of the $\texttt {Value}$ object. In the case of $\texttt {SharedPreferencesValue}$ , we did not override these methods, because the original implementation in Android, $\texttt {SharedPreferencesImpl}$ , does not implement $\texttt {hashCode}$ and $\texttt {equals}$ . Thus, calling $\texttt {equals}$ on a $\texttt {SharedPreference}$ object checks for referential equality. ValDroid uses the same behavior in this case.

In contrast, for values such as $\texttt {HashMap}$ s for example, the respective $\texttt {Value}$ implementation overrides the $\texttt {hashCode}$ and $\texttt {equals}$ methods and forward the call to the underlying $\texttt {HashMap}$ .

Listing 4.

6.5 Aliasing

Besides the semantics, Library models include aliasing effects of methods. For example, in case of $\texttt {StringBuilder.append}$ , the return value is an alias of the base object when this method is called. This information is leveraged to improve the PAG within Soot. ValDroid uses Soot’s point to sets to compute alias relationships. Note that imprecise alias information, i.e., an over-approximation, does not lead to an imprecision of the approach. When simulating the paths, object allocations are simulated as well. While simulating spurious value computations lead to a longer runtime, they do not affect the correct value. Similarly to the original application, we use machine states to track variables and their corresponding values. ValDroid achieves object sensitivity using this approach.

6.6 Unknown Values

Some values may dependent on user-supplied input (thereby violating our definition of a pseudo-constant value) or ValDroid may fail to compute a partial value, although it would normally be a pseudo-constant. In the algorithm, the $\top$ symbol is to represent these missing values. In the implementation, ValDroid uses a special value $\texttt {Value.UNKNOWN}$ to represent the $\top$ symbol. Library models may choose to treat unknown values in a special way. For example, the model for StringBuilder.append appends a special string ”UNKNOWN” in this case rather then discarding the whole StringBuilder value in case a the argument cannot be computed for one append call. When an app contains a URL that includes a user-name supplied by the user and constructs an URL like ”http://foo.com/login?user= ” $\texttt {+ username}$ , ValDroid computes ”http://foo.com/login?user=UNKNOWN ” as a string. In the evaluation section we use a strict comparison, so that only strings equals to the ground truth are considered correct. Such a URL would therefore count as both a false positive and as a false negative, reducing precision and recall of ValDroid. Depending on the use-case the results of the string extraction is used in, such partial results could be enough, though. For example, for an analysis that tries to determine whether the application potentially creates connections to servers via the unprotected HTTP protocol, it is sufficient to have the protocol. Since the full URL depends on user input and is not deterministic, it is not a pseudo constant, i.e., out of the scope of this work. Nevertheless, ValDroid can still be helpful in practice.

7 Core Improvements

This section focuses on some properties and core improvements compared to related work.

ValDroid focuses on the computation of pseudo constants and thus does not model user input. It avoids costly symbolic execution and does not need to execute from an application’s entry point but executes path as soon as the slicing criterion is fulfilled.

7.1 Sensitivity Properties

As shown in Section 3, handling complex objects and arrays precisely is important for a value computation approach. In this section, we show how ValDroid is an object-sensitive, context-sensitive, and path-sensitive approach.

Context sensitivity. The context sensitivity of ValDroid has been omitted in the algorithms as presented in the article, but the implementation respects context sensitivity in two different ways—first, during slicing. When the slicer looks for callers and there is nothing on the call stack, the slicer needs to take all potential callers into account. For each (non-static) caller statement, the slicer introduces a special transformer that is an identity function when the context matches and otherwise maps all variables to top. This transformer helps achieve context sensitivity during the simulation phase. The context is not matching when the type of the base object does not match the callee implementation, i.e., calling the virtual method in the real application would lead to the execution of a different method. This transformer understands the special semantics of virtual and interface invocations. This has been omitted in Algorithm 2 at line 22.

ValDroid tracks for each variable value the concrete type of the variable value to make this check possible. As such, values created by createInstance remember the class reference type, i.e., when simulating the statement $\texttt {Object a = new B}$ , ValDroid saves the concrete type $\texttt {B}$ . Values returned by library models also know their type. This information is also leveraged to evaluate the Jimple $\texttt {instanceof}$ condition when user-code uses $\texttt {a instanceof b}$ .

Listing 5.

Listing 5 serves as an example. The value of interest is $\texttt {s}$ at line 38. Note that although the $\texttt {Random}$ class is used in line 27, the value of interest is a pseudoconstant, since the RNG is seeded and thus deterministic.

In the example, when the slicer reaches the $\texttt {sendToInternet}$ call in line 37, no information about $\texttt {i}$ is known. The slicer puts the current statement on the call stack and forks into two different slices, one for each possible callee and a cloned call stack, $\texttt {a.postProcess}$ and $\texttt {b.postProcess}$ . When the slicer finishes the $\texttt {postProcess}$ methods, in each of the slicer instances the slicer returns to the caller using the call stack. Furthermore, to ensure context sensitivity and increase performance, the slicer takes note that $\texttt {i}$ must have the respective type to call the previously sliced method at runtime. The special transformer that checks the type of the variable $\texttt {i}$ in the machine state is appended to the current path. When reaching the $\texttt {prepare}$ call in line 36, the slicer uses the type information of $\texttt {i}$ to not descend into the wrong callee.

In short, ValDroid installs transformers with additional checks to make sure that only realizable paths are taken. The context information also increases the performance of the approach by avoiding impossible paths. In the example, ValDroid correctly computes ”a” and ”bb” but not the spurious values ”b” and ”aa.”

Path sensitivity. ValDroid is path sensitive. While the slicer can provide inconsistent paths, the forward execution aborts such paths during simulation. As an example, consider Listing 6. When ValDroid slices line 12, it places a transformer checking the value of x. Depending on the path, it checks for equality or inequality to 1. As such, ValDroid correctly computes ”ax” and ”by” but not the spurious values ”ay” and ”bx.”

Listing 6.

Object sensitivity. In ValDroid, machine states mimic the memory layout during execution on a real device. Inaccuracies in the slicer may lead to spurious statements and thus unneeded values in the machine states but do not affect the outcome of the simulation phase. As shown in Section 4.5 using Listing 1, ValDroid distinguishes object instances in the same way as a classic execution, these values do not influence the correct object instances. Therefore, ValDroid is object sensitive.

7.2 Loop Handling

Unlike known previous literature in the context of value computation, Unlike other approaches for value computation that have been presented in the literature so far, ValDroid does not use unconditional loop unrolling or widening but simulates the loop similar to an interpreter. Since loop conditions are respected, ValDroid does not need to unroll loops a predetermined number of times. Our loop handling not only reduces the time needed for simulation but also decreases the number of false values reported due to a wrong number of loop iterations. By checking the loop condition against the current simulator state, ValDroid can abort loops early without having to perform unnecessary long unrolling. This significantly reduces the set of possible values to track (e.g., when each loop iterations adds another item to a list). Note that our loop handling is similar to the Loop-Sensitive Analysis concept introduced by Park and Ryu [33], which describes an abstract interpretation based framework to analyze loops for analyses in general. Like our approach, they evaluate loop conditions to determine how many loop iterations are necessary and show that this both more precise and faster than naive loop unrolling. However, while our approach works on-demand starting from the point of interest, their approach is intended to analyze the program completely as a whole. They do not leverage slicing to only compute values needed for the analyses. We cannot directly compare ValDroid to Park and Ryu’s work, since they do not apply their technique on the value problem. Furthermore, their implementation is limited to JavaScript and does not analyze Java or Android applications.

7.3 Further Sanity Checks

ValDroid employs some further sanity checks to abort unrealizable paths. If a type-cast is performed in the analyzed code, which cannot be realized during simulation, then the path is aborted.

8 Evaluation

In this section, we evaluate ValDroid with regard to the following research questions:

RQ1

How accurate is ValDroid on benchmarks in comparison to other tools?

RQ2

How accurate is ValDroid on real-world apps in comparison to other tools?

RQ3

How accurate is ValDroid on malware apps in comparison to other tools?

RQ4

How long does it take for ValDroid to retrieve a value?

RQ5

How well does ValDroid handle string obfuscation?

RQ6

How many library models are necessary?

RQ7

How important is loop handling in practice?

8.1 Experiment Setup

We ran ValDroid as well as the existing tools (Soot, JSA, Harvester, Violist, DroidRA, BlueSeal, and StringHound) on a machine with 144 Intel Xeon Gold 6154 CPU cores and 3 TB of physical memory using the Oracle JDK version 1.8.0. A maximum of 50 GB was assigned as Java heap space (-Xmx VM parameter). We chose a powerful server system, because some of the existing approaches to which we compare ValDroid require large amounts of memory. All tools were run with their respective default configurations. Harvester employs a dynamic approach and needs either an actual device or an emulator. We used a Cubot J3 phone to get a more realistic result, since apps may have emulator checks in place.

8.2 RQ1: Accuracy on Benchmarks

To evaluate the accuracy of ValDroid and to compare it with the quality of previous approaches known to literature, we ran all tools against the test cases included with JSA [10] as a benchmark suite. In these tests, tools need to find 246 values in total. Each test case consists of Java code that computes one or more values and annotations on the expected outcome of the analysis in the form of three regular expressions. These expressions can be used to judge the quality of a value analysis. While the expression denoted “perfect” gives a precise formulation of the expected results, the “mediocre” regular expression also accepts slight variations. The “lowerbound” expression is usually of the form str.*, i.e., only specifies a prefix of the expected value.

Some of the tools under test (such as JSA) return regular expressions as their result. For these tools, we check whether the expression returned by the tool matches one of the three expected expressions (perfect, mediocre, lowerbound). Other tools, including ValDroid, return individual values. In such cases, we check whether the tool’s result is accepted by any of the three regular expressions in the test case. If so, then the value is counted in the respective category (precise, mediocre, lowerbound). If not, then the value is counted as not found. If a tool returns the massively over-fitting regular expression .*, which matches any string, then we did not count the value either, since it does not contain any information. We further also removed obviously invalid results such as Unknown@FIELD or This is a string constant inserted by HARVESTER. In case a tool returns multiple values, we check whether at least one value matches the “precise” category. If so, then we count the test case as correctly solved for the “precise” category. Otherwise, we check again for “mediocre” and last for “lowerbound.”

For all tools, we set a timeout of 3 hours per test case. If an analysis did not terminate in the allotted time, then we aborted it and counted the test as no value being found. The results of our evaluation are shown in Table 3. In addition to the individual match counts per category, we report the total precision for each tool as the sum of all matching categories. Four test cases included with JSA were erroneous, i.e., the “perfect” regular expression did not accept the value that was generated when running the code under analysis as a normal Java program. We fixed those tests and measured against the corrected version.

Table 3.

Tool	Perfect	Mediocre	Lwr.bound	Total
Soot	0.62	—	—	0.62
JSA	50.62	10.80	—	61.42
Harvester	52.47	6.17	—	58.64
Violist	40.43	13.35	—	52.78
BlueSeal	21.23	8.71	—	29.94
StringHound	4.12	—	3.17	7.29
COAL	31.76	5.26	—	37.02
ValDroid	94.60	0.03	—	94.63

Table 3. Precision on the JSA Test Cases (in %)

The tool “Soot” represents the simple intra-procedural constant propagator included in the Soot framework for program analysis [2], which we include as a baseline. Each tool based on Soot has these values available immediately. Note that Soot already performs some transformations on the decompiled code in its Jimple [43] intermediate representation. Besides pure constant propagation, these include simple arithmetic operations and condition folding, i.e., evaluating conditions that only reference constant values. As shown in Table 3, this baseline approach was only able to recover 2 values of 246, which is less than 1%. While these two values were precise, the results show that simple constant propagation is insufficient for retrieving values from complex code.

The authors of Violist evaluate their tool only against a subset of the JSA test cases (80 of 246) [22]. Our evaluation of Violist is based on the complete set of all 246 JSA test cases. On several test cases, Violist timed out or ran out of memory even with a maximum Java Heap Size of 500 GB. Consequently, we report lower precision numbers on Violist than in the original paper. Violist has problems with test cases involving arrays and aliasing. Similarly, COAL also has problems with array test cases. We also noted that COAL returned an impossible value in the ExpressionSensitivity test case due to applying the effects of an unreachable assignment. In the MultiArrayAssignment test case that involves multi dimensional arrays, COAL overapproximates the result. StringHound does not handle calls to $\texttt {StringBuilder.append(Integer)}$ . In the $\texttt {IntOrString}$ test case, for example, the slicer aborts and returns no value. Harvester inserts fake generated strings for some test cases. This happens when the slicer fails to include necessary code in its slice. Note that to preserve the integrity of the application, Harvester does not actually remove any methods. Instead, an unused method is replaced with a stub that returns a default value as a constant.

COAL [32] only supports Java applications. We therefore converted our Android apps to Java bytecode using Soot. We used FlowDroid to generate a dummy main method, which contains a model of the Android ecosystem for a particular application. This approach is also used by FlowDroid to create a precise callgraph for Android apps. Since FlowDroid is based on Soot, we used Soot to output the dummy main class and the application code as Java class files, which were used as input for COAL. We further added IC3s Intent models to COAL so that intercomponent communication is supported.

Table 3 shows that ValDroid was able to find a correct value in almost 95% of all cases, which greatly exceeds all other tools. In more than 90%, ValDroid found a precise value. In other words, ValDroid not only finds more correct values than all other approaches in our evaluation, but its results are also more precise. ValDroid’s precision even exceeds dynamic approaches such as Harvester.

We did not evaluate DroidRA [24], IC3 on the JSA benchmarks. DroidRA only resolves reflection, which are not contained in the JSA benchmarks.

8.3 RQ2: Recall/Precision on Real-world Apps

For the experiments, we downloaded 1,000 randomly selected apps from the Google Play Store as of May 2020. We then defined a global set of 299 different Android API methods that process data relevant to a security analyst. The methods were taken from the following categories:

—

Dynamic loading of classes and native libraries

—

Java reflection⁴

—

Android inter-component communication

—

Executing external programs

—

File handling (open/create files)

—

SQL statement execution

—

Network connections (web views, Hypertext Transfer Protocol (HTTP) requests, and socket connections).

For each API method, we selected domain-specific parameters of interest such as the name of the dynamically loaded class, the path of the program being executed, or the target URL to which an HTTP connection is established. In each app, we identified all calls to these methods. We then configured the approaches under test to extract all possible values for the parameters of interest at these call sites. Table 4 shows the total number of API methods per category as well as the number of statements that call these methods. To not give any tool an unfair advantage, we made sure that all tools recognize these API calls as points of interest of their supported categories. Note that some tools such as Droid-RA are not general-purpose constant finders but optimized for a specific category. For these, we excluded the API calls of the other categories that are unsupported by the tools by design.

Table 4.

Category	APIs	Stmts
Dynamic Code Loading	43	2488
Java Reflection	7	87460
Inter-Component Communication (ICC)	37	3362
External Programs	10	17
File Handling	45	15733
SQL Commands	36	5579
Network Connections	121	5761

Table 4. Shows the Categories of Interest, How Many Different APIs Are within the Respective Category as Well as How Many Statements in Real-world Apps Call These APIs

To generate a ground truth, i.e., a set of values known to be correct, we instrumented all apps to report the actual values computed for a given call site and parameter at runtime. The instrumentation was conducted with a variant of the Harvester instrumenter that does not perform any slicing or execution and only writes the values into a database on the phone’s SD card. For 965 of our 1,000 apps, this approach returned a valid new instrumented app. We then manually exercised these 965 instrumented apps on Samsung XCover Pro devices running stock Android 8.1.0 for at least 10 minutes per app. If the app required the user to log in, then we created accounts. Additionally, we also ran the Monkey automated UI testing tool included with the Android SDK on each app for 10 minutes. Our goal was to maximize coverage on the app and retrieve as many some apps failed due to internal reasons, e.g., servers not being available, or could not be fully utilized without in-app payments. Note that coverage-based testing is best-effort and necessarily incomplete [9]. This means that there might be values missing due to some paths not being exercised at runtime. This is an inherent limitation to the used approach of generating ground-truth values. We removed all API call sites from our test configuration that were not reached while generating the ground truth, i.e., we did not query the approaches under test for these call sites, since we cannot evaluate the validity of results at these sites. We assumed the 470,084 values to be both correct and relevant. They constituted the final ground truth for our experiments. Our approach to compute the ground truth for the evaluation required significant manual labor and therefore highlights the necessity for efficient automated approaches for value analysis.

Furthermore, coverage-based testing does not guarantee the completeness of the values in our ground truth even if the respective call site was reached. Values might, for example, depend on specific device models or UI interaction patterns. We therefore manually spot-checked the values that were found by the approaches under test but that were not part of the ground truth. As we found, these values were almost entirely invalid, i.e., the theoretically incomplete ground truth did not impact the experimental results.

Manually reverse-engineering the apps to retrieve all possible values is not a suitable alternative, because of the size, complexity, and number of apps in our evaluation. Note that the instrumented apps were only used for generating the ground truth. The evaluation of ValDroid and the existing approaches was conducted on the original unmodified apps downloaded from the Google Play Store.

JSA, Violist, and DroidRA are based on Soot. The original software only processed the first dex file inside an app, which is Soot’s default configuration. However, modern Android apps usually consist of multiple dex files. Recent versions of Soot support reading in all of these files and merging them into the same data structures. We therefore updated the Soot library and added the multi-dex option to its configuration for these tools. According to our preliminary experiments, this change was essential for finding even the directly available constants inside the app.

Table 5 shows the per-category recall results for the 794 real-world apps in our dataset for which we were able to obtain a ground truth. For each category, the table shows the number of API methods for which we collected values in the second column (“APIs”). The third column (“Stmts”) gives the number of individual statements that call those APIs across all apps in our test set. We consider a value returned by a tool as correct if it matches one of the values in the ground truth for the respective statement and variable. If a tool did not support the respective category, then Table 5 shows a dash. Generic string finders such as JSA or Violist do not support complex objects such as class arrays that are used to denote a specific overload in a reflective method call. We therefore only compared the class and method names but not the type information in the “reflection” category. In other words, we accept that these tools cannot distinguish between multiple overloads of the target method when analyzing a reflective method call. Note that ValDroid does not have this limitation, because it can search for complex objects. Still, for fairness purposes, we only compared the class and method names in the evaluation.

Table 5.

Category	Soot	JSA	Hv	Viol	DRA	BS	SH	IC3	COAL	VD
Code Loading	0.87	3.52	22.42	1.54	—	2.74	0.99	—	11.61	67.53
Java Reflection	92.65	88.38	92.35	92.65	2.84	89.85	92.95	—	93.76	95.58
ICC Commun.	0.00	0.00	26.14	0.00	—	0.00	0.00	5.71	10.39	59.82
External Progs	18.75	0.00	41.18	50.00	—	0.00	20.00	—	33.33	93.75
File Handling	24.11	21.02	36.19	24.57	—	25.29	25.27	—	26.68	50.28
SQL Commands	61.08	66.87	76.28	74.66	—	55.88	60.94	—	63.79	83.59
Network	3.42	1.65	25.29	3.82	—	3.42	7.83	—	11.95	35.14
W. avg (Stmts)	73.47	70.20	77.75	74.19	2.06	71.38	74.04	5.71	75.65	84.63

Table 5. Recall on Real-world Apps (Weighted Average over Statements in %), Hv = Harvester, Viol = Violist, DRA = Droid-RA, BS = BlueSeal, SH = StringHound, VD = ValDroid

We ran each tool with a timeout of 30 minutes per category and app, i.e., each app gets analyzed once for every category with a 30 minute timeout. These 30 minutes were evenly distributed across all statements for which values were requested inside the respective app, so the time budget for each statement is 30 minutes divided by the number of statements in that category. If an analysis reached a timeout for a particular statement, then we counted the values that were extracted and reported until the timeout was reached and continued with the next statement. Quantitative data on timeouts and performance is discussed in Section 8.5.

ValDroid achieves the highest recall of all tested approaches as shown in Table 5. Note that the average recall in the last row is weighted according to the number of statements in each category. On average, more than 80% of its values are equal to the ground truth for ValDroid. For reflection values, ValDroid is able to correctly reconstruct more than 95% of all values. For external programs, ValDroid’s recall of 93.75% is especially remarkable, because these strings are not trivially available inside the app, as shown by the poor recall of our Soot-based baseline. Even in categories in which ValDroid scores low such as Network Connections (35%), it significantly outperforms all other approaches under test.

For ICC communication, only ValDroid, COAL, and IC3 returned valid Intents matching the ground-truth data. Note that we only consider an Intent as correct if its toString() output is equal to a value in the ground truth. Android’s toString() implementation prints out all fields required for matching (target class, mime type, action, etc.), while omitting unnecessary details such as clip data and extras. Therefore, the toString() result is a good measure of how suitable a given tool is for Intent resolving and IPC analysis. Our data suggest that reconstructing an Intent for proper matching is a highly non-trivial task.

We consider this benchmark as relevant in practice and fair to the tools. IC3, COAL, Harvester, and ValDroid are the only tools that support analyzing intercomponent communication. All of these tools are support Intents and compute all values on which the toString() implementation from the Android AOSP depends. BlueSeal does not support Intent actions or categories but instead the putExtra methods, which are not relevant for Intent resolving. Therefore, we did not run BlueSeal on the ICC category.

For network communications, many URIs are dynamically constructed using previous server responses, which are not available to a static tool. Furthermore, URIs are often constructed using complex operations and conditions. For example, URIs may contain device-specific parts to allow the server to collect diagnostic data. We have found several instances of such URIs (e.g., http://www.shoujiduoduo.com/...?mac=020000000000&dev=samsung%3Excoverproxx%3ESM-G715FN...) in the results of ValDroid. ValDroid models the APIs and returns the same information as the real device. Of our tested approaches, only Harvester supports those APIs.

In other examples, clients send timestamps or the like as part of a GET parameter. One could argue that URLs should be considered correct even if the GET parameters differ. However, a different GET parameter can lead to entirely different server responses, e.g., when an ID parameter differs. To alleviate this problem, we stick to our strict metric of checking the checking the whole URI.

Other approaches score significantly lower than ValDroid, with COAL being the second best with an recall of only about 11.95%.

While Table 5 shows how many correct values each tool is able to retrieve, we also evaluated how many of the extracted values were actually correct. For each point of interest, we computed how many of the reported results were contained in the ground truth. If a tool returns a value that is not part of the ground truth, then we consider this value a false positive. Table 6 presents the percentage of correct values for each category and tool. A dash shows that no result was reported by the tool in the respective category. Since Soot propagates constant values in a conservative manner, it has a precision of 100%. Most tools have an average precision over 70%, with ValDroid showing an average precision of over 87% on our dataset. We therefore conclude that ValDroid has superior recall, while providing similar precision as the best existing approaches.

Table 6.

Category	Soot	JSA	Hv	Viol	DRA	BS	SH	IC3	COAL	VD
Code Loading	100	100	60.50	12.87	—	50.00	95.45	—	79.30	78.80
Java Reflection	100	100	96.59	99.18	84.48	99.51	99.98	—	97.87	99.51
ICC Commun.	—	—	38.67	—	—	—	—	71.67	22.45	82.40
External Progs	100	—	100	66.67	—	—	100	—	33.33	93.75
File Handling	100	100	73.52	72.62	—	98.77	97.19	—	26.68	90.14
SQL Commands	100	100	45.25	45.18	—	99.32	99.38	—	92.73	97.79
Network	100	100	77.15	93.33	—	96.15	99.91	—	34.85	75.84
W. avg (Stmts)	100	100	87.76	90.91	84.48	98.19	99.47	71.67	82.81	96.17

Table 6. Precision on Real-world Apps (Weighted Average over Statements in %)

Finally, Table 7 combines recall and precision using the $F_1$ score. ValDroid achieves the highest $F_1$ score of $90.03 \%$ .

Table 7.

Category	Soot	JSA	Hv	Viol	DRA	BS	SH	IC3	COAL	VD
Code Loading	1.72	6.80	32.72	2.75	—	5.19	1.95	—	79.30	72.73
Java Reflection	96.18	93.83	94.42	95.80	5.49	94.43	96.33	—	97.87	97.51
ICC Commun.	—	—	31.19	—	—	—	—	10.57	22.45	69.32
External Progs	31.57	—	58.34	57.14	—	—	33.33	—	33.33	93.75
File Handling	38.85	34.73	48.50	36.71	—	40.26	40.11	—	26.68	64.55
SQL Commands	75.83	80.14	56.80	56.29	—	71.52	75.55	—	92.73	90.13
Network	6.61	3.24	38.09	7.33	—	6.60	14.52	—	34.85	48.03
W. avg (Stmts)	84.70	82.49	82.45	81.70	4.02	82.66	84.89	10.57	82.81	90.03

Table 7. F1-Score on Real-world Apps (Weighted Average over Statements in %)

Note that JSA is an automata based approach and returns regular expressions as as results. We explicitly excluded the generic .*, which does not inhibit any information. For the recall, we checked whether any value of the ground truth for that particular statement and parameter matches the regular expression. For precision, we likewise assumed that a regular expression is precise if and only if there is any value of the ground truth for that particular statement and parameter matches the regular expression. Note that this behavior gives JSA an advantage, since it allows for imprecise regular expressions to be correct. We found that some apps generate random values in some rare cases, e.g., as file names when writing temporary files, or as unique identifiers for new database entries. These impact JSA not as much as an approach that tries to find exact values.

8.4 RQ3: Recall on Malicious Apps

Apart from benign applications, we also tested ValDroid and the other tools on several malware sets, such as Malware Genome [49], CICMalAnalysis [19], MalDroid [28], and PRAGuard [29]. Most malware sets are structured into subsets. We sampled 100 apps for each subset. If a subset contained less than 100 apps, then we selected all apps from the respective subset. To obtain the ground truth, we used the technique explained in Section 8.3, i.e., a combination of manual and automatic exploration on an instrumented version of the app that writes out the values of interest at runtime. We used the same Samsung XCover Pro devices but disabled the Internet connection to limit any malicious behavior to the local device. While this may disable some behaviors in the malware apps, such as command-and-control schemes, it reduces the risk of the study. Enabling Internet access could allow the app to conduct attacks on third-party systems on the Internet.

Table 8 shows the recall of the tested approaches on our malware apps. Note that recall is usually more important than precision when investigating the behavior of malicious applications as a reverse-engineering task or when distinguishing malicious from benign behavior. Having a correct value is more useful for determining what actions the app may perform in the worst case, even if there are some spurious or wrong values. To make the numbers comparable for Droid-RA, which only supports reflection, we have listed the recall on all categories and the recall on reflection specifically in Table 9. For comparison with IC3 and COAL, we listed the recall on ICC in Table 10. Like in the real-world apps in Table 5, only IC3, COAL, and ValDroid were able to find any correct values, whereas all other tools did not deliver any useful results in the ICC category. While ValDroid outperforms IC3 and COAL significantly in the real-world apps and on most malicious datasets, IC3 is able to extract slightly more values in the MalDroid-Adware. Investigating the core cause of the results on MalDroid-Adware, and potentially improving ValDroid, remains as future work.

Table 8.

Set	Soot	JSA	Hv	Violist	BS	SH	COAL	VD
Malware (Genome, 2015)	29.98	31.51	29.98	34.09	30.67	32.46	25.59	84.98
CICMalAnalysis, 2017:
Adware	48.75	49.01	48.75	53.18	48.75	52.66	52.44	87.12
Ransomware	28.61	32.60	28.61	36.40	28.35	30.69	42.49	59.51
Scareware	60.57	60.57	60.57	63.72	61.39	61.34	62.13	86.04
SMS-Malware	62.82	62.91	62.82	64.53	61.74	64.23	67.93	89.61
MalDroid, 2020:
Adware	33.10	33.57	33.57	36.11	33.10	35.46	46.66	84.14
Banking	52.03	52.03	52.03	54.09	52.74	53.45	72.15	86.96
Riskware	55.85	55.92	55.85	62.14	53.62	58.75	68.28	89.43
PRAGuard, 2020:
Class Encryption	50.86	50.86	50.86	50.86	50.86	50.86	16.86	96.67
Reflection	98.53	98.55	98.53	98.73	98.53	98.53	99.25	99.62
String Encryption	0	0	0	0	0	20.40	0.4	68.68
Trivial	44.19	44.94	44.19	51.68	44.94	54.26	66.81	86.48
Trivial/String Encryption	0	0	23.04	0	0	21.05	0	67.75
Trivial/String Enc./Refl.	98.06	99.02	99.82	99.02	99.02	99.02	99.02	99.73

Table 8. Recall on Malicious Apps (per Value with Ground Truth in %)

Table 9.

Set	Soot	JSA	Hv	Viol.	BS	SH	D-RA	COAL	VD
Malware (Genome)	73.18	73.18	73.18	89.84	75.07	77.92	73.18	77.04	97.87
CICMalAnalysis:
Adware	83.59	83.59	83.59	86.19	83.59	86.49	84.00	87.84	92.30
Ransomware	57.86	68.75	57.86	71.59	57.86	64.03	57.08	66.69	81.44
Scareware	86.97	86.97	86.97	85.36	86.97	86.99	87.06	85.39	97.20
SMS-Malware	86.18	86.18	86.18	86.91	86.18	88.90	86.18	87.15	91.48
MalDroid, 2020:
Adware	84.34	84.34	84.34	86.30	84.34	84.34	84.34	86.90	90.43
Banking	90.64	90.64	90.64	91.18	90.64	92.16	90.64	99.41	99.51
Riskware	60.10	60.10	60.10	90.43	60.10	62.48	61.84	91.50	91.50
PRAGuard, 2020:
Class Encryption	46.51	46.51	46.51	46.51	46.51	46.51	17.87	17.87	91.24
Reflection	99.89	99.89	99.89	99.89	99.89	99.89	99.89	99.89	99.94
String Encryption	0	0	0	0	0	38.25	0	0	77.67
Trivial	56.85	56.85	56.85	99.71	56.85	56.85	56.85	100	100
Trivial/String Enc.	0	0	41.33	0	0	13.33	0	0	80.00
Trivial/Str. Enc./Refl.	99.92	99.92	99.92	99.92	99.92	99.96	99.92	99.92	100

Table 9. Recall on Malicious Apps in the Reflection Category (per Value with Ground Truth in %)

Table 10.

Set	Soot	JSA	Hv	Violist	BS	SH	IC3	COAL	VD
Malware (Genome, 2015)	0	0	22.22	0	0	0	52.48	72.91	83.33
CICMalAnalysis, 2017:
Adware	0	0	16.66	0	0	0	47.37	48.57	58.76
Ransomware	0	0	48.55	0	0	0	54.11	68.57	72.22
Scareware	0	0	8.98	0	0	0	19.21	19.21	80.00
SMS-Malware	0	0	31.42	0	0	0	34.37	23.34	51.04
MalDroid, 2020:
Adware	0	0	15.00	0	0	0	66.64	31.48	64.16
Banking	0	0	35.84	0	0	0	34.00	34.00	55.76
Riskware	0	0	77.25	0	0	0	57.33	50.59	76.47
PRAGuard, 2020:
Class Encryption	0	0	4.20	0	0	0	0	0	5.55
Reflection	0	0	18.00	0	0	0	2.91	0	18.13
String Encryption	0	0	21.16	0	0	0	0	13.38	36.66
Trivial	0	0	41.66	0	0	0	0	50	50
Trivial/String Enc.	0	0	28.57	0	0	0	0	0	57.14
Trivial/String Enc./Refl.	0	0	100	0	0	0	0	0	100

Table 10. Recall on Malicious Apps in the Intercomponent Communication Category (per Value with Ground Truth in %)

PRAGuard contains obfuscated malware Apps. For the string encryption categories, only StringHound, COAL, and ValDroid produce any correct results. Note that PRAGuard also features a subset called Trivial/String Encryption/Reflection/Class Encryption, but we were unable to gather any ground-truth values from them. We were unable to run the apps from this category on our Samsung XCover Pro phones and found that the obfuscation process transformed the applications into a non-working state. The Trivial/String Encryption/Reflection subset contains, despite its name, a significant number of constant values, i.e., non-obfuscated strings.

8.5 RQ4: Performance

For each approach under test, Table 11 shows the average runtime in seconds per value of interest approach timed out on an app we did not include the timings measured on this app into the average runtime or initialization times. Instead, we separately report the percentage of apps that timed out in Table 12. Since all value searches can be conducted independently of each other, we made sure to call all searches for each tool in parallel using as many threads as there were CPU cores on our host system (144).

Table 11.

Category	JSA	Hv	Viol.	Droid-RA	BS	SH	IC3	COAL	VD
Code Loading	3.34	17.61	0.02	—	5.51	0.04	—	1.02	3.26
Reflection	1.36	139.13	0.00	1.16	4.14	0.00	—	0.14	0.16
ICC Comm.	14.24	797.33	0.00	—	—	0.00	0.16	0.63	2.08
External Progs	—	62.05	0.01	—	1.25	0.01	—	5.38	12.17
File Handling	0.08	0.26	0.00	—	1.34	0.00	—	0.07	0.10
SQL Commands	29.14	2.73	0.00	—	3.29	0.00	—	0.15	0.14
Network	1.50	4.58	0.00	—	1.72	0.00	—	0.12	0.30
W. avg (Stmts)	2.88	124.08	0.00	1.16	—	0.00	0.16	0.16	0.27

Table 11. Performance on Real-world Apps in Seconds

Table 12.

Category	JSA	Harvester	D-RA	IC3	VD
Code Loading	48.55	25.16	—	—	0.00
Reflection	47.93	45.11	5.03	—	0.12
ICC Comm.	49.67	41.22	—	38.79	0.17
Ext. Progs	100	34.78	—	—	0.00
File Handling	48.39	31.06	—	—	0.93
SQL Commands	49.63	27.87	—	—	0.00
Network	48.48	51.39	—	—	2.79
W. avg (Stmts)	48.16	42.25	5.03	38.79	0.34

Table 12. Timeouts on Real-world Apps in Percentage

Many approaches under test contain an initialization phase in which they load the app, construct a callgraph, and so on. This initialization is independent of the values of interest and only depends on the target app. Therefore, we measured this initialization time separately and did not include it in the per-category averaged runtimes. We report on the initialization times in Table 13. A dash means that no explicit initialization is performed. ValDroid only times out in 0.36% of all cases.

Table 13.

Tool	Time (in seconds)
JSA	414
Harvester	148
Violist	23
Droid-RA	224
BlueSeal	—
StringHound	34
IC3	—
COAL	30
ValDroid	70

Table 13. Initialization Times on Real-world Apps in Seconds

We also found that the time required for the analysis greatly depends on the category, especially for complex objects such as Android Intents. Across all categories, we noticed that Harvester faced severe scalability issues on this dataset. Harvester only completed its analysis for about 12% of the apps in our real-world dataset without errors in the backwards slicing or execution phase. For those value searches for which it did not time out, Harvester required significantly more time than other approaches in some cases (e.g., 797 s for the “ICC Communication” category vs. 0.01 s for Violist or 2.08 s for ValDroid). The dynamic nature of Harvester forces the usage of an emulator or a real device, which is comparably slower than a workstation computer.

JSA encountered a high number of timeouts, leading to the low recall reported in Table 5. Since different points of interest are independent of each other, we ran multiple searches in parallel and made sure to run the initialization only once per analyzed app and tool.

Table 13 shows the times used for initialization. A dash means that no explicit initialization is performed.

ValDroid sliced 434 paths per search on average. The low number of paths is mainly due to the precise call graph obtained through FlowDroid and explains why ValDroid’s slicing does not suffer from path explosion. ValDroid completely executed only 228 of these paths (52%) on average. The other paths, i.e., the majority, were aborted prematurely, because ValDroid detected that a condition ( $\texttt {if}$ statement) could not be met. This path pruning not only avoids false positives but also explains why the approach scales well in practice.

8.6 RQ5: String Obfuscation

Obfuscation is a popular technique to hinder reverse-engineering. While this technique can protect the intellectual property of the developer, it also makes it harder for an analyst to perform a security assessment of the app. String encryption is an important obfuscation technique, where an automatic obfuscation tool takes an input program and transforms it into a semantically equivalent output program that contains no readable string constants. Instead, all strings are computed at runtime, sometimes using well-known cryptographic primitives such as Advanced Encryption Standard (AES). Required data, such as the encryption key, is hidden inside the app through special computation or encoding. Since the obfuscated app must remain self-contained, no external data are required to recover the strings. However, the process greatly increases the efforts for a human analyst and poses challenges for static analysis. Note that automatic string obfuscation of string constants satisfies the requirements for pseudo constants, since at runtime the original value of the string must be computed to not break the semantics of the programs.

We obfuscated the apps from our real-world dataset using the popular obfuscators DashO⁵ and Allatori.⁶ We would have liked to evaluate DexGuard as well, but the manufacturer denied our request to purchase the tool for evaluation. Obfuscators are usually applied before compiling the code to the dex file. We therefore decompiled all APKs to class files and then applied the obfuscators. DashO and Allatori both obfuscated 96 applications successfully. Table 14 shows the recall of each analysis tool on the obfuscated apps (column Recall). The recall is computed over only those analyses for which the analysis tool terminated successfully. The column Compl. (completed) shows on how many apps the respective tool terminated without errors within the time limit of three hours. ValDroid provides a recall of more than 85% on apps obfuscated with Allatori and more than 87% on apps obfuscated with DashO. Note that ValDroid is not specifically tailored for a specific type of string obfuscation. Instead, its normal approach is already able to recover many strings, just like any other value. The second best analyzer, StringHound, has around 27% recall on Allatori and 50% recall on DashO. Harvester has significantly less recall, around 7–8%.

Table 14.

Tool	Recall	Compl.	Recall	Compl.
	Allatori		DashO
Soot	0	96	0	96
JSA	0	6	0	6
Harvester	7.28	43	8.23	51
Violist	0.18	44	0.0	44
DroidRA	0.02	82	0.048	91
BlueSeal	0	11	0	6
StringHound	27.84	92	50.56	96
IC3	0	94	0	96
COAL	0	69	0	67
ValDroid	85.02	96	87.14	96

Table 14. Recall on Obfuscated Apps (in %)

8.7 RQ6: Library Models

ValDroid requires summaries to deal with calls to library methods. In this experiment, we evaluate the hit/miss rate of ValDroid on our database of summaries. We implemented summaries for commonly used utility classes from the Java Standard Library as well as the Android Framework. The implementation of library models is usually straightforward; models of Java classes forward calls to the actual Java class they model in many cases. Android methods such as getPackageName() replicate the behavior on the device by reading in the package name from the manifest of the app. The AssetManager.open(String) method is modelled by reading in the specified file from the assets folder and returning a value representing an input stream. Figure 4 shows the distribution and usage of implemented and unavailable library models. The left side of the figure (blue boxes) shows the distribution of calls for which a summary was available. To keep the presentation concise, we aggregate the calls by class. We find that ValDroid provides a model for 74.75% of all calls. Over 23% of all calls target java.lang.StringBuilder, followed by java.util.Iterator with over 11% of all calls. The right side of Figure 4 (red boxes) shows the calls for which ValDroid has no corresponding summary yet.

Fig. 4.

In total, 30 modelled classes are responsible for more than 70% of all calls to library classes. Many unmodelled classes are used very rarely, shown as rectangles with a small area in the figure (lower right corner of the red and blue areas).

We found that when only using models for $\texttt {String}$ and $\texttt {StringBuidler}$ , 30.89% of all values could still be found with these two models alone. We conclude that string analyzers should at least provide models for these two classes.

8.7.1 Using Limited Library Models.

Every tool provides a different set of library models. To allow for a fair comparison, we re-evaluated ValDroid with a dedicated configuration per competitor tool. In each such configuration, ValDroid only has access to library models for API methods for which the respective other tool also provides a model. With this evaluation, we analyze the impact of library models on the recall and precision of the respective tool in comparison to ValDroid. However, the precise implementations of the library models may still differ between the tools. Models can be more or less expressive or precise depending on the tool. Therefore, even if the same API is modeled, the value analysis may produce a different result.

The list of library models supported by each tool can be found in Appendix C. Note that Harvester only requires library models for alias analysis but not for the actual value computation. Hence, only a few methods that introduce special aliases such as $\texttt {StringBuilder.append}$ need to be modeled in Harvester. For aliasing, Harvester supports the same methods as ValDroid. Since Harvester is based on dynamic analysis, it can run any Android or JVM API on the phone. In total, we do not need a separate evaluation run for Harvester.

Soot does not use library models for its constant string propagation. Hence, no evaluation run is performed for Soot.

Table 15 shows the precision/recall of ValDroid when restricting the library models to the set of methods/fields supported by the other tools. For comparison, the table also shows the baseline performance of ValDroid with all library models enabled. This baseline value is equivalent to the respective value for ValDroid from Tables 6 and 5 (precision and recall on real-world apps).

Table 15.

Models of Tool	Precision	Recall	Precision orig. tool*	Recall orig. tool*
JSA	95.30%	83.51%	100%	76.20%
Violist	95.49%	84.17%	90.91%	74.19%
BlueSeal	95.54%	83.43%	98.19%	71.38%
StringHound	95.22%	84.46%	99.47%	74.04%
COAL	95.49%	83.76%	82.81%	75.65%
ValDroid (all Models)	96.17%	84.63%

Table 15. Precision/Recall for ValDroid When Restricting the Library Models to the Respective Tool’s Supported Models

The columns with an asterisk (*) denote the performance of the original tool as per Tables 5 and 6.

With the configurations of any general-purpose string finder, ValDroid’s recall always stays over 82%. As we have shown in Table 5, the recall of the other tools is less than 76%. Recall that we limited ValDroid to the library models supported by the other tools. Hence, only the precision and recall of ValDroid needs to be evaluated anew, whereas the values for the other tools remain unchanged.

When configuring ValDroid for IC3, we only evaluated on Intents (Table 16) and for Droid-RA on reflection, respectively (Table 17). Precision and Recall are within a few percentage points of the non-restricted baseline.

Table 16.

Models of Tool	Precision	Recall	Precision orig. tool*	Recall orig. tool*
IC3	78.57%	53.88%	71.67%	5.71%
ValDroid	82.40%	59.82%
(Intents using all models)

Table 16. Intents: Precision/Recall for ValDroid when Restricting the Library Models to the IC3 Supported Models

The columns with an asterisk (*) denote the performance of the original tool as per Tables 5 and 6.

Table 17.

Models of Tool	Precision	Recall	Precision orig. tool*	Recall orig. tool*
Droid-RA	99.51%	93.75%	84.48%	2.84%
ValDroid	99.51%	95.87%
(reflection using all models)

Table 17. Reflection: Precision/Recall for ValDroid when Restricting the Library Models to Droid-RAs Supported Models

The columns with an asterisk (*) denote the performance of the original tool as per Tables 5 and 6.

We conclude that, even when limiting the library models to the ones also supported by other approaches, ValDroid shows a better recall. This suggests that our approach offers a significant advantage even without our additional library models.

8.8 RQ7: Loops

Table 18 shows the percentage of value searches that required processing at least one loop. We report these statistics for the original apps as well as for the obfuscated versions that we generated using the two obfuscators DashO and Allatori. The values are averages across all values of interest in all categories. After string obfuscation, around 81% of all searches required loop handling. In each app of the obfuscated dataset, at least one search required loop support to reconstruct. Consequently, loop support is a requirement when analyzing obfuscated apps.

Table 18.

Obfuscation	Loop present (%)	Path length	Loop iterations	Executed loop stmts
None	24.91	228.10	159.46	2034.00
Allatori	80.43	146.72	74.92	2760.75
DashO	82.39	271.05	178.08	1782.86

Table 18. Statistics on the Presence of Loops within the Paths

Table 18 also presents the number of non-looping statements in the slice, i.e., the statements that are not part of a loop (“path length“). We find that obfuscating an application changes the average path length. This is usually due to the runtime deobfuscation code that the obfuscators need to inject to restore the original value at runtime. The number of statements executed within loops (“executed loop statements”) also differs between the original apps and their obfuscated counterparts. Note that these average executed loop statements divided by the average loop iterations result in the average length of loops. For computing the average path length, we did not count paths with length 0 (i.e., when the string constant is already present at the statement of interest). These string constants are precisely those that are obfuscated by string obfuscators. Depending on the implementation of the string obfuscator, the introduced string decryption routines may be more or less complex on average than other paths in the application. ValDroid was able to fully evaluate most loops, only triggering a cutoff in 0.97% of all searches. This shows that ValDroid does not depend on its cutoff but is able to emulate the precise condition that terminates the loop. ValDroid is more precise than traditional loop unrolling, which requires the number of loop iterations to be known in advance.

As a quantitative comparison, we replace ValDroid’s condition evaluator with standard loop unrolling and measure the decrease in precision. Since the correct loop counter is unknown a priori, we unroll each loop $x \in \lbrace 1, \ldots 10 \rbrace$ times and compute the union of all results. In this experiment, ValDroid’s precision dropped to 55.59%. When only unrolling the loop for exactly 10 times, and not computing the union over all loop counters from 1 to 10, the precision is 70.51%. In either case, the precision is lower than the 87.97% with ValDroid’s precise condition evaluation shown in Table 6. On average, ValDroid performed 159.46 loop iterations on the non-obfuscated applications. Loops were part of the value computation in 99.39% of the analyzed applications, as they were present in 24.91% of the searches. When performing the union approach for $x \in \lbrace 1, \ldots 200 \rbrace$ times, the average time to compute a value increased by a factor of approximately 15 to 1,455 ms on average. The recall is 85.95%, the precision is 64.85%, and the $F_1$ score is 73.92%. The high recall value shows that usually not many loop iterations are needed. Even with 200 unrollings, the recall is still slightly lower than with ValDroid’s precise loop handling.

9 Limitations

ValDroid can only analyze the Java portions of an Android app. If the app also contains native libraries, then this code cannot be sliced. Furthermore, no transformers are available for custom native methods that could be used to treat the native code like a library. Additionally, native code for Android is usually compiled for the Advanced RISC Machines (ARM) platform. Therefore, full execution of the referenced native methods would require interaction with an ARM-based Android emulator or phone. Note that in Android Apps, Java libraries are part of the application code itself, as the compiler stores the code of libraries along the application code. Therefore, Java libraries are not a limitation, since ValDroid processes them like regular application code.

Moreover, in its current state, ValDroid does not compute values that depend on external data outside of the application. If the app, for example, loads parts of a string from the internet, then this part is unavailable. Depending on the configuration, ValDroid can abort the analysis or insert a dummy value. The latter is useful, e.g., for URLs, where an analyst might not require each part of the path on the webserver, but might only be interested whether the connection is made via HTTP or HTTPS.

For the inter-procedural analysis, ValDroid relies on existing callgraph algorithms such as the one integrated with FlowDroid [3], which is in turn based on Soot’s SPARK [21]. If the chosen algorithm fails to capture complex callback semantics for example, then the slicer might not be able to capture all dependencies of a given statement.

ValDroid cannot reverse code packing techniques or obfuscators that rely on native code. We assume that the app code is not encrypted and that all functionality required for computing a value of interest is implemented in the Dalvik code.

10 Conclusion

In this article, we have presented ValDroid, an approach for automatically extracting complex values from Android apps. ValDroid is based on statically slicing the app and then simulating the effects of the statements in the slice. For handling library methods, ValDroid relies on a comprehensive database of external models. As we have shown using experiments on benchmarks as well as 794 real-world apps from the Google Play Store, ValDroid is able to extract even such values that are computed at runtime. Our evaluation also showed that ValDroid’s recall of more than 83% greatly exceeds the capabilities of other approaches currently known in the literature, while taking only 0.1 s per value on average. On important categories such as Android Intents, external program launches, Dynamic Code Loading, and URLs, ValDroid achieves more than twice the recall of the second best tool in our evaluation. ValDroid features precise handling of loop conditions and does therefore not require assumptions on the number of iterations as in traditional loop unrolling. Loops bodies are extracted and simulated obeying the precise semantics of the original program. Our approach can reconstruct values even from obfuscated apps that perform deliberately complex deobfuscations at runtime to restore the original values.

As future work, we plan to further automate the generation of library models such that more parts of the Android framework and other popular Java libraries are covered.

11 Availability

The implementation of ValDroid is part of the VUSC⁷ commercial vulnerability scanner. We are currently establishing an academic initiative to provide non-profit research institutions with free access to the source code of the implementation under a license that enables research while protecting our commercial interests in the scanner product.

Acknowledgment

We thank Siegfried Rasthofer for his valuable input.

Footnotes

Android API documentation: https://developer.android.com/reference/android/content/Intent

As explained in Section 4.6, we assume collision-free naming

The Jimple IR that ValDroid’s implementation is based on mimics the bytecode structure for object allocation. Hence, the same considerations apply to Jimple and Dalvik bytecode.

⁴

This is a separate category and not included in “Dynamic Code Loading,” because some tools only process the reflection APIs and no other values of interest.

⁵

https://www.preemptive.com/products/dasho/overview

⁶

http://www.allatori.com/

⁷

https://www.sit.fraunhofer.de/en/offers/projekte/vusc/

A Full Example Jimple Listing

The following is a full representation of the motivational sample in Jimple. Note that to not clutter the representation the in the CFG in Figure 3, we have merged the lines containing the invocations to $\texttt {Bundle.get}$ and $\texttt {Iterator.next}$ with the casts of their return values to strings.

Listing 7.

B Recall and Precision On Non-timeouted Apps

Since some of the compared tools (especially JSA) tend to have a high number of timeouted applications, we have run an additional evaluation on non-timeouted apps. Note that only 95 apps managed to complete without timeouts on all tools. Table 19 shows the recall and Table 20 the precision on these apps. ValDroid shows a higher recall than on all apps, presumably because complicated apps are excluded.

Table 19.

Category	Soot	JSA	HV	Viol.	D-RA	BS	SH	IC3	COAL	VD
Code Loading	0.00	2.47	22.34	0.46	—	0.00	0.01	—	11.56	67.88
Java Reflection	93.01	93.01	93.97	93.03	3.12	92.19	93.26	—	93.75	95.58
ICC Commun.	0.00	0.00	40.54	0.00	—	0.00	0.00	8.21	10.36	59.91
External Progs	12.5	12.5	12.5	50.0	—	12.5	12.5	—	33.33	93.75
File Handling	24.53	24.55	36.50	25.00	—	25.17	25.67	—	26.70	50.27
SQL Comm.	90.59	90.59	91.05	92.39	—	88.58	90.77	—	63.84	83.63
Network	8.66	8.66	24.72	9.17	—	9.17	8.83	—	12.02	35.02
W. avg (Stmts)	77.94	77.99	80.96	78.49	3.12	77.08	78.25	8.21	72.71	83.98

Table 19. Recall on Non-timeouted Real-world Apps (Weighted Average over Statements in %), SH = StringHound, VD = ValDroid

Table 20.

Category	Soot	JSA	Hv	Viol.	D-RA	BS	SH	IC3	COAL	VD
Code Loading	—	100	90.5	6.58	—	—	100	—	22.31	89.06
Java Reflection	100	100	100	99.02	99.19	99.73	99.98	—	97.90	99.56
ICC Commun.	—	—	28.14	—	—	—	—	74.53	41.70	82.41
External Progs	100	—	100	80.00	—	—	100	—	100	85.71
File Handling	100	100	99.34	70.22	—	98.63	95.95	—	80.04	90.11
SQL Comm.	100	100	99.89	83.07	—	99.89	99.89	—	92.71	97.90
Network	100	100	99.14	47.40	—	98.68	75.61	—	34.77	75.77
W. avg (Stmts)	100	100	97.71	86.14	99.19	95.88	96.09	74.53	89.22	96.47

Table 20. Precision on Non-timeouted Real-world Apps (Weighted Average over Statements in %)

C Modelled Library Functions Per Tool

C.1 JSA

java.lang.String. Constructors, $\texttt {toString}$ , $\texttt {intern}$ , $\texttt {concat}$ , $\texttt {replace}$ , $\texttt {trim}$ , $\texttt {substring}$ , $\texttt {toLowerCase}$ , $\texttt {toUpperCase}$ , $\texttt {split}$ , $\texttt {charAt}$ , $\texttt {contains}$ , $\texttt {contentEquals}$ , $\texttt {toCharArray}$

java.lang.StringBuffer/java.lang.StringBuilder. Constructors, $\texttt {toString}$ , $\texttt {append}$ , $\texttt {insert}$ , $\texttt {delete}$ , $\texttt {deleteCharAt}$ , $\texttt {replace}$ , $\texttt {reverse}$ , $\texttt {setCharAt}$ , $\texttt {setLength}$ , $\texttt {substring}$

java.lang.Object. $\texttt {equals}$ , $\texttt {hashCode}$ , $\texttt {getClass}$

C.2 Violist

java.lang.String. Constructors, $\texttt {replaceAll}$ , $\texttt {replaceFirst}$ , $\texttt {replace}$ , $\texttt {toLowerCase}$ , $\texttt {toUpperCase}$ , $\texttt {trim}$ java.lang.StringBuilder. Constructors, $\texttt {append}$ java.net.URLEncoder. $\texttt {encode}$

C.3 Droid-RA

dalvik.system.BaseDexClassLoader. Constructors dalvik.system.DexClassLoader. Constructors dalvik.system.PathClassLoader. Constructors java.lang.Class. $\texttt {defineClass}$ , $\texttt {findClass}$ , $\texttt {findLoadedClass}$ , $\texttt {findSystemClass}$ , $\texttt {loadClass}$ , $\texttt {desiredAssertionStatus}$ , $\texttt {equals}$ , $\texttt {isAnnotation}$ , $\texttt {isAnnotationPresent}$ , $\texttt {isAnonymousClass}$ , $\texttt {isArray}$ , $\texttt {isAssignableFrom}$ , $\texttt {isEnum}$ , $\texttt {isInstance}$ , $\texttt {isInterface}$ , $\texttt {isLocalClass}$ , $\texttt {isMemberClass}$ , $\texttt {isPrimitive}$ , $\texttt {isSynthetic}$ , $\texttt {getModifiers}$ , $\texttt {hashCode}$ , $\texttt {getResourceAsStream}$ , $\texttt {getAnnotation}$ , $\texttt {getAnnotations}$ , $\texttt {getDeclaredAnnotations}$ , $\texttt {asSubclass}$ , $\texttt {forName}$ , $\texttt {getClass}$ , $\texttt {getComponentType}$ , $\texttt {getDeclaringClass}$ , $\texttt {getEnclosingClass}$ , $\texttt {getSuperclass}$ , $\texttt {getClasses}$ , $\texttt {getDeclaredClasses}$ , $\texttt {getInterfaces}$ , $\texttt {getClassLoader}$ , $\texttt {cast}$ , $\texttt {newInstance}$ , $\texttt {getEnumConstants}$ , $\texttt {getSigners}$ , $\texttt {getPackage}$ , $\texttt {getConstructor}$ , $\texttt {getDeclaredConstructor}$ , $\texttt {getEnclosingConstructor}$ , $\texttt {getConstructors}$ , $\texttt {getDeclaredConstructors}$ , $\texttt {getDeclaredField}$ , $\texttt {getField}$ ,

$\texttt {getDeclaredFields}$ , $\texttt {getFields}$ , $\texttt {getDeclaredMethod}$ , $\texttt {getEnclosingMethod}$ , $\texttt {getMethod}$ ,

$\texttt {getDeclaredMethods}$ , $\texttt {getMethods}$ , $\texttt {getGenericSuperclass}$ , $\texttt {getGenericInterfaces}$ ,

$\texttt {getTypeParameters}$ , $\texttt {getCanonicalName}$ , $\texttt {getName}$ , $\texttt {getSimpleName}$ , $\texttt {toString}$ , $\texttt {getResource}$ ,

$\texttt {getProtectionDomain}$ , $\texttt {notify}$ , $\texttt {notifyAll}$ , $\texttt {wait}$ , java.lang.reflect.Constructor. $\texttt {equals}$ , $\texttt {isAccessible}$ , $\texttt {isAnnotationPresent}$ , $\texttt {isSynthetic}$ , $\texttt {isVarArgs}$ , $\texttt {getModifiers}$ , $\texttt {hashCode}$ , $\texttt {getAnnotation}$ , $\texttt {getAnnotations}$ , $\texttt { getDeclaredAnnotations}$ ,

$\texttt {getParameterAnnotations}$ , $\texttt {getClass}$ , $\texttt {getDeclaringClass}$ , $\texttt {getExceptionTypes}$ , $\texttt {getParameterTypes}$ , $\texttt {newInstance}$ ,

$\texttt {getGenericExceptionTypes}$ , $\texttt {getGenericParameterTypes}$ , $\texttt {getTypeParameters}$ , $\texttt {getName}$ ,

$\texttt {toGenericString}$ , $\texttt {toString}$ , $\texttt {notify}$ , $\texttt {notifyAll}$ , $\texttt {setAccessible}$ , $\texttt {wait}$ , java.lang.reflect.Field. $\texttt {equals}$ , $\texttt {getBoolean}$ , $\texttt {isAccessible}$ , $\texttt {isAnnotationPresent}$ , $\texttt {isEnumConstant}$ , $\texttt {isSynthetic}$ , $\texttt {getByte}$ , $\texttt {getChar}$ , $\texttt {getDouble}$ , $\texttt {getFloat}$ , $\texttt {getInt}$ , $\texttt {getModifiers}$ , $\texttt {hashCode}$ ,

$\texttt {getAnnotation}$ , $\texttt {getAnnotations}$ , $\texttt {getDeclaredAnnotations}$ , $\texttt {getClass}$ , $\texttt {getDeclaringClass}$ , $\texttt {getType}$ , $\texttt {get}$ ,

$\texttt {getGenericType}$ , $\texttt {getName}$ , $\texttt {toGenericString}$ , $\texttt {toString}$ , $\texttt {getLong}$ , $\texttt {getShort}$ , $\texttt {notify}$ , $\texttt {notifyAll}$ , $\texttt {set}$ , $\texttt {setAccessible}$ , $\texttt {setBoolean}$ , $\texttt {setByte}$ , $\texttt {setChar}$ , $\texttt {setDouble}$ , $\texttt {setFloat}$ , $\texttt {setInt}$ , $\texttt {setLong}$ , $\texttt {setShort}$ , $\texttt {wait}$ , java.lang.reflect.Method. $\texttt {equals}$ , $\texttt {isAccessible}$ , $\texttt {isAnnotationPresent}$ , $\texttt {isBridge}$ , $\texttt {isSynthetic}$ , $\texttt {isVarArgs}$ , $\texttt {getModifiers}$ , $\texttt {hashCode}$ , $\texttt {getAnnotation}$ , $\texttt {getAnnotations}$ , $\texttt {getDeclaredAnnotations}$ , $\texttt {getParameterAnnotations}$ , $\texttt {getClass}$ , $\texttt {getDeclaringClass}$ , $\texttt {getReturnType}$ , $\texttt {getExceptionTypes}$ , $\texttt {getParameterTypes}$ , $\texttt {getDefaultValue}$ , $\texttt {invoke}$ , $\texttt {getGenericReturnType}$ , $\texttt {getGenericExceptionTypes}$ , $\texttt {getGenericParameterTypes}$ , $\texttt {getTypeParameters}$ , $\texttt {getName}$ , $\texttt {toGenericString}$ , $\texttt {toString}$ , $\texttt {notify}$ , $\texttt {notifyAll}$ , $\texttt {setAccessible}$ , $\texttt {wait}$

C.4 BlueSeal

java.lang.StringBuilder/java.lang.StringBuffer. $\texttt {append}$ , $\texttt {charAt}$ , $\texttt {delete}$ , $\texttt {deleteCharAt}$ , $\texttt {getChars}$ , $\texttt {indexOf}$ , $\texttt {insert}$ , $\texttt {replace}$ , $\texttt {setCharAt}$ , $\texttt {subSequence}$ , $\texttt {subString}$ , $\texttt {delete}$

java.lang.String. $\texttt {charAt}$ , $\texttt {concat}$ , $\texttt {copyValueOf}$ , $\texttt {format}$ , $\texttt {intern}$ , $\texttt {join}$ , $\texttt {replace}$ , $\texttt {split}$ , $\texttt {subSequence}$ , $\texttt {subString}$ , $\texttt { toCharArray}$ , $\texttt {toLowerCase}$ , $\texttt {toUpperCase}$ , $\texttt {valueOf}$

android.content.Intent. $\texttt {putExtra}$

C.5 IC3 and COAL

android.os.Bundle. Constructors, $\texttt {clear}$ , $\texttt {putAll}$ , $\texttt {putBoolean}$ , $\texttt {putBooleanArray}$ , $\texttt {putBundle}$ , $\texttt {putByte}$ , $\texttt {putByteArray}$ , $\texttt {putChar}$ , $\texttt {putCharArray}$ , $\texttt {putCharSequence}$ , $\texttt {putCharSequenceArray}$ , $\texttt {putCharSequenceArrayList}$ , $\texttt {putDouble}$ , $\texttt {putDoubleArray}$ , $\texttt {putFloat}$ , $\texttt {putFloatArray}$ , $\texttt {putInt}$ , $\texttt {putIntArray}$ , $\texttt {putIntegerArrayList}$ , $\texttt {putLong}$ , $\texttt {putLongArray}$ , $\texttt {putParcelable}$ , $\texttt {putParcelableArray}$ , $\texttt {putParcelableArrayList}$ , $\texttt {putSerializable}$ , $\texttt {putShort}$ , $\texttt {putShortArray}$ , $\texttt {putSparseParcelableArray}$ , $\texttt {putString}$ , $\texttt {putStringArray}$ , $\texttt {putStringArrayList}$ , $\texttt {remove}$ , $\texttt {get}$ , $\texttt {getBinder}$ , $\texttt {getBoolean}$ ,

$\texttt {getBooleanArray}$ , $\texttt {getBundle}$ , $\texttt {getByte}$ , $\texttt {getByteArray}$ , $\texttt {getChar}$ , $\texttt {getCharArray}$ , $\texttt {getCharSequence}$ , $\texttt {getCharSequenceArray}$ , $\texttt {getCharSequenceArrayList}$ , $\texttt {getDouble}$ , $\texttt {getDoubleArray}$ , $\texttt {getFloat}$ , $\texttt {getFloatArray}$ , $\texttt {getInt}$ , $\texttt {getIntArray}$ , $\texttt {getIntegerArrayList}$ , $\texttt {getLong}$ , $\texttt {getLongArray}$ , $\texttt {getParcelable}$ , $\texttt {getParcelableArrayList}$ , $\texttt {getSerializable}$ , $\texttt {getShort}$ , $\texttt {getShortArray}$ , $\texttt {getSparseParcelableArray}$ , $\texttt {getString}$ , $\texttt {getStringArray}$ , $\texttt {getStringArrayList}$ android.content.ComponentName. Constructors, $\texttt {getClassName}$ , $\texttt {getPackageName}$ , $\texttt {getShortClassName}$ android.content.IntentFilter. Constructors, $\texttt {addAction}$ , $\texttt {addCategory}$ , $\texttt {addDataAuthority}$ , $\texttt {addDataPath}$ , $\texttt {addDataScheme}$ , $\texttt {addDataType}$ , $\texttt {setPriority}$ , $\texttt {create}$ , $\texttt {getPriority}$ android.content.Context. $\texttt {registerReceiver}$ , $\texttt {startActivities}$ , $\texttt {bindService}$ , $\texttt {sendBroadcast}$ , $\texttt {sendBroadcastAsUser}$ , $\texttt {sendOrderedBroadcast}$ , $\texttt {sendOrderedBroadcastAsUser}$ , $\texttt {sendStickyBroadcast}$ , $\texttt {sendStickyBroadcastAsUser}$ , $\texttt {sendStickyOrderedBroadcast}$ , $\texttt {sendStickyOrderedBroadcastAsUser}$ , $\texttt {startActivity}$ , $\texttt {startActivityForResult}$ , $\texttt {startActivityFromChild}$ , $\texttt {startActivityFromFragment}$ , $\texttt {startActivityIfNeeded}$ , $\texttt {startService}$ android.content.Intent. Constructors, $\texttt {addCategory}$ , $\texttt {addFlags}$ , $\texttt {fillIn}$ ,

$\texttt {putCharSequenceArrayListExtra}$ , $\texttt {putExtra}$ , $\texttt {putExtras}$ , $\texttt {putIntegerArrayListExtra}$ ,

$\texttt {putParcelableArrayListExtra}$ , $\texttt {putStringArrayListExtra}$ , $\texttt {removeCategory}$ , $\texttt {removeExtra}$ ,

$\texttt {replaceExtras}$ , $\texttt {setAction}$ , $\texttt {setClass}$ , $\texttt {setClassName}$ , $\texttt {setComponent}$ , $\texttt {setData}$ , $\texttt {setDataAndType}$ , $\texttt {setDataAndTypeAndNormalize}$ , $\texttt {setFlags}$ , $\texttt {setPackage}$ , $\texttt {setType}$ , $\texttt {setTypeAndNormalize}$ , $\texttt {createChooser}$ , $\texttt {makeMainActivity}$ , $\texttt {getType}$ , $\texttt {getAction}$ , $\texttt {getCategories}$ , $\texttt {getComponentName}$ , $\texttt {getData}$ , $\texttt {getExtras}$ , $\texttt {getFlags}$ , $\texttt {getPackage}$ , $\texttt {getScheme}$ , $\texttt {toString}$ , $\texttt {getBooleanArrayExtra}$ , $\texttt {getBooleanExtra}$ ,

$\texttt {getBundleExtra}$ , $\texttt {getByteArrayExtra}$ , $\texttt {getByteExtra}$ , $\texttt {getCharArrayExtra}$ , $\texttt {getCharExtra}$ ,

$\texttt {getCharSequenceExtra}$ , $\texttt {getDoubleArrayExtra}$ , $\texttt {getDoubleExtra}$ , $\texttt {getFloatArrayExtra}$ ,

$\texttt {getFloatExtra}$ , $\texttt {getIntArrayExtra}$ , $\texttt {getIntExtra}$ ,

$\texttt {getIntegerArrayListExtra}$ , $\texttt {getLongArrayExtra}$ , $\texttt {getLongExtra}$ , $\texttt {getParcelableArrayExtra}$ , $\texttt {getParcelableArrayListExtra}$ , $\texttt {getParcelableExtra}$ , $\texttt {getSerializableExtra}$ , $\texttt {getShortArrayExtra}$ , $\texttt {getShortExtra}$ , $\texttt {getStringArrayExtra}$ , $\texttt {getStringArrayListExtra}$ , $\texttt {getStringExtra}$ android.content.pm.PackageManager. $\texttt { getLaunchIntentForPackage}$ android.app.Activity . $\texttt {getIntent}$ android.app.PendingIntent. $\texttt {getActivity}$ , $\texttt {getBroadcast}$ , $\texttt {getService}$ , $\texttt {send}$ , $\texttt {getCreatorPackage}$ , $\texttt {getTargetPackage}$ android.content.ContentUris. $\texttt {appendId}$ , $\texttt {withAppendedId}$ android.net.Uri$Builder. Constructors, $\texttt {appendEncodedPath}$ , $\texttt {appendPath}$ , $\texttt {appendQueryParameter}$ , $\texttt {authority}$ , $\texttt {clearQuery}$ , $\texttt {encodedAuthority}$ , $\texttt {encodedFragment}$ , $\texttt {encodedOpaquePart}$ , $\texttt {encodedPath}$ , $\texttt {encodedQuery}$ , $\texttt {fragment}$ , $\texttt {opaquePart}$ , $\texttt {path}$ , $\texttt {query}$ , $\texttt {scheme}$ , $\texttt {build}$ android.net.Uri. $\texttt {withAppendedPath}$ , $\texttt {fromFile}$ , $\texttt {fromParts}$ , $\texttt {parse}$ , $\texttt {buildUpon}$ , $\texttt {normalizeScheme}$ , $\texttt {getAuthority}$ , $\texttt {getEncodedAuthority}$ , $\texttt {getEncodedFragment}$ , $\texttt {getEncodedPath}$ , $\texttt {getEncodedQuery}$ , $\texttt {getEncodedSchemeSpecificPart}$ , $\texttt {getFragment}$ , $\texttt {getPath}$ , $\texttt {getQuery}$ , $\texttt {getScheme}$ , $\texttt {getSchemeSpecificPart}$ , $\texttt {EMPTY}$ ${\it {android.provider.MediaStore\$Audio\$Media}}$ . $\texttt {getContentUriForPath}$ , $\texttt {INTERNAL}\_\texttt {CONTENT}\_\texttt {URI}$ , $\texttt {EXTERNAL}\_\texttt {CONTENT}\_\texttt {URI}$ android.content.ContentResolver. $\texttt { acquireContentProviderClient}$ , $\texttt { acquireUnstableContentProviderClient}$ , $\texttt {applyBatch}$ , $\texttt {bulkInsert}$ , $\texttt {call}$ , $\texttt {delete}$ , $\texttt {insert}$ , $\texttt {notifyChange}$ , $\texttt {query}$ , $\texttt {registerContentObserver}$ , $\texttt {update}$ ${\it {android.provider.Browser}}$ . $\texttt {getContentUriForPath}$ , $\texttt {INTERNAL}\_\texttt {CONTENT}\_\texttt {URI}$ , $\texttt {EXTERNAL}\_\texttt {CONTENT}\_\texttt {URI}$ android.content.ContentResolver. $\texttt { acquireContentProviderClient}$ , $\texttt { acquireUnstableContentProviderClient}$ , $\texttt {applyBatch}$ , $\texttt {bulkInsert}$ , $\texttt {call}$ , $\texttt {delete}$ , $\texttt {insert}$ , $\texttt {notifyChange}$ , $\texttt {query}$ , $\texttt {registerContentObserver}$ , $\texttt {update}$ . $\texttt {BOOKMARKS}\_\texttt {URI}$ , $\texttt {SEARCHES}\_\texttt {URI}$ ${\it {android.provider.MediaStore\$Images\$Thumbnails}}$ . $\texttt {INTERNAL}\_\texttt {CONTENT}\_\texttt {URI}$ , $\texttt {EXTERNAL}\_\texttt {CONTENT}\_\texttt {URI}$ android.provider.MediaStore$Images$Media. $\texttt {INTERNAL}\_\texttt {CONTENT}\_\texttt {URI}$ , $\texttt {EXTERNAL}\_\texttt {CONTENT}\_\texttt {URI}$ android.provider.MediaStore$Audio$Albums. $\texttt {INTERNAL}\_\texttt {CONTENT}\_\texttt {URI}$ , $\texttt {EXTERNAL}\_\texttt {CONTENT}\_\texttt {URI}$ android.provider.MediaStore$Audio$Artists. $\texttt {EXTERNAL}\_\texttt {CONTENT}\_\texttt {URI}$ , $\texttt {INTERNAL}\_\texttt {CONTENT}\_\texttt {URI}$ android.provider.MediaStore$Audio$Genres. $\texttt {EXTERNAL}\_\texttt {CONTENT}\_\texttt {URI}$ , $\texttt {INTERNAL}\_\texttt {CONTENT}\_\texttt {URI}$ android.provider.MediaStore$Audio$Playlists. $\texttt {EXTERNAL}\_\texttt {CONTENT}\_\texttt {URI}$ , $\texttt {INTERNAL}\_\texttt {CONTENT}\_\texttt {URI}$ android.provider.MediaStore$Video$Thumbnails. $\texttt {EXTERNAL}\_\texttt {CONTENT}\_\texttt {URI}$ , $\texttt {INTERNAL}\_\texttt {CONTENT}\_\texttt {URI}$ android.provider.MediaStore$Video$Media. $\texttt {INTERNAL}\_\texttt {CONTENT}\_\texttt {URI}$ , $\texttt {EXTERNAL}\_\texttt {CONTENT}\_\texttt {URI}$ android.provider.Settings$Bookmarks. $\texttt {CONTENT}\_\texttt {URI}$ android.provider.Settings$Secure. $\texttt {CONTENT}\_\texttt {URI}$ android.provider.Settings$System. $\texttt {DEFAULT}\_\texttt {RINGTONE}\_\texttt {URI}$ ,

$\texttt {CONTENT}\_\texttt {URI}$ , $\texttt {DEFAULT}\_\texttt {ALARM}\_\texttt {ALERT}\_\texttt {URI}$ , $\texttt {DEFAULT}\_\texttt {NOTIFICATION}\_\texttt {URI}$ android.provider.BrowserContract. $\texttt {AUTHORITY}\_\texttt {URI}$ android.provider.BrowserContract$Accounts. $\texttt {CONTENT}\_\texttt {URI}$ android.provider.BrowserContract$Settings. $\texttt {CONTENT}\_\texttt {URI}$ android.provider.BrowserContract$SyncState. $\texttt {CONTENT}\_\texttt {URI}$ android.provider.BrowserContract$Images. $\texttt {CONTENT}\_\texttt {URI}$ android.provider.BrowserContract$Bookmarks. $\texttt {CONTENT}\_\texttt {URI}$ , $\texttt {DEFAULT}\_\texttt {FOLDER}$ android.provider.BrowserContract$History. $\texttt {CONTENT}\_\texttt {URI}$ android.provider.BrowserContract$Searches. $\texttt {CONTENT}\_\texttt {URI}$ android.provider.BrowserContract$Combined. $\texttt {CONTENT}\_\texttt {URI}$ android.provider.ContactsContract. $\texttt {AUTHORITY}\_\texttt {URI}$ android.provider.ContactsContract$RawContacts. $\texttt {CONTENT}\_\texttt {URI}$ android.provider.ContactsContract$Directory. $\texttt {CONTENT}\_\texttt {URI}$ android.provider.ContactsContract$SyncState. $\texttt {CONTENT}\_\texttt {URI}$ android.provider.ContactsContract$ProfileSyncState. $\texttt {CONTENT}\_\texttt {URI}$ android.provider.ContactsContract$Contacts. $\texttt {CONTENT}\_\texttt {STREQUENT}\_\texttt {URI}$ , $\texttt {CONTENT}\_\texttt {URI}$ , $\texttt {FILTER}\_\texttt {URI}$ , $\texttt {CONTENT}\_\texttt {FILTER}\_\texttt {URI}$ , $\texttt {CONTENT}\_\texttt {LOOKUP}\_\texttt {URI}$ , $\texttt {CONTENT}\_\texttt {VCARD}\_\texttt {URI}$ , $\texttt {CONTENT}\_\texttt {MULTI}\_\texttt {VCARD}\_\texttt {URI}$ , $\texttt {CONTENT}\_\texttt {GROUP}\_\texttt {URI}$ , $\texttt {CONTENT}\_\texttt {FREQUENT}\_\texttt {URI}$ android.provider.ContactsContract$Profile. $\texttt {CONTENT}\_\texttt {RAW}\_\texttt {CONTACTS}\_\texttt {URI}$ , $\texttt {CONTENT}\_\texttt {URI}$ ,

$\texttt {CONTENT}\_\texttt {VCARD}\_\texttt {URI}$ android.provider.ContactsContract$StreamItems. $\texttt {CONTENT}\_\texttt {URI}$ , $\texttt {CONTENT}\_\texttt {PHOTO}\_\texttt {URI}$ ,

$\texttt {CONTENT}\_\texttt {LIMIT}\_\texttt {URI}$ android.provider.ContactsContract$Data. $\texttt {CONTENT}\_\texttt {URI}$ android.provider.ContactsContract$RawContactsEntity. $\texttt {CONTENT}\_\texttt {URI}$ , $\texttt {PROFILE}\_\texttt {CONTENT}\_\texttt {URI}$ android.provider.ContactsContract$PhoneLookup. $\texttt {CONTENT}\_\texttt {FILTER}\_\texttt {URI}$ android.provider.ContactsContract$StatusUpdates. $\texttt {CONTENT}\_\texttt {URI}$ , $\texttt {PROFILE}\_\texttt {CONTENT}\_\texttt {URI}$ android.provider.ContactsContract$CommonDataKinds$Phone. $\texttt {CONTENT}\_\texttt {FILTER}\_\texttt {URI}$ , $\texttt {CONTENT}\_\texttt {URI}$ android.provider.ContactsContract$CommonDataKinds$Email. $\texttt {CONTENT}\_\texttt {LOOKUP}\_\texttt {URI}$ ,

$\texttt {CONTENT}\_\texttt {FILTER}\_\texttt {URI}$ , $\texttt {CONTENT}\_\texttt {URI}$ android.provider.ContactsContract$CommonDataKinds$StructuredPostal. $\texttt {CONTENT}\_\texttt {URI}$ android.provider.ContactsContract$Groups. $\texttt {CONTENT}\_\texttt {URI}$ , $\texttt {CONTENT}\_\texttt {SUMMARY}\_\texttt {URI}$ android.provider.ContactsContract$AggregationExceptions. $\texttt {CONTENT}\_\texttt {URI}$ android.provider.ContactsContract$Settings. $\texttt {CONTENT}\_\texttt {URI}$ android.provider.ContactsContract$ProviderStatus. $\texttt {CONTENT}\_\texttt {URI}$ android.provider.ContactsContract$DataUsageFeedback. $\texttt {FEEDBACK}\_\texttt {URI}$ android.provider.ContactsContract$DisplayPhoto. $\texttt {CONTENT}\_\texttt {URI}$ , $\texttt {CONTENT}\_\texttt {MAX}\_\texttt {DIMENSIONS}\_\texttt {URI}$ android.provider.Applications. $\texttt {CONTENT}\_\texttt {URI}$ android.provider.CalendarContract. $\texttt {CONTENT}\_\texttt {URI}$ android.provider.CalendarContract$CalendarEntity. $\texttt {CONTENT}\_\texttt {URI}$ android.provider.CalendarContract$Calendars. $\texttt {CONTENT}\_\texttt {URI}$ android.provider.CalendarContract$Attendees. $\texttt {CONTENT}\_\texttt {URI}$ android.provider.CalendarContract$EventsEntity. $\texttt {CONTENT}\_\texttt {URI}$ android.provider.CalendarContract$Events. $\texttt {CONTENT}\_\texttt {EXCEPTION}\_\texttt {URI}$ , $\texttt {CONTENT}\_\texttt {URI}$ android.provider.CalendarContract$Instances. $\texttt {CONTENT}\_\texttt {BY}\_\texttt {DAY}\_\texttt {URI}$ , $\texttt {CONTENT}\_\texttt {URI}$ , $\texttt {CONTENT}\_\texttt {SEARCH}\_\texttt {BY}\_\texttt {DAY}\_\texttt {URI}$ , $\texttt {CONTENT}\_\texttt {SEARCH}\_\texttt {URI}$ android.provider.CalendarContract$CalendarCache. $\texttt {URI}$ android.provider.CalendarContract$EventDays. $\texttt {CONTENT}\_\texttt {URI}$ android.provider.CalendarContract$Reminders. $\texttt {CONTENT}\_\texttt {URI}$ android.provider.CalendarContract$CalendarAlerts. $\texttt {CONTENT}\_\texttt {URI}$ , $\texttt {CONTENT}\_\texttt {URI}\_\texttt {BY}\_\texttt {INSTANCE}$ android.provider.CalendarContract$Colors. $\texttt {CONTENT}\_\texttt {URI}$ android.provider.CalendarContract$ExtendedProperties. $\texttt {CONTENT}\_\texttt {URI}$ android.provider.CalendarContract$SyncState. $\texttt {CONTENT}\_\texttt {URI}$ android.provider.CallLog. $\texttt {CONTENT}\_\texttt {URI}$ android.provider.CallLog$Calls. $\texttt {CONTENT}\_\texttt {URI}$ ,

$\texttt {CONTENT}\_\texttt {FILTER}\_\texttt {URI}$ , $\texttt {CONTENT}\_\texttt {URI}\_\texttt {WITH}\_\texttt {VOICEMAIL}$ android.provider.Contacts. $\texttt {CONTENT}\_\texttt {URI}$ android.provider.Contacts$Settings. $\texttt {CONTENT}\_\texttt {URI}$ android.provider.Contacts$People. $\texttt {_EMAIL}\_\texttt {OR}\_\texttt {IM}\_\texttt {FILTER}\_\texttt {URI}$ , $\texttt {CONTENT}\_\texttt {FILTER}\_\texttt {URI}$ , $\texttt {DELETED}\_\texttt {CONTENT}\_\texttt {URI}$ , $\texttt {CONTENT}\_\texttt {URI}$ android.provider.Contacts$Groups. $\texttt {CONTENT}\_\texttt {URI}$ , $\texttt {DELETED}\_\texttt {CONTENT}\_\texttt {URI}$ android.provider.Contacts$GroupMembership. $\texttt {CONTENT}\_\texttt {URI}$ , $\texttt {RAW}\_\texttt {CONTENT}\_\texttt {URI}$ android.provider.Contacts$ContactMethods. $\texttt {CONTENT}\_\texttt {EMAIL}\_\texttt {URI}$ , $\texttt {CONTENT}\_\texttt {URI}$ android.provider.Contacts$Presence. $\texttt {CONTENT}\_\texttt {URI}$ android.provider.Contacts$Organizations. $\texttt {CONTENT}\_\texttt {URI}$ android.provider.Contacts$Photos. $\texttt {CONTENT}\_\texttt {URI}$ android.provider.Contacts$Extensions. $\texttt {CONTENT}\_\texttt {URI}$ android.provider.Contacts$Phones. $\texttt {CONTENT}\_\texttt {URI}$ , $\texttt {CONTENT}\_\texttt {FILTER}\_\texttt {URL}$ android.provider.Downloads$Impl. $\texttt {CONTENT}\_\texttt {URI}$ , $\texttt {ACCESSIBLE}\_\texttt {DOWNLOADS}\_\texttt {URI}$ , $\texttt {ALL}\_\texttt {DOWNLOADS}\_\texttt {CONTENT}\_\texttt {URI}$ android.provider.DrmStore$Images. $\texttt {CONTENT}\_\texttt {URI}$ android.provider.DrmStore$Audio. $\texttt {CONTENT}\_\texttt {URI}$ android.provider.UserDictionary. $\texttt {CONTENT}\_\texttt {URI}$ android.provider.UserDictionary$Words. $\texttt {CONTENT}\_\texttt {URI}$ android.provider.VoicemailContract$Voicemails. $\texttt {CONTENT}\_\texttt {URI}$ android.provider.VoicemailContract$Status. $\texttt {CONTENT}\_\texttt {URI}$

C.6 StringHound

All JVM methods (not Android classes)

References

[1]

Steven Arzt, Siegfried Rasthofer, and Eric Bodden. 2013. Instrumenting android and java applications as easy as abc. In International Conference on Runtime Verification. Springer, Berlin, 364–381.

Crossref

Google Scholar

[2]

S. Arzt, S. Rasthofer, and E. Bodden. 2017. The soot-based toolchain for analyzing android apps. In Proceedings of the IEEE/ACM 4th International Conference on Mobile Software Engineering and Systems (MOBILESoft’17). 13–24.

Digital Library

Google Scholar

[3]

Steven Arzt, Siegfried Rasthofer, Christian Fritz, Eric Bodden, Alexandre Bartel, Jacques Klein, Yves Le Traon, Damien Octeau, and Patrick McDaniel. 2014. Flowdroid: Precise context, flow, field, object-sensitive and lifecycle-aware taint analysis for android apps. ACM SIGPLAN Not. 49, 6 (2014), 259–269.

Digital Library

Google Scholar

[4]

Michael Backes, Sven Bugiel, Erik Derr, Sebastian Gerling, and Christian Hammer. 2016. R-Droid: Leveraging android app analysis with static slice optimization. In Proceedings of the 11th ACM on Asia Conference on Computer and Communications Security (ASIA CCS’16). ACM, New York, NY, 129–140. DOI:

Digital Library

Google Scholar

[5]

Alexandre Bartel, Jacques Klein, Yves Le Traon, and Martin Monperrus. 2012. Dexpler: Converting android dalvik bytecode to jimple for static analysis with soot. In Proceedings of the ACM SIGPLAN International Workshop on State of the Art in Java Program Analysis. 27–38.

Digital Library

Google Scholar

[6]

Eric Bodden, Andreas Sewe, Jan Sinschek, Hela Oueslati, and Mira Mezini. 2011. Taming reflection: Aiding static analysis in the presence of reflection and custom class loaders. In Proceedings of the 33rd International Conference on Software Engineering (ICSE’11). ACM, New York, NY, 241–250. DOI:

Digital Library

Google Scholar

[7]

Erika Chin, Adrienne Porter Felt, Kate Greenwood, and David Wagner. 2011. Analyzing inter-application communication in android. In Proceedings of the 9th International Conference on Mobile Systems, Applications, and Services (MobiSys’11). ACM, New York, NY, 239–252. DOI:

Digital Library

Google Scholar

[8]

Tae-Hyoung Choi, Oukseh Lee, Hyunha Kim, and Kyung-Goo Doh. 2006. A practical string analyzer by the widening approach. In Programming Languages and Systems, Naoki Kobayashi (Ed.). Springer, Berlin, 374–388.

Digital Library

Google Scholar

[9]

S. R. Choudhary, A. Gorla, and A. Orso. 2015. Automated test input generation for android: Are we there yet? (E). In Proceedings of the 30th IEEE/ACM International Conference on Automated Software Engineering (ASE’15). 429–440. DOI:

Digital Library

Google Scholar

[10]

Aske Simon Christensen, Anders Møller, and Michael I. Schwartzbach. 2003. Precise analysis of string expressions. In Proceedings of the 10th International Conference on Static Analysis (SAS’03). Springer-Verlag, Berlin, 1–18.

Digital Library

Google Scholar

[11]

Valerio Costamagna and Cong Zheng. 2016. ARTDroid: A virtual-method hooking framework on android ART runtime. In International Meeting of Pysychometric Society at Engineering Secure Software and Systems (IMPS@ESSoS’16). 20–28.

Google Scholar

[12]

Peter Dinges and Gul Agha. 2014. Targeted test input generation using symbolic-concrete backward execution. In Proceedings of the 29th ACM/IEEE International Conference on Automated Software Engineering (ASE’14). Association for Computing Machinery, New York, NY, 31–36. DOI:

Digital Library

Google Scholar

[13]

Adam P. Fuchs, Avik Chaudhuri, and Jeffrey S. Foster. 2009. Scandroid: Automated Security Certification of Android Applications. University of Maryland, College Park, MA. http://spruce.cs.ucr.edu/SCanDroid/papers.html

Google Scholar

[14]

Leonid Glanz, Patrick Müller, Lars Baumgärtner, Michael Reif, Sven Amann, Pauline Anthonysamy, and Mira Mezini. 2020. Hidden in plain sight: Obfuscated strings threatening your privacy. In Proceedings of the 15th ACM Asia Conference on Computer and Communications Security. DOI:

Digital Library

Google Scholar

[15]

Neville Grech, George Kastrinis, and Yannis Smaragdakis. 2018. Efficient reflection string analysis via graph coloring. In Proceedings of the 32nd European Conference on Object-Oriented Programming (ECOOP’18),Leibniz International Proceedings in Informatics (LIPIcs), Todd Millstein (Ed.), Vol. 109. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, Dagstuhl, Germany, 26:1–26:25. DOI:

Crossref

Google Scholar

[16]

Philipp Holzinger, Stefan Triller, Alexandre Bartel, and Eric Bodden. 2016. An in-depth study of more than ten years of Java exploitation. In Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security (CCS’16). Association for Computing Machinery, New York, NY, 779–790. DOI:

Digital Library

Google Scholar

[17]

Joxan Jaffar, Vijayaraghavan Murali, Jorge A. Navas, and Andrew E. Santosa. 2012. Path-sensitive backward slicing. In Proceedings of the SAS Annual Conference (SAS’12). 231–247.

Digital Library

Google Scholar

[18]

Dave King, Boniface Hicks, Michael Hicks, and Trent Jaeger. 2008. Implicit flows: Can’t live with’em, can’t live without’em. In International Conference on Information Systems Security. Springer, 56–70.

Digital Library

Google Scholar

[19]

Arash Habibi Lashkari, Andi Fitriah A. Kadir, Laya Taheri, and Ali A. Ghorbani. 2018. Toward developing a systematic approach to generate benchmark android malware datasets and classification. In Proceedings of the International Carnahan Conference on Security Technology (ICCST’18). 1–7. DOI:

Crossref

Google Scholar

[20]

Sungho Lee, Julian Dolby, and Sukyoung Ryu. 2016. HybriDroid: Static analysis framework for android hybrid applications. In Proceedings of the 31st IEEE/ACM International Conference on Automated Software Engineering (ASE’16). ACM, New York, NY, 250–261. DOI:

Digital Library

Google Scholar

[21]

Ondřej Lhoták and Laurie Hendren. 2003. Scaling Java points-to analysis using spark. In Compiler Construction, Görel Hedin (Ed.). Lecture Notes in Computer Science, Vol. 2622. Springer, Berlin, 153–169. DOI:

Crossref

Google Scholar

[22]

Ding Li, Yingjun Lyu, Mian Wan, and William G. J. Halfond. 2015. String analysis for Java and android applications. In Proceedings of the 2015 10th Joint Meeting on Foundations of Software Engineering (ESEC/FSE’15). ACM, New York, NY, 661–672. DOI:

Digital Library

Google Scholar

[23]

Li Li, Alexandre Bartel, Tegawendé F. Bissyandé, Jacques Klein, Yves Le Traon, Steven Arzt, Siegfried Rasthofer, Eric Bodden, Damien Octeau, and Patrick McDaniel. 2015. Iccta: Detecting inter-component privacy leaks in android apps. In Proceedings of the 37th International Conference on Software Engineering-Volume 1. IEEE Press, 280–291.

Crossref

Google Scholar

[24]

Li Li, Tegawendé F. Bissyandé, Damien Octeau, and Jacques Klein. 2016. DroidRA: Taming reflection to support whole-program analysis of android apps. In Proceedings of the 25th International Symposium on Software Testing and Analysis (ISSTA’16). ACM, New York, NY, 318–329. DOI:

Digital Library

Google Scholar

[25]

Tianyi Liang, Andrew Reynolds, Cesare Tinelli, Clark Barrett, and Morgan Deters. 2014. A DPLL(T) theory solver for a theory of strings and regular expressions. In Computer Aided Verification, Armin Biere and Roderick Bloem (Eds.). Springer International Publishing, Cham, 646–662.

Digital Library

Google Scholar

[26]

Björn Lisper, Abu Naser Masud, and Husni Khanfar. 2015. Static backward demand-driven slicing. In Proceedings of the 20th Workshop on Partial Evaluation and Program Manipulation. ACM, 115–126. DOI:

Digital Library

Google Scholar

[27]

Long Lu, Zhichun Li, Zhenyu Wu, Wenke Lee, and Guofei Jiang. 2012. CHEX: Statically vetting android apps for component hijacking vulnerabilities. In Proceedings of the ACM Conference on Computer and Communications Security (CCS’12). ACM, New York, NY, 229–240. DOI:

Digital Library

Google Scholar

[28]

Samaneh Mahdavifar, Andi Fitriah Abdul Kadir, Rasool Fatemi, Dima Alhadidi, and Ali A. Ghorbani. 2020. Dynamic android malware category classification using semi-supervised deep learning. In Proceedings of the IEEE International Conference on Dependable, Autonomic and Secure Computing, International Conference on Pervasive Intelligence and Computing, International Conference on Cloud and Big Data Computing, and International Conference on Cyber Science and Technology Congress (DASC/PiCom/CBDCom/CyberSciTech’20). 515–522. DOI:

Crossref

Google Scholar

[29]

Davide Maiorca, Davide Ariu, Igino Corona, Marco Aresu, and Giorgio Giacinto. 2015. Stealth attacks: An extended insight into the obfuscation effects on android malware. Computers And Security (Elsevier) 51 (June) (2015), 16–31.

Digital Library

Google Scholar

[30]

Damien Octeau, Daniel Luchaup, Matthew Dering, Somesh Jha, and Patrick McDaniel. 2015. Composite constant propagation: Application to android inter-component communication analysis. In Proceedings of the IEEE/ACM 37th IEEE International Conference on Software Engineering, Vol. 1. 77–88. DOI:

Crossref

Google Scholar

[31]

Damien Octeau, Daniel Luchaup, Somesh Jha, and Patrick McDaniel. 2016. Composite constant propagation and its application to android program analysis. IEEE Trans. Softw. Eng. 42, 11 (2016), 999–1014. DOI:

Digital Library

Google Scholar

[32]

Damien Octeau, Daniel Luchaup, Somesh Jham, and Patrick McDaniel. 2014. Coal Constant Propagation Language. Retrieved from http://siis.cse.psu.edu/coal/.

Google Scholar

[33]

Changhee Park and Sukyoung Ryu. 2015. Scalable and precise static analysis of JavaScript applications via loop-sensitivity. In Proceedings of the 29th European Conference on Object-Oriented Programming (ECOOP’15),Leibniz International Proceedings in Informatics (LIPIcs), John Tang Boyland (Ed.), Vol. 37. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, Dagstuhl, Germany, 735–756. DOI:

Crossref

Google Scholar

[34]

Nicholas J. Percoco and Sean Schulte. 2012. Adventures in bouncerland. Blackhat USA. http://media.blackhat.com/bh-us-12/Briefings/Percoco/BH_US_12_Percoco_Adventures_in_Bouncerland_WP.pdf. Accessed March 13, 2024.

Google Scholar

[35]

Siegfried Rasthofer, Steven Arzt, Marc Miltenberger, and Eric Bodden. 2016. Harvesting runtime values in android applications that feature anti-analysis techniques. In Proceedings of the Network and Distributed System Security Symposium (NDSS’16).

Crossref

Google Scholar

[36]

X. Rival and Sukyoung Ryu. 2017. Weakly sensitive analysis for unbounded iteration over javaScript objects. Programming Languages and Systems (APLAS’17), Springer International Publishing, Cham, 148–168. DOI:

Crossref

Google Scholar

[37]

D. Shannon, S. Hajra, A. Lee, D. Zhan, and S. Khurshid. 2007. Abstracting symbolic execution with string analysis. In Testing: Academic and Industrial Conference Practice and Research Techniques - MUTATION (TAICPART-MUTATION’07). 13–22. DOI:

Crossref

Google Scholar

[38]

M. Sharir and A. Pnueli. 1978. Two Approaches to Interprocedural Data Flow Analysis. New York University Computer Science Department, New York, NY.

Google Scholar

[39]

Takaaki Tateishi, Marco Pistoia, and Omer Tripp. 2013. Path- and index-sensitive string analysis based on monadic second-order logic. ACM Trans. Softw. Eng. Methodol. 22, 4, Article 33 (Oct. 2013), 33 pages. DOI:

Digital Library

Google Scholar

[40]

Julian Thomé, Lwin Khin Shar, Domenico Bianculli, and Lionel Briand. 2017. Search-driven string constraint solving for vulnerability detection. In Proceedings of the 39th International Conference on Software Engineering (ICSE’17). IEEE Press, Piscataway, NJ, 198–208. DOI:

Digital Library

Google Scholar

[41]

David Trabish, Andrea Mattavelli, Noam Rinetzky, and Cristian Cadar. 2018. Chopped symbolic execution. In Proceedings of the 40th International Conference on Software Engineering (ICSE’18). Association for Computing Machinery, New York, NY, 350–360. DOI:

Digital Library

Google Scholar

[42]

Minh-Thai Trinh, Duc-Hiep Chu, and Joxan Jaffar. 2014. S3: A symbolic string solver for vulnerability detection in web applications. In Proceedings of the ACM SIGSAC Conference on Computer and Communications Security (CCS’14). ACM, New York, NY, 1232–1243. DOI:

Digital Library

Google Scholar

[43]

Raja Vallee-Rai and Laurie J. Hendren. 1998. Jimple: Simplifying Java bytecode for analyses and transformations. https://api.semanticscholar.org/CorpusID:10529361. Accessed March 13, 2024.

Google Scholar

[44]

J. D. Vecchio, F. Shen, K. M. Yee, B. Wang, S. Y. Ko, and L. Ziarek. 2015. String analysis of android applications (N). In Proceedings of the 30th IEEE/ACM International Conference on Automated Software Engineering (ASE’15). 680–685. DOI:

Digital Library

Google Scholar

[45]

Kostyantyn Vorobyov, Yang Zhao, and Padmanabhan Krishnan. 2021. Scalable string analysis: An experience report. In Proceedings of the 10th ACM SIGPLAN International Workshop on the State of the Art in Program Analysis (SOAP’21). Association for Computing Machinery, New York, NY, 43–48. DOI:

Digital Library

Google Scholar

[46]

H. Wang, T. Tsai, C. Lin, F. Yu, and J. R. Jiang. 2016. String Analysis via Automata Manipulation with Logic Circuit Representation. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Vol. 9779. 241–260.

Crossref

Google Scholar

[47]

Fengguo Wei, Sankardas Roy, Xinming Ou, and Robby. 2018. Amandroid: A precise and general inter-component data flow analysis framework for security vetting of android apps. ACM Trans. Priv. Secur. 21, 3, Article 14 (Apr. 2018), 32 pages. DOI:

Digital Library

Google Scholar

[48]

Yunhui Zheng, Xiangyu Zhang, and Vijay Ganesh. 2013. Z3-str: A Z3-based string solver for web application analysis. In Proceedings of the 9th Joint Meeting on Foundations of Software Engineering (ESEC/FSE’13). ACM, New York, NY, 114–124. DOI:

Digital Library

Google Scholar

[49]

Yajin Zhou and Xuxian Jiang. 2012. Dissecting android malware: Characterization and evolution. In Proceedings of the IEEE Symposium on Security and Privacy (SP’12). IEEE Computer Society, Los Alamitos, CA, 95–109. DOI:

Digital Library

Google Scholar

Cited By

View all

Index Terms

Precisely Extracting Complex Variable Values from Android Apps
1. Software and its engineering
  1. Software organization and properties
    1. Software functional properties
      1. Formal methods
        Automated static analysis

Recommendations

Are apps ready for new Android releases?
MOBILESoft '20: Proceedings of the IEEE/ACM 7th International Conference on Mobile Software Engineering and Systems

Context: Android operating system always brings new releases and updates to improve security, increase performance and bring a better user experience. When Google announces a new release, a whole chain of changes is triggered in cascade, causing many ...
HybriDroid: static analysis framework for Android hybrid applications
ASE '16: Proceedings of the 31st IEEE/ACM International Conference on Automated Software Engineering

Mobile applications (apps) have long invaded the realm of desktop apps, and hybrid apps become a promising solution for supporting multiple mobile platforms. Providing both platform-specific functionalities via native code like native apps and user ...
Inter-app communication between Android apps developed in app-inventor and Android studio
MOBILESoft '16: Proceedings of the International Conference on Mobile Software Engineering and Systems

Communications between mobile apps are an important aspect of mobile platforms. Android is specifically designed with inter-app communication in mind and depends on this to provide different platform specific functionalities. Android Apps can either be ...

Comments

Information & Contributors

Information

Published In

cover image ACM Transactions on Software Engineering and Methodology

ACM Transactions on Software Engineering and Methodology Volume 33, Issue 5

June 2024

952 pages

EISSN:1557-7392

DOI:10.1145/3618079

Editor:
Mauro Pezzè
USI Università della Svizzera italiana and SIT Schaffhausen Institute of Technology, Switzerland

Issue’s Table of Contents

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the owner/author(s).

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 04 June 2024

Online AM: 27 February 2024

Accepted: 08 February 2024

Revised: 24 January 2024

Received: 03 June 2022

Published in TOSEM Volume 33, Issue 5

Check for updates

Author Tags

Qualifiers

Research-article

Funding Sources

National Research Center for Applied Cybersecurity ATHENE

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

0
Total Citations
638
Total Downloads

Downloads (Last 12 months)638
Downloads (Last 6 weeks)90

Reflects downloads up to 15 Sep 2024

Other Metrics

View Author Metrics

Citations

Cited By

View all

View Options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Get Access

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Type	Soot	JSA	Hv	Viol	DRA	BS	SH	IC3	COAL	VD
Precise Loop handling	✗	~ \(^{1}\)	✓	~ \(^{2}\)	~ \(^{1}\)	~ \(^{1}\)	✓	~ \(^{1}\)	~ \(^{1}\)	✓
Collections	✗	✗	✓	✗	✗	✗	✓	✗	✗	✓
Precise complex objects	✗	✗⁶	✓	✗⁶	✗⁶	✗⁶	✓	✗⁶	✗⁶	✓
Precise arrays	✗	✗ \(^{3}\)	✓	✗ \(^{4}\)	✗ \(^{4}\)	✗ \(^{4}\)	✓	✗ \(^{4}\)	✗	✓
Android APIs	✗	✗	✓	✗	✓	✗	✗ \(^{5}\)	✓	✓	✓
Reflective calls	✗	✗	✓	✗	✗	✗	✓	✗	✗	✓
Static or Hybrid	S	S	H	S	S	S	H	S	S	S

Abstract

1 Introduction

2 Related Work

3 Challenges

3.1 Loop Handling

3.2 Collections

3.3 Precise Complex Objects Handling

3.4 Arrays

3.5 Android API Support

3.6 Reflective Calls

3.7 Encounter Rates

4 Approach

4.1 Approach Overview

4.2 Definitions

4.3 Simulation

4.4 Slicing

4.5 Converting Statements to Transformers

4.6 Interprocedural Slicing

4.7 Library Models

4.8 Loops

Loop detection.

Statement of interest within the loop.

Precise virtual invocations.

Normal \(\texttt {if}\) conditions.

Conclusions.

5 Demonstration Example

5.1 Slicing and Running the Demonstration Example

Slicing.

Running.

6 Implementation

6.1 Reflection

6.2 Inter-Component Calls

6.3 Path Explosion

6.4 Library Models

6.5 Aliasing

6.6 Unknown Values

7 Core Improvements

7.1 Sensitivity Properties

7.2 Loop Handling

7.3 Further Sanity Checks

8 Evaluation

8.1 Experiment Setup

8.2 RQ1: Accuracy on Benchmarks

8.3 RQ2: Recall/Precision on Real-world Apps

8.4 RQ3: Recall on Malicious Apps

8.5 RQ4: Performance

8.6 RQ5: String Obfuscation

8.7 RQ6: Library Models

8.7.1 Using Limited Library Models.

8.8 RQ7: Loops

9 Limitations

10 Conclusion

11 Availability

Acknowledgment

Footnotes

A Full Example Jimple Listing

B Recall and Precision On Non-timeouted Apps

C Modelled Library Functions Per Tool

C.1 JSA

C.2 Violist

C.3 Droid-RA

C.4 BlueSeal

C.5 IC3 and COAL

C.6 StringHound

References

Cited By

Index Terms

Recommendations

Are apps ready for new Android releases?

HybriDroid: static analysis framework for Android hybrid applications

Inter-app communication between Android apps developed in app-inventor and Android studio

Comments

Information

Published In

Publisher

Publication History

Check for updates

Author Tags

Qualifiers

Funding Sources