We've seen that we can analyze the behavior of programs at compile time by using elements of various lattices to represent facts about the execution of a program at run time. Sometimes the lattice elements have been sets ordered by the relations ⊆ and ⊇; sometimes they have been more exotic structures. An example of the latter is the lattice used in conditional constant propagation (CCP), whose elements map each program variable to either a constant or one of the special values ⊤ and ⊥.
Throughout our discussion of program analyses, the tricky part has always been coming up with the correct transfer function \(F_n\) for each node \(n\). The transfer function represents the change in information associated with executing \(n\) in a given program state.
Abstract interpretation, a technique pioneered by Patrick and Radhia Cousot in the late '70s, offers a principled way to design both the lattice over which dataflow analysis is performed and the transfer functions that operate on these lattice elements.
The key idea of abstract interpretation is to maintain a mapping between a more concrete representation of program execution and a more abstract one that supports compile-time program analysis. Both the concrete and the abstract representations are lattices.
In general, both the concrete and abstract lattices may be abstractions of program execution. However, a common and important choice for the more concrete representation is a set of possible states of the program (such as the values of all variables and the structure of the heap). These sets form a lattice, ordered by the superset relation ⊇. (In the abstract interpretation literature, the lattices are turned upside down, so that the “most informative dataflow value” is ⊥, corresponding to the empty set of program states, and the least informative value is ⊤, corresponding to all possible states. We will keep the ordering consistent with the dataflow analyses discussed thus far.) Thus, the most informative value ⊤ corresponds to the empty set of program states, and the least informative value ⊥ corresponds to all possible states. It is possible to have even more concrete representations, for example by including the entire execution history of the program as part of the concrete representation, which can be useful if we are trying to analyze programs using past or future behavior.
For example, consider CCP, which statically approximates the value of each variable with a lattice containing an element for each possible constant value, plus two special elements \(⊤\) and \(⊥\) representing undefined and non-constant values, respectively:
For simplicity, let's suppose for the moment that there is only one variable in the program, so that a program state is simply the value of that variable. For each set of program states, there is a lattice element that best describes those states. The singleton set \(\{3\}\) is best represented by the lattice element 3, whereas the set \(\{3, 4\}\) is best represented by the element \(⊥\). Thus, the mapping to lattice elements in general loses information about the program. The mapping from the more concrete domain (e.g., sets of states) to the more abstract domain is called the abstraction function α.
A natural way to define the abstraction function over sets of states is by first defining a function that maps individual states to lattice elements. In our CCP example, this representation function \(β\) would map each constant value to the corresponding constant in the lattice. No state maps to \(⊥\) or \(⊤\). The abstraction function over a set of states \(S\) just gives the greatest lower bound of the representations of the individual states: \[ α(S) = \sqcap_{s ∈ S}~β(s) \]
In our example, applying \(α\) to the set \(\{3, 4\}\) then gives \(β(3) ⊓ β(4) = ⊥\), as we would expect.
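To make this concrete, here is a minimal Python sketch of the single-variable CCP lattice, with the representation function β and the abstraction function α defined exactly as above. The names TOP, BOT, meet, beta, and alpha are illustrative choices, not from any particular implementation.

```python
# A minimal sketch of the single-variable CCP lattice (names assumed).
# TOP = "undefined / no states seen yet", BOT = "not a constant";
# any Python int stands for the corresponding constant lattice element.
TOP, BOT = "TOP", "BOT"

def meet(a, b):
    """Greatest lower bound of two flat-lattice elements."""
    if a == TOP: return b
    if b == TOP: return a
    if a == b:   return a
    return BOT   # two distinct constants have only BOT below them

def beta(state):
    """Representation function: a state (here, just the variable's
    value) maps to the corresponding constant element."""
    return state

def alpha(states):
    """Abstraction: the meet of the representations of all states."""
    result = TOP
    for s in states:
        result = meet(result, beta(s))
    return result

print(alpha({3}))     # 3
print(alpha({3, 4}))  # BOT
```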
For each abstraction function \(α\) there is a concretization function γ that maps from the abstract domain to the concrete domain. The concretization of an abstract element \(ℓ\) is the most conservative concrete element whose abstraction is at least as informative as \(ℓ\): \[ γ(ℓ) = ⊓ \{ S \mid ℓ ⊑ α(S) \} \]
For example, if we define the abstraction function via a representation function \(β\), the corresponding concretization function just gives the set of all corresponding states:
\[ γ(ℓ) = \{ s \mid ℓ ⊑ β(s) \} \]

Hence, in the CCP example, \(γ(⊥)\) is the set of all states and \(γ(⊤)\) is the empty set.
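When the set of possible states is unbounded, \(γ(ℓ)\) cannot be materialized as a literal set, but over a small assumed finite universe of values we can spell the definition out directly (again with illustrative names):

```python
# Sketch of concretization over an assumed finite universe of states.
TOP, BOT = "TOP", "BOT"
UNIVERSE = range(-10, 11)   # hypothetical finite set of possible values

def leq(a, b):
    """Abstract ordering: BOT below everything, TOP above everything."""
    return a == BOT or b == TOP or a == b

def beta(state):
    return state

def gamma(ell):
    """All states whose representation is at least as informative as ell."""
    return {s for s in UNIVERSE if leq(ell, beta(s))}

print(gamma(BOT) == set(UNIVERSE))  # True: all states
print(gamma(TOP))                   # set(): the empty set
print(gamma(3))                     # {3}
```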
The abstraction and concretization functions have the property that the concretization of the abstraction of a concrete element \(S\) is no more informative than the original element, and the abstraction of the concretization of an abstract element \(ℓ\) is at least as informative as \(ℓ\): \begin{align*} γ(α(S)) & ⊑ S \\ ℓ & ⊑ α(γ(ℓ)) \end{align*}
Abstraction and concretization functions
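As a sanity check, here is a brute-force verification of the two properties above for the single-variable CCP lattice over a small assumed universe, reusing the same illustrative definitions:

```python
# Brute-force check of the Galois-connection properties on a small
# assumed universe (single-variable CCP lattice; illustrative names).
from itertools import combinations

TOP, BOT = "TOP", "BOT"
UNIVERSE = [1, 2, 3]

def meet(a, b):
    if a == TOP: return b
    if b == TOP: return a
    return a if a == b else BOT

def leq(a, b):
    return a == BOT or b == TOP or a == b

def alpha(S):
    out = TOP
    for s in S:
        out = meet(out, s)          # beta(s) is just s itself
    return out

def gamma(ell):
    return {s for s in UNIVERSE if leq(ell, s)}

all_subsets = [set(c) for k in range(4) for c in combinations(UNIVERSE, k)]
elements = [TOP, BOT] + UNIVERSE

for S in all_subsets:
    assert gamma(alpha(S)) >= S         # γ(α(S)) ⊑ S: a superset of S
for ell in elements:
    assert leq(ell, alpha(gamma(ell)))  # ℓ ⊑ α(γ(ℓ))
print("Galois-connection properties hold")
```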
Further, both \(α\) and \(γ\) must be monotonic, so \(S_1 ⊑ S_2 ⇒ α(S_1) ⊑ α(S_2)\) and \(ℓ_1 ⊑ ℓ_2 ⇒ γ(ℓ_1) ⊑ γ(ℓ_2)\). In other words, making the input to either of these functions more precise cannot harm the precision of their output.
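Continuing the brute-force sketch above (same definitions assumed), monotonicity can be checked the same way; recall that in the concrete lattice, \(S_1 ⊑ S_2\) means \(S_1 ⊇ S_2\).

```python
# Continuing the check above: more precise inputs never yield less
# precise outputs (S1 ⊑ S2 is the superset relation S1 >= S2 here).
for S1 in all_subsets:
    for S2 in all_subsets:
        if S1 >= S2:
            assert leq(alpha(S1), alpha(S2))
for l1 in elements:
    for l2 in elements:
        if leq(l1, l2):
            assert gamma(l1) >= gamma(l2)   # γ(ℓ1) ⊑ γ(ℓ2)
print("alpha and gamma are monotonic")
```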
These relationships make the pair of functions \(α\) and \(γ\) a Galois connection. In practice, useful abstractions are Galois insertions, in which \( ℓ = α(γ(ℓ)) \). If the abstract domain has elements that do not map back to themselves, they cannot be the abstraction of any concrete element, so they are not very useful. A Galois insertion is depicted in the figure above.
For example, both the set of all states where \(x=3\), and the set of states where \(x=3\) and either \(y=4\) or \(y=5\), map via \(α\) to the abstract element \(\{x↦3, y↦⊥\}\). To preserve the property \(γ(α(S)) ⊑ S\), the concretization of this element must include at least the set of all states where \(x=3\), and the most precise concretization is exactly this set; that is, we choose the concretization \(γ(a) = ⊓ \{ S \mid α(S) ⊒ a \}\), where the meet in the ⊇-ordered concrete lattice is set union.
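Extending the sketch to two variables shows this information loss concretely: both sets of states below collapse to the same abstract element. Here states are hypothetical (x, y) pairs, and the names remain illustrative.

```python
# Two-variable version of the sketch: a state is an (x, y) pair, and an
# abstract element maps each variable name to a flat-lattice value.
TOP, BOT = "TOP", "BOT"

def meet(a, b):
    if a == TOP: return b
    if b == TOP: return a
    return a if a == b else BOT

def beta(state):
    x, y = state
    return {"x": x, "y": y}

def alpha(states):
    out = {"x": TOP, "y": TOP}
    for s in states:
        b = beta(s)
        out = {v: meet(out[v], b[v]) for v in out}
    return out

# States with x = 3 and y in {4, 5}, versus x = 3 and a range of y:
print(alpha({(3, 4), (3, 5)}))             # {'x': 3, 'y': 'BOT'}
print(alpha({(3, y) for y in range(10)}))  # {'x': 3, 'y': 'BOT'}
```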
The transfer function as an abstraction of execution
We want to construct transfer functions that conservatively capture what the original program would have done, but by “computing” on abstract elements. We can construct the most precise such transfer function by using the abstraction function α to map the inputs and outputs of the actual computation into the abstract space. Suppose that the actual computation at node \(n\) maps a given set of states \(S\) to a set \(S' = f_n(S)\). Then, given an input element \(\ell\), we can map it to a set of states \(γ(\ell)\) and apply the effect of the CFG node \(n\) to it to get a resulting set of states \(f_n(γ(\ell))\). Applying the abstraction function to this gives us a sound transfer function: \(α(f_n(γ(\ell)))\). Intuitively, this transfer function, which operates downward on the right-hand side of the figure, is constructed by instead following the blue arrows around the figure counter-clockwise.
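Here is what this construction looks like in the running Python sketch, over an assumed finite universe of (x, y) states: induced_transfer (a name of our choosing) builds \(F_n = α ∘ f_n ∘ γ\) from any concrete state transformer \(f_n\).

```python
# Sketch of the induced transfer function F_n = alpha . f_n . gamma,
# over an assumed finite universe of (x, y) states; names illustrative.
from itertools import product

TOP, BOT = "TOP", "BOT"
UNIVERSE = set(product(range(10), repeat=2))   # hypothetical (x, y) states

def meet(a, b):
    if a == TOP: return b
    if b == TOP: return a
    return a if a == b else BOT

def leq(a, b):
    return a == BOT or b == TOP or a == b

def alpha(states):
    out = {"x": TOP, "y": TOP}
    for (x, y) in states:
        out = {"x": meet(out["x"], x), "y": meet(out["y"], y)}
    return out

def gamma(ell):
    return {(x, y) for (x, y) in UNIVERSE
            if leq(ell["x"], x) and leq(ell["y"], y)}

def induced_transfer(f_n):
    """Lift a concrete state transformer f_n into the abstract domain."""
    def F_n(ell):
        return alpha({f_n(s) for s in gamma(ell)})
    return F_n
```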
Example: constructing a transfer function
For example, suppose the abstract input is \(\ell = \{x↦3, y↦⊥\}\) and the node \(n\) is the assignment \(y = x + 1\). The input set of states \(γ(\ell)\) is all states for which \(x=3\). Running this statement on these states, as depicted in the figure, gives us all states for which \(x=3\) and \(y=4\). Mapping this back via α, we have \(F_n(\ell) = \{x↦3, y↦4\}\), which is exactly what we'd want. On the other hand, if we defined \(F_n(\ell)\) to be \(\{x↦⊥, y↦4\}\) in this case, that would still be sound, just less precise than we might like.
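Running the sketch above on exactly this example reproduces the expected result; the concrete effect of the assignment is written as a state transformer on (x, y) pairs.

```python
# Reusing induced_transfer and BOT from the sketch above; the concrete
# effect of the assignment y = x + 1 maps state (x, y) to (x, x + 1).
F_n = induced_transfer(lambda s: (s[0], s[0] + 1))

print(F_n({"x": 3, "y": BOT}))  # {'x': 3, 'y': 4}
```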
If \(γ\) is defined to be the most precise concretization, as described above, this transfer function will also be as precise as possible. However, this most precise transfer function will not be computable in general. Fortunately, we don't necessarily need this most precise transfer function in order to have an effective program analysis. All that is required is that the result \(F_n(\ell)\) is no more informative than this most precise result: \( F_n(\ell) ⊑ α(f_n(γ(\ell))) \).
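The less precise answer from the example can be checked against this condition directly, continuing the same sketch; abs_leq is an assumed pointwise ordering on abstract environments.

```python
# Soundness check: the less precise result {x: BOT, y: 4} from the
# example sits below the best transfer function's output.
def abs_leq(a, b):
    """Pointwise ordering on abstract environments."""
    return all(leq(a[v], b[v]) for v in a)

best = F_n({"x": 3, "y": BOT})            # {'x': 3, 'y': 4}
print(abs_leq({"x": BOT, "y": 4}, best))  # True: sound, less precise
```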
Pointer analysis, an essential ingredient of precise alias analysis, offers a good opportunity to think about program analysis in terms of abstract interpretation.
The simple abstraction we have been using as an example is only one of many possible abstractions. For example, instead of representing variables as either constant or unknown, we could conservatively represent the range of values each variable might take on at each program point. Or we could keep track of other attributes such as the sign or parity of variables. We could also construct abstractions that take into account multiple variables at once, to allow keeping track of relationships between variables. Polyhedral abstract domains allow arbitrary linear inequalities over variables; octahedral abstract domains allow a more restricted set of inequalities that are nevertheless still useful for common computations. A topic of current research interest is developing domains that allow sufficiently precise yet scalable conservative analysis of machine learning computations.
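As one illustration, here is a minimal sketch of the range (interval) abstraction mentioned above, in the same style as the CCP sketches; the names are again assumptions.

```python
# Minimal interval-domain sketch: an abstract value is a (lo, hi) range,
# and TOP again stands for "undefined / no states seen yet".
TOP = "TOP"

def beta(v):
    return (v, v)                  # a single value is a degenerate range

def meet(a, b):
    """Combine conservatively: the smallest range covering both."""
    if a == TOP: return b
    if b == TOP: return a
    return (min(a[0], b[0]), max(a[1], b[1]))

def alpha(states):
    out = TOP
    for s in states:
        out = meet(out, beta(s))
    return out

print(alpha({3, 4, 7}))  # (3, 7): more precise than CCP's "not a constant"
```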