Program Analysis Using Weighted Pushdown Systems

Presented on May 4, 2017
Presenter: Lawton

Preview

I have decided! Next week I’ll present “Program Analysis Using Weighted Pushdown Systems”, by Thomas Reps, Akash Lal, and Nick Kidd.

Best y’all, Lawton!

(Mika does a wonderful impression of me!)

Summary

In interprocedural analyses, you want to exclude as many of the paths that violate the call/return structure of the language as possible. Pushdown systems (PDSs) are an automata-theoretic method for tracking this structure─they encode a semantics using a stack of function calls. Weighted pushdown systems (WPDSs) “extend PDSs by adding a general ‘black-box’ abstraction for expressing transformations of a program’s data state (through weights)”.

Acceptable paths:

valid ⟶ matched valid
      ∣ (ᵢ valid
      ∣ ε
matched ⟶ matched matched
        ∣ (ᵢ matched )ᵢ 
        ∣ n [where n is an arbitrary node name]
        ∣ ε

Boolean programs can be encoded as a pushdown system, and then the tuple ⟨p, γ₁γ₂⋯γₙ⟩ represents a state where p encodes the values of global variables; γ₁ encodes the current program location and the values of local variables in scope; and the rest of the stack encodes the list of unfinished calls with the values of local variables at the time the call was made.

The benefit of WPDSs is that they allow dataflow queries to be posed with respect to a regular language of stack configurations. For free, you can answer stack-qualified queries about a program, by giving a regular expression that asks for the meet over valid paths of a configuration ⟨C, r⟩, where r describes the stack configuration of interest. A normal query stack looks like nΓ*, which selects the node n coming from a stack context that could be anything.

A bounded idempotent semiring (a.k.a. weight domain) is the black box part of a WPDS that provides the weights. Weighted pushdown systems generalize pushdown systems, and you can get a normal PDS by instantiating the weight domain with the Boolean weight domain ({⊥,⊤}, ∨, ∧, ⊥, ⊤), where ⊤ represents reachable and ⊥ represents unreachable. Some other examples of pluggable weight domains allow an analysis developer to find the shortest valid trace that represents a given property, and to track affine relations across program variables.

A key quote: “A PDS models the control flow of a program. A weight domain models abstract transformers for an abstraction of a program’s data.”

Discussion

We talked about how, while weighted pushdown systems seem nice for imperative languages, things start to go horribly wrong when you try to apply all of this to a functional language. The stack-based approach described in this paper fits perfectly with continuations and continuation stacks, but closures are not stack-y.

We also discussed semirings, and how we might teach them and similar algebraic structures to students in an introductory discrete math course.

Attachments