Data Flow Analysis -- Reaching definition Analysis

It has been a long time since I wrote the last blog, because I was preparing the Toefl and GRE exam 😢 (so hard). I reviewed the data flow analysis algoritms recently by watching the Li Yue’s videos and reading Engineer A Compiler. Here are some ideas.

What’s the Reaching definition

In some case the compiler need to know that where the operand has been defined. To find a set of definitions that reach a blocl, the compilers can compute the reaching definitions.

The domain of the Reaches is the set of definitions in the procedure. The definition b of some vaiable v reaches operation i if and only if i reads the value of v and there exists a path that from d to i does not redefine v.

What’s the use of Reaching definition

We can build the define-use chain to perform the further optimization. Also, we can perform some easy checking such as using uninitialized variables. We can add dummy definitions in the entry node. The uninitialized values are used if any dummy definitions reach a program point p.

Algorithm

The compiler annotates each node n in the CFG with a set REACHES(n), computed as the a forward data-flow problem:

\[Reaches(n) = \Phi, \forall n\] \[Reaches(n) = \mathop{\cup}\limits_{m \in preds(n)} (DEDef(m) \cup (Reaches(m) \cap \overline{DefKill(m)}))\]

The DEDef is the downward-exposed definitions in m. DefKill(m) contains all the definition points that are obscured by a definition of the same name in m; d in DefKill(m) if d defines some name v and m contains a definition that also defines v.

DEDef and DefKill are both defined over the set of definition points, but computing each of them requires a mapping from names (variables and compiler-generated temporatries) to definition points.

Using iteration algorithm to approach the fixed point

Just like the liveness analysis, we can use the iteration method to reach the fixed point. For one procedure, we can implement the algorithm by three steps:

  1. Build the cfg
  2. Gather initial information: compute the Reaches and the DefKill and build a map of the name to defination points with a pass
  3. Perform the iteration to approach the fixed point

The pseudo codes are following:

multi_map<string, statement*> definition_info

def Init(b):
    b.Reaches = null
    b.DefKill = null
    for i in b.statements:
        if IsDefineStmt(i):
            definition_info.insert((i.name, &i))    # build map
            b.DefKill().append(&i)                  # build DefKill set
        for oprand in i.oprands():
            if oprand not in b.DefKill():
                b.Reaches().append(&i)              # build Reaches set

# assume CFG has N blocks
# numbered 0 to N - 1

def GetReachingDefinitionOutPass(cfg):
    for bb in cfg:
        bb.ReachingDefOut = null

    changed = true
    while changed:
        changed = false
        for i in range(N):
            cfg[i].ReachingDefOut = cfg.ReachingDefOut.recompute()
            if cfg[i].ReachingDefOut.changed:
                changed = true