Data Flow Analysis -- Reaching definition Analysis
It has been a long time since I wrote the last blog, because I was preparing the Toefl and GRE exam 😢 (so hard). I reviewed the data flow analysis algoritms recently by watching the Li Yue’s videos and reading Engineer A Compiler. Here are some ideas.
What’s the Reaching definition
In some case the compiler need to know that where the operand has been defined. To find a set of definitions that reach a blocl, the compilers can compute the reaching definitions.
The domain of the Reaches is the set of definitions in the procedure. The definition b of some vaiable v reaches operation i if and only if i reads the value of v and there exists a path that from d to i does not redefine v.
What’s the use of Reaching definition
We can build the define-use chain to perform the further optimization. Also, we can perform some easy checking such as using uninitialized variables. We can add dummy definitions in the entry node. The uninitialized values are used if any dummy definitions reach a program point p.
Algorithm
The compiler annotates each node n in the CFG with a set REACHES(n), computed as the a forward data-flow problem:
\[Reaches(n) = \Phi, \forall n\] \[Reaches(n) = \mathop{\cup}\limits_{m \in preds(n)} (DEDef(m) \cup (Reaches(m) \cap \overline{DefKill(m)}))\]The DEDef is the downward-exposed definitions in m. DefKill(m) contains all the definition points that are obscured by a definition of the same name in m; d in DefKill(m) if d defines some name v and m contains a definition that also defines v.
DEDef and DefKill are both defined over the set of definition points, but computing each of them requires a mapping from names (variables and compiler-generated temporatries) to definition points.
Using iteration algorithm to approach the fixed point
Just like the liveness analysis, we can use the iteration method to reach the fixed point. For one procedure, we can implement the algorithm by three steps:
- Build the cfg
- Gather initial information: compute the Reaches and the DefKill and build a map of the name to defination points with a pass
- Perform the iteration to approach the fixed point
The pseudo codes are following:
- Gather the initial information
multi_map<string, statement*> definition_info
def Init(b):
b.Reaches = null
b.DefKill = null
for i in b.statements:
if IsDefineStmt(i):
definition_info.insert((i.name, &i)) # build map
b.DefKill().append(&i) # build DefKill set
for oprand in i.oprands():
if oprand not in b.DefKill():
b.Reaches().append(&i) # build Reaches set
- Perform the iteration algorithm
# assume CFG has N blocks
# numbered 0 to N - 1
def GetReachingDefinitionOutPass(cfg):
for bb in cfg:
bb.ReachingDefOut = null
changed = true
while changed:
changed = false
for i in range(N):
cfg[i].ReachingDefOut = cfg.ReachingDefOut.recompute()
if cfg[i].ReachingDefOut.changed:
changed = true