Sunday, April 3, 2011

Global Optimizations in the LLVM Backend

Recall that LLVM structures the entire contents of a program as a Module object, which contains a list of Functions, each representing a function of that program. In doing global register allocation, we are performing a global optimization that requires access not only to individual functions, but also to a call graph relating them, in an easily accessible form.

Although the LLVM compiler is well designed for global passes (or, in LLVM terminology, ModulePasses) over its high-level abstract syntax form, the current version of LLVM has seemingly little support for optimizations that require access to multiple functions in the compilation backend. For example, to write an alias analysis optimization, one would write a class that extends ModulePass. Unfortunately, the backend of LLVM is designed to keep functions completely separate from each other. In fact, instruction selection, register allocation, and assembly/bytecode output are currently written as a set of MachineFunctionPasses, that is, passes that operate independently on the machine IR representation of each function. All target-dependent optimizations are likewise performed on a function-by-function basis and are not inter-procedural in any meaningful way.

The implementation of the LLVM backend currently lacks support for global optimizations such as inter-procedural register allocation. Most backend work is performed on MachineFunction objects, which hold the target-dependent intermediate representation of their associated Functions. Each Function therefore has a single MachineFunction, and that same MachineFunction is used throughout instruction selection, register allocation, and assembly output.

However, MachineFunctions are allocated by the MachineFunctionAnalysis class, which is a FunctionPass. This may not seem like an issue at first, but note that MachineFunctions are allocated in the runOnFunction() method and persist across the entire backend compilation process. The problem becomes apparent once we consider how passes are managed: there is no easy way to access another MachineFunction while modifying the current one. Worse yet, as programmers we have no way of knowing whether the other MachineFunctions have even been allocated at that point. Depending on the implementation of the Pass Manager, a compiler could feasibly translate Functions into MachineFunctions one by one, performing instruction selection, register allocation, and assembly generation on each before proceeding to the next Function. Under this design of LLVM, we simply cannot make any assumptions.

It turns out that modifying LLVM to add support for global backend optimizations is not actually that tricky. The MachineFunctionAnalysis code should be modified to generate all MachineFunction allocations at once, giving the programmer the ability to access any specific MachineFunction at will. Because we want to spend our time on the register allocation problem itself rather than wrestling with the LLVM infrastructure, we believe the following may be the simplest way to get around these issues:

  1. Modify MachineFunctionAnalysis to initialize the object with a mapping from Functions to MachineFunctions, and add getMachineFunction(Function &), which returns a Function's associated MachineFunction.
  2. Write register allocation as a ModulePass, requiring that the MachineFunctionAnalysis and CallGraph passes have been run beforehand.
  3. Have runOnModule() traverse the CallGraph and perform its analysis on the strongly connected components in a bottom-up fashion.
