What we have done so far:
- explored LLVM's infrastructure enough to determine that its backend is not flexible/amenable enough to make global register allocation (at least, in the more sophisticated form we want to do) feasible for a 6-week project.
- devised a bottom-up-and-top-down algorithm for performing GRA, in a manner that supports rewriting calling conventions per-function
- planned new infrastructure for our undergrad compiler to support the algorithm
We expect infrastructure to be much easier to manipulate in our compiler than in LLVM. In essence, we are tearing out and rewriting the back-most parts (register allocation, code generation - everything up through liveness and function inlining will remain intact) with new pipeline stages. A vague checklist of what needs to be done:
- find and iterate over strongly connected components of a call graph
- register allocator support for inserting conflict-resolving register move instructions, and for allocating on "virtual" and "hardware" registers at the same time
- main GRA pass, which will go top-down and bottom-up propagating constraints, and invoke the register allocator as necessary, and also keep track of register-used sets to determine when saving and restoring registers is needed
- code generation backend rewritten to respect the calling conventions we decide on instead of the standard x86 one
For benchmarking, we plan to re-use the benchmark suite from 15-411, which has several complicated computation-heavy programs, such as a collatz conjecture solver and hunt the wumpus. If this proves to be insufficient (and probably even if it doesn't), we will spend some time generating more test cases. (Note: I expect the "coming up with a new algorithm" thing will take much energy away from benchmarking, so I personally would be proud to have a full-featured correct implementation even without thorough benchmarking.)
No comments:
Post a Comment