Skip to content

Optimizing compiler for the Smpl programming language implemented in Python

Notifications You must be signed in to change notification settings

andrej/smpl-compiler

Repository files navigation

Another Smpl Compiler

Provides a (somewhat) optimizing compiler for the Smpl programming language as used in the CS 241 Advanced Compiler Design class by Dr. Michael Franz at the University of California, Irvine. See the test_progs directory for example programs of the Smpl programming language.

The compiler includes a backend for the DLX instruction set architecture, and an emulator for this ISA.

For help running the compiler, see

./main.py --help

Structure

The path from input file to compiled machine code goes as follows:

  1. input file

  2. lexer (provides stream of tokens) and parser (creates tree of AST nodes), see parser.py

  3. AST, see ast.py

    1. Interpreter: each AST node has a run() method that implements that node's semantics in Python and executes it

    2. Compiler: We produce a single static assignment intermediate representation. For this, each AST node has a compile() method that appends (emits) the appropriate SSA instructions to a given CompilationContext.

      IfStatements and WhileStatements create new basic blocks in the context. Note that both of these statements end in one join block; the current block in the context is updated to point to this block. Hence it is safe to keep emitting the instructions following if/while statement in the current basic block.

  4. Common subexpression elimination, dead code elimination and constant elimination, see ssa.py

    Common subexpression elimination is performed as statements are compiled: The emit() method of the compilation context checks whether an equivalent dominating instruction has been previously emitted. If so, a pointer to that instruction is returned and no instruction is emitted.

    The other two optimizations are performed during multiple passes over the IR, removing instructions until a fixed point is reached.

  5. DLX machine code or assembly generation, see backend.py (abstract base class), dlx.py and allocator.py. The machine code is generated by iterating through the SSA IR basic blocks in a depth-first manner; the backend in dlx.py calls back to its given allocator (allocator.py) to map the SSA operands to actual machine registers and memory locations (for spills).

About

Optimizing compiler for the Smpl programming language implemented in Python

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages