CSE411

CSE411: Introduction to Compilers

Intermediate Representation

Jooyong Yi (UNIST)

2023 Spring
CSE411

Where are we?

                 |-----------|        |----------|
Source Program → | Front End | → IR → | Back End | → Target Program
                 |-----------|        |----------|
                                  ↑
         |-------|            |--------|           |----------|        |-----------|
Source → | Lexer | → tokens → | Parser | → parse → | Semantic |→ AST → |   IR      | → IR
Program  |       |            |        |   tree    | Analyzer |        | Generator |
         |-------|            |--------|           |----------|        |-----------|
  • IR: Intermediate Representation (or Intermediate Code)
2023 Spring
CSE411

Example of IR

  • LLVM IR
int sum(int x, int y) { int z = x + y; return z; }
define i32 @sum(i32 noundef %0, i32 noundef %1) #0 {
  %3 = alloca i32, align 4 ; allocate the memory on the current call stack
  %4 = alloca i32, align 4
  %5 = alloca i32, align 4
  store i32 %0, ptr %3, align 4
  store i32 %1, ptr %4, align 4
  %6 = load i32, ptr %3, align 4
  %7 = load i32, ptr %4, align 4
  %8 = add nsw i32 %6, %7
  store i32 %8, ptr %5, align 4
  %9 = load i32, ptr %5, align 4
  ret i32 %9
}
2023 Spring
CSE411

Another Example of IR

  • Java Bytecode
static int sum(int x, int y) { int z = x + y; return z; }
  static int sum(int, int);
    descriptor: (II)I
    flags: (0x0008) ACC_STATIC
    Code:
      stack=2, locals=3, args_size=2
         0: iload_0
         1: iload_1
         2: iadd
         3: istore_2
         4: iload_2
         5: ireturn
      LineNumberTable:
        line 3: 0
        line 4: 4
2023 Spring
CSE411

Why IR?

  • Why not directly translate AST to the target program?
         |-------|            |--------|           |----------|        |-----------|
Source → | Lexer | → tokens → | Parser | → parse → | Semantic |→ AST → |   Code    | → Target
Program  |       |            |        |   tree    | Analyzer |        | Generator |   Program
         |-------|            |--------|           |----------|        |-----------|
2023 Spring
CSE411

Why IR?

2023 Spring
CSE411

Why IR?

2023 Spring
CSE411

Why IR?

2023 Spring
CSE411

Why IR?

2023 Spring
CSE411

Why IR?

  • front ends and back ends to support language-machine pairs.
2023 Spring
CSE411

Why IR?

  • front ends and back ends to support language-machine pairs.
  • Direct translation from AST to a target program is not easy.
    • You should consider both high-level/low-level issues.
  • It is typically easier to perform optimization with IR than with AST.
2023 Spring
CSE411

Three-Address Code

%8 = add %6, %7

br i1 %cond, label %IfEqual, label %IfUnequal

ret 1
  • Each instruction uses addresses where .
  • Close to machine instructions
  • E.g., LLVM IR
2023 Spring
CSE411

Java Bytecode

         0: iload_0
         1: iload_1
         2: iadd
         3: istore_2
         4: iload_2
         5: ireturn
  • Language for a stack-based virtual machine.
  • More abstract than the three-address code.
2023 Spring
CSE411

How to Generate IR?

2023 Spring
CSE411

How to Generate IR?

static void foo() {
    int x;
    x = 1;
}
    ICONST_1
    ISTORE 0
    RETURN
2023 Spring
CSE411

How to Generate IR?

static void foo() { int x; x = 1; }

    ICONST_1
    ISTORE 0
    RETURN
2023 Spring
CSE411

Various Representations of a Program

  • Source code
  • Parse tree
  • AST 1, AST 2, ...
  • IR 1, IR 2, ...
  • Target code
2023 Spring

\begin{tikzpicture}[node distance=1cm and 10cm] \node (c) { $C$ }; \node (cpp) [below of=c] { $C++$ }; \node (rust) [below of=cpp] { $Rust$ }; \node (Julia) [below of=rust] { $Julia$ }; \node (x86) [right=4cm of c] { $x86$ }; \node (arm) [below of=x86] { $ARM$ }; \node (mips) [below of=arm] { $MIPS$ }; \node (alpha) [below of=mips] { $Alpha$ }; \draw [arrow] (c) -- (x86); \draw [arrow] (c) -- (arm); \draw [arrow] (c) -- (mips); \draw [arrow] (c) -- (alpha); \draw [arrow] (cpp) -- (x86); \draw [arrow] (cpp) -- (arm); \draw [arrow] (cpp) -- (mips); \draw [arrow] (cpp) -- (alpha); \draw [arrow] (rust) -- (x86); \draw [arrow] (rust) -- (arm); \draw [arrow] (rust) -- (mips); \draw [arrow] (rust) -- (alpha); \draw [arrow] (Julia) -- (x86); \draw [arrow] (Julia) -- (arm); \draw [arrow] (Julia) -- (mips); \draw [arrow] (Julia) -- (alpha); \end{tikzpicture}

\begin{tikzpicture}[node distance=1cm and 10cm] \node (java) { $Java$ }; \node (kotlin) [below of=java] { $Kotlin$ }; \node (scala) [below of=kotlin] { $Scala$ }; \node (closure) [below of=scala] { $Closure$ }; \node (x86) [right=4cm of java] { $x86$ }; \node (arm) [below of=x86] { $ARM$ }; \node (mips) [below of=arm] { $MIPS$ }; \node (alpha) [below of=mips] { $Alpha$ }; \draw [arrow] (java) -- (x86); \draw [arrow] (java) -- (arm); \draw [arrow] (java) -- (mips); \draw [arrow] (java) -- (alpha); \draw [arrow] (kotlin) -- (x86); \draw [arrow] (kotlin) -- (arm); \draw [arrow] (kotlin) -- (mips); \draw [arrow] (kotlin) -- (alpha); \draw [arrow] (scala) -- (x86); \draw [arrow] (scala) -- (arm); \draw [arrow] (scala) -- (mips); \draw [arrow] (scala) -- (alpha); \draw [arrow] (closure) -- (x86); \draw [arrow] (closure) -- (arm); \draw [arrow] (closure) -- (mips); \draw [arrow] (closure) -- (alpha); \end{tikzpicture}

\begin{tikzpicture}[node distance=1cm and 10cm] \node (c) { $C$ }; \node (cpp) [below of=c] { $C++$ }; \node (rust) [below of=cpp] { $Rust$ }; \node (julia) [below of=rust] { $Julia$ }; \node (llvm) [right=2.5cm of c, below of=c] { LLVM IR }; \node (x86) [right=4.5cm of c] { $x86$ }; \node (arm) [below of=x86] { $ARM$ }; \node (mips) [below of=arm] { $MIPS$ }; \node (alpha) [below of=mips] { $Alpha$ }; \draw [arrow] (c) -- (llvm); \draw [arrow] (cpp) -- (llvm); \draw [arrow] (rust) -- (llvm); \draw [arrow] (julia) -- (llvm); \draw [arrow] (llvm) -- (x86); \draw [arrow] (llvm) -- (arm); \draw [arrow] (llvm) -- (mips); \draw [arrow] (llvm) -- (alpha); \end{tikzpicture}