LLVM — Introduction and Setup

#llvm #compilers #computerscience #learning

Helloo! Welcome to my next project—The LLVM framework (the Kaleidoscope tutorial, to be precise). I decided to try this out as soon as I finished the jlox Interpreter in Robert Nystrom’s Crafting Interpreters.

What’s LLVM?
LLVM (originally short for ‘Low Level Virtual Machine’, though the project no longer treats the name as an acronym) is a full compiler infrastructure: front ends for various programming languages generate LLVM Intermediate Representation (IR), LLVM optimises it, and a back end converts it into hardware-specific machine code. This machine code is what finally runs on the CPU.

Why am I doing this?
The Abstract Syntax Trees (ASTs) produced by parsing code are specific to the language being parsed. LLVM’s IR, by contrast, works with hardware-level concepts like registers, memory and arithmetic operations, so once our language is lowered to IR it can reuse the same optimisations and back ends as any other language that targets LLVM.

Unlike assembly language, which is tied to a particular platform, LLVM (whose IR closely resembles assembly) can target many machines, such as Intel x86, ARM and RISC-V. I find heterogeneous hardware very fascinating, especially the prospect of hardware-agnostic compilation. That’s pretty much what drew me to LLVM.

How I set it up:
While there are many tutorials to familiarise oneself with the platform, I (surprisingly) found the documentation wonderful. Some of it I skipped, but the set-up and introduction were quite helpful. I still haven’t figured out a lot, so I’m taking things slowly. For anyone wanting to start out with LLVM, I’d recommend these:

Getting the source code and building LLVM
An example for using the LLVM tool chain
Disclaimer: Most of what I did below is adapted from these tutorials.

1) Getting LLVM running on macOS (especially Apple Silicon) is very straightforward with Homebrew.

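brew install llvm
# Add these two lines to ~/.zshrc so they persist across terminal sessions:
export PATH="/opt/homebrew/opt/llvm/bin:$PATH"
export CMAKE_PREFIX_PATH="/opt/homebrew/opt/llvm"
# Then reload the shell configuration:
source ~/.zshrc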
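
A quick way to confirm the PATH change took effect is to ask the shell where each tool resolves. This is only a sketch: check_tool is a name I made up for illustration, and the tool list is simply the set of commands used later in this post.

```shell
#!/bin/sh
# check_tool is a hypothetical helper, not part of LLVM or Homebrew.
# It prints where a command resolves on PATH, or reports that it is missing.
check_tool() {
    if resolved=$(command -v "$1" 2>/dev/null); then
        printf '%s -> %s\n' "$1" "$resolved"
    else
        printf '%s -> not found on PATH\n' "$1"
        return 1
    fi
}

missing=0
for tool in llvm-config clang lli llc llvm-dis; do
    check_tool "$tool" || missing=1
done

if [ "$missing" -eq 1 ]; then
    echo "Some tools are missing; check the PATH export above."
fi
```

If the export worked, each line should point somewhere under /opt/homebrew/opt/llvm/bin rather than at the system toolchain.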

2) Managing Dependencies: I used CMake.

llvm-config --version
cd ~
mkdir toy-compiler
cd toy-compiler
mkdir src build
nano CMakeLists.txt

Here's the CMakeLists.txt (configuration) for the toy-compiler/Kaleidoscope project I'll be doing. It serves as a project manager, telling the compiler where the LLVM "brains" are and which specific libraries (like JIT, native codegen) we want to use.
It doesn't compile the code itself, and instead gathers all the ingredients (libraries, headers, compiler settings) so the build tool (Ninja, in my case) knows what to do.

cmake_minimum_required(VERSION 3.13)
project(ToyCompiler LANGUAGES C CXX)

set(CMAKE_CXX_STANDARD 17)
set(CMAKE_CXX_STANDARD_REQUIRED ON)

find_package(LLVM REQUIRED CONFIG)
message(STATUS "Found LLVM ${LLVM_PACKAGE_VERSION}")

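# LLVM_DEFINITIONS is a space-separated string; split it into a list first
separate_arguments(LLVM_DEFINITIONS_LIST NATIVE_COMMAND ${LLVM_DEFINITIONS})
add_definitions(${LLVM_DEFINITIONS_LIST})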
include_directories(${LLVM_INCLUDE_DIRS})
link_directories(${LLVM_LIBRARY_DIRS})

add_executable(toy src/main.cpp)

# For JIT and native codegen
llvm_map_components_to_libnames(llvm_libs core orcjit native)
target_link_libraries(toy ${llvm_libs})

3) Verifying the build system:

To ensure the libraries are linking correctly, I wrote a simple sanity check in src/main.cpp using LLVM's output stream, llvm::outs():

#include "llvm/Support/raw_ostream.h"
int main() {
    llvm::outs() << "LLVM setup works!\n";
    return 0;
}

To actually compile the project, I used Ninja. It's a small build system focused on speed.

brew install ninja
cd ~/toy-compiler/build
cmake -G Ninja ..
ninja
./toy    # prints: LLVM setup works!
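
If find_package(LLVM) fails during the cmake step (for example, in a shell where CMAKE_PREFIX_PATH is not set), one workaround is to pass LLVM's CMake directory explicitly. The sketch below assumes llvm-config is on PATH; configure_and_build is a wrapper name I made up, while the cmake and llvm-config flags are the real ones.

```shell
#!/bin/sh
# configure_and_build is a hypothetical wrapper name; the cmake and
# llvm-config flags themselves are real. Run it from the project root
# (the directory that holds CMakeLists.txt).
configure_and_build() {
    # Ask llvm-config where LLVMConfig.cmake lives, so find_package(LLVM)
    # does not depend on CMAKE_PREFIX_PATH being set in this shell.
    llvm_cmake_dir=$(llvm-config --cmakedir) || return 1
    # -S (source dir) and -B (build dir) need CMake 3.13 or newer.
    cmake -S . -B build -G Ninja -DLLVM_DIR="$llvm_cmake_dir" || return 1
    cmake --build build
}
```

Usage: cd ~/toy-compiler && configure_and_build && ./build/toy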

4) An example: To truly understand how compilation happens at lower levels, it helps to follow a program through the entire pipeline, from high-level code to bitcode, and finally to a native binary.
test1.c

#include <stdio.h>
int main() {
    printf("First go at LLVM!");
    return 0;
}

clang -O3 -emit-llvm test1.c -c -o test1.bc (Clang is LLVM's C front end and the default C compiler on macOS; -emit-llvm with -c makes it write LLVM bitcode instead of a native object file.)
lli test1.bc (lli executes the bitcode directly, without producing a native binary first.)
llvm-dis < test1.bc | less (This disassembles the bitcode into human-readable LLVM assembly.)
llc test1.bc -o test1.s (llc compiles the bitcode to native assembly.)
gcc test1.s -o test1.native (This assembles and links the native assembly into an executable.)
Running ./test1.native → First go at LLVM!
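
To watch the "optimise it" stage at work, one option is to emit textual IR at two optimisation levels and diff the results. This is a sketch under my own assumptions: compare_ir and the .O0.ll/.O3.ll file names are made up for illustration, and it expects the source file to sit in the current directory.

```shell
#!/bin/sh
# compare_ir is a hypothetical helper name; the clang flags are real.
# -S together with -emit-llvm writes textual IR (.ll) instead of bitcode.
compare_ir() {
    src=$1
    base=${src%.c}
    clang -O0 -S -emit-llvm "$src" -o "$base.O0.ll" || return 1
    clang -O3 -S -emit-llvm "$src" -o "$base.O3.ll" || return 1
    # diff exits with status 1 when the files differ, which is the
    # expected case here, so its exit status is deliberately ignored.
    diff "$base.O0.ll" "$base.O3.ll" || true
}
```

Usage: compare_ir test1.c | less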

What's next:
The lexer and parser for Kaleidoscope!

Musings:
Sometimes, I get these sudden, intense urges to just... be somewhere. I often find myself wishing for Doraemon’s "Anywhere Door" so I could instantly step into a completely different world. It’s not that I don’t enjoy where I am, but there’s an incomparable high that comes from being somewhere totally foreign.

Travel has always been my ultimate meditation. I once read that our memories of new places are so vivid because we become hyper-sensitive to our surroundings. In our daily lives—the same commute, the same routine—we stop truly "seeing" the world. We stop noticing the colour of the sky, the old man reading the newspaper by the shopfront, or the little puppy prancing around with a piece of aluminium foil. But in a new place, with your senses wide open, you notice everything.

A few years ago, I visited Sikkim (an absolutely stunning state in our North-East), and I’ve never felt more alive. Even now, I can recall every sound, smell, and sight from that trip with perfect clarity. That level of observation taught me how to be more attentive to the daily moments in my life. Ever since, I've tried to carry that traveller’s spirit with me, finding joy in the small details—in the "normal" and the everyday.

DEV Community
