Clojure Real Time

Clojure Real Time (clojure-rt) is a compiler for the Clojure programming language.

It is being developed to allow deterministic and fast execution, which could let Clojure spread beyond its original application domains. It uses LLVM for aggressive optimisations, JIT compilation and platform independence.

The compiler strives to be a full implementation of Clojure, following the reference Java implementation as closely as possible.

The interop part of Java has been reimplemented in C to enable compilation of the *.clj part of the Clojure source code (this is a work in progress).

All the data structures have been developed in C. The compiler uses JIT to compile clojure->llvm->binary at runtime. The primary advantage of this implementation is its reference counting memory model, based on Reinking et al., MSR-TR-2020-42, Nov 29, 2020. It allows Clojure programmes to behave predictably with respect to memory management and execution time, as no garbage collector is involved. Furthermore, the advanced optimisations made possible by LLVM often let the compiler execute Clojure code at nearly native speed.
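The reference counting model described above can be illustrated with a minimal C sketch. This is not the clojure-rt runtime: the `Object` layout and the names `object_create`, `retain`, `release` and `live_objects` are hypothetical, and the counter is not atomic, so the sketch is single-threaded only. It shows the property that matters here: memory is reclaimed at a known point (the last `release`) rather than at some later collector pause.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical object header; the real clojure-rt layout may differ. */
typedef struct Object {
    unsigned refcount; /* number of owners currently holding this object */
    long payload;      /* stand-in for the actual object contents */
} Object;

/* Counts live objects so the moment of reclamation can be observed. */
static int live_objects = 0;

/* Allocate an object owned by the caller (reference count starts at 1). */
static Object *object_create(long payload) {
    Object *o = malloc(sizeof *o);
    assert(o != NULL);
    o->refcount = 1;
    o->payload = payload;
    live_objects++;
    return o;
}

/* Register one more owner. */
static Object *retain(Object *o) {
    o->refcount++;
    return o;
}

/* Drop one owner; the object is freed as soon as the last owner lets go. */
static void release(Object *o) {
    assert(o->refcount > 0);
    if (--o->refcount == 0) {
        live_objects--;
        free(o);
    }
}
```

In this sketch, every `object_create` or `retain` must be paired with exactly one `release`; in a compiler these calls would be inserted automatically rather than written by hand.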

Type annotations, commonly used in the Java implementation to speed up execution, are not needed in clojure-rt: all types are discovered during programme execution, and the resulting JITted representations are optimised to benefit from static type analysis. To achieve this, a two-tiered JIT compiler designed specifically with Clojure in mind has been developed.
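The general idea of specialising code on types observed at runtime can be sketched in C as follows. This is only an illustration of the technique, not the clojure-rt JIT: the `Value` representation and the functions `add_generic`, `add_int_int` and `add_call_site` are hypothetical names. The generic path inspects type tags on every call, while the specialised path works on plain machine integers and is protected by a guard that falls back to the generic path when the observed types change.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical tagged value; a real runtime would support many more types. */
typedef enum { TAG_INT, TAG_DOUBLE } Tag;

typedef struct {
    Tag tag;
    union { int64_t i; double d; } as;
} Value;

static Value make_int(int64_t i) { Value v; v.tag = TAG_INT; v.as.i = i; return v; }
static Value make_double(double d) { Value v; v.tag = TAG_DOUBLE; v.as.d = d; return v; }

static double to_double(Value v) {
    return v.tag == TAG_INT ? (double)v.as.i : v.as.d;
}

/* Generic path: dispatches on the tags of both arguments every time. */
static Value add_generic(Value a, Value b) {
    if (a.tag == TAG_INT && b.tag == TAG_INT) {
        return make_int(a.as.i + b.as.i);
    }
    return make_double(to_double(a) + to_double(b));
}

/* Specialised path: the kind of code that could be emitted once both
   arguments have been observed to be integers. No tags are involved. */
static int64_t add_int_int(int64_t a, int64_t b) {
    return a + b;
}

/* Call site with a type guard: use the specialised code while the
   observed types hold, otherwise fall back to the generic path. */
static Value add_call_site(Value a, Value b) {
    if (a.tag == TAG_INT && b.tag == TAG_INT) {
        return make_int(add_int_int(a.as.i, b.as.i));
    }
    return add_generic(a, b);
}
```

Once the guard has passed, an optimiser can inline `add_int_int` and keep values unboxed across a whole loop, which is where this approach gets its benefit.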

Currently the compiler is set up in bootstrap mode and consists of two separate parts:

  1. Compiler frontend written in Clojure itself, to be executed using leiningen / Java Clojure. Based on org.clojure/tools.reader and org.clojure/tools.analyzer.

    The frontend is therefore very simple: it only adds a few extra passes that compute memory management annotations.

  2. Compiler backend written in C++ (llvm part) and C (runtime + basic standard library).

    The backend is composed of an LLVM code generator and LLVM JIT infrastructure, and it bootstraps an implementation of the Clojure runtime in C (for performance). All the basic immutable data structures are implemented in C, allowing the highest level of optimisation.
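How an immutable data structure can be combined with reference counting in C is sketched below. It is a simple persistent singly linked list, not one of the actual clojure-rt data structures, and all names (`Node`, `list_cons`, `node_retain`, `node_release`, `list_sum`, `live_nodes`) are hypothetical. Prepending an element never copies or mutates the existing list; the new node just takes an extra reference to the shared tail.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical immutable list node with an embedded reference count. */
typedef struct Node {
    unsigned refcount;
    long value;
    struct Node *next; /* shared tail, never mutated after construction */
} Node;

/* Counts live nodes so sharing and reclamation can be observed. */
static int live_nodes = 0;

static Node *node_retain(Node *n) {
    if (n != NULL) n->refcount++;
    return n;
}

/* Drop one reference; free every node whose count reaches zero,
   walking down the tail iteratively to avoid deep recursion. */
static void node_release(Node *n) {
    while (n != NULL && --n->refcount == 0) {
        Node *next = n->next;
        free(n);
        live_nodes--;
        n = next;
    }
}

/* Build a new list with `value` in front of `tail`.
   The tail is shared (retained), not copied. */
static Node *list_cons(long value, Node *tail) {
    Node *n = malloc(sizeof *n);
    assert(n != NULL);
    n->refcount = 1;
    n->value = value;
    n->next = node_retain(tail);
    live_nodes++;
    return n;
}

/* Sum all elements; reads only, so no reference counts change. */
static long list_sum(const Node *n) {
    long sum = 0;
    for (; n != NULL; n = n->next) sum += n->value;
    return sum;
}
```

Two lists built with `list_cons` on the same tail share that tail's nodes, and the shared nodes are freed only when the last list referring to them is released.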

The two parts of the compiler currently communicate using protocol buffers (the *.cljb files). However, as soon as Clojure itself can be compiled with the two-part bootstrap compiler, a self-hosted compiler will be developed.

Compilation

The compilation process has been tested on OS X only, but thanks to cmake it should work elsewhere, provided you manage to install the dependencies from begin_development.sh using your system's package manager.

  1. ./begin_development.sh
  2. cd backend-v2
  3. cmake . -DCMAKE_PREFIX_PATH=/opt/homebrew -DCMAKE_BUILD_TYPE=Debug   # or Release
  4. make -j 8

This should build the C/C++ compiler backend.

Running

./compile.sh fib.clj

Please note that the backend currently prints out a lot of debug information, mainly the LLVM code generated for the given Clojure statements. Please also note that, for now, programmes compiled by the frontend also have to be executed by the frontend in order to generate the AST. As a result, the naive recursive (fib 42) takes 26 seconds on my machine in the frontend (running on the Java implementation) and 0.46 seconds in the backend (clojure-rt). This shows not only how much faster clojure-rt can be, but also how much will be gained once the compiler bootstrapping process is complete.
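For reference, the same kind of naive, doubly recursive Fibonacci can be written directly in C to obtain a native baseline for the timings above. This is a sketch under assumptions: the contents of fib.clj are not reproduced here, and the base cases used below (fib(0) = 0, fib(1) = 1) are an assumption about its definition.

```c
#include <assert.h>
#include <stdint.h>

/* Naive recursive Fibonacci with no memoisation, so the number of
   calls grows exponentially with n. Assumes fib(0) = 0 and fib(1) = 1. */
static uint64_t fib(int n) {
    assert(n >= 0);
    if (n < 2) {
        return (uint64_t)n;
    }
    return fib(n - 1) + fib(n - 2);
}
```

Calling `fib(42)` from a small `main` function and timing it, with the optimisation level of your choice, gives a figure to compare against the backend's 0.46 seconds.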

Generating protobuf models

The models do not need to be regenerated before compilation. However, regenerating them may become necessary at some point during development. The process is described below.

The protobuf models are derived from the org.clojure/tools.analyzer.jvm documentation of its AST representation.

We use two files to describe the AST as a data structure:

frontend/resources/ast-types.edn
frontend/resources/ast-ref.edn

The ast-ref.edn file can be readily obtained from the analyzer repository; it can also be autogenerated using a script found in this repository. The types file (ast-types.edn) is created manually to assign protobuf types to the elements of the AST produced by the analyser (needed here because Clojure is dynamically typed).

The frontend application can not only compile .clj into .cljb (the protobuf representation of the AST), but can also generate the protobuf definition (model/bytecode.proto) from the above files.

Then, protoc can be used to generate the Clojure-specific and C++-specific code for decoding and encoding .cljb files.

TL;DR: The whole process is automated and can be run this way:

cd model
./generate.sh

Self-hosted compiler roadmap

The self-hosted compiler will consist of:

  1. Compiler backend, exactly as in (2) above.
  2. Compiled protobuf file of Clojure itself. It will be generated from the main Clojure repository using the bootstrap compiler (1).
  3. Compiled protobuf files of (1) with all its dependencies.
  4. Bootstrapper as the main function of the C++ application.

The bootstrapper would launch the backend with the pre-built protobuf files as initial input. Then, it would compile the frontend (later we could use the LLVM representation of the complete compilation result so that the backend starts more rapidly).

Finally, any clojure files to be compiled and run (including possible REPL commands) would be fed to the running system, first passed through the frontend and finally to the backend for execution. The protobuf files will probably be transferred between frontend and backend as in-memory data structures at this stage, as both parts of the compiler would reside inside the same process.

License

clojure-rt is distributed under the GPLv2 license. See LICENSE.md for details.

Copyright © 2022-2026 Marek Lipert, Aleksander Wiacek
