codeboost/xitdb-clj

⚠️ Alpha Software - Work in Progress

This project is in early development and rapidly evolving. Expect breaking changes, rough edges, and incomplete documentation.

Help Wanted! If you find this useful, please consider contributing:

  • Report bugs and issues you encounter
  • Suggest improvements or new features
  • Submit pull requests for fixes or enhancements
  • Share your configuration patterns and workflows
  • Help improve documentation and examples

Your feedback and contributions will help make this tool better for the entire Clojure community!

Overview

xitdb-clj is an embedded database for efficiently storing and retrieving immutable, persistent data structures. The library provides atom-like semantics for working with the database from Clojure.

It is a Clojure interface for xitdb-java, which is itself a port of xitdb, a database written in Zig.

Main characteristics

  • Embeddable, tiny library.
  • Supports writing to a file as well as purely in-memory use.
  • Each transaction (done via swap!) efficiently creates a new "copy" of the database, and past copies can still be read from.
  • Reads and writes are efficient: only the necessary nodes are read or written.
  • Thread safe. Multiple readers, one writer.
  • Append-only. The data you are writing is invisible to any reader until the very last step, when the top-level history header is updated.
  • All heavy lifting is done by the underlying Java library (xitdb-java).
  • Database files can be used from other languages via the xitdb Java library or the xitdb Zig library.

Quick Start

Add the dependency to your project and start a REPL.

You already know how to use it!

For the programmer, a xitdb database is like a Clojure atom: use reset! or swap! to reset or update it, and deref or @ to read it.

(require '[xitdb.db :as xdb])
(def db (xdb/xit-db "my-app.db"))
;; Use it like an atom
(reset! db {:users {"alice" {:name "Alice" :age 30}
                    "bob"   {:name "Bob" :age 25}}})
;; Read the entire database
(xdb/materialize @db)
;; => {:users {"alice" {:name "Alice", :age 30}, "bob" {:name "Bob", :age 25}}}

(get-in @db [:users "alice" :age])
;; => 30
(swap! db assoc-in [:users "alice" :age] 31)

(get-in @db [:users "alice" :age])
;; => 31

Data structures are read lazily from the database

Reading from the database returns wrappers around cursors in the database file:

(type @db) ;; => xitdb.hash_map.XITDBHashMap

The returned value is a XITDBHashMap, a wrapper around xitdb-java's ReadHashMap, which itself holds a cursor to the tree node in the database file. These wrappers implement the protocols for Clojure collections (vectors, lists, maps, and sets), so they behave just like the native Clojure data structures. Any read operation on these types returns new XITDB types:

(type (get-in @db [:users "alice"])) ;; => xitdb.hash_map.XITDBHashMap

A read therefore does not load the entire nested structure into memory; it returns a 'cursor' type that you can operate on with ordinary Clojure functions.

Use materialize to convert a nested XITDB data structure to a native Clojure data structure:

(xdb/materialize (get-in @db [:users "alice"])) ;; => {:name "Alice" :age 31}

No query language

Use filter, group-by, reduce, etc. If you want a query engine, datascript works out of the box: you can store the datoms as a vector in the db.

Here's a taste of what your queries could look like:

(defn titles-of-songs-for-artist
  [db artist]
  (->> (get-in db [:songs-indices :artist artist])
       (map #(get-in db [:songs % :title]))))

(defn what-is-the-most-viewed-song? [db tag]
  (let [views (->> (get-in db [:songs-indices :tag tag])
                   (map (:songs db))
                   (map (juxt :id :views))
                   (sort-by #(parse-long (second %))))]
    (get-in db [:songs (first (last views))])))
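
The queries above assume that the database holds a :songs map plus precomputed :songs-indices. As a minimal sketch, such indices could be built with plain Clojure functions before writing the value with reset!. The data shape, the sample songs, and the helper names (index-by-artist, index-by-tag, with-indices) are assumptions made for illustration; they are not part of xitdb-clj.

```clojure
;; Hypothetical data shape, inferred from the queries above:
;; {:songs {id {:id .. :title .. :artist .. :tags [..] :views "123"}}}
(def sample-songs
  {1 {:id 1 :title "Blue"  :artist "Ann" :tags ["jazz"]        :views "120"}
   2 {:id 2 :title "Red"   :artist "Ann" :tags ["jazz" "live"] :views "900"}
   3 {:id 3 :title "Green" :artist "Bo"  :tags ["live"]        :views "45"}})

(defn index-by-artist
  "Returns a map of artist -> vector of song ids (ids in ascending order)."
  [songs]
  (reduce (fn [idx {:keys [id artist]}]
            (update idx artist (fnil conj []) id))
          {}
          (sort-by :id (vals songs))))

(defn index-by-tag
  "Returns a map of tag -> vector of song ids (ids in ascending order)."
  [songs]
  (reduce (fn [idx {:keys [id tags]}]
            (reduce (fn [i tag] (update i tag (fnil conj []) id)) idx tags))
          {}
          (sort-by :id (vals songs))))

(defn with-indices
  "Bundles the songs with the indices that the queries above look up."
  [songs]
  {:songs songs
   :songs-indices {:artist (index-by-artist songs)
                   :tag    (index-by-tag songs)}})

(:songs-indices (with-indices sample-songs))
;; => {:artist {"Ann" [1 2], "Bo" [3]}, :tag {"jazz" [1 2], "live" [2 3]}}
```

With a database opened as in the Quick Start, the result could be stored with (reset! db (with-indices sample-songs)). Because the indices are ordinary maps and vectors, the query functions above work the same way on the plain value and on @db.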

History

Since the database is immutable, every previous value can be accessed by reading from the respective history index. The root data structure of a xitdb database is an ArrayList called 'history'. Each transaction appends a new entry to this list, which points to the latest value of the database (usually a map).

(xdb/deref-at db -1) ;; the most recent value, same as @db
(xdb/deref-at db -2) ;; the second most recent value
(xdb/deref-at db 0)  ;; the earliest value
(xdb/deref-at db 1)  ;; the second value

You can get the latest history index from the count of the database:

(def history-index (dec (count db)))

After making further transactions, you can revert to it like this:

(reset! db (xdb/deref-at db history-index))

It is also possible to create a transaction which returns the previous and current values of the database, by setting the *return-history?* binding to true.

;; Work with history tracking
(binding [xdb/*return-history?* true]
  (let [[history-index old-value new-value] (swap! db assoc :new-key "value")]
    (println "old value:" old-value)
    (println "new value:" new-value)))
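
The history calls above can be combined into a small checkpoint-and-rollback helper. This is only a sketch: it relies solely on count, xdb/deref-at and reset! as described in this section, and the helper names (checkpoint, rollback!, with-rollback!) are hypothetical, not part of xitdb-clj. Note that a rollback written this way appends a new history entry rather than truncating the history.

```clojure
(require '[xitdb.db :as xdb])

(defn checkpoint
  "Returns the index of the newest history entry."
  [db]
  (dec (count db)))

(defn rollback!
  "Writes the value stored at history index idx as a new transaction."
  [db idx]
  (reset! db (xdb/deref-at db idx)))

(defn with-rollback!
  "Hypothetical helper: runs (f db). If f throws, restores the value
  that was current before f started, then rethrows the exception."
  [db f]
  (let [idx (checkpoint db)]
    (try
      (f db)
      (catch Exception e
        (rollback! db idx)
        (throw e)))))
```

A call such as (with-rollback! db (fn [d] (swap! d assoc :step 1) (swap! d assoc :step 2))) would then leave the database at its earlier value if any of the swaps fails. The sketch assumes a single writer thread and at least one existing history entry.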

Freezing

One important distinction from the Clojure atom is that inside a transaction (e.g. a swap!), the data is temporarily mutable. This is exactly like Clojure's transients, and it is a very important optimization. However, it can lead to surprising behavior:

(swap! db (fn [moment]
            (let [moment (assoc moment :fruits ["apple" "pear" "grape"])
                  moment (assoc moment :food (:fruits moment))
                  moment (update moment :food conj "eggs" "rice" "fish")]
              moment)))

;; =>

{:fruits ["apple" "pear" "grape" "eggs" "rice" "fish"]
 :food ["apple" "pear" "grape" "eggs" "rice" "fish"]}

;; the fruits vector was mutated!

If you want to prevent data from being mutated within a transaction, you must freeze! it:

(swap! db (fn [moment]
            (let [moment (assoc moment :fruits ["apple" "pear" "grape"])
                  moment (assoc moment :food (xdb/freeze! (:fruits moment)))
                  moment (update moment :food conj "eggs" "rice" "fish")]
              moment)))

;; =>

{:fruits ["apple" "pear" "grape"]
 :food ["apple" "pear" "grape" "eggs" "rice" "fish"]}

Note that this does not make an expensive copy of the fruits vector. We are benefiting from structural sharing, just like in-memory Clojure data. The reason we have to freeze! is that the default differs from Clojure's: in Clojure, you must opt in to temporary mutability by using transients, whereas in xitdb you must opt out of it.
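
For contrast, here are the same steps on plain in-memory Clojure data, with no xitdb involved. Persistent collections are the default, so :fruits stays untouched without any freezing, and temporary mutability appears only where transient is requested explicitly. The helper name conj-all is made up for this illustration.

```clojure
;; Plain Clojure maps and vectors: immutable by default.
(def result
  (let [moment {}
        moment (assoc moment :fruits ["apple" "pear" "grape"])
        moment (assoc moment :food (:fruits moment))
        moment (update moment :food conj "eggs" "rice" "fish")]
    moment))

result
;; => {:fruits ["apple" "pear" "grape"]
;;     :food ["apple" "pear" "grape" "eggs" "rice" "fish"]}

;; Opting in to temporary mutability with a transient.
;; The transient never escapes this function, so callers
;; only ever see persistent vectors.
(defn conj-all
  [v items]
  (persistent! (reduce conj! (transient v) items)))

(conj-all (:fruits result) ["eggs" "rice" "fish"])
;; => ["apple" "pear" "grape" "eggs" "rice" "fish"]
```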

Architecture

xitdb-clj builds on xitdb-java which implements:

  • Hash Array Mapped Trie (HAMT) - For efficient map and set operations
  • RRB Trees - For vector operations with good concatenation performance
  • Structural Sharing - Minimizes memory usage across versions
  • Copy-on-Write - Ensures immutability while maintaining performance
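
Structural sharing and copy-on-write can be observed with ordinary in-memory Clojure data, which works on the same principle. The sketch below uses only core Clojure and says nothing about the on-disk layout of xitdb-java; the names v1 and v2 are illustrative.

```clojure
;; A value with one large part and one small part.
(def v1 {:songs (vec (range 100000))
         :meta  {:version 1}})

;; "Updating" the small part produces a new top-level map.
(def v2 (assoc-in v1 [:meta :version] 2))

;; The large vector is not copied: both versions point at the same object.
(identical? (:songs v1) (:songs v2))
;; => true

;; The old version is still intact and readable.
(get-in v1 [:meta :version])
;; => 1
(get-in v2 [:meta :version])
;; => 2
```

Only the path from the root to the changed node is rebuilt. That is why keeping every past version of the database stays affordable.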

The Clojure wrapper adds:

  • Idiomatic Clojure interfaces (IAtom, IDeref)
  • Automatic type conversion between Clojure and Java types
  • Thread-local read connections for scalability
  • Integration with Clojure's sequence abstractions

Supported Data Types

  • Maps - Hash maps with efficient key-value access
  • Vectors - Array lists with indexed access
  • Sets - Hash sets with unique element storage
  • Lists - Linked lists and RRB tree-based linked array lists
  • Primitives - Numbers, strings, keywords, booleans, dates.
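
As an illustration, the value below combines each of the listed types in one nested structure. It is plain Clojure data. The field names are made up, and the use of java.util.Date (via the #inst reader literal) for "dates" is an assumption, since the list above does not say which date classes are supported.

```clojure
(def sample-value
  {:profile {:name "Alice" :age 30 :active? true}   ; map with string, number, boolean
   :scores  [98 87.5 42]                            ; vector of numbers
   :roles   #{:admin :editor}                       ; set of keywords
   :recent  '("login" "edit")                       ; list of strings
   :joined  #inst "2024-01-15T00:00:00.000-00:00"}) ; date (java.util.Date)

;; With a database opened as in the Quick Start, this could be stored with:
;; (reset! db sample-value)
```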

Performance Characteristics

  • Read Operations: O(log16 n) for maps and vectors due to trie structure
  • Write Operations: O(log16 n) with structural sharing for efficiency
  • Memory Usage: Minimal overhead with automatic deduplication of identical subtrees
  • Concurrency: Thread-safe with optimized read-write locks
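
To give a feel for what O(log16 n) means in practice, the sketch below computes how many levels a trie needs to address n entries. It assumes a branching factor of 16, as the log16 bound suggests; the actual node layout of xitdb-java may differ, and trie-depth is a hypothetical helper.

```clojure
(defn trie-depth
  "Smallest depth d such that 16^d >= n, i.e. the number of levels
  a trie with 16-way branching needs to address n entries."
  [n]
  (loop [depth 0
         capacity 1]
    (if (>= capacity n)
      depth
      (recur (inc depth) (* capacity 16)))))

(trie-depth 16)         ;; => 1
(trie-depth 65536)      ;; => 4
(trie-depth 1000000)    ;; => 5
(trie-depth 1000000000) ;; => 8
```

Under this assumption, going from a million to a billion entries adds only three more levels to each lookup or update path.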

License

MIT