Package patricia-tree

This library contains a single module: PatriciaTree.

This is version 0.9.0 of the library. It is known to work with OCaml versions ranging from 4.14 to 5.2.

This is an OCaml library that implements sets and maps as Patricia Trees, as described in Okasaki and Gill's 1998 paper Fast mergeable integer maps. It is a space-efficient prefix trie over the big-endian representation of the key's integer identifier.

The source code of this library is available on Github under an LGPL-2.1 license.

This library was written by Matthieu Lemerre, with some contributions by Dorian Lesbre, as part of the https://codex.top/, developed at CEA List.

Installation

This library can be installed with opam:

opam install patricia-tree

Alternatively, you can clone the source repository and install with dune:

git clone git@github.com:codex-semantics-library/patricia-tree.git
opam install . --deps-only
cd patricia-tree
dune build
dune install
# To build documentation
opam install odoc
dune build @doc

Features

Quick overview

Functors

This library contains a single module, PatriciaTree. The functors used to build maps and sets are the following:

Interfaces

Here is a brief overview of the various module types of our library:

Examples

Homogeneous map

Here is a small example of a non-generic map:

(** Create a key struct *)
module Int (*: PatriciaTree.KEY*) = struct
  type t = int
  let to_int x = x
end

(** Call the map and/or set functors *)
module IMap = PatriciaTree.MakeMap(Int)
module ISet = PatriciaTree.MakeSet(Int)

(** Use all the usual map operations *)
let map =
  IMap.empty |>
  IMap.add 1 "hello" |>
  IMap.add 2 "world" |>
  IMap.add 3 "how do you do?"
  (* Also has an [of_list] and [of_seq] operation for initialization *)

let _ = IMap.find 1 map (* "hello" *)
let _ = IMap.cardinal map (* 3 *)

(** The strength of Patricia Tree is the speedup of operation on multiple maps
    with common subtrees. *)
let map2 =
  IMap.idempotent_inter_filter (fun _key _l _r -> None)
  (IMap.add 4 "something" map) (IMap.add 5 "something else" map)
let _ = map == map2 (* true *)
(* physical equality is preserved as much as possible, although some intersections
   may need to build new nodes and won't be fully physically equal, they will
   still share subtrees if possible. *)

(** Many operations preserve physical equality whenever possible *)
let _ = (IMap.add 1 "hello" map) == map (* true: already present *)

(** Example of cross map/set operation: only keep the bindings of [map]
    whose keys are in a given set *)
let set = ISet.of_list [1; 3]
module CrossOperations = IMap.WithForeign(ISet.BaseMap)
let restricted_map = CrossOperations.nonidempotent_inter
  { f = fun _key value () -> value } map set

Heterogeneous map

(** Very small typed expression language *)
type 'a expr =
  | G_Const_Int : int -> int expr
  | G_Const_Bool : bool -> bool expr
  | G_Addition : int expr * int expr -> int expr
  | G_Equal : 'a expr * 'a expr -> bool expr

module Expr : PatriciaTree.HETEROGENEOUS_KEY with type 'a t = 'a expr = struct
  type 'a t = 'a expr

  (** Injective, so long as expression are small enough
      (encodes the constructor discriminant in two lowest bits).
      Ideally, use a hash-consed type, to_int needs to be fast *)
  let rec to_int : type a. a expr -> int = function
    | G_Const_Int i ->   0 + 4*i
    | G_Const_Bool b ->  1 + 4*(if b then 1 else 0)
    | G_Addition(l,r) -> 2 + 4*(to_int l mod 10000 + 10000*(to_int r))
    | G_Equal(l,r) ->    3 + 4*(to_int l mod 10000 + 10000*(to_int r))

  (** Full polymorphic equality *)
  let rec polyeq : type a b. a expr -> b expr -> (a, b) PatriciaTree.cmp =
    fun l r -> match l, r with
    | G_Const_Int l, G_Const_Int r -> if l = r then Eq else Diff
    | G_Const_Bool l, G_Const_Bool r -> if l = r then Eq else Diff
    | G_Addition(ll, lr), G_Addition(rl, rr) -> (
        match polyeq ll rl with
        | Eq -> polyeq lr rr
        | Diff -> Diff)
    | G_Equal(ll, lr), G_Equal(rl, rr) ->    (
        match polyeq ll rl with
        | Eq -> (match polyeq lr rr with Eq -> Eq | Diff -> Diff) (* Match required by typechecker *)
        | Diff -> Diff)
    | _ -> Diff
end

(** Map from expression to their values: here the value only depends on the type
    of the key, not that of the map *)
module EMap = PatriciaTree.MakeHeterogeneousMap(Expr)(struct type ('a, _) t = 'a end)

(** You can use all the usual map operations *)
let map : unit EMap.t =
  EMap.empty |>
  EMap.add (G_Const_Bool false) false |>
  EMap.add (G_Const_Int 5) 5 |>
  EMap.add (G_Addition (G_Const_Int 3, G_Const_Int 6)) 9 |>
  EMap.add (G_Equal (G_Const_Bool false, G_Equal (G_Const_Int 5, G_Const_Int 7))) true

let _ = EMap.find (G_Const_Bool false) map (* false *)
let _ = EMap.cardinal map (* 4 *)

(** Fast operations on multiple maps with common subtrees. *)
let map2 =
  EMap.idempotent_inter_filter
    { f = fun _key _l _r -> None } (* polymorphic 1rst order functions are wrapped in records *)
    (EMap.add (G_Const_Int 0) 8 map)
    (EMap.add (G_Const_Int 0) 9 map)

Release status

This should be close to a stable release. It is already being used as part of a larger project successfully, and this usage as helped us mature the interface. As is, we believe the project is usable, and we don't anticipate any major change before 1.0.0. We didn't commit to a stable release straight away as we would like a bit more time using this library before doing so.

Known issues

There is a bug in the OCaml typechecker which prevents us from directly defining non-generic maps as instances of generic maps. To avoid this, non-generic maps use a separate value type (instead of just using 'b)

type (_, 'b) snd = Snd of 'b [@@unboxed]

It should not incur any extra performance cost as it is unboxed, but can appear when manipulating non-generic maps.

For more details about this issue, see the OCaml discourse discussion.

Comparison to other OCaml libraries

ptmap and ptset

There are other implementations of Patricia Tree in OCaml, namely ptmap and ptset, both by J.C. Filliatre. These are smaller and closer to OCaml's built-in Map and Set, however:

dmap

Additionally, there is a dependent map library: dmap. It allows creating type safe dependent maps similar to our heterogeneous maps. However, its maps aren't Patricia trees. They are binary trees build using a (polymorphic) comparison function, similarly to the maps of the standard library.

Another difference is that the type of values in the map is independent from the type of the keys, allowing keys to be associated with different values in different maps. i.e. we map 'a key to any ('a, 'b) value type, whereas dmap only maps 'a key to 'a or 'a value.

dmap also works with OCaml >= 4.12, whereas we require OCaml >= 4.14.

Contributions and bug reports

Any contributions are welcome!

You can report any bug, issues, or desired features using the Github issue tracker. Please include OCaml, dune, and library version information in you bug reports.

If you want to contribute code, feel free to fork the repository on Github and open a pull request. By doing so you agree to release your code under this project's license (LGPL-2.1).

There is no imposed coding style for this repository, here are just a few guidelines and conventions: