Module Mula.Match

Universal Levenshtein Automata

The paper by Touzet details the Universal Levenshtein Automata.

Some nice computational properties of the (not-deterministic) automata include:

These allow for efficient implementation together with interesting uses.

  1. We not only know if two strings are within a certain edit distance, but we also know what the edit distance is if they are within the edit distance limit.
  2. If several strings are compared, we can rank them by their edit distance.
  3. We can lazily feed characters and string fragments to the automata and get the current error count.

This module offers a functor to build matchers for different representations of strings and characters. For a prebuilt matcher for OCaml strings and characters, see Strings.

String Abstraction

We abstract over strings and characters so that we do not rely on any specific encoding. We only need the following:

module type S = sig ... end

Levenshtein Automata

Given a string representation, we produce two Automata:

module Make (St : S) : sig ... end