You are here

Finite State Machines

  • A general toolbox in MatLab and Octave containing the main algorithms for regular and subregular language classes (RPNI, k-testable). It also contains an implementation of the OSTIA algorithm (subsequential transducers), together with three of the main probabilistic finite state automata learning algorithms (EDSM, ALERGIA and MDI). This work developed by Hasan Ibne Akram and Huang Xiao can be downloaded here.
  • Treba is a basic command-line tool for training, decoding, and calculating with weighted (probabilistic) finite state automata (PFSA) and Hidden Markov Models (HMMs). It can be downloaded here.

    It contains the following implementations:

    For PFA/HMMs:

    • MCMC (also has GPU implementation)
    • Baum-Welch/EM (also has parallel implementation for multiple CPUs)
    • Variational Bayes

    For PFA only:

    • ALERGIA + variants
    • MDI

    This work was developed by Mans Hulden.

  • 3 methods of moments, including a spectral learning algorithm by Borja Balle, William L. Hamilton, and Joelle Pineau.
    The learning algorithms produce weighted finite automata that can be used to make predictions over string.
    Sources in github can be found here.
  • The PAutomaC competition dealt with learning probability distribution over strings. It provided baseline implementations: Baum-Welch in python and ALERGIA in OpenFST and Visual Studio. The sources can be download here.
  • Mans Hulden developed the algorithms used by the participants of the PAutomaC competition in order to be able to compare them. The result is incorporated to the Treba tool box and can be found here.

    It contains wrapper scripts about how to easily run the different techniques on PAutomaC data. This allows one to replicate the results of the top three competitors:

    • Gibbs sampler
    • EM/Baum-Welch
    • A state-merging algorithm
    • fsm provides an OCaML library which implements several functions for finite state machines. Functions implemented for finite state acceptors include minimization, determinization, concatenation, union, intersection, state-merging learning algorithms with various state-equivalence criteria, prefix and suffix tree construction, etc.

      The source can be downloaded here.

    • Statechum is a tool implementing EDSM/QSM inference techniques and a number of methods contributing to research work on inference methods. These include:
      • support for domain-specific heuristics (such as using LTL),
      • generation of random LTS,
      • graph differencing (construction of a 'diff' between two LTS),
      • computation of F-Measure and BCR,
      • integration with R tool for plotting graphs,
      • integration with Erlang for inference of models of Erlang modules.

      Its code can be downloaded here.

    • Though not dealing directly with grammar learning, Foma is a compiler, programming language, and C library for constructing finite-state automata and transducers for various uses. Its sources, developed mainly by Mans Hulden, can be download here.
    • iGREAT is an open-source, statistical machine translation software toolkit based on the application of grammatical inference algorithms for finite-state models. It can be downloaded here.