You are here

Active Learning

Active learning is a framework first introduced by Dana Angluin in the 80's. The learner interacts with an oracle via queries of different kind. For example, the learner can ask the oracle whether a given string belongs to the target language or not (this is called a membership query).

Active learning is extensively used and developed in the contexts of software engineering and protocol inference and various implementations are available:

  • LearnLib is a free, open-source Java library for active automata learning. It contains the initial Angluin's L* algorithm for DFA, together with more recent derived algorithms that learn more complex finite state models, like Mealy Machines.
  • The Libalf library is a comprehensive, open-source library for active and passive learning finite-state automata. It covers various well-known learning techniques (such as Angluin's L*, Biermann's learning approach, and RPNI), as well as novel learning algorithms (e.g. for NFA and visibly one-counter automata).
  • AIDE (automata-identification engine) is a free open source tool for automata inference algorithms developed in C# .Net. This implementation provides active automata-learning via two kinds of queries: membership queries and equivalence queries. The AIDE engine is using the Automata Tool and Testing Tool which both are developed in the context of AIDE project.
  • Tomte is a tool that fully automatically constructs abstractions for active state machine learning. It is not a learning tool/implementation per se but is an automatic alphabet abstraction tool which helps to bridge the gap between “black-box concrete world” (SUH) and “abstract automata world”.
    Usually, a component implementing the learning algorithm is directly connected to the system under learning (or SUL, the oracle). By observing how the SUL responds to queries sent by the learner, a model of the behaviour of the SUL can be constructed. This is not enough for learning models of realistic software components which, due to the presence of program variables and data parameters in messages, typically have much larger state spaces. Tomte is placed in between the SUL and the learner, and transforms the large set of actions of the SUL into a small set of abstract actions that can be handled by the learner. This means the inference is performed on a small alphabet and once the learning is done the abstract model is combined with the abstraction to construct a concrete version of the SUL.