Opened 5 years ago

Closed 5 years ago

Last modified 2 years ago

#15078 closed enhancement (fixed)

new module: finite state machines, automata, transducers

Reported by: dkrenn Owned by:
Priority: major Milestone: sage-5.13
Component: combinatorics Keywords: finite state machines, automaton, transducer
Cc: eviatarbach, tmonteil, cheuberg Merged in: sage-5.13.beta5
Authors: Clemens Heuberger, Daniel Krenn, Sara Kropf Reviewers: Volker Braun, Frédéric Chapoton, Vincent Delecroix, Darij Grinberg, Sébastien Labbé
Report Upstream: N/A Work issues:
Branch: Commit:
Dependencies: Stopgaps:

Description (last modified by jdemeyer)

This module adds a class for

  • finite state machine,
  • finite automaton,
  • finite transducer.

Apply only trac_15078_fsm_automata_transducers.8.patch

Change History (50)

Changed 5 years ago by dkrenn

comment:1 Changed 5 years ago by dkrenn

  • Description modified (diff)
  • Status changed from new to needs_review

comment:2 Changed 5 years ago by dkrenn

The patch does not change anything (except one index.rst and one all.py) in the existing Sage library; please review.

comment:3 Changed 5 years ago by dkrenn

  • Authors set to Clemens Heuberger, Daniel Krenn, Sara Kropf

Changed 5 years ago by dkrenn

comment:4 Changed 5 years ago by dkrenn

  • Description modified (diff)

Changed 5 years ago by dkrenn

comment:5 Changed 5 years ago by dkrenn

  • Description modified (diff)

Another change in the docstrings, all changes are now in trac_15078_fsm_automata_transducers.2.patch

comment:6 Changed 5 years ago by dkrenn

  • Description modified (diff)

comment:7 Changed 5 years ago by chapoton

Here is the report of pyflakes

sage/combinat/finite_state_machine.py:293: 'SR' imported but unused sage/combinat/finite_state_machine.py:1074: local variable 'from_state' is assigned to but never used sage/combinat/finite_state_machine.py:1077: local variable 'to_state' is assigned to but never used sage/combinat/finite_state_machine.py:1160: redefinition of unused 'itertools' from line 304 sage/combinat/finite_state_machine.py:2213: local variable 'to_state' is assigned to but never used sage/combinat/finite_state_machine.py:3372: local variable 'new_transition' is assigned to but never used

This needs to be cleaned

comment:8 Changed 5 years ago by vdelecroix

Hi all,

Right now I just have a quick look but

  • the lot of examples at the begining are very nice!
  • conditional_iterator is ifilter in the module itertools
  • you should use Word and Words (in sage.combinat.words)

Vincent

Last edited 5 years ago by vdelecroix (previous) (diff)

comment:9 Changed 5 years ago by vbraun

INPUT/OUTPUT needs to be documented in every method (see developer manual)

Are you sure you need FSMProcessIterator in the global namespace? Similarly, instead of FSMstate and FSMtransition, which are useless by themselves, just do

    sage: fsm = FiniteStateMachine() 
    sage: day = fsm.add_state('day')
    sage: night = fsm.add_state('night')
    sage: sunrise = fsm.add_transition(night, day)

in addition to

    sage: sunrise = fsm.add_transition('night', 'day')

Also, instead of reinventing the mutability wheel there is sage.structure.mutability.Mutability to inherit from.

comment:10 Changed 5 years ago by slabbe

Nice! I was needing such a class for a while! I don't know if you should really use sage.combinat.words stuff in your code, maybe in a method def language(self, n) of FiniteStateMachine that would return an iterator over all words of length n recognized by the automaton. Anyhow, sage.combinat.words will definitively benefits from it, for instance for representing automatic sequences.

I agree with both comments of vbraun above. The code is well written. Only small fixes must be done like adding INPUT and OUPUT blocks and follow Python convention of having one space before and after = character (except for arguments of a function). See PEP 8 at http://www.python.org/dev/peps/pep-0008/

comment:11 Changed 5 years ago by eviatarbach

  • Cc eviatarbach added

comment:12 Changed 5 years ago by chapoton

for the bot : apply only trac_15078_fsm_automata_transducers.2.patch

comment:13 Changed 5 years ago by slabbe

I did some small tests yesterday. Below are few of them.

In the following example, the output of process should be True, but it returns False. Should the process method raise an NotImplementedError if the automaton is not deterministic?::

    sage: D = {'A': [('A', 'a'), ('B', 'a'), ('A', 'b')], 'C': [], 'B': [('C', 'b')]}
    sage: auto = Automaton(D, initial_states=['A'], final_states=['C'])
    sage: auto.is_deterministic()
    False
    sage: auto.process(list('aaab'))
    (False, State 'A', [])

The determinisation is broken for this automaton. Why?::

    sage: auto.states()
    [State 'A', State 'B', State 'C']
    sage: auto.determinisation()
    ...
    LookupError: No state with label State 'A' found.

Apparently, label needs to be integers for determinisation to work? ::

    sage: L = [('A', 'A', 0), ('A', 'B', 0), ('A','A',1), ('B','C', 1)]
    sage: auto = Automaton(L, initial_states=['A'], final_states=['C'])
    sage: auto.process([0, 0, 0, 1])
    (False, State 'A', [])
    sage: auto.determinisation()
    finite state machine with 3 states
    sage: auto.determinisation().states()
    [State frozenset(['A']), State frozenset(['A', 'B']), State frozenset(['A', 'C'])]

Cheers!

Last edited 5 years ago by slabbe (previous) (diff)

Changed 5 years ago by dkrenn

comment:14 Changed 5 years ago by dkrenn

  • Description modified (diff)

comment:15 Changed 5 years ago by dkrenn

I've uploaded a new patch: http://trac.sagemath.org/attachment/ticket/15078/trac_15078_fsm_automata_transducers.3.patch It is already rebased to 5.12

Replying to chapoton:

Here is the report of pyflakes [...] This needs to be cleaned

Done.

Replying to vdelecroix:

Changed.

  • you should use Word and Words (in sage.combinat.words)

See #15267 together with comments below.

Replying to vbraun:

INPUT/OUTPUT needs to be documented in every method (see developer manual)

Done.

Are you sure you need FSMProcessIterator in the global namespace? Similarly, instead of FSMstate and FSMtransition [...]

Those things are now local. Only FiniteStateMachine, Automaton and Transducer are now global.

Also, instead of reinventing the mutability wheel there is sage.structure.mutability.Mutability to inherit from.

This is now #15266 (which can be solved after #15264). See also the discussion "Mutability" on sage-devel ​https://groups.google.com/forum/#!topic/sage-devel/dnXSgh56Boo

Replying to slabbe:

I don't know if you should really use sage.combinat.words stuff in your code, maybe in a method def language(self, n) of FiniteStateMachine that would return an iterator over all words of length n recognized by the automaton. Anyhow, sage.combinat.words will definitively benefits from it, for instance for representing automatic sequences.

This is now #15267.

I agree with both comments of vbraun above. The code is well written. Only small fixes must be done like adding INPUT and OUPUT blocks and follow Python convention of having one space before and after = character (except for arguments of a function). See PEP 8 at http://www.python.org/dev/peps/pep-0008/

I thought I followed those rules. Anyhow, I found some misplaced spaces and non-spaces. Now changed (and hopefully nothing missed).

Replying to slabbe:

In the following example, the output of process should be True, but it returns False. Should the process method raise an NotImplementedError if the automaton is not deterministic?::

    sage: D = {'A': [('A', 'a'), ('B', 'a'), ('A', 'b')], 'C': [], 'B': [('C', 'b')]}
    sage: auto = Automaton(D, initial_states=['A'], final_states=['C'])
    sage: auto.is_deterministic()
    False
    sage: auto.process(list('aaab'))
    (False, State 'A', [])

No, I would not raise a NotImplementedError, since it isn't one, but just one possible outcome is given. But I agree that it could be a problem. Therefore I added a comment in the documentation of process.

The determinisation is broken for this automaton. Why?::

    sage: auto.states()
    [State 'A', State 'B', State 'C']
    sage: auto.determinisation()
    ...
    LookupError: No state with label State 'A' found.

I cannot reproduce this behaviour. Anyhow, I added the example above as a doctest.

Last edited 5 years ago by dkrenn (previous) (diff)

Changed 5 years ago by dkrenn

comment:16 Changed 5 years ago by dkrenn

  • Description modified (diff)

comment:17 Changed 5 years ago by dkrenn

There is a new version available: http://trac.sagemath.org/attachment/ticket/15078/trac_15078_fsm_automata_transducers.4.patch We fixed some small bugs, made the output of some functions more consistent, and improved the documentation.

We'd be happy if someone could review the patch.

Changed 5 years ago by dkrenn

comment:18 Changed 5 years ago by dkrenn

  • Description modified (diff)

comment:19 Changed 5 years ago by slabbe

I had a question concerning the choices you made for the classes. Why did you choose to have only one class FiniteStateMachine with the mode argument?

def Automaton(*args, **kwargs): 
    return FiniteStateMachine(mode='automaton', *args, **kwargs) 

def Transducer(*args, **kwargs): 
    return FiniteStateMachine(mode='transducer', *args, **kwargs)

The role of this mode argument is usually done transparently by the classes. Was it better than having three classes class FiniteStateMachine(SageObject), class Automaton(FiniteStateMachine) and class Transducer(FiniteStateMachine) ? Another option closer to what you have done could be two classes : class Transducer(SageObject) and class Automaton(Transducer) ? All this depends on the way the methods are implemented and how much they depends on the class instance. What would be best do you think?

There is a Sage Afternoon today in Paris. I hope to have time to take a look more deeply at the most recent version of the patch.

comment:20 Changed 5 years ago by slabbe

I did a more careful reading of the patch today. All tests passed (5.59s for 552 tests which is quite efficient). Coverage is 100% (115 of 115). Documentation builds fine without warnings.

I am ready to give a positive review to the patch provided the following three more things are fixed.

In the file doc/en/reference/combinat/index.rst, the finite state module is at the very end of the list after Miscellaneous and Combinatorial maps. I suggest that you put it before the Words section instead.

2.

The documentation for the data input of FiniteStateMachine is not properly documented. It is not usual to see such pseudo code examples and they are not easy to read, thus not very usefull (personnaly, my eyes do not want to look at them).

    - ``data`` -- can be any of the following:

      - ``{A:{B:{word_in=0, word_out=1}, C:{word_in=1, word_out=1}, ...}``
      - ``{A:{B:(0, 1), C:(1, 1), ...}``
      - ``{A:{B:FSMTransition(A, B, 0, 1), C:FSMTransition(A, C, 1, 1), ...}``
      - ``{A:[(B, 0, 1), (C, 1, 1)], ...}``
      - ``{A:[FSMTransition(A, B, 0, 1), FSMTransition(A, C, 1, 1)], ...}``
      - ``[{from_state:A, to_state:B, word_in:0, word_out:1}, \
            from_state:A, to_state:C, word_in:1, word_out:1}, ...]``
      - ``[(A, B, 0, 1), (A, C, 1, 1), ...]``
      - ``[FSMTransition(A, B, 0, 1), FSMTransition(A, C, 1, 1), ...]``

I suggest that you follow the documentation of Graph which is now quite good. See:

sage: Graph?

You will see that the possible cases of data for Graph are listed verbosely with english words with empty line in between them. The list is numeroted with numbers. And the same numbers are used below in the examples section where many examples are provided for *each* case.

3.

Make three classes for FiniteStateMachine, Automaton and Transducer.

Indeed, FiniteStateMachine has 77 methods. Of those only 7 of them depends on the mode attribute. I have copied below the relevant part of the code where the mode attribute is used in those six methods:

class FiniteStateMachine(SageObject):
    def __init__(self, ..., mode=None, ...):
        ...
        self.mode = mode
        ...
    def empty_copy(self, memo=None):
        ...
        new.mode = deepcopy(self.mode, memo)
        ...
    def _latex_(self):
        ...
                    if self.mode == 'automaton':
                        labels.append(format_transition_label(
                                transition.word_in))
                    else:
                        labels.append(format_transition_label(
                            transition.word_in) + "\\mid" + \
                                format_transition_label(transition.word_out))
        ...
    def projection(self, what='input'):
        new = self.empty_copy()
        new.mode='automaton'
        ...
    def determinisation(self):
        assert self.mode == 'automaton'
        ...
    def minimization(self, algorithm=None):
        if self.mode is None:
            raise NotImplementedError, "The mode attribute must be set."
        if self.mode == 'transducer':
            raise NotImplementedError, "Minimization for Transducer is not implemented. Try the simplification method."
        if not self.mode == 'automaton':
            raise NotImplementedError
        ...
    def simplification(self):
        if self.mode != 'transducer':
            raise NotImplementedError, "Simplification is only implemented for Transducers. For Automata, use minimization instead"
        ...

I would leave the 70 methods not mentioned above in the class FiniteStateMachine. Then, according to how each of the 7 methods are used, I would do the following for each of them:

  • __init__: This method can be kept as it is in the FiniteStateMachine class and I believe the line self.mode = mode can just be deleted.
  • empty_copy: This method can be kept in the FiniteStateMachine and I believe the line new.mode = 'automaton' can just be deleted.
  • _latex_: This method can be kept in the FiniteStateMachine and I would move the code depending on the mode inside a method in the classes Automaton and Transducer. This method could be called _transition_label or something like that would return the label of a transition in the according way. Maybe this method will just ask the transition to do it.
  • projection: This method can be kept in the FiniteStateMachine and could return an instance of an Automaton.
  • determinisation and minimization : I would put these methods in the class Automaton
  • simplification : I would put this method in the class Transducer

Finally, I would replace the following functions::

def Transducer(*args, **kwargs):
    ...
def Automaton(*args, **kwargs):
    ...

by classes with the same docstrings in the following way (the constructor is stil the same actually defined for FiniteStateMachine)::

class Transducer(FiniteStateMachine):
    r"""
    same doctrings here as for def Transducer
    """
    def _transition_label(self, some_args):
        r"""
        Return the proper transition label.

        Method used by ``_latex_`` method
        """
        ...
    def simplification(self):
        ...

class Automaton(FiniteStateMachine):
    r"""
    same doctrings here as for def Automaton
    """
    def _transition_label(self, some_args):
        r"""
        Return the proper transition label.

        Method used by ``_latex_`` method
        """
        ...
    def determinisation(self):
        ...
    def minimization(self):
        ...

Cheers!

Sébastien

Changed 5 years ago by dkrenn

comment:21 Changed 5 years ago by dkrenn

We have worked in all the comments from above. In particular, Automaton and Transducer now inherite from FiniteStateMachine.

Last edited 5 years ago by dkrenn (previous) (diff)

comment:22 Changed 5 years ago by dkrenn

  • Description modified (diff)

comment:23 Changed 5 years ago by darij

This needs rebasing on beta3:

darij@travis-virtualbox:~/sage-5.13.beta3/devel/sage-main$ hg qimport ~/patches/trac_15078_fsm_automata_transducers.6.patch 
adding trac_15078_fsm_automata_transducers.6.patch to series file
darij@travis-virtualbox:~/sage-5.13.beta3/devel/sage-main$ hg qpushapplying trac_15078_fsm_automata_transducers.6.patch
patching file doc/en/reference/combinat/index.rst
Hunk #1 FAILED at 78
1 out of 1 hunks FAILED -- saving rejects to file doc/en/reference/combinat/index.rst.rej
patch failed, unable to continue (try -v)
patch failed, rejects left in working dir
errors during apply, please fix and refresh trac_15078_fsm_automata_transducers.6.patch

Changed 5 years ago by dkrenn

comment:25 follow-up: Changed 5 years ago by slabbe

Hi there, I just tested the most recent patch. All test pass. Documentation builds fine again. Correction were made after my previous comments : documentation of the input FiniteStateMachine is now great, new classes for Automaton and Transducer were done.

Related to the creation of the three classes, I think still another thing must be done : document the differences between them. The user of these classes will know that the three classes exist (because the documentation of one class refers to the other one). The user understand the inputs are the same, but then will ask why should he use one class instead of the other one?

   OUTPUT:

   A finite state machine.

   The object creation of "Automaton" and "Transducer" is the same as
   the one described here (i.e. just replace the word
   "FiniteStateMachine" by "Automaton" or "Transducer").

I am suggesting some fixes below that should help to answer this global question.

  1. There should be a new _repr_ method for Automaton and Transducer (I am sorry I forgot to mention this repr method in my previous comment):
sage: FiniteStateMachine()
Finite state machine with 0 states
sage: Automaton()
Automaton with 0 states
sage: Transducer()
Transducer with 0 states

instead of

sage: FiniteStateMachine()
finite state machine with 0 states
sage: Automaton()
finite state machine with 0 states
sage: Transducer()
finite state machine with 0 states
  1. The documentation of FiniteStateMachine should include a two line paragraph (maybe before the INPUT block) saying why the user wants to use this class and why the user should prefer to use another class like Automaton or Transducer. For instance : "For determinisation and minimisation, use Automaton class".
  1. Documentation of Automaton (and Transducer):
class Automaton(FiniteStateMachine):
    """
    This creates an automaton, which is a special type of a finite
    state machine.

    See class :class:`FiniteStateMachine` for more information.

    TESTS::

        sage: Automaton()
        finite state machine with 0 states
    """

Why is it special? Mention methods that are defined for Automaton and not for FiniteStateMachine. Same comments for Transducer.

The first example of the documentation of FiniteStateMachine uses word_out which is not the best first example for the user using Automaton. So, I suggest to add an EXAMPLES:: section in the doc of Automaton class containing the creation of at least one non empty Automaton.

class Automaton(FiniteStateMachine):
    r"""
    ...

    The inputs are the same as for :class:`FiniteStateMachine`.
    See class :class:`FiniteStateMachine` for more information.

    EXAMPLES:

        ...

    TESTS::

        sage: Automaton()
        finite state machine with 0 states
  1. A typo (finial) :
* "initial_states" and "final_states" -- the initial and finial

comment:26 in reply to: ↑ 25 Changed 5 years ago by dkrenn

Many thanks for your comments.

Replying to slabbe:

  1. There should be a new _repr_ method for Automaton and Transducer (I am sorry I forgot to mention this repr method in my previous comment):

Done.

  1. The documentation of FiniteStateMachine should include a two line paragraph (maybe before the INPUT block) saying why the user wants to use this class and why the user should prefer to use another class like Automaton or Transducer. For instance : "For determinisation and minimisation, use Automaton class".

Done (Added after the output block, where Automata and Transducer are mentioned; I think this is a good place for it. Also added links to minimization, simplification,... there)

  1. Documentation of Automaton (and Transducer):
class Automaton(FiniteStateMachine):
    """
    This creates an automaton, which is a special type of a finite
    state machine.

Why is it special? Mention methods that are defined for Automaton and not for FiniteStateMachine. Same comments for Transducer.

The first example of the documentation of FiniteStateMachine uses word_out which is not the best first example for the user using Automaton. So, I suggest to add an EXAMPLES:: section in the doc of Automaton class containing the creation of at least one non empty Automaton.

Rewritten and examples added.

  1. A typo (finial) :

Corrected.

These changes can be found in ​​http://trac.sagemath.org/attachment/ticket/15078/trac_15078_fsm_automata_transducers.8.patch (which runs in 5.13.beta3).

Changed 5 years ago by dkrenn

comment:27 Changed 5 years ago by dkrenn

  • Description modified (diff)

comment:28 Changed 5 years ago by slabbe

  • Reviewers set to Volker Braun, Frédéric Chapoton, Vincent Delecroix, Darij Grinberg, Sébastien Labbé

Great! Thanks for the changes and your good work answering all of the reviewers comments. I am ready to give a positive review.

Since much time passed since the start of the review, I believe everybody had a chance to give their comments. Hence, I change the status of this ticket to positive review.

Sébastien

comment:29 Changed 5 years ago by slabbe

  • Status changed from needs_review to positive_review

comment:30 Changed 5 years ago by jdemeyer

  • Description modified (diff)

comment:31 Changed 5 years ago by jdemeyer

  • Merged in set to sage-5.13.beta5
  • Resolution set to fixed
  • Status changed from positive_review to closed

comment:32 follow-ups: Changed 5 years ago by slabbe

Hi,

I would like to share some comments on the code on finite state machine that was merged into Sage.

New modules sometimes takes forever (see for example #10519, #12996) to go into Sage because reviewers are never happy and authors gets tired. So, for the actual ticket #15078, I wanted things to go differently. As a reviewer, I listed a bunch of improvements. Authors made the improvements. I gave positive review. And now it is in Sage. Good!

But finally, after using the code a little bit since its inclusion into Sage, I realized that I should have asked for more improvements... Anyway, I prefer an active developpment rather than dying code on trac. So I don't regret, considering the above examples, if I gave a too quick positive review.

So, here are my comments since the inclusion into Sage:

  • The basics operations (union, concatenation, intersection, complement, Kleene star) on automaton are not implemented. This might be bad because maybe the chosen representation is not suitable to answer to those basic operations in an efficient way. Or maybe the representation is suitable... Nobody knows.
  • I think for efficiency reasons there should not have a class for states and a class for transitions.
  • I think the class FSMProcessIterator could be changed into a method of the class automaton or finite state machine using yield statement.
  • An easy one: for now __mul__ is chosen for intersection. It should be __and__. Also __add__ is chosen for union. It should be __or__.
  • Documentation of top level module should not mention FSMState. Also, it should contain good examples of automaton and of transducers.

Unfortunately, I do not have time to work on these things myself. So, I let others on the short term create ticket if they agree with one of the above.

Cheers,

Sébastien

Last edited 5 years ago by slabbe (previous) (diff)

comment:33 Changed 5 years ago by darij

Please open up a new ticket -- not many people have closed tickets on their radar.

comment:34 Changed 5 years ago by tmonteil

  • Cc tmonteil added

comment:35 Changed 5 years ago by cheuberg

  • Cc cheuberg added

comment:36 in reply to: ↑ 32 Changed 5 years ago by skropf

Replying to slabbe:

  • An easy one: for now __mul__ is chosen for intersection. It should be __and__. Also __add__ is chosen for union. It should be __or__.

In #16016 these names are changed.

comment:37 in reply to: ↑ 32 Changed 5 years ago by cheuberg

Replying to slabbe:

  • The basics operations (union, concatenation, intersection, complement, Kleene star) on automaton are not implemented. This might be bad because maybe the chosen representation is not suitable to answer to those basic operations in an efficient way. Or maybe the representation is suitable... Nobody knows.

intersection now is in #16061

  • Documentation of top level module should not mention FSMState. Also, it should contain good examples of automaton and of transducers.

One more example has been added in #16143: Standard Binary -> Gray Code.

comment:38 in reply to: ↑ 32 Changed 3 years ago by cheuberg

Replying to slabbe:

  • The basics operations (union, concatenation, intersection, complement, Kleene star) on automaton are not implemented.

union is now #18557.

comment:39 in reply to: ↑ 32 Changed 3 years ago by cheuberg

Replying to slabbe:

  • The basics operations (union, concatenation, intersection, complement, Kleene star) on automaton are not implemented.

concatenation: #18965 complement: #18966 Kleene star: #18964

comment:40 follow-up: Changed 2 years ago by jdemeyer

What is the rationale for this:

    def __iadd__(self, other):
        raise NotImplementedError

Do you intend to implement this in the future?

comment:41 in reply to: ↑ 40 Changed 2 years ago by dkrenn

Replying to jdemeyer:

What is the rationale for this:

    def __iadd__(self, other):
        raise NotImplementedError

Do you intend to implement this in the future?

No, I don't think so.

Note: See TracTickets for help on using tickets.