Opened 6 years ago

Closed 22 months ago

#18267 closed enhancement (fixed)

libgap for PermutationGroup

Reported by: vdelecroix Owned by: embray
Priority: major Milestone: sage-8.9
Component: interfaces Keywords: fpsac2019
Cc: nthiery, tscrim Merged in:
Authors: Erik Bray Reviewers: Frédéric Chapoton, Vincent Delecroix, Travis Scrimshaw
Report Upstream: N/A Work issues:
Branch: 65f8a84 (Commits, GitHub, GitLab) Commit: 65f8a84b3965bf80fc3b7fb6018a1be753f3eab2
Dependencies: #27946, #28354 Stopgaps:

Status badges

Description

As remarked in this question on ask.sagemath.org a lot of time is spent with pexpect when playing with permutation groups.

In this ticket:

  • we interface libgap permutation group so that the following works
    sage: p1=libgap.eval("(1,2,3)")
    sage: p2=libgap.eval("(3,4,5)")
    sage: p1.sage()
    (1,2,3)
    sage: G=libgap.Group([p1,p2])
    sage: G.sage()
    Traceback (most recent call last):
    ...
    NotImplementedError: cannot construct equivalent Sage object
    
  • we check that interesting GAP functions such as ClosureGroup, AllBlocks, are available from libgap
  • we use them in PermutationGroup inplace of the current gap version

Change History (40)

comment:1 Changed 2 years ago by embray

  • Milestone changed from sage-6.7 to sage-wishlist

comment:2 Changed 2 years ago by embray

The overhead of using pexpect for GAP is especially serious on Cygwin. I believe it to be one of the worst performance hits there, due to the I/O overhead.

It's so slow in fact that I wonder if there isn't a bug/defect in Cygwin related to this. It's normal that there is some additional overhead involved for I/O on Cygwin, but this is especially bad.

comment:3 Changed 2 years ago by embray

  • Owner changed from (none) to embray

I have a prototype for this in progress that is shaping up pretty well so far. I will likely need help with this part:

we check that interesting GAP functions such as ClosureGroup, AllBlocks, are available from libgap

as I don't know all the math well enough to write interesting tests for this. But I can get most of the skunkworks done and then anyone else who's interested can add on to it.

Last edited 2 years ago by embray (previous) (diff)

comment:4 Changed 2 years ago by embray

The I/O overhead on Cygwin adds up to something on the order of 20x in this test (specially, the _test_cellular test). I tracked this down to being entirely coming from the pexpect communication. On Linux:

sage -t --long src/sage/combinat/symmetric_group_algebra.py
    [422 tests, 30.17 s]

On Cygwin (different machine, but with reasonably good specs):

sage -t --long src/sage/combinat/symmetric_group_algebra.py
    [422 tests, 613.16 s]

More than 10 minutes on this one test! It's by far one of the slowest tests on Cygwin. With my prototype of getting pexpect out of permutation groups I knocked this result down to (again on Cygwin):

sage -t --long src/sage/combinat/symmetric_group_algebra.py
    [422 tests, 25.49 s]

comment:5 Changed 2 years ago by embray

  • Milestone changed from sage-wishlist to sage-pending

Unfortunately src/sage/groups/perm_gps/permgroup_named.py still remains by far one of the slowest tests on Cygwin despite my fixes. Unclear as of yet why; maybe several tests using other code that still uses pexpect, or something else...

comment:6 Changed 2 years ago by embray

Well I found one more pexpect use in permgroup_named; fixing that one knocked it down from ~5 minutes to

sage -t --long src/sage/groups/perm_gps/permgroup_named.py
    [509 tests, 47.12 s]

Which is still about 5 times slower for some reason than that test is on Linux, but progress!

Some of the tests also intentionally use the pexpect interface directly, so nothing to do about that for now.

comment:7 Changed 2 years ago by embray

  • Authors set to Erik Bray
  • Branch set to u/embray/prototype/perm-gps-no-pexpect-gap
  • Cc nthiery added
  • Commit set to 75d6b764cac83c2930dff648605de98adbc506db
  • Dependencies set to #27946
  • Status changed from new to needs_review

Here is my current prototype for getting pexpect out of permutation groups, and having them use libgap exclusively. For the large part it is quite effective at this. This fixes (among many other examples):

sage: G = SymmetricGroup(3)
sage: %prun G.one()
sage: G.one()
()
sage: gap._expect is None

where previously this was invoking a GAP process through pexpect.


Last 10 new commits:

618a0d4Fix several predicate methods of permutation groups to use libgap.
7a0dd0eUse libgap for permutation group comparison.
cc5381dIt seems these tests broke again.
d4e81e8Use libgap in several more methods where it was trivial to make the change, and where it at least didn't break any local tests.
2e60ebeThese tests were marked 'random output' despite the explicit calls to set_random_seed(0), which should have obviated this.
3c8954aUse libgap for direct_product
220c5e7Use libgap for character_table
5ab41fbSome fixes to make ClassFunction work properly with libgap.
db4eedaUse libgap for PermutationGroup_generic._regular_subgroup_gap
75d6b76Use libgap for PermutationGroup_generic.molien_series

comment:8 Changed 2 years ago by embray

  • Status changed from needs_review to needs_work

Unfortunately there is still a handful of failing tests with this.

comment:9 Changed 2 years ago by embray

I think most of the problems are coming from my hacky attempt to make some GapElement objects pickle-able in fcfde85 which can make some surprising results, and is maybe not quite right yet. It should probably be handled in a separate ticket.

comment:10 Changed 2 years ago by git

  • Commit changed from 75d6b764cac83c2930dff648605de98adbc506db to c02c2f11a2c2228851419eb1aadd32fde75a2ad4

Branch pushed to git repo; I updated commit sha1. This was a forced push. Last 10 new commits:

075a48aIt seems these tests broke again.
dcc0b11Use libgap in several more methods where it was trivial to make the change, and where it at least didn't break any local tests.
96c55c3These tests were marked 'random output' despite the explicit calls to set_random_seed(0), which should have obviated this.
13b0e3cUse libgap for direct_product
6928f69Use libgap for character_table
77513feSome fixes to make ClassFunction work properly with libgap.
edf2a5dUse libgap for PermutationGroup_generic._regular_subgroup_gap
6fbdbd1Use libgap for PermutationGroup_generic.molien_series
42318a9Fix minor test failure: This will now fail at trying to find a _libgap_init_ rather than a _gap_init_
c02c2f1If the values passed to the ClassFunction constructor are already a libgap

comment:11 Changed 2 years ago by embray

Fixed the problems that I knew about, though there are likely still others. Although this is set to "needs_work" it would probably be interesting-enough to start playing around with if anyone wants to review.

comment:12 Changed 2 years ago by embray

With the current version of this branch almost everything works except for a few trivial doctest failures due to new pseudo-random results in some tests.

However, I am also getting a bizarre memory error when running the tests for sage.combinat.tableau that's similar to what I reported in #27681 (but later couldn't reproduce).

comment:13 Changed 2 years ago by tscrim

  • Cc tscrim added

+1 for doing this.

comment:14 Changed 2 years ago by embray

If anyone would like to help test this that would be great for moving forward on this work. In particular, it would be helpful to know if anyone can reproduce the test failure in sage.combinat.tableau, since from my end that is the only major blocker to completing this work.

comment:15 Changed 2 years ago by git

  • Commit changed from c02c2f11a2c2228851419eb1aadd32fde75a2ad4 to cc6491009070408c9ed5dde6317ee0ac229458ca

Branch pushed to git repo; I updated commit sha1. New commits:

cc64910Fixed slightly different output on a few tests due to PRNG differences related

comment:16 Changed 2 years ago by embray

  • Status changed from needs_work to needs_review

All tests now pass for me on this ticket (have not tried optional packages).

The one bug I mentioned in comment:14 is now tracked at #28106 and is not directly caused by the changes I made here; rather it only exhibited due to the fact that using libgap in more places meant slightly more virtual memory was being allocated in some tests than previously, but this is expected.

So if you run into the bug with MemoryErrors in sage.combinat.tableau you can pass the test runner --memlimit=4000 or something like that.

comment:17 Changed 2 years ago by chapoton

I have tested a bit with python3. All doctests pass in tableau-related files and in the modified files.

comment:18 Changed 2 years ago by vdelecroix

Tests pass with surface_dynamics that uses PermutationGroup. In the same library, I also replaced many calls to gap by libgap and everything went smoothly.

Thanks for working on this!

comment:19 Changed 2 years ago by tscrim

  • Milestone changed from sage-pending to sage-8.9
  • Reviewers set to Frédéric Chapoton, Vincent Delecroix, Travis Scrimshaw
  • Status changed from needs_review to positive_review

Everything also LGTM.

comment:20 Changed 2 years ago by tscrim

  • Keywords fpsac2019 added

comment:21 Changed 2 years ago by vbraun

  • Dependencies changed from #27946 to #27946, #28106

comment:22 Changed 23 months ago by chapoton

  • Milestone changed from sage-8.9 to sage-pending

comment:23 Changed 23 months ago by embray

Any reason this is sage-pending?

comment:24 Changed 23 months ago by chapoton

dependency needing work

comment:25 Changed 23 months ago by embray

  • Dependencies changed from #27946, #28106 to #27946
  • Milestone changed from sage-pending to sage-8.9

I think I'll just remove the dependency. It's not actually a dependency for this ticket, especially since my solution proposed there still doesn't seem to work. The relation between that issue and this ticket is much more indirect than I previously thought.

comment:26 Changed 23 months ago by vbraun

  • Status changed from positive_review to needs_work

This breaks the gap interfaces when building with SAGE_DEBUG=yes:

**********************************************************************
File "src/sage/interfaces/gap.py", line 732, in sage.interfaces.gap.Gap_generic._eval_line
Failed example:
    a = gap(3)
Exception raised:
    Traceback (most recent call last):
      File "/home/buildbot-sage/slave/sage_git/build/local/lib/python2.7/site-packages/sage/doctest/forker.py", line 681, in _run
        self.compile_and_execute(example, compiler, test.globs)
      File "/home/buildbot-sage/slave/sage_git/build/local/lib/python2.7/site-packages/sage/doctest/forker.py", line 1105, in compile_and_execute
        exec(compiled, globs)
      File "<doctest sage.interfaces.gap.Gap_generic._eval_line[8]>", line 1, in <module>
        a = gap(Integer(3))
      File "/home/buildbot-sage/slave/sage_git/build/local/lib/python2.7/site-packages/sage/interfaces/interface.py", line 295, in __call__
        result = self._coerce_from_special_method(x)
      File "/home/buildbot-sage/slave/sage_git/build/local/lib/python2.7/site-packages/sage/interfaces/interface.py", line 323, in _coerce_from_special_method
        return (x.__getattribute__(s))(self)
      File "sage/structure/sage_object.pyx", line 693, in sage.structure.sage_object.SageObject._gap_ (build/cythonized/sage/structure/sage_object.c:5929)
        return self._interface_(G)
      File "sage/structure/sage_object.pyx", line 669, in sage.structure.sage_object.SageObject._interface_ (build/cythonized/sage/structure/sage_object.c:5475)
        X = I(s)
      File "/home/buildbot-sage/slave/sage_git/build/local/lib/python2.7/site-packages/sage/interfaces/interface.py", line 288, in __call__
        return cls(self, x, name=name)
      File "/home/buildbot-sage/slave/sage_git/build/local/lib/python2.7/site-packages/sage/interfaces/expect.py", line 1438, in __init__
        self._name = parent._create(value, name=name)
      File "/home/buildbot-sage/slave/sage_git/build/local/lib/python2.7/site-packages/sage/interfaces/interface.py", line 491, in _create
        self.set(name, value)
      File "/home/buildbot-sage/slave/sage_git/build/local/lib/python2.7/site-packages/sage/interfaces/gap.py", line 1411, in set
        self._eval_line(cmd, allow_use_file=True)
      File "/home/buildbot-sage/slave/sage_git/build/local/lib/python2.7/site-packages/sage/interfaces/gap.py", line 746, in _eval_line
        expect_eof= (self._quit_string() in line))
      File "/home/buildbot-sage/slave/sage_git/build/local/lib/python2.7/site-packages/sage/interfaces/gap.py", line 609, in _execute_line
        x = E.expect_list(self._compiled_full_pattern)
      File "/home/buildbot-sage/slave/sage_git/build/local/lib/python2.7/site-packages/pexpect/spawnbase.py", line 369, in expect_list
        return exp.expect_loop(timeout)
      File "/home/buildbot-sage/slave/sage_git/build/local/lib/python2.7/site-packages/pexpect/expect.py", line 111, in expect_loop
        incoming = spawn.read_nonblocking(spawn.maxread, timeout)
      File "/home/buildbot-sage/slave/sage_git/build/local/lib/python2.7/site-packages/pexpect/pty_spawn.py", line 469, in read_nonblocking
        self.isalive()
      File "/home/buildbot-sage/slave/sage_git/build/local/lib/python2.7/site-packages/pexpect/pty_spawn.py", line 704, in isalive
        alive = ptyproc.isalive()
      File "/home/buildbot-sage/slave/sage_git/build/local/lib/python2.7/contextlib.py", line 35, in __exit__
        self.gen.throw(type, value, traceback)
      File "/home/buildbot-sage/slave/sage_git/build/local/lib/python2.7/site-packages/pexpect/pty_spawn.py", line 25, in _wrap_ptyprocess_err
        raise ExceptionPexpect(*e.args)
    ExceptionPexpect: isalive() encountered condition where "terminated" is 0, but there was no child process. Did someone else call waitpid() on our process?
**********************************************************************
[...]
    [227 tests, 92 failures, 42.83 s]
----------------------------------------------------------------------
sage -t --long src/sage/interfaces/gap.py  # 92 doctests failed
----------------------------------------------------------------------

comment:27 Changed 23 months ago by embray

  • Status changed from needs_work to needs_info

I'm not really sure what makes you think that has anything to do with this ticket. That looks like the gap interpreter itself is crashing unexpectedly when running a debug build. But this ticket doesn't change anything about gap itself or the pexpect interface.

comment:28 Changed 23 months ago by vbraun

I only ran the tests with and without this ticket, I haven't looked at the code. If it is as you say then presumably its some sort of resource exhaustion while running the tests.

comment:29 Changed 23 months ago by embray

I'll still check if I can reproduce. It could be something like #28106

comment:30 Changed 23 months ago by embray

I can reproduce it, even with --memlimit=0. Truly baffling, but I guess there must be something...

comment:31 Changed 23 months ago by embray

Somehow it seems this has caused a regression of something that was supposed to be fixed by #10296. The problems start here:

        The following tests against a bug fixed at :trac:`10296`::

            sage: gap(3)
            3
            sage: gap.eval('quit;')
            ''
            sage: a = gap(3)
            ** Gap crashed or quit executing '\$sage...:=3;;' **
            Restarting Gap and trying again
            sage: a
            3

Instead of gap crashing normally and restarting, we get this error instead:

Exception raised:
    Traceback (most recent call last):
      File "/home/buildbot-sage/slave/sage_git/build/local/lib/python2.7/site-packages/sage/doctest/forker.py", line 681, in _run
        self.compile_and_execute(example, compiler, test.globs)
      File "/home/buildbot-sage/slave/sage_git/build/local/lib/python2.7/site-packages/sage/doctest/forker.py", line 1105, in compile_and_execute
        exec(compiled, globs)
      File "<doctest sage.interfaces.gap.Gap_generic._eval_line[8]>", line 1, in <module>
        a = gap(Integer(3))
      File "/home/buildbot-sage/slave/sage_git/build/local/lib/python2.7/site-packages/sage/interfaces/interface.py", line 295, in __call__
        result = self._coerce_from_special_method(x)
      File "/home/buildbot-sage/slave/sage_git/build/local/lib/python2.7/site-packages/sage/interfaces/interface.py", line 323, in _coerce_from_special_method
        return (x.__getattribute__(s))(self)
      File "sage/structure/sage_object.pyx", line 693, in sage.structure.sage_object.SageObject._gap_ (build/cythonized/sage/structure/sage_object.c:5929)
        return self._interface_(G)
      File "sage/structure/sage_object.pyx", line 669, in sage.structure.sage_object.SageObject._interface_ (build/cythonized/sage/structure/sage_object.c:5475)
        X = I(s)
      File "/home/buildbot-sage/slave/sage_git/build/local/lib/python2.7/site-packages/sage/interfaces/interface.py", line 288, in __call__
        return cls(self, x, name=name)
      File "/home/buildbot-sage/slave/sage_git/build/local/lib/python2.7/site-packages/sage/interfaces/expect.py", line 1438, in __init__
        self._name = parent._create(value, name=name)
      File "/home/buildbot-sage/slave/sage_git/build/local/lib/python2.7/site-packages/sage/interfaces/interface.py", line 491, in _create
        self.set(name, value)
      File "/home/buildbot-sage/slave/sage_git/build/local/lib/python2.7/site-packages/sage/interfaces/gap.py", line 1411, in set
        self._eval_line(cmd, allow_use_file=True)
      File "/home/buildbot-sage/slave/sage_git/build/local/lib/python2.7/site-packages/sage/interfaces/gap.py", line 746, in _eval_line
        expect_eof= (self._quit_string() in line))
      File "/home/buildbot-sage/slave/sage_git/build/local/lib/python2.7/site-packages/sage/interfaces/gap.py", line 609, in _execute_line
        x = E.expect_list(self._compiled_full_pattern)
      File "/home/buildbot-sage/slave/sage_git/build/local/lib/python2.7/site-packages/pexpect/spawnbase.py", line 369, in expect_list
        return exp.expect_loop(timeout)
      File "/home/buildbot-sage/slave/sage_git/build/local/lib/python2.7/site-packages/pexpect/expect.py", line 111, in expect_loop
        incoming = spawn.read_nonblocking(spawn.maxread, timeout)
      File "/home/buildbot-sage/slave/sage_git/build/local/lib/python2.7/site-packages/pexpect/pty_spawn.py", line 469, in read_nonblocking
        self.isalive()
      File "/home/buildbot-sage/slave/sage_git/build/local/lib/python2.7/site-packages/pexpect/pty_spawn.py", line 704, in isalive
        alive = ptyproc.isalive()
      File "/home/buildbot-sage/slave/sage_git/build/local/lib/python2.7/contextlib.py", line 35, in __exit__
        self.gen.throw(type, value, traceback)
      File "/home/buildbot-sage/slave/sage_git/build/local/lib/python2.7/site-packages/pexpect/pty_spawn.py", line 25, in _wrap_ptyprocess_err
        raise ExceptionPexpect(*e.args)
    ExceptionPexpect: isalive() encountered condition where "terminated" is 0, but there was no child process. Did someone else call waitpid() on our process?

after which point the gap interface never recovers. It's only happening in the tests though. I can't reproduce interactively. Running the tests with --serial makes no difference.

Last edited 23 months ago by embray (previous) (diff)

comment:32 Changed 23 months ago by git

  • Commit changed from cc6491009070408c9ed5dde6317ee0ac229458ca to 65f8a84b3965bf80fc3b7fb6018a1be753f3eab2

Branch pushed to git repo; I updated commit sha1. This was a forced push. Last 10 new commits:

2fc1096These tests were marked 'random output' despite the explicit calls to set_random_seed(0), which should have obviated this.
b123422Use libgap for direct_product
f35a967Use libgap for character_table
4aacf0fSome fixes to make ClassFunction work properly with libgap.
4fe1132Use libgap for PermutationGroup_generic._regular_subgroup_gap
3ef4b4bUse libgap for PermutationGroup_generic.molien_series
9205aa6Fix minor test failure: This will now fail at trying to find a _libgap_init_ rather than a _gap_init_
0d0e1bbIf the values passed to the ClassFunction constructor are already a libgap
c8c5ecfFixed slightly different output on a few tests due to PRNG differences related
65f8a84Update this test to still use the gap pexpect interface.

comment:33 Changed 23 months ago by embray

  • Status changed from needs_info to needs_review

Rebased on current develop, and the latest commit fixes the immediate problem in the test suite.

I believe there is still an underlying problem with libgap/gap interaction which is not yet addressed though (and is probably not new as of this ticket, but just more likely to occur with libgap being used more extensively).

comment:34 Changed 23 months ago by embray

This might be related to https://github.com/gap-system/gap/issues/3380 , with libgap wait()-ing on random child processes inappropriatly.

comment:35 Changed 23 months ago by embray

It seems that, at least theoretically, we already do disable GAP's default SIGCHLD handler.

comment:36 Changed 23 months ago by embray

  • Status changed from needs_review to positive_review

Bizarrely, even after backing out that last test update, and rebuilding, I can't reproduce the issue anymore either. I think it might have actually had something to do with sage-cleaner.

Please send this to the buildbots again. I really don't think the issue is caused by this ticket, but is something more ephemeral that it happens to provoke somehow.

comment:37 Changed 23 months ago by embray

Yes, this has got to have something to do with sage-cleaner. If I just start a sage-cleaner process on its own, then run ./sage -t src/sage/interfaces/gap.py I can reproduce the issue reliably.

comment:38 Changed 23 months ago by embray

Soft dependency on #28354: Although I don't think the issues are particularly related, if those pexpect exceptions about isalive() keep happening, merging #28354 first should alleviate them.

comment:39 Changed 23 months ago by vbraun

  • Dependencies changed from #27946 to #27946, #28354

comment:40 Changed 22 months ago by vbraun

  • Branch changed from u/embray/prototype/perm-gps-no-pexpect-gap to 65f8a84b3965bf80fc3b7fb6018a1be753f3eab2
  • Resolution set to fixed
  • Status changed from positive_review to closed
Note: See TracTickets for help on using tickets.