#19585 closed enhancement (fixed)
Improve efficiency of calling GAP functions
Reported by:  jaanos  Owned by:  

Priority:  major  Milestone:  sage6.10 
Component:  interfaces  Keywords:  GAP functions interface 
Cc:  vbraun, ncohen, dimpase  Merged in:  
Authors:  Janoš Vidali  Reviewers:  Travis Scrimshaw 
Report Upstream:  N/A  Work issues:  
Branch:  5fba891 (Commits)  Commit:  
Dependencies:  Stopgaps: 
Description (last modified by )
When calling a GAP function, the current code outputs its result twice for each call  first time when the function is called, and once again when checking whether the function actually returned something. Then, when the result is actually needed, it is once again read from the variable storing it.
This ticket removes the first two outputs, thus greatly speeding up the (still slow) GAP interface. The first output is dealt with by ;;
, which suppresses outputting the result, but not any explicitly printed output (which is caught when the function doesn't actually return anything). The second output, which would be printed and compared with the marker, is now stored in an auxiliary variable, which is then compared with the marker using GAP's IsIdenticalObj
function (which will also give the correct result in the unlikely case when the function returns "__SAGE_LAST__"
, except in the even more unlikely case when it returns the marker object itself).
The speedup achieved by this fix can be quite dramatic. For example, calling A.conjugacy_classes_subgroups()
for a permutation group A
of order 2000 took about 20 seconds on my laptop, but this is now cut down to under 10 seconds (which is still slower than it needs to be, IMO).
Attachments (1)
Change History (45)
comment:1 Changed 3 years ago by
 Branch set to u/jaanos/improve_efficiency_of_calling_gap_functions
comment:2 Changed 3 years ago by
 Commit set to 62bd5710559065d03a64aed7b8611cdc6842b278
 Component changed from PLEASE CHANGE to interfaces
 Description modified (diff)
 Keywords GAP functions interface added
 Status changed from new to needs_review
 Type changed from PLEASE CHANGE to enhancement
comment:3 Changed 3 years ago by
 Cc vbraun added
This does not appear to be a universal speedup. With branch:
sage: P = groups.permutation.Symmetric(7) sage: %time S = P.conjugacy_classes_subgroups() CPU times: user 432 ms, sys: 44 ms, total: 476 ms Wall time: 507 ms sage: P = groups.matrix.GL(3, 3) sage: %timeit P.conjugacy_classes_representatives() 1000 loops, best of 3: 1.08 ms per loop sage: P = groups.permutation.Suzuki(8) sage: P.cardinality() 29120 sage: %time S = P.conjugacy_classes_subgroups() CPU times: user 115 ms, sys: 10.3 ms, total: 125 ms Wall time: 296 ms
vs before:
sage: P = groups.permutation.Symmetric(7) sage: %time S = P.conjugacy_classes_subgroups() CPU times: user 368 ms, sys: 40.4 ms, total: 408 ms Wall time: 436 ms sage: P = groups.matrix.GL(3, 3) sage: %timeit P.conjugacy_classes_representatives() 1000 loops, best of 3: 1.02 ms per loop sage: P = groups.permutation.Suzuki(8) sage: %time S = P.conjugacy_classes_subgroups() CPU times: user 139 ms, sys: 121 µs, total: 139 ms Wall time: 264 ms
(For these timings, I ran it 10 times and took the best.)
Perhaps this change is asymptotically better. Could you post the group(s) you are using to test this?
comment:4 Changed 3 years ago by
 Cc ncohen dimpase added
comment:5 Changed 3 years ago by
I have attached the sparse6 string of a graph the automorphism group of which I was working with. I guess the problem here is that the domain is very large (unlike the cases above), so actually printing and reading the result is what takes most time, rather than the computation itself.
comment:6 Changed 3 years ago by
I suspect the reason we are seeing a slowdown in these smaller function calls is because of the extra GAP call:
if self.eval('IsIdenticalObj(__SAGE_VAL__, __SAGE_LAST__)') != 'true':
I think we could probably just check if res
is nonempty:
self.eval('__SAGE_LAST__ := "__SAGE_LAST__";;') res = self.eval("%s(%s);;"%(function, ",".join([s.name() for s in args]+ ['%s=%s'%(key,value.name()) for key, value in kwds.items()]))) if not res: # No printing was done if self.eval('IsIdenticalObj(last, __SAGE_LAST__)') != 'true': return self.new('last2') else: if res.strip(): from sage.interfaces.expect import AsciiArtString return AsciiArtString(res)
comment:7 Changed 3 years ago by
By the way, apart from the result the last call, last
, GAP stores results of two previous calls, in last2
and in last3
.
It looks like using these might avoid doing an extra GAP call...
(IsIdenticalObject
is not cheap in GAP)
comment:8 Changed 3 years ago by
 Branch changed from u/jaanos/improve_efficiency_of_calling_gap_functions to u/tscrim/improve_calling_gap_function19585
 Commit changed from 62bd5710559065d03a64aed7b8611cdc6842b278 to 48409b03cc37684677c441eddc6ae54584cdd13c
Here is the above tweak (which I did some editing to) that has about a %1 slowdown of the current Sage version but doesn't do the extra printing.
Is IsIdenticalObject
really that expensive in GAP? It seems like it would be the equivalent of is
, and so it should be fast...
New commits:
48409b0  Reducing the number of function calls to GAP.

comment:9 followup: ↓ 12 Changed 3 years ago by
I think we have a problem now... what if the function does print something, but also returns a value? Then we should catch the value, not the printed output (which is most likely some kind of warning).
comment:10 Changed 3 years ago by
 Branch changed from u/tscrim/improve_calling_gap_function19585 to u/jaanos/improve_calling_gap_function19585
comment:11 Changed 3 years ago by
 Commit changed from 48409b03cc37684677c441eddc6ae54584cdd13c to 46f1391fd13052e8c684ed3e96434143a58860dc
I have removed the check for empty output. Since most functions we call will return a value and not print anything, there is actually one check less in these cases.
As far as IsIdenticalObj
is concerned, it surprises me too that it should not be cheap. I don't suppose that =
is actually cheaper?
Actually, what we could do is cut the number of times eval
is called, by simply running the three GAP commands in one eval
. I'll try that, and if there is any speedup from doing that, I'll push the change to this ticket.
New commits:
46f1391  Call IsIdenticalObj even when there is some output, to prevent ignoring the returned value when there is printed output

comment:12 in reply to: ↑ 9 Changed 3 years ago by
Replying to jaanos:
I think we have a problem now... what if the function does print something, but also returns a value? Then we should catch the value, not the printed output (which is most likely some kind of warning).
This is the current behavior in Sage, so it's not really a problem. However, with our current approach, we could print out any nontrivial print statements along with functions that print and give outputs.
Although I think we do need 2 function calls, where the second one is the current call to IsIdenticalObject
.
comment:13 Changed 3 years ago by
Hi!
I've tried a few alternatives. First of all, for reference, the timings on my laptop with the develop branch (i.e., 6.10.beta4):
sage: P = groups.permutation.Symmetric(7) sage: %time S = P.conjugacy_classes_subgroups() CPU times: user 944 ms, sys: 64 ms, total: 1.01 s Wall time: 1.33 s sage: P = groups.matrix.GL(3, 3) sage: %timeit P.conjugacy_classes_representatives() 100 loops, best of 3: 3.26 ms per loop sage: P = groups.permutation.Suzuki(8) sage: %time S = P.conjugacy_classes_subgroups() CPU times: user 300 ms, sys: 40 ms, total: 340 ms Wall time: 421 ms
With the current branch, there is a small slowdown when there wasn't much output to cut, as expected:
sage: P = groups.permutation.Symmetric(7) sage: %time S = P.conjugacy_classes_subgroups() CPU times: user 1.08 s, sys: 120 ms, total: 1.2 s Wall time: 1.42 s sage: P = groups.matrix.GL(3, 3) sage: %timeit P.conjugacy_classes_representatives() 100 loops, best of 3: 3.31 ms per loop sage: P = groups.permutation.Suzuki(8) sage: %time S = P.conjugacy_classes_subgroups() CPU times: user 220 ms, sys: 40 ms, total: 260 ms Wall time: 343 ms
I've tried removing IsIdenticalObj
:
marker1 = '"__SAGE_LAST1__"' marker2 = '"__SAGE_LAST2__"' self.eval('__SAGE_LAST1__ := %s;;' % marker1) self.eval('__SAGE_LAST2__ := %s;;' % marker2) res = self.eval("%s(%s);;"%(function, ",".join([s.name() for s in args]+ ['%s=%s'%(key,value.name()) for key, value in kwds.items()]))) if self.eval('last2') == marker2: return self.new('last2') else: ...
But it seems that IsIdenticalObj
isn't that expensive after all:
sage: P = groups.permutation.Symmetric(7) sage: %time S = P.conjugacy_classes_subgroups() CPU times: user 1.08 s, sys: 132 ms, total: 1.22 s Wall time: 1.5 s sage: P = groups.matrix.GL(3, 3) sage: %timeit P.conjugacy_classes_representatives() 100 loops, best of 3: 3.27 ms per loop sage: P = groups.permutation.Suzuki(8) sage: %time S = P.conjugacy_classes_subgroups() CPU times: user 240 ms, sys: 28 ms, total: 268 ms Wall time: 371 ms
I've also tried making all calls in a single eval
, but it seems that last
doesn't work when used in a file. So it will probably be best to keep the latest version with IsIdenticalObj
.
Janoš
comment:14 followup: ↓ 18 Changed 3 years ago by
 Branch changed from u/jaanos/improve_calling_gap_function19585 to u/tscrim/improve_calling_gap_function19585
 Commit changed from 46f1391fd13052e8c684ed3e96434143a58860dc to f59926575333e34976901f92c1f070e0c168a0b9
 Reviewers set to Travis Scrimshaw
By combining the two commands, I get a net speedup:
sage: P = groups.permutation.Symmetric(7) sage: %timeit P.conjugacy_classes_subgroups() 1 loops, best of 3: 390 ms per loop sage: P = groups.permutation.Symmetric(2) sage: %timeit P.conjugacy_classes_subgroups() 100 loops, best of 3: 7.87 ms per loop
vs current develop
:
sage: P = groups.permutation.Symmetric(7) sage: %timeit P.conjugacy_classes_subgroups() 1 loops, best of 3: 403 ms per loop sage: P = groups.permutation.Symmetric(2) sage: %timeit P.conjugacy_classes_subgroups() 100 loops, best of 3: 8.08 ms per loop
I think we need to mitigate then number of eval
calls that get made more so than doing the check in Sage. So if you're happy with my last change, then I'm okay with setting this to positive review.
New commits:
f599265  Combining the two commands for more speed.

comment:15 followup: ↓ 16 Changed 3 years ago by
For solving your original problem: if communication is a bottleneck, then using libGap will probably give you the best improvements. That said, improvements in expect interfaces are of course welcome, so thanks!
comment:16 in reply to: ↑ 15 ; followup: ↓ 17 Changed 3 years ago by
Replying to nbruin:
For solving your original problem: if communication is a bottleneck, then using libGap will probably give you the best improvements. That said, improvements in expect interfaces are of course welcome, so thanks!
interestingly, libGAP is quite slow here.
sage: %time S = P.conjugacy_classes_subgroups() CPU times: user 143 ms, sys: 24 ms, total: 167 ms Wall time: 181 ms sage: P = libgap.SuzukiGroup(libgap.IsPermGroup,8) sage: %time S=P.ConjugacyClassesSubgroups() CPU times: user 220 ms, sys: 0 ns, total: 220 ms Wall time: 220 ms
I guess there handling of P
and S
on the Sage side is not optimal...
comment:17 in reply to: ↑ 16 Changed 3 years ago by
Replying to dimpase:
interestingly, libGAP is quite slow here.
...
I guess there handling of
P
andS
on the Sage side is not optimal...
That is one possibility. It could also be that libGap gets compiled with different optimization settings and/or that the memory manager underneath is different and affects execution. It could really be worthwhile if someone knowledgeable about libGap would run some of these examples through a profiler (gperftool perhaps?) and compare the execution profiles between Gap and libGap.
comment:18 in reply to: ↑ 14 Changed 3 years ago by
Hi!
I think we need to mitigate then number of
eval
calls that get made more so than doing the check in Sage. So if you're happy with my last change, then I'm okay with setting this to positive review.
That seems like a good idea, however it fails for longer commands, when a file is used to communicate with GAP (as I said, last
doesn't seem to work in this case). Maybe we could check the length of the command, and then decide how many eval
s we do based on that (really short cases can be handled with a single eval
).
Janoš
comment:19 Changed 3 years ago by
 Branch changed from u/tscrim/improve_calling_gap_function19585 to u/jaanos/improve_calling_gap_function19585
comment:20 Changed 3 years ago by
 Commit changed from f59926575333e34976901f92c1f070e0c168a0b9 to 1408680266b88585fa6cb15780e8c8dfce415105
Hello again!
OK, I added the check for overlong function calls, and it doesn't seem to affect runnning times. Such calls would probably never happen, but we can simulate one with something like
gap.function_call("ConjugacyClassesSubgroups", sage.interfaces.gap.GapElement(gap, 'SymmetricGroup(2)', name = 'a_variable_with_a_very_very_very_long_name'))
Without the latest commit, this would return a wrong result.
I have also experimented with also making the check for identical objects in the same eval
if the function call was short enough, but this invariably caused slowdowns.
Janoš
New commits:
1408680  Check if command is too long for combining

comment:21 Changed 3 years ago by
I'm not quite sure why my version didn't work; I'm sure it has something to do with how the pexpect interfaces work. However I confirm we need this change. Could you add a doctest for the long commands? With that, then you can set a positive review on my behalf.
comment:22 Changed 3 years ago by
 Commit changed from 1408680266b88585fa6cb15780e8c8dfce415105 to 086af29b0632d16e097da3f042a7656417fa517c
Branch pushed to git repo; I updated commit sha1. New commits:
086af29  Add doctests for function_call

comment:23 Changed 3 years ago by
 Status changed from needs_review to needs_work
I've added the doctest. However, now doctesting fails on a seemingly unrelated issue:
sage t src/sage/interfaces/gap.py ********************************************************************** File "src/sage/interfaces/gap.py", line 1001, in sage.interfaces.gap.GapElement_generic.bool Failed example: gap('false').bool() Expected: False Got: True **********************************************************************
This doesn't happen on the develop branch, and not even with the last commit. It seems that the failure has to do with the overlong variable name being reused:
sage: gap.function_call("ConjugacyClassesSubgroups", sage.interfaces.gap.GapElement(gap, 'SymmetricGroup(2)', name = 'a_variable_with_a_name_so_very_very_very_long_that_even_by_itself_will_make_expect_use_a_file')) [ ConjugacyClassSubgroups(SymmetricGroup( [ 1 .. 2 ] ),Group( [ () ] )), ConjugacyClassSubgroups(SymmetricGroup( [ 1 .. 2 ] ),SymmetricGroup( [ 1 .. 2 ] )) ] sage: F = gap('false') sage: F.bool() True sage: F.name() 'a_variable_with_a_name_so_very_very_very_long_that_even_by_itself_will_make_expect_use_a_file' sage: F = gap('false') sage: F.bool() False sage: F.name() '$sage1'
Will look into it.
comment:24 Changed 3 years ago by
 Commit changed from 086af29b0632d16e097da3f042a7656417fa517c to 31a173bccd732cb8d8b656d6882dcec074a87b7d
Branch pushed to git repo; I updated commit sha1. New commits:
31a173b  Properly handle long variable names

comment:25 Changed 3 years ago by
 Status changed from needs_work to needs_review
OK, I had to make sure that getting variables with long names is properly handled. As it is, it is possible that eval
s with commands a bit longer than the cutoff are made directly to GAP instead of with a temporary file (due to the Print()
being wrapped around the variable name), but this shouldn't be a problem.
I am setting this back to "needs review" due to this additional changes.
Janoš
comment:26 Changed 3 years ago by
While in principle, I think these recent changes are okay, could someone else doublecheck that we want to do things this way?
As for the current code, I would change:
 if isinstance(self._name, six.string_types) and parent._eval_using_file_cutoff and \  parent._eval_using_file_cutoff < len(self._name):  self._get_using_file = True + self._get_using_file = (isinstance(self._name, six.string_types) + and parent._eval_using_file_cutoff + and parent._eval_using_file_cutoff < len(self._name))
so we don't have to do a try/except
block:
if use_file is None:  try:  use_file = self._get_using_file  except AttributeError:  use_file = False + use_file = self._get_using_file
comment:27 Changed 3 years ago by
I agree with the change.
By the way, earlier in Expect.__init__
, _get_using_file
is set if the value is too long  I don't see how that helps (except when is_name == True
, but the new code also covers this case). So we can probably remove that piece of code, too.
On the other hand, this comment
# idea: Joe Wetherell  try to find out if the output # is too long and if so get it using file, otherwise # don't.
makes me wonder whether _get_using_file == True
was meant that fetching the result would be done using a temporary file (rather than using one to communicate the variable name). But of course, a long command may give a very short answer, and the output of a short command may be very long, so this isn't something that can be decided here. Also, the only get_using_file
methods apart from the new one in gap.py
are the generic one in interface.py
(which simply calls get
), and one in lie.py
which raises a NotImplementedError
. So we're probably safe redefining the meaning, if that's what's happening here.
comment:28 Changed 3 years ago by
 Status changed from needs_review to needs_work
Hello everybody!
It turns out that we still have some issues here.
Firstly, GAP function calls don't actually work when done through a temporary file, nor does getting a variable (which is essentially a Print
call). The reason for this is that when _eval_line_using_file
is called, it first looks for a :=
token  if there is one, all is well; but since we are relying on last
, we don't have one, and a Print()
is wrapped around our function call. If the function actually returns a value, this gives an impression of working, since the correct result is printed  but we get an AsciiArtString
instead of a GapElement
. When the function does not return a value, the situation is even worse, since printing the result of a function which does not return a result results in an error (which is silently suppressed, and an empty string is returned).
Now, it seems that we can't really solve this problem, since last
is not updated when commands are read from a file. So I'd suggest this is how we deal with it:
 If the command is short, evaluate it directly (one or two
eval
s, as in the latest commit).  If the command is long, we assign its result to a variable. This will of course cause an error if the function does not return a value  but such functions are few and it is much more likely that a (long) function does return a value.
 When getting a variable, we make a direct call to
Expect._eval_line_using_file
if a file should be needed to avoid having twoPrint
s and hence an error.
This problem has existed before this ticket was opened, so the cases when a temporary file is used are marginal at best. If you think it is OK, I can implement the above solution (or maybe someone else can do it, since I'll probably be busy in the following days).
Secondly, get_using_file
is indeed meant to use a file to read the results. So the code regarding _get_using_file
should probably be left as it is (of course we can always set _get_using_file
, so we don't need try
/except
blocks to check it), although the length of the input may be a poor predictor of the length of the output. Now, we should check whether getting with a file is actually faster with GAP than directly reading the output.
Janoš
comment:29 Changed 3 years ago by
Just to confirm, all of the issues you noted are there before this change. If so, then let us open up a followup ticket for these issues and get this into Sage as is. We don't need to fix everything all in one ticket.
comment:30 Changed 3 years ago by
 Branch changed from u/jaanos/improve_calling_gap_function19585 to u/jaanos/improve_efficiency_of_calling_gap_functions
comment:31 Changed 3 years ago by
 Commit changed from 31a173bccd732cb8d8b656d6882dcec074a87b7d to 086af29b0632d16e097da3f042a7656417fa517c
 Status changed from needs_work to needs_review
OK, I have removed the latest commit since it introduces some unneeded and erroneous code. Now we're back to where we've been at comment 23, with a doctest failing. I'll open a new ticket shortly addressing this problem.
comment:32 Changed 3 years ago by
The other ticket is #19607, and I don't have the technical knowledge of that part of Sage to do a good review of that.
However I have figured out the problem with comment:23, and it is essentially what you wrote in comment:28. I added the following right at the offending doctest:
sage: x.str() # This is why it is returning True '' sage: x._name # Because this is making use the file 'a_variable_with_a_name_so_very_very_very_long_that_even_by_itself_will_make_expect_use_a_file' sage: gap(x._name) # At least the variable is stored and retrieved correctly... false
So we could likely to suppress this by doing another test with a shorter name being passed. I'm guessing we are somehow recycling variable names in the session, and we either shouldn't or will actually have to handle long variable names before this can get merged (possibly both too). I'm okay with suppressing with a note in the doctest referencing this issue. Anyone familiar with the interface have a better idea of what is going on?
comment:33 Changed 3 years ago by
Indeed, variable names are being recycled, so simply doing another test with a shorter variable name will probably result in another doctest failing (and even if it doesn't, it may in the future). So I guess the right thing to do would be to do the long variable name doctests in a separate GAP session.
comment:34 Changed 3 years ago by
 Commit changed from 086af29b0632d16e097da3f042a7656417fa517c to 5fba891a391e2b345842bf07e1be5b8d91283fef
Branch pushed to git repo; I updated commit sha1. New commits:
5fba891  Fix the doctests so that long variable names are only used in a separate GAP session

comment:35 Changed 3 years ago by
OK, done. Can anyone look at #19607? (I have included the latest commit there to avoid conflicts.
comment:36 Changed 3 years ago by
 Status changed from needs_review to positive_review
Okay, LGTM. Thanks.
comment:37 Changed 3 years ago by
I see that this has now been merged into the develop
branch. Thanks everyone!
comment:38 Changed 3 years ago by
I hope that you are wrong, as otherwise the ticket should be marked as 'closed'.
comment:39 Changed 3 years ago by
Yes, I'd expect it to be marked as closed too, but I see that it has been merged (just missing the latest beta, though):
https://github.com/vbraun/sage/commits/develop
The other three tickets merged after the latest beta are still 'positive_review', too.
comment:40 Changed 3 years ago by
Oh I see. That's the repository that Volker uses to work on the betas, but that's not released yet. Several commits below the one of this ticket, you see the 'Updated Sage version' commit that marks where the last release goes.
Nathann
comment:41 Changed 3 years ago by
You shouldn't be looking at my repo, that is just what is being tested on the buildbot. Just because its being tested doesn't mean that its ever going to be merged ;)
comment:42 Changed 3 years ago by
Oh, I see:) Thanks for the clarification.
comment:43 Changed 3 years ago by
 Branch changed from u/jaanos/improve_efficiency_of_calling_gap_functions to 5fba891a391e2b345842bf07e1be5b8d91283fef
 Resolution set to fixed
 Status changed from positive_review to closed
comment:44 Changed 3 years ago by
 Commit 5fba891a391e2b345842bf07e1be5b8d91283fef deleted
OK, now it really is merged:) Thanks again!
New commits:
Flip the switch from "magic" to "more magic":)