Opened 19 months ago
Closed 19 months ago
#21388 closed defect (fixed)
Optimize Psi2()
Reported by:  embray  Owned by:  

Priority:  minor  Milestone:  sage7.4 
Component:  elliptic curves  Keywords:  
Cc:  Merged in:  
Authors:  Erik Bray, Jeroen Demeyer  Reviewers:  Jeroen Demeyer, Erik Bray 
Report Upstream:  N/A  Work issues:  
Branch:  bdce5dc (Commits)  Commit:  bdce5dced1e9f811616836253c4e7cd45b71fcf3 
Dependencies:  Stopgaps: 
Description (last modified by )
We greatly optimize the Psi2()
function.
Before:
sage: from sage.schemes.elliptic_curves.isogeny_small_degree import Psi2 sage: timeit('Psi2.f(71)') 5 loops, best of 3: 9.36 s per loop
After:
sage: from sage.schemes.elliptic_curves.isogeny_small_degree import Psi2 sage: timeit('Psi2.f(71)') 5 loops, best of 3: 875 ms per loop
(note the use of Psi2.f()
to bypass caching)
This also provides a workaround to a test that caused the Python interpreter to segfault on Windows. The specific call that caused the segfault is:
Psi2(71)
But in brief, the crash occurs because we have (in the original code) a large element of Univariate Quotient Polynomial Ring in v over Multivariate Polynomial Ring in x, u over Rational Field with modulus v^2  u^4 + 10*u^3 + 3*u^2  4*u + 8
. This is then being cast simply to a multivariate rational polynomial over x, u, v. Because there is no direct coercion between these types this involves eval()
ing the polynomial as a Python expression.
The problem is that this expression can become too large for the stackspecifically in Python's symbol visibility analysis, a step it performs just before compiling an expression to bytecode. The implementation of that recurses into binary expressions, leading to a stack overflow for such a large expression. This issue has come up once before in #16014 where it was worked around by a rewrite of the code. This particular test worked on Linux where the default stack size is 8MB, but it crashed on Windows where the typical stack is just 1MB.
The optimization gives smaller expressions to eval.
A more general fix to this problem would be nice thoughI'm writing up some thoughts I have on it in a separate post.
Change History (16)
comment:1 Changed 19 months ago by
 Status changed from new to needs_review
comment:2 Changed 19 months ago by
 Description modified (diff)
comment:3 Changed 19 months ago by
 Description modified (diff)
comment:4 Changed 19 months ago by
 Branch changed from u/embray/psi2 to u/jdemeyer/psi2
comment:5 Changed 19 months ago by
 Commit changed from 9b22d75ac27fcae0d07065daccab96ba14562f79 to 10ff2ec9ca61fb39187b9ab2fe0fb1965251ca24
comment:6 Changed 19 months ago by
Is the idea that you would just take the multiplication by x
outside?
comment:7 Changed 19 months ago by
 Commit changed from 10ff2ec9ca61fb39187b9ab2fe0fb1965251ca24 to bdce5dced1e9f811616836253c4e7cd45b71fcf3
Branch pushed to git repo; I updated commit sha1. New commits:
bdce5dc  Optimize Psi2()

comment:8 Changed 19 months ago by
 Description modified (diff)
 Summary changed from Make Psi2 function less likely to break the stack to Optimize Psi2()
comment:9 Changed 19 months ago by
This also happens to speed up the function by a factor of 10.
Please review...
comment:10 followup: ↓ 11 Changed 19 months ago by
Some of this makes sense to me (I was already irked by the use of += []
) but the rest is a little beyond my understanding of how Sage works (though I'd appreciate a brief explanation :)
That said it's no surprise to me that more could be done with this than what I did.
comment:11 in reply to: ↑ 10 Changed 19 months ago by
Replying to embray:
Some of this makes sense to me (I was already irked by the use of
+= []
) but the rest is a little beyond my understanding of how Sage works (though I'd appreciate a brief explanation :)
 Some general code cleanup and reformatting, in particular adding some spaces.
 To construct the rational number
1/x
, I useQQ((1,x))
instead of1/QQ(x)
.
 I remove
raise ValueError("%s must be one of %s."%(l,hyperelliptic_primes))
since that exception is already raised by_hyperelliptic_isogeny_data
.
 I rename
y
>v
to avoid confusion (the firstv
and the secondv
are really the same thing conceptually).
 I remove the variable
x
from the firstR
since it's never used. This is probably where the main speedup comes from.
 I replace
s = [1]
bys = [K(1)]
to ensure that all elements ofs
haveK
as parent.
 I replace your
R((1)**i*s[i]*x**(di))
by(1)**i * x**(di) * R(s[i])
. Sinces[i]
does not involve the variablex
at all, I can indeed first converts[i]
toR
and then multiply withx
.
 I replace
s[i]
bys[i].lift()
before conversion. This means "forgetting" the quotient and convertings[i]
to a true polynomial. Put differently, it's the conversion fromK
toL
. Then there is a coercion fromL
toR
(but not fromK
toR
).
comment:12 Changed 19 months ago by
Thanks, that all makes sense. In particular I didn't realize what lift() did.
comment:13 Changed 19 months ago by
 Status changed from needs_review to positive_review
comment:14 Changed 19 months ago by
 Status changed from positive_review to needs_work
Reviewer name..
comment:15 Changed 19 months ago by
 Reviewers set to Jeroen Demeyer, Erik Bray
 Status changed from needs_work to positive_review
comment:16 Changed 19 months ago by
 Branch changed from u/jdemeyer/psi2 to bdce5dced1e9f811616836253c4e7cd45b71fcf3
 Resolution set to fixed
 Status changed from positive_review to closed
In fact, we don't need the variable
x
in the ringR
at all...Hang on...
New commits:
Workaround to prevent this calculation to blow the stack