Opened 12 years ago
Closed 7 years ago
#10295 closed enhancement (fixed)
Upgrade and optimize pexpect
Reported by: | SimonKing | Owned by: | was |
---|---|---|---|
Priority: | major | Milestone: | sage-7.0 |
Component: | interfaces | Keywords: | pexpect upgrade |
Cc: | drkirkby, leif, Vincent.Neri | Merged in: | |
Authors: | François Bissey, Bill Page, Jeroen Demeyer | Reviewers: | Jeroen Demeyer, Bill Page |
Report Upstream: | Reported upstream. Developers acknowledge bug. | Work issues: | |
Branch: | 9351ccb (Commits, GitHub, GitLab) | Commit: | 9351ccb53d1d3c1d387c11e0c57443b53d1cfdd1 |
Dependencies: | #19671 | Stopgaps: |
Description (last modified by )
We use pexpect version 2.0. Shouldn't we upgrade to the current version 4.0.1?
See also #10294.
New upstream tarballs:
- https://pypi.python.org/packages/source/p/pexpect/pexpect-4.0.1.tar.gz
- https://pypi.python.org/packages/source/p/ptyprocess/ptyprocess-0.5.tar.gz
- http://www.math-cs.gordon.edu/~kcrisman/sagenb-0.11.6.1.tar
Patches included and submitted upstream:
- https://github.com/pexpect/pexpect/pull/291
- https://github.com/pexpect/pexpect/pull/303
- https://github.com/pexpect/pexpect/pull/304
- https://github.com/pexpect/pexpect/pull/307
Note that this branch includes #19616.
Throughput timings (best out of 5):
pexpect 2.0 upstream:
sage: %time _ = str(gp("2^2^22")) CPU times: user 184 ms, sys: 4 ms, total: 188 ms Wall time: 384 ms
pexpect 4.0.1 upstream:
sage: %time _ = str(gp("2^2^22")) CPU times: user 208 ms, sys: 4 ms, total: 212 ms Wall time: 405 ms
pexpect 4.0.1 + Sage patches:
sage: %time _ = str(gp("2^2^22")) CPU times: user 9 ms, sys: 7 ms, total: 16 ms Wall time: 209 ms
Latency timings (best out of 5):
pexpect 2.0 upstream:
sage: gp(1); sage: %time _ = [gp(i) for i in range(10^4)] CPU times: user 2.96 s, sys: 855 ms, total: 3.82 s Wall time: 4.52 s
pexpect 4.0.1 upstream:
sage: gp(1); sage: %time _ = [gp(i) for i in range(10^4)] CPU times: user 2.61 s, sys: 758 ms, total: 3.37 s Wall time: 3.48 s
pexpect 4.0.1 + Sage patches:
sage: gp(1); sage: %time _ = [gp(i) for i in range(10^4)] CPU times: user 2.3 s, sys: 999 ms, total: 3.3 s Wall time: 3.4 s
Attachments (2)
Change History (191)
comment:1 Changed 12 years ago by
comment:2 follow-up: 3 Changed 12 years ago by
When I say version 2.4 you have to go there: http://pypi.python.org/pypi/pexpect rather than here: http://pypi.python.org/pypi/pexpect
comment:3 Changed 12 years ago by
Replying to fbissey:
When I say version 2.4 you have to go there: http://pypi.python.org/pypi/pexpect rather than here: http://pypi.python.org/pypi/pexpect
Aren't the two links the same?
So, you say that there might be a performance problem with more recent versions of pexpect. That's bad, because the idea was to get rid of a performance problem by upgrading...
comment:4 Changed 12 years ago by
cut and paste didn't work as expected the second one was meant to be: http://pexpect.sourceforge.net/
search the mailing list for pexpect, it is a proverbial can of worms.
comment:5 Changed 12 years ago by
2.4 is not released yet. Here's a package for 2.3 that I created.
http://boxen.math.washington.edu/home/kirkby/patches/pexpect-2.3.spkg
but on my OpenSolaris machine, this 2.3 package results in one doctest failure, which is:
sage -t -long -force_lib devel/sage/sage/interfaces/expect.py
The interface must be semi-working, as the interface to R works - or at lease the R doctest does not fail. Here's the error message of the failed doctest.
sage -t -long -force_lib devel/sage/sage/interfaces/sage0.py [14.8 s] sage -t -long -force_lib devel/sage/sage/interfaces/expect.py Exception pexpect.ExceptionPexpect: ExceptionPexpect() in <bound method spawn.__del__ of <pexpect.spawn object at 0xc5988ac>> ignored <snip out load more similar errors> <pexpect.spawn object at 0xc386d6c>> ignored Exception pexpect.ExceptionPexpect: ExceptionPexpect() in <bound method spawn.__del__ of <pexpect.spawn object at 0xc386d6c>> ignored ********************************************************************** File "/export/home/drkirkby/sage-4.7.alpha3/devel/sage-main/sage/interfaces/expect.py", line 867: sage: r._expect.before Expected: 'abc;\r\n[1] ' Got: 'abc <- 10 +15;\r\n__SAGE__R__PROMPT__> abc;\r\n[1] ' ********************************************************************** 1 items had failures: 1 of 11 in __main__.example_16 ***Test Failed*** 1 failures. For whitespace errors, see the file /export/home/drkirkby/.sage//tmp/.doctest_expect.py [18.4 s] sage -t -long -force_lib devel/sage/sage/interfaces/gap.py [20.4 s]
I think in some cases where we call external programs there are probably better ways of doing this. For example
- R has libraries, I expect we can call R via the libraries, rather than on the command line.
- Mathematica uses the Mathlink protocol to communicate between the kernel and front end. How to use Mathlink is documented, and has been used by at least open-source program (JMath) to work with Mathematica. But Sage calls Mathematica via the command line.
Dave
comment:6 Changed 12 years ago by
Oops, I see this doctest is related to R. But I'm sure there are other doctests which make use of pexpect, which are passing
Dave
comment:7 Changed 12 years ago by
While 2.4 is not on sourceforge but on pypi.python.org http://pypi.python.org/pypi/pexpect I think we can call it released unless you have other infos (from author/mailing list). As far as I remember, last time we tried pexpect-2.4 on sage-on-gentoo it broke plotting in the notebook. Can you try to do some plotting in the notebook with this version of pexpect? Digged my email archive, more precisely:
g=sin(x); plot(g,(x,-pi,3*pi/2))
in the notebook. It produced the following for us (at the time, notice the python time stamp):
import os;os.chdir("/tmp/tmpo2KomY"); execfile("_sage_input_1.py")Python 2.6.4 (r264:75706, Mar 4 2010, 21:15:13) [GCC 4.3.4] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> >>> import os;os.chdir("/tmp/tmpo2KomY"); >>> execfile("_sage_input_1.py") START1 import os;os.chdir("/tmp/tmpY61yjM"); execfile("_sage_input_2.py") import os;os.chdir("/tmp/tmpXjmsQK"); execfile("_sage_input_3.py") __SAGE__ __SAGE__import os;os.chdir("/tmp/tmpY61yjM"); __SAGE__execfile("_sage_input_2.py") START2 __SAGE__ __SAGE__import os;os.chdir("/tmp/tmpXjmsQK");
comment:9 Changed 12 years ago by
Cc: | leif added |
---|
comment:10 Changed 9 years ago by
Milestone: | sage-5.11 → sage-5.12 |
---|
comment:11 Changed 9 years ago by
Milestone: | sage-6.1 → sage-6.2 |
---|
comment:12 Changed 9 years ago by
Bump this. There is a pexpect 3.1 now at http://pexpect.readthedocs.org/en/latest/ (note: the sourceforge address will redirect you there). For a long time I thought pexpect was done and there wouldn't be anymore release.
I think if the project is alive they may take request from us.
comment:13 Changed 9 years ago by
Milestone: | sage-6.2 → sage-6.3 |
---|
comment:14 Changed 9 years ago by
Milestone: | sage-6.3 → sage-6.4 |
---|
comment:15 Changed 8 years ago by
Description: | modified (diff) |
---|
comment:16 Changed 8 years ago by
Branch: | → u/fbissey/pexpect3.3 |
---|---|
Commit: | → 9b1e0985e09c7181860d43f41f394cc23103ce1c |
Description: | modified (diff) |
Milestone: | sage-6.4 → sage-6.7 |
comment:18 Changed 8 years ago by
Commit: | 9b1e0985e09c7181860d43f41f394cc23103ce1c → 679e33af8315444f0080c98a45d7def4566bfbc1 |
---|
Branch pushed to git repo; I updated commit sha1. New commits:
679e33a | Update pexpect to 3.3
|
comment:19 Changed 8 years ago by
Commit: | 679e33af8315444f0080c98a45d7def4566bfbc1 → a24eab3ce985874ab8445ed5e362bbeead76fd40 |
---|
Branch pushed to git repo; I updated commit sha1. New commits:
a24eab3 | Fix broken doctest. _expect.before returns more stuff in pexpect 3.3
|
comment:20 Changed 8 years ago by
The bad behavior in the notebook originates in sagenb/notebook/worksheet.py
, this is potentially broken but the question is why don't we go there we the old pexpect?
comment:22 Changed 8 years ago by
Seems to be orthogonal to the only change I made to that file. It'll just need rebasing. On the other hand there seem to be several load
functions defined in sage. And it is not obvious why we should get that one in the notebook.
comment:23 Changed 8 years ago by
https://github.com/billpage/sagenb/commit/811a2e762bb041dd34d32d96b003ab46eaab22e9
Here is a patch to sagenb/interfaces/expect.py that corrects the bad behavior in the notebook following upgrade to pexpect 3.3. The problem is due to a difference between pexpect 2.0 and newer versions of pexpect in what happens following a TIMEOUT of an expect; call. TIMEOUT is the intended result from a call to '_read'. In newer versions '_expect.before' is not modified by TIMEOUT so the output continues to accumulate. There is no need to do this ourselves in 'output_status'.
comment:24 Changed 8 years ago by
See also http://pexpect.readthedocs.org/en/latest/history.html?highlight=timeout#version-2-3 for perhaps a better explanation.
comment:25 follow-up: 26 Changed 8 years ago by
Cool, I'll try that tomorrow morning. The next step will be performance measurements. One of the reason we hadn't moved to at least pexpect 2.4 was poor performance (sloooow). I would be willing to trade performance for something properly maintained but we should know what we are getting into on that front. Then again that TIMEOUT
thing may have been the cause.
comment:26 Changed 8 years ago by
Replying to fbissey:
Then again that
TIMEOUT
thing may have been the cause.
The mentioned change was made in 2.3, but already 2.1 was said to be slooooooooow.
But we'll (or you'll) see...
comment:27 follow-up: 28 Changed 8 years ago by
I also doubt that this particular case of the use of TIMEOUT has anything to do with version 2.1 being slower. The problem here is how the notebook interacts with Sage in an asynchronous manner. TIMEOUT is the expected behavior when checking on the status of a computation. This allows notebook to display partial results, i.e. the output in 'self._so_far'. The notebook code treats TIMEOUT as a Python exception rather than waiting for it with a 'expect' pattern. In the old version of pexpect this exception did not update the '.before' attribute. On the other hand if the code was changed to explicitly look for 'pexpect.TIMEOUT' then we would have had this problem even with the old version.
What is the recommended way to measure Sage performance?
comment:28 Changed 8 years ago by
Replying to bpage:
What is the recommended way to measure Sage performance?
Good question, I can only talk about the command line interface:
Of course there's time ./sage -t --long src/sage/interfaces/
(probably with -p
; one can also set SAGE_TEST_ITER
and SAGE_TEST_GLOBAL_ITER
). Only testing the interfaces that actually use pexpect (and where the external program is really installed) makes more sense.
Tests that make heavy use of (stand-alone) Maxima/ecl
, Singular, gap
and gp
are meaningful as well. I cannot really list them right now, but src/sage/schemes/elliptic_curves/
and especially src/sage/sandpiles/sandpile.py
are known to fall into that category IIRC.
Before testing, it IMHO makes sense to remove or rename $DOT_SAGE/timings*.json
, otherwise doctests will get reordered with undesirable impact on the resource usage, influencing the measured timings.
comment:29 Changed 8 years ago by
On my system with pexpect 3.x from github I get:
wspage@suse:~/sage> inxi -SCI System: Host: suse Kernel: 3.0.101-0.46-default x86_64 (64 bit) Desktop Gnome 2.28.2 Distro: SUSE Linux Enterprise Server 11 (x86_64) VERSION = 11 PATCHLEVEL = 3 CPU: Quad core Intel Core i5-2500K CPU (-MCP-) cache: 6144 KB flags: (lm nx sse sse2 sse3 sse4_1 sse4_2 ssse3 vmx) Clock Speeds: 1: 1600.00 MHz 2: 1600.00 MHz 3: 3301.00 MHz 4: 1600.00 MHz Info: Processes: 245 Uptime: 67 days 5:09 Memory: 11461.1/15917.0MB Client: Shell inxi: 1.7.24 wspage@suse:~/sage> time ./sage -t --long src/sage/interfaces/ ... ---------------------------------------------------------------------- sage -t --long src/sage/interfaces/expect.py # 1 doctest failed ---------------------------------------------------------------------- Total time for all tests: 123.2 seconds cpu time: 14.3 seconds cumulative wall time: 120.5 seconds real 2m18.358s user 0m56.396s sys 0m4.856s
comment:30 Changed 8 years ago by
You should do some things like
%timeit gp.eval('2+3') %timeit gp('2') + gp('3') %timeit axiom('2+3')
etc. The doctesting won't testing latency much, and latency is *the* main issue with performance with the pexpect interfaces. The other is size, e.g.,
%timeit s=str(gp.eval('2^10000'))
comment:31 Changed 8 years ago by
I will move to do the latency test shortly but for reference when I do the doctesting of sage/interface
on my machine.
pexpect-2.0
Total time for all tests: 118.3 seconds cpu time: 20.6 seconds cumulative wall time: 115.9 seconds real 1m59.638s user 1m21.060s sys 0m3.290s
pexpect-3.3
Total time for all tests: 128.3 seconds cpu time: 21.3 seconds cumulative wall time: 125.9 seconds real 2m9.655s user 1m27.270s sys 0m3.200s
So it's a little bit slower already. I don't know if anything skew that a bit. Moving to the latency test which may be a better test anyway.
comment:32 Changed 8 years ago by
pexpect-2.0
┌────────────────────────────────────────────────────────────────────┐ │ SageMath Version 6.7.beta4, Release Date: 2015-05-05 │ │ Type "notebook()" for the browser-based notebook interface. │ │ Type "help()" for help. │ └────────────────────────────────────────────────────────────────────┘ ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ ┃ Warning: this is a prerelease version, and it may be unstable. ┃ ┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ sage: %timeit gp.eval('2+3') The slowest run took 269.56 times longer than the fastest. This could mean that an intermediate result is being cached 10000 loops, best of 3: 94.4 us per loop sage: %timeit gp('2') + gp('3') The slowest run took 8.78 times longer than the fastest. This could mean that an intermediate result is being cached 1000 loops, best of 3: 556 us per loop sage: %timeit s=str(gp.eval('2^10000')) 1000 loops, best of 3: 250 us per loop
One time only so we don't use the cache
sage: %time gp.eval('2+3') CPU times: user 0 ns, sys: 0 ns, total: 0 ns Wall time: 728 us '5' sage: %time gp('2') + gp('3') CPU times: user 0 ns, sys: 0 ns, total: 0 ns Wall time: 1.51 ms 5 sage: %time s=str(gp.eval('2^10000')) CPU times: user 0 ns, sys: 0 ns, total: 0 ns Wall time: 667 us
pexpect-3.3
sage: %timeit gp.eval('2+3') The slowest run took 80.90 times longer than the fastest. This could mean that an intermediate result is being cached 1000 loops, best of 3: 309 us per loop sage: %timeit gp('2') + gp('3') 1000 loops, best of 3: 1.54 ms per loop sage: %timeit s=str(gp.eval('2^10000')) The slowest run took 6.02 times longer than the fastest. This could mean that an intermediate result is being cached 1000 loops, best of 3: 513 us per loop
One time only
sage: %time gp.eval('2+3') CPU times: user 10 ms, sys: 0 ns, total: 10 ms Wall time: 30.5 ms '5' sage: %time gp('2') + gp('3') CPU times: user 0 ns, sys: 0 ns, total: 0 ns Wall time: 4.58 ms 5 sage: %time s=str(gp.eval('2^10000')) CPU times: user 0 ns, sys: 0 ns, total: 0 ns Wall time: 644 us
So it is a little bit all over the place, but generally 3 to 4 times slower with pexpect 3.3. I don't know if it is acceptable. Since upstream is now alive and well we could take it as an issue with them?
comment:33 follow-up: 40 Changed 8 years ago by
So it is a little bit all over the place, but generally 3 to 4 times slower with pexpect 3.3. I don't know if it is acceptable.
No, this is definitely not acceptable. For real work it could easily mean people's programs are 4 times slower due to this upgrade. It's important to understand that the use case of pexpect by Sage is massively different than the use case of pexpect for absolutely everybody else -- for them some latency issues are no big deal, but for us they are.
Incidentally, I think there is no caching going on despite what %timeit says...
comment:34 Changed 8 years ago by
Right, I am hearing you. This is a definite problem. So is keeping dead software. If we are keeping the pexpect 2.0 branch in for performance issue it becomes the responsibility of sage developers to maintain it and add new features like the one Bill wants.
Effectively we are maintaining a fork and if we are going that way I would like it to be formalized with a different name and possibly integrated in the sage code itself. This is so people don't get upgraded to a different pexpect through pip if they are using it. Also while our core functionality uses this pexpect, we don't want to block other packages that would need a newer pexpect.
That is my opinion on the matter. The other way is to work, possibly with upstream, to solve that latency problem in newer versions.
comment:35 Changed 8 years ago by
fbissey -- definitely -- I completely agree with you, as long as "we" = not me :-)
comment:36 Changed 8 years ago by
@bill do you think it would be easy to bake unicode support in pexpect 2.0? Or would it be easier to work out the latency issue? You seem to have a good handle on what's in pexpect.
comment:37 follow-up: 43 Changed 8 years ago by
Apparently the performance problem originates with this old commit:
after commenting out
https://github.com/pexpect/pexpect/blob/3.x/pexpect/__init__.py#L1540
#time.sleep(0.0001)
performance changes from
sage: sage: %timeit gp.eval('2+3') 1000 loops, best of 3: 427 µs per loop sage: sage: %timeit gp.eval('2+3') 1000 loops, best of 3: 449 µs per loop
to this
sage: %timeit gp.eval('2+3') The slowest run took 175.25 times longer than the fastest. This could mean that an intermediate result is being cached 10000 loops, best of 3: 76.3 µs per loop sage: %timeit gp.eval('2+3') The slowest run took 7.85 times longer than the fastest. This could mean that an intermediate result is being cached 10000 loops, best of 3: 76.4 µs per loop
comment:38 Changed 8 years ago by
Excellent job! Do we have contact upstream to see if this is really needed or could be made optional?
comment:39 follow-up: 41 Changed 8 years ago by
Alternative to patching pexpect, we can just replace pieces of pexpect that we don't like with custom Python or Cython code (I do this in #17686 for exiting programs).
comment:40 Changed 8 years ago by
Replying to was:
No, this is definitely not acceptable. For real work it could easily mean people's programs are 4 times slower due to this upgrade.
Indeed. I am even using a patched version of my department's linux distribution, since otherwise the latency (of pseudo-TTY, which slows down pexpect) makes my group cohomology spkg practically unusable.
comment:41 follow-up: 42 Changed 8 years ago by
Replying to jdemeyer:
Alternative to patching pexpect, we can just replace pieces of pexpect that we don't like with custom Python or Cython code (I do this in #17686 for exiting programs).
That's interesting if we can do it in a systematic way. I wouldn't mind getting rid of most pexpect interface with the exception of the notebook (which may be harder).
comment:42 Changed 8 years ago by
Replying to fbissey:
That's interesting if we can do it in a systematic way. I wouldn't mind getting rid of most pexpect interface with the exception of the notebook (which may be harder).
I don't mean to stop using pexpect interfaces. I mean replacing parts of pexpect itself: the main pexpect code is in one Python class spawn
and we can monkey-patch that by subclassing that and changing methods as we please.
Since pexpect uses some low-level OS stuff, implementing the critical parts in Cython might also give significant speed gain (I haven't tested this yet, but surely calling os.foo()
in Python and playing with strings will be slower than directly calling the foo()
system call from Cython).
comment:43 follow-up: 44 Changed 8 years ago by
Replying to bpage:
Apparently the performance problem originates with this old commit:
after commenting out
https://github.com/pexpect/pexpect/blob/3.x/pexpect/__init__.py#L1540
#time.sleep(0.0001)
Hahaha, I was going to grab 2.1 and investigate the changes compared to 2.0. (Didn't realize the full history was kept.)
Of course continually polling without delay isn't nice either; it will significantly increase the CPU load.
comment:44 follow-up: 45 Changed 8 years ago by
Replying to leif:
Replying to bpage:
Apparently the performance problem originates with this old commit:
after commenting out
https://github.com/pexpect/pexpect/blob/3.x/pexpect/__init__.py#L1540
#time.sleep(0.0001)Hahaha, I was going to grab 2.1 and investigate the changes compared to 2.0. (Didn't realize the full history was kept.)
Of course continually polling without delay isn't nice either; it will significantly increase the CPU load.
Will it? After all unless there are quite a number of other changes, it is the current situation with 2.0, isn't it? Upgrading without patching would decrease the CPU load if I am not mistaken.
Anyway, I like Jeroen's idea but we will need to upgrade pexpect in any case. I am all in favour of upgrading now with a patch and working on some cythonized elements for pexpect in a follow up ticket. We could remove the patch when we have something satisfactory in place.
comment:45 Changed 8 years ago by
Replying to fbissey:
Replying to leif:
Of course continually polling without delay isn't nice either; it will significantly increase the CPU load.
Will it?
http://en.wikipedia.org/wiki/Busy_waiting.
I'll have to look closer at the code. Haven't tested yet, but I guess it'll get worse the longer the net computation in the subprocess takes; the examples above only do short toy computations and don't take the overall CPU time into account. (It may lead to effectively doubling SAGE_NUM_THREADS
in ptestlong.)
After all unless there are quite a number of other changes, it is the current situation with 2.0, isn't it? Upgrading without patching would decrease the CPU load if I am not mistaken.
Probably. In fact, for me ptestlong with 3.3 (unpatched) went faster than with 2.0, although only slightly, under the same conditions (constant low load). (But in both cases I had some process apparently still waiting for Singular after all other tests had already finished, so the wall time got messed up a bit.)
But I'm ok with first upgrading here and removing the sleep
, then checking what else we could tweak on another ticket.
Anyway, I like Jeroen's idea but we will need to upgrade pexpect in any case. I am all in favour of upgrading now with a patch and working on some cythonized elements for pexpect in a follow up ticket. We could remove the patch when we have something satisfactory in place.
Subclassing, probably with individual changes for different interfaces or dynamically changing parameters is the way to go I think.
While we could perhaps do some parts in Cython, leaving Python of course puts more burden of portablility on us, depending on what exactly we do.
comment:46 Changed 8 years ago by
Perhaps proposing an upstream patch to provide an optional parameter such as
expect(prompt,yield=false)
to optionally avoid the call to sleep would be a good idea? Apparently there are situations such as the way that Sage uses gap where busy waiting is a good thing. Meanwhile in Python 3.x there is asyncio and pexpect 3.x already supports that.
comment:47 follow-up: 49 Changed 8 years ago by
I tried
time.sleep(0)
in pexpect/init.py and the result on my linux system remains fast (78.9 µs per loop). In principle the use of sleep(0) should allow scheduling on all systems but it might be a good idea for someone with a MAC to try this on OSX.
The pexpect people might accept a patch to change this to 0.
comment:48 follow-up: 51 Changed 8 years ago by
Why do you think it's busy waiting? It looks like it's reading with a timeout, which is not busy waiting.
comment:49 Changed 8 years ago by
Replying to bpage:
The pexpect people might accept a patch to change this to 0.
Well, rather the delay should be configurable.
But the whole expect_loop()
and read_nonblocking()
look pretty messy. We should perhaps simply overwrite one or both with a version that fits our needs. The whole class certainly isn't written for performance.
Note that the behaviour also depends on the writer (i.e., the subprocess); ECL for example is extremely poor in that it doesn't write strings, but really character by character, such that in the worst case we here would end up with at least len(s)
* 0.1 ms (and at least len(s) context switches, not including the system calls themselves, where pexpect doesn't only call select()
and probably read()
, but also waitpid()
because of the isalive()
checks).
comment:50 follow-up: 52 Changed 8 years ago by
@jdemeye Yes I see, so I wonder why they need time.sleep at all? I am back to asking them to just remove it. But if they prefer a patch with a configurable sleep time then why not?
@leif Re: rewriting the loop. For sure if you have the time but I think the priority should be to get pexpect updated soonest.
comment:51 follow-up: 55 Changed 8 years ago by
Replying to jdemeyer:
Why do you think it's busy waiting? It looks like it's reading with a timeout, which is not busy waiting.
Unless the timeout is None
, it is busy waiting.
With a larger timeout, it gets less worse, but is still busy waiting.
comment:52 Changed 8 years ago by
Replying to bpage:
@leif Re: rewriting the loop. For sure if you have the time but I think the priority should be to get pexpect updated soonest.
Of course on a follow-up, as I said before.
sleep(0)
shouldn't be necessary because of the (many!) system calls in the loop.
comment:54 Changed 8 years ago by
comment:55 Changed 8 years ago by
Replying to leif:
Replying to jdemeyer:
Why do you think it's busy waiting? It looks like it's reading with a timeout, which is not busy waiting.
Unless the timeout is
None
, it is busy waiting.With a larger timeout, it gets less worse, but is still busy waiting.
Ok, it's not as bad as I first thought, since the loop is exited upon the first timeout (which is on the total time spent in expect_loop()
, but also used in select()
). So it depends on what happens in the application upon timeout, and the timeout chosen.
comment:56 follow-up: 61 Changed 8 years ago by
It would be good if upstream were accepting one of the possible changes.
@Bill I see you have made a pull request can you tell us the results from travis when you have them? If it passes the tests I'll update the branch.
comment:57 follow-up: 58 Changed 8 years ago by
Hum.... I don't see much improvement by applying that patch at all.
sage: %time gp.eval('2+3') CPU times: user 0 ns, sys: 0 ns, total: 0 ns Wall time: 24 ms '5'
Note that while William says there shouldn't be a cache there is definitely one
sage: %time gp.eval('2+3') CPU times: user 0 ns, sys: 20 ms, total: 20 ms Wall time: 23.4 ms '5' sage: %time gp.eval('2+3') CPU times: user 0 ns, sys: 0 ns, total: 0 ns Wall time: 309 us '5'
comment:58 Changed 8 years ago by
Replying to fbissey:
Note that while William says there shouldn't be a cache there is definitely one
sage: %time gp.eval('2+3') CPU times: user 0 ns, sys: 20 ms, total: 20 ms Wall time: 23.4 ms '5' sage: %time gp.eval('2+3') CPU times: user 0 ns, sys: 0 ns, total: 0 ns Wall time: 309 us '5'
Isn't that just the overhead of starting up the process the first time you run it? If I subsequently send other strings to gp
, such as gp.eval('3+4')
, I get similar times. That's not consistent with caching.
comment:59 Changed 8 years ago by
I hadn't thought of that. But then the figure for the first run with pexpect 2.0 is in us while it is ms for 3.3. So there is quite a gap for that. May be I will need to recheck the timing for 2.0.
timeit
is definitely in the same ballpark for 2.0 (94.4 us) and 3.3 (99.7 us) so you are probably right.
comment:60 Changed 8 years ago by
Cc: | Vincent.Neri added |
---|
For info: upgrading pexpect might also solve another issue we are having when running Sage's doctests within a docker container. See comment by olliewalsh on: https://github.com/docker/docker/issues/12277
@Vincent.Neri: using this branch might be the easiest way to try out a recent version of pexpect and experiment with olliewalsh's suggestion
comment:61 Changed 8 years ago by
Replying to fbissey:
It would be good if upstream were accepting one of the possible changes.
@Bill I see you have made a pull request can you tell us the results from travis when you have them? If it passes the tests I'll update the branch.
https://github.com/pexpect/pexpect/pull/216
I am not sure if the "ci/teamcity" result is real. Could someone with more experience with this ci stuff take a look?
comment:62 Changed 8 years ago by
It looks like a misconfiguration rather than a fault in pexpect but I couldn't affirm it 100%.
comment:63 Changed 8 years ago by
Commit: | a24eab3ce985874ab8445ed5e362bbeead76fd40 → 80cd29cee620bb62b53a389cff5f7db4aa65b2d6 |
---|
Branch pushed to git repo; I updated commit sha1. New commits:
80cd29c | Add patch from https://github.com/pexpect/pexpect/pull/216
|
comment:64 Changed 8 years ago by
The failing test upstream for python3.3 is a bit annoying but I am pushing the patch anyway as we are only concerned with python2.7 and 3.4.x (or even 3.5) in the future. It looks like we may have to get ready to shift target to pexpect 4.0 if they solve their last issue for a release.
comment:65 Changed 8 years ago by
We'll also need a new release of sagenb or patch it, getting it done may be fun...
comment:66 Changed 8 years ago by
Note that we can also re-implement the slow/broken parts in the Cython front-end sage/interfaces/sagespawn.pyx
. Doing that for expect_loop()
and read_nonblocking()
might be worth it.
comment:67 follow-up: 68 Changed 8 years ago by
What advantage can we gain from using Cython to re-implement? I think it might be better to propose a different upstream patch, something like:
time.sleep(self.maxSleep)
where maxSleep corresponds to a new optional parameter that defaults to 0.0001 for backward compatibility but which we can set to 0 in expect.py
. I have tested time.sleep(0)
and it seems to have the same end effect in Sage as commenting it out (and the same failures upstream). Given that they have apparently used similar strategies in the past, by making it optional I think we can expect upstream to agree to the change and that it would be incorporated into future versions of pexpect with no impact on Sage maintenance.
comment:68 Changed 8 years ago by
Replying to bpage:
What advantage can we gain from using Cython to re-implement?
I wasn't specifically talking about the sleep(0.0001)
issue, it was just a general comment.
We could
- Directly call system calls instead of using
os.foo()
, leading to better performance and better reliability, especially w.r.t. interrupts. - Use C strings instead of Python strings for performance.
- More generally change/add functionality without depending on upstream.
comment:69 follow-ups: 70 71 Changed 8 years ago by
Is there significant overhead calling os.foo
? By "w.r.t. interrupts" are you referring to the so called Python GIL that makes most things like this single-threaded? Would that mean that we would have to be careful to avoid mixing os.foo
calls and direct system calls? I understand that this changes significantly in Python 3. pexpect 3.x
and 4.x
support some new asynchronous modes for i/o processing in Python 3 that might be of interest.
I am sceptical that performance is significantly limited by string matching. I think this might only be the case if it is necessary to locate prompts within very large "before" buffers - something that does not happen frequently in the way that Sage calls external programs and when it does, performance may not be a priority. Do you have any examples where this would be of benefit?
If the goal is to change/add functionality without depending on upstream perhaps the best choice would be to re-write pexpect
entirely in cython, e.g. cexpect
. But it worries me every time the choice is made to re-invent things in Sage.
comment:70 follow-up: 72 Changed 8 years ago by
Replying to bpage:
Is there significant overhead calling
os.foo
?
No idea, I haven't profiled anything. Most likely there is not a large overhead.
By "w.r.t. interrupts" are you referring to the so called Python GIL that makes most things like this single-threaded?
No, but that could actually be a 4th point on my list above.
Would that mean that we would have to be careful to avoid mixing
os.foo
calls and direct system calls?
No, there is no such problem.
I understand that this changes significantly in Python 3.
I have no idea what changes in Python 3 for system calls.
I am sceptical that performance is significantly limited by string matching. I think this might only be the case if it is necessary to locate prompts within very large "before" buffers - something that does not happen frequently in the way that Sage calls external programs and when it does, performance may not be a priority. Do you have any examples where this would be of benefit?
It happens a lot in practice: it happens when a pexpect
process has a large output (which is then followed by a prompt). See the last example of 30
If the goal is to change/add functionality without depending on upstream perhaps the best choice would be to re-write
pexpect
entirely in cython, e.g.cexpect
.
I don't really want to fork completely pexpect
. It might be sufficient to change only expect
and read_nonblocking
.
comment:71 Changed 8 years ago by
Replying to bpage:
By "w.r.t. interrupts" are you referring to the so called Python GIL that makes most things like this single-threaded?
No, with "interrupts" I am talking about CTRL-C, a.k.a. KeyboardInterrupt
in Python. The problem is that Python's os
interface really doesn't deal with those well. Here is the problem:
try: result = os.read(fd) except KeyboardInterrupt: # ... ?
The problem is that, when an interrupt happens, you simply cannot know whether the read()
call actually succeeded and what the result is (the interrupt could happen while assigned the result for example).
I don't know to what extent this is actually a problem in practice, but it is something that worries me about pexpect
.
In Cython, interrupt handling has to be done manually, which means that it can be done in a predictable way.
comment:72 Changed 8 years ago by
Replying to jdemeyer:
Replying to bpage:
I am sceptical that performance is significantly limited by string matching. I think this might only be the case if it is necessary to locate prompts within very large "before" buffers - something that does not happen frequently in the way that Sage calls external programs and when it does, performance may not be a priority. Do you have any examples where this would be of benefit?
It happens a lot in practice: it happens when a
pexpect
process has a large output (which is then followed by a prompt). See the last example of 30
My point was that I would not expect something like %timeit s=str(gp.eval('2^10000'))
to be particularly fast. But maybe it is important to note that pexpect
already does make an effort to allow faster scanning for prompt if it can be represented as a fixed string rather than a regular expression.
If the goal is to change/add functionality without depending on upstream perhaps the best choice would be to re-write
pexpect
entirely in cython, e.g.cexpect
.I don't really want to fork completely
pexpect
. It might be sufficient to change onlyexpect
andread_nonblocking
.
It seems to me that this is exactly the essential part of pexpect
on which Sage depends. Most of the rest of pexpect
is just program interface and portability.
comment:73 Changed 7 years ago by
See commit to pexpect
https://github.com/pexpect/pexpect/commit/40ce421051c0a54f3b3849424491882cf1339801
by Jeff Quast. Some discussion at
https://gitter.im/pexpect/pexpect
The proposal is to make it possible for Sage to stop bundling an obsolete version of pexpect and to take advantage of recent improvements to pexpect including unicode support and support for Python 3.
comment:74 Changed 7 years ago by
Commit: | 80cd29cee620bb62b53a389cff5f7db4aa65b2d6 → b8f493325f89857566fb66d4f49a96e3d6fb41da |
---|
Branch pushed to git repo; I updated commit sha1. New commits:
b8f4933 | Merge branch 'develop' into pexpect3.3
|
comment:75 Changed 7 years ago by
Commit: | b8f493325f89857566fb66d4f49a96e3d6fb41da → 13f9f4fb9c120de1ada2f6165f25cdbeb689d2c5 |
---|
Branch pushed to git repo; I updated commit sha1. New commits:
13f9f4f | upgrade pexpect to top of git branch superfluous-sleep
|
comment:76 Changed 7 years ago by
Description: | modified (diff) |
---|
OK this is to update/test on top of the superfluous-sleep branch of pexpect - this 4.0.1+ hence the _p1
bit. We may move to an upstream release including this if all goes well.
The upgrade needs to be matched with a sagenb
upgrade including matching fixes for the notebook. Sorry to have been this long in preparing this.
New commits:
13f9f4f | upgrade pexpect to top of git branch superfluous-sleep
|
comment:77 Changed 7 years ago by
The matching change to sagenb
is now https://github.com/sagemath/sagenb/pull/351.
comment:79 Changed 7 years ago by
Status: | new → needs_review |
---|
comment:80 follow-up: 83 Changed 7 years ago by
Milestone: | sage-6.7 → sage-6.10 |
---|---|
Status: | needs_review → needs_info |
Why the custom tarball? Usually in Sage, we try using an upstream tarball with Sage patches.
comment:81 Changed 7 years ago by
If you can give very explicit instructions on how to test these simultaneously I should be okay to try to merge this over Thanksgiving (US). Do you expect (hah!) any differences between local installations and server ones?
comment:82 Changed 7 years ago by
Dependencies: | → #19616 |
---|
comment:83 Changed 7 years ago by
Replying to jdemeyer:
Why the custom tarball? Usually in Sage, we try using an upstream tarball with Sage patches.
Yes and I tried to do it that way. But just taking the latest release and then adding the patch from the commit didn't work. That particular branch was split from master sometimes after the latest release and sticking to pexpect 4.0.1
means that I was going to have to do some backporting. Because it was not obvious I decided to bit the bullet and ship the branch.
comment:84 Changed 7 years ago by
I think you should then do either of the following two things:
(A) properly document how the tarball was made: I should be able to follow your instructions and end up with the same files.
(B) add a patch between the branch you want to use and the latest upstream stable release. Then there will be no issues of a patch not applying.
comment:86 Changed 7 years ago by
Fair enough I put it to review after all. Patch between 4.0.1 and the state of this branch is 50KB :( no wonder the patch from the commit just didn't apply. Anyway all these things are incoming.
comment:87 Changed 7 years ago by
Commit: | 13f9f4fb9c120de1ada2f6165f25cdbeb689d2c5 → 90aebce1238710a9df18d6cd6aed6ce765caaf93 |
---|
Branch pushed to git repo; I updated commit sha1. New commits:
90aebce | Base the package on pexpect 4.0.1 with the help of a patch to get to the desired superfluous sleep branch of pexpect.
|
comment:88 Changed 7 years ago by
Authors: | → François Bissey, Bill Page |
---|---|
Description: | modified (diff) |
Status: | needs_work → needs_review |
comment:89 Changed 7 years ago by
If this doesn't make any more progress in the next few days I will probably just revert this sagenb commit upstream and release something, then we could have a specific sagenb just for this ticket later.
comment:90 Changed 7 years ago by
Note that there is a circular dependency with #19616. Can somebody please clarify?
comment:91 Changed 7 years ago by
fbissey said that the packages need to be updated simultaneously; assuming that #19616 ends up including this commit, then that ticket would be the one.
comment:92 Changed 7 years ago by
Reviewers: | → Jeroen Demeyer |
---|---|
Status: | needs_review → needs_work |
Sorry to ask the obvious question, but did anybody ever check that Sage even starts with this branch? Because it doesn't for me...
Traceback (most recent call last): File "/usr/local/src/sage-config/src/bin/sage-ipython", line 10, in <module> from sage.repl.interpreter import SageTerminalApp File "/usr/local/src/sage-config/local/lib/python2.7/site-packages/sage/repl/interpreter.py", line 218, in <module> from IPython.core.interactiveshell import InteractiveShell File "/usr/local/src/sage-config/local/lib/python2.7/site-packages/IPython/__init__.py", line 47, in <module> from .core.application import Application File "/usr/local/src/sage-config/local/lib/python2.7/site-packages/IPython/core/application.py", line 24, in <module> from IPython.core import release, crashhandler File "/usr/local/src/sage-config/local/lib/python2.7/site-packages/IPython/core/crashhandler.py", line 28, in <module> from IPython.core import ultratb File "/usr/local/src/sage-config/local/lib/python2.7/site-packages/IPython/core/ultratb.py", line 116, in <module> from IPython.utils import path as util_path File "/usr/local/src/sage-config/local/lib/python2.7/site-packages/IPython/utils/path.py", line 19, in <module> from IPython.utils.process import system File "/usr/local/src/sage-config/local/lib/python2.7/site-packages/IPython/utils/process.py", line 19, in <module> from ._process_posix import system, getoutput, arg_split, check_pid File "/usr/local/src/sage-config/local/lib/python2.7/site-packages/IPython/utils/_process_posix.py", line 24, in <module> import pexpect File "/usr/local/src/sage-config/local/lib/python2.7/site-packages/pexpect/__init__.py", line 75, in <module> from .pty_spawn import spawn, spawnu File "/usr/local/src/sage-config/local/lib/python2.7/site-packages/pexpect/pty_spawn.py", line 11, in <module> import ptyprocess ImportError: No module named ptyprocess
comment:93 Changed 7 years ago by
Bother, I didn't notice that this bit was split off. Works brilliantly in sage-on-gentoo. Seriously I was counting on Bill to look it over weeks ago, but he disappeared. I'll be pushing the needed dependencies ASAP.
comment:94 Changed 7 years ago by
Commit: | 90aebce1238710a9df18d6cd6aed6ce765caaf93 → fe1066154892173d17bed1892886dc8148fa33f5 |
---|
Branch pushed to git repo; I updated commit sha1. New commits:
fe10661 | Add ptyprocess as a dependency of pexpect, clean pexpect SPKG.txt and update the license
|
comment:96 follow-up: 97 Changed 7 years ago by
I am sorry but I have been focusing on other unrelated things. I am kind of lost and confused about how to test this kind of change in Sage. What exactly do I need to do? Can I just checkout a branch using git and build Sage? Steps?
comment:97 Changed 7 years ago by
Replying to bpage:
Can I just checkout a branch using git and build Sage?
Precisely. The easiest way to get access to the Trac branches is using git-trac
:
http://doc.sagemath.org/html/en/developer/git_trac.html
Once you have Sage and git-trac
set up, you should be able to do
$ git trac checkout 10295 $ make
comment:98 Changed 7 years ago by
Status: | needs_review → needs_info |
---|
Upstream tarball for ptyprocess
?
comment:99 Changed 7 years ago by
Description: | modified (diff) |
---|---|
Status: | needs_info → needs_review |
Added.
comment:100 Changed 7 years ago by
Combining the last two topics, I think these are the correct instructions for building Sage: first download the pexpect
and ptyprocess
tarballs (as given in the description) and stick them in SAGE_ROOT/upstream/
. Then do
$ git trac checkout 10295 $ make
comment:101 Changed 7 years ago by
OK, at least ptyprocess
and pexpect
build now. I'll rebuild Sage from scratch and test it overnight.
comment:102 Changed 7 years ago by
Dependencies: | #19616 |
---|
I'm going to have to make a different ticket for this sagenb update than originally anticipated, I think.
comment:103 Changed 7 years ago by
I get some failures:
---------------------------------------------------------------------- sage -t --long src/sage/interfaces/gap.py # 14 doctests failed sage -t --long src/sage/interfaces/expect.py # Timed out ----------------------------------------------------------------------
For the first one, there are lots of errors of this sort:
sage -t --long src/sage/interfaces/gap.py ********************************************************************** File "src/sage/interfaces/gap.py", line 717, in sage.interfaces.gap.Gap_generic._eval_line Failed example: a Expected: 3 Got: ** Gap crashed or quit executing 'Print($sage1);' ** Restarting Gap and trying again <repr(<sage.interfaces.gap.GapElement at 0x10cd07870>) failed: RuntimeError: Gap produced error output Error, Variable: '$sage1' must have a value <BLANKLINE> executing Print($sage1);> ********************************************************************** File "src/sage/interfaces/gap.py", line 779, in sage.interfaces.gap.Gap_generic.unbind Failed example: gap.set('x', '2') Expected nothing Got: ** Gap crashed or quit executing 'x:=2;;' ** Restarting Gap and trying again **********************************************************************
For the second one, the start of that part of the log looks like this:
********************************************************************** File "src/sage/interfaces/expect.py", line 708, in sage.interfaces.expect.Expect._eval_line_using_file Failed example: singular('a') Exception raised: Traceback (most recent call last): File "/Users/jpalmier/Desktop/Sage_stuff/git/sage/local/lib/python2.7/site-packages/sage/doctest/forker.py", line 496\ , in _run self.compile_and_execute(example, compiler, test.globs) File "/Users/jpalmier/Desktop/Sage_stuff/git/sage/local/lib/python2.7/site-packages/sage/doctest/forker.py", line 858\ , in compile_and_execute exec(compiled, globs) File "<doctest sage.interfaces.expect.Expect._eval_line_using_file[4]>", line 1, in <module> singular('a') File "/Users/jpalmier/Desktop/Sage_stuff/git/sage/local/lib/python2.7/site-packages/sage/interfaces/singular.py", lin\ e 788, in __call__ return SingularElement(self, type, x, False) File "/Users/jpalmier/Desktop/Sage_stuff/git/sage/local/lib/python2.7/site-packages/sage/interfaces/singular.py", lin\ e 1266, in __init__ raise_(TypeError, x, sys.exc_info()[2]) File "/Users/jpalmier/Desktop/Sage_stuff/git/sage/local/lib/python2.7/site-packages/sage/interfaces/singular.py", lin\ e 1261, in __init__ self._name = parent._create( value, type) File "/Users/jpalmier/Desktop/Sage_stuff/git/sage/local/lib/python2.7/site-packages/sage/interfaces/singular.py", lin\ e 749, in _create self.set(type, name, value) File "/Users/jpalmier/Desktop/Sage_stuff/git/sage/local/lib/python2.7/site-packages/sage/interfaces/singular.py", lin\ e 692, in set self.eval(cmd) File "/Users/jpalmier/Desktop/Sage_stuff/git/sage/local/lib/python2.7/site-packages/sage/interfaces/singular.py", lin\ e 650, in eval raise SingularError('Singular error:\n%s'%s) TypeError: Singular error: ? `a` is undefined ? error occurred in or before STDIN line 12: `def sage15=a;` ********************************************************************** File "src/sage/interfaces/expect.py", line 710, in sage.interfaces.expect.Expect._eval_line_using_file Failed example: singular.eval('quit;') Expected: '' Got: Singular crashed -- automatically restarting. '' ********************************************************************** File "src/sage/interfaces/expect.py", line 721, in sage.interfaces.expect.Expect._eval_line_using_file Failed example: singular(3) Expected: Singular crashed -- automatically restarting. 3 Got: Singular crashed -- automatically restarting. Singular crashed -- automatically restarting. Singular crashed -- automatically restarting. <repr(<sage.interfaces.singular.SingularElement at 0x11ac13fa0>) failed: AttributeError: 'NoneType' object has no attri\ bute 'group'> ********************************************************************** File "src/sage/interfaces/expect.py", line 800, in sage.interfaces.expect.Expect._eval_line Failed example: singular._eval_line('def a=3;') Expected: '' Got: Singular crashed -- automatically restarting. '' **********************************************************************
This is on OS X 10.10, Sage built from scratch.
comment:106 Changed 7 years ago by
I think I am an idiot. I need to add a bit that's different for this branch than what we had before with pexpect-3.3
.
comment:107 Changed 7 years ago by
Commit: | fe1066154892173d17bed1892886dc8148fa33f5 → 6e463e51f47e8ff3035921b9213a4e580e1f70f1 |
---|
Branch pushed to git repo; I updated commit sha1. New commits:
6e463e5 | Set the new delayafterread parameter to restore pexpect behavior, and restore pexpect 3.3 and under behavior with regards to sighup.
|
comment:108 Changed 7 years ago by
The last commit is not sufficient to solve the observed problems but it should be included especially with regards to speed. Still have to find the post pexpect-4.0.1
change responsible for these.
comment:109 Changed 7 years ago by
I think it's better to move those changes to the SageSpawn
class instead (see src/sage/interfaces/sagespawn.pyx
).
comment:110 Changed 7 years ago by
Branch: | u/fbissey/pexpect3.3 → u/jdemeyer/pexpect3.3 |
---|
comment:111 Changed 7 years ago by
Commit: | 6e463e51f47e8ff3035921b9213a4e580e1f70f1 → de71ef81f8acac4c3dd2219d87de897c965fc11a |
---|
New commits:
de71ef8 | Set the new delayafterread parameter to restore pexpect behavior, and restore pexpect 3.3 and under behavior with regards to sighup.
|
comment:112 Changed 7 years ago by
Status: | needs_review → needs_work |
---|
comment:113 Changed 7 years ago by
I think there is a new problem in send()
. In certain cases, the following can happen:
Traceback (most recent call last): File "/usr/local/src/sage-config/local/lib/python2.7/site-packages/sage/interfaces/expect.py", line 870, in _eval_line E.sendline(line) File "/usr/local/src/sage-config/local/lib/python2.7/site-packages/pexpect/pty_spawn.py", line 529, in sendline n = self.send(s) File "/usr/local/src/sage-config/local/lib/python2.7/site-packages/pexpect/pty_spawn.py", line 520, in send return os.write(self.child_fd, b) OSError: [Errno 9] Bad file descriptor
comment:114 Changed 7 years ago by
Something is closing the file descriptor for the pty
, but I haven't figured out which thing is doing that.
comment:116 Changed 7 years ago by
The change causing that is not actually between 4.0.1 and the current patch but in between 3.3 and 4.0.1. Quite a lot of change in there but it sounds like overeager garbage collection.
comment:117 Changed 7 years ago by
Adding E._keep_alive()
in the _eval_line
code in sage/interfaces/expect.py
solves the problem for expect.py
but not for gap.py
. Presumably it is abusive and would have to be added in numerous place, and properly ripped afterwards, that doesn't scale.
comment:118 Changed 7 years ago by
Let me know when you're ready for me to make sagenb 0.11.6 with the one-character change needed...
comment:119 follow-up: 121 Changed 7 years ago by
I asked upstream for guidance, this is what they have to say so far:
When the PtyProcess
instance (self.ptyproc
on a Pexpect
spawn instance) is garbage collected, it will indeed close the file descriptor. But I don't see how that could have happened when you're still calling E.sendline()
- clearly there is still a reference to the object, so it shouldn't be garbage collected.
Could something else be closing an fd explicitly? I have run into odd effects before because the numbers representing fds are readily reused. If one fd gets closed while its corresponding 'handle' object (e.g. a PtyProcess
instance) still exists, when the handle object is garbage collected, it may close an unrelated file which has been opened in the meantime:
A = open('foo') # handle A with fd=5 os.close(5) B = open('bar') # handle A with fd=5 # handle B with fd=5 del A # handle A is garbage collected, closing fd 5 B.read() # Bad fd!
comment:121 Changed 7 years ago by
Replying to fbissey:
Could something else be closing an fd explicitly?
This is a likely explanation of what is happening. The "problem" is that pexpect
was split up in two independent parts (pexpect
and ptyprocess
) and the Sage wrapper SageSpawn
tries to override the wrong close()
method.
comment:122 Changed 7 years ago by
Authors: | François Bissey, Bill Page → François Bissey, Bill Page, Jeroen Demeyer |
---|---|
Report Upstream: | N/A → Not yet reported upstream; Will do shortly. |
We need a custom wrapper over PtyProcess
also.
comment:123 Changed 7 years ago by
Commit: | de71ef81f8acac4c3dd2219d87de897c965fc11a → 411c29346c9234aec8c920eae86c18f5209352a8 |
---|
Branch pushed to git repo; I updated commit sha1. New commits:
411c293 | Fix Sage wrapper around spawn/PtyProcess
|
comment:124 Changed 7 years ago by
Status: | needs_work → needs_review |
---|
comment:125 Changed 7 years ago by
Description: | modified (diff) |
---|---|
Report Upstream: | Not yet reported upstream; Will do shortly. → Reported upstream. No feedback yet. |
comment:126 Changed 7 years ago by
Description: | modified (diff) |
---|
comment:127 Changed 7 years ago by
The new pexpect
seems slightly slower at reading large strings:
pexpect 2.0:
sage: timeit('str(gp("2^2^22"))', repeat=10) 5 loops, best of 10: 380 ms per loop
pexpect 4.0.1:
sage: timeit('str(gp("2^2^22"))', repeat=10) 5 loops, best of 10: 402 ms per loop
comment:128 Changed 7 years ago by
The same now with %time
shows that the time increase is mainly in user
time, so it's really the pexpect
code which is causing the slowdown:
pexpect 2.0:
sage: %time _ = str(gp("2^2^22")) CPU times: user 184 ms, sys: 4 ms, total: 188 ms Wall time: 384 ms
pexpect 4.0.1:
sage: %time _ = str(gp("2^2^22")) CPU times: user 203 ms, sys: 7 ms, total: 210 ms Wall time: 405 ms
comment:129 Changed 7 years ago by
Status: | needs_review → needs_work |
---|---|
Work issues: | → slowness |
comment:130 follow-up: 155 Changed 7 years ago by
I just noticed that pexpect
uses a read buffer of only 2000 bytes by default.
WTF??? Does pexpect
upstream really want things to be slow?
comment:131 Changed 7 years ago by
Commit: | 411c29346c9234aec8c920eae86c18f5209352a8 → 10623b2318047234dc0f2f4fcaed47d741bffc9a |
---|
comment:132 Changed 7 years ago by
Description: | modified (diff) |
---|---|
Status: | needs_work → needs_review |
Looking better now:
pexpect 2.0 upstream:
sage: %time _ = str(gp("2^2^22")) CPU times: user 184 ms, sys: 4 ms, total: 188 ms Wall time: 384 ms
pexpect 4.0.1 upstream:
sage: %time _ = str(gp("2^2^22")) CPU times: user 308 ms, sys: 9 ms, total: 317 ms Wall time: 571 ms
pexpect 4.0.1 + Sage patches:
sage: %time _ = str(gp("2^2^22")) CPU times: user 26 ms, sys: 2 ms, total: 28 ms Wall time: 231 ms
comment:133 Changed 7 years ago by
It's especially interesting to look at user
time. We cannot really change how much time is spent in the system calls, but we can change the time spent in Python code.
comment:134 Changed 7 years ago by
Summary: | Upgrading pexpect → Upgrade and optimize pexpect |
---|
comment:135 Changed 7 years ago by
Commit: | 10623b2318047234dc0f2f4fcaed47d741bffc9a → e0ccd4547824107c1bb0ea132726f565eb70142e |
---|
Branch pushed to git repo; I updated commit sha1. New commits:
e0ccd45 | Remove support for maxread, there is no need to limit it!
|
comment:136 Changed 7 years ago by
Description: | modified (diff) |
---|
comment:137 Changed 7 years ago by
Description: | modified (diff) |
---|
comment:139 Changed 7 years ago by
jdemeyer asked for a branch for the sagenb piece of this. I didn't want to have two branches - I guess I should have based on this branch but too late now, and won't have time later today.
u/kcrisman/ticket/10295
-
build/pkgs/sagenb/checksums.ini
diff --git a/build/pkgs/sagenb/checksums.ini b/build/pkgs/sagenb/checksums.ini index f35bb1b..f8e929b 100644
a b 1 1 tarball=sagenb-VERSION.tar 2 sha1= 66593ecf18cbf77115279f6f1564854eb9b93df53 md5= facdc06b94fcf0a85b76aa861c6e646f4 cksum=271 98203782 sha1=7631e986aa01676cfac2522d5cc52b4131f0d399 3 md5=4a286cac8412fa3af9cdcc6b6234b9ce 4 cksum=2718880955 -
build/pkgs/sagenb/package-version.txt
diff --git a/build/pkgs/sagenb/package-version.txt b/build/pkgs/sagenb/package-v index 35ad344..af92809 100644
a b 1 0.11. 41 0.11.5.1
Package will be at http://www.math-cs.gordon.edu/~kcrisman/sagenb-0.11.5.1.tar
comment:140 Changed 7 years ago by
Description: | modified (diff) |
---|
comment:141 Changed 7 years ago by
Commit: | e0ccd4547824107c1bb0ea132726f565eb70142e → 74c8754804e135c4ceb8ab03d1c49ca2a8715d9e |
---|
Branch pushed to git repo; I updated commit sha1. New commits:
74c8754 | Upgrade to sagenb 0.11.5.1 for pexpect 4
|
comment:142 follow-up: 143 Changed 7 years ago by
In the pexpect
log, I see this:
byte-compiling /Users/palmieri/Desktop/Sage_stuff/git/sage/local/lib/python2.7/site-packages/pexpect/async.py to async.pyc File "/Users/palmieri/Desktop/Sage_stuff/git/sage/local/lib/python2.7/site-packages/pexpect/async.py", line 16 transport, pw = yield from asyncio.get_event_loop()\ ^ SyntaxError: invalid syntax
and as a result, there is a file async.py
but not async.pyc
. Something to worry about?
comment:143 Changed 7 years ago by
Replying to jhpalmieri:
In the
pexpect
log, I see this:byte-compiling /Users/palmieri/Desktop/Sage_stuff/git/sage/local/lib/python2.7/site-packages/pexpect/async.py to async.pyc File "/Users/palmieri/Desktop/Sage_stuff/git/sage/local/lib/python2.7/site-packages/pexpect/async.py", line 16 transport, pw = yield from asyncio.get_event_loop()\ ^ SyntaxError: invalid syntaxand as a result, there is a file
async.py
but notasync.pyc
. Something to worry about?
No, I opened an issue upstream that was quickly closed. And it mirrored at least another issue opened by another distro maintainer. This file is only used with python3. See https://github.com/pexpect/pexpect/issues/297
comment:144 follow-up: 148 Changed 7 years ago by
I greatly appreciate all of your hard work on this and regret not being able to keep up. The new timings are very impressive. I have one question: Do you have timings before and after for the full Sage test suite or at least those parts of it that exercise most of the functionality associated with those other parts of Sage besides gap that use external components via pexpect?
comment:145 Changed 7 years ago by
I gave it a try but documentation doesn't build
[notebook ] reading sources... [ 85%] sagenb/notebook/template [notebook ] Exception occurred: [notebook ] File "/Users/buildslave-sage/slave/sage_git/build/local/lib/python2.7/site-packages/Flask-0.10.1-py2.7.egg/flask/globals.py", line 34, in _find_app [notebook ] raise RuntimeError('working outside of application context') [notebook ] RuntimeError: working outside of application context [notebook ] The full traceback has been saved in /var/folders/hj/rs3k66f91zddr7hvc6yf_90m0000gp/T/sphinx-err-jc5YhT.log, if you want to report the issue to the developers. [notebook ] Please also report this if it was a user error, so that a better error message can be provided next time. [notebook ] A bug report can be filed in the tracker at <https://bitbucket.org/birkenfeld/sphinx/issues/>. Thanks! Error building the documentation. Traceback (most recent call last): File "/Users/buildslave-sage/slave/sage_git/build/src/doc/common/builder.py", line 1641, in <module> getattr(get_builder(name), type)() File "/Users/buildslave-sage/slave/sage_git/build/src/doc/common/builder.py", line 302, in _wrapper getattr(get_builder(document), 'inventory')(*args, **kwds) File "/Users/buildslave-sage/slave/sage_git/build/src/doc/common/builder.py", line 513, in _wrapper x.get(99999) File "/Users/buildslave-sage/slave/sage_git/build/local/lib/python/multiprocessing/pool.py", line 567, in get raise self._value OSError: [notebook ] Exception occurred: make[1]: *** [doc-html-mathjax] Error 1
comment:146 Changed 7 years ago by
Dependencies: | → #19671 |
---|---|
Work issues: | slowness |
comment:147 Changed 7 years ago by
Commit: | 74c8754804e135c4ceb8ab03d1c49ca2a8715d9e → f0523e41eb491c0589c230973229cfeb2123dccd |
---|
comment:148 follow-up: 149 Changed 7 years ago by
Replying to bpage:
Do you have timings before and after for the full Sage test suite or at least those parts of it that exercise most of the functionality associated with those other parts of Sage besides gap that use external components via pexpect?
See old-pexpect.log and new-pexpect.log for the output of
./sage -t --optional=sage src/sage/interfaces/*.py* src/sage/interfaces/*.py* src/sage/interfaces/*.py* src/sage/interfaces/*.py* src/sage/interfaces/*.py*
(test all files in src/sage/interfaces
5 times).
The total time of those tests is (I'm not claiming that anything is statistically significant):
old:
Total time for all tests: 378.2 seconds cpu time: 86.5 seconds cumulative wall time: 365.2 seconds
new:
Total time for all tests: 360.3 seconds cpu time: 93.6 seconds cumulative wall time: 347.1 seconds
comment:149 follow-up: 150 Changed 7 years ago by
Replying to jdemeyer:
Replying to bpage:
Do you have timings before and after for the full Sage test suite or at least those parts of it that exercise most of the functionality associated with those other parts of Sage besides gap that use external components via pexpect?
See old-pexpect.log and new-pexpect.log for the output of
./sage -t --optional=sage src/sage/interfaces/*.py* src/sage/interfaces/*.py* src/sage/interfaces/*.py* src/sage/interfaces/*.py* src/sage/interfaces/*.py*(test all files in
src/sage/interfaces
5 times).The total time of those tests is (I'm not claiming that anything is statistically significant):
old:
Total time for all tests: 378.2 seconds cpu time: 86.5 seconds cumulative wall time: 365.2 secondsnew:
Total time for all tests: 360.3 seconds cpu time: 93.6 seconds cumulative wall time: 347.1 seconds
So the CPU time increased by a significant amount, which is bad...
(Not sure how reliable the figures at all are.)
comment:150 Changed 7 years ago by
Replying to leif:
So the CPU time increased by a significant amount, which is bad...
(Not sure how reliable the figures at all are.)
I have no idea if it's really significant. I don't even know how to check this (all I know is that I should not ask the person who wrote the patchbot startup-time plugin)
comment:151 Changed 7 years ago by
P.S.: There is (or at least used to be) SAGE_TEST_ITER
and SAGE_TEST_GLOBAL_ITER
(w.r.t. the "five times").
comment:153 Changed 7 years ago by
I tried the doctests in sage/interface
again, now with 10 runs. Results are essentially the same:
old:
Total time for all tests: 722.6 seconds cpu time: 170.2 seconds cumulative wall time: 696.9 seconds
new:
Total time for all tests: 710.4 seconds cpu time: 185.6 seconds cumulative wall time: 685.0 seconds
Is it really a problem if the CPU goes up while the total time goes down?
comment:154 Changed 7 years ago by
Description: | modified (diff) |
---|
comment:155 follow-up: 156 Changed 7 years ago by
Replying to jdemeyer:
Is it really a problem if the CPU goes up while the total time goes down?
As always, it depends.
Do we burn more carbon that way?
More seriously, in case the potential cpu time is wasted otherwise, perhaps not (wall time then matters; but even on a battery-powered notebook, the answer might yet be different). But if it's not, i.e., if other processes suffer from getting less, it does of course. So it mainly depends on the user, or rather the usage of the machine Sage is running on (and hence potentially many users).
Replying late to jdemeyer:
I just noticed that
pexpect
uses a read buffer of only 2000 bytes by default.WTF??? Does
pexpect
upstream really want things to be slow?
Think of pexpect's purpose; not counting newlines, 2000 characters is 80 columns by 25 lines, so pretty reasonable even with ASCII art output (well, perhaps twice that, but not vastly more.).
In the same sense, benchmarking with gp("2^2^22")
seems hardly appropriate, IMHO.
Latency matters, more than throughput. (And in a multitasking environment, overall resource usage.)
comment:156 follow-up: 158 Changed 7 years ago by
Replying to leif:
In the same sense, benchmarking with
gp("2^2^22")
seems hardly appropriate, IMHO.
Large outputs do actually happen in Sage. It's not so crazy.
I agree that it's not the only thing which matters.
comment:157 follow-up: 159 Changed 7 years ago by
Is anybody willing to review this ticket? I won't waste my time investigating the CPU time issue if this ticket will stall anyway.
comment:158 Changed 7 years ago by
Replying to jdemeyer:
Replying to leif:
In the same sense, benchmarking with
gp("2^2^22")
seems hardly appropriate, IMHO.Large outputs do actually happen in Sage. It's not so crazy.
Yep, but certainly not the typical usage scenario of pexpect, so I wouldn't blame upstream for their default buffer size.
On the other hand, often the interaction of Sage with pexpect-driven programs looks more like a human one, where lots of variables get set with short commands, to finally get a (typically relatively short) answer. Mainly in toy examples communication overhead dominates computation time (unless latency in a longer dialogue is really bad, as we experienced with some Ubuntu kernels a while ago).
I'm not very worried about the increased CPU time (yet ;-) ); it's just that we shouldn't exclusively focus on bare throughput.
If the number and/or size of objects passed is huge, it's probably more worth thinking of other solutions.
comment:159 Changed 7 years ago by
Replying to jdemeyer:
Is anybody willing to review this ticket? I won't waste my time investigating the CPU time issue if this ticket will stall anyway.
I do not find the difference in reported times for the sage/interface tests at all disturbing. I seriously doubt that it would be noticeable to the average user. At the same time the new gap timings suggest that certain heavy users might be pleasantly surprized.
I plan to test this change some time in the next few days. One thing that I want out of this change is support for unicode in external packages. I expect to be able to give it a positive review.
comment:160 Changed 7 years ago by
Description: | modified (diff) |
---|
I added some latency timings too, which again show a significant improvement.
comment:161 Changed 7 years ago by
Commit: | f0523e41eb491c0589c230973229cfeb2123dccd → b98fbf950f8f056fa096f9fbe2ff52501de98a44 |
---|
Branch pushed to git repo; I updated commit sha1. New commits:
b98fbf9 | Simplify patch for pexpect PR #291
|
comment:162 follow-up: 165 Changed 7 years ago by
I see no reason to include all the changes on the superfluous-sleep
branch. I changed the patch to take just the changes proposed on the pull request.
comment:163 Changed 7 years ago by
Commit: | b98fbf950f8f056fa096f9fbe2ff52501de98a44 → 59d659155808600547bf72fd302ae977840d109a |
---|
Branch pushed to git repo; I updated commit sha1. New commits:
59d6591 | New version of https://github.com/pexpect/pexpect/pull/304
|
comment:164 Changed 7 years ago by
Commit: | 59d659155808600547bf72fd302ae977840d109a → c7947bd69e183c6590125051ace85529ba0be251 |
---|
Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:
c7947bd | New version of https://github.com/pexpect/pexpect/pull/304
|
comment:165 follow-up: 166 Changed 7 years ago by
Replying to jdemeyer:
I see no reason to include all the changes on the
superfluous-sleep
branch. I changed the patch to take just the changes proposed on the pull request.
Failed to parse that... So you rebased the changes from PR #291 onto 4.0.1 (since the PR is based on some 4.1.dev)?
New commits:
c7947bd | New version of https://github.com/pexpect/pexpect/pull/304
|
comment:166 Changed 7 years ago by
Replying to leif:
Failed to parse that... So you rebased the changes from PR #291 onto 4.0.1 (since the PR is based on some 4.1.dev)?
Yes. But it was a trivial rebase, essentially just a cherry-pick of one commit.
Now all 3 patches are independent.
comment:167 Changed 7 years ago by
Description: | modified (diff) |
---|
The new version of patch for https://github.com/pexpect/pexpect/pull/304 shows again a small improvement.
comment:169 Changed 7 years ago by
Dependencies: | #19671 → #19671, #19616 |
---|---|
Report Upstream: | Reported upstream. No feedback yet. → Reported upstream. Developers acknowledge bug. |
comment:170 Changed 7 years ago by
Description: | modified (diff) |
---|
comment:171 Changed 7 years ago by
Commit: | c7947bd69e183c6590125051ace85529ba0be251 → 0f4148d993d799b09143b778e37a376c237cf903 |
---|
Branch pushed to git repo; I updated commit sha1. New commits:
0f4148d | Use delaybeforesend=None to skip a sleep(0)
|
comment:172 Changed 7 years ago by
Thanks for all the work on this Jeroen, I have been rather absent for the last week at least but it looks like a great clean up as well as an upgrade.
comment:173 Changed 7 years ago by
Commit: | 0f4148d993d799b09143b778e37a376c237cf903 → 1e75fbd677507e9bb1e06879fe2f45580ab3b584 |
---|
Branch pushed to git repo; I updated commit sha1. New commits:
1e75fbd | Clean up header
|
comment:174 Changed 7 years ago by
Status: | needs_review → positive_review |
---|
I've patched sage.6.10.beta7 and built it on SMC. I ran the interface tests. I configured it to run from SMC worksheet and verified that Sage and my favorite external packages (FriCAS and Octave) work as expected. Unicode symbols work in FriCAS. Everything looks fine to me. I would give this ticket a positive review.
comment:176 Changed 7 years ago by
Milestone: | sage-6.10 → sage-6.11 |
---|
Since we are in the "sage-6.10.rc" phase, I don't think this should be merged in 6.10.
comment:177 follow-up: 178 Changed 7 years ago by
Now that this has positive review otherwise, I'll also be sure to include the relevant piece in #19616. François, if you want to make the same PR as before I can do that; I'm not sure how to revert reverting a changeset on Github, I just want to make sure your name is still attached to that change.
comment:178 follow-up: 180 Changed 7 years ago by
Replying to kcrisman:
Now that this has positive review otherwise, I'll also be sure to include the relevant piece in #19616. François, if you want to make the same PR as before I can do that; I'm not sure how to revert reverting a changeset on Github, I just want to make sure your name is still attached to that change.
On it.
comment:179 Changed 7 years ago by
I have made one additional pull request to pexpect
: https://github.com/pexpect/pexpect/pull/308/files
It is really quite trivial and it provided yet another easy optimization. If some reviewer agrees, I will add that patch to the branch here.
comment:180 Changed 7 years ago by
Replying to fbissey:
Replying to kcrisman:
Now that this has positive review otherwise, I'll also be sure to include the relevant piece in #19616. François, if you want to make the same PR as before I can do that; I'm not sure how to revert reverting a changeset on Github, I just want to make sure your name is still attached to that change.
On it.
This is now https://github.com/sagemath/sagenb/pull/356
comment:181 Changed 7 years ago by
Description: | modified (diff) |
---|
comment:182 Changed 7 years ago by
Commit: | 1e75fbd677507e9bb1e06879fe2f45580ab3b584 → 1255a5a4a07559fe444455aeff7a2d27388a0b87 |
---|---|
Status: | positive_review → needs_review |
comment:183 Changed 7 years ago by
Status: | needs_review → positive_review |
---|
Merged the latest version of #19616.
comment:184 Changed 7 years ago by
Commit: | 1255a5a4a07559fe444455aeff7a2d27388a0b87 → 9351ccb53d1d3c1d387c11e0c57443b53d1cfdd1 |
---|---|
Status: | positive_review → needs_review |
comment:185 Changed 7 years ago by
Dependencies: | #19671, #19616 → #19671 |
---|---|
Description: | modified (diff) |
comment:186 Changed 7 years ago by
Status: | needs_review → positive_review |
---|
comment:188 Changed 7 years ago by
Reviewers: | Jeroen Demeyer → Jeroen Demeyer, Bill Page |
---|
comment:189 Changed 7 years ago by
Branch: | u/jdemeyer/pexpect3.3 → 9351ccb53d1d3c1d387c11e0c57443b53d1cfdd1 |
---|---|
Resolution: | → fixed |
Status: | positive_review → closed |
Try version 2.4 even. This has been discussed to death several times on sage-devel. Version 2.1 and over are slower than what we currently have. Plus from experience in sage-on-gentoo where at some stage we experimented with pexpect 2.4, some plotting broke down in the notebook.