Opened 4 years ago

Closed 4 years ago

#17689 closed enhancement (fixed)

A tutorial on profiling in Sage

Reported by: ncohen Owned by:
Priority: major Milestone: sage-6.5
Component: documentation Keywords: sd66
Cc: kcrisman, jhpalmieri, vdelecroix, tmonteil, dcoudert, azi, jmantysalo Merged in:
Authors: Nathann Cohen Reviewers: Thierry Monteil
Report Upstream: N/A Work issues:
Branch: 5ad2972 (Commits) Commit: 5ad29725172eeb5562d46e722506112228f33fe9
Dependencies: #17616 Stopgaps:

Description

This branch adds a tutorial on profiling in Sage. It merely points users toward the right tools and their specific documentation.

Nathann

Change History (35)

comment:1 Changed 4 years ago by ncohen

  • Branch set to public/17689
  • Commit set to 3e69b2939c1cdc5697da38075cbb77154aab7056
  • Status changed from new to needs_review

New commits:

c07c428trac #17615: Move bordeaux_2008 into thematic_tutorials/explicit_methods_in_number_theory/ (and only that)
3c4d499trac #17615: Update the links
21dbfd8trac #17615: Merged with 6.5.beta6
e4f3206trac #17615: Broken doctest
89bd621trac #17616: Cleanup in the thematic tutorials
d121ef2trac #17616: Typo
6f74cb9#17616: link to #17617 in temporarily disabled doctests
e1f7d37trac #17616: Merged with updated #17615
d780d12trac #17616: Broken doctest
3e69b29trac #17689: A tutorial on profiling in Sage

comment:2 follow-up: Changed 4 years ago by vdelecroix

Hello,

Very cool to start that! Here are just potential improvements to discuss. This is not a complete review and I think it would be nice if several people actually read it.

  • For %prun would be cool to explain a little bit the columns. In particular why there is twice "percall".
  • In %prun what do you mean by 'critical functions'?
  • The following does not work for me
     sage: %prun -r random_prime(2**500)
     sage: stats_object = _
     sage: stats.total_calls
     2547
     Traceback (most recent call last):
     ...
     AttributeError: 'module' object has no attribute 'total_calls'
    
  • The runsnake program is curiously contained in the package runsnakerun on debian-like os. More generally, it would be cool to say on standard OS what are the packages to install. Right now it is only the case for perf.
  • For perf you need to launch the process before launching it??

Vincent

comment:3 Changed 4 years ago by git

  • Commit changed from 3e69b2939c1cdc5697da38075cbb77154aab7056 to 38ee29e4ca8fbc62831edb1ab8b6b6ae87b6905e

Branch pushed to git repo; I updated commit sha1. New commits:

38ee29etrac #17689: Review

comment:4 in reply to: ↑ 2 Changed 4 years ago by ncohen

Hello,

  • For %prun would be cool to explain a little bit the columns. In particular why there is twice "percall".

I added a link for that. I am trying to keep this document "short and efficient" and to give users an idea of where to look for what they need, without documenting other people's code.

  • In %prun what do you mean by 'critical functions'?

I meant 'performance-critical'. I rephrased it.

  • The following does not work for me

The second 'stats' should be 'stats_object'. Fixed.

  • The runsnake program is curiously contained in the package runsnakerun on debian-like os. More generally, it would be cool to say on standard OS what are the packages to install. Right now it is only the case for perf.

I added this information for 'perf' because 'perf' is a very common keyword in softwares, and that it is actually non-trivial to find out that it is part of linux-tools unless you more or less know what to look for.

To find out that 'runsnake' is included in the 'runsnakerun' debian package you need to type 'runsnake debian' in google (samek for ubuntu), and probably for most others. I fear that if I add one people will come from Fedora, archlinux and who knows what else to add totally trivial information.

~$ apt-cache search runsnake
runsnakerun - GUI utility for (Python) cProfile or Profile profiler dumps
  • For perf you need to launch the process before launching it??

No need. This thing is a wonder.

Nathann

comment:5 Changed 4 years ago by ncohen

  • Status changed from needs_review to needs_work

I just noticed a couple of mistakes in my last commit. I will update it today.

comment:6 Changed 4 years ago by git

  • Commit changed from 38ee29e4ca8fbc62831edb1ab8b6b6ae87b6905e to bdd2d969491d9247b929e2a0b93a63fef4f5b4a1

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

a45edf8trac #17689: A tutorial on profiling in Sage
bdd2d96trac #17689: Review

comment:7 Changed 4 years ago by ncohen

  • Status changed from needs_work to needs_review

Done.

comment:8 Changed 4 years ago by git

  • Commit changed from bdd2d969491d9247b929e2a0b93a63fef4f5b4a1 to 31eb4b5f2d741eb78b83c2ffa044dbf8b70364ef

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

31eb4b5trac #17689: Review

comment:9 follow-up: Changed 4 years ago by vdelecroix

Hello,

I like the functional way of using timeit

sage: timeit("for i in range(2,100): _ = is_prime(i)")
625 loops, best of 3: 964 µs per loop

Especially because it allows you to tune the parameters

sage: timeit("for i in range(2,100): _ = is_prime(i)", number=137, precision=3)
137 loops, best of 3: 1.05 ms per loop

And even better, I like sage_timeit that allows you to play with the results (which is useful if you want to benchmark several implementations)

sage: from sage.misc.sage_timeit import sage_timeit
sage: t = sage_timeit("for i in range(2,100): _ = is_prime(i)", globals_dict=globals())
sage: t.stats
(625, 3, 3, 924.3534088134766, '\xc2\xb5s')
sage: t.series
[0.0009369647979736328, 0.0009340656280517578, 0.0009243534088134766]

What do you think of having it mentioned?

Vincent

comment:10 in reply to: ↑ 9 Changed 4 years ago by ncohen

Hello,

I like the functional way of using timeit

And even better, I like sage_timeit that allows you to play with the results (which is useful if you want to benchmark several implementations)

What do you think of having it mentioned?

Would it be a problem for you if this documentation appeared in the docstring of sage_timeit (or its module)? Documenting the various arguments, their effects and uses is the job of the function's documentation.

We could then add a link from the tutorial to these places so that the interested user finds them.

I meant this page to be an overview of the different means to profile code in Sage, as it is difficult to find "timeit/prun/lprun/crun/perf" if you do not know the name already. With this page, people interested in profiling will find all the keywords they need, and links toward the specific documentations.

Nathann

comment:11 Changed 4 years ago by git

  • Commit changed from 31eb4b5f2d741eb78b83c2ffa044dbf8b70364ef to 7d017b16887b5e009793d01bdc8294d3b60cefe5

Branch pushed to git repo; I updated commit sha1. New commits:

7d017b1trac #17689: Thierry's suggestion

comment:12 Changed 4 years ago by ncohen

Turns out that we already had an interface with runsnake. I simplified an entry. Still in needs_review.

Nathann

comment:13 Changed 4 years ago by ncohen

Ping ?...

Nathann

P.S.: There is a stable release ahead, and the doc that will not make it before that will not appear online until the next one ^^;;;

comment:14 follow-up: Changed 4 years ago by dcoudert

This doc is very nice. Below are some comments / problems.

  • Note that while %time only runs the command once, timeit tries to return a more meaningful value over several runs. -> shouldn't you write %timeit instead ?
  • The runsnake command is not working on my mac, and the reported problem is not always the same :( Is this a known issue?
    sage: runsnake('random_prime(2**500)')
    sage:   File "/Users/dcoudert/.sage/temp/confetti.inria.fr/17114/tmp_WELnaQ", line 1
        {(i
          ^
    SyntaxError: invalid syntax
    
    sage: 
    sage: runsnake('random_prime(2**500)')
    sage:   File "/Users/dcoudert/.sage/temp/confetti.inria.fr/17114/tmp_UNaKxo", line 1
    SyntaxError: Non-ASCII character '\xc5' in file /Users/dcoudert/.sage/temp/confetti.inria.fr/17114/tmp_UNaKxo on line 2, but no encoding declared; see http://www.python.org/peps/pep-0263.html for details
    
    sage: runsnake('random_prime(2**500)')
    sage:   File "/Users/dcoudert/.sage/temp/confetti.inria.fr/17114/tmp_q4q1bM", line 1
    SyntaxError: Non-ASCII character '\xfb' in file /Users/dcoudert/.sage/temp/confetti.inria.fr/17114/tmp_q4q1bM on line 2, but no encoding declared; see http://www.python.org/peps/pep-0263.html for details
    
    sage: 
    
  • %crun: I have very different output. Is it normal ?
    sage: %crun p=random_prime(2**500)
    PROFILE: interrupts/evictions/bytes = 57/0/1840
    Using local file /Users/dcoudert/sage/local/bin/python.
    Using local file /Users/dcoudert/.sage/temp/confetti.inria.fr/38434/tmp_IbMOhg.perf.
    Total: 57 samples
           0   0.0%   0.0%       10  17.5% __mh_execute_header
           3   5.3%   5.3%        3   5.3% 0x00007fff8a751c13
           0   0.0%   5.3%        2   3.5% 0x000000010225ce6b
           0   0.0%   5.3%        2   3.5% 0x00000001031e6ac8
           0   0.0%   5.3%        2   3.5% 0x000000010399c0ee
           0   0.0%   5.3%        2   3.5% 0x00007fff5da2fd4f
           0   0.0%   5.3%        1   1.8% 0x00000001021e57f9
    
  • I'm not checking the linux-only section on my mac.

David.

comment:15 Changed 4 years ago by git

  • Commit changed from 7d017b16887b5e009793d01bdc8294d3b60cefe5 to 076ebef1f7344c274e8f43f8337bc46d2a717c16

Branch pushed to git repo; I updated commit sha1. New commits:

076ebeftrac #17689: Review

comment:16 in reply to: ↑ 14 Changed 4 years ago by ncohen

Yo,

  • Note that while %time only runs the command once, timeit tries to return a more meaningful value over several runs. -> shouldn't you write %timeit instead ?

Fixed.

  • The runsnake command is not working on my mac, and the reported problem is not always the same :( Is this a known issue?

I don't know. There are two tickets that I know about runsnake: #17735 and #14414.

  • %crun: I have very different output. Is it normal ?

No idea. It looks like Python may have been compiled with different flags, but I do not know enough to answer that. I expect that the function will still be useful when the most time-consuming entries are Sage functions and not those internal Python functions.

I prefer to not add a link toward a tutorial I cannot test myself.

Nathann

comment:17 follow-up: Changed 4 years ago by dcoudert

Unfortunately I'm unable to test all proposed methods, so I cannot help more.

comment:18 in reply to: ↑ 17 Changed 4 years ago by ncohen

Unfortunately I'm unable to test all proposed methods, so I cannot help more.

I understand, thanks for your help! But do give the 'perf' thing a try if you ever get a linux machine around, it may convince you to give mac up for your next laptop ;-)

Nathann

comment:19 Changed 4 years ago by ncohen

Guys ?... The next stable release is pretty close. Would anybody have some time to review this patch so that it appears in the updated online doc ?

Nathann

comment:20 Changed 4 years ago by ncohen

Guys ?... It's only text...

comment:21 follow-up: Changed 4 years ago by tmonteil

  • Status changed from needs_review to needs_work

I am a bit overloaded right now to help more on this interesting and important ticket, but as such, there is a big confusion between ipython %timeit and Sage timeit, so %timeit has nothing to do with timeit <sage.misc.sage_timeit_class>. A problem with %timeit is that it is ipython only, so it will not work in Sage notebook (that does not rely on ipython). So you need at least to point to the right documentations for the two different timeit, see timeit? vs %timeit?.

For example, with the Sage timeit() command, you can not do:

sage: timeit(factor(1234567890))
AttributeError: 'IntegerFactorization' object has no attribute 'lstrip'

You have to do

sage: timeit("factor(1234567890)")
625 loops, best of 3: 72.8 µs per loop

But then, with ipython:

sage: %timeit("factor(1234567890)")
10000000 loops, best of 3: 26.3 ns per loop

Which is much faster... Indeed, it measures the time for creating the string ! What you have to do instead is:

sage: %timeit(factor(1234567890))
10000 loops, best of 3: 59.2 µs per loop

or

sage: %timeit factor(1234567890)
10000 loops, best of 3: 59.5 µs per loop

So there is a confusion that can lead to a silent misuse. It think this distinction should be made clear.

comment:22 Changed 4 years ago by git

  • Commit changed from 076ebef1f7344c274e8f43f8337bc46d2a717c16 to 03c122938a9784b6c5ed1054035d8812d7d23241

Branch pushed to git repo; I updated commit sha1. New commits:

2f979b7trac #17689: Merged with 6.5.rc1
03c1229trac #17689: Remove a reference to Sage's timeit function

comment:23 in reply to: ↑ 21 Changed 4 years ago by ncohen

  • Status changed from needs_work to needs_review

I am a bit overloaded right now to help more on this interesting and important ticket, but as such, there is a big confusion between ipython %timeit and Sage timeit, so %timeit has nothing to do with timeit <sage.misc.sage_timeit_class>.

As I did not even use timeit in the examples I chose to remove this link toward Sage's timeit function. We still link toward Python's timeit function.

All examples in this document use % functions, and the last one requires a terminal, so notebook users are already a bit out of scope.

Nathann

comment:24 Changed 4 years ago by git

  • Commit changed from 03c122938a9784b6c5ed1054035d8812d7d23241 to 66ef100d9e77c5b09289cb50f72740891026ce09

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

dee89cctrac #17689: A tutorial on profiling in Sage
2d02115trac #17689: Review
f8b2e3ctrac #17689: Thierry's suggestion
d81bd00trac #17689: Review
66ef100trac #17689: Remove a reference to Sage's timeit function

comment:25 Changed 4 years ago by ncohen

Rebased on the latest beta.

comment:26 Changed 4 years ago by ncohen

Guys ?...

comment:27 Changed 4 years ago by git

  • Commit changed from 66ef100d9e77c5b09289cb50f72740891026ce09 to 8cb4d61d524b529b387a2389b90e62341f6cd284

Branch pushed to git repo; I updated commit sha1. New commits:

cf34cbe#17689 : fix confusion between IPython magic and Python command.
ececb70#17689 : Sage-notebook timeit function.
8cb4d61#17689 : typo.

comment:28 Changed 4 years ago by git

  • Commit changed from 8cb4d61d524b529b387a2389b90e62341f6cd284 to f8e5096e7531e36667fe653a993b1e3e568954d4

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

e471875trac #17689: A tutorial on profiling in Sage
03f9a0dtrac #17689: Review
a4159fftrac #17689: Thierry's suggestion
f3a2324trac #17689: Review
43a61cbtrac #17689: Remove a reference to Sage's timeit function
32705f6#17689 : fix confusion between IPython magic and Python command.
37cf898#17689 : Sage-notebook timeit function.
f8e5096#17689 : typo.

comment:29 Changed 4 years ago by git

  • Commit changed from f8e5096e7531e36667fe653a993b1e3e568954d4 to b027e16d80fcdbf55cb09ac8a86c2c20a2b2c4f5

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

b027e16#17689 : typo.

comment:30 follow-up: Changed 4 years ago by tmonteil

I added some info about the timeit Sage function since the %timeit IPython magic is not available from the Sage notebook, and clarify the fact that % IPython magics are not Python commands. Is-it OK ?

comment:31 in reply to: ↑ 30 Changed 4 years ago by ncohen

I added some info about the timeit Sage function since the %timeit IPython magic is not available from the Sage notebook, and clarify the fact that % IPython magics are not Python commands. Is-it OK ?

Yeahyeah it is fine (and concise, thanks ! :-P). I shortened a couple of sentences and replaced (as you asked me to) the 500 with a 300 in the first examples, so that the timeit call runs more than 1 loop.

Is that okay for you?

Nathann

comment:32 Changed 4 years ago by git

  • Commit changed from b027e16d80fcdbf55cb09ac8a86c2c20a2b2c4f5 to 5ad29725172eeb5562d46e722506112228f33fe9

Branch pushed to git repo; I updated commit sha1. New commits:

5ad2972trac #17689: Last touch

comment:33 Changed 4 years ago by tmonteil

  • Keywords sd66 added
  • Reviewers set to Thierry Monteil
  • Status changed from needs_review to positive_review

Good to go !

comment:34 Changed 4 years ago by ncohen

Wouhouuuuuuuu ! THaaaaaaaaaaanks ! ;-)

Nathann

comment:35 Changed 4 years ago by vbraun

  • Branch changed from public/17689 to 5ad29725172eeb5562d46e722506112228f33fe9
  • Resolution set to fixed
  • Status changed from positive_review to closed
Note: See TracTickets for help on using tickets.