Ticket #6429 (closed enhancement: fixed)

Opened 9 months ago

Last modified 8 months ago

[with patch; positive review] sagedoc: make search_src and friends less OS dependent

Reported by: jhpalmieri Owned by: jhpalmieri
Priority: minor Milestone: sage-4.1
Component: misc Keywords:
Cc: ddrake, craigcitro Author(s): John Palmieri
Report Upstream: Reviewer(s): Dan Drake
Merged in: sage-4.1.rc0 Work issues:

Description (last modified by jhpalmieri) (diff)

As discussed in  this thread on sage-devel, the search_src, search_doc, and search_def functions use the unix 'find' command, and since there are different versions of the command which take incompatible arguments, there are problems with those functions. The attached patch reworks all of these to use pure Python rather than 'find'. It might be a little slower, but it should be more robust.

This patch also adds two new arguments to those functions. From the docstring:

    - ``path_re`` (optional, default '') - regular expression which
      the filename (including the path) must match.

    - ``module`` (optional, default 'sage') - the module in which to
      search.  The default is 'sage', the entire Sage library.

(Actually, module doesn't make sense for search_doc, so it's not available there.)

For example:

search_src("matrix", module="sage.calculus")

with tab completion available as you type in "sage.calculus", or to accomplish essentially the same thing:

search_src("matrix", path_re="calc")

This depends on the patch at #6418.

Attachments

sagedoc_6429.patch Download (20.4 KB) - added by jhpalmieri 9 months ago.
trac_6429_line_numbers.patch Download (6.9 KB) - added by ddrake 8 months ago.
trac_6429_ref.patch Download (0.8 KB) - added by jhpalmieri 8 months ago.

Change History

Changed 9 months ago by jhpalmieri

  • description modified (diff)

Changed 9 months ago by ddrake

This looks good, but I do have two notes:

  • you should add yourself to the AUTHORS block, if nothing else so that anyone reading that doesn't think we are still using "find" for the search_* functions.
  • I thought it might be nice if the output included the line numbers where the match was found. I have a suggestion for how to do that which I am testing right now. If it seems to work, I'll upload a patch to do that.

Changed 9 months ago by jhpalmieri

Changed 9 months ago by jhpalmieri

Okay, this new patch just adds my name to the AUTHORS block. Line numbers would be nice -- good luck!

Changed 9 months ago by ddrake

  • reviewer set to Dan Drake
  • summary changed from [with patch; needs review] sagedoc: make search_src and friends less OS dependent to [with patch; positive review] sagedoc: make search_src and friends less OS dependent

The trac server is not letting me upload patches, so you can get my line numbers patch from  http://sage.math.washington.edu/home/drake/trac_6429_line_numbers.patch.

For the search functions, it includes line numbers just like "grep -nH":

combinat/yang_baxter_graph.py:530:    def __init__(self, partition):

with the line number between colons. For search_doc, this admittedly doesn't really make sense, but it does no harm to include it. When in the notebook, no line numbers are produced, since users will view the results with a web browser, and "jump to this line in the source code" is not a typical web browser feature. Fortunately, the format_search_as_html function worked without alteration. I did write up a way to include a list of line numbers in the notebook output, but since it's not so useful I ditched it.

With the line numbers, you could now automatically look up functions and docstrings; for example, in some kind of external editor, if you wanted functions related to set partitions, the editor could run

sage -c 'print search_def("partition", "set", interact=False)'

in a background shell, and then offer a menu with the resulting files, and be able to jump right to the correct location.

Your patch gets a positive review, although I've only tested it on sage.math and bsd.math (so, 64-bit Ubuntu and OS X 10.5)...but now that you're doing everything in Python, it's hard to see how this could fail on different platforms. (Famous last words?) What do you think of the line numbers patch?

Changed 8 months ago by ddrake

Changed 8 months ago by ddrake

Okay, trac was restarted and now the patch is available here.

Changed 8 months ago by jhpalmieri

Changed 8 months ago by jhpalmieri

What do you think of the line numbers patch?

Looks good except that line numbers are indexed starting at 0, and I think the first line of a file should be line 1, not line 0. I've added a referee's patch to fix this.

Apply all three patches in this order: sagedoc_6429.patch, trac_6429_line_numbers.patch, trac_6429_ref.patch.

Changed 8 months ago by jhpalmieri

Some timings (on an Intel iMac running OS X 10.5):

Before the patch:

sage: time s = search_src('matrix', interact=False)
CPU times: user 1.54 s, sys: 2.46 s, total: 4.00 s
Wall time: 4.50 s

After the patch:

sage: time s = search_src('matrix', interact=False)
CPU times: user 2.15 s, sys: 0.16 s, total: 2.31 s
Wall time: 2.31 s

On sage.math, it's also faster after the patch, but not by quite the same margin: Wall time 2.03 s before the patch, 1.42 s after.

Changed 8 months ago by ddrake

I am working on some timings, but thanks for noting that the line numbers are off by one. Both vim and emacs start lines numbers from 1, so that seems like the proper thing to do.

Changed 8 months ago by ddrake

Regarding testing, on an older machine that I tested, I saw a speedup similar to what you saw on sage.math -- nothing dramatic, but definitely faster. Nice work!

(And to think, at #5806, I thought I should use find and grep because that would be faster than a pure Python solution...!)

Changed 8 months ago by rlm

  • status changed from new to closed
  • resolution set to fixed
  • merged set to sage-4.1.rc0
Note: See TracTickets for help on using tickets.