Opened 6 years ago

Closed 3 years ago

Last modified 3 years ago

#10358 closed defect (fixed)

New interface with OEIS

Reported by: was Owned by: tmonteil
Priority: critical Milestone: sage-5.13
Component: combinatorics Keywords: Cernay2012 days49
Cc: sage-combinat, was, kini, jpflori, chrisjamesberg, VivianePons Merged in: sage-5.13.beta2
Authors: Thierry Monteil Reviewers: Nathann Cohen
Report Upstream: N/A Work issues:
Branch: Commit:
Dependencies: #15245 Stopgaps:

Description (last modified by tmonteil)

The sloane_find command, which parses output of a webpage, is broken for years due to the Sloane sequence webpage being rewritten. It silently always finds nothing.

sage: sloane_find([1,1,2,3,5,8,13,21], nresults=1)
[]

The aim of this ticket is to propose a new interface with OEIS http://oeis.org/

Apply:

Attachments (6)

trac_10358-oeis-tm.patch (86.7 KB) - added by tmonteil 4 years ago.
trac_10358-oeis-review_1-tm.patch (23.9 KB) - added by tmonteil 4 years ago.
First review patch
trac_10358-oeis-review_2-tm.patch (4.6 KB) - added by tmonteil 4 years ago.
Second review patch
trac_10358-rev.patch (6.0 KB) - added by ncohen 4 years ago.
trac_10358-oeis-review_3-nc-tm.patch (17.8 KB) - added by tmonteil 4 years ago.
Tested on 5.12
trac_10358-oeis-tm_rebase.patch (86.6 KB) - added by tmonteil 4 years ago.
rebased after #15245

Download all attachments as: .zip

Change History (63)

comment:1 Changed 6 years ago by was

  • Type changed from PLEASE CHANGE to defect

comment:2 Changed 6 years ago by was

  • Resolution set to duplicate
  • Status changed from new to closed

Crap, this is a dup of #10056

comment:3 Changed 6 years ago by tmonteil

  • Cc sage-combinat was added
  • Owner changed from sage-combinat to tmonteil

Hi,

the aim of #10056 was to migrate every url in sage that was dealing with Sloane's database at once (not only sloane_find()), hence this is not really a duplicate. The issue here is that oeis changed their html templates, so the sage parsers are not working anymore. I just coded a new parser, which you can see working on the combinat repository (sloane_new_website_parsers-tm.patch).

I added the more atomic functions in sage/databases/sloane.py (the documentation still needs to be fixed):

  • sloane_discover() which discovers which sequences correspond to some list of integers.
  • sloane_first_terms() which returns the first terms of the sequence.
  • sloane_description() which returns the one line description of the sequence.

I don't really know what to do with the broken functions:

  • parse_sequence() is completely dead and now useless. Should it now raise a deprecation warning ?
  • sloane_sequence() and sloane_find() are broken, but maybe some users still need them, should i make them working again as a combination of the first 3 functions ?

Also, maybe all those functions should be renamed by replacing sloane by oeis ? This may be too early since i guess most people know more about Sloane than about oeis. In such a case, what is the procedure/convention to ensure backward compatibility ?

At the end, we could have a daily crontab to check when the oeis changes its html template, and adapt the parsers accordingly. Better, we could ask them to provide an online standard query system, but if i understand a discussion in the nmbrthry mailing-list, this may not be their whish.

Ciao, Thierry

comment:4 Changed 6 years ago by tmonteil

  • Status changed from closed to needs_work

comment:5 Changed 6 years ago by tmonteil

  • Milestone changed from sage-4.6.1 to sage-4.6.2

comment:6 Changed 6 years ago by nthiery

Salut Thierry,

Rather than having a bunch of functions to return the various pieces of a Sloane result, there could be a single object from which one could query the pieces via methods. Something like:

    sage: s = sloane_find([1,1,2,5,8,14])[0]
    sage: s.description()
    ...
    sage: s[6]
    42

Actually, Anders Claesson [1] had implemented a prototype of this feature during FPSAC'09, and his demo was quite cool. I guess he would agree to give his code away for someone else to finalize it and merge it into Sage.

But maybe that should be the topic of a separate ticket.

[1] http://combinatorics.is/anders/

comment:7 Changed 6 years ago by leif

  • Resolution changed from duplicate to fixed
  • Status changed from needs_work to closed

comment:8 Changed 6 years ago by leif

  • Resolution fixed deleted
  • Status changed from closed to new

comment:9 Changed 6 years ago by kini

  • Cc kini added

comment:10 Changed 6 years ago by tmonteil

By the way, if you need to play with OEIS now, here is the current version of the patch in preparation, on sage-combinat: http://combinat.sagemath.org/patches/file/0506836c7c66/sloane_new_website_parsers-tm.patch

The methods still parse the new website correctly, but i didn't put everything in a single class yet as Nicolas suggested.

comment:11 Changed 6 years ago by kini

Oh, no particular hurry. I just happened to notice OEIS was broken, and was surprised to find that this ticket was closed, so I asked leif to reopen it. Glad to hear there is a patch in the works. If possible could you post it here once in a while? As the ticket hadn't been changed for 9 months I thought nobody was working on it. Thanks!

comment:12 Changed 5 years ago by tmonteil

  • Keywords Cernay2012 added

comment:13 follow-up: Changed 5 years ago by slabbe

Thierry, can you upload your patch here? Was it ready?

Sébastien

comment:14 Changed 4 years ago by kcrisman

See also #13884, though it should be dealt with here.

And FYI, we would want to make sure that this also works:

sage -t --only_optional=internet "devel/sage/sage/combinat/words/paths.py"
    sage: sloane_find(_, nresults=1) #optional - internet
Expected:
    Searching Sloane's online database...
    [[1653,
      'Numbers n such that 2*n^2 - 1 is a square.',
      [1,
       5,
       29,
       169,
       985,
       5741,
       33461,
       195025,
       1136689,
       6625109,
       38613965,
       225058681,
       1311738121,
       7645370045,
       44560482149,
       259717522849,
       1513744654945,
       8822750406821,
       51422757785981,
       299713796309065,
       1746860020068409]]]
Got:
    Searching Sloane's online database...
    []
**********************************************************************

Which of course it should once this is done, but wanted to point it out since it's in a different file.

Last edited 4 years ago by kcrisman (previous) (diff)

comment:15 in reply to: ↑ 13 Changed 4 years ago by ncohen

Thierry, can you upload your patch here? Was it ready?

6 months old question left unanswered.. Time for a ping :-P

Nathann

comment:16 Changed 4 years ago by kini

http://combinat.sagemath.org/patches/log/tip/sloane_new_website_parsers-tm.patch

Nothing happened for two years, and the patch is still sitting in the combinat queue without being posted on trac.

Last edited 4 years ago by kini (previous) (diff)

comment:17 Changed 4 years ago by ncohen

"The Sage-Combinat experience" :-P

So, should we set this ticket to need_work, or needs_info, or wait_for_the_combinat_queue_to_be_reviewed ? :-P

Nathann

comment:18 Changed 4 years ago by tmonteil

Hi,

sorry for the delay, the version on the combinat queue is not the good one, during the Cernay combinat days in last february (less than one year!), i build a new class named oeis, that allows to get all kind of information from the oeis db (cross references, web links, ...).

I upload a temporary version here so that you can try it, but the documentation is not finished, and non backward-compatible changes may still appear.

Thierry

comment:19 Changed 4 years ago by jpflori

  • Cc jpflori added

comment:20 Changed 4 years ago by tmonteil

The patch needs to be rebased after #13701, some doctests, and make urls clickable from the notebook.

I will finish this during sage days 49.

comment:21 Changed 4 years ago by chrisjamesberg

  • Cc chrisjamesberg added

Changed 4 years ago by tmonteil

comment:22 Changed 4 years ago by tmonteil

  • Authors set to Thierry Monteil
  • Description modified (diff)
  • Keywords days49 added
  • Status changed from new to needs_review
  • Summary changed from The sloane_find command is now completely broken to New interface with OEIS

There is a room for lots of improvements, but i submit it now under the social pressure ;)

Most tests use an internet connection, so do not forget --optional=sage,internet when testing.

The name of the method .natural_object() might be improved (.sage() is not better since an OEIS sequence is already a Sage object).

comment:23 Changed 4 years ago by kcrisman

Just a brief question - why the oeis() rather than sloane_find()? Are you just completely replacing the sloane.py? I just don't quite get why this isn't just all put in sloane.py, and perhaps the new code replacing the broken code... Not that I have a vested interest in this.

I do recommend that all references to is broken for more than a year be changed to is broken or perhaps has been broken for some time.

Also, it would be nice to have references like

class:`sage.databases.oeis.OEIS`_

or whatever the correct format is.

comment:24 Changed 4 years ago by tmonteil

Here is how i understand the mess. There are three different things related to OEIS in Sage.

  • sage/databases/sloane.py (still) contains a class SloaneEncyclopedia that corresponds to a local partial copy of the OEIS. It is really a database. This file used to contain a function sloane_find() as well that, given a sequence, looked on the web and answered the name, the title, and the first terms of the OEIS sequences containing it.
  • sage/databases/oeis.py (which is created by this patch) contains two classes,
    • OEIS, that represents the on-line database, so that you can ask it questions like
      • give me the sequence whose ID is 'A000040'
      • give me 5 sequences whose description deal with "decimals".
      • give me 3 sequences that contain [2,3,5,7] as a subsequence.
      • (todo) : give me the explanation of the keyword 'cofr'
      • ...
    • OEISSequence that represents a sequence, so that you can ask it questions like
      • what is your name ?
      • what are your first terms ?
      • are you finite ?
      • do you have some references, links,... ?
      • what is your starting index ?
      • which natural Sage object do you represent (continued fraction, decimal number, sequence, power series,...) ?
      • ...
  • sage/combinat/sloane_functions.py contains a generic class SloaneSequence, and a class (that inherits from the generic one) for each implementation of a given OEIS sequence. Such a class allows to dynamically give more terms than the one stored in OEIS. This functionality could perhaps be redesigned in something lighter (like a dictionary of functions), to ease adding new entries).

Of course OEIS and sloane_find() are related, which is why i posted the patch on this ticket. But, the answers are not of the same type (direct informations versus an OEIS sequences that you can ask for informations), hence if i name the class OEIS as sloane_find, old code using sloane_find() won't work either, hence the deprecation warning instead. Also it would have been harder to extend the function sloane_find(), where it will be easy for the classes OEIS/OEISSequence.

Actually, i do not know whether oeis.py should stay in the sage/databases/ directory, since is it not an interface to a local database but a tool to fetch an online website, like finance/stock.py and interfaces/magma_free.py

Perhaps should i move it to sage/combinat/ directory (where sloane_functions.py is), or maybe we should move all three tools to a sage/combinat/oeis/ directory (but then sloane.py will not be in sage/databases/ anymore) ? The current choice is the laziest.

Of course those 3 tools related to OEIS should be interfaced together (e.g. if there are too many online queries, suggest the user to install the offline database ; if there is a Sage implementation of some OEIS sequence, then use it when iterating over it once the first terms are exhausted ; and so on). This is planned but, will take some more tests, and perhaps having 3 different tools for 3 different purposes is a good start. Having the web, the DB and the functions interfaced will also allow to be more lazy (e.g an OEIS sequence could be created with only its ID, without fetching the web as long as there are local answers, this could be easily done in the OEIS/OEISSequence framework).

I will change the deprecation messages, and description of broken functions as you suggested.

Concerning the reference class:sage.databases.oeis.OEIS_, i already put one in the general description of sloane.py (as a SEEALSO::). Should i also put one in the deprecation warnings as well ?

comment:25 Changed 4 years ago by ncohen

Helloooooooo !!

I noticed small things

  • find_by_description does not use first_result
  • ``absolute_value`` - (bool, default False) expliquer le bordel.
  • Shouldn't OEISSequence.fields be renamed to _fields ? It's weird to have this constant among so many functions in the tab-completion.
  • In your deprecation warnings could you replace "oeis" by "oeis()" ? To emphasize that "oeis" is a function that should be used instead of "sloane_find" ?
  • I am not a big fan of the design of first_terms which builds the list of all terms, and that is subsequently called by __call__ and __getitem__ that ignore most values and return only one. But of course efficiency is not really a problem here :-P
  • The documentation of the OEIS module begins a tad harshly. A sentence saying "The OEIS is an online database for integer sequences, you can query it through Sage in order to guess how an integer sequence contiues and what it may mean" would be nice before all Sage examples are given :-)

This being said I like your new code, and I think that it fits well in database/...

Nathann

comment:26 Changed 4 years ago by ncohen

  • Status changed from needs_review to needs_info

comment:27 Changed 4 years ago by VivianePons

  • Cc VivianePons added

comment:28 Changed 4 years ago by tmonteil

Here is the first review patch.

Suggested by Karl-Dieter:

  • I do recommend that all references to is broken for more than a year be changed to is broken or perhaps has been broken for some time.
    • Done.
  • Also, it would be nice to have references like class:sage.databases.oeis.OEIS_ or whatever the correct format is.
    • Done
    • I added :mod:OEIS <sage.databases.oeis>
    • I also added similar cross references between the three Sloane-related modules.

Suggested by Nathann:

  • find_by_description does not use first_result
    • Done.
    • I think it disapeared when migrating from my first patch and the new framework.
  • absolute_value - (bool, default False) expliquer le bordel.
    • Done.
    • Why did my en_GB spell checker did not alert me about that ? ;)
  • Shouldn't OEISSequence.fields be renamed to _fields ? It's weird to have this constant among so many functions in the tab-completion.
    • Done.
    • I like to see it exists, but i agree this may be confusing, and perhaps someone that need this will find it even if it is called _fields.
  • In your deprecation warnings could you replace "oeis" by "oeis()" ? To emphasize that "oeis" is a function that should be used instead of "sloane_find" ?
    • Done.
  • I am not a big fan of the design of first_terms which builds the list of all terms, and that is subsequently called by call and getitem that ignore most values and return only one. But of course efficiency is not really a problem here :-P
    • Done.
    • Actually, the result we get from the website is a string from which we have to extract the first terms, and if i want the 10th term, id no not see how to find it without parsing the whole string (or at least a big part of it since i do not know the length of the 9 previous integers of the sequence). There used to be a @cached_method here (and i forgot to remove the associated import_statement), so i put it back, hence many calls to call or getitem will cost string parsing only the first time.
    • Also, i think it is good to have such a separate method to say that those first terms come from the online database. This might be useful when interfacing with sloane_functions, and then have a differnt method like further_terms or so, to distinguish them (and allow cross testing).
  • The documentation of the OEIS module begins a tad harshly. A sentence saying "The OEIS is an online database for integer sequences, you can query it through Sage in order to guess how an integer sequence contiues and what it may mean" would be nice before all Sage examples are given :-)
    • Done.

Moreover, i took the opportunity to fix a few things:

  • I added empty default values to the variables of sloane_find() and sloane_sequence() so that calling them without argument will lead to the deprecation warning instead of an error.
  • Thanks to the patchbot, i noticed that Sage startuptime was slower with oeis, this is due to the import of modules that are not imported elsewhere in Sage: HTMLParser and webbrowser. Therefore, i now import them in the right methods only, and make oeis lazy_imported as well.
  • Thanks to the patchbot, i put doctest coverage of oeis.py to 100% (i forgot to doctest .__init__() and untestable functions like .browse())
  • I removed an old unused option in FancyTuple, and fixed some typos.
  • I added a .browse() method to OEIS class.

Changed 4 years ago by tmonteil

First review patch

comment:29 Changed 4 years ago by tmonteil

  • Description modified (diff)
  • Status changed from needs_info to needs_review

comment:30 Changed 4 years ago by ncohen

I am playing with oeis again and thought of two others things..

  • How come that there is no ".sequence" entry when you have a oeis sequence ? Sounds natural, doesn't it ? Wouldn't it be muuuuch more natural than "first_n_terms" ? O_o
  • How come that there is no way to print all the information that OEIS knows on the sequence ? Something like seq.show_all() or seq.show ? Right now everything is split in manymany fields, which is nice if you want to write some code but if you just want to know everything there is to know about this sequence, well.. It's quite uneasy to use, isn't it ?

Oh, and perhaps you should use viewer.browser instead of webbroser ?

http://www.sagemath.org/doc/reference/misc/sage/misc/viewer.html

Nathann

comment:31 Changed 4 years ago by ncohen

  • Status changed from needs_review to positive_review

Hmmmmmm... Looks like your way of picking the browser makes more sense than mine. It's bit weird though that viewer.browser does not use webbrowser at all O_o

Nathann

comment:32 Changed 4 years ago by ncohen

  • Status changed from positive_review to needs_work

comment:33 Changed 4 years ago by ncohen

  • Status changed from needs_work to needs_review

comment:34 Changed 4 years ago by tmonteil

In this second review ticket, i added a basic .show() method as Nathann suggested.

I took the opportunity to update a doctest (an OEIS entry was updated), to let .cross_references() return tuples instead of lists (like the other methods), and to have a better alignment in the representation of FancyTuple.

Concerning the name .sequence() vs .first_terms() i prefer to stick to .first_terms() because we only got the first few terms, not the whole sequence. Perhaps .sequence() could be defined while interfacing oeis with sloane_sequence.

Changed 4 years ago by tmonteil

Second review patch

comment:35 Changed 4 years ago by tmonteil

  • Description modified (diff)

comment:36 Changed 4 years ago by ncohen

  • Reviewers set to Nathann Cohen

Helloooooooooo Thiery !

I read your code again, and made a couple of modifications. Could you please give it a look ? If you agree with them you can set this ticket to positive_review :-)

  • Added __call__ to the Sphinx doc
  • Copied the doc of __call__ to the doc of OEIS because it's currently impossible to know what oeis() takes as parameters (that's what you get from using fancy __call__ instead of a normal .find() function...)
  • Added a section for classes and methods in the module's doc, to separate them from the module's doc in the html page

Nathann

Last edited 4 years ago by ncohen (previous) (diff)

Changed 4 years ago by ncohen

comment:37 Changed 4 years ago by ncohen

  • Description modified (diff)

comment:38 Changed 4 years ago by jdemeyer

  • Milestone changed from sage-5.11 to sage-5.12

comment:39 Changed 4 years ago by ncohen

Ping ?...

comment:40 Changed 4 years ago by tmonteil

  • Description modified (diff)

Hi,

i built a new patch above yours. Instead of copying the doc of __call__ to the doc of OEIS, i moved it to avoid code replication.

Also, the html() function prints things but returns nothing, so i updated the doc accordingly.

I also had to update some doctests since some OEIS entries get updated upstream.

I made the patch with sage 5.10, so let us see if the patchbot can apply and test them on 5.12 (i am going compile it, but it may take some time).

comment:41 Changed 4 years ago by chapoton

The last patch does not apply, see the patchbot report. But maybe it is not supposed to apply it on top of Nathann's patch trac_10358-rev.patch​ ?

If so, you must tell the patchbot (in a comment) what patches to apply.

comment:42 Changed 4 years ago by tmonteil

Indeed, the patch applies instead of Nathann's one, not on top of it. I specified it in the ticket description. Did i use a wrong syntax ?

comment:43 Changed 4 years ago by chapoton

Hello. The patchbots do not look at the description, but only look here (in the comments). Let me do it:

apply trac_10358-oeis-tm.patch​ trac_10358-oeis-review_1-tm.patch​ trac_10358-oeis-review_2-tm.patch trac_10358-oeis-review_3-nc-tm.patch​

I do not remember if commas are necessary. Let us see.

comment:44 Changed 4 years ago by chapoton

apply trac_10358-oeis-tm.patch​,trac_10358-oeis-review_1-tm.patch​,trac_10358-oeis-review_2-tm.patch,trac_10358-oeis-review_3-nc-tm.patch​

comment:45 Changed 4 years ago by chapoton

This does not seem to work..

for patchbot: apply trac_10358-oeis-tm.patch​ trac_10358-oeis-review_1-tm.patch​ trac_10358-oeis-review_2-tm.patch trac_10358-oeis-review_3-nc-tm.patch​

comment:46 Changed 4 years ago by ncohen

"identify a sequence from of its first elements." ? :-)

Nathann

comment:47 Changed 4 years ago by ncohen

Oh. Looks like the mistake was actually in my patch. Well, I guess you can fix it in yours as it replaces mine and set this patch to positive_review :-)

Nathann

comment:48 Changed 4 years ago by chapoton

Patchbots are sometimes dumb. Let us try again to teach them !

apply trac_10358-oeis-tm.patch​,trac_10358-oeis-review_1-tm.patch,trac_10358-oeis-review_2-tm.patch,trac_10358-oeis-review_3-nc-tm.patch

comment:49 Changed 4 years ago by chapoton

Almost good ... Let me try again

apply trac_10358-oeis-tm.patch​,trac_10358-oeis-review_1-tm.patch,trac_10358-oeis-review_2-tm.patch,trac_10358-oeis-review_3-nc-tm.patch

comment:50 Changed 4 years ago by chapoton

pfff. Last try

apply only trac_10358-oeis-tm.patch​,trac_10358-oeis-review_1-tm.patch,trac_10358-oeis-review_2-tm.patch,trac_10358-oeis-review_3-nc-tm.patch

comment:51 Changed 4 years ago by chapoton

ok, it was failing because the patchbots do not like invisible characters (and neither do I). This should work:

apply trac_10358-oeis-tm.patch trac_10358-oeis-review_1-tm.patch trac_10358-oeis-review_2-tm.patch trac_10358-oeis-review_3-nc-tm.patch

Changed 4 years ago by tmonteil

Tested on 5.12

comment:52 Changed 4 years ago by tmonteil

  • Status changed from needs_review to positive_review

comment:53 Changed 4 years ago by jdemeyer

  • Dependencies set to #15245
  • Status changed from positive_review to needs_work

This needs to be rebased to #15245.

Changed 4 years ago by tmonteil

rebased after #15245

comment:54 Changed 4 years ago by tmonteil

  • Description modified (diff)
  • Status changed from needs_work to needs_review

The rebase is a shift in sage/matrix/matrix2.pyx

apply trac_10358-oeis-tm_rebase.patch trac_10358-oeis-review_1-tm.patch trac_10358-oeis-review_2-tm.patch trac_10358-oeis-review_3-nc-tm.patch

comment:55 Changed 4 years ago by jdemeyer

  • Status changed from needs_review to positive_review

The patches apply now.

comment:56 Changed 3 years ago by jdemeyer

  • Merged in set to sage-5.13.beta2
  • Resolution set to fixed
  • Status changed from positive_review to closed

comment:57 Changed 3 years ago by nthiery

Joy, joy, joy is all around us!

Hail to Thierry, Hail to Nathann!

Note: See TracTickets for help on using tickets.