#7692 closed enhancement (fixed)
update the sloane OEIS database to the latest version; it is a little out of date
Reported by: | was | Owned by: | tbd |
---|---|---|---|
Priority: | minor | Milestone: | sage-4.3.1 |
Component: | packages: optional | Keywords: | |
Cc: | | Merged in: | sage-4.3.1.alpha0 |
Authors: | Steven Sivek | Reviewers: | Jaap Spies |
Report Upstream: | N/A | Work issues: | |
Branch: | | Commit: | |
Dependencies: | | Stopgaps: | |
Description
The Sloane database hasn't been updated since 2005, so update it.
http://sagemath.org/packages/optional/database_sloane_oeis-2005-12.spkg
Attachments (3)
Change History (16)
comment:1 Changed 13 years ago by
Changed 13 years ago by
comment:2 Changed 13 years ago by
- Status changed from new to needs_review
Here is a complete spkg up to the Sage standard for spkg's (hopefully):
http://sage.math.washington.edu/home/wstein/patches/database_sloane_oeis-2009-12.spkg
comment:3 Changed 13 years ago by
- Status changed from needs_review to needs_work
The package installed ok, but sloane.py needs work:
sage: SloaneEncyclopedia.load()
ERROR: An unexpected error occurred while tokenizing input
The following traceback may be corrupted or invalid
The error message is: ('EOF in multi-line statement', (48, 0))
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
/home/jaap/.sage/temp/vrede.jaapspies.nl/14953/_home_jaap__sage_init_sage_0.py in <module>()
/home/jaap/downloads/sage-4.3.rc0/local/lib/python2.6/site-packages/sage/databases/sloane.pyc in load(self)
    246 seqnum = int(m.group('num'));
    247 msg = m.group('body').strip();
--> 248 self.__data__[seqnum] = [seqnum, None, ','+msg+',']
    249 verbose("Finished loading", tm)
    250 self.__loaded__ = True
IndexError: list assignment index out of range
First of all, there are now more sequences in the database than the hardcoded array size allows:
class SloaneEncyclopediaClass:
    """
    A local copy of the Sloane Online Encyclopedia of Integer
    Sequences that contains only the sequence numbers and the
    sequences themselves.
    """
    def __init__(self):
        """
        Initialize the database but do not load any of the data.
        """
        self.__file__ = "%s/data/sloane/sloane-oeis.bz2"%os.environ["SAGE_ROOT"]
        self.__arraysize__ = 114751 # maximum sequence number + 1
        self.__loaded__ = False
Jaap
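The failure reported in this comment can be reproduced in isolation. The following is a minimal sketch, assuming a preallocated list like the one load() builds from __arraysize__; the helper name store_fixed and the sample terms are hypothetical and not part of sloane.py.

```python
ARRAY_SIZE = 114751  # the hardcoded "maximum sequence number + 1" quoted above


def store_fixed(data, seqnum, body):
    """Mimic the failing assignment from load() on a preallocated list."""
    data[seqnum] = [seqnum, None, "," + body + ","]


data = [None] * ARRAY_SIZE
store_fixed(data, 45, "0,1,1,2,3,5")      # fits: index is below ARRAY_SIZE

try:
    store_fixed(data, 175062, "1,2,3")    # newer sequence number, past the end
except IndexError as exc:
    error_message = str(exc)              # same message as in the traceback
```

Any sequence number at or above the hardcoded size triggers the same IndexError, so the limit has to grow with the database or be dropped altogether.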
comment:4 Changed 13 years ago by
comment:5 Changed 13 years ago by
More readable version:
I completely forgot that the array size was hardcoded in SloaneEncyclopediaClass -- this is what caused the error, since now the number of entries is bigger than the array size. There's a bizarre new issue with numbering, though: most of the online sequences are sequentially numbered, but in the version I downloaded last night the sequential numbers end at A175062 and then there's a single sequence, A557274, after that. (To check the numbers in your database file, run "cut -d' ' -f1 sloane-oeis | head".)
The two best fixes I have in mind, other than getting Sloane to renumber that one extra sequence, are either to replace SloaneEncyclopediaClass.__data__ with a hashtable whose keys are the indices, or to let it be a huge array whose last index is 557274. The first might be slower, but the second one will require storing almost 400000 extra "None" entries in the __data__ array, and they'll have to be iterated through and ignored in the find() method.
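The trade-off between the two options can be sketched as follows. This is an illustration only: the two sample entries are placeholder data, and find_in_dict / find_in_list are hypothetical stand-ins for the search done in find().

```python
MAX_SEQ = 557274  # the outlying sequence number mentioned above

# Placeholder entries: sequence number -> comma-delimited terms.
entries = {45: ",0,1,1,2,3,5,8,", MAX_SEQ: ",1,2,3,"}

# Option 1: dictionary keyed by sequence number; only real entries are stored.
as_dict = dict(entries)

# Option 2: list indexed by sequence number; every gap holds None.
as_list = [None] * (MAX_SEQ + 1)
for seqnum, body in entries.items():
    as_list[seqnum] = body


def find_in_dict(table, pattern):
    """Return the sorted sequence numbers whose terms contain pattern."""
    return sorted(n for n, body in table.items() if pattern in body)


def find_in_list(table, pattern):
    """Same search on the list; None placeholders must be skipped."""
    return [n for n, body in enumerate(table)
            if body is not None and pattern in body]


# Number of unused slots the list has to carry for this sample data.
empty_slots = sum(1 for body in as_list if body is None)
```

Both searches return the same result, but the list version visits every empty slot on each search, whereas the dictionary only ever sees real entries.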
If we stick to using an array instead of a hash table, then probably the right way to handle the array size is to add a line to the update-sloane script: something like
cut -d' ' -f1 sloane-oeis | sort -r | head -1 | sed 's/A//' > sloane-maxseq
where sloane-oeis is the unzipped encyclopedia file, to write the maximal sequence number (in this case, 557274) to a file sloane-maxseq. Then the SloaneEncyclopediaClass.load() method could read this number (plus one) from the sloane-maxseq file into the variable self.__arraysize__ before it creates self.__data__, and continue as normal.
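For illustration, the same computation can be done in Python. This is a minimal sketch, assuming each data line of the unzipped file starts with an identifier of the form "A" followed by digits and then a space (the layout the cut command above relies on). The function name max_sequence_number and the sample lines are hypothetical.

```python
import re

# Assumed line format: "A<digits> <terms>"; anything else (e.g. comments) is skipped.
_SEQ_ID = re.compile(r"^A(\d+)\s")


def max_sequence_number(lines):
    """Return the largest sequence number found in an iterable of lines."""
    numbers = [int(m.group(1)) for m in map(_SEQ_ID.match, lines) if m]
    return max(numbers)


sample = [
    "# comment line\n",
    "A000045 ,0,1,1,2,3,5,8,\n",
    "A175062 ,1,2,3,\n",
    "A557274 ,4,5,6,\n",
]
arraysize = max_sequence_number(sample) + 1  # the value load() would allocate
```

Comparing the numbers as integers avoids relying on all identifiers having the same width, which the lexicographic "sort -r" in the shell version assumes.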
Which of these do you think is the best way to proceed?
Steven
comment:6 Changed 13 years ago by
FWIW, I downloaded a snapshot of the OEIS: all sequences up to date 2009-12-19.
I made a bz2 file: http://sage.math.washington.edu/home/jsp/cat25.bz2
43 MB, expanded this is 176 MB.
Nice to have around.
Jaap
comment:7 Changed 13 years ago by
- Status changed from needs_work to needs_review
I've added a patch which adds two new functions in SloaneEncyclopediaClass:
- SloaneEncyclopedia.install() will download the stripped.gz file from the OEIS website and install it. The user can specify an alternate URL and whether to overwrite an existing copy of the OEIS.
- SloaneEncyclopedia.install_from_gz() installs the encyclopedia from a local copy of stripped.gz; the user has to specify the filename and (optionally) whether to overwrite an existing copy.
This eliminates the need for a spkg as long as the user can get a copy of stripped.gz, so if we want to continue providing a spkg (assuming we even have permission: see http://www.research.att.com/~njas/sequences/Seis.html#SEARCH2) it should probably just contain stripped.gz and a spkg-install script which passes it to install_from_gz().
The patch should also fix the IndexError issue from the referee report, since now instead of hardcoding the size of the database and allocating an array of that size it just loads the database into a dictionary.
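The dictionary-based loading can be sketched roughly as below. This is not the code from the patch: parse_stripped and load_stripped are hypothetical names, and the line format ("A" plus digits, whitespace, then comma-delimited terms) is an assumption based on the 'num' and 'body' groups visible in the traceback earlier in this ticket.

```python
import gzip
import re

# Assumed format of one data line of stripped.gz: "A000045 ,0,1,1,2,3,5,"
_ENTRY = re.compile(r"^A(?P<num>\d+)\s+(?P<body>.*)$")


def parse_stripped(lines):
    """Build {sequence number: list of terms} from an iterable of text lines."""
    data = {}
    for line in lines:
        m = _ENTRY.match(line.strip())
        if m is None:  # skip comments and blank lines
            continue
        terms = [int(t) for t in m.group("body").split(",") if t.strip()]
        data[int(m.group("num"))] = terms
    return data


def load_stripped(path):
    """Read a local gzip file (e.g. a downloaded stripped.gz) into a dict."""
    with gzip.open(path, "rt") as handle:
        return parse_stripped(handle)


table = parse_stripped(
    ["# header\n", "A000045 ,0,1,1,2,3,5,\n", "A557274 ,-1,2,\n"]
)
```

Because the keys are the sequence numbers themselves, an outlier such as 557274 costs one entry rather than hundreds of thousands of empty slots, and no size needs to be known in advance.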
comment:8 Changed 13 years ago by
- Status changed from needs_review to positive_review
Looks good to me.
Tested the new functions. Worked for me.
Remark: I think there is no problem in offering an optional spkg. Neil excludes distributing the full database.
Suggestion: maybe it is feasible to modify sloane.py to include the file names.gz. That way sequences can have their proper names from the OEIS.
Cheers,
Jaap
comment:9 Changed 13 years ago by
Two remarks:
[1] Maybe the name of the patch should conform to the standard: trac_7692.patch
[2] The output consists of Python integers (note the L suffix on the larger values):
sage: SloaneEncyclopedia[111111]
[1, 2, 0, 2, 6, 46, 338, 2926, 28146, 298526, 3454434, 43286526, 583835650, 8433987582L, 129941213186L, 2127349165822L, 36889047574274L, 675548628690430L, 13030733384956418L, 264111424634864638L]
I would like to see this as Sage integers instead.
Shall we open another ticket?
Cheers,
Jaap
comment:10 Changed 13 years ago by
Ticket #7749 is now open, and I expect to have a patch submitted in the next day or so.
comment:11 Changed 13 years ago by
- Reviewers set to Jaap Spies
comment:12 Changed 12 years ago by
- Merged in set to sage-4.3.1.alpha0
- Resolution set to fixed
- Status changed from positive_review to closed
comment:13 Changed 12 years ago by
- Summary changed from update the sloane OEIS database to the latest version; it is a little out of date. to update the sloane OEIS database to the latest version; it is a little out of date
(see attachment: update-sloane)