#12205 closed task (fixed)
rewrite conway polynomial spkg and code in Sage library to not use ZODB
Reported by: | was | Owned by: | was |
---|---|---|---|
Priority: | major | Milestone: | sage-5.7 |
Component: | packages: huge | Keywords: | |
Cc: | kini | Merged in: | sage-5.7.beta0 |
Authors: | R. Andrew Ohana | Reviewers: | François Bissey |
Report Upstream: | N/A | Work issues: | |
Branch: | Commit: | ||
Dependencies: | #13896, #13963 | Stopgaps: |
Description (last modified by )
This is necessary so we can dump the ZODB spkg from Sage.
Installation Instructions
- Install the new spkg.
- Apply trac12205.patch to the sage library
- Apply 12205_root.patch to the sage root repository
Attachments (3)
Change History (68)
comment:1 Changed 11 years ago by
- Cc kini added
comment:2 Changed 11 years ago by
comment:3 Changed 11 years ago by
That's a good start. Now we need to put SPKG.txt and spkg-install in proper shape. I suppose there need to be more work sage side to use the sobj rather than zodb.
comment:4 Changed 10 years ago by
- Priority changed from minor to major
- Type changed from enhancement to task
William, do you have a work-in-progress patch for the sage library? I would like to finish this task.
Edit: Other than a description in the SPKG.txt, this spkg should final.
Edit2: Ok, there is now a small description, and it is rebased off of the spkg that I made for #13123.
comment:5 Changed 10 years ago by
Looks like I missed your work on this ohanar getting rid of zodb would be good news all around. Do this need any rebasing?
comment:6 Changed 10 years ago by
What's the status? According to comment:4, conway_polynomials-0.4.spkg should be final. So, needs review?
What should be done to test whether this is really enough to get rid of zodb? I mean: How can one remove zodb from an existing Sage install?
comment:7 Changed 10 years ago by
Well, a patch is also needed to the sage library, and unfortunately William never provided that. It probably wouldn't be that hard to reverse engineer what he was thinking from the SPKG, but I haven't really looked into it.
The only other database in sage that used zodb was the Cremona database, but I rewrote that a year and a half ago to use sqlite3, so this is the only thing that should really be using zope at all (there may be other things that stupidly depend on it, but they should be easy to fix to not use zope).
comment:8 Changed 10 years ago by
I think all that needs to be changed are the files: databases/compressed_storage.py, databases/db.py and databases/conway.py. conway.py is where we will have to be careful to present the same interface to the rest of sage so has not to break anything.
comment:9 Changed 10 years ago by
OK rewrite of conway.py belongs here and the other two files belong to #10353. I am looking at conway.py and see if it is in my abilities.
comment:10 Changed 10 years ago by
I think it is easy but it will be question of getting the syntax right. In the mean time I found that the stein-watkins optional "huge" database also use zodb and will need a similar treatment.
comment:11 follow-ups: ↓ 12 ↓ 14 Changed 10 years ago by
In the end I am hitting a brick wall here. Possibly due to my lack of python skills and my will to only have minimal changes in conway.py. The issue is that the created sobj is a double dictionary and that the call to the ConwayPolynomials() class follow the same pattern as a double dictionary. You have calls of the kind: ConwayPolynomials()[3][21]
Implementing the class as a dictionary allows sage to start
import os from sage.structure.sage_object import load from sage.misc.misc import SAGE_DATA _CONWAYDATA = os.path.join(SAGE_DATA,'conway_polynomials') class ConwayPolynomials(dict): def __init__(self,*arg,**kw): """ Initialize the database. """ super(ConwayPolynomials, self).__init__(*arg, **kw) def _init(self): if not os.path.exists(_CONWAYDATA): raise RuntimeError, "In order to initialize the database, the directory must exist."%_CONWAYDATA os.chdir(_CONWAYDATA) self = load(os.path.join(_CONWAYDATA, 'conway_polynomials'))
but unfortunately but I get key errors:
sage: sage.databases.conway.ConwayPolynomials()[3][21] --------------------------------------------------------------------------- KeyError Traceback (most recent call last) /home/fbissey/Work/Overlays/BlueFern/sci-mathematics/sage/<ipython console> in <module>() KeyError: 3 sage: sage.databases.conway.ConwayPolynomials().keys() []
I was hoping that inheriting the dict class the access problem would be solved but I am obviously doing something wrong here. Loading the sobj directly from sage is totally usable, it is just when trying to put it in a class.
comment:12 in reply to: ↑ 11 Changed 10 years ago by
Replying to fbissey:
self = load(os.path.join(_CONWAYDATA, 'conway_polynomials'))
This only changes the meaning of the self
variable in the local scope. To modify the self
object, use self.update(load(…))
instead.
comment:13 Changed 10 years ago by
Interesting point but here it will call update() from the dict class which I don't think is what you meant. I still get the same results after doing your suggested replacement.
comment:14 in reply to: ↑ 11 Changed 10 years ago by
Replying to fbissey:
In the end I am hitting a brick wall here. Possibly due to my lack of python skills and my will to only have minimal changes in conway.py.
I don't think _init
is a magic python method that automatically gets called. It may well be part of a protocol involved in sage.databases.db.Database. However, when you want to implement this as inheriting from dict
, you'll have to call _init
yourself in __init__
, or better: just make _init
part of it.
To make the dictionary read-only, you may want to override the dictionary modification methods too. Plus you probably want to amend pickling (interesting question: How should this pickle? Should it write its content to a file or should it pickle to "initialize ConwayPolynomials?() for unpickling"?
comment:15 Changed 10 years ago by
Thanks Nils. I actually though about that _init method as it was strange to have two. Merging the two didn't improve my problems however. I wouldn't know about the pickling. I'd like this to give me the correct behaviour of the class first. I'll retry merging inits as there may have been caching getting in the way.
comment:16 Changed 10 years ago by
I found a way to properly initialise the database. It is a bit ugly but we may be able to move to the other problems: pickling and dict method override. I'll post a patch when I am on something else than that iPad.
comment:17 Changed 10 years ago by
Actually, what you should do is just write the ConwayPolynomials? class without thinking about how to initialize from an sobj at all, but just as either a normal class that has an attribute storing the info or inheriting from dict and ensure that pickling works. Then you just have to make sure that the .sobj
is a pickled version of it with the appropriate content. (and you may store in the documentation somewhere a script that can produce this pickle from a human-readable file)
comment:18 follow-up: ↓ 20 Changed 10 years ago by
Rather than inheriting from dict
, I think it makes more sense to inherit from collections.Mapping
since we only want read functionality. I uploaded a patch that does this, and uses the current dictionary format in the background (I don't see a good alternative to this storage method, but I'm open to suggestions).
comment:19 follow-up: ↓ 22 Changed 10 years ago by
I think we have to stick with the dictionary format since the code in sage is littered with calls to ConwayPolynomials()[p][n] and changing the storage would imply changing an enormous amount of code. I trust that your python foo is way ahead of mine. I would never have produced that getitem and didn't find any examples to do that with google. Your patch is longer than my current one but probably better behaved in the long run. It is very educative to me too.
comment:20 in reply to: ↑ 18 Changed 10 years ago by
Looks great! Two questions:
- How do we (re)produce the *.sobj file? I think we do want to be able to bootstap sage from human-readable formats? Just documenting how to obtain the appropriate *.sobj is probably sufficient.
- Do you have a good reason to inherit from
UniqueRepresentation
? Doing so is definitely not free. It makes instantiation a much more expensive affair and there'll be a cache floating around. You should only do so if you need it and I don't think you do here (ConwayPolynomials
is not a parent).
comment:21 Changed 10 years ago by
I can answer the first question: It is generated in the spkg attached in this ticket from a python table in text format look it up. Technically it doesn't have to be a sobj the previous spkg was installing a bzipped version of the python table and then inserting it in the database. We could do something similar, not sure about the cost.
comment:22 in reply to: ↑ 19 Changed 10 years ago by
Replying to fbissey:
I think we have to stick with the dictionary format ...
As much as I understand, the "format" (in the sense of the "interface to retrieve data") does not change. That's exactly the point: If you wanted to inherit from dict, then you'd also have to have a __setitem__
method - but we don't want to set data in an interactive session. All what we want is that during initialisation, data are obtained from a file, and then the user can retrieve these data (which is granted by the __getitem__
method).
... since the code in sage is littered with calls to ConwayPolynomials()[p][n]
Would that be changed by the patch? Looking at the lines
def __getitem__(self, key): try: return self._store[key] except KeyError, err: try: if isinstance(key, (tuple, list)): if len(key) == 2: return self._store[key[0]][key[1]] except KeyError: pass raise err
it seems to me that ConwayPolynomials()[p][n]
is still a possible syntax (that's return self._store[key]
), but in addition to the old syntax we now also allow the syntax ConwayPolynomials()[p, n]
(that's return self._store[key[0]][key[1]]
).
and changing the storage would imply changing an enormous amount of code.
By "storage", I understand the format of the data stored in the .sobj file containing the data base. I don't think that would be a problem, as long as there is a function that is able to read old data and transforms it into the new format.
I trust that your python foo is way ahead of mine. I would never have produced that getitem and didn't find any examples to do that with google.
That's in dive into python or other Python introductions, or in the Python documentation.
comment:23 follow-up: ↓ 27 Changed 10 years ago by
I noticed a a flaw with what I did -- ConwayPolynomials()[p]
returns a mutable Python dict, so you can actually write to the database. I'll post a fix for this in a moment.
There is no good reason as far as I can tell to use anything but plain text in the SPKG considering how small the database is.
As for the second question, the only concern would be that the entire database is loaded into memory with each instance (granted this might not be a huge concern due to the small size of the database). The main reason I added the inheritance of UniqueRepresentation
was because of a comment I seem to recall being told that Sage databases should be unique. It is simple enough to remove if we deem it not necessary.
comment:24 Changed 10 years ago by
@SimonKing?: overall you read my terrible prose correctly. I read the patch carefuly and in spite of call changes in the class I understood that the old way of accessing the class was preserved. You make a good point that it doesn't need to be preserved internally - just that we provide the right method of access. I used to have a copy of "dive into python" wonder what happened to it. Probably lost last time I moved. I would say I read the python documentation but my python skill are not advanced enough that I could have crafted a getitem accessing something like [p][n] - [p,n] is easy the former is more tricky to me.
@ohanar: going with plain text would remove the requirement of installing sage before this spkg. As it is sage needs to be installed first to produce the sobj. The spkg could then be installed at any time.
comment:25 Changed 10 years ago by
- Description modified (diff)
comment:26 follow-up: ↓ 28 Changed 10 years ago by
Any reason you changed "[]" to "()" in the polynomial method doctest? It wasn't in the previous patch and I would hope that if it is correct it doesn't lead to other breakages in the rest of sage.
comment:27 in reply to: ↑ 23 Changed 10 years ago by
Replying to ohanar:
As for the second question, the only concern would be that the entire database is loaded into memory with each instance (granted this might not be a huge concern due to the small size of the database).
Perhaps store the database in a module-level variable then? Modules naturally have a unique instance (or at least you have to work very hard to get multiple instances). If the nature of the instance is known at compile time (such as with a data base!) you might as well use the mechanism to your advantage.
(for data bases in general: Especially for writable data bases, you wouldn't want multiple instances backed by the same file)
comment:28 in reply to: ↑ 26 ; follow-up: ↓ 29 Changed 10 years ago by
Replying to fbissey:
Any reason you changed "[]" to "()" in the polynomial method doctest? It wasn't in the previous patch and I would hope that if it is correct it doesn't lead to other breakages in the rest of sage.
I updated the database to use tuples (updated the spkg), rather than lists, as they are not mutable. I don't believe there are other breakages, but I haven't run the full doctest suit (just a few files I found through greping).
Replying to nbruin:
Perhaps store the database in a module-level variable then?
Yes, that sounds like a plan.
comment:29 in reply to: ↑ 28 ; follow-up: ↓ 30 Changed 10 years ago by
Replying to ohanar:
Replying to fbissey:
Any reason you changed "[]" to "()" in the polynomial method doctest? It wasn't in the previous patch and I would hope that if it is correct it doesn't lead to other breakages in the rest of sage.
I updated the database to use tuples (updated the spkg), rather than lists, as they are not mutable. I don't believe there are other breakages, but I haven't run the full doctest suit (just a few files I found through greping).
OK will run some tests as well. I didn't know you had updated the spkg.
comment:30 in reply to: ↑ 29 Changed 10 years ago by
Replying to fbissey:
OK will run some tests as well. I didn't know you had updated the spkg.
Sorry about that, I noticed it wouldn't install on 5.6-beta1, so I updated it (and changed to tuples at the same time).
Ok, added a updated the patch to add some doctests and remove UniqueRepresentation
in favor of a module level variable.
comment:31 Changed 10 years ago by
All tests past on 5.5 with the next to last patch. Since the the last changes are internal I don't think it will change the tests results. Thanks ohanar for the splendid work.
comment:32 Changed 10 years ago by
- Status changed from new to needs_review
Ok, all doctests are now filled and all functionality should be there, so I'm marking this as needs review.
comment:33 Changed 10 years ago by
well, apparently sage doesn't like docstrings surrounded by '''
instead of """
, so that is now fixed
comment:34 follow-up: ↓ 35 Changed 10 years ago by
This is very bizzare I had no doctests failures with the previous version of the patch (on top of 5.5) but with this new one I get
sage -t -long "devel/sage-main/sage/algebras/free_algebra.py" The doctested process was killed by signal 11 [1.4 s]
But if I add "-verbose" the test pass. There are no obvious relation with this ticket (and #10353 on which I am working at the same time) apart from the fact that it appeared out of the blue after application.
comment:35 in reply to: ↑ 34 Changed 10 years ago by
Replying to fbissey:
This is very bizzare I had no doctests failures with the previous version of the patch (on top of 5.5) but with this new one I get
sage -t -long "devel/sage-main/sage/algebras/free_algebra.py" The doctested process was killed by signal 11 [1.4 s]But if I add "-verbose" the test pass.
That is very likely caused by an upstream bug in Cython that became immanent with #715.
Can you try with the new Cython spkg from #13896? I.e., install that spkg and do sage -ba.
comment:36 Changed 10 years ago by
Thanks Simon. I have no time to do it tonight, I'll try tomorrow. As far as I am concerned this is the only thing that stand against a positive review. But I suspect it is as you say, and is just something bitting from another corner.
comment:37 follow-up: ↓ 38 Changed 10 years ago by
- Reviewers set to François Bissey
- Status changed from needs_review to positive_review
Yup. Upgrading cython fixed the problem. I also checked that the thing would build properly not just upgrade from an existing sage. I am giving this a positive review.
comment:38 in reply to: ↑ 37 Changed 10 years ago by
- Dependencies set to #13896
Replying to fbissey:
Yup. Upgrading cython fixed the problem. I also checked that the thing would build properly not just upgrade from an existing sage. I am giving this a positive review.
Great! To be on the safe side, I add #13896 as a dependency, even though #13896 is a blocker with positive review (hence, is assumed to be in the next release).
comment:39 follow-up: ↓ 40 Changed 10 years ago by
- Milestone changed from sage-5.6 to sage-5.7
comment:40 in reply to: ↑ 39 Changed 10 years ago by
What is preventing this from being merged in 5.6?
comment:41 Changed 10 years ago by
Nothing really, but I feel like this should be merged while removing ZODB.
comment:42 Changed 10 years ago by
I am not fussy about it. I think #10353 is ready but if you want it in two stages that's fine. Zodb is now gone from sage-on-gentoo and maintenance wise we won't miss it :).
comment:43 Changed 10 years ago by
- Status changed from positive_review to needs_work
The sage-root patch needs to be rebased to sage-5.6.beta3.
comment:44 Changed 10 years ago by
- Description modified (diff)
- Status changed from needs_work to needs_review
I put a rebased version of the root repo patch and updated the ticket instructions accordingly. The rebasing was done against 5.6.rc0.
comment:45 Changed 10 years ago by
- Status changed from needs_review to positive_review
Putting back to positive review.
comment:46 Changed 10 years ago by
- Description modified (diff)
comment:47 Changed 10 years ago by
- Description modified (diff)
- Status changed from positive_review to needs_work
Changed 10 years ago by
comment:48 Changed 10 years ago by
This needs to depend on $(SAGERUNTIME)
, not only the Sage library because we actually run Sage code. See new patch.
Moreover, this causes build problems for sagenb
:
Processing dependencies for Twisted==12.1.0 Searching for zope.interface Link to http://pypi.python.org/simple/zope.interface/ ***BLOCKED*** by --allow-hosts Couldn't find index page for 'zope.interface' (maybe misspelled?) Scanning index of all packages (this may take a while) Link to http://pypi.python.org/simple/ ***BLOCKED*** by --allow-hosts No local packages or download links found for zope.interface error: Could not find suitable distribution for Requirement.parse('zope.interface') Error installing Twisted-12.1.0.tar.bz2 !
comment:49 follow-up: ↓ 50 Changed 10 years ago by
SAGERUNTIME? I hadn't seen that. I should have paid more attention to the originating ticket. OK I hadn't thought of twisted but the comment really is for #10353. I shall try to update the sagenb spkg to include zope.interface sometime this week.
comment:50 in reply to: ↑ 49 Changed 10 years ago by
Replying to fbissey:
SAGERUNTIME? I hadn't seen that.
Basically, $(INST)/$(SAGE)
is purely the Sage library (i.e. the stuff in devel/sage
). But actually starting Sage or running Sage code requires more, because things are imported from other packages, for example the notebook. That stuff is in $(SAGERUNTIME)
.
It used to be the case that $(INST)/$(SAGE)
was what is essentially $(SAGERUNTIME)
now. But that's silly because it adds extra unneeded build-time dependencies for $(INST)/$(SAGE)
, which needs a long time to build.
comment:51 follow-up: ↓ 52 Changed 10 years ago by
Understood. But zope.interface shouldn't be a blocker for this ticket, only for #10353.
comment:52 in reply to: ↑ 51 Changed 10 years ago by
comment:53 Changed 10 years ago by
- Status changed from needs_work to positive_review
comment:54 Changed 10 years ago by
- Dependencies changed from #13896 to #13896, #13963
Potential build failures due to #13963.
comment:55 Changed 10 years ago by
Could this have caused (note this is over GF(8)
):
sage -t "devel/sage-main/sage/rings/polynomial/polynomial_ring.py" ********************************************************************** File "/release/merger/sage-5.7.beta0/devel/sage-main/sage/rings/polynomial/polynomial_ring.py", line 1641: sage: R.lagrange_polynomial([(a^2+a,a),(a,1),(a^2,a^2+a+1)], algorithm="neville", previous_row=p)[-1] Expected: a^2*x^2 + a^2*x + a^2 Got: a^2*x^2 + a*x + a^2 + a **********************************************************************
comment:56 follow-up: ↓ 57 Changed 10 years ago by
Is it a new doctest? I see this on unreleased 5.7.beta0. I agressively patched sage-on-gentoo (5.5) to get rid of zodb and we haven't seen that anywhere. I'll admit the move to new gap is giving me trouble but it should be unrelated - or should it?
comment:57 in reply to: ↑ 56 Changed 10 years ago by
Replying to fbissey:
Is it a new doctest?
No, it's not. On first sight, it doesn't seem to use anything besides basic polynomial arithmetic. But if basic polynomial arithmetic were broken, we would see a lot more doctests failing.
I'm not at all saying that this ticket has anything to do with it, I merge a bunch of tickets and when there's a failure I have to guess which ticket caused it.
comment:58 Changed 10 years ago by
- Merged in set to sage-5.7.beta0
- Resolution set to fixed
- Status changed from positive_review to closed
comment:59 Changed 10 years ago by
comment:60 follow-up: ↓ 61 Changed 10 years ago by
It is part of the spkg. Don't ask me why ohanar put a "print p " in there. I don't mind the spam but I had a complaint about it in sage-on-gentoo and removed it there.
comment:61 in reply to: ↑ 60 ; follow-up: ↓ 63 Changed 10 years ago by
It is part of the spkg. Don't ask me why ohanar put a "print p " in there. I don't mind the spam but I had a complaint about it in sage-on-gentoo and removed it there.
Don't worry, it wasn't directed at anyone in particular! Presumably it's something left over from debugging. Shall I assume this is to be considered non-optimal and a ticket should be opened?
comment:62 Changed 10 years ago by
fbissey: sage-on-gentoo is like a third level of ticket review after the official reviewer and the release manager! :D
comment:63 in reply to: ↑ 61 Changed 10 years ago by
It is part of the spkg. Don't ask me why ohanar put a "print p " in there. I don't mind the spam but I had a complaint about it in sage-on-gentoo and removed it there.
Don't worry, it wasn't directed at anyone in particular! Presumably it's something left over from debugging. Shall I assume this is to be considered non-optimal and a ticket should be opened?
So... should I open a ticket? It does seem bizarre, and apparently easily fixed? But if it was necessary, I don't want to.
comment:64 follow-up: ↓ 65 Changed 10 years ago by
That's very easy to fix but opening a ticket is pretty much the only way it will ever get merged.
comment:65 in reply to: ↑ 64 Changed 10 years ago by
That's very easy to fix but opening a ticket is pretty much the only way it will ever get merged.
I did some work on this on the airplane, and better post a link before I forget about it or loose it: http://wstein.org/patches/conway_polynomials-0.3.spkg
Basically, the database is tiny so it can easily and very efficiently just be stored as a standard sobj. Using ZODB is silly.