Opened 8 years ago

Closed 7 years ago

Last modified 7 years ago

#12205 closed task (fixed)

rewrite conway polynomial spkg and code in Sage library to not use ZODB

Reported by: was Owned by: was
Priority: major Milestone: sage-5.7
Component: packages: huge Keywords:
Cc: kini Merged in: sage-5.7.beta0
Authors: R. Andrew Ohana Reviewers: François Bissey
Report Upstream: N/A Work issues:
Branch: Commit:
Dependencies: #13896, #13963 Stopgaps:

Description (last modified by jdemeyer)

This is necessary so we can dump the ZODB spkg from Sage.


Installation Instructions

Attachments (3)

trac12205.patch (10.0 KB) - added by ohanar 7 years ago.
apply to sage library
trac12205_root.2.patch (803 bytes) - added by fbissey 7 years ago.
updated sage_root repo patch
12205_root.patch (932 bytes) - added by jdemeyer 7 years ago.

Download all attachments as: .zip

Change History (68)

comment:1 Changed 8 years ago by kini

  • Cc kini added

comment:2 Changed 8 years ago by was

I did some work on this on the airplane, and better post a link before I forget about it or loose it: http://wstein.org/patches/conway_polynomials-0.3.spkg

Basically, the database is tiny so it can easily and very efficiently just be stored as a standard sobj. Using ZODB is silly.

comment:3 Changed 8 years ago by fbissey

That's a good start. Now we need to put SPKG.txt and spkg-install in proper shape. I suppose there need to be more work sage side to use the sobj rather than zodb.

comment:4 Changed 8 years ago by ohanar

  • Priority changed from minor to major
  • Type changed from enhancement to task

William, do you have a work-in-progress patch for the sage library? I would like to finish this task.

Edit: Other than a description in the SPKG.txt, this spkg should final.

Edit2: Ok, there is now a small description, and it is rebased off of the spkg that I made for #13123.

Last edited 8 years ago by ohanar (previous) (diff)

comment:5 Changed 7 years ago by fbissey

Looks like I missed your work on this ohanar getting rid of zodb would be good news all around. Do this need any rebasing?

comment:6 Changed 7 years ago by SimonKing

What's the status? According to comment:4, conway_polynomials-0.4.spkg should be final. So, needs review?

What should be done to test whether this is really enough to get rid of zodb? I mean: How can one remove zodb from an existing Sage install?

comment:7 Changed 7 years ago by ohanar

Well, a patch is also needed to the sage library, and unfortunately William never provided that. It probably wouldn't be that hard to reverse engineer what he was thinking from the SPKG, but I haven't really looked into it.

The only other database in sage that used zodb was the Cremona database, but I rewrote that a year and a half ago to use sqlite3, so this is the only thing that should really be using zope at all (there may be other things that stupidly depend on it, but they should be easy to fix to not use zope).

comment:8 Changed 7 years ago by fbissey

I think all that needs to be changed are the files: databases/compressed_storage.py, databases/db.py and databases/conway.py. conway.py is where we will have to be careful to present the same interface to the rest of sage so has not to break anything.

comment:9 Changed 7 years ago by fbissey

OK rewrite of conway.py belongs here and the other two files belong to #10353. I am looking at conway.py and see if it is in my abilities.

comment:10 Changed 7 years ago by fbissey

I think it is easy but it will be question of getting the syntax right. In the mean time I found that the stein-watkins optional "huge" database also use zodb and will need a similar treatment.

comment:11 follow-ups: Changed 7 years ago by fbissey

In the end I am hitting a brick wall here. Possibly due to my lack of python skills and my will to only have minimal changes in conway.py. The issue is that the created sobj is a double dictionary and that the call to the ConwayPolynomials() class follow the same pattern as a double dictionary. You have calls of the kind: ConwayPolynomials()[3][21]

Implementing the class as a dictionary allows sage to start

import os

from sage.structure.sage_object import load
from sage.misc.misc import SAGE_DATA

_CONWAYDATA = os.path.join(SAGE_DATA,'conway_polynomials')

class ConwayPolynomials(dict):
    def __init__(self,*arg,**kw):
        """
        Initialize the database.
        """
        super(ConwayPolynomials, self).__init__(*arg, **kw)

    def _init(self):
        if not os.path.exists(_CONWAYDATA):
            raise RuntimeError, "In order to initialize the database, the directory must exist."%_CONWAYDATA
        os.chdir(_CONWAYDATA)
        self = load(os.path.join(_CONWAYDATA, 'conway_polynomials'))

but unfortunately but I get key errors:

sage: sage.databases.conway.ConwayPolynomials()[3][21]
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)

/home/fbissey/Work/Overlays/BlueFern/sci-mathematics/sage/<ipython console> in <module>()

KeyError: 3
sage: sage.databases.conway.ConwayPolynomials().keys()
[]

I was hoping that inheriting the dict class the access problem would be solved but I am obviously doing something wrong here. Loading the sobj directly from sage is totally usable, it is just when trying to put it in a class.

comment:12 in reply to: ↑ 11 Changed 7 years ago by gagern

Replying to fbissey:

self = load(os.path.join(_CONWAYDATA, 'conway_polynomials'))

This only changes the meaning of the self variable in the local scope. To modify the self object, use self.update(load(…)) instead.

comment:13 Changed 7 years ago by fbissey

Interesting point but here it will call update() from the dict class which I don't think is what you meant. I still get the same results after doing your suggested replacement.

comment:14 in reply to: ↑ 11 Changed 7 years ago by nbruin

Replying to fbissey:

In the end I am hitting a brick wall here. Possibly due to my lack of python skills and my will to only have minimal changes in conway.py.

I don't think _init is a magic python method that automatically gets called. It may well be part of a protocol involved in sage.databases.db.Database. However, when you want to implement this as inheriting from dict, you'll have to call _init yourself in __init__, or better: just make _init part of it.

To make the dictionary read-only, you may want to override the dictionary modification methods too. Plus you probably want to amend pickling (interesting question: How should this pickle? Should it write its content to a file or should it pickle to "initialize ConwayPolynomials?() for unpickling"?

comment:15 Changed 7 years ago by fbissey

Thanks Nils. I actually though about that _init method as it was strange to have two. Merging the two didn't improve my problems however. I wouldn't know about the pickling. I'd like this to give me the correct behaviour of the class first. I'll retry merging inits as there may have been caching getting in the way.

comment:16 Changed 7 years ago by fbissey

I found a way to properly initialise the database. It is a bit ugly but we may be able to move to the other problems: pickling and dict method override. I'll post a patch when I am on something else than that iPad.

comment:17 Changed 7 years ago by nbruin

Actually, what you should do is just write the ConwayPolynomials? class without thinking about how to initialize from an sobj at all, but just as either a normal class that has an attribute storing the info or inheriting from dict and ensure that pickling works. Then you just have to make sure that the .sobj is a pickled version of it with the appropriate content. (and you may store in the documentation somewhere a script that can produce this pickle from a human-readable file)

comment:18 follow-up: Changed 7 years ago by ohanar

Rather than inheriting from dict, I think it makes more sense to inherit from collections.Mapping since we only want read functionality. I uploaded a patch that does this, and uses the current dictionary format in the background (I don't see a good alternative to this storage method, but I'm open to suggestions).

comment:19 follow-up: Changed 7 years ago by fbissey

I think we have to stick with the dictionary format since the code in sage is littered with calls to ConwayPolynomials()[p][n] and changing the storage would imply changing an enormous amount of code. I trust that your python foo is way ahead of mine. I would never have produced that getitem and didn't find any examples to do that with google. Your patch is longer than my current one but probably better behaved in the long run. It is very educative to me too.

comment:20 in reply to: ↑ 18 Changed 7 years ago by nbruin

Looks great! Two questions:

  • How do we (re)produce the *.sobj file? I think we do want to be able to bootstap sage from human-readable formats? Just documenting how to obtain the appropriate *.sobj is probably sufficient.
  • Do you have a good reason to inherit from UniqueRepresentation? Doing so is definitely not free. It makes instantiation a much more expensive affair and there'll be a cache floating around. You should only do so if you need it and I don't think you do here (ConwayPolynomials is not a parent).

comment:21 Changed 7 years ago by fbissey

I can answer the first question: It is generated in the spkg attached in this ticket from a python table in text format look it up. Technically it doesn't have to be a sobj the previous spkg was installing a bzipped version of the python table and then inserting it in the database. We could do something similar, not sure about the cost.

comment:22 in reply to: ↑ 19 Changed 7 years ago by SimonKing

Replying to fbissey:

I think we have to stick with the dictionary format ...

As much as I understand, the "format" (in the sense of the "interface to retrieve data") does not change. That's exactly the point: If you wanted to inherit from dict, then you'd also have to have a __setitem__ method - but we don't want to set data in an interactive session. All what we want is that during initialisation, data are obtained from a file, and then the user can retrieve these data (which is granted by the __getitem__ method).

... since the code in sage is littered with calls to ConwayPolynomials()[p][n]

Would that be changed by the patch? Looking at the lines

    def __getitem__(self, key): 
        try: 
            return self._store[key] 
        except KeyError, err: 
            try: 
                if isinstance(key, (tuple, list)): 
                    if len(key) == 2: 
                        return self._store[key[0]][key[1]] 
            except KeyError: 
                pass 
            raise err 

it seems to me that ConwayPolynomials()[p][n] is still a possible syntax (that's return self._store[key]), but in addition to the old syntax we now also allow the syntax ConwayPolynomials()[p, n] (that's return self._store[key[0]][key[1]]).

and changing the storage would imply changing an enormous amount of code.

By "storage", I understand the format of the data stored in the .sobj file containing the data base. I don't think that would be a problem, as long as there is a function that is able to read old data and transforms it into the new format.

I trust that your python foo is way ahead of mine. I would never have produced that getitem and didn't find any examples to do that with google.

That's in dive into python or other Python introductions, or in the Python documentation.

comment:23 follow-up: Changed 7 years ago by ohanar

I noticed a a flaw with what I did -- ConwayPolynomials()[p] returns a mutable Python dict, so you can actually write to the database. I'll post a fix for this in a moment.

There is no good reason as far as I can tell to use anything but plain text in the SPKG considering how small the database is.

As for the second question, the only concern would be that the entire database is loaded into memory with each instance (granted this might not be a huge concern due to the small size of the database). The main reason I added the inheritance of UniqueRepresentation was because of a comment I seem to recall being told that Sage databases should be unique. It is simple enough to remove if we deem it not necessary.

comment:24 Changed 7 years ago by fbissey

@SimonKing?: overall you read my terrible prose correctly. I read the patch carefuly and in spite of call changes in the class I understood that the old way of accessing the class was preserved. You make a good point that it doesn't need to be preserved internally - just that we provide the right method of access. I used to have a copy of "dive into python" wonder what happened to it. Probably lost last time I moved. I would say I read the python documentation but my python skill are not advanced enough that I could have crafted a getitem accessing something like [p][n] - [p,n] is easy the former is more tricky to me.

@ohanar: going with plain text would remove the requirement of installing sage before this spkg. As it is sage needs to be installed first to produce the sobj. The spkg could then be installed at any time.

comment:25 Changed 7 years ago by ohanar

  • Description modified (diff)

comment:26 follow-up: Changed 7 years ago by fbissey

Any reason you changed "[]" to "()" in the polynomial method doctest? It wasn't in the previous patch and I would hope that if it is correct it doesn't lead to other breakages in the rest of sage.

comment:27 in reply to: ↑ 23 Changed 7 years ago by nbruin

Replying to ohanar:

As for the second question, the only concern would be that the entire database is loaded into memory with each instance (granted this might not be a huge concern due to the small size of the database).

Perhaps store the database in a module-level variable then? Modules naturally have a unique instance (or at least you have to work very hard to get multiple instances). If the nature of the instance is known at compile time (such as with a data base!) you might as well use the mechanism to your advantage.

(for data bases in general: Especially for writable data bases, you wouldn't want multiple instances backed by the same file)

comment:28 in reply to: ↑ 26 ; follow-up: Changed 7 years ago by ohanar

Replying to fbissey:

Any reason you changed "[]" to "()" in the polynomial method doctest? It wasn't in the previous patch and I would hope that if it is correct it doesn't lead to other breakages in the rest of sage.

I updated the database to use tuples (updated the spkg), rather than lists, as they are not mutable. I don't believe there are other breakages, but I haven't run the full doctest suit (just a few files I found through greping).

Replying to nbruin:

Perhaps store the database in a module-level variable then?

Yes, that sounds like a plan.

comment:29 in reply to: ↑ 28 ; follow-up: Changed 7 years ago by fbissey

Replying to ohanar:

Replying to fbissey:

Any reason you changed "[]" to "()" in the polynomial method doctest? It wasn't in the previous patch and I would hope that if it is correct it doesn't lead to other breakages in the rest of sage.

I updated the database to use tuples (updated the spkg), rather than lists, as they are not mutable. I don't believe there are other breakages, but I haven't run the full doctest suit (just a few files I found through greping).

OK will run some tests as well. I didn't know you had updated the spkg.

comment:30 in reply to: ↑ 29 Changed 7 years ago by ohanar

Replying to fbissey:

OK will run some tests as well. I didn't know you had updated the spkg.

Sorry about that, I noticed it wouldn't install on 5.6-beta1, so I updated it (and changed to tuples at the same time).

Ok, added a updated the patch to add some doctests and remove UniqueRepresentation in favor of a module level variable.

comment:31 Changed 7 years ago by fbissey

All tests past on 5.5 with the next to last patch. Since the the last changes are internal I don't think it will change the tests results. Thanks ohanar for the splendid work.

comment:32 Changed 7 years ago by ohanar

  • Authors set to R. Andrew Ohana
  • Status changed from new to needs_review

Ok, all doctests are now filled and all functionality should be there, so I'm marking this as needs review.

Changed 7 years ago by ohanar

apply to sage library

comment:33 Changed 7 years ago by ohanar

well, apparently sage doesn't like docstrings surrounded by ''' instead of """, so that is now fixed

comment:34 follow-up: Changed 7 years ago by fbissey

This is very bizzare I had no doctests failures with the previous version of the patch (on top of 5.5) but with this new one I get

sage -t -long "devel/sage-main/sage/algebras/free_algebra.py"
The doctested process was killed by signal 11
         [1.4 s]

But if I add "-verbose" the test pass. There are no obvious relation with this ticket (and #10353 on which I am working at the same time) apart from the fact that it appeared out of the blue after application.

comment:35 in reply to: ↑ 34 Changed 7 years ago by SimonKing

Replying to fbissey:

This is very bizzare I had no doctests failures with the previous version of the patch (on top of 5.5) but with this new one I get

sage -t -long "devel/sage-main/sage/algebras/free_algebra.py"
The doctested process was killed by signal 11
         [1.4 s]

But if I add "-verbose" the test pass.

That is very likely caused by an upstream bug in Cython that became immanent with #715.

Can you try with the new Cython spkg from #13896? I.e., install that spkg and do sage -ba.

comment:36 Changed 7 years ago by fbissey

Thanks Simon. I have no time to do it tonight, I'll try tomorrow. As far as I am concerned this is the only thing that stand against a positive review. But I suspect it is as you say, and is just something bitting from another corner.

comment:37 follow-up: Changed 7 years ago by fbissey

  • Reviewers set to François Bissey
  • Status changed from needs_review to positive_review

Yup. Upgrading cython fixed the problem. I also checked that the thing would build properly not just upgrade from an existing sage. I am giving this a positive review.

comment:38 in reply to: ↑ 37 Changed 7 years ago by SimonKing

  • Dependencies set to #13896

Replying to fbissey:

Yup. Upgrading cython fixed the problem. I also checked that the thing would build properly not just upgrade from an existing sage. I am giving this a positive review.

Great! To be on the safe side, I add #13896 as a dependency, even though #13896 is a blocker with positive review (hence, is assumed to be in the next release).

comment:39 follow-up: Changed 7 years ago by jdemeyer

  • Milestone changed from sage-5.6 to sage-5.7

comment:40 in reply to: ↑ 39 Changed 7 years ago by ohanar

What is preventing this from being merged in 5.6?

comment:41 Changed 7 years ago by jdemeyer

Nothing really, but I feel like this should be merged while removing ZODB.

comment:42 Changed 7 years ago by fbissey

I am not fussy about it. I think #10353 is ready but if you want it in two stages that's fine. Zodb is now gone from sage-on-gentoo and maintenance wise we won't miss it :).

comment:43 Changed 7 years ago by jdemeyer

  • Status changed from positive_review to needs_work

The sage-root patch needs to be rebased to sage-5.6.beta3.

Changed 7 years ago by fbissey

updated sage_root repo patch

comment:44 Changed 7 years ago by fbissey

  • Description modified (diff)
  • Status changed from needs_work to needs_review

I put a rebased version of the root repo patch and updated the ticket instructions accordingly. The rebasing was done against 5.6.rc0.

comment:45 Changed 7 years ago by fbissey

  • Status changed from needs_review to positive_review

Putting back to positive review.

comment:46 Changed 7 years ago by jdemeyer

  • Description modified (diff)

comment:47 Changed 7 years ago by jdemeyer

  • Description modified (diff)
  • Status changed from positive_review to needs_work

Changed 7 years ago by jdemeyer

comment:48 Changed 7 years ago by jdemeyer

This needs to depend on $(SAGERUNTIME), not only the Sage library because we actually run Sage code. See new patch.

Moreover, this causes build problems for sagenb:

Processing dependencies for Twisted==12.1.0
Searching for zope.interface

Link to http://pypi.python.org/simple/zope.interface/ ***BLOCKED*** by --allow-hosts

Couldn't find index page for 'zope.interface' (maybe misspelled?)
Scanning index of all packages (this may take a while)

Link to http://pypi.python.org/simple/ ***BLOCKED*** by --allow-hosts

No local packages or download links found for zope.interface
error: Could not find suitable distribution for Requirement.parse('zope.interface')
Error installing Twisted-12.1.0.tar.bz2 !

comment:49 follow-up: Changed 7 years ago by fbissey

SAGERUNTIME? I hadn't seen that. I should have paid more attention to the originating ticket. OK I hadn't thought of twisted but the comment really is for #10353. I shall try to update the sagenb spkg to include zope.interface sometime this week.

comment:50 in reply to: ↑ 49 Changed 7 years ago by jdemeyer

Replying to fbissey:

SAGERUNTIME? I hadn't seen that.

Basically, $(INST)/$(SAGE) is purely the Sage library (i.e. the stuff in devel/sage). But actually starting Sage or running Sage code requires more, because things are imported from other packages, for example the notebook. That stuff is in $(SAGERUNTIME).

It used to be the case that $(INST)/$(SAGE) was what is essentially $(SAGERUNTIME) now. But that's silly because it adds extra unneeded build-time dependencies for $(INST)/$(SAGE), which needs a long time to build.

Last edited 7 years ago by jdemeyer (previous) (diff)

comment:51 follow-up: Changed 7 years ago by fbissey

Understood. But zope.interface shouldn't be a blocker for this ticket, only for #10353.

comment:52 in reply to: ↑ 51 Changed 7 years ago by jdemeyer

Replying to fbissey:

Understood. But zope.interface shouldn't be a blocker for this ticket, only for #10353.

Fair enough, I always confuse these two tickets.

comment:53 Changed 7 years ago by jdemeyer

  • Status changed from needs_work to positive_review

comment:54 Changed 7 years ago by jdemeyer

  • Dependencies changed from #13896 to #13896, #13963

Potential build failures due to #13963.

comment:55 Changed 7 years ago by jdemeyer

Could this have caused (note this is over GF(8)):

sage -t  "devel/sage-main/sage/rings/polynomial/polynomial_ring.py"
**********************************************************************
File "/release/merger/sage-5.7.beta0/devel/sage-main/sage/rings/polynomial/polynomial_ring.py", line 1641:
    sage: R.lagrange_polynomial([(a^2+a,a),(a,1),(a^2,a^2+a+1)], algorithm="neville", previous_row=p)[-1]
Expected:
    a^2*x^2 + a^2*x + a^2
Got:
    a^2*x^2 + a*x + a^2 + a
**********************************************************************

comment:56 follow-up: Changed 7 years ago by fbissey

Is it a new doctest? I see this on unreleased 5.7.beta0. I agressively patched sage-on-gentoo (5.5) to get rid of zodb and we haven't seen that anywhere. I'll admit the move to new gap is giving me trouble but it should be unrelated - or should it?

comment:57 in reply to: ↑ 56 Changed 7 years ago by jdemeyer

Replying to fbissey:

Is it a new doctest?

No, it's not. On first sight, it doesn't seem to use anything besides basic polynomial arithmetic. But if basic polynomial arithmetic were broken, we would see a lot more doctests failing.

I'm not at all saying that this ticket has anything to do with it, I merge a bunch of tickets and when there's a failure I have to guess which ticket caused it.

comment:58 Changed 7 years ago by jdemeyer

  • Merged in set to sage-5.7.beta0
  • Resolution set to fixed
  • Status changed from positive_review to closed

comment:59 Changed 7 years ago by kcrisman

Has anybody noticed the very very long string of primes you get when installing this spkg yet? This is with beta1 + #6495, but I assume it is part of this spkg. See the log.

comment:60 follow-up: Changed 7 years ago by fbissey

It is part of the spkg. Don't ask me why ohanar put a "print p " in there. I don't mind the spam but I had a complaint about it in sage-on-gentoo and removed it there.

comment:61 in reply to: ↑ 60 ; follow-up: Changed 7 years ago by kcrisman

It is part of the spkg. Don't ask me why ohanar put a "print p " in there. I don't mind the spam but I had a complaint about it in sage-on-gentoo and removed it there.

Don't worry, it wasn't directed at anyone in particular! Presumably it's something left over from debugging. Shall I assume this is to be considered non-optimal and a ticket should be opened?

comment:62 Changed 7 years ago by kini

fbissey: sage-on-gentoo is like a third level of ticket review after the official reviewer and the release manager! :D

comment:63 in reply to: ↑ 61 Changed 7 years ago by kcrisman

It is part of the spkg. Don't ask me why ohanar put a "print p " in there. I don't mind the spam but I had a complaint about it in sage-on-gentoo and removed it there.

Don't worry, it wasn't directed at anyone in particular! Presumably it's something left over from debugging. Shall I assume this is to be considered non-optimal and a ticket should be opened?

So... should I open a ticket? It does seem bizarre, and apparently easily fixed? But if it was necessary, I don't want to.

comment:64 follow-up: Changed 7 years ago by fbissey

That's very easy to fix but opening a ticket is pretty much the only way it will ever get merged.

comment:65 in reply to: ↑ 64 Changed 7 years ago by kcrisman

That's very easy to fix but opening a ticket is pretty much the only way it will ever get merged.

#14075

Note: See TracTickets for help on using tickets.