Opened 4 years ago
Closed 3 years ago
#21725 closed task (invalid)
Infrastructure for conversions of Sage number types for external Cython/Python packages (fpylll, pypolymake, ...)
Reported by: | mkoeppe | Owned by: | |
---|---|---|---|
Priority: | major | Milestone: | sage-duplicate/invalid/wontfix |
Component: | interfaces | Keywords: | |
Cc: | malb, jdemeyer, vdelecroix, infinity0, mmezzarobba, tmonteil, dimpase, slelievre, defeo, Winfried | Merged in: | |
Authors: | Reviewers: | Jeroen Demeyer | |
Report Upstream: | N/A | Work issues: | |
Branch: | Commit: | ||
Dependencies: | Stopgaps: |
Description (last modified by )
There is a current trend to develop Cython libraries as separate packages, rather than as part of Sage.
Many of these libraries take numbers of various kinds as input or produce them as output.
It would be nice if these Cython libraries could do conversions from/to Sage number types. Otherwise, on the Sage side we would have to provide a separate wrapper around the Cython wrapper.
Currently, this leads to
- awkward dependencies (such as fpylll depending on sagelib - see this sage-devel discussion and #21728) and to
- duplication of effort (for example, PyPolymake? and its GMP wrappers)
We need to develop shared infrastructure (a new lightweight library, which does NOT depend on sagelib) for this.
Change History (52)
comment:1 Changed 4 years ago by
- Cc mmezzarobba added
comment:2 in reply to: ↑ description ; follow-up: ↓ 4 Changed 4 years ago by
- Milestone changed from sage-7.5 to sage-duplicate/invalid/wontfix
- Status changed from new to needs_review
comment:3 follow-up: ↓ 7 Changed 4 years ago by
If a Python call + a copy per number conversion it not too high a price to pay, then perhaps sage objects backed by a “well-known” C library could provide an interface that would make them read/write an mpz_t
(resp. mpq_t
, mpfr_t
, mpfi_t
...) from/to a specified location? This would at least make it possible to exchange multiple-precision numbers with sage without build-depending on it.
comment:4 in reply to: ↑ 2 Changed 4 years ago by
comment:5 Changed 4 years ago by
- Milestone changed from sage-duplicate/invalid/wontfix to sage-wishlist
comment:6 Changed 4 years ago by
I mean, how could it ever be possible to convert something to Sage types (I assume you mean with an efficient C interface) without depending on Sage?
comment:7 in reply to: ↑ 3 ; follow-up: ↓ 16 Changed 4 years ago by
Replying to mmezzarobba:
If a Python call + a copy per number conversion it not too high a price to pay, then perhaps sage objects backed by a “well-known” C library could provide an interface that would make them read/write an
mpz_t
(resp.mpq_t
,mpfr_t
,mpfi_t
...) from/to a specified location?
Sage types already have such an interface using Cython. So you want to add something like add using a Python interface? I presume using a PyCapsule
?
To be honest, I don't really understand the use case for this ticket. Why would you want to convert something to Sage without depending on Sage?
comment:8 in reply to: ↑ description ; follow-up: ↓ 10 Changed 4 years ago by
Replying to mkoeppe:
- duplication of effort (for example, PyPolymake? and its GMP wrappers)
Can you elaborate on this?
comment:9 follow-ups: ↓ 11 ↓ 13 ↓ 15 ↓ 26 Changed 4 years ago by
For example, it would not seem so hard to provide a C library with a minimal interface such as
int mpz_from_python_integer(mpz_t target, PyObject *object)
that fills target
with a GMP integer converted from object
and returns an error code if it can't.
By default (in plain Python), it would only deal with Python integers.
But libraries that provide new integer types such as sagelib can hook into this, for example by defining a method or any other of Python's dynamic mechanisms.
For the opposite conversion, for converting to the Python/Sage? world, one would pass an extra argument to the wrapped function that indicates the type of output (Python integer? Sage integer? ...)
comment:10 in reply to: ↑ 8 Changed 4 years ago by
Replying to jdemeyer:
Replying to mkoeppe:
- duplication of effort (for example, PyPolymake? and its GMP wrappers)
Can you elaborate on this?
Various libraries that I am currently interested in (polymake, normaliz) need Cython wrappers, which should be usable with or without Sage. The interfaces are using integers (and integer vectors, matrices, and more complicated structures) - and one should be able to request Python integers or Sage integers from the Cython wrapper. However, it should definitely not build-depend on sagelib (because we will need the opposite dependency -- sagelib will have to build-depend on pyPolymake, pyNormaliz etc.)
comment:11 in reply to: ↑ 9 ; follow-up: ↓ 12 Changed 4 years ago by
Replying to mkoeppe:
For the opposite conversion, for converting to the Python/Sage? world, one would pass an extra argument to the wrapped function that indicates the type of output (Python integer? Sage integer? ...)
For example C library prototype:
PyObject *mpz_to_python_integer(mpz_t source, PyObject *output_type)
comment:12 in reply to: ↑ 11 Changed 4 years ago by
Replying to mkoeppe:
Replying to mkoeppe:
For the opposite conversion, for converting to the Python/Sage? world, one would pass an extra argument to the wrapped function that indicates the type of output (Python integer? Sage integer? ...)
For example C library prototype:
PyObject *mpz_to_python_integer(mpz_t source, PyObject *output_type)
This is exactly the kind of thing I am fighting with in pypolymake. Actually pypolymake could easily provide a function to set the interface
int set_pypolymake_mpz_wrapper((PyObject *) (mpz_t, PyObject *))
I am not sure what is the best approach here but +1 for having a standardized and Sage agnostic way of doing it! I am available to test as many approaches as suggested within pypolymake.
Note that we might want to do that for many basic types (integer, rational, floating point real and complexes, Zmodn, matrices, polynomials, ...). And this explosion might become a nightmare.
comment:13 in reply to: ↑ 9 ; follow-up: ↓ 21 Changed 4 years ago by
Replying to mkoeppe:
For example, it would not seem so hard to provide a C library
I hope that you mean a Python module and not a C library. Libraries are a pain in Python.
One thing we really need to consider is efficiency. We already have plenty of slow ways to do what you want (e.g. for integers we can go via Python ints, for other types we could use strings). So I would like a solution which is implementable 100% in Cython (in particular, there should be no Python methods/attributes involved).
comment:14 Changed 4 years ago by
- Component changed from PLEASE CHANGE to interfaces
comment:15 in reply to: ↑ 9 Changed 4 years ago by
There is this thing: https://pypi.python.org/pypi/gmpy2 which provides an interface to GMP and MPFR. It implements wrappers around the GMP/MPFR types, so it could probably be used to pass around mpz_t
pointers.
I don't think it was created with this use case in mind, but I still think that this could be a good starting point. We obviously would need to add a Cython interface.
comment:16 in reply to: ↑ 7 ; follow-up: ↓ 18 Changed 4 years ago by
Replying to jdemeyer:
Replying to mmezzarobba:
If a Python call + a copy per number conversion it not too high a price to pay, then perhaps sage objects backed by a “well-known” C library could provide an interface that would make them read/write an
mpz_t
(resp.mpq_t
,mpfr_t
,mpfi_t
...) from/to a specified location?Sage types already have such an interface using Cython. So you want to add something like add using a Python interface? I presume using a
PyCapsule
?
Yes, something like that. Or perhaps the buffer protocol, if that's more convenient for some reason? Or even by asking the caller to pass an instance of Python extension class of a certain fixed form...
To be honest, I don't really understand the use case for this ticket. Why would you want to convert something to Sage without depending on Sage?
More or less for the same reason as the Debian packagers: to make it possible for Python interfaces to software using GMP, MPFR & co to be used in conjunction with Sage without necessarily being compiled with Sage support. For example, I am involved in a research project for which we developed a Python interface to Sollya. In most cases, this interface is used to call Sollya from pure Python programs. In this mode, all computations with multiple-precision types (Sollya is based on MPFI) are done by the Sollya library and stay there. However, occasionally, it is also very useful to write code that uses both Sollya and Sage, and passes MPFI intervals back and forth. Currently, in order to do that, we need either to somehow encode these intervals as Python objects, or to build the interface in a special mode with Sage support. Now, the Python-Sollya interface is semi-private code with about four users, so that's not much of an issue for us. But if we were to clean it up and distribute it, supporting both modes would cause all kinds of headaches for no good reason.
I don't think it is reasonable to move the Sage number types to an external package, if only because the Sage library is full of circular dependencies between modules. (And I don't see how to have conversions that don't go through Python at all without doing something like that...) However, I don't see the problem with trying to standardize a way to exchange, say, GMP, MPFR, MPFI, and perhaps FLINT/Arb objects between Python libraries (as originally suggested), or at least having an ad hoc solution to import/export these kinds of objects to/from Sage (as suggested in my comment).
comment:17 Changed 4 years ago by
- Status changed from needs_review to needs_work
comment:18 in reply to: ↑ 16 Changed 4 years ago by
I am going to add a gmpy2
optional package to check whether we can do something with it: #21727
comment:19 Changed 4 years ago by
I have been looking at gmpy2
for the last hour or so. I think it could work to use the gmpy2
types as intermediates for the proposed conversions.
The big issue is that gmpy2
currently does not expose a C or Cython interface. So either we need to work with upstream to implement that or we need to fork gmpy2
.
comment:20 Changed 4 years ago by
- Cc infinity0 added; infinity0@… removed
comment:21 in reply to: ↑ 13 ; follow-up: ↓ 22 Changed 4 years ago by
Replying to jdemeyer:
Replying to mkoeppe:
For example, it would not seem so hard to provide a C library
I hope that you mean a Python module and not a C library. Libraries are a pain in Python.
The C prototypes that I wrote are an illustration of what I would do if we were making the Python wrappers by hand instead of using Cython.
Of course, we should be making a Cython library with corresponding Cython prototypes.
comment:22 in reply to: ↑ 21 ; follow-up: ↓ 23 Changed 4 years ago by
Replying to mkoeppe:
Replying to jdemeyer:
Replying to mkoeppe:
For example, it would not seem so hard to provide a C library
I hope that you mean a Python module and not a C library. Libraries are a pain in Python.
The C prototypes that I wrote are an illustration of what I would do if we were making the Python wrappers by hand instead of using Cython.
Of course, we should be making a Cython library with corresponding Cython prototypes.
And does it look like a good idea to use gmpy2
for that? It already implements thin wrappers around the GMP types, which is precisely what we need.
comment:23 in reply to: ↑ 22 Changed 4 years ago by
Replying to jdemeyer:
Replying to mkoeppe:
Of course, we should be making a Cython library with corresponding Cython prototypes.
And does it look like a good idea to use
gmpy2
for that? It already implements thin wrappers around the GMP types, which is precisely what we need.
I'll take a look at gmpy2
soon to find out.
comment:24 Changed 4 years ago by
- Description modified (diff)
comment:25 follow-up: ↓ 28 Changed 4 years ago by
My idea would be:
- Add an
__mpz__
method to various types. This will be used to convert an "arbitrary" object to agmpy2
integer (which is just a thin wrapper around anmpz_t
). This conversion will not be very fast, but probably fast enough for most use cases.
- For the cases where we have complete control over the types, we could use Cython code to construct a
gmpy2
integer using the C interface.
- For conversion from
gmpy2
to a typet
, we just need to add someisinstance(x, gmpy2.mpz)
branches int.__init__
.
As I said before, this would require non-trivial (but backwards compatible) changes to gmpy2
.
comment:26 in reply to: ↑ 9 Changed 4 years ago by
Replying to mkoeppe:
For example, it would not seem so hard to provide a C library with a minimal interface such as
int mpz_from_python_integer(mpz_t target, PyObject *object)that fills
target
with a GMP integer converted fromobject
and returns an error code if it can't.
With my idea, this function would be just the constructor of the gmpy2.mpz
type.
comment:27 Changed 4 years ago by
- Cc tmonteil added
comment:28 in reply to: ↑ 25 ; follow-ups: ↓ 29 ↓ 30 Changed 4 years ago by
Replying to jdemeyer:
My idea would be:
- Add an
__mpz__
method to various types. This will be used to convert an "arbitrary" object to agmpy2
integer (which is just a thin wrapper around anmpz_t
). This conversion will not be very fast, but probably fast enough for most use cases.
- For the cases where we have complete control over the types, we could use Cython code to construct a
gmpy2
integer using the C interface.
- For conversion from
gmpy2
to a typet
, we just need to add someisinstance(x, gmpy2.mpz)
branches int.__init__
.As I said before, this would require non-trivial (but backwards compatible) changes to
gmpy2
.
Why do you think it would be better to go through Python wrappers rather than direct C function calls? Note that such situation would create a dependency to gmpy2 for many projects. On the plus side, flint and arb have python-flint.
That being said, what would be the integration of these wrappers in the Sage ecosystem? For example with respect to coercion, would it make sense to have Sage types inheriting from these wrappers, etc.
comment:29 in reply to: ↑ 28 Changed 4 years ago by
Replying to vdelecroix:
Why do you think it would be better to go through Python wrappers rather than direct C function calls?
I just don't see how one could implement that if you don't want build-time dependencies and you don't want Python objects to be involved. You need to pass around mpz_t
pointers but you cannot literally do that in Python: you need to wrap the mpz_t
pointer in some Python object. And that's precisely what gmpy2
does.
Let me also point out that gmpy2
does object caching (like Sage Integer
), such that object creation/destruction can be very fast.
Note that such situation would create a dependency to gmpy2 for many projects.
You will need a dependency on something anyway. This feature needs to be implemented somewhere (not in thin air) and you will need to depend on that.
That being said, what would be the integration of these wrappers in the Sage ecosystem? For example with respect to coercion, would it make sense to have Sage types inheriting from these wrappers, etc.
I see those gmpy2
types purely as a way of passing around data. There should be a conversion from/to those types, but nothing more than that. Inheriting from those types wouldn't work because that's incompatible with the Sage coercion model.
We could still support those types in the Sage coercion model, like we currently do for the standard Python types like int
. But that's not a priority.
comment:30 in reply to: ↑ 28 ; follow-up: ↓ 32 Changed 4 years ago by
Replying to vdelecroix:
On the plus side, flint and arb have python-flint.
I don't really understand what you mean with that comment. Are you saying that we could use this instead of gmpy2
? One issue is the fact that GMP/MPIR types are more "standard" than the FLINT types. A second issue is that gmpy2
looks more optimized for speed than python-flint
.
comment:31 follow-up: ↓ 33 Changed 4 years ago by
Replying to jdemeyer:
Replying to vdelecroix:
Why do you think it would be better to go through Python wrappers rather than direct C function calls?
I just don't see how one could implement that if you don't want build-time dependencies and you don't want Python objects to be involved. You need to pass around
mpz_t
pointers but you cannot literally do that in Python: you need to wrap thempz_t
pointer in some Python object. And that's precisely whatgmpy2
does.Let me also point out that
gmpy2
does object caching (like SageInteger
), such that object creation/destruction can be very fast.
This is not exactly what I need for pypolymake. This library is just a Python interface to polymake (that is currently a mix of C++ and Perl). When the answer to some method is an integer I want the possibility to return a Sage integer (if it is used from Sage), a gmpy2 integer (if it is used from Python with gmpy2 available) or a Python integer.
This is the kind of template I have
cpdef my_integer_question(): cdef mpz_t ans my_polymake_function(ans) return mpz_wrapper(ans)
And I want this mpz_wrapper
parametrizable (on module loading or by the user). This is currently working but not very clean.
With your proposal, this function will have to go through gmpy2
and then call a (possibly pure Python) conversion gmpy2 -> whatever
.
cpdef my_integer_question(): cdef mpz_t ans my_polymake_function(ans) my_gmpy2 = mpz_wrapper(ans) return gmpy2_converter(my_gmpy2)
I am not sure I prefer the gmpy2 version.
comment:32 in reply to: ↑ 30 Changed 4 years ago by
Replying to jdemeyer:
Replying to vdelecroix:
On the plus side, flint and arb have python-flint.
I don't really understand what you mean with that comment. Are you saying that we could use this instead of
gmpy2
? One issue is the fact that GMP/MPIR types are more "standard" than the FLINT types. A second issue is thatgmpy2
looks more optimized for speed thanpython-flint
.
This is a follow-up comment: we also want to support flint and arb conversions and there are already small Python wrappers for that purpose. It is disjoint from the gmp problem but it was worth mentioning to help taking a decision.
comment:33 in reply to: ↑ 31 Changed 4 years ago by
Replying to vdelecroix:
With your proposal, this function will have to go through
gmpy2
and then call a (possibly pure Python) conversiongmpy2 -> whatever
.
At least it's possible to do a slow-but-working pure Python conversion.
comment:34 Changed 4 years ago by
I think we are talking about two different things:
- Implementing a way for Python projects to efficiently communicate integers between them, without build-time dependencies (the
fpylll
problem).
- A customizable
make_whatever_integer_from_mpz()
function (thepypolymake
problem)
These are different problems and I consider this ticket to be about 1. (but the ticket description is not clear what it really is about).
comment:35 follow-up: ↓ 37 Changed 4 years ago by
And actually, 2. is totally trivial to implement in like 10 lines of Cython code (you just need to store a pointer to a conversion function in a global variable). Ideally, this would be added to some existing project because it seems a bit overkill to create a separate Python project for just those 10 lines.
comment:36 Changed 4 years ago by
- Cc dimpase added
comment:37 in reply to: ↑ 35 ; follow-up: ↓ 38 Changed 4 years ago by
Replying to jdemeyer:
And actually, 2. is totally trivial to implement in like 10 lines of Cython code (you just need to store a pointer to a conversion function in a global variable).
I would not like to see Cython libraries to use global variables for that; it seems a short-sighted design decision. If library A does that, but then libraries B and C use library A, they may need library A to use two different output conversions.
Also fpylll's current solution, namely making Sage integers instead of Python integers whenever it finds sage.all in the system, seems like a poor design decision as well. If another Python library uses fpylll, it may break when fpylll decides to use Sage integers.
comment:38 in reply to: ↑ 37 Changed 4 years ago by
Replying to mkoeppe:
I would not like to see Cython libraries to use global variables for that; it seems a short-sighted design decision. If library A does that, but then libraries B and C use library A, they may need library A to use two different output conversions.
I actually think that a global variable is the whole point of the requested feature. You want all libraries to use a common given output format requested by the user (not other libraries).
comment:39 follow-up: ↓ 40 Changed 4 years ago by
It's a cleaner design to pass an extra argument around.
In my experience, global variables never solve a problem but only create new ones.
comment:40 in reply to: ↑ 39 Changed 4 years ago by
Replying to mkoeppe:
It's a cleaner design to pass an extra argument around.
Or just return something generic (like a gmpy2
instance) and let the user code convert it if needed.
In my experience, global variables never solve a problem but only create new ones.
Vincent is the one who asked for this feature, so I'll let him discuss it with you.
Anyway, I would still like to know what you think of my idea to use gmpy2
as "de facto" standard to pass around integers.
comment:41 Changed 4 years ago by
- Cc slelievre added
comment:42 Changed 4 years ago by
- Cc defeo added
comment:43 Changed 4 years ago by
- Cc Winfried added
comment:44 Changed 4 years ago by
- Milestone changed from sage-wishlist to sage-duplicate/invalid/wontfix
- Status changed from needs_work to positive_review
Given the unclear ticket description and the lack of interest, I suggest to close this as invalid.
I am not saying that there is nothing to be done here, just that this ticket is too vague. It seems that everybody has a different idea about what it means.
If you want some feature in the spirit of this ticket, please open a new ticket, clearly stating what you want.
comment:45 follow-up: ↓ 47 Changed 4 years ago by
- Milestone changed from sage-duplicate/invalid/wontfix to sage-wishlist
Uh, we are working on it. Greeting from Sage Days.
comment:46 Changed 4 years ago by
- Status changed from positive_review to needs_work
comment:47 in reply to: ↑ 45 Changed 4 years ago by
comment:48 Changed 4 years ago by
ping
comment:49 Changed 4 years ago by
These are two weeks Sage days. Why are you so picky about this ticket?
comment:50 Changed 4 years ago by
Does it really take 2 weeks to figure out what the ticket is about?
I'm picky about this one because I feel like I might be interested and I could help if only somebody would clarify what this ticket is about.
In reality, I guess that this ticket is really about several things and each of those things should get its own ticket.
comment:51 Changed 3 years ago by
Another case at hand is #23024...
comment:52 Changed 3 years ago by
- Milestone changed from sage-wishlist to sage-duplicate/invalid/wontfix
- Resolution set to invalid
- Reviewers set to Jeroen Demeyer
- Status changed from needs_work to closed
11 months and still no explanation.
Replying to mkoeppe:
That's not going to happen.