Opened 4 years ago

Closed 3 years ago

#21725 closed task (invalid)

Infrastructure for conversions of Sage number types for external Cython/Python packages (fpylll, pypolymake, ...)

Reported by: mkoeppe Owned by:
Priority: major Milestone: sage-duplicate/invalid/wontfix
Component: interfaces Keywords:
Cc: malb, jdemeyer, vdelecroix, infinity0, mmezzarobba, tmonteil, dimpase, slelievre, defeo, Winfried Merged in:
Authors: Reviewers: Jeroen Demeyer
Report Upstream: N/A Work issues:
Branch: Commit:
Dependencies: Stopgaps:

Status badges

Description (last modified by mkoeppe)

There is a current trend to develop Cython libraries as separate packages, rather than as part of Sage.

Many of these libraries take numbers of various kinds as input or produce them as output.

It would be nice if these Cython libraries could do conversions from/to Sage number types. Otherwise, on the Sage side we would have to provide a separate wrapper around the Cython wrapper.

Currently, this leads to

We need to develop shared infrastructure (a new lightweight library, which does NOT depend on sagelib) for this.

Change History (52)

comment:1 Changed 4 years ago by mmezzarobba

  • Cc mmezzarobba added

comment:2 in reply to: ↑ description ; follow-up: Changed 4 years ago by jdemeyer

  • Milestone changed from sage-7.5 to sage-duplicate/invalid/wontfix
  • Status changed from new to needs_review

Replying to mkoeppe:

We need to develop shared infrastructure (a new lightweight library, which does NOT depend on sagelib) for this.

That's not going to happen.

comment:3 follow-up: Changed 4 years ago by mmezzarobba

If a Python call + a copy per number conversion it not too high a price to pay, then perhaps sage objects backed by a “well-known” C library could provide an interface that would make them read/write an mpz_t (resp. mpq_t, mpfr_t, mpfi_t...) from/to a specified location? This would at least make it possible to exchange multiple-precision numbers with sage without build-depending on it.

Last edited 4 years ago by mmezzarobba (previous) (diff)

comment:4 in reply to: ↑ 2 Changed 4 years ago by mkoeppe

Replying to jdemeyer:

Replying to mkoeppe:

We need to develop shared infrastructure (a new lightweight library, which does NOT depend on sagelib) for this.

That's not going to happen.

How do you know?

comment:5 Changed 4 years ago by mkoeppe

  • Milestone changed from sage-duplicate/invalid/wontfix to sage-wishlist

comment:6 Changed 4 years ago by jdemeyer

I mean, how could it ever be possible to convert something to Sage types (I assume you mean with an efficient C interface) without depending on Sage?

comment:7 in reply to: ↑ 3 ; follow-up: Changed 4 years ago by jdemeyer

Replying to mmezzarobba:

If a Python call + a copy per number conversion it not too high a price to pay, then perhaps sage objects backed by a “well-known” C library could provide an interface that would make them read/write an mpz_t (resp. mpq_t, mpfr_t, mpfi_t...) from/to a specified location?

Sage types already have such an interface using Cython. So you want to add something like add using a Python interface? I presume using a PyCapsule?

To be honest, I don't really understand the use case for this ticket. Why would you want to convert something to Sage without depending on Sage?

comment:8 in reply to: ↑ description ; follow-up: Changed 4 years ago by jdemeyer

Replying to mkoeppe:

  • duplication of effort (for example, PyPolymake? and its GMP wrappers)

Can you elaborate on this?

comment:9 follow-ups: Changed 4 years ago by mkoeppe

For example, it would not seem so hard to provide a C library with a minimal interface such as

int mpz_from_python_integer(mpz_t target, PyObject *object)

that fills target with a GMP integer converted from object and returns an error code if it can't.

By default (in plain Python), it would only deal with Python integers.

But libraries that provide new integer types such as sagelib can hook into this, for example by defining a method or any other of Python's dynamic mechanisms.

For the opposite conversion, for converting to the Python/Sage? world, one would pass an extra argument to the wrapped function that indicates the type of output (Python integer? Sage integer? ...)

comment:10 in reply to: ↑ 8 Changed 4 years ago by mkoeppe

Replying to jdemeyer:

Replying to mkoeppe:

  • duplication of effort (for example, PyPolymake? and its GMP wrappers)

Can you elaborate on this?

Various libraries that I am currently interested in (polymake, normaliz) need Cython wrappers, which should be usable with or without Sage. The interfaces are using integers (and integer vectors, matrices, and more complicated structures) - and one should be able to request Python integers or Sage integers from the Cython wrapper. However, it should definitely not build-depend on sagelib (because we will need the opposite dependency -- sagelib will have to build-depend on pyPolymake, pyNormaliz etc.)

comment:11 in reply to: ↑ 9 ; follow-up: Changed 4 years ago by mkoeppe

Replying to mkoeppe:

For the opposite conversion, for converting to the Python/Sage? world, one would pass an extra argument to the wrapped function that indicates the type of output (Python integer? Sage integer? ...)

For example C library prototype:

PyObject *mpz_to_python_integer(mpz_t source, PyObject *output_type)

comment:12 in reply to: ↑ 11 Changed 4 years ago by vdelecroix

Replying to mkoeppe:

Replying to mkoeppe:

For the opposite conversion, for converting to the Python/Sage? world, one would pass an extra argument to the wrapped function that indicates the type of output (Python integer? Sage integer? ...)

For example C library prototype:

PyObject *mpz_to_python_integer(mpz_t source, PyObject *output_type)

This is exactly the kind of thing I am fighting with in pypolymake. Actually pypolymake could easily provide a function to set the interface

int set_pypolymake_mpz_wrapper((PyObject *) (mpz_t, PyObject *))

I am not sure what is the best approach here but +1 for having a standardized and Sage agnostic way of doing it! I am available to test as many approaches as suggested within pypolymake.

Note that we might want to do that for many basic types (integer, rational, floating point real and complexes, Zmodn, matrices, polynomials, ...). And this explosion might become a nightmare.

comment:13 in reply to: ↑ 9 ; follow-up: Changed 4 years ago by jdemeyer

Replying to mkoeppe:

For example, it would not seem so hard to provide a C library

I hope that you mean a Python module and not a C library. Libraries are a pain in Python.

One thing we really need to consider is efficiency. We already have plenty of slow ways to do what you want (e.g. for integers we can go via Python ints, for other types we could use strings). So I would like a solution which is implementable 100% in Cython (in particular, there should be no Python methods/attributes involved).

comment:14 Changed 4 years ago by jdemeyer

  • Component changed from PLEASE CHANGE to interfaces

comment:15 in reply to: ↑ 9 Changed 4 years ago by jdemeyer

There is this thing: https://pypi.python.org/pypi/gmpy2 which provides an interface to GMP and MPFR. It implements wrappers around the GMP/MPFR types, so it could probably be used to pass around mpz_t pointers.

I don't think it was created with this use case in mind, but I still think that this could be a good starting point. We obviously would need to add a Cython interface.

comment:16 in reply to: ↑ 7 ; follow-up: Changed 4 years ago by mmezzarobba

Replying to jdemeyer:

Replying to mmezzarobba:

If a Python call + a copy per number conversion it not too high a price to pay, then perhaps sage objects backed by a “well-known” C library could provide an interface that would make them read/write an mpz_t (resp. mpq_t, mpfr_t, mpfi_t...) from/to a specified location?

Sage types already have such an interface using Cython. So you want to add something like add using a Python interface? I presume using a PyCapsule?

Yes, something like that. Or perhaps the buffer protocol, if that's more convenient for some reason? Or even by asking the caller to pass an instance of Python extension class of a certain fixed form...

To be honest, I don't really understand the use case for this ticket. Why would you want to convert something to Sage without depending on Sage?

More or less for the same reason as the Debian packagers: to make it possible for Python interfaces to software using GMP, MPFR & co to be used in conjunction with Sage without necessarily being compiled with Sage support. For example, I am involved in a research project for which we developed a Python interface to Sollya. In most cases, this interface is used to call Sollya from pure Python programs. In this mode, all computations with multiple-precision types (Sollya is based on MPFI) are done by the Sollya library and stay there. However, occasionally, it is also very useful to write code that uses both Sollya and Sage, and passes MPFI intervals back and forth. Currently, in order to do that, we need either to somehow encode these intervals as Python objects, or to build the interface in a special mode with Sage support. Now, the Python-Sollya interface is semi-private code with about four users, so that's not much of an issue for us. But if we were to clean it up and distribute it, supporting both modes would cause all kinds of headaches for no good reason.

I don't think it is reasonable to move the Sage number types to an external package, if only because the Sage library is full of circular dependencies between modules. (And I don't see how to have conversions that don't go through Python at all without doing something like that...) However, I don't see the problem with trying to standardize a way to exchange, say, GMP, MPFR, MPFI, and perhaps FLINT/Arb objects between Python libraries (as originally suggested), or at least having an ad hoc solution to import/export these kinds of objects to/from Sage (as suggested in my comment).

Last edited 4 years ago by mmezzarobba (previous) (diff)

comment:17 Changed 4 years ago by mmezzarobba

  • Status changed from needs_review to needs_work

comment:18 in reply to: ↑ 16 Changed 4 years ago by jdemeyer

I am going to add a gmpy2 optional package to check whether we can do something with it: #21727

comment:19 Changed 4 years ago by jdemeyer

I have been looking at gmpy2 for the last hour or so. I think it could work to use the gmpy2 types as intermediates for the proposed conversions.

The big issue is that gmpy2 currently does not expose a C or Cython interface. So either we need to work with upstream to implement that or we need to fork gmpy2.

comment:20 Changed 4 years ago by infinity0

  • Cc infinity0 added; infinity0@… removed

comment:21 in reply to: ↑ 13 ; follow-up: Changed 4 years ago by mkoeppe

Replying to jdemeyer:

Replying to mkoeppe:

For example, it would not seem so hard to provide a C library

I hope that you mean a Python module and not a C library. Libraries are a pain in Python.

The C prototypes that I wrote are an illustration of what I would do if we were making the Python wrappers by hand instead of using Cython.

Of course, we should be making a Cython library with corresponding Cython prototypes.

comment:22 in reply to: ↑ 21 ; follow-up: Changed 4 years ago by jdemeyer

Replying to mkoeppe:

Replying to jdemeyer:

Replying to mkoeppe:

For example, it would not seem so hard to provide a C library

I hope that you mean a Python module and not a C library. Libraries are a pain in Python.

The C prototypes that I wrote are an illustration of what I would do if we were making the Python wrappers by hand instead of using Cython.

Of course, we should be making a Cython library with corresponding Cython prototypes.

And does it look like a good idea to use gmpy2 for that? It already implements thin wrappers around the GMP types, which is precisely what we need.

comment:23 in reply to: ↑ 22 Changed 4 years ago by mkoeppe

Replying to jdemeyer:

Replying to mkoeppe:

Of course, we should be making a Cython library with corresponding Cython prototypes.

And does it look like a good idea to use gmpy2 for that? It already implements thin wrappers around the GMP types, which is precisely what we need.

I'll take a look at gmpy2 soon to find out.

comment:24 Changed 4 years ago by mkoeppe

  • Description modified (diff)

comment:25 follow-up: Changed 4 years ago by jdemeyer

My idea would be:

  • Add an __mpz__ method to various types. This will be used to convert an "arbitrary" object to a gmpy2 integer (which is just a thin wrapper around an mpz_t). This conversion will not be very fast, but probably fast enough for most use cases.
  • For the cases where we have complete control over the types, we could use Cython code to construct a gmpy2 integer using the C interface.
  • For conversion from gmpy2 to a type t, we just need to add some isinstance(x, gmpy2.mpz) branches in t.__init__.

As I said before, this would require non-trivial (but backwards compatible) changes to gmpy2.

comment:26 in reply to: ↑ 9 Changed 4 years ago by jdemeyer

Replying to mkoeppe:

For example, it would not seem so hard to provide a C library with a minimal interface such as

int mpz_from_python_integer(mpz_t target, PyObject *object)

that fills target with a GMP integer converted from object and returns an error code if it can't.

With my idea, this function would be just the constructor of the gmpy2.mpz type.

comment:27 Changed 4 years ago by tmonteil

  • Cc tmonteil added

comment:28 in reply to: ↑ 25 ; follow-ups: Changed 4 years ago by vdelecroix

Replying to jdemeyer:

My idea would be:

  • Add an __mpz__ method to various types. This will be used to convert an "arbitrary" object to a gmpy2 integer (which is just a thin wrapper around an mpz_t). This conversion will not be very fast, but probably fast enough for most use cases.
  • For the cases where we have complete control over the types, we could use Cython code to construct a gmpy2 integer using the C interface.
  • For conversion from gmpy2 to a type t, we just need to add some isinstance(x, gmpy2.mpz) branches in t.__init__.

As I said before, this would require non-trivial (but backwards compatible) changes to gmpy2.

Why do you think it would be better to go through Python wrappers rather than direct C function calls? Note that such situation would create a dependency to gmpy2 for many projects. On the plus side, flint and arb have python-flint.

That being said, what would be the integration of these wrappers in the Sage ecosystem? For example with respect to coercion, would it make sense to have Sage types inheriting from these wrappers, etc.

comment:29 in reply to: ↑ 28 Changed 4 years ago by jdemeyer

Replying to vdelecroix:

Why do you think it would be better to go through Python wrappers rather than direct C function calls?

I just don't see how one could implement that if you don't want build-time dependencies and you don't want Python objects to be involved. You need to pass around mpz_t pointers but you cannot literally do that in Python: you need to wrap the mpz_t pointer in some Python object. And that's precisely what gmpy2 does.

Let me also point out that gmpy2 does object caching (like Sage Integer), such that object creation/destruction can be very fast.

Note that such situation would create a dependency to gmpy2 for many projects.

You will need a dependency on something anyway. This feature needs to be implemented somewhere (not in thin air) and you will need to depend on that.

That being said, what would be the integration of these wrappers in the Sage ecosystem? For example with respect to coercion, would it make sense to have Sage types inheriting from these wrappers, etc.

I see those gmpy2 types purely as a way of passing around data. There should be a conversion from/to those types, but nothing more than that. Inheriting from those types wouldn't work because that's incompatible with the Sage coercion model.

We could still support those types in the Sage coercion model, like we currently do for the standard Python types like int. But that's not a priority.

comment:30 in reply to: ↑ 28 ; follow-up: Changed 4 years ago by jdemeyer

Replying to vdelecroix:

On the plus side, flint and arb have python-flint.

I don't really understand what you mean with that comment. Are you saying that we could use this instead of gmpy2? One issue is the fact that GMP/MPIR types are more "standard" than the FLINT types. A second issue is that gmpy2 looks more optimized for speed than python-flint.

comment:31 follow-up: Changed 4 years ago by vdelecroix

Replying to jdemeyer:

Replying to vdelecroix:

Why do you think it would be better to go through Python wrappers rather than direct C function calls?

I just don't see how one could implement that if you don't want build-time dependencies and you don't want Python objects to be involved. You need to pass around mpz_t pointers but you cannot literally do that in Python: you need to wrap the mpz_t pointer in some Python object. And that's precisely what gmpy2 does.

Let me also point out that gmpy2 does object caching (like Sage Integer), such that object creation/destruction can be very fast.

This is not exactly what I need for pypolymake. This library is just a Python interface to polymake (that is currently a mix of C++ and Perl). When the answer to some method is an integer I want the possibility to return a Sage integer (if it is used from Sage), a gmpy2 integer (if it is used from Python with gmpy2 available) or a Python integer.

This is the kind of template I have

cpdef my_integer_question():
    cdef mpz_t ans
    my_polymake_function(ans)
    return mpz_wrapper(ans)

And I want this mpz_wrapper parametrizable (on module loading or by the user). This is currently working but not very clean.

With your proposal, this function will have to go through gmpy2 and then call a (possibly pure Python) conversion gmpy2 -> whatever.

cpdef my_integer_question():
    cdef mpz_t ans
    my_polymake_function(ans)
    my_gmpy2 = mpz_wrapper(ans)
    return gmpy2_converter(my_gmpy2)

I am not sure I prefer the gmpy2 version.

comment:32 in reply to: ↑ 30 Changed 4 years ago by vdelecroix

Replying to jdemeyer:

Replying to vdelecroix:

On the plus side, flint and arb have python-flint.

I don't really understand what you mean with that comment. Are you saying that we could use this instead of gmpy2? One issue is the fact that GMP/MPIR types are more "standard" than the FLINT types. A second issue is that gmpy2 looks more optimized for speed than python-flint.

This is a follow-up comment: we also want to support flint and arb conversions and there are already small Python wrappers for that purpose. It is disjoint from the gmp problem but it was worth mentioning to help taking a decision.

comment:33 in reply to: ↑ 31 Changed 4 years ago by jdemeyer

Replying to vdelecroix:

With your proposal, this function will have to go through gmpy2 and then call a (possibly pure Python) conversion gmpy2 -> whatever.

At least it's possible to do a slow-but-working pure Python conversion.

comment:34 Changed 4 years ago by jdemeyer

I think we are talking about two different things:

  1. Implementing a way for Python projects to efficiently communicate integers between them, without build-time dependencies (the fpylll problem).
  1. A customizable make_whatever_integer_from_mpz() function (the pypolymake problem)

These are different problems and I consider this ticket to be about 1. (but the ticket description is not clear what it really is about).

comment:35 follow-up: Changed 4 years ago by jdemeyer

And actually, 2. is totally trivial to implement in like 10 lines of Cython code (you just need to store a pointer to a conversion function in a global variable). Ideally, this would be added to some existing project because it seems a bit overkill to create a separate Python project for just those 10 lines.

Last edited 4 years ago by jdemeyer (previous) (diff)

comment:36 Changed 4 years ago by mkoeppe

  • Cc dimpase added

comment:37 in reply to: ↑ 35 ; follow-up: Changed 4 years ago by mkoeppe

Replying to jdemeyer:

And actually, 2. is totally trivial to implement in like 10 lines of Cython code (you just need to store a pointer to a conversion function in a global variable).

I would not like to see Cython libraries to use global variables for that; it seems a short-sighted design decision. If library A does that, but then libraries B and C use library A, they may need library A to use two different output conversions.

Also fpylll's current solution, namely making Sage integers instead of Python integers whenever it finds sage.all in the system, seems like a poor design decision as well. If another Python library uses fpylll, it may break when fpylll decides to use Sage integers.

comment:38 in reply to: ↑ 37 Changed 4 years ago by jdemeyer

Replying to mkoeppe:

I would not like to see Cython libraries to use global variables for that; it seems a short-sighted design decision. If library A does that, but then libraries B and C use library A, they may need library A to use two different output conversions.

I actually think that a global variable is the whole point of the requested feature. You want all libraries to use a common given output format requested by the user (not other libraries).

Last edited 4 years ago by jdemeyer (previous) (diff)

comment:39 follow-up: Changed 4 years ago by mkoeppe

It's a cleaner design to pass an extra argument around.

In my experience, global variables never solve a problem but only create new ones.

comment:40 in reply to: ↑ 39 Changed 4 years ago by jdemeyer

Replying to mkoeppe:

It's a cleaner design to pass an extra argument around.

Or just return something generic (like a gmpy2 instance) and let the user code convert it if needed.

In my experience, global variables never solve a problem but only create new ones.

Vincent is the one who asked for this feature, so I'll let him discuss it with you.

Anyway, I would still like to know what you think of my idea to use gmpy2 as "de facto" standard to pass around integers.

comment:41 Changed 4 years ago by mkoeppe

  • Cc slelievre added

comment:42 Changed 4 years ago by defeo

  • Cc defeo added

comment:43 Changed 4 years ago by mkoeppe

  • Cc Winfried added

comment:44 Changed 4 years ago by jdemeyer

  • Milestone changed from sage-wishlist to sage-duplicate/invalid/wontfix
  • Status changed from needs_work to positive_review

Given the unclear ticket description and the lack of interest, I suggest to close this as invalid.

I am not saying that there is nothing to be done here, just that this ticket is too vague. It seems that everybody has a different idea about what it means.

If you want some feature in the spirit of this ticket, please open a new ticket, clearly stating what you want.

comment:45 follow-up: Changed 4 years ago by mkoeppe

  • Milestone changed from sage-duplicate/invalid/wontfix to sage-wishlist

Uh, we are working on it. Greeting from Sage Days.

comment:46 Changed 4 years ago by mkoeppe

  • Status changed from positive_review to needs_work

comment:47 in reply to: ↑ 45 Changed 4 years ago by jdemeyer

Replying to mkoeppe:

Uh, we are working on it.

Good! I am looking forward to your update of the ticket description (I hope you have read 44 and understand why I proposed to close this ticket).

Last edited 4 years ago by jdemeyer (previous) (diff)

comment:48 Changed 4 years ago by jdemeyer

ping

comment:49 Changed 4 years ago by vdelecroix

These are two weeks Sage days. Why are you so picky about this ticket?

comment:50 Changed 4 years ago by jdemeyer

Does it really take 2 weeks to figure out what the ticket is about?

I'm picky about this one because I feel like I might be interested and I could help if only somebody would clarify what this ticket is about.

In reality, I guess that this ticket is really about several things and each of those things should get its own ticket.

comment:51 Changed 3 years ago by dimpase

Another case at hand is #23024...

comment:52 Changed 3 years ago by jdemeyer

  • Milestone changed from sage-wishlist to sage-duplicate/invalid/wontfix
  • Resolution set to invalid
  • Reviewers set to Jeroen Demeyer
  • Status changed from needs_work to closed

11 months and still no explanation.

Note: See TracTickets for help on using tickets.