Opened 6 years ago

Closed 5 years ago

#18159 closed enhancement (fixed)

cardinality must output Infinty or a Sage integer

Reported by: vdelecroix Owned by:
Priority: critical Milestone: sage-6.6
Component: categories Keywords:
Cc: Merged in:
Authors: Vincent Delecroix Reviewers: Jean-Pierre Flori
Report Upstream: N/A Work issues:
Branch: eb50b94 (Commits) Commit: eb50b94345e732b879ab5faf5c8016252a17d60c
Dependencies: Stopgaps:

Description (last modified by vdelecroix)

We add a ._test_cardinality() method in the category Sets that takes care of checking that the output of .cardinality() is a Sage integer or +Infinity.

Change History (45)

comment:1 Changed 6 years ago by vdelecroix

  • Authors set to Vincent Delecroix
  • Branch set to u/vdelecroix/18159
  • Commit set to 28378f014398fbb4f94836a47e3098210c8e4588

New commits:

28378f0Trac 18159: add a ._test_cardinality in Sets category

comment:2 Changed 6 years ago by git

  • Commit changed from 28378f014398fbb4f94836a47e3098210c8e4588 to 17924b8da8777805ca17289e00e62c3beca5a146

Branch pushed to git repo; I updated commit sha1. New commits:

17924b8Trac 18159: fix cardinality/is_finite in sets/set.py

comment:3 follow-up: Changed 6 years ago by ncohen

What is wrong with an 'int'? O_o

comment:4 in reply to: ↑ 3 Changed 6 years ago by vdelecroix

Replying to ncohen:

What is wrong with an 'int'? O_o

2/3 for example.

comment:5 follow-up: Changed 6 years ago by ncohen

I do not understand. It seems that your code detects it as 'wrong' if a function .cardinality() returns int(4). Why is that?

comment:6 Changed 6 years ago by vdelecroix

  • Description modified (diff)

comment:7 in reply to: ↑ 5 Changed 6 years ago by vdelecroix

Replying to ncohen:

I do not understand. It seems that your code detects it as 'wrong' if a function .cardinality() returns int(4). Why is that?

Yes. And the reason is because Python int do not behave as Sage integers. For example

sage: S = my_finite_set()
sage: S.cardinality().factor()
sage: ~(S.cardinality())

For coherence, we need to have some specifications for .cardinality.

The main reason for that ticket was that some sets were actually returning rational numbers see http://trac.sagemath.org/ticket/17852#comment:92

comment:8 follow-up: Changed 6 years ago by ncohen

That they behave differently is not (to me) a sufficient reason to prevent everybody from returning python ints. Checking that the output can be turned into an int would be sufficient to detect this problem with rational numbers.

comment:9 in reply to: ↑ 8 ; follow-up: Changed 6 years ago by vdelecroix

Replying to ncohen:

That they behave differently is not (to me) a sufficient reason to prevent everybody from returning python ints. Checking that the output can be turned into an int would be sufficient to detect this problem with rational numbers.

All right, output string is accepted as well?

This is very bad. Imagine I need something like:

sage: vector([s.cardinality() for s in S])

If any of the s.cardinality() returns a rational, I end up with a vector over rational numbers (which are much slower than over integers).

It is very similar to is_X functions that return boolean answers. You might complain if they start returning gap booleans or numpy boolean objects.

Vincent

comment:10 in reply to: ↑ 9 ; follow-up: Changed 6 years ago by ncohen

What would you have against testing that it is an int, a long, an Integer, or equal to (not 'is') infinity?

Nathann

comment:11 in reply to: ↑ 10 Changed 6 years ago by vdelecroix

Replying to ncohen:

What would you have against testing that it is an int, a long, an Integer, or equal to (not 'is') infinity?

You do not want to see a Sage rational anymore? And what about numpy integers, pari integers, gp integers, gap integers and libgap integers?

Testing equality to infinity might be broken or you can have objects that want to be equal to everybody (why not?).

The specification in sage.categories.sets_cat is:

``self.cardinality()`` should return the cardinality of the set
``self`` as a sage :class:`Integer` or as ``infinity``.

If you complain that it should not be the case then you should argue why. And not telling me to argue why not.

Vincent

comment:12 follow-up: Changed 6 years ago by ncohen

I think that it should not be the case because python ints/long are smaller and faster than Sage Integers, and that we should not be forced to instantiate Integers if we do not need to.

For the very same reason you invoked, earlier, when you said that you did not want rationals because of the possible loss of time.

Nathann

comment:13 follow-up: Changed 6 years ago by ncohen

(note that your problems are solved if you specify ZZ when you build your vector, isn't it?)

comment:14 Changed 6 years ago by vdelecroix

  • Description modified (diff)

comment:15 in reply to: ↑ 12 ; follow-up: Changed 6 years ago by vdelecroix

Replying to ncohen:

I think that it should not be the case because python ints/long are smaller and faster than Sage Integers, and that we should not be forced to instantiate Integers if we do not need to.

For the very same reason you invoked, earlier, when you said that you did not want rationals because of the possible loss of time.

It was not only loss of time but well defineness. Additioning Python int with Sage integers is slower than Sage integer with Sage integer.

sage: python_one = 1r
sage: sage_one = 1
sage: sage_a = 2**12
sage: timeit("sage_a + python_one", number=5000000)
5000000 loops, best of 3: 66.3 ns per loop
sage: timeit("sage_a + sage_one", number=5000000)
5000000 loops, best of 3: 57.9 ns per loop

So it is better to have everything being Sage integers.

On the other hand, if you really want Python integers you should call len(my_set) (and be sure to obtain a Python integer) and not my_set.cardinality().

Vincent

Last edited 6 years ago by vdelecroix (previous) (diff)

comment:16 in reply to: ↑ 13 Changed 6 years ago by vdelecroix

Replying to ncohen:

(note that your problems are solved if you specify ZZ when you build your vector, isn't it?)

Yes, and

sage: vector(ZZ, ['1', gap(2), pari(3)])
(1, 2, 3)

works too.

comment:17 Changed 6 years ago by git

  • Commit changed from 17924b8da8777805ca17289e00e62c3beca5a146 to e40975a50978d76bf1c8ad4a141470dc0b010289

Branch pushed to git repo; I updated commit sha1. New commits:

e40975aTrac 18159: fix some tests related to TestSuite

comment:18 Changed 6 years ago by vdelecroix

  • Description modified (diff)

comment:19 in reply to: ↑ 15 ; follow-up: Changed 6 years ago by ncohen

So it is better to have everything being Sage integers.

It may be better to have Sage integers if you want, for instance, to add them with other Sage integers. Why would that be sufficient to prevent anyone from returning int?

Nathann

comment:20 Changed 6 years ago by ncohen

Okay do whatever you like. At some point in this community there was a rule that we should all agree on something to make it pass. Nowadays it's like you just have to convince one of your friends and it can get in.

Nathann

comment:21 in reply to: ↑ 19 ; follow-up: Changed 6 years ago by vdelecroix

Replying to ncohen:

So it is better to have everything being Sage integers.

It may be better to have Sage integers if you want, for instance, to add them with other Sage integers. Why would that be sufficient to prevent anyone from returning int?

It is not sufficient. You were arguing that Python ints were better because faster. I just showed you that it is not completely right. An argument for X is generally not an argument for Y.

The reason to oblige people to return Sage integers or Infinity in the method .cardinality of a Parent is mostly for consistency (things are generally better if specified)

comment:22 in reply to: ↑ 21 ; follow-up: Changed 6 years ago by ncohen

It is not sufficient. You were arguing that Python ints were better because faster. I just showed you that it is not completely right. An argument for X is generally not an argument for Y.

All I wanted to point out is that *sometimes* an int is better than an Integer, and that for this reason they should not be forbidden. I do not mean to say that a Python int is always better than a Sage Integer, and quite clearly it is not the case: ints overflows, Integers do not.

The reason to oblige people to return Sage integers or Infinity in the method .cardinality of a Parent is mostly for consistency.

Couldn't we just use the TestSuite to detect errors? When you see a non-integer then something is most probably wrong and the testsuite should report it. What you are doing right now is simulate a "return type" for a function in a language that does not support it.

Nathann

comment:23 in reply to: ↑ 22 Changed 6 years ago by vdelecroix

Replying to ncohen:

It is not sufficient. You were arguing that Python ints were better because faster. I just showed you that it is not completely right. An argument for X is generally not an argument for Y.

All I wanted to point out is that *sometimes* an int is better than an Integer, and that for this reason they should not be forbidden. I do not mean to say that a Python int is always better than a Sage Integer, and quite clearly it is not the case: ints overflows, Integers do not.

Just to be sure: I do not want to forbid Python integer in general. For the output of .cardinality() I consider that Sage integers are *always* better. Perhaps I am wrong and this can be debated if needed. And, as I already mentioned, you can get Python integer for less effort doing len(my_set).

The reason to oblige people to return Sage integers or Infinity in the method .cardinality of a Parent is mostly for consistency.

Couldn't we just use the TestSuite to detect errors?

That is my goal. But we do not agree that returning an int for .cardinality() is an error. In many situation, "ensuring" that it is a Sage integer prevent doing wrong things afterwards.

When you see a non-integer then something is most probably wrong and the testsuite should report it. What you are doing right now is simulate a "return type" for a function in a language that does not support it.

Yes. But having as dependency a ticket "switch from Python to a strongly typed language" would be too much effort. So Python is a fact and we have to deal with it.

Let me mention that Python is trying to do some type checking because it is useful:

sage: class A(object):
....:     def __len__(self):
....:         return 'haha! well tried!'
sage: a = A()
sage: len(a)
Traceback (most recent call last):
...
TypeError: an integer is required

(note that a.__len__() returns what you asked it to do).

Vincent

comment:24 Changed 6 years ago by git

  • Commit changed from e40975a50978d76bf1c8ad4a141470dc0b010289 to a5d1e0e6f8d61d169bc6dcd416b9bb1309e7fe62

Branch pushed to git repo; I updated commit sha1. New commits:

d2e0608Trac 18159: fix the three parents
a5d1e0eTrac 18159: some more cleanup in debruijn_sequence.pyx

comment:25 Changed 6 years ago by vdelecroix

  • Status changed from new to needs_review

comment:26 Changed 6 years ago by vdelecroix

  • Description modified (diff)

comment:27 Changed 6 years ago by jpflori

I'd say we should ask for others' feeling on sage-devel before making a move here. Personally I don't mind returning a Sage integer for cardinality, but Nathann does not agree. So we should try to satisfy a large proportion of Sage users (or let Sage rot forever).

comment:28 Changed 6 years ago by bruno

I agree with Vincent's view here: Having .cardinality() return a Sage Integer is better to my mind since then it comes with all the methods for integers. For a basic user, I guess it may be misleading than for some (finite...) sets you can write S.cardinality().factor() while it sometimes results in an error.

I understand Nathann's concern on efficiency. Yet, my impression is that the issue is going to arise to advanced or expert users (whatever this means). These users should be capable of switching to len(S) if they feel the need for it. The proposed change actually is an improvement since it provides the user the choice for the return type, while no such choice existed before.

comment:29 follow-up: Changed 6 years ago by ncohen

I understand Nathann's concern on efficiency. Yet, my impression is that the issue is going to arise to advanced or expert users (whatever this means). These users should be capable of switching to len(S) if they feel the need for it. The proposed change actually is an improvement since it provides the user the choice for the return type, while no such choice existed before.

The arguments that I see on this thread are not about whether "cardinality" should return python ints or Sage integers.

While the ticket is indeed about .cardinality only, you will see that the arguments given here can be applied to any other Sage function that returns integers.

Of course, having *all* such Sage functions return Integer objects is almost impossible to enforce, yet there is an easy way to enforce that for .cardinality, and this is what the branch implements.

I do not like what is going on here, but I will not oppose it. You are too many.

Nathann

comment:30 in reply to: ↑ 29 ; follow-up: Changed 6 years ago by bruno

The arguments that I see on this thread are not about whether "cardinality" should return python ints or Sage integers.

While the ticket is indeed about .cardinality only, you will see that the arguments given here can be applied to any other Sage function that returns integers.

That is not totally exact, since as Vincent mentioned it, you can replace S.cardinality() by len(S) if you want an int, and for efficiency concerns, len(S.set()) is the fastest way to get it. Not all functions have this python counterpart.

As a user, I very much cares about the consistency in the return types. And I find this very annoying:

sage: S = Set(range(2^20))
sage: T = CartesianProduct(S,S)
sage: T.cardinality().factor()
2^40
sage: S.cardinality().factor()
Traceback (most recent call last):
...
AttributeError: 'int' object has no attribute 'factor'

Note that I still agree that the same argument may be used at some other places. For instance, let me consider univariate polynomials in sparse representation. I find the following very weird, and would be in favor of changing this behavior:

sage: R.<x> = PolynomialRing(ZZ, sparse=True)
sage: p = x^(2^100)+1
sage: map(lambda t: type(t), p.exponents())
[<type 'int'>, <type 'sage.rings.integer.Integer'>]

I do not like what is going on here, but I will not oppose it. You are too many.

I've understood that (one of) your concern(s) is efficiency issues. But is it the only one? Don't you think that inconsistency also is a problem?

comment:31 in reply to: ↑ 30 Changed 6 years ago by ncohen

Please, do whatever you want. I voiced my opinion one thousand times already.

Nathann

Last edited 6 years ago by ncohen (previous) (diff)

comment:32 Changed 6 years ago by jpflori

  • Status changed from needs_review to needs_work
  • Work issues set to rebase?

comment:33 Changed 6 years ago by git

  • Commit changed from a5d1e0e6f8d61d169bc6dcd416b9bb1309e7fe62 to b79143d79eeb85ea079b1dd4ea96ce054594b9cc

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

2d83910Trac #18159: add a ._test_cardinality in Sets category
001e781Trac #18159: remove a duplicated test
aa0fe00Trac #18159: fix cardinality/is_finite in sets/set.py
78f1163Trac #18159: fix some tests related to TestSuite
842a3feTrac #18159: fix the three parents
b79143dTrac #18159: some more cleanup in debruijn_sequence.pyx

comment:34 Changed 6 years ago by vdelecroix

  • Status changed from needs_work to needs_review

comment:35 Changed 5 years ago by vdelecroix

  • Work issues rebase? deleted

comment:36 Changed 5 years ago by jpflori

  • Status changed from needs_review to needs_work

There are conflicts in src/sage/combinat/crystals/monomial_crystals.py.

comment:37 Changed 5 years ago by vdelecroix

I will not keep it up to date because you like me working. As you can notice, I rebased it 5 weeks ago and nobody cared. Does your message mean that you will review it? There is no need to keep the branch up to date if there is no reviewer... needs_review means needs a reviewer. Not a patchbot log reader.

Vincent

comment:38 Changed 5 years ago by jpflori

I have some time to review it right now. I'm not making you work for fun... it's just that my dev time is very scarce these days. And I have no idea of how to fix the second conflict in the file.

comment:39 Changed 5 years ago by git

  • Commit changed from b79143d79eeb85ea079b1dd4ea96ce054594b9cc to 0aad2f94b2d59dee992443efe99ef76c847b80c9

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

575b264Trac #18159: add a ._test_cardinality in Sets category
cc87bd5Trac #18159: remove a duplicated test
32c403dTrac #18159: fix cardinality/is_finite in sets/set.py
937ebf4Trac #18159: fix some tests related to TestSuite
eb50b94Trac #18159: fix two parents
0aad2f9Trac #18159: some more cleanup in debruijn_sequence.pyx

comment:40 Changed 5 years ago by vdelecroix

  • Status changed from needs_work to needs_review

Rebased on 6.8.beta6

comment:41 follow-up: Changed 5 years ago by jpflori

Is there any reason why the DeBruijn? squence stuff is included here?

comment:42 in reply to: ↑ 41 Changed 5 years ago by vdelecroix

Replying to jpflori:

Is there any reason why the DeBruijn? squence stuff is included here?

Not really. You can ignore 0aad2f9 and I will remove it.

comment:43 Changed 5 years ago by git

  • Commit changed from 0aad2f94b2d59dee992443efe99ef76c847b80c9 to eb50b94345e732b879ab5faf5c8016252a17d60c

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

comment:44 Changed 5 years ago by jpflori

  • Reviewers set to Jean-Pierre Flori
  • Status changed from needs_review to positive_review

Thanks for removing the unrelated commit. LGTM.

comment:45 Changed 5 years ago by vbraun

  • Branch changed from u/vdelecroix/18159 to eb50b94345e732b879ab5faf5c8016252a17d60c
  • Resolution set to fixed
  • Status changed from positive_review to closed
Note: See TracTickets for help on using tickets.