Opened 11 years ago
Closed 11 years ago
#6870 closed defect (fixed)
[with patch, positive review] Bug in binomial
Reported by: | hgranath | Owned by: | somebody |
---|---|---|---|
Priority: | minor | Milestone: | sage-4.1.2 |
Component: | basic arithmetic | Keywords: | binomial |
Cc: | Merged in: | Sage 4.1.2.rc0 | |
Authors: | Hakan Granath | Reviewers: | Karl-Dieter Crisman, Adam Webb |
Report Upstream: | Work issues: | ||
Branch: | Commit: | ||
Dependencies: | Stopgaps: |
Description
Here are two cases where binomial fails. I think it is not properly converting its arguments to Integers in all cases where it should.
sage: binomial(1/2,1/1) --------------------------------------------------------------------------- TypeError Traceback (most recent call last) /home/hakan/.sage/temp/joker/27910/_home_hakan__sage_init_sage_0.py in <module>() /media/megadisk/sage-4.1.1/local/lib/python2.6/site-packages/sage/rings/arith.py in binomial(x, m) 2602 except AttributeError: 2603 pass -> 2604 raise TypeError, 'Either m or x-m must be an integer' 2605 if isinstance(x, (int, long, integer.Integer)): 2606 if x >= 0 and (m < 0 or m > x): TypeError: Either m or x-m must be an integer
sage: binomial(10^20+1/1,10^20) --------------------------------------------------------------------------- OverflowError Traceback (most recent call last) /home/hakan/.sage/temp/joker/27910/_home_hakan__sage_init_sage_0.py in <module>() /media/megadisk/sage-4.1.1/local/lib/python2.6/site-packages/sage/rings/arith.py in binomial(x, m) 2625 from sage.functions.all import gamma 2626 return gamma(x+1)/gamma(P(m+1))/gamma(x-m+1) -> 2627 return misc.prod([x-i for i in xrange(m)]) / P(factorial(m)) 2628 2629 def multinomial(*ks): OverflowError: long int too large to convert to int
Attachments (10)
Change History (36)
Changed 11 years ago by
comment:1 Changed 11 years ago by
- Summary changed from [with patch, needs review] Bug in binomial to [with patch, needs work] Bug in binomial
comment:2 Changed 11 years ago by
- Summary changed from [with patch, needs work] Bug in binomial to [with second patch, needs review] Bug in binomial
I think to some extent this is a correctness vs speed issue. Before the patch there is an asymmetry in the checking of whether m and x-m is in ZZ. So e.g. binomial(3/2,1/2) works while binomial(3/2,1/1) fails, although the documentation indicates they should be equivalent. For me it was very confusing when something like that happened. The benchmark case binomial(n+1,n) benefits from this asymmetry. I have thought about it, but so far I have not found a satisfactory solution which will not affect the speed of binomial(n+1,n) while doing the right thing in other cases.
The other thing my patch is doing is to check more carefully if x is an integer. As noted this will be an improvement when this succeeds. Apart from the above benchmarks, try e.g. binomial(QQ(107),252525) or binomial(SR(107),252525) with and without the patch applied. This part of the patch however is not as important to me, a user could always use binomial(ZZ(x),m) when beneficial. Hence an alternative patch is to just do the first part. This will decrease the speed regression in the symbolic case.
Changed 11 years ago by
comment:3 Changed 11 years ago by
- Summary changed from [with second patch, needs review] Bug in binomial to [with second patch, needs work] Bug in binomial
No, we should be able to compute the second type too, so don't jettison that. But can't you combine the two? Checking type is not so bad to do, and then you can still always try to coerce later. After all, at the bottom there is a special case check for reals and floats. So maybe you could do something like
if isinstance(m or x-m,whatever.rational): convert to integer if possible
I did check, and the slowdown for symbolic ones is because of that check. Checking for an instance takes about 1/10 the time.
sage: def f(): ....: try: ....: m=ZZ(a) ....: except: ....: pass ....: sage: timeit('f()') 625 loops, best of 3: 34.4 µs per loop sage: def g(): ....: isinstance(a,(int,long,Integer)) ....: sage: timeit('g()') 625 loops, best of 3: 2.57 µs per loop
where a = n+1
I guess the point is that we should definitely have as many correct answers as possible, but that the slowdown should be on these admittedly unusual cases with rationals, which are less likely to come up than the usual cases.
comment:4 Changed 11 years ago by
But I have a problem then. In the situation where x is of type SR and m of type Integer, consider the cases x is n+1 (case A) and x is SR(107) (case B).
To handle these cases correctly they have to be treated differently, but how to distinguish between them using fast checks like "isinstance" and similar if using "try: ZZ(x)" is not allowed because it will slow down case A?
comment:5 Changed 11 years ago by
Maybe this?
sage: var('n') n sage: a = n+1 sage: from sage.symbolic.expression import Expression sage: isinstance(a,Expression) True sage: def g(): ....: isinstance(a,Expression) ....: sage: timeit('g()') 625 loops, best of 3: 573 ns per loop sage: a=SR(10**7) sage: timeit('g()') 625 loops, best of 3: 595 ns per loop sage: timeit('f()') 625 loops, best of 3: 1.59 µs per loop
You can still use try: ZZ(x), it just makes sense to check type first, since it adds only nanoseconds, and then if someone is silly enough to use SR(107) instead of 107, they'll have to pay the microsecond penalty. I jest a little, but I think you should try something like this to see if it would work. If not, the first patch is probably better than allowing rational things to get through. You could also add a check for rationals instead:
sage: from sage.rings.rational import Rational sage: def g(): ....: isinstance(a,Rational) ....: sage: timeit('g()') 625 loops, best of 3: 1.23 µs per loop sage: a = 10**7-1/1 sage: type(a) <type 'sage.rings.rational.Rational'> sage: timeit('g()') 625 loops, best of 3: 570 ns per loop
So even if a isn't rational, you lose very little time by checking that before you coerce.
comment:6 Changed 11 years ago by
- Summary changed from [with second patch, needs work] Bug in binomial to [with third patch, needs review] Bug in binomial
If we want to fix only the rational case, that is of course easy to do. I have two alternative new patches: one for doing just that, and one which is basically my first patch but reworked to not affect the case of floating point input (which was an undesired side effect).
With the QQ patch, it is unfortunately still easy to construct cases that will fail: e.g. binomial(1/2,SR(1)), binomial(SR(1020+1),1020) and binomial(SR(107+1),107) (the last one takes a looong time).
Changed 11 years ago by
Changed 11 years ago by
comment:7 Changed 11 years ago by
I think you misunderstood. My point is that one can ALSO do a check for symbolic expressions, as outlined above (that's what I meant by "they'll have to pay the microsecond penalty". We can't catch every case, of course, but we might as well try to get the ones we know about.
I hope that these comments are not frustrating; this is overall a good fix to a non-critical but nonetheless important bug.
comment:8 follow-up: ↓ 9 Changed 11 years ago by
I really have no idea what to do. So assume we have something like
if isinstance(x,Expression):
then what to do? We have no idea at this point if x happens to be something like SR(107) or n+1, and we will not know until we try x=ZZ(x). If that try should fail, it is already too late to avoid the time penalty that is the point of this discussion.
comment:9 in reply to: ↑ 8 Changed 11 years ago by
Replying to hgranath:
I really have no idea what to do. So assume we have something like
if isinstance(x,Expression):then what to do? We have no idea at this point if x happens to be something like SR(107) or n+1, and we will not know until we try x=ZZ(x). If that try should fail, it is already too late to avoid the time penalty that is the point of this discussion.
Understood; I think I didn't catch what the salient point was in your earlier comment, but now I do, and I agree that there is no easy way around it. My apologies.
Here is my last idea. I think that the .pyobject() method of Expression can catch this - because it returns an error for anything which is not a bare coefficient:
sage: def h(): try: a.pyobject() return ZZ(a) except: return a sage: a = n+1 sage: h() n + 1 sage: timeit('h()') 625 loops, best of 3: 5.98 µs per loop sage: a = SR(10^7) sage: timeit('h()') 625 loops, best of 3: 1.89 µs per loop
The docstring confirms this performs as advertised. If that doesn't speed things up, then I guess it's not possible :(
So if that doesn't work, your original solution stands as a definite improvement, with the changes you noted - put the rational check inside the already existing not isinstance(m, int etc.) check, before trying m = ZZ(m), do the floating point fix where you have it. Oh yeah, be sure to add doctests for the SR(107) etc.
Because in the long run, we should have more correct cases, I think. If someone notices a really bad slowdown, we will have to write a very fast symbolic binomial or something. But all of these are an improvement on before. Thanks for all of it!
comment:10 Changed 11 years ago by
Thanks for the tip, I did not know about the pyobject method. It definitely improved performance in the symbolic case.
I hope I finally got everything right in version 5 of the patch!
Changed 11 years ago by
comment:11 Changed 11 years ago by
Unfortunately my Sage upgrade croaked, so I can't check it immediately. I will try to do so as soon as possible.
However, what does this patch do with this?
sage: binomial(SR(3/2),SR(1/1))?
Trying x-m won't work on this. Note that
sage: type(SR(1/1).pyobject()) <type 'sage.rings.rational.Rational'>
which means you may still want a m=ZZ(m) or rational check once you have discovered you aren't in the n+1 case. I may have that wrong, though, since I can't actually try the patch out.
As for a trivial point, there is a misspelling of "coerce" as well.
comment:12 Changed 11 years ago by
You are right, binomial(SR(3/2),SR(1/1)) actually worked (for other reasons) but not e.g. binomial(3/2,SR(1/1)). Fixed in new version.
Changed 11 years ago by
comment:13 Changed 11 years ago by
- Reviewers set to Karl-Dieter Crisman
- Summary changed from [with third patch, needs review] Bug in binomial to [with patch, positive review] Bug in binomial
I get an odd doctest failure:
********************************************************************** File "/Users/.../crypto/boolean_function.pyx", line 1013: sage: B.nonlinearity() Expected: 222 Got: 217 **********************************************************************
One might as well add a doctest for that last case you mentioned, too. I'm rebasing your last patch to fix these, and putting positive review since I only changed the doctests so they pass. I think we finally got it!
comment:14 Changed 11 years ago by
Incidentally, in the future you may want to name your patches by Trac number and replace ones that are outdated. Not a big deal, but I know if I don't say it then Minh will :)
comment:15 Changed 11 years ago by
The patch trac_6870-final-v2.patch
is the same as trac_6870-final.patch
. The only difference is that I have set the username to Hakan Granath. This is because trac_6870-final.patch
is a rebase of Hakan's previous patches.
comment:16 Changed 11 years ago by
- Summary changed from [with patch, positive review] Bug in binomial to [with patch, needs work] Bug in binomial
With trac_6870-final-v2.patch
, I got the following doctest failure:
sage -t -long devel/sage/sage/crypto/boolean_function.pyx ********************************************************************** File "/scratch/mvngu/release/sage-4.1.2.alpha2/devel/sage-main/sage/crypto/boolean_function.pyx", line 1013: sage: B.nonlinearity() Expected: 217 Got: 222 ********************************************************************** 1 items had failures: 1 of 6 in __main__.example_36 ***Test Failed*** 1 failures. For whitespace errors, see the file /home/mvngu/.sage//tmp/.doctest_boolean_function.py [5.3 s]
comment:17 Changed 11 years ago by
I do not know what is happening with the file boolean_function.pyx, I can not find it in my version of sage (4.1.1).
comment:18 Changed 11 years ago by
Yeah, I saw that, but couldn't figure out where it came from. I don't think it's from this, because I got it in a branch without this patch. Can you try that in a clean branch, Minh?
(Incidentally, this file is in 4.1.2.alpha2, at any rate.)
comment:19 Changed 11 years ago by
And how do you DO that keeping the name the same thing?
comment:20 Changed 11 years ago by
- Summary changed from [with patch, needs work] Bug in binomial to [with patch, positive review] Bug in binomial
Minh, I tried this again in a branch with no binomial changes, and it is still there. I am restoring positive review, and suggest that one looks at #6950 and friends for this.
(Actually, I think it's a true one-liner, because that function depends on the current random state in Sage, and probably somewhere that got reset or changed, so that the "random" output isn't really that random. But I leave it the release manager to verify this and open a ticket.)
comment:21 Changed 11 years ago by
- Summary changed from [with patch, positive review] Bug in binomial to [with patch, needs review] Bug in binomial
Replying to kcrisman:
Minh, I tried this again in a branch with no binomial changes, and it is still there. I am restoring positive review, and suggest that one looks at #6950 and friends for this.
The doctest failure I got above is due to a 32- vs. 64-bit issue. On a 32-bit system, it would report 217. But on a 64-bit system, it would report 222. These results are consistent for all the machines I have tested on. See my doctest reports for Sage 4.1.2.alpha2:
- 32- and 64-bit Ubuntu
- 32- and 64-bit Mandriva
- 32- and 64-bit Debian
- 32- and 64-bit Fedora, Red Hat, CentOS
- 32- and 64-bit openSUSE
- 32- and 64-bit Mac OS X 10.5.8
In all of the above reports, the doctest in question pass on 64-bit platforms, but fail on 32-bit platforms. I have attached the patch trac_6870-bitness-issue.patch
which takes care of this bitness issue. It should be applied on top of trac_6870-final-v2.patch
. If my patch is good, then everything is ready to be merged.
comment:22 follow-up: ↓ 23 Changed 11 years ago by
I believe you, though I have no way of testing this, as I don't plan to build a 64-bit Sage any time soon. My point is that, as far as I can tell, it should be the subject of its own ticket, not this one.
In any case, I an unable to review that part of your patch. I'm sorry :(
comment:23 in reply to: ↑ 22 Changed 11 years ago by
Replying to kcrisman:
I believe you, though I have no way of testing this, as I don't plan to build a 64-bit Sage any time soon. My point is that, as far as I can tell, it should be the subject of its own ticket, not this one.
In any case, I an unable to review that part of your patch. I'm sorry :(
What you can do is get the patch trac_6870-final-v2.patch
and remove the hunk:
1011 1011 sage: B.nvariables() 1012 1012 9 1013 1013 sage: B.nonlinearity() 1014 222 1014 217 1015 1015 """ 1016 1016 from sage.misc.randstate import current_randstate 1017 1017 r = current_randstate().python_random()
from that patch. The new patch would be the same as the original, only with changes to the file sage/rings/arith.py
. As for my patch, you could open another ticket and put the patch there. That way, the patch won't be lost to history, and you could still review Hakan's changes to sage/rings/arith.py
. As for the doctest failure in sage/crypto/boolean_function.pyx
, you reference the new ticket from this ticket. How does that sound?
comment:24 Changed 11 years ago by
That seems reasonable.
Okay, v2.2 should be it. You can revert to positive review once you check it really does pass all tests - on my current machine, that would probably take 12 hours or so.
New ticket is #7020.
comment:25 Changed 11 years ago by
- Reviewers changed from Karl-Dieter Crisman to Karl-Dieter Crisman, Adam Webb
- Summary changed from [with patch, needs review] Bug in binomial to [with patch, positive review] Bug in binomial
The patch looks good to me too. Tested on 32 and 64 bit systems. ~ Adam
comment:26 Changed 11 years ago by
- Merged in set to Sage 4.1.2.rc0
- Resolution set to fixed
- Status changed from new to closed
Merged trac_6870-final-v2.2.patch
.
This is a good idea, and somewhat surprisingly it speeds up very small binomials without slowing down biggish ones. However, it does seem to slow down symbolic binomials:
Before:
After:
I imagine that in some applications with a lot of symbolic stuff that could be a problem, but I'm not sure. Is it possible to put a catch in for expressions that would keep things more or less as they were there, without making the numeric ones much slower?