Opened 10 years ago
Last modified 10 years ago
#10792 closed task
Upgrade numpy to 1.5.1 — at Version 33
Reported by: | jason | Owned by: | tbd |
---|---|---|---|
Priority: | major | Milestone: | sage-4.7 |
Component: | packages: standard | Keywords: | |
Cc: | evanandel, kcrisman | Merged in: | |
Authors: | François Bissey, Jason Grout | Reviewers: | David Kirkby, Karl-Dieter Crisman, Ethan Van Andel |
Report Upstream: | N/A | Work issues: | doctest changes for patch |
Branch: | Commit: | ||
Dependencies: | Stopgaps: |
Description (last modified by )
New version of numpy is out: 1.5.1
SPKG: http://spkg-upload.googlecode.com/files/numpy-1.5.1.spkg
Apply: attachment:trac-10792-riemann-complexwarnings.patch and attachment:trac_10792-fix_riemann_doctest.patch
Change History (35)
comment:1 Changed 10 years ago by
- Milestone changed from sage-4.6.2 to sage-4.7
comment:2 Changed 10 years ago by
It sounds like there may be just a small change we should make to the Riemann map function. Do you know which line it is complaining about?
comment:3 Changed 10 years ago by
I am not sure. I don't understand the Riemann map code very well. The only thing suspect to me is this

pt1 = np.complex(pt)

on line 499 in riemann.pyx. pt1's type has never been declared, so I don't know if Cython automatically gives it the type of the right-hand side or if it defaults to something else, in which case it would have to be cast.
comment:4 Changed 10 years ago by
Okay. I'll try to do some printf debugging to see if I can find it "soon".
comment:5 follow-up: ↓ 6 Changed 10 years ago by
Do you have a 1.5.1 spkg that you could post? It sounds like you already made one.
comment:6 in reply to: ↑ 5 Changed 10 years ago by
Replying to jason:
Do you have a 1.5.1 spkg that you could post? It sounds like you already made one.
I get that a lot. I will prepare one soon. But I don't need a new spkg to test things in sage-on-gentoo because I can use the system version, which is how I am aware of this issue and a few others regarding upgrading components.
comment:7 Changed 10 years ago by
You can find a numpy-1.5.1 spkg here http://spkg-upload.googlecode.com/files/numpy-1.5.1.spkg
comment:8 follow-up: ↓ 18 Changed 10 years ago by
I don't understand the riemann code very well either, but it seems like the warning is coming from the line where temp (a real array?) is being assigned a value from self.cps (a complex array)---see the code that is deleted in the trac-10792-riemann.patch file. I simplified the code a lot to avoid having a temporary array, but this changed lots of numerical results. So one question is: are the results it returns now better or worse than before?
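The suspected mechanism can be reproduced outside Sage in a few lines of plain numpy (the array shapes below are illustrative, not the actual ones in riemann.pyx):

```python
import warnings
import numpy as np

# minimal sketch of the suspected cause: values from a complex array
# being assigned into a real-dtype temporary, as the original temp did
cps = np.array([[1 + 2j], [3 + 4j]], dtype=np.complex128)
temp = np.zeros([2, 1])  # real dtype by default

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    temp[0, 0] = cps[0, 0]  # discards the imaginary part

print(temp[0, 0])        # 1.0 -- imaginary part lost
print(len(caught) >= 1)  # True -- a ComplexWarning was emitted
```

This is consistent with the warning appearing only once the numpy upgrade made the complex-to-real cast noisy.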
I also seem to get a huge number of

    Warning: invalid value encountered in divide

warnings when doing

    Riemann_Map([lambda t: e^(I*t) - 0.5*e^(-I*t)], [lambda t: I*e^(I*t) + 0.5*I*e^(-I*t)], 0)

These come from NaN values popping up in the array. Does anyone else see those?
comment:9 Changed 10 years ago by
- Status changed from new to needs_work
comment:10 Changed 10 years ago by
The warnings have been there since cython-0.13 landed. The code itself is numerically sensitive to various details, including the lapack implementation. You may have seen something about it on sage-release.
I will take a few slow days because the town where I live (Christchurch, New Zealand) has been hit by a 6.3 earthquake. Where I live is ok, where I work isn't.
comment:11 Changed 10 years ago by
- Cc evanandel added
CCing Ethan, since he was the original author on the riemann.pyx file.
Ethan: I've been looking at the code that seems to cause two problems. The first is a ComplexWarning that now appears in numpy 1.5.1, which I believe comes from a temporary array that was assumed to be real but maybe should be complex. I've rewritten the affected lines to not use a temporary array.
The other problem seems like it was a problem even before the upgrade---I keep getting

    Warning: invalid value encountered in divide

and the error happens in this line in the _generate_theta_array function in calculus/riemann.pyx:
K = np.array( [-TWOPI / N * sadp * sadp[t] * 1 / (TWOPI * I) * ((dp / adp) / (cp - cp[t]) - ((dp[t] / adp[t]) / (cp - cp[t])).conjugate()) for t in np.arange(NB)], dtype=np.complex128)
I noticed several things that could be pulled out of the inner loop, and so rewrote this line in the attached patch (which I'll update shortly). However, I don't know this code or the algorithms very well---can you look at the patch?
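The rewrite described here --- pulling loop-invariant factors out of the inner loop --- can be illustrated on a toy version of the expression (the names below are simplified stand-ins, not the actual riemann.pyx arrays):

```python
import numpy as np

# toy version of hoisting a loop-invariant factor out of the list
# comprehension: dp/adp does not depend on the loop variable t
N = 4
cp = np.arange(1, N + 1, dtype=np.complex128)
dp = cp * 1j
adp = np.abs(dp)

# naive: recompute dp/adp inside every iteration
naive = np.array([dp / adp * (t + 1) for t in range(N)])

# hoisted: compute the invariant ratio once, outside the loop
ratio = dp / adp
hoisted = np.array([ratio * (t + 1) for t in range(N)])

print(np.allclose(naive, hoisted))  # True -- same results, less work
```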
It appears that the warnings happen when there is a division by zero, as in this test case:
sage: import numpy as np
sage: a = np.array([1,2], dtype=np.complex128)
sage: b = np.array([0,1], dtype=np.complex128)
sage: a/b
Warning: invalid value encountered in divide
array([ nan+nan*j,   2. +0.j])
sage: a[0]/b[0]
(nan+nan*j)
So should we turn off the error checking for this error when doing this operation since at least one entry in cp-cp[t] will be zero? Or should we check for this entry and not calculate it?
Changed 10 years ago by
comment:12 Changed 10 years ago by
Ethan,
I've split the issue into two problems: the ComplexWarning problem that I hopefully fixed with the patch on this ticket, and the invalid division warning, which I've made #10821. Can you check this code and the numerical differences in output? I now get these failures:
% sage -t -long riemann.pyx | grep -v "Warning: invalid value encountered in divide"
E:128
Detected SAGE64 flag
Building Sage on OS X in 64-bit mode
sage -t -long "trees/sage-4.6.1/devel/sage-main/sage/calculus/riemann.pyx"
**********************************************************************
File "/Users/grout/sage-trees/sage-4.6.1/devel/sage-main/sage/calculus/riemann.pyx", line 123:
sage: print "error =", m.inverse_riemann_map(m.riemann_map(x)) - x # long time
Expected:
    error = (-0.0001211863...+0.001606139...j)
Got:
    error = (-0.000291742263558+0.00160874617675j)
**********************************************************************
File "/Users/grout/sage-trees/sage-4.6.1/devel/sage-main/sage/calculus/riemann.pyx", line 484:
sage: m.riemann_map(0.25 + sqrt(-0.5)) # long time
Expected:
    (0.137514137885...+0.876696023004...j)
Got:
    (0.13751414123737432+0.87669602715226935j)
**********************************************************************
File "/Users/grout/sage-trees/sage-4.6.1/devel/sage-main/sage/calculus/riemann.pyx", line 486:
sage: m.riemann_map(1.3*I) # long time
Expected:
    (-1.561029396...+0.989694535737...j)
Got:
    (-1.560925774791804e-05+0.98969453688560927j)
**********************************************************************
File "/Users/grout/sage-trees/sage-4.6.1/devel/sage-main/sage/calculus/riemann.pyx", line 489:
sage: m.riemann_map(0.4) # long time
Expected:
    (0.733242677182...+3.50767714...j)
Got:
    (0.73324267718110436+3.2076771460460016e-06j)
**********************************************************************
File "/Users/grout/sage-trees/sage-4.6.1/devel/sage-main/sage/calculus/riemann.pyx", line 492:
sage: m.riemann_map(np.complex(-3, 0.0001)) # long time
Expected:
    (1.405757135...+1.05106...j)
Got:
    (1.4057571374988058e-05+8.0616369487397992e-10j)
**********************************************************************
2 items had failures:
   1 of 15 in __main__.example_1
   4 of 9 in __main__.example_8
***Test Failed*** 5 failures.
For whitespace errors, see the file /Users/grout/.sage//tmp/.doctest_riemann.py
         [105.0 s]
----------------------------------------------------------------------
The following tests failed:

        sage -t -long "trees/sage-4.6.1/devel/sage-main/sage/calculus/riemann.pyx"
Total time for all tests: 105.1 seconds
It looks like the errors are all slight numerical errors. Can someone confirm that?
comment:13 Changed 10 years ago by
- Description modified (diff)
comment:14 Changed 10 years ago by
- Status changed from needs_work to needs_info
comment:15 follow-up: ↓ 16 Changed 10 years ago by
Sorry for the slow response.
It looks like these are all slight numerical problems, which is expected given (as mentioned above) the numerical sensitivity. I'll happily test, but I'm not sure how to upgrade Sage's numpy. If someone can tell me how to get that running, I'll be able to confirm this patch and #10821.
Side Note 1: I see that my code is apparently quite incomprehensible to people. The underlying mathematics is quite a bear, so it'll never be nice and accessible, but is there anything I can do to improve the documentation?
Side Note 2: What's the status of fast_callable these days? Patch #8867 is still out there, and if fast_callable is ready, I think we might be able to merge that as well (otherwise, I can try amending it to work with fast_callable as is)
comment:16 in reply to: ↑ 15 Changed 10 years ago by
- Description modified (diff)
Replying to evanandel:
Sorry for the slow response.
It looks like these are all slight numerical problems, which is expected given (as mentioned above) the numerical sensitivity. I'll happily test, but I'm not sure how to upgrade Sage's numpy. If someone can tell me how to get that running, I'll be able to confirm this patch and #10821.
You can go through the following steps:
1) drop the numpy spkg in the spkg/standard folder
2) sage -f numpy-1.5.1
3) apply the patches (from here and #10821); the easiest way is to follow the procedure from the developer's manual: http://www.sagemath.org/doc/developer/walk_through.html#reviewing-a-patch
4) sage -b
5) test
Side Note 1: I see that my code is apparently quite incomprehensible to people. The underlying mathematics is quite a bear, so it'll never be nice and accessible, but is there anything I can do to improve the documentation?
Could you refer to a review of it in a journal or an online resource?
Side Note 2: What's the status of fast_callable these days? Patch #8867 is still out there, and if fast_callable is ready, I think we might be able to merge that as well (otherwise, I can try amending it to work with fast_callable as is)
No idea.
comment:17 Changed 10 years ago by
As for fast-callable, #5572 is still on my todo list, but isn't likely to get done in the near future (e.g., this semester). If someone else wants to work on it, they are more than welcome.
comment:18 in reply to: ↑ 8 ; follow-up: ↓ 19 Changed 10 years ago by
Replying to jason:
I don't understand the riemann code very well either, but it seems like the warning is coming from the line where temp (a real array?) is being assigned a value from self.cps (a complex array)---see the code that is deleted in the trac-10792-riemann.patch file. I simplified the code a lot to avoid having a temporary array, but this changed lots of numerical results. So one question is: are the results it returns now better or worse than before?
This is where it would be helpful if doctests actually documented why the particular value is correct. I've seen sooooo many doctests where the "expected value" is whatever someone got on their computer and is not substantiated in any way as a comment in the code.
Also, if the algorithm, or its implementation in Sage, has poor numerical stability, this should be documented.
Could this be computed with Mathematica or Wolfram|Alpha to arbitrary precision? Just a thought. If so, that could be documented - we have permission from Wolfram Research to use Wolfram|Alpha for the purpose of comparing results and documenting those comparisons.
comment:19 in reply to: ↑ 18 ; follow-up: ↓ 25 Changed 10 years ago by
Replying to drkirkby:
This is where it would be helpful if doctests actually documented why the particular value is correct. I've seen sooooo many doctests where the "expected value" is whatever someone got on their computer and is not substantiated in any way as a comment in the code.
Unfortunately, to my knowledge, this is the only extant tool that performs this sort of Riemann map. I believe there are one or two cases where the analytic map is known, so I can probably add some tests that check accuracy against that.
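For reference, one family of cases where the analytic map is known exactly is the Möbius automorphisms of the unit disk, which could supply exact reference values for such tests. A hedged sketch in plain Python (the function name mobius and the sample points are illustrative, not from the ticket):

```python
# Möbius automorphism of the unit disk sending the point a to 0;
# an analytic map against which a numerical Riemann map can be checked
def mobius(z, a):
    return (z - a) / (1 - a.conjugate() * z)

a = 0.3 + 0.2j
z = 0.5 - 0.1j
w = mobius(z, a)

print(abs(w) < 1)          # True -- the disk maps into the disk
print(abs(mobius(a, a)))   # 0.0  -- a is sent exactly to the origin
```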
Also, if the algorithm, or its implementation in Sage, has poor numerical stability, this should be documented.
As far as I've seen, it's not unstable in the sense of dramatically losing accuracy, but the many numerical calculations are sensitive to slight differences in machine-level implementation. This results in slight differences in the final error. I should be able to do some error analysis and see if these deviations are within the bounds of the algorithm.
Could this be computed with Mathematica or Wolfram|Alpha to arbitrary precision? Just a thought. If so, that could be documented - we have permission from Wolfram Research to use Wolfram|Alpha for the purpose of comparing results and documenting those comparisons.
Not without complete reimplementation, and I know of no reason why their performance should be better than ours. You can increase the numerical precision of the computation by increasing N (the number of collocation points on the boundary). I can create a couple of comparison tests that can be run on different machines to see if that decreases the numerical deviation.
comment:20 Changed 10 years ago by
Could someone run a test where, after upgrading numpy, they rebuild scipy before doing sage -b? Technically, rebuilding scipy can be done after sage -b, as it is a pure runtime dependency. Then test sage/matrix/matrix_double_dense.py to see if there are more warnings in there. I have one appearing after updating to scipy-0.9, and I just want to check that it wasn't hiding before. I am fairly sure it wasn't, but it has to be checked, and I am still quite stretched resource-wise after the earthquake.
comment:21 Changed 10 years ago by
- Cc kcrisman added
comment:22 follow-up: ↓ 23 Changed 10 years ago by
See also #7831 for a fix for FreeBSD that perhaps can be part of this spkg, assuming we can find people to test on FreeBSD.
comment:23 in reply to: ↑ 22 ; follow-up: ↓ 24 Changed 10 years ago by
Replying to kcrisman:
See also #7831 for a fix for FreeBSD that perhaps can be part of this spkg, assuming we can find people to test on FreeBSD.
Sure, I'll have a look at incorporating it. Did you have a go at the patch on the present ticket? I am not impressed by the results, but it could be a fluke.
comment:24 in reply to: ↑ 23 Changed 10 years ago by
Replying to fbissey:
Replying to kcrisman:
See also #7831 for a fix for FreeBSD that perhaps can be part of this spkg, assuming we can find people to test on FreeBSD.
Sure, I'll have a look at incorporating it. Did you have a go at the patch on the present ticket? I am not impressed by the results, but it could be a fluke.
No, I haven't had a chance to do this, and I won't be at the machine I use for testing this sort of thing (the older PPC Mac) for several days. Just wanting to point it out, in case it makes things easy.
comment:25 in reply to: ↑ 19 Changed 10 years ago by
- Reviewers set to David Kirkby
Replying to evanandel:
Replying to drkirkby:
This is where it would be helpful if doctests actually documented why the particular value is correct. I've seen sooooo many doctests where the "expected value" is whatever someone got on their computer and is not substantiated in any way as a comment in the code.
Unfortunately, to my knowledge, this is the only extant tool that performs this sort of Riemann map. I believe there are one or two cases where the analytic map is known, so I can probably add some tests that check accuracy against that.
So essentially the "test" in its current form just demonstrates that the result on different machines is approximately the same. It does not demonstrate that the result on these machines is approximately correct - the code could be incorrect and be giving results 1000 times what they should be.
I'd feel a lot happier if the doctest used an example where the result is known, and cited a reference to the result.
Consider for example the approach I took to testing some finite difference software I wrote for computing the impedance of electrical transmission lines of arbitrary cross section.
Although there are a few examples demonstrating the answer from strange cross sections, where I have no chance of computing the result analytically
http://atlc.sourceforge.net/examples.html
for the purposes of testing
http://atlc.sourceforge.net/accuracy.html
the tests were based on cases where analytical results were known. The results were checked on a variety of CPUs (AMD, Cray, Intel Itanium, Intel x86, PA-RISC, PPC, Sun SPARC etc) on a variety of operating systems (AIX, Irix, HP-UX, OS X, Solaris, tru64, Unicos, Unixware, plus various Linux distributions)
Also, if the algorithm, or its implementation in Sage, has poor numerical stability, this should be documented.
As far as I've seen, it's not unstable in the sense of dramatically losing accuracy, but the many numerical calculations are sensitive to slight differences in machine-level implementation. This results in slight differences in the final error. I should be able to do some error analysis and see if these deviations are within the bounds of the algorithm.
Could this be computed with Mathematica or Wolfram|Alpha to arbitrary precision? Just a thought. If so, that could be documented - we have permission from Wolfram Research to use Wolfram|Alpha for the purpose of comparing results and documenting those comparisons.
Not without complete reimplementation, and I know of no reason why their performance should be better than ours. You can increase the numerical precision of the computation by increasing N (the number of collocation points on the boundary). I can create a couple of comparison tests that can be run on different machines to see if that decreases the numerical deviation.
Well as far as I know Numpy will use machine precision, whereas anything you do in Mathematica can be done to arbitrary precision. There are many things in Sage that can only be done on an FPU, so use of an arbitrary precision floating point maths is not supported. Mathematica does not suffer that limitation.
Let's say for example a doctest of Sage's factorial function used 50! as an example, and gave the result 30414093201713378043612608166064768844377641568960522000000000000. If every machine it was tried on, using a variety of CPUs (PPC, SPARC, Intel x86, AMD x86, Intel Itanium processors), gave the same result, it does not prove Sage is correct. That would only demonstrate the reproducibility of Sage's factorial function, which is something very different from a good test in my opinion. Performing the same calculation in MATLAB, which can only do the calculation on the floating point processor, would increase confidence in the result. Performing the calculation in Mathematica would show a difference in one of the digits, which would lead one to question which of the two packages is wrong. (I purposely changed one of the digits, introducing a relative error of 3.28 x 10^{-52}, which would not be seen on an FPU.)
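This point can be made concrete in a few lines: a relative error on the order of 10^-52 in 50! is invisible at double precision, which carries only about 16 significant digits. A small illustration (the perturbation size here is an arbitrary choice for demonstration, not the digit Dave changed):

```python
import math

# exact 50! as an arbitrary-precision integer
exact = math.factorial(50)

# hypothetical perturbation of ~1e12 on a ~3e64 value:
# a relative error around 3e-53
perturbed = exact + 10**12

print(exact != perturbed)                # True as exact integers
print(float(exact) == float(perturbed))  # True as float64 -- the error vanishes
```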
I was reading only recently about when a new Mersenne prime was found. That was found on a x86 chip, but it was also verified by other software on an x86 chip and also by yet more software on a SPARC processor.
It seems to me that many doctests in Sage just show reproducibility, and don't actually test the algorithm or the implementation. This appears to be one such test.
PS, the "author" field needs to be filled in.
Dave
comment:26 Changed 10 years ago by
Dave, you're absolutely right. This is a numerical algorithm involving Simpson's method, Nystrom method integration, etc., so even if the numerics are done to arbitrary precision, it will still be numeric and will verify only itself. There are, however, a few cases where the analytic Riemann map is computable, and I intend to add a clearly labeled test case for that. Note also that the last two doctests are significant failures; the numpy upgrade seems to have changed the way it handles obscure complex types.
I will be travelling for a few days, but I should have a patch up that improves the doctests and fixes that error by Thursday or Friday.
Ethan
comment:27 Changed 10 years ago by
OK, I have tried a more minimal change and get basically the same result on the riemann.pyx doctest. The interpolator.pyx doctest is fine.
I will let Ethan work his magic before commenting again.
comment:28 Changed 10 years ago by
- Status changed from needs_info to needs_work
- Work issues set to doctest changes for patch, verify a couple analytic cases
Ethan, I think you are too hard on yourself in saying 'the last two doctests are significant failures', as the previous ones did not end in j, but in e-xyj. In both cases that means the changes in values are just as small as for the other ones.
sage: m.riemann_map(0.4) # long time
(0.73324267718245451+3.5076771459132302e-06j)
sage: import numpy as np # long time
sage: m.riemann_map(np.complex(-3, 0.0001)) # long time
(1.4057571354958995e-05+1.051061653044145e-09j)
Is there any way to check for that pattern in a doctest? It would be convenient to guard against what Ethan thought was happening.
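One way to guard against this in a doctest is to compare against the expected value with an explicit tolerance instead of matching printed digits. A sketch: the values are the Expected/Got pair from the riemann_map(0.4) failure quoted earlier, and the tolerances are illustrative choices, not part of the ticket:

```python
# Expected/Got pair from the riemann_map(0.4) doctest failure above
expected = 0.733242677182 + 3.5076771e-06j
got = 0.73324267718110436 + 3.2076771460460016e-06j

# tolerance-based check: real parts agree, imaginary parts are both tiny
assert abs(got.real - expected.real) < 1e-9
assert abs(got.imag) < 1e-4 and abs(expected.imag) < 1e-4

print(abs(got - expected) < 1e-4)  # True -- within tolerance overall
```

This style would also catch the pattern Ethan worried about: a value of -1.56 would fail the magnitude check, while -1.56e-05 would pass.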
Updating the work issues. We still need
- doctest change for Jason's patch
- some type of confirmation of one or two analytic examples of this
Though in my opinion, this second one could be moved to #10821, since the problem was there before.
Also, seeing this made me look through riemann.pyx, and it turns out that file could use some general TLC. Esp. with regard to a few doc things and some fairly massive redundancy with complex plots. See #10945.
comment:29 Changed 10 years ago by
Jason's patch is correct. Positive review on that part.
Here's another thought.
self.cps = np.zeros([self.B, N], dtype=np.complex128)
# Find the points on the boundaries and their derivatives.
for k in xrange(self.B):
    for i in xrange(N):
        self.cps[k, i] = np.complex(fs[k](self.tk[i]))
So when the original code does
temp = np.zeros([self.B, 1])
for k in xrange(self.B):
    temp[k, 0] = self.cps[k, 0]
then I would say the problem is in taking np.zeros and assigning complex numbers to that temp array, which is defined to be real a priori. Maybe that is the change in how Numpy did things - it now complains?
My point being that if we change
- temp = np.zeros([self.B, 1])
+ temp = np.zeros([self.B, 1], dtype=np.complex128)
and that also removes the errors, and we get the same doctest changes, then we can be sure that the problem is simply that this is a really hard numerical thing to get right, and Ethan should be commended for even getting within epsilon.
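The proposed one-line change can be checked in isolation: with a complex dtype, the assignments go through without any warning. A sketch with illustrative shapes, not the actual riemann.pyx code:

```python
import warnings
import numpy as np

B = 2
cps = np.array([[1 + 2j], [3 + 4j]], dtype=np.complex128)

# the suggested one-line fix: make temp complex from the start
temp = np.zeros([B, 1], dtype=np.complex128)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    for k in range(B):
        temp[k, 0] = cps[k, 0]  # no ComplexWarning: dtypes match

print(len(caught))  # 0 -- no warnings emitted
print(temp[0, 0])   # (1+2j) -- imaginary part preserved
```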
comment:30 Changed 10 years ago by
- Work issues changed from doctest changes for patch, verify a couple analytic cases to doctest changes for patch
Hoo boy, testing it out definitely shows some serious numerical instability even for many more plot points. Notice this is before the numpy upgrade or the patch.
sage: m = Riemann_Map([lambda t: e^(I*t) - 0.5*e^(-I*t)], [lambda t: I*e^(I*t) + 0.5*I*e^(-I*t)], 0, N=2000)
sage: m.riemann_map(.45)
(0.8585664030690745+7.4218379456630256e-07j)
sage: m.riemann_map(.46)
(0.88541813465210162+3.4358512148266537e-07j)
sage: m.riemann_map(.47)
(0.91294518302491301-2.6212328466317408e-06j)
sage: m.riemann_map(.48)
(0.94117262428806991-3.5456252284863065e-05j)
sage: m.riemann_map(.49)
(0.96930329181581476-0.00016029304727194931j)
sage: m.riemann_map(.491)
(0.97166253445139428-5.3770138915852049e-05j)
sage: m.riemann_map(.492)
(0.97361488824863462+0.00018530315147004251j)
sage: m.riemann_map(.493)
(0.97486154873554642+0.00065056144560648023j)
sage: m.riemann_map(.494)
(0.97491963265632509+0.0014788861938800405j)
sage: m.riemann_map(.495)
(0.97313345395283191+0.0028473636237802834j)
sage: m.riemann_map(.496)
(0.96917007545329004+0.0049319541164373499j)
sage: m.riemann_map(.497)
(0.96580931231079858+0.0077812018516007905j)
sage: m.riemann_map(.498)
(0.98261586147447211+0.011068697234118194j)
sage: m.riemann_map(.499)
(1.1489715720875546+0.013786517516344927j)
There is nothing wrong with what the values are doing, though, and that's the important part. Taking a hint from the doc:
sage: "error =", m.inverse_riemann_map(m.riemann_map(.49)) - .49 ('error =', (-0.00032080499713899036-5.4888563828232042e-05j)) sage: "error =", m.inverse_riemann_map(m.riemann_map(.495)) - .495 ('error =', (-0.0040172648041303383+0.00096565040731113271j)) sage: "error =", m.inverse_riemann_map(m.riemann_map(.499)) - .499 ('error =', (-0.95337317730092064-0.0038944237237982667j))
And honestly, the documentation is clear that near boundaries there are problems.
It is clear to me that at least for simple examples this performs as advertised, so I don't think Ethan needs to justify the error; it's just a very unstable thing, which probably can be improved by using something more sophisticated than trapezoid or Simpson (maybe?) - but that's not this ticket. I love the plots; I tried to do this with Grapher when I taught complex analysis in some random naive ways, and it never worked very well. This is a great addition.
So, as far as I'm concerned, positive review once someone actually uploads a patch that corrects the doctests. I recommend that the analytic examples go on #10945, unless Ethan has the time to add them here. The Numpy upgrade should go in sooner rather than later.
comment:31 Changed 10 years ago by
OK I attached a patch to adjust the riemann.pyx doctests. Since some of these tests are very sensitive I reduced the precision to which the results are tested. I also preserved any "e-xx" in the results so we know they are meant to be small values close to zero.
If they move again, it should be easier to figure out whether we have just changed the value of the small number that, in a perfect world, would go to zero.
For integration I personally like to use quadratures. We could get them from gsl. It would depend on the number of sample points. Of course, for any method you choose you can find pitfalls with a given function.
comment:32 Changed 10 years ago by
comment:33 Changed 10 years ago by
- Description modified (diff)
- Reviewers changed from David Kirkby to David Kirkby, Karl-Dieter Crisman, Ethan Van Andel
- Status changed from needs_work to needs_review
Updating should be straightforward, but some doctests will break because of some extra verbosity:
and
I think that's all the failures related to numpy-1.5.1