Opened 7 years ago
Closed 16 months ago
#13135 closed defect (fixed)
list_plot3d.py should not generate NaN coordinates
Reported by:  jdemeyer  Owned by:  jason, was 

Priority:  blocker  Milestone:  sage8.2 
Component:  graphics  Keywords:  
Cc:  strogdon, fbissey  Merged in:  
Authors:  Jeroen Demeyer  Reviewers:  Steven Trogdon 
Report Upstream:  N/A  Work issues:  
Branch:  4d17a73 (Commits)  Commit:  4d17a73d3e0b7f7151acc4e29146c3992f3ac43c 
Dependencies:  Stopgaps: 
Description (last modified by )
Both #12798 and the matplotlib upgrade (#23696) cause this:
sage t long warnlong 10.0 src/sage/plot/plot3d/list_plot3d.py ********************************************************************** File "src/sage/plot/plot3d/list_plot3d.py", line 143, in sage.plot.plot3d.list_plot3d.list_plot3d Warning, slow doctest: list_plot3d(l, interpolation_type='clough', texture='yellow', num_points=100) Test ran for 926.47 s **********************************************************************
on hardware which is slow in computing with NaN
values.
This plot is done by tachyon but in the input to tachyon, there are many NaN
s. This can be seen by applying nancount.patch. The result is
sage t long src/sage/plot/plot3d/list_plot3d.py ********************************************************************** File "src/sage/plot/plot3d/list_plot3d.py", line 143, in sage.plot.plot3d.list_plot3d.list_plot3d Failed example: list_plot3d(l, interpolation_type='clough', texture='yellow', num_points=100) Expected: Graphics3d Object Got: Got 3206 NaN values Graphics3d Object ********************************************************************** File "src/sage/plot/plot3d/list_plot3d.py", line 387, in sage.plot.plot3d.list_plot3d.list_plot3d_tuples Failed example: show(list_plot3d([[1, 1, 1], [1, 2, 1], [0, 1, 3], [1, 0, 4]], point_list=True)) Expected nothing Got: Got 31 NaN values ********************************************************************** File "src/sage/plot/plot3d/list_plot3d.py", line 391, in sage.plot.plot3d.list_plot3d.list_plot3d_tuples Failed example: list_plot3d([(1, 2, 3), (0, 1, 3), (2, 1, 4), (1, 0, 2)], texture='yellow', num_points=50) Expected: Graphics3d Object Got: Got 1273 NaN values Graphics3d Object **********************************************************************
Attachments (3)
Change History (69)
comment:1 Changed 7 years ago by
 Milestone changed from sage5.2 to sage5.1
comment:2 Changed 7 years ago by
 Description modified (diff)
comment:3 Changed 7 years ago by
 Description modified (diff)
comment:4 followup: ↓ 6 Changed 7 years ago by
comment:5 Changed 7 years ago by
 Description modified (diff)
comment:6 in reply to: ↑ 4 ; followup: ↓ 7 Changed 7 years ago by
Replying to ppurka:
This takes 0.18s on my laptop in sage5.1.beta5. I don't see why it should take >5s on some other hardware unless the hardware is really that slow.
Do an explaination of why it could have slowed down a lot because of this ticket?
comment:7 in reply to: ↑ 6 Changed 7 years ago by
Replying to jdemeyer:
Replying to ppurka:
This takes 0.18s on my laptop in sage5.1.beta5. I don't see why it should take >5s on some other hardware unless the hardware is really that slow.
Do an explaination of why it could have slowed down a lot because of this ticket?
I think the way to handle the slow down is to reduce the num_points
to 100.
It is not really a "slow down" per se, but it actually gives an accurate plot. Before #13135 3d plots of lists of points were just plain wrong.
I think the following is a good means of checking the time that the plot in the revised description takes
sage: %time list_plot3d([(0,0,1), (2,3,4), (1,1,1)],num_points=400).export_jmol('/tmp/a.jmol') CPU times: user 5.37 s, sys: 0.05 s, total: 5.42 s Wall time: 5.44 s sage: %time list_plot3d([(0,0,1), (2,3,4), (1,1,1)],num_points=100).export_jmol('/tmp/a.jmol') CPU times: user 0.35 s, sys: 0.00 s, total: 0.35 s Wall time: 0.35 s
Can you run this command on the Solaris machine and find out what is the time taken? If it takes much less time on the latter command, then it would be good to have the latter one (with 100 points) instead of the former one (with 400 points).
comment:8 Changed 7 years ago by
Here's an other weird thing: the test takes much longer when running the doctests than when run individually. Inside Sage, there is not much difference with/without the #12798 patches. But when running the doctest, there is a huge difference.
Could there be a resource leak of some kind?
comment:9 Changed 7 years ago by
A resource leak might be a possibility. I ran the following two commands:
Vanilla sage5.1beta5
...sage5.1.beta5/devel/sage» ../../sage t long force_lib sage/plot/plot3d/list_plot3d.py sage t long force_lib "devel/sagemain/sage/plot/plot3d/list_plot3d.py" [10.3 s]  All tests passed! Total time for all tests: 10.3 seconds
Then I changed that list_plot
example to the following:
sage: list_plot3d([(0,0,1), (2,3,4), (1,1,1)],num_points=400).export_jmol(tmp_filename()+'.jmol') # long time
After change to the list_plot
example
...sage5.1.beta5/devel/sage» ../../sage t long force_lib sage/plot/plot3d/list_plot3d.py sage t long force_lib "devel/sagemain/sage/plot/plot3d/list_plot3d.py" [8.6 s]  All tests passed! Total time for all tests: 8.6 seconds
That is quite a bit of a difference just for dumping the jmol into a file.
Is there some automated way to measure the memory/cpu usage of a running process?
comment:10 Changed 7 years ago by
 Description modified (diff)
 Milestone changed from sage5.1 to sage5.2
 Priority changed from blocker to major
comment:11 Changed 6 years ago by
 Milestone changed from sage5.11 to sage5.12
comment:12 Changed 5 years ago by
 Milestone changed from sage6.1 to sage6.2
comment:13 Changed 5 years ago by
 Milestone changed from sage6.2 to sage6.3
comment:14 Changed 5 years ago by
 Milestone changed from sage6.3 to sage6.4
comment:15 Changed 17 months ago by
 Description modified (diff)
 Summary changed from list_plot3d.py timeout in Solaris to list_plot3d.py should not generate NaN coordinates
comment:16 Changed 17 months ago by
 Priority changed from major to blocker
comment:17 Changed 17 months ago by
 Description modified (diff)
comment:18 Changed 17 months ago by
I'm attaching an example Tachyon input file generated by list_plot3d
.
comment:19 Changed 17 months ago by
 Description modified (diff)
 Milestone changed from sage6.4 to sage8.2
comment:20 Changed 17 months ago by
 Description modified (diff)
comment:21 Changed 17 months ago by
 Description modified (diff)
comment:22 Changed 17 months ago by
simpler case maybe
sage: var('x,y') sage: f = lambda x,y: x+y if x*x+y*y<=1 else NaN sage: plot3d(f,(x,2,2),(y,2,2)).tachyon()
comment:23 followups: ↓ 24 ↓ 34 Changed 17 months ago by
and
sage: G = polygon([(NaN,NaN,NaN)]*3) sage: G.tachyon_repr(G.default_render_params()) ['TRI V0 nan nan nan V1 nan nan nan V2 nan nan nan', 'texture3']
Maybe we should filter this out of tachyon input ? or more generally ?
comment:24 in reply to: ↑ 23 Changed 17 months ago by
comment:25 Changed 17 months ago by
 Cc strogdon added
comment:26 Changed 17 months ago by
Doesn't explain the NaN
but could the following be what one is observing when doctesting
sage: l = [] sage: for i in range(5, 5): ....: for j in range(5, 5): ....: l.append((normalvariate(0, 1), normalvariate(0, 1), normalvariate(0, 1))) ....: sage: %timeit list_plot3d(l, interpolation_type='clough', texture='yellow', num_points=100) The slowest run took 109.32 times longer than the fastest. This could mean that an intermediate result is being cached. 1 loop, best of 3: 3.11 ms per loop
The first run is always the slowest.
comment:27 Changed 17 months ago by
 Cc fbissey added
comment:28 Changed 17 months ago by
OK in the case of clough
and linear
we could able to filter NaN
at a cost. Both define a function g
like so
def g(x, y): z = f([x, y]) return (x, y, z)
I guess g
could return empty if z
turns to be a NaN
. Is it even doable?
comment:29 Changed 17 months ago by
Filtering NaN
s is easy, but that is really ignoring the problem. The real problem is that NaN
s are appearing in a computation which did not produce NaN
s before #23696.
comment:30 Changed 17 months ago by
The interpolation algorithm is completely different, so there is a possibility that this particular algorithm is completely unsuited to this particular set of points. I am now curious to see what the plot look like but that will have to wait until I go back to work tomorrow morning (~14 hours in my time zone).
comment:31 followup: ↓ 32 Changed 17 months ago by
It may be that the nan
are a feature of using scipy.interpolate.CloughTocher2DInterpolator. The documentation indicates that if requested points are outside the convex hull then, unless specified, they are filled in with nan
.
comment:32 in reply to: ↑ 31 ; followup: ↓ 33 Changed 17 months ago by
Replying to strogdon:
It may be that the
nan
are a feature of using scipy.interpolate.CloughTocher2DInterpolator. The documentation indicates that if requested points are outside the convex hull then, unless specified, they are filled in withnan
.
Thanks for the info. If so, filtering them probably is the right thing to do after all.
comment:33 in reply to: ↑ 32 Changed 17 months ago by
Replying to jdemeyer:
Replying to strogdon:
It may be that the
nan
are a feature of using scipy.interpolate.CloughTocher2DInterpolator. The documentation indicates that if requested points are outside the convex hull then, unless specified, they are filled in withnan
.Thanks for the info. If so, filtering them probably is the right thing to do after all.
Yes, this algorithm just doesn't do extrapolation and put NaN for anything that would be an extrapolation. We could set an arbitrary value, but to what? The average z value of the input? 0?
comment:34 in reply to: ↑ 23 Changed 17 months ago by
Replying to chapoton:
Maybe we should filter this out of tachyon input ? or more generally ?
If you want to filter them, I think the right place is the triangulate()
method.
comment:35 Changed 17 months ago by
I just spent some time with matplotlib code to check what it does for extrapolation. And it looks like it returns NaN
for those points as well. So the problem should be present as well for linear
interpolation. I couldn't find anything in the documentation about what the stuff behind spline
do for extrapolation  by the look of the output picture it just extrapolates.
comment:36 followup: ↓ 37 Changed 17 months ago by
There is some discussion also on #12798 about NaN
coordinates and even a patch.
comment:37 in reply to: ↑ 36 ; followups: ↓ 38 ↓ 40 Changed 17 months ago by
comment:38 in reply to: ↑ 37 ; followup: ↓ 39 Changed 17 months ago by
Replying to strogdon:
in so doing it appears the surface is cropped such that it is not displayed over the entire convex hull.
Maybe, but I wouldn't try to fix that here. It would be a nontrivial change to change the triangulate()
method to better take into account the input points. That is not something that we should fix in this blocker ticket.
comment:39 in reply to: ↑ 38 Changed 17 months ago by
Replying to jdemeyer:
Replying to strogdon:
in so doing it appears the surface is cropped such that it is not displayed over the entire convex hull.
Maybe, but I wouldn't try to fix that here. It would be a nontrivial change to change the
triangulate()
method to better take into account the input points. That is not something that we should fix in this blocker ticket.
Does it help your situation? I can't really test whether there is a time savings. I can only verify that the NaN
are stripped.
comment:40 in reply to: ↑ 37 ; followup: ↓ 41 Changed 17 months ago by
Replying to strogdon:
The patch does remove the
NaN
values but in so doing it appears the surface is cropped such that it is not displayed over the entire convex hull.
That is, most of them are gone. There are still a few corner
ones left.
comment:41 in reply to: ↑ 40 ; followup: ↓ 42 Changed 17 months ago by
Replying to strogdon:
Replying to strogdon:
The patch does remove the
NaN
values but in so doing it appears the surface is cropped such that it is not displayed over the entire convex hull.That is, most of them are gone. There are still a few
corner
ones left.
Can you elaborate on that. I thought NaN
were not displayed anyway, so I didn't expect a difference before and after stripping them.
comment:42 in reply to: ↑ 41 Changed 17 months ago by
Replying to fbissey:
Replying to strogdon:
Replying to strogdon:
The patch does remove the
NaN
values but in so doing it appears the surface is cropped such that it is not displayed over the entire convex hull.That is, most of them are gone. There are still a few
corner
ones left.Can you elaborate on that. I thought
NaN
were not displayed anyway, so I didn't expect a difference before and after stripping them.
Basically, from my understanding, the stripping is an attempt to find the maximal rectangle that fits inside the convex hull and the 3D surface is drawn over this rectangle. The procedure sometimes fails, most notably at the corners of the constructed rectangle.
comment:43 Changed 17 months ago by
Patch to see changes before and after stripping the NaN
s. Apply as patch p1 < before_after_stripping_nan.patch
and then rebuild sage, ./sage b
. To test use that in
comment:26 without the timeit
.
comment:44 followup: ↓ 45 Changed 17 months ago by
Yes, I can see there is some differences. What does the plotting do with NaN
?
comment:45 in reply to: ↑ 44 Changed 17 months ago by
Replying to fbissey:
Yes, I can see there is some differences. What does the plotting do with
NaN
?
My understanding is that points that have associated values of NaN
are not plotted. When the test is run there should be two plots. The plot where the NaN
are stripped
should be displayed above a nearly rectangular region, except possibly, near the corners of the region. It is at these corners where there are some, possibly remaining NaN
s.
comment:46 followup: ↓ 49 Changed 16 months ago by
From commit 950bdd8145d262e493321bef67d3d57d26f9c464 the changes to src/sage/plot/plot3d/list_plot3d.py
contain
 T=delaunay.Triangulation(x,y)  f=T.nn_interpolator(z)  f.default_value=0.0  j=numpy.complex(0,1)  vals=f[ymin:ymax:j*num_points,xmin:xmax:j*num_points] + points=[[x[i],y[i]] for i in range(len(x))] + j = numpy.complex(0, 1) + f = interpolate.CloughTocher2DInterpolator(points,z) from .parametric_surface import ParametricSurface
So if I'm not mistaken the old delaunay triagulation
implementation fills points outside the convex hull with zeros, i.e. f.default_value=0.0
. The same thing can be achieved with scipy.interpolate.CloughTocher2DInterpolator with the change

src/sage/plot/plot3d/list_plot3d.py
diff git a/src/sage/plot/plot3d/list_plot3d.py b/src/sage/plot/plot3d/list_plot3d.py index c0c578b99f..77aeb583ae 100644
a b def list_plot3d_tuples(v, interpolation_type, texture, **kwds): 461 461 462 462 points=[[x[i],y[i]] for i in range(len(x))] 463 463 j = numpy.complex(0, 1) 464 f = interpolate.CloughTocher2DInterpolator(points,z )464 f = interpolate.CloughTocher2DInterpolator(points,z,fill_value=0.0) 465 465 from .parametric_surface import ParametricSurface 466 466 def g(x, y): 467 467 z = f([x, y])
In my opinion the resulting plot is not particularly pleasing but it does reproduce the old behavior. I'm unable to determine whether there is any CPU savings. I do believe that modern graphing algorithms generate these NaN
s as a mechanism to not plot points and that leaving things asis is what should be done. In which case the doctest could be marked as # long time
until the older hardware that doesn't like the NaN
s is gone. So I would favor either of the above two approaches instead of stripping
the NaN
s.
comment:47 followup: ↓ 48 Changed 16 months ago by
I must say that in a previous comment I mentioned the possibility of using fill_value
. If 0
reproduce the old behavior we could go with that but I don't think it is an especially good or always appropriate value. As you said if NaN
means do not plot, then it is more appropriate.
The problem is not with older hardware, power8 is not especially old, but its handling of NaN is currently poor. I don't know if it can be improved at the software level eventually but I would hope so.
comment:48 in reply to: ↑ 47 Changed 16 months ago by
Replying to fbissey:
I must say that in a previous comment I mentioned the possibility of using
fill_value
. If0
reproduce the old behavior we could go with that but I don't think it is an especially good or always appropriate value. As you said ifNaN
means do not plot, then it is more appropriate.The problem is not with older hardware, power8 is not especially old, but its handling of NaN is currently poor. I don't know if it can be improved at the software level eventually but I would hope so.
I couldn't check things with vanilla Sage. But I was able to install MPL1.5.3 on sageongentoo and I could manually replicate the current behavior by setting f.default_value=numpy.float('nan')
with the old delaunay triangulation
.
comment:49 in reply to: ↑ 46 ; followup: ↓ 50 Changed 16 months ago by
Replying to strogdon:
I do believe that modern graphing algorithms generate these
NaN
s as a mechanism to not plot points and that leaving things asis is what should be done.
1.
Plotting NaN
s makes no sense at all. It can only lead to garbage in, garbage out.
I am actually surprised that removing the NaN
s changes the output. I haven't looked at the code, but it makes me suspect that maybe the NaN
removal is not done correctly.
comment:50 in reply to: ↑ 49 ; followup: ↓ 51 Changed 16 months ago by
Replying to jdemeyer:
Plotting
NaN
s makes no sense at all. It can only lead to garbage in, garbage out.
Try it!
sage: import matplotlib.pyplot as plt sage: fig, (ax1, ax2) = plt.subplots(1,2) sage: ax1.plot([1, 2, 2, 2, 3], [1, 1, NaN, 2, 2]); sage: ax2.plot([1, 2, 2, 3], [1, 1, 2, 2]); sage: plt.show()
There may be other ways, but both scipy and matplotlib know how to handle such things.
comment:51 in reply to: ↑ 50 Changed 16 months ago by
What's your point? That example is showing a Line2D
object which is a highlevel object: it's a collection of lines, which the lowlevel renderer will treat as individual lines to be drawn.
It's precisely the transformation of a highlevel object (the input to list_plot3d
) to a lowlevel object (a collection of triangles) that is going wrong in this ticket.
comment:52 followup: ↓ 53 Changed 16 months ago by
In other words: the bug is not about a NaN
appearing in the input of list_plot3d
. It is about a NaN
appearing in the processing of otherwise valid input and being sent to the rendering engine.
comment:53 in reply to: ↑ 52 ; followup: ↓ 54 Changed 16 months ago by
Replying to jdemeyer:
In other words: the bug is not about a
NaN
appearing in the input oflist_plot3d
. It is about aNaN
appearing in the processing of otherwise valid input and being sent to the rendering engine.
OK, so I guess there could be a problem inside ParametricSurface
which receives input that contains the NaN
s. A big task to fix that. So perhaps we should do as was done previously with the old code until ParametricSurface
can be investigated, i.e. fill points outside the convex hull with zeros? I believe we know how to do that with CloughTocher2DInterpolator
.
comment:54 in reply to: ↑ 53 Changed 16 months ago by
Replying to strogdon:
OK, so I guess there could be a problem inside
ParametricSurface
which receives input that contains theNaN
s. A big task to fix that.
I still think that it should be possible to just remove all the triangles containing a NaN
. That can't be too hard.
comment:55 Changed 16 months ago by
 Priority changed from blocker to critical
Doesn't strike me as a release blocker bug...
comment:56 Changed 16 months ago by
 Priority changed from critical to blocker
It is. It yields a massive (orders of magnitude) slowdown on most platforms (x86_64 is the exception here), causing make ptest
to timeout consistently and it is a recent regression.
comment:57 Changed 16 months ago by
comment:58 Changed 16 months ago by
 Branch set to u/jdemeyer/list_plot3d_py_should_not_generate_nan_coordinates
comment:59 Changed 16 months ago by
 Commit set to 4c5a3736ed18c692f0d63ba6b22e52ff1d7cd9a8
Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:
4c5a373  Remove vertices containing a NaN or infinity

comment:60 Changed 16 months ago by
 Status changed from new to needs_review
comment:61 followup: ↓ 62 Changed 16 months ago by
 Status changed from needs_review to needs_work
I thought all along that if there was a problem it had to be with the code that does the triangulation, not with the inputs. It's good that someone was capable of sorting through the code. I see no NaN
s in the output and the surfaces appear to be drawn over the full convex hull. It may be interesting to see what are the CPU savings? But I do see at least one failure
sage t long src/sage/plot/plot3d/parametric_surface.pyx ********************************************************************** File "src/sage/plot/plot3d/parametric_surface.pyx", line 257, in sage.plot.plot3d.parametric_surface.ParametricSurface.obj_repr Failed example: s[:2]+s[2][:3]+s[3][:3] Expected: ['g obj_1', 'usemtl texture91', 'v 2 2 0', 'v 2 1.89744 0.399737', 'v 1.89744 1.89744 0', 'f 1 2 3 4', 'f 2 5 6 3', 'f 5 7 8 6'] Got: ['g obj_1', 'usemtl texture111', 'v 2 2 0', 'v 2 1.89744 0.399737', 'v 1.89744 1.89744 0', 'f 1 2 3 4', 'f 2 5 6 3', 'f 5 7 8 6'] ********************************************************************** 1 item had failures: 1 of 5 in sage.plot.plot3d.parametric_surface.ParametricSurface.obj_repr
comment:62 in reply to: ↑ 61 Changed 16 months ago by
Replying to strogdon:
But I do see at least one failure
I cannot reproduce that failure but the original test had texture...
so I'll change that.
comment:63 Changed 16 months ago by
 Commit changed from 4c5a3736ed18c692f0d63ba6b22e52ff1d7cd9a8 to 4d17a73d3e0b7f7151acc4e29146c3992f3ac43c
Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:
4d17a73  Remove vertices containing a NaN or infinity

comment:64 Changed 16 months ago by
 Status changed from needs_work to needs_review
comment:65 Changed 16 months ago by
 Reviewers set to Steven Trogdon
 Status changed from needs_review to positive_review
On my very old, although x86_64, computer
sage t long warnlong 10.0 src/sage/plot/plot3d/list_plot3d.py [45 tests, 23.20 s]  All tests passed!  Total time for all tests: 23.3 seconds cpu time: 4.4 seconds cumulative wall time: 23.2 seconds
The results do visually look to be correct. I can't test on anything but x86_64 but if it is slow elsewhere it's not now due to the NaN
s.
comment:66 Changed 16 months ago by
 Branch changed from u/jdemeyer/list_plot3d_py_should_not_generate_nan_coordinates to 4d17a73d3e0b7f7151acc4e29146c3992f3ac43c
 Resolution set to fixed
 Status changed from positive_review to closed
This takes 0.18s on my laptop in sage5.1.beta5. I don't see why it should take >5s on some other hardware unless the hardware is really that slow.