Opened 10 years ago

Closed 10 years ago

# Avoid a regression in the creation of homsets

Reported by: Owned by: Simon King tbd critical sage-5.7 performance Jean-Pierre Flori, Paul Zimmermann, Volker Braun, Robert Bradshaw, Nils Bruin, Martin Albrecht, Michael Orlitzky sage-5.7.beta3 Simon King Volker Braun N/A #715

### Description

By #715 and related tickets, one observes a dramatic regression in the following example.

```sage: p = polar_plot(lambda t: (100/(100+(t-pi/2)^8))*(2-sin(7*t)-cos(30*t)/2), -pi/4, 3*pi/2, color="red",plot_points=1000)
```

First of all, it is very strange that the following homset is created several thousands of times:

```Set of Homomorphisms from Integer Ring to Real Interval Field with 64 bits of precision
```

This also occurs without #715. Hence, enabling garbage collection seems not to be to blame.

However, the creation time for the homsets increases clearly, which is shown by prun.

### comment:1 Changed 10 years ago by Simon King

PS: There is almost no regression in this example, if #13014 is reverted, as Jeroen reported on sage-devel. Still, both problems should be addressed:

1. The previously existing repeated creation of a homset.
2. The new slowdown in the creation of homsets.
Last edited 10 years ago by Simon King (previous) (diff)

### comment:2 Changed 10 years ago by Simon King

#13014 has to do with gcd/lcm, and apparently gcd is used quite a lot in this example, according to #715, comment:366 - WHY??

Last edited 10 years ago by Simon King (previous) (diff)

### comment:3 Changed 10 years ago by Volker Braun

I take it that whenever you write `pi/2` we first see if that can be simplified to just the numerator. Which requires a gcd, and #13014 changed gcd of symbolics to be more expensive.

Really the example is very poor, a python function returning a symbolic expression. If you just had a symbolic function then the `fast_callable` trickery could do its job. Just leave out the `lambda t:` and it is >100 times faster!

```sage: var('t')
t
sage: %time p = polar_plot((100/(100+(t-pi/2)^8))*(2-sin(7*t)-cos(30*t)/2), -pi/4, 3*pi/2, color="red",plot_points=1000)
CPU times: user 0.04 s, sys: 0.00 s, total: 0.04 s
Wall time: 0.05 s
```

### comment:4 Changed 10 years ago by Simon King

Indeed, without the lambda, only one set of morphisms is created, namely Set of Morphisms from Set of Python objects of type `'sage.ext.fast_eval.FastDoubleFunc'` to Symbolic Ring in Category of sets.

### comment:5 Changed 10 years ago by Simon King

I defined `f = lambda t: (100/(100+(t-pi/2)^8))*(2-sin(7*t)-cos(30*t)/2)` and evaluated it on both integers, rationals and symbolic constants. But alas, no repeated creation of the same homset occurs in the process.

So, why are the homsets not taken from cache when calling `polar_plot`? The same happens with `plot`, by the way.

### comment:6 Changed 10 years ago by Simon King

Even when setting the number of evaluation points to 3, one gets loads of homsets. So, evaluation is probably not the culprit.

### comment:7 Changed 10 years ago by Simon King

I tried to simplify the circumstances a bit (smaller number of evaluation points, no symbolic expression in the definition of the plot range. However,

```sage: t = var('t')
sage: f = lambda t: (100/(100+(t-pi/2)^8))*(2-sin(7*t)-cos(30*t)/2)
sage: p = plot(f, -4, 3,plot_points=3)
```

is still nasty.

### comment:8 follow-up:  9 Changed 10 years ago by Volker Braun

The plot is adaptive, so even if you specify 3 points it uses many more.

About the caching, is the problem that the cache gets deleted too quickly? Between two evaluations of the lambda there is nothing that holds a reference to the symbolic expression. So all coercion caches should be free to be garbage collected.

### comment:9 in reply to:  8 Changed 10 years ago by Simon King

About the caching, is the problem that the cache gets deleted too quickly?

Have we been using weak references to cache homsets prior to #715?

Between two evaluations of the lambda there is nothing that holds a reference to the symbolic expression. So all coercion caches should be free to be garbage collected.

Here is the problem:

```QQ.gcd(3.14159,2*3.14159)
```

creates two times `Set of Homomorphisms from Integer Ring to Real Interval Field with 64 bits of precision in Category of euclidean domains`. So, no symbolics involved at this point.

### comment:10 Changed 10 years ago by Simon King

The cache of the Hom function seems not to be used, even though the Hom function is called. Namely, defining

```sage: H = Hom(ZZ, RealIntervalField(prec=64), category=EuclideanDomains())
```

I still find that

```sage: QQ.gcd(3.14159,2*3.14159)
```

creates two copies of H each time it is called. Odd.

### comment:11 Changed 10 years ago by Simon King

Got it!!!

`RealIntervalField(prec=64)` should be a unique parent, but it isn't! It is created in two different addresses in memory during QQ.gcd(3.14159,2*3.14159). So, apparently the `RealIntervalField` constructor is not called in `QQ.gcd`. That would be a clear bug.

Last edited 10 years ago by Simon King (previous) (diff)

### comment:12 Changed 10 years ago by Simon King

My apologies to whoever created `QQ.gcd`. It now seems that the coercion into `QQ` is to blame:

```sage: a = QQ(3.14159)
```

internally creates a new copy of `RealIntervalField(prec=64)`, and hence a new homset.

### comment:13 Changed 10 years ago by Simon King

Look into `sage.rings.real_mpfi.RealIntervalFieldElement.simplest_rational`: It calls `RealIntervalField_class` directly, but should call the `RealField` constructor (or, if the overhead of calling a function matters, should look into the cache first).

### comment:14 Changed 10 years ago by Simon King

With the following patch, the homsets are taken from cache. I don't know about timings (calling overhead) yet.

• ## sage/rings/real_mpfi.pyx

`diff --git a/sage/rings/real_mpfi.pyx b/sage/rings/real_mpfi.pyx`
 a # First, we try using approximate arithmetic of slightly higher # precision. cdef RealIntervalFieldElement highprec highprec = RealIntervalField_class(int(self.prec() * 1.2))(self) highprec = RealIntervalField(int(self.prec() * 1.2))(self) cdef Rational try1 = highprec._simplest_rational_helper()

### comment:15 Changed 10 years ago by Simon King

PS: How could that fix be tested against?

### comment:16 follow-up:  18 Changed 10 years ago by Simon King

Authors: → Simon King new → needs_review

The attached patch is preliminary, as it is not tested, and I don't know if I was over-eager to cdefine stuff.

Anyway. Without the patch, it is

```sage: %timeit a = QQ.gcd(3.14159,2*3.14159)
125 loops, best of 3: 4.58 ms per loop
sage: var('t')
t
sage: %time p = plot(lambda t: (100/(100+(t-pi/2)^8))*(2-sin(7*t)-cos(30*t)/2), -pi/4, 3*pi/2, color="red",plot_points=1000)
CPU times: user 87.06 s, sys: 0.69 s, total: 87.75 s
Wall time: 88.15 s
```

With the patch, it is

```sage: %timeit a = QQ.gcd(3.14159,2*3.14159)
125 loops, best of 3: 2.33 ms per loop
sage: var('t')
t
sage: %time p = plot(lambda t: (100/(100+(t-pi/2)^8))*(2-sin(7*t)-cos(30*t)/2), -pi/4, 3*pi/2, color="red",plot_points=1000)
CPU times: user 55.41 s, sys: 0.00 s, total: 55.41 s
Wall time: 55.48 s
```

Would that be enough to fix the regression completely?

### comment:17 Changed 10 years ago by Robert Bradshaw

I haven't tested this yet, but the code looks good to me and clearly fixes a bug.

### comment:18 in reply to:  16 ; follow-up:  19 Changed 10 years ago by Nils Bruin

Not introduced by this patch, so shouldn't be held against it, but ...

```RealIntervalField_cache = {}
```

Is a strong cache! So `RealIntervalField`s get nailed in memory. That's a very good argument for ensuring that "increase precision" leads to a very predictable and sparse list of precisions, to minimize the costs from this.

### comment:19 in reply to:  18 ; follow-up:  20 Changed 10 years ago by Simon King

Not introduced by this patch, so shouldn't be held against it, but ...

```RealIntervalField_cache = {}
```

Is a strong cache!

Yes. And since `RealIntervalField` is a very simple function (taking just one integer and one bool as argument), I would actually suggest to let `RealIntervalField_class` inherit from `UniqueRepresentation` and make `RealIntervalField` and alias for `RealIntervalField_class`. The point is that `UniqueRepresentation` will soonish have a weak cache (namely: If all the problems with #715 and #11521 are sorted out). See #12215.

The questions are:

1. Do we want to the change from a custom constructor function to `UniqueRepresentation`?
2. Do we want to change it here, or shall we have a quick fix for now and do it properly on a different ticket?

I expect that it would be relatively harmless to use `UniqueRepresentation` in this case. Shall I try?

### comment:20 in reply to:  19 Changed 10 years ago by Simon King

I expect that it would be relatively harmless to use `UniqueRepresentation` in this case.

I stand corrected. `UniqueRepresentation` is a Python class, and thus can not be used as a base class of a Cython class like `RealIntervalField_class`.

### comment:21 Changed 10 years ago by Simon King

Question to Robert: Is it possible to use a metaclass in Cython?

I tried:

```cdef class Test:
__metaclass__ = ClasscallMetaclass
@staticmethod
def __classcall__(cls,data):
print "classcall"
O = super(cls,cls).__new__(cls,data)
cls.__init__(O,data)
return O
def __init__(self, data):
print "init"

class Test2:
__metaclass__ = ClasscallMetaclass
@staticmethod
def __classcall__(cls,data):
print "classcall"
O = super(cls,cls).__new__(cls,data)
cls.__init__(O,data)
return O
def __init__(self, data):
print "init"
```

While I obtain

```sage: Test2('bla')
classcall
init
<...Test2 object at 0x6128680>
```

I only get

```sage: Test('bla')
init
<...Test object at 0x72eddc0>
```

with the Cython function. Hence, it compiles with the metaclass, but apparently it has no effect.

### comment:22 Changed 10 years ago by Robert Bradshaw

I think metaclasses are only supported for non-cdef classes. (This should probably be a compile-time error rather than silently ignoring it.)

There's also sage.structure.factory.UniqueFactory?. It may also be worth holding a strong reference to the last N (for some small N) parents created to avoid creating the same one over and over.

In any case, this isn't a regression, so I'd be happy to get this fix in and post a follow-up ticket to ensure uniqueness.

### comment:23 Changed 10 years ago by Volker Braun

Reviewers: → Volker Braun needs_review → positive_review

Well then lets get it in ;-)

### comment:24 follow-up:  25 Changed 10 years ago by Jeroen Demeyer

Status: positive_review → needs_work

The following Cython-generated file needs to be added to `.hgignore`:

```sage/rings/real_mpfi.h
```

### comment:25 in reply to:  24 Changed 10 years ago by Simon King

Status: needs_work → positive_review

The following Cython-generated file needs to be added to `.hgignore`:

```sage/rings/real_mpfi.h
```

Done. And I hope it is ok if I revert it to "positive review".

### comment:26 Changed 10 years ago by Jeroen Demeyer

Merged in: → sage-5.7.beta3 → fixed positive_review → closed
Note: See TracTickets for help on using tickets.