Opened 5 years ago
Closed 5 years ago
#17814 closed defect (fixed)
Make calling a cached method independent of source code inspection
Reported by:  SimonKing  Owned by:  

Priority:  major  Milestone:  sage6.6 
Component:  distribution  Keywords:  
Cc:  Merged in:  
Authors:  Simon King  Reviewers:  Volker Braun 
Report Upstream:  N/A  Work issues:  
Branch:  0e8ef00 (Commits)  Commit:  0e8ef000e434b654c76299b2ea694ca15be965a3 
Dependencies:  Stopgaps: 
Description (last modified by )
If a pyx file in the sage library uses cached methods and then the source file is removed after compiling Sage, then accessing the cached methods in this file becomes impossible. Example: Move the file src/sage/rings/finite_rings/finite_field_base.pyx
away. Then:
sage: K=GF(5) sage: K.factored_order  AttributeError Traceback (most recent call last) <ipythoninput2488672e9c19b> in <module>() > 1 K.factored_order() /home/king/Sage/git/sage/src/sage/structure/parent.pyx in sage.structure.parent.Parent.__getattr__ (build/cythonized/sage/structure/parent.c:7863)() 838 return self.__cached_methods[name] 839 except KeyError: > 840 attr = getattr_from_other_class(self, self._category.parent_class, name) 841 self.__cached_methods[name] = attr 842 return attr /home/king/Sage/git/sage/src/sage/structure/misc.pyx in sage.structure.misc.getattr_from_other_class (build/cythonized/sage/structure/misc.c:1582)() 249 dummy_error_message.cls = type(self) 250 dummy_error_message.name = name > 251 raise dummy_attribute_error 252 try: 253 attribute = getattr(cls, name) AttributeError: 'FiniteField_prime_modn_with_category' object has no attribute 'factored_order'
The same does work in attached pyx files, even after removing the temporary copy of the pyx file. Conjecture: The cached method's __get__
method relies on source code inspection. That should of course not be the case. Source code inspection should only come in play if the user wants to see the source code or at least the doc string.
I chose the component "distribution", since it was reported on sagedevel as something that happened on certain sage distributions that do not provide sage sources.
Change History (74)
comment:1 Changed 5 years ago by
 Component changed from PLEASE CHANGE to distribution
 Description modified (diff)
comment:2 Changed 5 years ago by
 Type changed from PLEASE CHANGE to defect
comment:3 Changed 5 years ago by
comment:4 followup: ↓ 5 Changed 5 years ago by
Fixing sageinspect won't be easy. I currently have absolutely no clue how to read off the number of arguments (or even the names of the arguments) of a function that is defined in a Cython file. Those functions don't seem to have any attributes holding useful information.
Has someone else an idea? If not, then I don't see what we can do. We do want a special cached method implementation for methods without arguments, and thus we need to determine the number of arguments of the tobewrapped method. If that doesn't work without reading the sources, what could we possibly do?
comment:5 in reply to: ↑ 4 Changed 5 years ago by
Replying to SimonKing:
Fixing sageinspect won't be easy. I currently have absolutely no clue how to read off the number of arguments (or even the names of the arguments)
Do you really require the number and names of arguments or do you only want to know whether a function takes any *args
and/or **kwds
at all (apart from the self
argument)?
Technically, Python always uses *args
and **kwds
to pass arguments (except for some special methods like __new__
). It's the Cython method implementation which interprets these *args
and **kwds
.
comment:6 followup: ↓ 7 Changed 5 years ago by
The question is also: if there is a method
def f(self, arg=1)
do we require that obj.f()
and obj.f(1)
and obj.f(arg=1)
all use the same cache entry? If you don't require that, there might be a solution by working on the level of *args
and **kwds
.
comment:7 in reply to: ↑ 6 ; followups: ↓ 8 ↓ 9 Changed 5 years ago by
Replying to jdemeyer:
The question is also: if there is a method
def f(self, arg=1)do we require that
obj.f()
andobj.f(1)
andobj.f(arg=1)
all use the same cache entry? If you don't require that, there might be a solution by working on the level of*args
and**kwds
.
Yes, we do require that. Equivalent inputs should result in identical output.
The problem occurs when the cached method is bound to an instance. At that point, we want to know if the method accepts any argument in addition to self
. If it does, then a CachedMethodCaller
is set as an attribute of the instance. If it does not, then a CachedMethodCallerNoArgs
is bound to the method. If sage can't answer the question, then an AttributeError
is raised. We don't need to know all argument names and all defaults, but we do need to know their number.
comment:8 in reply to: ↑ 7 ; followup: ↓ 10 Changed 5 years ago by
The two statements below seem like a contraction: using the example above, if you want to ensure that obj.f()
and obj.f(1)
and obj.f(arg=1)
use the same cache entry, you do need to know the names and defaults of the arguments.
Replying to SimonKing:
Yes, we do require that. Equivalent inputs should result in identical output.
We don't need to know all argument names and all defaults, but we do need to know their number.
comment:9 in reply to: ↑ 7 Changed 5 years ago by
Replying to SimonKing:
we want to know if the method accepts any argument in addition to
self
.
That's possible using the Python API function PyCFunction_GetFlags
. When applied to a bound method, it returns the flags as defined in https://docs.python.org/2/capi/structures.html#c.PyMethodDef
comment:10 in reply to: ↑ 8 ; followups: ↓ 11 ↓ 17 Changed 5 years ago by
Replying to jdemeyer:
The two statements below seem like a contraction: using the example above, if you want to ensure that
obj.f()
andobj.f(1)
andobj.f(arg=1)
use the same cache entry, you do need to know the names and defaults of the arguments.
For calling the function, you of course need it (so, I admit that the title of this ticket is misleading). However, for determining whether it ought to be a CachedMethodCaller
or a CachedMethodCallerNoArgs
, it suffices to know a little less.
And thank you for the link to PyCFunction_GetFlags
. So, it seems that sage.misc.sageinspect
should get a little addition sage.misc.sage_inspect_cython
written in Cython that provides helpers for inspection of Cython methods.
comment:11 in reply to: ↑ 10 Changed 5 years ago by
Replying to SimonKing:
For calling the function, you of course need it (so, I admit that the title of this ticket is misleading). However, for determining whether it ought to be a
CachedMethodCaller
or aCachedMethodCallerNoArgs
, it suffices to know a little less.
Sure, but that doesn't solve the problem really (unless you want to fix the issue of this ticket only for methods taking no arguments). You still need to call the function eventually.
comment:12 followup: ↓ 13 Changed 5 years ago by
Replying to SimonKing:
Replying to jdemeyer:
The question is also: if there is a method
def f(self, arg=1)do we require that
obj.f()
andobj.f(1)
andobj.f(arg=1)
all use the same cache entry? If you don't require that, there might be a solution by working on the level of*args
and**kwds
.Yes, we do require that.
Is that really so important? Does it happen a lot in practice that people or the Sage library call a function in different but equivalent ways?
comment:13 in reply to: ↑ 12 Changed 5 years ago by
Replying to jdemeyer:
Replying to SimonKing:
Yes, we do require that.
Is that really so important? Does it happen a lot in practice that people or the Sage library call a function in different but equivalent ways?
I believe it is important, and I think it is specified somewhere in the docs of sage.misc.cachefunc. There are places where it is used for UniqueRepresentation
. Indeed, if you have a unique parent that depends on arguments which (partially) have a default, then you simply don't know if the user will provide these arguments explicitly or implicitly. But when pickling/unpickling it, the arguments will be provided in a uniform way (always explicitly, if I recall correctly).
Hence, if we changed that, we would break uniqueness of UniqueRepresentation
under pickling.
comment:14 Changed 5 years ago by
I see. This is important where the semantics of cached_function
are really important.
I always had in mind only the application of speeding up the function by not computing the same thing multiple times. For the latter application, you need to consider also the overhead of parsing *args
and **kwds
to compute the correct cache entry. Where speed is important, we might just use *args
and **kwds
instead.
comment:15 Changed 5 years ago by
Proposal: have 2 different implementations of cached_function
:
 one for Python methods, which works just like the current implementation.
 one for Cython methods, which works on the level of
*args
and**kwds
.
comment:16 Changed 5 years ago by
This proposal would solve the issue on the ticket. Specializing for Cython, it might also be possible to speedup things more.
comment:17 in reply to: ↑ 10 ; followup: ↓ 18 Changed 5 years ago by
Replying to SimonKing:
And thank you for the link to
PyCFunction_GetFlags
. So, it seems thatsage.misc.sageinspect
should get a little additionsage.misc.sage_inspect_cython
written in Cython that provides helpers for inspection of Cython methods.
It is puzzling. When I call PyCFunction_GetFlags
on the function/method being wrapped, no crash occurs in that function. However, importing lazy_attribute then fails, and so Sage won't start. No idea how the two things (a function that gives me information on other functions without raising an error, and an import statement) could possibly interfere.
comment:18 in reply to: ↑ 17 Changed 5 years ago by
Replying to SimonKing:
Replying to SimonKing:
And thank you for the link to
PyCFunction_GetFlags
. So, it seems thatsage.misc.sageinspect
should get a little additionsage.misc.sage_inspect_cython
written in Cython that provides helpers for inspection of Cython methods.It is puzzling. When I call
PyCFunction_GetFlags
on the function/method being wrapped, no crash occurs in that function. However, importing lazy_attribute then fails, and so Sage won't start. No idea how the two things (a function that gives me information on other functions without raising an error, and an import statement) could possibly interfere.
It's hard to tell without seeing the code. Are you sure the PyCFunction_GetFlags
call is what creates the problem and not for example a changed import or the mere fact of accessing obj.method?
Anyway, I'm still very curious what you think about 11 and 15
comment:19 followup: ↓ 20 Changed 5 years ago by
The proposal from comment:15 would of course solve the problem of the introspection that is needed to wrap a cython function in a cached method. However, it would mean that the specification of uniqueness would be violated for those cython functions for which the argument spectrum is not available.
I'll try to show you a diff file shortly, so that you can see how calling PyCFunction_GetFlags
makes the import of lazy_attribute fail. Currently, I am rebuilding Sage, so, it may take a while.
comment:20 in reply to: ↑ 19 Changed 5 years ago by
Replying to SimonKing:
The proposal from comment:15 would of course solve the problem of the introspection that is needed to wrap a cython function in a cached method. However, it would mean that the specification of uniqueness would be violated for those cython functions for which the argument spectrum is not available.
True, but I think the uniqueness is only important for certain very specific functions. In any case, if we want to solve this ticket, I think it's the only way to go.
comment:21 Changed 5 years ago by
Try this:

src/sage/misc/cachefunc.pyx
diff git a/src/sage/misc/cachefunc.pyx b/src/sage/misc/cachefunc.pyx index a4276b9..515c74a 100644
a b import sage.misc.weak_dict 481 481 from sage.misc.weak_dict import WeakValueDictionary 482 482 from sage.misc.decorators import decorator_keywords 483 483 484 from cpython cimport PyObject 485 486 cdef extern from "methodobject.h": 487 cdef int PyCFunction_GetFlags(PyObject *op) 488 489 484 490 cdef frozenset special_method_names = frozenset(['__abs__', '__add__', 485 491 '__and__', '__call__', '__cmp__', '__coerce__', '__complex__', '__contains__', '__del__', 486 492 '__delattr__', '__delete__', '__delitem__', '__delslice__', '__dir__', '__div__', … … cdef class CachedMethod(object): 2672 2678 # we need to analyse the argspec 2673 2679 f = (<CachedFunction>self._cachedfunc).f 2674 2680 if self.nargs==0: 2681 bla = PyCFunction_GetFlags(<PyObject *>f) 2675 2682 args, varargs, keywords, defaults = sage_getargspec(f) 2676 2683 if varargs is None and keywords is None and len(args)<=1: 2677 2684 self.nargs = 1
According to the crash report that I get, the error occurs in the line args, varargs, keywords, defaults = sage_getargspec(f)
, thus, after PyCFunction_GetFlags
has returned something.
comment:22 Changed 5 years ago by
Replace
cdef int PyCFunction_GetFlags(PyObject *op)
by
cdef int PyCFunction_GetFlags(object op) except 1
comment:23 followup: ↓ 24 Changed 5 years ago by
With the correct except
declaration, there is an error if the argument is not a "builtin" (Cython) method:
SystemError: Objects/methodobject.c:64: bad argument to internal function
comment:24 in reply to: ↑ 23 Changed 5 years ago by
Replying to jdemeyer:
With the correct
except
declaration, there is an error if the argument is not a "builtin" (Cython) method:SystemError: Objects/methodobject.c:64: bad argument to internal function
OK, we then have an error whose traceback points to calling the function.
comment:25 followup: ↓ 26 Changed 5 years ago by
Apparently PyCFunction_GetFlags
is the wrong tool (or we first need to extract the cfunction from the method), since always an error is raised.
comment:26 in reply to: ↑ 25 ; followup: ↓ 28 Changed 5 years ago by
Replying to SimonKing:
Apparently
PyCFunction_GetFlags
is the wrong tool (or we first need to extract the cfunction from the method), since always an error is raised.
It is the right tool for bound Cython methods.
comment:27 Changed 5 years ago by
I am currently testing something else, but for example
sage: K = GF(5) sage: m = K.factored_unit_order sage: m <builtin method factored_unit_order of FiniteField_prime_modn_with_category object at 0x7f357853c1e0>
This m
object should be suitable as argument for PyCFunction_GetFlags
.
comment:28 in reply to: ↑ 26 Changed 5 years ago by
Replying to jdemeyer:
Replying to SimonKing:
Apparently
PyCFunction_GetFlags
is the wrong tool (or we first need to extract the cfunction from the method), since always an error is raised.It is the right tool for bound Cython methods.
Exactly. And what is wrapped by a cached method is not a bound method, but an unbound method (or even a function that is not a method at all?). Would it make sense to take the time to create a bound copy, after making sure that it is not a Python thingy?
comment:29 Changed 5 years ago by
 Branch set to u/SimonKing/make_calling_a_cached_method_independent_of_source_code_inspection
comment:30 followup: ↓ 32 Changed 5 years ago by
 Branch u/SimonKing/make_calling_a_cached_method_independent_of_source_code_inspection deleted
 Status changed from new to needs_review
I have attached a branch. If you have a cached method without arguments in a cython file and remove the sources before starting sage, then both accessing and calling the method should work. If you have a cached method with arguments in a cython file and remove the sources before starting sage, then accessing the method works, but calling would still not work. Methods in Python files should not be affected.
I don't know how to write a meaningful doctest for it (and also I didn't run the testsuite). Anyway, perhaps you can comment whether you think that it is a solution to the problem, or at least a step in the right direction.
comment:31 Changed 5 years ago by
 Branch set to u/SimonKing/make_calling_a_cached_method_independent_of_source_code_inspection
 Commit set to 30543e7b283c52cad2fbeb060efe78a74c7b42ba
Why has the branch field automatically been deleted?
comment:32 in reply to: ↑ 30 ; followup: ↓ 33 Changed 5 years ago by
comment:33 in reply to: ↑ 32 ; followups: ↓ 34 ↓ 35 Changed 5 years ago by
Replying to jdemeyer:
Personally, I still think that 15 is the best solution (but I'm still open to other suggestions).
This information is cheaply available when cython does its work, We could ask the cython devs if they would be willing to equip their function objects with attributes that play the role of __code__.co_argcount
, __code__.co_flags
(to see if *args
or **kwds
is present), __code__.varnames
, func_defaults
.
Cython writes the code that splits up the *args
and **kwargs
that python calls at CAPI get, and generates errors if it doesn't fit some internal pattern. For inspection and dynamical code, the required pattern information should really be made available.
They could either write it into the docstring (as they already do with source file information) or they could put some attributes on their cython function objects.
Example:
sage: cython("""def f(a,b=0): pass""") sage: f(a=1) sage: f(b=1) TypeError: f() takes at least 1 positional argument (0 given) sage: f(1,c=1) TypeError: f() got an unexpected keyword argument 'c'
I think it's quite reasonable if cython would make available what the number of positional arguments can be and which keyword arguments are accepted.
comment:34 in reply to: ↑ 33 Changed 5 years ago by
Replying to nbruin:
Replying to jdemeyer:
Personally, I still think that 15 is the best solution (but I'm still open to other suggestions).
This information is cheaply available when cython does its work, We could ask the cython devs if they would be willing to equip their function objects with attributes that play the role of
__code__.co_argcount
,__code__.co_flags
(to see if*args
or**kwds
is present),__code__.varnames
,func_defaults
.
If I recall correctly, there is a compile option to do so.
In any case, I believe that for now we should proceed as follows:
 Avoid source code inspection in the cached method's
__get__
(as done by the currently attached branch)  When creating the
ArgumentFixer
, we need to know the argspec. If it can not be determined, then no error should be raised. Instead, a generic argument fixer (or no argument fixer at all) shall be used, meaning that the argspec is assumed to be formed byself
,*args
and**kwds
. This I can implement shortly, as soon as I have reinstalled ccache on my current Sage version (now it takes hours to switch branches:/
).  In the long run, ask the cython devs to provide functions that can access the relevant information.
comment:35 in reply to: ↑ 33 Changed 5 years ago by
Replying to nbruin:
They could either write it into the docstring (as they already do with source file information) or they could put some attributes on their cython function objects.
I think the docstring is the only option: look at the fields of PyMethodDef
Even then, I would propose to use just *args
and **kwds
for Cython methods, which will surely be faster than messing with the list of arguments. In the case where uniquess really matters for Cython methods, a wrapper could be added to fix the arguments.
comment:36 followup: ↓ 37 Changed 5 years ago by
Now I am totally puzzled. There are the Cython modules sage.misc.function_mangling
and sage.misc.cachefunc
, but they don't appear in src/module_list.py
! So, why are they built at all? Is that a recent change? E.g.., is it not needed anymore to mention a .pyx
file in module_list.py
if it has no .pxd
and no compile options?
comment:37 in reply to: ↑ 36 ; followup: ↓ 40 Changed 5 years ago by
Replying to SimonKing:
Now I am totally puzzled. There are the Cython modules
sage.misc.function_mangling
andsage.misc.cachefunc
, but they don't appear insrc/module_list.py
! So, why are they built at all? Is that a recent change? E.g.., is it not needed anymore to mention a.pyx
file inmodule_list.py
if it has no.pxd
and no compile options?
See #17767
comment:38 followups: ↓ 42 ↓ 45 Changed 5 years ago by
Soon I will submit another commit. With it, I can do the following:
I created a file src/sage/misc/cache_test.pyx
and rebuilt Sage. In fact, the new pyx file got compiled even without mentioning it in module_list.py
. Content of the file:
from sage.misc.cachefunc import cached_method, cached_function from sage.structure.parent cimport Parent @cached_function def bla(x,y,z=2): return (x+y)*z cdef class Foo(Parent): @cached_method def bar(self, a,b, c=1): return (a+b)*c
I started a Sage session and got:
sage: from sage.misc.cache_test import Foo, bla sage: bla(2,3,2) is bla(2,3,z=2) True sage: F = Foo() sage: F.bar(2,3,2) is F.bar(2,3,c=2) True
as expected.
Then I left Sage, removed src/sage/misc/cache_test.pyx
. Restarting (but not recompiling!) Sage, I now get:
sage: from sage.misc.cache_test import Foo, bla sage: bla(2,3,2) is bla(2,3,z=2) False sage: bla(2,3,2) == bla(2,3,z=2) True sage: F = Foo() sage: F.bar(2,3,2) is F.bar(2,3,c=2) False sage: F.bar(2,3,2) == F.bar(2,3,c=2) True sage: bla(2,3,2) is bla(2,3,2) True sage: F.bar(2,3,2) is F.bar(2,3,2) True
So, the argument mangling fails, but at least one can call the wrapped functions and methods in the expected way, with a basic level of caching.
Now the questions are: Would that be enough to declare the problem as "fixed for now"? And how to doctest it? Is it possible/reasonable to have a source file that is temporarily moved away by its own doctest?
Branch will soon be updated.
comment:39 Changed 5 years ago by
 Commit changed from 30543e7b283c52cad2fbeb060efe78a74c7b42ba to a0b829dc10b4ccb00b99e073de8a1c0006aa1e51
Branch pushed to git repo; I updated commit sha1. New commits:
a0b829d  Make a cached function/method work with default argspec when introspection fails

comment:40 in reply to: ↑ 37 Changed 5 years ago by
comment:41 followup: ↓ 43 Changed 5 years ago by
If we need to access argument information on cython functions/methods, we might consider using
cython(""" import cython @cython.binding(True) def f(a,b=0): pass def g(a,b=0): pass def c(F,n_in): cdef int n=n_in for i in range(n): F(1) """)
With this we have
sage: f.func_defaults (0,) sage: f.__code__.co_varnames ('a', 'b')
There might be a penalty to pay, though:
sage: %time c(f,1000000000) CPU times: user 7.7 s, sys: 10 ms, total: 7.71 s Wall time: 8.06 s sage: %time c(g,1000000000) CPU times: user 7.43 s, sys: 13 ms, total: 7.45 s Wall time: 7.78 s
If we can do away with source file inspection during startup (as would now happen for cython modules), we should see a significant improvement of startup time too! Finding line numbers in a text file is not a fast operation.
comment:42 in reply to: ↑ 38 ; followup: ↓ 44 Changed 5 years ago by
Replying to SimonKing:
Is it possible/reasonable to have a source file that is temporarily moved away by its own doctest?
Certainly not :)
comment:43 in reply to: ↑ 41 ; followup: ↓ 47 Changed 5 years ago by
Replying to nbruin:
If we need to access argument information on cython functions/methods, we might consider using
cython(""" import cython @cython.binding(True) def f(a,b=0): pass def g(a,b=0): pass def c(F,n_in): cdef int n=n_in for i in range(n): F(1) """)
I think this trick only works with modulelevel functions, not method (but I need to recheck).
comment:44 in reply to: ↑ 42 ; followup: ↓ 46 Changed 5 years ago by
Replying to jdemeyer:
Replying to SimonKing:
Is it possible/reasonable to have a source file that is temporarily moved away by its own doctest?
Certainly not :)
Why not? The test itself is obtained from a temporary copy of the file, if I recall correctly. One might think of writing a test that renames the original source file, then does the test, and moves the source file back. Of course, it will only work if the user has write permission.
comment:45 in reply to: ↑ 38 Changed 5 years ago by
Replying to SimonKing:
So, the argument mangling fails, but at least one can call the wrapped functions and methods in the expected way, with a basic level of caching.
Honestly, I absolutely don't like it. It's much better if stuff obviously breaks (i.e. the current situation) instead of breaking in a very subtle way (some method relies on uniquess, which is not guaranteed if the source file is missing).
If this issue is fixed, it should be fixed in such a way that it doesn't matter at all whether the source file is present.
comment:46 in reply to: ↑ 44 Changed 5 years ago by
Replying to SimonKing:
The test itself is obtained from a temporary copy of the file, if I recall correctly.
That's no longer true (but the test is read before it is executed, so that's not the issue).
One might think of writing a test that renames the original source file, then does the test, and moves the source file back. Of course, it will only work if the user has write permission.
Doctests should never write anything below SAGE_ROOT
and your proposal will break badly if the test is interrupted at the wrong moment (you could try something with try
/finally
but still it's dangerous). Also, you will get race conditions (imagine you're editing the file while the test is running).
comment:47 in reply to: ↑ 43 ; followup: ↓ 48 Changed 5 years ago by
Replying to jdemeyer:
I think this trick only works with modulelevel functions, not method (but I need to recheck).
It does seem to work for class
in cython, but sadly not for cdef class
:
cython(""" my_global=[] def mydec(a): global my_global my_global.append(a) return(a) import cython class A(object): @mydec @cython.binding(True) def b(x,y=0): pass cdef class B(object): @mydec @cython.binding(True) def b(x,y=0): pass """)
sage: my_global [<cyfunction A.b at 0x7f1b0e947590>, <method 'b' of '_home_nbruin__sage_temp_art_17765_tmp_DC65IM_spyx_0.B' objects>]
The first object does have the func_*
attributes, but the second doesn't. Perhaps that is something that the cython devs can do something about.
comment:48 in reply to: ↑ 47 Changed 5 years ago by
Replying to nbruin:
The first object does have the
func_*
attributes, but the second doesn't. Perhaps that is something that the cython devs can do something about.
Not without a performance penalty. If you want the method to be a fast PyCFunction
, you are limited in which attributes it has.
comment:49 followup: ↓ 50 Changed 5 years ago by
In other words, I think it needs a Python patch first to add the hooks which then Cython could use.
comment:50 in reply to: ↑ 49 ; followup: ↓ 51 Changed 5 years ago by
Replying to jdemeyer:
In other words, I think it needs a Python patch first to add the hooks which then Cython could use.
well ... I think the more reasonable solution is to write the signature in the docstring, which is available. That's what cython already does to make source location available.
It's unfortunate: all the required objects already exist in the cython code, there are just no hooks to attach them to.
comment:51 in reply to: ↑ 50 ; followups: ↓ 52 ↓ 53 Changed 5 years ago by
Replying to nbruin:
It's unfortunate: all the required objects already exist in the cython code, there are just no hooks to attach them to.
That's why I suggest to have a practical fix that works for now. Recall that the problem from the ticket description came from a report on sagedevel, where people actually work with flavours of Sage that come without the source code.
I think that the current branch does not have a high danger of creating subtle caching problems. First of all, with the branch, a change in behaviour can only occur when we have a cached methodwithdefaultargumentsand/orargs/kwds of a cdef class whose source file is not available. Methods without arguments will be totally fine if the source file is gone (so, the branch does fix a problem).
A change in behaviour would most likely be a problem where coercion is involved, i.e., in UniqueRepresentation
. But there, we are safe, since (a) subclasses of UniqueRepresentation
must be Python classes anyway (and Python classes aren't affected by the bug), and (b) UniqueRepresentation.__classcall__
has generic arguments *args,**kwds
.
If you think that it is too dangerous to silently change behaviour rather than raise a straight forward error, we could instead raise a warning (only once, similar to deprecation warnings) when a cached method is called on a method whose argspec can't be determined.
comment:52 in reply to: ↑ 51 Changed 5 years ago by
Replying to SimonKing:
I think that the current branch does not have a high danger of creating subtle caching problems. First of all, with the branch, a change in behaviour can only occur when we have a cached methodwithdefaultargumentsand/orargs/kwds of a cdef class whose source file is not available. Methods without arguments will be totally fine if the source file is gone (so, the branch does fix a problem).
I agree the danger is "not high" but it's not zero either. Subtle deviations like this are the source of incredibly difficulttoreplicate issues later on, because the reporter will of course forget to report that this is a sourceless distribution. (Perhaps it's not so bad: the bug will just be ignored because we can't replicate it)
I'm not so sure a warning is an option: I'd expect that the first occurrence would be during system startup, so the warning might get lost.
Anyway, I think the fact that we look at the source at all in order to figure out how to preprocess the arguments is a deficiency. I'd prefer if we can use this ticket to track the resolution of that, but if this issue is really urgent for some users, I guess we can merge something along the lines of what you propose.
I expect that we'll find a cythonbased solution, possibly docstring based. It might be a configure option to select if you want this extra data in your docstrings, but in sage we probably would, since we process the docstrings for display anyway.
comment:53 in reply to: ↑ 51 Changed 5 years ago by
Replying to SimonKing:
I think that the current branch does not have a high danger of creating subtle caching problems. First of all, with the branch, a change in behaviour can only occur when we have a cached methodwithdefaultargumentsand/orargs/kwds of a cdef class whose source file is not available. Methods without arguments will be totally fine if the source file is gone (so, the branch does fix a problem).
Proposal 15 also has that property.
A change in behaviour would most likely be a problem where coercion is involved, i.e., in
UniqueRepresentation
. But there, we are safe, since (a) subclasses ofUniqueRepresentation
must be Python classes anyway (and Python classes aren't affected by the bug), and (b)UniqueRepresentation.__classcall__
has generic arguments*args,**kwds
.
Python classes would be unaffected by proposal 15.
If you think that it is too dangerous to silently change behaviour rather than raise a straight forward error, we could instead raise a warning (only once, similar to deprecation warnings) when a cached method is called on a method whose argspec can't be determined.
I think we need to agree first on whether the uniquess property for Cython cached methods is important.
 If uniqueness is important, then it is indeed "too dangerous to silently change behaviour". ==> Go with the solution of Nils to somehow put the argspec information in the docstring and use that.
 If uniqueness is not important, then why bother reading the argspec anyway? ==> Proposal 15
comment:54 Changed 5 years ago by
Can we perhaps instrument the caching machinery to log any cache hits that would have been misses if *arg, **kwarg
would be processed into a cache key without transporting over keywords to positional parameters? Then we can see the extent of the problem.
The main one I expect there is from cases where people specify a default parameter, but with the default value, and that's a serious one:
@cached_function def a(x,y=1) return set([x,y])
Then [a(1),a(1,1)]
should contain two identical sets, but only caching on *arg
and **kwarg
would return result in two nonidentical sets, because the second would have an implicit keyword parameter y
rather than a second positional parameter.
I think this is potentially quite a serious issue, especially because in sage, much caching happens with the purpose of making things unique (but perhaps currently never in cython applications ...)
comment:55 followup: ↓ 56 Changed 5 years ago by
I think this embedsignature=True
is what we need:
sage: cython(""" ....: import cython ....: cdef class A(object): ....: @cython.embedsignature(True) ....: def b(x,y=0): pass ....: """) sage: print A.b.__doc__ A.b(x, y=0) File: _home_nbruin__sage_temp_art_5476_tmp_LOYsNA_spyx_0.pyx (starting at line 10)
i.e., if we ensure that we have embedsignature=True
(probably globally throughout the library), then we just need to change how cython docstrings are used to accommodate for the fact that the first line is the signature. Then sage_argspec
can just look in the docstring to see if it can find a signature there. No source inspection required anymore.
comment:56 in reply to: ↑ 55 Changed 5 years ago by
Replying to nbruin:
sage: cython(""" ....: import cython ....: cdef class A(object): ....: @cython.embedsignature(True) ....: def b(x,y=0): pass ....: """) sage: print A.b.__doc__ A.b(x, y=0) File: _home_nbruin__sage_temp_art_5476_tmp_LOYsNA_spyx_0.pyx (starting at line 10)
Cool! Would that option be easy to enable globally? If not, then I still think assuming a generic argspec and printing a warning (once for each affected cached method) would be better than nothing.
comment:57 Changed 5 years ago by
See #17847
comment:58 followup: ↓ 60 Changed 5 years ago by
In other words: Because of Cython caching, we can not use the embedded argspec information, even if we could enable "embedsignature(True)" globally, and additionally, if "embedsignature(True)" was enabled globally, some valid Cython code wouldn't compile.
comment:59 Changed 5 years ago by
 Status changed from needs_review to needs_work
comment:60 in reply to: ↑ 58 Changed 5 years ago by
Replying to SimonKing:
Because of Cython caching, we can not use the embedded argspec information
This is a oneoff hitch. It just means that when we enable embedsignature
globally, people should clear their cython cache (once).
even if we could enable "embedsignature(True)" globally
We can.
if "embedsignature(True)" was enabled globally, some valid Cython code wouldn't compile.
but it's easy to reformulate those cases.
comment:61 Changed 5 years ago by
Let me mention that #17847 is now in Sage, which makes this ticket feasible.
comment:62 Changed 5 years ago by
Would it be ok to rebase this branch on top of the current develop branch (and then forcepush)? Or would it be ok if I merged this branch into the current develop and start with it? Or merge the current develop into this branch? I still don't get what is considered best praxis.
comment:63 Changed 5 years ago by
 Commit changed from a0b829dc10b4ccb00b99e073de8a1c0006aa1e51 to b340eebb3cec1d5f273b9c382599f23736045e98
comment:64 Changed 5 years ago by
 Status changed from needs_work to needs_review
I am now using the embedded signatures (due to #17847) for introspection. I preserved the previous commits by merging the previous ticket branch into the current develop to get the new ticket branch; hopefully the correct procedure.
In order to demonstrate what is happening, I added a dummy method to a nested test class provided in sage.misc.nested_class. As it turns out, there has been a bug in the previous sage_getargspec
that is now fixed. I suppose that the bug actually is in Cython:
sage: from sage.misc.nested_class import MainClass sage: print MainClass.NestedClass.NestedSubClass.dummy.func_defaults None
Actually the default is not None but is a tuple, which is correctly determined by the new version of sage_getargspec
:
sage: sage_getargspec(MainClass.NestedClass.NestedSubClass.dummy) ArgSpec(args=['self', 'x', 'r'], varargs='args', keywords='kwds', defaults=((1, 2, 3.4),))
Moreover, the above does not involve reading the source code. I wrote a new function that determines the argspec from the embedded signature, and strips the signature at the same time:
sage: print MainClass.NestedClass.NestedSubClass.dummy.__doc__ NestedSubClass.dummy(self, x, *args, r=(1, 2, 3.4), **kwds) File: sage/misc/nested_class.pyx (starting at line 314) A dummy method to demonstrate the embedding of method signature for nested classes. ... sage: from sage.misc.sageinspect import _extract_embedded_signature sage: print _extract_embedded_signature(MainClass.NestedClass.NestedSubClass.dummy.__doc__, 'dummy')[0] File: sage/misc/nested_class.pyx (starting at line 314) A dummy method to demonstrate the embedding of method signature for nested classes. ... sage: _extract_embedded_signature(MainClass.NestedClass.NestedSubClass.dummy.__doc__, 'dummy')[1] ArgSpec(args=['self', 'x', 'r'], varargs='args', keywords='kwds', defaults=((1, 2, 3.4),))
The tests in sage.misc pass.
comment:65 Changed 5 years ago by
 Commit changed from b340eebb3cec1d5f273b9c382599f23736045e98 to 0e8ef000e434b654c76299b2ea694ca15be965a3
Branch pushed to git repo; I updated commit sha1. New commits:
0e8ef00  Catch syntax error when extraction of signature fails

comment:66 followup: ↓ 68 Changed 5 years ago by
On 6.7beta2 I got
sage: K=GF(5) sage: K.factored_order Cached version of <method 'factored_order' of 'sage.rings.finite_rings.finite_field_base.FiniteField' objects>
comment:67 followup: ↓ 69 Changed 5 years ago by
 Status changed from needs_review to needs_info
sorry already in comment:27... could you update the description then?
comment:68 in reply to: ↑ 66 ; followup: ↓ 70 Changed 5 years ago by
Replying to vdelecroix:
On 6.7beta2 I got
sage: K=GF(5) sage: K.factored_order Cached version of <method 'factored_order' of 'sage.rings.finite_rings.finite_field_base.FiniteField' objects>
You mean, you got it after removing src/sage/rings/finite_rings/finite_field_base.pyx
? Then you say the problem is fixed.
comment:69 in reply to: ↑ 67 Changed 5 years ago by
Replying to vdelecroix:
sorry already in comment:27... could you update the description then?
I don't understand what info you need.
comment:70 in reply to: ↑ 68 ; followup: ↓ 71 Changed 5 years ago by
Replying to SimonKing:
Replying to vdelecroix:
On 6.7beta2 I got
sage: K=GF(5) sage: K.factored_order Cached version of <method 'factored_order' of 'sage.rings.finite_rings.finite_field_base.FiniteField' objects>You mean, you got it after removing
src/sage/rings/finite_rings/finite_field_base.pyx
? Then you say the problem is fixed.
Nope. Just starting from a fresh sage6.7.beta2.
comment:71 in reply to: ↑ 70 Changed 5 years ago by
Replying to vdelecroix:
Replying to SimonKing:
Replying to vdelecroix:
On 6.7beta2 I got
sage: K=GF(5) sage: K.factored_order Cached version of <method 'factored_order' of 'sage.rings.finite_rings.finite_field_base.FiniteField' objects>You mean, you got it after removing
src/sage/rings/finite_rings/finite_field_base.pyx
? Then you say the problem is fixed.Nope. Just starting from a fresh sage6.7.beta2.
My mistake... I read too fast!!
comment:72 Changed 5 years ago by
 Status changed from needs_info to needs_review
I still don't see what info was requested. Hence, I change it back to "needs review".
comment:73 Changed 5 years ago by
 Reviewers set to Volker Braun
 Status changed from needs_review to positive_review
comment:74 Changed 5 years ago by
 Branch changed from u/SimonKing/make_calling_a_cached_method_independent_of_source_code_inspection to 0e8ef000e434b654c76299b2ea694ca15be965a3
 Resolution set to fixed
 Status changed from positive_review to closed
Here one can see how introspection enters:
So: cached_method tries to determine the number of arguments. This is because there is a special implementation for functions with no argument (except
self
).In other words, the wrong behaviour could be fixed by finding a way to get the number of arguments (or better the argspec) of a cython function without looking at its source code. I'll see if that's possible. So, the fix will likely be in
sage.misc.sageinspect
, not insage.misc.cachefunc
.