Opened 9 years ago

Last modified 7 years ago

#12868 new defect

Attaching a pyx file in the presence of __init__.py results in wrong module names

Reported by: SimonKing Owned by: jason
Priority: major Milestone: sage-6.4
Component: misc Keywords:
Cc: Merged in:
Authors: Reviewers:
Report Upstream: N/A Work issues:
Branch: Commit:
Dependencies: Stopgaps:

Description

Create a new folder (say, /home/simon/SAGE/work/tests/) and create a file test.pyx with the content

class bla(type):
    pass

Start sage, attach the file, and you will find something like

sage: attach test.pyx
Compiling ./test.pyx...
sage: bla.__module__
'sage: attach test.pyx
Compiling ./test.pyx...
sage: bla.__module__
'_home_simon_SAGE_work_tests_test_pyx_0'

Then, create an __init__.py file in the same folder. You will find

sage: attach test.pyx
Compiling ./test.pyx...
sage: bla.__module__
'_home_simon_SAGE_work_tests_test_pyx._home_simon_SAGE_work_tests_test_pyx_0'
sage: sys.modules.has_key(bla.__module__)
False

In fact, even if __init__.py is present, the module is still known under the "short" name, but the class doesn't know about it:

sage: import _home_simon_SAGE_work_tests_test_pyx_0
sage: _home_simon_SAGE_work_tests_test_pyx_0.bla
<class '_home_simon_SAGE_work_tests_test_pyx._home_simon_SAGE_work_tests_test_pyx_0.bla'>

Please change the component if you think that there is one that fits less bad than "misc"...

Change History (20)

comment:1 Changed 9 years ago by leif

I'm not yet sure whether that's really a bug, or just confusing. (If I understand correctly, the presence of an __init__.py file was unintentional in your case.)

The behaviour should probably also depend on whether some directory of the path to the attached file is in PYTHONPATH. Haven't yet looked at the code though.


[W.r.t. the component: "User interface" seems a candidate as well.]

comment:2 Changed 9 years ago by leif

Btw., the output of (in your case)

sage: bla?

looks different as well, depending on whether an __init__.py file is present in the same directory.

Without one, there's also:

Loaded File:	/home/leif/.sage/temp/sleepless/16739/spyx/_tmp_foo_foo_pyx/_tmp_foo_foo_pyx_0.so
Source File:	/home/leif/.sage/temp/sleepless/16739/spyx/_tmp_foo_foo_pyx/_tmp_foo_foo_pyx_0.so

(The attached file is /tmp/foo/foo.pyx.)

comment:3 Changed 9 years ago by leif

Looks like distutils were responsible for that behaviour...

comment:4 follow-up: Changed 9 years ago by leif

In the Cython-generated .c file, I see things like

...
/* Module declarations from '_tmp_foo_foo_pyx._tmp_foo_foo_pyx_0' */
#define __Pyx_MODULE_NAME "_tmp_foo_foo_pyx._tmp_foo_foo_pyx_0"
int __pyx_module_is_main__tmp_foo_foo_pyx___tmp_foo_foo_pyx_0 = 0;

/* Implementation of '_tmp_foo_foo_pyx._tmp_foo_foo_pyx_0' */
static char __pyx_k_1[] = "File: _tmp_foo_foo_pyx_0.pyx (starting at line 2)";
static char __pyx_k_2[] = "_tmp_foo_foo_pyx._tmp_foo_foo_pyx_0";
static char __pyx_k__bar[] = "bar";
static char __pyx_k____main__[] = "__main__";
static char __pyx_k____test__[] = "__test__";
static PyObject *__pyx_n_s_2;
static PyObject *__pyx_n_s____main__;
static PyObject *__pyx_n_s____test__;
static PyObject *__pyx_n_s__bar;

static PyMethodDef __pyx_methods[] = {
  {0, 0, 0, 0}
};

#if PY_MAJOR_VERSION >= 3
static struct PyModuleDef __pyx_moduledef = {
    PyModuleDef_HEAD_INIT,
    __Pyx_NAMESTR("_tmp_foo_foo_pyx_0"),
    __Pyx_DOCSTR(__pyx_k_1), /* m_doc */
    -1, /* m_size */
    __pyx_methods /* m_methods */,
    NULL, /* m_reload */
    NULL, /* m_traverse */
    NULL, /* m_clear */
    NULL /* m_free */
};
#endif
...

with an __init__.py file present. (My dummy class is called bar rather than bla.)

In the temporary build directory, also a symbolic link to __init__.py is created:

/home/leif/.sage/temp/sleepless/17317/spyx/_tmp_foo_foo_pyx:
total 112
drwxr-xr-x 3 leif leif  4096 2012-04-22 17:45 .
drwxr-xr-x 3 leif leif  4096 2012-04-22 17:45 ..
drwxr-xr-x 4 leif leif  4096 2012-04-22 17:45 build
-rw-r--r-- 1 leif leif     0 2012-04-22 17:45 err
lrwxrwxrwx 1 leif leif    16 2012-04-22 17:45 foo.pyx -> /tmp/foo/foo.pyx
lrwxrwxrwx 1 leif leif    20 2012-04-22 17:45 __init__.py -> /tmp/foo/__init__.py
-rw-r--r-- 1 leif leif  1536 2012-04-22 17:45 log
-rw-r--r-- 1 leif leif  1400 2012-04-22 17:45 setup.py
-rw-r--r-- 1 leif leif 46397 2012-04-22 17:45 _tmp_foo_foo_pyx_0.c
-rw-r--r-- 1 leif leif  3993 2012-04-22 17:45 _tmp_foo_foo_pyx_0.html
-rw-r--r-- 1 leif leif   157 2012-04-22 17:45 _tmp_foo_foo_pyx_0.pyx
-rwxr-xr-x 1 leif leif 36595 2012-04-22 17:45 _tmp_foo_foo_pyx_0.so

setup.py there looks "clean".

So I'm pretty sure -- if at all -- distutils are to blame for the unexpected behaviour, although it seems quite reasonable to me.

comment:5 in reply to: ↑ 4 ; follow-up: Changed 9 years ago by leif

Replying to leif:

So I'm pretty sure -- if at all -- distutils are to blame for the unexpected behaviour, although it seems quite reasonable to me.

Nope, actually Cython is causing this, since the following does not solve your problem (if it really is one):

  • sage/misc/cython.py

    diff --git a/sage/misc/cython.py b/sage/misc/cython.py
    a b  
    432432    SAGE_ROOT  = os.environ['SAGE_ROOT']
    433433    SAGE_LOCAL = SAGE_ROOT + '/local/'
    434434
     435# Remove unwanted link to an __init__.py file from the current directory,
     436# i.e., from the temporary build directory only (#12868):
     437if os.path.exists("__init__.py"):
     438    os.unlink("__init__.py")
     439
    435440extra_link_args =  ['-L' + SAGE_LOCAL + '/lib']
    436441extra_compile_args = %s
    437442

You can verify that Cython changes the module name depending on whether an __init__.py file is present by

$ sage --sh -c 'cython test.pyx' && grep '^#.*__Pyx_MODULE_NAME' test.c`

Although it is not yet clear to me how Cython finds the __init__.py if I remove it from the temporary build directory, as there's only the symbolic link to the "unrelated" original .pyx source file, but probably Cython somehow "knows" about the relation.

comment:6 in reply to: ↑ 5 ; follow-ups: Changed 9 years ago by SimonKing

Replying to leif:

... the following does not solve your problem (if it really is one):

We have a module containing a class bla, but sys.modules[bla.__module__] gives a key error. I am surprised that you question whether that is a problem.

comment:7 in reply to: ↑ 6 ; follow-up: Changed 9 years ago by leif

Replying to SimonKing:

Replying to leif:

... the following does not solve your problem (if it really is one):

We have a module containing a class bla, but sys.modules[bla.__module__] gives a key error. I am surprised that you question whether that is a problem.

No, I'd rather say it's a "user error"... ;-)

Seriously, my above patch didn't work because the link was removed just too late (since I initially thought the problem was caused by distutils, which it is not).

Currently changing the patch to remove the (in your opinion) undesirable link to __init__.py earlier, such that cython, when invoked, won't see it.

comment:8 in reply to: ↑ 7 ; follow-up: Changed 9 years ago by SimonKing

Replying to leif:

Currently changing the patch to remove the (in your opinion) undesirable link to __init__.py earlier, such that cython, when invoked, won't see it.

That wouldn't be a solution. Perhaps the user wants (for some reason) to have it in the folder (in particular, it is no user error).

I believe that it should simply not matter for attaching files whether or not __init__.py is there. Of course, if you have a Python file and want that you can import (not attach) it, then you would create the __init__.py (this is why I created it, anyway).

Attaching a file should be independent from importing, and both should simultaneously be possible. So, unlinking __init__.py is bad.

Up to yesterday, I was in the belief that "attaching a pyx-file" means that

  1. the file is copied into a temporary directory,
  2. the compilation is done there, and
  3. the module is imported from the temporary directory.

Since the temporary directory would not contain __init__.py, there should be no problem, isn't it?

Last edited 9 years ago by SimonKing (previous) (diff)

comment:9 in reply to: ↑ 8 ; follow-up: Changed 9 years ago by leif

Replying to SimonKing:

Replying to leif:

Currently changing the patch to remove the (in your opinion) undesirable link to __init__.py earlier, such that cython, when invoked, won't see it.

That wouldn't be a solution. Perhaps the user wants (for some reason) to have it in the folder (in particular, it is no user error).

In the temporary build folder? Not sure whether there are situations where this is needed.

This one works for me:

  • sage/misc/cython.py

    diff --git a/sage/misc/cython.py b/sage/misc/cython.py
    a b  
    382382        os.system(cmd)
    383383        if os.path.exists("%s/setup.py" % build_dir):
    384384            os.unlink("%s/setup.py" % build_dir)
     385        if os.path.exists("%s/__init__.py" % build_dir):
     386            sys.stderr.write("Note: Ignoring %s/__init__.py ...\n" % abs_base)
     387            sys.stderr.flush()
     388            os.unlink("%s/__init__.py" % build_dir)
    385389
    386390    if compile_message:
    387391        print "Compiling %s..."%filename

No idea whether it breaks anything else. Note that you have to run ./sage -b after changing the file, which in turn causes (for whatever reason) dozens of extension modules to get rebuilt.

In case having __init__.py in the temporary build directory is desirable, we could just restore the link after Cython has been run.


I believe that it should simply not matter for attaching files whether or not __init__.py is there. Of course, if you have a Python file and want that you can import (not attach) it, then you would create the __init__.py (this is why I created it, anyway).

Attaching a file should be independent from importing, and both should simultaneously be possible. So, unlinking __init__.py is bad.

Importing in the usual way should still work, since you don't import from the temporary directory, do you?


Up to yesterday, I was in the belief that "attaching a pyx-file means" that

  1. the file is copied into a temporary directory,
  2. the compilation is done there, and
  3. the module is imported from the temporary directory.

That's correct. But in addition, symbolic links to any(!) file in the directory containing the attached file are created in the temporary directory (since e.g. you otherwise wouldn't be able to import or include other files in that directory from the attached file). Above, I've only changed that part.

Since the temporary directory would not contain __init__.py, there should be no problem, isn't it?

That's what I was saying. (Unless you try to import __init__ in the attached file...)

comment:10 in reply to: ↑ 9 ; follow-up: Changed 9 years ago by SimonKing

Replying to leif:

That wouldn't be a solution. Perhaps the user wants (for some reason) to have it in the folder (in particular, it is no user error).

In the temporary build folder? Not sure whether there are situations where this is needed.

No, in the original folder.

This one works for me:

  • sage/misc/cython.py

    diff --git a/sage/misc/cython.py b/sage/misc/cython.py
    a b  
    382382        os.system(cmd)
    383383        if os.path.exists("%s/setup.py" % build_dir):
    384384            os.unlink("%s/setup.py" % build_dir)
     385        if os.path.exists("%s/__init__.py" % build_dir):
     386            sys.stderr.write("Note: Ignoring %s/__init__.py ...\n" % abs_base)
     387            sys.stderr.flush()
     388            os.unlink("%s/__init__.py" % build_dir)
    385389
    386390    if compile_message:
    387391        print "Compiling %s..."%filename

... That's correct. But in addition, symbolic links to any(!) file in the directory containing the attached file are created in the temporary directory (since e.g. you otherwise wouldn't be able to import or include other files in that directory from the attached file). Above, I've only changed that part.

I see! So, you are not unlinking it from the original directory.

That's what I was saying. (Unless you try to import __init__ in the attached file...)

Hm. Difficult to tell. If you have files foo.pyx and bar.py in your folder, and in foo.pyx you want to do from bar import Bar, then I guess you need to have __init__.py in the folder in which the compilation happens, isn't it? So, there are situations in which the presence of __init__.py in the temporary folder (via a symbolic link) makes sense.

The best solution would be to patch Cython, so that the wrong naming of modules does not occur (in some post above, I think it was shown that the naming is chosen by Cython, not by distutils).

comment:11 Changed 9 years ago by leif

Replying to leif:

[...] in addition, symbolic links to any(!) file in the directory containing the attached file are created in the temporary directory (since e.g. you otherwise wouldn't be able to import or include other files in that directory from the attached file).

Of course there are other, probably smarter (though not necessarily simpler) ways to achieve this, e.g. by modifying sys.path for the attached module, and by passing according -I options to cython (although I think the generated setup.py already takes care of the latter).

comment:12 in reply to: ↑ 10 Changed 9 years ago by leif

Replying to SimonKing:

The best solution would be to patch Cython, so that the wrong naming of modules does not occur (in some post above, I think it was shown that the naming is chosen by Cython, not by distutils).

I don't think Cython is wrong by what it does, as __init__.py files belong to the package (and hence directory) they're contained in, i.e. .../<package_name>/__init__.py belongs to <package_name>.

Consequently, all .py and .pyx files there also belong to that package, hence their module names get the package prefix.

comment:13 in reply to: ↑ 6 ; follow-up: Changed 9 years ago by leif

Replying to SimonKing:

Replying to leif:

... the following does not solve your problem (if it really is one):

We have a module containing a class bla, but sys.modules[bla.__module__] gives a key error. I am surprised that you question whether that is a problem.

That should IMHO be solvable by importing / loading the module differently, although I think you run into the same if for example you create links to extension modules and import from these rather than their original location. So I'm not sure if it really is an error.

comment:14 in reply to: ↑ 13 Changed 9 years ago by leif

Replying to leif:

Replying to SimonKing:

Replying to leif:

... the following does not solve your problem (if it really is one):

We have a module containing a class bla, but sys.modules[bla.__module__] gives a key error. I am surprised that you question whether that is a problem.

That should IMHO be solvable by importing / loading the module differently, although I think you run into the same if for example you create links to extension modules and import from these rather than their original location.

I think that's actually a cleaner (and probably also safer) solution.

  • sage/misc/cython.py

    diff --git a/sage/misc/cython.py b/sage/misc/cython.py
    a b  
    382382        os.system(cmd)
    383383        if os.path.exists("%s/setup.py" % build_dir):
    384384            os.unlink("%s/setup.py" % build_dir)
     385        if os.path.exists("%s/__init__.py" % build_dir):
     386            sys.stderr.write("Note: Not ignoring %s/__init__.py ...\n" % abs_base)
     387            sys.stderr.flush()
     388            # os.unlink("%s/__init__.py" % build_dir)
    385389
    386390    if compile_message:
    387391        print "Compiling %s..."%filename
     
    527531        if os.system(cmd):
    528532            raise RuntimeError, "Error making local copy of shared object library for %s"%filename
    529533
    530     return name, build_dir
     534    if os.path.exists("%s/__init__.py" % build_dir):
     535        # In that case, the module name Cython creates contains the package
     536        # name, i.e., the name of the directory the file is contained in.
     537        assert not build_dir.endswith(os.path.sep)
     538        return "%s.%s" % (base, name), os.path.dirname(build_dir)
     539    else:
     540        return name, build_dir
    531541
    532542
    533543

Note that this change has the side effect that now __init__.py also gets executed, which IMHO is the desired behaviour:

sage: attach("/tmp/foo/foo.pyx")
Note: Not ignoring /tmp/foo/__init__.py ...
Compiling /tmp/foo/foo.pyx...
Hello, I'm '/tmp/foo/__init__.py'.
sage: bar.__module__
'_tmp_foo_foo_pyx._tmp_foo_foo_pyx_0'
sage: sys.modules[bar.__module__]
<module '_tmp_foo_foo_pyx._tmp_foo_foo_pyx_0' from '/home/leif/.sage//temp/sleepless/21086//spyx/_tmp_foo_foo_pyx/_tmp_foo_foo_pyx_0.so'>
sage: bar?
Type:		classobj
String Form:	_tmp_foo_foo_pyx._tmp_foo_foo_pyx_0.bar
Namespace:	Interactive
Loaded File:	/home/leif/.sage/temp/sleepless/21086/spyx/_tmp_foo_foo_pyx/_tmp_foo_foo_pyx_0.so
Source File:	/home/leif/.sage/temp/sleepless/21086/spyx/_tmp_foo_foo_pyx/_tmp_foo_foo_pyx_0.so

sage: 

comment:15 Changed 9 years ago by leif

Just checked: Imports in the attached file also work properly; the imported modules also get the (temporary) package name prepended:

sage: attach("/tmp/foo/foo.pyx")
Note: Not ignoring /tmp/foo/__init__.py ...
Compiling /tmp/foo/foo.pyx...
Hello, I'm '/tmp/foo/__init__.py'.
sage: bar.__module__
'_tmp_foo_foo_pyx._tmp_foo_foo_pyx_0'
sage: sys.modules[bar.__module__]
<module '_tmp_foo_foo_pyx._tmp_foo_foo_pyx_0' from '/home/leif/.sage//temp/sleepless/21279//spyx/_tmp_foo_foo_pyx/_tmp_foo_foo_pyx_0.so'>
sage: bar?
Type:		classobj
String Form:	_tmp_foo_foo_pyx._tmp_foo_foo_pyx_0.bar
Namespace:	Interactive
Loaded File:	/home/leif/.sage/temp/sleepless/21279/spyx/_tmp_foo_foo_pyx/_tmp_foo_foo_pyx_0.so
Source File:	/home/leif/.sage/temp/sleepless/21279/spyx/_tmp_foo_foo_pyx/_tmp_foo_foo_pyx_0.so

sage: Baz?
Type:		classobj
String Form:	_tmp_foo_foo_pyx.baz.Baz
Namespace:	Interactive
Loaded File:	/home/leif/.sage/temp/sleepless/21279/spyx/_tmp_foo_foo_pyx/baz.py
Source File:	/home/leif/.sage/temp/sleepless/21279/spyx/_tmp_foo_foo_pyx/baz.py

sage: Baz.__module__
'_tmp_foo_foo_pyx.baz'
sage: !cat /tmp/foo/foo.pyx
# foo.pyx

from baz import Baz

class bar:
    pass

sage: !cat /tmp/foo/baz.py
# baz.py

class Baz:
    pass
sage: 

comment:16 Changed 9 years ago by leif

I'll attach a proper patch after further testing.

comment:17 Changed 8 years ago by jdemeyer

  • Milestone changed from sage-5.11 to sage-5.12

comment:18 Changed 7 years ago by vbraun_spam

  • Milestone changed from sage-6.1 to sage-6.2

comment:19 Changed 7 years ago by vbraun_spam

  • Milestone changed from sage-6.2 to sage-6.3

comment:20 Changed 7 years ago by vbraun_spam

  • Milestone changed from sage-6.3 to sage-6.4
Note: See TracTickets for help on using tickets.