Opened 3 years ago

Closed 2 years ago

#21682 closed enhancement (fixed)

Add a separate "build_cython" command to setup.py

Reported by: embray Owned by:
Priority: major Milestone: sage-7.4
Component: build Keywords: cython distutils
Cc: mkoeppe Merged in:
Authors: Erik Bray Reviewers: Jeroen Demeyer
Report Upstream: Reported upstream. Developers acknowledge bug. Work issues:
Branch: a049f2d (Commits) Commit: a049f2d968d9ad5d901e68b44d46f1c06b5e747f
Dependencies: Stopgaps:

Description (last modified by embray)

This is a follow-on to #21600. It adds a new cythonize command to the setup.py, and moves most of the logic for running cythonize() (previously in the customized build_ext) command, into this new command.

The new cythonize command is run always from build_ext anyways, but this has a couple advantages:

1) It is possible to run just cythonize, without compiling extension modules. This maybe isn't that common, but can be useful for development, especially when testing issues with Cython itself. It will also be useful for generating source distributions, where we might want to ship Cythonized sources in the source tarball, and hence would need to generate them as part of the process of building the tarball (without necessarily compiling).

This also allows passing options directly from the command-line to cythonize. This isn't much taken advantage of currently but the possibility is there.

2) Code is clearer and more modular. Specific details for compiling cython code are kept separate from details for compiling C/C++ code.

See also upstream discussion of a "generic" implementation: https://github.com/cython/cython/issues/1514

Change History (56)

comment:1 Changed 3 years ago by embray

I have a few other to-do issues in this code, but they're not essential. Not sure if they should be worried about for now, or in a follow-up issue if this code is accepted.

comment:2 Changed 3 years ago by jdemeyer

I suggest that you first discuss this with upstream cython.

comment:3 Changed 3 years ago by git

  • Commit changed from 80955e5be90d0c92061982e062f1cbc7e071ed28 to 487d66adc34cb7c3997c70d4efee4d411a769d18

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

487d66aAdd a new 'cythonize' command to setup.py to logically separate out the cythonization step.

comment:4 follow-up: Changed 3 years ago by embray

Rebased--I agree this could also be worth bringing upstream in some form, but do you have any other comments?

comment:5 Changed 3 years ago by embray

  • Status changed from new to needs_review

comment:6 in reply to: ↑ 4 Changed 3 years ago by jdemeyer

Replying to embray:

do you have any other comments?

It would be better to separate the Sage-specific parts from the general code which could be upstreamed.

comment:7 Changed 3 years ago by jdemeyer

  • Dependencies #21600 deleted

comment:8 follow-up: Changed 3 years ago by jdemeyer

Bikeshed: since all build commands start with build, wouldn't this command better be called build_cython or something like that?

comment:9 follow-ups: Changed 3 years ago by jdemeyer

Why did you change

os.environ.get('SAGE_DEBUG', None) != 'no'

to

os.environ.get('SAGE_DEBUG', 'no') != 'no'

comment:10 follow-up: Changed 3 years ago by jdemeyer

Regarding this:

    # TODO: This works for the C sources themselves, but this logic
    # should also apply to Cython sources--shouldn't the be rebuilt
    # if this file changes?  Alternatively, we could add more details
    # (e.g. cythonize() flags) to the .cython_version file

I implemented that dependency on setup.py for the C files only because Cythonization does not really depend on setup.py.

You are right that the cythonize() flags affect Cythonization, but that is why we have .cython_version. But it would indeed be better if .cython_version would be more auto-generated with the relevant arguments given to cythonize().

comment:11 follow-up: Changed 3 years ago by jdemeyer

I would suggest to simplify

        if self.parallel is None:
            self.parallel = int(os.environ.get('SAGE_NUM_THREADS', 0))

        if isinstance(self.parallel, str):
            try:
                self.parallel = int(self.parallel)
            except ValueError:
                raise DistutilsOptionError("parallel should be an integer")

to

        if self.parallel is None:
            self.parallel = os.environ.get('SAGE_NUM_THREADS', 0)

        try:
            self.parallel = int(self.parallel)
        except ValueError:
            raise DistutilsOptionError("parallel should be an integer")

comment:12 follow-up: Changed 3 years ago by jdemeyer

In finalize_options, you are setting build_dir unconditionally. Does that mean that the --build-dir option is ignored?

comment:13 in reply to: ↑ 8 Changed 3 years ago by embray

Replying to jdemeyer:

Bikeshed: since all build commands start with build, wouldn't this command better be called build_cython or something like that?

I think that was my first inclination as well. I went with cythonize just to emphasize that this was mostly a wrapper around that, but I think you've convinced me (just by confirming my initial idea) over toward build_cython

comment:14 in reply to: ↑ 9 ; follow-up: Changed 3 years ago by embray

Replying to jdemeyer:

Why did you change

os.environ.get('SAGE_DEBUG', None) != 'no'

to

os.environ.get('SAGE_DEBUG', 'no') != 'no'

I don't think I meant to. I think originally it was os.environ.get('SAGE_DEBUG', 'no') and the fact that it's changed now might be a symptom of rebasing. I'll fix that. I definitely like the current spelling better.

comment:15 in reply to: ↑ 11 Changed 3 years ago by embray

Replying to jdemeyer:

I would suggest to simplify

        if self.parallel is None:
            self.parallel = int(os.environ.get('SAGE_NUM_THREADS', 0))

        if isinstance(self.parallel, str):
            try:
                self.parallel = int(self.parallel)
            except ValueError:
                raise DistutilsOptionError("parallel should be an integer")

to

        if self.parallel is None:
            self.parallel = os.environ.get('SAGE_NUM_THREADS', 0)

        try:
            self.parallel = int(self.parallel)
        except ValueError:
            raise DistutilsOptionError("parallel should be an integer")

Agreed

comment:16 in reply to: ↑ 12 Changed 3 years ago by embray

Replying to jdemeyer:

In finalize_options, you are setting build_dir unconditionally. Does that mean that the --build-dir option is ignored?

Yes, IIRC that was in anticipation of fixing #21535, but I left the actual value hard-coded for now so as to not change existing behavior (since currently setting it to anything else will break other things that hard-code SAGE_CYTHONIZED). For the purposes of this ticket as a stand-alone enhancement that is confusing though. I could leave the internal build_dir attribute for now and just get rid of the user option (temporarily).

comment:17 in reply to: ↑ 14 Changed 3 years ago by jdemeyer

Replying to embray:

I definitely like the current spelling better.

It's not just a matter of spelling. It is a functional difference.

comment:18 Changed 3 years ago by jdemeyer

For history about this os.environ.get('SAGE_DEBUG', None), see #13881.

comment:19 Changed 3 years ago by embray

I agree, I wasn't being flippant :)

comment:20 in reply to: ↑ 10 ; follow-up: Changed 3 years ago by embray

Replying to jdemeyer:

Regarding this:

    # TODO: This works for the C sources themselves, but this logic
    # should also apply to Cython sources--shouldn't the be rebuilt
    # if this file changes?  Alternatively, we could add more details
    # (e.g. cythonize() flags) to the .cython_version file

I implemented that dependency on setup.py for the C files only because Cythonization does not really depend on setup.py.

You are right that the cythonize() flags affect Cythonization, but that is why we have .cython_version. But it would indeed be better if .cython_version would be more auto-generated with the relevant arguments given to cythonize().

What if we made a "version 2" .cython_version (with backwards compat for the old format of course), with a new format in JSON that can accurately reflect the options passed to cythonize()?

comment:21 in reply to: ↑ 20 ; follow-up: Changed 3 years ago by jdemeyer

Replying to embray:

What if we made a "version 2" .cython_version (with backwards compat for the old format of course), with a new format in JSON that can accurately reflect the options passed to cythonize()?

I think that's a good idea. But that should be an independent ticket. I don't see the need for the "backwards compat for the old format" though.

comment:22 Changed 3 years ago by embray

  • Description modified (diff)
  • Report Upstream changed from N/A to Reported upstream. No feedback yet.

Opened an upstream issue at https://github.com/cython/cython/issues/1514

A generic build_cython command could take most any cythonize() arguments as options. A sage-specific wrapper around that would just be setting (or possibly enforcing) certain defaults.

comment:23 in reply to: ↑ 21 Changed 3 years ago by embray

Replying to jdemeyer:

Replying to embray:

What if we made a "version 2" .cython_version (with backwards compat for the old format of course), with a new format in JSON that can accurately reflect the options passed to cythonize()?

I think that's a good idea. But that should be an independent ticket. I don't see the need for the "backwards compat for the old format" though.

Agree it's a separate issue. As for backwards-compat, it at least needs to fail gracefully if the existing file is not in the new format, and rebuild + regenerate the file but that's fine.

comment:24 Changed 3 years ago by jdemeyer

  • Summary changed from Add a separate "cythonize" command to setup.py to Add a separate "build_cython" command to setup.py

comment:25 in reply to: ↑ 9 ; follow-up: Changed 3 years ago by embray

Replying to jdemeyer:

Why did you change

os.environ.get('SAGE_DEBUG', None) != 'no'

to

os.environ.get('SAGE_DEBUG', 'no') != 'no'

Actually, just to make sure I'm clear about this, is the point that debug should be enabled by default if SAGE_DEBUG is not explicitly set to "no"?

comment:26 in reply to: ↑ 25 Changed 3 years ago by jdemeyer

Replying to embray:

Actually, just to make sure I'm clear about this, is the point that debug should be enabled by default if SAGE_DEBUG is not explicitly set to "no"?

Exactly.

comment:27 follow-up: Changed 3 years ago by embray

I'm not sure how I feel about that--it goes against the default for building extension modules which is that the --debug option is not on by default. I'd be happy to change that in Sage's build_* commands, but it should be changed across the board, and a --no-debug option to turn it off is added. But it would be simpler to not make it the default.

Instead, if we want to enable debug by default (which I agree is the best choice for development) we can either add it in the setup.cfg (but it would have to be removed when making a release) or in your local pydistutils.cfg.

Or am I missing something here?

comment:28 Changed 3 years ago by embray

(I realize that the "gdb_debug" option for Cythonize and the "debug" option for building C sources are not the same thing, but they also kind of go hand-in-hand).

Last edited 3 years ago by embray (previous) (diff)

comment:29 follow-up: Changed 3 years ago by embray

Nevermind--I read the docs for SAGE_DEBUG and that gave me a better understanding. I'll play around a bit more with how the command-line option works, but I agree that the behavior you requested is consistent with how the SAGE_DEBUG environment variable itself should be interpreted.

I think for the build_cython command I'll remove the --debug option entirely, and make gdb_debug=True by default pending the value of SAGE_DEBUG (that is, identical to the previous behavior). There's no huge advantage to having an option to explicitly disable it.

Last edited 3 years ago by embray (previous) (diff)

comment:30 Changed 3 years ago by git

  • Commit changed from 487d66adc34cb7c3997c70d4efee4d411a769d18 to 56695aa5e4ebac5fa4fbe32a07e8b865a460feb1

Branch pushed to git repo; I updated commit sha1. New commits:

4ccd15cRename 'cythonize' command to 'build_cython' per https://trac.sagemath.org/ticket/21682#comment:13
15c7661Slightly simpler logic for handling the parallel option
56df3cdTemporarily disable this user option since its value is hard-coded
56695aaRemove --debug option since there's no reason not to produce debugging output unless SAGE_DEBUG=no

comment:31 in reply to: ↑ 27 Changed 3 years ago by jdemeyer

Replying to embray:

I'm not sure how I feel about that--it goes against the default for building extension modules which is that the --debug option is not on by default.

In practice, --debug is the default because Python compiles by default with -g (even twice!). The default gcc command line starts with

gcc -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC

(Sage adds the -Wno-unused flag to this and removes -Wstrict-prototypes)

Adding the --debug option to distutils adds an additional -g but that doesn't change anything really.

Last edited 3 years ago by jdemeyer (previous) (diff)

comment:32 in reply to: ↑ 29 Changed 3 years ago by jdemeyer

Replying to embray:

There's no huge advantage to having an option to explicitly disable it.

The only potential advantage would be disk space (and maybe build time).

comment:33 Changed 3 years ago by mkoeppe

  • Cc mkoeppe added

comment:34 Changed 2 years ago by jdemeyer

I recently did some work on the setup.py for cysignals and I realized that it makes the most sense to run Cython as first build command, before build_py. This is because build_py handles package_data and we may want to handle Cython-generated files as package_data some day (see #20108).

For this reason, in cysignals I decided to run Cython in the base build command.

comment:35 Changed 2 years ago by embray

I see--that does make sense. It simplifies things too--we can just insert build_cython into the front of the build sub-commands list.

comment:36 Changed 2 years ago by git

  • Commit changed from 56695aa5e4ebac5fa4fbe32a07e8b865a460feb1 to a0d57dcf36355100b4dfbb6bdef4aad42359eca3

Branch pushed to git repo; I updated commit sha1. New commits:

a0d57dcSimply insert build_cython in the front of the build sub_commands list.

comment:37 Changed 2 years ago by mkoeppe

  • Status changed from needs_review to needs_work
  • Work issues set to rebasing

comment:38 Changed 2 years ago by git

  • Commit changed from a0d57dcf36355100b4dfbb6bdef4aad42359eca3 to aaa6d38cbf84549edc720727fd3a81df22329463

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

4fc2dd6Add a new 'cythonize' command to setup.py to logically separate out the cythonization step.
c647256Rename 'cythonize' command to 'build_cython' per https://trac.sagemath.org/ticket/21682#comment:13
dbb28d4Slightly simpler logic for handling the parallel option
14b70cfTemporarily disable this user option since its value is hard-coded
eb28c9dRemove --debug option since there's no reason not to produce debugging output unless SAGE_DEBUG=no
aaa6d38Simply insert build_cython in the front of the build sub_commands list.

comment:39 Changed 2 years ago by embray

  • Status changed from needs_work to needs_review
  • Work issues rebasing deleted

Rebased


New commits:

4fc2dd6Add a new 'cythonize' command to setup.py to logically separate out the cythonization step.
c647256Rename 'cythonize' command to 'build_cython' per https://trac.sagemath.org/ticket/21682#comment:13
dbb28d4Slightly simpler logic for handling the parallel option
14b70cfTemporarily disable this user option since its value is hard-coded
eb28c9dRemove --debug option since there's no reason not to produce debugging output unless SAGE_DEBUG=no
aaa6d38Simply insert build_cython in the front of the build sub_commands list.

comment:40 Changed 2 years ago by embray

  • Status changed from needs_review to needs_work

Needs rebasing, and should work with #21682

comment:41 Changed 2 years ago by mkoeppe

You mean #22106?

comment:42 Changed 2 years ago by embray

  • Dependencies set to #22106

Yes, I meant #22106. Thanks for the catch.

comment:43 Changed 2 years ago by embray

Ah, since #22106 is at least in "positive review" status I should rebase this on it.

comment:44 Changed 2 years ago by git

  • Commit changed from aaa6d38cbf84549edc720727fd3a81df22329463 to a7e548b5f4e2c1dded48326d50363764f652aa1b

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

3bb5e73Generate auto-generated sources as the earliest step in the 'build' command.
1286549Cosmetic changes
5775788Add a new 'cythonize' command to setup.py to logically separate out the cythonization step.
f9f6d94Rename 'cythonize' command to 'build_cython' per https://trac.sagemath.org/ticket/21682#comment:13
2dd6fe7Slightly simpler logic for handling the parallel option
ac28d2dTemporarily disable this user option since its value is hard-coded
97fadcbRemove --debug option since there's no reason not to produce debugging output unless SAGE_DEBUG=no
a7e548bSimply insert build_cython in the front of the build sub_commands list.

comment:45 Changed 2 years ago by embray

  • Status changed from needs_work to needs_review

Rebased this on top of the current branch for #22106, which should be good to merge it looks like.

Other than the rebasing I don't think this needed any additional work.

comment:46 Changed 2 years ago by jdemeyer

  • Branch changed from u/embray/cythonize-command to u/jdemeyer/cythonize-command

comment:47 Changed 2 years ago by jdemeyer

  • Commit changed from a7e548b5f4e2c1dded48326d50363764f652aa1b to cdb2f8934240777310aeb2116919a3b74848a9fc
  • Dependencies #22106 deleted

Merged with 8.0.beta0


New commits:

cdb2f89Merge tag '8.0.beta0' into t/21682/cythonize-command

comment:48 Changed 2 years ago by git

  • Commit changed from cdb2f8934240777310aeb2116919a3b74848a9fc to 327108a7f6ecde89f7bc6a16a580d642f5f76018

Branch pushed to git repo; I updated commit sha1. New commits:

327108aAdd missing "mkpath"

comment:49 Changed 2 years ago by jdemeyer

  • Report Upstream changed from Reported upstream. No feedback yet. to Reported upstream. Developers acknowledge bug.

comment:50 Changed 2 years ago by git

  • Commit changed from 327108a7f6ecde89f7bc6a16a580d642f5f76018 to e3465ea9e22a0863ccd69bdc3bd28151166926cb

Branch pushed to git repo; I updated commit sha1. New commits:

e3465eaNo need to use build_ext to get the list of extensions

comment:51 Changed 2 years ago by jdemeyer

  • Reviewers set to Jeroen Demeyer

Erik, I'm happy with this. If you like my changes, you can set this to positive_review.

comment:52 Changed 2 years ago by embray

  • Status changed from needs_review to needs_work

I'm happy with the last change except in the sage_build_cython.initialize_options it should also set self.extensions = None. I know, it's a bit silly, but I like to stick to the convention on this. (The reason for this convention has to do with how distutils commands can set their options from options on other commands, i.e. self.set_undefined_options. No, it's not likely to be used this way here, but I do also like that documents what attributes should be set on the finalized command.)

comment:53 Changed 2 years ago by git

  • Commit changed from e3465ea9e22a0863ccd69bdc3bd28151166926cb to a049f2d968d9ad5d901e68b44d46f1c06b5e747f

Branch pushed to git repo; I updated commit sha1. New commits:

a049f2dSet self.extensions in initialize_options()

comment:54 Changed 2 years ago by jdemeyer

  • Status changed from needs_work to positive_review

comment:55 Changed 2 years ago by embray

Thanks!

comment:56 Changed 2 years ago by vbraun

  • Branch changed from u/jdemeyer/cythonize-command to a049f2d968d9ad5d901e68b44d46f1c06b5e747f
  • Resolution set to fixed
  • Status changed from positive_review to closed
Note: See TracTickets for help on using tickets.