Opened 6 months ago

Last modified 4 months ago

#33317 needs_work enhancement

Unify computation of number of parallel jobs

Reported by: gh-tobiasdiez Owned by:
Priority: major Milestone: sage-9.7
Component: build Keywords:
Cc: mkoeppe, jhpalmieri, tscrim, nthiery, chapoton Merged in:
Authors: Reviewers:
Report Upstream: N/A Work issues:
Branch: public/build/num_threads (Commits, GitHub, GitLab) Commit: af4c3e207a178b7f414925c913619d353d267553
Dependencies: Stopgaps:

Status badges

Description

Currently, the number of jobs used for parallel operations is re-computed and set in various places. In this ticket, we unify this to one central place in sage.env, which is then reused across the whole infrastructure. The only exception is the script build/bin/sage-build-num-threads which is only needed in the build of the sage-conf package. This can probably be improved as well, but we leave it for a follow-up ticket.

Changes in detail:

  • Remove SAGE_NUM_THREADS_PARALLEL since it was only used for the docbuild.
  • Introduce the method thread_count in sage.env that based on the environment variable SAGE_NUM_THREADS and the number of CPUs computes the number of parallel jobs sage should use.
  • Use this method (or its cached value THREAD_COUNT) everywhere where previously SAGE_NUM_THREADS has been used.
  • Remove a few places that also calculated the number of CPUs or number of parallel jobs.
  • In particular, remove the computation of the number of threads to use in the make files. This is now exclusively handled through sage.env.
  • Don't specify SAGE_NUM_THREADS on CI, but let this be calculated using the number of CPUs available.

Change History (20)

comment:1 Changed 6 months ago by gh-tobiasdiez

  • Status changed from new to needs_review

comment:2 Changed 6 months ago by git

  • Commit changed from 97ebc34207bacb31cab915c1e7179ed15e1d088a to af4c3e207a178b7f414925c913619d353d267553

Branch pushed to git repo; I updated commit sha1. New commits:

af4c3e2Fix style

comment:3 follow-up: Changed 6 months ago by jhpalmieri

Does this duplicate #30639? If so, let's close #30639, since there is no branch there.

comment:4 follow-up: Changed 6 months ago by jhpalmieri

If I'm reading the old code right, the default number of threads was 2, but you've changed it to 10. At least one reason to keep the default low was for the case of shared machines, where a single user should only use many cores at a time intentionally.

comment:5 in reply to: ↑ 3 Changed 6 months ago by gh-tobiasdiez

Replying to jhpalmieri:

Does this duplicate #30639? If so, let's close #30639, since there is no branch there.

Yes, removing src/bin/sage-num-threads.py is indeed done as part of this ticket. Thanks for pointing this out.

comment:6 in reply to: ↑ 4 ; follow-up: Changed 6 months ago by gh-tobiasdiez

Replying to jhpalmieri:

If I'm reading the old code right, the default number of threads was 2, but you've changed it to 10. At least one reason to keep the default low was for the case of shared machines, where a single user should only use many cores at a time intentionally.

Yes, you are right. Some of the defaults changed as part of this refactoring since the defaults used where not homogeneous. Personally, I would say that we should try to use a high parallelization as the default, because most systems today have more than 8 cores (especially developer machines). People in a shared environment can still overwrite this of course. But if you think another default is more reasonable, this can be easily changed.

comment:7 Changed 6 months ago by mkoeppe

ALL_CAPS variables in sage.env should be strings, not integers. Is it really necessary to introduce THREAD_COUNT as a variable? Having a function that returns this value should be enough.

comment:8 follow-up: Changed 6 months ago by mkoeppe

Also, I don't see why the words num_threads (which developers are familiar with) should be replaced by thread_count.

comment:9 follow-up: Changed 6 months ago by mkoeppe

Also, the point of build/bin/sage-build-num-threads (called in build/make/install) is that it initializes the number of threads in a build context based on the user's use of make -jNUM. Please don't delete this.

comment:10 in reply to: ↑ description Changed 6 months ago by mkoeppe

Replying to gh-tobiasdiez:

The only exception is the script build/bin/sage-build-num-threads which is only needed in the build of the sage-conf package.

... no, that's not what it does

comment:11 in reply to: ↑ 6 ; follow-up: Changed 6 months ago by jhpalmieri

Replying to gh-tobiasdiez:

Replying to jhpalmieri:

If I'm reading the old code right, the default number of threads was 2, but you've changed it to 10. At least one reason to keep the default low was for the case of shared machines, where a single user should only use many cores at a time intentionally.

Yes, you are right. Some of the defaults changed as part of this refactoring since the defaults used where not homogeneous. Personally, I would say that we should try to use a high parallelization as the default, because most systems today have more than 8 cores (especially developer machines). People in a shared environment can still overwrite this of course.

I'm concerned with mathematicians not knowing that they should change it and so accidentally overloading a system. We should not rely on users knowing what they're doing.

comment:12 in reply to: ↑ 11 ; follow-up: Changed 6 months ago by slelievre

Replying to jhpalmieri:

I'm concerned with mathematicians not knowing that they should change it and so accidentally overloading a system. We should not rely on users knowing what they're doing.

+1

Users who need speed will figure out how to change a low default to something higher.

They can set the number of threads to use for parallel computations in their init.sage.

comment:13 in reply to: ↑ 12 ; follow-up: Changed 6 months ago by tscrim

Replying to slelievre:

Replying to jhpalmieri:

I'm concerned with mathematicians not knowing that they should change it and so accidentally overloading a system. We should not rely on users knowing what they're doing.

+1

Users who need speed will figure out how to change a low default to something higher.

They can set the number of threads to use for parallel computations in their init.sage.

+1 as well. We could very likely upset a sysadmin, especially with such a change where even a more knowledgeable user might not be paying so close attention to the Sage development cycle.

comment:14 in reply to: ↑ 8 ; follow-up: Changed 6 months ago by gh-tobiasdiez

Replying to mkoeppe:

Also, I don't see why the words num_threads (which developers are familiar with) should be replaced by thread_count.

I took the cpu_count function from python as a blueprint, but can change the naming of course if you think num_threads is easier / more common.

comment:15 in reply to: ↑ 9 ; follow-up: Changed 6 months ago by gh-tobiasdiez

Replying to mkoeppe:

Also, the point of build/bin/sage-build-num-threads (called in build/make/install) is that it initializes the number of threads in a build context based on the user's use of make -jNUM. Please don't delete this.

I didn't delete this script, but only the one in src/bin.

comment:16 in reply to: ↑ 13 Changed 6 months ago by gh-tobiasdiez

Replying to tscrim:

Replying to slelievre:

Replying to jhpalmieri:

I'm concerned with mathematicians not knowing that they should change it and so accidentally overloading a system. We should not rely on users knowing what they're doing.

+1

Users who need speed will figure out how to change a low default to something higher.

They can set the number of threads to use for parallel computations in their init.sage.

+1 as well. We could very likely upset a sysadmin, especially with such a change where even a more knowledgeable user might not be paying so close attention to the Sage development cycle.

I think you misunderstood me. I didn't change the user-facing multiprocessing for calculation, i.e. what ncpus in sage/parallel returns (at least if I don't overlook something). This still first returns SAGE_NUM_THREADS if set, and only as a fallback uses the number of cores. What I changed was that other parallel processes follow the same strategy. This change should only impact devs that compile sage and run doctests. I'm not sure if it really makes sense to be more strict there than with user-facing calculations. I would expect that devs have more knowledge that they might need to restrict parallel computations in order to not negatively impact their system.

Anyway, if the special treatment of compilation and doctests is indeed preferred, what would be a good default? No parallelization at all if the user doesn't specify SAGE_NUM_THREADS?

comment:17 in reply to: ↑ 15 Changed 6 months ago by mkoeppe

Replying to gh-tobiasdiez:

Replying to mkoeppe:

Also, the point of build/bin/sage-build-num-threads (called in build/make/install) is that it initializes the number of threads in a build context based on the user's use of make -jNUM. Please don't delete this.

I didn't delete this script, but only the one in src/bin.

You deleted its use in build/make/install

comment:18 in reply to: ↑ 14 Changed 6 months ago by mkoeppe

Replying to gh-tobiasdiez:

Replying to mkoeppe:

Also, I don't see why the words num_threads (which developers are familiar with) should be replaced by thread_count.

I [...] can change the naming of course

Yes, please change it back.

comment:19 Changed 6 months ago by mkoeppe

  • Status changed from needs_review to needs_work

comment:20 Changed 4 months ago by mkoeppe

  • Milestone changed from sage-9.6 to sage-9.7
Note: See TracTickets for help on using tickets.