Opened 2 years ago
Last modified 5 months ago
#24937 needs_info defect
@parallel only uses one thread if SAGE_NUM_THREADS is not set; ncpus is wrong.
Reported by: | msaaltink | Owned by: | |
---|---|---|---|
Priority: | minor | Milestone: | sage-8.2 |
Component: | performance | Keywords: | |
Cc: | jdemeyer, hivert, mmancini, gh-lerouxje | Merged in: | |
Authors: | Reviewers: | ||
Report Upstream: | N/A | Work issues: | |
Branch: | Commit: | ||
Dependencies: | Stopgaps: |
Description
Some recent change - possibly #23713 - has changed the behaviour of @parallel. Now, unless I set SAGE_NUM_THREADS, my parallel functions only use 1 thread. Previously they used as many threads as I have cpus, which seems like a much nicer default.
Running sage 8.2beta5 with SAGE_NUM_THREADS unset in my environment gives this:
sage: os.environ["SAGE_NUM_THREADS"] '1'
As a result, ncpus is wrong
sage: sage.parallel.ncpus.ncpus() 1
The correct answer here should be 4 (on my test machine):
sage: os.sysconf("SC_NPROCESSORS_ONLN") 4
Change History (12)
comment:1 Changed 2 years ago by
comment:2 Changed 2 years ago by
- Cc jdemeyer hivert added
comment:3 follow-up: ↓ 4 Changed 2 years ago by
The change in #23713 was certainly meant as feature.
The problem is that it's very hard to figure out a good default that works for everybody (from single users on desktop machines to server farms). One could argue that using 1 processor by default makes sense because that way @parallel
is transparent unless you explicitly ask for multiple processors (in other words: it makes it possible to do multiprocessing but doesn't force it). So instead you should just explicitly specify the number of processors that you want.
I'm happy to continue this discussion on sage-devel if needed.
comment:4 in reply to: ↑ 3 Changed 2 years ago by
Replying to jdemeyer:
The change in #23713 was certainly meant as feature.
If it is a feature, then the documentation of ncpus()
must be updated, making some reference to SAGE_NUM_THREADS
. The current docstring says:
Detects the number of effective CPUs in the system.
The problem is that it's very hard to figure out a good default that works for everybody (from single users on desktop machines to server farms). One could argue that using 1 processor by default makes sense because that way
@parallel
is transparent unless you explicitly ask for multiple processors (in other words: it makes it possible to do multiprocessing but doesn't force it). So instead you should just explicitly specify the number of processors that you want.I'm happy to continue this discussion on sage-devel if needed.
Yes please.
For instance, it would be convenient to have a replacement for the old ncpus()
, i.e. a function that does return the total number of CPUs on the computer. In the present stage, all the code that does that (starting at line 56 of src/sage/parallel/ncpus.py
) seems unreachable: starting a sage session sets the environment variable SAGE_NUM_THREADS
to 1, so that ncpus()
is basically equivalent to return 1
.
comment:5 Changed 2 years ago by
- Cc mmancini added
comment:6 Changed 2 years ago by
For coherence with the must used threads library (openmp) the number of "available" threads should be set to the maximum available on the machine (which turns to be the consistent with old behavior) ant he function sage.parallel.ncpus.ncpus, coherently with its definition, return this value. This "avalable" value is assumed also the default for parallelism. In the other hand, if parallelism is not invoked explicitly sage will we use only a processus.
As in openmp, the user could redefine the default value (ncpus) for parallelism defining a environment variable : SAGE_NUM_THREADS (like OMP_NUM_THREADS).
The actual behavior which sets ncpus = 1 is if not a bug, a no desired behavior and should be corrected.
comment:7 follow-up: ↓ 12 Changed 2 years ago by
See #25779 for a duplicate.
I think ncpus()
should return the number of cores (maybe unless overwritten by some environment variable) because that's what the name implies. I don't have a very strong opinion on how @parallel
should work exactly but I feel that "1" is a poor default for most of our casual users. I personally liked the "min(8,cores)" logic that I think was taking place before. In any case, @parallel
should certainly mention SAGE_NUM_THREADS
in its documentation.
Btw, you can set SAGE_NUM_THREADS=0
to get the old behaviour back, I think.
comment:8 Changed 2 years ago by
- Cc gh-lerouxje added
comment:9 follow-up: ↓ 10 Changed 2 years ago by
- Status changed from new to needs_info
I volunteer to fix this. But what should the new behaviour be?
comment:10 in reply to: ↑ 9 Changed 2 years ago by
Replying to saraedum:
I volunteer to fix this.
Very good!
But what should the new behaviour be?
I think it's worth asking the question on sage-devel list.
comment:11 Changed 15 months ago by
Relevant / related discussions:
comment:12 in reply to: ↑ 7 Changed 5 months ago by
I personally liked the "min(8,cores)" logic that I think was taking place before.
+1 from me for having npcus()
return min(8, cores)
I confirm the bug and could trace it down to some changes between Sage 8.1.beta5 and 8.1.beta8. On my computer (Intel Core i7-6700HQ), with Sage 8.1.beta5 and all previous versions:
while with Sage 8.1.beta8 and all versions above:
So yes, a very probable culprit is #23713, which has been merged in Sage 8.1.beta8.
A consequence of this is the follwing bug reported at https://ask.sagemath.org/question/42439/parallel-how-to-use-all-cpus/