Opened 7 years ago
Closed 7 years ago
#13880 closed defect (fixed)
Respect ulimit -v for GAP memory pool size
Reported by: | vbraun | Owned by: | jason |
---|---|---|---|
Priority: | blocker | Milestone: | sage-5.6 |
Component: | misc | Keywords: | |
Cc: | Merged in: | sage-5.6.beta3 | |
Authors: | Volker Braun | Reviewers: | Dmitrii Pasechnik |
Report Upstream: | N/A | Work issues: | |
Branch: | Commit: | ||
Dependencies: | Stopgaps: |
Description (last modified by )
Limit the GAP pool size to 1/10 of the ulimit -v, if the user has set one. Also, allow zero-sized swap in a doctest.
Attachments (2)
Change History (40)
Changed 7 years ago by
comment:1 Changed 7 years ago by
- Status changed from new to needs_review
comment:2 Changed 7 years ago by
- Description modified (diff)
- Summary changed from Fix doctest for amout of swap to Respect ulimit -v for GAP memory pool size
comment:3 follow-up: ↓ 4 Changed 7 years ago by
Why available_swap() in ZZ
?
I'm not sure whether the fallback value of virtual_memory_limit()
(self.available_swap() + self.available_ram()
) is appropriate.
It doesn't take into account the memory already used by the process, and the available (or total) swap can be much larger than what's available to a single process (e.g. 16 GB swap space on a 32-bit processor or OS); similar for the RAM "available".
comment:4 in reply to: ↑ 3 ; follow-up: ↓ 5 Changed 7 years ago by
Replying to leif:
Why
available_swap() in ZZ
?
To test that it returns an integer as promised in the docstring.
I've added a manual 2GB limit for the rare case of 32-bit machines with >> 2GB ram.
comment:5 in reply to: ↑ 4 Changed 7 years ago by
Replying to vbraun:
Replying to leif:
Why
available_swap() in ZZ
?To test that it returns an integer as promised in the docstring.
Not that obvious, to me at least... ;-)
Still wonder whether we shouldn't also test that the result is actually non-negative.
I've added a manual 2GB limit for the rare case of 32-bit machines with >> 2GB ram.
Ok. (The more common case -- which I meant -- is having >> 2 GB swap space, which is handled now as well.)
Slightly related (memory_info.py
probably deserved its own ticket):
available_ram()
should probably have an optional boolean parameter to not just return the amount of free memory ("MemFree
"), as this can often be pretty close to zero, due to buffering and especially caching. (I.e., the amount of "available" RAM, allocatable by a process without running out of [physical] memory, might in fact be much larger, regardless of whether swap space is available or not.)
As is, although its docstring contains "free" in parentheses, the term "available RAM" is IMHO potentially misleading.
W.r.t. GAP, I think it may currently "run out of memory" simply because [there's no or not enough swap space and] most (or at least much) of the physical memory is eaten up by just caches.
comment:6 Changed 7 years ago by
I disagree with
suggested_size = min(suggested_size, int(mem.virtual_memory_limit()/10))
This virtual_memory_limit
is per-process, so the suggested size should be roughly equal to virtual_memory_limit
, not much less. Maybe something like
# Respect ulimit -v, use slightly less than maximum allowed amount of memory suggested_size = min(suggested_size, int(mem.virtual_memory_limit()*7/8))
comment:7 Changed 7 years ago by
- Priority changed from major to blocker
comment:8 follow-up: ↓ 9 Changed 7 years ago by
I'm sure you can improve on available_ram()
but its good enough for starting GAP, I think. The perfect is the enemy of the good.
For libGAP the pool is actually in the Sage process, so what Jeroen suggests would be a bad idea.
Also, I don't think the user's intention when setting ulimit -v
was for Sage to spawn a couple of processes that each try to get as close as possible to the limit.
comment:9 in reply to: ↑ 8 ; follow-up: ↓ 10 Changed 7 years ago by
Replying to vbraun:
I'm sure you can improve on
available_ram()
but its good enough for starting GAP, I think. The perfect is the enemy of the good.
But I'd say that this patch actually makes things worse than they are. GAP currently works fine with ulimit -v
, it automatically adjusts (remember the "halving pool size" messages).
For libGAP the pool is actually in the Sage process, so what Jeroen suggests would be a bad idea.
No, because this ticket doesn't refer to libGAP.
Also, I don't think the user's intention when setting
ulimit -v
was for Sage to spawn a couple of processes that each try to get as close as possible to the limit.
I don't think the user intends GAP to take 1/10th of the memory either. If I would set the memory limit to 2GB and GAP doesn't want to allocate 300MB, I would be quite surprised.
I do agree this is all bikeshedding, that's why I intentionally don't set this to needs_work.
comment:10 in reply to: ↑ 9 Changed 7 years ago by
Replying to jdemeyer:
But I'd say that this patch actually makes things worse than they are. GAP currently works fine with
ulimit -v
, it automatically adjusts (remember the "halving pool size" messages).
Not if you use libgap. It also adjusts to the point where you don't have any virtual memory left for Sage and then other stuff runs of of memory.
comment:11 Changed 7 years ago by
The libGAP discussion IMHO belongs to another ticket (and is IMHO pretty irrelevant at this point). Dima mentioned the GAP developers seem to be working on a new memory manager, probably also because the current one is unsuitable for a library, at least when used together with other libraries or memory managers.
Would it make sense to use a different pool size for doctesting (i.e., a close[r]-to-minimal one)?
I still believe the current implementation will prevent GAP from working if the RAM is filled up with buffers and caches (and there's no or "not enough" swap space), which to me doesn't seem to be too untypical. Haven't tried yet though.
comment:12 follow-ups: ↓ 13 ↓ 15 Changed 7 years ago by
The libGAP discussion is relevant because they share the function to determine the pool size.
If you have neither free ram nor any swap then we'll still reserve a pool that is large enough to run all long doctests.
comment:13 in reply to: ↑ 12 ; follow-up: ↓ 14 Changed 7 years ago by
Replying to vbraun:
The libGAP discussion is relevant because they share the function to determine the pool size.
Perhaps the suggested_size
routine needs a flag for_libgap=true/false
then, to compute appropriate defaults for either situation. Obviously, the two cases ask for separate considerations in the face of per-process limits.
Isn't this ticket dealing with a sage in which libgap is non-existent? patching this upon introduction of libgap is rather straightforward.
comment:14 in reply to: ↑ 13 ; follow-up: ↓ 17 Changed 7 years ago by
Replying to nbruin:
Isn't this ticket dealing with a sage in which libgap is non-existent? patching this upon introduction of libgap is rather straightforward.
That's what I meant with (currently) "irrelevant". (Besides that libGAP may come with a new / alternate memory manager when Sage is "ready" for it.)
comment:15 in reply to: ↑ 12 Changed 7 years ago by
Replying to vbraun:
If you have neither free ram nor any swap then we'll still reserve a pool that is large enough to run all long doctests.
Ok. That's probably enough for "ordinary" Sage sessions as well.
Not sure right now how it behaves when users happen to need a larger pool size. (They should get an appropriate error message, and the pool size should be easily customizable. AFAIK, the latter currently requires modifying $DOT_SAGE/init.sage
.)
comment:16 Changed 7 years ago by
You can always use sage.interfaces.gap.set_gap_memory_pool_size()
before starting any gap computation if you want to set it manually. This is not changed in this ticket.
comment:17 in reply to: ↑ 14 ; follow-up: ↓ 18 Changed 7 years ago by
Replying to leif:
Besides that libGAP may come with a new / alternate memory manager when Sage is "ready" for it.
LibGAP is ready. It only needs review for a few obvious improvements over the previously reviewed version. I.e. will probably never be merged considering the glacial process. But ready it is.
comment:18 in reply to: ↑ 17 Changed 7 years ago by
Replying to vbraun:
Replying to leif:
Besides that libGAP may come with a new / alternate memory manager when Sage is "ready" for it.
LibGAP is ready. It only needs review for a few obvious improvements over the previously reviewed version. I.e. will probably never be merged considering the glacial process.
I'm working on this from a crevasse in a local Singaporean glacier. It helps against hardware overheating...
comment:19 Changed 7 years ago by
I am testing this on a system where mem.available_swap() comes up negative:
sage: t = mem._parse_proc_meminfo(); t {'available_ram': 1435963392, 'total_swap': 10909376512, 'available_swap': 8558673920, 'Committed_AS': 16000847872, 'total_ram': 7929249792} sage: mem.available_swap() -5091471360
which is not surprising, as mem.available_swap() computes t['total_swap']-t['Committed_AS']
, rather than returning t['available_swap']
.
What is going on? A bug or a feature?
That's how far I get digging up the cause of
sage -t -long "devel/sage-main/sage/misc/memory_info.py" ********************************************************************** File "/usr/local/src/sage/sage-5.5/devel/sage-main/sage/misc/memory_info.py", line 122: sage: mem.virtual_memory_limit() > 0 Expected: True Got: False ********************************************************************** 1 items had failures:
comment:20 follow-up: ↓ 21 Changed 7 years ago by
How about the following patch:
diff --git a/sage/misc/memory_info.py b/sage/misc/memory_info.py --- a/sage/misc/memory_info.py +++ b/sage/misc/memory_info.py @@ -263,7 +263,10 @@ """ info = self._parse_proc_meminfo() try: - return info['total_swap'] - info['Committed_AS'] + dif = info['total_swap'] - info['Committed_AS'] + if dif > 0: + return dif + return info['available_swap'] except KeyError: return info['available_swap']
at least this fixes the doctests on the system in question
comment:21 in reply to: ↑ 20 Changed 7 years ago by
Replying to dimpase:
How about the following patch:
diff --git a/sage/misc/memory_info.py b/sage/misc/memory_info.py --- a/sage/misc/memory_info.py +++ b/sage/misc/memory_info.py @@ -263,7 +263,10 @@ """ info = self._parse_proc_meminfo() try: - return info['total_swap'] - info['Committed_AS'] + dif = info['total_swap'] - info['Committed_AS'] + if dif > 0: + return dif + return info['available_swap'] except KeyError: return info['available_swap']at least this fixes the doctests on the system in question
I actually have no clear idea why 'Committed_AS' gets into the picture here. E.g. I read
So this is a worst-case estimate, and info['total_swap'] < info['Committed_AS']
means, from the OS point of view, that you're living on the edge. Well, maybe. And the measurement is taken while make -j4 ptestlong
on this 4-processor machine.
comment:22 Changed 7 years ago by
- Status changed from needs_review to needs_work
- Work issues set to mem.available_swap() can be negative
comment:23 Changed 7 years ago by
dimpase: the new test simply checks
MemoryInfo().available_swap() in ZZ
so perhaps this is a feature, not a bug? I don't mind that available_swap()
can be negative.
comment:24 follow-up: ↓ 26 Changed 7 years ago by
The problem is that I don't know an easy way to report "available swap". It could all be reserved by some other gap session for all we know. So Total-Committed_AS
is a good approximation. Yes it can go negative if the kernel is overcommitting, but one could argue that it is then to be expected that a negative number is returned. In any case, it works as I intended it. If you have a better metric for available swap that doesn't involve going through the VM allocations of every process then I'd be happy to implement that. But in its present state it is good enough for starting GAP; if Committed_AS > Total
then you probably want the minimial sized pool for GAP.
comment:25 Changed 7 years ago by
- Status changed from needs_work to needs_review
- Work issues mem.available_swap() can be negative deleted
comment:26 in reply to: ↑ 24 Changed 7 years ago by
Replying to vbraun:
The problem is that I don't know an easy way to report "available swap".
how about self._parse_proc_meminfo()['available_swap']
?
As it is now, this value can be different from MemoryInfo().available_swap()
, and this is rather confusing.
It could all be reserved by some other gap session for all we know.
but we also know that this might be a huge overestimate.
So
Total-Committed_AS
is a good approximation.
at least it should be named differently, not MemoryInfo().available_swap()
, to avoid the clash with self._parse_proc_meminfo()['available_swap']
Yes it can go negative if the kernel is overcommitting, but one could argue that it is then to be expected that a negative number is returned. In any case, it works as I intended it. If you have a better metric for available swap that doesn't involve going through the VM allocations of every process then I'd be happy to implement that. But in its present state it is good enough for starting GAP; if Committed_AS > Total
then you probably want the minimial sized pool for GAP.
comment:27 follow-up: ↓ 28 Changed 7 years ago by
/proc/meminfo
doesn't exclude reserved yet unused swap in the SwapFree field.
I've renamed the (private) dict key to free_swap
. In any case thats not visible to the user.
comment:28 in reply to: ↑ 27 Changed 7 years ago by
Replying to vbraun:
/proc/meminfo
doesn't exclude reserved yet unused swap in the SwapFree field.I've renamed the (private) dict key to
free_swap
. In any case thats not visible to the user.
This is not enough. I still get
File "/usr/local/src/sage/sage-5.6.beta2/devel/sage-main/sage/misc/memory_info.py", line 122: sage: mem.virtual_memory_limit() > 0 Expected: True Got: False
virtual_memory_limit() as implemented is too conservative, I think. Indeed:
sage: from sage.misc.memory_info import MemoryInfo sage: mem = MemoryInfo() sage: mem.av mem.available_ram mem.available_swap sage: mem.available_swap() -5123952640 sage: mem.available_ram() 1262751744 sage: mem.virtual_memory_limit() -3861200896
unless you will allow virtual_memory_limit() be negative, too, but this is IMHO even weirder... Shouldn't an implementation respect the overcommitting settings somehow?
comment:29 follow-up: ↓ 30 Changed 7 years ago by
I've change virtual_memory_limit()
to default to total swap+ram instead of available swap+ram. Now its always positive (and quite large unless you set a limit manually).
comment:30 in reply to: ↑ 29 Changed 7 years ago by
Replying to vbraun:
I've change
virtual_memory_limit()
to default to total swap+ram instead of available swap+ram. Now its always positive (and quite large unless you set a limit manually).
Why don't you use 'free_swap' instead of 'available_swap' ? This reflects the reality and the system settings (for overcommitting) far better.
With 'total swap'+'total ram' one would potentially end up with too much swap reserved for GAP.
comment:31 follow-up: ↓ 32 Changed 7 years ago by
No, the virtual memory limit should only be used for the pool size if either it was set by hand to a relatively small value or if the address space is small compared to ram size (32-bit PAE).
comment:32 in reply to: ↑ 31 Changed 7 years ago by
Replying to vbraun:
No, the virtual memory limit should only be used for the pool size if either it was set by hand to a relatively small value or if the address space is small compared to ram size (32-bit PAE).
The docstring for virtual memory limit says:
This is the value set by ``ulimit -v`` at the command line or a practical limit if no limit is set.
This does not describe the current implementation. (And it wasn't really true before, either).
comment:33 follow-up: ↓ 34 Changed 7 years ago by
Why not? For all practical purposes, the limit is either swap+ram or your processor address limit if you don't set it.
comment:34 in reply to: ↑ 33 Changed 7 years ago by
Replying to vbraun:
Why not? For all practical purposes, the limit is either swap+ram or your processor address limit if you don't set it.
because you neither return the value X set by ulimit -v
(you return X+Y, for some nonzero Y), nor anything practical (no single process can reach this limit, ever) to speak about.
comment:35 Changed 7 years ago by
Are you sure? I do return RLIMIT_AS if set:
[vbraun@volker-desktop ~]$ ulimit -v 12341234 [vbraun@volker-desktop ~]$ sage ... sage: from sage.misc.memory_info import MemoryInfo sage: MemoryInfo().virtual_memory_limit() 12637423616 sage: _/1024 12341234
And you probably can request total ram+swap thanks to overcommit if not much else is going on or the overcommit ratio is stupidly high.
comment:36 Changed 7 years ago by
- Status changed from needs_review to positive_review
comment:37 Changed 7 years ago by
- Reviewers set to Dmitrii Pasechnik
comment:38 Changed 7 years ago by
- Merged in set to sage-5.6.beta3
- Resolution set to fixed
- Status changed from positive_review to closed
Updated patch