Opened 3 years ago

Last modified 3 years ago

#22186 new defect

Process address space layout problems on Cygwin-32

Reported by: embray Owned by:
Priority: major Milestone: sage-wishlist
Component: porting: Cygwin Keywords: windows, cygwin, cygwin32
Cc: jpflori Merged in:
Authors: Reviewers:
Report Upstream: N/A Work issues:
Branch: Commit:
Dependencies: Stopgaps:

Description

There's a problem with managing the address space layout of the Python process Sage runs in that might be difficult to overcome on 32-bit Windows (this can be worked around somewhat on 32-bit Cygwin running on a 64-bit Windows, but there's almost no reason to do that).

The problem has to do with the known issue, when building Sage in Cygwin, of DLL rebasing. This in turn mostly only becomes an issue when you fork a process--this unfortunately happens quite regularly, as almost any time you start a subprocess via pexpect or the subprocess module it uses a fork-exec model. Windows has no notion of fork, and the steps Cygwin takes to emulate it are very complicated and fragile (though also extremely well tested and generally working).
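For readers less familiar with the fork-exec model mentioned above, here is a minimal sketch of what it looks like at the POSIX level in Python. The helper name fork_exec is made up for illustration; it is not how pexpect or subprocess are actually implemented, only the same basic pattern.

```python
import os
import sys


def fork_exec(argv):
    """Run argv in a child created with fork(), then replaced via exec().

    Returns the child's exit code, or -1 if it did not exit normally.
    """
    pid = os.fork()  # On Cygwin this is the step that must be emulated.
    if pid == 0:
        # Child: replace the copied process image with the new program.
        try:
            os.execvp(argv[0], argv)
        finally:
            # Only reached if exec failed; never return into the parent's code.
            os._exit(127)
    # Parent: wait for the child and decode its status.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -1
```

The fork step is where the child must end up with every DLL at the same address as in the parent, which is why relocated DLLs matter here.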

The biggest difficulty with fork on Cygwin has to do with copying the address spaces of DLLs. This article explains the issue in detail, but in short the issue is that Windows DLLs do not contain position-independent code. Rather, when they are linked they assume a "preferred" base address, which is the address within each application's address space at which the DLL's code and data are loaded. If two DLLs have the same base address (which they often do), a DLL can be relocated, but this can only happen once (and also adds certain overhead). DLLs that have been relocated create a problem for Cygwin because, when it forks, it cannot guarantee that the new address of the DLL (something determined by the Windows kernel) can be reserved in the child process's memory layout.

To get around this it's typical to use the command rebaseall which, given a list of DLLs that might be used in a process, sets their preferred base addresses so that none of them overlap with each other, and hence don't need to be relocated, avoiding the problem. When building Sage on Cygwin, one of the steps is to rebase all DLLs in $SAGE_LOCAL, and this avoids most fork problems (Cygwin also keeps a database of known DLLs and their bases/sizes so that we can run rebase without overlapping with any of Cygwin's DLLs).
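The idea of assigning non-overlapping preferred bases can be sketched as follows. This is only an illustration of the technique, not the actual rebase algorithm: the function name assign_bases and the 64 KiB alignment step (the Windows allocation granularity) are assumptions, and the starting address is passed in rather than taken from any real tool.

```python
# Windows allocation granularity (64 KiB); assumed here as the alignment step.
ALLOCATION_GRANULARITY = 0x10000


def assign_bases(dlls, top):
    """Assign descending, non-overlapping base addresses.

    dlls: iterable of (name, image_size_in_bytes), in the order to place them.
    top:  address just past the end of the first DLL.
    Returns a dict mapping name -> base address.
    """
    bases = {}
    next_end = top
    for name, size in dlls:
        # Place the image directly below the previous one, rounded down
        # so that the base is aligned.
        base = (next_end - size) // ALLOCATION_GRANULARITY * ALLOCATION_GRANULARITY
        bases[name] = base
        next_end = base
    return bases
```

Because each image ends at or below the base of the previous one, no two ranges can overlap, so none of the DLLs needs to be relocated on account of another DLL in the list.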

In Cygwin-64 this has generally worked well, and so far I have avoided most forking problems. However, in 32-bit Windows the more limited address space presents some major problems with rebasing, and I'm not completely sure they can be fully resolved.

By default, the Cygwin rebase/rebaseall utilities, given a list of DLLs to rebase, start their base addresses at just below 0x70000000 (that is, the address space for the first DLL ends at 0x70000000) and go down from there. Why 0x70000000 I'm not exactly sure, except that addresses 0x80000000 and up are reserved for the kernel (unless using the boot mode described at http://msdn.microsoft.com/en-us/library/ms791558.aspx), and Windows seems to prefer to map some system DLLs in the range just below 0x80000000. The other important address to know about is 0x20000000, which is where all 32-bit Cygwin processes locate their heap. (Cygwin maintains its own heap distinct from the normal heap reserved for processes by Windows--this is so that it can maintain various POSIX semantics without getting in the way of, or being gotten in the way of by, the kernel. By default I think it reserves 384MB for the heap, but of course it can grow as needed, and as memory is available.)

One of the problems with all this is that rebase is not very smart about how it interacts with the rest of the address space layout, and seems to gladly rebase DLLs right into the heap if it has to. Sage has so many DLLs (and this includes DLLs that are part of Cygwin) that in the process of rebaseall it will blow right through the default range reserved for Cygwin's heap. This means that any DLLs that are rebased into the heap inevitably get relocated when they are loaded.
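A check for this situation is easy to sketch. The heap start of 0x20000000 and the 384MB default size are the figures quoted above (the size being my "I think" number, so treat it as an assumption); the function names are made up for illustration and are not part of rebase.

```python
HEAP_START = 0x20000000          # where 32-bit Cygwin places its heap
HEAP_SIZE = 384 * 1024 * 1024    # assumed default reservation (unverified)


def overlaps_heap(base, size, heap_start=HEAP_START, heap_size=HEAP_SIZE):
    """True if [base, base + size) intersects the heap reservation."""
    heap_end = heap_start + heap_size
    return base < heap_end and base + size > heap_start


def dlls_in_heap(layout):
    """layout: iterable of (name, base, size). Returns the offending names."""
    return [name for name, base, size in layout if overlaps_heap(base, size)]
```

Any DLL reported by dlls_in_heap would be a candidate for relocation at load time, even with the heap at its default size.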

There are some things we could do to be smarter about this and maybe get around the problem for most cases. Currently the sage-rebase and sage-rebaseall scripts (see also related ticket #20986) are dumb about how they order things. They just use find to find all DLLs in $SAGE_LOCAL and run rebase on them in lexicographic order.
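The current behaviour described above amounts to roughly the following, written in Python rather than shell for illustration. This is a sketch of the described behaviour (find every DLL, sort lexicographically), not the actual sage-rebase script; the function name find_dlls is hypothetical.

```python
import os


def find_dlls(root):
    """Return every *.dll path under root, in lexicographic order."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            if filename.lower().endswith(".dll"):
                found.append(os.path.join(dirpath, filename))
    # Plain lexicographic order: says nothing about which DLLs matter most.
    return sorted(found)
```

The weakness is visible in the last line: the order, and therefore which DLLs land closest to the heap, depends only on file names.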

We can probably do better with some more careful ordering of the DLL base addresses. Start with only those Cygwin DLLs that are typically loaded when running Sage (which is not necessarily all of them), and put them at the highest addresses, well away from the heap. Follow them with those DLLs in $SAGE_LOCAL that are typically loaded upon an import sage.all (this can all be analyzed by running sage and looking at its /proc/<pid>/maps). The point would be to make it as unlikely as possible that any DLLs will have to be relocated. This is of course by no means foolproof. For example, one might do a lot of calculations and grow the heap, and then import a Cython module (that is not normally imported by default) which then has to be relocated, thus possibly breaking any later attempts to fork. This might have to include better tooling and documentation so that it's easy for a user to 'rebase' a module that is not "common" but that they happen to use regularly.
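A rough sketch of that analysis, assuming /proc/<pid>/maps on Cygwin uses the same whitespace-separated layout as on Linux (address range, permissions, offset, device, inode, path). The function names are hypothetical, and the ordering policy is just the one proposed above: commonly loaded DLLs first (so they get the highest addresses when bases are assigned top-down), everything else afterwards.

```python
def loaded_dlls(maps_text):
    """Return DLL paths found in /proc/<pid>/maps text, in first-seen order."""
    seen = []
    for line in maps_text.splitlines():
        # The sixth field is the path and may itself contain spaces.
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping, no path
        path = fields[5].strip()
        if path.lower().endswith(".dll") and path not in seen:
            seen.append(path)
    return seen


def rebase_order(all_dlls, loaded):
    """Order DLLs for rebasing: commonly loaded ones first, then the rest."""
    available = set(all_dlls)
    loaded_set = set(loaded)
    first = [dll for dll in loaded if dll in available]
    rest = sorted(dll for dll in all_dlls if dll not in loaded_set)
    return first + rest
```

One would feed loaded_dlls the contents of the maps file of a running Sage process after import sage.all, then pass the result of rebase_order to rebase in that order.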

All that said, while I think there's plenty of room to improve the situation, it's bad enough that I'd lean toward strongly recommending against using Sage on 32-bit Cygwin. I'll probably put the issue aside for now and come back to it later, if at all.

Change History (3)

comment:1 Changed 3 years ago by jpflori

  • Cc jpflori added

comment:2 Changed 3 years ago by embray

  • Milestone set to sage-wishlist

comment:3 Changed 3 years ago by embray

  • Keywords cygwin32 added