Opened 11 years ago

Closed 10 years ago

#9631 closed defect (fixed)

Remerge #9501 after resolving NFS and/or doctest problems with @fork

Reported by: mpatel Owned by: jason
Priority: major Milestone: sage-4.7.2
Component: misc Keywords:
Cc: leif, jhpalmieri, kcrisman, malb, mvngu, SimonKing, was Merged in: sage-4.7.2.alpha2
Authors: William Stein, Mitesh Patel Reviewers: Volker Braun
Report Upstream: N/A Work issues:
Branch: Commit:
Dependencies: Stopgaps:

Status badges

Description (last modified by vbraun)

We merged #9501 in Sage 4.5.2.alpha1 but backed it out entirely in 4.5.2.rc0 (cf. #9616), because a Network File System (NFS) problem on the Sage cluster gives frequent doctest failures.

Please see #9501 and #9616 for discussion.

Apply trac_9631-fork_decorator.5.patch

Attachments (6)

trac_9631-fork_decorator.patch (10.5 KB) - added by mpatel 11 years ago.
Version of #9501's v2, rebased for 4.6.rc0.
trac_9631-fork_decorator.2.patch (10.7 KB) - added by vbraun 10 years ago.
Rediffed for sage-4.7.1.rc1
trac_9631-fork_decorator.3.patch (10.7 KB) - added by vbraun 10 years ago.
One more docfix
trac_9631-fork_decorator.4.patch (10.7 KB) - added by was 10 years ago.
polished docstrings fixing some typos.
trac_9631-fork_decorator.4-5.diff (11.9 KB) - added by leif 10 years ago.
Diff between v4 (William's) and v5 (Leif's) patch. For reference / review only.
trac_9631-fork_decorator.5.patch (14.4 KB) - added by leif 10 years ago.
Some more docstring fixes.

Download all attachments as: .zip

Change History (26)

comment:1 Changed 11 years ago by mpatel

After the recent changes on the Sage cluster, I still see failures on sage.math with 4.6.alpha2 + #9501's v2 patch:

sage-4.6.a2$ env DOT_SAGE="$HOME/.sage" ./sage -t -long devel/sage/sage/parallel/decorate.py
sage -t -long "devel/sage/sage/parallel/decorate.py"        


------------------------------------------------------------
Unhandled SIGSEGV: A segmentation fault occurred in Sage.
This probably occurred because a *compiled* component
of Sage has a bug in it (typically accessing invalid memory)
or is not properly wrapped with _sig_on, _sig_off.
You might want to run Sage under gdb with 'sage -gdb' to debug this.
Sage will now terminate (sorry).
------------------------------------------------------------

**********************************************************************
File "/mnt/usb1/scratch/mpatel/apps/sage-4.6.a2/devel/sage/sage/parallel/decorate.py", line 300:
    sage: g()
Expected:
    '10'
Got:
    [Errno 16] Device or resource busy: '/home/mpatel/.sage/temp/sage.math.washington.edu/7514/dir_0/.nfs00000000015a0bad0006c177'
    '10'
**********************************************************************
File "/mnt/usb1/scratch/mpatel/apps/sage-4.6.a2/devel/sage/sage/parallel/decorate.py", line 311:
    sage: g()
Expected:
    'a'
Got:
    [Errno 16] Device or resource busy: '/home/mpatel/.sage/temp/sage.math.washington.edu/7514/dir_1/.nfs00000000015a0bad0006c178'
    'a'
**********************************************************************
File "/mnt/usb1/scratch/mpatel/apps/sage-4.6.a2/devel/sage/sage/parallel/decorate.py", line 300:
    sage: g()
Expected:
    '10'
Got:
    [Errno 16] Device or resource busy: '/home/mpatel/.sage/temp/sage.math.washington.edu/7514/dir_0/.nfs00000000015a0bad0006c177'
    '10'
**********************************************************************
File "/mnt/usb1/scratch/mpatel/apps/sage-4.6.a2/devel/sage/sage/parallel/decorate.py", line 311:
    sage: g()
Expected:
    'a'
Got:
    [Errno 16] Device or resource busy: '/home/mpatel/.sage/temp/sage.math.washington.edu/7514/dir_1/.nfs00000000015a0bad0006c178'
    'a'

I believe the Unhandled SIGSEGV is expected.

comment:2 follow-up: Changed 11 years ago by mpatel

Ticket #10098 is about a similar [Errno 16] Device or resource busy error.

comment:3 in reply to: ↑ 2 Changed 11 years ago by mpatel

Replying to mpatel:

Ticket #10098 is about a similar [Errno 16] Device or resource busy error.

See #10098's comment 9ff for a proposed possible solution.

Changed 11 years ago by mpatel

Version of #9501's v2, rebased for 4.6.rc0.

comment:4 Changed 11 years ago by mpatel

  • Authors set to William Stein, Mitesh Patel
  • Milestone set to sage-4.6.1
  • Status changed from new to needs_review

I've attached a tweaked version of William's patch v2 from #9501.

Changed 10 years ago by vbraun

Rediffed for sage-4.7.1.rc1

Changed 10 years ago by vbraun

One more docfix

comment:5 Changed 10 years ago by vbraun

  • Description modified (diff)
  • Reviewers set to Volker Braun
  • Status changed from needs_review to positive_review

I've updated the patch for Sage-4.7.1, there were some rejects due to changed docstrings. I didn't touch any actual functionality.

The actual workings of the fork decorator look good. Positive review.

comment:6 Changed 10 years ago by jdemeyer

  • Milestone changed from sage-4.7.1 to sage-4.7.2

comment:7 Changed 10 years ago by leif

There are still a few typos in the docstrings.

Related (but shouldTM apply independently, haven't tested this yet): #11658

Changed 10 years ago by was

polished docstrings fixing some typos.

comment:8 follow-up: Changed 10 years ago by was

  • Description modified (diff)

Hi Leif, I just attached trac_9631-fork_decorator.4.patch which polished the docstrings fixing some typos. If there are any remaining typos, please point them out explicitly.

comment:9 in reply to: ↑ 8 Changed 10 years ago by leif

Replying to was:

Hi Leif, I just attached trac_9631-fork_decorator.4.patch which polished the docstrings fixing some typos. If there are any remaining typos, please point them out explicitly.

  • trac_9631-fork_decorator.4.patch

    old new  
    1717+++ b/sage/parallel/decorate.py
    1818@@ -15,19 +15,17 @@
    1919     r"""
    20      Convert a to a pair (args, kwds) using some rules:
     20     Convert ``a`` to a pair ``(args, kwds)`` using some rules:
    2121 
    2222-        * if already of that form, leave that way.
    2323-        * if a is a tuple make (a,{})
     
    4444@@ -53,9 +51,14 @@
    4545 class Parallel:
    4646     r"""
    47      Create parallel decorated function.
     47     Create ``parallel``-decorated function.
    4848-
    4949     """
    5050     def __init__(self, p_iter = 'fork', ncpus=None, **kwds):
     
    5656+        """
    5757         # The default p_iter is currently the reference implementation.
    5858         # This may change.
    59          self.p_iter = None
     59         self.p_iter = None # ??? = p_iter, which defaults to 'fork', not the sequ. ref. impl.
    6060@@ -81,19 +84,16 @@
    6161 
    6262     def __call__(self, f):
     
    6767-        in possibly random order. Here x is replaced by its
    6868+        Create a function that wraps ``f`` and that when called with a
    6969+        list of inputs returns an iterator over pairs ``(x, f(x))`` in
    70 +        possibly random order. Here ``x`` is replaced by its
    71          normalized form (args, kwds) using normalize_inputs.
     70+        possibly random order.  Here ``x`` is replaced by its
     71         normalized form ``(args, kwds)`` using ``normalize_inputs()``.
    7272 
    7373         INPUT:
    7474-
     
    102102+         The parallel subprocesses will not have access to data
    103103+         created in pexpect interfaces.  This behavior with respect to
    104104+         pexpect interfaces is very important to keep in mind when
    105 +         setting up certain computations. It's the one big limitation
     105+         setting up certain computations.  It's the one big limitation
    106106+         of this decorator.
    107107+
    108108     INPUT:
     
    114114+            - ``fork``            -- (default) use a new forked process for each input
    115115+            - ``multiprocessing`` -- use multiprocessing library
    116116+            - ``reference``       -- use a fake serial reference implementation
    117       - ``ncpus`` -- integer, number of cpus
    118       - ``timeout`` -- number of seconds until task is killed (only supported by 'fork')
     117      - ``ncpus`` -- integer, maximal number of subprocesses to use at the same time
     118      - ``timeout`` -- number of seconds until a subprocess is killed (only supported by ``fork``; zero means not at all)
    119119 
    120120+    .. warning::
    121121+
    122 +         If you use anything but 'fork' above, then a whole new
     122+         If you use anything but ``fork`` above, then a whole new
    123123+         subprocess is spawned, so none of your local state (variables,
    124124+         certain functions, etc.) is available.
    125125+
    126126     EXAMPLES:
    127127 
    128      We create a simple decoration for a simple function. The number
     128     We create a simple decoration for a simple function.  The number
    129129@@ -148,7 +166,6 @@
    130130         sage: @parallel(2)
    131131         ... def f(n): return n*n
    132132 
    133133-
    134      We create a decorator that uses 3 processes, and times out
     134     We create a decorator that uses three subprocesses, and times out
    135135     individual processes after 10 seconds::
    136136 
    137137@@ -174,3 +191,152 @@
     
    144144+
    145145+###################################################################
    146146+# The @fork decorator -- evaluate a function with no side effects
    147 +# in memory, so the only side effects are on disk.
     147+# in memory, so the only side effects (if any) are on disk.
    148148+#
    149149+# We have both a function and a class below, so that the decorator
    150150+# can be used with or without options:
     
    158158+
    159159+class Fork:
    160160+    """
    161 +    A fork decorator class.
     161+    A ``fork`` decorator class.
    162162+    """
    163163+    def __init__(self, timeout=0, verbose=False):
    164164+        """
    165165+        INPUT:
    166 +         - ``timeout`` -- (default: 0) kills subrocess after this many
    167 +           seconds, or if timeout=0, do not kill the subprocess.
     166+         - ``timeout`` -- (default: 0) kill each subprocess after it has run this many
     167+           seconds (wall time), or if ``timeout`` is zero, do not kill any subprocesses.
    168168+         - ``verbose`` -- (default: ``False``) whether to print
    169169+           anything about what the decorator does (e.g., killing
    170170+           subprocesses)
     
    182182+    def __call__(self, f):
    183183+        """
    184184+        INPUT:
     185
    185186+         - ``f`` -- a function
    186187+
    187188+        OUTPUT:
     189
    188190+         - A decorated function.
    189191+
    190192+        EXAMPLES::
     
    206208+    """
    207209+    Decorate a function so that when called it runs in a forked
    208210+    subprocess.  This means that it won't have any in-memory
    209 +    side-effects on the parent Sage process.  The pexpect interfaces
     211+    side effects on the parent Sage process.  The pexpect interfaces
    210212+    are all reset.
    211213+   
    212214+    INPUT:
    213215+      - ``f`` -- a function
    214 +      - ``timeout`` -- (default: 0) if positive, kills subrocess after
    215 +        this many seconds
     216+      - ``timeout`` -- (default: 0) if positive, kill subprocess after
     217+        this many seconds (wall time)
    216218+      - ``verbose`` -- (default: ``False``) whether to print anything
    217 +        about what the decorator does (e.g., killing subprocesses)
     219+        about what the decorator does (e.g., killing the subprocess)
    218220+
    219221+    .. warning::
    220222+
    221 +        The forked subprocesses will not have access to data created
     223+        The forked subprocess will not have access to data created
    222224+        in pexpect interfaces.  This behavior with respect to pexpect
    223225+        interfaces is very important to keep in mind when setting up
    224 +        certain computations. It's the one big limitation of this
     226+        certain computations.  It's the one big limitation of this
    225227+        decorator.
    226228+
    227229+    EXAMPLES:
    228230+
    229 +    We create a function and run it with the fork decorator.  Note
     231+    We create a function and run it with the ``fork`` decorator.  Note
    230232+    that it does not have a side effect.  Despite trying to change
    231 +    the global variable "a" below in g, the variable a does not get
    232 +    changed.::
     233+    the global variable ``a`` below in ``g``, the variable ``a`` does not get
     234+    changed::
    233235+   
    234236+        sage: a = 5
    235237+        sage: @fork
     
    242244+        sage: a
    243245+        5
    244246+
    245 +    We use fork to make sure that the function terminates after 1
     247+    We use ``fork`` to make sure that the function terminates after one
    246248+    second, no matter what::
    247249+   
    248250+        sage: @fork(timeout=1, verbose=True)
     
    253255+        Killing subprocess ... with input ((10000000,), {'m': 5}) which took too long
    254256+        'NO DATA (timed out)'
    255257+
    256 +    We illustrate that pexpect interface state is not affected by
     258+    We illustrate that the state of the pexpect interface is not altered by
    257259+    forked functions (they get their own new pexpect interfaces!)::
    258260+   
    259261+        sage: gp.eval('a = 5')
     
    305307-            - ``timeout`` -- (float) time in seconds until a subprocess is automatically killed
    306308-            - ``verbose`` -- whether to print anything about what the iterator does (e.g., killing subprocesses)
    307309+            - ``ncpus`` -- the maximal number of simultaneous
    308 +              processes to spawn
    309 +            - ``timeout`` -- (float, default: 0) time in seconds until
     310+              subprocesses to spawn
     311+            - ``timeout`` -- (float, default: 0) wall time in seconds until
    310312+              a subprocess is automatically killed
    311313+            - ``verbose`` -- (default: False) whether to print
    312314+              anything about what the iterator does (e.g., killing
    313315+              subprocesses)
    314316+            - ``reset_interfaces`` -- (default: True) whether to reset
    315 +              all expect interfaces
     317+              all pexpect interfaces
    316318 
    317319         EXAMPLES::
    318320 
     
    326328         """
    327329@@ -206,7 +213,8 @@
    328330 
    329              # The expect interfaces (and objects defined in them) are
     331             # The pexpect interfaces (and objects defined in them) are
    330332             # not valid.
    331333-            sage.interfaces.quit.invalidate_all()
    332334+            if self.reset_interfaces:

Explicit, but perhaps a bit inconvenient. Didn't apply it, only looked at the patch.

comment:10 Changed 10 years ago by leif

P.S.: The wording w.r.t. ncpus and timeout could be unified.

comment:11 Changed 10 years ago by was

Since you changed the code, you should post a patch, since it seems silly for me to just manually copy your changes in...

comment:12 follow-up: Changed 10 years ago by was

I see -- you literally patched the patch? I don't know how to get your patch of the patch off of trac. I can view it, but can't download it...?

comment:13 in reply to: ↑ 12 Changed 10 years ago by leif

Replying to was:

I see -- you literally patched the patch? I don't know how to get your patch of the patch off of trac. I can view it, but can't download it...?

Yep, sorry. (I actually edited your patch and then made a diff just for displaying.)

In principle one can extract inline patches from mail notifications (or "reply" to the comment just to copy parts), but since I also changed context lines, the patched patch wouldn't apply anyway.

I was just going to upload a v5 (no code changes btw., just in comments), but somehow managed to totally mess up the patch (and apparently also the files themselves) such that I now have two mixed versions with both having some of your and some of my changes... 8/

Trying to fix this, then I'll upload a "real" v5 patch; I made more changes than in the comment above.

Changed 10 years ago by leif

Diff between v4 (William's) and v5 (Leif's) patch. For reference / review only.

Changed 10 years ago by leif

Some more docstring fixes.

comment:14 Changed 10 years ago by leif

Ok, attached a v5 patch, and a diff between William's v4 and my version.

Take a look at it, haven't updated the description yet.

Hope I didn't miss something, as I had to redo almost all from scratch, because some very weird things must have happened. Not only did the editor confuse files, presumably due to renaming and symbolic links, but also my first committed and exported version completely vanished from the Mercurial repository, which must be some weird bug related to cloning and rebuilding a branch.

comment:15 follow-up: Changed 10 years ago by vbraun

  • Status changed from positive_review to needs_work

Looks good. If you can add a description then we'll be good to go.

comment:16 in reply to: ↑ 15 Changed 10 years ago by leif

Replying to vbraun:

Looks good. If you can add a description then we'll be good to go.

? I just meant I haven't updated the ticket's description to refer to the new v5 patch.

comment:17 Changed 10 years ago by leif

  • Status changed from needs_work to needs_info

comment:18 Changed 10 years ago by vbraun

  • Description modified (diff)
  • Status changed from needs_info to needs_review

Sorry, I didn't notice that the patch does already include some description; since it starts with a hash-mark its somewhat indistinguishable from the automatic mercurial comments. But should be good enough.

comment:19 Changed 10 years ago by vbraun

  • Status changed from needs_review to positive_review

comment:20 Changed 10 years ago by jdemeyer

  • Merged in set to sage-4.7.2.alpha2
  • Resolution set to fixed
  • Status changed from positive_review to closed
Note: See TracTickets for help on using tickets.