Opened 11 years ago

Closed 10 years ago

#10112 closed defect (fixed)

random_prime does not handle erroneous input gracefully - it hangs

Reported by: drkirkby Owned by: was
Priority: major Milestone: sage-4.8
Component: number theory Keywords:
Cc: Merged in: sage-4.8.alpha1
Authors: Mike Hansen, Francis Clarke Reviewers: David Kirkby, Johan Bosman
Report Upstream: N/A Work issues:
Branch: Commit:
Dependencies: Stopgaps:

Status badges

Description (last modified by jdemeyer)

As noted at #10111 and

http://groups.google.com/group/sage-devel/browse_thread/thread/6e8d6f28c915830d?hl=en

random_prime() is not well documented.

But random_prime() is also broken for some erroneous input. Programs should always handle poor input properly, which is the case of Sage probably means generating an error message.

sage: random_prime(0,True,-1) # Hangs
sage: random_prime(0,False,-1) # Hangs
sage: random_prime(1,0,-23) # Hangs
sage: random_prime(-12,0,-12) # Hangs
sage: random_prime(0, proof=False, lbound=-12) # Hangs

Having a negative lower bound can cause the function to hang, but this is not always the case, as the example below shows. Here an error message is generated.

sage: random_prime(-12,0,-4)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)

/export/home/drkirkby/sage-4.6.alpha3/<ipython console> in <module>()

/export/home/drkirkby/sage-4.6.alpha3/local/lib/python2.6/site-packages/sage/rings/arith.pyc in random_prime(n, proof, lbound)
   1144     n = ZZ(n)
   1145     if n < lbound:
-> 1146         raise ValueError, "n must be greater than lbound: %s"%(lbound)
   1147     elif n == 2:
   1148         return ZZ(n)

ValueError: n must be greater than lbound: -4

Dave

Apply

  1. trac_10112.patch (merged in sage-4.8.alpha1)
  2. trac_10112-replacement-extra.patch (merged in sage-4.8.alpha1)
  3. trac_10112-ascii.patch (merged in sage-4.8.alpha2)

to the Sage library.

Attachments (4)

trac_10112.patch (1.8 KB) - added by mhansen 11 years ago.
trac_10112-extra.patch (1.5 KB) - added by fwclarke 10 years ago.
to apply after trac_10112.patch
trac_10112-replacement-extra.patch (3.2 KB) - added by fwclarke 10 years ago.
Apply after trac_10112.patch INSTEAD of trac_10112-extra.patch
trac_10112-ascii.patch (1.0 KB) - added by jhpalmieri 10 years ago.
apply on top of other patches

Download all attachments as: .zip

Change History (25)

comment:1 Changed 11 years ago by mhansen

  • Authors set to Mike Hansen
  • Status changed from new to needs_review

comment:2 Changed 11 years ago by drkirkby

Mike, thank you for fixing this. I'll try this later today, but here's a few comments from reading the patch only - I've not tested it in Sage yet.

Should the example on line 1127 be

random_prime(200, proof=False, lbound=100) ?

Currently it's the same as the previous example.

One thing that would be nice is if you did like John did at #10105 and have doctests for the commands that currently hang Sage. Then if there are any regressions, and the bug(s) get back in, this will be detected.

Perhaps I'll get bored of running my little script that supplies junk to commands to see what ones crash sage (like #10113) or hang Sage like this ticket, #10108 and #10105.

Changed 11 years ago by mhansen

comment:3 Changed 11 years ago by mhansen

I've updated the patch.

comment:4 Changed 11 years ago by drkirkby

Thank you, I will take a look later today and test it.

comment:5 Changed 10 years ago by fwclarke

The patch deals with the cases listed. But there are other cases which hang, like

sage: random_prime(8, lbound=8)

and some that almost do, like

sage: random_prime(3, lbound=-10^8)

They can be very easily dealt with. The additional patch does this.

Changed 10 years ago by fwclarke

to apply after trac_10112.patch

comment:6 Changed 10 years ago by drkirkby

  • Status changed from needs_review to positive_review

Those two patches are a definite improvement, so I'm giving it a positive review.

I've opened #10277 for another issue with random_prime

Sage's implementation is also incredibly slow compared to Mathematica. After 20 minutes of CPU time on a 3.33 GHz machine I was unable to generate one up to 101000 with Sage. Yet Mathematica can do that in 7 seconds!

Dave

comment:7 Changed 10 years ago by drkirkby

  • Authors changed from Mike Hansen to Mike Hansen, Francis Clarke
  • Reviewers set to David Kirkby

comment:8 Changed 10 years ago by drkirkby

  • Status changed from positive_review to needs_info

I'm setting this to 'needs info', as William remarked on sage-devel that he did not see this issue with 4,6 that I get at. It seems highly unlikely, but these patches could just be whats casing that problem for me.

comment:9 Changed 10 years ago by fwclarke

  • Status changed from needs_info to needs_work

The issue raised in #10277 (which surely arises from using random_prime as patched here) shows the weakness of using prime_pi. There is a much better way of checking that there are some primes in the range in question. A replacement alternative patch will be posted very soon.

comment:10 Changed 10 years ago by fwclarke

  • Status changed from needs_work to needs_review

A replacement patch is attached.  It uses Betrand's postulate and refinements due to Nakura and Schoenfield.  Only when the ratio of the upper and lower bounds is quite close to one is the smallest prime in the interval computed.

comment:11 follow-up: Changed 10 years ago by burcin

  • Status changed from needs_review to needs_work

Can we create a fast but unsafe function random_prime_unsafe() and make the random_prime() function exposed to the users call that?

ATM, the random_prime() function is used in speed critical code. See sage/ext/multi_modular.pyx for example. In this use case, the checks are unnecessary and they would cause a dramatic slow down. We should be able to bypass the checks when we know the input makes sense. I suggest separating the core of the function from the code that makes it idiot-proof, and using the core directly in places like sage/ext/multi_modular.pyx.

When this change is made, there should be a clear and loud warning in the release notes to notify people who rely on the current behavior of random_prime() to switch to this unsafe version in their personal codebase as well.

Changed 10 years ago by fwclarke

Apply after trac_10112.patch INSTEAD of trac_10112-extra.patch

comment:12 in reply to: ↑ 11 Changed 10 years ago by fwclarke

Replying to burcin:

Can we create a fast but unsafe function random_prime_unsafe() and make the random_prime() function exposed to the users call that? ATM, the random_prime() function is used in speed critical code. See sage/ext/multi_modular.pyx for example. In this use case, the checks are unnecessary and they would cause a dramatic slow down. [... ...]

A good idea. But the slowdown is, for most cases, far less significant with the new patch.

comment:13 Changed 10 years ago by fwclarke

Probably the way forward would be to have a "check" option. But to quantify my earlier remark, I note that for the default bounds used in sage/ext/multi_modular.pyx (10^15 and 10^10), the code checking that there are primes in that interval adds, on my machine, about 50 microseconds to the 1325 microseconds taken to find a random prime.

comment:14 follow-up: Changed 10 years ago by drkirkby

  • Status changed from needs_work to needs_review

Is there any reason this should not be changed to positive review if the slowdown is very small?

Unless there's universal agreement on this, I suggest the changes in documentation, are put on #10111. That ticket was supposed to address the deficiencies in the documentation. The only comment on that ticket is from Mike, and is See the patch at #10112. But if Mikes patch here, which changes both the documentation and the code is going to stall, we might as well just fix the documentation for now on #10111. The changes needed to the documentation appear to be

  • The example random_prime(200, lbound=100) is added
  • The example random_prime(200, proof=False, lbound=100) is added.
  • The error message in the code (which I guess is code not documentation), which changes n must be greater than lbound to n must be at least lbound. Clearly looking at the code, fwclarke's change is correct, since the test before the error message is if n < lbound. Therefore n=lbound is permitted by the code, so the correct error message should be n must be at least lbound

If there is no agreement to merge the two patches here, I'll create one patch which just fixes the above 3 documentation problems and add that to #10111. Does that make sense?

Dave

comment:15 in reply to: ↑ 14 Changed 10 years ago by johanbosman

  • Description modified (diff)
  • Reviewers changed from David Kirkby to David Kirkby, Johan Bosman
  • Status changed from needs_review to positive_review

Replying to drkirkby:

Is there any reason this should not be changed to positive review if the slowdown is very small?

I don't think so. The code looks okay, is well documented, and passes all tests.

comment:16 Changed 10 years ago by jdemeyer

  • Description modified (diff)

comment:17 Changed 10 years ago by jdemeyer

  • Merged in set to sage-4.8.alpha1
  • Resolution set to fixed
  • Status changed from positive_review to closed

comment:18 Changed 10 years ago by jdemeyer

  • Resolution fixed deleted
  • Status changed from closed to new

The patch contains non-ASCII characters which Sphinx 1.1.2 (#10620) rejects.

comment:19 Changed 10 years ago by jhpalmieri

  • Status changed from new to needs_review

We could put # -*- coding: utf-8 -*- at the top of the file arith.py -- I think the sagenb project has done this with all of their files -- but I prefer the attached patch.

Changed 10 years ago by jhpalmieri

apply on top of other patches

comment:20 Changed 10 years ago by jhpalmieri

  • Description modified (diff)

comment:21 Changed 10 years ago by jdemeyer

  • Description modified (diff)
  • Resolution set to fixed
  • Status changed from needs_review to closed
Note: See TracTickets for help on using tickets.