Ticket #853 (needs_info enhancement)
Add a pslq implementation to Sage
| Reported by: | was | Owned by: | was |
|---|---|---|---|
| Priority: | major | Milestone: | sage-wishlist |
| Component: | number theory | Keywords: | |
| Cc: | burcin, robertwb | Work issues: | need advice on interface |
| Report Upstream: | N/A | Reviewers: | David Kirkby |
| Authors: | Paul Zimmermann, Alex Ghitza | Merged in: | |
| Dependencies: | Stopgaps: |
Description (last modified by AlexGhitza) (diff)
David Bailey's ARPREC package http://crd.lbl.gov/~dhbailey/mpdist/ includes several implementations of PSLQ, written in C++, and is licensed under BSD. However, ARPREC raw multi arithmetic timings don't look too favorable http://pari.math.u-bordeaux.fr/benchs/timings-mpfr.html and one has the same fix-x86 issues as quad-double. It looks like, however, one of the advantages of PSLQ is that it does not require full-precision at many of the intermediate steps. (that's what this two-level stuff is about in his package--most operations are performed with machine-double arithmetic).
Zimmermann also has a GPL implementation, based on gmp, which is only 1000 lines long. http://www.loria.fr/~zimmerma/free/ No idea yet how speeds compare.
Attachments
Change History
comment:2 Changed 6 years ago by zimmerma
Damien Stehle did some comparisons between the PSLQ implementation of Bailey and his FPLLL, and FPLLL was much faster. Of course it would be good to have an independent comparison, but it might be that PSLQ does not outperform LLL when searching for linear relations.
comment:3 Changed 4 years ago by wdj
To be precise, Zimmerman's code is at http://www.loria.fr/~zimmerma/free/pslq-1.0.c and is licensed under the GPLv2+.
comment:4 Changed 4 years ago by was
- Summary changed from Add a pslq implementation to Sage to [with patch] Add a pslq implementation to Sage
I've attached the code/paper that was attached to the following email.
agnes.jany@googlemail.com> to wstein@gmail.com date Mon, Aug 10, 2009 at 2:16 PM subject PSLQ implementation mailed-by googlemail.com hide details 2:16 PM (1 hour ago) Reply Follow up message Dear Mr Stein, I'm a mathematics student at the Johannes-Gutenberg University of Mainz, Germany. As a part of my diploma thesis, I have implemented PSLQ to SAGE. You can find the code, a worksheet and a documentation in the attachment of this email. Maybe you can use my work for your project. Yours sincerely, Agnes Jany
comment:5 Changed 3 years ago by AlexGhitza
- Report Upstream set to N/A
Note that mpmath, which is now a standard Sage package, contains an implementation of pslq:
---------------------------------------------------------------------- | Sage Version 4.2.1, Release Date: 2009-11-14 | | Type notebook() for the GUI, and license() for information. | ---------------------------------------------------------------------- sage: import sage.libs.mpmath.all as mpmath sage: mpmath.mp.dps = 30 sage: mpmath.pslq([sqrt(n) for n in range(2, 8+1)]) [2, 0, 0, 0, 0, 0, -1] sage: mpmath.pslq([pi/4, acot(5), acot(239)]) [1, -4, 1]
The examples are from the mpmath documentation, see http://mpmath.googlecode.com/svn/trunk/doc/build/identification.html
The first one says that the only integer relation between the square roots of 2,3,...,8 is 2\sqrt{2}-\sqrt{8}=0. The second is one of the cool formulas expressing pi as a combination of arccotangents.
comment:6 Changed 3 years ago by AlexGhitza
It would still be worth it to wrap Paul Zimmermann's C implementation, since it's fast. I tried it together with the mpmath implementation on the real life example given at the top of http://www.cecm.sfu.ca/organics/papers/bailey/paper/html/node6.html
Both give the right answer (yay!). According to timeit:
Zimmermann's C implementation:
25 loops, best of 3: 13.6 ms per loop
mpmath's implementation:
5 loops, best of 3: 267 ms per loop
So the C code is 20 times faster in this example.
comment:7 Changed 3 years ago by fredrik.johansson
The mpmath implementation should be correct, but it doesn't implement all the stopping criteria, and sometimes the precision, tolerance and iteration settings need fiddling with to give the right result. So even ignoring speed, it would be desirable for Sage to use an implementation that gives good control over all the settings.
The 20x difference actually sounds a bit high (though not unrealistic). I wonder if the number of iterations is the same for both implementations (it could be quite different depending just on small implementation details). Also, on my computer, PSLQ runs about twice as fast with mpmath+gmpy than mpmath+Sage.
I don't know how the other implementations compare, but I would favor adding e.g. Paul Zimmermann's implementation to Sage. It should be trivial to wrap mpmath.pslq as well, so perhaps it could be provided as an optional algorithm.
comment:9 Changed 3 years ago by AlexGhitza
- Status changed from needs_work to needs_info
- Work issues set to need advice on interface
- Milestone changed from sage-wishlist to sage-4.3.3
- Summary changed from [with patch] Add a pslq implementation to Sage to Add a pslq implementation to Sage
- Authors set to Paul Zimmermann, Alex Ghitza
I have created an spkg for Paul Zimmermann's C implementation of PSLQ, see
http://sage.math.washington.edu/home/ghitza/pslq-1.0.spkg
I have also started writing an interface for using this from Sage, see the attached patch trac_853.patch. This is obviously not done: for one, there should be way more docstrings and doctests. However, I would like some feedback on the interface before I put much more work into this. If somebody has a cleaner way of using PSLQ from Sage, I'm happy to listen and throw away what I've done so far.
I'm about to ask for feedback on sage-devel.
Changed 3 years ago by AlexGhitza
-
attachment
trac_853.patch
added
apply after installing pslq-1.0.spkg
comment:10 Changed 3 years ago by AlexGhitza
- Milestone changed from sage-4.3.3 to sage-wishlist
PS (not LQ): I do think that we should also eventually have implementation="mpmath" as an option, if only for verification purposes; but I would prefer not to overload this ticket, since it's already two years old.
PPS: I'll change the milestone back to sage-wishlist until this is ready for review.
comment:11 Changed 3 years ago by drkirkby
- Reviewers set to David Kirkby
It's good to see you have tested this on Solaris, though there is a potential Solaris issue:
if [ `uname` = "Darwin" -a "$SAGE64" = "yes" ]; then
echo "64 bit MacIntel"
CFLAGS="$CFLAGS -m64 "; export CFLAGS
fi
should, at the very least, be replaced by
if [ "x`uname`" = xyes ]; then
echo "Building a 64-bit version of pslq"
CFLAGS="$CFLAGS -m64 "; export CFLAGS
LDFLAGS="$LDFLAGS -m64 "; export LDFLAGS
fi
It is in general safer to put an x in front of the uname and then test for whatever we want, with an x in front of that. (This is for maximum portability, using any shell, and is not essential, but I think a good habit to get into). There is no need to quote "xyes" as we know it has no spaces in it.
Without removing the Darwin restriction, it would be impossible to build a 64-bit version on Solaris, OpenSolaris or other system such as HP-UX where both 32-bit and 64-bit executables are supported.
Whether LDFLAGS is necessary or not depends on the package. I've not tried building this 64-bit Solaris. I'll have a look at that later, but I do not think setting LDFLAGS ever appears to do any harm, and is sometimes essential.
Actually, better still would be
if [ -z "$CFLAG64" ] ; then
CFLAG64=-m64
fi
if [ "x`uname`" = xyes ]; then
echo "Building a 64-bit of pslq"
CFLAGS="$CFLAGS $CFLAG64"
LDFLAGS="$LDFLAGS $CFLAG64"
fi
if [ "x`$SAGE_LOCAL/bin/testcc.sh`" = xGNU ] ; then
CFLAGS="$CFLAGS -Wall -pedantic"
fi
export CFLAGS
export LDFLAGS
as that would
- Allow the variable CFLAG64 to be set to whatever compiler flag is necessary to build 64-bit code, which is not -m64 for all compilers. (CFLAG64 has been used in other packages for this purpose, to increase portability).
- Add the compiler options -Wall and -pedantic if using gcc.
Compiling with the -Wall -pedantic options I get:
drkirkby@hawk:/tmp/pslq-1.0/src$ gcc -Wall -pedantic -c pslq-1.0.c pslq-1.0.c: In function ‘print_column’: pslq-1.0.c:175: warning: format ‘%u’ expects type ‘unsigned int’, but argument 2 has type ‘long unsigned int’ pslq-1.0.c: In function ‘print_relation’: pslq-1.0.c:224: warning: format ‘%u’ expects type ‘unsigned int’, but argument 3 has type ‘long unsigned int’ pslq-1.0.c: In function ‘print_matrix’: pslq-1.0.c:240: warning: format ‘%u’ expects type ‘unsigned int’, but argument 2 has type ‘long unsigned int’ pslq-1.0.c: In function ‘pslq’: pslq-1.0.c:855: warning: format ‘%u’ expects type ‘unsigned int’, but argument 2 has type ‘long int’ pslq-1.0.c:858: warning: format ‘%u’ expects type ‘unsigned int’, but argument 2 has type ‘long int’ pslq-1.0.c:860: warning: format ‘%d’ expects type ‘int’, but argument 2 has type ‘long int’ pslq-1.0.c:870: warning: format ‘%u’ expects type ‘unsigned int’, but argument 2 has type ‘long int’ pslq-1.0.c:892: warning: format ‘%u’ expects type ‘unsigned int’, but argument 2 has type ‘long int’ pslq-1.0.c: In function ‘main’: pslq-1.0.c:972: warning: implicit declaration of function ‘strcmp’ pslq-1.0.c:1040: warning: format ‘%u’ expects type ‘unsigned int’, but argument 2 has type ‘long unsigned int’ pslq-1.0.c:1044: warning: format ‘%u’ expects type ‘unsigned int *’, but argument 2 has type ‘long unsigned int *’ pslq-1.0.c:1055: warning: format ‘%u’ expects type ‘unsigned int’, but argument 2 has type ‘long unsigned int’
Some of those warnings would lead me to believe a 64-bit build of this would not work as expected. In that case, 'unsigned int' would be 4 bytes, but 'long unsigned int' would be 8 bytes. That could go very pear shaped.
The function strcmp() is defined in strings.h on Solaris, so I would suggest adding
#include <strings.h>
I can't comment on the maths aspect of it - I'm not a mathematician.
Some of these issues need reporting upstream, some are problems with spkg-install.
I would note that all of the above code snippets I wrote were untested, so would need testing
Dave
comment:13 Changed 3 years ago by zimmerma
I'd like to review that patch (now at SD20 in Marseilles) however I've downloaded sage 4.3.3.alpha1 a few days ago and compiled it on my laptop (Core 2 Duo under Fedora 12), and sage -t * gives 593 Segfaults (without any patch applied). With this, I don't see how I could seriously review that patch. I hope the final 4.3.3 release is better (now compiling). Is this problem on Fedora 12 being analyzed currently? I can send the complete log from sage -t * with 4.3.3.alpha1 if needed.
Paul
comment:14 Changed 3 years ago by AlexGhitza
Hi Paul,
I'm in a similar situation, and currently having to build sage-4.3.3 on my laptop since I messed up my 4.3.3.alpha1. I suggest you send an email to sage-devel about the Fedora 12 issues, with a link to the test log (and maybe also to the build log).
Since you're looking at reviewing this, I will try to finish up the documentation and other things today (most likely tonight... Australian time).
comment:15 Changed 3 years ago by zimmerma
I did a comparison of my PSLQ implementation (within Sage) with fpLLL with a knapsack matrix. With Bailey's "node6" example, fpLLL is 14 times faster:
sage: m = matrix(9,10)
sage: for i in range(9):
m[i,i]=1
sage: for i in range(9):
m[i,9]=ZZ(num_list[i]*RealField(200)(2)^180)//2^10
sage: L=m.LLL()
sage: L.row(0)
(-480, 1920, 0, -16, -255, -660, 840, 160, -360, 219687)
sage: p.coefficients()
(480, -1920, 0, 16, 255, 660, -840, -160, 360)
sage: %timeit L=m.LLL()
625 loops, best of 3: 1.41 ms per loop
sage: %timeit p=PSLQ(num_list, prec=167)
25 loops, best of 3: 19.8 ms per loop
Thus apart from historical reasons (or comparison with fpLLL) I don't see any point to add PSLQ in Sage. Or the default PSLQ mode should be to call fpLLL. However maybe I'm biased because fpLLL was designed by a former student of mine.
comment:16 Changed 3 years ago by zimmerma
Or the default PSLQ mode should be to call fpLLL.
in fact, it would be cleaner to have a function linear_relation, which could have algorithm=LLL (default) or algorithm=PSLQ.
comment:17 Changed 3 years ago by zimmerma
I got some more information about PSLQ by David Bailey, who agreed that I forward it to the Sage developers (the references [1] and [2] are those from pslq-1.0.c):
Comments: Reference [1] in your note is the original PSLQ paper, but the algorithm as presented there is quite cumbersome, as it involves (needlessly) many full-matrix operations. Reference [2] stated an abbreviated but equivalent version; unfortunately, however, it includes one bug. Thus I strongly suggest that you base your implementation on the following paper: David H. Bailey and David J. Broadhurst, "Parallel Integer Relation Detection: Techniques and Applications," /Mathematics of Computation/, vol. 70, no. 236 (Oct 2000), pg. 1719-1736. Our preprint copy is available at: http://crd.lbl.gov/~dhbailey/dhbpapers/ppslq.pdf The basic PSLQ algorithm is stated on page 2, and should work well as stated (please let me know if you have any problems). A two-level and a three-level variant are also described, which are faster but quite a bit more complicated. However, if you are really serious, I suggest that you try the "multi-pair" variant of PSLQ, which is presented in the above paper beginning on page 10. Although we devised this scheme originally to be suitable for parallel processing, we have found that even on a single processor system it runs significantly faster, and is significantly more effective in recovering relations when the input data is given only to limited precision. Two- and a three-level variants of the multi-pair scheme, in analogy to the two- and three-level versions of the regular PSLQ, are also given in the paper. These are much faster than the basic multi-pair PSLQ scheme, because they perform most operations using ordinary double-precision arithmetic, updating the multi-precision arrays only occasionally when needed. In my own work, I always use the multi-pair PSLQ. I use the basic multi-pair PSLQ for n up to 10 or 20 and for modest precision. For larger n, and, say, 500-digit or more precision, I generally use two-level multi-pair scheme. For truly "heroic" calculations (e.g., n > 100 and precision level > 2000 digits), I use the three-level multi-pair scheme, since it has advantages for very large calculations and runs well on a parallel system -- see some case studies mentioned in the above paper. Please let me know if it works for you. And if you have any questions, I would be pleased to respond. If you wish, you can look at the implementations of PSLQ and the multi-pair PSLQ schemes (in both C++ and Fortran-90) that we have bundled with our ARPREC package: http://crd.lbl.gov/~dhbailey/mpdist
I will try to modify my code to use the "basic PSLQ algorithm" described in the paper mentioned above. However in the short term I won't be able to implement the multi-pair variant. Thus if somebody wants to do it, please proceed. Alternatively, one might use the PSLQ variants from ARPREC (if the license is ok).
comment:18 Changed 3 years ago by AlexGhitza
Just a note on ARPREC's license, since Paul brought it up: it is not BSD as stated in this ticket's description, rather BSD-LBNL. However, this is apparently compatible with both GPLv2 and GPLv3, according to the table "Good licenses" at
http://fedoraproject.org/wiki/Licensing
So there should be no legal obstacles to using ARPREC.
comment:20 follow-up: ↓ 21 Changed 11 months ago by hivert
Hi Paul,
How much work would that be to interface your C code with Sage ? Do you have a proof of concept ?
Florent
comment:21 in reply to: ↑ 20 ; follow-up: ↓ 22 Changed 11 months ago by zimmerma
comment:22 in reply to: ↑ 21 Changed 11 months ago by hivert
Replying to zimmerma:
Replying to hivert:
How much work would that be to interface your C code with Sage ? Do you have a proof of concept ?
no idea. Why do you ask? See comment 2. I see no reason to interface PSLQ.
I'm at a small workshop and there is someone which is currently using Maple and is considering to switch to Sage.. Maple has both LLL and PSLQ. He told me that, he has some stability problem with LLL, in the sense that removing some precision digits gives drastically different results. Apparently PSLQ doesn't. I've no idea if it's a problem with the algorithms or the implementation.
comment:23 Changed 11 months ago by zimmerma
I'm curious seeing an example with some stability problem with LLL.
Paul
comment:24 Changed 11 months ago by AlexGhitza
This past January, a student of mine and I have run some experiments comparing fpLLL and the PSLQ implementations from ARPREC (we wanted to take the best current implementations to get a realistic comparison). In the examples we ran, we found almost no reason to use PSLQ instead of fpLLL for finding integer relations. The only situation where PSLQ might be more appropriate is when it is extremely expensive to generate extra digits in the input floating point numbers. PSLQ has a slight edge here because it tends to require fewer digits of precision than fpLLL. Most of the time this is of no consequence because fpLLL is much faster. We'll try to write something up describing our experiments and results, but I don't know how soon I'll find time for that.
In terms of "stability", our experiments indicate that PSLQ tends to stick with the correct answer once it finds it, as you add more digits of precision. With fpLLL, you sometimes hit the right answer with, say 190 digits, but then you get different answers for a short while (say, 191 to 197 digits), and then it stabilises on the right answer again. Paul, I can dig up an explicit example of this if you are interested. Again, I don't see this as an issue from a practical point of view -- I would run the algorithm until I get the exact same answer with 3 or 5 different precisions.
comment:25 Changed 11 months ago by zimmerma
Alex,
I can dig up an explicit example of this if you are interested.
yes please do! Such explicit examples are extremely useful.
Paul
