Opened 8 years ago
Last modified 5 years ago
#15492 needs_work enhancement
Implement multidimensional numerical integration
Reported by:  stassev  Owned by:  

Priority:  major  Milestone:  sage6.4 
Component:  numerical  Keywords:  numerical integration, cuba 
Cc:  novoselt  Merged in:  
Authors:  Reviewers:  
Report Upstream:  N/A  Work issues:  
Branch:  u/rws/implement_multidimensional_numerical_integration (Commits, GitHub, GitLab)  Commit:  a9b78d7cf0c69cb67098a4882a42957e390906d3 
Dependencies:  Stopgaps: 
Description (last modified by )
The patch coming with this ticket is a multidimensional numerical integrator, which uses calls to the Cuba library (http://www.feynarts.de/cuba/), or in case of 1dim integrals, it defaults to the GSL numerical_integral.
The attached code also has an option to try to reduce the dimensionality of the integral by first trying a (multithreaded) symbolic evaluation.
To make the integral evaluation fast, the code first makes an honest effort to compile the integrand (either using _fast_float_ or cython, depending on the input).
The code can output verbose information on all intermediate steps (including the resulting cython code to be compiled; results from the symbolic preprocessing, etc.), which tries to make the evaluation as transparent as possible. This should allow both for debugging, as well as for optimizing the input integrand in slowtoconverge cases.
For most cases I tested it, this implementation is faster than Mathematica's NIntegrate thanks to Cuba and the included symbolic preprocessor.
Note that the patch requires the Cuba library to be installed for Sage. So, I am attaching Cuba as a spkg package with this ticket as well. That spkg package needs to be installed first before building sage with the patch.
Attachments (3)
Change History (22)
comment:1 Changed 8 years ago by
 Description modified (diff)
 Status changed from new to needs_review
comment:2 Changed 8 years ago by
comment:3 Changed 8 years ago by
LGPL is fine.
If you want to touch cuba's build system at all, it would be nice to
transition to libtool / automake. There shouldn't be any handcrafted
makefile.in
, it should be autogenerated from Makefile.am
. Its
basically impossible to generate shared libaries by hand. I'm pretty
sure your makefile.in
won't work on OSX and Solaris. But for
starters, you tould just use the cuba static library (static libraries
suck, but we could leave enabling shared libraries for the future...)
The code should go into sage.libs.cuba
, either a separate directory
or just sage/libs/cuba.pyx
. Don't add a copy of the cuba header to
the Sage library, include the installed header (see make install
).
The Sage library code is supposed to make at least an attempt to keep lines at <= 80 chars. Its acceptable to have longer lines, but only if it is necessary to improve readability. E.g. comments and docstrings very rarely have any excuse to be longer than that. In emacs reflowing a paragraph of text is Metaq (Alt + 'q'), e.g.
Also, PEP8 spacing would be nice.
Line breaks in the doctest should use the new syntax ....:
:
sage: foo(123, ....: 456)
The multidim_integration
function is 1000+ lines long. It really
needs to be refactored into subroutines of reasonable size.
Use SR.symbol()
for temporary variables whose name you don't care about.
The whole functionality to compile a symbolic expression into C code
is nice but unrelated to cuba. It should be moved elsewhere (probably
somewhere in sage.ext
). It is also sufficiently independed as a task
that it could be a separate ticket, you could first implement cuba
with fast_float
and later add the C/cython callable as an optimization.
comment:4 Changed 8 years ago by
Hello all,
Thank you so much for your prompt comments! I hope I address them at least soso below. I updated the attached patch and spkg files accordingly.
As sort of an excuse, I wanted to say that actually this is my first python (and Sage) code ever, so bear with me ... Here it goes point by point.
Replying to kcrisman's comments:
 I can never remember whether the LGPL is compatible with GPL in the correct order we need it to be here...
 All I can say is that Wikipedia says it is, and a cursory look at the text confirms it ... for me.
 I think that you should consult more (as per your sagedevel post) with where this should fit in with Sage. We already have enough directories, and ideally this would have at least some direct connection with already existing integration facility and/or the GSL stuff. That said, obviously this is not something one can plop in the calculus directory either.
 Moved code to sage.libs.cuba as per comment by vbraun above.
 There's also the matter of requiring a new spkg without the "waiting period as optional spkg". I'm not saying what should happen, just that this is typically the way this is handled.
 I am not sure what this means. How do I make a new spkg "without a waiting period"?
 The code is also very "case"y and full of lots and lots of options for the functions... which may be unavoidable. But it is hard to go through for that reason. Is there any way you can modularize this more? For instance, replaceDictionary (and we don't usually use camelCase) looks painful to me.
 Fixed camelCase, thanks! Modularity is addressed below. About the plethora of options  it is, I think, unavoidable. Numerical integration is always a bit of an art. What I tried to do is make the code as hackable as possible using only function arguments. You can always just ignore the obscure options... As an example of how "case"y it could have been, take a look at the full Mathematica documentation of NIntegrate: http://reference.wolfram.com/mathematica/tutorial/NIntegrateIntroduction.html http://reference.wolfram.com/mathematica/tutorial/NIntegrateIntegrationStrategies.html http://reference.wolfram.com/mathematica/tutorial/NIntegrateIntegrationRules.html
Replying to vbraun's comments:
 If you want to touch cuba's build system at all, it would be nice to transition to libtool / automake. There shouldn't be any handcrafted makefile.in, it should be autogenerated from Makefile.am. Its basically impossible to generate shared libaries by hand. I'm pretty sure your makefile.in won't work on OSX and Solaris. But for starters, you tould just use the cuba static library (static libraries suck, but we could leave enabling shared libraries for the future...)
 I have practically zero experience with libtool/automake, and especially with making software portable. The included makefile.in is already testing my abilities. And unfortunately I failed to build the sage lib with the static libcuba.a. So, I'd definitely appreciate any help you can give me here. But note that the original code comes with a makefile.in and without a makefile.am. All I did was just cosmetic changes to the already supplied makefile.in: added a fPIC and $(CC) shared o libfile.so file.o which I don't see how it can be harmful. So, if something more is needed for portability, this may be an upstream issue. But again, I am hardly an expert.
 The code should go into sage.libs.cuba, either a separate directory or just sage/libs/cuba.pyx. Don't add a copy of the cuba header to the Sage library, include the installed header (see make install).
 Moved code to sage.libs.cuba. Removed copy to cuba.h.
 The Sage library code is supposed to make at least an attempt to keep lines at <= 80 chars. Its acceptable to have longer lines, but only if it is necessary to improve readability. E.g. comments and docstrings very rarely have any excuse to be longer than that. In emacs reflowing a paragraph of text is Metaq (Alt + 'q'), e.g.
 Thanks for the suggestion. Done ... more or less: The problem is that when I reflow the examples, keywords such as "rel tol" no longer have effect, so I left those untouched.
 Also, PEP8 spacing would be nice.
 I had no idea about recommended spacings! Thanks for pointing those out to me. Tried to fix them, but I'm sure I missed a lot.
 Line breaks in the doctest should use the new syntax ....::
 Much better looking now, thanks!
 The multidim_integration function is 1000+ lines long. It really needs to be refactored into subroutines of reasonable size.
 Modularity is addressed below.
 Use SR.symbol() for temporary variables whose name you don't care about.
 I really don't care about those vars outside that code, so for me they are indeed temporary. But let me know what I should use instead, if you still think this is not "standard Sage". Thanks!
 The whole functionality to compile a symbolic expression into C code is nice but unrelated to cuba. It should be moved elsewhere (probably somewhere in sage.ext). It is also sufficiently independed as a task that it could be a separate ticket, you could first implement cuba with fast_float and later add the C/cython callable as an optimization.
 Modularity is addressed below.
 Now, about modularity. Yes, it would be nice if compiling into C can be made automatic for any task (for people who do not want to bother with cython), but I don't think I am up for that task. Note that the cython compilation part of the patch specifically invokes multidim_integration_unit_cube. So as written, it is in fact very Cuba specific. In principle, that call can be replaced to a call to any other functional (in a similar way I did the function substitution using a dummy global function), but it already starts getting too contrived. Maybe at some point I'd think about extending that part of the code into a separate utility, but it will not happen now. However, I agree that that piece of the code can be moved to a separate function within the cuba lib. So, I did that, but again, only within the cuba lib. Also, note that of the original 1000+ lines of multidim_integration, 600 are in fact documentation; and the majority of the remaining 400 lines are simply sanity checks and printing debugging information, which should be pretty selfexplanatory ... I'd hope... But ok, I still split the code :)
 Nathann Cohen suggested that I should add .rst documentation. I tried adding that in doc/en/reference, but hg says those changes are ignored. Suggestions?
Cheers!
comment:5 Changed 8 years ago by
 Cc novoselt added
comment:6 Changed 8 years ago by
I found ways to fix the two problems I had before:
 After linefolding an example (using ....: ) which need keywords such as 'rel tol', those keywords must be sitting on the first line. Otherwise they have no effect. So, I folded (I hope) all lines neatly that end up in the documentation.
 About the rst documentation problem  I was modifying the directory structure of doc/en/reference/calculus/ instead of just the file index.rst. Now, adding documentation works for me. So, I added it.
Also I expanded the text of the documentation to make clearer why those examples matter. And I moved all imports to top level, so that function calls are not slowed down by those. For some reason %time registers that change (in Wall time), while timeit() does not...
I attach the updated patch.
comment:7 Changed 8 years ago by
As an interface that is presentable to the enduser I would propose
sage: integral.symbolic(...) sage: integral.numerical(...) sage: integral.multidim(...) sage: integral.multidim_numerical(...) sage: integral(...) # autoguess
The cuba library interface shouldn't do any symbolic integration. That is not only unrelated, most numerical options also don't make sense for symbolic integration.
fast_callable
should be used instead of the olderfast_float
.
 The C source code generation should be factored out and doctested separately.
 Dynamically loaded modules should be wrapped in some object (maybe
CythonCallable
) and unloaded when finished (i.e. destroyed). Otherwise its leaking memory...
 library code shoudn't use
sage_eval
.
 Catchall
except:
statements are not allowed  could beKeyboardInterrupt
, for example.
 Docstrings in Python should be imperative: "Return foo" instead of "Returns foo".
comment:8 Changed 8 years ago by
Let me for now address only your first comment, since the rest are secondary issues.
I completely agree with you that such a split of the interface must eventually happen. However, at the practical level, I do not think it is within the range of my current sage abilities or that I have enough motivation to do it. Here is why:
I've had way too many issues with integrate() to even start thinking about a multidimensional symbolic integrator. Here is a laundry list of problems I've faced in case anybody cares to take up the problem:
 If there are parameters in the integral then there is no easy way to pass answers to maxima, when it starts asking "Is <expression> positive, negative or zero?", when <expression> is some mess (which usually implies that using assume() to fix it won't work), which for multidimensional integrals is what you usually get in my experience. If there are no parameters, then all you care about is getting some number  so, there is not much point in the analytical answer itself (in my opinion of course).
 In many cases I tried (after transforming to the unit cube), maxima gives back an answer only once I avoid the boundaries of the interval and use the principal value of the integral. But to avoid them I need to introduce an epsilon,i.e. a parameter  which brings me back to point 1 above.
 sympy.integrate is actually in many cases much nicer (in my opinion), giving back piecewise answers, having a somewhat better handle on special functions such as Ci(x), etc. But sage doesn't support the piecewise answers; and SR(sympy.Ci(x)) gives an error, even though sage has Ci(x). E.g. try this: integrate(cos(x)/x,x,algorithm='sympy')
 If the analytical integral is not doable, then the question remains what should the symbolic result be. Just the first function
multidim_analytical_integration_unit_cube
returns? Or all of its results as a list? And handling anything but constant intervals (as over the unit cube in the current implementation) becomes somewhat of a mess, unless you transform your intervals to the unit cube. But in that latter case, simple integrands with simple (but nonnumeric) integration intervals very often become horrible messes when transformed to the unit cube, which even Mathematica can't handle.
So, really, the current multidim_analytical_integration_unit_cube
is the best I can hope for for now. And as a comparison, Mathematica also has such a SymbolicProcessing? option for its NIntegrate, which (at least at the user level) has nothing to do with doing multidimensional integrals with Integrate.
So, maybe a compromise would be simply to hide multidim_analytical_integration_unit_cube
with an underscore and not advertise it at all to the enduser? Let me know what you think.
comment:9 Changed 8 years ago by
I don't have a good answer to symbolic integration, as you say symbolic multidimensional integration is usually a mess unless there is some symmetry to exploit.
As for 4), you can have unevaluated symbolic expressions:
sage: x.add(x, hold=True) x + x sage: sage.symbolic.integration.integral.indefinite_integral(x, x, hold=True) integrate(x, x)
So my suggestion is (and thats what I meant to say initially) to just focus on the numerical integration and interfacing with cuba here. Anything else can go to a separate ticket. We don't even have to think too much about a userinterface yet, which is what you have done anyways in the current version where you have to explicitly import from sage.libs.cuba
. The integral()
command can then later be enhanced to make it easy to use.
comment:10 Changed 8 years ago by
So, I guess the issue is immediate usability vs. code cleanliness. I myself clearly vote for the former  I'd still like to include the symbolic preprocessing option with this patch. The option is really useful I think. Without it many integrals that I happen to stumble upon in my work are completely undoable numerically (which was exactly the reason I introduced it in the first place). That option is not used by default, so if anybody uses the integrator, the symbolic part would never kick in unless they actually enable it on purpose, which would mean they were desperate (as I was ...).
I do realize the symbolic option is unrelated to cuba, and I have no problem if at some point that part of the code is moved to a separate lib, once an actual symbolic multidim integrator is put in place. It is well separated in the code, so transitioning to a separate lib should not be an issue in terms of code maintainance. So, even if you favor code cleanliness, I don't think that the presence of that option would break longterm development/support.
Now, of course I'm just a first time contributor, and only a monthlong user of sage, which will soon be a decade old. So, I'll understand if you still decide to kill the option.
comment:11 Changed 8 years ago by
Ok, it wasn't clear to me that the symbolic preprocessing is that important. It is still unrelated to cuba, so how about moving it to sage.libs.cuba.symbolic_preprocessing.py
or something like that? Otherwise it makes it just harder to debug and read the cuba.pyx
file. Similar for C/Cython source code generation.
It also would be nice to split up multidim_integration
some more. For example, the entire if (symbolic)
block would be much better as a separate function: Makes multidim_integration
shorter and easier to read if you don't have to juggle with temporary variables funcS
, newvars
. E.g.
if symbolic: try: return multidim_integration_symbolic(...) except SymbolicTimeout: pass
comment:12 Changed 8 years ago by
Thank you for all the useful comments, Volker!
Below are my modifications in response to your comments.
Best, Svetlin
fast_callable
should be used instead of the olderfast_float
.
 Done.
 The C source code generation should be factored out and doctested separately.
 moved to new file: cython_compilation.pyx and tested.
 Dynamically loaded modules should be wrapped in some object (maybe
CythonCallable
) and unloaded when finished (i.e. destroyed). Otherwise its leaking memory...
 There are no dynamically loaded modules in the code (at least as far as I understand the definition). I moved all of them to top level so that loading modules does not slow down the code.
 library code shoudn't use
sage_eval
.
 switched to eval
 Catchall
except:
statements are not allowed  could beKeyboardInterrupt
, for example.
 fixed. Removed most try statements, which I realized I only needed in the code development.
 Docstrings in Python should be imperative: "Return foo" instead of "Returns foo".
 fixed.
 Ok, it wasn't clear to me that the symbolic preprocessing is that
important. It is still unrelated to cuba, so how about moving it to
sage.libs.cuba.symbolic_preprocessing.py
or something like that? Otherwise it makes it just harder to debug and read thecuba.pyx
file.
 Thanks! Moved to new file: symbolic_preprocessing.pyx. Added underscore in front of _multidim_analytical_integration_unit_cube
.
 Similar for C/Cython source code generation.
 Moved to new file: cython_compilation.pyx
 It also would be nice to split up
multidim_integration
some more. For example, the entireif (symbolic)
block would be much better as a separate function: Makesmultidim_integration
shorter and easier to read if you don't have to juggle with temporary variablesfuncS
,newvars
. E.g.if symbolic: try: return multidim_integration_symbolic(...) except SymbolicTimeout: pass
 Cleaned that part of the code a lot. Temp vars were actually left over cruft, which I removed. Now that piece of the code just checks whether integrand is constant and returns; or not, and continues.
Also, made additional minor tweaks and fixed a bug I had introduced at some point for Python function integrands, and for which I was missing an important test.
comment:13 Changed 8 years ago by
 Updating patch against sage 5.13.
 Removing left over unnecessary imports.
comment:14 Changed 8 years ago by
 Milestone changed from sage6.1 to sage6.2
comment:15 Changed 8 years ago by
 Milestone changed from sage6.2 to sage6.3
comment:16 Changed 8 years ago by
 Branch set to u/rws/implement_multidimensional_numerical_integration
comment:17 Changed 8 years ago by
 Commit set to a9b78d7cf0c69cb67098a4882a42957e390906d3
 Status changed from needs_review to needs_work
I have made a branch so that people can work better with your code. It however depends on the existence of the cuba package, which isn't even optional at the moment. So this needs work in that both the package needs to be included in Sage (please open a separate ticket) and this code should fail gracefully if the package is not install. This is all detailed in the developer manual.
New commits:
a9b78d7  Trac 15492: add multidim numerical integration

comment:18 Changed 7 years ago by
 Milestone changed from sage6.3 to sage6.4
comment:19 Changed 5 years ago by
+1 for having MonteCarlo integration in Sage.
I recently needed this functionality, and resorted to using the Cubature package; Python bindings exist, and I found the interface quite nice. I guess the difficult part would be to rewrite cubature.py for Sage..
Wow, this is interesting  and under active development. Here are annoying things that have nothing to do with the correctness of your code that nonetheless should be brought up.
replaceDictionary
(and we don't usually usecamelCase
) looks painful to me.