Opened 2 years ago
Last modified 3 months ago
#30139 new task
Metaticket - SageManifolds Code Improvement Discussion
Reported by: | Michael Jung | Owned by: | |
---|---|---|---|
Priority: | major | Milestone: | sage-9.8 |
Component: | manifolds | Keywords: | sagemanifolds |
Cc: | Eric Gourgoulhon, Travis Scrimshaw, Tobias Diez, Frédéric Chapoton, Matthias Köppe, Nicolas M. Thiéry | Merged in: | |
Authors: | Reviewers: | ||
Report Upstream: | N/A | Work issues: | |
Branch: | Commit: | ||
Dependencies: | Stopgaps: |
Description (last modified by )
The main purpose of this ticket is a discussion and coordination platform regarding general code improvements of the preexisting SageManifolds code apart from bugfixes and new features. That is, for instance, the removal of existing code duplication, and strategies to improve readability, debugging and compatibility.
I for myself do not have much experience in this field. However, I am eager to improve this nice piece of package as much as I can, so that anyone can benefit from it. In any case, this is not a one man task and I appreciate any support in this matter.
I am looking forward to the discussion! :)
Tasks:
- #29234: Remove code redundancies between
Section
andTensorField
Change History (38)
comment:1 follow-up: 4 Changed 2 years ago by
Cc: | Eric Gourgoulhon Travis Scrimshaw Tobias Diez Frédéric Chapoton Matthias Köppe Nicolas M. Thiéry added |
---|
comment:2 Changed 2 years ago by
Description: | modified (diff) |
---|
comment:3 Changed 2 years ago by
Component: | geometry → manifolds |
---|
comment:4 follow-up: 5 Changed 2 years ago by
Let me preface my comments: I have only very recently started to look into sage-manifolds and my understanding of this code is very shallow.
Replying to gh-mjungmath:
The new feature of vector bundles, for example, came with a cost: code duplication and redundancies. This is certainly unwanted (see ticket #29234). Take sections and tensor fields. They share a lot of common code snippets and methods. For example the method
restrict
which is more or less the same. Bugs for sections must now be fixed for tensor fields, and vice versa. But a simple inheritation comes with a problem: we have to drop most of the preexisting documentation. Imho, this would be sad, and I like to discuss how we deal with that.
If you are concerned about removing doctests (rather than documentation) as a byproduct of removing code duplications: There is a solution by moving many tests from doctests to _test...
methods. These are inherited by subclasses and therefore test coverage can be improved.
I did this in the sage.numerical.backends
modules in #20323 and follow-up tickets, and recently gh-kliem
started to do the same in sage.geometry.polyhedra
(see for example #29918).
In sage.numerical.backends.logging_backend
I have some code that assists with transforming doctests to _test...
methods, which could be generalized easily.
comment:5 follow-up: 15 Changed 2 years ago by
Replying to mkoeppe:
If you are concerned about removing doctests (rather than documentation) as a byproduct of removing code duplications: There is a solution by moving many tests from doctests to
_test...
methods. These are inherited by subclasses and therefore test coverage can be improved.
Thank you. This is good to know and definitely an important point. My concern, however, is also the documentation itself. Tensor fields and sections are declared slighly differently. For example, sections are constructed by using an instance of the vector bundle. Tensor fields, on the other hand, are declared by using an instance of the manifold. Therefore, it would be good to have separate closed minimal examples and hence distinct documentations.
What about something like this:
class MyClass(ParentClass) r""" Blabla """ my_method(self, a, b, c='c'): r""" Overwritten docstring """ return ParentClass.my_method(self, a, b, c=c)
Or is that too nasty?
comment:6 Changed 2 years ago by
Nothing wrong with that, I would say. (Of course, one would use super()
instead of referring to the parent class by name.)
The performance impact should be minimal.
comment:7 follow-ups: 9 16 Changed 2 years ago by
Just one caveat: it makes it more complicated to look at the code of the method by introspection.
I don't have a good solution though. The best approximation I can come it is to include both the examples/doc of vector bundles and tensor fields in the documentation of the method. It makes it a bit long for the reader. But points out the similarity, which is not bad per se.
Matthias: glad to see _test methods being adopted at a large scale, with additional tooling :-)
comment:8 Changed 2 years ago by
In a different direction: The (multi)linear algebra in sage.tensor
that was developed for sage-manifolds has a much better design than the linear algebra provided by sage.matrix
(see, for example, #30091).
It would be nice to be able to use it for computational (exact and inexact) (multi)linear algebra. In #30061, Eric already did some improvements that makes better use of sparsity.
In addition to such speedups, it would be good to be able to use different backends for storing tensors instead of the current dictionary backend.
On the one hand, for special cases such as vectors and matrices by existing special implementations; for the special case of fully symmetric tensors, using fast implementations of multivariate polynomials.
On the other hand, for general numerical tensors using mainstream accelerated frameworks like PyTorch (see #30096).
comment:9 follow-ups: 10 18 Changed 2 years ago by
Replying to mkoeppe:
Nothing wrong with that, I would say. (Of course, one would use
super()
instead of referring to the parent class by name.)
I would have thought that referring to the parent class by name is more robust against changes and increases readability. Is there a particular reason to prefer super()
?
The performance impact should be minimal.
That's good.
Replying to nthiery:
Just one caveat: it makes it more complicated to look at the code of the method by introspection.
What exactly do you mean by that?
I don't have a good solution though. The best approximation I can come it is to include both the examples/doc of vector bundles and tensor fields in the documentation of the method. It makes it a bit long for the reader. But points out the similarity, which is not bad per se.
I'd guess this would contradict the well-documented philosophy of Sage, wouldn't it?
comment:10 Changed 2 years ago by
Replying to gh-mjungmath:
Replying to mkoeppe:
(Of course, one would use
super()
instead of referring to the parent class by name.)I would have thought that referring to the parent class by name is more robust against changes and increases readability. Is there a particular reason to prefer
super()
?
For classes with single inheritance, it is merely a matter of style. But as soon as multiple inheritance enters, using super()
ensures that methods of all superclasses are called (and called in the right order)
comment:11 Changed 2 years ago by
Description: | modified (diff) |
---|
comment:12 follow-up: 13 Changed 2 years ago by
As someone who's new to sage, but would like to get involved in SageManifolds?: how can I help?
comment:13 Changed 2 years ago by
Replying to gh-bollu:
As someone who's new to sage, but would like to get involved in SageManifolds?: how can I help?
Welcome to the project! There are various ways to contribute, see https://sagemanifolds.obspm.fr/contrib.html.
If you take a look at the metaticket #18528, you'll see that various tickets are in stalled state (mostly the blue ones), are therefore are open for development. For instance, #18786 about complex structures. You could pick one of those or create a brand new ticket about a topic that you like.
comment:14 Changed 2 years ago by
Description: | modified (diff) |
---|
comment:15 Changed 2 years ago by
Replying to gh-mjungmath:
Replying to mkoeppe:
What about something like this:
class MyClass(ParentClass) r""" Blabla """ my_method(self, a, b, c='c'): r""" Overwritten docstring """ return ParentClass.my_method(self, a, b, c=c)Or is that too nasty?
This is code duplication and should be avoided IMHO, cf. comment:6 in #29234 for the case of sections vs. tensor fields.
comment:16 Changed 2 years ago by
Replying to nthiery:
I don't have a good solution though. The best approximation I can come it is to include both the examples/doc of vector bundles and tensor fields in the documentation of the method. It makes it a bit long for the reader. But points out the similarity, which is not bad per se.
+1
comment:17 Changed 2 years ago by
Tobias had a very nice idea how to deal with some code duplication, in particular the restriction process: introduce sheafs (cf. comment:7 in #29234).
I like that idea and I think this is something we definitely should aim for. However, this task is not as easy as one might think on the first glimpse. My first attempt would be utilizing the morphism framework of Sage. Unfortunately, my experience with morphisms in Sage is very limited. What do you think? Any further ideas?
comment:18 Changed 2 years ago by
Replying to gh-mjungmath:
Replying to nthiery:
Just one caveat: it makes it more complicated to look at the code of the method by introspection.
What exactly do you mean by that?
I guess Nicolas means that if the user would like to take a look at the source code via t.my_method??
, he/she would get return ParentClass.my_method(self, a, b, c=c)
, i.e.
useless information.
comment:19 Changed 2 years ago by
Milestone: | sage-9.2 → sage-9.3 |
---|
comment:20 follow-up: 21 Changed 23 months ago by
I thought a bit about optimizations and the first thing that came to my mind: Cythonize code. Of course, one has to check with prun etc. whether it's worth it. However, Cythonizing comp.py
and/or the algebraic part in the tensor module should be doable.
One thing that could definitely be Cythonized is the way of how names, i.e. strings, are concatenated, for example in format_utilities.py
. I can imagine that this could improve speed a lot. Of course, this needs to be investigated with some performance checks before, too.
Eric, what do you think?
comment:21 Changed 23 months ago by
Replying to gh-mjungmath:
I thought a bit about optimizations and the first thing that came to my mind: Cythonize code. Of course, one has to check with prun etc. whether it's worth it. However, Cythonizing
comp.py
and/or the algebraic part in the tensor module should be doable.One thing that could definitely be Cythonized is the way of how names, i.e. strings, are concatenated, for example in
format_utilities.py
. I can imagine that this could improve speed a lot. Of course, this needs to be investigated with some performance checks before, too.Eric, what do you think?
Cythonizing the algebraic part of tensor calculus will unfortunately not help much to improve the computational speed on manifolds. The reason is that the main bottleneck is symbolic calculus on the tensor field components and more specifically the simplification process, which is performed via Maxima (default backend, Lisp based) or by SymPy (optional backend, Python based). To really gain in performance, one shall rely on parallelization rather than cythonization. A real limitation in this respect is #27492, which forbids parallelization with symbolic functions (side note: certainly Sage's symbolic functions require some major rewritting, see #31047)
comment:22 follow-ups: 23 25 Changed 23 months ago by
Perhaps I am missing something. SymPy? has a new C++ backend called SymEngine? https://github.com/symengine/symengine. Would that improve performance by some degree?
comment:23 Changed 23 months ago by
Replying to gh-bollu:
Perhaps I am missing something. SymPy? has a new C++ backend called SymEngine? https://github.com/symengine/symengine. Would that improve performance by some degree?
Thanks for pointing this! This is certainly worth to investigate. Especially SymEngine's README says: Python wrappers allow easy usage from Python and integration with SymPy and Sage (the symengine.py repository).
comment:24 Changed 23 months ago by
If one were to attempt to integrate/use SymEngine? for the purpose of making the algebra faster, what should one do? I'd imagine
- Pull in SymEngine? into SAGE
- Replace the calls to Maxima to call to SymEngine?
- Benchmark?
comment:25 follow-up: 26 Changed 23 months ago by
Replying to gh-bollu:
Perhaps I am missing something. SymPy? has a new C++ backend called SymEngine? https://github.com/symengine/symengine. Would that improve performance by some degree?
That looks extremely promising. It's definitely worth investigating. And who knows, perhaps it could even replace the entire symbolic backend in Sage.
Perhaps we should post this in the sage-devel group and initiate a discussion.
comment:26 Changed 23 months ago by
Replying to gh-mjungmath:
Replying to gh-bollu:
Perhaps I am missing something. SymPy? has a new C++ backend called SymEngine? https://github.com/symengine/symengine. Would that improve performance by some degree?
That looks extremely promising. It's definitely worth investigating. And who knows, perhaps it could even replace the entire symbolic backend in Sage.
Indeed, this is all the more relevant, given that the main developer of Pynac (Sage's symbolic backend) has resigned: https://groups.google.com/g/sage-devel/c/JVLHRy2mxUI
Perhaps we should post this in the sage-devel group and initiate a discussion.
Yes this is certainly the right place to initiate the discussion.
comment:27 Changed 23 months ago by
I've created a thread on sage-devel about moving to SymEngine? https://groups.google.com/g/sage-devel/c/sY3zh-pq8T4
comment:28 Changed 21 months ago by
Milestone: | sage-9.3 → sage-9.4 |
---|
Sage development has entered the release candidate phase for 9.3. Setting a new milestone for this ticket based on a cursory review of ticket status, priority, and last modification date.
comment:29 Changed 20 months ago by
Indeed, it would be nice to (co)homology of manifolds. There are two approaches I see right now:
- Morse homology
- Čech cohomology
Čech cohomology sounds promising since it can (more or less) directly be obtained from charts and their transition maps (see e.g. https://webspace.science.uu.nl/~caval101/homepage/Differential_geometry_2014_files/notes%20on%20Cech%20cohomology.pdf).
As for Morse homology I'd like to refer to https://www.sciencedirect.com/science/article/pii/S0723086905000642. With the orientation implementation we already have, this could be doable, even though there are some immediate problems that I can think of:
- How do we check whether a function is Morse?
- How can we obtain all critial points? How do we determine their indices?
Is there perhaps a nice class of Morse functions we can easily deal with?
Opinions, comments, hints, references and ideas are welcome! :)
comment:30 follow-up: 31 Changed 20 months ago by
I would probably start with Čech cohomology since it is more straightforward.
For finding the critical points in Morse homology, I think you can compute them on each chart individually since they are suppose to be isolated IIRC and then remove redundancies. From there, you can check the degeneracy condition.
comment:31 Changed 20 months ago by
Replying to tscrim:
I would probably start with Čech cohomology since it is more straightforward.
For that, we need "good covers" consisting of contractible sets, and all its intersections being contractible. Thus we need a way to check contractibility for open subsets. Is that worth a post in sage-devel?
comment:32 follow-up: 33 Changed 20 months ago by
The manifolds code is growing steadily and more files are added. Do you think it is time that we consider refactoring the folder structure of the manifolds module?
comment:33 Changed 20 months ago by
Replying to gh-mjungmath:
The manifolds code is growing steadily and more files are added. Do you think it is time that we consider refactoring the folder structure of the manifolds module?
Some of the things added as part of #31740 (Meta-ticket: Families, posets, complexes of manifold subsets) could certainly be moved into a subpackage sage.manifolds.subsets
:
comment:35 Changed 16 months ago by
Milestone: | sage-9.4 → sage-9.5 |
---|
comment:36 Changed 12 months ago by
Milestone: | sage-9.5 → sage-9.6 |
---|
comment:37 Changed 7 months ago by
Milestone: | sage-9.6 → sage-9.7 |
---|
comment:38 Changed 3 months ago by
Milestone: | sage-9.7 → sage-9.8 |
---|
Allow me to start the discussion. In my opinion, the highest priority has the removal of code duplication. The new feature of vector bundles, for example, came with a cost: code duplication and redundancies. This is certainly unwanted (see ticket #29234). Take sections and tensor fields. They share a lot of common code snippets and methods. For example the method
restrict
which is more or less the same. Bugs for sections must now be fixed for tensor fields, and vice versa. But a simple inheritation comes with a problem: we have to drop most of the preexisting documentation. Imho, this would be sad, and I like to discuss how we deal with that.Of course, this issue is not only restricted to vector bundles. As I modified tensor fields, I had to make changes in three different files simultaneously which had a very similar setup with similar code snippets:
manifolds/differentiable/tensorfield.py
,manifolds/differentiable/tensorfield_paral.py
andtensor/modules/free_module_tensor.py
. I had to copy the very same lines over and over again. It was a tedious task not only to modify the code but also to understand how it was built up. A simplification would be desireable, even though I have no clue how. If I look through the code, I am sure, I will find further examples.Another thing I am bothering with is the invocation of private variables via
._private_variable
. I think, these variables are private for a reason, and in my opinion, to keep the code as clean as possible, they should be invoced by methods and not directly. But I think, this has low priority because it has not such a tremendous impact. However, it would be nice, I guess.