Opened 16 months ago

Last modified 3 months ago

#30139 new task

Metaticket - SageManifolds Code Improvement Discussion

Reported by: gh-mjungmath Owned by:
Priority: major Milestone: sage-9.5
Component: manifolds Keywords: sagemanifolds
Cc: egourgoulhon, tscrim, gh-tobiasdiez, chapoton, mkoeppe, nthiery Merged in:
Authors: Reviewers:
Report Upstream: N/A Work issues:
Branch: Commit:
Dependencies: Stopgaps:

Status badges

Description (last modified by gh-mjungmath)

The main purpose of this ticket is a discussion and coordination platform regarding general code improvements of the preexisting SageManifolds code apart from bugfixes and new features. That is, for instance, the removal of existing code duplication, and strategies to improve readability, debugging and compatibility.

I for myself do not have much experience in this field. However, I am eager to improve this nice piece of package as much as I can, so that anyone can benefit from it. In any case, this is not a one man task and I appreciate any support in this matter.

I am looking forward to the discussion! :)

Tasks:

  • #29234: Remove code redundancies between Section and TensorField

Change History (35)

comment:1 follow-up: Changed 16 months ago by gh-mjungmath

  • Cc egourgoulhon tscrim gh-tobiasdiez chapoton mkoeppe nthiery added

Allow me to start the discussion. In my opinion, the highest priority has the removal of code duplication. The new feature of vector bundles, for example, came with a cost: code duplication and redundancies. This is certainly unwanted (see ticket #29234). Take sections and tensor fields. They share a lot of common code snippets and methods. For example the method restrict which is more or less the same. Bugs for sections must now be fixed for tensor fields, and vice versa. But a simple inheritation comes with a problem: we have to drop most of the preexisting documentation. Imho, this would be sad, and I like to discuss how we deal with that.

Of course, this issue is not only restricted to vector bundles. As I modified tensor fields, I had to make changes in three different files simultaneously which had a very similar setup with similar code snippets: manifolds/differentiable/tensorfield.py, manifolds/differentiable/tensorfield_paral.py and tensor/modules/free_module_tensor.py. I had to copy the very same lines over and over again. It was a tedious task not only to modify the code but also to understand how it was built up. A simplification would be desireable, even though I have no clue how. If I look through the code, I am sure, I will find further examples.

Another thing I am bothering with is the invocation of private variables via ._private_variable. I think, these variables are private for a reason, and in my opinion, to keep the code as clean as possible, they should be invoced by methods and not directly. But I think, this has low priority because it has not such a tremendous impact. However, it would be nice, I guess.

comment:2 Changed 16 months ago by gh-mjungmath

  • Description modified (diff)

comment:3 Changed 16 months ago by gh-mjungmath

  • Component changed from geometry to manifolds

comment:4 in reply to: ↑ 1 ; follow-up: Changed 16 months ago by mkoeppe

Let me preface my comments: I have only very recently started to look into sage-manifolds and my understanding of this code is very shallow.

Replying to gh-mjungmath:

The new feature of vector bundles, for example, came with a cost: code duplication and redundancies. This is certainly unwanted (see ticket #29234). Take sections and tensor fields. They share a lot of common code snippets and methods. For example the method restrict which is more or less the same. Bugs for sections must now be fixed for tensor fields, and vice versa. But a simple inheritation comes with a problem: we have to drop most of the preexisting documentation. Imho, this would be sad, and I like to discuss how we deal with that.

If you are concerned about removing doctests (rather than documentation) as a byproduct of removing code duplications: There is a solution by moving many tests from doctests to _test... methods. These are inherited by subclasses and therefore test coverage can be improved.

I did this in the sage.numerical.backends modules in #20323 and follow-up tickets, and recently gh-kliem started to do the same in sage.geometry.polyhedra (see for example #29918).

In sage.numerical.backends.logging_backend I have some code that assists with transforming doctests to _test... methods, which could be generalized easily.

comment:5 in reply to: ↑ 4 ; follow-up: Changed 16 months ago by gh-mjungmath

Replying to mkoeppe:

If you are concerned about removing doctests (rather than documentation) as a byproduct of removing code duplications: There is a solution by moving many tests from doctests to _test... methods. These are inherited by subclasses and therefore test coverage can be improved.

Thank you. This is good to know and definitely an important point. My concern, however, is also the documentation itself. Tensor fields and sections are declared slighly differently. For example, sections are constructed by using an instance of the vector bundle. Tensor fields, on the other hand, are declared by using an instance of the manifold. Therefore, it would be good to have separate closed minimal examples and hence distinct documentations.

What about something like this:

class MyClass(ParentClass)
    r"""
    Blabla
    """

    my_method(self, a, b, c='c'):
        r"""
        Overwritten docstring
        """
        return ParentClass.my_method(self, a, b, c=c)

Or is that too nasty?

Last edited 16 months ago by gh-mjungmath (previous) (diff)

comment:6 Changed 16 months ago by mkoeppe

Nothing wrong with that, I would say. (Of course, one would use super() instead of referring to the parent class by name.) The performance impact should be minimal.

comment:7 follow-ups: Changed 16 months ago by nthiery

Just one caveat: it makes it more complicated to look at the code of the method by introspection.

I don't have a good solution though. The best approximation I can come it is to include both the examples/doc of vector bundles and tensor fields in the documentation of the method. It makes it a bit long for the reader. But points out the similarity, which is not bad per se.

Matthias: glad to see _test methods being adopted at a large scale, with additional tooling :-)

comment:8 Changed 16 months ago by mkoeppe

In a different direction: The (multi)linear algebra in sage.tensor that was developed for sage-manifolds has a much better design than the linear algebra provided by sage.matrix (see, for example, #30091).

It would be nice to be able to use it for computational (exact and inexact) (multi)linear algebra. In #30061, Eric already did some improvements that makes better use of sparsity.

In addition to such speedups, it would be good to be able to use different backends for storing tensors instead of the current dictionary backend.

On the one hand, for special cases such as vectors and matrices by existing special implementations; for the special case of fully symmetric tensors, using fast implementations of multivariate polynomials.

On the other hand, for general numerical tensors using mainstream accelerated frameworks like PyTorch (see #30096).

comment:9 in reply to: ↑ 7 ; follow-ups: Changed 16 months ago by gh-mjungmath

Replying to mkoeppe:

Nothing wrong with that, I would say. (Of course, one would use super() instead of referring to the parent class by name.)

I would have thought that referring to the parent class by name is more robust against changes and increases readability. Is there a particular reason to prefer super()?

The performance impact should be minimal.

That's good.

Replying to nthiery:

Just one caveat: it makes it more complicated to look at the code of the method by introspection.

What exactly do you mean by that?

I don't have a good solution though. The best approximation I can come it is to include both the examples/doc of vector bundles and tensor fields in the documentation of the method. It makes it a bit long for the reader. But points out the similarity, which is not bad per se.

I'd guess this would contradict the well-documented philosophy of Sage, wouldn't it?

comment:10 in reply to: ↑ 9 Changed 16 months ago by mkoeppe

Replying to gh-mjungmath:

Replying to mkoeppe:

(Of course, one would use super() instead of referring to the parent class by name.)

I would have thought that referring to the parent class by name is more robust against changes and increases readability. Is there a particular reason to prefer super()?

For classes with single inheritance, it is merely a matter of style. But as soon as multiple inheritance enters, using super() ensures that methods of all superclasses are called (and called in the right order)

comment:11 Changed 16 months ago by mkoeppe

  • Description modified (diff)

comment:12 follow-up: Changed 15 months ago by gh-bollu

As someone who's new to sage, but would like to get involved in SageManifolds?: how can I help?

comment:13 in reply to: ↑ 12 Changed 15 months ago by egourgoulhon

Replying to gh-bollu:

As someone who's new to sage, but would like to get involved in SageManifolds?: how can I help?

Welcome to the project! There are various ways to contribute, see https://sagemanifolds.obspm.fr/contrib.html.

If you take a look at the metaticket #18528, you'll see that various tickets are in stalled state (mostly the blue ones), are therefore are open for development. For instance, #18786 about complex structures. You could pick one of those or create a brand new ticket about a topic that you like.

comment:14 Changed 15 months ago by gh-mjungmath

  • Description modified (diff)

comment:15 in reply to: ↑ 5 Changed 15 months ago by egourgoulhon

Replying to gh-mjungmath:

Replying to mkoeppe:

What about something like this:

class MyClass(ParentClass)
    r"""
    Blabla
    """

    my_method(self, a, b, c='c'):
        r"""
        Overwritten docstring
        """
        return ParentClass.my_method(self, a, b, c=c)

Or is that too nasty?

This is code duplication and should be avoided IMHO, cf. comment:6 in #29234 for the case of sections vs. tensor fields.

comment:16 in reply to: ↑ 7 Changed 15 months ago by egourgoulhon

Replying to nthiery:

I don't have a good solution though. The best approximation I can come it is to include both the examples/doc of vector bundles and tensor fields in the documentation of the method. It makes it a bit long for the reader. But points out the similarity, which is not bad per se.

+1

comment:17 Changed 15 months ago by gh-mjungmath

Tobias had a very nice idea how to deal with some code duplication, in particular the restriction process: introduce sheafs (cf. comment:7 in #29234).

I like that idea and I think this is something we definitely should aim for. However, this task is not as easy as one might think on the first glimpse. My first attempt would be utilizing the morphism framework of Sage. Unfortunately, my experience with morphisms in Sage is very limited. What do you think? Any further ideas?

comment:18 in reply to: ↑ 9 Changed 15 months ago by egourgoulhon

Replying to gh-mjungmath:

Replying to nthiery:

Just one caveat: it makes it more complicated to look at the code of the method by introspection.

What exactly do you mean by that?

I guess Nicolas means that if the user would like to take a look at the source code via t.my_method??, he/she would get return ParentClass.my_method(self, a, b, c=c), i.e. useless information.

comment:19 Changed 14 months ago by mkoeppe

  • Milestone changed from sage-9.2 to sage-9.3

comment:20 follow-up: Changed 9 months ago by gh-mjungmath

I thought a bit about optimizations and the first thing that came to my mind: Cythonize code. Of course, one has to check with prun etc. whether it's worth it. However, Cythonizing comp.py and/or the algebraic part in the tensor module should be doable.

One thing that could definitely be Cythonized is the way of how names, i.e. strings, are concatenated, for example in format_utilities.py. I can imagine that this could improve speed a lot. Of course, this needs to be investigated with some performance checks before, too.

Eric, what do you think?

Last edited 9 months ago by gh-mjungmath (previous) (diff)

comment:21 in reply to: ↑ 20 Changed 9 months ago by egourgoulhon

Replying to gh-mjungmath:

I thought a bit about optimizations and the first thing that came to my mind: Cythonize code. Of course, one has to check with prun etc. whether it's worth it. However, Cythonizing comp.py and/or the algebraic part in the tensor module should be doable.

One thing that could definitely be Cythonized is the way of how names, i.e. strings, are concatenated, for example in format_utilities.py. I can imagine that this could improve speed a lot. Of course, this needs to be investigated with some performance checks before, too.

Eric, what do you think?

Cythonizing the algebraic part of tensor calculus will unfortunately not help much to improve the computational speed on manifolds. The reason is that the main bottleneck is symbolic calculus on the tensor field components and more specifically the simplification process, which is performed via Maxima (default backend, Lisp based) or by SymPy (optional backend, Python based). To really gain in performance, one shall rely on parallelization rather than cythonization. A real limitation in this respect is #27492, which forbids parallelization with symbolic functions (side note: certainly Sage's symbolic functions require some major rewritting, see #31047)

comment:22 follow-ups: Changed 9 months ago by gh-bollu

Perhaps I am missing something. SymPy? has a new C++ backend called SymEngine? https://github.com/symengine/symengine. Would that improve performance by some degree?

comment:23 in reply to: ↑ 22 Changed 9 months ago by egourgoulhon

Replying to gh-bollu:

Perhaps I am missing something. SymPy? has a new C++ backend called SymEngine? https://github.com/symengine/symengine. Would that improve performance by some degree?

Thanks for pointing this! This is certainly worth to investigate. Especially SymEngine's README says: Python wrappers allow easy usage from Python and integration with SymPy and Sage (the symengine.py repository).

comment:24 Changed 9 months ago by gh-bollu

If one were to attempt to integrate/use SymEngine? for the purpose of making the algebra faster, what should one do? I'd imagine

  1. Pull in SymEngine? into SAGE
  2. Replace the calls to Maxima to call to SymEngine?
  3. Benchmark?

comment:25 in reply to: ↑ 22 ; follow-up: Changed 9 months ago by gh-mjungmath

Replying to gh-bollu:

Perhaps I am missing something. SymPy? has a new C++ backend called SymEngine? https://github.com/symengine/symengine. Would that improve performance by some degree?

That looks extremely promising. It's definitely worth investigating. And who knows, perhaps it could even replace the entire symbolic backend in Sage.

Perhaps we should post this in the sage-devel group and initiate a discussion.

comment:26 in reply to: ↑ 25 Changed 9 months ago by egourgoulhon

Replying to gh-mjungmath:

Replying to gh-bollu:

Perhaps I am missing something. SymPy? has a new C++ backend called SymEngine? https://github.com/symengine/symengine. Would that improve performance by some degree?

That looks extremely promising. It's definitely worth investigating. And who knows, perhaps it could even replace the entire symbolic backend in Sage.

Indeed, this is all the more relevant, given that the main developer of Pynac (Sage's symbolic backend) has resigned: https://groups.google.com/g/sage-devel/c/JVLHRy2mxUI

Perhaps we should post this in the sage-devel group and initiate a discussion.

Yes this is certainly the right place to initiate the discussion.

comment:27 Changed 9 months ago by gh-bollu

I've created a thread on sage-devel about moving to SymEngine? https://groups.google.com/g/sage-devel/c/sY3zh-pq8T4

comment:28 Changed 7 months ago by mkoeppe

  • Milestone changed from sage-9.3 to sage-9.4

Sage development has entered the release candidate phase for 9.3. Setting a new milestone for this ticket based on a cursory review of ticket status, priority, and last modification date.

comment:29 Changed 7 months ago by gh-mjungmath

Indeed, it would be nice to (co)homology of manifolds. There are two approaches I see right now:

  • Morse homology
  • Čech cohomology

Čech cohomology sounds promising since it can (more or less) directly be obtained from charts and their transition maps (see e.g. https://webspace.science.uu.nl/~caval101/homepage/Differential_geometry_2014_files/notes%20on%20Cech%20cohomology.pdf).

As for Morse homology I'd like to refer to https://www.sciencedirect.com/science/article/pii/S0723086905000642. With the orientation implementation we already have, this could be doable, even though there are some immediate problems that I can think of:

  • How do we check whether a function is Morse?
  • How can we obtain all critial points? How do we determine their indices?

Is there perhaps a nice class of Morse functions we can easily deal with?

Opinions, comments, hints, references and ideas are welcome! :)

comment:30 follow-up: Changed 7 months ago by tscrim

I would probably start with Čech cohomology since it is more straightforward.

For finding the critical points in Morse homology, I think you can compute them on each chart individually since they are suppose to be isolated IIRC and then remove redundancies. From there, you can check the degeneracy condition.

comment:31 in reply to: ↑ 30 Changed 6 months ago by gh-mjungmath

Replying to tscrim:

I would probably start with Čech cohomology since it is more straightforward.

For that, we need "good covers" consisting of contractible sets, and all its intersections being contractible. Thus we need a way to check contractibility for open subsets. Is that worth a post in sage-devel?

comment:32 follow-up: Changed 6 months ago by gh-mjungmath

The manifolds code is growing steadily and more files are added. Do you think it is time that we consider refactoring the folder structure of the manifolds module?

Last edited 6 months ago by gh-mjungmath (previous) (diff)

comment:33 in reply to: ↑ 32 Changed 6 months ago by mkoeppe

Replying to gh-mjungmath:

The manifolds code is growing steadily and more files are added. Do you think it is time that we consider refactoring the folder structure of the manifolds module?

Some of the things added as part of #31740 (Meta-ticket: Families, posets, complexes of manifold subsets) could certainly be moved into a subpackage sage.manifolds.subsets:

  • from #31653: sage.manifolds.continuous_map_image -> sage.manifolds.subsets.continuous_map_image
  • likewise #31644, #31688

comment:34 Changed 3 months ago by gh-mjungmath

Opened #32274 for the structuring issue.

comment:35 Changed 3 months ago by mkoeppe

  • Milestone changed from sage-9.4 to sage-9.5
Note: See TracTickets for help on using tickets.