Opened 3 years ago

Last modified 2 years ago

#21380 new task

cleaning CategoryObject/Parent

Reported by: vdelecroix Owned by:
Priority: major Milestone: sage-7.4
Component: categories Keywords:
Cc: tscrim Merged in:
Authors: Reviewers:
Report Upstream: N/A Work issues:
Branch: Commit:
Dependencies: Stopgaps:

Description (last modified by vdelecroix)

Abstract

This ticket stands to clean CategoryObject and Parent with respect to:

  • hacks and too specialized methods (e.g. base or _ngens_)
  • undocumented and old (e.g. sage.structure.generators)
  • duplicated stuff (e.g. _an_element_ and an_element)

Subtasks

Base and generators

We should move out of CategoryObject the attributes and methods related to generators and variable names.

Attributes:

  • _names

Methods:

  • gens_dict
  • gens_dict_recursive
  • objgens
  • objgen
  • _first_ngens
  • _defining_names
  • _assign_names
  • __temporarily_change_names
  • variable_names
  • variable_name
  • latex_variable_names
  • latex_name
  • inject_variables
  • base_ring

It is not clear where all this should be moved... Moreover "generators" is ambiguous since "monoid generators", "group generators", "algebra generators", ... are different things.

Keep

These were originally proposed to be removed too, but are sufficiently useful:

Attributes:

  • _base

Methods:

  • base

Functions:

  • normalize_names
  • certify_names

Tickets

Some related tickets are

  • gradual transition for to remove inheritance from ParentWithGens to Ring (#13683? still useful?)
  • #24430: normalization for _names, variable_name and variable_names in the context of polynomials and series

Transition to FiniteEnumeratedSet

  • set the category to FiniteEnumeratedSets where appropriate (#12957)
  • Move __len__ and __getitem__ from Parent to FiniteEnumeratedSets (#12955)
  • deprecate CombinatorialClass in favor of the EnumeratedSets category (#12913, #19986)

Duplicated names for same function

  • cardinality vs order (#18410)

Others

  • move Set_generic and Set_Python_type_class out of sage.structure.parent. Moreover we should use UniqueRepresentation instead of the custom class factory Set_PythonType.
  • Various fixes in category initialization (#21353)
  • cleanup variable factories (#18390)
  • cleanup cartesian products (task ticket #15425)

Closed tickets

  • refactor getattr_from_other_class (#20686)
  • Simplify _populate_generators_ (#21381)
  • Remove sage.structure.generators (#21382)
  • Remove ParentWith*AbelianGens (#21383)
  • Remove _populate_generators_ (#21385)
  • Always enable debug.bad_parent_warnings (#24109)

Change History (46)

comment:1 Changed 3 years ago by jdemeyer

  • Description modified (diff)
  • Summary changed from cleaning CategoryObject/Element/Parent to cleaning CategoryObject/Parent

Are you sure this would affect Element much? All the things you write are about CategoryObject or Parent.

comment:2 Changed 3 years ago by jdemeyer

  • Description modified (diff)

comment:3 Changed 3 years ago by jdemeyer

  • Description modified (diff)

comment:4 Changed 3 years ago by tscrim

  • Cc tscrim added

comment:5 Changed 3 years ago by vdelecroix

  • Description modified (diff)

comment:6 Changed 3 years ago by jdemeyer

  • Description modified (diff)

comment:7 Changed 3 years ago by jdemeyer

  • Description modified (diff)

comment:8 Changed 3 years ago by jdemeyer

  • Description modified (diff)

comment:9 Changed 3 years ago by jdemeyer

  • Description modified (diff)

comment:10 Changed 3 years ago by jdemeyer

  • Description modified (diff)

comment:11 Changed 3 years ago by jdemeyer

  • Description modified (diff)

comment:12 Changed 3 years ago by jdemeyer

  • Description modified (diff)

comment:13 Changed 3 years ago by jdemeyer

  • Description modified (diff)

comment:14 Changed 3 years ago by jdemeyer

  • Description modified (diff)

comment:15 Changed 3 years ago by jdemeyer

  • Description modified (diff)

comment:16 Changed 3 years ago by jdemeyer

  • Description modified (diff)

comment:17 Changed 3 years ago by jdemeyer

  • Description modified (diff)

comment:18 in reply to: ↑ description Changed 2 years ago by jdemeyer

Replying to vdelecroix:

We should move out of CategoryObject the attributes and methods relative to "base"

After thinking about this now and then, I think that it does make sense to keep _base inside CategoryObject. There are a lot of objects which naturally have a base object. For efficiency and simplicity, I suggest to keep CategoryObject._base.

comment:19 Changed 2 years ago by jdemeyer

  • Description modified (diff)

comment:20 Changed 2 years ago by jdemeyer

  • Description modified (diff)

comment:21 follow-up: Changed 2 years ago by jdemeyer

  • Description modified (diff)

What is wrong with the functions normalize_names() and certify_names? They have to be put somewhere, why not in structure/category_object.pyx?

comment:22 follow-up: Changed 2 years ago by vdelecroix

Replying to jdemeyer:

Replying to vdelecroix:

We should move out of CategoryObject the attributes and methods relative to "base"

After thinking about this now and then, I think that it does make sense to keep _base inside CategoryObject. There are a lot of objects which naturally have a base object. For efficiency and simplicity, I suggest to keep CategoryObject._base.

Some does, some does not and there is no specification of what a base is. I would be happy to hear a definition.

comment:23 in reply to: ↑ 21 Changed 2 years ago by vdelecroix

Replying to jdemeyer:

What is wrong with the functions normalize_names() and certify_names? They have to be put somewhere, why not in structure/category_object.pyx?

Right (I made the listing rather quickly)

comment:24 in reply to: ↑ 22 ; follow-up: Changed 2 years ago by jdemeyer

Replying to vdelecroix:

Some does, some does not and there is no specification of what a base is. I would be happy to hear a definition.

Well, we cannot define "base" in full generality. Some objects do not have a natural base and then this attribute should simply be None.

But in certain important categories, we can define "base". For example, for modules, we define "base" to mean the parent of the scalars used for scalar multiplication.

comment:25 in reply to: ↑ 24 ; follow-ups: Changed 2 years ago by vdelecroix

Replying to jdemeyer:

Replying to vdelecroix:

Some does, some does not and there is no specification of what a base is. I would be happy to hear a definition.

Well, we cannot define "base" in full generality. Some objects do not have a natural base and then this attribute should simply be None.

But in certain important categories, we can define "base". For example, for modules, we define "base" to mean the parent of the scalars used for scalar multiplication.

This is precisely the reason why I dislike it. It is a fake global concept that is mostly meant for "the underlying ring of a module". And because modules are important all parents have an attribute _base as well as methods base and base_ring. Think for a minute about the list of parents without base: permutations, graphs, groups, rings, schemes, simplicial complexes, ...

Moreover, I (maybe wrongly) thought that this "base ring of a module" is the kind of information that belongs to the category

sage: Modules(ZZ)
Category of modules over Integer Ring
sage: Modules(ZZ).base()
Integer Ring

comment:26 in reply to: ↑ 25 ; follow-up: Changed 2 years ago by jdemeyer

Replying to vdelecroix:

Think for a minute about the list of parents without base: permutations, graphs, groups, rings, schemes, simplicial complexes, ...

Some rings and schemes do have a concept of "base"...

comment:27 in reply to: ↑ 26 Changed 2 years ago by vdelecroix

Replying to jdemeyer:

Replying to vdelecroix:

Think for a minute about the list of parents without base: permutations, graphs, groups, rings, schemes, simplicial complexes, ...

Some rings and schemes do have a concept of "base"...

Some graphs also (Cayley graphs) as well as groups (GL(2,Z)) etc

comment:28 in reply to: ↑ 25 Changed 2 years ago by jdemeyer

Replying to vdelecroix:

It is a fake global concept that is mostly meant for "the underlying ring of a module".

Note that there are a lot of parents in Sage which represent a module of some sort. Most of them do not inherit from Module. So even if we implement _base in the Module class, that won't help unless we change a large number of parents to inherit from that.

comment:29 follow-up: Changed 2 years ago by tscrim

Ideally base would be something that belongs to the appropriate categories. The reason I am not currently proposing this is that I'm very worried about the speed regression of making this a pure Python function call.

I am starting to wonder if we do want to actually have (new-style) Parent, ParentWithBase, ParentWithGens, and possibly ParentWithBaseGens classes that give these various features as common ABCs. While this would probably be an invasive change, it should not be too painful (and should be easy enough to deprecate). For those very few (if any?) non-Parent-but-CategoryObject-that-needs-base (or gens) classes, we can see how many there are and if we just want them to have a common API or a common ABC of CategoryObjectWithGens.

comment:30 follow-ups: Changed 2 years ago by vdelecroix

I don't really care if we have to do a big change if it makes things better. Though we have to be sure that it will end with something better!

Is speed really a serious issue for accessing base or gens? This is indeed the whole problem of categories:

  • the category mantra says that a parent should just inherit from Parent (or possibly CategoryObject)
  • if we do that, the methods inherited from ParentMethods of the category are slow so we want custom base classes

comment:31 in reply to: ↑ 29 ; follow-up: Changed 2 years ago by jdemeyer

Replying to tscrim:

I am starting to wonder if we do want to actually have (new-style) Parent, ParentWithBase, ParentWithGens, and possibly ParentWithBaseGens classes that give these various features as common ABCs.

What do you mean exactly? Do you want to remove Parent? Do you want to remove Module? I am not following...

comment:32 in reply to: ↑ 30 ; follow-up: Changed 2 years ago by jdemeyer

Replying to vdelecroix:

Is speed really a serious issue for accessing base?

One thing that I have been wanting to do since a long time is improving the semantics of _lmul_ and _rmul_ (which are meant for scalar multiplication). Most of the arithmetic/coercion model has been cleaned up by now, but those two methods are an ugly remaining corner. I would like to define _lmul_ and _rmul_ as multiplication by an element of parent._base. Ideally, that would have fast access to the _base attribute.

comment:33 in reply to: ↑ 32 ; follow-up: Changed 2 years ago by vdelecroix

Replying to jdemeyer:

Replying to vdelecroix:

Is speed really a serious issue for accessing base?

One thing that I have been wanting to do since a long time is improving the semantics of _lmul_ and _rmul_ (which are meant for scalar multiplication). Most of the arithmetic/coercion model has been cleaned up by now, but those two methods are an ugly remaining corner. I would like to define _lmul_ and _rmul_ as multiplication by an element of parent._base. Ideally, that would have fast access to the _base attribute.

Once more, _base means "base ring on a module"! Would be better to use the name _base_ring and mention that it will be used for scalar multiplication. That would be a clear semantic.

comment:34 in reply to: ↑ 33 ; follow-up: Changed 2 years ago by jdemeyer

Replying to vdelecroix:

Once more, _base means "base ring on a module"!

For this purpose, it certainly does.

Would be better to use the name _base_ring

Not sure because I wouldn't to require that it's a ring. I am sure that there are structures which admit some kind of scalar multiplication by a non-ring.

and mention that it will be used for scalar multiplication. That would be a clear semantic.

Sure! But then I would still like to keep _base on CategoryObject. The methods base() and base_ring() are a different story. Those might be implemented in the category.

comment:35 in reply to: ↑ 34 ; follow-ups: Changed 2 years ago by vdelecroix

Replying to jdemeyer:

Replying to vdelecroix:

Once more, _base means "base ring on a module"!

For this purpose, it certainly does.

Would be better to use the name _base_ring

Not sure because I wouldn't to require that it's a ring. I am sure that there are structures which admit some kind of scalar multiplication by a non-ring.

Fair enough. Examples? Let us keep _base as it used to be. But then specify that it is used precisely for _lmul_ and _rmul_ where the attribute declaration is.

and mention that it will be used for scalar multiplication. That would be a clear semantic.

Sure! But then I would still like to keep _base on CategoryObject. The methods base() and base_ring() are a different story. Those might be implemented in the category.

Why CategoryObject and not Parent?

Moving base() and base_ring() to the category would not be possible if the attribute _base is private. How do you plan to declare it?

Last edited 2 years ago by vdelecroix (previous) (diff)

comment:36 in reply to: ↑ 31 ; follow-up: Changed 2 years ago by tscrim

Replying to jdemeyer:

Replying to tscrim:

I am starting to wonder if we do want to actually have (new-style) Parent, ParentWithBase, ParentWithGens, and possibly ParentWithBaseGens classes that give these various features as common ABCs.

What do you mean exactly? Do you want to remove Parent? Do you want to remove Module? I am not following...

I was thinking of a hierarchy like this:

+-Parent
|
+---ParentWithBase  (maybe useless right now)
|
+-+-ParentWithGens
  |
  +-+-ParentWithBaseAndGens
    |
    +-Module (etc).

with Parent being a subclass of CategoryObject, and we implement base(ring) and gens in the appropriate subclass. (I said new-style because there still is the old-style ParentWithGens floating around.) It is a bit more of a maintenance burden in terms of the class hierarchy, but it makes the code have better associations (as well as cleaning our user visible attributes).

comment:37 in reply to: ↑ 35 Changed 2 years ago by jdemeyer

Replying to vdelecroix:

Replying to jdemeyer:

Replying to vdelecroix:

Once more, _base means "base ring on a module"!

For this purpose, it certainly does.

Would be better to use the name _base_ring

Not sure because I wouldn't to require that it's a ring. I am sure that there are structures which admit some kind of scalar multiplication by a non-ring.

Fair enough. Examples?

I don't know any example. It's just that there is no reason to force this to be a ring. Because then we have to worry about checking that _base_ring really is a ring. And if you don't check, then why call it _base_ring and not _base?

comment:38 in reply to: ↑ 30 Changed 2 years ago by tscrim

Replying to vdelecroix:

Is speed really a serious issue for accessing base or gens?

Getting the base ring and the generators are very common operations, so a slowdown in that access could potentially have much larger speed implications.

This is indeed the whole problem of categories:

  • the category mantra says that a parent should just inherit from Parent (or possibly CategoryObject)
  • if we do that, the methods inherited from ParentMethods of the category are slow so we want custom base classes

There usually should not be too many simple getter methods in a category that are used in tight loops that this is a problem, but there are a few that do warrant the special attention.

comment:39 in reply to: ↑ 35 Changed 2 years ago by jdemeyer

Replying to vdelecroix:

Moving base() and base_ring() to the category would not be possible if the attribute _base is private. How do you plan to declare it?

You could make _base accessible to Python code (cdef public _base instead of cdef _base).

comment:40 in reply to: ↑ 36 ; follow-up: Changed 2 years ago by jdemeyer

Replying to tscrim:

I was thinking of a hierarchy like this:

+-Parent
|
+---ParentWithBase  (maybe useless right now)
|
+-+-ParentWithGens
  |
  +-+-ParentWithBaseAndGens
    |
    +-Module (etc).

I don't think that it makes much sense to have unrelated classes ParentWithBase and ParentWithBaseAndGens. Since base seems more important than gens, I would actually suggest

CategoryObject
|
+--Parent
   |
   +--ParentWithBase
      |
      +--Module
      |
      +--Ring
      |
     ...

I said new-style because there still is the old-style ParentWithGens floating around.

Thanks for the reminder. Getting rid of those old-style parents should probably be higher priority than this ticket.

comment:41 in reply to: ↑ 40 ; follow-ups: Changed 2 years ago by tscrim

Replying to jdemeyer:

Replying to tscrim:

I was thinking of a hierarchy like this:

+-Parent
|
+---ParentWithBase  (maybe useless right now)
|
+-+-ParentWithGens
  |
  +-+-ParentWithBaseAndGens
    |
    +-Module (etc).

I don't think that it makes much sense to have unrelated classes ParentWithBase and ParentWithBaseAndGens.

Ideally it should be a diamond IMO, but Cython does not (yet?) support multiple inheritance (or something like interfaces in Java-speak).

Since base seems more important than gens, I would actually suggest

CategoryObject
|
+--Parent
   |
   +--ParentWithBase
      |
      +--Module
      |
      +--Ring
      |
     ...

We have objects with base but without gens (e.g., polytopes), objects without base but with gens (e.g., permutation groups), as well as with(out) both. So I was trying to address that with my proposed hierarchy. At least I would like ParentWithGens in between Parent and ParentWithBase as there are very few things in Sage that have gens but no concept of base. (Which, as it turns out, is not quite the old parent hierarchy ParentWithGens -> ParentWithBase -> Parent_old.)

Although I'm a little worried with the group class hierarchy becoming a bit more of a mess because of the Cython parents and not having the multiple inheritance. Might be a necessary complication in order to improve the interface (which IMO is net benefit). Spreading out to a more refined parent hierarchy is going to be a fairly invasive change, so I guess it will just have to be done.

I said new-style because there still is the old-style ParentWithGens floating around.

Thanks for the reminder. Getting rid of those old-style parents should probably be higher priority than this ticket.

Yes, definitely. Although most of them are really fundamental parents such as Ring, number fields, etc. I guess we can take care of the leaf classes a little more easily, such as ComplexIntervalField. Although it is somewhat counter to the objective of having a more refined parent hierarchy as it usually means making the class inherit from Parent...

comment:42 in reply to: ↑ 41 ; follow-up: Changed 2 years ago by jdemeyer

Replying to tscrim:

Cython does not (yet?) support multiple inheritance

First of all, the next release of Cython is going to support multiple inheritance in a limited way. The list of bases can be CyClass, PyClass1, ..., PyClassN, which is the most general thing which is allowed by Python and which doesn't hurt Cython's performance (because all Python classes come after all Cython classes in the MRO).

Second, the fact that diamonds are not allowed is because of Python's design. In plain Python, you cannot do class X(list, str) either. So don't blame Cython for something that they cannot fix.

Anyway, the conclusion is that ParentWithGens could be a Python class and then the diamond would work in Cython 0.28.

Last edited 2 years ago by jdemeyer (previous) (diff)

comment:43 in reply to: ↑ 41 ; follow-up: Changed 2 years ago by jdemeyer

Replying to tscrim:

Yes, definitely. Although most of them are really fundamental parents such as Ring

The large majority of rings use the new coercion model, so it's not as bad as you think.

comment:44 in reply to: ↑ 42 Changed 2 years ago by tscrim

Replying to jdemeyer:

Replying to tscrim:

Cython does not (yet?) support multiple inheritance

First of all, the next release of Cython is going to support multiple inheritance in a limited way. The list of bases can be CyClass, PyClass1, ..., PyClassN, which is the most general thing which is allowed by Python and which doesn't hurt Cython's performance (because all Python classes come after all Cython classes in the MRO).

That's good to hear and should be useful here.

Second, the fact that diamonds are not allowed is because of Python's design. In plain Python, you cannot do class X(list, str) either. So don't blame Cython for something that they cannot fix.

Sorry, I guess I was thinking a little too loosely with not distinguishing between classes and (builtin) types.

Anyway, the conclusion is that ParentWithGens could be a Python class and then the diamond would work in Cython 0.28.

I think this would be the most sensible solution since most of our Cython parent classes would inherit from the ParentWithGensAndBase (e.g., Ring, where we really treat everything as some R-algebra), but would keep the flexibility for other parents in a reasonable way.

comment:45 in reply to: ↑ 43 Changed 2 years ago by tscrim

Replying to jdemeyer:

Replying to tscrim:

Yes, definitely. Although most of them are really fundamental parents such as Ring

The large majority of rings use the new coercion model, so it's not as bad as you think.

That's good to hear. Now if only I could find some time to work on this directly...

comment:46 Changed 2 years ago by vdelecroix

  • Description modified (diff)
Note: See TracTickets for help on using tickets.