Opened 2 years ago

Last modified 3 weeks ago

#30738 needs_review enhancement

Replace custom Sage unit test discovery by pytest

Reported by: gh-tobiasdiez Owned by:
Priority: major Milestone: sage-9.9
Component: refactoring Keywords: testsuite
Cc: mkoeppe, tscrim, nthiery, slelievre Merged in:
Authors: Tobias Diez Reviewers:
Report Upstream: N/A Work issues:
Branch: public/refactoring/pytest (Commits, GitHub, GitLab) Commit: b047abeae5c3f3bf755afa37bf65a0ed4bc61d97
Dependencies: Stopgaps:

Status badges

Description (last modified by mkoeppe)

Currently, Sage code contains _test_xyz methods that are invoked via doctests and discovered via a custom TestSuite class. This ticket is to replace this custom test discovery by the use of the established pytest framework https://docs.pytest.org/ (this is in line with #28936 - using mainstream Python test infrastructure).

Advantages of this approach:

  • Reuse standard test infrastructure instead of a custom test discovery interface (this is a big maintenance relief, especially given the large list of todos in the TestSuite code).
  • Better error messages in case a test fails.
  • Easier onboarding of new Python developers.
  • Clear separation of test methods vs production code. For example, the *_test.py files can easily be excluded from distribution.
  • Good support by IDEs (auto completion, test discovery and easier invocation). For example, VSCode automatically discovers all pytest tests, provides a convenient interface to run them (all, a selection, only failed ones) and allows to directly start a debug session for a failing test. See https://code.visualstudio.com/docs/python/testing
  • Less and clearer code (no hidden assumptions).

Disadvantages of this approach:

  • Current developers have to learn pytest and its conventions (files must be named *_test.py and test methods must start with test_).
  • Tests no longer live in the same file as the code (I would call this an advantage, but some people may prefer to have them really close together).

In order to migrate, the following changes are necessary:

  • Move _test_xyz from a category module to new file <module>_test.py (e.g. finite_semigroups_test.py).
  • Rewrite tests using pytest (which is straightforward and more or less only amounts to removing all the custom code involving self_tester).
  • Create a new test class for the parent class under test (e.g TestFiniteSemigroup).
  • Create a static method category_instances() in this test class returning a list of instances to test.

For the moment, I did this only for a few test methods in sets_cat using finite semigroups as example to illustrate the process.

The new implementation is as follows:

  • The test class for the category (e.g. in sets_cat_test.py) defines generic tests that should pass for every object in that category. Moreover, the test class may provides instances (objects) to test via the method category_instances.
  • Running pytest finds the test file finite_semigroups_test.py, then determines the categories of the objects (returned from category_instances) and adds all generic test methods for that category to the test class (this happens in conftest.py).
  • Each test method can have the parameter category_instance which is bound to the return value of the category_instances method.

Running cd src && pytest --verbose (from a virtual env with sage and pytest installed), the output is as follows:

sage/categories/finite_semigroups_test.py::TestFiniteSemigroup::test_enumerated_set_contains[An example of a finite semigroup: the left regular band generated by ('a', 'b')] PASSED                                                         [  2%]
sage/categories/finite_semigroups_test.py::TestFiniteSemigroup::test_enumerated_set_contains[An example of a finite semigroup: the left regular band generated by ('a', 'b', 'c')] PASSED                                                    [  4%]
sage/categories/finite_semigroups_test.py::TestFiniteSemigroup::test_enumerated_set_contains[An example of a finite semigroup: the left regular band generated by ('a', 'b', 'c', 'd')] PASSED                                               [  6%]
sage/categories/finite_semigroups_test.py::TestFiniteSemigroup::test_enumerated_set_iter_list[An example of a finite semigroup: the left regular band generated by ('a', 'b')] PASSED                                                        [  8%]
sage/categories/finite_semigroups_test.py::TestFiniteSemigroup::test_enumerated_set_iter_list[An example of a finite semigroup: the left regular band generated by ('a', 'b', 'c')] PASSED                                                   [ 10%]
sage/categories/finite_semigroups_test.py::TestFiniteSemigroup::test_enumerated_set_iter_list[An example of a finite semigroup: the left regular band generated by ('a', 'b', 'c', 'd')] PASSED                                              [ 13%]
sage/categories/finite_semigroups_test.py::TestFiniteSemigroup::test_enumerated_set_iter_cardinality[An example of a finite semigroup: the left regular band generated by ('a', 'b')] PASSED                                                 [ 15%]
sage/categories/finite_semigroups_test.py::TestFiniteSemigroup::test_enumerated_set_iter_cardinality[An example of a finite semigroup: the left regular band generated by ('a', 'b', 'c')] PASSED                                            [ 17%]
sage/categories/finite_semigroups_test.py::TestFiniteSemigroup::test_enumerated_set_iter_cardinality[An example of a finite semigroup: the left regular band generated by ('a', 'b', 'c', 'd')] PASSED                                       [ 19%]
sage/categories/finite_semigroups_test.py::TestFiniteSemigroup::test_associativity[An example of a finite semigroup: the left regular band generated by ('a', 'b')] PASSED                                                                   [ 21%]
sage/categories/finite_semigroups_test.py::TestFiniteSemigroup::test_associativity[An example of a finite semigroup: the left regular band generated by ('a', 'b', 'c')] PASSED                                                              [ 23%]
sage/categories/finite_semigroups_test.py::TestFiniteSemigroup::test_associativity[An example of a finite semigroup: the left regular band generated by ('a', 'b', 'c', 'd')] PASSED                                                         [ 26%]
sage/categories/finite_semigroups_test.py::TestFiniteSemigroup::test_an_element[An example of a finite semigroup: the left regular band generated by ('a', 'b')] PASSED                                                                      [ 28%]
sage/categories/finite_semigroups_test.py::TestFiniteSemigroup::test_an_element[An example of a finite semigroup: the left regular band generated by ('a', 'b', 'c')] PASSED                                                                 [ 30%]
sage/categories/finite_semigroups_test.py::TestFiniteSemigroup::test_an_element[An example of a finite semigroup: the left regular band generated by ('a', 'b', 'c', 'd')] PASSED                                                            [ 32%]
sage/categories/finite_semigroups_test.py::TestFiniteSemigroup::test_an_element_idempotent[An example of a finite semigroup: the left regular band generated by ('a', 'b')] PASSED                                                           [ 34%]
sage/categories/finite_semigroups_test.py::TestFiniteSemigroup::test_an_element_idempotent[An example of a finite semigroup: the left regular band generated by ('a', 'b', 'c')] PASSED                                                      [ 36%]
sage/categories/finite_semigroups_test.py::TestFiniteSemigroup::test_an_element_idempotent[An example of a finite semigroup: the left regular band generated by ('a', 'b', 'c', 'd')] PASSED                                                 [ 39%]
sage/categories/finite_semigroups_test.py::TestFiniteSemigroup::test_cardinality_return_type[An example of a finite semigroup: the left regular band generated by ('a', 'b')] PASSED                                                         [ 41%]
sage/categories/finite_semigroups_test.py::TestFiniteSemigroup::test_cardinality_return_type[An example of a finite semigroup: the left regular band generated by ('a', 'b', 'c')] PASSED                                                    [ 43%]
sage/categories/finite_semigroups_test.py::TestFiniteSemigroup::test_cardinality_return_type[An example of a finite semigroup: the left regular band generated by ('a', 'b', 'c', 'd')] PASSED                                               [ 45%]
sage/categories/finite_semigroups_test.py::TestFiniteSemigroup::test_construction[An example of a finite semigroup: the left regular band generated by ('a', 'b')] PASSED                                                                    [ 47%]
sage/categories/finite_semigroups_test.py::TestFiniteSemigroup::test_construction[An example of a finite semigroup: the left regular band generated by ('a', 'b', 'c')] PASSED                                                               [ 50%]
sage/categories/finite_semigroups_test.py::TestFiniteSemigroup::test_construction[An example of a finite semigroup: the left regular band generated by ('a', 'b', 'c', 'd')] PASSED                                                          [ 52%]
sage/categories/finite_semigroups_test.py::TestFiniteSemigroup::test_element_eq_reflexive[An example of a finite semigroup: the left regular band generated by ('a', 'b')] PASSED                                                            [ 54%]
sage/categories/finite_semigroups_test.py::TestFiniteSemigroup::test_element_eq_reflexive[An example of a finite semigroup: the left regular band generated by ('a', 'b', 'c')] PASSED                                                       [ 56%]
sage/categories/finite_semigroups_test.py::TestFiniteSemigroup::test_element_eq_reflexive[An example of a finite semigroup: the left regular band generated by ('a', 'b', 'c', 'd')] PASSED                                                  [ 58%]
sage/categories/finite_semigroups_test.py::TestFiniteSemigroup::test_elements_eq_symmetric[An example of a finite semigroup: the left regular band generated by ('a', 'b')] PASSED                                                           [ 60%]
sage/categories/finite_semigroups_test.py::TestFiniteSemigroup::test_elements_eq_symmetric[An example of a finite semigroup: the left regular band generated by ('a', 'b', 'c')] PASSED                                                      [ 63%]
sage/categories/finite_semigroups_test.py::TestFiniteSemigroup::test_elements_eq_symmetric[An example of a finite semigroup: the left regular band generated by ('a', 'b', 'c', 'd')] PASSED                                                 [ 65%]
sage/categories/finite_semigroups_test.py::TestFiniteSemigroup::test_elements_eq_transitive[An example of a finite semigroup: the left regular band generated by ('a', 'b')] PASSED                                                          [ 67%]
sage/categories/finite_semigroups_test.py::TestFiniteSemigroup::test_elements_eq_transitive[An example of a finite semigroup: the left regular band generated by ('a', 'b', 'c')] PASSED                                                     [ 69%]
sage/categories/finite_semigroups_test.py::TestFiniteSemigroup::test_elements_eq_transitive[An example of a finite semigroup: the left regular band generated by ('a', 'b', 'c', 'd')] PASSED                                                [ 71%]
sage/categories/finite_semigroups_test.py::TestFiniteSemigroup::test_elements_neq[An example of a finite semigroup: the left regular band generated by ('a', 'b')] PASSED                                                                    [ 73%]
sage/categories/finite_semigroups_test.py::TestFiniteSemigroup::test_elements_neq[An example of a finite semigroup: the left regular band generated by ('a', 'b', 'c')] PASSED                                                               [ 76%]
sage/categories/finite_semigroups_test.py::TestFiniteSemigroup::test_elements_neq[An example of a finite semigroup: the left regular band generated by ('a', 'b', 'c', 'd')] PASSED                                                          [ 78%]
...

As one can see, each general test method is invoked for the three examples.

TODO (in follow-up tickets):

  • Complete migration to pytest.
  • Once that is done, stop invoking TestSuite in doctests and remove the TestSuite class.

See also:

  • #33546 Replace custom Sage doctest discovery by pytest (using pytest_collect_file)

Change History (84)

comment:1 Changed 2 years ago by gh-tobiasdiez

Description: modified (diff)

comment:2 Changed 2 years ago by mkoeppe

Are you aware that the crucial point of our _test methods, in particular those provided in the category framework, is that objects of subclasses run these tests?

comment:3 Changed 2 years ago by gh-tobiasdiez

Yes, and this structure is also somewhat reflected in the new code as the tests for the subclasses inherit from the general base test class. If I understood the code correctly, then the main purpose of these subclasses (with respect to the tests) is to provide the examples (e.g. using the example() convention). This can be archived in a more transparent and cleaner way using parameterized test fixtures of pytest.

comment:4 in reply to:  3 Changed 2 years ago by mkoeppe

Replying to gh-tobiasdiez:

If I understood the code correctly, then the main purpose of these subclasses (with respect to the tests) is to provide the examples (e.g. using the example() convention).

No, the category classes provide mixins for dynamically generated classes for all Parent and Element objects. The example() are just additional documentation.

comment:5 Changed 2 years ago by gh-tobiasdiez

Sure, but that doesn't seem to be important for the tests itself. Take for example the tests in sets_cat. Almost all of them simply call tester.some_elements() and then run certain assertions against the returned elements. Thus, there is no semantical difference to having one general test method accepting an element as argument, and then testing the elements returned by some_elements() using this method. This is exactly what is proposed in this ticket.

One question though: is the purpose of these tests to check that one general implementation is correct for all these examples (e.g sets_cat provides a general equality implementation which is checked), or that the subclasses provide an implementation that adheres to general principles (e.g. finitely_generated_semigroups provides an implementation of equality that is checked). Just so that I know where to best place these tests (test_sets_cat vs test_finitely_generated_semigroups).

comment:6 in reply to:  5 Changed 2 years ago by mkoeppe

Replying to gh-tobiasdiez:

One question though: is the purpose of these tests to check that one general implementation is correct for all these examples (e.g sets_cat provides a general equality implementation which is checked), or that the subclasses provide an implementation that adheres to general principles (e.g. finitely_generated_semigroups provides an implementation of equality that is checked).

The latter.

comment:7 Changed 2 years ago by gh-tobiasdiez

Thanks, I thought so too. Then the current file structure should be right. Should I go ahead and convert the other test methods to pytest as well? (As part of this ticket, or open a new ticket for every method?)

comment:8 Changed 2 years ago by mkoeppe

I do not know enough about pytest. Can you expand the ticket description to explain how the test discovery works after the ticket, and how it ensures that still the same things get tested?

comment:9 Changed 2 years ago by mkoeppe

Milestone: sage-9.2sage-9.3

comment:10 Changed 2 years ago by gh-tobiasdiez

Description: modified (diff)
Status: newneeds_review

comment:11 Changed 2 years ago by gh-tobiasdiez

I've extended the ticket description. Does this clarify your questions?

The test test_element_eq_reflexive (and other similar tests to be added) are invoked for each element of test_elements. Thus, by adding all the instances currently covered by the unit tests we make sure that the test coverage stays the same.

comment:12 Changed 2 years ago by mkoeppe

Sorry, I still don't get it. The branch as it is on the ticket - do you claim that it still tests the same objects as before?

comment:13 Changed 2 years ago by gh-tobiasdiez

If you run pytest on this branch, it invokes test_elements_eq_reflexive with the arguments ['a', 'b', 'c', 'ab', 'ac', 'ba', 'bc', 'ca', 'cb', 'abc']. This is the same as the doctest for finite_semigroups:

Now, let us look at the structure of the semigroup::

        sage: S = FiniteSemigroups().example(alphabet = ('a','b','c'))
        sage: S.cayley_graph(side="left", simple=True).plot()
        Graphics object consisting of 60 graphics primitives
        sage: S.j_transversal_of_idempotents() # random (arbitrary choice)
        ['acb', 'ac', 'ab', 'bc', 'a', 'c', 'b']

    We conclude by running systematic tests on this semigroup::

        sage: TestSuite(S).run(verbose = True)
        running ._test_an_element() . . . pass
        running ._test_associativity() . . . pass
        running ._test_cardinality() . . . pass
        running ._test_category() . . . pass
        running ._test_construction() . . . pass
        running ._test_elements() . . .
          Running the test suite of self.an_element()
          running ._test_category() . . . pass
          running ._test_eq() . . . pass
          running ._test_new() . . . pass
          running ._test_not_implemented_methods() . . . pass
          running ._test_pickling() . . . pass
          pass
        running ._test_elements_eq_reflexive() . . . pass
        running ._test_elements_eq_symmetric() . . . pass
        running ._test_elements_eq_transitive() . . . pass
        running ._test_elements_neq() . . . pass
        running ._test_enumerated_set_contains() . . . pass
        running ._test_enumerated_set_iter_cardinality() . . . pass
        running ._test_enumerated_set_iter_list() . . . pass
        running ._test_eq() . . . pass
        running ._test_new() . . . pass
        running ._test_not_implemented_methods() . . . pass
        running ._test_pickling() . . . pass
        running ._test_some_elements() . . . pass

So the idea would be to replace this doctest by pytest. There are a few more doctests of this form that can be replaced in a similar vain by adding the corresponding elements to the test_elements list.

comment:14 in reply to:  13 Changed 2 years ago by mkoeppe

Replying to gh-tobiasdiez:

There are a few more doctests of this form that can be replaced in a similar vain by adding the corresponding elements to the test_elements list.

A few? Currently _test_elements_eq_reflexive is invoked by TestSuite(object).run() whenever object happens to be in the category of sets. This is a huge, dynamically determined list.

comment:15 Changed 2 years ago by gh-tobiasdiez

Sure, there are quite a lot of instances where a TestSuite is run from the doctest. Each one would need to be replaced by a test file similar to test_finite_semigroups.py. Pytest is not a magician and the information which examples to test still needs to be provided.

So the replacement dictionary roughly speaking looks like:

  • TestSuite -> pytest
  • TestSuite(S).run(verbose = True) -> @pytest.fixture(params=S)

The idea is not to change the way the tests cases are provided (although that may certainly be improved in the future) but to replace sage's only implementation of test discovery and invocation by what is probably the most establish Python testing framework.

comment:16 Changed 2 years ago by mkoeppe

So in your design, there would be a hierarchy of Tests classes parallel to the hierarchy of, say, parent classes in sage?

comment:17 Changed 2 years ago by gh-tobiasdiez

Yes! Every parent class would have a corresponding test class which specifies which general axioms this parent class wants to adhere to. This is done by deriving from one or more generic test classes (the one in the branch should probably be renamed to GenericSetTests).

comment:18 Changed 2 years ago by mkoeppe

Okay, but the parent classes are dynamic classes formed by mixing in axioms. So the test classes would also be determined dynamically?

comment:19 in reply to:  15 Changed 2 years ago by mkoeppe

Status: needs_reviewneeds_work

Replying to gh-tobiasdiez:

here are quite a lot of instances where a TestSuite is run from the doctest. Each one would need to be replaced by a test file similar to test_finite_semigroups.py. Pytest is not a magician and the information which examples to test still needs to be provided.

Thanks for this clarification. I'm setting this ticket to "needs_work" because the branch in this form is not complete -- it effectively removes existing tests for _test_elements_eq_reflexive. Without a much more complete branch that reaches test parity, this idea is difficult to evaluate.

Overall I think there is a conflict here between the dynamic nature of our category system and what seems to amount to a static determination of what objects to test for what properties (in the proposed changes).

Last edited 2 years ago by mkoeppe (previous) (diff)

comment:20 Changed 2 years ago by gh-tobiasdiez

I'm not that familiar yet with the category framework. When you say "dynamic classes formed by mixing in axioms" you mean something like Parent.__init__(self, category = Semigroups().Finite().FinitelyGenerated())? But, in general, yes the idea is to mixin the tests. I don't see any reason why this couldn't be done dynamically.

I agree, this branch has not yet feature parity with the master branch. Before I invest more time into it, I just wanted to confirm that this idea would be worth looking at in general.

comment:21 in reply to:  20 Changed 2 years ago by mkoeppe

Replying to gh-tobiasdiez:

When you say "dynamic classes formed by mixing in axioms" you mean something like Parent.__init__(self, category = Semigroups().Finite().FinitelyGenerated())?

Yes, things like this.

comment:22 Changed 2 years ago by git

Commit: 3b243d0e8c790de8987f81dcf877d7b07ead4a11e198d1cca15b1a4e262ff625d8725d44fb8b7703

Branch pushed to git repo; I updated commit sha1. New commits:

1df0a95Merge branch 'develop' of git://github.com/sagemath/sage into public/refactoring/pytest
b12608bReadd test
79abf0dRefactor
99ae00cReplace startup exception by warning
11882e5Use context manager
6a52fbfRemove lazy import finish startup
f96025eMerge branch 'develop' of git://github.com/sagemath/sage into public/build/startupWarning
b952bb5Fix doctests
5de08cdMerge branch 'public/build/startupWarning' of git://trac.sagemath.org/sage into public/refactoring/pytest
e198d1cMake tests mixins dynamic

comment:23 Changed 2 years ago by gh-tobiasdiez

Dependencies: #30748
Description: modified (diff)
Status: needs_workneeds_review

Thanks for the input. I've now reworked the code completely. The generic category tests are now dynamically mixed in the test class. The new approach is explained in the ticket description in more detail. The upshot is that it is very easy to add new generic test methods (simply add them to the category test class, e.g. test_sets_cat) and to add new examples (simply create a new test class implementing category_instances).

comment:24 Changed 2 years ago by mkoeppe

Status: needs_reviewneeds_work

It's really hard to see from this ticket how this is intended to go.

The previous version that I reviewed was removing existing doctests and reimplementing a small subset of them using pytest.

Now the new version is only adding some new tests using pytest that seem to duplicate some existing tests run by the doctest framework.

comment:25 Changed 2 years ago by gh-tobiasdiez

The current version is meant as a starting point for discussion whether this approach is ok. So it is a fully working prototype that indeed reimplements a subset of the current category framework tests, and is flexible enough to be extended to cover everything. Getting full parity with the current category tests needs a bit of work by extending test_sets_cat with the remaining test functions.

I'm not sure what I should do now. The only open point is that pytest needs to be run during the normal test run (on user systems as well as on the ci pipeline). But that's not really the purpose of this ticket. I think, it's clear that one cannot reimplement the complete category tests in one ticket, so this is why I opted for a minimal version. What is your proposal to proceed from here?

comment:26 Changed 2 years ago by gh-tobiasdiez

Status: needs_workneeds_review

Feedback/review still needed.

comment:27 in reply to:  25 ; Changed 2 years ago by mkoeppe

Replying to gh-tobiasdiez:

Getting full parity with the current category tests needs a bit of work by extending test_sets_cat with the remaining test functions.

But even for the test function that you already added, it is only testing a few objects.

So to get parity even for this one test function, category_instances would have to be extended vastly -- if I understand the design correctly.

It is unclear who should do this work and why.

comment:28 Changed 2 years ago by gh-tobiasdiez

Dependencies: #30748#30748, #30901

comment:29 Changed 2 years ago by gh-tobiasdiez

I've now vastly extended the tests, and replaced some of the TestSuite? calls by pytest. A handful of test methods are still missing to get full feature parity (will do this later), but I hope the idea and design get clearer now (at the cost that there are bigger changes).

The following change is characteristic for the aim of migrating from Sage's custom TestSuite? framework to pytest:

-        sage: TestSuite(C).run()
+        sage: import pytest
+        sage: pytest.main(["-r", "test_finite_semigroups.py"])

(calling pytest from the doctests is only a temporary work-around until it's properly integrated into the build chain)

Last edited 2 years ago by gh-tobiasdiez (previous) (diff)

comment:30 Changed 2 years ago by git

Commit: e198d1cca15b1a4e262ff625d8725d44fb8b77031488582a6c48583565f28432a465cbc0ea40d314

Branch pushed to git repo; I updated commit sha1. Last 10 new commits:

94e20c7Revert some of the changes
6dd6e5cFix compilation
eceefb3Remove string wrap
d345bffFix test
c47c4bfCorrect indent
3fcaf5fMerge branch 'develop' of git://github.com/sagemath/sage into public/build/multiarchsimple
090e6f1Simplify code
fa4556aRemove _get_sage_local
7b8be0dMerge branch 'public/build/multiarchsimple' of git://trac.sagemath.org/sage into public/refactoring/pytest
1488582Towards full feature parity

comment:31 in reply to:  27 Changed 2 years ago by mkoeppe

Cc: tscrim nthiery added

Replying to mkoeppe:

But even for the test function that you already added, it is only testing a few objects.

So to get parity even for this one test function, category_instances would have to be extended vastly -- if I understand the design correctly.

These changes do not seem to address the above comment? You added more test functions, instead of showing how one would achieve parity for what one test function covers. Or maybe I'm misunderstanding.

comment:32 Changed 2 years ago by gh-tobiasdiez

I guess there are two ways for migration:

  • Replace one test in the category framework by a pytest, and then make sure that this test is invoked for every example. Do this again for every test.
  • Duplicate all tests in the category framework using pytest, and then replace one TestSuite(...).run statement by its pytest equivalent. Do this again for every test invocation.

Since there are only a small number of tests (order of 10) but way more examples (order of 1000), my idea was follow the second approach since this can be done easier in steps.

Of course, the complete migration is quite a huge project. This ticket was meant to as a starting point.

The advantages I see of pytest over a custom testsuite implementation are in the ticket description.

comment:33 in reply to:  32 ; Changed 2 years ago by mkoeppe

If I understand the proposed design correctly, for each category you would manually maintain a list of category instances to test. To me this seems like a huge step backward: The categories of an object are computed dynamically - and thus all test methods that come with the categories are run.

comment:34 Changed 2 years ago by mkoeppe

(Feel free to add this to "disadvantages of the approach" in the ticket description.)

comment:35 Changed 2 years ago by mkoeppe

From "advantages" in the ticket description:

Clear separation of test methods vs production code. For example, the test_*.py files can easily be excluded from distribution.

We definitely would not want to exclude them from the distribution! Also user-defined classes will want to run the test methods that are provided by the categories as part of their tests.

comment:36 Changed 2 years ago by mkoeppe

Could you elaborate on this in the ticket description:

Good support by IDE (auto completion, test discovery and easier invocation)

comment:37 in reply to:  33 ; Changed 2 years ago by gh-tobiasdiez

Replying to mkoeppe:

If I understand the proposed design correctly, for each category you would manually maintain a list of category instances to test.

Yes, but that's not different from the current code. There you also create the examples and then run the testsuite (which makes sure to run the correct tests for this category). This is the same in the proposed design, except that you now define the examples in the test file. I would even argue that the proposed design is more flexible and makes it very transparent which examples are used to test a given category.

Last edited 2 years ago by gh-tobiasdiez (previous) (diff)

comment:38 in reply to:  36 Changed 2 years ago by gh-tobiasdiez

Description: modified (diff)

Replying to mkoeppe:

Could you elaborate on this in the ticket description:

Good support by IDE (auto completion, test discovery and easier invocation)

Done.

comment:39 in reply to:  35 Changed 2 years ago by gh-tobiasdiez

Replying to mkoeppe:

From "advantages" in the ticket description:

Clear separation of test methods vs production code. For example, the test_*.py files can easily be excluded from distribution.

We definitely would not want to exclude them from the distribution! Also user-defined classes will want to run the test methods that are provided by the categories as part of their tests.

Probably depends on the context. People using sagelib only as a (calculation) library may not need these tests. One could for example easily split them out into a new sage.testing package (or sage.categories.testing) once everything is modularized nicely.

comment:40 in reply to:  37 Changed 2 years ago by mkoeppe

Replying to gh-tobiasdiez:

Replying to mkoeppe:

If I understand the proposed design correctly, for each category you would manually maintain a list of category instances to test.

Yes, but that's not different from the current code. There you also create the examples and then run the testsuite (which makes sure to run the correct tests for this category). This is the same in the proposed design, except that you now define the examples in the test file.

I think you may be missing that the category of an object is not statically known but is computed by code.

comment:41 Changed 2 years ago by gh-tobiasdiez

No, I'm not. The test_some_category.py says only "please check the category tests for the following examples" and pytest automatically determines based on the category of these examples which tests to run. In fact, the latter is done here

+def pytest_pycollect_makeitem(collector: PyCollector, name: str, obj: type):
+    if inspect.isclass(obj) and "category_instances" in dir(obj):
+        # Enrich test class by functions from the corresponding category test class
+        for category, category_test_class in categories_with_tests.items():
+            if category() in obj.category_instances()[0].category().all_super_categories():
+                methods= [method for method in dir(category_test_class)
+                        if not method.startswith('__') and callable(getattr(category_test_class, method))]
+            
+                for method in methods:
+                    setattr(obj, method, getattr(category_test_class, method))
+
+        return pytest.Class.from_parent(collector, name=name, obj=obj)
+    else:
+        return None

comment:42 Changed 2 years ago by mkoeppe

OK, but how is this compatible with your design to list the test objects in category_instances?

comment:43 Changed 2 years ago by gh-tobiasdiez

The design is as follows. Say you have a category LieGroups combining the categories Manifolds and Groups. Then you:

  1. Create manifolds_test and groups_test files, specifying the category tests that should hold for all objects of this category.
  2. Create a liegroups_test file, specifying in category_instances the examples of Lie groups you want to test, say SO(3) and SU(2). Moreover, add test methods that are particular for the cateogory of Lie groups.

When run, the code then looks at these examples, determines that they are objects of the categories Manifolds, Groups and Lie Groups and collects all tests from manifolds_test, groups_test and liegroups_test.

This can already be seen in action in the code, where the finite semigroups inhert tests from the category of sets and of finite sets.

comment:44 Changed 2 years ago by mkoeppe

So in the end it does not matter at all in which method category_instances an object is put?

comment:45 in reply to:  43 ; Changed 2 years ago by mkoeppe

Replying to gh-tobiasdiez:

This can already be seen in action in the code, where the finite semigroups inhert tests from the category of sets and of finite sets.

But having test_associativity supplied by TestFiniteSemigroup surely can't be the final design?!

comment:46 in reply to:  44 ; Changed 2 years ago by gh-tobiasdiez

Replying to mkoeppe:

So in the end it does not matter at all in which method category_instances an object is put?

No, it doesn't really matter. One could also have one big test file listing all these examples. But I would advocate to have one test file for each category, because then one can easily also write additional tests (not related to the category framework). It's also consistent with the usual conventions where one usually has one test file for each module file.

comment:47 in reply to:  45 Changed 2 years ago by gh-tobiasdiez

Replying to mkoeppe:

But having test_associativity supplied by TestFiniteSemigroup surely can't be the final design?!

I think this was a mistake on my side, it should be in a new test file for the "Semigroup" category.

comment:48 in reply to:  46 ; Changed 2 years ago by mkoeppe

Replying to gh-tobiasdiez:

Replying to mkoeppe:

So in the end it does not matter at all in which method category_instances an object is put?

No, it doesn't really matter. One could also have one big test file listing all these examples.

OK, this is an important clarification, thanks.

But I would advocate to have one test file for each category, because then one can easily also write additional tests (not related to the category framework).

Would it not be better to have one test file for each module (... whose doctest currently has calls to TestSuite)? I really don't see what is gained by grouping the test objects by category -- which is coarser than how things are grouped currently (i.e., by module) and also arbitrary (because for join categories you would be making an arbitrary choice into which test file to put an object).

comment:49 in reply to:  48 Changed 2 years ago by gh-tobiasdiez

Replying to mkoeppe:

Would it not be better to have one test file for each module (... whose doctest currently has calls to TestSuite)? I really don't see what is gained by grouping the test objects by category -- which is coarser than how things are grouped currently (i.e., by module) and also arbitrary (because for join categories you would be making an arbitrary choice into which test file to put an object).

It depends on what you want to test. If you want to test that the implementation of the category is correct, you put it in the test file of the category. If you want to test the implementation of an example, you put it in the test file of the example.

comment:50 Changed 2 years ago by nthiery

Hello!

Thanks for bringing me in the conversation. It's been ten years since I designed TestSuite? (with the help from many). Since that time, Python and the available testing tools have evolved a lot. I am glad to see efforts going into rethinking the design in this new context, questioning the implementation, and studying alternatives that would reach the same design goals but better suit the context. Especially toward using now standard infrastructure and practice to reduce both the technical debt and the entry barrier for new devs and/or for integration with other software.

comment:51 Changed 2 years ago by nthiery

(the rant may be a bit long, so I am splitting this into several comments).

So, what were these design goals:

(I) Hardening library and user defined algebraic structures, and beyond

The tests are there for checking the internal consistency of category code and for checking that parents (and their elements, morphisms, ...) satisfy the specifications of their categories (e.g. axioms). Supporting user defined algebraic structures is a must. In my research area we routinely create new algebraic structures, and when doing so TestSuite? is a daily tool: practice has told me that, once the implementation of that algebraic structure pass the testsuite, most of the time it's actually correct (certainly something that is research area dependent!).

In some cases, it even becomes a research tool: I am not sure whether the newly defined algebraic structure actually satisfy the axioms I want it to satisfy, and I use TestSuite? to actually run checks. And use post-mortem instrospection with the debugger to actually recover the counter examples ot my conjectures.

The next step in this direction is to work toward merging the _test_xxx methods and _is_xxx methods: most of the time, the code is mostly the same: checking whether axiom xxx is satisfied. What changes is how thorough the test is, and how failures are reported. I have toyed around with the idea with prototypes in sage-semigroups. Alas I never got to actually converge to a proposal for Sage. But this is something we want to keep in mind.

Last edited 2 years ago by nthiery (previous) (diff)

comment:52 Changed 2 years ago by tscrim

Something else I want to clarify is that the category examples are generally meant to minimal working examples of what needs to be implemented for a particular category. However, some categories actually use other existing objects in Sage, which is something many of the categories do, but such a WME is not essential to defining a category.

comment:53 Changed 2 years ago by nthiery

(II) Categories as bookshelves of code, documentation and tests

Each category models some algebraic realm. Like the bookshelf about Group Theory in a library, the category of groups in GAP or Sage ties together everything that applies to all groups: code, documentation, and tests. In a perfect world where the same system would be used for computations and proofs, I would include here mathematical statements (e.g. formulae for the axioms) and proofs! Then, there are subshelves for group elements, group morphisms, ... Subshelves for particular types of groups, and so on.

There are several ways to implement that tying together. In GAP, categories are a mere filters; each method is tied to a category through an individual declaration. And the language uses a bespoke method resolution that data. In SageMath instead, we decided to rely on standard OO method resolution, grouping together methods in classes. Both approaches have their advantages notably in terms of flexibility left to organize the code, and enable user extensions to existing categories.

Two features are required for testing purposes:

(a) Recovering all the tests that one wants to run to check an object in category A

(b) For each of them, resolving which test method is to be called. Indeed, a subcategory may overload a _test_xxx method when, for example, it has a more efficient way to check xxx.

Note that (b) is of the same nature as when resolving a usual method call. It thus seems highly desirable to use the exact same resolution mechanism as for code.

Last edited 2 years ago by nthiery (previous) (diff)

comment:54 Changed 2 years ago by nthiery

(III) Scalability

Nothing much to say here: whatever design is chosen should scale to the rich hierarchy of categories in SageMath, with all the mantras (axioms, ...) that were implemented to support that richness.

A critical bit is to satisfy as much as possible the Single Source Of Truth principle. E.g. if at some point I want to refine the category of a parent (e.g. because a new category was created), I should only need to update a single location in the code.

Last edited 2 years ago by nthiery (previous) (diff)

comment:55 Changed 2 years ago by nthiery

Sorry for the rant; I am trying to clarify for myself what guided -- implicitly or explicitly -- the design back then :-) Hopefully this is useful food for thought for a redesign.

Last edited 2 years ago by nthiery (previous) (diff)

comment:56 Changed 2 years ago by nthiery

Oh, one piece of information in case it would be relevant: for the longest time it's been on my pile to instrument TestSuite? so that, by running the Sage tests, one would get a relatively thorough list of all Sage parents together with their categories.

A typical applications would be to enable inverse lookup: given a category, suggest to the user a relatively thorough list of parents implementing that category; e.g. for documentation purposes.

comment:57 Changed 2 years ago by gh-tobiasdiez

Oh, that's wonderful feedback! This kind of deep insight is hard to get if you only read the existing code and don't know the story behind it's origin. Thanks a lot for your detailed account.

Please don't take this proposal as a critics of your work on the TestSuite?. I was actually amazed when I discovered that Sage has it's own testing framework. The main idea behind this ticket was to outsource the cost of maintenance of such a framework to the broader Python community, so that we can focus on the math-related elements.

Concerning your points, I'm happy to report that the proposed design should fit all the points you mentioned. In short:

(I): It is flexible enough to accommodate new user-defined categories and examples.

(II): It relies on pytest's discovery mechanism, which makes it very transparent which tests are run for a given category object.

(III): Scalability is a bit hard to judge for me, since I'm not that familiar with all the possible edge cases. But so far I had no problems migrating the existing code. In particular, the "single source of truth principle" applies, since the testing code only relies on the calculated super categories.

Hence, I would like to invite you to play around with the new design and see if it's works for your use cases. Basic documentation about pytest is added in #31003. Let me know if you encounter any problems.

Merry Christmas!

comment:58 Changed 2 years ago by gh-kliem

Status: needs_reviewneeds_work

Red branch.

comment:59 Changed 2 years ago by git

Commit: 1488582a6c48583565f28432a465cbc0ea40d3147692b09eb35b21cc38a326a49bff2f2f241d2fb4

Branch pushed to git repo; I updated commit sha1. New commits:

7692b09Merge branch 'develop' of git://github.com/sagemath/sage into public/refactoring/pytest

comment:60 Changed 2 years ago by gh-tobiasdiez

Dependencies: #30748, #30901
Status: needs_workneeds_review

New commits:

7692b09Merge branch 'develop' of git://github.com/sagemath/sage into public/refactoring/pytest

comment:61 Changed 2 years ago by git

Commit: 7692b09eb35b21cc38a326a49bff2f2f241d2fb4de780549b6b6baba9e14798595eb3f5091c574d4

Branch pushed to git repo; I updated commit sha1. New commits:

3d5dd33Revert "Remove lazy import finish startup"
de78054Revert "Use context manager"

comment:62 Changed 2 years ago by gh-tobiasdiez

Merged develop branch and removed the (unnecessary) dependencies.

comment:63 Changed 23 months ago by mkoeppe

Milestone: sage-9.3sage-9.4

Setting new milestone based on a cursory review of ticket status, priority, and last modification date.

comment:64 Changed 19 months ago by chapoton

Status: needs_reviewneeds_work

red branch => needs work

comment:65 Changed 19 months ago by mkoeppe

Milestone: sage-9.4sage-9.5

comment:66 Changed 14 months ago by gh-tobiasdiez

Description: modified (diff)

comment:67 Changed 14 months ago by gh-tobiasdiez

Description: modified (diff)
Status: needs_workneeds_review

Rebased, so ready-for-review again.

comment:68 Changed 14 months ago by git

Commit: de780549b6b6baba9e14798595eb3f5091c574d46d4152d6c198dd3863561efc25a67cd7b5f06202

Branch pushed to git repo; I updated commit sha1. New commits:

6d4152dMerge branch 'develop' of https://github.com/sagemath/sage into public/refactoring/pytest

comment:69 Changed 14 months ago by git

Commit: 6d4152d6c198dd3863561efc25a67cd7b5f06202161e2f4b4bf21ffbbc7bceb7fe085def84350e7c

Branch pushed to git repo; I updated commit sha1. New commits:

2cef637Remove test output
161e2f4Reorder imports

comment:70 Changed 14 months ago by gh-tobiasdiez

Description: modified (diff)

comment:71 Changed 14 months ago by slelievre

Cc: slelievre added
Description: modified (diff)
Keywords: testsuite added

comment:72 Changed 14 months ago by mkoeppe

Milestone: sage-9.5sage-9.6

comment:73 Changed 11 months ago by mkoeppe

Milestone: sage-9.6sage-pending

comment:74 Changed 11 months ago by gh-tobiasdiez

Why did you change the milestone? I don't see anything that would prevent this being merged soon (except for the missing review).

comment:75 Changed 11 months ago by mkoeppe

There's been no activity, and it's not clear if there are any developers who would be interested in rewriting the entirety of our test suite.

comment:76 Changed 11 months ago by gh-tobiasdiez

Milestone: sage-pendingsage-9.6

While being a relatively huge project, I think, a completely migration can be done in a relatively straightforward way with an overseeable time investment. I would definitely up for doing this alone if necessary. So I'm appreciate if this ticket could be reviewed soon. Thanks!

comment:77 Changed 10 months ago by mkoeppe

You have received review comments: comment:45, comment:48, etc.

comment:78 Changed 10 months ago by mkoeppe

Description: modified (diff)
Summary: Replace custom test discovery by pytestReplace custom Sage unit test discovery by pytest

comment:79 Changed 10 months ago by git

Commit: 161e2f4b4bf21ffbbc7bceb7fe085def84350e7cfab57938380110184825ccf2db6bc431e81ded51

Branch pushed to git repo; I updated commit sha1. New commits:

36cfdfcMerge remote-tracking branch 'origin/develop' into public/refactoring/pytest
fab5793Move test_associativity

comment:80 Changed 10 months ago by gh-tobiasdiez

Replying to mkoeppe:

You have received review comments: comment:45, comment:48, etc.

I've fixed comment:45. To comment:48 I've already reacted in comment:49.

comment:81 Changed 10 months ago by git

Commit: fab57938380110184825ccf2db6bc431e81ded51b047abeae5c3f3bf755afa37bf65a0ed4bc61d97

Branch pushed to git repo; I updated commit sha1. New commits:

b047abeMerge branch 'develop' into public/refactoring/pytest

comment:82 Changed 10 months ago by mkoeppe

Milestone: sage-9.6sage-9.7

comment:83 Changed 4 months ago by mkoeppe

Milestone: sage-9.7sage-9.8

comment:84 Changed 3 weeks ago by mkoeppe

Milestone: sage-9.8sage-9.9
Note: See TracTickets for help on using tickets.