Opened 2 years ago
Last modified 3 weeks ago
#30738 needs_review enhancement
Replace custom Sage unit test discovery by pytest
Reported by:  ghtobiasdiez  Owned by:  

Priority:  major  Milestone:  sage9.9 
Component:  refactoring  Keywords:  testsuite 
Cc:  mkoeppe, tscrim, nthiery, slelievre  Merged in:  
Authors:  Tobias Diez  Reviewers:  
Report Upstream:  N/A  Work issues:  
Branch:  public/refactoring/pytest (Commits, GitHub, GitLab)  Commit:  b047abeae5c3f3bf755afa37bf65a0ed4bc61d97 
Dependencies:  Stopgaps: 
Description (last modified by )
Currently, Sage code contains _test_xyz
methods that are invoked via doctests and discovered via a custom TestSuite
class. This ticket is to replace this custom test discovery by the use of the established pytest framework https://docs.pytest.org/ (this is in line with #28936  using mainstream Python test infrastructure).
Advantages of this approach:
 Reuse standard test infrastructure instead of a custom test discovery interface (this is a big maintenance relief, especially given the large list of todos in the
TestSuite
code).  Better error messages in case a test fails.
 Easier onboarding of new Python developers.
 Clear separation of test methods vs production code. For example, the
*_test.py
files can easily be excluded from distribution.  Good support by IDEs (auto completion, test discovery and easier invocation). For example, VSCode automatically discovers all pytest tests, provides a convenient interface to run them (all, a selection, only failed ones) and allows to directly start a debug session for a failing test. See https://code.visualstudio.com/docs/python/testing
 Less and clearer code (no hidden assumptions).
Disadvantages of this approach:
 Current developers have to learn pytest and its conventions (files must be named
*_test.py
and test methods must start withtest_
).  Tests no longer live in the same file as the code (I would call this an advantage, but some people may prefer to have them really close together).
In order to migrate, the following changes are necessary:
 Move
_test_xyz
from a categorymodule
to new file<module>_test.py
(e.g.finite_semigroups_test.py
).  Rewrite tests using pytest (which is straightforward and more or less only amounts to removing all the custom code involving
self_tester
).  Create a new test class for the parent class under test (e.g
TestFiniteSemigroup
).  Create a static method
category_instances()
in this test class returning a list of instances to test.
For the moment, I did this only for a few test methods in sets_cat
using finite semigroups as example to illustrate the process.
The new implementation is as follows:
 The test class for the category (e.g. in
sets_cat_test.py
) defines generic tests that should pass for every object in that category. Moreover, the test class may provides instances (objects) to test via the methodcategory_instances
.  Running pytest finds the test file
finite_semigroups_test.py
, then determines the categories of the objects (returned fromcategory_instances
) and adds all generic test methods for that category to the test class (this happens inconftest.py
).  Each test method can have the parameter
category_instance
which is bound to the return value of thecategory_instances
method.
Running cd src && pytest verbose
(from a virtual env with sage and pytest installed), the output is as follows:
sage/categories/finite_semigroups_test.py::TestFiniteSemigroup::test_enumerated_set_contains[An example of a finite semigroup: the left regular band generated by ('a', 'b')] PASSED [ 2%] sage/categories/finite_semigroups_test.py::TestFiniteSemigroup::test_enumerated_set_contains[An example of a finite semigroup: the left regular band generated by ('a', 'b', 'c')] PASSED [ 4%] sage/categories/finite_semigroups_test.py::TestFiniteSemigroup::test_enumerated_set_contains[An example of a finite semigroup: the left regular band generated by ('a', 'b', 'c', 'd')] PASSED [ 6%] sage/categories/finite_semigroups_test.py::TestFiniteSemigroup::test_enumerated_set_iter_list[An example of a finite semigroup: the left regular band generated by ('a', 'b')] PASSED [ 8%] sage/categories/finite_semigroups_test.py::TestFiniteSemigroup::test_enumerated_set_iter_list[An example of a finite semigroup: the left regular band generated by ('a', 'b', 'c')] PASSED [ 10%] sage/categories/finite_semigroups_test.py::TestFiniteSemigroup::test_enumerated_set_iter_list[An example of a finite semigroup: the left regular band generated by ('a', 'b', 'c', 'd')] PASSED [ 13%] sage/categories/finite_semigroups_test.py::TestFiniteSemigroup::test_enumerated_set_iter_cardinality[An example of a finite semigroup: the left regular band generated by ('a', 'b')] PASSED [ 15%] sage/categories/finite_semigroups_test.py::TestFiniteSemigroup::test_enumerated_set_iter_cardinality[An example of a finite semigroup: the left regular band generated by ('a', 'b', 'c')] PASSED [ 17%] sage/categories/finite_semigroups_test.py::TestFiniteSemigroup::test_enumerated_set_iter_cardinality[An example of a finite semigroup: the left regular band generated by ('a', 'b', 'c', 'd')] PASSED [ 19%] sage/categories/finite_semigroups_test.py::TestFiniteSemigroup::test_associativity[An example of a finite semigroup: the left regular band generated by ('a', 'b')] PASSED [ 21%] sage/categories/finite_semigroups_test.py::TestFiniteSemigroup::test_associativity[An example of a finite semigroup: the left regular band generated by ('a', 'b', 'c')] PASSED [ 23%] sage/categories/finite_semigroups_test.py::TestFiniteSemigroup::test_associativity[An example of a finite semigroup: the left regular band generated by ('a', 'b', 'c', 'd')] PASSED [ 26%] sage/categories/finite_semigroups_test.py::TestFiniteSemigroup::test_an_element[An example of a finite semigroup: the left regular band generated by ('a', 'b')] PASSED [ 28%] sage/categories/finite_semigroups_test.py::TestFiniteSemigroup::test_an_element[An example of a finite semigroup: the left regular band generated by ('a', 'b', 'c')] PASSED [ 30%] sage/categories/finite_semigroups_test.py::TestFiniteSemigroup::test_an_element[An example of a finite semigroup: the left regular band generated by ('a', 'b', 'c', 'd')] PASSED [ 32%] sage/categories/finite_semigroups_test.py::TestFiniteSemigroup::test_an_element_idempotent[An example of a finite semigroup: the left regular band generated by ('a', 'b')] PASSED [ 34%] sage/categories/finite_semigroups_test.py::TestFiniteSemigroup::test_an_element_idempotent[An example of a finite semigroup: the left regular band generated by ('a', 'b', 'c')] PASSED [ 36%] sage/categories/finite_semigroups_test.py::TestFiniteSemigroup::test_an_element_idempotent[An example of a finite semigroup: the left regular band generated by ('a', 'b', 'c', 'd')] PASSED [ 39%] sage/categories/finite_semigroups_test.py::TestFiniteSemigroup::test_cardinality_return_type[An example of a finite semigroup: the left regular band generated by ('a', 'b')] PASSED [ 41%] sage/categories/finite_semigroups_test.py::TestFiniteSemigroup::test_cardinality_return_type[An example of a finite semigroup: the left regular band generated by ('a', 'b', 'c')] PASSED [ 43%] sage/categories/finite_semigroups_test.py::TestFiniteSemigroup::test_cardinality_return_type[An example of a finite semigroup: the left regular band generated by ('a', 'b', 'c', 'd')] PASSED [ 45%] sage/categories/finite_semigroups_test.py::TestFiniteSemigroup::test_construction[An example of a finite semigroup: the left regular band generated by ('a', 'b')] PASSED [ 47%] sage/categories/finite_semigroups_test.py::TestFiniteSemigroup::test_construction[An example of a finite semigroup: the left regular band generated by ('a', 'b', 'c')] PASSED [ 50%] sage/categories/finite_semigroups_test.py::TestFiniteSemigroup::test_construction[An example of a finite semigroup: the left regular band generated by ('a', 'b', 'c', 'd')] PASSED [ 52%] sage/categories/finite_semigroups_test.py::TestFiniteSemigroup::test_element_eq_reflexive[An example of a finite semigroup: the left regular band generated by ('a', 'b')] PASSED [ 54%] sage/categories/finite_semigroups_test.py::TestFiniteSemigroup::test_element_eq_reflexive[An example of a finite semigroup: the left regular band generated by ('a', 'b', 'c')] PASSED [ 56%] sage/categories/finite_semigroups_test.py::TestFiniteSemigroup::test_element_eq_reflexive[An example of a finite semigroup: the left regular band generated by ('a', 'b', 'c', 'd')] PASSED [ 58%] sage/categories/finite_semigroups_test.py::TestFiniteSemigroup::test_elements_eq_symmetric[An example of a finite semigroup: the left regular band generated by ('a', 'b')] PASSED [ 60%] sage/categories/finite_semigroups_test.py::TestFiniteSemigroup::test_elements_eq_symmetric[An example of a finite semigroup: the left regular band generated by ('a', 'b', 'c')] PASSED [ 63%] sage/categories/finite_semigroups_test.py::TestFiniteSemigroup::test_elements_eq_symmetric[An example of a finite semigroup: the left regular band generated by ('a', 'b', 'c', 'd')] PASSED [ 65%] sage/categories/finite_semigroups_test.py::TestFiniteSemigroup::test_elements_eq_transitive[An example of a finite semigroup: the left regular band generated by ('a', 'b')] PASSED [ 67%] sage/categories/finite_semigroups_test.py::TestFiniteSemigroup::test_elements_eq_transitive[An example of a finite semigroup: the left regular band generated by ('a', 'b', 'c')] PASSED [ 69%] sage/categories/finite_semigroups_test.py::TestFiniteSemigroup::test_elements_eq_transitive[An example of a finite semigroup: the left regular band generated by ('a', 'b', 'c', 'd')] PASSED [ 71%] sage/categories/finite_semigroups_test.py::TestFiniteSemigroup::test_elements_neq[An example of a finite semigroup: the left regular band generated by ('a', 'b')] PASSED [ 73%] sage/categories/finite_semigroups_test.py::TestFiniteSemigroup::test_elements_neq[An example of a finite semigroup: the left regular band generated by ('a', 'b', 'c')] PASSED [ 76%] sage/categories/finite_semigroups_test.py::TestFiniteSemigroup::test_elements_neq[An example of a finite semigroup: the left regular band generated by ('a', 'b', 'c', 'd')] PASSED [ 78%] ...
As one can see, each general test method is invoked for the three examples.
TODO (in followup tickets):
 Complete migration to pytest.
 Once that is done, stop invoking
TestSuite
in doctests and remove theTestSuite
class.
See also:
 #33546 Replace custom Sage doctest discovery by pytest (using pytest_collect_file)
Change History (84)
comment:1 Changed 2 years ago by
Description:  modified (diff) 

comment:2 Changed 2 years ago by
comment:3 followup: 4 Changed 2 years ago by
Yes, and this structure is also somewhat reflected in the new code as the tests for the subclasses inherit from the general base test class. If I understood the code correctly, then the main purpose of these subclasses (with respect to the tests) is to provide the examples (e.g. using the example()
convention). This can be archived in a more transparent and cleaner way using parameterized test fixtures of pytest.
comment:4 Changed 2 years ago by
Replying to ghtobiasdiez:
If I understood the code correctly, then the main purpose of these subclasses (with respect to the tests) is to provide the examples (e.g. using the
example()
convention).
No, the category classes provide mixins for dynamically generated classes for all Parent
and Element
objects. The example()
are just additional documentation.
comment:5 followup: 6 Changed 2 years ago by
Sure, but that doesn't seem to be important for the tests itself. Take for example the tests in sets_cat
. Almost all of them simply call tester.some_elements()
and then run certain assertions against the returned elements. Thus, there is no semantical difference to having one general test method accepting an element as argument, and then testing the elements returned by some_elements()
using this method. This is exactly what is proposed in this ticket.
One question though: is the purpose of these tests to check that one general implementation is correct for all these examples (e.g sets_cat
provides a general equality implementation which is checked), or that the subclasses provide an implementation that adheres to general principles (e.g. finitely_generated_semigroups
provides an implementation of equality that is checked). Just so that I know where to best place these tests (test_sets_cat vs test_finitely_generated_semigroups).
comment:6 Changed 2 years ago by
Replying to ghtobiasdiez:
One question though: is the purpose of these tests to check that one general implementation is correct for all these examples (e.g
sets_cat
provides a general equality implementation which is checked), or that the subclasses provide an implementation that adheres to general principles (e.g.finitely_generated_semigroups
provides an implementation of equality that is checked).
The latter.
comment:7 Changed 2 years ago by
Thanks, I thought so too. Then the current file structure should be right. Should I go ahead and convert the other test methods to pytest as well? (As part of this ticket, or open a new ticket for every method?)
comment:8 Changed 2 years ago by
I do not know enough about pytest. Can you expand the ticket description to explain how the test discovery works after the ticket, and how it ensures that still the same things get tested?
comment:9 Changed 2 years ago by
Milestone:  sage9.2 → sage9.3 

comment:10 Changed 2 years ago by
Description:  modified (diff) 

Status:  new → needs_review 
comment:11 Changed 2 years ago by
I've extended the ticket description. Does this clarify your questions?
The test test_element_eq_reflexive
(and other similar tests to be added) are invoked for each element of test_elements
. Thus, by adding all the instances currently covered by the unit tests we make sure that the test coverage stays the same.
comment:12 Changed 2 years ago by
Sorry, I still don't get it. The branch as it is on the ticket  do you claim that it still tests the same objects as before?
comment:13 followup: 14 Changed 2 years ago by
If you run pytest
on this branch, it invokes test_elements_eq_reflexive
with the arguments ['a', 'b', 'c', 'ab', 'ac', 'ba', 'bc', 'ca', 'cb', 'abc']
.
This is the same as the doctest for finite_semigroups
:
Now, let us look at the structure of the semigroup:: sage: S = FiniteSemigroups().example(alphabet = ('a','b','c')) sage: S.cayley_graph(side="left", simple=True).plot() Graphics object consisting of 60 graphics primitives sage: S.j_transversal_of_idempotents() # random (arbitrary choice) ['acb', 'ac', 'ab', 'bc', 'a', 'c', 'b'] We conclude by running systematic tests on this semigroup:: sage: TestSuite(S).run(verbose = True) running ._test_an_element() . . . pass running ._test_associativity() . . . pass running ._test_cardinality() . . . pass running ._test_category() . . . pass running ._test_construction() . . . pass running ._test_elements() . . . Running the test suite of self.an_element() running ._test_category() . . . pass running ._test_eq() . . . pass running ._test_new() . . . pass running ._test_not_implemented_methods() . . . pass running ._test_pickling() . . . pass pass running ._test_elements_eq_reflexive() . . . pass running ._test_elements_eq_symmetric() . . . pass running ._test_elements_eq_transitive() . . . pass running ._test_elements_neq() . . . pass running ._test_enumerated_set_contains() . . . pass running ._test_enumerated_set_iter_cardinality() . . . pass running ._test_enumerated_set_iter_list() . . . pass running ._test_eq() . . . pass running ._test_new() . . . pass running ._test_not_implemented_methods() . . . pass running ._test_pickling() . . . pass running ._test_some_elements() . . . pass
So the idea would be to replace this doctest by pytest
. There are a few more doctests of this form that can be replaced in a similar vain by adding the corresponding elements to the test_elements
list.
comment:14 Changed 2 years ago by
Replying to ghtobiasdiez:
There are a few more doctests of this form that can be replaced in a similar vain by adding the corresponding elements to the
test_elements
list.
A few? Currently _test_elements_eq_reflexive
is invoked by TestSuite(object).run()
whenever object
happens to be in the category of sets. This is a huge, dynamically determined list.
comment:15 followup: 19 Changed 2 years ago by
Sure, there are quite a lot of instances where a TestSuite
is run from the doctest. Each one would need to be replaced by a test file similar to test_finite_semigroups.py
. Pytest is not a magician and the information which examples to test still needs to be provided.
So the replacement dictionary roughly speaking looks like:
TestSuite
>pytest
TestSuite(S).run(verbose = True)
>@pytest.fixture(params=S)
The idea is not to change the way the tests cases are provided (although that may certainly be improved in the future) but to replace sage's only implementation of test discovery and invocation by what is probably the most establish Python testing framework.
comment:16 Changed 2 years ago by
So in your design, there would be a hierarchy of Tests
classes parallel to the hierarchy of, say, parent classes in sage?
comment:17 Changed 2 years ago by
Yes! Every parent class would have a corresponding test class which specifies which general axioms this parent class wants to adhere to. This is done by deriving from one or more generic test classes (the one in the branch should probably be renamed to GenericSetTests
).
comment:18 Changed 2 years ago by
Okay, but the parent classes are dynamic classes formed by mixing in axioms. So the test classes would also be determined dynamically?
comment:19 Changed 2 years ago by
Status:  needs_review → needs_work 

Replying to ghtobiasdiez:
here are quite a lot of instances where a
TestSuite
is run from the doctest. Each one would need to be replaced by a test file similar totest_finite_semigroups.py
. Pytest is not a magician and the information which examples to test still needs to be provided.
Thanks for this clarification. I'm setting this ticket to "needs_work" because the branch in this form is not complete  it effectively removes existing tests for _test_elements_eq_reflexive
. Without a much more complete branch that reaches test parity, this idea is difficult to evaluate.
Overall I think there is a conflict here between the dynamic nature of our category system and what seems to amount to a static determination of what objects to test for what properties (in the proposed changes).
comment:20 followup: 21 Changed 2 years ago by
I'm not that familiar yet with the category framework. When you say "dynamic classes formed by mixing in axioms" you mean something like Parent.__init__(self, category = Semigroups().Finite().FinitelyGenerated())
? But, in general, yes the idea is to mixin the tests. I don't see any reason why this couldn't be done dynamically.
I agree, this branch has not yet feature parity with the master branch. Before I invest more time into it, I just wanted to confirm that this idea would be worth looking at in general.
comment:21 Changed 2 years ago by
Replying to ghtobiasdiez:
When you say "dynamic classes formed by mixing in axioms" you mean something like
Parent.__init__(self, category = Semigroups().Finite().FinitelyGenerated())
?
Yes, things like this.
comment:22 Changed 2 years ago by
Commit:  3b243d0e8c790de8987f81dcf877d7b07ead4a11 → e198d1cca15b1a4e262ff625d8725d44fb8b7703 

Branch pushed to git repo; I updated commit sha1. New commits:
1df0a95  Merge branch 'develop' of git://github.com/sagemath/sage into public/refactoring/pytest

b12608b  Readd test

79abf0d  Refactor

99ae00c  Replace startup exception by warning

11882e5  Use context manager

6a52fbf  Remove lazy import finish startup

f96025e  Merge branch 'develop' of git://github.com/sagemath/sage into public/build/startupWarning

b952bb5  Fix doctests

5de08cd  Merge branch 'public/build/startupWarning' of git://trac.sagemath.org/sage into public/refactoring/pytest

e198d1c  Make tests mixins dynamic

comment:23 Changed 2 years ago by
Dependencies:  → #30748 

Description:  modified (diff) 
Status:  needs_work → needs_review 
Thanks for the input. I've now reworked the code completely. The generic category tests are now dynamically mixed in the test class. The new approach is explained in the ticket description in more detail. The upshot is that it is very easy to add new generic test methods (simply add them to the category test class, e.g. test_sets_cat
) and to add new examples (simply create a new test class implementing category_instances
).
comment:24 Changed 2 years ago by
Status:  needs_review → needs_work 

It's really hard to see from this ticket how this is intended to go.
The previous version that I reviewed was removing existing doctests and reimplementing a small subset of them using pytest.
Now the new version is only adding some new tests using pytest that seem to duplicate some existing tests run by the doctest framework.
comment:25 followup: 27 Changed 2 years ago by
The current version is meant as a starting point for discussion whether this approach is ok. So it is a fully working prototype that indeed reimplements a subset of the current category framework tests, and is flexible enough to be extended to cover everything.
Getting full parity with the current category tests needs a bit of work by extending test_sets_cat
with the remaining test functions.
I'm not sure what I should do now. The only open point is that pytest needs to be run during the normal test run (on user systems as well as on the ci pipeline). But that's not really the purpose of this ticket. I think, it's clear that one cannot reimplement the complete category tests in one ticket, so this is why I opted for a minimal version. What is your proposal to proceed from here?
comment:27 followup: 31 Changed 2 years ago by
Replying to ghtobiasdiez:
Getting full parity with the current category tests needs a bit of work by extending
test_sets_cat
with the remaining test functions.
But even for the test function that you already added, it is only testing a few objects.
So to get parity even for this one test function, category_instances
would have to be extended vastly  if I understand the design correctly.
It is unclear who should do this work and why.
comment:28 Changed 2 years ago by
Dependencies:  #30748 → #30748, #30901 

comment:29 Changed 2 years ago by
I've now vastly extended the tests, and replaced some of the TestSuite? calls by pytest. A handful of test methods are still missing to get full feature parity (will do this later), but I hope the idea and design get clearer now (at the cost that there are bigger changes).
The following change is characteristic for the aim of migrating from Sage's custom TestSuite? framework to pytest:
 sage: TestSuite(C).run() + sage: import pytest + sage: pytest.main(["r", "test_finite_semigroups.py"])
(calling pytest from the doctests is only a temporary workaround until it's properly integrated into the build chain)
comment:30 Changed 2 years ago by
Commit:  e198d1cca15b1a4e262ff625d8725d44fb8b7703 → 1488582a6c48583565f28432a465cbc0ea40d314 

Branch pushed to git repo; I updated commit sha1. Last 10 new commits:
94e20c7  Revert some of the changes

6dd6e5c  Fix compilation

eceefb3  Remove string wrap

d345bff  Fix test

c47c4bf  Correct indent

3fcaf5f  Merge branch 'develop' of git://github.com/sagemath/sage into public/build/multiarchsimple

090e6f1  Simplify code

fa4556a  Remove _get_sage_local

7b8be0d  Merge branch 'public/build/multiarchsimple' of git://trac.sagemath.org/sage into public/refactoring/pytest

1488582  Towards full feature parity

comment:31 Changed 2 years ago by
Cc:  tscrim nthiery added 

Replying to mkoeppe:
But even for the test function that you already added, it is only testing a few objects.
So to get parity even for this one test function,
category_instances
would have to be extended vastly  if I understand the design correctly.
These changes do not seem to address the above comment? You added more test functions, instead of showing how one would achieve parity for what one test function covers. Or maybe I'm misunderstanding.
comment:32 followup: 33 Changed 2 years ago by
I guess there are two ways for migration:
 Replace one test in the category framework by a pytest, and then make sure that this test is invoked for every example. Do this again for every test.
 Duplicate all tests in the category framework using pytest, and then replace one
TestSuite(...).run
statement by its pytest equivalent. Do this again for every test invocation.
Since there are only a small number of tests (order of 10) but way more examples (order of 1000), my idea was follow the second approach since this can be done easier in steps.
Of course, the complete migration is quite a huge project. This ticket was meant to as a starting point.
The advantages I see of pytest over a custom testsuite implementation are in the ticket description.
comment:33 followup: 37 Changed 2 years ago by
If I understand the proposed design correctly, for each category you would manually maintain a list of category instances to test. To me this seems like a huge step backward: The categories of an object are computed dynamically  and thus all test methods that come with the categories are run.
comment:34 Changed 2 years ago by
(Feel free to add this to "disadvantages of the approach" in the ticket description.)
comment:35 followup: 39 Changed 2 years ago by
From "advantages" in the ticket description:
Clear separation of test methods vs production code. For example, the test_*.py files can easily be excluded from distribution.
We definitely would not want to exclude them from the distribution! Also userdefined classes will want to run the test methods that are provided by the categories as part of their tests.
comment:36 followup: 38 Changed 2 years ago by
Could you elaborate on this in the ticket description:
Good support by IDE (auto completion, test discovery and easier invocation)
comment:37 followup: 40 Changed 2 years ago by
Replying to mkoeppe:
If I understand the proposed design correctly, for each category you would manually maintain a list of category instances to test.
Yes, but that's not different from the current code. There you also create the examples and then run the testsuite (which makes sure to run the correct tests for this category). This is the same in the proposed design, except that you now define the examples in the test file. I would even argue that the proposed design is more flexible and makes it very transparent which examples are used to test a given category.
comment:38 Changed 2 years ago by
Description:  modified (diff) 

Replying to mkoeppe:
Could you elaborate on this in the ticket description:
Good support by IDE (auto completion, test discovery and easier invocation)
Done.
comment:39 Changed 2 years ago by
Replying to mkoeppe:
From "advantages" in the ticket description:
Clear separation of test methods vs production code. For example, the test_*.py files can easily be excluded from distribution.
We definitely would not want to exclude them from the distribution! Also userdefined classes will want to run the test methods that are provided by the categories as part of their tests.
Probably depends on the context. People using sagelib only as a (calculation) library may not need these tests. One could for example easily split them out into a new sage.testing package (or sage.categories.testing) once everything is modularized nicely.
comment:40 Changed 2 years ago by
Replying to ghtobiasdiez:
Replying to mkoeppe:
If I understand the proposed design correctly, for each category you would manually maintain a list of category instances to test.
Yes, but that's not different from the current code. There you also create the examples and then run the testsuite (which makes sure to run the correct tests for this category). This is the same in the proposed design, except that you now define the examples in the test file.
I think you may be missing that the category of an object is not statically known but is computed by code.
comment:41 Changed 2 years ago by
No, I'm not. The test_some_category.py
says only "please check the category tests for the following examples" and pytest automatically determines based on the category of these examples which tests to run. In fact, the latter is done here
+def pytest_pycollect_makeitem(collector: PyCollector, name: str, obj: type): + if inspect.isclass(obj) and "category_instances" in dir(obj): + # Enrich test class by functions from the corresponding category test class + for category, category_test_class in categories_with_tests.items(): + if category() in obj.category_instances()[0].category().all_super_categories(): + methods= [method for method in dir(category_test_class) + if not method.startswith('__') and callable(getattr(category_test_class, method))] + + for method in methods: + setattr(obj, method, getattr(category_test_class, method)) + + return pytest.Class.from_parent(collector, name=name, obj=obj) + else: + return None
comment:42 Changed 2 years ago by
OK, but how is this compatible with your design to list the test objects in category_instances
?
comment:43 followup: 45 Changed 2 years ago by
The design is as follows. Say you have a category LieGroups
combining the categories Manifolds
and Groups
.
Then you:
 Create
manifolds_test
andgroups_test
files, specifying the category tests that should hold for all objects of this category.  Create a
liegroups_test
file, specifying incategory_instances
the examples of Lie groups you want to test, saySO(3)
andSU(2)
. Moreover, add test methods that are particular for the cateogory of Lie groups.
When run, the code then looks at these examples, determines that they are objects of the categories Manifolds, Groups and Lie Groups and collects all tests from manifolds_test
, groups_test
and liegroups_test
.
This can already be seen in action in the code, where the finite semigroups inhert tests from the category of sets and of finite sets.
comment:44 followup: 46 Changed 2 years ago by
So in the end it does not matter at all in which method category_instances
an object is put?
comment:45 followup: 47 Changed 2 years ago by
Replying to ghtobiasdiez:
This can already be seen in action in the code, where the finite semigroups inhert tests from the category of sets and of finite sets.
But having test_associativity
supplied by TestFiniteSemigroup
surely can't be the final design?!
comment:46 followup: 48 Changed 2 years ago by
Replying to mkoeppe:
So in the end it does not matter at all in which method
category_instances
an object is put?
No, it doesn't really matter. One could also have one big test file listing all these examples. But I would advocate to have one test file for each category, because then one can easily also write additional tests (not related to the category framework). It's also consistent with the usual conventions where one usually has one test file for each module file.
comment:47 Changed 2 years ago by
Replying to mkoeppe:
But having
test_associativity
supplied byTestFiniteSemigroup
surely can't be the final design?!
I think this was a mistake on my side, it should be in a new test file for the "Semigroup" category.
comment:48 followup: 49 Changed 2 years ago by
Replying to ghtobiasdiez:
Replying to mkoeppe:
So in the end it does not matter at all in which method
category_instances
an object is put?No, it doesn't really matter. One could also have one big test file listing all these examples.
OK, this is an important clarification, thanks.
But I would advocate to have one test file for each category, because then one can easily also write additional tests (not related to the category framework).
Would it not be better to have one test file for each module (... whose doctest currently has calls to TestSuite
)? I really don't see what is gained by grouping the test objects by category  which is coarser than how things are grouped currently (i.e., by module) and also arbitrary (because for join categories you would be making an arbitrary choice into which test file to put an object).
comment:49 Changed 2 years ago by
Replying to mkoeppe:
Would it not be better to have one test file for each module (... whose doctest currently has calls to
TestSuite
)? I really don't see what is gained by grouping the test objects by category  which is coarser than how things are grouped currently (i.e., by module) and also arbitrary (because for join categories you would be making an arbitrary choice into which test file to put an object).
It depends on what you want to test. If you want to test that the implementation of the category is correct, you put it in the test file of the category. If you want to test the implementation of an example, you put it in the test file of the example.
comment:50 Changed 2 years ago by
Hello!
Thanks for bringing me in the conversation. It's been ten years since I designed TestSuite? (with the help from many). Since that time, Python and the available testing tools have evolved a lot. I am glad to see efforts going into rethinking the design in this new context, questioning the implementation, and studying alternatives that would reach the same design goals but better suit the context. Especially toward using now standard infrastructure and practice to reduce both the technical debt and the entry barrier for new devs and/or for integration with other software.
comment:51 Changed 2 years ago by
(the rant may be a bit long, so I am splitting this into several comments).
So, what were these design goals:
(I) Hardening library and user defined algebraic structures, and beyond
The tests are there for checking the internal consistency of category code and for checking that parents (and their elements, morphisms, ...) satisfy the specifications of their categories (e.g. axioms). Supporting user defined algebraic structures is a must. In my research area we routinely create new algebraic structures, and when doing so TestSuite? is a daily tool: practice has told me that, once the implementation of that algebraic structure pass the testsuite, most of the time it's actually correct (certainly something that is research area dependent!).
In some cases, it even becomes a research tool: I am not sure whether the newly defined algebraic structure actually satisfy the axioms I want it to satisfy, and I use TestSuite? to actually run checks. And use postmortem instrospection with the debugger to actually recover the counter examples ot my conjectures.
The next step in this direction is to work toward merging
the _test_xxx
methods and _is_xxx
methods: most of the time,
the code is mostly the same: checking whether axiom xxx is
satisfied. What changes is how thorough the test is, and
how failures are reported. I have toyed around with the idea
with prototypes in sagesemigroups. Alas I never got to actually
converge to a proposal for Sage. But this is something we want
to keep in mind.
comment:52 Changed 2 years ago by
Something else I want to clarify is that the category examples are generally meant to minimal working examples of what needs to be implemented for a particular category. However, some categories actually use other existing objects in Sage, which is something many of the categories do, but such a WME is not essential to defining a category.
comment:53 Changed 2 years ago by
(II) Categories as bookshelves of code, documentation and tests
Each category models some algebraic realm. Like the bookshelf about Group Theory in a library, the category of groups in GAP or Sage ties together everything that applies to all groups: code, documentation, and tests. In a perfect world where the same system would be used for computations and proofs, I would include here mathematical statements (e.g. formulae for the axioms) and proofs! Then, there are subshelves for group elements, group morphisms, ... Subshelves for particular types of groups, and so on.
There are several ways to implement that tying together. In GAP, categories are a mere filters; each method is tied to a category through an individual declaration. And the language uses a bespoke method resolution that data. In SageMath instead, we decided to rely on standard OO method resolution, grouping together methods in classes. Both approaches have their advantages notably in terms of flexibility left to organize the code, and enable user extensions to existing categories.
Two features are required for testing purposes:
(a) Recovering all the tests that one wants to run to check an object in category A
(b) For each of them, resolving which test method is to be called. Indeed, a subcategory may overload a _test_xxx method when, for example, it has a more efficient way to check xxx.
Note that (b) is of the same nature as when resolving a usual method call. It thus seems highly desirable to use the exact same resolution mechanism as for code.
comment:54 Changed 2 years ago by
(III) Scalability
Nothing much to say here: whatever design is chosen should scale to the rich hierarchy of categories in SageMath, with all the mantras (axioms, ...) that were implemented to support that richness.
A critical bit is to satisfy as much as possible the Single Source Of Truth principle. E.g. if at some point I want to refine the category of a parent (e.g. because a new category was created), I should only need to update a single location in the code.
comment:55 Changed 2 years ago by
Sorry for the rant; I am trying to clarify for myself what guided  implicitly or explicitly  the design back then :) Hopefully this is useful food for thought for a redesign.
comment:56 Changed 2 years ago by
Oh, one piece of information in case it would be relevant: for the longest time it's been on my pile to instrument TestSuite? so that, by running the Sage tests, one would get a relatively thorough list of all Sage parents together with their categories.
A typical applications would be to enable inverse lookup: given a category, suggest to the user a relatively thorough list of parents implementing that category; e.g. for documentation purposes.
comment:57 Changed 2 years ago by
Oh, that's wonderful feedback! This kind of deep insight is hard to get if you only read the existing code and don't know the story behind it's origin. Thanks a lot for your detailed account.
Please don't take this proposal as a critics of your work on the TestSuite?. I was actually amazed when I discovered that Sage has it's own testing framework. The main idea behind this ticket was to outsource the cost of maintenance of such a framework to the broader Python community, so that we can focus on the mathrelated elements.
Concerning your points, I'm happy to report that the proposed design should fit all the points you mentioned. In short:
(I): It is flexible enough to accommodate new userdefined categories and examples.
(II): It relies on pytest's discovery mechanism, which makes it very transparent which tests are run for a given category object.
(III): Scalability is a bit hard to judge for me, since I'm not that familiar with all the possible edge cases. But so far I had no problems migrating the existing code. In particular, the "single source of truth principle" applies, since the testing code only relies on the calculated super categories.
Hence, I would like to invite you to play around with the new design and see if it's works for your use cases. Basic documentation about pytest is added in #31003. Let me know if you encounter any problems.
Merry Christmas!
comment:59 Changed 2 years ago by
Commit:  1488582a6c48583565f28432a465cbc0ea40d314 → 7692b09eb35b21cc38a326a49bff2f2f241d2fb4 

Branch pushed to git repo; I updated commit sha1. New commits:
7692b09  Merge branch 'develop' of git://github.com/sagemath/sage into public/refactoring/pytest

comment:60 Changed 2 years ago by
Dependencies:  #30748, #30901 

Status:  needs_work → needs_review 
New commits:
7692b09  Merge branch 'develop' of git://github.com/sagemath/sage into public/refactoring/pytest

comment:61 Changed 2 years ago by
Commit:  7692b09eb35b21cc38a326a49bff2f2f241d2fb4 → de780549b6b6baba9e14798595eb3f5091c574d4 

comment:63 Changed 23 months ago by
Milestone:  sage9.3 → sage9.4 

Setting new milestone based on a cursory review of ticket status, priority, and last modification date.
comment:65 Changed 19 months ago by
Milestone:  sage9.4 → sage9.5 

comment:66 Changed 14 months ago by
Description:  modified (diff) 

comment:67 Changed 14 months ago by
Description:  modified (diff) 

Status:  needs_work → needs_review 
Rebased, so readyforreview again.
comment:68 Changed 14 months ago by
Commit:  de780549b6b6baba9e14798595eb3f5091c574d4 → 6d4152d6c198dd3863561efc25a67cd7b5f06202 

Branch pushed to git repo; I updated commit sha1. New commits:
6d4152d  Merge branch 'develop' of https://github.com/sagemath/sage into public/refactoring/pytest

comment:69 Changed 14 months ago by
Commit:  6d4152d6c198dd3863561efc25a67cd7b5f06202 → 161e2f4b4bf21ffbbc7bceb7fe085def84350e7c 

comment:70 Changed 14 months ago by
Description:  modified (diff) 

comment:71 Changed 14 months ago by
Cc:  slelievre added 

Description:  modified (diff) 
Keywords:  testsuite added 
comment:72 Changed 14 months ago by
Milestone:  sage9.5 → sage9.6 

comment:73 Changed 11 months ago by
Milestone:  sage9.6 → sagepending 

comment:74 Changed 11 months ago by
Why did you change the milestone? I don't see anything that would prevent this being merged soon (except for the missing review).
comment:75 Changed 11 months ago by
There's been no activity, and it's not clear if there are any developers who would be interested in rewriting the entirety of our test suite.
comment:76 Changed 11 months ago by
Milestone:  sagepending → sage9.6 

While being a relatively huge project, I think, a completely migration can be done in a relatively straightforward way with an overseeable time investment. I would definitely up for doing this alone if necessary. So I'm appreciate if this ticket could be reviewed soon. Thanks!
comment:78 Changed 10 months ago by
Description:  modified (diff) 

Summary:  Replace custom test discovery by pytest → Replace custom Sage unit test discovery by pytest 
comment:79 Changed 10 months ago by
Commit:  161e2f4b4bf21ffbbc7bceb7fe085def84350e7c → fab57938380110184825ccf2db6bc431e81ded51 

comment:80 Changed 10 months ago by
Replying to mkoeppe:
You have received review comments: comment:45, comment:48, etc.
I've fixed comment:45. To comment:48 I've already reacted in comment:49.
comment:81 Changed 10 months ago by
Commit:  fab57938380110184825ccf2db6bc431e81ded51 → b047abeae5c3f3bf755afa37bf65a0ed4bc61d97 

Branch pushed to git repo; I updated commit sha1. New commits:
b047abe  Merge branch 'develop' into public/refactoring/pytest

comment:82 Changed 10 months ago by
Milestone:  sage9.6 → sage9.7 

comment:83 Changed 4 months ago by
Milestone:  sage9.7 → sage9.8 

comment:84 Changed 3 weeks ago by
Milestone:  sage9.8 → sage9.9 

Are you aware that the crucial point of our _test methods, in particular those provided in the category framework, is that objects of subclasses run these tests?