## #33432 new defect

# Restore basic stats commands to the global name space

Mean, median and mode are now deprecated by #29662. E.g.:

    >median([1,2,3])
    2
    :1: DeprecationWarning: sage.stats.basic_stats.median is deprecated; use numpy.median or numpy.nanmedian instead
    See https://trac.sagemath.org/29662 for details.

But these basic functions should have some default functionality. It seems strange to not have a top-level "mean" or "median" function, given all of the other esoteric top-level functions.

The idea is to provide `mean` and `median` commands with at least the functionality of the deprecated commands.

Discussion in sage-support: https://groups.google.com/g/sage-support/c/fglHtSGKFJk

Replying to mkoeppe:

> The functionality is still there; it has not been removed.

Then do we have the option of abolishing the deprecation, instead of relying on numpy?

The deprecation messages are an improvement over the previous status quo. They point users to more suitable facilities.

The deprecation messages may also provide developers with incentive to produce improved versions of those functions. I don't know if this is a problem:

    sage: import numpy
    sage: type(numpy.mean([1,2,3]))
    <class 'numpy.float64'>

Yes, numpy's function returns a numpy type. That's why it should be called explicitly as `np.mean` after `import numpy as np`; importing these functions into our global namespace would not be a good idea.

Okay. Then how do we provide mean and median in the global namespace, which is the goal of this ticket?

They are still in the global namespace.

I am confused. They are deprecated, and will be removed from the global namespace.

Only if we remove them. We don't have to.

I see your idea.

But I don't agree with you. Our student and teacher users wouldn't want to have "mean" and "median" commands with deprecation string attached. This is the point of this ticket.

> Our student and teacher users wouldn't want to have "mean" and "median" commands with deprecation string attached.

The deprecation message provides a necessary commentary/update to their teaching materials.

Replying to klee:

> > Our student and teacher users wouldn't want to have "mean" and "median" commands with deprecation string attached.
>
> The deprecation message provides a necessary commentary/update to their teaching materials.

Really? It is already complicated enough to teach SageMath. To my mind, having an extra level of noise is not helping. `mean` and `median` ought to be elementary functions, likely to be presented in a first course. If these functions are to be kept, the warning is more harmful than useful.

I am in favour of removing the warnings for `mean`, `median` (and possibly `variance` and `std`) and keeping these functions roughly as they are. If the input data is a `numpy` array, the code calls the correct numpy method.
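The dispatch described here can be sketched in a few lines. This is only an illustration, not the code in `sage.stats.basic_stats`: the name `mean_sketch` and the stand-in class `FakeArray` are hypothetical, and `Fraction` stands in for Sage's exact numbers so that the example runs in plain Python.

```python
from fractions import Fraction


def mean_sketch(v):
    """Return the mean of v, deferring to v.mean() when the container has one."""
    # Containers such as numpy arrays carry their own mean(); use it.
    if hasattr(v, "mean"):
        return v.mean()
    if len(v) == 0:
        raise ValueError("mean requires at least one data point")
    total = sum(v)
    # Plain ints would give a float under "/"; keep the result exact instead.
    if isinstance(total, int):
        return Fraction(total, len(v))
    return total / len(v)


class FakeArray:
    """Hypothetical stand-in for an array type that has its own mean()."""

    def __init__(self, data):
        self.data = list(data)

    def mean(self):
        return sum(self.data) / len(self.data)


print(mean_sketch([1, 2, 4]))               # exact rational result
print(mean_sketch(FakeArray([1.0, 2.0])))   # delegated to FakeArray.mean
```

The point of the duck-typed check is that array-like inputs keep their own semantics and return types, while ordinary lists get an exact answer.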

> Really? It is already complicated enough to teach SageMath. To my mind, having an extra level of noise is not helping. `mean` and `median` ought to be elementary functions, likely to be presented in a first course. If these functions are to be kept, the warning is more harmful than useful.

+1

> I am in favour of removing the warnings for `mean`, `median` (and possibly `variance` and `std`) and keeping these functions roughly as they are. If the input data is a `numpy` array, the code calls the correct numpy method.

+1


Yes.

See also this comment; it's unfortunate that it sounds like it might be too hard to overload those functions, e.g. `mean`, to work with Sage integers if/when Python 3.8+ becomes the default.

> I am in favour of removing the warnings for `mean`, `median` (and possibly `variance` and `std`) and keep these functions roughly as they are.

That by itself does not sound like a good plan. There's still the disservice to learners: A Sage-specific dead end with limited functionality and no perspective.

How about this:

- In the short term, transform `sage.stats.basic_stats` to a module that provides the same API as the built-in `statistics` module (https://docs.python.org/3/library/statistics.html).
- In the long term, work with the Python community so that the built-in `statistics` module can handle collections with a mix of types, including Sage's numbers and other objects.

But someone would have to work on it.
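For reference, the target API already handles exact rational input in plain Python. Here is a small illustration using only the standard library, with `fractions.Fraction` standing in for Sage rationals (an assumption made so that the example runs outside Sage):

```python
import statistics
from fractions import Fraction

data = [Fraction(1, 2), Fraction(3, 2), Fraction(5, 2), Fraction(5, 2)]

m = statistics.mean(data)            # exact: the result stays a Fraction
med = statistics.median(data)        # average of the two middle values
modes = statistics.multimode(data)   # list of all most-common values

print(m, med, modes)
```

A Sage module offering the same function names would let teaching material written against `statistics` carry over unchanged.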

This is hitting #28234

    sage: import statistics
    sage: statistics.mean([1,2,3,4])
    ---------------------------------------------------------------------------
    TypeError                                 Traceback (most recent call last)
    <ipython-input-2-8b5038d5efc8> in <module>
    ----> 1 statistics.mean([Integer(1),Integer(2),Integer(3),Integer(4)])

    /usr/lib/python3.10/statistics.py in mean(data)
        327     if n < 1:
        328         raise StatisticsError('mean requires at least one data point')
    --> 329     T, total, count = _sum(data)
        330     assert count == n
        331     return _convert(total / n, T)

    /usr/lib/python3.10/statistics.py in _sum(data)
        196     else:
        197         # Sum all the partial sums using builtin sum.
    --> 198         total = sum(Fraction(n, d) for d, n in partials.items())
        199     return (T, total, count)
        200

    /usr/lib/python3.10/statistics.py in <genexpr>(.0)
        196     else:
        197         # Sum all the partial sums using builtin sum.
    --> 198         total = sum(Fraction(n, d) for d, n in partials.items())
        199     return (T, total, count)
        200

    /usr/lib/python3.10/fractions.py in __new__(cls, numerator, denominator, _normalize)
        146             isinstance(denominator, numbers.Rational)):
        147             numerator, denominator = (
    --> 148                 numerator.numerator * denominator.denominator,
        149                 denominator.numerator * numerator.denominator
        150                 )

    TypeError: unsupported operand type(s) for *: 'builtin_function_or_method' and 'builtin_function_or_method'

The dilemma remains:

- make `numerator`/`denominator` attributes instead of methods to be compatible with Python `numbers.Rational`
- convince Python dev that `numerator()`/`denominator()` should be equally supported on the Python side (which has already been tried by Jeroen in the past)
- continue being orthogonal to Python

But it does not block work for the "short term plan".
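The incompatibility can be reproduced without Sage. In the sketch below, `MethodStyleInt` is a hypothetical stand-in for a Sage integer: it is registered as a `numbers.Rational` but exposes `numerator` and `denominator` as methods, which is the situation the traceback above shows.

```python
import numbers
from fractions import Fraction


class MethodStyleInt:
    """Hypothetical integer-like type with numerator()/denominator() methods."""

    def __init__(self, value):
        self.value = value

    def numerator(self):
        return self.value

    def denominator(self):
        return 1


# Claim to be a Rational without providing the attribute-style interface.
numbers.Rational.register(MethodStyleInt)

try:
    Fraction(MethodStyleInt(1), MethodStyleInt(2))
    outcome = "ok"
except TypeError as exc:
    # Fraction multiplies x.numerator by y.denominator, i.e. two bound methods.
    outcome = f"TypeError: {exc}"

print(outcome)
```

Turning the two methods into properties would satisfy `Fraction`, which is the first option in the list above.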

> - convince Python dev that `numerator()`/`denominator()` should be equally supported on the python side (which has already been tried by Jeroen in the past)

The problem in `statistics` is more specifically https://trac.sagemath.org/ticket/28234#comment:62

The functions `mean` and `median` will be restored in #33453. The other functions are not compatible with the `statistics` module, and proper deprecation warnings are raised.

> The functions `mean` and `median` will be restored in #33453. The other functions are not compatible with the `statistics` module, and proper deprecation warnings are raised.

So the plan is to also import the other functions from the `sage.stats.statistics` module into the global namespace after the deprecation period?

Replying to vdelecroix:

> > The functions `mean` and `median` will be restored in #33453. The other functions are not compatible with the `statistics` module, and proper deprecation warnings are raised.
>
> So the plan is to also import the other functions from the `sage.stats.statistics` module into the global namespace after the deprecation period?

Nothing is fixed yet. We could already pull all the contents of `sage.stats.statistics` into the global namespace but `mode` (whose specification conflicts with `stats.basic_stats.mode`). The deprecations have to stay because of the change of behaviour:

- `mode` -> `multimode` (`mode` becomes something else)
- `std` -> `stdev` and `pstdev` (depending on the value of `bias`)
- `variance` -> `variance` and `pvariance` (depending on the value of `bias`)
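The behavioural differences behind these renames are visible in the standard `statistics` module itself. A short illustration follows; the mapping of `bias` to the sample and population variants in the comments is an assumption about the old Sage semantics, not something stated in this thread.

```python
import math
import statistics
from fractions import Fraction

data = [1, 1, 2, 2, 3]

# mode returns a single value (the first one encountered on ties) ...
first_mode = statistics.mode(data)
# ... while multimode returns every most-common value.
all_modes = statistics.multimode(data)

exact = [Fraction(x) for x in data]
# Sample variance divides by n - 1 (assumed to match bias=False).
sample_var = statistics.variance(exact)
# Population variance divides by n (assumed to match bias=True).
population_var = statistics.pvariance(exact)

# The standard deviations are the square roots of the two variances.
sample_sd = statistics.stdev(data)
population_sd = statistics.pstdev(data)

print(first_mode, all_modes, sample_var, population_var)
```

Because one old keyword argument splits into two separately named functions, a silent rename is not possible, which is why the deprecation warnings are needed.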

To my mind, it is better to have them as `statistics.mean`, `statistics.median`, etc. rather than in the global namespace. But that is a matter of personal taste.

> To my mind, it is better to have them as `statistics.mean`, `statistics.median`, etc. rather than in the global namespace. But that is a matter of personal taste.

The original idea of this ticket is to have the basic stats commands readily available from the global namespace.
