Opened 3 years ago

Last modified 12 months ago

#29663 new enhancement

Clarify and enhance descriptive statistics (and more)

Reported by: kcrisman Owned by:
Priority: major Milestone: sage-feature
Component: statistics Keywords:
Cc: dunfield Merged in:
Authors: Reviewers:
Report Upstream: N/A Work issues:
Branch: Commit:
Dependencies: #29662 Stopgaps:

Description (last modified by kcrisman)

Sage has some functionality for descriptive statistics in sage.stats. Unfortunately, it is very basic.

This ticket is for clarifying how that material relates to the Sage probability distributions, histograms, SciPy, GSL, and other libraries, perhaps including pandas, though it is not (yet) standard in Sage.

  • Ideally there would be interfaces to the best native Python functionality rather than something specific to Sage (though that may not be possible).
  • There could be a tutorial page in the reference manual documentation demonstrating best practices.
  • There could be a more education-oriented tutorial elsewhere, along the lines of the PREP Quickstart but more comprehensive.
  • As noted at #29662, Python 3 has a statistics module in its standard library, though presumably that module can't handle (say) the mean of several Integers or even stranger objects, as-is.
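
For reference, here is a minimal sketch of how the standard-library statistics module treats exact numeric types from plain Python. It uses only built-in int and fractions.Fraction values. Whether Sage's Integer or Rational behave the same way is not checked here and remains the open question of this ticket.

```python
from fractions import Fraction
from statistics import mean, median

# Plain Python ints: the mean falls back to float when it is not an integer.
ints = [1, 2, 3, 4]
int_mean = mean(ints)

# Exact rationals: the module keeps the result in the input type.
fracs = [Fraction(1, 3), Fraction(2, 3)]
frac_mean = mean(fracs)

# median only needs the values to be sortable and (for even counts) averageable.
frac_median = median([Fraction(1, 4), Fraction(1, 2), Fraction(3, 4)])

print(int_mean, frac_mean, frac_median)
```

If Sage types turn out not to work directly, one conceivable (untested) workaround is converting the data to int or Fraction before calling the module, at the cost of leaving the Sage type system.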

If all of those generate interest, this ticket would be converted to a metaticket to keep track of them.

Change History (6)

comment:1 Changed 3 years ago by dimpase

Dependencies: #29662

comment:2 Changed 3 years ago by kcrisman

Description: modified (diff)

comment:3 Changed 3 years ago by dunfield

Cc: dunfield added

I use pandas pretty heavily from within Sage (Python 2.7 version). The only problem I encounter has to do with pandas not recognizing Sage's Integer as an integer. Assuming one has the standard preparser on, you have to do things like:

dataframe.loc[int(100)]
dataframe.apply(some_function, axis=int(1))

to keep it happy.
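
The int(...) wrapping above can also be expressed generically. The sketch below is plain Python with no Sage or pandas dependency. FakeInteger is a hypothetical stand-in for an integer-like type that is not a subclass of int, and to_builtin_int is a hypothetical helper name; the assumption is only that the type implements __index__.

```python
import operator


class FakeInteger:
    """Hypothetical stand-in for an integer-like type that does not subclass int."""

    def __init__(self, value):
        self._value = value

    def __index__(self):
        # Lossless conversion hook used by operator.index() and by sequence indexing.
        return self._value


def to_builtin_int(x):
    """Return a built-in int for any object implementing __index__ (hypothetical helper)."""
    return operator.index(x)


key = FakeInteger(1)
rows = ["a", "b", "c"]

converted = to_builtin_int(key)   # a genuine Python int
picked = rows[converted]          # safe for libraries that check isinstance(key, int)
print(type(converted).__name__, picked)
```

Code that tests isinstance(key, int) rejects the wrapper object but accepts the converted value, which is the same effect as writing int(100) by hand.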

comment:4 in reply to:  3 ; Changed 3 years ago by kcrisman

I use pandas pretty heavily from within Sage (Python 2.7 version).

Hmm, yeah, that is exactly the kind of problem I expected (brian had some similar issues iirc). I assume you pip install it, since it is not included in our Python from the get-go, right?

comment:5 in reply to:  4 Changed 3 years ago by dunfield

Replying to kcrisman:

I assume you pip install it, not included in our Python from the get-go, right?

Yes, I just use pip install, which has always worked smoothly (though it takes a bit of time to compile). The main dependency is just a reasonably recent version of numpy, which of course Sage has.

comment:6 in reply to:  3 Changed 12 months ago by gh-sheerluck

Replying to dunfield:

pandas not recognizing Sage's Integer as an integer.

I added

from sage.rings.integer import Integer
if type(key) is Integer:
    ...

to pandas/core/indexes/{base,range}.py
