Opened 13 years ago
Closed 13 years ago
#7197 closed task (fixed)
basic statistics functions
Reported by: | amhou | Owned by: | amhou |
---|---|---|---|
Priority: | minor | Milestone: | sage-4.3 |
Component: | statistics | Keywords: | statistics, mean, median, mode, standard deviation |
Cc: | | Merged in: | sage-4.3.alpha1 |
Authors: | Andrew Hou | Reviewers: | William Stein |
Report Upstream: | N/A | Work issues: | |
Branch: | | Commit: | |
Dependencies: | | Stopgaps: | |
Description
Basic statistics functions for a new class in Sage. Only descriptive statistics right now.
Attachments (10)
Change History (25)
comment:1 Changed 13 years ago by
- Status changed from new to needs_review
comment:2 Changed 13 years ago by
- Status changed from needs_review to needs_work
comment:3 Changed 13 years ago by
I should also say I'm glad you are working on these! I was very surprised to learn a few weeks ago that Sage did not have a generic standard deviation function. We needed it in the class I was teaching!
Changed 13 years ago by
comment:4 Changed 13 years ago by
- Status changed from needs_work to needs_review
Patch added.
The arguments for std and variance are now "bias=True" (divide by n) or "bias=False" (divide by n-1).
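To make the meaning of the bias flag concrete, here is a minimal plain-Python sketch. It is not the code from the attached patch: the function bodies and the sample data are assumptions made for illustration, and only the names std, variance and bias come from this ticket.

```python
import math


def variance(v, bias=False):
    """Variance of a non-empty sequence of numbers.

    bias=True divides by n (population variance);
    bias=False divides by n - 1 (sample variance), so it needs n >= 2.
    """
    n = len(v)
    mu = sum(v) / n
    x = sum((a - mu) ** 2 for a in v)  # sum of squared deviations
    return x / n if bias else x / (n - 1)


def std(v, bias=False):
    """Standard deviation: the square root of the matching variance."""
    return math.sqrt(variance(v, bias=bias))


data = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative data, mean 5, squared deviations sum to 32
print(variance(data, bias=True))   # 32 / 8
print(variance(data, bias=False))  # 32 / 7
print(std(data, bias=True))
```

With a single data point the bias=False branch divides by zero, so a real implementation would have to decide what to return in that case.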
comment:5 Changed 13 years ago by
- Owner changed from mhampton to amhou
comment:6 Changed 13 years ago by
Is there any way to have "std_sample" and "std_population" (and the same for variance)? When teaching very basic statistics classes, we just refer to them as the population and sample std or variance. Having specific functions (as Excel or their calculators do) would make more sense to students.
Changed 13 years ago by
comment:7 Changed 13 years ago by
- Status changed from needs_review to needs_work
REFEREE REPORT:
- All tests pass in the entire tree after applying this.
- I'm OK with not adding std_sample and std_population, simply because R, MATLAB and Mathematica all don't, and the instructor can easily add some aliases for their class.
- Add copyright header block.
- Add a docstring section at the top with AUTHOR, overview of capabilities, etc.
- Don't import numpy at top level; it'll just get moved later since we should not import numpy/matplotlib/etc. at startup.
- For
def std(v, bias=False):
and any other function that handles special types, put in examples that illustrate that your code for handling these types works.
Fix all the above and I'll be happy with this patch!
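The aliases mentioned in the report could be sketched as follows. The std defined here is only a stand-in so that the example runs on its own; the names std_population and std_sample are the ones requested in comment 6, and nothing below is part of the patch.

```python
import math
from functools import partial


def std(v, bias=False):
    """Stand-in for the ticket's std(v, bias=False); an assumption for illustration."""
    n = len(v)
    mu = sum(v) / n
    x = sum((a - mu) ** 2 for a in v)
    return math.sqrt(x / n if bias else x / (n - 1))


# Classroom-friendly aliases: each one fixes the bias argument in advance.
std_population = partial(std, bias=True)   # divide by n
std_sample = partial(std, bias=False)      # divide by n - 1

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative data
print(std_population(scores))
print(std_sample(scores))
```

An instructor could put the two alias lines in a worksheet or an init file, which keeps the library interface to a single std function.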
Changed 13 years ago by
comment:8 Changed 13 years ago by
- Status changed from needs_work to needs_review
Changed 13 years ago by
comment:9 Changed 13 years ago by
REPORT 2:
- a little too spartan:
"""
Basic Statistics

This file contains basic descriptive functions.

AUTHOR:
- Andrew Hou (11/06/2009)
...
"""
- Make sure there is a test that tests this code:
"""
if hasattr(v, 'mean'): return v.mean()
- Same for mode:
if hasattr(v, 'mode'): return v.mode()
- Same for this:
if hasattr(v, 'standard_deviation'): return v.standard_deviation(bias=bias)
- Type checking in python should always use isinstance:
if type(v) is numpy.ndarray:
if type(v) == numpy.ndarray:
should be
if isinstance(v, numpy.ndarray):
- Test this:
if hasattr(v, 'variance'): return v.variance(bias = bias)
- Change this:
if bias == True: # population variance
    if isinstance(x, (int,long)): return x/ZZ(len(v))
    return x/len(v)
elif bias == False:
to
if bias: # population variance
    if isinstance(x, (int,long)): return x/ZZ(len(v))
    return x/len(v)
else:
- Make sure this is tested:
if hasattr(v, 'median'): return v.median()
- Weird """ in moving_average:
{{{
"""
x = []
}}}
- Change
bin_size = len(v)/bins
to floor division:
bin_size = int(len(v)//bins)
- You can do this at the very end of each docstring if you want...
AUTHOR:
- Andrew Hou (11/06/2009)
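Several points in this report (testing the hasattr branches, preferring isinstance, and testing "if bias:" rather than "bias == True") can be illustrated with a small self-contained sketch. The classes Sample and TaggedList are hypothetical and exist only to exercise the branches; the function bodies are assumptions and are not taken from the patch.

```python
class Sample(object):
    """Hypothetical container that supplies its own mean() and variance()."""

    def __init__(self, values):
        self.values = list(values)
        self.calls = []  # records which methods the dispatch reached

    def mean(self):
        self.calls.append('mean')
        return sum(self.values) / len(self.values)

    def variance(self, bias=False):
        self.calls.append(('variance', bias))
        n = len(self.values)
        mu = sum(self.values) / n
        x = sum((a - mu) ** 2 for a in self.values)
        return x / n if bias else x / (n - 1)


def mean(v):
    # Defer to the object's own method when it has one.
    if hasattr(v, 'mean'):
        return v.mean()
    return sum(v) / len(v)


def variance(v, bias=False):
    if hasattr(v, 'variance'):
        return v.variance(bias=bias)
    n = len(v)
    mu = sum(v) / n
    x = sum((a - mu) ** 2 for a in v)
    if bias:   # population variance
        return x / n
    else:      # sample variance
        return x / (n - 1)


class TaggedList(list):
    """Hypothetical list subclass, used to compare type() with isinstance()."""


# The hasattr branch is taken for Sample and skipped for a plain list.
s = Sample([1, 2, 3, 4])
assert mean(s) == 2.5 and s.calls == ['mean']
assert variance(s, bias=True) == 1.25 and s.calls[-1] == ('variance', True)
assert mean([1, 2, 3, 4]) == 2.5

# isinstance accepts subclasses, while an exact type() comparison rejects them.
t = TaggedList([1, 2, 3, 4])
assert isinstance(t, list)
assert type(t) is not list
```

The assertions at the end play the role of the tests requested above: each one fails if the corresponding branch is removed or never reached.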
comment:10 Changed 13 years ago by
- Status changed from needs_review to needs_work
Changed 13 years ago by
comment:11 Changed 13 years ago by
- Status changed from needs_work to needs_review
comment:12 Changed 13 years ago by
Issues:
- Delete "Included as of 11/06/2009" and reword.
- Fix: "returns the most common occuring member of a sample." (and occurring is the right spelling)
- "Functions have also been imported under the namespace 'stats'." Change to not use the passive voice. I.e., "The functions are available in the namespace stats, i.e., you can use them by typing stats.mean, stats.median, etc."
- Change all ' to `` in the top section. `` (two backquotes as separate characters) means "monospace" in ReST markup.
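As a rough illustration of the docstring conventions requested in this comment, here is a hypothetical mode function whose docstring uses double backquotes for monospace and the active voice. The wording and the implementation are assumptions, not the text that was merged.

```python
from collections import Counter


def mode(v):
    """
    Return a sorted list of the most frequently occurring members of ``v``.

    The functions are available in the namespace ``stats``, i.e., you can
    use them by typing ``stats.mean``, ``stats.median``, etc.
    """
    counts = Counter(v)
    if not counts:
        return []
    top = max(counts.values())  # highest frequency in the sample
    return sorted(x for x, c in counts.items() if c == top)


print(mode([1, 2, 2, 3, 3]))
```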
Changed 13 years ago by
comment:13 Changed 13 years ago by
- Report Upstream set to N/A
- Status changed from needs_review to positive_review
comment:14 Changed 13 years ago by
- Reviewers set to William Stein
comment:15 Changed 13 years ago by
- Merged in set to sage-4.3.alpha1
- Resolution set to fixed
- Status changed from positive_review to closed
Some comments: