Ticket #7197: trac_7197_part_7.patch

File trac_7197_part_7.patch, 7.2 KB (added by amhou, 12 years ago)
• sage/stats/basic_stats.py

```
# HG changeset patch
# User Andrew Hou <amhou@uw.edu>
# Date 1258418752 28800
# Node ID faf0b8837f76ed1e651646126bfc09ec42909904
# Parent  9ee1085d76f4763b648cda04791f4b5bc45f9b6a
Fixed documentation.

diff -r 9ee1085d76f4 -r faf0b8837f76 sage/stats/basic_stats.py
 """
 Basic Statistics
-This file contains basic descriptive functions. Included as of 11/06/2009 are
-the mean, median, mode, moving average, standard deviation, and the variance.
-When calling a function on data, there are checks for functions already
-defined for that data type.
+This file contains basic descriptive functions. Included are the mean,
+median, mode, moving average, standard deviation, and the variance.
+When calling a function on data, there are checks for functions already
+defined for that data type.
-The 'mean' function returns the arithmetic mean (the sum of all the members
+The ``mean`` function returns the arithmetic mean (the sum of all the members
 of a list, divided by the number of members). Further revisions may include
-the geometric and harmonic mean. The 'median' function returns the number
-separating the higher half of a sample from the lower half. The 'mode' returns
-the most common occuring member of a sample. The 'moving average' is a finite
-impulse response filter, creating a series of averages using a user-defined
-number of subsets of the full data set. The 'standard deviation' and the
-'variance' return a measurement of how far data points tend to be from the
-arithmetic mean.
+the geometric and harmonic mean. The ``median`` function returns the number
+separating the higher half of a sample from the lower half. The ``mode``
+returns the most common occuring member of a sample, plus the number of times
+it occurs. If entries occur equally common, a list of the most common entries
+are returned. The ``moving average`` is a finite impulse response filter,
+creating a series of averages using a user-defined number of subsets of the
+full data set. The ``standard deviation`` and the ``variance`` return a
+measurement of how far data points tend to be from the arithmetic mean.
-Functions have also been imported under the namespace 'stats'.
+Functions are available in the namespace ``stats``, i.e.
+you can use them by typing ``stats.mean``, ``stats.median``, etc.
 AUTHOR:
 - Andrew Hou (11/06/2009)
 """
 def mean(v):
     """
-    Return the mean of the elements of `v`.
+    Return the mean of the elements of ``v``.
     We define the mean of the empty list to be NaN, following the
     convention of MATLAB, Scipy, and R.
     INPUT:
-    - `v` -- a list of numbers
+    - ``v`` -- a list of numbers
     OUTPUT:

 def mode(v):
     """
-    Return the mode (most common) of the elements of 'v'
+    Return the mode (most common) of the elements of ``v``
-    If 'v' is empty, we define the mode to be null.
+    If ``v`` is empty, we define the mode to be null.
     If all elements occur only once, we define the mode to be null.
     If multiple elements occur at the same frequency, all will be displayed.
     INPUT:
-    - 'v' -- a list
+    - ``v`` -- a list
     OUTPUT:

         []
         sage: mode(['sage', 4, I, 3/5, 'sage', pi])
         [('sage', 2)]
+        sage: class MyClass:
+        ...     def mode(self):
+        ...         return 1
+        sage: stats.mode(MyClass())
+        1
     """
     if hasattr(v, 'mode'): return v.mode()
     from operator import itemgetter
     freq = {}
     for i in v:
         try:

 def std(v, bias=False):
     """
-    Returns the standard deviation of the elements of 'v'.
+    Returns the standard deviation of the elements of ``v``
     We define the standard deviation of the empty list to be NaN,
     following the convention of MATLAB, Scipy, and R.
     INPUT:
-    - 'v' -- a list of numbers
+    - ``v`` -- a list of numbers
-    - bias -- bool (default: False); if False, divide by
-      len(v) - 1 instead of len(v)
-      to give a less biased estimator (sample) for the standard deviation.
+    - ``bias`` -- bool (default: False); if False, divide by
+      len(v) - 1 instead of len(v)
+      to give a less biased estimator (sample) for the standard deviation.
     OUTPUT:

 def variance(v, bias=False):
     """
-    Returns the variance of the elements of 'v'.
+    Returns the variance of the elements of ``v``
     We define the variance of the empty list to be NaN,
     following the convention of MATLAB, Scipy, and R.
     INPUT:
-    - 'v' -- a list of numbers
+    - ``v`` -- a list of numbers
-    - bias -- bool (default: False); if False, divide by
-      len(v) - 1 instead of len(v)
-      to give a less biased estimator (sample) for the standard deviation.
+    - ``bias`` -- bool (default: False); if False, divide by
+      len(v) - 1 instead of len(v)
+      to give a less biased estimator (sample) for the standard deviation.
     OUTPUT:

         841.66666666666663
         sage: variance(x, bias=True)
         833.25
+        sage: class MyClass:
+        ...     def variance(self, bias = False):
+        ...        return 1
+        sage: stats.variance(MyClass())
+        1
     """
+    if hasattr(v, 'variance'): return v.variance(bias=bias)
     import numpy
     x = 0

             return v.var()
         elif bias == False:
             return v.var(ddof=1)
-    if hasattr(v, 'variance'): return v.variance(bias = bias)
     if len(v) == 0:
         # variance of empty set defined as NaN
         return NaN

 def median(v):
     """
-    Return the median (middle value) of the elements of 'v'
+    Return the median (middle value) of the elements of ``v``
-    If 'v' is empty, we define the median to be null.
-    If 'v' is comprised of strings, TypeError occurs.
-    For elements other than numbers, the median is a result of 'sorted()'
+    If ``v`` is empty, we define the median to be null.
+    If ``v`` is comprised of strings, TypeError occurs.
+    For elements other than numbers, the median is a result of ``sorted()``
     INPUT:
-    - 'v' -- a list
+    - ``v`` -- a list
     OUTPUT:
-    - median element of 'v'
+    - median element of ``v``
     EXAMPLES::

     cut up into that number of bins. Then, the mean of each bin is
     calculated, and appended into a new list.
-    If 'v' is empty, we define the entries of the moving average to be NaN.
+    If ``v`` is empty, we define the entries of the moving average to be NaN.
     INPUT:
-    - v -- a list
+    - ``v`` -- a list
-    - bins -- number of bins, default set to 1
+    - ``bins`` -- number of bins, default set to 1
     OUTPUT:
```
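For readers without a Sage build at hand, the behaviour the docstrings describe can be approximated in plain Python. This is only a rough sketch, not the code from `sage/stats/basic_stats.py`: it assumes hashable elements, represents the "null" mode as an empty list, uses `math.nan` where Sage returns `NaN`, and returns NaN for a one-element unbiased variance. `MyClass` mirrors the doctest in the patch.

```python
import math
from collections import Counter


def mode(v):
    """Most common elements of v as (element, count) pairs."""
    # Defer to a method already defined on the data, as in the doctest.
    if hasattr(v, "mode"):
        return v.mode()
    freq = Counter(v)  # assumption: elements are hashable
    top = max(freq.values(), default=0)
    if top <= 1:
        # Empty input, or every element occurs once: "null" mode,
        # represented here as an empty list (an assumption).
        return []
    return [(item, n) for item, n in freq.items() if n == top]


def variance(v, bias=False):
    """Variance of v; divides by len(v) - 1 unless bias is True."""
    if hasattr(v, "variance"):
        return v.variance(bias=bias)
    n = len(v)
    if n == 0:
        return math.nan  # variance of the empty list defined as NaN
    if n == 1 and not bias:
        return math.nan  # assumption: avoids dividing by zero
    mean = sum(v) / n
    squares = sum((x - mean) ** 2 for x in v)
    return squares / (n if bias else n - 1)


class MyClass:
    """Stand-in for a data type that brings its own mode()."""

    def mode(self):
        return 1


print(mode(MyClass()))                      # 1
print(mode(["sage", 4, "sage", 3.5]))       # [('sage', 2)]
print(variance(range(1, 101), bias=True))   # 833.25
```

Checking `hasattr` before anything else means a type that defines its own `mode` or `variance` is never passed through the generic list code.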