# Student's t-test in Python

Imagine you have a stochastic process, a function that makes random
decisions, and that you want to estimate the average of a given quantity
coming out of this process. One run of the process gives you one sample of the
desired quantity. To estimate the average, you could collect a list of
samples, running the process again and again until the variance of this list is
low enough. But how low is low enough when you don't even know the mean *a
priori*?
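As a concrete (hypothetical) example, the stochastic process could be as simple as a noisy measurement; the name `noisy_measurement` and its parameters are made up for illustration:

```python
import random

def noisy_measurement():
    """A toy stochastic process: a fixed signal plus Gaussian noise.

    The true mean is 5.0, but a single run only gives one noisy sample.
    """
    return 5.0 + random.gauss(0.0, 2.0)

# Each call is one run of the process, hence one sample.
samples = [noisy_measurement() for _ in range(5)]
```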

Student's t-test provides an answer to this question using the unbiased estimator of the standard deviation. (If you do the math, you will note that it is not as simple as replacing the actual variance by its estimator in a Chebyshev inequality.) Here is how to code it in Python, using `numpy` and `scipy`:

```python
from math import sqrt

import numpy as np
from scipy import stats

def student_test(samples, prec, confidence=0.9, min_samples=10):
    """Student's t-test.

    samples -- list of numbers generated from your stochastic process
    prec -- desired distance between the empirical and real mean
    confidence -- desired probability of a correct answer
    min_samples -- minimum number of samples
    """
    assert 0. < confidence < 1.
    n = len(samples)
    if n < min_samples:
        return False
    # Unbiased estimator of the standard deviation (ddof=1).
    sigma = np.std(samples, ddof=1)
    # Two-sided quantile: we bound |empirical mean - true mean|.
    quantile = stats.t.ppf((1. + confidence) / 2., n - 1)
    return quantile * sigma / sqrt(n) < prec
```

This function returns `True` when the empirical mean of your list of
samples is less than `prec` away from the theoretical one (statistically,
this holds under the hypothesis that all samples are *i.i.d.* and either
follow a normal distribution or are numerous enough for the central limit
theorem to apply). The `confidence` parameter tunes the tolerance to failure:
when samples are drawn from a normal distribution, it is exactly the
probability that the test returns a correct answer. Of course, smaller values
of `prec` or higher values of `confidence` tend to require more samples.

```python
import numpy as np

def run_until_good_mean_estimate(f):
    """Generate a list of samples using Student's t-test.

    Generate samples from a stochastic function f (no argument, returns
    a real number) until the list of generated samples provides a good
    estimate of the mean return value of f().
    """
    prec, samples = 1., []
    while not student_test(samples, prec):
        samples.append(f())
        mu = np.mean(samples)
        # abs() guards against a negative empirical mean
        # turning prec negative.
        prec = abs(mu) / 100.
    print("Mean f(): %.2f +/- %.2f" % (mu, prec))
    return samples
```

Here, we fed back one percent of the empirical mean as the precision parameter
for Student's t-test. As a rule of thumb, when a better value for this
parameter is not known *a priori*, this yields a result with roughly two
significant digits of precision.
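Putting it together, here is a hypothetical end-to-end run on a Gaussian process with known mean 3.0, restating both functions so the sketch is self-contained:

```python
import random
from math import sqrt

import numpy as np
from scipy import stats

def student_test(samples, prec, confidence=0.9, min_samples=10):
    """Student's t-test, as described above."""
    assert 0. < confidence < 1.
    n = len(samples)
    if n < min_samples:
        return False
    sigma = np.std(samples, ddof=1)
    quantile = stats.t.ppf((1. + confidence) / 2., n - 1)
    return quantile * sigma / sqrt(n) < prec

def run_until_good_mean_estimate(f):
    """Sample f() until the t-test accepts a one-percent precision."""
    prec, samples = 1., []
    while not student_test(samples, prec):
        samples.append(f())
        mu = np.mean(samples)
        prec = abs(mu) / 100.
    print("Mean f(): %.2f +/- %.2f" % (mu, prec))
    return samples

random.seed(0)
# True mean of f() is 3.0; the loop stops once the half-width of the
# confidence interval drops below one percent of the empirical mean.
samples = run_until_good_mean_estimate(lambda: random.gauss(3.0, 0.5))
```

Note that the number of samples the loop collects grows quadratically as the precision target shrinks, so a one-percent target on a high-variance process can take a while.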