1 Statistical inference, estimation, hypothesis testing

1.1 A bit of philosophy about statistical decision making

Here are some quotations about the point of statistics:

Hultquist, R.A. (1969). Introduction to Statistics. New York: Holt, Rinehart and Winston.

"Statistics is a science that concerns itself with experimentation and the collection, description and analysis of data... Statistical methods are tools for examining data."

Kendall, M.G. & Stuart, A. (1969). The Advanced Theory of Statistics, Vol. 1, 3rd Edn. London: Griffin.

"Statistics is the branch of scientific method which deals with data obtained by counting or measuring the properties of natural phenomena."

Keeping, E.S. (1962). Introduction to Statistical Inference. New York: Van Nostrand.

"Statistics is concerned with variables that fluctuate in a more or less unpredictable way... We call this characteristic randomness..."

Chernoff, H. & Moses, L.E. (1959). Elementary Decision Theory. New York: Wiley.

"Years ago a statistician might have claimed that statistics deals with the processing of data... today's statistician will be more likely to say that statistics is concerned with decision making in the face of uncertainty."

Bartlett, M.S. (1962). Essays on Probability and Statistics. London: Methuen.

"Statistics is concerned with things we can count. In so far as things or persons are unique or ill defined, statistics are meaningless and statisticians silenced; in so far as things are similar and definite - so many male workers over 25, so many nuts and bolts made during December - they can be counted and new statistical facts are born."

The theory of statistics takes the real data arising from a practical situation and uses these data to validate a specific model, to make 'rational guesses' or 'estimates' of the numerical values of relevant parameters, or even to originate a model. This reverse, inductive process is possible only because the 'language' of probability theory is available to form the deductive link. Its aim is to enable inferences to be drawn about the model from the information provided by the sample data, or to construct procedures that aid the taking of decisions relevant to the practical situation. The idea of statistical decision making is illustrated in Figure 1, from Barnett, V. (1973). Comparative Statistical Inference. London: John Wiley & Sons.

Figure 1

Statistical decision-making increases the reliability of the research, because we can generalise the results from our data to the larger population. On the other hand, testing hypotheses does not in itself make the research any more scientific. It is likely, though, that such tests do not prevent researchers from making serious mistakes in their conclusions.

Statistical decision-making has no meaning in itself. It becomes meaningful only as part of the research process. Figure 2 shows the role of statistical decision-making.

Figure 2

In Figure 2, the decision-making process runs clockwise, and the statistical principles and justification run anticlockwise. These principles and justification are based on the notion of sampling, on the sampling distribution, and on the fact that we tend not to believe in small probabilities (less than 0.05). If we are carrying out an experiment, we replace the population with the behaviour of the values in the experiment, and the sample with the results of the experiment.

Although the hypothesis test rests on this theoretical background, the results of hypothesis tests (the p-values) can be understood as a continuation of data description: the smaller the p-value, the stronger the evidence of dependency. For example, p = 0.002 indicates a more reliable dependency than p = 0.05. The researcher sees the dependency in the data description, but it must be reported and confirmed by a test.

1.2 Estimating and testing statistics

Given the distribution of a variable in a population, we can derive the distributions of various quantities, such as the mean and variance, calculated from sample observations. Such a quantity is called a statistic. These results are of direct interest in the planning of sampling enquiries, as they enable the investigator to estimate the precision attainable with a sample of a given size, and hence help to decide how large a sample should be taken.

When the sample has been taken, what sort of inferences can be drawn about the population on the basis of the sample? We do not know the characteristics of the population. We have taken one random sample and wish to use our knowledge of sampling theory to make whatever inference can be made about the population. One fundamental difficulty usually arises. The expressions for sampling variation given by the various formulae for standard errors or variances usually involve some parameters of the population. For instance, the standard error of the sample mean is σ/√n. If we are attempting to make an inference about a normal distribution on the basis of one random sample, we know the sample size, n, but not the population standard deviation, σ. We cannot, therefore, calculate the standard error exactly.
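A common workaround, not spelled out in the text above, is to substitute the sample standard deviation s for the unknown σ and report s/√n as an estimated standard error. The following is a minimal sketch of that idea; the function name `estimated_standard_error` and the sample values are invented for illustration.

```python
import math
import statistics


def estimated_standard_error(sample):
    """Return s / sqrt(n), a plug-in estimate of the standard error of the mean."""
    n = len(sample)
    s = statistics.stdev(sample)  # sample standard deviation (n - 1 denominator)
    return s / math.sqrt(n)


# Hypothetical IQ-like measurements, made up for this illustration only.
sample = [98, 105, 110, 93, 102, 107, 99, 101]
se_hat = estimated_standard_error(sample)
print(f"n = {len(sample)}, mean = {statistics.mean(sample):.2f}, "
      f"estimated SE = {se_hat:.2f}")
```

Because s is itself subject to sampling variation, the result is only an estimate of σ/√n, which is the difficulty the paragraph above points out.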

Testing hypotheses is in fact making a decision between two hypotheses, H0 and H1. In this decision we can make two kinds of error: accepting H1 when H0 is correct, or accepting H0 when H1 is correct. This gives the following table.


              True
Selected      H0           H1
H0            OK           β
H1            p (or α)     OK



Here p (or α) is the probability of wrongly selecting H1 when H0 is correct, and β is the probability of wrongly selecting H0 when H1 is correct. We are mainly interested in p.

Example 1

A researcher claims that the mean IQ of students is 110, whereas the expected value in the whole population is 100, with a standard deviation of 10. Here we can test the null hypothesis

H0: µ0 = 100

against the alternative hypothesis

H1: µ1 = 110.

If the null hypothesis is correct, the IQ is N(100, 10²)-distributed. If the alternative hypothesis is correct, the IQ is N(110, 10²)-distributed. The sampling distribution of the mean for sample size n = 16 is N(100, 10²/16) under H0 and N(110, 10²/16) under H1.

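If we draw the conclusion at p < 0.05, we can calculate the cut point c. If the sample mean is greater than c, we reject H0. If the mean is smaller than c, we accept H0. Because a continuous distribution assigns no probability to exact values, we can calculate c from the following formulas:

\[ P(\bar{x} \ge c \mid H_0) = 0.05 \;\Rightarrow\; \frac{c - 100}{10/\sqrt{16}} = 1.645 \;\Rightarrow\; c \approx 104.11 \]

\[ \beta = P(\bar{x} < c \mid H_1) = \Phi\!\left(\frac{104.11 - 110}{10/\sqrt{16}}\right) = \Phi(-2.36) \approx 0.009 \]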

Here p (or α) is the significance level: the probability of obtaining a sample mean at least as large as c when the sample comes from the population where µ = 100. It is therefore the probability of drawing a wrong conclusion when we reject a null hypothesis that is true. β is the risk (probability) of drawing a wrong conclusion when we accept the null hypothesis although the alternative is true. Instead of β, the quantity 1 - β is often used; it is called the power of the test (here 0.991). The probabilities can be seen in Figure 3:
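The numbers in Example 1 can be checked numerically. The sketch below assumes a one-sided test at the 0.05 level with known σ = 10, which is the reading that gives the power of 0.991 quoted above; the variable names are chosen here for illustration.

```python
import math
from statistics import NormalDist

mu0, mu1 = 100.0, 110.0   # means under H0 and H1 (Example 1)
sigma, n = 10.0, 16       # population standard deviation and sample size
alpha = 0.05              # one-sided significance level (assumed)

se = sigma / math.sqrt(n)                    # standard error of the mean
c = NormalDist(mu0, se).inv_cdf(1 - alpha)   # cut point: P(mean >= c | H0) = alpha
beta = NormalDist(mu1, se).cdf(c)            # P(mean < c | H1)
power = 1 - beta

print(f"c = {c:.2f}, beta = {beta:.4f}, power = {power:.3f}")
```

The sample mean has to exceed roughly 104.1 before H0 is rejected, and the power rounds to the 0.991 given in the text.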

From the previous formulas we can see the effect of the sample size. When the sample size increases, the standard error of the mean decreases, and we get smaller p-values and greater power for the test. We can also calculate the required number of cases (n) if we know the clinically relevant difference (µ0 - µ1), the standard deviation of the variable (σ), the significance level (p = 0.05) and the required power of the test (1 - β). This is not a very practical way to define the sample size, but it shows the dependency between significance, test power, sample size, required difference and deviation.
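One way to make this dependency concrete is to solve the two cut-point conditions for n. For a one-sided test with known σ this gives n = ((z_α + z_β) σ / (µ1 - µ0))², where z_α and z_β are the standard normal quantiles for 1 - α and for the power. The sketch below assumes exactly that setting; the function name `required_sample_size` and the default power of 0.99 are choices made here, not taken from the text.

```python
import math
from statistics import NormalDist


def required_sample_size(mu0, mu1, sigma, alpha=0.05, power=0.99):
    """Smallest n for a one-sided z-test with known sigma (illustrative sketch)."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha)  # quantile for the significance level
    z_beta = std_normal.inv_cdf(power)       # quantile for the required power
    n_exact = ((z_alpha + z_beta) * sigma / abs(mu1 - mu0)) ** 2
    return math.ceil(n_exact)                # round up to a whole number of cases


n_example = required_sample_size(100, 110, 10)  # difference of 10, as in Example 1
n_halved = required_sample_size(100, 105, 10)   # half the difference
print(n_example, n_halved)
```

Halving the difference to be detected roughly quadruples the required sample size, because n depends on the squared ratio of σ to the difference.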

1.3 Models in the hypothesis tests

All hypothesis tests can be understood as a test between two models. We want to fit a model to our data, and the test tells us which model fits better. For example, the hypotheses of the previous example can be written as

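The information about H0 in a sample of size n can be expressed through the likelihood function L(µ0 = 100):

\[ L(\mu_0) = \prod_{i=1}^{n} \frac{1}{\sigma\sqrt{2\pi}} \exp\!\left(-\frac{(x_i - \mu_0)^2}{2\sigma^2}\right) \]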

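The likelihood ratio test is based on the ratio of the likelihood functions, or equivalently on the difference of their logarithms:

\[ \lambda = \frac{L(\mu_0)}{L(\hat{\mu})}, \qquad \log\lambda = \log L(\mu_0) - \log L(\hat{\mu}), \qquad \hat{\mu} = \bar{x} \]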

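The quantity -2 log λ, where λ is the likelihood ratio, is chi-square distributed with one degree of freedom under H0. For the normal model with known σ it equals n(x̄ - µ0)²/σ², so taking its square root gives the earlier normally distributed test value z = (x̄ - µ0)/(σ/√n).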

Here we obtained the usual test for a single mean directly from the hypotheses, using the idea of the likelihood ratio test. The point is that we can define a test for any hypotheses using the likelihood ratio idea. All we need is a well-defined hypothesis H0.
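The link between the likelihood ratio and the test for a single mean can be checked numerically. The sketch below assumes a normal model with known σ = 10 and an alternative in which µ is estimated by the sample mean; the 16 sample values are invented for illustration and `log_likelihood` is a helper name introduced here.

```python
import math
from statistics import NormalDist, mean


def log_likelihood(sample, mu, sigma):
    """Log-likelihood of a normal model with mean mu and known sigma."""
    dist = NormalDist(mu, sigma)
    return sum(math.log(dist.pdf(x)) for x in sample)


# Hypothetical sample of n = 16 IQ scores, made up for this illustration.
sample = [112, 98, 105, 121, 94, 108, 101, 115,
          99, 110, 103, 96, 118, 107, 92, 109]
mu0, sigma = 100.0, 10.0
n = len(sample)
x_bar = mean(sample)

# -2 log(lambda), with lambda = L(mu0) / L(x_bar)
minus_two_log_lambda = -2 * (log_likelihood(sample, mu0, sigma)
                             - log_likelihood(sample, x_bar, sigma))

# Usual test value for a single mean with known sigma
z = (x_bar - mu0) / (sigma / math.sqrt(n))

print(f"-2 log lambda = {minus_two_log_lambda:.4f}, z^2 = {z ** 2:.4f}")
```

The two printed values agree up to floating-point rounding, which is the sense in which the square root of -2 log λ reproduces the normal test value.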