Updated: 13 April, 2007
Information on group presentations added
  Bio 184 Experiment 5
A. Linkage Mapping in Drosophila melanogaster
This page is sectioned into three categories:

Probability

DEFINITION: Probability is the measure of how likely an event is.

Basis: This external link gives a pretty reasonable introduction to probability. Check it out! And then there is the next evolution here.

The biggest challenge for the student is knowing which principle to apply when. One of the best ways to learn this is to memorize a few basic examples that you can recall as mental reminders. I find it very helpful to picture a 6-sided die!
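As one way to keep the die in mind, the short Python sketch below works through three die calculations. The choice of rules (the sum rule for mutually exclusive outcomes and the product rule for independent rolls) is an assumption about which principles are meant here, and the names FACES and prob_of are illustrative only.

```python
from fractions import Fraction

# One fair six-sided die: every face is equally likely.
FACES = [1, 2, 3, 4, 5, 6]

def prob_of(event_faces):
    """Probability that a single roll lands on one of the given faces."""
    favourable = [face for face in FACES if face in event_faces]
    return Fraction(len(favourable), len(FACES))

# Single event: rolling a 6.
p_six = prob_of({6})

# Sum rule (mutually exclusive outcomes): rolling a 1 OR a 2.
p_one_or_two = prob_of({1}) + prob_of({2})

# Product rule (independent rolls): a 6 on the first roll AND a 6 on the second.
p_two_sixes = prob_of({6}) * prob_of({6})

print(p_six, p_one_or_two, p_two_sixes)  # 1/6 1/3 1/36
```

Exact fractions are used instead of decimals so that the results can be checked by hand.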


Null Hypothesis
DEFINITION: The null hypothesis is the hypothesis being tested. "The hypothesis that the restriction or set of restrictions to be tested does in fact hold."
The null hypothesis is a term that statisticians often use to indicate the statistical hypothesis tested. The purpose of most statistical tests is to determine if the obtained results provide a reason to reject the hypothesis that they are merely a product of chance factors. For example, in an experiment in which two groups of randomly selected subjects have received different treatments and have yielded different means, it is always necessary to ask if the difference between the obtained means is among the differences that would be expected to occur by chance whenever two groups are randomly selected. In this example, the hypothesis tested is that the two samples are from populations with the same mean. Another way to say this is to assert that the investigator tests the null hypothesis that the difference between the means of the populations from which the samples were drawn is zero. If the difference between the means of the samples is among those that would occur rarely by chance when the null hypothesis is true, the null hypothesis is rejected and the investigator describes the results as statistically significant.
The null hypothesis is a hypothesis about a population parameter. The purpose of hypothesis testing is to test the viability of the null hypothesis in the light of experimental data. Depending on the data, the null hypothesis either will or will not be rejected as a viable possibility.
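The two-group example can be made concrete with a small exact permutation test, sketched below in Python. This is only one of several ways to ask whether a difference in means is "among the differences expected by chance"; the measurements are made-up numbers, and the function names are hypothetical.

```python
from itertools import combinations

# Hypothetical measurements for two randomly assigned groups (made-up numbers).
group_a = [12.1, 11.4, 13.0, 12.7, 11.9]
group_b = [10.2, 11.0, 10.8, 11.6, 10.5]

def mean(values):
    return sum(values) / len(values)

def permutation_p_value(a, b):
    """Fraction of all possible regroupings whose difference in means is at
    least as large (in absolute value) as the difference actually observed."""
    pooled = a + b
    observed = abs(mean(a) - mean(b))
    as_extreme = 0
    total = 0
    # Under the null hypothesis the group labels are arbitrary, so every way
    # of choosing len(a) subjects for the first group is equally likely.
    for idx in combinations(range(len(pooled)), len(a)):
        chosen = set(idx)
        first = [pooled[i] for i in idx]
        second = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        # Small tolerance guards against floating-point rounding.
        if abs(mean(first) - mean(second)) >= observed - 1e-12:
            as_extreme += 1
        total += 1
    return as_extreme / total

p = permutation_p_value(group_a, group_b)
print(f"p = {p:.4f}")
```

A small value of p says that a difference this large would occur rarely by chance if the null hypothesis were true, which is the situation in which the null hypothesis is rejected.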

The null hypothesis is often the reverse of what the experimenter actually believes; it is put forward to allow the data to contradict it.

 

The Experimental Method

Throughout the laboratory portion of most biology courses, you will be conducting experiments. Science proceeds by use of the experimental method. This handout provides a summary of the steps that are used in pursuing scientific research. This general method is used not only in biology but in chemistry, physics, geology and other hard sciences.

To gather information about the biological world, we use two mechanisms: our sensory perception and our ability to reason. We can identify and count the types of trees in a forest with our eyes, we can identify birds in the rainforest canopy with our ears, and we can identify the presence of a skunk with our nose. Touch and taste help us experience the biological world as well. With the information we gather from our senses, we can make inferences using our reason and logic. For instance, you know that you see palm trees in tropical and subtropical regions and can infer that palm trees will not be found in central Maine because of the harshness of our winter.

Our reason allows us to make predictions about the natural world. Scientists attempt to predict and perhaps control future events based on present and past knowledge. The ability to make accurate predictions hinges on the seven steps of the Scientific Method.

Step 1. Make observations. These observations should be objective, not subjective. In other words, the observations should be capable of verification by other scientists. Subjective observations, which are based on personal opinions and beliefs, are not in the realm of science. Here’s an objective statement: It is 58 °F in this room. Here’s a subjective statement: It is cool in this room.

The first step in the Scientific Method is to make objective observations. These observations are based on specific events that have already happened and can be verified by others as true or false.

Step 2. Form a hypothesis. Our observations tell us about the past or the present. As scientists, we want to be able to predict future events. We must therefore use our ability to reason.

Scientists use their knowledge of past events to develop a general principle or explanation to help predict future events. The general principle is called a hypothesis. The type of reasoning involved is called inductive reasoning (deriving a generalization from specific details).

A hypothesis should have the following characteristics:

• It should be a general principle that holds across space and time

• It should be a tentative idea

• It should agree with available observations

• It should be kept as simple as possible.

• It should be testable and potentially falsifiable. In other words, there should be a way to show the hypothesis is false; a way to disprove the hypothesis.

"Some mammals have two hind limbs" would be a useless hypothesis. There is no observation that would not fit this hypothesis!

"All mammals have two hind limbs" is a good hypothesis. We would look throughout the world at mammals. When we find whales, which have no hind limbs, we would have shown our hypothesis to be false; we have falsified the hypothesis.

When a hypothesis involves a cause-and-effect relationship, we state our hypothesis to indicate there is no effect. A hypothesis that asserts no effect is called a null hypothesis. For instance: The drug Celebra does not help relieve rheumatoid arthritis.

Step 3. Make a prediction. From step 2, we have made a hypothesis that is tentative and may or may not be true. How can we decide if our hypothesis is true?

Our hypothesis should be broad; it should apply uniformly through time and through space. Scientists cannot usually check every possible situation where a hypothesis might apply. Let’s consider the hypothesis: All plant cells have a nucleus. We cannot examine every living plant and every plant that has ever lived to see if this hypothesis is false. Instead, we generate a prediction using deductive reasoning (generating a specific expectation from a generalization). From our hypothesis, we can make the following prediction: If I examine cells from a blade of grass, each one will have a nucleus.

Now, let’s consider the drug hypothesis: The drug Celebra does not help relieve rheumatoid arthritis. To test this hypothesis, we would need to choose a specific set of conditions and then predict what would happen under those conditions if the hypothesis were true. Conditions you might wish to test are doses administered, length of time the medication is taken, the ages of the patients and the number of people to be tested.

All of these conditions that are subject to change are called variables. To gauge the effect of Celebra, we need to perform a controlled experiment. The experimental group is subjected to the variable we want to test and the control group is not exposed to that variable. In a controlled experiment, the only variable that should be different between the two groups is the variable we want to test.

Let’s make a prediction based on observations of the effect of Celebra in the laboratory. The prediction is: Patients suffering from rheumatoid arthritis who take Celebra and patients who take a placebo (a starch tablet instead of the drug) do not differ in the severity of rheumatoid arthritis. [Note that we base our prediction on our null hypothesis of no effect of Celebra.]

Step 4. Perform an experiment. We rely again on our sensory perception to collect information. We design an experiment based on our prediction.

Our experiment might be as follows: 1000 patients between the ages of 50 and 70 will be randomly assigned to one of two groups of 500. The experimental group will take Celebra four times a day and the control group will take a starch placebo four times a day. The patients will not know whether their tablets are Celebra or the placebo. Patients will take the drugs for two months. At the end of two months, medical exams will be administered to determine if flexibility of the arms and fingers has changed.
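Random assignment of this kind can be sketched in a few lines of Python. The patient ID numbers, the function name assign_groups and the seed value are assumptions made for illustration; the text does not say how the assignment would actually be carried out.

```python
import random

def assign_groups(patient_ids, seed=None):
    """Randomly split patients into an experimental and a control group
    of equal size."""
    rng = random.Random(seed)   # a fixed seed makes the split reproducible
    shuffled = list(patient_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical patient IDs 1 to 1000.
patients = range(1, 1001)
experimental, control = assign_groups(patients, seed=184)
print(len(experimental), len(control))  # 500 500
```

Shuffling the whole list before cutting it in half gives every patient the same chance of landing in either group, so that the drug is the only systematic difference between the groups.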

Step 5. Analyze the results of the experiment. Our experiment produced the following results: 350 of the 500 people who took Celebra reported diminished arthritis at the end of the period. 65 of the 500 people who took the placebo reported improvement.

The data appear to show that there was a significant effect of Celebra. We would need to do a statistical analysis to demonstrate the effect. Such an analysis reveals that there is a statistically significant effect of Celebra.
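One possible form of that statistical analysis is a chi-squared test on the 2 x 2 table of counts, of the kind described in the Chi Square Analysis section of this page. The Python sketch below is an assumption about which test might be used, not a statement of the analysis actually performed. The "not improved" counts are derived by subtracting from 500, and no continuity correction is applied.

```python
import math

# Counts as (improved, not improved) for each group of 500.
# The "not improved" counts are 500 - 350 and 500 - 65.
observed = {
    "Celebra": (350, 150),
    "placebo": (65, 435),
}

def chi_squared_2x2(table):
    """Chi-squared statistic for a table of counts, with expected counts
    computed from the row and column totals (null hypothesis: no effect)."""
    rows = list(table.values())
    row_totals = [sum(row) for row in rows]
    col_totals = [sum(col) for col in zip(*rows)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(rows):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (obs - expected) ** 2 / expected
    return chi2

chi2 = chi_squared_2x2(observed)
# For 1 degree of freedom, the upper-tail probability of the chi-squared
# distribution is erfc(sqrt(chi2 / 2)).
p_value = math.erfc(math.sqrt(chi2 / 2))
print(f"chi-squared = {chi2:.2f}, p = {p_value:.3g}")
```

The chi-squared value comes out far above 3.84, the critical value for 1 degree of freedom at the 5% level, which agrees with the statement that the effect is statistically significant.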

Step 6. Draw a conclusion. From our analysis of the experiment, we have two possible outcomes: the results agree with the prediction or they disagree with the prediction. In our case, we can reject our prediction of no effect of Celebra. Because the prediction is wrong, we must also reject the hypothesis it was based on.

Our task now is to reframe the hypothesis in a form that is consistent with the available information. Our hypothesis now could be: The administration of Celebra reduces rheumatoid arthritis compared to the administration of a placebo.

With present information, we accept our hypothesis as true. Have we proved it to be true? Absolutely not! There are always other explanations that can explain the results. It is possible that more of the 500 patients who took Celebra were going to improve anyway. It’s possible that more of the patients who took Celebra also ate bananas every day and that bananas improved the arthritis. You can suggest countless other explanations.

How can we prove that our new hypothesis is true? We never can. The scientific method does not allow any hypothesis to be proven. Hypotheses can be disproved, in which case the hypothesis is rejected as false. All we can say about a hypothesis that stands up to a test designed to falsify it is that we failed to disprove it. There is a world of difference between failing to disprove and proving. Make sure you understand this distinction; it is the foundation of the scientific method.

So what would we do with our hypothesis above? We currently accept it as true. To be rigorous, we need to subject the hypothesis to more tests that could show it is wrong. For instance, we could repeat the experiment but switch the control and experimental group. If the hypothesis keeps standing up to our efforts to knock it down, we can feel more confident about accepting it as true. However, we will never be able to state that the hypothesis is true. Rather, we accept it as true because the hypothesis stood up to several experiments designed to show it is false.

Step 7. Report your results. Scientists publish their findings in scientific journals and books, in talks at national and international meetings and in seminars at colleges and universities. Disseminating results is an essential part of the scientific method. It allows other people to verify your results, develop new tests of your hypothesis or apply the knowledge you have gained to solve other problems.

 
Chi Square Analysis
The probability value (p-value) of a statistical hypothesis test is the probability of getting a value of the test statistic as extreme as or more extreme than that observed by chance alone, if the null hypothesis H0 is true.

It should not be confused with the probability of wrongly rejecting the null hypothesis when it is in fact true; that fixed probability is the significance level, defined below.

It is equal to the significance level of the test for which we would only just reject the null hypothesis. The p-value is compared with the actual significance level of our test and, if it is smaller, the result is significant. That is, if the null hypothesis were to be rejected at the 5% significance level, this would be reported as "p < 0.05".

Small p-values suggest that the null hypothesis is unlikely to be true. The smaller the p-value, the more convincing the rejection of the null hypothesis. It indicates the strength of the evidence for rejecting the null hypothesis H0, rather than simply concluding "Reject H0" or "Do not reject H0".

The significance level of a statistical hypothesis test is a fixed probability of wrongly rejecting the null hypothesis H0, if it is in fact true.

It is the probability of a type I error and is set by the investigator in relation to the consequences of such an error. That is, we want to make the significance level as small as possible in order to protect the null hypothesis and to prevent, as far as possible, the investigator from inadvertently making false claims.

The significance level is usually denoted by α (alpha).
Significance Level = P(type I error) = α
Usually, the significance level is chosen to be 0.05 (or equivalently, 5%).
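To see what a 5% significance level means in practice, the Python sketch below computes how often a perfectly fair coin would be wrongly declared biased. It borrows the 200-toss coin example and the critical value 3.84 (1 degree of freedom, probability 0.05) from the sections further down this page; the function names are illustrative only.

```python
from math import comb

N_TOSSES = 200          # tosses per experiment, as in the coin example below
CRITICAL_VALUE = 3.84   # chi-squared cutoff for 1 df at the 0.05 level

def chi_squared_for_heads(heads, n=N_TOSSES):
    """Chi-squared value for a given number of heads, expecting n/2 of each."""
    expected = n / 2
    tails = n - heads
    return ((heads - expected) ** 2 / expected
            + (tails - expected) ** 2 / expected)

def type_i_error_rate(n=N_TOSSES, cutoff=CRITICAL_VALUE):
    """Exact probability that a fair coin gives a chi-squared value at or
    above the cutoff, i.e. that a true null hypothesis is rejected."""
    rejected = sum(comb(n, heads) for heads in range(n + 1)
                   if chi_squared_for_heads(heads, n) >= cutoff)
    return rejected / 2 ** n

rate = type_i_error_rate()
print(f"{rate:.4f}")
```

The result is close to 0.05 but not exactly equal to it, because the number of heads can only take whole-number values while the chi-squared table assumes a continuous distribution.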

 

Frequency Distributions

One important set of statistical tests allows us to test for deviations of observed frequencies from expected frequencies. To introduce these tests, we will start with a simple, non-biological example. We want to determine if a coin is fair. In other words, are the odds of flipping the coin heads-up the same as tails-up? We collect data by flipping the coin 200 times. The coin landed heads-up 108 times and tails-up 92 times. At first glance, we might suspect that the coin is biased because heads resulted more often than tails. However, we have a more quantitative way to analyze our results, a chi-squared test.

To perform a chi-squared test (or any other statistical test), we first must establish our null hypothesis. In this example, our null hypothesis is that the coin is equally likely to land heads-up or tails-up on every toss, or, put another way, that the observed results (108 and 92) are due not to a bias in the coin but to chance alone, given the limited number of tosses. The null hypothesis allows us to state expected frequencies. For 200 tosses, we would expect 100 heads and 100 tails.

The next step is to prepare a table as follows.
            Heads   Tails   Total
Observed      108      92     200
Expected      100     100     200
Total         208     192     400

The observed values are those we gather ourselves. The expected values are the frequencies expected, based on our null hypothesis. We total the rows and columns as indicated. It's a good idea to make sure that the sum of the row totals equals the sum of the column totals (both come to 400 in this example).

Using probability theory, statisticians have devised a way to determine if a frequency distribution differs from the expected distribution. To use this chi-square test, we first have to calculate chi-squared.

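$$\chi^2 = \frac{(\text{observed}_1 - \text{expected}_1)^2}{\text{expected}_1} + \frac{(\text{observed}_2 - \text{expected}_2)^2}{\text{expected}_2}$$

Here the subscripts 1 and 2 refer to the two classes.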

We have two classes to consider in this example, heads and tails.

$$\chi^2 = \frac{(108-100)^2}{100} + \frac{(92-100)^2}{100} = \frac{(8)^2}{100} + \frac{(-8)^2}{100} = 0.64 + 0.64 = 1.28$$
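The same arithmetic can be written as a short Python function that works for any number of classes. This is a minimal sketch; the function name chi_squared is an illustrative choice.

```python
def chi_squared(observed, expected):
    """Sum of (observed - expected)^2 / expected over all classes."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Coin example: two classes, heads and tails.
observed = [108, 92]
expected = [100, 100]

value = chi_squared(observed, expected)
print(round(value, 2))  # 1.28
```

Because each deviation is squared, it does not matter whether a class falls above or below its expected count; only the size of the deviation contributes.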

Now we have to consult a table of critical values of the chi-squared distribution. Here is a portion of such a table.

 
Probabilities to the left of the 0.05 column: DEVIATIONS INSIGNIFICANT, NO REASON TO DOUBT HYPOTHESIS
Probabilities in the 0.05 column and to its right: DEVIATIONS SIGNIFICANT, REASON TO DOUBT HYPOTHESIS

df/prob.  0.99     0.95    0.90   0.80   0.70  0.50  0.30  0.20  0.10  0.05   0.02   0.01
1         0.00016  0.0039  0.016  0.064  0.15  0.46  1.07  1.64  2.71  3.84   5.41   6.64
2         0.02     0.10    0.21   0.45   0.71  1.39  2.41  3.22  4.60  5.99   7.82   9.21
3         0.12     0.35    0.58   1.00   1.42  2.37  3.66  4.64  6.25  7.82   9.84   11.34
4         0.30     0.71    1.06   1.65   2.20  3.36  4.88  5.99  7.78  9.49   11.67  13.28
5         0.55     1.14    1.61   2.34   3.00  4.35  6.06  7.29  9.24  11.07  13.39  15.09

The left-most column lists the degrees of freedom (df). We determine the degrees of freedom by subtracting one from the number of classes. In this example, we have two classes (heads and tails), so our degrees of freedom is 1. Our chi-squared value is 1.28. Move across the row for 1 df until we find the critical values that bound our value. In this case, they are 1.07 (corresponding to a probability of 0.30) and 1.64 (corresponding to a probability of 0.20). We can interpolate our value of 1.28 to estimate a probability of about 0.26.

(Our chi-squared value = 1.28 and falls at a probability of around 0.26)

This value means that, if our coin is fair, the probability of getting a deviation from 100 heads and 100 tails at least as large as the one we observed (108 and 92) in 200 tosses is about 26%. It is not the probability that the coin is biased. In biological applications, a probability of 5%, which is pretty relaxed, is usually adopted as the standard. This standard means that we doubt the null hypothesis only when a deviation as large as the observed one would arise by chance no more than 1 time in 20. Because the probability we obtained in the coin example is greater than 0.05 (about 0.26), we do not reject the null hypothesis: we have no reason to doubt that our coin is fair.
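The table lookup can be checked with a short Python sketch that interpolates between the two bounding critical values and, for comparison, computes the probability directly. The direct formula used here, erfc(sqrt(chi-squared / 2)), holds only for 1 degree of freedom; the function names are illustrative only.

```python
import math

# Bounding entries from the 1 df row of the table:
# (critical value, probability).
LOWER = (1.07, 0.30)
UPPER = (1.64, 0.20)

def interpolate_probability(chi2, lower, upper):
    """Linear interpolation between two neighbouring table entries."""
    (x0, p0), (x1, p1) = lower, upper
    fraction = (chi2 - x0) / (x1 - x0)
    return p0 + fraction * (p1 - p0)

def exact_probability_df1(chi2):
    """Upper-tail probability of the chi-squared distribution, 1 df only."""
    return math.erfc(math.sqrt(chi2 / 2))

chi2 = 1.28
estimate = interpolate_probability(chi2, LOWER, UPPER)
exact = exact_probability_df1(chi2)
print(f"interpolated = {estimate:.3f}, exact = {exact:.3f}")
```

Both numbers land at roughly 0.26, well above the 0.05 standard, so the small difference between the interpolated and the exact value does not affect the conclusion.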

RULE OF THUMB: If your chi-squared value falls to the left of the 0.05 probability column for that degree of freedom, we accept (that is, fail to reject) the null hypothesis, and conclude that the observed data are in fact in agreement with the expected data.
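The rule of thumb amounts to comparing the chi-squared value with the entry in the 0.05 column for the right number of degrees of freedom. A minimal Python sketch, using the 0.05-column values from the table above and hypothetical function names:

```python
# Critical values from the 0.05 column of the table, keyed by degrees of freedom.
CRITICAL_AT_005 = {1: 3.84, 2: 5.99, 3: 7.82, 4: 9.49, 5: 11.07}

def degrees_of_freedom(n_classes):
    """Degrees of freedom = number of classes minus one."""
    return n_classes - 1

def deviations_significant(chi2, n_classes):
    """True if chi2 reaches the 0.05 column (reason to doubt the hypothesis)."""
    return chi2 >= CRITICAL_AT_005[degrees_of_freedom(n_classes)]

# Coin example: chi-squared = 1.28 with two classes (heads and tails).
print(deviations_significant(1.28, n_classes=2))  # False
```

A result of False corresponds to a value that falls to the left of the 0.05 column, so the null hypothesis is not rejected.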
 


 

copyright © , Dr. Kamal Dulai. all rights reserved
