Collaborative Statistics (MT230-Spring 2012) by Barbara Illowsky, Ph.D., Susan Dean - HTML preview


Chapter 9 Hypothesis Testing: Single Mean and Single Proportion

9.1 Hypothesis Testing: Single Mean and Single Proportion

Student Learning Outcomes

By the end of this chapter, the student should be able to:

  • Differentiate between Type I and Type II Errors

  • Describe hypothesis testing in general and in practice

  • Conduct and interpret hypothesis tests for a single population mean, population standard deviation known.

  • Conduct and interpret hypothesis tests for a single population mean, population standard deviation unknown.

  • Conduct and interpret hypothesis tests for a single population proportion.

Introduction

One job of a statistician is to make statistical inferences about populations based on samples taken from the population. Confidence intervals are one way to estimate a population parameter. Another way to make a statistical inference is to make a decision about a parameter. For instance, a car dealer advertises that its new small truck gets 35 miles per gallon, on the average. A tutoring service claims that its method of tutoring helps 90% of its students get an A or a B. A company says that women managers in their company earn an average of $60,000 per year.

A statistician will make a decision about these claims. This process is called "hypothesis testing." A hypothesis test involves collecting data from a sample and evaluating the data. Then, the statistician decides whether there is sufficient evidence, based on analysis of the data, to reject the null hypothesis.

In this chapter, you will conduct hypothesis tests on single means and single proportions. You will also learn about the errors associated with these tests.

Hypothesis testing consists of two contradictory hypotheses or statements, a decision based on the data, and a conclusion. To perform a hypothesis test, a statistician will:

  1. Set up two contradictory hypotheses.

  2. Collect sample data (in homework problems, the data or summary statistics will be given to you).

  3. Determine the correct distribution to perform the hypothesis test.

  4. Analyze sample data by performing the calculations that ultimately will allow you to reject or fail to reject the null hypothesis.

  5. Make a decision and write a meaningful conclusion.
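The five steps above can be sketched in Python using only the standard library. This is a minimal sketch, assuming a left-tailed test of a single mean with a known population standard deviation. The numbers (a claimed mean of 35 miles per gallon, a sample of 40 trucks averaging 33.6, a population standard deviation of 4, and a significance level of 0.05) are hypothetical, as is the function name one_sample_z_test. The decision rule, which compares a p-value to the significance level, is an assumption made for illustration.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z_test(sample_mean, mu_0, sigma, n, alpha=0.05):
    """Left-tailed z-test of Ho: mu >= mu_0 against Ha: mu < mu_0.

    Assumes the population standard deviation sigma is known.
    """
    # Step 3: with sigma known, the sample mean is modeled as normal
    # with standard deviation sigma / sqrt(n).
    standard_error = sigma / sqrt(n)

    # Step 4: standardize the sample mean and find the left-tail area.
    z = (sample_mean - mu_0) / standard_error
    p_value = NormalDist().cdf(z)

    # Step 5: decide by comparing the p-value to alpha.
    decision = "reject Ho" if p_value < alpha else "do not reject Ho"
    return z, p_value, decision


# Steps 1 and 2: hypotheses and (hypothetical) summary statistics.
z, p_value, decision = one_sample_z_test(sample_mean=33.6, mu_0=35, sigma=4, n=40)
print(f"z = {z:.3f}, p-value = {p_value:.4f}, decision: {decision}")
```

With these hypothetical numbers the sample mean falls far enough below 35 that the sketch rejects Ho. A sample mean closer to 35 would lead to "do not reject Ho".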

To do the hypothesis test homework problems for this chapter and later chapters, make copies of the appropriate special solution sheets. See the Table of Contents topic "Solution Sheets".

9.2 Null and Alternate Hypotheses

The actual test begins by considering two hypotheses. They are called the null hypothesis and the alternate hypothesis. These hypotheses contain opposing viewpoints.

Ho: The null hypothesis: It is a statement about the population that will be assumed to be true unless it can be shown to be incorrect beyond a reasonable doubt.

Ha: The alternate hypothesis: It is a claim about the population that is contradictory to Ho and what we conclude when we reject Ho.

Example 9.1

Ho: No more than 30% of the registered voters in Santa Clara County voted in the primary election.

Ha: More than 30% of the registered voters in Santa Clara County voted in the primary election.

Example 9.2

We want to test whether the mean grade point average in American colleges is different from 2.0 (out of 4.0).

Ho: μ = 2.0    Ha: μ ≠ 2.0

Example 9.3

We want to test if college students take less than five years to graduate from college, on the average.

Ho: μ ≥ 5    Ha: μ < 5

Example 9.4

In an issue of U. S. News and World Report, an article on school standards stated that about half of all students in France, Germany, and Israel take advanced placement exams and a third pass. The same article stated that 6.6% of U. S. students take advanced placement exams and 4.4 % pass. Test if the percentage of U. S. students who take advanced placement exams is more than 6.6%.

Ho: p ≤ 0.066    Ha: p > 0.066

Since the null and alternate hypotheses are contradictory, you must examine evidence to decide if you have enough evidence to reject the null hypothesis or not. The evidence is in the form of sample data.

After you have determined which hypothesis the sample supports, you make a decision. There are two options for a decision. They are "reject Ho" if the sample information favors the alternate hypothesis or "do not reject Ho" or "fail to reject Ho" if the sample information is insufficient to reject the null hypothesis.

Mathematical Symbols Used in Ho and Ha:

Table 9.1.
Ho                             | Ha
equal (=)                      | not equal (≠) or greater than (>) or less than (<)
greater than or equal to (≥)   | less than (<)
less than or equal to (≤)      | more than (>)

Ho always has a symbol with an equal in it. Ha never has a symbol with an equal in it. The choice of symbol depends on the wording of the hypothesis test. However, be aware that many researchers (including one of the co-authors in research work) use = in the Null Hypothesis, even with > or < as the symbol in the Alternate Hypothesis. This practice is acceptable because we only make the decision to reject or not reject the Null Hypothesis.

Optional Collaborative Classroom Activity

Bring to class a newspaper, some news magazines, and some Internet articles. In groups, find articles from which your group can write null and alternate hypotheses. Discuss your hypotheses with the rest of the class.

9.3 Outcomes and the Type I and Type II Errors

When you perform a hypothesis test, there are four possible outcomes depending on the actual truth (or falseness) of the null hypothesis Ho and the decision to reject or not. The outcomes are summarized in the following table:

Table 9.2.
ACTION             | Ho IS ACTUALLY True | Ho IS ACTUALLY False
Do not reject Ho   | Correct Outcome     | Type II Error
Reject Ho          | Type I Error        | Correct Outcome

The four possible outcomes in the table are:

  • The decision is to not reject Ho when, in fact, Ho is true (correct decision).

  • The decision is to reject Ho when, in fact, Ho is true (incorrect decision known as a Type I error).

  • The decision is to not reject Ho when, in fact, Ho is false (incorrect decision known as a Type II error).

  • The decision is to reject Ho when, in fact, Ho is false (correct decision whose probability is called the Power of the Test).
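The four outcomes listed above can be expressed as a small lookup. This is only an illustrative sketch: the function name classify_outcome and its two boolean parameters are hypothetical, and in practice the truth of Ho is unknown, which is why the errors are described by probabilities.

```python
def classify_outcome(ho_is_true, reject_ho):
    """Name the outcome for a given truth of Ho and a given decision."""
    if reject_ho:
        # Rejecting a true Ho is a Type I error.
        return "Type I error" if ho_is_true else "correct decision"
    # Not rejecting a false Ho is a Type II error.
    return "correct decision" if ho_is_true else "Type II error"


for ho_is_true in (True, False):
    for reject_ho in (False, True):
        print(ho_is_true, reject_ho, classify_outcome(ho_is_true, reject_ho))
```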

Each of the errors occurs with a particular probability. The Greek letters α and β represent the probabilities.

α = probability of a Type I error = P(Type I error) = probability of rejecting the null hypothesis when the null hypothesis is true.

β = probability of a Type II error = P(Type II error) = probability of not rejecting the null hypothesis when the null hypothesis is false.

α and β should be as small as possible because they are probabilities of errors. They are rarely 0.

The Power of the Test is 1–β. Ideally, we want a high power that is as close to 1 as possible. Increasing the sample size can increase the Power of the Test.
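To illustrate how sample size affects power, the following sketch computes β and 1–β for a right-tailed test of a single mean with a known population standard deviation. All of the numbers (a null value of 100, a true mean of 103, a standard deviation of 10, and α = 0.05) are hypothetical, and the function name power_right_tailed_z is introduced only for this example.

```python
from math import sqrt
from statistics import NormalDist


def power_right_tailed_z(mu_0, mu_true, sigma, n, alpha=0.05):
    """Power of a right-tailed z-test of Ho: mu <= mu_0 when the true mean is mu_true.

    Assumes a known population standard deviation sigma.
    """
    std_normal = NormalDist()
    # Ho is rejected when the z statistic exceeds this critical value.
    z_crit = std_normal.inv_cdf(1 - alpha)
    # Under the true mean, the z statistic is shifted by this amount.
    shift = (mu_true - mu_0) * sqrt(n) / sigma
    # beta = probability of not rejecting Ho when Ho is false.
    beta = std_normal.cdf(z_crit - shift)
    return 1 - beta


power_small = power_right_tailed_z(mu_0=100, mu_true=103, sigma=10, n=25)
power_large = power_right_tailed_z(mu_0=100, mu_true=103, sigma=10, n=100)
print(f"n = 25:  power = {power_small:.3f}")
print(f"n = 100: power = {power_large:.3f}")
```

Under these assumptions, the larger sample gives the higher power, because the standard deviation of the sample mean shrinks as n grows.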

The following are examples of Type I and Type II errors.

Example 9.5

Suppose the null hypothesis, Ho, is: Frank's rock climbing equipment is safe.

Type I error: Frank thinks that his rock climbing equipment may not be safe when, in fact, it really is safe.

Type II error: Frank thinks that his rock climbing equipment may be safe when, in fact, it is not safe.

α = probability that Frank thinks his rock climbing equipment may not be safe when, in fact, it really is safe.

β = probability that Frank thinks his rock climbing equipment may be safe when, in fact, it is not safe.

Notice that, in this case, the error with the greater consequence is the Type II error. (If Frank thinks his rock climbing equipment is safe, he will go ahead and use it.)

Example 9.6

Suppose the null hypothesis, Ho, is: The victim of an automobile accident is alive when he arrives at the emergency room of a hospital.

Type I error: The emergency crew thinks that the victim is dead when, in fact, the victim is alive.

Type II error: The emergency crew does not know if the victim is alive when, in fact, the victim is dead.

α = probability that the emergency crew thinks the victim is dead when, in fact, he is really alive = P(Type I error).

β = probability that the emergency crew does not know if the victim is alive when, in fact, the victim is dead = P(Type II error).

The error with the greater consequence is the Type I error. (If the emergency crew thinks the victim is dead, they will not treat him.)

9.4 Distribution Needed for Hypothesis Testing

Earlier in the course, we discussed sampling distributions. Particular distributions are associated with hypothesis testing. Perform tests of a population mean using a normal distribution or a Student's-t distribution. (Remember, use a Student's-t distribution when the population standard deviation is unknown and the distribution of the sample mean is approximately normal.) In this chapter we perform tests of a population proportion using a normal distribution (usually the sample size n is large).

If you are testing a single population mean, the distribution for the test is for means:

X̄ ~ N(μ_X, σ_X/√n) or t_df

The population parameter is μ. The estimated value (point estimate) for μ is x̄, the sample mean.
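As a rough sketch of how the choice between the two distributions plays out for a single mean, the following Python computes the point estimate and the standard deviation of the sample mean. It uses the normal distribution when a population standard deviation is supplied and a Student's-t distribution with n − 1 degrees of freedom otherwise. The function name mean_test_distribution and the five data values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def mean_test_distribution(sample, sigma=None):
    """Summarize the distribution used to test a single population mean.

    If sigma (the population standard deviation) is given, use the normal
    distribution; otherwise estimate it with the sample standard deviation
    and use a Student's-t distribution with n - 1 degrees of freedom.
    """
    n = len(sample)
    x_bar = mean(sample)  # point estimate of mu
    if sigma is not None:
        return {"x_bar": x_bar, "distribution": "normal",
                "standard_error": sigma / sqrt(n), "df": None}
    s = stdev(sample)  # sample standard deviation
    return {"x_bar": x_bar, "distribution": "t",
            "standard_error": s / sqrt(n), "df": n - 1}


years_to_graduate = [4.5, 5.0, 5.5, 4.0, 6.0]  # hypothetical data
known = mean_test_distribution(years_to_graduate, sigma=1.0)
unknown = mean_test_distribution(years_to_graduate)
print(known)
print(unknown)
```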

If you are testing a single population proportion, the distribution for the test is for proportions or percentages:

P' ~ N(p, √(p·q/n)), where q = 1 − p

The population parameter is p. The estimated value (point estimate) for p is p'. p' = x/n, where x is the number of successes and n is the sample size.
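The point estimate and the normal model for a sample proportion can be computed directly. The sketch below is illustrative only: the counts (58 successes in a sample of 100) and the hypothesized proportion of 0.5 are made-up values, and the function name proportion_distribution is hypothetical.

```python
from math import sqrt


def proportion_distribution(p, n):
    """Mean and standard deviation of P' under a hypothesized proportion p."""
    q = 1 - p
    return p, sqrt(p * q / n)


x, n = 58, 100           # hypothetical number of successes and sample size
p_prime = x / n          # point estimate of p
center, spread = proportion_distribution(p=0.5, n=n)
print(f"p' = {p_prime}, P' ~ N({center}, {spread})")
```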

9.5 Assumption

When you perform a hypothesis test of a single population mean μ using a Student's-t distribution (often called a t-test), there are fundamental assumptions that need to be met in order for the test to work properly. Your data should be a simple random sample that comes from a population that is approximately normally distributed. You use the sample standard deviation to approximate the population standard deviation. (Note that if the sample size is sufficiently large, a t-t