Essentials of Modern Business Statistics
April 21, 2017 | Author: Krishna Chaitanya Tummala | Category: N/A
Solutions Manual to Accompany
Essentials of Modern Business Statistics With Microsoft Excel Second Edition
David R. Anderson University of Cincinnati
Dennis J. Sweeney University of Cincinnati
Thomas A. Williams Rochester Institute of Technology
South-Western Cincinnati, Ohio
Contents
Preface
Chapter
1. Data and Statistics
2. Descriptive Statistics: Tabular and Graphical Methods
3. Descriptive Statistics: Numerical Methods
4. Introduction to Probability
5. Discrete Probability Distributions
6. Continuous Probability Distributions
7. Sampling and Sampling Distributions
8. Interval Estimation
9. Hypothesis Testing
10. Comparisons Involving Means
11. Comparisons Involving Proportions and A Test of Independence
12. Simple Linear Regression
13. Multiple Regression
14. Statistical Methods for Quality Control
Preface
The purpose of Essentials of Modern Business Statistics with Microsoft Excel is to provide students, primarily in the fields of business administration and economics, with a sound conceptual introduction to the field of statistics and its many applications. The text is applications-oriented and has been written with the needs of the nonmathematician in mind.
The solutions manual furnishes assistance by identifying learning objectives and providing detailed solutions for all exercises in the text.
Note: The solutions to the case problems are included in a separate manual.
Acknowledgements
We would like to provide special recognition to Catherine J. Williams for her efforts in preparing the solutions manual.
David R. Anderson
Dennis J. Sweeney
Thomas A. Williams
Chapter 1 Data and Statistics
Learning Objectives
1. Obtain an appreciation for the breadth of statistical applications in business and economics.
2. Understand the meaning of the terms elements, variables, and observations as they are used in statistics.
3. Understand that data are obtained using one of the following scales of measurement: nominal, ordinal, interval, and ratio.
4. Obtain an understanding of the difference between qualitative, quantitative, cross-sectional, and time series data.
5. Learn about the sources of data for statistical analysis, both internal and external to the firm.
6. Be aware of how errors can arise in data.
7. Know the meaning of descriptive statistics and statistical inference.
8. Be able to distinguish between a population and a sample.
9. Understand the role a sample plays in making statistical inferences about the population.
Solutions:
1. Statistics can be referred to as numerical facts. In a broader sense, statistics is the field of study dealing with the collection, analysis, presentation and interpretation of data.
2. a. 9
b. 4
c. Country and room rate are qualitative variables; number of rooms and the overall score are quantitative variables.
d. Country is nominal; room rate is ordinal; number of rooms is ratio and overall score is interval.
3. a. Average number of rooms = 808/9 = 89.78 or approximately 90 rooms
b. 2 of 9 are located in England; approximately 22%
c. 4 of 9 have a room rate of $$; approximately 44%
4. a. 10
b. All brands are models of minisystems manufactured.
c. Average price = 3140/10 = $314
d. $314
5. a. 5
b. Price, CD capacity, and the number of tape decks are quantitative. Sound quality and FM tuning sensitivity and selectivity are qualitative.
c. Average CD capacity = 30/10 = 3.
d. (7/10)(100) = 70%
e. (4/10)(100) = 40%
6. Questions a, c, and d are quantitative. Questions b and e are qualitative.
7. a. The variable is qualitative.
b. Nominal with four labels or categories.
8. a. 1005
b. Qualitative
c. Percentages
d. .29(1005) = 291.45 or approximately 291.
9. a. Qualitative
b. 30 of 71; 42.3%
10. a.
Quantitative; ratio
b.
Qualitative; nominal
c.
Qualitative (Note: Rank is a numeric label that identifies the position of a student in the class. Rank does not indicate how much or how many and is not quantitative.); ordinal
d.
Qualitative; nominal
e.
Quantitative; ratio
11. a.
Quantitative; ratio
b.
Qualitative; ordinal
c.
Qualitative; ordinal (assuming employees can be ranked by classification)
d.
Quantitative; ratio
e.
Qualitative; nominal
12. a.
The population is all visitors coming to the state of Hawaii.
b.
Since airline flights carry the vast majority of visitors to the state, the use of questionnaires for passengers during incoming flights is a good way to reach this population. The questionnaire actually appears on the back of a mandatory plants and animals declaration form that passengers must complete during the incoming flight. A large percentage of passengers complete the visitor information questionnaire.
c.
Questions 1 and 4 provide quantitative data indicating the number of visits and the number of days in Hawaii. Questions 2 and 3 provide qualitative data indicating the categories of reason for the trip and where the visitor plans to stay.
13. a.
Quantitative - Earnings measured in billions of dollars.
b.
Time series with 6 observations
c.
Volkswagen's annual earnings.
d.
Time series shows an increase in earnings. An increase would be expected in 2003, but it appears that the rate of increase is slowing.
14. a. Type of music is a qualitative variable.
b. The graph, based on time series data, is shown below.
[Time series plot: Percentage of Music Sales (vertical axis, 20 to 34) by Year (1995 to 2001)]
c. The bar graph, based on cross-sectional data, is shown below.
[Bar graph: % of Music Sales in 1998 (vertical axis, 0.0 to 30.0) by Type of Music (Rock, Country, R&B, Pop, Rap, Gospel, Classical, Jazz, Other)]
15. Cross-sectional data. The data were collected at the same or approximately the same point in time.
16. a.
We would like to see data from product taste tests and test marketing the product.
b.
Such data would be obtained from specially designed statistical studies.
17.
Internal data on salaries of other employees can be obtained from the personnel department. External data might be obtained from the Department of Labor or industry associations.
18. a.
(48/120)100% = 40% in the sample died from some form of heart disease. This can be used as an estimate of the percentage of all males 60 or older who die of heart disease.
b. The data on cause of death is qualitative.
19. a. All subscribers of Business Week at the time the 1996 survey was conducted.
b.
Quantitative
c.
Qualitative (yes or no)
d.
Cross-sectional - 1996 was the time of the survey.
e.
Using the sample results, we could infer or estimate 59% of the population of subscribers have an annual income of $75,000 or more and 50% of the population of subscribers have an American Express credit card.
20. a.
56% of market belonged to A.C. Nielsen $387,325 is the average amount spent per category
b.
3.73
c.
$387,325
21. a.
The two populations are the population of women whose mothers took the drug DES during pregnancy and the population of women whose mothers did not take the drug DES during pregnancy.
b.
It was a survey.
c.
63 / 3.980 = 15.8 women out of each 1000 developed tissue abnormalities.
d.
The article reported “twice” as many abnormalities in the women whose mothers had taken DES during pregnancy. Thus, a rough estimate would be 15.8/2 = 7.9 abnormalities per 1000 women whose mothers had not taken DES during pregnancy.
e.
In many situations, disease occurrences are rare and affect only a small portion of the population. Large samples are needed to collect data on a reasonable number of cases where the disease exists.
22. a.
All adult viewers reached by the Denver, Colorado television station.
b.
The viewers contacted in the telephone survey.
c.
A sample. It would clearly be too costly and time consuming to try to contact all viewers.
23. a.
Percent of television sets that were tuned to a particular television show and/or total viewing audience.
b.
All television sets in the United States which are available for the viewing audience. Note this would not include television sets in store displays.
c.
A portion of these television sets. Generally, individual households would be contacted to determine which programs were being viewed.
d. The cancellation of programs, the scheduling of programs, and advertising cost rates.
24. a. This is a statistically correct descriptive statistic for the sample.
b. An incorrect generalization since the data was not collected for the entire population.
c. An acceptable statistical inference based on the use of the word “estimate.”
d.
While this statement is true for the sample, it is not a justifiable conclusion for the entire population.
e.
This statement is not statistically supportable. While it is true for the particular sample observed, it is entirely possible and even very likely that at least some students will be outside the 65 to 90 range of grades.
Chapter 2 Descriptive Statistics: Tabular and Graphical Methods
Learning Objectives
1. Learn how to construct and interpret summarization procedures for qualitative data such as: frequency and relative frequency distributions, bar graphs and pie charts. Be able to use Excel's COUNTIF function to construct a frequency distribution and the Chart Wizard to construct a bar graph and pie chart.
2. Learn how to construct and interpret tabular summarization procedures for quantitative data such as: frequency and relative frequency distributions, cumulative frequency and cumulative relative frequency distributions. Be able to use Excel's FREQUENCY function to construct a frequency distribution and the Chart Wizard to construct a histogram.
3. Learn how to construct a histogram and an ogive as graphical summaries of quantitative data.
4. Be able to use and interpret the exploratory data analysis technique of a stem-and-leaf display.
5. Learn how to construct and interpret cross tabulations and scatter diagrams of bivariate data. Be able to use Excel's Pivot Table report to construct a cross tabulation and the Chart Wizard to construct a scatter diagram.
Solutions:
1.
Class    Frequency    Relative Frequency
A        60           60/120 = 0.50
B        24           24/120 = 0.20
C        36           36/120 = 0.30
Total    120          1.00
2. a. 1 - (.22 + .18 + .40) = .20
b. .20(200) = 40
c/d.
Class    Frequency         Percent Frequency
A        .22(200) = 44     22
B        .18(200) = 36     18
C        .40(200) = 80     40
D        .20(200) = 40     20
Total    200               100
3. a. 360° x 58/120 = 174°
b. 360° x 42/120 = 126°
c. [Pie chart: Yes 48.3%, No 35%, No Opinion 16.7%]
d. [Bar graph: Frequency (0 to 70) by Response (Yes, No, No Opinion)]
4.
a.
The data are qualitative.
b.
TV Show         Frequency    Percent Frequency
Millionaire     24           48
Frasier         15           30
Chicago Hope    7            14
Charmed         4            8
Total:          50           100
c. [Bar graph: Frequency (0 to 30) by TV Show (Millionaire, Frasier, Chicago, Charmed)]
[Pie chart: Millionaire 48%, Frasier 30%, Chicago 14%, Charmed 8%]
d. Millionaire has the largest market share. Frasier is second.
5.
a.
Name        Frequency    Relative Frequency    Percent Frequency
Brown       7            .14                   14%
Davis       6            .12                   12%
Johnson     10           .20                   20%
Jones       7            .14                   14%
Smith       12           .24                   24%
Williams    8            .16                   16%
Total       50           1.00
b. [Bar graph: Frequency (0 to 14) by Name (Brown, Davis, Johnson, Jones, Smith, Williams)]
c.
Brown       .14 x 360 = 50.4°
Davis       .12 x 360 = 43.2°
Johnson     .20 x 360 = 72.0°
Jones       .14 x 360 = 50.4°
Smith       .24 x 360 = 86.4°
Williams    .16 x 360 = 57.6°
[Pie chart: Smith 24%, Johnson 20%, Williams 16%, Brown 14%, Jones 14%, Davis 12%]
d. Most common: Smith, Johnson and Williams
6. a.
Book           Frequency    Percent Frequency
7 Habits       10           16.66
Millionaire    16           26.67
Motley         9            15.00
Dad            13           21.67
WSJ Guide      6            10.00
Other          6            10.00
Total:         60           100.00
The Ernst & Young Tax Guide 2000 with a frequency of 3, Investing for Dummies with a frequency of 2, and What Color is Your Parachute? 2000 with a frequency of 1 are grouped in the "Other" category.
b.
The rank order from first to fifth is: Millionaire, Dad, 7 Habits, Motley, and WSJ Guide.
c.
The percent of sales represented by The Millionaire Next Door and Rich Dad, Poor Dad is 48.33%.
7.
Rating         Frequency    Relative Frequency
Outstanding    19           0.38
Very Good      13           0.26
Good           10           0.20
Average        6            0.12
Poor           2            0.04
Total          50           1.00
Management should be pleased with these results. 64% of the ratings are very good to outstanding. 84% of the ratings are good or better. Comparing these ratings with previous results will show whether or not the restaurant is making improvements in its ratings of food quality.
8. a.
Position        Frequency    Relative Frequency
Pitcher         17           0.309
Catcher         4            0.073
1st Base        5            0.091
2nd Base        4            0.073
3rd Base        2            0.036
Shortstop       5            0.091
Left Field      6            0.109
Center Field    5            0.091
Right Field     7            0.127
Total           55           1.000
b. Pitchers (Almost 31%)
c. 3rd Base (3 - 4%)
d. Right Field (Almost 13%)
e. Infielders (16 or 29.1%) to Outfielders (18 or 32.7%)
9.
a/b.
Starting Time    Frequency    Percent Frequency
7:00             3            15
7:30             4            20
8:00             4            20
8:30             7            35
9:00             2            10
Total            20           100
c. [Bar graph: Frequency (0 to 8) by Starting Time (7:00, 7:30, 8:00, 8:30, 9:00)]
d. [Pie chart: 7:00 15%, 7:30 20%, 8:00 20%, 8:30 35%, 9:00 10%]
e. The most preferred starting time is 8:30 a.m. Starting times of 7:30 and 8:00 a.m. are next.
10. a. The data refer to quality levels from 1 "Not at all Satisfied" to 7 "Extremely Satisfied."
b.
Rating    Frequency    Relative Frequency
3         2            0.03
4         4            0.07
5         12           0.20
6         24           0.40
7         18           0.30
Total     60           1.00
c. [Bar graph: Frequency (0 to 30) by Rating (3, 4, 5, 6, 7)]
d.
The survey data indicate a high quality of service by the financial consultant. The most common ratings are 6 and 7 (70%) where 7 is extremely satisfied. Only 2 ratings are below the middle scale value of 4. There are no "Not at all Satisfied" ratings.
11.
Class    Frequency    Relative Frequency    Percent Frequency
12-14    2            0.050                 5.0
15-17    8            0.200                 20.0
18-20    11           0.275                 27.5
21-23    10           0.250                 25.0
24-26    9            0.225                 22.5
Total    40           1.000                 100.0
12.
Class                       Cumulative Frequency    Cumulative Relative Frequency
less than or equal to 19    10                      .20
less than or equal to 29    24                      .48
less than or equal to 39    41                      .82
less than or equal to 49    48                      .96
less than or equal to 59    50                      1.00
13. [Histogram: Frequency (0 to 18) by class (10-19, 20-29, 30-39, 40-49, 50-59)]
[Second graph: vertical axis 0 to 1.0, horizontal axis 10 to 50]
14. a/b.
Class          Frequency    Percent Frequency
6.0 - 7.9      4            20
8.0 - 9.9      2            10
10.0 - 11.9    8            40
12.0 - 13.9    3            15
14.0 - 15.9    3            15
Total          20           100
60
15. a/b.
Waiting Time    Frequency    Relative Frequency
0 - 4           4            0.20
5 - 9           8            0.40
10 - 14         5            0.25
15 - 19         2            0.10
20 - 24         1            0.05
Totals          20           1.00
c/d.
Waiting Time                Cumulative Frequency    Cumulative Relative Frequency
Less than or equal to 4     4                       0.20
Less than or equal to 9     12                      0.60
Less than or equal to 14    17                      0.85
Less than or equal to 19    19                      0.95
Less than or equal to 24    20                      1.00
e. 12/20 = 0.60
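The columns in Solution 15 can be checked mechanically. The following is a minimal Python sketch, not part of the text's Excel workflow; the function name `summarize` is hypothetical, and the class frequencies are taken from the table in Solution 15.

```python
from itertools import accumulate

# Waiting-time classes and frequencies as listed in Solution 15.
classes = ["0-4", "5-9", "10-14", "15-19", "20-24"]
frequencies = [4, 8, 5, 2, 1]


def summarize(freqs):
    """Return relative, cumulative, and cumulative relative frequencies."""
    n = sum(freqs)
    relative = [f / n for f in freqs]
    cumulative = list(accumulate(freqs))  # running totals of the frequencies
    cumulative_relative = [c / n for c in cumulative]
    return relative, cumulative, cumulative_relative


relative, cumulative, cumulative_relative = summarize(frequencies)

for label, f, r, c, cr in zip(classes, frequencies, relative,
                              cumulative, cumulative_relative):
    print(f"{label:>6}  {f:>2}  {r:.2f}  {c:>2}  {cr:.2f}")
```

The second entry of the cumulative relative frequency column is the 12/20 = 0.60 reported in part e.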
16. a.
Stock Price ($)    Frequency    Relative Frequency    Percent Frequency
10.00 - 19.99      10           0.40                  40
20.00 - 29.99      4            0.16                  16
30.00 - 39.99      6            0.24                  24
40.00 - 49.99      2            0.08                  8
50.00 - 59.99      1            0.04                  4
60.00 - 69.99      2            0.08                  8
Total              25           1.00                  100
[Histogram: Frequency (0 to 12) by Stock Price (10.00-19.99 through 60.00-69.99)]
Many of these are low priced stocks with the greatest frequency in the $10.00 to $19.99 range.
b.
Earnings per Share ($)    Frequency    Relative Frequency    Percent Frequency
-3.00 to -2.01            2            0.08                  8
-2.00 to -1.01            0            0.00                  0
-1.00 to -0.01            2            0.08                  8
0.00 to 0.99              9            0.36                  36
1.00 to 1.99              9            0.36                  36
2.00 to 2.99              3            0.12                  12
Total                     25           1.00                  100
[Histogram: Frequency (0 to 10) by Earnings per Share (-3.00 to -2.01 through 2.00 to 2.99)]
The majority of companies had earnings in the $0.00 to $2.00 range. Four of the companies lost money.
17. a.
Amount     Frequency    Relative Frequency
0-99       5            .20
100-199    5            .20
200-299    8            .32
300-399    4            .16
400-499    3            .12
Total      25           1.00
b. Histogram
[Histogram: Frequency (0 to 9) by Amount ($) (0-99, 100-199, 200-299, 300-399, 400-499)]
c. The largest group spends $200-$300 per year on books and magazines. There are more in the $0 to $200 range than in the $300 to $500 range.
18. a. Lowest salary: $93,000
Highest salary: $178,000
b.
Salary ($1000s)    Frequency    Relative Frequency    Percent Frequency
91-105             4            0.08                  8
106-120            5            0.10                  10
121-135            11           0.22                  22
136-150            18           0.36                  36
151-165            9            0.18                  18
166-180            3            0.06                  6
Total              50           1.00                  100
c. Proportion $135,000 or less: 20/50.
d. Percentage more than $150,000: 24%
e. [Histogram: Frequency (0 to 20) by Salary ($1000s) (91-105 through 166-180)]
19. a/b.
Number       Frequency    Relative Frequency
140 - 149    2            0.10
150 - 159    7            0.35
160 - 169    3            0.15
170 - 179    6            0.30
180 - 189    1            0.05
190 - 199    1            0.05
Totals       20           1.00
c/d.
Number                       Cumulative Frequency    Cumulative Relative Frequency
Less than or equal to 149    2                       0.10
Less than or equal to 159    9                       0.45
Less than or equal to 169    12                      0.60
Less than or equal to 179    18                      0.90
Less than or equal to 189    19                      0.95
Less than or equal to 199    20                      1.00
e. [Graph: Frequency (vertical axis, 5 to 20); horizontal axis 140 to 200]
20. a.
The percentage of people 34 or less is 20.0 + 5.7 + 9.6 + 13.6 = 48.9.
b.
The percentage of the population over 34 years old is 16.3 + 13.5 + 8.7 + 12.6 = 51.1
c.
The percentage of the population that is between 25 and 54 years old inclusively is 13.6 + 16.3 + 13.5 = 43.4
d.
The percentage less than 25 years old is 20.0 + 5.7 + 9.6 = 35.3. So there are (.353)(275) = 97.075 million people less than 25 years old.
e.
An estimate of the number of retired people is (.5)(.087)(275) + (.126)(275) = 46.6125 million.
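The arithmetic in Solution 20 can be reproduced with a short Python sketch. The eight percentages and the 275 million population figure come from the solution itself; the list ordering (youngest group first) and the variable names are assumptions made for illustration, since the age-group labels are not shown in the solution.

```python
# Percent of the population in each of eight age groups, in the order the
# percentages appear in Solution 20 (assumed: youngest group first).
percents = [20.0, 5.7, 9.6, 13.6, 16.3, 13.5, 8.7, 12.6]
population_millions = 275  # population size used in parts d and e

pct_34_or_less = sum(percents[:4])    # part a
pct_over_34 = sum(percents[4:])       # part b
pct_25_to_54 = sum(percents[3:6])     # part c
pct_under_25 = sum(percents[:3])      # part d

# Part d: number of people under 25, in millions.
under_25_millions = pct_under_25 / 100 * population_millions

# Part e: half of the second-oldest group plus all of the oldest group.
retired_millions = (0.5 * percents[6] + percents[7]) / 100 * population_millions

print(round(pct_34_or_less, 1), round(pct_over_34, 1), round(pct_25_to_54, 1))
print(round(under_25_millions, 3), round(retired_millions, 4))
```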
21. a/b.
Computer Usage (Hours)    Frequency    Relative Frequency
0.0 - 2.9                 5            0.10
3.0 - 5.9                 28           0.56
6.0 - 8.9                 8            0.16
9.0 - 11.9                6            0.12
12.0 - 14.9               3            0.06
Total                     50           1.00
c. [Histogram: Frequency (0 to 30) by Computer Usage (Hours) (0.0 - 2.9 through 12.0 - 14.9)]
d. [Graph: Frequency (0 to 60) by Computer Usage (Hours) (0 to 15)]
e.
The majority of the computer users are in the 3 to 6 hour range. Usage is somewhat skewed toward the right with 3 users in the 12 to 15 hour range.
22.
 5 | 7 8
 6 | 4 5 8
 7 | 0 2 2 5 5 6 8
 8 | 0 2 3 5
23. Leaf Unit = 0.1
 6 | 3
 7 | 5 5 7
 8 | 1 3 4 8
 9 | 3 6
10 | 0 4 5
11 | 3
24. Leaf Unit = 10
11 | 6
12 | 0 2
13 | 0 6 7
14 | 2 2 7
15 | 5
16 | 0 2 8
17 | 0 2 3
25.
 9 | 8 9
10 | 2 4 6 6
11 | 4 5 7 8 8 9
12 | 2 4 5 7
13 | 1 2
14 | 4
15 | 1
26. Leaf Unit = 0.1
 0 | 4 7 8 9 9
 1 | 1 2 9
 2 | 0 0 1 3 5 5 6 8
 3 | 4 9
 4 | 8
 5 |
 6 |
 7 | 1
27.
 4 | 1 3 6 6 7
 5 | 0 0 3 8 9
 6 | 0 1 1 4 4 5 7 7 9 9
 7 | 0 0 0 1 3 4 4 5 5 6 6 6 7 8 8
 8 | 0 1 1 3 4 4 5 7 7 8 9
 9 | 0 2 2 7
or
 4 | 1 3
 4 | 6 6 7
 5 | 0 0 3
 5 | 8 9
 6 | 0 1 1 4 4
 6 | 5 7 7 9 9
 7 | 0 0 0 1 3 4 4
 7 | 5 5 6 6 6 7 8 8
 8 | 0 1 1 3 4 4
 8 | 5 7 7 8 9
 9 | 0 2 2
 9 | 7
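A stem-and-leaf display like those in Solutions 22 to 27 can be built programmatically. The following is a minimal Python sketch for non-negative data; the function name `stem_and_leaf` and its `leaf_unit` parameter are hypothetical, and the sample values are read off the display in Solution 22 under the assumption that its leaf unit is 1.

```python
from collections import defaultdict


def stem_and_leaf(values, leaf_unit=1):
    """Map each stem to its sorted leaves (non-negative data only).

    Each value is divided by leaf_unit and rounded; the last digit of the
    result is the leaf and the remaining digits form the stem.
    """
    scaled = sorted(round(v / leaf_unit) for v in values)
    leaves = defaultdict(list)
    for s in scaled:
        stem, leaf = divmod(s, 10)
        leaves[stem].append(leaf)
    first, last = scaled[0] // 10, scaled[-1] // 10
    # Keep empty stems so gaps in the data remain visible.
    return {stem: leaves.get(stem, []) for stem in range(first, last + 1)}


# Values read from the display in Solution 22 (leaf unit of 1 assumed).
data = [57, 58, 64, 65, 68, 70, 72, 72, 75, 75, 76, 78, 80, 82, 83, 85]
display = stem_and_leaf(data)

for stem, stem_leaves in display.items():
    print(f"{stem:>3} | {' '.join(str(leaf) for leaf in stem_leaves)}")
```

Passing `leaf_unit=0.1` or `leaf_unit=10` gives displays in the style of Solutions 23 and 24.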
28. a.
 0 | 5 8
 1 | 1 1 3 3 4 4
 1 | 5 6 7 8 9 9
 2 | 2 3 3 3 5 5
 2 | 6 8
 3 |
 3 | 6 7 7 9
 4 | 0
 4 | 7 8
 5 |
 5 |
 6 | 0
b.
2000 P/E Forecast    Frequency    Percent Frequency
5 - 9                2            6.7
10 - 14              6            20.0
15 - 19              6            20.0
20 - 24              6            20.0
25 - 29              2            6.7
30 - 34              0            0.0
35 - 39              4            13.3
40 - 44              1            3.3
45 - 49              2            6.7
50 - 54              0            0.0
55 - 59              0            0.0
60 - 64              1            3.3
Total                30           100.0
29. a.
             y
x        1       2       Total
A        5       0       5
B        11      2       13
C        2       10      12
Total    18      12      30
b.
             y
x        1       2       Total
A        100.0   0.0     100.0
B        84.6    15.4    100.0
C        16.7    83.3    100.0
c.
             y
x        1       2
A        27.8    0.0
B        61.1    16.7
C        11.1    83.3
Total    100.0   100.0
d. Category A values for x are always associated with category 1 values for y. Category B values for x are usually associated with category 1 values for y. Category C values for x are usually associated with category 2 values for y.
30. a. [Scatter diagram: y (vertical axis, -40 to 56) versus x (horizontal axis, -40 to 40)]
b.
There is a negative relationship between x and y; y decreases as x increases.
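The row and column percentages in Solution 29 follow directly from the crosstabulation counts. The following is a minimal Python sketch; the nested-dictionary layout and the function names `row_percentages` and `column_percentages` are assumptions for illustration (the text itself builds these tables with an Excel PivotTable report).

```python
# Crosstabulation counts from Solution 29a: counts[x][y].
counts = {
    "A": {1: 5, 2: 0},
    "B": {1: 11, 2: 2},
    "C": {1: 2, 2: 10},
}


def row_percentages(table):
    """Divide each cell by its row total (as in Solution 29b)."""
    result = {}
    for x, row in table.items():
        total = sum(row.values())
        result[x] = {y: round(100 * c / total, 1) for y, c in row.items()}
    return result


def column_percentages(table):
    """Divide each cell by its column total (as in Solution 29c)."""
    column_totals = {}
    for row in table.values():
        for y, c in row.items():
            column_totals[y] = column_totals.get(y, 0) + c
    return {
        x: {y: round(100 * c / column_totals[y], 1) for y, c in row.items()}
        for x, row in table.items()
    }


print(row_percentages(counts))
print(column_percentages(counts))
```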
31.
                      Meal Price ($)
Quality Rating    10-19    20-29    30-39    40-49
Good              53.8     33.9     2.7      0.0
Very Good         43.6     54.2     60.5     21.4
Excellent         2.6      11.9     36.8     78.6
Total             100.0    100.0    100.0    100.0
As the meal price goes up, the percentage of high quality ratings goes up. A positive relationship between meal price and quality is observed.
32. a.
                                    EPS Rating
Sales/Margins/ROE    0-19    20-39    40-59    60-79    80-100    Total
A                                              1        8         9
B                            4        1        5        2         12
C                            1        1        2        3         7
D                    1                3        1                  5
E                    2       1                                    3
Total                4       6        4        9        13        36
b.
                                    EPS Rating
Sales/Margins/ROE    0-19     20-39    40-59    60-79    80-100    Total
A                                               11.11    88.89     100
B                             33.33    8.33     41.67    16.67     100
C                             14.29    14.29    28.57    42.86     100
D                    20.00             60.00    20.00              100
E                    66.67    33.33                                100
Higher EPS ratings seem to be associated with higher ratings on Sales/Margins/ROE. Of those companies with an "A" rating on Sales/Margins/ROE, 88.89% of them had an EPS Rating of 80 or higher. Of the 8 companies with a "D" or "E" rating on Sales/Margins/ROE, only 1 had an EPS rating above 60.
33. a. Sales/Margins/ROE A B C D E Total
A 1 1 1 1 4
Industry Group Relative Strength B C D 2 2 4 5 2 3 3 2 1 1 1 2 11 7 10
E
Total 9 12 7 5 3 36
1 1 2 4
b/c. The frequency distribution for the Sales/Margins/ROE data is in the rightmost column of the crosstabulation. The frequency distribution for the Industry Group Relative Strength data is in the bottom row of the crosstabulation.
d. Once the crosstabulation is complete, the individual frequency distributions are available in the margins.
34. a. [Scatter diagram: Relative Price Strength (vertical axis, 0 to 80) versus EPS Rating (horizontal axis, 0 to 120)]
b.
One might expect stocks with higher EPS ratings to show greater relative price strength. However, the scatter diagram using this data does not support such a relationship. The scatter diagram appears similar to the one showing "No Apparent Relationship" in Figure 2.19.
35. a.
The crosstabulation is shown below:
Count of Observation
                              Speed
Position            4-4.5    4.5-5    5-5.5    5.5-6    Grand Total
Guard                                 12       1        13
Offensive tackle             2        7        3        12
Wide receiver       6        9                          15
Grand Total         6        11       19       4        40
b.
There appears to be a relationship between Position and Speed; wide receivers had faster speeds than offensive tackles and guards.
c.
The scatter diagram is shown below:
[Scatter diagram: Rating (vertical axis, 4 to 10) versus Speed (horizontal axis, 4 to 6)]
d.
There appears to be a relationship between Speed and Rating; slower speeds appear to be associated with lower ratings. In other words, prospects with faster speeds tend to be rated higher than prospects with slower speeds.
36. a.
Vehicle      Frequency    Percent Frequency
F-Series     17           34
Silverado    12           24
Taurus       8            16
Camry        7            14
Accord       6            12
Total        50           100
b. The two top selling vehicles are the Ford F-Series Pickup and the Chevrolet Silverado.
c. [Pie chart: F-Series 34%, Silverado 24%, Taurus 16%, Camry 14%, Accord 12%]
37. a/b.
Industry       Frequency    Percent Frequency
Beverage       2            10
Chemicals      3            15
Electronics    6            30
Food           7            35
Aerospace      2            10
Totals         20           100
c. [Bar graph: Frequency (0 to 8) by Industry (Beverage, Chemicals, Electronics, Food, Aerospace)]
38. a.
Response               Frequency    Percent Frequency
Accuracy               16           16
Approach Shots         3            3
Mental Approach        17           17
Power                  8            8
Practice               15           15
Putting                10           10
Short Game             24           24
Strategic Decisions    7            7
Total                  100          100
b. Poor short game, poor mental approach, lack of accuracy, and limited practice.
39. a-d.
Sales          Frequency    Relative Frequency    Cumulative Frequency    Cumulative Relative Frequency
0 - 499        13           0.65                  13                      0.65
500 - 999      3            0.15                  16                      0.80
1000 - 1499    0            0.00                  16                      0.80
1500 - 1999    3            0.15                  19                      0.95
2000 - 2499    1            0.05                  20                      1.00
Total          20           1.00
e. [Histogram: Frequency (0 to 14) by Sales (0-499, 500-999, 1000-1499, 1500-1999, 2000-2499)]
40. a.
Closing Price    Frequency    Relative Frequency
0 - 9.99         9            0.225
10 - 19.99       10           0.250
20 - 29.99       5            0.125
30 - 39.99       11           0.275
40 - 49.99       2            0.050
50 - 59.99       2            0.050
60 - 69.99       0            0.000
70 - 79.99       1            0.025
Totals           40           1.000
b.
Closing Price                  Cumulative Frequency    Cumulative Relative Frequency
Less than or equal to 9.99     9                       0.225
Less than or equal to 19.99    19                      0.475
Less than or equal to 29.99    24                      0.600
Less than or equal to 39.99    35                      0.875
Less than or equal to 49.99    37                      0.925
Less than or equal to 59.99    39                      0.975
Less than or equal to 69.99    39                      0.975
Less than or equal to 79.99    40                      1.000
c. [Histogram: Frequency (0 to 12) by Closing Price (10 to 80)]
d.
Over 87% of common stocks trade for less than $40 a share and 60% trade for less than $30 per share.
41. a.
Exchange            Frequency    Relative Frequency
American            3            0.15
New York            2            0.10
Over the Counter    15           0.75
Total               20           1.00
b.
Earnings Per Share    Frequency    Relative Frequency
0.00 - 0.19           7            0.35
0.20 - 0.39           7            0.35
0.40 - 0.59           1            0.05
0.60 - 0.79           3            0.15
0.80 - 0.99           2            0.10
Total                 20           1.00
Seventy percent of the shadow stocks have earnings per share less than $0.40. It looks like low EPS should be expected for shadow stocks.
Price-Earning Ratio    Frequency    Relative Frequency
0.00 - 9.9             3            0.15
10.0 - 19.9            7            0.35
20.0 - 29.9            4            0.20
30.0 - 39.9            3            0.15
40.0 - 49.9            2            0.10
50.0 - 59.9            1            0.05
Total                  20           1.00
P-E Ratios vary considerably, but there is a significant cluster in the 10 - 19.9 range.
42.
Income ($)       Frequency    Relative Frequency
18,000-21,999    13           0.255
22,000-25,999    20           0.392
26,000-29,999    12           0.235
30,000-33,999    4            0.078
34,000-37,999    2            0.039
Total            51           1.000
[Histogram: Frequency (0 to 25) by Per Capita Income (18,000 - 21,999 through 34,000 - 37,999)]
43. a.
 0 | 8 9
 1 | 0 2 2 2 3 4 4 4
 1 | 5 5 6 6 6 6 7 7 8 8 8 8 9 9 9
 2 | 0 1 2 2 2 3 4 4 4
 2 | 5 6 8
 3 | 0 1 3
b/c/d.
Number Answered Correctly    Frequency    Relative Frequency    Cumulative Frequency
5 - 9                        2            0.050                 2
10 - 14                      8            0.200                 10
15 - 19                      15           0.375                 25
20 - 24                      9            0.225                 34
25 - 29                      3            0.075                 37
30 - 34                      3            0.075                 40
Totals                       40           1.000
e.
Relatively few of the students (25%) were able to answer 1/2 or more of the questions correctly. The data seem to support the Joint Council on Economic Education’s claim. However, the degree of difficulty of the questions needs to be taken into account before reaching a final conclusion.
44. a/b.
High Temperature
 3 |
 4 |
 5 | 7
 6 | 1 4 4 4 4 6 8
 7 | 3 5 7 9
 8 | 0 1 1 4 6
 9 | 0 2 3
Low Temperature
 3 | 9
 4 | 3 6 8
 5 | 0 0 0 2 4 4 5 5 7 9
 6 | 1 8
 7 | 2 4 5 5
 8 |
 9 |
c.
It is clear that the range of low temperatures is below the range of high temperatures. Looking at the stem-and-leaf displays side by side, it appears that the range of low temperatures is about 20 degrees below the range of high temperatures.
d.
There are two stems showing high temperatures of 80 degrees or higher. They show 8 cities with high temperatures of 80 degrees or higher.
e.
Temperature    High Temp. Frequency    Low Temp. Frequency
30-39          0                       1
40-49          0                       3
50-59          1                       10
60-69          7                       2
70-79          4                       4
80-89          5                       0
90-99          3                       0
Total          20                      20
45. a. [Scatter diagram: Low Temperature (vertical axis, 30 to 80) versus High Temperature (horizontal axis, 40 to 100)]
b.
There is clearly a positive relationship between high and low temperature for cities. As one goes up so does the other.
46. a.
                                      Satisfaction Score
Occupation            30-39    40-49    50-59    60-69    70-79    80-89    Total
Cabinetmaker                            2        4        3        1        10
Lawyer                1        5        2        1        1                 10
Physical Therapist                      5        2        1        2        10
Systems Analyst                2        1        4        3                 10
Total                 1        7        10       11       8        3        40
b.
                                      Satisfaction Score
Occupation            30-39    40-49    50-59    60-69    70-79    80-89    Total
Cabinetmaker                            20       40       30       10       100
Lawyer                10       50       20       10       10                100
Physical Therapist                      50       20       10       20       100
Systems Analyst                20       10       40       30                100
c.
Each row of the percent crosstabulation shows a percent frequency distribution for an occupation. Cabinet makers seem to have the higher job satisfaction scores while lawyers seem to have the lowest. Fifty percent of the physical therapists have mediocre scores but the rest are rather high.
47. a. [Scatter diagram: Revenue $mil (vertical axis, 0 to 40,000) versus Employees (horizontal axis)]
b.
There appears to be a positive relationship between number of employees and revenue. As the number of employees increases, annual revenue increases.
48. a.
                                       Fuel Type
Year Constructed    Elect.    Nat. Gas    Oil    Propane    Other    Total
1973 or before      40        183         12     5          7        247
1974-1979           24        26          2      2          0        54
1980-1986           37        38          1      0          6        82
1987-1991           48        70          2      0          1        121
Total               149       317         17     7          14       504
b.
Year Constructed    Frequency
1973 or before      247
1974-1979           54
1980-1986           82
1987-1991           121
Total               504
Fuel Type      Frequency
Electricity    149
Nat. Gas       317
Oil            17
Propane        7
Other          14
Total          504
c. Crosstabulation of Column Percentages
                                       Fuel Type
Year Constructed    Elect.    Nat. Gas    Oil      Propane    Other
1973 or before      26.9      57.7        70.5     71.4       50.0
1974-1979           16.1      8.2         11.8     28.6       0.0
1980-1986           24.8      12.0        5.9      0.0        42.9
1987-1991           32.2      22.1        11.8     0.0        7.1
Total               100.0     100.0       100.0    100.0      100.0
d. Crosstabulation of Row Percentages
                                       Fuel Type
Year Constructed    Elect.    Nat. Gas    Oil    Propane    Other    Total
1973 or before      16.2      74.1        4.9    2.0        2.8      100.0
1974-1979           44.5      48.1        3.7    3.7        0.0      100.0
1980-1986           45.1      46.4        1.2    0.0        7.3      100.0
1987-1991           39.7      57.8        1.7    0.0        0.8      100.0
e. Observations from the column percentages crosstabulation:
For those buildings using electricity, the percentages have not changed greatly over the years. For the buildings using natural gas, the majority were constructed in 1973 or before; the second largest percentage was constructed in 1987-1991. Most of the buildings using oil were constructed in 1973 or before. All of the buildings using propane are older.
Observations from the row percentages crosstabulation:
Most of the buildings in the CG&E service area use electricity or natural gas. In the period 1973 or before most used natural gas. From 1974-1986, it is fairly evenly divided between electricity and natural gas. Since 1987 almost all new buildings are using electricity or natural gas with natural gas being the clear leader.
49. a. Crosstabulation for stockholder's equity and profit.
                                                   Profits ($000)
Stockholders' Equity ($000)    0-200    200-400    400-600    600-800    800-1000    1000-1200    Total
0-1200                         10       1                                            1            12
1200-2400                      4        10                               2                        16
2400-3600                      4        3          3          1          1           1            13
3600-4800                                                                1           2            3
4800-6000                               2          3          1                                   6
Total                          18       16         6          2          4           4            50
b. Crosstabulation of Row Percentages.
                                                   Profits ($000)
Stockholders' Equity ($1000s)    0-200    200-400    400-600    600-800    800-1000    1000-1200    Total
0-1200                           83.33    8.33       0.00       0.00       0.00        8.33         100
1200-2400                        25.00    62.50      0.00       0.00       12.50       0.00         100
2400-3600                        30.77    23.08      23.08      7.69       7.69        7.69         100
3600-4800                        0.00     0.00       0.00       0.00       33.33       66.67        100
4800-6000                        0.00     33.33      50.00      16.67      0.00        0.00         100
c. Stockholder's equity and profit seem to be related. As profit goes up, stockholder's equity goes up. The relationship, however, is not very strong.
50. a. Crosstabulation of market value and profit.
                                      Profit ($1000s)
Market Value ($1000s)    0-300    300-600    600-900    900-1200    Total
0-8000                   23       4                                 27
8000-16000               4        4          2          2           12
16000-24000                       2          1          1           4
24000-32000                       1          2          1           4
32000-40000                       2          1                      3
Total                    27       13         6          4           50
b. Crosstabulation of Row Percentages.
                                      Profit ($1000s)
Market Value ($1000s)    0-300    300-600    600-900    900-1200    Total
0-8000                   85.19    14.81      0.00       0.00        100
8000-16000               33.33    33.33      16.67      16.67       100
16000-24000              0.00     50.00      25.00      25.00       100
24000-32000              0.00     25.00      50.00      25.00       100
32000-40000              0.00     66.67      33.33      0.00        100
c. There appears to be a positive relationship between Profit and Market Value. As profit goes up, Market Value goes up.
51. a. Scatter diagram of Profit vs. Stockholder's Equity.
[Scatter diagram: Profit ($1000s) (vertical axis, 0.0 to 1400.0) versus Stockholder's Equity ($1000s) (horizontal axis, 0.0 to 7000.0)]
b.
Profit and Stockholder's Equity appear to be positively related.
52. a. Scatter diagram of Market Value and Stockholder's Equity.
[Scatter diagram: Market Value ($1000s) (vertical axis, 0.0 to 45000.0) versus Stockholder's Equity ($1000s) (horizontal axis, 0.0 to 7000.0)]
b.
There is a positive relationship between Market Value and Stockholder's Equity.
Chapter 3 Descriptive Statistics: Numerical Methods
Learning Objectives
1. Understand the purpose of measures of location.
2. Be able to compute the mean, median, mode, quartiles, and various percentiles.
3. Understand the purpose of measures of variability.
4. Be able to compute the range, interquartile range, variance, standard deviation, and coefficient of variation.
5. Understand how z scores are computed and how they are used as a measure of relative location of a data value.
6. Know how Chebyshev’s theorem and the empirical rule can be used to determine the percentage of the data within a specified number of standard deviations from the mean.
7. Learn how to construct a 5-number summary and a box plot.
8. Be able to compute and interpret covariance and correlation as measures of association between two variables.
9. Be able to compute a weighted mean.
Solutions:
1.
$\bar{x} = \frac{\Sigma x_i}{n} = \frac{75}{5} = 15$
10, 12, 16, 17, 20
Median = 16 (middle value)
2.
$\bar{x} = \frac{\Sigma x_i}{n} = \frac{96}{6} = 16$
10, 12, 16, 17, 20, 21
Median = $\frac{16 + 17}{2} = 16.5$
3.
15, 20, 25, 25, 27, 28, 30, 32
$i = \frac{20}{100}(8) = 1.6$; 2nd position = 20
$i = \frac{25}{100}(8) = 2$; $\frac{20 + 25}{2} = 22.5$
$i = \frac{65}{100}(8) = 5.2$; 6th position = 28
$i = \frac{75}{100}(8) = 6$; $\frac{28 + 30}{2} = 29$
4.
Mean = $\frac{\Sigma x_i}{n} = \frac{657}{11} = 59.727$
Median = 57 (6th item)
Mode = 53 (it appears 3 times)
5.
a.
$\bar{x} = \frac{\Sigma x_i}{n} = \frac{1106.4}{30} = 36.88$
b.
There are an even number of items. Thus, the median is the average of the 15th and 16th items after the data have been placed in rank order.
Median = $\frac{36.6 + 36.7}{2} = 36.65$
c.
Mode = 36.4 (this value appears 4 times)
d.
First Quartile: $i = \left(\frac{25}{100}\right)30 = 7.5$
Rounding up, we see that Q1 is at the 8th position. Q1 = 36.2
e.
Third Quartile: $i = \left(\frac{75}{100}\right)30 = 22.5$
Rounding up, we see that Q3 is at the 23rd position. Q3 = 37.9
6.
a.
$\bar{x} = \frac{\Sigma x_i}{n} = \frac{1845}{20} = 92.25$
Median is average of 10th and 11th values after arranging in ascending order.
Median = $\frac{66 + 95}{2} = 80.5$
Data are multimodal
b.
$\bar{x} = \frac{\Sigma x_i}{n} = \frac{1334}{20} = 66.7$
Median = $\frac{66 + 70}{2} = 68$
Mode = 70 (4 brokers charge $70)
c.
Comparing all three measures of central location (mean, median and mode), we conclude that it costs more, on average, to trade 500 shares at $50 per share.
d.
Yes, trading 500 shares at $50 per share is a transaction value of $25,000 whereas trading 1000 shares at $5 per share is a transaction value of $5000.
7.
a.
$\bar{x} = \frac{\Sigma x_i}{n} = \frac{1380}{30} = 46$
b.
Yes, the mean here is 46 minutes. The newspaper reported an average of 45 minutes.
c.
Median = $\frac{45 + 52.9}{2} = 48.95$
d.
Q1 = 7 (value of 8th item in ranked order)
Q3 = 70.4 (value of 23rd item in ranked list)
e.
Find position $i = \frac{40}{100}(30) = 12$; the 40th percentile is the average of the values in the 12th and 13th positions.
40th percentile = $\frac{28.8 + 29.1}{2} = 28.95$
8.
a.
$\bar{x} = \frac{\Sigma x_i}{n} = \frac{695}{20} = 34.75$
Mode = 25 (appears three times)
b.
Data in order: 18, 20, 25, 25, 25, 26, 27, 27, 28, 33, 36, 37, 40, 40, 42, 45, 46, 48, 53, 54
Median (10th and 11th positions) = $\frac{33 + 36}{2} = 34.5$
At home workers are slightly younger
c.
$i = \frac{25}{100}(20) = 5$; use positions 5 and 6
Q1 = $\frac{25 + 26}{2} = 25.5$
$i = \frac{75}{100}(20) = 15$; use positions 15 and 16
Q3 = $\frac{42 + 45}{2} = 43.5$
d.
$i = \frac{32}{100}(20) = 6.4$; round up to position 7
32nd percentile = 27
At least 32% of the people are 27 or younger.
9.
a.
$\bar{x} = \frac{\Sigma x_i}{n} = \frac{270,377}{25} = 10,815.08$
Median (Position 13) = 8296
b.
Median would be better because of large data values.
c.
$i = (25/100)25 = 6.25$; Q1 (Position 7) = 5984
$i = (75/100)25 = 18.75$; Q3 (Position 19) = 14,330
d.
$i = (85/100)25 = 21.25$
85th percentile (position 22) = 15,593. Approximately 85% of the websites have fewer than 15,593 unique visitors.
10. a.
$\Sigma x_i = 435$
$\bar{x} = \frac{\Sigma x_i}{n} = \frac{435}{9} = 48.33$
Data in ascending order: 28 42 45 48 49 50 55 58 60
Median = 49
Do not report a mode; each data value occurs once.
The index could be considered good since both the mean and median are less than 50.
b.
$i = \frac{25}{100}(9) = 2.25$; Q1 (3rd position) = 45
$i = \frac{75}{100}(9) = 6.75$; Q3 (7th position) = 55
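The position rule used in these solutions (compute i = (p/100)n, round up when i is not an integer, otherwise average positions i and i + 1) can be sketched in Python. This is only an illustration: the function name `percentile` is hypothetical, and the sketch assumes p is a whole-number percentage strictly between 0 and 100.

```python
def percentile(data, p):
    """p-th percentile by the position rule (assumes integer p with 0 < p < 100)."""
    xs = sorted(data)
    n = len(xs)
    # i = (p/100) * n, kept in integer arithmetic to avoid rounding surprises
    whole, remainder = divmod(p * n, 100)
    if remainder:
        # i is not an integer: round up to position whole + 1 (1-based)
        return xs[whole]
    # i is an integer: average the values in positions i and i + 1 (1-based)
    return (xs[whole - 1] + xs[whole]) / 2


# Data from problem 10 (n = 9) and problem 8 (n = 20)
index_values = [28, 42, 45, 48, 49, 50, 55, 58, 60]
ages = [18, 20, 25, 25, 25, 26, 27, 27, 28, 33,
        36, 37, 40, 40, 42, 45, 46, 48, 53, 54]

print(percentile(index_values, 25), percentile(index_values, 75))
print(percentile(ages, 25), percentile(ages, 75), percentile(ages, 32))
```

Statistical packages often interpolate between positions instead, so their quartiles may differ slightly from the values given by this rule.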
11.
Using the mean we get $\bar{x}_{city} = 15.58$ and $\bar{x}_{country} = 18.92$. For the samples we see that the mean mileage is better in the country than in the city.
City: 13.2 14.4 15.2 15.3 15.3 15.3 15.9 16 16.1 16.2 16.2 16.7 16.8
Median = 15.9; Mode: 15.3
Country: 17.2 17.4 18.3 18.5 18.6 18.6 18.7 19.0 19.2 19.4 19.4 20.6 21.1
Median = 18.7; Mode: 18.6, 19.4
The median and modal mileages are also better in the country than in the city.
12. a.
$\bar{x} = \frac{\Sigma x_i}{n} = \frac{12,780}{20} = \$639$
b.
$\bar{x} = \frac{\Sigma x_i}{n} = \frac{1976}{20} = 98.8$ pictures
c.
$\bar{x} = \frac{\Sigma x_i}{n} = \frac{2204}{20} = 110.2$ minutes
d.
This is not an easy choice because it is a multicriteria problem. If price was the only criterion, the lowest price camera (Fujifilm DX-10) would be preferred. If maximum picture capacity was the only criterion, the maximum picture capacity camera (Kodak DC280 Zoom) would be preferred. But, if battery life was the only criterion, the maximum battery life camera (Fujifilm DX10) would be preferred. There are many approaches used to select the best choice in a multicriteria situation. These approaches are discussed in more specialized books on decision analysis.
13.
Range = 20 - 10 = 10
10, 12, 16, 17, 20
$i = \frac{25}{100}(5) = 1.25$; Q1 (2nd position) = 12
$i = \frac{75}{100}(5) = 3.75$; Q3 (4th position) = 17
IQR = Q3 - Q1 = 17 - 12 = 5
14.
$\bar{x} = \frac{\Sigma x_i}{n} = \frac{75}{5} = 15$
$s^2 = \frac{\Sigma (x_i - \bar{x})^2}{n - 1} = \frac{64}{4} = 16$
$s = \sqrt{16} = 4$
15.
15, 20, 25, 25, 27, 28, 30, 34
Range = 34 - 15 = 19
$i = \frac{25}{100}(8) = 2$; Q1 = $\frac{20 + 25}{2} = 22.5$
$i = \frac{75}{100}(8) = 6$; Q3 = $\frac{28 + 30}{2} = 29$
IQR = Q3 - Q1 = 29 - 22.5 = 6.5
$\bar{x} = \frac{\Sigma x_i}{n} = \frac{204}{8} = 25.5$
$s^2 = \frac{\Sigma (x_i - \bar{x})^2}{n - 1} = \frac{242}{7} = 34.57$
$s = \sqrt{34.57} = 5.88$
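As a quick check on the arithmetic in problem 15, the sample variance and standard deviation can be computed directly from the data. The sketch below is illustrative only; the helper name `sample_variance` is an assumption, not something from the text.

```python
import math


def sample_variance(data):
    """Sample variance with the n - 1 divisor used throughout these solutions."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** 2 for x in data) / (n - 1)


data_15 = [15, 20, 25, 25, 27, 28, 30, 34]  # data from problem 15
s2 = sample_variance(data_15)
s = math.sqrt(s2)
print(round(s2, 2), round(s, 2))
```

Python's standard library offers `statistics.variance` and `statistics.stdev`, which use the same n - 1 divisor.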
16. a.
Range = 190 - 168 = 22
b.
$\Sigma (x_i - \bar{x})^2 = 376$
$s^2 = \frac{376}{5} = 75.2$
c.
$s = \sqrt{75.2} = 8.67$
d.
Coefficient of Variation = $\left(\frac{8.67}{178}\right)100 = 4.87$
17.
Range = 92 - 67 = 25
IQR = Q3 - Q1 = 80 - 77 = 3
$\bar{x} = 78.4667$
$\Sigma (x_i - \bar{x})^2 = 411.7333$
$s^2 = \frac{\Sigma (x_i - \bar{x})^2}{n - 1} = \frac{411.7333}{14} = 29.4095$
$s = \sqrt{29.4095} = 5.4231$
18. a.
$\bar{x} = \frac{\Sigma x_i}{n}$: 115.13 (Mainland); 36.62 (Asia)
Median (7th and 8th position) Mainland = (110.87 + 112.25) / 2 = 111.56
Median (6th and 7th position) Asia = (32.98 + 40.41) / 2 = 36.695
b.
Range = High - Low
                            Mainland     Asia
Range                          86.24    42.97
Standard Deviation             26.82    11.40
Coefficient of Variation       23.30    31.13
c.
Greater mean and standard deviation for Mainland. Greater coefficient of variation for Asia.
19. a.
Range = 60 - 28 = 32
IQR = Q3 - Q1 = 55 - 45 = 10
b.
$\bar{x} = \frac{435}{9} = 48.33$
$\Sigma (x_i - \bar{x})^2 = 742$
$s^2 = \frac{\Sigma (x_i - \bar{x})^2}{n - 1} = \frac{742}{8} = 92.75$
$s = \sqrt{92.75} = 9.63$
c.
The average air quality is about the same, but the variability is greater in Anaheim.
20.
Dawson Supply: Range = 11 - 9 = 2
$s = \sqrt{\frac{4.1}{9}} = 0.67$
J.C. Clark: Range = 15 - 7 = 8
$s = \sqrt{\frac{60.1}{9}} = 2.58$
21. a.
Winter: Range = 21 - 12 = 9; IQR = Q3 - Q1 = 20 - 16 = 4
Summer: Range = 38 - 18 = 20; IQR = Q3 - Q1 = 29 - 18 = 11
b.
            Variance    Standard Deviation
Winter        8.2333                2.8694
Summer       44.4889                6.6700
c.
Winter: Coefficient of Variation = $\left(\frac{s}{\bar{x}}\right)100 = \left(\frac{2.8694}{17.7}\right)100 = 16.21$
Summer: Coefficient of Variation = $\left(\frac{s}{\bar{x}}\right)100 = \left(\frac{6.6700}{25.6}\right)100 = 26.05$
d.
More variability in the summer months.
22. a.
500 Shares at $50
Min Value = 34
Max Value = 195
Range = 195 - 34 = 161
Q1 = $\frac{45 + 50}{2} = 47.5$
Q3 = $\frac{140 + 140}{2} = 140$
Interquartile range = 140 - 47.5 = 92.5
1000 Shares at $5
Min Value = 34
Max Value = 90
Range = 90 - 34 = 56
Q1 = $\frac{60 + 60.5}{2} = 60.25$
Q3 = $\frac{79.5 + 80}{2} = 79.75$
Interquartile range = 79.75 - 60.25 = 19.5
b.
500 Shares at $50
$s^2 = \frac{\Sigma (x_i - \bar{x})^2}{n - 1} = \frac{51,402.25}{19} = 2705.3816$
$s = \sqrt{2705.3816} = 52.01$
1000 Shares at $5
$s^2 = \frac{\Sigma (x_i - \bar{x})^2}{n - 1} = \frac{5526.2}{19} = 290.8526$
$s = \sqrt{290.8526} = 17.05$
c.
500 Shares at $50
Coefficient of Variation = $\frac{s}{\bar{x}}(100) = \frac{52.01}{92.25}(100) = 56.38$
1000 Shares at $5
Coefficient of Variation = $\frac{s}{\bar{x}}(100) = \frac{17.05}{66.70}(100) = 25.56$
d.
The variability is greater for the trade of 500 shares at $50 per share. This is true whether we use the standard deviation or the coefficient of variation as a measure.
23.
s2 = 0.0021 Production should not be shut down since the variance is less than .005.
24.
Quarter milers
s = 0.0564
Coefficient of Variation = $(s/\bar{x})100 = (0.0564/0.966)100 = 5.8$
Milers
s = 0.1295
Coefficient of Variation = $(s/\bar{x})100 = (0.1295/4.534)100 = 2.9$
Yes; the coefficient of variation shows that, as a percentage of the mean, the quarter milers’ times show more variability.
25.
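$\bar{x} = \frac{\Sigma x_i}{n} = \frac{75}{5} = 15$
$s^2 = \frac{\Sigma (x_i - \bar{x})^2}{n - 1} = \frac{64}{4} = 16$, so $s = \sqrt{16} = 4$
10: $z = \frac{10 - 15}{4} = -1.25$
20: $z = \frac{20 - 15}{4} = +1.25$
12: $z = \frac{12 - 15}{4} = -0.75$
17: $z = \frac{17 - 15}{4} = +.50$
16: $z = \frac{16 - 15}{4} = +.25$
26.
$z = \frac{520 - 500}{100} = +.20$
$z = \frac{650 - 500}{100} = +1.50$
$z = \frac{500 - 500}{100} = 0.00$
$z = \frac{450 - 500}{100} = -0.50$
$z = \frac{280 - 500}{100} = -2.20$
27. a.
$z = \frac{40 - 30}{5} = 2$; $1 - \frac{1}{2^2} = 0.75$; At least 75%
b.
$z = \frac{45 - 30}{5} = 3$; $1 - \frac{1}{3^2} = 0.89$; At least 89%
c.
$z = \frac{38 - 30}{5} = 1.6$; $1 - \frac{1}{1.6^2} = 0.61$; At least 61%
d.
$z = \frac{42 - 30}{5} = 2.4$; $1 - \frac{1}{2.4^2} = 0.83$; At least 83%
e.
$z = \frac{48 - 30}{5} = 3.6$; $1 - \frac{1}{3.6^2} = 0.92$; At least 92%
28. a.
Approximately 95%
b.
Almost all
c.
Approximately 68%
29. a.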
This is from 2 standard deviations below the mean to 2 standard deviations above the mean. With z = 2, Chebyshev’s theorem gives:
$1 - \frac{1}{z^2} = 1 - \frac{1}{2^2} = 1 - \frac{1}{4} = \frac{3}{4}$
Therefore, at least 75% of adults sleep between 4.5 and 9.3 hours per day.
b.
This is from 2.5 standard deviations below the mean to 2.5 standard deviations above the mean. With z = 2.5, Chebyshev’s theorem gives:
$1 - \frac{1}{z^2} = 1 - \frac{1}{2.5^2} = 1 - \frac{1}{6.25} = .84$
Therefore, at least 84% of adults sleep between 3.9 and 9.9 hours per day.
c.
With z = 2, the empirical rule suggests that 95% of adults sleep between 4.5 and 9.3 hours per day. The probability obtained using the empirical rule is greater than the probability obtained using Chebyshev’s theorem.
30. a.
2 hours is 1 standard deviation below the mean. Thus, the empirical rule suggests that 68% of the kids watch television between 2 and 4 hours per day. Since a bell-shaped distribution is symmetric, approximately 34% of the kids watch television between 2 and 3 hours per day.
b.
1 hour is 2 standard deviations below the mean. Thus, the empirical rule suggests that 95% of the kids watch television between 1 and 5 hours per day. Since a bell-shaped distribution is symmetric, approximately 47.5% of the kids watch television between 1 and 3 hours per day. In part (a) we concluded that approximately 34% of the kids watch television between 2 and 3 hours per day; thus, approximately 34% of the kids watch television between 3 and 4 hours per day. Hence, approximately 47.5% + 34% = 81.5% of kids watch television between 1 and 4 hours per day.
c.
Since 34% of the kids watch television between 3 and 4 hours per day, 50% - 34% = 16% of the kids watch television more than 4 hours per day.
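The Chebyshev bounds used in problems 27 and 29 all come from the same expression, 1 - 1/z². A minimal sketch follows; the function name `chebyshev_lower_bound` and the small `EMPIRICAL_RULE` lookup are illustrative assumptions, and the empirical-rule figures apply only when the data are roughly bell shaped.

```python
def chebyshev_lower_bound(z):
    """Minimum proportion of data within z standard deviations of the mean (z > 1)."""
    if z <= 1:
        raise ValueError("Chebyshev's theorem is informative only for z > 1")
    return 1 - 1 / z ** 2


# Approximate empirical-rule proportions for a bell-shaped distribution
EMPIRICAL_RULE = {1: 0.68, 2: 0.95}

for z in (2, 2.5, 3):
    print(z, round(chebyshev_lower_bound(z), 2))
```

For z = 2 the Chebyshev bound is smaller than the empirical-rule figure because Chebyshev's theorem makes no assumption about the shape of the distribution.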
31. a.
Approximately 68% of scores are within 1 standard deviation from the mean.
b.
Approximately 95% of scores are within 2 standard deviations from the mean.
c.
Approximately (100% - 95%) / 2 = 2.5% of scores are over 130.
d.
Yes, almost all IQ scores are less than 145.
32. a.
$z = \frac{71.00 - 90.06}{20} = -0.95$
b.
$z = \frac{168 - 90.06}{20} = 3.90$
c.
The z-score in part a indicates that the value is 0.95 standard deviations below the mean. The z-score in part b indicates that the value is 3.90 standard deviations above the mean.
The labor cost in part b is an outlier and should be reviewed for accuracy.
33. a.
$\bar{x}$ is approximately 63 (\$63,000), and $s$ is 4 (\$4000)
b.
This is from 2 standard deviations below the mean to 2 standard deviations above the mean. With z = 2, Chebyshev’s theorem gives:
$1 - \frac{1}{z^2} = 1 - \frac{1}{2^2} = 1 - \frac{1}{4} = \frac{3}{4}$
Therefore, at least 75% of benefits managers have an annual salary between $55,000 and $71,000.
c.
The histogram of the salary data is shown below:
[Histogram: Frequency (0 to 9) vs. Salary, with classes 56-58, 58-60, 60-62, 62-64, 64-66, 66-68, 68-70, 70-72, 72-74]
Although the distribution is not perfectly bell shaped, it does appear reasonable to assume that the distribution of annual salary can be approximated by a bell-shaped distribution.
d.
With z = 2, the empirical rule suggests that 95% of benefits managers have an annual salary between $55,000 and $71,000. The probability is much higher than obtained using Chebyshev’s theorem, but requires the assumption that the distribution of annual salary is bell shaped.
e. There are no outliers because all the observations are within 3 standard deviations of the mean.
34. a. x is 100 and s is 13.88 or approximately 14 b.
If the distribution is bell shaped with a mean of 100 points, the percentage of NBA games in which the winning team scores more than 100 points is 50%. A score of 114 points is z = 1 standard deviation above the mean. Thus, the empirical rule suggests that 68% of the winning teams will score between 86 and 114 points. In other words, 32% of the winning teams will score less than 86 points or more than 114 points. Because a bell-shaped distribution is symmetric, approximately 16% of the winning teams will score more than 114 points.
c.
For the winning margin, $\bar{x}$ is 11.1 and $s$ is 10.77. To see if there are any outliers, we will first compute the z-score for the winning margin that is farthest from the sample mean of 11.1, a winning margin of 32 points.
$z = \frac{x - \bar{x}}{s} = \frac{32 - 11.1}{10.77} = 1.94$
Thus, a winning margin of 32 points is not an outlier (z = 1.94 < 3). Because a winning margin of 32 points is farthest from the mean, none of the other data values can have a z-score that is less than -3 or greater than +3, and hence we conclude that there are no outliers.
35. a.
$\bar{x} = \frac{\Sigma x_i}{n} = \frac{79.86}{20} = 3.99$
Median = $\frac{4.17 + 4.20}{2} = 4.185$ (average of 10th and 11th values)
b.
Q1 = 4.00 (average of 5th and 6th values)
Q3 = 4.50 (average of 15th and 16th values)
c.
$s = \sqrt{\frac{\Sigma (x_i - \bar{x})^2}{n - 1}} = \sqrt{\frac{12.5080}{19}} = 0.8114$
d.
Allison One: $z = \frac{4.12 - 3.99}{0.8114} \approx 0.16$
Omni Audio SA 12.3: $z = \frac{2.32 - 3.99}{0.8114} \approx -2.06$
e.
The lowest rating is for the Bose 501 Series. Its z-score is:
$z = \frac{2.14 - 3.99}{0.8114} \approx -2.28$
This is not an outlier, so there are no outliers.
36.
15, 20, 25, 25, 27, 28, 30, 34
Smallest = 15
$i = \frac{25}{100}(8) = 2$; Q1 = $\frac{20 + 25}{2} = 22.5$
Median = $\frac{25 + 27}{2} = 26$
$i = \frac{75}{100}(8) = 6$; Q3 = $\frac{28 + 30}{2} = 29$
Largest = 34
37.
[Box plot on an axis from 15 to 35]
38.
5, 6, 8, 10, 10, 12, 15, 16, 18
Smallest = 5
$i = \frac{25}{100}(9) = 2.25$; Q1 = 8 (3rd position)
Median = 10
$i = \frac{75}{100}(9) = 6.75$; Q3 = 15 (7th position)
Largest = 18
[Box plot on an axis from 5 to 20]
39.
IQR = 50 - 42 = 8
Lower Limit: Q1 - 1.5 IQR = 42 - 12 = 30
Upper Limit: Q3 + 1.5 IQR = 50 + 12 = 62
65 is an outlier
40. a.
Five number summary: 5 9.6 14.5 19.2 52.7
b.
IQR = Q3 - Q1 = 19.2 - 9.6 = 9.6
Lower Limit: Q1 - 1.5(IQR) = 9.6 - 1.5(9.6) = -4.8
Upper Limit: Q3 + 1.5(IQR) = 19.2 + 1.5(9.6) = 33.6
c.
The data value 41.6 is an outlier (larger than the upper limit) and so is the data value 52.7. The financial analyst should first verify that these values are correct. Perhaps a typing error has caused 25.7 to be typed as 52.7 (or 14.6 to be typed as 41.6). If the outliers are correct, the analyst might consider these companies with an unusually large return on equity as good investment candidates.
d.
[Box plot on an axis from -10 to 65, with the two outliers marked by asterisks]
41. a.
Median (11th position) = 4019
$i = \frac{25}{100}(21) = 5.25$; Q1 (6th position) = 1872
$i = \frac{75}{100}(21) = 15.75$; Q3 (16th position) = 8305
Five number summary: 608, 1872, 4019, 8305, 14138
b.
Limits:
IQR = Q3 - Q1 = 8305 - 1872 = 6433
Lower Limit: Q1 - 1.5(IQR) = -7777
Upper Limit: Q3 + 1.5(IQR) = 17955
c.
There are no outliers; all data are within the limits.
d.
Yes, if the first two digits in Johnson and Johnson's sales were transposed to 41,138, sales would have shown up as an outlier. A review of the data would have enabled the correction of the data.
e.
[Box plot on an axis from 0 to 15,000]
42. a.
Mean = 105.7933 Median = 52.7
b.
Q1 = 15.7 Q3 = 78.3
c.
IQR = Q3 - Q1 = 78.3 - 15.7 = 62.6
Lower limit for box plot = Q1 - 1.5(IQR) = 15.7 - 1.5(62.6) = -78.2
Upper limit for box plot = Q3 + 1.5(IQR) = 78.3 + 1.5(62.6) = 172.2
Note: Because the number of shares covered by options grants cannot be negative, the lower limit for the box plot is set at 0. Thus, outliers are values in the data set greater than 172.2.
Outliers: Silicon Graphics (188.8) and ToysRUs (247.6)
d.
Mean percentage = 26.73. The current percentage is much greater.
43. a.
Five Number Summary (Midsize): 51 71.5 81.5 96.5 128
Five Number Summary (Small): 73 101 108.5 121 140
b.
Box Plots
[Box plot, Midsize: axis from 50 to 130]
[Box plot, Small Size: axis from 50 to 150]
c.
The midsize cars appear to be safer than the small cars.
44. a.
$\bar{x} = 37.48$; Median = 23.67
b.
Q1 = 7.91; Q3 = 51.92
c.
IQR = 51.92 - 7.91 = 44.01
Lower Limit: Q1 - 1.5(IQR) = 7.91 - 1.5(44.01) = -58.11
Upper Limit: Q3 + 1.5(IQR) = 51.92 + 1.5(44.01) = 117.94
Russia, with a percent change of 125.89, is an outlier. Turkey, with a percent change of 254.45, is another outlier.
d.
With a percent change of 22.64, the United States is just below the 50th percentile - the median.
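The box plot limits used in problems 39 through 44 follow one rule: values below Q1 - 1.5(IQR) or above Q3 + 1.5(IQR) are flagged as outliers. A minimal sketch using the quartiles from problem 44; the function names are illustrative assumptions, and only the three percent changes quoted in the solution are checked because the full data set is not reproduced here.

```python
def box_plot_limits(q1, q3):
    """Lower and upper limits for flagging outliers in a box plot."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def find_outliers(values, q1, q3):
    """Values that fall outside the box plot limits."""
    lower, upper = box_plot_limits(q1, q3)
    return [v for v in values if v < lower or v > upper]


# Quartiles from problem 44; the values are the percent changes quoted there
lower, upper = box_plot_limits(7.91, 51.92)
quoted_changes = [22.64, 125.89, 254.45]  # United States, Russia, Turkey
print(lower, upper)
print(find_outliers(quoted_changes, 7.91, 51.92))
```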
45. a.
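[Scatter diagram of y vs. x]
b.
Negative relationship
c/d.
$\Sigma x_i = 40$, $\bar{x} = \frac{40}{5} = 8$
$\Sigma y_i = 230$, $\bar{y} = \frac{230}{5} = 46$
$\Sigma (x_i - \bar{x})(y_i - \bar{y}) = -240$
$\Sigma (x_i - \bar{x})^2 = 118$
$\Sigma (y_i - \bar{y})^2 = 520$
$s_{xy} = \frac{\Sigma (x_i - \bar{x})(y_i - \bar{y})}{n - 1} = \frac{-240}{5 - 1} = -60$
$s_x = \sqrt{\frac{\Sigma (x_i - \bar{x})^2}{n - 1}} = \sqrt{\frac{118}{5 - 1}} = 5.4314$
$s_y = \sqrt{\frac{\Sigma (y_i - \bar{y})^2}{n - 1}} = \sqrt{\frac{520}{5 - 1}} = 11.4018$
$r_{xy} = \frac{s_{xy}}{s_x s_y} = \frac{-60}{(5.4314)(11.4018)} = -0.969$
There is a strong negative linear relationship.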
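The covariance and correlation steps in problem 45 start from three sums of deviations. Because the raw data are not listed in the solution, the sketch below works from those sums; the function name `covariance_and_correlation` is an illustrative assumption.

```python
import math


def covariance_and_correlation(sum_xy_dev, sum_xx_dev, sum_yy_dev, n):
    """Sample covariance and correlation from sums of deviations about the means."""
    s_xy = sum_xy_dev / (n - 1)
    s_x = math.sqrt(sum_xx_dev / (n - 1))
    s_y = math.sqrt(sum_yy_dev / (n - 1))
    return s_xy, s_xy / (s_x * s_y)


# Sums from problem 45: sum of cross-products, sum of squares for x and for y
s_xy, r_xy = covariance_and_correlation(-240, 118, 520, 5)
print(s_xy, round(r_xy, 3))
```

The same function reproduces problem 46 with the inputs 106, 272, 86 and n = 5.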
46. a.
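[Scatter diagram of y vs. x]
b.
Positive relationship
c/d.
$\Sigma x_i = 80$, $\bar{x} = \frac{80}{5} = 16$
$\Sigma y_i = 50$, $\bar{y} = \frac{50}{5} = 10$
$\Sigma (x_i - \bar{x})(y_i - \bar{y}) = 106$
$\Sigma (x_i - \bar{x})^2 = 272$
$\Sigma (y_i - \bar{y})^2 = 86$
$s_{xy} = \frac{\Sigma (x_i - \bar{x})(y_i - \bar{y})}{n - 1} = \frac{106}{5 - 1} = 26.5$
$s_x = \sqrt{\frac{\Sigma (x_i - \bar{x})^2}{n - 1}} = \sqrt{\frac{272}{5 - 1}} = 8.2462$
$s_y = \sqrt{\frac{\Sigma (y_i - \bar{y})^2}{n - 1}} = \sqrt{\frac{86}{5 - 1}} = 4.6368$
$r_{xy} = \frac{s_{xy}}{s_x s_y} = \frac{26.5}{(8.2462)(4.6368)} = 0.693$
A positive linear relationship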
47. a.
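[Scatter diagram: y = SAT on the vertical axis, x = GPA on the horizontal axis]
b.
Positive relationship
c/d.
$\Sigma x_i = 19.8$, $\bar{x} = \frac{19.8}{6} = 3.3$
$\Sigma y_i = 3540$, $\bar{y} = \frac{3540}{6} = 590$
$\Sigma (x_i - \bar{x})(y_i - \bar{y}) = 143$
$\Sigma (x_i - \bar{x})^2 = 0.74$
$\Sigma (y_i - \bar{y})^2 = 36,400$
$s_{xy} = \frac{\Sigma (x_i - \bar{x})(y_i - \bar{y})}{n - 1} = \frac{143}{6 - 1} = 28.6$
$s_x = \sqrt{\frac{\Sigma (x_i - \bar{x})^2}{n - 1}} = \sqrt{\frac{0.74}{6 - 1}} = 0.3847$
$s_y = \sqrt{\frac{\Sigma (y_i - \bar{y})^2}{n - 1}} = \sqrt{\frac{36,400}{6 - 1}} = 85.3229$
$r_{xy} = \frac{s_{xy}}{s_x s_y} = \frac{28.6}{(0.3847)(85.3229)} = 0.8713$
A positive linear relationship
48.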
Let x = driving speed and y = mileage
$\Sigma x_i = 420$, $\bar{x} = \frac{420}{10} = 42$
$\Sigma y_i = 270$, $\bar{y} = \frac{270}{10} = 27$
$\Sigma (x_i - \bar{x})(y_i - \bar{y}) = -475$
$\Sigma (x_i - \bar{x})^2 = 1660$
$\Sigma (y_i - \bar{y})^2 = 164$
$s_{xy} = \frac{\Sigma (x_i - \bar{x})(y_i - \bar{y})}{n - 1} = \frac{-475}{10 - 1} = -52.7778$
$s_x = \sqrt{\frac{\Sigma (x_i - \bar{x})^2}{n - 1}} = \sqrt{\frac{1660}{10 - 1}} = 13.5810$
$s_y = \sqrt{\frac{\Sigma (y_i - \bar{y})^2}{n - 1}} = \sqrt{\frac{164}{10 - 1}} = 4.2687$
$r_{xy} = \frac{s_{xy}}{s_x s_y} = \frac{-52.7778}{(13.5810)(4.2687)} = -.91$
A strong negative linear relationship
49. a.
The sample correlation coefficient is .78.
b.
There is a positive linear relationship between the performance score and the overall rating.
50. a.
The sample correlation coefficient is .92.
b.
There is a strong positive linear relationship between the two variables.
51.
The sample correlation coefficient is .88. This indicates a strong positive linear relationship between the daily high and low temperatures.
52. a.
$\bar{x} = \frac{\Sigma w_i x_i}{\Sigma w_i} = \frac{6(3.2) + 3(2) + 2(2.5) + 8(5)}{6 + 3 + 2 + 8} = \frac{70.2}{19} = 3.69$
b.
$\frac{3.2 + 2 + 2.5 + 5}{4} = \frac{12.7}{4} = 3.175$
53.
 fi     Mi    fi Mi
  4      5       20
  7     10       70
  9     15      135
  5     20      100
 25             325
$\bar{x} = \frac{\Sigma f_i M_i}{n} = \frac{325}{25} = 13$
 fi     Mi    Mi − x̄    (Mi − x̄)²    fi (Mi − x̄)²
  4      5       -8          64            256
  7     10       -3           9             63
  9     15       +2           4             36
  5     20       +7          49            245
                                           600
$s^2 = \frac{\Sigma f_i (M_i - \bar{x})^2}{n - 1} = \frac{600}{24} = 25$
$s = \sqrt{25} = 5$
54. a.
Grade xi    Weight wi (Credit Hours)
4 (A)            9
3 (B)           15
2 (C)           33
1 (D)            3
0 (F)            0
                60
$\bar{x} = \frac{\Sigma w_i x_i}{\Sigma w_i} = \frac{9(4) + 15(3) + 33(2) + 3(1)}{9 + 15 + 33 + 3} = \frac{150}{60} = 2.5$
b.
Yes; satisfies the 2.5 grade point average requirement
55.
 fi     Mi    fi Mi
  4      5       20
  7     10       70
  9     15      135
  5     20      100
 25             325
$\bar{x} = \frac{\Sigma f_i M_i}{n} = \frac{325}{25} = 13$
 fi     Mi    Mi − x̄    (Mi − x̄)²    fi (Mi − x̄)²
  4      5       -8          64            256
  7     10       -3           9             63
  9     15       +2           4             36
  5     20       +7          49            245
                                           600
$s^2 = \frac{\Sigma f_i (M_i - \bar{x})^2}{n - 1} = \frac{600}{24} = 25$
$s = \sqrt{25} = 5$
56.
  fi    Mi    fi Mi      Mi − x̄      (Mi − x̄)²    fi (Mi − x̄)²
  74     2      148    -8.742647     76.433877      5,656.1069
 192     7    1,344    -3.742647     14.007407      2,689.4221
 280    12    3,360     1.257353      1.580937        442.6622
 105    17    1,785     6.257353     39.154467      4,111.2190
  23    22      506    11.257353    126.728000      2,914.7439
   6    27      162    16.257353    264.301530      1,585.8092
 680          7,305                                17,399.9630
$\bar{x} = \frac{7305}{680} = 10.74$
Estimate of total gallons sold: (10.74)(120) = 1288.8
$s^2 = \frac{17,399.9630}{679} = 25.63$
$s = 5.06$
57. a.
Class     fi    Mi    fi Mi
0         15     0        0
1         10     1       10
2         40     2       80
3         85     3      255
4        350     4     1400
Totals   500           1745
$\bar{x} = \frac{\Sigma f_i M_i}{n} = \frac{1745}{500} = 3.49$
b.
Mi − x̄    (Mi − x̄)²    fi (Mi − x̄)²
 -3.49       12.18          182.70
 -2.49        6.20           62.00
 -1.49        2.22           88.80
 -0.49        0.24           20.41
 +0.51        0.26           91.04
Total                       444.95
$s^2 = \frac{\Sigma (M_i - \bar{x})^2 f_i}{n - 1} = \frac{444.95}{499} = 0.8917$
$s = \sqrt{0.8917} = 0.9443$
58. a.
$\bar{x} = \frac{\Sigma x_i}{n} = \frac{3463}{25} = 138.52$
Median = 129 (13th value)
Mode = 0 (2 times)
b.
It appears that this group of young adults eats out much more than the average American. The mean and median are much higher than the average of $65.88 reported in the newspaper.
c.
Q1 = 95 (7th value) Q3 = 169 (19th value)
d.
Min = 0
Max = 467
Range = 467 - 0 = 467
IQR = Q3 - Q1 = 169 - 95 = 74
e.
$s^2 = 9271.01$
$s = 96.29$
f.
The z-score for the largest value is:
$z = \frac{467 - 138.52}{96.29} = 3.41$
It is the only outlier and should be checked for accuracy.
59. a.
$\Sigma x_i = 760$
$\bar{x} = \frac{\Sigma x_i}{n} = \frac{760}{20} = 38$
Median is average of 10th and 11th items.
Median = $\frac{36 + 36}{2} = 36$
The modal cash retainer is 40; it appears 4 times.
b.
For Q1, $i = \left(\frac{25}{100}\right)20 = 5$
Since i is integer, Q1 = $\frac{28 + 30}{2} = 29$
For Q3, $i = \left(\frac{75}{100}\right)20 = 15$
Since i is integer, Q3 = $\frac{40 + 50}{2} = 45$
c.
Range = 64 – 15 = 49
Interquartile range = 45 – 29 = 16
d.
$s^2 = \frac{\Sigma (x_i - \bar{x})^2}{n - 1} = \frac{3318}{20 - 1} = 174.6316$
$s = \sqrt{s^2} = \sqrt{174.6316} = 13.2148$
e.
Coefficient of variation = $\left(\frac{s}{\bar{x}}\right)100 = \left(\frac{13.2148}{38}\right)100 = 34.8$
60. a.
$\bar{x} = \frac{\Sigma x_i}{n} = \frac{260}{14} = 18.57$
Median = 16.5 (Average of 7th and 8th values)
b.
$s^2 = 53.49$
$s = 7.31$
c.
Quantex has the best record: 11 Days
d.
$z = \frac{27 - 18.57}{7.31} = 1.15$
Packard-Bell is 1.15 standard deviations slower than the mean.
e.
$z = \frac{12 - 18.57}{7.31} = -0.90$
IBM is 0.9 standard deviations faster than the mean.
f.
Check Toshiba:
$z = \frac{37 - 18.57}{7.31} = 2.52$
On the basis of z-scores, Toshiba is not an outlier, but it is 2.52 standard deviations slower than the mean.
61.
Sample mean = 7195.5 Median = 7019 (average of positions 5 and 6) Sample variance = 7,165,941 Sample standard deviation = 2676.93
62. a.
The sample mean is 83.135 and the sample standard deviation is 16.173.
b.
With z = 2, Chebyshev’s theorem gives:
$1 - \frac{1}{z^2} = 1 - \frac{1}{2^2} = 1 - \frac{1}{4} = \frac{3}{4}$
Therefore, at least 75% of household incomes are within 2 standard deviations of the mean. Using the sample mean and sample standard deviation computed in part (a), the range within which at least 75% of household incomes must fall is 83.135 ± 2(16.173) = 83.135 ± 32.346; thus, at least 75% of household incomes must fall between 50.789 and 115.481, or $50,789 to $115,481.
c.
With z = 2, the empirical rule suggests that 95% of household incomes fall between $50,789 and $115,481. For the same range, the probability obtained using the empirical rule is greater than the probability obtained using Chebyshev’s theorem.
d.
The z-score for Danbury, CT is 3.04; thus, the Danbury, CT observation is an outlier.
63. a.
Public Transportation: $\bar{x} = \frac{320}{10} = 32$
Automobile: $\bar{x} = \frac{320}{10} = 32$
b.
Public Transportation: s = 4.64
Automobile: s = 1.83
c.
Prefer the automobile. The mean times are the same, but the auto has less variability.
d.
Data in ascending order:
Public: 25 28 29 29 32 32 33 34 37 41
Auto: 29 30 31 31 32 32 33 33 34 35
Five number Summaries
Public: 25 29 32 34 41
Auto: 29 31 32 33 35
Box Plots:
[Box plot, Public: axis from 24 to 40]
[Box plot, Auto: axis from 24 to 40]
The box plots do show lower variability with automobile transportation and support the conclusion in part c.
64. a.
The sample covariance is 502.67. Because the sample covariance is positive, there is a positive linear relationship between income and home price.
b.
The sample correlation coefficient is .933; this indicates a strong linear relationship between income and home price.
65. a.
Let x = media expenditures ($ millions) and y = shipments in barrels (millions)
$\Sigma x_i = 404.1$, $\bar{x} = \frac{404.1}{10} = 40.41$
$\Sigma y_i = 119.9$, $\bar{y} = \frac{119.9}{10} = 11.99$
$\Sigma (x_i - \bar{x})(y_i - \bar{y}) = 3763.481$
$\Sigma (x_i - \bar{x})^2 = 19,248.469$
$\Sigma (y_i - \bar{y})^2 = 939.349$
$s_{xy} = \frac{\Sigma (x_i - \bar{x})(y_i - \bar{y})}{n - 1} = \frac{3763.481}{10 - 1} = 418.1646$
A positive relationship
b.
$s_x = \sqrt{\frac{\Sigma (x_i - \bar{x})^2}{n - 1}} = \sqrt{\frac{19,248.469}{10 - 1}} = 46.2463$
$s_y = \sqrt{\frac{\Sigma (y_i - \bar{y})^2}{n - 1}} = \sqrt{\frac{939.349}{10 - 1}} = 10.2163$
$r_{xy} = \frac{s_{xy}}{s_x s_y} = \frac{418.1646}{(46.2463)(10.2163)} = 0.885$
Note: The same value can also be obtained using Excel's CORREL function.
66. a.
The scatter diagram indicates a positive relationship
b.
$\Sigma x_i = 798$, $\Sigma y_i = 11,688$, $\Sigma x_i y_i = 1,058,019$
$\Sigma x_i^2 = 71,306$, $\Sigma y_i^2 = 16,058,736$
$r_{xy} = \frac{\Sigma x_i y_i - (\Sigma x_i \Sigma y_i)/n}{\sqrt{\Sigma x_i^2 - (\Sigma x_i)^2/n}\,\sqrt{\Sigma y_i^2 - (\Sigma y_i)^2/n}} = \frac{1,058,019 - (798)(11,688)/9}{\sqrt{71,306 - (798)^2/9}\,\sqrt{16,058,736 - (11,688)^2/9}} = .9856$
Strong positive relationship
67. a.
The scatter diagram is shown below:
[Scatter diagram: Earnings on the vertical axis, Book Value on the horizontal axis]
b.
The sample correlation coefficient is .75; this indicates a linear relationship between book value and earnings.
68. a.
(800 + 750 + 900)/3 = 817
b.
Month     January   February   March
Weight       1          2        3
$\bar{x} = \frac{\Sigma w_i x_i}{\Sigma w_i} = \frac{1(800) + 2(750) + 3(900)}{1 + 2 + 3} = \frac{5000}{6} = 833$
69.
$\bar{x} = \frac{\Sigma w_i x_i}{\Sigma w_i} = \frac{20(20) + 30(12) + 10(7) + 15(5) + 10(6)}{20 + 30 + 10 + 15 + 10} = \frac{965}{85} = 11.4$ days
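The weighted means in problems 52, 54, 68 and 69 all use the same formula, the sum of weight times value divided by the sum of the weights. A small sketch; the function name `weighted_mean` is an illustrative assumption.

```python
def weighted_mean(values, weights):
    """Weighted mean: sum(w_i * x_i) / sum(w_i)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)


# Problem 68 b: monthly values weighted 1, 2, 3
print(round(weighted_mean([800, 750, 900], [1, 2, 3])))
# Problem 69: days weighted by the counts 20, 30, 10, 15, 10
print(round(weighted_mean([20, 12, 7, 5, 6], [20, 30, 10, 15, 10]), 1))
```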
70. a.
  fi    Mi    fi Mi    Mi − x̄    (Mi − x̄)²    fi (Mi − x̄)²
  10    47      470    -13.68     187.1424       1871.42
  40    52     2080     -8.68      75.3424       3013.70
 150    57     8550     -3.68      13.5424       2031.36
 175    62    10850     +1.32       1.7424        304.92
  75    67     5025     +6.32      39.9424       2995.68
  15    72     1080    +11.32     128.1424       1922.14
  10    77      770    +16.32     266.3424       2663.42
 475         28,825                            14,802.64
$\bar{x} = \frac{28,825}{475} = 60.68$
b.
$s^2 = \frac{14,802.64}{474} = 31.23$
$s = \sqrt{31.23} = 5.59$
71.
  fi    Mi     fi Mi    Mi − x̄    (Mi − x̄)²    fi (Mi − x̄)²
   2   29.5     59.0      -22        484            968
   6   39.5    237.0      -12        144            864
   4   49.5    198.0       -2          4             16
   4   59.5    238.0        8         64            256
   2   69.5    139.0       18        324            648
   2   79.5    159.0       28        784           1568
  20         1,030.0                               4320
$\bar{x} = \frac{1030}{20} = 51.5$
$s^2 = \frac{4320}{19} = 227.37$
$s = 15.08$
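The grouped-data calculations in problems 53 through 57, 70 and 71 repeat the same two steps: a frequency-weighted mean of the class midpoints, then a frequency-weighted sum of squared deviations divided by n - 1. A minimal sketch using the figures from problem 71; the function name is an illustrative assumption.

```python
import math


def grouped_mean_variance(midpoints, frequencies):
    """Mean and sample variance for grouped data (class midpoints and frequencies)."""
    n = sum(frequencies)
    mean = sum(f * m for m, f in zip(midpoints, frequencies)) / n
    variance = sum(f * (m - mean) ** 2
                   for m, f in zip(midpoints, frequencies)) / (n - 1)
    return mean, variance


# Class midpoints and frequencies from problem 71
midpoints = [29.5, 39.5, 49.5, 59.5, 69.5, 79.5]
frequencies = [2, 6, 4, 4, 2, 2]
mean, variance = grouped_mean_variance(midpoints, frequencies)
print(mean, round(variance, 2), round(math.sqrt(variance), 2))
```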
Chapter 4 Introduction to Probability Learning Objectives
1.
Obtain an appreciation of the role probability information plays in the decision making process.
2.
Understand probability as a numerical measure of the likelihood of occurrence.
3.
Know the three methods commonly used for assigning probabilities and understand when they should be used.
4.
Know how to use the laws that are available for computing the probabilities of events.
5.
Understand how new information can be used to revise initial (prior) probability estimates using Bayes’ theorem.
Solutions:
1.
Number of experimental Outcomes = (3) (2) (4) = 24
2.
$\binom{6}{3} = \frac{6!}{3!\,3!} = \frac{6 \cdot 5 \cdot 4 \cdot 3 \cdot 2 \cdot 1}{(3 \cdot 2 \cdot 1)(3 \cdot 2 \cdot 1)} = 20$
ABC ABD ABE ABF ACD
ACE ACF ADE ADF AEF
BCD BCE BCF BDE BDF
BEF CDE CDF CEF DEF
3.
$P_3^6 = \frac{6!}{(6 - 3)!} = (6)(5)(4) = 120$
BDF BFD DBF DFB FBD FDB
4.
a.
[Tree diagram: 1st Toss, 2nd Toss, 3rd Toss, each branching into H and T and ending in the outcomes (H,H,H), (H,H,T), (H,T,H), (H,T,T), (T,H,H), (T,H,T), (T,T,H), (T,T,T)]
b.
Let: H be head and T be tail
(H,H,H) (T,H,H)
(H,H,T) (T,H,T)
(H,T,H) (T,T,H)
(H,T,T) (T,T,T)
c.
The outcomes are equally likely, so the probability of each outcome is 1/8.
5.
P(Ei) = 1 / 5 for i = 1, 2, 3, 4, 5
P(Ei) ≥ 0 for i = 1, 2, 3, 4, 5
P(E1) + P(E2) + P(E3) + P(E4) + P(E5) = 1 / 5 + 1 / 5 + 1 / 5 + 1 / 5 + 1 / 5 = 1
The classical method was used.
6.
P(E1) = .40, P(E2) = .26, P(E3) = .34 The relative frequency method was used.
7.
No. Requirement (4.3) is not satisfied; the probabilities do not sum to 1.
P(E1) + P(E2) + P(E3) + P(E4) = .10 + .15 + .40 + .20 = .85
8. a.
There are four outcomes possible for this 2-step experiment: planning commission positive - council approves; planning commission positive - council disapproves; planning commission negative - council approves; planning commission negative - council disapproves.
b.
Let p = positive, n = negative, a = approves, and d = disapproves
[Tree diagram: Planning Commission (p or n), then Council (a or d), ending in the outcomes (p, a), (p, d), (n, a), (n, d)]
9.
$\binom{50}{4} = \frac{50!}{4!\,46!} = \frac{50 \cdot 49 \cdot 48 \cdot 47}{4 \cdot 3 \cdot 2 \cdot 1} = 230,300$
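The counting rules used in problems 2, 3, 9 and 12 can be checked with Python's standard library. This sketch assumes Python 3.8 or later, where `math.comb` and `math.perm` are available.

```python
import math

# Combinations: number of ways to choose k items from n when order does not matter
print(math.comb(6, 3))    # problem 2
print(math.comb(50, 4))   # problem 9
print(math.comb(49, 5))   # problem 12 a

# Permutations: number of ordered selections of k items from n
print(math.perm(6, 3))    # problem 3

# Problem 12 c: multiply the part (a) count by 42
lottery_choices = math.comb(49, 5) * 42
print(lottery_choices, 1 / lottery_choices)
```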
10. a.
Use the relative frequency approach: P(California) = 1,434/2,374 = .60
b.
Number not from 4 states = 2,374 - 1,434 - 390 - 217 - 112 = 221 P(Not from 4 States) = 221/2,374 = .09
c.
P(Not in Early Stages) = 1 - .22 = .78
d.
Estimate of number of Massachusetts companies in early stage of development = (.22)390 ≈ 86
e.
If we assume the size of the awards did not differ by states, we can multiply the probability an award went to Colorado by the total venture funds disbursed to get an estimate. Estimate of Colorado funds = (112/2374)($32.4) = $1.53 billion Authors' Note: The actual amount going to Colorado was $1.74 billion.
11. a.
No, the probabilities do not sum to one. They sum to .85.
b.
Owner must revise the probabilities so they sum to 1.00.
12. a.
Use the counting rule for combinations:
$\binom{49}{5} = \frac{49!}{5!\,44!} = \frac{(49)(48)(47)(46)(45)}{(5)(4)(3)(2)(1)} = 1,906,884$
b.
Very small: 1/1,906,884 = 0.0000005
c.
Multiply the answer to part (a) by 42 to get the number of choices for the six numbers. No. of Choices = (1,906,884)(42) = 80,089,128 Probability of Winning = 1/80,089,128 = 0.0000000125
13.
Initially a probability of .20 would be assigned if selection is equally likely. Data does not appear to confirm the belief of equal consumer preference. For example using the relative frequency method we would assign a probability of 5 / 100 = .05 to the design 1 outcome, .15 to design 2, .30 to design 3, .40 to design 4, and .10 to design 5.
14. a.
P (E2) = 1 / 4
b.
P(any 2 outcomes) = 1 / 4 + 1 / 4 = 1 / 2
c.
P(any 3 outcomes) = 1 / 4 + 1 / 4 + 1 / 4 = 3 / 4
15. a.
S = {ace of clubs, ace of diamonds, ace of hearts, ace of spades}
b.
S = {2 of clubs, 3 of clubs, . . . , 10 of clubs, J of clubs, Q of clubs, K of clubs, A of clubs}
c.
There are 12; jack, queen, or king in each of the four suits.
d.
For a: 4 / 52 = 1 / 13 = .08 For b: 13 / 52 = 1 / 4 = .25 For c: 12 / 52 = .23
16. a.
(6) (6) = 36 sample points
b.
Total for Both
                      Die 2
Die 1     1     2     3     4     5     6
  1       2     3     4     5     6     7
  2       3     4     5     6     7     8
  3       4     5     6     7     8     9
  4       5     6     7     8     9    10
  5       6     7     8     9    10    11
  6       7     8     9    10    11    12
c.
6 / 36 = 1 / 6
d.
10 / 36 = 5 / 18
e.
No. P(odd) = 18 / 36 = P(even) = 18 / 36 or 1 / 2 for both.
f.
Classical. A probability of 1 / 36 is assigned to each experimental outcome.
17. a.
(4, 6), (4, 7), (4 , 8)
b.
.05 + .10 + .15 = .30
c.
(2, 8), (3, 8), (4, 8)
d.
.05 + .05 + .15 = .25
e.
.15
18. a.
0; probability is .05
b.
4, 5; probability is .10 + .10 = .20
c.
0, 1, 2; probability is .05 + .15 + .35 = .55
19. a. b.
Yes, the probabilities are all greater than or equal to zero and they sum to one. P(A) = P(0) + P(1) + P(2) = .08 + .18 + .32 = .58
c.
P(B) = P(4) = .12
20. a.
P(N) = 56/500 = .112
b.
P(T) = 43/500 = .086
c.
Total in 6 states = 56 + 53 + 43 + 37 + 28 + 28 = 245 P(B) = 245/500 = .49 Almost half the Fortune 500 companies are headquartered in these states.
21. a.
P(A) = P(1) + P(2) + P(3) + P(4) + P(5) = $\frac{20}{50} + \frac{12}{50} + \frac{6}{50} + \frac{3}{50} + \frac{1}{50}$ = .40 + .24 + .12 + .06 + .02 = .84
b.
P(B) = P(3) + P(4) + P(5) = .12 + .06 + .02 = .20
c.
P(2) = 12 / 50 = .24
22. a.
P(A) = .40, P(B) = .40, P(C) = .60
b.
P(A ∪ B) = P(E1, E2, E3, E4) = .80. Yes P(A ∪ B) = P(A) + P(B).
c.
Ac = {E3, E4, E5} Cc = {E1, E4} P(Ac) = .60 P(Cc) = .40
d.
A ∪ Bc = {E1, E2, E5} P(A ∪ Bc) = .60
e.
P(B ∪ C) = P(E2, E3, E4, E5) = .80
23. a.
P(A) = P(E1) + P(E4) + P(E6) = .05 + .25 + .10 = .40 P(B) = P(E2) + P(E4) + P(E7) = .20 + .25 + .05 = .50 P(C) = P(E2) + P(E3) + P(E5) + P(E7) = .20 + .20 + .15 + .05 = .60
b.
A ∪ B = {E1, E2, E4, E6, E7} P(A ∪ B) = P(E1) + P(E2) + P(E4) + P(E6) + P(E7) = .05 + .20 + .25 + .10 + .05 = .65
c.
A ∩ B = {E4}
P(A ∩ B) = P(E4) = .25
d.
Yes, they are mutually exclusive.
e.
Bc = {E1, E3, E5, E6}; P(Bc) = P(E1) + P(E3) + P(E5) + P(E6) = .05 + .20 + .15 + .10 = .50
24.
Let E = experience exceeded expectations M = experience met expectations a.
Percentage of respondents that said their experience exceeded expectations = 100 - (4 + 26 + 65) = 5% P(E) = .05
b.
P(M ∪ E) = P(M) + P(E) = .65 + .05 = .70
25.
Let Y = high one-year return M = high five-year return a.
P(Y) = 15/30 = .50 P(M) = 12/30 = .40 P(Y ∩ M) = 6/30 = .20
b.
P(Y ∪ M) = P(Y) + P(M) - P(Y ∩ M) = .50 + .40 - .20 = .70
c.
1 - P(Y ∪ M) = 1 - .70 = .30
26.
Let Y = high one-year return M = high five-year return a.
P(Y) = 9/30 = .30 P(M) = 7/30 = .23
b.
P(Y ∩ M) = 5/30 = .17
c.
P(Y ∪ M) = .30 + .23 - .17 = .36 P(Neither) = 1 - .36 = .64
27. Big Ten   Pac-10
            Yes       No
Yes         849     2112      2,961
No         3645     6823     10,468
           4494     8935     13,429
a.
P(Neither) = $\frac{6823}{13,429} = .51$
b.
P(Either) = $\frac{2961}{13,429} + \frac{4494}{13,429} - \frac{849}{13,429} = .49$
c.
P(Both) = $\frac{849}{13,429} = .06$
28.
Let: B = rented a car for business reasons P = rented a car for personal reasons a.
P(B ∪ P) = P(B) + P(P) - P(B ∩ P) = .54 + .458 - .30 = .698
b.
P(Neither) = 1 - .698 = .302
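The addition law used in problems 25 through 28 subtracts the joint probability so that outcomes in both events are not counted twice. A brief sketch with the figures from problem 28; the function name `prob_union` is an illustrative assumption.

```python
def prob_union(p_a, p_b, p_both):
    """Addition law: P(A or B) = P(A) + P(B) - P(A and B)."""
    return p_a + p_b - p_both


# Problem 28: P(B) = .54, P(P) = .458, P(B and P) = .30
p_either = prob_union(0.54, 0.458, 0.30)
p_neither = 1 - p_either
print(round(p_either, 3), round(p_neither, 3))
```

When two events are mutually exclusive the joint probability is zero, and the law reduces to P(A) + P(B), as in problem 24 b.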
29. a.
P(E) = $\frac{1033}{2851} = .36$
P(R) = $\frac{854}{2851} = .30$
P(D) = $\frac{964}{2851} = .34$
b.
Yes; P(E ∩ D) = 0
c.
Probability = $\frac{1033}{2375} = .43$
d.
Yes
e.
P(E ∪ A) = P(E) + P(A) = .36 + .18 = .54
30. a.
$P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{.40}{.60} = .6667$
b.
$P(B \mid A) = \frac{P(A \cap B)}{P(A)} = \frac{.40}{.50} = .80$
c.
No because P(A | B) ≠ P(A)
31. a.
P(A ∩ B) = 0
b.
$P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{0}{.4} = 0$
c.
No. P(A | B) ≠ P(A); ∴ the events, although mutually exclusive, are not independent.
d.
Mutually exclusive events are dependent.
32. a.
               Single    Married    Total
Under 30         .55       .10       .65
30 or over       .20       .15       .35
Total            .75       .25      1.00
b.
65% of the customers are under 30.
c.
The majority of customers are single: P(single) = .75.
d.
.55
e.
Let: A = event under 30
B = event single
$P(B \mid A) = \frac{P(A \cap B)}{P(A)} = \frac{.55}{.65} = .8462$
f.
P(A ∩ B) = .55
P(A)P(B) = (.65)(.75) = .49
Since P(A ∩ B) ≠ P(A)P(B), they cannot be independent events; or, since P(B | A) ≠ P(B), they cannot be independent.
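The steps in problem 32 (marginal probabilities from the joint table, a conditional probability, then the independence check) can be sketched as follows. The dictionary layout and the helper name `marginal` are illustrative assumptions; the probabilities are the ones in the table for part a.

```python
# Joint probability table from problem 32 a, keyed by (age group, marital status)
joint = {
    ("under_30", "single"): 0.55,
    ("under_30", "married"): 0.10,
    ("30_or_over", "single"): 0.20,
    ("30_or_over", "married"): 0.15,
}


def marginal(table, index, label):
    """Sum the joint probabilities whose key has `label` at position `index`."""
    return sum(p for key, p in table.items() if key[index] == label)


p_a = marginal(joint, 0, "under_30")    # P(A): under 30
p_b = marginal(joint, 1, "single")      # P(B): single
p_ab = joint[("under_30", "single")]    # P(A and B)
p_b_given_a = p_ab / p_a                # P(B | A)

# Independence requires P(A and B) = P(A)P(B)
independent = abs(p_ab - p_a * p_b) < 1e-9
print(round(p_a, 2), round(p_b, 2), round(p_b_given_a, 4), independent)
```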
33. a.
                         Reason for Applying
               Quality    Cost/Convenience    Other    Total
Full Time      .218       .204                .039     .461
Part Time      .208       .307                .024     .539
Total          .426       .511                .063     1.00
b.
It is most likely a student will cite cost or convenience as the first reason - probability = .511. School quality is the first reason cited by the second largest number of students - probability = .426.
c.
P(Quality | full time) = .218 / .461 = .473
d.
P(Quality | part time) = .208 / .539 = .386
e.
For independence, we must have P(A)P(B) = P(A ∩ B). From the table, P(A ∩ B) = .218, P(A) = .461, P(B) = .426 P(A)P(B) = (.461)(.426) = .196 Since P(A)P(B) ≠ P(A ∩ B), the events are not independent.
34. a.
P(O) = 0.38 + 0.06 = 0.44
b.
P(Rh-) = 0.06 + 0.02 + 0.01 + 0.06 = 0.15
c.
P(both Rh-) = P(Rh-) P(Rh-) = (0.15)(0.15) = 0.0225
d.
P(both AB) = P(AB) P(AB) = (0.05)(0.05) = 0.0025
e.
P(Rh- | O) = P(Rh- ∩ O) / P(O) = 0.06 / 0.44 = 0.136
f.
P(Rh+) = 1 - P(Rh-) = 1 - 0.15 = 0.85
P(B | Rh+) = P(B ∩ Rh+) / P(Rh+) = 0.09 / 0.85 = 0.106
35. a.
P(Up for January) = 31 / 48 = 0.646
b.
P(Up for Year) = 36 / 48 = 0.75
c.
P(Up for Year ∩ Up for January) = 29 / 48 = 0.604 P(Up for Year | Up for January) = 0.604 / 0.646 = 0.935
d.
They are not independent since P(Up for Year) ≠ P(Up for Year | Up for January) 0.75 ≠ 0.935
36. a.
                              Satisfaction Score
Occupation            Under 50   50-59   60-69   70-79   80-89   Total
Cabinetmaker          .000       .050    .100    .075    .025    .250
Lawyer                .150       .050    .025    .025    .000    .250
Physical Therapist    .000       .125    .050    .025    .050    .250
Systems Analyst       .050       .025    .100    .075    .000    .250
Total                 .200       .250    .275    .200    .075    1.000
b.
P(80s) = .075 (a marginal probability)
c.
P(80s | PT) = .050/.250 = .20 (a conditional probability)
d.
P(L) = .250 (a marginal probability)
e.
P(L ∩ Under 50) = .150 (a joint probability)
f.
P(Under 50 | L) = .150/.250 = .60 (a conditional probability)
g.
P(70 or higher) = .275 (Sum of marginal probabilities)
37. a.
P(A ∩ B) = P(A)P(B) = (.55)(.35) = .19
b.
P(A ∪ B) = P(A) + P(B) - P(A ∩ B) = .55 + .35 - .19 = .71
c.
P(shutdown) = 1 - P(A ∪ B) = 1 - .71 = .29
38. a.
P(Telephone) = 52/190 ≈ 0.2737
b.
This is an intersection of two events. It seems reasonable to assume the next two messages will be independent; we use the multiplication rule for independent events.
P(E-mail ∩ Fax) = P(E-mail) P(Fax) = (30/190)(15/190) ≈ 0.0125
c.
This is a union of two mutually exclusive events.
P(Telephone ∪ Interoffice Mail) = P(Telephone) + P(Interoffice Mail) = 52/190 + 18/190 = 70/190 ≈ 0.3684
39. a.
Yes, since P(A1 ∩ A2) = 0
b.
P(A1 ∩ B) = P(A1)P(B | A1) = .40(.20) = .08
P(A2 ∩ B) = P(A2)P(B | A2) = .60(.05) = .03
c.
P(B) = P(A1 ∩ B) + P(A2 ∩ B) = .08 + .03 = .11
d.
P(A1 | B) = .08 / .11 = .7273
P(A2 | B) = .03 / .11 = .2727
40. a.
P(B ∩ A1) = P(A1)P(B | A1) = (.20) (.50) = .10
P(B ∩ A2) = P(A2)P(B | A2) = (.50) (.40) = .20
P(B ∩ A3) = P(A3)P(B | A3) = (.30) (.30) = .09
b.
P(A2 | B) = .20 / (.10 + .20 + .09) = .51
c.
Events   P(Ai)   P(B | Ai)   P(Ai ∩ B)   P(Ai | B)
A1       .20     .50         .10         .26
A2       .50     .40         .20         .51
A3       .30     .30         .09         .23
         1.00                .39         1.00
41.
S1 = successful, S2 = not successful and B = request received for additional information.
a.
P(S1) = .50
b.
P(B | S1) = .75
c.
P(S1 | B) = (.50)(.75) / [(.50)(.75) + (.50)(.40)] = .375 / .575 = .65
42.
M = missed payment
D1 = customer defaults
D2 = customer does not default
P(D1) = .05   P(D2) = .95   P(M | D2) = .2   P(M | D1) = 1
a.
P(D1 | M) = P(D1)P(M | D1) / [P(D1)P(M | D1) + P(D2)P(M | D2)] = (.05)(1) / [(.05)(1) + (.95)(.2)] = .05 / .24 = .21
b.
Yes, the probability of default is greater than .20.
43.
Let: S = small car
Sc = other type of vehicle
F = accident leads to fatality for vehicle occupant
We have P(S) = .18, so P(Sc) = .82. Also P(F | S) = .128 and P(F | Sc) = .05. Using the tabular form of Bayes Theorem provides:
Events   Prior Probabilities   Conditional Probabilities   Joint Probabilities   Posterior Probabilities
S        .18                   .128                        .023                  .36
Sc       .82                   .050                        .041                  .64
         1.00                                              .064                  1.00
From the posterior probability column, we have P(S | F) = .36. So, if an accident leads to a fatality, the probability a small car was involved is .36.
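The tabular form of Bayes' theorem used in Problem 43 amounts to three steps: multiply each prior by its conditional probability, sum the joint probabilities, and divide each joint probability by that sum. A minimal Python sketch follows; the function name `bayes_table` is a hypothetical helper, not something from the text.

```python
def bayes_table(priors, conditionals):
    """Return (joint probabilities, their sum, posterior probabilities)."""
    joints = [p * c for p, c in zip(priors, conditionals)]
    total = sum(joints)                      # P(F), the evidence
    posteriors = [j / total for j in joints]
    return joints, total, posteriors


# Problem 43: priors P(S), P(Sc) and conditionals P(F | S), P(F | Sc).
joints, p_f, posteriors = bayes_table([0.18, 0.82], [0.128, 0.050])

for name, joint, post in zip(["S", "Sc"], joints, posteriors):
    print(name, round(joint, 3), round(post, 2))
print("P(F) =", round(p_f, 3))
```

The same helper applies to any of the Bayes problems in this chapter by passing a different list of priors and conditional probabilities.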
44.
Let
A1 = Story about Basketball Team
A2 = Story about Hockey Team
W = "We Win" headline
P(A1) = .60   P(W | A1) = .641
P(A2) = .40   P(W | A2) = .462
Ai    P(Ai)   P(W | Ai)   P(W ∩ Ai)   P(Ai | W)
A1    .60     .641        .3846       .3846/.5694 = .6754
A2    .40     .462        .1848       .1848/.5694 = .3246
                          .5694                     1.0000
The probability the story is about the basketball team is .6754.
45. a.
Let S = person is age 65 or older
P(S) = 34,991,753 / 281,421,906 = .12
b.
Let D = takes prescription drugs regularly
P(D) = P(D ∩ S) + P(D ∩ Sc) = P(D | S)P(S) + P(D | Sc)P(Sc) = .82(.12) + .49(.88) = .53
c.
Let D5 = takes 5 or more prescriptions
P(D5 ∩ S) = P(D5 | S)P(S) = .40(.12) = .048
d.
P(S | D5) = P(S ∩ D5) / P(D5)
P(D5) = P(S ∩ D5) + P(Sc ∩ D5) = P(D5 | S)P(S) + P(D5 | Sc)P(Sc) = .40(.12) + (.28)(.88) = .048 + .246 = .294
P(S | D5) = .048 / .294 = .16
46. a.
P(Excellent) = .18
P(Pretty Good) = .50
P(Pretty Good ∪ Excellent) = .18 + .50 = .68
Note: Events are mutually exclusive since a person may only choose one rating.
b.
1035 (.05) = 51.75 We estimate 52 respondents rated US companies poor.
c.
1035 (.01) = 10.35 We estimate 10 respondents did not know or did not answer.
47. a.
(2) (2) = 4
b.
Let s = successful
u = unsuccessful
Oil    Bonds    Sample Point
s      s        E1
s      u        E2
u      s        E3
u      u        E4
c.
O = {E1, E2}
d.
M = {E1, E3}
O ∪ M = {E1, E2, E3}
e.
O ∩ M = {E1}
f.
No; since O ∩ M has a sample point.
48. a.
P(satisfied) = 0.61
b.
The 18 - 34 year old group (64% satisfied) and the 65 and over group (70% satisfied).
c.
P(not satisfied) = 0.26 + 0.04 = 0.30
49.
Let
I = treatment-caused injury D = death from injury N = injury caused by negligence M = malpractice claim filed $ = payment made in claim
We are given P(I) = 0.04, P(N | I) = 0.25, P(D | I) = 1/7, P(M | N) = 1/7.5 = 0.1333, and P($ | M) = 0.50
a.
P(N) = P(N | I) P(I) + P(N | Ic) P(Ic) = (0.25)(0.04) + (0)(0.96) = 0.01
b.
P(D) = P(D | I) P(I) + P(D | Ic) P(Ic) = (1/7)(0.04) + (0)(0.96) = 0.006
c.
P(M) = P(M | N) P(N) + P(M | Nc) P(Nc) = (0.1333)(0.01) + (0)(0.99) = 0.001333
P($) = P($ | M) P(M) + P($ | Mc) P(Mc) = (0.5)(0.001333) + (0)(0.9987) = 0.00067
50. a.
Probability of the event = P(average) + P(above average) + P(excellent) = 11/50 + 14/50 + 13/50 = .22 + .28 + .26 = .76
b.
Probability of the event = P(poor) + P(below average) = 4/50 + 8/50 = .24
51. a.
P(leases 1) = 168 / 932 = 0.18
b.
P(2 or fewer) = 401 / 932 + 242 / 932 + 65 / 932 = 708 / 932 = 0.76
c.
P(3 or more) = 186 / 932 + 112 / 932 = 298 / 932 = 0.32
d.
P(no cars) = 19 / 932 = 0.02
52. a.
                 Yes      No       Total
23 and Under     .1026    .0996    .2022
24 - 26          .1482    .1878    .3360
27 - 30          .0917    .1328    .2245
31 - 35          .0327    .0956    .1283
36 and Over      .0253    .0837    .1090
Total            .4005    .5995    1.0000
b.
.2022
c.
.2245 + .1283 + .1090 = .4618
d.
.4005
53. a.
P(24 to 26 | Yes) = .1482 / .4005 = .3700
b.
P(Yes | 36 and over) = .0253 / .1090 = .2321
c.
.1026 + .1482 + .1878 + .0917 + .0327 + .0253 = .5883
d.
P(31 or more | No) = (.0956 + .0837) / .5995 = .2991
e.
No, because the conditional probabilities do not all equal the marginal probabilities. For instance, P(24 to 26 | Yes) = .3700 ≠ P(24 to 26) = .3360
54.
Let
I = important or very important M = male F = female
a.
P(I) = .49 (a marginal probability)
b.
P(I | M) = .22/.50 = .44 (a conditional probability)
c.
P(I | F) = .27/.50 = .54 (a conditional probability)
d.
The level of importance is not independent of gender: P(I) = .49 ≠ P(I | M) = .44 and P(I) = .49 ≠ P(I | F) = .54
e.
Since level of importance is dependent on gender, we conclude that male and female respondents have different attitudes toward risk.
55. a.
P(B | S) = P(B ∩ S) / P(S) = .12 / .40 = .30
We have P(B | S) > P(B). Yes, continue the ad since it increases the probability of a purchase.
b.
Estimate the company’s market share at 20%. Continuing the advertisement should increase the market share since P(B | S) = .30.
c.
P(B | S) = P(B ∩ S) / P(S) = .10 / .30 = .333
The second ad has a bigger effect.
56. a.
P(A) = 200/800 = .25
b.
P(B) = 100/800 = .125
c.
P(A ∩ B) = 10/800 = .0125
d.
P(A | B) = P(A ∩ B) / P(B) = .0125 / .125 = .10
e.
No, P(A | B) ≠ P(A) = .25
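The independence check in Problem 56 compares P(A | B) with P(A), which is equivalent to comparing P(A ∩ B) with P(A)P(B). The sketch below illustrates both forms using the counts from the solution; the helper name `is_independent` is hypothetical.

```python
from fractions import Fraction


def is_independent(p_a, p_b, p_ab):
    """Events are independent exactly when P(A and B) = P(A) * P(B)."""
    return p_ab == p_a * p_b


# Counts from the Problem 56 solution (out of 800).
p_a = Fraction(200, 800)
p_b = Fraction(100, 800)
p_ab = Fraction(10, 800)

p_a_given_b = p_ab / p_b    # conditional probability P(A | B)

print(float(p_a_given_b), float(p_a), is_independent(p_a, p_b, p_ab))
```

Exact fractions avoid a spurious "not equal" caused by floating-point rounding; with decimal inputs a small tolerance would be the safer comparison.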
57.
Let
A = lost time accident in current year B = lost time accident previous year
Given: P(B) = .06, P(A) = .05, P(A | B) = .15 a.
P(A ∩ B) = P(A | B)P(B) = .15(.06) = .009
b.
P(A ∪ B) = P(A) + P(B) - P(A ∩ B) = .06 + .05 - .009 = .101 or 10.1%
58.
Let: A = return is fraudulent
B = exceeds IRS standard for deductions
Given: P(A | B) = .20, P(A | Bc) = .02, P(B) = .08, find P(A).
Note P(Bc) = 1 - P(B) = .92
P(A) = P(A ∩ B) + P(A ∩ Bc) = P(B)P(A | B) + P(Bc)P(A | Bc) = (.08)(.20) + (.92)(.02) = .0344
We estimate 3.44% will be fraudulent.
59. a.
P(Oil) = .50 + .20 = .70
b.
Let S = Soil test results
Events                  P(Ai)   P(S | Ai)   P(Ai ∩ S)     P(Ai | S)
High Quality (A1)       .50     .20         .10           .31
Medium Quality (A2)     .20     .80         .16           .50
No Oil (A3)             .30     .20         .06           .19
                        1.00                P(S) = .32    1.00
P(Oil) = .81 which is good; however, probabilities now favor medium quality rather than high quality oil.
60. a.
A1 = field will produce oil
A2 = field will not produce oil
W = well produces oil
Events             P(Ai)    P(Wc | Ai)   P(Wc ∩ Ai)   P(Ai | Wc)
Oil in Field       .25      .20          .05          .0625
No Oil in Field    .75      1.00         .75          .9375
                   1.00                  .80          1.0000
The probability the field will produce oil given a well comes up dry is .0625.
b.
Events             P(Ai)    P(Wc | Ai)   P(Wc ∩ Ai)   P(Ai | Wc)
Oil in Field       .0625    .20          .0125        .0132
No Oil in Field    .9375    1.00         .9375        .9868
                   1.0000                .9500        1.0000
The probability the field will produce oil drops further to .0132.
c.
Suppose a third well comes up dry. The probabilities are revised as follows:
Events             P(Ai)    P(Wc | Ai)   P(Wc ∩ Ai)   P(Ai | Wc)
Oil in Field       .0132    .20          .0026        .0026
No Oil in Field    .9868    1.00         .9868        .9974
                   1.0000                .9894        1.0000
Stop drilling and abandon field if three consecutive wells come up dry.
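Problem 60 applies Bayes' theorem repeatedly, using each posterior probability as the prior for the next dry well. The loop below is a sketch of that updating, assuming P(dry | oil in field) = .20 and P(dry | no oil) = 1.00 as in the tables; the function name `update_after_dry_well` is hypothetical.

```python
def update_after_dry_well(p_oil, p_dry_given_oil=0.20, p_dry_given_no_oil=1.00):
    """Posterior probability of oil in the field after one more dry well."""
    joint_oil = p_oil * p_dry_given_oil
    joint_no_oil = (1 - p_oil) * p_dry_given_no_oil
    return joint_oil / (joint_oil + joint_no_oil)


p_oil = 0.25            # prior probability the field will produce oil
history = []
for _ in range(3):      # three consecutive dry wells
    p_oil = update_after_dry_well(p_oil)
    history.append(p_oil)

print([round(p, 4) for p in history])
```

Because this sketch carries full precision from one step to the next, the third value is about .0027, whereas the table in part (c) starts from the rounded prior .0132 and reports .0026.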
Chapter 5
Discrete Probability Distributions
Learning Objectives
1.
Understand the concepts of a random variable and a probability distribution.
2.
Be able to distinguish between discrete and continuous random variables.
3.
Be able to compute and interpret the expected value, variance, and standard deviation for a discrete random variable and understand how an Excel worksheet can be used to ease the burden of the calculations.
4.
Be able to compute probabilities using a binomial probability distribution and be able to compute these probabilities using Excel's BINOMDIST function.
5.
Be able to compute probabilities using a Poisson probability distribution and be able to compute these probabilities using Excel's POISSON function.
6.
Know when and how to use the hypergeometric probability distribution and be able to compute these probabilities using Excel's HYPGEOMDIST function.
Solutions:
1.
a.
Head, Head (H,H); Head, Tail (H,T); Tail, Head (T,H); Tail, Tail (T,T)
b.
x = number of heads on two coin tosses
c.
Outcome        (H,H)   (H,T)   (T,H)   (T,T)
Values of x    2       1       1       0
d.
Discrete. It may assume 3 values: 0, 1, and 2.
2.
a.
Let x = time (in minutes) to assemble the product.
b.
It may assume any positive value: x > 0.
c.
Continuous
3.
Let Y = position is offered
N = position is not offered
a.
S = {(Y,Y,Y), (Y,Y,N), (Y,N,Y), (Y,N,N), (N,Y,Y), (N,Y,N), (N,N,Y), (N,N,N)}
b.
Let N = number of offers made; N is a discrete random variable.
c.
Experimental Outcome   (Y,Y,Y)  (Y,Y,N)  (Y,N,Y)  (Y,N,N)  (N,Y,Y)  (N,Y,N)  (N,N,Y)  (N,N,N)
Value of N             3        2        2        1        2        1        1        0
4.
x = 0, 1, 2, . . ., 12.
5. a.
S = {(1,1), (1,2), (1,3), (2,1), (2,2), (2,3)}
b.
Experimental Outcome        (1,1)   (1,2)   (1,3)   (2,1)   (2,2)   (2,3)
Number of Steps Required    2       3       4       3       4       5
6. a.
values: 0,1,2,...,20; discrete
b.
values: 0,1,2,...; discrete
c.
values: 0,1,2,...,50; discrete
d.
values: 0 ≤ x ≤ 8; continuous
e.
values: x > 0; continuous
7. a.
f (x) ≥ 0 for all values of x.
Σ f (x) = 1
Therefore, it is a proper probability distribution.
b.
Probability x = 30 is f (30) = .25
c.
Probability x ≤ 25 is f (20) + f (25) = .20 + .15 = .35
d.
Probability x > 30 is f (35) = .40
8. a.
x        f (x)
1        3/20 = .15
2        5/20 = .25
3        8/20 = .40
4        4/20 = .20
Total    1.00
b.
[Graph: f (x) plotted against x = 1, 2, 3, 4; vertical axis from .1 to .4]
c.
f (x) ≥ 0 for x = 1,2,3,4.
Σ f (x) = 1
9.
a.
Age    Number of Children    f(x)
6      37,369                0.018
7      87,436                0.043
8      160,840               0.080
9      239,719               0.119
10     286,719               0.142
11     306,533               0.152
12     310,787               0.154
13     302,604               0.150
14     289,168               0.143
       2,021,175             1.001
b.
[Graph: f(x) plotted against age x = 6 through 14; vertical axis from .02 to .16]
c.
f(x) ≥ 0 for every x
Σ f(x) = 1
Note: Σ f(x) = 1.001 in part (a); difference from 1 is due to rounding values of f(x).
10. a.
x    f(x)
1    0.05
2    0.09
3    0.03
4    0.42
5    0.41
     1.00
b.
x    f(x)
1    0.04
2    0.10
3    0.12
4    0.46
5    0.28
     1.00
c.
P(4 or 5) = f (4) + f (5) = 0.42 + 0.41 = 0.83
d.
Probability of very satisfied: 0.28
e.
Senior executives appear to be more satisfied than middle managers. 83% of senior executives have a score of 4 or 5 with 41% reporting a 5. Only 28% of middle managers report being very satisfied.
11. a.
Duration of Call
x    f(x)
1    0.25
2    0.25
3    0.25
4    0.25
     1.00
b.
[Graph: f (x) plotted against x = 0 through 4; vertical axis from 0.10 to 0.30]
c.
f (x) ≥ 0 and f (1) + f (2) + f (3) + f (4) = 0.25 + 0.25 + 0.25 + 0.25 = 1.00
d.
f (3) = 0.25
e.
P(overtime) = f (3) + f (4) = 0.25 + 0.25 = 0.50
12. a.
Yes; f (x) ≥ 0 for all x and Σ f (x) = .15 + .20 + .30 + .25 + .10 = 1
b.
P(1200 or less) = f (1000) + f (1100) + f (1200) = .15 + .20 + .30 = .65
13. a.
Yes, since f (x) ≥ 0 for x = 1,2,3 and Σ f (x) = f (1) + f (2) + f (3) = 1/6 + 2/6 + 3/6 = 1
b.
f (2) = 2/6 = .333
c.
f (2) + f (3) = 2/6 + 3/6 = .833
14. a.
f (200) = 1 - f (-100) - f (0) - f (50) - f (100) - f (150) = 1 - .95 = .05 This is the probability MRA will have a $200,000 profit.
b.
P(Profit) = f (50) + f (100) + f (150) + f (200) = .30 + .25 + .10 + .05 = .70
c.
P(at least 100) = f (100) + f (150) + f (200) = .25 + .10 +.05 = .40
15. a.
x    f (x)   x f (x)
3    .25     .75
6    .50     3.00
9    .25     2.25
     1.00    6.00
E (x) = µ = 6.00
b.
x    x - µ   (x - µ)^2   f (x)   (x - µ)^2 f (x)
3    -3      9           .25     2.25
6    0       0           .50     0.00
9    3       9           .25     2.25
                                 4.50
Var (x) = σ^2 = 4.50
c.
$\sigma = \sqrt{4.50} = 2.12$
16. a.
y    f (y)   y f (y)
2    .20     .40
4    .30     1.20
7    .40     2.80
8    .10     .80
     1.00    5.20
E(y) = µ = 5.20
b.
y    y - µ    (y - µ)^2   f (y)   (y - µ)^2 f (y)
2    -3.20    10.24       .20     2.048
4    -1.20    1.44        .30     .432
7    1.80     3.24        .40     1.296
8    2.80     7.84        .10     .784
                                  4.560
Var ( y ) = 4.56
$\sigma = \sqrt{4.56} = 2.14$
17. a/b.
x    f (x)   x f (x)   x - µ    (x - µ)^2   (x - µ)^2 f (x)
0    .10     .00       -2.45    6.0025      .600250
1    .15     .15       -1.45    2.1025      .315375
2    .30     .60       - .45    .2025       .060750
3    .20     .60       .55      .3025       .060500
4    .15     .60       1.55     2.4025      .360375
5    .10     .50       2.55     6.5025      .650250
             2.45                           2.047500
E (x) = µ = 2.45
σ^2 = 2.0475
σ = 1.4309
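The worksheet-style calculations in Problems 15 through 17 follow the definitions E(x) = Σ x f(x) and Var(x) = Σ (x - µ)^2 f(x). The Python sketch below reproduces the Problem 17 figures; the function name `summarize` is a hypothetical helper.

```python
import math


def summarize(values, probs):
    """Return (expected value, variance, standard deviation) of a discrete distribution."""
    mean = sum(x * p for x, p in zip(values, probs))
    variance = sum((x - mean) ** 2 * p for x, p in zip(values, probs))
    return mean, variance, math.sqrt(variance)


# Problem 17 distribution.
x_values = [0, 1, 2, 3, 4, 5]
f_values = [0.10, 0.15, 0.30, 0.20, 0.15, 0.10]

mean, variance, std_dev = summarize(x_values, f_values)
print(round(mean, 2), round(variance, 4), round(std_dev, 4))
```

Passing the Problem 15 or Problem 16 values to the same function gives the corresponding results for those problems.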
18. a/b.
x    f (x)   x f (x)   x - µ   (x - µ)^2   (x - µ)^2 f (x)
0    .01     .00       -2.3    5.29        .0529
1    .23     .23       -1.3    1.69        .3887
2    .41     .82       -0.3    0.09        .0369
3    .20     .60       0.7     0.49        .0980
4    .10     .40       1.7     2.89        .2890
5    .05     .25       2.7     7.29        .3645
E (x) =      2.30      Var (x) = σ^2 =     1.2300
The expected value, E (x) = 2.3, of the probability distribution is the same as the average reported in the 1997 Statistical Abstract of the United States. The variance of the number of television sets per household is Var (x) = 1.23 television sets squared. The standard deviation is σ = 1.11 television sets.
19.
a.
E (x) = Σ x f (x) = 0 (.50) + 2 (.50) = 1.00
b.
E (x) = Σ x f (x) = 0 (.61) + 3 (.39) = 1.17
c.
The expected value of a 3 - point shot is higher. So, if these probabilities hold up, the team will make more points in the long run with the 3 - point shot.
20. a.
x       f (x)   x f (x)
0       .90     0.00
400     .04     16.00
1000    .03     30.00
2000    .01     20.00
4000    .01     40.00
6000    .01     60.00
        1.00    166.00
E (x) = 166. If the company charged a premium of $166.00 they would break even.
b.
Gain to Policy Holder   f (Gain)   (Gain) f (Gain)
-260.00                 .90        -234.00
140.00                  .04        5.60
740.00                  .03        22.20
1,740.00                .01        17.40
3,740.00                .01        37.40
5,740.00                .01        57.40
                                   -94.00
E (gain) = -94.00. The policy holder is more concerned that the big accident will break him than with the expected annual loss of $94.00.
21.
a.
E (x) = Σ x f (x) = 0.05(1) + 0.09(2) + 0.03(3) + 0.42(4) + 0.41(5) = 4.05
b.
E (x) = Σ x f (x) = 0.04(1) + 0.10(2) + 0.12(3) + 0.46(4) + 0.28(5) = 3.84
c.
Executives: σ^2 = Σ (x - µ)^2 f(x) = 1.2475
Middle Managers: σ^2 = Σ (x - µ)^2 f(x) = 1.1344
d.
Executives: σ = 1.1169
Middle Managers: σ = 1.0651
e.
The senior executives have a higher average score: 4.05 vs. 3.84 for the middle managers. The executives also have a slightly higher standard deviation.
22. a.
E (x) = Σ x f (x) = 300 (.20) + 400 (.30) + 500 (.35) + 600 (.15) = 445
The monthly order quantity should be 445 units.
b.
Cost: 445 @ $50 = $22,250
Revenue: 300 @ $70 = 21,000
$ 1,250 Loss
23. a.
Laptop: E (x) = .47(0) + .45(1) + .06(2) + .02(3) = .63
Desktop: E (x) = .06(0) + .56(1) + .28(2) + .10(3) = 1.42
b.
Laptop: Var (x) = .47(-.63)^2 + .45(.37)^2 + .06(1.37)^2 + .02(2.37)^2 = .4731
Desktop: Var (x) = .06(-1.42)^2 + .56(-.42)^2 + .28(.58)^2 + .10(1.58)^2 = .5636
c.
From the expected values in part (a), it is clear that the typical subscriber has more desktop computers than laptops. There is not much difference in the variances for the two types of computers.
24. a.
Medium: E (x) = Σ x f (x) = 50 (.20) + 150 (.50) + 200 (.30) = 145
Large: E (x) = Σ x f (x) = 0 (.20) + 100 (.50) + 300 (.30) = 140
Medium preferred.
b.
Medium
x      f (x)   x - µ   (x - µ)^2   (x - µ)^2 f (x)
50     .20     -95     9025        1805.0
150    .50     5       25          12.5
200    .30     55      3025        907.5
                       σ^2 =       2725.0
Large
y      f (y)   y - µ   (y - µ)^2   (y - µ)^2 f (y)
0      .20     -140    19600       3920
100    .50     -40     1600        800
300    .30     160     25600       7680
                       σ^2 =       12,400
Medium preferred due to less variance.
25.
a.
[Tree diagram: two trials, each resulting in S (success) or F (failure)]
b.
$f(1) = \binom{2}{1}(.4)^1(.6)^1 = \frac{2!}{1!\,1!}(.4)(.6) = .48$
c.
$f(0) = \binom{2}{0}(.4)^0(.6)^2 = \frac{2!}{0!\,2!}(1)(.36) = .36$
d.
$f(2) = \binom{2}{2}(.4)^2(.6)^0 = \frac{2!}{2!\,0!}(.16)(1) = .16$
e.
P (x ≥ 1) = f (1) + f (2) = .48 + .16 = .64
f.
E (x) = n p = 2 (.4) = .8
Var (x) = n p (1 - p) = 2 (.4) (.6) = .48
$\sigma = \sqrt{.48} = .6928$
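The binomial probabilities in Problem 25 can be reproduced directly from the binomial probability function. The sketch below uses Python's standard library rather than Excel's BINOMDIST; the function name `binomial_pmf` is a hypothetical helper.

```python
from math import comb, sqrt


def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


n, p = 2, 0.4                       # Problem 25
f = [binomial_pmf(x, n, p) for x in range(n + 1)]

mean = n * p
variance = n * p * (1 - p)
std_dev = sqrt(variance)

print([round(v, 2) for v in f], round(mean, 1), round(variance, 2), round(std_dev, 4))
```

With n = 10 and p = .1 the same function gives the values f(0) = .3487 and f(2) = .1937 used in Problem 26.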
26. a.
f (0) = .3487
b.
f (2) = .1937
c.
P(x ≤ 2) = f (0) + f (1) + f (2) = .3487 + .3874 + .1937 = .9298
d.
P(x ≥ 1) = 1 - f (0) = 1 - .3487 = .6513
e.
E (x) = n p = 10 (.1) = 1
f.
Var (x) = n p (1 - p) = 10 (.1) (.9) = .9
$\sigma = \sqrt{.9} = .9487$
27. a.
f (12) = .1144
b.
f (16) = .1304
c.
P (x ≥ 16) = f (16) + f (17) + f (18) + f (19) + f (20) = .1304 + .0716 + .0278 + .0068 + .0008 = .2374
d.
P (x ≤ 15) = 1 - P (x ≥ 16) = 1 - .2374 = .7626
e.
E (x) = n p = 20(.7) = 14
f.
Var (x) = n p (1 - p) = 20 (.7) (.3) = 4.2
$\sigma = \sqrt{4.2} = 2.0494$
28. a.
$f(2) = \binom{6}{2}(.33)^2(.67)^4 = .3292$
b.
P(at least 2) = 1 - f(0) - f(1)
$= 1 - \binom{6}{0}(.33)^0(.67)^6 - \binom{6}{1}(.33)^1(.67)^5$
= 1 - .0905 - .2673 = .6422
c.
$f(0) = \binom{10}{0}(.33)^0(.67)^{10} = .0182$
29.
P(At Least 5) = 1 - f (0) - f (1) - f (2) - f (3) - f (4) = 1 - .0000 - .0005 - .0031 - .0123 - .0350 = .9491
30. a.
Probability of a defective part being produced must be .03 for each trial; trials must be independent.
b.
Let: D = defective
G = not defective
1st part   2nd part   Experimental Outcome   Number Defective
D          D          (D, D)                 2
D          G          (D, G)                 1
G          D          (G, D)                 1
G          G          (G, G)                 0
c.
2 outcomes result in exactly one defect.
d.
P (no defects) = (.97) (.97) = .9409 P (1 defect) = 2 (.03) (.97) = .0582 P (2 defects) = (.03) (.03) = .0009
31.
Binomial n = 10 and p = .05
$f(x) = \frac{10!}{x!\,(10 - x)!}(.05)^x(.95)^{10 - x}$
a.
Yes. Since they are selected randomly, p is the same from trial to trial and the trials are independent.
b.
f (2) = .0746
c.
f (0) = .5987
d.
P (At least 1) = 1 - f (0) = 1 - .5987 = .4013
32. a.
.90
b.
P (at least 1) = f (1) + f (2)
$f(1) = \frac{2!}{1!\,1!}(.9)^1(.1)^1 = 2(.9)(.1) = .18$
$f(2) = \frac{2!}{2!\,0!}(.9)^2(.1)^0 = 1(.81)(1) = .81$
∴ P (at least 1) = .18 + .81 = .99
Alternatively
P (at least 1) = 1 - f (0)
$f(0) = \frac{2!}{0!\,2!}(.9)^0(.1)^2 = .01$
Therefore, P (at least 1) = 1 - .01 = .99
c.
P (at least 1) = 1 - f (0)
$f(0) = \frac{3!}{0!\,3!}(.9)^0(.1)^3 = .001$
Therefore, P (at least 1) = 1 - .001 = .999
d.
Yes; P (at least 1) becomes very close to 1 with multiple systems and the inability to detect an attack would be catastrophic.
33. a.
$f(12) = \frac{20!}{12!\,8!}(.5)^{12}(.5)^8$
Using Table 5 in Appendix 8, f(12) = .1201
b.
f(0) + f(1) + f(2) + f(3) + f(4) + f(5) = .0000 + .0000 + .0002 + .0011 + .0046 + .0148 = .0207
c.
E(x) = np = 20(.5) = 10
d.
Var (x) = σ^2 = np(1 - p) = 20(.5)(.5) = 5
$\sigma = \sqrt{5} = 2.24$
34. a.
f (3) = .0634
b.
The answer here is the same as part (a). The probability of 12 failures with p = .60 is the same as the probability of 3 successes with p = .40.
c.
f (3) + f (4) + · · · + f (15) = 1 - f (0) - f (1) - f (2) = 1 - .0005 - .0047 - .0219 = .9729
35. a.
f (0) + f (1) + f (2) = .0115 + .0576 + .1369 = .2060
b.
f (4) = .2182
c.
1 - [ f (0) + f (1) + f (2) + f (3) ] = 1 - .2060 - .2054 = .5886
d.
µ = n p = 20 (.20) = 4
36.
x    f (x)   x - µ   (x - µ)^2   (x - µ)^2 f (x)
0    .343    -.9     .81         .27783
1    .441    .1      .01         .00441
2    .189    1.1     1.21        .22869
3    .027    2.1     4.41        .11907
     1.000               σ^2 =   .63000
37.
E(x) = n p = 30(0.29) = 8.7
σ^2 = n p (1 - p) = 30(0.29)(0.71) = 6.177
$\sigma = \sqrt{6.177} = 2.485$
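Problems 38 through 45 evaluate the Poisson probability function f(x) = µ^x e^(-µ) / x! for various means. A minimal Python sketch of that function is shown here, using only the standard library instead of Excel's POISSON function; the name `poisson_pmf` is a hypothetical helper.

```python
from math import exp, factorial


def poisson_pmf(x, mu):
    """Probability of exactly x occurrences when the mean number is mu."""
    return mu ** x * exp(-mu) / factorial(x)


# A few of the values that appear in the Poisson problems.
print(round(poisson_pmf(2, 3), 4))     # mean 3, x = 2
print(round(poisson_pmf(6, 6), 4))     # mean 6, x = 6
print(round(poisson_pmf(10, 12), 4))   # mean 12, x = 10

# "At least 2" with mean 3 via the complement rule.
print(round(1 - poisson_pmf(0, 3) - poisson_pmf(1, 3), 4))
```

Results can differ from the manual in the fourth decimal place, because the manual rounds e^(-µ) to four places before multiplying while this sketch does not.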
38. a.
$f(x) = \frac{3^x e^{-3}}{x!}$
b.
$f(2) = \frac{3^2 e^{-3}}{2!} = \frac{9(.0498)}{2} = .2241$
c.
$f(1) = \frac{3^1 e^{-3}}{1!} = 3(.0498) = .1494$
d.
P (x ≥ 2) = 1 - f (0) - f (1) = 1 - .0498 - .1494 = .8008
39. a.
$f(x) = \frac{2^x e^{-2}}{x!}$
b.
µ = 6 for 3 time periods
c.
$f(x) = \frac{6^x e^{-6}}{x!}$
d.
$f(2) = \frac{2^2 e^{-2}}{2!} = \frac{4(.1353)}{2} = .2706$
e.
$f(6) = \frac{6^6 e^{-6}}{6!} = .1606$
f.
$f(5) = \frac{4^5 e^{-4}}{5!} = .1563$
40. a.
µ = 48 (5 / 60) = 4
$f(3) = \frac{4^3 e^{-4}}{3!} = \frac{(64)(.0183)}{6} = .1952$
b.
µ = 48 (15 / 60) = 12
$f(10) = \frac{12^{10} e^{-12}}{10!} = .1048$
c.
µ = 48 (5 / 60) = 4 I expect 4 callers to be waiting after 5 minutes.
$f(0) = \frac{4^0 e^{-4}}{0!} = .0183$
The probability none will be waiting after 5 minutes is .0183.
d.
µ = 48 (3 / 60) = 2.4
$f(0) = \frac{2.4^0 e^{-2.4}}{0!} = .0907$
The probability of no interruptions in 3 minutes is .0907.
41. a.
30 per hour
b.
µ = 1 (5/2) = 5/2
$f(3) = \frac{(5/2)^3 e^{-(5/2)}}{3!} = .2138$
c.
$f(0) = \frac{(5/2)^0 e^{-(5/2)}}{0!} = e^{-(5/2)} = .0821$
42. a.
$f(0) = \frac{7^0 e^{-7}}{0!} = e^{-7} = .0009$
b.
probability = 1 - [f(0) + f(1)]
$f(1) = \frac{7^1 e^{-7}}{1!} = 7e^{-7} = .0064$
probability = 1 - [.0009 + .0064] = .9927
c.
µ = 3.5
$f(0) = \frac{3.5^0 e^{-3.5}}{0!} = e^{-3.5} = .0302$
probability = 1 - f(0) = 1 - .0302 = .9698
d.
probability = 1 - [f(0) + f(1) + f(2) + f(3) + f(4)] = 1 - [.0009 + .0064 + .0223 + .0521 + .0912] = .8271
43. a.
$f(0) = \frac{10^0 e^{-10}}{0!} = e^{-10} = .000045$
b.
f (0) + f (1) + f (2) + f (3)
f (0) = .000045 (part a)
$f(1) = \frac{10^1 e^{-10}}{1!} = .00045$
Similarly, f (2) = .00225, f (3) = .0075
and f (0) + f (1) + f (2) + f (3) = .010245
c.
2.5 arrivals / 15 sec. period Use µ = 2.5
$f(0) = \frac{2.5^0 e^{-2.5}}{0!} = .0821$
d.
1 - f (0) = 1 - .0821 = .9179
44.
Poisson distribution applies
a.
µ = 1.25 per month
b.
$f(0) = \frac{1.25^0 e^{-1.25}}{0!} = 0.2865$
c.
$f(1) = \frac{1.25^1 e^{-1.25}}{1!} = 0.3581$
d.
P (More than 1) = 1 - f (0) - f (1) = 1 - 0.2865 - 0.3581 = 0.3554
45. a.
average per month = 18/12 = 1.5
b.
$f(0) = \frac{1.5^0 e^{-1.5}}{0!} = e^{-1.5} = .2231$
c.
probability = 1 - [f(0) + f(1)] = 1 - [.2231 + .3347] = .4422
46.
a.
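$f(1) = \frac{\binom{3}{1}\binom{10-3}{4-1}}{\binom{10}{4}} = \frac{(3)(35)}{210} = .50$
b.
$f(2) = \frac{\binom{3}{2}\binom{10-3}{2-2}}{\binom{10}{2}} = \frac{(3)(1)}{45} = .067$
c.
$f(0) = \frac{\binom{3}{0}\binom{10-3}{2-0}}{\binom{10}{2}} = \frac{(1)(21)}{45} = .4667$
d.
$f(2) = \frac{\binom{3}{2}\binom{10-3}{4-2}}{\binom{10}{4}} = \frac{(3)(21)}{210} = .30$
47.
$f(3) = \frac{\binom{4}{3}\binom{15-4}{10-3}}{\binom{15}{10}} = \frac{(4)(330)}{3003} = .4396$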
48.
Hypergeometric with N = 10 and r = 6
a.
$f(2) = \frac{\binom{6}{2}\binom{4}{1}}{\binom{10}{3}} = \frac{(15)(4)}{120} = .50$
b.
Must be 0 or 1 prefer Coke Classic.
$f(1) = \frac{\binom{6}{1}\binom{4}{2}}{\binom{10}{3}} = \frac{(6)(6)}{120} = .30$
$f(0) = \frac{\binom{6}{0}\binom{4}{3}}{\binom{10}{3}} = \frac{(1)(4)}{120} = .0333$
P (Majority Pepsi) = f (1) + f (0) = .3333
49.
Parts a, b & c involve the hypergeometric distribution with N = 52 and n = 2
a.
r = 20, x = 2
$f(2) = \frac{\binom{20}{2}\binom{32}{0}}{\binom{52}{2}} = \frac{(190)(1)}{1326} = .1433$
b.
r = 4, x = 2
$f(2) = \frac{\binom{4}{2}\binom{48}{0}}{\binom{52}{2}} = \frac{(6)(1)}{1326} = .0045$
c.
r = 16, x = 2
$f(2) = \frac{\binom{16}{2}\binom{36}{0}}{\binom{52}{2}} = \frac{(120)(1)}{1326} = .0905$
d.
Part (a) provides the probability of blackjack plus the probability of 2 aces plus the probability of two 10s. To find the probability of blackjack we subtract the probabilities in (b) and (c) from the probability in (a). P (blackjack) = .1433 - .0045 - .0905 = .0483
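The blackjack calculation in Problem 49 can be verified with the hypergeometric probability function f(x) = C(r, x) C(N - r, n - x) / C(N, n). The sketch below uses Python's standard library in place of Excel's HYPGEOMDIST; the function name `hypergeom_pmf` is a hypothetical helper.

```python
from math import comb


def hypergeom_pmf(x, N, r, n):
    """Probability of x successes in a sample of n drawn without
    replacement from N items of which r are successes."""
    return comb(r, x) * comb(N - r, n - x) / comb(N, n)


# Problem 49: two cards dealt from a 52-card deck.
p_ace_or_ten = hypergeom_pmf(2, 52, 20, 2)   # both cards are aces or 10-point cards
p_two_aces = hypergeom_pmf(2, 52, 4, 2)
p_two_tens = hypergeom_pmf(2, 52, 16, 2)

p_blackjack = p_ace_or_ten - p_two_aces - p_two_tens
print(round(p_blackjack, 4))
```

The subtraction mirrors the reasoning in part (d): removing the two-ace and two-ten hands from part (a) leaves exactly the hands with one ace and one 10-point card.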
50.
N = 60 n = 10
a.
r = 20 x = 0
$f(0) = \frac{\binom{20}{0}\binom{40}{10}}{\binom{60}{10}} = (1)\left(\frac{40!}{10!\,30!}\right)\left(\frac{10!\,50!}{60!}\right) = \frac{40 \cdot 39 \cdot 38 \cdot 37 \cdot 36 \cdot 35 \cdot 34 \cdot 33 \cdot 32 \cdot 31}{60 \cdot 59 \cdot 58 \cdot 57 \cdot 56 \cdot 55 \cdot 54 \cdot 53 \cdot 52 \cdot 51} \approx .01$
b.
r = 20 x = 1
$f(1) = \frac{\binom{20}{1}\binom{40}{9}}{\binom{60}{10}} = 20\left(\frac{40!}{9!\,31!}\right)\left(\frac{10!\,50!}{60!}\right) \approx .07$
c.
d.
Same as the probability one will be from Hawaii. In part b that was found to equal approximately .07.
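51. a.
$f(2) = \frac{\binom{11}{2}\binom{14}{3}}{\binom{25}{5}} = \frac{(55)(364)}{53{,}130} = .3768$
b.
$f(2) = \frac{\binom{14}{2}\binom{11}{3}}{\binom{25}{5}} = \frac{(91)(165)}{53{,}130} = .2826$
c.
$f(5) = \frac{\binom{14}{5}\binom{11}{0}}{\binom{25}{5}} = \frac{(2002)(1)}{53{,}130} = .0377$
d.
$f(0) = \frac{\binom{14}{0}\binom{11}{5}}{\binom{25}{5}} = \frac{(1)(462)}{53{,}130} = .0087$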
52.
Hypergeometric with N = 10 and r = 2. Focus on the probability of 0 defectives, then the probability of rejecting the shipment is 1 - f (0).
a.
n = 3, x = 0
$f(0) = \frac{\binom{2}{0}\binom{8}{3}}{\binom{10}{3}} = \frac{56}{120} = .4667$
P (Reject) = 1 - .4667 = .5333
b.
n = 4, x = 0
$f(0) = \frac{\binom{2}{0}\binom{8}{4}}{\binom{10}{4}} = \frac{70}{210} = .3333$
P (Reject) = 1 - .3333 = .6667
c.
n = 5, x = 0
$f(0) = \frac{\binom{2}{0}\binom{8}{5}}{\binom{10}{5}} = \frac{56}{252} = .2222$
P (Reject) = 1 - .2222 = .7778
d.
Continue the process. n = 7 would be required with the probability of rejecting = .9333
53. a., b. and c.
x    f (x)   x f (x)   x - µ    (x - µ)^2   (x - µ)^2 f (x)
1    0.18    0.18      -2.30    5.29        0.9522
2    0.18    0.36      -1.30    1.69        0.3042
3    0.03    0.09      -0.30    0.09        0.0027
4    0.38    1.52      0.70     0.49        0.1862
5    0.23    1.15      1.70     2.89        0.6647
     1.00    3.30                           2.1100
E(x) = µ = 3.30
σ^2 = 2.1100
$\sigma = \sqrt{2.1100} = 1.4526$
54. a. and b.
x    f (x)   x f (x)   x - µ    (x - µ)^2   (x - µ)^2 f (x)
1    0.02    0.02      -2.64    6.9696      0.139392
2    0.06    0.12      -1.64    2.6896      0.161376
3    0.28    0.84      -0.64    0.4096      0.114688
4    0.54    2.16      0.36     0.1296      0.069984
5    0.10    0.50      1.36     1.8496      0.184960
     1.00    3.64                           0.670400
f (x) ≥ 0 and Σ f (x) = 1
E(x) = µ = 3.64
Var (x) = σ^2 = 0.6704
c.
People do appear to believe the stock market is overvalued. The average response is slightly over halfway between “fairly valued” and “somewhat over valued.”
55. a.
x     f (x)
9     .30
10    .20
11    .25
12    .05
13    .20
b.
E (x) = Σ x f (x) = 9 (.30) + 10 (.20) + 11 (.25) + 12 (.05) + 13 (.20) = 10.65
Expected value of expenses: $10.65 million
c.
Var (x) = Σ (x - µ)^2 f (x) = (9 - 10.65)^2 (.30) + (10 - 10.65)^2 (.20) + (11 - 10.65)^2 (.25) + (12 - 10.65)^2 (.05) + (13 - 10.65)^2 (.20) = 2.1275
d.
Looks Good: E (Profit) = 12 - 10.65 = 1.35 million However, there is a .20 probability that expenses will equal $13 million and the college will run a deficit.
56. a.
n = 20 and x = 3
$f(3) = \binom{20}{3}(0.04)^3(0.96)^{17} = 0.0364$
b.
n = 20 and x = 0
$f(0) = \binom{20}{0}(0.04)^0(0.96)^{20} = 0.4420$
c.
E (x) = n p = 1200 (0.04) = 48
The expected number of appeals is 48.
d.
σ^2 = n p (1 - p) = 1200 (0.04)(0.96) = 46.08
$\sigma = \sqrt{46.08} = 6.7882$
57. a.
We must have E (x) = np ≥ 10 With p = .4, this leads to: n(.4) ≥ 10 n ≥ 25
b.
With p = .12, this leads to: n(.12) ≥ 10 n ≥ 83.33 So, we must contact 84 people in this age group to have an expected number of internet users of at least 10.
c.
$\sigma = \sqrt{25(.4)(.6)} = 2.45$
d.
$\sigma = \sqrt{84(.12)(.88)} = 2.98$
58.
Since the shipment is large we can assume that the probabilities do not change from trial to trial and use the binomial probability distribution.
a.
n = 5
$f(0) = \binom{5}{0}(.01)^0(.99)^5 = .9510$
b.
$f(1) = \binom{5}{1}(.01)^1(.99)^4 = .0480$
c.
1 - f (0) = 1 - .9510 = .0490
d.
No, the probability of finding one or more items in the sample defective when only 1% of the items in the population are defective is small (only .0490). I would consider it likely that more than 1% of the items are defective.
59. a.
E(x) = np = 100(.041) = 4.1
b.
Var (x) = np(1 - p) = 100(.041)(.959) = 3.9319
$\sigma = \sqrt{3.9319} = 1.9829$
60. a.
E(x) = 800(.41) = 328
b.
$\sigma = \sqrt{np(1 - p)} = \sqrt{800(.41)(.59)} = 13.91$
c.
For this one p = .59 and (1-p) = .41, but the answer is the same as in part (b). For a binomial probability distribution, the variance for the number of successes is the same as the variance for the number of failures. Of course, this also holds true for the standard deviation.
61.
µ = 15
prob of 20 or more arrivals = f (20) + f (21) + · · · = .0418 + .0299 + .0204 + .0133 + .0083 + .0050 + .0029 + .0016 + .0009 + .0004 + .0002 + .0001 + .0001 = .1249
62.
µ = 1.5
prob of 3 or more breakdowns is 1 - [ f (0) + f (1) + f (2) ].
1 - [ f (0) + f (1) + f (2) ] = 1 - [ .2231 + .3347 + .2510] = 1 - .8088 = .1912
63.
µ = 10
f (4) = .0189
64. a.
$f(3) = \frac{3^3 e^{-3}}{3!} = 0.2240$
b.
f (3) + f (4) + · · · = 1 - [ f (0) + f (1) + f (2) ]
$f(0) = \frac{3^0 e^{-3}}{0!} = e^{-3} = .0498$
Similarly, f (1) = .1494, f (2) = .2240
∴ 1 - [ .0498 + .1494 + .2240 ] = .5768
65.
Hypergeometric N = 52, n = 5 and r = 4.
a.
$\frac{\binom{4}{2}\binom{48}{3}}{\binom{52}{5}} = \frac{6(17296)}{2{,}598{,}960} = .0399$
b.
$\frac{\binom{4}{1}\binom{48}{4}}{\binom{52}{5}} = \frac{4(194580)}{2{,}598{,}960} = .2995$
c.
$\frac{\binom{4}{0}\binom{48}{5}}{\binom{52}{5}} = \frac{1{,}712{,}304}{2{,}598{,}960} = .6588$
d.
1 - f (0) = 1 - .6588 = .3412
66. a.
$f(1) = \frac{\binom{7}{1}\binom{3}{1}}{\binom{10}{2}} = \frac{(7)(3)}{45} = .4667$
b.
$f(2) = \frac{\binom{7}{2}\binom{3}{0}}{\binom{10}{2}} = \frac{(21)(1)}{45} = .4667$
c.
$f(0) = \frac{\binom{7}{0}\binom{3}{2}}{\binom{10}{2}} = \frac{(1)(3)}{45} = .0667$
Chapter 6 Continuous Probability Distributions Learning Objectives
1.
Understand the difference between how probabilities are computed for discrete and continuous random variables.
2.
Know how to compute probability values for a continuous uniform probability distribution and be able to compute the expected value and variance for such a distribution.
3.
Be able to compute probabilities using a normal probability distribution. Understand the role of the standard normal distribution in this process.
4.
Be able to use tables for the standard normal probability distribution to compute both standard normal probabilities and probabilities for any normal distribution.
5.
Given a cumulative probability be able to compute the z-value and x-value that cuts off the corresponding area in the left tail of a normal distribution.
6.
Be able to use Excel's NORMSDIST and NORMDIST functions to compute probabilities for the standard normal distribution and any normal distribution.
7.
Be able to use Excel's NORMSINV and NORMINV function to find z and x values corresponding to given cumulative probabilities.
8.
Be able to compute probabilities using an exponential probability distribution and using Excel's EXPONDIST function.
9.
Understand the relationship between the Poisson and exponential probability distributions.
Solutions:
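1.
a.
[Graph: f (x) plotted against x; uniform density of height 2 between x = 1.0 and x = 1.5; x axis marked at .50, 1.0, 1.5, 2.0]
b.
P(x = 1.25) = 0. The probability of any single point is zero since the area under the curve above any single point is zero.
c.
P(1.0 ≤ x ≤ 1.25) = 2(.25) = .50
d.
P(1.20 < x < 1.5) = 2(.30) = .60
2.
a.
[Graph: f (x) plotted against x; uniform density of height .10 between x = 10 and x = 20; x axis marked at 0, 10, 20, 30, 40]
b.
P(x < 15) = .10(5) = .50
c.
P(12 ≤ x ≤ 18) = .10(6) = .60
d.
$E(x) = \frac{10 + 20}{2} = 15$
e.
$\mathrm{Var}(x) = \frac{(20 - 10)^2}{12} = 8.33$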
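For a uniform distribution on the interval from a to b, a probability is the interval width times the height 1/(b - a), the mean is (a + b)/2, and the variance is (b - a)^2/12. The Python sketch below checks the Problem 2 answers under those formulas; the function names are hypothetical helpers.

```python
def uniform_prob(lo, hi, a, b):
    """P(lo <= x <= hi) for x uniformly distributed on [a, b]."""
    lo, hi = max(lo, a), min(hi, b)      # clip the interval to [a, b]
    return max(hi - lo, 0.0) / (b - a)


def uniform_mean(a, b):
    return (a + b) / 2


def uniform_variance(a, b):
    return (b - a) ** 2 / 12


a, b = 10, 20                            # Problem 2
print(uniform_prob(a, 15, a, b))         # P(x < 15)
print(uniform_prob(12, 18, a, b))        # P(12 <= x <= 18)
print(uniform_mean(a, b), round(uniform_variance(a, b), 2))
```

Because the probability of any single point is zero for a continuous distribution, the same function serves for both strict and non-strict inequalities.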
3.
a.
[Graph: f (x) plotted against x in minutes; uniform density of height 1/20 between x = 120 and x = 140; x axis marked at 110, 120, 130, 140]
b.
P(x ≤ 130) = (1/20) (130 - 120) = 0.50
c.
P(x > 135) = (1/20) (140 - 135) = 0.25
d.
$E(x) = \frac{120 + 140}{2} = 130$ minutes
4.
a.
[Graph: f (x) plotted against x; uniform density of height 1.0 between x = 0 and x = 1; x axis marked at 0, 1, 2, 3]
b.
P(.25 < x < .75) = 1 (.50) = .50
c.
P(x ≤ .30) = 1 (.30) = .30
d.
P(x > .60) = 1 (.40) = .40
5.
a.
Length of Interval = 261.2 - 238.9 = 22.3
f (x) = 1/22.3 for 238.9 ≤ x ≤ 261.2
f (x) = 0 elsewhere
b.
Note: 1 / 22.3 = 0.045
P(x < 250) = (0.045)(250 - 238.9) = 0.4995
Almost half drive the ball less than 250 yards.
c.
P(x ≥ 255) = (0.045)(261.2 - 255) = 0.279
d.
P(245 ≤ x ≤ 260) = (0.045)(260 - 245) = 0.675
e.
P(x ≥ 250) = 1 - P(x < 250) = 1 - 0.4995 = 0.5005
The probability of anyone driving it 250 yards or more is 0.5005. With 60 players, the expected number driving it 250 yards or more is (60)(0.5005) = 30.03. Rounding, I would expect 30 of these women to drive the ball 250 yards or more.
6.
a.
P(12 ≤ x ≤ 12.05) = .05(8) = .40
b.
P(x ≥ 12.02) = .08(8) = .64
c.
P(x < 11.98) + P(x > 12.02), where P(x < 11.98) = .005(8) = .04 and P(x > 12.02) = .08(8) = .64
Therefore, the probability is .04 + .64 = .68
7.
a.
P(10,000 ≤ x < 12,000) = 2000 (1 / 5000) = .40
The probability your competitor will bid lower than you, and you get the bid, is .40.
b.
P(10,000 ≤ x < 14,000) = 4000 (1 / 5000) = .80
c.
A bid of $15,000 gives a probability of 1 of getting the property.
d.
Yes, the bid that maximizes expected profit is $13,000. The probability of getting the property with a bid of $13,000 is P(10,000 ≤ x < 13,000) = 3000 (1 / 5000) = .60.
The probability of not getting the property with a bid of $13,000 is .40. The profit you will make if you get the property with a bid of $13,000 is $3000 = $16,000 - 13,000. So your expected profit with a bid of $13,000 is EP ($13,000) = .6 ($3000) + .4 (0) = $1800. If you bid $15,000 the probability of getting the bid is 1, but the profit if you do get the bid is only $1000 = $16,000 - 15,000. So your expected profit with a bid of $15,000 is EP ($15,000) = 1 ($1000) + 0 (0) = $1,000.
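The expected-profit comparison in part (d) can be checked numerically. The following is a minimal sketch, assuming (as the solution does) that the competitor's bid is uniform between $10,000 and $15,000 and that the property can be resold for $16,000; the function names are illustrative only.

```python
def win_probability(bid, low=10_000, high=15_000):
    """P(competitor's bid < our bid) for a competitor bid uniform on [low, high]."""
    if bid <= low:
        return 0.0
    if bid >= high:
        return 1.0
    return (bid - low) / (high - low)


def expected_profit(bid, resale_value=16_000):
    """Profit is (resale_value - bid) if we win the property and 0 otherwise."""
    return win_probability(bid) * (resale_value - bid)


# Compare a few candidate bids, including the two discussed in the solution.
for bid in (12_000, 13_000, 14_000, 15_000):
    print(bid, win_probability(bid), expected_profit(bid))

# Search a grid of bids in $100 steps for the one with the largest expected profit.
best_bid = max(range(10_000, 15_001, 100), key=expected_profit)
print("best bid on the grid:", best_bid)
```

On this grid the search agrees with the solution: bidding $13,000 gives the largest expected profit.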
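8.
[Graph: normal curve with σ = 10; x axis marked at 70, 80, 90, 100, 110, 120, 130]
9.
a.
[Graph: normal curve with σ = 5; x axis marked at 35, 40, 45, 50, 55, 60, 65]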
b.
.6826 since 45 and 55 are within plus or minus 1 standard deviation from the mean of 50.
c.
.9544 since 40 and 60 are within plus or minus 2 standard deviations from the mean of 50.
10. a.
P(z ≤ 1.5) = .9332
b.
P(z ≤ 1.0) = .8413
c.
P(1.0 ≤ z ≤ 1.5) = .9332 - .8413 = .0919
d.
P(0 < z < 2.5) = .9938 - .5000 = .4938
11. a.
P(z ≥ -1) = P(z ≤ 1) = .8413
b.
P(z ≤ -1) = 1 - P(z ≤ 1) = 1 - .8413 = .1587
c.
P(z ≥ -1.5) = P(z ≤ 1.5) = .9332
d.
P(-2.5 ≤ z) = P(z ≤ 2.5) = .9938
e.
P(-3 < z ≤ 0) = P(0 < z < 3) = .9986 - .5000 = .4986
12. a.
.7967 - .5000 = .2967
b.
.9418 - .5000 = .4418
c.
1.0000 - .6700 = .3300
d.
.5910
e.
.8849
f.
1.0000 - .7611 = .2389
13. a.
.6879 - .0239 = .6640
b.
.8888 - .6985 = .1903
c.
.9599 - .8508 = .1091
14. a.
z = 1.96
b.
z = 1.96
c.
z = .61
d.
Area to left of z is .8686 z = 1.12
e.
z = .44
f.
Area to left of z is .6700 z = .44
15. a.
Look in the table for an area of 1.0000 - .2119 = .7881. Now z = .80 cuts off an area of .2119 in the upper tail. Thus, for an area of .2119 in the lower tail z = -.80.
b.
Compute .9030 / 2 = .4515 so the area to the left of z is .5000 + .4515 = .9515. z = 1.66.
c.
Compute .2052 / 2 = .1026 so the area to the left of z is .5000 + .1026 = .6026. z = .26.
d.
Look in the table for an area of .9948; z = 2.56.
e.
Look in the table for an area of .6915. Since the value we are seeking is below the mean, the z value must be negative. Thus, z = -.50.
16. a.
Look in the table for an area of .9900. The area value in the table closest to .9900 provides the value z = 2.33.
b.
Look in the table for an area of .9750. This corresponds to z = 1.96.
c.
Look in the table for an area of .9500. Since .9500 is exactly halfway between .9495 (z = 1.64) and .9505 (z = 1.65), we select z = 1.645. However, z = 1.64 or z = 1.65 are also acceptable answers.
d.
Look in the table for an area of .9000. The area value in the table closest to .9000 provides the value z = 1.28.
17.
Let x = amount spent
µ = 527, σ = 160
a.
z = (700 − 527)/160 = 1.08
P(x > 700) = P(z > 1.08) = .5000 - .3599 = .1401
b.
z = (100 − 527)/160 = −2.67
P(x < 100) = P(z < -2.67) = .5000 - .4962 = .0038
c.
At 700, z = 1.08 from part (a)
At 450, z = (450 − 527)/160 = −.48
P(450 < x < 700) = P(-.48 < z < 1.08) = .8599 - .3156 = .5443
d.
z = (300 − 527)/160 = −1.42
P(x ≤ 300) = P(z ≤ -1.42) = .5000 - .4222 = .0778
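The table lookups in problem 17 can be cross-checked in software. This is a sketch only, using Python's standard-library `statistics.NormalDist` in place of the printed normal table; because the manual rounds z to two decimals before reading the table, the software values differ slightly in the third or fourth decimal place.

```python
from statistics import NormalDist

# Amount spent: normal with mean 527 and standard deviation 160 (from problem 17).
spending = NormalDist(mu=527, sigma=160)

p_over_700 = 1 - spending.cdf(700)                    # part (a), table answer .1401
p_under_100 = spending.cdf(100)                       # part (b), table answer .0038
p_450_to_700 = spending.cdf(700) - spending.cdf(450)  # part (c), table answer .5443
p_at_most_300 = spending.cdf(300)                     # part (d), table answer .0778

for label, value in [("P(x > 700)", p_over_700),
                     ("P(x < 100)", p_under_100),
                     ("P(450 < x < 700)", p_450_to_700),
                     ("P(x <= 300)", p_at_most_300)]:
    print(f"{label} = {value:.4f}")
```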
18. a.
Find P(x ≥ 60)
At x = 60, z = (60 − 49)/16 = 0.69
P(x < 60) = 0.7549
P(x ≥ 60) = 1 − P(x < 60) = 0.2451
b.
Find P(x ≤ 30)
At x = 30, z = (30 − 49)/16 = −1.19
P(x ≤ 30) = 1.0000 − 0.8830 = 0.1170
c.
Find z-score so that P(z ≥ z-score) = 0.10
z-score = 1.28 cuts off 10% in upper tail
Now, solve for corresponding value of x.
1.28 = (x − 49)/16
x = 49 + (16)(1.28) = 69.48
So, 10% of subscribers spend 69.48 minutes or more reading The Wall Street Journal.
19.
We have µ = 3.5 and σ = .8.
a.
z = (5.0 − 3.5)/.8 ≈ 1.88
P(x > 5.0) = P(z > 1.88) = 1 - P(z < 1.88) = 1 - .9699 = .0301
The rainfall exceeds 5 inches in 3.01% of the Aprils.
b.
z = (3 − 3.5)/.8 ≈ −.63
P(x < 3.0) = P(z < -.63) = P(z > .63) = 1 - P(z < .63) = 1 - .7357 = .2643
The rainfall is less than 3 inches in 26.43% of the Aprils.
c.
z = 1.28 cuts off approximately .10 in the upper tail of a normal distribution.
x = 3.5 + 1.28(.8) = 4.524
If it rains 4.524 inches or more, April will be classified as extremely wet.
20.
We use µ = 27 and σ = 8
a.
z = (11 − 27)/8 = −2
P(x ≤ 11) = P(z ≤ -2) = 1.0000 - .9772 = .0228
The probability a randomly selected subscriber spends less than 11 hours on the computer is .0228.
b.
z = (40 − 27)/8 ≈ 1.63
P(x > 40) = P(z > 1.63) = 1 - P(z ≤ 1.63) = 1 - .9484 = .0516
5.16% of subscribers spend over 40 hours per week using the computer.
c.
A z-value of .84 cuts off an area of .20 in the upper tail.
x = 27 + .84(8) = 33.72
A subscriber who uses the computer 33.72 hours or more would be classified as a heavy user.
21.
From the normal probability tables, a z-value of 2.05 cuts off an area of approximately .02 in the upper tail of the distribution.
x = µ + zσ = 100 + 2.05(15) = 130.75
A score of 131 or better should qualify a person for membership in Mensa.
22.
Use µ = 441.84 and σ = 90
a.
At 400, z = (400 − 441.84)/90 ≈ −.46
At 500, z = (500 − 441.84)/90 ≈ .65
P(0 ≤ z < .65) = .2422
P(-.46 ≤ z < 0) = .1772
P(400 ≤ x ≤ 500) = .1772 + .2422 = .4194
The probability a worker earns between $400 and $500 is .4194.
b.
Must find the z-value that cuts off an area of .20 in the upper tail. Using the normal tables, we find z = .84 cuts off approximately .20 in the upper tail.
So, x = µ + zσ = 441.84 + .84(90) = 517.44
Weekly earnings of $517.44 or above will put a production worker in the top 20%.
c.
At 250, z = (250 − 441.84)/90 ≈ −2.13
P(x ≤ 250) = P(z ≤ -2.13) = 1.0000 - .9834 = .0166
The probability a randomly selected production worker earns less than $250 per week is .0166.
23. a.
z = (60 − 80)/10 = −2
Area to left is 1.0000 - .9772 = .0228
b.
At x = 60, z = (60 − 80)/10 = −2
Area to left is .0228
At x = 75, z = (75 − 80)/10 = −.5
Area to left is .3085
P(60 ≤ x ≤ 75) = .3085 - .0228 = .2857
c.
z = (90 − 80)/10 = 1
Area = 1 - .8413 = .1587
Therefore 15.87% of students will not complete on time.
(60)(.1587) = 9.522
We would expect 9.522 students to be unable to complete the exam in time.
24. a.
x̄ = Σxi / n = 902.75
s = √[Σ(xi − x̄)² / (n − 1)] = 114.185
We will use x̄ as an estimate of µ and s as an estimate of σ in parts (b) - (d) below.
b.
Remember the data are in millions of shares.
At 800, z = (800 − 902.75)/114.185 ≈ −.90
P(x ≤ 800) = P(z ≤ -.90) = 1 - P(z ≤ .90) = 1 - .8159 = .1841
The probability trading volume will be less than 800 million shares is .1841.
c.
At 1000, z = (1000 − 902.75)/114.185 ≈ .85
P(x ≥ 1000) = P(z ≥ .85) = 1 - P(z ≤ .85) = 1 - .8023 = .1977
The probability trading volume will exceed 1 billion shares is .1977.
d.
A z-value of 1.645 cuts off an area of .05 in the upper tail.
x = µ + zσ = 902.75 + 1.645(114.185) = 1,090.584
They should issue a press release any time share volume exceeds 1,091 million.
25.
µ = 442.54, σ = 65
a.
z = (400 − 442.54)/65 = −.65
P(x > 400) = P(z > -.65) = .5000 + .2422 = .7422
b.
z = (300 − 442.54)/65 = −2.19
P(x ≤ 300) = P(z ≤ -2.19) = .5000 - .4857 = .0143
c.
At x = 400, z = -.65 from part (a)
At x = 500, z = (500 − 442.54)/65 = .88
P(400 < x < 500) = P(-.65 < z < .88) = .8106 - .2578 = .5528
26. a.
P(x ≤ 6) = 1 - e^(-6/8) = 1 - .4724 = .5276
b.
P(x ≤ 4) = 1 - e^(-4/8) = 1 - .6065 = .3935
c.
P(x ≥ 6) = 1 - P(x ≤ 6) = 1 - .5276 = .4724
d.
P(4 ≤ x ≤ 6) = P(x ≤ 6) - P(x ≤ 4) = .5276 - .3935 = .1341
27. a.
P(x ≤ x0) = 1 - e^(-x0/3)
b.
P(x ≤ 2) = 1 - e^(-2/3) = 1 - .5134 = .4866
c.
P(x ≥ 3) = 1 - P(x ≤ 3) = 1 - (1 - e^(-3/3)) = e^(-1) = .3679
d.
P(x ≤ 5) = 1 - e^(-5/3) = 1 - .1889 = .8111
e.
P(2 ≤ x ≤ 5) = P(x ≤ 5) - P(x ≤ 2) = .8111 - .4866 = .3245
28. a.
P(x < 10) = 1 - e^(-10/20) = .3935
b.
P(x > 30) = 1 - P(x ≤ 30) = 1 - (1 - e^(-30/20)) = e^(-30/20) = .2231
c.
P(10 ≤ x ≤ 30) = P(x ≤ 30) - P(x ≤ 10)
= (1 - e^(-30/20)) - (1 - e^(-10/20)) = e^(-10/20) - e^(-30/20) = .6065 - .2231 = .3834
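Problems 26 through 28 all rest on the exponential cumulative probability P(x ≤ x0) = 1 − e^(−x0/µ). The sketch below applies it to problem 28, where the mean is µ = 20; the helper name `exp_cdf` is an illustrative choice, not something from the text.

```python
import math


def exp_cdf(x0, mean):
    """P(x <= x0) for an exponential distribution with the given mean."""
    return 1 - math.exp(-x0 / mean)


mean = 20  # problem 28

p_less_10 = exp_cdf(10, mean)                      # part (a)
p_more_30 = 1 - exp_cdf(30, mean)                  # part (b)
p_between = exp_cdf(30, mean) - exp_cdf(10, mean)  # part (c)

print(round(p_less_10, 4), round(p_more_30, 4), round(p_between, 4))
```

The three results round to .3935, .2231, and .3834, matching the answers above.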
29. a.
[Graph: exponential probability density function f(x); f(x) axis marked from .01 to .09, x axis marked at 6, 12, 18, 24]
b.
P(x ≤ 12) = 1 - e^(-12/12) = 1 - .3679 = .6321
c.
P(x ≤ 6) = 1 - e^(-6/12) = 1 - .6065 = .3935
d.
P(x ≥ 30) = 1 - P(x < 30) = 1 - (1 - e^(-30/12)) = .0821
30. a.
50 hours
b.
P(x ≤ 25) = 1 - e^(-25/50) = 1 - .6065 = .3935
c.
P(x ≥ 100) = 1 - (1 - e^(-100/50)) = .1353
31. a.
P(x < 2) = 1 - e^(-2/2.78) = .5130
b.
P(x > 5) = 1 - P(x ≤ 5) = 1 - (1 - e^(-5/2.78)) = e^(-5/2.78) = .1655
c.
P(x > 2.78) = 1 - P(x ≤ 2.78) = 1 - (1 - e^(-2.78/2.78)) = e^(-1) = .3679
This may seem surprising since the mean is 2.78 minutes. But, for the exponential distribution, the probability of a value greater than the mean is significantly less than the probability of a value less than the mean.
32. a.
If the average number of transactions per year follows the Poisson distribution, the time between transactions follows the exponential distribution. So,
µ = 1/30 of a year
and
1/µ = 1/(1/30) = 30
then f(x) = 30e^(-30x)
b.
A month is 1/12 of a year so,
P(x > 1/12) = 1 − P(x ≤ 1/12) = 1 − (1 − e^(−30/12)) = e^(−30/12) = .0821
The probability of no transaction during January is the same as the probability of no transaction during any month: .0821
c.
Since 1/2 month is 1/24 of a year, we compute,
P(x ≤ 1/24) = 1 − e^(−30/24) = 1 − .2865 = .7135
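The Poisson and exponential views of problem 32 give the same answer for part (b), which the following sketch illustrates. It assumes a rate of 30 transactions per year, as in the solution; the function names are hypothetical helpers.

```python
import math

RATE_PER_YEAR = 30  # average number of transactions per year (problem 32)


def poisson_pmf(k, mean):
    """P(exactly k events) for a Poisson distribution with the given mean."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def exp_survival(t, rate):
    """P(time until the next event > t) for an exponential distribution."""
    return math.exp(-rate * t)


# Poisson view: zero transactions in one month (expected count 30/12 = 2.5).
p_none_poisson = poisson_pmf(0, RATE_PER_YEAR / 12)

# Exponential view: the wait for the next transaction exceeds 1/12 of a year.
p_none_exponential = exp_survival(1 / 12, RATE_PER_YEAR)

# Part (c): a transaction occurs within the first half month (1/24 of a year).
p_within_half_month = 1 - exp_survival(1 / 24, RATE_PER_YEAR)

print(round(p_none_poisson, 4), round(p_none_exponential, 4),
      round(p_within_half_month, 4))
```

Both routes to part (b) evaluate e^(−2.5), so they agree by construction.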
33. a.
Let x = sales price ($1000s)
f(x) = 1/25 for 200 ≤ x ≤ 225; f(x) = 0 elsewhere
b.
P(x ≥ 215) = (1/25)(225 - 215) = 0.40
c.
P(x < 210) = (1/25)(210 - 200) = 0.40
d.
E(x) = (200 + 225)/2 = 212.5, that is, $212,500
If she waits, her expected sale price will be $2,500 higher than if she sells it back to her company now. However, there is a 0.40 probability that she will get less. It’s a close call. But, the expected value approach to decision making would suggest she should wait.
34. a.
For a normal distribution, the mean and the median are equal.
µ = 63,000
b.
Find the z-score that cuts off 10% in the lower tail.
z-score = -1.28
Solving for x,
-1.28 = (x − 63,000)/15,000
x = 63,000 - 1.28(15,000) = 43,800
The lower 10% of mortgage debt is $43,800 or less.
c.
Find P(x > 80,000)
At x = 80,000, z = (80,000 − 63,000)/15,000 = 1.13
P(x > 80,000) = 1.0000 - .8708 = 0.1292
d.
Find the z-score that cuts off 5% in the upper tail.
z-score = 1.645. Solve for x.
1.645 = (x − 63,000)/15,000
x = 63,000 + 1.645(15,000) = 87,675
The upper 5% of mortgage debt is in excess of $87,675.
35. a.
P(defect) = 1 - P(9.85 ≤ x ≤ 10.15) = 1 - P(-1 ≤ z ≤ 1) = 1 - .6826 = .3174
Expected number of defects = 1000(.3174) = 317.4
b.
P(defect) = 1 - P(9.85 ≤ x ≤ 10.15) = 1 - P(-3 ≤ z ≤ 3) = 1 - .9972 = .0028 Expected number of defects = 1000(.0028) = 2.8
c.
Reducing the process standard deviation causes a substantial reduction in the number of defects.
36.
µ = 6,312
a.
z = -1.645 cuts off .05 in the lower tail
So, −1.645 = (1000 − 6312)/σ
σ = (1000 − 6312)/(−1.645) = 3229
b.
At 6000, z = (6000 − 6312)/3229 = −.10
At 4000, z = (4000 − 6312)/3229 = −.72
P(4000 < x < 6000) = P(-.72 < z < -.10) = .4602 - .2358 = .2244
c.
z = 1.88 cuts off approximately .03 in the upper tail
x = 6312 + 1.88(3229) = 12,382.52
The households with the highest 3% of expenditures spent more than $12,382.
37.
µ = 10,000, σ = 1500
a.
At x = 12,000, z = (12,000 − 10,000)/1500 = 1.33
Area to left is .9082
P(x > 12,000) = 1.0000 - .9082 = .0918
b.
At .95, z = 1.645 = (x − 10,000)/1500
Therefore, x = 10,000 + 1.645(1500) = 12,468.
[Graph: normal curve centered at 10,000 with 95% of the area below 12,468 and 0.05 in the upper tail]
12,468 tubes should be produced.
38. a.
At x = 200, z = (200 − 150)/25 = 2
Area = .9772
P(x > 200) = 1 - .9772 = .0228
b.
Expected Profit = Expected Revenue - Expected Cost = 200 - 150 = $50
39. a.
Find P(80,000 ≤ x ≤ 150,000)
At x = 150,000, z = (150,000 − 126,681)/30,000 = 0.78
P(x ≤ 150,000) = 0.7823
At x = 80,000, z = (80,000 − 126,681)/30,000 = −1.56
P(x ≤ 80,000) = 1.0000 - .9406 = 0.0594
P(80,000 ≤ x ≤ 150,000) = 0.7823 - 0.0594 = 0.7229
b.
Find P(x < 50,000)
At x = 50,000, z = (50,000 − 126,681)/30,000 = −2.56
P(x < 50,000) = 1.0000 - .9948 = 0.0052
c.
Find the z-score cutting off 95% in the left tail.
z-score = 1.645. Solve for x.
1.645 = (x − 126,681)/30,000
x = 126,681 + 1.645(30,000) = 176,031
The probability is 0.95 that the number of lost jobs will not exceed 176,031.
40. a.
At 400, z = (400 − 450)/100 = −.500
Area to left is .3085
At 500, z = (500 − 450)/100 = +.500
Area to left is .6915
P(400 ≤ x ≤ 500) = .6915 - .3085 = .3830
38.3% will score between 400 and 500.
b.
At 630, z = (630 − 450)/100 = 1.80
96.41% do worse and 3.59% do better.
c.
At 480, z = (480 − 450)/100 = .30
Area to left is .6179
38.21% are acceptable.
41. a.
At 75,000, z = (75,000 − 67,000)/7,000 ≈ 1.14
P(x > 75,000) = P(z > 1.14) = 1 - P(z ≤ 1.14) = 1 - .8729 = .1271
The probability of a woman receiving a salary in excess of $75,000 is .1271.
b.
At 75,000, z = (75,000 − 65,500)/7,000 ≈ 1.36
P(x > 75,000) = P(z > 1.36) = 1 - P(z ≤ 1.36) = 1 - .9131 = .0869
The probability of a man receiving a salary in excess of $75,000 is .0869.
c.
At x = 50,000, z = (50,000 − 67,000)/7,000 ≈ −2.43
P(x < 50,000) = P(z < -2.43) = 1 - P(z < 2.43) = 1 - .9925 = .0075
The probability of a woman receiving a salary below $50,000 is very small: .0075
d.
The answer to this is the male copywriter salary that cuts off an area of .01 in the upper tail of the distribution for male copywriters. Use z = 2.33
x = 65,500 + 2.33(7,000) = 81,810
A woman who makes $81,810 or more will earn more than 99% of her male counterparts.
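Percentile questions such as part (d) run the table lookup in reverse. As a rough software check, the sketch below uses `statistics.NormalDist.inv_cdf` from the Python standard library; it works with the unrounded z value (about 2.326) rather than the table value 2.33, so the cutoff comes out a few dollars below $81,810.

```python
from statistics import NormalDist

# Salary distributions from problem 41.
women = NormalDist(mu=67_000, sigma=7_000)
men = NormalDist(mu=65_500, sigma=7_000)

p_woman_over_75k = 1 - women.cdf(75_000)  # part (a), table answer .1271
p_man_over_75k = 1 - men.cdf(75_000)      # part (b), table answer .0869
p_woman_under_50k = women.cdf(50_000)     # part (c), table answer .0075

# Part (d): salary exceeded by only 1% of male copywriters.
top_1_percent_cutoff = men.inv_cdf(0.99)

print(round(p_woman_over_75k, 4), round(p_man_over_75k, 4),
      round(p_woman_under_50k, 4), round(top_1_percent_cutoff))
```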
42.
σ = .6
At 2%, z = -2.05
z = (x − µ)/σ with x = 18
∴ -2.05 = (18 − µ)/.6
µ = 18 + 2.05(.6) = 19.23 oz.
[Graph: normal curve with mean µ = 19.23 and an area of 0.02 in the lower tail below 18]
The mean filling weight must be 19.23 oz.
43. a.
P(x ≤ 15) = 1 - e^(-15/36) = 1 - .6592 = .3408
b.
P(x ≤ 45) = 1 - e^(-45/36) = 1 - .2865 = .7135
Therefore P(15 ≤ x ≤ 45) = .7135 - .3408 = .3727
c.
P(x ≥ 60) = 1 - P(x < 60) = 1 - (1 - e^(-60/36)) = .1889
44. a.
Mean time between arrivals = 1/7 minutes
b.
f(x) = 7e^(-7x)
c.
P(x > 1) = 1 - P(x < 1) = 1 - [1 - e^(-7(1))] = e^(-7) = .0009
d.
12 seconds is .2 minutes
P(x > .2) = 1 - P(x < .2) = 1 - [1 - e^(-7(.2))] = e^(-1.4) = .2466
45. a.
f(x) = (1/36.5)e^(-x/36.5) ≈ .0274e^(-.0274x)
b.
P(x < 40) = 1 - e^(-.0274(40)) = 1 - .3342 = .6658
P(x < 20) = 1 - e^(-.0274(20)) = 1 - .5781 = .4219
P(20 < x < 40) = .6658 - .4219 = .2439
c.
From part (b), P(x < 40) = .6658
P(x > 40) = P(x ≥ 40) = 1 - P(x < 40) = 1 - .6658 = .3342
46. a.
1/µ = 0.5, therefore µ = 2 minutes = mean time between telephone calls
b.
Note: 30 seconds = .5 minutes
P(x ≤ .5) = 1 - e^(-.5/2) = 1 - .7788 = .2212
c.
P(x ≤ 1) = 1 - e^(-1/2) = 1 - .6065 = .3935
d.
P(x ≥ 5) = 1 - P(x < 5) = 1 - (1 - e^(-5/2)) = .0821
Chapter 7 Sampling and Sampling Distributions
Learning Objectives
1.
Understand the importance of sampling and how results from samples can be used to provide estimates of population parameters such as the population mean, the population standard deviation and / or the population proportion.
2.
Know what simple random sampling is and how simple random samples are selected.
3.
Be able to select a simple random sample using Excel.
4.
Understand the concept of a sampling distribution.
5.
Know the central limit theorem and the important role it plays in sampling.
6.
Know the characteristics of the sampling distribution of the sample mean ( x ) and the sampling distribution of the sample proportion ( p ).
7.
Learn about a variety of sampling methods including stratified random sampling, cluster sampling, systematic sampling, convenience sampling and judgment sampling.
8.
Know the definition of the following terms:
simple random sampling
sampling with replacement
sampling without replacement
sampling distribution
point estimator
finite population correction factor
standard error
Solutions:
1.
a.
AB, AC, AD, AE, BC, BD, BE, CD, CE, DE
b.
With 10 samples, each has a 1/10 probability.
c.
E and C, because 8 and 0 do not apply; 5 identifies E; 7 does not apply; 5 is skipped since E is already in the sample; 3 identifies C; 2 is not needed since the sample of size 2 is complete.
2.
Using the last 3-digits of each 5-digit grouping provides the random numbers: 601, 022, 448, 147, 229, 553, 147, 289, 209 Numbers greater than 350 do not apply and the 147 can only be used once. Thus, the simple random sample of four includes 22, 147, 229, and 289.
3.
459, 147, 385, 113, 340, 401, 215, 2, 33, 348
4.
a.
We first number the companies from 1 to 10: 1 AT&T, 2 IBM, ⋅⋅⋅, 10 Pfizer.
Random Number    Company in Sample
6                Microsoft
8                Motorola
5                Cisco
4                Johnson & Johnson
1*               AT&T
*Note that the random numbers 5 and 6 were skipped because we are sampling without replacement.
b.
Company              Random Number Assigned    Company in Sample
AT&T                 6
IBM                  8
American Online      5                         √
Johnson & Johnson    4                         √
Cisco Systems        5
Microsoft            6
General Electric     1                         √
Motorola             1                         √
Intel                3                         √
Pfizer               8
Note that both American Online and Cisco were assigned a random number of 5. We broke the tie by including the first to receive a 5 in the sample.
c.
Number of Samples of Size 5 = 10!/(5!(10 − 5)!) = (10)(9)(8)(7)(6)/((5)(4)(3)(2)(1)) = 252
d.
Use Excel's RAND() function to assign a random number between 0 and 1 to each of the companies, then proceed as in part (b) above. The five with the smallest random numbers can be found by using Excel's SORT tool.
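The procedure in part (d) is not tied to Excel. The sketch below mirrors it in Python as an illustration: assign each company a random number, sort, and keep the five smallest. The function name and the seed are arbitrary choices made here for reproducibility, and the selected companies will differ from the manual's because the random numbers differ.

```python
import math
import random

companies = ["AT&T", "IBM", "American Online", "Johnson & Johnson",
             "Cisco Systems", "Microsoft", "General Electric",
             "Motorola", "Intel", "Pfizer"]


def simple_random_sample(items, n, seed=None):
    """Pair each item with a uniform random number, sort, keep the n smallest."""
    rng = random.Random(seed)
    keyed = [(rng.random(), item) for item in items]
    keyed.sort()
    return [item for _, item in keyed[:n]]


sample = simple_random_sample(companies, 5, seed=1)
print(sample)

# Part (c): number of distinct samples of size 5 from 10 companies.
print(math.comb(len(companies), 5))
```

Because every ordering of the random numbers is equally likely, each of the 252 possible samples has the same chance of being selected.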
5.
a.
283, 610, 39, 254, 568, 353, 602, 421, 638, 164
b.
Generate a random number for each of the 645 students. Include the students associated with the 50 smallest random numbers in the sample.
6.
2782, 493, 825, 1807, 289
7.
Use the data disk accompanying the book and the EAI file. Generate a random number using the RAND() function for each of the 2500 managers. Then sort the list of managers with respect to the random numbers. The first 50 managers are the sample.
8.
a.
21 random numbers were needed. The teams selected are Wisconsin, Clemson, Washington, USC, Oklahoma, and Colorado.
b.
Use Excel to generate 25 random numbers - one for each team. Then sort the list of teams with respect to the list of random numbers. We can also use the same first two digits in column 9 of Table 7.1. Using the random numbers in Table 7.1, the following 6 teams are used in the sample: Nebraska, Florida State, Michigan, Texas, Washington, and TCU. These are the teams with the six smallest random numbers. (There is a tie between TCU and Colorado for 6th smallest.)
9.
511, 791, 99, 671, 152, 584, 45, 783, 301, 568, 754, 750
10.
finite, infinite, infinite, infinite, finite
11. a.
x̄ = Σxi / n = 54/6 = 9
b.
s = √[Σ(xi − x̄)² / (n − 1)]
Σ(xi − x̄)² = (-4)² + (-1)² + 1² + (-2)² + 1² + 5² = 48
s = √(48/(6 − 1)) = 3.1
12. a.
p̄ = 75/150 = .50
b.
p̄ = 55/150 = .3667
13. a.
xi        (xi − x̄)    (xi − x̄)²
94        +1           1
100       +7           49
85        -8           64
94        +1           1
92        -1           1
Totals    465          0            116
x̄ = Σxi / n = 465/5 = 93
b.
s = √[Σ(xi − x̄)² / (n − 1)] = √(116/4) = 5.39
14. a.
149/784 = 0.19
b.
251/784 = 0.32
c.
Total receiving cash = 149 + 219 + 251 = 619 619/784 = 0.79
15. a.
x̄ = Σxi / n = 70/10 = 7 years
b.
s = √[Σ(xi − x̄)² / (n − 1)] = √(20.2/(10 − 1)) = 1.5 years
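The point estimates x̄ and s used in problems 11 through 15 can be reproduced with Python's standard `statistics` module. The sketch below uses the five data values listed in problem 13, since the raw data for problem 15 are not shown in the solution.

```python
import statistics

# Data values from the table in problem 13.
data = [94, 100, 85, 94, 92]

x_bar = statistics.mean(data)   # sample mean
s = statistics.stdev(data)      # sample standard deviation (divides by n - 1)

# The same standard deviation computed step by step, as in the worked solution.
squared_deviations = [(x - x_bar) ** 2 for x in data]
s_by_hand = (sum(squared_deviations) / (len(data) - 1)) ** 0.5

print(x_bar, sum(squared_deviations), round(s, 2), round(s_by_hand, 2))
```

Note that `statistics.stdev` uses the n − 1 divisor, which is the sample formula applied throughout these solutions; `statistics.pstdev` would divide by n instead.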
16.
p = 1117/1400 = 0.80
17. a.
595/1008 = .59
b.
332/1008 = .33
c.
81/1008 = .08
18. a.
Use the data disk accompanying the book and the EAI file. Generate a random number for each manager and select managers associated with the 50 smallest random numbers as the sample.
b.
Use Excel's AVERAGE function to compute the mean for the sample.
c.
Use Excel's STDEV function to compute the sample standard deviation.
d.
Use the sample proportion as a point estimate of the population proportion.
19. a.
The sampling distribution is normal with
E(x̄) = µ = 200
σx̄ = σ/√n = 50/√100 = 5
For ±5, (x̄ − µ) = 5
z = (x̄ − µ)/σx̄ = 5/5 = 1
Probability of being within ±5 is .6826
b.
For ±10, (x̄ − µ) = 10
z = (x̄ − µ)/σx̄ = 10/5 = 2
Probability of being within ±10 is .9544
20.
σx̄ = σ/√n
σx̄ = 25/√50 = 3.54
σx̄ = 25/√100 = 2.50
σx̄ = 25/√150 = 2.04
σx̄ = 25/√200 = 1.77
The standard error of the mean decreases as the sample size increases.
21. a.
σx̄ = σ/√n = 10/√50 = 1.41
b.
n/N = 50/50,000 = .001
Use σx̄ = σ/√n = 10/√50 = 1.41
c.
n/N = 50/5000 = .01
Use σx̄ = σ/√n = 10/√50 = 1.41
d.
n/N = 50/500 = .10
Use σx̄ = √((N − n)/(N − 1)) (σ/√n) = √((500 − 50)/(500 − 1)) (10/√50) = 1.34
Note: Only case (d) where n/N = .10 requires the use of the finite population correction factor.
22. a.
Using the central limit theorem, we can approximate the sampling distribution of x̄ with a normal probability distribution provided n ≥ 30.
b.
n = 30: σx̄ = σ/√n = 50/√30 = 9.13
[Graph: normal sampling distribution of x̄ centered at 400]
n = 40: σx̄ = σ/√n = 50/√40 = 7.91
[Graph: normal sampling distribution of x̄ centered at 400]
23. a.
σx̄ = σ/√n = 16/√50 = 2.26
For ±2, (x̄ − µ) = 2
z = (x̄ − µ)/σx̄ = 2/2.26 = 0.88
P(0 ≤ z ≤ 0.88) = .3106
For ±2, the probability is 2(.3106) = .6212
b.
σx̄ = 16/√100 = 1.60
z = (x̄ − µ)/σx̄ = 2/1.60 = 1.25
P(0 ≤ z ≤ 1.25) = .3944
For ±2, the probability is 2(.3944) = .7888
c.
σx̄ = 16/√200 = 1.13
z = (x̄ − µ)/σx̄ = 2/1.13 = 1.77
P(0 ≤ z ≤ 1.77) = .4616
For ±2, the probability is 2(.4616) = .9232
d.
σx̄ = 16/√400 = 0.80
z = (x̄ − µ)/σx̄ = 2/0.80 = 2.50
P(0 ≤ z ≤ 2.50) = .4938
For ±2, the probability is 2(.4938) = .9876
e.
The larger sample provides a higher probability that the sample mean will be within ± 2 of µ.
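The pattern in problem 23 can be tabulated in a few lines. This is a sketch under the problem's assumptions (σ = 16, margin of ±2), with the standard-library `statistics.NormalDist` standing in for the normal table; because z is not rounded to two decimals here, results can differ from the table-based answers by about .002.

```python
from math import sqrt
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def prob_within(margin, sigma, n):
    """P(sample mean falls within +/- margin of the population mean)."""
    standard_error = sigma / sqrt(n)
    z = margin / standard_error
    return STANDARD_NORMAL.cdf(z) - STANDARD_NORMAL.cdf(-z)


# Problem 23: sigma = 16, margin = 2, sample sizes from parts (a) through (d).
results = {n: prob_within(2, 16, n) for n in (50, 100, 200, 400)}
for n, p in results.items():
    print(n, round(16 / sqrt(n), 2), round(p, 4))
```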
24. a.
[Graph: normal sampling distribution of x̄ centered at E(x̄) = 51,800]
σx̄ = σ/√n = 4000/√60 = 516.40
The normal distribution is based on the Central Limit Theorem.
b.
For n = 120, E(x̄) remains $51,800 and the sampling distribution of x̄ can still be approximated by a normal distribution. However, σx̄ is reduced to 4000/√120 = 365.15.
c.
As the sample size is increased, the standard error of the mean, σx̄, is reduced. This appears logical from the point of view that larger samples should tend to provide sample means that are closer to the population mean. Thus, the variability in the sample mean, measured in terms of σx̄, should decrease as the sample size is increased.
25. a.
σx̄ = σ/√n = 4000/√60 = 516.40
[Graph: normal sampling distribution of x̄ with the x̄ axis marked at 51,300, 51,800, 52,300]
z = (52,300 − 51,800)/516.40 = +.97
P(0 ≤ z ≤ .97) = .3340
For ±500, the probability is 2(.3340) = .6680
b.
σx̄ = σ/√n = 4000/√120 = 365.15
z = (52,300 − 51,800)/365.15 = +1.37
P(0 ≤ z ≤ 1.37) = .4147
For ±500, the probability is 2(.4147) = .8294
26. a.
A normal distribution
E(x̄) = 1.20
σx̄ = σ/√n = 0.10/√50 = 0.014
b.
z = (1.22 − 1.20)/0.014 = 1.41
P(0 ≤ z ≤ 1.41) = .4207
z = (1.18 − 1.20)/0.014 = −1.41
P(-1.41 ≤ z ≤ 0) = .4207
probability = 0.4207 + 0.4207 = 0.8414
c.
z = (1.21 − 1.20)/0.014 = +0.71
P(0 ≤ z ≤ .71) = .2612
z = (1.19 − 1.20)/0.014 = −0.71
P(-.71 ≤ z ≤ 0) = .2612
probability = 0.2612 + 0.2612 = 0.5224
27. a.
E(x̄) = 1017
σx̄ = σ/√n = 100/√75 = 11.55
z = (1027 − 1017)/11.55 = 0.87
P(0 ≤ z ≤ .87) = .3078
z = (1007 − 1017)/11.55 = −0.87
P(-.87 ≤ z ≤ 0) = .3078
probability = 0.3078 + 0.3078 = 0.6156
b.
z = (1037 − 1017)/11.55 = 1.73
P(0 ≤ z ≤ 1.73) = .4582
z = (997 − 1017)/11.55 = −1.73
P(-1.73 ≤ z ≤ 0) = .4582
probability = 0.4582 + 0.4582 = 0.9164
28. a.
z = (x̄ − 34,000)/(σ/√n)
Error = x̄ - 34,000 = 250
n = 30:     z = 250/(2000/√30) = .68      .2518 x 2 = .5036
n = 50:     z = 250/(2000/√50) = .88      .3106 x 2 = .6212
n = 100:    z = 250/(2000/√100) = 1.25    .3944 x 2 = .7888
n = 200:    z = 250/(2000/√200) = 1.77    .4616 x 2 = .9232
n = 400:    z = 250/(2000/√400) = 2.50    .4938 x 2 = .9876
b. A larger sample increases the probability that the sample mean will be within a specified distance from the population mean. In the salary example, the probability of being within ±250 of µ ranges from .5036 for a sample of size 30 to .9876 for a sample of size 400.
29. a.
E(x̄) = 982
σx̄ = σ/√n = 210/√40 = 33.2
z = (x̄ − µ)/(σ/√n) = 100/(210/√40) = 3.01
.4987 x 2 = .9974
b.
z = (x̄ − µ)/(σ/√n) = 25/(210/√40) = .75
.2734 x 2 = .5468
c.
The sample with n = 40 has a very high probability (.9974) of providing a sample mean within ± $100. However, the sample with n = 40 only has a .5468 probability of providing a sample mean within ± $25. A larger sample size is desirable if the ± $25 is needed.
30. a.
Normal distribution, E(x̄) = 166,500
σx̄ = σ/√n = 42,000/√100 = 4200
b.
z = (x̄ − µ)/(σ/√n) = 10,000/4,200 = 2.38
P(-2.38 ≤ z ≤ 2.38) = .9826
c.
$5000: z = 5000/4200 = 1.19, P(-1.19 ≤ z ≤ 1.19) = .7660
$2500: z = 2500/4200 = .60, P(-.60 ≤ z ≤ .60) = .4514
$1000: z = 1000/4200 = .24, P(-.24 ≤ z ≤ .24) = .1896
d.
Increase sample size to improve precision of the estimate. Sample size of 100 only has a .4514 probability of being within ± $2,500.
31.
µ = 1.46, σ = .15
a.
n = 30
z = (x̄ − µ)/(σ/√n) = .03/(.15/√30) ≈ 1.10
P(1.43 ≤ x̄ ≤ 1.49) = P(-1.10 ≤ z ≤ 1.10) = .3643(2) = .7286
b.
n = 50
z = (x̄ − µ)/(σ/√n) = .03/(.15/√50) ≈ 1.41
P(1.43 ≤ x̄ ≤ 1.49) = P(-1.41 ≤ z ≤ 1.41) = .4207(2) = .8414
c.
n = 100
z = (x̄ − µ)/(σ/√n) = .03/(.15/√100) = 2.00
P(1.43 ≤ x̄ ≤ 1.49) = P(-2 ≤ z ≤ 2) = .4772(2) = .9544
d.
A sample size of 100 is necessary.
32. a.
n/N = 40/4000 = .01 < .05; therefore, the finite population correction factor is not necessary.
b.
With the finite population correction factor
σx̄ = √((N − n)/(N − 1)) (σ/√n) = √((4000 − 40)/(4000 − 1)) (8.2/√40) = 1.29
Without the finite population correction factor
σx̄ = σ/√n = 1.30
Including the finite population correction factor provides only a slightly different value for σx̄ than when the correction factor is not used.
c.
z = (x̄ − µ)/1.30 = 2/1.30 = 1.54
P(-1.54 ≤ z ≤ 1.54) = .8764
33. a.
E(p̄) = p = .40
b.
σp̄ = √(p(1 − p)/n) = √(0.40(0.60)/100) = 0.0490
c.
Normal distribution with E(p̄) = .40 and σp̄ = .0490
d.
It shows the probability distribution for the sample proportion p̄.
34. a.
E(p̄) = .40
σp̄ = √(p(1 − p)/n) = √(0.40(0.60)/200) = 0.0346
z = (p̄ − p)/σp̄ = 0.03/0.0346 = 0.87
P(-.87 ≤ z ≤ .87) = .6156
b.
z = (p̄ − p)/σp̄ = 0.05/0.0346 = 1.45
P(-1.45 ≤ z ≤ 1.45) = .8530
35.
σp̄ = √(p(1 − p)/n)
σp̄ = √((0.55)(0.45)/100) = 0.0497
σp̄ = √((0.55)(0.45)/200) = 0.0352
σp̄ = √((0.55)(0.45)/500) = 0.0222
σp̄ = √((0.55)(0.45)/1000) = 0.0157
σp̄ decreases as n increases
36. a.
σp̄ = √((0.30)(0.70)/100) = 0.0458
z = (p̄ − p)/σp̄ = 0.04/0.0458 = 0.87
P(-.87 ≤ z ≤ .87) = 2(.3078) = .6156
Area = 0.3078 x 2 = 0.6156
b.
σp̄ = √((0.30)(0.70)/200) = 0.0324
z = (p̄ − p)/σp̄ = 0.04/0.0324 = 1.23
Area = 0.3907 x 2 = 0.7814
c.
σp̄ = √((0.30)(0.70)/500) = 0.0205
z = (p̄ − p)/σp̄ = 0.04/0.0205 = 1.95
Area = 0.4744 x 2 = 0.9488
d.
σp̄ = √((0.30)(0.70)/1000) = 0.0145
z = (p̄ − p)/σp̄ = 0.04/0.0145 = 2.76
Area = 0.4971 x 2 = 0.9942
e.
With a larger sample, there is a higher probability p̄ will be within ± .04 of the population proportion p.
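The four cases in problem 36 follow one recipe: compute σp̄, convert the ±.04 margin to a z value, and take the area between −z and +z. A minimal sketch of that recipe, assuming the normal approximation used in the solution and substituting `statistics.NormalDist` for the table (so values differ slightly from the two-decimal table lookups):

```python
from math import sqrt
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()


def standard_error_of_proportion(p, n):
    """Standard error of the sample proportion for a large population."""
    return sqrt(p * (1 - p) / n)


def prob_proportion_within(margin, p, n):
    """P(sample proportion within +/- margin of p), normal approximation."""
    z = margin / standard_error_of_proportion(p, n)
    return STANDARD_NORMAL.cdf(z) - STANDARD_NORMAL.cdf(-z)


# Problem 36: p = .30, margin = .04, sample sizes from parts (a) through (d).
probabilities = {n: prob_proportion_within(0.04, 0.30, n)
                 for n in (100, 200, 500, 1000)}
for n, prob in probabilities.items():
    print(n, round(standard_error_of_proportion(0.30, n), 4), round(prob, 4))
```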
37. a.
σp̄ = √(p(1 − p)/n) = √(0.30(0.70)/100) = 0.0458
[Graph: normal sampling distribution of p̄ centered at 0.30]
The normal distribution is appropriate because np = 100(.30) = 30 and n(1 - p) = 100(.70) = 70 are both greater than 5.
b.
P(.20 ≤ p̄ ≤ .40) = ?
z = (.40 − .30)/.0458 = 2.18
P(0 ≤ z ≤ 2.18) = .4854
Probability sought is 2(.4854) = .9708
c.
P(.25 ≤ p̄ ≤ .35) = ?
z = (.35 − .30)/.0458 = 1.09
P(-1.09 ≤ z ≤ 1.09) = .7242
38. a.
E(p̄) = .76
σp̄ = √(p(1 − p)/n) = √(0.76(1 − 0.76)/400) = 0.0214
The normal distribution is appropriate because np = 400(.76) = 304 and n(1 - p) = 400(.24) = 96 are both greater than 5.
b.
z = (0.79 − 0.76)/0.0214 = 1.40
P(0 ≤ z ≤ 1.40) = .4192
z = (0.73 − 0.76)/0.0214 = −1.40
P(-1.40 ≤ z ≤ 0) = .4192
probability = 0.4192 + 0.4192 = 0.8384
c.
σp̄ = √(p(1 − p)/n) = √(0.76(1 − 0.76)/750) = 0.0156
z = (0.79 − 0.76)/0.0156 = 1.92
P(0 ≤ z ≤ 1.92) = .4726
z = (0.73 − 0.76)/0.0156 = −1.92
P(-1.92 ≤ z ≤ 0) = .4726
probability = 0.4726 + 0.4726 = 0.9452
39. a.
Normal distribution
E(p̄) = .50
σp̄ = √(p(1 − p)/n) = √((.50)(1 − .50)/589) = .0206
b.
z = (p̄ − p)/σp̄ = .04/.0206 = 1.94
.4738 x 2 = .9476
c.
z = (p̄ − p)/σp̄ = .03/.0206 = 1.46
.4279 x 2 = .8558
d.
z = (p̄ − p)/σp̄ = .02/.0206 = .97
.3340 x 2 = .6680
40. a.
Normal distribution
E(p̄) = 0.25
σp̄ = √(p(1 − p)/n) = √((0.25)(0.75)/200) = 0.0306
b.
z = 0.03/0.0306 = 0.98
P(0 ≤ z ≤ .98) = .3365
probability = 0.3365 x 2 = 0.6730
c.
z = 0.05/0.0306 = 1.63
P(0 ≤ z ≤ 1.63) = .4484
probability = 0.4484 x 2 = 0.8968
41. a.
Normal distribution with E(p̄) = p = .25 and
σp̄ = √(p(1 − p)/n) = √(.25(1 − .25)/1000) ≈ .0137
b.
z = (p̄ − p)/σp̄ = .03/.0137 = 2.19
P(.22 ≤ p̄ ≤ .28) = P(-2.19 ≤ z ≤ 2.19) = .4857(2) = .9714
c.
z = (p̄ − p)/√(.25(1 − .25)/500) = .03/.0194 = 1.55
P(.22 ≤ p̄ ≤ .28) = P(-1.55 ≤ z ≤ 1.55) = .4394(2) = .8788
42. a.
σp̄ = √(p(1 − p)/n) = √(0.15(0.85)/50) = 0.0505
[Graph: sampling distribution of p̄ centered at 0.15]
b.
P(.12 ≤ p̄ ≤ .18) = ?
z = (.18 − .15)/.0505 = .59
P(-.59 ≤ z ≤ .59) = .4448
c.
P(p̄ ≥ .10) = ?
z = (.10 − .15)/.0505 = −.99
P(z ≥ -.99) = .3389 + .5000 = .8389
43. a.
E(p̄) = 0.17
σp̄ = √(p(1 − p)/n) = √((0.17)(1 − 0.17)/800) = 0.01328
Normal distribution
b.
z = (0.19 − 0.17)/0.01328 = 1.51
P(0 ≤ z ≤ 1.51) = .4345
z = (0.15 − 0.17)/0.01328 = −1.51
P(-1.51 ≤ z ≤ 0) = .4345
probability = 0.4345 + 0.4345 = 0.8690
c.
σp̄ = √(p(1 − p)/n) = √((0.17)(1 − 0.17)/1600) = 0.0094
z = (0.19 − 0.17)/0.0094 = 2.13
P(0 ≤ z ≤ 2.13) = .4834
z = (0.15 − 0.17)/0.0094 = −2.13
P(-2.13 ≤ z ≤ 0) = .4834
probability = 0.4834 + 0.4834 = 0.9668
44.
112, 145, 73, 324, 293, 875, 318, 618
45. a.
Normal distribution
E(x̄) = 3
σx̄ = σ/√n = 1.2/√50 = .17
b.
z = (x̄ − µ)/(σ/√n) = .25/(1.2/√50) = 1.47
.4292 x 2 = .8584
46. a.
Normal distribution
E(x̄) = 31.5
σx̄ = σ/√n = 12/√50 = 1.70
b.
z = 1/1.70 = 0.59
P(0 ≤ z ≤ .59) = .2224
probability = 0.2224 x 2 = 0.4448
c.
z = 3/1.70 = 1.77
P(0 ≤ z ≤ 1.77) = .4616
probability = 0.4616 x 2 = 0.9232
47. a.
E(x̄) = $24.07
σx̄ = σ/√n = 4.80/√120 = 0.44
z = 0.50/0.44 = 1.14
P(0 ≤ z ≤ 1.14) = .3729
probability = 0.3729 x 2 = 0.7458
b.
z = 1.00/0.44 = 2.28
P(0 ≤ z ≤ 2.28) = .4887
probability = 0.4887 x 2 = 0.9774
48.
µ = 41,979, σ = 5000
a.
σx̄ = 5000/√50 ≈ 707
b.
z = (x̄ − µ)/σx̄ = 0/707 = 0
P(x̄ > 41,979) = P(z > 0) = .50
c.
z = (x̄ − µ)/σx̄ = 1000/707 ≈ 1.41
P(40,979 ≤ x̄ ≤ 42,979) = P(-1.41 ≤ z ≤ 1.41) = (.4207)(2) = .8414
d.
σx̄ = 5000/√100 = 500
z = (x̄ − µ)/σx̄ = 1000/500 = 2.00
P(40,979 ≤ x̄ ≤ 42,979) = P(-2 ≤ z ≤ 2) = (.4772)(2) = .9544
49. a.
σx̄ = √((N − n)/(N − 1)) (σ/√n)
N = 2000: σx̄ = √((2000 − 50)/(2000 − 1)) (144/√50) = 20.11
N = 5000: σx̄ = √((5000 − 50)/(5000 − 1)) (144/√50) = 20.26
N = 10,000: σx̄ = √((10,000 − 50)/(10,000 − 1)) (144/√50) = 20.31
Note: With n/N ≤ .05 for all three cases, common statistical practice would be to ignore the finite population correction factor and use σx̄ = 144/√50 = 20.36 for each case.
b.
N = 2000: z = 25/20.11 = 1.24
P(-1.24 ≤ z ≤ 1.24) = .7850
N = 5000: z = 25/20.26 = 1.23
P(-1.23 ≤ z ≤ 1.23) = .7814
N = 10,000: z = 25/20.31 = 1.23
P(-1.23 ≤ z ≤ 1.23) = .7814
All probabilities are approximately .78
50. a.
σx̄ = σ/√n = 500/√n = 20
√n = 500/20 = 25 and n = (25)² = 625
b.
For ± 25, z = 25/20 = 1.25
P(-1.25 ≤ z ≤ 1.25) = .7888
51.
Sampling distribution of x̄
σx̄ = σ/√n = σ/√30
[Graph: normal sampling distribution of x̄ centered at µ, with an area of 0.05 below 1.9 and an area of 0.05 above 2.1]
µ = (1.9 + 2.1)/2 = 2
The area below x̄ = 2.1 must be .95. An area of .95 in the standard normal table shows z = 1.645. Thus,
z = (2.1 − 2.0)/(σ/√30) = 1.645
Solve for σ.
σ = (0.1)√30/1.645 = 0.33
52.
p = .305
a.
Normal distribution with E(p̄) = p = .305 and
σp̄ = √(p(1 − p)/n) = √(.305(1 − .305)/200) ≈ .0326
b.
z = (p̄ − p)/σp̄ = .04/.0326 ≈ 1.23
P(.265 ≤ p̄ ≤ .345) = P(-1.23 ≤ z ≤ 1.23) = .3907(2) = .7814
c.
z = (p̄ − p)/σp̄ = .02/.0326 ≈ .61
P(.285 ≤ p̄ ≤ .325) = P(-.61 ≤ z ≤ .61) = .2291(2) = .4582
53.
σp̄ = √(p(1 − p)/n) = √((0.40)(0.60)/400) = 0.0245
P(p̄ ≥ .375) = ?
z = (.375 − .40)/.0245 = −1.02
P(z ≥ -1.02) = P(z ≤ 1.02) = .8461
P(p̄ ≥ .375) = .8461
54. a.
σp̄ = √(p(1 − p)/n) = √((.71)(1 − .71)/350) = .0243
z = (p̄ − p)/σp̄ = .05/.0243 = 2.06
.4803 x 2 = .9606
b.
z = (p̄ − p)/σp̄ = (.75 − .71)/.0243 = 1.65
Area = .4505
P(p̄ ≥ .75) = 1.0000 - .9505 = .0495
55. a.
Normal distribution with E(p̄) = .15 and
σp̄ = √(p(1 − p)/n) = √((0.15)(0.85)/150) = 0.0292
b.
P(.12 ≤ p̄ ≤ .18) = ?
z = (.18 − .15)/.0292 = 1.03
P(-1.03 ≤ z ≤ 1.03) = 2(.3485) = .6970
56. a.
σp̄ = √(p(1 − p)/n) = √(.25(.75)/n) = .0625
Solve for n
n = .25(.75)/(.0625)² = 48
b.
Normal distribution with E(p̄) = .25 and σp̄ = .0625
c.
P(p̄ ≥ .30) = ?
z = (.30 − .25)/.0625 = .8
P(z ≥ .8) = 1 - P(z ≤ .8) = 1 - .7881 = .2119
Thus P(p̄ ≥ .30) = .2119
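Problem 56 works backward from a known standard error to the sample size. The short sketch below repeats that algebra and the tail probability in part (c); it assumes the same normal approximation as the solution, and the variable names are illustrative.

```python
from statistics import NormalDist

p = 0.25                  # population proportion (problem 56)
standard_error = 0.0625   # given standard error of the sample proportion

# Rearranging standard_error = sqrt(p(1 - p)/n) gives n = p(1 - p)/standard_error^2.
n = p * (1 - p) / standard_error ** 2

# Part (c): probability that the sample proportion is at least .30.
z = (0.30 - p) / standard_error
p_at_least_30 = 1 - NormalDist().cdf(z)

print(n, round(z, 2), round(p_at_least_30, 4))
```

The sample size comes out to exactly 48, and the tail probability rounds to .2119, in line with parts (a) and (c).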
Chapter 8 Interval Estimation
Learning Objectives
1.
Be able to construct and interpret an interval estimate of a population mean and / or a population proportion.
2.
Understand the concept of a sampling error.
3.
Be able to use knowledge of a sampling distribution to make probability statements about the sampling error.
4.
Understand and be able to compute the margin of error.
5.
Learn about the t distribution and when it should be used in constructing an interval estimate for a population mean.
6.
Be able to use the worksheets presented in the chapter as templates for constructing interval estimates.
7.
Be able to determine the size of a simple random sample necessary to estimate a population mean and a population proportion with a specified level of precision.
8.
Know the definition of the following terms:
confidence interval
confidence coefficient
confidence level
precision
sampling error
margin of error
degrees of freedom
Solutions: 1.
2.
a.
σ x = σ / n = 5 / 40 = 0.79
b.
At 95%, zσ / n = 196 . (5 / 40 ) = 155 .
a.
32 ± 1.645 (6 / 50 ) 32 ± 1.4
b.
(30.6 to 33.4)
32 ± 1.96 (6 / 50 ) 32 ± 1.66
c.
(30.34 to 33.66)
32 ± 2.576 (6 / 50 ) 32 ± 2.19
3.
a.
(29.81 to 34.19)
80 ± 1.96 (15 / 60 ) 80 ± 3.8
b.
(76.2 to 83.8)
80 ± 1.96 (15 / 120 ) 80 ± 2.68
c.
(77.32 to 82.68)
Larger sample provides a smaller margin of error. 126 ± 1.96 ( s / n )
4.
$1.96\,\dfrac{16.07}{\sqrt{n}} = 4$
$\sqrt{n} = \dfrac{1.96(16.07)}{4} = 7.874$
$n = 62$
5.
6.
a.
σ x = σ / n = 5.00 / 49 = .7143
b.
1.96σ / n = 1.96(5.00 / 49 ) = 1.4
c.
34.80 ± 1.4 or (33.40 to 36.20)
a.
x ± 369
b.
s = 50
c.
369 ± 1.96 (50/ 250 ) 369 ± 6.20 (362.8 to 375.2) x ± z.025 (σ / n )
7.
3.37 ± 1.96 (.28 / 120 ) 3.37 ± .05 8.
a.
x ± zα / 2
(3.32 to 3.42)
σ n
12,000 ± 1.645 (2, 200 / 245) 12,000 ± 231 b.
(11,769 to 12,231)
12,000 ± 1.96 (2, 200 / 245) 12,000 ± 275
c.
(11,725 to 12,275)
12,000 ± 2.576 (2, 200 / 245) 12,000 ± 362
9.
(11,638 to 12,362)
d.
Interval width must increase since we want to make a statement about µ with greater confidence.
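The three intervals in this solution can be reproduced with a short Python sketch. It assumes the large-sample form x̄ ± z(s/√n) used above; the function name `z_interval` is illustrative only, and the z values come from the standard library rather than from the printed table, so results may differ from the manual in the last digit.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(mean, s, n, confidence):
    """Return (margin, lower, upper) for a large-sample interval estimate of a mean."""
    # Two-sided interval: put (1 - confidence) / 2 in each tail.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * s / sqrt(n)
    return margin, mean - margin, mean + margin


# Sample values from this solution: mean 12,000, s = 2,200, n = 245.
for level in (0.90, 0.95, 0.99):
    margin, lower, upper = z_interval(12_000, 2_200, 245, level)
    print(f"{level:.0%}: 12,000 ± {margin:.0f}  ({lower:,.0f} to {upper:,.0f})")
```

The margin grows with the confidence level because the z multiplier grows while s/√n stays fixed.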
a.
x=
Σxi = 13.75 n
b.
s=
Σ( xi − x )2 = 4.8969 n −1
c.
Margin of Error = 1.96
s n
4.8969 = 1.96 ≈ 1.24 60
95% Confidence Interval: 13.75 ± 1.24 or $12.51 to $14.99 10.
x ± z.025
s n
7.75 ± 1.96
3.45 180
7.75 ± .50 11. a.
(7.25 to 8.25)
Using Excel we obtained a sample mean of x = 6.34 and a sample standard deviation of 2.163. The confidence interval is shown below: 6.34 ± 1.96 (2.163 / 50 )
6.34 ± .60 The 95% confidence interval estimate is 5.74 to 6.94. 12. a.
x=
b.
s=
Σxi 114 = = 3.8 minutes n 30 Σ( xi − x ) 2 = 2.26 minutes n −1
Margin of Error = z.025
c.
x ± z.025
.95
b.
.90
c.
.01
d.
.05
e.
.95
f.
.85
14. a.
2.26 30
= .81 minutes
n (2.99 to 4.61)
1.734
b.
-1.321
c.
3.365
d.
-1.761 and +1.761
e.
-2.048 and +2.048
15. a.
n
= 1.96
s
3.8 ± .81
13. a.
s
x = Σxi / n =
80 = 10 8
Σ ( xi − x ) 2 84 = = 3.464 n −1 8 −1
b.
s=
c.
With 7 degrees of freedom, t.025 = 2.365
x ± t.025 ( s / n )
10 ± 2.365 (3.464 / 8 ) 10 ± 2.90
16. a.
(7.10 to 12.90)
17.25 ± 1.729 (3.3 / 20 ) 17.25 ± 1.28
b.
17.25 ± 2.09 (3.3 / 20 ) 17.25 ± 1.54
c.
(15.97 to 18.53)
(15.71 to 18.79)
17.25 ± 2.861 (3.3 / 20 ) 17.25 ± 2.11
(15.14 to 19.36)
At 90% , 80 ± t.05 ( s / n ) with df = 17 t.05 = 1.740
17.
80 ± 1.740 (10 / 18 ) 80 ± 4.10
(75.90 to 84.10)
At 95%, with df = 17, $t_{.025} = 2.110$: $80 \pm 2.110\,(10/\sqrt{18})$, or $80 \pm 4.97$
(75.03 to 84.97)
Σxi 18.96 = = $1.58 n 12
18. a.
x=
b.
s=
c.
t.025 = 2.201
Σ( xi − x )2 .239 = = .1474 n −1 12 − 1
x ± t.025 ( s / n ) 1.58 ± 2.201 (.1474 / 12 ) 1.58 ± .09 19.
(1.49 to 1.67)
x = Σxi / n = 6.53 minutes s=
Σ ( xi − x ) 2 = 0.54 minutes n −1
x ± t.025 ( s / n )
6.53 ± 2.093 (0.54 / 20 ) 6.53 ± .25
20. a.
(6.28 to 6.78)
22.4 ± 1.96 (5 / 61) 22.4 ± 1.25
b.
(21.15 to 23.65)
With df = 60, t.025 = 2.000 22.4 ± 2 (5 / 61) 22.4 ± 1.28
c.
Confidence intervals are essentially the same regardless of whether z or t is used.
x=
21.
(21.12 to 23.68)
s=
Σxi 864 = = $108 n 8 Σ( xi − x )2 654 = = 9.6658 n −1 8 −1
t.025 = 2.365 x ± t.025 ( s / n ) 108 ± 2.365 (9.6658 / 8) 108 ± 8.08 22. a. b.
(99.92 to 116.08)
Using Excel, x = 6.86 and s = 0.78
x ± t.025 ( s / n ) t.025 = 2.064
df = 24
6.86 ± 2.064 (0.78 / 25 ) 6.86 ± 0.32
$\dfrac{z_{.025}^2\,\sigma^2}{E^2} = \dfrac{(1.96)^2(25)^2}{5^2} = 96.04$
(6.54 to 7.18)
23.
n=
24. a.
Planning value of σ = Range/4 = 36/4 = 9
b.
n=
$\dfrac{z_{.025}^2\,\sigma^2}{E^2} = \dfrac{(1.96)^2(9)^2}{3^2} = 34.57$
Use n = 97
Use n = 35
c.
$n = \dfrac{(1.96)^2(9)^2}{2^2} = 77.79$
25. a.
$n = \dfrac{(1.96)^2(6.82)^2}{(1.5)^2} = 79.41$ Use n = 80
b.
$n = \dfrac{(1.645)^2(6.82)^2}{2^2} = 31.47$
26. a.
$n = \dfrac{z^2\sigma^2}{E^2} = \dfrac{(1.96)^2(9400)^2}{(1000)^2} = 339.44$ Use 340
b.
$n = \dfrac{(1.96)^2(9400)^2}{(500)^2} = 1357.78$ Use 1358
c.
$n = \dfrac{(1.96)^2(9400)^2}{(200)^2} = 8486.09$ Use 8487
27. a.
$n = \dfrac{(1.96)^2(2{,}000)^2}{(500)^2} = 61.47$
b.
$n = \dfrac{(1.96)^2(2{,}000)^2}{(200)^2} = 384.16$
c.
$n = \dfrac{(1.96)^2(2{,}000)^2}{(100)^2} = 1536.64$
28. a.
$n = \dfrac{z^2\sigma^2}{E^2} = \dfrac{(1.645)^2(220)^2}{(50)^2} = 52.39$ Use 53
b.
$n = \dfrac{(1.96)^2(220)^2}{(50)^2} = 74.37$ Use 75
c.
$n = \dfrac{(2.576)^2(220)^2}{(50)^2} = 128.47$ Use 129
d.
Must increase sample size to increase confidence.
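The sample-size rule n = z²σ²/E², rounded up to the next whole number, can be checked with a minimal Python sketch. The helper name `sample_size_for_mean` is an assumption made for illustration, and the z values are computed by the standard library rather than read from the table.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_mean(sigma, margin, confidence):
    """Smallest n with z * sigma / sqrt(n) <= margin, i.e. n = (z * sigma / E)^2 rounded up."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z * sigma / margin) ** 2)


# Planning value sigma = 220 and desired margin E = 50, as in this solution.
for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: n = {sample_size_for_mean(220, 50, level)}")
```

Rounding up rather than to the nearest integer guarantees that the margin of error does not exceed the target.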
Use n = 78
29. a.
$n = \dfrac{(1.96)^2(6.25)^2}{2^2} = 37.52$
b.
$n = \dfrac{(1.96)^2(6.25)^2}{1^2} = 150.06$
Use n = 32
Use n = 62
Use n = 385
Use n = 1537
Use n = 38
Use n = 151
30.
$n = \dfrac{(1.96)^2(7.8)^2}{2^2} = 58.43$ Use n = 59
31 . a.
p = 100/400 = 0.25
b.
c.
p (1 − p ) = n p ± z.025
0.25(0.75) = 0.0217 400
p (1 − p ) n
.25 ± 1.96 (.0217) .25 ± .0424 32. a.
.70 ± 1.645
(.2076 to .2924) 0.70(0.30) 800
.70 ± .0267 b.
.70 ± 1.96
(.6733 to .7267) 0.70(0.30) 800
.70 ± .0318
33.
$n = \dfrac{z_{.025}^2\,p(1-p)}{E^2} = \dfrac{(1.96)^2(0.35)(0.65)}{(0.05)^2} = 349.59$
34.
Use planning value p = .50
n= 35. a.
p = 562/814 = 0.6904
p (1 − p ) 0.6904(1 − 0.6904) = 1.645 = 0.0267 n 814
1645 .
c.
0.6904 ± 0.0267
b.
Use n = 350
$\dfrac{(1.96)^2(0.50)(0.50)}{(0.03)^2} = 1067.11$ Use n = 1068
b.
36. a.
(.6682 to .7318)
(0.6637 to 0.7171)
p = 152/346 = .4393
σp =
p (1 − p ) .4393(1 − .4393) = = .0267 n 346
p ± z.025σ p .4393 ± 1.96(.0267) .4393 ± .0523
p (1 − p ) , p = 182/650 = .28 n
p ± 196 .
37.
(.3870 to .4916)
.28 ± 1.96
(0.28)(0.72) 650
0.28 ± 0.0345 38. a.
(0.2455 to 0.3145)
p (1 − p ) (0.26)(0.74) = 1.96 = 0.0430 n 400
196 .
b.
0.26 ± 0.0430
c.
$n = \dfrac{(1.96)^2(0.26)(0.74)}{(0.03)^2} = 821.25$
39. a.
$n = \dfrac{z_{.025}^2\,p(1-p)}{E^2} = \dfrac{(1.96)^2(.33)(1-.33)}{(.03)^2} = 943.75$ Use 944
b.
$n = \dfrac{z_{.005}^2\,p(1-p)}{E^2} = \dfrac{(2.576)^2(.33)(1-.33)}{(.03)^2} = 1630.19$ Use 1631
40. a.
b.
41.
(0.2170 to 0.3030) Use n = 822
p = 255/1018 = 0.2505
(0.2505)(1 − 0.2505) = 0.0266 1018
1.96
σp =
p (1 − p ) .16(1 − .16) = = .0102 n 1285
Margin of Error = 1.96 σ p = 1.96(.0102) = .02 .16 ± 1.96 σ p .16 ± .02 42.
n=
(.14 to .18)
2 z.025 p(1 − p ) E2
43. a.
September: $n = \dfrac{(1.96)^2(.50)(1-.50)}{(.04)^2} = 600.25$ Use 601
October: $n = \dfrac{(1.96)^2(.50)(1-.50)}{(.03)^2} = 1067.11$ Use 1068
November: $n = \dfrac{(1.96)^2(.50)(1-.50)}{(.02)^2} = 2401$
Pre-Election: $n = \dfrac{(1.96)^2(.50)(1-.50)}{(.01)^2} = 9604$
n=
1.962 (0.5)(1 − 0.5) = 600.25 (0.04) 2
b.
p = 445/601 = 0.7404
c.
0.7404 ± 1.96
(0.7404)(0.2596) 601
0.7404 ± 0.0350 44. a.
b.
z.025
s n
= 1.96
(0.7054 to 0.7755)
20,500 400
= 2009
x ± z.025 ( s / n ) 50,000 ± 2009
45. a.
Use n = 601
(47,991 to 52,009)
x ± z.025 ( s / n ) 252.45 ± 1.96 (74.50 / 64 ) 252.45 ± 18.25 or $234.20 to $270.70
b.
Yes. the lower limit for the population mean at Niagara Falls is $234.20 which is greater than $215.60.
46. a.
Using Excel, x = 49.8 minutes
b.
Using Excel, s = 15.99 minutes
c.
x ± 1.96 ( s / n ) 49.8 ± 1.96 (15.99 / 200 ) 49.8 ± 2.22
47. a.
(47.58 to 52.02)
Using Excel, we find x = 16.8 and s = 4.25
With 19 degrees of freedom, t.025 = 2.093
x ± 2.093 ( s / n ) 16.8 ± 2.093 (4.25 / 20 ) 16.8 ± 1.99
b.
(14.81 to 18.79)
Using Excel, we find x = 24.1 and s = 6.21 24.1 ± 2.093 (6.21 / 20 ) 24.1 ± 2.90
c. 48. a.
(21.2 to 27.0)
16.8 / 24.1 = 0.697 or 69.7% or approximately 70%
x = Σxi / n =
132 = 13.2 10
Σ ( xi − x ) 2 547.6 = = 7.8 n −1 9
b.
s=
c.
With d f = 9, t.025 = 2.262
x ± t.025 ( s / n ) 13.2 ± 2.262 (7.8 / 10 ) 13.2 ± 5.58 d.
The ± 5.58 shows poor precision. A larger sample size is desired.
49.
$n = \dfrac{(1.96)^2(45)^2}{10^2} = 77.79$
50.
$n = \dfrac{(2.33)^2(2.6)^2}{1^2} = 36.7$
51.
$n = \dfrac{(1.96)^2(8)^2}{2^2} = 61.47$
$n = \dfrac{(2.576)^2(8)^2}{2^2} = 106.17$
52.
$n = \dfrac{(1.96)^2(675)^2}{100^2} = 175.03$
(7.62 to 18.78)
Use n = 78
Use n = 37
Use n = 62
Use n = 107
Use n = 176
53. a.
p ± 196 .
p (1 − p ) , p = 212/450 = .47 n (0.47)(0.53) 450
0.47 ± 1.96 0.47 ± 0.0461
b.
0.47 ± 2.576
(0.4239 to 0.5161)
(0.47)(0.53) 450
0.47 ± 0.06 c. 54. a.
b. c. 55. a.
(0.41 to 0.53)
The margin of error becomes larger. p = 200/369 = 0.5420
p (1 − p ) (0.5420)(0.4580) = 196 . = 0.0508 n 369 0.5420 ± 0.0508 (0.4912 to 0.5928) 196 .
p = 504 / 1400 = .36
(0.36)(0.64) = 0.0251 1400
b.
196 .
56. a.
n=
(2.33) 2 (0.70)(0.30) = 1266.74 (0.03) 2
Use n = 1267
b.
n=
(2.33) 2 (0.50)(0.50) = 1508.03 (0.03) 2
Use n = 1509
57. a.
p = 110 / 200 = 0.55
0.55 ± 1.96
(0.55)(0.45) 200
.55 ± .0689 b. 58. a.
b.
n=
(1.96) 2 (0.55)(0.45) = 380.32 (0.05) 2
(.4811 to .6189) Use n = 381
p = 340/500 = .68
σp =
p (1 − p ) .68(1 − .68) = = .0209 n 500
p ± z.025σ p .68 ± 1.96(.0209) .68 ± .0409
59. a.
n=
(.6391 to .7209)
(1.96) 2 (0.3)(0.7) = 2016.84 (0.02) 2
b.
p = 520/2017 = 0.2578
c.
p ± 196 .
p (1 − p ) n
0.2578 ± 1.96
(0.2578)(0.7422) 2017
0.2578 ± 0.0191 60. a.
b.
c.
Use n = 2017
(0.2387 to 0.2769)
p = 618 / 1993 = .3101
p ± 196 .
p (1 − p ) 1993
0.3101 ± 1.96
(0.3101)(0.6899) 1993
.3101 ± .0203
(.2898 to .3304)
n=
z 2 p(1 − p) E2
z=
(1.96) 2 (0.3101)(0.6899) = 8218.64 (0.01) 2
Use n = 8219
No; the sample appears unnecessarily large. The .02 margin of error reported in part (b) should provide adequate precision.
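The proportion calculations in this chapter follow two formulas: the interval p̄ ± z√(p̄(1 − p̄)/n) and the sample size n = z²p(1 − p)/E². The following Python sketch applies both to the figures of the last solution. The function names are hypothetical, and the table value z = 1.96 is hard-coded so that the results line up with the manual.

```python
from math import ceil, sqrt

Z_95 = 1.96  # table value of z.025, as used throughout the solutions


def proportion_interval(p_bar, n, z=Z_95):
    """Interval estimate of a population proportion: p_bar ± z * sqrt(p_bar(1 - p_bar)/n)."""
    margin = z * sqrt(p_bar * (1 - p_bar) / n)
    return p_bar - margin, p_bar + margin


def sample_size_for_proportion(p_planning, margin, z=Z_95):
    """Sample size n = z^2 p(1 - p) / E^2, rounded up."""
    return ceil(z**2 * p_planning * (1 - p_planning) / margin**2)


lower, upper = proportion_interval(0.3101, 1993)
print(f"95% interval: {lower:.4f} to {upper:.4f}")
print("n for a .01 margin:", sample_size_for_proportion(0.3101, 0.01))
```

When no planning value is available, passing 0.50 gives the largest possible sample size for a given margin, which is why the solutions use p = .50 in that case.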
Chapter 9 Hypothesis Testing Learning Objectives 1.
Learn how to formulate and test hypotheses about a population mean and/or a population proportion.
2.
Understand the types of errors possible when conducting a hypothesis test.
3.
Be able to determine the probability of making various errors in hypothesis tests.
4.
Know how to compute and interpret p-values.
5.
Be able to use the Excel worksheets presented in the chapter as templates for conducting hypothesis tests about population means and proportions.
6.
Know the definition of the following terms: null hypothesis alternative hypothesis type I error type II error critical value
level of significance one-tailed test two-tailed test p-value
Solutions: 1.
a.
H0: µ ≤ 600
Manager’s claim.
Ha: µ > 600
2.
b.
We are not able to conclude that the manager’s claim is wrong.
c.
The manager’s claim can be rejected. We can conclude that µ > 600.
a.
H0: µ ≤ 14 Ha: µ > 14
3.
4.
b.
There is no statistical evidence that the new bonus plan increases sales volume.
c.
The research hypothesis that µ > 14 is supported. We can conclude that the new bonus plan increases the mean sales volume.
a.
H0: µ = 32
Specified filling weight
Ha: µ ≠ 32
Overfilling or underfilling exists
b.
There is no evidence that the production line is not operating properly. Allow the production process to continue.
c.
Conclude µ ≠ 32 and that overfilling or underfilling exists. Shut down and adjust the production line.
a.
H0: µ ≥ 220 Ha: µ < 220
5.
Research hypothesis
Research hypothesis to see if mean cost is less than $220.
b.
We are unable to conclude that the new method reduces costs.
c.
Conclude µ < 220. Consider implementing the new method based on the conclusion that it lowers the mean cost per hour.
a. The Type I error is rejecting H0 when it is true. In this case, this error occurs if the researcher concludes that the mean newspaper-reading time for individuals in management positions is greater than the national average of 8.6 minutes when in fact it is not. b. The Type II error is accepting H0 when it is false. In this case, this error occurs if the researcher concludes that the mean newspaper-reading time for individuals in management positions is less than or equal to the national average of 8.6 minutes when in fact it is greater than 8.6 minutes.
6.
a.
H0: µ ≤ 1
The label claim or assumption.
Ha: µ > 1 b.
Claiming µ > 1 when it is not. This is the error of rejecting the product’s claim when the claim is true.
c. 7.
a.
Concluding µ ≤ 1 when it is not. In this case, we miss the fact that the product is not meeting its label specification. H0: µ ≤ 8000 Ha: µ > 8000
8.
Research hypothesis to see if the plan increases average sales.
b.
Claiming µ > 8000 when the plan does not increase sales. A mistake could be implementing the plan when it does not help.
c.
Concluding µ ≤ 8000 when the plan really would increase sales. This could lead to not implementing a plan that would increase sales.
a.
H0: µ ≥ 220 Ha: µ < 220
9.
b.
Claiming µ < 220 when the new method does not lower costs. A mistake could be implementing the method when it does not help.
c.
Concluding µ ≥ 220 when the method really would lower costs. This could lead to not implementing a method that would lower costs.
a.
z = -1.645 Reject H0 if z < -1.645
b.
$z = \dfrac{\bar{x} - \mu}{s/\sqrt{n}} = \dfrac{9.46 - 10}{2/\sqrt{50}} = -1.91$
Reject H0; conclude Ha is true. 10. a.
z = 2.05 Reject H0 if z > 2.05
b.
$z = \dfrac{\bar{x} - \mu}{s/\sqrt{n}} = \dfrac{16.5 - 15}{7/\sqrt{40}} = 1.36$
c.
Using the cumulative normal probability table, the area to the right of z = 1.36 is 1 - .9131 = .0869. Thus, the p-value is .0869.
d.
Do not reject H0
Reject H0 if z < -1.645
11. a.
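$z = \dfrac{\bar{x} - \mu}{s/\sqrt{n}} = \dfrac{22 - 25}{12/\sqrt{100}} = -2.50$ Reject H0
b.
$z = \dfrac{\bar{x} - \mu}{s/\sqrt{n}} = \dfrac{24 - 25}{12/\sqrt{100}} = -.83$ Do not reject H0
c.
$z = \dfrac{\bar{x} - \mu}{s/\sqrt{n}} = \dfrac{23.5 - 25}{12/\sqrt{100}} = -1.25$ Do not reject H0
d.
$z = \dfrac{\bar{x} - \mu}{s/\sqrt{n}} = \dfrac{22.8 - 25}{12/\sqrt{100}} = -1.83$ Reject H0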
12. a.
p-value = 1 - .9656 = .0344
Reject H0
b.
p-value = 1 - .6736 = .3264
Do not reject H0
c.
p-value = 1 - .9332 = .0668
Do not reject H0
d.
z = 3.09 is the largest table value with 1 - .999 = .001 area in tail. For z = 3.30, the p-value is less than .001 or approximately 0. Reject H0.
e.
Since z is to the left of the mean and the rejection region is in the upper tail, the p-value is the area to the right of z = -1.00. Because the standard normal distribution is symmetric, the area to the right of z = -1.00 is the same as the area to the left of z = 1.00. Thus, the p-value = .8413. Do not reject H0.
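The upper-tail p-values in parts (d) and (e) can be checked with a minimal Python sketch. It uses the standard library's normal distribution in place of the printed table; the function name `upper_tail_p_value` is illustrative only.

```python
from statistics import NormalDist


def upper_tail_p_value(z):
    """Area to the right of z under the standard normal curve."""
    return 1 - NormalDist().cdf(z)


# z values quoted in parts (d) and (e) of this solution.
for z in (3.30, -1.00):
    print(f"z = {z:+.2f}: p-value = {upper_tail_p_value(z):.4f}")
```

A negative z in an upper-tail test always yields a p-value above .50, so H0 cannot be rejected at any conventional level of significance.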
13. a.
H0: µ ≥ 1056 Ha: µ < 1056
b.
Reject H0 if z < -1.645
c.
z=
d.
Reject H0 and conclude that the mean refund of “last minute” filers is less than $1056.
e.
p-value = 1.0000 - .9664 = .0336
14. a.
x − µ0 s/ n
=
910 − 1056 1600 / 400
= −1.83
z.01 = 2.33 Reject H0 if z > 2.33
x −µ
z=
c.
Reject H0; conclude the mean television viewing time per day is greater than 6.70.
15. a.
s/ n
=
7.25 − 6.70
b.
2.5 / 200
= 3.11
A summary of the sample data is shown below:
Sample Size 100
Sample Mean $9300
Sample Standard Deviation $4500
H0: µ ≥ 10,192 Ha: µ < 10,192 Reject H0 if z < –1.645.
z= b.
x−µ s/ n
=
9300 − 10,192 4500 / 100
= −1.98
The area to the left of z = -1.98 is the same as the area to the right of z = 1.98. Using the cumulative normal probability table, the area to the right of z = 1.98 is 1 - .9761 = .0239. Thus, the p-value = .0239.
c. The manager can use the sample results to conclude that the mean sales price of used cars at the dealership is less than the mean sales price of used cars using the national average. The manager may want to explore the possible reasons for the lower prices at the dealership. Perhaps sales personnel are making excessive price concessions to close the sales. Perhaps the dealership is missing out on a portion of the late model used car market that might warrant used cars with higher prices. The manager’s judgment and insight might suggest other reasons the dealership is experiencing the lower mean sales prices. 16.
A summary of the sample data is shown below:
Sample Size 30
Sample Mean 27,500
Sample Standard Deviation 1000
H0: µ ≥ 28,000 Ha: µ < 28,000 Reject H0 if z < -1.645
z=
x − µ0 s/ n
27,500 − 28,000
=
1000 / 30
= −2.74
Reject H0; Tires are not meeting the at least 28,000 design specification. Because the standard normal distribution is symmetric, the area to the left of z = -2.74 is the same as the area to the right of z = 2.74. Using the cumulative normal probability table, the area to the right of z = 2.74 is 1 - .9969 = .0031. Thus, the p-value = .0031. 17. a.
H0: µ ≥ 13 Ha: µ < 13
b.
z.01 = 2.33 Reject H0 if z < -2.33
x −µ
z=
d.
Reject H0; conclude Canadian mean internet usage is less than 13 hours per month.
s/ n
=
10.8 − 13
c.
9.2 / 145
= −2.88
Note: p-value = .002 18. a.
H0: µ ≤ 5.72
Ha: µ > 5.72
=
5.98 − 5.72
c. d.
p-value < α; reject H0. Conclude teens in Chicago have a mean expenditure greater than 5.72.
19. a.
z=
x −µ
= 2.12 s / n 1.24 / 102 p-value = 1.0000 - .9830 = .0170
b.
H0: µ ≥ 181,900 Ha: µ < 181,900
x −µ
z=
c.
p-value = 1.0000 - .9983 = .0017
d.
p-value < α; reject H0. Conclude mean selling price in South is less than the national mean selling price.
20. a.
s/ n
=
166, 400 − 181,900
b.
33,500 / 40
= −2.93
H0: µ ≤ 37,000 Ha: µ > 37,000
x −µ
z=
c.
p-value = 1.0000 - .9292 = .0708
d.
p-value > α; do not reject H0. Cannot conclude population mean salary has increased in June 2001.
21. a. b.
s/ n
=
38,100 − 37, 000
b.
5200 / 48
= 1.47
Reject H0 if z < -1.96 or z > 1.96
z=
x−µ s/ n
=
10.8 − 10 2.5 / 36
= 2.40
Reject H0; conclude Ha is true. 22. a.
Reject H0 if z < -2.33 or z > 2.33
b.
$z = \dfrac{\bar{x} - \mu}{s/\sqrt{n}} = \dfrac{14.2 - 15}{5/\sqrt{50}} = -1.13$
c.
p-value = (2) (1 - .8708) = .2584
d.
Do not reject H0
23.
Reject H0 if z < -1.96 or z > 1.96 a.
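$z = \dfrac{\bar{x} - \mu}{s/\sqrt{n}} = \dfrac{22 - 25}{10/\sqrt{80}} = -2.68$ Reject H0
b.
$z = \dfrac{\bar{x} - \mu}{s/\sqrt{n}} = \dfrac{27 - 25}{10/\sqrt{80}} = 1.79$ Do not reject H0
c.
$z = \dfrac{\bar{x} - \mu}{s/\sqrt{n}} = \dfrac{23.5 - 25}{10/\sqrt{80}} = -1.34$ Do not reject H0
d.
$z = \dfrac{\bar{x} - \mu}{s/\sqrt{n}} = \dfrac{28 - 25}{10/\sqrt{80}} = 2.68$ Reject H0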
24. a.
p-value = 2(1 - .9641) = .0718
Do not reject H0
b.
p-value = 2(1 - .6736) = .6528
Do not reject H0
c.
p-value = 2(1 - .9798) = .0404
Reject H0
d.
approximately 0 Reject H0
e.
p-value = 2(1 - .8413) = .3174
25. a.
Do not reject H0
z.025 = 1.96 Reject H0 if z < -1.96 or z > 1.96
x −µ
z=
c.
Do not reject H0. Cannot conclude a change in the population mean has occurred.
d.
p-value = 2(1.000 - .9382) = .1236
26. a.
s/ n
=
38.5 − 39.2
b.
4.8 / 112
= −1.54
H0: µ = 8 Ha: µ ≠ 8 Reject H0 if z < -1.96 or if z > 1.96
$z = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}} = \dfrac{7.5 - 8}{3.2/\sqrt{120}} = -1.71$
Do not reject H0; cannot conclude the mean waiting time differs from eight minutes. b.
27. a.
Using the cumulative normal probability table, the area to the left of z = -1.71 is 1 - .9564 = .0436. Thus, the p-value = 2 (.0436) = .0872.
H0: µ = 16
Continue production
Ha: µ ≠ 16
Shut down
Reject H0 if z < -1.96 or if z > 1.96 b.
$z = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}} = \dfrac{16.32 - 16}{.8/\sqrt{30}} = 2.19$
Reject H0 and shut down for adjustment.
c.
$z = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}} = \dfrac{15.82 - 16}{.8/\sqrt{30}} = -1.23$
Do not reject H0; continue to run. d.
For x = 16.32, p-value = 2 (1 - .9857) = .0286 For x = 15.82, p-value = 2 (1 - .8907) = .2186
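The two-tailed test in this solution can be sketched in Python as follows. The function name `two_tailed_z_test` is hypothetical, and the level of significance α = .05 is an assumption inferred from the critical values ±1.96. Because the code does not round z before looking up the area, its p-values can differ from the table-based figures in the third decimal place.

```python
from math import sqrt
from statistics import NormalDist


def two_tailed_z_test(x_bar, mu0, s, n, alpha=0.05):
    """Return (z, p_value, reject) for H0: mu = mu0 against Ha: mu != mu0."""
    z = (x_bar - mu0) / (s / sqrt(n))
    # Double the tail area beyond |z| for a two-tailed test.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value, p_value < alpha


# The two sample means considered in this solution (mu0 = 16, s = .8, n = 30).
for x_bar in (16.32, 15.82):
    z, p_value, reject = two_tailed_z_test(x_bar, 16, 0.8, 30)
    decision = "reject H0" if reject else "do not reject H0"
    print(f"x-bar = {x_bar}: z = {z:.2f}, p-value = {p_value:.4f}, {decision}")
```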
28.
A summary of the sample data is shown below:
Sample Size 45
Sample Mean 2.39
Sample Standard Deviation .20
H0: µ = 2.2 Ha: µ ≠ 2.2 Reject H0 if z < -2.33 or if z > 2.33
z=
x − µ0 s/ n
2.39 − 2.20
=
.20 / 45
= 6.37
Reject H0 and conclude 2.2 - minute standard is not being met.
H0: µ = 15.20
29.
Ha: µ ≠ 15.20 Reject H0 if z < -1.96 or if z > 1.96 z=
x − µ0 s/ n
14.30 − 15.20
=
5 / 35
= −1.06
Do not reject H0; the sample does not provide evidence to conclude that there has been a change.
p-value = 2 (1 - .8554) = .2892 30. a.
H0: µ = 1075 Ha: µ ≠ 1075
x −µ
z=
c.
p-value = 2(1.0000 - .9236) = .1528
d.
Do not reject H0. Cannot conclude a change in mean amount of charitable giving.
31. a.
s/ n
=
1160 − 1075
b.
840 / 200
= 1.43
With 15 degrees of freedom, t.05 = 1.753 Reject H0 if t > 1.753
b.
32. a.
t=
x − µ0 s/ n
=
11 − 10 3 / 16
= 1.33
Do not reject H0
$\bar{x} = \sum x_i / n = 108/6 = 18$
b.
$s = \sqrt{\dfrac{\sum (x_i - \bar{x})^2}{n-1}} = \sqrt{\dfrac{10}{6-1}} = 1.414$
c.
Reject H0 if t < -2.571 or t > 2.571
d.
$t = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}} = \dfrac{18 - 20}{1.414/\sqrt{6}} = -3.46$
e.
Reject H0; conclude Ha is true.
33.
Reject H0 if t < -1.721
a.
$t = \dfrac{13 - 15}{8/\sqrt{22}} = -1.17$ Do not reject H0
b.
$t = \dfrac{11.5 - 15}{8/\sqrt{22}} = -2.05$ Reject H0
c.
$t = \dfrac{15 - 15}{8/\sqrt{22}} = 0$ Do not reject H0
d.
$t = \dfrac{19 - 15}{8/\sqrt{22}} = 2.35$ Do not reject H0
34.
Excel's TDIST function with 15 degrees of freedom was used to determine each p-value. a.
p-value = .01
Reject H0
b.
p-value = .10
Do not reject H0
c.
p-value = .03
Reject H0
d.
p-value = .15
Do not reject H0
e.
p-value = .003
Reject H0
35. a.
H0: µ = 3.00 Ha: µ ≠ 3.00
b.
t.025 = 2.262 Reject H0 if t < -2.262 or if t > 2.262
Σxi 28 = = 2.80 n 10
c.
x=
d.
s=
e.
t=
f.
Do not reject H0; cannot conclude the population mean earnings per share has changed.
g.
t.10 = 1.383
Σ( xi − x ) 2 .44 = = .70 10 − 1 n −1 x −µ s/ n
=
2.80 − 3.00 .70 / 10
= −.90
p-value is greater than .10 x 2 = .20 Actual p-value = .3916 36. a.
A summary of the sample data is shown below:
Sample Size 25
Sample Mean 84.5
Sample Standard Deviation 14.5
H0: µ = 90 Ha: µ ≠ 90 Degrees of freedom = 24
t.025 = 2.064
Reject H0 if t < -2.064 or if t > 2.064
t=
x − µ0 s/ n
=
84.5 − 90 14.5 / 25
= −1.90
Do not reject H0; we cannot conclude the mean household expenditure in Corning differs from the U.S. mean expenditure. b. 37. a.
Using Excel's TDIST function, the p-value corresponding to t = -1.90 is approximately .07.
H0: µ ≤ 55 Ha: µ > 55 With 7 degrees of freedom, reject H0 if t > 1.895.
$\bar{x} = \sum x_i / n = 475/8 = 59.38$
$s = \sqrt{\dfrac{\sum (x_i - \bar{x})^2}{n-1}} = \sqrt{\dfrac{123.87}{7}} = 4.21$
t=
x − µ0
=
s/ n
59.38 − 55 4.21 / 8
= 2.94
Reject H0; the mean number of hours worked per week exceeds 55. b.
38. a.
Using Excel's TDIST function, the p-value corresponding to t = 2.94 is approximately .011.
H0: µ = 4000 Ha: µ ≠ 4000
b.
t.025 = 2.160
13 degrees of freedom
Reject H0 if t < -2.160 or if t > 2.160
x −µ
t=
d.
Do not reject H0; Cannot conclude that the mean cost in New City differs from $4000.
e.
With 13 degrees of freedom
s/ n
=
4120 − 4000
c.
275 / 14
= +1.63
t.05 = 1.771 t.10 = 1.350 1.63 is between 1.350 and 1.771. Therefore the p-value is between .10 and .20. 39. a.
H0: µ ≤ 280 Ha: µ > 280
b.
286.9 - 280 = 6.9 yards
c.
t.05 = 1.860 with 8 degrees of freedom
d.
t=
e.
Reject H0; The population mean distance of the new driver is greater than the USGA approved driver..
f.
t.05 = 1.860
x −µ s/ n
=
286.9 − 280 10 / 9
= 2.07
t.025 = 2.306 p-value is between .025 and .05 Actual p-value = .0361 40.
H0: µ ≤ 2
Ha: µ > 2 With 9 degrees of freedom, reject H0 if t > 1.833
x = 2.4 s = .5164 t=
x − µ0 s/ n
=
2.4 − 2 .5164 / 10
= 2.45
Using Excel's TDIST function, the p-value corresponding to t = 2.45 is approximately .02. Reject H0 and claim µ is greater than 2 hours. For cost estimating purposes, consider using more than 2 hours of labor time. 41. a. b.
Reject H0 if z > 1.645
$\sigma_{\bar{p}} = \sqrt{\dfrac{.50(.50)}{200}} = .0354$
$z = \dfrac{\bar{p} - p}{\sigma_{\bar{p}}} = \dfrac{.57 - .50}{.0354} = 1.98$
42. a. b.
Reject H0 if z < -1.96 or z > 1.96
σp = z=
.20(.80) =.02 400
p− p
σp
=
.175−.20 = −1.25 .02
c.
p-value = 2(1 - .8944) = .2112
d.
Do not reject H0.
43.
Reject H0
Reject H0 if z < -1.645 a.
σp = z=
.75(.25) =.0250 300
p− p
σp
=
.68−.75 = −2.80 .025
p-value = 1 - .9974 = .0026 Reject H0. b.
$z = \dfrac{\bar{p} - p}{\sigma_{\bar{p}}} = \dfrac{.72 - .75}{.025} = -1.20$
p-value = 1 - .8849 = .1151 Do not reject H0. c.
z=
p− p
σp
.70−.75 = −2.00 .025
=
p-value = 1 - .9772 = .0228 Reject H0. d.
z=
p− p
σp
=
.77 −.75 =.80 .025
In this case, the p-value is the area to the left of z = .80. Thus, the p-value = .7881. Do not reject H0. 44. a.
H0: p ≤ .40 Ha: p > .40
b.
Reject H0 if z > 1.645
c.
p = 188/420 = .4476
σp =
z=
d.
45.
p(1 − p) .40(1 − .40) = = .0239 n 420
p− p
σp
=
.4476 − .40 = 1.99 .0239
Reject H0. Conclude that there has been an increase in the proportion of users receiving more than ten e-mails per day.
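The steps of this upper-tail test about a proportion can be traced in a short Python sketch. The function name `upper_tail_proportion_test` is an assumption for illustration; the critical value 1.645 is the one stated in part (b), and the standard error is computed from the hypothesized p₀ = .40, as in part (c).

```python
from math import sqrt


def upper_tail_proportion_test(successes, n, p0, z_critical=1.645):
    """Return (p_bar, sigma, z, reject) for H0: p <= p0 against Ha: p > p0."""
    p_bar = successes / n
    # The standard error uses the hypothesized proportion p0, not p_bar.
    sigma = sqrt(p0 * (1 - p0) / n)
    z = (p_bar - p0) / sigma
    return p_bar, sigma, z, z > z_critical


p_bar, sigma, z, reject = upper_tail_proportion_test(188, 420, 0.40)
print(f"p-bar = {p_bar:.4f}, sigma = {sigma:.4f}, z = {z:.2f}, reject H0: {reject}")
```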
H0: p ≥ .64 Ha: p < .64 Reject H0 if z < –1.645. p = 52/100 = .52
z=
.52 −.64 .64(.36) 100
= −2.5
Reject H0; conclude that less than 64% of the shoppers believe that the supermarket ketchup is as good as the national name brand ketchup. 46. a.
p = 285/500 = .57
b.
H0: p ≤ 0.50 Ha: p > 0.50 Reject H0 if z > 2.33
$z = \dfrac{\bar{p} - p_0}{\sqrt{\dfrac{p_0(1-p_0)}{n}}} = \dfrac{.57 - .50}{\sqrt{\dfrac{.50(1-.50)}{500}}} = 3.13$
Reject H0; a Burger King taste preference should be expressed by over 50% of the consumers. c.
47.
Yes; the statistical evidence shows Burger King fries are preferred. The give-away was a good way to get potential customers to try the new fries. A summary of the sample data is shown below:
Sample Size 200
Number of College Students 42
H0: p = .25 Ha: p ≠ .25 Reject H0 if z < -1.645 or if z > 1.645 p = 42/200 = .21
σp = z=
.25(.75) =.0306 200
p − p0
σp
=
.21−.25 = −1.31 .0306
Do not reject H0; the magazine’s claim of 25% cannot be rejected.
p-value = 2 (1 - .9049) = .1902 48. a.
b.
p = 67/105 = .6381 (about 64%)
σp =
z=
p(1 − p) .50(1 − .50) = = .0488 n 105
p− p
σp
=
.6381 − .50 = 2.83 .0488
c.
p-value = 2(1.0000 - .9977) = .0046
d.
p-value < .01, reject H0. Conclude preference is for the four ten-hour day schedule.
49. a.
H0: p = .44 Ha: p ≠ .44
b.
p = 205/500 = .41
σp =
z=
p(1 − p) .44(1 − .44) = = .0222 n 500
p− p
σp
=
.41 − .44 = −1.35 .0222
p-value = 2(1.0000 - .9115) = .1770 Do not reject H0. Cannot conclude that there has been a change in the proportion of repeat customers. c.
p = 245/500 = .49
z=
p− p
σp
=
.49 − .44 = 2.25 .0222
p-value = 2(1.0000 - .9878) = .0244 Reject H0. conclude that the proportion of repeat customers has changed. The point estimate of the percentage of repeat customers is now 49%. 50. a.
σp =
z=
p(1 − p) .75(1 − .75) = = .025 n 300
p− p
σp
=
.72 − .75 = −1.20 .025
b.
p-value = 1.0000 - .8849 = .1151
c.
Do not reject H0. Cannot conclude the manager's claim is wrong based on this sample evidence.
51. a.
H0: p ≥ .047 Ha: p < .047
b.
p = 35/1182 = .0296
c.
σp =
z=
d.
.047(1 − .047) = .0062 1182
p− p
σp
=
.0296 − .047 = −2.82 .0062
p-value = 1.0000 - .9976 = .0024
e. 52. a.
p-value < α, reject H0. The error rate for Brooks Robinson is less than the overall error rate. H0: µ ≤ 45,250 Ha: µ > 45,250
x −µ
47, 000 − 45, 250
b.
z=
c.
p-value = 1.0000 - .9966 = .0034
d. 53. a.
s/ n
=
6300 / 95
= 2.71
p-value < α; reject H0. New York City school teachers must have a higher mean annual salary. H0: µ ≥ 30 Ha: µ < 30 Reject H0 if z < –2.33
$z = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}} = \dfrac{29.5 - 30}{1.8/\sqrt{50}} = -1.96$
Do not reject H0; the sample evidence does not support the conclusion that the Buick LeSabre provides less than 30 miles per gallon. b.
p-value = 1 - .9963 = .0037
c.
x ± z.05
x ± z.05
σ n 15 . 45
= 27.6±.44
Interval is 27.16 to 28.04 54.
H0: µ ≤ 25,000 Ha: µ > 25,000 Reject H0 if z > 1.645
z=
x − µ0 s/ n
=
26, 000 − 25, 000 2,500 / 32
= 2.26
p-value = 1.0000 - .9881 = .0119 Reject H0; the claim should be rejected. The mean cost is greater than $25,000. 55.
H0: µ = 120 Ha: µ ≠ 120
With n = 10, use a t distribution with 9 degrees of freedom. Reject H0 if t < -2.262 or of t > 2.262
x=
s=
t=
Σxi = 118.9 n Σ( xi − x ) 2 = 4.93 n −1
x − µ0 s/ n
=
118.9 − 120 4.93 / 10
= −.71
Do not reject H0; the results do not permit rejection of the assumption that µ = 120. 56. a.
H0: µ = 550 Ha: µ ≠ 550 Reject H0 if z < -1.96 or if z > 1.96
$z = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}} = \dfrac{562 - 550}{40/\sqrt{36}} = 1.80$
Do not reject H0; the claim of $550 per month cannot be rejected. b. c.
p-value = 2(1 - .9641) = .0718 s x ± z.025 n x ± 1.96
40 36
= 562 ± 13
Interval is 549 to 575 Do not reject H0 since 550 is in the above interval. 57. a.
A summary of the sample data is shown below:
Sample Size 30
Sample Mean 80
Sample Standard Deviation 20
H0: µ ≤ 72 Ha: µ > 72 z=
x − 72 s/ n
=
80 − 72 20 / 30
= 2.19
p-value = 1 - .9857 = .0143 b.
Since p-value < .05, reject H0; the mean idle time exceeds 72 minutes per day.
H0: p ≥ .79
58.
Ha: p < .79 Reject H0 if z < -1.645 p = 360/500 = .72
z=
p − p0
σp
=
.72−.79 (.79)(.21) 500
= −3.84
Reject H0; conclude that the proportion is less than .79 in 1995. 59.
A summary of the sample data is shown below: Number that Work with Coworkers 304
Sample Size 400
H0: p ≤ .72 Ha: p > .72 Reject H0 if z > 1.645 p = 304/400 = .76
z=
p − p0
.76−.72
. = 187 (.76)(.24) 400 Reject H0: conclude that the proportion of workers at Trident is greater. 60. a.
b.
σp
=
The research is attempting to see if it can be concluded that less than 50% of the working population hold jobs that they planned to hold. .50(.50) =.0136 1350 .41−.50 z= = −6.62 .0136
σp =
Reject H0 if z < -2.33 Reject H0; it can be concluded that less than 50% of the working population hold jobs that they planned to hold. The majority hold jobs due to chance, lack of choice, or some other unplanned reason.
.75(.25) =.0229 356 p = 313/356 = .88
σp =
61.
z=
.88−.75 = 5.68 .0229
Reject H0; conclude p≠ 0. Data suggest that 88% of women wear shoes that are at least one size too small.
62. a.
b.
p = 355/546 = .6502
σp =
z=
p(1 − p) .67(1 − .67) = = .0201 n 546
p− p
σp
=
.6502 − .67 = −.98 .0201
c.
p-value = 2(1.0000 - .8365) = .3270
d.
p-value ≥ α, do not reject H0. The assumption of two-thirds cannot be rejected.
63. a.
b.
p = 330/400 = .825
σp =
z=
p(1 − p) .78(1 − .78) = = .0207 n 400
p− p
σp
=
.825 − .78 = 2.17 .0207
c.
p-value = 2(1.0000 - .9850) = .03
d.
p-value < α, reject H0. Arrival rate has changed from 78%. Service appears to be improving.
64. a.
b.
p = 44/125 = .352
σp =
z=
p(1 − p) .47(1 − .47) = = .0446 n 125
p− p
σp
=
.352 − .47 = −2.64 .0446
c.
p-value = 1.0000 - .9959 = .0041
d.
Reject H0; conclude that the proportion of food containing pesticide residues has been reduced.
Chapter 10 Comparisons Involving Means Learning Objectives 1.
Be able to develop interval estimates and conduct hypothesis tests about the difference between the means of two populations.
2.
Know the properties of the sampling distribution of the difference between two means x1 − x2 .
3.
Be able to use the t distribution to conduct statistical inferences about the difference between the means of two normal populations with equal variances.
4.
Understand the concept and use of a pooled variance estimate.
5.
Learn how to analyze the difference between the means of two populations when the samples are independent and when the samples are matched.
6.
Understand how the analysis of variance procedure can be used to determine if the means of more than two populations are equal.
7.
Know the assumptions necessary to use the analysis of variance procedure.
8.
Understand the use of the F distribution in performing the analysis of variance procedure.
9.
Know how to set up an ANOVA table and interpret the entries in the table.
10.
Be able to use the Excel worksheets and tools presented to conduct comparisons involving means.
Solutions: 1.
a.
x1 − x2 = 13.6 - 11.6 = 2
b.
sx1 − x2 =
s12 s22 + = n1 n2
(2.2) 2 (3) 2 + = 0.595 50 35
2 ± 1.645(.595) 2 ± .98 or 1.02 to 2.98 c.
2 ± 1.96(.595) 2 ± 1.17 or 0.83 to 3.17
2.
a.
x1 − x2 = 22.5 - 20.1 = 2.4
b.
$s^2 = \dfrac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2} = \dfrac{9(2.5)^2 + 7(2)^2}{10+8-2} = 5.27$
c.
$s_{\bar{x}_1-\bar{x}_2} = \sqrt{s^2\left(\dfrac{1}{n_1}+\dfrac{1}{n_2}\right)} = \sqrt{5.27\left(\dfrac{1}{10}+\dfrac{1}{8}\right)} = 1.09$
16 degrees of freedom, t.025 = 2.12 2.4 ± 2.12(1.09) 2.4 ± 2.31 or .09 to 4.71 3.
a.
x1 = ∑ xi / n = 54 / 6 = 9 x2 = ∑ xi / n = 42 / 6 = 7
b.
s1 =
∑ ( xi − x1 ) 2 = n1 − 1
18 = 1.90 6 −1
s2 =
∑ ( xi − x2 )2 = n2 − 1
16 = 1.79 6 −1
c.
x1 − x2 = 9 - 7 = 2
d.
s2 =
e.
With 10 degrees of freedom, t.025 = 2.228
(n1 − 1) s12 + (n2 − 1) s22 5(1.90) 2 + 5(1.79) 2 = = 3.41 n1 + n2 − 2 6+6−2
$s_{\bar{x}_1-\bar{x}_2} = \sqrt{s^2\left(\dfrac{1}{n_1}+\dfrac{1}{n_2}\right)} = \sqrt{3.41\left(\dfrac{1}{6}+\dfrac{1}{6}\right)} = 1.07$
2 ± 2.228(1.07) 2 ± 2.37 or -0.37 to 4.37 4.
a.
x1 − x2 = 1.58 - 0.98 = $0.60
b.
sx1 − x2 =
s12 s22 .122 .082 + = + = .021 n1 n2 50 42
x1 − x2 ± zα / 2 sx1 − x2 .60 ± 1.96(.021) .60 ± .04 or .56 to .64 5.
a.
22.5 - 18.6 = 3.9 miles per day
b.
x1 − x2 ± zα / 2 sx1 − x2 sx1 − x2 =
s12 s22 + = n1 n2
(8.4)2 (7.4)2 + = 1.58 50 50
22.5 - 18.6 ± 1.96(1.58) 3.9 ± 3.1 or 0.6 to 7.0 6. LA 6.72 2.374
x s
Miami 6.34 2.163
x1 − x2 ± zα / 2 sx1 − x2 sx1 − x2 =
s12 s22 + = n1 n2
(2.374) 2 (2.163) 2 + = 0.454 50 50
6.72 - 6.34 ± 1.96(.454) .38 ± .89 or -.51 to 1.27 7.
a.
x1 − x2 = 14.9 - 10.3 = 4.6 years
b.
sx1 − x2 =
s12 s22 + = n1 n2
5.22 3.82 + = .66 100 85
z.025 sx1 − x2 = 1.96(.66) = 1.3
c.
x1 − x2 ± z.025 sx1 − x2 4.6 ± 1.3 or 3.3 to 5.9
8.
a.
x1 − x2 = 45,700 - 44,500 = 1,200
b.
Pooled variance s2 =
7(700) 2 + 11(850) 2 = 632, 083 18
1 1 sx1 − x2 = 632, 083 + = 362.88 8 12 With 18 degrees of freedom t.025 = 2.101 1200 ± 2.101(362.88) 1200 ± 762 or 438 to 1962
9.
c.
Populations are normally distributed with equal variances.
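Under the equal-variance assumption just stated, the pooled interval from part (b) can be reproduced with a minimal Python sketch. The function name `pooled_interval` is illustrative, and the t value 2.101 (18 degrees of freedom) is taken from the solution rather than computed, since the standard library has no t distribution.

```python
from math import sqrt


def pooled_interval(x1, s1, n1, x2, s2, n2, t_crit):
    """Return (pooled variance, standard error, lower, upper) for mu1 - mu2."""
    # Pooled estimate of the common variance, weighted by degrees of freedom.
    pooled_var = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
    se = sqrt(pooled_var * (1 / n1 + 1 / n2))
    diff = x1 - x2
    margin = t_crit * se
    return pooled_var, se, diff - margin, diff + margin


pooled_var, se, lower, upper = pooled_interval(45_700, 700, 8, 44_500, 850, 12, 2.101)
print(f"s^2 = {pooled_var:,.0f}, standard error = {se:.2f}")
print(f"95% interval: {lower:.0f} to {upper:.0f}")
```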
a.
n1 = 10
n2 = 8
x1 = 21.2
x2 = 22.8
s1 = 2.70
s2 = 3.55
x1 − x2 = 21.2 - 22.8 = -1.6 Kitchens are less expensive by $1,600. b.
x1 − x2 ± tα / 2 sx1 − x2 Degrees of freedom = n1 + n2 - 2 = 16 t.05 = 1.746 s2 =
9(2.70)2 + 7(3.55) 2 = 9.63 10 + 8 − 2
1 1 sx1 − x2 = 9.63 + = 1.47 10 8 -1.6 ± 1.746(1.47)
-1.6 ± 2.57 or -4.17 to +.97 10. a.
x1 = 17.54
x2 = 15.36
x1 − x2 = 17.54 - 15.36 = $2.18 per hour greater for union workers. (n1 − 1) s12 + (n2 − 1) s22 14(2.24) 2 + 19(1.99) 2 = = 4.41 n1 + n2 − 2 15 + 20 − 2
b.
s2 =
c.
x1 − x2 ± tα / 2 sx1 − x2 1 1 sx1 − x2 = 4.41 + = 0.72 15 20 17.54 − 15.36 ± tα / 2 (.72) = 2.18 ± tα / 2 (.72) Note: Using Excel's TINV function, t.025 = 2.035. 2.18 ± 2.035(.72) 2.18 ± 1.47 or 0.71 to 3.65
d. 11. a.
There does appear to be a difference in the mean wage rate for these two groups.
$s_{\bar{x}_1-\bar{x}_2} = \sqrt{\dfrac{s_1^2}{n_1}+\dfrac{s_2^2}{n_2}} = \sqrt{\dfrac{(5.2)^2}{40}+\dfrac{(6)^2}{50}} = 1.18$
$z = \dfrac{(25.2 - 22.8)}{1.18} = 2.03$
Reject H0 if z > 1.645 Reject H0; conclude Ha is true and µ1 − µ2 > 0. b. 12. a.
p-value = 1.0000 - .9788 = .0212 sx1 − x2 =
z=
s12 s22 + = n1 n2
(8.4) 2 (7.6) 2 + = 131 . 80 70
( x1 − x2 ) − ( µ 1 − µ 2 ) (104 − 106) − 0 = = −153 . sx1 − x2 1.31
Reject H0 if z < -1.96 or z > 1.96 Do not reject H0 b. 13. a.
p-value = 2(1.0000 - .9370) = .1260 x1 − x2 =1.4 – 1.0 = 0.4
13- 190
s2 =
(n1 − 1) s12 + (n2 − 1) s22 7(.4) 2 + 6(.6) 2 = = 0.2523 n1 + n2 − 2 8+7−2
1 1 sx1 − x2 = 0.2523 + = 0.26 8 7 With 13 degrees of freedom. t.025 = 2.16 Reject H0 if t < -2.16 or t > 2.16 t=
( x1 − x2 ) − ( µ 1 − µ 2 ) 0.4 = = 154 . sx1 − x2 0.26
Do not reject H0 14. a.
H0: µ1 − µ2 = 0
Ha: µ1 − µ2 ≠ 0
b.
Reject H0 if z < -1.96 or if z > 1.96
c.
\(s_{\bar{x}_1-\bar{x}_2} = \sqrt{\dfrac{s_1^2}{n_1}+\dfrac{s_2^2}{n_2}} = \sqrt{\dfrac{(16.8)^2}{150}+\dfrac{(15.2)^2}{175}} = 1.79\)
\(z = \dfrac{(\bar{x}_1-\bar{x}_2)-0}{s_{\bar{x}_1-\bar{x}_2}} = \dfrac{(39.3-35.4)-0}{1.79} = 2.18\)
d.
Reject H0; conclude the population means differ.
e.
p-value = 2(1.0000 - .9854) = .0292
15.
A summary of the sample data is shown below:
Airport        Sample Size   Sample Mean   Sample Standard Deviation
Miami          50            6.34          2.163
Los Angeles    50            6.72          2.374
We will treat Los Angeles as population 1.
H0: µ1 ≤ µ2
Ha: µ1 > µ2
\(z = \dfrac{(\bar{x}_1-\bar{x}_2)-(\mu_1-\mu_2)}{\sqrt{\dfrac{s_1^2}{n_1}+\dfrac{s_2^2}{n_2}}} = \dfrac{(6.72-6.34)-0}{\sqrt{\dfrac{(2.374)^2}{50}+\dfrac{(2.163)^2}{50}}} = 0.84\)
Since 0.84 < \(z_{.05}\) = 1.64 we cannot reject H0
16.
H0: µ1 − µ2 = 0
Ha: µ1 − µ2 ≠ 0
Reject H0 if z < -1.96 or if z > 1.96
\(z = \dfrac{(\bar{x}_1-\bar{x}_2)-0}{\sqrt{\dfrac{\sigma_1^2}{n_1}+\dfrac{\sigma_2^2}{n_2}}} = \dfrac{(40-35)}{\sqrt{\dfrac{(9)^2}{36}+\dfrac{(10)^2}{49}}} = 2.41\)
Reject H0; customers at the two stores differ in terms of mean ages.
p-value = 2(1.0000 - .9920) = .0160
17.
a.
Population 1 is supplier A. Population 2 is supplier B.
b.
H0: µ1 − µ2 ≤ 0    Stay with supplier A
Ha: µ1 − µ2 > 0    Change to supplier B
Reject H0 if z > 1.645
\(z = \dfrac{(\bar{x}_1-\bar{x}_2)-(\mu_1-\mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1}+\dfrac{\sigma_2^2}{n_2}}} = \dfrac{(14-12.5)-0}{\sqrt{\dfrac{(3)^2}{50}+\dfrac{(2)^2}{30}}} = 2.68\)
p-value = 1.0000 - .9963 = .0037
Reject H0; change to supplier B.
18.
A summary of the sample data is shown below:
Employees   Sample Size   Sample Mean   Sample Standard Deviation
Male        44            $12.34        $0.92
Female      32            $11.59        $0.76
We will treat the male employees as population 1.
H0: µ1 − µ2 ≤ 0
Ha: µ1 − µ2 > 0
Reject H0 if z > 2.33
\(z = \dfrac{(\bar{x}_1-\bar{x}_2)-(\mu_1-\mu_2)}{\sqrt{\dfrac{s_1^2}{n_1}+\dfrac{s_2^2}{n_2}}} = \dfrac{(12.34-11.59)-0}{\sqrt{\dfrac{(.92)^2}{44}+\dfrac{(.76)^2}{32}}} = 3.88\)
Reject H0; wage discrimination appears to exist.
19.
a.
H0: µ1 − µ2 = 0
Ha: µ1 − µ2 ≠ 0
Degrees of freedom = n1 + n2 - 2 = 24
\(t_{.025}\) = 2.064
Reject H0 if t < -2.064 or if t > 2.064
\(\bar{x}_1\) = 30.6, \(\bar{x}_2\) = 27, s1 = 3.35, s2 = 2.64
\(s_{\bar{x}_1-\bar{x}_2} = \sqrt{\dfrac{s_1^2}{n_1}+\dfrac{s_2^2}{n_2}} = \sqrt{\dfrac{(3.35)^2}{12}+\dfrac{(2.64)^2}{14}} = 1.20\)
\(t = \dfrac{(30.6-27)-0}{1.20} = 3.0\)
Reject H0; the population means differ.
b.
Public Accountants have a higher mean. \(\bar{x}_1-\bar{x}_2\) = 30.6 - 27 = 3.6, or $3,600.
20. a.
H0: µ1 − µ2 = 0
Ha: µ1 − µ2 ≠ 0
\(\sigma_{\bar{x}_1-\bar{x}_2} = \sqrt{\dfrac{\sigma_1^2}{n_1}+\dfrac{\sigma_2^2}{n_2}} = \sqrt{\dfrac{(2.5)^2}{112}+\dfrac{(2.5)^2}{84}} = .36\)
\(z = \dfrac{(\bar{x}_1-\bar{x}_2)-0}{\sigma_{\bar{x}_1-\bar{x}_2}} = \dfrac{69.95-69.56}{.36} = 1.08\)
b.
p-value = 2(1.0000 - .8599) = .2802
c.
Do not reject H0. Cannot conclude that there is a difference between the population mean scores for the two golfers.
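The two-tailed z test above can be sketched in Python as follows. This is an illustration only; `two_sample_z_test` is a hypothetical helper, and it uses the exact normal cumulative probability from the standard library instead of a table lookup at z = 1.08.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z_test(x1, sigma1, n1, x2, sigma2, n2, alpha=0.05):
    """Two-tailed z test of H0: mu1 - mu2 = 0 with known sigmas."""
    # Standard error of the difference between the sample means.
    se = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    z = (x1 - x2) / se
    # Two-tailed p-value from the standard normal distribution.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value, p_value < alpha


# Exercise 20 figures: both golfers have sigma = 2.5.
z, p_value, reject = two_sample_z_test(69.95, 2.5, 112, 69.56, 2.5, 84)
print(round(z, 2), round(p_value, 2), reject)
```

Because z is not rounded before the probability is looked up, the p-value comes out near .28 and may differ from the tabled .2802 in the third decimal; the decision not to reject H0 is unchanged.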
21. a.
H0: µ1 − µ2 = 0
Ha: µ1 − µ2 ≠ 0
b.
df = n1 + n2 - 2 = 22 + 20 - 2 = 40
\(t_{.025}\) = 2.021
Reject H0 if t < -2.021 or if t > 2.021
c.
\(s^2 = \dfrac{(n_1-1)s_1^2+(n_2-1)s_2^2}{n_1+n_2-2} = \dfrac{(22-1)(.8)^2+(20-1)(1.1)^2}{22+20-2} = .9108\)
\(s_{\bar{x}_1-\bar{x}_2} = \sqrt{s^2\left(\dfrac{1}{n_1}+\dfrac{1}{n_2}\right)} = \sqrt{.9108\left(\dfrac{1}{22}+\dfrac{1}{20}\right)} = .2948\)
\(t = \dfrac{(\bar{x}_1-\bar{x}_2)-0}{s_{\bar{x}_1-\bar{x}_2}} = \dfrac{2.5-2.1}{.2948} = 1.36\)
d.
Do not reject H0. Cannot conclude that a difference between the population means exists.
e.
Using Excel's TDIST function, p-value = .18.
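As a rough check on the pooled-variance t test, the following Python sketch recomputes the statistic and applies the two-tailed rejection rule. It is not from the manual: `pooled_t_test` is a hypothetical name, and the critical value is supplied from the t table because the standard library does not provide the t distribution.

```python
from math import sqrt


def pooled_t_test(x1, s1, n1, x2, s2, n2, t_crit):
    """Two-tailed pooled-variance t test of H0: mu1 - mu2 = 0."""
    df = n1 + n2 - 2
    # Pooled variance, then the standard error of the mean difference.
    pooled_var = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df
    se = sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (x1 - x2) / se
    # Reject when t falls outside +/- the tabled critical value.
    return df, pooled_var, t, abs(t) > t_crit


# Exercise 21 figures; 2.021 is the tabled t.025 for 40 degrees of freedom.
df, pooled_var, t, reject = pooled_t_test(2.5, 0.8, 22, 2.1, 1.1, 20, 2.021)
print(df, round(pooled_var, 4), round(t, 2), reject)
```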
22. a.
H0: µ1 − µ2 ≤ 0
Ha: µ1 − µ2 > 0
b.
df = n1 + n2 - 2 = 16 + 10 - 2 = 24
\(t_{.05}\) = 1.711
Reject H0 if t > 1.711
c.
\(s^2 = \dfrac{(n_1-1)s_1^2+(n_2-1)s_2^2}{n_1+n_2-2} = \dfrac{(16-1)(.64)^2+(10-1)(.75)^2}{16+10-2} = .4669\)
\(s_{\bar{x}_1-\bar{x}_2} = \sqrt{s^2\left(\dfrac{1}{n_1}+\dfrac{1}{n_2}\right)} = \sqrt{.4669\left(\dfrac{1}{16}+\dfrac{1}{10}\right)} = .2755\)
\(t = \dfrac{(\bar{x}_1-\bar{x}_2)-0}{s_{\bar{x}_1-\bar{x}_2}} = \dfrac{6.82-6.25}{.2755} = 2.07\)
d.
Reject H0. Conclude that the consultant with more experience has the higher population mean rating.
e.
Using Excel's TDIST function, p-value = .025.
23. a.
1, 2, 0, 0, 2
b.
\(\bar{d} = \sum d_i / n\) = 5 / 5 = 1
c.
\(s_d = \sqrt{\dfrac{\sum (d_i-\bar{d})^2}{n-1}} = \sqrt{\dfrac{4}{5-1}} = 1\)
d.
With 4 degrees of freedom, \(t_{.05}\) = 2.132
Reject H0 if t > 2.132
\(t = \dfrac{\bar{d}-\mu_d}{s_d/\sqrt{n}} = \dfrac{1-0}{1/\sqrt{5}} = 2.24\)
Using Excel's TDIST function, p-value = .04.
Reject H0; conclude µd > 0.
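The matched-sample calculation can be reproduced with a few lines of Python. This is a sketch for checking the arithmetic, with a hypothetical helper name `paired_t_statistic`; it computes only the test statistic and leaves the critical value to the t table.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(differences, mu_d=0.0):
    """Matched-sample t statistic for H0: mu_d = mu_d (default 0)."""
    n = len(differences)
    d_bar = mean(differences)
    # stdev() uses the n - 1 divisor, matching the formula for s_d.
    s_d = stdev(differences)
    t = (d_bar - mu_d) / (s_d / sqrt(n))
    return d_bar, s_d, t


# Differences from exercise 23.
d_bar, s_d, t = paired_t_statistic([1, 2, 0, 0, 2])
print(d_bar, s_d, round(t, 2))
```

Compare the printed t with the tabled value (2.132 for 4 degrees of freedom at α = .05) to reach the same decision as the written solution.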
24. a.
3, -1, 3, 5, 3, 0, 1
b.
\(\bar{d} = \sum d_i / n\) = 14 / 7 = 2
c.
\(s_d = \sqrt{\dfrac{\sum (d_i-\bar{d})^2}{n-1}} = \sqrt{\dfrac{26}{7-1}} = 2.082\)
d.
\(\bar{d}\) = 2
e.
With 6 degrees of freedom, \(t_{.025}\) = 2.447
\(2 \pm 2.447\left(2.082/\sqrt{7}\right)\)
2 ± 1.93 or .07 to 3.93
25.
Difference = rating after - rating before
H0: µd ≤ 0
Ha: µd > 0
With 7 degrees of freedom, reject H0 if t > 1.895
\(\bar{d}\) = .625 and \(s_d\) = 1.3025
\(t = \dfrac{\bar{d}-\mu_d}{s_d/\sqrt{n}} = \dfrac{.625-0}{1.3025/\sqrt{8}} = 1.36\)
p-value is greater than .10
Do not reject H0; we cannot conclude that seeing the commercial improves the mean potential to purchase.
26.
Differences: .20, .29, .39, .02, .24, .20, .20, .52, .29, .20
\(\bar{d} = \sum d_i / n\) = 2.55 / 10 = .255
\(s_d = \sqrt{\dfrac{\sum (d_i-\bar{d})^2}{n-1}} = .1327\)
With df = 9, \(t_{.025}\) = 2.262
\(\bar{d} \pm t_{.025}\dfrac{s_d}{\sqrt{n}}\)
\(.255 \pm 2.262\left(\dfrac{.1327}{\sqrt{10}}\right)\)
.255 ± .095 or .16 to .35
27.
Differences: 8, 9.5, 6, 10.5, 15, 9, 11, 7.5, 12, 5
\(\bar{d}\) = 93.5/10 = 9.35 and \(s_d\) = 2.954
\(t_{.025}\) = 2.262
\(9.35 \pm 2.262\left(2.954/\sqrt{10}\right)\) = 9.35 ± 2.11
Interval estimate is 7.24 to 11.46
28.
H0: µd = 0
Ha: µd ≠ 0
df = 7
Reject H0 if t < -2.365 or if t > 2.365
Differences: -.01, .03, -.06, .16, .21, .17, -.09, .11
\(\bar{d} = \sum d_i / n\) = .52 / 8 = .065
\(s_d = \sqrt{\dfrac{\sum (d_i-\bar{d})^2}{n-1}} = .1131\)
\(t = \dfrac{\bar{d}-0}{s_d/\sqrt{n}} = \dfrac{.065}{.1131/\sqrt{8}} = 1.63\)
Do not reject H0. Cannot conclude that the population means differ.
29.
Using matched samples, the differences are as follows: 4, -2, 8, 8, 5, 6, -4, -2, -3, 0, 11, -5, 5, 9, 5
H0: µd ≤ 0
Ha: µd > 0
\(\bar{d}\) = 3 and \(s_d\) = 5.21
\(t = \dfrac{\bar{d}-\mu_d}{s_d/\sqrt{n}} = \dfrac{3-0}{5.21/\sqrt{15}} = 2.23\)
Using Excel's TDIST function, p-value = .02.
With 14 degrees of freedom, reject H0 if t > 1.761 or if p-value < α = .05.
Reject H0. Conclude that the population of readers spends more time, on average, watching television than reading.
30. a.
Difference = Price Deluxe - Price Standard
H0: µd = 10
Ha: µd ≠ 10
With 6 degrees of freedom, reject H0 if t < -2.447 or if t > 2.447; alternatively, reject H0 if p-value < α = .05.
\(\bar{d}\) = 8.86 and \(s_d\) = 2.61
\(t = \dfrac{\bar{d}-\mu_d}{s_d/\sqrt{n}} = \dfrac{8.86-10}{2.61/\sqrt{7}} = -1.16\)
Using Excel's TDIST function, p-value = .29.
Do not reject H0; we cannot reject the hypothesis that a $10 price differential exists.
b.
\(\bar{d} \pm t_{\alpha/2}\dfrac{s_d}{\sqrt{n}}\)
\(8.86 \pm 2.447\left(\dfrac{2.61}{\sqrt{7}}\right)\)
8.86 ± 2.41 or 6.45 to 11.27
31. a.
H0: µ1 - µ2 = 0
Ha: µ1 - µ2 ≠ 0
With df = 11, \(t_{.025}\) = 2.201
Reject H0 if t < -2.201 or if t > 2.201; alternatively, reject H0 if p-value < α = .05.
Calculate the difference, di, for each stock.
\(\bar{d} = \sum d_i / n\) = 85 / 12 = 7.08
\(s_d = \sqrt{\dfrac{\sum (d_i-\bar{d})^2}{n-1}} = 3.34\)
\(t = \dfrac{\bar{d}-\mu_d}{s_d/\sqrt{n}} = \dfrac{7.08-0}{3.34/\sqrt{12}} = 7.34\)
p-value ≈ 0
Reject H0; a decrease in P/E ratios is being projected for 1998.
b.
\(\bar{d} \pm t_{.025}\dfrac{s_d}{\sqrt{n}}\)
\(7.08 \pm 2.201\left(\dfrac{3.34}{\sqrt{12}}\right)\)
7.08 ± 2.12 or 4.96 to 9.21
32. a.
\(\bar{\bar{x}}\) = (30 + 45 + 36)/3 = 37
\(SSTR = \sum_{j=1}^{k} n_j(\bar{x}_j-\bar{\bar{x}})^2 = 5(30-37)^2+5(45-37)^2+5(36-37)^2 = 570\)
MSTR = SSTR /(k - 1) = 570/2 = 285
b.
\(SSE = \sum_{j=1}^{k}(n_j-1)s_j^2\) = 4(6) + 4(4) + 4(6.5) = 66
MSE = SSE /(nT - k) = 66/(15 - 3) = 5.5
c.
F = MSTR /MSE = 285/5.5 = 51.82
F.05 = 3.89 (2 degrees of freedom numerator and 12 denominator)
Since F = 51.82 > F.05 = 3.89, we reject the null hypothesis that the means of the three populations are equal.
d.
Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square   F
Treatments            570              2                    285           51.82
Error                 66               12                   5.5
Total                 636              14
33. a.
\(\bar{\bar{x}}\) = (153 + 169 + 158)/3 = 160
\(SSTR = \sum_{j=1}^{k} n_j(\bar{x}_j-\bar{\bar{x}})^2 = 4(153-160)^2+4(169-160)^2+4(158-160)^2 = 536\)
MSTR = SSTR /(k - 1) = 536/2 = 268
b.
\(SSE = \sum_{j=1}^{k}(n_j-1)s_j^2\) = 3(96.67) + 3(97.33) + 3(82.00) = 828.00
MSE = SSE /(nT - k) = 828.00 /(12 - 3) = 92.00
c.
F = MSTR /MSE = 268/92 = 2.91
F.05 = 4.26 (2 degrees of freedom numerator and 9 denominator)
Since F = 2.91 < F.05 = 4.26, we cannot reject the null hypothesis.
d.
Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square   F
Treatments            536              2                    268           2.91
Error                 828              9                    92
Total                 1364             11
34. a.
\(\bar{\bar{x}} = \dfrac{4(100)+6(85)+5(79)}{15} = 87\)
\(SSTR = \sum_{j=1}^{k} n_j(\bar{x}_j-\bar{\bar{x}})^2 = 4(100-87)^2+6(85-87)^2+5(79-87)^2 = 1{,}020\)
MSTR = SSTR /(k - 1) = 1,020/2 = 510
b.
\(SSE = \sum_{j=1}^{k}(n_j-1)s_j^2\) = 3(35.33) + 5(35.60) + 4(43.50) = 458
MSE = SSE /(nT - k) = 458/(15 - 3) = 38.17
c.
F = MSTR /MSE = 510/38.17 = 13.36
F.05 = 3.89 (2 degrees of freedom numerator and 12 denominator)
Since F = 13.36 > F.05 = 3.89 we reject the null hypothesis that the means of the three populations are equal.
d.
Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square   F
Treatments            1020             2                    510           13.36
Error                 458              12                   38.17
Total                 1478             14
35. a.
Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square   F
Treatments            1200             3                    400           80
Error                 300              60                   5
Total                 1500             63
b.
F.05 = 2.76 (3 degrees of freedom numerator and 60 denominator)
Since F = 80 > F.05 = 2.76 we reject the null hypothesis that the means of the 4 populations are equal.
36. a.
Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square   F
Treatments            120              2                    60            20
Error                 216              72                   3
Total                 336              74
b.
F.05 = 3.12 (2 numerator degrees of freedom and 72 denominator)
Since F = 20 > F.05 = 3.12, we reject the null hypothesis that the 3 population means are equal.
37.
                  Manufacturer 1   Manufacturer 2   Manufacturer 3
Sample Mean       23               28               21
Sample Variance   6.67             4.67             3.33
\(\bar{\bar{x}}\) = (23 + 28 + 21)/3 = 24
\(SSTR = \sum_{j=1}^{k} n_j(\bar{x}_j-\bar{\bar{x}})^2 = 4(23-24)^2+4(28-24)^2+4(21-24)^2 = 104\)
MSTR = SSTR /(k - 1) = 104/2 = 52
\(SSE = \sum_{j=1}^{k}(n_j-1)s_j^2\) = 3(6.67) + 3(4.67) + 3(3.33) = 44.01
MSE = SSE /(nT - k) = 44.01/(12 - 3) = 4.89
F = MSTR /MSE = 52/4.89 = 10.63
F.05 = 4.26 (2 degrees of freedom numerator and 9 denominator)
Since F = 10.63 > F.05 = 4.26 we reject the null hypothesis that the mean time needed to mix a batch of material is the same for each manufacturer.
38.
                  Superior   Peer   Subordinate
Sample Mean       5.75       5.5    5.25
Sample Variance   1.64       2.00   1.93
\(\bar{\bar{x}}\) = (5.75 + 5.5 + 5.25)/3 = 5.5
\(SSTR = \sum_{j=1}^{k} n_j(\bar{x}_j-\bar{\bar{x}})^2 = 8(5.75-5.5)^2+8(5.5-5.5)^2+8(5.25-5.5)^2 = 1\)
MSTR = SSTR /(k - 1) = 1/2 = .5
\(SSE = \sum_{j=1}^{k}(n_j-1)s_j^2\) = 7(1.64) + 7(2.00) + 7(1.93) = 38.99
MSE = SSE /(nT - k) = 38.99/21 = 1.86
F = MSTR /MSE = 0.5/1.86 = 0.27
F.05 = 3.47 (2 degrees of freedom numerator and 21 denominator)
Since F = 0.27 < F.05 = 3.47, we cannot reject the null hypothesis that the means of the three populations are equal; thus, the source of information does not significantly affect the dissemination of the information.
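The analysis-of-variance arithmetic used in these exercises follows one pattern, which the Python sketch below illustrates. It is an illustration and not part of the manual; `anova_from_summary` is a hypothetical helper that works from sample means, sample variances and sample sizes, and it weights the overall mean by sample size so that unequal sizes are handled too.

```python
def anova_from_summary(means, variances, sizes):
    """One-way ANOVA quantities from per-sample summary statistics."""
    k = len(means)
    n_total = sum(sizes)
    # Overall mean, weighted by sample size.
    grand_mean = sum(n * m for n, m in zip(sizes, means)) / n_total
    # Sum of squares due to treatments and sum of squares due to error.
    sstr = sum(n * (m - grand_mean) ** 2 for n, m in zip(sizes, means))
    sse = sum((n - 1) * v for n, v in zip(sizes, variances))
    mstr = sstr / (k - 1)
    mse = sse / (n_total - k)
    return sstr, sse, mstr, mse, mstr / mse


# Exercise 38 figures: three sources of information, 8 observations each.
sstr, sse, mstr, mse, f_stat = anova_from_summary(
    [5.75, 5.5, 5.25], [1.64, 2.00, 1.93], [8, 8, 8]
)
print(round(sstr, 2), round(sse, 2), round(mstr, 2), round(mse, 2), round(f_stat, 2))
```

The F statistic is then compared with the tabled value (3.47 for 2 and 21 degrees of freedom), exactly as in the written solution.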
39.
                  Marketing Managers   Marketing Research   Advertising
Sample Mean       5                    4.5                  6
Sample Variance   .8                   .3                   .4
\(\bar{\bar{x}}\) = (5 + 4.5 + 6)/3 = 5.17
\(SSTR = \sum_{j=1}^{k} n_j(\bar{x}_j-\bar{\bar{x}})^2 = 6(5-5.17)^2+6(4.5-5.17)^2+6(6-5.17)^2 = 7.00\)
MSTR = SSTR /(k - 1) = 7.00/2 = 3.5
\(SSE = \sum_{j=1}^{k}(n_j-1)s_j^2\) = 5(.8) + 5(.3) + 5(.4) = 7.50
MSE = SSE /(nT - k) = 7.50/(18 - 3) = .5
F = MSTR /MSE = 3.5/.50 = 7.00
F.05 = 3.68 (2 degrees of freedom numerator and 15 denominator)
Since F = 7.00 > F.05 = 3.68, we reject the null hypothesis that the mean perception score is the same for the three groups of specialists.
40.
                  Real Estate Agent   Architect   Stockbroker
Sample Mean       67.73               61.13       65.80
Sample Variance   117.72              180.10      137.12
\(\bar{\bar{x}}\) = (67.73 + 61.13 + 65.80)/3 = 64.89
\(SSTR = \sum_{j=1}^{k} n_j(\bar{x}_j-\bar{\bar{x}})^2 = 15(67.73-64.89)^2+15(61.13-64.89)^2+15(65.80-64.89)^2 = 345.47\)
MSTR = SSTR /(k - 1) = 345.47/2 = 172.74
\(SSE = \sum_{j=1}^{k}(n_j-1)s_j^2\) = 14(117.72) + 14(180.10) + 14(137.12) = 6089.16
MSE = SSE /(nT - k) = 6089.16/(45 - 3) = 144.98
F = MSTR /MSE = 172.74/144.98 = 1.19
F.05 = 3.22 (2 degrees of freedom numerator and 42 denominator)
Since F = 1.19 < F.05 = 3.22, we cannot reject the null hypothesis that the job stress ratings are the same for the three occupations.
41.
The Excel output is shown below:
SUMMARY
Groups               Count   Sum   Average   Variance
Banking              12      183   15.25     29.8409
Financial Services   7       128   18.2857   16.5714
Insurance            10      163   16.3      15.1222
ANOVA
Source of Variation   SS         df   MS        F        P-value   F crit
Between Groups        40.7732    2    20.3866   0.9402   0.4034    3.3690
Within Groups         563.7786   26   21.6838
Total                 604.5517   28
Since the p-value = 0.4034 > α = 0.05, we cannot reject the null hypothesis that the mean price/earnings ratio is the same for these three groups of firms.
42.
\(\bar{x}_1-\bar{x}_2 \pm z_{.05}\sqrt{\dfrac{s_1^2}{n_1}+\dfrac{s_2^2}{n_2}}\)
\(45{,}000-35{,}000 \pm 1.645\sqrt{\dfrac{(4000)^2}{60}+\dfrac{(3500)^2}{80}}\)
10,000 ± 1066 or 8,934 to 11,066
43.
H0: µ1 - µ2 = 0
Ha: µ1 - µ2 ≠ 0
Reject H0 if z < -1.96 or if z > 1.96
\(z = \dfrac{(\bar{x}_1-\bar{x}_2)-(\mu_1-\mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1}+\dfrac{\sigma_2^2}{n_2}}} = \dfrac{(4.27-3.38)-0}{\sqrt{\dfrac{(1.85)^2}{120}+\dfrac{(1.46)^2}{100}}} = 3.99\)
Reject H0; a difference exists with system B having the lower mean checkout time.
44. a.
H0: µ1 - µ2 ≤ 0
Ha: µ1 - µ2 > 0
Reject H0 if z > 1.645
b.
n1 = 30, n2 = 30
\(\bar{x}_1\) = 16.23, \(\bar{x}_2\) = 15.70
s1 = 3.52, s2 = 3.31
\(s_{\bar{x}_1-\bar{x}_2} = \sqrt{\dfrac{(3.52)^2}{30}+\dfrac{(3.31)^2}{30}} = 0.88\)
\(z = \dfrac{(\bar{x}_1-\bar{x}_2)-0}{s_{\bar{x}_1-\bar{x}_2}} = \dfrac{(16.23-15.70)}{0.88} = 0.59\)
Do not reject H0; cannot conclude that the mutual funds with a load have a greater mean rate of return.
Load funds 16.23%; no-load funds 15.7%
c.
At z = 0.59, the area between the mean and z is 0.2224, so the cumulative area is .7224.
p-value = 1.0000 - .7224 = 0.2776
45.
Difference = before - after
H0: µd ≤ 0
Ha: µd > 0
With 5 degrees of freedom, reject H0 if t > 2.015
\(\bar{d}\) = 6.167 and \(s_d\) = 6.585
\(t = \dfrac{\bar{d}-\mu_d}{s_d/\sqrt{n}} = \dfrac{6.167-0}{6.585/\sqrt{6}} = 2.29\)
Using Excel's TDIST function, p-value = .035.
Reject H0; conclude that the program provides weight loss.
46. a.
Population 1 - 1996
Population 2 - 1997
H0: µ1 - µ2 ≤ 0
Ha: µ1 - µ2 > 0
b.
\(\bar{d} = \sum d_i / n\) = 1.74 / 14 = 0.12
\(s_d = \sqrt{\dfrac{\sum (d_i-\bar{d})^2}{n-1}} = 0.33\)
Degrees of freedom = 13; \(t_{.05}\) = 1.771
Reject H0 if t > 1.771 or if p-value < α = .05
\(t = \dfrac{\bar{d}-0}{s_d/\sqrt{n}} = \dfrac{0.12}{0.33/\sqrt{14}} = 1.42\)
Using Excel's TDIST function, p-value = .09.
Do not reject H0. The sample of 14 companies shows earnings are down in the fourth quarter by a mean of 0.12 per share. However, the data do not support the conclusion that mean earnings for all companies are down in 1997.
47. a.
                  Area 1   Area 2
Sample Mean       96       94
Sample Variance   50       40
pooled estimate = \(\dfrac{s_1^2+s_2^2}{2} = \dfrac{50+40}{2} = 45\)
estimate of standard deviation of \(\bar{x}_1-\bar{x}_2 = \sqrt{45\left(\dfrac{1}{4}+\dfrac{1}{4}\right)} = 4.74\)
\(t = \dfrac{\bar{x}_1-\bar{x}_2}{4.74} = \dfrac{96-94}{4.74} = .42\)
\(t_{.025}\) = 2.447 (6 degrees of freedom)
Since t = .42 < \(t_{.025}\) = 2.447, the means are not significantly different.
b.
\(\bar{\bar{x}}\) = (96 + 94)/2 = 95
\(SSTR = \sum_{j=1}^{k} n_j(\bar{x}_j-\bar{\bar{x}})^2 = 4(96-95)^2+4(94-95)^2 = 8\)
MSTR = SSTR /(k - 1) = 8 /1 = 8
\(SSE = \sum_{j=1}^{k}(n_j-1)s_j^2\) = 3(50) + 3(40) = 270
MSE = SSE /(nT - k) = 270 /(8 - 2) = 45
F = MSTR /MSE = 8 /45 = .18
F.05 = 5.99 (1 degree of freedom numerator and 6 denominator)
Since F = .18 < F.05 = 5.99 the means are not significantly different.
c.
                  Area 1   Area 2   Area 3
Sample Mean       96       94       83
Sample Variance   50       40       42
\(\bar{\bar{x}}\) = (96 + 94 + 83)/3 = 91
\(SSTR = \sum_{j=1}^{k} n_j(\bar{x}_j-\bar{\bar{x}})^2 = 4(96-91)^2+4(94-91)^2+4(83-91)^2 = 392\)
MSTR = SSTR /(k - 1) = 392 /2 = 196
\(SSE = \sum_{j=1}^{k}(n_j-1)s_j^2\) = 3(50) + 3(40) + 3(42) = 396
MSE = SSE /(nT - k) = 396 /(12 - 3) = 44
F = MSTR /MSE = 196 /44 = 4.45
F.05 = 4.26 (2 degrees of freedom numerator and 9 denominator)
Since F = 4.45 > F.05 = 4.26 we reject the null hypothesis that the mean asking prices for all three areas are equal.
48.
The Excel output for these data is shown below:
SUMMARY
Groups             Count   Sum   Average   Variance
Sport Utility      10      586   58.6      20.9333
Small Pickup       10      488   48.8      17.7333
Full-Size Pickup   10      601   60.1      22.1
ANOVA
Source of Variation   SS         df   MS         F         P-value    F crit
Between Groups        753.2667   2    376.6333   18.5941   8.37E-06   3.3541
Within Groups         546.9      27   20.2556
Total                 1300.167   29
Because the p-value = .000 < α = .05, we can reject the null hypothesis that the mean resale value is the same. It appears that the mean resale value for small pickup trucks is much smaller than the mean resale value for sport utility vehicles or full-size pickup trucks.
49.
                  Food    Personal Care   Retail
Sample Mean       52.25   62.25           55.75
Sample Variance   22.25   15.58           4.92
\(\bar{\bar{x}}\) = (52.25 + 62.25 + 55.75)/3 = 56.75
\(SSTR = \sum_{j=1}^{k} n_j(\bar{x}_j-\bar{\bar{x}})^2 = 4(52.25-56.75)^2+4(62.25-56.75)^2+4(55.75-56.75)^2 = 206\)
MSTR = SSTR /(k - 1) = 206 /2 = 103
\(SSE = \sum_{j=1}^{k}(n_j-1)s_j^2\) = 3(22.25) + 3(15.58) + 3(4.92) = 128.25
MSE = SSE /(nT - k) = 128.25 /(12 - 3) = 14.25
F = MSTR /MSE = 103 /14.25 = 7.23
F.05 = 4.26 (2 degrees of freedom numerator and 9 denominator)
Since F = 7.23 exceeds the critical F value, we reject the null hypothesis that the mean age of executives is the same in the three categories of companies.
50.
                  Lawyer   Physical Therapist   Cabinet Maker   Systems Analyst
Sample Mean       50.0     63.7                 69.1            61.2
Sample Variance   124.22   164.68               105.88          136.62
\(\bar{\bar{x}} = \dfrac{50.0+63.7+69.1+61.2}{4} = 61\)
\(SSTR = \sum_{j=1}^{k} n_j(\bar{x}_j-\bar{\bar{x}})^2 = 10(50.0-61)^2+10(63.7-61)^2+10(69.1-61)^2+10(61.2-61)^2 = 1939.4\)
MSTR = SSTR /(k - 1) = 1939.4 /3 = 646.47
\(SSE = \sum_{j=1}^{k}(n_j-1)s_j^2\) = 9(124.22) + 9(164.68) + 9(105.88) + 9(136.62) = 4,782.60
MSE = SSE /(nT - k) = 4782.6 /(40 - 4) = 132.85
F = MSTR /MSE = 646.47 /132.85 = 4.87
F.05 = 2.87 (3 degrees of freedom numerator and 36 denominator)
Since F = 4.87 > F.05 = 2.87, we reject the null hypothesis that the mean job satisfaction rating is the same for the four professions.
51.
The Excel output for these data is shown below:
Anova: Single Factor
SUMMARY
Groups      Count   Sum    Average   Variance
West        10      1080   108       565.5556
South       10      917    91.7      384.9
Northeast   10      1211   121.1     826.3222
ANOVA
Source of Variation   SS         df   MS          F        P-value   F crit
Between Groups        4338.867   2    2169.4333   3.6630   0.0391    3.3541
Within Groups         15991      27   592.2593
Total                 20329.87   29
Because the p-value = .0391 < α = .05, we can reject the null hypothesis that the mean rate for the three regions is the same.
52.
The Excel output is shown below:
SUMMARY
Groups          Count   Sum   Average   Variance
West            10      600   60        52.0933
South           10      454   45.4      57.9067
North Central   10      473   47.3      45.9444
Northeast       10      521   52.1      37.8511
ANOVA
Source of Variation   SS        df   MS         F        P-value   F crit
Between Groups        1271      3    423.6667   8.7446   0.0002    2.8663
Within Groups         1744.16   36   48.4489
Total                 3015.16   39
Since the p-value = 0.0002 < α = 0.05, we can reject the null hypothesis that the mean base salary for art directors is the same for each of the four regions.
53.
The Excel output for these data is shown below:
SUMMARY
Groups             Count   Sum     Average   Variance
Wide Receiver      15      111.2   7.4133    0.7841
Guard              13      79.4    6.1077    0.5474
Offensive Tackle   12      84.7    7.0583    0.6408
ANOVA
Source of Variation   SS        df   MS       F        P-value   F crit
Between Groups        12.4020   2    6.2010   9.3283   0.0005    3.2519
Within Groups         24.5957   37   0.6647
Total                 36.9978   39
Because the p-value = .0005 < α = .05, we can reject the null hypothesis that the mean rating for the three positions is the same. It appears that wide receivers and tackles have a higher mean rating than guards.
54.
The output obtained using Excel's Anova: Single Factor tool is shown.
SUMMARY
Groups   Count   Sum      Average   Variance
UK       22      265.14   12.0518   1.9409
US       22      329.05   14.9568   3.4100
Europe   22      442.3    20.1045   6.4308
ANOVA
Source of Variation   SS         df   MS         F         P-value   F crit
Between Groups        731.7533   2    365.8766   93.1637   0.0000    3.1428
Within Groups         247.4164   63   3.9272
Total                 979.1696   65
Since the p-value = 0.0000 is less than α = .05, we can reject the null hypothesis that the mean download time is the same for Websites located in the United Kingdom, United States and Europe.
Chapter 11 Comparisons Involving Proportions and A Test of Independence Learning Objectives
1.
Know the properties of the sampling distribution of the difference between two proportions ( p1 − p2 ) .
2.
Be able to develop interval estimates and conduct hypothesis tests about the difference between the proportions of two populations.
3.
Be able to conduct a goodness of fit test when the population is hypothesized to have a multinomial probability distribution.
4.
For a test of independence, be able to set up a contingency table, determine the observed and expected frequencies, and determine if the two variables are independent.
5.
Understand the role of the chi-square distribution in conducting tests of goodness of fit and independence.
6.
Be able to use the Excel worksheets presented as templates for interval estimates and hypothesis tests involving proportions.
Solutions:
1.
a.
\(\bar{p}_1-\bar{p}_2\) = .48 - .36 = .12
b.
\(s_{\bar{p}_1-\bar{p}_2} = \sqrt{\dfrac{\bar{p}_1(1-\bar{p}_1)}{n_1}+\dfrac{\bar{p}_2(1-\bar{p}_2)}{n_2}} = \sqrt{\dfrac{0.48(0.52)}{400}+\dfrac{0.36(0.64)}{300}} = 0.0373\)
0.12 ± 1.645(0.0373)
0.12 ± 0.0614 or 0.0586 to 0.1814
c.
0.12 ± 1.96(0.0373)
0.12 ± 0.0731 or 0.0469 to 0.1931
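The two intervals above differ only in the z value, which the following Python sketch makes explicit. It is an illustration, not the manual's method; `diff_proportions_interval` is a hypothetical helper that takes the confidence level as a parameter and derives z from the standard library's `NormalDist`.

```python
from math import sqrt
from statistics import NormalDist


def diff_proportions_interval(p1, n1, p2, n2, confidence):
    """Interval estimate of p1 - p2 using the unpooled standard error."""
    # Standard error of the difference between the sample proportions.
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    point = p1 - p2
    margin = z * se
    return se, point - margin, point + margin


# Exercise 1 figures at 90% and 95% confidence.
for level in (0.90, 0.95):
    se, lower, upper = diff_proportions_interval(0.48, 400, 0.36, 300, level)
    print(level, round(se, 4), round(lower, 4), round(upper, 4))
```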
2.
a.
\(\bar{p} = \dfrac{n_1\bar{p}_1+n_2\bar{p}_2}{n_1+n_2} = \dfrac{200(0.22)+300(0.16)}{200+300} = 0.184\)
\(s_{\bar{p}_1-\bar{p}_2} = \sqrt{(0.184)(0.816)\left(\dfrac{1}{200}+\dfrac{1}{300}\right)} = 0.0354\)
Reject H0 if z > 1.645
\(z = \dfrac{(.22-.16)-0}{.0354} = 1.69\)
Reject H0
b.
p-value = (1.0000 - .9545) = .0455
3.
\(\bar{p}_1\) = 220/400 = 0.55
\(\bar{p}_2\) = 192/400 = 0.48
\(s_{\bar{p}_1-\bar{p}_2} = \sqrt{\dfrac{0.55(0.45)}{400}+\dfrac{0.48(0.52)}{400}} = 0.0353\)
\(\bar{p}_1-\bar{p}_2 \pm 1.96\, s_{\bar{p}_1-\bar{p}_2}\)
0.55 - 0.48 ± 1.96(0.0353)
0.07 ± 0.0691 or 0.0009 to 0.1391
7% more executives are predicting an increase in full-time jobs. The confidence interval shows the difference may be from 0% to 14%.
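4.
a.
\(\bar{p}_1\) = 682/1082 = .6303 (63%)
\(\bar{p}_2\) = 413/1008 = .4097 (41%)
\(\bar{p}_1-\bar{p}_2\) = .6303 - .4097 = .2206 (22%)
b.
\(\sigma_{\bar{p}_1-\bar{p}_2} = \sqrt{\dfrac{\bar{p}_1(1-\bar{p}_1)}{n_1}+\dfrac{\bar{p}_2(1-\bar{p}_2)}{n_2}} = \sqrt{\dfrac{.6303(1-.6303)}{1082}+\dfrac{.4097(1-.4097)}{1008}} = .0213\)
\(\bar{p}_1-\bar{p}_2 \pm 1.96\,\sigma_{\bar{p}_1-\bar{p}_2}\)
.2206 ± 1.96(.0213)
.2206 ± .0418 or .1788 to .2624
5.
\(\bar{p}_1-\bar{p}_2 \pm z_{\alpha/2}\, s_{\bar{p}_1-\bar{p}_2}\)
\(s_{\bar{p}_1-\bar{p}_2} = \sqrt{\dfrac{\bar{p}_1(1-\bar{p}_1)}{n_1}+\dfrac{\bar{p}_2(1-\bar{p}_2)}{n_2}} = \sqrt{\dfrac{(0.58)(0.42)}{n_1}+\dfrac{(0.43)(0.57)}{n_2}}\)
n1 = 0.57(1710) = 975
n2 = 0.08(1710) = 137
\(\therefore\ s_{\bar{p}_1-\bar{p}_2} = \sqrt{\dfrac{(0.58)(0.42)}{975}+\dfrac{(0.43)(0.57)}{137}} = 0.045\)
0.58 - 0.43 ± 1.96(0.045)
0.15 ± 0.09 or 0.06 to 0.24
6.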
a.
\(\bar{p}_1\) = 279/300 = 0.93
\(\bar{p}_2\) = 255/300 = 0.85
b.
H0: p1 - p2 = 0
Ha: p1 - p2 ≠ 0
Reject H0 if z < -1.96 or if z > 1.96
\(\bar{p} = \dfrac{279+255}{300+300} = 0.89\)
\(s_{\bar{p}_1-\bar{p}_2} = \sqrt{(0.89)(0.11)\left(\dfrac{1}{300}+\dfrac{1}{300}\right)} = 0.0255\)
\(z = \dfrac{\bar{p}_1-\bar{p}_2-0}{s_{\bar{p}_1-\bar{p}_2}} = \dfrac{0.93-0.85}{0.0255} = 3.13\)
Using Excel's NORMSDIST function, p-value = .002.
Reject H0; women and men differ on this question.
c.
\(\bar{p}_1-\bar{p}_2 \pm 1.96\, s_{\bar{p}_1-\bar{p}_2}\)
\(s_{\bar{p}_1-\bar{p}_2} = \sqrt{\dfrac{(0.93)(0.07)}{300}+\dfrac{(0.85)(0.15)}{300}} = 0.0253\)
0.93 - 0.85 ± 1.96(0.0253)
0.08 ± 0.05 or 0.03 to 0.13
95% confident that 3% to 13% more women than men agree with this statement.
7.
H0: p1 ≤ p2
Ha: p1 > p2
\(z = \dfrac{(\bar{p}_1-\bar{p}_2)-(p_1-p_2)}{s_{\bar{p}_1-\bar{p}_2}}\)
\(\bar{p} = \dfrac{n_1\bar{p}_1+n_2\bar{p}_2}{n_1+n_2} = \dfrac{1545(0.675)+1691(0.608)}{1545+1691} = 0.64\)
\(s_{\bar{p}_1-\bar{p}_2} = \sqrt{\bar{p}(1-\bar{p})\left(\dfrac{1}{n_1}+\dfrac{1}{n_2}\right)} = \sqrt{(0.64)(0.36)\left(\dfrac{1}{1545}+\dfrac{1}{1691}\right)} = 0.017\)
\(z = \dfrac{(0.675-0.608)-0}{0.017} = 3.94\)
Since 3.94 > \(z_{.05}\) = 1.645, we reject H0
p-value ≈ 0
Conclusion: The proportion of men that feel that the division of housework is fair is greater than the proportion of women that feel that the division of housework is fair.
8.
a.
A summary of the sample data is shown below:
Respondents   Sample Size   Number Cooperating
Men           200           110
Women         300           210
H0: p1 - p2 = 0
Ha: p1 - p2 ≠ 0
Reject H0 if z < -1.96 or if z > 1.96
\(\bar{p} = \dfrac{110+210}{200+300} = 0.64\)
\(s_{\bar{p}_1-\bar{p}_2} = \sqrt{(0.64)(0.36)\left(\dfrac{1}{200}+\dfrac{1}{300}\right)} = 0.0438\)
\(\bar{p}_1\) = 110 / 200 = 0.55
\(\bar{p}_2\) = 210 / 300 = 0.70
\(z = \dfrac{(\bar{p}_1-\bar{p}_2)-(p_1-p_2)}{s_{\bar{p}_1-\bar{p}_2}} = \dfrac{(0.55-0.70)-0}{0.0438} = -3.42\)
Reject H0; there is a difference between response rates for men and women.
b.
\(0.15 \pm 1.96\sqrt{\dfrac{0.55(0.45)}{200}+\dfrac{0.70(0.30)}{300}}\)
.15 ± .0863 or .0637 to .2363
Greater response rate for women.
9.
a.
H0: p1 - p2 = 0
Ha: p1 - p2 ≠ 0
Reject H0 if z < -1.96 or if z > 1.96
\(\bar{p} = \dfrac{63+60}{150+200} = 0.3514\)
\(s_{\bar{p}_1-\bar{p}_2} = \sqrt{(0.3514)(0.6486)\left(\dfrac{1}{150}+\dfrac{1}{200}\right)} = 0.0516\)
\(\bar{p}_1\) = 63 / 150 = 0.42
\(\bar{p}_2\) = 60 / 200 = 0.30
\(z = \dfrac{(\bar{p}_1-\bar{p}_2)-(p_1-p_2)}{s_{\bar{p}_1-\bar{p}_2}} = \dfrac{(0.42-0.30)-0}{0.0516} = 2.33\)
p-value = 2(1.0000 - .9901) = .0198
Reject H0; there is a difference between the recall rates for the two commercials.
b.
\((0.42-0.30) \pm 1.96\sqrt{\dfrac{0.42(0.58)}{150}+\dfrac{0.30(0.70)}{200}}\)
.12 ± .10 or .02 to .22
10.
\(\bar{p} = \dfrac{n_1\bar{p}_1+n_2\bar{p}_2}{n_1+n_2} = \dfrac{232(.815)+210(.724)}{232+210} = .7718\)
\(s_{\bar{p}_1-\bar{p}_2} = \sqrt{\bar{p}(1-\bar{p})\left(\dfrac{1}{n_1}+\dfrac{1}{n_2}\right)} = \sqrt{(.7718)(1-.7718)\left(\dfrac{1}{232}+\dfrac{1}{210}\right)} = .04\)
\(z = \dfrac{(\bar{p}_1-\bar{p}_2)-0}{s_{\bar{p}_1-\bar{p}_2}} = \dfrac{.815-.724}{.04} = 2.28\)
p-value = 2(1.0000 - .9887) = .0226
p-value < .05, reject H0. The population proportions differ. NYSE is showing a greater proportion of stocks below their 1997 highs.
11.
H0: p1 - p2 ≤ 0
Ha: p1 - p2 > 0
\(\bar{p} = \dfrac{n_1\bar{p}_1+n_2\bar{p}_2}{n_1+n_2} = \dfrac{240(.40)+250(.32)}{240+250} = .3592\)
\(s_{\bar{p}_1-\bar{p}_2} = \sqrt{\bar{p}(1-\bar{p})\left(\dfrac{1}{n_1}+\dfrac{1}{n_2}\right)} = \sqrt{(.3592)(1-.3592)\left(\dfrac{1}{240}+\dfrac{1}{250}\right)} = .0434\)
\(z = \dfrac{(\bar{p}_1-\bar{p}_2)-0}{s_{\bar{p}_1-\bar{p}_2}} = \dfrac{.40-.32}{.0434} = 1.85\)
p-value = 1.0000 - .9678 = .0322
p-value < .05, reject H0. The proportion of users at work is greater in Washington D.C.
12.
Expected frequencies: e1 = 200 (.40) = 80, e2 = 200 (.40) = 80, e3 = 200 (.20) = 40
Actual frequencies: f1 = 60, f2 = 120, f3 = 20
\(\chi^2 = \dfrac{(60-80)^2}{80}+\dfrac{(120-80)^2}{80}+\dfrac{(20-40)^2}{40} = \dfrac{400}{80}+\dfrac{1600}{80}+\dfrac{400}{40}\) = 5 + 20 + 10 = 35
\(\chi^2_{.01}\) = 9.21034 with k - 1 = 3 - 1 = 2 degrees of freedom
Since \(\chi^2\) = 35 > 9.21034, reject the null hypothesis. The population proportions are not as stated in the null hypothesis.
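The goodness of fit statistic can be computed directly from the observed counts and hypothesized proportions, as in this Python sketch. The function name `goodness_of_fit` is hypothetical, and the critical value is passed in from the chi-square table because the standard library has no chi-square distribution.

```python
def goodness_of_fit(observed, proportions, chi2_crit):
    """Chi-square goodness of fit statistic for a multinomial population."""
    n = sum(observed)
    # Expected frequency for each category under H0.
    expected = [n * p for p in proportions]
    chi2 = sum((f - e) ** 2 / e for f, e in zip(observed, expected))
    df = len(observed) - 1
    return expected, chi2, df, chi2 > chi2_crit


# Exercise 12 figures; 9.21034 is the tabled chi-square value for alpha = .01, 2 df.
expected, chi2, df, reject = goodness_of_fit([60, 120, 20], [0.40, 0.40, 0.20], 9.21034)
print(expected, round(chi2, 2), df, reject)
```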
13.
Expected frequencies: e1 = 300 (.25) = 75, e2 = 300 (.25) = 75, e3 = 300 (.25) = 75, e4 = 300 (.25) = 75
Actual frequencies: f1 = 85, f2 = 95, f3 = 50, f4 = 70
\(\chi^2 = \dfrac{(85-75)^2}{75}+\dfrac{(95-75)^2}{75}+\dfrac{(50-75)^2}{75}+\dfrac{(70-75)^2}{75} = \dfrac{100}{75}+\dfrac{400}{75}+\dfrac{625}{75}+\dfrac{25}{75} = \dfrac{1150}{75} = 15.33\)
\(\chi^2_{.05}\) = 7.81473 with k - 1 = 4 - 1 = 3 degrees of freedom
Since \(\chi^2\) = 15.33 > 7.81473, reject H0
We conclude that the proportions are not all equal.
14.
H0: pABC = .29, pCBS = .28, pNBC = .25, pOther = .18
Ha: The proportions are not pABC = .29, pCBS = .28, pNBC = .25, pOther = .18
Expected frequencies: 300 (.29) = 87, 300 (.28) = 84, 300 (.25) = 75, 300 (.18) = 54
e1 = 87, e2 = 84, e3 = 75, e4 = 54
Actual frequencies: f1 = 95, f2 = 70, f3 = 89, f4 = 46
\(\chi^2_{.05}\) = 7.81 (3 degrees of freedom)
\(\chi^2 = \dfrac{(95-87)^2}{87}+\dfrac{(70-84)^2}{84}+\dfrac{(89-75)^2}{75}+\dfrac{(46-54)^2}{54} = 6.87\)
Do not reject H0; there is no significant change in the viewing audience proportions.
15.
Category   Hypothesized Proportion   Observed Frequency (fi)   Expected Frequency (ei)   (fi - ei)^2 / ei
Brown      0.30                      177                       151.8                     4.18
Yellow     0.20                      135                       101.2                     11.29
Red        0.20                      79                        101.2                     4.87
Orange     0.10                      41                        50.6                      1.82
Green      0.10                      36                        50.6                      4.21
Blue       0.10                      38                        50.6                      3.14
Totals:                              506                                                 29.51
\(\chi^2_{.05}\) = 11.07 (5 degrees of freedom)
Since 29.51 > 11.07, we conclude that the percentage figures reported by the company have changed.
16.
Category       Hypothesized Proportion   Observed Frequency (fi)   Expected Frequency (ei)   (fi - ei)^2 / ei
Full Service   1/3                       264                       249.33                    0.86
Discount       1/3                       255                       249.33                    0.13
Both           1/3                       229                       249.33                    1.66
Totals:                                  748                                                 2.65
\(\chi^2_{.10}\) = 4.61 (2 degrees of freedom)
Since 2.65 < 4.61, there is no significant difference in preference among the three service choices.
17.
Category             Hypothesized Proportion   Observed Frequency (fi)   Expected Frequency (ei)   (fi - ei)^2 / ei
News and Opinion     1/6                       20                        19.17                     .04
General Editorial    1/6                       15                        19.17                     .91
Family Oriented      1/6                       30                        19.17                     6.12
Business/Financial   1/6                       22                        19.17                     .42
Female Oriented      1/6                       16                        19.17                     .52
African-American     1/6                       12                        19.17                     2.68
Totals:                                        115                                                 10.69
\(\chi^2_{.10}\) = 9.24 (5 degrees of freedom)
Since 10.69 > 9.24, we conclude that there is a difference in the proportion of ads with guilt appeals among the six types of magazines.
18.
Expected frequencies: ei = (1 / 3) (135) = 45
\(\chi^2 = \dfrac{(43-45)^2}{45}+\dfrac{(53-45)^2}{45}+\dfrac{(39-45)^2}{45} = 2.31\)
With 2 degrees of freedom, \(\chi^2_{.05}\) = 5.99
Do not reject H0; there is no justification for concluding a difference in preference exists.
19.
H0: p1 = .03, p2 = .28, p3 = .45, p4 = .24
df = 3
\(\chi^2_{.01}\) = 11.34
Reject H0 if \(\chi^2\) > 11.34
Rating      Observed   Expected          (fi - ei)^2 / ei
Excellent   24         .03(400) = 12     12.00
Good        124        .28(400) = 112    1.29
Fair        172        .45(400) = 180    .36
Poor        80         .24(400) = 96     2.67
Totals:     400        400               \(\chi^2\) = 16.31
Reject H0; conclude that the ratings differ. A comparison of observed and expected frequencies shows telephone service is slightly better, with more excellent and good ratings.
20.
H0: The column variable is independent of the row variable
Ha: The column variable is not independent of the row variable
Expected Frequencies:
      A      B      C
P     28.5   39.9   45.6
Q     21.5   30.1   34.4
\(\chi^2 = \dfrac{(20-28.5)^2}{28.5}+\dfrac{(44-39.9)^2}{39.9}+\dfrac{(50-45.6)^2}{45.6}+\dfrac{(30-21.5)^2}{21.5}+\dfrac{(26-30.1)^2}{30.1}+\dfrac{(30-34.4)^2}{34.4} = 7.86\)
\(\chi^2_{.025}\) = 7.37776 with (2 - 1) (3 - 1) = 2 degrees of freedom
Since \(\chi^2\) = 7.86 > 7.37776, reject H0
Conclude that the column variable is not independent of the row variable.
21.
H0: The column variable is independent of the row variable
Ha: The column variable is not independent of the row variable
Expected Frequencies:
      A         B         C
P     17.5000   30.6250   21.8750
Q     28.7500   50.3125   35.9375
R     13.7500   24.0625   17.1875
\(\chi^2 = \dfrac{(20-17.5000)^2}{17.5000}+\dfrac{(30-30.6250)^2}{30.6250}+\cdots+\dfrac{(30-17.1875)^2}{17.1875} = 19.78\)
\(\chi^2_{.05}\) = 9.48773 with (3 - 1) (3 - 1) = 4 degrees of freedom
Since \(\chi^2\) = 19.78 > 9.48773, reject H0
Conclude that the column variable is not independent of the row variable.
22.
H0: Type of ticket purchased is independent of the type of flight
Ha: Type of ticket purchased is not independent of the type of flight.
Expected Frequencies:
e11 = 35.59     e12 = 15.41
e21 = 150.73    e22 = 65.27
e31 = 455.68    e32 = 197.32
Ticket      Flight          Observed Frequency (fi)   Expected Frequency (ei)   (fi - ei)^2 / ei
First       Domestic        29                        35.59                     1.22
First       International   22                        15.41                     2.82
Business    Domestic        95                        150.73                    20.61
Business    International   121                       65.27                     47.59
Full Fare   Domestic        518                       455.68                    8.52
Full Fare   International   135                       197.32                    19.68
Totals:                     920                                                 100.43
\(\chi^2_{.05}\) = 5.99 with (3 - 1)(2 - 1) = 2 degrees of freedom
Since 100.43 > 5.99, we conclude that the type of ticket purchased is not independent of the type of flight.
23. a.
Observed Frequency (fij)
            Domestic   European   Asian    Total
Same        125        55         68       248
Different   140        105        107      352
Total       265        160        175      600
Expected Frequency (eij)
            Domestic   European   Asian    Total
Same        109.53     66.13      72.33    248
Different   155.47     93.87      102.67   352
Total       265        160        175      600
Chi Square (fij - eij)^2 / eij
            Domestic   European   Asian    Total
Same        2.18       1.87       0.26     4.32
Different   1.54       1.32       0.18     3.04
                                           \(\chi^2\) = 7.36
Degrees of freedom = 2
\(\chi^2_{.05}\) = 5.99
Reject H0; conclude brand loyalty is not independent of manufacturer.
b.
Brand Loyalty
Domestic 125/265 = .472 (47.2%) ← Highest
European 55/160 = .344 (34.4%)
Asian 68/175 = .389 (38.9%)
24.
                            Industry
Major         Oil   Chemical   Electrical   Computer
Business      30    22.5       17.5         30
Engineering   30    22.5       17.5         30
Note: Values shown above are the expected frequencies.
\(\chi^2_{.01}\) = 11.3449 (3 degrees of freedom: 1 x 3 = 3)
\(\chi^2\) = 12.39
Reject H0; conclude that major and industry are not independent.
25.
Expected Frequencies:
e11 = 31.0    e12 = 31.0
e21 = 29.5    e22 = 29.5
e31 = 13.0    e32 = 13.0
e41 = 5.5     e42 = 5.5
e51 = 7.0     e52 = 7.0
e61 = 14.0    e62 = 14.0
Most Difficult    Gender   Observed Frequency (fi)   Expected Frequency (ei)   (fi - ei)^2 / ei
Spouse            Men      37                        31.0                      1.16
Spouse            Women    25                        31.0                      1.16
Parents           Men      28                        29.5                      0.08
Parents           Women    31                        29.5                      0.08
Children          Men      7                         13.0                      2.77
Children          Women    19                        13.0                      2.77
Siblings          Men      8                         5.5                       1.14
Siblings          Women    3                         5.5                       1.14
In-Laws           Men      4                         7.0                       1.29
In-Laws           Women    10                        7.0                       1.29
Other Relatives   Men      16                        14.0                      0.29
Other Relatives   Women    12                        14.0                      0.29
Totals:                    200                                                 13.43
\(\chi^2_{.05}\) = 11.0705 with (6 - 1) (2 - 1) = 5 degrees of freedom
Since 13.43 > 11.0705, we conclude that gender is not independent of the most difficult person to buy for.
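For a test of independence, the expected frequencies and the chi-square statistic both follow from the observed contingency table. The Python sketch below shows one way to do this; `chi_square_independence` is a hypothetical helper, and the rows and columns are laid out as category by gender as an assumption about how the observed counts are arranged.

```python
def chi_square_independence(observed):
    """Chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, f in enumerate(row):
            # Expected frequency = (row total)(column total) / sample size.
            e = row_totals[i] * col_totals[j] / total
            chi2 += (f - e) ** 2 / e
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return chi2, df


# Exercise 25 observed counts: one row per category, columns are (Men, Women).
observed = [[37, 25], [28, 31], [7, 19], [8, 3], [4, 10], [16, 12]]
chi2, df = chi_square_independence(observed)
print(round(chi2, 2), df)
```

The result is compared with the tabled value for the stated degrees of freedom (11.0705 for 5 degrees of freedom at α = .05).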
26.
Expected Frequencies:
e11 = 17.16    e12 = 12.84
e21 = 14.88    e22 = 11.12
e31 = 28.03    e32 = 20.97
e41 = 22.31    e42 = 16.69
e51 = 17.16    e52 = 12.84
e61 = 15.45    e62 = 11.55
Magazine           Appeal   Observed Frequency (fi)   Expected Frequency (ei)   (fi - ei)^2 / ei
News               Guilt    20                        17.16                     0.47
News               Fear     10                        12.84                     0.63
General            Guilt    15                        14.88                     0.00
General            Fear     11                        11.12                     0.00
Family             Guilt    30                        28.03                     0.14
Family             Fear     19                        20.97                     0.18
Business           Guilt    22                        22.31                     0.00
Business           Fear     17                        16.69                     0.01
Female             Guilt    16                        17.16                     0.08
Female             Fear     14                        12.84                     0.11
African-American   Guilt    12                        15.45                     0.77
African-American   Fear     15                        11.55                     1.03
Totals:                     201                                                 3.41
\(\chi^2_{.01}\) = 15.09 with (6 - 1) (2 - 1) = 5 degrees of freedom
Since 3.41 < 15.09, the hypothesis of independence cannot be rejected.
27. a.
Observed Frequency (fij)
            Pharm   Consumer   Computer   Telecom   Total
Correct     207     136        151        178       672
Incorrect   3       4          9          12        28
Total       210     140        160        190       700
Expected Frequency (eij)
            Pharm   Consumer   Computer   Telecom   Total
Correct     201.6   134.4      153.6      182.4     672
Incorrect   8.4     5.6        6.4        7.6       28
Total       210     140        160        190       700
Chi Square (fij - eij)^2 / eij
            Pharm   Consumer   Computer   Telecom   Total
Correct     .14     .02        .04        .11       .31
Incorrect   3.47    .46        1.06       2.55      7.53
                                                    \(\chi^2\) = 7.85
Degrees of freedom = 3
\(\chi^2_{.05}\) = 7.81473
Reject H0; conclude that order fulfillment is not independent of industry.
b.
The pharmaceutical industry is doing the best with 207 of 210 (98.6%) correctly filled orders.
28.
Expected Frequencies:
                      Part Quality
Supplier   Good     Minor Defect   Major Defect
A          88.76    6.07           5.14
B          173.09   11.83          10.08
C          133.15   9.10           7.75
\(\chi^2\) = 7.96
\(\chi^2_{.05}\) = 9.48773 (4 degrees of freedom: 2 x 2 = 4)
Do not reject H0; conclude that the assumption of independence cannot be rejected.
29.
Expected Frequencies: Education Level Did not complete high school High school degree College degree
Democratic 28 32 40
Party Affiliation Republican 28 32 40
Independent 14 16 20
χ2 = 13.42 2 χ .01 = 13.2767 (4 degrees of freedom: 2 x 2 = 4)
Reject H0; conclude that party affiliation is not independent of education level. 30.
Expected Frequencies:
e11 = 11.81 e21 = 8.40 e31 = 21.79
Siskel Con Con Con Mixed Mixed Mixed Pro Pro
e12 = 8.44 e22 = 6.00 e32 = 15.56
Ebert Con Mixed Pro Con Mixed Pro Con Mixed
e13 = 24.75 e23 = 17.60 e33 = 45.65 Observed Frequency (fi) 24 8 13 8 13 11 10 9
13- 222
Expected Frequency (ei) 11.81 8.44 24.75 8.40 6.00 17.60 21.79 15.56
(fi - ei)2 / ei 12.57 0.02 5.58 0.02 8.17 2.48 6.38 2.77
Pro
Pro Totals:
64 160
45.65
2 = 13.28 with (3 - 1) (3 - 1) = 4 degrees of freedom χ .01
Since 45.36 > 13.28, we conclude that the ratings are not independent.
31. A summary of the sample data is shown below:

Region   Sample Size   Number Indicating An Intent to Purchase
I           500           175
II          800           360

p̄1 = 175 / 500 = .35
p̄2 = 360 / 800 = .45
s(p̄1 − p̄2) = √[0.35(0.65)/500 + 0.45(0.55)/800] = 0.0276
.10 ± 2.575(.0276)
.10 ± .071 or .029 to .171

32. a. H0: p1 - p2 ≤ 0
       Ha: p1 - p2 > 0

b. p̄1 = 704/1035 = .6802 (68%)
   p̄2 = 582/1004 = .5797 (58%)
   p̄1 − p̄2 = .6802 - .5797 = .1005
   p̄ = (n1p̄1 + n2p̄2)/(n1 + n2) = [1035(0.6802) + 1004(0.5797)]/(1035 + 1004) = .6307
   s(p̄1 − p̄2) = √[p̄(1 − p̄)(1/n1 + 1/n2)] = √[(.6307)(1 − .6307)(1/1035 + 1/1004)] = .0214
   z = [(p̄1 − p̄2) − 0]/s(p̄1 − p̄2) = (.6802 − .5797)/.0214 = 4.70
   p-value ≈ 0

c. Reject H0; proportion indicating good/excellent increased.

33. a. H0: p1 - p2 = 0
       Ha: p1 - p2 ≠ 0
   Reject H0 if z < -1.96 or if z > 1.96
   p̄ = (76 + 90)/(400 + 900) = 0.1277
   s(p̄1 − p̄2) = √[(0.1277)(0.8723)(1/400 + 1/900)] = 0.02
   p̄1 = 76 / 400 = 0.19
   p̄2 = 90 / 900 = 0.10
   z = [(p̄1 − p̄2) − (p1 − p2)]/s(p̄1 − p̄2) = [(0.19 − 0.10) − 0]/0.02 = 4.50
   p-value ≈ 0
   Reject H0; there is a difference between claim rates.

b. 0.09 ± 1.96 √[0.19(0.81)/400 + 0.10(0.90)/900]
   .09 ± .0432 or .0468 to .1332

34. p̄ = (9 + 5)/(142 + 268) = 14/410 = 0.0341
   s(p̄1 − p̄2) = √[(0.0341)(0.9659)(1/142 + 1/268)] = 0.0188
   p̄1 = 9 / 142 = 0.0634
   p̄2 = 5 / 268 = 0.0187
   p̄1 − p̄2 = 0.0634 − 0.0187 = 0.0447
   z = (0.0447 − 0)/0.0188 = 2.38
   p-value = 2(1.0000 - .9913) = 0.0174
   Reject H0; there is a significant difference in drug resistance between the two states. New Jersey has the higher drug resistance rate.

35. a. .38(430) = 163.4    Estimate: 163
       .23(285) = 65.55    Estimate: 66

b. p̄1 − p̄2 = .38 − .23 = .15
   s(p̄1 − p̄2) = √[.38(1 − .38)/163 + .23(1 − .23)/66] = .064
   Confidence interval: .15 ± 1.96(.064) or .15 ± .125 (.025 to .275)
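The interval in part (b) can be reproduced numerically. Below is a minimal Python sketch, not taken from the original solutions; the function name diff_proportions_ci is hypothetical, and the z value 1.96 is supplied by hand from the normal table as in the text.

```python
import math


def diff_proportions_ci(p1, n1, p2, n2, z):
    """Interval estimate of p1 - p2 using the unpooled standard error."""
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff, se, (diff - z * se, diff + z * se)


# Values from Exercise 35(b): sample proportions .38 and .23 with the
# sample sizes 163 and 66 used in the manual's standard error.
diff, se, (lower, upper) = diff_proportions_ci(0.38, 163, 0.23, 66, z=1.96)
print(round(se, 3), round(lower, 3), round(upper, 3))
```

Because the script does not round the standard error to .064 before multiplying, its limits can differ from the manual's .025 to .275 in the third decimal place.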
c. Yes, since the confidence interval in part (b) does not include 0, I would conclude that the Kodak campaign is more effective than most.

36. a. p̄1 = .38
       p̄2 = .22
       Point estimate = p̄1 − p̄2 = .38 − .22 = .16

b. H0: p1 - p2 ≤ 0
   Ha: p1 - p2 > 0

c. p̄ = (n1p̄1 + n2p̄2)/(n1 + n2) = [(200)(.38) + (200)(.22)]/(200 + 200) = .30
   s(p̄1 − p̄2) = √[p̄(1 − p̄)(1/200 + 1/200)] = √[(.3)(.7)(2/200)] = .0458
   z = (.38 − .22)/.0458 = 3.49
   z.01 = 2.33
   With z = 3.49 > 2.33 we reject H0 and conclude that expectations for future inflation have diminished.

37. Observed   60   45   59   36
    Expected   50   50   50   50

χ² = 8.04
χ²_.05 = 7.81473 (3 degrees of freedom)
Reject H0; conclude that the order potentials are not the same in each sales territory.

38. Observed   48      323      79       16      63
    Expected   37.03   306.82   126.96   21.16   37.03

χ² = (48 − 37.03)²/37.03 + (323 − 306.82)²/306.82 + ··· + (63 − 37.03)²/37.03 = 41.69
χ²_.01 = 13.2767 (4 degrees of freedom)
Since 41.69 > 13.2767, reject H0. Mutual fund investors' attitudes toward corporate bonds differ from their attitudes toward corporate stock.
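The goodness-of-fit statistic in Exercise 38 is a plain sum over categories, so it is easy to verify. A minimal Python sketch follows, not part of the original solutions; the function name is hypothetical, the degrees of freedom are taken as k − 1 on the assumption that no parameters were estimated from the sample, and the critical value is copied from the text.

```python
def chi_square_goodness_of_fit(observed, expected):
    """Return (chi-square statistic, degrees of freedom) for k categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    stat = sum((f - e) ** 2 / e for f, e in zip(observed, expected))
    return stat, len(observed) - 1


# Observed and expected frequencies from Exercise 38.
observed_38 = [48, 323, 79, 16, 63]
expected_38 = [37.03, 306.82, 126.96, 21.16, 37.03]
gof_stat, gof_df = chi_square_goodness_of_fit(observed_38, expected_38)

critical_value = 13.2767  # chi-square .01 value for 4 df, from the text
reject_h0 = gof_stat > critical_value
print(round(gof_stat, 2), gof_df, reject_h0)
```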
39. Observed   20   20   40   60
    Expected   35   35   35   35

χ² = (20 − 35)²/35 + (20 − 35)²/35 + (40 − 35)²/35 + (60 − 35)²/35 = 31.43
χ²_.05 = 7.81473 (3 degrees of freedom)
Since 31.43 > 7.81473, reject H0. The park manager should not plan on the same number attending each day. Plan on a larger staff for Sundays and holidays.

40. Observed   13   16   28   17   16
    Expected   18   18   18   18   18

χ² = 7.44
χ²_.05 = 9.48773
Do not reject H0; the assumption that the number of riders is uniformly distributed cannot be rejected.

41.
Category                 Hypothesized   Observed         Expected         (fi - ei)² / ei
                         Proportion     Frequency (fi)   Frequency (ei)
Very Satisfied             0.28           105              140               8.75
Somewhat Satisfied         0.46           235              230               0.11
Neither                    0.12            55               60               0.42
Somewhat Dissatisfied      0.10            90               50              32.00
Very Dissatisfied          0.04            15               20               1.25
Totals:                                   500                               42.53

χ²_.05 = 9.49 (4 degrees of freedom)
Since 42.53 > 9.49, we reject H0 and conclude that job satisfaction for computer programmers differs from the hypothesized proportions.

42. Expected Frequencies:

              Quality
Shift    Good      Defective
1st      368.44      31.56
2nd      276.33      23.67
3rd      184.22      15.78

χ² = 8.11
χ²_.05 = 5.99147 (2 degrees of freedom)
Reject H0; conclude that shift and quality are not independent.

43. Expected Frequencies:
e11 = 1046.19   e12 = 632.81
e21 = 28.66     e22 = 17.34
e31 = 258.59    e32 = 156.41
e41 = 516.55    e42 = 312.45

Employment       Region    Observed         Expected         (fi - ei)² / ei
                           Frequency (fi)   Frequency (ei)
Full-Time        Eastern     1105             1046.19           3.31
Full-Time        Western      574              632.81           5.46
Part-Time        Eastern       31               28.66           0.19
Part-Time        Western       15               17.34           0.32
Self-Employed    Eastern      229              258.59           3.39
Self-Employed    Western      186              156.41           5.60
Not Employed     Eastern      485              516.55           1.93
Not Employed     Western      344              312.45           3.19
Totals:                      2969                              23.37

χ²_.05 = 7.81 with (4 - 1)(2 - 1) = 3 degrees of freedom
Since 23.37 > 7.81, we conclude that employment status is not independent of region.

44.
Expected frequencies:

                 Loan Approval Decision
Loan Officer     Approved   Rejected
Miller            24.86      15.14
McMahon           18.64      11.36
Games             31.07      18.93
Runk              12.43       7.57

χ² = 2.21
χ²_.05 = 7.81473 (3 degrees of freedom)
Do not reject H0; the loan decision does not appear to be dependent on the officer.

45. a.
Observed Frequency (fij)
         Never Married   Married   Divorced   Total
Men          234           106        10       350
Women        216           168        16       400
Total        450           274        26       750

Expected Frequency (eij)
         Never Married   Married   Divorced   Total
Men          210          127.87     12.13     350
Women        240          146.13     13.87     400
Total        450          274        26        750

Chi Square (fij - eij)² / eij
         Never Married   Married   Divorced   Total
Men          2.74          3.74       .38      6.86
Women        2.40          3.27       .33      6.00
                                         χ² = 12.86

Degrees of freedom = 2
χ²_.01 = 9.21
Reject H0; conclude marital status is not independent of gender.

b.
                 Marital Status
         Never Married   Married   Divorced
Men          66.9%        30.3%      2.9%
Women        54.0%        42.0%      4.0%

Men: 100 - 66.9 = 33.1% have been married
Women: 100 - 54.0 = 46.0% have been married

46. Expected Frequencies:
e11 = (50)(18)/100 = 9, e12 = (50)(24)/100 = 12, ···, e25 = (50)(12)/100 = 6

χ² = (4 − 9)²/9 + (10 − 12)²/12 + ··· + (4 − 6)²/6 = 9.76
χ²_.05 = 9.48773 (4 degrees of freedom)
Since 9.76 > 9.48773, reject H0. Banking tends to have lower P/E ratios. We can conclude that industry type and P/E ratio are related.

47. Expected Frequencies:

                         Days of the Week
County    Sun    Mon    Tues   Wed    Thur   Fri    Sat    Total
Urban     56.7   47.6   55.1   56.7   60.1   72.6   44.2    393
Rural     11.3    9.4   10.9   11.3   11.9   14.4    8.8     78
Total     68     57     66     68     72     87     53      471

χ² = 6.20
χ²_.05 = 12.5916 (6 degrees of freedom)
Do not reject H0; the assumption of independence cannot be rejected.

48. Expected Frequencies:

            Los Angeles   San Diego   San Francisco   San Jose   Total
Occupied      165.7         124.3        186.4          165.7     642
Vacant         34.3          25.7         38.6           34.3     133
Total         200.0         150.0        225.0          200.0     775

χ² = (160 − 165.7)²/165.7 + (116 − 124.3)²/124.3 + ··· + (26 − 34.3)²/34.3 = 7.78
χ²_.05 = 7.81473 with 3 degrees of freedom
Since χ² = 7.78 ≤ 7.81473, do not reject H0. We cannot conclude that office vacancies are dependent on metropolitan area, but it is close: the p-value is slightly larger than .05.
Chapter 12 Simple Linear Regression

Learning Objectives

1. Understand how regression analysis can be used to develop an equation that estimates mathematically how two variables are related.
2. Understand the differences between the regression model, the regression equation, and the estimated regression equation.
3. Know how to fit an estimated regression equation to a set of sample data based upon the least squares method.
4. Be able to determine how good a fit is provided by the estimated regression equation and compute the sample correlation coefficient from the regression analysis output.
5. Understand the assumptions necessary for statistical inference and be able to test for a significant relationship.
6. Learn how to use a residual plot to make a judgement as to the validity of the regression assumptions, recognize outliers, and identify influential observations.
7. Know how to develop confidence interval estimates of y given a specific value of x in both the case of a mean value of y and an individual value of y.
8. Be able to compute the sample correlation coefficient from the regression analysis output.
9. Know the definition of the following terms:
   independent and dependent variable
   simple linear regression
   regression model
   regression equation
   estimated regression equation
   scatter diagram
   coefficient of determination
   standard error of the estimate
   confidence interval
   prediction interval
   residual plot
   standardized residual plot
   outlier
   influential observation
   leverage
Solutions:

1. a. [Scatter diagram: y versus x]

b. There appears to be a linear relationship between x and y.

c. Many different straight lines can be drawn to provide a linear approximation of the relationship between x and y; in part d we will determine the equation of a straight line that “best” represents the relationship according to the least squares criterion.

d. Summations needed to compute the slope and y-intercept are:
   Σxi = 15   Σyi = 40   Σ(xi − x̄)(yi − ȳ) = 26   Σ(xi − x̄)² = 10
   b1 = Σ(xi − x̄)(yi − ȳ) / Σ(xi − x̄)² = 26/10 = 2.6
   b0 = ȳ − b1x̄ = 8 − (2.6)(3) = 0.2
   ŷ = 0.2 + 2.6x

e. ŷ = 0.2 + 2.6(4) = 10.6
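The slope and intercept formulas used throughout these exercises can be sketched in a few lines of Python. This is an illustration only, not part of the original solutions. The data set below is an assumption: the exercise data are not reproduced in this manual, so the five points were chosen only because they match the sums quoted in Exercise 1 (Σxi = 15, Σyi = 40, Σ(xi − x̄)(yi − ȳ) = 26, Σ(xi − x̄)² = 10).

```python
def least_squares(x, y):
    """Return (b0, b1) for the estimated regression equation y_hat = b0 + b1*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


# ASSUMED illustrative data, chosen to reproduce the Exercise 1 summations.
x = [1, 2, 3, 4, 5]
y = [3, 7, 5, 11, 14]

b0, b1 = least_squares(x, y)
y_hat_at_4 = b0 + b1 * 4
print(round(b0, 2), round(b1, 2), round(y_hat_at_4, 2))
```

With these assumed data the sketch gives the same equation as part (d) and the same prediction as part (e).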
2. a. [Scatter diagram: y versus x]

b. There appears to be a linear relationship between x and y.

c. Many different straight lines can be drawn to provide a linear approximation of the relationship between x and y; in part d we will determine the equation of a straight line that “best” represents the relationship according to the least squares criterion.

d. Summations needed to compute the slope and y-intercept are:
   Σxi = 19   Σyi = 116   Σ(xi − x̄)(yi − ȳ) = −57.8   Σ(xi − x̄)² = 30.8
   b1 = Σ(xi − x̄)(yi − ȳ) / Σ(xi − x̄)² = −57.8/30.8 = −1.8766
   b0 = ȳ − b1x̄ = 23.2 − (−1.8766)(3.8) = 30.3311
   ŷ = 30.33 − 1.88x

e. ŷ = 30.33 − 1.88(6) = 19.05
3. a. [Scatter diagram: y versus x]

b. Summations needed to compute the slope and y-intercept are:
   Σxi = 26   Σyi = 17   Σ(xi − x̄)(yi − ȳ) = 11.6   Σ(xi − x̄)² = 22.8
   b1 = Σ(xi − x̄)(yi − ȳ) / Σ(xi − x̄)² = 11.6/22.8 = 0.5088
   b0 = ȳ − b1x̄ = 3.4 − (0.5088)(5.2) = 0.7542
   ŷ = 0.75 + 0.51x

c. ŷ = 0.75 + 0.51(4) = 2.79
4. a. [Scatter diagram: y versus x]

b. There appears to be a linear relationship between x and y.

c. Many different straight lines can be drawn to provide a linear approximation of the relationship between x and y; in part d we will determine the equation of a straight line that “best” represents the relationship according to the least squares criterion.

d. Summations needed to compute the slope and y-intercept are:
   Σxi = 325   Σyi = 585   Σ(xi − x̄)(yi − ȳ) = 110   Σ(xi − x̄)² = 20
   b1 = Σ(xi − x̄)(yi − ȳ) / Σ(xi − x̄)² = 110/20 = 5.5
   b0 = ȳ − b1x̄ = 117 − (5.5)(65) = −240.5
   ŷ = −240.5 + 5.5x

e. ŷ = −240.5 + 5.5x = −240.5 + 5.5(63) = 106 pounds
5. a. [Scatter diagram: y versus x]

b. There appears to be a linear relationship between x = media expenditures (millions of dollars) and y = case sales (millions).

c. Many different straight lines can be drawn to provide a linear approximation of the relationship between x and y; in part d we will determine the equation of a straight line that “best” represents the relationship according to the least squares criterion.
   Summations needed to compute the slope and y-intercept are:
   Σxi = 420.6   Σyi = 5958.7   Σ(xi − x̄)(yi − ȳ) = 142,040.3443   Σ(xi − x̄)² = 9847.6486
   b1 = Σ(xi − x̄)(yi − ȳ) / Σ(xi − x̄)² = 142,040.3443/9847.6486 = 14.4238
   b0 = ȳ − b1x̄ = 851.2429 − (14.4238)(60.0857) = −15.42
   ŷ = −15.42 + 14.42x

d. A one million dollar increase in media expenditures will increase case sales by approximately 14.42 million.

e. ŷ = −15.42 + 14.42x = −15.42 + 14.42(70) = 993.98
6. a. [Scatter diagram: y versus x]

b. There appears to be a linear relationship between x = percentage of flights arriving on time and y = number of complaints per 100,000 passengers.

c. Summations needed to compute the slope and y-intercept are:
   Σxi = 667.2   Σyi = 7.18   Σ(xi − x̄)(yi − ȳ) = −9.0623   Σ(xi − x̄)² = 128.7
   b1 = Σ(xi − x̄)(yi − ȳ) / Σ(xi − x̄)² = −9.0623/128.7 = −0.0704
   b0 = ȳ − b1x̄ = 0.7978 − (−0.0704)(74.1333) = 6.02
   ŷ = 6.02 − 0.07x

d. A one percent increase in the percentage of flights arriving on time will decrease the number of complaints per 100,000 passengers by 0.07.

e. ŷ = 6.02 − 0.07x = 6.02 − 0.07(80) = 0.42
7. a. [Scatter diagram: S&P versus DJIA]

b. Let x = DJIA and y = S&P. Summations needed to compute the slope and y-intercept are:
   Σxi = 104,850   Σyi = 14,233   Σ(xi − x̄)(yi − ȳ) = 268,921   Σ(xi − x̄)² = 1,806,384
   b1 = Σ(xi − x̄)(yi − ȳ) / Σ(xi − x̄)² = 268,921/1,806,384 = 0.14887
   b0 = ȳ − b1x̄ = 1423.3 − (.14887)(10,485) = −137.629
   ŷ = −137.63 + 0.1489x

c. ŷ = −137.63 + 0.1489(11,000) = 1500.27 or approximately 1500
8. a. [Scatter diagram: Price versus Sidetrack Capability]

b. There appears to be a linear relationship between x = sidetrack capability and y = price, with higher priced models having a higher level of handling.

c. Summations needed to compute the slope and y-intercept are:
   Σxi = 28   Σyi = 10,621   Σ(xi − x̄)(yi − ȳ) = 4003.2   Σ(xi − x̄)² = 19.6
   b1 = Σ(xi − x̄)(yi − ȳ) / Σ(xi − x̄)² = 4003.2/19.6 = 204.2449
   b0 = ȳ − b1x̄ = 1062.1 − (204.2449)(2.8) = 490.21
   ŷ = 490.21 + 204.24x

d. ŷ = 490.21 + 204.24x = 490.21 + 204.24(4) = 1307
9. a. Let x = years of experience and y = annual sales ($1000s)
   [Scatter diagram: y versus x]

b. Summations needed to compute the slope and y-intercept are:
   Σxi = 70   Σyi = 1080   Σ(xi − x̄)(yi − ȳ) = 568   Σ(xi − x̄)² = 142
   b1 = Σ(xi − x̄)(yi − ȳ) / Σ(xi − x̄)² = 568/142 = 4
   b0 = ȳ − b1x̄ = 108 − (4)(7) = 80
   ŷ = 80 + 4x

c. ŷ = 80 + 4x = 80 + 4(9) = 116
10. a. [Scatter diagram: Overall Rating versus Performance Score]

b. Let x = performance score and y = overall rating. Summations needed to compute the slope and y-intercept are:
   Σxi = 2752   Σyi = 1177   Σ(xi − x̄)(yi − ȳ) = 1723.73   Σ(xi − x̄)² = 11,867.73
   b1 = Σ(xi − x̄)(yi − ȳ) / Σ(xi − x̄)² = 1723.73/11,867.73 = 0.1452
   b0 = ȳ − b1x̄ = 78.4667 − (.1452)(183.4667) = 51.82
   ŷ = 51.82 + 0.145x

c. ŷ = 51.82 + 0.145(225) = 84.4 or approximately 84
11. a. Let x = hotel revenue and y = gaming revenue
   [Scatter diagram: y versus x]

b. There appears to be a linear relationship between the variables.

c. The summations needed to compute the slope and the y-intercept are:
   Σxi = 2973.3   Σyi = 3925.6   Σ(xi − x̄)(yi − ȳ) = 453,345.042   Σ(xi − x̄)² = 483,507.581
   b1 = Σ(xi − x̄)(yi − ȳ) / Σ(xi − x̄)² = 453,345.042/483,507.581 = 0.9385
   b0 = ȳ − b1x̄ = 392.56 − (0.9385)(297.33) = 113.52
   ŷ = 113.52 + 0.94x

d. ŷ = 113.52 + 0.94x = 113.52 + 0.94(500) = 583.5
12. a. [Scatter diagram: Revenue versus Number of Employees]

b. There appears to be a positive linear relationship between the number of employees and the revenue.

c. Let x = number of employees and y = revenue. Summations needed to compute the slope and y-intercept are:
   Σxi = 4200   Σyi = 1669   Σ(xi − x̄)(yi − ȳ) = 4,658,594,168   Σ(xi − x̄)² = 14,718,343,803
   b1 = Σ(xi − x̄)(yi − ȳ) / Σ(xi − x̄)² = 4,658,594,168/14,718,343,803 = 0.316516
   b0 = ȳ − b1x̄ = 14,048 − (.316516)(40,299) = 1293
   ŷ = 1293 + 0.3165x

d. ŷ = 1293 + .3165(75,000) = 25,031
13. a. Let x = adjusted gross income ($1000s) and y = total itemized deductions ($1000s)
   [Scatter diagram: y versus x]

b. The summations needed to compute the slope and the y-intercept are:
   Σxi = 399   Σyi = 97.1   Σ(xi − x̄)(yi − ȳ) = 1233.7   Σ(xi − x̄)² = 7648
   b1 = Σ(xi − x̄)(yi − ȳ) / Σ(xi − x̄)² = 1233.7/7648 = 0.16131
   b0 = ȳ − b1x̄ = 13.87143 − (0.16131)(57) = 4.67675
   ŷ = 4.68 + 0.16x

c. ŷ = 4.68 + 0.16x = 4.68 + 0.16(52.5) = 13.08 or approximately $13,080. The agent's request for an audit appears to be justified.
14. a. Let x = average room rate ($) and y = occupancy rate (%)
   [Scatter diagram: y versus x]

b. The summations needed to compute the slope and the y-intercept are:
   Σxi = 1677.25   Σyi = 1404.3   Σ(xi − x̄)(yi − ȳ) = 897.9493   Σ(xi − x̄)² = 3657.4568
   b1 = Σ(xi − x̄)(yi − ȳ) / Σ(xi − x̄)² = 897.9493/3657.4568 = 0.2455
   b0 = ȳ − b1x̄ = 70.215 − (0.2455)(83.8625) = 49.63
   ŷ = 49.63 + .2455x

c. ŷ = 49.63 + .2455x = 49.63 + .2455(80) = 69.3%

15. a. The estimated regression equation and the mean for the dependent variable are:
   ŷi = 0.2 + 2.6xi   ȳ = 8
   The sum of squares due to error and the total sum of squares are
   SSE = Σ(yi − ŷi)² = 12.40   SST = Σ(yi − ȳ)² = 80
   Thus, SSR = SST - SSE = 80 - 12.4 = 67.6

b. r² = SSR/SST = 67.6/80 = .845
   The least squares line provided a very good fit; 84.5% of the variability in y has been explained by the least squares line.

c. rxy = +√.845 = +.9192
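The sums of squares in Exercise 15 can be computed directly once the data and fitted line are known. The Python sketch below is illustrative and not part of the original solutions. The data are an assumption: five points chosen because they reproduce the quoted values (ŷ = 0.2 + 2.6x, ȳ = 8, SSE = 12.4, SST = 80); the helper name fit_summary is hypothetical.

```python
import math


def fit_summary(x, y, b0, b1):
    """Return SSE, SST, SSR, r-squared and r for the line y_hat = b0 + b1*x."""
    y_bar = sum(y) / len(y)
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    ssr = sst - sse
    r2 = ssr / sst
    # The correlation coefficient takes the sign of the slope b1.
    r = math.copysign(math.sqrt(r2), b1)
    return sse, sst, ssr, r2, r


# ASSUMED illustrative data consistent with the Exercise 15 summary values.
x = [1, 2, 3, 4, 5]
y = [3, 7, 5, 11, 14]

sse, sst, ssr, r2, r = fit_summary(x, y, b0=0.2, b1=2.6)
print(round(sse, 2), round(sst, 2), round(ssr, 2), round(r2, 3), round(r, 4))
```

Taking the sign of r from b1 mirrors the note in Exercise 16, where a negative slope gives a negative correlation coefficient.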
16. a. The estimated regression equation and the mean for the dependent variable are:
   ŷi = 30.33 − 1.88x   ȳ = 23.2
   The sum of squares due to error and the total sum of squares are
   SSE = Σ(yi − ŷi)² = 6.33   SST = Σ(yi − ȳ)² = 114.80
   Thus, SSR = SST - SSE = 114.80 - 6.33 = 108.47

b. r² = SSR/SST = 108.47/114.80 = .945
   The least squares line provided an excellent fit; 94.5% of the variability in y has been explained by the estimated regression equation.

c. rxy = −√.945 = −.9721
   Note: the sign for rxy is negative because the slope of the estimated regression equation is negative (b1 = -1.88).

17. The estimated regression equation and the mean for the dependent variable are:
   ŷi = .75 + .51x   ȳ = 3.4
   The sum of squares due to error and the total sum of squares are
   SSE = Σ(yi − ŷi)² = 5.3   SST = Σ(yi − ȳ)² = 11.2
   Thus, SSR = SST - SSE = 11.2 - 5.3 = 5.9
   r² = SSR/SST = 5.9/11.2 = .527
   We see that 52.7% of the variability in y has been explained by the least squares line.
   rxy = +√.527 = +.7259

18. a. The estimated regression equation and the mean for the dependent variable are:
   ŷ = 1790.5 + 581.1x   ȳ = 3650
   The sum of squares due to error and the total sum of squares are
   SSE = Σ(yi − ŷi)² = 85,135.14   SST = Σ(yi − ȳ)² = 335,000
   Thus, SSR = SST - SSE = 335,000 - 85,135.14 = 249,864.86

b. r² = SSR/SST = 249,864.86/335,000 = .746
   We see that 74.6% of the variability in y has been explained by the least squares line.

c. rxy = +√.746 = +.8637

19. a. The estimated regression equation and the mean for the dependent variable are:
   ŷ = −137.63 + .1489x   ȳ = 1423.3
   The sum of squares due to error and the total sum of squares are
   SSE = Σ(yi − ŷi)² = 7547.14   SST = Σ(yi − ȳ)² = 47,582.10
   Thus, SSR = SST - SSE = 47,582.10 - 7547.14 = 40,034.96

b. r² = SSR/SST = 40,034.96/47,582.10 = .84
   We see that 84% of the variability in y has been explained by the least squares line.

c. rxy = +√.84 = +.92

20. a. Let x = income and y = home price. Summations needed to compute the slope and y-intercept are:
   Σxi = 1424   Σyi = 2455.5   Σ(xi − x̄)(yi − ȳ) = 4011   Σ(xi − x̄)² = 1719.618
   b1 = Σ(xi − x̄)(yi − ȳ) / Σ(xi − x̄)² = 4011/1719.618 = 2.3325
   b0 = ȳ − b1x̄ = 136.4167 − (2.3325)(79.1111) = −48.11
   ŷ = −48.11 + 2.3325x

b. The sum of squares due to error and the total sum of squares are
   SSE = Σ(yi − ŷi)² = 2017.37   SST = Σ(yi − ȳ)² = 11,373.09
   Thus, SSR = SST - SSE = 11,373.09 − 2017.37 = 9355.72
   r² = SSR/SST = 9355.72/11,373.09 = .82
   We see that 82% of the variability in y has been explained by the least squares line.
   rxy = +√.82 = +.91

c. ŷ = −48.11 + 2.3325(95) = 173.5 or approximately $173,500

21. a. The summations needed in this problem are:
   Σxi = 3450   Σyi = 33,700   Σ(xi − x̄)(yi − ȳ) = 712,500   Σ(xi − x̄)² = 93,750
   b1 = Σ(xi − x̄)(yi − ȳ) / Σ(xi − x̄)² = 712,500/93,750 = 7.6
   b0 = ȳ − b1x̄ = 5616.67 − (7.6)(575) = 1246.67
   ŷ = 1246.67 + 7.6x

b. $7.60

c. The sum of squares due to error and the total sum of squares are:
   SSE = Σ(yi − ŷi)² = 233,333.33   SST = Σ(yi − ȳ)² = 5,648,333.33
   Thus, SSR = SST - SSE = 5,648,333.33 - 233,333.33 = 5,415,000
   r² = SSR/SST = 5,415,000/5,648,333.33 = .9587
   We see that 95.87% of the variability in y has been explained by the estimated regression equation.

d. ŷ = 1246.67 + 7.6x = 1246.67 + 7.6(500) = $5046.67

22. a. The summations needed in this problem are:
   Σxi = 613.1   Σyi = 70   Σ(xi − x̄)(yi − ȳ) = 5766.7   Σ(xi − x̄)² = 45,833.9286
   b1 = Σ(xi − x̄)(yi − ȳ) / Σ(xi − x̄)² = 5766.7/45,833.9286 = 0.1258
   b0 = ȳ − b1x̄ = 10 − (0.1258)(87.5857) = −1.0183
   ŷ = −1.0183 + 0.1258x

b. The sum of squares due to error and the total sum of squares are:
   SSE = Σ(yi − ŷi)² = 1272.4495   SST = Σ(yi − ȳ)² = 1998
   Thus, SSR = SST - SSE = 1998 - 1272.4495 = 725.5505
   r² = SSR/SST = 725.5505/1998 = 0.3631
   Approximately 36% of the variability in change in executive compensation is explained by the two-year change in the return on equity.

c. rxy = +√0.3631 = +0.6026
   It reflects a linear relationship that is between weak and strong.
23. a. s² = MSE = SSE / (n - 2) = 12.4 / 3 = 4.133

b. s = √MSE = √4.133 = 2.033

c. Σ(xi − x̄)² = 10
   sb1 = s / √Σ(xi − x̄)² = 2.033/√10 = 0.643

d. t = b1/sb1 = 2.6/.643 = 4.04
   t.025 = 3.182 (3 degrees of freedom)
   Since t = 4.04 > t.025 = 3.182 we reject H0: β1 = 0

e. MSR = SSR / 1 = 67.6
   F = MSR / MSE = 67.6 / 4.133 = 16.36
   F.05 = 10.13 (1 degree of freedom numerator and 3 denominator)
   Since F = 16.36 > F.05 = 10.13 we reject H0: β1 = 0

   Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square   F
   Regression                67.6                1                67.6       16.36
   Error                     12.4                3                 4.133
   Total                     80.0                4

24. a. s² = MSE = SSE / (n - 2) = 6.33 / 3 = 2.11

b. s = √MSE = √2.11 = 1.453

c. Σ(xi − x̄)² = 30.8
   sb1 = s / √Σ(xi − x̄)² = 1.453/√30.8 = 0.262

d. t = b1/sb1 = −1.88/.262 = −7.18
   t.025 = 3.182 (3 degrees of freedom)
   Since t = -7.18 < -t.025 = -3.182 we reject H0: β1 = 0

e. MSR = SSR / 1 = 108.47
   F = MSR / MSE = 108.47 / 2.11 = 51.41
   F.05 = 10.13 (1 degree of freedom numerator and 3 denominator)
   Since F = 51.41 > F.05 = 10.13 we reject H0: β1 = 0

   Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square   F
   Regression               108.47               1               108.47      51.41
   Error                      6.33               3                 2.11
   Total                    114.80               4

25. a. s² = MSE = SSE / (n - 2) = 5.30 / 3 = 1.77
   s = √MSE = √1.77 = 1.33

b. Σ(xi − x̄)² = 22.8
   sb1 = s / √Σ(xi − x̄)² = 1.33/√22.8 = 0.28
   t = b1/sb1 = .51/.28 = 1.82
   t.025 = 3.182 (3 degrees of freedom)
   Since t = 1.82 < t.025 = 3.182 we cannot reject H0: β1 = 0; x and y do not appear to be related.

c. MSR = SSR/1 = 5.90/1 = 5.90
   F = MSR/MSE = 5.90/1.77 = 3.33
   F.05 = 10.13 (1 degree of freedom numerator and 3 denominator)
   Since F = 3.33 < F.05 = 10.13 we cannot reject H0: β1 = 0; x and y do not appear to be related.
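The t and F statistics in Exercises 23 to 25 all follow from the same few summary values. A minimal Python sketch under that assumption is shown below; it is not part of the original solutions, the function name slope_significance is hypothetical, and the critical values are copied from the text rather than computed.

```python
import math


def slope_significance(b1, sse, sst, n, sxx):
    """Return s, s_b1, t and F for testing H0: beta1 = 0 in simple regression."""
    mse = sse / (n - 2)            # s^2, the mean square error
    s = math.sqrt(mse)             # standard error of the estimate
    s_b1 = s / math.sqrt(sxx)      # estimated standard deviation of b1
    t = b1 / s_b1
    msr = (sst - sse) / 1          # one independent variable
    f = msr / mse
    return s, s_b1, t, f


# Summary values from Exercise 23.
s, s_b1, t, f = slope_significance(b1=2.6, sse=12.4, sst=80, n=5, sxx=10)

t_critical = 3.182   # t.025 with 3 degrees of freedom, from the text
f_critical = 10.13   # F.05 with 1 and 3 degrees of freedom, from the text
print(round(t, 2), abs(t) > t_critical)
print(round(f, 2), f > f_critical)
```

Because the sketch carries MSE at full precision, F can differ from the manual's 16.36 in the second decimal place. In simple linear regression the two tests always agree, since F equals t squared.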
26. a. s² = MSE = SSE / (n - 2) = 85,135.14 / 4 = 21,283.79
   s = √MSE = √21,283.79 = 145.89
   Σ(xi − x̄)² = 0.74
   sb1 = s / √Σ(xi − x̄)² = 145.89/√0.74 = 169.59
   t = b1/sb1 = 581.08/169.59 = 3.43
   t.025 = 2.776 (4 degrees of freedom)
   Since t = 3.43 > t.025 = 2.776 we reject H0: β1 = 0

b. MSR = SSR / 1 = 249,864.86 / 1 = 249,864.86
   F = MSR / MSE = 249,864.86 / 21,283.79 = 11.74
   F.05 = 7.71 (1 degree of freedom numerator and 4 denominator)
   Since F = 11.74 > F.05 = 7.71 we reject H0: β1 = 0

c.
   Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square   F
   Regression              249864.86             1              249864.86    11.74
   Error                    85135.14             4               21283.79
   Total                   335000                5

27. a. Summations needed to compute the slope and y-intercept are:
   Σxi = 37   Σyi = 1654   Σ(xi − x̄)(yi − ȳ) = 315.2   Σ(xi − x̄)² = 10.1
   b1 = Σ(xi − x̄)(yi − ȳ) / Σ(xi − x̄)² = 315.2/10.1 = 31.2079
   b0 = ȳ − b1x̄ = 165.4 − (31.2079)(3.7) = 49.93
   ŷ = 49.93 + 31.21x

b. SSE = Σ(yi − ŷi)² = 2487.66   SST = Σ(yi − ȳ)² = 12,324.4
   Thus, SSR = SST - SSE = 12,324.4 - 2487.66 = 9836.74
   MSR = SSR/1 = 9836.74
   MSE = SSE/(n - 2) = 2487.66/8 = 310.96
   F = MSR / MSE = 9836.74/310.96 = 31.63
   F.05 = 5.32 (1 degree of freedom numerator and 8 denominator)
   Since F = 31.63 > F.05 = 5.32 we reject H0: β1 = 0. Upper support and price are related.

c. r² = SSR/SST = 9,836.74/12,324.4 = .80
   The estimated regression equation provided a good fit; we should feel comfortable using the estimated regression equation to estimate the price given the upper support rating.

d. ŷ = 49.93 + 31.21(4) = 174.77

28. SST = 411.73   SSE = 161.37   SSR = 250.36
   MSR = SSR / 1 = 250.36
   MSE = SSE / (n - 2) = 161.37 / 13 = 12.413
   F = MSR / MSE = 250.36 / 12.413 = 20.17
   F.05 = 4.67 (1 degree of freedom numerator and 13 denominator)
   Since F = 20.17 > F.05 = 4.67 we reject H0: β1 = 0.

29. SSE = 233,333.33   SST = 5,648,333.33   SSR = 5,415,000
   MSE = SSE/(n - 2) = 233,333.33/(6 - 2) = 58,333.33
   MSR = SSR/1 = 5,415,000
   F = MSR / MSE = 5,415,000 / 58,333.33 = 92.83

   Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square   F
   Regression             5,415,000.00           1              5,415,000    92.83
   Error                    233,333.33           4              58,333.33
   Total                  5,648,333.33           5

   F.05 = 7.71 (1 degree of freedom numerator and 4 denominator)
   Since F = 92.83 > 7.71 we reject H0: β1 = 0. Production volume and total cost are related.

30. Using the computations from Exercise 22,
   SSE = 1272.4495   SST = 1998   SSR = 725.5505
   s = √254.4899 = 15.95
   Σ(xi − x̄)² = 45,833.9286
   sb1 = s / √Σ(xi − x̄)² = 15.95/√45,833.9286 = 0.0745
   t = b1/sb1 = 0.1258/0.0745 = 1.69
   t.025 = 2.571
   Since t = 1.69 < 2.571, we cannot reject H0: β1 = 0
   There is no evidence of a significant relationship between x and y.

31. SST = 11,373.09   SSE = 2017.37   SSR = 9355.72
   MSR = SSR / 1 = 9355.72
   MSE = SSE / (n - 2) = 2017.37 / 16 = 126.0856
   F = MSR / MSE = 9355.72 / 126.0856 = 74.20
   F.01 = 8.53 (1 degree of freedom numerator and 16 denominator)
   Since F = 74.20 > F.01 = 8.53 we reject H0: β1 = 0.
32. a. ŷ = 6.1092 + 0.8951x

b. t = b1/sb1 = 0.8951/0.149 = 6.01
   t.025 = 2.306 (8 degrees of freedom)
   Since t = 6.01 > t.025 = 2.306 we reject H0: β1 = 0; monthly maintenance expense is related to usage.

c. r² = SSR/SST = 1575.76/1924.90 = 0.82. A good fit.

33. a. 9

b. ŷ = 20.0 + 7.21x

c. t = 5.29 > t.025 = 2.365; we reject H0: β1 = 0

d. SSE = SST - SSR = 51,984.1 - 41,587.3 = 10,396.8
   MSE = 10,396.8 / 7 = 1,485.3
   F = MSR / MSE = 41,587.3 / 1,485.3 = 28.00
   F.05 = 5.59 (1 degree of freedom numerator and 7 denominator)
   Since F = 28 > F.05 = 5.59 we reject H0: β1 = 0.

e. ŷ = 20.0 + 7.21x = 20.0 + 7.21(50) = 380.5 or $380,500

34. a. ŷ = 80.0 + 50.0x

b. F = MSR / MSE = 6828.6 / 82.1 = 83.17
   F.05 = 4.20 (1 degree of freedom numerator and 28 denominator)
   Since F = 83.17 > F.05 = 4.20 we reject H0: β1 = 0. Branch office sales are related to the salespersons.

c. t = 50/5.482 = 9.12
   t.025 = 2.048 (28 degrees of freedom)
   Since t = 9.12 > t.025 = 2.048 we reject H0: β1 = 0

d. p-value = .000

35.
A portion of the Excel Regression tool output for this problem follows:

Regression Statistics
Multiple R            0.7379
R Square              0.5444
Adjusted R Square     0.5094
Standard Error        4.1535
Observations          15

ANOVA
              df    SS          MS          F         Significance F
Regression     1    268.0118    268.0118    15.5357   0.0017
Residual      13    224.2682     17.2514
Total         14    492.28

                          Coefficients   Standard Error   t Stat   P-value
Intercept                   11.3332         2.7700        4.0914   0.0013
Gross Profit Margin (%)      0.6361         0.1614        3.9415   0.0017

a. ŷ = 11.3332 + .6361x where x = Gross Profit Margin (%)

b. Significant relationship: Significance F = .0017 < α = .05

c. Significant relationship: P-value = .0017 < α = .05

d. r² = 0.5444; Not a good fit

36. A portion of the Excel Regression tool output for this problem follows:

Regression Statistics
Multiple R            0.6502
R Square              0.4228
Adjusted R Square     0.3907
Standard Error        11.5925
Observations          20

ANOVA
              df    SS             MS          F         Significance F
Regression     1    1771.982016    1771.982    13.1857   0.0019
Residual      18    2418.967984     134.3871
Total         19    4190.95

             Coefficients   Standard Error   t Stat    P-value
Intercept      -42.7965        19.3816       -2.2081   0.0405
Age              1.0043         0.2766        3.6312   0.0019

a. ŷ = −42.7965 + 1.0043x where x = Age

b. Significant relationship: Significance F = .0019 < α = .05

c. r² = 0.4228; Not a good fit

37. A portion of the Excel Regression tool output for this problem follows:

Regression Statistics
Multiple R            0.9277
R Square              0.8606
Adjusted R Square     0.8467
Standard Error        6.6343
Observations          12

ANOVA
              df    SS           MS           F         Significance F
Regression     1    2717.8625    2717.8625    61.7503   1.3768E-05
Residual      10     440.1375      44.0137
Total         11    3158

                    Coefficients   Standard Error   t Stat    P-value
Intercept             -11.8020        12.8441       -0.9189   0.3798
Income ($1000s)         2.1843         0.2780        7.8581   1.3768E-05

a. ŷ = −11.802 + 2.1843x where x = Income ($1000s)

b. Significant relationship: P-value = .000 < α = .05

c. r² = 0.86; A very good fit
38. a. Scatter diagram:
   [Scatter diagram: Average Rental Rate ($) versus Vacancy Rate (%)]

b. There appears to be a linear relationship between the two variables.
   A portion of the Excel Regression tool output for this problem follows:

Regression Statistics
Multiple R            0.6589
R Square              0.4341
Adjusted R Square     0.3988
Standard Error        4.8847
Observations          18

ANOVA
              df    SS          MS          F         Significance F
Regression     1    292.9137    292.9137    12.2760   0.0029
Residual      16    381.7712     23.8607
Total         17    674.6849

                    Coefficients   Standard Error   t Stat    P-value
Intercept              37.0747         3.5277       10.5097   1.36938E-08
Vacancy Rate (%)       -0.7792         0.2224       -3.5037   0.0029

c. ŷ = 37.0747 - 0.7792x where x = Vacancy Rate (%)

d. Significant relationship: Significance F (or P-value) < α = .05

e. r² = 0.43; Not a very good fit

39. a. s = 2.033   x̄ = 3   Σ(xi − x̄)² = 10
   sŷp = s √[1/n + (xp − x̄)²/Σ(xi − x̄)²] = 2.033 √[1/5 + (4 − 3)²/10] = 1.11

b. ŷ = 0.2 + 2.6x = 0.2 + 2.6(4) = 10.6
   ŷp ± tα/2 sŷp
   10.6 ± 3.182(1.11) = 10.6 ± 3.53 or 7.07 to 14.13

c. sind = s √[1 + 1/n + (xp − x̄)²/Σ(xi − x̄)²] = 2.033 √[1 + 1/5 + (4 − 3)²/10] = 2.32

d. ŷp ± tα/2 sind
   10.6 ± 3.182(2.32) = 10.6 ± 7.38 or 3.22 to 17.98

40. a. s = 1.453   x̄ = 3.8   Σ(xi − x̄)² = 30.8
   sŷp = s √[1/n + (xp − x̄)²/Σ(xi − x̄)²] = 1.453 √[1/5 + (3 − 3.8)²/30.8] = .68

b. ŷ = 30.33 − 1.88x = 30.33 − 1.88(3) = 24.69
   ŷp ± tα/2 sŷp
   24.69 ± 3.182(.68) = 24.69 ± 2.16 or 22.53 to 26.85

c. sind = s √[1 + 1/n + (xp − x̄)²/Σ(xi − x̄)²] = 1.453 √[1 + 1/5 + (3 − 3.8)²/30.8] = 1.61

d. ŷp ± tα/2 sind
   24.69 ± 3.182(1.61) = 24.69 ± 5.12 or 19.57 to 29.81
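The confidence and prediction interval calculations in Exercises 39 and 40 share one structure, which the following Python sketch makes explicit. It is an illustration, not part of the original solutions; the function name regression_intervals is hypothetical, and the t value is supplied from the t table as in the text.

```python
import math


def regression_intervals(b0, b1, s, n, x_bar, sxx, x_p, t_crit):
    """Return y_hat, s_mean, s_ind, confidence interval and prediction interval."""
    y_hat = b0 + b1 * x_p
    core = 1 / n + (x_p - x_bar) ** 2 / sxx
    s_mean = s * math.sqrt(core)       # std. deviation of y_hat (mean value of y)
    s_ind = s * math.sqrt(1 + core)    # std. deviation for an individual value of y
    ci = (y_hat - t_crit * s_mean, y_hat + t_crit * s_mean)
    pi = (y_hat - t_crit * s_ind, y_hat + t_crit * s_ind)
    return y_hat, s_mean, s_ind, ci, pi


# Values from Exercise 39: y_hat = 0.2 + 2.6x, s = 2.033, n = 5,
# x_bar = 3, sum of squared x deviations = 10, x_p = 4, t.025 = 3.182.
y_hat, s_mean, s_ind, ci, pi = regression_intervals(
    b0=0.2, b1=2.6, s=2.033, n=5, x_bar=3, sxx=10, x_p=4, t_crit=3.182
)
print(round(y_hat, 2), round(s_mean, 2), round(s_ind, 2))
print([round(v, 2) for v in ci], [round(v, 2) for v in pi])
```

The only difference between the two intervals is the extra 1 under the square root, which is why the prediction interval for an individual value is always wider. Limits printed by the sketch can differ from the manual's in the second decimal place because the manual rounds the standard deviations before multiplying.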
41.
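s = 1.33, $\bar{x} = 5.2$, $\Sigma(x_i - \bar{x})^2 = 22.8$
$s_{\hat{y}_p} = s\sqrt{\frac{1}{n} + \frac{(x_p - \bar{x})^2}{\Sigma(x_i - \bar{x})^2}} = 1.33\sqrt{\frac{1}{5} + \frac{(3 - 5.2)^2}{22.8}} = 0.85$
ŷ = 0.75 + 0.51x = 0.75 + 0.51(3) = 2.28
$\hat{y}_p \pm t_{\alpha/2}\, s_{\hat{y}_p}$: 2.28 ± 3.182(.85) = 2.28 ± 2.70 or -.42 to 4.98
$s_{ind} = s\sqrt{1 + \frac{1}{n} + \frac{(x_p - \bar{x})^2}{\Sigma(x_i - \bar{x})^2}} = 1.33\sqrt{1 + \frac{1}{5} + \frac{(3 - 5.2)^2}{22.8}} = 1.58$
$\hat{y}_p \pm t_{\alpha/2}\, s_{ind}$: 2.28 ± 3.182(1.58) = 2.28 ± 5.03 or -2.75 to 7.31
42. a.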
s = 145.89, $\bar{x} = 3.2$, $\Sigma(x_i - \bar{x})^2 = 0.74$
$s_{\hat{y}_p} = s\sqrt{\frac{1}{n} + \frac{(x_p - \bar{x})^2}{\Sigma(x_i - \bar{x})^2}} = 145.89\sqrt{\frac{1}{6} + \frac{(3 - 3.2)^2}{0.74}} = 68.54$
ŷ = 1790.5 + 581.1x = 1790.5 + 581.1(3) = 3533.8
$\hat{y}_p \pm t_{\alpha/2}\, s_{\hat{y}_p}$: 3533.8 ± 2.776(68.54) = 3533.8 ± 190.27 or $3343.53 to $3724.07
b.
$s_{ind} = s\sqrt{1 + \frac{1}{n} + \frac{(x_p - \bar{x})^2}{\Sigma(x_i - \bar{x})^2}} = 145.89\sqrt{1 + \frac{1}{6} + \frac{(3 - 3.2)^2}{0.74}} = 161.19$
$\hat{y}_p \pm t_{\alpha/2}\, s_{ind}$: 3533.8 ± 2.776(161.19) = 3533.8 ± 447.46 or $3086.34 to $3981.26
43. a.
ŷ = 51.82 + .1452x = 51.82 + .1452(200) = 80.86
b.
s = 3.5232, $\bar{x} = 183.4667$, $\Sigma(x_i - \bar{x})^2 = 11,867.73$
$s_{\hat{y}_p} = s\sqrt{\frac{1}{n} + \frac{(x_p - \bar{x})^2}{\Sigma(x_i - \bar{x})^2}} = 3.5232\sqrt{\frac{1}{15} + \frac{(200 - 183.4667)^2}{11,867.73}} = 1.055$
$\hat{y}_p \pm t_{\alpha/2}\, s_{\hat{y}_p}$: 80.86 ± 2.160(1.055) = 80.86 ± 2.279 or 78.58 to 83.14
c.
$s_{ind} = s\sqrt{1 + \frac{1}{n} + \frac{(x_p - \bar{x})^2}{\Sigma(x_i - \bar{x})^2}} = 3.5232\sqrt{1 + \frac{1}{15} + \frac{(200 - 183.4667)^2}{11,867.73}} = 3.678$
$\hat{y}_p \pm t_{\alpha/2}\, s_{ind}$: 80.86 ± 2.160(3.678) = 80.86 ± 7.944 or 72.92 to 88.80
44. a.
$\bar{x} = 57$, $\Sigma(x_i - \bar{x})^2 = 7648$, s2 = 1.88, s = 1.37
$s_{\hat{y}_p} = s\sqrt{\frac{1}{n} + \frac{(x_p - \bar{x})^2}{\Sigma(x_i - \bar{x})^2}} = 1.37\sqrt{\frac{1}{7} + \frac{(52.5 - 57)^2}{7648}} = 0.52$
$\hat{y}_p \pm t_{\alpha/2}\, s_{\hat{y}_p}$: 13.08 ± 2.571(.52) = 13.08 ± 1.34 or 11.74 to 14.42 or $11,740 to $14,420
b.
sind = 1.47
13.08 ± 2.571(1.47) = 13.08 ± 3.78 or 9.30 to 16.86 or $9,300 to $16,860
c.
Yes, $20,400 is much larger than anticipated.
d.
Any deductions exceeding the $16,860 upper limit could suggest an audit.
45. a.
ŷ = 1246.67 + 7.6(500) = $5046.67
b.
$\bar{x} = 575$, $\Sigma(x_i - \bar{x})^2 = 93,750$, s2 = MSE = 58,333.33, s = 241.52
$s_{ind} = s\sqrt{1 + \frac{1}{n} + \frac{(x_p - \bar{x})^2}{\Sigma(x_i - \bar{x})^2}} = 241.52\sqrt{1 + \frac{1}{6} + \frac{(500 - 575)^2}{93,750}} = 267.50$
$\hat{y}_p \pm t_{\alpha/2}\, s_{ind}$: 5046.67 ± 4.604(267.50) = 5046.67 ± 1231.57 or $3815.10 to $6278.24
c.
Based on one month, $6000 is not out of line since $3815.10 to $6278.24 is the prediction interval. However, a sequence of five to seven months with consistently high costs should cause concern.
46. a.
Summations needed to compute the slope and y-intercept are:
Σxi = 277, Σyi = 2281.7, $\Sigma(x_i - \bar{x})(y_i - \bar{y}) = 6003.41$, $\Sigma(x_i - \bar{x})^2 = 1032.1$
$b_1 = \frac{\Sigma(x_i - \bar{x})(y_i - \bar{y})}{\Sigma(x_i - \bar{x})^2} = \frac{6003.41}{1032.1} = 5.816694$
$b_0 = \bar{y} - b_1\bar{x} = 228.17 - (5.816694)(27.7) = 67.047576$
ŷ = 67.0476 + 5.8167x
b.
SST = 39,065.14 SSE = 4145.141 SSR = 34,920.000 r2 = SSR/SST = 34,920.000/39,065.141 = 0.894 The estimated regression equation explained 89.4% of the variability in y; a very good fit.
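As a cross-check on parts (a) and (b), the slope, intercept, and coefficient of determination can be recomputed directly from the summations quoted in the solution. This is only a sketch; the function names are illustrative, and the means (27.7 and 228.17) are the values used in the solution's own arithmetic.

```python
def least_squares_from_sums(sxy, sxx, x_bar, y_bar):
    """Slope and intercept from sums of cross-products and squares."""
    b1 = sxy / sxx                 # slope
    b0 = y_bar - b1 * x_bar        # y-intercept
    return b0, b1

def r_squared(ssr, sst):
    """Coefficient of determination as SSR / SST."""
    return ssr / sst

# Values quoted in Exercise 46
b0, b1 = least_squares_from_sums(sxy=6003.41, sxx=1032.1, x_bar=27.7, y_bar=228.17)
r2 = r_squared(ssr=34920.000, sst=39065.141)
```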
c.
s2 = MSE = 4145.141/8 = 518.143
$s = \sqrt{518.143} = 22.76$
$s_{\hat{y}_p} = s\sqrt{\frac{1}{n} + \frac{(x_p - \bar{x})^2}{\Sigma(x_i - \bar{x})^2}} = 22.76\sqrt{\frac{1}{10} + \frac{(35 - 27.7)^2}{1032.1}} = 8.86$
ŷ = 67.0476 + 5.8167x = 67.0476 + 5.8167(35) = 270.63
$\hat{y}_p \pm t_{\alpha/2}\, s_{\hat{y}_p}$: 270.63 ± 2.262(8.86) = 270.63 ± 20.04 or 250.59 to 290.67
d.
$s_{ind} = s\sqrt{1 + \frac{1}{n} + \frac{(x_p - \bar{x})^2}{\Sigma(x_i - \bar{x})^2}} = 22.76\sqrt{1 + \frac{1}{10} + \frac{(35 - 27.7)^2}{1032.1}} = 24.42$
$\hat{y}_p \pm t_{\alpha/2}\, s_{ind}$: 270.63 ± 2.262(24.42) = 270.63 ± 55.24 or 215.39 to 325.87
47. a.
Using Excel's Regression tool, the estimated regression equation is: yˆ = −7.0222 + 1.5873x or yˆ = −7.02 + 1.59x
b.
The residuals calculated using yˆ = −7.02 + 1.59x are 3.48, -2.47, -4.83, -1.60, and 5.22
c.
[Residual plot: Residuals versus x]
With only 5 observations it is difficult to determine if the assumptions are satisfied; however, the plot does suggest curvature in the residuals, which would indicate that the error term assumptions are not satisfied. The scatter diagram for these data also indicates that the underlying relationship between x and y may be curvilinear.
d.
$\bar{x} = 14$, s = 4.8765

xi    xi − x̄    (xi − x̄)²    (xi − x̄)²/Σ(xi − x̄)²    hi       s(yi − ŷi)    yi − ŷi    Standardized Residual
 6    -8         64           .5079                    .7079    2.6356         3.48       1.32
11    -3          9           .0714                    .2714    4.1625        -2.47       -.59
15     1          1           .0079                    .2079    4.3401        -4.83      -1.11
18     4         16           .1270                    .3270    4.0005        -1.60       -.40
20     6         36           .2857                    .4857    3.4972         5.22       1.49
Total           126
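The columns of the table in part (d) follow from three steps: the leverage hi = 1/n + (xi − x̄)²/Σ(xi − x̄)², the standard deviation of each residual s√(1 − hi), and the ratio of the residual to that standard deviation. A short Python sketch of those steps, using the x values, residuals, and s quoted in the solution (variable names are illustrative):

```python
import math

# Data quoted in Exercise 47
x = [6, 11, 15, 18, 20]
residuals = [3.48, -2.47, -4.83, -1.60, 5.22]
s = 4.8765  # standard error of the estimate

n = len(x)
x_bar = sum(x) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)

# Leverage of each observation
leverages = [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]

# Standard deviation of each residual
resid_std_devs = [s * math.sqrt(1 - h) for h in leverages]

# Standardized residuals
standardized = [e / sd for e, sd in zip(residuals, resid_std_devs)]
```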
e.
[Standardized residual plot: Standardized Residuals versus x]
The shape of the standardized residual plot is the same shape as the residual plot. The conclusions reached in part (c) are also appropriate here.
48. a.
Using Excel's Regression tool, the estimated regression equation is: yˆ = 2.322 + 0.6366x or yˆ = 2.32 + 0.64x
b.
[Residual plot: Residuals versus x]
The assumption that the variance is the same for all values of x is questionable. The variance appears to increase for larger values of x.
49. a.
Using Excel's Regression tool, the estimated regression equation is: yˆ = 29.3991 + 1.5475x or yˆ = 29.40 + 1.55x
b.
Significant relationship: Significance F (or P-value) < α = .05
c.
[Residual plot: Residuals versus x]
d.
The residual plot here leads us to question the assumption of a linear relationship between x and y. Even though the relationship is significant at the α = .05 level, it would be extremely dangerous to extrapolate beyond the range of the data (e.g., x > 20).
50. a. From the solution to Exercise 9 we know that ŷ = 80 + 4x
[Residual plot: Residuals versus x]
b.
The assumptions concerning the error terms appear reasonable.
51. a.
Let x = return on equity (ROE) and y = price/earnings (P/E) ratio. ŷ = −32.13 + 3.22x
b.
[Standardized residual plot: Standardized Residuals versus x]
c.
There is an unusual trend in the residuals. The assumptions concerning the error term appear questionable.
52.
No. Regression or correlation analysis can never prove that two variables are causally related.
53.
The estimate of a mean value is an estimate of the average of all y values associated with the same x. The estimate of an individual y value is an estimate of only one of the y values associated with a particular x.
54.
To determine whether or not there is a significant relationship between x and y. However, if we reject H0: β1 = 0, it does not imply a good fit.
55. a.
The Excel output is shown below:

Regression Statistics
Multiple R           0.8624
R Square             0.7438
Adjusted R Square    0.7118
Standard Error       1.4193
Observations         10

ANOVA
             df    SS         MS         F          Significance F
Regression    1    46.7838    46.7838    23.2233    0.0013
Residual      8    16.1162     2.0145
Total         9    62.9

             Coefficients    Standard Error    t Stat    P-value
Intercept    9.2649          1.0991            8.4293    2.99E-05
Shares       0.7105          0.1474            4.8191    0.0013
b.
Since the p-value corresponding to F = 23.223 is .001 < α = .05, the relationship is significant.
c.
r 2 = .744; a good fit. The least squares line explained 74.4% of the variability in Price.
d.
yˆ = 9.26 + .711(6) = 13.53
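The MS, F, and R Square entries in the Excel output for part (a) can be reproduced from the two sums of squares and the sample size. The sketch below assumes one independent variable (p = 1); the function name and dictionary keys are illustrative choices, not Excel terminology.

```python
def anova_summary(ssr, sse, n, p=1):
    """Recompute ANOVA quantities from the regression and residual sums of squares."""
    sst = ssr + sse
    msr = ssr / p                  # mean square due to regression
    mse = sse / (n - p - 1)        # mean square error
    return {
        "SST": sst,
        "MSR": msr,
        "MSE": mse,
        "F": msr / mse,
        "R2": ssr / sst,
    }

# Sums of squares from the Exercise 55 output
summary = anova_summary(ssr=46.7838, sse=16.1162, n=10)
```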
56. a.
The Excel output is shown below:

Regression Statistics
Multiple R           0.9586
R Square             0.9189
Adjusted R Square    0.9116
Standard Error       11.0428
Observations         13

ANOVA
             df    SS            MS          F           Significance F
Regression    1    15208.3982    15208.4     124.7162    2.42673E-07
Residual     11     1341.3849    121.9441
Total        12    16549.7831

                                        Coefficients    Standard Error    t Stat     P-value
Intercept                               -3.8338         5.9031            -0.6495    0.5294
Common Shares Outstanding (millions)     0.2957         0.0265            11.1676    2.43E-07
b.
yˆ = −3.83 + .296(150) = 40.57 ; approximately 40.6 million shares of options grants outstanding.
c.
r 2 = .919; a very good fit. The least squares line explained 91.9% of the variability in Options.
57. a.
The Excel output is shown below:

Regression Statistics
Multiple R           0.6852
R Square             0.4695
Adjusted R Square    0.4032
Standard Error       2.6641
Observations         10

ANOVA
             df    SS         MS         F         Significance F
Regression    1    50.2554    50.2554    7.0807    0.0288
Residual      8    56.7806     7.0976
Total         9    107.036

             Coefficients    Standard Error    t Stat    P-value
Intercept    0.2747          0.9004            0.3051    0.7681
S&P 500      0.9498          0.3569            2.6609    0.0288
b.
Since the p-value = 0.029 is less than α = .05, the relationship is significant.
c.
r2 = .470. The least squares line does not provide a very good fit.
d.
Woolworth has higher risk with a market beta of 1.25.
58. a.
[Scatter diagram: High Temperature versus Low Temperature]
b.
It appears that there is a positive linear relationship between the two variables.
c.
The Excel output is shown below:

Regression Statistics
Multiple R           0.8837
R Square             0.7810
Adjusted R Square    0.7688
Standard Error       5.2846
Observations         20

ANOVA
             df    SS           MS          F          Significance F
Regression    1    1792.2734    1792.273    64.1783    2.40264E-07
Residual     18     502.6766      27.9265
Total        19    2294.95

             Coefficients    Standard Error    t Stat    P-value
Intercept    23.8987         6.4812            3.6874    0.0017
Low           0.8980         0.1121            8.0111    2.4E-07
d.
Since the p-value corresponding to F = 64.18 is .000 < α = .05, the relationship is significant.
e.
r 2 = .781; a good fit. The least squares line explained 78.1% of the variability in high temperature.
f.
$r_{xy} = +\sqrt{.781} = +.88$ (positive because the estimated slope, b1 = 0.8980, is positive)
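In simple linear regression the sample correlation coefficient is the square root of the coefficient of determination, carrying the sign of the estimated slope. A minimal Python sketch of that rule follows; the function name is illustrative.

```python
import math

def correlation_from_r2(r2, slope):
    """Square root of r2, given the sign of the estimated slope b1."""
    return math.copysign(math.sqrt(r2), slope)

# Exercise 58: R Square = 0.7810, slope for Low = 0.8980
r_xy = correlation_from_r2(0.7810, 0.8980)

# A negative slope gives a negative correlation, e.g. the Exercise 61 output
# (R Square = 0.7109, slope for Distance to Work = -0.3442)
r_negative = correlation_from_r2(0.7109, -0.3442)
```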
59.
The Excel output is shown below:

Regression Statistics
Multiple R           0.9253
R Square             0.8562
Adjusted R Square    0.8382
Standard Error       4.2496
Observations         10

ANOVA
             df    SS             MS          F          Significance F
Regression    1    860.0509486    860.0509    47.6238    0.0001
Residual      8    144.4740514     18.0593
Total         9    1004.525

                Coefficients    Standard Error    t Stat    P-value
Intercept       10.5280         3.7449            2.8113    0.0228
Weekly Usage     0.9534         0.1382            6.9010    0.0001
a.
yˆ = 10.528 + .9534x
b.
Since the p-value corresponding to F = 47.62 is .0001 < α = .05, we reject H0: β1 = 0.
c.
Using the PredInt macro, the 95% prediction interval is 28.74 to 49.52 or $2874 to $4952
d.
Yes, since the expected expense is $3913.
60. a.
The Excel output is shown below:

Regression Statistics
Multiple R           0.8597
R Square             0.7391
Adjusted R Square    0.6739
Standard Error       1.4891
Observations         6

ANOVA
             df    SS         MS         F          Significance F
Regression    1    25.1304    25.1304    11.3333    0.0281
Residual      4     8.8696     2.2174
Total         5    34

              Coefficients    Standard Error    t Stat     P-value
Intercept     22.1739         1.6527            13.4164    0.0002
Line Speed    -0.1478         0.0439            -3.3665    0.0281
b.
Since the p-value corresponding to F = 11.33 is .0281 < α = .05, the relationship is significant.
c.
r 2 = .739; a good fit. The least squares line explained 73.9% of the variability in the number of defects.
d.
Using the PredInt macro, the 95% confidence interval is 12.294 to 17.271.
61. a.
The scatter diagram follows:
[Scatter diagram: Days Absent versus Distance to Work]
A negative linear relationship appears to be reasonable.
b.
The Excel output is shown below:

Regression Statistics
Multiple R           0.8431
R Square             0.7109
Adjusted R Square    0.6747
Standard Error       1.2894
Observations         10

ANOVA
             df    SS         MS         F          Significance F
Regression    1    32.6993    32.6993    19.6677    0.0022
Residual      8    13.3007     1.6626
Total         9    46

                    Coefficients    Standard Error    t Stat     P-value
Intercept            8.0978         0.8088            10.0119    8.41E-06
Distance to Work    -0.3442         0.0776            -4.4348    0.0022
c.
Since the p-value corresponding to F = 19.67 is .0022 < α = .05, we reject H0: β1 = 0.
d.
r2 = .711. The estimated regression equation explained 71.1% of the variability in y; this is a reasonably good fit.
e.
Using the PredInt macro, the 95% confidence interval is 5.195 to 7.559 or approximately 5.2 to 7.6 days.
62. a.
The Excel output is shown below:

Regression Statistics
Multiple R           0.9341
R Square             0.8725
Adjusted R Square    0.8566
Standard Error       75.4983
Observations         10

ANOVA
             df    SS        MS        F          Significance F
Regression    1    312050    312050    54.7456    7.62662E-05
Residual      8     45600      5700
Total         9    357650

             Coefficients    Standard Error    t Stat    P-value
Intercept    220             58.4808           3.7619    0.0055
Age          131.6667        17.7951           7.3990    7.63E-05
b.
Since the p-value corresponding to F = 54.75 is .000 < α = .05, we reject H0: β1 = 0.
c.
r2 = .873. The least squares line provided a very good fit.
d.
Using the PredInt macro, the 95% prediction interval is 559.5 to 933.9 or $559.50 to $933.90
63. a.
The Excel output is shown below:

Regression Statistics
Multiple R           0.9369
R Square             0.8777
Adjusted R Square    0.8624
Standard Error       7.5231
Observations         10

ANOVA
             df    SS             MS          F          Significance F
Regression    1    3249.720752    3249.721    57.4182    6.43959E-05
Residual      8     452.7792483     56.5974
Total         9    3702.5

                  Coefficients    Standard Error    t Stat    P-value
Intercept         5.8470          7.9717            0.7335    0.4842
Hours Studying    0.8295          0.1095            7.5775    6.44E-05
b.
Since the p-value corresponding to F = 57.42 is .000 < α = .05, we reject H0: β1 = 0.
c.
84.65 points
d.
Using the PredInt macro, the 95% prediction interval is 65.35 to 103.96
64. a.
The Excel output is shown below:

Regression Statistics
Multiple R           0.4659
R Square             0.2171
Adjusted R Square    0.1736
Standard Error       0.2088
Observations         20

ANOVA
             df    SS        MS        F         Significance F
Regression    1    0.2175    0.2175    4.9901    0.0384
Residual     18    0.7845    0.0436
Total        19    1.002

                         Coefficients    Standard Error    t Stat     P-value
Intercept                -0.4710         0.5842            -0.8061    0.4307
Adjusted Gross Income    3.86778E-05     1.73143E-05        2.2339    0.0384
b.
Since the p-value = 0.0384 is less than α = .05, the relationship is significant.
c.
r2 = .217. The least squares line does not provide a very good fit.
d.
Using the PredInt macro, the 95% confidence interval is .7729 to .9927.
Chapter 13 Multiple Regression Learning Objectives
1.
Understand how multiple regression analysis can be used to develop relationships involving one dependent variable and several independent variables.
2.
Be able to interpret the coefficients in a multiple regression analysis.
3.
Know the assumptions necessary to conduct statistical tests involving the hypothesized regression model.
4.
Understand the role of Excel in performing multiple regression analysis.
5.
Be able to interpret and use Excel's Regression tool output to develop the estimated regression equation.
6.
Be able to determine how good a fit is provided by the estimated regression equation.
7.
Be able to test for the significance of the regression equation.
8.
Understand how multicollinearity affects multiple regression analysis.
9.
Know how residual analysis can be used to make a judgement as to the appropriateness of the model, identify outliers, and determine which observations are influential.
Solutions:
1.
a.
b1 = .5906 is an estimate of the change in y corresponding to a 1 unit change in x1 when x2 is held constant. b2 = .4980 is an estimate of the change in y corresponding to a 1 unit change in x2 when x1 is held constant.
2.
a.
The Excel output is shown below:

Regression Statistics
Multiple R           0.8124
R Square             0.6600
Adjusted R Square    0.6175
Standard Error       25.4009
Observations         10

ANOVA
             df    SS             MS          F          Significance F
Regression    1    10021.24739    10021.25    15.5318    0.0043
Residual      8     5161.652607     645.2066
Total         9    15182.9

             Coefficients    Standard Error    t Stat    P-value
Intercept    45.0594         25.4181           1.7727    0.1142
X1            1.9436          0.4932           3.9410    0.0043

An estimate of y when x1 = 45 is ŷ = 45.0594 + 1.9436(45) = 132.52
b.
The Excel output is shown below:

Regression Statistics
Multiple R           0.4707
R Square             0.2215
Adjusted R Square    0.1242
Standard Error       38.4374
Observations         10

ANOVA
             df    SS            MS          F         Significance F
Regression    1     3363.4142    3363.414    2.2765    0.1698
Residual      8    11819.4858    1477.436
Total         9    15182.9

             Coefficients    Standard Error    t Stat    P-value
Intercept    85.2171         38.3520           2.2220    0.0570
X2            4.3215          2.8642           1.5088    0.1698

An estimate of y when x2 = 15 is ŷ = 85.2171 + 4.3215(15) = 150.04
c.
The Excel output is shown below:

Regression Statistics
Multiple R           0.9620
R Square             0.9255
Adjusted R Square    0.9042
Standard Error       12.7096
Observations         10

ANOVA
             df    SS             MS          F          Significance F
Regression    2    14052.15497    7026.077    43.4957    0.0001
Residual      7     1130.745026    161.535
Total         9    15182.9

             Coefficients    Standard Error    t Stat     P-value
Intercept    -18.3683        17.97150328       -1.0221    0.3408
X1             2.0102         0.2471            8.1345    8.19E-05
X2             4.7378         0.9484            4.9954    0.0016

An estimate of y when x1 = 45 and x2 = 15 is ŷ = -18.3683 + 2.0102(45) + 4.7378(15) = 143.16
3.
a.
b1 = 3.8 is an estimate of the change in y corresponding to a 1 unit change in x1 when x2, x3, and x4 are held constant. b2 = -2.3 is an estimate of the change in y corresponding to a 1 unit change in x2 when x1, x3, and x4 are held constant.
b3 = 7.6 is an estimate of the change in y corresponding to a 1 unit change in x3 when x1, x2, and x4 are held constant. b4 = 2.7 is an estimate of the change in y corresponding to a 1 unit change in x4 when x1, x2, and x3 are held constant.
4.
a.
ŷ = 25 + 10(15) + 8(10) = 255; sales estimate: $255,000
b.
Sales can be expected to increase by $10 for every dollar increase in inventory investment when advertising expenditure is held constant. Sales can be expected to increase by $8 for every dollar increase in advertising expenditure when inventory investment is held constant.
5.
a.
The Excel output is shown below:

Regression Statistics
Multiple R           0.8078
R Square             0.6526
Adjusted R Square    0.5946
Standard Error       1.2152
Observations         8

ANOVA
             df    SS         MS         F          Significance F
Regression    1    16.6401    16.6401    11.2688    0.0153
Residual      6     8.8599     1.4767
Total         7    25.5

                                   Coefficients    Standard Error    t Stat     P-value
Intercept                          88.6377         1.5824            56.0159    2.174E-09
Television Advertising ($1000s)     1.6039         0.4778             3.3569    0.0153
b.
The Excel output is shown below:

Regression Statistics
Multiple R           0.9587
R Square             0.9190
Adjusted R Square    0.8866
Standard Error       0.6426
Observations         8

ANOVA
             df    SS         MS         F          Significance F
Regression    2    23.4354    11.7177    28.3778    0.0019
Residual      5     2.0646     0.4129
Total         7    25.5

                                   Coefficients    Standard Error    t Stat     P-value
Intercept                          83.2301         1.5739            52.8825    4.57E-08
Television Advertising ($1000s)     2.2902         0.3041             7.5319    0.0007
Newspaper Advertising ($1000s)      1.3010         0.3207             4.0567    0.0098
c.
No, it is 1.6039 in part (a) and 2.2902 above. In this exercise it represents the marginal change in revenue due to an increase in television advertising with newspaper advertising held constant.
d.
Revenue = 83.2301 + 2.2902(3.5) + 1.3010(1.8) = 93.59 or $93,590
6.
a.
The Excel output is shown below:

Regression Statistics
Multiple R           0.5579
R Square             0.3112
Adjusted R Square    0.2620
Standard Error       7.0000
Observations         16

ANOVA
             df    SS          MS          F         Significance F
Regression    1    309.9516    309.9516    6.3255    0.0247
Residual     14    686.0028     49.0002
Total        15    995.9544

                     Coefficients    Standard Error    t Stat    P-value
Intercept            49.7800         19.1062           2.6054    0.0208
Curb Weight (lb.)     0.0151          0.0060           2.5151    0.0247
b.
The Excel output is shown below:

Regression Statistics
Multiple R           0.9383
R Square             0.8804
Adjusted R Square    0.8620
Standard Error       3.0274
Observations         16

ANOVA
             df    SS             MS          F          Significance F
Regression    2    876.8049822    438.4025    47.8327    1.01401E-06
Residual     13    119.1493928      9.1653
Total        15    995.954375

                     Coefficients    Standard Error    t Stat     P-value
Intercept            80.4873         9.1393             8.8067    7.69E-07
Curb Weight (lb.)    -0.0031         0.0035            -0.8968    0.3861
Horsepower            0.1047         0.0133             7.8643    2.7E-06
c.
ŷ = 80.4873 - 0.0031(2910) + 0.1047(296) = 102
Note to instructor: The Excel output shows that Curb Weight is not very significant (p-value = .3861) given the effect of Horsepower. In Section 15.5, students will learn how to test for the significance of the individual parameters.
7.
a.
The Excel output is shown below:

Regression Statistics
Multiple R           0.9121
R Square             0.8318
Adjusted R Square    0.7838
Standard Error       51.1363
Observations         10

ANOVA
             df    SS            MS          F          Significance F
Regression    2    90548.0554    45274.03    17.3137    0.0019
Residual      7    18304.4446     2614.921
Total         9    108852.5

             Coefficients    Standard Error    t Stat     P-value
Intercept    356.1208        197.1740           1.8061    0.1139
Capacity      -0.0987          0.0459          -2.1524    0.0684
Comfort      122.8672         21.7998           5.6362    0.0008
b.
b1 = -.0987 is an estimate of the change in the price with respect to a 1 cubic inch change in capacity with the comfort rating held constant. b2 = 122.8672 is an estimate of the change in the price with respect to a 1 unit change in the comfort rating with the capacity held constant.
c.
ŷ = 356.1208 - .0987(4500) + 122.8672(4) = $403
8.
a.
The Excel output is shown below:

Regression Statistics
Multiple R           0.7629
R Square             0.5820
Adjusted R Square    0.5329
Standard Error       16.9770
Observations         20

ANOVA
             df    SS           MS          F          Significance F
Regression    2     6823.2072   3411.604    11.8368    0.0006
Residual     17     4899.7428    288.2202
Total        19    11722.95

                            Coefficients    Standard Error    t Stat     P-value
Intercept                   247.3579        110.4462           2.2396    0.0388
Safety Rating               -32.8445         13.9504          -2.3544    0.0308
Annual Expense Ratio (%)     34.5887         14.1294           2.4480    0.0255
b.
ŷ = 247.3579 − 32.8445(7.5) + 34.5887(2) = 70.2
9.
a.
The Excel output is shown below:

Regression Statistics
Multiple R           0.6182
R Square             0.3821
Adjusted R Square    0.2998
Standard Error       12.4169
Observations         18

ANOVA
             df    SS             MS          F         Significance F
Regression    2    1430.4194      715.2097    4.6388    0.0270
Residual     15    2312.6917      154.1794
Total        17    3743.111111

                      Coefficients    Standard Error    t Stat     P-value
Intercept             26.7067         51.6689            0.5169    0.6128
Average Class Size    -1.4298          0.9931           -1.4397    0.1705
Combined SAT Score     0.0757          0.0391            1.9392    0.0715
b.
ŷ = 26.7067 - 1.4298(20) + 0.0757(1000) = 73.8 or 73.8%
Note to instructor: the Excel output shows that Average Class Size is not very significant (p-value = .1705) given the effect of Combined SAT Score. In Section 15.5, students will learn how to test for the significance of the individual parameters.
10. a.
The Excel output is shown below:

Regression Statistics
Multiple R           0.9616
R Square             0.9246
Adjusted R Square    0.9188
Standard Error       226.6709
Observations         15

ANOVA
             df    SS              MS              F           Significance F
Regression    1    8192067.3605    8192067.3605    159.4418    1.13179E-08
Residual     13     667935.9155      51379.6858
Total        14    8860003.2760

             Coefficients    Standard Error    t Stat     P-value
Intercept    33.3352         83.0767            0.4013    0.6947
Cars          7.9840          0.6323           12.6270    1.13179E-08
b.
An increase of 1000 cars in service will result in an increase in revenue of $7.984 million.
c.
The Excel output is shown below:

Regression Statistics
Multiple R           0.9703
R Square             0.9416
Adjusted R Square    0.9318
Standard Error       207.7292
Observations         15

ANOVA
             df    SS              MS              F          Significance F
Regression    2    8342186.4020    4171093.2010    96.6618    3.98523E-08
Residual     12     517816.8740      43151.4062
Total        14    8860003.2760

             Coefficients    Standard Error    t Stat     P-value
Intercept    105.9727        85.5166            1.2392    0.2390
Cars           8.9427         0.7746           11.5451    7.42955E-08
Locations     -0.1914         0.1026           -1.8652    0.0868
11. a.
SSE = SST - SSR = 6,724.125 - 6,216.375 = 507.75
b.
$R^2 = \frac{SSR}{SST} = \frac{6,216.375}{6,724.125} = .924$
c.
$R_a^2 = 1 - (1 - R^2)\frac{n - 1}{n - p - 1} = 1 - (1 - .924)\frac{10 - 1}{10 - 2 - 1} = .902$
d.
The estimated regression equation provided an excellent fit.
12. a.
$R^2 = \frac{SSR}{SST} = \frac{14,052.2}{15,182.9} = .926$
b.
$R_a^2 = 1 - (1 - R^2)\frac{n - 1}{n - p - 1} = 1 - (1 - .926)\frac{10 - 1}{10 - 2 - 1} = .905$
c.
Yes; after adjusting for the number of independent variables in the model, we see that 90.5% of the variability in y has been accounted for.
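The adjusted coefficient of determination in Exercises 11 and 12 depends only on R2, the number of observations n, and the number of independent variables p. A small Python sketch of the formula, with an illustrative function name:

```python
def adjusted_r2(r2, n, p):
    """Adjusted multiple coefficient of determination.

    r2 : multiple coefficient of determination (SSR / SST)
    n  : number of observations
    p  : number of independent variables
    """
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

# Exercise 11: R2 = .924, n = 10, p = 2
ra2_ex11 = adjusted_r2(0.924, 10, 2)

# Exercise 12: R2 = .926, n = 10, p = 2
ra2_ex12 = adjusted_r2(0.926, 10, 2)
```

Because (n − 1)/(n − p − 1) is greater than 1 whenever p is at least 1, the adjusted value is never larger than R2 itself.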
13. a.
$R^2 = \frac{SSR}{SST} = \frac{1760}{1805} = .975$
b.
$R_a^2 = 1 - (1 - R^2)\frac{n - 1}{n - p - 1} = 1 - (1 - .975)\frac{30 - 1}{30 - 4 - 1} = .971$
c.
The estimated regression equation provided an excellent fit.
14. a.
$R^2 = \frac{SSR}{SST} = \frac{12,000}{16,000} = .75$
b.
$R_a^2 = 1 - (1 - R^2)\frac{n - 1}{n - p - 1} = 1 - .25\left(\frac{9}{7}\right) = .68$
c.
The adjusted coefficient of determination shows that 68% of the variability has been explained by the two independent variables; thus, we conclude that the model does not explain a large amount of variability.
15. a.
$R^2 = \frac{SSR}{SST} = \frac{23.435}{25.5} = .919$
$R_a^2 = 1 - (1 - R^2)\frac{n - 1}{n - p - 1} = 1 - (1 - .919)\frac{8 - 1}{8 - 2 - 1} = .887$
b.
Multiple regression analysis is preferred since both R2 and Ra2 show an increased percentage of the variability of y explained when both independent variables are used.
16.
Note: the Excel output is shown with the solution to Exercise 6.
a.
No; R Square = .3112
b.
Multiple regression analysis is preferred since both R Square and Adjusted R Square show an increased percentage of the variability of y explained when both independent variables are used.
17. a.
R Square = .3821, Adjusted R Square = .2998
b.
The fit is not very good.
18.
Note: The Excel output is shown with the solution to Exercise 10.
a.
R Square = .9416, Adjusted R Square = .9318
b.
The fit is very good.
19. a.
MSR = SSR/p = 6,216.375/2 = 3,108.188
$MSE = \frac{SSE}{n - p - 1} = \frac{507.75}{10 - 2 - 1} = 72.536$
b.
F = MSR/MSE = 3,108.188/72.536 = 42.85
F.05 = 4.74 (2 degrees of freedom numerator and 7 denominator)
Since F = 42.85 > F.05 = 4.74, the overall model is significant.
c.
t = .5906/.0813 = 7.26 t.025 = 2.365 (7 degrees of freedom) Since t = 7.26 > t.025 = 2.365, β1 is significant.
d.
t = .4980/.0567 = 8.78 Since t = 8.78 > t.025 = 2.365, β2 is significant.
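The test statistics in Exercise 19 are simple ratios, so they are easy to recompute. The sketch below uses the sums of squares, coefficients, and standard errors quoted in the solution; the critical values 4.74 and 2.365 are the table values cited above and are entered by hand rather than computed. Function names are illustrative.

```python
def f_statistic(ssr, sse, n, p):
    """F = MSR / MSE for a regression with p independent variables."""
    msr = ssr / p
    mse = sse / (n - p - 1)
    return msr / mse

def t_statistic(coefficient, std_error):
    """t = b_i / s_{b_i} for an individual coefficient."""
    return coefficient / std_error

# Exercise 19 inputs
f_value = f_statistic(ssr=6216.375, sse=507.75, n=10, p=2)
t_b1 = t_statistic(0.5906, 0.0813)
t_b2 = t_statistic(0.4980, 0.0567)

# Critical values quoted in the solution (alpha = .05)
F_CRITICAL = 4.74   # 2 numerator and 7 denominator degrees of freedom
T_CRITICAL = 2.365  # 7 degrees of freedom

overall_significant = f_value > F_CRITICAL
b1_significant = abs(t_b1) > T_CRITICAL
b2_significant = abs(t_b2) > T_CRITICAL
```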
20.
A portion of the Excel output is shown below.

Regression Statistics
Multiple R           0.9620
R Square             0.9255
Adjusted R Square    0.9042
Standard Error       12.7096
Observations         10

ANOVA
             df    SS             MS          F          Significance F
Regression    2    14052.15497    7026.077    43.4957    0.0001
Residual      7     1130.745026    161.535
Total         9    15182.9

             Coefficients    Standard Error    t Stat     P-value
Intercept    -18.36826758    17.97150328       -1.0221    0.3408
X1             2.0102         0.2471            8.1345    8.19E-05
X2             4.7378         0.9484            4.9954    0.0016
a.
Since the p-value corresponding to F = 43.4957 is .0001 < α = .05, we reject H0: β1 = β2 = 0; there is a significant relationship.
b.
Since the p-value corresponding to t = 8.1345 is .000 < α = .05, we reject H0: β1 = 0; β1 is significant.
c.
Since the p-value corresponding to t = 4.9954 is .0016 < α = .05, we reject H0: β2 = 0; β2 is significant.
21. a.
In the two independent variable case the coefficient of x1 represents the expected change in y corresponding to a one unit increase in x1 when x2 is held constant. In the single independent variable case the coefficient of x1 represents the expected change in y corresponding to a one unit increase in x1.
b.
Yes. If x1 and x2 are correlated, one would expect a change in the coefficient of x1 when x2 is dropped from the model.
22. a.
SSE = SST - SSR = 16000 - 12000 = 4000
$s^2 = \frac{SSE}{n - p - 1} = \frac{4000}{7} = 571.43$
$MSR = \frac{SSR}{p} = \frac{12000}{2} = 6000$
b.
F = MSR/MSE = 6000/571.43 = 10.50
F.05 = 4.74 (2 degrees of freedom numerator and 7 denominator)
Since F = 10.50 > F.05 = 4.74, we reject H0. There is a significant relationship among the variables.
23. a.
F = 28.38
F.01 = 13.27 (2 degrees of freedom numerator and 5 denominator)
Since F > F.01 = 13.27, reject H0.
Alternatively, the p-value of .002 leads to the same conclusion.
b.
t = 7.53
t.025 = 2.571
Since t > t.025 = 2.571, β1 is significant and x1 should not be dropped from the model.
c.
t = 4.06
t.025 = 2.571
Since t > t.025 = 2.571, β2 is significant and x2 should not be dropped from the model.
24.
Note: The Excel output is shown below:

Regression Statistics
Multiple R           0.9383
R Square             0.8804
Adjusted R Square    0.8620
Standard Error       3.0274
Observations         16

ANOVA
             df    SS             MS          F          Significance F
Regression    2    876.8049822    438.4025    47.8327    1.01401E-06
Residual     13    119.1493928      9.1653
Total        15    995.954375

                     Coefficients    Standard Error    t Stat     P-value
Intercept            80.4873         9.1393             8.8067    7.69E-07
Curb Weight (lb.)    -0.0031         0.0035            -0.8968    0.3861
Horsepower            0.1047         0.0133             7.8643    2.7E-06
a.
F = 47.8327
F.05 = 3.81 (2 degrees of freedom numerator and 13 denominator)
Since F = 47.8327 > F.05 = 3.81, we reject H0: β1 = β2 = 0. Alternatively, since the p-value = .000 < α = .05 we can reject H0.
b.
For Curb Weight:
H0: β1 = 0
Ha: β1 ≠ 0
Since the p-value = 0.3861 > α = 0.05, we cannot reject H0.
For Horsepower:
H0: β2 = 0
Ha: β2 ≠ 0
Since the p-value = 0.000 < α = 0.05, we can reject H0.
25. a.
The Excel output is shown below:

Regression Statistics
Multiple R           0.6867
R Square             0.4715
Adjusted R Square    0.3902
Standard Error       5.4561
Observations         16

ANOVA
             df    SS             MS          F         Significance F
Regression    2    345.2765787    172.6383    5.7992    0.0158
Residual     13    387.0034213     29.7695
Total        15    732.28

                           Coefficients    Standard Error    t Stat    P-value
Intercept                  6.0382          4.5893            1.3157    0.2110
Gross Profit Margin (%)    0.6916          0.2133            3.2421    0.0064
Sales Growth (%)           0.2648          0.1871            1.4154    0.1805
b.
Since the p-value = 0.0158 < α = 0.05, there is a significant relationship among the variables.
c.
For Gross Profit Margin (%): Since the p-value = 0.0064 < α = 0.05, Profit% is significant. For Sales Growth (%): Since the p-value = 0.1805 > α = 0.05, Sales% is not significant.
26.
Note: The Excel output is shown below:

Regression Statistics
Multiple R           0.9703
R Square             0.9416
Adjusted R Square    0.9318
Standard Error       207.7292
Observations         15

ANOVA
             df    SS              MS              F          Significance F
Regression    2    8342186.4020    4171093.2010    96.6618    3.98523E-08
Residual     12     517816.8740      43151.4062
Total        14    8860003.2760

             Coefficients    Standard Error    t Stat     P-value
Intercept    105.9727        85.5166            1.2392    0.2390
Cars           8.9427         0.7746           11.5451    7.42955E-08
Locations     -0.1914         0.1026           -1.8652    0.0868
a.
Since the p-value corresponding to F = 96.6618 is 0.000 < α = .05, there is a significant relationship among the variables.
b.
For Cars: Since the p-value = 0.000 < α = 0.05, Cars is significant
c.
For Location: Since the p-value = 0.0868 > α = 0.05, Location is not significant
27. a.
ŷ = 29.1270 + .5906(180) + .4980(310) = 289.8150
b.
The point estimate for an individual value is ŷ = 289.8150, the same as the point estimate of the mean value.
28. a.
Using the PredInt macro, the 95% confidence interval is 132.16 to 154.16.
b.
Using the PredInt macro, the 95% prediction interval is 111.13 to 175.18.
29. a.
ŷ = 83.2 + 2.29(3.5) + 1.30(1.8) = 93.555 or $93,555
Note: In Exercise 5b, the Excel output also shows that b0 = 83.2301, b1 = 2.2902, and b2 = 1.3010; hence, ŷ = 83.2301 + 2.2902x1 + 1.3010x2. Using this estimated regression equation, we obtain ŷ = 83.2301 + 2.2902(3.5) + 1.3010(1.8) = 93.588 or $93,588. The difference, $93,588 - $93,555 = $33, is simply due to the fact that additional significant digits are used in the computations.
b.
Using the PredInt macro, the confidence interval estimate: 92.840 to 94.335 or $92,840 to $94,335
c.
Using the PredInt macro, the prediction interval estimate: 91.774 to 95.401 or $91,774 to $95,401
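The $33 rounding difference noted in part (a) can be reproduced by evaluating the estimated regression equation twice, once with the rounded coefficients and once with the coefficients from the Excel output. The helper below is an illustrative sketch, not part of the PredInt macro.

```python
def predict(intercept, coefficients, values):
    """Evaluate y-hat = b0 + b1*x1 + ... + bp*xp."""
    return intercept + sum(b * x for b, x in zip(coefficients, values))

x_values = [3.5, 1.8]  # television and newspaper advertising, in $1000s

# Rounded coefficients used in part (a)
y_hat_rounded = predict(83.2, [2.29, 1.30], x_values)

# Coefficients with the additional digits shown in the Excel output
y_hat_full = predict(83.2301, [2.2902, 1.3010], x_values)

# Revenue is in $1000s, so convert the gap to dollars
difference_in_dollars = round((y_hat_full - y_hat_rounded) * 1000)
```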
30. a.
Since Curb Weight is not statistically significant (see Exercise 24), we will use an estimated regression equation which uses only Horsepower to predict the speed at 1/4 mile. The Excel output is shown below:

Regression Statistics
Multiple R           0.9343
R Square             0.8730
Adjusted R Square    0.8639
Standard Error       3.0062
Observations         16

ANOVA
             df    SS          MS         F          Significance F
Regression    1    869.4340    869.434    96.2064    1.18632E-07
Residual     14    126.5204      9.0372
Total        15    995.9544

              Coefficients    Standard Error    t Stat     P-value
Intercept     72.6500         2.6555            27.3586    1.49E-13
Horsepower     0.0968         0.0099             9.8085    1.19E-07

Using the PredInt macro, the point estimate is a speed of 101.29 miles per hour.
b.
Using the PredInt macro, the 95% confidence interval is 99.490 to 103.089 miles per hour.
c.
Using the PredInt macro, the 95% prediction interval is 94.596 to 107.984 miles per hour.
31. a.
Using the PredInt macro, the 95% confidence interval is 58.37% to 75.03%.
b.
Using the PredInt macro, the 95% prediction interval is 35.24% to 90.59%.
32. a.
E(y) = β0 + β1 x1 + β2 x2 where x2 = 0 if level 1 and 1 if level 2
b.
E(y) = β0 + β1 x1 + β2(0) = β0 + β1 x1
c.
E(y) = β0 + β1 x1 + β2(1) = β0 + β1 x1 + β2
d.
β2 = E(y | level 2) - E(y | level 1) β1 is the change in E(y) for a 1 unit change in x1 holding x2 constant.
33. a.
two
b.
E(y) = β0 + β1 x1 + β2 x2 + β3 x3 where

x2    x3    Level
0     0     1
1     0     2
0     1     3
c.
E(y | level 1) = β0 + β1 x1 + β2(0) + β3(0) = β0 + β1 x1
E(y | level 2) = β0 + β1 x1 + β2(1) + β3(0) = β0 + β1 x1 + β2
E(y | level 3) = β0 + β1 x1 + β2(0) + β3(1) = β0 + β1 x1 + β3
β2 = E(y | level 2) - E(y | level 1)
β3 = E(y | level 3) - E(y | level 1)
β1 is the change in E(y) for a 1 unit change in x1 holding x2 and x3 constant.
34. a.
$15,300
b.
Estimate of sales = 10.1 - 4.2(2) + 6.8(8) + 15.3(0) = 56.1 or $56,100
c.
Estimate of sales = 10.1 - 4.2(1) + 6.8(3) + 15.3(1) = 41.6 or $41,600
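The estimates in parts (b) and (c) come from the same equation evaluated at different inputs, and the answer to part (a) is the coefficient on the 0/1 variable. The Python sketch below assumes the third variable is the dummy variable, as the inputs 0 and 1 in the solution suggest; the parameter names x1, x2, and x3 are placeholders because the solution does not name the variables.

```python
def estimated_sales(x1, x2, x3):
    """Estimated sales in $1000s; x3 is assumed to be a 0/1 dummy variable."""
    return 10.1 - 4.2 * x1 + 6.8 * x2 + 15.3 * x3

part_b = estimated_sales(2, 8, 0)
part_c = estimated_sales(1, 3, 1)

# Holding x1 and x2 fixed, switching the dummy from 0 to 1 shifts the
# estimate by its coefficient, 15.3 (that is, $15,300).
dummy_effect = estimated_sales(1, 3, 1) - estimated_sales(1, 3, 0)
```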
35. a.
Let Type = 0 if a mechanical repair
Type = 1 if an electrical repair
The Excel output is shown below:

Regression Statistics
Multiple R           0.2952
R Square             0.0871
Adjusted R Square    -0.0270
Standard Error       1.0934
Observations         10

ANOVA
             df    SS        MS        F         Significance F
Regression    1    0.9127    0.9127    0.7635    0.4077
Residual      8    9.5633    1.1954
Total         9    10.476

             Coefficients    Standard Error    t Stat    P-value
Intercept    3.45            0.5467            6.3109    0.0002
Type         0.6167          0.7058            0.8738    0.4077
b.
The estimated regression equation did not provide a good fit. In fact, the p-value of .4077 shows that the relationship is not significant for any reasonable value of α.
c.
Person = 0 if Bob Jones performed the service and Person = 1 if Dave Newton performed the service. The Excel output is shown below:

Regression Statistics
Multiple R           0.7816
R Square             0.6109
Adjusted R Square    0.5623
Standard Error       0.7138
Observations         10

ANOVA
             df    SS        MS        F          Significance F
Regression    1    6.4       6.4       12.5613    0.0076
Residual      8    4.076     0.5095
Total         9    10.476

             Coefficients    Standard Error    t Stat     P-value
Intercept     4.62           0.3192            14.4729    5.08E-07
Person       -1.6            0.4514            -3.5442    0.0076
d.
We see that 61.1% of the variability in repair time has been explained by the repair person that performed the service; an acceptable, but not good, fit.
36. a.
The Excel output is shown below:

Regression Statistics
Multiple R           0.9488
R Square             0.900199692
Adjusted R Square    0.850299539
Standard Error       0.4174
Observations         10

ANOVA
             df    SS        MS        F          Significance F
Regression    3    9.4305    3.1435    18.0400    0.0021
Residual      6    1.0455    0.1743
Total         9    10.476

                             Coefficients    Standard Error    t Stat     P-value
Intercept                     1.8602         0.7286             2.5529    0.0433
Months Since Last Service     0.2914         0.0836             3.4862    0.0130
Type                          1.1024         0.3033             3.6342    0.0109
Person                       -0.6091         0.3879            -1.5701    0.1674
b.
Since the p-value corresponding to F = 18.04 is .0021 < α = .05, the overall model is statistically significant.
c.
The p-value corresponding to t = -1.57 is .1674 > α = .05; thus, the addition of Person is not statistically significant. Person is highly correlated with Months (the sample correlation coefficient is -.691); thus, once the effect of Months has been accounted for, Person will not add much to the model.
37. a.
Let Position = 0 if a guard
Position = 1 if an offensive tackle.
b.
The Excel output is shown below:

Regression Statistics
Multiple R           0.6895
R Square             0.4755
Adjusted R Square    0.4005
Standard Error       0.6936
Observations         25

ANOVA
             df   SS        MS       F        Significance F
Regression   3    9.1562    3.0521   6.3451   0.0031
Residual     21   10.1014   0.4810
Total        24   19.2576

            Coefficients   Standard Error   t Stat    P-value
Intercept   11.2233        4.5226           2.4816    0.0216
Position    0.7324         0.2893           2.5311    0.0194
Weight      0.0222         0.0104           2.1352    0.0447
Speed       -2.2775        0.9290           -2.4517   0.0231
c.
Since the p-value corresponding to F = 6.3451 is .0031 < α = .05, there is a significant relationship between rating and the independent variables.
d.
The value of Adjusted R Square is .4005; the estimated regression equation did not provide a very good fit.
e.
Since the p-value for Position is .0194 < α = .05, position is a significant factor in the player’s rating.
f.
ŷ = 11.2233 + .7324(1) + .0222(300) − 2.2775(5.1) = 7.0
38. a.
The Excel output is shown below:

Regression Statistics
Multiple R           0.9346
R Square             0.8735
Adjusted R Square    0.8498
Standard Error       5.7566
Observations         20

ANOVA
             df   SS          MS         F         Significance F
Regression   3    3660.7396   1220.247   36.8230   2.06404E-07
Residual     16   530.2104    33.1382
Total        19   4190.95

            Coefficients   Standard Error   t Stat    P-value
Intercept   -91.7595       15.2228          -6.0278   1.76E-05
Age         1.0767         0.1660           6.4878    7.49E-06
Pressure    0.2518         0.0452           5.5680    4.24E-05
Smoker      8.7399         3.0008           2.9125    0.0102
b.
Since the p-value corresponding to t = 2.9125 is .0102 < α = .05, smoking is a significant factor.
c.
Using the PredInt macro, the point estimate is 34.27; the 95% prediction interval is 21.35 to 47.18. Thus, the probability of a stroke (.2135 to .4718 at the 95% confidence level) appears to be quite high. The physician would probably recommend that Art quit smoking and begin some type of treatment designed to reduce his blood pressure.
39. a.
Job satisfaction can be expected to decrease by 8.69 units with a one unit increase in length of service if the wage rate does not change. A dollar increase in the wage rate is associated with a 13.5 point increase in the job satisfaction score when the length of service does not change.
b.
ŷ = 14.4 - 8.69(4) + 13.5(6.5) = 67.39
40. a.
The expected increase in final college grade point average corresponding to a one point increase in high school grade point average is .0235 when SAT mathematics score does not change. Similarly, the expected increase in final college grade point average corresponding to a one point increase in the SAT mathematics score is .00486 when the high school grade point average does not change.
b.
ŷ = -1.41 + .0235(84) + .00486(540) = 3.19
41. a.
The regression equation is ŷ = -1.4053 + .0235 X1 + .0049 X2, with coefficients taken from the Excel output:

Regression Statistics
Multiple R           0.9681
R Square             0.9373
Adjusted R Square    0.9194
Standard Error       0.1298
Observations         10

ANOVA
             df   SS       MS       F         Significance F
Regression   2    1.7621   0.8810   52.3053   6.17838E-05
Residual     7    0.1179   0.0168
Total        9    1.88

            Coefficients   Standard Error   t Stat    P-value
Intercept   -1.4053        0.4848           -2.8987   0.0230
X1          0.0235         0.0087           2.7078    0.0303
X2          0.0049         0.0011           4.5125    0.0028

b.
F.05 = 4.74 (2 degrees of freedom numerator and 7 degrees of freedom denominator) F = 52.44 > F.05; significant relationship.
c.
R² = SSR / SST = .937
Ra² = 1 − (1 − .937)(9 / 7) = .919
good fit
d.
t.025 = 2.365 (7 degrees of freedom)
for β1: t = 2.71 > 2.365; reject H0 : β1 = 0
for β2: t = 4.51 > 2.365; reject H0 : β2 = 0
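The R² and adjusted R² values in part (c) can be reproduced from the ANOVA sums of squares. The sketch below is only a check of that arithmetic; the function names `r_squared` and `adjusted_r_squared` are illustrative, and the inputs are the SSR, SST, number of observations and number of independent variables reported in the Excel output for Problems 41 and 42.

```python
def r_squared(ssr, sst):
    """Coefficient of determination: R^2 = SSR / SST."""
    return ssr / sst


def adjusted_r_squared(r2, n, p):
    """Adjusted R^2 = 1 - (1 - R^2)(n - 1)/(n - p - 1).

    n is the number of observations, p the number of independent variables.
    """
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


# Problem 41: SSR = 1.7621, SST = 1.88, n = 10, p = 2
r2_41 = r_squared(1.7621, 1.88)
ra2_41 = adjusted_r_squared(r2_41, n=10, p=2)
print(f"Problem 41: R^2 = {r2_41:.3f}, adjusted R^2 = {ra2_41:.3f}")

# Problem 42: SSR = 648.83, SST = 720, n = 8, p = 2
r2_42 = r_squared(648.83, 720)
ra2_42 = adjusted_r_squared(r2_42, n=8, p=2)
print(f"Problem 42: R^2 = {r2_42:.3f}, adjusted R^2 = {ra2_42:.4f}")
```

The factor (n − 1)/(n − p − 1) is where the 9/7 in Problem 41 and the 7/5 in Problem 42 come from.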
42. a.
The regression equation is ŷ = 14.4 − 8.69 X1 + 13.517 X2, with coefficients taken from the Excel output:

Regression Statistics
Multiple R           0.9493
R Square             0.9012
Adjusted R Square    0.8616
Standard Error       3.773
Observations         8

ANOVA
             df   SS       MS        F         Significance F
Regression   2    648.83   324.415   22.7916   0.0031
Residual     5    71.17    14.234
Total        7    720

            Coefficients   Standard Error   t Stat    P-value
Intercept   14.4           8.191            1.7580    0.1391
X1          -8.69          1.555            -5.5884   0.0025
X2          13.517         2.085            6.4830    0.0013

b.
F.05 = 5.79 (2 degrees of freedom numerator and 5 degrees of freedom denominator) F = 22.79 > F.05; significant relationship.
c.
R² = SSR / SST = .901
Ra² = 1 − (1 − .901)(7 / 5) = .861
good fit
d.
t.025 = 2.571 (5 degrees of freedom)
for β1: t = -5.59 < -2.571; reject H0 : β1 = 0
for β2: t = 6.48 > 2.571; reject H0 : β2 = 0
43. a.
The Excel output is shown below:

Regression Statistics
Multiple R           0.5423
R Square             0.2941
Adjusted R Square    0.2689
Standard Error       19.4957
Observations         30

ANOVA
             df   SS            MS         F         Significance F
Regression   1    4433.856352   4433.856   11.6656   0.0020
Residual     28   10642.25117   380.0804
Total        29   15076.10752

                       Coefficients   Standard Error   t Stat   P-value
Intercept              12.7928        6.6242           1.9312   0.0636
Book Value Per Share   2.2649         0.6631           3.4155   0.0020

b.
The value of R Square is .2941; the estimated regression equation does not provide a good fit.
c.
The Excel output is shown below:

Regression Statistics
Multiple R           0.7528
R Square             0.5667
Adjusted R Square    0.5346
Standard Error       15.5538
Observations         30

ANOVA
             df   SS            MS         F         Significance F
Regression   2    8544.237582   4272.119   17.6591   1.24768E-05
Residual     27   6531.869938   241.9211
Total        29   15076.10752

                                 Coefficients   Standard Error   t Stat   P-value
Intercept                        5.8766         5.5448           1.0598   0.2986
Book Value Per Share             2.5356         0.5331           4.7562   5.87E-05
Return on Equity Per Share (%)   0.4841         0.1174           4.1220   0.0003

Since the p-value corresponding to the F test is 0.000, the relationship is significant.
44. a.
The Excel output is shown below:

Regression Statistics
Multiple R           0.9747
R Square             0.9500
Adjusted R Square    0.9319
Standard Error       2.1272
Observations         16

ANOVA
             df   SS            MS         F         Significance F
Regression   4    946.1809495   236.5452   52.2768   4.33829E-07
Residual     11   49.7734       4.5249
Total        15   995.954375

                       Coefficients   Standard Error   t Stat    P-value
Intercept              97.5702        11.7926          8.2738    4.74E-06
Price ($1000s)         0.0693         0.0380           1.8210    0.0959
Curb Weight (lb.)      -0.0008        0.0026           -0.3145   0.7590
Horsepower             0.0590         0.0154           3.8235    0.0028
Zero to 60 (Seconds)   -2.4836        0.9601           -2.5869   0.0253

b.
Since the p-value corresponding to the F test is 0.000, the relationship is significant.
c.
Since the p-values corresponding to the t test for both Horsepower (p-value = .0028) and Zero to 60 (p-value = .0253) are less than .05, both of these independent variables are significant.
d.
The Excel output is shown below:

Regression Statistics
Multiple R           0.9648
R Square             0.9309
Adjusted R Square    0.9203
Standard Error       2.3011
Observations         16

ANOVA
             df   SS         MS        F         Significance F
Regression   2    927.1181   463.559   87.5449   2.86588E-08
Residual     13   68.8363    5.2951
Total        15   995.9544

                       Coefficients   Standard Error   t Stat    P-value
Intercept              103.1028       9.4478           10.9129   6.47E-08
Horsepower             0.0558         0.0145           3.8436    0.0020
Zero to 60 (Seconds)   -3.1876        0.9658           -3.3006   0.0057

e.
The standardized residual plot is shown below:
[Standardized residual plot: Standard Residuals versus Predicted y]
There is an unusual trend in the plot and one observation appears to be an outlier.
f.
The Excel output (not reproduced here) indicates that observation 2 is an outlier.
45. a.
[Scatter diagram: Household Exposures versus Times Ad Aired]
b.
The Excel output is shown below:

Regression Statistics
Multiple R           0.9829
R Square             0.9660
Adjusted R Square    0.9618
Standard Error       31.70350482
Observations         10

ANOVA
             df   SS            MS         F          Significance F
Regression   1    228519.8983   228519.9   227.3576   3.70081E-07
Residual     8    8040.897745   1005.112
Total        9    236560.796

                 Coefficients   Standard Error   t Stat    P-value
Intercept        53.2448        16.5334          3.2204    0.0122
Times Ad Aired   6.7427         0.4472           15.0784   3.7E-07

Since the p-value is 0.000, the relationship is significant.
c.
The Excel output is shown below:

Regression Statistics
Multiple R           0.9975
R Square             0.9949
Adjusted R Square    0.9935
Standard Error       13.0801
Observations         10

ANOVA
             df   SS            MS         F         Significance F
Regression   2    235363.1688   117681.6   687.836   9.23264E-09
Residual     7    1197.62722    171.0896
Total        9    236560.796

                 Coefficients   Standard Error   t Stat    P-value
Intercept        73.0634        7.5067           9.7331    2.56E-05
Times Ad Aired   5.0368         0.3268           15.4131   1.17E-06
BigAds           101.1129       15.9877          6.3244    0.0004
d.
The p-value corresponding to the t test for BigAds is 0.0004; thus, the dummy variable is significant.
e.
The dummy variable enables us to fit two different lines to the data; this approach is referred to as piecewise linear approximation.
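To make the "two different lines" in part (e) concrete, the sketch below evaluates the estimated equation from part (c) for both values of the dummy variable. The coefficients are those in the Excel output; the function names are illustrative, and how BigAds is assigned to each observation is not restated in this solution, so the code simply takes a 0/1 value as input.

```python
# Coefficients from the part (c) Excel output
B0 = 73.0634         # Intercept
B_TIMES = 5.0368     # Times Ad Aired
B_BIGADS = 101.1129  # BigAds (0/1 dummy variable)


def predicted_exposures(times_aired, big_ads):
    """Estimated household exposures for a given number of airings."""
    return B0 + B_TIMES * times_aired + B_BIGADS * big_ads


def line_for(big_ads):
    """Return (intercept, slope) of the line implied by a BigAds value."""
    return B0 + B_BIGADS * big_ads, B_TIMES


for flag in (0, 1):
    intercept, slope = line_for(flag)
    print(f"BigAds = {flag}: y-hat = {intercept:.4f} + {slope:.4f}(Times Ad Aired)")
```

Both lines share the slope 5.0368; the dummy variable only shifts the intercept, by 101.1129.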
46. a.
The Excel output is shown below:

Regression Statistics
Multiple R           0.6059
R Square             0.3671
Adjusted R Square    0.3445
Standard Error       5.4213
Observations         30

ANOVA
             df   SS          MS         F         Significance F
Regression   1    477.2478    477.2478   16.2385   0.0004
Residual     28   822.9189    29.3900
Total        29   1300.1667

                             Coefficients   Standard Error   t Stat   P-value
Intercept                    38.7718        4.3481           8.9170   1.13E-09
Suggested Retail Price ($)   0.0008         0.0002           4.0297   0.0004

Since the p-value corresponding to F = 16.24 is .0004 < α = .05, there is a significant relationship between the resale value (%) and the suggested price.
b.
R-Square = .3671; not a very good fit.
c.
Let Type1 = 0 and Type2 = 0 if a small pickup; Type1 = 1 and Type2 = 0 if a full-size pickup; and Type1 = 0 and Type2 = 1 if a sport utility. The Excel output using Type1, Type2, and Price is shown below:

Regression Statistics
Multiple R           0.7940
R Square             0.6305
Adjusted R Square    0.5879
Standard Error       4.2985
Observations         30

ANOVA
             df   SS            MS        F         Significance F
Regression   3    819.7710938   273.257   14.7892   8.11183E-06
Residual     26   480.3955729   18.4768
Total        29   1300.166667

                             Coefficients   Standard Error   t Stat    P-value
Intercept                    42.5539        3.5618           11.9472   4.62E-12
Type1                        9.0903         2.2476           4.0444    0.0004
Type2                        7.9172         2.1634           3.6596    0.0011
Suggested Retail Price ($)   0.0003         0.0002           1.8972    0.0690

d.
Since the p-value corresponding to F = 14.7892 is .000 < α = .05, there is a significant relationship between the resale value and the independent variables. Note that individually, Suggested Retail Price is not significant at the .05 level of significance. If we rerun the regression using just Type1 and Type2, the value of Adjusted R-Square decreases to .5482, a drop of approximately .04. Thus, it appears that for these data, the type of vehicle is the strongest predictor of the resale value.
Chapter 14 Statistical Methods for Quality Control
Learning Objectives
1. Learn about the importance of quality control and how statistical methods can assist in the quality control process.
2. Learn about acceptance sampling procedures.
3. Know the difference between consumer’s risk and producer’s risk.
4. Be able to use the binomial probability distribution to develop acceptance sampling plans.
5. Know what is meant by multiple sampling plans.
6. Be able to construct quality control charts and understand how they are used for statistical process control.
7. Know the definitions of the following terms:
producer's risk
consumer's risk
acceptance sampling
acceptance criterion
operating characteristic curve
assignable causes
common causes
control charts
upper control limit
lower control limit
Solutions:
1. a. For n = 4
UCL = µ + 3(σ / √n) = 12.5 + 3(.8 / √4) = 13.7
LCL = µ - 3(σ / √n) = 12.5 - 3(.8 / √4) = 11.3
b. For n = 8
UCL = µ + 3(.8 / √8) = 13.35
LCL = µ - 3(.8 / √8) = 11.65
For n = 16
UCL = µ + 3(.8 / √16) = 13.10
LCL = µ - 3(.8 / √16) = 11.90
c. UCL and LCL become closer together as n increases. If the process is in control, the larger samples should have less variance and should fall closer to 12.5.
2. a. µ = 677.5 / (25(5)) = 5.42
b. UCL = µ + 3(σ / √n) = 5.42 + 3(.5 / √5) = 6.09
LCL = µ - 3(σ / √n) = 5.42 - 3(.5 / √5) = 4.75
3. a. p = 135 / (25(100)) = 0.0540
b. σp = √(p(1 − p) / n) = √(0.0540(0.9460) / 100) = 0.0226
c. UCL = p + 3σp = 0.0540 + 3(0.0226) = 0.1218
LCL = p - 3σp = 0.0540 - 3(0.0226) = -0.0138
Use LCL = 0
4.
R Chart:
UCL = R̄D4 = 1.6(1.864) = 2.98
LCL = R̄D3 = 1.6(0.136) = 0.22
x̄ Chart:
UCL = x̿ + A2R̄ = 28.5 + 0.373(1.6) = 29.10
LCL = x̿ − A2R̄ = 28.5 - 0.373(1.6) = 27.90
5.
a.
UCL = µ + 3(σ / √n) = 128.5 + 3(.4 / √6) = 128.99
LCL = µ - 3(σ / √n) = 128.5 - 3(.4 / √6) = 128.01
b. x̄ = Σxi / n = 772.4 / 6 = 128.73; in control
c. x̄ = Σxi / n = 774.3 / 6 = 129.05; out of control
6. Process Mean = (20.12 + 19.90) / 2 = 20.01
UCL = µ + 3(σ / √n) = 20.01 + 3(σ / √5) = 20.12
Solve for σ:
σ = (20.12 − 20.01)√5 / 3 = 0.082
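The x̄ chart calculations in Problems 1 and 6 use the same relationship, UCL = µ + 3(σ / √n), once solved for the limits and once solved for σ. The following Python sketch checks both directions; the function names `xbar_limits` and `sigma_from_ucl` are illustrative, and the 3-sigma multiplier is exposed as a parameter `z` only for convenience.

```python
import math


def xbar_limits(mu, sigma, n, z=3.0):
    """Return (UCL, LCL) for an x-bar chart with known process mean and sigma."""
    half_width = z * sigma / math.sqrt(n)
    return mu + half_width, mu - half_width


def sigma_from_ucl(ucl, mu, n, z=3.0):
    """Solve UCL = mu + z*sigma/sqrt(n) for sigma."""
    return (ucl - mu) * math.sqrt(n) / z


# Problem 1: mu = 12.5, sigma = .8, sample sizes 4, 8 and 16
for n in (4, 8, 16):
    ucl, lcl = xbar_limits(12.5, 0.8, n)
    print(f"n = {n:2d}: UCL = {ucl:.2f}, LCL = {lcl:.2f}")

# Problem 6: UCL = 20.12, process mean = 20.01, n = 5
sigma_6 = sigma_from_ucl(20.12, 20.01, 5)
print(f"sigma = {sigma_6:.3f}")
```

Because the half-width is proportional to 1/√n, quadrupling the sample size from 4 to 16 halves the distance between each limit and the process mean.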
7.
Sample Number   Observations     x̄i      Ri
1               31   42   28     33.67   14
2               26   18   35     26.33   17
3               25   30   34     29.67   9
4               17   25   21     21.00   8
5               38   29   35     34.00   9
6               41   42   36     39.67   6
7               21   17   29     22.33   12
8               32   26   28     28.67   6
9               41   34   33     36.00   8
10              29   17   30     25.33   13
11              26   31   40     32.33   14
12              23   19   25     22.33   6
13              17   24   32     24.33   15
14              43   35   17     31.67   26
15              18   25   29     24.00   11
16              30   42   31     34.33   12
17              28   36   32     32.00   8
18              40   29   31     33.33   11
19              18   29   28     25.00   11
20              22   34   26     27.33   12

R̄ = 11.4 and x̿ = 29.17
R Chart:
UCL = R̄D4 = 11.4(2.575) = 29.35
LCL = R̄D3 = 11.4(0) = 0
x̄ Chart:
UCL = x̿ + A2R̄ = 29.17 + 1.023(11.4) = 40.8
LCL = x̿ − A2R̄ = 29.17 - 1.023(11.4) = 17.5
[R chart: sample ranges versus Sample Number (1 to 20), with UCL = 29.3, R̄ = 11.4, LCL = 0]
[x̄ chart: sample means versus Sample Number (1 to 20), with UCL = 40.8, x̿ = 29.17, LCL = 17.5]
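The sample means, ranges and control limits for Problem 7 can be recomputed directly from the 20 samples of three observations. The sketch below is one way to do that in Python; the variable names are illustrative, and the control chart factors A2 = 1.023, D3 = 0 and D4 = 2.575 are simply the values used in the solution for samples of size 3.

```python
# Observations for the 20 samples in Problem 7 (three per sample)
samples = [
    (31, 42, 28), (26, 18, 35), (25, 30, 34), (17, 25, 21), (38, 29, 35),
    (41, 42, 36), (21, 17, 29), (32, 26, 28), (41, 34, 33), (29, 17, 30),
    (26, 31, 40), (23, 19, 25), (17, 24, 32), (43, 35, 17), (18, 25, 29),
    (30, 42, 31), (28, 36, 32), (40, 29, 31), (18, 29, 28), (22, 34, 26),
]

# Control chart factors as used in the solution (n = 3)
A2, D3, D4 = 1.023, 0.0, 2.575

means = [sum(s) / len(s) for s in samples]    # x-bar for each sample
ranges = [max(s) - min(s) for s in samples]   # R for each sample

r_bar = sum(ranges) / len(ranges)             # average range
x_dbar = sum(means) / len(means)              # overall mean

r_ucl, r_lcl = D4 * r_bar, D3 * r_bar
x_ucl, x_lcl = x_dbar + A2 * r_bar, x_dbar - A2 * r_bar

# Sample numbers (1-based) that fall outside each chart's limits
out_r = [i for i, r in enumerate(ranges, start=1) if not r_lcl <= r <= r_ucl]
out_x = [i for i, m in enumerate(means, start=1) if not x_lcl <= m <= x_ucl]

print(f"R-bar = {r_bar:.1f}, overall mean = {x_dbar:.2f}")
print(f"R chart: UCL = {r_ucl:.2f}, LCL = {r_lcl:.2f}, outside limits: {out_r}")
print(f"x-bar chart: UCL = {x_ucl:.1f}, LCL = {x_lcl:.1f}, outside limits: {out_x}")
```

Listing the sample numbers that fall outside the limits gives the same information a reader would take from the plotted charts.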
8. a. p = 141 / (20(150)) = 0.0470
b. σp = √(p(1 − p) / n) = √(0.0470(0.9530) / 150) = 0.0173
UCL = p + 3σp = 0.0470 + 3(0.0173) = 0.0989
LCL = p - 3σp = 0.0470 - 3(0.0173) = -0.0049
Use LCL = 0
c. p = 12 / 150 = 0.08
Process should be considered in control.
d.
p = .047, n = 150
UCL = np + 3√(np(1 − p)) = 150(0.047) + 3√(150(0.047)(0.953)) = 14.826
LCL = np - 3√(np(1 − p)) = 150(0.047) - 3√(150(0.047)(0.953)) = -0.726
Thus, the process is out of control if more than 14 defective packages are found in a sample of 150.
e. Process should be considered to be in control since 12 defective packages were found.
f. The np chart may be preferred because a decision can be made by simply counting the number of defective packages.
9. a. Total defectives: 165
p = 165 / (20(200)) = 0.0413
b. σp = √(p(1 − p) / n) = √(0.0413(0.9587) / 200) = 0.0141
UCL = p + 3σp = 0.0413 + 3(0.0141) = 0.0836
LCL = p - 3σp = 0.0413 - 3(0.0141) = -0.0010
Use LCL = 0
c. p = 20 / 200 = 0.10; out of control
d. p = .0413, n = 200
UCL = np + 3√(np(1 − p)) = 200(0.0413) + 3√(200(0.0413)(0.9587)) = 16.702
LCL = np - 3√(np(1 − p)) = 200(0.0413) - 3√(200(0.0413)(0.9587)) = -0.1821
e. The process is out of control since 20 defective pistons were found.
10. f(x) = [n! / (x!(n − x)!)] p^x (1 − p)^(n − x)
When p = .02, the probability of accepting the lot is
f(0) = [25! / (0!(25 − 0)!)] (0.02)^0 (1 − 0.02)^25 = 0.6035
When p = .06, the probability of accepting the lot is
f(0) = [25! / (0!(25 − 0)!)] (0.06)^0 (1 − 0.06)^25 = 0.2129
11. a.
Using binomial probabilities with n = 20 and p0 = .02. P (Accept lot) = f (0) = .6676 Producer’s risk: α = 1 - .6676 = .3324
b.
P (Accept lot) = f (0) = .2901 Producer’s risk: α = 1 - .2901 = .7099
12.
At p0 = .02, the n = 20 and c = 1 plan provides P (Accept lot) = f (0) + f (1) = .6676 + .2725 = .9401 Producer’s risk: α = 1 - .9401 = .0599 At p0 = .06, the n = 20 and c = 1 plan provides P (Accept lot) = f (0) + f (1) = .2901 + .3703 = .6604 Producer’s risk: α = 1 - .6604 = .3396 For a given sample size, the producer’s risk decreases as the acceptance number c is increased.
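The acceptance probabilities in Problems 10 through 12 are cumulative binomial probabilities: a lot is accepted when the number of defectives in the sample is at most the acceptance number c. The Python sketch below recomputes them with the standard library; the function names `binom_pmf` and `accept_probability` are illustrative assumptions.

```python
from math import comb


def binom_pmf(x, n, p):
    """Binomial probability f(x) = C(n, x) p^x (1 - p)^(n - x)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def accept_probability(n, c, p):
    """P(Accept lot) = f(0) + f(1) + ... + f(c) for sample size n."""
    return sum(binom_pmf(x, n, p) for x in range(c + 1))


# Problems 11 and 12: n = 20 with acceptance numbers c = 0 and c = 1
for c in (0, 1):
    for p in (0.02, 0.06):
        pa = accept_probability(20, c, p)
        print(f"n = 20, c = {c}, p = {p:.2f}: "
              f"P(Accept) = {pa:.4f}, 1 - P(Accept) = {1 - pa:.4f}")
```

Results computed this way can differ from the printed solutions in the fourth decimal place, because the solutions add probabilities that were already rounded to four places.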
13. a.
Using binomial probabilities with n = 20 and p0 = .03. P(Accept lot) = f (0) + f (1) = .5438 + .3364 = .8802 Producer’s risk: α = 1 - .8802 = .1198
b.
With n = 20 and p1 = .15. P(Accept lot) = f (0) + f (1) = .0388 + .1368 = .1756 Consumer’s risk: β = .1756
c.
The consumer’s risk is acceptable; however, the producer’s risk associated with the n = 20, c = 1 plan is a little larger than desired.
14.
           c   P (Accept)   Producer’s Risk   P (Accept)   Consumer’s Risk
               p0 = .05     α                 p1 = .30     β
(n = 10)   0   .5987        .4013             .0282        .0282
           1   .9138        .0862             .1493        .1493
           2   .9884        .0116             .3828        .3828

(n = 15)   0   .4633        .5367             .0047        .0047
           1   .8291        .1709             .0352        .0352
           2   .9639        .0361             .1268        .1268
           3   .9946        .0054             .2968        .2968

(n = 20)   0   .3585        .6415             .0008        .0008
           1   .7359        .2641             .0076        .0076
           2   .9246        .0754             .0354        .0354
           3   .9842        .0158             .1070        .1070

The plan with n = 15, c = 2 is close with α = .0361 and β = .1268. However, the plan with n = 20, c = 3 is necessary to meet both requirements. 15. a.
P (Accept) shown for p values below:

c   p = .01   p = .05   p = .08   p = .10   p = .15
0   .8179     .3585     .1887     .1216     .0388
1   .9831     .7359     .5169     .3918     .1756
2   .9990     .9246     .7880     .6770     .4049

The operating characteristic curves would show the P (Accept) versus p for each value of c.
b.
c   P (Accept) at p0 = .01   Producer’s Risk   P (Accept) at p1 = .08   Consumer’s Risk
0   .8179                    .1821             .1887                    .1887
1   .9831                    .0169             .5169                    .5169
2   .9990                    .0010             .7880                    .7880

16. a. µ = Σx / 20 = 1908 / 20 = 95.4
b. UCL = µ + 3(σ / √n) = 95.4 + 3(.50 / √5) = 96.07
LCL = µ - 3(σ / √n) = 95.4 - 3(.50 / √5) = 94.73
c. No; all were in control
17. a.
For n = 10
UCL = µ + 3(σ / √n) = 350 + 3(15 / √10) = 364.23
LCL = µ - 3(σ / √n) = 350 - 3(15 / √10) = 335.77
For n = 20
UCL = 350 + 3(15 / √20) = 360.06
LCL = 350 - 3(15 / √20) = 339.94
For n = 30
UCL = 350 + 3(15 / √30) = 358.22
LCL = 350 - 3(15 / √30) = 341.78
b.
Both control limits come closer to the process mean as the sample size is increased.
c.
The process will be declared out of control and adjusted when the process is in control.
d.
The process will be judged in control and allowed to continue when the process is out of control.
e.
All have z = 3 where each tail area = 1 - .9986 = .0014 P (Type I) = 2 (.0014) = .0028
f. The Type II error probability is reduced as the sample size is increased.
18. R Chart:
UCL = R̄D4 = 2(2.115) = 4.23
LCL = R̄D3 = 2(0) = 0
x̄ Chart:
UCL = x̿ + A2R̄ = 5.42 + 0.577(2) = 6.57
LCL = x̿ − A2R̄ = 5.42 - 0.577(2) = 4.27
Estimate of Standard Deviation:
σ̂ = R̄ / d2 = 2 / 2.326 = 0.86
19. R̄ = 0.665, x̿ = 95.398
x̄ Chart:
UCL = x̿ + A2R̄ = 95.398 + 0.577(0.665) = 95.782
LCL = x̿ − A2R̄ = 95.398 - 0.577(0.665) = 95.014
R Chart:
UCL = R̄D4 = 0.665(2.115) = 1.406
LCL = R̄D3 = 0.665(0) = 0
The R chart indicated the process variability is in control. All sample ranges are within the control limits. However, the process mean is out of control. Sample 11 (x̄ = 95.80) and Sample 17 (x̄ = 94.82) fall outside the control limits.
20. R̄ = .053, x̿ = 3.082
x̄ Chart:
UCL = x̿ + A2R̄ = 3.082 + 0.577(0.053) = 3.112
LCL = x̿ − A2R̄ = 3.082 - 0.577(0.053) = 3.051
R Chart:
UCL = R̄D4 = 0.053(2.115) = 0.1121
LCL = R̄D3 = 0.053(0) = 0
All data points are within the control limits for both charts.
21. a. [Control chart showing UCL and LCL; vertical axis from 0 to .08]
Warning: Process should be checked. All points are within control limits; however, all points are also greater than the process proportion defective.
b. [Control chart showing UCL and LCL; vertical axis from 22 to 25]
Warning: Process should be checked. All points are within control limits, yet the trend in the points shows a movement or shift toward the UCL out-of-control point.
22. a.
p = .04
σp = √(p(1 − p) / n) = √(0.04(0.96) / 200) = 0.0139
UCL = p + 3σp = 0.04 + 3(0.0139) = 0.0817
LCL = p - 3σp = 0.04 - 3(0.0139) = -0.0017
Use LCL = 0
b. [p chart with UCL (.082), center line .04 and LCL (0); the last point is marked out of control]
For month 1 p = 10/200 = 0.05. Other monthly values are .075, .03, .065, .04, and .085. Only the last month with p = 0.085 is an out-of-control situation.
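The p chart check in Problem 22 can be sketched in a few lines of Python. The function name `p_chart_limits` is illustrative; the inputs (p = .04, n = 200 and the six monthly proportions) are taken from the solution, and a negative lower limit is replaced by 0 as the solution does.

```python
import math


def p_chart_limits(p_bar, n, z=3.0):
    """Return (UCL, LCL) for a p chart; a negative LCL is set to 0."""
    sigma_p = math.sqrt(p_bar * (1 - p_bar) / n)
    ucl = p_bar + z * sigma_p
    lcl = max(0.0, p_bar - z * sigma_p)
    return ucl, lcl


# Monthly proportions defective from Problem 22 (month 1 is 10/200 = .05)
monthly_p = [0.05, 0.075, 0.03, 0.065, 0.04, 0.085]

ucl, lcl = p_chart_limits(0.04, 200)
out_of_control = [month for month, p in enumerate(monthly_p, start=1)
                  if p > ucl or p < lcl]

# The UCL here is computed without rounding sigma_p first, so it can differ
# from the printed .0817 in the fourth decimal place.
print(f"UCL = {ucl:.4f}, LCL = {lcl:.4f}")
print(f"Out-of-control months: {out_of_control}")
```

Only month 6 exceeds the upper limit, which agrees with the conclusion stated in the solution.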
23. a.
Use binomial probabilities with n = 10. At p0 = .05, P(Accept lot) = f (0) + f (1) + f (2) = .5987 + .3151 + .0746 = .9884 Producer’s Risk: α = 1 - .9884 = .0116 At p1 = .20, P(Accept lot) = f (0) + f (1) + f (2) = .1074 + .2684 + .3020 = .6778 Consumer’s risk: β = .6778
b.
The consumer’s risk is unacceptably high. Too many bad lots would be accepted.
c.
Reducing c would help, but increasing the sample size appears to be the best solution.
24. a.
P (Accept) are shown below: (Using n = 15)

                     p = .01   p = .02   p = .03   p = .04   p = .05
f (0)                .8601     .7386     .6333     .5421     .4633
f (1)                .1303     .2261     .2938     .3388     .3658
P (Accept)           .9904     .9647     .9271     .8809     .8291
α = 1 - P (Accept)   .0096     .0353     .0729     .1191     .1709

Using p0 = .03 since α is close to .075. Thus, .03 is the fraction defective where the producer will tolerate a .075 probability of rejecting a good lot (only .03 defective).
b.
        p = .25
f (0)   .0134
f (1)   .0668
β =     .0802
P (Accept) when n = 25 and c = 0. Use the binomial probability function with
f(x) = [n! / (x!(n − x)!)] p^x (1 − p)^(n − x)
or
f(0) = [25! / (0!25!)] p^0 (1 − p)^25 = (1 − p)^25

If        f (0)
p = .01   .7778
p = .03   .4670
p = .10   .0718
p = .20   .0038

b. [Operating characteristic curve: P (Accept) versus Percent Defective, from .00 to .20]
c. 1 - f (0) = 1 - .778 = .222
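Since the plan in Problem 25 has c = 0, the operating characteristic curve reduces to the closed form P(Accept) = (1 − p)^25, so its points are easy to tabulate. The Python sketch below does this; the function name `accept_probability_c0` and the grid of p values are illustrative choices, not part of the original solution.

```python
def accept_probability_c0(p, n=25):
    """P(Accept) for a c = 0 plan: no defectives in a sample of n."""
    return (1 - p) ** n


# Values tabulated in part (a)
for p in (0.01, 0.03, 0.10, 0.20):
    print(f"p = {p:.2f}: f(0) = {accept_probability_c0(p):.4f}")

# Points for the operating characteristic curve in part (b),
# on an illustrative grid from .00 to .20 in steps of .02
oc_curve = [(i / 100, accept_probability_c0(i / 100)) for i in range(0, 21, 2)]

# Part (c): probability of rejecting a lot with p = .01
reject_at_01 = 1 - accept_probability_c0(0.01)
print(f"1 - f(0) at p = .01: {reject_at_01:.3f}")
```

Plotting the pairs in `oc_curve` with any charting tool gives the downward-sloping curve described in part (b).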