Chapter 12 Simple Regression
True / False Questions
1. A scatter plot is used to visualize the association (or lack of association) between two quantitative variables.
True False
2. The correlation coefficient r measures the strength of the linear relationship between two variables.
True False
3. Pearson's correlation coefficient (r) requires that both variables be interval or ratio data.
True False
4. If r = .55 and n = 16, then the correlation is significant at α = .05 in a two-tailed test.
True False
5. A sample correlation r = .40 indicates a stronger linear relationship than r = -.60.
True False
6. A common source of spurious correlation between X and Y is when a third unspecified variable Z affects both X and Y.
True False
7. The correlation coefficient r always has the same sign as b1 in Y = b0 + b1X.
True False
8. The fitted intercept in a regression has little meaning if no data values near X = 0 have been observed.
True False
9. The least squares regression line is obtained when the sum of the squared residuals is minimized.
True False
10. In a simple regression, if the coefficient for X is positive and significantly different from zero, then an increase in X is associated with an increase in the mean (i.e., the expected value) of Y.
True False
11. In least-squares regression, the residuals e1, e2, . . . , en will always have a zero mean.
True False
12. When using the least squares method, the column of residuals always sums to zero.
True False
13. In the model Sales = 268 + 7.37 Ads, an additional $1 spent on ads will increase sales by 7.37 percent.
True False
14. If R² = .36 in the model Sales = 268 + 7.37 Ads with n = 50, the two-tailed test for correlation at α = .05 would say that there is a significant correlation between Sales and Ads.
True False
15. If R² = .36 in the model Sales = 268 + 7.37 Ads, then Ads explains 36 percent of the variation in Sales.
True False
16. The ordinary least squares regression line always passes through the point (x̄, ȳ).
True False
17. The least squares regression line gives unbiased estimates of β0 and β1.
True False
18. In a simple regression, the correlation coefficient r is the square root of R².
True False
19. If SSR is 1800 and SSE is 200, then R² is .90.
True False
20. The width of a prediction interval for an individual value of Y is less than the standard error se.
True False
21. If SSE is near zero in a regression, the statistician will conclude that the proposed model probably has too poor a fit to be useful.
True False
22. For a regression with 200 observations, we expect that about 10 residuals will exceed two standard errors.
True False
23. Confidence intervals for predicted Y are less precise when the residuals are very small.
True False
24. Cause-and-effect direction between X and Y may be determined by running the regression twice and seeing whether Y = β0 + β1X or X = β1 + β0Y has the larger R².
True False
25. The ordinary least squares method of estimation minimizes the estimated slope and intercept.
True False
26. Using the ordinary least squares method ensures that the residuals will be normally distributed.
True False
27. If you have a strong outlier in the residuals, it may represent a different causal system.
True False
28. A negative correlation between two variables X and Y usually yields a negative p-value for r.
True False
29. In linear regression between two variables, a significant relationship exists when the p-value of the t test statistic for the slope is greater than α.
True False
30. The larger the absolute value of the t statistic of the slope in a simple linear regression, the stronger the linear relationship between X and Y.
True False
31. In simple linear regression, the coefficient of determination (R²) is estimated from sums of squares in the ANOVA table.
True False
32. In simple linear regression, the p-value of the slope will always equal the p-value of the F statistic.
True False
33. An observation with high leverage will have a large residual (usually an outlier).
True False
34. A prediction interval for Y is narrower than the corresponding confidence interval for the mean of Y.
True False
35. When X is farther from its mean, the prediction interval and confidence interval for Y become wider.
True False
36. The total sum of squares (SST) will never exceed the regression sum of squares (SSR).
True False
37. "High leverage" would refer to a data point that is poorly predicted by the model (large residual).
True False
38. The studentized residuals permit us to detect cases where the regression predicts poorly.
True False
39. A poor prediction (large residual) indicates an observation with high leverage.
True False
40. Ill-conditioned refers to a variable whose units are too large or too small (e.g., $2,434,567).
True False
41. A simple decimal transformation (e.g., from 18,291 to 18.291) often improves data conditioning.
True False
42. Two-tailed t-tests are often used because any predictor that differs significantly from zero in a two-tailed test will also be significantly greater than zero or less than zero in a one-tailed test at the same α.
True False
43. A predictor that is significant in a one-tailed t-test will also be significant in a two-tailed test at the same level of significance α.
True False
44. Omission of a relevant predictor is a common source of model misspecification.
True False
45. The regression line must pass through the origin.
True False
46. Outliers can be detected by examining the standardized residuals.
True False
47. In a simple regression, there are n - 2 degrees of freedom associated with the error sum of squares (SSE).
True False
48. In a simple regression, the F statistic is calculated by taking the ratio of MSR to the MSE.
True False
49. The coefficient of determination is the percentage of the total variation in the response variable Y that is explained by the predictor X.
True False
50. A different confidence interval exists for the mean value of Y for each different value of X.
True False
51. A prediction interval for Y is widest when X is near its mean.
True False
52. In a two-tailed test for correlation at α = .05, a sample correlation coefficient r = 0.42 with n = 25 is significantly different than zero.
True False
53. In correlation analysis, neither X nor Y is designated as the independent variable.
True False
54. A negative value for the correlation coefficient (r) implies a negative value for the slope (b1).
True False
55. High leverage for an observation indicates that X is far from its mean.
True False
56. Autocorrelated errors are not usually a concern for regression models using cross-sectional data.
True False
57. There are usually several possible regression lines that will minimize the sum of squared errors.
True False
58. When the errors in a regression model are not independent, the regression model is said to have autocorrelation.
True False
59. In a simple bivariate regression, Fcalc = (tcalc)².
True False
60. Correlation analysis primarily measures the degree of the linear relationship between X and Y.
True False
Multiple Choice Questions
61. The variable used to predict another variable is called the
A. response variable.
B. regression variable.
C. independent variable.
D. dependent variable.
62. The standard error of the regression
A. is based on squared deviations from the regression line.
B. may assume negative values if b1 < 0.
C. is in squared units of the dependent variable.
D. may be cut in half to get an approximate 95 percent prediction interval.
63. A local trucking company fitted a regression to relate the travel time (days) of its shipments as a function of the distance traveled (miles). The fitted regression is Time = -7.126 + 0.0214 Distance, based on a sample of 20 shipments. The estimated standard error of the slope is 0.0053. Find the value of tcalc to test for zero slope.
A. 2.46
B. 5.02
C. 4.04
D. 3.15
64. A local trucking company fitted a regression to relate the travel time (days) of its shipments as a function of the distance traveled (miles). The fitted regression is Time = -7.126 + .0214 Distance, based on a sample of 20 shipments. The estimated standard error of the slope is 0.0053. Find the critical value for a right-tailed test to see if the slope is positive, using α = .05.
A. 2.101
B. 2.552
C. 1.960
D. 1.734
65. If the attendance at a baseball game is to be predicted by the equation Attendance = 16,500 - 75 Temperature, what would be the predicted attendance if Temperature is 90 degrees?
A. 6,750
B. 9,750
C. 12,250
D. 10,020
66. A hypothesis test is conducted at the 5 percent level of significance to test whether the population correlation is zero. If the sample consists of 25 observations and the correlation coefficient is 0.60, then the computed test statistic would be
A. 2.071.
B. 1.960.
C. 3.597.
D. 1.645.
67. Which of the following is not a characteristic of the F-test in a simple regression?
A. It is a test for overall fit of the model.
B. The test statistic can never be negative.
C. It requires a table with numerator and denominator degrees of freedom.
D. The F-test gives a different p-value than the t-test.
68. A researcher's Excel results are shown below using Femlab (labor force participation rate among females) to try to predict Cancer (death rate per 100,000 population due to cancer) in the 50 U.S. states.
Which of the following statements is not true?
A. The standard error is too high for this model to be of any predictive use.
B. The 95 percent confidence interval for the coefficient of Femlab is -4.29 to -0.28.
C. Significant correlation exists between Femlab and Cancer at α = .05.
D. The two-tailed p-value for Femlab will be less than .05.
69. A researcher's results are shown below using Femlab (labor force participation rate among females) to try to predict Cancer (death rate per 100,000 population due to cancer) in the 50 U.S. states.
Which statement is valid regarding the relationship between Femlab and Cancer?
A. A rise in female labor participation rate will cause the cancer rate to decrease within a state.
B. This model explains about 10 percent of the variation in state cancer rates.
C. At the .05 level of significance, there isn't enough evidence to say the two variables are related.
D. If your sister starts working, the cancer rate in your state will decline.
70. A researcher's results are shown below using Femlab (labor force participation rate among females) to try to predict Cancer (death rate per 100,000 population due to cancer) in the 50 U.S. states.
What is the R² for this regression?
A. .9018
B. .0982
C. .8395
D. .1605
71. A news network stated that a study had found a positive correlation between the number of children a worker has and his or her earnings last year. You may conclude that
A. people should have more children so they can get better jobs.
B. the data are erroneous because the correlation should be negative.
C. causation is in serious doubt.
D. statisticians have small families.
72. William used a sample of 68 large U.S. cities to estimate the relationship between Crime (annual property crimes per 100,000 persons) and Income (median annual income per capita, in dollars). His estimated regression equation was Crime = 428 + 0.050 Income. We can conclude that
A. the slope is small so Income has no effect on Crime.
B. crime seems to create additional income in a city.
C. wealthy individuals tend to commit more crimes, on average.
D. the intercept is irrelevant since zero median income is impossible in a large city.
73. Mary used a sample of 68 large U.S. cities to estimate the relationship between Crime (annual property crimes per 100,000 persons) and Income (median annual income per capita, in dollars). Her estimated regression equation was Crime = 428 + 0.050 Income. If Income decreases by 1000, we would expect that Crime will
A. increase by 428.
B. decrease by 50.
C. increase by 500.
D. remain unchanged.
74. Amelia used a random sample of 100 accounts receivable to estimate the relationship between Days (number of days from billing to receipt of payment) and Size (size of balance due in dollars). Her estimated regression equation was Days = 22 + 0.0047 Size with a correlation coefficient of .300. From this information we can conclude that
A. 9 percent of the variation in Days is explained by Size.
B. autocorrelation is likely to be a problem.
C. the relationship between Days and Size is significant.
D. larger accounts usually take less time to pay.
75. Prediction intervals for Y are narrowest when
A. the mean of X is near the mean of Y.
B. the value of X is near the mean of X.
C. the mean of X differs greatly from the mean of Y.
D. the mean of X is small.
76. If n = 15 and r = .4296, the corresponding t-statistic to test for zero correlation is
A. 1.715.
B. 7.862.
C. 2.048.
D. impossible to determine without α.
77. Using a two-tailed test at α = .05 for n = 30, we would reject the hypothesis of zero correlation if the absolute value of r exceeds
A. .2992.
B. .3609.
C. .0250.
D. .200.
78. The ordinary least squares (OLS) method of estimation will minimize
A. neither the slope nor the intercept.
B. only the slope.
C. only the intercept.
D. both the slope and intercept.
79. A standardized residual ei = -2.205 indicates
A. a rather poor prediction.
B. an extreme outlier in the residuals.
C. an observation with high leverage.
D. a likely data entry error.
80. In a simple regression, which would suggest a significant relationship between X and Y?
A. Large p-value for the estimated slope
B. Large t statistic for the slope
C. Large p-value for the F statistic
D. Small t-statistic for the slope
81. Which is indicative of an inverse relationship between X and Y?
A. A negative F statistic
B. A negative p-value for the correlation coefficient
C. A negative correlation coefficient
D. Either a negative F statistic or a negative p-value
82. Which is not correct regarding the estimated slope of the OLS regression line?
A. It is divided by its standard error to obtain its t statistic.
B. It shows the change in Y for a unit change in X.
C. It is chosen so as to minimize the sum of squared errors.
D. It may be regarded as zero if its p-value is less than α.
83. Simple regression analysis means that
A. the data are presented in a simple and clear way.
B. we have only a few observations.
C. there are only two independent variables.
D. we have only one explanatory variable.
84. The sample coefficient of correlation does not have which property?
A. It can range from -1.00 up to +1.00.
B. It is also sometimes called Pearson's r.
C. It is tested for significance using a t-test.
D. It assumes that Y is the dependent variable.
85. When comparing the 90 percent prediction and confidence intervals for a given regression analysis
A. the prediction interval is narrower than the confidence interval.
B. the prediction interval is wider than the confidence interval.
C. there is no difference between the size of the prediction and confidence intervals.
D. no generalization is possible about their comparative width.
86. Which is not true of the coefficient of determination?
A. It is the square of the coefficient of correlation.
B. It is negative when there is an inverse relationship between X and Y.
C. It reports the percent of the variation in Y explained by X.
D. It is calculated using sums of squares (e.g., SSR, SSE, SST).
87. If the fitted regression is Y = 3.5 + 2.1X (R² = .25, n = 25), it is incorrect to conclude that
A. Y increases 2.1 percent for a 1 percent increase in X.
B. the estimated regression line crosses the Y axis at 3.5.
C. the sample correlation coefficient must be positive.
D. the value of the sample correlation coefficient is 0.50.
88. In a simple regression Y = b0 + b1X where Y = number of robberies in a city (thousands of robberies), X = size of the police force in a city (thousands of police), and n = 45 randomly chosen large U.S. cities in 2008, we would be least likely to see which problem?
A. Autocorrelated residuals (because this is time-series data)
B. Heteroscedastic residuals (because we are using totals uncorrected for city size)
C.
ANOVA table, and F test.
Topic: Ordinary Least Squares Formulas
16. The ordinary least squares regression line always passes through the point (x̄, ȳ).
TRUE
The OLS formulas require the line to pass through this point.
AACSB: Analytic
Blooms: Remember
Difficulty: 2 Medium
Learning Objective: 12-02 Interpret the slope and intercept of a regression equation.
Topic: Regression Terminology
17. The least squares regression line gives unbiased estimates of β0 and β1.
TRUE
The expected values of the OLS estimators b0 and b1 are the true parameters β0 and β1.
AACSB: Analytic
Blooms: Remember
Difficulty: 2 Medium
Learning Objective: 12-04 Fit a simple regression on an Excel scatter plot.
Topic: Ordinary Least Squares Formulas
18. In a simple regression, the correlation coefficient r is the square root of R².
TRUE
In fact, we could use the notation r² instead of R² when talking about simple regression.
AACSB: Analytic
Blooms: Remember
Difficulty: 2 Medium
Learning Objective: 12-08 Interpret the standard error, R², ANOVA table, and F test.
Topic: Ordinary Least Squares Formulas
19. If SSR is 1800 and SSE is 200, then R² is .90.
TRUE
R² = SSR/SST = SSR/(SSR + SSE) = 1800/(1800 + 200) = .90.
AACSB: Analytic
Blooms: Apply
Difficulty: 2 Medium
Learning Objective: 12-08 Interpret the standard error, R², ANOVA table, and F test.
Topic: Tests for Significance
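The arithmetic in this answer can be checked with a minimal Python sketch. The function name r_squared is our own choice for illustration; the two inputs are the sums of squares given in the question.

```python
def r_squared(ssr, sse):
    """Coefficient of determination from the regression and error sums of squares."""
    sst = ssr + sse          # identity: SST = SSR + SSE
    return ssr / sst

r2 = r_squared(ssr=1800.0, sse=200.0)
print(round(r2, 2))  # 0.9
```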
20. The width of a prediction interval for an individual value of Y is less than the standard error se.
FALSE
The formula for the interval width multiplies the standard error by an expression > 1.
AACSB: Analytic
Blooms: Understand
Difficulty: 2 Medium
Learning Objective: 12-09 Distinguish between confidence and prediction intervals for Y.
Topic: Confidence and Prediction Intervals for Y
21. If SSE is near zero in a regression, the statistician will conclude that the proposed model probably has too poor a fit to be useful.
FALSE
SSE is the sum of the squared residuals, which would be smaller if the fit is good.
AACSB: Analytic
Blooms: Apply
Difficulty: 2 Medium
Learning Objective: 12-08 Interpret the standard error, R², ANOVA table, and F test.
Topic: Tests for Significance
22. For a regression with 200 observations, we expect that about 10 residuals will exceed two standard errors.
TRUE
If the residuals are normal, 95.44 percent (190 of 200) will lie within ±2se (so 10 outside).
AACSB: Analytic
Blooms: Apply
Difficulty: 2 Medium
Learning Objective: 12-11 Identify unusual residuals and high-leverage observations.
Topic: Unusual Observations
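The normal-distribution figure used in this answer can be reproduced with Python's standard library. This is only a sketch under the stated assumption of normal residuals; the helper name prob_within is ours.

```python
import math

def prob_within(k):
    """P(-k < Z < k) for a standard normal Z, via the error function."""
    return math.erf(k / math.sqrt(2.0))

n = 200
p_inside = prob_within(2.0)               # roughly 0.954
expected_outside = n * (1.0 - p_inside)   # roughly 9, i.e. "about 10"
print(p_inside, expected_outside)
```

The expected count outside ±2 standard errors comes out near 9, which the answer rounds to "about 10."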
23. Confidence intervals for predicted Y are less precise when the residuals are very small.
FALSE
Small residuals imply a small standard error and thus a narrower prediction interval.
AACSB: Analytic
Blooms: Understand
Difficulty: 2 Medium
Learning Objective: 12-09 Distinguish between confidence and prediction intervals for Y.
Topic: Confidence and Prediction Intervals for Y
24. Cause-and-effect direction between X and Y may be determined by running the regression twice and seeing whether Y = β0 + β1X or X = β1 + β0Y has the larger R².
FALSE
Cause and effect cannot be determined in the context of simple regression models.
AACSB: Analytic
Blooms: Understand
Difficulty: 2 Medium
Learning Objective: 12-02 Interpret the slope and intercept of a regression equation.
Topic: Simple Regression
25. The ordinary least squares method of estimation minimizes the estimated slope and intercept.
FALSE
OLS minimizes the sum of squared residuals.
AACSB: Analytic
Blooms: Remember
Difficulty: 2 Medium
Learning Objective: 12-04 Fit a simple regression on an Excel scatter plot.
Topic: Ordinary Least Squares Formulas
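To make the point concrete, the sketch below fits a line with the usual least squares formulas and compares its sum of squared residuals with that of several nearby lines. The data and the helper names (ols_fit, sse) are hypothetical, invented purely for illustration.

```python
# Hypothetical toy data, invented for illustration only.
x = [1.0, 2.0, 4.0, 5.0, 8.0]
y = [2.0, 3.5, 4.0, 6.5, 9.0]

def ols_fit(x, y):
    """Return (b0, b1) from the usual least squares formulas."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = ss_xy / ss_xx
    return y_bar - b1 * x_bar, b1

def sse(b0, b1, x, y):
    """Sum of squared residuals for the line y = b0 + b1*x."""
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

b0, b1 = ols_fit(x, y)
best = sse(b0, b1, x, y)

# Nudging either coefficient away from the OLS values raises SSE.
nudged = [sse(b0 + d0, b1 + d1, x, y)
          for d0 in (-0.5, 0.0, 0.5) for d1 in (-0.2, 0.0, 0.2)
          if (d0, d1) != (0.0, 0.0)]
print(best < min(nudged))  # True
```

What is minimized is the SSE, not the slope or intercept themselves; the fitted coefficients are simply whatever values achieve that minimum.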
26. Using the ordinary least squares method ensures that the residuals will be normally distributed.
FALSE
OLS produces unbiased estimates but cannot ensure normality of the residuals.
AACSB: Analytic
Blooms: Remember
Difficulty: 2 Medium
Learning Objective: 12-10 Test residuals for violations of regression assumptions.
Topic: Residual Tests
27. If you have a strong outlier in the residuals, it may represent a different causal system.
TRUE
Outliers might come from a different population or causal system.
AACSB: Analytic
Blooms: Understand
Difficulty: 2 Medium
Learning Objective: 12-11 Identify unusual residuals and high-leverage observations.
Topic: Other Regression Problems (Optional)
28. A negative correlation between two variables X and Y usually yields a negative p-value for r.
FALSE
The p-value cannot be negative.
AACSB: Analytic
Blooms: Understand
Difficulty: 2 Medium
Learning Objective: 12-06 Test hypotheses about the slope and intercept by using t tests.
Topic: Visual Displays and Correlation Analysis
29. In linear regression between two variables, a significant relationship exists when the p-value of the t test statistic for the slope is greater than α.
FALSE
Reject β1 = 0 if the p-value is less than α.
AACSB: Analytic
Blooms: Apply
Difficulty: 1 Easy
Learning Objective: 12-06 Test hypotheses about the slope and intercept by using t tests.
Topic: Tests for Significance
30. The larger the absolute value of the t statistic of the slope in a simple linear regression, the stronger the linear relationship between X and Y.
TRUE
The correlation coefficient measures linearity, regardless of its sign (+ or -).
AACSB: Analytic
Blooms: Apply
Difficulty: 1 Easy
Learning Objective: 12-06 Test hypotheses about the slope and intercept by using t tests.
Topic: Tests for Significance
31. In simple linear regression, the coefficient of determination (R²) is estimated from sums of squares in the ANOVA table.
TRUE
R² = SSR/SST or R² = 1 - SSE/SST.
AACSB: Analytic
Blooms: Remember
Difficulty: 2 Medium
Learning Objective: 12-08 Interpret the standard error, R², ANOVA table, and F test.
Topic: Ordinary Least Squares Formulas
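The two expressions for R² agree because SST = SSR + SSE. A small sketch on hypothetical data (invented for illustration, not taken from the text) computes the three sums of squares directly and evaluates R² both ways.

```python
# Hypothetical toy data, invented for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 2.9, 4.2, 4.8, 6.1, 6.9]

n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
b1 = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
      / sum((xi - x_bar) ** 2 for xi in x))
b0 = y_bar - b1 * x_bar
fitted = [b0 + b1 * xi for xi in x]

sst = sum((yi - y_bar) ** 2 for yi in y)                # total variation
ssr = sum((fi - y_bar) ** 2 for fi in fitted)           # explained by the line
sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # left in the residuals

r2_from_ssr = ssr / sst
r2_from_sse = 1.0 - sse / sst
print(r2_from_ssr, r2_from_sse)
```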
32. In simple linear regression, the p-value of the slope will always equal the p-value of the F statistic.
TRUE
This is true only if there is one predictor (but is no longer true in multiple regression).
AACSB: Analytic
Blooms: Remember
Difficulty: 2 Medium
Learning Objective: 12-08 Interpret the standard error, R², ANOVA table, and F test.
Topic: Analysis of Variance: Overall Fit
33. An observation with high leverage will have a large residual (usually an outlier).
FALSE
The concepts are distinct (a high-leverage point could have a good fit).
AACSB: Analytic
Blooms: Understand
Difficulty: 2 Medium
Learning Objective: 12-11 Identify unusual residuals and high-leverage observations.
Topic: Unusual Observations
34. A prediction interval for Y is narrower than the corresponding confidence interval for the mean of Y.
FALSE
Predicting an individual case requires a wider interval than predicting the mean.
AACSB: Analytic
Blooms: Remember
Difficulty: 2 Medium
Learning Objective: 12-09 Distinguish between confidence and prediction intervals for Y.
Topic: Confidence and Prediction Intervals for Y
35. When X is farther from its mean, the prediction interval and confidence interval for Y become wider.
TRUE
The width increases when X differs from its mean (review the formula).
AACSB: Analytic
Blooms: Understand
Difficulty: 2 Medium
Learning Objective: 12-09 Distinguish between confidence and prediction intervals for Y.
Topic: Confidence and Prediction Intervals for Y
36. The total sum of squares (SST) will never exceed the regression sum of squares (SSR).
FALSE
The identity is SSR + SSE = SST.
AACSB: Analytic
Blooms: Remember
Difficulty: 1 Easy
Learning Objective: 12-08 Interpret the standard error, R², ANOVA table, and F test.
Topic: Analysis of Variance: Overall Fit
37. "High leverage" would refer to a data point that is poorly predicted by the model (large residual).
FALSE
A high-leverage observation may have a good fit (only its X value determines its leverage).
AACSB: Analytic
Blooms: Remember
Difficulty: 2 Medium
Learning Objective: 12-11 Identify unusual residuals and high-leverage observations.
Topic: Unusual Observations
38. The studentized residuals permit us to detect cases where the regression predicts poorly.
TRUE
Studentized residuals resemble a t-distribution. A large studentized t-value (e.g., t < -2.00 or t > +2.00) would imply a poor fit.
AACSB: Analytic
Blooms: Understand
Difficulty: 2 Medium
Learning Objective: 12-11 Identify unusual residuals and high-leverage observations.
Topic: Unusual Observations
39. A poor prediction (large residual) indicates an observation with high leverage.
FALSE
High leverage indicates an unusually large or small X value (not a poor prediction). A high-leverage observation may have a good fit or a poor fit. Only its X value determines its leverage.
AACSB: Analytic
Blooms: Understand
Difficulty: 2 Medium
Learning Objective: 12-11 Identify unusual residuals and high-leverage observations.
Topic: Unusual Observations
40. Ill-conditioned refers to a variable whose units are too large or too small (e.g., $2,434,567).
TRUE
In Excel, a symptom of poor data conditioning is exponential notation (e.g., 4.3E+06).
AACSB: Analytic
Blooms: Remember
Difficulty: 2 Medium
Learning Objective: 12-07 Perform regression analysis with Excel or other software.
Topic: Other Regression Problems (Optional)
41. A simple decimal transformation (e.g., from 18,291 to 18.291) often improves data conditioning.
TRUE
Keeping data magnitudes similar helps avoid exponential notation (e.g., 4.3E+06).
AACSB: Analytic
Blooms: Understand
Difficulty: 2 Medium
Learning Objective: 12-07 Perform regression analysis with Excel or other software.
Topic: Other Regression Problems (Optional)
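A decimal transformation changes only the units in which X is measured. The sketch below does not demonstrate numerical conditioning itself; it only illustrates, on hypothetical data with a helper we call fit, that dividing X by 1,000 multiplies the slope by 1,000 while the intercept and R² stay the same.

```python
def fit(x, y):
    """Return (b0, b1, r2) for a simple least squares regression."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    ss_yy = sum((yi - y_bar) ** 2 for yi in y)
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = ss_xy / ss_xx
    b0 = y_bar - b1 * x_bar
    r2 = ss_xy ** 2 / (ss_xx * ss_yy)
    return b0, b1, r2

# Hypothetical data: X recorded in dollars, Y in arbitrary units.
x_dollars = [18291.0, 22450.0, 30120.0, 41875.0, 52300.0]
y = [3.1, 3.8, 4.9, 6.2, 7.7]

x_thousands = [xi / 1000.0 for xi in x_dollars]  # e.g. 18291 becomes 18.291

b0_raw, b1_raw, r2_raw = fit(x_dollars, y)
b0_scaled, b1_scaled, r2_scaled = fit(x_thousands, y)
print(b1_raw, b1_scaled)   # the second slope is 1,000 times the first
```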
42. Two-tailed t-tests are often used because any predictor that differs significantly from zero in a two-tailed test will also be significantly greater than zero or less than zero in a one-tailed test at the same α.
TRUE
True because the critical t is larger in the two-tailed test (the default in most software).
AACSB: Analytic
Blooms: Apply
Difficulty: 2 Medium
Learning Objective: 12-06 Test hypotheses about the slope and intercept by using t tests.
Topic: Tests for Significance
43. A predictor that is significant in a one-tailed t-test will also be significant in a two-tailed test at the same level of significance α.
FALSE
False because the critical t would be larger in a two-tailed test.
AACSB: Analytic
Blooms: Remember
Difficulty: 2 Medium
Learning Objective: 12-06 Test hypotheses about the slope and intercept by using t tests.
Topic: Tests for Significance
44. Omission of a relevant predictor is a common source of model misspecification.
TRUE
In a multivariate world, simple regression may be inadequate.
AACSB: Analytic
Blooms: Remember
Difficulty: 2 Medium
Learning Objective: 12-07 Perform regression analysis with Excel or other software.
Topic: Other Regression Problems (Optional)
45. The regression line must pass through the origin.
FALSE
The OLS intercept estimate does not, in general, equal zero. We might be unable to reject a zero intercept in a t-test, but the fitted intercept is rarely zero.
AACSB: Analytic
Blooms: Remember
Difficulty: 1 Easy
Learning Objective: 12-04 Fit a simple regression on an Excel scatter plot.
Topic: Ordinary Least Squares Formulas
46. Outliers can be detected by examining the standardized residuals.
TRUE
A poor fit implies a large t-value (e.g., larger than ±3 would be an outlier).
AACSB: Analytic
Blooms: Remember
Difficulty: 1 Easy
Learning Objective: 12-11 Identify unusual residuals and high-leverage observations.
Topic: Unusual Observations
47. In a simple regression, there are n - 2 degrees of freedom associated with the error sum of squares (SSE).
TRUE
This is true in simple regression because we estimate two parameters (β0 and β1).
AACSB: Analytic
Blooms: Remember
Difficulty: 1 Easy
Learning Objective: 12-08 Interpret the standard error, R², ANOVA table, and F test.
Topic: Analysis of Variance: Overall Fit
48. In a simple regression, the F statistic is calculated by taking the ratio of MSR to the MSE.
TRUE
By definition, Fcalc = MSR/MSE (obtained from the ANOVA table).
AACSB: Analytic
Blooms: Understand
Difficulty: 2 Medium
Learning Objective: 12-08 Interpret the standard error, R², ANOVA table, and F test.
Topic: Analysis of Variance: Overall Fit
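The ratio can be worked out by hand from the sums of squares. The following sketch uses hypothetical data (invented for illustration) and the simple-regression degrees of freedom, 1 for the numerator and n - 2 for the denominator. As a cross-check it also computes the slope's t statistic, whose square should match F when there is one predictor.

```python
import math

# Hypothetical toy data, invented for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
y = [1.8, 3.1, 3.9, 5.2, 5.8, 7.4, 7.9]

n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
ss_xx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / ss_xx
b0 = y_bar - b1 * x_bar
fitted = [b0 + b1 * xi for xi in x]

ssr = sum((fi - y_bar) ** 2 for fi in fitted)
sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))

msr = ssr / 1            # one predictor, so 1 numerator degree of freedom
mse = sse / (n - 2)      # n - 2 denominator degrees of freedom
f_calc = msr / mse

# Cross-check: the slope's t statistic, squared, should match F.
se_b1 = math.sqrt(mse / ss_xx)
t_calc = b1 / se_b1
print(f_calc, t_calc ** 2)
```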
49. The coefficient of determination is the percentage of the total variation in the response variable Y that is explained by the predictor X.
TRUE
R² = SSR/SST or R² = 1 - SSE/SST lies between 0 and 1 and often is expressed as a percent.
AACSB: Analytic
Blooms: Understand
Difficulty: 2 Medium
Learning Objective: 12-08 Interpret the standard error, R², ANOVA table, and F test.
Topic: Ordinary Least Squares Formulas
50. A different confidence interval exists for the mean value of Y for each different value of X.
TRUE
Both the interval width and also E(Y | X) = β0 + β1X depend on the value of X.
AACSB: Analytic
Blooms: Remember
Difficulty: 2 Medium
Learning Objective: 12-09 Distinguish between confidence and prediction intervals for Y.
Topic: Confidence and Prediction Intervals for Y
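How the width depends on X can be seen from the square-root terms in the standard interval formulas for simple regression: the interval for the mean of Y uses sqrt(1/n + (x - x̄)²/SSxx), and the interval for an individual Y adds 1 under the root. The sketch below evaluates only these factors (they multiply t times se); the X values and the function name half_width_factors are hypothetical.

```python
import math

def half_width_factors(x0, x):
    """Return (ci_factor, pi_factor): the square-root terms that multiply
    t * se in the interval for the mean of Y and for an individual Y."""
    n = len(x)
    x_bar = sum(x) / n
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    core = 1.0 / n + (x0 - x_bar) ** 2 / ss_xx
    return math.sqrt(core), math.sqrt(1.0 + core)

# Hypothetical X values, invented for illustration only.
x = [10.0, 12.0, 15.0, 18.0, 20.0, 25.0]
x_bar = sum(x) / len(x)

ci_mid, pi_mid = half_width_factors(x_bar, x)          # at the mean of X
ci_far, pi_far = half_width_factors(x_bar + 10.0, x)   # far from the mean
print(ci_mid, pi_mid, ci_far, pi_far)
```

Both factors are smallest at the mean of X and grow as X moves away from it, and the prediction factor always exceeds 1 and exceeds the confidence factor.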
51. A prediction interval for Y is widest when X is near its mean.
FALSE
The prediction interval is narrowest when X is near its mean. Review the formula, which has a term (xi - x̄)² in the numerator. The minimum would be when xi = x̄.
AACSB: Analytic
Blooms: Remember
Difficulty: 2 Medium
Learning Objective: 12-09 Distinguish between confidence and prediction intervals for Y.
Topic: Confidence and Prediction Intervals for Y
52. In a two-tailed test for correlation at α = .05, a sample correlation coefficient r = 0.42 with n = 25 is significantly different than zero.
TRUE
tcalc = r[(n - 2)/(1 - r²)]^(1/2) = (.42)[(25 - 2)/(1 - .42²)]^(1/2) = 2.219 > t.025 = 2.069 for d.f. = 25 - 2 = 23.
AACSB: Analytic
Blooms: Apply
Difficulty: 2 Medium
Learning Objective: 12-01 Calculate and test a correlation coefficient for significance.
Topic: Visual Displays and Correlation Analysis
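The test statistic in this answer can be recomputed with a short Python sketch. The function name t_for_correlation is ours; the critical value 2.069 is taken from the answer above rather than computed.

```python
import math

def t_for_correlation(r, n):
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1.0 - r ** 2))

t_calc = t_for_correlation(r=0.42, n=25)
t_crit = 2.069   # two-tailed critical value for d.f. = 23, as quoted above
significant = abs(t_calc) > t_crit
print(t_calc, significant)
```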
53. In correlation analysis, neither X nor Y is designated as the independent variable.
TRUE
In correlation analysis, X and Y covary without designating either as "independent."
AACSB: Analytic
Blooms: Remember
Difficulty: 1 Easy
Learning Objective: 12-01 Calculate and test a correlation coefficient for significance.
Topic: Visual Displays and Correlation Analysis
54. A negative value for the correlation coefficient (r) implies a negative value for the slope (b1).
TRUE
The sign of r must be the same as the sign of the slope estimate b1.
AACSB: Analytic
Blooms: Remember
Difficulty: 2 Medium
Learning Objective: 12-04 Fit a simple regression on an Excel scatter plot.
Topic: Ordinary Least Squares Formulas
55. High leverage for an observation indicates that X is far from its mean.
TRUE
By definition, observations have higher leverage when X is far from its mean.
AACSB: Analytic
Blooms: Remember
Difficulty: 2 Medium
Learning Objective: 12-11 Identify unusual residuals and high-leverage observations.
Topic: Unusual Observations
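In simple regression the leverage of observation i is commonly written h_i = 1/n + (x_i - x̄)²/SSxx, which involves only the X values. The sketch below applies that formula to hypothetical X values in which the last observation sits far from the rest; the function name leverages is our own.

```python
def leverages(x):
    """Leverage h_i = 1/n + (x_i - x_bar)^2 / SSxx for simple regression."""
    n = len(x)
    x_bar = sum(x) / n
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    return [1.0 / n + (xi - x_bar) ** 2 / ss_xx for xi in x]

# Hypothetical X values; the last one sits far from the rest.
x = [4.0, 5.0, 5.0, 6.0, 7.0, 20.0]
h = leverages(x)
most_leveraged = x[h.index(max(h))]
print(most_leveraged)  # 20.0
```

No Y values appear anywhere in the calculation, which is why a high-leverage point may still be fitted well.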
56. Autocorrelated errors are not usually a concern for regression models using cross-sectional data.
TRUE
We more often expect autocorrelated residuals in time series data.
AACSB: Analytic
Blooms: Remember
Difficulty: 1 Easy
Learning Objective: 12-10 Test residuals for violations of regression assumptions.
Topic: Residual Tests
57. There are usually several possible regression lines that will minimize the sum of squared errors.
FALSE
The OLS solution for the estimators b0 and b1 is unique.
AACSB: Analytic
Blooms: Remember
Difficulty: 1 Easy
Learning Objective: 12-04 Fit a simple regression on an Excel scatter plot.
Topic: Ordinary Least Squares Formulas
58. When the errors in a regression model are not independent, the regression model is said to have autocorrelation.
TRUE
For example, in first-order autocorrelation the error in period t depends on the error in period t - 1.
AACSB: Analytic
Blooms: Remember
Difficulty: 1 Easy
Learning Objective: 12-10 Test residuals for violations of regression assumptions.
Topic: Residual Tests
59. In a simple bivariate regression, Fcalc = (tcalc)².
TRUE
This statement is true only in a simple regression (one predictor).
AACSB: Analytic
Blooms: Remember
Difficulty: 2 Medium
Learning Objective: 12-08 Interpret the standard error, R², ANOVA table, and F test.
Topic: Analysis of Variance: Overall Fit
60. Correlation analysis primarily measures the degree of the linear relationship between X and Y.
TRUE
The sign of r indicates the direction and its magnitude indicates the degree of linearity.
AACSB: Analytic Blooms: Remember Difficulty: 2 Medium Learning Objective: 12-01 Calculate and test a correlation coefficient for significance. Topic: Visual Displays and Correlation Analysis
Multiple Choice Questions
61. The variable used to predict another variable is called the
A. response variable.
B. regression variable.
C. independent variable.
D. dependent variable.
We might also call the independent variable a predictor of Y.
AACSB: Analytic Blooms: Remember Difficulty: 1 Easy Learning Objective: 12-02 Interpret the slope and intercept of a regression equation. Topic: Simple Regression
62. The standard error of the regression
A. is based on squared deviations from the regression line.
B. may assume negative values if b1 < 0.
C. is in squared units of the dependent variable.
D. may be cut in half to get an approximate 95 percent prediction interval.
In a simple regression, the standard error is the square root of the sum of the squared residuals divided by (n − 2).
AACSB: Analytic Blooms: Apply Difficulty: 2 Medium Learning Objective: 12-08 Interpret the standard error, R², ANOVA table, and F test. Topic: Tests for Significance
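As a rough illustration of the definition above, the following Python sketch computes the standard error of the regression from a list of residuals. The function name and the residual values are made up for this example and do not come from the test bank.

```python
import math

def standard_error_of_regression(residuals):
    """Return s_e = sqrt(SSE / (n - 2)) for a simple regression."""
    n = len(residuals)
    if n <= 2:
        raise ValueError("need more than 2 observations")
    sse = sum(e ** 2 for e in residuals)  # sum of squared residuals
    return math.sqrt(sse / (n - 2))

# Hypothetical residuals, assumed for illustration only.
residuals = [1.0, -2.0, 0.5, 1.5, -1.0]
s_e = standard_error_of_regression(residuals)
print(round(s_e, 4))
```

Because s_e is a square root of a sum of squares, it cannot be negative, and it is measured in the same units as Y rather than in squared units.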
63. A local trucking company fitted a regression to relate the travel time (days) of its shipments as a function of the distance traveled (miles). The fitted regression is Time = −7.126 + 0.0214 Distance, based on a sample of 20 shipments. The estimated standard error of the slope is 0.0053. Find the value of t_calc to test for zero slope.
A. 2.46
B. 5.02
C. 4.04
D. 3.15
t_calc = (0.0214)/(0.0053) = 4.038.
AACSB: Analytic Blooms: Apply Difficulty: 2 Medium Learning Objective: 12-06 Test hypotheses about the slope and intercept by using t tests. Topic: Tests for Significance
64. A local trucking company fitted a regression to relate the travel time (days) of its shipments as a function of the distance traveled (miles). The fitted regression is Time = −7.126 + .0214 Distance, based on a sample of 20 shipments. The estimated standard error of the slope is 0.0053. Find the critical value for a right-tailed test to see if the slope is positive, using α = .05.
A. 2.101
B. 2.552
C. 1.960
D. 1.734
For d.f. = n − 2 = 20 − 2 = 18, Appendix D gives t.05 = 1.734.
AACSB: Analytic Blooms: Apply Difficulty: 2 Medium Learning Objective: 12-06 Test hypotheses about the slope and intercept by using t tests. Topic: Tests for Significance
65. If the attendance at a baseball game is to be predicted by the equation Attendance = 16,500 − 75 Temperature, what would be the predicted attendance if Temperature is 90 degrees?
A. 6,750
B. 9,750
C. 12,250
D. 10,020
The predicted Attendance is 16,500 − 75(90) = 9,750.
AACSB: Analytic Blooms: Apply Difficulty: 1 Easy Learning Objective: 12-02 Interpret the slope and intercept of a regression equation. Topic: Simple Regression
66. A hypothesis test is conducted at the 5 percent level of significance to test whether the population correlation is zero. If the sample consists of 25 observations and the correlation coefficient is 0.60, then the computed test statistic would be
A. 2.071.
B. 1.960.
C. 3.597.
D. 1.645.
t_calc = r[(n − 2)/(1 − r²)]^(1/2) = (.60)[(25 − 2)/(1 − .60²)]^(1/2) = 3.597. Comment: Requires formula handout or memorizing the formula.
AACSB: Analytic Blooms: Apply Difficulty: 2 Medium Learning Objective: 12-01 Calculate and test a correlation coefficient for significance. Topic: Visual Displays and Correlation Analysis
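The test statistic for zero correlation can be reproduced with a few lines of Python. This is a minimal sketch; the function name correlation_t_stat is an assumption made for the example.

```python
import math

def correlation_t_stat(r, n):
    """t = r * sqrt((n - 2) / (1 - r^2)), with d.f. = n - 2."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))

# Values from the question: r = 0.60, n = 25.
t_calc = correlation_t_stat(0.60, 25)
print(round(t_calc, 3))  # 3.597
```

The result would then be compared with the critical value of Student's t for n − 2 degrees of freedom.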
67. Which of the following is not a characteristic of the F-test in a simple regression?
A. It is a test for overall fit of the model.
B. The test statistic can never be negative.
C. It requires a table with numerator and denominator degrees of freedom.
D. The F-test gives a different p-value than the t-test.
F_calc is the ratio of two variances (mean squares) that measures overall fit. The test statistic cannot be negative because the variances are non-negative. In a simple regression, the F-test always agrees with the t-test.
AACSB: Analytic Blooms: Remember Difficulty: 2 Medium Learning Objective: 12-08 Interpret the standard error, R², ANOVA table, and F test. Topic: Analysis of Variance: Overall Fit
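The agreement between the F-test and the t-test in a simple regression can be checked numerically. The Python sketch below fits a least squares line to a small made-up data set (the data and the function name simple_ols are assumptions for illustration, not part of the test bank) and compares F_calc with the square of t_calc for the slope.

```python
import math

def simple_ols(x, y):
    """Fit y = b0 + b1*x by least squares; return (b1, t_calc, f_calc)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    ssr = sst - sse
    mse = sse / (n - 2)                 # error mean square
    t_calc = b1 / math.sqrt(mse / sxx)  # slope divided by its standard error
    f_calc = ssr / mse                  # MSR / MSE, where MSR = SSR / 1
    return b1, t_calc, f_calc

# Hypothetical data, assumed for illustration only.
x = [1, 2, 3, 4, 5, 6]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.7]
b1, t_calc, f_calc = simple_ols(x, y)
print(f_calc, t_calc ** 2)  # the two values agree up to rounding error
```

With one predictor the numerator of F has one degree of freedom, which is why the two tests yield the same p-value.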
68. A researcher's Excel results are shown below using Femlab (labor force participation rate among females) to try to predict Cancer (death rate per 100,000 population due to cancer) in the 50 U.S. states.
Which of the following statements is not true?
A. The standard error is too high for this model to be of any predictive use.
B. The 95 percent confidence interval for the coefficient of Femlab is −4.29 to −0.28.
C. Significant correlation exists between Femlab and Cancer at α = .05.
D. The two-tailed p-value for Femlab will be less than .05.
The magnitude of s_e depends on Y (and, in this case, the t_calc indicates significance).
AACSB: Analytic Blooms: Apply Difficulty: 2 Medium Learning Objective: 12-06 Test hypotheses about the slope and intercept by using t tests. Topic: Tests for Significance
69. A researcher's results are shown below using Femlab (labor force participation rate among females) to try to predict Cancer (death rate per 100,000 population due to cancer) in the 50 U.S. states.
Which statement is valid regarding the relationship between Femlab and Cancer?
A. A rise in female labor participation rate will cause the cancer rate to decrease within a state.
B. This model explains about 10 percent of the variation in state cancer rates.
C. At the .05 level of significance, there isn't enough evidence to say the two variables are related.
D. If your sister starts working, the cancer rate in your state will decline.
It is customary to express the R² as a percent (here, the t_calc indicates significance).
AACSB: Analytic Blooms: Apply Difficulty: 2 Medium Learning Objective: 12-08 Interpret the standard error, R², ANOVA table, and F test. Topic: Ordinary Least Squares Formulas
70. A researcher's results are shown below using Femlab (labor force participation rate among females) to try to predict Cancer (death rate per 100,000 population due to cancer) in the 50 U.S. states.
What is the R² for this regression?
A. .9018
B. .0982
C. .8395
D. .1605
R² = SSR/SST = .0982, using the sums of squares in the ANOVA table (SSR = 5,377.836).
AACSB: Analytic Blooms: Apply Difficulty: 2 Medium Learning Objective: 12-08 Interpret the standard error, R², ANOVA table, and F test. Topic: Ordinary Least Squares Formulas
71. A news network stated that a study had found a positive correlation between the number of children a worker has and his or her earnings last year. You may conclude that
A. people should have more children so they can get better jobs.
B. the data are erroneous because the correlation should be negative.
C. causation is in serious doubt.
D. statisticians have small families.
There is no a priori basis for expecting causation.
AACSB: Analytic Blooms: Apply Difficulty: 1 Easy Learning Objective: 12-01 Calculate and test a correlation coefficient for significance. Topic: Visual Displays and Correlation Analysis
72. William used a sample of 68 large U.S. cities to estimate the relationship between Crime (annual property crimes per 100,000 persons) and Income (median annual income per capita, in dollars). His estimated regression equation was Crime = 428 + 0.050 Income. We can conclude that
A. the slope is small so Income has no effect on Crime.
B. crime seems to create additional income in a city.
C. wealthy individuals tend to commit more crimes, on average.
D. the intercept is irrelevant since zero median income is impossible in a large city.
Zero median income makes no sense (significance cannot be assessed from given facts).
AACSB: Analytic Blooms: Apply Difficulty: 2 Medium Learning Objective: 12-06 Test hypotheses about the slope and intercept by using t tests. Topic: Simple Regression
73. Mary used a sample of 68 large U.S. cities to estimate the relationship between Crime (annual property crimes per 100,000 persons) and Income (median annual income per capita, in dollars). Her estimated regression equation was Crime = 428 + 0.050 Income. If Income decreases by 1000, we would expect that Crime will
A. increase by 428.
B. decrease by 50.
C. increase by 500.
D. remain unchanged.
The constant has no effect, so ΔCrime = 0.050 ΔIncome = 0.050(−1000) = −50.
AACSB: Analytic Blooms: Apply Difficulty: 1 Easy Learning Objective: 12-02 Interpret the slope and intercept of a regression equation. Topic: Simple Regression
74. Amelia used a random sample of 100 accounts receivable to estimate the relationship between Days (number of days from billing to receipt of payment) and Size (size of balance due in dollars). Her estimated regression equation was Days = 22 + 0.0047 Size with a correlation coefficient of .300. From this information we can conclude that
A. 9 percent of the variation in Days is explained by Size.
B. autocorrelation is likely to be a problem.
C. the relationship between Days and Size is significant.
D. larger accounts usually take less time to pay.
R² = .30² = .09. These are not time-series data, so there is no reason to expect autocorrelation. We cannot judge significance without more information.
AACSB: Analytic Blooms: Apply Difficulty: 3 Hard Learning Objective: 12-08 Interpret the standard error, R², ANOVA table, and F test. Topic: Ordinary Least Squares Formulas
75. Prediction intervals for Y are narrowest when
A. the mean of X is near the mean of Y.
B. the value of X is near the mean of X.
C. the mean of X differs greatly from the mean of Y.
D. the mean of X is small.
Review the formula, which has (x_i − x̄)² in the numerator. The minimum would be when x_i = x̄.
AACSB: Analytic Blooms: Remember Difficulty: 2 Medium Learning Objective: 12-09 Distinguish between confidence and prediction intervals for Y. Topic: Confidence and Prediction Intervals for Y
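To see why the interval is narrowest at the mean of X, the sketch below evaluates the usual prediction interval half-width, t × s_e × sqrt(1 + 1/n + (x − x̄)²/SSxx), at several X values. The data, the standard error, and the function name are hypothetical values chosen for illustration.

```python
import math

def prediction_half_width(x_new, x, s_e, t_crit):
    """Half-width of a prediction interval for an individual Y at x_new."""
    n = len(x)
    x_bar = sum(x) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return t_crit * s_e * math.sqrt(1 + 1 / n + (x_new - x_bar) ** 2 / sxx)

# Hypothetical inputs, assumed for illustration only.
x = [2, 4, 6, 8, 10]   # mean of X is 6
s_e = 1.5              # assumed standard error of the regression
t_crit = 3.182         # t.025 for d.f. = 5 - 2 = 3

widths = {x_new: prediction_half_width(x_new, x, s_e, t_crit) for x_new in (2, 6, 10)}
for x_new, w in widths.items():
    print(x_new, round(w, 3))
```

The half-width is smallest at x = 6 (the mean of X) and grows symmetrically as x moves away from the mean in either direction.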
76. If n = 15 and r = .4296, the corresponding t-statistic to test for zero correlation is
A. 1.715.
B. 7.862.
C. 2.048.
D. impossible to determine without α.
t_calc = r[(n − 2)/(1 − r²)]^(1/2) = (.4296)[(15 − 2)/(1 − .4296²)]^(1/2) = 1.715.
AACSB: Analytic Blooms: Apply Difficulty: 2 Medium Learning Objective: 12-01 Calculate and test a correlation coefficient for significance. Topic: Visual Displays and Correlation Analysis
77. Using a two-tailed test at α = .05 for n = 30, we would reject the hypothesis of zero correlation if the absolute value of r exceeds
A. .2992.
B. .3609.
C. .0250.
D. .2004.
Use r_crit = t.025/(t.025² + n − 2)^(1/2) = (2.048)/(2.048² + 30 − 2)^(1/2) = .3609 for d.f. = 30 − 2 = 28.
AACSB: Analytic Blooms: Apply Difficulty: 2 Medium Learning Objective: 12-01 Calculate and test a correlation coefficient for significance. Topic: Visual Displays and Correlation Analysis
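The critical value of r can be computed directly from the critical t. The sketch below assumes the tabled value t.025 = 2.048 for d.f. = 28; the function name critical_r is made up for the example.

```python
import math

def critical_r(t_crit, n):
    """r_crit = t / sqrt(t^2 + n - 2) for a two-tailed test of zero correlation."""
    return t_crit / math.sqrt(t_crit ** 2 + n - 2)

r_crit = critical_r(2.048, 30)
print(round(r_crit, 4))  # 0.3609
```

Any sample correlation whose absolute value exceeds r_crit would lead to rejecting the hypothesis of zero correlation at this α.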
78. The ordinary least squares (OLS) method of estimation will minimize
A. neither the slope nor the intercept.
B. only the slope.
C. only the intercept.
D. both the slope and intercept.
The OLS method minimizes the sum of squared residuals.
AACSB: Analytic Blooms: Remember Difficulty: 1 Easy Learning Objective: 12-04 Fit a simple regression on an Excel scatter plot. Topic: Ordinary Least Squares Formulas
79. A standardized residual e_i = −2.205 indicates
A. a rather poor prediction.
B. an extreme outlier in the residuals.
C. an observation with high leverage.
D. a likely data entry error.
This residual is beyond ±2s_e but is not an outlier (and without x_i we cannot assess leverage).
AACSB: Analytic Blooms: Apply Difficulty: 2 Medium Learning Objective: 12-11 Identify unusual residuals and high-leverage observations. Topic: Residual Tests
80. In a simple regression, which would suggest a significant relationship between X and Y?
A. Large p-value for the estimated slope
B. Large t statistic for the slope
C. Large p-value for the F statistic
D. Small t-statistic for the slope
The larger the t_calc, the more we feel like rejecting H0: β1 = 0.
AACSB: Analytic Blooms: Remember Difficulty: 2 Medium Learning Objective: 12-06 Test hypotheses about the slope and intercept by using t tests. Topic: Tests for Significance
81. Which is indicative of an inverse relationship between X and Y?
A. A negative F statistic
B. A negative p-value for the correlation coefficient
C. A negative correlation coefficient
D. Either a negative F statistic or a negative p-value
F_calc and the p-value cannot be negative.
AACSB: Analytic Blooms: Remember Difficulty: 1 Easy Learning Objective: 12-08 Interpret the standard error, R², ANOVA table, and F test. Topic: Analysis of Variance: Overall Fit
82. Which is not correct regarding the estimated slope of the OLS regression line?
A. It is divided by its standard error to obtain its t statistic.
B. It shows the change in Y for a unit change in X.
C. It is chosen so as to minimize the sum of squared errors.
D. It may be regarded as zero if its p-value is less than α.
We would reject H0: β1 = 0 if its p-value is less than the level of significance.
AACSB: Analytic Blooms: Remember Difficulty: 2 Medium Learning Objective: 12-06 Test hypotheses about the slope and intercept by using t tests. Topic: Tests for Significance
83. Simple regression analysis means that
A. the data are presented in a simple and clear way.
B. we have only a few observations.
C. there are only two independent variables.
D. we have only one explanatory variable.
Multiple regression has more than one independent variable (predictor).
AACSB: Analytic Blooms: Remember Difficulty: 1 Easy Learning Objective: 12-02 Interpret the slope and intercept of a regression equation. Topic: Simple Regression
84. The sample coefficient of correlation does not have which property?
A. It can range from −1.00 up to +1.00.
B. It is also sometimes called Pearson's r.
C. It is tested for significance using a t-test.
D. It assumes that Y is the dependent variable.
Correlation analysis makes no assumption of causation or dependence.
AACSB: Analytic Blooms: Remember Difficulty: 1 Easy Learning Objective: 12-01 Calculate and test a correlation coefficient for significance. Topic: Visual Displays and Correlation Analysis
85. When comparing the 90 percent prediction and confidence intervals for a given regression analysis
A. the prediction interval is narrower than the confidence interval.
B. the prediction interval is wider than the confidence interval.
C. there is no difference between the size of the prediction and confidence intervals.
D. no generalization is possible about their comparative width.
Individual values of Y vary more than the mean of Y.
AACSB: Analytic Blooms: Remember Difficulty: 1 Easy Learning Objective: 12-09 Distinguish between confidence and prediction intervals for Y. Topic: Confidence and Prediction Intervals for Y
86. Which is not true of the coefficient of determination?
A. It is the square of the coefficient of correlation.
B. It is negative when there is an inverse relationship between X and Y.
C. It reports the percent of the variation in Y explained by X.
D. It is calculated using sums of squares (e.g., SSR, SSE, SST).
R² cannot be negative.
AACSB: Analytic Blooms: Remember Difficulty: 2 Medium Learning Objective: 12-08 Interpret the standard error, R², ANOVA table, and F test. Topic: Ordinary Least Squares Formulas
87. If the fitted regression is Y = 3.5 + 2.1X (R² = .25, n = 25), it is incorrect to conclude that
A. Y increases 2.1 percent for a 1 percent increase in X.
B. the estimated regression line crosses the Y axis at 3.5.
C. the sample correlation coefficient must be positive.
D. the value of the sample correlation coefficient is 0.50.
Units are not percent unless Y is already a percent.
AACSB: Analytic Blooms: Apply Difficulty: 2 Medium Learning Objective: 12-02 Interpret the slope and intercept of a regression equation. Topic: Simple Regression
88. In a simple regression Y = b0 + b1X where Y = number of robberies in a city (thousands of robberies), X = size of the police force in a city (thousands of police), and n = 45 randomly chosen large U.S. cities in 2008, we would be least likely to see which problem?
A. Autocorrelated residuals (because this is time-series data)
B. Heteroscedastic residuals (because we are using totals uncorrected for city size)
C. [...] ANOVA table, and F test. Topic: Ordinary Least Squares Formulas
113. Find the sample correlation coefficient for the following data.
A. .8911
B. .9124
C. .9822
D. .9556
Use Excel =CORREL(XData, YData) to verify your calculation using the formula for r.
AACSB: Analytic Blooms: Apply Difficulty: 2 Medium Learning Objective: 12-01 Calculate and test a correlation coefficient for significance. Topic: Visual Displays and Correlation Analysis
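For readers without Excel, the formula for r can be checked with a short Python sketch. The data below are hypothetical (the question's data table is not reproduced here), and the function name pearson_r is an assumption for the example.

```python
import math

def pearson_r(x, y):
    """Sample correlation: r = SSxy / sqrt(SSxx * SSyy)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    ss_yy = sum((yi - y_bar) ** 2 for yi in y)
    return ss_xy / math.sqrt(ss_xx * ss_yy)

# Hypothetical data, assumed for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
print(round(r, 4))
```

The same pair of columns passed to Excel's CORREL function should give a matching value.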
114. Find the slope of the simple regression ŷ = b0 + b1x.
A. 1.833
B. 3.294
C. 0.762
D. −2.228
Use Excel to verify your calculations using the formulas for b0 and b1.
AACSB: Analytic Blooms: Apply Difficulty: 2 Medium Learning Objective: 12-04 Fit a simple regression on an Excel scatter plot. Topic: Ordinary Least Squares Formulas
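The least squares formulas b1 = SSxy/SSxx and b0 = ȳ − b1·x̄ can also be verified in Python. This is a minimal sketch on hypothetical data (the question's data table is not reproduced here); the function name ols_coefficients is made up for the example.

```python
def ols_coefficients(x, y):
    """Return (b0, b1) for the least squares line y-hat = b0 + b1*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = ss_xy / ss_xx        # slope
    b0 = y_bar - b1 * x_bar   # intercept
    return b0, b1

# Hypothetical data, assumed for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b0, b1 = ols_coefficients(x, y)
print(round(b0, 3), round(b1, 3))
```

Because the slope and intercept are given by closed-form formulas, the least squares solution for a given data set is unique.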
115. Find the sample correlation coefficient for the following data.
A. .7291
B. .8736
C. .9118
D. .9563
Use Excel =CORREL(XData, YData) to verify your calculation using the formula for r.
AACSB: Analytic Blooms: Apply Difficulty: 2 Medium Learning Objective: 12-01 Calculate and test a correlation coefficient for significance. Topic: Visual Displays and Correlation Analysis
116. Find the slope of the simple regression ŷ = b0 + b1x.
A. 2.595
B. 1.109
C. −2.221
D. 1.884
Use Excel to verify your calculations using the formulas for b0 and b1.
AACSB: Analytic Blooms: Apply Difficulty: 2 Medium Learning Objective: 12-04 Fit a simple regression on an Excel scatter plot. Topic: Ordinary Least Squares Formulas
117. A researcher's results are shown below using n = 25 observations.
The 95 percent confidence interval for the slope is
A. [−3.282, −1.284].
B. [−4.349, −0.217].
C. [1.118, 5.026].
D. [−0.998, +0.998].
For d.f. = n − 2 = 25 − 2 = 23, t.025 = 2.069, so −2.2834 ± (2.069)(0.99855).
AACSB: Analytic Blooms: Apply Difficulty: 2 Medium Learning Objective: 12-05 Calculate and interpret confidence intervals for regression coefficients. Topic: Tests for Significance
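The interval arithmetic can be reproduced in Python. The sketch below takes the slope, its standard error, and the tabled t value from the explanation above; the function name slope_confidence_interval is an assumption for the example.

```python
def slope_confidence_interval(b1, se_b1, t_crit):
    """Return (lower, upper) = b1 -/+ t_crit * se_b1."""
    margin = t_crit * se_b1
    return b1 - margin, b1 + margin

# Slope, standard error, and t.025 (d.f. = 23) from the explanation.
lower, upper = slope_confidence_interval(-2.2834, 0.99855, 2.069)
print(round(lower, 3), round(upper, 3))  # -4.349 -0.217
```

Since the interval does not include zero, the slope differs significantly from zero at α = .05 in a two-tailed test.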
118. A researcher's regression results are shown below using n = 8 observations.
The 95 percent confidence interval for the slope is
A. [1.333, 2.284].
B. [1.602, 2.064].
C. [1.268, 2.398].
D. [1.118, 2.449].
For d.f. = n − 2 = 8 − 2 = 6, t.025 = 2.447, so 1.8333 ± (2.447)(0.2307).
AACSB: Analytic Blooms: Apply Difficulty: 2 Medium Learning Objective: 12-05 Calculate and interpret confidence intervals for regression coefficients. Topic: Tests for Significance
119. Bob thinks there is something wrong with Excel's fitted regression. What do you say?
A. The estimated equation is obviously incorrect.
B. The R² looks a little high but otherwise it looks OK.
C. Bob needs to increase his sample size to decide.
D. The relationship is linear, so the equation is credible.
A visual estimate of the slope is Δy/Δx = (625 − 100)/(200 − 0) = 2.625, so the indicated slope less than 1 must be wrong, plus the visual intercept is 100 (not 154.61) and the fit seems better than R² = .2284.
AACSB: Analytic Blooms: Apply Difficulty: 3 Hard Learning Objective: 12-04 Fit a simple regression on an Excel scatter plot. Topic: Ordinary Least Squares Formulas
Short Answer Questions
120. Pedro became interested in vehicle fuel efficiency, so he performed a simple regression using 93 cars to estimate the model CityMPG = β0 + β1 Weight where Weight is the weight of the vehicle in pounds. His results are shown below. Write a brief analysis of these results, using what you have learned in this chapter. Is the intercept meaningful in this regression? Make a prediction of CityMPG when Weight = 3000, and also when Weight = 4000. Do these predictions seem believable? If you could make a car 1000 pounds lighter, what change would you predict in its CityMPG?
It is reasonable that a causal relationship might exist between a vehicle's weight and its MPG. We expect a negative slope (heavier vehicles would get lower MPG). The coefficient of Weight differs from zero at any common value of α (the p-value is less than .0001) and the F statistic is huge. The confidence interval for the coefficient of the predictor Weight does not include zero. The highly significant predictor Weight is consistent with the high coefficient of determination (R² = .711), which says that well over half the variation in MPG is explained by Weight. If Weight = 3000, we predict MPG = 47.048 − .0080 Weight = 47.048 − .0080(3000) = 23.05 mpg. If Weight = 4000, we predict MPG = 47.048 − .0080 Weight = 47.048 − .0080(4000) = 15.05 mpg. The intercept is not meaningful since no vehicle has zero weight or a weight close to zero.
Feedback: It is reasonable to postulate that a causal relationship might exist between a vehicle's weight and its MPG. Our a priori expectation would be that the slope should be negative since we would expect that heavier vehicles would get lower MPG. The coefficient of Weight differs from zero at any common value of α (the p-value is less than .0001) and the F statistic is huge. The confidence interval for the coefficient of the predictor Weight does not include zero. The slope's sign is negative, as anticipated a priori. The highly significant predictor Weight is consistent with the high coefficient of determination (R² = .711), which says that well over half the variation in MPG is explained by Weight. If Weight = 3000, we predict MPG = 47.048 − .0080 Weight = 47.048 − .0080(3000) = 23.05 mpg. When Weight = 4000, we would predict MPG = 47.048 − .0080 Weight = 47.048 − .0080(4000) = 15.05 mpg. The intercept is not meaningful since no vehicle has zero weight or any weight close to zero.
AACSB: Reflective Thinking Blooms: Evaluate Difficulty: 3 Hard Learning Objective: 12-06 Test hypotheses about the slope and intercept by using t tests. Topic: Tests for Significance
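The predictions in this answer, and the effect of a 1000-pound weight reduction that the question also asks about, follow directly from the fitted equation. The Python sketch below uses the coefficients quoted in the feedback (47.048 and −.0080); the function name predict_city_mpg is an assumption for the example.

```python
def predict_city_mpg(weight_lb, b0=47.048, b1=-0.0080):
    """Predicted CityMPG from the fitted line b0 + b1 * Weight."""
    return b0 + b1 * weight_lb

for weight in (3000, 4000):
    print(weight, round(predict_city_mpg(weight), 2))  # 23.05 and 15.05 mpg

# Making a car 1000 pounds lighter changes Weight by -1000,
# so the predicted change in CityMPG is b1 * (-1000).
change = -0.0080 * (-1000)
print(round(change, 1))  # 8.0
```

Under this fitted model, a car that is 1000 pounds lighter would be predicted to gain about 8 mpg, which matches the gap between the two predictions above.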
121. Mary noticed that old coins are smoother and more worn. She weighed 31 nickels and recorded their age, and then performed a simple regression to estimate the model Weight = β0 + β1 Age where Weight is the weight of the coin in grams and Age is the age of the coin in years. Her results are shown below. Write a brief analysis of these results, using what you have learned in this chapter. Make a prediction of Weight when Age = 10, and also when Age = 20. What does this tell you? Is the intercept meaningful in this regression?
It is reasonable to postulate a causal relationship between a coin's age and its weight (negative slope, since we would expect that coins will wear down with usage). The coefficient of Age differs from zero at any common α (the p-value is less than .0001) and the F test statistic is large. The confidence interval for the coefficient of Age does not include zero, and its sign is negative, as anticipated a priori. Despite the significant predictor Age, the coefficient of determination (R² = .424) shows that less than half the variation in nickel weights is explained by Age. If Age = 10, we predict Weight = 5.0210 − .0040 Age = 5.0210 − .0040(10) = 4.981 gm. If Age = 20, we predict Weight = 5.0210 − .0040 Age = 5.0210 − .0040(20) = 4.941 gm. The intercept is meaningful if Age = 0 as in the sample data set (or at least some Age value near zero). The intercept is logically meaningful because Age = 0 is something we might observe (i.e., a newly minted nickel).