Statistics
Walter Antoniotti
21st Century Learning Products
Third Edition
ISBN 1-929850-01-8
Copyright © 2001 by 21st Century Learning Products
All rights reserved.
QUICK NOTES is a registered trademark of 21st Century Learning Products.
Fred and Lulu have been provided by Corel Draw and Image Club Graphics (403-262-8008), respectively.
21st Century Learning Products
227 Baboosic Lake Road
Merrimack, NH 03054
603-424-4665
800-253-6595
[email protected] www.businessbookmall.com
Dedication
This book is dedicated to the many teachers who spend countless hours developing class handouts to meet the learning styles and ability levels particular to their students. I have been privileged to learn from many such teachers, the most pertinent of whom is the late Dr. Paul Gawthrop, my Marietta College Statistics teacher. Quick Notes Statistics was modeled after his Statistics course outline.
Very Special Thank You
To Professors Carl T. Brezovec, Normand A. Dion, William H. Jack Jr., Candace B. McKinniss, Robert F. Wiesenauer, and P. Teresa Farnum of Franklin Pierce College, Rindge, New Hampshire, whose suggestions and encouragement improved the book and made the project more enjoyable.
To my Franklin Pierce College Division of Professional Studies Statistics students, who enhanced the development of The Quick Notes Learning System for Statistics.
To Jill Moon, graduate statistics student at George Mason University, Washington DC, who extensively reviewed an early draft of the book.
To Professor William Benoit, Chair of the Business Department, Plymouth State College, Plymouth, New Hampshire, for his invaluable suggestions.
About The Author
Walter Antoniotti began teaching Statistics over 30 years ago for Daniel Webster College, Nashua, New Hampshire, where he became an Associate Professor of Business Administration and Chairperson of the Department of Aviation Management. During the past years as Director and then Dean of Continuing Education for Franklin Pierce College, Rindge, New Hampshire, Walter helped build one of New England's most successful Continuing Education Programs. Today, as Franklin Pierce College's Special Assistant for Professional Studies Program Development, Walter enjoys teaching, writing, and investigating areas of interest to himself and the College. Walter Antoniotti has a Bachelor of Science degree in Business Administration from Marietta College, Marietta, Ohio, and a Masters of Business Administration degree from Northeastern University, Boston, Massachusetts.
The Quick Notes Philosophy

The Theory of Optimum Amounts
There exists for every CONCEPT to be learned an optimum amount of explanatory material. There exists for every TOPIC to be learned an optimum number of concepts to be integrated. There exists for every SUBJECT to be learned an optimum number of topics to be mastered. By limiting explanatory material to optimum amounts, Quick Notes maximizes learning.
The Theory of Optimum Placement
There exists for every CONCEPT to be learned and integrated into a TOPIC of concern a unique placement of elements that will maximize learning. By placing related elements on the same page or facing pages, Quick Notes maximizes learning.
The Optimum Relationship Between Content and Process
Education is the learning of content and process. Content is the what of learning; it's the arithmetic of mathematics and the grammar of communication. Process is the application of content; it's the problem-solving of mathematics and the writing of communication. Learning begins with content and expands to process. By making the learning of content easier, Quick Notes makes the learning of process easier.
Education Requires Sacrifice and Discipline
Sacrifice and discipline, which are required to do schoolwork and homework, are essential parts of the educational process. Applying the Quick Notes Philosophy will make this sacrifice and discipline less frustrating, but it will not make education fun. If schoolwork and homework were supposed to be fun, they would be called schoolfun and homefun. By learning to sacrifice and exhibit discipline while going to school, a young person begins the process of becoming an adult.
The World of Multiple Intelligence
Howard Gardner's Theory of Multiple Intelligence defines these eight kinds of human intelligence.
1. Mathematical-logical (problem solving, fix or repair, program)
2. Spatial (dance, sports, driving a bus)
3. Bodily-kinesthetic (acting, mime, sports)
4. Musical-rhythmic (composing, playing music, clapping)
5. Verbal-linguistic (reading, using words, public speaking, storytelling)
6. Interpersonal (social skills, reading other people, working in a group)
7. Intrapersonal (introspection, self-assessment, goal making, vision, planning)
8. Naturalist (able to distinguish among, classify, and use environmental features)

Mathematical-logical and Verbal intelligence represent core intelligence. Skills related to core intelligence are emphasized by traditional schools. People with above average ability in any of the eight areas of intelligence have special intelligence. The world of work rewards people who develop skills associated with their special intelligence, provided they meet minimum skill requirements associated with core intelligence.
Determining Appropriate Education for a World of Multiple Intelligence
Determining educational requirements begins by matching a person's special intelligence with careers that reward this intelligence. Careers have many levels of competition. Choosing one's appropriate level requires honest analysis of intelligence, motivation, and personal needs. For example, the health industry requires doctors and nurses, hospital directors and floor supervisors, x-ray technicians and physical therapists. Career success will be enhanced by choosing an appropriate level of competition, one in which core and special intelligence requirements are reasonably satisfied. Once the competitive level is set, the appropriate education, considering minimum core intelligence and special intelligence requirements, can be determined. Success at any level will be enhanced by improving skills related to non-core and non-special intelligence.

A person might not like going to the office picnic or talking to potential customers, but developing these skills is important to economic success. The dynamic nature of business may cause skill requirements for a particular career level to change. In addition, people often want to compete at a higher level. As a result, an individual may frequently have to compare their core and special intelligence with new skill requirements. Once this analysis is completed, choosing an education appropriate for the enhancement of these skills may begin.
Developing Special Skills is Important
Once minimum core intelligence skill requirements have been satisfied for a given career level, economic and academic returns from education will be maximized by developing special intelligence skills. People who ignore the process of determining appropriate education for a world of multiple intelligence may receive little return from their education. Bureau of the Census 1992 data indicate that approximately 25% of the bachelor degree holders earn less than the median high school graduate and approximately 20% of the high school graduates earn more than the median college graduate. Percentages vary depending upon age, gender, and other demographic characteristics. National Survey of Adult Literacy tests measuring Prose, Document (understanding forms), and Quantitative skills, conducted by the Department of Education in 1992, reported that 5 to 20% of four-year college graduates have skill levels below median high school graduates.
Using The Quick Notes Learning System
Quick Notes explain basic statistics principles with clear, concise two-page outlines. The beginning of each outline contains basic definitions, theories, and concepts. The nature of statistics is explained at the beginning of chapter 1. See page 62 for a complete review of areas covered by Quick Notes Statistics.
Chapter 1
Statistics Is About Using Data in Decision Making

I. The nature of statistics
   A. Many disciplines use statistics.
      1. Business and Economics
      2. Natural and Social Sciences
      3. Physical Sciences
      4. Education
      5. Politics
   B. Basic definitions
      1. Population: totality under study, such as the students attending a school
      2. Sample: subset of a population, such as the students in one class of a school
      3. Parameter: a characteristic of a population, such as the average age of students attending a school
      4. Statistic: a characteristic of a sample, such as the average age of students in a class of a school
   C. Statistics is the science of collecting, organizing, presenting, analyzing, and interpreting numerical data in relation to the decision-making process.
      1. Descriptive statistics summarizes numerical data using numbers and graphs. The grades of students in a class can be summarized with averages and line graphs.
      2. Inferential statistics uses sample statistics to estimate population parameters. The average age of students in a class can be used to estimate the average age of students attending a school.
Linda's Video Showcase
A continuous example of how Linda Smith calculates statistics and uses them when making business decisions for Linda's Video Showcase is an integral part of Quick Notes Statistics. Videotape rentals will be analyzed to learn about the relationship between sales revenue and advertising expenditures. Customer satisfaction will be measured, as will the effectiveness of her sales team. See page 164 for a complete review of the topics she will explore.

Chapter 2
Summarizing Data

I. Linda's Video Showcase
   A. Upon graduating from college, Linda Smith opened Linda's Video Showcase, a retail business specializing in videotape rentals.
   B. Linda will use descriptive statistics to analyze this daily video rentals data set.

      Data set: 76 88 53 66 97 73 64 82 77 57 93 85 70 76 68

      Linda's first step was to make a list of data by order of magnitude, called an array. She also calculated a range (high number minus the low number) for the data.

      Array: 53 57 64 66 68 70 73 76 76 77 82 85 88 93 97
      Range: High - Low = 97 - 53 = 44
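
As a quick check of Linda's first step, the array and range can be sketched in a few lines of Python. The list of rentals is read from the data set as printed above, so treat the exact values and their order as an assumption; the variable names are illustrative only.

```python
# Daily video rentals, read from the data set printed above (assumed values).
rentals = [76, 88, 53, 66, 97, 73, 64, 82, 77, 57, 93, 85, 70, 76, 68]

# An array is the data listed in order of magnitude.
array = sorted(rentals)

# Range = high number minus low number.
data_range = array[-1] - array[0]

print("Array:", array)
print("Range:", array[-1], "-", array[0], "=", data_range)
```

Sorting first makes the high and low values the last and first entries of the array, which is the same shortcut Linda uses when she reads the range off her ordered list.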
II. Frequency distributions
   A. A frequency distribution divides data into numerical groupings and depicts the number of observations occurring within each grouping. Academic grades are often summarized with a frequency distribution, with each of the five grades representing a group. A grade of B is usually between 79 and 90. The first three columns of the chart at the bottom of this page are a frequency distribution of the above rental data.
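
The chart itself is not reproduced here, so the following is only a minimal sketch of how such a frequency distribution could be tallied. The five classes of width 10 starting at 50 are an assumption chosen for illustration, not necessarily the class limits used in the book, and the rental values are read from the data set as printed above.

```python
# Daily video rentals, read from the data set printed above (assumed values).
rentals = [76, 88, 53, 66, 97, 73, 64, 82, 77, 57, 93, 85, 70, 76, 68]

# Hypothetical class limits: five classes of width 10, starting at 50.
first_lower_limit = 50
class_width = 10
number_of_classes = 5

total = len(rentals)
cumulative = 0
rows = []
for i in range(number_of_classes):
    lower = first_lower_limit + i * class_width
    upper = lower + class_width - 1
    # Frequency: number of observations falling within this class.
    frequency = sum(1 for x in rentals if lower <= x <= upper)
    # Relative frequency: class frequency divided by total frequencies.
    relative = frequency / total
    cumulative += frequency
    rows.append((lower, upper, frequency, relative, cumulative))

print("Class   Freq  Rel.  Cum.")
for lower, upper, frequency, relative, cum in rows:
    print(f"{lower}-{upper}  {frequency:>4}  {relative:.2f}  {cum:>4}")
```

Changing `first_lower_limit` or `class_width` shows how different class limits reshape the same data, which is the experiment suggested to readers who use statistics software.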
Practice Sets Provide Reinforcement
Each Quick Notes chapter is followed by a Practice Set of similar design. If you have trouble answering a Practice Set problem, just turn back two pages and look at the same page location for the appropriate Quick Notes demonstration problem.

Practice Set 2
Summarizing Data

I. Darin's Music Emporium
   A. Upon graduating from college, Darin Jones opened Darin's Music Emporium. The company sells music-related hardware and software. We will use descriptive statistics to analyze company sales data.
   B. Darin recently collected the following Walkman CD Recorder sales data.
      Units sold per day: 17, 22, 17, 8, 12, 15, 14, 16, 21, 29, 16
      1. Make an array and calculate the range of this data.
      2. Calculate an appropriate class width for this data.
II. Make a 5-class frequency distribution using stated class limits for the first class of 5-9 sales units. Those using statistics software should try other class limits with their software and print the one with the most symmetrical distribution.

Darin's Music Emporium
Practice Sets deal with how Darin Jones calculates and uses statistics when managing Darin's Music Emporium. Then he buys Future Horizons Corporation. It requires he study product quality control and other issues of concern to manufacturing companies.

Quick Questions follow Practice Sets and review definitions and other important concepts.
Quick Questions 2
Summarizing Data

I. Place the number of the appropriate formula or phrase next to the item it describes.
   A. Mutually-exclusive events
   B. Relative frequency
   C. Class midpoint
   D. Approximate class width
   E. All-inclusive events
   F. Ogive

   1. do not contain the same outcome
   2. (X1 + X2) / 2
   3. range / number of classes
   4. class frequency / total frequencies
   5. cumulative frequency distribution
   6. a place for every outcome

Complete Solutions
Complete solutions to Practice Sets, Quick Questions, and tests have been provided to help with difficult concepts.

Reviews and Tests
The first 4 parts of Quick Notes end with a formula review and a test. Part V is a unique review of Quick Notes Statistics.

Probability Formula Review
I. Types and characteristics of probability
   A. Types of probability
      1. Classical: P(A) =
      2. Empirical: P(A) =

Probability Test
I. Average hours worked by manufacturing workers is normally distributed with a mean of 41 hours and a standard deviation of .5 hours. Graph and solve the following problems.
   P(41.0 hours ≤ x ≤ 42.5 hours)
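
For readers who want to check this kind of normal probability by computer, here is a minimal sketch using Python's standard library. The mean and standard deviation are taken exactly as they appear in the sample problem above (41 and .5 hours); if your copy of the problem states a different standard deviation, change `sigma_hours`. The variable names are illustrative only.

```python
from statistics import NormalDist

mean_hours = 41.0
sigma_hours = 0.5  # assumed: the value as printed in the sample problem above

hours = NormalDist(mu=mean_hours, sigma=sigma_hours)

# Convert each boundary to a z score: z = (x - mean) / standard deviation.
z_low = (41.0 - mean_hours) / sigma_hours
z_high = (42.5 - mean_hours) / sigma_hours

# Area under the normal curve between the two boundaries.
probability = hours.cdf(42.5) - hours.cdf(41.0)

print(f"z runs from {z_low:.2f} to {z_high:.2f}")
print(f"P(41.0 <= x <= 42.5) = {probability:.4f}")
```

The z scores are what you would look up in a printed normal table; the `cdf` calls do the same lookup numerically.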
Introducing Fred Look Ahead and Lulu Review

I'm Fred Look Ahead. Use me as a reminder to look over the main points of a learning unit before reading it in detail. Looking around first will make learning easier.

I'm Lulu Review. I'm here to remind you to review once in a while. So jump on board, and we will review together.
Message to Quick Notes Users

For All Users
1. Quick Notes summarize difficult concepts. Most students review them a number of times.
2. Complete solutions to all Practice Sets and Quick Questions are provided in Part VI. Reading these answers is a great way to review basic concepts, especially when studying for a test.
3. Chapters 25 to 27 review important concepts and are designed to tie everything together. Relevant sections of these chapters should be reviewed after completing each part of Quick Notes.
For People Not Using Statistics Software
1. Information provided on page 22 of chapter 5 and in all of chapter 6 has been provided in chapters 3 and 4 and may be skipped. Warning: This information may be required by those taking a college statistics course. Check your syllabus to see if grouped measures are required.
2. Quick answers may differ slightly from your answers because of rounding. When answers differ, compare your procedures with those of the appropriate Quick Notes demonstration problem and check your math.
3. Ignore Data Sets For People Using Statistics Software.
For People Using Quick Notes Data Files and Statistics Software Directions
1. Data files, practice set instructions, and computer generated answers are available for popular statistics programs. If purchased, they are on the disk affixed to the back cover. Set your word processor to Rich Text Format and load the file compdir for directions on how to use your software with Quick Notes Statistics™.
2. Information provided on page 22 of chapter 5 and in all of chapter 6 has been provided elsewhere and may be skipped. Warning: This information may be required for those taking a college statistics course. Check your syllabus to see if grouped measures are required. Grouped calculations will differ from ungrouped calculations.
Free Study Aids to help with Statistics, Excel, Accounting, Economics, Management, and Mathematics are available at www.businessbookmall.com.
Table of Contents

Part I  Descriptive Statistics
   1  Statistics Is About Using Data in Decision Making
   2  Summarizing Data
   3  Measuring Central Tendency of Ungrouped Data
   4  Measuring Dispersion of Ungrouped Data
   5  Measuring Central Tendency of Grouped Data
   6  Measuring Dispersion of Grouped Data
      Descriptive Statistics Formula Review and Test

Part II  Probability, The Basis for Inferential Statistics
   7  Understanding Probability
   8  Probability Part II Multiplication Rules
   9  Discrete Probability Distributions
  10  Continuous Normal Probability Distributions
  11  Sampling and the Sampling Distribution of the Means
  12  Sampling Distributions Part II
      Probability Formula Review and Test

Part III  Inferential Statistics
  13  Large Sample Hypothesis Testing
  14  Large Sample Hypothesis Testing Part II
  15  Hypothesis Testing of Population Proportions
  16  Small Sample Hypothesis Testing Using Student's t Test
  17  Statistical Quality Control
  18  Analysis of Variance
  19  Two-Factor Analysis of Variance
  20  Nonparametric Hypothesis Testing of Nominal Data
  21  Nonparametric Hypothesis Testing of Ordinal Data Part I
  22  Nonparametric Hypothesis Testing of Ordinal Data Part II
      Inferential Statistics Executive Summary, Formula Review, and Test

Part IV  Correlation and Regression
  23  Correlation Analysis
  24  Simple Linear Regression Analysis
      Correlation and Regression Formula Review and Test

Part V  Cumulative Review
  25  Taxonomy of Statistics
  26  Taxonomy of Parametric Statistics
  27  Problem Review

Part VI  The Professor's Answer Book
      Appendix I    Complete Solutions to Practice Sets
      Appendix II   Complete Solutions to Quick Questions
      Appendix III  Complete Solutions to Tests

Part VII  Statistical Tables

Part VIII  Index
Chapter 1
Statistics Is About Using Data in Decision Making

Remember to look at the key points of a learning unit before studying them in detail. Here you will see that this unit covers definitions related to the nature of statistics, the nature of measurement, and the collection of data.
I. The nature of statistics
   A. Many disciplines use statistics.
      1. Business and Economics
      2. Natural and Social Sciences
      3. Physical Sciences
      4. Education
      5. Politics
   B. Basic definitions
      1. Population: totality under study, such as the students attending a school
      2. Sample: subset of a population, such as the students in one class of a school
      3. Parameter: a characteristic of a population, such as the average age of students attending a school
      4. Statistic: a characteristic of a sample, such as the average age of students in a class of a school
   C. Statistics is the science of collecting, organizing, presenting, analyzing, and interpreting numerical data in relation to the decision-making process.
      1. Descriptive statistics summarizes numerical data using numbers and graphs. The grades of students in a class can be summarized with averages and line graphs.
      2. Inferential statistics uses sample statistics to estimate population parameters. The average age of students in a class can be used to estimate the average age of students attending a school.

II. The nature of measurement
   A. Variable: an activity subject to variation, e.g., grades on a statistics test and how someone feels
   B. Quantitative versus qualitative variables
      1. Quantitative variable: expressed numerically, e.g., a grade of 85 and a body temperature of 101 degrees
      2. Qualitative variable: not expressed numerically, e.g., a grade of B and someone feeling poorly
   C. Discrete versus continuous variables
      1. Discrete: only finite values, such as the countable numbers, can exist on the x-axis, e.g., defects in a tire and the number correct on a true-or-false statistics exam
      2. Continuous: measurement may assume any value associated with an uninterrupted scale, e.g., a bottle may contain 12.02 ounces of liquid refreshment and a person may weigh 175.25 pounds
[Figure: two frequency (f) charts, one labeled Discrete with Defects on the x-axis and one labeled Continuous with Ounces on the x-axis]

The x-axis as shown here represents 1 of 4 measurement scales important to our study of statistics. The y-axis often measures how often an x-axis measurement has occurred. This is called frequency (f).
   D. Measurement scales (levels) determine data's exactness.
      1. Nominal scaled data is the weakest, providing the least information. Data can only be put into groups called categories and be counted. No order or scale exists. Examples include the number of shoppers who buy or do not buy when going into a store and the number of parts that pass or do not pass inspection.
      2. Ordinal scaled data can be arranged in order. An example would be the number of customers who think a product is poor, average, or good. While good is better than average, no attempt is made to quantify such differences into measurable intervals.
      3. Interval scaled data allows for the quantification of difference. Fahrenheit and Celsius thermometers have interval scales. These scales have equal intervals. But their measure of zero is arbitrary because zero degrees does not measure the absence of heat. Such arbitrary starting points place restrictions on the math operations that can be done with interval scaled data. For example, the use of proportions is not appropriate.
      4. Ratio scaled data has an inherent starting point. Temperature measured on a Kelvin scale is ratio scaled data because zero represents the absence of heat. Total variable costs are ratio scaled data because costs are zero when production is zero. Total costs, because of fixed costs, are interval scaled data.
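The restriction on proportions for interval scaled data can be checked with a little arithmetic. The following is a minimal sketch, not part of the original outline: the two temperatures (20 and 10 degrees Celsius) and the helper function names are illustrative assumptions. It shows that differences agree between Celsius and Kelvin, while the ratio "twice as hot" holds only on whichever scale happens to be used, unless the scale has a true zero.

```python
# Compare the same two temperatures on three scales.
# The temperatures below are illustrative values, not data from the text.

def celsius_to_fahrenheit(c):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return c * 9 / 5 + 32


def celsius_to_kelvin(c):
    """Convert degrees Celsius to kelvins."""
    return c + 273.15


warm_c, cool_c = 20.0, 10.0

# Differences survive a change of zero point: Celsius and Kelvin share a unit size.
diff_c = warm_c - cool_c
diff_k = celsius_to_kelvin(warm_c) - celsius_to_kelvin(cool_c)

# Ratios depend on where zero sits, so the three scales disagree.
ratio_c = warm_c / cool_c
ratio_f = celsius_to_fahrenheit(warm_c) / celsius_to_fahrenheit(cool_c)
ratio_k = celsius_to_kelvin(warm_c) / celsius_to_kelvin(cool_c)

print(f"difference: {diff_c:.2f} C, {diff_k:.2f} K")
print(f"ratio on Celsius scale:    {ratio_c:.3f}")
print(f"ratio on Fahrenheit scale: {ratio_f:.3f}")
print(f"ratio on Kelvin scale:     {ratio_k:.3f}")
```

Only the Kelvin ratio describes a physical proportion, because only that scale starts at the absence of heat.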
III. Collecting data
   A. Primary versus secondary sources of data
      1. Primary source data is published by the original collector (data collected by the Bureau of the Census).
      2. Secondary source data is published by a noncollector (Bureau of the Census data printed in a newspaper).
   B. Methods of gathering data
      1. Observation
      2. Personal interview
      3. Telephone interview
      4. Self-administration is when a form (questionnaire) is completed by the respondent (individual, company, etc.).
      5. Registration is when the respondent is responsible for bringing the desired information to a prescribed location (registering a car).
   C. Data gathering alternatives
      1. A survey is the collecting of information concerning existing material.
         a. A census contains information from an entire population.
         b. A sample contains information from part of a population.
            1) Sampling error occurs because a sample is taken rather than a census. The primary cause of sampling error is that the sample is not representative of the population.
            2) Nonsampling error, which occurs with any survey, exists because of poor collection techniques. Because a sample is smaller than a census, more effort may be put into eliminating nonsampling error. This means that limited funds may make a sample more accurate than a census.
      2. An experiment is a process for generating and measuring data.
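The difference between a census and a sample, and the sampling error that results, can be illustrated with a small simulation. This is only a sketch under stated assumptions: the population of 200 student ages is made up with a seeded random generator, and the sample size of 25 stands in for "one class." None of these numbers come from the text.

```python
import random
from statistics import mean

# Hypothetical population: ages of 200 students attending a school (made-up data).
rng = random.Random(42)
population = [rng.randint(14, 18) for _ in range(200)]

# A census uses every member, so it yields the parameter itself.
parameter = mean(population)

# A sample uses part of the population, so it yields a statistic.
sample = rng.sample(population, 25)
statistic = mean(sample)

# Sampling error exists only because a sample was taken rather than a census.
sampling_error = statistic - parameter

print(f"population mean (parameter): {parameter:.2f}")
print(f"sample mean (statistic):     {statistic:.2f}")
print(f"sampling error:              {sampling_error:+.2f}")
```

Changing the seed or the sample size changes the statistic but not the parameter, which is the point of the distinction.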
Quick Questions
Statistics Is About Using Data in Decision Making

Place the number of the appropriate description next to the item it describes.

   A. Statistic
   B. Parameter
   C. Population
   D. Discrete
   E. Quantitative variable
   F. Secondary source data
   G. Sample
   H. Inferential statistics
   I. Continuous
   J. Primary source data

   1. Subset of a population
   2. Expressed numerically
   3. The use of sample statistics to estimate population parameters
   4. Characteristic of a sample
   5. Only finite values can exist on the x-axis
   6. Published by the original collector
   7. Measurement may assume any value associated with an uninterrupted scale
   8. Published by a noncollector
   9. Characteristic of a population
   10. Totality under study

See page QQ 3 of Appendix II for Complete Solutions to Quick Questions.
Chapter 2
Summarizing Data

I. Linda's Video Showcase
   A. Upon graduating from college, Linda Smith opened Linda's Video Showcase, a retail business specializing in videotape rentals.
   B. Linda will use descriptive statistics to analyze this daily video rentals data set.

      76  88  53  66  97  73  64  82  77  57  93  85  70  76  68

   C. Linda's first step was to make a list of data by order of magnitude, called an array. She also calculated a range (the high number minus the low number) for the data.

      Array: 53  57  64  66  68  70  73  76  76  77  82  85  88  93  97

      Range = High - Low = 97 - 53 = 44
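Linda's first step can be reproduced in a few lines. The sketch below uses the fifteen daily rental counts listed above; the variable names are illustrative choices.

```python
# Daily video rentals from Linda's Video Showcase, in the order recorded.
rentals = [76, 88, 53, 66, 97, 73, 64, 82, 77, 57, 93, 85, 70, 76, 68]

# An array is the data listed by order of magnitude.
array = sorted(rentals)

# The range is the high number minus the low number.
data_range = array[-1] - array[0]

print("Array:", array)
print("Range:", array[-1], "-", array[0], "=", data_range)
```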
II. Frequency distributions
   A. A frequency distribution divides data into numerical groupings and depicts the number of observations occurring within each grouping.
      1. Academic grades are often summarized with a frequency distribution, with each of the five grades representing a group. A grade of B is usually between 79 and 90.
      2. The first three columns of the chart at the bottom of this page are a frequency distribution of the above rental data.
   B. A grouping is called a class.
      1. Class limits state the extremes of a class. Their difference is called the class width.
      2. Classes must be mutually exclusive in that a piece of data (outcome) may belong to only one class.
      3. Classes must be all-inclusive (collectively exhaustive) in that there must be a class for every outcome.
   C. Data is often summarized with 5 to 15 classes.
      1. A class width should be easily divisible, i.e., 5, 10, 50, 100, 500, etc.
      2. This formula is used with a number such as five to determine an approximate class width.
         \( \text{approximate class width} = \frac{\text{range}}{\text{number of classes}} = \frac{44}{5} = 8.8 \)
      3. Data that is naturally clustered should be so clustered in the distribution.
   D. If possible, all classes should be of equal size and contain at least one outcome.
   E. Rounded class limits are called stated class limits.
      1. For example, a first class with stated class limits of 50-59 would have real class limits of 49.5-59.5.
      2. Outcomes equal to the upper real limit belong to the next higher class. That is, the outcome 59.5 would belong to the second class.
   F. A tally is a vertical line used to count class outcomes.
      1. The total outcomes of a class are its frequency (rate of occurrence).
      2. Frequency, expressed as a decimal, is called relative frequency.
         \( \text{relative frequency} = \frac{\text{class frequency}}{\text{total frequencies}} \)
   G. Cumulative frequency is measured by more-than and less-than ogives. Ogives summarize the cumulative number of outcomes over or under each real class limit. Frequency, relative frequency, and cumulative frequency are calculated below and graphed on the next page.
Linda's Video Showcase
Daily Rentals Beginning 1/2/98

Stated Class   Real Class    Tally    Frequency   Relative Frequency   Cumulative Frequency
Limits         Limits                 (f)         (f ÷ n)              More-than     Less-than
50-59          49.5-59.5     II       2           0.13                 49.5 is 15    49.5 is 0
60-69          59.5-69.5     III      3           0.20                 59.5 is 13    59.5 is 2
70-79          69.5-79.5     IIIII    5           0.34                 69.5 is 10    69.5 is 5
80-89          79.5-89.5     III      3           0.20                 79.5 is 5     79.5 is 10
90-99          89.5-99.5     II       2           0.13                 89.5 is 2     89.5 is 13
                                                                       99.5 is 0     99.5 is 15
Totals                                n = 15      1.00
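The columns of this chart can be recomputed from the raw rental data. The following is a minimal sketch, assuming the five stated class limits shown in the chart; the variable names are illustrative. One detail worth noting: 5 ÷ 15 rounds to 0.33, so the chart's 0.34 appears to be adjusted upward so that the relative frequency column totals 1.00.

```python
# Daily video rentals from Linda's Video Showcase.
rentals = [76, 88, 53, 66, 97, 73, 64, 82, 77, 57, 93, 85, 70, 76, 68]
n = len(rentals)

# Approximate class width: range divided by the desired number of classes.
approx_width = (max(rentals) - min(rentals)) / 5

# Stated class limits taken from the chart.
stated_limits = [(50, 59), (60, 69), (70, 79), (80, 89), (90, 99)]

# Frequency: number of outcomes falling inside each class.
frequencies = [sum(lo <= x <= hi for x in rentals) for lo, hi in stated_limits]

# Relative frequency: class frequency divided by total frequencies.
relative = [f / n for f in frequencies]

# Real class limits sit half a unit outside the stated limits.
real_limits = [lo - 0.5 for lo, _ in stated_limits] + [stated_limits[-1][1] + 0.5]

# Cumulative frequencies at each real class limit.
more_than = [sum(x > limit for x in rentals) for limit in real_limits]
less_than = [sum(x < limit for x in rentals) for limit in real_limits]

print("approximate class width:", approx_width)
for (lo, hi), f, r in zip(stated_limits, frequencies, relative):
    print(f"{lo}-{hi}  f={f}  relative={r:.2f}")
for limit, more, less in zip(real_limits, more_than, less_than):
    print(f"{limit}: more-than={more}  less-than={less}")
```

Because every outcome is a whole number, no outcome lands exactly on a real class limit here, so the strict comparisons do not need a tie-breaking rule.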
   H. Graphing frequency distributions
      1. A histogram is a vertical bar chart depicting a frequency distribution. The x-axis is for the variable being measured and the y-axis is for the frequency.
      2. A frequency polygon (a many-sided figure) is a line graph depicting a frequency distribution.
         a. Each frequency is depicted at the midpoint of the class it represents.
         b. The midpoint is the stated or real class limits added together and divided by two. Both yield the same answer.
            \( X = \frac{50 + 59}{2} = 54.5 \)
      3. A relative frequency polygon is similar to a frequency polygon except it has the relative frequency of each class on the y-axis.
      4. Cumulative frequency distributions (ogives) measure the accumulation of frequencies above and below each real class limit.
         a. A more-than cumulative frequency distribution begins with the number of frequencies that are above the real lower limit of the lowest class. The answer is equal to total frequency. It is located near the top of the y-axis above the lower real class limit. Each successive class limit is associated with a smaller and smaller number of frequencies being above the successively higher class limits. The final value on the y-axis will be zero because none of the outcomes can be higher than the upper limit of the upper class. Cumulative frequency distributions can also be constructed on a relative basis with the cumulative frequency percentage graphed on the y-axis.
         b. A less-than cumulative frequency distribution is the complement of the more-than frequency distribution. Its y-axis value at the origin is zero, and at the upper class limit, y will be equal to total frequency.
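The plotting points for a histogram and a frequency polygon follow directly from the class limits and frequencies. The sketch below is illustrative only: it prints a text histogram instead of drawing a chart, and the variable names are assumptions. It also checks that stated and real class limits give the same midpoint.

```python
# Stated class limits and frequencies from Linda's Video Showcase chart.
stated_limits = [(50, 59), (60, 69), (70, 79), (80, 89), (90, 99)]
frequencies = [2, 3, 5, 3, 2]

# Midpoint from stated limits: the two limits added together and divided by two.
midpoints = [(lo + hi) / 2 for lo, hi in stated_limits]

# Midpoint from real limits, which sit half a unit outside the stated limits.
real_midpoints = [((lo - 0.5) + (hi + 0.5)) / 2 for lo, hi in stated_limits]

# A frequency polygon plots each frequency at its class midpoint.
polygon_points = list(zip(midpoints, frequencies))

# A rough text histogram: one bar per class, one mark per outcome.
for (lo, hi), f in zip(stated_limits, frequencies):
    print(f"{lo}-{hi} | {'#' * f}")

print("polygon points:", polygon_points)
```

Dividing each frequency by the total before pairing it with its midpoint would give the points for a relative frequency polygon.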
[Figures for Linda's Video Showcase: Histogram, Frequency Polygon, Relative Frequency Polygon, More-than Ogive, and Less-than Ogive. The x-axis of each graph is Daily Tape Rentals; the y-axes show Frequency, Relative Percent, or cumulative frequency.]

Note the y-axis scale.
Practice Set
Summarizing Data

See pages PS 6 and PS 7 of Appendix I for complete solutions to this Practice Set.

I. Darin's Music Emporium
   A. Upon graduating from college, Darin Jones opened Darin's Music Emporium. The company sells music-related hardware and software. We will use descriptive statistics to analyze company sales data.
   B. Darin recently collected the following Walkman CD Recorder sales data.
      Units sold per day: 17 22 17
II
12
15 14
16 21 29 16
   1. Make an array and calculate the range of this data.
   2. Calculate an appropriate class width for this data.
   3. Make a 5-class frequency distribution using stated class limits for the first class of 5-9 sales units. Those using statistics software should try other class limits with their software and print the one with the most symmetrical distribution.

      Darin's Music Emporium Walkman Sales Data
      Stated Class Limits
9
   A. Draw or print a histogram.
      Note: The x-axis may be labeled with the lower stated or real class limits, the class midpoints, or each class range.
   B. Draw or print a frequency polygon.
   C. Draw or print a less-than cumulative relative frequency polygon (ogive) and a relative frequency polygon.
      Note: A less-than cumulative relative frequency distribution can be used to estimate the percentiles defined in a later chapter.
Quick Questions
Summarizing Data

I. Place the number of the appropriate formula or phrase next to the item it describes.

   A. Mutually-exclusive events
   B. Relative frequency
   C. Class midpoint
   D. Approximate class width
   E. All-inclusive events (collectively exhaustive)
   F. Ogive

   1. \( \frac{\text{range}}{\text{number of classes}} \)
   2. do not contain the same outcome
   3. \( \frac{X + X}{2} \)
   4. \( \frac{\text{class frequency}}{\text{total frequencies}} \)
   5. cumulative frequency distribution
   6. a place for every outcome

II. Complete the following using this data.

   Data: 38, 48, 27, 14, 31, 23, 46, 38, 54, 26, 44, 33, 17, 34, 6, 37

   A. Array
   B. Range
   C. Approximate class width
   D. Complete this chart. People using statistics software should print a frequency distribution, relative frequency distribution, and less-than cumulative frequency distribution.

      Stated Class   Real Class   Tally   Frequency   Relative    Cumulative Frequency
      Limits         Limits               (f)         Frequency   More-than   Less-than

   E. Histogram
   F. Frequency polygon
   G. Relative frequency polygon
   H. More-than cumulative frequency polygon
   I. Less-than cumulative frequency polygon
Chapter 3
Measuring Central Tendency of Ungrouped Data

I. Introduction
   A. Central tendency describes the middle of data. It represents a typical value.
   B. Measures of central tendency are called averages.
   C. The arithmetic mean is the most common average. It is used to measure grades, success in sports, business success, and many other interesting subjects.
   D. Population parameters are represented by Greek letters.
   E. Sample statistics are represented by Roman lowercase letters.

Don't forget to look ahead.
II. The mean
   A. The sample mean (\( \bar{x} \))
      1. Linda is interested in how many self-help videotapes she rented last year. If the number is substantial, she will stock a larger variety of tapes. To make an estimate, she counted last week's self-help tape rentals and recorded the following sample data. The data is a sample because she only included part of last year's data.
      2. Daily self-help tape rentals were: 3, 7, 7, 4, 1, 8, 5.

         \( \bar{x} = \frac{\sum x}{n} \) where

         \( \bar{x} \), read "x bar," is the sample mean.
         \( x \) is the variable being measured.
         \( \sum \) is the Greek capital letter sigma. It is the symbol for addition.
         \( n \) is the sample size.

   B. The population mean (\( \mu \))
      1. Had Linda used all of last year's data, this population mean formula would have been used.

         \( \mu = \frac{\sum x}{N} \)

      2. \( \mu \) is the Greek lowercase letter for m and it is read mu.
      3. \( N \) is the population size.
   C. A weighted mean (\( \bar{x}_w \))
      1. When a data set has a number of duplicate values, a weighted mean is often calculated.
      2. Each variable occurring more than once is assigned a variable name consisting of capital X with a subscript and a weight (W) with a similar subscript.
      3. Linda's Video Showcase rents tapes for $2, $3, and $4. The weighted mean of receipts per tape for a day of 36, 18, and 6 respective rentals is calculated as follows:

         \[ \bar{x}_w = \frac{W_1 X_1 + W_2 X_2 + W_3 X_3}{W_1 + W_2 + W_3} = \frac{(36)(2) + (18)(3) + (6)(4)}{36 + 18 + 6} = \frac{72 + 54 + 24}{60} = \frac{150}{60} = 2.50 \]

         Note: \( W_1 \) refers to how often \( X_1 \) happens.
   D. The sum of the deviations around a mean equals zero.
      1. \( \sum (x - \mu) = 0 \)
      2. The mean of 1, 3, and 8 is 4.
      3. The sum of the deviations around the mean would be calculated as follows:

         \( \sum (x - \mu) = (1 - 4) + (3 - 4) + (8 - 4) = (-3) + (-1) + 4 = 0 \)

   E. The primary disadvantage of using the mean as a measure of central tendency concerns it being severely affected by a few values at either extreme. Using the data at the top of this page as an example, the mean may be small because a big snowstorm resulted in a day with only 1 rental.
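The three calculations in this section (sample mean, weighted mean, and sum of deviations) can be checked with a short script. This is a sketch using the numbers given in the outline; the variable names are illustrative, and the weighted mean is written as the sum of weight times value divided by the sum of the weights.

```python
# Sample mean of last week's daily self-help tape rentals.
rentals = [3, 7, 7, 4, 1, 8, 5]
sample_mean = sum(rentals) / len(rentals)

# Weighted mean of receipts per tape: prices weighted by how often each occurred.
prices = [2, 3, 4]      # rental price of each kind of tape
weights = [36, 18, 6]   # number of rentals at each price
weighted_mean = sum(w * x for w, x in zip(weights, prices)) / sum(weights)

# Deviations around a mean always sum to zero.
values = [1, 3, 8]
mu = sum(values) / len(values)
deviations = [x - mu for x in values]

print("sample mean:", sample_mean)
print("weighted mean:", weighted_mean)
print("deviations:", deviations, "sum:", sum(deviations))
```

Dropping the snowstorm day (the value 1) from the rentals and recomputing shows how strongly a single extreme value pulls the mean.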
III. The median
   A. The median is the middle number of data arranged into an array.
   B. The median as a measure of central tendency
      1. The median may be thought of as the geometric middle while the mean is the arithmetic middle.
      2. The geometric nature of the median results in it not being influenced by a few large numbers at either extreme.
   C. Determining the median
      1. Arrange the data into an array.
      2. Determine the median's position using this expression: \( \frac{n + 1}{2} \)
      3. Count this number of spaces from either extreme to find the median. An even number for n will result in the location being halfway between two numbers. Add the numbers and divide by 2 to determine the median.
   D. Example
      1. Linda Smith wants to calculate last week's median number of self-help rentals.
      2. Daily self-help rentals from page 10 were 3, 7, 7, 4, 1, 8, and 5.

         Array: 1  3  4  5  7  7  8

         \( \frac{n + 1}{2} = \frac{7 + 1}{2} = 4 \rightarrow 5 \)

         The arrow means go to the array. Counting from either direction, the fourth number is 5.
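The counting procedure for the median can be sketched as a small function. The position expression used here, (n + 1) / 2, is an assumption: it is the standard expression and it agrees with the example, where the fourth of seven numbers is the median. The function name is illustrative.

```python
from statistics import median


def median_by_position(data):
    """Find the median by counting to position (n + 1) / 2 in the array."""
    array = sorted(data)              # step 1: arrange the data into an array
    n = len(array)
    position = (n + 1) / 2            # step 2: 1-based position of the median
    lower = array[int(position) - 1]  # number at or just below that position
    if position.is_integer():
        return lower                  # odd n: the position lands on one number
    upper = array[int(position)]      # even n: position is halfway between two
    return (lower + upper) / 2        # add the numbers and divide by 2


rentals = [3, 7, 7, 4, 1, 8, 5]
print("median of rentals:", median_by_position(rentals))
print("median of an even-sized set:", median_by_position([1, 3, 4, 5]))
```

Python's standard library `statistics.median` gives the same results and is the more practical choice outside of a teaching example.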
IV. The mode
   A. The mode is the value occurring most often.
   B. It was 7 for self-help tape rentals.
   C. Some data sets have no modes while others have two (bimodal) or more (multimodal) modes.
   D. For many data sets, the mode is not a good representation of the data's middle value. As a result, it is the least used measure of central tendency. However, knowing the value that occurred most often is often of interest.

V. Measures of position
Th Thes ese e meas measure ures s loca locate te in inte tere rest stin ing g poin points ts al alon ong g da data ta ar arra rang nged ed into into an ar arra ray. y. Th The e me medi dian an is an exam exampl ple. e. Quartiles Quart iles sepa separa rate te da data ta in into to quar quarte ters rs.. 1 01 sepa separa rate tes s th the e fi firs rstt and and seco second nd qu quar arte ters rs.. 2 02 th the e me medi dian an,, sepa separa rate tes s th the e seco second nd and and th thir ird d quar quarte ters rs.. separa rate tes s th the e th thir ird d and and fo four urth th qu quar arte ters rs.. 3 03 sepa Quartile
Finding th e quartiles fo r th e
Location
Analysis
above abo ve data data
01
+ .5
+. 5 = 1.75 + .5 = 2.25 ~
02
+. 5
+ .5 = 3.5 +. 5 = 4 ~
03 D
~4
3.25
Note: 3. 3.25 25 is .25 of th the e dis distan tance ce be betw twee een n 3 the seco second nd numb number er,, and and 4, th the e th thir ird d num number. ber.
5
This his data is not not sy sym mmet etri ric cal. al. It is a coinc nciidenc nce e th that at the mean ean and and media edian n are are equal.
2; + .5 = 5.25 + .5 = 5 75 ~
Note: 7 is .75 of th the e dist distan ance ce betw betwee een n 7 and 7
7
Interquartile range interq rqua uart rtil ile e rang range e is th the e di diff ffer eren ence ce betw betwee een n Q3 and Q 1 The inte 2
1 0 3 0 1 = 7 - 3 . 2 5 = 3 . 75
1
E. Deciles
   Deciles separate data into tenths. The 3rd decile would be calculated as follows:
   Location: xn/10 + .5 = 3(7)/10 + .5 = 2.6, so the 3rd decile is 3.6
F. Percentiles
   1. Percentiles separate data into 100 parts.
   2. Let x equal the percentile of interest.
   3. The location of the x percentile would be stated as follows: xn/100 + .5
   4. The 90th percentile of daily self-help rentals would be:
      90(7)/100 + .5 = 630/100 + .5 = 6.8, so the 90th percentile is 7.8
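The quartile, decile, and percentile locations all follow one pattern, xn/parts + .5, followed by interpolation between neighboring values. The sketch below illustrates that pattern for the self-help rental data. The function names `location` and `value_at_location` are hypothetical, and the sketch does not handle locations that fall beyond the last value in the array.

```python
def location(x, parts, n):
    """Location of the x-th of `parts` divisions of n values: x*n/parts + .5."""
    return x * n / parts + 0.5


def value_at_location(data, loc):
    """Interpolate the value at a 1-based, possibly fractional, array location."""
    array = sorted(data)
    whole = int(loc)              # e.g. 2 for a location of 2.25
    fraction = loc - whole        # e.g. .25 of the distance to the next value
    lower = array[whole - 1]
    if fraction == 0:
        return lower
    upper = array[whole]
    return lower + fraction * (upper - lower)


rentals = [3, 7, 7, 4, 1, 8, 5]
n = len(rentals)
q1 = value_at_location(rentals, location(1, 4, n))       # location 2.25
q3 = value_at_location(rentals, location(3, 4, n))       # location 5.75
d3 = value_at_location(rentals, location(3, 10, n))      # location 2.6
p90 = value_at_location(rentals, location(90, 100, n))   # location 6.8
print(round(q1, 2), round(q3, 2), round(q3 - q1, 2), round(d3, 2), round(p90, 2))
```

Rounded to two places, the results agree with the worked answers: 3.25, 7, an interquartile range of 3.75, 3.6, and 7.8.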
Note: Computer software may use different formulas to locate the position of data. As a result, their answers for measures of position may differ slightly from these answers.
Practice Set
Measuring Central Tendency of Ungrouped Data

I. Darin Jones wants to know more about the sales of Walkman CD recorders/players described on page 6. Calculate the sample mean using this Walkman sales data from the last Practice Set. State the formula for the population mean.
   Array of daily Walkman sales: 8, 12, 14, 15, 16, 16, 17, 17, 21, 22, 29
   A. Sample mean _____
   B. Population mean formula _____

Having trouble with these problems? Please look back 2 pages to about the same page location in Quick Notes to see how Linda solved a similar problem.

II. Darin sells three different Walkman CD recorders: one for $149, one for $159, and a third for $169. Of the 187 machines sold during this eleven-day period, 43 were the least expensive, 90 were moderately priced, and 54 were the expensive model. Calculate the weighted mean sales price for these machines.
III. Using the data from question I, prove that the sum of the deviations from a mean is _____

IV. The median number of Walkman units sold is _____

V. The mode for this data is _____

VI. This data can be described as _____

VII. Calculate the following measures of position. Those using computer software should use a less-than cumulative relative frequency distribution to answer these questions.
0
D
6th decile
_
C
_
95th percentile
Interquartile ra r ange
_
_
Quick Questions
Measuring Central Tendency of Ungrouped Data

I. Write the number of the appropriate formula next to the item it describes.
A
Sample mean
B
Population mean
_
5 _ 2
C
Location of the median
II
Location of
E
Weighted m me e an
F
Location of 0 3
_
X
n
_
_
List and calculate the 3 measures of central tendency.
Data: 5, 7, 3, 8, 6, 10, 9, 8
B
C
III
What
is
5
W x X x
_ 3
D
X
5
4
the the primar primary y disad disadvant vantage age of the mean as a meas easure of cent central ral tendency? tendency?
14
6
5
IV
Usin Using g th this is data data prov prove e that that th the e sum sum Data:
3
7
of
the deviations around an arithmetic mean is
_
5
Calcul Calc ulat ate e a we weig ighte hted d me mean an of parki parking ng ticke tickets ts cos costin ting g 25 35 and and 45 with with cor corres respon pondin ding g weig we ight hts s of 1 2 and 10 resp spe ective velly. Wh Why y mu must st the an answ swe er be 35?
VI. Ca Calc lcul ulat ate e the the foll follow owing ing for th the e que questi stion on II data. C
D
2nd 2nd decil decile e
E
85th 85t h per percent centile ile
15
Interqua Inte rquartil rtile e rang range e
Chapter
Measuring Dispersion of Ungrouped Data

I. Introduction
A. Dispersion refers to the spread of data, its variability.
B. Dispersion is important because it determines the reliability of central tendency measurements.
C. Comparing the dispersion of different data sets may be revealing. Two students might have the same grade point average, with one having all B's and the other having half A's and half C's.
D. This page will explore population parameters. Where sample statistic formulas differ, calculations will be done on the next page.
E. The sample data for self-help rentals presented in chapter 3 will be used here as population data: 3, 7, 7, 4, 1, 8, 5 and μ = 5
II. Range
A. The range is the highest value (H) minus the lowest value (L).
B. Range = H - L = 8 - 1 = 7
C. While easy to calculate, the range is severely affected by unusual circumstances. In this case, a snowstorm caused Linda to close early, limiting that day's rentals to one unit.
III. Population average deviation (AD)
A. The average deviation is the mean of the absolute values of the deviations from the mean.
B. AD = Σ|x - μ| / N = 14/7 = 2
   Note: N is population size.
C. Self-Help Rentals

   x        μ     x - μ    |x - μ|
   3        5      -2        2
   7        5       2        2
   7        5       2        2
   4        5      -1        1
   1        5      -4        4
   8        5       3        3
   5        5       0        0
   Totals           0       14

Using the absolute value of the deviations is necessary because the sum of the deviations is zero.
The average deviation is a quick way to measure dispersion.
The soon-to-be-explained variance and standard deviation are more valuable measures.
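The average deviation table can be checked with a few lines of Python. This is a minimal sketch using the self-help rental data as population data; the variable names are illustrative only.

```python
rentals = [3, 7, 7, 4, 1, 8, 5]
N = len(rentals)                             # population size
mu = sum(rentals) / N                        # population mean, 5.0

deviations = [x - mu for x in rentals]       # signed deviations from the mean
average_deviation = sum(abs(d) for d in deviations) / N

# The signed deviations cancel out, which is why absolute values are needed.
print(sum(deviations), average_deviation)    # 0.0 2.0
```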
IV. Population variance (σ²) and standard deviation (σ)
A. The variance solves the problem of the sum of the variations from a mean being zero by squaring the differences.
B. The variance is the average of the squared deviations of the data from their mean.
C. The resulting measure is similar to the average deviation, although it is larger because the variation was squared.
D. This problem is solved with the standard deviation, which is the square root of the variance.
E. The population variance
   σ² = Σ(x - μ)² / N = 38/7 = 5.4
   Alternative formula: σ² = Σx²/N - μ² = 213/7 - 5² = 30.4 - 25 = 5.4
F. Population standard deviation
   σ = √σ² = √5.4 = 2.3

Self-Help Rentals

   x        μ     x - μ    (x - μ)²    x²
   3        5      -2         4         9
   7        5       2         4        49
   7        5       2         4        49
   4        5      -1         1        16
   1        5      -4        16         1
   8        5       3         9        64
   5        5       0         0        25
   Totals           0        38       213
Sample standard deviation (alternative formula)
   s = √[(Σx² - (Σx)²/n) / (n - 1)] = √[(213 - 175) / (7 - 1)] = √(38/6) = 2.5
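The variance and standard deviation calculations can be sketched in Python as follows. The sketch treats the rental data first as a population (dividing by N) and then as a sample (dividing by n - 1); the variable names are illustrative, and the alternative formula is included only to show that both routes agree.

```python
import math

rentals = [3, 7, 7, 4, 1, 8, 5]
n = len(rentals)
mean = sum(rentals) / n

sum_sq_dev = sum((x - mean) ** 2 for x in rentals)    # sum of squared deviations, 38

pop_var = sum_sq_dev / n                              # definitional formula
pop_var_alt = sum(x * x for x in rentals) / n - mean ** 2   # alternative formula
pop_sd = math.sqrt(pop_var)

sample_var = sum_sq_dev / (n - 1)                     # sample version divides by n - 1
sample_sd = math.sqrt(sample_var)

print(round(pop_var, 2), round(pop_var_alt, 2), round(pop_sd, 2), round(sample_sd, 2))
# 5.43 5.43 2.33 2.52
```

The notes round these results to 5.4, 2.3, and 2.5.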
VI. Using the standard deviation as a measure of variability
A. The empirical rule is used for normal, bell-shaped data.
   1. For symmetrical or bell-shaped data, 68.26% of the items will be within one standard deviation of the mean, 95.44% will be within two standard deviations of the mean, and 99.74% will be within three standard deviations of the mean.
   2. If μ = 500 and σ = 100, then 95.44% of the population will be between 300 and 700.
      500 ± 2(100)
      500 ± 200
      300 ↔ 700
   3. Students would like a small standard deviation around a test mean of 95 so everyone receives a grade of A.
B. Chebyshev's rule is used for nonsymmetrical distributions.
   1. Russian mathematician P. Chebyshev developed a method to estimate the minimum proportion of items that are within a designated number of standard deviations from the mean for nonsymmetrical distributions with means greater than 1. As with the empirical rule, the estimate works for both samples and populations.
   2. The proportion of items within K standard deviations of the mean is at least 1 minus 1 over K squared, provided K is a constant greater than 1: 1 - 1/K²
   3. The proportion of the data falling within 2 standard deviations of a mean is calculated as follows:
      1 - 1/K² = 1 - 1/(2)² = 1 - 1/4 = 3/4 or 75%
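Both rules can be expressed as short Python functions. This is a minimal sketch; the function names `empirical_interval` and `chebyshev_minimum` are hypothetical and chosen only for this illustration.

```python
def empirical_interval(mean, sd, k):
    """Interval that lies k standard deviations on either side of the mean."""
    return mean - k * sd, mean + k * sd


def chebyshev_minimum(k):
    """Minimum proportion of items within k standard deviations: 1 - 1/k^2."""
    if k <= 1:
        raise ValueError("K must be greater than 1")
    return 1 - 1 / k ** 2


print(empirical_interval(500, 100, 2))   # (300, 700)
print(chebyshev_minimum(2))              # 0.75
```

For bell-shaped data the empirical rule says about 95.44% of items fall inside the first interval, while Chebyshev's rule only guarantees at least 75% for any shape of distribution.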
VII. Coefficient of variation (C.V.)
A. Comparing the variability of data sets of differing magnitudes is accomplished using the coefficient of variation.
B. Department A, with 40 million in sales, will have a much larger standard deviation than Department B, which has only 3 million in sales. Suppose Department A's σ was 4 million and Department B's σ was 400,000.
C. The coefficient of variation, which expresses the standard deviation as a percent of the mean, reveals which department has the largest relative sales variability.
   C.V. = (σ/μ)(100)        C.V. = (s/x̄)(100) for sample data

   For Department A: C.V. = (σ/μ)(100) = (4,000,000 / 40,000,000)(100) = 10%
   For Department B: C.V. = (σ/μ)(100) = (400,000 / 3,000,000)(100) = 13.3%

Note: Department A had less relative sales dollar variability even though it had a larger standard deviation.
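The department comparison can be reproduced with a small Python function. This is a sketch only; `coefficient_of_variation` is a hypothetical helper name.

```python
def coefficient_of_variation(sd, mean):
    """Standard deviation expressed as a percent of the mean."""
    return sd / mean * 100


cv_a = coefficient_of_variation(4_000_000, 40_000_000)   # Department A
cv_b = coefficient_of_variation(400_000, 3_000_000)      # Department B
print(round(cv_a, 1), round(cv_b, 1))                    # 10.0 13.3
```

Department B's larger percentage shows greater relative variability despite its smaller standard deviation.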
Practice Set
Measuring Dispersion of Ungrouped Data

I. Darin is concerned about Walkman sales variability. First calculate the range for Walkman sales and then the average deviation, the standard deviation, and the variance.
   Array of daily Walkman sales: 8, 12, 14, 15, 16, 16, 17, 17, 21, 22, 29
   Sample mean: 17
   A. Range _____
   B. Sample average deviation _____
   C. Sample variance _____
   D. Sample standard deviation _____

II. Label this graph depicting the empirical rule.
   [Graph of the empirical rule, marked 95.44 and 99.74]
III. Last year's mean weekly Walkman sales were 16 and the standard deviation was 4. Use the empirical rule to determine a range for Walkman sales for one, two, and three sample standard deviations from the mean.
   A. One Standard Deviation _____ to _____
   B. Two Standard Deviations _____ to _____
   C. Three Standard Deviations _____ to _____

IV. Use Chebyshev's rule to determine a range for Walkman sales being within two sample standard deviations of the mean.

V. Darin read in a trade publication that the average Walkman sales and standard deviation for a store his size and type are 8 and 3 respectively. Using the sample data from page 18, are Darin's Walkman sales more or less variable than those of his industry? Use the standard deviation calculated in problem I.
Quick Questions
Measuring Dispersion of Ungrouped Data

I. Place the number of the appropriate formula next to the parameter or statistic it describes.
   A. Population average deviation _____
   B. Population variance _____
   C. Population standard deviation _____
   D. Alternative population variance _____
   E. Alternative population standard deviation _____
   F. Chebyshev's rule _____
   G. Sample variance _____
   H. Sample standard deviation _____
   I. Alternative sample variance _____
   J. Alternative sample standard deviation _____

II.
Calculate the following statistics using this sample data.
Data: 5, 7, 3, 8, 6, 10, 9, 8
x A
6
Varian Var iance ce use alternat alternative ive for formul mula) a)
2
8
P(>50 | >5) = P(>50 and >5) / P(>5)
            = [P(>50) x P(>5 | >50)] / [P(>50) x P(>5 | >50) + P(≤50) x P(>5 | ≤50)]
            = (.5 x .8) / [(.5 x .8) + (.5 x .2)] = .4/.5 = .80 or 80%

IV. Joint and conditional probability may easily be read from a contingency table converted to decimals.

Monthly Advertising and Sales

                                  Sales
Advertising                       Less than or equal to 50,000   Greater than 50,000   Totals
Less than or equal to 5,000       0.40                           0.10                  0.50
Greater than 5,000                0.10                           0.40                  0.50
Totals                            0.50                           0.50                  1.00

The page 46 answer to sales over 50,000 and advertising over 5,000 of 40% can be read directly from this chart.
The answer to the conditional statement above can be read off the chart as .4 divided by .5 or 80%.
Advertising and sales are dependent, so the special rule for multiplication does not apply. Note that joint probability is not the product of marginal probability.
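Reading joint, marginal, and conditional probabilities from the chart can be sketched in Python. The two 0.40 cells below are an assumption: they are inferred from the stated 40% joint probability and the row and column totals of 0.50. The dictionary keys and the `marginal` helper are hypothetical names.

```python
# Joint probabilities from the Monthly Advertising and Sales chart.
joint = {
    ("adv<=5000", "sales<=50000"): 0.40,   # inferred from the 0.50 totals
    ("adv<=5000", "sales>50000"): 0.10,
    ("adv>5000", "sales<=50000"): 0.10,
    ("adv>5000", "sales>50000"): 0.40,     # the 40% joint probability
}


def marginal(label):
    """Sum every joint probability whose row or column matches the label."""
    return sum(p for key, p in joint.items() if label in key)


p_joint = joint[("adv>5000", "sales>50000")]
p_adv = marginal("adv>5000")               # 0.50
p_sales = marginal("sales>50000")          # 0.50

conditional = p_joint / p_adv              # P(sales > 50,000 | advertising > 5,000)
independent = abs(p_joint - p_adv * p_sales) < 1e-9

print(round(conditional, 2), independent)  # 0.8 False
```

Because the joint probability (.40) differs from the product of the marginals (.25), the events are dependent and the special rule of multiplication does not apply.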
V. Counting relevant outcomes
A. As problems become more complex, counting total outcomes and outcomes of interest will also be more complex.
B. The counting rule: If one event can happen M ways and a second event can happen N ways, then the two events can happen in sequence (M)(N) ways. Linda wants to visit her 3 competitors, each of whom has 2 stores. There are (3)(2) = 6 stores she can visit. The total count for three events would be (M)(N)(O).
C. The factorial rule involves arranging N available items.
   1. Linda can visit the 6 stores of her competitors using 6! alternative routes.
   2. N! = 6! = 6 x 5 x 4 x 3 x 2 x 1 = 720 alternative routes
   3. When she begins, she has 6 alternatives. Having been to a store, she then has 5 alternatives, then 4, etc.
D. The permutation rule involves arranging R of N available items.
   1. Order is important, as a, b, c and c, a, b are different and each is counted as an outcome.
   2. Here is how many ways Linda could arrange 4 of 7 posters as a window display. N = 7 and R = 4
      NPR = N! / (N - R)! = Totality / What is not of interest
      7P4 = 7! / (7 - 4)! = (7 x 6 x 5 x 4 x 3 x 2 x 1) / (3 x 2 x 1) = 7 x 6 x 5 x 4 = 840
E. The combination rule involves choosing (not arranging) R of N available items. Because items are not being arranged, order is not important. Items abc and cba are the same and are not counted twice.
   1. Just hanging (not arranging) 4 of 7 posters has fewer possibilities because order doesn't count.
      NCR = N! / [(N - R)! R!]
      7C4 = 7! / [(7 - 4)! 4!] = (7 x 6 x 5 x 4 x 3 x 2 x 1) / (3 x 2 x 1 x 4 x 3 x 2 x 1) = (7 x 6 x 5) / (3 x 2 x 1) = 35
   2. The use of R! in the denominator eliminates the multiple counting of items of interest.
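The four counting rules map directly onto functions in Python's standard `math` module (`math.perm` and `math.comb` require Python 3.8 or later). The sketch below simply re-runs Linda's examples.

```python
import math

# Counting rule: 3 competitors with 2 stores each.
stores = 3 * 2
print(stores)                    # 6

# Factorial rule: routes through all 6 stores.
print(math.factorial(6))         # 720

# Permutation rule: arrange 4 of 7 posters (order counts).
print(math.perm(7, 4))           # 840

# Combination rule: hang 4 of 7 posters (order does not count).
print(math.comb(7, 4))           # 35

# Dividing the permutations by R! removes the multiple counting.
print(math.perm(7, 4) // math.factorial(4))   # 35
```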
Practice Set
Probability Part
Multiplication Rules

I. Below is the data Darin Jones collected concerning sales to customers of different ages (see page 42). Convert Table 1 to decimals and place the information into Table 2.

Table 1: Analysis of Sales by Age of Customer

                Customer Age
Sale            Less than or equal to 20     Over 20     Totals
No              16                           8           24
Yes             24                           12          36
Totals          40                           20          60

Table 2: Decimals

                Customer Age
Sale            Less than or equal to 20     Over 20     Totals
No              _____                        _____       _____
Yes             _____                        _____       _____
Totals          _____                        _____       _____
II. Use a formula to calculate the probability of these events and check your answers using Table 2.
   A. The probability of a customer being over 20 years old is _____
   B. The probability of a customer being over 20 years old and not making a sale is _____
   C. The probability of a customer being less than or equal to 20 years old and over 20 years old is _____
   D. Was the special rule of multiplication applicable to question B? Why or why not? Could the special rule of multiplication be used by Linda with the page 46 advertising data? Why or why not?

III. Use Bayes' theorem to calculate the probability of making a sale given a customer is less than or equal to 20 years of age.

IV. Recalculate your answer to question III using Table 2 on page 48.

V. Use Linda's page 46 advertising data to calculate the probability of having monthly advertising over 5,000 and monthly sales over 50,000.

VI. Answer these questions about 5 posters Darin has to advertise a new CD recorder/player. Be sure to show all formulas.
   A. How many ways can he arrange these posters in a horizontal line across a wall?
   B. How many ways can he arrange only 3 posters? Arrange implies that order counts: AB is not the same as BA, and both should be counted.
   C. How many ways can he just hang them? (order doesn't count)
Quick Questions
Probability Part
Multiplication Rules

I. Place the letter of the appropriate definition or formula next to the concept it defines.

    1. General rule for multiplication         A. P(A and B) = P(A) x P(B)
    2. Independent events                      B. Marginal probability
    3. Special rule for multiplication         C. P(A and B)
    4. P(A)                                    D. Event A does not affect the probability of event B
    5. Counting rule                           E. P(A and B) = P(A) x P(B|A)
    6. Combination rule                        F. P(A) x P(B|A) + P(not A) x P(B|not A)
    7. Joint probability                       G. (M)(N)
    8. Denominator of Bayes' theorem           H. N items can be arranged N! ways
    9. Factorial rule                          I. N! / (N - R)!
   10. Permutation rule                        J. N! / [(N - R)! R!]

Note that G represents how two sets of items can be ordered and H, I, and J represent how one set of items can be ordered.

II.
Comp Co mple lete te th this is ch char artt co conc ncer erni ning ng the the nu numb mber er of hou ours rs stud studen ents ts st stu udied died for for a test test and thei theirr ex exam am gra grades. des. Hours Hou rs studyi studying ng Test
score
Less than 4
Greater than or e eq qual to to 4
Le Less ss tha than 85 Greater than Greater than or equal to 85
Total 10
10
Totals
III. Use a formula and the data in question II to answer the following questions.
   A. The probability of earning a grade less than 85.
   B. The probability of someone studying 4 or more hours and earning a grade of 85 or higher.
   C. Was the special rule of multiplication applicable to question B? Why or why not?
   D. Use Bayes' theorem to calculate the probability of someone scoring 85 or higher if they studied 4 or more hours.
   E. Prove your answer to question D using the chart on page 50.

IV. How many stores will a salesperson visit if they must visit 3 locations in each of 4 cities?

V. An advertising manager has 6 advertisements of equal size to place horizontally across a magazine page.
   A. How many ways can the 6 ads be arranged?
   B. How many ways can 4 of the 6 ads be arranged if order counts?
   C. How many ways can 4 of the 6 ads be arranged if order does not count and a, b, c, d and d, c, b, a are considered the same arrangement?
Chapter 9
Discrete Probability Distributions

I. Understanding probability distributions
A. A random variable measures a numerical event, the value of which is determined by chance.
B. The experimental outcomes described in chapter 8 are random variables. Examples include flipping a coin and customer buying habits based upon gender.
C. Random variables are either discrete or continuous.
   1. Discrete: Only finite values, such as the countable numbers, can exist on the x-axis. Examples include tire defects and the number correct on a true or false exam.
   2. Continuous: Measurement may assume any value associated with an uninterrupted scale. Examples include the exact weights of one-pound boxes of cookies and the average length of computer parts.
D. A probability distribution lists all of the probability values associated with a random variable (x).
E. Example: In chapter 3, Linda found that 36, 18, and 6 tapes were rented for 2, 3, and 4 respectively.
   1. The amount received is a discrete random variable with possible values (outcomes) of 2, 3, and 4.
   2. Below is the probability distribution associated with tape rental fees.

Discrete Probability Distribution

Rental Fees (x)   Number of Tapes Rented   Probability P(x)   x • P(x)   x²    x² • P(x)
2.00              36                       36/60 = .60        1.20       4     2.40
3.00              18                       18/60 = .30        0.90       9     2.70
4.00               6                        6/60 = .10        0.40       16    1.60
Totals            60                       1.0                2.50             6.70

Note: This distribution is similar to a frequency distribution with P(x) replacing f.

F. The mean and variance of a discrete probability distribution
   1. Random variable parameter calculations are similar to grouped data parameter calculations. However, division is not necessary for random variable calculations because the observations total 1.0 (100%).
   2. The mean of random variable x is called the expected value of x or E(x).
      E(x) = Σ[x • P(x)] = 2.50
   3. The variance of x is V(x).
      V(x) = Σ[x² • P(x)] - [E(x)]² = 6.70 - (2.50)² = 6.70 - 6.25 = .45
   Note: These formulas may be written using Greek letters, with μ for E(x) and σ² for V(x). See chart calculations.
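The expected value and variance of the rental fee distribution can be sketched in Python. The variable names are illustrative; the data are the 36, 18, and 6 tapes rented at fees of 2, 3, and 4.

```python
# Number of tapes rented at each rental fee.
rentals = {2.00: 36, 3.00: 18, 4.00: 6}
total = sum(rentals.values())                        # 60 tapes

# Convert counts to a probability distribution P(x).
dist = {fee: count / total for fee, count in rentals.items()}

expected = sum(x * p for x, p in dist.items())       # E(x) = sum of x * P(x)
variance = sum(x ** 2 * p for x, p in dist.items()) - expected ** 2   # V(x)

print(round(expected, 2), round(variance, 2))        # 2.5 0.45
```

No division by the number of observations is needed because the probabilities already total 1.0.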
II. The binomial probability distribution
A. Binomial experiments have the following characteristics.
   1. The experiment consists of a fixed number of trials. Two mutually-exclusive outcomes result from each trial.
   2. Defined as success and failure, each set of outcomes can be counted and represents an independent event.
   3. The probability of success and the probability of failure must be constant, with P(F) = 1 - P(S).
B. Binomial experiments include flipping a coin, counting product defects, and marketing response rates.
C. Determining the binomial distribution requires calculating

      P(x) = [n! / (x!(n - x)!)] pˣ qⁿ⁻ˣ

   where:
      n is the number of trials
      x is the number of successes
      p is the probability of success
      q, the probability of failure, is 1 - p

   1. The page 46 coin flipping experiment, solved with a contingency table and a decision tree, is a binomial experiment. The probability of having exactly one head with two tosses is calculated below.
   2. n = 2, x = 1 (head), p = .5, q = .5
      Note: 0! = 1, x⁰ = 1, and x¹ = x

      P(x) = [n! / (x!(n - x)!)] pˣ qⁿ⁻ˣ
      P(1) = [2! / (1!(2 - 1)!)] (.5¹)(.5²⁻¹) = [(2 x 1) / (1(1))] (.5)(.5) = 2 x .5 x .5 = .5
The Binomial Probability Distribution for n = 2 and p = .5
[Graph: P(x) plotted against Number of Heads (x)]

Number of Heads
x        P(x)
0        .25
1        .50
2        .25
Total    1.00
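The coin-toss distribution above can be reproduced in a few lines. This is a minimal sketch, not part of the original notes; the function name `binomial_pmf` is an illustrative choice, and the mean and variance are computed with the E(x) and V(x) formulas given earlier in this outline.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(x) = n!/(x!(n-x)!) * p^x * q^(n-x), where q = 1 - p."""
    q = 1 - p
    return comb(n, x) * p**x * q**(n - x)

n, p = 2, 0.5  # two tosses of a fair coin
distribution = {x: binomial_pmf(x, n, p) for x in range(n + 1)}

# E(x) = sum of x * P(x); V(x) = sum of x^2 * P(x) minus E(x)^2
expected_value = sum(x * px for x, px in distribution.items())
variance = sum(x**2 * px for x, px in distribution.items()) - expected_value**2

print(distribution)      # {0: 0.25, 1: 0.5, 2: 0.25}
print(expected_value)    # 1.0
print(variance)          # 0.5
```

The probabilities total 1.0, which is why no division is needed when computing E(x).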
D. Binomial tables
1. Extensive tables have been developed to solve binomial experiments. See Table 1 page ST 1.
2. Below is a table for a two trial (n = 2) experiment and some relevant probabilities.
3. Note the distribution for the page 46 coin problem is under the .5 column.
4. If the probability of a defective part is .05, then getting 2 out of 2 defects would be .0025 or .25%.

Probability of x successful outcomes given the following probability values (p) and trials (n)

x    0.05    0.10  0.20  0.30  0.40  0.50  0.60  0.70  0.80  0.90  0.95
0    0.9025  0.81  0.64  0.49  0.36  0.25  0.16  0.09  0.04  0.01  0.0025
1    0.0950  0.18  0.32  0.42  0.48  0.50  0.48  0.42  0.32  0.18  0.0950
2    0.0025  0.01  0.04  0.09  0.16  0.25  0.36  0.49  0.64  0.81  0.9025

Note: A binomial table has 2 defining characteristics, n and p.
Note: For this table, n = 2.

E. The shape of binomial distributions
1. Distributions are symmetrical when p = .5. High or low probabilities have highly skewed distributions.
2. When p ≠ .5, the distribution is skewed and a larger n will result in a more symmetrical distribution.
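Any column of a binomial table can be generated directly from the binomial formula. The sketch below is illustrative only (the helper name `binomial_column` is an assumption, not from the original notes); it rebuilds the n = 2 columns for p = .05, .50, and .95 and shows the skew for high or low p.

```python
from math import comb

def binomial_column(n, p):
    """Return [P(0), P(1), ..., P(n)] rounded to four decimals."""
    return [round(comb(n, x) * p**x * (1 - p)**(n - x), 4)
            for x in range(n + 1)]

for p in (0.05, 0.50, 0.95):
    print(p, binomial_column(2, p))
# 0.05 [0.9025, 0.095, 0.0025]
# 0.5 [0.25, 0.5, 0.25]
# 0.95 [0.0025, 0.095, 0.9025]
```

Only the p = .50 column is symmetrical; the other two are mirror images of each other.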
III. The Poisson distribution
A. A Poisson distribution is similar to a binomial distribution except p must be small. A Poisson distribution is defined by only 1 characteristic, its mean. The distribution is highly skewed to the right.
B. Events related to time, such as customers arriving per 5-minute period, often follow a Poisson distribution.
C. The mean is needed when using a Poisson distribution.
\( \mu = E(x) = \sum [x \cdot P(x)] \) (see page 52)
D. A Poisson distribution may be determined with a formula or looked up in a table.
1. Calls per 15-minute period to Linda's repair facility follow a Poisson distribution with \( \mu = 1.0 \). What is the probability of exactly three service calls being received in a randomly selected 15-minute period?
\( P(x) = \frac{\mu^x e^{-\mu}}{x!} \)
\( P(3) = \frac{(1)^3 (2.7183)^{-1}}{3!} = \frac{(1)(0.3679)}{6} = 0.0613 \)
See table below.
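The service-call calculation can be checked with a short script. This is a minimal sketch under the same inputs (mean of 1.0 calls per 15-minute period, x = 3); the function name `poisson_pmf` is an illustrative choice.

```python
from math import exp, factorial

def poisson_pmf(x, mean):
    """P(x) = mean^x * e^(-mean) / x!"""
    return mean**x * exp(-mean) / factorial(x)

p_three_calls = poisson_pmf(3, 1.0)
print(round(p_three_calls, 4))  # 0.0613
```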
IV. The Poisson approximation of the binomial probability distribution
A. A Poisson distribution is often used to approximate a binomial distribution for problems such as errors on a typed page, circuit board defects, and customers bouncing checks at Linda's Video Showcase.
B. This is done to save the time and money necessary to solve extensive binomial experiments.
C. These two distributions have similar skewness provided the number of trials is large (n ≥ 30) and the probability of occurrence (p) is small (either np or nq < 5).
D. The mean for a Poisson approximation of a binomial is \( \mu = np \) (n = trials and p is the probability of an event).
E. Recent observations revealed that 4 of 40 items purchased by customers are returned. This is a binomial problem with a sample mean of P(s) = 4/40 = .10. Determining the entire distribution by using the binomial formula 40 times would be a tremendous task.
1. Using the Poisson approximation with a sample mean of .10 for \( \mu \) is much easier. Its use is appropriate as n ≥ 30 and np < 5 (40 x .1 = 4).
The above formula yields a probability of 0 returns equal to .904837.
2. Using a Poisson distribution table to solve this problem only requires locating the appropriate outcome (value of x) under the appropriate mean.
\( \mu = .1 \) and x = 0 gives .9048 (see Table 2 page ST 2)

Probability of x outcomes given the following population means

x    .10     .20     .30     .40     .50     .60     .70     .80     .90     1.00
0    0.9048  0.8187  0.7408  0.6703  0.6065  0.5488  0.4966  0.4493  0.4066  0.3679
1    0.0905  0.1637  0.2222  0.2681  0.3033  0.3293  0.3476  0.3595  0.3659  0.3679
2    0.0045  0.0164  0.0333  0.0536  0.0758  0.0988  0.1217  0.1438  0.1647  0.1839
3    0.0002  0.0011  0.0033  0.0072  0.0126  0.0198  0.0284  0.0383  0.0494  0.0613
4            0.0001  0.0003  0.0007  0.0016  0.0030  0.0050  0.0077  0.0111  0.0153
5                            0.0001  0.0002  0.0004  0.0007  0.0012  0.0020  0.0031
6                                                    0.0001  0.0002  0.0003  0.0005
7                                                                            0.0001
Practice Set: Discrete Probability Distributions

I. Darin sells three different Walkman CD recorders; one for $149, one for $159, and a third for $169. Of the 187 machines sold during a recent period, 43 were the least expensive, 90 were moderately priced, and 54 were the expensive model.
A. Calculate the expected price of Walkman sales.
B. Compare this answer to the page 12 weighted mean sales value of Walkman sales.
C. In theory, what is the difference between a weighted mean of variable x and the expected value of x?

II. When waiting on a customer, Darin's salespeople make a sale 60% of the time (see page 42). Use the binomial formula or your statistics software to calculate the probability of making exactly 3 sales to 5 customers.

III. Using the appropriate table or your statistics software, complete the binomial distribution described by question II.

Special Note
Variables that may follow a binomial probability distribution
A. Probability of an employee contributing to the company pension plan
B. Probability of collecting an overdue accounts receivable
C. Probability of receiving a positive response to a marketing campaign
D. Probability of a part being defective
Variables that may follow a Poisson probability distribution
A. Number of defects on a 300 foot roll of aluminum
B. Errors on a typed page
C. Customers arriving at a drive-up window within a 5-minute period
D. Number of rare disease cases per 1,000,000 people

IV. Using the answer to question III or statistics software, answer the following questions.
A. P(x = 4) is
B. P(x > 2) is
C.
V.
VI.
P(x … 12), then 12.5 would have been the value of x. Calculate P(x … 12) using the normal approximation of the binomial.
B.
1. Find the probability in the body of the z table.
2. Look to the margins of the table to find z.
3. Find the range for x using this formula. \( \mu \pm z\sigma \)
4. Continue until the problem is solved.
= 20.88
= 29.12
Practice Set: Continuous Normal Probability Distributions

I. Sales commissions paid by Darin's Music Emporium are normally distributed with a mean of $25,000 and a standard deviation of $5,000. Solve the following being sure to draw a graph of each distribution.
A. P($15,000 ≤ x < $25,000)
Note: This question is read "What is the probability that $15,000 is less than or equal to x which is less than $25,000?"
B. P($20,000 ≤ x < $30,000)
C. P(x … $37,550)
D. P($27,500 ≤ x < $32,500)

II. The number of customers returning merchandise to Darin's Music Emporium is normally distributed with a mean of 6.3 per week and a standard deviation of 1.5. Given the following probabilities, calculate the appropriate value or values for x.
A. Half of the time, returns will be above ___
B. Ninety percent of the time, returns will be ___
C. Find the interquartile range for returns to Darin's Music Emporium.
D. Draw a graph of the eighth decile for returns to Darin's Music Emporium.

III. A recent study indicated 5% of Darin's customers return merchandise sold for credit. What is the probability of Darin having less than 20 returns for a 500 credit sales week?
Quick Questions: Continuous Normal Probability Distributions

I. The average income of 30-year-old college graduates from State University is normally distributed with a mean of $30,000 and a standard deviation of $4,000. Calculate the following being sure to graph each question.
A. P(x < $34,000)
B. P(x > $38,000)
C. P($18,000 ≤ x ≤ $19,800)
D. P(x > $30,000)

II. Grades of State University graduates are normally distributed with a mean of 3.0 and a standard deviation of .3. Calculate the following being sure to graph each question.
A. What grade point average is required to be in the top 5% of this class?
B. Calculate the interquartile range.
C. An eccentric alumnus left scholarship money for students in the third decile from the bottom of their class. Determine the range of the third decile. Would a student with a 2.8 grade point average qualify for this scholarship?
D. What is the median grade point average of the graduating class?
Chapter 11 Sampling and the Sampling Distribution of the Means

Inferential statistics uses sample statistics to estimate population parameters. This chapter will explore how a sample mean \( \bar{x} \) is used to predict its population mean \( \mu \).

II. Why use sample data?
A. The cost of a census is prohibitive.
B. The time required to take a census is not available.
C. Measuring a parameter destroys the item being tested (measuring the mean lifetime in hours of light bulbs).
D. A sample will yield adequate results.

III. Probability samples
A. A probability sample is one in which the likelihood of an item being chosen is known.
B. Probability sampling methods
1. Simple random samples
a. Each population member has an equal chance of being chosen.
b. Put an identification (name, serial number, etc.) into a hat, mix, and select.
c. A table of random digits or a computer program often replaces the hat.
d. To sample 30 out of 485 students using their ID numbers from 1 to 485:
1) Arbitrarily choose a starting point on a table of random digits.
2) Working in some direction (horizontally, vertically, or diagonally), and using the first or last three digits, choose 30 student numbers ignoring those over 485.

Partial Table of Random Digits
1318 7677 9619
2122 8297 1190
0037 6355 4717
4788 9044 5583
2786 1379 5184 0292

2. Systematic random samples
a. Use every nth item beginning at some random point on a list of population members.
b. This method could be biased because population members at the beginning of a list (Mr. Abbott or employee 0001) and end of a list (Ms. Zona or employee 9999) might not have an equal chance of being chosen.
3. Stratified random samples
a. Divide the population into homogeneous subgroups and sample each subgroup.
b. This type of sample can be more representative than a simple random sample because a small diverse section of a population might not be chosen with a simple random sample.
C. Sampling and nonsampling error
1. Sampling error exists because a nonrepresentative sample was used in place of a census.
2. Nonsampling error, which occurs with any survey, exists primarily because of poor collection techniques. High nonsampling error can make a census less accurate than a sample. Why? Limited funds and having to survey all population members cause poor collection techniques and high nonsampling error.

IV. Sampling distribution of the means
A. The sampling distribution of the means consists of all the possible sample means of size n that may be drawn from a population of size N. It is important. Taking one sample is really taking one out of many possible samples. The sampling distribution is the key to why accurate predictions can be made with inferential statistics.
B. Population members A, B, C, D, and E have 1, 2, 3, 4, and 5 children respectively. The sampling distribution of the means, its mean and standard deviation, for a sample of 3 out of 5 has been calculated and demonstrated below.
10 possible samples result from a sample of 3 out of 5

Population Distribution, N = 5
[Graph: frequency (f) plotted against Number of Children (x)]
\( \mu = \frac{\sum x}{N} = \frac{1+2+3+4+5}{5} = 3 \)
\( \sigma \) given at 1.414

Sample   x        x̄
ABC      1,2,3    2.00
ABD      1,2,4    2.33
ABE      1,2,5    2.67
ACD      1,3,4    2.67
ACE      1,3,5    3.00
ADE      1,4,5    3.33
BCD      2,3,4    3.00
BCE      2,3,5    3.33
BDE      2,4,5    3.67
CDE      3,4,5    4.00
Total             30.00

Sampling Distribution of the Means
x̄       f
2.00    1
2.33    1
2.67    2
3.00    2
3.33    2
3.67    1
4.00    1

Sampling Distribution of the Means, n = 3, N = 10
[Graph: frequency (f) plotted against sample mean (x̄)]
\( \mu_{\bar{x}} = \frac{\sum \bar{x}}{N} = \frac{30}{10} = 3 \)
\( \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{1.414}{\sqrt{3}} = .82 \)
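The list of 10 possible samples and their means can be enumerated by machine rather than by hand. The following is a rough sketch (variable names are illustrative assumptions) that rebuilds the frequency table of sample means and confirms that the mean of the sample means equals the population mean.

```python
from itertools import combinations
from collections import Counter

# Population members and their number of children
children = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5}

# Every possible sample of 3 members out of 5
samples = list(combinations(children, 3))
sample_means = [sum(children[m] for m in s) / 3 for s in samples]

# Frequency table of sample means, rounded to two decimals as in the notes
frequency = Counter(round(m, 2) for m in sample_means)

population_mean = sum(children.values()) / len(children)
mean_of_means = sum(sample_means) / len(sample_means)

print(len(samples))                # 10
print(sorted(frequency.items()))
print(population_mean, round(mean_of_means, 2))  # 3.0 3.0
```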
V. Central limit theorem
A. The sampling distribution of the means will be normal whenever the population is normal.
B. The central limit theorem also applies to skewed populations provided the sample is large (n ≥ 30).
C. The relationship between the parameters of a population and its sampling distribution is shown below.
[Graph: Sampling Distribution of the Means, P(x) plotted against x]
VI. Using a large sample (n ≥ 30) to determine point and interval estimates of population parameters
A. Point estimates
1. A point estimate is a one-number estimate.
2. Important point estimates
a. A sample mean for its population mean
b. A sample standard deviation for its population standard deviation
B. Interval estimates
1. An interval estimate is a range.
2. A range for \( \mu \), called a confidence interval, is determined using this expression.
\( \bar{x} \pm z\sigma_{\bar{x}} \)
3. The standard deviation of a sampling distribution \( \sigma_{\bar{x}} \), called the standard error of the mean, is very important in determining an interval estimate of a population mean.
\( \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \)
4. Below are two important confidence intervals for \( \mu_{\bar{x}} \) and therefore \( \mu \).
a. 95 percent confidence interval: z for .95/2 = .4750 is 1.96, so the interval is \( \bar{x} \pm 1.96\sigma_{\bar{x}} \)
b. 99 percent confidence interval: z for .99/2 = .4950 is 2.58, so the interval is \( \bar{x} \pm 2.58\sigma_{\bar{x}} \)
Note: These interval estimates are based upon the relationship between z, the population distribution, and the sampling distribution of the means.

Section B Note: When n < 30 and \( \sigma \) is unknown, the t distribution, discussed in chapter 16, must be substituted for the z distribution when making interval estimates. Many statistics software programs do all interval calculations, regardless of sample size, using the t distribution.

Note: Because the sampling distribution is normal regardless of its population's skewness, a sampling distribution's mean can be used to make predictions about one of its sample means. Prediction procedures will be similar to those followed in chapter 10, where the population mean was used to make predictions about a value of x. In practice, the sample mean is known and used to make estimates about the sampling distribution's mean. These estimates also apply to the population mean because said means are equal. These estimates of a population mean can be very accurate because the sampling distribution's standard deviation will be smaller if the sample size is increased. Diminishing returns apply to larger samples being more accurate as the denominator of \( \sigma_{\bar{x}} \) is not n but the square root of n. A sample of 49 is only slightly more accurate than a sample of 36. Why? Because the denominator is only slightly larger (7 vs. 6), and the sampling distribution's standard deviation is not proportionately smaller.

C. When the population standard deviation is unknown, the sample standard deviation may be used as a point estimate of the population standard deviation provided the sample is large. Small samples will be examined in chapter 16.
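The diminishing-returns point (a sample of 49 versus a sample of 36) can be seen numerically. This is a small sketch only; the standard deviation of .70 is an assumed illustrative value, and `standard_error` is a hypothetical helper name.

```python
from math import sqrt

def standard_error(sigma, n):
    """Standard error of the mean: sigma divided by the square root of n."""
    return sigma / sqrt(n)

sigma = 0.70  # assumed population standard deviation, for illustration only
for n in (36, 49, 144):
    print(n, round(standard_error(sigma, n), 4))
# 36 0.1167
# 49 0.1
# 144 0.0583
```

Raising n from 36 to 49 (a 36% larger sample) shrinks the standard error by only about 14%; halving the standard error for n = 36 takes a sample four times as large (n = 144).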
Example: Linda took a random sample of 49 customer orders and found the mean purchase amounted to $7.50. The population standard deviation is known to be $.70. The 99% confidence interval for the population mean purchase has been calculated in this frame.
Note: Linda can lower the range by accepting a confidence interval of only 95% or by increasing the sample size.

Given: \( \bar{x} = 7.50 \), \( \sigma = .70 \), n = 49, z for .99 is 2.58
\( \bar{x} \pm 2.58\sigma_{\bar{x}} = \bar{x} \pm 2.58\frac{\sigma}{\sqrt{n}} \)
\( 7.50 \pm 2.58\frac{.70}{\sqrt{49}} = 7.50 \pm 2.58(.10) = 7.50 \pm .258 \)
7.24 ↔ 7.76
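Linda's interval can be recomputed with a short function. This is a minimal sketch assuming a known population standard deviation and a z value taken from the notes (2.58 for 99%, 1.96 for 95%); the function name is an illustrative choice.

```python
from math import sqrt

def mean_confidence_interval(sample_mean, sigma, n, z):
    """Return (lower, upper) for sample_mean +/- z * sigma / sqrt(n)."""
    margin = z * sigma / sqrt(n)
    return sample_mean - margin, sample_mean + margin

low_99, high_99 = mean_confidence_interval(7.50, 0.70, 49, 2.58)
low_95, high_95 = mean_confidence_interval(7.50, 0.70, 49, 1.96)

print(round(low_99, 2), round(high_99, 2))  # 7.24 7.76
print(round(low_95, 2), round(high_95, 2))  # 7.3 7.7
```

The second call illustrates the note: accepting 95% confidence instead of 99% narrows the range.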
Practice Set: Sampling and the Sampling Distribution of the Means

I. Darin's new company, Future Horizons Corporation, manufactures a component for computer chips. Darin wants to know the average weight of 1,000 recently produced components. A sample of 36 had a mean weight of 30.025 milligrams and a standard deviation of .065 milligrams. Calculate the 98% confidence interval for the population mean weight of these components.

Raw Data for People Using Statistics Software
29.89  29.96  29.97  30.05  29.97  29.98
29.98  30.06  30.04  30.07  30.05  30.06
29.97  29.95  30.05  30.05  29.95  30.09
29.95  29.99  30.06  30.06  29.89  30.09
29.99  29.99  29.98  30.02  30.08  30.01
30.09  30.06  30.08  30.12  30.16  30.15

II. Calculate the 95% confidence interval using problem I information.

III. What can Darin do to make this interval smaller?
Quick Questions: Sampling and the Sampling Distribution of the Means

I. Place the number of the appropriate formula next to the concept it defines.
A. The 99% confidence interval ___
B. Standard error of the mean ___
C. Used when the population variance is unknown and the sample is large. ___
D. The 95% confidence interval ___
E. The mean of the sampling distribution of the means ___

1. \( \frac{\sigma}{\sqrt{n}} \)
2. \( \mu_{\bar{x}} \)
3. \( \bar{x} \pm 2.58\sigma_{\bar{x}} \)
4. \( \bar{x} \pm 1.96\sigma_{\bar{x}} \)
5. \( \bar{x} \pm 2.58\frac{s}{\sqrt{n}} \)
II. Answer the following true or false and fill in the blank questions.
A. The primary cause of sampling error is poor collection techniques. T F
B. The standard error of the mean is halved when the sample size is doubled. T F
C. A one-number estimate of the population mean is called a ___ estimate of the mean.
D. A range for a population parameter is called the ___.
E. A ___ may be more accurate than a simple random sample because a small diverse section of the population might not be chosen with a simple random sample.
III. Calculate the 95% and 99% confidence intervals for the population mean given a sample of 36 resulted in a mean of 55 and a standard deviation of 18.

Data Set for People Using Statistics Software
55  55  39  50  8   48
43  85  38  58  50  57
52  75  55  85  55  8
52  47  62  25  54  7
32  73  40  72  98  53
35  56  55  2   26  46
ea eans ns
Chap Ch apte terr I.
Sampling Sampli ng
istr istrib ibuti utions ons
Estimating th e popu populatio lation n proporti proportion on successes A. The The po popu pula lati tion on pr prop opor orti tion on is the the av aver erag age e pa part rt of a popu popula lati tion on havi having ng a parti particul cular ar trai trait. t. P = population size B. It may may be ex expr pres esse sed d as a frac fracti tion on,, deci decima mall, or per erce cent nta age. ge. Some Some text texts s use 1t for C. The sample sample propo proporti rtion on is p = the pop popula ulatio tion n prop propor ortio tion. n. D. The popula populatio tion n proporti proportion on is us used ed to meas measur ure e trai traits ts such such as cons consum umer er atti attitu tude des s to towa ward rd a pr prod oduc uct, t, voter voter pr pref efer eren ence ce,, an and d th the e pr prop opor orti tion on of par parts ts passing passing ins inspec pectio tion. n. E. Exp Experi erime ments nts describ described ed here here must meet meet the the bi bino nomi mial al experi experimen mentt condi conditio tions ns descr describ ibed ed on page 52 an and d the no norma rmall appro approxim ximati ation on of the the bi bino nomi mial al condi conditio tions ns descr describ ibed ed on page page 61. F. Esti Estima mati ting ng a conf confid iden ence ce in inte terv rval al for th the e po popu pula lati tion on pr prop opor orti tion on us usin ing g a larg large e samp sample le is exp explai lained ned below below..
p ± zap
2.
crp p where cr
=
P JP<
P is the the samp sample le prop propor orti tion on 2 n the the sa samp mple le si siz ze, is 30 3 z is ba base sed d up upon on the the de desi sire red d conf confid iden ence ce inte interv rval al
and
Example: Linda Smith randomly called 100 customers and found that 80 were happy with the service they received when shopping at Linda's Video Showcase. Calculate a 95% confidence interval for the population proportion.
Given: $n = 100$ and $z$ for 95% confidence is 1.96
$\bar{p} = \frac{x}{n} = \frac{80}{100} = .80$
$n = 100 \geq 30$, $np = 100 \times .8 = 80 \geq 5$, $nq = 100 \times .2 = 20 \geq 5$
The normal approximation of the binomial applies.
$\sigma_{\bar{p}} = \sqrt{\frac{\bar{p}(1-\bar{p})}{n}} = \sqrt{\frac{.8(1-.8)}{100}} = \sqrt{.0016} = .04$
$\bar{p} \pm z\sigma_{\bar{p}} = .80 \pm 1.96(.04) = .80 \pm .0784$
.722 ↔ .878

II. Finite correction factor
A. Thus far, formulas used to calculate the standard error of the mean $\sigma_{\bar{x}}$ and the standard error of the proportion $\sigma_{\bar{p}}$ have been based upon infinitely large populations.
B. If the population is finite, then the relative size of our sample has increased, and the standard error can be reduced using the finite correction factor.
C. The finite correction factor is used to calculate the standard error when $\frac{n}{N} \geq .05$. Smaller ratios are immaterial.
   Standard Error of the Mean: $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}}$
   Standard Error of the Proportion: $\sigma_{\bar{p}} = \sqrt{\frac{\bar{p}(1-\bar{p})}{n}}\sqrt{\frac{N-n}{N-1}}$
D. Linda must adjust her interval calculation because her customer pool totaled 1,000.
   $\frac{n}{N} = \frac{100}{1,000} = .10 \geq .05$
   $\sigma_{\bar{p}} = \sqrt{\frac{\bar{p}(1-\bar{p})}{n}}\sqrt{\frac{N-n}{N-1}} = .04\sqrt{\frac{1,000-100}{1,000-1}} = .04\sqrt{.9009} = .038$
   $\bar{p} \pm z\sigma_{\bar{p}} = .80 \pm 1.96(.038) = .80 \pm .0745$
   .726 ↔ .875
   Note: Because the range is slightly smaller, the prediction may be more useful.
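The two interval calculations for Linda's survey can be checked with a short Python sketch. The function name `proportion_interval` and its parameters are illustrative assumptions, not part of the text. The sketch applies the finite correction factor only when n/N is at least .05, as section II.C states.

```python
import math

def proportion_interval(successes, n, z, population_size=None):
    """Return (low, high) for p_bar +/- z * sigma_p_bar.

    Hypothetical helper: the name and signature are assumptions
    made for this illustration.
    """
    p_bar = successes / n
    std_error = math.sqrt(p_bar * (1 - p_bar) / n)
    # Finite correction factor, used only when n/N >= .05.
    if population_size is not None and n / population_size >= 0.05:
        std_error *= math.sqrt((population_size - n) / (population_size - 1))
    margin = z * std_error
    return p_bar - margin, p_bar + margin

# Linda's survey: 80 of 100 customers happy, 95% confidence (z = 1.96).
low, high = proportion_interval(80, 100, 1.96)
# Same survey, customer pool of 1,000.
low_fcf, high_fcf = proportion_interval(80, 100, 1.96, population_size=1000)

print(round(low, 3), round(high, 3))
print(round(low_fcf, 3), round(high_fcf, 3))
```

Because the sketch does not round the standard error to .038 before multiplying, its corrected upper limit comes out at .874 instead of the .875 shown in the hand calculation.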
70
III. Determining sample size
A. A small sample may give an inadequate answer (too large a confidence interval).
B. A large sample requires excess time and money.
C. Three factors are used to determine an appropriate sample size.
   1. The population variance ($\sigma^2$)
      a. A large population variance means a larger sample is needed to yield acceptable results.
      b. If the population variance is not known, it may be estimated with a small preliminary survey.
   2. The required degree of confidence ($z$)
      a. A given confidence interval (90%) has a matching degree of confidence. In the long run, there is a 90% degree of confidence that the population parameter being measured will fall within the 90% confidence interval.
      b. A higher degree of confidence requires a larger sample.
   3. The amount of acceptable error ($E$)
      a. A study will have some logical acceptable range for the confidence interval.
         1) Income may be estimated to within $500 of the mean.
         2) A part's size may be estimated to within .01 millimeters.
      b. A small acceptable error requires a larger sample.
D. Sample size determination when estimating the population mean
   1. Solving $\bar{x} \pm z\frac{\sigma}{\sqrt{n}}$ for $n$ gives the following sample size formula: $n = \left(\frac{z\sigma}{E}\right)^2$
   2. Note: A large degree of confidence, a large variance, and a small acceptable error all make the sample size larger.
   3. Suppose Linda was unhappy with the average customer purchase range first described on page 67 and summarized below. How large a sample would be required to lower the acceptable error from .26 to .10? Assume the finite correction factor is not applicable.
      Problem Review
      Given: $\bar{x} = 7.50$, $\sigma = .70$, $n = 49$, and $z$ for .99 is 2.58
      $\bar{x} \pm 2.58\frac{\sigma}{\sqrt{n}} = 7.50 \pm 2.58\frac{.70}{\sqrt{49}} = 7.50 \pm .258$
      7.24 ↔ 7.76
      $n = \left(\frac{z\sigma}{E}\right)^2 = \left[\frac{2.58(.70)}{.10}\right]^2 = [18.06]^2 = 326.16 \rightarrow 327$
      Note: always round up
   4. Check your answer by calculating the confidence interval using the new sample size. If the interval is acceptable (within .10), conduct your new survey with a sample of 327.
      $\bar{x} \pm 2.58\frac{.70}{\sqrt{327}} = \bar{x} \pm .09987$ and $.09987 < .10$
   5. When determining the sample size for both mean and proportion problems, answers less than 30 should be rounded up to 30 because sample size formulas are based upon a normal population.
E. Sample size determination when estimating the population proportion
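The sample size formula for the mean, together with the check in step 4, can be sketched in Python as follows. The function name `sample_size_for_mean` and the `minimum` parameter are assumptions introduced here. The floor of 30 reflects the rounding rule in step 5.

```python
import math

def sample_size_for_mean(z, sigma, error, minimum=30):
    """n = (z * sigma / E)^2, always rounded up.

    Hypothetical helper: the name and the `minimum` argument are
    illustrative. Answers under 30 are raised to 30.
    """
    n = math.ceil((z * sigma / error) ** 2)
    return max(n, minimum)

# Linda's purchase data: z = 2.58 (99%), sigma = .70, target error .10.
n = sample_size_for_mean(2.58, 0.70, 0.10)

# Check: the margin of error at the new sample size should be under .10.
margin = 2.58 * 0.70 / math.sqrt(n)

print(n, round(margin, 5))
```

Rounding up instead of to the nearest integer matters here: a sample of 326 would leave the margin slightly above .10.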
   1. Using the problem II data from the previous page, Linda would like to lower the acceptable error associated with the 95% confidence interval for customer satisfaction from ± 7.45% to ± 5%. What sample size is required?
   2. The sample size formula must include the page 70 finite correction factor because $\frac{n}{N}$ is > .05.
      $n = \bar{p}(1-\bar{p})\left(\frac{z}{E}\right)^2\sqrt{\frac{N-n}{N-1}} = .80(1-.80)\left(\frac{1.96}{.05}\right)^2(.949) = .80(.20)(39.2)^2(.949) = 233.3 \rightarrow 234$
   3. From these calculations, it appears that Linda can reduce the range of the confidence interval to ± 5% by increasing the sample size to 234.
   4. If $\bar{p}$ is not known, it may be estimated with a sample of 100.
   5. Also, using $\bar{p}$ of .5 will give the maximum appropriate sample size.
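A minimal Python sketch of the proportion sample size calculation follows. The function name and the optional `correction` argument are assumptions for illustration. The correction value passed in is the finite correction factor from the page 70 calculation (the square root of 900/999, about .949), applied the same way as in the worked example above.

```python
import math

def sample_size_for_proportion(p_bar, z, error, correction=1.0):
    """n = p_bar * (1 - p_bar) * (z / E)^2 * correction, rounded up.

    Hypothetical helper: `correction` is an optional finite correction
    factor supplied by the caller; 1.0 means no correction.
    """
    return math.ceil(p_bar * (1 - p_bar) * (z / error) ** 2 * correction)

# Finite correction factor from Linda's original survey (n = 100, N = 1,000).
fcf = math.sqrt((1000 - 100) / (1000 - 1))

n_corrected = sample_size_for_proportion(0.80, 1.96, 0.05, correction=fcf)
n_uncorrected = sample_size_for_proportion(0.80, 1.96, 0.05)
# Using a proportion of .5 gives the largest sample size for a given z and E.
n_maximum = sample_size_for_proportion(0.50, 1.96, 0.05)

print(n_corrected, n_uncorrected, n_maximum)
```

The uncorrected and .5-based results show how much the finite correction factor and a known proportion each reduce the required sample.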
Practice Set
Sampling Distributions
Part
I. Darin wants to know the proportion of page 68 parts passing inspection. Fifty parts were randomly selected from a recent production run of 1,000 parts and 45 passed inspection.
Data Set
P
P
P
P
For People Using Statistics Software
P
F
P
P
P
P
F
P
P
P
F
F
F
A. Calculate the proportion of parts passing inspection.
B. Darin would like to use last week's data to predict a range for the proportion of future production runs passing inspection. Calculate the 95% confidence interval for the proportion of parts produced by this production process passing inspection.
C. What assumption is Darin making when using last week's data to predict future manufacturing quality?
D. Calculate the 99% confidence interval for the proportion of parts passing inspection.
E. What sample size is necessary to reduce acceptable error to ± 5%?
II. Darin is also concerned about the weight of page 68 parts. It must be possible for the mean weight of parts to be 30 mg with a 99% degree of confidence. As indicated on page 68 and reviewed below, a recent test was barely successful. Darin wants to reduce error from the current ± .0279 mg to ± .025 mg. What sample size is required?
   Page 68 Problem Review
   Given: $\bar{x} = 30.025$ mg, $n = 36$, $z = 2.58$, $s = .065$ mg
   30.025 ± .0279
   29.997 mg ↔ 30.053 mg
   Note: This range indicates the population mean could be under 30 mg.
III. Check your answer to problem II by calculating the 99% confidence interval using a sample size of 45 and a sample standard deviation of .065. Analyze the result.
IV. How would the solution to problem III change if the sample of 45 had been taken from a population of 500 items? Recalculate the answer to problem III using the finite correction factor.
73
Quick Questions
Sampling Distributions
I. Place the number of the appropriate formula next to the item it describes.
A
Population proportion
B
Standard error of the proportion
C
_
Part
JP 1;P
2 N
_
3
Confidence interval for the population proportion
_
D
Finite correction factor
E
When to use the finite correction factor
F
Sample size when predicting the population mean
G
Sample size when predicting the population proportion
5
99% confidence interval
B
95%
x
ii
6
_ _
A survey of 80 New York City voters revealed 60 planned to vote in the next election. Calculate both the 99% and 95% confidence interval for the population proportion.
A
±z p
_ _
confidence interval
p 1
7
P
~
z; 2 Data t For Peop People le Usin Using g Statisti Stat istics cs Softwa Software re
Y
N
Y
Y
Y
N N
Y Y
Y Y
Y Y
Y N
Y
N
N
Y
Y
Y
Y
Y
N
Y
Y
N
Y
Y
N
Y
Y
N
Y
N
Y
Y
Y
Y
N
Y
Y
N
Y
Y
Y
Y
Y
Y
Y
N
Y
Y
Y
Y
Y
Y
N
Y
N
Y Y
74
5
~ ; : : :
4
II
Y
Y
Y
Y Y
N N
Y
Y
N
Y
Y
Y
N
Y
Y
Y
Y
Using the same data, calculate the 99% confidence interval assuming the results came from a city of 1,500 voters.
III. Restaurant customers leave a tip approximately 70% of the time. A 95% confidence interval for the tips proportion is desired. The answer should be correct within 5%. How many customers must be surveyed? Computer users set s to .458.
IV. Linda will consider opening a new video showcase in towns with average family income over $35,000. She requires a 99% confidence interval. The estimate should be within $1,000 of the population mean. Recently gathered data indicates the population standard deviation is $4,000. What size sample is required?
75
Probability Formula Review
I. Types and characteristics of probability
   A. Types of probability
      1. Classical: $P(A) = \frac{A}{N}$
      2. Empirical: $P(A) = \frac{A}{N}$
      3. Subjective: Use empirical formula assuming past data of similar events is appropriate.
   B. Probability characteristics
      1. Range for probability: $0 \leq P(A) \leq 1$
      2. Value of complements: $P(\bar{A}) = 1 - P(A)$
II. Probability rules
   A. Addition is used to find the sum or union of 2 events.
      1. General rule: $P(A \text{ or } B) = P(A) + P(B) - P(A \text{ and } B)$
      2. Special rule: $P(A \text{ or } B) = P(A) + P(B)$ is used when events are mutually exclusive.
   B. Multiplication is used to determine joint probability or the intersection of 2 events.
      1. General rule: $P(A \text{ and } B) = P(A) \times P(B|A)$
      2. Special rule: $P(A \text{ and } B) = P(A) \times P(B)$ is used when the events are independent.
      Note: For independent events, the joint probability is the product of the marginal probabilities.
   C. Bayes' theorem is used to find conditional probability.
      $P(A|B) = \frac{P(A) \times P(B|A)}{P(A) \times P(B|A) + P(\bar{A}) \times P(B|\bar{A})}$
      Note: The denominator is when condition B happens. It happens with $A$ and with $\bar{A}$.
III. Counting rules
   A. The counting rule of multiple events: If one event can happen M ways and a second event can happen N ways, then the two events can happen (M)(N) ways. For 3 events, use (M)(N)(O).
   B. Factorial rule for arranging all of the items of one event: N items can be arranged in $N!$ ways.
   C. Permutation rule for arranging some of the items of one event (order is important: a, b, c and c, a, b are different): ${}_N P_R = \frac{N!}{(N-R)!}$
   D. Combination rule for choosing some of the items of one event (order is not important: abc and cba are the same and are not counted twice): ${}_N C_R = \frac{N!}{R!(N-R)!}$
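The counting rules can be tried out with Python's standard `math` module (`math.perm` and `math.comb` require Python 3.8 or later). The values N = 5, R = 3, M = 3, and the second event's 4 ways are made-up numbers chosen only for illustration.

```python
import math

# Counting rule of multiple events: M ways times N ways.
ways_two_events = 3 * 4

# Factorial rule: N items can be arranged in N! ways.
arrangements_all = math.factorial(5)

# Permutation rule: N! / (N - R)!, order is important.
permutations = math.factorial(5) // math.factorial(5 - 3)

# Combination rule: N! / (R! (N - R)!), order is not important.
combinations = math.factorial(5) // (math.factorial(3) * math.factorial(5 - 3))

# The library functions give the same answers as the formulas.
same_as_library = (permutations == math.perm(5, 3)
                   and combinations == math.comb(5, 3))

print(ways_two_events, arrangements_all, permutations, combinations)
```

Dividing the permutation count by R! (here 3! = 6) gives the combination count, which is why combinations are never larger than permutations for the same N and R.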
IV. Discrete probability distributions
   A. Probability distributions
      1. $P(x)$ is calculated for each value of $x$.
      2. Mean of a probability distribution: $\mu = E(x) = \Sigma[x \cdot P(x)]$
      3. Variance of a probability distribution: $V(x) = [\Sigma x^2 \cdot P(x)] - [E(x)]^2$
   B. Binomial distributions
      $P(x) = \frac{n!}{x!(n-x)!}p^x q^{n-x}$ where
      $n$ is number of trials
      $x$ is number of successes
      $p$ is probability of success
      $q$, the probability of failure, is $1 - p$
      $\mu = np$, $\sigma^2 = npq$, and $\sigma = \sqrt{npq}$
   C. Poisson distributions
      $P(x) = \frac{\mu^x e^{-\mu}}{x!}$ where $\mu = np$
      Poisson approximation of the binomial requires $n \geq 30$ and $np < 5$ or $nq < 5$.
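As a sketch of how the discrete distribution formulas fit together, the Python below builds a binomial distribution, checks its mean and variance against np and npq, and compares one binomial probability with its Poisson approximation. The function names and the example values (n = 10, p = .3 and n = 100, p = .02) are assumptions chosen for illustration.

```python
import math

def binomial_pmf(x, n, p):
    """P(x) = n! / (x! (n - x)!) * p^x * q^(n - x), with q = 1 - p."""
    q = 1 - p
    return math.comb(n, x) * p**x * q**(n - x)

def poisson_pmf(x, mu):
    """P(x) = mu^x * e^(-mu) / x!"""
    return mu**x * math.exp(-mu) / math.factorial(x)

# Entire binomial distribution for n = 10 trials, p = .3.
n, p = 10, 0.3
distribution = [binomial_pmf(x, n, p) for x in range(n + 1)]

# Mean = sum of x * P(x); variance = sum of x^2 * P(x) minus mean squared.
mean = sum(x * px for x, px in enumerate(distribution))
variance = sum(x**2 * px for x, px in enumerate(distribution)) - mean**2

# Poisson approximation: n >= 30 and np < 5 (here np = 2).
exact = binomial_pmf(0, 100, 0.02)
approximate = poisson_pmf(0, 100 * 0.02)

print(round(mean, 4), round(variance, 4))
print(round(exact, 4), round(approximate, 4))
```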
V. The continuous normal probability distribution
   A. To find the probability of $x$ being within a given range: $z = \frac{x - \mu}{\sigma}$
      Normal approximation of the binomial requires $n \geq 30$ and both $np$ and $nq$ are $\geq 5$. The continuity correction factor applies.
   B. To find a range for $x$ given the probability: $\mu \pm z\sigma$
VI. Central limit theorem
   Sampling Distribution of the Means: $\mu_{\bar{x}} = \mu$ and $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$
   If $n \geq 30$, the population may be skewed.
VII. Point estimates
   A. $\bar{x}$ for $\mu$
   B. $s$ for $\sigma$
   C. $\bar{p}$ for $p$
   D. $s_{\bar{x}}$ for $\sigma_{\bar{x}}$ where $s_{\bar{x}} = \frac{s}{\sqrt{n}}$ and $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$
VIII. Interval estimates when $n \geq 30$
   A. For a population mean: $\bar{x} \pm z\frac{\sigma}{\sqrt{n}}$ or $\bar{x} \pm z\frac{s}{\sqrt{n}}$
   B. For a population proportion: $\bar{p} \pm z\sqrt{\frac{\bar{p}(1-\bar{p})}{n}}$
   Note: Use the finite correction factor $\sqrt{\frac{N-n}{N-1}}$ in section VIII formulas when $\frac{n}{N} \geq .05$.
IX. Determining sample size
   A. When estimating the population mean: $n = \left(\frac{z\sigma}{E}\right)^2$
   B. When estimating the population proportion: $n = \bar{p}(1-\bar{p})\left(\frac{z}{E}\right)^2$
77
Section VIII Note: When n < 30 and $\sigma$ is unknown, the t distribution, to be discussed in chapter 16, must be substituted for the z distribution when making interval estimates. Many statistics software programs do all interval calculations, regardless of sample size, using the t distribution.
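For large samples, the z values used in the section VIII interval estimates can be looked up with `statistics.NormalDist` from the Python standard library (Python 3.8 or later). The sketch below is illustrative only; the helper names `z_for_confidence` and `mean_interval` are assumptions. It reuses the average purchase figures from the sample size example (mean 7.50, standard deviation .70, n = 49).

```python
import math
from statistics import NormalDist

def z_for_confidence(confidence):
    """Two-sided z value for a confidence level such as .95 or .99."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)

def mean_interval(x_bar, sigma, n, z):
    """x_bar +/- z * sigma / sqrt(n), for n >= 30."""
    margin = z * sigma / math.sqrt(n)
    return x_bar - margin, x_bar + margin

z_95 = z_for_confidence(0.95)
z_99 = z_for_confidence(0.99)

# Average purchase example, using the table value z = 2.58.
low, high = mean_interval(7.50, 0.70, 49, 2.58)

print(round(z_95, 2), round(z_99, 2))
print(round(low, 2), round(high, 2))
```

The unrounded 99% value is about 2.576, which the tables round to 2.58.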
Probability Test
I. Average hours worked by manufacturing workers is normally distributed with a mean of 41 hours and a standard deviation of .5 hours. Graph and solve the following problems.
   A. P(41 hours ≤ x ≤ 42.5 hours)
   B. P(x < 40.345 hours)
   C. P(41.75 hours ≤ x ≤ 42 hours)
   D. P(39.5 hours ≤ x < 42.5 hours)
II. Study time at State University is normally distributed with a mean of 15 hours per week and a standard deviation of 3 hours. Graph and solve the following problems.
   A. How many hours must a student study to be in the top 1% of the students attending State University?
   B. Calculate the fourth decile.
8
III. Answer the following questions based upon this study of money spent on souvenirs at a virtual reality theme park.

                     Money spent on souvenirs
   Age             Under $5    $5 and over    Totals
   Under 22            5           15           20
   22 and older       20           20           40
   Totals             25           35           60

   A. Use a formula to calculate the P(Age < 22 or Age ≥ 22).
   B. The events in question A are ________ and therefore, the ________ rule for ________ is applicable.
   C. Use a formula to calculate the probability of someone being at least 22 years old and spending $5 and over.
   D. Question C required the ________ rule for ________ because the events are ________.
   E. Use Bayes' theorem to calculate the probability of someone at least 22 years old spending $5 or more.
   F. Using the above chart, calculate the probability of someone at least 22 years old spending less than $5.
   G. Why does your answer to question F make sense?
IV. Use a formula to calculate the probability of tossing a coin 3 times and getting exactly 3 heads. What is the probability of a head coming up on the fourth toss?
V. Four customers have three bank branches and you will visit the manager and assistant manager at each branch. How many managers and assistant managers will you visit?
VI. A salesperson must visit 4 of 6 stores and order is important. That is, AB and BA represent different routes. How many routes are available to the salesperson?
VII. Redo problem VI assuming order does not count. AB and BA are the same and count as one route. Be sure to use a formula and show all work.
VIII. How many different 3 person subcommittees can be chosen from an 8 person committee?
IX. Three of 8 committee members must be chosen to give a speech. All 8 have very different personalities and order is important. How many different speaker arrangements are possible?
X. How many 4-place random numbers can be generated from 10 digits? Repeating digits is allowed.
XI. Six parts are to be inspected from a production process designed to have approximately 5% defective parts. Using the binomial formula, determine the probability of zero defects. Use a table to determine the probability of at least 2 defective parts. State the entire probability distribution. What is the probability of 2 defective parts?
XIII. Place the number of the appropriate item in the space provided.
A. Standard error of the mean
B. 99% confidence interval
C. Standard error of the proportion
D. Requires n be ≥ 30
E. Acceptable error
_
0
3
_
_
x -
Q
E
5 in
58;
X ± 2 58 n
4
_
XIV. Answer the following true or false and fill in
X±
_
JP< ;;P
the blank questions.
A. The standard error of the mean will be halved if the sample size is doubled.
B. Sampling error exists because a nonrepresentative sample was taken in place of a census.
C. A one-number estimate of the population mean is called a ________ estimate of the mean.
D. A range for a population parameter is called the ________.
E. A ________ may be more accurate than a simple random sample because a small diverse section of the population might not be represented in a simple random sample.
XV. A sample of 36 out of 25,000 baseball fans attending a game revealed average refreshment spending of $7.60. The standard deviation for the population is $2.10. Calculate the 95% confidence interval for average refreshment spending by fans attending this game.
XVI. A marketing test of chocolate flavored shaving cream revealed a favorable response from 35 of 50 test subjects. Test subjects were chosen at random from the company's 1,200 employees. Calculate the following:
   A. The 90% confidence interval for this market test.
   B. The company is unhappy with the confidence interval calculated above and would like to lower acceptable error from ____ to 5%. How large a sample must be taken?
82
_ _
Refreshment Spending Data Set for those using statistics software
 4.50   8.00   9.00   9.00   6.95   4.90
 7.00   8.05  10.00   8.00   9.50   2.00
11.00   9.00   5.00   8.00   8.05   8.50
10.00   4.80   6.00   4.90  11.00   9.00
 6.50   7.00   7.00   8.00  11.00   8.00
 5.00   5.75   9.10   6.00   9.10   9.00
Data set for those using statistics software
Favorable (F) and Unfavorable (U) Attitudes Toward Chocolate Flavored Shaving Cream
U
F
F
F
F
F
U
F
F
U
U
F
U
F
F
U
F
F
F
U
F
U
F
F
F
U
F
F
U
F
F
F
F
F
F
U
F
F
U
U
F
F
F
F
F
F
F
F
U
U
XVII. Match each item on the right with the concept it defines.
1. Bayes theorem
1.
2. Addition rule when events are mutually exclusive
3. Variance of a binomial probability distribution
3.
4.
8. Subjective probability
J
x e
~
5.
6.
o;;
P A)
; ;
1
X J
7.
Joint probability is the product of the marginal probabilities
8.
l
± zcr
9.
N N-R) R )
9. General rule for addition
10.
10. Permutation rule
11. To find a range given the probability
13. Mean of a probability distribution
14. Value of a complement
15. For independent events
17. To find the probability given a range
n
11.
A)
12.
1 - P( A)
13.
N
14.
P ( A) + P ( B) - P ( A a n nd d B)
15.
P(A) x P(B)
16.
N ways
17.
[1:x 2
18.
16. Binomial distribution
19.
x
• p x ]
- [E x)]
n x n- x n-x),p q
npq
20.
18. Combination rule
P A x P BIA P A x P B IA + p
19. Poisson distribution
20. The complement of A
M xN
J
7. Empirical probability
12. Classical probability
N-R)
xl
6. Multiplication rule when the events are independent
N
2.
4. Factorial rule for arranging all of the items of one event
5. Range for probability
P(A) + P(B)
22. The counting rule for multiple events
22.
P(A) x P (B I A)
23.
Use empirical formula assuming past data of similar events is appropriate
24.
np
25.
24. Mean of a binomial probability distribution
25. General rule for multiplication
83
J
BIA
21.
21. Variance of a probability distribution
23. Is calculated for each value of x when determining a probability distribution
AJx p
p x)
1: [ x . p x)]
Chapter 13 Large Sample Hypothesis Testing
I. Introduction
   A. Chapter 13 explores a systematic method for testing claims about the population mean using a sample mean.
   B. Large sample ($n \geq 30$) tests using $z$ will be considered. The standard deviation ($\sigma$) may be known or unknown.
   C. Small sample ($n < 30$) t distribution tests used by most statistics software will be explored in chapter 16.
   D. Issues to be tested include
      1. Quality control issues such as the weight of a computer part
      2. Marketing research issues such as the proportion of consumers liking a new product
      3. Political issues such as the proportion of voters planning to vote for a political candidate
II. Definitions
  A. The null hypothesis (H0) states some hypothesized value for a population parameter such as the mean.
    1. Read H sub-zero, its acceptance implies no statistical difference between a parameter (μ) and a statistic (x̄).
    2. Linda Smith wants to know whether the average customer purchase has decreased from last year's mean of $7.75 because a recent sample of 49 had a mean of only $7.50 (see page 67).
      a. A null hypothesis might read: the average purchase has not decreased from $7.75.
      b. In effect, H0: μ ≥ $7.75.
    3. The direction of the inequality is greater than or equal to because this implies the mean has not decreased.
    4. H0 is rejected if the measured difference between the hypothesized μ and x̄ is large and seldom happens.
  B. The alternate (research) hypothesis (H1) represents the possible difference being studied.
    1. Read H sub-one, it implies there is a statistical difference. It is the complement of the null hypothesis.
    2. An alternate hypothesis might read: the mean purchase is under $7.75. In effect, H1: μ < $7.75.
  C. Level of significance
    1. Rejection of a true null hypothesis should rarely happen.
      a. The level of significance states the maximum probability of such an error.
      b. A .01 significance level indicates a sample statistic at least this different from some hypothesized parameter will happen no more than 1% of the time. Therefore, the maximum error is one percent.
      c. The significance level provides a limit for the sample statistic. Beyond this limit, H0 is rejected.
      d. The cost associated with making an incorrect decision determines the appropriate level of significance.
    2. Type I or alpha error (α)
      a. Alpha error equals the level of significance. It measures the risk of rejecting a true null hypothesis.
      b. Deciding to reject the null hypothesis about the average purchase of $7.75 creates the possibility of type I error (accepting a decrease when there is not a decrease).
      c. Traditional alpha errors include .05 for marketing research questions and .01 for quality control questions.
    3. Type II or beta error (β), accepting a false null hypothesis, is examined on page 89.

  Error Summary
                     Nature's True State
    Decision         H0 is true       H0 is false
    Accept H0        Correct          Type II error
    Reject H0        Type I error     Correct

  D. Test statistics and their critical values
    1. Test statistics are used to determine the validity of a null hypothesis. Examples include x̄ and p̄.
    2. Here, x̄ will be used to test a null hypothesis concerning the population mean purchases described above.
    3. We begin by assuming the null hypothesis is true. For the .01 level of significance, a sample mean that separates 1% of the sampling distribution's sample means from the other 99% will be the critical value.
    4. When testing a null hypothesis related to a normal sampling distribution, the test statistic is often converted into its z value. The critical value is expressed as a z value as well; it separates the region of acceptance from the region of rejection.
    5. Here we have a critical value for z of -2.33 for the .01 level of significance, as a table area of .49 corresponds to z = 2.33. This means no more than 1% of the sample means are beyond -2.33 standard deviations from the hypothesized mean and result in the error of rejecting a true null hypothesis.
    6. The alternate hypothesis points toward the region of rejection. In this one-tail problem, with an H1 of μ < $7.75, the critical area is to the left because Linda is concerned that a low sample mean of $7.50 indicates the population mean has decreased.

  [Figure: normal curve with the rejection region to the left of the critical value z = -2.33 and the acceptance region (area .49 between the critical value and the mean) to its right.]
84
III. A 5-step approach to hypothesis testing
  A. State the null hypothesis and alternate hypothesis.
    1. Determine the condition (claim, concern, difference) being tested using >, <, or ≠. Call it H1.
    2. Determine the condition's complement using ≤, ≥, or =. Call it H0.
    3. H0 implies no difference by containing an equality sign. It is stated first.
  B. Select the level of significance based upon acceptable type I error.
  C. Determine the relevant test statistics (x̄ for now; p̄ and others will follow).
  D. Determine the decision rule using a graph of the critical values of z.
    1. Accept the null hypothesis if the test statistic z is not beyond the critical value of z.
    2. Otherwise, reject the null hypothesis.
    Simply put, if the test statistic is extreme enough, beyond the critical value, reject the null hypothesis.
  E. Apply the decision rule.
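Step D depends on finding the critical value of z for the chosen level of significance. The sketch below is one way to look these values up with Python's standard library instead of a z table. The function name `critical_z` and the tail labels "left", "right", and "two" are assumptions made for this illustration, not notation from the text.

```python
from statistics import NormalDist


def critical_z(alpha, tail):
    """Return the critical z value(s) for significance level alpha.

    tail is "left", "right", or "two" (labels invented for this sketch).
    """
    std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1
    if tail == "left":
        return (std_normal.inv_cdf(alpha),)
    if tail == "right":
        return (std_normal.inv_cdf(1 - alpha),)
    if tail == "two":
        # Alpha is split evenly between the two tails.
        upper = std_normal.inv_cdf(1 - alpha / 2)
        return (-upper, upper)
    raise ValueError("tail must be 'left', 'right', or 'two'")


left_01 = critical_z(0.01, "left")   # one-tail test at the .01 level
two_01 = critical_z(0.01, "two")     # two-tail test at the .01 level
two_05 = critical_z(0.05, "two")     # two-tail test at the .05 level

for label, values in [("left .01", left_01), ("two .01", two_01), ("two .05", two_05)]:
    print(label, [round(v, 2) for v in values])
```

Rounded to two decimals, these agree with the table values used in this chapter (-2.33, ±2.58, and ±1.96).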
IV. One-tail testing of one sample mean
  Linda Smith thinks average customer purchases could be lower than last year's $7.75 because a sample of 49 (see page 67) had a mean of only $7.50. The population standard deviation is $.70. Linda wants type I error, the chance of rejecting a true null hypothesis, to be 1%.
  A. H0: μ ≥ 7.75 and H1: μ < 7.75
  B. Type I error is 1%.
  C. The test statistic is x̄.
  D. If z from the test statistic is beyond -2.33, reject the null hypothesis.
  E. Apply the decision rule.

  \[ z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{7.50 - 7.75}{.70 / \sqrt{49}} = \frac{-.25}{.10} = -2.50 \]

  Reject the null hypothesis because a z of -2.50 is beyond (smaller than) the critical value of -2.33. A sample mean of 7.50 happens less than 1% of the time when μ ≥ 7.75.

  [Figure: normal curve centered at 7.75 with the rejection region (area .01) to the left of the critical value z = -2.33 and the acceptance region (area .49) between the critical value and the mean; the sample mean of 7.50 falls in the rejection region.]
Note: If the area beyond the test statistic (the tail) is less than the level of significance, the measured difference is significant and H0 is also rejected. This approach, called p-value hypothesis testing, is used by most statistics software. After completing this page, statistics software users should read part II of page 88 and chapter 16.
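The one-tail example can be checked with a few lines of code. This is a minimal sketch using Linda's figures from the text; the function name `one_sample_z` and the variable names are illustrative choices, and the critical value comes from Python's `statistics.NormalDist` rather than a printed z table.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z(sample_mean, hypothesized_mean, sigma, n):
    """z value of a sample mean under the hypothesized population mean."""
    standard_error = sigma / sqrt(n)
    return (sample_mean - hypothesized_mean) / standard_error


# Figures from the text: sample mean 7.50, hypothesized mean 7.75,
# population standard deviation .70, sample size 49, alpha .01 (left tail).
z = one_sample_z(7.50, 7.75, 0.70, 49)
critical_z = NormalDist().inv_cdf(0.01)  # left-tail critical value, about -2.33
reject_h0 = z < critical_z               # "beyond" means smaller in a left-tail test

print(f"z = {z:.2f}, critical z = {critical_z:.2f}, reject H0: {reject_h0}")
```

The comparison `z < critical_z` only fits a left-tail problem; a right-tail problem would compare in the other direction.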
V. Two-tail testing of one sample mean
  A. Two-tail problems concern any change, regardless of direction.
  B. In the problem above, Linda was not concerned about the average purchase going up. Now she is. The claim concerning the average purchase must be changed to include any difference from last year's average purchase of $7.75. The null hypothesis and alternate hypothesis would be H0: μ = 7.75 and H1: μ ≠ 7.75.
  C. For this two-tail problem, the alternate hypothesis does not state the direction of the change (difference).
  D. Using a .01 level of significance, alpha risk must be divided evenly between the 2 tails of a normal curve.

  \[ \frac{\alpha}{2} = \frac{.01}{2} = .005 \]

  [Figure: normal curve centered at 7.75 with a rejection area of .005 in each tail, an acceptance area of .495 on each side of the mean, and critical values at z = -2.58 and z = +2.58.]
85
  E. The test statistic remains -2.50. Splitting alpha between the two tails changes the critical value to ±2.58. Accept the null hypothesis, as a z of -2.50 is not beyond the critical value of -2.58. At the .01 level of significance, a sample mean of 7.50 is not low enough to conclude the population mean is not 7.75. Note how splitting the .01 level of significance (risk) between two tails increases the critical value. As a result, what was a significant difference is now an acceptable difference.
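The effect of splitting alpha can be seen by running both decision rules against the same test statistic. The sketch below assumes the z of -2.50 from the one-tail example and uses `statistics.NormalDist` for the critical values; the variable names are invented for the illustration.

```python
from statistics import NormalDist

alpha = 0.01
z = -2.50  # test statistic from the example (7.50 vs 7.75, sigma .70, n 49)

std_normal = NormalDist()
one_tail_critical = std_normal.inv_cdf(alpha)          # about -2.33
two_tail_critical = std_normal.inv_cdf(1 - alpha / 2)  # about 2.58, used as +/-

reject_one_tail = z < one_tail_critical        # left-tail rule
reject_two_tail = abs(z) > two_tail_critical   # either-tail rule

print(f"one-tail: critical {one_tail_critical:.2f}, reject H0: {reject_one_tail}")
print(f"two-tail: critical +/-{two_tail_critical:.2f}, reject H0: {reject_two_tail}")
```

The same sample rejects H0 in the one-tail test and accepts it in the two-tail test, which is the contrast described above.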
Practice Set
Large Sample Hypothesis Testing

I. Darin Jones is very concerned that parts designed to weigh less than or equal to 30 mg may be too heavy and not pass inspection. From page 68, we know that a sample of 36 parts resulted in a sample mean of 30.025 mg and a sample standard deviation of .065 mg. Darin wants to control type I error (the probability of deciding the parts are too heavy when they are not) to the 1% level of significance. Solve this problem using the 5-step approach to hypothesis testing.
  Special Note: We know the population mean can be less than or equal to 30 mg at the .01 level of significance because the 98% confidence interval calculated earlier for this population mean had a lower limit of 29.999 mg.

II. Using problem I data and a 1% level of significance, determine whether the population mean has changed from 30 milligrams.
86
III. Redo problem II using a .05 level of significance.
Quick Questions
Large Sample Hypothesis Testing

I. Complete the following chart and questions.

  Error Summary
                                          Nature's True State
    Decision Concerning Null Hypothesis   H0 is true     H0 is false
    Accept H0                             ________       ________
    Reject H0                             ________       ________

  A. Type I error is called ________ error.
  B. Type II error is called ________ error.
  C. When z calculated from sample data is beyond the critical value (less than for left tail problems and greater than for right tail problems), the null hypothesis is ________.
  D. T F By setting the confidence level to 99%, we are trying to assure that the alternate (research) hypothesis will not be easily accepted.

II. Make these tests using the 5-step approach to hypothesis testing.
  A. A light bulb warranty states average bulb life is at least 20,000 hours. A sample of 49 bulbs had an average life of 19,000 hours. The population standard deviation is 1,400 hours. Test the warranty claim to the 1% level of significance.
  B. Average weekly manufacturing earnings were 480 and the standard deviation was 72. A recent sample of 36 resulted in a mean of 450. The standard deviation has not changed. Test to the .05 level whether average weekly earnings changed.
87
For People Using Statistics Software
Life of Light Bulbs (Thousands of Hours)

19 17 18 19 19 20 19 21
20 22 20 19 19 21 19 19
18 19 17 19 19 19 19 16
20 19 20 17 19 18 18 18
21 17 18 20 21 18 16 21
19 20 22 19 20 18 20 18
For People Using Statistics Software
Weekly Manufacturing Earnings

500 520 490 580 470 475
565 610 490 420 480 400
445 580 300 440 450 480
400 420 480 410 440 430
390 480 390 460 460 450
420 385 350 500 360 280
Chapter 14 Large Sample Hypothesis Testing Part II

I. Two-tail testing of two sample means from independent populations
  A. Variables are independent when the occurrence of one variable does not affect the value of the other variable.
  B. Linda is interested in whether the average customer purchase is different at two of her stores.
    1. A sample of 50 from store 1 had a mean of $7.50 and a standard deviation of $1.00.
    2. A sample of 32 from store 2 had a mean of $7.40 and a standard deviation of $.80.
  C. The 5-step approach to hypothesis testing
    1. State the null and alternate hypothesis.
      a. This is a two-tail problem because the claim involves any difference in average purchase.
      b. H0: μ1 = μ2 and H1: μ1 ≠ μ2
    2. Since the claim is marketing oriented, the test will be at the .05 level of significance.
    3. Determine the relevant test statistics.
      a. (x̄1 - x̄2) is the relevant test statistic.
      b. The z value for the test is

      \[ z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} \]

      Note: If the difference between the two sample means is large relative to their average standard errors, z for the test will be larger than the critical value of z and the null hypothesis will be rejected.
    4. Determine the decision rule using a graph of the critical values. The critical value of z for α/2 = .05/2 = .025 is ±1.96. If z from the test statistic is beyond ±1.96, the null hypothesis will be rejected.
      Note: This would be a one-tail problem if Linda wanted to know whether one store had a larger average purchase than the other store.
      [Figure: normal curve with critical values at z = -1.96 and z = +1.96.]
    5. Apply the decision rule.

      Store 1: n1 = 50, x̄1 = 7.50, s1 = 1.00
      Store 2: n2 = 32, x̄2 = 7.40, s2 = .80

      \[ z = \frac{7.50 - 7.40}{\sqrt{\dfrac{1.00^2}{50} + \dfrac{.80^2}{32}}} = \frac{.10}{\sqrt{.02 + .02}} = \frac{.10}{.2} = .50 \]

      Accept H0 because .50 < 1.96. Sales are the same at the .05 level of significance.
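The two-store comparison can be reproduced with a short sketch. The function name `two_sample_z` is a label chosen for this illustration; the inputs are the sample sizes, means, and standard deviations given in the text, and the critical value comes from `statistics.NormalDist` instead of a table.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(mean1, sd1, n1, mean2, sd2, n2):
    """z value for the difference between two independent sample means."""
    standard_error = sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / standard_error


# Store 1: n = 50, mean 7.50, s = 1.00; store 2: n = 32, mean 7.40, s = .80
z = two_sample_z(7.50, 1.00, 50, 7.40, 0.80, 32)

alpha = 0.05
critical_z = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96, used as +/-
reject_h0 = abs(z) > critical_z

print(f"z = {z:.2f}, critical z = +/-{critical_z:.2f}, reject H0: {reject_h0}")
```

Taking `abs(z)` lets one comparison cover both tails, so the order in which the stores are entered does not change the decision.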
II. Hypothesis testing using p-values
  A. The p-value approach to hypothesis testing compares the probability associated with the test statistic's tail or tails (p) with the level of significance. P measures the significance of the test data.
    1. If the p-value is smaller than the level of significance, a test statistic this extreme is unlikely (its probability is less than the level of significance), and the null hypothesis is rejected.
    2. A small p-value (a tail of .003) means substantial difference and H0 is rejected.
    3. A large p-value (a tail of .30) means little difference and H0 is easily accepted.
  B. For example, a p-value analysis of the one-tail and two-tail problems on page 85, where z for the test statistic was 2.50 standard errors from the mean and the level of significance was .01, would be done as follows.

    One-tail problem
      The table area between the mean and z = 2.50 is .4938, so p = .5000 - .4938 = .0062.
      Reject H0 because p of .0062 < .01.

    Two-tail problem
      Again p = .5000 - .4938 = .0062. Because this is a two-tail problem, α/2 = .01/2 = .005.
      Accept H0 because p of .0062 > .005.
88
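The tail area in these examples can be computed directly instead of read from a table. The sketch below is an illustration with invented names (`tail_area`, `reject_one_tail`, `reject_two_tail`); it follows the text's convention of comparing one tail with α for a one-tail problem and with α/2 for a two-tail problem.

```python
from statistics import NormalDist


def tail_area(z):
    """Area under the standard normal curve beyond |z| in one tail."""
    return 1 - NormalDist().cdf(abs(z))


alpha = 0.01
p = tail_area(-2.50)  # about .0062

reject_one_tail = p < alpha      # one-tail problem: compare the tail with alpha
reject_two_tail = p < alpha / 2  # two-tail problem: compare the tail with alpha / 2

print(f"p = {p:.4f}")
print(f"one-tail reject H0: {reject_one_tail}, two-tail reject H0: {reject_two_tail}")
```

Many software packages instead report a two-tail p-value of twice the tail area (here about .0124) and compare it with the full α. That gives the same decision as comparing one tail with α/2.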
III. Analyzing type II error
  A. Type II error is the probability of accepting a false null hypothesis.
  B. Linda's two-tailed study concerning any change in the average purchase price from last year's $7.75 (see page 85) will be analyzed. First we will calculate the lower critical value, an accept/reject point for this null hypothesis.

    \[ \frac{\alpha}{2} = \frac{.01}{2} = .005, \qquad .50 - .005 = .495 \;\Rightarrow\; z = \pm 2.58 \]

    \[ \bar{x} = \mu - z(\sigma_{\bar{x}}) = 7.75 - 2.58(.10) = 7.75 - .258 = 7.49 \]

    Here, type II error exists everywhere except for μ = 7.75. This means the amount of type II error varies depending upon the value of the true population mean. We will calculate the probability of type II error for a population mean (μ1) of 7.40.

    \[ z = \frac{\bar{x} - \mu_1}{\sigma / \sqrt{n}} = \frac{7.49 - 7.40}{.70 / \sqrt{49}} = \frac{.09}{.10} = .90 \]

    The table area between μ1 and z = .90 is .3159, so type II error (accepting μ = 7.75 when it equals 7.40) is .50 - .3159 = .1841.

    When the mean is 7.40, Linda's decision rule has a type II error of 18.41%.

    [Figure: two normal curves. The first is centered at μ = 7.75 with critical values z = -2.58 (x̄ = 7.49) and z = +2.58. The second is centered at μ1 = 7.40; the area to the right of x̄ = 7.49 is the type II error.]
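The type II error calculation can be sketched in code. The function name `type_ii_error` and its parameters are assumptions made for this illustration. Unlike the hand calculation, the sketch keeps the unrounded lower critical value (7.492 rather than 7.49) and also subtracts the negligible area beyond the upper critical value, so its result is slightly lower than the text's .1841.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_error(mu0, mu1, sigma, n, z_crit):
    """Probability the sample mean lands in the acceptance region when the true mean is mu1."""
    standard_error = sigma / sqrt(n)
    lower = mu0 - z_crit * standard_error  # lower accept/reject point
    upper = mu0 + z_crit * standard_error  # upper accept/reject point
    sampling_dist = NormalDist(mu1, standard_error)
    return sampling_dist.cdf(upper) - sampling_dist.cdf(lower)


# Linda's two-tail test: mu0 = 7.75, sigma = .70, n = 49, critical z = 2.58
beta = type_ii_error(7.75, 7.40, 0.70, 49, 2.58)

# The text's hand calculation: round the critical mean to 7.49, giving z = .90
book_beta = 1 - NormalDist().cdf(0.90)

print(f"unrounded beta = {beta:.4f}, hand-calculation beta = {book_beta:.4f}")
```

The gap between the two figures comes only from rounding the critical sample mean, not from a different method.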
  C. Operating characteristic curves
    1. The operating characteristic curve graphs the probability of type II error. It depicts all possible type II errors given some acceptable level of type I error. It measures accepting no change when there has been change.
    2. As the true population mean in the above example drops, accepting a false null hypothesis becomes less likely as the right tail area of the second graph becomes smaller. Eventually the true population mean is so small that accepting a false null hypothesis is almost impossible.
    3. As the true mean approaches 7.75, the area to the right gets larger. It reaches a peak of 98+ percent just before 7.75. Type II error does not exist for μ = 7.75 because the null hypothesis is not false.
    4. At a point just beyond 7.75, beta error is still 98+ percent and it drops toward zero as the true population mean increases.
  D. Power curves
    1. A power curve graphs the probability of not making a type II error. It measures:
      a. how often you correctly reject a false null hypothesis
      b. how often you accept a correct research hypothesis
    2. It is the complement of type II error, or 1 - type II error.
    3. The power curve shows accepting a change in quality, consumer attitude, and voter preference when there have been changes in these areas.
    4. Lowering type II error comes at the expense of increasing type I error and vice versa.

  [Figures: Operating Characteristic Curve and Power Curve, each plotting probability (.25 to 1.00) against the true population mean μ around 7.75, with alpha error = .01.]
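Both curves can be tabulated by repeating the type II error calculation over a range of possible true means. The sketch below is an illustration under Linda's two-tail setup (μ = 7.75, σ = .70, n = 49, critical z = 2.58). The function name `oc_and_power` and the list of true means are assumptions made for the example.

```python
from math import sqrt
from statistics import NormalDist


def oc_and_power(mu0, sigma, n, z_crit, true_means):
    """Return (true mean, type II error, power) for each candidate true mean."""
    standard_error = sigma / sqrt(n)
    lower = mu0 - z_crit * standard_error
    upper = mu0 + z_crit * standard_error
    rows = []
    for mu1 in true_means:
        sampling_dist = NormalDist(mu1, standard_error)
        beta = sampling_dist.cdf(upper) - sampling_dist.cdf(lower)  # accept H0
        rows.append((mu1, beta, 1 - beta))                          # power = 1 - beta
    return rows


# 7.75 itself is left out: the null hypothesis is true there, so no type II error exists.
true_means = [7.30, 7.40, 7.50, 7.60, 7.70, 7.80, 7.90, 8.00, 8.10, 8.20]
rows = oc_and_power(7.75, 0.70, 49, 2.58, true_means)

print("true mean   type II   power")
for mu1, beta, power in rows:
    print(f"{mu1:9.2f}   {beta:7.4f}   {power:6.4f}")
```

Plotting the second column against the true mean gives the operating characteristic curve; plotting the third gives the power curve.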
89
Practice Set
Large Sample Hypothesis Testing Part II

I. Darin buys material for his 30-milligram parts from suppliers A and B. A sample of 30 orders placed with supplier A had a mean delivery time of 24 days and a standard deviation of 9 days. A sample of 40 orders placed with supplier B had a mean delivery time of 27 days and a standard deviation of 10 days. Using a .05 level of significance, determine whether these suppliers have different mean delivery times.
16 11 27 32 32 26 26 29 24 29 19 10 19 22 12 17 31 26 35 11 15
Supplier A:
1
Supplier B:
14 37 2 19 12 18 22 23 26 21 19 39 34 27 34 4 42 1 37 31 38 27 38 34 13 4 22 11 32
22 14 39 37 4
3
29 3
17 41 35 26 11 42 25 29 36 17 21
II. Darin has decided to determine the p-value associated with the test of the 30-milligram parts conducted in problem I on page 86. This data was first analyzed on page 68.

  Problem Review
  Given: H0: μ ≤ 30.00 mg, H1: μ > 30.00 mg, x̄ = 30.025 mg, n = 36, s = .065 mg, and α = .01

  \[ z = \frac{\bar{x} - \mu}{s / \sqrt{n}} = \frac{30.025 - 30.000}{.065 / \sqrt{36}} = \frac{.025}{.0108} = 2.315 \]

  Because 2.315 < 2.33, accept H0.

  [Figure: normal curve centered at 30 with the acceptance region to the left of the critical value z_c = 2.33.]
  Note: c is for critical value.

  A. Calculate the p-value associated with this study.
90
  B. Use this p-value to accept or reject the null hypothesis. Does your answer agree with the page 86 answer?
  C. What does this p-value indicate about the strength or validity of the decision made concerning the null hypothesis?
III. Past experience indicates that the population mean weight of material containers used to make computer parts is 5,000 kilograms. The standard deviation is 28 kilograms. Type I error for a sample of 49 will be controlled to the .01 level of significance. The 99% confidence interval is 4,989.68 kilograms to 5,010.32 kilograms.
  A. Calculate the type II error for a two-tail problem using each of these possible population means: 4,985 kg, 4,995 kg, 5,000 kg, 5,005 kg, 5,015 kg.
  B. Using the data calculated in problem A, sketch and label an operating characteristic curve.
  C. Using the data calculated in problem A, sketch and label a power curve.
  Note: An operating characteristic curve and power curve for a one-tail problem are limited to one side of the population mean. Both look like half a normal curve stopping at the mean.
91
Quick Questions
Large Sample Hypothesis Testing Part II

I. Place the number of the description next to the item it describes.
  1. Area beyond the test statistic
  [Descriptions 2 through 4: formula and curve sketches]
  ___ A. Power curve
  ___ B. P-value
  ___ C. z for testing two means
  ___ D. Operating characteristics curve

II. Ace Realty wants to determine whether the average time it takes to sell homes is different for its two offices. A sample of 40 from office 1 revealed a mean of 90 days and a standard deviation of 15 days. A sample of 50 from office 2 revealed a mean of 100 days and a standard deviation of 20 days. Use a .05 level of significance.
For People Using Statistics Software
Days to Sell a Home
Office 1    Office 2
52
95
89
129
108
57
60
80
102
90
64
94
58
123
63
94
90
110
93
63
83
106
91
91
87
109
74
117
80
99
105
93
127
106
137
89
95
78
98
106 116
86
83
123
93
93
120
110
85
119
82
90
103
122
118
86
58
98
100
124
124
100
75
103
84
100
110
108
70
69
98
92
74
90
80
107
127
106
116
95
82
119
98
84
90
107
93
105
110
III. Tough Tire Company is concerned that tread life of its new all weather tire may be below the 70,000 mile warranty. A sample of 36 revealed a mean of 69,800 miles and a standard deviation of 750 miles. Using a .05 level of significance and the p-value approach, test Tough Tire's warranty claim.
92
For People Using Statistics Software
Tire Mileage

69850 71200 69700
69400 69550 70625
70150 69300 70175
70100 69950 70400
68950 68416 69150
71834 70200 70750
69904 68650 69700
69620 68850 69475
70350 70300 69300
70450 70250 68550
70200 68825 69900
68850 69725 70200
IV. The Easy Loan Company wants to determine whether the average length of car loans has increased from last year's population mean of 50 months. A sample of 49 had a mean of 53 months and a standard deviation of 14 months.
  A. Test H0: μ ≤ 50 and H1: μ > 50 at the .05 level of significance.
  B. Calculate the critical value of x̄.
C
Calcul Cal culate ate type type
D
What Wh at is the the type type 54 months
II
error err or for for
II
For People Using Statistics Software
Length of Car Loans
47
58
2
53
52
52
79
72
4
48
55
61
62
68
27
55
49
56
53
78
55
52
44
73
53
57
66
63
49
51
75
42
45
71
69
67
52
53
38
36
43
38
46
32
73
6
23
53
= months
error err or for for thes these e po popu pula lati tion on me mean ans? s? 53 31 months
93
5
1 months
Chapter 15
Hypothesis Testing of Population Proportions
Don't forget to look ahead.

I. Introduction
A. The population proportion, first described on page 70, is the average part of a population having a certain characteristic.
1. The population proportion (p) follows a binomial probability distribution.
2. It may be expressed as a fraction, decimal, or percentage.
3. Important statistics
   Sample proportion: $\bar{p} = \dfrac{x}{n} = \dfrac{\text{number of successes}}{\text{sample size}}$
   Interval estimate for p: $\bar{p} \pm z\sqrt{\dfrac{\bar{p}(1-\bar{p})}{n}}$
B. Proportion tests must meet binomial experiment requirements.
1. The experiment must involve two mutually exclusive outcomes defined as success or failure.
2. Outcomes, which can be counted, must be independent and constant.
3. n is the number of trials, p is the probability of success, and q, the probability of failure, is 1 - p.
C. These proportion tests use the normal approximation of the binomial. This means both np and nq must be ≥ 5 and n must be ≥ 30. The recommended requirement for n varies from 30 to 100.
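The requirements and the interval estimate above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the z value of 1.96 (a 95% interval) is an assumption chosen for the example, not a value taken from the text.

```python
import math

def normal_approx_ok(n, p, min_count=5, min_n=30):
    """Check the outline's rule: np >= 5, nq >= 5, and n >= 30."""
    q = 1 - p
    return n * p >= min_count and n * q >= min_count and n >= min_n

def proportion_interval(x, n, z):
    """Interval estimate: p_bar +/- z * sqrt(p_bar * (1 - p_bar) / n)."""
    p_bar = x / n
    margin = z * math.sqrt(p_bar * (1 - p_bar) / n)
    return p_bar - margin, p_bar + margin

# Illustrative numbers: 80 successes in 100 trials, hypothesized p = .85.
ok = normal_approx_ok(100, 0.85)
low, high = proportion_interval(80, 100, 1.96)  # z = 1.96 is an assumed 95% level
print(ok, round(low, 4), round(high, 4))
```

The check uses the hypothesized p, while the interval uses the sample proportion, which mirrors how the two formulas are stated in the outline.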
II. One-tail testing of one sample proportion
A. Linda is applying for a Flopbuster Video franchise. Flopbuster requires that at least 85% of Linda's customers be happy with service at the .05 level of significance. Page 70 sample data indicated 80 of 100 customers were happy with service.
B. Before using the normal approximation to the binomial, the appropriateness of the data must be checked.
1. Both np and nq are ≥ 5 as (100)(.85) = 85 and (100)(.15) = 15.
2. The sample size of 100 is ≥ 30.
C. The 5-step approach to hypothesis testing
1. The null hypothesis and alternate hypothesis are $H_0: p \ge .85$ and $H_1: p < .85$.
2. The level of significance will be .05 and the critical value of z is -1.645.
3. The relevant statistic will be
   $z = \dfrac{\bar{p} - p}{\sqrt{\dfrac{p(1-p)}{n}}}$
   Note: The standard error of the population proportion is based upon the hypothesized population proportion p (sometimes labeled π), and not the sample proportion.
4. Either of 2 decision rules may be used.
a. If z from the test statistic is beyond the critical value of z, the null hypothesis will be rejected.
b. If the p-value is less than the .05 level of significance, the null hypothesis will be rejected.
5. Apply the decision rule.
   $\bar{p} = \dfrac{x}{n} = \dfrac{80}{100} = .80$
   $z = \dfrac{\bar{p} - p}{\sqrt{\dfrac{p(1-p)}{n}}} = \dfrac{.80 - .85}{\sqrt{\dfrac{.85(1-.85)}{100}}} = -1.40$
   Accept $H_0$ because -1.40 is not beyond the critical value of -1.645. Customer satisfaction is ≥ .85.
   The p-value method yields the same answer. A z of -1.40 corresponds to an area of .4192, so p = .5000 - .4192 = .0808. Accept $H_0$ because .0808 > .05.
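The five steps above can be checked numerically with a short Python sketch. The function names are hypothetical, and the normal CDF is computed from `math.erf` instead of the printed z table, so the p-value matches the table value only to about three decimals.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative probability, built from the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def one_sample_proportion_z(x, n, p0):
    """z = (p_bar - p0) / sqrt(p0 * (1 - p0) / n), using the hypothesized p0."""
    p_bar = x / n
    standard_error = math.sqrt(p0 * (1 - p0) / n)
    return (p_bar - p0) / standard_error

# Linda's data: 80 of 100 customers happy, H0: p >= .85, H1: p < .85.
z = one_sample_proportion_z(80, 100, 0.85)
p_value = normal_cdf(z)          # lower-tail area, because H1 is p < .85
critical_z = -1.645              # .05 level, one tail
reject_h0 = z < critical_z
print(round(z, 2), round(p_value, 4), reject_h0)
```

Both decision rules agree here: z is not beyond the critical value and the p-value exceeds .05, so the null hypothesis is not rejected.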
III. Two-tail testing of one sample proportion
A. When any change is being measured, a two-tail problem exists.
B. If the above problem were stated as a two-tail problem, then $H_0: p = .85$ and $H_1: p \ne .85$ would be appropriate.
C. With a two-tail test, p must be doubled to 2(.0808) = .1616. Accept $H_0$ because .1616 > .05.
IV. Two-tail testing of two sample proportions
A. Many interesting problems involve two population proportions.
1. Does consumer satisfaction differ because of gender, age, income, etc.?
2. Does machine A produce fewer defects than machine B?
3. Does taking a certain drug lower the incidence of illness?
B. A two-tail problem
1. Linda wants to know at the .05 level of significance whether two of her stores have equal levels of customer satisfaction. Store 1 had 80 of 100 satisfied customers while store 2 had 45 of 50 satisfied customers.
2. The 5-step approach to hypothesis testing
a. The null hypothesis and alternate hypothesis are $H_0: p_1 = p_2$ and $H_1: p_1 \ne p_2$.
b. The level of significance will be .05, α/2 = .05/2 = .025, and z = ±1.96.
c. The test statistic will be
   $z = \dfrac{\bar{p}_1 - \bar{p}_2}{\sqrt{\dfrac{\bar{p}_w(1-\bar{p}_w)}{n_1} + \dfrac{\bar{p}_w(1-\bar{p}_w)}{n_2}}}$
   $n_1$ is sample size 1 and $x_1$ is successful responses from this sample.
   $n_2$ is sample size 2 and $x_2$ is successful responses from this sample.
   $\bar{p}_1$, the sample proportion for population 1, is $\dfrac{x_1}{n_1} = \dfrac{80}{100} = .80$.
   $\bar{p}_2$, the sample proportion for population 2, is $\dfrac{x_2}{n_2} = \dfrac{45}{50} = .90$.
   $\bar{p}_w$ is the weighted or pooled estimate of the population proportion:
   $\bar{p}_w = \dfrac{\text{total successes}}{\text{total sampled}} = \dfrac{x_1 + x_2}{n_1 + n_2}$
d. The decision rule will be: if z from the test statistic is beyond the critical value of z, the null hypothesis will be rejected.
e. Apply the decision rule.
   $\bar{p}_w = \dfrac{x_1 + x_2}{n_1 + n_2} = \dfrac{80 + 45}{100 + 50} = .833$
   $z = \dfrac{.80 - .90}{\sqrt{\dfrac{.833(1-.833)}{100} + \dfrac{.833(1-.833)}{50}}} = -1.55$
   Accept $H_0$ because -1.55 is not beyond -1.96. Customer satisfaction is the same at the .05 level of significance.
   The p-value method yields the same answer. A z of -1.55 corresponds to an area of .4394, and .5000 - .4394 = .0606 for one tail. Accept $H_0$ because p = 2(.0606) = .1212 and .1212 > .05.

V. One-tail testing of two sample proportions
A. One-tail problems involve change in one direction.
B. Doing the above problem as a one-tail problem, the question could be: does store 2 give better service? The hypotheses are $H_0: p_2 \le p_1$ and $H_1: p_2 > p_1$.
1. Using z yields the following analysis. Accept $H_0$ because, at α = .05, the critical value of z is -1.645 and z = -1.55 is not beyond -1.645.
2. The p-value method yields the following analysis. A z of -1.55 corresponds to an area of .4394 and p = .5000 - .4394 = .0606. Accept $H_0$ because .0606 > .05.
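A minimal Python sketch of the pooled two-sample calculation, using the two stores' counts from the problem above. The function names are hypothetical, and the normal CDF comes from `math.erf` rather than the printed table, so results agree with the worked figures to about three decimals.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative probability, built from the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def two_sample_proportion_z(x1, n1, x2, n2):
    """Pooled z statistic for the difference between two sample proportions."""
    p1 = x1 / n1
    p2 = x2 / n2
    pw = (x1 + x2) / (n1 + n2)   # weighted (pooled) proportion
    standard_error = math.sqrt(pw * (1 - pw) / n1 + pw * (1 - pw) / n2)
    return (p1 - p2) / standard_error, pw

# Store 1: 80 of 100 satisfied; store 2: 45 of 50 satisfied.
z, pw = two_sample_proportion_z(80, 100, 45, 50)
one_tail_p = normal_cdf(z)       # area below z, used for the one-tail question
two_tail_p = 2 * one_tail_p      # doubled for the two-tail question
print(round(pw, 3), round(z, 2), round(one_tail_p, 4), round(two_tail_p, 4))
```

Both p-values exceed .05, which matches the conclusion reached with the critical values of -1.96 (two-tail) and -1.645 (one-tail).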
Practice Set
Hypothesis Testing of Population Proportions

I. Page 72 data showed 90% (45 of 50) of the 30-milligram parts, taken from a lot of 1,000 parts, passed inspection. Darin wants a .01 level of significance test to determine whether the population proportion of parts passing inspection has increased from the 86% reported last year.

II. Darin wants to determine at the .01 level of significance whether there is a difference in the proportion of defects produced during the day and night shifts. A sample of 100 parts was taken from each shift. The day shift had 5 defects and the night shift had 14 defects. Is there a difference in the proportion of defects produced by these two shifts?
Data Set for People Using Statistics Software

Day
P P P P P P P P P P P P P P P P P P P P P P P P P F P P P P P P P P P P P P P P P P P F P P P P F P P P P P P P P P P P P P P P P P P F P P P P P P P P P P P P P P P P P P P P P P P P P F P P P P P P

Night
P P P F P P P F P P P P P P F P P F P P P P P P P P P P P P P P P P P P P P P P P P P F P P P P F P P P P P P F P P P F P P P P P P P F P P P P P P P P F P P P P F P P P P F P P P P P P F P P P P F P
Quick Questions
Hypothesis Testing of Population Proportions

I. Place the number of the appropriate formula or expression next to the item it describes.
A. When using the normal approximation to the binomial distribution,
   1. np and n(1 - p) must be ____
   2. n must be ____
B. A one population test ____
C. $\bar{p}_w$ ____
D. A two population test ____

   Formulas and expressions:
   $z = \dfrac{\bar{p}_1 - \bar{p}_2}{\sqrt{\dfrac{\bar{p}_w(1-\bar{p}_w)}{n_1} + \dfrac{\bar{p}_w(1-\bar{p}_w)}{n_2}}}$
   ≥ 30
   $\dfrac{x_1 + x_2}{n_1 + n_2}$
   ≥ 5

II. A national video publication stated long-term tape rentals average 20% of all tape rentals. A 150-customer study at Linda's Video Showcase revealed 24 long-term rentals. Test at the .05 level of significance whether Linda's long-term rentals are less than the national average.

Data Set for People Using Statistics Software: Length of Video Rentals

III. Linda Smith found that 70 out of 100 customers rented 2 or more tapes at one store and 44 out of 50 rented 2 or more tapes at a second store. Test at the .05 level of significance whether there is a difference between the proportion of customers at these two stores renting 2 or more tapes.
Data Set for People Using Statistics Software
Number of Video Rentals: Store 1 and Store 2
2 2 1 2 2 1 2 1 2 1 2 2 2 2 3 1 2 2 2 2 2 2 2 1 2 4 1 2 2 2 5 1 2 1 2 2 3 1 2 2 2 2 3 2 2 3 2 1 2 4 2 1 2 2 1 3 4 1 5 2 1 2 2 2 1 2 3 2 1 2 2 2 2 2 2 2 2 1 3 2 2 2 1 2 5 1 2 3 2 2 2 1 2 2 2 1 2 2 1 2 3 2 2 4 2 2
2
1 1 4 2 2 1 3 1 2 2 2 3 1
3 1 2 2 1 2 1 2 3 4 2 3 3 1 4 2 2 1 2 2 1 3 2 1 3 3 4 1 2 2
Chapter 16
Small Sample Hypothesis Testing Using Student's t Test

I. Large versus small samples
A. The standard normal distribution (z) is appropriate for large samples (n ≥ 30). The population may be normal or skewed.
1. If σ is unknown, use s.
2. For small samples, n < 30, z is appropriate provided the population is normal and σ is known.
B. The student t distribution is appropriate for small samples, n ...

... 4.26. Mean sales of these salespeople are not equal.
Practice Set 18
Analysis of Variance

I. Darin wants to know whether the variance of 30 mg parts has increased. The standard deviation from a recent sample of 16 parts was .067 milligrams. The standard deviation from an earlier study of 14 parts was .062 milligrams. Test at the .01 level whether the population variance has increased.

Data Set for People Using Statistics Software
Sample 1: 29.91 29.93 29.96 29.95 29.94 29.95 29.97 30.09 30.04 29.96 30.09 30.06 29.95 29.91 30.09 29.92
Sample 2: 29.89 30.09 29.96 29.96 29.98 29.99 30.05 29.99 30.07 30.06 29.97 30.09 30.04 30.09

II. Time passed and the wonders of miniaturization have reduced the 30 mg parts to a weight of only 9 mg. Darin randomly selected samples of 9 mg parts from 3 departments with the following results. People using statistics software should skip to part D.
A. Complete this chart to begin an ANOVA study of the mean weight of parts produced by these 3 departments.

Weight Analysis of 9 mg Parts Produced by 3 Departments
             Parts Sample 1 is T1    Parts Sample 2 is T2    Parts Sample 3 is T3    Row Totals Required for Calculations
             X1       X1²            X2       X2²            X3       X3²
             8.95     80.1025        9.05     81.9025        9.05     81.9025
             8.90     79.2100        9.05     81.9025        9.15     83.7225
             8.90     79.2100        9.10     82.8100        9.10     82.8100
ΣX_T                                                                                 ΣX =
(ΣX_T)²
n                                                                                    N =
(ΣX_T)²/n                                                                            Σ[(ΣX_T)²/n] =
ΣX²                                                                                  ΣX² =

B. Using data from the previous page, calculate the following values.
   $SST = \sum\left[\dfrac{(\sum X_T)^2}{n}\right] - \dfrac{(\sum X)^2}{N} =$
   $SSE = \sum X^2 - \sum\left[\dfrac{(\sum X_T)^2}{n}\right] =$
   $SSTOTAL = \sum X^2 - \dfrac{(\sum X)^2}{N} =$

C. Complete the following chart using the data accumulated to this point.

Variance Analysis Summary Table
Variance Sources             df         Sum of the Squares    Mean Squares    ANOVA
Between Treatments           t - 1 =    SST =                 MST =           F =
Within Treatments (error)    N - t =    SSE =                 MSE =
Total Variance               N - 1 =    SSTOTAL =

D. Using the 5-step approach to hypothesis testing and the above chart, test at the .05 level whether the sample means are from populations with equal means.
Quick Questions 18
Analysis of Variance

I. Copy the formulas and expressions on the right into this ANOVA summary chart.

Variance Analysis Summary Table
Variance Sources             df      Sum of the Squares    Mean Squares    ANOVA
Between Treatments           ____    ____                  ____            ____
Within Treatments (error)    ____    ____                  ____
Total Variance               ____    ____

Formulas and expressions: SST; N - t; SSTOTAL; t - 1; SSE; N - 1; $MST = \dfrac{SST}{t-1}$; $MSE = \dfrac{SSE}{N-t}$; $F = \dfrac{MST}{MSE}$

II. Answer the following fill-in-the-blank questions.
A. Analysis of variance requires the populations be ____ distributed.
B. When using the F distribution, the numerator is always the ____ of the 2 variances.
C. When doing ANOVA, the numerator of the F distribution measures variance ____ the treatments.
D. When doing ANOVA, the denominator of the F distribution measures variance ____ the treatments.

III. Complete the following ANOVA study concerning grade point averages randomly selected by a local college. Those using statistics software should skip to part D.
A. Begin by completing this chart.

Analysis of College Grades Based Upon High School Grades
             High H.S. Grades T1       Medium H.S. Grades T2     Low H.S. Grades T3        Row Totals Required for Calculations
             College Grades X1   X1²   College Grades X2   X2²   College Grades X3   X3²
             3.4                       3.2                       2.1
             3.5                       2.8                       2.5
             3.1                       3.0                       2.7
ΣX_T                                                                                       ΣX =
(ΣX_T)²
n                                                                                          N =
(ΣX_T)²/n                                                                                  Σ[(ΣX_T)²/n] =
ΣX²                                                                                        ΣX² =

B. Using the chart on the previous page, calculate the following values.
   $SST = \sum\left[\dfrac{(\sum X_T)^2}{n}\right] - \dfrac{(\sum X)^2}{N} =$
   $SSE = \sum X^2 - \sum\left[\dfrac{(\sum X_T)^2}{n}\right] =$
   $MST = \dfrac{SST}{t-1} =$
   $MSE = \dfrac{SSE}{N-t} =$

C. Complete the following chart using data accumulated to this point.

Variance Analysis Summary Table
Variance Sources             df         Sum of the Squares    Mean Squares    ANOVA
Between Treatments           t - 1 =    SST =                 MST =           F =
Within Treatments (error)    N - t =    SSE =                 MSE =
Total Variance               N - 1 =    SSTOTAL =

D. Using the 5-step approach to hypothesis testing, test at the .05 level whether these sample means come from populations with equal means.

E. Answer problem D at the .01 level of significance.
Chapter 19
Two Factor Analysis of Variance

I. Sources of variability
A. Variance between treatment variables exists because treatments are not alike.
B. Variance within a treatment is unexplained and due to sampling error.
C. Additional sources of variability (called factors or treatments) may be added to a study.
1. Their variability may be used to reduce unaccounted for, within-treatment variability (error).
2. Additional treatments are called blocking variables.
3. They represent a substantial source of inherent response variability.
4. Treatments must not be independent. Treatment B may affect the factors of treatment A differently.
5. For example, weeks of experience may have a different effect on each of the recently hired salespeople.
Examples of blocking variables include age, gender, education, and time.
II. Two-factor variance analysis
A. In chapter 18, Linda found that her 3 salespeople had different mean weekly sales and that half of the data's variability could be attributed to the salespeople (treatment).
B. Chapter 18 sales data was randomly assigned to each salesperson. Here, it has been arranged by weeks of experience. Using experience as a blocking variable may account for some of the unexplained variability. Treatments are not independent because weeks of experience may affect salespeople differently.
C. $\sum X_B$ is the sales associated with each block (week). The number of treatments is now t; b is the number of blocks.

Weekly Sales (X) in Thousands of Dollars
Block (X_B)    Salesperson L is T1    Salesperson M is T2    Salesperson N is T3    Row Totals Required for Calculations
Weeks          Sales (X1)    X1²      Sales (X2)    X2²      Sales (X3)    X3²      ΣX_B    (ΣX_B)²    (ΣX_B)²/t
1              4             16       6             36       7             49       17      289        96.3
2              6             36       6             36       8             64       20      400        133.3
3              7             49       6             36       9             81       22      484        161.3
4              7             49       8             64       10            100      25      625        208.3
ΣX_T           24                     26                     34                     ΣX = 84
(ΣX_T)²        576                    676                    1156
b              4                      4                      4                      N = 12
(ΣX_T)²/b      144                    169                    289                    Σ[(ΣX_T)²/b] = 602
ΣX²            150                    172                    294                    ΣX² = 616
                                                                                    Σ[(ΣX_B)²/t] = 599.3

Variance Analysis Summary Table
Variance Sources             df                Sum of the Squares    Mean Squares                   ANOVA
Between Treatments           t - 1             SST                   MST = SST/(t - 1)              F = MST/MSE
Block                        b - 1             SSB                   MSB = SSB/(b - 1)              F = MSB/MSE
Within Treatments (error)    (t - 1)(b - 1)    SSE                   MSE = SSE/[(t - 1)(b - 1)]
Total Variance               N - 1             SSTOTAL

Note: This analysis is called mean square because it is based upon the variance.
D. Linda wants to know the variability explained by the blocking variable experience at the .05 level of significance.
E. The 5-step approach to hypothesis testing
1. A check of each null hypothesis will be made.
a. $H_0: \mu_1 = \mu_2 = \mu_3$ and $H_1: \mu_1 \ne \mu_2 \ne \mu_3$ for the treatment means.
b. $H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4$ and $H_1: \mu_1 \ne \mu_2 \ne \mu_3 \ne \mu_4$ for the block means.
2. The level of significance is .05.
3. The test statistic is F.
4. If F from the test statistic is beyond the critical value of F for the .05 level of significance, the null hypothesis will be rejected.
5. Apply the decision rule.
   $SSTOTAL = \sum X^2 - \dfrac{(\sum X)^2}{N} = 616 - \dfrac{84^2}{12} = 616 - 588 = 28$
   $SST = \sum\left[\dfrac{(\sum X_T)^2}{b}\right] - \dfrac{(\sum X)^2}{N} = 602 - \dfrac{84^2}{12} = 602 - 588 = 14$
   $SSB = \sum\left[\dfrac{(\sum X_B)^2}{t}\right] - \dfrac{(\sum X)^2}{N} = 599.3 - \dfrac{84^2}{12} = 599.3 - 588 = 11.3$
   $SSE = SSTOTAL - (SST + SSB) = 28.0 - (14.0 + 11.3) = 2.7$
   Unexplained variability is down from 14.0 to 2.7.
   $MST = \dfrac{SST}{t-1} = \dfrac{14}{3-1} = 7.0$
   $MSB = \dfrac{SSB}{b-1} = \dfrac{11.3}{4-1} = 3.77$
   $MSE = \dfrac{SSE}{(t-1)(b-1)} = \dfrac{2.7}{(3-1)(4-1)} = .45$

   Treatment hypothesis degrees of freedom: t - 1 = 3 - 1 = 2 for the numerator and (t - 1)(b - 1) = (3 - 1)(4 - 1) = 6 for the denominator. The critical value is F = 5.14.
   Reject $H_0$ because $F = \dfrac{MST}{MSE} = \dfrac{7}{.45} = 15.56 > 5.14$. Average salesperson sales are not equal.

   Block hypothesis degrees of freedom: b - 1 = 4 - 1 = 3 for the numerator and (t - 1)(b - 1) = (3 - 1)(4 - 1) = 6 for the denominator. The critical value is F = 4.76.
   Reject $H_0$ because $F = \dfrac{MSB}{MSE} = \dfrac{3.77}{.45} = 8.38 > 4.76$. Average weekly sales are not equal.
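The sums of squares above can be reproduced with a short Python sketch. The function name and the layout of the `sales` list are assumptions made for illustration; the data are the weekly sales from the chapter's table. Because the code carries full precision instead of rounding SSB to 11.3 and MSE to .45, its F values (about 15.75 and 8.50) differ slightly from the hand-rounded 15.56 and 8.38, and the conclusions are the same.

```python
# Weekly sales in thousands of dollars.
# Rows are blocks (weeks 1-4); columns are treatments (salespeople L, M, N).
sales = [
    [4, 6, 7],
    [6, 6, 8],
    [7, 6, 9],
    [7, 8, 10],
]

def two_factor_anova(data):
    """Sums of squares, mean squares, and F ratios for a blocked design."""
    b = len(data)        # number of blocks
    t = len(data[0])     # number of treatments
    n_total = b * t
    grand_total = sum(sum(row) for row in data)
    correction = grand_total ** 2 / n_total            # (sum X)^2 / N
    sum_of_squares = sum(x * x for row in data for x in row)
    treatment_totals = [sum(row[j] for row in data) for j in range(t)]
    block_totals = [sum(row) for row in data]

    ss_total = sum_of_squares - correction
    sst = sum(tt * tt for tt in treatment_totals) / b - correction
    ssb = sum(bt * bt for bt in block_totals) / t - correction
    sse = ss_total - (sst + ssb)

    mst = sst / (t - 1)
    msb = ssb / (b - 1)
    mse = sse / ((t - 1) * (b - 1))
    return {
        "SSTOTAL": ss_total, "SST": sst, "SSB": ssb, "SSE": sse,
        "MST": mst, "MSB": msb, "MSE": mse,
        "F_treatment": mst / mse, "F_block": msb / mse,
    }

result = two_factor_anova(sales)
for name, value in result.items():
    print(name, round(value, 2))
```

Comparing `F_treatment` with 5.14 and `F_block` with 4.76 (the critical values quoted in step 5) gives the same two rejections.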
III. Comparing three or more treatment sample means for one-factor analysis
A. Having proven that there is a difference in the average sales of the three treatments (salespeople) in chapter 18, determining whether treatment means differ from each other may be of interest.
B. A range (confidence interval) will be found for the difference between 2 treatment means. A positive range for the difference of these means will indicate the difference could not be zero and the means are different.
C. The t value will be used.
   $(\bar{X}_1 - \bar{X}_2) \pm t\sqrt{MSE\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}$
   The number of observations within each treatment is $n_1$ and $n_2$.
D. We will determine whether average sales for the first and third salesperson are different at the .05 level of significance.

   Salespersons 1 and 3 Average Sales (data from page 109)
   $\bar{X}_1 = \dfrac{\sum X_1}{n_1} = \dfrac{24}{4} = 6.0$
   $\bar{X}_3 = \dfrac{\sum X_3}{n_3} = \dfrac{34}{4} = 8.5$
   t for α/2 and N - t = 12 - 3 = 9 degrees of freedom is 2.262.
   MSE from page 109 is 1.56.

   $(\bar{X}_3 - \bar{X}_1) \pm t\sqrt{MSE\left(\dfrac{1}{n_1} + \dfrac{1}{n_3}\right)} = (8.5 - 6.0) \pm 2.262\sqrt{1.56\left(\dfrac{1}{4} + \dfrac{1}{4}\right)} = 2.5 \pm 2.262\sqrt{.78} = 2.5 \pm 2.0$
   The range is .5 to 4.5. A positive range of .5 to 4.5 indicates the means are different.
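The interval for the difference between two treatment means can be sketched as follows. The function name is hypothetical, and the t value of 2.262 and the MSE of 1.56 are taken from the worked example as constants, not looked up or recomputed.

```python
import math

def mean_difference_interval(mean_a, mean_b, n_a, n_b, mse, t_value):
    """(mean_a - mean_b) +/- t * sqrt(MSE * (1/n_a + 1/n_b))."""
    difference = mean_a - mean_b
    margin = t_value * math.sqrt(mse * (1 / n_a + 1 / n_b))
    return difference - margin, difference + margin

# Salespersons 3 and 1: means 8.5 and 6.0, four weeks each.
low, high = mean_difference_interval(8.5, 6.0, 4, 4, mse=1.56, t_value=2.262)
means_differ = low > 0 or high < 0   # the interval excludes zero
print(round(low, 1), round(high, 1), means_differ)
```

An interval that lies entirely above (or entirely below) zero is what the outline calls a positive range.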
Practice Set 19
Two Factor Analysis of Variance

I. Practice Set 18 will be expanded by assuming the data was randomly collected at hourly intervals. Page 110 data has been arranged accordingly. Darin wants to determine whether samples taken later in a shift are less likely to pass inspection. People using statistics software should skip to part D.
A. Complete this chart to begin an ANOVA study of the production process producing these parts.

Weight Analysis of 9 mg Parts Produced by 3 Departments

Time         Parts Sample 1 is T1    Parts Sample 2 is T2    Parts Sample 3 is T3    Row Totals Required for Calculations
             X1       X1²            X2       X2²            X3       X3²            ΣX_B    (ΣX_B)²    (ΣX_B)²/t
9:15 AM      8.90     79.2100        9.05     81.9025        9.05     81.9025
10:20 AM     8.90     79.2100        9.05     81.9025        9.10     82.8100
11:1 AM      8.95     80.1025        9.10     82.8100        9.15     83.7225
ΣX_T         26.75                   27.20                   27.30                   ΣX = 81.25
(ΣX_T)²      715.5625                739.84                  745.29
b            3                       3                       3                       N = 9
(ΣX_T)²/b    238.521                 246.613                 248.430                 Σ[(ΣX_T)²/b] = 733.564
ΣX²          238.5225                246.6150                248.4350                ΣX² = 733.5725

B. Using the above data, calculate the following values.
   $SST = \sum\left[\dfrac{(\sum X_T)^2}{b}\right] - \dfrac{(\sum X)^2}{N} = 733.564 - \dfrac{(81.25)^2}{9} = 733.564 - 733.507 = .057$
   $SSB = \sum\left[\dfrac{(\sum X_B)^2}{t}\right] - \dfrac{(\sum X)^2}{N} =$
   $SSTOTAL = \sum X^2 - \dfrac{(\sum X)^2}{N} =$

C. Complete the following chart using data accumulated to this point.

Variance Analysis Summary Table
Variance Sources             df                  Sum of the Squares    Mean Squares    ANOVA
Between Treatments           t - 1 =             SST =                 MST =           F =
Block                        b - 1 =             SSB =                 MSB =           F =
Within Treatments (error)    (t - 1)(b - 1) =    SSE =                 MSE =
Total Variance               N - 1 =             SSTOTAL =

D. Using the 5-step approach to hypothesis testing, determine at the .01 level of significance whether the sample treatment and block means come from populations with equal means.

II. Using information from page 111, determine at the .01 level of significance whether there is a difference between treatments 1 and ...
Quick Questions 19
Two Factor Analysis of Variance

I. Use the symbols to the right to complete the following ANOVA summary chart.

Variance Analysis Summary Table
Variance Sources             df      Sum of the Squares    Mean Squares    ANOVA
Between Treatments           ____    ____                  ____            ____
Block                        ____    ____                  ____            ____
Within Treatments (error)    ____    ____                  ____
Total Variance               ____    ____

Symbols: t - 1; b - 1; (t - 1)(b - 1); N - 1; SST; SSB; SSE; SSTOTAL; $MST = \dfrac{SST}{t-1}$; $MSB = \dfrac{SSB}{b-1}$; $MSE = \dfrac{SSE}{(t-1)(b-1)}$; $F = \dfrac{MST}{MSE}$; $F = \dfrac{MSB}{MSE}$

II. The analysis in the last set of Quick Questions will be expanded by rearranging the data in each row so it is based upon the amount of time students spend studying. Complete the following ANOVA study concerning college grades and study times collected by a local college. Begin by completing this chart. People using statistics software should skip to part C.

Analysis of College Grades Based Upon High School Grades and Time Spent Studying While in College
College       High H.S. Grades T1        Medium H.S. Grades T2      Low H.S. Grades T3         Row Totals Required for Calculations
Study Time    College Grades X1   X1²    College Grades X2   X2²    College Grades X3   X3²    ΣX_B    (ΣX_B)²    (ΣX_B)²/t
High          3.5                 12.25  3.2                 10.24  2.7                 7.29
Medium        3.4                 11.56  3.0                 9.00   2.5                 6.25
Low           3.1                 9.61   2.8                 7.84   2.1                 4.41
ΣX_T          10                         9                          7.3                        ΣX = 26.3
(ΣX_T)²       100                        81                         53.29
b             3                          3                          3                          N = 9
(ΣX_T)²/b     33.33                      27                         17.76                      Σ[(ΣX_T)²/b] = 78.09
ΣX²           33.42                      27.08                      17.95                      ΣX² = 78.45

A. Using this chart, calculate the following values.
   $SST = \sum\left[\dfrac{(\sum X_T)^2}{b}\right] - \dfrac{(\sum X)^2}{N} =$
   $SSB = \sum\left[\dfrac{(\sum X_B)^2}{t}\right] - \dfrac{(\sum X)^2}{N} =$
   $SSE = SSTOTAL - (SST + SSB) =$

B. Complete the following chart using data accumulated to this point.

Variance Analysis Summary Table
Variance Sources             df                  Sum of the Squares    Mean Squares    ANOVA
Between Treatments           t - 1 =             SST =                 MST =           F =
Block                        b - 1 =             SSB =                 MSB =           F =
Within Treatments (error)    (t - 1)(b - 1) =    SSE =                 MSE =
Total Variance               N - 1 =             SSTOTAL =

C. Using the 5-step approach to hypothesis testing, determine at the .05 level of significance whether these treatment and block means come from populations with equal means.

III. Using the chart data from pages 112 and 113, determine at the .01 level whether there is a difference between treatment means 1 and 2.
Chap Ch apter ter
No Nonp npar aram amet etri ric c Hypot oth hesis sis Tes esti ting ng of Nomina inal Da Data ta
I.
Introduction A Par Parame ametri tric c statis statistic tics s is th the e name name give given n to much of the the mate materi rial al cove covere red d th thro roug ugh h chapte chapterr 19 19.. 1 Para Parame metr tric ic te test sts s invo involv lve e a po popu pula lati tion on pa para rame mete terr for for whic which h th the e te test st st stat atis isti tic c ha has s a kn know own n di dist stri ribu buti tion on (s (sha hape pe). ). 2 Meas Measurem urement ent (data) (data) sophist sophisticat ication ion is of an inte interv rval al or ra rati tio o le leve vel. l. (s (see ee page page 2 B Non Nonpara parametr metric ic statistics statistics are are us used ed when when th the e requ requir irem emen ents ts of par parame ametri tric c statis statistic tics s are not ful fulfil filled led.. 1 Da Data ta is consi consider dered ed distribution-free distribution-free beca becaus use e the the dist distri ribu butio tion n of th the e sa samp mple le st stat atis isti tic c may may be un unkn know own. n. 2 No Nomi mina nall an and d or ordi dina nall da data ta can be test tested ed.. C Co Count unt data data (categ (categori orical cal dat data) a) this chap chapte ter, r, samp sample le ob obse serv rvat atio ions ns (cou (count nts) s) are are grou groupe ped d in into to ca cate tego gori ries es an and d comp compar ared ed to so some me ex expe pect cted ed 1 In this coun countt (f (fre requ quen ency cy). ). A smal smalll diff differ eren ence ce be betw twee een n the the ac actu tual al and ex expe pect cted ed fr freq eque uenc ncie ies s in indi dica cate tes s a matc match. h. 2 Applications a Deter De termi mini ning ng bran brand d prefer preferenc ence e by age, age, gend gender er,, etc. etc. b Meas Me asuri uring ng the succes success s of an adve advert rtis isin ing g camp campai aign gn or df 3 training trai ning progra program. m. 
   D. The chi-square distribution (pronounced kigh square)
      1. The chi-square distribution is like the t distribution because there is a family of curves, one for each degree of freedom.
      2. The distribution becomes more normal as the degrees of freedom increase. Chi-square is the ratio of $(n - 1)s^2$ to $\sigma^2$.

      [Figure: chi-square distribution with df = 3 and α = .05; critical value of $\chi^2$ = 7.82]

II. Goodness of fit tests for one categorical variable
   A. Linda is interested in determining if consumers at her four stores are giving equal acceptance to the low sales price of a new hit music video.

      Music Video Sales
      Store                  A     B     C     D    Totals
      Sample sales, fo       8    22    19    11      60
      Expected sales, fe    15    15    15    15      60

   B. The 5-step approach to hypothesis testing
      1. H0: sales are equally distributed
         H1: sales are not equally distributed
      2. The significance level is .05.
      3. Chi-square is the test statistic.

         $\chi^2 = \sum \left[ \frac{(f_o - f_e)^2}{f_e} \right]$

         fo is an observed frequency of a category.
         fe is an expected frequency of a category. It should be ≥ 5 when using the continuous chi-square distribution for a discrete problem.
         Equal acceptance means fe = 60/4 = 15.
         k is the number of categories. There are k - 1 degrees of freedom for a goodness of fit problem.
      4. The decision rule: If $\chi^2$ from the test statistic is beyond the critical value, the difference is high and the null hypothesis is rejected.
      5. Apply the decision rule for this one-tail test.

         Store    fo    fe    fo - fe    (fo - fe)^2    (fo - fe)^2 / fe
         A         8    15      -7           49          49/15 = 3.27
         B        22    15       7           49          49/15 = 3.27
         C        19    15       4           16          16/15 = 1.07
         D        11    15      -4           16          16/15 = 1.07
                                             Chi-square  $\chi^2$ = 8.68

         df = k - 1 = 4 - 1 = 3, so the critical value of $\chi^2$ is 7.82.
         Reject H0 because 8.68 > 7.82. Sales are not equally distributed.

      Critical Values of Chi-square
      Degrees of                Right-tail area
      freedom       .10      .05      .025     .01      .005
         1          2.71     3.84     5.02     6.64     7.88
         2          4.61     5.99     7.38     9.21    10.60
         3          6.25     7.82     9.35    11.35    12.84
         4          7.78     9.49    11.14    13.28    14.86
         5          9.24    11.07    12.83    15.09    16.75

      Page T 6 has a more complete chi-square table.
Note: This procedure can be used to test unequal expected frequencies. Suppose Store A usually has 40% of company sales and the 3 other stores each have 20%. Store A would be expected to have 24 sales (.40 x 60) and the other stores would be expected to have 12 sales (.20 x 60).
Note: Opinions vary on the exact lower limit for fe.
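The goodness of fit arithmetic for Linda's stores can be sketched in a few lines of Python. This is only an illustration, not part of the original notes: the function name `chi_square_gof` and its arguments are assumptions made for this example, and the critical value 7.82 is simply copied from the chi-square table (df = 3, right-tail area .05).

```python
def chi_square_gof(observed, proportions):
    """Return (expected frequencies, chi-square statistic).

    observed    -- observed counts (fo), one per category
    proportions -- hypothesized share of each category (should sum to 1)
    """
    n = sum(observed)
    expected = [p * n for p in proportions]  # fe = share x total count
    statistic = sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))
    return expected, statistic


sales = [8, 22, 19, 11]   # sample sales for stores A, B, C, D
critical_value = 7.82     # table value for df = 3, alpha = .05

# Equal acceptance: each store is expected to have 1/4 of the 60 sales.
expected_equal, chi_square_equal = chi_square_gof(sales, [0.25] * 4)
reject_equal = chi_square_equal > critical_value

# Unequal expectation from the note: Store A 40%, the other stores 20% each.
expected_unequal, chi_square_unequal = chi_square_gof(sales, [0.40, 0.20, 0.20, 0.20])

print(expected_equal, round(chi_square_equal, 2), reject_equal)
print(expected_unequal, round(chi_square_unequal, 2))
```

Carrying full precision gives a statistic of about 8.67; the 8.68 in the worked table comes from adding the four rounded terms. Either value is beyond 7.82, so the decision is the same.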
III. Test for independence of two categorical variables with a contingency table
   A. The one categorical variable problem on page 120 tested one variable (sales) against some hypothesized frequency to determine if there was a good fit between the hypothesized frequency and the observed frequency.
   B. Here, a contingency table is used to determine if there is a relationship (statistical dependency) between two variables. That is, does knowledge of variable A's value provide knowledge of variable B's value? If so, the variables are statistically dependent. Otherwise, they are independent.
   C. The idea of statistical dependency was first encountered in the probability chapter on page 46. At that time, advertising expenditures and sales revenue were said to be dependent. To be sure sales increased enough when advertising increased to indicate dependency, a statistical proof for dependency could be conducted.
      We must adjust the monthly data on page 46 to weekly data because cell values fe must be ≥ 5. Fifty weeks of data will be studied. The null hypothesis will again proclaim no difference (sales and advertising are independent). The test will measure whether the difference is large and did not happen by chance. That is, the variables are dependent.

      Weekly Advertising and Sales
                                          Sales
      Advertising                ≤ 12,000    > 12,000    Totals
      Less than or equal to 1,000    20           5         25
      Greater than 1,000              5          20         25
      Totals                         25          25         50

   D. The 5-step approach to hypothesis testing
      1. H0: advertising and sales are independent
         H1: advertising and sales are dependent
      2. The significance level is .01.
      3. Chi-square is the test statistic.

         $\chi^2 = \sum \left[ \frac{(f_o - f_e)^2}{f_e} \right]$ where $f_e = \frac{r \times c}{n}$

         In the fe formula, r is the cell's row total, c is its column total, and n is the grand total.
         df = (r - 1)(c - 1), where r is the number of rows and c the number of columns.
      4. If $\chi^2$ from the test statistic is beyond the critical value, reject the null hypothesis.
      5. Applying the decision rule for this one-tail test.
         a. Imagine the above contingency table has only fo data and the fe data cells and totals are blank.
         b. Row and column totals for fe are equal to those of fo.
         c. A table cell is completed by multiplying its row total by its column total and dividing by the grand total. For example, the first cell is

            $f_e = \frac{r \times c}{n} = \frac{25 \times 25}{50} = 12.5$

         Note: If 2 variables are independent, their cell values are in proportion. This formula is used to determine expected row and column cell values.

         Contingency Table of Weekly Advertising and Sales
                                                 Sales
                                      ≤ 12,000       > 12,000        Totals
         Advertising                  fo     fe      fo     fe      fo    fe
         Less than or equal to 1,000  20    12.5      5    12.5     25    25
         Greater than 1,000            5    12.5     20    12.5     25    25
         Totals                       25    25       25    25       50    50

         df = (r - 1)(c - 1) = (2 - 1)(2 - 1) = 1, so the critical value of $\chi^2$ is 6.64 (see chart, page 120).

         $\chi^2 = \frac{(20 - 12.5)^2}{12.5} + \frac{(5 - 12.5)^2}{12.5} + \frac{(5 - 12.5)^2}{12.5} + \frac{(20 - 12.5)^2}{12.5} = 4.5 + 4.5 + 4.5 + 4.5 = 18$

         The null hypothesis is rejected because 18 > 6.64. Advertising expenditures affect sales revenue. These variables are dependent at the .01 level of significance.

Note: Chi-square analysis is used to test interesting relationships such as level of income (low, medium, and high) and frequency of purchase (often and not often).
Note: As demonstrated with this advertising/sales data, it is often necessary to regroup data to assure that fe is ≥ 5. Classes with a low frequency are combined until this requirement is observed.
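The expected-frequency rule (row total times column total, divided by the grand total) and the resulting chi-square statistic can be checked with a short Python sketch. The function name `chi_square_independence` is an assumption made for this illustration; the observed counts are the weekly advertising and sales data above, and 6.64 is the table value for df = 1 at the .01 level.

```python
def chi_square_independence(table):
    """Return (expected table, chi-square statistic, degrees of freedom).

    table -- observed frequencies (fo) as a list of rows
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # fe for each cell = row total x column total / grand total
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    statistic = sum(
        (fo - fe) ** 2 / fe
        for obs_row, exp_row in zip(table, expected)
        for fo, fe in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return expected, statistic, df


# Rows: advertising <= 1,000 and > 1,000; columns: sales <= 12,000 and > 12,000.
observed = [[20, 5],
            [5, 20]]
critical_value = 6.64  # table value for df = 1, alpha = .01

expected, chi_square, df = chi_square_independence(observed)
reject = chi_square > critical_value

print(expected, chi_square, df, reject)
```

The same function accepts larger tables, such as three income levels by two purchase frequencies, as long as every expected frequency is at least 5.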
Practice Set 20
Nonparametric Hypothesis Testing of Nominal Data

I. Darin feels 20% of the 9-mg part defects are produced by the first shift, 30% by the second shift, and 50% by the third shift. Do an 1 level of significance test to determine whether this sample data follows Darin's proposed distribution. People using statistics software do not need to fill out the second chart.

   Analysis of Defects
                            Shift 1    Shift 2    Shift 3    Totals
   Shift defects, fo           6         23         11         40
   Expected defects, fe

   Shift     fo     fe     fo - fe     (fo - fe)^2     (fo - fe)^2 / fe
   1
   2
   3
   Totals
II. This is Darin's page 42 study of customer age and making a sale. Test at the .05 level whether customer age and making a sale are independent. People using statistics software do not need to fill out the chart.

   Customer Age and Making A Sale
                              Customer Age
   Making A Sale     Less than or equal to 20    Over 20    Totals
   No                           16                   8         24
   Yes                          24                  12         36
   Totals                       40                  20         60

   Contingency Table of Customer Age and Making A Sale
                              Customer Age
                     Less than or equal to 20      Over 20        Totals
   Making A Sale          fo        fe            fo      fe     fo     fe
   No                     16                       8
   Yes                    24                      12
   Totals                 40                      20

Note: Tests similar to those conducted here can be used as follows:
   A. Test observed data to see if it follows a normal probability distribution.
   B. Test observed data to see if it follows a Poisson probability distribution.
   C. These tests are easy to perform with statistics software.
Quick Questions
Nonparametric Hypothesis Testing of Nominal Data

I. Place the number of the formula or expression next to the concept it defines.
   A. fe for a contingency table equals
   B. Expected frequency fe must be
   C. $\chi^2$
   D. Chi-square is the ratio of
   E. df for use with a contingency table
   F. df for a goodness of fit problem

   1. $(n - 1)s^2$ to $\sigma^2$
   2. $\frac{r \times c}{n}$
   3. ≥ 5
   4. k - 1
   5. $\sum \left[ \frac{(f_o - f_e)^2}{f_e} \right]$
   6. (r - 1)(c - 1)

II. Last year, 40% of Linda's customers rented 1 tape, 30% rented 2 tapes, 20% rented 3 tapes, and 10% rented 4 or more tapes. Below is last week's tape rental distribution for Linda's stores. Using the 5-step approach to hypothesis testing, test at the .05 level of significance whether there has been a change in the distribution of tape rentals. Each expected frequency will be the total of 1,000 observations multiplied by last year's appropriate percentage.

   Tape Rental Analysis
   Tape Rental     Observed Frequency, fo     Expected Frequency, fe
   1 tape                   300
   2 tapes                  250
   3 tapes                  250
   4+ tapes                 200
   Totals                 1,000

III. Is Linda happy with these test results? Why?
IV. Using the 5-step approach to hypothesis testing and the 1 level of significance, test whether the number of math courses taken and success in statistics are independent. People using statistics software do not need to fill out the table.

   Statistics Grades and Math Background at State University
                                             Grade
   Math courses taken            Less than B    Greater than or equal to B    Totals
   Less than or equal to 2           15                     5                   20
   Greater than 2                     5                    25                   30
   Totals                            20                    30                   50

   Contingency Table of Statistics Grades and Math Background
                                             Grade
                                 Less than B    Greater than or equal to B      Totals
   Math courses taken             fo     fe           fo     fe               fo     fe
   Less than or equal to 2        15                   5                      20
   Greater than 2                  5                  25                      30
   Totals                         20                  30                      50
Chapter 21
Nonparametric Hypothesis Testing of Ordinal Data: Part I

I. A run test is used to determine randomness based upon order of occurrence.
   A. To be successful, an experiment often requires data be randomly collected.
      1. Inferential statistics often requires data be collected randomly.
      2. Quality control, studied in chapter 17, requires defect testing be done to randomly selected items.
   B. Data studied pertains to a two category variable (male/female, pass/fail, etc.). The number of runs (similar observations) determines randomness. Too many or too few runs causes rejection of the null hypothesis.
   C. Linda wants an .05 level test to determine whether the gender of people walking into her store is a random event.
      1. This gender data was collected from Linda's Saturday morning customers. Runs have been underlined (shown here separated by spaces).

         FFF MM FFFF M FFFFF MMMM F MMMM FFFFF MMMMM FF MM FFF

         n1 = 23 females, n2 = 18 males, r = 13 runs
      2. The sample size of either category is n1. The sample size of the other category is n2. The number of runs is r. The sampling distribution of r is approximately normal provided the sample size of either category (n1 or n2) is beyond 20. If both are ≤ 20, tables containing the critical value of r should be used. Here are the mean and standard error associated with the sampling distribution of r.

         $\mu_r = \frac{2 n_1 n_2}{n_1 + n_2} + 1 = \frac{2(23)(18)}{23 + 18} + 1 = \frac{828}{41} + 1 = 21.195$

         $\sigma_r = \sqrt{\frac{2 n_1 n_2 (2 n_1 n_2 - n_1 - n_2)}{(n_1 + n_2)^2 (n_1 + n_2 - 1)}} = \sqrt{\frac{2(23)(18)[2(23)(18) - 23 - 18]}{(23 + 18)^2 (23 + 18 - 1)}} = \sqrt{\frac{651{,}636}{67{,}240}} = 3.113$

      The test statistic is r.

         $z = \frac{r - \mu_r}{\sigma_r} = \frac{13.000 - 21.195}{3.113} = -2.63$

      For the .05 level of significance, z is ± 1.96 for this two-tail test. If z from the test statistic is beyond the critical value of z, the null hypothesis is rejected.
      Reject H0 because -2.63 is beyond -1.96. Gender of customers walking into Linda's store is not random.
   D. Run tests may be done using the median. Runs consist of consecutive outcomes larger or smaller than the median. Outcomes equal to the median are ignored.
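Counting runs by hand is error-prone, so the run test above can be cross-checked with a small Python sketch. The function name `runs_test` is an assumption made for this illustration; the formulas are the mean and standard error given above, and the normal approximation still requires that one category have more than 20 observations.

```python
import math
from itertools import groupby


def runs_test(sequence):
    """Return (n1, n2, r, mean, std_error, z) for a two-category sequence."""
    categories = sorted(set(sequence))
    if len(categories) != 2:
        raise ValueError("a run test needs exactly two categories")
    n1 = sequence.count(categories[0])
    n2 = sequence.count(categories[1])
    r = sum(1 for _ in groupby(sequence))  # each block of repeats is one run
    mean = 2 * n1 * n2 / (n1 + n2) + 1
    variance = (2 * n1 * n2 * (2 * n1 * n2 - n1 - n2)
                / ((n1 + n2) ** 2 * (n1 + n2 - 1)))
    std_error = math.sqrt(variance)
    z = (r - mean) / std_error
    return n1, n2, r, mean, std_error, z


# Linda's Saturday morning customers; the spaces only mark the runs.
customers = "FFF MM FFFF M FFFFF MMMM F MMMM FFFFF MMMMM FF MM FFF".replace(" ", "")

n1, n2, r, mean, std_error, z = runs_test(customers)
reject = abs(z) > 1.96  # two-tail test at the .05 level

print(n1, n2, r, round(mean, 3), round(std_error, 3), round(z, 2), reject)
```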
II. One-tail testing of one sample median using the sign test
   A. This test is equivalent to a one-tail parametric test of 1 sample mean.
   B. Data must be at least ordinal in nature and knowledge about the shape of the distribution is not required.
   C. A (+) sign is assigned to values above the median of interest and a (-) sign to those below the median. Those equal to the median are dropped from the test and n is reduced accordingly.
   D. Our study of inferential statistics began when Linda became concerned about a drop in the average customer purchase from 7.75. If Linda does not know the shape of the distribution, she can do a sign test of this year's data against last year's median of 7.70. Median hourly sale for 7 randomly selected periods will be tested at the .05 level of significance.

         Sample    Median    Sign
           1        7.65      -
           2        7.50      -
           3        8.00      +
           4        7.60      -
           5        7.70      0
           6        7.35      -
           7        7.55      -

      1. If the median has decreased, the proportion of (-) signs should be greater than the proportion of (+) signs.
      2. H0: p ≥ .50 and H1: p < .50 (H1 must be less-than because this is the change being tested.)
         a. For small samples, the binomial distribution is used to calculate the probability of the distribution tail (observations beyond the proposed median). p, often called π, equals .5, n equals total observations, and x equals observations beyond the proposed median. If the probability of the tail is less than the level of significance (alpha), the null hypothesis is rejected.
         b. With a two-tail test, the probability calculation is doubled.
      3. z is appropriate for large samples with p equal to .50 (see section IC of page 94).
      4. The p-value approach to hypothesis testing will be used with these sign tests.
         a. Five median sales figures are below 7.70 and n is 6 because of a tie.
         b. The binomial table (ST 1) yields the following: P(x ≥ 5) = .094 + .016 = .11.
         c. Accept H0 as .11 is greater than .05. Chance could have caused these decreases.
         d. With samples of 6, all must decrease to reject H0 (P(x = 6) = .016 and .016 < .05).
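The binomial tail probability used in this sign test can be computed directly instead of being read from a table. The following Python sketch is an illustration only; the helper name `binomial_upper_tail` is an assumption, and the data are the seven median hourly sales listed above.

```python
from math import comb


def binomial_upper_tail(x, n, p=0.5):
    """Return P(X >= x) for a binomial distribution with n trials."""
    return sum(comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(x, n + 1))


medians = [7.65, 7.50, 8.00, 7.60, 7.70, 7.35, 7.55]
proposed_median = 7.70
alpha = 0.05

below = sum(1 for m in medians if m < proposed_median)  # (-) signs
above = sum(1 for m in medians if m > proposed_median)  # (+) signs
n = below + above                                       # ties are dropped

p_value = binomial_upper_tail(below, n)  # chance of this many (-) signs or more
reject = p_value < alpha

print(below, above, n, round(p_value, 3), reject)
```

For a two-tail sign test the same tail probability would be doubled before comparing it with alpha.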
III. Two-tail testing of 2 sample medians from independent populations using the Mann-Whitney test
   A. This hypothesis test is used when populations are not symmetrical and do not have equal variances.
   B. Data must be at least ordinal in nature.
   C. Procedures
      1. Data from 2 samples will be combined into an ordered array. Sample size may differ.
      2. Beginning with the number 1, data will be ranked. Equal data, called ties, will be given their averaged rank.
      3. Ranks will be assigned to their respective sample and the mean rank of each sample calculated.
      4. If population medians are equal, there will be little difference between the mean rank of each sample.
      5. Either mean calculation, U1 or U2, may be used.
      6. The sampling distribution of U will be approximately normal provided both samples (n1 and n2) are ≥ 10. Special procedures, not covered in Quick Notes Statistics, are used when either n is less than 10.
   D. Twenty-three employees were randomly assigned to training method A or B. Distribution shapes are not known. Linda wants to determine the equality of training methods at the .05 level of significance.

      Method    Score
      A         14 17 27 19 13 32 22 25 18 30 24 33
      B         12 21 28 16 30 26 14 18 28 22 14

      H0: Median1 = Median2
      H1: Median1 ≠ Median2

      U is the test statistic. If z from the test statistic is beyond the critical value of z, H0 will be rejected. That is, the medians are not equal.

      n1 is sample size 1. n2 is sample size 2.
      R1 is the sum of sample 1's ranks. R2 is the sum of sample 2's ranks.
      U = U1 or U2

      $U_1 = n_1 n_2 + \frac{n_1(n_1 + 1)}{2} - R_1$        $U_2 = n_1 n_2 + \frac{n_2(n_2 + 1)}{2} - R_2$

      Rank Ordered Array and Method
      Rank    Score    Method    Ranked Scores A    Ranked Scores B
        1      12        B                                1
        2      13        A            2
        3      14        A            4
        4      14        B                                4
        5      14        B                                4
        6      16        B                                6
        7      17        A            7
        8      18        A            8.5
        9      18        B                                8.5
       10      19        A           10
       11      21        B                               11
       12      22        A           12.5
       13      22        B                               12.5
       14      24        A           14
       15      25        A           15
       16      26        B                               16
       17      27        A           17
       18      28        B                               18.5
       19      28        B                               18.5
       20      30        A           20.5
       21      30        B                               20.5
       22      32        A           22
       23      33        A           23
      Totals                     R1 = 155.5         R2 = 120.5

      R1 has been calculated using the chart.

      $U_1 = (12)(11) + \frac{12(12 + 1)}{2} - 155.5 = 132 + 78 - 155.5 = 54.5$

      $\mu_U = \frac{n_1 n_2}{2} = \frac{(12)(11)}{2} = 66$

      $\sigma_U = \sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}} = \sqrt{\frac{(12)(11)(12 + 11 + 1)}{12}} = \sqrt{\frac{3{,}168}{12}} = 16.248$

      $z = \frac{U - \mu_U}{\sigma_U} = \frac{54.5 - 66.0}{16.248} = -.71$

      This two-tail .05 test has a z of ± 1.96. Accept H0 because z of -.71 from the test statistic is not beyond -1.96. There is not a difference between these median scores.
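The ranking with averaged ties and the U calculation above can be reproduced with a short Python sketch. It is an illustration under stated assumptions: the helper name `average_ranks` and the variable names are made up for this example, and the normal approximation is only appropriate because both samples have at least 10 observations.

```python
import math


def average_ranks(values):
    """Map each distinct value to its rank, averaging the ranks of ties."""
    ordered = sorted(values)
    ranks = {}
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and ordered[j] == ordered[i]:
            j += 1
        # The tied values occupy ranks i + 1 through j; use their average.
        ranks[ordered[i]] = (i + 1 + j) / 2
        i = j
    return ranks


method_a = [14, 17, 27, 19, 13, 32, 22, 25, 18, 30, 24, 33]
method_b = [12, 21, 28, 16, 30, 26, 14, 18, 28, 22, 14]

ranks = average_ranks(method_a + method_b)
r1 = sum(ranks[score] for score in method_a)  # rank sum for method A
r2 = sum(ranks[score] for score in method_b)  # rank sum for method B

n1, n2 = len(method_a), len(method_b)
u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1
mean_u = n1 * n2 / 2
std_error_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
z = (u1 - mean_u) / std_error_u
reject = abs(z) > 1.96  # two-tail test at the .05 level

print(r1, r2, u1, mean_u, round(std_error_u, 3), round(z, 2), reject)
```

A useful check on the ranking is that R1 + R2 equals n(n + 1)/2 for the combined sample, here 23(24)/2 = 276.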
Practice Set 21
Nonparametric Hypothesis Testing of Ordinal Data: Part I

I. Darin wants to determine whether the page 68 computer components were drawn at random. The median 30.045 mg is the standard for this test. Determine at the .05 level of significance whether this data was randomly collected. Data was recorded one column at a time starting at the top of each column. Columns were recorded from left to right.

   29.89 30.05 29.98 30.07 29.97 30.05 29.95 30.06 29.99 30.02 30.09 30.12 29.96 29.97 30.06 30.05 29.95 29.95 29.99 29.89 29.99 30.08 30.06 30.16 29.97 29.98 30.04 30.06 30.05 30.09 30.06 30.09 29.98 30.01 30.08 30.15

II. Darin first studied the number of defective 30 milligram parts on page 96. At that time he did a parametric study because he felt the data was normally distributed. The consistency of raw material inputs has changed and Darin isn't sure the distribution is still normal. Do a .05 level of significance sign test to determine whether defects have increased from last year's median.

   Sample    Median Defects
     1             6
     2             7
     3             5
     4             4
     5             8
     6             6
     7             7
III. Darin wants to reexamine the number of sick days taken by employees based upon education. This data was first presented on page 100. At that time it was assumed the populations were approximately normal with the same variance. As a result, population means were compared. Assume these assumptions might not be true and use a Mann-Whitney .01 level of significance test to determine whether these samples come from populations with equal medians.

   Graduates sick days: 5, 4, 7, 2, 7, 7, 3, 6, 8, 6
   Non-graduates sick days: 9, 13, 8, 6, 14, 6, 12, 16, 8, 1, 7, 11

   People using statistics software should not use this chart.

   Complete this table by: 1) completing an ordered array, 2) assigning a G for graduates and an N for non-graduates to each element of the array, 3) assigning each rank to the appropriate category (non-graduate or graduate), 4) calculating each subtotal, and 5) calculating R1, which equals the sum of the 3 subtotals for non-graduates, or R2, which equals the sum of the 3 subtotals for graduates.

   The chart has three panels (ranks 1 to 8, 9 to 17, and 18 to 23), each with these columns:

   Rank    Ordered Array and Degree Status 1) 2)    Ranked Scores Grads 3)    Ranked Scores Non-grads 3)

   4) Subtotal (one per panel)
   5) R1 =
Quick Questions
Nonparametric Hypothesis Testing of Ordinal Data: Part I

I. Data on defective parts produced by the night shift was presented on page 96 and has been reprinted below. Determine at the 1 level of significance whether these parts were randomly collected. Data has been entered horizontally from left to right starting with the top row.

Night Shift
P P P F P P P
P P P P P P P P P P P P P P P P P P P P P P P P P P P P P P P P P P P P P P P P P P P P P P P P P P P P P P P P P P P P P P P P P P P P P P P P P P P P P P P P
II. Last year's weekly median number of customers renting from one of Linda's stores was 340. Determine at the .05 level of significance whether this sample indicates a change in weekly median customers.

   Sample    Median
     1        338
     2        321
     3        317
     4        364
     5        325
     6        303
III. Darin want wants s to re reex exam amin ine e th the e de deli live very ry time time of 2 supp suppli lier ers s firs firstt pre rese sent nted ed on page page 90 and and repr reprod oduc uced ed belo below. w. Para Parame metr tric ic te test sts s us usin ing g z or t as assu sume me th the e po popu pula lati tion ons s ar are e ap appr prox oxim imat atel ely y no norm rmal al an and d ha have ve eq equa uall vari varian ance ces. s. If th thes ese e co cond ndit itio ions ns ar are e no nott met met or un unkn know own n and th the e shap shape e and disp disper ersi sion on of the the dist distri ribu buti tion ons s are are si simi mila lar, r, the the no nonp npar aram amet etri ric c Mann Mann-W -Whi hitn tney ey test test of 2 me medi dian ans s is ap appr prop opri riat ate. e. Te Test st at the the .05 .05 leve levell of sign significa ificance nce whether th thes ese e samp sample les s come come fr from om a po popu pula lati tion on with with eq equa uall medi median ans. s. For For calc calcul ulat atio ion n conv conven enie ienc nce, e, on only ly the the first first 11 pieces of data will be used fr fro om each data set. Pe Peop ople le usin using g stat statis isti tics cs soft softwa ware re do no t need t o com compl pl et et e this chart. chart. Supplier Supp lier A: 1
22 14 39 37 4
3
29 3
16 11
Supplier B: 14, 37, 2
19 12 18 22 23 26 21 19
Complete this table by: 1) completing an ordered array, 2) assigning an A for supplier A and a B for supplier B to each element of the array, 3) assigning each rank to the appropriate category (supplier A or B), 4) calculating each subtotal, and 5) calculating R1, which equals the sum of the 3 subtotals for supplier A, or R2, which equals the sum of the 3 subtotals for supplier B.

[Worksheet: ranks 1 through 22 in three panels, each with columns for Rank, Ordered Array and Supplier, Supplier A, and B, followed by a Subtotal row; the final entry is R1 = ____]
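The ranking steps of the Mann-Whitney worksheet can be sketched in a few lines of Python. This is a minimal illustration only: the delivery times below are hypothetical (they are not the supplier data from page 90), the function names are made up for this sketch, and tied values are assumed to share the average of their ranks.

```python
def average_ranks(values):
    """Map each distinct value to its rank; tied values share the average rank."""
    positions = {}
    for position, value in enumerate(sorted(values), start=1):
        positions.setdefault(value, []).append(position)
    return {value: sum(p) / len(p) for value, p in positions.items()}


def rank_sums(sample_a, sample_b):
    """Rank the pooled data, then total the ranks belonging to each sample."""
    ranks = average_ranks(sample_a + sample_b)
    r1 = sum(ranks[v] for v in sample_a)
    r2 = sum(ranks[v] for v in sample_b)
    return r1, r2


# Hypothetical delivery times in days (illustration only).
supplier_a = [12, 15, 9, 20, 15]
supplier_b = [14, 18, 22, 15, 25]

r1, r2 = rank_sums(supplier_a, supplier_b)
print("R1 =", r1, "R2 =", r2)
```

A quick check on any hand-completed worksheet: R1 + R2 must equal N(N + 1)/2, where N is the total number of observations.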
Chapter
Nonparametric Hypothesis Testing of Ordinal Data
Part

I. Two-tail testing of 2 sample medians from dependent populations using a paired difference sign test
A. This test is equivalent to a two-tail parametric test for statistical dependence (see part VI, page 99).
B. Data must be at least ordinal in nature. Knowledge of the sampling distribution's shape is not necessary.
C. The test uses a (+) sign to represent situations where the first variable is larger than the second variable. It uses a (-) sign to represent the opposite situation. Zero represents a situation where variables are equal. Zero values are excluded from the test. Each time this happens, the sample size is reduced by one.
D. If the medians are equal, the proportion of (+) signs should be approximately equal to the proportion of (-) signs.
   1. H0: p = .50 and H1: p ≠ .50
   2. For small samples, we use the binomial distribution to determine the likelihood of one of the signs occurring a large number of times. p is equal to .5 and n is equal to the number of observations. If the probability, based upon the observed signs, is greater than the level of significance, the null hypothesis is accepted.
   3. z may be used for large samples with p = .50 (see IC of page 94).
E. Weekly sales before and after a big promotion at three of Linda's stores were 1,200, 1,300 and 1,400 and 1,400, 1,500 and 1,500 respectively. This data was first studied on page 99. At that time, it was assumed the populations were normal. If this were not the case or unknown, a .10 level sign test of the median could have been conducted.
F. The p-value approach to hypothesis testing is used for these sign tests.

   Store   Sign   Sales Dollars Before   Sales Dollars After
     1      +           1,200                  1,400
     2      +           1,300                  1,500
     3      +           1,400                  1,500

   1. This table indicates median sales increased at all 3 stores. Sample size is 3.
   2. The binomial table (ST 1) yields the following: P(x = 3) = .125. For this two-tail test, p = (.125)(2) = .250.
   3. Accept the null hypothesis because .25 > .10. Medians are equal.
   4. This null hypothesis can't be rejected because the sample size is too small.
G. One- and two-tail brand preference tests can be done with a paired difference sign test.
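The sign test arithmetic for the three stores can be checked with a short Python sketch. It assumes, as the notes do, a binomial distribution with p = .5, drops zero differences, and doubles the one-tail probability for a two-tail test; the function name is illustrative only.

```python
from math import comb


def sign_test_p_value(before, after):
    """Two-tail paired difference sign test using the binomial distribution (p = .5)."""
    # Zero differences are excluded, which reduces the sample size.
    diffs = [a - b for b, a in zip(before, after) if a != b]
    n = len(diffs)
    plus = sum(1 for d in diffs if d > 0)
    # Probability of a count at least as extreme as the more frequent sign.
    k = max(plus, n - plus)
    one_tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return n, plus, min(1.0, 2 * one_tail)


before = [1200, 1300, 1400]
after = [1400, 1500, 1500]

n, plus, p_value = sign_test_p_value(before, after)
print(n, plus, p_value)  # 3 3 0.25
```

With only 3 pairs the smallest possible two-tail p-value is .25, which is why the null hypothesis cannot be rejected at the .10 level no matter how the signs fall.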
II. Testing 3 or more sample medians from independent populations using the Kruskal-Wallis test
A. The ANOVA analysis of chapter 18 required populations be normally distributed with equal variances. If these requirements are not met or unknown, the parametric ANOVA test of several means is replaced with the nonparametric Kruskal-Wallis H test of several medians.
B. This test complements the Mann-Whitney test of 2 medians.
C. This test requires that data from independent random samples be at least ordinal in nature.
D. Data is ranked. Ties are assigned the average of their ranks. A true null hypothesis means average group ranks are approximately equal. Special tables, not provided here, should be used if n < 5.
E. The chapter 18 salesperson's sales data, with a week added so n = 5, will be tested for equality of medians at the .05 level of significance. We will not assume normal distributions and use the Kruskal-Wallis test. Weekly sales data is ranked with this chart.

Weekly Sales (x) in Thousands of Dollars

Salesperson L         Salesperson M         Salesperson N
Sales   Rank R1       Sales   Rank R2       Sales   Rank R3
  7       9.5          6.0      5.5          9.0     14.0
  6       5.5          8.0     12.5          8.0     12.5
  7       9.5          6.0      5.5          7.0      9.5
  4       2.0          6.0      5.5         10.0     15.0
  3       1.0          5.0      3.0          7.0      9.5
        R1 = 27.5             R2 = 32.0             R3 = 60.5

F.
\[ H = \frac{12}{N(N+1)}\left[\frac{(\Sigma R_1)^2}{n_1} + \frac{(\Sigma R_2)^2}{n_2} + \cdots + \frac{(\Sigma R_k)^2}{n_k}\right] - 3(N+1) \]

Where:
H is the designated statistic.
k is the number of samples.
N is the number of observations.
n_k is a sample's size.
R_k is a sample's rank total.
df = k - 1 = 3 - 1 = 2, so \chi^2 = 5.99

\[ H = \frac{12}{15(15+1)}\left[\frac{27.5^2}{5} + \frac{32^2}{5} + \frac{60.5^2}{5}\right] - 3(15+1) \]
\[ H = .05[151.25 + 204.80 + 732.05] - 48 = 54.405 - 48.000 = 6.405 \]

Reject H0 because H = 6.41 > 5.99. Medians are not equal.

Notes:
1) An adjustment, not shown here, is required when there are many ties.
2) Both the Mann-Whitney test and the Kruskal-Wallis test require populations be of similar shape and dispersion.
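The ranking and the H calculation for the salesperson data can be reproduced with the following Python sketch. It uses the weekly sales figures from the chart above, assigns tied values the average of their ranks, and omits the tie adjustment mentioned in note 1; the variable and function names are illustrative only.

```python
def average_ranks(values):
    """Map each distinct value to its rank; tied values share the average rank."""
    positions = {}
    for position, value in enumerate(sorted(values), start=1):
        positions.setdefault(value, []).append(position)
    return {value: sum(p) / len(p) for value, p in positions.items()}


# Weekly sales in thousands of dollars for salespersons L, M and N.
weekly_sales = {
    "L": [7, 6, 7, 4, 3],
    "M": [6, 8, 6, 6, 5],
    "N": [9, 8, 7, 10, 7],
}

pooled = [x for sample in weekly_sales.values() for x in sample]
ranks = average_ranks(pooled)
N = len(pooled)

# Rank total for each salesperson (R1, R2, R3 in the chart).
rank_totals = {name: sum(ranks[x] for x in sample)
               for name, sample in weekly_sales.items()}

# Kruskal-Wallis H without the correction for ties.
h = 12 / (N * (N + 1)) * sum(
    rank_totals[name] ** 2 / len(sample)
    for name, sample in weekly_sales.items()
) - 3 * (N + 1)

print(rank_totals)
print(round(h, 3))
```

The result is then compared with the chi-square table value for df = k - 1, as in the worked example.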
Practice Set 22: Nonparametric Hypothesis Testing of Ordinal Data, Part II

I. Darin conducted a training program for 5 recently hired employees. This problem first appeared on page 100. At that time it was assumed that the population was approximately normal. If this assumption is not correct or unknown, a 1% level of significance paired difference sign test may be conducted to determine whether training increased worker efficiency.

II. Darin wants to reexamine the ANOVA study conducted on page 110. That study assumed populations were normally distributed with equal variances. Those assumptions are not appropriate. Conduct a 1% level of significance Kruskal-Wallis test to determine whether the median weights of parts produced by these 3 departments are equal. Page 110 data has been increased to conform with the n ≥ 5 test requirement.

Employee   1   2   3
Efficiency Rating Efficiency Rating fter Before 8
9
8
7
8
4
7
9
5
8
10
Weight Analysis of 9 mg Parts Produced by 3 Departments

Department 1   Department 2   Department 3
    8.95           9.05           9.05
    8.90           9.05           9.15
    8.90           9.10           9.10
    8.92           9.07           9.13
    8.88           9.11           9.14
Quick Questions: Nonparametric Hypothesis Testing of Ordinal Data

I. Linda is tracking the number of work days missed by employees before and after taking part in the company-sponsored lunchtime physical fitness program. This problem first appeared on page 101. At that time it was assumed the populations were approximately normal. If this assumption is not correct, a paired difference sign test may be conducted at the .10 level of significance to determine whether median work days missed has changed.

Employee Absenteeism and Company Sponsored Physical Fitness

Employee   A   B   C   D   E   F   G
Before     8   9   6   8   3   4   5
After      6   7   5   6   5   2   5

II. The page 112 ANOVA high school and college grades study assumed the populations were normally distributed with equal variances. These assumptions are not true or unknown. Conduct a .05 level of significance Kruskal-Wallis test to determine the equality of treatment median grades. Page 112 data has been increased to conform with the n ≥ 5 test requirement.

College Grades Based Upon High School Grades

High H.S. Grades (T1)       Medium H.S. Grades (T2)     Low H.S. Grades (T3)
College Grades   Rank R1    College Grades   Rank R2    College Grades   Rank R3
     3.4                         3.2                         2
     3.5                         2.8                         2.5
     3                           3.0                         2.7
     3.3                         3                           2.3
     3.6                         2.9                         1.8
Executive Summary of Inferential Statistics

Sampling Distribution is Known: Parametric Tests of the Mean and Proportion Using Interval and Ratio Data
Sampling Distribution is Unknown: Nonparametric Tests of the Median Using Ordinal Data

                                     Parametric tests, use with                                      Nonparametric tests, use with
                                     Normal Population    Normal Population    Skewed Population     Skewed Populations
                                     Large Sample         Small Sample         Large Sample          Small Sample
Being Tested                         σ known or unknown   σ is unknown (1)     σ known or unknown
One Sample                                 z                    t                    z                Sign Test
Two Independent Samples                    z                    t                    z                Mann-Whitney Test
Two Dependent Samples
  (paired difference test)                 z                    t                    z                Sign Test
3 or More Independent Samples
  (ANOVA)                                  F                    F               Not Applicable        Kruskal-Wallis Test

(1) If σ is known, z may be used in place of t.

Nonparametric Tests of Nominal Data
One Categorical Variable: Goodness of Fit Test
Two Categorical Variables (Statistical Dependency): Contingency Tables
Inferential Statistics Formula Review

I. Large sample hypothesis testing (n ≥ 30)
   A. One sample mean
      1. One-tail testing determines if a mean is different than a given value in a particular direction.
      2. Two-tail testing determines if a mean is different than a given value in either direction. Divide the level of significance by 2.
      3. The test statistic is z.
   B. One sample proportion
      1. One-tail testing determines if a proportion is different than a given value in a particular direction.
      2. Two-tail testing determines if a proportion is different than a given value in either direction. H0: p = x and H1: p ≠ x
   Two sample means from independent populations
      1. One-tail testing determines if one mean is larger or smaller than another.
      2. Two-tail testing determines if 2 means are equal. Divide the level of significance by 2.
      3. The test statistic is z.
      The weighted proportion is \( \bar{p}_w = \dfrac{\text{total successes}}{\text{total sampled}} \).

   C. Two sample means from dependent populations (paired difference test)
      1. One- and two-tail problems may be analyzed.
      2. The test statistic is t.
         \[ t = \frac{\bar{d}}{s_d/\sqrt{n}}, \qquad \bar{d} = \frac{\Sigma d}{n}, \qquad s_d = \sqrt{\frac{\Sigma (d - \bar{d})^2}{n - 1}}, \qquad df = n - 1 \]
      3. H0: \mu_d = 0 and H1: \mu_d < 0
      Note: \mu_d is negative when H1 involves testing for an increase.

III. Statistical quality control
   A. The x̄ chart
   B. The R chart
   C. The p chart

IV. Analysis of variance
   A. Testing 2 sample variances from normal populations
      1. One- and two-tail problems may be analyzed.
      2. The test statistic is \( F = s_1^2 / s_2^2 \), with df = n - 1 for both the numerator and the denominator.
      3. Two-tail test requires dividing the level of significance by 2.
   B. Analyzing 3 or more sample means from normally distributed populations (ANOVA)
      1. Equality of the means will be tested. H0: \mu_1 = \mu_2 = \mu_3 and H1: \mu_1 ≠ \mu_2 ≠ \mu_3
      2. The test statistic is \( F = MS_T / MSE \).
      3. This is a one-tail test.
   C. Two-factor variance analysis
      1. Equality of 3 or more means will be tested for both a treatment variable and a blocking variable.
      2. The test statistic is \( F = MS_T / MSE \) and \( F = MS_B / MSE \).
      3. This is a one-tail test.
   D. Comparing three or more treatment means to each other
      1. Having rejected the null hypothesis when comparing the means of three or more populations, treatment means can then be compared 2 at a time to determine individual differences.
      2. The test statistic is the range for the difference between the treatments.
         \[ (\bar{x}_3 - \bar{x}_1) \pm t\sqrt{MSE\left(\frac{1}{n_3} + \frac{1}{n_1}\right)} \]
      3. If the range includes 0, conclude there is not a difference. This is a two-tail test.

V. Nonparametric hypothesis testing
   A. Goodness of fit tests for expected frequency of one categorical variable
      1. Do expected frequencies (equal or proportional) match the observed frequency?
      2. The test statistic is chi-square.
         \[ \chi^2 = \Sigma\left[\frac{(f_o - f_e)^2}{f_e}\right], \qquad f_e \ge 5, \qquad df = k - 1 \]
   B. Measuring independence of two categorical variables with a contingency table test
      1. Are two variables dependent?
      2. The test statistic is chi-square.
         \[ \chi^2 = \Sigma\left[\frac{(f_o - f_e)^2}{f_e}\right], \qquad f_e \ge 5, \qquad df = (r - 1)(c - 1) \]
   C. The run test for determining randomness based upon order of occurrence
         \[ z = \frac{r - \mu_r}{\sigma_r} \quad \text{where } r \text{ is the number of runs,} \quad \mu_r = \frac{2 n_1 n_2}{n_1 + n_2} + 1 \quad \text{and} \quad \sigma_r = \sqrt{\frac{2 n_1 n_2 (2 n_1 n_2 - n_1 - n_2)}{(n_1 + n_2)^2 (n_1 + n_2 - 1)}} \]
   D. One- and two-tail testing of one sample median using a sign test.
   E. One- and two-tail testing of 2 medians from independent populations using the Mann-Whitney test.
   F. One- and two-tail testing of 2 medians from dependent populations using the paired difference sign test.
   G. The Kruskal-Wallis test for the equality of 3 or more independent sample medians
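As an illustration of the run test in item V.C of the formula review, the sketch below counts runs in a two-symbol sequence and applies the large-sample z formula. The pass/fail sequence is hypothetical and the function name is made up for this example; whether z is an adequate approximation for a given n1 and n2 is left to the tables the course uses.

```python
from math import sqrt


def runs_test(sequence):
    """Return (runs, mean of runs, std. dev. of runs, z) for a two-symbol sequence."""
    symbols = sorted(set(sequence))
    if len(symbols) != 2:
        raise ValueError("the run test needs exactly two kinds of outcomes")
    n1 = sequence.count(symbols[0])
    n2 = sequence.count(symbols[1])
    # A new run starts every time the symbol changes.
    r = 1 + sum(1 for prev, cur in zip(sequence, sequence[1:]) if prev != cur)
    mu_r = 2 * n1 * n2 / (n1 + n2) + 1
    sigma_r = sqrt(2 * n1 * n2 * (2 * n1 * n2 - n1 - n2)
                   / ((n1 + n2) ** 2 * (n1 + n2 - 1)))
    return r, mu_r, sigma_r, (r - mu_r) / sigma_r


# Hypothetical inspection results: P = pass, F = fail (illustration only).
outcomes = list("PPFPPPFFPPPPFPPP")

r, mu_r, sigma_r, z = runs_test(outcomes)
print(r, mu_r, round(sigma_r, 4), z)
```

Here the observed number of runs equals the expected number, so z is 0 and the sequence gives no evidence against randomness.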
Inferential Statistics Test

I. A sample of 36 out of 25,000 baseball fans attending a game revealed average refreshment spending of $7.60. The population standard deviation was $2.10. The makers of Dud beer will not distribute their product to a ballpark unless it is possible that the average fan spends at least $8.00 on refreshments. Use the 5-step approach to hypothesis testing and a 1% level of significance to test whether this ballpark qualifies to receive Dud beer.
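The test statistic for a problem of this kind can be computed as sketched below. This is only the arithmetic for a large-sample z with a known population standard deviation, using the summary figures stated in the problem; stating the hypotheses, finding the critical value, and drawing the conclusion remain part of the 5-step approach. No finite population correction is applied, on the assumption that 36 out of 25,000 is a small enough fraction to ignore.

```python
from math import sqrt
from statistics import NormalDist

sample_mean = 7.60        # average refreshment spending in the sample
hypothesized_mean = 8.00  # spending level required by the makers of Dud beer
population_sd = 2.10      # stated population standard deviation
n = 36

standard_error = population_sd / sqrt(n)
z = (sample_mean - hypothesized_mean) / standard_error

# Lower-tail probability of a z this small or smaller under the null hypothesis.
lower_tail_p = NormalDist().cdf(z)

print(round(z, 2), round(lower_tail_p, 3))
```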
Data set for those using statistics software
Refreshment Spending
4.50
8.00
9.00
9.00
6.95 10.00
4. 90 8.00
7.00 9.50
8.05 2.00
11.00
9 .0 0
5.00
8.00
8.05
8 .5 0
10.00
4.80
6.00
4. 90
11.00
9.00
6.50
7.00
7.00
8.00
5
5.75
9.10
9.00
11
OO
9.10
II
A marketing test of chocolate flavored shaving cream revealed a favorable response from 35 of 50 test subjects. Test subjects were chosen at random from the company's 1,200 employees. This product will be manufactured if at least 80% of the potential market like the product.
A. Using the 5-step approach to hypothesis testing and a .05 level of significance, determine whether the product will be manufactured.
B. What are the pros and cons of
6.00
Data set for those using statistics software
Favorable and Unfavorable Attitudes Toward Chocolate Flavored Shaving Cream
U
F
F
F
F
F
U
F
F
U
U
F
U
F
F
U F
F U
F F
F F
U F
U
F
F
U
F
F
F
F
F
F
U
F
F
U
U
F
F
F
F
F
F
F
F
U
U
using company employees to test this product?
III. ABC Company is questioning whether the quality of material coming from the company's three suppliers has something to do with the number of defective products. The number of defects from 20 production runs for each supplier were counted. Using a .05 level of significance, determine whether the number of defects and the company supplying materials are related (dependent).
Analysis o f Ma Mate teri rial al Su Supp ppli lier ers s and De Defec fects ts Company 3
Company
Company
fa
V
Totals
High Hig h defect defects s Lo w d e f e c t s
6 14
15 5
30 30
Totals
20
20
20
60
IV. Four people were given extensive sales training. Test whether their sales performance improved using a .05 level of significance. Assume normally distributed populations with unknown standard deviations.
Analysis o f Sal Sales es Trai Trainin ning g Effectiveness Salesperson
Sales
Performance Before
A
12
15
B
13
17
C
10
14 12
Totals
After
V. Owners of the Quick Chow Restaurant are concerned about the average time to serve customers at two of their stores. A sample of 32 customers at store A resulted in a mean service time of 80 seconds and a standard deviation of 8 seconds. A sample of customers at store B resulted in a mean service time of 75 seconds and a standard deviation of 7 seconds. Test at the .02 level of significance whether the mean time to wait on customers at these two stores is the same.

Data set for those using statistics software
Store A   Store B
66
72
79
81
84
73
74
84
70
86
83
68
84
85
65
78
72
74
78
83
85
71
63
71
83
88
72
90
74
64
87
86
62
68
87
85
75
98
73
83
74
78
84
8
78
80
83
70
62
75
93
78
8
72
75
86
69
93
82
71
76
66
74
83
68
8
81
82
82
70
75
64
71
68
78
78
66
75
VI. Before recent improvements it took 36.4 minutes to assemble a part. After improvements a sample of 16 had an average assembly time of 34 minutes. The sample standard deviation was 2.4 minutes. Test at the 1% level of significance whether improvements lowered assembly time.
Data set for those using statistics software
Time After Improvements
35.9   31.8   31.5   36.6
30.8   32.3   32.0   36.2
35.8   35.7   36.8   36.4
32.6   36.8   31.3   31.5
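For readers working problem VI with software, the sketch below shows how the small-sample t statistic could be computed from the 16 assembly times. The sample standard deviation of 2.4 is taken from the problem statement rather than recomputed from the data, and the comparison with the t table value for df = n - 1 is left to the reader.

```python
from math import sqrt
from statistics import mean

# Assembly times after improvements, in minutes (from the data set above).
times_after = [
    35.9, 31.8, 31.5, 36.6,
    30.8, 32.3, 32.0, 36.2,
    35.8, 35.7, 36.8, 36.4,
    32.6, 36.8, 31.3, 31.5,
]

previous_mean = 36.4  # assembly time before improvements
sample_sd = 2.4       # sample standard deviation as stated in the problem

n = len(times_after)
sample_mean = mean(times_after)
t = (sample_mean - previous_mean) / (sample_sd / sqrt(n))
df = n - 1

print(n, round(sample_mean, 1), round(t, 2), df)
```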
VII. Samples of 1 taken in 1985 and 1995 revealed the average time people spend grocery shopping decreased from 18 minutes to 14 minutes. Respective standard deviations were 5 minutes and 4 minutes. Test at the 1% level of significance whether there has been a change in shopping time variability.

Data set for those using statistics software
Shopping Time   1985   1995
VIII. Test at the 5% level of
Day Da y
Accidents
Monday
9
Tuesday
Wednesday
6
Thursday
Q
5
1
17
7
8
16
9
18
8
14
16
23
13
9
18
28
15
18
14
significance whether workplace accidents happen equally throughout the workweek.
Analysis
Friday Totals
99
of
Workpl Wor kplace ace Acci Acciden dents ts
IX. Three computer component assembly methods were compared by Insel Corporation. Employee efficiency was based upon production time and product quality.
A. Use ANOVA analysis to test at the .05 level of significance whether the mean employee efficiencies of these assembly methods are equal.
ANOVA Analys Analysis is o f Ass Assemb embly ly Metho Methods ds Employee Employ ee Efficiency Efficiency Ratings fo r 3 Tre Treatmen atments ts T Method 1
Method 2
Row Totals Totals Required fo r Calculations
M et et h ho od 3
Score
Score
Score
4
6
8
6
7
8
7
4
9
I:xr I X
n ~ X T
I
2
n
SSTOTAL
B. Determine at the 1% level of significance whether there is a difference in performance of those who received teaching methods (treatments) 1 and 3.
= I : x
4
X. Darin wants to compare assembly time of 30 milligram parts using method A and method B. It is not known whether these populations are approximately normal with the same variance. Use the Mann-Whitney test to determine at the .05 level of significance whether these samples come from populations with equal medians.

Time to Assemble 30 Milligram Parts in Seconds
Method A
90
95
104
88
91
94
87
10 2
96
98
1 1
Method B
95
102
93
105
96
99
100
103
91
97
106
[Worksheet: three panels, each with columns for Rank, Ordered Array and Assembly Method, and Ranked Scores for Method A and Method B]
XI. A third assembly method has recently been proposed for the 30 milligram parts examined in problem 10. Use a 1% level Kruskal-Wallis test to determine whether these samples come from populations with equal medians.

Time to Assemble 30 Milligram Parts in Seconds
Method C:   86   99   84   85   92   93   82   81   96   83   94
Assembly Time for 30 Milligram Parts

Method A          Method B          Method C
Time   Rank       Time   Rank       Time   Rank
XII. Oven temperature at Chewy Pizza restaurants was in control when these samples were taken. Construct an x̄ chart and an R chart for this data using a 99.74% confidence interval.

Oven Readings
Sample            1     2     3     4     5     6
                 405   402   398   410   39    4
                 404   404   390   402   409   409
                 397   412   388   412   400   407
Totals
Sample Mean
Sample Range

Control Factors for 99.74%
Sample Size (n)
      2          1.880   0   3.267
      3          1.023   0   2.575
      4          0.729   0   2.282
      5          0.577   0   2.115

XIII. Potential customers were asked to rate brand A and brand B. Little is known about population distributions. Test at the .10 level of significance whether these brands were viewed equally by these potential customers. A paired difference sign test may be conducted even though this is not a test for statistical dependency.

Brand Preference Test
Cust om er
Brand A
Br Bran and dB
1 2
87
89
9
97
3
8
85
4
73
8
5
92
98
89
8
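The control limits asked for in problem XII follow a fixed recipe once the sample means and ranges are known. The sketch below illustrates that recipe with hypothetical oven readings (four samples of three readings each, not the Chewy Pizza data). It assumes the three columns of the control factor table are the conventional A2, D3 and D4 factors, which the table itself does not label, and it uses the row for samples of size 3.

```python
from statistics import mean

# Control factors for samples of size n = 3, copied from the table in problem XII.
# Assumption: the columns are the conventional A2, D3 and D4 factors.
A2, D3, D4 = 1.023, 0.0, 2.575

# Hypothetical oven readings in degrees (illustration only).
samples = [
    [400, 404, 396],
    [402, 398, 406],
    [399, 401, 403],
    [405, 397, 401],
]

sample_means = [mean(s) for s in samples]
sample_ranges = [max(s) - min(s) for s in samples]

grand_mean = mean(sample_means)   # center line of the x-bar chart
mean_range = mean(sample_ranges)  # center line of the R chart

# (lower control limit, upper control limit) for each chart.
xbar_limits = (grand_mean - A2 * mean_range, grand_mean + A2 * mean_range)
r_limits = (D3 * mean_range, D4 * mean_range)

print(grand_mean, mean_range)
print(xbar_limits, r_limits)
```

A sample mean outside the x̄ limits, or a sample range outside the R limits, would suggest the process is no longer in control.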
Chapter
Correlation Analysis

I. Correlation analysis measures the strength of the arithmetic relationship between two variables.
II. Correlation may be visually represented with a scatter diagram.
   A. Linda Smith is interested in analyzing the relationship between monthly advertising expenditures and monthly sales revenue. Data on these variables was first presented in chapter 7.

      Advertising expenditures (000)    5    2    7    6   10    4    6    5    3    8
      Sales revenue (000)              50   25   80   50   90   30   60   60   40   80

   B. She began by making a scatter diagram of the data.
      1. Sales is the dependent variable because sales revenue, to some degree, is dependent upon advertising expenditures. This dependency was verified on page 121. The dependent variable is graphed on the y-axis.
      2. The independent variable, advertising expenditures, is graphed on the x-axis (abscissa).
      3. In chapter 24, we will learn to draw a regression line through the middle of a scatter diagram.
[Figure: Scatter Diagram of Advertising and Sales. x-axis: Advertising (in thousands of dollars); y-axis: Sales (in thousands of dollars).]

III. The sample coefficient of correlation (r)
A. The coefficient of correlation (r) measures the strength of the relationship between 2 variables. It takes values between ±1 inclusive.
\[ -1 \le r \le +1 \]
B. The closer r is to either extreme, the higher (stronger) is the relationship (correlation).
1. An r of about .8 or so is high positive correlation.
2. An r of about .2 to -.2 is low correlation.
3. An r of about -.8 or so is high negative correlation.

[Figure: example scatter diagrams. Panels: Perfect Positive Correlation (r = 1); Strong Positive (r = .8); Weak Positive (r = .2); Zero Correlation (r = 0); Weak Negative (r = -.2); Strong Negative (r = -.8); Perfect Negative Correlation (r = -1).]
146
C. Calculating r for Linda's data, where x is advertising expenditures (000) and y is sales revenue (000)

   x      y     x²      xy       y²
   5     50     25     250    2,500
   2     25      4      50      625
   7     80     49     560    6,400
   6     50     36     300    2,500
  10     90    100     900    8,100
   4     30     16     120      900
   6     60     36     360    3,600
   5     60     25     300    3,600
   3     40      9     120    1,600
   8     80     64     640    6,400
  56    565    364   3,600   36,225   (totals)

\[ r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{[\,n\sum x^2 - (\sum x)^2\,][\,n\sum y^2 - (\sum y)^2\,]}} \]
\[ r = \frac{10(3{,}600) - (56)(565)}{\sqrt{[\,10(364) - (56)^2\,][\,10(36{,}225) - (565)^2\,]}} = \frac{36{,}000 - 31{,}640}{\sqrt{[\,3{,}640 - 3{,}136\,][\,362{,}250 - 319{,}225\,]}} \]
\[ r = \frac{4{,}360}{\sqrt{[504][43{,}025]}} = .936 \]
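The arithmetic above can be checked with a short script. This is a minimal sketch, not part of the original text: the function name `pearson_r` and the variable names are illustrative assumptions, and only the Python standard library is used.

```python
import math

# Monthly data from the table above, in thousands of dollars.
advertising = [5, 2, 7, 6, 10, 4, 6, 5, 3, 8]      # x
sales = [50, 25, 80, 50, 90, 30, 60, 60, 40, 80]   # y


def pearson_r(x, y):
    """Sample coefficient of correlation using the raw-sums formula."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator


r = pearson_r(advertising, sales)
print(round(r, 3))  # 0.936
```

The intermediate sums (56, 565, 3,600, 364, 36,225) are the column totals of the table, so each can be compared against the hand calculation.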
IV. Coefficient of determination (r²)
A. The coefficient of determination measures the total variation of the dependent variable (sales revenue) accounted for by variation of the independent variable (advertising expenditures).
\[ r^2 = (.936)^2 = .876 \]
B. Approximately 88% of the variability in Linda's Video Showcase sales revenue is accounted for by advertising expenditure variability.
V. Coefficient of nondetermination (1 - r²)
A. The coefficient of nondetermination measures the total variation of the dependent variable (sales revenue) not accounted for by variation of the independent variable (advertising expenditures).
\[ 1 - r^2 = 1 - .876 = .124 \]
B. Approximately 12% of the variability in Linda's Video Showcase sales revenue is not accounted for by advertising expenditure variability.
Note: Advertising is not the only variable affecting sales. Multiple correlation and regression, not covered by Quick Notes, analyze the relationship between more than one independent variable and a dependent variable.
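The two coefficients are complements of each other, which a few lines of code make explicit. This is an illustrative sketch; the variable names are assumptions, and r = .936 is taken from section III.

```python
r = 0.936  # sample coefficient of correlation from section III

r_squared = r ** 2                 # coefficient of determination
nondetermination = 1 - r_squared   # coefficient of nondetermination

print(round(r_squared, 3))         # 0.876
print(round(nondetermination, 3))  # 0.124
```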
A note of caution. We have proven a high mathematical (linear) relationship between these 2 variables. We have not proven a cause-effect relationship.
VI. Measuring the significance of the coefficient of correlation
A. To be significant, the population coefficient of correlation (ρ, the Greek letter rho) cannot be zero.
B. It must be determined whether r is large enough, given some level of significance, to indicate ρ is not zero.
C. The 5-step approach to hypothesis testing
1. The null hypothesis and alternate hypothesis are \( H_0: \rho = 0 \) and \( H_1: \rho \ne 0 \).
2. The level of significance will be .05 for this two-tail problem with n - 2 degrees of freedom. Two is subtracted because two variables, x and y, are being estimated.
\[ df = n - 2 = 10 - 2 = 8 \qquad t = \pm 2.306 \]
3. The relevant statistic is r.
\[ t = \frac{r - \rho}{\sqrt{\dfrac{1 - r^2}{n - 2}}} \]
Note: A large r leads to a large t and a large t leads to rejecting the null hypothesis. ρ is 0 because the H0 is assumed to be true.
4. If t from the test statistic is beyond the critical value of t, the null hypothesis will be rejected.
5. Apply the decision rule.
\[ t = \frac{r - \rho}{\sqrt{\dfrac{1 - r^2}{n - 2}}} = \frac{.936 - 0}{\sqrt{\dfrac{1 - (.936)^2}{10 - 2}}} = 7.52 \]
Reject H0 because 7.52 > 2.306. This sample is not from a population with a coefficient of correlation equal to zero.
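The test statistic and decision rule can be sketched in code as follows. The function name `correlation_t` is an assumption made for illustration, and the critical value 2.306 is simply copied from the text (two-tail, .05 level, 8 degrees of freedom) rather than computed.

```python
import math


def correlation_t(r, n, rho=0.0):
    """t statistic for testing H0: rho = 0, with n - 2 degrees of freedom."""
    return (r - rho) / math.sqrt((1 - r ** 2) / (n - 2))


t_statistic = correlation_t(0.936, 10)
critical_t = 2.306  # table value quoted in the text for df = 8, two-tail .05

reject_h0 = abs(t_statistic) > critical_t
print(round(t_statistic, 2), reject_h0)  # 7.52 True
```

Comparing the absolute value of t against the critical value mirrors the two-tail decision rule in step 4.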
147
Practice Set 23
Correlation Analysis
Darin Jones wants to know whether age of sales personnel affects sales performance. Answer the following questions using this data.

Age    Sales Commissions (000)
23     30
25     25
34     20
29     24
21     35
32     22
23     34
24     33
27     27
22     30

A. Draw a scatter diagram.
148
B. Calculate the coefficient of correlation to [...] decimal places. Interpret your answer.
C. What is the coefficient of determination? Interpret your answer.
D. What is the coefficient of nondetermination? Interpret your answer.
E. Is the relationship between age of sales personnel and their sales commissions significant at the .01 level?
149
Quick Questions 23
Correlation Analysis
I. Place the number of the appropriate formula, expression, or term next to the appropriate concept.
A. Coefficient of determination ___
B. Coefficient of correlation ___
C. A range for r ___
D. Coefficient of nondetermination ___
E. The test statistic t used to measure the significance of r ___

1. \( 1 - r^2 \), the variability in y that is not explained by x
2. \( \dfrac{n\sum xy - (\sum x)(\sum y)}{\sqrt{[\,n\sum x^2 - (\sum x)^2\,][\,n\sum y^2 - (\sum y)^2\,]}} \)
3. \( r^2 \), the variability in y that is explained by x
4. \( \dfrac{r - \rho}{\sqrt{\dfrac{1 - r^2}{n - 2}}} \)
5. \( -1 \le r \le +1 \)

II. Draw the following scatters and place the appropriate value for r in the space provided.
Perfect Positive Correlation    r = ___
Perfect Negative Correlation    r = ___
Zero Correlation                r = ___

III. Draw a scatter diagram showing how hours studying per weekend affect grade point average.

Hours Studying per Weekend    Grade Point Average
3                             3.0
2                             2.0
6                             3.8
3                             2.6
4                             3.2
8                             3.7
2                             2.1
3                             2.8

[Blank grid for the scatter diagram. x-axis: hours studying per weekend, 1 to 8; y-axis: grade point average, 1 to 4.]
150
IV. Using the data in question III, calculate the following:
A. Coefficient of correlation to 3 decimal places
B. Coefficient of determination
C. Coefficient of nondetermination
D. Interpret your answer to question IV B.
V. Could ρ (rho) be zero at the .01 level of significance?
151
Chapter 24
Simple Linear Regression Analysis
I. Simple regression analysis defines the mathematical relationship between 2 variables.
A. A scatter diagram depicts the relationship between the independent variable (advertising) on the x-axis and a dependent variable (sales) on the y-axis (see graph).
B. A line through the scatter plot can be used to mathematically define this relationship.
1. The line can be estimated using the eyeball method by drawing a line with a ruler that divides the data in half.
2. A regression equation may be used to more exactly define the relationship between two variables.

[Figure: Scatter Diagram of Advertising and Sales, Eyeball Method. x-axis: Advertising (in thousands of dollars); y-axis: Sales (in thousands of dollars).]

II. Determining a regression equation using the method of least squares
A. Many different lines can be drawn through a scatter plot using a ruler.
B. The method of least squares gives more consistent results.
C. This technique results in a straight line that minimizes the sum of the squared vertical deviations between the resulting line and the individual data. These deviations may be thought of as error.
D. This is the general form of the regression equation.
\[ \hat{y}_{\cdot x} = a + bx \]
where
\( \hat{y}_{\cdot x} \) is the estimated value of y based upon a given value for x. The period next to y is read "given" and this expression is read "y estimated given x."
a is the y-intercept (where the line crosses the y-axis).
b is the slope of the line. It equals \( \Delta y \div \Delta x \).
E. Determining the regression equation to 3 significant digits.

Linda's Video Showcase
Advertising Expenditures (x) and Sales Revenue (y), both in thousands of dollars

   x      y     x²      xy       y²
   5     50     25     250    2,500
   2     25      4      50      625
   7     80     49     560    6,400
   6     50     36     300    2,500
  10     90    100     900    8,100
   4     30     16     120      900
   6     60     36     360    3,600
   5     60     25     300    3,600
   3     40      9     120    1,600
   8     80     64     640    6,400
  56    565    364   3,600   36,225   (totals)

\[ b = \frac{n\sum xy - (\sum x)(\sum y)}{n\sum x^2 - (\sum x)^2} = \frac{10(3{,}600) - (56)(565)}{10(364) - (56)^2} = \frac{4{,}360}{504} = 8.6507936 \]
\[ a = \bar{y} - b\bar{x} = \frac{\sum y}{n} - b\,\frac{\sum x}{n} = \frac{565}{10} - 8.6507936\left(\frac{56}{10}\right) = 8.055556 \]
\[ \hat{y}_{\cdot x} = a + bx = 8.06 + 8.65x \]

The example below uses the regression equation to calculate estimated monthly sales when advertising expenditures are $9,000.
\[ \hat{y}_{\cdot x} = 8.06 + 8.65(9) = 8.06 + 77.85 = 85.91 \text{, or } \$85{,}910 \]
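The least squares calculation can be sketched in code as follows. This is an illustrative sketch under stated assumptions: the function name `least_squares` and the variable names are not from the text, and the data are Linda's ten monthly observations in thousands of dollars.

```python
advertising = [5, 2, 7, 6, 10, 4, 6, 5, 3, 8]      # x, in thousands of dollars
sales = [50, 25, 80, 50, 90, 30, 60, 60, 40, 80]   # y, in thousands of dollars


def least_squares(x, y):
    """Return (a, b): the y-intercept and slope of the least squares line."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi * xi for xi in x)
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    a = sum_y / n - b * sum_x / n
    return a, b


a, b = least_squares(advertising, sales)
print(round(a, 2), round(b, 2))  # 8.06 8.65

# Estimated monthly sales (in thousands) when advertising is 9 (thousand).
estimate = a + b * 9
print(round(estimate, 2))  # 85.91
```

The script keeps full precision for a and b and rounds only when printing, so the estimate agrees with the hand calculation to two decimal places.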
152
III. Drawing a regression line
A. Two points (x, y) may be used to draw a straight line.
B. The y-intercept (0, 8,060) will be one point.
C. The estimated value of y for x of 9,000 is 85,910. It will be the second point (see page 152).

[Figure: Scatter Diagram of Advertising and Sales with the regression line drawn from a = 8.06 through the point given by 8.06 + 8.65(9) = 85.91. x-axis: Advertising (in thousands of dollars); y-axis: Sales (in thousands of dollars), 0 to 100.]

IV. The standard error of the estimate
A. The standard error of the estimate measures the dispersion of the scatter (plots) around the regression line.
B. It is the standard deviation of y given some value of x.
C.
\[ s_{y \cdot x} = \sqrt{\frac{\sum (y - \hat{y})^2}{n - 2}} = \sqrt{\frac{\sum y^2 - a\sum y - b\sum xy}{n - 2}} \]
\[ s_{y \cdot x} = \sqrt{\frac{36{,}225 - 8.055556(565) - 8.6507936(3{,}600)}{10 - 2}} = 8.145 \]
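As a check on the standard error, the shortcut formula can be evaluated directly from the column totals. This is a minimal sketch; the function and variable names are illustrative assumptions, and the values of a and b are the unrounded figures quoted in the text.

```python
import math


def standard_error_of_estimate(n, sum_y, sum_xy, sum_y2, a, b):
    """Shortcut formula: sqrt((sum y^2 - a*sum y - b*sum xy) / (n - 2))."""
    return math.sqrt((sum_y2 - a * sum_y - b * sum_xy) / (n - 2))


s_yx = standard_error_of_estimate(
    n=10,
    sum_y=565,
    sum_xy=3600,
    sum_y2=36225,
    a=8.055556,    # y-intercept from the text
    b=8.6507936,   # slope from the text
)
print(round(s_yx, 3))  # 8.145
```

Using a and b rounded to 3 significant digits here would change the result noticeably, because the numerator is a small difference between large numbers.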
V. An interval estimate for the conditional mean of y for some given value of x
A. A confidence interval will be determined using the small sample t distribution.
B.
\[ \hat{y}_{\cdot x} \pm t\,s_{y \cdot x}\sqrt{\frac{1}{n} + \frac{(x - \bar{x})^2}{\sum x^2 - \dfrac{(\sum x)^2}{n}}} \]
Note: The correction factor following the standard error of the estimate is needed because the sample is small and the scatter of sales data might not be normal.
C. Linda Smith wants to determine the 95% confidence interval for expected sales for months when advertising expenditures are $9,000.

Basic Assumptions Concerning Linear Regression Analysis
1. There are a number of y values for each value of x.
2. The conditional distributions of y given x are normal.
3. The variances of the conditional distributions are equal.
4. Predictions of y are limited to the existing range for x.
Note: Predicting an individual value (next month's sales) rather than the mean of y (sales) requires inserting a 1 under the radical.

Problem Notes
\( \hat{y}_{\cdot x} = 8.06 + 8.65x = 85.91 \), or $85,910, when x = 9. See page 152.
Degrees of freedom for t will be n - 2 because both a and b were estimated in determining \( \hat{y}_{\cdot x} \).
\( df = n - 2 = 10 - 2 = 8 \) and \( \alpha/2 = .05/2 = .025 \), so t = 2.306.
\( \bar{x} = \dfrac{\sum x}{n} = \dfrac{56}{10} = 5.6 \)
\( n = 10 \)
\( s_{y \cdot x} = 8.145 \)

D.
\[ \hat{y}_{\cdot x} \pm t\,s_{y \cdot x}\sqrt{\frac{1}{n} + \frac{(x - \bar{x})^2}{\sum x^2 - \dfrac{(\sum x)^2}{n}}} \]
\[ 85.91 \pm 2.306(8.145)\sqrt{\frac{1}{10} + \frac{(9 - 5.6)^2}{364 - \dfrac{(56)^2}{10}}} \]
\[ 85.91 \pm 10.779 \]
\[ 75.131 \leftrightarrow 96.689 \]

For regression analysis to be valid, the range for variables a and b must consist of realistic values. Here, the y-intercept cannot be negative because negative sales are not possible. But determining the 95% confidence interval for the y-intercept (0, 8.06) by recalculating acceptable error (E) results in a negative lower limit (8.06 - 15.96 = -7.90). This concern might be solved by lowering the standard error of the estimate with a larger sample. In addition, procedures exist for determining the confidence interval for the slope. The possibility of a negative slope would cause people to question the relationship between advertising and sales. A larger sample might also solve the problem of a negative slope.
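The interval calculation can be sketched as a small function. This is an illustration under stated assumptions rather than a definitive implementation: the function name `interval_estimate` and its `individual` flag are hypothetical, and the inputs (85.91, 2.306, 8.145) are the values quoted in the text rather than recomputed.

```python
import math

advertising = [5, 2, 7, 6, 10, 4, 6, 5, 3, 8]  # x, in thousands of dollars


def interval_estimate(x_values, x0, y_hat, t, s_yx, individual=False):
    """Interval around y_hat at x0.

    With individual=False this is the interval for the conditional mean of y.
    With individual=True a 1 is added under the radical, as the text's note
    describes for predicting an individual value.
    """
    n = len(x_values)
    x_bar = sum(x_values) / n
    # Sum of x^2 minus (sum of x)^2 / n, the denominator under the radical.
    sxx = sum(v * v for v in x_values) - sum(x_values) ** 2 / n
    under_radical = 1 / n + (x0 - x_bar) ** 2 / sxx
    if individual:
        under_radical += 1
    margin = t * s_yx * math.sqrt(under_radical)
    return y_hat - margin, y_hat + margin


low, high = interval_estimate(advertising, x0=9, y_hat=85.91, t=2.306, s_yx=8.145)
print(round(low, 3), round(high, 3))  # 75.131 96.689

# The individual-value interval is wider than the conditional-mean interval.
ind_low, ind_high = interval_estimate(
    advertising, x0=9, y_hat=85.91, t=2.306, s_yx=8.145, individual=True
)
print(ind_high - ind_low > high - low)  # True
```

Calling the function with x0=0 and y_hat=8.06 gives a margin of about 15.96, which is the negative lower limit for the y-intercept discussed in the closing paragraph above.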
153
Practice Set 24
Simple Linear Regression Analysis
Having determined that age affects sales performance, Darin Jones wants to estimate sales commissions using the data presented in the chapter 23 practice set.
A. Determine the regression equation to 3 significant digits.

Age (x)    Sales Commissions (y) (000)      xy       x²       y²
  23          30                           690      529      900
  25          25                           625      625      625
  34          20                           680    1,156      400
  29          24                           696      841      576
  21          35                           735      441    1,225
  32          22                           704    1,024      484
  23          34                           782      529    1,156
  24          33                           792      576    1,089
  27          27                           729      729      729
  22          30                           660      484      900
 260         280                         7,093    6,934    8,084   (totals)

B. Estimate sales commissions for a group of 24-year-old salespeople.
154
C. Graph the regression line.
D. Determine the 99% confidence interval for the question B group.
E. What procedure should be followed if question D's range includes negative numbers?
Quick Questions 24
Simple Linear Regression Analysis
I. Place the number of the appropriate formula, symbol, or expression next to the concept it describes.
A. The standard error of the estimate ___
B. The y-intercept ___
C. The regression equation ___
D. The estimated value of y given x ___
E. The slope ___
F. An interval estimate for the conditional mean of y ___
G. An interval estimate for an individual value of y ___

1. \( a \)
2. \( \hat{y}_{\cdot x} \pm t\,s_{y \cdot x}\sqrt{\dfrac{1}{n} + \dfrac{(x - \bar{x})^2}{\sum x^2 - \dfrac{(\sum x)^2}{n}}} \)
3. \( \hat{y}_{\cdot x} = a + bx \)
4. \( \sqrt{\dfrac{\sum y^2 - a\sum y - b\sum xy}{n - 2}} \)
5. \( \dfrac{n\sum xy - (\sum x)(\sum y)}{n\sum x^2 - (\sum x)^2} \)
6. \( \hat{y}_{\cdot x} \)
7. \( \hat{y}_{\cdot x} \pm t\,s_{y \cdot x}\sqrt{1 + \dfrac{1}{n} + \dfrac{(x - \bar{x})^2}{\sum x^2 - \dfrac{(\sum x)^2}{n}}} \)

II. The following data was first presented in chapter 23. Estimate the regression line for this scatter using the eyeball method.

[Figure: Scatter Diagram of Hours Studying and Grade Point Average. x-axis: Hours Studying per Weekend, 0 to 10; y-axis: Grade Point Average, 0 to 5.]
III. Calculate the regression equation. Round the slope and y-intercept to three significant digits.

Hours Studying (x)    Grade Point Average (y)      xy      x²       y²
   3                     3.0                       9.0       9     9.00
   2                     2.0                       4.0       4     4.00
   6                     3.8                      22.8      36    14.44
   3                     2.6                       7.8       9     6.76
   4                     3.2                      12.8      16    10.24
   8                     3.7                      29.6      64    13.69
   2                     2.1                       4.2       4     4.41
   3                     2.8                       8.4       9     7.84
  31                    23.2                      98.6     151    70.38   (totals)
156
IV. Estimate the grade point average for people who studied 5 hours per weekend.
V. Draw the regression line on the page 156 scatter diagram.
VI. Calculate the 98% confidence interval for students who study 5 hours per weekend.
VII. What procedure should be followed if the range for your answer to question VI includes negative numbers?
157
Correlation and Regression
Formula Review
I. Correlation formulas
A. Coefficient of correlation
\[ r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{[\,n\sum x^2 - (\sum x)^2\,][\,n\sum y^2 - (\sum y)^2\,]}} \]
B. Coefficient of determination
\[ r^2 \]
C. Coefficient of nondetermination
\[ 1 - r^2 \]
D. The value of t when determining the significance of the coefficient of correlation (r)
\[ t = \frac{r - \rho}{\sqrt{\dfrac{1 - r^2}{n - 2}}} \]
II. Regression formulas
A. The regression equation
\[ \hat{y}_{\cdot x} = a + bx \]
B. The slope of the regression equation
\[ b = \frac{n\sum xy - (\sum x)(\sum y)}{n\sum x^2 - (\sum x)^2} \]
C. The y-intercept of the regression equation
\[ a = \bar{y} - b\bar{x} = \frac{\sum y}{n} - b\,\frac{\sum x}{n} \]
D. The standard error of the estimate
\[ s_{y \cdot x} = \sqrt{\frac{\sum y^2 - a\sum y - b\sum xy}{n - 2}} \]
E. An interval estimate for the conditional mean of y for some given value for x
\[ \hat{y}_{\cdot x} \pm t\,s_{y \cdot x}\sqrt{\frac{1}{n} + \frac{(x - \bar{x})^2}{\sum x^2 - \dfrac{(\sum x)^2}{n}}} \quad \text{and } df = n - 2 \]
Note: An interval estimate for an individual value of y (sales for a recently hired 24-year-old salesperson or grades for your roommate who studied 5 hours) would require adding a 1 under the radical. This makes the interval substantially larger.
158
Correlation and Regression Test
I. Place the number of the appropriate formula, expression, or term next to the appropriate concept.
A. The independent variable ___
B. The dependent variable ___
C. Measures the strength in the relationship between two variables ___
D. The variation of the dependent variable explained by the independent variable ___
E. The variation of the dependent variable not explained by the independent variable ___
F. Used when testing the significance of r ___
G. The regression equation ___
H. The slope of the regression line ___
I. Where a regression line crosses the y-axis ___
J. The standard error of the estimate ___

1. \( r \)
2. \( b \)
3. \( x \)
4. \( 1 - r^2 \)
5. \( t = \dfrac{r - \rho}{\sqrt{\dfrac{1 - r^2}{n - 2}}} \)
6. \( \sqrt{\dfrac{\sum y^2 - a\sum y - b\sum xy}{n - 2}} \)
7. \( y \)
8. \( a \)
9. \( r^2 \)
10. \( \hat{y}_{\cdot x} = a + bx \)

II. Draw the following scatters and place an appropriate value for r in the space provided.
High Positive Correlation       r = ___
Zero Correlation                r = ___
Low Negative Correlation        r = ___
Perfect Positive Correlation    r = ___
159
III. Answer the following questions using this data that was gathered to determine whether research and development expenditures affect profit.
A. The coefficient of correlation
R&D Expenditures (in millions)    Profits (in millions)
5    3
4 7
6
6
6 8
4
B. The coefficient of determination and the coefficient of nondetermination
C. Could rho be zero at the .05 level of significance?
IV. Interpret your answers to question III.
V. Draw a scatter diagram of the above data and use the eyeball method to estimate the regression curve.
160
VI. Answer the following questions using the data on the preceding page.
A. Use the method of least squares to determine a regression equation.
B. Calculate the estimated profit for next year when R&D will be $8,000,000.
C. Draw the regression line on the page 160 scatter diagram.
D. Calculate the 99% confidence interval for question B.
E. What procedure should be followed if the range for the answer to question D includes zero or a negative number?
161
Chapter Taxonomy of Statistics

[Figure: tree diagram classifying Statistics into Descriptive Statistics, Probability, and Inferential Statistics. Labels in the diagram:
Descriptive Statistics: data may be grouped into a frequency distribution; Charts (Pie Charts, Bar Charts, Line Charts); Measures of Central Tendency (Mean, Median, Mode); Measures of Dispersion (Range, Variance, Standard Deviation); Measures of Position (Quartiles, Deciles, Percentiles).
Probability: Discrete Probability Distributions (Binomial Distribution, Poisson Probability Distribution); Continuous Probability Distributions (Normal Probability Distribution); calculating a probability given a range; calculating a range given a probability.
Inferential Statistics: Parameters and Statistics (Mean and Proportion, Standard Deviation); Parametric Statistics using interval or ratio data; Nonparametric (distribution free) Statistics using nominal or ordinal data.
Parametric Statistics: tests using the z or t distributions (determining point and interval estimates of means and proportions; hypothesis testing of 1 or 2 sample means or proportions using the critical values of z or t or using a p-value); tests using the F distribution (hypothesis testing of 3 or more sample means using ANOVA; testing sample variances from normal populations); Relational Statistics (Correlation and Simple Regression); quality control charts.
Nonparametric Statistics: Goodness of Fit Test; test of Statistical Dependency; Median Tests (2 or 3 sample medians using sign, Mann-Whitney, and Kruskal-Wallis); run test for randomness.]
162
Chapter Taxonomy of Parametric Statistics

[Figure: tree diagram of parametric tests. Labels in the diagram:
Large or Small Samples Using the z or t Distributions:
Means: One Mean (Is it different in a direction? One tail test. Is it different in either direction? Two tail test.); Two Means, Independent Populations (Is one larger or smaller? One tail test. Are they equal? Two tail test.); Two Means, Dependent Populations (Is one larger or smaller? One tail test. Are they equal? Two tail test.).
Proportions: One Proportion (Is it different in a direction? One tail test. Is it different in either direction? Two tail test.); Two Proportions (Is one larger? One tail test. Are they equal? Two tail test.).
Correlation and Regression: Coefficients of Correlation, Determination, and Nondetermination; significance test for the coefficient of correlation; point and interval estimates using a regression equation.
Using the F Distribution:
Variances of Two Normal Populations (Is one larger or smaller? One tail test. Are they equal? Two tail test.); One Factor Analysis of Means (Are treatment means equal? Comparing 2 treatment means.); Two Factor Analysis of Means (Are treatment means equal? Are blocking means equal?).]
63
Chapter Cha pter 7 Problem Review Lin Linda s Vide Video o Show Showca case se escriptive Statistics
(Chapters) 2
Summari Summ arizing zing Dat Data a
The The nu numb mber er of vi vide deo o re rent ntal als s was was summ summar ariz ized ed with with an array, array, ra rang nge, e, freque frequency ncy distri dis tribut bution ion,, relati relative ve freq frequenc uency y dis distri tribut bution ion,, his histog togram ram,, frequ frequency ency polyg polygon on,, rela relati tive ve fre freque quency ncy poly polygo gon, n, mo more re-t -tha han n ogiv ogive, e, and lessless-th than an ogiv ogive. e.
3
Measuring Central Tendency of Ungrouped Data
Last week's self-help tape rentals were used to estimate last year's self-help tape rentals. Population and sample measures included the mean, weighted mean, median, and mode. Measures of position calculated by Linda included quartiles, deciles, percentiles, and the interquartile range.
4
Measuring Dispersion of Ungrouped Data
The dispersion of self-help tape rentals was analyzed with a range, average deviation, variance, and standard deviation. The usefulness of the standard deviation was explained with a normal curve using the empirical rule, Chebyshev's rule, and the coefficient of variation.
5
Measuring Central Tendency of Grouped Data
Measures calculated above were recalculated for grouped data using the video tape rentals summarized in chapter 2. The skewness of nonsymmetrical data was defined, graphed, and measured with Pearson's coefficient of skewness.
6
Measuring Dispersion of Grouped Data
Measures calculated above were recalculated for grouped data. The first and third quartiles, the interquartile range, and percentiles were calculated. Kurtosis was used to describe the shape of a frequency polygon.
Probability: The Basis for Inferential Statistics
7
Understanding Probability
The general and special rules for addition were used to study the relationship between advertising expenditures and sales revenue.
8
Probability Part II Multiplication Rules
The general and special rules for multiplication were used to study the relationship between advertising expenditures and sales revenue. The counting and factorial rules were used to determine Linda's options when visiting competitors. Permutations and combinations were used to determine Linda's options when displaying advertising posters.
9
Discrete Probability Distributions
The expected value of tape rentals was determined with a probability distribution. Flipping a coin was used to explain the binomial probability distribution. The average number of repair calls per 15-minute period was analyzed using a Poisson probability distribution. Average customer returns were analyzed using a Poisson approximation to the binomial probability distribution.
10
Continuous Normal Probability Distributions
Linda determined the probability of a store's population mean sales being within a given range. She also determined a range for a store's population mean sales given a probability.
11
The Sampling Distribution of the Means
The 99% confidence interval for population mean customer purchases was determined using a sample mean of 7.50 from a sample size of 49 customers.
12
Sampling Distributions Part II
The 95% confidence interval for the population proportion of customers happy with service was determined using a sample proportion of .80 from a sample of 100 customers. An appropriate sample size was determined given an acceptable range for the population mean. An appropriate sample size was also determined given an acceptable range for the population proportion.
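The chapter 12 interval can be checked with a short sketch. It assumes the usual large-sample (normal approximation) interval for a proportion, which the summary does not spell out; the function name `proportion_confidence_interval` is hypothetical and only the sample proportion (.80), sample size (100), and confidence level (95%) come from the text.

```python
from statistics import NormalDist


def proportion_confidence_interval(p_hat, n, confidence):
    """Large-sample confidence interval for a population proportion.

    Assumption: normal approximation, p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n).
    """
    # Two-tail z value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    standard_error = (p_hat * (1 - p_hat) / n) ** 0.5
    margin = z * standard_error
    return p_hat - margin, p_hat + margin


# Values from the chapter 12 summary: sample proportion .80, n = 100, 95% level.
low, high = proportion_confidence_interval(0.80, 100, 0.95)
print(f"{low:.3f} to {high:.3f}")
```

Under these assumptions the interval runs from roughly .72 to .88; a table z value of 1.96 gives the same result to three decimals.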
Inferential Statistics
13
Large Sample Hypothesis Testing
The sample mean purchase of 7.50 was causing some concern. Linda used hypothesis testing to determine that the population mean customer purchase had decreased from last year's 7.75. A .01 level of significance was used. Having proved the mean purchase had decreased, she then used a two-tail test to prove the mean did not change in either direction.
14
Large Sample Hypothesis Testing Part II
Linda then used sample data and hypothesis testing to determine whether average sales of two of her stores were the same. A .01 level of significance was used. The two-tail problem, concerning a change in mean customer purchases described in chapter 13, was redone using a p-value test. A population mean of 7.40 was found to have a type II error of 18.41%.
15
Hypothesis Testing of the Population Proportions
Linda used hypothesis testing to prove that a sample proportion measuring customer satisfaction with service of .80 was not low enough to conclude that the population proportion was below the .85 required to qualify for a Flopbuster Video franchise. She also determined that service at two stores was the same with a two-tail test. She used a one-tail test to determine that one store did not have better service than another store.
16
Small Sample Hypothesis Testing Using Student's t Test
Linda used a mean of 2.3 tape rentals from a sample of 9 customers to determine that the average number of tapes rented per customer had decreased from last year's population mean of 2.6 tapes. She also found the average customer waiting time at two of her stores (two independent populations) was the same. Linda used a paired difference test of normally distributed dependent populations to conclude a promotional campaign had increased weekly sales at three of her stores.
17
Statistical Quality Control
Linda didn't use statistical quality control to manage her retailing business. The chapter explained mean, range, and proportion of defects charts for parts that were designed to be 50 millimeters long.
18 and 19
Analysis of Variance Parts I and II
Linda found two of her stores had equal sales variance. She also used ANOVA to prove that average sales of three salespeople were not equal. Weeks of experience, the blocking variable, also indicated unequal sales. The treatment variable salespeople explained half (14 of 28) the total sales variability. Variability of 11.3 was explained by the blocking variable experience. Variability of 2.7 was unexplained. Salespersons one and three had different average weekly sales.
20
Nonparametric Hypothesis Testing of Nominal Data
Linda used a goodness of fit test to determine that sales of a new hit music video were not equally distributed among her four stores. Two categorical variables (advertising expenditures and sales revenue) were found to be statistically dependent using a contingency table.
21
Nonparametric Hypothesis Testing of Ordinal Data Part I
A run test proved the gender of customers entering a store was not a random event. The parametric test indicating an average customer purchase had decreased from 7.75 was done again with a non-parametric sign test because Linda was not sure of the distribution's shape. A very small sample reversed the earlier result. Linda also found two instructional methods had equal test results at the .05 level of significance using a Mann-Whitney test of two independent population medians.
22
Nonparametric Hypothesis Testing of Ordinal Data Part II
In chapter 16, Linda found advertising expenditures and sales were dependent using a paired difference test that required populations be normally distributed. This assumption was dropped and the study was redone using a paired sign test of two dependent population medians. The earlier results were reversed. The sample size was only five and the study should be redone with more stores. The chapter 18 ANOVA test of three people's mean sales assumed the populations had equal variances. This assumption was dropped and a Kruskal-Wallis test of 3 medians had a similar result.
Correlation and Regression
23
Correlation
A correlation coefficient of .936 was calculated for advertising expenditures and sales revenue. The coefficient of determination explained 87.6% of the variability between advertising expenditures and sales revenue. The coefficient of nondetermination showed 12.4% unexplained variability. A range for r proved rho, the population coefficient of correlation, could not be zero at the .05 level of significance.
24
Simple Linear Regression Analysis
The regression equation was Ŷx = 8.06 + 8.65x (in thousands of dollars). It was used to estimate sales revenue of 85,910 when 9,000 is spent on advertising. A range of 75,131 to 96,689 was calculated for the 85,910 sales estimate. A check of the y-intercept range resulted in a negative number. Negative sales are not possible and a larger sample must be taken.
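The chapter 23 and 24 figures can be tied together with a small sketch. The correlation coefficient (.936), the intercept (8.06), and the 85,910 estimate at 9,000 of advertising come from the summaries; the slope of 8.65 is an assumption, inferred from those reported values because the printed coefficient is not fully legible, and the function name `estimate_sales` is hypothetical.

```python
# Chapter 23: coefficient of determination and nondetermination from r.
r = 0.936
r_squared = r ** 2            # share of variability explained
unexplained = 1 - r_squared   # share of variability left unexplained

print(f"explained:   {r_squared:.1%}")
print(f"unexplained: {unexplained:.1%}")


# Chapter 24: point estimate from the regression line, in thousands of dollars.
def estimate_sales(advertising_thousands, intercept=8.06, slope=8.65):
    """Return estimated sales revenue in thousands of dollars.

    Assumption: slope 8.65 is inferred from the reported estimate
    (85.91 at x = 9), not read directly from the text.
    """
    return intercept + slope * advertising_thousands


estimate = estimate_sales(9)  # 9 = 9,000 spent on advertising
print(f"estimated sales: {estimate * 1000:,.0f}")
```

The reported range of 75,131 to 96,689 is symmetric about this estimate (85,910 plus or minus 10,779), which is consistent with an interval built around the point estimate.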
Problem Review
Darin's Music Emporium and Future Horizons Corporation
Descriptive Statistics
(Chapters)
2. Summarizing Data
Walkman video recorder sales were summarized with an array, range, frequency distribution, relative frequency distribution, histogram, frequency polygon, and more-than ogive.
3. Measuring Central Tendency of Ungrouped Data
Measures of central tendency for Practice Set 2 data were calculated.
4. Measuring Dispersion of Ungrouped Data
Measures of dispersion for this data were calculated.
5. Measuring Central Tendency of Grouped Data
Grouped measures of central tendency for this data were calculated.
6. Measuring Dispersion of Grouped Data
Grouped measures of dispersion for this data were calculated.
Probability: The Basis for Inferential Statistics
7. Understanding Probability
The general and special rules for addition were used to study the relationship between customer age and customer buying habits (making a sale).
8. Probability Part II Multiplication Rules
The general and special rules for multiplication were used to study the relationship between customer age and customer buying habits. The factorial rules for permutations and combinations were used to determine Darin's options when displaying advertising posters.
9. Discrete Probability Distributions
The expected unit value of Walkman video recorder sales was determined using a probability distribution. The binomial probability distribution was used to analyze the probability of making a Walkman video recorder sale. The average number of customer complaints per 20-minute period was analyzed using a Poisson probability distribution. The average number of bounced checks was analyzed using the Poisson approximation to the binomial probability distribution.
10. Continuous Normal Probability Distributions
Darin determined the probability of population mean sales commissions being within a given range. He also determined a range for the population mean number of customer merchandise returns given a probability.
11. The Sampling Distribution of the Means
The 99% confidence interval for the population mean weight of computer parts was determined using a sample mean of 30.025 mg from a sample of 36 parts.
12. Sampling Distributions Part II
The 95% confidence interval for the population proportion of parts passing inspection was determined using a sample proportion of .90 from a sample of 50 parts. An appropriate sample size was determined given an acceptable range for the population mean weight of parts. An appropriate sample size was also determined given an acceptable range for the population proportion of parts passing inspection.
Inferential Statistics
13. Large Sample Hypothesis Testing
Darin used hypothesis testing and a sample mean of 30.025 mg to determine whether the population mean weight of parts was above the required limit of 30 milligrams. Using a two-tail test, he determined that parts were not different from 30 mg at the .01 level of significance. However, parts were different from 30 mg at the .05 level of significance.
14. Large Sample Hypothesis Testing Part II
Darin used sample data and hypothesis testing to determine that delivery time for 2 of his suppliers was the same at the .05 level of significance. A chapter 13 test, which proved parts were not too heavy, was redone using a p-value test. Type II error for the weight of material containers hypothesis was determined and graphed with an operating characteristic curve. A power curve showing the probability of not making a type II error was estimated.
15. Hypothesis Testing of the Population Proportions
Darin used hypothesis testing to prove that the .90 sample proportion of parts passing inspection was not high enough to conclude an increase from the .86 population proportion recorded last year. Parts produced by the day and evening shifts had the same proportion of defects.
16. Small Sample Hypothesis Testing Using Student's t Test
Darin found average sick days taken by non-high school graduates were different than those taken by high school graduates (2 independent populations). Darin also found employee efficiency increased because of a training program (2 dependent populations).
17. Statistical Quality Control
Darin constructed mean, range, and proportion of defects charts for the 30 milligram parts first introduced in chapter 11.
18 and 19. Analysis of Variance Parts I and II
Darin found that the variance of 30-mg parts had not increased. He used ANOVA to prove the average weight of parts produced by 3 departments was not equal. A blocking variable, when produced, had equal means. Therefore, time of production did not affect part weight. The treatment variable department explained .057 of variability. The blocking variable time explained .0071 of variability. Variability of .0014 was unexplained. The average weight of parts produced by departments 1 and 3 was found to be different.
20. Nonparametric Hypothesis Testing of Nominal Data
Darin used a goodness of fit test to determine that defects from 3 shifts followed his .20, .30, and .50 expected distribution. Two categorical variables, customer age and making a sale, were found to be statistically independent using a contingency table.
21. Nonparametric Hypothesis Testing of Ordinal Data Part I
Darin used a run test to determine the 30-milligram parts first presented on page 68 were drawn at random. He then used a sign test to determine median defects had not increased from the median of 5 recorded last year. An earlier parametric test (page 100) indicating the inequality of mean sick days taken by graduates and non-graduates was confirmed using a Mann-Whitney median test.
22. Nonparametric Hypothesis Testing of Ordinal Data Part II
An earlier study (page 100) measuring the effects of employee training using a paired difference test of dependent population means was redone using a paired sign test of dependent populations. According to both studies, training improved efficiency. An earlier ANOVA test comparing the mean weight of 99-mg parts from three departments was redone with a Kruskal-Wallis test of 3 medians. The earlier test was confirmed.
Correlation and Regression
23. Correlation
A .908 coefficient of correlation was calculated for a person's age and their sales commissions. The coefficient of determination explained 82.4% of the variability between age and sales commissions. The coefficient of nondetermination showed 17.6% unexplained variability. A range for r proved rho, the population coefficient of correlation, could not be zero at the .01 level of significance.
24. Simple Linear Regression Analysis
The regression equation for age of salespeople and their sales commissions was Y x 55 9 1 07x (in thousands of dollars). It was used to estimate commissions of 30,200 for 24-year-old salespeople. A range of 27,470 to 32,930 was calculated for the 30,200 commissions estimate. A check of the range for commissions (y) excluded the possibility of negative commissions. Therefore, a larger sample was not necessary.
Appendix I: Complete Solutions to Practice Sets
Practice Set complete solutions have been separated from Quick Question complete solutions to facilitate reading them in sequence as if they were a business case. The result is an example of how statistical analysis may be used for decision making. This grouping also allows for an easy comparison of related items located in different chapters. Appendix I page numbers begin with the letters PS and match their corresponding Practice Set page number.
Note: Quick Question answers begin on page QQ. Answers in Quick Notes may differ slightly from those generated by statistics software.
Practice Set 2
Summarizing Data

I. Darin recently collected the following Walkman CD Recorder sales data.
   Units sold per day: 17, 22, 17, 8, 12, 15, 14, 16, 21, 29, 16

   A. Make an array and calculate the range of this data.
      Array: 8, 12, 14, 15, 16, 16, 17, 17, 21, 22, 29
      Range: H - L = 29 - 8 = 21

   B. Calculate an appropriate class width for this data.
      Class width = range / number of classes = (29 - 8) / 5 = 4.2, so use 4 or 5
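The array, range, and class-width steps in question I can be reproduced with a minimal sketch. The variable names are illustrative only; the data and the choice of 5 classes come from the problem.

```python
# Daily Walkman unit sales as collected (unsorted).
units_sold = [17, 22, 17, 8, 12, 15, 14, 16, 21, 29, 16]

# An array is the data sorted from low to high.
array = sorted(units_sold)

# Range = highest value minus lowest value.
data_range = array[-1] - array[0]

# Class width = range divided by the number of classes (5 here).
number_of_classes = 5
class_width = data_range / number_of_classes

print(array)
print(data_range)
print(class_width)
```

The computed width of 4.2 is then rounded to a convenient whole number, which is why the solution offers 4 or 5.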
II. Use the first three columns of this chart to make a 5-class frequency distribution. Use stated class limits for the first class of 5 - 9 sales units. Then answer the following questions.

Darin's Music Emporium Daily Walkman Sales Data

Stated Class                              Relative             Cumulative Frequency
Limits         Tally    Frequency (f)     Frequency (f/n)      More-than    Less-than
5 - 9          I        1                 0.09                 11           0
10 - 14        II       2                 0.18                 10           1
15 - 19        IIIII    5                 0.46                 8            3
20 - 24        II       2                 0.18                 3            8
25 - 29        I        1                 0.09                 1            10
Total frequency (n)     11                1.00                 0            11

   A. Draw or print a histogram.
   B. Draw or print a frequency polygon.
[Figures: Histogram and Frequency Polygon of Daily Walkman Sales. Horizontal axis: classes 5 - 9, 10 - 14, 15 - 19, 20 - 24, 25 - 29; vertical axis: frequency.]
   C. Draw or print a less-than cumulative relative frequency polygon and a relative frequency polygon.
      The first answer requires dividing the less-than chart data by 11 or adding the relative frequencies.
[Figures: Less-than Ogive and Relative Frequency Polygon of Daily Walkman Sales. Horizontal axis: daily Walkman sales; vertical axis: cumulative relative frequency (0 to 100) for the ogive and relative frequency for the polygon.]
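The frequency distribution behind these charts can be rebuilt with a short sketch. The function name `frequency_table` and its dictionary keys are hypothetical; the class limits and data come from the Practice Set. The more-than and less-than columns follow the chart's convention of counting observations at or above, and below, each class's lower limit.

```python
sales = [8, 12, 14, 15, 16, 16, 17, 17, 21, 22, 29]
stated_classes = [(5, 9), (10, 14), (15, 19), (20, 24), (25, 29)]


def frequency_table(data, class_limits):
    """Build frequency, relative frequency, and cumulative columns."""
    n = len(data)
    rows = []
    below = 0  # running count of observations below the current lower limit
    for low, high in class_limits:
        f = sum(1 for x in data if low <= x <= high)
        rows.append({
            "class": f"{low} - {high}",
            "frequency": f,
            "relative": f / n,
            "more_than": n - below,  # observations at or above the lower limit
            "less_than": below,      # observations below the lower limit
        })
        below += f
    return rows


table = frequency_table(sales, stated_classes)
for row in table:
    print(f"{row['class']:<8} {row['frequency']:>2} {row['relative']:>5.2f} "
          f"{row['more_than']:>3} {row['less_than']:>3}")
```

Plain rounding gives 0.45 for the 15 - 19 class (5/11); the chart shows 0.46, presumably so that the rounded column totals 1.00. Dividing each less-than count by 11 gives the points of the less-than ogive.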
PS 6 and 7
Practice Se Sett I.
Measuring Central Te Ten nden dency of Ungrouped Data Data
Darin Jones nes wa wan nts to kn know ow mo more re abou aboutt the sales ales of Walk alkma man n CD recor recorders/p ders/players layers described described on page 6. Calcul Calc ulat ate e the the samp sample le me mean an usin using g this this Walkm alkman an sale sales s data data from from th the e la last st Prac Practi tice ce Set. St Stat ate e th the e fo form rmul ula a fo forr th the e po popu pula lati tion on me mean an.. Ar Arra ray y of dail daily y Wa Walk lkma man n sale sales: s: 8, 12, 14, 15, 16 , 1 6, 17, 17, 21, 22, 29
A. Sample mean
II.
X
[
=
~
=
W= 17
Po Popul pulati ation on mean mean formul formula a
B.
Darin sell sells s thre three e di diff ffer eren entt Wa Walk lkma man n CD reco record rder ers; s; one one for for 14 149 9, one fo forr 159 159, and a third for for 169 69.. Of the the 187 187 ma mach chin ines es sold sold duri during ng thi this elev eleven en-d -day ay peri period od;; 43 wer ere e th the e le leas astt expe expens nsiv ive, e, 90 we were re mo mode dera rate tely ly pric priced ed,, and 54 we were re th the e ex expe pens nsiv ive e model el.. Ca Calc lcul ulat ate e the the we weig ight hted ed me mean an sale sales s pric price e for for thes these e ma mach chin ines es..
\[ \bar{X}_w = \frac{\sum wX}{\sum w} = \frac{(43)(149) + (90)(159) + (54)(169)}{43 + 90 + 54} = \frac{6{,}407 + 14{,}310 + 9{,}126}{187} = \frac{29{,}843}{187} = \$159.59 \]

III. Using the data from question I, prove that the sum of the deviations from a mean is zero.

\( x \)    \( \bar{x} \)    \( x - \bar{x} \)
 8     17     -9
12     17     -5
14     17     -3
15     17     -2
16     17     -1
16     17     -1
17     17      0
17     17      0
21     17      4
22     17      5
29     17     12

\[ \sum (x - \bar{x}) = -9 + (-5) + (-3) + (-2) + (-1) + (-1) + 0 + 0 + 4 + 5 + 12 = 0 \]

IV. The median number of Walkman units sold is 16.

\[ \text{position} = \frac{n}{2} + .5 = \frac{11}{2} + .5 = 5.5 + .5 = 6 \]

Note: Counting 6 positions from the left or right of the array yields 16 as the 6th number.

V. The mode for this data is 16 and 17.

VI. This data can be described as bimodal.

VII. Calculate the following measures of position. Those using computer software should use a less-than cumulative relative frequency distribution to answer these questions.

A. \( Q_1 \): position \( = \frac{n}{4} + .5 = \frac{11}{4} + .5 = 2.75 + .5 = 3.25 \), so \( Q_1 = 14 + .25(15 - 14) = 14.25 \)
B. \( Q_3 \): position \( = \frac{3n}{4} + .5 = \frac{3(11)}{4} + .5 = 8.25 + .5 = 8.75 \), so \( Q_3 = 17 + .75(21 - 17) = 20 \)
C. Interquartile range \( = Q_3 - Q_1 = 20 - 14.25 = 5.75 \)
D. 6th decile: position \( = \frac{6n}{10} + .5 = \frac{6(11)}{10} + .5 = \frac{66}{10} + .5 = 6.6 + .5 = 7.1 \), so \( D_6 = 17 \)
E. 95th percentile: position \( = \frac{95n}{100} + .5 = \frac{95(11)}{100} + .5 = \frac{1{,}045}{100} + .5 = 10.45 + .5 = 10.95 \), so \( P_{95} = 22 + .95(29 - 22) = 28.65 \)
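The answers to questions II through VII can be cross-checked with a short Python sketch. It assumes the position rule used in this answer key (position = n × fraction + .5, with linear interpolation between neighboring values when the position is not a whole number). The function names `weighted_mean` and `value_at_position` are illustrative and do not come from the workbook.

```python
import math
from statistics import multimode

# Daily Walkman sales from question I, sorted in ascending order.
sales = [8, 12, 14, 15, 16, 16, 17, 17, 21, 22, 29]

def weighted_mean(values, weights):
    """Sum of weight * value divided by the sum of the weights."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)

def value_at_position(sorted_data, fraction):
    """Value at 1-based position n * fraction + .5, interpolating linearly
    between neighbors when the position is not a whole number."""
    n = len(sorted_data)
    position = n * fraction + 0.5
    if position <= 1:
        return sorted_data[0]
    if position >= n:
        return sorted_data[-1]
    lower = math.floor(position)
    remainder = position - lower
    below, above = sorted_data[lower - 1], sorted_data[lower]
    return below + remainder * (above - below)

# Question II: prices weighted by units sold.
price_mean = weighted_mean([149, 159, 169], [43, 90, 54])

# Question III: deviations from the mean sum to zero.
mean = sum(sales) / len(sales)
deviation_sum = sum(x - mean for x in sales)

# Questions IV to VII: median, modes, quartiles, decile, percentile.
median = value_at_position(sales, 0.50)
modes = multimode(sales)
q1 = value_at_position(sales, 0.25)
q3 = value_at_position(sales, 0.75)
d6 = value_at_position(sales, 0.60)
p95 = value_at_position(sales, 0.95)

print(round(price_mean, 2), deviation_sum, median, modes)
print(q1, q3, q3 - q1, d6, round(p95, 2))
```

Software packages often use a different percentile convention, so their quartiles may differ slightly from the hand-calculated values here.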
PS 12 and 13
Practice Set I
Measuring Dispersion of Ungrouped Data
I. Darin is concerned about Walkman sales variability. First calculate the range for Walkman sales and then the average deviation, the standard deviation, and the variance.
Array of daily Walkman sales: 8, 12, 14, 15, 16, 16, 17, 17, 21, 22, 29
Sample mean: \( \bar{x} = 17 \)

A. Range \( = 29 - 8 = 21 \)
B. Sample average deviation \( = \frac{\sum |x - \bar{x}|}{n} = \frac{42}{11} = 3.8 \)
C. Sample variance \( s^2 = \frac{\sum (x - \bar{x})^2}{n - 1} = \frac{306}{10} = 30.6 \)
D. Sample standard deviation \( s = \sqrt{30.6} = 5.53 \)
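The four dispersion measures in question I can be verified with a minimal Python sketch. It assumes, as the answer key does, that the average deviation divides by n while the sample variance divides by n - 1. The variable names are illustrative.

```python
import math

# Daily Walkman sales from question I.
sales = [8, 12, 14, 15, 16, 16, 17, 17, 21, 22, 29]
n = len(sales)
mean = sum(sales) / n

# A. Range: largest value minus smallest value.
data_range = max(sales) - min(sales)

# B. Average deviation: mean of the absolute deviations (divides by n).
average_deviation = sum(abs(x - mean) for x in sales) / n

# C. Sample variance: squared deviations divided by n - 1.
sample_variance = sum((x - mean) ** 2 for x in sales) / (n - 1)

# D. Sample standard deviation: square root of the sample variance.
sample_std = math.sqrt(sample_variance)

print(data_range, round(average_deviation, 1),
      round(sample_variance, 1), round(sample_std, 2))
```

Python's standard `statistics.variance` and `statistics.stdev` also use the n - 1 divisor, so they should agree with parts C and D.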
II. Label this graph depicting the empirical rule.

[Graph: bell curve with the horizontal axis marked at 3s, 2s, and 1s on either side of \( \bar{x} \). Labeled areas: 68.26% within \( \bar{x} \pm 1s \), 95.44% within \( \bar{x} \pm 2s \), 99.74% within \( \bar{x} \pm 3s \).]

III. Last year's mean weekly Walkman sales were 16 and the standard deviation was 4. Use the empirical rule to determine a range for Walkman sales for one, two, and three sample standard deviations from the mean.

A. One Standard Deviation (68.26%): 16 ± 1(4) = 16 ± 4; range: 12 to 20
B. Two Standard Deviations (95.44%): 16 ± 2(4) = 16 ± 8; range: 8 to 24
C. Three Standard Deviations (99.74%): 16 ± 3(4) = 16 ± 12; range: 4 to 28

IV. Use Chebyshev's rule to determine a range for Walkman sales being within two sample standard deviations of the mean (see question III).

\[ 1 - \frac{1}{k^2} = 1 - \frac{1}{2^2} = 1 - \frac{1}{4} = \frac{3}{4} = 75\% \]

At least 75% of weekly sales fall within two standard deviations of the mean, that is, from 8 to 24.
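The empirical-rule ranges and the Chebyshev bound can be reproduced with a small Python sketch. The helper names `interval` and `chebyshev_minimum_share` are hypothetical, and the percentages in `EMPIRICAL_PERCENT` are the ones printed in this answer key.

```python
# Empirical-rule percentages as given in the answer key.
EMPIRICAL_PERCENT = {1: 68.26, 2: 95.44, 3: 99.74}

def interval(mean, sd, k):
    """Range covering k standard deviations on either side of the mean."""
    return (mean - k * sd, mean + k * sd)

def chebyshev_minimum_share(k):
    """Chebyshev's rule: at least 1 - 1/k^2 of the data lies within
    k standard deviations of the mean (meaningful only for k > 1)."""
    if k <= 1:
        raise ValueError("Chebyshev's rule requires k > 1")
    return 1 - 1 / k ** 2

mean, sd = 16, 4  # last year's weekly Walkman sales, question III
for k, percent in EMPIRICAL_PERCENT.items():
    low, high = interval(mean, sd, k)
    print(f"{k} standard deviation(s): {low} to {high} ({percent}%)")

two_sd_share = chebyshev_minimum_share(2)
print(f"Chebyshev, k = 2: at least {two_sd_share:.0%} within {interval(mean, sd, 2)}")
```

The empirical rule assumes a roughly bell-shaped distribution, while Chebyshev's rule holds for any distribution, which is why its guarantee for two standard deviations (75%) is weaker than 95.44%.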
V. Darin read in a trade publication that the average Walkman sales and standard deviation for a store of his size and type are 18 and 3, respectively. Using the sample data from page 18, are Darin's Walkman sales more or less variable than those of his industry? Use the standard deviation calculated in question I.

Industry Sales Data: \( CV = \frac{\sigma}{\mu}(100) = \frac{3}{18}(100) = 16.67 \)
Darin's Music Emporium: \( CV = \frac{s}{\bar{x}}(100) = \frac{5.5}{17}(100) = 32.35 \)

Sales from this small sample of only 11 days were about twice as variable as the industry population data.
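The comparison in question V can be checked with a brief Python sketch. The function name `coefficient_of_variation` is illustrative, and the sketch assumes the rounded standard deviation of 5.5 that the answer key appears to use for Darin's store.

```python
def coefficient_of_variation(sd, mean):
    """Standard deviation expressed as a percentage of the mean."""
    return sd / mean * 100

# Industry figures from the trade publication.
industry_cv = coefficient_of_variation(3, 18)

# Darin's sample: mean 17, standard deviation rounded to 5.5.
darin_cv = coefficient_of_variation(5.5, 17)

# How many times more variable Darin's sales are than the industry's.
ratio = darin_cv / industry_cv

print(round(industry_cv, 2), round(darin_cv, 2), round(ratio, 2))
```

Because the coefficient of variation is unit-free, it allows two data sets with different means (18 versus 17) to be compared directly.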