STA1510 STA1610 2017 TL Part I Questions and Solutions

STA1510 and STA1610/001/1/2017

Tutorial letter 001/1/2017 Basic Statistics and Introduction to Statistics

STA1510 and STA1610 Semester 1 Department of Statistics

DISCUSSION CLASS QUESTIONS AND SOLUTIONS PART I

DISCUSSION CLASS QUESTIONS PART 1

QUESTION 1

Louisiana's Energy Corporation lists four types of domestic electric customers. In its computer records the company uses "1" to identify residential customers, "2" for commercial customers, "3" for industrial customers, and "4" for government customers. The type of variable that could represent the above statement is

1. nominal
2. discrete
3. ordinal
4. both discrete and nominal
5. both discrete and ordinal

QUESTION 2

Which one of the following statements is incorrect?

1. The average mark for STA1610, with values such as 0.75%, 0.748% and 0.7498%, is a continuous variable.
2. The number of cylinders in an engine is a discrete variable.
3. The size of a soft drink (small, medium or large) is a quantitative variable.
4. The amount of school fees represents a continuous variable.
5. The median is not sensitive to outliers.

QUESTION 3

The following is a stem-and-leaf display representing the amount of gasoline purchased, in gallons.

4 | 1 5 8
5 | 0 2 2 5 9
6 | 1 2 5 5 6 6 7
7 | 0 3

Which one of the following statements is correct?

1. The range is 29
2. The mode is 65
3. The fifth smallest number is 2
4. The median is 61
5. An ordered array is 4|1 4|5 4|8 5|0 5|2 5|2 5|5 5|9 6|1 6|2 6|5 6|5 6|6 6|6 6|7 7|0 7|3

QUESTION 4

Which one of the following statements is incorrect?

1. The starting salaries of graduates of computer programmes is a continuous variable.
2. The weekly closing price of the stock shoprite.com is a discrete variable.
3. A parameter is a characteristic measure of a population.
4. A statistic is a characteristic measure of a sample.
5. A nominal scale uses numbers only for the purpose of identifying membership in a group or category; these numbers have no arithmetic meaning.

QUESTION 5

The measure of central tendency tells us about:

1. The mean, median and mode when they have the same value.
2. The presence of outliers.
3. Whether the distribution is symmetrical or skewed.
4. The range, mean and variance.
5. The mean, median and mode.

QUESTION 6

The following data give the monthly expenses (in thousand rands) for a sample of 9 households:

17 21 8 11 14 6 15 20 5

Calculate:

1. $\sum_{i=1}^{9} X_i$
2. $\sum_{i=1}^{9} X_i^2$
3. $\sum_{i=1}^{9} (X_i - \bar{X})$
4. $\sum_{i=1}^{9} (X_i - \bar{X})^2$
5. $\dfrac{\sum_{i=1}^{9} (X_i - \bar{X})^2}{n - 1}$

QUESTION 7

Consider the following data set:

33 29 45 60 42 19 52 38 36

Which one of the following statements is incorrect?

(1) The mean $\bar{X} = 39.3333$.
(2) The median is 38.
(3) The distribution is positively skewed.
(4) The mode is zero.
(5) The coefficient of variation CV = 31.14%.

QUESTION 8

The daily consumption in kilowatt-hours (kWh) by a sample of 10 households is

51 50 47 33 37 43 61 55 44 41

Which one of the following statements is incorrect?

1. The position of Q1 = 2.75
2. The median is 40
3. The value of Q2 = 45.5
4. The value of Q3 = 51
5. The interquartile range is 10

QUESTION 9

(1) If events A and B are independent, P(A) = 0.40 and P(B) = 0.30. Which of the following statements is incorrect?

1. P(A^C) = 0.60
2. P(A and B) = 0.12
3. P(A or B) = 0.58
4. P(A | B) = 0.40
5. P(B | A) = 0.40

(2) If events A and B are mutually exclusive, P(A) = 0.40 and P(B) = 0.30. Which one of the following statements is correct?

1. P(B^C) = 0.60
2. P(A and B) = 0.12
3. P(A or B) = 0.70
4. P(A | B) = 0.4
5. P(B | A) = 0.3

QUESTION 10

A group of 150 Chief Executive Officers (CEOs) is tested for personality type. The following table gives the results of this survey.

Gender       Type A (A)   Type B (B)
Men (M)          78           42
Women (W)        19           11

If one CEO is selected at random from the group, which one of the following statements is correct?

(1) The events "woman (W)" and "type A personality (A)" are mutually exclusive.
(2) P(M | B) = 0.35
(3) P(W and B) = 0.073
(4) P(W) = 0.353
(5) Suppose that events "men (M)" and "type A (A)" are independent; then P(M and A) = 0.8227

QUESTION 11

Given the following contingency table

          C       D       Total
A         ?       0.11    0.34
B         0.44    ?       ?
Total     0.67    ?       1.00

Which of the following statements is incorrect?

1. Events B and D are independent
2. P(A and D) = 0.11
3. A and C are mutually exclusive
4. P(A | D) = 0.33
5. P(B or C) = 0.89

QUESTION 12

Three males with an X-linked genetic disorder have one child each. The random variable X is the number of children among the three who inherit the X-linked genetic disorder. The probability distribution of X is:

X       0      1      2      3      4
P(X)    0.10   0.20   ?      0.15   0.05

Which of the following statements is correct?

1. P(0 <= X <= 3) = 0.80
2. P(1 < X < 4) = 0.65
3. P(X >= 2) = 0.2
4. E(X) = 1.58
5. The variance $\sigma^2 = 0.8660$

QUESTION 13

The Department of Statistics owns 6 laptops and each laptop has a 25% probability of working properly. (Hint: use both the formula and the statistical tables where necessary.) Which one of the following statements is incorrect?

1. P(X = 2) = 0.2966
2. P(X > 4) = 0.0376
3. P(X <= 3) = 0.9624
4. Mean $\mu = 1.5$ and variance $\sigma^2 = 1.125$
5. A binomial process can be conducted.

QUESTION 14

During working hours, arrivals at a curbside banking machine have been found to be Poisson distributed with a mean of 1.3 persons per minute. Let x = number of arrivals during a given minute.

A. The variance of persons per minute is

1. 1.3
2. 1.14
3. 1.69
4. 0.03
5. 13

B. Calculate P(X <= 5)

1. 0.0084
2. 0.9977
3. 0.9893
4. 0.0022
5. 0.0106

QUESTION 15

A neuropsychologist designs a test for short-term memory that has a population mean score of 100 and a standard deviation of 5. Calculate the probability that a randomly selected person will have a score of at least 110.

1. 2.00
2. 0.0228
3. 0.9772
4. 0.0179
5. 0.00228

QUESTION 16

Which one of the following statements is correct?

1. P(Z > 1.51) = 0.9345
2. P(Z < 1.55) = 0.0606
3. P(Z < -1.63) = 0.9484
4. P(-1.44 < Z < 0.60) = 0.6050
5. P(Z > -1.44) = 0.9251

QUESTION 17

If the area to the right of a positive z_1 is 0.063, then the value of z_1 must be

1. 0.35
2. 1.71
3. 1.53
4. -1.53
5. 0.72

QUESTION 18

The weights of a large group of high school students are normally distributed with a mean of 55 kg and a standard deviation of 5 kg. What is the probability that a student's weight will be more than 63 kg?

1. 0.9452
2. 0.0458
3. 0.1446
4. 0.0548
5. 0.8554

TABLE E.2 TABLE OF CUMULATIVE STANDARDIZED NORMAL PROBABILITIES

TABLE E.6 TABLE OF BINOMIAL PROBABILITIES

DISCUSSION CLASS SOLUTIONS PART I

QUESTION 1

Louisiana's Energy Corporation (1 = residential customers, 2 = commercial customers, 3 = industrial customers, 4 = government customers) represents a qualitative, nominal variable.

Option (1)

QUESTION 2

1. Correct. The values are 0.75, 0.748 and 0.7498; the data can be measured to any level of accuracy.
2. Correct. We can count the cylinders, so the data will be integers (whole numbers).
3. Incorrect. The size of a soft drink (small, medium or large) is a qualitative, ordinal variable.
4. Correct. We can measure the amount.
5. Correct. The median is the middle number, while an outlier is always a value at the extreme.

Option (3)

QUESTION 3

4 | 1 5 8
5 | 0 2 2 5 9
6 | 1 2 5 5 6 6 7
7 | 0 3

The ordered array is:

41 45 48 50 52 52 55 59 61 62 65 65 66 66 67 70 73

1. Incorrect. The range is the largest number minus the smallest number: 73 - 41 = 32.
2. Incorrect. The mode is the most repeated observation. There are three modes: 52, 65 and 66.
3. Incorrect. The fifth smallest value is 52.
4. Correct. The median is the middle value in an ordered array. The position of the median is $\frac{n + 1}{2} = \frac{17 + 1}{2} = 9$, which gives a value of 61.
5. Incorrect. An ordered array is 41 45 48 50 52 52 55 59 61 62 65 65 66 66 67 70 73.
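The five statements for Question 3 can be checked numerically. The following is a minimal sketch using only the Python standard library; the variable names (`stem_leaf`, `ordered`, and so on) are illustrative and not part of the tutorial letter.

```python
from collections import Counter
from statistics import median

# Stem-and-leaf display from Question 3, stored as stem -> leaves.
stem_leaf = {4: [1, 5, 8], 5: [0, 2, 2, 5, 9], 6: [1, 2, 5, 5, 6, 6, 7], 7: [0, 3]}

# Expand into an ordered array (stems and leaves are already sorted).
ordered = [10 * stem + leaf for stem, leaves in sorted(stem_leaf.items()) for leaf in leaves]

data_range = ordered[-1] - ordered[0]          # largest minus smallest
counts = Counter(ordered)
top = max(counts.values())
modes = sorted(v for v, c in counts.items() if c == top)  # all most-frequent values
fifth_smallest = ordered[4]                    # 5th value, zero-based index 4
middle = median(ordered)                       # value at position (n + 1) / 2

print(data_range, modes, fifth_smallest, middle)
```
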

Option (4)

QUESTION 4

1. Correct. Starting salaries can take values such as R2000.00 or R15000.00, and so on.
2. Incorrect. The weekly closing price is a quantitative, continuous variable.
3. Correct
4. Correct
5. Correct

Option (2)

QUESTION 5

The measure of central tendency tells us about the mean, median and mode. This is because these measures allow us to assign a value to what is the most representative value of the group.

Option (5)

QUESTION 6

Given the following numbers:

17 21 8 11 14 6 15 20 5

Calculate

1. $\sum_{i=1}^{9} X_i = 17 + 21 + 8 + 11 + 14 + 6 + 15 + 20 + 5 = 117$

2. $\sum_{i=1}^{9} X_i^2 = (17)^2 + (21)^2 + (8)^2 + (11)^2 + (14)^2 + (6)^2 + (15)^2 + (20)^2 + (5)^2 = 1797$

3. $\sum_{i=1}^{9} (X_i - \bar{X})$

$\bar{X}$ is the sample mean (= average):

$\bar{X} = \dfrac{\sum_{i=1}^{n} X_i}{n} = \dfrac{17 + 21 + 8 + 11 + 14 + 6 + 15 + 20 + 5}{9} = \dfrac{117}{9} = 13$

$\sum_{i=1}^{9} (X_i - \bar{X}) = (17 - 13) + (21 - 13) + (8 - 13) + (11 - 13) + (14 - 13) + (6 - 13) + (15 - 13) + (20 - 13) + (5 - 13)$
$= 4 + 8 + (-5) + (-2) + 1 + (-7) + 2 + 7 + (-8)$
$= 0$

4. $\sum_{i=1}^{9} (X_i - \bar{X})^2 = (17 - 13)^2 + (21 - 13)^2 + (8 - 13)^2 + (11 - 13)^2 + (14 - 13)^2 + (6 - 13)^2 + (15 - 13)^2 + (20 - 13)^2 + (5 - 13)^2$
$= 4^2 + 8^2 + (-5)^2 + (-2)^2 + 1^2 + (-7)^2 + 2^2 + 7^2 + (-8)^2$
$= 16 + 64 + 25 + 4 + 1 + 49 + 4 + 49 + 64$
$= 276$

5. $\dfrac{\sum_{i=1}^{9} (X_i - \bar{X})^2}{n - 1} = \dfrac{276}{9 - 1} = \dfrac{276}{8} = 34.5$
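The five quantities in Question 6 can be reproduced in a few lines. This is a minimal sketch in plain Python; the variable names are illustrative only.

```python
# Monthly expenses (in thousand rands) from Question 6.
expenses = [17, 21, 8, 11, 14, 6, 15, 20, 5]
n = len(expenses)

total = sum(expenses)                               # 1. sum of X_i
total_sq = sum(x ** 2 for x in expenses)            # 2. sum of X_i squared
mean = total / n                                    # sample mean
dev_sum = sum(x - mean for x in expenses)           # 3. deviations from the mean sum to zero
ss = sum((x - mean) ** 2 for x in expenses)         # 4. sum of squared deviations
variance = ss / (n - 1)                             # 5. sample variance

print(total, total_sq, mean, dev_sum, ss, variance)
```

Item 3 is zero for any data set, because the deviations above and below the mean always cancel.
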

QUESTION 7

Given the data set:

33 29 45 60 42 19 52 38 36

1. The mean

$\bar{X} = \dfrac{\sum_{i=1}^{n} X_i}{n} = \dfrac{33 + 29 + 45 + 60 + 42 + 19 + 52 + 38 + 36}{9} = \dfrac{354}{9} = 39.3333$

2. The median is the middle value in an ordered array. The ranked data:

19 29 33 36 38 42 45 52 60

The median is 38.

3. The distribution is positively skewed if the mean $\bar{X}$ is greater than the median and the median is greater than the mode. Since there is no mode in the data set, we refer to the values of the mean and the median.

The mean $\bar{X} = 39.3333$ > the median = 38. Therefore the distribution is positively skewed.

4. Incorrect

Since no number is repeated more often than the others, we conclude that there is no mode.

5. The coefficient of variation

$CV = \dfrac{\text{standard deviation}}{\text{mean}} \times 100\% = \dfrac{S}{\bar{X}} \times 100\%$

The mean $\bar{X} = 39.3333$.

For the standard deviation $S$, we first need to calculate the variance $S^2$:

$S^2 = \dfrac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$
$= \dfrac{(33 - 39.3333)^2 + (29 - 39.3333)^2 + (45 - 39.3333)^2 + (60 - 39.3333)^2 + (42 - 39.3333)^2 + (19 - 39.3333)^2 + (52 - 39.3333)^2 + (38 - 39.3333)^2 + (36 - 39.3333)^2}{9 - 1}$
$= \dfrac{1200}{8} = 150$

The standard deviation $S = \sqrt{150} = 12.2474$

The coefficient of variation

$CV = \dfrac{S}{\bar{X}} = \dfrac{12.2474}{39.3333} = 0.3114$ or 31.14%
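The Question 7 figures can be verified with Python's `statistics` module. This is a sketch only; `has_mode` is an illustrative helper flag, and treating "every value occurs once" as "no mode" follows the convention used in the solution above.

```python
from statistics import mean, median, multimode, stdev

data = [33, 29, 45, 60, 42, 19, 52, 38, 36]

x_bar = mean(data)            # sample mean
med = median(data)            # middle value of the ranked data
s = stdev(data)               # sample standard deviation (divides by n - 1)
cv = s / x_bar * 100          # coefficient of variation, in percent

# multimode returns every value tied for the highest count; if that is the
# whole data set, no value repeats and the data have no mode.
has_mode = len(multimode(data)) < len(data)

print(round(x_bar, 4), med, round(s, 4), round(cv, 2), has_mode)
```
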

Option (4)

QUESTION 8

Data: 51 50 47 33 37 43 61 55 44 41

Ranked data with positions:

Ranked data:  33   37   41   43   44   47   50   51   55   61
Position:     1st  2nd  3rd  4th  5th  6th  7th  8th  9th  10th

The quartile positions are 2.75 for Q1, 5.5 for Q2 and 8.25 for Q3.

n = total number = 10

1. The position of Q1:

$\dfrac{n + 1}{4} = \dfrac{10 + 1}{4} = \dfrac{11}{4} = 2.75$

2. Incorrect. The median:

$\dfrac{44 + 47}{2} = 45.5$

3. The value of Q2 = median (by definition).

The position of Q2 $= \dfrac{2(n + 1)}{4} = 2(2.75) = 5.5$

Because 5.5 falls between the 5th value = 44 and the 6th value = 47, the value of Q2 is the average of these two values:

$\dfrac{44 + 47}{2} = 45.5$

4. The value of Q3. To calculate the value of Q3, we first have to calculate the position of Q3:

The position of Q3 $= \dfrac{3(n + 1)}{4} = 3(2.75) = 8.25$

For the value of Q3, we round 8.25 to 8 and consider the 8th value in the ordered array. The value of Q3 = 51.

5. The interquartile range = Q3 - Q1.

For the value of Q1, we refer to the position of Q1 = 2.75. We round 2.75 to 3 and consider the third value in the ordered array, which equals 41.

The interquartile range: Q3 - Q1 = 51 - 41 = 10
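The quartile rules used in Question 8 (position k(n + 1)/4, average the two neighbours at a half position, otherwise round to the nearest rank) can be sketched as a small function. The function name `quartile` is hypothetical, and the sketch assumes the computed position stays within the data, as it does here; other textbooks and software use different quartile conventions.

```python
def quartile(sorted_data, k):
    """Return (position, value) of the k-th quartile using the (n + 1)/4 rule."""
    n = len(sorted_data)
    position = k * (n + 1) / 4
    lower = int(position)              # whole part of the position
    fraction = position - lower
    if fraction == 0.5:
        # Halfway between two ranks: average the neighbouring values.
        return position, (sorted_data[lower - 1] + sorted_data[lower]) / 2
    # Otherwise round the position to the nearest rank (ranks are 1-based).
    rank = lower + 1 if fraction > 0.5 else lower
    return position, sorted_data[rank - 1]


usage = sorted([51, 50, 47, 33, 37, 43, 61, 55, 44, 41])
(p1, q1), (p2, q2), (p3, q3) = (quartile(usage, k) for k in (1, 2, 3))
iqr = q3 - q1

print(p1, q1, p2, q2, p3, q3, iqr)
```
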

Option (2)

QUESTION 9

(1) Events A and B are independent, which means P(A and B) = P(A) × P(B).

P(A) = 0.40
P(B) = 0.30

1. Correct
P(A^C) = 1 - P(A) = 1 - 0.40 = 0.60

2. Correct
P(A and B) = P(A) × P(B) = 0.40 × 0.30 = 0.12

3. Correct
P(A or B) = P(A) + P(B) - P(A and B) = 0.40 + 0.30 - 0.12 = 0.58

4. Correct
P(A | B) = P(A and B) / P(B) = P(A) × P(B) / P(B) = P(A) = 0.40

5. Incorrect
P(B | A) = P(A and B) / P(A) = P(B) = 0.3

Option (5)

(2) Events A and B are mutually exclusive when P(A and B) = 0.

P(A) = 0.40
P(B) = 0.30

1. Incorrect
P(B^C) = 1 - P(B) = 1 - 0.30 = 0.70

2. Incorrect
P(A and B) = 0

3. Correct
P(A or B) = P(A) + P(B) - P(A and B) = 0.40 + 0.30 - 0 = 0.70

4. Incorrect
P(A | B) = P(A and B) / P(B) = 0 / 0.30 = 0

5. Incorrect
P(B | A) = P(A and B) / P(A) = 0 / 0.40 = 0
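Both parts of Question 9 apply the same rules and differ only in the value of P(A and B). A minimal sketch follows; the helper name `probabilities` and the dictionary keys are illustrative, not from the tutorial letter.

```python
def probabilities(p_a, p_b, p_a_and_b):
    """Complement, addition and conditional rules for two events A and B."""
    return {
        "not A": 1 - p_a,
        "not B": 1 - p_b,
        "A and B": p_a_and_b,
        "A or B": p_a + p_b - p_a_and_b,      # addition rule
        "A given B": p_a_and_b / p_b,         # conditional probability
        "B given A": p_a_and_b / p_a,
    }


p_a, p_b = 0.40, 0.30
independent = probabilities(p_a, p_b, p_a * p_b)   # P(A and B) = P(A) * P(B)
exclusive = probabilities(p_a, p_b, 0.0)           # P(A and B) = 0

for name, result in (("independent", independent), ("mutually exclusive", exclusive)):
    print(name, {key: round(value, 4) for key, value in result.items()})
```
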

Option (3)

QUESTION 10

Gender       Type A (A)   Type B (B)   Total
Men (M)          78           42        120
Women (W)        19           11         30
Total            97           53        150

1. If events W and A are mutually exclusive, then P(W and A) = 0. But from the above table:

P(W and A) = 19/150 = 0.1267

Therefore, events W and A are not mutually exclusive.

2. P(M | B) = P(M and B) / P(B) = (42/150) / (53/150) = 0.28 / 0.3533 = 0.7925

3. P(W and B) = 11/150 = 0.0733

4. P(W) = 30/150 = 0.2

5. If events M and A are independent: P(M and A) = P(M) × P(A)

P(M) = 120/150 = 0.8
P(A) = 97/150 = 0.6467

Therefore P(M and A) = 0.8 × 0.6467 = 0.5174
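The Question 10 probabilities all come from dividing cell counts or marginal totals by 150. A short sketch, assuming the counts are held in a dictionary keyed by (gender, type); the names `counts` and `marginal` are illustrative.

```python
# Cell counts from the Question 10 table, keyed by (gender, personality type).
counts = {("M", "A"): 78, ("M", "B"): 42, ("W", "A"): 19, ("W", "B"): 11}
total = sum(counts.values())


def marginal(index, label):
    """Marginal probability of a gender (index 0) or a type (index 1)."""
    return sum(c for key, c in counts.items() if key[index] == label) / total


p_w_and_a = counts[("W", "A")] / total
p_m_given_b = (counts[("M", "B")] / total) / marginal(1, "B")
p_w_and_b = counts[("W", "B")] / total
p_w = marginal(0, "W")
# Product P(M) * P(A), i.e. P(M and A) under the independence assumption.
# Rounding P(A) to 0.6467 first, as in the solution, gives 0.5174.
p_m_times_p_a = marginal(0, "M") * marginal(1, "A")

print(round(p_w_and_a, 4), round(p_m_given_b, 4), round(p_w_and_b, 4), p_w,
      round(p_m_times_p_a, 4))
```
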

Option (3)

QUESTION 11

          C       D       Total
A         0.23    0.11    0.34
B         0.44    0.22    0.66
Total     0.67    0.33    1.00

1. Correct
Events B and D are independent when P(B and D) = P(B) × P(D).

P(B and D) = 0.22
P(B) × P(D) = 0.66 × 0.33 = 0.2178, which is approximately 0.22

2. P(A and D) = 0.11 / 1.00 = 0.11

3. If events A and C are mutually exclusive, then P(A and C) = 0. But

P(A and C) = 0.23 / 1.00 = 0.23

Therefore A and C are not mutually exclusive.

4. P(A | D) = P(A and D) / P(D) = 0.11 / 0.33 = 0.3333

5. P(B or C) = P(B) + P(C) - P(B and C) = 0.66 + 0.67 - 0.44 = 0.89

Option (3)

QUESTION 12

X       0      1      2      3      4
P(X)    0.10   0.20   ?      0.15   0.05

Condition: the sum of P(X) must equal 1:

$\sum_{i=0}^{4} P(X_i) = 1$

P(X = 2) = 1 - (0.10 + 0.20 + 0.15 + 0.05) = 1 - 0.5 = 0.5

1. Incorrect
P(0 <= X <= 3) = P(0) + P(1) + P(2) + P(3) = 0.10 + 0.20 + 0.5 + 0.15 = 0.95

2. Correct
P(1 < X < 4) = P(2) + P(3) = 0.5 + 0.15 = 0.65

3. Incorrect
P(X >= 2) = P(2) + P(3) + P(4) = 0.5 + 0.15 + 0.05 = 0.70

4. Incorrect
$E(X) = \mu = 0 \times 0.10 + 1 \times 0.20 + 2 \times 0.5 + 3 \times 0.15 + 4 \times 0.05 = 0 + 0.20 + 1 + 0.45 + 0.2 = 1.85$

5. The variance

$\sigma^2 = \sum_{i=1}^{n} (x_i - \mu)^2 P(X_i)$
$= (0 - 1.85)^2 \times 0.10 + (1 - 1.85)^2 \times 0.20 + (2 - 1.85)^2 \times 0.5 + (3 - 1.85)^2 \times 0.15 + (4 - 1.85)^2 \times 0.05$
$= 0.342250 + 0.144500 + 0.011250 + 0.198375 + 0.231125$
$= 0.9275$

Option (2)

QUESTION 13

This is a binomial probability distribution with $n = 6$ and $\pi = 0.25$.

1. P(X = 2)

A. Using the formula

$P(X = x) = \dfrac{n!}{x!\,(n - x)!}\, \pi^x (1 - \pi)^{n - x}$

$P(2) = \dfrac{6!}{2!\,(6 - 2)!}\, (0.25)^2 (1 - 0.25)^{6 - 2}$
$= \dfrac{6 \times 5 \times 4 \times 3 \times 2 \times 1}{(2 \times 1)(4 \times 3 \times 2 \times 1)} \times 0.0625 \times (0.75)^4$
$= \dfrac{720}{2 \times 24} \times 0.0625 \times 0.3164$
$= 15 \times 0.0625 \times 0.3164$
$= 0.2966$

B. Using the statistical tables, for $n = 6$ and $\pi = 0.25$:

x    P(X = x)
0    0.1780
1    0.3560
2    0.2966
3    0.1318
4    0.0330
5    0.0044
6    0.0002

Therefore P(X = 2) = 0.2966

2. Incorrect
P(X > 4) = P(X = 5) + P(X = 6) = 0.0044 + 0.0002 = 0.0046

3. P(X <= 3) = P(0) + P(1) + P(2) + P(3) = 0.1780 + 0.3560 + 0.2966 + 0.1318 = 0.9624

4. The mean $\mu = n\pi = 6 \times 0.25 = 1.5$

The variance $\sigma^2 = n\pi(1 - \pi) = 6 \times 0.25 \times (1 - 0.25) = 1.125$

5. Correct
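The binomial formula from Question 13 can be evaluated directly instead of read from the table. A minimal sketch using `math.comb` (available from Python 3.8); the function name `binomial_pmf` is illustrative.

```python
from math import comb


def binomial_pmf(x, n, p):
    """P(X = x) for a binomial distribution with n trials and success probability p."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


n, p = 6, 0.25
pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]

p_eq_2 = pmf[2]
p_gt_4 = pmf[5] + pmf[6]          # P(X > 4) = P(5) + P(6)
p_le_3 = sum(pmf[:4])             # P(X <= 3) = P(0) + ... + P(3)
mean = n * p
variance = n * p * (1 - p)

print([round(v, 4) for v in pmf])
print(round(p_eq_2, 4), round(p_gt_4, 4), round(p_le_3, 4), mean, variance)
```
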

Option (2)

QUESTION 14

A. The variance = mean = 1.3 by definition.

Option (1)

B. Calculate P(X <= 5)

1. Using the formula

$P(x) = \dfrac{e^{-\lambda} \lambda^x}{x!}$ where $e = 2.7183$

P(X <= 5) = P(0) + P(1) + P(2) + P(3) + P(4) + P(5)

$P(0) = \dfrac{(2.7183)^{-1.3} (1.3)^0}{0!} = \dfrac{0.2725 \times 1}{1} = 0.2725$

$P(1) = \dfrac{(2.7183)^{-1.3} (1.3)^1}{1!} = 0.2725 \times 1.3 = 0.3543$

$P(2) = \dfrac{(2.7183)^{-1.3} (1.3)^2}{2!} = \dfrac{0.2725 \times 1.69}{2} = 0.2303$

$P(3) = \dfrac{(2.7183)^{-1.3} (1.3)^3}{3!} = \dfrac{0.2725 \times 2.197}{6} = 0.0998$

$P(4) = \dfrac{(2.7183)^{-1.3} (1.3)^4}{4!} = \dfrac{0.2725 \times 2.8561}{24} = 0.0324$

$P(5) = \dfrac{(2.7183)^{-1.3} (1.3)^5}{5!} = \dfrac{0.2725 \times 3.7129}{120} = 0.0084$

P(X <= 5) = 0.2725 + 0.3543 + 0.2303 + 0.0998 + 0.0324 + 0.0084 = 0.9977

Option (2)

2. Using Table E.7 (page 11)

P(0) = 0.2725
P(1) = 0.3543
P(2) = 0.2303
P(3) = 0.0998
P(4) = 0.0324
P(5) = 0.0084

P(X <= 5) = P(0) + P(1) + P(2) + P(3) + P(4) + P(5) = 0.9977

Option (2)
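The Poisson probabilities in Question 14 can be checked with the formula directly. This is a sketch; `poisson_pmf` is an illustrative name. Summing the unrounded terms gives about 0.99777, so the last decimal can differ slightly from the 0.9977 obtained by adding the six rounded table values.

```python
from math import exp, factorial


def poisson_pmf(x, lam):
    """P(X = x) for a Poisson distribution with mean lam."""
    return exp(-lam) * lam ** x / factorial(x)


lam = 1.3                                   # mean arrivals per minute
pmf = [poisson_pmf(x, lam) for x in range(6)]
p_le_5 = sum(pmf)                           # P(X <= 5)
variance = lam                              # for a Poisson distribution, variance = mean

print([round(v, 4) for v in pmf], p_le_5)
```
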

QUESTION 15

This is a normal distribution. Given the mean $\mu = 100$ and the standard deviation $\sigma = 5$, find P(X >= 110). Let us transform X into Z so that we can use Table E.2 (cumulative standardized normal probabilities).

$Z = \dfrac{X - \mu}{\sigma} = \dfrac{110 - 100}{5} = 2$

Therefore P(X >= 110) = P(Z >= 2.00)

[Figure: standard normal curve. The area to the left of Z = 2.00 is 0.9772; the area to calculate lies to the right of Z = 2.00.]

P(Z >= 2) = 1 - 0.9772 = 0.0228

Option (2)

QUESTION 16

1. P(Z > 1.51) = 1 - P(Z < 1.51) = 1 - 0.9345 = 0.0655
(The area to the left of 1.51 is 0.9345; the area to the right is 0.0655.)

2. P(Z < 1.55) = 0.9394
(The area to the left of 1.55 is 0.9394; the area to the right is 0.0606.)

3. P(Z < -1.63) = 0.0516
(The area to the left of -1.63 is 0.0516; the area to the right is 0.9484.)

4. P(-1.44 < Z < 0.60) = P(Z < 0.60) - P(Z < -1.44) = 0.7257 - 0.0749 = 0.6508

5. P(Z > -1.44) = 1 - P(Z <= -1.44) = 1 - 0.0749 = 0.9251
(The area to the left of -1.44 is 0.0749; the area to the right is 0.9251.)

Option (5)
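The table look-ups in Questions 15 and 16 can be cross-checked with the standard normal cumulative distribution function, written here in terms of `math.erf`. This is a sketch, not a replacement for Table E.2; the function name `phi` is illustrative, and results agree with the table to four decimals.

```python
from math import erf, sqrt


def phi(z):
    """Cumulative standard normal probability P(Z < z)."""
    return 0.5 * (1 + erf(z / sqrt(2)))


# Question 15: standardise X = 110 with mean 100 and standard deviation 5.
z = (110 - 100) / 5
p_at_least_110 = 1 - phi(z)

# Question 16: the five statements.
p1 = 1 - phi(1.51)                # P(Z > 1.51)
p2 = phi(1.55)                    # P(Z < 1.55)
p3 = phi(-1.63)                   # P(Z < -1.63)
p4 = phi(0.60) - phi(-1.44)       # P(-1.44 < Z < 0.60)
p5 = 1 - phi(-1.44)               # P(Z > -1.44)

print(round(p_at_least_110, 4), [round(p, 4) for p in (p1, p2, p3, p4, p5)])
```
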

QUESTION 17

[Figure: standard normal curve. The area to the left of z_1 is 0.9370; the area to the right of z_1 is 0.063.]

Search for 0.9370 inside Table E.2 and read off the corresponding value on the outside of the table: z_1 = 1.53

Option (3)

QUESTION 18

This is a normal probability distribution with the parameters:

The mean $\mu = 55$
The standard deviation $\sigma = 5$

P(X > 63)

Because the normal probability distribution is tabulated in terms of standardized Z values, we have to convert the X value 63 into a Z value using the Z-score formula

$Z = \dfrac{X - \mu}{\sigma}$

If X = 63:

$Z = \dfrac{X - \mu}{\sigma} = \dfrac{63 - 55}{5} = 1.6$