probability reliability

August 31, 2017 | Author: Paris Adrián | Category: Probability Distribution, Reliability Engineering, Uncertainty, Statistics, Probability Theory


Probability, Reliability, and Statistical Methods in Engineering Design

Achintya Haldar, Department of Civil Engineering & Engineering Mechanics, University of Arizona

Sankaran Mahadevan, Department of Civil and Environmental Engineering, Vanderbilt University

John Wiley & Sons, Inc. New York / Chichester / Brisbane / Weinheim / Singapore / Toronto

ACQUISITIONS EDITOR: Wayne Anderson
MARKETING MANAGER: Katherine Hepburn
SENIOR PRODUCTION EDITOR: Patricia McFadden
DESIGNER: Karin Gerdes Kincheloe
ILLUSTRATION EDITOR: Gene Aiello

This book was set in Times by Argosy and printed and bound by Courier/Stoughton. The cover was printed by Phoenix Color. This book is printed on acid-free paper.

[Figure 3.1 Histogram and Frequency Diagram of Young's Modulus. Horizontal axis: Young's modulus, E (ksi)]

other hand, if a very large number of intervals are used, the width of each interval will be very small and the histogram may look like a series of spikes, defeating the purpose of the histogram. A considerable amount of judgment is necessary to plot a meaningful histogram. An empirical relationship can be used for this purpose:

$$k = 1 + 3.3 \log_{10} n \tag{3.7}$$

where k is the number of intervals and n is the number of samples. For the data shown in Table 3.1, n = 41 and so $k = 1 + 3.3 \log_{10} 41 = 6.3$. This gives a rough idea of the number of intervals to use. Suppose a histogram for 1 million data points needs to be plotted. The approximate number of intervals may be calculated as $k = 1 + 3.3 \log_{10} 10^6 = 20.8 \approx 21$. Thus, the number of data points does not necessarily complicate the drawing of a histogram. Considering the minimum and maximum values of Young's modulus, and rounding them off to 25,000 ksi and 34,000 ksi since they are not the absolute minimum and maximum values, we used six intervals with a width of 1,500 ksi each to develop the histogram shown in Figure 3.1. See also Table 3.2.

The area under a histogram depends on the width of the intervals and the number of data points. For the example under consideration, the area under the histogram will be 1,500 × 41 = 61,500. Since the probability of an event is between 0.0 and 1.0, it will be mathematically advantageous to have the area under a histogram equal to unity. A histogram with a unit area is known as a frequency diagram. The frequency diagram can be easily obtained by dividing the ordinates of a histogram by its area. This will not change the shape of the diagram, as shown in Figure 3.1. The histogram or frequency diagram will give the relative frequencies of various intervals.
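The arithmetic behind Equation 3.7 and the histogram area can be checked with a few lines of Python. This is only a sketch: the function name `suggested_intervals` and the variable names are illustrative choices, not part of the text.

```python
import math


def suggested_intervals(n: int) -> float:
    """Empirical rule of Equation 3.7: k = 1 + 3.3 * log10(n)."""
    if n < 1:
        raise ValueError("n must be a positive sample count")
    return 1.0 + 3.3 * math.log10(n)


# Sample sizes quoted in the text: Table 3.1 (n = 41) and one million points.
k_table = suggested_intervals(41)
k_million = suggested_intervals(10**6)

# Area under the histogram: interval width (ksi) times number of samples.
width_ksi = 1_500
n_samples = 41
histogram_area = width_ksi * n_samples

print(f"k for n = 41:   {k_table:.1f}")
print(f"k for n = 10^6: {k_million:.1f}")
print(f"histogram area: {histogram_area}")
```

The rule only suggests a starting point; the text rounds 6.3 down to six intervals after also considering the range of the data.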


Chapter 3

Modeling of Uncertainty

Table 3.2 Data for Histogram and Frequency Diagrams

Interval (×10^3 ksi)    No. of observations    Fraction of observations
25.0-26.5                1                     1/41 = 0.0244
26.5-28.0                4                     0.0976
28.0-29.5               17                     0.4146
29.5-31.0               11                     0.2683
31.0-32.5                6                     0.1463
32.5-34.0                2                     0.0488
Total                                          1.0000

The area under a frequency diagram can be used to estimate the probability of the event of interest. Suppose that the probability of Young's modulus between 28,000 ksi and 31,000 ksi needs to be calculated. The probability of the event can be estimated as

$$P(28{,}000 < E < 31{,}000) = \frac{(17 + 11) \times 1{,}500}{41 \times 1{,}500} = 0.6829.$$
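As a rough illustration, the frequency diagram of Table 3.2 can be rebuilt from the interval counts and the probability above recovered as an area. The sketch below assumes the counts 1, 4, 17, 11, 6, and 2 for the six intervals (the first count is inferred from the fraction 1/41); the helper `probability_between` is a hypothetical name and only handles limits that coincide with interval edges.

```python
# Interval edges (ksi) and observation counts, taken from Table 3.2.
edges = [25_000, 26_500, 28_000, 29_500, 31_000, 32_500, 34_000]
counts = [1, 4, 17, 11, 6, 2]

n = sum(counts)                 # 41 observations
width = edges[1] - edges[0]     # 1,500 ksi per interval

# Frequency-diagram ordinates: histogram ordinates divided by histogram area.
ordinates = [c / (n * width) for c in counts]


def probability_between(lo, hi):
    """Sum the areas of the whole intervals that lie inside [lo, hi]."""
    area = 0.0
    for left, right, height in zip(edges[:-1], edges[1:], ordinates):
        if left >= lo and right <= hi:
            area += height * (right - left)
    return area


total_area = sum(h * width for h in ordinates)
p = probability_between(28_000, 31_000)
print(round(total_area, 4), round(p, 4))
```

Dividing by the histogram area is what makes the total area equal to unity, so any partial area can be read directly as an estimated probability.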

One of the primary objectives of a frequency diagram is to fit a curve to the diagram to model the pattern or behavior of the randomness. A curve can be easily fitted to the frequency diagram, as shown in Figure 3.1. As more data are added, the fitted curve will approach the frequency diagram more closely. Attempts can be made to verify whether the fitted curve represents one of many commonly used distributions, such as the normal or lognormal. This will be discussed in detail in Chapter 5.

3.3 ANALYTICAL MODELS TO QUANTIFY RANDOMNESS

The discussion in the previous section needs to be described mathematically. To make this process simple yet comprehensive, discrete and continuous random variables need to be treated separately. Since the Young's modulus example considered in Section 3.1 can be treated as a continuous random variable, it is discussed first.

3.3.1 Continuous Random Variables

From now on, a random variable will be represented in the text by an uppercase letter (e.g., X), and a particular realization of a random variable will be represented by a lowercase letter (e.g., x). The curve shown in Figure 3.1 is called the probability density function (PDF) or density function and is represented by $f_X(x)$. It does not directly provide information on probability but only indicates the nature of the randomness. To calculate the probability of X having a value between $x_1$ and $x_2$, the area under the PDF between these two limits needs to be calculated. This can be expressed as

$$P(x_1 < X \le x_2) = \int_{x_1}^{x_2} f_X(x)\,dx.$$

[...] (3.29)

If X and Y are statistically independent, as discussed in Section 2.4, the condition has no meaning; that is, $f_{X|Y}(x \mid y) = f_X(x)$ or $p_{X|Y}(x \mid y) = p_X(x)$, and it can be shown that

$$f_{X,Y}(x,y) = f_X(x)\,f_Y(y) \tag{3.30}$$

or

$$p_{X,Y}(x,y) = p_X(x)\,p_Y(y). \tag{3.31}$$

3.4.3 Marginal PDF and PMF

It may be necessary in some cases to calculate the marginal PDF or PMF of a random variable, X, from the information on the joint PDF or PMF of X and Y, by completely eliminating the effect of Y. Using the theorem of total probability, we can show the marginal PDF and PMF to be

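$$f_X(x) = \int_{-\infty}^{\infty} f_{X,Y}(x,y)\,dy \tag{3.32}$$

$$f_Y(y) = \int_{-\infty}^{\infty} f_{X,Y}(x,y)\,dx \tag{3.33}$$

and

$$p_X(x) = \sum_{\text{all } y_j} p_{X,Y}(x, y_j) \tag{3.34}$$

$$p_Y(y) = \sum_{\text{all } x_i} p_{X,Y}(x_i, y). \tag{3.35}$$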

3.4.4 Covariance and Correlation

The problem may become cumbersome if the probability needs to be calculated using the joint distribution of many random variables. Furthermore, the available information may be inadequate to develop the joint distribution of the multiple random variables. For practical applications, it could be advantageous to use the information on the dependence or independence between two random variables to extract as much information as possible. This can be accomplished by covariance and correlation analyses.

Similar to the variance analysis of a single random variable, the covariance of two random variables X and Y, denoted as Cov(X,Y), is the second moment about their respective means $\mu_X$ and $\mu_Y$, and can be calculated as

$$\mathrm{Cov}(X,Y) = E[(X - \mu_X)(Y - \mu_Y)] = E[XY - \mu_X Y - X\mu_Y + \mu_X \mu_Y] = E(XY) - \mu_X \mu_Y = E(XY) - E(X)E(Y). \tag{3.36}$$


E(XY) can be calculated as

$$E(XY) = \int\!\!\int x\,y\,f_{X,Y}(x,y)\,dx\,dy. \tag{3.37}$$

If X and Y are statistically independent, then

$$E(XY) = \int x\,f_X(x)\,dx \int y\,f_Y(y)\,dy = E(X)E(Y). \tag{3.38}$$

From Equation 3.36, it can be observed that for statistically independent X and Y, Cov(X,Y) = 0. Otherwise, it can be positive or negative, and its unit is the product of the units of X and Y. Cov(X,Y) indicates the degree of linear relationship between the two random variables. Nondimensionalizing the covariance will result in the correlation coefficient, denoted as $\rho_{X,Y}$, which can be calculated as

$$\rho_{X,Y} = \frac{\mathrm{Cov}(X,Y)}{\sigma_X \sigma_Y}. \tag{3.39}$$

Values of $\rho_{X,Y}$ range between -1 and +1. Again, the correlation coefficient represents the degree of linear dependence between two random variables. The physical characteristics of the correlation coefficient are elaborated in Figure 3.7. Figure 3.7a indicates that there is no linear relationship between the two random variables; the correlation coefficient is expected to be close to zero, and the two random variables can be considered to be uncorrelated. Figure 3.7b indicates a positive relationship between X and Y; that is, Y increases as X increases. However, the relationship is not perfectly linear, indicating that $\rho_{X,Y}$ is expected to be between 0 and 1.0. Figure 3.7e clearly indicates that there could be some nonlinear relationship between the two random variables, but since the relationship is not linear, $\rho_{X,Y}$ is expected to be zero.

If the correlation coefficient needs to be calculated from observed sample values, it is rare to obtain values of precisely zero, +1, or -1. The two random variables can be considered to be statistically independent if the absolute value of the correlation coefficient is less than 0.3; they can be considered to be perfectly correlated if its absolute value is greater than 0.9.
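The point that the correlation coefficient measures only linear dependence can be illustrated numerically. The sketch below uses NumPy and two made-up relationships (a straight line and a parabola over a grid symmetric about zero); the data and the helper name `correlation` are assumptions for illustration, not taken from Figure 3.7.

```python
import numpy as np

# Hypothetical data: 201 equally spaced x values, symmetric about zero.
x = np.linspace(-1.0, 1.0, 201)

y_linear = 2.0 * x + 1.0    # exact linear dependence on x
y_parabola = x ** 2         # exact, but purely nonlinear, dependence on x


def correlation(u, v):
    """rho = Cov(U, V) / (sigma_U * sigma_V), following Equations 3.36 and 3.39."""
    cov = np.mean(u * v) - np.mean(u) * np.mean(v)
    return cov / (np.std(u) * np.std(v))


rho_linear = correlation(x, y_linear)
rho_parabola = correlation(x, y_parabola)
print(f"linear:   {rho_linear:.3f}")
print(f"parabola: {rho_parabola:.3f}")
```

In the second case Y is completely determined by X, yet the coefficient is essentially zero, which is the situation the text associates with Figure 3.7e.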

EXAMPLE 3.6 The water level in a particular lake depends on two sources, direct rainfall X, and inflow from a stream Y. The rainfall Z around the lake can be considered as a random variable with a mean of $\mu_Z$ and a standard deviation of $\sigma_Z$. X and Y are related to Z as

$$X = aZ$$

$$Y = b + cZ$$

where a, b, and c are constants. X and Y are functions of a random variable and are therefore also random. Calculate the correlation coefficient $\rho_{X,Y}$.

SOLUTION The mean and variance of X and Y can be shown to be (see Section 6.2.1)

3.4

Multiple Random Variables

[Figure 3.7 Correlation of Two Random Variables. Panels: (c) ρ = 1.0; (d) ρ = -1.0; (e) ρ = 0; (f) ρ = 0]

$$\mu_X = a\mu_Z \quad \text{and} \quad \mu_Y = b + c\mu_Z$$

$$\sigma_X^2 = a^2\sigma_Z^2 \quad \text{and} \quad \sigma_Y^2 = c^2\sigma_Z^2.$$

Also, E(XY) can be calculated as

$$E(XY) = E[aZ(b + cZ)] = E[abZ + acZ^2] = ab\,E(Z) + ac\,E(Z^2).$$

Using Equation 3.15, for random variable Z, we can show that

$$\sigma_Z^2 = E(Z^2) - \mu_Z^2.$$

Thus,

$$E(XY) = ab\,\mu_Z + ac\,\sigma_Z^2 + ac\,\mu_Z^2.$$

Using Equation 3.36, we can show the covariance of X and Y to be

$$\mathrm{Cov}(X,Y) = ab\,\mu_Z + ac\,\sigma_Z^2 + ac\,\mu_Z^2 - (a\mu_Z)(b + c\mu_Z) = ac\,\sigma_Z^2.$$


Using Equation 3.39, we can calculate the correlation coefficient of X and Y as

$$\rho_{X,Y} = \frac{\mathrm{Cov}(X,Y)}{\sigma_X \sigma_Y} = \frac{ac\,\sigma_Z^2}{(a\sigma_Z)(c\sigma_Z)} = 1.0.$$

Since both X and Y are linearly related to Z, they are linearly related to each other; therefore, the correlation coefficient of 1.0 between them is expected.
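The result of Example 3.6 can be spot-checked by simulation. The sketch below is not part of the example: the normal model for Z, its mean and standard deviation, the constants a, b, and c, and the random seed are all hypothetical values chosen only to produce numbers, with a and c taken as positive as in the derivation above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical rainfall samples Z; the distribution and parameters are assumed.
z = rng.normal(loc=30.0, scale=5.0, size=10_000)

# Hypothetical positive constants for X = aZ and Y = b + cZ.
a, b, c = 0.4, 2.0, 1.5
x = a * z
y = b + c * z

# Sample versions of Equations 3.36 and 3.39.
cov_xy = np.mean(x * y) - np.mean(x) * np.mean(y)
rho = cov_xy / (np.std(x) * np.std(y))

print(f"Cov(X,Y) = {cov_xy:.4f}, a*c*Var(Z) = {a * c * np.var(z):.4f}")
print(f"rho = {rho:.6f}")
```

The sample covariance matches a c Var(Z) and the coefficient comes out as 1 up to rounding, whatever distribution is assumed for Z, because the relationship between X and Y is exactly linear.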

EXAMPLE 3.7 The time to produce a typical engineering drawing, represented by a random variable X, and its quality, represented by a random variable Y, are under consideration. For the sake of discussion, suppose X can be 70, 80, 90, or 100 hours. The quality of a drawing can be considered to be moderate, good, and excellent, and Y can be considered to be 1, 2, and 3, respectively. Suppose that 100 such drawings are evaluated and the information given in Table 3.3 is obtained. (a) Plot the joint PMF of X and Y. (b) Plot the marginal PMF of X and Y. (c) If only excellent quality drawings are acceptable (i.e., Y = 3), plot the conditional PMF of X. (d) Determine the Cov(X,Y) and the corresponding correlation coefficient between X and Y.

Table 3.3 Time and Quality Information on Engineering Drawings

          X = 70    X = 80    X = 90    X = 100
Y = 1       15         8         3         2
Y = 2        3         4         6        12
Y = 3        5         8        12        22

SOLUTION (a) To plot the joint PMF of X and Y, the information can be rearranged as shown below.

X      Y    No. of observations    Relative frequencies
70     1    15                     0.15
80     1     8                     0.08
90     1     3                     0.03
100    1     2                     0.02
70     2     3                     0.03
80     2     4                     0.04
90     2     6                     0.06
100    2    12                     0.12
70     3     5                     0.05
80     3     8                     0.08
90     3    12                     0.12
100    3    22                     0.22


The joint PMF of X and Y is shown in Figure 3.6. (b) Using Equation 3.34, we can calculate the marginal PMF of X as

$p_X(70) = 0.15 + 0.03 + 0.05 = 0.23$
$p_X(80) = 0.08 + 0.04 + 0.08 = 0.20$
$p_X(90) = 0.03 + 0.06 + 0.12 = 0.21$
$p_X(100) = 0.02 + 0.12 + 0.22 = 0.36.$

The marginal PMF of X is plotted in Figure 3.8a. Similarly,

$p_Y(1) = 0.15 + 0.08 + 0.03 + 0.02 = 0.28$
$p_Y(2) = 0.03 + 0.04 + 0.06 + 0.12 = 0.25$
$p_Y(3) = 0.05 + 0.08 + 0.12 + 0.22 = 0.47.$

The marginal PMF of Y is plotted in Figure 3.8b. (c) Using Equation 3.28, we can show the conditional PMF of X, given Y = 3, to be

$$p_{X|Y}(x \mid 3) = \frac{p_{X,Y}(x, 3)}{p_Y(3)}.$$

Thus,

$p_{X|Y}(70 \mid 3) = 0.05/0.47 = 0.11$
$p_{X|Y}(80 \mid 3) = 0.08/0.47 = 0.17$
$p_{X|Y}(90 \mid 3) = 0.12/0.47 = 0.25$
$p_{X|Y}(100 \mid 3) = 0.22/0.47 = 0.47.$

The conditional PMF of X is shown in Figure 3.8c. (d) Cov(X,Y) and the correlation coefficient $\rho_{X,Y}$ can be estimated by using Equations 3.36 and 3.39, respectively. The required information can be calculated as follows:

$E(X) = 70 \times 0.23 + 80 \times 0.20 + 90 \times 0.21 + 100 \times 0.36 = 87$
$\mathrm{Var}(X) = (70 - 87)^2 \times 0.23 + (80 - 87)^2 \times 0.20 + (90 - 87)^2 \times 0.21 + (100 - 87)^2 \times 0.36 = 139$
$\sigma_X = 11.79.$

Similarly,

$E(Y) = 1 \times 0.28 + 2 \times 0.25 + 3 \times 0.47 = 2.19$
$\mathrm{Var}(Y) = (1 - 2.19)^2 \times 0.28 + (2 - 2.19)^2 \times 0.25 + (3 - 2.19)^2 \times 0.47 = 0.7139$
$\sigma_Y = 0.845$

[Figure 3.8 Marginal PMF of Time and Quality. Panels: (a) Marginal PMF of X; (b) Marginal PMF of Y; (c) Conditional PMF of X | Y = 3]

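$E(XY) = 70 \times 1 \times 0.15 + 80 \times 1 \times 0.08 + 90 \times 1 \times 0.03 + 100 \times 1 \times 0.02 + 70 \times 2 \times 0.03 + 80 \times 2 \times 0.04 + 90 \times 2 \times 0.06 + 100 \times 2 \times 0.12 + 70 \times 3 \times 0.05 + 80 \times 3 \times 0.08 + 90 \times 3 \times 0.12 + 100 \times 3 \times 0.22 = 195.1$

$\mathrm{Cov}(X,Y) = 195.1 - 87 \times 2.19 = 4.57$

$$\rho_{X,Y} = \frac{4.57}{11.79 \times 0.845} = +0.46.$$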
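The hand calculations of Example 3.7 can be reproduced with a small NumPy sketch. The array layout (rows for Y = 1, 2, 3 and columns for X = 70, 80, 90, 100) and all variable names are choices made here for illustration; the counts are those of Table 3.3.

```python
import numpy as np

x_vals = np.array([70, 80, 90, 100])
y_vals = np.array([1, 2, 3])

# Observation counts from Table 3.3: rows are Y = 1, 2, 3; columns are X values.
counts = np.array([[15, 8, 3, 2],
                   [3, 4, 6, 12],
                   [5, 8, 12, 22]])
joint = counts / counts.sum()           # joint PMF p_{X,Y}(x, y)

p_x = joint.sum(axis=0)                 # Equation 3.34: sum over all y
p_y = joint.sum(axis=1)                 # Equation 3.35: sum over all x
p_x_given_y3 = joint[2] / p_y[2]        # conditional PMF of X given Y = 3

mean_x = (x_vals * p_x).sum()
mean_y = (y_vals * p_y).sum()
var_x = ((x_vals - mean_x) ** 2 * p_x).sum()
var_y = ((y_vals - mean_y) ** 2 * p_y).sum()

# E(XY) as the sum of x * y * p_{X,Y}(x, y) over every (x, y) pair.
e_xy = (np.outer(y_vals, x_vals) * joint).sum()
cov_xy = e_xy - mean_x * mean_y         # Equation 3.36
rho = cov_xy / np.sqrt(var_x * var_y)   # Equation 3.39

print("p_X:", p_x, "p_Y:", p_y)
print("p_X|Y=3:", np.round(p_x_given_y3, 3))
print(f"Cov = {cov_xy:.2f}, rho = {rho:.2f}")
```

Working from the raw counts rather than the rounded intermediate values gives the same covariance and correlation coefficient to the precision quoted in the solution.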


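EXAMPLE 3.8 The joint density function of two random variables X and Y can be represented as

$$f_{X,Y}(x,y) = c(x^2 - 4)(y^2 - 9), \quad 0 < x < 2 \ \text{and} \ 0 < y < 3$$

[...]

$$f_{X|Y}(x \mid Y = 2) = \frac{f_{X,Y}(x, 2)}{f_Y(2)} = f_X(x) = -\frac{3}{16}(x^2 - 4)$$

$$P(X > 1 \mid Y = 2) = -\frac{3}{16}\int_1^2 (x^2 - 4)\,dx = -\frac{3}{16}\left[\frac{x^3}{3} - 4x\right]_1^2 = 0.3125.$$

(e2)

$$F_{X,Y}(1, 3) = \frac{1}{96}\int_0^1 (x^2 - 4)\,dx \int_0^3 (y^2 - 9)\,dy = 0.6875.$$

3.4.5 Multivariate Distributions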

In general, the explicit consideration of multivariate distributions is mathematically cumbersome. However, standard procedures as outlined in Section 3.4.1 can still be used if the joint PDF of a multivariate distribution is known. Suppose X and Y are jointly normally distributed. The normal distribution will be discussed in detail in Chapter 4. To define the PDF of this bivariate normal distribution requires five parameters, namely, the mean values of X and Y, $\mu_X$ and $\mu_Y$, their standard deviations $\sigma_X$ and $\sigma_Y$, and the correlation coefficient $\rho_{X,Y}$. The PDF of the bivariate normal distribution can be expressed as

$$f_{X,Y}(x,y) = \frac{1}{2\pi\sigma_X\sigma_Y\sqrt{1-\rho_{X,Y}^2}} \exp\left\{-\frac{1}{2(1-\rho_{X,Y}^2)}\left[\left(\frac{x-\mu_X}{\sigma_X}\right)^2 - 2\rho_{X,Y}\frac{(x-\mu_X)(y-\mu_Y)}{\sigma_X\sigma_Y} + \left(\frac{y-\mu_Y}{\sigma_Y}\right)^2\right]\right\}, \quad -\infty < x < \infty, \ -\infty < y < \infty. \tag{3.40}$$

The PDF of a multivariate normal distribution is more complicated.
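To make the role of the five parameters concrete, the bivariate normal PDF of Equation 3.40 can be coded directly. This is a minimal sketch: the function names and the test point are illustrative, and `normal_pdf` is the standard single-variable normal density, included here only to check that the joint PDF factors into the product of the marginals when the correlation coefficient is zero.

```python
import math


def bivariate_normal_pdf(x, y, mu_x, mu_y, sigma_x, sigma_y, rho):
    """Evaluate Equation 3.40 at the point (x, y)."""
    if sigma_x <= 0 or sigma_y <= 0 or not -1.0 < rho < 1.0:
        raise ValueError("need sigma_x > 0, sigma_y > 0 and -1 < rho < 1")
    zx = (x - mu_x) / sigma_x
    zy = (y - mu_y) / sigma_y
    quad = zx * zx - 2.0 * rho * zx * zy + zy * zy
    norm = 2.0 * math.pi * sigma_x * sigma_y * math.sqrt(1.0 - rho * rho)
    return math.exp(-quad / (2.0 * (1.0 - rho * rho))) / norm


def normal_pdf(x, mu, sigma):
    """Standard single-variable normal density, used only for comparison."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


# Peak value for zero means, unit standard deviations and rho = 0.
peak = bivariate_normal_pdf(0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0)

# With rho = 0 the joint PDF equals the product of the two marginal PDFs.
joint = bivariate_normal_pdf(1.0, -0.5, 0.0, 0.0, 1.0, 2.0, 0.0)
product = normal_pdf(1.0, 0.0, 1.0) * normal_pdf(-0.5, 0.0, 2.0)
print(peak, joint, product)
```

The guard on `rho` reflects the factor $1 - \rho_{X,Y}^2$ in the denominator, which vanishes when the two variables are perfectly correlated.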

3.5 CONCLUDING REMARKS

Modeling and quantifying uncertainties in random variables are the initial and essential steps in any risk-based analysis and design. Collecting data and extracting information from the data in terms of many descriptors are introduced in this chapter. Continuous and discrete random variables are considered. Modeling of multiple random variables and their correlation or dependence on each other are presented. The information presented here is expected to provide sufficient background in modeling and quantifying uncertainties in random variables.

3.6 PROBLEMS

3.1 In an examination for a class of 30 students, the following scores were obtained: 99, 45, 60, 80, 95, 100, 95, 91, 85, 87, 77, 75, 61, 71, 85, 88, 83, 85, 79, 81, 82, 55, 63, 75, 82, 88, 77, 78, 41, and 70. (a) Draw the histogram for the data. (b) Draw the frequency diagram for the data. (c) Calculate the mean, variance, standard deviation, coefficient of variation, skewness, and skewness coefficient for the test scores. (d) Assume that a student must score at least 85 to receive an A grade. What is the probability that any student in the class will receive an A, using the actual scores only? What will be the corresponding probability if the frequency diagram is used instead?

3.2 The annual precipitation in inches per year during the past 30 years in Tucson, Arizona, is as follows: 11.60, 7.19, 12.69, 11.86, 14.81, 8.07, 11.15, 8.00, 9.55, 11.02, 19.54, 8.63, 12.33, 8.53, 16.55, 19.74, 18.40, 11.37, 10.55, 8.68, 9.62, 6.93, 14.80, 10.64, 14.76, 15.19, 14.56, 9.68, 11.13, and 4.35. (a) Draw the histogram for the data. (b) Draw the frequency diagram for the data. (c) Calculate the mean, variance, standard deviation, coefficient of variation, skewness, and skewness coefficient for the precipitation. (d) Calculate the probability that the annual precipitation in Tucson will exceed 12 in./yr, first using the actual data and then using the frequency diagram.

3.3 The traveling time from the office to the nearest airport may be 0.5, 1.0, 1.5, 2.0, 2.5, or 3.0 hours depending upon the time of travel. The corresponding PMFs are shown in Figure P3.3. Calculate the following information on the travel time:

[Figure P3.3 PMFs of Travel Time. Horizontal axis: T (hours), with values 0.5, 1.0, 1.5, 2.0, 2.5, and 3.0; PMF values shown: 0.30, 0.25, 0.20, 0.10, 0.10, 0.05]

(a) The mean. (b) The variance, standard deviation, and coefficient of variation. (c) The skewness and skewness coefficient.

3.4 In order to bid for a nonstandard construction job, an engineer needs to estimate the duration, D, of the project. Since no prior information on similar jobs is available, the engineer estimates that it may take 10 to 20 days. Suppose the PDF of D can be defined by a uniform distribution between 10 and 20 days. (a) Define the PDF of D. (b) Define the CDF of D. (c) Calculate the mean of D. (d) Calculate the variance, standard deviation, and coefficient of variation of D. (e) Calculate the skewness and skewness coefficient of D. (f) Calculate the modal and median values of D.

3.5 The error X in a measurement is modeled with a probability density function in the shape of a cosine curve:

$f_X(x) = C \cos$ [...], $-E_0$ [...] 3 | X = 1) (ii) $F_{X,Y}(1, 3)$

3.12 The joint probability density function of two random variables X and Y can be represented as

$f_{X,Y}(x,y) = c\,e^{x+y}$, 0 [...]
