GE 210 Lecture 33 (Experimental Design)
GE 210—Probability and Statistics December 3, 2012 Lecture 33
Today
• Experimental Design
Purpose of Experimental Design
• Engineers are always conducting experiments
  – Improving process yield
  – Reducing variability of output
  – Reduce design and development time
  – Reduce cost of operation
• Proper statistical methods are essential to good experimentation
• Poorly designed experiments produce inconclusive or incorrect results and use valuable resources ineffectively
Experimental Design
• You have been asked to determine the biogas production potential of several types of feedstocks (slaughterhouse waste, dairy manure, distiller’s mash)
  – Want to know if the type of feedstock has a significant effect on biogas volumes
  – If there is a significant effect, which feedstock has the highest biogas volume?
• How do you do it?
Feedstock biogas production
L of biogas/L of feedstock per day
Feedstock 1   Feedstock 2   Feedstock 3
1.51          2.29          1.65
L of biogas/L of feedstock per day
          Feedstock 1   Feedstock 2   Feedstock 3
          1.5           2.1           0.95
          1.1           3.9           1.8
          1.2           0.8           1.7
          1.8           3.4           2.1
          1.7           0.8           1.9
          1.4           1.4           0.8
          1.9           3.6           2.3
average   1.51          2.29          1.65
stdev     0.30          1.34          0.57
Replication allows the degree of variability in the data to be assessed
Experimental Design
• Use what we know about analyzing the data to “work backwards” and collect the data properly
  – Can select the number of replicates required to satisfy the desired power of the experiment (also depends on variability of data)
  – Quite often, the number of replicates is constrained by time/cost
• Using t-tests to test for differences is very cumbersome!
  – Need 3 sets of hypotheses
    • Ho: μ1 = μ2
    • Ho: μ1 = μ3
    • Ho: μ2 = μ3
Experimental Design
• Instead, use ANOVA to test significance of “treatment”
  – In this case, the “treatment” is the type of feedstock
  – If the effect of the treatment is significant (P value is less than α), then a “means separation” will tell you exactly which treatment is different from the others
  – Ho: there are no differences
  – HA: there are differences
ANOVA Output (Minitab)
General Linear Model: Biogas versus Feedstock
Factor     Type   Levels  Values
Feedstock  fixed  3       1, 2, 3
Analysis of Variance for Volume, using Adjusted SS for Tests
Source     DF  Seq SS   Adj SS   Adj MS  F     P
Feedstock   2   2.3745   2.3745  1.1873  1.61  0.228
Error      18  13.2821  13.2821  0.7379
Total      20  15.6567
Since P > α, do not reject Ho; there is no evidence to support that the type of feedstock has a significant effect on biogas volumes
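The F statistic in the Minitab output can be reproduced by hand from the replicate data in the feedstock table. The following is a minimal sketch using only the Python standard library; the function name `one_way_anova` and the dictionary layout are illustrative assumptions, and the P value is not computed here because it needs the F distribution.

```python
from statistics import mean

# Replicate data from the feedstock table (L of biogas/L of feedstock per day)
data = {
    "Feedstock 1": [1.5, 1.1, 1.2, 1.8, 1.7, 1.4, 1.9],
    "Feedstock 2": [2.1, 3.9, 0.8, 3.4, 0.8, 1.4, 3.6],
    "Feedstock 3": [0.95, 1.8, 1.7, 2.1, 1.9, 0.8, 2.3],
}


def one_way_anova(groups):
    """Return sums of squares, degrees of freedom and F for a one-way ANOVA."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Between-treatment variation: group means around the grand mean
    ss_treatment = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-treatment variation: observations around their own group mean
    ss_error = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_treatment = len(groups) - 1
    df_error = len(all_values) - len(groups)
    f_stat = (ss_treatment / df_treatment) / (ss_error / df_error)
    return {
        "ss_treatment": ss_treatment,
        "ss_error": ss_error,
        "df_treatment": df_treatment,
        "df_error": df_error,
        "F": f_stat,
    }


result = one_way_anova(list(data.values()))
print(result)
```

The sums of squares, degrees of freedom and F ratio should agree with the Feedstock and Error rows of the Minitab table to within rounding.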
Experimental Design
• You have been asked to optimize the operating parameters (temperature and dry matter content) to improve biogas production from a specific feedstock
  – Temperature can range from 35˚C to 55˚C
  – Dry matter content can range from 3 to 10%
• How do you do it?
Biogas Production by Varying Parameters
• Change factors one at a time
  – Set the temperature to 40˚C, vary dry matter content and measure biogas production
    • 10% dry matter produces highest volume
  – Set the dry matter to 10%, vary temperature and measure biogas production
    • 35˚C produces the highest volume
• Conclude that 10% dry matter content and 35˚C are the optimum operating conditions
Very poor statistical practice and experimental design…
Biogas production by varying parameters
             Temperature  DM Content  L of biogas/L of feedstock per day
Feedstock 1  35           5           1.8
             40           10          2.1
             50           15          0.7
Feedstock 2  30           7           2.4
             45           12          2.4
             55           18          2.4
Feedstock 3  35           4           1.1
             40           7           3.6
             55           9           4.4
Very poor statistical practice and experimental design…
Biogas production by varying parameters
• By “haphazardly” varying temperature and dry matter contents, it is difficult to assess whether differences in biogas production are due to temperature or dry matter content or both
• A better solution is to use a factorial experimental design
Factorial Experimental Design
• When several factors are of interest in an experiment, a factorial experiment should be used
  – In each complete replicate of the experiment, all possible combinations of the levels of the factors are investigated
  – Each feedstock should be digested at each temperature and each DM content for valid comparisons
    • 3 feedstocks x 3 DM contents x 3 temperatures equals 27 experiments (x 3 reps = 81 trials)
  – The effect of each factor can then be separately assessed
  – Potential interactions of factors can also be assessed
• Factorial experiments are the only way to discover interactions between variables
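The run count for a full factorial design can be checked by enumerating every combination of factor levels. This is a small sketch, assuming the level values that appear in the Minitab output later in the lecture (temperatures 25, 30, 40 and DM contents 3, 5, 10); the variable names are illustrative.

```python
from itertools import product

feedstocks = [1, 2, 3]
temperatures_c = [25, 30, 40]   # assumed levels, taken from the later ANOVA output
dm_percent = [3, 5, 10]         # assumed levels, taken from the later ANOVA output
replicates = 3

# One complete replicate: every combination of the factor levels
runs = list(product(feedstocks, temperatures_c, dm_percent))

# Full experiment: each combination repeated for every replicate
trials = [(f, t, dm, rep)
          for (f, t, dm) in runs
          for rep in range(1, replicates + 1)]

print(len(runs), "combinations,", len(trials), "trials")
```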
Interactions
• When the difference in response between the levels of one factor is not the same at all levels of the other factors, there is an interaction between factors
• When an interaction is significant, the corresponding main effects have very little practical meaning
[Figures: No interaction between factors; Interaction between factors]
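The definition of an interaction can be illustrated with a two-factor, two-level table of cell means: if the effect of factor A is the same at both levels of factor B, the "difference of differences" is zero. The sketch below uses made-up cell means purely for illustration; none of the numbers come from the lecture data.

```python
def interaction_contrast(cell_means):
    """Difference between the effect of A at high B and the effect of A at low B.

    cell_means maps (level_of_A, level_of_B) to the mean response.
    """
    effect_a_at_b_low = cell_means[("high", "low")] - cell_means[("low", "low")]
    effect_a_at_b_high = cell_means[("high", "high")] - cell_means[("low", "high")]
    return effect_a_at_b_high - effect_a_at_b_low


# Hypothetical cell means: the two lines on an interaction plot would be parallel
no_interaction = {
    ("low", "low"): 1.0, ("high", "low"): 2.0,
    ("low", "high"): 3.0, ("high", "high"): 4.0,
}

# Hypothetical cell means: the effect of A reverses at high B, so the lines cross
with_interaction = {
    ("low", "low"): 1.0, ("high", "low"): 2.0,
    ("low", "high"): 3.0, ("high", "high"): 2.0,
}

parallel = interaction_contrast(no_interaction)
crossing = interaction_contrast(with_interaction)
print(parallel, crossing)
```

A contrast of zero in the sample means only suggests no interaction; whether a non-zero contrast is significant still has to be judged against the error variance, which is what the ANOVA interaction terms do.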
Biogas production by varying parameters
Feedstock  Temperature  DM Content  L of biogas/L of feedstock per day (average of 3 reps)
1          25           3           1.8
1          25           5           2.1
1          25           10          1.5
1          30           3           2.1
1          30           5           2.1
1          30           10          0.75
1          40           3           1.9
1          40           5           2.3
1          40           10          1.7
Feedstocks 2 and 3 will have the same tables of data. Now we can analyze the effect of three “treatments”: feedstock type, temperature and dry matter content.
We can also determine if there are significant interactions among the treatments.
Factorial Experimental Design
• “Are the results significant?”
• Cannot make that assessment without a measure of the variability of data
             Average  StdDev
Feedstock 1  1.8      0.47
Feedstock 2  2.9      0.62
Feedstock 3  2.1      0.79
25 degrees   2.0      0.49
30 degrees   2.4      1.00
40 degrees   2.4      0.75
3% DM        2.3      0.69
5% DM        2.6      0.62
10% DM       1.8      0.85
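As a check on how the summary rows are built, the Feedstock 1 row appears to be the average and sample standard deviation of the nine Feedstock 1 cell means tabulated earlier in the lecture. A minimal sketch, assuming that interpretation:

```python
from statistics import mean, stdev

# Feedstock 1 cell means (each the average of 3 reps), in table order
feedstock_1 = [1.8, 2.1, 1.5, 2.1, 2.1, 0.75, 1.9, 2.3, 1.7]

avg = mean(feedstock_1)
sd = stdev(feedstock_1)  # sample standard deviation (n - 1 in the denominator)

print(f"average = {avg:.2f}, stdev = {sd:.2f}")
```

The temperature and DM rows would be computed the same way, but they pool values across all three feedstocks, and the Feedstock 2 and 3 tables are not reproduced in the lecture notes.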
ANOVA Output (Minitab)
General Linear Model: Biogas versus Feedstock, Temp, DM
Factor     Type   Levels  Values
Feedstock  fixed  3       1, 2, 3
Temp       fixed  3       25, 30, 40
DM         fixed  3       3, 5, 10
Analysis of Variance for Biogas, using Adjusted SS for Tests
Source          DF  Seq SS   Adj SS  Adj MS  F     P
Feedstock        2   5.7535  5.7535  2.8768  9.13  0.009
Temp             2   1.0591  1.0591  0.5295  1.68  0.246
DM               2   2.8846  2.8846  1.4423  4.58  0.047
Feedstock*DM     4   0.3415  0.3415  0.0854  0.27  0.889
Feedstock*Temp   4   0.9670  0.9670  0.2418  0.77  0.575
Temp*DM          4   2.0293  2.0293  0.5073  1.61  0.262
Error            8   2.5196  2.5196  0.3150
Total           26  15.5546
Interactions not significant
Since P is < α for both feedstock and DM, conclude that type of feedstock and DM content have a significant effect on biogas production
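Each F value in the Minitab table is the adjusted mean square of that source divided by the error mean square. The sketch below recomputes the ratios from the printed Adj MS column; the dictionary name `adj_ms` is an illustrative choice, and small differences from the printed F values are expected because the inputs are already rounded.

```python
# Adjusted mean squares copied from the Minitab ANOVA table
adj_ms = {
    "Feedstock": 2.8768,
    "Temp": 0.5295,
    "DM": 1.4423,
    "Feedstock*DM": 0.0854,
    "Feedstock*Temp": 0.2418,
    "Temp*DM": 0.5073,
}
ms_error = 0.3150  # Error row, Adj MS

# F = (mean square of the source) / (mean square of the error)
f_ratios = {source: ms / ms_error for source, ms in adj_ms.items()}

for source, f in f_ratios.items():
    print(f"{source:15s} F = {f:.2f}")
```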
Main Effects Plots
[Figure: Main Effects Plot for Biogas (data means); panels: Feedstock (1, 2, 3) and DM (3, 5, 10); y-axis: Mean]
Interactions Plot
[Figure: Interaction Plot for Biogas (data means); x-axis: DM (3, 5, 10); lines: Feedstock 1, 2, 3; y-axis: Mean]
Means Separation for Feedstock
Tukey Simultaneous Tests
Response Variable Biogas
All Pairwise Comparisons among Levels of Feedstock
Feedstock = 1 subtracted from:
Feedstock  Difference of Means  SE of Difference  T-Value  Adjusted P-Value
2           1.0833              0.2646             4.0949  0.0086
3           0.2611              0.2646             0.9870  0.6048
Feedstock = 2 subtracted from:
Feedstock  Difference of Means  SE of Difference  T-Value  Adjusted P-Value
3          -0.8222              0.2646            -3.108   0.0347
Pairwise t-tests!
Feedstock 1 is different from 2, but not from 3. Feedstock 2 is different from 3.
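The SE and T-Value columns of the Tukey output can be recovered from the error mean square of the factorial ANOVA. This is a sketch under two assumptions: each feedstock mean is based on 9 observations (27 runs over 3 feedstocks), and the standard error of a difference in a balanced design is sqrt(MSE × (1/n + 1/n)). The adjusted P-values are not computed because they require the studentized range distribution.

```python
from math import sqrt

ms_error = 0.3150      # Error Adj MS from the factorial ANOVA table
n_per_level = 9        # assumption: 27 runs / 3 feedstock levels

# Standard error of the difference between two level means (balanced design)
se_diff = sqrt(ms_error * (1 / n_per_level + 1 / n_per_level))

# Differences of means copied from the Tukey output
differences = {"2 - 1": 1.0833, "3 - 1": 0.2611, "3 - 2": -0.8222}
t_values = {pair: diff / se_diff for pair, diff in differences.items()}

print(f"SE of difference = {se_diff:.4f}")
for pair, t in t_values.items():
    print(f"Feedstock {pair}: T = {t:.3f}")
```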
Means Separation for Dry Matter Content
Tukey Simultaneous Tests
Response Variable Biogas
All Pairwise Comparisons among Levels of DM
DM = 3 subtracted from:
DM  Difference of Means  SE of Difference  T-Value  Adjusted P-Value
5    0.3111              0.2646             1.176   0.4986
10  -0.4833              0.2646            -1.827   0.2217
DM = 5 subtracted from:
DM  Difference of Means  SE of Difference  T-Value  Adjusted P-Value
10  -0.7944              0.2646            -3.003   0.0404
Systematic Error and Randomization
• You are testing a total of 18 o-rings, half from the old production line and half from the new production line, to determine if the new line produces o-rings of higher strength
• Would you take 9 consecutive o-rings from the old production line, test them, then take 9 consecutive o-rings from the new production line and test them?
  – Systematic error can affect results
  – O-rings produced at beginning of the day may be different than o-rings produced later in the day
• For a chemical experiment, will need to use 2 bottles of chemical to complete experiment, but pH of one bottle may be slightly different
  – Randomize use of two bottles among treatments to avoid systematic error
• For a bench-scale digester experiment, you have 12 reactors in a circulating warm water bath, but one end of the bath may have a slightly different temperature
  – Randomize the ordering of the treatments in the reactors to avoid systematic error
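Randomizing the reactor layout can be done with a seeded shuffle so that the assignment is random but reproducible. The sketch below assumes, purely for illustration, that the 12 reactors hold 4 treatments with 3 replicates each; the treatment labels and the function name are hypothetical.

```python
import random


def randomize_positions(treatments, reps, seed=None):
    """Return a random ordering of treatments across reactor positions."""
    rng = random.Random(seed)  # seeded so the layout can be reproduced later
    layout = [t for t in treatments for _ in range(reps)]
    rng.shuffle(layout)
    return layout


# Hypothetical example: 4 treatments x 3 replicates = 12 reactors
layout = randomize_positions(["A", "B", "C", "D"], reps=3, seed=210)

for position, treatment in enumerate(layout, start=1):
    print(f"reactor {position:2d}: treatment {treatment}")
```

Recording the seed alongside the lab notes lets the same layout be regenerated when the results are analyzed.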
Systematic Error and Blocking
• Another way to avoid systematic error is by “blocking” the treatments together
  – Commonly used in experiments involving plots of soil
  – Even a small area of land (1 ha) can have variations in the soil that may affect results (growth rate, yield, nutrient uptake, etc.)
  – Area is broken up into smaller blocks and 1 replicate of each combination of treatments is placed in each block
  – The “block” number can be treated as a main effect during analysis to see if the location of the block had a significant effect on the response variable(s)
Factorial Experiments
• Pros
  – Simple and robust analysis
  – Can analyze effects of interactions
• Cons
  – Requires excessive time and resources
  – Experiment can get impossibly large very quickly
2^k Factorial Design
• To study the effect of several factors on a response
• Can also investigate effect of interactions of factors
• Each factor has 2 levels
  – Quantitative (continuous) measures of temperature, pressure, time
  – Qualitative (discrete) measures like “high” and “low” or “old” and “new”, etc.
• A complete replicate of such a design requires 2 x 2 x 2 x … x 2 = 2^k experiments and is called a 2^k factorial design
  – If the effect of the treatment is significant, do not need a means separation to determine which one is different because there are only 2 levels and they must be different from each other!
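The runs of a 2^k design can be listed by taking every combination of a low (-1) and high (+1) level for each of the k factors. A minimal sketch, with the factor names chosen from the examples on the slide and the -1/+1 coding assumed as a common convention rather than something the lecture specifies:

```python
from itertools import product


def two_level_design(factor_names):
    """Return all 2^k runs, coding each factor as -1 (low) or +1 (high)."""
    k = len(factor_names)
    return [dict(zip(factor_names, levels))
            for levels in product((-1, +1), repeat=k)]


design = two_level_design(["temperature", "pressure", "time"])

print(len(design), "runs")
for run in design:
    print(run)
```

Adding a fourth factor doubles the list to 16 runs, which shows how quickly even two-level experiments grow.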
Experimental Design Considerations
• Control
  – Allows for identification of effects between treatments with the effect of the treatment itself
• Placebo
  – A type of control used extensively in drug studies since the psychological effect of taking a drug may impact your response to it
• Blinding
  – Removes bias of subjects and experimenters
• Confounding
  – When potential interactions are not specifically identified, effects can become confounded (and results are not as conclusive as they could be)
  – Ex: if a new cure for the common cold was administered to men only, and a placebo to women only, no way to investigate the interaction between gender and drug so gender and drug are confounded
Experimental Design Considerations
• Randomization
  – Helps remove some of the bias due to systematic error
• Blocking
  – A specific type of randomization that accounts for a known source of variation
• Replication vs duplication
  – Replication is required to get a measure of variation
  – In true replication, the factors are simulated and repeated x times
  – In duplication, x samples are taken from the same simulation of factors (not truly replicated!)
Example: Blocking
• A researcher is carrying out a study of the effectiveness of four different skin creams for the treatment of a certain skin disease. He has eighty subjects and plans to divide them into 4 treatment groups of twenty subjects each. Using a randomized block design, the subjects are assessed and put in blocks of four according to how severe their skin condition is; the four most severe cases are the first block, the next four most severe cases are the second block, and so on to the twentieth block. The four members of each block are then randomly assigned, one to each of the four treatment groups.
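The assignment procedure in this example can be sketched in a few lines: sort subjects by severity, cut the list into blocks of four, and shuffle the treatments within each block. The subject IDs, cream labels and function name below are hypothetical placeholders.

```python
import random


def randomized_blocks(subjects_by_severity, treatments, seed=None):
    """Assign one subject per block to each treatment, at random within blocks."""
    rng = random.Random(seed)
    block_size = len(treatments)
    groups = {t: [] for t in treatments}
    for start in range(0, len(subjects_by_severity), block_size):
        block = subjects_by_severity[start:start + block_size]
        order = list(treatments)
        rng.shuffle(order)  # random treatment order within this block only
        for subject, treatment in zip(block, order):
            groups[treatment].append(subject)
    return groups


# Hypothetical subject IDs, already ranked: 1 = most severe, 80 = least severe
subjects = list(range(1, 81))
creams = ["cream A", "cream B", "cream C", "cream D"]

groups = randomized_blocks(subjects, creams, seed=33)
for cream, members in groups.items():
    print(cream, len(members), "subjects")
```

Because every block contributes exactly one subject to each cream, severity is balanced across the four treatment groups by construction.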
Example: Controls
• My research: emissions from manure spreading
  – Factors:
    • Type of manure (solid, liquid)
    • Species of manure (cow, pig)
    • Application method (surface, subsurface)
    • Application rate (0X, 1X, 2X, 3X)
  – Controls for application rate
    » Measuring emissions from bare soil (0X) can separate the emissions from the manure from the emissions from bare soil
  – Controls for application method
    » Disturbed and undisturbed
  – Controls for manure type
    » Water applied at 2X application rate
Example
• Set up an experiment to test the effectiveness of store brand vs brand name laundry detergent
  – Replication vs duplication
  – Blinding
  – Some standard “dirty” to be tested
  – Some qualitative measure of “clean”
The trickiness and ambiguity of experimental design and analysis…
• “Warning Signs in Experimental Design and Interpretation”
  – An online essay (unknown author)
  – http://norvig.com/experiment-design.html
  – Focused on experimental design and analysis for medical research, but the ideas are applicable to all research
Warning Signs in Misinterpretation of Results
• Lack of repeatability and reproducibility
• Ignoring publication bias
  – Experiments that don’t work out are often not published!
• Ignoring other sources of bias
• Not understanding conditional probability!
  – Pr(cancer | positive) is not the same as Pr(positive | cancer)
• Confusing correlation with causation
• Being too clever!
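The difference between Pr(positive | cancer) and Pr(cancer | positive) follows from Bayes' theorem. The sketch below uses invented numbers (1% prevalence, 90% sensitivity, 9% false positive rate) only to show how far apart the two probabilities can be; they are not taken from the lecture or from any real test.

```python
def posterior(prior, sensitivity, false_positive_rate):
    """Pr(condition | positive test) from Bayes' theorem."""
    # Total probability of a positive test, with and without the condition
    p_positive = sensitivity * prior + false_positive_rate * (1 - prior)
    return sensitivity * prior / p_positive


# Hypothetical screening test
p_positive_given_cancer = 0.90   # sensitivity (assumed)
p_cancer_given_positive = posterior(
    prior=0.01,                  # prevalence (assumed)
    sensitivity=p_positive_given_cancer,
    false_positive_rate=0.09,    # assumed
)

print(f"Pr(positive | cancer) = {p_positive_given_cancer:.2f}")
print(f"Pr(cancer | positive) = {p_cancer_given_positive:.3f}")
```

With these assumed inputs, most positive results come from the much larger group of people without the condition, so the two conditional probabilities differ by roughly a factor of ten.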