Math1041 Study Notes for UNSW
April 23, 2017 | Author: Oliver | Category: N/A
Short Description
Study notes for Statistics for Life and Social Sciences.
Description
Stats Study Notes
Oliver Bogdanovski

Graphical and Numerical Summaries

Statistic - a summary of data (which are measures of events)
Field of Statistics - collecting, analysing and understanding data measured with uncertainty

When choosing which graph:
  Quantitative
    1 variable: histogram (columns, no gaps), box plot
    2 variables: scatterplot
  Categorical
    1 variable: bar graph (with gaps)
    2 variables: clustered bar chart (two side-by-side charts on the same scale), jittered scatterplot
  One of each
    2 variables: comparative boxplot, comparative histogram (side-by-side, same scale)

When looking at a graph, observe:
  - Location - where most of the data is (similar to the mode, also the mean/median)
  - Spread - variability (width of the bulky part)
  - Shape - symmetric, left-skewed or right-skewed (skewed = the direction it is pulled away from symmetry)
  - Unusual observations

When choosing which method of numerical summary:
  - One categorical variable: table of frequencies/percentages
  - One quantitative variable:
    o Location
        Mean: $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$
        Median: with the values sorted as $x_{(1)} \le \dots \le x_{(n)}$,
          if n (number of values) is odd, $M = x_{((n+1)/2)}$
          if n is even, $M = \frac{1}{2}\left(x_{(n/2)} + x_{(n/2+1)}\right)$
    o Spread
        Standard deviation: $s = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2}$
        Interquartile range: $IQR = Q_3 - Q_1$ (Q1 and Q3 are calculated as the medians of the bottom and top halves of the data)

Five number summary: (Min, Q1, M, Q3, Max). This is the data shown in a box plot, except that the tails (whiskers) of a box plot may exclude outliers. The whisker ends are found by extending 1.5 × IQR outwards from Q1 and Q3 (down to $Q_1 - 1.5\,IQR$ and up to $Q_3 + 1.5\,IQR$), then picking the furthest data points within this range. Points outside the range are outliers and are marked with a ○.

Transformations

Linear transformations change the units of x to x_new, for example time (min→h), length (km→mi) and temperature (°C→°F), altering location and spread, but not shape. They are given by the equation:
  $x_{new} = a + bx$
Measures of location follow the same equation:
  $\bar{x}_{new} = a + b\bar{x}$
  $M_{new} = a + bM$
Measures of spread are only affected by b, through its size and not its sign:
  $s_{new} = |b|\,s$
  $IQR_{new} = |b|\,IQR$

Non-linear transformations change shape, and are good for correcting skewed data and working with outliers.

To pull in the right tail (right-skewed data) use log(x) [preferred], $x^{1/4}$ or $x^{1/2}$ (from strongest to weakest). These are monotonically increasing (they keep everything in order). The base of the log only affects the scale, not the shape, so switching to a different base will not make the data any more symmetrical. Because log(ab) = log(a) + log(b), logs turn multiplicative relationships into additive ones.

To pull in the left tail (left-skewed data), work with -x, which is right-skewed, then continue as for right-skewed data (e.g. log(-x), which is only defined when every x is negative).

If dealing with zeros in right-skewed data, use log(x+1).

To stretch the proportions of data where 0
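The summaries and transformations in these notes can be tried out with a short Python sketch. Everything specific here is an assumption made for illustration and is not part of the notes: the waiting-time data are made up, the function names are hypothetical, the quartiles leave the middle value out of both halves when n is odd (the notes do not say which convention to use), and `skew_gap` is only a rough indicator of skewness.

```python
import math
import statistics


def five_number_summary(values):
    """(Min, Q1, M, Q3, Max) for at least two values.

    Assumption: when n is odd, the middle value is left out of both halves.
    """
    xs = sorted(values)
    half = len(xs) // 2
    q1 = statistics.median(xs[:half])   # median of the bottom half
    q3 = statistics.median(xs[-half:])  # median of the top half
    return xs[0], q1, statistics.median(xs), q3, xs[-1]


def whiskers_and_outliers(values):
    """Box plot whisker ends, plus the points beyond the 1.5*IQR range."""
    xs = sorted(values)
    _, q1, _, q3, _ = five_number_summary(xs)
    reach = 1.5 * (q3 - q1)
    inside = [x for x in xs if q1 - reach <= x <= q3 + reach]
    outliers = [x for x in xs if x < q1 - reach or x > q3 + reach]
    return inside[0], inside[-1], outliers


def linear_transform(values, a, b):
    """Apply x_new = a + b*x to every value."""
    return [a + b * x for x in values]


def skew_gap(values):
    """Rough skewness indicator (not from the notes): (mean - median) / s."""
    gap = statistics.mean(values) - statistics.median(values)
    return gap / statistics.stdev(values)


# Made-up, right-skewed waiting times in minutes.
minutes = [2, 3, 4, 5, 6, 7, 8, 9, 40]
print(five_number_summary(minutes))    # (2, 3.5, 6, 8.5, 40)
print(whiskers_and_outliers(minutes))  # (2, 9, [40])

# Linear transformation: minutes to hours is a = 0, b = 1/60.
hours = linear_transform(minutes, 0, 1 / 60)
print(statistics.mean(hours), statistics.mean(minutes) / 60)    # location follows a + b*x
print(statistics.stdev(hours), statistics.stdev(minutes) / 60)  # spread scales by |b|

# A negative b flips the data but leaves the spread non-negative: s_new = |b| * s.
flipped = linear_transform(minutes, 0, -1)
print(statistics.stdev(flipped), statistics.stdev(minutes))

# Non-linear transformation: log pulls in the right tail.
logged = [math.log(x) for x in minutes]
print(skew_gap(minutes), skew_gap(logged))  # the second value is closer to 0

# Changing the base only rescales: log10(x) = ln(x) / ln(10).
print(math.log10(40), math.log(40) / math.log(10))

# log(x + 1) keeps zeros usable, since log(0) is undefined.
print(math.log1p(0))  # 0.0
```

The value 40 is flagged as an outlier because it lies above Q3 + 1.5 × IQR = 8.5 + 7.5 = 16, so the upper whisker stops at 9. Other quartile conventions (for example the ones built into statistical software) can give slightly different Q1 and Q3 for the same data.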