Mathematics_Notes 2016 HSC
NOTES | KEVIN LUU | 5.2 – 5.3 MATHEMATICS
ASSESSMENT
- Review types of data, collecting data, sorting data, measures of central tendency, measures of spread and displaying data.
- Interpret results from two sets of data (i.e. back-to-back stem-and-leaf displays, histograms, double column graphs or box-and-whisker plots).
- Find the range, interquartile range and standard deviation as measures of spread of data sets.
  - Find the mean and standard deviation of a set of data using digital technologies (calculators).
  - Compare and describe the spread of sets of data with the same mean but different standard deviations.
- Bivariate data: recognise the difference between dependent and independent variables.
- Describe the strength and direction of the relationship between two variables displayed in a scatter plot, e.g. strong positive relationships, weak negative relationships, with justifications.
- Use lines of best fit to predict what might happen between known data values (interpolation) and beyond known data values (extrapolation).
- Know the six processes for setting up statistical investigations.
- Identify reasons why data in a display may be misrepresented.

EXPECTATIONS
- Use measures of central tendency (mean, mode, median) and the range to analyse data that is displayed in a frequency table, stem-and-leaf plot or dot plot.
- Use the terms ‘skewed’ or ‘symmetrical’ when describing the shape of a distribution.
- Compare two sets of data and draw conclusions by finding the mean, mode and/or median, and the range of both sets.
- Construct a cumulative frequency table, histogram and polygon (ogive) for ungrouped data.
- Use cumulative frequency to find the median.
- Group data into class intervals.
- Construct a cumulative frequency table and histogram for grouped data.
- Find the mean and modal class of grouped data.
- Determine the upper and lower quartiles for a set of scores.
- Construct a box-and-whisker plot using the five number summary.
- Use a calculator to find the standard deviation of a set of scores.
- Use the mean and standard deviation to compare two sets of data.
- Compare the relative merits of measures of spread (range, interquartile range and standard deviation).

STATISTICS TERMINOLOGY
BIVARIATE DATA
- data that has two variables
BOX PLOT (BOX-AND-WHISKERS PLOT)
- a diagram obtained from the five number summary
- the box shows the middle 50% of scores (the interquartile range)
- the whiskers show the extent of the bottom and top quartiles as well as the range
CENSUS
- a survey of a whole population
CUMULATIVE FREQUENCY
- the number of scores less than or equal to a particular outcome
- e.g. for the data 3,6,5,3,5,5,4,3,3,6 the cumulative frequency of 5 is 8 (there are 8 scores of 5 or less)
CUMULATIVE FREQUENCY HISTOGRAM (AND POLYGON)
- these show the outcomes and their cumulative frequencies
DATA
- the pieces of information (or ‘scores’) to be examined
- categorical: data that uses non-numerical categories
  - ordered data involves a ranking, e.g. exam grades, garment sizes
  - distinct data has no order, e.g. colours, types of cars
- numerical: data that uses numbers to show ‘how much’
  - continuous data can have any numerical value within a range, e.g. height
  - discrete data is restricted to certain numerical values, e.g. number of pets
DOT PLOT
- a type of graph that uses one axis and a number of dots above the axis
EXTRAPOLATION
- predicting data values beyond the range of values given
FIVE NUMBER SUMMARY
- a set of numbers consisting of the minimum score, the three quartiles and the maximum score
FREQUENCY
- the number of times an outcome occurs in the data
- e.g. for the data 3,6,5,3,5,5,4,3,3,6 the outcome 5 has a frequency of 3
FREQUENCY DISTRIBUTION TABLE
- a table that shows all the possible outcomes and their frequencies (it is usually extended by adding other columns, such as the cumulative frequency)
FREQUENCY HISTOGRAM
- a type of column graph showing the outcomes and their frequencies
FREQUENCY POLYGON
- a type of line graph showing outcomes and their frequencies
- to complete the polygon, the outcomes immediately above and below the actual outcomes are used (the height of these columns is zero)
GROUPED DATA
- data that is organised into groups or classes
- class intervals: the size of the groups into which the data is organised, e.g. 1-5 (5 scores); 11-20 (10 scores)
- class centre: the middle outcome of a class, e.g. the class 1-5 has a class centre of 3
INTERPOLATION
- estimating data that lie within the domain of the values given
INTERQUARTILE RANGE
- the range of the middle 50% of scores
- the difference between the median of the upper half of scores and the median of the lower half of scores
- IQR = Q3 - Q1
LINE OF BEST FIT
- a line that ‘best fits’ the data on a scatter plot
MEAN
- the number obtained by ‘evening out’ all the scores until they are equal
- e.g. if the scores 3,6,5,3,5,5,4,3,3,6 were ‘evened out’, the number obtained would be 4.3
- to obtain the mean, we divide the sum of the scores by the total number of scores
MEDIAN
- the middle score for an odd number of scores, or the mean of the middle two scores for an even number of scores
- the median class is the class in grouped data that contains the median
MODE (MODAL CLASS)
- the outcome or class that contains the most scores
OGIVE
- this is another name for the cumulative frequency polygon
OUTCOME
- a possible value of the data
OUTLIER
- a score that is separated from the main body of scores
QUARTILES
- the points that divide the scores up into quarters
- the second quartile, Q2, divides the scores into halves (Q2 = median)
- the first quartile, Q1, is the median of the lower half of scores
- the third quartile, Q3, is the median of the upper half of scores
RANGE
- the difference between the highest and lowest scores
SAMPLE
- a part (usually a small part) of a large population
- random sample: a sample taken so that each member of the population has the same chance of being included
- systematic sample: a sample selected according to some ordering scheme, e.g. every tenth member
- stratified sample: a sample taken proportionally from each subgroup in a population
SCATTER PLOT
- a graph that uses points on a number plane to show the relationship between two variables
SHAPE (OF A DISTRIBUTION)
- a set of scores can be symmetrical or skewed
SOURCES OF DATA
- primary: the data has been collected by yourself
- secondary: the data has come from an external source, e.g. newspapers, internet
STANDARD DEVIATION
- a measure of spread that can be thought of as the average distance of scores from the mean
- the larger the standard deviation, the larger the spread
STATISTICS
- the collection, organisation and interpretation of numerical data
STEM-AND-LEAF PLOT
- a graph that shows the spread of scores without losing the identity of the data
- ordered stem-and-leaf plot: the leaves are placed in order
- back-to-back stem-and-leaf plot: this can be used to compare two sets of scores, one set on each side
VARIABLE
- something that can be observed, measured or counted to provide data
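WORKED EXAMPLE (PYTHON)
Several of the definitions above can be checked on the example data 3,6,5,3,5,5,4,3,3,6. The following is only a sketch using Python's standard library. The function name five_number_summary is made up for illustration, and two choices are assumptions rather than anything stated in these notes: for an odd number of scores the middle score is left out of both halves when finding Q1 and Q3, and the standard deviation is the population version (pstdev), which may differ from the calculator mode you are expected to use.

```python
from collections import Counter
from statistics import mean, median, pstdev

scores = [3, 6, 5, 3, 5, 5, 4, 3, 3, 6]  # example data from the definitions above


def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum).

    Q1 and Q3 are the medians of the lower and upper halves of the scores.
    Assumption: for an odd number of scores the middle score is left out
    of both halves. Needs at least two scores.
    """
    ordered = sorted(data)
    half = len(ordered) // 2
    lower, upper = ordered[:half], ordered[-half:]
    return ordered[0], median(lower), median(ordered), median(upper), ordered[-1]


frequency = Counter(scores)                      # outcome -> number of times it occurs
cumulative_5 = sum(1 for s in scores if s <= 5)  # number of scores of 5 or less
minimum, q1, q2, q3, maximum = five_number_summary(scores)

print("frequency of 5:", frequency[5])
print("cumulative frequency of 5:", cumulative_5)
print("mean:", mean(scores))
print("mode:", frequency.most_common(1)[0][0])
print("five number summary:", (minimum, q1, q2, q3, maximum))
print("range:", maximum - minimum)
print("IQR:", q3 - q1)
print("standard deviation (population):", round(pstdev(scores), 2))
```

For this data the sketch gives a frequency of 3 and a cumulative frequency of 8 for the outcome 5, and a mean of 4.3, which agree with the values quoted in the definitions.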
1 STATISTICS
TYPES OF DATA
The data we collect is made up of variables. These are pieces of information, such as a quantity or a characteristic, that can be observed or measured. They may change either over time or between individual observations. The main types of data are:
CATEGORICAL – VARIABLES ARE CATEGORIES
- ordered | e.g. exam grades, garment sizes
- distinct | e.g. types of cars, eye colour
NUMERICAL – VARIABLES ARE NUMBERS
- discrete | e.g. goals scored, number of pets
- continuous | e.g. height of a person, distance thrown
COLLECTING DATA
There are three main ways of collecting data:
CENSUS
- a whole population is surveyed, e.g. every student in the school is questioned
SAMPLE
- a selected group of a population is surveyed, e.g. a small number in each class is questioned
OBSERVATION
- numerical facts are collected and tabulated, e.g. sports data, weather, sales figures, etc.
A sample is usually random to limit the chance of bias occurring. However, it may be systematic if the members of the sample are chosen according to a rule, such as every 10th member of a population. If a population is composed of various sub-groups, the sample could be stratified to ensure a proportionate representation of each group in the sample.
Primary source data is collected first hand by observation or survey. Secondary source data is obtained from an external source such as a newspaper, website or another person’s research.
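SAMPLING SKETCH (PYTHON)
The three kinds of sample described above can be illustrated with a short sketch. Everything in it is hypothetical: the population of 100 student IDs, the two year groups used as sub-groups, the sample size of 10 and the fixed seed (used only so the result is repeatable). It is one possible way to picture the ideas, not a prescribed method.

```python
import random

population = list(range(1, 101))  # hypothetical population: student IDs 1 to 100
sample_size = 10
rng = random.Random(2016)         # fixed seed only so the sketch is repeatable

# Random sample: every member has the same chance of being included.
random_sample = rng.sample(population, sample_size)

# Systematic sample: chosen by a rule, here every 10th member.
systematic_sample = population[9::10]

# Stratified sample: each sub-group is sampled in proportion to its size.
strata = {
    "Year 9": population[:60],    # hypothetical sub-group of 60 students
    "Year 10": population[60:],   # hypothetical sub-group of 40 students
}
stratified_sample = []
for group_name, members in strata.items():
    share = round(sample_size * len(members) / len(population))
    stratified_sample.extend(rng.sample(members, share))

print("random:", sorted(random_sample))
print("systematic:", systematic_sample)
print("stratified:", sorted(stratified_sample))
```

With these made-up numbers the stratified sample takes 6 students from the group of 60 and 4 from the group of 40, so each sub-group is represented in proportion to its size.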
SORTING DATA
A large amount of data needs to be tabulated (organised into a table) so that it can be analysed. A common form of table is the frequency distribution table.
DISCRETE DATA
OUTCOME (x)   TALLY       FREQUENCY (f)   CUMULATIVE FREQUENCY   f × x
1             |||         3               3                      3
2             ||||        4               7                      8
3             |||||||     7               14                     21
4             |||||||||   9               23                     36
5             |||||       5               28                     25
6             ||          2               30                     12
TOTAL                     30                                     105
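CHECKING THE TABLE (PYTHON)
The extra columns of the frequency distribution table above can be rebuilt from the outcomes and frequencies alone. The sketch below is an illustration under that assumption: it recomputes the cumulative frequency and f × x columns, then uses them to find the mean, mode and median. The helper name score_at is made up for this example.

```python
from itertools import accumulate

outcomes = [1, 2, 3, 4, 5, 6]        # x column of the table
frequencies = [3, 4, 7, 9, 5, 2]     # f column of the table

total = sum(frequencies)                              # number of scores
cumulative = list(accumulate(frequencies))            # running total of f
fx = [f * x for f, x in zip(frequencies, outcomes)]   # f × x column

mean = sum(fx) / total                                # sum of f × x divided by sum of f
mode = outcomes[frequencies.index(max(frequencies))]  # outcome with the highest frequency


def score_at(position):
    """Return the outcome of the score at a 1-based position in the ordered data."""
    for x, cf in zip(outcomes, cumulative):
        if position <= cf:
            return x
    raise ValueError("position is beyond the number of scores")


# Median: middle score, or the mean of the two middle scores for an even total.
if total % 2 == 0:
    median = (score_at(total // 2) + score_at(total // 2 + 1)) / 2
else:
    median = score_at((total + 1) // 2)

print("cumulative frequency:", cumulative)
print("f × x:", fx, "total:", sum(fx))
print("mean:", mean, "mode:", mode, "median:", median)
```

Here the mean is 105 ÷ 30 = 3.5. There are 30 scores, so the median is the mean of the 15th and 16th scores. Both fall in the row whose cumulative frequency first reaches 15 or more (outcome 4), so the median is 4.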
GROUPED DATA
Used to cluster discrete data into groups or to divide continuous data into adjoining groups.
CLASS   CLASS CENTRE (c.c.)   TALLY   FREQUENCY (f)   CUMULATIVE FREQUENCY   f × c.c.