Correlation and Regression Exercise
Q1
Following is a list of hypothetical examples of the types of analysis for which one might use each of the methods mentioned (the dependent variable is denoted by an asterisk*; details of variables are within parentheses):
Pearson’s product moment correlation: Testing for an association between leaf area and root starch concentration in a clonal tree species (both variables continuous and normally distributed).
Rank correlation (Spearman and/or Kendall): Testing for an association between the diversity of flowering plant species (ranked) and the number of visiting pollinator species within the study area (normal distribution of variables not required).
Linear regression: Testing the relationship between the degree of nuptial shading* (continuous measure, residuals normally distributed) and the availability of mates in a species of fish (linearly related to dependent variable).
Logistic regression: Testing the effect of genetic distance on the sex* of sterile offspring (binomial distribution) in five hybrid species pairs.
Analysis of covariance: Testing the effect of soil permeability (ranked), litter depth and soil type (covariate) on the rate of ant re-colonisation* (residuals normally distributed).
Q2
Divya Krishnamohan Student ID: 200292988
Dr. William Kunin collected data on the abundance of 33 different species of North American ducks and geese as well as the number of chewing lice (Mallophaga) species recorded on them. The data pertaining to 32 of these species (excluding species code name “snwgos” for which data of concern to this analysis is missing) will be used in the following analysis.
In order to determine whether the number of Mallophaga species associated with a duck species is affected by how common the host is, data on the diversity of Mallophaga, as well as two measures of duck abundance, will be used. The two measures of duck abundance are:
i. Number of sites used in the Christmas Bird Count at which the species was recorded (abbreviated as CBC circles).
ii. Number seen in the Christmas Bird Count (abbreviated as CBC number).
a) Testing for a significant relationship between the two measures of host abundance:
In order to perform a correlation, the data have to meet the assumptions of the type of correlation test being performed. As a rule, a parametric test, such as Pearson’s product-moment correlation, is more powerful than its nonparametric counterparts, Kendall’s tau_b and Spearman’s rank correlation, provided its assumptions are met. Pearson’s correlation assumes a normal distribution of both variables.
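The difference between the parametric and rank-based coefficients can be illustrated with a minimal sketch. The function names and the data below are hypothetical (they are not the duck data and are not part of the original analysis, which was run in a statistics package); Spearman’s coefficient is computed here as Pearson’s coefficient applied to average ranks.

```python
import math

def pearson_r(x, y):
    """Pearson's product-moment correlation coefficient."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson's r computed on the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Hypothetical values: y increases with x, but not linearly.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 1000]

r = pearson_r(x, y)       # below 1 because the relationship is not linear
rho = spearman_rho(x, y)  # equals 1 because the ordering agrees perfectly
```

The rank-based coefficient depends only on the ordering of the values, which is why it does not require normally distributed variables.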
Histograms of CBC number and CBC circles reveal a strong skew in the former and a moderate skew in the latter (refer Fig. 1). Usually, a logarithmic transformation is applied to correct strong skews, while a square root transformation is applied to correct moderate skews. A log10 transformation was therefore applied to CBC number and a square root transformation to CBC circles, and normality was assessed by means of the Shapiro-Wilk test (used as there are fewer than 50 cases) (refer Fig. 2, Table 1).
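The effect of the two transformations on skew can be sketched as follows. This is an illustration only: the counts are hypothetical, and the moment-based skewness coefficient is used here as a simple descriptive check, not as a substitute for the Shapiro-Wilk test used in this analysis.

```python
import math

def sample_skewness(values):
    """Moment-based skewness g1 = m3 / m2**1.5 (positive means a right tail)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical right-skewed counts (not the CBC data).
counts = [10, 20, 30, 50, 100, 300, 1000, 10000, 100000]

log_counts = [math.log10(c) for c in counts]   # stronger correction
sqrt_counts = [math.sqrt(c) for c in counts]   # milder correction

skew_raw = sample_skewness(counts)
skew_sqrt = sample_skewness(sqrt_counts)
skew_log = sample_skewness(log_counts)
# For these values both transformations reduce the skew,
# and the log transformation reduces it the most.
```

Note that log10 requires strictly positive values, so a variable containing zeros would need a different treatment (an assumption that does not arise in this sketch).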
[Figure 1: two frequency histograms. Panel 1: CBC number (Mean = 194135.47, Std. Dev. = 358003.762, N = 32). Panel 2: CBC circles (Mean = 375.66, Std. Dev. = 262.151, N = 32).]
Fig. 1. Histograms showing the distribution of untransformed data: CBC numbers and CBC circles.
[Figure 2: two frequency histograms. Panel 1: log10 (CBC number) (Mean = 4.8193, Std. Dev. = 0.70313, N = 32). Panel 2: square root (CBC circles) (Mean = 18.0935, Std. Dev. = 7.05973, N = 32).]
Fig. 2. Histograms showing the distribution of log transformed CBC numbers and square root transformed CBC circles.
Table 1. Shapiro-Wilk’s test of normality on untransformed and transformed variables: CBC number and CBC circles.
Tests of Normality (Shapiro-Wilk)
Variable              Statistic   df   Sig.
CBC number            .513        32   .000
log10 (CBC number)    .948        32   .124
CBC circles           .922        32   .024
sqrt (CBC circles)    .947        32   .121
The Shapiro-Wilk tests on the transformed variables are non-significant (P = .124 for log10 (CBC number) and P = .121 for sqrt (CBC circles)), so the transformations applied have helped normalise the data. The results of a Pearson’s product-moment correlation are described in the table below.
Table 2. Pearson’s product-moment correlation for sqrt (CBC circles) and log10 (CBC numbers).
Correlations
                                          sqrt(CBC circle)   log(CBC number)
sqrt(CBC circle)   Pearson Correlation    1                  .625(**)
                   Sig. (2-tailed)                           .000
                   N                      32                 32
log(CBC number)    Pearson Correlation    .625(**)           1
                   Sig. (2-tailed)        .000
                   N                      32                 32
** Correlation is significant at the 0.01 level (2-tailed).
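The significance flag in Table 2 can be checked by hand with the standard t-test for a correlation coefficient, t = r * sqrt(n - 2) / sqrt(1 - r^2) on n - 2 degrees of freedom. The sketch below takes r and N from Table 2; the function name is hypothetical, and the critical value is the usual two-tailed t-table entry for 30 degrees of freedom at the 0.01 level.

```python
import math

def t_statistic_for_r(r, n):
    """t statistic for testing that a Pearson correlation differs from zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

r = 0.625   # Pearson correlation from Table 2
n = 32      # number of species from Table 2

t_value = t_statistic_for_r(r, n)   # roughly 4.39
df = n - 2                          # 30 degrees of freedom

# Two-tailed critical value of t for df = 30 at the 0.01 level (t-table value).
T_CRITICAL_0_01 = 2.750
significant_at_0_01 = abs(t_value) > T_CRITICAL_0_01
```

The computed t exceeds the critical value, which is consistent with the ** flag reported in Table 2.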
The Pearson’s correlation reveals that there is a significant correlation between the two variables (r=0.625, P