Dr. Etazaz Econometrics Notes.pdf
ECONOMETRICS LECTURES OF DR. EATZAZ AHMED
Econometrics
Econometrics is a subject in which we formulate mathematical relationships among economic variables on the basis of economic theory and then estimate the numerical values of the parameters in these relationships using actual data.
Classical Linear Regression Model:
Suppose we want to analyze a variable Y using the data Yi (i = 1, 2, 3, …, n). In the simplest analysis we would like to represent the whole data set Y1, Y2, Y3, …, Yn by a single number. We can formulate a model for this purpose as follows:
Y = α + (Y – α)
Or
Y = α + U [U = Y – α]
Thus Y is set equal to a constant (α) plus the discrepancy (difference) between Y and the presumed constant (α). The question is: what is the most appropriate interpretation of α? If we set the average of the errors E(U) = 0, then we have
=> E(Y – α) = 0
=> E(Y) – α = 0
=> E(Y) = α. [Population mean of Y]
With this interpretation, we can write the model as
Y = E(Y) + U, U = Y – E(Y).
Example: Population mean age = 20 years.
If a person's deviation from the mean is U = 3 years: Y = E(Y) + U = 20 + 3 = 23 years.
If the deviation is U = –1 year: Y = 20 + (–1) = 19 years.
In statistics, we learnt how to estimate the population mean using a random sample. In this course, we will repeat the same exercise using a different approach. We start with the model
Y = α + U
Suppose this model is imposed on the data Y1, Y2, Y3, …, Yn. This amounts to:
Y1 = α + U1
Y2 = α + U2
Y3 = α + U3
:
Yi = α + Ui
:
Yn = α + Un
The estimation of α depends on the assumptions of the model. The classical assumptions are as follows:
1). Ui is a random variable for each i. This means that U1, U2, U3, …, Un are all random variables. [Random variable: a variable that can take at least two values with non-zero probability.] Each Ui is one draw out of infinitely many possible values. Time is a fixed variable. Age is not a random variable. Weight is a random variable.
2). E(Ui) = 0 for each i. On average the errors are equal to zero. Since Ui = Yi – α and α = E(Yi), we have E(Ui) = E(Yi) – α = 0, so this assumption holds by construction.
3). Var(Ui) = σ² for all i. All error terms have the same variance. This is known as the homoscedasticity assumption; if it is violated we have heteroscedasticity.
4). Co-Var(Ui, Uj) = 0 for all i ≠ j. Errors are typically correlated in time series data but not in cross-section data. If Co-Var(Ui, Uj) ≠ 0 for some i ≠ j, we say that Ui is autocorrelated with Uj: one variable (for example, food expenditure) is correlated with itself at different times.
5). Sometimes we also make the assumption that Ui ~ N [Ui is distributed normally]. This is also a challengeable assumption. For example, with a mean income of 800, U = Y – E(Y) ranges from 0 – 800 up to 1000000 – 800, which is far from symmetric.
Estimation of α
Let α̂ be an estimator of α.
Y = α̂ + e [where e = Y – α̂]
e is the regression error, estimation error, or residual. One way to approach estimation is to focus on e and choose the estimator that minimizes the error.
Suppose we attempt to minimize ∑ ei.
Examples:
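Choosing between two estimators:
1). ei: 10, –10, 20, –20   ∑ ei = 0
2). ei: 0, 0, 1, 0   ∑ ei = 1
The criterion prefers (1), even though its individual errors are much larger, so min ∑ ei is the wrong criterion.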
Suppose we minimize ∑ |ei| (ignoring signs).
Examples:
Choosing between two estimators:
1). ei: 5, –5, 5, –5   ∑ |ei| = 20
2). ei: 0, 0, 19, 0   ∑ |ei| = 19
The criterion prefers (2), even though it contains one very large error, so min ∑ |ei| is also the wrong criterion.
We should minimize a weighted sum of errors such that larger errors are assigned greater weights. Suppose we set the weights proportional to the absolute size of the error:
ωi = θ |ei|
Now minimize ∑ ωi |ei|
Min ∑ θ |ei| |ei|
Min ∑ θ |ei|²
Min ∑ ei²
The estimator α̂ which minimizes ∑ ei² is known as the Ordinary Least Squares (OLS) estimator.
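The three criteria can be compared numerically. The following Python sketch is an illustration only: the function name criteria is hypothetical, and the residual values are the ones used in the two examples above.

```python
# Residuals of the two candidate estimators in each example from the notes.
sum_example = {"estimator 1": [10, -10, 20, -20], "estimator 2": [0, 0, 1, 0]}
abs_example = {"estimator 1": [5, -5, 5, -5], "estimator 2": [0, 0, 19, 0]}


def criteria(residuals):
    """Return (sum, sum of absolute values, sum of squares) of the residuals."""
    return (
        sum(residuals),
        sum(abs(e) for e in residuals),
        sum(e * e for e in residuals),
    )


for label, example in (("sum example", sum_example), ("abs example", abs_example)):
    for name, residuals in example.items():
        print(label, name, criteria(residuals))
```

On these numbers the sum of squares ranks estimator 2 ahead in the first example (1 against 1000) and estimator 1 ahead in the second (100 against 361), which is the ranking that the simple sum and the absolute sum each got wrong.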
Y = α + U [Basic equation, where E(U) = 0]
Estimation: Ŷ = α̂
Regression error or residual: e = Y – Ŷ = Y – α̂
Min ∑ ei² [OLS: Ordinary Least Squares estimator]
OLS estimator of α:
Min ∑ ei² = ∑ (Yi – α̂)²
First-order condition:
∂/∂α̂ [(Y1 – α̂)² + (Y2 – α̂)² + (Y3 – α̂)² + … + (Yn – α̂)²] = 0
[2(Y1 – α̂)(–1) + 2(Y2 – α̂)(–1) + 2(Y3 – α̂)(–1) + … + 2(Yn – α̂)(–1)] = 0
–2[∑ Yi – n α̂] = 0
Divide both sides by –2 and n:
α̂ = ∑ Yi / n = Ȳ
The OLS estimator of α is the mean of Y.
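As a quick numerical check of this result, the sketch below evaluates the sum of squared residuals on a grid of candidate values and confirms that the sample mean gives the smallest one. The data and the helper name sum_sq are assumptions made for illustration.

```python
def sum_sq(y, a):
    """Sum of squared residuals when every observation is predicted by a."""
    return sum((yi - a) ** 2 for yi in y)


y = [4, 8, 10, 12, 11]          # hypothetical sample
alpha_hat = sum(y) / len(y)     # OLS estimator: the sample mean

# Candidates from alpha_hat - 2 to alpha_hat + 2 in steps of 0.1.
candidates = [alpha_hat + d / 10 for d in range(-20, 21)]
best = min(candidates, key=lambda a: sum_sq(y, a))

print(alpha_hat, best, sum_sq(y, alpha_hat))
```

A grid search is used only to make the minimization visible; the first-order condition above already gives the answer in closed form.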
Some properties of α̂:
1). α̂ has the minimum sum of squared errors ∑ ei².
∑ ei² = ∑ (Yi – α̂)² = ∑ (Yi – Ȳ)² = ∑ yi² [where yi = Yi – Ȳ]
2). α̂ is a random variable, because it is a function of the Ui.
α̂ = (1/n)[Y1 + Y2 + Y3 + … + Yn]
= (1/n)[(α + U1) + (α + U2) + (α + U3) + … + (α + Un)]
= (1/n)[nα + (U1 + U2 + U3 + … + Un)]
= α + (1/n)U1 + (1/n)U2 + (1/n)U3 + … + (1/n)Un   equation (i)
= a0 + a1U1 + a2U2 + a3U3 + … + anUn   equation (ii)
where a0 = α and ai = 1/n for all i. Equation (ii) shows that α̂ is a linear function of the random variables U1, U2, U3, …, Un.
∴ α̂ must be random.
3). α̂ is unbiased.
Proof:
E(α̂) = E[α + (1/n)U1 + (1/n)U2 + (1/n)U3 + … + (1/n)Un]
= α + (1/n)E(U1) + (1/n)E(U2) + (1/n)E(U3) + … + (1/n)E(Un)
= α + (1/n)(0) + (1/n)(0) + (1/n)(0) + … + (1/n)(0)
E(α̂) = α. [As we know that E(Ui) = 0]
4). α̂ has minimum variance in the class of linear unbiased estimators.
Proof:
(a). Var(α̂) = E[α̂ – E(α̂)]²
= E[α + (1/n)U1 + (1/n)U2 + … + (1/n)Un – α]²
= E[(1/n)(U1 + U2 + U3 + … + Un)]²
= (1/n²) E[(U1 + U2 + U3 + … + Un)²]
= (1/n²) E[U1² + U2² + … + Un² + ∑∑i≠j UiUj]
= (1/n²) [E(U1²) + E(U2²) + … + E(Un²) + ∑∑i≠j E(UiUj)]
= (1/n²) [σ² + σ² + … + σ² + ∑∑i≠j (0)]   ∵ Co-Var(Ui, Uj) = 0
= (1/n²) nσ²
Var(α̂) = σ²/n   equation (a)
(b). Now consider any linear unbiased estimator.
(i). α* = b1Y1 + b2Y2 + b3Y3 + b4Y4 + … + bnYn
where b1, b2, b3, b4, …, bn are constants. Obviously α* is linear. To make α* unbiased, we set E(α*) = α. This implies the following:
E(b1Y1 + b2Y2 + … + bnYn) = α
E[b1(α + U1) + b2(α + U2) + … + bn(α + Un)] = α
E[α(b1 + b2 + … + bn) + b1U1 + b2U2 + … + bnUn] = α
α(b1 + b2 + … + bn) + b1E(U1) + b2E(U2) + … + bnE(Un) = α
α ∑bi + b1(0) + b2(0) + … + bn(0) = α
α ∑bi = α
∑bi = 1.
(ii). Now compute the variance of α*.
Var(α*) = Var(b1Y1 + b2Y2 + … + bnYn)
= Var(b1Y1) + Var(b2Y2) + … + Var(bnYn) + ∑∑i≠j Co-Var(biYi, bjYj)
= b1²Var(Y1) + b2²Var(Y2) + … + bn²Var(Yn) + ∑∑i≠j bibj Co-Var(Yi, Yj)
= b1²Var(U1) + b2²Var(U2) + … + bn²Var(Un) + ∑∑i≠j bibj Co-Var(Ui, Uj)
= b1²σ² + b2²σ² + … + bn²σ² + ∑∑i≠j bibj (0)
= σ² ∑bi²
(iii). Comparison: Var(α̂) < Var(α*) unless bi = 1/n for all i.
Consider Var(α*) and minimize it by choosing bi:
Min over (b1, …, bn): Var(α*) = σ² ∑bi² subject to ∑bi = 1.
Form the Lagrangian
L = σ²(b1² + b2² + b3² + … + bn²) + λ[1 – (b1 + b2 + … + bn)]
First-order conditions:
∂L/∂bi = 2σ²bi – λ = 0 (i = 1, 2, 3, …, n)   (A)
∂L/∂λ = 1 – (b1 + b2 + … + bn) = 0   (B)
From (A), bi = λ/(2σ²). Substitute in equation (B):
=> 1 – [λ/(2σ²) + λ/(2σ²) + … + λ/(2σ²)] = 0
=> nλ/(2σ²) = 1
=> λ = 2σ²/n   (C)
Substitute equation (C) into (A):
=> 2σ²bi = 2σ²/n
=> bi = 1/n
=> α* = (1/n)Y1 + (1/n)Y2 + … + (1/n)Yn = (1/n)∑Yi = Ȳ
Recap: the OLS estimator is linear, unbiased and has minimum variance in the class of linear unbiased estimators; that is, it is the best linear unbiased estimator, i.e. α̂ is BLUE.
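The variance comparison can be illustrated with a few lines of Python. This is a sketch under the notes' assumptions (equal variances, zero covariances); the function name variance_factor and the unequal weights are hypothetical.

```python
def variance_factor(weights):
    """Return Var(alpha*) / sigma^2 = sum of b_i^2 for weights that sum to 1."""
    # Unbiasedness of a linear estimator requires the weights to sum to one.
    assert abs(sum(weights) - 1.0) < 1e-12, "weights must sum to 1"
    return sum(b * b for b in weights)


n = 4
equal = [1 / n] * n               # OLS weights b_i = 1/n
unequal = [0.4, 0.3, 0.2, 0.1]    # another unbiased choice (hypothetical)

print(variance_factor(equal))     # equals 1/n
print(variance_factor(unequal))   # larger than 1/n
```

Any set of weights that sums to one but differs from 1/n gives a larger sum of squared weights, which is the content of the Lagrangian argument above.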
Comments on BLUE property:
1). Linear: α̂ is a linear function of Y.
α̂ = (1/n)Y1 + (1/n)Y2 + (1/n)Y3 + … + (1/n)Yn
Theorem: If X1 ~ N, X2 ~ N, X3 ~ N, …, Xn ~ N (jointly normal, for example independent), then any linear combination of X1, X2, X3, …, Xn is also normally distributed:
Z = a1X1 + a2X2 + a3X3 + … + anXn ~ N
By this theorem, since Ui ~ N and Yi = α + Ui is a linear function of Ui, we infer that Yi ~ N. Further, α̂ = b1Y1 + b2Y2 + b3Y3 + … + bnYn, being a linear function of Y1 to Yn, is also normally distributed:
Ui ~ N --Lf--> Yi ~ N --Lf--> α̂ ~ N [Lf = linear function]
Importance of BLUE property:
Linearity is important for applying the tools of statistical inference because of the following argument:
Ui ~ N => Yi ~ N; α̂ is a linear function of Y => α̂ ~ N.
We can use the standard tools of statistical inference; we can say big things with a limited source of data. There is a counter-argument that the above chain of reasoning is too long and unnecessary: we can just assume that α̂ ~ N. So linearity is not indispensable, and without it we have more options for estimation.
Unbiasedness: this means E(α̂) = α. If we draw all possible random samples of Y and estimate α̂ from each sample one by one, then the mean value of α̂ will be equal to α. This property is desirable because we do not want any systematic error in estimation, but it is not indispensable. Consider the following example:
Prob(α – ε < α̂ < α + ε) = 0.6, E(α̂) = α [α̂ is unbiased]
Prob(α – ε < α* < α + ε) = 0.9, E(α*) ≠ α [α* is biased]
Here the biased estimator α* falls within ε of α more often than the unbiased estimator α̂, so a biased estimator can be better than an unbiased one.
Best/Minimum variance: this means that Var(α̂) < Var(α*), where α̂ is the OLS estimator and α* is any other linear unbiased estimator. If we compare α̂ with a nonlinear or biased estimator, the property does not help. The BLUE property is desirable, but unbiasedness limits our choices and linearity also limits them.
The above model determines E(Y) as a constant:
Y = α + U, α = E(Y).
Now suppose we want to determine E(Y) given some information set (I). This information is usually in the form of data on variables called explanatory variables, e.g. gender, height, etc. Suppose such variables are X1, X2, X3, …, Xm. If the set of information is complete then we can write
Y = f(X1, X2, X3, …, Xm)
Complete information means:
The list of all variables X1, X2, X3, …, Xm is complete.
All data are measured accurately.
The functional form [f(.)] is exactly known.
In practice there are three sources of error:
Incomplete list of X variables.
Measurement error in data.
Misspecification of the functional form.
This will produce the following type of equation:
Y = α2X2 + α3X3 + α4X4 + … + αkXk + Z   [k < m]
Z = error committed due to the above three reasons.
Econometrics is all about extracting information from the composition of the error term Z and using it beneficially, weighing the cost of extraction against the benefit. Some information can be extracted immediately:
Z = E(Z) + [Z – E(Z)]
E(Z): a parameter that can be estimated; denote it by α1.
Z – E(Z): fluctuation of the error around its mean value; denote it by U.
Now we can write the model as
Y = α2X2 + α3X3 + α4X4 + … + αkXk + Z
Or
Y = α2X2 + α3X3 + α4X4 + … + αkXk + E(Z) + [Z – E(Z)]
Or
Y = α1 + α2X2 + α3X3 + … + αkXk + U   [X1 = 1]
This model is known as the General Linear Regression Model (the focus is on the parameters).
Assumptions:
1). Ui is a random variable for each i.
1b). Ui is normally distributed (Ui ~ N) for each i.
2). E(Ui) = 0 for each i. This assumption holds by construction:
Ui = Zi – E(Zi)
E(Ui) = E(Zi) – E[E(Zi)] = E(Zi) – E(Zi) = 0
3). Var(Ui) = σ² for all i.
4). Co-Var(Ui, Uj) = 0 for all i ≠ j.
5). X variables are fixed or exogenous or non-random.
Examples:
(i). Height depends on age. Age is an exogenous variable.
(ii). Marks depend on hours of study (reading). Hours of study is an exogenous variable.
(iii). C = α + βY + U, with Y = C + I + G + NX [NX = X – M]. Here Y is not exogenous, because Y itself depends on C.
Two Variable Regression Model
Y = α + βX + U
Suppose we have data from a random sample of size n; then we can write
Yi = α + βXi + Ui   (i = 1, 2, 3, …, n)
Example: height or age as an exogenous variable.
Estimation: Suppose α̂ and β̂ are estimators of α and β respectively. Then the estimated values of Y are given by
Ŷi = α̂ + β̂Xi
The regression residual is
ei = Yi – Ŷi = Yi – (α̂ + β̂Xi)
∑ ei² = ∑ [Yi – α̂ – β̂Xi]²
For OLS we minimize ∑ ei² with respect to α̂ and β̂. The first-order conditions are
∂∑ei²/∂α̂ = ∑ 2[Yi – α̂ – β̂Xi](–1) = 0
∂∑ei²/∂β̂ = ∑ 2[Yi – α̂ – β̂Xi](–Xi) = 0
Divide both sides by –2 and rearrange:
∑Yi – nα̂ – β̂∑Xi = 0   (i)
∑XiYi – α̂∑Xi – β̂∑Xi² = 0   (ii)
From equation (i):
nα̂ = ∑Yi – β̂∑Xi
Divide both sides by n:
α̂ = Ȳ – β̂X̄   (iii)
Substitute (iii) into (ii):
∑XiYi – (Ȳ – β̂X̄)∑Xi – β̂∑Xi² = 0
∑XiYi – nX̄Ȳ = β̂[∑Xi² – nX̄²]   (iv)
Consider, with xi = Xi – X̄ and yi = Yi – Ȳ,
∑xiyi = ∑(Xi – X̄)(Yi – Ȳ) = ∑XiYi – nX̄Ȳ   (v)
Likewise, we can show that
∑xi² = ∑(Xi – X̄)² = ∑Xi² – nX̄²   (vi)
Now substitute (v) and (vi) into (iv):
∑xiyi = β̂∑xi²
Thus we have
β̂ = ∑xiyi / ∑xi²
α̂ = Ȳ – β̂X̄
These are the OLS estimators of α and β.
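The two-variable OLS estimators, the slope as ∑xy/∑x² in deviations from means and the intercept as Ȳ minus the slope times X̄, translate directly into code. The sketch below is one possible implementation; the function name ols_simple is hypothetical, and the data are the weight and age figures of Example 1 later in these notes.

```python
def ols_simple(X, Y):
    """Return (alpha_hat, beta_hat) for the model Y = alpha + beta*X + U."""
    n = len(X)
    x_bar = sum(X) / n
    y_bar = sum(Y) / n
    # Sums of cross-products and squares in deviation form.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(X, Y))
    sxx = sum((xi - x_bar) ** 2 for xi in X)
    beta_hat = sxy / sxx
    alpha_hat = y_bar - beta_hat * x_bar
    return alpha_hat, beta_hat


X = [1, 2, 3, 4, 5]        # age
Y = [4, 8, 10, 12, 11]     # weight
alpha_hat, beta_hat = ols_simple(X, Y)
print(round(alpha_hat, 4), round(beta_hat, 4))
```

Working in deviations from the means keeps the code close to the derivation and avoids forming ∑XiYi – nX̄Ȳ explicitly.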
Properties of β̂:
1). β̂ is a linear function of Y.
Proof:
β̂ = ∑xiyi / ∑xi² = ∑xi(Yi – Ȳ) / ∑xi² = ∑xiYi / ∑xi² [since Ȳ∑xi = 0]
= ∑ (xi / ∑xi²) Yi
= a1Y1 + a2Y2 + a3Y3 + … + anYn
= ∑ aiYi   (vii)   [ai = xi / ∑xi²]
By assumption the X values are fixed; therefore a1 = x1/∑xi² is a fixed value and can be treated as a constant. The same is true for a2, a3, …, an. This means that β̂ = a1Y1 + a2Y2 + a3Y3 + … + anYn is a linear function of Y1, …, Yn.
1b). β̂ is a linear function of U.
Proof:
β̂ = ∑aiYi = ∑ai(α + βXi + Ui) = α∑ai + β∑aiXi + ∑aiUi   (viii)
Now consider
∑ai = ∑(xi/∑xi²) = (1/∑xi²)∑xi = (1/∑xi²)(0) => ∑ai = 0   (ix)
∑aiXi = ∑(xi/∑xi²)Xi = (1/∑xi²)∑xiXi = (1/∑xi²)∑xi² => ∑aiXi = 1   (x)   [∑xiXi = ∑xi(xi + X̄) = ∑xi²]
Substitute (ix) and (x) into (viii):
β̂ = α(0) + β(1) + ∑aiUi = β + ∑aiUi   (xi)
= β + a1U1 + a2U2 + … + anUn
so β̂ is a linear function of U. It follows that β̂ is a random variable; it also follows that if Ui ~ N for each i then β̂ ~ N.
2). β̂ is unbiased.
Proof:
β̂ = β + ∑aiUi
E(β̂) = E[β + ∑aiUi] = β + ∑aiE(Ui) [since ai is fixed] = β + ∑ai(0) [where E(Ui) = 0]
E(β̂) = β   (xii)
3). β̂ has minimum variance in the class of linear unbiased estimators.
Var(β̂) = E[β̂ – E(β̂)]² = E[β + ∑aiUi – β]² = E[∑aiUi]²
= E[a1²U1² + a2²U2² + … + an²Un² + ∑∑i≠j aiajUiUj]
= a1²E(U1²) + a2²E(U2²) + … + an²E(Un²) + ∑∑i≠j aiajE(UiUj)   (A)   [since the x values are fixed]
Consider
E(Ui²) = E[Ui – E(Ui)]² [where E(Ui) = 0] = Var(Ui) = σ²   (xiii)
Now consider
E(UiUj) = E{[Ui – E(Ui)][Uj – E(Uj)]} [where E(Ui) = 0] = Co-Var(Ui, Uj) = 0   (xiv)
Substitute (xiii) and (xiv) into (A):
Var(β̂) = a1²σ² + a2²σ² + … + an²σ² + ∑∑i≠j aiaj(0) = σ²∑ai² = σ²∑(xi/∑xi²)² = σ²[∑xi²/(∑xi²)²]
Var(β̂) = σ²/∑xi²   (xv)
4). Consider another linear estimator: let β* be another linear estimator of β.
β* = ∑biYi [where bi is fixed] = b1Y1 + b2Y2 + b3Y3 + … + bnYn.
If β* is to be unbiased, we need E(β*) = β. That is,
=> E(∑biYi) = β
=> E[∑bi(α + βXi + Ui)] = β
=> E[α∑bi + β∑biXi + ∑biUi] = β
=> α∑bi + β∑biXi + ∑biE(Ui) = β [where E(Ui) = 0]
=> α∑bi + β∑biXi = β   (xvi)
This requires ∑bi = 0 and ∑biXi = 1.
Now consider the variance of β*:
Var(β*) = E[β* – E(β*)]²   (xvii)
Consider
β* = ∑biYi = ∑bi(α + βXi + Ui) = α∑bi + β∑biXi + ∑biUi = α(0) + β(1) + ∑biUi [using equation (xvi)]
β* = β + ∑biUi.
Substitute in (xvii):
Var(β*) = E[β + ∑biUi – β]² = E[∑biUi]² = E[∑bi²Ui² + ∑∑i≠j bibjUiUj] = ∑bi²E(Ui²) + ∑∑i≠j bibjE(UiUj) [E(UiUj) = 0]
Var(β*) = σ²∑bi²   (xviii)
We need to prove that Var(β*) > Var(β̂).
Var(β*) = σ²∑bi² = σ²∑(bi – ai + ai)²
= σ²∑[(bi – ai)² + ai² + 2(bi – ai)ai]
= σ²[∑(bi – ai)² + ∑ai² + 2∑biai – 2∑ai²]
= σ²[∑(bi – ai)² – ∑ai² + 2∑biai]   (xix)
[Recall ∑ai = 0, ∑aiXi = 1, ∑bi = 0, ∑biXi = 1.]
Consider
∑biai = ∑bixi/∑xi² = (∑biXi – X̄∑bi)/∑xi² = 1/∑xi²
and ∑ai² = ∑xi²/(∑xi²)² = 1/∑xi², so
∑biai = ∑ai²
Substitute in equation (xix):
Var(β*) = σ²[∑(bi – ai)² – ∑ai² + 2∑ai²] = σ²[∑(bi – ai)² + ∑ai²] = σ²∑ai² + σ²∑(bi – ai)² = σ²/∑xi² + σ²∑(bi – ai)²
Var(β*) = Var(β̂) + σ²∑(bi – ai)²
Var(β*) > Var(β̂) unless bi = ai for all i.
If β* is different from β̂ then we must have bi ≠ ai for at least some i; this yields Var(β*) > Var(β̂).
Practice Equation:
Suppose we want to estimate an equation without an intercept.
Examples: Permanent income hypothesis; Qi = βAi + Ui [production = β × area]; or Ti = βMi + Ui.
In general,
Yi = βXi + Ui
Derive the OLS estimator of β (call it β̂1).
Now convert the equation as
Yi/Xi = β + εi [εi = Ui/Xi]
Again derive the OLS estimator of β (call it β̂2).
Compare β̂1 and β̂2, and think it over. [Hint: heteroscedasticity in Yi/Xi = β + εi.]
Properties of OLS residuals:
1). OLS residuals are orthogonal to the regressors. [Two vectors [a1 a2] and [b1 b2] are orthogonal if a1·b1 + a2·b2 = 0.]
The regressors in the equation Yi = α + βXi + Ui are 1 and Xi:
a). [1, 1, …, 1]: e1 + e2 + e3 + … + en = 0 => ∑ei = 0
b). [X1, X2, …, Xn]: X1e1 + X2e2 + X3e3 + … + Xnen = 0 => ∑Xiei = 0
Proof: The OLS estimators are derived from the first-order conditions
=> ∑(Yi – α̂ – β̂Xi) = 0 => ∑(Yi – Ŷi) = 0 => ∑ei = 0
=> ∑(Yi – α̂ – β̂Xi)Xi = 0 => ∑(Yi – Ŷi)Xi = 0 => ∑Xiei = 0
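The orthogonality of the residuals to both regressors can be verified numerically. The following sketch fits the two-variable model to the weight and age data of Example 1 and evaluates both sums; the variable names are chosen here for illustration, and the results are zero only up to floating-point rounding.

```python
X = [1, 2, 3, 4, 5]        # age
Y = [4, 8, 10, 12, 11]     # weight
n = len(X)
x_bar = sum(X) / n
y_bar = sum(Y) / n

# OLS estimates in deviation form.
beta_hat = (sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y))
            / sum((x - x_bar) ** 2 for x in X))
alpha_hat = y_bar - beta_hat * x_bar

residuals = [y - (alpha_hat + beta_hat * x) for x, y in zip(X, Y)]
sum_e = sum(residuals)                                # orthogonal to [1, ..., 1]
sum_xe = sum(x * e for x, e in zip(X, residuals))     # orthogonal to [X1, ..., Xn]
print(sum_e, sum_xe)
```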
PERFORMANCE OF AN ESTIMATED EQUATION:
Consider the following relation:
Yi = Ŷi + ei
Since ∑ei = 0, the means of Y and Ŷ are equal, so in deviation form yi = ŷi + ei. Square both sides and sum:
∑yi² = ∑ŷi² + ∑ei² + 2∑ŷiei   (i)
Now consider
∑ŷiei = β̂∑xiei, where ∑xiei = ∑Xiei – X̄∑ei = 0, so ∑ŷiei = 0.
Using this result we can write equation (i) as
∑yi² = ∑ŷi² + ∑ei²
Actual (total) variation = explained variation + unexplained variation.
The performance of the estimated model can be measured by R², which is the square of the multiple correlation coefficient between one variable (Y) on the one hand and a set of variables (X) on the other hand. R² is also called the coefficient of determination; it is given by
R² = ∑ŷi² / ∑yi²
Also note that
R² = 1 – ∑ei² / ∑yi²
Extreme cases: R² = 0 at the minimum (∑ŷi² = 0) and R² = 1 at the maximum (∑ei² = 0), so 0 ≤ R² ≤ 1. There is no benchmark for a good R²; it depends on the context in which R² is taken.
Example:
Age of Ali = α + β (age of Ali’s dad) + U
R² = 1 R² = 1
Weight of Ali = α + β (weight of Ali’s dad) + U
Pakistan, consumption: R² = 0.95
Spurious cause between Y and G
In cross-section data R² = 0.4 is good, but in time series data R² = 0.9 is not remarkable. Because R² is bounded between 0 and 1, it is still regarded as the best available measure.
The problem with R² is that it increases whenever we add more variables to the regression.
Example: The dependent variable is household consumption. Data: cross section.
C = α + βY + U   R² = 0.25
C = α + βY + γN + U   R² = 0.46
C = α + βY + γNc + δNm + πNf + λR + U   R² = 0.56
C = linear function of Y, Nc, Nm, Nf, residence, female education, male education, wealth, … etc.   R² = 0.9899
If we make R² the criterion for choosing the number of variables in the equation, we will end up with as many variables as sample points, with R² = 1. Also note that as the sample size decreases, R² will in general increase.
R² puts no limit on the number of variables included in the model, whereas a model has to be small.
Adjusted R²:
Consider the formula for R²:
R² = 1 – ∑ei² / ∑yi²
Now divide each sum of squares by its degrees of freedom:
∑ei² / (n – k) and ∑yi² / (n – 1)
where k = number of parameters in the equation [α, β, γ, δ, λ, π, etc.].
We can write adjusted R² as
R̄² = 1 – [∑ei² / (n – k)] / [∑yi² / (n – 1)] = 1 – (1 – R²)(n – 1)/(n – k)
Adding a variable reduces ∑ei² but also reduces n – k. If the net effect is positive then, on a net basis, R̄² will increase and we will include the variable under consideration.
Degrees of freedom = n – k; we need at least two variables.
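The two measures can be computed side by side. The sketch below assumes the standard definitions R² = 1 – ∑e²/∑y² and adjusted R² = 1 – (1 – R²)(n – 1)/(n – k); the function names are hypothetical, and the data are the weight figures and fitted values from Example 1 in these notes.

```python
def r_squared(Y, Y_hat):
    """Coefficient of determination: 1 - sum(e^2) / sum(y^2)."""
    y_bar = sum(Y) / len(Y)
    sse = sum((yi - fi) ** 2 for yi, fi in zip(Y, Y_hat))
    sst = sum((yi - y_bar) ** 2 for yi in Y)
    return 1 - sse / sst


def adjusted_r_squared(r2, n, k):
    """Adjust R^2 for degrees of freedom; k counts all parameters incl. intercept."""
    return 1 - (1 - r2) * (n - 1) / (n - k)


Y = [4, 8, 10, 12, 11]
Y_hat = [5.4, 7.2, 9.0, 10.8, 12.6]    # fitted values 3.6 + 1.8*X for X = 1..5
r2 = r_squared(Y, Y_hat)
adj_r2 = adjusted_r_squared(r2, n=len(Y), k=2)
print(round(r2, 4), round(adj_r2, 4))
```

Because (n – 1)/(n – k) is at least one, adjusted R² never exceeds R², and it rises only when a new variable lowers ∑e² by enough to offset the lost degree of freedom.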
Statistical Inference in Econometrics:
Theorems:
1). If X is distributed normally with mean μ and standard deviation σ, X ~ N(μ, σ²) [the shape of the distribution is then known exactly], then
Z = (X – μ)/σ ~ N(0, 1)   [standardized normal distribution]
2). If X1 ~ N(μ1, σ1²), X2 ~ N(μ2, σ2²), X3 ~ N(μ3, σ3²), …, Xm ~ N(μm, σm²) and X1, X2, X3, …, Xm are mutually independent, then
X1 + X2 + X3 + … + Xm ~ N(μ1 + μ2 + … + μm, σ1² + σ2² + … + σm²)
If X1, X2, X3, …, Xm are not mutually independent (but jointly normal), then
X1 + X2 + X3 + … + Xm ~ N(μ1 + μ2 + … + μm, σ1² + σ2² + … + σm² + ∑∑i≠j σij)
3). Suppose Xi ~ N(μi, σi²) [i = 1, 2, 3, …, m] and the Xi are mutually independent. Then
∑ [(Xi – μi)/σi]² ~ χ²m   [chi-square with m degrees of freedom]
Number of observations: minimum required.
For a continuous variable (e.g. height) there is no point probability, only interval probability.
4). Suppose V1 ~ χ²m1 and V2 ~ χ²m2, and V1 and V2 are independent. Then
F = (V1/m1)/(V2/m2) ~ F(m1, m2)   [Fisher distribution with numerator degrees of freedom m1 and denominator degrees of freedom m2]
5). Suppose X ~ N(μ, σ²) and V ~ χ²m, and X and V are mutually independent. Then
t = [(X – μ)/σ] / (V/m)^½ ~ t m   [a standardized normal divided by (V/m)^½]
As the variance increases, the tails become fatter.
Back to Econometrics:
Consider
Y = α + βX + U
Suppose we want to test the null hypothesis
H0: β = β0 [where β0 is a given value]
against the alternative
H1: β ≠ β0
As we know, β̂ ~ N(β, σ²/∑x²).
Therefore
Z = (β̂ – β) / √(σ²/∑x²) ~ N(0, 1)
In testing the null hypothesis (H0), we use the estimated value β̂, the hypothetical value β0 in place of β, and ∑x² computed from the actual data. That is:
β̂ from data
β = β0 from H0
∑x² from data
Note that σ² remains unknown. One option is to replace σ² by its unbiased estimator,
σ̂² = ∑e² / (n – 2)   [proof is in the book]
Then Z is no longer normally distributed; if the sample size is large then Z is approximately normal.
Another option is to convert Z into a t distribution. The steps are as follows:
1). β̂ ~ N(β, σ²/∑x²)
2). It can be shown that V = ∑e²/σ² ~ χ²(n–2), or V = (n – 2)σ̂²/σ² ~ χ²(n–2).
3). It can also be shown that β̂ and V are mutually independent.
4). It follows from 1, 2, and 3 that
t = Z / [V/(n – 2)]^½ ~ t(n–2)
Now consider
t = [(β̂ – β)/√(σ²/∑x²)] / [σ̂²/σ²]^½ = (β̂ – β) / √(σ̂²/∑x²) = (β̂ – β)/se(β̂)
so the unknown σ² cancels.
Recall β̂ = ∑xy/∑x² and α̂ = Ȳ – β̂X̄.
Example 1:
Y = α + βX + U, where Y = weight and X = age; x = (X – mean X), y = (Y – mean Y).

Y     X     x     y     x²    y²    xy
4     1    –2    –5     4    25    10
8     2    –1    –1     1     1     1
10    3     0     1     0     1     0
12    4     1     3     1     9     3
11    5     2     2     4     4     4
∑Y=45  ∑X=15  ∑x=0  ∑y=0  ∑x²=10  ∑y²=40  ∑xy=18

Ȳ = 45/5 = 9, X̄ = 15/5 = 3
β̂ = ∑xy/∑x² = 18/10 = 1.8
α̂ = Ȳ – β̂X̄ = 9 – 1.8 × 3 = 9 – 5.4 = 3.6
The estimated equation is Ŷ = 3.6 + 1.8X.
3.6 = weight at time of birth.
1.8 = rate of increase in weight per year increase in age.
Suppose we want to test
H0: β = 0
H1: β ≠ 0
Fitted values Ŷ = 3.6 + 1.8X:
When X = 1, Ŷ = 5.4; X = 2, Ŷ = 7.2; X = 3, Ŷ = 9; X = 4, Ŷ = 10.8; X = 5, Ŷ = 12.6.
We can compute
e = Y – Ŷ: –1.4, 0.8, 1, 1.2, –1.6
∑e² = 7.6
σ̂² = ∑e²/(n – 2) = 7.6/3 = 2.533
se(β̂) = √(σ̂²/∑x²) = √(2.533/10) = 0.503   [the standard deviation of the estimator is its standard error]
Now
t = (β̂ – 0)/se(β̂) = 1.8/0.503 = 3.58
Set the level of significance (or probability of type I error, equal to one minus the level of confidence) = 0.05. With n – 2 = 3 degrees of freedom the critical t-values are ±3.182.
The calculated t-value falls in the rejection region, so we reject H0. This means the effect of age on weight is significantly different from zero.
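The test in this example can be reproduced step by step. The sketch below recomputes the estimates, the residual sum of squares, the standard error and the t-statistic from the data of Example 1. The constant T_CRITICAL is an assumption taken from a standard t-table (two-tailed, 5%, 3 degrees of freedom), not something computed by the code.

```python
import math

X = [1, 2, 3, 4, 5]        # age
Y = [4, 8, 10, 12, 11]     # weight
n = len(X)
x_bar = sum(X) / n
y_bar = sum(Y) / n
sxx = sum((x - x_bar) ** 2 for x in X)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y))

beta_hat = sxy / sxx
alpha_hat = y_bar - beta_hat * x_bar

# Residual sum of squares and the unbiased estimator of sigma^2.
sse = sum((y - alpha_hat - beta_hat * x) ** 2 for x, y in zip(X, Y))
sigma2_hat = sse / (n - 2)

se_beta = math.sqrt(sigma2_hat / sxx)     # standard error of beta_hat
t_stat = (beta_hat - 0.0) / se_beta       # H0: beta = 0

T_CRITICAL = 3.182   # assumed table value: two-tailed 5%, 3 degrees of freedom
reject_h0 = abs(t_stat) > T_CRITICAL
print(round(sse, 3), round(se_beta, 3), round(t_stat, 3), reject_h0)
```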
Testing of Hypothesis:
Suppose we have estimated the following equation:
C = 10.0 + 0.8Y, R² = 0.97, n = 25
     (2.5)   (0.1)   [the values in brackets are standard errors]
Interpret these results as an economist. Before we interpret, we test a few hypotheses.
1). H0: α = 0, H1: α ≠ 0
t = (10.0 – 0)/2.5 = 4.0
Degrees of freedom = 25 – 2 = 23
Level of significance = 0.05
Critical t-values = ±2.069
We reject H0.
2). H0: β = 0, H1: β ≠ 0
t = (0.8 – 0)/0.1 = 8.0
We reject H0.
3). H0: β = 1, H1: β < 1
t = (0.8 – 1)/0.1 = –2.0
Critical t-value (one-tailed, 0.05) = –1.714
We reject H0.
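The three tests share one formula, t = (estimate – hypothesized value) / standard error. A minimal sketch follows; the helper name t_stat is hypothetical. The two-tailed critical value 2.069 is the one given in the notes, while the one-tailed value 1.714 is an assumption taken from a standard t-table for 23 degrees of freedom.

```python
def t_stat(estimate, hypothesized, std_error):
    """t-statistic for H0: parameter = hypothesized."""
    return (estimate - hypothesized) / std_error


# Estimated equation from the notes: C = 10.0 + 0.8Y, standard errors 2.5 and 0.1.
t_alpha_zero = t_stat(10.0, 0.0, 2.5)   # H0: alpha = 0
t_beta_zero = t_stat(0.8, 0.0, 0.1)     # H0: beta = 0
t_beta_one = t_stat(0.8, 1.0, 0.1)      # H0: beta = 1 against H1: beta < 1

TWO_TAILED_CRITICAL = 2.069   # from the notes: 5%, 23 degrees of freedom
ONE_TAILED_CRITICAL = 1.714   # assumed table value: 5%, 23 degrees of freedom

print(abs(t_alpha_zero) > TWO_TAILED_CRITICAL)   # reject H0: alpha = 0
print(abs(t_beta_zero) > TWO_TAILED_CRITICAL)    # reject H0: beta = 0
print(t_beta_one < -ONE_TAILED_CRITICAL)         # reject H0: beta = 1
```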
Interpretation:
The results show that 97% of the variation in consumption expenditure is explained by our model, which indicates that the overall performance of the equation is satisfactory. The intercept is positive and significantly different from zero; its magnitude shows that subsistence or autonomous consumption expenditure is 10 thousand rupees per capita per year. Further, the marginal propensity to consume (MPC) is significantly different from zero and significantly less than one. The estimated value of the MPC shows that 0.8, or 80%, of each incremental rupee of income is consumed, while the remaining 20% is saved.
Testing a linear restriction on two or more parameters:
Y = α + βX + U
H0: α + β = 1
H1: α + β ≠ 1
t = (α̂ + β̂ – 1) / se(α̂ + β̂) ~ t(n–2), where
se(α̂ + β̂) = √[Var(α̂) + Var(β̂) + 2 Co-Var(α̂, β̂)]
Actual application: the variances and covariances are obtained from the coefficient variance-covariance matrix.
Suppose
= 1.7,
= - 0.2
H0: α + β = 0 H1: α + β ≠ 0
H0: α - β = 0 H1: α - β ≠ 0
Three Variables Model:
Yi = β1 + β2X2i + β3X3i + Ui
Assumptions:
1). Ui is a random variable for each i.
1b). Ui is normally distributed (Ui ~ N) for each i.
2). E(Ui) = 0 for each i. This assumption holds by construction:
Ui = Zi – E(Zi)
E(Ui) = E(Zi) – E[E(Zi)] = E(Zi) – E(Zi) = 0
3). Var(Ui) = σ² for all i.
4). Co-Var(Ui, Uj) = 0 for all i ≠ j.
5). X variables are fixed or exogenous or non-random.
6). The correlation between X2 and X3 is not equal to 1 in absolute value. [That is, X2 and X3 are not perfectly correlated.]
E(Yi) = β1 + β2X2i + β3X3i
Consider
β2 = ∂E(Y)/∂X2, β3 = ∂E(Y)/∂X3
If X2 and X3 are perfectly correlated, all variation in X2 and X3 is common, and it is impossible to separate the effects of X2 and X3 on Y [conceptually, where variables are perfectly correlated the model construction is wrong].
We can further show that all possible methods of estimation will fail. We can also interpret this situation by the argument that the information content in the data is zero.
[Figure: information content in the data. X2 and X3 perfectly correlated: information content is zero. X2 and X3 uncorrelated: information content is full. X2 and X3 imperfectly correlated: information content is rich.]
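Estimation:
Yi = β1 + β2X2i + β3X3i + Ui
Replace the unknown parameters by their estimators and set U = 0:
Ŷi = β̂1 + β̂2X2i + β̂3X3i
In OLS we solve the following problem:
Min ∑ei² = ∑(Yi – β̂1 – β̂2X2i – β̂3X3i)²
First-order conditions:
∑(Yi – β̂1 – β̂2X2i – β̂3X3i) = 0
∑(Yi – β̂1 – β̂2X2i – β̂3X3i)X2i = 0
∑(Yi – β̂1 – β̂2X2i – β̂3X3i)X3i = 0
The end result is as follows (lower-case letters denote deviations from means):
β̂2 = [∑x2y ∑x3² – ∑x3y ∑x2x3] / [∑x2² ∑x3² – (∑x2x3)²]
β̂3 = [∑x3y ∑x2² – ∑x2y ∑x2x3] / [∑x2² ∑x3² – (∑x2x3)²]
β̂1 = Ȳ – β̂2X̄2 – β̂3X̄3
Also note:
Var(β̂2) = σ² ∑x3² / [∑x2² ∑x3² – (∑x2x3)²]
Var(β̂3) = σ² ∑x2² / [∑x2² ∑x3² – (∑x2x3)²]
It can be shown that the following properties hold:
1). β̂1, β̂2 and β̂3 are linear functions of Y.
2). β̂1, β̂2 and β̂3 are unbiased.
3). β̂1, β̂2 and β̂3 have minimum variance in the class of linear unbiased estimators.
It can be shown that an unbiased estimator of the variance of U, σ², is
σ̂² = ∑e² / (n – 3)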
Multiple Regression Model:
Testing of more than one restriction jointly:
Suppose we want to test
H0: β2 = 1, β3 = 0
H1: β2 ≠ 1 and/or β3 ≠ 0
and the model is
Y = β1 + β2X2 + β3X3 + β4X4 + U
[We count the number of restrictions, not the number of parameters involved in them.]
The test statistic is
F = [(∑eR² – ∑eU²)/r] / [∑eU²/(n – k)] ~ F(r, n–k)
where
∑eU² = residual sum of squares of the unrestricted equation
∑eR² = residual sum of squares of the restricted equation
r = number of restrictions, n = sample size, k = number of parameters in the unrestricted equation.
The steps are as follows:
1). Estimate the unrestricted equation to compute β̂1, β̂2, β̂3, β̂4. Then compute
Ŷi = β̂1 + β̂2X2i + β̂3X3i + β̂4X4i
Then ei = Yi – Ŷi. Finally compute ∑eU². Suppose ∑eU² = 50.
2). Impose the restrictions given in H0. In our example this yields
Y = β1 + (1)X2 + (0)X3 + β4X4 + U
Y – X2 = β1 + β4X4 + U
Estimate β1 and β4 and compute β̂1 = …, β̂4 = …. Now compute the fitted values and ei = Yi – Ŷi from this restricted equation. Finally compute ∑eR². Suppose we have ∑eR² = 60.
3). Compute the F-statistic. Note these values: ∑eU² = 50, ∑eR² = 60, r = 2, n = 34 and k = 4. Now plug in the values:
F = [(60 – 50)/2] / [50/(34 – 4)] = 5/(50/30) = (5 × 30)/50 = 3.
4). Conclusion. We conclude by comparing the calculated F-value with the critical F-value. In our case the critical F-value at r = 2 and n – k = 30 degrees of freedom is supposed to be 2.87.
In our example the calculated F-value > critical F-value, so we reject H0.
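Step 3 can be written as a small function. This is a sketch only; the name f_statistic and its argument names are hypothetical, and the inputs are the supposed values from the example (residual sums of squares 60 and 50, two restrictions, n = 34, k = 4).

```python
def f_statistic(sse_restricted, sse_unrestricted, r, n, k):
    """F = [(SSE_R - SSE_U) / r] / [SSE_U / (n - k)].

    r is the number of restrictions, n the sample size and k the number
    of parameters in the unrestricted equation.
    """
    numerator = (sse_restricted - sse_unrestricted) / r
    denominator = sse_unrestricted / (n - k)
    return numerator / denominator


F = f_statistic(sse_restricted=60, sse_unrestricted=50, r=2, n=34, k=4)
print(round(F, 4))
```

The restricted sum of squares can never be smaller than the unrestricted one, so F is non-negative; a large value says the restrictions cost a lot of fit.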
How to check the F-value in the table: use numerator degrees of freedom r and denominator degrees of freedom n – k at the chosen level of significance.
Consider now the general model:
Y = β1 + β2X2 + β3X3 + … + βkXk + U
Special Case 1: Just one restriction.
Examples:
H0: β1 = 0.
H0: β1 = 1.
H0: β2 + β3 = –1.
H0: β2 + β3 + β4 + β5 = 0.
H0: β2 + β3 + β4 + … + βk = 0.
In this case we can show that
F = t², critical F = (critical t)²
|t| > |t critical| <=> F > F critical
|t| = |t critical| <=> F = F critical
|t| < |t critical| <=> F < F critical
So the "t" and "F" tests lead to identical conclusions.
Special Case 2: Each parameter except the intercept is set equal to zero.
H0: β2 = 0, β3 = 0, β4 = 0, …, βk = 0.
H1: At least one βj ≠ 0 [j = 2, 3, 4, …, k]
In this case the restricted model becomes
Y = β1 + U
Or
Y = b1 + V
Now, in the restricted model the OLS estimator of b1 is Ȳ, so
∑eR² = ∑(Yi – Ȳ)² = ∑y²
and the number of restrictions is k – 1, so
F = [(∑y² – ∑eU²)/(k – 1)] / [∑eU²/(n – k)]
where all values come from the unrestricted model, so we can ignore the subscript U. We can write the statistic as
F = [(∑y² – ∑e²)/(k – 1)] / [∑e²/(n – k)]
It is a test of the overall performance of the model.
Note 1: In the general case,
F = [(∑eR² – ∑eU²)/r] / [∑eU²/(n – k)]
Or, divide all terms by ∑y² (valid when both equations have the same dependent variable Y):
F = [(RU² – RR²)/r] / [(1 – RU²)/(n – k)]
The F-statistic indicates how much imposing the restrictions decreases R²; equivalently, it indicates the increase in R² due to removal of the restrictions.
Note 2: In Special Case 2,
F = [(∑y² – ∑e²)/(k – 1)] / [∑e²/(n – k)]
Divide all terms by ∑y²:
F = [R²/(k – 1)] / [(1 – R²)/(n – k)]
When R² = 0, F = 0. The F-statistic tests whether R² is significantly different from zero.
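The overall-significance form of the statistic, F = [R²/(k – 1)] / [(1 – R²)/(n – k)], is easy to check against the weight and age example (R² = 0.81, n = 5, k = 2). With a single slope the test has one restriction, so F should equal the squared t-statistic of the slope. The function name f_from_r2 is hypothetical, and the numbers for t² are taken from that example (slope 1.8, ∑x² = 10, ∑e² = 7.6).

```python
def f_from_r2(r2, n, k):
    """Overall F-statistic: [R^2/(k-1)] / [(1-R^2)/(n-k)]."""
    return (r2 / (k - 1)) / ((1 - r2) / (n - k))


F = f_from_r2(0.81, n=5, k=2)

# Squared t-statistic of the slope: beta_hat^2 * sum(x^2) / sigma_hat^2.
beta_hat, sum_x2, sse, n = 1.8, 10, 7.6, 5
t_squared = beta_hat ** 2 * sum_x2 / (sse / (n - 2))

print(round(F, 4), round(t_squared, 4))
```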
MULTICOLLINEARITY:
It is an econometric problem. There are four questions to address for the problem:
What is the problem?
What are the consequences of the problem?
How to test for the problem?
What is the solution?
Recall the three variables regression model:
Yi = β1 + β2X2i + β3X3i + Ui
Also recall
β̂2 = [∑x2y ∑x3² – ∑x3y ∑x2x3] / [∑x2² ∑x3² – (∑x2x3)²]
β̂3 = [∑x3y ∑x2² – ∑x2y ∑x2x3] / [∑x2² ∑x3² – (∑x2x3)²]
Also note
Var(β̂2) = σ² ∑x3² / [∑x2² ∑x3² – (∑x2x3)²]
Var(β̂3) = σ² ∑x2² / [∑x2² ∑x3² – (∑x2x3)²]
In each equation the denominator is the same:
Ω = ∑x2² ∑x3² – (∑x2x3)²
Now suppose X2 and X3 are perfectly correlated: γ23 = ±1, so γ23² = 1. Since γ23² = (∑x2x3)² / (∑x2² ∑x3²), in this case Ω = 0 and
β̂2 = (numerator)/0, β̂3 = (numerator)/0,
which are undefined. So we cannot estimate β2 and β3 by OLS. It can be shown that β2 and β3 cannot be estimated at all. In fact the true values of β2 and β3 are not even well defined:
β2 = ∂E(Y)/∂X2 and β3 = ∂E(Y)/∂X3 do not exist,
because X2 cannot change while X3 is held constant.
On the other hand, if X2 and X3 are not at all correlated, then γ23 = 0. In this case ∑x2x3 = 0 and
β̂2 = ∑x2y/∑x2², β̂3 = ∑x3y/∑x3²
These are the OLS estimators when we regress
Y = α + β2X2 + U   (one model)
Y = α + β3X3 + U   (other model)
Also note that in this case
β2 = ∂E(Y)/∂X2 = dE(Y)/dX2
β3 = ∂E(Y)/∂X3 = dE(Y)/dX3
Finally note that in this case the multiple regression equation and the partial (simple) regression equations produce identical results.
Recap: In the case γ23 = ±1, the multiple regression equation fails both theoretically and in application. In the other extreme case, γ23 = 0, the multiple regression equation is not needed. So the only practical use of the multiple regression equation is when
γ23 ≠ 0, γ23 ≠ ±1.
���� � 37 -
ECONOMETRICS LECTURES OF DR. EATZAZ AHMED
Now multicollinearity can be defined as “strong but imperfect correlation among X variables”. It is a correlation problem in which the independent (X) variables are highly correlated, so that the quality of the estimators is poor.
Another way of understanding the problem is to look at the information content in the data on X2 and X3.
γ23 = ± 1: Information content is zero.
γ23 = 0: Information content is full.
γ23 ≠ 0, γ23 ≠ ± 1:
| γ23 | is low: Information content is rich.
| γ23 | is high: Information content is poor.
Multicollinearity is a problem of poor information content in the data. Estimation is possible, but the estimators perform poorly.
Concept of Multicollinearity:
Types of situation in which multicollinearity can arise.
Multicollinearity is most common in time series data, because in such data the variables are highly correlated due to a common trend. Example: my father's age and my own age move one for one over time; the relationship is spurious, not causal.
Multicollinearity can also arise if the X variables are poorly specified. Examples:
Exchange Rate.
ER = α + β CAD + γ Export + θ Import + U.
In this equation the same information is repeated on the right-hand side, because exports and imports are included in the Current Account Deficit (CAD). It is better to include either CAD alone or exports and imports.
Consumer Price Index.
CPI = α + β CC + γ DD + θ TD + λ OD + U
In this equation DD and TD both come from commercial banks and CC is a major part of money. Better specifications are
CPI = α + β [CC + DD + TD + OD] + U
Or CPI = α + β M2 + U [M2 = CC + DD + TD + OD]
Or CPI = α + β (CC + DD) + γ (TD + OD) + U
This is a model specification problem. The choice is a matter of judgment, not a matter of science.
Consequences of Multicollinearity:
Note that OLS estimators remain BLUE.
(1) The variances of OLS estimators become large. Recall the formula for variances:
Var (β̂2) = σ² / [∑x2² (1 - γ²23)]
Var (β̂3) = σ² / [∑x3² (1 - γ²23)]
If | γ23 | is high, (1 - γ²23) will be low, therefore Var (β̂2) and Var (β̂3) will be large. It follows that the standard errors will also be large. Thus the t-value for H0: βj = 0,
t = (β̂j - 0) / SE (β̂j),
will become small. Therefore we may accept H0 when we should not. In other words, we may wrongly conclude that the X variables do not affect Y.
Example: CPIt = α + β M2t + γ ERt + θ Yt + λ CPIt-1 + Ut
With strong multicollinearity in such data we may accept H0: βj = 0; the standard errors are inflated and the t-values are misleading.
Another implication of this consequence is that β̂j varies quite a bit from sample to sample. In particular, even small changes in the sample may produce large changes in β̂j. Take the CPI equation above: we may obtain β̂ = 2.1 when we use annual data from 1970 to 2002 and β̂ = -3.7 when the data are from 1970 to 2005. The conclusion is that the β̂'s change erratically with small changes in the data, the specification etc. When t-values are small the β̂'s are volatile, there is no robustness (stability) in the model, and the model cannot be trusted.
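The effect of | γ23 | on the variances can be tabulated directly. The following is a minimal sketch, assuming the variance formula Var (β̂j) = σ² / [∑xj² (1 - γ²23)] of the three-variable model; the function name `inflation_factor` is illustrative and not part of the lectures.

```python
def inflation_factor(r23):
    """Factor 1 / (1 - r23^2) by which correlation between X2 and X3
    multiplies Var(b2) and Var(b3) relative to the uncorrelated case."""
    return 1.0 / (1.0 - r23 ** 2)

# Tabulate the factor for increasingly strong correlation.
for r in (0.0, 0.5, 0.9, 0.99):
    print(f"|r23| = {r:4.2f}  variance multiplied by {inflation_factor(r):8.2f}")
```

The factor grows without bound as | γ23 | approaches 1, which is the perfect-collinearity case in which the estimators are not defined at all.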
(2) Recall the formula for Co-Var (β̂2, β̂3):
Co-Var (β̂2, β̂3) = - γ23 σ² / [(1 - γ²23) √(∑x2² ∑x3²)]
If γ23 is high (due to multicollinearity), this will make Co-Var (β̂2, β̂3) large in absolute value. Further note that
Corr (β̂2, β̂3) = - γ23.
If X2 and X3 are positively and highly correlated with each other, then β̂2 and β̂3 have a negative and very large correlation: under-estimation of β2 will accompany over-estimation of β3 and vice versa. Example: β2 = 100 and β3 = 250; β3 = 300 and β2 = 100. Likewise, if X2 and X3 are negatively and highly correlated, then over (under) estimation of β2 will accompany over (under) estimation of β3.
Consequences (1) and (2) imply that the estimated parameters (β̂'s) become volatile (unreliable, unstable) and too sensitive. Their magnitudes are quite likely to be unrealistic in terms of sign and size (even some significant parameters may have the wrong sign). Example (variables such as CPI, Y, ER, IR): an estimated MPC of 1.3 or -1.3, or an own-price elasticity, which should be negative, estimated as positive. Such estimates are neither realistic nor reliable.
Testing and Diagnostics of Multicollinearity:
Formal tests of multicollinearity are complex but not very fruitful. In practice we may rely on certain clues and symptoms (indicators).
(1) Multicollinearity is likely to be present if the data are time series observed at long intervals (for example annual rather than monthly data), unless the data are de-trended (the common trend removed).
(2) A very popular symptom of multicollinearity is that the overall performance of the estimated equation is good in terms of a high value of R², but the t-statistics for individual regression coefficients for H0: βj = 0 are mostly insignificant.
Example 1:
Log CPIt = 1.2 + 0.3 log M2t + 0.7 log Yt - 0.25 log ERt + 0.97 log CPIt-1, R² = 0.9938
Variable      Coefficient   Expected sign   t-value   Result
log M2        0.3           (+)             0.85      Insignificant
log Y         0.7           (-)             1.37      Insignificant
log ER        -0.25         (+)             -0.09     Insignificant
log CPIt-1    0.97          (+)             44.73     Highly significant
R² = 0.9938 is very impressive. In this case the overall performance is good, but the individual effects are misleading: two signs are wrong, standard errors are high, and t-values are small and insignificant. The good overall fit comes mainly from the presence of lagged CPI.
Example 2:
Log (Qw) = 1.2 + 0.3 log Y - 0.1 log (Pw) + 0.3 log (Pr) + 1.1 log (Npop), R² = 0.99
Variable      Coefficient   t-value   Result
log Y         0.3           1.57      Insignificant
log Pw        -0.1          -0.73     Insignificant
log Pr        0.3           1.21      Insignificant
log Npop      1.1           17.43     Highly significant
Where the coefficient of log Y is the income elasticity, the coefficient of log Pw is the price elasticity of wheat, and the coefficient of log Pr is the cross-price elasticity with respect to rice. Here the signs are all as expected, so the results are acceptable in that respect.
(3) Parameter estimates are too sensitive to changes in the sample, the definition of variables and the specification of the model. If we change the sample a little by adding new data and re-estimate, the results change drastically (a sign that Var (β̂) is too high). Variables can be defined in more than one way, for example income as GNP, GDP or GNI, or alternative definitions of saving, and the results change drastically with the definition used. The results also change with the way we specify the model:
C = α + β1 Y + γ R + ... -------- Linear function.
Log C = α + β2 log Y + γ log R + ... -------- Logarithmic function.
β1 = ∂C / ∂Y
β2 = ∂ Log C / ∂ Log Y = (∂C / ∂Y) (Y / C)
Therefore β2 = β1 (Y / C).
Drastic changes mean the estimates are too sensitive.
(4) Detection through Correlation Matrix. Let us take an example:
Log P = α + β log M + γ log GDP + θ log ER + U
where P = CPI, M = money supply (Ms), GDP = real GDP and ER = real exchange rate.
Construct the correlation matrix for all variables.
Rule of Thumb: If correlation among X variables is stronger than correlation between X and Y then multicollinearity is present. Example: suppose X and Y are strongly correlated. Even in this case the correlation among X variables can undermine the relationship.
          Relationship between X and Y        Among X variables
Case 1:   γ P,M = 0.95, γ P,ER = 0.70         γ M,ER = 0.80     (Not ok) (Multicollinearity exists)
Case 2:   γ P,M = 0.95, γ P,Y = 0.65          γ M,Y = 0.60      (Ok) (No multicollinearity exists)
Case 3:   γ P,GDP = 0.95, γ P,ER = 0.70       γ GDP,ER = 0.55   (Ok) (No multicollinearity exists)
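The rule of thumb can be applied mechanically to a correlation matrix. The sketch below is only one possible reading of the rule: it flags a pair of X variables when their mutual correlation exceeds the weaker of their two correlations with Y. That interpretation, and the function name `flag_multicollinearity`, are assumptions made for illustration.

```python
import numpy as np

def flag_multicollinearity(corr, names, y_name):
    """Return pairs of X variables whose mutual correlation is stronger than
    the weaker of their correlations with Y (assumed reading of the rule)."""
    idx = {name: i for i, name in enumerate(names)}
    y = idx[y_name]
    xs = [name for name in names if name != y_name]
    flagged = []
    for a in range(len(xs)):
        for b in range(a + 1, len(xs)):
            i, j = idx[xs[a]], idx[xs[b]]
            r_xx = abs(corr[i, j])
            r_xy = min(abs(corr[i, y]), abs(corr[j, y]))
            if r_xx > r_xy:
                flagged.append((xs[a], xs[b]))
    return flagged

# Case 1 of the notes: corr(P, M) = 0.95, corr(P, ER) = 0.70, corr(M, ER) = 0.80
case1 = flag_multicollinearity(
    np.array([[1.00, 0.95, 0.70],
              [0.95, 1.00, 0.80],
              [0.70, 0.80, 1.00]]),
    ["P", "M", "ER"], "P")

# Case 2 of the notes: corr(P, M) = 0.95, corr(P, Y) = 0.65, corr(M, Y) = 0.60
case2 = flag_multicollinearity(
    np.array([[1.00, 0.95, 0.65],
              [0.95, 1.00, 0.60],
              [0.65, 0.60, 1.00]]),
    ["P", "M", "Y"], "P")

print(case1, case2)
```

Under this reading Case 1 flags the pair (M, ER) and Case 2 flags nothing, in line with the "Not ok" and "Ok" verdicts of the table.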
Solutions of Multicollinearity:
(1) Exclude variable(s) causing multicollinearity.
This solution makes sense only when the variable being dropped is not important in the overall framework of our analysis. Example: Pt = α + β Mt + γ GDPt + θ ERt + λ Pt-1 + π Pt-2 + U. If Pt-2 is causing multicollinearity we should exclude it, since it is not very important. If Mt is causing multicollinearity we should not exclude it, because without Mt (money supply) we cannot explain inflation. Note: Unfortunately, it is often the important variables that cause multicollinearity.
(2) Increase the sample size.
This solution in general is not very appealing, because a long sample is desirable in any case (why wait for multicollinearity to arise?). In most cases the sample is small in the first place just because a larger sample is not available. Example: saying that income reduces poverty does not explain why poverty exists. However, a meaningful interpretation of this solution can be as follows.
(a) Split the data into quarterly or monthly data.
=> 30 x 1 = 30 yearly observations.
=> 30 x 4 = 120 quarterly observations.
=> 30 x 12 = 360 monthly observations.
Two issues arise. First, can the data actually be split by month and quarter? Second, monthly and quarterly data may not help, because they share the same common trend. Some series, such as the exchange rate, are available weekly, monthly and daily. Real activity variables such as GDP and saving cannot be split accurately; quarterly figures are only approximations. In the equation of Pt on Mt, GDPt, ERt, Pt-1, Pt-2, GDPt cannot be split accurately. If the advantage of splitting outweighs this, then split the data.
(b) Split the data on a spatial basis. For example, data on Pakistan can be split into data on the provinces Punjab, Sindh, Balochistan and NWFP (province-wise, area-wise, time-wise etc.).
(c) Merge two different (not identical) but similar data sets. For example, we can merge 30 observations on Pakistan with 30 observations each on India, Sri Lanka and Bangladesh. It is a workable solution.
(3) Filter the data.
(1) In time series we can apply first differencing.
Yt = α + β Xt + γ Zt + Ut
Yt-1 = α + β Xt-1 + γ Zt-1 + Ut-1
Yt - Yt-1 = β (Xt - Xt-1) + γ (Zt - Zt-1) + Ut - Ut-1
Or ∆Yt = β ∆Xt + γ ∆Zt + ∆Ut.
It reduces the chances of multicollinearity drastically, but it also filters out valuable variation in the variables. It is not a good solution; the intercept is also gone. Suppose multicollinearity is caused by a common trend. Why not control for the trend? To control for the trend, we include a time variable in the equation, t = 0, 1, 2, 3, ...
Yt = α + θt + β Xt + γ Zt + Ut
Yt-1 = α + θ(t-1) + β Xt-1 + γ Zt-1 + Ut-1
∆Yt = θ + β ∆Xt + γ ∆Zt + ∆Ut, where θt - θ(t-1) = θ [t - (t-1)] = θ.
The relationship between the independent and dependent variables will become weak.
(2) Both in cross-section and time series data we can take ratios.
C = α + β Y + γ W + U. [When Y increases, W increases]
In time series we divide by population (per capita income). In cross-section we divide by household size (per capita household income).
We are redefining the variables in per capita terms, not merely dividing the equation by N; the common trend will become weak.
Suppose we have a Cobb-Douglas production function with K = Capital, L = Labor, M = Material, E = Energy.
Log Q = log A + α log K + β log L + γ log E + θ log M + U.
Or Log Q = a0 + α log K + β log L + γ log E + θ log M + U.
Firms with a higher capital stock also employ more labor, so the inputs are correlated.
Test: H0: α + β + γ + θ = 1
H1: α + β + γ + θ ≠ 1
Suppose H0 is accepted; then we can write β = 1 - α - γ - θ. Now the production function becomes
Log Q = a0 + α log K + (1 - α - γ - θ) log L + γ log E + θ log M + U
Or Log (Q/L) = a0 + α log (K/L) + γ log (E/L) + θ log (M/L) + U
Y per worker = f (K per worker, E per worker, M per worker).
Note: A problem does not always have to be tackled; sometimes it is better to leave it alone.
FRISCH-WAUGH THEOREM:
Suppose we have; Y = β1 + β2 X + β3 Z + U
Then β2 = ∂E (Y) / ∂X, holding Z constant. The effect of Z can also be eliminated as follows;
Regress Y on Z (by OLS): Y = a0 + a1 Z + V, and obtain the residuals Y* = Y - â0 - â1 Z.
Regress X on Z (by OLS): X = b0 + b1 Z + W, and obtain the residuals X* = X - b̂0 - b̂1 Z.
Now regress by OLS the equation Y* = C0 + C1 X* + Є
It can be proved that Ĉ1 = β̂2: the result is the same whether we hold Z constant or eliminate it.
∆ Log Xt = log Xt - log Xt-1
= (log Xt - log Xt-1) / (t - (t - 1))
= ∆ Log Xt / ∆t
≈ ∂ log Xt / ∂t = (∂ log Xt / ∂Xt) (dXt / dt)
= (1 / X) ∆X = growth rate of X.
Note: We have to weaken the collinearity to decrease multicollinearity.
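The Frisch-Waugh result can be checked numerically. The following is a minimal sketch on simulated data; the coefficient values, the seed and the helper `ols` are assumptions chosen only for illustration.

```python
import numpy as np

def ols(y, X):
    """OLS coefficients of y on the columns of X (X already holds the intercept)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

rng = np.random.default_rng(42)
n = 500
z = rng.normal(size=n)
x = 0.6 * z + rng.normal(size=n)              # X is correlated with Z
y = 1.0 + 2.0 * x - 1.5 * z + rng.normal(size=n)
ones = np.ones(n)

# Multiple regression: Y on intercept, X and Z
b_full = ols(y, np.column_stack([ones, x, z]))

# Residuals of Y on Z and of X on Z
Z = np.column_stack([ones, z])
y_star = y - Z @ ols(y, Z)
x_star = x - Z @ ols(x, Z)

# Regress Y* on X*
c = ols(y_star, np.column_stack([ones, x_star]))

print("slope of X in the multiple regression:", b_full[1])
print("slope of X* in the residual regression:", c[1])
```

The two printed slopes agree up to floating-point rounding, which is the content of the theorem.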
AUTOCORRELATION:
Definition:
Correlation between Xi and Yj, i ≠ j, where X and Y may be the same or different variables, is called serial correlation. Example: correlation between Mt-1 and Pt, or between Yt and Rt-2. If X and Y are the same variable, serial correlation becomes autocorrelation: correlation between Xi and Xj. It arises in time series data and in cross-section data as well.
Example: Correlation between Ct, Ct-1 [Consumption]
Correlation between Qt, Qt-4 [Output]
Correlation between Tt, Tt-12 [Temperature]
A special case of serial correlation is the correlation between Xi and Yi [contemporaneous].
Autocorrelation Problem:
This problem means presence of autocorrelation in the error term [Ui] in the regression equation Yi = α + β Xi + Ui. Autocorrelation means Cov (Ui, Uj) ≠ 0 for at least some i ≠ j. Recall the assumption Cov (Ui, Uj) = 0 for all i ≠ j. Autocorrelation is a violation of this assumption and mainly a problem of time series data. It is usually present because the dependent variable has inertia (sluggishness, stickiness) and this inertia is not captured by any variable on the right-hand side.
Ct = α + β Yt + Ut [4 years monthly data]
We have excluded the variables which capture the inertia; the error becomes autocorrelated when the error term captures the inertia.
Consequences of Autocorrelation:
Note: OLS estimators remain linear and unbiased.
1). OLS estimators no longer have minimum variance in the class of linear unbiased estimators; they are no longer best.
2). The ordinary formula for calculating variances is no longer valid:
Var (β̂) ≠ σ² / ∑x², because ∑∑ Cov (Ui, Uj) ≠ 0 over i ≠ j.
OLS estimators are not efficient; they have larger variances. The invalid formula by itself is not a big problem: we can correct it by using the right variance formula, and the estimators are then linear and unbiased but still not best. The resulting variances are high, so the standard errors are also high, t-values are low, and we may accept hypotheses that mislead us about the parameters.
[Testing and solution of autocorrelation are postponed until we understand the various forms of autocorrelation].
Form of Autocorrelation:
Consider the model Yt = β0 + β1 Xt + Ut
1). Auto Regressive system [AR (p) model]
Ut = α0 + α1 Ut-1 + ... + αp Ut-p + Єt
The terms in Ut-1, ..., Ut-p form the autocorrelated portion. Єt is the non-autocorrelated portion, also called the innovation, news, shock or white noise error.
2). Moving Average system [MA (q) model]
Ut = γ + Єt + γ1 Єt-1 + ... + γq Єt-q (regressed over errors, a chicken and egg problem)
= γ + γ0 Єt-0 + γ1 Єt-1 + ... + γq Єt-q [γ0 = 1]
Why do we call it a moving average? We can write the equation as
Ut = γ + Ф [Moving average of Єt]
3). ARMA (p, q) model:
Ut = α0 + α1 Ut-1 + ... + αp Ut-p + γ1 Єt-1 + ... + γq Єt-q + Єt
The terms in Ut-1, ..., Ut-p are the AR (p) part and the terms in Єt-1, ..., Єt-q are the MA (q) part.
[Some call it the ARIMA model]
AR (1) Model:
Ut = α0 + α1 Ut-1 + Єt
This is the most popular and a simple way to model autocorrelation.
Assumptions:
1). Єt is a random variable for all t.
2). E (Єt) = 0.
3). Var (Єt) = σ² for all t.
4). Cov (Єt, Єt’) = 0 for all t ≠ t’. [at two different points they are not correlated].
5). | α1 | < 1.
Properties of Ut:
Solve Ut as follows
Ut = α0 + α1 Ut-1 + Єt
= α0 + α1 [α0 + α1 Ut-2 + Єt-1] + Єt
= α0 + α0 α1 + α1² Ut-2 + α1 Єt-1 + Єt
= α0 + α0 α1 + α1² [α0 + α1 Ut-3 + Єt-2] + α1 Єt-1 + Єt
= α0 + α0 α1 + α0 α1² + α1³ Ut-3 + α1² Єt-2 + α1 Єt-1 + Єt
[We will end up with the following equation]
Ut = α0 + α0 α1 + α0 α1² + α0 α1³ + ... + Єt + α1 Єt-1 + α1² Єt-2 + α1³ Єt-3 + ... + α1^∞ Ut-∞ [α1^∞ = 0 because | α1 | < 1]
Or
Ut = α0 [1 + α1 + α1² + ...] + Єt + α1 Єt-1 + α1² Єt-2 + ...
= α0 (1 - α1^∞) / (1 - α1) + Єt + α1 Єt-1 + α1² Єt-2 + ...
Ut = α0 / (1 - α1) + Єt + α1 Єt-1 + α1² Єt-2 + ... MA (∞)
AR (1) model = MA (∞) model
Ut = γ + γ0 Єt-0 + γ1 Єt-1 + γ2 Єt-2 + ...
Ut is a weighted average of past innovations (shocks), which carry useful information.
Parametric Properties of Ut:
1). E (Ut) = α0 / (1 - α1) + E (Єt) + α1 E (Єt-1) + α1² E (Єt-2) + ...
= α0 / (1 - α1) + (0) + α1 (0) + α1² (0) + ...
= α0 / (1 - α1) + 0 [1 + α1 + α1² + ...]
= α0 / (1 - α1) + 0 [1 / (1 - α1)]
= α0 / (1 - α1).
[The zeros are summed infinitely many times. Since 0 × ∞ is indeterminate (1 / 0 = ∞ => 1 = 0 × ∞), we factor out the zero and multiply it by the finite sum 1 / (1 - α1).]
In the AR process for Ut, if E (Єt) = 0, we should set α0 = 0, so we'll have E (Ut) = 0 and
Ut = α1 Ut-1 + Єt
2). Var (Ut) = Var (Єt + α1 Єt-1 + α1² Єt-2 + ...)
= Var (Єt) + Var (α1 Єt-1) + Var (α1² Єt-2) + ... + (covariances)
= σ² + σ² α1² + σ² α1⁴ + ... + (0)
= σ² [1 + α1² + α1⁴ + ...]
= σ² [1 - (α1²)^∞] / (1 - α1²) [for example (0.8)^∞ = 0]
Var (Ut) = σ² / (1 - α1²) ---------- (1a)
This is a constant variance for all t; there is no heteroscedasticity in Ut.
3). Cov (Ut, Ut-1) = E {[Ut - E (Ut)] [Ut-1 - E (Ut-1)]} = E [Ut Ut-1]
= E {[Єt + α1 Єt-1 + α1² Єt-2 + ...] x [Єt-1 + α1 Єt-2 + α1² Єt-3 + ...]}
= σ² α1 + σ² α1³ + σ² α1⁵ + ...
= σ² α1 [1 + α1² + α1⁴ + ...]
= σ² α1 / (1 - α1²)
Cov (Ut, Ut-2) = E {[Ut - E (Ut)] [Ut-2 - E (Ut-2)]} = E [Ut Ut-2]
= E {[Єt + α1 Єt-1 + α1² Єt-2 + ...] x [Єt-2 + α1 Єt-3 + α1² Єt-4 + ...]}
= σ² α1² + σ² α1⁴ + σ² α1⁶ + ...
= σ² α1² [1 + α1² + α1⁴ + ...]
= σ² α1² / (1 - α1²)
Cov (Ut, Ut-3) = E {[Ut - E (Ut)] [Ut-3 - E (Ut-3)]} = E [Ut Ut-3]
= E {[Єt + α1 Єt-1 + α1² Єt-2 + ...] x [Єt-3 + α1 Єt-4 + α1² Єt-5 + ...]}
= σ² α1³ + σ² α1⁵ + σ² α1⁷ + ...
= σ² α1³ [1 + α1² + α1⁴ + ...]
= σ² α1³ / (1 - α1²)
In general, we obtain;
Cov (Ut, Ut-i) = σ² α1^i / (1 - α1²)
Note Cov (Ut, Ut-0) = σ² α1⁰ / (1 - α1²) = Var (Ut) = σ² / (1 - α1²), as given in equation (1a).
4). Corr (Ut, Ut-i) = Cov (Ut, Ut-i) / [SD (Ut) SD (Ut-i)]
= Cov (Ut, Ut-i) / [SD (Ut) SD (Ut)]
= Cov (Ut, Ut-i) / Var (Ut)
= α1^i
This is the autocorrelation coefficient at lag length i; viewed as a function of i, it is the autocorrelation function.
[Figure: autocorrelation function for α1 > 0 (α1 = 0.8)]
The autocorrelation function is geometrically declining and approaches zero as the lag length increases.
[Figure: autocorrelation function for α1 < 0 (α1 = -0.5)]
The autocorrelation function is oscillatory, starting with a negative value at lag length one and approaching zero.
Price level = α + β (money supply) + Ut
Another case of AR (p):
Suppose we have quarterly data to estimate the equation Yt = α + β Xt + Ut. We expect that
Ut = α0 + α1 Ut-1 + α2 Ut-2 + α3 Ut-3 + α4 Ut-4 + Єt
To simplify matters we assume α0 = α1 = α2 = α3 = 0.
We are left with Ut = α4 Ut-4 + Єt -------------- (1)
Assumptions about Єt:
1). Єt is a random variable for all t.
2). E (Єt) = 0.
3). Var (Єt) = σ² for all t.
4). Cov (Єt, Єt’) = 0 for all t ≠ t’.
Properties of Ut:
1). Equation (1) can be expressed as an MA (∞) process; Ut = Єt + α4 Єt-4 + α4² Єt-8 + ...
2). E (Ut) = 0.
3). Var (Ut) = σ² / (1 - α4²)
4). Cov (Ut, Ut-i):
i = 0 => σ² / (1 - α4²)
i = 1 => 0
i = 2 => 0
i = 3 => 0
i = 4 => α4 σ² / (1 - α4²)
i = 5 => 0
i = 6 => 0.
It follows that Cov (Ut, Ut-i) = α4^(i/4) σ² / (1 - α4²) when i is a multiple of 4, and = 0 otherwise.
Autocorrelation function is;
[Figure: autocorrelation function for α4 > 0 (α4 = 0.5)]
[Figure: autocorrelation function for α4 < 0 (α4 = -0.5)]
In computer software the autocorrelation function is shown as a part of the “Correlogram” (one part is the autocorrelation function, the other is the partial autocorrelation function etc.).
[Figures: correlograms for AR (1) and AR (4)]
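The theoretical autocorrelation functions behind such correlograms are easy to tabulate. The sketch below assumes the results Corr (Ut, Ut-i) = α1^i for the AR (1) error and α4^(i/4) at multiples of four (zero otherwise) for the error Ut = α4 Ut-4 + Єt; the function names are illustrative.

```python
def acf_ar1(alpha1, max_lag):
    """Autocorrelations of U_t = alpha1 * U_{t-1} + e_t at lags 0..max_lag."""
    return [alpha1 ** i for i in range(max_lag + 1)]

def acf_ar4_seasonal(alpha4, max_lag):
    """Autocorrelations of U_t = alpha4 * U_{t-4} + e_t at lags 0..max_lag:
    alpha4 ** (i / 4) when i is a multiple of 4, and 0 at every other lag."""
    return [alpha4 ** (i // 4) if i % 4 == 0 else 0.0
            for i in range(max_lag + 1)]

print(acf_ar1(0.8, 5))            # geometric decline towards zero
print(acf_ar1(-0.5, 5))           # oscillating, negative at lag one
print(acf_ar4_seasonal(0.5, 8))   # spikes only at lags 4 and 8
```

Plotting these lists against the lag gives the shapes described in the text: smooth decay, oscillating decay, and isolated spikes at every fourth lag.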
Identifying the process from the correlogram gives only a symptom: it is a kind of art rather than a perfect science, but a very useful idea.
MA (1) Model:
Ut = Єt + β1 Єt-1
Єt satisfies all standard properties; we can show that
1). E (Ut) = 0.
2). Var (Ut) = (1 + β1²) σ²
3). Cov (Ut, Ut-i) = β1 σ² for i = 1
Cov (Ut, Ut-i) = 0 for i ≥ 2
Ut = Єt + β1 Єt-1
Ut-1 = Єt-1 + β1 Єt-2
Ut-2 = Єt-2 + β1 Єt-3
Ut and Ut-1 share the innovation Єt-1, but Ut and Ut-2 share no innovation: no correlation.
4). Corr (Ut, Ut-i) = β1 / (1 + β1²) for i = 1
= 0 for i ≥ 2
Correlogram:
[Figures: correlograms for MA (1) with β1 < 0 and with β1 > 0]
[Figures: correlograms for MA (4), Ut = Єt + β Єt-4, with β > 0 and with β < 0]
Testing of Autocorrelation:
1). Durbin Watson test:
The DW test is based on the DW statistic
ď = ∑ (℮t - ℮t-1)² / ∑ ℮t²
This formula can be further expressed as follows;
ď ≈ 2 (1 - ρ̂), where ρ̂ = ∑ ℮t ℮t-1 / ∑ ℮t²
H0: ρ = 0 => ď = 2
H1: ρ ≠ 0 => ď ≠ 2
Now unfortunately the distribution of ď is not unique; it depends on the actual data, and computing the exact distribution for each data set is not practical. Durbin and Watson have provided the two extreme distributions, as shown in the following graph.
[Figure: the two extreme distributions of ď, with lower and upper critical values dl and du]
The table of critical ď values provides dl and du for various values of n (number of observations) and k` (number of parameters minus one).
Example:
CPI = α + β M2 + γ GDP + θ ER + U
Sample 1970-71 to 2004-05, n = 35, k` = 3
From the table we have dl = 1.42, du = 1.71
Suppose the calculated ď = 2.74. We can determine the right-tail critical values:
4 - dl = 4 - 1.42 = 2.58
4 - du = 4 - 1.71 = 2.29
Since the calculated ď is greater than both 4 - du and 4 - dl, we reject H0 and conclude that (negative) autocorrelation is present. This test has some problems.
Notes on the test:
1). The test statistic has an inconclusive range, so it may not produce a concrete conclusion.
2). The test is especially designed for the AR (1) process, not for higher-order AR processes, MA processes or others.
3). Despite the above two limitations the test is powerful in detecting autocorrelation, especially in its most common form, the AR (1) process.
H0 is:           True            False
Reject           Type I error    Power
Accept           Confidence      Type II error
4). The DW test is the most popular test.
5). DW gives biased results when a lagged dependent variable appears on the right-hand side. Example: Pt = α + β Mt + γ Yt + θ Et + λ Pt-1 + Ut
Note: (1) is a reality, not a weakness; (2) and (5) are serious problems.
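The DW statistic and the decision regions can be written out in a few lines. This is a minimal sketch, assuming the usual definition ď = ∑ (℮t - ℮t-1)² / ∑ ℮t²; the function names are illustrative, and the bounds dl and du must still be read from the DW table.

```python
import numpy as np

def durbin_watson(resid):
    """DW statistic: sum of squared first differences of the residuals
    divided by the sum of squared residuals."""
    resid = np.asarray(resid, dtype=float)
    return np.sum(np.diff(resid) ** 2) / np.sum(resid ** 2)

def dw_decision(d, dl, du):
    """Place d in the five regions defined by dl, du, 4 - du and 4 - dl."""
    if d < dl:
        return "reject H0: positive autocorrelation"
    if d > 4 - dl:
        return "reject H0: negative autocorrelation"
    if du < d < 4 - du:
        return "do not reject H0"
    return "inconclusive"

# Values of the example in the notes: d = 2.74, dl = 1.42, du = 1.71
decision = dw_decision(2.74, 1.42, 1.71)
print(decision)

# Residuals that alternate in sign push d above 2
print(durbin_watson([1, -1, 1, -1]))
```

With the numbers of the example, ď = 2.74 lies above 4 - dl = 2.58, so the sketch reports rejection of H0 in the direction of negative autocorrelation.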
2). Durbin-h Test:
Durbin-h is useful to test autocorrelation of first order when the lagged dependent variable is on the right-hand side.
h = ρ̂ √[n / (1 - n Var (λ̂))], where ρ̂ ≈ 1 - ď/2 and λ̂ is the estimated coefficient of the lagged dependent variable.
h ~ N (0, 1)
Critical values are
± 1.96 for 5% level of significance.
± 1.645 for 10% level of significance.
± 2.576 for 1% level of significance.
If h turns out to be an imaginary number, that is if n Var (λ̂) > 1, then we use another method.
3). Durbin’s Alternative Test:
Estimate the regression equation Yt = α + β Xt + Ut (possibly with Yt-1 among the regressors) by OLS and compute the regression residuals ℮t. Then estimate the following regression equation.
℮t = θ0 + θ1 ℮t-1 + θ2 ℮t-2 + θ3 ℮t-3 + ... + θp ℮t-p + γ Yt-1 + λ Xt + error.
Now test the null hypothesis,
H0: θ1 = 0, θ2 = 0, θ3 = 0, ..., θp = 0.
H1: At least one θj ≠ 0 for j = 1, 2, 3, ..., p.
Apply the F test. [Simple form, p = 1:]
℮t = θ0 + θ1 ℮t-1 + γ Yt-1 + λ Xt + error.
H0: θ1 = 0.
H1: θ1 ≠ 0.
4). Q-Test:
The Q-test is useful to test cumulative autocorrelation up to any order p permissible with the data.
H0: ρ1 = 0, ρ2 = 0, ρ3 = 0, ..., ρp = 0.
H1: At least one ρj ≠ 0 for j = 1, 2, 3, ..., p.
a). p = 1. First order autocorrelation.
b). p = 2. First and second order autocorrelation.
c). p = 3. First, second and third order autocorrelation, and so on.
For the Q-test the formula is;
Q = n ∑ ρ̂j² (sum over j = 1, ..., p)
Improved form of the formula;
Q = n (n + 2) ∑ ρ̂j² / (n - j) (sum over j = 1, ..., p)
Under H0, Q is distributed as χ² with p degrees of freedom.
Solutions for Autocorrelation:
Consider the following model
Yt = α + β Xt + Ut -------------------------- (1)
Ut = ρ Ut-1 + Єt -------------------------- (2)
Єt is white noise. [Random variables with zero mean, constant variance and zero autocorrelation]
The estimation procedure attempts to replace the autocorrelated variable Ut by the non-autocorrelated variable Єt. Consider
Yt = α + β Xt + Ut (1)
[Lag equation (1) by one period, multiply all terms by ρ, and subtract the new equation from (1)]
ρ Yt-1 = ρ α + ρ β Xt-1 + ρ Ut-1 (1`)
Subtract: Yt - ρ Yt-1 = α (1 - ρ) + β (Xt - ρ Xt-1) + Ut - ρ Ut-1
Or Yt - ρ Yt-1 = α (1 - ρ) + β (Xt - ρ Xt-1) + Єt ----------------- (A)
Now using equations (1) and (2) we can write
Ut = Yt - α - β Xt
ρ Ut-1 = ρ Yt-1 - ρ α - ρ β Xt-1
Subtract: Ut - ρ Ut-1 = (Yt - α - β Xt) - ρ (Yt-1 - α - β Xt-1) = Єt
Or (Yt - α - β Xt) = ρ (Yt-1 - α - β Xt-1) + Єt ----------------- (B)
Equation (A) or (B) has the error term Єt, which satisfies all the classical assumptions. However, the trouble is that both these equations are non-linear in parameters, because unknown coefficients multiply each other; we cannot derive a formula for the OLS estimators of α, β and ρ. Since we cannot use any unique formula to compute the OLS estimators of α, β and ρ, we'll have to apply some numerical algorithm. We'll consider two methods:
(1) Cochrane-Orcutt two step iterative method.
(2) A Version of Direct Search.
Cochrane-Orcutt two step iterative method:
Step 1a:
Start with some initial value of ρ; suppose we set ρ° = 0. Then equation (A) becomes;
Yt = α + β Xt + Єt -------------------------------- (A`)
Now apply OLS to compute α̂° and β̂°. These estimators are poor because they ignore autocorrelation.
Step 1b:
Substitute α̂° and β̂° in equation (B):
(Yt - α̂° - β̂° Xt) = ρ (Yt-1 - α̂° - β̂° Xt-1) + Єt
Or ℮t = ρ ℮t-1 + Єt -------------------------- (B`)
Apply OLS to compute ρ̂¹. Now use ρ̂¹ in equation (A);
Yt* = α X1* + β Xt* + Єt, where Yt* = Yt - ρ̂¹ Yt-1, X1* = 1 - ρ̂¹ and Xt* = Xt - ρ̂¹ Xt-1.
Apply OLS to yield α̂¹ and β̂¹. These are the two-step estimators of α and β. They are still not fully satisfactory: since ρ° = 0 is not true, α̂° and β̂° are poor, so ρ̂¹ is poor, and hence α̂¹ and β̂¹ are poor. However, we expect that ρ̂¹ is more likely to be closer to the true value of ρ than ρ° = 0. It follows, therefore, that α̂¹ and β̂¹ are preferable to α̂° and β̂°, but there is a possibility of improvement.
Step 2a:
Use α̂¹ and β̂¹ in equation (B) to compute ρ̂².
Or ℮t = ρ ℮t-1 + Єt ------------------- (B``)
Step 2b:
Use ρ̂² in equation (A) to compute α̂² and β̂². This process continues till convergence is achieved. Then the estimators of α and β become stable.
A Version of Direct Search:
Consider equation (A), which can be written as;
Yt = α (1 - ρ) + β (Xt - ρ Xt-1) + ρ Yt-1 + Єt.
This is a regression of Yt on Xt, Xt-1 and Yt-1:
Yt = θ0 + θ1 Xt - θ2 Xt-1 + θ3 Yt-1 + Єt, such that θ1 θ3 = θ2.
We start with initial values of all parameters, α°, β° and ρ°. For example we can set α° = 2, β° = 0.5 and ρ° = 0.7. This will yield
Yt = α° (1 - ρ°) + β° (Xt - ρ° Xt-1) + ρ° Yt-1 + ℮t
= 2 (1 - 0.7) + 0.5 Xt - 0.35 Xt-1 + 0.7 Yt-1 + ℮t
= 0.6 + 0.5 Xt - 0.35 Xt-1 + 0.7 Yt-1 + ℮t.
Now compute ℮t = Yt - Ŷt:
℮t = Yt - α° (1 - ρ°) - β° (Xt - ρ° Xt-1) - ρ° Yt-1
= Yt - 0.6 - 0.5 Xt + 0.35 Xt-1 - 0.7 Yt-1
Finally compute ∑℮². Let ∑℮² = 52. Now change one of the three parameters at a time and recompute ∑℮². For example we change β° from 0.5 to 0.6, keeping α° = 2 and ρ° = 0.7. Now compute the following expression
[∑℮² (at β° = 0.6) - ∑℮² (at β° = 0.5)] / (0.6 - 0.5) ---------- (a)
This is the so-called numerical derivative. If the expression in (a) is positive, it means that at β° = 0.5 the sum of squared errors is increasing in β, so we should set β° less than 0.5. Repeat the same procedure for α, β and ρ. Once we know the directions in which α, β and ρ should be searched, we can change the initial values and repeat the entire process.
Example: α° = 2, β° = 0.5, ρ° = 0.7.
Derivative with respect to α > 0
Derivative with respect to β < 0
Derivative with respect to ρ < 0
Now we can set α° = 0.2, β° = 0.8, ρ° = 0.9. Suppose the signs of the derivatives are now
Positive for α
Positive for β
Negative for ρ
Now set α° = -0.3, β° = 0.7, ρ° = 0.95.
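Of the two methods, the Cochrane-Orcutt iteration is the easier one to sketch in code. The following is a minimal illustration on simulated data, assuming the model Yt = α + β Xt + Ut with Ut = ρ Ut-1 + Єt; the true values (α = 2, β = 0.5, ρ = 0.7), the seed, the tolerance and the function names are assumptions made only for the example.

```python
import numpy as np

def ols(y, X):
    """OLS coefficients of y on the columns of X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def cochrane_orcutt(y, x, tol=1e-8, max_iter=100):
    """Alternate between estimating (alpha, beta) from equation (A) for a
    given rho and estimating rho from e_t = rho * e_{t-1} + error."""
    rho = 0.0  # initial value, as in Step 1a
    for _ in range(max_iter):
        # Quasi-differenced data; the first observation is lost
        y_star = y[1:] - rho * y[:-1]
        x_star = x[1:] - rho * x[:-1]
        const = np.full(len(y_star), 1.0 - rho)   # regressor multiplying alpha
        alpha, beta = ols(y_star, np.column_stack([const, x_star]))
        # Residuals of the original equation, then rho by OLS without intercept
        e = y - alpha - beta * x
        rho_new = np.sum(e[1:] * e[:-1]) / np.sum(e[:-1] ** 2)
        converged = abs(rho_new - rho) < tol
        rho = rho_new
        if converged:
            break
    return alpha, beta, rho

# Simulated data with AR(1) errors (assumed values, for illustration only)
rng = np.random.default_rng(1)
n = 2000
x = rng.normal(size=n)
eps = rng.normal(size=n)
u = np.zeros(n)
for t in range(1, n):
    u[t] = 0.7 * u[t - 1] + eps[t]
y = 2.0 + 0.5 * x + u

alpha_hat, beta_hat, rho_hat = cochrane_orcutt(y, x)
print(alpha_hat, beta_hat, rho_hat)
```

The loop stops when successive values of ρ̂ differ by less than the tolerance, which corresponds to the convergence criterion described for Steps 2a and 2b.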
HETEROSCEDASTICITY:
Introduction:
If the assumption that Var (Ui) = σ² for all i is violated, we'll have Var (Ui) = σi², which can vary from observation to observation; this situation is referred to as heteroscedasticity. Examples:
(i) Qi = α + β Ki + γ Li + θ Ai + Ui
Q = wheat output, K = capital, L = labor, A = acreage.
In our sample we have farms of all sizes; Var (Ui) measures the size of variation in output due to random factors. We expect Var (Ui) to increase with the size of farm.
(ii) Yi = α + β Xi + Ui
Y = expenditure on snacks, X = income.
At low income the random fluctuation has low variance; this occurs mostly in cross-section data.
Yi = α + β Xi + Ui
Var (Ui) = σi², which varies across observation points. One reason can be that when the value of Xi is larger, there are more chances of larger unexpected variations in Yi, that is
Var (Ui) = σi² = f (Xi)
Example 1):
Yi = α + β Xi + Ui
Yi is food consumption, Xi is income, and the data are at household level. Households with higher income are expected to experience larger fluctuations in food consumption.
Example 2):
Yi = α + β Xi + Ui
Yi is wheat output, Xi is area under wheat crop, and Ui is the random fluctuation in wheat output. Larger farms are expected to experience larger fluctuations in output. There can be favorable and unfavorable effects of weather conditions on wheat output.
Obviously the heteroscedasticity problem is more likely to arise where there are larger variations in Xi. This is more likely to happen in cross-section data than in time series data. Heteroscedasticity is mainly a problem of cross-section data; it may arise in time series data if the data are observed at short intervals, like daily or weekly.
Consequences of heteroscedasticity:
OLS estimators remain linear and unbiased.
1). OLS estimators no longer have minimum variance in the class of linear unbiased estimators; they are no longer best.
2). The ordinary formula for calculating variances is no longer valid: Var (β̂) ≠ σ² / ∑x².
OLS estimators are not efficient; they have larger variances.
Testing of Heteroscedasticity:
1). Goldfeld-Quandt test.
2). Glejser test.
3). Rank correlation test.
4). White’s General test.
These are all well-known tests. Glejser is a weaker test and White’s General test is a modified form of it; the Goldfeld-Quandt test is a very powerful test.
Goldfeld-Quandt:
This test is powerful, but it cannot help in detecting the form of heteroscedasticity and gives no direction for its solution. Consider the regression equation;
Yi = α + β Xi + Ui
Steps:
1) Arrange the data in ascending order of Xi.
2) Omit the central 20% of observations (adjusted so as to get whole numbers). This will yield two sub-samples: 40% of observations with small Xi and 40% of observations with large Xi, so n1 = n2 = 0.4n.
3) Estimate a regression equation for each sub-sample and compute ∑℮1², ∑℮2² and hence;
σ̂1² = ∑℮1² / (n1 - k), σ̂2² = ∑℮2² / (n2 - k)
4) Compute the F-statistic.
F = σ̂1² / σ̂2² if σ̂1² > σ̂2²
F = σ̂2² / σ̂1² if σ̂2² > σ̂1²
The F test is applied at the 5% level of significance with degrees of freedom (df) equal to n1 - k and n2 - k. Our null and alternative hypotheses are as given below;
H0: σ1² = σ2² [no heteroscedasticity]
H1: σ1² ≠ σ2² [heteroscedasticity]
Notes:
1). The test is very powerful.
2). If there is more than one X variable then the test becomes quite complicated.
Foodi = α + β Incomei + γ Familyi + Ui
Or Yi = α + β Xi + γ Zi + Ui
3). The test does not indicate the form of heteroscedasticity (linear, quadratic or simultaneous).
���� � 64 -
ECONOMETRICS LECTURES OF DR. EATZAZ AHMED
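The four Goldfeld-Quandt steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (ols_rss, goldfeld_quandt) and the small data set are hypothetical and not part of the lectures, the model has a single X variable (k = 2), and the resulting F value still has to be compared with the tabulated critical value for the reported degrees of freedom.

```python
def ols_rss(x, y):
    """Fit y = a + b*x by OLS and return the residual sum of squares."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = y_bar - b * x_bar
    return sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))


def goldfeld_quandt(x, y, omit_share=0.2, k=2):
    """Return (F, df) where df = n_sub - k applies to both sub-samples."""
    pairs = sorted(zip(x, y))                       # step 1: order by X
    n = len(pairs)
    n_sub = (n - round(n * omit_share)) // 2        # step 2: drop the centre
    low, high = pairs[:n_sub], pairs[n - n_sub:]
    variances = []
    for part in (low, high):                        # step 3: two regressions
        xs = [p[0] for p in part]
        ys = [p[1] for p in part]
        variances.append(ols_rss(xs, ys) / (n_sub - k))
    f_stat = max(variances) / min(variances)        # step 4: larger on top
    return f_stat, n_sub - k


# Hypothetical data: the spread of the error grows in proportion to X.
x = [float(i) for i in range(1, 21)]
u = [0.5 * xi * (1 if i % 2 == 0 else -1) for i, xi in enumerate(x)]
y = [10 + 2 * xi + ui for xi, ui in zip(x, u)]
f_stat, df = goldfeld_quandt(x, y)
print(f_stat, df)
```

With these made-up data the error spread rises with X, so the variance of the large-X sub-sample ends up in the numerator.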
White’s General Test:
White’s General test, on the other hand, is instrumental in detecting the form of heteroscedasticity and thereby points towards possible solutions. Consider the regression equation
Yi = α + β Xi + γ Zi + Ui
Steps:
1). Estimate the equation by OLS and compute ei = Yi – Ŷi
2). Estimate by OLS the following equation
ei² = a0 + a1Xi + a2Zi + b1Xi² + b2Zi² + cXiZi + Vi --------- (2)
3). Compute the F-statistic or χ²
F = [Explained variation / (m – 1)] / [Unexplained variation / (n – m)]
χ² = n R²
R² is obtained from equation (2); the test asks whether it is significantly different from zero.
F is ~ F with (m – 1) and (n – m) degrees of freedom
χ² is ~ χ² with (m – 1) degrees of freedom
Here k is the number of parameters in the original equation and m is the number of parameters in equation (2):
m = 1 [intercept] + (k – 1) [linear] + (k – 1) [square] + (k – 1)(k – 2)/2 [simultaneous, i.e. cross-product]
= (2k – 1) + (k² – 3k + 2)/2
= (k² + k)/2
m = k (k + 1)/2
Hypothesis:
H0: ai = 0, bi = 0, ci = 0 for all i except the intercept.
H1: At least one parameter in H0 is ≠ 0.
Rejection of H0 indicates the presence of heteroscedasticity.
Notes:
1) If k is large then m will also be large and this will reduce the power of the test. Suppose k = 6 => m = 6*7/2 = 21.
If n = 50 then n – m = 29, which is too small; the test performs poorly. A partial solution for this problem is to omit the simultaneous (cross-product) terms; in this case we will have;
m = 1 + (k – 1) + (k – 1) + 0 = 2k – 1.
If k = 6 then m = 11.
2) The test also helps in determining the form of heteroscedasticity, which can be guessed by looking at the t-statistics of the estimated equation (2), for example;
ei² = a0 + a1 Xi + a2 Zi + b1 Xi² + b2 Zi² + c Xi Zi + Vi
(1.1) (0.95) (1.32) (1.15) (4.5) (0.99)
A term with a large t-statistic (4.5 here) indicates the likely form of heteroscedasticity.
3) The test is very general in application; it can indicate more than one form of heteroscedasticity.
Solutions of Heteroscedasticity:
Informal Solution:
In some contexts, we can re-specify our model to reduce the chances of heteroscedasticity.
Example1: Suppose we suspect that heteroscedasticity relates to K (capital) and we also expect that α + β + γ ≈ 1; then we can write,
This is a more stable model than the previous one; this equation is less likely to have heteroscedasticity.
Example2: Consider a quadratic expenditure system (QES):
Yi = α + β Xi + γ Xi² + Ui
Yi = food, Xi = income.
Suppose we expect
Var (Ui) = θ Xi², SD (Ui) = √θ Xi
Now divide the equation by Xi:
Yi/Xi = α (1/Xi) + β + γ Xi + Ui/Xi
Si = β + γ Xi + α (1/Xi) + Vi
where Si = Yi/Xi and Vi = Ui/Xi.
Var (Vi) = Var (Ui/Xi)
= (1/Xi²) Var (Ui)
= (1/Xi²) θ Xi²
= θ -------- no heteroscedasticity.
[Microeconomic theories are based on, or depend on, survey data.]
Formal Solution:
In the formal solution, we use the basic principles. Suppose
Yi = α + β Xi + γ Zi + Ui ---------- (i)
Var (Ui) = σi² ----------------- (ii)
Transform equation (i) in the light of equation (ii), so that the transformed error term is homoscedastic. Thus divide equation (i) by σi:
Yi/σi = α (1/σi) + β (Xi/σi) + γ (Zi/σi) + Ui/σi
Or
Yi* = α (1/σi) + β Xi* + γ Zi* + Ui* -------- (iii)
Now
Var (Ui*) = Var (Ui/σi)
= (1/σi²) Var (Ui)
= (1/σi²) σi²
= 1 ----------- no heteroscedasticity.
Equation (iii) can be estimated only when σi is known, thus we have to apply a two-step procedure.
Step1:
Apply OLS to equation (i) and compute the series of ei. Then, if we follow White’s test, we estimate the equation
ei² = a0 + a1Xi + a2Zi + b1Xi² + b2Zi² + c XiZi + error ----------- (iv)
Apply White’s test. If H0 is accepted then heteroscedasticity is not present and step one completes the estimation; if H0 is rejected then heteroscedasticity is present and we move to step two.
Step2:
From the estimated equation (iv), compute the estimated (fitted) value of the dependent variable ei². Set σ̂i² equal to this fitted value, and hence σ̂i = √σ̂i². Now replace σi by σ̂i in equation (iii) and apply OLS.
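The two-step procedure can be sketched as follows for the simplest case of one X variable. Several things here are assumptions made for illustration only: the function names are hypothetical, the auxiliary equation uses just an intercept and Xi² instead of the full White specification, and fitted variances are floored at a small positive number so that the square root exists. Dividing the equation by σ̂i and applying OLS is written in its equivalent weighted form with weights 1/σ̂i².

```python
def ols_simple(x, y):
    """OLS for y = a + b*x; returns (a, b)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx
    return y_bar - b * x_bar, b


def wls_simple(x, y, sigma):
    """OLS on the equation divided by sigma_i (weights w_i = 1/sigma_i**2)."""
    w = [1.0 / s ** 2 for s in sigma]
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    return y_bar - b * x_bar, b


def two_step(x, y, floor=1e-8):
    """Step 1: OLS and auxiliary regression; step 2: re-estimate with sigma-hat."""
    a, b = ols_simple(x, y)
    e2 = [(yi - a - b * xi) ** 2 for xi, yi in zip(x, y)]
    x2 = [xi ** 2 for xi in x]
    c0, c1 = ols_simple(x2, e2)            # assumed form: e^2 = c0 + c1*X^2
    sigma_hat = [max(c0 + c1 * v, floor) ** 0.5 for v in x2]
    return wls_simple(x, y, sigma_hat)


# Hypothetical data with error spread proportional to X.
x = [float(i) for i in range(1, 13)]
y = [2 + 3 * xi + 0.4 * xi * (1 if i % 2 == 0 else -1) for i, xi in enumerate(x)]
print(two_step(x, y))
```

In practice step two is only needed when White’s test rejects H0; the sketch skips the test and always re-weights.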
REGRESSION ANALYSIS WITH QUALITATIVE DATA:
We confine ourselves to the cases where qualitative variables appear on the right-hand side of the equation only, e.g. gender, ethnicity, residence etc. Suppose we consider the effect of gender on income.
Y = α + β D + U ------------- equation (i)
Where D is a “dummy” or binary variable in equation (i) indicating gender
D = 0 for male
D = 1 for female
(0 and 1 are the most convenient values)
From equation (i), if we assume as usual that E (U) = 0 and D is fixed, we can infer the following,
E (Y) = α + β D [Mean income depends on gender]
E (YM) = α [if D = 0, as male]
E (YF) = α + β [if D = 1, as female]
E (YF) – E (YM) = (α + β) – α = β [difference between male and female income]
Let us redo by defining two dummies:
D1 = 0 for male, D1 = 1 for female
D2 = 0 for female, D2 = 1 for male
We can write the model in three different forms (ways).
Y = α0 + α1D1 + U ------- (i)
Y = β0 + β1D2 + V ------- (ii)
Y = γ1D1 + γ2D2 + W ------- (iii)
If we include dummies for all categories of a qualitative variable and also include the intercept, we fall into the “dummy variable trap”: this creates perfect collinearity and estimation breaks down.
E (Y) = α0 + α1D1 = β0 + β1D2 = γ1D1 + γ2D2
E (YM) = α0 = β0 + β1 = γ2 [if D1 = 0, D2 = 1, as male]
E (YF) = α0 + α1 = β0 = γ1 [if D1 = 1, D2 = 0, as female]
E (YF) – E (YM) = α1 = – β1 = γ1 – γ2 [difference]
The specification in equation (iii) is less convenient, but essentially there is no difference in results between the three forms (in equation (i) the base category is male). The same approach applies to, e.g., the relationship of education and literacy with income.
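The equivalence of the three dummy specifications can be checked numerically. The sketch below uses four hypothetical income figures (not from the lectures) and a hypothetical helper ols_simple; it shows that the intercept of form (i) is the male mean, the intercept of form (ii) is the female mean, and the two slopes are equal with opposite signs. In form (iii) the coefficients are simply the two group means.

```python
def ols_simple(x, y):
    """OLS for y = a + b*x; returns (a, b)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx
    return y_bar - b * x_bar, b


# Hypothetical incomes: two males followed by two females.
income = [10.0, 12.0, 7.0, 9.0]
d1 = [0, 0, 1, 1]                 # D1 = 1 for female
d2 = [1 - d for d in d1]          # D2 = 1 for male

alpha0, alpha1 = ols_simple(d1, income)   # form (i), base category male
beta0, beta1 = ols_simple(d2, income)     # form (ii), base category female

# Form (iii) without intercept: coefficients are the group means.
gamma1 = sum(y for y, d in zip(income, d1) if d == 1) / sum(d1)   # female mean
gamma2 = sum(y for y, d in zip(income, d2) if d == 1) / sum(d2)   # male mean

print(alpha0, alpha1, beta0, beta1, gamma1, gamma2)
```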
Dummies for more than two categories:
Categories of education qualification (banking side worker)
Illiterate: 0-4 years of education
Primary: 5-9 years of education
Secondary: 10-11 years of education
Senior secondary: 12-13 years of education
Higher or above: 14 or above years of education
Define the following dummy variables; [Base category is Illiterate]
D2 = 1 if primary, = 0 otherwise.
D3 = 1 if secondary, = 0 otherwise.
D4 = 1 if senior secondary, = 0 otherwise.
D5 = 1 if higher, = 0 otherwise.
The regression model is;
Y = β1 + β2D2 + β3D3 + β4D4 + β5D5 + U
We can see that
E (Y) = β1 + β2D2 + β3D3 + β4D4 + β5D5
E (YI) = β1 [if D2 = D3 = D4 = D5 = 0, as Illiterate]
E (YP) = β1 + β2 [if D2 = 1, as Primary]
E (YS) = β1 + β3 [if D3 = 1, as Secondary]
E (YSS) = β1 + β4 [if D4 = 1, as Senior Secondary]
E (YH) = β1 + β5 [if D5 = 1, as Higher]
[Theoretically we expect] β5 > β4 > β3 > β2 > 0
[General expectation] β1 > 0
Example: Now consider the effects of gender and education on income.
Gender: G = 1 if female, = 0 otherwise
Education: E2 = 1 if secondary, = 0 otherwise
E3 = 1 if higher, = 0 otherwise
The model can be constructed as follows;
Y = α + β G + U -------------------- (1)
Now we propose that α is the expected income of a male and that it depends upon the level of education; β, the female-male difference, is allowed to depend on education in the same way:
α = α1 + α2E2 + α3E3 ---------------------- (2a)
β = β1 + β2E2 + β3E3 ---------------------- (2b)
Now substitute (2a) and (2b) into (1); then,
Y = α1 + α2E2 + α3E3 + [β1 + β2E2 + β3E3] G + U
Or
Y = α1 + α2E2 + α3E3 + β1G + β2 (E2G) + β3 (E3G) + U
Categories: Expected income E (Y)
G = 0
Male, primary: E (YP) = α1
Male, secondary: E (YS) = α1 + α2
Male, higher: E (YH) = α1 + α3
G = 1
Female, primary: E (YP) = α1 + β1
Female, secondary: E (YS) = α1 + α2 + β1 + β2
Female, higher: E (YH) = α1 + α3 + β1 + β3
Combining Qualitative and Quantitative Variables:
Suppose income depends upon experience and education. Experience is measured as a quantitative variable (the years of experience); education has three categories;
1). M.Sc. or equivalent 2). M.Phil or equivalent 3). PhD or equivalent
Defining dummies;
D2 = 1 if M.Phil, = 0 otherwise
D3 = 1 if PhD, = 0 otherwise
The model can be constructed as follows, where E is the years of experience,
Y = α + β E + U -------------------- (1)
α = α1 + α2D2 + α3D3 ---------------------- (2a)
β = β1 + β2D2 + β3D3 ---------------------- (2b)
Substitute (2a) and (2b) into (1); then,
Y = α1 + α2D2 + α3D3 + [β1 + β2D2 + β3D3] E + U
[Mean Income] E (Y) = α1 + α2D2 + α3D3 + [β1 + β2D2 + β3D3] E
E (Y M.Sc.) = α1 + β1E [Mean income at M.Sc. level]
E (Y M.Phil) = (α1 + α2) + (β1 + β2) E [Mean income at M.Phil level]
E (Y PhD) = (α1 + α3) + (β1 + β3) E [Mean income at PhD level]
We expect that
α3 > α2 > 0, α1 > 0
β3 > β2 > 0, β1 > 0
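The way the dummies shift both the intercept and the slope of the experience-income line can be illustrated with a short function. The coefficient values below are purely hypothetical, chosen only so that α3 > α2 > 0 and β3 > β2 > 0 as expected above; the function name mean_income is likewise an assumption.

```python
def mean_income(experience, d2, d3, alpha, beta):
    """E(Y) = a1 + a2*D2 + a3*D3 + (b1 + b2*D2 + b3*D3) * E."""
    a1, a2, a3 = alpha
    b1, b2, b3 = beta
    intercept = a1 + a2 * d2 + a3 * d3     # level shift by education
    slope = b1 + b2 * d2 + b3 * d3         # return to experience by education
    return intercept + slope * experience


# Hypothetical coefficients (thousand rupees), not estimates from any data.
alpha = (20.0, 5.0, 12.0)
beta = (1.0, 0.5, 1.5)

years = 10
income_msc = mean_income(years, 0, 0, alpha, beta)     # base category
income_mphil = mean_income(years, 1, 0, alpha, beta)
income_phd = mean_income(years, 0, 1, alpha, beta)
print(income_msc, income_mphil, income_phd)
```

Setting both dummies to zero gives the base (M.Sc.) line; each dummy then adds its own intercept and slope differential.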
STOCHASTIC / RANDOM REGRESSORS:
Suppose the assumption that the X variables are exogenous is not true. This situation is called the case of stochastic / random regressors. In a typical equation we have,
Y = α + β X + U
X is given, U is random and the model is complete. If X is not given, then in the model
Y = α + β X + U
U is random and the model is not complete (information is not complete).
Example1: Macro level consumption function.
Ct = α + β Yt + Ut
Yt = Ct + Zt
[Y itself depends upon C; it is a case of simultaneous equations.]
Example2: Population is growing exponentially. [Nt = A Ut]
Log Nt = α + β Log Nt-1 + Ut
Log Nt-1 is not exactly given; it follows the path,
Log Nt-1 = α + β Log Nt-2 + Ut-1
[Nt-1 is also evolving from the previous population Nt-2 and so on.]
Example3: Market demand function
Q = α + β P + U
P is not given; in fact both P and Q are determined by the intersection of supply and demand.
Example4: We have the following relationship for a sample of children aged 0 to 18.
Weight = α + β Food + U
In cross section data on 100 children, on average the more you eat the more you weigh; but food is not independent, it also depends upon weight.
Example5: Investment and Saving Model (IS-Model).
IS: Y = α + β R + γ G + U
R = interest rate, G = government expenditures
G is given while R and Y are not given; the government can change G according to its needs.
Example6:
Weight = α + β Age + U
Age is given and the information is complete; in practice there are more factors that, like age, are given.
Consequences of stochastic/ random Regressors problem:
Consider
Y = α + β X + U
U satisfies all standard assumptions. X is not fixed; it is random.
1). The OLS estimator of β is:
β̂ = ∑xy / ∑x² = ∑xY / ∑x²
= (x1/∑x²) Y1 + (x2/∑x²) Y2 + (x3/∑x²) Y3 + ……… + (xn/∑x²) Yn
= a1Y1 + a2Y2 + a3Y3 + ……… + anYn
Now the X variables are not fixed, therefore we cannot treat a1, a2, a3, …, an as constants. In this case a1, a2, a3, …, an are themselves random variables, so β̂ is not a linear function of Y, and the linearity property of the OLS estimator is violated.
2). Consider E (β̂):
β̂ = β + ∑xU / ∑x²
= β + (x1/∑x²) U1 + (x2/∑x²) U2 + (x3/∑x²) U3 + ……… + (xn/∑x²) Un
Apply expectation:
E (β̂) = β + E [(x1/∑x²) U1] + E [(x2/∑x²) U2] + E [(x3/∑x²) U3] + ……… + E [(xn/∑x²) Un]
E (β̂) ≠ β + (x1/∑x²) E (U1) + (x2/∑x²) E (U2) + (x3/∑x²) E (U3) + ……… + (xn/∑x²) E (Un)
because x cannot be factored out of the expectation when X and U are correlated with each other; so β̂ is biased.
3). It can be shown that if x is random and uncorrelated with the current error u, but not independent of all the errors (as with the lagged dependent variable below), then the OLS estimator is biased, but with an increase in sample size the bias approaches zero. [If x is fully independent of u, OLS remains unbiased.]
Examples:
1). Weight = α + β F + U: U → W → F (F and U are correlated)
2). Q = α + β P + U: U → Q → P (P and U are correlated)
3). Y = α + β R + γ G + U: U → Y → R (R and U are correlated)
A good example of case 3) is;
Yt = α + β Yt-1 + Ut
Since Yt-1 = α + β Yt-2 + Ut-1 depends on the random variable Ut-1, Yt-1 is random. Ut is uncorrelated with Yt-1: today’s shock does not change yesterday’s event. However, Yt-1 is correlated with the earlier errors Ut-1, Ut-2, …, so OLS is biased in small samples; as the sample size tends to infinity, or is reasonably large, the bias will be negligible.
4). It can be shown that if x is random and x is not independent of U then the OLS estimator is biased and the amount of bias does not diminish with an increase in sample size.
5). Consider a special case of 4).
Yt = α + β Yt-1 + Ut [Current CPI depends upon previous CPI]
Ut = ρ Ut-1 + εt
Now Ut-1 affects both Ut and Yt-1, so Ut and Yt-1 are correlated.
Now the OLS estimator becomes biased. [Recall that the DW statistic also becomes biased; now we know the reason.] Autocorrelation and a lagged dependent variable together create a more serious problem.
Solution / Estimation Procedure:
Consider the model
Y = α + β X + U
Cov (X, U) ≠ 0 [X is not given; it is random and correlated with U, as in case 4]
Now we define an instrumental variable, say Z, as a variable that satisfies two conditions;
1). Z and X are closely correlated within the given sample.
2). Cov (Z, U) = 0 in the population. [At first sight this seems impossible.]
X is correlated with U, but Z is not correlated with U.
Example:
Food = α + β Weight + U
X = weight, Z = age
Weight is affected by U (e.g. sickness) and by age; age does not depend upon U.
Example:
C = α + β Y + U
Y = C + Z
Z = exogenous [U → C → Y]
Y is correlated with U, but Z is not. With time series data, Yt-1 is also a good instrument to use in this example.
One important method of estimation is the Two Stage Least Squares (2SLS) method.
Model:
Y = α + β X + U ---------------------- (i)
Cov (X, U) ≠ 0
Z is a valid instrument.
Stage 1:
Regress X on Z
X = a + b Z + V
Apply OLS and obtain the estimates â and b̂, and hence X̂ = â + b̂ Z.
X̂ is such that it contains only those variations in X which are determined by Z (basically we are filtering out the problem). In other words we have
X = X̂ + V̂
V̂: not explained by Z, “troublesome”.
X̂: explained by Z, roughly “trouble free”.
We can say that X is endogenous and therefore troublesome; stage 1 de-endogenizes X.
Stage 2:
Rewrite the main equation (i) as follows;
Y = α + β (X̂ + V̂) + U
Or
Y = α + β X̂ + U*
Where U* = U + β V̂ [V̂ is the error in the X variable]
Now apply OLS.
It can be shown that the estimators so obtained are;
1). Not linear
2). Biased
3). Asymptotically unbiased. [As the sample size increases the bias diminishes towards zero.]
These estimators are called 2SLS estimators as well as Instrumental Variables Least Squares [IVLS] estimators.
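A small simulation can illustrate why OLS fails and 2SLS helps in the consumption function example (C = α + βY + U, Y = C + Z). Everything numerical here is an assumption made for illustration: the parameter values, the normal distributions, the sample size and the function names are not from the lectures. With the chosen values the OLS slope should settle well above the true β, while the 2SLS slope should stay close to it.

```python
import random


def mean(v):
    return sum(v) / len(v)


def slope(x, y):
    """OLS slope of y on x (with intercept)."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx


def two_stage_slope(z, x, y):
    """Stage 1: regress X on Z and form X-hat; stage 2: regress Y on X-hat."""
    b = slope(z, x)
    a = mean(x) - b * mean(z)
    x_hat = [a + b * zi for zi in z]
    return slope(x_hat, y)


random.seed(0)
alpha, beta, n = 10.0, 0.6, 5000          # assumed true parameters
z = [random.gauss(5.0, 1.0) for _ in range(n)]   # exogenous spending
u = [random.gauss(0.0, 1.0) for _ in range(n)]   # consumption shock
# Solving the two equations together: Y = (alpha + Z + U) / (1 - beta)
income = [(alpha + zi + ui) / (1 - beta) for zi, ui in zip(z, u)]
consumption = [alpha + beta * yi + ui for yi, ui in zip(income, u)]

b_ols = slope(income, consumption)
b_2sls = two_stage_slope(z, income, consumption)
print(b_ols, b_2sls)
```

Because income contains the shock U, the OLS slope absorbs part of it; Z moves income without moving U, which is what the first stage exploits.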
Simultaneous Equations Estimation:
Consider any two equations. Let
Q = α + β P + γ Y + U [Demand]
Q = a + b P + c W + d R + V [Supply]
P and Q are both endogenous variables. They are determined at the same time, so the two equations have to be solved together rather than one after the other; this is referred to as the simultaneous equations case.
Econometrics Practice Questions Sir Eatzaz Ahmed
Q.1: Are the following statements true, false or uncertain? Explain your answer.
a) Instrumental variables are used when data on some variables are not available.
Answer: False statement. Instrumental variables are not used for missing data; they are used in place of endogenous explanatory variables when there is an endogeneity problem.
b) OLS estimators for the parameters of simultaneous equation are inconsistent when the equations are under-identified. However, the estimators become consistent if the equations are identified.
Answer: We have not learned the simultaneous equations topic.
c) Cochrane-Orcutt iterative procedure is a test for autocorrelation.
Answer: False statement. The Cochrane-Orcutt iterative procedure is not a test for autocorrelation; it is a solution for autocorrelation.
d) Multicollinearity problem arises only when there are many equations in the model.
Answer: False statement. The multicollinearity problem arises when there are several correlated explanatory variables in the model, not when there are many equations.
e) A major limitation of DW test is that it is a powerful test.
Answer: False statement. Being powerful is not a limitation of the DW test; it is its strength.
f) Following is a set of simultaneous equations
Z = α + βY + U
Y = C + I + G + X – M
Answer: False statement, because Z and Y are not simultaneously determined: Y is given by the second equation and does not depend on Z. The equations would be simultaneous only if Z and Y were determined at the same time.
g) The inconclusive range in the DW test is the result of type-2 error.
Answer: False statement. A type-2 error is acceptance of H0 when it is false, whereas the inconclusive range means the test is unable to give a definite result.
h) Goldfeld-Quandt test is a powerful method of estimating an equation in the presence of heteroscedasticity.
Answer: False statement. Goldfeld-Quandt is a test, not an estimation method or a solution for heteroscedasticity.
i) In the regression equation Yt = α + βXt + δ2 εt-2 + δ1 εt-1 + εt, where εt is a random error term, multicollinearity can occur if there is strong correlation between εt-1 and εt-2.
Answer: True statement, because the lagged error terms enter the regression as explanatory variables, and strong correlation between them means multicollinearity is present.
Q.2: Critically evaluate the following statements. Give details to justify your answer.
a) Hybrid equations are used in order to remove both autocorrelation and multicollinearity from an equation.
Answer: We have not learned the hybrid equation topic.
b) In the presence of multicollinearity, the OLS estimator is linear and unbiased and its variance is smaller than the variance of any other linear and unbiased estimator.
Answer: True. In the presence of (imperfect) multicollinearity the OLS estimator is still linear and unbiased and has the smallest variance among linear unbiased estimators, although that variance may be large.
c) In the presence of autocorrelation OLS estimators of regression parameters are likely to have large sampling error and, therefore, they are unbiased.
Answer: Partly true. In the presence of autocorrelation the OLS estimators are likely to have large sampling error, and they remain unbiased (provided there is no lagged dependent variable). However, unbiasedness does not follow from the large sampling error, so the “therefore” is not justified.
d) The estimators based on Cochrane-Orcutt iterative method are linear and unbiased with minimum variance.
Q.3: Using cross section data on 500 households you are to study the effects of income, rural-urban residence and education level of the household on household savings. The information on education is classified as no education, school level education and higher education. Formulate an appropriate regression equation.
Answer:
Saving = f [Income (Y), Residence (R), Education (E)]
S = α + β Y + U. ------- (1)
Education dummies:
E2 = 1 if school level, = 0 otherwise.
E3 = 1 if higher level, = 0 otherwise.
Residential dummy:
R = 1 if rural, = 0 otherwise.
α = α0 + α1 R + α2 E2 + α3 E3 ------------------- (2a)
β = β0 + β1 R + β2 E2 + β3 E3 ------------------- (2b)
Substitute (2a) and (2b) in (1)
S = α0 + α1 R + α2 E2 + α3 E3 + (β0 + β1 R + β2 E2 + β3 E3) Y + U.
Or
S = α0 + α1 R + α2 E2 + α3 E3 + β0 Y + β1 (RY) + β2 (E2 Y) + β3 (E3 Y) + U.
Q.4: Carefully explain steps to apply White’s heteroscedasticity test for the Q.3 equation.
Answer:
S = α0 + α1 R + α2 E2 + α3 E3 + β0 Y + β1 (RY) + β2 (E2 Y) + β3 (E3 Y) + U.
Apply White’s General Test.
Steps:
(1) Estimate the equation by OLS and compute ei = S – Ŝ
(2) Estimate by OLS the following equation.
ei² = a0 + a1 R + a2 E2 + a3 E3 + a4 Y + b1 R² + b2 E2² + b3 E3² + b4 Y² + c1 RY + c2 E2Y + c3 E3Y + V.
This ei² equation allows for all the forms of heteroscedasticity.
(3) Compute the F-statistic or χ²
F = [Explained variation / (m – 1)] / [Unexplained variation / (n – m)]
χ² = n R²
[R² is obtained from the ei² equation.]
Where, without cross-product terms,
m = 2k – 1 = 2*8 – 1 = 15 [k = 8, the number of parameters in the Q.3 equation]
n = 500, n – m = 500 – 15 = 485.
With cross-product terms, m = k (k + 1)/2.
Test the Hypothesis:
H0: ai = 0, bi = 0, ci = 0 for all i except the intercept.
H1: At least one parameter in H0 is ≠ 0.
Rejection of H0 indicates the presence of heteroscedasticity and acceptance of H0 indicates no heteroscedasticity.
Q.5: Consider
the following estimated regression equation based on a sample of 26 firms in a manufacturing industry of Pakistan, where MPL and L denote the marginal product of labor and the number of labor units respectively. The values in parentheses are the computed t-values.
MPL = 100 + 0.012 (1/L), R² = 0.4
(40.0) (0.03)
a) Explain and interpret all the results.
b) Test the proposition that the marginal product of labor is diminishing.
c) Suppose your sample includes 16 firms in the private sector and 10 in the public sector. How would you modify the regression equation in order to allow for the possibility that the marginal product of labor diminishes faster in the private sector than in the public sector? How would you carry out the test?
Q.6: Consider the following demand equation, where M is the quantity of money, Y is real output, P is the general price level, W is financial wealth and t denotes the time period.
Log (Mt) = α + β log (Yt) + δ log (Pt) + Ф Rt + λ log (Wt) + Ө log (Mt-1) + Ut
a) Explain the meaning of each parameter.
Answer:
Meaning of parameters
α = intercept: the autonomous (subsistence) component of log money demand.
β = ∂ Log (Mt) / ∂ Log (Yt) = (% change in money demand) / (% change in output): output elasticity of money demand.
δ = ∂ Log (Mt) / ∂ Log (Pt) = (% change in money demand) / (% change in price level): price elasticity of money demand.
Ф = ∂ Log (Mt) / ∂ Rt = (proportional change in money demand) / (change in interest rate): interest rate semi-elasticity of money demand.
λ = ∂ Log (Mt) / ∂ Log (Wt) = (% change in money demand) / (% change in financial wealth): financial wealth elasticity of money demand.
Ө = ∂ Log (Mt) / ∂ Log (Mt-1) = (% change in money demand) / (% change in previous-period money demand): elasticity with respect to previous money demand.
b) How would you test the following propositions, one at a time?
Answer: Testing of propositions
i. Money demand does not depend on financial wealth
Answer:
H0: λ = 0
H1: λ ≠ 0
ii. The output elasticity of money demand is greater than the price elasticity
Answer:
H0: β – δ = 0
H1: β – δ > 0
iii. Money demand depends on nominal income PY only
Answer:
H0: β = δ, λ = 0, Ө = 0, Ф = 0
H1: at least one of these restrictions does not hold
Under H0:
Log (Mt) = α + β log (Yt) + δ log (Pt)
Log (Mt) = α + β [log (Yt) + log (Pt)]
Log (Mt) = α + β log (Yt Pt)
Q.7: In the regression equation Yi = a + bXi + c Xi-1 + Ui, Durbin-Watson test is not appropriate to detect first order autocorrelation. Do you agree? If yes, which test is suitable in this case?
Answer:
Do not agree, because there is no lagged dependent variable; there is only a lagged independent variable on the right-hand side. It is when the lagged dependent variable appears on the right-hand side that DW is not appropriate to detect first order autocorrelation, and Durbin’s h-test is suitable in that case.
Q.8: Can White’s general test detect all types of autocorrelation in a random variable?
Answer: No. White’s General test is for the heteroscedasticity problem, not for detection of the autocorrelation problem.
Q.9: In the presence of heteroscedasticity a regression equation can be estimated by Goldfeld-Quandt test. Do you agree?
Answer: No. We cannot estimate the regression equation by Goldfeld-Quandt, because it is a test, not an estimation technique; estimation techniques are OLS and many others.
Q.10: While estimating the regression equation Yi = a + bXi + Ui, multicollinearity problem is more likely to occur in cross section data than in time series data. Do you agree?
Answer:
No, because multicollinearity is a problem of correlation among explanatory (X) variables, which is more likely to occur in time series data, and there is only one explanatory variable in the given regression equation.
Q.11: Interpret the following regression equation as an economist. C, Y and W are per capita consumption, income and wealth respectively, all in thousand rupees. Numbers in parentheses are the t-values.
Ct = 1.17 + 0.45Yt + 0.55Wt, R² = 0.9643, DW = 0.09.
(2.13) (6.17) (1.43)
Answer: Interpretation:
The results show that 96.43% of the variation in consumption expenditure is explained by our model. The intercept is positive and significantly different from zero (t = 2.13); its magnitude shows that the subsistence or autonomous consumption expenditure is Rs. 1.17 thousand per capita per year. The marginal propensity to consume (MPC) out of income is significantly different from zero (t = 6.17) and less than one; its estimated value shows that 0.45, or 45%, of each incremental rupee of income is consumed. The estimated coefficient of wealth shows that 0.55 of each incremental rupee of wealth is consumed, but with t = 1.43 this coefficient is not significantly different from zero at the 5% level. DW = 0.09 is far below 2, which signals strong positive autocorrelation, so the t-values and R² should be read with caution.
Q.12: Suppose in the equation Yt = a + bXt + Ut, the stochastic variables Xt and Ut are correlated with each other.
a) Does this imply that we have problems of autocorrelation and/or multicollinearity and/or heteroscedasticity?
Answer: No. It is not a problem of autocorrelation, multicollinearity or heteroscedasticity; it is an endogeneity problem.
b) Can in this case the equation be estimated by White’s general test or Durbin-Watson test or Durbin’s h-test?
Answer: No, because White’s general test, the Durbin-Watson test and Durbin’s h-test are tests, not estimation methods for the given equation.
Q.13: Suppose you have estimated two alternative cost functions for wheat using data on 500 farms. The cost (C) is measured in thousands of rupees while output (Q) is measured in tons. The regression results are given below. The values in parentheses are standard errors.
Log (C) = –10.48 + 1.12 log (Q)
(2.56) (0.16)
C/Q = 4568 + 0.2284 Q + 4.84 Q⁻¹
(1141) (0.5521) (4.40)
Can you test the null hypothesis that the marginal cost is an increasing function of output for each equation? If yes, apply the test and draw your conclusion. If not, explain why the test cannot be applied and what additional information, if any, is required to perform the test.
Answer:
Equation 1:
Log (C) = –10.48 + 1.12 log (Q)
(2.56) (0.16)
Significance of the slope:
H0: β = 0
H1: β ≠ 0
Apply test: t = 1.12 / 0.16 = 7.00
Degrees of freedom = (n – k) = (500 – 2) = 498. Critical value = 1.96. Level of significance = 5%. Two tail test.
We reject H0.
Increasing marginal cost: since C = A Q^β, the marginal cost A β Q^(β–1) increases with output only if β > 1.
H0: β = 1
H1: β > 1
Apply test: t = (1.12 – 1) / 0.16 = 0.75
Degrees of freedom = (n – k) = (500 – 2) = 498. Critical value = 1.645. Level of significance = 5%. Right tail test.
We accept H0.
Equation 2:
C/Q = 4568 + 0.2284 Q + 4.84 Q⁻¹
(1141) (0.5521) (4.40)
Intercept:
H0: α = 0
H1: α ≠ 0
Apply test: t = 4568 / 1141 = 4.00
Degrees of freedom = (n – k) = (500 – 3) = 497. Critical value = 1.96. Level of significance = 5%. Two tail test.
We reject H0.
Increasing marginal cost: since C = α Q + β Q² + γ, the marginal cost α + 2βQ increases with output only if β > 0.
H0: β = 0
H1: β > 0
Apply test: t = 0.2284 / 0.5521 = 0.41
Degrees of freedom = (n – k) = (500 – 3) = 497. Critical value = 1.645. Level of significance = 5%. Right tail test.
We accept H0.
Coefficient of Q⁻¹:
H0: γ = 0
H1: γ ≠ 0
Apply test: t = 4.84 / 4.40 = 1.10
Degrees of freedom = (n – k) = (500 – 3) = 497. Critical value = 1.96. Level of significance = 5%. Two tail test.
We accept H0.
Conclusion:
The standard errors are sufficient to carry out these tests for each equation. To judge the overall performance of each equation we would also need R², which is not given. The results show that in neither equation can we conclude that marginal cost is an increasing function of output.
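Each test in Q.13 rests on the ratio t = (estimate – hypothesized value) / standard error, using the coefficients and standard errors reported in the question. The short sketch below computes these ratios; the function name t_ratio is hypothetical, and the one-tail critical value 1.645 is the large-sample 5% normal value, assumed appropriate here because the degrees of freedom are close to 500.

```python
def t_ratio(estimate, std_error, hypothesized=0.0):
    """t statistic for H0: parameter = hypothesized."""
    return (estimate - hypothesized) / std_error


# Log-log cost function: slope 1.12 with standard error 0.16.
t_slope_zero = t_ratio(1.12, 0.16)                     # H0: beta = 0
t_slope_one = t_ratio(1.12, 0.16, hypothesized=1.0)    # H0: beta = 1

# Average cost function: coefficient of Q is 0.2284 with standard error 0.5521.
t_q = t_ratio(0.2284, 0.5521)                          # H0: beta = 0

critical_one_tail = 1.645   # assumed 5% right-tail value for large samples
mc_increasing_eq1 = t_slope_one > critical_one_tail
mc_increasing_eq2 = t_q > critical_one_tail
print(t_slope_zero, t_slope_one, t_q, mc_increasing_eq1, mc_increasing_eq2)
```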
Q.14: Suppose you want to study the propositions:
i. Loan recovery rate varies considerably across private commercial banks, publicly owned commercial banks and development finance institutions,
ii. The loan recovery rate declines with the size of loan.
Formulate an appropriate econometric equation, giving special attention to construction of variables and the type of data to be used for estimation.
Answer:
Loan Recovery = f [size of loan (Z), nature of bank]
LR = α + β Z + U. ------- (1)
Dummies of banks:
D2 = 1 if publicly owned commercial bank, = 0 otherwise.
D3 = 1 if development finance institution, = 0 otherwise.
α = α1 + α2 D2 + α3 D3 ------------------- (2a)
β = β1 + β2 D2 + β3 D3 ------------------- (2b)
Substitute (2a) and (2b) in (1)
LR = α1 + α2 D2 + α3 D3 + (β1 + β2 D2 + β3 D3) Z + U.
E (LR) = α1 + α2 D2 + α3 D3 + β1Z + β2 (D2Z) + β3 (D3Z)
E (LR, private commercial banks) = α1 + β1Z.
E (LR, publicly owned commercial banks) = (α1 + α2) + (β1 + β2) Z.
E (LR, development finance institutions) = (α1 + α3) + (β1 + β3) Z.
Q.15: Consider the following regression equation estimated by OLS using time series data for 24 years, where E is the official exchange rate (rupees per US dollar), P is the domestic price level (CPI), П is the world price level, and T is the trade deficit as a percentage of GDP. The numbers in parentheses are the computed t-values.
Log (Et) = 0.12 + 0.58 Log (Pt) – 0.23 Log (Пt) + 0.0021 Tt + 0.92 Log (Et-1)
(2.23) (3.21) (-1.85) (2.43) (23.00)
R² = 0.9958 DW = 1.82
a) Interpret all the results other than R² and DW statistic.
b) What could be the reason for including the lagged exchange rate in the equation?
c) Are there any serious econometric problems apparent from the results?
d) How would you re-estimate the equation in the light of these problems? If there are two or more problems, consider each one at a time.
Q.16: Consider the following model of IS-LM equilibrium:
IS: Y = α + β R + δ Z + U
LM: M = Ф + λ R + Ө Y + π W + V
Where Y is aggregate expenditure, R is the interest rate, Z is the exogenous component of aggregate expenditure, M is the quantity of money and W is financial wealth. Suppose the State Bank of Pakistan (SBP) pegs money supply at predetermined levels (that is, the quantity of money is exogenous) and lets the interest rate be determined in the market (the interest rate is endogenous).
Q.17: a)
Why do we include lagged variables in a regression equation? Illustrate using an example from economics. Answer: We use the lagged dependent variable to capture the inertia (sluggish ness) of the equation, for example current consumption depends up on previous consumption. b) Explain the use of Instrumental Variable Least Squares method using your example. Answer:
Model:
Y = α + β X + U ------------ (i)
Cov (X, U) ≠ 0, and Z is a valid instrument.

Stage 1: Regress X on Z.
X = a + b Z + V
Apply OLS and obtain the estimates â and b̂, and hence X̂ = â + b̂ Z.
X̂ is such that it contains only those variations in X which are determined by Z (basically we are filtering out the problem). In other words we have
X = X̂ + (X – X̂)
X̂: explained by Z, "trouble free, roughly".
(X – X̂): not explained by Z, "troublesome".
We can say that X is endogenous and therefore troublesome; X̂ de-endogenizes X.

Stage 2: Rewrite the main equation (i) as follows:
Y = α + β (X̂ + X – X̂) + U
Y = α + β X̂ + U + β (X – X̂), where β (X – X̂) is the error in the X variable.
Or
Y = α + β X̂ + W, where W = U + β (X – X̂).
Now apply OLS to this equation.

It can be shown that the estimators so obtained are:
1). Not linear.
2). Biased.
3). Asymptotically unbiased [as the sample size increases, the bias diminishes towards zero].
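The two stages described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data are simulated, the parameter values (alpha = 1, beta = 2) and the helper names `ols` and `two_stage_least_squares` are hypothetical, and the error U is deliberately built into X so that Cov (X, U) ≠ 0.

```python
import random


def ols(x, y):
    """Simple OLS of y on x with an intercept; returns (intercept, slope)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def two_stage_least_squares(y, x, z):
    """Stage 1: regress X on Z and form X-hat. Stage 2: regress Y on X-hat."""
    a_hat, b_hat = ols(z, x)
    x_hat = [a_hat + b_hat * zi for zi in z]
    return ols(x_hat, y)


# Hypothetical data: U enters X, so Cov(X, U) != 0, while Z is independent of U.
random.seed(0)
alpha, beta, n = 1.0, 2.0, 20000
z = [random.gauss(0, 1) for _ in range(n)]
u = [random.gauss(0, 1) for _ in range(n)]
x = [0.5 + zi + 0.8 * ui + random.gauss(0, 1) for zi, ui in zip(z, u)]
y = [alpha + beta * xi + ui for xi, ui in zip(x, u)]

ols_slope = ols(x, y)[1]                          # contaminated by Cov(X, U)
tsls_slope = two_stage_least_squares(y, x, z)[1]  # close to beta in large samples
print(round(ols_slope, 2), round(tsls_slope, 2))
```

In this simulated setting the plain OLS slope stays away from beta however large the sample, while the two-stage slope settles near beta, which is the sense in which the estimator is asymptotically unbiased.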
These estimators are called 2SLS estimators as well as Instrumental Variables Least Squares [IVLS] estimators.
c) Provide interpretation for each parameter in the light of the economic model you have chosen.
Answer:
Q.18:
a) Explain the use of dummy variables in determining the effects of gender (male or female) and education (matriculation, intermediate, bachelor or higher) on wage rates among clerical personnel.
Answer:
Wage = f [Gender (G), Education (E)]
Gender dummy: G = 1 if female, = 0 otherwise.
Education dummies: E2 = 1 if intermediate, = 0 otherwise. E3 = 1 if higher (bachelor or higher), = 0 otherwise.
We can construct the model in this way:
W = α + β G + U ------------------- (1)
α = α1 + α2 E2 + α3 E3 ------------------- (2a)
β = β1 + β2 E2 + β3 E3 ------------------- (2b)
Substitute (2a) and (2b) in (1):
W = α1 + α2 E2 + α3 E3 + (β1 + β2 E2 + β3 E3) G + U.
Or
W = α1 + α2 E2 + α3 E3 + β1 G + β2 (E2 G) + β3 (E3 G) + U.

Categories                        Expected Wage "E (W)"
Male, Matriculation (G = 0)       = α1
Male, Intermediate                = α1 + α2
Male, Higher                      = α1 + α3
Female, Matriculation (G = 1)     = α1 + β1
Female, Intermediate              = (α1 + α2) + (β1 + β2)
Female, Higher                    = (α1 + α3) + (β1 + β3)
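b) Provide interpretation for each parameter.
Answer:
Interpretation of parameters.
α1 = Mean wage of males with matriculation level education.
α2 = Difference in mean wage between males with intermediate and males with matriculation level education (the male mean at intermediate level is α1 + α2).
α3 = Difference in mean wage between males with higher and males with matriculation level education (the male mean at higher level is α1 + α3).
β1 = Differential effect of being a female at matriculation level education.
β2 = Additional differential effect of being a female at intermediate level, relative to matriculation level (the female differential at intermediate level is β1 + β2).
β3 = Additional differential effect of being a female at higher level, relative to matriculation level (the female differential at higher level is β1 + β3).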
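As a quick check on the table of expected wages, the substituted equation can be evaluated for each category. The sketch below uses purely hypothetical parameter values (they are not estimates from any data) and a hypothetical helper `expected_wage`; it only shows how the dummies switch the terms on and off.

```python
def expected_wage(G, E2, E3, a1, a2, a3, b1, b2, b3):
    """E(W) = a1 + a2*E2 + a3*E3 + b1*G + b2*(E2*G) + b3*(E3*G)."""
    return a1 + a2 * E2 + a3 * E3 + b1 * G + b2 * (E2 * G) + b3 * (E3 * G)


# Hypothetical parameter values, chosen only for illustration.
params = dict(a1=100.0, a2=20.0, a3=50.0, b1=-10.0, b2=-5.0, b3=-15.0)

# (G, E2, E3) for each category; matriculation is the base education level.
categories = {
    "Male, Matriculation": (0, 0, 0),
    "Male, Intermediate": (0, 1, 0),
    "Male, Higher": (0, 0, 1),
    "Female, Matriculation": (1, 0, 0),
    "Female, Intermediate": (1, 1, 0),
    "Female, Higher": (1, 0, 1),
}

wages = {name: expected_wage(*dummies, **params) for name, dummies in categories.items()}
for name, wage in wages.items():
    print(name, wage)
```

With these illustrative numbers, the female differential at intermediate level is the sum b1 + b2, which is why β2 alone is read as an additional differential rather than the whole gap.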
Q.19: Using the regression equation Yi = β + Ui, provide a precise answer to the following questions with or without mathematical proofs.
a) Under what assumption is the OLS estimator of β linear?
Answer: β̂ should be a linear function of Y.
b) Under what assumption is the OLS estimator of β unbiased?
Answer: When E (β̂) = β, the OLS estimator of β is unbiased; this holds when E (Ui) = 0.

Q.20: Consider the following demand function for rice, where Q is per capita consumption of rice in kilograms, P is the price of rice per kilogram and M is per capita income in rupees. The regression equation has been estimated on the basis of time series data for 9 years. The values in parentheses are standard errors.
ln Q = 2.46 – 0.45 ln P + 0.65 ln M
Standard errors, in the order of the terms above: (0.82) (0.20) (0.50)
R² = 0.90, F = 12.00
a) Explain the meanings of the estimated regression coefficients.
b) Test the following null hypotheses one by one and interpret the results of your tests.
i. Income elasticity of rice is greater than one.
ii. Income elasticity of rice is negative.

Q.21: In the regression equation Yi = β Xi + Ui the parameter β can be estimated using one of the following methods.
a) Are the two estimators linear? Prove.
b) Are the two estimators unbiased? Prove.
c) Which of the two estimators do you prefer over the other? Justify your choice.

Q.22: What are the limitations of the DW test for autocorrelation?
Answer:
1. The test statistic has an inconclusive range, so it may not produce a definite conclusion.
2. The test is designed for the AR (1) process, not for higher order autoregressive processes, MA processes or others.
3. DW gives biased results when the lagged dependent variable appears on the right hand side.
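To make the second limitation concrete, the sketch below computes the DW statistic from a series of residuals using the standard textbook definition, DW = ∑(et – et-1)² / ∑et² (the formula is not restated in this section, so treat it as an assumption of the example). The two series are simulated directly rather than taken from a fitted regression, and the value 0.8 for ρ is hypothetical.

```python
import random


def durbin_watson(residuals):
    """DW = sum of squared first differences divided by sum of squares."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den


random.seed(2)
n = 5000

# White noise: no autocorrelation, so DW should be close to 2.
white = [random.gauss(0, 1) for _ in range(n)]

# AR(1) with a hypothetical rho = 0.8: DW should be close to 2 * (1 - rho).
ar1 = [0.0]
for t in range(1, n):
    ar1.append(0.8 * ar1[-1] + random.gauss(0, 1))

dw_white = durbin_watson(white)
dw_ar1 = durbin_watson(ar1)
print(round(dw_white, 2), round(dw_ar1, 2))
```

Because the statistic is built only from adjacent residuals, it responds to first order correlation; a process whose dependence sits at longer lags can leave DW near 2.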
Q.23: Explain rank correlation test for heteroscedasticity.
Answer: We have not learned that test.

Q.24: Derive autocorrelation function for the following autoregressive processes.
a) Ut = ρ3 Ut-3 + εt
b) Ut = εt + δ1 εt-1 + δ3 εt-3

Q.25: Derive OLS estimators for the parameters of the following equations, where NX, X, and P are net export (exports minus imports), export and consumer price index respectively.
a) NXt = Xt - β Yt + Ut
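Answer:
Apply OLS: write the regression residual as et = NXt – Xt + β̂ Yt and minimize the sum of squared residuals ∑et².
∑et² = ∑ (NXt – Xt + β̂ Yt)²
First order condition:
∂∑et²/∂β̂ = 2 ∑ (NXt – Xt + β̂ Yt) Yt = 0
=> β̂ ∑Yt² = ∑ (Xt – NXt) Yt
=> β̂ = ∑ (Xt – NXt) Yt / ∑Yt²
b) Log (Pt) = α + log (Pt-1) + Ut
Answer:
Apply OLS: write the regression residual as et = log (Pt) – α̂ – log (Pt-1) and minimize the sum of squared residuals ∑et².
∑et² = ∑ (log (Pt) – α̂ – log (Pt-1))²
First order condition:
∂∑et²/∂α̂ = –2 ∑ (log (Pt) – α̂ – log (Pt-1)) = 0
=> α̂ = (1/n) ∑ [log (Pt) – log (Pt-1)], where n is the number of observations used.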
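The first order conditions for the two equations of Q.25 lead to β̂ = ∑ (Xt – NXt) Yt / ∑Yt² in part a) and α̂ = (1/n) ∑ [log (Pt) – log (Pt-1)] in part b). The sketch below evaluates both on small made-up series (all numbers and function names are hypothetical) and checks that each first order condition is numerically zero at the estimate.

```python
import math


def beta_hat(nx, x, y):
    """OLS estimator of beta in NX = X - beta*Y + U (no intercept)."""
    return sum((xi - nxi) * yi for nxi, xi, yi in zip(nx, x, y)) / sum(yi ** 2 for yi in y)


def alpha_hat(p):
    """OLS estimator of alpha in log(P_t) = alpha + log(P_{t-1}) + U_t."""
    diffs = [math.log(p[t]) - math.log(p[t - 1]) for t in range(1, len(p))]
    return sum(diffs) / len(diffs)


# Hypothetical series, for illustration only.
nx = [-2.0, -3.5, -1.0, -4.0]
x = [10.0, 11.0, 12.5, 13.0]
y = [30.0, 36.0, 33.0, 42.0]
p = [100.0, 104.0, 109.0, 115.0]

b = beta_hat(nx, x, y)
a = alpha_hat(p)

# First order conditions evaluated at the estimates; both should be (numerically) zero.
foc_a = sum((nxi - xi + b * yi) * yi for nxi, xi, yi in zip(nx, x, y))
foc_b = sum(math.log(p[t]) - a - math.log(p[t - 1]) for t in range(1, len(p)))
print(round(b, 4), round(a, 4))
```

In part b) the estimator is simply the average growth rate of the price index in logs, since the coefficient on log (Pt-1) is fixed at one.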
Q.26: Interpret the multicollinearity problem as poor information content in data. Consider any estimation strategy and explain how it can improve the information content.
Q.27: What econometric problems arise in the estimation of an equation with a lagged dependent variable on the right hand side? Suggest solution(s) to these problems.

Q.28: Specify an econometric equation to determine monthly earnings in a cross section of 300 economists in Pakistan. Define all the variables in your model and explain how they can be measured in practice.

Q.29: Determine identification of the following two equations by the hybrid equations method and explain the steps for estimation of each equation by the 2SLS method. Consider the following set of equations.
a) Y = α1 + α2 R + α3 Z + U
Answer:
Here Cov (R, U) ≠ 0. Z is the valid instrumental variable.
Stage 1: Regress R on Z.
R = a + b Z + V
Apply OLS and obtain â, b̂ and R̂, where R̂ = â + b̂ Z.
R̂ is such that it contains only that variation in R which is explained or determined by Z. In other words we have
R = R̂ + (R – R̂).
We can say that R is endogenous, therefore there is some trouble; R̂ de-endogenizes R.
Since Z also appears in the main equation, the first stage must in practice include at least one further exogenous variable that is excluded from the main equation; with Z alone, R̂ is an exact linear function of Z and Stage 2 cannot be estimated.
Stage 2: Rewrite the main equation as follows.
Y = α1 + α2 (R̂ + R – R̂) + α3 Z + U
Y = α1 + α2 R̂ + α3 Z + U + α2 (R – R̂), where α2 (R – R̂) is the error in the R variable.
Or
Y = α1 + α2 R̂ + α3 Z + W, where W = U + α2 (R – R̂).
Now apply OLS.
It can be shown that the estimators so obtained are (1) Not linear. (2) Biased. (3) Asymptotically unbiased
[as the sample size increases, the bias diminishes towards zero].
b) M = β1 + β2 Y + β3 R + V
Answer:
Here Cov (Y, V) ≠ 0. R is the valid instrumental variable.
Stage 1: Regress Y on R.
Y = α1 + α2 R + U
Apply OLS and obtain α̂1, α̂2 and Ŷ, where Ŷ = α̂1 + α̂2 R.
Ŷ is such that it contains only that variation in Y which is explained or determined by R. In other words we have
Y = Ŷ + (Y – Ŷ).
We can say that Y is endogenous, therefore there is some trouble; Ŷ de-endogenizes Y.
Since R also appears in the main equation, the first stage must in practice include at least one further exogenous variable that is excluded from the main equation; with R alone, Ŷ is an exact linear function of R and Stage 2 cannot be estimated.
Stage 2: Rewrite the main equation as follows.
M = β1 + β2 (Ŷ + Y – Ŷ) + β3 R + V
M = β1 + β2 Ŷ + β3 R + V + β2 (Y – Ŷ), where β2 (Y – Ŷ) is the error in the Y variable.
Or
M = β1 + β2 Ŷ + β3 R + W [W = V + β2 (Y – Ŷ)]
Now apply OLS.
It can be shown that the estimators so obtained are (1) Not linear. (2) Biased. (3) Asymptotically unbiased
[as the sample size increases, the bias diminishes towards zero].
Q.30: Consider an econometric equation involving four or more variables. Suppose you have access to only annual data for 25 years for Pakistan and no other data are available in or outside Pakistan. Further suppose that there is severe multicollinearity in the data that cannot be eliminated by dropping any variable from the equation. How would you handle this situation? Provide an elaborate answer.

Q.31: The daily demand for strawberries in Islamabad depends on the price of strawberries only. On each day a fixed quantity of strawberries (that can change from day to day) is brought to the market and the price is determined at a level that clears the market. If it were known that the elasticity of demand is constant, would you be able to obtain an unbiased estimator of the elasticity?
Answer:
Qd = α + β P + U.   [Demand function]
Qs = Q is fixed.   [Supply function]
Quantity supplied is a fixed, exogenous variable and price is endogenous.
If the elasticity of demand is constant, then the demand function becomes:
Log Qd = α + β log P + U.   [Demand function]
Log Qs = Log Q is fixed.   [Supply function]
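The constant elasticity specification above can be simulated to see what OLS does when price is the endogenous variable. The sketch assumes hypothetical values (α = 2, β = –1.5, unit variances) and generates the market clearing price from the demand function for an exogenously given quantity; the helper name `ols_slope` is also hypothetical. The reverse regression shown at the end is an addition for comparison, not part of the original answer.

```python
import random


def ols_slope(x, y):
    """Slope from a simple OLS regression of y on x with an intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


random.seed(1)
alpha, beta, n = 2.0, -1.5, 20000          # hypothetical demand parameters

log_q = [random.gauss(3.0, 1.0) for _ in range(n)]   # exogenous quantity brought to market
u = [random.gauss(0.0, 1.0) for _ in range(n)]       # demand shock

# Market clearing: log Q = alpha + beta*log P + U  =>  log P = (log Q - alpha - U) / beta
log_p = [(q - alpha - ui) / beta for q, ui in zip(log_q, u)]

direct = ols_slope(log_p, log_q)    # regress log Q on log P: P is correlated with U
reverse = ols_slope(log_q, log_p)   # regress log P on log Q: regressor is exogenous
implied = 1.0 / reverse             # elasticity implied by the reverse regression
print(round(direct, 2), round(implied, 2))
```

In this setup the direct regression slope is pulled towards zero, while the reciprocal of the reverse regression slope comes out close to β in a large sample; that reciprocal is a nonlinear transformation, so it should be read as consistent rather than unbiased.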
Here we cannot take P out of the expectation because P is not a "fixed" variable; P and U are correlated with each other, so the OLS estimator of β is biased.

Q.32: You want to estimate a Cobb-Douglas production function for the manufacturing sector of Pakistan with capital, labor and energy as the factor inputs, with only 16 time series observations available. A multicollinearity problem is likely to arise. In order to tackle this problem one can use 16 observations on the private sector and another 16 on the public sector to make a pooled sample of 32 observations. What complications are likely to arise due to pooling and how would you respond to these complications?
!~~~~~~~~~~~~~~~~~~~~~~~~~!
E523 Econometrics
Sir Eatzaz Ahmed
Terminal Paper
Spring Semester 2007
Total marks: 75
Time: 3 hours

NOTE: Attempt any three questions. Each question is worth 25 marks.

1. Explain and differentiate between:
a) Error term of a regression equation and regression residual
b) Random and fixed variables
c) R-square and Adjusted R-square
d) Goldfeld-Quandt test and White's general test
e) Dummy, proxy and instrumental variables

2. Are the following statements true, false or uncertain? Explain your answer.
a) The sample mean of the random error term, Ū = (1/n) ∑ Ui, is equal to zero.
b) In the regression equation Y/X = b + U the OLS estimator of b is equal to Y / X.
c) If the variable Y is regressed on X and log (X), it may create multicollinearity due to a strong linear relationship between the variables X and log (X).
d) In the equation Xt = α + β Yt + δ Yt-1 + Ut a major limitation of the DW test is that it produces biased results due to the presence of Yt-1 on the right hand side of the equation.
e) Since a dummy variable can take only two values, it must be fixed (exogenous).
f) Instrumental variables are used to test the presence of endogenous variables in the equation.

3. Consider the regression model:
Yt = α + β Yt + Ut
Ut = Ut-2 + εt, where εt is white noise.
a) Derive autocorrelation coefficients for the lag lengths 0, 1, 2, 3, 4.
b) Explain the Two-Step Iterative method of estimation.

4. Consider the regression model:
Yt = α + β Xt + Ut
Xt = a + b Yt + Vt
U and V satisfy all standard assumptions.
a) Show that the OLS estimator of β is biased.
b) Does the bias decrease with an increase in sample size?
c) Consider any context in economics in which the above model can be applied. Mention what the X and Y variables are in the context that you have chosen.
d) In the context you have chosen, what instrumental variable can be used for estimating α and β?
! ~~~~~~~~~~~~~~~~~~~~~~~~ !
E523 Econometrics
Sir Eatzaz Ahmed
1st Mid Term
Spring Semester 2007
Total marks: 35
Time: 1 hour

1. Suppose you have estimated two alternative cost functions for wheat using data on 500 farms. The cost (C) is measured in thousands of rupees while output (Q) is measured in tons. The regression results are given below. The values in parentheses are standard errors.
Log (C) = - 10.48 + 1.12 log (Q)
Standard errors: (2.56) (0.16)
C/Q = 4568 + 0.2284 Q + 4.84 Q⁻¹
Standard errors: (1141) (0.5521) (4.40)
Can you test the null hypothesis that the marginal cost is an increasing function of output for each equation? If yes, apply the test and draw your conclusion. If not, explain why the test cannot be applied and what additional information, if any, is required to perform the test.
Answer:
Log (C) = - 10.48 + 1.12 log (Q)
Standard errors: (2.56) (0.16)
Here β denotes the coefficient of log (Q).

Null hypothesis: H0: β = 0, H1: β ≠ 0
Apply test: t = 1.12 / 0.16 = 7.00
Degrees of freedom = (n – k) = (500 – 2) = 498.
Level of significance = 5%, two tail test, critical value = 1.96.
Since |t| = 7.00 > 1.96, we reject H0.

Null hypothesis: H0: β = 1, H1: β < 1
Apply test: t = (1.12 – 1) / 0.16 = 0.75
Degrees of freedom = (n – k) = (500 – 2) = 498.
Level of significance = 5%, left tail test, critical value = –1.645.
Since t = 0.75 > –1.645, we accept H0.
C/Q = 4568 + 0.2284 Q + 4.84 Q⁻¹
Standard errors: (1141) (0.5521) (4.40)
Write the equation as C/Q = α + β Q + γ Q⁻¹.

Null hypothesis: H0: α = 0, H1: α ≠ 0
Apply test: t = 4568 / 1141 = 4.00
Degrees of freedom = (n – k) = (500 – 3) = 497.
Level of significance = 5%, two tail test, critical value = 1.96.
Since |t| = 4.00 > 1.96, we reject H0.

Null hypothesis: H0: β = 0, H1: β ≠ 0
Apply test: t = 0.2284 / 0.5521 = 0.41
Degrees of freedom = (n – k) = (500 – 3) = 497.
Level of significance = 5%, two tail test, critical value = 1.96.
Since |t| = 0.41 < 1.96, we cannot reject H0.

Null hypothesis: H0: β = 1, H1: β < 1
Apply test: t = (0.2284 – 1) / 0.5521 = –1.40
Degrees of freedom = (n – k) = (500 – 3) = 497.
Level of significance = 5%, left tail test, critical value = –1.645.
Since t = –1.40 > –1.645, we accept H0.

Null hypothesis: H0: γ = 1, H1: γ > 1
Apply test: t = (4.84 – 1) / 4.40 = 0.87
Degrees of freedom = (n – k) = (500 – 3) = 497.
Level of significance = 5%, right tail test, critical value = 1.645.
Since t = 0.87 < 1.645, we accept H0.
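The test statistics for the two cost functions follow directly from the reported coefficients and standard errors, using t = (estimate – hypothesized value) / standard error. The short sketch below recomputes them; the function name `t_stat` and the dictionary labels are hypothetical, and 1.96 is the two tail 5% critical value used in the answer (a one tail test at 5% would use about 1.645 instead).

```python
def t_stat(estimate, std_error, hypothesized=0.0):
    """t = (estimate - hypothesized value) / standard error."""
    return (estimate - hypothesized) / std_error


# Log (C) = -10.48 + 1.12 log (Q), standard error of the slope = 0.16
# C/Q = 4568 + 0.2284 Q + 4.84 / Q, standard errors = 1141, 0.5521, 4.40
tests = {
    "log-log: beta = 0": t_stat(1.12, 0.16),
    "log-log: beta = 1": t_stat(1.12, 0.16, 1.0),
    "average cost: alpha = 0": t_stat(4568, 1141),
    "average cost: beta = 0": t_stat(0.2284, 0.5521),
    "average cost: beta = 1": t_stat(0.2284, 0.5521, 1.0),
    "average cost: gamma = 1": t_stat(4.84, 4.40, 1.0),
}

for name, t in tests.items():
    # Two tail comparison at the 5% level with the critical value 1.96.
    decision = "reject H0" if abs(t) > 1.96 else "do not reject H0"
    print(f"{name}: t = {t:.2f}, {decision}")
```

Only the first test in each equation produces a statistic beyond 1.96 in absolute value; in particular the coefficient of Q in the average cost equation has a t-value of roughly 0.41.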
Conclusion:
The individual tests on each equation are given above, and on that basis the equations appear satisfactory, but the overall performance of each equation should also be checked using R², which is not reported for either equation. The results do not show that marginal cost is an increasing function of output for either equation.

2. Using the regression equation Yi = β + Xi + Ui, provide a precise answer to the following questions with or without mathematical proofs.
a) OLS estimator of β is β̂.
Answer: The OLS estimator of β is β̂ = (1/n) ∑ (Yi – Xi).
Estimation:
Yi = β + Xi + Ui
As we know, ei = Yi – Ŷi = Yi – β̂ – Xi
∑ei² = ∑ (Yi – Ŷi)²
∑ei² = ∑ (Yi – β̂ – Xi)²
First order condition:
∂∑ei²/∂β̂ = 0
=> –2 ∑ (Yi – β̂ – Xi) = 0.
=> ∑Yi – n β̂ – ∑Xi = 0.
=> ∑Yi – ∑Xi = n β̂
Dividing both sides by n:
β̂ = (1/n) (∑Yi – ∑Xi)

b) Under what assumption is the OLS estimator of β linear?
Answer: The OLS estimator β̂ is linear.
β̂ = (1/n) (∑Yi – ∑Xi)
= (1/n) ∑ (Yi – Xi)
= (1/n) ∑ (β + Xi + Ui – Xi)
= (1/n) ∑ (β + Ui)
= ∑ ((1/n) β + (1/n) Ui)
= n (1/n) β + (1/n) ∑Ui
= β + (1/n) ∑Ui
= β + ∑ ai Ui, where ai = 1/n.

c) Under what assumption is the OLS estimator of β unbiased?
Answer: The OLS estimator β̂ is unbiased if E (β̂) = β.
Proof:
E (β̂) = E [β + ∑ (ai Ui)]
= β + ∑ ai E (Ui)   [since ai is fixed]
= β + ∑ ai (0)   [where E (Ui) = 0]
E (β̂) = β
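The unbiasedness result can be illustrated with a small Monte Carlo experiment for the model Y = β + X + U, whose OLS estimator is the sample mean of (Y – X). The settings below (β = 3, 25 observations, 4000 replications, standard normal errors, X fixed across replications) are hypothetical choices for the sketch.

```python
import random


def beta_hat(y, x):
    """OLS estimator of beta in Y = beta + X + U: the sample mean of (Y - X)."""
    return sum(yi - xi for yi, xi in zip(y, x)) / len(y)


random.seed(3)
beta, n, reps = 3.0, 25, 4000
x = [float(i) for i in range(1, n + 1)]   # regressor held fixed across replications

estimates = []
for _ in range(reps):
    # Draw fresh errors with E(U) = 0 and rebuild Y for each replication.
    y = [beta + xi + random.gauss(0, 1) for xi in x]
    estimates.append(beta_hat(y, x))

mean_estimate = sum(estimates) / reps
print(round(mean_estimate, 3))
```

Individual estimates scatter around β, but their average across replications sits very close to it, which is what E (β̂) = β says; if the errors were given a nonzero mean, the average would shift by that amount.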
! ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ !
E523 Econometrics
Sir Eatzaz Ahmed
2nd Mid Term
Spring Semester 2007
Total marks: 35
Time: 1 hour

1. How would you simply define multicollinearity? What type of procedure do you suggest to diagnose or test for multicollinearity?

2. Consider the following model:
Yt = α + β Xt + Ut. -------------------------- (1)
Ut = ρ Ut-1 + Єt. -------------------------- (2)
Єt is white noise.
Derive the autocorrelation coefficient function at lag lengths 0, 1, 2, 3, 4, and solve the model through the Iterative Two-Step procedure.
! ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ !