
PROJECT REPORT ON REGRESSION ANALYSIS [IMBA]

PRESENTED BY: D. SRIKANTH, ENROLL NO: 6NI14059

INTRODUCTION: Trevor Bull - Managing Director. Mr. Trevor Bull joined Tata AIG Life as Managing Director in January 2006. Prior to this, Trevor was Senior Vice President and General Manager at American International Assurance in Korea. Tata AIG Life Insurance Company Ltd. and Tata AIG General Insurance Company Ltd. (collectively "Tata AIG") are joint venture companies formed by the Tata Group and American International Group, Inc. (AIG). Tata AIG combines the power and integrity of the Tata Group with AIG's international expertise and financial strength. The Tata Group holds a 74 per cent stake in the two insurance ventures, with AIG holding the remaining 26 per cent.

Tata AIG Life Insurance Company Ltd. provides insurance solutions to individuals and corporates. Tata AIG Life Insurance Company was licensed to operate in India on February 12, 2001 and started operations in April 2001. Tata AIG Life offers a broad array of life insurance coverage to both individuals and groups, providing various types of add-ons and options on basic life products to give consumers flexibility and choice. Tata AIG Life Insurance Company offers products in Ahmedabad, Bangalore, Chandigarh, Chennai, Guwahati, Hyderabad, Jaipur, Jamshedpur, Jodhpur, Kochi, Kolkata, Mangalore, Mumbai, New Delhi, Pune, Rajkot, Trichy, Vijayawada and Lucknow.

Objective of the Study The objective of this study is to apply regression analysis to data collected from TATA AIG in the city of Hyderabad, examining the relationship between advertising expenditure and the number of customers it attracts.

Questionnaire Development For the purpose of this study, a structured questionnaire was developed. At this stage, an exploratory study was carried out using personal and focus-group interviews.

Collection of Data The above-mentioned questionnaire was used to collect the primary data. For secondary data, research papers, journals and magazines were referred to.

Regression analysis In statistics, regression analysis is a collective name for techniques for the modeling and analysis of numerical data consisting of values of a dependent variable (also called response variable or measurement) and of one or more independent variables (also known as explanatory variables or predictors). The dependent variable in the regression equation is modeled as a function of the independent variables, corresponding parameters ("constants"), and an error term.

The error term is treated as a random variable. It represents unexplained variation in the dependent variable. The parameters are estimated so as to give a "best fit" of the data. Most commonly the best fit is evaluated by using the least squares method, but other criteria have also been used.

Regression can be used for prediction (including forecasting of time-series data), inference, hypothesis testing, and modeling of causal relationships. These uses of regression rely heavily on the underlying assumptions being satisfied. Regression analysis has been criticized as being misused for these purposes in many cases where the appropriate assumptions cannot be verified to hold. One factor contributing to the misuse of regression is that it can take considerably more skill to critique a model than to fit one.
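To make the model form concrete, here is a minimal sketch (Python/NumPy, with made-up numbers; not part of the original report) that fits Y = a + bX by least squares, uses the fitted line for prediction, and treats the residuals as estimates of the error term:

```python
# Minimal sketch: fit Y = a + b*X by least squares and use it for prediction.
# The data below are invented purely for illustration.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])    # independent (explanatory) variable
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])   # dependent (response) variable

b, a = np.polyfit(x, y, 1)                 # degree-1 fit returns slope, then intercept
y_hat = a + b * x                          # fitted values on the regression line
residuals = y - y_hat                      # unexplained variation (estimated error term)

print(f"intercept a = {a:.3f}, slope b = {b:.3f}")
print(f"prediction at X = 6: {a + b * 6:.3f}")
```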

Underlying assumptions

Classical assumptions for regression analysis include:

• The sample must be representative of the population for the inference prediction.
• The error is assumed to be a random variable with a mean of zero conditional on the explanatory variables.
• The independent variables are error-free. If this is not so, modeling may be done using errors-in-variables model techniques.
• The predictors must be linearly independent, i.e. it must not be possible to express any predictor as a linear combination of the others. See Multicollinearity.
• The errors are uncorrelated, that is, the variance-covariance matrix of the errors is diagonal and each non-zero element is the variance of the error.
• The variance of the error is constant across observations (homoscedasticity). If not, weighted least squares or other methods might be used.
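As a rough illustration of the multicollinearity assumption in the list above, the following sketch (Python/NumPy, hypothetical predictor values; not part of the report's own analysis) checks the rank and pairwise correlations of a predictor matrix:

```python
# Illustrative multicollinearity check on a hypothetical predictor matrix:
# the predictors must not be (near) linear combinations of one another.
import numpy as np

X = np.array([[1.0, 2.0, 2.1],
              [2.0, 1.0, 1.1],
              [3.0, 4.0, 4.2],
              [4.0, 3.0, 3.0],
              [5.0, 6.0, 6.1]])   # rows = observations, columns = three predictors

# A column rank below the number of columns signals exact collinearity.
print("column rank:", np.linalg.matrix_rank(X), "of", X.shape[1])

# Pairwise correlations near +/-1 (here the 2nd and 3rd columns) warn of
# near-collinearity even when the rank is technically full.
print(np.round(np.corrcoef(X, rowvar=False), 3))
```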

These are sufficient (but not all necessary) conditions for the least-squares estimator to possess desirable properties; in particular, these assumptions imply that the parameter estimates will be unbiased, consistent, and efficient in the class of linear unbiased estimators. Many of these assumptions may be relaxed in more advanced treatments. Regression analysis that involves two variables is termed bivariate linear regression analysis. Regression analysis that involves more than two variables is termed "multiple regression analysis".

Bivariate linear regression analysis involves analyzing the straight-line relationship between two continuous variables. The bivariate linear regression can be expressed as:

Y = α + βX

where Y represents the dependent variable, X is the independent variable, and α and β are two constants known as the regression coefficients. β is the slope coefficient and can be symbolically represented as ∆Y/∆X. From any observation, α = Yi - βXi, and from two observations, β = (Yi - Yj) / (Xi - Xj).
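For intuition, the two-point formulas above can be evaluated directly; the sketch below (Python, with two made-up points) shows β as the slope ∆Y/∆X and α as the intercept recovered from one of the points:

```python
# Two hypothetical observations (Xi, Yi) and (Xj, Yj) lying on a straight line.
xi, yi = 2.0, 7.0
xj, yj = 5.0, 13.0

beta = (yi - yj) / (xi - xj)   # slope = change in Y per unit change in X -> 2.0
alpha = yi - beta * xi         # intercept from one observation -> 3.0

print(f"Y = {alpha} + {beta}*X")   # this line passes through both points
```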

Least square method The method of least squares or ordinary least squares (OLS) is used to solve overdetermined systems. Least squares is often applied in statistical contexts, particularly regression analysis. Least squares can be interpreted as a method of fitting data. The best fit in the least-squares sense is that instance of the model for which the sum of squared residuals has its least value, a residual being the difference between an observed value and the value given by the model. The method was first described by Carl Friedrich Gauss around 1794.[1] Least squares corresponds to the maximum likelihood criterion if the experimental errors have a normal distribution, and it can also be derived as a method of moments estimator. Regression analysis is available in most statistical software packages.
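The least-squares criterion can be seen numerically: the sketch below (Python/NumPy, toy data; OLS computed via np.polyfit rather than any package named in the report) shows that perturbing the fitted coefficients always increases the sum of squared residuals:

```python
# The OLS line minimises the sum of squared residuals (SSR); any nearby line does worse.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.1, 5.9, 8.2, 9.8])

def ssr(a, b):
    """Sum of squared residuals for the candidate line y = a + b*x."""
    return float(np.sum((y - (a + b * x)) ** 2))

b_ols, a_ols = np.polyfit(x, y, 1)   # ordinary least-squares slope and intercept

print("SSR at the OLS solution :", round(ssr(a_ols, b_ols), 4))
print("SSR with slope + 0.1    :", round(ssr(a_ols, b_ols + 0.1), 4))
print("SSR with intercept + 0.2:", round(ssr(a_ols + 0.2, b_ols), 4))
# Both perturbed lines give a strictly larger SSR than the least-squares line.
```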

This study examines the relationship between the amount spent on advertisement per month and the number of customers visiting because of the advertisements given by TATA AIG Life Insurance Co. The equation of the regression line assumed by least squares is shown below:

Y = a + bX + ei

where Y is the dependent variable, X is the independent variable, a is the Y intercept, b is the slope of the line, and ei is the error term. The table below shows the amount spent on advertisement and the number of customers visiting through advertisement.

MONTH   AMOUNT SPENT ON ADVERTISING (IN CRORES) [X]   NO. OF CUSTOMERS VISITED (IN 000'S) [Y]
JAN      3.6      9.3
FEB      4.8     10.2
MAR      2.4      9.7
APR      7.2     11.5
MAY      6.9     12.0
JUN      8.4     14.2
JUL     10.7     18.6
AUG     11.2     28.4
SEP      6.1     13.2
OCT      7.9     10.8
NOV      9.5     22.7
DEC      5.4     12.3
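The column sums that the calculation table further below builds up (ΣX, ΣY, ΣXY, ΣX²) can be cross-checked directly from the data above; a short Python/NumPy sketch (not part of the original report):

```python
# The advertising/customer data from the table above, and the column sums
# that the least-squares formulas below require.
import numpy as np

x = np.array([3.6, 4.8, 2.4, 7.2, 6.9, 8.4, 10.7, 11.2, 6.1, 7.9, 9.5, 5.4])          # ad spend, crores
y = np.array([9.3, 10.2, 9.7, 11.5, 12.0, 14.2, 18.6, 28.4, 13.2, 10.8, 22.7, 12.3])  # customers, '000s

print("ΣX  =", round(x.sum(), 2))          # 84.1
print("ΣY  =", round(y.sum(), 2))          # 172.9
print("ΣXY =", round((x * y).sum(), 2))    # 1355.61
print("ΣX² =", round((x ** 2).sum(), 2))   # 670.73
```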

The constant b can be calculated using the formula

b = [n∑(XY) - ∑X ∑Y] / [n∑(X²) - (∑X)²]

where X is the independent variable and Y is the dependent variable. The constant a is calculated as shown below:

a = Ȳ - bX̄

where Ȳ is the mean value of the dependent variable, X̄ is the mean value of the independent variable, and ei is the error, also called the residual value. The criterion for the least squares method is to minimise

Σ ei²  (summed over i = 1, …, n)

where ei = Yi - Ŷi, Yi is the actual value of the dependent variable, and Ŷi is the value lying on the estimated regression line. Let us solve the example previously discussed using the least squares method. We need to determine the constants a and b to develop the regression equation. The required calculations for determining the constants are shown in the table below.
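Before turning to the hand-calculation table, the closed-form expressions above can also be evaluated directly (a Python/NumPy sketch, not part of the original report); up to rounding it reproduces the report's b = 1.768 and a = 2.01:

```python
# b = (nΣXY - ΣXΣY) / (nΣX² - (ΣX)²)  and  a = Ȳ - bX̄, applied to the advertising data.
import numpy as np

x = np.array([3.6, 4.8, 2.4, 7.2, 6.9, 8.4, 10.7, 11.2, 6.1, 7.9, 9.5, 5.4])
y = np.array([9.3, 10.2, 9.7, 11.5, 12.0, 14.2, 18.6, 28.4, 13.2, 10.8, 22.7, 12.3])
n = len(x)

b = (n * (x * y).sum() - x.sum() * y.sum()) / (n * (x ** 2).sum() - x.sum() ** 2)
a = y.mean() - b * x.mean()
sse = ((y - (a + b * x)) ** 2).sum()   # the minimised least-squares criterion Σei²

print(f"b ≈ {b:.3f}, a ≈ {a:.3f}")     # ≈ 1.769 and ≈ 2.010 (the report rounds to 1.768 and 2.01)
print(f"Σei² ≈ {sse:.2f}")             # ≈ 126.79
```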

X (IN CRORES)   Y (IN 000'S)   XY         X²
 3.6             9.3            33.48      12.96
 4.8            10.2            48.96      23.04
 2.4             9.7            23.28       5.76
 7.2            11.5            82.80      51.84
 6.9            12.0            82.80      47.61
 8.4            14.2           119.28      70.56
10.7            18.6           199.02     114.49
11.2            28.4           318.08     125.44
 6.1            13.2            80.52      37.21
 7.9            10.8            85.32      62.41
 9.5            22.7           215.65      90.25
 5.4            12.3            66.42      29.16

ΣX = 84.1   ΣY = 172.9   ΣXY = 1355.61   ΣX² = 670.73

Substituting the values,

b = [12(1355.61) - (84.1)(172.9)] / [12(670.73) - (84.1)²] = 1.768

The next step is to calculate a. To calculate the value of a we first need to determine the mean values of the variables X and Y:

X̄ = 84.1/12 ≈ 7.0
Ȳ = 172.9/12 ≈ 14.40

Substituting these values into the equation,

a = Ȳ - bX̄ ≈ 14.40 - 12.39 = 2.01

We now develop the estimated regression equation by substituting the values of a and b:

Ŷ = 2.01 + 1.768X

where Ŷ represents the estimated value of the dependent variable for a given value of X.

The Strength of Association – R²

R² can be calculated using the following formula:

R² = explained variance / total variance

Since total variance = explained variance + unexplained variance, the explained variance equals the total variance minus the unexplained variance. Therefore

R² = (total variance - unexplained variance) / total variance = 1 - unexplained variance / total variance

The unexplained variance is given by Σ(Yi - Ŷi)² and the total variance by Σ(Yi - Ȳ)², so

R² = 1 - Σ(Yi - Ŷi)² / Σ(Yi - Ȳ)²
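The same R² can be computed directly from the residuals; the sketch below (Python/NumPy, refitting the coefficients rather than using the report's rounded a = 2.01 and b = 1.768) arrives at roughly the figure derived in the table that follows:

```python
# R² = 1 - Σ(Yi - Ŷi)² / Σ(Yi - Ȳ)² for the advertising data.
import numpy as np

x = np.array([3.6, 4.8, 2.4, 7.2, 6.9, 8.4, 10.7, 11.2, 6.1, 7.9, 9.5, 5.4])
y = np.array([9.3, 10.2, 9.7, 11.5, 12.0, 14.2, 18.6, 28.4, 13.2, 10.8, 22.7, 12.3])

b, a = np.polyfit(x, y, 1)          # re-estimated slope and intercept
y_hat = a + b * x

sse = ((y - y_hat) ** 2).sum()      # unexplained variance, ≈ 126.8
sst = ((y - y.mean()) ** 2).sum()   # total variance, ≈ 381.3
r2 = 1 - sse / sst

print(f"R² ≈ {r2:.2f}")             # ≈ 0.67, i.e. about 67%, in line with the report
```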

X      Y      XY        X²        Ŷ         Y - Ŷ     (Y - Ŷ)²   (Ŷ - Ȳ)²   (Y - Ȳ)²
 3.6    9.3    33.48     12.96     8.3748    0.9252    0.8560     36.3030    26.01
 4.8   10.2    48.96     23.04    10.4964   -0.2964    0.0879     15.2381    17.64
 2.4    9.7    23.28      5.76     6.2532    3.4468   11.8804     66.3704    22.09
 7.2   11.5    82.80     51.84    14.7396   -3.2396   10.4950      0.1153     8.41
 6.9   12.0    82.80     47.61    14.2092   -2.2092    4.8806      0.0364     5.76
 8.4   14.2   119.28     70.56    16.8612   -2.6612    7.0820      6.0575     0.04
10.7   18.6   199.02    114.49    20.9276   -2.3276    5.4177     42.6096    17.64
11.2   28.4   318.08    125.44    21.8116    6.5884   43.4070     54.9318   196.00
 6.1   13.2    80.52     37.21    12.7948    0.4052    0.1642      2.5767     1.44
 7.9   10.8    85.32     62.41    15.9772   -5.1772   26.8034      2.4876    12.96
 9.5   22.7   215.65     90.25    18.8060    3.8940   15.1632     19.4128    68.89
 5.4   12.3    66.42     29.16    11.5140    0.7860    0.6178      8.3290     4.41

ΣX = 84.1   ΣY = 172.9   ΣXY = 1355.61   ΣX² = 670.73   X̄ ≈ 7.0   Ȳ ≈ 14.40
Σ(Y - Ŷ)² = 126.855   Σ(Ŷ - Ȳ)² ≈ 254.4   Σ(Y - Ȳ)² = 381.29

Therefore R² = 1 - Σ(Yi - Ŷi)² / Σ(Yi - Ȳ)² = 1 - 126.855/381.29 = 1 - 0.33 = 0.67 = 67%

Conclusion This implies that, of the total variation in Y, nearly 67% is explained by the variation in X. Hence there is a strong linear relationship between the two variables.
