DataScience R Project Insurance

May 24, 2018 | Author: v_kans | Category: Vehicle Insurance, Insurance, Regression Analysis, Mathematics, Business
Share Embed Donate


Short Description

ds with r...

Description

Data Data Science with with R Project Project Submission -Swedish Insurance Project Submitted by : xxxx

Submitted on dd-mmm-yyyy

Problem Statement •







The data gives the details of third party motor insurance claims in Sweden for the year 1977. In Sweden, all motor insurance companies apply identical risk arguments to classify customers, and thus their portfolios and their claims statistics can be combined. The data were compiled by a Swedish Committee on the Analysis of Risk Premium in Motor Insurance. The Committee was asked to look into the problem of analyzing the real influence on the claims of the risk arguments and to compare this structure with the actual tariff.

Problem Statement - continued 1.

The committee is interested to know each field of the data collected through descriptive analysis to gain basic insights into the dataset and to prepare for further analysis.

2.

The total value of payment by an insurance company is an important factor to be monitored. So the committee has decided to find whether this payment is related to the number of claims and the number of insured policy years. They also want to visualize the results for better understanding.

3.

The committee wants to figure out the reasons for insurance payment increase and decrease. So they have decided to find whether distance, location, bonus, make, and insured amount or claims are affecting the payment or all or some of them are affecting it.

4.

The insurance company is planning to establish a new branch office, so they are interested to find at what location, kilometer, and bonus level their insured amount, claims, and payment get increased.

5.

The committee wants to understand what affects their claim rates so as to decide the right premiums for a certain set of situations. Hence, they need to find whether the insured amount, zone, kilometer, bonus, or make affects the claim rates and to what extent.

Question-1 The committee is interested to know each field of the data collected through descriptive analysis to gain basic insights into the dataset and to prepare for further analysis.

Step-1: Univariate Analysis:- To find the number of variables in the given data and find its data type. 2nd level analysis to find the unique data for each variable.

Step-2: Summarizing the given data. Kilometres, Zone, Bonus and Make are categorical variables. So that converted into factors. Mean of Insured is 1092.2, Mean of Claims is 51.87 and Mean of Payment is 257008. Finding-1: Claims & Payment have 0 minimum value, ie, there is no claim when there is no payment

Question-2 The total value of payment by an insurance company is an important factor to be monitored. So the committee has decided to find whether this payment is related to the number of claims and the number of insured policy years. They also want to visualize the results for better understanding.

Finding: Claims have 99% correlation with Payment

Finding: Insured have 93% correlation with Payment

Insured : 0.933217

Claims : 0.9954003

     t     n     e     m     y     a      P

     0      0      0      0      0      0      0      1

     t     n     e     m     y     a      P

     0

     0      0      0      0      0      0      0      1

     0 0

500

1500 Claims

2500

0

40000

80000 Insured

120000

Question-3 The insurance company is planning to establish a new branch office, so they are interested to find at what location, kilometer, and bonus level their insured amount, claims, and payment get increased.

Perform Linear Regression function to find the impact for Dependent variable: Payment Independent variables: Insured, Claims, Make, Bonus, Zone, and Kilometers

Question-4 The committee wants to figure out the reasons for insurance payment increase and decrease. So they have decided to find whether distance, location, bonus, make, and insured amount or claims are affecting the payment or all or some of them are affecting it.

Grouped the Insured, Claims and Payments based on a)Zone, b)Kilometre & c) Bonus. The findings are marked below in the screenshot

Question-5 The committee wants to understand what affects their claim rates so as to decide the right premiums for a certain set of situations. Hence, they need to find whether t he insured amount, zone, kilometer, bonus, or make affects the clai m rates and to what extent.

Perform Linear Regression function to find the impact for Dependent variable: Claims Independent variables: Insured, Claims, Make, Bonus, Zone, and Kilometers

Thank You

View more...

Comments

Copyright ©2017 KUPDF Inc.
SUPPORT KUPDF