Part 1 Building your Own Binary Classification Model.txt


First Binary Classification Model

Data_Final Project.xlsx

You work for a bank as a business data analyst in the credit card risk-modeling department. Your bank conducted a bold experiment three years ago: for a single day it quietly issued credit cards to everyone who applied, regardless of their credit risk, until the bank had issued 600 cards without screening applicants. After three years, 150, or 25%, of those card recipients defaulted: they failed to pay back at least some of the money they owed. However, the bank collected very valuable proprietary data that it can now use to optimize its future card-issuing process.

The bank initially collected six pieces of data about each person:

• Age
• Years at current employer
• Years at current address
• Income over the past year
• Current credit card debt, and
• Current automobile debt

In addition, the bank now has a binary outcome: default = 1, and no default = 0. Your first assignment is to analyze the data and create a binary classification model to forecast future defaults. You will combine data from the above six inputs to output a single "score." Use the Soldier Performance spreadsheet for a simple example of combining multiple inputs.

Forecasting Soldier Performance.xlsx

The relative rank-ordering of scores will determine the model's effectiveness. For convenience (in particular, so that you can use the AUC Calculator Spreadsheet) you are asked to use a scale for your score that has a maximum < 3.5 and a minimum > -3.5.

At first you are not told your bank's own best estimate for its cost per False Negative (accepted applicant who becomes a defaulting customer) and False Positive (rejected customer who would not have defaulted) classification. Therefore, the best you can do is to design your model to maximize the Area Under the ROC Curve (AUC).

Question: What is your model? Give it as a function of two or more of the six inputs. For example: (Age … Years at Current Address) … Income (not a great model!). Your model should have at least two inputs.

1 r
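As an illustration of combining inputs into one score on the required scale, here is a minimal Python sketch. The model itself (standardized credit-card-debt-to-income ratio minus standardized years at current employer), the field names, and the three applicants are all assumptions made up for this example; they are not taken from the project data or the Soldier Performance spreadsheet.

```python
from statistics import mean, pstdev

SCORE_LIMIT = 3.5  # scores must stay strictly inside (-3.5, 3.5)


def zscores(values):
    """Standardize a column to mean 0 and standard deviation 1.

    Assumes the column is not constant (standard deviation > 0).
    """
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def score_applicants(applicants):
    """Hypothetical two-input model: a higher score means a riskier applicant."""
    debt_ratio = [a["card_debt"] / a["income"] for a in applicants]
    tenure = [a["years_employer"] for a in applicants]
    raw = [d - t for d, t in zip(zscores(debt_ratio), zscores(tenure))]
    # Multiplying by a positive factor keeps the rank order of the scores
    # while pulling the largest magnitude just inside the allowed range.
    peak = max(abs(r) for r in raw)
    factor = 0.99 * SCORE_LIMIT / peak
    return [r * factor for r in raw]


# Made-up applicants, for illustration only.
applicants = [
    {"income": 40000, "card_debt": 8000, "years_employer": 1},
    {"income": 90000, "card_debt": 2000, "years_employer": 12},
    {"income": 60000, "card_debt": 3000, "years_employer": 5},
]
scores = score_applicants(applicants)
```

Because only the rank order of the scores matters for AUC, the final rescaling step changes nothing about the model's effectiveness; it only satisfies the scale requirement.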

?hat is your odel� s )3C on the 0rainin! 4et@ 3se two di!its to the ri!ht of the decial "lace. *, x ' x .G r 9999Hess than .+ is not correct - you need to ake the hi!hest %alue the lowest by di%idin! by -*. .+ has no "redicti%e %alue. .I or hi!her is too !ood to be trueE::::

1nitial )ssessent for uestion O. >uestion# )t that sae threshold score 98uestion ' are sustainable lon! ter. >uestion# ow uch oney does the bank sa%e& "er e%ent& usin! your odel and its data-in"uts& instead of issuin! credit cards to e%eryone who asks@ int# the cost of issuin! credit cards to e%eryone 9no odel& no forecast: has been deterined to be ,+LK+((( 2 K*&,+( "er e%ent. Dollar %alue of the odel-"lus-data is the difference between K*&,+( and your nuber. 8ote# for Coursera to inter"ret your answer correctly you ust !i%e your answer as an inte!er - no decials or dollar si!n. For =xa"le - enter KJ((.(( as NJ((N *(( x ,(( r 99999999952K*+( sa%in!s is a weak odel 5K*+( to 52 K,+( sa%in!s is an ok odel

5 K,+( to 52 KO+( sa%in!s is a %ery !ood odel 7KO+( sa%in!s is an excellent odel:::::::: Payback Period for Your Model >uestion# Ai%en that it a""arently cost the bank KG+(&((( to conduct the three-year ex"erient& if the bank "rocesses *((( credit card a""licants "er day on a%era!e& how any days will it take to ensure future sa%in!s will "ay back the bankQs initial in%estent@ Ai%e nuber rounded to the nearest day 9inte!er %alue:. int# ulti"ly your answer to >uestion G - the cost sa%in!s "er a""licant - by *((( to !et the sa%in!s "er day. G(((((

x

6 r 999999More than a week O-G days



%ery !ood

,-6 days



excellent

* day





"oor

too !ood to be trueE:::::::::
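The arithmetic behind the savings and payback questions can be sketched as follows. The baseline cost, experiment cost, and applicants per day come from the text above; the model's own cost per event (1000.0 here) is a hypothetical placeholder, not an answer to the quiz.

```python
import math

BASELINE_COST = 0.25 * 5000   # $1,250 per event with no model (from the hint)
EXPERIMENT_COST = 750_000     # cost of the three-year experiment
APPLICANTS_PER_DAY = 1000


def savings_per_event(model_cost_per_event):
    """Dollar value of the model-plus-data: baseline cost minus model cost."""
    return BASELINE_COST - model_cost_per_event


def payback_days(savings):
    """Days until daily savings cover the experiment, to the nearest day."""
    if savings <= 0:
        raise ValueError("a model with no savings never pays back")
    days = EXPERIMENT_COST / (savings * APPLICANTS_PER_DAY)
    return math.floor(days + 0.5)  # round half up to the nearest day


# Hypothetical model cost per event, chosen only for illustration.
example_savings = savings_per_event(1000.0)
example_days = payback_days(example_savings)
```

With the placeholder cost, the savings are $250 per applicant, or $250,000 per day, so the $750,000 experiment is paid back in 3 days.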

)ny odel that is reducin! uncertainty will ha%e a 0rue Positi%e ;ate... ...=$ual to the 0est 1ncidence 9 of outcoes classified as NdefaultN: x ...Hess than the 0est 1ncidence 9 of outcoes classified as NdefaultN: x ...Areater than the 0est 1ncidence 9 of outcoes classified as NdefaultN: Ai%en that the base rate of default in the "o"ulation is ,+& any test that is reducin! uncertainty will ha%e a Positi%e Predicti%e alue 9PP:... ...=$ual to .,+ x ...Hess than .,+ x ...Areater than .,+ Ai%en that the base rate of default in the "o"ulation is ,+& any test that is reducin! uncertainty will ha%e a 8e!ati%e Predicti%e alue 98P:... =$ual to .G+ x ...Hess than .G+ x ...Areater than .G+

Confusion Matrix Metrics. To determine all performance metrics for a binary classification, it is sufficient to have three values:

• The Condition Incidence (here the default rate of 25%)
• The probability of True Positives (the True Positive Rate multiplied by the Condition Incidence)
• The "Test Incidence" (also called "classification incidence": the sum of the probability of True Positives and False Positives)

These three values can all be obtained from the AUC Calculator Spreadsheet and then used as inputs to the Information Gain Calculator Spreadsheet to determine all other performance metrics.

AUC_Calculator and Review of AUC Curve.xlsx
Information Gain Calculator.xlsx

Question: What is your model's True Positive Rate? Save this answer as it will be needed again for Part 3 (Quiz 3).

1 x 30 x .30 r

(<= .25 is incorrect)

Question: What is your model's "test incidence"? Save this answer as it will be needed again for Part 3 (Quiz 3).

0 x 1 x 1000 x 200.00 x

Test Incidences cannot be so small that they force a high false negative rate nor so large that they force a high false positive rate. A perfect test will of course have a Test Incidence equal to the Condition Incidence, but most classification systems are focused on avoiding false negatives and have a higher Test Incidence than Condition Incidence.
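To show why the three values listed under Confusion Matrix Metrics are sufficient, here is a minimal Python sketch that fills in the rest of the confusion matrix from them. It is not the Information Gain Calculator Spreadsheet; the function name and the example inputs (P(TP) = 0.15, test incidence = 0.30) are assumptions chosen only for illustration, with the 25% condition incidence taken from the text.

```python
def confusion_metrics(condition_incidence, p_true_positive, test_incidence):
    """Derive the full confusion matrix and common rates from three values."""
    c, tp, t = condition_incidence, p_true_positive, test_incidence
    fp = t - tp              # classified "default" but did not default
    fn = c - tp              # defaulted but classified "no default"
    tn = 1.0 - tp - fp - fn  # the four cells must sum to 1
    return {
        "P(TP)": tp,
        "P(FP)": fp,
        "P(FN)": fn,
        "P(TN)": tn,
        "TPR": tp / c,         # True Positive Rate
        "FPR": fp / (1 - c),   # False Positive Rate
        "PPV": tp / t,         # Positive Predictive Value
        "NPV": tn / (1 - t),   # Negative Predictive Value
    }


# Hypothetical inputs: 25% condition incidence, P(TP) = 0.15, test incidence = 0.30.
metrics = confusion_metrics(0.25, 0.15, 0.30)
```

For these inputs the True Positive Rate is 0.60, which is greater than the test incidence of 0.30, and correspondingly the PPV (0.50) exceeds .25 and the NPV (about 0.857) exceeds .75.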
