Application and Comparison of Classification Techniques in Controlling Credit Risk

September 12, 2022 | Author: Anonymous | Category: N/A


Abstract: Credit rating is a powerful tool that can help banks improve loan quality and decrease credit risk. This chapter examines major classification techniques, which include traditional statistical models (LDA, QDA and logistic regression), k-nearest neighbors, Bayesian networks (Naïve Bayes and TAN), decision trees (C4.5), associative classification (CBA), a neural network and support vector machines (SVM), and applies them to controlling credit risk. The experiments were conducted on 244 rated companies mainly from the Industrial and Commercial Bank of China. The receiver operating characteristic curve and the DeLong-Pearson method were adopted to verify and compare their performance. The results reveal that while traditional statistical models produced the poorest outcomes, C4.5 or SVM did not perform satisfactorily, and CBA seemed to be the best choice for credit rating in terms of predictability and interpretability.

Key Words: Classification, credit risk, ROC curve, DeLong-Pearson method.

1. Credit Risk and Credit Rating

In a broad sense, credit risk is the uncertainty or fluctuation of profit in credit activities. People pay high attention to the possible loss of credit assets in the future. In other words, the credit risk is the possibility that some related factors change and affect credit assets negatively, thus leading to the decline of the whole bank's value (Yang, Hua & Yu 200[...]

[...]derington 1985). In addition to these traditional statistical methods, artificial intelligence techniques, such as case based reasoning systems and neural networks, were adopted to improve the prediction ability in credit ratings. Investigations of neural networks and numerous experiments revealed that such methods can normally reach higher accuracy than traditional statistical methods (Dutta & Shekhar 1988; Singleton & Surkan 1990; Moody & Utans 1995; Kim 199[...]

[...]P (Dong, Zhang, Wong & Li 1999), CMAR (Liu, Han & Pei 2001) and CPAR (Yin & Han 200[...]

[...] Each point on the ROC curve corresponds to a specific threshold value. The area under the ROC curve (AUC) was computed to measure the discriminatory performance. From a statistical viewpoint, it estimates the probability that the output of a randomly selected sample from the negative population will be less than or equal to that of a randomly selected sample from the positive population. The larger AUC, the more powerful the classifier is.

DeLong and his associates proposed a nonparametric approach to compare the areas under two or more correlated receiver operating characteristic curves (DeLong, DeLong & Clarke-Pearson 1988). Suppose that there are $K$ classifiers to be compared and the outputs of the $r$-th classification algorithm on positive samples are $X_i^r$, $i = 1, 2, \ldots, m$, and the outputs on negative samples are $Y_j^r$, $j = 1, 2, \ldots, n$. The AUC value of the $r$-th classifier can be evaluated as:

[formula]

where $L$ is the row vector of coefficients. For instance, when only two classifiers are compared, $L$ is set at $(1, -1)$ and the AUC values $\theta_1$ and $\theta_2$ are assumed to be equal. The corresponding value of equation (54) shows whether these two AUC values are statistically different.

4.3 Experimental Results

Table 3 summarizes the accuracy and AUC value of each classifier on the test dataset. Their ROC curves are plotted in Figures 5, 6, 7 and 8.

[Table 3]

In order to test whether the difference between two classifiers is significant, the non-parametric DeLong-Pearson statistical method (DeLong, DeLong & Clarke-Pearson 1988) was employed. The comparison results are described in Table 4 in terms of one-tail p-values.

[Figures 5-8]

[Table 4]

The AUC value of a classifier on the i-th row is less than or equal to that of the classifier on the j-th column when i < j. According to the above table, the performance of these classifiers can be grouped into three categories. The traditional statistical methods, such as LOG, LDA and QDA, resulted in the poorest performance. The C4.5 and SVM methods did not achieve the satisfactory results expected. The AUC values of the remaining five classifiers are higher and do not differ significantly from one another. The CBA algorithm is preferred in this experiment because of its powerful classification ability as well as its understandable rule sets for decision makers.
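The pairwise AUC comparison used above can be sketched in a few lines of Python. This is a minimal sketch and not the chapter's own implementation: it assumes the standard formulation of DeLong, DeLong & Clarke-Pearson (1988), in which the AUC is the average of a kernel that scores 1 when a positive sample outranks a negative one and 0.5 on ties, and the statistic for the contrast (1, -1) is treated as standard normal. The function names and the toy scores are hypothetical.

```python
import math


def psi(x, y):
    """Kernel: 1 if positive score x exceeds negative score y, 0.5 on ties, else 0."""
    if y < x:
        return 1.0
    if y == x:
        return 0.5
    return 0.0


def auc_components(pos, neg):
    """Return (AUC, V10, V01) for one classifier's positive and negative scores."""
    m, n = len(pos), len(neg)
    v10 = [sum(psi(x, y) for y in neg) / n for x in pos]  # one value per positive
    v01 = [sum(psi(x, y) for x in pos) / m for y in neg]  # one value per negative
    return sum(v10) / m, v10, v01


def _cov(a, mean_a, b, mean_b):
    """Sample covariance with the (len - 1) denominator."""
    return sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b)) / (len(a) - 1)


def delong_two_classifiers(pos_a, neg_a, pos_b, neg_b):
    """Compare two correlated AUCs with the contrast L = (1, -1).

    pos_a[i] and pos_b[i] must be scores for the same positive sample
    (likewise for negatives). Returns (auc_a, auc_b, z, one_tail_p),
    where the p-value tests the alternative auc_a > auc_b.
    """
    m, n = len(pos_a), len(neg_a)
    auc_a, v10_a, v01_a = auc_components(pos_a, neg_a)
    auc_b, v10_b, v01_b = auc_components(pos_b, neg_b)
    # Entries of S = S10 / m + S01 / n for the two classifiers.
    var_a = _cov(v10_a, auc_a, v10_a, auc_a) / m + _cov(v01_a, auc_a, v01_a, auc_a) / n
    var_b = _cov(v10_b, auc_b, v10_b, auc_b) / m + _cov(v01_b, auc_b, v01_b, auc_b) / n
    cov_ab = _cov(v10_a, auc_a, v10_b, auc_b) / m + _cov(v01_a, auc_a, v01_b, auc_b) / n
    var_diff = var_a + var_b - 2.0 * cov_ab  # L S L^T with L = (1, -1)
    if var_diff <= 0.0:
        raise ValueError("degenerate variance; the statistic is undefined")
    z = (auc_a - auc_b) / math.sqrt(var_diff)
    one_tail_p = 0.5 * math.erfc(z / math.sqrt(2.0))  # P(Z >= z), Z standard normal
    return auc_a, auc_b, z, one_tail_p


# Hypothetical scores from two classifiers on the same 4 positive and 4 negative samples.
pos_a, neg_a = [0.9, 0.8, 0.7, 0.6], [0.5, 0.4, 0.65, 0.2]
pos_b, neg_b = [0.6, 0.3, 0.8, 0.4], [0.5, 0.7, 0.2, 0.35]
auc_a, auc_b, z, p = delong_two_classifiers(pos_a, neg_a, pos_b, neg_b)
print(f"AUC A = {auc_a:.4f}, AUC B = {auc_b:.4f}, z = {z:.3f}, one-tail p = {p:.3f}")
```

The paired inputs matter: because both classifiers are scored on the same companies, their AUC estimates are correlated, and the covariance term is what distinguishes this test from comparing two independent AUCs.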

Our research results are consistent with prior research studies indicating that machine learning techniques, such as decision trees and neural networks, can normally provide better prediction outcomes than traditional statistical methods. This is probably the case because traditional statistical methods require researchers to impose specific structures and assumptions to different models and then to estimate parameters in them so as to fit these training data. Machine learning techniques, however, are free of structural assumptions that underlie statistical methods, and can extract knowledge from a dataset directly. For example, the structure of a decision tree is never determined before being trained, while it can be recursively split, from a root node, and pruned later in order to fit the training data as well as to obtain good prediction ability.

The most surprising result is that the popular SVM method did not achieve outstanding performance no matter what penalty factor and kernel parameters were selected. This result disagrees with previous work regarding the application of SVM to the analysis of credit data. The mechanism behind this phenomenon deserves more exploration and analysis in the future.

The associative classification techniques, such as CBA, have not been emphasized in previous credit rating research work. As mentioned above, associative classifiers search globally for all class association rules that satisfy given minimum support and minimum confidence thresholds.
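The search for class association rules under minimum support and minimum confidence thresholds can be illustrated with a short Python sketch. It is a brute-force enumeration under stated assumptions, not the rule generator used by CBA: the function name, the attribute names and the six toy records are hypothetical, attributes are assumed to be discretized already, and the max_len cap on rule length is introduced here only for illustration. The final sort puts higher confidence first, then higher support, with rule length as an assumed tie-breaker.

```python
from itertools import combinations


def mine_class_association_rules(rows, labels, min_support, min_confidence, max_len=2):
    """Enumerate rules (itemset -> class) that meet both thresholds.

    rows: list of dicts mapping an attribute name to a discrete value.
    labels: class label of each row.
    Returns tuples (itemset, class, support, confidence).
    """
    n = len(rows)
    items = sorted({(a, v) for row in rows for a, v in row.items()})
    rules = []
    for size in range(1, max_len + 1):
        for itemset in combinations(items, size):
            # Skip itemsets that assign two values to the same attribute.
            if len({a for a, _ in itemset}) < size:
                continue
            # Labels of the rows that match every (attribute, value) pair.
            covered = [lab for row, lab in zip(rows, labels)
                       if all(row.get(a) == v for a, v in itemset)]
            if not covered:
                continue
            for cls in sorted(set(covered)):
                hits = covered.count(cls)
                support = hits / n                # share of all rows matching rule and class
                confidence = hits / len(covered)  # share of matching rows with this class
                if support >= min_support and confidence >= min_confidence:
                    rules.append((itemset, cls, support, confidence))
    # Rank: higher confidence, then higher support, then shorter rule.
    rules.sort(key=lambda r: (-r[3], -r[2], len(r[0])))
    return rules


# Hypothetical, already discretized records; not data from the chapter.
rows = [
    {"leverage": "high", "liquidity": "low"},
    {"leverage": "high", "liquidity": "low"},
    {"leverage": "high", "liquidity": "high"},
    {"leverage": "low", "liquidity": "high"},
    {"leverage": "low", "liquidity": "high"},
    {"leverage": "low", "liquidity": "low"},
]
labels = ["bad", "bad", "good", "good", "good", "bad"]

rules = mine_class_association_rules(rows, labels, min_support=0.3, min_confidence=0.8)
for itemset, cls, support, confidence in rules:
    condition = " and ".join(f"{a}={v}" for a, v in itemset)
    print(f"if {condition} then {cls} (support={support:.2f}, confidence={confidence:.2f})")
```

Because every itemset is examined independently of the others, a rule on one attribute does not constrain which rules may be found on another attribute.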
The richness of the rules gives this technique the potential to uncover the true classification structure of the data. Compared with decision tree based techniques, associative classification is more flexible because a decision tree is generated in a recursive way, which may prevent it from discovering a better classification strategy. For example, once the first split of the root node is performed, it will affect all subsequent split choices, which appears to be a bit rigid in this sense. As long as class association rules are pruned and organized appropriately, associative classification techniques can probably yield good performance.

Although the experiments in this chapter indicate that CBA (or associative classification methods in general) has its advantages and might be a proper choice when rating the risk of an applicant, it is worth mentioning that these techniques are heuristic and data driven, and it is impossible for one algorithm to outperform all others in all situations. Users or decision makers are expected to be cautious in selecting appropriate classification tools and their corresponding parameters if they try to extract knowledge of high quality from enterprise data.

5. Conclusions and Future Work

Controlling credit risk is crucial for commercial banks to identify the clients that will probably breach their contracts in the future. Although the credit rating system provides an effective tool, it is not possible to rate all the clients and repeat the rating frequently. Data mining and computational intelligence, especially classification techniques, can be applied to learn and predict the credit rating automatically, thus helping commercial banks detect the potential high-risk clients in an accurate and timely manner.

A comprehensive examination of several well-known classifiers is described in this chapter. All these classifiers have been applied to 244 rated companies mainly from the Industrial and Commercial Bank of China. The results revealed that traditional statistical models had the poorest outcomes, and that C4.5 and SVM did not achieve the satisfactory performance expected. On the other hand, CBA, an associative classification technique, seemed to be the most appropriate choice.

Future work may focus on collecting more data for experiments and applications, particularly with more exploration of Chinese credit rating data structures. In this chapter, the feature selection/transformation methods used, such as ANOVA or PCA, are independent of the classification methods and did not lead to improvements of their prediction abilities. A future investigation might apply another type of feature selection method, one that depends on the classification algorithm, in order to find the best feature combination for each classifier.

Acknowledgements

The work was partly supported by the National Natural Science Foundation of China (79925001/7023[...]
