A Structured Approach to Building Predictive Key Risk Indicators
by Aravind Immaneni, Chris Mastro, and Michael Haubenstock

Aravind Immaneni is a senior process redesign specialist, Chris Mastro is group manager of process engineering, and Michael Haubenstock is director, Operational Risk Management, at Capital One, Richmond, Virginia. © 2004 by RMA.
Leading risk indicators with good predictive capabilities are critical to the successful management of enterprise risk. This article describes how a process that incorporates some Six Sigma methods for developing and using key risk indicators was used at Capital One.
The Role of KRIs in ERM

Key risk indicators (KRIs) play a critical role in any risk management framework. As tools for monitoring controls, risk drivers, and exposures, they can provide insight into potential risk events. For example, where self-assessments are used periodically to identify risks and controls, KRIs can monitor them in the intervening intervals. KRIs can also provide a means to express risk appetite. KRIs often serve their most practical purpose in conjunction with a system of thresholds: when a KRI breaches its associated threshold, it triggers a review, escalation, or management action. As a rule, KRIs should be monitored closer to the "front" than in the higher reaches of management. In the absence of any major risk changes, monthly summaries of the most important measures may suffice as a risk-profile update to management. However, this is easier said than done, and one of the current challenges of operational risk management is how to structure senior management reporting to be as useful as possible. Especially where KRIs are concerned, most measures are business or process specific and difficult to aggregate. Even measures that are common to many areas of an organization, such as turnover, training, and other human resources measures, may track risk well in relatively small business units but track it very poorly when measured at the enterprise level.

Types of Key Risk Indicators

Key risk indicators encompass different types of metrics. For the purposes of this article, KRIs are divided into four categories: coincident indicators, causal indicators, control effectiveness indicators, and volume indicators.
• Coincident indicators can be thought of as proxy measures of a loss event and can include internal error metrics or near misses. An example of a coincident indicator in a payment-processing operation is the number of misapplied payments identified through internal quality assurance sampling.
• Causal indicators are metrics aligned with the root causes of a risk event, such as system downtime or the number of late purchase orders.
• Control effectiveness indicators provide ongoing monitoring of the performance of controls. Measures may include control effectiveness, such as the percent of the supplier base using encrypted data transfer, or bypassed controls, such as dollars spent with nonapproved suppliers.
• Volume indicators (sometimes called inherent risk indicators) frequently are tracked as key performance indicators; however, they can also serve as KRIs. As volume indicators (e.g., the number of online account applications) change, they can increase the likelihood and/or impact of an associated risk event, such as fraud losses. Volume indicators are often associated with multiple risk types in a process or business unit.
Key Risk Indicator Methodology

The successful identification and application of effective KRIs require a structured approach. We used a six-step process that incorporates various Six Sigma tools:
1. Identify existing metrics.
2. Assess gaps.
3. Improve metrics.
4. Validate and determine trigger levels.
5. Design dashboard.
6. Establish control plan.
This process could be applied to develop, validate, and implement KRIs across any business, but here we illustrate it with an example from a call-center operation. The risk in this example is that a customer is not handled professionally and the information given is not accurate.

1. Identify existing metrics. Developing key risk indicators often starts with a risk assessment. Risk events in a business are identified, assessed, and catalogued along with their associated controls and an analysis of their root causes. Quite sensibly, businesses sometimes then opt to focus their KRI development on the events with high inherent or residual risk. The first step in the KRI process is then to identify existing metrics for each high-risk potential event. Interviewing subject matter experts (SMEs) in the business typically uncovers five to 10 existing metrics as potential KRIs, and you should expect at least one or two of each type: coincident, causal, control, and volume.
Figure 1
Gap Assessment Template

Dept: Call-Center Operations
Risk Event: Customer contact not completed in an accurate or professional manner
Metric 1: Customer Satisfaction Index
Metric 2: Associate Attrition
Metric 3: Average Handling Time (AHT)
Metric 4: Transfer Rate

Q1. Frequency. Is the frequency of measurement adequate to flag a risk event prior to occurrence?
Low (1): Frequency is not clear, or is monthly or less frequent.
Medium (3): Frequency is clearly defined and at least weekly, but it is not clear whether it is sufficient to prevent the risk event.
High (5): Frequency is clearly defined, is at least daily (or the lowest required for the specific metric), and is low enough to identify and prevent potential risk events.

Q2. Trigger Levels. Do trigger levels exist, and if so, are they analytically sound?
Low (1): Trigger levels have not been identified.
Medium (3): Trigger levels have been identified but are not analytically sound.
High (5): Trigger levels exist and they are sound.

Q3. Escalation Criteria. Are there clear escalation criteria tied to the trigger levels?
Low (1): No clear escalation criteria.
Medium (3): Escalation criteria exist but with no clear owner or documentation.
High (5): Clear escalation criteria with a responsible owner and documentation.

Q4. Leading/Lagging. Is the metric a leading or a lagging indicator? Is the metric tied to the risk event occurrence?
Low (1): The metric is a lagging indicator, tied to the occurrence of the risk event itself.
Medium (3): The metric is tied to a control or root cause, but is not leading enough to prevent a risk event.
High (5): The metric is tied to one of the major root causes and has sufficient lead to prevent the risk event from occurring.

Q5. Ownership. Is there a clear owner for the creation and analysis of the metric?
Low (1): No clear owner; the metric is more ad hoc in nature.
Medium (3): Some ownership, but it changes from time to time or is not a clearly established job function.
High (5): Clear ownership for creation and analysis of the metric as part of an established job function.

Q6. Historical Data. Does historical data exist on the metric?
Low (1): New or recently created metric with no past data.
Medium (3): Past data is available but has not been tracked; it can be retrieved with some effort.
High (5): Historical data is available and has been tracked as a metric for a significant period of time.

Q7. Data Accuracy. How accurate and reliable is the data?
Low (1): The process/procedure for data collection is subjective in nature; measurement error is high (inadequate) or unknown.
Medium (3): A reliable data collection process is in place and is not subjective, but data reliability and accuracy cannot be ascertained (or are unknown).
High (5): Reliable, repeatable data collection procedures; measurement error is low (adequate) and well known.

Average Score (Q1 to Q7): Metric 1 = 2.6; Metric 2 = 3.1; Metric 3 = 4.9; Metric 4 = 2.6.

Each metric is evaluated along seven dimensions to identify gaps. Metrics that score higher are better candidates for serving as a key risk indicator. In this example, Metric 3 (Average Handling Time) has the highest overall rating against the dimensions.
2. Assess gaps. Once that inventory is complete, the next step is to evaluate the suitability and effectiveness of each of these existing metrics as a leading risk indicator. Two tools are used: the gap assessment and the design matrix.
The gap assessment tool has seven dimensions along which each metric is rated on a scale of 1 to 5: frequency of measurement, trigger levels, escalation criteria, leading/lagging, metric ownership, historical data availability, and data source accuracy. For each dimension, a clear distinction is made between what constitutes a weak, moderate, or strong rating. Evaluating metrics along these dimensions identifies whether the metrics in their current form would be effective as KRIs; typically, a composite score of 4.0 or higher is desired. An example of a KRI gap assessment for a call center is shown in Figure 1.
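The arithmetic behind the composite score is a simple average of the seven dimension ratings. The following is a minimal sketch in Python; the dimension names follow the gap assessment tool, while the ratings assigned to the example metric are hypothetical, not taken from Figure 1.

    # Composite gap score for one candidate metric (sketch).
    # Dimension names follow the gap assessment tool; ratings are hypothetical.
    DIMENSIONS = [
        "frequency of measurement",
        "trigger levels",
        "escalation criteria",
        "leading/lagging",
        "metric ownership",
        "historical data availability",
        "data source accuracy",
    ]

    def composite_score(ratings):
        """Average of the seven 1-to-5 dimension ratings."""
        if set(ratings) != set(DIMENSIONS):
            raise ValueError("rate the metric on all seven dimensions")
        return sum(ratings.values()) / len(DIMENSIONS)

    ratings = dict(zip(DIMENSIONS, [5, 4, 4, 5, 5, 5, 5]))  # hypothetical
    score = composite_score(ratings)
    print(f"composite score: {score:.1f}; viable KRI: {score >= 4.0}")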
The second tool, the design matrix, is a variant of the quality function deployment tool commonly used in Six Sigma exercises. The drivers of the risk event are listed on the left by row, and the existing metrics are listed along the top by column. Risk-event drivers are the root causes that allowed the risk event to occur, such as a data-entry error, incomplete communication, or an associate not following procedures. Each risk-event driver is given an importance weighting, which reflects the percent contribution of that driver to the likelihood that the risk event in question will occur. The relationship between each metric and each driver of the risk event is scored using 0-1-3-9 scoring criteria. The driver rating is a binary yes/no rating, with a "yes" only if the risk-event driver scores a 9 on at least one of the metrics. The metric rating is the weighted average of the scores the metric received for each driver. An example of a design matrix for existing metrics is shown in Figure 2. With these tools, we can assess where existing business metrics fall short in terms of suitability and effectiveness for use as key risk indicators. They point the way to where we need to improve existing metrics and find additional ones.
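To make the design-matrix arithmetic concrete, here is a minimal sketch using the driver weightings and 0-1-3-9 scores that appear in Figure 2; the variable names are ours, but the numbers come from the figure.

    # Design-matrix arithmetic, using the weightings and scores from Figure 2.
    weights = [0.10, 0.50, 0.10, 0.25, 0.05]  # importance of each risk-event driver

    # 0-1-3-9 relationship scores: one list of five driver scores per metric.
    metrics = {
        "Customer Satisfaction Index": [3, 3, 3, 3, 1],
        "Average Handle Time":         [9, 1, 3, 3, 1],
        "Associate Attrition":         [3, 3, 3, 3, 3],
        "Transfer Rate":               [3, 1, 1, 3, 9],
    }

    # Metric rating: importance-weighted average of the metric's driver scores.
    metric_ratings = {
        name: round(sum(w * s for w, s in zip(weights, scores)), 2)
        for name, scores in metrics.items()
    }

    # Driver rating: "Y" only if some metric scores a 9 against that driver.
    driver_ratings = [
        "Y" if any(scores[i] == 9 for scores in metrics.values()) else "N"
        for i in range(len(weights))
    ]

    print(metric_ratings)  # {'Customer Satisfaction Index': 2.9, ...}
    print(driver_ratings)  # ['Y', 'N', 'N', 'N', 'Y']

Run against Figure 2's inputs, this reproduces the metric ratings in the bottom row (2.90, 2.50, 3.00, 2.10) and the Y/N driver ratings in the right-hand column.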
Figure 2
Design Matrix Template

Dept: Call-Center Operations
Risk Event: Customer contact not completed in an accurate or professional manner
Date: 6/2/2003
Assessor Name: John Doe III
Scoring criteria: 0 = no relationship; 1 = weak relationship; 3 = moderate relationship; 9 = strong relationship.

Risk-Event Driver (Weighting) | Customer Satisfaction Index | Average Handle Time (AHT) | Associate Attrition | Transfer Rate | Driver Rating
Associate not knowledgeable on company policy/procedure (10%) | 3 | 9 | 3 | 3 | Y
Associate has poor communication skills (50%) | 3 | 1 | 3 | 1 | N
Associate misunderstands customer request (10%) | 3 | 3 | 3 | 1 | N
Associate cannot explain policies accurately to the customer (25%) | 3 | 3 | 3 | 3 | N
Customer call transferred to the wrong queue (5%) | 1 | 1 | 3 | 9 | Y
Metric Rating (100%) | 2.90 | 2.50 | 3.00 | 2.10 |

A design matrix is used to assess the relationship between each potential KRI and each risk-event driver. In this example, no existing metric aligned strongly to the primary risk-event driver (poor communication skills).

3. Improve metrics. First, we focus on metrics that scored a 9 in the design matrix (particularly along multiple risk-event drivers) but that have a low score in the gap assessment. When that low score is associated with something that can be fixed, such as insufficient frequency, inadequate trigger levels, or an absence of established escalation criteria, improvements along these dimensions can transform an existing metric into an effective KRI.
Next, we look at metrics that scored high on the gap assessment but do not score a 9 along any risk-event driver on the design matrix. We ask whether there are modifications to the metric that might strengthen the relationship with at least one of the risk-event drivers.
For example, payment-data-entry errors may be a key driver of the risk event "payments not processed in a timely or accurate manner." An existing metric that measures all data-entry errors across the department may have a weak relationship to payment errors; however, the driver rating can be increased by modifying the metric to capture only payment-keying errors.
Before moving into validation, the list is pared down to five or fewer KRIs. Each metric in the design matrix that has no strong relationship to any risk-event driver is removed. An example of a design matrix for the new metrics is shown in Figure 3.

Figure 3
Design Matrix Template

Dept: Call-Center Operations
Risk Event: Customer contact not completed in an accurate or professional manner
Date: 6/2/2003
Assessor Name: John Doe III
Scoring criteria: 0 = no relationship; 1 = weak relationship; 3 = moderate relationship; 9 = strong relationship.

Risk-Event Driver (Weighting) | Customer Satisfaction Index | Associate Attrition | Communication Score | Knowledge Score | Driver Rating
Associate not knowledgeable on company policy/procedure (10%) | 3 | 9 | 3 | 9 | Y
Associate has poor communication skills (50%) | 3 | 1 | 9 | 3 | Y
Associate misunderstands customer request (10%) | 3 | 3 | 9 | 9 | Y
Associate cannot explain policies accurately to the customer (25%) | 3 | 1 | 0 | 9 | Y
Customer call transferred to the wrong queue (5%) | 1 | 1 | 0 | 0 | N
Metric Rating (100%) | 2.90 | 2.00 | 5.70 | 5.55 |

The design matrix also is used to assess new KRIs that are created. In this example, two new metrics (Communication Score and Knowledge Score) were created that better align with the primary risk-event drivers and have a higher metric rating; that is, they provide better risk-event coverage.

4. Validation and trigger-level identification. The previous two steps used subjective judgment to assess the strength of the relationship between the risk-event drivers and the metrics. In many cases these correlations are self-evident. For example, a metric that measures the cycle time (or turnaround time) of an evaluation process is strongly correlated with the risk-event driver "evaluation is not completed in a timely manner." In such cases validation is not necessary, and trigger levels are set based on business and/or regulatory requirements. In all other cases, especially when new metrics are created as described in the previous section, the metrics should be validated to ensure that they are indeed predictive risk indicators.
Ideally, validation will involve a statistical analysis of historical data relating the risk event itself to the metric. However, in most cases historical data is not available, particularly on the risk event. In such cases, a risk-event driver can be used as a proxy. Since the business usually has a good understanding of the targets and trigger levels (or control limits) for the risk-event drivers, the correlation between the driver and the risk indicator lets us set corresponding targets and trigger levels for the KRI metrics. A good example is shown in Figure 4, where the risk-event driver, the customer satisfaction index (CSI), is plotted against the new communication score metric. The goals and trigger levels for the CSI are translated into trigger levels for the KRI.
Validation is not necessary for every KRI and risk-event driver. Ideally, each risk will have one or two major metrics that need to be validated to ensure that appropriate trigger levels are set to enable timely intervention.
Figure 4
Regression Plot: Customer Satisfaction (40 to 85) plotted against Communication Score (1 to 3), with the Target (2.6), Lower Spec Limit (2.2), and Trigger Limit (2.3) marked on the Communication Score axis.
A regression analysis is used to validate each KRI and to establish appropriate trigger levels. For this example, the Communication Score KRI correlates well with overall Customer Satisfaction.
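The translation illustrated in Figure 4 can be sketched as an ordinary least-squares fit, inverting the fitted line to convert a CSI trigger into a Communication Score trigger. The paired observations and the 65-point CSI trigger below are hypothetical stand-ins for illustration; only the figure's 2.3 trigger comes from the article.

    import numpy as np

    # Hypothetical (communication score, CSI) pairs from QA call monitoring.
    comm = np.array([1.6, 1.9, 2.1, 2.3, 2.5, 2.7, 2.9])
    csi  = np.array([48.0, 55.0, 60.0, 64.0, 70.0, 75.0, 81.0])

    # Fit CSI = slope * comm + intercept by least squares.
    slope, intercept = np.polyfit(comm, csi, 1)

    # Invert the fitted line to translate an assumed CSI trigger level
    # into an equivalent trigger for the Communication Score KRI.
    csi_trigger = 65.0
    kri_trigger = (csi_trigger - intercept) / slope
    print(f"escalate when the communication score falls below {kri_trigger:.2f}")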
5. Dashboard design. The next step is to design dashboard reports on these critical metrics for business managers, process owners, and senior management. A dashboard can be useful on a stand-alone basis or as part of another management process, such as a monthly business review. Dashboards typically use graphs and tables to give a concise and comprehensive risk picture, highlighting KRIs that are above control-plan trigger levels and reporting on the actions that have been taken as a result. An example of a KRI dashboard for our call-center example is shown in Figure 5.

Figure 5
KRI Dashboard: individual control charts for Team A plot seven weeks of the Communication Score (mean 2.404, UCL 2.967, LCL 1.841, trigger limit 2.3) and the Knowledge Score (mean 2.616, UCL 2.920, LCL 2.311, trigger limit 2.55).
A dashboard is used to display and monitor each key risk indicator. Individual control charts are used to monitor the Communication Score and Knowledge Score KRIs from the call-center example.
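The charts in Figure 5 appear to be individuals control charts. A sketch of the standard individuals-chart limit calculation follows; the weekly scores are hypothetical, and the 2.66 factor is the conventional I-chart constant (an assumption, since the article does not state how its limits were derived).

    import numpy as np

    # Hypothetical weekly team-level Communication Score observations.
    x = np.array([2.45, 2.62, 2.38, 2.51, 2.33, 2.47, 2.29])

    # Individuals (I-MR) chart: center line at the mean, control limits at
    # mean +/- 2.66 * average moving range.
    mr_bar = np.mean(np.abs(np.diff(x)))
    center = x.mean()
    ucl, lcl = center + 2.66 * mr_bar, center - 2.66 * mr_bar

    # Flag weeks below the control-plan trigger limit (2.3 for this KRI).
    trigger = 2.3
    breach_weeks = np.flatnonzero(x < trigger) + 1
    print(f"mean={center:.3f}, UCL={ucl:.3f}, LCL={lcl:.3f}, "
          f"weeks breaching trigger: {breach_weeks}")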
6. Control plan and escalation criteria. The purpose of the control plan is to ensure that clear escalation criteria and roles for intervention have been established for when a KRI is triggered. This documentation enables a process owner to follow an agreed, consistent protocol each time a KRI is triggered. In the event the process is transitioned to a different owner, the new owner can come up to speed on the procedures quickly and understand the level of risk the business is willing to accept in managing that risk.
The control plan can be a detailed one-page description of all the actions and accountabilities around a specific KRI; in that case, a separate page is needed for each metric. A simpler version could be a one-page description, in tabular form, of all the key risk indicators associated with a specific risk event. In both cases, the control plan should state the KRI metric, the measurement frequency, a description of the measurement system, goals, trigger levels, escalation criteria, and the owner of the escalation criteria. The control plan could be presented as an appendix to the dashboard to bring attention to the specific actions taken with respect to each KRI over the course of the reporting period. An example of a control plan and the associated escalation criteria for our call-center example is shown in Figure 6.
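As a sketch of how the control-plan fields listed above might be captured as a structured record, the following uses field names from the article's list and populates only the Communication Score values visible in Figure 6; the escalation test assumes lower scores are worse, as the figure's trigger limit implies.

    from dataclasses import dataclass

    @dataclass
    class ControlPlanEntry:
        metric_name: str
        description: str
        frequency: str
        metric_owner: str
        trigger_limit: float
        escalation_procedure: str = ""  # agreed action when the trigger is breached
        procedure_owner: str = ""
        last_update: str = ""

    comm = ControlPlanEntry(
        metric_name="Communication Score",
        description="Associate's communication skills on a QA-monitored call, "
                    "3-point scale",
        frequency="3 calls measured weekly per associate",
        metric_owner="QA Manager",
        trigger_limit=2.3,
    )

    def needs_escalation(entry: ControlPlanEntry, value: float) -> bool:
        """Lower scores are worse for this metric: breach = below trigger."""
        return value < entry.trigger_limit

    print(needs_escalation(comm, 2.25))  # True -> invoke escalation procedure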
Challenges in Instituting KRIs at an Enterprise Level

A firm-wide initiative for KRIs creates challenges in development, aggregation, and reporting. The potential applications for development of the measures are enormous, touching every business area in the organization. The approach can be either top-down or bottom-up. A top-down approach would look at overall objectives and risks and determine
Figure 6
Control Plan

Columns: Metric Name; Metric Description; Measurement; Metric Frequency; Metric Owner; Trigger Limit(s); Escalation Procedure; Procedure Owner; Last Update.

Metric Name: Communication Score
Metric Description: A measure of an associate's communication skills on a call, measured on a 3-point scale. 1 = customer impact; 2 = no customer impact; 3 = not heard on call.
Measurement/Frequency: 3 calls measured weekly per associate.
Metric Owner: QA Manager
Trigger Limit(s): 2.3
Escalation Procedure: Associates receiving

Metric Name: Knowledge Score
Metric Description: A measure of an associate's ability to explain policies and procedures to the customer, measured on a 3-point scale on a call monitored by QA. 1 = customer leaves the call without the issue or question appropriately addressed; 2 = customer gets resolution but has to ask multiple times; 3 = customer is serviced appropriately upon first request.