Data, Information and Knowledge Management Framework and the Data Management Body of Knowledge (DMBOK)

May 7, 2017 | Author: Alan McSweeney | Category: N/A


Structured and Comprehensive Approach to Data Management and the Data Management Body of Knowledge (DMBOK)

Alan McSweeney

Objectives

• To provide an overview of a structured approach to developing and implementing a detailed data management policy, including frameworks, standards, project, team and maturity

March 8, 2010

Agenda

• Introduction to Data Management
• State of Information and Data Governance
• Other Data Management Frameworks
• Data Management and Data Management Body of Knowledge (DMBOK)
• Conducting a Data Management Project
• Creating a Data Management Team
• Assessing Your Data Management Maturity

Preamble

• Every good presentation should start with quotations from The Prince and Dilbert

Management Wisdom

• There is nothing more difficult to take in hand, more perilous to conduct or more uncertain in its success than to take the lead in the introduction of a new order of things. − The Prince
• Never be in the same room as a decision. I'll illustrate my point with a puppet show that I call "Journey to Blameville" starring "Suggestion Sam" and "Manager Meg."
• You will often be asked to comment on things you don't understand. These handouts contain nonsense phrases that can be used in any situation so, let's dominate our industry with quality implementation of methodologies.
• Our executives have started their annual strategic planning sessions. This involves sitting in a room with inadequate data until an illusion of knowledge is attained. Then we'll reorganise, because that's all we know how to do. − Dilbert

Information

[Diagram: Information and IT Systems, with Applications, Processes, People and Infrastructure]

• Information in all its forms – input, processed, outputs – is a core component of any IT system
• Applications exist to process data supplied by users and other applications
• Data breathes life into applications
• Data is stored and managed by infrastructure – hardware and software
• Data is a key organisation asset with a substantial value
• Significant responsibilities are imposed on organisations in managing data

Data, Information and Knowledge

• Data is the representation of facts as text, numbers, graphics, images, sound or video
• Data is the raw material used to create information
• Facts are captured, stored, and expressed as data
• Information is data in context
• Without context, data is meaningless - we create meaningful information by interpreting the context around data
• Knowledge is information in perspective, integrated into a viewpoint based on the recognition and interpretation of patterns, such as trends, formed with other information and experience
• Knowledge is about understanding the significance of information
• Knowledge enables effective action

Data, Information, Knowledge and Action

[Diagram: Data, Information, Knowledge and Action]

Information is an Organisation Asset

• Tangible organisation assets are seen as having a value and are managed and controlled using inventory and asset management systems and procedures
• Data, because it is less tangible, is less widely perceived as a real asset, assigned a real value and managed as if it had a value
• High quality, accurate and available information is a prerequisite to effective operation of any organisation

Data Management and Project Success

• Data is fundamental to the effective and efficient operation of any solution
  − Right data
  − Right time
  − Right tools and facilities
• Without data the solution has no purpose
• Data is too often overlooked in projects
• Project managers frequently do not appreciate the complexity of data issues

Generalised Information Management Lifecycle

• Lifecycle stages, each subject to Manage, Control and Administer:
  − Enter, Create, Acquire, Derive, Update, Capture
  − Store, Manage, Replicate and Distribute
  − Protect and Recover
  − Archive and Recall
  − Delete/Remove
• Design, define and implement framework to manage information through this lifecycle
• Generalised lifecycle that differs for specific information types

Expanded Generalised Information Management Lifecycle

• Lifecycle stages, each subject to Design, Implement, Manage, Control and Administer:
  − Plan, Design and Specify
  − Implement Underlying Infrastructure
  − Enter, Create, Acquire, Derive, Update, Capture
  − Store, Manage, Replicate and Distribute
  − Protect and Recover
  − Archive and Recall
  − Delete/Remove
• Include phases for information management lifecycle design and implementation of appropriate hardware and software to actualise lifecycle
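The expanded lifecycle can be sketched as an ordered enumeration. This is only an illustration: the slide does not prescribe an implementation, the short stage names are invented here, and treating the stages as a strict linear sequence is an assumption (in practice information may, for example, be recalled from archive back into use).

```python
from enum import IntEnum
from typing import Optional


class Stage(IntEnum):
    """Stages of the expanded lifecycle, in slide order (names are hypothetical)."""
    PLAN_DESIGN_SPECIFY = 1
    IMPLEMENT_INFRASTRUCTURE = 2
    CAPTURE = 3   # Enter, Create, Acquire, Derive, Update, Capture
    STORE = 4     # Store, Manage, Replicate and Distribute
    PROTECT = 5   # Protect and Recover
    ARCHIVE = 6   # Archive and Recall
    DELETE = 7    # Delete/Remove


def next_stage(stage: Stage) -> Optional[Stage]:
    """Return the following stage, or None after Delete/Remove.

    Assumption: stages are visited strictly in order.
    """
    if stage is Stage.DELETE:
        return None
    return Stage(stage + 1)


def is_design_phase(stage: Stage) -> bool:
    """True for the two phases added by the expanded lifecycle."""
    return stage <= Stage.IMPLEMENT_INFRASTRUCTURE
```

A framework for a specific information type would likely replace `next_stage` with its own transition rules.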

Data and Information Management

• Data and information management is a business process consisting of the planning and execution of policies, practices, and projects that acquire, control, protect, deliver, and enhance the value of data and information assets

Data and Information Management

• To manage and utilise information as a strategic asset
• To implement processes, policies, infrastructure and solutions to govern, protect, maintain and use information
• To make relevant and correct information available in all business processes and IT systems for the right people in the right context at the right time with the appropriate security and with the right quality
• To exploit information in business decisions, processes and relations

Data Management Goals

• Primary goals
  − To understand the information needs of the enterprise and all its stakeholders
  − To capture, store, protect, and ensure the integrity of data assets
  − To continually improve the quality of data and information, including accuracy, integrity, integration, relevance and usefulness of data
  − To ensure privacy and confidentiality, and to prevent unauthorised or inappropriate use of data and information
  − To maximise the effective use and value of data and information assets

Data Management Goals

• Secondary goals
  − To control the cost of data management
  − To promote a wider and deeper understanding of the value of data assets
  − To manage information consistently across the enterprise
  − To align data management efforts and technology with business needs

Triggers for Data Management Initiative

• When an enterprise is about to undertake architectural transformation, data management issues need to be understood and addressed
• A structured and comprehensive approach to data management enables data to be used effectively for competitive advantage

Data Management Principles

• Data and information are valuable enterprise assets
• Manage data and information carefully, like any other asset, by ensuring adequate quality, security, integrity, protection, availability, understanding and effective use
• Share responsibility for data management between business data owners and IT data management professionals
• Data management is a business function and a set of related disciplines

Organisation Data Management Function

• Business function of planning for, controlling and delivering data and information assets
• Development, execution, and supervision of plans, policies, programs, projects, processes, practices and procedures that control, protect, deliver, and enhance the value of data and information assets
• Scope of the data management function and the scale of its implementation vary widely with the size, means, and experience of organisations
• Role of data management remains the same across organisations even though implementation differs widely

Scope of Complete Data Management Function

• Data Management
  − Data Governance
  − Data Architecture Management
  − Data Development
  − Data Operations Management
  − Data Security Management
  − Data Quality Management
  − Reference and Master Data Management
  − Data Warehousing and Business Intelligence Management
  − Document and Content Management
  − Metadata Management

Shared Role Between Business and IT

• Data management is a shared responsibility between data management professionals within IT and the business data owners representing the interests of data producers and information consumers
• Business data ownership is concerned with accountability for business responsibilities in data management
• Business data owners are data subject matter experts
• They represent the data interests of the business and take responsibility for the quality and use of data

Why Develop and Implement a Data Management Framework?

• Improve organisation data management efficiency
• Deliver better service to business
• Improve cost-effectiveness of data management
• Match the requirements of the business to the management of the data
• Embed handling of compliance and regulatory rules into data management framework
• Achieve consistency in data management across systems and applications
• Enable growth and change more easily
• Reduce data management and administration effort and cost
• Assist in the selection and implementation of appropriate data management solutions
• Implement a technology-independent data architecture

Data Management Issues


Data Management Issues

• Discovery - cannot find the right information
• Integration - cannot manipulate and combine information
• Insight - cannot extract value and knowledge from information
• Dissemination - cannot consume information
• Management - cannot manage and control information volumes and growth

Data Management Problems – User View

• Managing Storage Equipment
• Application Recoveries / Backup Retention
• Vendor Management
• Power Management
• Regulatory Compliance
• Lack of Integrated Tools
• Dealing with Performance Problems
• Data Mobility
• Archiving and Archive Management
• Storage Provisioning
• Managing Complexity
• Managing Costs
• Backup Administration and Management
• Proper Capacity Forecasting and Storage Reporting
• Managing Storage Growth

Information Management Challenges

• Explosive Data Growth
  − Value and volume of data is overwhelming
  − More data is seen as critical
  − Annual growth rate of 50+%
• Compliance Requirements
  − Compliance with stringent regulatory requirements and audit procedures
• Fragmented Storage Environment
  − Lack of enterprise-wide hardware and software data storage strategy and discipline
• Budgets
  − Frozen or being cut
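To make the quoted annual growth rate concrete, the sketch below applies simple compound growth. The starting volume and function names are hypothetical, and constant year-on-year growth is an assumption; the slide gives only the rate.

```python
import math


def projected_volume(initial_tb: float, annual_growth: float, years: int) -> float:
    """Volume after `years` of compound growth (annual_growth=0.5 means 50%)."""
    return initial_tb * (1.0 + annual_growth) ** years


def doubling_time_years(annual_growth: float) -> float:
    """Years needed for the volume to double at a constant growth rate."""
    return math.log(2.0) / math.log(1.0 + annual_growth)


# Hypothetical starting point: 100 TB growing at 50% a year.
after_two_years = projected_volume(100.0, 0.5, 2)   # 100 * 1.5 * 1.5
years_to_double = doubling_time_years(0.5)
```

At 50% a year the volume more than doubles within two years, which is why frozen budgets and fragmented storage compound the problem.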

Data Quality

• Poor data quality costs real money
• Process efficiency is negatively impacted by poor data quality
• Full potential benefits of new systems are not realised because of poor data quality
• Decision making is negatively affected by poor data quality

State of Information and Data Governance

• Information and Data Governance Report, April 2008
  − International Association for Information and Data Quality (IAIDQ)
  − University of Arkansas at Little Rock, Information Quality Program (UALR-IQ)

Your Organisation Recognises and Values Information as a Strategic Asset and Manages it Accordingly

• Strongly Disagree: 3.4%
• Disagree: 21.5%
• Neutral: 17.1%
• Agree: 39.5%
• Strongly Agree: 18.5%

Direction of Change in the Results and Effectiveness of the Organisation's Formal or Informal Information/Data Governance Processes Over the Past Two Years

• Results and Effectiveness Have Significantly Improved: 8.8%
• Results and Effectiveness Have Improved: 50.0%
• Results and Effectiveness Have Remained Essentially the Same: 31.9%
• Results and Effectiveness Have Worsened: 3.9%
• Results and Effectiveness Have Significantly Worsened: 0.0%
• Don’t Know: 5.4%

Perceived Effectiveness of the Organisation's Current Formal or Informal Information/Data Governance Processes

• Excellent (All Goals are Met): 2.5%
• Good (Most Goals are Met): 21.1%
• OK (Some Goals are Met): 51.5%
• Poor (Few Goals are Met): 19.1%
• Very Poor (No Goals are Met): 3.9%
• Don’t Know: 2.0%

Actual Information/Data Governance Effectiveness vs. Organisation's Perception

• It is Better Than Most People Think: 20.1%
• It is the Same as Most People Think: 32.4%
• It is Worse Than Most People Think: 35.8%
• Don’t Know: 11.8%

Current Status of Organisation's Information/Data Governance Initiatives

• Started an Information/Data Governance Initiative, but Discontinued the Effort: 1.5%
• Considered a Focused Information/Data Governance Effort but Abandoned the Idea: 0.5%
• None Being Considered - Keeping the Status Quo: 7.4%
• Exploring, Still Seeking to Learn More: 20.1%
• Evaluating Alternative Frameworks and Information Governance Structures: 23.0%
• Now Planning an Implementation: 13.2%
• First Iteration Implemented the Past 2 Years: 19.1%
• First Iteration in Place for More Than 2 Years: 8.8%
• Don’t Know: 6.4%

Expected Changes in Organisation's Information/Data Governance Efforts Over the Next Two Years

• Will Increase Significantly: 46.6%
• Will Increase Somewhat: 39.2%
• Will Remain the Same: 10.8%
• Will Decrease Somewhat: 1.0%
• Will Decrease Significantly: 0.5%
• Don’t Know: 2.0%

Overall Objectives of Information / Data Governance Efforts

• Improve Data Quality: 80.2%
• Establish Clear Decision Rules and Decision-making Processes for Shared Data: 65.6%
• Increase the Value of Data Assets: 59.4%
• Provide Mechanism to Resolve Data Issues: 56.8%
• Involve Non-IT Personnel in Data Decisions IT Should not Make by Itself: 55.7%
• Promote Interdependencies and Synergies Between Departments or Business Units: 49.6%
• Enable Joint Accountability for Shared Data: 45.3%
• Involve IT in Data Decisions non-IT Personnel Should not Make by Themselves: 35.4%
• Other: 5.2%
• None Applicable: 1.0%
• Don't Know: 2.6%

Change In Organisation's Information / Data Quality Over the Past Two Years

• Information / Data Quality Has Significantly Improved: 10.5%
• Information / Data Quality Has Improved: 68.4%
• Information / Data Quality Has Remained Essentially the Same: 15.8%
• Information / Data Quality Has Worsened: 3.5%
• Information / Data Quality Has Significantly Worsened: 0.0%
• Don’t Know: 1.8%

Maturity Of Information / Data Governance Goal Setting And Measurement In Your Organisation

• 5 - Optimised: 3.7%
• 4 - Managed: 11.8%
• 3 - Defined: 26.7%
• 2 - Repeatable: 28.9%
• 1 - Ad-hoc: 28.9%

Maturity Of Information / Data Governance Processes And Policies In Your Organisation

• 5 - Optimised: 1.6%
• 4 - Managed: 4.8%
• 3 - Defined: 24.5%
• 2 - Repeatable: 46.3%
• 1 - Ad-hoc: 22.9%

Maturity Of Responsibility And Accountability For Information / Data Governance Among Employees In Your Organisation

• 5 - Optimised: 6.9%
• 4 - Managed: 3.2%
• 3 - Defined: 31.7%
• 2 - Repeatable: 25.4%
• 1 - Ad-hoc: 32.8%
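One way to compare the three maturity charts is to reduce each distribution to a weighted mean level. The sketch below does this with the percentages shown on the slides. The dictionary keys and function names are invented here, and averaging an ordinal 1 to 5 scale is a simplification, not something the survey report is known to do.

```python
# Percentages per maturity level (5 = Optimised ... 1 = Ad-hoc), from the slides.
MATURITY_SURVEY = {
    "goal_setting_and_measurement": {5: 3.7, 4: 11.8, 3: 26.7, 2: 28.9, 1: 28.9},
    "processes_and_policies": {5: 1.6, 4: 4.8, 3: 24.5, 2: 46.3, 1: 22.9},
    "responsibility_and_accountability": {5: 6.9, 4: 3.2, 3: 31.7, 2: 25.4, 1: 32.8},
}


def mean_level(distribution: dict) -> float:
    """Weighted mean maturity level; divides by the actual total so that
    rounding in the published percentages does not skew the result."""
    total = sum(distribution.values())
    return sum(level * pct for level, pct in distribution.items()) / total


def share_at_or_below(distribution: dict, level: int) -> float:
    """Fraction of respondents at the given maturity level or lower."""
    total = sum(distribution.values())
    return sum(pct for lvl, pct in distribution.items() if lvl <= level) / total


summary = {area: round(mean_level(dist), 2) for area, dist in MATURITY_SURVEY.items()}
```

On these figures all three areas average between levels 2 and 3, with well over half of respondents at Repeatable or Ad-hoc.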

Other Data Management Frameworks


Other Data Management-Related Frameworks

• TOGAF (and other enterprise architecture standards) define a process for arriving at an enterprise architecture definition, including data
• TOGAF has a phase relating to data architecture
• TOGAF deals with high level
• DMBOK translates high level into specific details
• COBIT is concerned with IT governance and controls:
  − IT must implement internal controls around how it operates
  − The systems IT delivers to the business and the underlying business processes these systems actualise must be controlled – these are controls external to IT
  − To govern IT effectively, COBIT defines the activities and risks within IT that need to be managed
• COBIT has a process relating to data management
• Neither TOGAF nor COBIT is concerned with detailed data management design and implementation

DMBOK, TOGAF and COBIT

• TOGAF Defines the Process for Creating a Data Architecture as Part of an Overall Enterprise Architecture
• Can be a Precursor to Implementing Data Management
• DMBOK Is a Specific and Comprehensive Data Oriented Framework
• DMBOK Provides Detail for Definition, Implementation and Operation of Data Management and Utilisation
• COBIT Provides Data Governance as Part of Overall IT Governance
• Can Provide a Maturity Model for Assessing Data Management

DMBOK, TOGAF and COBIT – Scope and Overlap

[Diagram: overlapping scopes of DMBOK, TOGAF and COBIT]

• DMBOK: Data Development, Data Operations Management, Reference and Master Data Management, Data Warehousing and Business Intelligence Management, Document and Content Management, Metadata Management, Data Quality Management
• TOGAF: Data Architecture Management, Data Management, Data Migration
• COBIT: Data Governance, Data Security Management

TOGAF and Data Management

• Phase C1 (subset of Phase C) relates to defining a data architecture
• TOGAF phases, centred on Requirements Management:
  − Phase A: Architecture Vision
  − Phase B: Business Architecture
  − Phase C: Information Systems Architecture
    • Phase C1: Data Architecture
    • Phase C2: Solutions and Application Architecture
  − Phase D: Technology Architecture
  − Phase E: Opportunities and Solutions
  − Phase F: Migration Planning
  − Phase G: Implementation Governance
  − Phase H: Architecture Change Management

TOGAF Phase C1: Information Systems Architectures - Data Architecture - Objectives

• Purpose is to define the major types and sources of data necessary to support the business, in a way that is:
  − Understandable by stakeholders
  − Complete and consistent
  − Stable
• Define the data entities relevant to the enterprise
• Not concerned with design of logical or physical storage systems or databases

TOGAF Phase C1: Information Systems Architectures - Data Architecture - Overview

• Approach Elements
  − Key Considerations for Data Architecture
  − Architecture Repository
• Inputs
  − Reference Materials External to the Enterprise
  − Non-Architectural Inputs
  − Architectural Inputs
• Steps
  − Select Reference Models, Viewpoints, and Tools
  − Develop Baseline Data Architecture Description
  − Develop Target Data Architecture Description
  − Perform Gap Analysis
  − Define Roadmap Components
  − Resolve Impacts Across the Architecture Landscape
  − Conduct Formal Stakeholder Review
  − Finalise the Data Architecture
  − Create Architecture Definition Document
• Outputs

TOGAF Phase C1: Information Systems Architectures - Data Architecture - Approach - Key Considerations for Data Architecture

• Data Management
  − Important to understand and address data management issues
  − Structured and comprehensive approach to data management enables the effective use of data to capitalise on its competitive advantages
  − Clear definition of which application components in the landscape will serve as the system of record or reference for enterprise master data
  − Will there be an enterprise-wide standard that all application components, including software packages, need to adopt?
  − Understand how data entities are utilised by business functions, processes, and services
  − Understand how and where enterprise data entities are created, stored, transported, and reported
  − Level and complexity of data transformations required to support the information exchange needs between applications
  − Requirement for software in supporting data integration with external organisations

TOGAF Phase C1: Information Systems Architectures - Data Architecture - Approach - Key Considerations for Data Architecture

• Data Migration
  − Identify data migration requirements and also provide indicators as to the level of transformation for new/changed applications
  − Ensure target application has quality data when it is populated
  − Ensure enterprise-wide common data definition is established to support the transformation

TOGAF Phase C1: Information Systems Architectures - Data Architecture - Approach - Key Considerations for Data Architecture

• Data Governance
  − Ensures that the organisation has the necessary dimensions in place to enable the data transformation
  − Structure - ensures the organisation has the necessary structure and the standards bodies to manage data entity aspects of the transformation
  − Management System - ensures the organisation has the necessary management system and data-related programs to manage the governance aspects of data entities throughout their lifecycle
  − People - addresses what data-related skills and roles the organisation requires for the transformation

TOGAF Phase C1: Information Systems Architectures - Data Architecture - Outputs

• Refined and updated versions of the Architecture Vision phase deliverables
  − Statement of Architecture Work
  − Validated data principles, business goals, and business drivers
• Draft Architecture Definition Document
  − Baseline Data Architecture
  − Target Data Architecture
    • Business data model
    • Logical data model
    • Data management process models
    • Data Entity/Business Function matrix
    • Views corresponding to the selected viewpoints addressing key stakeholder concerns
  − Draft Architecture Requirements Specification
    • Gap analysis results
    • Data interoperability requirements
    • Relevant technical requirements
    • Constraints on the Technology Architecture about to be designed
    • Updated business requirements
    • Updated application requirements
  − Data Architecture components of an Architecture Roadmap

COBIT Structure

• Plan and Organise (PO)
  − PO1 Define a strategic IT plan
  − PO2 Define the information architecture
  − PO3 Determine technological direction
  − PO4 Define the IT processes, organisation and relationships
  − PO5 Manage the IT investment
  − PO6 Communicate management aims and direction
  − PO7 Manage IT human resources
  − PO8 Manage quality
  − PO9 Assess and manage IT risks
  − PO10 Manage projects
• Acquire and Implement (AI)
  − AI1 Identify automated solutions
  − AI2 Acquire and maintain application software
  − AI3 Acquire and maintain technology infrastructure
  − AI4 Enable operation and use
  − AI5 Procure IT resources
  − AI6 Manage changes
  − AI7 Install and accredit solutions and changes
• Deliver and Support (DS)
  − DS1 Define and manage service levels
  − DS2 Manage third-party services
  − DS3 Manage performance and capacity
  − DS4 Ensure continuous service
  − DS5 Ensure systems security
  − DS6 Identify and allocate costs
  − DS7 Educate and train users
  − DS8 Manage service desk and incidents
  − DS9 Manage the configuration
  − DS10 Manage problems
  − DS11 Manage data
  − DS12 Manage the physical environment
  − DS13 Manage operations
• Monitor and Evaluate (ME)
  − ME1 Monitor and evaluate IT performance
  − ME2 Monitor and evaluate internal control
  − ME3 Ensure regulatory compliance
  − ME4 Provide IT governance

COBIT and Data Management

• COBIT objective DS11 Manage Data within the Deliver and Support (DS) domain
• Effective data management requires identification of data requirements
• Data management process includes establishing effective procedures to manage the media library, backup and recovery of data and proper disposal of media
• Effective data management helps ensure the quality, timeliness and availability of business data

COBIT and Data Management

• Objective is the control over the IT process of managing data that meets the business requirement for IT of optimising the use of information and ensuring information is available as required
• Focuses on maintaining the completeness, accuracy, availability and protection of data
• Involves taking actions
  − Backing up data and testing restoration
  − Managing onsite and offsite storage of data
  − Securely disposing of data and equipment
• Measured by
  − User satisfaction with availability of data
  − Percent of successful data restorations
  − Number of incidents where sensitive data were retrieved after media were disposed of

COBIT Process DS11 Manage Data

• DS11.1 Business Requirements for Data Management
  − Establish arrangements to ensure that source documents expected from the business are received, all data received from the business are processed, all output required by the business is prepared and delivered, and restart and reprocessing needs are supported
• DS11.2 Storage and Retention Arrangements
  − Define and implement procedures for data storage and archival, so data remain accessible and usable
  − Procedures should consider retrieval requirements, cost-effectiveness, continued integrity and security requirements
  − Establish storage and retention arrangements to satisfy legal, regulatory and business requirements for documents, data, archives, programmes, reports and messages (incoming and outgoing) as well as the data (keys, certificates) used for their encryption and authentication
• DS11.3 Media Library Management System
  − Define and implement procedures to maintain an inventory of onsite media and ensure their usability and integrity
  − Procedures should provide for timely review and follow-up on any discrepancies noted
• DS11.4 Disposal
  − Define and implement procedures to prevent access to sensitive data and software from equipment or media when they are disposed of or transferred to another use
  − Procedures should ensure that data marked as deleted or to be disposed cannot be retrieved
• DS11.5 Backup and Restoration
  − Define and implement procedures for backup and restoration of systems, data and documentation in line with business requirements and the continuity plan
  − Verify compliance with the backup procedures, and verify the ability to and time required for successful and complete restoration
  − Test backup media and the restoration process
• DS11.6 Security Requirements for Data Management
  − Establish arrangements to identify and apply security requirements applicable to the receipt, processing, physical storage and output of data and sensitive messages
  − Includes physical records, data transmissions and any data stored offsite

COBIT Data Management Goals and Metrics

[Diagram: each set of goals drives the next and is measured by the indicators listed with it]

• Activity Goals
  − Backing up data and testing restoration
  − Managing onsite and offsite storage of data
  − Securely disposing of data and equipment
• Activity Goals Are Measured By Key Performance Indicators
  − Frequency of testing of backup media
  − Average time for data restoration
• Process Goals
  − Maintain the completeness, accuracy, validity and accessibility of stored data
  − Secure data during disposal of media
  − Effectively manage storage media
• Process Goals Are Measured By Process Key Goal Indicators
  − % of successful data restorations
  − # of incidents where sensitive data were retrieved after media were disposed of
  − # of down time or data integrity incidents caused by insufficient storage capacity
• IT Key Goal Indicators
  − Occurrences of inability to recover data critical to business process
  − User satisfaction with availability of data
  − Incidents of noncompliance with laws due to storage management issues
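Two of the indicators above, the percentage of successful data restorations and the average time for data restoration, are simple to compute from a log of restore tests. The sketch below shows one way to do so. The `RestoreTest` record and its fields are hypothetical, and COBIT does not define a data format for these metrics.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RestoreTest:
    """One restoration test (hypothetical record layout)."""
    succeeded: bool
    minutes: float  # elapsed time of the restore attempt


def percent_successful(tests: List[RestoreTest]) -> float:
    """Process key goal indicator: % of successful data restorations."""
    if not tests:
        raise ValueError("no restore tests recorded")
    return 100.0 * sum(1 for t in tests if t.succeeded) / len(tests)


def average_restore_minutes(tests: List[RestoreTest]) -> float:
    """Key performance indicator: average time for data restoration.

    Assumption: only successful restores count towards the average.
    """
    durations = [t.minutes for t in tests if t.succeeded]
    if not durations:
        raise ValueError("no successful restores recorded")
    return sum(durations) / len(durations)


# Illustrative data only.
quarter = [
    RestoreTest(True, 30.0),
    RestoreTest(True, 50.0),
    RestoreTest(False, 0.0),
    RestoreTest(True, 40.0),
]
```

For the illustrative `quarter` log, three of four restores succeed and the successful ones average 40 minutes.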

Data Management Body of Knowledge (DMBOK)

Data Management Body of Knowledge (DMBOK)

• DMBOK is a generalised and comprehensive framework for managing data across the entire lifecycle
• Developed by DAMA (Data Management Association)
• DMBOK provides a detailed framework to assist development and implementation of data management processes and procedures and to ensure all requirements are addressed
• Enables effective and appropriate data management across the organisation
• Provides awareness and visibility of data management issues and requirements

Data Management Body of Knowledge (DMBOK)

• Not a solution to your data management needs
• Framework and methodology for developing and implementing an appropriate solution
• Generalised framework to be customised to meet specific needs
• Provides a work breakdown structure for a data management project to allow the effort to be assessed
• No magic bullet

Scope and Structure of Data Management Body of Knowledge (DMBOK)

• Data Management Environmental Elements
• Data Management Functions

DMBOK Data Management Functions

• Data Management Functions
  − Data Governance
  − Data Architecture Management
  − Data Development
  − Data Operations Management
  − Data Security Management
  − Data Quality Management
  − Reference and Master Data Management
  − Data Warehousing and Business Intelligence Management
  − Document and Content Management
  − Metadata Management

DMBOK Data Management Functions

• Data Governance - planning, supervision and control over data management and use
• Data Architecture Management - defining the blueprint for managing data assets
• Data Development - analysis, design, implementation, testing, deployment, maintenance
• Data Operations Management - providing support from data acquisition to purging
• Data Security Management - ensuring privacy, confidentiality and appropriate access
• Data Quality Management - defining, monitoring and improving data quality
• Reference and Master Data Management - managing master versions and replicas
• Data Warehousing and Business Intelligence Management - enabling reporting and analysis
• Document and Content Management - managing data found outside of databases
• Metadata Management - integrating, controlling and providing metadata

DMBOK Data Management Environmental Elements

• Data Management Environmental Elements
  − Goals and Principles
  − Activities
  − Primary Deliverables
  − Roles and Responsibilities
  − Practices and Techniques
  − Technology
  − Organisation and Culture

DMBOK Data Management Environmental Elements

• Goals and Principles - directional business goals of each function and the fundamental principles that guide performance of each function
• Activities - each function is composed of lower level activities, sub-activities, tasks and steps
• Primary Deliverables - information and physical databases and documents created as interim and final outputs of each function. Some deliverables are essential, some are generally recommended, and others are optional depending on circumstances
• Roles and Responsibilities - business and IT roles involved in performing and supervising the function, and the specific responsibilities of each role in that function. Many roles will participate in multiple functions
• Practices and Techniques - common and popular methods and procedures used to perform the processes and produce the deliverables. May also include common conventions, best practice recommendations, and alternative approaches without elaboration
• Technology - categories of supporting technology such as software tools, standards and protocols, product selection criteria and learning curves
• Organisation and Culture - this can include issues such as management metrics, critical success factors, reporting structures, budgeting, resource allocation issues, expectations and attitudes, style, culture, and approach to change management

DMBOK Data Management Functions and Environmental Elements

[Matrix: data management functions as rows, environmental elements as columns; the cells give the Scope of Each Data Management Function]

• Rows: Data Governance, Data Architecture Management, Data Development, Data Operations Management, Data Security Management, Data Quality Management, Reference and Master Data Management, Data Warehousing and Business Intelligence Management, Document and Content Management, Metadata Management
• Columns: Goals and Principles, Activities, Primary Deliverables, Roles and Responsibilities, Practices and Techniques, Technology, Organisation and Culture

Scope of Data Management Book of Knowledge (DMBOK) Data Management Framework

• Hierarchy
  − Function
    • Activity
      − Sub-Activity (not in all cases)
• Each activity is classified as one (or more) of:
  − Planning Activities (P)
    • Activities that set the strategic and tactical course for other data management activities
    • May be performed on a recurring basis
  − Development Activities (D)
    • Activities undertaken within implementation projects and recognised as part of the systems development lifecycle (SDLC), creating data deliverables through analysis, design, building, testing, preparation, and deployment
  − Control Activities (C)
    • Supervisory activities performed on an on-going basis
  − Operational Activities (O)
    • Service and support activities performed on an on-going basis
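The hierarchy and the P/D/C/O classification above can be sketched as a small data structure. This is a minimal illustration in Python, not something defined by the DMBOK: the class names, the method name and the group tags given to the sample activities are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from enum import Flag, auto


class ActivityGroup(Flag):
    """The four activity groups; a Flag lets one activity carry several."""
    PLANNING = auto()      # P
    DEVELOPMENT = auto()   # D
    CONTROL = auto()       # C
    OPERATIONAL = auto()   # O


@dataclass
class Activity:
    name: str
    groups: ActivityGroup
    # Sub-activities are optional: not every activity has them.
    sub_activities: list["Activity"] = field(default_factory=list)


@dataclass
class Function:
    name: str
    activities: list[Activity] = field(default_factory=list)

    def names_in_group(self, group: ActivityGroup) -> list[str]:
        """Return names of all activities and sub-activities tagged with `group`."""
        found: list[str] = []

        def visit(activity: Activity) -> None:
            if group in activity.groups:
                found.append(activity.name)
            for sub in activity.sub_activities:
                visit(sub)

        for activity in self.activities:
            visit(activity)
        return found


# Sample data: the group tags below are illustrative assumptions.
governance = Function(
    "Data Governance",
    [
        Activity(
            "Data Management Planning",
            ActivityGroup.PLANNING,
            [Activity("Develop and Maintain the Data Strategy", ActivityGroup.PLANNING)],
        ),
        Activity(
            "Data Management Control",
            ActivityGroup.CONTROL,
            [Activity("Manage and Resolve Data Related Issues",
                      ActivityGroup.CONTROL | ActivityGroup.OPERATIONAL)],
        ),
    ],
)

print(governance.names_in_group(ActivityGroup.PLANNING))
```

Using a flag type rather than a single value reflects the "one (or more)" wording: an activity can be tagged with several groups and still be found by a query on any one of them.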

Activity Groups Within Functions

[Diagram: Planning Activities, Control Activities, Development Activities, Operational Activities]

• Activity groups are classifications of data management activities
• Use the activity groupings to define the scope of data management subprojects and identify the appropriate tasks:
  − Analysis and design
  − Implementation
  − Operational improvement
  − Management and administration

DMBOK Function and Activity Structure

• Data Governance
  − Data Management Planning
  − Data Management Control
• Data Architecture Management
  − Understand Enterprise Information Needs
  − Develop and Maintain the Enterprise Data Model
  − Analyse and Align With Other Business Models
  − Define and Maintain the Database Architecture
  − Define and Maintain the Data Integration Architecture
  − Define and Maintain the DW / BI Architecture
  − Define and Maintain Enterprise Taxonomies and Namespaces
  − Define and Maintain the Metadata Architecture
• Data Development
  − Data Modeling, Analysis, and Solution Design
  − Detailed Data Design
  − Data Model and Design Quality Management
  − Data Implementation
• Data Operations Management
  − Database Support
  − Data Technology Management
• Data Security Management
  − Understand Data Security Needs and Regulatory Requirements
  − Define Data Security Policy
  − Define Data Security Standards
  − Define Data Security Controls and Procedures
  − Manage Users, Passwords, and Group Membership
  − Manage Data Access Views and Permissions
  − Monitor User Authentication and Access Behaviour
  − Classify Information Confidentiality
  − Audit Data Security
• Data Quality Management
  − Develop and Promote Data Quality Awareness
  − Define Data Quality Requirement
  − Profile, Analyse, and Assess Data Quality
  − Define Data Quality Metrics
  − Define Data Quality Business Rules
  − Test and Validate Data Quality Requirements
  − Set and Evaluate Data Quality Service Levels
  − Continuously Measure and Monitor Data Quality
  − Manage Data Quality Issues
  − Clean and Correct Data Quality Defects
  − Design and Implement Operational DQM Procedures
  − Monitor Operational DQM Procedures and Performance
• Reference and Master Data Management
  − Understand Reference and Master Data Integration Needs
  − Identify Master and Reference Data Sources and Contributors
  − Define and Maintain the Data Integration Architecture
  − Implement Reference and Master Data Management Solutions
  − Define and Maintain Match Rules
  − Establish “Golden” Records
  − Define and Maintain Hierarchies and Affiliations
  − Plan and Implement Integration of New Data Sources
  − Replicate and Distribute Reference and Master Data
  − Manage Changes to Reference and Master Data
• DW and BI Management
  − Understand Business Intelligence Information Needs
  − Define and Maintain the DW / BI Architecture
  − Implement Data Warehouses and Data Marts
  − Implement BI Tools and User Interfaces
  − Process Data for Business Intelligence
  − Monitor and Tune Data Warehousing Processes
  − Monitor and Tune BI Activity and Performance
• Document and Content Management
  − Documents / Records Management
  − Content Management
• Metadata Management
  − Understand Metadata Requirements
  − Define the Metadata Architecture
  − Develop and Maintain Metadata Standards
  − Implement a Managed Metadata Environment
  − Create and Maintain Metadata
  − Integrate Metadata
  − Manage Metadata Repositories
  − Distribute and Deliver Metadata
  − Query, Report, and Analyse Metadata

DMBOK Function and Activity - Planning Activities

[Chart: the function and activity chart from "DMBOK Function and Activity Structure", repeated for the planning activities]

DMBOK Function and Activity - Control Activities

[Chart: the function and activity chart from "DMBOK Function and Activity Structure", repeated for the control activities]

DMBOK Function and Activity - Development Activities

[Chart: the function and activity chart from "DMBOK Function and Activity Structure", repeated for the development activities]

DMBOK Function and Activity - Operational Activities

[Chart: the function and activity chart from "DMBOK Function and Activity Structure", repeated for the operational activities]

DMBOK Environmental Elements Structure

• Goals and Principles
  − Vision and Mission
  − Business Benefits
  − Strategic Goals
  − Specific Objectives
  − Guiding Principles
• Activities
  − Phases, Tasks, Steps
  − Dependencies
  − Sequence and Flow
  − Use Cases and Scenarios
  − Trigger Events
• Primary Deliverables
  − Inputs and Outputs
  − Information
  − Documents
  − Databases
  − Other Resources
• Roles and Responsibilities
  − Individual Roles
  − Organisation Roles
  − Business and IT Roles
  − Qualifications and Skills
• Technology
  − Tool Categories
  − Standards and Protocols
  − Selection Criteria
  − Learning Curves
• Practices and Techniques
  − Recognised Best Practices
  − Common Approaches
  − Alternative Techniques
• Organisation and Culture
  − Critical Success Factors
  − Reporting Structures
  − Management Metrics
  − Values, Beliefs, Expectations
  − Attitudes, Styles, Preferences
  − Teamwork, Group Dynamics, Authority, Empowerment. Contracting Strategies
  − Change Management Approach

DMBOK Environmental Elements


Data Governance


Data Governance

• Core function of the Data Management Framework
• Interacts with and influences each of the nine surrounding data management functions
• Data governance is the exercise of authority and control (planning, monitoring, and enforcement) over the management of data assets
• The data governance function guides how all other data management functions are performed
• High-level, executive data stewardship
• Data governance is not the same thing as IT governance
• Data governance is focused exclusively on the management of data assets

Data Governance – Definition and Goals

• Definition
  − The exercise of authority and control (planning, monitoring, and enforcement) over the management of data assets
• Goals
  − To define, approve, and communicate data strategies, policies, standards, architecture, procedures, and metrics
  − To track and enforce regulatory compliance and conformance to data policies, standards, architecture, and procedures
  − To sponsor, track, and oversee the delivery of data management projects and services
  − To manage and resolve data related issues
  − To understand and promote the value of data assets

Data Governance - Overview

• Inputs: Business Goals, Business Strategies, IT Objectives, IT Strategies, Data Needs, Data Issues, Regulatory Requirements
• Suppliers: Business Executives, IT Executives, Data Stewards, Regulatory Bodies
• Participants: Executive Data Stewards, Coordinating Data Stewards, Business Data Stewards, Data Professionals, DM Executive, CIO
• Tools: Intranet Website, E-Mail, Metadata Tools, Metadata Repository, Issue Management Tools, Data Governance KPI Dashboard
• Primary Deliverables: Data Policies, Data Standards, Resolved Issues, Data Management Projects and Services, Quality Data and Information, Recognised Data Value
• Consumers: Data Producers, Knowledge Workers, Managers and Executives, Data Professionals, Customers
• Metrics: Data Value, Data Management Cost, Achievement of Objectives, # of Decisions Made, Steward Representation / Coverage, Data Professional Headcount, Data Management Process Maturity
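The slide names the metrics without defining them. As one possible reading of "Steward Representation / Coverage", the sketch below measures the share of subject areas that have at least one data steward assigned. The definition, the function name, the subject areas and the steward identifiers are all assumptions made for illustration.

```python
def steward_coverage(assignments: dict[str, list[str]]) -> float:
    """Fraction of subject areas that have at least one data steward assigned."""
    if not assignments:
        return 0.0
    covered = sum(1 for stewards in assignments.values() if stewards)
    return covered / len(assignments)


# Hypothetical subject areas mapped to their assigned stewards.
assignments = {
    "Customer": ["steward_a"],
    "Product": ["steward_b", "steward_c"],
    "Finance": [],  # no steward yet, so this area counts as uncovered
    "Supplier": ["steward_a"],
}

print(f"{steward_coverage(assignments):.0%}")  # prints 75%
```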

Data Governance Function, Activities and Sub-Activities

• Data Management Planning
  − Understand Strategic Enterprise Data Needs
  − Develop and Maintain the Data Strategy
  − Establish Data Professional Roles and Organisations
  − Identify and Appoint Data Stewards
  − Establish Data Governance and Stewardship Organisations
  − Develop and Approve Data Policies, Standards, and Procedures
  − Review and Approve Data Architecture
  − Plan and Sponsor Data Management Projects and Services
  − Estimate Data Asset Value and Associated Costs
• Data Management Control
  − Supervise Data Professional Organisations and Staff
  − Coordinate Data Governance Activities
  − Manage and Resolve Data Related Issues
  − Monitor and Ensure Regulatory Compliance
  − Monitor and Enforce Conformance with Data Policies, Standards and Architecture
  − Oversee Data Management Projects and Services
  − Communicate and Promote the Value of Data Assets

Data Governance

• Data governance is accomplished most effectively as an on-going programme and a continual improvement process
• Every data governance programme is unique, taking into account distinctive organisational and cultural issues, and the immediate data management challenges and opportunities
• Data governance is at the core of managing data assets

Data Governance - Possible Organisation Structure Data Governance Structure

Organisation Data Governance Council

Data Governance Office

CIO

Data Management Executive

Business Unit Data Governance Councils

Data Technologists

Data Stewardship Committees

Data Stewardship Teams

Data Governance Shared Decision Making Business Decisions

Shared Decision Making

IT Decisions

Enterprise Information Model

Enterprise Information Management Strategy

Database Architecture

IT Leadership

Information Needs

Enterprise Information Management Policies

Data Integration Architecture

Capital Investments

Information Specifications

Enterprise Information Management Standards

Data Warehousing and Business Intelligence Architecture

Research and Development Funding

Quality Requirements

Enterprise Information Management Metrics

Metadata Architecture

Issue Resolution

Enterprise Information Management Services

Technical Metadata

Business Operating Model

Data Governance Model


Data Stewardship

• Formal accountability for business responsibilities ensuring effective control and use of data assets
• A data steward is a business leader and/or recognised subject matter expert designated as accountable for these responsibilities
• Data stewards manage data assets on behalf of others and in the best interests of the organisation
• They represent the data interests of all stakeholders, including but not limited to the interests of their own functional departments and divisions
• They protect, manage, and leverage the data resources
• They must take an enterprise perspective to ensure the quality and effective use of enterprise data

Data Stewardship - Roles

• Executive Data Stewards - provide data governance and make high-level data stewardship decisions
• Coordinating Data Stewards - lead and represent teams of business data stewards in discussions across teams and with executive data stewards
• Business Data Stewards - subject matter experts who work with data management professionals on an ongoing basis to define and control data

Data Stewardship Roles Across Data Management Functions - 1

• Data Architecture Management
  − All Data Stewards: review, validate, approve, maintain and refine data architecture
  − Executive Data Stewards: review and approve the enterprise data architecture
  − Coordinating Data Stewards: integrate specifications, resolving differences
  − Business Data Stewards: define data requirements specifications
• Data Development
  − All Data Stewards: validate physical data models and database designs, participate in database testing and conversion
  − Business Data Stewards: define data requirements and specifications
• Data Operations Management
  − Business Data Stewards: define requirements for data recovery, retention and performance; help identify, acquire, and control externally sourced data
• Data Security Management
  − Business Data Stewards: provide security, privacy and confidentiality requirements, identify and resolve data security issues, assist in data security audits, and classify information confidentiality
• Reference and Master Data Management
  − Business Data Stewards: control the creation, update, and retirement of code values and other reference data, define master data management requirements, identify and help resolve issues

Data Stewardship Roles Across Data Management Functions - 2

• Data Warehousing and Business Intelligence Management
  − Business Data Stewards: provide business intelligence requirements and management metrics, and identify and help resolve business intelligence issues
• Document and Content Management
  − Business Data Stewards: define enterprise taxonomies and resolve content management issues
• Metadata Management
  − Data Stewards: create and maintain business metadata (names, meanings, business rules), define metadata access and integration needs and use metadata to make effective data stewardship and governance decisions
• Data Quality Management
  − Data Stewards: define data quality requirements and business rules, test application edits and validations, assist in the analysis, certification, and auditing of data quality, lead clean-up efforts, identify ways to solve causes of poor data quality, promote data quality awareness

Data Strategy

• High-level course of action to achieve high-level goals
• Data strategy is a data management programme strategy: a plan for maintaining and improving data quality, integrity, security and access
• Addresses all data management functions relevant to the organisation

Elements of Data Strategy

• Vision for data management
• Summary business case for data management
• Guiding principles, values, and management perspectives
• Mission and long-term directional goals of data management
• Management measures of data management success
• Short-term data management programme objectives
• Descriptions of data management roles and business units along with a summary of their responsibilities and decision rights
• Descriptions of data management programme components and initiatives
• Outline of the data management implementation roadmap
• Scope boundaries

Data Strategy

• Data Management Programme Charter - overall vision, business case, goals, guiding principles, measures of success, critical success factors, recognised risks
• Data Management Scope Statement - goals and objectives for a defined planning horizon and the roles, organisations, and individual leaders accountable for achieving these objectives
• Data Management Implementation Roadmap - identifies specific programmes, projects, task assignments, and delivery milestones

Data Policies

• Statements of intent and fundamental rules governing the creation, acquisition, integrity, security, quality, and use of data and information
• More fundamental, global, and business critical than data standards
• Describe what to do and what not to do
• There should be few data policies, stated briefly and directly

Data Policies

• Possible topics for data policies
  − Data modeling and other data development activities
  − Development and use of data architecture
  − Data quality expectations, roles, and responsibilities
  − Data security, including confidentiality classification policies, intellectual property policies, personal data privacy policies, general data access and usage policies, and data access by external parties
  − Database recovery and data retention
  − Access and use of externally sourced data
  − Sharing data internally and externally
  − Data warehousing and business intelligence
  − Unstructured data - electronic files and physical records

Data Architecture

• Enterprise data model and other aspects of data architecture are sponsored at the data governance level
• Need to pay particular attention to the alignment of the enterprise data model with key business strategies, processes, business units and systems
• Includes
  − Data technology architecture
  − Data integration architecture
  − Data warehousing and business intelligence architecture
  − Metadata architecture

Data Standards and Procedures

• Include naming standards, requirement specification standards, data modeling standards, database design standards, architecture standards and procedural standards for each data management function
• Must be effectively communicated, monitored, enforced and periodically re-evaluated
• Data management procedures are the methods, techniques, and steps followed to accomplish a specific activity or task

Data Standards and Procedures

• Possible topics for data standards and procedures
  − Data modeling and architecture standards, including data naming conventions, definition standards, standard domains, and standard abbreviations
  − Standard business and technical metadata to be captured, maintained, and integrated
  − Data model management guidelines and procedures
  − Metadata integration and usage procedures
  − Standards for database recovery and business continuity, database performance, data retention, and external data acquisition
  − Data security standards and procedures
  − Reference data management control procedures
  − Match / merge and data cleansing standards and procedures
  − Business intelligence standards and procedures
  − Enterprise content management standards and procedures, including use of enterprise taxonomies, support for legal discovery and document and e-mail retention, electronic signatures, report formatting standards and report distribution approaches
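Naming conventions and standard abbreviations are the kind of standard that can be monitored automatically. The sketch below checks column names against a hypothetical convention (lower snake_case plus a short list of required abbreviations). Both the convention and the abbreviation list are assumptions made for the example and would be replaced by the organisation's own standards.

```python
import re

# Hypothetical standard abbreviations: full word -> required short form.
STANDARD_ABBREVIATIONS = {"customer": "cust", "number": "nbr", "amount": "amt"}

# Hypothetical naming convention: lower snake_case starting with a letter.
NAME_PATTERN = re.compile(r"[a-z][a-z0-9]*(_[a-z0-9]+)*")


def check_name(name: str) -> list[str]:
    """Return a list of naming-standard violations for one data element name."""
    problems = []
    if not NAME_PATTERN.fullmatch(name):
        problems.append("not lower snake_case")
    for word in name.lower().split("_"):
        if word in STANDARD_ABBREVIATIONS:
            short = STANDARD_ABBREVIATIONS[word]
            problems.append(f"use '{short}' instead of '{word}'")
    return problems


for column in ["cust_nbr", "customer_nbr", "CustomerNumber"]:
    print(column, check_name(column))
```

A check like this only covers the mechanical part of a standard; definitions, domains and business meaning still need review by data stewards.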

Regulatory Compliance

• Most organisations are impacted by government and industry regulations
• Many of these regulations dictate how data and information are to be managed
• Compliance is generally mandatory
• Data governance guides the implementation of adequate controls to ensure, document, and monitor compliance with data-related regulations

Regulatory Compliance

• Data governance needs to work with the business to find the best answers to the following regulatory compliance questions
  − How relevant is a regulation?
  − Why is it important for us?
  − How do we interpret it?
  − What policies and procedures does it require?
  − Do we comply now?
  − How do we comply now?
  − How should we comply in the future?
  − What will it take?
  − When will we comply?
  − How do we demonstrate and prove compliance?
  − How do we monitor compliance?
  − How often do we review compliance?
  − How do we identify and report non-compliance?
  − How do we manage and rectify non-compliance?

Issue Management

• Data governance assists in identifying, managing, and resolving data related issues
  − Data quality issues
  − Data naming and definition conflicts
  − Business rule conflicts and clarifications
  − Data security, privacy, and confidentiality issues
  − Regulatory non-compliance issues
  − Non-conformance issues (policies, standards, architecture, and procedures)
  − Conflicting policies, standards, architecture, and procedures
  − Conflicting stakeholder interests in data and information
  − Organisational and cultural change management issues
  − Issues regarding data governance procedures and decision rights
  − Negotiation and review of data sharing agreements

Issue Management, Control and Escalation

• Data governance implements issue controls and procedures
  − Identifying, capturing, logging and updating issues
  − Tracking the status of issues
  − Documenting stakeholder viewpoints and resolution alternatives
  − Objective, neutral discussions where all viewpoints are heard
  − Escalating issues to higher levels of authority
  − Determining, documenting and communicating issue resolutions
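The issue controls listed above (logging, status tracking, recording viewpoints, escalation and resolution) can be sketched as a minimal issue record. This is an illustrative Python sketch only: the status values, the three-level escalation path and the sample issue are assumptions, not something prescribed by the DMBOK.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class Status(Enum):
    OPEN = "open"
    ESCALATED = "escalated"
    RESOLVED = "resolved"


# Hypothetical escalation path, lowest level of authority first.
ESCALATION_PATH = [
    "Data Stewardship Team",
    "Data Stewardship Committee",
    "Data Governance Council",
]


@dataclass
class Issue:
    title: str
    status: Status = Status.OPEN
    level: int = 0  # index into ESCALATION_PATH
    viewpoints: dict[str, str] = field(default_factory=dict)
    resolution: Optional[str] = None
    log: list[str] = field(default_factory=list)

    @property
    def owner(self) -> str:
        """The body currently responsible for the issue."""
        return ESCALATION_PATH[self.level]

    def add_viewpoint(self, stakeholder: str, view: str) -> None:
        """Document one stakeholder's viewpoint on the issue."""
        self.viewpoints[stakeholder] = view
        self.log.append(f"viewpoint recorded: {stakeholder}")

    def escalate(self) -> None:
        """Move the issue to the next higher level of authority."""
        if self.status is Status.RESOLVED:
            raise ValueError("cannot escalate a resolved issue")
        if self.level + 1 >= len(ESCALATION_PATH):
            raise ValueError("already at the highest level of authority")
        self.level += 1
        self.status = Status.ESCALATED
        self.log.append(f"escalated to {self.owner}")

    def resolve(self, resolution: str) -> None:
        """Record the resolution so it can be communicated."""
        self.resolution = resolution
        self.status = Status.RESOLVED
        self.log.append(f"resolved by {self.owner}: {resolution}")


# Hypothetical example issue.
issue = Issue("Conflicting definitions of 'active customer'")
issue.add_viewpoint("Sales", "active means an order in the last 12 months")
issue.add_viewpoint("Finance", "active means an open account balance")
issue.escalate()
issue.resolve("adopt the 12-month order definition enterprise-wide")
print(issue.status.value, issue.owner)
```

Keeping the log as part of the record means the history of viewpoints, escalations and the final decision travels with the issue.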

Data Management Projects

• Data management roadmap sets out a course of action for initiating and/or improving data management functions
• Consists of an assessment of current functions, definition of a target environment and target objectives, and a transition plan outlining the steps required to reach these targets, including an approach to organisational change management
• Every data management project should follow the project management standards of the organisation

Data Asset Valuation •

Data and information are truly assets because they have business value, tangible or intangible



Different approaches to estimating the value of data assets



Identify the direct and indirect business benefits derived from use of the data



Identify the cost of data loss by assessing the impact of not having the current amount and quality level of data


Data Architecture Management


Data Architecture Management

• Concerned with defining and maintaining specifications that
− Provide a standard common business vocabulary
− Express strategic data requirements
− Outline high level integrated designs to meet these requirements
− Align with enterprise strategy and related business architecture
• Data architecture is an integrated set of specification artifacts used to define data requirements, guide integration and control of data assets and align data investments with business strategy
• Includes formal data names, comprehensive data definitions, effective data structures, precise data integrity rules, and robust data documentation

Data Architecture Management – Definition and Goals

• Definition
− Defining the data needs of the enterprise and designing the master blueprints to meet those needs
• Goals
− To plan with vision and foresight to provide high quality data
− To identify and define common data requirements
− To design conceptual structures and plans to meet the current and long-term data requirements of the enterprise

Data Architecture Management - Overview

• Inputs: Business Goals, Business Strategies, Business Architecture, Process Architecture, IT Objectives, IT Strategies, Data Strategies, Data Issues and Needs, Technical Architecture
• Suppliers: Executives, Data Stewards, Data Producers, Information Consumers
• Participants: Data Stewards, Subject Matter Experts (SMEs), Data Architects, Data Analysts and Modelers, Other Enterprise Architects, DM Executive and Managers, CIO and Other Executives, Database Administrators, Data Model Administrator
• Tools: Data Modeling Tools, Model Management Tool, Metadata Repository, Office Productivity Tools
• Primary Deliverables: Enterprise Data Model, Information Value Chain Analysis, Data Technology Architecture, Data Integration / MDM Architecture, DW / BI Architecture, Metadata Architecture, Enterprise Taxonomies and Namespaces, Document Management Architecture, Metadata
• Consumers: Data Producers, Knowledge Workers, Managers and Executives, Data Professionals, Customers
• Metrics: Data Value, Data Management Cost, Achievement of Objectives, # of Decisions Made, Steward Representation / Coverage, Data Professional Headcount, Data Management Process Maturity

Enterprise Data Architecture

• Integrated set of specifications and documents
− Enterprise Data Model - the core of enterprise data architecture
− Information Value Chain Analysis - aligns data with business processes and other enterprise architecture components
− Related Data Delivery Architecture - including database architecture, data integration architecture, data warehousing / business intelligence architecture, document content architecture, and metadata architecture

Data Architecture Management Activities

• Understand Enterprise Information Needs
• Develop and Maintain the Enterprise Data Model
• Analyse and Align With Other Business Models
• Define and Maintain the Data Technology Architecture
• Define and Maintain the Data Integration Architecture
• Define and Maintain the Data Warehouse / Business Intelligence Architecture
• Define and Maintain Enterprise Taxonomies and Namespaces
• Define and Maintain the Metadata Architecture

Understanding Enterprise Information Needs

• In order to create an enterprise data architecture, the organisation must first define its information needs
• An enterprise data model is a way of capturing and defining enterprise information needs and data requirements
• Master blueprint for enterprise-wide data integration
• Enterprise data model is a critical input to all future systems development projects and the baseline for additional data requirements analysis
• Evaluate the current inputs and outputs required by the organisation, both from and to internal and external targets

Develop and Maintain the Enterprise Data Model •

Data is the set of facts collected about business entities



Data model is a set of data specifications that reflect data requirements and designs



Enterprise data model is an integrated, subject-oriented data model defining the critical data produced and consumed across the organisation



Define and analyse data requirements



Design logical and physical data structures that support these requirements


Enterprise Data Model

• Subject Area Model
• Conceptual Data Model
• Enterprise Logical Data Models
• Other Enterprise Data Model Components
− Data Steward Responsibility Assignments
− Valid Reference Data Values
− Data Quality Specifications
− Entity Life Cycles

Enterprise Data Model •

Build an enterprise data model in layers



Focus on the most critical business subject areas


Subject Area Model •

List of major subject areas that collectively express the essential scope of the enterprise



Important to the success of the entire enterprise data model



List of enterprise subject areas becomes one of the most significant organisation classifications



Acceptable to organisation stakeholders



Useful as the organising framework for data governance, data stewardship, and further enterprise data modeling


Conceptual Data Model

• Conceptual data model defines business entities and their relationships
• Business entities are the primary organisational structures in a conceptual data model
• Business needs data about business entities
• Include a glossary containing the business definitions and other metadata associated with business entities and their relationships
• Assists improved business understanding and reconciliation of terms and their meanings
• Provide the framework for developing integrated information systems to support both transactional processing and business intelligence
• Depicts how the enterprise sees information

Enterprise Logical Data Models •

Logical data models contain a level of detail below the conceptual data model



Contain the essential data attributes for each entity



Essential data attributes are those data attributes without which the enterprise cannot function – can be a subjective decision


Other Enterprise Data Model Components •

Data Steward Responsibility Assignments - for subject areas, entities, attributes, and/or reference data value sets



Valid Reference Data Values - controlled value sets for codes and/or labels and their business meaning



Data Quality Specifications - rules for essential data attributes, such as accuracy / precision requirements, currency (timeliness), integrity rules, nullability, formatting, match/merge rules, and/or audit requirements



Entity Life Cycles - show the different lifecycle states of the most important entities and the trigger events that change an entity from one state to another
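
An entity life cycle can be written down as a table of states and trigger events. The sketch below is illustrative only: the Order entity, its states and its event names are assumptions, not part of the DMBOK material.

```python
from enum import Enum


class OrderState(Enum):
    """Hypothetical lifecycle states for an illustrative Order entity."""
    DRAFT = "draft"
    SUBMITTED = "submitted"
    FULFILLED = "fulfilled"
    CANCELLED = "cancelled"


# (current state, trigger event) -> next state; all names are illustrative.
TRANSITIONS = {
    (OrderState.DRAFT, "submit"): OrderState.SUBMITTED,
    (OrderState.DRAFT, "cancel"): OrderState.CANCELLED,
    (OrderState.SUBMITTED, "ship"): OrderState.FULFILLED,
    (OrderState.SUBMITTED, "cancel"): OrderState.CANCELLED,
}


def apply_event(state: OrderState, event: str) -> OrderState:
    """Return the next lifecycle state, or raise if the event is not valid here."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"event {event!r} is not allowed in state {state.name}") from None
```

Keeping the transitions in one table makes the life cycle easy to review with data stewards, since every permitted state change is listed explicitly.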


Analyse and Align with Other Business Models •

Information value-chain analysis maps the relationships between enterprise model elements and other business models



Business value chain identifies the functions of an organisation that contribute directly or indirectly to the organisation’s goals


Define and Maintain the Data Technology Architecture •

Data technology architecture guides the selection and integration of data-related technology



Data technology architecture defines standard tool categories, preferred tools in each category, and technology standards and protocols for technology integration



Technology categories include
− Database management systems (DBMS)
− Database management utilities
− Data modelling and model management tools
− Business intelligence software for reporting and analysis
− Extract-transform-load (ETL), changed data capture (CDC), and other data integration tools
− Data quality analysis and data cleansing tools
− Metadata management software, including metadata repositories

Define and Maintain the Data Technology Architecture •

Classify technology architecture components as
− Current - currently supported and used
− Deployment - deployed for use in the next 1-2 years
− Strategic - expected to be available for use in the next 2+ years
− Retirement - the organisation has retired or intends to retire this year
− Preferred - preferred for use by most applications
− Containment - limited to use by certain applications
− Emerging - being researched and piloted for possible future deployment

Define and Maintain the Data Integration Architecture •

Defines how data flows through all systems from beginning to end



It is both data architecture and application architecture, because it includes both databases and the applications that control the data flow into the system, between databases and back out of the system


Define and Maintain the Data Warehouse / Business Intelligence Architecture •

Focuses on how data changes and snapshots are stored in data warehouse systems for maximum usefulness and performance



Data integration architecture shows how data moves from source systems through staging databases into data warehouses and data marts



Business intelligence architecture defines how decision support makes data available, including the selection and use of business intelligence tools


Define and Maintain Enterprise Taxonomies and Namespaces •

Taxonomy is the hierarchical structure used for outlining topics



Organisations develop their own taxonomies to organise collective thinking about topics



Overall enterprise data architecture includes organisational taxonomies



Definition of terms used in such taxonomies should be consistent with the enterprise data model


Define and Maintain the Metadata Architecture •

Metadata architecture is the design for integration of metadata across software tools, repositories, directories, glossaries, and data dictionaries



Metadata architecture defines the managed flow of metadata



Defines how metadata is created, integrated, controlled, and accessed



Metadata repository is the core of any metadata architecture



Focus of metadata architecture is to ensure the quality, integration, and effective use of metadata

Data Architecture Management Guiding Principles

• Data architecture is an integrated set of specification master blueprints used to define data requirements, guide data integration, control data assets, and align data investments with business strategy
• Enterprise data architecture is part of the overall enterprise architecture, along with process architecture, business architecture, systems architecture, and technology architecture
• Enterprise data architecture includes three major categories of specifications: the enterprise data model, information value chain analysis, and data delivery architecture
• Enterprise data architecture is about more than just data - it helps to establish a common business vocabulary
• An enterprise data model is an integrated subject-oriented data model defining the essential data used across an entire organisation
• Information value-chain analysis defines the critical relationships between data, processes, roles and organisations and other enterprise elements
• Data delivery architecture defines the master blueprint for how data flows across databases and applications
• Architectural frameworks like TOGAF help organise collective thinking about architecture

Data Development


Data Development •

Analysis, design, implementation, deployment, and maintenance of data solutions to maximise the value of the data resources to the enterprise



Subset of project activities within the system development lifecycle focused on defining data requirements, designing the data solution components, and implementing these components



Primary data solution components are databases and other data structures


Data Development – Definition and Goals

• Definition
− Designing, implementing, and maintaining solutions to meet the data needs of the enterprise
• Goals
− Identify and define data requirements
− Design data structures and other solutions to these requirements
− Implement and maintain solution components that meet these requirements
− Ensure solution conformance to data architecture and standards as appropriate
− Ensure the integrity, security, usability, and maintainability of structured data assets

Data Development - Overview

• Inputs: Business Goals and Strategies, Data Needs and Strategies, Data Standards, Data Architecture, Process Architecture, Application Architecture, Technical Architecture
• Suppliers: Data Stewards, Subject Matter Experts, IT Steering Committee, Data Governance Council, Data Architects and Analysts, Software Developers, Data Producers, Information Consumers
• Participants: Data Stewards and SMEs, Data Architects and Analysts, Database Administrators, Data Model Administrators, Software Developers, Project Managers, DM Executives and Other IT Management
• Tools: Data Modeling Tools, Database Management Systems, Software Development Tools, Testing Tools, Data Profiling Tools, Model Management Tools, Configuration Management Tools, Office Productivity Tools
• Primary Deliverables: Data Requirements and Business Rules, Conceptual Data Models, Logical Data Models and Specifications, Physical Data Models and Specifications, Metadata (Business and Technical), Data Modeling and DB Design Standards, Data Model and DB Design Reviews, Version Controlled Data Models, Test Data, Development and Test Databases, Information Products, Data Access Services, Data Integration Services, Migrated and Converted Data
• Consumers: Data Producers, Knowledge Workers, Managers and Executives, Customers, Data Professionals, Other IT Professionals

Data Development Function, Activities and Sub-Activities

• Data Modelling, Analysis and Solution Design
− Analyse Information Requirements
− Develop and Maintain Conceptual Data Models
    • Entities
    • Relationships
− Develop and Maintain Logical Data Models
    • Attributes
    • Domains
    • Keys
− Develop and Maintain Physical Data Models
• Detailed Data Design
− Design Physical Databases
    • Physical Database Design
    • Performance Modifications
    • Physical Database Design Documentation
− Design Information Products
− Design Data Access Services
− Design Data Integration Services
• Data Model and Design Quality Management
− Develop Data Modeling and Design Standards
− Review Data Model and Database Design Quality
    • Conceptual and Logical Data Model Reviews
    • Physical Database Design Review
    • Data Model Validation
− Manage Data Model Versioning and Integration
• Data Implementation
− Implement Development / Test Database Changes
− Create and Maintain Test Data
− Migrate and Convert Data
− Build and Test Information Products
− Build and Test Data Access Services
− Validate Information Requirements
− Prepare for Data Deployment

Data Development - Principles

• Data development activities are an integral part of the software development lifecycle
• Data modeling is an essential technique for effective data management and system design
• Conceptual and logical data modeling express business and application requirements while physical data modeling represents solution design
• Data modeling and database design define detailed solution component specifications
• Data modeling and database design balance tradeoffs and needs
• Data professionals should collaborate with other project team members to design information products and data access and integration interfaces
• Data modeling and database design should follow documented standards
• Design reviews should review all data models and designs, in order to ensure they meet business requirements and follow design standards
• Data models represent valuable knowledge resources and so should be carefully managed and controlled through library, configuration, and change management to ensure data model quality and availability
• Database administrators and other data professionals play important roles in the construction, testing, and deployment of databases and related application systems

Data Modeling, Analysis, and Solution Design •

Data modeling is an analysis and design method used to define and analyse data requirements, and design data structures that support these requirements



A data model is a set of data specifications and related diagrams that reflect data requirements and designs



Data modeling is a complex process involving interactions between people and with technology which do not compromise the integrity or security of the data



Good data models accurately express and effectively communicate data requirements and quality solution design

Data Model

• The purposes of a data model are:
− Communication - a data model is a bridge to understanding data between people with different levels and types of experience. Data models help us understand a business area, an existing application, or the impact of modifying an existing structure. Data models may also facilitate training new business and/or technical staff
− Formalisation - a data model documents a single, precise definition of data requirements and data related business rules
− Scope - a data model can help explain the data context and scope of purchased application packages
• Data models that include the same data may differ by:
− Scope - expressing a perspective about data in terms of function (business view or application view), realm (process, department, division, enterprise, or industry view), and time (current state, short-term future, long-term future)
− Focus - basic and critical concepts (conceptual view), detailed but independent of context (logical view), or optimised for a specific technology and use (physical view)

Analyse Information Requirements

• Information is relevant and timely data in context
• To identify information requirements, first identify business information needs, often in the context of one or more business processes
• Business processes (and the underlying IT systems) consume information output from other business processes
• Requirements analysis includes the elicitation, organisation, documentation, review, refinement, approval, and change control of business requirements
• Some of these requirements identify business needs for data and information
• Logical data modeling is an important means of expressing business data requirements

Develop and Maintain Conceptual Data Models •

Visual, high-level perspective on a subject area of importance to the business



Contains the basic and critical business entities within a given realm and function with a description of each entity and the relationships between entities



Define the meanings of the essential business vocabulary



Reflect the data associated with a business process or application function



Independent of technology and usage context


Develop and Maintain Conceptual Data Models

• Entities
− A data entity is a collection of data about something that the business deems important and worthy of capture
− Entities appear in conceptual or logical data models
• Relationships
− Business rules define constraints on what can and cannot be done
    • Data Rules - define constraints on how data relates to other data
    • Action Rules - instructions on what to do when data elements contain certain values
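
The difference between a data rule and an action rule can be shown with a small sketch. The Account entity, its fields and both rules are hypothetical examples chosen for illustration, not rules taken from the source material.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Account:
    # Illustrative entity; field names are assumptions, not from the source.
    account_id: str
    owner_id: Optional[str]
    balance: float
    status: str = "open"


def check_data_rule(account: Account) -> bool:
    """Data rule (assumed): every account must relate to an owner."""
    return account.owner_id is not None


def apply_action_rule(account: Account) -> Account:
    """Action rule (assumed): when the balance is negative, flag the account for review."""
    if account.balance < 0:
        account.status = "review"
    return account
```

The data rule constrains how one piece of data relates to another, while the action rule says what to do when a data element holds a certain value.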


Develop and Maintain Logical Data Models

• Detailed representation of data requirements and the business rules that govern data quality
• Independent of any technology or specific implementation technical constraints
• Extension of a conceptual data model
• Logical data models transform conceptual data model structures by normalisation and abstraction
− Normalisation is the process of applying rules to organise business complexity into stable data structures
− Abstraction is the redefinition of data entities, elements, and relationships by removing details to broaden the applicability of data structures to a wider class of situations
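
As a rough illustration of normalisation, the sketch below splits flat order rows, where customer details repeat on every row, into a customer structure and an order structure. The data and field names are made up for the example.

```python
# Flat, denormalised rows: customer details repeat on every order (illustrative data).
flat_orders = [
    {"order_id": 1, "customer_id": "C1", "customer_name": "Acme", "amount": 120.0},
    {"order_id": 2, "customer_id": "C1", "customer_name": "Acme", "amount": 75.5},
    {"order_id": 3, "customer_id": "C2", "customer_name": "Globex", "amount": 40.0},
]


def normalise(rows):
    """Split flat rows into a customer structure and an order structure.

    Each customer fact is stored once; orders keep only the customer key.
    """
    customers = {}
    orders = []
    for row in rows:
        customers[row["customer_id"]] = {"customer_name": row["customer_name"]}
        orders.append(
            {"order_id": row["order_id"], "customer_id": row["customer_id"], "amount": row["amount"]}
        )
    return customers, orders


customers, orders = normalise(flat_orders)
```

After the split, a change to a customer name is made in one place rather than on every order row.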

133

Develop and Maintain Physical Data Models •

Physical data model optimises the implementation of detailed data requirements and business rules in light of technology constraints, application usage, performance requirements, and modeling standards



Physical data modeling transforms the logical data model



Includes specific decisions
− Name of each table and column or file and field or schema and element
− Logical domain, physical data type, length, and nullability of each column or field
− Default values
− Primary and alternate unique keys and indexes
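
These decisions end up as concrete DDL. The sketch below uses SQLite only because it ships with Python; the customer table, its columns and the default value are hypothetical, and SQLite does not enforce declared lengths the way many other DBMSs do.

```python
import sqlite3

# Illustrative physical design decisions expressed as DDL (all names are assumptions).
DDL = """
CREATE TABLE customer (
    customer_id   INTEGER PRIMARY KEY,             -- primary key
    email         TEXT    NOT NULL,                -- physical type and nullability
    country_code  TEXT    NOT NULL DEFAULT 'IE',   -- default value
    created_on    TEXT    NOT NULL DEFAULT CURRENT_DATE
);
CREATE UNIQUE INDEX ux_customer_email ON customer (email);  -- alternate unique key
"""

conn = sqlite3.connect(":memory:")
conn.executescript(DDL)

# Only the email is supplied; the key and defaults are filled in by the database.
conn.execute("INSERT INTO customer (email) VALUES (?)", ("a@example.com",))
row = conn.execute("SELECT country_code FROM customer").fetchone()
```

Inserting a second row with the same email would be rejected by the unique index, which is how the alternate key decision is enforced physically.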

134

Detailed Data Design

• Detailed data design activities include
− Detailed physical database design, including views, functions, triggers, and stored procedures
− Definition of supporting data structures, such as XML schemas and object classes
− Creation of information products, such as the use of data in screens and reports
− Definition of data access solutions, including data access objects, integration services, and reporting and analysis services

Design Physical Databases

• Create detailed database implementation specifications
• Ensure the design meets data integrity requirements
• Determine the most appropriate physical structure to house and organise the data, such as relational or other type of DBMS, files, OLAP cubes, XML, etc.
• Determine database resource requirements, such as server size and location, disk space requirements, CPU and memory requirements, and network requirements
• Create detailed design specifications for data structures, such as relational database tables, indexes, views, OLAP data cubes, XML schemas, etc.
• Ensure performance requirements are met, including batch and online response time requirements for queries, inserts, updates, and deletes
• Design for backup, recovery, archiving, and purge processing, ensuring availability requirements are met
• Design data security implementation, including authentication, encryption needs, application roles and data access and update permissions
• Review code to ensure that it meets coding standards and will run efficiently

Physical Database Design •

Choose a database design based on both a choice of architecture and a choice of technology



Base the choice of architecture (for example, relational, hierarchical, network, object, star schema, snowflake, cube, etc.) on data considerations



Consider factors such as how long the data needs to be kept, whether it must be integrated with other data or passed across system or application boundaries, and on requirements of data security, integrity, recoverability, accessibility, and reusability



Consider organisational or political factors, including organisational biases and developer skill sets, that lean toward a particular technology or vendor


Physical Database Design - Principles •

Performance and Ease of Use - Ensure quick and easy access to data by approved users in a usable and business-relevant form



Reusability - The database structure should ensure that, where appropriate, multiple applications would be able to use the data



Integrity - The data should always have a valid business meaning and value, regardless of context, and should always reflect a valid state of the business



Security - True and accurate data should always be immediately available to authorised users, but only to authorised users



Maintainability - Perform all data work at a cost that yields value by ensuring that the cost of creating, storing, maintaining, using, and disposing of data does not exceed its value to the organisation

Physical Database Design - Questions

• What are the performance requirements? What is the maximum permissible time for a query to return results, or for a critical set of updates to occur?
• What are the availability requirements for the database? What are the window(s) of time for performing database operations? How often should database backups and transaction log backups be done (i.e., what is the longest period of time we can risk non-recoverability of the data)?
• What is the expected size of the database? What is the expected rate of growth of the data? At what point can old or unused data be archived or deleted? How many concurrent users are anticipated?
• What sorts of data virtualisation are needed to support application requirements in a way that does not tightly couple the application to the database schema?
• Will other applications need the data? If so, what data and how?
• Will users expect to be able to do ad-hoc querying and reporting of the data? If so, how and with which tools?
• What, if any, business or application processes does the database need to implement? (e.g., trigger code that does cross-database integrity checking or updating, application classes encapsulated in database procedures or functions, database views that provide table recombination for ease of use or security purposes, etc.)
• Are there application or developer concerns regarding the database, or the database development process, that need to be addressed?
• Is the application code efficient? Can a code change relieve a performance issue?

Performance Modifications •

Consider how the database will perform when applications make requests to access and modify data



Indexing can improve query performance in many cases



Denormalisation is the deliberate transformation of a normalised logical data model into tables with redundant data
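
The effect of an index can be checked by comparing query plans before and after creating it. The sketch below uses SQLite and a made-up orders table; the exact plan wording varies between SQLite versions, so it only looks at whether the index is used.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
    [(i % 100, float(i)) for i in range(1000)],
)

QUERY = "SELECT amount FROM orders WHERE customer_id = ?"


def plan(connection, sql, params):
    """Return SQLite's query plan detail strings for a statement."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [r[3] for r in rows]  # the fourth column holds the plan description


plan_before = plan(conn, QUERY, (42,))   # full table scan expected
conn.execute("CREATE INDEX ix_orders_customer ON orders (customer_id)")
plan_after = plan(conn, QUERY, (42,))    # lookup through the new index expected
```

The same comparison is worth repeating for write-heavy workloads, since each additional index adds cost to inserts and updates.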


Physical Database Design Documentation •

Create physical database design document to assist implementation and maintenance


Design Information Products

• Design data-related deliverables
• Design screens and reports to meet business data requirements
• Ensure consistent use of business data terminology
• Reporting services give business users the ability to execute both pre-developed and ad-hoc reports
• Analysis services give business users the ability to slice and dice data across multiple dimensions
• Dashboards display a wide array of analytics indicators, such as charts and graphs, efficiently
• Scorecards display information that indicates scores or calculated evaluations of performance
• Use data integrated from multiple databases as input to software for business process automation that coordinates multiple business processes across disparate platforms
• Data integration is a component of Enterprise Application Integration (EAI) software, enabling data to be easily passed from application to application across disparate platforms

Design Data Access Services

• May be necessary to access and combine data from remote databases with data in the local database
• Goal is to enable easy and inexpensive reuse of data across the organisation preventing, wherever possible, redundant and inconsistent data
• Options include
− Linked database connections
− SOA web services
− Message brokers
− Data access classes
− ETL
− Replication

Design Data Integration Services •

Critical aspect of database design is determining appropriate update mechanisms and database transactions for recovery



Define source-to-target mappings and data transformation designs for extract-transform-load (ETL) programs and other technology for ongoing data movement, cleansing and integration



Design programs and utilities for data migration and conversion from old data structures to new data structures
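
A source-to-target mapping can be held as data and applied by a small transform step. This is a minimal sketch: the legacy field names, target column names and transformations are all hypothetical.

```python
# Source-to-target mapping: target column -> (source field, transformation).
# All field names and transformations are hypothetical examples.
MAPPING = {
    "customer_name": ("CUST_NM", str.strip),
    "country_code": ("CNTRY", str.upper),
    "credit_limit": ("CR_LIM", float),
}


def transform(source_row: dict) -> dict:
    """Apply the mapping to one source record and return the target record."""
    return {
        target: convert(source_row[source])
        for target, (source, convert) in MAPPING.items()
    }


legacy_rows = [{"CUST_NM": "  Acme Ltd ", "CNTRY": "ie", "CR_LIM": "2500"}]
target_rows = [transform(r) for r in legacy_rows]
```

Holding the mapping as a table rather than burying it in code lets it double as design documentation that data stewards can review.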


Data Model and Design Quality Management •

Balance the needs of information consumers (the people with business requirements for data) and the data producers who capture the data in usable form



Take time and budget constraints into account



Ensure data resides in data structures that are secure, recoverable, sharable, and reusable, and that this data is as correct, timely, relevant, and usable as possible



Balance the short-term versus long-term business data interests of the organisation


Develop Data Modeling and Design Standards

• Data modeling and database design standards serve as the guiding principles to effectively meet business data needs, conform to data architecture, and ensure data quality
• Data modeling and database design standards should include
− A list and description of standard data modeling and database design deliverables
− A list of standard names, acceptable abbreviations, and abbreviation rules for uncommon words, that apply to all data model objects
− A list of standard naming formats for all data model objects, including attribute and column class words
− A list and description of standard methods for creating and maintaining these deliverables
− A list and description of data modeling and database design roles and responsibilities
− A list and description of all metadata properties captured in data modeling and database design, including both business metadata and technical metadata, with guidelines defining metadata quality expectations and requirements
− Guidelines for how to use data modeling tools
− Guidelines for preparing for and leading design reviews
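
Naming standards are easier to follow when they can be checked automatically. The sketch below assumes a hypothetical standard (lower snake case names ending in an approved class word); the pattern and the class word list are illustrative, not taken from any published standard.

```python
import re

# Hypothetical standard: lower_snake_case names ending in an approved class word.
CLASS_WORDS = {"id", "name", "code", "date", "amount", "flag"}
NAME_PATTERN = re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)*$")


def naming_violations(columns):
    """Return (column, reason) pairs for names that break the assumed standard."""
    problems = []
    for col in columns:
        if not NAME_PATTERN.match(col):
            problems.append((col, "not lower_snake_case"))
        elif col.rsplit("_", 1)[-1] not in CLASS_WORDS:
            problems.append((col, "missing class word suffix"))
    return problems


violations = naming_violations(["customer_id", "OrderDate", "order_total"])
```

A check like this could run as part of a design review so that reviewers spend their time on meaning rather than on spelling conventions.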


Review Data Model and Database Design Quality •

Conduct requirements reviews and design reviews, including a conceptual data model review, a logical data model review, and a physical database design review


Conceptual and Logical Data Model Reviews

• Conceptual data model and logical data model design reviews should ensure that:
− Business data requirements are completely captured and clearly expressed in the model, including the business rules governing entity relationships
− Business (logical) names and business definitions for entities and attributes (business semantics) are clear, practical, consistent, and complementary
− Data modeling standards, including naming standards, have been followed
− The conceptual and logical data models have been validated

Physical Database Design Review

• Physical database design reviews should ensure that:
− The design meets business, technology, usage, and performance requirements
− Database design standards, including naming and abbreviation standards, have been followed
− Availability, recovery, archiving, and purging procedures are defined according to standards
− Metadata quality expectations and requirements are met in order to properly update any metadata repository
− The physical data model has been validated

Data Model Validation •

Validate data models against modeling standards, business requirements, and database requirements



Ensure the model matches applicable modeling standards



Ensure the model matches the business requirements



Ensure the model matches the database requirements


Manage Data Model Versioning and Integration

• Data models and other design specifications require change control
• Each change should record
− Why the project or situation required the change
− What and how the object(s) changed, including which tables had columns added, modified, or removed, etc.
− When the change was approved and when the change was made to the model
− Who made the change
− Where the change was made
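
The why / what / when / who / where items map naturally onto a change log record. The sketch below is one possible shape: the field names, the sample values and the check that a change is not applied before approval are assumptions added for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ModelChange:
    """One change-control entry for a data model; field names are illustrative."""
    why: str            # reason the project or situation required the change
    what: str           # objects changed and how
    approved_on: date   # when the change was approved
    applied_on: date    # when the change was made to the model
    who: str            # person who made the change
    where: str          # model or file in which the change was made

    def __post_init__(self):
        # Assumed policy: a change cannot be applied before it is approved.
        if self.applied_on < self.approved_on:
            raise ValueError("change applied before it was approved")


change = ModelChange(
    why="New regulatory reporting requirement",
    what="Added column tax_region_code to table customer",
    approved_on=date(2010, 3, 1),
    applied_on=date(2010, 3, 8),
    who="data.modeler",
    where="customer subject area model",
)
```

Making the record immutable means the history of a model can be appended to but not silently rewritten.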

March 8, 2010

151
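A change-control entry covering the why, what, when, who, and where listed above could be captured in a simple record. The following is a minimal sketch only: the class name, field names, validity rule, and sample values are hypothetical and are not taken from the DMBOK.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ModelChange:
    """One change-control entry for a data model (illustrative field names)."""
    why: str            # reason the project or situation required the change
    what: str           # objects changed, e.g. columns added, modified, removed
    approved_on: date   # when the change was approved
    made_on: date       # when the change was made to the model
    who: str            # person who made the change
    where: str          # model or file in which the change was made

    def is_valid(self) -> bool:
        # Assumed rule: a change should not be made before it is approved.
        return self.made_on >= self.approved_on


change = ModelChange(
    why="New regulatory reporting requirement",
    what="Added column tax_code to table customer",
    approved_on=date(2010, 3, 1),
    made_on=date(2010, 3, 8),
    who="data.modeller",
    where="logical model, customer subject area",
)
print(change.is_valid())  # True
```

Keeping the record immutable (`frozen=True`) reflects the idea that a change log is append-only; corrections would be logged as new entries.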

Data Implementation •

Data implementation consists of data management activities that support system building, testing, and deployment
− Database implementation and change management in the development and test environments
− Test data creation, including any security procedures
− Development of data migration and conversion programs, both for project development through the SDLC and for business situations
− Validation of data quality requirements
− Creation and delivery of user training
− Contribution to the development of effective documentation

Implement Development / Test Database Changes •

Implement changes to the database that are required during the course of application development



Monitor database code to ensure that it is written to the same standards as application code



Identify poor SQL coding practices that could lead to errors or performance problems


Create and Maintain Test Data •

Populate databases in the development environment with test data



Observe privacy and confidentiality requirements and practices for test data
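One common way to observe privacy requirements for test data is to mask identifying fields before production-derived rows are loaded into a development database. The sketch below is an illustration under stated assumptions: the field names, the `mask_value` and `mask_row` helpers, and the salt are hypothetical, and salted hashing on its own may not satisfy a particular regulation.

```python
import hashlib

SALT = "dev-environment-salt"  # hypothetical; keep real salts out of source code


def mask_value(value: str) -> str:
    """Replace a sensitive value with a stable, non-reversible token."""
    digest = hashlib.sha256((SALT + value).encode("utf-8")).hexdigest()
    return "masked_" + digest[:12]


def mask_row(row: dict, sensitive_fields: set) -> dict:
    """Return a copy of the row with the sensitive fields masked."""
    return {
        key: mask_value(str(val)) if key in sensitive_fields else val
        for key, val in row.items()
    }


production_row = {"id": 42, "name": "Jane Example", "email": "jane@example.com"}
test_row = mask_row(production_row, {"name", "email"})
print(test_row["id"], test_row["name"].startswith("masked_"))
```

Because the same input always yields the same token, joins between masked tables still line up, which is usually what test data needs.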


Migrate and Convert Data •

Key component of many projects is the migration of legacy data to a new database environment, including any necessary data cleansing and reformatting


Build and Test Information Products •

Implement mechanisms for integrating data from multiple sources, along with the appropriate metadata to ensure meaningful integration of the data



Implement mechanisms for reporting and analysing the data, including online and web-based reporting, ad-hoc querying, BI scorecards, OLAP, portals, and the like



Implement mechanisms for replication of the data, if network latency or other concerns make it impractical to service all users from a single data source


Build and Test Data Access Services •

Develop, test, and execute data migration and conversion programs and procedures, first for development and test data and later for production deployment



Data requirements should include business rules for data quality to guide the implementation of application edits and database referential integrity constraints



Business data stewards and other subject matter experts should validate the correct implementation of data requirements through user acceptance testing


Validate Information Requirements •

Test and validate that the solution meets the requirements, plan deployment, and develop training and documentation



Data requirements may change abruptly, in response to either changed business requirements, invalid assumptions regarding the data or reprioritisation of existing requirements



Test the implementation of the data requirements and ensure that the application requirements are satisfied


Prepare for Data Deployment

• Leverage the business knowledge captured in data modeling to define clear and consistent language in user training and documentation
• Business concepts, terminology, definitions, and rules depicted in data models are an important part of application user training
• Data stewards and data analysts should participate in deployment preparation, including development and review of training materials and system documentation, especially to ensure consistent use of defined business data terminology
• Help desk support staff also require orientation and training in how system users appropriately access, manipulate, and interpret data
• Once installed, business data stewards and data analysts should monitor the early use of the system to see that business data requirements are indeed met

Data Operations Management


Data Operations Management •

Data operations management is the development, maintenance, and support of structured data to maximise the value of the data resources to the enterprise and includes
− Database support
− Data technology management

Data Operations Management – Definition and Goals •

Definition
− Planning, control, and support for structured data assets across the data lifecycle, from creation and acquisition through archival and purge

Goals
− Protect and ensure the integrity of structured data assets
− Manage the availability of data throughout its lifecycle
− Optimise performance of database transactions

Data Operations Management - Overview

Inputs •Data Requirements •Data Architecture •Data Models •Legacy Data •Service Level Agreements

Suppliers •Executives •IT Steering Committee •Data Governance Council •Data Stewards •Data Architects and Modelers •Software Developers

Participants •Database Administrators •Software Developers •Project Managers •Data Stewards •Data Architects and Analysts •DM Executives and Other IT Management •IT Operators

Tools •Database Management Systems •Data Development Tools •Database Administration Tools •Office Productivity Tools

Primary Deliverables •DBMS Technical Environments •Dev/Test, QA, DR, and Production Databases •Externally Sourced Data •Database Performance •Data Recovery Plans •Business Continuity •Data Retention Plan •Archived and Purged Data

Consumers •Data Creators •Information Consumers •Enterprise Customers •Data Professionals •Other IT Professionals

Metrics •Availability •Performance

Data Operations Management Function, Activities and Sub-Activities

Data Operations Management

Database Support
• Implement and Control Database Environments
• Obtain Externally Sourced Data
• Plan for Data Recovery
• Backup and Recover Data
• Set Database Performance Service Levels
• Monitor and Tune Database Performance
• Plan for Data Retention
• Archive, Retain, and Purge Data
• Support Specialised Databases

Data Technology Management
• Understand Data Technology Requirements
• Define the Data Technology Architecture
• Evaluate Data Technology
• Install and Administer Data Technology
• Inventory and Track Data Technology Licenses
• Support Data Technology Usage and Issues

Data Operations Management - Principles

• Write everything down
• Keep everything
• Whenever possible, automate a procedure
• Focus to understand the purpose of each task, manage scope, simplify, do one thing at a time
• Measure twice, cut once
• React to problems and issues calmly and rationally, because panic causes more errors
• Understand the business, not just the technology
• Work together to collaborate, be accessible, share knowledge
• Use all of the resources at your disposal
• Keep up to date

Database Support - Scope •

Ensure the performance and reliability of the database, including performance tuning, monitoring, and error reporting



Implement appropriate backup and recovery mechanisms to guarantee the recoverability of the data in any circumstance



Implement mechanisms for clustering and failover of the database, if continual data availability is a requirement



Implement mechanisms for archiving data

Database Support - Deliverables •

A production database environment, including an instance of the DBMS and its supporting server, of a sufficient size and capacity to ensure adequate performance, configured for the appropriate level of security, reliability and availability



Mechanisms and processes for controlled implementation and changes to databases into the production environment



Appropriate mechanisms for ensuring the availability, integrity, and recoverability of the data in response to all possible circumstances that could result in loss or corruption of data



Appropriate mechanisms for detecting and reporting any error that occurs in the database, the DBMS, or the data server



Database availability, recovery, and performance in accordance with service level agreements

Implement and Control Database Environments •

Updating DBMS software



Maintaining multiple installations, including different DBMS versions



Installing and administering related data technology, including data integration software and third party data administration tools



Setting and tuning DBMS system parameters



Managing database connectivity



Tune operating systems, networks, and transaction processing middleware to work with the DBMS



Optimise the use of different storage technology for cost-effective storage


Obtain Externally Sourced Data •

Managed approach to data acquisition centralises responsibility for data subscription services



Document the external data source in the logical data model and data dictionary



Implement the necessary processes to load the data into the database and/or make it available to applications


Plan for Data Recovery •

Establish service level agreements (SLAs) with IT data management services organisations for data availability and recovery



SLAs set availability expectations, allowing time for database maintenance and backup, and set recovery time expectations for different recovery scenarios, including potential disasters



Ensure a recovery plan exists for all databases and database servers, covering all possible scenarios
− Loss of the physical database server
− Loss of one or more disk storage devices
− Loss of a database, including the DBMS master database, temporary storage database, transaction log segment, etc.
− Corruption of database index or data pages
− Loss of the database or log segment file system
− Loss of database or transaction log backup files

Backup and Recover Data •

Make regular backups of database and the database transaction logs



Balance the importance of the data against the cost of protecting it



Databases should reside on some sort of managed storage area



For critical data, implement some sort of replication facility


Set Database Performance Service Levels

• Database performance has two components - availability and performance
• An unavailable database has a performance measure of zero
• SLAs between data management services organisations and data owners define expectations for database performance
• Availability is the percentage of time that a system or database can be used for productive work
• Availability requirements are constantly increasing, raising the business risks and costs of unavailable data
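Availability as a percentage of usable time can be computed directly from the length of the measurement period and the recorded downtime. This is a minimal sketch; the function names, the 90 minutes of downtime, and the 99.9% target are assumptions chosen for illustration, not values from the source.

```python
def availability_pct(period_minutes: float, downtime_minutes: float) -> float:
    """Percentage of the period in which the database was usable."""
    if period_minutes <= 0:
        raise ValueError("period_minutes must be positive")
    if not 0 <= downtime_minutes <= period_minutes:
        raise ValueError("downtime_minutes must lie within the period")
    return 100.0 * (period_minutes - downtime_minutes) / period_minutes


def meets_sla(period_minutes: float, downtime_minutes: float, target_pct: float) -> bool:
    """Compare measured availability with the SLA target."""
    return availability_pct(period_minutes, downtime_minutes) >= target_pct


# A 30-day month has 30 * 24 * 60 = 43,200 minutes.
month = 30 * 24 * 60
print(round(availability_pct(month, 90), 3))  # 99.792
print(meets_sla(month, 90, target_pct=99.9))  # False
```

Whether planned maintenance windows count as downtime is an SLA decision; if they are excluded, subtract them from both the period and the downtime before calling the function.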

Set Database Performance Service Levels •

Factors affecting availability include
− Manageability - ability to create and maintain an effective environment
− Recoverability - ability to reestablish service after interruption, and correct errors caused by unforeseen events or component failures
− Reliability - ability to deliver service at specified levels for a stated period
− Serviceability - ability to determine the existence of problems, diagnose their causes, and repair / solve the problems

Tasks to ensure databases stay online and operational
− Running database backup utilities
− Running database reorganisation utilities
− Running statistics gathering utilities
− Running integrity checking utilities
− Automating the execution of these utilities
− Exploiting table space clustering and partitioning
− Replicating data across mirror databases to ensure high availability

Set Database Performance Service Levels •

Causes of loss of database availability
− Planned and unplanned outages
− Loss of the server hardware
− Disk hardware failure
− Operating system failure
− DBMS software failure
− Application problems
− Network failure
− Data center site loss
− Security and authorisation problems
− Corruption of data (due to bugs, poor design, or user error)
− Loss of database objects
− Loss of data
− Data replication failure
− Severe performance problems
− Recovery failures
− Human error

Monitor and Tune Database Performance

• Optimise database performance both proactively and reactively, by monitoring performance and by responding to problems quickly and effectively
• Run activity and performance reports against both the DBMS and the server on a regular basis including during periods of heavy activity
• When performance problems occur, use the monitoring and administration tools of the DBMS to help identify the source of the problem
− Memory allocation (buffer / cache for data)
− Locking and blocking
− Failure to update database statistics
− Poor SQL coding
− Insufficient indexing
− Application activity
− Increase in the number, size, or use of databases
− Database volatility

Support Specialised Databases •

Some specialised situations require specialised types of databases


Data Technology Management •

Managing data technology should follow the same principles and standards for managing any technology



Use a reference model for technology management such as Information Technology Infrastructure Library (ITIL)


Understand Data Technology Requirements

• Understand the data and information needs of the business
• Understand the best possible applications of technology to solve business problems and take advantage of new business opportunities
• Understand the requirements of a data technology before determining what technical solution to choose for a particular situation
− What problem is this data technology meant to solve?
− What does this data technology do that is unavailable in other data technologies?
− What does this data technology not do that is available in other data technologies?
− Are there any specific hardware requirements for this data technology?
− Are there any specific Operating System requirements for this data technology?
− Are there any specific software requirements or additional applications required for this data technology to perform as advertised?
− Are there any specific storage requirements for this data technology?
− Are there any specific network or connectivity requirements for this data technology?
− Does this data technology include data security functionality? If not, what other tools does this technology work with that provide data security functionality?
− Are there any specific skills required to be able to support this data technology? Do we have those skills in-house or must we acquire them?

Define the Data Technology Architecture •

Data technology architecture addresses three core questions
− What technologies are standard (which are required, preferred, or acceptable)?
− Which technologies apply to which purposes and circumstances?
− In a distributed environment, which technologies exist where, and how does data move from one node to another?



Technology is never free - even open-source technology requires maintenance



Technology should always be regarded as the means to an end, rather than the end itself



Buying the same technology that everyone else is using, and using it in the same way, does not create business value or competitive advantage for the organisation

Define the Data Technology Architecture •

Technology categories include
− Database management systems (DBMS)
− Database management utilities
− Data modelling and model management tools
− Business intelligence software for reporting and analysis
− Extract-transform-load (ETL), changed data capture (CDC), and other data integration tools
− Data quality analysis and data cleansing tools
− Metadata management software, including metadata repositories

Define the Data Technology Architecture •

Classify technology architecture components as
− Current - currently supported and used
− Deployment - deployed for use in the next 1-2 years
− Strategic - expected to be available for use in the next 2+ years
− Retirement - the organisation has retired or intends to retire this year
− Preferred - preferred for use by most applications
− Containment - limited to use by certain applications
− Emerging - being researched and piloted for possible future deployment



Create a road map for the organisation consisting of these components to help govern future technology decisions

Evaluate Data Technology

• Selecting appropriate data related technology, particularly the appropriate database management technology, is an important data management responsibility
• Data technologies to be researched and evaluated include:
− Database management systems (DBMS) software
− Database utilities, such as backup and recovery tools, and performance monitors
− Data modeling and model management software
− Database management tools, such as editors, schema generators, and database object generators
− Business intelligence software for reporting and analysis
− Extract-transform-load (ETL) and other data integration tools
− Data quality analysis and data cleansing tools
− Data virtualisation technology
− Metadata management software, including metadata repositories

Evaluate Data Technology •

Use a standard technology evaluation process
− Understand user needs, objectives, and related requirements
− Understand the technology in general
− Identify available technology alternatives
− Identify the features required
− Weigh the importance of each feature
− Understand each technology alternative
− Evaluate and score each technology alternative’s ability to meet requirements
− Calculate total scores and rank technology alternatives by score
− Evaluate the results, including the weighted criteria
− Present the case for selecting the highest ranking alternative
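The weighting, scoring, and ranking steps of the evaluation process can be sketched as a weighted sum per alternative. The feature names, weights, scores, and product names below are invented for illustration, and the weighted-sum rule itself is an assumption about how the scores are combined.

```python
def rank_alternatives(weights: dict, scores: dict) -> list:
    """Return (alternative, weighted total) pairs, highest total first.

    weights: feature -> importance weight
    scores:  alternative -> {feature -> score for that feature}
    """
    totals = {
        name: sum(weights[feature] * feature_scores[feature] for feature in weights)
        for name, feature_scores in scores.items()
    }
    return sorted(totals.items(), key=lambda pair: pair[1], reverse=True)


# Hypothetical features, weights (1-5), and scores (0-10).
weights = {"scalability": 5, "tooling": 3, "cost": 4}
scores = {
    "Product A": {"scalability": 8, "tooling": 6, "cost": 5},
    "Product B": {"scalability": 6, "tooling": 9, "cost": 7},
}
for name, total in rank_alternatives(weights, scores):
    print(name, total)
# Product B 85
# Product A 78
```

The ranking is only as good as the weights, which is why the process asks for the weighted criteria to be reviewed before the top alternative is presented.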

Evaluate Data Technology

• Selecting strategic DBMS software is very important
• Factors to consider when selecting DBMS software include:
− Product architecture and complexity
− Application profile, such as transaction processing, business intelligence, and personal profiles
− Organisational appetite for technical risk
− Hardware platform and operating system support
− Availability of supporting software tools
− Performance benchmarks
− Scalability
− Software, memory, and storage requirements
− Available supply of trained technical professionals
− Cost of ownership, such as licensing, maintenance, and computing resources
− Vendor reputation
− Vendor support policy and release schedule
− Customer references

Install and Administer Data Technology •

Need to deploy new technology products in development / test, QA / certification, and production environments



Create and document processes and procedures for administering the product



Cost and complexity of implementing new technology is usually underestimated



Features and benefits are usually overestimated



Start with small pilot projects and proof-of-concept (POC) implementations to get a good idea of the true costs and benefits before proceeding with larger production implementation

Inventory and Track Data Technology Licenses •

Comply with licensing agreements and regulatory requirements



Track and conduct yearly audits of software license and annual support costs



Track other costs such as server lease agreements and other fixed costs



Use data to determine the total cost-of-ownership (TCO) for each type of technology and technology product



Evaluate technologies and products that are becoming obsolete, unsupported, less useful, or too expensive

Support Data Technology Usage and Issues •

Work with business users and application developers to
− Ensure the most effective use of the technology
− Explore new applications of the technology
− Address any problems or issues that surface from its use



Training is important to effective understanding and use of any technology


Data Security Management


Data Security Management •

Planning, development, and execution of security policies and procedures to provide proper authentication, authorisation, access, and auditing of data and information assets



Effective data security policies and procedures ensure that the right people can use and update data in the right way, and that all inappropriate access and update is restricted



Effective data security management function establishes governance mechanisms that are easy enough to abide by on a daily operational basis


Data Security Management – Definition and Goals •

Definition
− Planning, development, and execution of security policies and procedures to provide proper authentication, authorisation, access, and auditing of data and information

Goals
− Enable appropriate, and prevent inappropriate, access and change to data assets
− Meet regulatory requirements for privacy and confidentiality
− Ensure the privacy and confidentiality needs of all stakeholders are met

Data Security Management •

Protect information assets in alignment with privacy and confidentiality regulations and business requirements
− Stakeholder Concerns - organisations must recognise the privacy and confidentiality needs of their stakeholders, including clients, patients, students, citizens, suppliers, or business partners
− Government Regulations - government regulations protect some of the stakeholder security interests. Some regulations restrict access to information, while other regulations ensure openness, transparency, and accountability
− Proprietary Business Concerns - each organisation has its own proprietary data to protect - ensuring competitive advantage provided by intellectual property and intimate knowledge of customer needs and business partner relationships is a cornerstone in any business plan
− Legitimate Access Needs - data security implementers must also understand the legitimate needs for data access

Data Security Requirements and Procedures •

Data security requirements and the procedures to meet these requirements
− Authentication - validate users are who they say they are
− Authorisation - identify the right individuals and grant them the right privileges to specific, appropriate views of data
− Access - enable these individuals and their privileges in a timely manner
− Audit - review security actions and user activity to ensure compliance with regulations and conformance with policy and standards

Data Security Management - Overview

Inputs •Business Goals •Business Strategy •Business Rules •Business Process •Data Strategy •Data Privacy Issues •Related IT Policies and Standards

Suppliers •Data Stewards •IT Steering Committee •Data Stewardship Council •Government •Customers

Participants •Data Stewards •Data Security Administrators •Database Administrators •BI Analysts •Data Architects •DM Leader •CIO/CTO •Help Desk Analysts

Tools •Database Management System •Business Intelligence Tools •Application Frameworks •Identity Management Technologies •Change Control Systems

Primary Deliverables •Data Security Policies •Data Privacy and Confidentiality Standards •User Profiles, Passwords and Memberships •Data Security Permissions •Data Security Controls •Data Access Views •Document Classifications •Authentication and Access History •Data Security Audits

Consumers •Data Producers •Knowledge Workers •Managers •Executives •Customers •Data Professionals

Data Security Management Function, Activities and Sub-Activities

Data Security Management
• Understand Data Security Needs and Regulatory Requirements
− Business Requirements
− Regulatory Requirements
• Define Data Security Policy
• Define Data Security Standards
• Define Data Security Controls and Procedures
• Manage Users, Passwords, and Group Membership
− Password Standards and Procedures
• Manage Data Access Views and Permissions
• Monitor User Authentication and Access Behaviour
• Classify Information Confidentiality
• Audit Data Security

Data Security Management - Principles

• Be a responsible trustee of data about all parties. Understand and respect the privacy and confidentiality needs of all stakeholders, be they clients, patients, students, citizens, suppliers, or business partners
• Understand and comply with all pertinent regulations and guidelines
• Data-to-process and data-to-role relationship (CRUD - Create, Read, Update, Delete) matrices help map data access needs and guide definition of data security role groups, parameters, and permissions
• Definition of data security requirements and data security policy is a collaborative effort involving IT security administrators, data stewards, internal and external audit teams, and the legal department
• Identify detailed application security requirements in the analysis phase of every systems development project
• Classify all enterprise data and information products against a simple confidentiality classification schema
• Every user account should have a password set by the user following a set of password complexity guidelines, and expiring every 45 to 60 days
• Create role groups; define privileges by role; and grant privileges to users by assigning them to the appropriate role group. Whenever possible, assign each user to only one role group
• Some level of management must formally request, track, and approve all initial authorisations and subsequent changes to user and group authorisations
• To avoid data integrity issues with security access information, centrally manage user identity data and group membership data
• Use relational database views to restrict access to sensitive columns and / or specific rows
• Strictly limit and carefully consider every use of shared or service user accounts
• Monitor data access to certain information actively, and take periodic snapshots of data access activity to understand trends and compare against standards criteria
• Periodically conduct objective, independent, data security audits to verify regulatory compliance and standards conformance, and to analyse the effectiveness and maturity of data security policy and practice
• In an outsourced environment, be sure to clearly define the roles and responsibilities for data security and understand the chain of custody of data across organisations and roles
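The principle of using relational database views to restrict access to sensitive columns and rows can be illustrated with an in-memory SQLite database. This is a sketch under stated assumptions: the `employee` table, the `sales_directory` view, and the sample rows are invented. SQLite has no user permissions, so in a server DBMS the view would additionally be paired with grants on the view rather than on the base table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE employee (
        id INTEGER PRIMARY KEY,
        name TEXT,
        department TEXT,
        salary INTEGER          -- sensitive column
    );
    INSERT INTO employee VALUES (1, 'Ann', 'Sales', 50000);
    INSERT INTO employee VALUES (2, 'Bob', 'HR', 60000);

    -- The view hides the salary column and exposes only Sales rows.
    CREATE VIEW sales_directory AS
        SELECT id, name, department
        FROM employee
        WHERE department = 'Sales';
    """
)

cursor = conn.execute("SELECT * FROM sales_directory")
columns = [description[0] for description in cursor.description]
rows = cursor.fetchall()
print(columns)  # ['id', 'name', 'department']
print(rows)     # [(1, 'Ann', 'Sales')]
```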

Understand Data Security Needs and Regulatory Requirements •

Distinguish between business rules and procedures and the rules imposed by application software products



Common for systems to have their own unique set of data security requirements over and above those required by business processes

Business Requirements •

Implementing data security within an enterprise requires an understanding of business requirements



Business needs of an enterprise define the degree of rigidity required for data security



Business rules and processes define the security touch points



Data-to-process and data-to-role relationship matrices are useful tools to map these needs and guide definition of data security role-groups, parameters, and permissions



Identify detailed application security requirements in the analysis phase of every systems development project
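A data-to-role relationship matrix of the kind described above can be held as a small lookup table and queried when defining role groups and permissions. The roles, entities, and permission strings below are hypothetical examples, not content from the source.

```python
# Rows are roles, columns are data entities; each cell lists allowed operations
# using the letters C (create), R (read), U (update), D (delete).
CRUD_MATRIX = {
    "sales_rep":     {"customer": "CRU",  "order": "CRU",  "salary": ""},
    "hr_officer":    {"customer": "",     "order": "",     "salary": "CRUD"},
    "sales_manager": {"customer": "CRUD", "order": "CRUD", "salary": ""},
}


def is_allowed(role: str, entity: str, operation: str) -> bool:
    """Check one cell of the matrix; unknown roles or entities get no access."""
    cell = CRUD_MATRIX.get(role, {}).get(entity, "")
    return len(operation) == 1 and operation in cell


def roles_with_access(entity: str, operation: str) -> list:
    """List the roles that may perform the operation on the entity."""
    return sorted(r for r in CRUD_MATRIX if is_allowed(r, entity, operation))


print(is_allowed("sales_rep", "customer", "D"))  # False
print(roles_with_access("order", "U"))           # ['sales_manager', 'sales_rep']
```

Defaulting to no access for anything missing from the matrix keeps the sketch consistent with granting access by opt-in.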

Regulatory Requirements •

Organisations must comply with a growing set of regulations



Some regulations impose security controls on information management


Define Data Security Policy •

Definition of data security policy based on data security requirements is a collaborative effort involving IT security administrators, data stewards, internal and external audit teams, and the legal department



Enterprise IT strategy and standards typically dictate high-level policies for access to enterprise data assets



Data security policies are more granular in nature and take a very data-centric approach compared to an IT security policy


Define Data Security Standards

• No one prescribed way of implementing data security to meet privacy and confidentiality requirements
• Regulations generally focus on achieving an end without defining the means for achieving it
• Organisations should design their own security controls, demonstrate that the controls meet the requirements of the law or regulations and document the implementation of those controls
• Information technology security standards can also affect
− Tools used to manage data security
− Data encryption standards and mechanisms
− Access guidelines to external vendors and contractors
− Data transmission protocols over the internet
− Documentation requirements
− Remote access standards
− Security breach incident reporting procedures

Define Data Security Standards

• Consider physical security, especially with the explosion of portable devices and media, to formulate an effective data security strategy
− Access to data using mobile devices
− Storage of data on portable devices such as laptops, DVDs, CDs or USB drives
− Disposal of these devices in compliance with records management policies
• An organisation should develop a practical, implementable security policy including data security guiding principles
• Focus should be on quality and consistency, not creating a lengthy body of guidelines
• Execution of the policy requires satisfying the elements of securing information assets: authentication, authorisation, access, and audit
• Information classification, access rights, role groups, users, and passwords are the means to implementing policy and satisfying these elements

Define Data Security Controls and Procedures •

Implementation and administration of data security policy is primarily the responsibility of security administrators



Database security is often one responsibility of database administrators



Implement proper controls to meet the objectives of relevant laws



Implement a process to validate assigned permissions against a change management system used for tracking all user permission requests
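Validating assigned permissions against the change management system amounts to comparing two sets. The sketch below assumes each permission can be represented as a (user, object, privilege) tuple; the function names and sample data are hypothetical.

```python
def find_unapproved(assigned: set, approved: set) -> set:
    """Permissions present in the database with no matching approved request."""
    return assigned - approved


def find_unapplied(assigned: set, approved: set) -> set:
    """Approved requests that have not been applied yet."""
    return approved - assigned


# Each permission is a (user, object, privilege) tuple; sample data is invented.
assigned = {("ann", "orders", "SELECT"), ("bob", "orders", "UPDATE")}
approved = {("ann", "orders", "SELECT"), ("cat", "orders", "SELECT")}

print(find_unapproved(assigned, approved))  # {('bob', 'orders', 'UPDATE')}
print(find_unapplied(assigned, approved))   # {('cat', 'orders', 'SELECT')}
```

In practice the `assigned` set would be read from the DBMS catalogue and the `approved` set from the change management system; both sources are outside the scope of this sketch.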


Manage Users, Passwords, and Group Membership •

Role groups enable security administrators to define privileges by role and to grant these privileges to users by enrolling them in the appropriate role group



Data consistency in user and group management is a challenge in a mixed IT environment



Construct group definitions at a workgroup or business unit level



Organise roles in a hierarchy, so that child roles further restrict the privileges of parent roles
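A role hierarchy in which child roles can only narrow the privileges of their parents can be modelled by intersecting privilege sets up the chain. This is one possible interpretation, offered as a sketch; the `Role` class, role names, and privilege names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Role:
    name: str
    allowed: frozenset                 # privileges this role asks for
    parent: Optional["Role"] = None

    def effective_privileges(self) -> frozenset:
        """A child can only narrow what its parent already holds."""
        if self.parent is None:
            return self.allowed
        return self.allowed & self.parent.effective_privileges()


finance = Role(
    "finance",
    frozenset({"read_ledger", "post_journal", "approve_payment"}),
)
clerk = Role(
    "finance_clerk",
    frozenset({"read_ledger", "post_journal", "drop_table"}),
    parent=finance,
)

print(sorted(clerk.effective_privileges()))  # ['post_journal', 'read_ledger']
```

The `drop_table` privilege requested by the child is discarded because the parent does not hold it, so a misconfigured child role cannot exceed its parent.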


Password Standards and Procedures •

Passwords are the first line of defense in protecting access to data



Every user account should be required to have a password set by the user with a sufficient level of password complexity defined in the security standards
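A password standard of this kind can be checked mechanically. The sketch below is illustrative only: the minimum length and character-class rules are assumed values that a real security standard would define, and the 60-day maximum age is the upper end of the 45 to 60 day expiry range given in the data security principles.

```python
from datetime import date, timedelta

MIN_LENGTH = 12     # assumed value; take the real one from the security standard
MAX_AGE_DAYS = 60   # upper end of the 45 to 60 day expiry range


def is_complex(password: str) -> bool:
    """Require minimum length plus upper case, lower case, digit, and symbol."""
    return (
        len(password) >= MIN_LENGTH
        and any(c.isupper() for c in password)
        and any(c.islower() for c in password)
        and any(c.isdigit() for c in password)
        and any(not c.isalnum() for c in password)
    )


def is_expired(last_changed: date, today: date) -> bool:
    """True when the password is older than the maximum allowed age."""
    return today - last_changed > timedelta(days=MAX_AGE_DAYS)


print(is_complex("password"))                          # False
print(is_complex("Correct-Horse-7-Battery"))           # True
print(is_expired(date(2010, 1, 1), date(2010, 3, 8)))  # True
```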


Manage Data Access Views and Permissions •

Data security management involves not just preventing inappropriate access, but also enabling valid and appropriate access to data



Most sets of data do not have any restricted access requirements



Control sensitive data access by granting permissions - opt-in



Access control degrades when achieved through shared or service accounts
− Implemented as convenience for administrators, these accounts often come with enhanced privileges and are untraceable to any particular user or administrator
− Enterprises using shared or service accounts run the risk of data security breaches
− Evaluate use of such accounts carefully, and never use them frequently or by default

Monitor User Authentication and Access Behaviour •

Monitoring authentication and access behaviour is critical because: − It provides information about who is connecting and accessing information assets, which is a basic requirement for compliance auditing − It alerts security administrators to unforeseen situations, compensating for oversights in data security planning, design, and implementation

Monitoring helps detect unusual or suspicious transactions that may warrant further investigation and issue resolution • Perform monitoring either actively or passively • Automated systems with human checks and balances in place best accomplish both methods •


Classify Information Confidentiality

• Classify an organisation’s data and information using a simple confidentiality classification schema
• Most organisations classify the level of confidentiality for information found within documents, including reports
• A typical classification schema might include the following five confidentiality classification levels:
− For General Audiences: Information available to anyone, including the general public
− Internal Use Only: Information limited to employees or members, but with minimal risk if shared
− Confidential: Information which should not be shared outside the organisation. Client Confidential information may not be shared with other clients
− Restricted Confidential: Information limited to individuals performing certain roles with the need to know
− Registered Confidential: Information so confidential that anyone accessing the information must sign a legal agreement to access the data and assume responsibility for its secrecy
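
The five levels can be modelled as an ordered enumeration. In the sketch below, the rule that a document is as confidential as its most sensitive part is an assumption made for illustration, and the function names are hypothetical.

```python
from enum import IntEnum


class Confidentiality(IntEnum):
    """The five levels from the schema above, ordered from least to most restricted."""
    GENERAL_AUDIENCES = 1
    INTERNAL_USE_ONLY = 2
    CONFIDENTIAL = 3
    RESTRICTED_CONFIDENTIAL = 4
    REGISTERED_CONFIDENTIAL = 5


def classify_document(section_levels):
    """Assumed rule: a document takes the level of its most sensitive section."""
    return max(section_levels)


def may_share_externally(level):
    """Only information for general audiences may leave the organisation."""
    return level == Confidentiality.GENERAL_AUDIENCES
```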


Audit Data Security •

Auditing data security is a recurring control activity with responsibility to analyse, validate, counsel, and recommend policies, standards, and activities related to data security management



Auditing is a managerial activity performed with the help of analysts working on the actual implementation and details



The goal of auditing is to provide management and the data governance council with objective, unbiased assessments, and rational, practical recommendations



Auditing data security is no substitute for effective management of data security



Auditing is a supportive, repeatable process, which should occur regularly, efficiently, and consistently

Audit Data Security •

Auditing data security includes
− Analysing data security policy and standards against best practices and needs
− Analysing implementation procedures and actual practices to ensure consistency with data security goals, policies, standards, guidelines, and desired outcomes
− Assessing whether existing standards and procedures are adequate and in alignment with business and technology requirements
− Verifying the organisation is in compliance with regulatory requirements
− Reviewing the reliability and accuracy of data security audit data
− Evaluating escalation procedures and notification mechanisms in the event of a data security breach
− Reviewing contracts, data sharing agreements, and data security obligations of outsourced and external vendors, ensuring they meet their obligations, and ensuring the organisation meets its obligations for externally sourced data
− Reporting to senior management, data stewards, and other stakeholders on the state of data security within the organisation and the maturity of its practices
− Recommending data security design, operational, and compliance improvements

Data Security and Outsourcing

• Outsourcing IT operations introduces additional data security challenges and responsibilities
• Outsourcing increases the number of people who share accountability for data across organisational and geographic boundaries
• Previously informal roles and responsibilities must now be explicitly defined as contractual obligations
• Outsourcing contracts must specify the responsibilities and expectations of each role
• Any form of outsourcing increases risk to the organisation
• Data security risk is escalated to include the outsource vendor, so any data security measures and processes must look at the risk from the outsource vendor not only as an external risk, but also as an internal risk

Data Security and Outsourcing •

Transferring control, but not accountability, requires tighter risk management and control mechanisms:
− Service level agreements
− Limited liability provisions in the outsourcing contract
− Right-to-audit clauses in the contract
− Clearly defined consequences to breaching contractual obligations
− Frequent data security reports from the service vendor
− Independent monitoring of vendor system activity
− More frequent and thorough data security auditing
− Constant communication with the service vendor

In an outsourced environment, it is important to maintain and track the lineage, or flow, of data across systems and individuals to maintain a chain of custody

Reference and Master Data Management


Reference and Master Data Management •

Reference and Master Data Management is the ongoing reconciliation and maintenance of reference data and master data
− Reference Data Management is control over defined domain values (also known as vocabularies), including control over standardised terms, code values and other unique identifiers, business definitions for each value, business relationships within and across domain value lists, and the consistent, shared use of accurate, timely and relevant reference data values to classify and categorise data
− Master Data Management is control over master data values to enable consistent, shared, contextual use across systems, of the most accurate, timely, and relevant version of truth about essential business entities



Reference data and master data provide the context for transaction data


Reference and Master Data Management – Definition and Goals •

Definition
− Planning, implementation, and control activities to ensure consistency with a golden version of contextual data values

Goals
− Provide authoritative source of reconciled, high-quality master and reference data
− Lower cost and complexity through reuse and leverage of standards
− Support business intelligence and information integration efforts

Reference and Master Data Management - Overview

Inputs
• Business Drivers
• Data Requirements
• Policy and Regulations
• Standards
• Code Sets
• Master Data
• Transactional Data

Suppliers
• Steering Committees
• Business Data Stewards
• Subject Matter Experts
• Data Consumers
• Standards Organisations
• Data Providers

Participants
• Data Stewards
• Subject Matter Experts
• Data Architects
• Data Analysts
• Application Architects
• Data Governance Council
• Data Providers
• Other IT Professionals

Tools
• Reference Data Management Applications
• Master Data Management Applications
• Data Modeling Tools
• Process Modeling Tools
• Metadata Repositories
• Data Profiling Tools
• Data Cleansing Tools
• Data Integration Tools
• Business Process and Rule Engines
• Change Management Tools

Primary Deliverables
• Master and Reference Data Requirements
• Data Models and Documentation
• Reliable Reference and Master Data
• Golden Record Data Lineage
• Data Quality Metrics and Reports
• Data Cleansing Services

Consumers
• Application Users
• BI and Reporting Users
• Application Developers and Architects
• Data Integration Developers and Architects
• BI Developers and Architects
• Vendors, Customers, and Partners

Metrics
• Reference and Master Data Quality
• Change Activity
• Issues, Costs, Volume
• Use and Re-Use
• Availability
• Data Steward Coverage

Reference and Master Data Management Function, Activities and Sub-Activities

Reference and Master Data Management

Reference Data

Master Data

Understand Reference and Master Data Integration Needs

Identify Reference and Master Data Sources and Contributors

Define and Maintain the Data Integration Architecture

Implement Reference and Master Data Management Solutions

Define and Maintain Match Rules

Establish Golden Records

Define and Maintain Hierarchies and Affiliations

Party Master Data

Vocabulary Management and Reference Data

Financial Master Data

Defining Golden Master Data Values

Plan and Implement Integration of New Data Sources

Replicate and Distribute Reference and Master Data

Manage Changes to Reference and Master Data

Product Master Data

Location Master Data


Reference and Master Data Management Principles

• Shared reference and master data belongs to the organisation, not to a particular application or department
• Reference and master data management is an on-going data quality improvement program; its goals cannot be achieved by one project alone
• Business data stewards are the authorities accountable for controlling reference data values. Business data stewards work with data professionals to improve the quality of reference and master data
• Golden data values represent the organisation’s best efforts at determining the most accurate, current, and relevant data values for contextual use. New data may prove earlier assumptions to be false. Therefore, apply matching rules with caution, and ensure that any changes that are made are reversible
• Replicate master data values only from the database of record
• Request, communicate, and, in some cases, approve changes to reference data values before implementation

Reference Data •

Reference data is data used to classify or categorise other data



Business rules usually dictate that reference data values conform to one of several allowed values



In all organisations, reference data exists in virtually every database



Reference tables link via foreign keys into other relational database tables, and the referential integrity functions within the database management system ensure only valid values from the reference tables are used in other tables
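
The foreign key mechanism can be shown with a small in-memory SQLite database. The table names, column names, and country codes below are hypothetical; the sketch only illustrates that the database rejects a value missing from the reference table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Reference table: the list of allowed values.
conn.execute("CREATE TABLE country_code (code TEXT PRIMARY KEY, name TEXT NOT NULL)")
# Table whose rows are classified by the reference data.
conn.execute(
    "CREATE TABLE customer ("
    " id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " country TEXT NOT NULL REFERENCES country_code(code))"
)
conn.executemany(
    "INSERT INTO country_code VALUES (?, ?)",
    [("IE", "Ireland"), ("FR", "France")],
)
conn.execute("INSERT INTO customer (name, country) VALUES (?, ?)", ("Acme Ltd", "IE"))


def insert_customer(name, country):
    """Return True if the row was accepted, False if referential integrity rejected it."""
    try:
        conn.execute("INSERT INTO customer (name, country) VALUES (?, ?)", (name, country))
        return True
    except sqlite3.IntegrityError:
        return False
```

Calling `insert_customer("Initech", "XX")` returns `False` because `XX` is not present in `country_code`, so the invalid value never reaches the `customer` table.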


Master Data •

Master data is data about the business entities that provide context for business transactions



Master data is the authoritative, most accurate data available about key business entities, used to establish the context for transactional data



Master data values are considered golden



Master Data Management is the process of defining and maintaining how master data will be created, integrated, maintained, and used throughout the enterprise


Master Data Challenges

• What are the important roles, organisations, places, and things referenced repeatedly?
• What data is describing the same person, organisation, place, or thing?
• Where is this data stored? What is the source for the data?
• Which data is more accurate? Which data source is more reliable and credible? Which data is most current?
• What data is relevant for specific needs? How do these needs overlap or conflict?
• What data from multiple sources can be integrated to create a more complete view and provide a more comprehensive understanding of the person, organisation, place or thing?
• What business rules can be established to automate master data quality improvement by accurately matching and merging data about the same person, organisation, place, or thing?
• How do we identify and restore data that was inappropriately matched and merged?
• How do we provide our golden data values to other systems across the enterprise?
• How do we identify where and when data other than the golden values is used?

Party Master Data •

Includes data about individuals, organisations, and the roles they play in business relationships



Customer relationship management (CRM) systems perform MDM for customer data (also called Customer Data Integration (CDI))



Focus is to provide the most complete and accurate information about each and every customer



Need to identify duplicate, redundant and conflicting data



Party master data issues
− Complexity of roles and relationships played by individuals and organisations
− Difficulties in unique identification
− High number of data sources
− Business importance and potential impact of the data

Financial Master Data •

Includes data about business units, cost centers, profit centers, general ledger accounts, budgets, projections, and projects



Financial MDM solutions focus on not only creating, maintaining, and sharing information, but also simulating how changes to existing financial data may affect the organisation’s bottom line


Product Master Data •

Product master data can consist of information on an organisation’s products and services, or on the entire industry in which the organisation operates, including competitor products and services



Product Lifecycle Management (PLM) focuses on managing the lifecycle of a product or service from its conception (such as research), through its development, manufacturing, sale / delivery, service, and disposal


Location Master Data •

Provides the ability to track and share reference information about different geographies, and create hierarchical relationships or territories based on geographic information to support other processes



Different industries require specialised earth science data (geographic data about seismic faults, flood plains, soil, annual rainfall, and severe weather risk areas) and related sociological data (population, ethnicity, income, and terrorism risk), usually supplied from external sources


Understand Reference and Master Data Integration Needs

• Reference and master data requirements are relatively easy to discover and understand for a single application
• It is potentially much more difficult to develop an understanding of these needs across applications, especially across the entire organisation
• Analysing the root causes of a data quality problem usually uncovers requirements for reference and master data integration
• Organisations that have successfully managed reference and master data typically have focused on one subject area at a time
− Analyse all occurrences of a few business entities, across all physical databases and for differing usage patterns

Identify Reference and Master Data Sources and Contributors •

Successful organisations first understand the needs for reference and master data



Then trace the lineage of this data to identify the original and interim source databases, files, applications, organisations and the individual roles that create and maintain the data



Understand both the upstream sources and the downstream needs to capture quality data at its source


Define and Maintain the Data Integration Architecture

• Effective data integration architecture controls the shared access, replication, and flow of data to ensure data quality and consistency, particularly for reference and master data
• Without data integration architecture, local reference and master data management occurs in application silos, inevitably resulting in redundant and inconsistent data
• The selected data integration architecture should also provide common data integration services
− Change request processing, including review and approval
− Data quality checks on externally acquired reference and master data
− Consistent application of data quality rules and matching rules
− Consistent patterns of processing
− Consistent metadata about mappings, transformations, programs and jobs
− Consistent audit, error resolution and performance monitoring data
− Consistent approach to replicating data
• Establishing master data standards can be a time-consuming task as it may involve multiple stakeholders
• Apply the same data standards, regardless of integration technology, to enable effective standardisation, sharing, and distribution of reference and master data

Data Integration Services Architecture

[Figure: data integration services architecture. Services: Data Quality Management; Data Acquisition, File Management and Audit; Data Standardisation, Cleansing and Matching; Replication Management. Data stores: Source Data, Rules, Reconciled Master Data, Archives, Errors, Subscriptions, Staging. Metadata Management: Business Metadata, Integration Metadata, Job Flow and Statistics.]

Implement Reference and Master Data Management Solutions •

Reference and master data management solutions are complex



Given the variety, complexity, and instability of requirements, no single solution or implementation project is likely to meet all reference and master data management needs



Organisations should expect to implement reference and master data management solutions iteratively and incrementally through several related projects and phases


Define and Maintain Match Rules •

Matching, merging, and linking of data from multiple systems about the same person, group, place, or thing is a major master data management challenge



Matching attempts to remove redundancy, to improve data quality, and provide information that is more comprehensive



Data matching is performed by applying inference rules
− Duplicate identification match rules focus on a specific set of fields that uniquely identify an entity and identify merge opportunities without taking automatic action
− Match-merge rules match records and merge the data from these records into a single, unified, reconciled, and comprehensive record
− Match-link rules identify and cross-reference records that appear to relate to a master record without updating the content of the cross-referenced record
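
The three kinds of match rule can be contrasted in a short sketch. It assumes that a normalised name plus email address identifies a party, which is a simplification; the `PartyRecord` fields and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PartyRecord:
    source: str
    record_id: str
    name: str
    email: str
    phone: str = ""


def match_key(record):
    # Assumed identifying fields: whitespace-normalised lowercase name plus email.
    return (" ".join(record.name.lower().split()), record.email.lower())


def find_duplicates(records):
    """Duplicate identification: report candidate groups, take no automatic action."""
    groups = {}
    for record in records:
        groups.setdefault(match_key(record), []).append(record)
    return [group for group in groups.values() if len(group) > 1]


def match_merge(group):
    """Match-merge: build one reconciled record, filling gaps from later records."""
    first = group[0]
    merged = PartyRecord("master", first.record_id, first.name, first.email, first.phone)
    for record in group[1:]:
        if not merged.phone:
            merged.phone = record.phone
    return merged


def match_link(group):
    """Match-link: cross-reference source records to a master id, content untouched."""
    master_id = group[0].record_id
    return {(record.source, record.record_id): master_id for record in group}
```

Because `match_merge` returns a new record and leaves its inputs unchanged, a wrong merge can be discarded without having to restore the source records.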


Establish Golden Records •

Establishing golden master data values requires more inference, application of matching rules, and review of the results


Vocabulary Management and Reference Data

• A vocabulary is a collection of terms / concepts and their relationships
• Vocabulary management is defining, sourcing, importing, and maintaining a vocabulary and its associated reference data
− See ANSI/NISO Z39.19 - Guidelines for the Construction, Format, and Management of Monolingual Controlled Vocabularies http://www.niso.org/kst/reports/standards?step=2&gid=&project_key=7cc9b583cb5a62e8c15d3099e0bb46bbae9cf38a
• Vocabulary management requires the identification of the standard list of preferred terms and their synonyms
• Vocabulary management requires data governance, enabling data stewards to assess stakeholder needs
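
A list of preferred terms and their synonyms can be held as a simple lookup. The vocabulary entries below are hypothetical examples, and returning `None` for unknown values is an assumed convention that leaves the decision to a data steward.

```python
# Hypothetical controlled vocabulary: preferred term -> known synonyms.
VOCABULARY = {
    "Ireland": {"IE", "IRL", "Eire"},
    "United States": {"US", "USA", "U.S.A."},
}

# Build a case-insensitive index from every term or synonym to its preferred term.
_LOOKUP = {}
for preferred, synonyms in VOCABULARY.items():
    for term in {preferred, *synonyms}:
        _LOOKUP[term.casefold()] = preferred


def preferred_term(value):
    """Return the preferred term, or None so that a data steward can review the value."""
    return _LOOKUP.get(value.strip().casefold())
```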


Vocabulary Management and Reference Data •

Key questions to ask to enable vocabulary management
− What information concepts (data attributes) will this vocabulary support?
− Who is the audience for this vocabulary? What processes do they support, and what roles do they play?
− Why is the vocabulary needed? Will it support applications, content management, analytics, and so on?
− Who identifies and approves the preferred vocabulary and vocabulary terms?
− What are the current vocabularies different groups use to classify this information? Where are they located? How were they created? Who are their subject matter experts? Are there any security or privacy concerns for any of them?
− Are there existing standards that can be leveraged to fulfill this need? Are there concerns about using an external standard vs. internal? How frequently is the standard updated and what is the degree of change of each update? Are standards accessible in an easy to import / maintain format in a cost efficient manner?

Defining Golden Master Data Values

• Golden data values are the data values thought to be the most accurate, current, and relevant for shared, consistent use across applications
• Determine golden values by analysing data quality, applying data quality rules and matching rules, and incorporating data quality controls into the applications that acquire, create, and update data
• Establish data quality measurements to set expectations, measure improvements, and help identify root causes of data quality problems
• Assess data quality through a combination of data profiling activities and verification of adherence to business rules
• Once the data is standardised and cleansed, the next step is to attempt reconciliation of redundant data through application of matching rules
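
Two simple data quality measurements, completeness and validity against a business rule, could be profiled as in the sketch below. The record layout and the rule passed in as `is_valid` are assumptions for illustration.

```python
def profile_field(records, field, is_valid):
    """Return completeness and validity ratios for one field.

    completeness: share of records where the field is populated
    validity: share of populated values that satisfy the business rule
    """
    total = len(records)
    populated = [r[field] for r in records if r.get(field) not in (None, "")]
    valid = [value for value in populated if is_valid(value)]
    return {
        "completeness": len(populated) / total if total else 0.0,
        "validity": len(valid) / len(populated) if populated else 0.0,
    }
```

Tracking these ratios over time gives a baseline against which improvements can be measured.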


Define and Maintain Hierarchies and Affiliations •

Vocabularies and their associated reference data sets are often more than lists of preferred terms and their synonyms



Affiliation management is the establishment and maintenance of relationships between master data records


Plan and Implement Integration of New Data Sources •

Integrating new reference data sources involves
− Receiving and responding to new data acquisition requests from different groups
− Performing data quality assessment services using data cleansing and data profiling tools
− Assessing data integration complexity and cost
− Piloting the acquisition of data and its impact on match rules
− Determining who will be responsible for data quality
− Finalising data quality metrics

Replicate and Distribute Reference and Master Data •

Reference and master data may be read directly from a database of record, or may be replicated from the database of record to other application databases for transaction processing, and data warehouses for business intelligence



Reference data most commonly appears as pick list values in applications



Replication aids maintenance of referential integrity


Manage Changes to Reference and Master Data •

Specific individuals have the role of a business data steward with the authority to create, update, and retire reference data



Formally control changes to controlled vocabularies and their reference data sets



Carefully assess the impact of reference data changes


Data Warehousing and Business Intelligence Management


Data Warehousing and Business Intelligence Management

• A Data Warehouse is a combination of two primary components
− An integrated decision support database
− Related software programs used to collect, cleanse, transform, and store data from a variety of operational and external sources
• Both components combine to support historical, analytical, and business intelligence (BI) requirements
• A Data Warehouse may also include dependent data marts, which are subset copies of a data warehouse database
• A Data Warehouse includes any data stores or extracts used to support the delivery of data for BI purposes

Data Warehousing and Business Intelligence Management •

Data Warehousing means the operational extract, cleansing, transformation, and load processes and associated control processes that maintain the data contained within a data warehouse



Data Warehousing process focuses on enabling an integrated and historical business context on operational data by enforcing business rules and maintaining appropriate business data relationships and processes that interact with metadata repositories



Business Intelligence is a set of business capabilities including
− Query, analysis, and reporting activity by knowledge workers to monitor and understand the financial and operational health of, and make business decisions about, the enterprise
− Strategic and operational analytics and reporting on corporate operational data to support business decisions, risk management, and compliance

Data Warehousing and Business Intelligence Management •

Together Data Warehousing and Business Intelligence Management is the collection, integration, and presentation of data to knowledge workers for the purpose of business analysis and decision-making



Composed of activities supporting all phases of the decision support life cycle that
− Provide context
− Move and transform data from sources to a common target data store
− Provide knowledge workers with various means of access, manipulation, and reporting of the integrated target data

Data Warehousing and Business Intelligence Management – Definition and Goals •

Definition
− Planning, implementation, and control processes to provide decision support data and support knowledge workers engaged in reporting, query and analysis

Goals
− To support and enable effective business analysis and decision making by knowledge workers
− To build and maintain the environment / infrastructure to support business intelligence activity, specifically leveraging all the other data management functions to cost effectively deliver consistent integrated data for all BI activity

Data Warehousing and Business Intelligence Management - Overview

Inputs
• Business Drivers
• BI Data and Access Requirements
• Data Quality Requirements
• Data Security Requirements
• Data Architecture
• Technical Architecture
• Data Modeling Standards and Guidelines
• Transactional Data
• Master and Reference Data
• Industry and External Data

Suppliers
• Executives and Managers
• Subject Matter Experts
• Data Governance Council
• Information Consumers (Internal and External)
• Data Producers
• Data Architects and Analysts

Participants
• Business Executives and Managers
• DM Execs and Other IT Management
• BI Program Manager
• SMEs and Other Information Consumers
• Data Stewards
• Project Managers
• Data Architects and Analysts
• Data Integration (ETL) Specialists
• BI Specialists
• Database Administrators
• Data Security Administrators
• Data Quality Analysts

Primary Deliverables
• DW/BI Architecture
• Data Warehouses
• Data Marts and OLAP Cubes
• Dashboards and Scorecards
• Analytic Applications
• File Extracts (for Data Mining/Statistical Tools)
• BI Tools and User Environments
• Data Quality Feedback Mechanism/Loop

Tools
• Database Management Systems
• Data Profiling Tools
• Data Integration Tools
• Data Cleansing Tools
• Business Intelligence Tools
• Analytic Applications
• Data Modeling Tools
• Performance Management Tools
• Metadata Repository
• Data Quality Tools
• Data Security Tools

Consumers
• Knowledge Workers
• Managers and Executives
• External Customers and Systems
• Internal Customers and Systems
• Data Professionals
• Other IT Professionals

Metrics
• Usage Metrics
• Customer/User Satisfaction
• Subject Area Coverage %
• Response/Performance Metrics

Data Warehousing and Business Intelligence Management Objectives

• Providing integrated storage of required current and historical data, organised by subject areas
• Ensuring credible, quality data for all appropriate access capabilities
• Ensuring a stable, high-performance, reliable environment for data acquisition, data management, and data access
• Providing an easy-to-use, flexible, and comprehensive data access environment
• Delivering both content and access to the content in increments appropriate to the organisation’s objectives
• Leveraging, rather than duplicating, relevant data management component functions such as Reference and Master Data Management, Data Governance, Data Quality, and Metadata
• Providing an enterprise focal point for data delivery in support of the decisions, policies, procedures, definitions, and standards that arise from DG
• Defining, building, and supporting all data stores, data processes, data infrastructure, and data tools that contain integrated, post-transactional, and refined data used for information viewing, analysis, or data request fulfillment
• Integrating newly discovered data as a result of BI processes into the DW for further analytics and BI use

Data Warehousing and Business Intelligence Management Function, Activities and Sub-Activities

Data Warehousing and Business Intelligence Management

Understand Business Intelligence Information Needs

Define and Maintain the DW-BI Architecture

Implement Data Warehouses and Data Marts

Implement Business Intelligence Tools and User Interfaces

Process Data for Business Intelligence

Monitor and Tune Data Warehousing Processes

Query and Reporting Tools

Staging Areas

On Line Analytical Processing (OLAP) Tools

Mapping Sources and Targets

Analytic Applications

Data Cleansing and Transformations (Data Acquisition)

Monitor and Tune BI Activity and Performance

Implementing Management Dashboards and Scorecards

Performance Management Tools

Predictive Analytics and Data Mining Tools

Advanced Visualisation and Discovery Tools

Data Warehousing and Business Intelligence Management Principles

• Obtain executive commitment and support as these projects are labour intensive
• Secure business SMEs as their support and high availability are necessary for getting the correct data and useful BI solution
• Be business focused and driven. Make sure DW / BI work is serving real priority business needs and solving burning business problems. Let the business drive the prioritisation
• Demonstrable data quality is essential
• Provide incremental value. Ideally deliver in continual 2-3 month segments
• Transparency and self service. The more context (metadata of all kinds) provided, the more value customers derive. Wisely exposing information about the process reduces calls and increases satisfaction
• One size does not fit all. Make sure you find the right tools and products for each of your customer segments
• Think and architect globally, act and build locally. Let the big-picture and end-vision guide the architecture, but build and deliver incrementally, with much shorter term and more project-based focus
• Collaborate with and integrate all other data initiatives, especially those for data governance, data quality, and metadata
• Start with the end in mind. Let the business priority and scope of end-data-delivery in the BI space drive the creation of the DW content. The main purpose for the existence of the DW is to serve up data to the end business customers via the BI capabilities
• Summarise and optimise last, not first. Build on the atomic data and add aggregates or summaries as needed for performance, but not to replace the detail

Understand Business Intelligence Information Needs

• All projects start with requirements
• Gathering requirements for DW-BIM projects has both similarities to and differences from gathering requirements for other projects
• For DW-BIM projects, it is important to understand the broader business context of the business area targeted as reporting is generalised and exploratory
• Capturing the actual business vocabulary and terminology is a key to success
• Document the business context, then explore the details of the actual source data
• Typically, the ETL portion can consume 60%-70% of a DW-BIM project’s budget and time
• The DW is often the first place where the pain of poor quality data in source systems and / or data entry functions becomes apparent
• Creating an executive summary of the identified business intelligence needs is a best practice
• When starting a DW-BIM programme, a good way to decide where to start is using a simple assessment of business impact and technical feasibility
− Technical feasibility will take into consideration things like complexity, availability and state of the data, and the availability of subject matter experts
− Projects that have high business impact and high technical feasibility are good candidates for starting
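
The simple assessment of business impact and technical feasibility could be recorded as below. The 1 to 5 scoring scale, the threshold of 4, and the ranking by the product of the two scores are all assumptions; the candidate projects in the usage note are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    business_impact: int        # assumed scale: 1 (low) to 5 (high)
    technical_feasibility: int  # assumed scale: 1 (low) to 5 (high)


def good_starting_points(candidates, threshold=4):
    """Keep projects that score high on both dimensions, strongest first."""
    picks = [
        c for c in candidates
        if c.business_impact >= threshold and c.technical_feasibility >= threshold
    ]
    return sorted(
        picks,
        key=lambda c: c.business_impact * c.technical_feasibility,
        reverse=True,
    )
```

For example, a project scored 5 for impact but 2 for feasibility would be excluded, because both scores must reach the threshold.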


Define and Maintain the DW-BI Architecture

• Successful DW-BIM architecture requires the identification and bringing together of a number of key roles
− Technical Architect - hardware, operating systems, databases and DW-BIM architecture
− Data Architect - data analysis, systems of record, data modeling and data mapping
− ETL Architect / Design Lead - staging and transform, data marts, and schedules
− Metadata Specialist - metadata interfaces, metadata architecture and contents
− BI Application Architect / Design Lead - BI tool interfaces and report design, metadata delivery, data and report navigation and delivery
• Technical requirements including performance, availability, and timing needs are key drivers in developing the DW-BIM architecture
• The design decisions and principles for what data detail the DW contains is a key design priority for DW-BIM architecture
• Important that the DW-BIM architecture integrate with the overall corporate reporting architecture

Define and Maintain the DW-BI Architecture

• No DW-BIM effort can be successful without business acceptance of data
• Business acceptance includes the data being understandable, having verifiable quality and having a demonstrable origin
• Sign-off by the Business on the data should be part of the User Acceptance Testing
• Structured random testing of the data in the BIM tool against data in the source systems over the initial load and a few update load cycles should be performed to meet sign-off criteria
• Meeting these requirements is paramount for every DW-BIM architecture
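
The random testing of BI data against source data described above could look like the following sketch. It assumes both sides can be read into dictionaries keyed by a shared business key; the function name, the sample data and the fixed seed are illustrative. A real test would also stratify the sample by load cycle.

```python
import random


def sample_mismatches(source, warehouse, sample_size, seed=0):
    """Compare a random sample of source records with their warehouse copies.

    Returns the sorted keys whose warehouse value is missing or different.
    """
    rng = random.Random(seed)  # fixed seed so that a test run is repeatable
    keys = sorted(source)
    sample = rng.sample(keys, min(sample_size, len(keys)))
    return sorted(k for k in sample if warehouse.get(k) != source[k])


# Illustrative data: key 3 differs and key 4 was never loaded
source = {1: 100.0, 2: 250.5, 3: 75.25, 4: 10.0}
warehouse = {1: 100.0, 2: 250.5, 3: 70.0}

mismatches = sample_mismatches(source, warehouse, sample_size=4)
print(mismatches)
```

An empty result over the initial load and the following update cycles is the kind of evidence that can support business sign-off.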

Implement Data Warehouses and Data Marts

• The purpose of a data warehouse is to integrate data from multiple sources and then serve up that integrated data for BI purposes
• Consumption is typically through data marts or other systems
• A single data warehouse will integrate data from multiple source systems and serve data to multiple data marts
• The purpose of data marts is to provide data for analysis to knowledge workers
• Start with the end in mind - identify the business problem to solve, then identify the details and what would be used, and continue to work back into the integrated data required and ultimately all the way back to the data sources

Implement Business Intelligence Tools and User Interfaces

• Well-defined set of well-proven BI tools
• Implementing the right BI tool or User Interface (UI) is about identifying the right tools for the right user set
• Almost all BI tools also come with their own metadata repositories to manage their internal data maps and statistics

Query and Reporting Tools

• Query and reporting is the process of querying a data source and then formatting the results to create a report
• With business query and reporting the data source is more often a data warehouse or data mart
• While IT develops production reports, power users and casual business users develop their own reports with business query tools
• Business query and reporting tools enable users to author their own reports or create outputs for use by others

Query and Reporting Tools Landscape

[Figure: query and reporting tools landscape, distinguishing commonly used tools from specialist tools. Audiences shown: Customers, Suppliers and Regulators; Frontline Workers; Executives and Managers; Analysts and Information Workers; IT Developers. Tool types shown: Embedded BI, Published Reports, Scorecards, Dashboards, OLAP, BI Spreadsheets, Production Reporting Tools, Interactive Fixed Reports, Statistics, Business Query.]

Online Analytical Processing (OLAP) Tools

• OLAP provides interactive, multi-dimensional analysis with different dimensions and different levels of detail
• The value of OLAP tools and cubes is reduction of the chance of confusion and erroneous interpretation by aligning the data content with the analyst's mental model
• Common OLAP operations include slice and dice, drill down, drill up, roll up, and pivot
  − Slice - a slice is a subset of a multi-dimensional array corresponding to a single value for one or more members of the dimensions not in the subset
  − Dice - the dice operation is a slice on more than two dimensions of a data cube, or more than two consecutive slices
  − Drill Down / Up - drilling down or up is a specific analytical technique whereby the user navigates among levels of data, ranging from the most summarised (up) to the most detailed (down)
  − Roll-Up - a roll-up involves computing all of the data relationships for one or more dimensions. To do this, define a computational relationship or formula
  − Pivot - to change the dimensional orientation of a report or page display
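
The OLAP operations above can be illustrated on a tiny fact table held in memory. This is a sketch only: the dimensions (region, product, quarter), the sales measure and the function names are assumptions for the example, and summation stands in for the roll-up formula. Real OLAP engines work on pre-built cubes rather than Python lists.

```python
from collections import defaultdict

# Illustrative fact rows: three dimensions and one measure
facts = [
    {"region": "EU", "product": "A", "quarter": "Q1", "sales": 10},
    {"region": "EU", "product": "B", "quarter": "Q1", "sales": 20},
    {"region": "US", "product": "A", "quarter": "Q1", "sales": 30},
    {"region": "US", "product": "A", "quarter": "Q2", "sales": 40},
]


def dice(rows, **allowed):
    """Dice: keep rows whose value on each named dimension is in the allowed set."""
    return [r for r in rows if all(r[d] in vals for d, vals in allowed.items())]


def slice_(rows, dimension, value):
    """Slice: fix one dimension to a single value."""
    return dice(rows, **{dimension: {value}})


def roll_up(rows, dimensions, measure="sales"):
    """Roll-up: sum the measure to the level of the given dimensions.

    Passing more dimensions drills down; passing fewer drills up.
    """
    totals = defaultdict(int)
    for r in rows:
        totals[tuple(r[d] for d in dimensions)] += r[measure]
    return dict(totals)


def pivot(rows, row_dim, col_dim, measure="sales"):
    """Pivot: lay out the rolled-up totals as rows by columns."""
    table = defaultdict(dict)
    for (row_value, col_value), total in roll_up(rows, [row_dim, col_dim], measure).items():
        table[row_value][col_value] = total
    return dict(table)


print(roll_up(facts, ["region"]))
print(pivot(facts, "region", "quarter"))
```

Calling `pivot(facts, "quarter", "region")` swaps the orientation of the same totals, which is all that a pivot changes.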

Analytic Applications

• Analytic applications include the logic and processes to extract data from well-known source systems, such as vendor ERP systems, a data model for the data mart, and pre-built reports and dashboards
• Analytic applications provide businesses with a pre-built solution to optimise a functional area or industry vertical
• Different types of analytic applications include customer, financial, supply chain, manufacturing, and human resource applications

Implementing Management Dashboards and Scorecards

• Dashboards and scorecards are both ways of efficiently presenting performance information
• Dashboards are oriented more toward dynamic presentation of operational information while scorecards are more static representations of longer-term organisational, tactical, or strategic goals
• Typically, scorecards are divided into 4 quadrants or views of the organisation such as Finance, Customer, Environment, and Employees, each with a number of metrics

Performance Management Tools

• Performance management applications include budgeting, planning, and financial consolidation

Predictive Analytics and Data Mining Tools

• Data mining is a particular kind of analysis that reveals patterns in data using various algorithms
• A data mining tool will help users discover relationships or show patterns in a more exploratory fashion

Advanced Visualisation and Discovery Tools

• Advanced visualisation and discovery tools allow users to interact with the data in a highly visual, interactive way
• Patterns in a large dataset can be difficult to recognise in a numbers display
• A pattern can be picked up visually fairly quickly when thousands of data points are loaded into a sophisticated display on a single page

Process Data for Business Intelligence

• Most of the work in any DW-BIM effort is involved in the preparation and processing of the data

Staging Areas

• A staging area is the intermediate data store between an original data source and the centralised data repository
• All required cleansing, transformation, reconciliation, and relationships happen in this area

Mapping Sources and Targets

• Source-to-target mapping is the documentation activity that defines data type details and transformation rules for all required entities and data elements, from each individual source to each individual target
• DW-BIM adds requirements to the source-to-target mapping process found in any typical data migration
• One of the goals of the DW-BIM effort should be to provide a complete lineage for each data element available in the BI environment all the way back to its respective source(s)
• A solid taxonomy is necessary to match the data elements in different systems into a consistent structure in the EDW
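
A source-to-target mapping can be kept as structured records so that lineage becomes a simple lookup. The sketch below is hypothetical throughout: the system names, field names, data types and rule descriptions are invented for illustration, and the transformation rules are plain text rather than executable logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mapping:
    source_system: str
    source_field: str
    target_table: str
    target_column: str
    target_type: str
    rule: str  # transformation rule, documented in plain text


# Hypothetical mapping entries
MAPPINGS = [
    Mapping("CRM", "cust_nm", "dim_customer", "customer_name", "VARCHAR(100)", "trim, title case"),
    Mapping("ERP", "CUSTNAME", "dim_customer", "customer_name", "VARCHAR(100)", "trim, title case"),
    Mapping("ERP", "ORD_AMT", "fact_order", "order_amount", "DECIMAL(12,2)", "convert to EUR"),
]


def lineage(target_table, target_column, mappings=MAPPINGS):
    """Return every (source system, source field) that feeds a target column."""
    return [
        (m.source_system, m.source_field)
        for m in mappings
        if (m.target_table, m.target_column) == (target_table, target_column)
    ]


print(lineage("dim_customer", "customer_name"))
```

Two source fields resolving to one target column is the case where a consistent taxonomy matters most.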

Data Cleansing and Transformations (Data Acquisition)

• Data cleansing focuses on the activities that correct and enhance the domain values of individual data elements, including enforcement of standards
• Cleansing is particularly necessary for initial loads where significant history is involved
• The preferred strategy is to push data cleansing and correction activity back to the source systems whenever possible
• Data transformation focuses on activities that provide organisational context between data elements, entities, and subject areas
• Organisational context includes cross-referencing, reference and master data management and complete and correct relationships
• Data transformation is an essential component of being able to integrate data from multiple sources
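
The difference between cleansing (fixing the domain value of one element) and transformation (adding organisational context such as a master data cross-reference) can be shown with a small sketch. The lookup tables, identifiers and field names are assumptions made up for the example.

```python
# Hypothetical standard for country values (cleansing)
COUNTRY_STANDARD = {
    "ireland": "IE", "ie": "IE", "eire": "IE",
    "united kingdom": "GB", "uk": "GB",
}

# Hypothetical cross-reference from local identifiers to a master customer ID (transformation)
MASTER_CUSTOMER_ID = {
    ("CRM", "C-17"): "CUST-0001",
    ("ERP", "900017"): "CUST-0001",
}


def cleanse_country(raw):
    """Cleansing: map a raw value to the standard code, or None if it is unknown."""
    return COUNTRY_STANDARD.get(raw.strip().lower())


def transform(record):
    """Transformation: resolve the local key to the master ID and cleanse the country."""
    key = (record["system"], record["local_id"])
    return {
        "customer_id": MASTER_CUSTOMER_ID.get(key),
        "country": cleanse_country(record["country"]),
    }


crm_row = {"system": "CRM", "local_id": "C-17", "country": " Ireland "}
erp_row = {"system": "ERP", "local_id": "900017", "country": "EIRE"}
print(transform(crm_row))
print(transform(erp_row))
```

Both rows resolve to the same customer and country code, which is what allows data from the two sources to be integrated. Unknown values come back as None, so they can be reported to the source system instead of being guessed.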

Monitor and Tune Data Warehousing Processes

• Processing should be monitored across the system for bottlenecks and dependencies among processes
• Database tuning techniques should be employed where and when needed, including partitioning, tuned backup and recovery strategies
• Archiving is a difficult subject in data warehousing
• Users often consider the data warehouse as an active archive due to the long histories that are built, and are unwilling, particularly if the OLAP sources have dropped records, to see the data warehouse engage in archiving

Monitor and Tune BI Activity and Performance

• A best practice for BI monitoring and tuning is to define and display a set of customer-facing satisfaction metrics
• Average query response time and the number of users per day / week / month are examples of useful metrics to display
• Regular review of usage statistics and patterns is essential
• Reports providing frequency and resource usage of data, queries, and reports allow prudent enhancement
• Tuning BI activity is analogous to the principle of profiling applications in order to know where the bottlenecks are and where to apply optimisation efforts
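
The two example metrics named above can be derived from a query log. The sketch assumes a log with one entry per query holding the day, the user and the elapsed seconds. That layout and the sample values are illustrative, and a real BI platform would expose its own usage tables.

```python
from collections import defaultdict
from statistics import mean

# Illustrative query log entries
query_log = [
    {"day": "2010-03-01", "user": "ann", "seconds": 2.0},
    {"day": "2010-03-01", "user": "bob", "seconds": 4.0},
    {"day": "2010-03-01", "user": "ann", "seconds": 6.0},
    {"day": "2010-03-02", "user": "ann", "seconds": 1.0},
]


def average_response_time(log):
    """Mean elapsed time across all logged queries, in seconds."""
    return mean(entry["seconds"] for entry in log)


def users_per_day(log):
    """Number of distinct users seen on each day."""
    users = defaultdict(set)
    for entry in log:
        users[entry["day"]].add(entry["user"])
    return {day: len(names) for day, names in users.items()}


print(average_response_time(query_log))
print(users_per_day(query_log))
```

Grouping by week or month instead of day only requires a different key in `users_per_day`.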

Document and Content Management


Document and Content Management

• Document and Content Management is the control over capture, storage, access, and use of data and information stored outside relational databases
• Strategic and tactical focus overlaps with other data management functions in addressing the need for data governance, architecture, security, managed metadata, and data quality for unstructured data
• Document and Content Management includes two sub-functions:
  − Document management is the storage, inventory, and control of electronic and paper documents. Document management encompasses the processes, techniques, and technologies for controlling and organising documents and records, whether stored electronically or on paper
  − Content management refers to the processes, techniques, and technologies for organising, categorising, and structuring access to information content, resulting in effective retrieval and reuse. Content management is particularly important in developing websites and portals, but the techniques of indexing based on keywords, and organising based on taxonomies, can be applied across technology platforms

Document and Content Management – Definition and Goals

• Definition
  − Planning, implementation, and control activities to store, protect, and access data found within electronic files and physical records (including text, graphics, images, audio, and video)
• Goals
  − To safeguard and ensure the availability of data assets stored in less structured formats
  − To enable effective and efficient retrieval and use of data and information in unstructured formats
  − To comply with legal obligations and customer expectations
  − To ensure business continuity through retention, recovery, and conversion
  − To control document storage operating costs

Document and Content Management - Overview

• Inputs: Text Documents, Reports, Spreadsheets, Email, Instant Messages, Faxes, Voicemail, Images, Video Recordings, Audio Recordings, Printed Paper Files, Microfiche/Microfilm, Graphics
• Suppliers: Employees, External Parties
• Participants: All Employees, Data Stewards, DM Professionals, Records Management Staff, Other IT Professionals, Data Management Executive, Other IT Managers, Chief Information Officer, Chief Knowledge Officer
• Tools: Stored Documents, Office Productivity Tools, Image and Workflow Management Tools, Records Management Tools, XML Development Tools, Collaboration Tools, Internet, Email Systems
• Primary Deliverables: Managed Records in Many Media Formats, E-discovery Records, Outgoing Letters and Emails, Contracts and Financial Documents, Policies and Procedures, Audit Trails and Logs, Meeting Minutes, Formal Reports, Significant Memoranda
• Consumers: Business and IT Users, Government Regulatory Agencies, Senior Management, External Customers
• Metrics: Return on Investment, Key Performance Indicators, Balanced Scorecards

Document and Content Management Function, Activities and Sub-Activities

• Document / Record Management
  − Plan for Managing Documents / Records
  − Implement Document / Record Management Systems for Acquisition, Storage, Access, and Security Controls
  − Backup and Recover Documents / Records
  − Retention and Disposition of Documents / Records
  − Audit Document / Records Management
• Content Management
  − Define and Maintain Enterprise Taxonomies (Information Content Architecture)
  − Document / Index Information Content Metadata
  − Provide Content Access and Retrieval
  − Govern for Quality Content

Document and Content Management - Principles

• Everyone in an organisation has a role to play in protecting its future. Everyone must create, use, retrieve, and dispose of records in accordance with the established policies and procedures
• Experts in the handling of records and content should be fully engaged in policy and planning. Regulatory and best practices can vary significantly based on industry sector and legal jurisdiction
• Even if records management professionals are not available to the organisation, everyone can be trained and have an understanding of the issues. Once trained, business stewards and others can collaborate on an effective approach to records management

Document and Content Management

• A document management system is an application used to track and store electronic documents and electronic images of paper documents
• Document management systems commonly provide storage, versioning, security, metadata management, content indexing, and retrieval capabilities
• A content management system is used to collect, organise, index, and retrieve information content, storing the content either as components or whole documents, while maintaining links between components
• While a document management system may provide content management functionality over the documents under its control, a content management system is essentially independent of where and how the documents are stored

Document / Record Management

• Document / Record Management is the lifecycle management of the designated significant documents of the organisation
• Records can be
  − Physical, such as documents, memos, contracts, reports or microfiche
  − Electronic, such as email content, attachments, and instant messaging
  − Content on a website
  − Documents on all types of media and hardware
  − Data captured in databases of all kinds
• More than 90% of the records created today are electronic
• Growth in email and instant messaging has made the management of electronic records critical to an organisation

Document / Record Management

• The lifecycle of Document / Record Management includes:
  − Identification of existing and newly created documents / records
  − Creation, Approval, and Enforcement of documents / records policies
  − Classification of documents / records
  − Documents / Records Retention Policy
  − Storage: Short and long term storage of physical and electronic documents / records
  − Retrieval and Circulation: Allowing access and circulation of documents / records in accordance with policies, security and control standards, and legal requirements
  − Preservation and Disposal: Archiving and destroying documents / records according to organisational needs, statutes, and regulations

Plan for Managing Documents / Records

• Plan document lifecycle from creation or receipt, organisation for retrieval, distribution and archiving or disposition
• Develop classification / indexing systems and taxonomies so that the retrieval of documents is easy
• Create planning and policy around documents and records based on the value of the data to the organisation and as evidence of business transactions
• Identify the responsible, accountable organisational unit for managing the documents / records
• Develop and execute a retention plan and policy, including the archiving of selected records for long-term preservation
• Records are destroyed at the end of their lifecycle according to operational needs, procedures, statutes and regulations

Implement Document / Record Management Systems for Acquisition, Storage, Access, and Security Controls

• Documents can be created within a document management system or captured via scanners or OCR software
• Electronic documents must be indexed via keywords or text during the capture process so that the document can be found
• A document repository enables check-in and check-out features, versioning, collaboration, comparison, archiving, status state(s), migration from one storage medium to another and disposition
• Document management can support different types of workflows
  − Manual workflows that indicate where the user sends the document
  − Rules-based workflow, where rules are created that dictate the flow of the document within an organisation
  − Dynamic rules that allow for different workflows based on content
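
Keyword indexing at capture time, as described above, is commonly done with an inverted index. The following is a minimal sketch: the class name, the stop-word list and the document identifiers are assumptions for the example, and a production system would rely on the indexing engine of the document management product.

```python
import re
from collections import defaultdict

# Small illustrative stop-word list
STOP_WORDS = {"the", "a", "an", "of", "and", "for", "to", "in"}


def keywords(text):
    """Lower-case the text, split it into words and drop stop words."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOP_WORDS}


class DocumentIndex:
    """Inverted index: keyword -> set of document IDs containing it."""

    def __init__(self):
        self._index = defaultdict(set)

    def capture(self, doc_id, text):
        """Index a document as it is captured."""
        for word in keywords(text):
            self._index[word].add(doc_id)

    def find(self, query):
        """Return the IDs of documents containing every keyword in the query."""
        terms = keywords(query)
        if not terms:
            return set()
        return set.intersection(*(self._index.get(t, set()) for t in terms))


index = DocumentIndex()
index.capture("D1", "Supplier contract for office lease")
index.capture("D2", "Minutes of the supplier review meeting")
print(index.find("supplier contract"))
```

Indexing during capture means that a document is findable from the moment it enters the repository, which is the point the slide makes.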

Backup and Recover Documents / Records

• The document / record management system needs to be included as part of the overall corporate backup and recovery activities for all data and information
• The document / records manager should be involved in risk mitigation and management, and business continuity, especially regarding security for vital records
• A vital records program provides the organisation with access to the records necessary to conduct its business during a disaster and to resume normal business afterward

Retention and Disposition of Documents / Records

• Defines the period of time during which documents / records of operational, legal, financial or historical value must be maintained
• Specifies the processes for compliance, and the methods and schedules for the disposition of documents / records
• Must deal with privacy and data protection issues
• Legal and regulatory requirements must be considered when setting up document / record retention schedules
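
A retention schedule can be expressed as a lookup from record type to retention period, from which disposal dates follow. The periods below are placeholders for illustration only, not legal guidance, and the record layout is assumed.

```python
from datetime import date

# Placeholder retention periods in years; real values come from legal and regulatory review
RETENTION_YEARS = {"contract": 7, "meeting_minutes": 3}


def disposal_date(record_type, closed_on):
    """Earliest date on which a record of this type may be disposed of."""
    years = RETENTION_YEARS[record_type]
    try:
        return closed_on.replace(year=closed_on.year + years)
    except ValueError:
        # 29 February has no counterpart in a non-leap target year
        return closed_on.replace(year=closed_on.year + years, day=28)


def due_for_disposition(records, today):
    """IDs of records whose retention period has ended on or before `today`."""
    return [
        r["id"] for r in records
        if disposal_date(r["type"], r["closed_on"]) <= today
    ]


records = [
    {"id": "R1", "type": "contract", "closed_on": date(2003, 3, 8)},
    {"id": "R2", "type": "meeting_minutes", "closed_on": date(2009, 1, 1)},
]
print(due_for_disposition(records, today=date(2010, 3, 8)))
```

An unknown record type raises a KeyError rather than defaulting to a period, so nothing is disposed of without an explicit schedule entry.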

Audit Document / Records Management

• Document / records management requires auditing on a periodic basis to ensure that the right information is getting to the right people at the right time for decision making or performing operational activities
  − Inventory - Each location in the inventory is uniquely identified
  − Storage - Storage areas for physical documents / records have adequate space to accommodate growth
  − Reliability and Accuracy - Spot checks are executed to confirm that the documents / records are an adequate reflection of what has been created or received
  − Classification and Indexing Schemes - Metadata and document file plans are well described
  − Access and Retrieval - End users find and retrieve critical information easily
  − Retention Processes - Retention schedule is structured in a logical way
  − Disposition Methods - Documents / records are disposed of as recommended
  − Security and Confidentiality - Breaches of document / record confidentiality and loss of documents / records are recorded as security incidents and managed appropriately
  − Organisational Understanding of Documents / Records Management - Appropriate training is provided to stakeholders and staff as to the roles and responsibilities related to document / records management

Content Management

• Organisation, categorisation, and structure of data / resources so that they can be stored, published, and reused in multiple ways
• Includes data / information that exists in many forms and in multiple stages of completion within its lifecycle
• Content management systems manage the content of a website or intranet through the creation, editing, storing, organising, and publishing of content

Define and Maintain Enterprise Taxonomies (Information Content Architecture)

• Process of creating a structure for a body of information or content
• Contains a controlled vocabulary that can help with navigation and search systems
• Content Architecture identifies the links and relationships between documents and content, specifies document requirements and attributes and defines the structure of content in a document or content management system
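
A controlled vocabulary inside a taxonomy can be sketched as two lookups: one from variant terms to a preferred term, and one from each term to its parent. The terms and hierarchy below are invented for illustration.

```python
# Hypothetical controlled vocabulary: variant term -> preferred term
PREFERRED_TERM = {
    "invoice": "Invoice", "bill": "Invoice",
    "contract": "Contract", "agreement": "Contract",
}

# Hypothetical hierarchy: term -> parent term (None marks a top-level category)
PARENT = {"Invoice": "Finance", "Contract": "Legal", "Finance": None, "Legal": None}


def normalise(term):
    """Map a user-supplied term to its preferred term, or None if it is not in the vocabulary."""
    return PREFERRED_TERM.get(term.strip().lower())


def path(term):
    """Return the navigation path from the top-level category down to the preferred term."""
    node = normalise(term)
    trail = []
    while node is not None:
        trail.append(node)
        node = PARENT[node]
    return list(reversed(trail))


print(path("Bill"))
```

Searching for "bill" or "invoice" leads to the same place in the structure, which is how a controlled vocabulary supports both navigation and search.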

Document / Index Information Content Metadata

• Development of metadata for unstructured data content
• Maintenance of metadata for unstructured data becomes the maintenance of a cross-reference of various local schemes to the official set of organisation metadata

Provide Content Access and Retrieval

• Once the content has been described by metadata / key word tagging and classified within the appropriate Information Content Architecture, it is available for retrieval and use
• Finding unstructured data can be eased through portal technology

Govern for Quality Content

• Managing unstructured data requires effective partnerships between data stewards, data professionals, and records managers
• The focus of data governance can include document and record retention policies, electronic signature policies, reporting formats, and report distribution policies
• High quality, accurate, and up-to-date information will aid in critical business decisions
• Timeliness of the decision-making process with high quality information may increase competitive advantage and business effectiveness

Metadata Management


Metadata Management

• Metadata is data about data
• Metadata Management is the set of processes that ensure proper creation, storage, integration, and control to support associated usage of metadata
• Leveraging metadata in an organisation can provide benefits
  − Increase the value of strategic information by providing context for the data, thus aiding analysts in making more effective decisions
  − Reduce training costs and lower the impact of staff turnover through thorough documentation of data context, history, and origin
  − Reduce data-oriented research time by assisting business analysts in finding the information they need, in a timely manner
  − Improve communication by bridging the gap between business users and IT professionals, leveraging work done by other teams, and increasing confidence in IT system data
  − Increase speed of system development time-to-market by reducing system development life-cycle time
  − Reduce risk of project failure through better impact analysis at various levels during change management
  − Identify and reduce redundant data and processes, thereby reducing rework and use of redundant, out-of-date, or incorrect data

Metadata Management – Definition and Goals

• Definition
  − Planning, implementation, and control activities to enable easy access to high quality, integrated metadata
• Goals
  − Provide organisational understanding of terms and usage
  − Integrate metadata from diverse sources
  − Provide easy, integrated access to metadata
  − Ensure metadata quality and security

Metadata

• Metadata is information about the physical data, technical and business processes, data rules and constraints, and logical and physical structures of the data, as used by an organisation
• Descriptive tags describe data, concepts and the connections between the data and concepts
  − Business Analytics: Data definitions, reports, users, usage, performance
  − Business Architecture: Roles and organisations, goals and objectives
  − Business Definitions: The business terms and explanations for a particular concept, fact, or other item found in an organisation
  − Business Rules: Standard calculations and derivation methods
  − Data Governance: Policies, standards, procedures, programs, roles, organisations, stewardship assignments
  − Data Integration: Sources, targets, transformations, lineage, ETL workflows, EAI, EII, migration / conversion
  − Data Quality: Defects, metrics, ratings
  − Document Content Management: Unstructured data, documents, taxonomies, name sets, legal discovery, search engine indexes
  − Information Technology Infrastructure: Platforms, networks, configurations, licenses
  − Logical Data Models: Entities, attributes, relationships and rules, business names and definitions
  − Physical Data Models: Files, tables, columns, views, business definitions, indexes, usage, performance, change management
  − Process Models: Functions, activities, roles, inputs / outputs, workflow, business rules, timing, stores
  − Systems Portfolio and IT Governance: Databases, applications, projects and programs, integration roadmap, change management
  − Service-Oriented Architecture (SOA) Information: Components, services, messages, master data
  − System Design and Development: Requirements, designs and test plans, impact
  − Systems Management: Data security, licenses, configuration, reliability, service levels

Metadata Management - Overview

• Inputs: Metadata Requirements, Metadata Issues, Data Architecture, Business Metadata, Technical Metadata, Process Metadata, Operational Metadata, Data Stewardship Metadata
• Suppliers: Data Stewards, Data Architects, Data Modelers, Database Administrators, Other Data Professionals, Data Brokers, Government and Industry Regulators
• Participants: Metadata Specialist, Data Integration Architects, Data Stewards, Data Architects and Modelers, Database Administrators, Other DM Professionals, Other IT Professionals, DM Executive, Business Users
• Tools: Metadata Repositories, Data Modeling Tools, Database Management Systems, Data Integration Tools, Business Intelligence Tools, System Management Tools, Object Modeling Tools, Process Modeling Tools, Report Generating Tools, Data Quality Tools, Data Development and Administration Tools, Reference and Master Data Management Tools
• Primary Deliverables: Metadata Repositories, Quality Metadata, Metadata Models and Architecture, Metadata Management, Operational Analysis, Metadata Analysis, Data Lineage, Change Impact Analysis, Metadata Control Procedures
• Consumers: Data Stewards, Data Professionals, Other IT Professionals, Knowledge Workers, Managers and Executives, Customers and Collaborators, Business Users
• Metrics: Metadata Quality, Master Data Service Data Compliance, Metadata Repository Contribution, Metadata Documentation Quality, Steward Representation / Coverage, Metadata Usage / Reference, Metadata Management Maturity, Metadata Repository Availability

Metadata Management Function, Activities and Sub-Activities

• Understand Metadata Requirements
  − Business User Requirements
  − Technical User Requirements
• Define the Metadata Architecture
  − Centralised Metadata Architecture
  − Distributed Metadata Architecture
  − Hybrid Metadata Architecture
• Develop and Maintain Metadata Standards
  − Industry / Consensus Metadata Standards
  − International Metadata Standards
  − Standard Metadata Metrics
• Implement a Managed Metadata Environment
• Create and Maintain Metadata
• Integrate Metadata
• Manage Metadata Repositories
  − Metadata Repositories
  − Directories, Glossaries and Other Metadata Stores
• Distribute and Deliver Metadata
• Query, Report and Analyse Metadata

Metadata Management - Principles

• Establish and maintain a metadata strategy and appropriate policies, especially clear goals and objectives for metadata management and usage
• Secure sustained commitment, funding, and vocal support from senior management concerning metadata management for the enterprise
• Take an enterprise perspective to ensure future extensibility, but implement through iterative and incremental delivery
• Develop a metadata strategy before evaluating, purchasing, and installing metadata management products
• Create or adopt metadata standards to ensure interoperability of metadata across the enterprise
• Ensure effective metadata acquisition for both internal and external metadata
• Maximise user access, since a solution that is not accessed or is under-accessed will not show business value
• Understand and communicate the necessity of metadata and the purpose of each type of metadata; socialisation of the value of metadata will encourage business usage
• Measure content and usage
• Leverage XML, messaging, and Web services
• Establish and maintain enterprise-wide business involvement in data stewardship, assigning accountability for metadata
• Define and monitor procedures and processes to ensure correct policy implementation
• Include a focus on roles, staffing, standards, procedures, training, and metrics
• Provide dedicated metadata experts to the project and beyond
• Certify metadata quality

Understand Metadata Requirements

• Metadata management strategy must reflect an understanding of enterprise needs for metadata
• Gather requirements to confirm the need for a metadata management environment, to set scope and priorities, educate and communicate, to guide tool evaluation and implementation, guide metadata modeling, guide internal metadata standards, guide provided services that rely on metadata, and to estimate and justify staffing needs
• Gather requirements from business and technical users
• Summarise the requirements from an analysis of roles, responsibilities, challenges, and the information needs of selected individuals in the organisation

Business User Requirements

• Business users require improved understanding of the information from operational and analytical systems
• Business users require a high level of confidence in the information obtained from corporate data warehouses, analytical applications, and operational systems
• Business users need appropriate access to information delivery methods, such as reports, queries, ad-hoc, OLAP and dashboards, with a high degree of quality documentation and context
• Business users must understand the intent and purpose of metadata management

Technical User Requirements

• Technical requirement topics include
  − Daily feed throughput: size and processing time
  − Existing metadata
  − Sources - known and unknown
  − Targets
  − Transformations
  − Architecture flow - logical and physical
  − Non-standard metadata requirements
• Technical users must understand the business context of the data at a sufficient level to provide the necessary support, including implementing the calculations or derived data rules

Define the Metadata Architecture

• Metadata management solutions consist of
  − Metadata creation / sourcing
  − Metadata integration
  − Metadata repositories
  − Metadata delivery
  − Metadata usage
  − Metadata control / management

Centralised Metadata Architecture

• Single metadata repository that contains copies of the live metadata from the various sources
• Advantages
  − High availability, since it is independent of the source systems
  − Quick metadata retrieval, since the repository and the query reside together
  − Resolved database structures that are not affected by the proprietary nature of third party or commercial systems
  − Extracted metadata may be transformed or enhanced with additional metadata that may not reside in the source system, improving quality
• Disadvantages
  − Complex processes are necessary to ensure that changes in source metadata quickly replicate into the repository
  − Maintenance of a centralised repository can be substantial
  − Extraction could require custom additional modules or middleware
  − Validation and maintenance of customised code can increase the demands on both internal IT staff and the software vendors

Distributed Metadata Architecture

• Metadata retrieval engine responds to user requests by retrieving data from source systems in real time with no persistent repository
• Advantages
  − Metadata is always as current and valid as possible
  − Queries are distributed, possibly improving response / process time
  − Metadata requests from proprietary systems are limited to query processing rather than requiring a detailed understanding of proprietary data structures, therefore minimising the implementation and maintenance effort required
  − Development of automated metadata query processing is likely simpler, requiring minimal manual intervention
  − Batch processing is reduced, with no metadata replication or synchronisation processes
• Disadvantages
  − No enhancement or standardisation of metadata is possible between systems
  − Query capabilities are directly affected by the availability of the participating source systems
  − No ability to support user-defined or manually inserted metadata entries since there is no repository in which to place these additions

Hybrid Metadata Architecture

• Hybrid architecture where metadata still moves directly from the source systems into a repository but the repository design only accounts for the user-added metadata, the critical standardised items and the additions from manual sources
• Advantages
  − Near-real-time retrieval of metadata from its source and enhanced metadata to meet user needs most effectively, when needed
  − Lowers the effort for manual IT intervention and custom-coded access functionality to proprietary systems
• Disadvantages
  − Source systems must be available because the distributed nature of the back-end systems handles processing of queries
  − Additional overhead is required to link those initial results with metadata augmentation in the central repository before presenting the result set to the end user
  − Design forces the metadata repository to contain the latest version of the metadata source and forces it to manage changes to the source, as well
  − Sets of program / process interfaces to tie the repository back to the metadata source(s) must be built and maintained
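The difference between the three architectures at query time can be sketched as follows. This is a minimal illustration, assuming each source system can be modelled as a function that returns its live metadata. The function names and the `crm.customer` example are hypothetical and do not come from any particular metadata tool.

```python
from typing import Callable, Dict

# Hypothetical stand-in: a source system is a function returning its live metadata.
SourceReader = Callable[[], Dict[str, str]]


def centralised_lookup(repository: Dict[str, Dict[str, str]], name: str) -> Dict[str, str]:
    """Centralised: answer from the repository's copy; sources are not contacted."""
    return dict(repository[name])


def distributed_lookup(sources: Dict[str, SourceReader], name: str) -> Dict[str, str]:
    """Distributed: read the source in real time; nothing is persisted."""
    return dict(sources[name]())


def hybrid_lookup(
    sources: Dict[str, SourceReader],
    augmentations: Dict[str, Dict[str, str]],
    name: str,
) -> Dict[str, str]:
    """Hybrid: live source metadata overlaid with user-added entries held in the repository."""
    merged = dict(sources[name]())
    merged.update(augmentations.get(name, {}))
    return merged


# Illustrative data only.
sources = {"crm.customer": lambda: {"type": "table", "rows": "1200"}}
augmentations = {"crm.customer": {"steward": "Sales Operations"}}

live_only = distributed_lookup(sources, "crm.customer")
enhanced = hybrid_lookup(sources, augmentations, "crm.customer")
```

The hybrid lookup still fails if the source is unavailable, which mirrors the first disadvantage listed for that architecture.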

Develop and Maintain Metadata Standards

• Check industry or consensus standards and international standards
• International standards provide the framework from which the industry standards are developed and executed

Industry / Consensus Metadata Standards

• Understanding the various standards for the implementation and management of metadata in industry is essential to the appropriate selection and use of a metadata solution for an enterprise
  − OMG (Object Management Group) specifications
    • Common Warehouse Metamodel (CWM)
    • Information Management Metamodel (IMM)
    • MDC Open Information Model (OIM)
    • Extensible Markup Language (XML)
    • Unified Modeling Language (UML)
    • XML Metadata Interchange (XMI)
    • Ontology Definition Metamodel (ODM)
  − World Wide Web Consortium (W3C) RDF (Resource Description Framework) for describing and interchanging metadata using XML
  − Dublin Core Metadata Initiative (DCMI) interoperable online metadata standard using RDF
  − Distributed Management Task Force (DMTF) Web-Based Enterprise Management (WBEM) Common Information Model (CIM) standards-based management tools facilitating the exchange of data across otherwise disparate technologies and platforms
  − Metadata standards for unstructured data
    • ISO 5964 - Guidelines for the establishment and development of multilingual thesauri
    • ISO 2788 - Guidelines for the establishment and development of monolingual thesauri
    • ANSI/NISO Z39.1 - American Standard Reference Data and Arrangement of Periodicals
    • ISO 704 - Terminology work - Principles and methods

International Metadata Standards

• ISO / IEC 11179 is an international metadata standard for standardising and registering data elements to make data understandable and shareable

Standard Metadata Metrics

• Controlling the effectiveness of the deployed metadata environment requires measurements to assess user uptake, organisational commitment, and content coverage and quality
  − Metadata Repository Completeness
  − Metadata Documentation Quality
  − Master Data Service Data Compliance
  − Steward Representation / Coverage
  − Metadata Usage / Reference
  − Metadata Management Maturity
  − Metadata Repository Availability
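As an illustration of the first metric, repository completeness could be measured as the share of entries whose required descriptive fields are all filled in. The slide names the metric but does not define a formula, so the calculation and the field names (`definition`, `owner`, `source`) below are assumptions.

```python
def repository_completeness(entries, required=("definition", "owner", "source")):
    """Fraction of repository entries with every required field populated (assumed formula)."""
    if not entries:
        return 0.0
    complete = sum(1 for entry in entries if all(entry.get(field) for field in required))
    return complete / len(entries)


# Illustrative repository content only.
entries = [
    {"name": "customer_id", "definition": "Unique customer key", "owner": "Sales", "source": "CRM"},
    {"name": "order_total", "definition": "", "owner": "Finance", "source": "ERP"},
]
score = repository_completeness(entries)
```

An organisation would normally agree which fields count as required before tracking a figure like this over time.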

Implement a Managed Metadata Environment

• Implement a managed metadata environment in incremental steps in order to minimise risks to the organisation and to facilitate acceptance
• First implementation is a pilot to prove concepts and learn about managing the metadata environment

Create and Maintain Metadata

• Metadata creation and update facility provides for the periodic scanning and updating of the repository in addition to the manual insertion and manipulation of metadata by authorised users and programs
• Audit process validates activities and reports exceptions
• Metadata is the guide to the data in the organisation so its quality is critical

Integrate Metadata

• Integration processes gather and consolidate metadata from across the enterprise including metadata from data acquired outside the enterprise
• Challenges will arise in integration that will require resolution through the governance process
• Use a non-persistent metadata staging area to store temporary and backup files that supports rollback and recovery processes and provides an interim audit trail to assist repository managers when investigating metadata source or quality issues
• ETL tools used for data warehousing and Business Intelligence applications are often used effectively in metadata integration processes

Manage Metadata Repositories

• Implement a number of control activities in order to manage the metadata environment
• Control of repositories is control of metadata movement and repository updates performed by the metadata specialist

Metadata Repositories

• Metadata repository refers to the physical tables in which the metadata are stored
• The design should be generic and not merely reflect the source system database designs
• Metadata should be as integrated as possible; this will be one of the most direct value-added elements of the repository

Directories, Glossaries and Other Metadata Stores

• A Directory is a type of metadata store that limits the metadata to the location or source of data in the enterprise
• A Glossary typically provides guidance for use of terms
• Other Metadata stores include specialised lists such as source lists or interfaces, code sets, lexicons, spatial and temporal schema, spatial reference, and distribution of digital geographic data sets, repositories of repositories and business rules

Distribute and Deliver Metadata

• Metadata delivery layer is responsible for the delivery of the metadata from the repository to the end users and to any applications or tools that require metadata feeds

Query, Report and Analyse Metadata

• Metadata guides management and use of data assets
• A metadata repository must have a front-end application that supports the search-and-retrieval functionality required for all this guidance and management of data assets

Data Quality Management


Data Quality Management

• Critical support process in organisational change management
• Data quality is synonymous with information quality since poor data quality results in inaccurate information and poor business performance
• Data cleansing may result in short-term and costly improvements that do not address the root causes of data defects
• A more rigorous data quality program is necessary to provide an economic solution to improved data quality and integrity
• Institutionalising processes for data quality oversight, management, and improvement hinges on identifying the business needs for quality data and determining the best ways to measure, monitor, control, and report on the quality of data
• Continuous process for defining the parameters for specifying acceptable levels of data quality to meet business needs, and for ensuring that data quality meets these levels

Data Quality Management – Definition and Goals

• Definition
  − Planning, implementation, and control activities that apply quality management techniques to measure, assess, improve, and ensure the fitness of data for use
• Goals
  − To measurably improve the quality of data in relation to defined business expectations
  − To define requirements and specifications for integrating data quality control into the system development lifecycle
  − To provide defined processes for measuring, monitoring, and reporting conformance to acceptable levels of data quality

Data Quality Management

• Data quality expectations provide the inputs necessary to define the data quality framework
• Framework includes defining the requirements, inspection policies, measures, and monitors that reflect changes in data quality and performance
• Requirements reflect three aspects of business data expectations
  − Way to record the expectation in business rules
  − Way to measure the quality of data within that dimension
  − Acceptability threshold
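The three aspects of a requirement (a recorded rule, a way to measure, and an acceptability threshold) can be captured in one small structure. This is a sketch under assumptions: the class name, the conformance-ratio measure and the email example are illustrative and not taken from the DMBOK.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class DataQualityRequirement:
    name: str
    rule: Callable[[dict], bool]  # the expectation recorded as a business rule
    threshold: float              # acceptability threshold, as a fraction between 0 and 1

    def measure(self, records: Sequence[dict]) -> float:
        """Measure quality as the fraction of records that conform to the rule."""
        if not records:
            return 1.0
        return sum(1 for record in records if self.rule(record)) / len(records)

    def is_acceptable(self, records: Sequence[dict]) -> bool:
        return self.measure(records) >= self.threshold


# Illustrative data only.
requirement = DataQualityRequirement(
    name="email is populated",
    rule=lambda record: bool(record.get("email")),
    threshold=0.7,
)
customers = [{"email": "a@example.org"}, {"email": ""}, {"email": "b@example.org"}, {"email": "c@example.org"}]
measured = requirement.measure(customers)
acceptable = requirement.is_acceptable(customers)
```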

Data Quality Management Approach

• Planning for the assessment of the current state and identification of key metrics for measuring data quality
• Deploying processes for measuring and improving the quality of data
• Monitoring and measuring the levels in relation to the defined business expectations
• Acting to resolve any identified issues to improve data quality and better meet business expectations

Data Quality Management - Overview

Inputs
• Business Requirements
• Data Requirements
• Data Quality Expectations
• Data Policies and Standards
• Business metadata
• Technical metadata
• Data Sources and Data Stores

Suppliers
• External Sources
• Regulatory Bodies
• Business Subject Matter Experts
• Information Consumers
• Data Producers
• Data Architects
• Data Modelers

Participants
• Data Quality Analysts
• Data Analysts
• Database Administrators
• Data Stewards
• Other Data Professionals
• DRM Director
• Data Stewardship Council

Tools
• Data Profiling Tools
• Statistical Analysis Tools
• Data Cleansing Tools
• Data Integration Tools
• Issue and Event Management Tools

Primary Deliverables
• Improved Quality Data
• Data Management
• Operational Analysis
• Data Profiles
• Data Quality Certification Reports
• Data Quality Service Level Agreements

Consumers
• Data Stewards
• Data Professionals
• Other IT Professionals
• Knowledge Workers
• Managers and Executives
• Customers

Metrics
• Data Value Statistics
• Errors / Requirement Violations
• Conformance to Expectations
• Conformance to Service Levels

Data Quality Management Function, Activities and Sub-Activities

Data Quality Management
• Develop and Promote Data Quality Awareness
• Define Data Quality Requirements
• Profile, Analyse and Assess Data Quality
• Define Data Quality Metrics
• Define Data Quality Business Rules
• Test and Validate Data Quality Requirements
• Set and Evaluate Data Quality Service Levels
• Continuously Measure and Monitor Data Quality
• Manage Data Quality Issues
• Clean and Correct Data Quality Defects
• Design and Implement Operational DQM Procedures
• Monitor Operational DQM Procedures and Performance

Data Quality Management - Principles

• Manage data as a core organisational asset
• All data elements will have a standardised data definition, data type, and acceptable value domain
• Leverage Data Governance for the control and performance of DQM
• Use industry and international data standards whenever possible
• Downstream data consumers specify data quality expectations
• Define business rules to assert conformance to data quality expectations
• Validate data instances and data sets against defined business rules
• Business process owners will agree to and abide by data quality SLAs
• Apply data corrections at the original source, if possible
• If it is not possible to correct data at the source, forward data corrections to the owner of the original source whenever possible
• Report measured levels of data quality to appropriate data stewards, business process owners, and SLA managers
• Identify a gold record for all data elements

Develop and Promote Data Quality Awareness

• Promoting data quality awareness means more than ensuring that the right people in the organisation are aware of the existence of data quality issues
• Establish a data governance framework for data quality
  − Set priorities for data quality
  − Develop and maintain standards for data quality
  − Report relevant measurements of enterprise-wide data quality
  − Provide guidance that facilitates staff involvement
  − Establish communications mechanisms for knowledge sharing
  − Develop and apply certification and compliance policies
  − Monitor and report on performance
  − Identify opportunities for improvements and build consensus for approval
  − Resolve variations and conflicts

Define Data Quality Requirements

• Applications are dependent on the use of data that meets specific needs associated with the successful completion of a business process
• Data quality requirements are often hidden within defined business policies
  − Identify key data components associated with business policies
  − Determine how identified data assertions affect the business
  − Evaluate how data errors are categorised within a set of data quality dimensions
  − Specify the business rules that measure the occurrence of data errors
  − Provide a means for implementing measurement processes that assess conformance to those business rules
• Dimensions of data quality
  − Accuracy
  − Completeness
  − Consistency
  − Currency
  − Precision
  − Privacy
  − Reasonableness
  − Referential Integrity
  − Timeliness
  − Uniqueness
  − Validity
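Three of the listed dimensions lend themselves to simple column-level measures. The sketch below shows one possible way to compute completeness, uniqueness and validity. The formulas and the two-letter country code pattern are assumptions chosen for illustration, since the slide lists the dimensions without defining how to measure them.

```python
import re


def _present(values):
    """Values that are neither None nor an empty string."""
    return [v for v in values if v is not None and v != ""]


def completeness(values):
    """Share of values that are populated."""
    return len(_present(values)) / len(values) if values else 1.0


def uniqueness(values):
    """Share of populated values that are distinct."""
    present = _present(values)
    return len(set(present)) / len(present) if present else 1.0


def validity(values, pattern):
    """Share of populated values that fully match the given regular expression."""
    present = _present(values)
    if not present:
        return 1.0
    return sum(1 for v in present if re.fullmatch(pattern, v)) / len(present)


# Illustrative column of country codes.
country_codes = ["IE", "FR", None, "IE", "fr"]
scores = {
    "completeness": completeness(country_codes),
    "uniqueness": uniqueness(country_codes),
    "validity": validity(country_codes, r"[A-Z]{2}"),
}
```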

Profile, Analyse and Assess Data Quality

• Perform an assessment of the data using two different approaches, bottom-up and top-down
• Bottom-up assessment of existing data quality issues involves inspection and evaluation of the data sets themselves
• Top-down approach involves understanding how business processes consume data, and which data elements are critical to the success of the business application
  − Identify a data set for review
  − Catalog the business uses of that data set
  − Subject the data set to empirical analysis using data profiling tools and techniques
  − List all potential anomalies, review and evaluate
  − Prioritise criticality of important anomalies in preparation for defining data quality metrics
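The empirical analysis step can be as simple as counting missing values, distinct values and the most frequent values in a column. This is a minimal sketch using only the standard library; the function name and the returned keys are illustrative and are not the interface of any data profiling product.

```python
from collections import Counter


def profile_column(values, top_n=3):
    """Return basic profile statistics for one column of values."""
    present = [v for v in values if v is not None and v != ""]
    frequencies = Counter(present)
    return {
        "count": len(values),
        "missing": len(values) - len(present),
        "distinct": len(frequencies),
        "top": frequencies.most_common(top_n),  # candidate anomalies are often in the tail, not the top
    }


# Illustrative column only.
profile = profile_column(["A", "B", "A", None, ""])
```

Unexpected values, a high missing count or a surprising number of distinct values are the kind of potential anomalies that would then be listed and reviewed.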

Define Data Quality Metrics

• Poor data quality affects the achievement of business objectives
• Seek and use indicators of data quality performance to report the relationship between flawed data and missed business objectives
• Measure data quality in the same way as any other type of business performance activity is monitored
• Data quality metrics should be reasonable and effective
  − Measurability
  − Business Relevance
  − Acceptability
  − Accountability / Stewardship
  − Controllability
  − Trackability

Define Data Quality Business Rules

• Measurement of conformance to specific business rules requires definition
• Monitoring conformance to these rules requires
  − Segregating data values, records, and collections of records that do not meet business needs from the valid ones
  − Generating a notification event alerting a data steward of a potential data quality issue
  − Establishing an automated or event driven process for aligning or possibly correcting flawed data within business expectations
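The first two monitoring requirements, segregating non-conforming records and raising a notification event, can be sketched together. The rule names, record fields and event structure below are hypothetical; a real implementation would route the events to whatever alerting mechanism the organisation uses.

```python
def apply_rules(records, rules):
    """Split records into valid and rejected, and build one notification event per rejection."""
    valid, rejected, events = [], [], []
    for record in records:
        failed = [name for name, rule in rules.items() if not rule(record)]
        if failed:
            rejected.append(record)
            # The event is what a data steward would be alerted with.
            events.append({"record": record, "failed_rules": failed})
        else:
            valid.append(record)
    return valid, rejected, events


# Illustrative rules and data only.
rules = {
    "customer_id present": lambda r: bool(r.get("customer_id")),
    "age within 0-120": lambda r: isinstance(r.get("age"), int) and 0 <= r["age"] <= 120,
}
records = [
    {"customer_id": "C1", "age": 34},
    {"customer_id": "", "age": 34},
    {"customer_id": "C3", "age": 150},
]
valid, rejected, events = apply_rules(records, rules)
```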

Test and Validate Data Quality Requirements

• Data profiling tools analyse data to find potential anomalies
• Data profiling tools allow data analysts to define data rules for validation, assessing frequency distributions and corresponding measurements and then applying the defined rules against the data sets
• Characterising data quality levels based on data rule conformance provides an objective measure of data quality
• By using defined data rules to validate data, an organisation can distinguish those records that conform to defined data quality expectations from those that do not
• In turn, these data rules are used to baseline the current level of data quality as compared to ongoing audits

Set and Evaluate Data Quality Service Levels

• Data quality SLAs specify the organisation’s expectations for response and remediation
• Having data quality inspection and monitoring in place increases the likelihood of detection and remediation of a data quality issue before a significant business impact can occur
• Operational data quality control defined in a data quality SLA includes
  − The data elements covered by the agreement
  − The business impacts associated with data flaws
  − The data quality dimensions associated with each data element
  − The expectations for quality for each data element for each of the identified dimensions in each application or system in the value chain
  − The methods for measuring against those expectations
  − The acceptability threshold for each measurement
  − The individual(s) to be notified in case the acceptability threshold is not met
  − The timelines and deadlines for expected resolution or remediation of the issue
  − The escalation strategy and possible rewards and penalties when the resolution times are met or missed
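Several of the SLA elements above (data element, dimension, acceptability threshold, who to notify and the resolution deadline) can be held in one record and checked against a measurement. This is a sketch under assumptions: the class, field names and the two-day resolution window are illustrative, not a prescribed SLA format.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class SlaTerm:
    data_element: str
    dimension: str
    threshold: float           # acceptability threshold for the measurement
    notify: List[str]          # individual(s) to be notified on a breach
    resolve_within: timedelta  # deadline for resolution or remediation


def evaluate(term: SlaTerm, measured: float, detected_at: datetime) -> Optional[dict]:
    """Return a breach notice if the measurement is below the threshold, else None."""
    if measured >= term.threshold:
        return None
    return {
        "data_element": term.data_element,
        "dimension": term.dimension,
        "notify": term.notify,
        "resolve_by": detected_at + term.resolve_within,
    }


# Illustrative term and measurement only.
term = SlaTerm("customer.email", "completeness", 0.98, ["data.steward@example.org"], timedelta(days=2))
breach = evaluate(term, measured=0.95, detected_at=datetime(2010, 3, 8, 9, 0))
no_breach = evaluate(term, measured=0.99, detected_at=datetime(2010, 3, 8, 9, 0))
```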

Continuously Measure and Monitor Data Quality

• Provide continuous monitoring by incorporating control and measurement processes into the information processing flow
• Incorporating the results of the control and measurement processes into both the operational procedures and reporting frameworks enables continuous monitoring of the levels of data quality

Manage Data Quality Issues

• Supporting the enforcement of the data quality SLA requires a mechanism for reporting and tracking data quality incidents and activities for researching and resolving those incidents
• A data quality incident reporting system provides this capability
• Tracking of data quality incidents provides performance reporting data, including mean-time-to-resolve issues, frequency of occurrence of issues, types of issues, sources of issues and common approaches for correcting or eliminating problems
• Data quality incident tracking also requires a focus on training staff to recognise when data issues appear and how they are to be classified, logged and tracked according to the data quality SLA
• Implementing a data quality issues tracking system provides a number of benefits
  − Information and knowledge sharing can improve performance and reduce duplication of effort
  − Analysis of all the issues will help data quality team members determine any repetitive patterns, their frequency, and potentially the source of the issue
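Two of the performance figures mentioned above, mean-time-to-resolve and frequency by issue type, can be derived directly from an incident log. The incident structure (`type`, `opened`, `resolved`) is an assumption made for this sketch.

```python
from collections import Counter
from datetime import datetime


def mean_time_to_resolve_hours(incidents):
    """Average hours between opening and resolution, ignoring unresolved incidents."""
    durations = [
        (i["resolved"] - i["opened"]).total_seconds() / 3600
        for i in incidents
        if i.get("resolved") is not None
    ]
    return sum(durations) / len(durations) if durations else None


def issues_by_type(incidents):
    """Frequency of occurrence per issue type."""
    return Counter(i["type"] for i in incidents)


# Illustrative incident log only.
incidents = [
    {"type": "missing value", "opened": datetime(2010, 3, 1, 9), "resolved": datetime(2010, 3, 1, 11)},
    {"type": "duplicate", "opened": datetime(2010, 3, 2, 9), "resolved": datetime(2010, 3, 2, 13)},
    {"type": "missing value", "opened": datetime(2010, 3, 3, 9), "resolved": None},
]
mttr = mean_time_to_resolve_hours(incidents)
frequency = issues_by_type(incidents)
```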

Clean and Correct Data Quality Defects

• Perform data correction in three general ways
  − Automated correction - Submit the data to data quality and data cleansing techniques using a collection of data transformations and rule-based standardisations, normalisations, and corrections
  − Manual directed correction - Use automated tools to cleanse and correct data but require manual review before committing the corrections to persistent storage
  − Manual correction - Data stewards inspect invalid records and determine the correct values, make the corrections, and commit the updated records
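The difference between automated and manual directed correction is whether proposed changes are committed immediately or only after review. A minimal sketch follows, assuming a hypothetical rule table of country name standardisations; the `approve` callback stands in for the manual review step.

```python
# Illustrative rule-based standardisation table (assumption, not from the source).
COUNTRY_FIXES = {"ireland": "IE", "eire": "IE", "france": "FR"}


def propose_corrections(records, field, fixes):
    """Return (index, old_value, new_value) for every record a rule can correct."""
    proposals = []
    for index, record in enumerate(records):
        key = str(record[field]).strip().lower()
        if key in fixes and record[field] != fixes[key]:
            proposals.append((index, record[field], fixes[key]))
    return proposals


def commit_corrections(records, field, proposals, approve=None):
    """Apply proposals to a copy of the records.

    With approve=None every proposal is applied (automated correction).
    Otherwise only proposals the reviewer approves are applied (manual directed correction).
    """
    corrected = [dict(record) for record in records]
    for index, old, new in proposals:
        if approve is None or approve(index, old, new):
            corrected[index][field] = new
    return corrected


records = [{"country": "Ireland"}, {"country": "FR"}, {"country": " eire "}]
proposals = propose_corrections(records, "country", COUNTRY_FIXES)
automated = commit_corrections(records, "country", proposals)
reviewed = commit_corrections(records, "country", proposals, approve=lambda i, old, new: i == 0)
```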

Design and Implement Operational DQM Procedures

• Using defined rules for validation of data quality provides a means of integrating data inspection into a set of operational procedures associated with active DQM
• Design and implement detailed procedures for operationalising activities
  − Inspection and monitoring
  − Diagnosis and evaluation of remediation alternatives
  − Resolving the issue
  − Reporting

Monitor Operational DQM Procedures and Performance

• Accountability is critical to the governance protocols overseeing data quality control
• Issues must be assigned to some number of individuals, groups, departments, or organisations
• Tracking process should specify and document the ultimate issue accountability to prevent issues from dropping through the cracks
• Metrics can provide valuable insights into the effectiveness of the current workflow, as well as systems and resource utilisation, and are important management data points that can drive continuous operational improvement for data quality control

Conducting a Data Management Project


Conducting a Data Management Project

• Data management project depends on:
  − Scope of the Project – data management functions to be encompassed
  − Type of Project – from architecture to analysis to implementation
  − Scope Within the Organisation – one or more business units or the entire organisation

Data Management Function and Project Type

Scope of Project (matrix columns)
• Data Governance
• Data Architecture Management
• Data Development
• Data Operations Management
• Data Security Management
• Reference and Master Data Management
• Data Warehousing and Business Intelligence Management
• Document and Content Management
• Metadata Management
• Data Quality Management

Type of Project (matrix rows)
• Architecture
• Analysis and Design
• Implementation
• Operational Improvement
• Management and Administration

Mapping the Path Through the Selected Data Management Project

• Use the framework to define the breakdown of the selected project

Project Elements – Data Management Functions, Type of Project, Organisational Scope

Diagram axes:
• Organisational Scope of Project
• Data Management Functions Within Scope of Project
• Type of Project

• Select the project building blocks based on the project scope

Creating a Data Management Team


Creating a Data Management Team

• Once implemented, a data management framework must be monitored, managed and constantly improved
• Need to consolidate and coordinate data management and governance efforts to meet the challenges of
  − Demand for performance management data
  − Complexity in systems and processes
  − Greater regulatory and compliance requirements
• Build a Data Management Center of Excellence (DMCOE)

Data Management Center of Excellence

• Separate business units within the organisation generally implement their own solutions
• Each business unit will have different IT systems, data warehouses/data marts and business intelligence tools
• Organisation-wide coordination of data resources requires a centralised dedicated structure like the DMCOE providing data services
• Leads an organisation to business benefits through continuous improvement of data management
• DMCOE functions need to focus on leveraging organisational knowledge and skills to maximise the value of data to the organisation
• Maximise technology investment while decreasing costs and increasing efficiency, centralise best practices and standards, empower knowledge workers with information and provide thought leadership to the entire company
• DMCOE does not exist in isolation from other operations and service management functions

DMCOE Functions

• Maximise the value of the data technology investment to the organisation by taking a portfolio approach to increase skills and leverage and to optimise the infrastructure
• Focus on project delivery and information asset creation with an emphasis on reusability and knowledge management along with solution delivery
• Ensure the integrity of the organisation’s business processes and information systems
• Ensure the quality compliance effort related to the configuration, development, and documentation of enhancements
• Develop information learning and effective practices

Data Charter

• Create a charter that lists the fundamental principles of data management the DMCOE will adhere to:
  − Data Strategy - Create a data blueprint, based upon business functions to facilitate data design
  − Data Sharing - Promote the sharing of data across the organisation and reduce data redundancy
  − Data Integrity - Ensure the integrity of data from design and availability perspectives
  − Technical Expertise - Provide the expertise for the development and support of data systems
  − High Availability and Optimal Performance - Ensure consistent high availability of data systems through proper design and use and optimise performance of the data systems

DMCOE Skills

• DMCOE needs skills across three dimensions
  − Specific data management functions
  − Business management and administration
  − Technology and service management

DMCOE Skills

Data Management Business Skills
• Data Management Strategy
• Data Management Portfolio Management
• Personnel Management
• Data Management Process Management
• Data Management Design and Development

Data Management Specific Functions
• Data Governance
• Data Architecture Management
• Data Development
• Data Operations Management
• Data Security Management
• Reference and Master Data Management
• Data Warehousing and Business Intelligence Management
• Document and Content Management
• Metadata Management
• Data Quality Management

Data Management Technology and Service Functions
• Environment and Infrastructure Management
• Service Management and Support
• Application Deployment and Data Migration
• Technical Architecture

• Idealised set of DMCOE skills that need to be customised to suit specific organisation needs
• Just one view of a DMCOE

DMCOE Business Management and Administration Skills

Skill areas
• Data Management Strategy
• Data Management Portfolio Management
• Personnel Management
• Data Management Process Management
• Data Management Design and Development

Component skills
• Strategic Planning Processes
• Data Usage Strategy
• Creation and Enforcement of Data Principles and Standards
• Management of Portfolio of Data Management Initiatives
• Co-ordination of Data Management Systems and Initiatives
• Vendor Management
• Education and Skills Identification and Development
• Resource Management and Allocation
• Performance Management
• Creation and Enforcement of Process Standards
• Management of Data Processes
• Data Quality
• Requirements Definition and Management
• Analysis and Design
• Development Standards
• Solution Development and Deployment

DMCOE Technology and Service Management Skills

Environment and Infrastructure Management
• Change Management and Control
• Version Management and Control
• Performance Monitoring and Management
• Reporting

Service Management and Support
• Service Desk
• Service Level Management
• Security Management
• System Maintenance

Application Deployment and Data Migration
• Application Deployment
• Test Management – System, Integration, UAT, UAT Support
• Data Migration Management

Technical Architecture
• Infrastructure Architecture
• Application and Tools Architecture
• Data and Content Architecture
• Integration Architecture

Benefits of DMCOE

• Consistent infrastructure that reduces time to analyse, design and implement new IT solutions
• Reduced data management costs through a consistent data architecture and data integration infrastructure, with reduced complexity, redundancy and tool proliferation
• Centralised repository of the organisation's data knowledge
• Organisation-wide standard methodology and processes to develop and maintain data infrastructure and procedures
• Increased data availability
• Increased data quality

Assessing Your Data Management Maturity


Assessing Your Data Management Maturity

• A Data Management Maturity Model is a measure of, and a process for determining, the level of maturity that exists within an organisation’s data management function
• Provides a systematic framework for improving data management capability and identifying and prioritising opportunities, reducing cost and optimising the business value of data management investments
• Measure of data management maturity so that:
  − It can be tracked over time to measure improvements
  − It can be used to define projects for data management maturity improvements within cost, time, and return on investment constraints
• Enables organisations to improve their data management function so that they can increase productivity, increase quality, decrease cost and decrease risk

Data Management Maturity Model

• Assesses data management maturity on a scale of 1 to 5 across a number of data management capabilities

Level | Title | Description
1 | Initial | Data management is ad hoc and localised. Everybody has their own approach that is unique and not standardised except for local initiatives.
2 | Repeatable and Reactive | Data management has become independent of the person or business unit administering and is standardised.
3 | Defined and Standardised | Data management is fully documented, determined by subject matter experts and validated.
4 | Managed and Predictable | Data management results and outcomes are stored and proactively cross-related within and between business units. The data management function actively exploits the benefits of standardisation.
5 | Optimising and Innovating | As time, resources, technology, requirements and business landscape change, the data management function is able to be easily and quickly adjusted to fit new needs and environments.
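An assessment against this model produces one level per data management capability. The source does not say how the capability scores should be combined, so the summary below, which reports the weakest capability and the mean level, is only one possible convention. The capability names and scores are illustrative.

```python
# Level titles as given in the maturity model table.
LEVELS = {
    1: "Initial",
    2: "Repeatable and Reactive",
    3: "Defined and Standardised",
    4: "Managed and Predictable",
    5: "Optimising and Innovating",
}


def summarise(scores):
    """Summarise per-capability maturity levels (assumed convention: weakest and mean)."""
    if not scores:
        raise ValueError("at least one capability score is required")
    for capability, level in scores.items():
        if level not in LEVELS:
            raise ValueError(f"{capability}: level must be an integer from 1 to 5")
    weakest = min(scores, key=scores.get)
    return {
        "weakest_capability": weakest,
        "weakest_level": LEVELS[scores[weakest]],
        "mean_level": sum(scores.values()) / len(scores),
    }


# Illustrative assessment only.
assessment = {"Data Governance": 2, "Data Quality Management": 3, "Metadata Management": 1}
summary = summarise(assessment)
```

Repeating the same assessment at intervals gives the tracking over time that the model is intended to support.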

Maturity Level 1 - Initial

• Data management processes are mostly disorganised and generally performed on an ad hoc or even chaotic basis
• Data is considered as general purpose and is not viewed by either business or executive management to be a problem or a priority
• Data is accessible but not always available and is not secure or auditable
• No data management group and no one owns the responsibility for ensuring the quality, accuracy or integrity of the data
• Data management (to the degree that it is done at all) is reliant on the efforts and competence of individuals
• Data proliferates without control and the quality is inconsistent across the various business and application silos
• Data exists in unconnected databases and spreadsheets using multiple formats and inconsistent definitions
• Little data profiling or analysis and data is not considered or understood as a component of linked processes
• No formal data quality processes and the processes that do exist are not repeatable because they are neither well defined nor well documented

Maturity Level 2 - Repeatable and Reactive

• Fundamental data management practices are established, defined, documented and can be repeated
• Data policies for creation and change management exist, but still rely on individuals and are not institutionalised throughout the organisation
• Data as a valuable asset is a concept understood by some, but senior management support is lacking and there is little organisational buy-in to the importance of an enterprise-wide approach to managing data
• Data is stored locally and data quality is reactive to circumstances
• Requirements are known and managed at the business unit and application level
• Procurement is ad hoc based on individual needs and data duplication is mostly invisible
• Data quality varies among business units and data failures occur on a cross-functional basis
• Most data is integrated point-to-point and not across business units

Maturity Level 3 - Defined and Standardised • • • • •

• • • •

Business analysts begin to control the data management process with IT playing a supporting role Data is recognised as a business enabler and moves from an undervalued commodity to an enterprise asset but there are still limited controls in place Executive management appreciates and understands the role of data governance and commits resources to its management Data administrative function exists as a complement to the database administration function and data is present for both business and IT related development discussions Some core data has defined policy that it is documented as part of the applications development lifecycle and the policies are enforced to a limited extent and testing is performed to ensure that data quality requirements are being achieved Data quality is not fully defined and there are multiple views of what quality Metadata repository exists and a data group maintains corporate data definitions and business rules A centralised platform for managing data is available at the group level and feeds analytical data marts Data is available to business users and can be audited


Maturity Level 4 - Managed and Predictable

• Data is treated as a critical corporate asset and viewed as equivalent to other enterprise-wide assets
• Unified data governance strategy exists throughout the enterprise with executive level and CEO support
• Data management objectives are reviewed by senior management
• Business process interaction is completely documented and planning is centralised
• Data quality control, integration and synchronisation are integral parts of all business processes
• Content is monitored and corrected in real time to manage the reliability of the data manufacturing process and is based on the needs of customers, end users and the organisation as a whole
• Data quality is understood in statistical terms and managed throughout the transaction lifecycle
• Root cause analysis is well established and proactive steps are taken to prevent and not just correct data inconsistencies
• A centralised metadata repository exists and all changes are synchronised
• Data consistency is expected and achieved
• Data platform is managed at the enterprise level and feeds all reference data repositories
• Advanced platform tools are used to manage the metadata repository and all data transformation processes
• Data quality and integration tools are standardised across the enterprise


Maturity Level 5 - Optimising and Innovating

• The organisation is in continuous improvement mode
• Process enhancements are managed through monitoring feedback and a quantitative understanding of the causes of data inconsistencies
• Enterprise-wide business intelligence is possible
• Organisation is agile enough to respond to changing circumstances and evolving business objectives
• Data is considered as the key resource for process improvement
• Data requirements for all projects are defined and agreed prior to initiation
• Development stresses the re-use of data and is synchronised with the procurement process
• Process of data management is continuously being improved
• Data quality (both monitoring and correction) is fully automated and adaptive
• Uncontrolled data duplication is eliminated and controlled duplication must be justified
• Governance is data driven and the organisation adopts a “test and learn” philosophy


Data Management Maturity Evaluation - Key Capabilities and Maturity Levels

Columns: Level 1 | Level 2 | Level 3 | Level 4 | Level 5

Rows (one per capability; each cell holds < Description of capability associated with maturity level >):

• Data Governance
• Data Architecture Management
• Data Development
• Data Operations Management
• Data Security Management
• Reference and Master Data Management
• Data Warehousing and Business Intelligence Management
• Document and Content Management
• Metadata Management
• Data Quality Management
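
The evaluation grid above can be captured as a simple data structure. The following is a minimal Python sketch, not part of the original framework: the function name `summarise_assessment`, the choice to report the lowest and mean level, and the validation rules are all illustrative assumptions. Only the ten capability names and the 1 to 5 level range come from the slide.

```python
from typing import Dict, List, Union

# Capability names taken from the evaluation grid on the slide.
CAPABILITIES: List[str] = [
    "Data Governance",
    "Data Architecture Management",
    "Data Development",
    "Data Operations Management",
    "Data Security Management",
    "Reference and Master Data Management",
    "Data Warehousing and Business Intelligence Management",
    "Document and Content Management",
    "Metadata Management",
    "Data Quality Management",
]

MIN_LEVEL, MAX_LEVEL = 1, 5  # maturity levels 1 to 5


def summarise_assessment(
    assessment: Dict[str, int],
) -> Dict[str, Union[int, float, List[str]]]:
    """Validate a capability -> level mapping and summarise it.

    Hypothetical helper: the summary fields (lowest level, weakest
    capabilities, mean level) are an assumption for illustration,
    not something the framework prescribes.
    """
    unknown = [c for c in assessment if c not in CAPABILITIES]
    if unknown:
        raise ValueError(f"Unknown capabilities: {unknown}")

    missing = [c for c in CAPABILITIES if c not in assessment]
    if missing:
        raise ValueError(f"Capabilities not assessed: {missing}")

    for capability, level in assessment.items():
        if not MIN_LEVEL <= level <= MAX_LEVEL:
            raise ValueError(f"{capability}: level {level} is outside 1-5")

    lowest = min(assessment.values())
    weakest = [c for c in CAPABILITIES if assessment[c] == lowest]
    mean_level = sum(assessment.values()) / len(assessment)
    return {"lowest_level": lowest, "weakest": weakest, "mean_level": mean_level}


if __name__ == "__main__":
    # Example: every capability assessed at level 3 except one.
    example = {capability: 3 for capability in CAPABILITIES}
    example["Metadata Management"] = 2
    print(summarise_assessment(example))
```

Reporting the weakest capabilities alongside the mean is one possible design choice: it keeps a single low-scoring area from being hidden by a reasonable average.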


More Information

Alan McSweeney [email protected]

