21741638 the Crm Data Warehouse
Short Description
crm...
Description
CHAPTER 4 THE CRM DATA WAREHOUSE
WHAT IS A DATA WAREHOUSE? A large reservoir of detailed and summary data that describes the organization and its activities, organized by the various business dimensions in a way to facilitate easy retrieval of information describing activities data mart –a subset of the data warehouse, tailored to meet the specialized needs of a particular group of users Top-down approach bottom-up approach to data warehouse development—the data marts are created first and then integrated.
Data Warehousing Objectives (1) (2) (3) (4)
keep the warehouse data current; ensure that the warehouse data is accurate; keep the warehouse data secure; make the warehouse data easily available to authorized users; (5) maintain descriptions of the warehouse data so that users and system developers can understand the meaning of each element
Data Warehouse vs. DBMS OLTP (on-line transaction processing) Major task of traditional relational DBMS Day-to-day operations: purchasing, inventory, banking, manufacturing, payroll, registration, accounting, etc.
OLAP (on-line analytical processing) Major task of data warehouse system Data analysis and decision making
Distinct features (OLTP vs. OLAP): User and system orientation: customer vs. market
Data contents: current, detailed vs. historical, consolidated Database design: ER + application vs. star + subject View: current, local vs. evolutionary, integrated Access patterns: update vs. read-only but complex queries
OLTP
OLAP
users
clerk, IT professional
knowledge worker
function
day to day operations
decision support
DB design
application-oriented
subject-oriented
data
current, up-to-date detailed, flat relational isolated repetitive
historical, summarized, multidimensional integrated, consolidated ad-hoc lots of scans
unit of work
read/write index/hash on prim. key short, simple transaction
# records accessed
tens
millions
#users
thousands
hundreds
DB size
100MB-GB
100GB-TB
metric
transaction throughput
query throughput, response
usage access
complex query
DATA WAREHOUSE ARCHITECTURE staging area— data is prepared to be moved into the warehouse data repository and the metadata repository metadata— data about data, or descriptions of the data in the data warehouse Exhibit 4.1: A Data Warehouse System Model
EXHIBIT 4.1 A DATA WAREHOUSE SYSTEM MODEL Data Warehouse System
Data gathering system
Staging area
Warehouse data repository
Legend : Data flow
Management and control
Control flow
Metadata repository
Information Delivery system
A Data Warehouse System Model Management and Control management and control component—like a traffic officer standing in the middle of a street intersection, controlling the flow of traffic through the intersection
Staging Area ETL— extraction, transformation, and loading as the activities of this staging area extraction— obtaining data from the internal databases and files of systems, accomplished according to a schedule transformation— a process that includes cleaning, standardizing, reformatting, and summarizing loading— writing the data into the data warehouse
A Data Warehouse System Model WAREHOUSE DATA REPOSITORY where the warehouse data is stored within the computer system or systems
Data Content customer picture—a compilation of geographic, demographic, activity, psychographic, and behavioral data
Data Characteristics the types of data to be processed, including considerations of data granularity, data hierarchies, and data dimensions Data Types fixed-length format variable-length format
A Data Warehouse System Model Data Granularity the degree of detail that is represented by the data, where the greater the detail, the finer the granularity Data Hierarchies since multiple attributes can describe a single entity, an attribute is a data element that identifies or describes an occurrence of a data entity (i.e., a particular customer is identified by a customer number attribute)
Exhibit 4.2: An Example of a Data Hierarchy
Data Dimensions for example, a manager can query the data warehouse for a display of data according to salesperson, customer, product, and time Exhibit 4.3: Every Data Record Contains the Time Element
EXHIBIT 4.2 AN EXAMPLE OF A DATA HIERARCHY
Customer
Customer number Customer age Customer gender Customer marital status Customer number of dependents Customer education Customer dwelling type Customer state Customer city Customer zip code
EXHIBIT 4.3 EVERY DATA RECORD CONTAINS THE TIME ELEMENT
Customer sales order
Customer invoice
Sales order date
Invoice date Customer statement Statement date
Warehouse shipping order Date shipped
Customer payment Customer Payment date
A Data Warehouse System Model METADATA REPOSITORY describes the flow of data from the time that the data is captured until it is archived, i.e., metadata in the metadata repository for the customer number attribute would describe its format, editing rules, and so on TYPES OF METADATA Metadata for Users (analysis) identification of the source systems, the time of the last update, the different report formats that are available, and how to find data in the data warehouse Metadata for Systems Developers data to allow developers to maintain, revise, and reengineer the data warehouse system, including the various rules that were employed in creating the warehouse data repository, and the rules for extraction, cleansing, transforming, purging, and archiving
A Data Warehouse System Model Data and Process Models object diagrams and entity-relationship diagrams use cases, use case diagrams, and data flow diagrams
CASE Tools stands for computer-aided system engineering and is a way to use the computer to develop systems
DBMS Systems include a data dictionary component, which contains excellent descriptions of the data in the database or data warehouse.
HOW DATA IS STORED IN THE DATA WAREHOUSE dimension table— a list of all of the attributes that identify and describe a particular entity Exhibit 4.4: A Sample Dimension Table fact table— a list of all the facts that relate to some type of the organization’s activity Exhibit 4.5: A Sample Fact Table
EXHIBIT 4.4 A SAMPLE DIMENSION TABLE
Customer Customer number Customer name Customer phone number Customer e-mail address Customer territory Customer credit code Customer standard industry code Customer city Customer state Customer zip code
EXHIBIT 4.5 A SAMPLE FACT TABLE
Commercial Sales Facts Actual sales units Budgeted sales units Actual sales amount Budgeted sales amount Sales discount amount Net sales amount Sales commission amount Sales bonus amount Sales tax amount
INFORMATION PACKAGES a table that is maintained in the data warehouse repository that identifies both the dimensions and the facts that relate to a business activity Exhibit 4.6: Information Package Format key—a number, such as a customer number, that identifies a particular occurrence of the dimension Exhibit 4.7: A Sample Information Package
EXHIBIT 4.6 INFORMATION PACKAGE FORMAT
Subject : Name of business activity being measured Dimension Name
Dimension Name
Dimension Name
Dimension Name
Dimension Key
Dimension Key
Dimension Key
Dimension Key
Dimension 1
Dimension 1
Dimension 1
Dimension 1
Dimension 2
Dimension 2
Dimension 2
Dimension 2
Dimension 3
Dimension 3
Dimension n
Dimension 3
Dimension 4
Dimension n
Dimension n Facts : Numberic measures of the business activity
Dimension 4 Dimension n
EXHIBIT 4.7 A SAMPLE INFORMATION PACKAGE
Subject : Commercial sales Time
Salesperson
Customer
Product
Time Key
Salesperson key
Customer key
Product key
Hour
Salesperson name Customer name
Product name
Day
Sales branch
Customer territory
Product model
Month
Sales region
Customer credit code Product brand
Quarter
Subsidiary
Product line
Year Facts : Actual sales units, budgeted sales units, actual sales amount, budgeted sales amount, sales discount amount, net sales amount, sales commission amount, sales bonus amount, sales tax amount
STAR SCHEMAS the arrangement of an information package that usually identifies multiple dimension tables for a single fact table and has the appearance of a star, with the fact table in the center and the dimension tables forming the points Exhibit 4.8: Star Schema Format foreign keys— a means of linking the fact table to the dimension tables by means of the keys identified at the top of the fact table where the keys identify other, ―foreign‖ tables as opposed to the fact table Exhibit 4.9: A Sample Star Schema
EXHIBIT 4.8 STAR SCHEMA FORMAT
Dimension 1 name Dimension 1 key Dimension 1 hierarchy
Dimension 2 name Dimension 2 key Dimension 2 hierarchy
Business activity name
Dimension 1 key Dimension 2 key Dimension n key Measurable fact 2 Measurable fact 4 Measurable fact 5 Measurable fact n
Dimension n name Dimension n key Dimension n hierarchy
EXHIBIT 4.9 A SAMPLE STAR SCHEMA Time
Customer Customer key Customer name Customer type Customer credit code Salesperson number Sales territory Standard industry code Salesperson Salesperson key Salesperson name Sales region Sales branch
Product sales facts Product key Customer key Salesperson key Time key Sales units Gross sales amount Sales discount amount Net sales amount Sales commission amount
Time key Day Month Quarter Year
Customer payment Product key Product name Product unit price Product quantity
Example of Star Schema time
item
time_key day day_of_the_week month quarter year
Sales Fact Table time_key item_key branch_key
branch
location_key
branch_key branch_name branch_type
units_sold
dollars_sold avg_sales
Measures
item_key item_name brand type supplier_type
location location_key street city province_or_street country
Example of Snowflake Schema time time_key day day_of_the_week month quarter year
item Sales Fact Table
time_key item_key branch_key
branch
location_key
branch_key branch_name branch_type
units_sold dollars_sold avg_sales
Measures
item_key item_name brand type supplier_key
supplier supplier_key supplier_type
location location_key street city_key
city
city_key city province_or_street country
Example of Fact Constellation time time_key day day_of_the_week month quarter year
item Sales Fact Table time_key
item_key item_name brand type supplier_type
item_key
location_key
branch_key branch_name branch_type
units_sold dollars_sold avg_sales
Measures
time_key
item_key shipper_key
from_location
branch_key branch
Shipping Fact Table
location
to_location
location_key street city province_or_street country
dollars_cost units_shipped shipper shipper_key shipper_name location_key shipper_type
DATA WAREHOUSE NAVIGATION summary information— preprocessed data that provides the user with exactly the content that is needed top-down navigation— the user seeks more detail in an effort to understand the summary information roll up navigation— the user summarizes data to ―see the forest rather than the trees‖ or to prepare summary graphs drill across navigation— the user moves from one data hierarchy to another, i.e., information on customer sales, salesperson sales, and then product sales Exhibit 4.10: Navigation Paths
EXHIBIT 4.10 NAVIGATION PATHS
Summary information (Net sales for the Western sales region)
Roll up
Drill across
Hierarchy 1 (customer)
Hierarchy 2 (salesperson)
Drill down Detailed information (Net sales for salesperson 3742) Drill through Detailed data (Sales units for salesperson 3742)
Hierarchy n (product)
DATA WAREHOUSE SECURITY information systems security— damage, destruction, theft, and misuse Exhibit 4.11: The Security Action Cycle The Corporate Security Environment deterrence— security policies and procedures that are intended to deter security violations, such as guidelines for proper system use and the requirement that users change their passwords periodically prevention— measures aimed at those persons who ignore deterrence, and include such things as locks on computer rooms, user passwords, file permissions detection— proactive actions include system audits, reports of suspicious activity, and virus scanning software and reactive actions take the form of investigations remedies— respond with warnings, reprimands, termination of employment, or legal action.
EXHIBIT 4.11 THE SECURITY ACTION CYCLE
1. Deterrence
Deterrence feedback
2. Prevention
3. Detection
4. Remedies
Deterred abuse
Prevented abuse
Undetected abuse
Unpunished abuse
Maximize Deterred abuse
Maximize Prevented abuse
Maximize Undetected abuse
Maximize Unpunished abuse
DATA WAREHOUSE SECURITY Data Warehouse Security Measures network security— using procedures such as firewalls to restrict access to the network that houses the servers and data files, databases, data warehouses, and data marts data security— obtaining access to data once access to the network has been achieved; where, data files may be located on multiple servers on the network, and the user must provide a second password database or data warehouse security— the security checks of the database management system (DBMS) that may include a third password, verification of user name, and also verification of access to particular data tables, records, and even record fields
View more...
Comments