Understanding Multi Dimensional Database
Prepared By Amit Sharma Hyperion/OBIEE Trainer learnhyperion.wordpress.com
[email protected] Learnhyperion.wordpress.com
[email protected] for all Hyperion video tutorial/Training/Certification/Material
Review
Architecture
Characteristics
Relational OLAP
Multidimensional OLAP
ROLAP VS. MOLAP
Learnhyperion.wordpress.com
[email protected] for all Hyperion video tutorial/Training/Certification/Material
Today Star Schema Fact table Dimensions Drilling Down & Roll up Slicing & Dicing
Learnhyperion.wordpress.com
[email protected] for all Hyperion video tutorial/Training/Certification/Material
Fact Definition : Facts are numeric measurements (values) that represent a specific business activity. Facts are stored in a FACT table I.e. the center of the star schema. Facts are used in business data analysis, are units, cost, prices and revenues Example: sales figures are numeric measurements that represent product and/or service sales. .
Learnhyperion.wordpress.com
[email protected] for all Hyperion video tutorial/Training/Certification/Material
Fact Table Central table Mostly raw numeric items Narrow rows, a few columns at most Large number of rows (millions to a billion) Access via dimensions
Learnhyperion.wordpress.com
[email protected] for all Hyperion video tutorial/Training/Certification/Material
Fact Table Definition :The centralized table in a star schema is called as FACT table, that contains facts and connected to dimensions. A fact table typically has two types of columns:
Contain facts and
Foreign keys to dimension tables.
The primary key of a fact table is usually a composite key that is made up of all of its foreign keys. A fact table might contain either detail level facts or facts that have been aggregated (fact tables that contain aggregated facts are often instead called summary tables). A fact table usually contains facts with the same level of aggregation.
Learnhyperion.wordpress.com
[email protected] for all Hyperion video tutorial/Training/Certification/Material
Dimension Definition : Qualifying characteristics that provide additional perspective to a given fact. Example: sales might be compared by product from region to region and from one time period to the next. Here sales have product, location and time dimensions. Such dimensions are stored in DIMENSIONAL TABLE.
Learnhyperion.wordpress.com
[email protected] for all Hyperion video tutorial/Training/Certification/Material
Dimension Tables Definition: The dimensions of the fact table are further described with dimension tables Fact table:
Sales (Market_id, Product_Id, Time_Id, Sales_Amt) Dimension Tables:
Market (Market_Id, City, State, Region) Product (Product_Id, Name, Category, Price) Time (Time_Id, Week, Month, Quarter)
Learnhyperion.wordpress.com
[email protected] for all Hyperion video tutorial/Training/Certification/Material
What is Star Schema? Definition: Star Schema is a relational database schema for representing multidimensional data. It is the simplest form of data warehouse schema that contains one or more dimensions and fact tables. It is called a star schema because the entityrelationship diagram between dimensions and fact tables resembles a star where one fact table is connected to multiple dimensions. The center of the star schema consists of a large fact table and it points towards the dimension tables. The advantage of star schema are slicing down, performance increase and easy understanding of data.
Learnhyperion.wordpress.com
[email protected] for all Hyperion video tutorial/Training/Certification/Material
Steps in designing Star Schema Identify a business process for analysis(like sales). Identify measures or facts (sales dollar). Identify dimensions for facts(product dimension, location dimension, time dimension, organization dimension). List the columns that describe each dimension.(region name, branch name, region name). Determine the lowest level of summary in a fact table(sales dollar). In a star schema every dimension will have a primary key. In a star schema, a dimension table will not have any parent table. Whereas in a snow flake schema, a dimension table will have one or more parent tables. Hierarchies for the dimensions are stored in the dimensional table itself in star schema. Whereas hierarchies are broken into separate tables in snow flake schema. These hierarchies helps to drill down the data from topmost hierarchies to the lowermost hierarchies. Learnhyperion.wordpress.com
[email protected] for all Hyperion video tutorial/Training/Certification/Material
Attributes Each dimension table contain attributes. Used to search, filter and classify facts. Example, Sales, we can identify some attributes for each dimension: Product Dimension: product ID, description, product type Location Dimension: region, state, city. Time Dimension: year quarter, month, week and date.
Learnhyperion.wordpress.com
[email protected] for all Hyperion video tutorial/Training/Certification/Material
Attributes Hierarchy Definition : AH provides a top-down data organization Used for aggregation and drill-down/roll-up data analysis. Example, location dimension attributes can be organized in a hierarchy by region, state and city. AH provides the capability to perform drill-down and roll-up searches. Allows the DW and OLAP systems to to have defined path.
Learnhyperion.wordpress.com
[email protected] for all Hyperion video tutorial/Training/Certification/Material
A Concept Hierarchy: Dimension (location) all
all Europe
region country city office
Germany
Frankfurt
...
...
...
Spain
North_America Canada
Vancouver ... L. Chan
...
Mexico
Toronto
... M. Wind
Learnhyperion.wordpress.com
[email protected] for all Hyperion video tutorial/Training/Certification/Material
A Concept Hierarchy: Dimension (location) Product Line
Product Family
Product Category
Product Name
Books
Arts and Music Architecture
High Top Almonds
Books
Audiobooks
Biographies
Wild Swans: Three Daughters of China
Books
Audiobooks
Childrens
The Hobbit
Books
Audiobooks
Childrens
Winnie The Pooh
Books
Audiobooks
Fiction
The Adventures of Huckleberry Finn
Product_Line->Product_Family->Product_Category->Product_Name Learnhyperion.wordpress.com
[email protected] for all Hyperion video tutorial/Training/Certification/Material
Multidimensional Data Sales volume as a function of product, month, and region
Re gi on
Dimensions: Product, Location, Time Hierarchical summarization paths Industry Region
Year
Product
Category Country Quarter Product
City Office
Month
Week
Day
Month Learnhyperion.wordpress.com
[email protected] for all Hyperion video tutorial/Training/Certification/Material
TV PC VCR sum
1Qtr
2Qtr
Date
3Qtr
4Qtr
sum
Total annual sales of TV in U.S.A. U.S.A Canada Mexico
Country
Pr od uc t
A Sample Data Cube
sum
Learnhyperion.wordpress.com
[email protected] for all Hyperion video tutorial/Training/Certification/Material
Total annual sales of TV in U.S.A.
Country
Pr od uc t
A Sample Data Cube Date TV1Qtr2Qtr3Qtr 4Qtr sum U.S.A PC VCR sum Canada Mexico sum
Essbase
Sales Manager
John
Qtr1
Qtr2
Qtr3
Qtr4
Sales
Sales
Sales
Sales
New York
1000
1000
1000
1000
California
200
200
200
200
63446
63446
63446
63446
Ohio
6633
6633
6633
6633
Illnois
3466
3466
3466
3466
Texas
Qtr1 Mac
Qtr2
Qtr3
Qtr4
New York California Texas Ohio
300
Illnois
Learnhyperion.wordpress.com
[email protected] for all Hyperion video tutorial/Training/Certification/Material
Star Schema A single fact table and for each dimension one dimension table Does not capture hierarchies directly
Learnhyperion.wordpress.com
[email protected] for all Hyperion video tutorial/Training/Certification/Material
Example of Star Schema: Figure 1.6
Learnhyperion.wordpress.com
[email protected] for all Hyperion video tutorial/Training/Certification/Material
In the example, sales fact table is connected to dimensions location, product, time and organization. It shows that data can be sliced across all dimensions and again it is possible for the data to be aggregated across multiple dimensions. "Sales dollar" in sales fact table can be calculated across all dimensions independently or in a combined manner which is explained below. Sales dollar value for a particular product
Sales dollar value for a product in a location
Sales dollar value for a product in a year within a location
Sales dollar value for a product in a year within a location sold or serviced by an employee
Learnhyperion.wordpress.com
[email protected] for all Hyperion video tutorial/Training/Certification/Material
Example of Star Schema time
item
time_key day day_of_the_week month quarter year
Sales Fact Table time_key item_key branch_key
branch branch_key branch_name branch_type
location_key units_sold dollars_sold avg_sales
item_key item_name brand type supplier_type
location location_key street city province_or_street country
Measures Learnhyperion.wordpress.com
[email protected] for all Hyperion video tutorial/Training/Certification/Material
Aggregation
Many OLAP queries involve aggregation of the data in the fact table For example, to find the total sales (over time) of each product in each market, we might use SELECT S.Market_Id, S.Product_Id, SUM (S.Sales_Amt) FROM Sales S GROUP BY S.Market_Id, S.Product_Id The aggregation is over the entire time dimension and thus produces a two-dimensional view of the data
Learnhyperion.wordpress.com
[email protected] for all Hyperion video tutorial/Training/Certification/Material
Aggregation Over Time The output of the previous query
Market_Id
Product_Id
SUM(Sales_Amt)
M1
M2
M3
P1
3003
1503
…
P2
6003
2402
…
P3
4503
3
…
P4
7503
7000
…
P5
…
…
…
M4
Learnhyperion.wordpress.com
[email protected] for all Hyperion video tutorial/Training/Certification/Material
Typical OLAP Operations Roll up (drill-up): summarize data by climbing up hierarchy or by dimension reduction Drill down (roll down): reverse of roll-up from higher level summary to lower level summary or detailed data, or introducing new dimensions Slice and dice: project and select Pivot (rotate): reorient the cube, visualization, 3D to series of 2D planes. Other operations drill across: involving (across) more than one fact table drill through: through the bottom level of the cube to its back-end relational tables (using SQL)
Learnhyperion.wordpress.com
[email protected] for all Hyperion video tutorial/Training/Certification/Material
Drilling Down and Rolling Up Some dimension tables form an aggregation hierarchy Market_Id → City → State → Region Executing a series of queries that moves down a hierarchy (e.g., from aggregation over regions to that over states) is called drilling down Requires the use of the fact table or information more specific than the requested aggregation (e.g., cities) Executing a series of queries that moves up the hierarchy (e.g., from states to regions) is called rolling up
Learnhyperion.wordpress.com
[email protected] for all Hyperion video tutorial/Training/Certification/Material
Drilling Down Drilling down on market: from Region to State Sales (Market_Id, Product_Id, Time_Id, Sales_Amt) Market (Market_Id, City, State, Region) SELECT S.Product_Id, M.Region, SUM (S.Sales_Amt) FROM Sales S, Market M WHERE M.Market_Id = S.Market_Id GROUP BY S.Product_Id, M.Region SELECT S.Product_Id, M.State, SUM (S.Sales_Amt) FROM Sales S, Market M WHERE M.Market_Id = S.Market_Id GROUP BY S.Product_Id, M.State,
Learnhyperion.wordpress.com
[email protected] for all Hyperion video tutorial/Training/Certification/Material
Rolling Up
Rolling up on market, from State to Region If we have already created a table, State_Sales, using 1.
SELECT (S.Sales_Amt) FROM WHERE GROUP BY
S.Product_Id,
M.State, SUM
Sales S, Market M M.Market_Id = S.Market_Id S.Product_Id, M.State
then we can roll up from there to: 2.
SELECT T.Product_Id, M.Region, SUM (T.Sales_Amt) FROM State_Sales T, Market M WHERE M.State = T.State GROUP BY T.Product_Id, M.Region
Learnhyperion.wordpress.com
[email protected] for all Hyperion video tutorial/Training/Certification/Material
Roll-up and Drill Down Higher Level of Aggregation
Roll Up
Region Country State Location Address
Drill-Down
Sales Channel
Sales Representative
Low-level Details Learnhyperion.wordpress.com
[email protected] for all Hyperion video tutorial/Training/Certification/Material
“Slicing and Dicing”
The Telecomm Slice
Product
Household Telecomm Video Audio
s n o i eg Europe R Far East India Retail Direct Special
Sales Channel
Learnhyperion.wordpress.com
[email protected] for all Hyperion video tutorial/Training/Certification/Material
Snowflake Schema A snowflake schema is a term that describes a star schema structure normalized through the use of outrigger tables. i.e dimension table hierarchies are broken into simpler tables. In star schema example we had 4 dimensions like location, product, time, organization and a fact table (sales)
Learnhyperion.wordpress.com
[email protected] for all Hyperion video tutorial/Training/Certification/Material
Snowflake schema Represent dimensional hierarchy directly by normalizing tables. Easy to maintain and saves storage
Learnhyperion.wordpress.com
[email protected] for all Hyperion video tutorial/Training/Certification/Material
Example of Snowflake Schema
Learnhyperion.wordpress.com
[email protected] for all Hyperion video tutorial/Training/Certification/Material
Example of Snowflake Schema time time_key day day_of_the_week month quarter year
item Sales Fact Table time_key item_key branch_key
branch
location_key
branch_key branch_name branch_type
units_sold dollars_sold avg_sales
Measures
item_key item_name brand type supplier_key
supplier
supplier_key supplier_type
location location_key street city_key
city
city_key city province_or_street country
Learnhyperion.wordpress.com
[email protected] for all Hyperion video tutorial/Training/Certification/Material
Questions??
Prepared By Amit Sharma Hyperion/OBIEE Trainer learnhyperion.wordpress.com
[email protected] Learnhyperion.wordpress.com
[email protected] for all Hyperion video tutorial/Training/Certification/Material