DW Assignment1

May 1, 2018 | Author: Tejaswini Skumar | Category: Data Warehouse, Metadata, Information System, Top Down And Bottom Up Design, Information

6/14/2017

Assignment 01 Data Warehousing and Data Mining

Tejaswini Shivakumar Student Id: 1508973

CHAPTER 1:

1. What do we mean by strategic information? For a commercial bank, name five types of strategic objectives.

Answer: Strategic information is broad-based information required to make decisions on the formulation and execution of business strategies and objectives. It is not used to run the daily operations of the business; it is used for analysis, discerning trends, and monitoring business performance. Strategic information systems are developed in response to corporate business initiatives and are intended to give the organization a competitive advantage. ... Strategic information management (SIM) is a salient feature in the world of information technology (IT). The data warehouse is the new computing environment that provides this strategic information.

Five types of strategic objectives for a commercial bank are:
   1. Retain customers by making quick decisions.
   2. Introduce two new credit card schemes for students within two years.
   3. Product bundling: a strategy widely used by banks, such as offering a free checking account to customers who open a savings account.
   4. Make bank accounts easy to access from mobile devices to increase the number of online banking users.
   5. Teller referrals: train tellers consistently to look for opportunities to cross-sell bank products and to refer customers to the right person.

2. Do you agree that a typical retail store collects huge volumes of data through its operational systems? Name three types of transaction data likely to be collected by a retail store in large volumes during its daily operations.

Answer: Yes. A typical retail store collects huge volumes of data through its operational systems, which are online transaction processing (OLTP) systems that enter the data into the various databases. Three types of transaction data likely to be collected in large volumes during daily operations are:
   1. Creation and update of customer information for loyalty customers.
   2. Credit card approvals and card payment submissions.
   3. Sale and delivery of gift cards.

3. Examine the opportunities that can be provided by strategic information for a medical center. Can you list five such opportunities?

Answer: Opportunities that strategic information can provide for a medical center are:
   1. Improve vaccination rates, for example by combining offers with vaccinations.
   2. Promote a culture that embraces, expects, and rewards the delivery of patient- and family-centered care.
   3. Use advanced equipment and instruments for medical research and treatment.
   4. Increase the scope, quality, and impact of innovative research into the root causes of cancer and its eradication.
   5. Attract, retain, and mentor excellent, well-qualified doctors, nurses, and trainees.

4. Why were all the past attempts by IT to provide strategic information failures? List three concrete reasons and explain.

Answer: The past attempts failed because IT tried to provide strategic information from the operational systems. Operational systems such as order processing, inventory control, claims processing, and billing were not designed to provide strategic information; they were built to run the daily core business of the company. Three concrete reasons are:
   1. IT receives many ad hoc requests, which creates a large overhead and a backlog. With finite resources, IT cannot answer every request in a timely manner, which delays decisions.
   2. The requests are not only numerous but also keep changing over time, and users need additional reports to expand on and understand the earlier ones.
   3. IT was unable to provide a flexible, interactive information environment suited to analysis and strategic decision making.

5. Describe five differences between operational systems and informational systems.

Answer: Five differences between operational systems and informational systems are:
   1. Operational systems deal with current data values, whereas informational systems deal with archived, derived, and summarized data values.
   2. Data structures are optimized for transactions in operational systems and for complex queries in informational systems.
   3. Data is accessed with high frequency in operational systems and with medium to low frequency in informational systems.
   4. Operational systems read, update, and delete data, whereas informational systems only read it.
   5. Operational systems have a large number of users, whereas informational systems have a relatively small number.

6. Why are operational systems not suitable for providing strategic information? Give three specific reasons and explain.

Answer: Operational systems are used to run the daily business of the company; they are its bread and butter. These systems are responsible for putting the data into the database.

They collect data such as customer name, sales amount, date, and product number to capture business transactions. Strategic information, by contrast, is used for decision making through analysis and performance monitoring. Operational systems are not suitable for providing it for three reasons:
   1. They hold current data values, whereas analysis of trends also needs archived and summarized data.
   2. Their data structures are optimized for transactions, not for complex queries.
   3. Their data is spread across separate applications and must first be converted into meaningful, consistent information before it can support reports for decision making.

7. Name six characteristics of the computing environment needed to provide strategic information.

Answer: Six characteristics of the computing environment needed to provide strategic information are:
   1. It provides an ideal environment for data analysis and decision support.
   2. It is fluid, flexible, and interactive for the users.
   3. It is a 100 percent user-driven environment.
   4. It supports read-intensive data usage.
   5. It follows a highly responsive, interactive usage pattern.
   6. It provides the ability to find answers to complex, unpredictable questions.

8. What types of processing take place in a data warehouse? Describe.

Answer: A data warehouse is an informational environment that presents a flexible and interactive source of strategic information. The major processing that takes place in this environment is analytical. There are four levels of analytical processing requirements:
   1. Running simple queries and reports against current and past data.
   2. Performing "what if" analysis in many different ways.
   3. Querying, stepping back, analyzing, and then resuming the process to any desired length.
   4. Identifying past trends and applying them to future results.

9. A data warehouse is an environment, not a product. Discuss.

Answer: A data warehouse is not a particular piece of software or hardware. It is a user-centric computing environment in which users obtain strategic information. The users are directly linked with the data they need for better decision making. It is a flexible and interactive environment for decision making, data analysis, and performance monitoring.

10. Data warehousing is the only viable means to resolve the information crisis and to provide strategic information. List four reasons to support this assertion and explain them.

Answer:
   1. Operational systems are not suitable for providing strategic information. They are used to run the daily business of the company and are responsible for putting data into the database. Data taken directly from them is unsuitable for strategic decision making.
   2. Strategic information is used for decision making through analysis and performance monitoring. Operational data must first be converted into meaningful information before it can support such reports, and business intelligence needs drive the data warehouse to provide it.
   3. The data warehouse directly links users with the data they need for better decision making.
   4. The information crisis arose because enterprises accumulated huge amounts of data over the years that stayed locked up and inaccessible, so the required information could not be delivered. The data warehouse makes this data available in a usable form.

For these reasons, the data warehouse is the only viable means of delivering strategic information.
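The contrast this chapter draws between operational and informational processing (questions 5 and 8) can be made concrete with a small sketch. It is only an illustration under stated assumptions: the `orders` table, its columns, and the sample rows are hypothetical, and a single in-memory SQLite table stands in for both kinds of system, whereas in practice the analytical query would run against a separate data warehouse.

```python
import sqlite3

# Hypothetical in-memory table standing in for an order processing system.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "order_id INTEGER PRIMARY KEY, product TEXT, amount REAL, status TEXT)"
)
conn.executemany(
    "INSERT INTO orders (product, amount, status) VALUES (?, ?, ?)",
    [("pen", 2.5, "open"), ("pen", 5.0, "open"), ("book", 12.0, "open")],
)

# Operational access: read and update one record at a time (current value only).
conn.execute("UPDATE orders SET status = 'shipped' WHERE order_id = ?", (1,))
status = conn.execute(
    "SELECT status FROM orders WHERE order_id = ?", (1,)
).fetchone()[0]

# Informational access: a read-only query that summarizes many records.
sales_by_product = conn.execute(
    "SELECT product, SUM(amount) FROM orders GROUP BY product ORDER BY product"
).fetchall()

print(status)
print(sales_by_product)
```

The first access pattern touches a single row and changes it; the second only reads, but scans and aggregates every row, which is why the two workloads call for differently optimized data structures.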

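CHAPTER 2:

1. Name at least six characteristics or features of a data warehouse.

Answer: Six characteristics of a data warehouse are:
   1) Subject-oriented data: data is organized by business subject rather than by application.
   2) Time-variant data: the data carries a time element.
   3) Precise and accurate data.
   4) Data granularity: data is kept at defined levels of detail.
   5) Non-volatile data: data is not updated by transactions once it is stored.
   6) Integrated data: data from different sources is made consistent.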
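Two of these characteristics, the time element and data granularity, can be illustrated with a minimal sketch. The deposit records, the account identifiers, and the `roll_up` helper are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from datetime import date

# Hypothetical atomic (most detailed) deposit records: one row per transaction.
# Every row carries a date, which is the time element of the data.
# Rows are only ever appended, never updated in place (non-volatile).
deposits = [
    (date(2017, 1, 5), "ACC-1", 200.0),
    (date(2017, 1, 20), "ACC-2", 50.0),
    (date(2017, 2, 3), "ACC-1", 125.0),
    (date(2018, 1, 9), "ACC-2", 300.0),
]


def roll_up(rows, key):
    """Summarize atomic rows to a coarser level of detail chosen by `key`."""
    totals = defaultdict(float)
    for day, _account, amount in rows:
        totals[key(day)] += amount
    return dict(totals)


# The same atomic data summarized at two coarser levels.
monthly = roll_up(deposits, lambda d: (d.year, d.month))
annual = roll_up(deposits, lambda d: d.year)

print(monthly)
print(annual)
```

Because each record is dated and kept at the transaction level, any coarser summary (monthly, annual) can be derived from it, while the reverse is not possible.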

2. Why is data integration required in a data warehouse, more so there than in an operational application?

Answer: The data warehouse draws its data from operational systems. For accurate decision making, all relevant data has to be collected from the various applications, which are spread across separate operational systems. An operational application works only with its own data, so it does not face this problem to the same degree. Before the data is stored in the data warehouse, it must go through transformation, consolidation, and integration to remove inconsistencies.

3. Every data structure in the data warehouse contains the time element. Why?

Answer: An operational system stores current data values. For example, in an order entry system, the status of an order is the current status of the order. Along with current values, the data warehouse stores historical data obtained from the operational systems, because it is used for analysis over time. Hence every data structure in the data warehouse has a time element associated with it. This is an important aspect for the design and implementation phases ahead.

4. Explain data granularity and how it is applicable to the data warehouse.

Answer: Granularity is the level of detail of the data stored in the data warehouse. Broadly, there are two levels: low granularity and high granularity. Low granularity refers to detailed data at the atomic, transaction level. High granularity refers to summarized data, such as monthly or annual totals. By keeping data at more than one level of granularity, the data warehouse can serve detailed queries as well as summaries and reports, for example an annual report of deposits at a bank.

5. How are the top-down and bottom-up approaches for building a data warehouse different? Discuss the merits and disadvantages of each approach.

Answer: The top-down approach starts from the larger picture: the enterprise-wide data warehouse is built first, with a precise definition of the data stored in it.

The advantages of the top-down approach are:
   1) It enables a truly corporate effort and an enterprise view of data.
   2) It is inherently architected and is not a collection of different data marts.
   3) It provides a single store of data that can be accessed widely.
   4) It provides centralized rules for all data.
   5) It may give quick results if implemented in iterations.

The disadvantages of the top-down approach are:
   1) It takes longer to build, even with an iterative method.
   2) It is highly exposed to the risk of failure.
   3) It needs a high level of cross-functional skills.
   4) It requires a high outlay without proof of concept.

The bottom-up approach builds the warehouse one piece at a time: each group or category of data is examined, analyzed, and delivered as its own data mart.

The advantages of the bottom-up approach are:
   1) It allows faster and easier implementation of manageable pieces.
   2) The return on investment is favorable, and it provides proof of concept.
   3) The risk of failure is lower.
   4) It is incremental, so the most important data marts can be scheduled first.
   5) It lets the project team learn and grow.

The disadvantages of the bottom-up approach are:
   1) Each data mart has its own narrow view of data.
   2) It spreads redundant data across data marts.
   3) It perpetuates inconsistent and irreconcilable data.
   4) Data fragmentation is its biggest weakness.

6. What are the various data sources for the data warehouse?

Answer: The data sources for the data warehouse are:
   1) Production data: data from the operational systems of the enterprise.
   2) Internal data: data that users and departments keep privately, such as spreadsheets and documents.
   3) Archived data: operational data changes periodically, and the past data is archived.
   4) External data: executives rely on data from external sources for a large share of the information they need.

Data from these sources passes through the data staging component, which works in three phases. Data extraction deals with the numerous data sources. Data transformation converts the source data into the form the data warehouse requires. Data loading involves two groups of tasks: the initial load and the ongoing incremental loads.

7. Why do you need a separate data staging component?

Answer: Data collected from different operational systems and external sources arrives in different formats and has to be changed into a format suitable for storage and further analysis. The data staging component provides a separate area for this work, which takes place in three phases: first the data is extracted, then it is transformed, and finally it is loaded. Data staging gives a separate place for cleaning, changing, and combining diverse data before it is stored in the data warehouse for future use.

8. Under data transformation, list five different functions you can think of.

Answer: Data transformation is an important step in preparing data for the data warehouse. Five of its functions are:
   1) Data conversion: source data is converted into the form the data warehouse requires before the database is populated.
   2) Handling ongoing changes: the data in the data warehouse is not just the initially loaded data, so changes in the data sources have to be picked up and transformed before each load.
   3) Standardization of data elements, such as data types and field lengths.
   4) Sorting and changing data, which takes place on a large scale.
   5) Cleaning the data, including resolving synonyms and homonyms.
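The three staging phases from question 7 and several of the transformation functions from question 8 can be sketched as follows. This is a minimal illustration, not a real ETL tool: the two source layouts, the `STATE_SYNONYMS` lookup, and the function names are all assumptions made for the example.

```python
# Hypothetical rows from two source systems with different field names.
crm_rows = [
    {"cust_name": " alice smith ", "state": "calif."},
    {"cust_name": "Bob Jones", "state": "NY"},
]
billing_rows = [
    {"customer": "ALICE SMITH", "st": "CA"},
    {"customer": "Carol Wu", "st": "New York"},
]

# Assumed lookup that resolves synonyms for the same state to one code.
STATE_SYNONYMS = {"calif.": "CA", "ca": "CA", "ny": "NY", "new york": "NY"}


def extract():
    """Pull rows from each source and map their fields to one layout."""
    for row in crm_rows:
        yield {"name": row["cust_name"], "state": row["state"]}
    for row in billing_rows:
        yield {"name": row["customer"], "state": row["st"]}


def transform(rows):
    """Clean, standardize, de-duplicate, and sort the extracted rows."""
    seen = set()
    cleaned = []
    for row in rows:
        # Cleaning: collapse stray whitespace and normalize capitalization.
        name = " ".join(row["name"].split()).title()
        # Standardization: resolve synonyms to a single state code.
        state = STATE_SYNONYMS[row["state"].strip().lower()]
        # Eliminate duplicates that appear in more than one source.
        if (name, state) not in seen:
            seen.add((name, state))
            cleaned.append({"name": name, "state": state})
    # Sorting: order rows by customer name before loading.
    return sorted(cleaned, key=lambda r: r["name"])


def load(rows, warehouse):
    """Append the transformed rows to the warehouse table (a list here)."""
    warehouse.extend(rows)
    return warehouse


warehouse = load(transform(extract()), [])
for row in warehouse:
    print(row)
```

Keeping the three phases as separate functions mirrors the reason for a separate staging area: the sources stay untouched, and only cleaned, consistent rows reach the warehouse.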

9. Name any six different methods for information delivery.

Answer: The information delivery component offers several methods for delivering information, online or over the intranet, to users of the data warehouse and data marts. Six methods of information delivery are:
   1) Ad hoc reports
   2) Complex queries
   3) Multidimensional (MD) analysis
   4) Statistical analysis
   5) EIS feed
   6) Data mining

10. What are the three major types of metadata in a data warehouse? Briefly mention the purpose of each type.

Answer: Metadata is data about data, similar to a data dictionary. The three major types of metadata are:
   1) Operational metadata: Data for the data warehouse comes from various operational systems across the enterprise, which use different field lengths and data types. In selecting data from these sources, you split records, combine parts of records from different files, and deal with multiple coding schemes. Operational metadata holds all of this information about the operational data sources.
   2) Extraction and transformation metadata: This contains data about the extraction of data from the source systems, such as extraction methods and business rules. It also contains information about the transformations that take place in the data staging component.
   3) End-user metadata: This is the navigational map of the data warehouse. It helps end users find information in the data warehouse, and it lets them look for that information using their own business terminology.
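As a rough illustration of how the three types of metadata fit together, the sketch below keeps them side by side in one small catalog. The `ColumnMetadata` class, the field names, and the two sample entries are hypothetical and do not correspond to any particular metadata tool.

```python
from dataclasses import dataclass


@dataclass
class ColumnMetadata:
    # End-user metadata: the business term a user searches by.
    business_term: str
    warehouse_column: str
    # Operational metadata: source field, data type, and field length.
    source_field: str
    source_type: str
    source_length: int
    # Extraction and transformation metadata: rule applied in staging.
    transformation_rule: str


# Hypothetical catalog entries, invented for this example.
catalog = [
    ColumnMetadata("Customer name", "dim_customer.name",
                   "CRM.CUST_NM", "CHAR", 40, "trim and title-case"),
    ColumnMetadata("Sales amount", "fact_sales.amount",
                   "BILLING.AMT", "DECIMAL", 9, "convert cents to dollars"),
]


def find_by_term(term):
    """Look up warehouse columns by the user's own business term."""
    wanted = term.strip().lower()
    return [c for c in catalog if c.business_term.lower() == wanted]


match = find_by_term("sales amount")[0]
print(match.warehouse_column, "comes from", match.source_field)
```

The lookup by business term plays the role of the navigational map, while the remaining fields let the result be traced back to its operational source and the rule applied during staging.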
