Seven Styles of Data Integration

December 9, 2016 | Author: Odeleye Wole

A White Paper by Kevin Quinn

Kevin Quinn

Kevin Quinn is vice president of product marketing for Information Builders. Kevin has over 24 years of experience designing and implementing business intelligence and enterprise integration solutions. He has written and published many articles and white papers on strategic information architecture. In his various roles at Information Builders, Kevin has helped companies worldwide develop information deployment strategies that accelerate decision-making and improve corporate performance. He has worked with global companies to implement best practices for their successful deployments. Kevin graduated from Queens College with a bachelor’s degree in Computer Science.

Table of Contents

Executive Summary


Seven Styles of Data Integration


Traditional Data Warehouse


Real-Time Data Warehouse


Operational Data Access


Enterprise Information Integration (EII)


Process Integration


Search Technology


Data Access via Web Services



Executive Summary

Most people assume that the starting point for any business intelligence (BI) project is a data warehouse. In reality, while data warehouses are important for many types of analytical systems, they aren’t always necessary. Building a data warehouse can dramatically increase the cost of a BI project. It can also reduce the value of the information by taking timely operational data and making it dated or even irrelevant. In our experience, many BI projects can realize benefits from alternate data integration scenarios.

The intended audience for this paper is project, business, and IT managers who have responsibility for BI activities. If you hold one of these positions, we’d like to expand your understanding of BI projects by describing seven proven techniques for accessing BI data. We’ll use real-world examples from Information Builders’ customer base to demonstrate the high-value, high-return data access options that are available to you.

Data warehouses themselves are not the problem. The problem arises when a data warehouse is viewed as a solution to all BI deployments, or there is an expectation that simply building a data warehouse will drive users to information. Data warehouses should not be implemented without a clear understanding of the business challenges they will solve. Before building a data warehouse, you should also carefully research potential data-access architectures to make sure you have devised the best method for connecting your BI tools with your data.

This paper describes data warehouses along with many other options for placing relevant, timely information in the hands of business users. As you’ll see, while a data warehouse is a good solution in some instances, many BI applications are better served with integration and portal technologies that simply pull data into reports on an as-needed basis.

In the pages that follow, we’ll review seven basic ways to integrate and access data to solve various business problems:

1. A traditional data warehouse, periodically refreshed from production data sources
2. A real-time data warehouse, constantly updated by trickle-feeding data from production data sources
3. Operational data access, in which users obtain a real-time view of business activity from operational data and applications
4. Enterprise information integration (EII), in which BI users benefit from the real-time aggregation of corporate data across multiple data sources
5. Process integration, which involves delivering real-time information based on a business event or as part of a business process
6. Search technology that can rapidly scan indexed content to create Google-style results from data sources throughout the enterprise
7. Web services, which can expose or extract data from multiple sources of information, irrespective of underlying operating systems, applications, or databases


Information Builders

Seven Styles of Data Integration

Traditional Data Warehouse

Data warehouses are important for many BI projects, particularly when analytic systems are involved. Generally they involve gathering data from multiple sources to create an aggregated source of information for reporting. A data warehouse is a consolidated view of enterprise data, optimized for reporting and analysis. Data and information are extracted from production data sources as they are generated (real-time information), or in periodic stages (latent information), making it simpler and more efficient to run queries against that data, rather than to separately access each data source.

A data warehouse takes data from one or more sources on a scheduled (e.g. daily, weekly, monthly) basis.

There are many valid reasons for building a data warehouse, including the following:

■ To reduce overhead on a transaction-processing system or production application by staging data to a reporting database
■ To reduce the complexity of the data and put it in a form that is suitable for reporting
■ To maintain and analyze historical data that is no longer accessible in operational applications

For example, Moneris Solutions, Canada’s leading technology merchant of credit card processing, created a data warehouse to allow merchants to view daily and historical sales data. Developers used data-integration technology from iWay Software, an Information Builders company, to extract data from point-of-sale and transactional card systems in three data centers, and load it into a Microsoft SQL Server-based data warehouse. Moneris maintains three months’ worth of daily transactions and 24 months of summary data in its data warehouse, which is dimensionally modeled to speed up reporting and analysis activities.

The company downloads about five million rows of new transactional information into the warehouse each day to support a merchant base of more than 300,000 customers. It’s a massive data access, summarization, and delivery exercise, and a data warehouse is an ideal way to supply the information customers need. These merchants use WebFOCUS to run parameterized reports such as the daily authorization log, monthly merchant statement, and daily corporate summary, creating reports about individual stores as well as rolling up summary information to reflect larger operations.
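The scheduled-refresh pattern described above, with daily detail rows plus a rolled-up summary, can be sketched in a few lines. This is a minimal illustration only: it uses an in-memory SQLite database in place of a real warehouse, and the table names, column names, and sample values (`daily_txn`, `merchant_summary`, amounts in cents) are assumptions made for the example, not details of the Moneris system.

```python
import sqlite3


def make_warehouse():
    """Create an in-memory stand-in for the reporting database (hypothetical schema)."""
    wh = sqlite3.connect(":memory:")
    wh.execute(
        "CREATE TABLE daily_txn (txn_date TEXT, merchant_id TEXT, amount_cents INTEGER)"
    )
    wh.execute(
        "CREATE TABLE merchant_summary "
        "(txn_date TEXT, merchant_id TEXT, txn_count INTEGER, total_cents INTEGER)"
    )
    return wh


def nightly_load(wh, load_date, source_rows):
    """Batch step: append one day's transactions, then rebuild that day's summary."""
    wh.executemany(
        "INSERT INTO daily_txn VALUES (?, ?, ?)",
        [(load_date, merchant_id, cents) for merchant_id, cents in source_rows],
    )
    # Recompute only the summary rows for the date being loaded.
    wh.execute("DELETE FROM merchant_summary WHERE txn_date = ?", (load_date,))
    wh.execute(
        """INSERT INTO merchant_summary
           SELECT txn_date, merchant_id, COUNT(*), SUM(amount_cents)
           FROM daily_txn WHERE txn_date = ?
           GROUP BY txn_date, merchant_id""",
        (load_date,),
    )
    wh.commit()


def merchant_report(wh, merchant_id):
    """Parameterized report: one merchant's summary rows, oldest first."""
    return wh.execute(
        "SELECT txn_date, txn_count, total_cents FROM merchant_summary "
        "WHERE merchant_id = ? ORDER BY txn_date",
        (merchant_id,),
    ).fetchall()
```

Reports read only the summary table, so they place no load on the source systems; the trade-off is that the data is only as fresh as the last scheduled load.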

Real-Time Data Warehouse

While refreshing the Merchant Direct data warehouse once each day works fine for Moneris’ customers, some businesses require more current data. For example, customer service reps often need up-to-date information about the customers who call in for assistance. Have you ever been transferred from one attendant to another because the customer rep didn’t have the right information at his or her fingertips? Many of these transfers occur because customer service reps can’t access current data, especially when multiple products and services are involved. For example, a telecommunications company might offer land-line phone services, wireless phone services, Internet services, and TV services, requiring reps to look in many different places to fully understand a customer’s total relationship with the company.

Some companies try to solve this problem by migrating customer data from multiple systems into a central data warehouse, which customer support reps can query for insight about customer activities. But keeping the information up to date is a challenge. You might call to ask a question about your cell phone plan, and a few minutes later send a message requesting information about a new feature the rep just told you about. How long will it take the company to update your customer records in a database that all the reps can see?

This problem reflects a real quandary for a North American telecommunications company. This firm created a data warehouse that accumulates data from five different operational sources each night. They placed data into the warehouse in batch mode at the end of each day. This architecture was adequate for most customer inquiries, except for cases where a customer issue involved several different calls or e-mail messages during the course of a day. These inquiries sometimes entailed accessing data from several different operational systems. To get the information, customer support reps had to transfer phone calls to other reps, delaying call resolutions and increasing support costs.

To resolve the situation, this telecommunications company used Information Builders’ iWay integration technology to trickle-feed the data warehouse, meaning new records are added right away. Today, as soon as new data is entered into any one of these five operational systems, it is extracted, transformed, and loaded into a real-time repository that includes information about customer accounts, invoices, service orders, products, support histories, and much more. Call center representatives always have up-to-the-minute information about customer accounts and inquiries, and customers don’t get passed from one division to another.



In this scenario the data warehouse is updated simultaneously with operational systems, a record at a time.

Is this a complicated architecture? Not if you have the right integration tools. iWay listens for transactions as they are committed to each of the operational systems, then makes corresponding updates to the real-time data warehouse, transposing information into a common format along the way. As a result, updates to any of the operational systems are reflected in the data warehouse within five minutes of any customer interaction, regardless of which venue the customer uses to contact the company. The telco also used WebFOCUS to create a business intelligence portal for displaying the data – a real-time window that enables reps to stay up-to-date on the history of each account.
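To make the trickle-feed idea concrete, here is a minimal sketch of a handler that is called once per committed source transaction, maps the source’s fields into a common format, and merges the result into a single customer view. It is not a representation of how iWay works internally: the source names, the field mappings in `FIELD_MAPS`, and the `RealTimeWarehouse` class are all hypothetical, and an in-memory dictionary stands in for the repository.

```python
from dataclasses import dataclass, field

# Hypothetical mappings from each source system's field names to a common format.
FIELD_MAPS = {
    "wireless": {"cust_no": "customer_id", "plan": "product"},
    "billing": {"account": "customer_id", "balance_due": "balance"},
}


@dataclass
class RealTimeWarehouse:
    # customer_id -> consolidated record built from all sources
    records: dict = field(default_factory=dict)

    def on_commit(self, source, payload):
        """Handle one committed transaction from one operational system."""
        mapping = FIELD_MAPS[source]
        # Transpose the source record into the common format, dropping unmapped fields.
        common = {mapping[name]: value for name, value in payload.items() if name in mapping}
        customer_id = common.pop("customer_id")
        # Merge into the single view of this customer, one record at a time.
        self.records.setdefault(customer_id, {}).update(common)
        return self.records[customer_id]
```

A rep looking up a customer would read `records[customer_id]` and see the effect of every event handled so far, rather than waiting for a nightly batch.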

Operational Data Access

As we’ve seen, analytical BI systems generally access a data warehouse. They give users an excellent view of past business events and entities, but not of current business processes, which are ongoing. Operational business intelligence systems, by contrast, give users a real-time view of business events as they occur, such as shipping orders to customers, routing parts through an assembly line, or sending trouble tickets to customer service reps.

Integration technology is important to both operational and analytic BI systems, but in different ways. Analytical BI applications rely on extract, transform, and load (ETL) tools to keep a data warehouse current, perhaps once a day or once a week. Operational BI applications generally get their information from an automated workflow process or directly from production systems. There is less latency between when an event occurs and when the BI system is aware of that event, putting business users in touch with current information.



Reports are generated directly from the operational system (or sometimes an exact copy of the operational system).

For example, RBC Royal Bank provides real-time loan status information to its asset-based lending (ABL) customers. Asset-based lending is a flexible way of providing fast-growing or highly leveraged companies with working capital. The lending institution approves revolving lines of credit secured by accounts receivable and inventory. The major difference between asset-based lending and traditional commercial lending is control; lenders must continually assess the make-up and status of each borrower’s collateral. This enables them to maximize the borrower’s margin availability based on the underlying value of its current assets.

To make its ABL calculations, RBC considers more than a million invoices each month along with lengthy inventory reports. The bank uses iWay to translate this steady stream of data into meaningful information that can be directly input to the ABL reporting system. This gives customers a real-time view of the status of their loans, up to the millisecond. If the operations group updates the data, it is posted immediately, so the customer always obtains the latest information. Thanks to this real-time reporting architecture, RBC’s asset-based lending clients can view their borrowing base position, outstanding loan balances, collateral composition, and listings of ineligible accounts through a secured and encrypted Web site.

Enterprise Information Integration (EII)

When an operational BI application accesses multiple sources of information, we typically refer to it as enterprise information integration (EII). This architecture enables BI systems to look across multiple business applications and accept events from multiple sources, such as those supporting customer relationships, the supply chain, and sales transactions. These federated queries can propagate information from any source (real-time ERP transactions, warehoused data, and business-to-business systems) and deliver it to line managers, executives, or automated business processes.



Enterprise information integration (EII) refers to the real-time aggregation of corporate data across multiple data sources. It presents distributed data as if it exists in a single location. This distinguishes it from other types of data-access technologies, since data is not permanently moved or replicated into a new location or database. The source data remains intact.

EII combines data from several sources, which can include operational systems and data warehouses.

A major Canadian airline used this architecture to create a BI application that helps maintenance workers identify deviations. In airline parlance, deviations refer to aircraft maintenance issues, including tracking parts for repairs. Previously, even trivial maintenance issues such as a faulty seat-back table or a torn seat cushion prevented the airline from selling those seats on its flights, reducing revenue and profitability. But maintenance workers weren’t always notified in time, since the information the airline needed to expedite these repairs was distributed across three different applications. The airline needed real-time information to service these planes between flights. At first glance, it might seem that integrating data from the three different applications into a staged data warehouse would satisfy the requirements. After carefully analyzing the requirements, the airline realized it could generate a federated query to access data from all of these sources simultaneously. They didn’t need to build a warehouse to maintain this information.



Developers used WebFOCUS to build a report that combines data from three operational sources: the primary maintenance system, which holds information about seat and other problems on the plane; the parts inventory system, which holds information about the location of the necessary replacement parts; and the plane routing system, which holds scheduling information. This one report informs maintenance workers about which planes need which parts in which locations, enabling them to fix each problem as soon as possible. Based on this one report, the airline can attend to maintenance problems in a timely fashion, increasing seat sales and improving profitability.

Maintenance personnel use WebFOCUS to list all the deviations requiring attention. They can generate standard or parameterized reports that list the type, location, and destination of each affected plane, along with a catalog of available parts. This federated system not only makes it easy to identify the required parts but has also become an important performance management tool for monitoring the activities of each maintenance crew, such as their success in identifying, classifying, and closing deviations.
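A federated query of this kind can be sketched as follows. The example assumes three separate in-memory SQLite databases standing in for the maintenance, parts inventory, and routing systems; the table names, column names, and sample values are invented for illustration and do not describe the airline’s actual systems or how WebFOCUS builds the report. The point is only that each source is queried where it lives and the results are combined at report time, with nothing copied into a warehouse.

```python
import sqlite3


def make_source(table, columns, rows):
    """Build one independent in-memory database standing in for an operational system."""
    db = sqlite3.connect(":memory:")
    db.execute(f"CREATE TABLE {table} ({', '.join(columns)})")
    marks = ", ".join("?" for _ in columns)
    db.executemany(f"INSERT INTO {table} VALUES ({marks})", rows)
    return db


def deviation_report(maintenance, parts, routing):
    """Combine the three sources at query time; no data is staged or replicated."""
    # Where is each part currently in stock?
    stations_with_part = {}
    for part_no, station in parts.execute(
        "SELECT part_no, station FROM inventory WHERE qty > 0"
    ):
        stations_with_part.setdefault(part_no, set()).add(station)

    # Where is each plane going next?
    next_station = dict(routing.execute("SELECT tail_no, next_station FROM schedule"))

    report = []
    for tail_no, issue, part_no in maintenance.execute(
        "SELECT tail_no, issue, part_no FROM deviations ORDER BY tail_no"
    ):
        stop = next_station.get(tail_no)
        report.append(
            {
                "tail_no": tail_no,
                "issue": issue,
                "part_no": part_no,
                "next_station": stop,
                "part_on_hand": stop in stations_with_part.get(part_no, set()),
            }
        )
    return report
```

Each call to `deviation_report` reads live data from all three sources, so the report is as current as the operational systems themselves.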

Process Integration

While analytical BI systems are typically initiated by users querying a database or running a report, process-driven BI systems are triggered by the business process itself. For example, when an order entry system receives an order or a manufacturing process updates a bill of materials, these events might notify other applications within the enterprise. In some cases, users are asked to supply input, perhaps to correlate events with data obtained from other parts of a business process. In other cases there is no user input involved.

iWay is a key technology behind these applications because it enables applications to listen for events, detect them, propagate them, and determine which actions to take according to conditions that have been determined in advance. Setting up triggers and alerts enables a BI process to interface with transaction systems and be triggered by events occurring in those systems. You might set up a trigger to send a message when conditions reach a predefined threshold, such as when inventory falls below a certain level or new sales figures are available.

There are three basic categories of process integration:

■ Real-time alerts
■ Process-driven BI
■ Transactional integration

In all three cases, the BI application acquires data before it ever gets loaded into a database.



Data is accessed as the business event occurs and is delivered even before it enters a database. Delivery targets can be any device (computer, phone, or handset) or even another part of a process.

For example, let’s say a customer orders 50 widgets through your online store. A BI application might send a real-time alert to verify that there is enough stock on hand to fill that order. A process-driven BI application not only checks the inventory but also makes a decision to replenish it by sending a message to the supplier. Transaction integration is similar, but in this case a database transaction triggers the event. In other words, simply committing the order to the database triggers an alert to verify the stock on hand, along with a message to the supplier to replenish the inventory. All three scenarios are closely related, since they involve delivering real-time information based on a business event or as part of a business process. Messages are generated, monitored, and interpreted so that applications can take the necessary actions.

Sometimes this type of integration scenario is referred to as business activity monitoring (BAM). But whatever term is used, it involves monitoring events related to business processes like EDI transactions, message bus activity, FTP activity, e-mail activity, database transactions, and application updates. Very few business intelligence products can monitor and interpret these real-time events. WebFOCUS is an exception, thanks in part to its close relationship with iWay.

Consider IPC, the largest group purchasing organization for independent pharmacies in the United States. In 2006, when the U.S. Food and Drug Administration (FDA) announced that drug wholesalers must track the pedigree of all pharmaceutical products dispensed in retail pharmacies, IPC had to set up a real-time environment to track each bottle of drugs that passes through its warehouses.



IPC used iWay to track the information and update the associated information systems. Now, when pharmaceutical wholesalers send product data to IPC, IPC’s drug-tracking system automatically ties it to a purchase order. iWay matches the purchase order with a shipping notice, then sends back an e-mail confirmation of the transaction. After the wholesaler verifies that the correct manufacturer is listed, iWay updates the pharmaceutical database as well. Thanks to this automated business flow, IPC knows the exact pedigrees of the drugs that it has purchased before the products even arrive at the shipping dock. In turn, IPC is able to provide this information to its individual pharmacies, fulfilling the FDA requirement.

IPC plans to use WebFOCUS to expand its operational reporting capabilities, creating reports to drill down into pharmaceutical deliveries by region, as well as to provide inventory summaries to individual pharmacies. Now that iWay is monitoring its business processes, current order activity, shipping status, and inventory levels are always listed in these reports.

Many state and local governments rely on process integration to facilitate collaboration among agencies. For example, the New York City Department of Health (DoH) has developed a first-response system to help hospitals, emergency workers, and the Centers for Disease Control and Prevention to proactively monitor the outbreak of diseases. Nearly three dozen New York hospitals routinely feed patient data to the DoH, which uses iWay and WebFOCUS to combine and analyze the information. There’s no time to put the data into a traditional data warehouse, let alone expect healthcare workers to go looking for it. These are real-time problems and they demand a real-time solution. The same data-sharing partnership applies to the city’s 911 emergency system. As information comes in from both sources, the DoH uses business intelligence tools to spot trends that indicate a disease cluster in specific neighborhoods, then immediately sends messages to the appropriate authorities.
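Returning to the widget-order example earlier in this section, the sketch below shows one way the transactional variant could look: inserting an order row is the business event, and a database trigger reacts to it by checking stock and queuing messages. It uses a SQLite trigger purely as a stand-in for an event listener; the schema, the `outbox` table, the message wording, and the reorder point of 100 units are assumptions for illustration, not a description of iWay or WebFOCUS.

```python
import sqlite3


def make_store(reorder_point=100):
    """Create a toy order system whose trigger fires when an order row is inserted."""
    db = sqlite3.connect(":memory:")
    db.executescript(
        f"""
        CREATE TABLE stock  (sku TEXT PRIMARY KEY, on_hand INTEGER);
        CREATE TABLE orders (sku TEXT, qty INTEGER);
        CREATE TABLE outbox (recipient TEXT, message TEXT);

        -- Inserting an order is the business event that drives everything else.
        CREATE TRIGGER order_placed AFTER INSERT ON orders
        BEGIN
            UPDATE stock SET on_hand = on_hand - NEW.qty WHERE sku = NEW.sku;

            -- Real-time alert: warn when the order cannot be filled from stock.
            INSERT INTO outbox
                SELECT 'warehouse', 'short on ' || NEW.sku
                FROM stock WHERE sku = NEW.sku AND on_hand < 0;

            -- Process-driven step: ask the supplier to replenish below the threshold.
            INSERT INTO outbox
                SELECT 'supplier', 'replenish ' || NEW.sku
                FROM stock WHERE sku = NEW.sku AND on_hand < {reorder_point};
        END;
        """
    )
    return db


def pending_messages(db):
    """Return queued (recipient, message) pairs in the order they were generated."""
    return db.execute("SELECT recipient, message FROM outbox ORDER BY rowid").fetchall()
```

With 120 widgets on hand, an order for 50 leaves 70, which is below the assumed reorder point, so a single replenish message is queued for the supplier and no shortage alert is raised.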

Search Technology

Everybody is familiar with the convenience and far-reaching capabilities of search technology. But not many companies have learned how powerful this technology can be in the context of BI applications. The problem is that search engines are designed to index and track Web pages, not necessarily database transactions. Enter the iWay Enterprise Index, which taps into these streams of information, transforms them into a usable format, and prepares them for searching by end users. This unleashes information that was previously locked up in proprietary information systems, with no data warehouse required.

The iWay Enterprise Index powers a new BI product called WebFOCUS Magnify, which enables users to search dynamic business intelligence content in addition to structured and unstructured data sources. It rapidly scans indexed content to create Google-style results from data sources throughout the enterprise. From a standard search page, you can follow links to execute reports and access information.



An enriched version of each transaction is sent to a search engine in HTML format in concert with the operational system. Subsequent searches link transactions to reports that will further reveal necessary information.

Why is this unique? The real breakthrough is in its scope: this technology allows you to find data across disparate applications and databases even when you don’t know what you are looking for. The iWay Enterprise Index can turn database transactions into Web pages, then feed those pages to a search engine. Subsequent searches will not only return your usual Web page findings, but also uncover information stored in transactional Web pages, which iWay creates on the fly. These special Web pages contain links to the original database sources as well as to relevant reports, revealing new insight into the items you are searching for.

Let’s bring this to light with an example. Since September 11, 2001, law enforcement officials have realized the importance of sharing information across local, state, and federal databases. They have made great strides with BI applications that can combine and access data from many different places. But like most BI solutions, these applications assume you know what you are looking for before you generate a report or submit a query. Unfortunately, that’s not always how law enforcement personnel operate.

With WebFOCUS Magnify, a simple search for a license plate number could uncover transactions across multiple data sources and law enforcement organizations. Magnify indexes transactions across multiple data sources and then allows you to reach back into those data sources to find related information, without having to create a data warehouse or join databases. A simple search on indexed Web pages reveals database records that help the user qualify the information. The search results might represent three or four different databases along with references to transactions, such as records of moving violations. It’s easy to tie those results back to a WebFOCUS report that presents a history of the registered owner of the vehicle.

This progression is illustrated in the following screenshots. In Figure 1, a law enforcement officer enters a partial license plate number (YOR) in the Magnify search box at the top of the page. This returns a list of database transactions that include this string of plate numbers (on the right) and a list of the associated data sources (on the left).

Figure 1.

In Figure 2, the officer has clicked on the Data Source item to reveal the database records from which these search results were derived. The officer found eight records in the Incident and Criminal database, and five records in the Vehicle Registration database, along with an Officer Activity Report about a particular offense.


Figure 2.

In Figure 3, the officer clicks on Arrest History to narrow the search further. This brings up the complete record of the incident in question.

Figure 3.


Magnify takes search technology to the next level. You begin with free-form, Google-style searches, and then reach into the associated transactions and databases to find additional information. Customer support reps could use this same type of technology to investigate a problem. The billing system, marketing system, shipping system, and order entry system might all reference the same customer. Simply entering a customer number or phone number could turn up records of customer activity, so the rep can find additional information related to the customer’s problem.
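The general idea of indexing transactions from several systems and searching them with a partial term can be illustrated with a small inverted index. This is a simplified sketch under stated assumptions, not the iWay Enterprise Index or Magnify: the `TransactionIndex` class, the source names, and the sample records are hypothetical, and matching is a plain substring test on tokens rather than a real search engine’s ranking.

```python
from collections import defaultdict


class TransactionIndex:
    """Toy inverted index over records drawn from several source systems."""

    def __init__(self):
        self._postings = defaultdict(set)  # token -> {(source, record_id)}
        self._records = {}                 # (source, record_id) -> original fields

    def add(self, source, record_id, fields):
        """Index one transaction; every word of every field value becomes a token."""
        key = (source, record_id)
        self._records[key] = fields
        for value in fields.values():
            for token in str(value).upper().split():
                self._postings[token].add(key)

    def search(self, text):
        """Return matching record ids grouped by source (partial tokens match)."""
        needle = text.upper()
        matches = set()
        for token, keys in self._postings.items():
            if needle in token:
                matches |= keys
        grouped = defaultdict(list)
        for source, record_id in sorted(matches):
            grouped[source].append(record_id)
        return dict(grouped)

    def record(self, source, record_id):
        """Reach back to the original record behind a search hit."""
        return self._records[(source, record_id)]
```

A search for a partial plate such as `yor` returns hits grouped by source, much like the left-hand data source list described above, and `record()` plays the role of the link back to the underlying transaction.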

Data Access via Web Services

Another important way to access data is via a Web service. Information Builders’ Web services adapter can treat data coming from an Internet Web service as if it were stored in a relational table. This solves many different problems without recourse to a data warehouse. For example, a purchasing officer might need to review a supplier’s inventory, pricing, and delivery options to determine which items to restock. If that information is available as a Web service, the officer could retrieve it in a single report and make an instant restocking decision. Regardless of the underlying operating systems, applications, or databases, the Web service can make all of the data look the same.

Report results are obtained from combining data from one or more data sources and Web services. The Web services are treated as relational tables.


Other companies need to combine their own internal data with external information, for example to compare customer information with external demographic information and plot it on a map. With the iWay Web Services Adapter, you can join data from an internal system with demographic data from the Internet, perhaps using a zip code column as a point of similarity.

In other situations, developers create Web services to extract a subset of information from an internal database or application, enabling multiple departments to access their own slices of the data. For example, the marketing department might need to tap into certain parts of a sales or finance system. A Web service can reveal just the pertinent data. This flexibility is especially important in today’s highly distributed manufacturing world, where a company might want to source production and assembly tasks to multiple partners, plants, or contract manufacturers.

Consider Guardian Industries Corp., a leading manufacturer of float glass and fabricated glass products. Guardian and its affiliates manufacture glass in 24 plants in 14 countries. The company selected WebFOCUS because it can work with a service-oriented architecture (SOA) and because it preserves the hierarchy of information represented in reporting objects. Guardian publishes Web services from its enterprise resource planning (ERP) applications and passes the information to WebFOCUS. WebFOCUS consumes the services and generates pertinent reports for each department and contractor. To integrate its information systems, Guardian uses WebFOCUS ReportCaster, via its API, to automatically generate and print reports using the ERP system. This enables the ERP system to create and maintain documents as discrete objects that are accessible to many types of applications. Guardian can therefore securely exchange information, both inside and outside of the company, without the overhead of developing and maintaining a data warehouse.
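The zip-code join mentioned above can be sketched as follows. The example assumes the Web service returns a JSON array of flat objects, which is passed in as a string so the sketch runs without a network call; the function names, table names, and sample figures are invented for illustration and are not the iWay Web Services Adapter’s interface. The response is loaded into a table and then joined to internal customer data with ordinary SQL.

```python
import json
import sqlite3


def load_service_as_table(db, table, json_payload):
    """Expose a Web service response (a JSON array of flat objects) as a table."""
    rows = json.loads(json_payload)
    columns = list(rows[0])
    db.execute(f"CREATE TABLE {table} ({', '.join(columns)})")
    marks = ", ".join("?" for _ in columns)
    db.executemany(
        f"INSERT INTO {table} VALUES ({marks})",
        [tuple(row[col] for col in columns) for row in rows],
    )


def customers_with_demographics(db):
    """Join internal customers to service-supplied demographics on zip code."""
    return db.execute(
        """SELECT c.name, c.zip, d.median_income
           FROM customers AS c
           JOIN demographics AS d ON d.zip = c.zip
           ORDER BY c.name"""
    ).fetchall()
```

Once the service data is presented as a table, the report query does not need to know which columns came from the internal system and which came from the service.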


Conclusion

Organizations often create data warehouses for reasons that are not entirely valid. We commonly hear the following:

■ My business intelligence solution requires a data warehouse
■ I need to get data from more than one application, so a data warehouse is necessary to combine it all first
■ I need a data warehouse because that’s our reporting strategy and all of our BI systems require it

As the examples in this paper illustrate, there are many ways to access data for analysis and reporting. We suggest that you analyze each business challenge to understand whether a data warehouse or another type of information-access method presents the best solution. Always try to identify the best method at the outset of the project, and don’t assume that a data warehouse is the correct solution before assessing all the options.


