Exadata and Database Machine Administration Workshop Student Guide
D67016GC20 Edition 2.0 January 2011 D71669
Authors
Peter Fusek
Jean-Francois Verrier
Mark Fuller
Dave Winter
Technical Contributors and Reviewers
Andrew Babb
Bharat Baddepudi
Maria Billings
Robert Carlin
Michael Cebulla
Nilesh Choudhury
Christian Craft
Ravindra Dani
Aslam Edah-Tally
Boris Erlikhman
Amit Ganesh
Ed Gilowski
Joel Goodman
Scott Gossett
Jim Hall
Roger Hansen
James He
David Hitchcock
Bill Hodak
Vimala Jacob
Martin Jensen
Kevin Jernigan
Caroline Johnston
Larry Justice
Vikram Kapoor
Bruce Kyro
Sumeet Lahorani
Sue Lee
Juan Loaiza
Barb Lundhild
Varun Malhotra
Louis Nagode
Dan Norris
Michael Nowak
Sriram Palapudi
Umesh Panchaksharaiah
Sugam Pandey
Robert Pastijn
Marshall Presser
Georg Schmidt
Akshay Shah
Kam Shergill
Tim Shelter
Eric Siglin
Sundararaman Sridharan
Vijay Sridharan
Mahesh Subramaniam
Lawrence To
Alex Tsukerman
Kodi Umamageswaran
Douglas Utzig
Harald van Breederode
Mark Van de Wiel
Dave Winter
Publishers
Sujatha Nagendra
Giri Venugopal
Copyright © 2010, Oracle and/or its affiliates. All rights reserved.
Disclaimer
This document contains proprietary information and is protected by copyright and other intellectual property laws. You may copy and print this document solely for your own use in an Oracle training course. The document may not be modified or altered in any way. Except where your use constitutes "fair use" under copyright law, you may not use, share, download, upload, copy, print, display, perform, reproduce, publish, license, post, transmit, or distribute this document in whole or in part without the express authorization of Oracle.
The information contained in this document is subject to change without notice. If you find any problems in the document, please report them in writing to: Oracle University, 500 Oracle Parkway, Redwood Shores, California 94065 USA. This document is not warranted to be error-free.
Restricted Rights Notice
If this documentation is delivered to the United States Government or anyone using the documentation on behalf of the United States Government, the following notice is applicable:
U.S. GOVERNMENT RIGHTS The U.S. Government’s rights to use, modify, reproduce, release, perform, display, or disclose these training materials are restricted by the terms of the applicable Oracle license agreement and/or the applicable U.S. Government contract.
Trademark Notice
Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners.
Contents
1
Introduction Course Objectives 1-2 Audience and Prerequisites 1-3 Course Scope 1-4 Course Contents 1-5 Terminology 1-6 Additional Resources 1-7 Practice 1 Overview: Introducing the Laboratory Environment 1-8
2
Exadata Overview Objectives 2-2 Traditional Enterprise Database Storage Deployment 2-3 Exadata Storage Deployment 2-4 Exadata Implementation Architecture Overview 2-6 Introducing Exadata 2-7 Exadata Hardware Details (Sun Fire X4270 M2) 2-8 Exadata Specifications 2-9 InfiniBand Network 2-10 Classic Database I/O and SQL Processing Model 2-11 Exadata Smart Scan Model 2-12 Exadata Smart Storage Capabilities 2-13 Exadata Smart Scan Scale-Out Example 2-16 Exadata Hybrid Columnar Compression 2-19 Exadata Hybrid Columnar Compression Architecture Overview 2-20 Exadata Smart Flash Cache 2-21 Exadata Storage Index 2-23 Storage Index with Partitions Example 2-25 Database File System 2-26 I/O Resource Management 2-27 Benefits Multiply 2-28 Exadata Key Benefits for Data Warehousing 2-29 Exadata Key Benefits for OLTP 2-31 Quiz 2-32 Summary 2-34
Additional Resources 2-35 Practice 2 Overview: Introducing Exadata Features 2-36
3
Exadata Architecture Objectives 3-2 Exadata Software Architecture Overview 3-3 Exadata Software Architecture Details 3-5 Exadata Smart Flash Cache Architecture 3-7 Exadata Monitoring Architecture 3-9 Disk Storage Entities and Relationships 3-10 Interleaved Grid Disks 3-12 Flash Storage Entities and Relationships 3-13 Disk Group Configuration 3-14 Quiz 3-15 Summary 3-17 Additional Resources 3-18 Practice 3 Overview: Introducing Exadata Cell Architecture 3-19
4
Exadata Configuration Objectives 4-2 Exadata Installation and Configuration Overview 4-3 Initial Network Preparation 4-4 Configuration of New Exadata Servers 4-6 Answering Questions During the Initial Boot Sequence 4-7 Exadata Administrative User Accounts 4-11 Configuring a New Exadata Cell 4-12 Important I/O Metrics for Oracle Databases 4-13 Testing Performance Using CALIBRATE 4-14 Configuring the Exadata Cell Server Software 4-15 Creating Cell Disks 4-16 Creating Grid Disks 4-17 Creating Flash-Based Grid Disks 4-18 Configuring Hosts to Access Exadata Cells 4-19 Configuring ASM and Database Instances for Exadata 4-20 Configuring ASM Disk Groups for Exadata 4-21 Optional Configuration Tasks 4-22 Exadata Storage Security Overview 4-23 Exadata Storage Security Implementation 4-24 Quiz 4-26 Summary 4-29
Additional Resources 4-30 Practice 4 Overview: Configuring Exadata 4-31
5
Exadata Performance Monitoring and Maintenance Objectives 5-2 Monitoring Overview 5-3 Exadata Metrics and Alerts Architecture 5-4 Monitoring Exadata with Metrics 5-6 Monitoring Exadata with Metrics: Example 5-8 Monitoring Exadata with Alerts 5-9 Displaying Alert Examples 5-11 Monitoring Exadata with Active Requests 5-13 Monitoring SQL Execution Plans 5-14 Smart Scan Execution Plan Example 5-15 Predicate Offloading Considerations 5-16 Monitoring Exadata from Your Database 5-17 Monitoring Exadata with Wait Events 5-18 Monitoring Exadata with Enterprise Manager 5-19 Additional Monitoring Tools and Utilities 5-20 Cell Maintenance Overview 5-21 Automated Cell Maintenance Operations 5-23 Replacing a Damaged Physical Disk 5-24 Replacing a Damaged Flash Card 5-26 Moving All Disks from One Cell to Another 5-27 Using the Exadata Software Rescue Procedure 5-28 Quiz 5-30 Summary 5-32 Additional Resources 5-33 Practice 5 Overview: Monitoring Exadata 5-34
6
Exadata and I/O Resource Management Objectives 6-2 I/O Resource Management Overview 6-3 I/O Resource Management Concepts 6-5 I/O Resource Management Plans 6-6 IORM Architecture 6-7 I/O Resource Management Plans Example 6-8 Enabling Intradatabase Resource Management 6-11 Intradatabase Plan Example 6-12 Enabling IORM for Multiple Databases 6-13 Interdatabase Plan Example 6-14 Category Plan Example 6-16 Complete Example 6-17 Using Database I/Os Metrics 6-20 Quiz 6-21 Summary 6-25 Additional Resources 6-26
7
Optimizing Database Performance with Exadata Objectives 7-2 Optimizing Performance 7-3 Flash Memory Usage 7-4 Compression Usage 7-6 Index Usage 7-8 ASM Allocation Unit Size 7-9 Minimum Extent Size 7-10 Quiz 7-11 Summary 7-13 Additional Resources 7-14 Practice 7 Overview: Optimizing Database Performance with Exadata 7-15
8
Database Machine Overview and Architecture Objectives 8-2 Introducing Database Machine 8-3 Database Machine X2-2 Full Rack 8-4 X2-2 Database Server Hardware Details (Sun Fire X4170 M2) 8-5 Start Small and Grow 8-6 Database Machine X2-8 Full Rack 8-7 X2-8 Database Server Hardware Details (Sun Fire X4800) 8-8 Database Machine Capacity 8-9 Database Machine Performance 8-10 Database Machine X2-2 Architecture 8-11 InfiniBand Network Architecture 8-13 X2-2 Leaf Switch Topology 8-14 Full Rack Spine and Leaf Topology 8-15 Scale Performance and Capacity 8-16 Scaling Out to Multiple Full Racks 8-17 Quiz 8-18 Summary 8-20
9
Database Machine Configuration Objectives 9-2 Database Machine Implementation Overview 9-3 Configuration Worksheet Overview 9-5 Getting Started 9-6 Configuration Worksheet Example 9-7 Configuring ASM Disk Groups with Configuration Worksheet 9-11 Generating the Configuration Files 9-13 Other Pre-Installation Tasks 9-14 The Result After Installation and Configuration 9-15 Supported Additional Configuration Activities 9-17 Unsupported Configuration Activities 9-18 Quiz 9-20 Summary 9-22 Additional Resources 9-23
10
Migrating Databases to Database Machine Objectives 10-2 Migration Best Practices Overview 10-3 Performing Capacity Planning 10-4 Database Machine Migration Considerations 10-5 Choosing the Right Migration Path 10-6 Logical Migration Approaches 10-7 Physical Migration Approaches 10-9 Other Approaches 10-11 Post-Migration Best Practices 10-12 Quiz 10-13 Summary 10-15 Additional Resources 10-16 Practice 10 Overview: Migrating to Database Machine using Transportable Tablespaces 10-18
11
Bulk Data Loading with Database Machine Objectives 11-2 Bulk Data Loading Overview 11-3 Preparing the Data Files 11-4 Staging the Data Files 11-5 Configuring the Staging Area 11-6 Configuring the Staging Area 11-7 Configuring the Target Database 11-10 Loading the Target Database 11-11 Quiz 11-13 Summary 11-15 Additional Resources 11-16 Practice 11 Overview: Bulk Data Loading with Database Machine 11-17
12
Backup and Recovery with Database Machine Objectives 12-2 Backup and Recovery Overview 12-3 Using RMAN with Database Machine 12-4 General Recommendations for RMAN 12-5 Disk Based Backup Strategy 12-7 Disk Based Backup Configuration 12-8 Tape Based Backup Strategy 12-10 Tape Based Backup Configuration 12-11 Hybrid Backup Strategy 12-15 Restore and Recovery Recommendations 12-16 Backup and Recovery of Database Machine Software 12-17 Quiz 12-18 Summary 12-20 Additional Resources 12-21 Practice 12 Overview: Using RMAN Optimizations for Database Machine 12-22
13
Monitoring and Maintaining Database Machine Objectives 13-2 Monitoring Tools Overview 13-3 ILOM Overview 13-4 ILOM Example 13-6 DCLI Overview 13-7 DCLI Examples 13-8 InfiniBand Diagnostic Utilities 13-9 Database Machine Support Overview 13-11 Patching and Updating Overview 13-12 Maintaining Exadata Software 13-13 Maintaining Database Server Software 13-14 Maintaining Other Software 13-15 Quiz 13-16 Summary 13-18 Additional Resources 13-19 Practice 13 Overview: Using the distributed command line utility (dcli) 13-20
A
New Features in Update Release 11.2.1.3.1 Objectives A-2 New Features Overview A-3 Auto Service Request (ASR) A-4 The ASR Process A-5 ASR Requirements A-6 Oracle Linux 5.5 A-7 Enhanced Operating System Security A-8 Pro-active Disk Quarantine A-9 Other New Features A-10 Summary A-11
Introduction
Course Objectives
After completing this seminar, you should be able to:
• Describe the key capabilities of Exadata and Database Machine
• Identify the benefits of using Database Machine for different application classes
• Describe the architecture of Database Machine and its integration with Oracle Database, Clusterware and ASM
• Complete the initial configuration of Database Machine
• Describe various recommended approaches for migrating to Database Machine
• Configure Exadata I/O Resource Management
• Monitor Database Machine health and optimize performance
Exadata and Database Machine Administration Workshop 1 - 2
Audience and Prerequisites
• This course is primarily designed for administrators who will configure and administer Oracle Exadata Database Machine.
• Prior knowledge and understanding of the following is assumed:
– Oracle Database 11g Release 2, including RAC and ASM.
– Linux and general network, storage and system administration concepts.
• Recommended prior training:
– Oracle Database 11g: Administration Workshop I
– Oracle Database 11g: Administration Workshop II
– Oracle 11g: RAC and Grid Infrastructure Administration
– Oracle Linux: Linux Fundamentals
Audience and Prerequisites
This seminar is primarily designed for administrators who will configure and administer Oracle Exadata Database Machine. Please be mindful of the prerequisites because this course does not teach all aspects of the technologies used inside Database Machine. Rather, it focuses on topics that are specific to Exadata and Database Machine. Prior knowledge and understanding of Oracle Database 11g Release 2, including Automatic Storage Management (ASM) and Real Application Clusters (RAC), is assumed. In addition, a working knowledge of Linux is assumed along with an understanding of general networking, storage and system administration concepts. For students who do not meet these prerequisites, the recommended prior training includes the following courses:
• Oracle Database 11g: Administration Workshop I
• Oracle Database 11g: Administration Workshop II
• Oracle 11g: RAC and Grid Infrastructure Administration
• Oracle Linux: Linux Fundamentals
Course Scope
• This course covers two main subject areas:
– Exadata Storage Server X2-2
    - This section focuses on the architecture and key capabilities of Exadata along with how to configure, monitor and optimize it.
– Oracle Exadata Database Machine
    - This section introduces students to Database Machine.
    - The installation and configuration process is covered so that students can make appropriate configuration decisions.
    - Students also learn how to maintain, monitor and optimize Database Machine after initial configuration.
• Hardware is discussed during the course; however, detailed hardware installation and maintenance is outside the scope of this course.
Course Scope
This course covers two main subject areas:
• The first section introduces students to Exadata Storage Server X2-2 (formerly known as Exadata Storage Server Version 2). Students learn about the architecture and key capabilities of Exadata along with how to configure, monitor and optimize it.
• The second section introduces students to Oracle Exadata Database Machine. Students learn about the various Database Machine configurations. The installation and configuration process is covered so that students are equipped to make appropriate upfront configuration decisions. They also learn how to maintain, monitor and optimize Database Machine after initial configuration. Students are introduced to various options for migrating to Database Machine and learn how to select the best approach.
Although the hardware components of Database Machine are introduced and described to varying degrees throughout this course, you should consult the hardware documentation for specific hardware installation and maintenance details.
Course Contents
1. Introduction
2. Exadata Overview
3. Exadata Architecture
4. Exadata Configuration
5. Exadata Monitoring and Maintenance
6. Exadata and I/O Resource Management
7. Optimizing Database Performance with Exadata
8. Database Machine Overview and Architecture
9. Database Machine Configuration
10. Migrating Databases to Database Machine
11. Bulk Data Loading with Database Machine
12. Backup and Recovery with Database Machine
13. Database Machine Monitoring and Maintenance
Course Contents The slide shows the ordering of lessons in this course.
Terminology
• Unless otherwise indicated, ‘Exadata’ refers to ‘Exadata Storage Server’.
– Typically, a reference to Exadata refers to the combination of software and hardware used in Exadata Storage Server. However, at times there are specific references to Exadata hardware or Exadata software.
– Unless otherwise indicated, Exadata X2-2 (formerly known as Exadata Version 2) is implied throughout the course. Exadata X2-2 is based on Sun hardware and is the only version of Exadata supported in Oracle Exadata Database Machine.
• Unless otherwise indicated, ‘Database Machine’ refers to ‘Oracle Exadata Database Machine’.
– Typically, Database Machine refers to the entire system including both hardware and software.
Terminology
The slide indicates the conventions used throughout this course to abbreviate the formal product names for Exadata Storage Server and Oracle Exadata Database Machine.
Additional Resources
• Demonstrations (Viewlets)
– http://www.oracle.com/technetwork/tutorials/index.html
– Enter the Oracle Learning Library and conduct a search for content in the Database Machine functional category. Look out for demonstrations with Exadata and Database Machine Version 2 Series in the title.
• Oracle Technology Network (OTN) Exadata and Database Machine Page
– http://www.oracle.com/technetwork/database/exadata/index.html
• OTN Exadata Discussion Forum
– http://forums.oracle.com/forums/forum.jspa?forumID=829
Practice 1 Overview: Introducing the Laboratory Environment In this practice you will be introduced to the laboratory environment used to support all the practices during this course.
Exadata Overview
Objectives
After completing this lesson, you should be able to:
• Contrast the Exadata storage architecture with traditional shared storage offerings
• Describe the hardware components of Exadata
• Outline the capabilities of Exadata
• Describe the main advantages of using Exadata compared to traditional storage servers
Traditional Enterprise Database Storage Deployment
[Slide graphic labels: Database Servers; Storage Arrays]
Traditional Enterprise Database Storage Deployment
The graphic in the slide illustrates the traditional deployment approach for multiple databases. Each database has an isolated allocation of storage resources and its bandwidth is limited by the hardware allocated to it. The isolation and dedication of hardware resources to individual databases can simultaneously lead to unused space and unused input/output (I/O) bandwidth for some databases, and overcommitted bandwidth with insufficient free space in others. The right balance is almost never achieved because real-world workloads are very dynamic.
Large storage arrays are used today for many enterprise database deployments. These large storage arrays must be partitioned and have their bandwidth and space allocated across the databases and applications sharing the storage array. Because these storage arrays house vast quantities of mission-critical data, they must be highly engineered, and consequently very expensive, to deliver high levels of reliability and availability. Enterprise-class storage arrays are not only costly to procure, they also require highly specialized skills to manage and maintain. The result is a very high total cost of ownership when traditional large storage arrays are used in real-world enterprise database deployments.
Exadata Storage Deployment
[Slide graphic labels: Oracle Database 11g Servers; Smart storage operations; I/O Resource Management; High performance storage network; Storage consolidation (transparent to databases); Data compression]
Exadata Storage Deployment
The graphic in the slide illustrates the general deployment approach with Exadata.
• You can use Exadata to consolidate your storage environment. Using Exadata, multiple databases can use storage from a single pool. Exadata uses Oracle Automatic Storage Management (ASM) to evenly distribute the storage load for every database across every available disk in the storage pool. Every database can use all the available disks to maximize performance. Exadata requires the use of Oracle Database 11g Release 2. Exadata works equally well with single-instance or Oracle Real Application Clusters (RAC) databases. Users and database administrators use the same tools and knowledge they are already familiar with. Being based on industry-standard components and technologies, Exadata is inexpensive to deploy. In addition, tight integration with the full suite of Oracle Database high-availability features ensures that the reliability and integrity needs of mission-critical environments are met.
• A key advantage of Exadata is the ability to offload some database processing to Exadata servers. With Exadata, the database can offload single table scan predicate filters and projections, join processing based on bloom filters, along with CPU-intensive decompression and decryption operations. This ability is known as SQL processing offload or Smart Scan.
Exadata Storage Deployment (continued)
In addition to Smart Scan, Exadata has other smart storage capabilities including the ability to offload incremental backup optimizations, file creation operations, and more. This approach yields substantial CPU, memory, and I/O bandwidth savings in the database server, resulting in potentially massive performance improvements.
• Exadata includes Exadata Hybrid Columnar Compression. This feature provides very high levels of data compression implemented inside Exadata. Exadata Hybrid Columnar Compression allows the database to reduce the number of I/Os required to scan a table. For example, for data with a compression ratio of 10 to 1, the I/Os required to scan the data are reduced from 10 to 1 as well.
• Exadata ensures that I/O resources are made available whenever, and to whichever, database needs them based on priorities and policies that you can define. The Database Resource Manager (DBRM) and Exadata I/O Resource Management (IORM) work together to manage intradatabase and interdatabase I/O resource usage to ensure that your defined service-level agreements (SLAs) are met when multiple applications and databases share Exadata storage.
• Finally, even for queries that do not use Smart Scan, Exadata has many advantages over conventional storage. Exadata is highly optimized for fast processing of large queries. It has been carefully architected to ensure no bottlenecks in the controller or in other components inside the storage server. It makes intelligent use of high-performance flash memory to boost performance and also uses a state-of-the-art InfiniBand network that has much higher throughput than conventional storage networks.
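The 10-to-1 compression example can be checked with a little arithmetic. The following Python sketch is an illustration only: the table size and the 1 MB I/O size are assumed values chosen for the example, not Exadata parameters, and the function name is hypothetical.

```python
import math

MB = 1024 * 1024

def ios_to_scan(uncompressed_bytes, compression_ratio, io_size_bytes):
    """Return the number of fixed-size I/Os needed to scan a table
    whose on-disk size is uncompressed_bytes / compression_ratio."""
    stored_bytes = uncompressed_bytes / compression_ratio
    return math.ceil(stored_bytes / io_size_bytes)

# Assumed values for illustration: a 10 GB table read in 1 MB I/Os.
table_bytes = 10_240 * MB
plain_ios = ios_to_scan(table_bytes, 1.0, MB)
compressed_ios = ios_to_scan(table_bytes, 10.0, MB)

print(plain_ios, compressed_ios, plain_ios / compressed_ios)
```

With these assumed values the scan needs 10,240 I/Os uncompressed and 1,024 I/Os at a 10-to-1 ratio, which is the tenfold reduction described above.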
Exadata Implementation Architecture Overview
[Slide graphic labels: Oracle Database 11g Servers; multiple Exadata Cells, each labeled with Exadata software, Linux OS, and Disk]
Exadata Implementation Architecture Overview
Exadata is a self-contained storage platform that houses disk storage and runs the Exadata Storage Server Software provided by Oracle. A single Exadata server is also called a cell. A cell is the building block for a storage grid. More cells provide greater capacity and I/O bandwidth. Databases are typically deployed across multiple cells, and multiple databases can share a single cell. The databases and cells communicate with each other via a high-performance InfiniBand network.
Each cell is a purely dedicated storage platform for Oracle Database files although you can use Database File System (DBFS), a feature of Oracle Database, to store your business files inside the database.
Like other storage arrays, each cell is a computer with CPUs, memory, a bus, disks, network adapters, and the other components normally found in a server. It also runs an operating system (OS), which in the case of Exadata is Linux. The Oracle-provided software resident in the Exadata cell runs under this operating system. The OS is accessible in a restricted mode to administer and manage Exadata.
Introducing Exadata
Exadata Storage Server
• High performance storage for Oracle Database
– Up to 1.8 GB/sec raw data bandwidth
– Up to 75,000 I/Os per second using flash
• 64 bit Intel-based Sun Fire Server
• Preinstalled software
– Exadata Storage Server Software
– Oracle Linux x86_64
– Drivers and Utilities
• Only available in conjunction with Database Machine
Introducing Exadata Exadata is highly optimized for use with Oracle Database. Exadata delivers outstanding I/O and SQL processing performance for data warehousing and online transaction processing (OLTP) applications. Exadata is based on a 64 bit Intel-based Sun Fire server. Oracle provides the storage server software to impart database intelligence to the storage, and tight integration with Oracle Database and its features. Each cell is shipped with all the hardware and software components preinstalled including the Exadata Storage Server Software, Oracle Linux x86_64 operating system and InfiniBand protocol drivers. Since March 2010, Exadata is no longer offered as a standalone storage product. Now Exadata is only available for use in conjunction with Database Machine. Individual Exadata servers can still be purchased, however they must be connected to Database Machine. Custom configurations using Exadata are no longer supported for new installations.
Exadata Hardware Details (Sun Fire X4270 M2)
Processors: 2 Six-Core Intel® Xeon® L5640 Processors (2.26 GHz)
Memory: 24 GB (6 x 4 GB)
Local Disks: 12 x 600 GB 15K RPM High Performance SAS or 12 x 2 TB 7.2K RPM High Capacity SAS
Flash: 4 x 96 GB Sun Flash Accelerator F20 PCIe Cards
Disk Controller: Disk controller HBA with 512 MB battery backed cache
Network: Two InfiniBand 4X QDR (40 Gb/s) ports (1 dual-port PCIe 2.0 HCA); four embedded Gigabit Ethernet ports
Remote Management: 1 Ethernet port (ILOM)
Power Supplies: 2 redundant hot-swappable power supplies
Exadata Hardware Details (Sun Fire X4270 M2) The slide shows a description of the Exadata Storage Server hardware.
Exadata Specifications
Metric                                     HP Disks    HC Disks
Exadata Smart Flash Cache (1)              384 GB      384 GB
Raw Disk Capacity (1)                      7.2 TB      24 TB
Uncompressed Data Capacity (2)             2 TB        7 TB
Raw Disk Throughput (MBPS)                 1,800       1,000
Effective Throughput with Flash (MBPS)     3,600       3,600
Disk I/Os per Second (IOPS)                3,600       1,440
Flash I/Os per Second (IOPS)               75,000      75,000
(1) Raw capacity calculated using 1 GB = 1000 x 1000 x 1000 bytes and 1 TB = 1000 x 1000 x 1000 x 1000 bytes.
(2) User Data: Actual space for uncompressed end-user data, computed after single mirroring (ASM normal redundancy) and after allowing space for database structures such as temporary space, logs, undo space, and indexes. Actual user data capacity varies by application. User Data capacity calculated using 1 TB = 1024 * 1024 * 1024 * 1024 bytes.
Exadata Specifications
Exadata is available in two configurations: with high performance (HP) disks or with high capacity (HC) disks. The table in the slide lists the key capacity and performance specifications for both configuration options.
Note: MBPS stands for megabytes per second, IOPS stands for I/Os per second.
Note: These metrics do not take into account compression. With compressed data, you can achieve much higher effective throughput rates. In all cases, actual performance will vary by application.
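The raw capacity figures follow directly from the per-cell hardware counts given in the hardware details (12 disks and 4 flash cards per cell). The Python sketch below reproduces them and also shows how a decimal terabyte figure shrinks when restated in binary units, which is one reason the raw and user data capacities are not directly comparable. The constant and function names are illustrative assumptions.

```python
DISKS_PER_CELL = 12        # from the hardware details slide
FLASH_CARDS_PER_CELL = 4   # from the hardware details slide
FLASH_CARD_GB = 96

def raw_disk_capacity_tb(disk_size_gb):
    """Raw capacity per cell in decimal TB (1 TB = 1000 GB)."""
    return DISKS_PER_CELL * disk_size_gb / 1000

def decimal_tb_to_binary_tb(tb):
    """Restate decimal TB (10**12 bytes) in binary TB (1024**4 bytes)."""
    return tb * 1000**4 / 1024**4

hp_raw_tb = raw_disk_capacity_tb(600)      # high performance disks
hc_raw_tb = raw_disk_capacity_tb(2000)     # high capacity disks
flash_gb = FLASH_CARDS_PER_CELL * FLASH_CARD_GB

print(hp_raw_tb, hc_raw_tb, flash_gb)            # 7.2 24.0 384
print(round(decimal_tb_to_binary_tb(hp_raw_tb), 2))
```

The uncompressed data capacity in the table is lower still because it is computed after ASM mirroring and after reserving space for other database structures; that allowance varies by application and is not modeled here.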
InfiniBand Network
InfiniBand:
• Is the Exadata storage network:
– Provides highest performance available
– 40 Gb/sec each direction
– Is widely used in high-performance computing since 2002
• Looks like normal Ethernet to host software:
– All IP-based tools work transparently
– TCP/IP, UDP, HTTP, SSH, and so on
• Has the efficiency of a SAN:
– Zero copy and buffer reservation capabilities
• Is used for both storage and RAC interconnect:
– Less configuration, lower cost, higher performance
• Uses high-performance ZDP InfiniBand protocol (RDS V3):
– Zero-copy, zero-loss Datagram protocol
– Open Source software developed by Oracle
– Very low CPU overhead
InfiniBand Network
InfiniBand is the only storage network supported by Exadata because of its performance and proven track record in high-performance computing. InfiniBand works like normal Ethernet but much faster. It has the efficiency of a SAN, using zero copy and buffer reservation. Zero copy means that data is transferred across the network without intermediate buffer copies in the various network layers. Buffer reservation is used so that the hardware knows exactly where to place buffers ahead of time. These are two important characteristics that distinguish InfiniBand from normal Ethernet.
InfiniBand is also supported as a unified network fabric for Exadata and the Oracle RAC interconnect. This facilitates easier configuration and fewer cables and switches. You can also use it for high-performance external connectivity, such as to connect backup servers or ETL servers.
On top of InfiniBand, Exadata uses the Zero Data loss UDP (ZDP) protocol. ZDP is open source software that is developed by Oracle. It is like UDP but more reliable. Its full technical name is RDS (Reliable Datagram Sockets) V3. The ZDP protocol has a very low CPU overhead, with tests showing only 2 percent CPU utilization while transferring 1 GB/sec of data.
Each Exadata server is configured with one dual-port InfiniBand card designed to be connected to two separate InfiniBand switches for high availability. Each InfiniBand link is able to carry the full data bandwidth of the entire cell, which means you can lose an entire network without losing any performance.
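The claim that a single link can carry the full bandwidth of a cell can be sanity-checked against the specification table. The Python sketch below is a rough estimate under stated assumptions: it applies the 8b/10b line encoding used by QDR InfiniBand (a detail not stated in this course), takes 1 MB as 10^6 bytes, and ignores protocol overhead. The names are illustrative.

```python
SIGNALLING_GBPS = 40      # 4X QDR signalling rate, from the slide
ENCODED_BITS = 10         # assumption: 8b/10b encoding, 10 line bits...
PAYLOAD_BITS = 8          # ...carry 8 data bits

def link_data_rate_mbps():
    """Approximate usable data rate of one link in megabytes per second."""
    data_gbps = SIGNALLING_GBPS * PAYLOAD_BITS / ENCODED_BITS
    return data_gbps * 1000 / 8   # gigabits/s -> megabytes/s

# Per-cell throughput figures from the Exadata Specifications table (MBPS).
CELL_THROUGHPUT_MBPS = {
    "HP raw disk": 1800,
    "HC raw disk": 1000,
    "effective with flash": 3600,
}

link_mbps = link_data_rate_mbps()
for name, mbps in CELL_THROUGHPUT_MBPS.items():
    print(f"{name}: {mbps} MB/s, fits on one link: {mbps <= link_mbps}")
```

Under these assumptions one link carries about 4,000 MB/s, which is above the highest per-cell figure in the table, so the loss of one of the two links would not reduce throughput.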
Classic Database I/O and SQL Processing Model
1 SELECT customer_id FROM orders WHERE order_amount>20000;
2 Extents identified
3 I/O issued
4 I/O executed: 10 GB returned
5 SQL processing: 2 MB returned
6 Row returned
Classic Database I/O and SQL Processing Model
With traditional storage, all the database intelligence resides in the software on the database server. To illustrate how SQL processing is performed in this architecture, an example of a table scan is shown in the graphic in the slide.
1. The client issues a SELECT statement with a predicate to filter a table and return only the rows of interest to the user.
2. The database kernel maps this request to the file and extents containing the table.
3. The database kernel issues the I/Os to read all the table blocks.
4. All the blocks for the table being queried are read into memory.
5. SQL processing is conducted against the data blocks searching for the rows that satisfy the predicate.
6. The required rows are returned to the client.
As is often the case with large queries, the predicate filters out most of the rows in the table. Yet all the blocks from the table need to be read, transferred across the storage network, and copied into memory. Many more rows are read into memory than required to complete the requested SQL operation. This generates a large amount of unproductive I/O, which wastefully consumes resources and impacts application throughput and response time.
Exadata Smart Scan Model
[Slide diagram. Query: SELECT customer_id FROM orders WHERE order_amount>20000; Labeled steps: (1) query issued, (2) iDB command constructed and sent to Exadata cells, (3) SQL processing in Exadata, (4) 2 MB returned to server, (5) consolidated result set built from all Exadata cells, (6) row returned.]
Exadata Smart Scan Model
Using Exadata, database operations are handled differently. Queries that perform table scans can be processed within Exadata and return only the required subset of data to the database server. Row filtering, column filtering, some join processing, and other functions can be performed within Exadata. Exadata uses a special direct-read mechanism for Smart Scan processing.
The above graphic illustrates how a table scan operates with Exadata:
1. The client issues a SELECT statement to return some rows of interest.
2. The database kernel determines that Exadata is available, constructs an iDB command representing the SQL command, and sends it to the Exadata cells. iDB is a unique Oracle data transfer protocol that is used for Exadata storage communications.
3. The Exadata server software scans the data blocks to extract the relevant rows and columns which satisfy the SQL command.
4. Exadata returns to the database instance an iDB message containing the requested rows and columns of data. These results are not block images, so they are not stored in the buffer cache.
5. The database kernel consolidates the result sets from across all the Exadata cells. This is similar to how the results from a parallel query operation are consolidated.
6. The rows are returned to the client.
Moving SQL processing off the database server frees server CPU cycles and eliminates a massive amount of unproductive I/O transfers. These resources are free to better service other requests. Queries run faster, and more of them can be processed.
Exadata Smart Storage Capabilities
• Predicate filtering:
  – Only the rows requested are returned to the database server rather than all the rows in a table.
• Column filtering:
  – Only the columns requested are returned to the database server rather than all the columns in a table.
Exadata Smart Storage Capabilities
The following database functions are integrated within Exadata:
• Exadata enables predicate filtering for table scans. Rather than returning all the rows for the database to evaluate, Exadata returns only the rows that match the filter condition. The conditional operators that are supported include =, !=, <, >, <=, >=, IS [NOT] NULL, LIKE, [NOT] BETWEEN, [NOT] IN, EXISTS, IS OF type, NOT, AND, OR. In addition, many common SQL functions are evaluated by Exadata during predicate filtering. For a full list of functions that can be offloaded to Exadata, use the following query:
SELECT * FROM v$sqlfn_metadata WHERE offloadable = 'YES';
• Exadata provides column filtering, also called column projection, for table scans. Only the requested columns are returned to the database server rather than all columns in a table. For tables with many columns, or columns containing LOBs, the I/O bandwidth saved by column filtering can be very large.
When used together, predicate and column filtering dramatically improve performance and reduce I/O bandwidth consumption. For example, when processing the following query, Exadata returns only the employee names that are longer than five characters:
SELECT name FROM employees WHERE LENGTH(name) > 5;
Without predicate and column filtering, the storage subsystem would need to send all the rows and columns of the employees table to the database to evaluate.
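The effect of combining predicate and column filtering can be sketched with a small model. This is only an illustration of the idea, not how the Exadata cell software is implemented: the sample rows, the `notes` column, and the functions `classic_scan`, `smart_scan`, and `payload_size` are all hypothetical.

```python
# Illustrative model only: real Smart Scan runs inside the Exadata cell software.
# The table contents and helper names below are made up for this sketch.
employees = [
    {"id": 1, "name": "Ann", "dept": "HR", "notes": "x" * 200},
    {"id": 2, "name": "Bernard", "dept": "IT", "notes": "y" * 200},
    {"id": 3, "name": "Li", "dept": "IT", "notes": "z" * 200},
    {"id": 4, "name": "Charlotte", "dept": "OPS", "notes": "w" * 200},
]


def classic_scan(rows):
    """Storage ships every row and column; the database filters afterwards."""
    shipped = list(rows)
    result = [r["name"] for r in shipped if len(r["name"]) > 5]
    return shipped, result


def smart_scan(rows, predicate, columns):
    """Storage applies the predicate and the projection, then ships the rest."""
    shipped = [{c: r[c] for c in columns} for r in rows if predicate(r)]
    result = [r["name"] for r in shipped]
    return shipped, result


def payload_size(rows):
    """Rough size of what crosses the storage network, in characters."""
    return sum(len(str(v)) for r in rows for v in r.values())


classic_shipped, classic_result = classic_scan(employees)
smart_shipped, smart_result = smart_scan(
    employees, lambda r: len(r["name"]) > 5, ["name"]
)

print("same answer:", classic_result == smart_result)
print("classic payload:", payload_size(classic_shipped))
print("smart payload:", payload_size(smart_shipped))
```

Both paths return the same names; only the amount of data shipped from storage differs.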
Exadata Smart Storage Capabilities
• Join processing:
  – Simple star join processing is performed within Exadata.
• Scans on encrypted data
• Scans on compressed data
• Scoring for Data Mining:
  – All data mining scoring functions are offloaded.
  – Up to 10x performance gains.
Exadata Smart Storage Capabilities (continued)
• Exadata performs join processing for star schemas (between large tables and small lookup tables). This is implemented using Bloom filters, which are a very efficient probabilistic method to determine whether an element is a member of a set.
• Exadata performs Smart Scans on encrypted tablespaces and encrypted columns. For encrypted tablespaces, Exadata can decrypt blocks and return the decrypted blocks to Oracle Database, or it can perform row and column filtering on encrypted data. Significant CPU savings can be made within the database server by offloading the CPU-intensive decryption task to Exadata cells.
• Smart Scan works in conjunction with Exadata Hybrid Columnar Compression so that column projection and row filtering can be executed along with decompression at the storage level to save CPU cycles on the database servers.
• Exadata can perform scoring functions for data mining models. All data mining scoring functions, such as PREDICTION_PROBABILITY, are offloaded to Exadata cells for processing. This accelerates warehouse analysis while it reduces database server CPU consumption and the I/O load between the database server and Exadata.
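To make the Bloom filter idea concrete, the following is a minimal sketch of how a probabilistic membership test can discard non-matching fact-table rows before an exact join. It is a generic textbook Bloom filter, not Oracle's implementation; the class `BloomFilter`, its sizing parameters, and the sample tables are assumptions made for illustration.

```python
import hashlib


class BloomFilter:
    """Generic Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for clarity

    def _positions(self, key):
        # Derive num_hashes bit positions from seeded SHA-256 digests.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        return all(self.bits[pos] for pos in self._positions(key))


# Small lookup table: the join keys that survive the dimension filter.
wanted_region_ids = [3, 7]
bloom = BloomFilter()
for key in wanted_region_ids:
    bloom.add(key)

# Large fact table rows: (order_id, region_id).
fact_rows = [(order_id, order_id % 10) for order_id in range(100)]

# Storage-side step: drop rows whose key is definitely not in the set.
candidates = [row for row in fact_rows if bloom.might_contain(row[1])]

# Database-side step: the exact join removes any false positives.
wanted = set(wanted_region_ids)
joined = [row for row in candidates if row[1] in wanted]

print(len(fact_rows), "rows scanned,", len(candidates), "shipped,", len(joined), "joined")
```

Because a Bloom filter never reports a present key as absent, filtering at the storage layer cannot lose join results; it can only let a few extra rows through for the database to reject.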
Exadata Smart Storage Capabilities
• Backups:
  – I/O for incremental backups is much more efficient because only changed blocks are returned to the database server.
• Create/extend tablespace:
  – Exadata formats database blocks.
Exadata Smart Storage Capabilities (continued)
• The speed and efficiency of incremental database backups is enhanced with Exadata. The granularity of change tracking in the database is much finer with Exadata. With Exadata, changes are tracked at the individual Oracle block level rather than at the level of a large group of blocks. This results in less I/O bandwidth being consumed for backups and faster running backups.
• With Exadata, the create/extend tablespace operation is also executed much more efficiently. Instead of formatting blocks in database server memory and writing them to storage, a single iDB command is sent to Exadata instructing it to format the blocks. Database server memory usage is reduced and I/O associated with the creation and formatting of the database blocks is eliminated with Exadata.
Exadata Smart Scan Scale-Out Example
[Slide diagram: database server dbs1 connected through the InfiniBand storage network (40 Gb/s maximum) to Exadata cells edsc1 through edsc14, with 12 disks per cell. Each cell can deliver 1.8 GB/s. A total of 14 cells can deliver 14 x 1.8 = 25.2 GB/s.]
Exadata Smart Scan Scale-Out Example
The example in the next three slides illustrates the power of Smart Scan in a quantifiable manner using a typical case in which multiple Exadata cells scale out to share a workload.
The database server, depicted in the upper portion of the slide, is connected to the InfiniBand storage network, which can deliver a maximum of 40 gigabits per second (Gb/s). To keep the example clear and simple, assume that the InfiniBand storage network can deliver data at 40 Gb/s with no messaging overhead. We will also assume that a single database server has access to the full I/O bandwidth of all the Exadata cells.
In this scenario, there are 14 Exadata cells. Assuming that each Exadata cell can deliver 1.8 gigabytes (GB) of I/O throughput per second, the potential scanning power of all the Exadata cells is 25.2 GB per second.
Exadata Smart Scan Scale-Out Example
select /*+ full(lineitem) */ count(*) from lineitem where l_orderkey < 0;
[Slide diagram: database server dbs1 and Exadata cells edsc1 through edsc14, with 12 disks per cell, each cell delivering 0.357 GB/s. Callouts:]
• Database asks to retrieve all blocks by doing a full table scan, and then filters matching rows.
• If the table is evenly distributed across all disks, each cell cannot send more than 40 / 14 = 2.85 Gb/s = 0.357 GB/s to the database instance.
• If the table is 4800 GB in size, the complete scan would take approximately 16 minutes.
• Disks are throttled by the network bandwidth!
Exadata Smart Scan Scale-Out Example (continued)
Now assume a 4800 gigabyte table is evenly spread across the 14 Exadata cells and a query is executed which requires a full table scan. As is commonly the case, assume that the query returns a small set of result records.
Without Smart Scan capabilities, each Exadata server behaves like a traditional storage server by delivering database blocks to the client database.
Because the storage network is bandwidth-limited to 40 gigabits per second, it is not possible for the Exadata cells to deliver all their power. In this case, each cell cannot deliver more than 0.357 gigabytes per second to the database and it would take approximately 16 minutes to scan the whole table.
Exadata Smart Scan Scale-Out Example
select /*+ full(lineitem) */ count(*) from lineitem where l_orderkey < 0;
[Slide diagram: database server dbs1 and Exadata cells edsc1 through edsc14, with 12 disks per cell, each cell scanning at 1.8 GB/s. Callouts:]
• Database asks Exadata cells to send back all matching rows.
• If the table is evenly distributed across all disks, each cell cannot send more than 40 / 14 = 2.85 Gb/s = 0.357 GB/s to the database instance.
• Each cell can scan at a speed of 1.8 GB/s, and send its matching rows to the database instance. This represents a total scan at a speed of 25.2 GB/s!
• If the table is 4800 GB in size, the complete table scan will complete in approximately three minutes and ten seconds!
Exadata Smart Scan Scale-Out Example (continued)
Now consider the same query with Smart Scan enabled. The same storage network bandwidth limit applies. However, this time the entire 4800 GB is not transported across the storage network; only the matching rows are transported back to the database server. So each Exadata cell can process its part of the table at full speed; that is, 1.8 GB per second. In this case, the entire table scan would be completed in approximately three minutes and ten seconds.
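The arithmetic behind the two scan times can be reproduced with a short calculation. The figures come from the example above; the variable names and the `fmt` helper are only illustrative, and the model keeps the example's simplifying assumptions (no messaging overhead, a negligible result set).

```python
GBIT_PER_GBYTE = 8

network_gbit_per_s = 40        # InfiniBand link into the database server
cells = 14
cell_scan_gbyte_per_s = 1.8    # scan rate each cell can deliver
table_gbyte = 4800

# Without Smart Scan: every block has to cross the 40 Gb/s network.
network_gbyte_per_s = network_gbit_per_s / GBIT_PER_GBYTE
per_cell_gbyte_per_s = network_gbyte_per_s / cells
classic_seconds = table_gbyte / network_gbyte_per_s

# With Smart Scan: cells scan at full speed and ship only matching rows.
smart_gbyte_per_s = cells * cell_scan_gbyte_per_s
smart_seconds = table_gbyte / smart_gbyte_per_s


def fmt(seconds):
    """Format a duration as whole minutes and seconds."""
    minutes, secs = divmod(round(seconds), 60)
    return f"{minutes} min {secs} s"


print(f"per-cell network share: {per_cell_gbyte_per_s:.3f} GB/s")
print("scan without Smart Scan:", fmt(classic_seconds))
print("scan with Smart Scan:", fmt(smart_seconds))
```

The roughly fivefold difference is simply the ratio between the aggregate disk scan rate (25.2 GB/s) and the network limit (5 GB/s).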
Exadata Hybrid Columnar Compression
Warehouse Compression (Optimized for Speed)
• 10x average storage savings
• 10x scan I/O reduction
• Optimized for query performance
• Result: reduced warehouse size, better performance
Archival Compression (Optimized for Space)
• 15x average storage savings
  – Up to 50x on some data
• Some access overhead
• For cold or historical data
• Result: reclaim disks, keep data online
Can mix compression types by partition for ILM.
Exadata Hybrid Columnar Compression
In addition to the basic and OLTP compression capabilities of Oracle Database 11g, Exadata includes Exadata Hybrid Columnar Compression. Exadata Hybrid Columnar Compression offers higher compression ratios for direct path loaded data. This compression capability is recommended for data that is not updated frequently.
You can specify Exadata Hybrid Columnar Compression at the table, partition, and tablespace level. You can also choose between two types of Exadata Hybrid Columnar Compression, to achieve the proper trade-off between disk usage and CPU consumption, depending on your requirements:
• Warehouse compression: This type of compression is optimized for query performance, and is intended for data warehouse applications.
• Online archival compression: This type of compression is optimized for maximum compression ratios, and is intended for data that does not change frequently.
You can use Exadata Hybrid Columnar Compression on complete tables or in combination with basic and OLTP compression by using partitioning.
Note: A compression advisor, provided by the DBMS_COMPRESSION package, helps you determine the expected compression ratio for a particular table with a particular compression method.
Exadata Hybrid Columnar Compression Architecture Overview
[Slide diagram: a compression unit (CU) spanning four database blocks. Each block starts with a block header; the first block also holds the CU header. Column data C1 through C8 is laid out across the blocks.]
• A compression unit is a logical structure spanning multiple database blocks.
• Each row is self-contained within a compression unit.
• Data is organized by column during data load.
• Each column is compressed separately.
• Smart Scan is supported.
Exadata Hybrid Columnar Compression Architecture Overview
Exadata Hybrid Columnar Compression is a new method for organizing data in database blocks. Tables are organized into sets of rows called compression units (CU). Within a compression unit, data is organized by column and then compressed. The column organization of data brings similar values close together, enhancing compression ratios. Each row is self-contained within a compression unit.
In addition to providing excellent compression, Exadata Hybrid Columnar Compression works in conjunction with Smart Scan so that column projection and row filtering can be executed along with decompression at the storage level to save CPU cycles on the database servers.
Note: Although the diagram in the slide shows a compression unit containing four data blocks, it should not be assumed that a compression unit always contains four blocks. The size of a compression unit is determined automatically by Oracle Database based on various factors in order to deliver the most effective compression result while maintaining excellent query performance.
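The idea of pivoting a set of rows into columns and compressing each column separately can be sketched as follows. This is a toy model that uses zlib and a made-up three-column table; the actual compression unit format and algorithms are not described in this lesson, and `compress_unit`, `decompress_unit`, and `compress_row_major` are hypothetical helper names.

```python
import zlib

# Made-up sample rows: (order_id, status, country).
rows = [
    (i, "SHIPPED" if i % 4 else "PENDING", "US" if i % 3 else "DE")
    for i in range(1000)
]


def compress_row_major(rows):
    """Baseline: compress the rows in their original row-by-row layout."""
    data = "\n".join(",".join(map(str, r)) for r in rows)
    return zlib.compress(data.encode())


def compress_unit(rows):
    """Pivot a set of rows into columns and compress each column separately."""
    columns = zip(*rows)
    return [zlib.compress("\n".join(map(str, col)).encode()) for col in columns]


def decompress_unit(unit):
    """Rebuild the rows (as tuples of strings) from the compressed columns."""
    columns = [zlib.decompress(col).decode().split("\n") for col in unit]
    return list(zip(*columns))


unit = compress_unit(rows)
print("row-major bytes:", len(compress_row_major(rows)))
print("column-major bytes:", sum(len(col) for col in unit))

# Column projection: only the second column needs to be decompressed.
statuses = zlib.decompress(unit[1]).decode().split("\n")
print("distinct statuses:", sorted(set(statuses)))
```

Because every row of the set lives inside the same unit, the rows can be rebuilt from that unit alone, and a query that needs one column can skip decompressing the others.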
Exadata Smart Flash Cache
• High performance cache for frequently accessed objects
• Excellent for absorbing repeated random reads
• Allows optimization by application table
[Slide diagram labels: hundreds of I/Os per second; tens of thousands of I/Os per second.]
Exadata Smart Flash Cache
For many years, a constraining factor for storage performance has been the number of random I/Os per second (IOPS) that a disk can deliver. To compensate for the fact that even a high performance disk can deliver only a few hundred IOPS, large storage arrays with hundreds of disks are required to deliver in excess of 60,000 IOPS.
Exadata provides Exadata Smart Flash Cache, a caching mechanism for frequently accessed data. It is a write-through cache which is useful for absorbing repeated random reads, and very beneficial to OLTP. Using Exadata Smart Flash Cache, a single Exadata cell can support up to 75,000 IOPS, two cells can support up to 150,000 IOPS, and so on.
Exadata Smart Flash Cache focuses on caching frequently accessed data and index blocks, along with performance critical information such as control files and file headers. In addition, DBAs can influence caching priorities using the CELL_FLASH_CACHE storage attribute for specific database objects.
Exadata Smart Flash Cache
High performance cache that understands different types of database I/O:
• Frequently accessed data and index blocks are cached.
• Control file reads and writes are cached.
• File header reads and writes are cached.
• DBA can influence caching priorities.
• I/Os to mirror copies are not cached.
• Backup-related I/O is not cached.
• Data Pump I/O is not cached.
• Data file formatting is not cached.
• Table scans do not monopolize the cache.
Exadata Smart Flash Cache (continued)
In more recent times, vast and expensive storage arrays have introduced equally expensive nonvolatile memory caches to improve performance. However, these caches know nothing about the applications using them, so their efficiency is limited when compared to their cost.
With Exadata, each database I/O is tagged with metadata indicating the I/O type. Exadata Smart Flash Cache uses this information to make intelligent decisions about how to use the cache. This cooperation ensures the efficient use of Exadata Smart Flash Cache.
For example, with ASM mirroring turned on, multiple copies of each data block must be written to disk to deliver the desired level of data protection. However, there is usually no need to cache the secondary copies of a block because ASM will read the primary copy if it is available. A traditional storage array would not know about this characteristic, leading to caching inefficiencies.
Similarly, with traditional storage arrays, backups and exports will typically cause all the data to be loaded into the cache even though the operation will not read the data repeatedly. Exadata knows that there is no need to fill the cache with backup and export data. The same is true for data file formatting operations. Finally, Exadata does not flood the cache with data from full table scans, as is the case with most storage arrays.
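A write-through cache that decides what to keep based on an I/O type tag can be modeled in a few lines. The tag names, the `TaggedCache` class, and the LRU eviction policy are assumptions made for this sketch; the lesson does not describe the real metadata format or replacement policy, and full table scan handling is left out.

```python
from collections import OrderedDict

# Hypothetical I/O type tags that are worth caching. Any other tag
# (for example "mirror_copy", "backup", "data_pump", "file_format")
# bypasses the cache in this model.
CACHEABLE = {"data_block", "index_block", "control_file", "file_header"}


class TaggedCache:
    """Toy write-through cache with LRU eviction and I/O-type awareness."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()  # block_id -> data, least recent first

    def _store(self, block_id, data):
        self.entries[block_id] = data
        self.entries.move_to_end(block_id)
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict least recently used

    def write(self, block_id, data, io_type, disk):
        disk[block_id] = data  # write-through: disk is always updated
        if io_type in CACHEABLE:
            self._store(block_id, data)
        else:
            self.entries.pop(block_id, None)  # never keep a stale copy

    def read(self, block_id, io_type, disk):
        """Return (data, cache_hit)."""
        if block_id in self.entries:
            self.entries.move_to_end(block_id)
            return self.entries[block_id], True
        data = disk[block_id]
        if io_type in CACHEABLE:
            self._store(block_id, data)
        return data, False


disk = {}
cache = TaggedCache(capacity=2)
cache.write("blk1", "row data", "data_block", disk)
cache.write("blk2", "mirror of blk1", "mirror_copy", disk)

print(cache.read("blk1", "data_block", disk))  # served from the cache
print(cache.read("blk2", "backup", disk))      # read from disk, not cached
print(list(cache.entries))
```

The point of the model is that the storage layer needs the tag to make this decision; a cache that sees only block addresses would have kept both writes.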
Exadata Storage Index
Storage Index in Memory
SELECT * FROM T1 WHERE B
[Slide diagram: disk storage entities. Command shown: CREATE GRIDDISK ... On the first two LUNs only, the LUN holds a system area and a cell disk; on the other ten LUNs, the cell disk occupies the LUN. Each cell disk holds either one grid disk, or a grid disk on the hot part and a grid disk on the cold part. Grid disks are visible to ASM.]
Disk Storage Entities and Relationships
Each Exadata cell contains 12 physical disks. On each of the first two disks, Exadata reserves a system area that spans multiple partitions with a total size of approximately 29 GB. The system area contains the OS image, swap space, Exadata software binaries, metric and alert repository, and various other configuration and metadata files. The two system areas are mirror copies of each other which are maintained via software mirroring.
Exadata automatically senses the physical disks in each cell. As a cell administrator you can only view a predefined set of physical disk attributes.
Each physical disk is mapped to a logical abstraction called a Logical Unit (LUN). A LUN exposes additional predefined metadata attributes to a cell administrator. You cannot create or remove a LUN; they are automatically created.
A cell disk is a higher level abstraction that represents the data storage area on each LUN. For the two LUNs that contain the system areas, Exadata recognizes the way that the LUN is partitioned and maps the cell disk to the disk partition reserved for data storage. For the other 10 disks, Exadata maps the cell disk directly to the LUN.
After a cell disk is created, it can be subdivided into one or more grid disks, which are directly exposed to ASM.
Disk Storage Entities and Relationships (continued)
Placing multiple grid disks on a cell disk allows the administrator to segregate the storage into pools with different performance characteristics. For example, a cell disk could be partitioned so that one grid disk resides on the highest performing portion of the disk (the outermost tracks on the physical disk), whereas a second grid disk might be configured on the lower performing portion of the disk (the inner tracks). The first grid disk might then be used in an ASM disk group that houses highly active (hot) data, whereas the second grid disk might be used to store less active (cold) data files.
Placing multiple grid disks on a cell disk also allows the administrator to segregate the storage into separate pools that can be assigned to different databases.
In cases where the entire cell capacity is required for a single database or where it is difficult to clearly define hot and cold data sets, an Exadata administrator will usually define a single grid disk containing all the space on each cell disk.
Note: The diagram in the slide shows the cases where one or two grid disks are created from the space on a cell disk. However, you can create more than two grid disks on a cell disk.
Interleaved Grid Disks
[Slide diagram: two physical disks. With default allocation, Grid Disk 1 occupies the fastest tracks and Grid Disk 2 the slowest tracks; Grid Disk 1 benefits from the higher performance outer tracks of the disk. With interleaved allocation, Grid Disk 3 and Grid Disk 4 each take 50% of their space from the faster tracks and 50% from the slower tracks; the performance of Grid Disk 3 and Grid Disk 4 is more evenly balanced.]
Interleaved Grid Disks
By default, space for grid disks is allocated from the outer tracks to the inner tracks of a physical disk. However, space for grid disks can be allocated in an interleaved manner. Grid disks that use this type of space allocation are referred to as interleaved grid disks. This method attempts to equalize the performance of the grid disks residing on the same physical disk.
The slide contrasts default grid disk allocation with interleaved grid disks. On the left, two grid disks have been created on a physical disk using default space allocation. In this case, Grid Disk 1 occupies all the fastest (outer) tracks, whereas Grid Disk 2 occupies all the slower (inner) tracks.
On the right, you see an example of interleaved grid disks. With interleaving enabled, a disk is divided into two equal parts: the outer half (upper portion) and the inner half (lower portion). When a new grid disk is created, half of the grid disk space is allocated on the upper portion, and the other half of the grid disk space is allocated on the lower portion.
Interleaved grid disks are best used in situations where you want to create separate ASM disk groups that share cell disks without a performance bias.
Note that interleaving is enabled by setting the INTERLEAVING attribute for the cell disk. For example:
CellCLI> CREATE CELLDISK cd_03_cell01_int LUN=03 -
         INTERLEAVING='normal_redundancy'
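The difference between the two allocation schemes can be sketched as offset ranges on a disk, with offset 0 standing for the outermost (fastest) track. This is a simplified model of the behavior described above, not the cell software's actual allocator; the `allocate` function, the unit-less sizes, and the assumption of even grid disk sizes are all illustrative.

```python
def allocate(grid_disk_sizes, disk_size, interleaved=False):
    """Return {name: [(start, end), ...]}; offset 0 is the outermost track.

    Sizes are assumed to be even numbers in arbitrary units.
    """
    if sum(size for _, size in grid_disk_sizes) > disk_size:
        raise ValueError("grid disks do not fit on the cell disk")

    layout = {}
    if not interleaved:
        # Default: each grid disk takes the next contiguous range, outer to inner.
        cursor = 0
        for name, size in grid_disk_sizes:
            layout[name] = [(cursor, cursor + size)]
            cursor += size
        return layout

    # Interleaved: half of each grid disk comes from the outer half of the
    # disk and the other half from the inner half.
    outer_cursor, inner_cursor = 0, disk_size // 2
    for name, size in grid_disk_sizes:
        part = size // 2
        layout[name] = [
            (outer_cursor, outer_cursor + part),
            (inner_cursor, inner_cursor + part),
        ]
        outer_cursor += part
        inner_cursor += part
    return layout


grid_disks = [("gd1", 100), ("gd2", 100)]
print(allocate(grid_disks, disk_size=200))
print(allocate(grid_disks, disk_size=200, interleaved=True))
```

In the default layout the first grid disk gets all of the outer tracks; in the interleaved layout each grid disk gets an equal share of outer and inner tracks.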
Flash Storage Entities and Relationships
[Slide diagram: within an Exadata cell, a flash device appears as a LUN, on which a cell disk is created. The cell disk is allocated either entirely to flash cache, or to flash cache plus a grid disk that is visible to ASM as an ASM disk. Commands shown: CellCLI> CREATE FLASHCACHE ... and CellCLI> CREATE GRIDDISK ... FLASHDISK ...]
Flash Storage Entities and Relationships
Each Exadata cell contains 384 GB of high performance flash memory distributed across 4 PCI flash memory cards. Each card has 4 flash devices for a total of 16 flash devices on each cell. Each flash device has a capacity of 24 GB.
Essentially, each flash device is much like a physical disk in the Exadata storage hierarchy. Each flash device is visible to the Cell Server software as a LUN. You can create a cell disk based on a flash-based LUN. You can then create numerous grid disks on each flash-based cell disk. In addition, space on a flash-based cell disk can be allocated to a special area that supports Exadata Smart Flash Cache.
By default, the initial cell configuration process creates flash-based cell disks on all the flash devices, and then allocates all the available flash space to Exadata Smart Flash Cache.
To create space for flash-based grid disks, you need to drop the default flash cache. Then you can create a flash cache and flash-based grid disks with your chosen sizes.
Unlike physical disk devices, the order in which you allocate your flash space is not important from a performance perspective. Likewise, interleaving is not applicable for flash-based cell disks.
Note: The diagram in the slide shows the case where a flash-based cell disk is allocated entirely to flash cache, and the case where a flash-based cell disk is used for flash cache and one grid disk. However, you can allocate up to one flash cache area along with zero or more flash-based grid disks from a flash-based cell disk.
Disk Group Configuration
[Slide diagram: two Exadata cells. The DATA disk group and the FRA disk group each span both cells, and each disk group has a CELL1 failure group and a CELL2 failure group. Command shown: SQL> CREATE DISKGROUP]
Disk Group Configuration
After the grid disks are configured, ASM disk groups can be defined across your Exadata configuration. The slide illustrates an example where two ASM disk groups are defined. The DATA disk group is defined across all the red grid disks, and the FRA disk group is defined across the blue grid disks. When data is loaded into each disk group, ASM will evenly distribute the data and I/O across the grid disks in each disk group.
To protect against the failure of an entire Exadata cell, ASM failure groups are automatically defined on a per cell basis. This is to ensure that mirrored ASM extents are placed on different Exadata cells. This is also illustrated in the slide. By default, when failure groups are automatically created, their names correspond to the cell name. So, different disk groups can have the same failure group names.
When using Exadata, it is strongly recommended to use at least NORMAL ASM redundancy for all of your disk groups in conjunction with ASM failure groups spread across at least two Exadata cells. Following this recommendation provides good protection against disk and cell failure.
Using HIGH ASM redundancy in conjunction with ASM failure groups spread across at least three Exadata cells provides the best available level of data protection. Such a configuration can tolerate the simultaneous failure of two complete cells without compromising data availability.
Quiz
What are the three main Exadata services?
1. OMS
2. MS
3. GMON
4. CELLSRV
5. RS
Answer: 2, 4, 5
Quiz
If you use NORMAL ASM redundancy for all of your disk groups in conjunction with ASM failure groups spread across two Exadata cells, under which of the following scenarios will you maintain data availability?
1. A single disk failure in a single cell
2. Simultaneous failure of multiple disks in a single cell
3. Simultaneous failure of a single disk in both cells
4. Complete failure of a single cell
Answer: 1, 2, 4
The prescribed configuration may provide protection against failure scenario 3 if, and only if, there are no data extents mirrored to both of the failed disks. To guarantee data availability in cases where simultaneous failures affect two cells, you must use HIGH ASM redundancy in conjunction with failure groups spread across at least three Exadata cells.
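The reasoning behind the quiz answer can be checked with a small model of NORMAL redundancy across two cells. Each extent is represented as a pair of disks, one per cell, that hold its two copies; the disk names and the extent map are invented for illustration, and real ASM extent placement is more involved.

```python
# Hypothetical disks in two cells (two failure groups).
CELL1 = ["c1d1", "c1d2", "c1d3"]
CELL2 = ["c2d1", "c2d2", "c2d3"]

# Each extent is mirrored on one disk in each cell (NORMAL redundancy).
# This particular placement is made up for the sketch.
extents = [
    ("c1d1", "c2d1"), ("c1d1", "c2d2"),
    ("c1d2", "c2d2"), ("c1d2", "c2d3"),
    ("c1d3", "c2d3"), ("c1d3", "c2d1"),
]


def data_available(failed_disks):
    """Data stays available if every extent keeps at least one copy."""
    failed = set(failed_disks)
    return all(not (a in failed and b in failed) for a, b in extents)


scenarios = {
    "1. single disk in one cell": ["c1d1"],
    "2. multiple disks in one cell": ["c1d1", "c1d2"],
    "3a. one disk per cell, sharing an extent": ["c1d1", "c2d1"],
    "3b. one disk per cell, no shared extent": ["c1d1", "c2d3"],
    "4. complete failure of cell 1": CELL1,
}

for label, failed in scenarios.items():
    print(label, "->", "available" if data_available(failed) else "data lost")
```

Scenario 3 survives only when the two failed disks happen to share no mirrored extent, which is why it cannot be counted as protected.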
Summary
In this lesson, you should have learned to describe:
• The Exadata architecture
• The relationship between the various storage abstractions used in Exadata
Additional Resources
• Lesson Demonstrations (Viewlets)
  – Exadata Process Introduction: http://stcurriculum.oracle.com/demos/db/11g/r2/dbmach/031ExadataProcessIntro/031exadataprocessintro_viewlet_swf.html
  – Hierarchy of Exadata Storage Objects: http://stcurriculum.oracle.com/demos/db/11g/r2/dbmach/032ExadataStorageObjects/032exadatastorageobjects_viewlet_swf.html
  – Creating Interleaved Grid Disks: http://stcurriculum.oracle.com/demos/db/11g/r2/dbmach/033ExadataInterleavedGridDisks/033exadatainterleavedgriddisks_viewlet_swf.html
  – Examining Exadata Smart Flash Cache: http://stcurriculum.oracle.com/demos/db/11g/r2/dbmach/034ExadataFlashCacheAdmin/034exadataflashcacheadmin_viewlet_swf.html
  – Exadata Smart Flash Cache Architecture: http://stcurriculum.oracle.com/demos/db/11g/r2/exadatav2/smartflashcachearchitecture/smartflashcachearchitecture.swf
• My Oracle Support Notes
  – Oracle Reliable Datagram Sockets (RDS) and InfiniBand (IB) Support for RAC Interconnect and Exadata Storage: https://support.oracle.com/CSP/main/article?cmd=show&type=NOT&id=745616.1
Practice 3 Overview: Introducing Exadata Cell Architecture
In these practices, you will be familiarized with the Exadata cell architecture. You will:
• Examine the Exadata processes
• Examine the hierarchy of cell objects
• Create interleaved grid disks
• Examine Exadata Smart Flash Cache
Exadata Configuration
Objectives
After completing this lesson, you should be able to:
• Perform the initial Exadata boot sequence
• Configure Exadata software
• Create and configure ASM disk groups using Exadata
• Use the CellCLI Exadata administration tool
Exadata Installation and Configuration Overview
1. Initial network preparation
2. Configuration of new Exadata servers
3. Configuring Exadata software
4. Configuring hosts to use Exadata
5. Configuring ASM and Database instances for Exadata
6. Configuring ASM disk group for Exadata
Exadata Installation and Configuration Overview
Exadata ships with all hardware and software preinstalled. However, it is necessary to configure Exadata. This slide introduces a general overview of the configuration tasks.
Note: In most cases the installation and configuration activities described in this lesson occur as part of the installation and configuration of Database Machine and there is no requirement to perform cell-by-cell configuration. You may need to conduct some of the activities described in this lesson during the normal lifecycle of maintaining your Database Machine environment. However, the complete Exadata configuration process would only be required in rare circumstances, such as when upgrading from a Quarter-Rack Database Machine to a Half-Rack or Full-Rack configuration. The Database Machine configuration process is described later in this course in the lesson entitled Database Machine Configuration.
Initial Network Preparation (step 1)
For each storage cell, assign the following IP addresses (repeat for each cell):
• One IP address for the bonded InfiniBand port
• One IP address for administration network access
• One IP address for lights out management
Note these network configuration recommendations:
• Set up a fault-tolerant, private network subnet for the InfiniBand network.
  – Use the InfiniBand network for Oracle Clusterware.
• Assign a block of IP addresses for each network type.
  – Do not allocate IP addresses ending in .0, .1, or .255.
• Define your storage cells to DNS.
Initial Network Preparation
Each storage cell contains the following network ports:
1. One dual-port InfiniBand card for high-speed, high-volume data transfer: Each Exadata cell is designed to be connected to two separate InfiniBand switches for high availability. The dual-port card is only for availability reasons because each port is capable of transferring the full data bandwidth generated by the storage cell. You will need to assign one IP address to the bonded InfiniBand interface during the initial configuration of the storage cell.
2. Gigabit Ethernet ports for general administration network access to the cell operating system: Each Exadata server comes with four Gigabit Ethernet ports. However, only one is required for administrative access. You will need to assign one IP address to the cell for network access during the initial configuration process.
3. One Gigabit Ethernet port for lights out management: Exadata uses Sun Integrated Lights Out Manager (ILOM). You should assign one IP address to the cell for ILOM during the initial configuration of the storage cell.
Initial Network Preparation (continued)
Note the following network configuration and IP address recommendations:
• It is recommended that the InfiniBand network be a dedicated private network subnet for Exadata cells and database server hosts. Multiple InfiniBand switches are recommended to eliminate the switch as a single point of failure.
• The InfiniBand network should be used for Oracle Clusterware network and storage communications. Use the following command on your clusterware hosts to verify that the private network for Oracle Clusterware communication is using InfiniBand:
  oifcfg getif -type cluster_interconnect
• The Reliable Datagram Sockets (RDS) protocol should be used over the InfiniBand network for database server to cell communication and Oracle Real Application Clusters (RAC) communication. Check the database alert log to verify that the private network for Oracle RAC is running the RDS protocol over the InfiniBand network. The following message should be in the log:
  cluster interconnect IPC version: Oracle RDS/IP (generic)
• Dedicate a block of IP addresses for the InfiniBand network and ensure that you allow for future expansion.
• Dedicate a block of IP addresses for the general administration interfaces and the lights out management interfaces. The general administration interfaces and the lights out management interfaces may be on the same subnet and may share that subnet with other hosts. For example, on the 192.168.200.0/24 subnet, you might assign the block of IP addresses between 192.168.200.31 and 192.168.200.50 for your Exadata general administration interfaces and the lights out management interfaces. Other hosts sharing the subnet would be allocated IP addresses outside the dedicated block. If you want, you can place the general administration interfaces and the lights out management interfaces on separate subnets; however, this is not required.
• Do not allocate addresses that end in .0, .1, or .255, or those that would be used as broadcast addresses for the specific netmask that you have selected. For example, avoid addresses such as 192.168.200.0, 192.168.200.1, and 192.168.200.255.
• Exadata cells do not require Domain Name System (DNS); however, DNS is recommended for use in conjunction with Database Machine. If DNS is available in your network, configure your DNS with the IP addresses and host names associated with the general administrative network on each Exadata cell.
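The address recommendations above can be checked mechanically before any cell is configured. The following is a minimal sketch in Python, using only the standard ipaddress module. The function name usable_block is hypothetical, and the subnet and address range are the example values from the notes.

```python
import ipaddress

def usable_block(subnet, first, last):
    """Return the addresses from first to last (inclusive) that lie inside
    subnet and do not break the recommendations in the notes."""
    net = ipaddress.ip_network(subnet)
    start = int(ipaddress.ip_address(first))
    end = int(ipaddress.ip_address(last))
    picked = []
    for value in range(start, end + 1):
        addr = ipaddress.ip_address(value)
        if addr not in net:
            continue  # outside the chosen subnet
        if addr in (net.network_address, net.broadcast_address):
            continue  # network and broadcast address for this netmask
        if value & 0xFF in (0, 1, 255):
            continue  # addresses ending in .0, .1, or .255
        picked.append(str(addr))
    return picked

# Example block from the notes: 192.168.200.31 through 192.168.200.50
block = usable_block("192.168.200.0/24", "192.168.200.31", "192.168.200.50")
print(len(block), block[0], block[-1])
```

Running the same function over a block that crosses a /24 boundary in a larger subnet shows which addresses would be skipped.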
Configuration of New Exadata Servers (step 2)
1. Check all physical connections.
2. Power on the Exadata server.
3. Answer questions during boot sequence:
   – Domain Name Service (DNS) server IP addresses
   – Time preference (time region and location)
   – Network Time Protocol (NTP) servers
   – Ethernet and InfiniBand IP addresses, netmasks, gateway, and hostnames
   – Remote management configuration details
4. Change the initial passwords for the root, celladmin, and cellmonitor users.
Repeat for each cell.
Configuration of New Exadata Servers
The slide lists the general steps to configure a new Exadata server:
1. Check all the physical connections to the Exadata server. It is important that all the physical network connections are correct prior to configuring the cell. Also check that both power supplies are connected and that you have a keyboard, video display, and mouse.
2. Power on the cell to boot its operating system.
3. Answer the configuration questions when you are prompted. The slide lists the information that you need to provide.
4. After you successfully perform the previous step, the login screen is displayed. Change the initial passwords for the root, celladmin, and cellmonitor users to more secure passwords. The initial password for root is welcome1. The initial password for the cellmonitor and celladmin users is welcome.
Answering Questions During the Initial Boot Sequence
...
Network interfaces
Name   State     IP address   Netmask   Gateway   Hostname
eth0   Linked
eth1   Unlinked
eth2   Unlinked
eth3   Unlinked
ib0    Linked
ib1    Linked
Warning. Some network interface(s) are disconnected.
Check cables and switches and retry
Do you want to retry (y/n) [y]: n
Nameserver: mynameserv.company.com
Add more nameservers (y/n) [n]: n
Setting up local time...
Select country by number, [n]ext, [l]ast: 230
Select zone by number, [n]ext: 17
Selected timezone: America/Denver
Is this correct (y/n) [y]: y
The current ntp server(s):
Do you want to change it (y/n) [n]: y
Fully qualified hostname or ip address for NTP server. Press enter if none: ntp1.company.com
Continue adding more ntp servers (y/n) [n]: n
...
Answering Questions During the Initial Boot Sequence
The next four slides show an example of the initial boot configuration process for Exadata. On each slide, the text in blue indicates user input. The configuration commences during the server boot sequence. The output from the initial part of the boot sequence is not shown. This slide commences at the beginning of the interview phase, where user input is required. In this slide, settings are made for the DNS name server, time zone, and NTP server.
Notice that the interview phase commences with a warning indicating that a number of network interfaces are disconnected. As shown in the slide, it is safe to ignore this warning: each Exadata server comes equipped with four Ethernet ports, but only one (eth0) is required, so it is normal for eth1, eth2, and eth3 to be disconnected. Always make sure that the required network interfaces (eth0, ib0, and ib1) are correctly linked.
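The rule in the notes (eth0, ib0, and ib1 must be linked; eth1 through eth3 may be unlinked) can be written as a small check. This is only an illustrative Python sketch: the helper names are hypothetical, and the sample text is a simplified "name state" listing modeled on the slide, not the exact console format.

```python
# Interfaces the notes describe as required.
REQUIRED = {"eth0", "ib0", "ib1"}

def parse_states(listing):
    """Map interface name to state from lines of the form 'name state'."""
    states = {}
    for line in listing.strip().splitlines():
        parts = line.split()
        if len(parts) >= 2:
            states[parts[0]] = parts[1]
    return states

def missing_required(states):
    """Return the required interfaces that are not reported as Linked."""
    return sorted(name for name in REQUIRED if states.get(name) != "Linked")

sample = """
eth0 Linked
eth1 Unlinked
eth2 Unlinked
eth3 Unlinked
ib0 Linked
ib1 Linked
"""
states = parse_states(sample)
print(missing_required(states))
```

An empty result corresponds to the case in which the warning can be ignored.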
Answering Questions During the Initial Boot Sequence
...
Network interfaces
Name    State     IP address       Netmask         Gateway        Hostname
eth0    Linked
bond0   ib0,ib1
Select interface name to configure or press Enter to continue: eth0
Selected interface. eth0
IP address or none: 10.XXX.XXX.XXX
Netmask: 255.255.248.0
Gateway (IP address or none): 10.XXX.XXX.1
Fully qualified hostname or none: cell01.company.com
Continue configuring or re-configuring interfaces? (y/n) [y]: y
Network interfaces
Name    State     IP address       Netmask         Gateway        Hostname
eth0    Linked    10.XXX.XXX.XXX   255.255.248.0   10.XXX.XXX.1   cell01.company.com
bond0   ib0,ib1
Select interface name to configure or press Enter to continue: bond0
Selected interface. bond0
IP address: 192.168.50.76
Netmask: 255.255.255.0
Fully qualified hostname or none: cell01-priv.company.com
Continue configuring or re-configuring interfaces? (y/n) [y]: y
Network interfaces
Name    State     IP address       Netmask         Gateway        Hostname
eth0    Linked    10.XXX.XXX.XXX   255.255.248.0   10.XXX.XXX.1   cell01.company.com
bond0   ib0,ib1   192.168.50.76    255.255.255.0                  cell01-priv.company.com
Select interface name to configure or press Enter to continue:
...
Answering Questions During the Initial Boot Sequence (continued)
In this slide, the configuration phase continues with settings specified for the Ethernet network (eth0) that supports administrative access to the storage server, along with the InfiniBand network (bond0) that supports the main storage network. Notice that the InfiniBand interface is named bond0 and uses bonding between the physical InfiniBand interfaces ib0 and ib1. Bonding provides the ability to transparently fail over from ib0 to ib1 or from ib1 to ib0 if connectivity to either interface is lost.
If you choose not to configure each interface in the list, the unconfigured interfaces will not be started during system startup and the cell will not be fully functional. You can later configure, or change, cell network settings using the ipconf utility.
Answering Questions During the Initial Boot Sequence
...
Select canonical hostname from the list below
1: cell01.company.com
2: cell01-priv.company.com
Canonical fully qualified domain name: 1
Select default gateway interface from the list below
1: eth0
Default gateway interface: 1
Canonical hostname: cell01.company.com
Nameservers: mynameserv.company.com
Timezone: America/Denver
NTP servers: ntp1.company.com
Default gateway device: eth0
Network interfaces
Name    State     IP address       Netmask         Gateway        Hostname
eth0    Linked    10.XXX.XXX.XXX   255.255.248.0   10.XXX.XXX.1   cell01.company.com
bond0   ib0,ib1   192.168.50.76    255.255.255.0                  cell01-priv.company.com
Is this correct (y/n) [y]: y
...
Answering Questions During the Initial Boot Sequence (continued)
In this slide, the network configuration is finalized with the specification of the canonical host name and default gateway. Both of these settings map to the Ethernet network providing administrative access to the cell.
Answering Questions During the Initial Boot Sequence
...
Do you want to configure basic ILOM settings (y/n) [y]: y
Loading basic configuration settings from ILOM ...
ILOM Fully qualified hostname [cell01-ilom.company.com]: cell01-ilom.company.com
ILOM IP address [10.XXX.XXX.YYY]: 10.XXX.XXX.YYY
ILOM Netmask [255.255.248.0]: 255.255.248.0
ILOM Gateway [10.XXX.XXX.1]: 10.XXX.XXX.1
ILOM Nameserver or none [mynameserv.company.com]: mynameserv.company.com
ILOM Use NTP Servers (enabled/disabled) [enabled]: enabled
ILOM First NTP server. Fully qualified hostname or ip address or none [ntp1.company.com]: ntp1.company.com
ILOM Second NTP server. Fully qualified hostname or ip address or none [none]: none
Basic ILOM configuration settings:
Hostname             : cell01-ilom.company.com
IP Address           : 10.XXX.XXX.YYY
Netmask              : 255.255.248.0
Gateway              : 10.XXX.XXX.1
DNS servers          : mynameserv.company.com
Use NTP servers      : enabled
First NTP server     : ntp1.company.com
Second NTP server    : none
Timezone (read-only) : America/Denver
Is the correct (y/n) [y]: y
...
Answering Questions During the Initial Boot Sequence (continued) Configuration completes with settings for Integrated Lights Out Manager (ILOM). If you choose not to configure ILOM at this time, you can use the ipconf utility to do so later. After the user interview phase is completed, the Exadata server finalizes its system startup process. The output from the remaining system startup activities is not shown in the slide. Finally, a login prompt is displayed.
Exadata Administrative User Accounts
Three operating system users are configured for each Exadata server:
• The root user can:
  – Edit configuration files such as cellinit.ora and cellip.ora
  – Change network configuration settings
  – Run support and diagnostic utilities located under the /opt/oracle.SupportTools directory
  – Run the CellCLI CALIBRATE command
  – Perform all the tasks that the celladmin user can perform
• The celladmin user can:
  – Perform administrative tasks (CREATE, DROP, ALTER, and so on) using the CellCLI utility
  – Package incidents for Oracle Support using the adrci utility
• The cellmonitor user can only view (LIST) Exadata cell objects using the CellCLI utility.
Exadata Storage Server Administrative User Accounts
Three operating system users are configured for each Exadata server: root, celladmin, and cellmonitor. The slide describes the function of each user account. As mentioned before, after you successfully configure the cell, you should log in and change the initial passwords for the root, celladmin, and cellmonitor users to more secure passwords. The initial password for root is welcome1. The initial password for the cellmonitor and celladmin users is welcome.
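The division of CellCLI privileges between the three accounts can be summarized as a lookup table. The Python sketch below is illustrative only. The table is hypothetical and covers just the command verbs named on the slide, and it assumes that celladmin can also run LIST, which the slide implies with "and so on" but does not state.

```python
# Hypothetical capability table distilled from the slide; it models only
# the CellCLI verbs that the slide names explicitly.
CELLCLI_VERBS = {
    "root": {"LIST", "CREATE", "DROP", "ALTER", "CALIBRATE"},
    "celladmin": {"LIST", "CREATE", "DROP", "ALTER"},  # LIST is an assumption
    "cellmonitor": {"LIST"},
}

def can_run(user, command):
    """Return True if the first word of command is a verb allowed for user."""
    words = command.split()
    verb = words[0].upper() if words else ""
    return verb in CELLCLI_VERBS.get(user, set())

print(can_run("root", "CALIBRATE FORCE"))
print(can_run("celladmin", "CALIBRATE"))
print(can_run("cellmonitor", "LIST CELLDISK"))
```

The table makes it easy to see why CALIBRATE has to be run as root while the remaining configuration steps can be run as celladmin.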
Configuring a New Exadata Cell (step 3)
1. Run performance tests on the cell with CALIBRATE.
2. Configure the cell server software.
3. Create cell disks.
4. Create grid disks.
Repeat for each cell.
Configuring a New Exadata Cell
As part of the initial boot configuration, the cell server software is started with a basic configuration. In addition, the flash modules are configured as cell disks and all the flash-based cell disks are allocated to Exadata Smart Flash Cache. At this point, you are ready to finalize the configuration of the Exadata cell. Following is a summary of the recommended procedure. All the steps are executed using CellCLI:
1. As the root user, run performance tests on the cell with the CALIBRATE command.
2. As the celladmin or root user, configure the cell server software with the ALTER CELL command.
3. As celladmin or root, create the disk-based cell disks by using the CREATE CELLDISK command.
4. As celladmin or root, create the grid disks on each cell disk of the storage cell by using the CREATE GRIDDISK command.
Repeat this process on each Exadata cell.
Important I/O Metrics for Oracle Databases
• OLTP (small random I/O): disk bandwidth; metric = IOPS; need high RPM and fast seek time
• DW/OLAP (large sequential I/O): channel bandwidth; metric = MBPS; need large I/O channel
Both metrics are measured with CALIBRATE.
Important I/O Metrics for Oracle Databases
The CALIBRATE command runs raw performance tests on Exadata disks and flash modules. This enables you to measure two important database metrics, IOPS and MBPS:
• IOPS (I/O per second): This metric represents the number of small random I/Os that can be serviced in a second. The IOPS rate mainly depends on how fast the disk media can spin and how many disks are present in the storage system.
• MBPS (megabytes per second): The rate at which data can be transferred between the computing server node and the storage array. This mainly depends on the capacity of the I/O channel that is used to transfer data.
The database I/O workload typically consists of small random I/Os and large sequential I/Os. Small random I/Os are more prevalent in an OLTP application environment in which each server process reads a data block into the buffer cache for updates and the changed blocks are written to storage in batches by the database writer (DBWn) process. Large sequential I/Os are common in a data warehouse environment. OLTP application performance mainly depends on how fast small I/Os are serviced, which depends on how fast the disk can spin and find the data. Large I/O performance depends on the capacity of the I/O channel that connects the server to the storage array; throughput is better when the capacity of the channel is larger.
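A short calculation shows why the two metrics are tracked separately. The Python sketch below converts an I/O rate at a fixed I/O size into throughput. The rates and sizes (400 reads of 8 KB, 150 reads of 1 MB) are hypothetical figures chosen for illustration, not measurements from the course.

```python
def mbps_from_iops(iops, io_size_kb):
    """Throughput in MB/s implied by an I/O rate at a fixed I/O size,
    taking 1 MB as 1024 KB."""
    return iops * io_size_kb / 1024

# Hypothetical workloads on a single disk.
small_random = mbps_from_iops(400, 8)         # OLTP-style 8 KB reads
large_sequential = mbps_from_iops(150, 1024)  # DW-style 1 MB reads
print(f"{small_random:.3f} MB/s vs {large_sequential:.0f} MB/s")
```

With small I/Os, the disk runs out of operations per second long before the channel is busy, so IOPS is the meaningful limit. With large I/Os, the same disk moves far more data per operation and the channel capacity becomes the limit.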
Testing Performance Using CALIBRATE
[root@cell01 ~]# cellcli
CellCLI: Release 11.2.1.2.0 - Production on Mon Nov 02 16:42:06 PST 2009
Copyright (c) 2007, 2009, Oracle. All rights reserved.
Cell Efficiency ratio: 1.0

CellCLI> CALIBRATE FORCE
Calibration will take a few minutes...
Aggregate random read throughput across all hard disk luns: 1601 MBPS
Aggregate random read throughput across all flash disk luns: 4194.49 MBPS
Aggregate random read IOs per second (IOPS) across all hard disk luns: 4838
Aggregate random read IOs per second (IOPS) across all flash disk luns: 137588
Controller read throughput: 1615.85 MBPS
Calibrating hard disks (read only) ...
Lun 0_0 on drive [20:0 ] random read throughput: 152.81 MBPS, and 417 IOPS
Lun 0_1 on drive [20:1 ] random read throughput: 154.72 MBPS, and 406 IOPS
...
Lun 0_10 on drive [20:10 ] random read throughput: 156.84 MBPS, and 421 IOPS
Lun 0_11 on drive [20:11 ] random read throughput: 151.58 MBPS, and 424 IOPS
Calibrating flash disks (read only, note that writes will be significantly slower).
Lun 1_0 on drive [[10:0:0:0]] random read throughput: 269.06 MBPS, and 19680 IOPS
Lun 1_1 on drive [[10:0:1:0]] random read throughput: 269.18 MBPS, and 19667 IOPS
...
Lun 5_2 on drive [[11:0:2:0]] random read throughput: 269.15 MBPS, and 19603 IOPS
Lun 5_3 on drive [[11:0:3:0]] random read throughput: 268.91 MBPS, and 19637 IOPS
CALIBRATE results are within an acceptable range.
Testing Performance Using CALIBRATE
The CALIBRATE command enables you to verify the disk and flash memory performance before the cell is put online. You must execute this command while logged in as the root user at the operating system level.
The CALIBRATE FORCE command allows you to run the tests when Cell Server is running. If you do not use the FORCE option, Cell Server must be shut down. Running CALIBRATE at the same time as the Cell Server will impact performance, which is why it is not recommended during normal operations.
Because the Cell Server software is running immediately after the initial boot sequence, you must either shut down the Cell Server software or execute the CALIBRATE FORCE command. CALIBRATE FORCE is acceptable in this circumstance because the cell is not yet running a user workload, so there is no work to disrupt.
In the above example, which shows typical output for high performance disks, the results matched expectations. A message will alert you if the performance measurements are substandard.
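If CALIBRATE output is captured to a file, the per-LUN lines can be extracted for comparison across cells. The following Python sketch is one way to do that. The regular expression is written against the line layout shown in the slide only, so it is an assumption that other releases print the same format, and the function name parse_luns is hypothetical.

```python
import re

# Pattern for the per-LUN lines as they appear in the slide. Hard disk
# drives are printed as [20:0 ] and flash drives as [[10:0:0:0]].
LUN_LINE = re.compile(
    r"Lun (?P<lun>\S+) on drive \[+(?P<drive>[\d:]+)\s*\]+ "
    r"random read throughput: (?P<mbps>[\d.]+) MBPS, and (?P<iops>\d+) IOPS"
)

def parse_luns(text):
    """Return (lun, mbps, iops) tuples for every per-LUN line in text."""
    return [
        (m["lun"], float(m["mbps"]), int(m["iops"]))
        for m in LUN_LINE.finditer(text)
    ]

sample = """\
Lun 0_0 on drive [20:0 ] random read throughput: 152.81 MBPS, and 417 IOPS
Lun 0_1 on drive [20:1 ] random read throughput: 154.72 MBPS, and 406 IOPS
Lun 1_0 on drive [[10:0:0:0]] random read throughput: 269.06 MBPS, and 19680 IOPS
"""
luns = parse_luns(sample)
slowest = min(luns, key=lambda row: row[2])
print(slowest)
```

Sorting or filtering the resulting tuples makes an outlying disk easier to spot than reading the raw listing.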
Configuring the Exadata Cell Server Software
[celladmin@cell01 ~]$ cellcli
CellCLI: Release 11.2.1.2.0 - Production on Mon Nov 02 17:46:13 PST 2009
Copyright (c) 2007, 2009, Oracle. All rights reserved.
Cell Efficiency ratio: 1.0

CellCLI> ALTER CELL smtpServer='my_mail.example.com', smtpFromAddr='[email protected]', smtpPwd= smtpToAddr='[email protected]', notificationPolicy='critical,warning,clear', notificationMethod='mail'
Cell cell01 successfully altered
-
CellCLI>
Configuring the Exadata Cell Server Software
The settings provided during the initial boot sequence configure the hardware and cell operating system. In addition, the Cell Server software is automatically configured using the CREATE CELL command. By default, the cell name is set to the network host name of the Exadata server and the INTERCONNECT1 attribute is set to bond0, which is the InfiniBand storage network interface. You can change the name of the cell or configure the optional Cell Server attributes by using the ALTER CELL command.
The slide shows an example ALTER CELL command that configures email notification. This facility sends email messages to the administrator of the storage cell whenever critical, warning, and clear alerts are detected by the cell. In addition to email notification, it is possible to configure notification using Simple Network Management Protocol (SNMP).
Note: After the initial boot configuration, Restart Server (RS) and Management Server (MS) should be running. If not, an error message will display when using the CellCLI utility. In that case, run the following CellCLI commands to start the RS and MS services:
ALTER CELL STARTUP SERVICES RS
ALTER CELL STARTUP SERVICES MS
Creating Cell Disks
CellCLI> CREATE CELLDISK ALL
CellDisk CD_00_cell01 successfully created
...
CellDisk CD_10_cell01 successfully created
CellDisk CD_11_cell01 successfully created

CellCLI> LIST CELLDISK
CD_00_cell01   normal
...
CD_10_cell01   normal
CD_11_cell01   normal
FD_00_cell01   normal
...
FD_14_cell01   normal
FD_15_cell01   normal

CellCLI>
Creating Cell Disks
After the Exadata cell is first configured, there are 16 flash-based cell disks, which are allocated to Exadata Smart Flash Cache. Before you can use the disk-based storage, you must create disk-based cell disks using the CREATE CELLDISK command. The example in the slide shows the use of the CREATE CELLDISK ALL command to automatically create 12 disk-based cell disks with default names. In most cases, you can use the default cell disk names.
If desired, you can configure your cell disks to enable the creation of interleaved grid disks. Use the following command to create cell disks with interleaving enabled:
CREATE CELLDISK ALL HARDDISK INTERLEAVING='normal_redundancy'
The above example also shows the use of the LIST CELLDISK command to display the disk-based and flash-based cell disks. Check whether the command shows a status of normal for all the cell disks.
Creating Grid Disks
CellCLI> CREATE GRIDDISK ALL PREFIX=data, SIZE=300G
GridDisk data_CD_00_cell01 successfully created
...
GridDisk data_CD_11_cell01 successfully created

CellCLI> CREATE GRIDDISK ALL PREFIX=fra
GridDisk fra_CD_00_cell01 successfully created
...
GridDisk fra_CD_11_cell01 successfully created

CellCLI> LIST GRIDDISK
data_CD_00_cell01   active
...
data_CD_11_cell01   active
fra_CD_00_cell01    active
...
fra_CD_11_cell01    active

CellCLI> exit
[celladmin@cell01 ~]$

[Diagram: a cell disk before and after grid disk creation. The first grid disk created (data) uses the fastest disk portion.]
Creating Grid Disks
After cell disks are created, you can create grid disks by using the CREATE GRIDDISK command. In the example in the slide, the ALL PREFIX option is used to automatically create one grid disk on each cell disk. When the ALL PREFIX option is used, the generated grid disk names are composed of the grid disk prefix followed by an underscore (_) and then the cell disk name.
It is best practice to use the ASM disk group name as the prefix name for the corresponding grid disks. In the example, prefix values data and fra are the names of the ASM disk groups that will be created. Grid disk names must be unique across all cells within a single deployment. By following the recommended naming conventions for naming the grid and cell disks, you will automatically get unique names.
The optional SIZE attribute specifies the size of each grid disk. If omitted, the grid disk will automatically consume all the space remaining on the corresponding cell disk. The LIST GRIDDISK command shows all the grid disks that are created.
Note that for cell disks that are not enabled for interleaving, the first grid disk created on each cell disk uses the outermost portion of the disk. In this area, each track contains more data than the inner tracks, resulting in higher transfer rates and better performance. Because the best available offset is chosen automatically in chronological order of grid disk creation, you should first create those grid disks expected to contain the most frequently accessed data, and then create the grid disks that will contain the relatively colder data.
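The naming convention described above (prefix, underscore, cell disk name) is what guarantees unique grid disk names across cells. The Python sketch below reproduces it. The helper names are hypothetical, and the cell disk pattern (CD, a two-digit number, then the cell name) is inferred from the names shown in the slides rather than from a formal specification.

```python
def cell_disk_names(cell, count, kind="CD"):
    """Default-style cell disk names, for example CD_00_cell01."""
    return [f"{kind}_{i:02d}_{cell}" for i in range(count)]

def grid_disk_names(prefix, cell_disks):
    """Grid disk names as generated with ALL PREFIX: prefix, '_', cell disk name."""
    return [f"{prefix}_{cd}" for cd in cell_disks]

cells = ["cell01", "cell02"]
names = [
    gd
    for cell in cells
    for prefix in ("data", "fra")   # created in this order: hottest data first
    for gd in grid_disk_names(prefix, cell_disk_names(cell, 12))
]
print(names[0], names[-1], len(names), len(set(names)))
```

Because the cell name is embedded in every cell disk name, the same prefix can be reused on every cell without producing duplicates.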
Creating Flash-Based Grid Disks
CellCLI> DROP FLASHCACHE
Flash cache cell01_FLASHCACHE successfully dropped

CellCLI> CREATE FLASHCACHE ALL SIZE=100G
Flash cache cell01_FLASHCACHE successfully created

CellCLI> CREATE GRIDDISK ALL FLASHDISK PREFIX=flash
GridDisk flash_FD_00_cell01 successfully created
GridDisk flash_FD_01_cell01 successfully created
...
GridDisk flash_FD_15_cell01 successfully created

CellCLI> LIST GRIDDISK
...
flash_FD_00_cell01   active
...
flash_FD_15_cell01   active

CellCLI> exit
[celladmin@cell01 ~]$

[Diagram: before, each flash-based cell disk holds only Flash Cache; after, it holds a smaller Flash Cache and a grid disk.]
Creating Flash-Based Grid Disks
By default, the initial cell configuration process creates flash-based cell disks on all the flash devices, and then allocates all the available flash space to Exadata Smart Flash Cache. In certain circumstances, you can benefit from creating flash-based grid disks to act as a permanent flash-based data store.
To create space for flash-based grid disks, you first need to drop the default flash cache. Then you can create a flash cache and flash-based grid disks with your chosen sizes. In the example in the slide, the default flash cache is dropped. Next, a new Exadata Smart Flash Cache is created. The new cache is 100 GB in total size with 6.25 GB of space allocated on each of the 16 flash-based cell disks.
The CREATE GRIDDISK command is used to create flash-based grid disks in the same way as for disk-based grid disks. Note the use of the FLASHDISK option to specify the use of flash-based cell disks as the basis for the grid disks. In the example in the slide, 16 flash-based grid disks are created and each consumes the remaining 17.75 GB of space available on the flash-based cell disks. The flash-based grid disks follow the same default naming convention as disk-based grid disks.
Although this example does not show it, you can create multiple grid disks on a flash-based cell disk. Unlike physical disk devices, the order in which you allocate your flash space is not important from a performance perspective. Likewise, interleaving is not applicable for flash-based disks.
Note: Circumstances that favor the use of flash-based grid disks are discussed in the lesson titled Optimizing Database Performance with Exadata.
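The sizes quoted in the notes follow from simple division across the 16 flash-based cell disks. The Python sketch below works through the arithmetic. The per-cell-disk capacity and the total grid disk space are derived here from the two figures in the notes (6.25 GB of cache and 17.75 GB of grid disk per cell disk); they are not stated in the source.

```python
FLASH_CELL_DISKS = 16
CACHE_TOTAL_GB = 100.0   # SIZE=100G in CREATE FLASHCACHE ALL
GRID_DISK_GB = 17.75     # space left for each flash-based grid disk

# The cache is spread evenly over the flash-based cell disks.
cache_per_disk = CACHE_TOTAL_GB / FLASH_CELL_DISKS

# Derived values (not stated in the notes): usable space per flash
# cell disk and total space given to flash-based grid disks.
implied_cell_disk = cache_per_disk + GRID_DISK_GB
grid_total = GRID_DISK_GB * FLASH_CELL_DISKS

print(cache_per_disk, implied_cell_disk, grid_total)
```

Changing CACHE_TOTAL_GB shows how the split between cache and permanent flash storage moves when a different cache size is chosen.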
Configuring Hosts to Access Exadata Cells (step 4)
1. Create the following directory and files:
# mkdir -p /etc/oracle/cell/network-config
# chown oracle:dba /etc/oracle/cell/network-config
# chmod ug+wx /etc/oracle/cell/network-config
$ cd /etc/oracle/cell/network-config
$ cat - > /etc/oracle/cell/network-config/cellinit.ora
ipaddress1=192.168.50.23/24
$ cat - > /etc/oracle/cell/network-config/cellip.ora
cell="192.168.51.27"
cell="192.168.51.28"
cell="192.168.51.29"
2. Restart database and ASM instances.
Repeat for each host.
Configuring Hosts to Access Exadata Cells
After your Exadata cells are configured, the database server hosts must be configured to use the cells:
• The cellinit.ora file contains the database server IP address that connects to the storage network. This file is host specific, so the IP address will be specific to each database server. The IP address is specified in Classless Inter-Domain Routing (CIDR) format.
• The cellip.ora file contains the IP addresses that are used by storage cells to send data to the database server host. These IP addresses correspond to the bonded InfiniBand interface (bond0) on the cells.
Restart the database and the Oracle ASM instances on the database server host after you finish creating the cellinit.ora and cellip.ora files. After the files have been configured, they should not be edited while your database or ASM instances are running.
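Because cellinit.ora differs on every database server while cellip.ora lists the same cells everywhere, it can help to generate both from one list of addresses. The Python sketch below only renders the file contents as strings in the format shown on the slide. The function names are hypothetical, and writing the files and setting ownership are left to the commands on the slide.

```python
import ipaddress

def render_cellinit(host_cidr):
    """Return cellinit.ora content for one database server.
    host_cidr must be in CIDR form, for example 192.168.50.23/24."""
    if "/" not in host_cidr:
        raise ValueError("address must be in CIDR format")
    ipaddress.ip_interface(host_cidr)  # raises ValueError if malformed
    return f"ipaddress1={host_cidr}\n"

def render_cellip(cell_ips):
    """Return cellip.ora content: one cell="..." line per storage cell."""
    lines = []
    for ip in cell_ips:
        ipaddress.ip_address(ip)  # raises ValueError if malformed
        lines.append(f'cell="{ip}"')
    return "\n".join(lines) + "\n"

print(render_cellinit("192.168.50.23/24"), end="")
print(render_cellip(["192.168.51.27", "192.168.51.28", "192.168.51.29"]), end="")
```

Validating the addresses before the files are written avoids restarting the database and ASM instances against a mistyped entry.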
Configuring ASM and Database Instances for Exadata (step 5)
• Oracle Database and ASM software must be at least version 11.2.0.1
• Use ASM to store OCR and voting disks on Exadata
• Set the ASM_DISKSTRING ASM initialization parameter:
  – ASM_DISKSTRING='o/*/*'
• Set the COMPATIBLE database initialization parameter:
  – COMPATIBLE='11.2.0.0.0'
Repeat for each host.
Configuring ASM and Database Instances for Exadata
Oracle Database and Oracle Grid Infrastructure 11g Release 2 (11.2.0.1) or later must be installed on the database server before you can access Exadata from ASM and database instances. If you are using Oracle Clusterware, it is recommended that you place the Oracle Cluster Registry (OCR) and voting disks on ASM.
To ensure that ASM discovers Exadata grid disks, set the ASM_DISKSTRING initialization parameter. A search string with the following form is used to discover Exadata grid disks:
o/cell_IP_address/grid_disk_name
Wildcards may be used to expand the search string. For example, to explicitly discover all the available Exadata grid disks, set ASM_DISKSTRING='o/*/*'. To discover a subset of available grid disks having names that begin with data, set ASM_DISKSTRING='o/*/data*'. Note that if the ASM_DISKSTRING initialization parameter is not set, then the default is to discover all the available Exadata grid disks.
To configure a database instance to access cell storage, ensure that the COMPATIBLE parameter is set to 11.2.0.0.0 or later in the database initialization file. Note that Database Configuration Assistant (DBCA) 11.2.0.1 does not set the COMPATIBLE initialization parameter to 11.2.0.0.0 by default, and you must set this parameter on the Initialization Parameters page.
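The effect of the two wildcard strings can be previewed with ordinary pattern matching. The Python sketch below assumes that a grid disk path has three slash-separated components (the literal o, a cell identifier, and a grid disk name), as the examples o/*/* and o/*/data* suggest, and that each component is matched with shell-style wildcards. The real ASM discovery rules are not described in this lesson, so this is an illustration and not a description of ASM internals. The sample paths are hypothetical.

```python
from fnmatch import fnmatchcase

def matches(disk_path, pattern):
    """Match a disk path against a search string, one component at a time,
    so that '*' never spans a '/' separator."""
    d = disk_path.split("/")
    p = pattern.split("/")
    return len(d) == len(p) and all(fnmatchcase(a, b) for a, b in zip(d, p))

# Hypothetical grid disk paths on two cells.
disks = [
    "o/192.168.51.27/data_CD_00_cell01",
    "o/192.168.51.27/fra_CD_00_cell01",
    "o/192.168.51.28/data_CD_00_cell02",
]
print([d for d in disks if matches(d, "o/*/data*")])
print(len([d for d in disks if matches(d, "o/*/*")]))
```

The narrower string selects only the grid disks created with the data prefix, which is why the prefix convention and the disk string work well together.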
Configuring ASM Disk Groups for Exadata (step 6)
All candidate disks on cell01 and cell02:
• cell01: o//data_cd_00_cell01, o//data_cd_01_cell01, ..., o//data_cd_11_cell01; o//fra_cd_00_cell01, o//fra_cd_01_cell01, ..., o//fra_cd_11_cell01
• cell02: o//data_cd_00_cell02, o//data_cd_01_cell02, ..., o//data_cd_11_cell02; o//fra_cd_00_cell02, o//fra_cd_01_cell02, ..., o//fra_cd_11_cell02
Disk group DATA:
• Failure group cell01: the data disks on cell01
• Failure group cell02: the data disks on cell02

CREATE DISKGROUP data NORMAL REDUNDANCY
DISK 'o/*/data*'
ATTRIBUTE 'compatible.rdbms' = '11.2.0.0.0',
          'compatible.asm' = '11.2.0.0.0',
          'cell.smart_scan_capable' = 'TRUE',
          'au_size' = '4M';
Configuring ASM Disk Groups for Exadata
You can now create ASM disk groups from your ASM instance. An ASM disk group can include Exadata grid disks and conventional disks. However, to enable Smart Scan processing, all the disks in an ASM disk group must be Exadata grid disks, and the following disk group attribute settings must be used:
'compatible.rdbms' = '11.2.0.0.0'
'compatible.asm' = '11.2.0.0.0'
'cell.smart_scan_capable' = 'TRUE'
In addition, it is recommended that you set the AU_SIZE disk group attribute value to 4M to optimize disk scanning.
The example in the slide shows candidate ASM disks from two Exadata cells: cell01 and cell02. The CREATE DISKGROUP statement references all of the candidate ASM disks having names that start with data. By default, ASM failure groups corresponding to each cell are automatically defined. As a result, two failure groups are automatically created using corresponding grid disks from each cell. By default, the failure group names correspond to the cell names.
Once created, an Exadata-based disk group can be used to house Oracle data files in the same way as an ASM disk group based on any other storage. To complement the recommended AU_SIZE setting of 4 MB, you should set the initial extent size to 8 MB for large segments. This can be done using segment-level or tablespace-level settings. The recommended approaches are discussed in the lesson entitled Optimizing Database Performance with Exadata.
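The way candidate disks fall into per-cell failure groups can be pictured with a simple grouping. The Python sketch below groups grid disk names by their trailing cell name. It relies on the default naming convention used in this lesson and is only a picture of the outcome: ASM assigns failure groups from the cell that presents each disk, not by parsing names, and the function name is hypothetical.

```python
from collections import defaultdict

def group_by_cell(grid_disks):
    """Group grid disk names by the cell name that ends each name."""
    groups = defaultdict(list)
    for name in grid_disks:
        cell = name.rsplit("_", 1)[-1]  # text after the last underscore
        groups[cell].append(name)
    return dict(groups)

# The 24 candidate disks matched by 'o/*/data*' in the slide example.
disks = [f"data_cd_{i:02d}_{cell}" for cell in ("cell01", "cell02") for i in range(12)]
groups = group_by_cell(disks)
print(sorted(groups), [len(v) for v in groups.values()])
```

With NORMAL REDUNDANCY, mirrored copies are placed in different failure groups, so in this picture the two copies of any extent sit on different cells.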
Optional Configuration Tasks
After you complete the cell configuration, you can perform the following optional tasks on the storage cell:
• Configure Exadata storage security.
• Configure I/O Resource Management (IORM).
IORM is covered in detail in the lesson titled Exadata and I/O Resource Management.
Note: Repeat each configuration task on each relevant storage cell.
Exadata Storage Security Overview
[Figure: ASM cluster A (RAC database instances and a non-RAC database instance) uses ASM-scoped security mode, and ASM cluster B (RAC database instances and a non-RAC database instance) uses database-scoped security mode, to access grid disks on Exadata Cell 1 and Exadata Cell 2.]
Exadata Storage Security Overview
Exadata storage security is implemented by controlling which ASM clusters and database clients can access specific grid disks on storage cells.
• To set up security so that all database clients of an ASM cluster have access to specified grid disks, configure ASM-scoped security.
• To set up security so that specific database clients of an ASM cluster have access to specified grid disks, configure database-scoped security.
Both concepts are illustrated in the slide. ASM cluster A shares two grid disks per cell with all of its database clients. ASM cluster B shares one grid disk per cell to store the single instance database, and another two grid disks (one per cell) to store the RAC database.
Note: By default, neither of these security modes is implemented. This situation is called open security, where all database clients can access all grid disks. Open security does not require any configuration, and as long as the network and database hosts are well secured, you can use this mode for your production databases. Open security is also useful for non-production environments such as those that house test or development databases.
Exadata Storage Security Implementation
[Figure: For ASM-scoped security (ASM cluster hosts) and database-scoped security (each database), the diagram shows the same flow: CREATE KEY on a cell, a cellkey.ora file on the hosts (under /etc/oracle/cell/network-config for ASM, under $ORACLE_HOME/admin//pfile for a database), ASSIGN KEY FOR on each cell, and CREATE/ALTER GRIDDISK availableTo on each disk.]
Exadata Storage Security Implementation
The slide briefly describes the steps to configure ASM-scoped and database-scoped security. It is important to realize that you must set up ASM-scoped security first if you want to set up database-scoped security.
To implement ASM-scoped security, perform the following steps:
1. Shut down your ASM and database instances.
2. Generate a security key using the CREATE KEY CellCLI command. Run this command once only on any cell.
3. Construct a cellkey.ora file using the generated security key. Copy the cellkey.ora file into the /etc/oracle/cell/network-config/ directory on every host in your ASM cluster.
4. Use the ASSIGN KEY command to assign the security key to the Oracle ASM cluster on all the cells that you want the Oracle ASM cluster to access. The ASM cluster name is determined by the DB_UNIQUE_NAME initialization parameter setting.
5. Enter the Oracle ASM cluster name in the availableTo attribute with the CREATE GRIDDISK or ALTER GRIDDISK command to configure security on the grid disks on all the cells that you want the Oracle ASM cluster to access. At the conclusion of this step, each grid disk has an association with the ASM cluster that is allowed to use the disk.
6. Restart your ASM and database instances.
Exadata Storage Security Implementation (continued)
After you have configured and tested ASM-scoped security, you can proceed to set up database-scoped security. Perform the following steps for each database you want to configure with database-scoped security:
1. Shut down your ASM and database instances.
2. Generate a security key using the CREATE KEY CellCLI command. Run this command once only on any cell.
3. Construct a cellkey.ora file using the generated security key. Copy the cellkey.ora file into the $ORACLE_HOME/admin//pfile/ directory on every host running your database.
4. Use the ASSIGN KEY command to assign the security key to the database on all the cells that you want the database to access. The database name is determined by the DB_UNIQUE_NAME initialization parameter setting.
5. Enter the database name in the availableTo attribute with the CREATE GRIDDISK or ALTER GRIDDISK command to configure security on the grid disks on all the cells that you want the database to access. At the conclusion of this step, each grid disk has an association with the ASM cluster and the specific database that is allowed to use the disk.
6. Restart your ASM and database instances.
Note: For more information, including examples and further details, refer to the Oracle Exadata Storage Server Software User's Guide 11g Release 2 (11.2).
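The combined effect of open, ASM-scoped, and database-scoped security can be summarized with a simplified Python model. This is a conceptual sketch only: the function can_access and its arguments are hypothetical, security keys are ignored, and the model assumes that any name in availableTo other than the ASM cluster name is a database name.

```python
def can_access(available_to, asm_cluster, database):
    """Return True if a database client may use a grid disk (conceptual model).

    available_to: set of names held in the grid disk's availableTo attribute.
    asm_cluster:  unique name of the client's ASM cluster.
    database:     unique name of the client database.
    """
    if not available_to:
        # Open security: no restriction is configured on this grid disk.
        return True
    if asm_cluster not in available_to:
        # The disk is reserved for some other ASM cluster.
        return False
    # Assumption: the remaining names are database names.
    databases = available_to - {asm_cluster}
    if not databases:
        # ASM-scoped security: every database client of the cluster is allowed.
        return True
    # Database-scoped security: only the listed databases are allowed.
    return database in databases


print(can_access(set(), "asm_a", "db1"))
print(can_access({"asm_b", "rac_db"}, "asm_b", "single_db"))
```

The model also shows why ASM-scoped security comes first: a database name on its own grants nothing unless the ASM cluster name is present as well.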
Quiz
Grid disks are seen by ASM by using a discovery string that starts with:
1. c/
2. o/
3. g/
4. e/
Answer: 2
Quiz
The first grid disk you create uses the slowest tracks of the corresponding physical disk.
1. TRUE
2. FALSE
Answer: 2
Quiz
When you create a disk group for which you want Exadata smart storage capabilities enabled, what three attributes must you specify?
1. compatible.rdbms
2. compatible.asm
3. au_size
4. disk_repair_time
5. cell.smart_scan_capable
Answer: 1, 2, 5
Summary
In this lesson, you should have learned how to:
• Perform the initial Exadata boot sequence
• Configure Exadata software
• Create and configure ASM disk groups using Exadata
• Use the CellCLI Exadata administration tool
Additional Resources
• Lesson Demonstrations (Viewlets)
– Exadata Cell Configuration
http://st-curriculum.oracle.com/demos/db/11g/r2/dbmach/041ExadataCellConfig/041exadatacellconfig_viewlet_swf.html
– Exadata Storage Provisioning
http://st-curriculum.oracle.com/demos/db/11g/r2/dbmach/042ExadataStorageProvisioning/042exadatastorageprovisioning_viewlet_swf.html
– Exadata Cell First Boot
http://st-curriculum.oracle.com/demos/db/11g/r2/exadatav2/cellfirstboot/cellfirstboot.swf
– Exadata Cell User Accounts
http://st-curriculum.oracle.com/demos/db/11g/r2/dbmach/044ExadataUserAccounts/044exadatauseraccounts_viewlet_swf.html
– Consuming Exadata Grid Disks Using ASM
http://st-curriculum.oracle.com/demos/db/11g/r2/dbmach/043ExadataConsumingGridDisks/043exadataconsuminggriddisks_viewlet_swf.html
– Another Example of Exadata Cell Configuration
http://st-curriculum.oracle.com/demos/db/11g/r2/exadatav2/cellcli/cellcli.swf
Practice 4 Overview: Configuring Exadata In these practices, you will perform a variety of Exadata configuration tasks, including cell configuration and storage provisioning. You will also consume Exadata storage using ASM and exercise the privileges associated with the different cell user accounts.
Exadata Performance Monitoring and Maintenance
Objectives
After completing this lesson, you should be able to:
• Describe the various performance monitoring facilities available for Exadata
• Monitor Exadata from directly within a cell, from a database instance, and through Enterprise Manager
• Interpret SQL execution plans that use Smart Scan
• Outline probable maintenance scenarios
Monitoring Overview
[Slide: numbered monitoring facilities: 1 Metrics, 2 Alerts, 3 Active requests, 4 Execution plans, 5 V$ views, 6 Wait events, 7 Enterprise Manager plug-in]
Monitoring Overview
After Exadata is configured and in use, the administrative focus shifts to ongoing monitoring and maintenance. To monitor Exadata, you can use the following tools and information:
1. Exadata cell metrics
2. Exadata cell alerts
3. Exadata active requests
4. Database SQL statement execution plans
5. Database V$ views
6. Database wait events
7. Oracle Enterprise Manager Exadata monitoring plug-in
Exadata Metrics and Alerts Architecture
[Figure: CELLSRV collects metrics (cell, cell disk, grid disk, IORM, interconnect) and reports CELLSRV internal errors through ADR. MS keeps one hour of metric values in memory and flushes metric values to disk every hour. Metric and alert history is kept on disk for seven days by default (set with ALTER CELL). Alerts arise from exceeded metric thresholds, CELLSRV internal errors, cell software issues, and cell hardware issues, and can be sent by email and/or SNMP. The administrator uses LIST METRICCURRENT, LIST METRICHISTORY, and LIST ALERTHISTORY.]
Exadata Metrics and Alerts Architecture
You can monitor each cell with Exadata cell metrics. CELLSRV periodically records important run-time properties, called metrics, for cell components such as CPUs, cell disks, grid disks, flash cache, and IORM statistics. These metrics are recorded in memory. Based on its own metric collection schedule, MS gets the set of metric data accumulated by CELLSRV. MS keeps a subset of the metric values in memory, and writes a history to the disk repository every hour.
The retention period for metric and alert history entries is specified by the metricHistoryDays cell attribute. You can modify this setting with the CellCLI ALTER CELL command. By default, it is seven days. This process is conceptually similar to database AWR snapshots.
You can get the metric value history by using the CellCLI LIST METRICHISTORY command, and the current metric values by using the LIST METRICCURRENT command.
At the Exadata cell level, you can define thresholds for metrics. Using the Enterprise Manager plug-in for Exadata, you can set separate EM thresholds for all the Exadata metrics supported by the plug-in.
Exadata Metrics and Alerts Architecture (continued)
In addition to metrics, Exadata can trigger alerts. Alerts represent events of importance occurring within the cell, typically indicating that an Exadata cell function is compromised. MS triggers an alert when it discovers a:
• Cell hardware issue
• Cell software or configuration issue
• CELLSRV internal error
• Metric that has exceeded an alert threshold
You can view triggered alerts using the LIST ALERTHISTORY command. In addition, you can configure the cell to instruct MS to automatically send an email and/or SNMP messages to a designated set of storage administrators.
Monitoring Exadata with Metrics
[Figure: Metric attributes and their possible values.
Attributes: name, alertState, metricObjectName, unit, objectType, metricValue, metricType.
alertState: normal, warning, critical.
unit: number, % (percentage), F (Fahrenheit), C (Celsius).
metricType: cumulative, instantaneous, rate, transition.
objectType: IORM_CONSUMER_GROUP, IORM_DATABASE, IORM_CATEGORY, CELL, CELLDISK, CELL_FILESYSTEM, GRIDDISK, HOST_INTERCONNECT, FLASHCACHE.
Thresholds (CREATE|ALTER THRESHOLD): name, comparison, critical, occurrences, observation, warning.]
Monitoring Exadata with Metrics
Metrics are recorded observations of important run-time properties or internal instrumentation values of the storage cell and its components, such as cell disks or grid disks. Metrics are a series of measurements that are computed and retained in memory for an interval of time, and stored on a disk for a more permanent history.
The graphic in the slide describes some of the important metric attributes. Each metric:
• Has a name and description
• Is associated with a metricObjectName that is the name of the object being measured, such as a specific cell disk, grid disk, or consumer group
• Belongs to a group that is defined by its objectType attribute. The possible groups are shown in the slide.
• Has a metricType, which is an indicator of how the statistic was created or defined. Possible values and their meanings are:
  - cumulative: Cumulative statistics since the metric was created
  - instantaneous: Value at the time that the metric is collected
  - rate: Rates computed by averaging statistics over observation periods
  - transition: Collected at the time when the value of the metric has changed; typically captures important transitions in hardware status
• Has a measurement unit. Possible units are shown in the slide.
Monitoring Exadata with Metrics (continued)
Understanding the composition of the metric name provides a good insight into the meaning of the metric. The value of the name attribute is a composite of abbreviations. The attribute value starts with an abbreviation of the object type on which the metric is defined:
• CL_ (cell)
• CD_ (cell disk)
• GD_ (grid disk)
• FC_ (flash cache)
• DB_ (database)
• CG_ (consumer group)
• CT_ (category)
• N_ (interconnect network)
After the abbreviation of the object type, many metric names conclude with an abbreviation that relates to the description of the metric. For example, CL_FANS is the instantaneous number of working fans on the cell.
I/O-related metric name attributes continue with one of the following combinations to identify the operation:
• IO_RQ (number of requests)
• IO_BY (number of MB)
• IO_TM (I/O latency)
• IO_WT (I/O wait time)
Next in the name could be _R for read or _W for write. Following that, there might be _SM or _LG to identify small or large I/Os, respectively. At the end of the name, there could be _SEC to signify per second or _RQ to signify per request. For example:
• CD_IO_RQ_R_SM is the number of requests to read small blocks on a cell disk.
• GD_IO_BY_W_LG_SEC is the number of MB of large block I/O per second on a grid disk.
If a metric value crosses a user-defined threshold, an alert will be generated. Metrics can be associated with warning and critical thresholds. Thresholds relate to extreme values in the metric, which might indicate a problem or other event of interest to an administrator. Thresholds are supported on cell disk and grid disk I/O error count metrics (CD_IO_ERRS_MIN and GD_IO_ERRS_MIN), along with the cell memory utilization (CL_MEMUT) and cell filesystem utilization (CL_FSUT) metrics. In addition, you can set thresholds for I/O Resource Management (IORM) related metrics. The CellCLI LIST ALERTDEFINITION command lists the metrics for which thresholds can be set.
Users of Enterprise Manager Grid Control with the Exadata Plug-in can configure a separate set of thresholds and alerts in the Grid Control environment. These can be used in conjunction with metrics and alerts from across your systems to provide an enterprise-level view of system health and state.
Note: For a complete reference of metric and threshold attributes, refer to the Oracle Exadata Storage Server Software User's Guide. For more information about the Exadata Plug-in for Enterprise Manager Grid Control, refer to the Oracle Exadata Storage Server Documentation library.
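The naming convention described above lends itself to a small decoder. The following Python sketch is illustrative only: decode_metric_name is a hypothetical helper, its lookup tables contain just the abbreviations listed in this lesson, and any part of a name that it does not recognize is returned unchanged under the key remainder.

```python
OBJECT_TYPES = {
    "CL": "cell", "CD": "cell disk", "GD": "grid disk", "FC": "flash cache",
    "DB": "database", "CG": "consumer group", "CT": "category",
    "N": "interconnect network",
}
OPERATIONS = {"RQ": "number of requests", "BY": "number of MB",
              "TM": "I/O latency", "WT": "I/O wait time"}
DIRECTIONS = {"R": "read", "W": "write"}
SIZES = {"SM": "small", "LG": "large"}
SUFFIXES = {"SEC": "per second", "RQ": "per request"}


def decode_metric_name(name):
    """Split a metric name into the parts described in the lesson text."""
    parts = name.split("_")
    decoded = {"object": OBJECT_TYPES.get(parts[0], "unknown")}
    rest = parts[1:]
    # I/O metrics continue with IO_<operation>, then optional qualifiers.
    if len(rest) >= 2 and rest[0] == "IO" and rest[1] in OPERATIONS:
        decoded["operation"] = OPERATIONS[rest[1]]
        rest = rest[2:]
        if rest and rest[0] in DIRECTIONS:
            decoded["direction"] = DIRECTIONS[rest.pop(0)]
        if rest and rest[0] in SIZES:
            decoded["size"] = SIZES[rest.pop(0)]
        if rest and rest[0] in SUFFIXES:
            decoded["suffix"] = SUFFIXES[rest.pop(0)]
    if rest:
        # Anything not covered by the tables is kept verbatim.
        decoded["remainder"] = "_".join(rest)
    return decoded


for metric in ("CD_IO_RQ_R_SM", "GD_IO_BY_W_LG_SEC", "CL_FANS"):
    print(metric, decode_metric_name(metric))
```

Non-I/O names such as CL_FANS fall through to remainder, because their final abbreviation describes the metric rather than following the I/O pattern.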
Monitoring Exadata with Metrics: Example
CellCLI> LIST METRICDEFINITION WHERE objectType = 'CELL' DETAIL
         name:        CL_CPUT
         description: "Cell CPU Utilization is the percentage of time over the previous minute that the system CPUs were not idle (from /proc/stat)."
         metricType:  Instantaneous
         objectType:  CELL
         unit:        %
         ...
CellCLI> LIST METRICHISTORY WHERE name like 'CL_.*' -
         AND collectionTime > '2009-10-11T15:28:36-07:00'
         CL_RUNQ  cell03_2  6.0     2009-10-11T15:28:37-07:00
         CL_CPUT  cell03_2  47.6 %  2009-10-11T15:29:36-07:00
         CL_FANS  cell03_2  1       2009-10-11T15:29:36-07:00
         CL_TEMP  cell03_2  0.0 C   2009-10-11T15:29:36-07:00
         CL_RUNQ  cell03_2  5.2     2009-10-11T15:29:37-07:00
         ...
CellCLI> LIST METRICCURRENT WHERE objectType = 'CELLDISK'
         CD_IO_TM_W_SM_RQ  CD_1_cell03  205.5 us/request
         CD_IO_TM_W_SM_RQ  CD_2_cell03  93.3 us/request
         CD_IO_TM_W_SM_RQ  CD_3_cell03  0.0 us/request
         ...
Monitoring Exadata with Metrics: Example
The slide shows you some basic commands that you could use to display metric information:
• Use the LIST METRICDEFINITION command to display the metric definitions for the cell. A metric definition describes the configuration of a metric. The example does not specify any particular metric, so all metrics corresponding to the WHERE clause are printed. In addition to the WHERE clause, you can also specify the metric definition attributes you want to print. If the ATTRIBUTES clause is not used, a default set of attributes is displayed. To list all the attributes, you can add the DETAIL keyword at the end of the command.
• Use the LIST METRICHISTORY command to display the metric history for the cell. A metric history describes a collection of past metric observations. Similar to the LIST METRICDEFINITION command, you can specify attribute filters, an attribute list, and the DETAIL keyword for the LIST METRICHISTORY command. The above example lists metrics having names that start with CL_ that were collected after the specified time.
• Use the LIST METRICCURRENT command to display the current metric values for the cell. The above example lists all cell disk metrics. The metric values shown in the slide correspond to the average latency per request of writing small blocks to a cell disk. For this metric, there is a metric observation for every cell disk.
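If you capture LIST METRICHISTORY output for later analysis, each line can be split into its fields. The Python sketch below assumes the whitespace-separated layout of the sample lines in the slide (name, object, value, optional unit, collection time); parse_metric_history_line is a hypothetical helper, and real output may differ in column layout.

```python
from datetime import datetime


def parse_metric_history_line(line):
    """Parse one history line laid out like the sample in the slide."""
    fields = line.split()
    name, obj = fields[0], fields[1]
    # Remove thousands separators, if any, before converting the value.
    value = float(fields[2].replace(",", ""))
    # Everything between the value and the timestamp is treated as the unit.
    unit = " ".join(fields[3:-1]) or None
    collected = datetime.fromisoformat(fields[-1])
    return {"name": name, "object": obj, "value": value,
            "unit": unit, "collectionTime": collected}


sample = [
    "CL_RUNQ cell03_2 6.0 2009-10-11T15:28:37-07:00",
    "CL_CPUT cell03_2 47.6 % 2009-10-11T15:29:36-07:00",
    "CL_RUNQ cell03_2 5.2 2009-10-11T15:29:37-07:00",
]
records = [parse_metric_history_line(line) for line in sample]
runq = [r["value"] for r in records if r["name"] == "CL_RUNQ"]
average_runq = sum(runq) / len(runq)
print(round(average_runq, 2))
```

Keeping the timestamp as a timezone-aware datetime makes it straightforward to compare observations collected on cells that report different UTC offsets.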
Monitoring Exadata with Alerts
[Figure: Alert attributes and their possible values.
Attributes: name, alertSource, severity, alertType, metricObjectName, metricName, examinedBy, alertAction, alertMessage, failedMail, failedSNMP, beginTime, endTime, notificationState, …
alertSource: BMC, ADR, Metric.
severity: warning, critical, clear, info.
alertType: stateful, stateless.
notificationState: 0, 1, 2, 3.
examinedBy is set with ALTER ALERTHISTORY, for example: ALTER ALERTHISTORY ALL examinedBy=""]
Monitoring Exadata with Alerts
Alerts represent events of importance occurring within the storage cell, typically indicating that storage cell functionality is either compromised or in danger of failure. An administrator should investigate alerts, because they might require urgent corrective or preventative action. Use the ALTER CELL command to configure email or SNMP notification for alerts.
Alerts are either stateful or stateless. Stateful alerts represent observable cell states that can be subsequently retested to detect whether the state has changed, indicating that a previously observed alert condition is no longer a problem. Stateless alerts represent point-in-time events that do not represent a persistent condition; they simply show that something has occurred.
Alerts can have one of the following severities: warning, critical, clear, or info. Examples of possible events that trigger alerts are physical disk failure, disk read/write errors, cell temperature exceeding the recommended value, cell software failure, and excessive I/O latency.
Metrics can be used to signal stateful alerts using warning or critical threshold values. When the metric value crosses the threshold value, an alert is signaled.
An alert with a clear severity indicates that a previous critical or warning condition has returned to normal. For threshold-based alerts, a clear alert is generated when the measured value crosses back over the threshold value.
Monitoring Exadata with Alerts (continued)
Alerts with an info severity are stateless and log conditions that might be informative to an administrator but for which no administrator action is required. Informational alerts are not distributed by email or SNMP notifications.
The slide illustrates some of the important alert attributes. Each alert has the following attributes:
• name provides an identifier for the alert.
• alertSource provides the source of the alert. Some possible sources are listed in the slide.
• severity determines the importance of the alert. Possible values are warning, critical, clear, and info.
• alertType provides the type of the alert: stateful or stateless. Stateful alerts are automatically cleared on transition to normal. Stateless alerts are never cleared unless you change the alert by setting the examinedBy attribute. This attribute identifies the administrator who reviewed the alert and is the only alert attribute that can be modified by the administrator using the ALTER ALERTHISTORY command.
• metricObjectName is the object for which a metric threshold has caused an alert.
• metricName provides the metric name if the alert is based on a metric.
• alertAction is the recommended action to perform for this alert.
• alertMessage provides a brief explanation of the alert.
• failedMail is the intended email recipient when a notification failed.
• failedSNMP is the intended SNMP subscriber when a notification failed.
• beginTime provides the timestamp when an alert changes its state.
• endTime provides the timestamp for the end of the period when an alert changes its state.
• notificationState indicates progress in notifying subscribers to alert messages:
  - 0: never tried
  - 1: sent successfully
  - 2: retrying (up to 5 times)
  - 3: five failed retries
Note: Some I/O errors may result in an ASM disk going offline without generating an alert in Exadata. You should continue to perform I/O monitoring from your databases and ASM environments to identify and remedy these kinds of problems.
Displaying Alert Examples
CellCLI> LIST ALERTDEFINITION ATTRIBUTES name, metricName, description
         ADRAlert                                        "CELL Incident Error"
         HardwareAlert                                   "Hardware Alert"
         StatefulAlert_CG_IO_RQ_LG      CG_IO_RQ_LG      "Threshold Based Stateful Alert"
         StatefulAlert_CG_IO_RQ_LG_SEC  CG_IO_RQ_LG_SEC  "Threshold Based …Alert"
         StatefulAlert_CG_IO_RQ_SM      CG_IO_RQ_SM      "Threshold Based Stateful Alert"
         ...
CellCLI> LIST ALERTHISTORY WHERE severity = 'critical' AND examinedBy = '' DETAIL
CellCLI> ALTER ALERTHISTORY 1671443814 examinedBy="JFV"
CellCLI> CREATE THRESHOLD ct_io_wt_lg_rq.interactive warning=1000, critical=2000, comparison='>', occurrences=2, observation=5
Displaying Alert Examples
The slide shows you some examples of commands that display alert information. The commands for displaying alerts are very similar to the ones used for displaying metric information:
• Use the LIST ALERTDEFINITION command to display the definition for every alert that can be produced on the cell. The example in the slide displays the alert name, metric name, and description. The metric name identifies the metric on which the alert is based. ADRAlert and HardwareAlert are not based on any metric and, therefore, do not have metric names.
• Use the LIST ALERTHISTORY command to display the alert history that has occurred on a cell. The example in the slide lists in detail all critical alerts that have not been reviewed by an administrator.
• Use the ALTER ALERTHISTORY command to update the alert history for the cell. The above example shows how to set the examinedBy attribute to the user ID of the administrator that examined the alert. The examinedBy attribute is the only ALERTHISTORY attribute that can be modified. The example uses the alert sequence ID to identify the alert. alertSequenceID provides a unique sequence ID number for the alert. When an alert changes its state, another occurrence of the alert is created with the same sequence number but with a different timestamp.
Displaying Alert Examples (continued)
• The CREATE THRESHOLD command creates a threshold that specifies the conditions for generation of a metric alert. The example creates a threshold for the CT_IO_WT_LG_RQ metric associated with the INTERACTIVE category. This metric specifies the average number of milliseconds that large I/O requests issued by the category have waited to be scheduled by IORM in the past minute. A large value indicates that the I/O workload from this category is exceeding the allocation specified for it in the category plan. The alert is triggered by two consecutive measurements (occurrences=2) over the threshold values: one second for a warning alert (warning=1000) and two seconds for a critical alert (critical=2000). The observation attribute is the number of measurements over which measured values are averaged.
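To make the interaction of the warning, critical, occurrences, and observation attributes concrete, the following Python sketch evaluates a series of measurements against a threshold. It is one possible reading of the description above, not the algorithm used by MS: evaluate_threshold is a hypothetical function, it supports only the '>' comparison, and it assumes that values are averaged over the most recent observation measurements (or fewer, until enough have been collected).

```python
from collections import deque


def evaluate_threshold(samples, warning, critical, occurrences, observation):
    """Return the alert state after each sample (assumed semantics)."""
    window = deque(maxlen=observation)  # most recent measurements
    warn_run = crit_run = 0             # consecutive violations seen so far
    states = []
    for value in samples:
        window.append(value)
        average = sum(window) / len(window)
        # Count consecutive averaged measurements above each limit.
        crit_run = crit_run + 1 if average > critical else 0
        warn_run = warn_run + 1 if average > warning else 0
        if crit_run >= occurrences:
            states.append("critical")
        elif warn_run >= occurrences:
            states.append("warning")
        else:
            states.append("normal")
    return states


# Wait times in milliseconds, evaluated without averaging (observation=1).
waits = [500, 1500, 1500, 2500, 2500, 100]
print(evaluate_threshold(waits, warning=1000, critical=2000,
                         occurrences=2, observation=1))
```

Requiring consecutive violations keeps a single spike from raising an alert, while averaging over several measurements smooths noisy metrics before they are compared with the limits.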
Monitoring Exadata with Active Requests
[Figure: Active request attributes: id, name, parentID, ioGridDisk, ioBytes, ioOffset, ioReason, ioType, objectNumber, asmDiskGroupNumber, asmFileIncarnation, asmFileNumber, requestState, sessionID, sessionSerNumber, consumerGroupName, tablespaceNumber, dbName, instanceNumber, sqlID, fileType.
ioType values: file initialization, read, write, predicate pushing, filtered backup read, predicate push read.
Example: LIST ACTIVEREQUEST WHERE IoType = 'predicate pushing' DETAIL]
Monitoring Exadata with Active Requests
An active request provides a client-centric or application-centric view of client I/O requests that are currently being processed by a cell. The slide shows the most important attributes of an active request. You can see that an active request is characterized at all levels: instance, database, ASM, and cell. Most of the attributes have self-explanatory names. Here is a brief explanation of some of the attributes:
• ioReason is the reason for the I/O activity, such as a control-file read.
• ioType identifies the type of active request. Possible values are listed in the slide.
• requestState identifies the state of the active request. Possible values include:
  - Accessing Disk
  - Computing Result
  - Network Receive
  - Network Send
  - Queued Extent
  - Queued for Disk
  - Queued for File Initialization
  - Queued for Filtered Backup Read
  - Queued for Network Send
  - Queued for Predicate Pushing
  - Queued for Read
  - Queued for Write
  - Queued in Resource Manager
Use the LIST ACTIVEREQUEST command to display active request details for the cell. The syntax is very similar to other LIST commands. You can specify which attributes to display or you can display them all using the DETAIL clause. You can also filter the output using a WHERE clause.
Monitoring SQL Execution Plans
Relevant Initialization Parameters:
• CELL_OFFLOAD_PROCESSING
– TRUE | FALSE
– Enables or disables Smart Scan and other smart storage capabilities
– Dynamically modifiable at the session or system level using ALTER SESSION or ALTER SYSTEM
– Specifiable at the statement level using the OPT_PARAM hint
• CELL_OFFLOAD_PLAN_DISPLAY
– NEVER | AUTO | ALWAYS
– Allows execution plan to show offloaded predicates
– Dynamically modifiable at the session or system level using ALTER SESSION or ALTER SYSTEM
Monitoring SQL Execution Plans
The CELL_OFFLOAD_PROCESSING initialization parameter enables SQL processing offload to Exadata. The default value of the parameter is TRUE, which means that predicate evaluation can be offloaded to Exadata. If set to FALSE, the database performs all the predicate evaluation with cells serving blocks like traditional storage. To enable offloading for a particular SQL statement, use the OPT_PARAM hint as shown in the following example:
SELECT /*+ OPT_PARAM('cell_offload_processing' 'true') */ ...
The CELL_OFFLOAD_PLAN_DISPLAY initialization parameter determines whether the SQL EXPLAIN PLAN statement displays the predicates that can be evaluated by Exadata as STORAGE predicates for a given SQL statement. The possible values are:
• AUTO instructs the SQL EXPLAIN PLAN statement to display the predicates that can be evaluated as STORAGE only if a cell is present and if a table is on the cell.
• ALWAYS produces changes to the SQL EXPLAIN PLAN statement whether or not Exadata is present or the table is on the cell. You can use this setting to identify statements that are candidates for offloading before migrating to Exadata.
• NEVER produces no changes to the SQL EXPLAIN PLAN statement due to Exadata. This may be desirable, for example, if you wrote tools that process execution plan output and these tools have not been updated to deal with new syntax, or when comparing plans for two systems: one with Exadata and one without.
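As the NEVER setting suggests, tools that process execution plan output need to recognize the STORAGE syntax. The Python sketch below shows one way such a tool might pick out STORAGE operations and storage() predicates from plan text. The function find_offload_candidates and the sample plan (table ORDERS, column O_ID) are hypothetical and exist only for illustration; the regular expressions assume the usual pipe-delimited plan layout.

```python
import re


def find_offload_candidates(plan_text):
    """Return (storage_operations, storage_predicates) found in plan text."""
    operations = []
    predicates = []
    for line in plan_text.splitlines():
        # Plan rows look like: |*  1 |  TABLE ACCESS STORAGE FULL| ORDERS |
        row = re.match(r"\|\*?\s*(\d+)\s*\|\s*(.*?)\s*\|", line)
        if row and " STORAGE " in f" {row.group(2)} ":
            operations.append((int(row.group(1)), row.group(2)))
        # Predicate rows look like:    1 - storage("O_ID"=10)
        pred = re.match(r"\s*(\d+)\s*-\s*storage\((.*)\)\s*$", line)
        if pred:
            predicates.append((int(pred.group(1)), pred.group(2)))
    return operations, predicates


# Hypothetical plan text, not taken from a real system.
sample_plan = """\
| Id  | Operation                 | Name   |
|   0 | SELECT STATEMENT          |        |
|*  1 |  TABLE ACCESS STORAGE FULL| ORDERS |
Predicate Information (identified by operation id):
   1 - storage("O_ID"=10)
       filter("O_ID"=10)
"""
ops, preds = find_offload_candidates(sample_plan)
print(ops)
print(preds)
```

A storage() predicate in the plan shows only that the predicate can be evaluated by Exadata; it does not by itself confirm that offloading took place when the statement ran.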
Smart Scan Execution Plan Example
SQL> alter session set CELL_OFFLOAD_PROCESSING = TRUE;
Session altered.
SQL> alter session set CELL_OFFLOAD_PLAN_DISPLAY = ALWAYS;
Session altered.
SQL> explain plan for select * from customers where c_customer_sk < 10;
Explained.
SQL> select * from table(dbms_xplan.display);
------------------------------------------------------------------------------
| Id | Operation                    | Name     | Rows | Bytes | Cost (%CPU)|
------------------------------------------------------------------------------
|  0 | SELECT STATEMENT             |          |    1 |   196 |   326  (1)|
|  1 |  PX COORDINATOR              |          |      |       |           |
|  2 |   PX SEND QC (RANDOM)        | :TQ10000 |    1 |   196 |   326  (1)|
|  3 |    PX BLOCK ITERATOR         |          |    1 |   196 |   326  (1)|
|* 4 |     TABLE ACCESS STORAGE FULL| CUSTOMER |    1 |   196 |   326  (1)|
------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - storage("C_CUSTOMER_SK"
SELECT * FROM GV$ASM_OPERATION
Replacing a Damaged Physical Disk
Replacing a physical disk due to a problem or failure is probably the most likely hardware maintenance operation that Exadata might ever require. Assuming you are using ASM redundancy, the procedure to replace a problem disk is quite simple.
The first step requires that you identify the problem disk. This could occur in a number of ways:
• Hardware monitoring using ILOM may report a problem disk.
• If a disk fails, an Exadata alert is generated. The alert includes specific instructions for replacing the disk. If you have configured the system for alert notifications, the alert will be sent to the designated email address or SNMP target. You can also use the LIST ALERTHISTORY command shown in the slide to identify the failed disk.
• The LIST PHYSICALDISK command may identify a disk reporting a status of warning or critical. Even if the cell is still functioning, the problem may be a precursor to a disk failure.
• The CALIBRATE command may identify a disk delivering abnormally low throughput or IOPS. Even if the cell is still functioning, a single bad physical disk can degrade the performance of other good disks, so you may decide to replace the identified disk. Note that running CALIBRATE while the cell is active will impact performance.
After you have identified the problem disk, you can replace it. When you remove the disk, you will get an alert. When you replace a physical disk, the disk must be acknowledged by the RAID controller before it can be used. This does not take a long time, and you can use the LIST PHYSICALDISK command to monitor the status until it returns to normal.
Replacing a Damaged Physical Disk (continued)
The grid disks and cell disks that existed on the previous disk in the slot will be automatically re-created on the new disk. If these grid disks were part of an Oracle ASM disk group with NORMAL or HIGH redundancy, they will be added back to the disk group and the data will be rebalanced based on disk group redundancy and the asm_power_limit parameter.
Re-creating the ASM disk and rebalancing the data may take some time to complete. You can monitor the progress of these operations within ASM. You can monitor the status of the disk as reported by V$ASM_DISK.STATE until it returns to NORMAL. You can also monitor the rebalance progress using GV$ASM_OPERATION.
Review the following considerations when replacing a failed disk:
• If the repair timer (specified in the DISK_REPAIR_TIME disk group attribute) has not expired, the ASM disk could be offline (not dropped) and the disk group is yet to be rebalanced. In this case, the prompt replacement of the failed disk can avoid a needless rebalance operation.
• The disk could be dropped by Oracle Automatic Storage Management (Oracle ASM), and the rebalance operation may have been successfully run. Check the Oracle ASM alert logs to confirm this. After the failed disk is replaced, a second rebalance will be required.
• The disk could be dropped, and the rebalance operation is currently running. Check the GV$ASM_OPERATION view to determine if the rebalance operation is still running. In this case, the rebalance operation following the disk replacement will be queued.
• The disk could be dropped by ASM, and the rebalance operation failed. Check GV$ASM_OPERATION.ERROR to determine why the rebalance operation failed. Monitor the rebalance operation following the disk replacement to ensure it runs.
• Rebalance operations from multiple disk groups can be done on different Oracle ASM instances in the same cluster if the physical disk being replaced contains grid disks from multiple disk groups. Multiple rebalance operations cannot be run in parallel on just one Oracle ASM instance. The operations will be queued for the instance.
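The monitoring described above can be sketched with two queries run from an ASM instance. This is a minimal sketch rather than a prescribed script: the views are the ones named in the text, while the choice of columns is an assumption about what is useful to watch during a disk replacement.

```sql
-- Disk status: the replaced disk should eventually report STATE = 'NORMAL'.
-- REPAIR_TIMER shows the seconds remaining before an offline disk is dropped.
SELECT group_number, name, failgroup, mode_status, state, repair_timer
  FROM v$asm_disk
 ORDER BY group_number, name;

-- Rebalance progress on every ASM instance in the cluster.
-- If no rows are returned, no rebalance is currently running.
SELECT inst_id, group_number, operation, state, power,
       sofar, est_work, est_minutes
  FROM gv$asm_operation;
```

Repeating the second query shows SOFAR approaching EST_WORK as the rebalance proceeds.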
Replacing a Damaged Flash Card
1. Determine the damaged flash card.
   CellCLI> LIST PHYSICALDISK DETAIL
            name: [9:0:2:0]
            diskType: FlashDisk
            ...
            slotNumber: "PCI Slot: 1; FDOM: 2"
            status: critical
2. Power down the cell.
3. Replace the flash card.
4. Power up the cell.
5. If the card contained a flash-based grid disk, monitor ASM to confirm the readdition of the disk.
   SQL> SELECT NAME, STATE FROM V$ASM_DISK
   SQL> SELECT * FROM GV$ASM_OPERATION
Replacing a Damaged Flash Card
Each Exadata server is equipped with 4 PCI flash memory cards. Each card has 4 flash modules (FDOMs) for a total of 16 flash modules on each cell.
Identifying a damaged flash module is similar to identifying a damaged physical disk. Hardware monitoring using ILOM or a drop in performance indicated by the CALIBRATE command may indicate a problem. If a failed FDOM is detected, an alert is generated. The alert message indicates whether any flash-based grid disks were on the flash module.
As shown in the slide, a damaged flash module can also be reported using the LIST PHYSICALDISK DETAIL command. The slotNumber attribute shows the PCI slot and the FDOM number. In this example, the status attribute indicates a critical fault.
If there were no grid disks on the flash module, the flash module was probably being used for Exadata Smart Flash Cache. In this mode, the bad flash module results in a decreased amount of flash memory on the cell. The performance of the cell is affected in proportion to the size of flash memory lost, but the database and applications are not at risk of failure.
Although technically the PCI slots in an Exadata server are hot-replaceable, it is recommended to power down the cell while servicing a damaged flash card. After replacing the card and powering up the cell, no additional steps are required to re-create any flash-based grid disks. Optionally, you can monitor ASM to confirm the readdition of a flash-based grid disk.
Moving All Disks from One Cell to Another
[Diagram: disks moved from the Original cell to the New cell]
1. Make the grid disks inactive:
   CellCLI> ALTER GRIDDISK ALL INACTIVE
2. Back up the operating system configuration files that will change when the new cell is booted.
3. Move the disks from the original cell to the new cell.
   • Ensure the system disks occupy the first two slots.
4. Boot the new cell.
5. Restart Exadata cell services:
   CellCLI> ALTER CELL RESTART SERVICES ALL
6. Import the cell disks:
   CellCLI> IMPORT CELLDISK ALL
Moving All Disks from One Cell to Another
You may need to move all drives from one Exadata server to another Exadata server. This may be necessary when there is a chassis-level component failure, or when troubleshooting a hardware problem. To move the drives, perform the following steps:
1. If possible, use the ALTER GRIDDISK ALL INACTIVE command to make the grid disks inactive.
2. If possible, back up /etc/hosts, /etc/modprobe.conf, and the files in /etc/sysconfig/network. This is a precautionary step that retains the settings associated with your original Exadata server in case you plan to move the disks back to it in the future.
3. Move the disks from the original Exadata cell to the new Exadata cell. Caution: Ensure the first two disks, which are the system disks, are in the same first two slots. Failure to do so will cause the Exadata cell to not function properly.
4. Start the cell. The cell operating system will be automatically reconfigured to suit the new server hardware.
5. Restart the cell services using ALTER CELL RESTART SERVICES ALL.
6. Import the cell disks using IMPORT CELLDISK ALL.
If you are using ASM redundancy and the procedure is completed within the amount of time specified in the DISK_REPAIR_TIME disk group attribute, then the ASM disks will be automatically brought back online and updated with any changes made during the cell outage.
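Before starting the move, it can help to check how long ASM will wait before dropping the offline disks. The queries below are a hedged sketch run from an ASM instance. They assume that DISK_REPAIR_TIME is stored as a disk group attribute, as it is in ASM 11g, and that the disk group compatibility settings allow V$ASM_ATTRIBUTE to list it.

```sql
-- Repair window configured for each mounted disk group (for example '3.6h').
SELECT g.name  AS disk_group,
       a.value AS disk_repair_time
  FROM v$asm_diskgroup g
  JOIN v$asm_attribute a
    ON a.group_number = g.group_number
 WHERE a.name = 'disk_repair_time';

-- Disks that are currently offline, with the seconds left before ASM drops them.
SELECT group_number, name, failgroup, mode_status, repair_timer
  FROM v$asm_disk
 WHERE mode_status = 'OFFLINE';
```

If the second query shows a REPAIR_TIMER value that is shorter than the expected duration of the move, the disks are likely to be dropped and a full rebalance will follow once they are re-added.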
Using the Exadata Software Rescue Procedure
• Every Exadata server is equipped with a CELLBOOT USB flash drive to facilitate cell rescue.
  – Cell rescue is required in the unlikely event that both system disks fail simultaneously.
  – Use with extreme caution.
• To perform cell rescue:
  1. Connect to Exadata using the console.
  2. Boot the cell, and as soon as you see the "Oracle Exadata" splash screen, press any key on the keyboard.
  3. In the displayed list of boot options, select the last option, CELL_USB_BOOT_CELLBOOT_usb_in_rescue_mode, and press Enter.
  4. Select the rescue option, and proceed with the rescue.
  5. Reconfigure the cell.
Using the Exadata Software Rescue Procedure
Exadata maintains mirrored system areas on separate physical disks. If one system area becomes corrupt or unavailable, Exadata can use the mirrored copy to recover. In the rare event that both system disks fail simultaneously, you must use the rescue functionality provided on the CELLBOOT USB flash drive that is built into every Exadata server.
It is important to note the following when using the rescue procedure:
• Use extreme caution when using this procedure, and pay attention to the prompts. The rescue procedure can potentially rewrite some or all of the disks in the cell. If this happens, then you can irrevocably lose the contents of those disks. Ideally, you should use the rescue procedure only with assistance from Oracle Support Services.
• The rescue procedure does not destroy the contents of the data disks or the contents of the data partitions on the system disks unless you explicitly choose to do so during the rescue procedure.
• The rescue procedure restores the Exadata software to the same release. This includes any patches that existed on the cell as of the last successful boot.
Using the Exadata Software Rescue Procedure (Continued)
• The following are not restored using the rescue procedure:
  - The crash kernel support rpms kernel-debuginfo-common and kernel-debuginfo. You will need to reinstall them. These cannot be restored due to space limitations on the CELLBOOT USB flash drive.
  - Some cell configuration details, such as alert configurations, SMTP information, and administrator e-mail address. Note that the cell network configuration is restored, along with SSH identities for the cell, and the root, celladmin and cellmonitor users.
  - ILOM configurations. Typically, ILOM configurations remain undamaged even in case of Exadata software failures.
• The rescue procedure does not examine or reconstruct data disks or data partitions on the system disks. If you have data corruption on the grid disks, then do not use the rescue procedure. Instead, use the database backup and recovery procedures.
The following rescue options are available for the rescue procedure:
• Partial reconstruction recovery: During partial reconstruction recovery, the rescue process re-creates partitions on the system disks and checks the disks for the existence of a file system. If a file system is discovered, then the process attempts to boot. If the cell boots successfully, then you use CellCLI commands, such as LIST CELL DETAIL, to verify the cell is usable. You must also recover any data disks, as appropriate. If the boot fails, then you must use the full original build recovery option.
• Full original build recovery: This option rewrites the system area of the system disks to restore the Exadata software. It also allows you to erase any data on the data disks, and data partitions on the system disks.
• Re-creation of the CELLBOOT USB flash drive: This option is used to make a copy of the CELLBOOT USB flash drive.
To perform a rescue using the CELLBOOT USB flash drive:
1. Connect to Exadata using the console.
2. Boot the cell, and as soon as you see the "Oracle Exadata" splash screen, press any key on the keyboard. The splash screen remains visible for only 5 seconds.
3. In the displayed list of boot options, scroll down to the last option, CELL_USB_BOOT_CELLBOOT_usb_in_rescue_mode, and press Enter.
4. Select the rescue option, and proceed with the rescue.
5. After a successful rescue, you must reconfigure the cell to return it to the pre-failure configuration, and reinstall the kernel-debuginfo and kernel-debuginfo-common rpms to use crash kernel support. If you chose to preserve the data when prompted by the rescue procedure, then import the cell disks. If you chose not to preserve the data, then you should create new cell disks and grid disks.
Quiz
You can define thresholds for all Exadata metrics.
1. TRUE
2. FALSE
Answer: 2
Thresholds are supported on cell disk and grid disk I/O error count metrics (CD_IO_ERRS_MIN and GD_IO_ERRS_MIN), along with the cell memory utilization (CL_MEMUT) and cell file system utilization (CL_FSUT) metrics. In addition, you can set thresholds for I/O Resource Management (IORM) related metrics. The CellCLI LIST ALERTDEFINITION command lists the metrics for which thresholds can be set.
Quiz
You enable SQL processing offload using the CELL_OFFLOAD_PLAN_DISPLAY initialization parameter.
1. TRUE
2. FALSE
Answer: 2 The CELL_OFFLOAD_PROCESSING parameter is used to enable SQL processing offload.
Summary
In this lesson, you should have learned how to:
• Describe the various performance monitoring facilities available for Exadata
• Monitor Exadata from directly within a cell, from a database instance, and through Enterprise Manager
• Interpret SQL execution plans that use offloading
• Outline probable maintenance scenarios
Additional Resources
• Lesson Demonstrations (Viewlets)
  – Monitoring Exadata Using Metrics, Alerts and Active Requests
    - http://stcurriculum.oracle.com/demos/db/11g/r2/dbmach/051ExadataMetricsAlerts/051exadatametricsalerts_viewlet_swf.html
  – Monitoring Exadata From Within Oracle Database
    - http://stcurriculum.oracle.com/demos/db/11g/r2/dbmach/052ExadataDBMonitoring/052exadatadbmonitoring_viewlet_swf.html
  – Exadata High Availability
    - http://stcurriculum.oracle.com/demos/db/11g/r2/dbmach/053ExadataHighAvailability/053exadatahighavailability_viewlet_swf.html
Practice 5 Overview: Monitoring Exadata In these practices, you will monitor Exadata using metrics, alerts and active requests. You will also monitor Exadata statistics using dynamic performance views (V$ views) in your database. Finally, you will exercise Exadata high availability by examining the effect of a cell crash.
Exadata and I/O Resource Management
Objectives After completing this lesson, you should be able to use Exadata I/O Resource Management to manage workloads within a database and across multiple databases.
I/O Resource Management Overview
• Traditional benefits of shared storage:
  – Lower administration costs
  – More efficient use of storage
• Common challenge for shared storage:
  – Workloads interfere with each other. For example:
    - Large queries impact each other
    - Data loads impact warehouse queries
    - Batch workloads interfere with OLTP performance
• Exadata I/O Resource Management allows you to govern I/O resource usage among different:
  – User types
  – Workload types
  – Applications
  – Databases
I/O Resource Management Overview
Storage is often shared by different workloads on multiple databases. Shared storage provides some important benefits:
• When a storage system is dedicated to a single database, the administrator must size the storage system based on the database’s peak anticipated load and size. The correct balance of storage resources is seldom achieved because real-world workloads are very dynamic. This leads to unused I/O bandwidth and space on some systems, whereas others suffer with insufficient bandwidth and space. Sharing facilitates more efficient usage of storage space and I/O bandwidth.
• Sharing lowers administration costs by reducing the number of storage systems.
Shared storage, however, is not a perfect solution. Running multiple types of workloads and databases on shared storage often leads to performance problems. For example, large parallel queries on one production data warehouse can impact the performance of critical queries on another production data warehouse. Also, a data load on a data warehouse can impact the performance of critical queries also running on it.
You can mitigate these problems by overprovisioning the storage system, but this diminishes the cost savings of shared storage. You can also avoid running noncritical tasks at peak times, but manually achieving this is laborious. When databases have different administrators who do not coordinate their activities, the task is even more difficult.
I/O Resource Management Overview (continued)
I/O Resource Management (IORM) allows workloads and databases to share Exadata I/O resources automatically according to user-defined policies. To manage workloads within a database, you can define intradatabase resource plans using the Database Resource Manager (DBRM), which has been enhanced to work in conjunction with Exadata. To manage workloads across multiple databases, you can define IORM plans.
For example, if a production database and a test database are sharing an Exadata cell, you can configure resource plans that give priority to the production database. In this case, whenever the test database load would affect the production database performance, IORM will schedule the I/O requests such that the production database I/O performance is not impacted. This means that the test database I/O requests are queued until they can be issued without disturbing the production database I/O performance.
I/O Resource Management Concepts
[Diagram: Database A and Database B contain the consumer groups Finance, OnlineQuery, HR, BatchQuery, Reporting, and ETL. Two categories, Interactive and Batch, each collect consumer groups from both databases.]
I/O Resource Management Concepts
A database often has many types of workloads. These workloads may differ in their performance requirements and the amount of I/O that they issue. Resource consumer groups provide a way to group sessions that comprise a particular workload. For example, if your database is running four different applications, you can create four consumer groups, one for each application. Alternatively, if your data warehouse has three types of workloads, such as critical queries, normal queries, and ETL (extraction, transformation, and loading), then you can create a consumer group for each type of workload. After you have created the consumer groups, you must create rules that specify how sessions are mapped to consumer groups.
The database resource plan, or intradatabase resource plan, specifies how resources are allocated among consumer groups in a database. A database may have multiple resource plans; however, only one resource plan can be active at any point in time. This allows database resource management to cater for different requirements associated with different time periods.
Exadata IORM extends the consumer group concept using categories. While consumer groups represent collections of users within a database, categories represent collections of consumer groups across all databases. The diagram in the slide shows an example of two categories containing consumer groups across two databases. You can manage I/O resources based on categories by creating a category plan. For example, you can specify precedence to consumer groups in the Interactive category over consumer groups in the Batch category for all the databases sharing an Exadata cell.
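The data warehouse example above (critical queries, normal queries, and ETL) can be sketched with the DBMS_RESOURCE_MANAGER package. This is an illustrative sketch only: the consumer group names, the ETL_USER account, and the REPORTS_SVC service are hypothetical, and the INTERACTIVE and BATCH categories are assumed to exist already. If they do not, DBMS_RESOURCE_MANAGER.CREATE_CATEGORY can create them in the same pending area.

```sql
BEGIN
  -- All Resource Manager changes are staged in a pending area.
  DBMS_RESOURCE_MANAGER.CREATE_PENDING_AREA();

  -- One consumer group per workload type (hypothetical names).
  -- The category ties each group to an IORM category plan directive.
  DBMS_RESOURCE_MANAGER.CREATE_CONSUMER_GROUP(
    consumer_group => 'CRITICAL_QUERIES',
    comment        => 'Time-critical warehouse queries',
    category       => 'INTERACTIVE');
  DBMS_RESOURCE_MANAGER.CREATE_CONSUMER_GROUP(
    consumer_group => 'NORMAL_QUERIES',
    comment        => 'Routine warehouse queries',
    category       => 'INTERACTIVE');
  DBMS_RESOURCE_MANAGER.CREATE_CONSUMER_GROUP(
    consumer_group => 'ETL',
    comment        => 'Extraction, transformation, and loading jobs',
    category       => 'BATCH');

  -- Mapping rules: sessions of the hypothetical ETL_USER account go to ETL,
  -- and sessions connecting through the hypothetical REPORTS_SVC service
  -- go to NORMAL_QUERIES.
  DBMS_RESOURCE_MANAGER.SET_CONSUMER_GROUP_MAPPING(
    attribute      => DBMS_RESOURCE_MANAGER.ORACLE_USER,
    value          => 'ETL_USER',
    consumer_group => 'ETL');
  DBMS_RESOURCE_MANAGER.SET_CONSUMER_GROUP_MAPPING(
    attribute      => DBMS_RESOURCE_MANAGER.SERVICE_NAME,
    value          => 'REPORTS_SVC',
    consumer_group => 'NORMAL_QUERIES');

  DBMS_RESOURCE_MANAGER.VALIDATE_PENDING_AREA();
  DBMS_RESOURCE_MANAGER.SUBMIT_PENDING_AREA();
END;
/
```

Users normally also need the privilege to switch into a mapped group, which is granted with DBMS_RESOURCE_MANAGER_PRIVS.GRANT_SWITCH_CONSUMER_GROUP. The groups take effect only after a resource plan that references them is created and enabled.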
I/O Resource Management Plans
I/O Resource Management
• Inside one database: Intradatabase Resource Plan
• Across multiple databases: IORM Plan
  – Interdatabase Resource Plan
  – Category Resource Plan
I/O Resource Management Plans
IORM provides different approaches for managing resource allocations. Each approach can be used independently or in conjunction with other approaches.
Database resource management enables you to manage workloads within a database. Database resource management is configured within each database, using Database Resource Manager to create an intradatabase resource plan. You should use this feature if you have multiple types of workloads within a database and you need to define a policy for specifying how these workloads share the database resource allocation. If only one database is using Exadata, this is the only IORM feature that you need.
Interdatabase resource management is managed with an interdatabase plan. An interdatabase plan specifies how resources are allocated among multiple databases for each cell. The directives in an interdatabase plan specify allocations to databases, rather than consumer groups.
Category resource management is an advanced feature. It is useful when Exadata is hosting multiple databases and you want to allocate resources primarily by the category of the work being done. For example, suppose all databases have three categories of workloads: OLTP, reports, and maintenance. To allocate the I/O resources based on these workload categories, you would use category resource management.
Note: The combination of the interdatabase plan and the category plan is called the IORM plan.
IORM Architecture
[Diagram: Databases A through Z send I/O requests to the cells. Each I/O request is tagged with the DB name, type, and consumer group. In CELLSRV on the Exadata cell, each database has consumer group queues (CG1, CG2, ... CGn) and background (BG) queues. IORM, using the resource plans, selects requests from these queues for the disk queue of the cell disk. Other labels: Performance statistics.]
IORM Architecture
IORM manages Exadata I/O resources on a per-cell basis. Whenever the I/O requests start to saturate the cell, IORM schedules incoming I/O requests according to the configured resource plans. IORM schedules I/O by selecting requests from different CELLSRV queues. The resource plans are used to determine the order in which the queued I/O requests are issued to disk. The goal of IORM is to fully utilize the available disk resources. Any allocation that is not fully utilized is made available to other workloads in proportion to the configured resource plans.
IORM only intervenes when needed. For example, IORM does not intervene if there is only one active consumer group on one database because there is no possibility of contention with another consumer group or database.
Background I/Os are scheduled based on their priority relative to the user I/Os. For example, redo writes and control file I/Os are critical to performance and are always prioritized above all user I/Os. Writes by the database writer process (DBWn) are scheduled at the same priority level as user I/Os.
The diagram in the slide illustrates the high-level implementation of IORM. For each cell disk, each database accessing the cell has one I/O queue per consumer group and three background I/O queues. The background I/O queues correspond to high, medium, and low priority requests with different I/O types mapped to each queue. If you do not set an intradatabase resource plan, all nonbackground I/O requests are grouped into a single consumer group called OTHER_GROUPS.
Note: IORM is only used to manage I/O requests to physical disks. IORM does not manage requests to flash-based grid disks or requests serviced by Exadata Smart Flash Cache.
I/O Resource Management Plans Example
Database A (Single Inst): Intradatabase Plan A (DBMS_RESOURCE_MANAGER)
  Consumer group 1: 15%
  Consumer group 2: 10%
  Consumer group 3: 35%
  Consumer group 4: 40%
Database B (RAC): Intradatabase Plan B (DBMS_RESOURCE_MANAGER)
  Consumer group 5: 22%
  Consumer group 6: 18%
  Consumer group 7: 15%
  Consumer group 8: 45%
Exadata Storage Server (receives DB A Plan and DB B Plan; controlled I/O distribution to disks)
  IORM Plan
    Interdatabase Plan (CellCLI)
      Database A: 70%
      Database B: 30%
    Category Plan (CellCLI)
      INTERACTIVE: 60%
      BATCH: 40%
I/O Resource Management Plans Example
For each database, you can use DBRM to create an intradatabase resource plan. When you set an intradatabase resource plan, a description of the plan is automatically sent to each cell. In the example in the slide, Database A and Database B have separate intradatabase plans. Note also that each consumer group in each intradatabase plan is associated with either the INTERACTIVE or BATCH category.
At each cell, an interdatabase plan can be configured and enabled. In the example in the slide, the interdatabase plan is configured with a larger resource allocation for Database A (70%) than for Database B (30%).
Also within each cell, you can categorize consumer groups from different databases and distribute I/O resources according to the various categories. In the example in the slide, the INTERACTIVE category (60%) is allocated a greater resource share than the BATCH category (40%).
I/O Resource Management Plans Example
[Diagram: IORM allocation hierarchy
  All User I/Os (100%)
  Categories: INTERACTIVE 60%, BATCH 40%
  Interdatabase (within each category): Database A 70%, Database B 30%
  Intradatabase: Database A consumer groups 15%, 10%, 35%, 40%; Database B consumer groups 22%, 18%, 15%, 45%]
I/O Resource Management Plans Example (continued)
The category, interdatabase, and intradatabase plans are used together by Exadata to allocate I/O resources. The category plan is first used to allocate resources among the categories. When a category is selected, the interdatabase plan is used to select a database; only databases that have consumer groups with the selected category can be selected. Finally, the selected database’s intradatabase plan is used to select one of its consumer groups. The percentage of resource allocation represents the probability of making a selection at each level.
Expressing this as a formula:
  Pcgn = cgn / sum(catcgs) * db% * cat%
where:
• Pcgn is the probability of selecting consumer group n
• cgn is the resource allocation for consumer group n
• sum(catcgs) is the sum of the resource allocations for all consumer groups in the same category as consumer group n and on the same database as consumer group n
• db% is the database allocation percentage in the interdatabase plan
• cat% is the category allocation percentage in the category plan
I/O Resource Management Plans Example (continued)
The hierarchy used to distribute I/Os is illustrated in the slide. The example is continued from the previous slide but the consumer group names are abbreviated to CG1, CG2, and so on.
Notice that although each consumer group allocation is expressed as a percentage within each database, IORM is concerned with the ratio of consumer group allocations within each category and database. For example, CG1 nominally receives 16.8% of I/O resources from IORM (15/(15+10)*70%*40%); however, this does not change if the intradatabase plan allocations for CG1 and CG2 are doubled to 30% and 20%, respectively. This is because the allocation to CG1 remains 50% greater than the allocation to CG2. This behavior also explains why CG1 (16.8%) and CG3 (19.6%) have a similar allocation through IORM even though CG3 belongs to the higher priority category (60% versus 40%) and has a much larger intradatabase plan allocation (35% versus 15%).
Note: ASM I/Os (for rebalance and so on) and I/Os issued by Oracle background processes are handled separately and automatically by Exadata. For clarity, background I/Os are not shown in the example.
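The arithmetic behind these figures can be written out with the selection formula given earlier. This is a worked sketch that assumes, as the 16.8% and 19.6% figures imply, that CG1 and CG2 are Database A's BATCH consumer groups and CG3 and CG4 are its INTERACTIVE consumer groups. The CG2 and CG4 results are derived here and do not appear in the original text.

```latex
\begin{align*}
P_{CG1} &= \frac{15}{15+10} \times 70\% \times 40\% = 0.60 \times 0.28 = 16.8\% \\
P_{CG2} &= \frac{10}{15+10} \times 70\% \times 40\% = 0.40 \times 0.28 = 11.2\% \\
P_{CG3} &= \frac{35}{35+40} \times 70\% \times 60\% = \tfrac{7}{15} \times 0.42 = 19.6\% \\
P_{CG4} &= \frac{40}{35+40} \times 70\% \times 60\% = \tfrac{8}{15} \times 0.42 = 22.4\% \\
P'_{CG1} &= \frac{30}{30+20} \times 70\% \times 40\% = 0.60 \times 0.28 = 16.8\%
\end{align*}
```

The last line is the doubled case (30% and 20%): only the ratio between CG1 and CG2 enters the formula, so the result is unchanged. The four Database A values sum to 70%, which matches its interdatabase allocation.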
Enabling Intradatabase Resource Management
• You can enable intradatabase resource management:
  – Manually:
    - Set the database’s RESOURCE_MANAGER_PLAN parameter.
  – Automatically:
    - Create a job scheduler window.
    - Associate a resource plan with the window.
• Exadata is notified when an intradatabase resource plan is set or modified:
  – Enabled or modified plan sent to each cell using iDB
• You must activate the IORMPLAN on all Exadata cells.
• Following are the commonly used intradatabase plans:
  – mixed_workload_plan
  – dss_plan
  – default_maintenance_plan
Enabling Intradatabase Resource Management
An intradatabase resource plan can be manually enabled with the RESOURCE_MANAGER_PLAN initialization parameter or automatically enabled using the job scheduler.
When you set an intradatabase resource plan on the database, a description of the plan is automatically sent to each cell. When a new cell is added or an existing cell is restarted, the current intradatabase plan is automatically sent to the cell. This resource plan is used to manage resources on both the database server and cells.
Before IORM can be used, you must activate the IORMPLAN on all corresponding Exadata cells.
Oracle Database provides several predefined intradatabase plans. The most commonly used are mixed_workload_plan, dss_plan, and default_maintenance_plan.
Intradatabase plans do not contain a directive for background I/O activity. Background I/Os are scheduled based on their priority relative to the user I/Os. For example, redo writes, and control file reads and writes are critical to performance and are always prioritized above all user I/Os.
Note: When an Oracle RAC database uses Exadata, all instances in the Oracle RAC cluster must be set to the same resource plan.
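The two ways of enabling a plan can be sketched as follows. This is an illustration under stated assumptions, not a recommended configuration: it uses the predefined dss_plan named above, while the window name NIGHTLY_BATCH_WINDOW, its 22:00 start time, and its six-hour duration are hypothetical.

```sql
-- Manual: make dss_plan the active intradatabase plan.
ALTER SYSTEM SET RESOURCE_MANAGER_PLAN = 'dss_plan';

-- Automatic: create a scheduler window and associate the plan with it.
-- While the window is open, the scheduler activates the associated plan.
BEGIN
  DBMS_SCHEDULER.CREATE_WINDOW(
    window_name     => 'NIGHTLY_BATCH_WINDOW',        -- hypothetical name
    resource_plan   => 'DSS_PLAN',
    start_date      => SYSTIMESTAMP,
    repeat_interval => 'FREQ=DAILY;BYHOUR=22;BYMINUTE=0;BYSECOND=0',
    duration        => INTERVAL '6' HOUR,
    comments        => 'Activates DSS_PLAN during the nightly batch period');
END;
/
```

In either case, the cells only act on the plan once the IORMPLAN is active on them.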
Intradatabase Plan Example
BEGIN
  DBMS_RESOURCE_MANAGER.CREATE_SIMPLE_PLAN(
    SIMPLE_PLAN     => 'my_plan',
    CONSUMER_GROUP1 => 'high_priority',
    GROUP1_PERCENT  => 80,
    CONSUMER_GROUP2 => 'low_priority',
    GROUP2_PERCENT  => 20);
END;
/
ALTER SYSTEM SET RESOURCE_MANAGER_PLAN = 'my_plan';

The plan is sent directly to the Exadata cells via iDB.

Consumer Group   Level 1   Level 2   Level 3
SYS_GROUP        100%
HIGH_PRIORITY              80%
LOW_PRIORITY               20%
OTHER_GROUPS                         100%

Percentages are used for both CPU and I/O resources.

CellCLI> ALTER IORMPLAN ACTIVE
Intradatabase Plan Example
The intradatabase I/O resource plan specifies how I/O resources are allocated among consumer groups in a specific database.
An intradatabase I/O resource plan is created with the procedures in the DBMS_RESOURCE_MANAGER PL/SQL package. There are no specific I/O resource parameters or procedures. You create an intradatabase I/O resource plan exactly the same way as you would create a CPU resource plan. When you specify an allocation percentage, this percentage applies to both database server CPU and Exadata I/O resources if you are using Exadata. There are no specific I/O settings because typically you are constrained by CPU or I/O, but not both at the same time. The intradatabase I/O resource plan is applicable only when the database uses Exadata.
The example in the slide uses the CREATE_SIMPLE_PLAN procedure to create MY_PLAN. This resource plan is used to manage CPU resources at the database level, and I/O resources at the Exadata cell level.
Before I/O resources for an intradatabase plan can be managed by Exadata I/O Resource Management, you need to make sure that the IORMPLAN is active. This can be done by executing the ALTER IORMPLAN ACTIVE command.
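After the plan is created and set, the result can be checked from the database. The queries below are a small sketch that assumes the plan was created as MY_PLAN, as in the slide, and that the session can read the DBA_ dictionary views.

```sql
-- Which resource plan is currently active in this instance?
SELECT name, is_top_plan
  FROM v$rsrc_plan;

-- Directives that CREATE_SIMPLE_PLAN generated for MY_PLAN,
-- showing the allocation of each consumer group at levels 1 to 3.
SELECT group_or_subplan, mgmt_p1, mgmt_p2, mgmt_p3
  FROM dba_rsrc_plan_directives
 WHERE plan = 'MY_PLAN'
 ORDER BY mgmt_p1 DESC, mgmt_p2 DESC, mgmt_p3 DESC;
```

The second query should reflect the level layout shown in the slide, with the two user-defined groups sharing level 2.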
Enabling IORM for Multiple Databases
• Enable IORM for multiple databases by configuring an IORMPLAN:
  – The category plan assigns I/O resources using categories.
  – The interdatabase plan assigns I/O resources using database names.
  – All combinations are possible.
• Use CellCLI to define and activate the IORMPLAN on each cell.
• Configure the same IORMPLAN on each cell.
• Only one IORMPLAN can be active at a time on a cell.
• IORMPLAN settings are persistent across cell reboots.
• All databases get equal allocations in the absence of an IORMPLAN.
Enabling IORM for Multiple Databases
I/O resource management for multiple databases is configured with the IORMPLAN. The IORMPLAN specifies how I/O resources are allocated for each cell. If you are using multiple cells, you need to configure them all. In most cases, all of your cells should use the same IORMPLAN.
The IORMPLAN contains both an interdatabase plan, also called a DB plan, and a category plan. The directives in the DB plan specify I/O resource allocations to database names, rather than consumer groups. The directives in the category plan specify I/O resource allocations to categories, rather than databases or consumer groups.
The IORMPLAN is configured and enabled with CellCLI on each cell. Only one IORMPLAN can be active on a cell at any given time. At startup, the IORMPLAN is an empty string, which effectively turns off IORM. In that case all databases receive an equal allocation.
The IORMPLAN must be activated for I/O resource management to occur. When the IORMPLAN is deactivated, IORM will not manage I/O resources, even if an intradatabase resource plan is set or an IORMPLAN is configured.
Interdatabase Plan Example

    CellCLI> alter iormplan -
    > dbplan=((name=sales_prod, level=1, allocation=80), -
    > (name=finance_prod, level=1, allocation=20), -
    > (name=sales_dev, level=2, allocation=100), -
    > (name=sales_test, level=3, allocation=50), -
    > (name=other, level=3, allocation=50)), -
    > catplan=''

    CellCLI> alter iormplan active

    Database        Level 1    Level 2    Level 3
    sales_prod      80%
    finance_prod    20%
    sales_dev                  100%
    sales_test                            50%
    other                                 50%
Interdatabase Plan Example
On each Exadata cell, an interdatabase plan specifies how resources are divided among multiple databases. The directives in an interdatabase plan specify allocations to databases, rather than consumer groups. The interdatabase plan is configured and activated with CellCLI, on each cell.
The example in the slide implements an interdatabase plan following the directives shown in the table. The interdatabase plan is created by specifying the DBPLAN part of the IORMPLAN.
The interdatabase plan is similar to an intradatabase plan in that each directive consists of a level from 1 to 8 and an allocation amount in percentage terms. For a given plan, all the allocations at any level must add up to 100 or less. An interdatabase plan differs from an intradatabase plan in that it cannot contain subplans and it only contains I/O resource directives.
As a best practice, you should create a directive for each database using the same Exadata cell. To make sure that any database without an explicit directive can be managed, you need to create an allocation named OTHER.
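The rules stated above (levels from 1 to 8, allocations at each level totaling 100 or less, and an OTHER directive for databases without an explicit entry) can be checked before a plan is typed into CellCLI. The following is a minimal Python sketch under stated assumptions: the function name check_dbplan and the tuple layout are hypothetical and are not part of CellCLI or any Oracle tool.

```python
from collections import defaultdict

def check_dbplan(directives):
    """Check (name, level, allocation) tuples against the rules in the notes.

    Hypothetical helper for illustration only; returns a list of problems.
    """
    problems = []
    totals = defaultdict(int)
    for name, level, allocation in directives:
        # Each directive uses a level from 1 to 8.
        if not 1 <= level <= 8:
            problems.append(f"{name}: level {level} is outside 1..8")
        totals[level] += allocation
    # Allocations at any one level must add up to 100 or less.
    for level in sorted(totals):
        if totals[level] > 100:
            problems.append(f"level {level}: allocations total {totals[level]}")
    # An OTHER directive covers databases without an explicit entry.
    if not any(name.lower() == "other" for name, _, _ in directives):
        problems.append("no 'other' directive")
    return problems

# The directives from the slide's table.
slide_plan = [
    ("sales_prod", 1, 80),
    ("finance_prod", 1, 20),
    ("sales_dev", 2, 100),
    ("sales_test", 3, 50),
    ("other", 3, 50),
]
print(check_dbplan(slide_plan))
```

An empty list means the plan satisfies the three rules checked here; the cell itself remains the authority on whether a plan is accepted.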
Interdatabase Plan Example (continued)
The role attribute indicates that the directive is applied only when the databases are in that database role. This provides the flexibility to automatically adjust the IORM plan according to the role of the database in an Oracle Data Guard environment. If the role attribute is not specified, the directive is applied regardless of the database role. Following is an example of an interdatabase plan using the role attribute:

    ALTER IORMPLAN dbplan=(
      (name=sales1, level=1, allocation=30, role=primary),
      (name=sales2, level=1, allocation=35, role=primary),
      (name=sales1, level=2, allocation=20, role=standby),
      (name=sales2, level=2, allocation=25, role=standby),
      (name=other, level=3, allocation=50))

You can remove an interdatabase plan using:

    ALTER IORMPLAN dbplan=''
Category Plan Example

DBA_RSRC_CONSUMER_GROUPS

    CONSUMER_GROUP              CATEGORY
    --------------------------  --------------
    SYS_GROUP                   ADMINISTRATIVE
    BATCH_GROUP                 BATCH
    INTERACTIVE_GROUP           INTERACTIVE
    ORA$…                       MAINTENANCE
    OTHER_GROUPS                OTHER
    DEFAULT_CONSUMER_GROUP      OTHER
    LOW_GROUP                   OTHER
    AUTO_TASK_CONSUMER_GROUP    OTHER

DBMS_RESOURCE_MANAGER.CREATE_CATEGORY

    Category       Level 1    Level 2    Level 3
    Interactive    90%
    Batch                     80%
    Maintenance                          50%
    Other                                50%

    CellCLI> alter iormplan -
    > dbplan='', -
    > catplan=( -
    > (name=interactive, level=1, allocation=90), -
    > (name=batch, level=2, allocation=80), -
    > (name=maintenance, level=3, allocation=50), -
    > (name=other, level=3, allocation=50) -
    > )

    CellCLI> alter iormplan active
Category Plan Example
Database Resource Manager enables you to specify a category for every consumer group. The predefined categories and their associated consumer groups are listed in the slide. This is the default situation after database creation.
If you decide to use these default categories, you should map all administrative consumer groups in all databases to the ADMINISTRATIVE category. All high-priority user activity, such as consumer groups for important online transaction processing (OLTP) transactions and time-critical reports, should be mapped to the INTERACTIVE category. All low-priority user activity, such as reports, maintenance, and low-priority OLTP transactions, should be mapped to the BATCH, MAINTENANCE, and OTHER categories.
You can create your own categories using the CREATE_CATEGORY procedure in the DBMS_RESOURCE_MANAGER package, and then assign your category to a consumer group using the CREATE_CONSUMER_GROUP or UPDATE_CONSUMER_GROUP procedures. You can then manage I/O resources based on categories by creating a category plan.
The example shown in the slide implements a category plan based on the allocations described in the table. With this plan, consumer groups associated with the INTERACTIVE category get up to 90 percent of I/O resources. 80 percent of the remainder, including any unutilized allocation from the INTERACTIVE category, is allocated to the BATCH category. The MAINTENANCE and OTHER categories share the remainder. Any consumer group without an explicitly specified category defaults to the OTHER category.
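The way allocations cascade from one level to the next can be worked through numerically. The Python sketch below is a simplified model of the description above, not the scheduler that Exadata actually runs: the function name cascade, the demand figures, and the assumption that leftover I/O passes only to the next level (rather than to peers at the same level) are all illustrative.

```python
def cascade(levels, demand):
    """Simplified model of level-by-level allocation (illustrative only).

    levels: list of {category: percent} dicts, ordered from level 1 down.
    demand: {category: fraction of total I/O capacity the category wants}.
    Returns {category: fraction of total I/O capacity granted}.
    """
    remaining = 1.0
    granted = {}
    for directives in levels:
        available = remaining
        for category, percent in directives.items():
            # A category is entitled to its percentage of what reaches its level,
            # but never receives more than it actually asks for.
            entitlement = available * percent / 100
            granted[category] = min(entitlement, demand.get(category, 0.0))
        # Anything not used at this level is passed down to the next level.
        remaining = available - sum(granted[c] for c in directives)
    return granted

# The category plan from the slide.
category_plan = [
    {"interactive": 90},
    {"batch": 80},
    {"maintenance": 50, "other": 50},
]

# Every category wants all the I/O it can get.
busy = cascade(category_plan, dict.fromkeys(
    ["interactive", "batch", "maintenance", "other"], 1.0))

# INTERACTIVE only needs half of the cell's I/O capacity.
quiet = cascade(category_plan, {
    "interactive": 0.5, "batch": 1.0, "maintenance": 1.0, "other": 1.0})

print({k: round(v, 3) for k, v in busy.items()})
print({k: round(v, 3) for k, v in quiet.items()})
```

In the first case the model gives INTERACTIVE 90 percent, BATCH 8 percent (80 percent of the remaining 10), and MAINTENANCE and OTHER 1 percent each. In the second case the unused INTERACTIVE allocation flows down, so BATCH receives 40 percent and the two level 3 categories 5 percent each.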
Complete Example
Database A

    BEGIN
      -- Create the plan and its four consumer groups in one call.
      DBMS_RESOURCE_MANAGER.CREATE_SIMPLE_PLAN(
        SIMPLE_PLAN     => 'DB_A_Plan',
        CONSUMER_GROUP1 => 'CG1', GROUP1_PERCENT => 15,
        CONSUMER_GROUP2 => 'CG2', GROUP2_PERCENT => 10,
        CONSUMER_GROUP3 => 'CG3', GROUP3_PERCENT => 35,
        CONSUMER_GROUP4 => 'CG4', GROUP4_PERCENT => 40);
      -- Map each consumer group to a predefined category.
      DBMS_RESOURCE_MANAGER.CREATE_PENDING_AREA();
      DBMS_RESOURCE_MANAGER.UPDATE_CONSUMER_GROUP(
        CONSUMER_GROUP => 'CG1', NEW_CATEGORY => 'BATCH');
      DBMS_RESOURCE_MANAGER.UPDATE_CONSUMER_GROUP(
        CONSUMER_GROUP => 'CG2', NEW_CATEGORY => 'BATCH');
      DBMS_RESOURCE_MANAGER.UPDATE_CONSUMER_GROUP(
        CONSUMER_GROUP => 'CG3', NEW_CATEGORY => 'INTERACTIVE');
      DBMS_RESOURCE_MANAGER.UPDATE_CONSUMER_GROUP(
        CONSUMER_GROUP => 'CG4', NEW_CATEGORY => 'INTERACTIVE');
      DBMS_RESOURCE_MANAGER.SUBMIT_PENDING_AREA();
    END;
    /
    ALTER SYSTEM SET RESOURCE_MANAGER_PLAN = 'DB_A_Plan';
Complete Example
This slide is the first in a series of 3 slides which provide a more complete example showing the use of the different IORM plan types at the same time. The example is based on the scenario introduced on pages 8, 9 and 10 of this lesson.
On this slide, the commands required to configure DBRM on Database A are shown. Note that the example does not show the creation of any categories using DBMS_RESOURCE_MANAGER.CREATE_CATEGORY because the categories used in the scenario (BATCH and INTERACTIVE) are categories that are predefined inside Oracle Database by default.
Complete Example
Database B

    BEGIN
      -- Create the plan and its four consumer groups in one call.
      DBMS_RESOURCE_MANAGER.CREATE_SIMPLE_PLAN(
        SIMPLE_PLAN     => 'DB_B_Plan',
        CONSUMER_GROUP1 => 'CG5', GROUP1_PERCENT => 22,
        CONSUMER_GROUP2 => 'CG6', GROUP2_PERCENT => 18,
        CONSUMER_GROUP3 => 'CG7', GROUP3_PERCENT => 15,
        CONSUMER_GROUP4 => 'CG8', GROUP4_PERCENT => 45);
      -- Map each consumer group to a predefined category.
      DBMS_RESOURCE_MANAGER.CREATE_PENDING_AREA();
      DBMS_RESOURCE_MANAGER.UPDATE_CONSUMER_GROUP(
        CONSUMER_GROUP => 'CG5', NEW_CATEGORY => 'BATCH');
      DBMS_RESOURCE_MANAGER.UPDATE_CONSUMER_GROUP(
        CONSUMER_GROUP => 'CG6', NEW_CATEGORY => 'BATCH');
      DBMS_RESOURCE_MANAGER.UPDATE_CONSUMER_GROUP(
        CONSUMER_GROUP => 'CG7', NEW_CATEGORY => 'INTERACTIVE');
      DBMS_RESOURCE_MANAGER.UPDATE_CONSUMER_GROUP(
        CONSUMER_GROUP => 'CG8', NEW_CATEGORY => 'INTERACTIVE');
      DBMS_RESOURCE_MANAGER.SUBMIT_PENDING_AREA();
    END;
    /
    ALTER SYSTEM SET RESOURCE_MANAGER_PLAN = 'DB_B_Plan';
Complete Example (continued)
On this slide, the commands required to configure DBRM on Database B are shown. These commands are essentially the same as for Database A except for the different consumer group names and resource allocation percentages.
Complete Example
Exadata Cells

    CellCLI> alter iormplan -
    > dbplan=((name=Database_A, level=1, allocation=70), -
    > (name=Database_B, level=1, allocation=30)), -
    > catplan=((name=INTERACTIVE, level=1, allocation=60), -
    > (name=BATCH, level=1, allocation=40))

    CellCLI> alter iormplan active
Complete Example (continued)
This slide shows the commands required to configure IORM on the Exadata cells. Exadata uses the IORM plan in conjunction with the DBRM plans propagated by the databases to allocate I/O resources.
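To see how the three plans in this example might combine, the Python sketch below multiplies the allocations through in the order category, then interdatabase, then intradatabase. It is one simplified reading that assumes every consumer group is saturating the cells at once; the function name saturated_shares and the normalization of consumer group percentages within each database and category are assumptions made for illustration, not a statement of how IORM computes its schedule.

```python
# Category and interdatabase allocations from the Exadata Cells slide.
CATPLAN = {"INTERACTIVE": 60, "BATCH": 40}
DBPLAN = {"Database_A": 70, "Database_B": 30}

# consumer group -> (database, category, percent in the intradatabase plan)
GROUPS = {
    "CG1": ("Database_A", "BATCH", 15),
    "CG2": ("Database_A", "BATCH", 10),
    "CG3": ("Database_A", "INTERACTIVE", 35),
    "CG4": ("Database_A", "INTERACTIVE", 40),
    "CG5": ("Database_B", "BATCH", 22),
    "CG6": ("Database_B", "BATCH", 18),
    "CG7": ("Database_B", "INTERACTIVE", 15),
    "CG8": ("Database_B", "INTERACTIVE", 45),
}

def saturated_shares(catplan, dbplan, groups):
    """Illustrative share of cell I/O per consumer group under full contention."""
    shares = {}
    for cg, (db, cat, pct) in groups.items():
        # Percentages of the groups that compete inside the same database
        # and the same category.
        peers = sum(p for d, c, p in groups.values() if d == db and c == cat)
        shares[cg] = (catplan[cat] / 100) * (dbplan[db] / 100) * (pct / peers)
    return shares

shares = saturated_shares(CATPLAN, DBPLAN, GROUPS)
for cg, share in shares.items():
    print(f"{cg}: {share:.1%}")
```

Under these assumptions CG1, for example, receives 40 percent (BATCH) of 70 percent (Database_A) of 15/25 (its weight against CG2), and the eight shares add up to the whole cell.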
Using Database I/Os Metrics
• You can monitor IORM to understand resource consumption and make required adjustments.
• There are separate metrics for small (≤ 128 KB) and large I/Os.
• Which database has the heaviest load?
  – Look for highest DB_IO_RQ_SM + DB_IO_RQ_LG values.
• Which database was throttled the most?
  – Look for highest DB_IO_WT_SM + DB_IO_WT_LG values.

    Name                                Description
    DB_IO_RQ_SM, DB_IO_RQ_LG            Total number of I/O requests issued by the database since any resource plan was set
    DB_IO_RQ_SM_SEC, DB_IO_RQ_LG_SEC    I/O requests per second issued by the database in the past minute
    DB_IO_WT_SM, DB_IO_WT_LG            Total number of seconds that I/O requests issued by the database waited to be scheduled
Using Database I/Os Metrics
Exadata provides three groups of I/O metrics that correspond to the three types of IORM plans: category metrics, database metrics, and consumer group metrics. I/O metrics allow you to understand your I/O consumption and make adjustments to optimize performance and resource utilization.
For each I/O metric, a distinction is made between small I/Os, typically associated with OLTP applications, and large I/Os, which are usually indicative of DSS workloads. I/O metric names include _SM or _LG to identify small or large I/Os, respectively.
For database metrics the objectType attribute is set to IORM_DATABASE. The table in the slide gives you a quick description of some important database I/O metrics. A separate set of metric observations is available for each database specified in the IORM plan. Metric observations for different databases are differentiated by the name of the database, which is set in the metricObjectName attribute. You can compare metrics between databases to determine which one has the heaviest load or which one was throttled most, as illustrated in the slide. A special metricObjectName value of _OTHER_DATABASE_ is used for database I/O metrics associated with ASM and for databases that are not explicitly mentioned in the interdatabase IORM plan.
While this slide focuses on database metrics, the same principles apply for category metrics and consumer group metrics. For example, the CG_IO_RQ_SM_SEC metric specifies the rate of small I/O requests issued by a consumer group per second over the past minute. A large value indicates a heavy I/O workload from this consumer group in the past minute.
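The comparison described above (add the small and large request counts to find the heaviest load, and add the small and large wait totals to find the most throttled database) can be sketched in a few lines of Python. The function name rank_databases, the tuple layout, and every number in the sample are invented for illustration; real observations would come from the cell's metric output.

```python
from collections import defaultdict

def rank_databases(observations):
    """Return (heaviest_load, most_throttled) from metric observations.

    observations: iterable of (metric_name, metricObjectName, value) tuples.
    Hypothetical helper; not part of CellCLI.
    """
    load = defaultdict(float)
    wait = defaultdict(float)
    for metric, database, value in observations:
        if metric in ("DB_IO_RQ_SM", "DB_IO_RQ_LG"):
            load[database] += value   # small + large request counts
        elif metric in ("DB_IO_WT_SM", "DB_IO_WT_LG"):
            wait[database] += value   # small + large wait totals
    heaviest = max(load, key=load.get) if load else None
    most_throttled = max(wait, key=wait.get) if wait else None
    return heaviest, most_throttled

# Made-up sample values, for illustration only.
sample = [
    ("DB_IO_RQ_SM", "sales_prod", 120000),
    ("DB_IO_RQ_LG", "sales_prod", 4000),
    ("DB_IO_WT_SM", "sales_prod", 35),
    ("DB_IO_WT_LG", "sales_prod", 10),
    ("DB_IO_RQ_SM", "sales_dev", 20000),
    ("DB_IO_RQ_LG", "sales_dev", 9000),
    ("DB_IO_WT_SM", "sales_dev", 50),
    ("DB_IO_WT_LG", "sales_dev", 400),
]
heaviest, most_throttled = rank_databases(sample)
print(heaviest, most_throttled)
```

With these sample numbers the busiest database and the most throttled database are different, which is the kind of mismatch that would prompt a review of the allocations in the interdatabase plan.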
Quiz
If a consumer group does not require its full resource allocation, what happens to the leftover allocation?
1. It remains unused.
2. It is divided equally among other consumer groups.
3. It is allocated to other active consumer groups, according to the resource plan.
Answer: 3
Quiz
Which of the following conditions are required for IORM to intervene and control the allocation of I/O resources?
1. The IORM plan must be active.
2. More than one consumer group must be active.
3. The disks must be heavily utilized.
Answer: 1, 2, 3 All of the conditions listed in this question must be present for IORM to intervene.
Quiz
In which order are the different I/O resource plans applied to allocate I/O resources?
1. Category, intradatabase, interdatabase
2. Interdatabase, category, intradatabase
3. Category, interdatabase, intradatabase
4. Interdatabase, intradatabase, category
5. Intradatabase, interdatabase, category
Answer: 3
Quiz
You can create categories using the CellCLI utility.
1. TRUE
2. FALSE
Answer: 2 You can create your own categories using the CREATE_CATEGORY procedure in the DBMS_RESOURCE_MANAGER package, and then assign your category to a consumer group using the CREATE_CONSUMER_GROUP or UPDATE_CONSUMER_GROUP procedures. You can then manage I/O resources based on categories by creating a category plan. The category plan can be created using the CellCLI utility.
Summary In this lesson, you should have learned how to use Exadata I/O Resource Management to manage workloads within a database and across multiple databases.
Additional Resources
• Lesson Demonstrations (Viewlets)
  – Intradatabase I/O Resource Management:
    http://stcurriculum.oracle.com/demos/db/11g/r2/dbmach/061ExadataIntraDBIORM/061exadataintradbiorm_viewlet_swf.html
  – Interdatabase I/O Resource Management:
    http://stcurriculum.oracle.com/demos/db/11g/r2/dbmach/062ExadataInterDBIORM/062exadatainterdbiorm_viewlet_swf.html