VSICM6__M09_HighAvailabilityandFT

May 28, 2018 | Author: gokhan gokhan | Category: VMware, Computer Cluster, Virtual Machine, Replication (Computing), Backup


vSphere HA and vSphere Fault Tolerance Module 9

You Are Here
1. Course Introduction
2. Software-Defined Data Center
3. Creating Virtual Machines
4. vCenter Server
5. Configuring and Managing Virtual Networks
6. Configuring and Managing Virtual Storage
7. Virtual Machine Management
8. Resource Management and Monitoring
9. vSphere HA and vSphere Fault Tolerance
10. Host Scalability
11. vSphere Update Manager and Host Maintenance
12. Installing vSphere Components


Importance
Most organizations rely on computer-based services such as email, databases, and web-based applications. The failure of any of these services can mean lost productivity and revenue.

Configuring highly available, computer-based services is extremely important for an organization to remain competitive in contemporary business environments.

Module Lessons
Lesson 1: Introduction to vSphere HA
Lesson 2: vSphere HA Architecture
Lesson 3: Configuring vSphere HA
Lesson 4: Introduction to vSphere Fault Tolerance
Lesson 5: vSphere Replication and vSphere Data Protection

Lesson 1: Introduction to vSphere HA

Learner Objectives
By the end of this lesson, you should be able to meet the following objectives:
• Describe the options that you can configure to make your VMware vSphere® environment highly available
• Discuss the response of VMware vSphere® High Availability when a VMware ESXi™ host, a virtual machine, or an application fails

Protection at Every Level
vSphere makes it possible to reduce planned downtime, prevent unplanned downtime, and recover rapidly from outages.

Component: NIC Teaming, Storage Multipathing
Server: vSphere vMotion, vSphere DRS, vSphere HA and vSphere Fault Tolerance
Storage: vSphere Storage vMotion
Data: vSphere Replication, Third-Party Backup Solutions, vSphere Data Protection
Site: Site Recovery Manager

vCenter Server Availability: Recommendations
Make VMware vCenter Server™ and the components that it relies on highly available.

vCenter Server relies on these major components:
• vCenter Server database:
  – Create a cluster for the database.
• Authentication identity source:
  – For example, VMware vCenter™ Single Sign-On™ and Active Directory.
  – Set up with multiple redundant servers.

Methods for making vCenter Server available:
• Use vSphere HA to protect the vCenter Server virtual machine.

About vSphere HA vSphere HA uses multiple ESXi hosts configured as a cluster to provide rapid recovery from outages and cost-effective high availability for applications running in virtual machines.

Protects against server failures

Protects against application failures

Protects against datastore accessibility failures

Protects virtual machines against network isolation

vSphere HA Scenarios: ESXi Host Failure
When a host fails, vSphere HA restarts the affected virtual machines on other hosts.

[Figure: a vSphere HA cluster of three ESXi hosts, managed by vCenter Server, running virtual machines A through F. Virtual machines A and B are shown restarted on the surviving hosts.]

vSphere HA Scenarios: Guest Operating System Failure
When a virtual machine stops sending heartbeats or the virtual machine process (vmx) crashes, vSphere HA resets the virtual machine.

[Figure: a vSphere HA cluster of three ESXi hosts, managed by vCenter Server, running virtual machines A through F, each with VMware Tools installed.]

vSphere HA Scenarios: Application Failure
When an application fails, vSphere HA restarts the affected virtual machine on the same host. This scenario requires installation of VMware Tools™.

[Figure: a vSphere HA cluster of three ESXi hosts, managed by vCenter Server, running virtual machines A through F, each hosting an application.]

Importance of Redundant Heartbeat Networks
In a vSphere HA cluster, heartbeats have these characteristics:
• Heartbeats are sent between the master host and the slave hosts.
• They are used to determine whether a master host or slave host has failed.
• They are sent over a heartbeat network.

Redundant heartbeat networks ensure reliable failure detection.

Heartbeat network implementation:
• Implemented by using a VMkernel port marked for management.

Redundancy Using NIC Teaming You can use NIC teaming to create a redundant heartbeat network on ESXi hosts.

Ports or port groups used must be VMkernel ports.

NIC Teaming on an ESXi Host

Redundancy Using Additional Networks You can also create redundancy by configuring more heartbeat networks: On each ESXi host, create a second VMkernel port on a separate virtual switch with its own physical adapter.

Review of Learner Objectives
You should be able to meet the following objectives:
• Describe the options that you can configure to make your VMware vSphere® environment highly available
• Discuss the response of VMware vSphere® High Availability when a VMware ESXi™ host, a virtual machine, or an application fails

Lesson 2: vSphere HA Architecture

Learner Objectives
By the end of this lesson, you should be able to meet the following objectives:
• Describe the heartbeat mechanisms used by vSphere HA
• Identify and discuss other failure scenarios
• Recognize vSphere HA design considerations

vSphere HA Architecture: Agent Communication
To configure high availability, ESXi hosts are grouped into an object called a cluster.

[Figure: one master and two slave ESXi hosts, each running the FDM, hostd, and vpxa agents and attached to shared datastores. vCenter Server (vpxd) communicates with the hosts over the management network.]

vSphere HA Architecture: Network Heartbeats
The master host sends periodic heartbeats to the slave hosts so that the slave hosts know that the master host is alive.

[Figure: one master host and two slave hosts running virtual machines A through F on VMFS and NAS/NFS datastores, connected to vCenter Server through Management Network 1 and Management Network 2.]

vSphere HA Architecture: Datastore Heartbeats
Datastores are used as a backup communication channel to detect virtual machine and host heartbeats.

[Figure: one master host and two slave hosts running virtual machines A through F on VMFS and NAS/NFS datastores, connected to vCenter Server through Management Network 1 and Management Network 2, shown alongside the cluster Edit Settings window.]

Additional vSphere HA Failure Scenarios
• Slave host failure
• Master host failure
• Host isolation
• Virtual machine storage failure:
  – Virtual Machine Component Protection
    • All Paths Down
    • Permanent Device Loss
• Network failures and isolation

Failed Slave Host
When a slave host does not respond to the network heartbeat issued by the master host, the master vSphere HA agent tries to identify the cause.

[Figure: a failed slave host, a master host, and a second slave host running virtual machines A through F. File locks are held on a VMFS datastore (heartbeat region) and a NAS/NFS datastore (lock file). The hosts connect to vCenter Server through a primary and an alternate heartbeat network.]
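The diagnosis described on this slide can be sketched as a small decision function. This is a simplified illustration, not the FDM agent's actual logic: the names `HostObservation` and `classify_slave` and the three state labels are assumptions made for this example. It assumes the master combines only two signals, the network heartbeat and the datastore heartbeat.

```python
from dataclasses import dataclass


@dataclass
class HostObservation:
    """What the master can observe about one slave host (hypothetical model)."""
    network_heartbeat: bool    # heartbeat received over the heartbeat network
    datastore_heartbeat: bool  # heartbeat region or lock file is still being updated


def classify_slave(obs: HostObservation) -> str:
    """Return an illustrative state label for a slave host."""
    if obs.network_heartbeat:
        # The host answers on the network, so it is considered healthy.
        return "live"
    if obs.datastore_heartbeat:
        # The host is still running but cannot be reached over the network.
        return "isolated-or-partitioned"
    # No sign of life on either channel: treat the host as failed,
    # which is the case in which its virtual machines are restarted elsewhere.
    return "failed"
```

For example, `classify_slave(HostObservation(network_heartbeat=False, datastore_heartbeat=True))` yields the "isolated-or-partitioned" label, which is why the datastore channel matters: without it, the master could not tell a running host from a dead one.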

Failed Master Host
When the master host is placed in maintenance mode or crashes, the slave hosts detect that the master host is no longer issuing heartbeats.

[Figure: a failed master host (MOID 99) and two slave hosts (MOID 98 and MOID 100) running virtual machines A through F. File locks are held on a VMFS datastore (heartbeat region) and a NAS/NFS datastore (lock file). The hosts connect to vCenter Server and the default gateway (isolation address) through a primary and an alternate heartbeat network.]

MOID = Managed Object ID
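After the slaves notice the missing heartbeats, they elect a new master. The slide does not spell out the election rule, so the sketch below rests on an assumption: it uses the rule commonly described for vSphere HA, in which the host with the most datastores mounted wins and ties are broken by the highest managed object ID compared as a string. The `Candidate` type and `elect_master` function are hypothetical names for illustration.

```python
from typing import NamedTuple, Sequence


class Candidate(NamedTuple):
    name: str
    moid: str                # managed object ID, kept as a string on purpose
    datastores_mounted: int


def elect_master(candidates: Sequence[Candidate]) -> Candidate:
    """Pick a master under the assumed rule: most datastores, then highest MOID."""
    if not candidates:
        raise ValueError("at least one candidate host is required")
    # Tuples compare element by element, so the MOID string only
    # matters when the datastore counts are equal.
    return max(candidates, key=lambda c: (c.datastores_mounted, c.moid))


# The two surviving slaves from the slide, with an assumed equal datastore count.
survivors = [
    Candidate("slave-a", "98", 3),
    Candidate("slave-b", "100", 3),
]
new_master = elect_master(survivors)
```

Under this assumption the host with MOID 98 wins, because the string "98" sorts after "100". A numeric comparison would pick the other host, which is why the sketch keeps the MOID as a string.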

Isolated Host
If the host does not observe election traffic on the management network and cannot ping its default gateway, the host is isolated.

[Figure: three ESXi hosts running virtual machines A through F, connected by a primary and an alternate heartbeat network, with the default gateway acting as the isolation address.]
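The isolation rule on this slide combines two conditions, which a short sketch makes explicit. The function name `is_isolated` and the injected `ping` callable are assumptions for illustration; passing the ping function in as a parameter keeps the example runnable without network access.

```python
from typing import Callable, Iterable


def is_isolated(sees_election_traffic: bool,
                isolation_addresses: Iterable[str],
                ping: Callable[[str], bool]) -> bool:
    """Return True when a host should consider itself isolated.

    A host is isolated only if it observes no election traffic AND
    none of its isolation addresses (by default, the gateway) answers a ping.
    """
    if sees_election_traffic:
        return False
    # any() over an empty list is False, so a host with no isolation
    # address to test is treated as isolated in this sketch.
    return not any(ping(address) for address in isolation_addresses)


def gateway_down(address: str) -> bool:
    """Stand-in ping for a host whose gateway has stopped responding."""
    return False


# 192.0.2.1 is a documentation-range address used as a placeholder gateway.
isolated = is_isolated(False, ["192.0.2.1"], gateway_down)
```

Because every address must fail before the host declares itself isolated, adding a second isolation address lowers the chance of a false isolation event.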

Design Considerations
Host isolation events can be minimized through good design:
• Implement redundant heartbeat networks.
• Implement redundant isolation addresses.

If host isolation events do occur, good design enables vSphere HA to determine whether the isolated host is still alive.

Implement datastores so that they are separated from the management network by using one or both of the following approaches:
• Fibre Channel over fiber optic
• Physically separating your IP storage network from the management network

Virtual Machine Storage Failures
With an increasing number of virtual machines and datastores on each host, storage connectivity issues have high costs but are infrequent.

Connectivity problems due to:
• Network or switch failure
• Array misconfiguration
• Power outage

Virtual machine availability is affected:
• Virtual machines on affected hosts are difficult to manage.
• Applications with attached disks crash.

Virtual Machine Component Protection
Virtual Machine Component Protection (VMCP) protects against storage failures in a virtual machine. Only vSphere HA clusters that contain ESXi 6 hosts can be used to enable VMCP.
• Runs on a cluster enabled for vSphere HA.
• VMCP detects and responds to failures.
• Application availability and remediation.

Review of Learner Objectives
You should be able to meet the following objectives:
• Describe the heartbeat mechanisms used by vSphere HA
• Identify and discuss other failure scenarios
• Recognize vSphere HA design considerations

Lesson 3: Configuring vSphere HA

Learner Objectives
By the end of this lesson, you should be able to meet the following objectives:
• Recognize the prerequisites for creating and using a vSphere HA cluster
• Configure a vSphere HA cluster

About Clusters A cluster is a collection of ESXi hosts and their associated virtual machines, configured to share their resources. vCenter Server manages cluster resources like a single pool of resources.

Components such as vSphere HA and VMware vSphere® Distributed Resource Scheduler™ are configured on a cluster.

Cluster

vSphere HA Prerequisites
• All hosts must be licensed for vSphere HA.
• A cluster must contain at least two hosts.
• All hosts must be configured with static IP addresses. If you are using DHCP, you must ensure that the address for each host persists across reboots.
• All hosts must have at least one management network in common.
• All hosts must have access to the same virtual machine networks and datastores.
• For Virtual Machine Monitoring to work, VMware Tools™ must be installed.
• Only vSphere HA clusters that contain ESXi 6 hosts can be used to enable VMCP.

Configuring vSphere HA Settings When you create a vSphere HA cluster or configure a cluster, you must configure settings that determine how the feature works.

vSphere HA Settings: Virtual Machine Monitoring (1) You use Virtual Machine Monitoring settings to control the monitoring of virtual machines.

vSphere HA Settings: Virtual Machine Monitoring (2)

vSphere HA Settings: Datastore Heartbeating A heartbeat file is created on the selected datastores and is used in the event of a management network failure.

vSphere HA Settings: Admission Control
vCenter Server uses admission control to ensure that:
• Sufficient resources are available in a cluster to provide failover protection
• Virtual machine resource reservations are respected
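The idea behind admission control can be illustrated with a deliberately simplified capacity check. This is not the calculation vCenter Server performs (the slides do not describe it, and real policies also account for memory, overhead, and slot sizes). The function `can_power_on`, its parameters, and the worst-case assumption that the largest hosts fail are all choices made for this sketch.

```python
from typing import Sequence


def can_power_on(host_capacity_mhz: Sequence[int],
                 reserved_mhz: Sequence[int],
                 new_vm_reservation_mhz: int,
                 host_failures_to_tolerate: int = 1) -> bool:
    """Check whether a new VM's CPU reservation still fits after host failures.

    Assumes the worst case: the hosts with the most capacity are the ones lost.
    """
    if host_failures_to_tolerate < 0:
        raise ValueError("host_failures_to_tolerate must not be negative")
    # Sort ascending and keep only the smallest hosts as survivors.
    survivors = max(0, len(host_capacity_mhz) - host_failures_to_tolerate)
    surviving_capacity = sum(sorted(host_capacity_mhz)[:survivors])
    # Existing reservations plus the new one must fit in what survives.
    required = sum(reserved_mhz) + new_vm_reservation_mhz
    return required <= surviving_capacity


# Three equal hosts, tolerating one failure: 20000 MHz survives.
fits = can_power_on([10000, 10000, 10000], [9000, 6000], 4000)
rejected = can_power_on([10000, 10000, 10000], [9000, 6000], 6000)
```

In the first call the reservations total 19000 MHz and fit in the 20000 MHz that remains after one host failure, so the power-on is allowed. In the second they total 21000 MHz, so it is refused.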

vSphere HA Settings: Advanced Options
To customize vSphere HA behavior, you set advanced vSphere HA options.

To force the cluster not to use the default isolation address (default gateway):
• das.usedefaultisolationaddress = false

To force the cluster to ping alternate isolation addresses:
• das.isolationaddressX = pingable address

To force the cluster to wait beyond the default 30-second isolation action window:
• fdm.isolationpolicydelaysec = > 30 sec
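As a sketch of how these three options fit together, the helper below assembles them into a plain dictionary of key/value pairs. It does not call any vSphere API. The function name `build_isolation_options` is hypothetical, the option names are copied from the slide as written, and numbering the addresses from 0 is an assumption about what the X in das.isolationaddressX stands for.

```python
from typing import Dict, Optional, Sequence


def build_isolation_options(addresses: Sequence[str],
                            delay_sec: Optional[int] = None) -> Dict[str, str]:
    """Assemble advanced-option key/value pairs for alternate isolation addresses."""
    if not addresses:
        raise ValueError("at least one alternate isolation address is required")
    # Stop using the default gateway as the isolation address.
    options = {"das.usedefaultisolationaddress": "false"}
    # One numbered option per alternate address (numbering from 0 is assumed).
    for index, address in enumerate(addresses):
        options[f"das.isolationaddress{index}"] = address
    if delay_sec is not None:
        # The slide says this value must exceed the default 30-second window.
        if delay_sec <= 30:
            raise ValueError("delay_sec must be greater than 30")
        options["fdm.isolationpolicydelaysec"] = str(delay_sec)
    return options


# 192.0.2.x are documentation-range placeholders, not real isolation addresses.
options = build_isolation_options(["192.0.2.1", "192.0.2.2"], delay_sec=60)
```

The resulting dictionary holds four entries: the flag that disables the default isolation address, two numbered alternate addresses, and the extended delay.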

Configuring Virtual Machine Overrides You can override the vSphere HA settings that are set on a cluster for individual virtual machines in that cluster.

Network Configuration and Maintenance Before changing the networking settings on an ESXi host (adding port groups, removing virtual switches, and so on), you must suspend the Host Monitoring feature and place the host in maintenance mode.

This practice prevents unwanted attempts to fail over virtual machines.

Cluster Resource Reservation The Resource Reservation tab reports total cluster CPU, memory, memory overhead, storage capacity, the capacity reserved by virtual machines, and how much capacity is still available.

Monitoring Cluster Status You can monitor the status of a vSphere HA cluster on the Monitor tab.

Lab 21: Using vSphere HA
Demonstrate vSphere HA functionality
1. Create a Cluster Enabled for vSphere HA
2. Add Your ESXi Host to a Cluster
3. Test vSphere HA Functionality
4. View the vSphere HA Cluster Resource Usage
5. Manage vSphere HA Slot Size
6. Configure a vSphere HA Cluster with Strict Admission Control
7. Prepare for Upcoming Labs

Review of Learner Objectives
You should be able to meet the following objectives:
• Recognize the prerequisites for creating and using a vSphere HA cluster
• Configure a vSphere HA cluster

Lesson 4: Introduction to vSphere Fault Tolerance

Learner Objectives
By the end of this lesson, you should be able to meet the following objectives:
• List VMware vSphere® Fault Tolerance requirements and limitations
• Describe vSphere Fault Tolerance operation

vSphere Fault Tolerance
vSphere Fault Tolerance provides instantaneous failover and continuous availability:
• Zero downtime
• Zero data loss
• No loss of TCP connections

[Figure: a primary virtual machine and a secondary virtual machine on ESXi, kept in sync by fast checkpointing, with instantaneous failover between them.]

vSphere Fault Tolerance Features (1)
vSphere Fault Tolerance protects mission-critical, high-performance applications regardless of the operating system used.

vSphere Fault Tolerance:
• Supports up to four virtual CPUs
• Supports up to 64 GB of memory
• Supports VMware vSphere® vMotion® for primary and secondary virtual machines
• Creates a secondary copy of all virtual machine files, including disks
• Provides fast checkpoint copying to keep primary and secondary CPUs synchronized

vSphere Fault Tolerance Features (2) vSphere Fault Tolerance: • Supports thin-provisioned disks • Supports memory virtualization hardware assist • Supports Enhanced vMotion Compatibility clusters
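The two numeric limits from the feature list (four virtual CPUs and 64 GB of memory) lend themselves to a simple pre-check before turning on vSphere Fault Tolerance. The sketch below covers only those two limits. The function name `ft_limit_violations` and the constant names are hypothetical, and a real eligibility check would have to consider other requirements that these slides do not enumerate.

```python
from typing import List

# Limits taken from the feature list above.
MAX_FT_VCPUS = 4
MAX_FT_MEMORY_GB = 64


def ft_limit_violations(vcpus: int, memory_gb: float) -> List[str]:
    """Return a message for each documented limit the VM exceeds (empty if none)."""
    violations = []
    if vcpus > MAX_FT_VCPUS:
        violations.append(f"{vcpus} vCPUs exceeds the limit of {MAX_FT_VCPUS}")
    if memory_gb > MAX_FT_MEMORY_GB:
        violations.append(
            f"{memory_gb} GB of memory exceeds the limit of {MAX_FT_MEMORY_GB} GB"
        )
    return violations


within_limits = ft_limit_violations(vcpus=4, memory_gb=64)
too_large = ft_limit_violations(vcpus=8, memory_gb=128)
```

Returning a list of messages rather than a single boolean lets a caller report every exceeded limit at once.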

How vSphere Fault Tolerance Works with vSphere HA and vSphere DRS
vSphere Fault Tolerance works with vSphere HA and vSphere DRS.

vSphere HA:
• Is required for vSphere Fault Tolerance
• Restarts failed virtual machines
• Is vSphere Fault Tolerance aware

vSphere DRS:
• Selects the virtual machine's location at power-on
• Does not balance fault-tolerant virtual machines in a balanced cluster

[Figure: a primary machine, a secondary machine, and a new secondary machine, each on its own ESXi host.]

Redundant VMDKs
vSphere Fault Tolerance creates two complete virtual machines. Each virtual machine has its own .vmx configuration file and .vmdk files. Each of these virtual machines can be on a different datastore.

[Figure: the primary virtual machine with its .vmx file and .vmdk files on Datastore 1, and the secondary virtual machine with its .vmx file and .vmdk files on Datastore 2.]

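vSphere Fault Tolerance Checkpoint
vSphere Fault Tolerance supports multiple processors. Changes on the primary machine are not processed again on the secondary machine. Instead, the memory is updated on the secondary.

[Figure: input arrives at the primary ESXi host and the result is sent to the secondary ESXi host over the FT network, where it is not re-executed.]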

vSphere vMotion: Precopy
During a vSphere vMotion migration, a second virtual machine is created on the destination host. Then the memory of the source virtual machine is copied to the destination.

[Figure: VM A on the source and destination hosts. A memory bitmap tracks changes while the memory precopy runs over the vSphere vMotion network. End users reach the virtual machine through the virtual machine port group.]

vSphere vMotion: Memory Checkpoint
In a vSphere vMotion migration, checkpoint data is the last portion of memory that is still changing.

[Figure: VM A on the source and destination hosts. The memory bitmap identifies the checkpoint data sent over the vSphere vMotion network. End users reach the virtual machine through the virtual machine port group.]
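The relationship between precopy and the final checkpoint can be shown with a toy simulation. This is a simplified model of iterative precopy in general, not VMware's implementation: the function `precopy`, the page-set representation of the memory bitmap, and the `checkpoint_threshold` parameter are all assumptions made for illustration.

```python
from typing import List, Set, Tuple


def precopy(total_pages: int,
            dirtied_per_round: List[Set[int]],
            checkpoint_threshold: int) -> Tuple[List[int], Set[int]]:
    """Simulate iterative memory precopy.

    dirtied_per_round[i] is the set of pages the guest modifies while
    copy round i is in flight (the role played by the memory bitmap).
    Returns the number of pages copied in each round and the pages left
    over for the final checkpoint.
    """
    to_copy = set(range(total_pages))  # the first round copies every page
    copied_per_round = []
    for dirtied in dirtied_per_round:
        copied_per_round.append(len(to_copy))
        # Pages changed during this round must be sent again.
        to_copy = set(dirtied)
        if len(to_copy) <= checkpoint_threshold:
            break  # small enough to send as the final checkpoint
    # If the rounds run out first, whatever remains is the checkpoint.
    return copied_per_round, to_copy


rounds, checkpoint = precopy(
    total_pages=8,
    dirtied_per_round=[{0, 1, 2, 3}, {1, 2}, {2}],
    checkpoint_threshold=1,
)
```

In this run the rounds copy 8, 4, and then 2 pages, and a single page (page 2) is left as checkpoint data. Each round shrinks the set of changed pages until only a small, still-changing remainder has to be transferred at the end.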

vSphere Fault Tolerance Fast Checkpointing
The SMP FT checkpoint interval is dynamic by default. It adapts to maximize the workload performance and can range from as small as a few milliseconds to as large as several hundred milliseconds.

[Figure: a checkpoint containing the vmx configuration, devices, disks, and VM memory is sent from the primary host to the secondary host over the Fault Tolerance network.]

Shared Files
vSphere Fault Tolerance has shared files:
• shared.vmft prevents UUID change.
• .ftgeneration is for the split-brain condition.

[Figure: the primary host and the secondary host both access the shared.vmft and .ftgeneration files.]

shared.vmft File
The shared.vmft file, which is found on a shared datastore, is the vSphere Fault Tolerance metadata file and contains the primary and secondary instance UUIDs and the primary and secondary vmx paths.

[Figure: the shared.vmft file holds UUID-1 and UUID-2. The virtual machine guest OS continues to reference UUID-1.]

Enabling vSphere Fault Tolerance on a Virtual Machine You can turn on vSphere Fault Tolerance for a virtual machine through the VMware vSphere® Web Client.

Review of Learner Objectives
You should be able to meet the following objectives:
• List VMware vSphere® Fault Tolerance requirements and limitations
• Describe vSphere Fault Tolerance operation

Lesson 5: vSphere Replication and vSphere Data Protection

Learner Objectives
By the end of this lesson, you should be able to meet the following objectives:
• Describe VMware vSphere® Replication™
• Identify vSphere® Data Protection™ requirements
• List vSphere Data Protection sizing guidelines
• Describe vSphere Data Protection installation and configuration
• Explain how to back up and restore data with vSphere Data Protection

About vSphere Replication
vSphere Replication is an extension to vCenter Server. It provides hypervisor-based virtual machine replication and recovery.

[Figure: vSphere Replication copying virtual machines from a source vSphere environment to a target vSphere environment.]

How Replication Works vSphere Replication enables replication of a virtual machine from a source site to a target site, monitoring and managing the status of the replication, and recovering the virtual machine at the target site.

Replication Between Two Sites

Steps for Full Recovery
vSphere Replication integrates with Volume Shadow Copy Service through VMware Tools.
1. Right-click and select Recover.
2. Select a target folder.
3. Select a target resource.
4. Click Finish.

Validates your choices as you go

About vSphere Data Protection
vSphere Data Protection is a robust, easily deployed, disk-based backup and recovery solution.

vSphere Data Protection Requirements and Architecture vSphere Data Protection requires vCenter Server, either the Windows implementation or vCenter Server™ Appliance™.

vSphere Data Protection Components

Creating and Editing a vSphere Data Protection Backup Job You create and edit a backup job on the Backup tab of the vSphere Data Protection UI in the vSphere Web Client.

Creating a Custom Retention Policy

Performing Restores with vSphere Data Protection
You can restore an entire virtual machine from the Restore tab in the vSphere Data Protection UI:
• The administrator can browse the list of protected virtual machines and select one or more restore points.
• Individual VMDKs can also be restored.

Review of Learner Objectives
You should be able to meet the following objectives:
• Describe VMware vSphere® Replication™
• Identify vSphere® Data Protection™ requirements
• List vSphere Data Protection sizing guidelines
• Describe vSphere Data Protection installation and configuration
• Explain how to back up and restore data with vSphere Data Protection
