Cisco Storage Design Fundamentals Version 3.0
Student Guide
Table of Contents

Course Introduction: Overview; Recommended Prerequisites; Course Outline; Cisco Certifications; Administrative Information; About Firefly
Lesson 1, SCSI and Fibre Channel Primer: Overview; SCSI Protocol Overview; SCSI Operations; Fibre Channel Overview; Fibre Channel Flow Control; Fibre Channel Addressing; Fabric Login; Standard Fabric Services
Lesson 2, Cisco MDS 9000 Introduction: Overview; Cisco Storage Solutions Overview; Airflow and Power; Software Packages and Licensing
Lesson 3, Architecture and System Components: Overview; System Architecture; Oversubscription and Bandwidth Reservation; Credits and Buffers
Lesson 4, The Multilayer SAN: Overview; Virtual SANs; How VSANs Work; Inter-VSAN Routing (IVR); PortChannels; Intelligent Addressing; Cisco Fabric Services—Unifying the Fabric; Switch Interoperability
Lesson 5, Remote Lab Overview: Overview; System Memory Areas; CLI Overview; Fabric Manager and Device Manager; System Setup and Configuration; Using the MDS 9000 Remote Storage Labs
Lesson 6, Network-Based Storage Applications: Overview; Storage Virtualization Overview; Network-Based Storage Virtualization; Network-Hosted Applications; Network-Assisted Applications; Network-Accelerated Applications; Fibre Channel Write Acceleration
Lesson 7, Optimizing Performance: Overview; Oversubscription and Blocking; Virtual Output Queues; Fibre Channel Congestion Control; Quality of Service; Port Tracking; Load Balancing; SAN Performance Management
Lesson 8, Securing the SAN Fabric: Overview; SAN Security Issues; Zoning; Port and Fabric Binding; Authentication and Encryption; Management Security; End-to-End Security Design
Lesson 9, Designing SAN Extension Solutions: Overview; SAN Extension Applications; SAN Extension Transports; Extending SANs with WDM; Fibre Channel over IP; Extending SANs with FCIP; Cisco MDS 9000 IP Services Modules; High Availability FCIP Configurations; Using IVR for SAN Extension; SAN Extension Security; FCIP Performance Enhancements
Lesson 10, Building iSCSI Solutions: Overview; What's the Problem?; iSCSI Overview; MDS 9000 IP Services Modules; When to Deploy iSCSI; High-Availability iSCSI Configurations; iSCSI Security; iSCSI Target Discovery; Wide Area File Services
Cisco Storage Design Fundamentals

Overview
CSDF is an intensive 2-day instructor-led training (ILT) lecture/lab course that provides learners with basic skills in designing Cisco storage networks. You will learn about and implement a broad range of features on the Cisco MDS 9000 platform, including Virtual SANs (VSANs), PortChannels, advanced security features, SAN extension with FCIP, and iSCSI solutions. In the lab, you will configure the switch from an out-of-the-box state and install the Cisco Fabric Manager GUI management application. You will configure VSANs, zones, PortChannels, and FCIP to implement a high-availability extended SAN design. This course provides an introduction to the MDS 9000 family for pre-sales engineers, system engineers, network engineers, and technical decision makers who need to design and implement SAN fabrics using MDS 9000 Family switches. Enrollment is open to Cisco SEs, Cisco channel partners, and customers.

Recommended Prerequisites
You will gain the most from this course if you have experience working with storage and storage networking technologies.

Course Outline
This slide shows the lessons in this course.
Course Overview
• SCSI and Fibre Channel Primer
• Introduction to the MDS 9000 Platform
• Architecture and System Components
• The Multilayer SAN
• System Areas and Lab Overview
• Network-Based Storage Applications
• Optimizing Performance
• Securing the SAN Fabric
• Designing SAN Extension Solutions
• Building iSCSI Solutions
Cisco Certifications

Cisco Storage Networking Certification Path: enhance your Cisco certifications and validate your areas of expertise.

Cisco Storage Networking Support Specialist
• Prerequisite: valid CCNA certification
• Required exam: 642-354
• Recommended training through Cisco Learning Partners: MDS Configuration and Troubleshooting (MDSCT), Cisco Multiprotocol Storage Essentials (CMSE), Cisco Advanced Storage Implementation and Troubleshooting (CASI)

Cisco Storage Networking Design Specialist
• Prerequisite: valid CCNA certification
• Required exam: 642-353
• Recommended training through Cisco Learning Partners: Cisco MDS Storage Networking Fundamentals (CMSNF or CSDF), Cisco Storage Design Essentials (CSDE)
The Cisco Storage Networking Certification Program is part of the Cisco Career Certifications program. The title of Cisco Qualified Specialist (CQS) is awarded to individuals who demonstrate significant competency in a specific technology, solution area, or job role through the successful completion of one or more proctored exams. The CQS Storage Networking program consists of two parallel tracks:
The Cisco Storage Networking Support Specialist (CSNSS) track is for systems engineers, network engineers, and field engineers who install, configure, and troubleshoot Cisco storage networks.
The Cisco Storage Networking Design Specialist (CSNDS) track is for pre-sales systems and network engineers who design Cisco storage networks. IT managers and project managers will also benefit from this certification.
Cisco Certifications—CCIE Storage Networking
Cisco provides three levels of general certifications for IT professionals with several different tracks to meet individual needs. Cisco also provides focused certifications for designated areas such as cable communications and security. There are many paths to Cisco certification, but only one requirement: passing one or more exams demonstrating knowledge and skill. For details, go to www.cisco.com/go/certifications.

CCIE certification in Storage Networking indicates expert-level knowledge of intelligent storage solutions over extended network infrastructure using multiple transport options such as Fibre Channel, iSCSI, FCIP, and FICON. Storage networking extensions allow companies to improve disaster recovery, optimize performance, and take advantage of network services such as volume management, data replication, and enhanced integration with blade servers and storage appliances.

There are no formal prerequisites for CCIE certification. Other professional certifications and/or specific training courses are not required. Instead, candidates are expected to have an in-depth understanding of the subtleties, intricacies, and challenges of end-to-end storage area networking. You are strongly encouraged to have 3-5 years of job experience before attempting certification. To obtain your CCIE, you must first pass a written qualification exam and then a corresponding hands-on lab exam.
Administrative Information
Please silence your cell phones.

Learner Introductions
• Your name
• Your company
• Skills and knowledge
• Brief history
• Objective
Please introduce yourself.
Course Evaluations
www.fireflycom.net/evals
Please take time to complete the course evaluations after the class ends. Your feedback helps us continually improve the quality of our courses.
About Firefly

Technology Focus / Solutions Focus:
• Datacenter IP and Security / Integrated Data Center Solutions
• Content Networking and WAN Optimization / Core IP Services Provisioning
• Storage Networking / Multiprotocol SANs
• Business Continuance / Business Continuance—All Application Tiers
• Optical Networking / Application Optimization and Application Security

Services:
• Global Delivery
• Curriculum Development
• State-of-the-Art Remote Labs and E-Learning
• Needs Assessment
• Consultative Education
Firefly MDS 9000 Training

• Support Track: Cisco Storage Networking Support Specialist (CSNSS), for Systems Engineers, Technical Consultants, and Field Engineers
  – MDS Configuration and Troubleshooting—Extended Edition (MDSCT + FCIP)
  – Cisco Multiprotocol Storage Essentials (CMSE)
  – Cisco Advanced Storage Implementation and Troubleshooting (CASI)
  – Cisco Mainframe Storage Solutions (CMSS)

• Design Track: Cisco Storage Networking Design Specialist (CSNDS), for Systems Engineers, Technical Consultants, Storage Architects, and SAN Designers
  – Cisco Storage Design Fundamentals (CSDF)
  – Cisco Storage Design Essentials (CSDE)
  – Cisco Storage Design BootCamp (CSDF + CSDE)
Firefly CCIE-SAN Training

• Firefly CCIE Storage KickStart
  – Developed by Firefly
  – Intensive 2-week training program designed for SAN Systems Engineers, Architects, and Support Engineers
  – Also prepares students for the Cisco Storage Networking Support Specialist (CSNSS) certification exam
  – Includes the contents of the MDSCT, CMSE, CASI, and CMSS courses—usually 14 days
  – Taught only by senior professionals who have passed the CCIE Storage written exam

• Firefly CCIE Storage Lab BootCamp
  – Developed by Firefly
  – 5-day intensive hands-on experience designed for students who have already passed the CCIE Written Exam
  – Taught only by senior Firefly instructors who have achieved CCIE SAN
Lesson 1
SCSI and Fibre Channel Primer

Overview
This lesson provides a brief overview of the SCSI and Fibre Channel protocols.

Objectives
Upon completing this lesson, you will be able to explain the fundamentals of SCSI and Fibre Channel. This includes being able to meet these objectives:
Describe SCSI technology
Describe the operations of the SCSI protocol
Explain why FC is a data transport technology that is well-suited to storage networks
Explain the fundamental design of FC flow control
Describe the two addressing schemes used on Fibre Channel networks
Describe the session establishment protocols that are performed by N_Ports and F_Ports in a fabric topology
List the standard services provided by fabric switches as defined by the FC specification
SCSI Protocol Overview
The SCSI protocol defines how commands, status, and data blocks are exchanged between initiators and targets. SCSI is a Block I/O protocol. The Initiator always sends the command, reads or writes data blocks to the Target, and receives a final Response.

(Slide diagram: an Initiator containing an Application Client exchanges Requests and Responses with a Target containing a Device Server, Tasks, and LUNs, across a Delivery Subsystem such as Parallel SCSI, FCP, or IP.)
SCSI Protocol Overview
The Small Computer System Interface (SCSI) is a standard that evolved from a proprietary design by Shugart Associates in the 1970s called the SASI bus. SCSI performs the heavy lifting of passing commands, status, and block data between platforms and storage devices.

One function of operating systems is to hide the complexity of the computing environment from the end user. Management of system resources, including memory, peripheral devices, display, context switching between concurrent applications, and so on, is generally concealed behind the user interface. The internal operations of the OS must be robust, closely monitor changes of state, ensure that transactions are completed within the allowable time frames, and automatically initiate recovery or retries in the event of incomplete or failed procedures. For I/O operations involving peripheral devices such as disk, tape, optical storage, printers, and scanners, these functions are provided by the SCSI protocol, typically embedded in a device driver or in logic onboard a host adapter.

Because the SCSI protocol layer sits between the operating system and the peripheral resources, it has different functional components. Applications typically access data as files or records. Although these may ultimately be stored on disk or tape media in the form of data blocks, retrieval of the file requires a hierarchy of functions to assemble raw data blocks into a coherent file that can be manipulated by an application. SCSI architecture defines the relationship between initiators (hosts) and targets (for example, disks or tape) as a client/server exchange. The SCSI-3 application client resides in the host and represents the upper-layer application, file system, and operating system I/O requests. The SCSI-3 device server sits in the target device, responding to requests.
SCSI Parallel Technology
SCSI uses a parallel architecture in which data is sent simultaneously over multiple wires. SCSI is half-duplex: data travels in one direction at a time. On a parallel SCSI bus, a device must assume exclusive control over the bus in order to communicate. The SCSI Initiator then selects the SCSI Target and sends a Command to initiate a data transfer. At the end of the transfer, the device is de-selected and the bus is free.

• Parallel
• Half-duplex
• Shared bus
• Limited distance
SCSI Parallel Technology
The bus/target/LUN triad comes from parallel SCSI technology. The bus represents one of several potential SCSI interfaces installed in the host, each supporting a separate string of disks. The target represents a single disk controller on the string. The LUN designation allows for additional disks governed by a controller, for example a RAID device. The following are characteristics of parallel SCSI technology:
SCSI uses a parallel architecture in which data is sent simultaneously over multiple wires.
SCSI is half-duplex—data travels in one direction at a time.
On a SCSI bus, a device must assume exclusive control over the bus in order to communicate. (SCSI is sometimes referred to as a “simplex” channel because only one device can transmit at a time).
Multidrop Topology and Addressing

(Slide diagram: a SCSI Initiator (I/O adapter, SCSI ID 7) and several SCSI Targets (IDs 6, 5, ... 0), each presenting one or more LUNs, share a terminated data/address bus with separate clock/control signals.)

Address = Bus : Target ID : LUN

A SCSI Initiator addresses its SCSI Target using the SCSI Nexus: Bus : Target ID : LUN.
Multidrop Topology and Addressing
All of the devices on a SCSI bus are connected to a single cable. This is called a multidrop topology:
Data bits are sent in parallel on separate wires. Control signals are sent on a separate set of wires.
Only one device at a time can transmit—a transmitting device has exclusive use of the bus.
A special circuit called a terminator must be installed at the end of the cable. The cable must be terminated to prevent unwanted electrical effects from corrupting the signal.
A multidrop topology has inherent limitations:
Parallel transmission of data bits allows more data to be sent in a given time period but data bits may arrive early or late (skew) and lead to data errors.
The fact that control signals, such as clock signals, are sent on a separate set of wires also makes synchronization more difficult.
It is an inefficient way to use the available bandwidth, because only one communication session can exist at a time.
Termination circuits are built into most SCSI devices, but the administrator often has to set a jumper on the device to enable termination.
Incorrect cable termination can cause either a severe failure or intermittent, difficult-to-trace errors.
To achieve faster data transfer rates, vendors doubled the number of data lines on the cable from 8 (narrow SCSI) to 16 (wide SCSI).
Vendors have increased the clock rate, which increased the transfer rates, but this also increased the possibility of data errors due to skew or electrical interference.
Parallel SCSI is limited to a maximum cable length of 25m.
SCSI was designed to support a few devices at most, so its device addressing scheme is fairly simple—and not very flexible. SCSI devices use hard addressing:
Each device has a series of jumpers that determine the device’s physical address, or SCSI ID. The ID is software-configurable on some devices.
Each device must have a unique SCSI ID. Before adding a device to the cable, the administrator must know the ID of every other device connected to the cable and choose a unique ID for this new device.
The ID of each device determines its priority on the bus. For example, the SCSI Initiator with ID 7 always has a higher priority than the SCSI Target with ID 6. Because each device must have exclusive use of the bus while it is transmitting, ID 6 must wait until ID 7 has finished transmitting. Fixed priority makes it more difficult for administrators to control performance and quality-of-service.
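The sketch below (not from the course) simply models the Bus : Target ID : LUN nexus and the fixed-priority rule described above for a narrow 8-ID bus; the Nexus tuple and arbitration_order helper are illustrative names.

```python
# Minimal sketch: the SCSI nexus address and fixed bus priority (7 = highest
# on a narrow bus), as described in the text above.
from collections import namedtuple

Nexus = namedtuple("Nexus", ["bus", "target_id", "lun"])

def arbitration_order(device_ids):
    """Return device IDs in the order they win arbitration (highest ID first)."""
    return sorted(device_ids, reverse=True)

disk = Nexus(bus=0, target_id=5, lun=2)      # address = 0:5:2
print(disk)                                  # Nexus(bus=0, target_id=5, lun=2)
print(arbitration_order([3, 7, 5, 0]))       # [7, 5, 3, 0]
```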
SCSI-3 Architecture Model

(Slide diagram: the shared SCSI command sets (SCSI Block Commands, SCSI Streaming Commands, SCSI Enclosure Services, SCSI Media Commands, and SCSI Primary Commands) sit above interchangeable port drivers (SCSI Parallel, SCSI-FCP, iSCSI/IP, SAS, SBP-2), which in turn sit above the corresponding ports and adapters: SCSI parallel port and SCSI adapter, Fibre Channel port and FC HBA, Ethernet port and NIC, SAS serial port and SAS interface, and IEEE-1394 (FireWire) port.)

*SCSI-3: separation of physical interface, transport protocols, and SCSI command set.
The SCSI-3 Architecture Model
The SCSI-3 family of standards introduced several new variations of SCSI commands and protocols, including serial SCSI-3 and special command sets for the streaming and media handling required for tape. As shown in the diagram, the command layer is independent of the protocol layer, which is required to carry SCSI-3 commands between devices. This enables more flexibility in substituting different transports beneath the SCSI-3 command interface to the operating system. The SCSI Architecture Model (SAM) consists of four layers of functionality:

1. The physical interconnect layer specifies the characteristics of the physical SCSI link:
   – FC-PH is the physical interconnect specification for Fibre Channel.
   – Serial Storage Architecture (SSA) is a storage bus aimed primarily at the server market.
   – IEEE1394 is the FireWire specification.
   – SCSI Parallel Interface (SPI) is the specification used for parallel SCSI buses.
2. The transport protocol layer defines the protocols used for session management:
   – SCSI-FCP is the transport protocol specification for Fibre Channel.
   – Serial Storage Protocol (SSP) is the transport protocol used by SSA devices.
   – Serial Bus Protocol (SBP) is the transport protocol used by IEEE1394 devices.
3. The shared command set layer consists of command sets for accessing storage resources:
   – SCSI Primary Commands (SPC) are common to all SCSI devices.
   – SCSI Block Commands (SBC) are used with block-oriented devices, such as disks.
   – SCSI Stream Commands (SSC) are used with stream-oriented devices, such as tapes.
   – SCSI Media Changer Commands (SMC) are used to implement media changers, such as robotic tape libraries and CD-ROM carousels.
   – SCSI Enclosure Services (SES) defines commands used to monitor and manage SCSI device enclosures like RAID arrays, including fans, power, and temperature monitoring.

4. The SCSI Common Access Method (CAM) defines the SCSI device driver application programming interface (API).
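As a rough illustration of this layering, the sketch below maps each transport protocol to its physical interconnect while the shared command sets stay the same; the dictionary and function names are illustrative only, not part of any SAM specification.

```python
# Sketch of the SAM layering described above: the same command sets ride over
# any of the interchangeable transport/interconnect pairs.
SAM_STACK = {
    "SCSI-FCP": "FC-PH (Fibre Channel)",
    "SSP":      "SSA (Serial Storage Architecture)",
    "SBP":      "IEEE 1394 (FireWire)",
    "SPI":      "Parallel SCSI bus",
}
SHARED_COMMAND_SETS = ["SPC", "SBC", "SSC", "SMC", "SES"]  # common to every transport

def describe(transport):
    return f"{'/'.join(SHARED_COMMAND_SETS)} commands over {transport} over {SAM_STACK[transport]}"

print(describe("SCSI-FCP"))
# SPC/SBC/SSC/SMC/SES commands over SCSI-FCP over FC-PH (Fibre Channel)
```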
Serial SCSI-3 over Fibre Channel—SCSI-FCP

(Slide diagram: a SCSI Initiator (FC HBA, identified by its pWWN) connects over an FC link to a Fibre Channel fabric, which in turn connects to several SCSI Target ports (each with its own pWWN) presenting their LUNs. The SCSI payload is carried inside FC frames.)

SCSI Commands, Data, and Responses are carried in the payload of a frame from source to destination. In SCSI-FCP, the SCSI IDs are mapped to the unique worldwide name in each FC Port.
Serial SCSI-3 over Fibre Channel—SCSI-FCP
All of the devices are attached to the same fabric and connected via Fibre Channel links to one or more interconnected switches. The login and I/O sequence works as follows (a code sketch of the same flow appears after this list):
The SCSI Initiator and SCSI Target ports are zoned together within the Fibre Channel switch.
Each device logs in to the fabric (FLOGI) and registers itself with the Name Server in the switch.
The FC-HBA queries the Name Server and discovers other FC ports in the same zone as itself.
The FC HBA then logs in to each Target port (PLOGI) and they exchange Fibre Channel parameters.
The SCSI Initiator (SCSI-FCP driver) then logs in to the SCSI Target behind the FC Target port (PRLI) and establishes a communication channel between SCSI Initiator and SCSI Target.
The SCSI Initiator commences a SCSI operation by sending a SCSI Command Descriptor Block (CDB) down to the FC HBA with instructions to send it to a specific LUN behind a Target FC port (SCSI Target). The command is carried in the payload of the FC Frame to the target FC port.
The SCSI Target receives the CDB and acts upon it. Usually this would be a Read or Write command. Data is then carried in the payload of the FC Frame between SCSI Initiator and SCSI Target.
Finally, when the operation is complete, the SCSI Target will send a Response back to the SCSI Initiator in the payload of a FC Frame.
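The following self-contained sketch walks through the same sequence (FLOGI, Name Server registration and query, PLOGI/PRLI, then a CDB carried in a frame payload). The classes and method names are illustrative stand-ins for an HBA driver and a fabric, not a real API.

```python
# Hedged sketch of the SCSI-FCP login and I/O sequence described above.
class Fabric:
    def __init__(self, zoned_targets):
        self.zoned_targets, self.next_fcid = list(zoned_targets), 0x010001

    def assign_fcid(self, pwwn):                   # FLOGI response: switch assigns an FCID
        fcid, self.next_fcid = self.next_fcid, self.next_fcid + 1
        return fcid

    def name_server_query(self, pwwn):             # returns ports zoned with this HBA
        return list(self.zoned_targets)

class FcpInitiator:
    def __init__(self, pwwn):
        self.pwwn, self.fcid, self.sessions = pwwn, None, set()

    def flogi(self, fabric):                       # fabric login
        self.fcid = fabric.assign_fcid(self.pwwn)

    def discover(self, fabric):                    # Name Server registration + query
        return fabric.name_server_query(self.pwwn)

    def plogi_prli(self, target_pwwn):             # port login, then process login
        self.sessions.add(target_pwwn)

    def read(self, target_pwwn, lun, lba, blocks): # CDB rides in the FC frame payload
        assert target_pwwn in self.sessions
        return f"READ(10) to {target_pwwn} LUN {lun}: LBA {lba}, {blocks} blocks"

fabric = Fabric(zoned_targets=["21:00:00:aa:bb:cc:dd:01"])
hba = FcpInitiator("10:00:00:11:22:33:44:55")
hba.flogi(fabric)
target = hba.discover(fabric)[0]
hba.plogi_prli(target)
print(hba.read(target, lun=0, lba=4096, blocks=8))
```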
SCSI Operations
SCSI specifies three phases of operation:
• Command: send the required command and parameters via a Command Descriptor Block (CDB)
• Data: transfer data in accordance with the command
• Response: receive confirmation of command execution

(Slide diagram: ladder diagrams between Initiator and Target. Read: Command, Data, Response. Write: Command, Xfer-Rdy, Data, Response.)
Phases of Operation
Every communication between SCSI Initiator and SCSI Target is formed by sequences of events called bus phases. Each phase has a purpose and is linked to other phases to execute SCSI commands and transfer data and messages back and forth.

The majority of the SCSI protocol is controlled by the SCSI Initiator. The SCSI Target is usually passive and waits for a command. Only the SCSI Initiator can initiate a SCSI operation, by selecting a SCSI Target and sending a CDB (Command Descriptor Block) to it. If the CDB contains a Read command, the SCSI Target moves its heads into position and retrieves the data from its disk sectors. This data is returned to the SCSI Initiator. If the CDB contains a Write command, the SCSI Target prepares its buffers and returns an Xfer-Rdy. When the SCSI Initiator receives the Xfer-Rdy, it can commence writing data. Finally, when the operation is complete, the SCSI Target returns a Response to indicate a successful (or unsuccessful) data transfer.
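A minimal sketch of the three phases, using the slide's terms; the function simply returns the ordered list of transfers for a read or a write and is illustrative only.

```python
# Sketch: the ordered Command / (Xfer-Rdy) / Data / Response flow for reads
# and writes, as described in the text above.
def scsi_phases(command, blocks):
    if command == "READ":
        # target retrieves data, streams it back, then sends the final Response
        return ["Command (Read)"] + ["Data"] * blocks + ["Response"]
    if command == "WRITE":
        # target signals Xfer-Rdy when its buffers are ready, then accepts data
        return ["Command (Write)", "Xfer-Rdy"] + ["Data"] * blocks + ["Response"]
    raise ValueError(command)

print(scsi_phases("WRITE", blocks=2))
# ['Command (Write)', 'Xfer-Rdy', 'Data', 'Data', 'Response']
```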
SCSI Command Descriptor Blocks

SCSI Command Descriptor Block (CDB), 10-byte form:

  Byte 0     Operation Code (Group Code + Command Code)   (first byte: Operation Code)
  Byte 1     Reserved / Service Action
  Bytes 2-5  Logical Block Address (MSB ... LSB)          (transfer data starting at this LBA)
  Byte 6     Reserved
  Bytes 7-8  Transfer Length (MSB, LSB)                   (number of SCSI blocks to be transferred)
  Byte 9     Control                                      (last byte: Control Byte)

A Command is executed by the Initiator sending a CDB to a Target. In serial SCSI-3, the CDB is carried in the payload of the Command Frame.
SCSI Command Descriptor Blocks
SCSI commands are built from a common structure:

Operation Code byte

"N" bytes of parameters

Control byte

The Operation Code consists of a Group Code and a Command Code:
Group Code establishes the total command length.
Command Code establishes the command function.
The number of Bytes of parameters ("N") can be determined from the Operation Code byte which is located in byte 0 of the Command Descriptor Block (CDB). The Control Byte, which is located in the last byte of a Command Descriptor Block, contains control bits that define the behavior of the command. The Logical Block Address is an absolute address of where the first block should be written (or read) on the disk. LBA 00 is the first sector on the disk volume or LUN, LBA 01 is the second sector and so on, until we reach the last sector of the disk volume or LUN. When the CDB is sent to a block device (Disk), blocks are always 512 Bytes long. The Transfer Length contains the number of 512 Byte blocks to be transferred. When the CDB is sent to a streaming device (Tape), the block length is negotiated. The Transfer Length contains the number of blocks to be transferred.
CDBs can be different sizes (6-byte, 10-byte, 12-byte, 16-byte, and so on) to accommodate larger disk volumes or transfer lengths. 10-byte CDBs are common.
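As a sketch of the 10-byte layout in the table above, the following packs a READ(10) CDB (opcode 0x28) with a 4-byte big-endian LBA and a 2-byte transfer length; the flag, group, and control bytes are left at zero for simplicity.

```python
# Hedged sketch: building a 10-byte READ(10) CDB with the field layout shown above.
import struct

READ_10 = 0x28

def build_read10_cdb(lba, num_blocks):
    # ">" = big-endian; B = 1-byte field, I = 4-byte LBA, H = 2-byte transfer length
    return struct.pack(">BBIBHB", READ_10, 0x00, lba, 0x00, num_blocks, 0x00)

cdb = build_read10_cdb(lba=0x00001000, num_blocks=8)
print(len(cdb), cdb.hex())   # 10 28000000100000000800
```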
SCSI Commands
• SCSI supports several specific commands for each media type, and primary commands that all devices understand.
• The following commands are of particular interest:
  – REPORT LUNS: How many LUNs do you have?
  – INQUIRY: What device are you?
  – TEST UNIT READY: Is the LUN available?
  – READ CAPACITY: What size is each LUN?
SCSI Commands
REPORT LUNS is used by operating systems to discover the LUNs attached to a particular hardware address. It is typically sent by the Initiator to LUN 0. INQUIRY is used by the operating system to determine the capabilities of each LUN that was discovered with REPORT LUNS. TEST UNIT READY is used to check the condition of a particular LUN. READ CAPACITY is sent to each LUN in turn to obtain the size of each LUN.
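A hedged sketch of the discovery flow these commands enable; send_scsi is a hypothetical transport hook rather than a real driver call, and the opcodes shown are the standard ones for these commands (REPORT LUNS 0xA0, INQUIRY 0x12, TEST UNIT READY 0x00, READ CAPACITY(10) 0x25).

```python
# Illustrative discovery loop: enumerate LUNs, then query each one.
def discover_target(send_scsi):
    luns = send_scsi(opcode=0xA0, lun=0)            # REPORT LUNS: how many LUNs?
    inventory = {}
    for lun in luns:
        info = send_scsi(opcode=0x12, lun=lun)      # INQUIRY: what device are you?
        ready = send_scsi(opcode=0x00, lun=lun)     # TEST UNIT READY: available?
        size = send_scsi(opcode=0x25, lun=lun)      # READ CAPACITY: what size?
        inventory[lun] = (info, ready, size)
    return inventory

def fake_send(opcode, lun):                         # stand-in transport for the demo
    return {0xA0: [0, 1], 0x12: "disk", 0x00: True, 0x25: "64 GiB"}[opcode]

print(discover_target(fake_send))
```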
Building an I/O Request

1. Application makes File I/O request to Volume Manager
2. Volume Manager maps volume to SCSI ID & Target LUN
3. File System maps Files to Blocks, makes Block I/O request
4. Command, LBA, Block count and LUN sent to SCSI driver
5. SCSI driver creates CDB (Command Descriptor Block)
6. FC driver creates command frame with CDB in payload
7. FC driver sends command frame to Target LUN and awaits response

(Slide diagram: the server stack (Application, Volume Manager, File System, SCSI Driver, FC Driver) builds the CDB and places it in the payload of an FC frame (SOF, Header, Payload, CRC, EOF); Read and Write ladder diagrams show the Command, Xfer-Rdy, Data, and Response exchanges.)
Building an I/O Request
This slide explains the process of the Initiator talking to the Target:
1. Application makes File I/O request to Volume Manager
2. Volume Manager maps volume to SCSI ID & Target LUN
3. File System maps Files to Blocks, makes Block I/O request
4. Command, LBA, Block count and LUN sent to SCSI driver
5. SCSI driver creates CDB (Command Descriptor Block)
6. FC driver creates command frame with CDB in payload
7. FC driver sends command frame to Target LUN and awaits response
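The arithmetic behind steps 2-4 can be sketched as follows: a byte-level file request is turned into a starting LBA and block count. The 512-byte block size matches the text; the partition offset is an illustrative assumption.

```python
# Sketch: mapping a byte-oriented I/O request onto 512-byte blocks.
BLOCK_SIZE = 512

def block_io_request(file_offset_bytes, length_bytes, partition_offset_lba=2048):
    first_block = file_offset_bytes // BLOCK_SIZE
    last_block = (file_offset_bytes + length_bytes - 1) // BLOCK_SIZE
    lba = partition_offset_lba + first_block          # absolute LBA on the LUN
    block_count = last_block - first_block + 1        # Transfer Length for the CDB
    return lba, block_count

print(block_io_request(file_offset_bytes=10_000, length_bytes=4_096))
# (2067, 9)  -> READ/WRITE starting at LBA 2067 for 9 blocks
```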
Fibre Channel Overview
Fibre Channel is a protocol used for efficiently transporting data between devices connected to the same fabric. Fibre Channel provides reliable and efficient data delivery with high throughput and low latency. Fibre Channel is the transport technology most commonly used for SANs today.

(Slide diagram: servers with FC HBAs and storage devices connected to a Fibre Channel fabric, alongside an IP network.)
Fibre Channel Overview
FC is a protocol used for efficiently transporting data between devices in the same fabric. It is the network interconnect technology that is most commonly used for SANs today.

Traditional storage technologies, such as SCSI, are designed for controlled, local environments. They support few devices and only short distances, but they deliver data quickly and reliably. Traditional data network technologies, such as Ethernet, are designed for chaotic, distributed environments. They support many devices and long distances, but delivery of data can be delayed (latency).

FC combines the best of both worlds. It supports many devices and longer distances, and it provides reliable and efficient data delivery with high throughput and low latency. Like SCSI, Fibre Channel is a Block I/O protocol, delivering data blocks (usually 512 bytes long) between devices in the same fabric. In the diagram, the network on the right is an FC SAN: servers and storage devices connected by an FC network.
Fibre Channel Topologies

(Slide diagram: three topologies side by side: Point-to-Point, Arbitrated Loop, and Switched Fabric.)

• Arbitrated Loop provides shared bandwidth at low cost.
• Switched Fabric provides aggregate bandwidth and scalability but requires complex FC switches, which increase the cost.
• Most SANs today use the Switched Fabric topology.
Fibre Channel Topologies
Fibre Channel Protocol includes three basic SAN topologies:
Point-to-Point
Exactly two FC ports connected together. Both devices have exclusive access to the full link bandwidth
Arbitrated Loop
Up to 126 FC ports connected together on a Private Loop (not connected to a FC Switch)
Up to 127 FC ports connected together on a Public Loop (connected via a FL port on a FC Switch)
All devices share the available bandwidth around the loop, so in practice a loop is limited to about 20 devices.
A device that wishes to communicate with another device must do the following operations.
1. Arbitrate to gain control of the Loop
2. Open the port it wishes to communicate with
3. Send or Receive Data frames
4. Close the port
5. Release the loop, ready for the next transfer.

– Usually only two devices communicate at a time; the other FC ports on the loop are passive.
– When the loop is broken or a device is added or removed, the downstream FC port sends thousands of LIP primitive sequences to inform the other loop devices that the loop has been broken.
– The LIP (Loop Initialization Procedure) is used to assign (or re-assign) Arbitrated Loop Physical Addresses (ALPAs) to each FC Port on the loop. This operation is disruptive and frames may be lost during this phase. Nowadays, most users would connect FC-AL devices via an FC hub to minimize disruption.
Switched Fabric
24
The topology of choice for FC SANs. Each connected device has access to full bandwidth on its link through the switch port it is connected to.
The FC SAN can be expanded by adding more switches and increasing the number of ports for connected devices.
The FC 24-bit addressing scheme allows for roughly 16.7 million addresses in theory (see the arithmetic sketch after this list). A realistic number is a few thousand, because there can be a maximum of 239 switches in a single fabric and most switches today have a small number of ports each.
Each FC switch must provide ‘services’ for management of the SAN. These services include a Name Server, Domain Manager, FSPF Topology Database, Zoning Server, Time Server etc.
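For reference, the rough capacity arithmetic behind the addressing claim above (a sketch using the usual Domain/Area/Port split of the 24-bit FCID and the 239-domain cap mentioned in the text):

```python
# Rough capacity arithmetic for the 24-bit FC address.
total_addresses = 2 ** 24          # 16,777,216 raw FCID values
usable = 239 * 256 * 256           # 239 domains x 256 areas x 256 ports
print(f"{total_addresses:,} raw, about {usable:,} usable")
# 16,777,216 raw, about 15,663,104 usable
```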
Fibre Channel Switched Fabric Topology

(Slide diagram: hosts with FC HBAs and storage arrays attached to two FC switches that are connected by an Inter Switch Link (ISL).)

A Fabric contains one or more switches, connected together via Inter Switch Links (ISLs).
Fibre Channel Switched Fabric Topology
The Switched Fabric topology incorporates one or more high-bandwidth FC switches to handle data traffic among host and storage devices.

Each switch is assigned a unique ID called a Domain. There can be a maximum of 239 switch domains in a fabric; however, McDATA imposes a 32-domain limit in its designs.
FC Switches are connected together via Inter-Switch Links (ISLs).
Each device is exclusively connected to its FC port on the switch via a bi-directional Full Duplex link.
All connected devices share the same addressing space within the fabric and can potentially communicate with each other.
Frames flow from device to device via one or more FC switches. Each time a frame moves from switch to switch, this is called a hop. McDATA imposes a 3-hop limit in its designs, Brocade imposes a 7-hop limit, and Cisco imposes a 10-hop limit.
Fibre Channel Ports
Ports are intelligent interface points on the Fibre Channel network:
• Embedded in an FC Host Bus Adapter (HBA)
• Embedded in a fabric switch
• Embedded in a storage array or tape controller
A Link connects exactly two Ports together. FC Ports are assigned a dynamic FCID.

(Slide diagram: a server node with an FC HBA, a switch, and storage nodes (array controller, tape device), with a port at each end of every link.)
Fibre Channel Ports
In data networking terminology, ports are often thought of as just physical interfaces where you plug in the cable. In FC, however, ports are intelligent interfaces, responsible for actively performing critical network functions. The preceding graphic contains several ports. There are ports in the host I/O adapter (host bus adapter [HBA]), ports in the switch, and ports in the storage devices. FC terminology differentiates between several different types of ports, each of which performs a specific role on the SAN. You will encounter these terms often as you continue to learn about FC, so it is important that you learn to recognize the different port types. In addition to the common ports defined for FC, Cisco has developed some proprietary port types. Fibre Channel ports are assigned a unique address, a Fibre Channel ID (FCID), at login time.
Standard Fibre Channel Ports

Valid combinations:
• NL – NL
• NL – FL
• N – N
• N – F
• E – E
• E – B

(Slide diagram: hosts and storage arrays attach NL_Ports to an FC-AL hub or an FL_Port, N_Ports connect to F_Ports on switches, switches connect to each other via E_Ports over Inter Switch Links, and an E_Port connects to a B_Port on a WAN bridge.)
Standard Fibre Channel Ports
An N_Port (Node Port) is a port on a node that connects to a fabric. I/O adapters and array controllers contain one or more N_Ports. N_Ports can also directly connect two nodes in a point-to-point topology
An F_Port (Fabric Port) is a port on a switch that connects to an N_Port.
An E_Port (Expansion Port) is a port on a switch that connects to an E_Port on another switch.
An FL_Port (Fabric Loop Port) is a port on a switch that connects to an arbitrated loop. Logically, an FL_Port is considered part of both the fabric and the loop. FL_Ports are always physically located on the switch. Note that FC hubs, although they obviously have physical interfaces, do not contain FC ports. Hubs are basically just passive signal splitters and amplifiers. They do not actively participate in the operation of the network. On an arbitrated loop, the node ports manage all FC operations. Not all switches support FL_Port operation. For example, some McDATA switches do not support FL_Port operation.
An NL_Port (Node Loop Port) is a port on a node that connects to another port in an arbitrated loop topology. There are two types of NL_Ports: Private NL_Ports can communicate only with other loop ports; public NL_Ports can communicate with other loop ports and with N_Ports on an attached fabric. Note that the term L_Port (Loop Port) is sometimes used to refer to any port on an arbitrated loop topology. “L_Port” can mean either “FL_Port” or “NL_Port”. In reality, there is no such thing as an L_Port.
Nowadays, most ports are universal: they sense the type of port at the other end of the link and adopt the correct valid port type automatically. However, it is good practice to lock down the port type to its correct function. (A small sketch of the valid port pairings follows.)
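The valid pairings from the slide can be captured in a small lookup; this is purely illustrative and is not an exhaustive statement of the FC standard.

```python
# Sketch of the valid port combinations listed above (NL-NL, NL-FL, N-N, N-F, E-E, E-B).
VALID_LINKS = {frozenset(p) for p in
               [("NL", "NL"), ("NL", "FL"), ("N", "N"), ("N", "F"), ("E", "E"), ("E", "B")]}

def link_ok(port_a, port_b):
    return frozenset((port_a, port_b)) in VALID_LINKS

print(link_ok("N", "F"), link_ok("N", "E"))   # True False
```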
Fibre Channel Frame

Frame header (4 bytes wide x 6 words = 24 bytes):
  Word 0: R_CTL | Destination Address D_ID      (D_ID = where the frame is going to)
  Word 1: CS_CTL | Source Address S_ID          (S_ID = where the frame is coming from)
  Word 2: TYPE | Frame Control F_CTL            (TYPE = payload protocol)
  Word 3: SEQ_ID | DF_CTL | SEQ_CNT             (SEQ_CNT and SEQ_ID = sequence IDs)
  Word 4: OX_ID | RX_ID                         (OX_ID and RX_ID = exchange IDs)
  Word 5: Parameter Field

Frame layout (bytes):
  SOF (4) | Header (24) | Data Field, 0-2112: Optional Headers (0-64) + Payload (0-2048) + Fill Bytes (0-3) | CRC (4) | EOF (4)
Fibre Channel Frames
The maximum total length of an FC frame is 2148 bytes, or 537 words (a word = 4 bytes).
A 4-byte SOF (Start of Frame) delimiter
A 24-byte header
A data payload that can vary from 0 to 2112 bytes. Typically 2048 Bytes for SCSI-FCP.
A 4-byte CRC (Cyclic Redundancy Check) that is used to detect bit-level errors in the header or payload
A 4-byte EOF (End of Frame) delimiter
The Header contains fields used for identifying and routing the frame across the fabric.
R_CTL: Routing Control field defines the frame’s function.
D_ID: Destination Address. The FCID of the FC Port where the frame is being sent to.
CS_CTL: Class Specific Control field. Only used for Class 1 and 4.
S_ID: Source Address. The FCID of the FC Port where the frame has come from.
TYPE: The Upper Layer Protocol Data type contained in the payload. This is hex ’08’ for SCSI-FCP.
F_CTL: Frame Control field contains miscellaneous control information regarding the frame, including how many fill bytes there are (0-3).
SEQ_ID: Sequence ID. The unique identifying number of the Sequence within the Exchange.
DF_CTL: Data Field Control. This field defines the use of the Optional Headers. SCSI-FCP does not use Optional Headers.
SEQ_CNT: Sequence Count. The number of the frame within a sequence. The first frame is hex ’00’
OX_ID: Originating Exchange ID. A Unique identifying number provided by the source FC Port.
RX_ID: Responding Exchange ID. A Unique identifying number provided by the destination FC Port. OX_ID and RX_ID together define the Exchange ID.
PARMS: Parameter Field. Usually provides a relative offset into the ULP data buffer.
The frame payload consists of 3 elements:
The payload itself, containing data or commands, is variable and can be up to 2112 bytes.
The first 64 bytes of the payload can be used to incorporate optional headers. This would reduce the data payload size to 2048 bytes (2KB). SCSI-FCP usually carries multiples of 512 Byte blocks.
The payload ends with 0-3 fill bytes. This is necessary because the smallest unit of data recognized by FC is a 4-byte word. However, the ULP is not aware of this FC requirement, and the data payload for a frame might not end on a word boundary. FC therefore adds up to 3 fill bytes to the end of the payload—as many as are needed to ensure that the payload ends on a word boundary.
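A sketch of the 24-byte header layout described above, packed with Python's struct module; the field values here are arbitrary examples (TYPE 0x08 for SCSI-FCP comes from the text), not captured traffic.

```python
# Hedged sketch: packing the 24-byte FC frame header with the standard field
# widths (the two addresses are 3 bytes each).
import struct

HEADER_FMT = ">B3sB3sB3sBBHHHI"   # 1+3+1+3+1+3+1+1+2+2+2+4 = 24 bytes

def pack_header(d_id, s_id, seq_id, seq_cnt, ox_id, rx_id):
    return struct.pack(HEADER_FMT,
                       0x06,            # R_CTL: example value (unsolicited command)
                       d_id, 0x00,      # D_ID (3 bytes), CS_CTL
                       s_id, 0x08,      # S_ID (3 bytes), TYPE 0x08 = SCSI-FCP
                       b"\x29\x00\x00", # F_CTL (3 bytes, example value)
                       seq_id, 0x00,    # SEQ_ID, DF_CTL
                       seq_cnt, ox_id, rx_id,
                       0x00000000)      # Parameter field

hdr = pack_header(d_id=b"\x01\x02\x03", s_id=b"\x0a\x0b\x0c",
                  seq_id=0, seq_cnt=0, ox_id=0x1234, rx_id=0xFFFF)
print(len(hdr))   # 24
```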
Fibre Channel Data Constructs
• Words are 4 bytes long: the smallest unit of transfer in Fibre Channel. Frames are made up from several Words.
• Sequences are uni-directional and contain one or more frames.
• Exchanges are bi-directional and contain three or more sequences.

(Slide diagram: a Read exchange between an Initiator and a Target made up of three sequences: Command, Data, and Response.)
Fibre Channel Data Constructs
The preceding graphic shows a transaction between a host (Initiator) and a storage device (Target):
The smallest unit of data is a word. Words consist of 32 bits (4 bytes) of data that are encoded into a 40-bit form by the 8b/10b encoding process.
Words are packaged into frames. An FC frame is equivalent to an IP packet.
A sequence is a series of frames sent from one node to another node. Sequences are unidirectional—in other words, a sequence is a set of frames that are issued by one node.
An exchange is a series of sequences sent between two nodes. The exchange is the mechanism used by two ports to identify and manage a discrete transaction. The exchange defines an entire transaction, such as a SCSI read or write request. An exchange is opened whenever a transaction is started between two ports and is closed when the transaction ends. An FC exchange is equivalent to a TCP session.
Fibre Channel Flow Control
How data interchange is controlled in a network. The flow control strategy used by Ethernet and other data networks can degrade performance:
• The transmitter does not stop transmitting packets until after the receiver's buffers overflow and packets are already lost.
• Lost packets must be retransmitted.
• Degradation can be severe under heavy traffic loads.

(Slide diagram: flow control in Ethernet. Tx keeps sending data to Rx; Rx sends PAUSE only after its buffers overflow, so packets are lost.)
Flow control is a mechanism for ensuring that frames are sent only when there is somewhere for them to go. Just as traffic lights are used to control the flow of traffic in cities, flow control manages the data flow in an FC fabric. Some data networks, such as Ethernet, use a flow-control strategy that can result in degraded performance:
A transmitting port (Tx) can begin sending data packets at any time.
When the receiving port’s (Rx) buffers are completely filled and cannot accept any more packets, Rx “tells” Tx to stop or slow the flow of data.
After Rx has processed some data and has some buffers available to accept more packets, it “tells” Tx to resume sending data.
This strategy results in lost packets when the receiving port is overloaded, because the receiving port tells the transmitting port to stop sending data after it has already “overflowed”. Lost packets must be retransmitted, which degrades performance. Performance degradation can become severe under heavy traffic loads.
What is Credit-Based Flow Control?
Fibre Channel uses a credit-based strategy: when the receiver is ready to accept a frame, it sends a credit to the transmitter, giving permission for the transmitter to send a frame. The receiver is always in control.

Benefits:
• Prevents loss of frames due to buffer overflow
• Maximizes link throughput and performance under high loads

Disadvantages:
• Long-distance links require many more credits

(Slide diagram: flow control in Fibre Channel. Rx advertises its free buffers with READY signals; Tx sends DATA only when it holds a credit.)
What is Credit-Based Flow Control?
To improve performance under high traffic loads, FC uses a credit-based flow control strategy: the receiver must issue a credit for each frame before the transmitter can send that frame.
The transmitting port (Tx) counts the number of free buffers at the receiving port (Rx).
Before Tx can send a frame, Rx must notify Tx that Rx has a free buffer and is ready to accept a frame. When Tx receives the notification (called a credit), it increments its count of the number of free buffers at Rx.
Tx only sends frames when it knows that Rx can accept them.
When Tx sends a frame, it decrements the credit count.

When the credit count falls to zero, Tx must stop sending frames and wait for another credit. (A small sketch of this bookkeeping follows.)
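A minimal sketch of the BB_Credit bookkeeping just described: the transmitter decrements its count for every frame sent, stops at zero, and increments when an R_RDY arrives. The class name is illustrative.

```python
# Sketch of the transmitter-side credit counter described above.
class BBCreditCounter:
    def __init__(self, credits):
        self.credits = credits           # free buffers we believe Rx has

    def can_send(self):
        return self.credits > 0

    def frame_sent(self):                # one frame on the wire
        assert self.can_send(), "must wait for R_RDY"
        self.credits -= 1

    def r_rdy_received(self):            # Rx freed a buffer
        self.credits += 1

tx = BBCreditCounter(credits=2)
tx.frame_sent(); tx.frame_sent()
print(tx.can_send())      # False: wait for R_RDY
tx.r_rdy_received()
print(tx.can_send())      # True
```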
Types of Flow Control
Fibre Channel defines two types of flow control:
• Buffer-to-buffer (R_RDY): port-to-port across every link
• End-to-end (ACK): between Source and Destination Ports

(Slide diagram: an N_Port – F_Port – E_Port – E_Port – F_Port – N_Port path from source to destination. Buffer-to-buffer flow control runs on each link; end-to-end flow control runs between the source and destination N_Ports.)
Types of Flow Control
FC defines two types of flow control:
Buffer-to-buffer flow control takes place between two ports that are connected by a FC link, such as an N_Port and an F_Port, or two E_Ports, or two NL_Ports.
The receiving port at the other end of the link sends a primitive signal (4 Bytes) called a R_RDY (Receiver Ready) to the transmitting port.
End-to-end flow control takes place between the source port and the destination port. Whenever the receiving port receives a frame it acknowledges that frame with an ACK frame (36 Bytes).
Note that buffer-to-buffer flow control is performed between E_Ports in the fabric, but it is not performed between the incoming and outgoing ports in a given switch. In other words, FC buffer-to-buffer flow control is not used between two F_Ports or between an F_Port and an E_Port within a switch. FC standards do not define how switches route frames across the switch. Buffer-to-buffer flow control is used in the following situations:
34
Class 1 connection request frames use buffer-to-buffer flow control, but Class 1 data traffic uses only end-to-end flow control.
Class 2 and Class 3 frames always use buffer-to-buffer flow control.
Class F service uses buffer-to-buffer flow control.
In an Arbitrated Loop, every communication session is a virtual dedicated point-to-point circuit between a source port and destination port. Therefore, there is little difference between buffer-to-buffer and end-to-end flow control. Buffer-to-buffer flow control alone is generally sufficient for arbitrated loop topologies.
End-to-end flow control is used in the following situations:
Classes 1, 2, 4, and 6 use end-to-end flow control.
Class 2 service uses both buffer-to-buffer and end-to-end flow control.
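As a consolidated summary of the statements above, the following illustrative Python lookup table maps each class of service to the flow-control mechanisms it uses; it simply restates the course material in code form.

# Which flow-control types apply to each class of service, per the notes above.
FLOW_CONTROL = {
    "Class 1 (connect request)": {"buffer-to-buffer"},
    "Class 1 (data)":            {"end-to-end"},
    "Class 2":                   {"buffer-to-buffer", "end-to-end"},
    "Class 3":                   {"buffer-to-buffer"},
    "Class 4":                   {"end-to-end"},
    "Class 6":                   {"end-to-end"},
    "Class F":                   {"buffer-to-buffer"},
}

for service, mechanisms in FLOW_CONTROL.items():
    print("{:28s} -> {}".format(service, ", ".join(sorted(mechanisms))))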
Buffer-to-Buffer and End-to-End Flow Control
(Diagram: N_Port A sends data through the fabric to N_Port B. R_RDY signals provide buffer-to-buffer flow control on each link, and an ACK frame returned from N_Port B to N_Port A provides end-to-end flow control; the numbered steps are described below.)
Buffer-to-Buffer Flow Control and End-to-End Flow Control
The preceding diagram illustrates buffer-to-buffer flow control in Class 3:
1. Before N_Port A can transmit a frame, it must receive the primitive signal R_RDY from its attached F_Port. The R_RDY signal tells N_Port A that its F_Port has a free buffer.
2. When it receives the R_RDY signal, N_Port A transmits a frame.
3. The frame is passed through the fabric. Buffer-to-buffer flow control is performed between every pair of E_Ports, although this is not shown here.
4. At the other side of the fabric, the destination F_Port must wait for an R_RDY signal from N_Port B.
5. When N_Port B sends an R_RDY, the F_Port transmits the data frame.
End-to-end flow control is designed to overcome the limitations of buffer-to-buffer flow control. The preceding diagram illustrates end-to-end flow control in Class 2:
1. Standard buffer-to-buffer flow control is performed for each data frame.
2. After the destination N_Port B receives a frame, it waits for an R_RDY from the F_Port.
3. When N_Port B receives an R_RDY, it sends an acknowledgement (ACK) frame back to N_Port A.
4. At the other side of the fabric, the initiator F_Port must wait for an R_RDY signal from N_Port A.
5. When N_Port A sends an R_RDY, the F_Port transmits the ACK frame.
End-to-end flow control involves only the port at which a frame originates and the ultimate destination port, regardless of how many FC switches are in the data path. When end-to-end flow control is used, the transmitting port is responsible for ensuring that all frames are delivered. Only when the transmitting N_Port receives the last ACK frame in response to a sequence of frames sent does it know that all frames have been delivered correctly, and only then will it empty its ULP data buffers. If a returning ACK indicates that the receiving port has detected an error, the transmitting N_Port has access to the ULP data buffers and can resend all of the frames in the sequence.
Allocating Buffer Credits
Credits = (Round_Trip_Time + Processing_Time) / Serialization_Time
Frame serialization time at 2Gb: link rate of 2.125 Gbps = 4.7 ns/byte; frame size = 2048 data + 36 header + 24 IDLE = 2108 bytes; frame serialization time = 2108 x 4.7 ns = 9.9 µs ≈ 10 µs per frame
(Diagram: a 10 µs frame in flight on a 10 km link between the initiator and target N_Ports.)
Allocating Buffer Credits You can calculate the number of credits required on a link to maintain optimal performance using the following formula: Credits = (Round_Trip_Time + Processing_Time) / Serialization_Time
Example This diagram and the following two diagrams illustrate how the required number of BB_Credits is calculated for a 10km, 2Gbps FC link:
At a link rate of 2.125 Gbps, the time required to serialize (transmit) each byte is 4.7ns. (Note that each byte is 10 bits due to 8b/10b encoding.)
The maximum SCSI-FCP Fibre Channel payload size is 2048 bytes, because SCSI usually transfers multiple SCSI blocks of 512 Bytes each.
The payload size used in an actual customer environment would be based on the I/O characteristics of the customer’s applications.
You also need to account for the frame overheads. These are:
SOF (Start of Frame): 4 bytes
FC header: 24 bytes
CRC: 4 bytes
EOF (End of Frame): 4 bytes
IDLEs between frames (usually 6 IDLEs): 24 bytes
This gives a total of 2108 bytes.
The total serialization time at 2Gbps for a 2108-byte frame (including idles) is 9.9µs, or approximately 10µs.
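A minimal Python version of this serialization-time arithmetic, using the figures quoted above (the variable names are illustrative only):

LINK_RATE_GBPS = 2.125                    # 2G FC signalling rate
ns_per_byte = 10 / LINK_RATE_GBPS         # 10 bits per byte after 8b/10b encoding
frame_bytes = 2048 + 4 + 24 + 4 + 4 + 24  # payload + SOF + header + CRC + EOF + 6 IDLEs

serialization_us = frame_bytes * ns_per_byte / 1000
print("{:.1f} ns/byte, {:.1f} µs per frame".format(ns_per_byte, serialization_us))
# -> 4.7 ns/byte, about 9.9 µs per frame (≈ 10 µs)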
Allocating Buffer Credits
Propagation delay: speed of light in fiber ≈ 5 µs/km, so the time to transmit a frame across 10 km ≈ 50 µs
Processing time: assume the same as the de-serialization time ≈ 10 µs
Response time: time to transmit the R_RDY back across 10 km ≈ 50 µs
Total latency ≈ 50 µs + 10 µs + 50 µs = 110 µs
(Diagram: a frame travelling 10 km from the initiator N_Port to the target N_Port, with the R_RDY returning across the same link.)
The speed of light in a fiber optic cable is approximately 5ns per metre or 5µs per kilometer, so each frame will take about 50µs to travel across the 10Km link.
The receiving port must then process the frame, free a buffer, and generate an R_RDY.
This processing time can vary—for example, if the receiver ULP driver is busy, the frame might not be processed immediately.
In this case, we can assume that the receiving port will process the frame immediately, so the processing time is equal to the time it takes to de-serialize the frame.
Assume that the de-serialization time is equal to the serialization time: 10µs
The receiving port then transmits a credit (R_RDY) back across the link. This response takes another 50µs to reach the transmitter.
The total latency on the link is equal to the frame serialization time plus the round-trip time across the link, or about 110µs.
Allocating Buffer Credits (cont.)
Given a frame serialization time ≈ 10 µs and a total latency ≈ 110 µs, there could be up to 5 frames on the link at one time, 1 being processed, and 5 credits being returned. Therefore, at 2Gbps we require approximately 10 credits to maintain full bandwidth.
A good rule of thumb: at 2Gbps with a 2KB payload, you need approximately 1 credit per km.
(Diagram: a 10 km link with several frames in flight toward the target N_Port and several R_RDY credits returning to the initiator N_Port.)
Given a frame serialization time of 10µs, and a total round-trip latency of 110µs, there could be up to 5 frames on the link at one time plus one being received and processed by the receiving port. In addition, 5 credits are being returned to the transmitting port down the other side of the link.
In other words, ignoring the de-serialization time, approximately 10 buffer-to-buffer credits are required to make full use of the bandwidth of the 10km link at 2Gbps with 2KB frames.
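Putting the figures together, a small Python calculation of the credit formula for this 10 km, 2 Gbps example; the values are the course's estimates and the variable names are illustrative.

serialization_us = 10          # time to transmit one 2108-byte frame at 2 Gbps
propagation_us   = 5 * 10      # ~5 µs/km over a 10 km link, one way
processing_us    = 10          # assumed equal to the de-serialization time

round_trip_us = 2 * propagation_us                  # frame out + R_RDY back = 100 µs
credits = (round_trip_us + processing_us) / serialization_us
print("credits ≈ {:.0f}".format(credits))           # 11 with processing time included
print("ignoring processing ≈ {:.0f}".format(round_trip_us / serialization_us))
# -> 10, the figure used above, i.e. roughly 1 credit per km at 2 Gbps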
Fibre Channel Addressing
World-Wide Names
Every Fibre Channel port and node has a hard-coded address called a World Wide Name (WWN):
Allocated to the manufacturer by the IEEE
Coded into each device when manufactured
64 or 128 bits (64 bits is the most common today)
The switch Name Server maps WWNs to FC addresses (FCIDs).
World-Wide Node Names (nWWNs) uniquely identify devices (nodes); World-Wide Port Names (pWWNs) uniquely identify each port in a device.
Example WWN: WWNN 200000456801EF25, written 20:00:00:45:68:01:EF:25
Example WWNs from a dual-ported device: nWWN 20:00:00:45:68:01:EF:25, pWWN A 21:00:00:45:68:01:EF:25, pWWN B 22:00:00:45:68:01:EF:25
World-Wide Names WWNs are unique identifiers that are hard-coded into FC devices. Every FC port has at least one WWN. Vendors buy blocks of WWNs from the IEEE and allocate them to devices in the factory. WWNs are important for enabling fabric services because they are:
Guaranteed to be globally unique
Permanently associated with devices
These characteristics ensure that the fabric can reliably identify and locate devices, which is an important consideration for fabric services. When a management service or application needs to quickly locate a specific device: 1. The service or application queries the switch Name Server service with the WWN of the target device 2. The Name Server looks up and returns the current port address (FCID) that is associated with the target WWN 3. The service or application communicates with the target device using the port address
There are two types of WWNs:
nWWNs uniquely identify devices (Nodes). Every host bus adaptor (HBA), array controller, switch, gateway, and FC disk drive has a single unique WWNN.
pWWNs uniquely identify each port in a device. A dual-ported HBA has three WWNs: one nWWN and a pWWN for each port.
nWWNs and pWWNs are both needed because devices can have multiple ports. On single-ported devices, the nWWN and pWWN are usually the same. On multi-ported devices, however, the pWWN is used to uniquely identify each port. Ports must be uniquely identifiable because each port participates in a unique data path. nWWNs are required because the node itself must sometimes be uniquely identified. For example, path failover and multipathing software can detect redundant paths to a device by observing that the same nWWN is associated with multiple pWWNs. Cisco MDS switches use the following acronyms:
pWWN (Port WWN)
nWWN (Node WWN)
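The following illustrative Python helper formats a raw 64-bit WWN as colon-separated byte pairs and groups the example dual-ported device shown on the slide; the helper name is hypothetical and purely for illustration.

def format_wwn(raw_hex):
    """Render a 16-hex-digit WWN as colon-separated byte pairs."""
    raw_hex = raw_hex.upper()
    return ":".join(raw_hex[i:i + 2] for i in range(0, 16, 2))

# The dual-ported device from the slide: one nWWN plus one pWWN per port.
node = {
    "nWWN":   format_wwn("200000456801EF25"),
    "pWWN A": format_wwn("210000456801EF25"),
    "pWWN B": format_wwn("220000456801EF25"),
}
for name, wwn in node.items():
    print("{:7s} {}".format(name, wwn))   # only the port field differs between entries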
Dynamic FCID Addressing
FCIDs are dynamically assigned to each device FC port by the switch during the Fabric Login process.
Each switch in a fabric is assigned a Domain ID (1-239).
Port is usually 00 for an N_Port, or the AL_PA for an NL_Port.
Area is usually tied to the physical switch port that the device is connected to, but this is restrictive.
MDS switches instead logically assign the Area and Port of an FCID.

FCID format (bits 23-16, 15-08, 07-00):
Fabric:       Domain   | Area     | Port
Public loop:  Domain   | Area     | AL_PA
Private loop: 00000000 | 00000000 | AL_PA
Dynamic FCID Addressing FCIDs are dynamically assigned to each FC port by the switch when it receives a Fabric Login (FLOGI) from the device:
Each FC switch in the fabric is assigned a unique Domain ID from 1 to 239 (except McDATA switches, which assign only domains 97 to 127).
Traditional FC switches will assign the Area ID based upon the physical port on the switch that the device is connected to. For example, a device connected to port 3 on the switch will receive an Area ID of hex 03. Therefore the FCID is tied to the physical port on the switch.
The Port ID is usually hex 00 for an N_Port, or the AL_PA (Arbitrated Loop Physical Address) for an NL_Port. This means that every N_Port connected to the switch has an entire area of 256 addresses reserved for it, although it will use only 00. This is a wasteful use of addresses and one of the reasons why Fibre Channel cannot support the full 16.7 million addresses of the 24-bit address space.
The Cisco MDS does not tie the Area to the physical port on the switch, but will assign the FCID logically in sequence starting with an area of 00.
The latest HBAs support Flat Addressing and the Cisco MDS will combine the Area and Port fields together as a 16 bit Port ID field. Each device is assigned an FCID in sequential order starting at 0000, 0001 etc. Legacy devices will be assigned a fixed Port ID of 00 per Area as defined above.
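A minimal Python sketch of how a 24-bit FCID decomposes into the Domain, Area, and Port fields described above; the function names are illustrative only.

def split_fcid(fcid):
    domain = (fcid >> 16) & 0xFF   # bits 23-16
    area   = (fcid >> 8) & 0xFF    # bits 15-08
    port   = fcid & 0xFF           # bits 07-00 (the AL_PA for an NL_Port)
    return domain, area, port

def make_fcid(domain, area, port):
    return (domain << 16) | (area << 8) | port

fcid = make_fcid(domain=0x22, area=0x03, port=0x00)   # e.g. a device in domain 0x22
print(hex(fcid), split_fcid(fcid))                    # 0x220300 (34, 3, 0)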
The FC-AL Address Space In a public (fabric-attached) loop:
Public NL_Ports are assigned a full 24-bit fabric address when they log into the fabric.
There are 126 AL_PA addresses available to NL_Ports in an arbitrated loop; the AL_PA 0x00 is reserved for the FL_Port (which is logically part of both the fabric and the loop).
The Domain and Area fields are identical to those of the FL_Port to which the loop is connected.
In a private (isolated) loop:
Private NL_Ports can communicate with each other based upon the AL_PA, which is assigned to each port during loop initialization.
Private NL_ports are not assigned a 24-bit fabric address, and the Domain and Area segments are not used.
Fabric Login
FCIDs are dynamically assigned to each device FC port by the switch during the Fabric Login (FLOGI) process.
Each device will register (PLOGI) with the switch's Name Server.
Initiators will query the Name Server for available targets, then send a PLOGI to each target to exchange FC parameters.
The initiator will log in to each target using Process Login (PRLI) to establish a channel of communication between them (an image pair).
(Diagram: initiator N_Port A and target N_Port B each perform FLOGI with their F_Ports and PLOGI with the Name Server; N_Port A then performs PLOGI with N_Port B, and Process A performs PRLI with Process B.)
Before an N_Port can begin exchanging data with other N_Ports, three processes must occur:
The N_Port must log in to its attached F_Port. This process is known as Fabric Login (FLOGI). During FLOGI, both ports exchange Fibre Channel common parameters, e.g., buffer credits, buffer size, and the classes of service supported.
The Initiator N_Port must log in to its target N_Port. This process is known as Port Login (PLOGI). This time the initiator and target ports exchange Fibre Channel common parameters as before. If one port supports 2KB buffers but the other supports only 1KB buffers, they negotiate down to the lowest common value, i.e., 1KB buffers.
The Initiator N_Port must exchange information about ULP support with its target N_Port to ensure that the initiator and target processes can communicate. This process is known as Process Login (PRLI). The parameters exchanged are specific to the Upper Layer Protocol (ULP). For instance, one port will state that it is an initiator, and the other must state that it is a target; if both ports are initiators, the PRLI is rejected.
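A toy Python sketch of the kind of parameter negotiation performed at PLOGI, assuming each port simply advertises a receive buffer size and a set of supported classes; the field names are illustrative, not taken from the FC specification.

def negotiate(initiator, target):
    return {
        # both sides fall back to the smaller buffer that each can receive
        "buffer_size": min(initiator["buffer_size"], target["buffer_size"]),
        # only classes supported by both ends can be used on this session
        "classes": initiator["classes"] & target["classes"],
    }

hba   = {"buffer_size": 2048, "classes": {2, 3}}
array = {"buffer_size": 1024, "classes": {3}}
print(negotiate(hba, array))   # {'buffer_size': 1024, 'classes': {3}}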
Fabric Login Analyzer Screenshot
(Screenshot: analyzer trace showing the contents of a FLOGI frame sent to the Fabric Login Server in the FC switch.)
Fabric Login Analyzer Screenshot The preceding image shows an analyzer trace that displays part of a fabric login sequence. The top of the trace shows the NOS-OLS-LR-LRR sequence that occurred while the link was being initialized. (NOS is off the screen) The right-hand panel shows the contents of the FLOGI frame from the N_Port to the FLOGI Server on the switch F_Port (FFFFFE). Useful information can be obtained by studying these analyzer traces:
Notice that at this time the N_Port does not yet have an address; its source address is 00.00.00.
Notice also that the World Wide Port Name is the same as the World Wide Node Name. This is common in single ported nodes.
The N_Port does not support Class 1, but it does support Classes 2 and 3.
The N_Port supports Alternate Buffer Credit Management Method and can guarantee 2 BB_Credits at its receiver port.
You can see that this is a single-frame Class 3 sequence because the Start of Frame is SOFi3 and End of Frame is EOFt, meaning that this initial first frame is also the last one in the sequence.
Registered State Change Notification (RSCN)
How can a host keep track of its targets? It can poll each device regularly, but this would incur a high overhead, or it can register for State Change Notification (SCN) with the switch.
The switch will send an RSCN whenever targets go offline or online.
The host will then query the Name Server to find out what has changed.
(Diagram: after a link failure on a storage port, the Fabric Controller sends RSCNs to the host HBA and storage controller that registered with SCR; each replies with LS_ACC.)
The Registered State Change Notification (RSCN) Process Changes to the state of the fabric can affect the operation of ports. Examples of fabric state changes include:
A node port is added or removed from the fabric
Inter-switch links (ISLs) are added or removed from the fabric
A membership change occurs in a zone
Ports must be notified when these changes occur.
The RSCN Process The FC-SW standard provides a mechanism through which switches can automatically notify ports that changes to the fabric have occurred. This mechanism, known as the RSCN process, is implemented by a fabric service called the Fabric Controller. The RSCN process works as follows:
Nodes register for notification by sending a State Change Registration (SCR) frame to the Fabric Controller.
The Fabric Controller transmits RSCN commands to registered nodes when a fabric state change event occurs. RSCNs are transmitted as unicast frames because multicast is an optional service and is not supported by many switches.
Only nodes that might be affected by the state change are notified. For example, if the state change occurs within Zone A, and Port X is not part of Zone A, then Port X will not receive an RSCN.
Nodes respond to the RSCN with an LS_ACC frame.
The RSCN message identifies the ports that were affected by the state change event, and it identifies the general nature of the event. After receiving an RSCN, the node can then use additional Link Services commands to obtain more information about the event. For example, if the RSCN specifies that the status of Port Y has changed, the nodes that receive the RSCN can attempt to verify the current (new) state of Port Y by querying the Name Server. The Fabric Controller will generate RSCNs in the following circumstances:
A fabric login (FLOGI) from an Nx_Port.
The path between two Nx_Ports has changed (e.g., a change to the fabric routing tables that affects the ability of the fabric to deliver frames in order, or an E_Port initialization or failure)
An implicit fabric logout of an Nx_Port, including implicit logout resulting from loss-of-signal, link failure, or when the fabric receives a FLOGI from a port that had already completed FLOGI.
Any other fabric-detected state change of an Nx_Port.
Loop initialization of an L_Port where the L_bit was set in the LISA sequence.
An Nx_Port can also issue a request to the Fabric Controller to generate an RSCN. For example, if one port in a multi-ported node fails, another port in that node can send an RSCN to notify the fabric about the failure.
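The following simplified Python model illustrates SCR registration and zone-scoped RSCN delivery as described above; the class, method names, and zone handling are illustrative only and not how a real switch implements the Fabric Controller.

class FabricController:
    def __init__(self, zones):
        self.zones = zones              # zone name -> set of member FCIDs
        self.registered = set()         # ports that sent a State Change Registration

    def scr(self, fcid):
        self.registered.add(fcid)       # SCR: port asks to be notified of changes

    def state_change(self, zone):
        """Notify only registered ports that share a zone with the event."""
        affected = self.zones.get(zone, set()) & self.registered
        for fcid in affected:
            print("RSCN -> {:#08x}".format(fcid))   # each node replies with LS_ACC

fc = FabricController({"ZoneA": {0x010100, 0x010200}, "ZoneB": {0x020100}})
fc.scr(0x010100)
fc.scr(0x020100)
fc.state_change("ZoneA")   # only 0x010100 is notified; 0x020100 is outside ZoneA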
Standard Fabric Services
(Diagram: the standard fabric services—Domain Manager, Name Server, Unzoned Name Server, Zone Server, Configuration Server, Alias Server, Fabric Controller, Management Server, Key Server, and Time Server—are generic services provided above the FC-4 (ULP mapping), FC-3 (generic services and common transport), FC-2 (framing, flow control, and link services), FC-1 (encoding), and FC-0 (physical interface) layers.)
The FC-SW-2 specification defines several services that are required for fabric management. These services include:
Name Server
Login Server
Address Manager
Alias Server
Fabric Controller
Management Server
Key Distribution Server
Time Server
The FC-SW-2 specification does not require that switches implement all of these services; some services can be implemented as an external server function. However, the services discussed in this lesson are typically implemented in the switch, as in Cisco MDS 9000 Family Switches.
The Domain Manager
(Diagram: the Domain Manager—responsible for fabric configuration, Principal Switch selection, Domain ID allocation, FC_ID allocation, and the FCID database and cache—interacts with the Management Services, VSAN Manager, WWN Manager, Port Manager, and Login Server.)
The Domain Manager The Domain Manager is the logical function of a switch that is responsible for the assignment of addresses in a fabric. The Domain Manager is responsible for:
Allocating domain IDs (requesting a domain ID, and assigning domain IDs to other switches if this switch is the Principal Switch)
Allocating port addresses (FC_IDs)
Participating in the Principal Switch selection process
Performing the Fabric Build and Reconfiguration processes when the topology changes
The Domain Manager supports the Fabric Port Login Server, which is the service that N_Ports use when logging in to the fabric. When an N_Port logs into the fabric, it sends a FLOGI command to the Login Server. The Login Server then requests an FC_ID from the Domain Manager and assigns the FC_ID to the N_Port in its ACC reply to the FLOGI request. The preceding diagram shows how the Domain Manager interacts with other fabric services:
The VSAN Manager provides the Domain Manager with VSAN configuration and status information.
The WWN Manager tells the Domain Manager what WWN is assigned to the VSAN.
The Port Manager provides the Domain Manager with information about the fabric topology (a list of E_Ports) and notifies the Domain Manager about E_Port state changes.
The Login Server receives N_Port requests for FC_IDs during FLOGI.
The Domain Manager interacts with management services to allow administrators to view and modify Domain Manager parameters.
The Name Server
The Name Server stores data about nodes, such as: FC_IDs; nWWNs and pWWNs; Fibre Channel operating parameters; supported protocols; supported Classes of Service.
Supports soft zoning: provides information only about nodes in the requestor's zone.
A distributed Name Server (dNS) resides in each switch; it is responsible for entries associated with that switch's domain, maintains local data copies updated via RSCNs, and sends RSCNs to the fabric when a local change occurs.
The Name Server
The FC Name Server is a database implemented by the switch that stores information about each node, including:
FC_IDs
pWWNs and nWWNs
FC operating parameters, such as supported ULPs and Classes of Service
The Name Server:
Supports soft zoning by performing WWN lookups to verify zone membership
Enforces zoning by only providing information about nodes in the requestor’s zone
Is used by management applications that need to obtain information about the fabric
Each switch in a fabric contains its own resident name server, called a distributed Name Server (dNS). Each dNS within a switch is responsible for the name entries associated with the domain assigned to the switch. The dNS instances synchronize their databases using the RSCN process. When a client Nx_Port wants to query the Name Service, it submits a request to its local dNS via the Well-Known Address for the Name Server. If the required information is not available locally, the dNS within the local switch responds to the request by making any necessary requests of other dNS instances contained in the other switches. The communication between switches that is performed to acquire the requested information is transparent to the original requesting client. Partial responses to dNS queries are allowed. If an entry switch sends a partial response back to an Nx_Port, it must set the partial response bit in the CT header.
Name Server Operations When ports and nodes register with the Name Server, their characteristics are stored as objects in the Name Server database. The Port Identifier is the Fibre Channel port address identifier (FC_ID) assigned to an N_Port or NL_Port during fabric login (FLOGI). The Port Identifier is the primary key for all objects in a Name Server record. All objects are ultimately related back to this object. Because a node may have more than one port, the Node Name is a secondary key for some objects. There are three types of Name Server requests:
Get Object: This request is used to query the Name Server
Register Object: Only one object at a time can be registered with the Name Server. A Client registers information in the Name Server database by sending a registration request containing a Port Identifier or Node Name.
Deregister Object: Only one global deregistration request is defined for the Name Server.
Name Server information is available, upon request, to other nodes, subject to zoning restrictions. If zones exist within the fabric, the Name Server restricts access to information in the Name Server database based on the zone configuration. When a port logs out of a fabric, the Name Server deregisters all objects associated with that port.
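A minimal Python sketch of a dNS-style database with register, query, and deregister operations keyed by the Port Identifier, including a crude soft-zoning check on queries. This is illustrative only and is not how SAN-OS implements the service; all names are hypothetical.

class NameServer:
    def __init__(self):
        self.db = {}                                  # FCID -> registered attributes

    def register(self, fcid, **attrs):
        # one object at a time is registered against the Port Identifier
        self.db.setdefault(fcid, {}).update(attrs)

    def query(self, requestor_zone, zones):
        # soft zoning: only return entries in the requestor's zone
        members = zones.get(requestor_zone, set())
        return {f: a for f, a in self.db.items() if f in members}

    def deregister(self, fcid):
        self.db.pop(fcid, None)                       # all objects for the port removed

ns = NameServer()
ns.register(0x010200, pwwn="21:00:00:45:68:01:EF:25", fc4="SCSI-FCP")
zones = {"ZoneA": {0x010200}}
print(ns.query("ZoneA", zones))                       # visible: the ZoneA member
ns.deregister(0x010200)                               # e.g. the port logged out of the fabric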
The Management Server
A single access point for information about the fabric topology; information is provided without regard to zone.
Read-only access.
Services provided: Fabric Configuration Service (FCS), Zone Service, Unzoned Name Service.
The Management Server The FC Management Server provides a single access point for obtaining information about the fabric topology. Whereas the Name Server only provides information about ports configured within the zone of the port requesting information, the Management Server provides information about the entire fabric, without regard to zone. The Management Server allows SAN management applications to discover and monitor SAN components, but it does not allow applications to configure the fabric—the Management Server provides read-only access to its data. The Management Server provides the following services:
The Fabric Configuration Service (FCS) supports configuration management of the fabric. This service allows applications to discover the topology and attributes of the fabric.
The Zone Service provides zone information for the fabric to either management applications or directly to clients.
The Unzoned Name Service provides information about the fabric without regard to zones. This service allows management applications to see all the devices on the entire fabric.
Well-Known Addresses
Well-known addresses are reserved addresses for FC services at the top of the 24-bit fabric address space:

Service                         Address            Status
Broadcast Alias                 FFFFFF             Mandatory
Fabric Login Server             FFFFFE             Mandatory
Fabric Controller               FFFFFD             Mandatory
Name Server                     FFFFFC             Optional
Time Server                     FFFFFB             Optional
Management Server               FFFFFA             Optional
QoS Facilitator                 FFFFF9             Optional
Alias Server                    FFFFF8             Optional
Key Distribution Server         FFFFF7             Optional
Clock Synchronization Server    FFFFF6             Optional
Multicast Server                FFFFF5             Optional
Reserved                        FFFFF4 – FFFFF0
Well-Known Addresses
Well-known addresses allow devices to reliably access switch services. All services are addressed in the same way as an N_Port is addressed. Nodes communicate with services by sending and receiving Extended Link Services commands (frames) to and from well-known addresses. Well-known addresses are the highest 16 addresses in the 24-bit fabric address space:
FFFFFF - Broadcast Alias
FFFFFE - Fabric Login Server
FFFFFD - Fabric Controller
FFFFFC - Name Server
FFFFFB - Time Server
FFFFFA - Management Server
FFFFF9 - Quality of Service Facilitator
FFFFF8 - Alias Server
FFFFF7 - Key Distribution Server
FFFFF6 - Clock Synchronization Server
FFFFF5 - Multicast Server
FFFFF4–FFFFF0 - Reserved
The first three services are mandatory in all FC switches; however, all FC switches today implement the first six services by default for ease of management.
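For reference, the well-known addresses above expressed as a small Python lookup table, with a helper that tests whether an FCID falls into the reserved range; this is an illustrative restatement of the list, not an API.

WELL_KNOWN = {
    0xFFFFFF: "Broadcast Alias",
    0xFFFFFE: "Fabric Login Server",
    0xFFFFFD: "Fabric Controller",
    0xFFFFFC: "Name Server",
    0xFFFFFB: "Time Server",
    0xFFFFFA: "Management Server",
}

def is_well_known(fcid):
    return 0xFFFFF0 <= fcid <= 0xFFFFFF        # top 16 addresses are reserved

print(WELL_KNOWN.get(0xFFFFFC), is_well_known(0xFFFFFC))   # Name Server True
print(is_well_known(0x010200))                              # False - an ordinary N_Port FCID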
Lesson 2
Cisco MDS 9000 Introduction Overview In this lesson, you will learn about the MDS 9000 Family of SAN switches, including an overview of the MDS chassis and line card modules.
Objectives Upon completing this lesson, you will be able to identify the components of an MDS 9000 storage networking solution. This includes being able to meet these objectives:
Identify the hardware components of the MDS 9000 platform
Explain supported airflow and power configurations
Explain the MDS 9000 licensing model
Cisco Storage Solutions Overview
MDS 9000 Family Product Line: First Generation
Industry-leading innovation and investment protection across a comprehensive product line.
(Diagram: MDS 9000 Family systems—the MDS 9020, the MDS 9120 and 9140 multilayer fabric switches, the MDS 9216 and 9216i, and the MDS 9506 and 9509 multilayer directors—alongside MDS 9000 modules (Supervisor-1, 16-port FC, 32-port FC, Multiprotocol Services 14+2, 8-port IPS iSCSI + FCIP, SSM virtualization), the management tools (Device and Fabric Manager, FM Server, Performance Manager, Traffic Analyzer), and SAN-OS, the MDS 9000 Family operating system.)
Multilayer switches are switching platforms with multiple layers of intelligent features, such as:
Ultra High Availability
Scalable Architecture
Comprehensive Security Features
Ease of Management
Advanced Diagnostics and Troubleshooting Capabilities
Seamless Integration of Multiple Technologies
Multi-protocol Support
Multilayer switches also offer a scalable architecture with highly available hardware and software. Based on the MDS 9000 Family Operating System and a comprehensive management platform called Cisco Fabric Manager, the MDS 9000 Family offers a variety of application line card modules and a scalable architecture from an entry-level fabric switch to director-class systems. The Cisco MDS 9000 Family offers industry-leading investment protection across a comprehensive product line. The 9020 is a new low-cost 20-port FC switch providing 1/2/4 Gb/s at full line rate. This model currently has a single power supply, four fans, and front-to-rear airflow. It features nondisruptive software upgrades and management via the CLI or FM/DM. The current release, 2.1.2, does not support VSANs, but VSAN support is planned for release 3.0.0.
MDS 9000 Family Product Line: Second Generation
Industry-leading innovation and investment protection across a comprehensive product line.
(Diagram: the MDS 9513 Multilayer Director with 4Gbps SFPs and 10Gbps X2 transceivers, together with the second-generation MDS 9000 modules—Supervisor-2, 12-port FC 1/2/4 Gb/s, 24-port FC 1/2/4 Gb/s, 48-port FC 1/2/4 Gb/s, and 4-port FC 10 Gb/s—the management tools (Device and Fabric Manager, FM Server, Performance Manager, Traffic Analyzer), and SAN-OS, the MDS 9000 Family operating system.)
In April 2006, Cisco introduced the MDS 9513 Multilayer Director and second-generation linecards. The 9513 Multilayer Director is a new 13-slot chassis with two Supervisor-2 slots. (Note that the Supervisor-1 is not compatible with this chassis.) The new 12-port, 24-port, and 48-port FC linecards support this architecture and are forward and backward compatible with the existing architecture; they provide 1/2/4 Gb/s using new 4 Gb/s SFPs. The 4-port 10 Gb/s FC linecard is also forward and backward compatible with the existing architecture and provides four 10 Gb/s FC ports at full line rate using X2 transceivers.
MDS 9500 Series Directors
MDS 9506: 6 slots, up to 192 ports, 7 RU form factor, 6 per rack, 21.75" deep
MDS 9509: 9 slots, up to 336 ports, 14 RU form factor, 3 per rack, 18.8" deep
MDS 9513: 13 slots, up to 528 ports, 14 RU form factor, 3 per rack, 28" deep
MDS 9500 Series Directors The Cisco MDS 9500 series multi-layer directors elevate the standard for director-class switches. Providing industry-leading availability, multiprotocol support, advanced scalability, security, non-blocking fabrics that are 10-Gbps ready, and a platform for storage management, the Cisco MDS 9500 Series allows you to deploy high-performance SANs with a lower total cost of ownership. Layering a rich set of intelligent features and hardware-based services onto a high-performance, protocol-agnostic switch fabric, the Cisco MDS 9500 Series of multilayer directors addresses the stringent requirements of large data-center storage environments. MDS 9500 Series switch chassis are available in three sizes: MDS 9513 (14) rack units, MDS 9509 (14) rack units and MDS 9506 (7) rack units.
MDS 9506 Chassis
The Cisco MDS 9506 Director has a 6-slot chassis that supports the same director-class features as the Cisco MDS 9509 Director but in a more compact six-slot (7 RU) design, because the power supplies are located at the rear. It has slots for two supervisor modules and four switching or services modules. Power supplies are installed at the back of the chassis for easy removal, with the Power Entry Modules (PEMs) in the front of the chassis for easy access. Up to six MDS 9506 chassis can be installed in a standard 42U rack, with up to (128) 1/2-Gbps FC ports per chassis or up to (24) 1-Gbps Ethernet ports per chassis for IP storage
services applications. With up to (768) FC ports in a single seven-foot (42 RU) rack, this is an industry-leading port density per rack, optimizing the use of valuable data center floor space. Additionally, cable management is facilitated because both interface and power terminations are positioned on a single side of the chassis.
MDS 9509 Chassis
The Cisco MDS 9509 Director has a 9-slot chassis with redundant supervisor modules, up to seven switching modules or six IPS modules, redundant power supplies, and a removable fan module. Slots 5 and 6 are reserved for the redundant supervisor modules, which provide control, switching, and local and remote management. The Cisco MDS 9509 supports an industry-leading port density per system that is expandable up to (224) FC ports in a single chassis. Up to (48) Gigabit Ethernet ports can be configured when using the IP Storage Services module. Even though a (48) Gigabit Ethernet port configuration is physically possible in an MDS 9509, it is not likely to be deployed because it leaves room for only a single FC switching module. There are two system clock cards for added high availability. Dual redundant power supplies are located at the front of the chassis; therefore, the MDS 9509 is only 18.8" deep.
MDS 9513 Chassis
The Cisco MDS 9513 Director has a 13-slot chassis with redundant Supervisor-2 modules, up to eleven switching modules or ten IPS modules, redundant 6KW power supplies, a removable fan module at the front, and additional removable fan modules at the rear for the fabric modules. Slots 7 and 8 are reserved for the redundant Supervisor-2 modules, which provide control, switching, and local and remote management. The Cisco MDS 9513 supports an industry-leading port density per system that is expandable up to (528) FC ports in a single chassis. Up to (80) Gigabit Ethernet ports can be configured when using the IP Storage Services module. There are two new removable system clock modules at the rear for added high availability. Dual redundant 6KW power supplies are located at the rear of the chassis. The MDS 9513 has a revised airflow system at the rear of the chassis, in at the bottom and out at the top.
MDS 9513 Director
Redefining director-class storage switching: ultra-high availability; multiprotocol support (FC, iSCSI, FCIP, FICON)
Industry-leading port density: up to 528 ports per chassis, up to 1584 ports per rack; software support in SAN-OS 3.0
New Supervisor-2 (required): enhanced crossbar arbiter, dual BIOS, redundant bootflash, advanced security features
Redundant high-performance crossbar fabric modules
Redundant 6000W AC power supplies: room-to-grow power for future application modules
Revised airflow: bottom to top, at the rear of the chassis; front and rear fan trays
(Diagram: MDS 9513 rear view.)
The MDS 9513 Director
The Cisco MDS 9513 Director has a 13-slot chassis with redundant supervisor modules, up to eleven switching modules or ten IPS modules, redundant power supplies, a removable linecard fan module at the front, and a removable fabric fan module at the rear. Slots 7 and 8 are reserved for the redundant Supervisor-2 modules, which provide control, switching, and local and remote management. The MDS 9513 supports an industry-leading port density per system that is expandable up to (528) FC ports in a single chassis using 11x 48-port linecards. Up to (80) Gigabit Ethernet ports can be configured when using 10x 8-port IP Storage Services modules. There are two removable system clock cards for added high availability. At the rear of the chassis, there are two new Fabric Modules that contain redundant high-performance (2 Tb per second) crossbars. The MDS 9513 has new redundant 6000W power supplies and features a revised airflow, from bottom to top at the rear of the chassis.
MDS 9000 Fabric Switches
MDS 9216A: expands from 16 to 64 FC ports; modular upgrade to IP or intelligent SAN switching; full compatibility with the MDS 9500 Series
MDS 9216i (14+2 port switch): expands from 14 to 62 FC ports plus 2 GigE ports for FCIP and iSCSI; fully supports any of the new linecards
MDS 9120 (20-port) and MDS 9140 (40-port): cost-effective, intelligent fabric switches for the SAN edge or small/medium FC SANs; small footprint, high density, fixed configuration; feature compatibility with the MDS 9506/9509/9216
MDS 9020: low-cost 4Gbps 20-port FC switch with a free FMS license; supports non-disruptive firmware upgrades
MDS 9000 Fabric Switches
MDS 9000 fabric switches include the MDS 9216A, the MDS 9216i, the MDS 9100 Series, and the new MDS 9020. The MDS 9216 offers Fibre Channel expansion up to sixty-four (64) FC ports using the new 48-port module and can also accept any linecard, including the IP Storage Services (IPS) module or the SSM intelligent virtual storage module, for industry-leading performance in the mid-range SAN switch category. The MDS 9216i offers a built-in 14+2 module and can expand up to sixty-two (62) FC ports using the new 48-port FC linecard. In addition to the two built-in GbE ports that support FCIP and iSCSI, it can also accept the IP Storage Services (IPS) module. The Cisco MDS 9100 Series is a cost-effective intelligent fabric switch platform. These high-density, fixed-configuration switches take up a small 1U footprint while offering broad feature compatibility with the other MDS 9000 Family systems. The Cisco MDS 9020 low-cost entry-level switch supports 4Gbps FC ports and non-disruptive firmware upgrades. The MDS 9020 is provided with a free FMS license for efficient multi-fabric management.
Common Architecture: Ease-of-Migration and Investment Protection
All line cards are forward/backward compatible.*
Current generation (Supervisor-1, FC-16 and FC-32 modules, IPS-8, MPS 14+2, SSM; MDS 9216 and 9216i, MDS 9506 and 9509): architectural support for up to 256 indexes; maximum planned system density of 240 ports; 1/2 Gb/s FC interfaces.
New generation (Supervisor-2, 12-port, 24-port, and 48-port FC modules, 4-port 10 Gb/s FC module; MDS 9513): architectural support for up to 1,024 indexes; maximum planned system density of 528 ports; 1/2/4 Gb/s and 10G FC interfaces.
* Some feature considerations apply in mixed Vegas/Isola configurations. Supervisor-2 only in the 9513.
All first-generation and second-generation modules are forward and backward compatible. The first generation has architectural support for up to 256 indexes (destination ports), and the maximum planned system density is 240 ports (using the MDS 9509), although in practice it is 224 using 7x 32-port linecards. However, using a mix of current and second-generation modules, it is possible to increase this to 252 ports. Each supervisor module consumes two indexes, so a total of 4 indexes are used by supervisors on MDS 9500 switches. It is worth noting that each Gigabit interface uses 4 indexes, so an IPS-8 would consume 32 indexes and a 14+2 would consume 22 indexes from the pool. The second-generation platform has architectural support for up to 1024 indexes, and the maximum planned system density is currently 528 ports using 11x 48-port cards. However, if any of the current-generation linecards is inserted into the 9513 chassis, the maximum number of indexes is reduced to 252. The 9513 chassis must use only the Supervisor-2 module; however, both Supervisor-1 and Supervisor-2 cards may be used in the current-generation 9506 and 9509.
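A back-of-the-envelope Python check of the index accounting above, using the per-module figures quoted in this paragraph; this is illustrative arithmetic only.

INDEX_LIMIT_GEN1 = 256
indexes_used = (
    2 * 2        # two supervisor modules, 2 indexes each
    + 7 * 32     # seven 32-port FC linecards in an MDS 9509
)
print(indexes_used, "of", INDEX_LIMIT_GEN1)   # 228 of 256 -> 224 usable FC ports

# Each Gigabit interface uses 4 indexes:
print("IPS-8:", 8 * 4, "indexes")             # 32
print("MPS 14+2:", 14 + 2 * 4, "indexes")     # 22 (14 FC ports + two GigE ports)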
Airflow and Power
MDS 9000 Series Fan Modules and Airflow
Hot-swappable fan modules: easy installation and removal; integrated temperature management.
MDS 9120 and 9140: two rear-mounted fan modules, hot swappable.
MDS 9216: front-mounted fan tray with four fans, hot swappable.
MDS 9100 and 9200 Fan Modules and Airflow
The Cisco MDS 9100 Series switch supports two hot-swappable fan modules. The Cisco MDS 9100 Series switch continues to run if a fan module is removed, as long as preset temperature thresholds have not been exceeded. This means you can swap out a fan module without having to bring the system down. The fan modules each have one Status LED. Fan module status is also indicated on a front panel LED. The MDS 9216 switch supports a hot-swappable fan module with four fans. It provides 270 CFM (cubic feet per minute) of cooling, allowing 400 Watts of power dissipation per slot. Sensors on the supervisor module monitor the internal air temperature. If the air temperature exceeds a preset threshold, the environmental monitor displays warning messages. If one or more fans within the fan module fail, the Fan Status LED turns red. Individual fans cannot be replaced; you must replace the entire fan module. The MDS 9216 will continue to run if the fan module is removed, as long as preset temperature thresholds have not been exceeded. This means you can swap out a fan module without having to bring the system down. The fan module is designed to be removed and replaced while the system is operating without presenting an electrical hazard or damage to the system, provided the replacement is performed promptly. Removal periods should be limited to a total of 5 minutes, depending on system temperature. Integrated temperature and power management facilities help to ensure increased uptime. Fan status LEDs indicate the condition of the fans on the module. If a fan or fans fail, the fan status LED turns red. When all fans are operating properly, the LED is green.
MDS 9500 Series Fan Modules and Airflow
MDS 9506 and 9509: hot-swappable front-mounted fan tray, easy installation and removal; sensors monitor system temperature; a temperature rise or fan failure generates an event. It is recommended to replace the fan tray at the earliest opportunity: replace it within 3 minutes or receive a critical warning, and the switch shuts down after a further 2 minutes if the fan tray is not replaced.
MDS 9506: 6 fans. MDS 9509: 9 fans. MDS 9513: 15 fans, plus an additional fabric fan tray at the rear.
MDS 9513 airflow: bottom to top, at the rear of the chassis.
MDS 9500 Fan Modules and Airflow
The MDS 9500 Series supports hot-swappable fan modules that are easily installed or removed from the front of the chassis. They provide 85 cubic feet per minute (CFM) of airflow per slot with 410 Watts of power dissipation per slot. The MDS 9506 has a fan module with six fans and the Cisco MDS 9509 has a fan module with nine fans. Sensors on the supervisor module monitor the internal air temperature. If the air temperature exceeds a preset threshold, the environmental monitor displays warning messages. If one or more fans within the module fail, the Fan Status LED turns red and the module must be replaced. When all fans are operating properly, the LED is green. If the fan LED is red, the fan assembly may not be seated properly in the chassis, in which case remove the fan assembly and reinstall it. After reinstalling, if the LED is still red, there is a failure on the fan assembly. Fan LED status indication is provided on a per-module basis. If one fan fails, then the module is considered failed. The switch can continue to run when the fan module is removed for a maximum of 5 minutes if the temperature thresholds are not exceeded. This allows you to swap out a fan module without having to bring the system down. The fan module is designed to be removed and replaced while the system is operating without presenting an electrical hazard or damage to the system, provided the replacement is performed promptly. Install the fan module into the front chassis cavity with the status LED at the top. Push the fan module to ensure the power supply connector mates with the chassis, and tighten the captive installation screws. If the switch is powered on, listen for the fans; you should immediately hear them operating.
Power Management
MDS switches have dual power supplies*, hot swappable for easy installation and removal.
Power supply modes:
Redundant mode (default): power capacity of the lower-capacity supply; sufficient power will be available in case of a PSU failure.
Combined mode (non-redundant): twice the power capacity of the lower-capacity supply; sufficient power may not be available in case of a power supply failure.
Only modules with sufficient power are powered up. Power is reserved for the supervisors and fan assemblies; after the supervisors, modules are powered up starting at slot 1.
* The MDS 9020 has a single integral power supply.
Power Management
Power supplies are configured in redundant mode by default, but they can also be configured in a combined, or non-redundant, mode. In redundant mode, the chassis uses the power capacity of the lower-capacity power supply so that sufficient power is available in case of a single power supply failure. In combined mode, the chassis uses twice the power capacity of the lower-capacity power supply. Sufficient power may not be available in case of a power supply failure in this mode. If there is a power supply failure and the real power requirements for the chassis exceed the power capacity of the remaining power supply, the entire system will be reset automatically to prevent permanent damage to the power supply. In either mode, power is reserved for the supervisor and fan assemblies. Each supervisor module has roughly 220 watts in reserve, even if there is only one installed, and the fan module has 210 watts in reserve. In the case of insufficient power, after the supervisors and fans are powered, line card modules are given power from the top of the chassis down. After the reboot, only those modules that have sufficient power shall be powered up. If the real power requirements do not trigger an automatic reset, no module will be powered down; instead, no new module shall be powered up. In all cases of power supply failure, removal, and so on, a syslog message shall be printed, a Call Home message shall be sent if configured, and an SNMP trap shall be sent.
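The difference between the two power-supply modes can be shown with a small Python calculation; the wattages below are hypothetical example values, not a specific supported configuration.

psu_watts = (2500, 3000)                    # e.g. a 2500 W and a 3000 W supply installed together

redundant_mode = min(psu_watts)             # capacity of the lower-rated supply only
combined_mode  = 2 * min(psu_watts)         # twice the lower-rated supply

print("redundant: {} W, combined: {} W".format(redundant_mode, combined_mode))
# redundant: 2500 W, combined: 5000 W - combined mode offers more power, but there is
# no guarantee that all modules stay powered if one supply fails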
MDS 9000 Power Supplies
MDS 9020: integral power supply; 100 Watt AC supply (100W @ 100-240 VAC)
MDS 9100: removable power supplies at the rear of the chassis; 300 Watt AC supply (300W @ 100-240 VAC)
MDS 9216: removable power supplies at the rear of the chassis; 845 Watt AC supply (845W @ 100-240 VAC)
MDS 9000 Power Supply Modules
The MDS 9500 Series supports redundant hot-swappable power supplies that accept AC input voltages, each of which is capable of supplying sufficient power to the entire chassis should one power supply fail. The power supplies monitor their output voltage and provide status to the supervisor module. To prevent the unexpected shutdown of an optional module, the power management software only allows a module to power up if adequate power is available. The power supplies can be configured to be redundant or combined. By default, they are configured as redundant, so that if one fails, the remaining power supply can still power the entire system. Condition LEDs give visual indications of the installed modules and their operation. The Cisco MDS 9020 switch has a single integral power supply only.
MDS 9500 Power Supplies
MDS 9506: removable power supplies at the rear of the chassis; 1900 Watt AC supply (1900W @ 200-240 VAC, 1050W @ 100-120 VAC)
MDS 9509: removable power supplies at the front of the chassis; 2500 Watt AC supply (2500W @ 200-240 VAC, 1300W @ 100-120 VAC); new 3000 Watt AC supply
MDS 9513: removable power supplies at the rear of the chassis; 6000 Watt AC supply (6000W @ 200-240 VAC)
MDS 9500 Power Supply Modules
The MDS 9500 Series supports redundant hot-swappable power supplies that accept AC input voltages, each of which is capable of supplying sufficient power to the entire chassis should one power supply fail. The power supplies monitor their output voltage and provide status to the supervisor module. To prevent the unexpected shutdown of an optional module, the power management software only allows a module to power up if adequate power is available. The power supplies can be configured to be redundant or combined. By default, they are configured as redundant, so that if one fails, the remaining power supply can still power the entire system. Condition LEDs give visual indications of the installed modules and their operation. The Cisco MDS 9509 Director supports the following types of power supplies: a 2500 Watt AC power supply with an AC input and DC output, which requires 220 VAC to deliver 2500 Watts of power (if powered by 110 VAC, it will deliver only 1300 Watts) and has a current rating of 20 amps for circuit breakers but a 16 amp maximum draw under normal conditions; and a new 3000 Watt AC power supply with an AC input and a DC output, also available for the MDS 9509. These power supplies appear to be very similar from the outside, with the handle, air vent, power switch, condition LEDs, and captive screws in the same locations. However, make note of the different power input connections, where the differences are removable or permanently attached cords or a terminal block connection. The MDS 9513 power supplies are located at the rear of the chassis and provide 6000W of power at a nominal voltage of 220 VAC.
Software Packages and Licensing
Software Licensing
Enforced software licensing started with SAN-OS 1.3.
Includes the "Standard" license package, which is free.
Five additional license packages:
– Enterprise package
– SAN Extension over IP (FCIP)
– Mainframe (FICON)
– Fabric Manager Server (FMS)
– Storage Services Enabler (SSE)
Features may be evaluated free for 120 days.

"Standard" Package (free) includes: Fibre Channel and iSCSI; VSANs and zoning; iSCSI server load balancing; PortChannels; FCC and Virtual Output Queuing; diagnostics (SPAN, RSPAN, etc.); Fabric Manager and Device Manager; SNMPv3, SSH, SSL, SFTP; SMI-S 1.10 and FDMI; role-based access control; RADIUS and TACACS+, MS CHAP; RMON, Syslog, Call Home; Brocade native interop modes 2 and 3; McData native interop mode 4; NPIV (N_Port ID Virtualization); IVR over FCIP; Command Scheduler; IPv6 (management and IP services).
The Cisco MDS 9000 Family SAN-OS is the underlying system software that powers the award-winning Cisco MDS 9000 Family Multilayer Switches. SAN-OS is designed for storage area networks (SANs) in the best traditions of Cisco IOS® Software to create a strategic SAN platform of superior reliability, performance, scalability, and features. In addition to providing all the features that the market expects of a storage network switch, the SAN-OS provides many unique features that help the Cisco MDS 9000 Family to deliver low total cost of ownership (TCO) and a quick return on investment (ROI).
Common Software Across All Platforms
The SAN-OS runs on all Cisco MDS 9000 Family switches, from multilayer fabric switches to multilayer directors. Using the same base system software across the entire product line enables Cisco Systems® to provide an extensive, consistent, and compatible feature set on the Cisco MDS 9000 Family. Most Cisco MDS 9000 Family software features are included in the base switch configuration. The standard software package includes the base set of features that Cisco believes are required by most customers for building a SAN. However, some features are logically grouped into add-on packages that must be licensed separately.
Bundled Packages for SAN-OS 3.0
Enterprise Package:
Enhanced security features: VSAN-based access control, LUN zoning and read-only zones, Port Security, host/switch authentication (FC-SP), digital certificates (IKE x.509), IPsec security (iSCSI and FCIP)*
Advanced traffic engineering: QoS and zone-based QoS, Extended Credits*, FC Write Acceleration and Read Acceleration, SCSI flow statistics, enhanced IOD
Enhanced VSAN functionality: Inter-VSAN Routing, IVR with FCID NAT
SAN Extension over IP Package (FCIP): FCIP protocol, FCIP compression, hardware-based FCIP compression*, hardware-based FCIP encryption*, FCIP Write Acceleration, FCIP Tape Acceleration and Read Acceleration, SAN Extension Tuner
Mainframe Package (FICON): FICON protocol and CUP management, FICON VSANs and intermixing, FICON Tape Acceleration (R/W), switch cascading and fabric binding
Storage Services Enabler Package (SSE): SANTap protocol, NASB (Network Accelerated Serverless Backup), FAIS (Fabric Application Interface Standard)
Fabric Manager Server (FMS): multiple physical fabric management, centralized fabric discovery services, continuous MDS health and event monitoring, long-term historical data collection, performance reports and charting for hot-spot analysis, performance prediction and server summary reports, web-based operational view, threshold monitoring, configurable RRD, data collection auto-update and event forwarding
* Requires the MPS (14+2) module or the MDS 9216i
Bundled Software Packages for SAN-OS 3.0 The SAN-OS feature packages are:
Enterprise Package—adds a set of advanced features which are recommended for all enterprise SANs.
SAN Extension over IP Package—enables FCIP for IP Storage Services and allows the customer to use the IP Storage Services to extend SANs over IP networks.
Mainframe Package—adds support for the FICON protocol. FICON VSAN support is provided to help ensure that there is true hardware-based separation of FICON and open systems. Switch cascading, fabric binding, and intermixing are also included in this package.
Fabric Manager Server Package—extends Cisco Fabric Manager by providing historical performance monitoring for network traffic hotspot analysis, centralized management services, and advanced application integration for greater management efficiency.
Storage Services Enabler Package—enables network-hosted storage applications to run on the Cisco MDS 9000 Family Storage Services Module (SSM). A Storage Services Enabler package must be installed on each SSM.
The SAN-OS Software package fact sheets are available at http://www.cisco.com/en/US/products/hw/ps4159/ps4358/products_data_sheets_list.html.
Unique Attributes of MDS Licensing
Simple value-based packaging: feature-rich Standard Package (no extra charge); simple bundles for advanced features that provide significant value; all upgrades included in support pricing.
High availability: non-disruptive installation; no single point of failure; 120-day grace period for enforcement.
Ease of use: seamless electronic licenses; no separate software images for licensed features; licenses installed on the switch at the factory; automated license key installation; centralized License Management Console providing a single point of license management for all switches.
Unique Attributes of MDS Licensing License usability can be a nightmare with existing products. Customers have concerns about compromising availability with disruptive software installations for licensed features. License management is a notorious problem. Cisco license packages require a simple installation of an electronic license—no software installation or upgrade is required. Licenses can also be installed on the switch in the factory. MDS switches store license keys on the chassis SPROM, so license keys are never lost even during a switch software reinstall. Cisco Fabric Manager includes a centralized license management console that provides a single interface for managing licenses across all MDS switches in the fabric, reducing management overhead and preventing problems due to improperly maintained licensing. In the event that an administrative error does occur with licensing, the switch provides a grace period before the unlicensed features are disabled, so there is plenty of time to correct the licensing issue. All licensed features may be evaluated for a period of up to 120 days before a license is required.
Multilayer Intelligent Storage Platform
• Enterprise-Class Management: FM, DM, PM, Traffic Analyzer, PAA, Advanced Diagnostics, Integration
• Intelligent Storage Services: Virtualization, Replication, Volume Management
• SAN Consolidation: VSANs, PortChannels, Security, Traffic Engineering, QoS, FCC
• Multiprotocol: FC, FICON, FCIP, iSCSI
• High Availability Infrastructure: Hardware Redundancy, Hot-Swap, Non-Disruptive Upgrade
The Cisco MDS 9000 Series is the first multilayer intelligent storage platform.
High-availability infrastructure—Redundant power and cooling, redundant supervisor modules with stateful failover, hot-swap modules, and non-disruptive software upgrades for the MDS 9500 platform give you 99.999% availability.
Multiprotocol—iSCSI enables integration of mid-range servers into the SAN, FICON enables integration of mainframe systems with complete isolation of FICON and FC ports, and FCIP enables cost-effective DR solutions.
SAN consolidation—Intelligent infrastructure services like virtual SANs (VSANs), PortChannels, per-VSAN FSPF routing, QoS, FCC, and robust security enable stable, scalable, and secure enterprise SAN consolidation.
Intelligent storage services—Network-based services for resource virtualization, volume management, data mobility, and replication lower TCO and increase ROI.
Enterprise-class management—Integrated device, fabric, and performance management improve management productivity and easily integrate with existing enterprise management frameworks like IBM Tivoli and HP OpenView.
Q: How do we build a 3000-port fabric?
A: Six MDS 9513 Directors
Question: How do we build a 3000 port fabric? Answer: Using six MDS 9513 directors. The MDS 9513 has the largest port capacity (528 ports) of any Fibre Channel switch or director in the market today.
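As a rough check of the arithmetic behind this answer, the sketch below (Python) shows how six 528-port directors leave roughly 3000 usable device ports. The full-mesh topology and the number of ISLs per switch pair are assumptions made purely for illustration; they are not a design recommendation from the course.

```python
# Rough port-count sketch for a six-director fabric (illustrative only).
# Assumes a full-mesh topology and 4 ISLs per switch pair -- both are
# assumptions for illustration, not figures from the course.

SWITCHES = 6
PORTS_PER_SWITCH = 528          # MDS 9513 maximum with Gen-2 modules
ISLS_PER_PAIR = 4               # assumed; tune to the required ISL oversubscription

switch_pairs = SWITCHES * (SWITCHES - 1) // 2        # 15 pairs in a full mesh
isl_ports = switch_pairs * ISLS_PER_PAIR * 2         # each ISL consumes a port on both ends

raw_ports = SWITCHES * PORTS_PER_SWITCH              # 3168
usable_ports = raw_ports - isl_ports                 # 3168 - 120 = 3048

print(f"raw ports:    {raw_ports}")
print(f"ISL ports:    {isl_ports}")
print(f"usable ports: {usable_ports}")
```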
Lesson 3
Architecture and System Components
Overview
This lesson describes the hardware and software architecture of the MDS 9000 storage networking platform.
Objectives Upon completing this lesson, you will be able to describe the hardware architecture and components of the MDS 9000 Family of switches. This includes being able to meet these objectives:
Describe the system architecture of the MDS 9000 platform
Explain how to design fabrics using full-rate and oversubscribed line cards
Explain how buffer credits are allocated on MDS 9000 line card modules
System Architecture
MDS Integrated Crossbar
• Investment protection
  – Centralized crossbar switch architecture
  – Ability to support new line cards
  – Multiprotocol support in one system
• Highly scalable system
  – Aggregate bandwidth up to 2.2 Tbps
• High port density
  – Flexibility to support higher-density line cards
  – Fewer devices to purchase and manage
  – Increase in usable FC ports, no wasted internal ports, minimal switch interconnects
• Centralized redundant architecture
  – Flexibility to support speed mismatches (1 Gb, 2 Gb, 4 Gb, 10 Gb)
  – Arbiter schedules frames fairly to ensure consistent latency
  – Virtual Output Queuing for optimal crossbar performance
[Figure: external interfaces on the line cards connect through the central crossbar switch fabric.]
MDS Integrated Crossbar
The integrated crossbar system provides investment protection through its ability to support new line cards, including new transports, and multiprotocol support in one system (Fibre Channel, Internet Small Computer Systems Interface [iSCSI], and Fibre Channel over IP [FCIP]). It is also highly scalable, supporting aggregate bandwidths of up to 1.44 terabits per second (Tbps) on the MDS 9506 and 9509 and up to 2.2 Tbps on the MDS 9513. The crossbar delivers 80 Gbps of bandwidth per slot on the MDS 9506 and 9509 and up to 100 Gbps per slot on the MDS 9513, and it also carries a redundant out-of-band (OOB) management channel on the backplane. High port density means fewer devices to purchase and manage, and because minimal switch interconnects are needed, more ports remain usable and common equipment (power supplies, supervisors, and chassis) is amortized over more ports.
There is an aggregate 720-Gbps multiprotocol crossbar per supervisor module used on MDS 9506 and 9509 but the MDS 9513 has new Crossbar Fabric modules located at the rear of the chassis that provide a total aggregate bandwidth of 2.2Tbps. All MDS chassis can operate on a single crossbar at full bandwidth on all attached ports without blocking. A technique called Virtual Output Queuing (VOQ) is deployed for optimal crossbar performance. VOQ resolves head-of-line blocking issues for continuous data flow.
MDS 9506 & 9509 Crossbar Architecture
[Figure: block diagram of the MDS 9506/9509. The active and standby supervisor modules each contain a Flash card, Ethernet and console interfaces, a management processor, a central arbiter, and a crossbar. The supervisors connect across the backplane to the line card modules shown: the 16-port and 32-port FC line cards, the MPS 14+2 multiprotocol module, the IPS-8 FCIP/iSCSI IP Services module, and the 32-port SSM Storage Services Module. Each line card contains MAC/PHY, forwarding, queuing (VOQ), and backplane interface ASICs.]
MDS 9506 and 9509 Crossbar Architecture
The MDS 9000 Series Multilayer switches were designed from the ground up around a collection of sophisticated application-specific integrated circuit (ASIC) chips. Unlike other fabric vendors, whose products are largely based on single-ASIC designs, the MDS 9000 Family of products is far more powerful and flexible, both in terms of the features supported today and the ability to evolve and grow with customers' changing needs. Because of its sophisticated, modular, multi-ASIC design, the MDS 9000 Series is capable of supporting many protocols and services, including Fibre Channel, FCIP, iSCSI, FICON, and virtualization services, all concurrently and in the same chassis. The hot-swappable, modular line card design provides a high degree of flexibility and allows for ease of expansion. Each line card in the MDS 9500 Series director has redundant high-speed paths across the backplane to the high-performance crossbar fabrics located on the redundant supervisor modules, providing a "five-nines" level of availability. Although one supervisor is active and the other standby, both crossbars are always active and capable of routing frames. The graphic illustrates a block diagram of a possible switch implementation, including:
Dual supervisor modules (Supervisor-1 or Supervisor-2) containing crossbar, microprocessor, flash memory, console and Ethernet interfaces
An FC line card capable of supporting Fibre Channel and FICON protocols. Examples of this are the 16-port and 32-port line cards.
An IP Services line card capable of supporting IP storage services and protocols like FCIP and iSCSI
An MPS 14+2 line card with 14 FC ports and two GigE ports supporting iSCSI and three FCIP tunnels per GigE port.
An SSM line card, capable of performing virtualization services, snapshots, replication, and SCSI third-party copy services to support NASB (Network-Accelerated Serverless Backup)
Frames arriving at an interface are decoded, conditioned, possibly virtualized, and passed to the forwarding ASIC (F), then stored in the appropriate Virtual Output Queue (Q) until the arbiter (A) decides that a credit is available at the destination port and the frame can continue its journey. The frame leaves the VOQ, passes through the up interface (I/F) and across one of the crossbars, and travels down to the destination line card and straight out of the appropriate interface. Notice that all line cards have an identical architecture from the F ASIC upward, so all frames crossing the crossbar have already been conditioned and processed and have an identical structure, regardless of the underlying protocol (FC, FICON, iSCSI, or FCIP). The internal architecture of the MDS 9216 is very similar to that of the MDS 9500 in that many of the same internal components are used. There are, however, several key differences:
The MDS 9216 has a fixed non-redundant supervisor card that provides arbitration and supports a single modular card, although it can be of any type.
The MDS 9216 does not use a crossbar fabric. In a two-slot design there is no need for, or advantage to, a switching crossbar. Instead, the two cards are connected to each other through the high-speed backplane, and to themselves through an internal loopback interface.
MDS 9506 & 9509 Crossbar Architecture
[Figure: each line card slot has 80 Gbps of bandwidth, carried over 20-Gbps paths to the two supervisor modules; each supervisor module contains a 720-Gbps crossbar.]
Each supervisor module has an onboard 720-Gbps crossbar: 360 Gbps transmit (Tx) and 360 Gbps receive (Rx). Therefore, in a dual-supervisor installation the MDS 9000 system has an aggregate total bandwidth of up to 1.44 Tbps. Each installed line card in a dual-supervisor configuration has 80 Gbps of bandwidth available to the supervisor crossbars: each card connects through dual 20-Gbps paths to each supervisor crossbar, and each path is 20 Gbps in each direction. Data is load-shared across both crossbars when dual supervisor modules are installed. Both crossbars are active-active, and frames from a line card travel across one crossbar or the other. The arbiter function schedules frame delivery at over 1 billion frames per second and routes each frame over one of the two crossbars.
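The arithmetic above can be summarized in a short sketch (Python, using only the figures quoted in this section):

```python
# Bandwidth figures for the MDS 9506/9509 as quoted above (illustrative arithmetic only).

CROSSBAR_PER_SUPERVISOR_GBPS = 720        # 360 Gbps Tx + 360 Gbps Rx
SUPERVISORS = 2

aggregate_gbps = CROSSBAR_PER_SUPERVISOR_GBPS * SUPERVISORS
print(f"aggregate crossbar bandwidth: {aggregate_gbps / 1000:.2f} Tbps")   # 1.44 Tbps

PATHS_PER_SUPERVISOR = 2                  # dual 20-Gbps paths to each supervisor
PATH_GBPS = 20

per_slot_gbps = PATHS_PER_SUPERVISOR * PATH_GBPS * SUPERVISORS
print(f"bandwidth per line card slot:  {per_slot_gbps} Gbps")              # 80 Gbps
```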
MDS 9513 Crossbar Architecture
[Figure: eleven line card slots (Gen-1 line cards at 80 Gbps, Gen-2 line cards at 100 Gbps) connect over 25-Gbps channels to two redundant Crossbar Fabric modules; total crossbar fabric bandwidth is 2.2 Tbps.]
MDS 9513 Crossbar Architecture
Only Supervisor-2 modules can be fitted to the MDS 9513. Supervisor-2 modules also have integral crossbars, but these are not used when the modules are installed in the MDS 9513. Instead, dual redundant Crossbar Fabric modules, situated at the rear of the chassis, provide a total aggregate bandwidth of 2.2 Tbps. Each line card has four 25-Gbps channels connecting to the Crossbar Fabric modules, providing a total of 100 Gbps per slot in each direction; eleven line card slots at 100 Gbps transmit plus 100 Gbps receive gives the 2.2-Tbps aggregate. Both crossbars are active-active, and frames from a line card travel across one crossbar or the other. The arbiter function schedules frame delivery at over 1 billion frames per second and routes each frame over one of the two crossbars.
MDS 9513 System Diagram
[Figure: two Crossbar Fabric modules, each with its own central arbiter, connect to the line cards over pairs of 50-Gbps channel groups (each channel is 25 Gbps). The diagram shows the line card types (48-port OSM, 24-port, 16-port, 12-port FRM, and 4-port 10-Gbps modules) with their MAC/PHY, forwarding, TCAM, buffering, VOQ, and up/down crossbar interface ASICs.]
Each of the eleven line card slots on the MDS 9513 has two 2.5-Gbps serial links to each Arbiter ASIC. In addition, each supervisor slot has one, making a total of 24 2.5-Gbps serial links to the Arbiter ASICs. These links are used to communicate with the central arbiter to request and grant permission for a frame to cross the crossbar. Each line card Fwd/VOQ ASIC is connected to each of the Crossbar Fabrics via a pair of dual redundant 25-Gbps channels, providing a total of 50 Gbps to each crossbar. A second dual redundant 50-Gbps pair of channels provides the return path from the Crossbar Fabric to the other Fwd/VOQ ASIC. Each channel comprises eight 3.125-Gbps serial links for transmit and eight for receive. Frames arrive at the line card MAC/PHY interface and are forwarded to the Fwd/VOQ ASIC, where they are stored in a buffer and associated with a destination VOQ. The Fwd/VOQ ASIC requests permission from the arbiter to deliver a frame to the destination port. When the arbiter has received a credit from the destination device, it grants permission for the frame to be sent across one of the crossbar fabrics. When permission is granted, the frame leaves the VOQ in the Fwd/VOQ ASIC, travels along one of the 25-Gbps channels to one of the Crossbar Fabrics, then returns via one of the 25-Gbps return channels and out through the MAC/PHY ASIC on the appropriate line card. All frames travel across the crossbar fabric, regardless of where the source and destination ports are located on the ASICs or line cards. This provides consistent latency of approximately 20 µs per frame and minimizes the jitter that can occur in other vendors' products.
MDS 9513 Crossbar Fabric
• Redundant crossbar fabric
  – Active/active operation balances the load across both crossbars
  – Rapid failover in case of failure ensures no loss of frames
• High-bandwidth, non-blocking architecture
  – Each Crossbar Fabric module supports dual 25-Gbps channels from each line card
  – Total crossbar bandwidth = 2.2 Tbps
  – A single crossbar fabric still provides sufficient bandwidth for all line cards
• High-performance centralized architecture
  – Ensures consistent latency across the switch
  – Supports up to 1024 indexes (destination interfaces)
  – Enhanced high-performance arbiter schedules frames at over 1 billion per second
MDS 9513 Crossbar Fabric Both MDS 9513 Crossbar Fabric modules are located at the rear of the chassis and provide a total aggregate bandwidth of 2.2 Tbps. Each fabric module is connected to each of the line cards via dual redundant 25Gbps channels making a total of 100Gbps per slot. A single fabric crossbar module can support full bandwidth on all connected ports in a fully loaded MDS 9513 without blocking. The arbiter schedules frames at over 1 billion frames per second, ensuring that blocking will not occur even when the ports are fully utilized.
Hot-Swappable Supervisors
• Dual supervisors
  – Active and standby
  – Hot-swappable
  – Stateful standby keeps in sync with all major management and control protocols of the active supervisor
• Non-disruptive upgrades
  – Load and activate new software without disrupting traffic
  – Standby supervisor maintains the previous version of code while the active supervisor is updated
The Cisco MDS 9500 Series of Multilayer Directors supports two Supervisor modules in the chassis for redundancy. Each Supervisor module consists of a Control Engine and a Crossbar Fabric. The Control Engine is the central processor responsible for the management of the overall system. In addition, the Control Engine participates in all of the networking control protocols including all Fibre Channel services. In a redundant system, two Control Engines operate in an active/standby mode. The Control Engine that is in standby mode is actually in a stateful-standby mode such that it keeps sync with all major management and control protocols that the active Control Engine maintains. While the standby Control Engine is not actively managing the switch, it continually receives information from the active Control Engine. This allows the state of the switch to be maintained between the two Control Engines. Should the active Control Engine fail, the secondary Control Engine will seamlessly resume its function. The Crossbar Fabric is the switching engine of the system. The crossbar provides a high speed matrix of switching paths between all ports within the system. A crossbar fabric is embedded within each Supervisor module. The two crossbar fabrics operate in a load-shared active-active mode. Each crossbar fabric has a total switching capacity of 720 Gbps and serves 80 Gbps of bandwidth to each slot on MDS 9506 and 9509. Since each switching module of the Cisco MDS 9506 or 9509 does not consume more than 80 Gbps of bandwidth to the crossbar, the system will operate at full performance even with one Supervisor module. In a fully populated MDS 9500, the system will not experience any disruption or any loss of performance with the removal or failure of one Supervisor module. The Supervisor Module is a hot swappable module. In a dual Supervisor module system this allows the module to be removed and replaced without causing disruption to the rest of the system.
Supervisor-2 Module Features
• High-performance integrated crossbar
  – Active when installed in an MDS 9506 or MDS 9509 chassis
  – Bypassed when installed in an MDS 9513 chassis
  – Supports up to 48 Gbps of front-panel bandwidth per slot
• Enhanced crossbar arbiter
  – 1024 destination indexes per chassis
  – Supports mix and match of Gen-1 and Gen-2 modules
  – Up to 252 ports when any Gen-1 modules are present
  – Up to 528 ports when only Gen-2 modules are present
• PowerPC management processor
  – Provides increased performance and lower power consumption versus Supervisor-1
• Front-panel interfaces
  – Console port, 10/100/1000 management Ethernet port, COM1 port, CompactFlash slot, two USB ports
• MDS 9513 requires Supervisor-2
Supervisor-2 Module Features
Supervisor-2 is an upgraded version of Supervisor-1 with additional Flash, RAM, and NVRAM memory and a redundant BIOS. It can be used in any MDS 9500 chassis: 9506, 9509, or 9513. All frames pass directly from the line card ASICs across the crossbar and out to their destination interfaces; frame flow is not regulated by the supervisor. Supervisor-2 uses a new PowerPC management processor to provide FC services to connected devices, for example FSPF, zoning, the Name Server, the FLOGI server, security, VSANs, and IVR. When used in an MDS 9506 or 9509, the integral crossbar is used. When used in the MDS 9513, the integral crossbar is bypassed and the Crossbar Fabric modules are used instead. Supervisor-2 supports 1024 destination indexes, providing up to 528 ports in the MDS 9513 when only Gen-2 modules are used. If any Gen-1 modules are installed in an MDS 9513, only 252 ports can be used.
Switch CPU Resources—Critical for Scalability • FC Switch CPU and memory resources are critical to SAN scalability and resiliency • Cisco has addressed this need with powerful processing capabilities in the MDS 9000 Family • Inadequate CPU resources can have major adverse effects on SAN operations SAN Scalability – Additional CPU resources required for each new neighbor switch, logged-in device, propagated zone set, RSCN-registered device, etc. SAN Resiliency – Without adequate CPU resources, fabric reconvergence from a fault can take excessive time or even fail altogether as high number of computations are required. SAN Security – Additional CPU resources are required for security features such as SSH and SSL access, encryption, FC-SP fabric authentication, and port binding
Switch CPU Resources
All Fibre Channel switches must provide a number of distributed services for their connected devices, including the distributed Name Server, Login Server, Time Server, Management Server, FSPF, and Zone Server. These services are provided by the active supervisor and must respond to requests in a timely manner (the faster the better); otherwise the fabric may appear unresponsive and, in extreme conditions, may 'hang' for a period of time. For this reason, the MDS contains processors with many times the performance of competitive FC switches. The very nature of a SAN is to fan out the connectivity of fewer storage subsystems to numerous server connections. While performance is important, so are the capabilities of a switched fabric to provide services, including congestion avoidance, preferential services, and blocking avoidance. Cisco provides a full SAN switching product line with the Cisco MDS 9000 Series, a line that is optimized to build scalable SAN infrastructures and to provide industry-leading performance, resiliency, security, and manageability. Independent switch performance validation testing has shown that the Cisco MDS Family's performance capabilities consistently outperform competing products.
Oversubscription and Bandwidth Reservation
Oversubscription Overview
• Each device FC port must be connected to a single FC switch port, but not all devices can utilize the full bandwidth available to them
• A 2-Gbps FC port can provide 200 MB/s of bandwidth
• Servers rarely require more than 25 MB/s
• Oversubscription allows several devices to share the available bandwidth
• ISL oversubscription is typically 7:1
• 200 MB/s shared by 7 servers = approximately 28 MB/s average per server
[Figure: seven HBA-attached servers on one switch share a 2-Gbps ISL (200 MB/s of ISL bandwidth) to the switch connecting the storage.]
Oversubscription Overview
Fibre Channel standards dictate that, in a fabric topology, each attached FC device port must be attached to its own dedicated FC switch port. Today's switch ports support 1-Gbps, 2-Gbps, 4-Gbps, and 10-Gbps speeds, but the connected devices cannot usually utilize the full bandwidth available to them. A 2-Gbps port can provide 200 MB/s of bandwidth in each direction, a total of 400 MB/s per port. Servers often have internal bandwidth limitations, and applications rarely require more than 25 MB/s today. This is changing with the introduction of PCI Express motherboards, which replace the old parallel PCI bus with multiple 2.5-Gbps serial channels to each slot; if the application is capable of demanding it, a PCI Express slot can fully utilize a 2-Gbps Fibre Channel port. However, most servers today require less than 25 MB/s. Oversubscription allows several devices to share the available bandwidth. ISL oversubscription is typically 7:1, with seven servers sharing the total bandwidth of a 2-Gbps FC port: 200 MB/s shared by 7 servers is approximately 28 MB/s average per server.
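A small sketch of this arithmetic (Python; the 7-server, 2-Gbps ISL figures come from the example above, and the helper function itself is purely illustrative):

```python
# Illustrative oversubscription arithmetic for a shared ISL.

def avg_bandwidth_per_host(link_mb_per_s: float, hosts: int) -> float:
    """Average bandwidth each host sees if all hosts drive the link equally."""
    return link_mb_per_s / hosts

ISL_MB_PER_S = 200      # one direction of a 2-Gbps FC link
HOSTS = 7               # typical 7:1 ISL oversubscription

print(f"oversubscription ratio: {HOSTS}:1")
print(f"average per host:       {avg_bandwidth_per_host(ISL_MB_PER_S, HOSTS):.1f} MB/s")  # ~28.6 MB/s
```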
16-Port Full-Rate Mode Line Card
• The 16-port line card has no oversubscription
  – Four port groups of four ports; each port group shares 10 Gbps of bandwidth
  – 4 × 1 Gbps = 4 Gbps (1:1); 4 × 2 Gbps = 8 Gbps (1:1)
• Supports full-rate mode (FRM) at 1 Gbps or 2 Gbps; use when the device requires full bandwidth at up to 200 MB/s
• Suitable for storage arrays and ISLs between switches
• Up to 255 buffer credits per FC interface, fully configurable, plus 145 performance buffers per port
  – Default 255 credits for E_Ports, 16 credits for Fx_Ports
The 16-Port Full-Rate Mode Line Card
The 16-port line card operates in full-rate mode. Each port on the line card can deliver up to 2 Gbps. There are four ports in a port group, so the total bandwidth requirement could be 8 Gbps per port group. The internal path to the forwarding ASIC provides 10 Gbps, so more than enough bandwidth is available. The 16-port line card is suitable for any device that requires the full 2-Gbps bandwidth, for example storage arrays or ISLs to other switches. Each port has up to 255 configurable buffer credits. By default, an E_Port or TE_Port is allocated the full 255 credits; F_Ports are allocated 16 credits but may be configured with up to 255. An additional 145 performance buffers per port are available when required.
32-Port Oversubscribed Mode Line Card
• The 32-port line card has limited internal bandwidth
  – Eight port groups of four ports; each port group shares 2.5 Gbps of bandwidth
  – 4 × 1 Gbps = 4 Gbps (1.6:1); 4 × 2 Gbps = 8 Gbps (3.2:1, since 8 / 2.5 = 3.2)
• Oversubscribed by design at 3.2:1 at 2 Gbps
• Provides twice as many ports for approximately the same price
• Suitable for connecting servers that require less than 62 MB/s average bandwidth
• 12 buffer credits (fixed) per FC interface
The 32-Port Oversubscribed Mode Line Card
The 32-port line card is designed to provide twice as many ports (32) for nearly the same price as a 16-port line card; to make space, some internal components are removed. The 32-port line card operates in oversubscribed mode. Each port on the line card can deliver 2 Gbps. There are four ports in a port group, so the total bandwidth requirement could be 8 Gbps per port group. However, the internal path to the forwarding ASIC provides only 2.5 Gbps, so four ports must share 2.5 Gbps, or 250 MB/s. With 250 MB/s shared by four ports, each device should not exceed 62.5 MB/s on average; however, one port could be operating at 100 MB/s, another at 20 MB/s, another at 60 MB/s, and another at 70 MB/s, provided the total group bandwidth does not exceed 250 MB/s. The 32-port line card is suitable for any device that does not require the full 2-Gbps bandwidth, for example servers and tape drives that demand less than 62 MB/s average bandwidth. Each port has only 12 buffer credits, which are not configurable.
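The per-port-group budget described above can be checked with a short sketch (Python; the 250 MB/s group limit comes from the text, and the example port loads are hypothetical):

```python
# Check whether a set of per-port loads fits within a 32-port-module port group.

GROUP_LIMIT_MB_PER_S = 250          # 2.5 Gbps shared by one 4-port group

def group_fits(port_loads_mb_per_s):
    """True if the combined load of the group's ports stays within the shared path."""
    return sum(port_loads_mb_per_s) <= GROUP_LIMIT_MB_PER_S

print(group_fits([100, 20, 60, 70]))    # True  (total 250 MB/s, exactly at the limit)
print(group_fits([100, 100, 60, 70]))   # False (total 330 MB/s, group is over budget)
```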
Generation-2 Fibre Channel Modules
Four modules address key SAN consolidation requirements:
• 12-Port 1/2/4-Gbps FC Module: full-rate 4-Gbps performance for ISLs and the highest-performance server and tape applications
• 24-Port 1/2/4-Gbps FC Module: full-rate 2-Gbps performance for enterprise storage connectivity and high-performance server applications; maximum 2:1 subscription ratio with all ports active at 4 Gbps
• 48-Port 1/2/4-Gbps FC Module: shared-bandwidth 2-Gbps performance for mainstream server applications
• 4-Port 10-Gbps FC Module: full-rate 10-Gbps performance for ISL consolidation and high-bandwidth metro connectivity

Oversubscription by module and port speed:
FC Line Card   1 Gbps   2 Gbps   4 Gbps   10 Gbps
12-Port        1:1      1:1      1:1      N/A
24-Port        1:1      1:1      2:1      N/A
48-Port        1:1      2:1      4:1      N/A
4-Port         N/A      N/A      N/A      1:1
Generation-2 Fibre Channel Modules Four new second-generation modules provide much more flexibility when configuring ports.
12-port 1/2/4Gbps Fibre Channel module providing 4Gbps full rate bandwidth on every port.
24-port 1/2/4Gbps Fibre Channel module providing 4Gbps at 2:1 oversubscription and full rate bandwidth on each port at 1Gbps and 2Gbps.
48-port 1/2/4Gbps Fibre Channel module providing 4Gbps at 4:1 oversubscription and 2Gbps at 2:1 oversubscription and full rate bandwidth at 1Gbps on each port.
4-port 10Gbps Fibre Channel module providing 10Gbps full rate bandwidth on every port.
10G modules use 64b/66b encoding that is incompatible with modules operating at 1/2/4Gbps using 8b/10b encoding.
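The oversubscription ratios in the table follow directly from the port-group geometry described in the pages that follow: three, six, or twelve ports per group, each group sharing 12 Gbps of bandwidth. A short sketch of that arithmetic (Python; the port-group figures are taken from this lesson's slides, the helper is illustrative):

```python
# Derive Gen-2 module oversubscription ratios from ports-per-group and group bandwidth.

GROUP_BANDWIDTH_GBPS = 12   # shared bandwidth per port group on Gen-2 1/2/4-Gbps modules

def oversubscription(ports_in_group: int, port_speed_gbps: int) -> float:
    demand = ports_in_group * port_speed_gbps
    return max(demand / GROUP_BANDWIDTH_GBPS, 1.0)   # never better than 1:1

for module, ports_in_group in (("12-port", 3), ("24-port", 6), ("48-port", 12)):
    ratios = [f"{oversubscription(ports_in_group, s):g}:1" for s in (1, 2, 4)]
    print(module, "at 1/2/4 Gbps ->", ", ".join(ratios))
# 12-port -> 1:1, 1:1, 1:1   24-port -> 1:1, 1:1, 2:1   48-port -> 1:1, 2:1, 4:1
```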
Port Groups on Generation-2 Line Cards
• Each line card has four port groups, denoted by screen-printed borders
• Each port group has 12 Gbps of shared bandwidth
• Ports can be configured to have dedicated bandwidth (1 Gb / 2 Gb / 4 Gb)
• Remaining ports share the unused bandwidth
[Figure: port-group borders shown on the 48-port, 24-port, and 12-port modules.]
Port Groups Each port group is clearly marked on the line cards with screen-printed borders. Each port group has 12Gbps of internal bandwidth available. Any port can be configured to have dedicated bandwidth at 1Gbps, 2Gbps or 4Gbps. All remaining ports in the port group share any remaining unused bandwidth. Any port in dedicated bandwidth mode has access to extended buffers. Any port in shared bandwidth mode has only 16 buffer credits.
12-Port Full-Rate Mode Line Card
• 3 ports per port group; each port group shares 12 Gbps of bandwidth
  – 3 × 1 Gbps = 3 Gbps (1:1); 3 × 2 Gbps = 6 Gbps (1:1); 3 × 4 Gbps = 12 Gbps (1:1)
• Full-rate mode (FRM) at 1/2/4 Gbps
• Suitable for 4-Gbps storage array ports
• Suitable for ISLs between switches (16 × 4 Gbps = 64-Gbps PortChannel)
The 12-Port Full-Rate Mode Line Card
The 12-port line card module operates in full-rate mode. Each port on the line card can deliver up to 4 Gbps. There are three ports in a port group, so the total bandwidth requirement could be 12 Gbps per port group. Each port group shares 12 Gbps of internal bandwidth, so full bandwidth is available to every port. The 12-port line card is suitable for any device that requires the full 4-Gbps bandwidth, for example storage arrays or ISLs to other switches. Each port has up to 250 configurable buffer credits. By default, an E_Port or TE_Port is allocated the full 250 credits; F_Ports are allocated 16 credits but may be configured with up to 250. An additional 2488 extended buffers and 512 performance buffers are available per module to ports configured in dedicated mode. Also, 144 proxy buffers are available per module.
24-Port Oversubscribed Mode Line Card
• 6 ports per port group; each port group shares 12 Gbps of bandwidth
  – 6 × 1 Gbps = 6 Gbps (1:1); 6 × 2 Gbps = 12 Gbps (1:1); 6 × 4 Gbps = 24 Gbps (2:1)
• Full-rate mode at 1 and 2 Gbps
• 2:1 oversubscription at 4 Gbps
• Suitable for storage arrays that require less than 200 MB/s of bandwidth and for ISLs between switches
The 24-Port Oversubscribed Mode Line Card
The 24-port line card module operates in oversubscribed mode. Each port on the line card can deliver up to 4 Gbps. There are six ports in a port group, so the total bandwidth requirement could be 24 Gbps per port group. Each port group shares 12 Gbps of internal bandwidth, so with every port at 4 Gbps the ports are up to 2:1 oversubscribed; at 1 Gbps and 2 Gbps there is enough bandwidth to provide all ports with full bandwidth. The 24-port line card module is suitable for any device that requires less than 200 MB/s of bandwidth, for example storage arrays or ISLs to other switches. Each port has up to 250 configurable buffer credits. By default, an E_Port or TE_Port is allocated the full 250 credits; F_Ports are allocated 16 credits but may be configured with up to 250. Also, 144 performance buffers are available per module.
48-Port Oversubscribed Mode Line Card
• 12 ports per port group; each port group shares 12 Gbps of bandwidth
  – 12 × 1 Gbps = 12 Gbps (1:1); 12 × 2 Gbps = 24 Gbps (2:1); 12 × 4 Gbps = 48 Gbps (4:1)
• Full-rate mode at 1 Gbps
• 2:1 oversubscription at 2 Gbps
• 4:1 oversubscription at 4 Gbps
• Suitable for servers that require less than 100 MB/s average bandwidth
The 48-Port Oversubscribed Mode Line Card
The 48-port line card module operates in oversubscribed mode. Each port on the line card can deliver up to 4 Gbps. There are twelve ports in a port group, so the total bandwidth requirement could be 48 Gbps per port group. The internal path to the forwarding ASIC provides 12 Gbps, so with every port at 4 Gbps the ports are up to 4:1 oversubscribed, and at 2 Gbps they are 2:1 oversubscribed; at 1 Gbps there is enough bandwidth to provide all ports with full bandwidth. The 48-port line card module is suitable for any device that requires less than 100 MB/s of bandwidth, for example servers or tape drives that require less than 100 MB/s on average. Each port has up to 250 configurable buffer credits. By default, an E_Port or TE_Port is allocated the full 250 credits; F_Ports are allocated 16 credits but may be configured with up to 250. Also, 144 performance buffers are available per module.
4-Port 10-Gbps Full-Rate Mode Line Card
• 1 port per port group; each port group shares 12 Gbps of bandwidth
  – 1 × 10 Gbps = 10 Gbps (1:1)
• Full-rate mode (FRM) at 10 Gbps
• Suitable for ISLs between switches: 4 ports × 10 Gbps = a 40-Gbps PortChannel per card, and 4 cards × 4 ports × 10 Gbps = a 160-Gbps PortChannel between switches
The 4-Port 10-Gbps Full-Rate Mode Line Card
The 4-port 10-Gbps FC line card module operates in full-rate mode. Each port on the line card can deliver up to 10 Gbps. Each 10-Gbps port has its own port group, so the total bandwidth requirement could be 10 Gbps per port group. The internal path to the forwarding ASIC provides 12 Gbps, so more than enough bandwidth is available. The 4-port 10-Gbps FC line card is suitable for any connection that requires the full 10-Gbps bandwidth, for example ISLs to other switches. Up to 16 ports (across four 10-Gbps FC line cards) may be placed in a PortChannel, providing up to 160 Gbps of PortChannel bandwidth. Each port has up to 250 configurable buffer credits. By default, an E_Port or TE_Port is allocated the full 250 credits; F_Ports are allocated 16 credits but may be configured with up to 250. An additional 2488 extended buffers and 512 performance buffers are available per module to ports configured in dedicated mode. Also, 144 proxy buffers are available per module.
Port Bandwidth Reservation
• Second-generation modules only
• Allows for greater flexibility in deploying oversubscribed modules
• Ports within a port group can be allocated 1, 2, or 4 Gbps of guaranteed bandwidth
  – The interface port mode becomes "dedicated" at 1, 2, or 4 Gbps, with 2-250 credits
  – Shared ports have 16 BB_Credits
• Other ports share the unused bandwidth
• Ports can be taken out of service to release credits and resources
Example (24-port FC module with 12 Gbps of shared bandwidth per port group): dedicating one port to 4 Gbps, dedicating another port to 2 Gbps, and taking one port out of service leaves 6 Gbps of unused bandwidth shared by the remaining ports in the group.
Port Bandwidth Reservation
Bandwidth reservation provides maximum flexibility when configuring ports on second-generation modules. Any port in a port group can be allocated 1 Gbps, 2 Gbps, or 4 Gbps of dedicated bandwidth, and all remaining ports in the port group share any remaining unused bandwidth. Ports in dedicated bandwidth mode have access to a pool of 2488 extended buffers and 512 performance buffers; ports in shared bandwidth mode have only 16 buffer credits. Ports can be taken out of service to release credits and resources to the remaining ports in the port group, as the short sketch below illustrates.
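A minimal sketch of the bandwidth-reservation arithmetic for one port group (Python; the 12-Gbps group size and the example allocations come from the slide, the function itself is illustrative):

```python
# Remaining shared bandwidth in one Gen-2 port group after dedicating some ports.

GROUP_BANDWIDTH_GBPS = 12

def shared_bandwidth_left(dedicated_gbps_per_port):
    """Bandwidth left for the shared-rate ports after dedicated reservations."""
    return GROUP_BANDWIDTH_GBPS - sum(dedicated_gbps_per_port)

# Slide example on a 24-port module: one port dedicated at 4 Gbps, one at 2 Gbps,
# one port taken out of service (reserves nothing), the rest in shared mode.
print(shared_bandwidth_left([4, 2]))   # 6 Gbps shared by the remaining ports
```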
Best Practice for Configuring Ports
• Shared to dedicated: configure in this order: speed, rate-mode, mode, credit
• Dedicated to shared: configure in this order: credit, rate-mode, speed, mode
Port-mode restrictions:
• Auto/E mode cannot be configured in shared rate-mode
• FL mode is not supported on the 4-port 10-Gbps module
• TL mode is not supported on any Generation-2 module
Line Card Default Configurations

Line Card      Speed (Gbps)   Rate-mode    Port Mode
16-port        Auto 1/2       Dedicated    Auto (Fx/E/TE)
32-port        Auto 1/2       Shared       Auto (Fx)
12-port        Auto 1/2/4     Dedicated    Auto (Fx/E/TE)
24-port        Auto 1/2/4     Shared       Fx
48-port        Auto 1/2/4     Shared       Fx
4-port 10Gb    10 Gb only     Dedicated    Auto (Fx/E/TE)

• 10-Gbps and 1/2/4-Gbps ports cannot be mixed
  – 1/2/4 Gbps uses 8b/10b encoding
  – 10 Gbps uses 64b/66b encoding
Line Card Default Configurations
Line cards operate in two different rate modes, dedicated and shared. All ports on dedicated rate-mode line card modules (the 16-port, 12-port, and 4-port 10-Gbps FC modules) have access to full bandwidth per port. All ports on shared rate-mode line card modules (the 32-port, 24-port, and 48-port modules) share bandwidth across a port group (12 Gbps per group on the Gen-2 modules); any port can be configured with dedicated bandwidth, and all remaining ports in the port group share any remaining unused bandwidth.
Recommended Uses of FC Switch Line Card Modules
[Figure: three topologies compared.]
• Traditional core-edge topology: 32-port edge switches with typical 15:1 or higher ISL oversubscription
• MDS 9000 core-edge topology: same ISL oversubscription; MDS 9500 core with FRM cards, MDS 9216 edge with integrated FRM ports and an additional OSM line card; ISL oversubscription >> line card oversubscription; 4-port 10-Gb FRM line cards for core ISLs, 12/16-port FRM line cards for storage and ISLs
• MDS 9000 collapsed core: MDS 9500 with mixed cards; 24/32/48-port OSM line cards for host/tape connectivity; lower oversubscription than core-edge
Recommended Uses of FC Switch Line Card Modules Use Full-Rate Mode line cards for:
Storage connectivity
ISLs
Core switches (if deploying a core-edge topology)
Use Oversubscribed Mode line cards to reduce the cost of deploying:
Server connectivity
Tape connectivity
Edge switches (if deploying a core-edge topology)
The Oversubscribed Mode line cards are designed to allow cost-effective consolidation of a core-edge topology into a collapsed core:
The Oversubscribed Mode line cards serve the function of the edge switches. In core-edge topologies, the oversubscription of the ISLs between the core and edge switches is significantly greater than the oversubscription of the MDS 9000 Oversubscribed Mode line cards. In other words, a collapsed-core topology with Oversubscribed Mode line cards has less oversubscription than a typical core-edge topology.
The Full-Rate Mode line cards are used for ISLs and storage connectivity, where oversubscription is not desirable.
In a core-edge topology, at least one Full-Rate Mode line card is typically deployed in each edge switch for ISLs to the core.
Gen-2 shared-bandwidth line cards allow SAN engineers to tune the performance required per end device.
Credits and Buffers
Second-Generation Module Credits and Buffers
• Buffer-to-buffer credits
  – Up to 250 buffer credits per port
  – E_Port default = 250; Fx_Port default = 16
• 6144 credits shared across the module in dedicated rate-mode
  – Availability depends on rate-mode and port-mode
  – Example: on the 48-port line card, all interfaces configured with 125 credits (48 × 125 = 6000), or 46 interfaces with 120 plus 2 interfaces with 240 (46 × 120 = 5520, plus 480 = 6000)
• A maximum of 16 credits can be configured in shared rate-mode
• Performance buffers
  – Up to 145 extra buffer credits per port
  – Shared among all ports in the module (not guaranteed)
  – Supported on the FRM 12-port 4-Gbps and 4-port 10-Gbps line cards
• More credits can be shared by taking interfaces out of service
Buffer-to-buffer credits:
• Depend on rate-mode and port-mode
• A maximum of 16 credits can be configured in shared rate-mode
• Approximately 6000 credits are shared across the module in dedicated rate-mode
  – Example: in a 48-port module, all interfaces configured with 125 credits, or 40 interfaces with 120 each plus 2 interfaces with 225
Performance buffers:
• Min/Max/Default – 1/145/145
• Shared among all ports in the module – not guaranteed
• Supported on the 12-port 4-Gbps and 4-port 10-Gbps modules
Credits can be shared by taking interfaces out of service.
Second-Generation Buffer Credit Allocation
• Total of 6144 buffers per line card module
• User credits may be configured in the range 2 to 250
• Default 16 credits for Fx ports and 250 credits for E/TE ports
• Ports configured in dedicated mode may use extended buffer credits
[Chart: breakdown of the 6144 buffers per module. Every module reserves 144 proxy/reserved buffers. The dedicated-mode modules (4-port 10-Gbps and 12-port 1/2/4-Gbps) also carry 512 performance buffers and 2488 extended buffer credits (dedicated mode only), with user-configurable credits of up to 250 per port (4 × 250 on the 4-port module, 12 × 250 = 3000 on the 12-port module). The shared-mode modules carry 6000 user-configurable credits (24 × 250 on the 24-port module, 48 × 125 on the 48-port module).]
Second-Generation Buffer Credit Allocation
Each second-generation line card module has a total of 6144 buffers available. By default, Fx ports are allocated 16 buffer credits and E/TE ports are allocated 250 credits; however, any port can be configured with between 2 and 250 buffer credits. The 4-port 10-Gbps and 12-port FC modules operate in dedicated rate mode and have access to an additional 2488 extended buffer credits and 512 performance buffers shared across all ports in the module; each port can be configured with a maximum of 145 additional performance buffers. The 24-port module operates in shared rate mode, and each port can be configured with between 2 and 250 buffer credits. The 48-port module also operates in shared rate mode, and each port can be configured with between 2 and 125 buffer credits.
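A small sketch of how a proposed per-port credit plan can be checked against a module's user-configurable pool (Python; the 6000-credit pool and the example allocations come from the figures above, the helper function is illustrative):

```python
# Check a per-port BB_credit plan against the user-configurable pool of a Gen-2 module.

USER_POOL = 6000            # user-configurable credits quoted for the 24/48-port modules
PER_PORT_MAX = 250          # 125 on the 48-port module

def plan_fits(credits_per_port, per_port_max=PER_PORT_MAX, pool=USER_POOL):
    return all(2 <= c <= per_port_max for c in credits_per_port) and sum(credits_per_port) <= pool

print(plan_fits([125] * 48, per_port_max=125))   # True: 48 x 125 = 6000
print(plan_fits([120] * 46 + [240] * 2))         # True: 5520 + 480 = 6000
print(plan_fits([250] * 25))                     # False: 6250 exceeds the pool
```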
Best Practices
• Match the switch port speed to the connected device
  – Configure the port speed to 1, 2, or 4 Gbps, or set auto-sensing with a maximum of 2 Gbps
  – Configuring 4 Gbps reserves 4 Gbps of bandwidth regardless of the autonegotiated port speed
• Configure the rate mode: dedicated or shared
  – Dedicated ports reserve bandwidth; shared ports share the remaining bandwidth
• Configure the port mode: F, FL, E, TE, TL, SD, ST
• Configure buffer-to-buffer credits
• Take any unused interfaces out of service
  – This frees up resources and any spare credits or bandwidth
Best Practices for Configuring Second-Generation Line Cards
1. Match the switch port speed to the port speed of the connected device and lock down the port speed to 1 Gbps, 2 Gbps, or 4 Gbps.
2. Alternatively, configure the port in auto-sense mode with a maximum of 2 Gbps; it will connect at 1 Gbps or 2 Gbps.
3. Configure the rate mode for the port, dedicated or shared. Dedicated ports have dedicated 1-, 2-, or 4-Gbps bandwidth; shared ports share any unused bandwidth left over in the port group.
4. Configure the port mode: F, FL, E, TE, TL, SD, or ST.
5. Configure buffer-to-buffer credits. One buffer credit is required for each 1 km of link distance at 2 Gbps with a 2-KB frame payload.
6. Take any unused ports out of service to free up resources and any spare credits or bandwidth.
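Step 5's rule of thumb (one credit per kilometer at 2 Gbps with full-size frames) can be generalized with a short sketch (Python). The scaling to other speeds is an assumption based on credits being proportional to link speed; it is not a figure stated in the course:

```python
import math

# Rule-of-thumb BB_credit estimate for a long-distance link.
# Baseline from the text: 1 credit per km at 2 Gbps with a 2-KB frame payload.
# Scaling linearly with speed is an assumption (twice the speed -> twice the frames in flight).

def bb_credits_needed(distance_km: float, speed_gbps: float = 2.0) -> int:
    credits_per_km = speed_gbps / 2.0      # 1.0 at 2 Gbps, 2.0 at 4 Gbps, 0.5 at 1 Gbps
    return math.ceil(distance_km * credits_per_km)

print(bb_credits_needed(50))               # 50 credits for a 50-km link at 2 Gbps
print(bb_credits_needed(50, speed_gbps=4)) # 100 credits at 4 Gbps (assumed scaling)
```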
Lesson 4
The Multilayer SAN
Overview
In this lesson, you will learn how to build scalable intelligent SAN fabrics using VSANs, IVR, PortChannels, intelligent addressing, and interoperability with other vendors' switches.
Objectives Upon completing this lesson, you will be able to create a high-level SAN design with MDS 9000 switches. This includes being able to meet these objectives:
Explain the benefits of VSANs
Explain how VSANs are implemented
Explain how IVR enables sharing of resources across VSANs
Explain how PortChannels provide high availability inter-switch links
Explain how the addressing features of the MDS 9000 simplify SAN management
Explain the purpose of the CFS protocol
Explain how the MDS 9000 interoperates with third-party switches
Virtual SANs
VSANs Address the Limitations of Common SAN Deployments
• Virtual Storage Area Networks (VSANs) are virtual fabrics
• Allocate ports within a physical fabric to create isolated virtual fabrics
• Independent physical SAN islands are virtualized onto a common SAN infrastructure
• VSANs on FC are similar to VLANs on Ethernet
• Fabric services are isolated within a VSAN
• Fabric disruption is limited to the VSAN
• Statistics are gathered per VSAN
[Figure: a Cisco MDS 9000 Family switch with VSAN service consolidating separate SAN islands onto one physical infrastructure.]
VSANs Address the Limitations of Common SAN Deployments
Today, many SAN environments consist of numerous islands of connectivity. Commonly deployed SAN islands are physically isolated environments consisting of one or more interconnected switches, where each island is typically dedicated to a single application or to multiple related applications. A SAN island may be independently managed by a separate administration team, while strict isolation from faults is achieved through physical separation of the networks. However, because this physical isolation restricts access by other networks and users, the sharing of critical storage assets and the economic savings of storage consolidation are limited. VSAN functionality is a feature developed by Cisco that retains the advantages of isolated SAN fabrics while addressing the limitations of isolated SAN islands. VSANs provide a method for allocating ports within a physical fabric to create virtual fabrics; independent physical SAN islands are virtualized onto a common SAN infrastructure. An analogy is that VSANs on Fibre Channel (FC) networks are like VLANs on Ethernet networks. Separate fabric services are available on each VSAN, because each is a virtual fabric, as are statistics, which are gathered on a per-VSAN basis.
VSANs Reduce Infrastructure Costs
• Dynamic provisioning and resizing
• Improved port utilization
• Non-disruptive (re)assignment
• Shared ISL bandwidth
[Figure: comparing SAN islands built from fixed switches with a consolidated VSAN fabric.]
  – 16-port switches: ports required 70, ports deployed 96, ISL ports 16 (7:1 fan-out), ports stranded 10; net 96 ports for 70 used
  – 32-port switches: ports required 40, ports deployed 64, ISL ports 0, ports stranded 24; net 64 ports for 40 used
  – VSAN-enabled fabric (70-port Red_VSAN plus 40-port Blue_VSAN): ports required 70 + 40, ports deployed 128, ISL ports 0, ports assignable 18 (able to add more switching modules too); net 110 ports for 110 used
VSANs Reduce Infrastructure Costs VSANs allow dynamic provisioning and resizing of virtualized SAN islands. Virtual fabrics are built to meet initial port requirements. This not only allows for good port utilization, but also for dynamic resizing of virtual fabrics to meet actual, rather than projected, needs. With individual fabrics, port counts are dictated to some degree by the hardware configurations available. Provisioning ports logically, rather than physically, allows assignment of only as many ports as are needed. Stranded ports (ports unneeded on an isolated fabric) are also reduced or eliminated.
Ports can be (re)assigned to VSANs non-disruptively.
ISLs become Enhanced ISLs (EISLs) carrying tagged traffic from multiple VSANs.
ISL bandwidth is securely shared between VSANs, which reduces cost of excessive ISLs.
EISLs only carry permitted VSANs, which can limit the reach of individual VSANs.
Each port can belong to only one VSAN, and there is no leakage between VSANs. Inter-VSAN Routing (IVR) must be used to exchange traffic between two different VSANs.
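The port-count comparison in the slide above can be reproduced with a short sketch (Python; all figures are the slide's example numbers):

```python
# Port utilization: separate SAN islands versus one VSAN-enabled fabric (slide example numbers).

islands = {
    "16-port island": {"required": 70, "deployed": 96, "isl_ports": 16, "stranded": 10},
    "32-port island": {"required": 40, "deployed": 64, "isl_ports": 0,  "stranded": 24},
}

for name, s in islands.items():
    print(f"{name}: {s['deployed']} ports deployed for {s['required']} used")

consolidated_deployed = 128          # one VSAN-enabled chassis
consolidated_required = 70 + 40      # Red_VSAN + Blue_VSAN
print(f"VSAN fabric: {consolidated_deployed} ports deployed for {consolidated_required} used, "
      f"{consolidated_deployed - consolidated_required} still assignable")
```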
VSANs Constrain Fault Impacts
• Create a VSAN for each application; isolate traffic flows by assigning different FSPF costs
• Fabric services are replicated and maintained on a per-VSAN basis
• Any disruption is contained within the VSAN, for example:
  – FSPF reconfigure-fabric events
  – A misbehaving HBA or device
  – RSCN broadcasts
  – Active zone set changes
[Figure: a fabric event caused by an HBA generating erroneous control frames affects only its own VSAN; the other VSANs are protected.]
VSANs Constrain Fault Impacts
VSANs sectionalize the fabric to increase availability. All fabric services are replicated and maintained on a per-VSAN basis, including name services, notification services, and zoning services. This means that fabric events are isolated on a per-VSAN basis. This isolation provides high availability by protecting unaffected VSANs from events on a single VSAN within the physical fabric. The faults are constrained to the extent of the affected VSAN and only affect devices within that VSAN. Protection is provided from events such as:
Misbehaving HBA or controller
Fabric rebuild event
Zone set change
Fabric recovery from a disruptive event is also per-VSAN, resulting in faster reconvergence due to the smaller scope.
SAN Consolidation with VSANs
VSANs enable highly resilient, large-port-density, and manageable SAN designs:
• Leverage VSANs as a replacement for multiple separate physical fabrics
• Separate VSANs provide hardware-based traffic isolation and security
• Very high port-density platforms minimize the number of switches required
• Eliminates the wasted ports of the SAN island approach
• Link bundling and QoS within the fabric optimize resource usage and traffic management
[Figure: a main data center where MDS 9506 and MDS 9216 switches serving groups 'A' and 'B' connect over VSAN trunks and trunk bundles to MDS 9513 and MDS 9509 core switches, a shared storage pool, and a backup VSAN.]
SAN Consolidation with VSANs One of the key enablers for SAN consolidation on the MDS platform is the Virtual SANs (VSANs) feature. VSANs completely isolate groups of ports in the fabric, allowing virtual fabrics to replace multiple physical SAN fabrics as the means to secure and scale applications. VSANs allow high-density switch platforms to replace inefficient workgroup fabric switches. In addition to VSANs, features like PortChannels (link aggregation) and QoS allow IT to optimize resource usage and manage traffic within the fabric. IT can ensure that applications get the resources they need without having to physically partition the fabric.
VSAN Advantages
• Good ROI
  – Leverage VSANs as a replacement for multiple separate physical fabrics
  – Reduce the number of switches; increase port density
• Availability
  – Disruptions and I/O pauses are confined to the local VSAN
  – Increased fabric stability
• Scalability
  – Fabric services are per VSAN
  – Reduces the size of the FC distributed database
  – FC_IDs can be reused
• Security
  – Separate VSANs provide hardware-based traffic isolation and security
[Figure: a VSAN-enabled fabric serving departments/customers 'A' and 'B' over VSAN trunks, with a management VSAN and shared storage.]
VSAN Advantages VSANs allow implementation of multiple logical SANs over a common fabric, which eliminates costs associated with separate physical fabrics. The virtual fabrics exist on the same physical infrastructure, but are isolated from each other. Each VSAN contains zones and separate (replicated) fabric services, which improves:
Availability through the isolation of virtual fabrics from fabric-wide faults/reconfigurations
Scalability through:
Replicated fabric services per VSAN
Support for 256 VSANs
Centralized management capability
Security through fabric isolation
The limit of 256 VSANs is not a hard architectural limit. The VSAN header field is 12 bits long and supports up to 4096 values, and the supported number can grow in the future as larger-scale SAN deployments increase. Note that the total number of VSANs that can be configured is 256, but the VSAN numbers can be anywhere between 1 and 4093 for the reasons mentioned above. The FC_ID contains an 8-bit field for the domain, and a few domain values are reserved, leaving a limit of 239 domains (switches) per SAN, with each switch getting its own domain ID. With Cisco's VSAN technology this limitation now applies per VSAN, meaning that domains (and hence FC_IDs) can be reused across VSANs. This enables the deployment of much larger-scale SANs than are otherwise possible today.
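To make the domain-reuse point concrete, the sketch below breaks a 24-bit FC_ID into its domain, area, and port fields and shows the same FC_ID keyed by VSAN (Python; the (vsan, fcid) tuple key is an illustrative model, not how the switch actually stores its tables):

```python
# Illustrative view of FC_ID structure and why VSANs allow domain/FC_ID reuse.

def split_fcid(fcid: int):
    """Split a 24-bit FC_ID into (domain, area, port) bytes."""
    return (fcid >> 16) & 0xFF, (fcid >> 8) & 0xFF, fcid & 0xFF

d, a, p = split_fcid(0x610203)
print(f"domain 0x{d:02x}, area 0x{a:02x}, port 0x{p:02x}")   # domain 0x61, area 0x02, port 0x03

# Within one fabric, domains (and therefore FC_IDs) must be unique. With VSANs, the
# effective key becomes (vsan, fcid), so the same FC_ID can exist in different VSANs.
name_server = {
    (10, 0x610203): "host-a",      # VSAN 10, domain 0x61
    (20, 0x610203): "array-b",     # VSAN 20 reuses domain 0x61 without conflict
}
print(name_server[(10, 0x610203)], "/", name_server[(20, 0x610203)])
```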
How VSANs Work
VSAN Primary Functions
The VSAN feature consists of two primary functions:
• Hardware-based isolation of traffic
  – No special drivers or configuration required for end nodes
  – A VSAN header (VSAN_ID) is added at the FC ingress port, indicating membership
  – Tagged traffic is carried across Enhanced ISL (EISL) trunks between Trunking E_Ports (TE_Ports); an EISL trunk carries tagged traffic from multiple VSANs
  – The VSAN header is removed at the egress port
• Independent fabric services for each VSAN
  – Name server, zone server, management server, FSPF, principal switch selection, and so on
  – Services are run, managed, and configured independently per VSAN (for example, separate Fibre Channel services for the Blue VSAN and the Red VSAN)
VSAN Primary Functions
The VSAN feature consists of two primary functions.
The first is hardware-based isolation of tagged traffic belonging to different VSANs, which requires no special drivers or configuration at the end nodes (hosts, disks, and so on). Traffic is tagged at the Fibre Channel ingress port (Fx_Port) and carried across EISL links between MDS 9000 switches. Because VSANs use explicit frame tagging, they can be extended over the metro or WAN; the MDS 9000 Family IP Storage module can carry the tags in Fibre Channel over Internet Protocol (FCIP) for greater distances. Fibre Channel, and therefore VSANs, can easily be carried across dark fiber. However, the VSAN tag adds 8 bytes of header, which may be a concern for channel extenders: a channel extender may treat the tagged frame as invalid and drop it, and dense wavelength division multiplexing (DWDM) equipment may count such frames as invalid yet still pass them. Qualification is ongoing within Cisco to validate the various extension methods.
The second is the creation of an independent instance of the Fibre Channel fabric services for each newly created VSAN. These services include the zone server, name server, management server, principal switch selection, and so on. Each service runs independently in each VSAN and is independently managed and configured.
VSAN Attributes
• 256 VSANs per switch; 239 switches per VSAN
• Traffic is isolated within its own VSAN, with control over each incoming and outgoing port
• Each frame in the fabric is uniquely tagged with a VSAN_ID header at the ingress port; the VSAN_ID is maintained across TE_Ports and stripped away across E_Ports; VSAN and priority in the EISL header support QoS
• FC_IDs can be reused across VSANs, which increases switch granularity, simplifies migration, and eases management
[Slide figure: a single Cisco MDS 9509 chassis logically hosting Fabric 10 (Domain ID 0x61, 44 ports), Fabric 20 (Domain ID 0x94, 24 ports), Fabric 30 (Domain ID 0x33, 12 ports), and Fabric 1, the default VSAN (Domain ID 0x12, 8 ports)]
VSAN Attributes
VSANs help achieve traffic isolation in the fabric by adding control over each incoming and outgoing port. There can be up to 256 VSANs in a switch and 239 switches per VSAN. This effectively improves network scalability, because the fabric is no longer limited to 239 Domain_IDs overall: Domain_IDs can be reused within each VSAN. To uniquely identify each frame in the fabric, the frame is labeled with a VSAN_ID at the ingress port; the VSAN_ID is stripped away across E_Ports but maintained across TE_Ports. By carrying the VSAN and priority in the EISL header, quality of service (QoS) can be properly applied. The VSAN_ID is always stripped away at the far edge of the fabric. If an E_Port is capable of carrying multiple VSANs, it becomes a trunking E_Port (TE_Port). VSANs also facilitate the reuse of address space by creating independent virtual SANs, which increases the number of available addresses and improves switch granularity. Without VSANs, an administrator must purchase separate switches and links for separate SANs, and granularity is at the switch level rather than the port level. VSANs are also easy to manage: to move or change users, you change the configuration of the SAN, not its physical structure, and to move devices between VSANs you simply change the configuration at the port level, with no physical moves required.
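As a rough illustration of this port-level granularity, the following is a minimal sketch of the typical SAN-OS-style CLI for creating a VSAN and moving a port into it. The VSAN number, name, and interface are arbitrary examples, and exact syntax can vary by software release.

switch# configure terminal
switch(config)# vsan database
! create VSAN 10 and give it a descriptive name
switch(config-vsan-db)# vsan 10 name Engineering
! move port fc1/1 out of the default VSAN 1 and into VSAN 10
switch(config-vsan-db)# vsan 10 interface fc1/1
switch(config-vsan-db)# end
! verify the port-to-VSAN assignments
switch# show vsan membership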
VSAN Numbering Rules
• VSAN 1 – default VSAN: automatically configured by the switch as the default VSAN; all ports are originally in VSAN 1; always present and cannot be deleted
• VSAN 2 through 4093 – user-configurable VSANs: a maximum of 254 VSANs can be created in this number range
• VSAN 4094 – isolated VSAN: used to isolate ports whose port VSAN has been deleted; not propagated across switches; always present and cannot be deleted
[Slide figure: two Cisco MDS 9000 Family switches with VSANs 10, 20, and 30 configured, joined by trunking E_Ports (TE_Ports); a host whose port VSAN was deleted is placed in VSAN 4094 (the isolated VSAN) and is isolated from the fabric]
VSAN Numbering Rules
There are certain rules that must be followed when creating VSANs. VSAN 1 is automatically configured by the switch as the default VSAN; all configured ports are originally placed in VSAN 1 until they are specifically configured into another VSAN. VSAN numbers 2 through 4093 are the user-configurable VSANs; although there are many more possible numbers in this range, a maximum of 254 VSANs can be created here. VSAN 4094 is a reserved special VSAN called the "isolated VSAN," used to temporarily hold ports whose VSAN has been deleted. VSAN 4094 is not propagated across switches, is always present, and cannot be deleted. In the figure, VSAN 30 is configured on the remote switch but not on the local switch, so it is not propagated across the EISL. Because the port's VSAN (VSAN 30) has been deleted from the local switch configuration, the host on the local switch is placed in the isolated VSAN 4094 instead of being able to reach the remote switch.
Note
VSAN 0 and VSAN 4095 are reserved and not used.
TE_Ports and EISLs
Trunking E_Port (TE_Port):
• Carries tagged frames from multiple VSANs
• Only understood by Cisco MDS 9000 switches
• Trunks all VSANs (1-4093) by default; the VSAN allowed list defines which frames are permitted
• Can optionally be disabled for E_Port operation and has a native VSAN assignment for E_Port operation
• Not to be confused with port aggregation (PortChannels)
Enhanced ISL (EISL):
• Link created by connecting two TE_Ports
• Superset of ISL functionality
• Also carries per-VSAN control protocol information (FSPF, distributed name server, zoning updates, etc.)
TE_Ports and EISLs Trunking E_Ports (TE_Ports) have the following characteristics:
TE_Ports can pass tagged frames belonging to multiple VSANs.
TE_Ports are only supported by Cisco MDS 9000 switches.
By default, TE_Ports can pass all VSAN traffic (1-4093). The passing of traffic for specific VSANs can be disabled.
By default, E-Ports are assigned as part of VSAN 1.
TE_Ports allow for the segregation of SAN traffic and should not be confused with port aggregation (referred to by some vendors as trunking).
Enhanced ISLs (EISLs) are ISLs that connect two TE_Ports:
An EISL is created when two TE_Ports are connected.
EISLs offer a superset of ISL functionality.
EISLs carry per-VSAN control protocol information.
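As a rough illustration of how these trunking attributes are typically expressed, a hedged SAN-OS-style CLI sketch follows; the interface and VSAN numbers are arbitrary examples and command details may differ between releases.

switch# configure terminal
switch(config)# interface fc1/5
! run the port as an E_Port toward the neighboring MDS switch
switch(config-if)# switchport mode E
! enable trunking so the link comes up as an EISL (TE_Port)
switch(config-if)# switchport trunk mode on
! prune the VSAN allowed list so only VSANs 10 and 20 are carried
switch(config-if)# switchport trunk allowed vsan 10
switch(config-if)# switchport trunk allowed vsan add 20
switch(config-if)# no shutdown
switch(config-if)# end
switch# show interface fc1/5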
WWN-Based VSANs (SAN-OS 2.0)
Port-based VSANs:
• VSAN membership is based on the physical port of the switch
• Reconfiguration is required when a server or storage device moves to another switch
WWN-based VSANs:
• VSAN membership is based on the pWWN of the server or storage device
• Fabric-wide distribution of the configuration using CFS
• No reconfiguration is required when a host or storage device moves
[Slide figure: with port-based VSANs, moving a host HBA from SW1 to SW2 requires reconfiguration on SW2; with WWN-based VSANs, the host moves without reconfiguration]
WWN-Based VSANs
With the introduction of SAN-OS 2.0, VSAN membership may now be defined based on the world wide name (WWN) of hosts and storage devices, rather than only by switch port. With WWN-based VSAN membership, hosts and targets can be moved from one port to any other port anywhere in the MDS fabric without requiring manual reconfiguration of the port VSANs.
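On MDS switches this capability is configured through the Dynamic Port VSAN Membership (DPVM) command family; the sketch below is a rough, hedged example in SAN-OS-style CLI. The pWWN and VSAN number are arbitrary, and exact syntax may vary by release.

switch# configure terminal
! enable WWN-based (dynamic) port VSAN membership
switch(config)# dpvm enable
! distribute the DPVM database fabric-wide using CFS
switch(config)# dpvm distribute
switch(config)# dpvm database
! bind this host pWWN to VSAN 10, regardless of the port it logs in on
switch(config-dpvm-db)# pwwn 21:00:00:e0:8b:0a:5d:e7 vsan 10
switch(config-dpvm-db)# exit
! activate the database so the bindings take effect
switch(config)# dpvm activate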
Inter-VSAN Routing (IVR)
IVR Overview
Inter-VSAN Routing (IVR) allows selective routing between specific members of two or more VSANs.
• Preserves VSAN benefits
• Selectively allows traffic flow
• Shares resources, e.g. a tape library
• The transit VSAN isolates the WAN infrastructure
• Resolves problems with merged fabrics
• FC control frames remain within the VSAN
[Slide figure: hosts in VSAN 10 and devices in VSAN 20 linked across a transit VSAN carried over FC or FCIP]
IVR Overview
VSANs are like "virtual switches." They improve SAN scalability, availability, and security by allowing multiple SANs to share a common physical infrastructure of switches and ISLs. These benefits derive from the separation of Fibre Channel services in each VSAN and the isolation of traffic between VSANs. Data traffic isolation between VSANs also inherently prevents the sharing of resources attached to a VSAN, for example robotic tape libraries. Using IVR, resources across VSANs can be accessed without compromising other VSAN benefits. When IVR is implemented, data traffic is transported between specific initiators and targets in different VSANs without merging the VSANs into a single logical fabric. FC control traffic does not flow between VSANs, nor can initiators access any resources across VSANs other than those designated. IVR allows valuable resources such as tape libraries to be easily shared across VSANs, and IVR used in conjunction with FCIP provides more efficient business continuity and disaster recovery solutions. IVR works for both FC and FCIP links. Using IVR, a backup server in VSAN 10 could access a tape library in VSAN 20 by configuring the switches involved to allow traffic between these devices, identified by VSAN and pWWN. Because the other nodes are not configured for IVR, they cannot access devices in the other VSAN.
Single-Switch IVR Designs
• Simplest IVR topology: one switch in the path
• Transit VSAN not required
• An IVR zone and zoneset permit selective access across VSAN boundaries
[Slide figure: a single switch hosting VSAN 10 (media server and other devices) and VSAN 20 (tape library and other devices), with an IVR zone spanning the two VSANs]
Single-Switch IVR Designs An IVR path is a set of switches and ISLs through which a frame from an end-device in one VSAN can reach another end-device in some other VSAN. Multiple paths can exist between two such end-devices. The simplest example of an IVR topology is one involving two VSANs and a single switch. In addition to the normal zones and zonesets that exist within each VSAN, IVR supports the creation of an IVR zone and zoneset, which allows selective access between the devices in two VSANs. In the example, the backup media server in VSAN 10 is allowed to access the tape library in VSAN 20. All other devices are restricted to their respective VSANs.
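A hedged sketch of the corresponding SAN-OS-style IVR configuration is shown below; the pWWNs, VSAN numbers, and zone names are arbitrary examples, and the exact commands may differ by release.

switch# configure terminal
! enable IVR and distribute its configuration with CFS
switch(config)# ivr enable
switch(config)# ivr distribute
! IVR zone allowing the media server (VSAN 10) to reach the tape library (VSAN 20)
switch(config)# ivr zone name TapeBackup
switch(config-ivr-zone)# member pwwn 10:00:00:00:c9:2d:5a:dd vsan 10
switch(config-ivr-zone)# member pwwn 50:06:04:82:bf:d0:54:52 vsan 20
switch(config-ivr-zone)# exit
! place the IVR zone into an IVR zoneset and activate it
switch(config)# ivr zoneset name IVR_ZS1
switch(config-ivr-zoneset)# member TapeBackup
switch(config-ivr-zoneset)# exit
switch(config)# ivr zoneset activate name IVR_ZS1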
IVR with Unique Domain IDs
• Prior to SAN-OS 2.1, Domain IDs must be unique within an IVR zoneset
• If Domain IDs are unique, frames are routed from one VSAN to another with no added latency, and the S_ID and D_ID are unchanged across VSANs
• All VSANs belong to a single Autonomous Fabric (AFID=1)
[Slide figure: a frame from fcid 05.02.01 in VSAN 10 (Domain 0x05) crosses transit VSAN 99 over an FCIP tunnel to fcid 06.03.04 in VSAN 20 (Domain 0x06); the S_ID (05.02.01) and D_ID (06.03.04) stay the same in every VSAN, and all VSANs share Autonomous Fabric ID 1]
IVR with Unique Domain IDs Unique Domain IDs are required for all switches involved in IVR. In this way, a frame moving from fcid 05.02.01 in VSAN 10 to fcid 06.03.04 in VSAN 20 will retain the same source and destination FCIDs as it crosses two or more VSANs. Whenever a frame enters a Cisco MDS 9000 switch, it is tagged with a VSAN header indicating the native VSAN of the port. In the case of IVR, when the destination FCID resides in a different VSAN, the tag will be rewritten at the ingress port of the IVR border switch. In the figure, assume that a frame is destined from fcid 05.02.01 in VSAN 10 to fcid 06.03.04 in VSAN 20. The left-most switch, with Domain ID 5, applies a VSAN 10 ID tag. The next switch performs a VSAN rewrite, changing the VSAN tag to 99. The last switch changes the VSAN tag to 20. The process is reversed on the return path.
VSAN Rewrite Table Each IVR-enabled switch maintains a copy of the VSAN Rewrite Table. The table can hold up to 4096 entries. Each entry includes the following information:
Switch identifier
Current VSAN ID
Source Domain
Destination Domain
Next-Hop VSAN (rewritten VSAN)
IVR with Overlapping Domain IDs (SAN-OS 2.1)
• From SAN-OS 2.1, Domain IDs do not have to be unique within an IVR zoneset
• If Domain IDs overlap, IVR NAT will rewrite the S_ID and/or D_ID in the frame header and route the frame to its destination in a different VSAN
• All VSAN IDs must be unique within the same Autonomous Fabric
[Slide figure: VSAN 10 and VSAN 20 each contain a switch with Domain 0x05; a frame from fcid 05.02.01 in VSAN 10 is addressed to proxy D_ID 06.03.04, carried across transit VSAN 99 over an FCIP tunnel, and NAT-rewritten at the VSAN 20 border switch to D_ID 05.03.04 with the S_ID rewritten to 06.02.01; all VSANs share AFID 1]
IVR with Overlapping Domain IDs IVR-2, introduced in SAN-OS 3.0, offers several enhancements over previous versions of IVR:
Removes unique VSAN ID and Domain ID requirement
Integrates with QoS, LUN zoning, and read-only zoning
Provides Automatic IVR configuration propagation throughout fabric – AUTO Mode
Provides Automatic IVR topology discovery
Licensed with the Enterprise and SAN Extension (with an IPS-4 or IPS-8 module installed) packages
In the example, notice that VSAN 10 has a switch with Domain ID 05 and so does VSAN 20. IVR NAT must therefore provide a proxy entry in VSAN 10 for the VSAN 20 device 05.03.04 and renumber it as 06.03.04. A frame from VSAN 10 fcid 05.02.01 is written with a destination fcid of 06.03.04 and routed via the transit VSAN 99 to VSAN 20. As the frame arrives at the border switch in VSAN 20, the frame header is rewritten to 05.03.04 and the frame is routed to its destination port. Notice that with SAN-OS 2.1 there is only one Autonomous Fabric, so all VSAN IDs must be unique within the same Autonomous Fabric.
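Where Domain IDs overlap, NAT mode is enabled on the IVR border switches. A minimal, hedged SAN-OS-style sketch follows; exact syntax may vary by release.

switch# configure terminal
! enable IVR in NAT mode so overlapping Domain IDs can be translated
switch(config)# ivr enable
switch(config)# ivr nat
! distribute the IVR configuration to the other IVR-enabled switches
switch(config)# ivr distribute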
Autonomous Fabrics and Overlapping VSAN IDs (SAN-OS 3.0)
• From SAN-OS 3.0, you can configure up to 64 separate Autonomous Fabrics
• All VSAN IDs must be unique within the same Autonomous Fabric, but the same VSAN ID can exist in a different Autonomous Fabric
• IVR NAT will rewrite the S_ID and/or D_ID and route frames as before
[Slide figure: VSAN 10 in Autonomous Fabric 1 is joined to VSAN 10 in Autonomous Fabric 2 across transit VSAN 99 over an FCIP tunnel; IVR NAT rewrites the frame headers at the border switches as in the previous example]
Autonomous Fabrics and Overlapping VSAN IDs From SAN-OS 3.0 with IVR-2 you can configure up to 64 separate Autonomous Fabrics. Each VSAN in a single physical switch must only belong to one Autonomous Fabric. IVR must know about the topology of the IVR-enabled switches in the fabric to function properly. You can specify the topology in two ways:
Manual Configuration
Configure the IVR topology manually on each IVR-enabled switch
Automatic Mode
Uses CFS configuration distribution to dynamically learn and maintain up-to-date information about the topology of the IVR-enabled switches in the network.
In the example, VSAN 10 on the left is joined to VSAN 10 on the right via a Transit VSAN 99. This would be illegal in a single Autonomous Fabric so both sides are configured in separate Autonomous Fabrics 1 and 2. Notice that IVR NAT now rewrites the AFID in the EISL frame header from AFID1:VSAN 10 to AFID2:VSAN 10 as it passes through the IVR edge switches.
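A rough sketch of how the IVR topology might be learned automatically or declared manually in SAN-OS-style CLI follows. The switch WWNs, AFIDs, and VSAN ranges are arbitrary examples (not a transcription of the figure), and the exact syntax is offered here only as an assumption that may differ by release.

! automatic mode: learn and maintain the IVR topology via CFS
switch(config)# ivr vsan-topology auto
! or manual mode: declare each IVR-enabled switch, its AFID, and its VSANs
switch(config)# ivr vsan-topology database
switch(config-ivr-topology-db)# autonomous-fabric-id 1 switch-wwn 20:00:00:0d:ec:02:2d:40 vsan-ranges 10,99
switch(config-ivr-topology-db)# autonomous-fabric-id 2 switch-wwn 20:00:00:0d:ec:08:66:c0 vsan-ranges 10
switch(config-ivr-topology-db)# exit
switch(config)# ivr vsan-topology activate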
IVR Service Groups (SAN-OS 3.0)
• A group of unique VSAN IDs and Domain IDs
• Each VSAN in a switch must belong to one and only one Autonomous Fabric ID (AFID)
• A VSAN ID can be reused in different AFIDs without merging that VSAN, as long as the AFIDs do not share a switch
• AFIDs in a switch cannot share a VSAN ID
• A single switch can be in multiple AFIDs
• The default AFID is 1 and can be changed via the CLI or Fabric Manager
• Up to 16 service groups are supported, with a total of 64 AFID-VSAN combinations
• IVR control traffic is distributed among all SGs
• IVR data traffic is contained within each SG
[Slide figure: switches S1 (swwn1), S2 (swwn2), and S3 (swwn3) spanning AFIDs 1 through 4, with VSAN IDs such as 1, 100, 102, and 103 reused across Autonomous Fabrics]
IVR Service Groups
IVR service groups are defined as groups of unique VSAN IDs and Domain IDs within an Autonomous Fabric. VSAN IDs (and Domain IDs) can be the same as long as they reside in different AFIDs. With IVR-2 there can be a total of 16 IVR service groups, the allowed AFID range is 1-64, and the AFID is now used in the routing decision. With IVR-1 (prior to SAN-OS 2.1), zoning had to be performed on each switch in the configuration; IVR-2 uses CFS to distribute IVR zoning to the IVR-2-enabled switches, so IVR zoning can be configured on a single switch and propagated to the rest of the IVR-2 fabric. Note that IVR control traffic (for example, the IVR topology database) is distributed to all IVR-enabled switches across all configured IVR service groups, but IVR data traffic (frames moving from VSAN to VSAN) is contained within each IVR service group.
IVR Best Practices
• Encourage the use of non-overlapping domains across all VSANs
• For large installations, try not to have IVZ members spread across many switches; it wastes resources
• Allow for multiple paths between the IVZ members
• Set the default zone policy to "deny" and avoid using the "force" option when activating the IVZS
• Make sure that exactly the same IVR topology is applied to all IVR-enabled switches
• Configure IVR to use Cisco Fabric Services
• Use Cisco Fabric Manager to configure IVR
IVR Best Practices The following are recommended best practices for implementing IVR:
While it is not strictly required to have unique Domain IDs across VSANs for switches that are not participating in IVR, unique Domain IDs are recommended, because they simplify fabric design and management.
Because the VSAN rewrite table is limited to 4096 entries, and because entries are per-domain rather than per end device, it is best to minimize the number of switches that contain IVZ members in very large implementations.
Implement redundant path designs whenever possible.
In normal FC environments, it is generally considered a best practice to set the default zone policy to deny. Because members of IVZs cannot exist in the default zone, activation of an IVZS using the “force” option may lead to traffic disruption if IVZ members previously existed in a default zone policy of permit.
Make sure that exactly the same IVR topology is applied to all IVR-enabled switches.
Using the Cisco Fabric Manager to configure IVR can help avoid errors and will ensure that the same IVR configuration is applied to all IVR enabled switches.
Competing Technologies: Logical SANs
• External routers or gateways
• Add latency to every frame
• Consume ISL ports
• Are difficult to manage
• Are a single point of failure
[Slide figure: Fabric A and Fabric B (application hosts and backup servers A and B) and Fabric C (backup media server and tape library) joined by redundant multiprotocol routers carrying LSAN_1 and LSAN_2]
Competing Technologies: Logical SANs Brocade offers a proprietary solution called Logical SANs, or LSANs. This feature allows traffic between devices that would otherwise be isolated in separate fabrics. This implementation makes sense for Brocade’s small-to-medium sized business customers, who typically have a significant investment in smaller, legacy switches deployed in workgroup SANs. LSAN implementation requires purchase of at least one proprietary multiprotocol router. The diagram shows a redundant configuration, which requires two of the special-purpose routers. LSANs use a multiprotocol router to:
Join fabrics without merging them
Perform NAT to join separate address spaces
Perform functions similar to iFCP gateways
PortChannels
• A PortChannel is a logical bundling of ISLs:
– Multiple links are combined into one aggregated link
– More reliable than FSPF equal-cost routing
– Can span line cards for higher availability
– Higher throughput, up to 160 Gbps per PortChannel (16 x 10 Gbps)
– No distance limitations
– Up to 16 ISLs per PortChannel
– Up to 128 PortChannels per switch
[Slide figure: a single PortChannel between two MDS switches, and multiple PortChannels between two MDS switches]
A PortChannel is a logical bundling of identical links. PortChannels (link bundling) enable multiple physical links to be combined into one aggregated link, and the bandwidth of the member links is aggregated into this logical link. There may be a single PortChannel or multiple PortChannels between two switches. PortChannels provide a point-to-point connection over multiple interswitch link (ISL) E_Ports or enhanced interswitch link (EISL) TE_Ports. PortChannels increase the aggregate bandwidth of an ISL by distributing traffic among all functional links in the channel, which also decreases the FSPF cost of the link between the switches. PortChannels provide high availability on an ISL: if one of the physical links fails, traffic previously carried on that link is switched to the remaining links. PortChannels are known in the industry by other names, such as the following (a configuration sketch appears after this list):
ISL trunking (Brocade Communications Systems)
Port bundling
Aggregated channels
Channel groups
Channeling
Bundles
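The sketch below shows, in hedged SAN-OS-style CLI, how two ISLs on different switching modules might be bundled into one PortChannel; the interface and channel numbers are arbitrary examples and details may vary by release.

switch# configure terminal
! create the logical PortChannel interface
switch(config)# interface port-channel 10
switch(config-if)# switchport mode E
switch(config-if)# switchport trunk mode on
switch(config-if)# exit
! add a member link from module 1
switch(config)# interface fc1/1
switch(config-if)# channel-group 10 force
switch(config-if)# no shutdown
switch(config-if)# exit
! add a second member link from module 2 for module-level redundancy
switch(config)# interface fc2/1
switch(config-if)# channel-group 10 force
switch(config-if)# no shutdown
switch(config-if)# end
switch# show port-channel database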
FSPF Routing
• FSPF builds a routing table between each domain in the fabric
• FSPF chooses the least-cost path and routes all frames along it, but here there are two equal-cost paths: paths 1 and 2 both have a cost of 2000
• FSPF applies a round-robin algorithm to share the load between connected devices
• Servers A and C share path 1, and server B is allocated path 2
• All frames from server A to storage will be carried across path 1
• Path 1 will carry a different load than path 2
• FSPF does NOT load balance across equal-cost paths on non-Cisco switches
[Slide figure: servers A, B, and C behind Domain 10 reach storage behind Domain 13 over 1 Gbps links through Domains 11 and 12; the FSPF routing table from Domain 10 to 13 lists path 1 (10-11-13, cost 2000, servers A and C) and path 2 (10-12-13, cost 2000, server B); FSPF link costs: 1 Gbps = 1000, 2 Gbps = 500, 4 Gbps = 250]
FSPF Routing When Fibre Channel switches are joined together with ISLs, FSPF builds a routing table which is distributed to all switches using Link State updates. The routing table is a list of every possible path between any two domains in the Fibre Channel fabric. Each path is assigned a cost based upon the speed of the link:
1Gbps = 1000
2Gbps = 500
4Gbps = 250
FSPF then chooses the least-cost path between any two domains. All frames are sent along the least-cost path, and all other possible paths are ignored. Every time an ISL is added or removed, FSPF issues a Build Fabric (BF) command to rebuild the routing table.
FSPF Routing Issues
• To provide more bandwidth, we could add another link between switches
• FSPF again chooses the least-cost path and routes all frames along it
• There are now three equal-cost paths: paths 1, 2, and 3 all have a cost of 2000
• FSPF re-applies the round-robin algorithm to share the load between connected devices
• Server A is allocated path 1, server B path 2, and server C path 3
• All frames from server A to storage will still be carried across path 1
• Path 1 will carry a different load than path 2 or path 3
• FSPF still does not load balance across equal-cost paths on non-Cisco switches
[Slide figure: the same topology with a second 1 Gbps link between Domains 10 and 11; the FSPF routing table from Domain 10 to 13 now lists three equal-cost (2000) paths, one allocated per server; FSPF link costs: 1 Gbps = 1000, 2 Gbps = 500, 4 Gbps = 250]
FSPF Routing Issues
Often, a second ISL is added between switches to provide more bandwidth. If the link speed is the same, both ISLs have the same cost. When there are equal-cost paths, the paths are shared among the connected devices and allocated on a round-robin basis. Once a path has been allocated to a device, all of that device's frames use that path even though another equal-cost path may be available; if the path is congested, the other equal-cost paths are still not used. FSPF does not load balance across equal-cost paths; it only shares the equal-cost paths among connected devices. This can leave one path congested while another equal-cost path is underutilized.
Exchange-Based Load Balancing
Cisco MDS switches, by default, load balance across equal-cost paths based upon the Fibre Channel exchange. This provides better granularity and balances the load across equal-cost paths.
FSPF and Port Channels
• To provide more bandwidth, we could add another link between switches
• FSPF rebuilds the routing table between each domain in the fabric
• FSPF again chooses the least-cost path and routes all frames along it, but there is now only one least-cost path: path 1, which has a new cost of 1000
• All frames from Domain 10 to 13 will follow path 1
• By default, the Port Channel will load balance across all links in the Port Channel
• If a link fails within the Port Channel, the FSPF cost does not change and frames continue to flow through the Port Channel
[Slide figure: the two links between Domains 10 and 11 form a Port Channel; the FSPF routing table from Domain 10 to 13 shows path 1 (10-11-13, cost 1000) carrying servers A, B, and C, and path 2 (10-12-13, cost 2000) carrying none; Port Channel link cost = FSPF link cost / number of links, i.e. 1000 / 2 = 500; FSPF link costs: 1 Gbps = 1000, 2 Gbps = 500, 4 Gbps = 250]
When two or more ISLs are placed in a Port Channel, this is seen as a single path by FSPF and a cost is calculated based upon the cost of each link divided by the number of links in the Port Channel. Cisco MDS switches will provide exchange-based load balancing across all links within the Port Channel.
Flapping Links
• A flapping link can cause FSPF recalculation
• FSPF will rebuild the topology database when the link goes down and again when it comes back up
• On a failing GBIC or link, this can happen several times a second
• Results in wide-scale disruption to the fabric
[Slide figure: Link State Records exchanged across the fabric as an ISL flaps]
Flapping Links
PortChannels can handle some types of hardware failures better than ISLs that do not belong to a PortChannel. For example, if a flapping link exists between the two middle directors outside of a PortChannel, FSPF overhead is incurred: each time the ISL goes down or comes up, all of the switches in the fabric recalculate the cost of each of their FSPF links by exchanging Link State Records (LSRs) on every (E)ISL interface. Switches synchronize their databases by sending LSRs in a Link State Update (LSU) SW_ILS extended link service command. When a switch receives an LSU, it compares each LSR in the LSU with its current topology database; if the LSR is not present in the switch's link state database, or if it is newer than the existing LSR, it is added to the database. Cisco uses a modified Dijkstra algorithm that computes the FSPF topology database very quickly. When a link flaps, the LSUs are flooded and then the path calculation occurs. While Cisco MDS switches handle flapping links more efficiently than most competitors, placing ISLs within a PortChannel can completely eliminate the FSPF recalculation overhead caused by a flapping link.
Flapping Links with PortChannels
• A flapping link within a PortChannel results in no FSPF recalculation
• Frames continue to flow across the remaining links in the PortChannel
• Fabric stability is maintained
Flapping Links with PortChannels
If, instead, the three middle links in the diagram are part of a PortChannel, FSPF overhead is virtually eliminated. The PortChannel is represented as a single path in the FSPF routing table, and no FSPF path-cost recalculation is performed when a link in the PortChannel fails, as long as at least one functioning link remains in the PortChannel.
PortChannel Protocol (SAN-OS 2.0)
• Used for exchanging configuration information between switches to automatically configure and maintain PortChannels
• Provides a consistency check of the configuration at both ends
• Simplifies PortChannel configuration
• Automatically creates a PortChannel between switches
• With the PortChannel Protocol, misconfigured ports are isolated instead of suspended
[Slide figure: switches A and B joined by links A1-B1 and A2-B2 in channel group 10 plus an individual link A3-B3; the plug-and-play functionality of the PortChannel Protocol allows the A3-B3 link to be dynamically added to the PortChannel]
The PortChannel Protocol The PortChannel Protocol (PCP) was introduced and is enabled by default in SAN-OS 2.0. PCP uses the FC Exchange Peer Parameters (EPP) protocol to exchange configuration information between switches in order to automatically configure and maintain PortChannels. PCP contains 2 sub-protocols:
Bringup protocol: misconfig detection and synchronization of port bringup
Autocreate protocol: automatic aggregation of ports into port channels
PCP is exchanged only on FC and FCIP interfaces. The autocreate protocol is run to determine whether a port can aggregate with other ports to form a channel group. Both the local and the peer port must have autocreate enabled for the autocreate protocol to be attempted, and more than one port must be autocreate-enabled for aggregation to occur. A port cannot both be manually configured as part of a PortChannel and have autocreate enabled; these two configurations are mutually exclusive. Autocreate-enabled ports must have the same compatibility parameters to be aggregated: speed, mode, trunk mode, port VSAN, allowed VSANs, and port and fabric binding configuration.
Load Balancing
• VSANs have two load-balancing options:
– Flow-based: all traffic from a given source to a given destination follows the same path (load sharing)
– Exchange-based: FC exchanges from a given source to a given destination are load balanced across multiple PortChannel links and FSPF equal-cost paths
• Some hardware/software combinations perform better with flow-based load balancing (e.g. HP CA with EVA)
• The load-balancing option is configured on a per-VSAN basis and applies to both FSPF and PortChannels
• Exchange-based load balancing is the default
• Devices in the MDS family do not split exchanges (one exchange per ISL), allowing for guaranteed in-order delivery over the WAN
[Slide figure: a SCSI read exchange (command, data sequences 1-3, response) flowing from source to destination]
Load Balancing
Load balancing is configured for each VSAN in an MDS 9000 fabric. There are two load-balancing methods: flow based and exchange based. Flow-based load balancing sends all traffic with the same src_id-dst_id pair along the same path. Exchange-based load balancing ensures that all frames of the same SCSI exchange follow the same path. Exchange-based load balancing is the default and is appropriate for most environments. Load balancing is configured on a VSAN-by-VSAN basis, and whichever method is chosen applies to both FSPF and PortChannels. Some hardware/software combinations perform better with flow-based load balancing. For example, HP EVA storage subsystems coupled with Continuous Access (CA) software are sensitive to the out-of-order exchanges that are possible with exchange-based load balancing. These devices, while rare, perform significantly better with flow-based load balancing.
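A minimal, hedged sketch of how the per-VSAN load-balancing scheme is selected in SAN-OS-style CLI; VSAN 10 is an arbitrary example, and exchange-based (src-dst-ox-id) is the default.

switch# configure terminal
switch(config)# vsan database
! flow-based load balancing for devices sensitive to exchange reordering
switch(config-vsan-db)# vsan 10 loadbalancing src-dst-id
! ...or return to the default exchange-based behavior
switch(config-vsan-db)# vsan 10 loadbalancing src-dst-ox-id
switch(config-vsan-db)# end
switch# show vsan 10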
Cisco Exchange-Based Load Balancing
• Load balance across 16 FSPF logical paths
• One logical path; up to 16 physical links
• Exchange-based load balancing (S_ID, D_ID, OX_ID) maintains in-order delivery in a stable fabric
[Slide figure: traffic from S_ID 10.02.00 to D_ID 20.01.00 balanced across equal-cost FSPF paths and the physical links of a PortChannel]
Exchange-Based Load Balancing By default, MDS 9000 family switches load balance traffic across equal cost FSPF paths and across the links in a PortChannel, on the basis of a FC exchange. All frames within a given exchange follow the same path. In the example, traffic going from the FC source ID 10.02.00 to the FC destination ID 20.01.00 will be load balanced across the equal cost FSPF “logical” links and within the PortChannel physical links. All frames within a given FC exchange will follow the same path. It is possible that exchanges could be delivered out of order. Because an exchange represents an upper layer transaction, e.g., a SCSI read or write operation, most devices are not sensitive to exchange re-ordering.
Interoperability with PortChannels
• PortChannels are proprietary and are not supported between Cisco MDS switches and other vendors' switches
• Not compatible with other vendors' trunking
• Standard ISL flow control must be configured on the Brocade switch
[Slide figure: a Cisco PortChannel between an MDS 9509 and an MDS 9216, a Brocade trunk between a Brocade 3800 and a Brocade 12000, and four FSPF equal-cost paths between the two vendors' switches]
Interoperability with PortChannels
Brocade's trunking feature is comparable to Cisco PortChannels. Cisco PortChannels and Brocade trunking are not supported between MDS and Brocade switches. Brocade uses a proprietary flow-control technique called Virtual Channel (VC) flow control. When an ISL comes up between an MDS switch and a Brocade switch, the link is negotiated during ELP for standards-based buffer-to-buffer flow control: the MDS rejects Brocade's proprietary VC flow-control protocol and negotiates standards-based buffer-to-buffer flow control instead. Brocade-to-Brocade ISLs can still use Brocade's VC flow-control protocol, and MDS-to-MDS ISLs can still be trunking EISLs and PortChannels.
Best Practices
• Use PortChannels wherever possible
• Place single ISLs in a PortChannel – non-disruptive scalability
• Configure links on different switching modules for redundancy and high availability
• Use the same Channel_ID at both ends – not a requirement, but it makes management easier
• Ensure that each end terminates at a single switch – this is a requirement
• Quiesce links before removing them from a PortChannel – avoids traffic disruption; not needed from SAN-OS 2.0 onwards
• Use the in-order-guarantee feature only when required
Best Practices Use PortChannels whenever possible. PortChannels:
Reduce CPU usage from the levels required to maintain multiple neighbors
Provide an independent recovery mechanism, faster than FSPF
Are completely transparent to upper-layer protocols
Can be nondisruptively scaled by adding links
Follow these guidelines when implementing PortChannels:
Spread PortChannel links across different switching modules. As a result, should a switching module fail, the PortChannel can continue to function as long as at least one link remains functional.
Try to use the same Channel_ID at both ends of the PortChannel. While the PortChannel number is only locally significant, this practice helps identify the PortChannel more easily within the fabric.
PortChannels are point-to-point logical links. Ensure that all links in a PortChannel connect to the same two switches or directors.
To prevent frame loss, it is best to quiesce a link before disabling it or removing it from a PortChannel.
When difficulties arise with configuring PortChannels, the problem is often the result of inconsistently configured links. All links within the PortChannel require the same attributes for the PortChannel to come up. Use the “show port-channel consistency detail” command to identify link configuration inconsistencies.
Use the in-order-delivery feature only when necessary. In-order-delivery adds latency because it deliberately holds frames in the switch. It also consumes more switch memory, because it stacks the frames at the egress port.
Intelligent Addressing
Dynamic FCID Assignment Problems
• Non-Cisco FCID assignment:
– Dynamically assigned by default
– Can change if a device is removed and re-added to the fabric
– Can change if the switch Domain ID changes
[Slide figure: a host with WWN1 logs in to the switch directory server and is assigned FCID1; after an event such as a port change, host reboot, or switch reboot, the same WWN1 is assigned a different FCID2]
FCID Assignment Problems FCIDs are normally assigned dynamically by a FC switch when the devices (N_Ports), including hosts, disks, and tape arrays log into the fabric. FCIDs can therefore change as devices are removed from and added to the fabric. This diagram shows a simplified depiction of a host logging into the fabric and receiving FCIDs from the switch:
After the N_Port has established a link to its F_Port, the N_Port obtains a port address by sending a Fabric Login (FLOGI) Link Services command to the switch Login Server (at Well-Known Address 0xFFFFFE). The FLOGI command contains the WWN of the N_Port in the payload of the frame.
The Login Server sends an Accept (ACC) reply that contains the N_Port address in the D_ID field.
The initiator N_Port then contacts the target N_Port using the FCID of the target.
In the event of a port change, host reboot or switch reboot, previous FCID assignments have the potential to change.
FCID Target Binding
Class    I   H/W Path               Driver    S/W State  H/W Type   Description
--------------------------------------------------------------------------------
fc       0   0/1/2/0                td        CLAIMED    INTERFACE  HP Mass Storage Adapter
fcp      1   0/1/2/0.1              fcp       CLAIMED    INTERFACE  FCP Domain
ext_bus  3   0/1/2/0.1.19.0.0       fcparray  CLAIMED    INTERFACE  FCP Array Interface
target   6   0/1/2/0.1.19.0.0.0     tgt       CLAIMED    DEVICE
disk     3   0/1/2/0.1.19.0.0.0.0   sdisk     CLAIMED    DEVICE     HP OPEN-8  /dev/dsk/c4t0d0 /dev/rdsk/c4t0d0
disk     10  0/1/2/0.1.19.0.0.0.7   sdisk     CLAIMED    DEVICE     HP OPEN-8  /dev/dsk/c4t0d7 /dev/rdsk/c4t0d7
target   7   0/1/2/0.1.19.0.0.1     tgt       CLAIMED    DEVICE
disk     18  0/1/2/0.1.19.0.0.1.7   sdisk     CLAIMED    DEVICE     HP OPEN-9  /dev/dsk/c4t1d7 /dev/rdsk/c4t1d7
• FCID target binding:
– HP-UX and AIX map block devices to FCIDs
– FCIDs are non-persistent by default
– Can jeopardize high availability
FCID Target Binding Some operating systems, such as HP/UX v11.0 and AIX v4.3 and v5.1, map block devices (such as file systems) by default to the dynamically assigned FCIDs. As each Fibre Channel target device is attached to the operating system, the FCID is used as the identifier, not the WWN as in many other operating systems. The problem with the target-binding method employed by HP/UX and AIX is that FCIDs are dynamically assigned, non-persistent identifiers. There are several possible cases where a new FCID may be assigned to a storage device, thereby invalidating the binding held by a given server. These cases might involve a simple move of a storage device to a new port, or a port failure requiring the storage device to be moved to a different switch port. It could even be something as simple as a SAN switch being rebooted. All of these conditions may cause new FCIDs to be assigned to existing storage devices. A SAN designer must pay close attention to this detail when deploying HP/UX and AIX servers in a SAN, because this binding method can represent a significant risk to availability. IBM AIX v5.2 and later versions include a new feature called “dynamic tracking of FC devices” that can detect the change of a target FCID and remap the target without any intervention. However, AIX v4.3 and v5.1 do not possess this feature and are still widely used.
Intelligent Addressing Services
• The Cisco MDS 9000 employs three intelligent addressing services:
FCID address caching:
– A cache of FCID addresses is maintained by default in RAM
– Devices moved from one port to another within a switch retain the same FCID
– FCID assignments do not survive a system reboot
Persistent FCID allocation:
– Stores FCIDs in NVRAM, enabling FCIDs to persist across switch reboots
– Reduces the management complexity and availability risks associated with some HP-UX and AIX servers
– Persistent FCIDs can be selectively added or purged
– Enabled by default from SAN-OS 2.0
Static FCID assignment:
– Allows greater administrative control over FCID assignment
– The area and port octets in the FCID can be manually configured
– Requires static Domain ID assignment
– Useful when migrating HP-UX and AIX servers
Intelligent Addressing Services
The Cisco MDS 9000 Family of Multilayer Directors and Fabric Switches delivers several intelligent addressing services that reduce the complexity and eliminate the availability risk associated with deploying HP/UX and AIX servers in a Fibre Channel SAN. All of these services are included in the base MDS 9000 software feature set at no additional cost and work together to give the SAN designer several options when designing the SAN addressing scheme. Cisco MDS 9000 switches use a default addressing mechanism that assigns FCIDs in sequential order. An MDS switch maintains an active cache of assignments based on the WWN of the FCID receiver. This active cache is used to reassign the same FCID to a device even after it temporarily goes offline or is moved to another port in the switch. The cache mechanism is active at all times and is enabled by default. When a device is moved from one port to another port on the same switch, the device is automatically assigned the same FCID. This capability allows storage devices that are being used by HP/UX and AIX hosts to be easily moved to other ports on the switch as necessary and be assured of the same FCID assignment. For example, if a switch port or SFP failure were to occur, the device connection could simply be moved to another port and would assume the same FCID. No pre-configuration is required to use this feature. The FCID address cache is maintained in switch dynamic memory, so when a switch is powered off, the cache assignments are not maintained. To address this issue, Cisco has also provided the ability to assign persistent FCID addresses to connected devices. The persistent FCID allocation feature assigns FCIDs to devices and records the binding in non-volatile memory. As new devices are attached to the switch, the WWN-to-FCID mapping is stored in persistent
non-volatile memory. This binding remains intact until it is explicitly purged by the switch administrator. The persistent FCID allocation option can be applied globally or for individual VSANs. This feature reduces the management complexity and availability risks associated with deploying HP/UX and AIX servers. The persistent FCID allocation feature is enabled on a per-VSAN basis, allowing different VSANs to have different addressing policies or practices. The Cisco MDS 9000 Family also supports static FCID assignments. Using static FCID assignments, the area and port octets in the FCID are manually assigned by the administrator. This feature allows SAN administrators to use custom numbering or addressing schemes to divide the FCID domain address space among available SAN devices. It is particularly useful for customers who migrate from other vendors' switches, because they can retain the same FCIDs after migration. Because the Domain ID is the first octet of the FCID, the administrator must assign a static Domain ID to the switch in order to specify the entire FCID. Therefore, in order to statically assign FCIDs on a given switch, that switch must first be configured with a static Domain ID. The static FCID assignment feature is enabled on a per-VSAN basis, and static Domain IDs must be assigned on a per-VSAN basis for each switch in the VSAN. Static FCID assignment eases the migration of HP-UX and AIX servers from a legacy fabric to an MDS fabric: the MDS switches can be configured with the same FCIDs as the legacy fabric, eliminating the need to remap storage targets on the servers.
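A hedged sketch of the related SAN-OS-style commands follows; the domain, VSAN, pWWN, and FCID values are arbitrary examples and exact syntax may vary by release.

switch# configure terminal
! keep WWN-to-FCID bindings in NVRAM so they survive reboots (per VSAN)
switch(config)# fcdomain fcid persistent vsan 10
! a static Domain ID is a prerequisite for statically assigning full FCIDs
switch(config)# fcdomain domain 100 static vsan 10
! statically bind a specific FCID (domain 0x64 = 100) to a device pWWN
switch(config)# fcdomain fcid database
switch(config-fcid-db)# vsan 10 wwn 21:00:00:e0:8b:0a:5d:e7 fcid 0x640001
switch(config-fcid-db)# end
switch# show fcdomain vsan 10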
Flat FCID Assignment
• Older HBAs use only the area bits for port IDs: the entire area is reserved for a single port, wasting 255 addresses
• FCID layout: bits 23-16 = Domain, bits 15-08 = Area, bits 07-00 = Port
• FCID flat assignment mode: both the port and area bits are used for port IDs, increasing the scalability of the FCID address space
• FCID auto assignment mode (default):
– Auto-detects the capability of the HBA in most cases
– The MDS maintains a list of HBAs, identified by their Company IDs (OUIs), that require a complete area during fabric login
– Supports HBAs that use both area and port bits for port IDs
– Supports HBAs that use only area bits for port IDs
• The Cisco MDS logically assigns FCIDs – they are not tied to the physical port
Flat FCID Assignment
Non-Cisco FC switches tie the Area ID to the physical port on the switch, and N_Ports are normally assigned an entire area with a Port ID of 00. This severely restricts the number of ports within a switch to 256 (the 8-bit area value). The Cisco MDS assigns Port IDs based upon the entire 16-bit value, lifting the restriction of 256 ports per switch. Fibre Channel standards require a unique FCID to be allocated to each N_Port attached to an Fx_Port. Some HBAs assume that only the area octet will be used to designate the port number; in other words, such an HBA assumes that no two ports have the same area value. When a target is assigned an FCID that has the same area value as the HBA but a different port value, the HBA fails to discover that target. To isolate these HBAs in a separate area, switches in the Cisco MDS 9000 Family follow a different FCID allocation scheme depending on the addressing capability of each HBA. By default, the FCID allocation mode is auto. In auto mode, only HBAs without interoperability issues are assigned FCIDs with both area and port bits; this is known as flat FCID assignment. All other HBAs are assigned FCIDs with a whole area (port bits set to 0). However, in some cases it may be necessary to explicitly disable flat FCID assignment mode if the switch cannot correctly detect the capability of the HBA.
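If the auto-detection gets it wrong, the list of OUIs that receive a whole area can typically be adjusted from the CLI. The sketch below is a hedged, assumed example: the company ID shown is arbitrary and the exact keywords may differ by release.

switch# configure terminal
! add an HBA vendor OUI to the list that is granted a complete area at fabric login
switch(config)# fcid-allocation area company-id 0x003223
switch(config)# end
switch# show fcid-allocation area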
Distributed Device Alias Services (DDAS) (SAN-OS 2.0)
• Simplify SAN configuration and management tasks
• User-friendly CLI and Fabric Manager commands and outputs
• Human-readable names (aliases) replace cryptic WWNs
• Fabric-wide distribution ensures no reconfiguration when a device is moved across VSANs
• Unique aliases minimize zone merge issues
[Slide figure: cryptic WWNs (WWN1 = 12:22:67:92:86:92:15:34, WWN2 = 02:12:35:86:93:08:64:43) replaced by aliases (Alias1 = Server-Oracle-ERP, Alias2 = Array-OLTP)]
Distributed Device Alias Services Distributed Device Alias Services (DDAS) simplifies SAN configuration and management tasks. User-friendly alias names can be employed in Fabric Manager and the CLI. By distributing device aliases fabric wide, no reconfiguration is required when a device is moved across VSANs. Zone merge issues are minimal when using unique aliases fabric wide. Future releases of SAN-OS will include the capability to dynamically assign aliases.
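A minimal, hedged sketch of defining and distributing a device alias in SAN-OS-style CLI, reusing the alias and example WWN from the slide; command details may vary by release.

switch# configure terminal
switch(config)# device-alias database
! give the cryptic pWWN a human-readable, fabric-wide name
switch(config-device-alias-db)# device-alias name Server-Oracle-ERP pwwn 12:22:67:92:86:92:15:34
switch(config-device-alias-db)# exit
! push the pending changes to all switches via CFS
switch(config)# device-alias commit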
N_Port Identifier Virtualization (NPIV) (SAN-OS 3.0)
• Provides the ability to assign multiple port IDs to a single N_Port
• Multiple applications on the same port can use different IDs in the same VSAN
• Allows zoning and port security to be implemented at the application level
• Designed for virtual server environments
[Slide figure: three virtual servers (email, web, print) sharing a single FC HBA issue three FLOGIs through one F_Port on the MDS switch, resulting in three FCIDs, three name server entries, and three virtual devices on a single FC port]
N_Port Identifier Virtualization
Fibre Channel standards define that an FC HBA N_Port must be connected to one and only one F_Port on a Fibre Channel switch. When the device is connected to the switch, the link comes up and the FC HBA sends a FLOGI command containing its pWWN to the FC switch, requesting a Fibre Channel ID. The switch responds with a unique FCID based upon the Domain ID of the switch, the Area ID, and the Port ID. This is fine for servers with a single operating environment, but it is restrictive for virtual servers that may have several operating environments sharing the same FC HBA: each virtual server requires its own FCID. NPIV provides the ability to assign a separate FCID to each virtual server that requests one through its own FLOGI command.
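Enabling NPIV on the switch side is a single global command in SAN-OS-style CLI; the sketch below is hedged, and the host-side HBA and hypervisor must also support NPIV.

switch# configure terminal
! allow multiple fabric logins, and therefore multiple FCIDs, per F_Port
switch(config)# npiv enable
switch(config)# end
! each virtual N_Port then appears as its own entry here
switch# show flogi database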
Cisco Fabric Services—Unifying the Fabric
Cisco Fabric Services (CFS)
• CFS is a protocol for configuration and discovery of fabric-wide services
• Distributes configuration information to all switches in the fabric
• Distribution is global to all CFS-enabled switches, regardless of VSAN
• Communication is in-band over FC links, or out-of-band over IP as a last resort
• Benefits: fast and efficient distribution; single point of configuration with fabric-wide consistency; plug-and-play SANs; session-based management
The Cisco Fabric Services (CFS) protocol distributes configuration information for WWN-based VSAN membership, Distributed Device Alias Services, Port Security, Call Home, Network Time Protocol (NTP), AAA servers, Inter-VSAN Routing zones, Syslog servers, role policies, and Fibre Channel timers to all switches in a fabric. From SAN-OS 3.0, CFS first attempts to distribute configuration in-band over FC EISLs between switches and, as a last resort, uses an out-of-band IP connection if one is available.
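A hedged sketch of the SAN-OS-style commands typically used to inspect and control CFS distribution; details vary by release.

! confirm that CFS distribution is enabled on this switch
switch# show cfs status
! list the other CFS-capable switches discovered in the fabric
switch# show cfs peers
switch# configure terminal
! allow CFS to fall back to IP transport (SAN-OS 3.0 and later)
switch(config)# cfs ipv4 distribute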
CFS Applications
• Consistent Syslog server, Call Home, and Network Time Protocol (NTP) configuration throughout the fabric aids in troubleshooting and SLA compliance
• Distributed Port Security, RADIUS/TACACS+, and Role-Based Access Control (RBAC) information for simpler security management
• Fabric-wide VSAN timer and IVR topology information propagation from a single switch
• Distributed Device Alias Service (DDAS) allows fabric-wide aliases, simplifying SAN administration
[Slide figure: CFS at the center distributing Syslog, Call Home, NTP, DDAS, Port Security, RADIUS, RBAC, VSAN timer, and IVR configuration]
CFS Applications
The Cisco Fabric Services protocol aids in the administration, management, and deployment of configuration settings SAN-wide. Consistent Syslog server, Call Home, and NTP configuration throughout the fabric aids in troubleshooting and SLA compliance. CFS-distributed Port Security, RADIUS/TACACS+, and RBAC information enhances and simplifies security by providing consistent and comprehensive security settings. Fabric-wide propagation of IVR and VSAN timer information from a single switch via CFS provides uniformity across the fabric. Fabric-wide distributed device aliasing simplifies SAN administration by providing consistent names for devices throughout the fabric, based upon the pWWN, regardless of VSAN.
Switch Interoperability
Overview of Switch Interoperability
Switches utilize their proprietary feature sets, so different vendors' switches often cannot interoperate with each other. Cisco MDS switches support five modes:
Cisco Native Mode – supports all Cisco proprietary features
Interop Mode 1 – FC-SW-2 compatible with all other vendors
Interop Mode 2 – legacy Brocade support for 16-port switches
Interop Mode 3 – legacy Brocade support for larger switches
Interop Mode 4 (SAN-OS 3.0) – legacy McData support
[Slide figure: proprietary features that do not interoperate across vendors: McData Open Trunking, Cisco MDS 9000 PortChannels and EISL, Brocade VC Flow Control]
Interoperability allows devices from multiple vendors to communicate across a SAN fabric. Fibre Channel standards (e.g., Fibre Channel Methodologies for Interconnect, FC-MI 1.92) have been put in place to guide vendors towards common external Fibre Channel interfaces. If all vendors followed the standards in the same manner, then interconnecting different products would become a trivial exercise. However, some aspects of the Fibre Channel standards are open to interpretation and include many options for implementation. In addition, vendors have extended the features laid out in the standards documents to add advanced capabilities and functionality to their feature set. Since these features are often proprietary, vendors have had to implement “interoperability modes” to accommodate heterogeneous environments.
Standard Interoperability Mode 1
• Standard interop mode (Interop Mode 1) requires all other switches in the fabric to be in Interop Mode 1
• Enables MDS 9000 switches to interoperate with McData, Brocade, and QLogic switches that are FC-SW-2 compatible
• Reduces the feature set supported by all switches
• Requires rebooting of third-party switches
• Can require a disruptive restart of an MDS VSAN
• Interop modes affect only the VSAN for which they are configured
[Slide figure: an MDS 9509 in a mixed fabric with Brocade 2800, Brocade 3900, McDATA 6064, and CNT/Inrange FC/9000 switches]
Interop Mode 1 The standard interoperability mode (Interop mode 1) enables the MDS to interoperate with third party switches that have been configured for interoperability. Interop 1 mode allows the MDS to communicate over a standard set of protocols with these switches. In Interop mode 1, the feature set supported by vendors in standard interoperability mode is reduced to a subset that can be supported by all vendors. This is the traditional way vendors achieve interoperability. Most non-Cisco switches require a reboot when configured into standard interoperability mode. On Cisco switches, Interop mode is set on a VSAN rather than the whole switch. As a result, an individual VSAN may need to be restarted disruptively to implement interop 1 mode, but the entire switch does not require a reset.
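A hedged sketch of how interop mode 1 might be applied to a single VSAN in SAN-OS-style CLI; the VSAN number is an arbitrary example, and suspending the VSAN is the disruptive restart mentioned above.

switch# configure terminal
switch(config)# vsan database
! put only VSAN 10 into standard interoperability mode
switch(config-vsan-db)# vsan 10 interop 1
! disruptively restart the VSAN so the mode change takes effect
switch(config-vsan-db)# vsan 10 suspend
switch(config-vsan-db)# no vsan 10 suspend
switch(config-vsan-db)# end
switch# show vsan 10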
Legacy Interoperability Modes

Legacy modes allow the MDS 9000 to integrate seamlessly with Brocade and McData switches that are running in native mode:
- Does not restrict the use of proprietary features
- Does not require a reboot of the switch

Interop Mode 2: supports Brocade switches with 16 or fewer ports (e.g. 2100/2400/2800/3200); native core PID format = 0
Interop Mode 3: supports Brocade switches with more than 16 ports (e.g. 3900/12000/24000); native core PID format = 1
Interop Mode 4: supports McData switches and directors

[Slide: an MDS connects to a Brocade 2800 (Interop 2), a Brocade 3900 (Interop 3), and a McData 6140 (Interop 4).]
Legacy Interop Modes
MDS switches can operate in legacy modes that allow integration into existing Brocade or McData fabrics without rebooting or reconfiguring those switches and without reducing the feature set.
Interop Mode 2: This mode allows seamless integration with specific Brocade switches (2100/2400/2800/3800 series) running in their own native mode of operation. Interop Mode 2 enables MDS switches to interoperate with older Brocade switches that use a restrictive PID format (core PID = 0) that permits only 16 ports per domain. This restrictive format, also referred to as the core PID format (CORE PID = 0), is common in Brocade fabrics that do not have a 3900 or 12000 switch in the fabric.
Interop Mode 3: This mode allows seamless integration with specific Brocade switches (3900 and 12000) running in their own native mode of operation. Interop Mode 3 enables MDS switches to interoperate with newer Brocade switches that use a less restrictive PID format (core PID = 1) that permits up to 256 ports per domain. This format (CORE PID = 1) is common in Brocade fabrics with at least one 3900, 12000, or later model in the fabric. In such fabrics, all the other 2800/3800 switches must be set to CORE PID = 1 explicitly, which is a disruptive operation requiring a reboot of every switch.
Interop Mode 4: This mode allows seamless integration with McData switches and directors running in their own native mode of operation.
Note
Brocade Fabric switches with port counts higher than 16 (models 3900 and 12000) require that the core PID value be set to 1. Earlier models, with 16 or fewer ports, set the core PID to 0. These older Brocade switches allocated one nibble of the FCID / PID in area field 0x0 – F for port numbers, limiting the port count to 16. When the core PID is set to 1, the allocated bytes in the FCID/PID allow the use of port numbers 0x00 – FF.
Inter-VSAN Routing and Interop Modes

[Slide: an IVR edge switch joins VSAN 100 (Interop Mode 2 or 3, a Brocade fabric in native mode containing a backup server, Brocade switches, and a storage array) to VSAN 200 (Interop Mode 4, a McData fabric in native mode containing a McData director and a tape library).]

- Use IVR to seamlessly back up data in a Brocade fabric to a tape library in a McData fabric
- IVR is supported by MDS 9100, 9200 and 9500 switches and is included in the Enterprise license package
- Enables true SAN consolidation of storage and tape devices across the enterprise
Using IVR with Interop Modes
This example shows how VSANs and IVR can be used to allow a backup server in a Brocade fabric to seamlessly back up data to a tape library in a McData fabric. The Brocade switches are connected to an MDS interface in VSAN 100, which is placed in Interop Mode 2 or 3 depending upon the Brocade switch model and core PID type. The McData director is connected to an MDS interface in VSAN 200, which is placed in Interop Mode 4. IVR is fully compliant with Fibre Channel standards, so it is completely transparent to the transfer of frames from one fabric to another. This unique feature of Cisco MDS switches allows, for the first time, different vendors' fabrics to be joined in a SAN without having to disruptively put each switch into Interop Mode 1. This enables true SAN consolidation of storage and tape devices across the enterprise.
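A hedged sketch of the IVR configuration that could join a device in each interop VSAN is shown below (zone, zoneset, and pWWN values are hypothetical; a production configuration would also define or auto-discover the IVR VSAN topology):

switch# config t
switch(config)# ivr enable
switch(config)# ivr vsan-topology auto
switch(config)# ivr zone name BackupSrv_to_Tape
switch(config-ivr-zone)# member pwwn 21:00:00:e0:8b:05:76:28 vsan 100
switch(config-ivr-zone)# member pwwn 50:06:0b:00:00:10:a2:c4 vsan 200
switch(config-ivr-zone)# exit
switch(config)# ivr zoneset name IVR_ZS1
switch(config-ivr-zoneset)# member BackupSrv_to_Tape
switch(config-ivr-zoneset)# exit
switch(config)# ivr zoneset activate name IVR_ZS1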
Lesson 5
Remote Lab Overview Overview This lesson provides an introduction to the hands-on labs in this course.
Objectives Upon completing this lesson, you will be able to become familiar with MDS 9000 management. This includes being able to meet these objectives:
Identify the system memory areas of the MDS 9000 supervisor
Describe the features of the MDS 9000 CLI
Describe the basic features of Cisco Fabric Manager and Device Manager
Explain how to perform the initial configuration of an MDS 9000 switch
Access the MDS 9000 remote storage labs
System Memory Areas

Flash memory:
- Bootflash: (internal flash) - Kickstart image, System image, license files
- Slot0: (external flash)
- Modflash: (SSM flash)

RAM memory:
- System: - Linux system space, SAN-OS, running config
- Volatile: - temporary file space
- Log: - logfile

NVRAM memory:
- NVRAM: - boot parameters (Kickstart + System), startup config (written with copy run start)

- The Bootflash: contains the Kickstart and System images used for booting the MDS
- All config changes made by the CLI or FM/DM are instantly active and held in the running-config
- #Copy run start saves the running-config to the startup-config in NVRAM
- The startup-config is loaded when the switch is rebooted
- Temporary files may be stored in the Volatile: system area
The Cisco MDS contains an internal Bootflash: used for holding the current bootable images, Kickstart and System. License files are also stored here, but the Bootflash: can also be used for storing any file, including copies of the startup config. MDS 9500 supervisors also have an external flash memory slot called Slot0: that is used for transferring image files between switches. The SSM linecard also contains an internal Modflash: used for storing application images. The system RAM memory is used by the Linux operating system, and a Volatile: file system is used for storing temporary files. Any changes made to the switch operating parameters or configuration are instantly active and held in the running-configuration in RAM. All data stored in RAM is lost when the MDS is rebooted, so an area of non-volatile RAM (NVRAM) is used for storage of critical data. The most critical of these is the running-configuration for the switch. The running-configuration should be saved to the startup-configuration in NVRAM using the CLI command #Copy run start, so that the configuration is preserved across a switch reboot. During the switch boot process, it is essential that the switch knows where to find the Kickstart and System images and what they are called. Two boot parameters are held in NVRAM that point to these two files.
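For example, a brief sketch of checking the boot parameters and preserving the configuration from EXEC mode:

switch# show boot                            ! displays the kickstart and system boot parameters held in NVRAM
switch# copy running-config startup-config   ! saves the active configuration to NVRAM
switch# dir bootflash:                       ! lists the images and license files held on internal flash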
Boot Sequence

- Both the Kickstart image and the System image need to be present for a successful boot
- The boot parameters point to the location of both the Kickstart and System images
- The boot process will fail if the parameters are wrong or the images are missing
- The install command simplifies the process and checks for errors

Boot stages:
- BIOS: runs POST, loads the Bootloader
- Bootloader: gets the Kickstart boot parameters, verifies the Kickstart image and loads it (loader> prompt)
- Kickstart: loads the Linux kernel and drivers, gets the System boot parameters, verifies the System image and loads it (switch(boot)# prompt)
- System: loads SAN-OS into RAM (Linux system space), checks the file systems, loads the startup-config (switch# prompt)

NVRAM boot parameters:
#boot system bootflash:system30.img
#boot kickstart bootflash:kickstart30.img

Bootflash: (internal flash) holds system30.img and kickstart30.img
Boot Sequence
When the MDS is first switched on, or during a reboot, the system BIOS on the supervisor module first runs POST (power-on self test) diagnostics and then loads the Bootloader bootstrap function. The boot parameters are held in NVRAM and point to the location and name of both the Kickstart and System images. The Bootloader obtains the location of the Kickstart file, usually on Bootflash:, and verifies the Kickstart image before loading it. The Kickstart loads the Linux kernel and device drivers and then needs to load the System image. Again, the boot parameters in NVRAM should point to the location and name of the System image, usually on Bootflash:. The Kickstart then verifies the System image and loads it. Finally, the System image loads SAN-OS, checks the file systems, and proceeds to load the startup-config, containing the switch configuration, from NVRAM. If the boot parameters are missing or have an incorrect name or location, the boot process fails at the last stage; if this happens, the administrator must recover from the error and reload the switch. The install all command is a script that greatly simplifies this procedure and checks for errors and the upgrade impact before proceeding.
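As a sketch (the image filenames are placeholders), the boot parameters can be set manually and saved, or the install all script can be used to perform the equivalent steps with built-in verification:

switch# config t
switch(config)# boot kickstart bootflash:kickstart30.img
switch(config)# boot system bootflash:system30.img
switch(config)# exit
switch# copy running-config startup-config
switch# install all kickstart bootflash:kickstart30.img system bootflash:system30.img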
CLI Overview

Command-Line Interface (CLI): multiple connection options and protocols
- Direct console or serial link (VT100)
- Secure Shell access (SSH, encrypted)
- Terminal Telnet (TCP/IP over Ethernet or Fibre Channel)
There are multiple connection options and protocols available to manage the MDS 9000 Family switches via the CLI. The initial configuration must be done using VT100 console access, which can be a direct connection or a serial link such as a modem. Once the initial configuration is complete, you can access the switch using either Secure Shell or Telnet. The Secure Shell (SSH) protocol provides a secure, encrypted means of access. Terminal Telnet access involves a TCP/IP out-of-band (OOB) connection through the 10/100-Mbps Ethernet port or an in-band connection via IP over FC. You can access the MDS 9000 Family of switches for configuration, status, or management through the console port, or initiate a Telnet session through the OOB Ethernet management port or through the in-band IP-over-FC management feature. The console port is an asynchronous port with a default configuration of 9600 bps, 8 data bits, no parity, and 1 stop bit. This port is the only means of accessing the switch after the initial power-up until an IP address is configured for the management port. Once an IP address is configured, you can Telnet to the switch through the Mgmt0 management interface on the supervisor card. In-band IP over FC is used to manage remote switches through the local Mgmt0 interface.
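For illustration (the address and account are placeholders), once Mgmt0 has an IP address the switch can typically be reached from a management workstation with SSH or, if enabled, Telnet:

workstation$ ssh admin@10.1.1.10      (secure, encrypted CLI session to the Mgmt0 address)
workstation$ telnet 10.1.1.10         (unencrypted alternative)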
CLI Features

- Structured hierarchy - easier to remember
- Style consistent with IOS software
- Commands may be abbreviated
- Help facility: context-sensitive help (?), command completion (Tab), command history buffer (arrow keys), console error messages
- Command Scheduler with support for running shell scripts
- Support for command variables and aliases
- Configuration changes must be explicitly saved before reboot: copy running-config startup-config (abbreviated to copy run start)
CLI Features The CLI enables you to configure every feature of the switch. More than 1700 combinations of commands are available and are structurally consistent with the style of Cisco IOS software CLI. The CLI help facility provides:
Context-sensitive help - Provides a list of commands and associated arguments. Type ? at any time, or type part of a command and type ?.
Command completion - The Tab key completes the keyword you have started typing.
Console error messages - Identify problems with any switch commands that are incorrectly entered so that they may be corrected or modified
Command history buffer - Allows recalling of long or complex commands or entries for reentry, renewing, or correction
MDS Command Scheduler - Provides a UNIX “cron” like facility in the SAN-OS that allows the user to schedule a job at a particular time or periodically.
Configuration changes must be explicitly saved, and configuration commands are serialized for execution across multiple SNMP sessions. To save the configuration, enter the copy running-config startup-config command to save the new configuration into nonvolatile storage. Once this command is issued, the running and startup copies of the configuration are identical. Every configuration command is logged to the RADIUS server.
CLI Modes

EXEC mode: show system information, run debug, copy and delete files, get a directory listing for bootflash:
Configuration mode: configure features that affect the switch as a whole
Configuration submode: configure switch sub-parameters

[Slide: from the switch prompt (switch#), EXEC mode offers commands such as show (flogi database, fcns database), debug, copy, and dir against bootflash: and slot0:. The config terminal command enters config mode (fspf, fcdomain, zoneset, and interface commands for fc, fcip, port-channel, mgmt, and iscsi interfaces); an interface command opens a config submode (switchport, shut, no shut). exit steps back one level, and end returns directly to EXEC mode.]
CLI Modes Switches in the MDS 9000 Family have three command mode levels:
User EXEC mode
Configuration mode
Configuration submodes
The commands available to you depend on the mode that you are in. To obtain a list of available commands, type a “?” at the system prompt. From the EXEC mode, you can perform basic tests and display system information. This includes operations other than configuration such as show and debug. Show commands display system configuration and information. Debug commands enable printing of debug messages for various system components.

Use the config or config terminal command from EXEC mode to go into the configuration mode. The configuration mode has a set of configuration commands that can be entered after a config terminal command, in order to set up the switch. The CLI commands are organized hierarchically, with commands that perform similar functions grouped under the same level. For example, all commands that display information about the system, configuration, or hardware are grouped under the show command, and all commands that allow you to configure the switch are grouped under the config terminal command, which includes switch sub-parameters at the configuration submode level. To execute a command, you enter the command by starting at the top level of the hierarchy. For example, to configure a Fibre Channel interface, use the config terminal command. Once you are in configuration mode, issue the interface command. When you are in the interface submode, you can query the available commands there.
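A short sketch of moving through the three mode levels (the interface number is arbitrary):

switch# config terminal              ! EXEC mode to configuration mode
switch(config)# interface fc1/1      ! configuration mode to the interface submode
switch(config-if)# no shut           ! enable the interface
switch(config-if)# exit              ! back to configuration mode
switch(config)# end                  ! back to EXEC mode
switch# show interface fc1/1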
Useful CLI Commands

# copy run start                          Saves the active config in NVRAM
# dir bootflash:                          Lists files stored on bootflash:
# erase bootflash:temp                    Erases a file stored on bootflash:
# copy slot0:tmp bootflash:temp.txt       Copies a file and changes the name
# debug flogi                             Monitors all FLOGI operations
# no debug all                            Switches off debugging
# show tech-support                       Gathers switch info for support
# show tech-support > tempfile            Saves the output in volatile:tempfile
# gzip volatile:tempfile                  Compresses tempfile
# copy volatile:tempfile slot0:temp       Copies the file to the external flash card
# config t                                Enters config mode to change settings
(config)# int fc x/y                      Configures a specific interface
(config-if)# switchport speed 1000        Configures the port as a 1-Gbps port
Useful CLI Commands
The top part of the list shows useful commands that can be entered in EXEC mode. Changes to the configuration can only be made by entering configuration mode first and then entering the appropriate commands. More information can be found in the Cisco MDS Command Reference guide.
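For example, a hedged sketch that strings several of the commands above into a quick login troubleshooting pass (the interface number is arbitrary):

# show interface fc1/5               ! check link state and error counters on the suspect port
# show flogi database                ! confirm the device has performed fabric login
# show fcns database                 ! confirm the device is registered with the name server
# debug flogi                        ! watch fabric logins in real time
# no debug all                       ! always switch debugging off afterwards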
Useful CLI Show Commands

# show environment power       Check power ratings
# show interface brief         Summary of all interfaces
# show interface fc x/y        Detailed information about an interface
# show module                  Detailed status of all modules
# show hardware                Detailed hardware status
# show version                 View current software versions
# show license usage           List installed licenses and their status
# show running-config          View active switch settings
# show vsan                    List all created VSANs
# show vsan membership         List interfaces by VSAN
# show zoneset active          Show all active zones and zonesets
# show flogi database          List all devices logged in to the MDS
# show fcns database           List all name server entries
Useful CLI Show Commands The show commands are too extensive to list individually so here are some common ones. More information can be found by looking in the Cisco MDS Command Reference guide.
Command Aliases

- Replaces complex command strings with an alias name
- Command aliases persist across reboots
- Commands being aliased must be typed in full without abbreviation
- A command alias always takes precedence over CLI keywords
- (config)# cli alias name gigint interface gigabitethernet
Command Aliases
Some commands can require a lot of typing; an example is the gigabitethernet keyword, which can sometimes be shortened to gig. It is also useful to group several commands and subcommands together, and this can be done using command aliases. Command aliases are saved in NVRAM, so they persist across reboots. When creating an alias, the individual commands must be typed in full without abbreviation. A defined alias takes precedence over CLI keywords starting with the same letters, so be careful when using abbreviations.
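A short sketch of defining and then using aliases (the alias names are chosen only for illustration); note that the command being aliased is typed in full when the alias is created:

switch# config t
switch(config)# cli alias name gigint interface gigabitethernet
switch(config)# cli alias name shfl show flogi database
switch(config)# exit
switch# shfl                         ! expands to "show flogi database"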
Command Scheduler

- Helps schedule configuration and maintenance jobs in any MDS switch
- Schedule jobs on a one-time basis or periodically:
  One-time mode - the job is executed once at a pre-defined time
  Periodic mode - the job is executed daily, weekly, monthly, or at a configurable interval (delta)
- The MDS date and time must be accurately configured
- Scheduled jobs may fail if an error is encountered (for example, if a license has expired or a feature is disabled)
- All jobs are executed non-interactively
Command Scheduler
The Cisco MDS SAN-OS provides a UNIX cron-like facility called the Command Scheduler. Jobs can be defined that list several commands to be executed in order, and can be scheduled to run at the same time every day, week, or month, or at a configurable interval (delta). All jobs are executed non-interactively, without administrator interaction. Be aware that a job may fail if a command it issues is disabled or no longer supported, for example because a license has expired. The job will fail at the point of error, and all subsequent commands in that job will be ignored.
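A rough sketch of a nightly job, assuming the SAN-OS scheduler commands shown below (job and schedule names are hypothetical):

switch# config t
switch(config)# scheduler enable
switch(config)# scheduler job name nightly-save
switch(config-job)# copy running-config startup-config
switch(config-job)# copy startup-config bootflash:startup-backup.cfg
switch(config-job)# end
switch# config t
switch(config)# scheduler schedule name daily-2300
switch(config-schedule)# job name nightly-save
switch(config-schedule)# time daily 23:00
switch(config-schedule)# end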
Fabric Manager and Device Manager

Installing Fabric Manager
- Point a browser at the management IP address of the MDS switch assigned at initial setup
- Click the link to install Fabric Manager
- Requires Java Web Start
- Installs Java applets from the web server running on the MDS switch
- Follow the prompts
- FM Server runs as a Windows service
- Performance Manager runs as a Windows service
- Runs as a daemon on Unix
Installing Fabric Manager
The Fabric Manager is an SNMP-based device management application with a Java web-based GUI used to view and configure multiple MDS 9000 Family director and fabric switches. The software is downloaded automatically to the end user's (management) workstation using Java Web Start, and secure SNMPv3 communications are used to get and set switch parameters. Open a browser window and, in the address bar, enter the IP address of the switch you wish to manage. The MDS switch will respond with a single web page from the MDS web server. If the Java Runtime Environment and Java Web Start have not already been installed, the web page will include a link in red pointing to the Sun website for downloading Java. The Cisco Fabric Manager web page contains two links for downloading the Java applets for Fabric Manager and Device Manager; click on the "Install Fabric Manager" link and follow the on-screen prompts. Cisco Fabric Manager is used to show the entire fabric, containing all the switches, hosts, and storage devices. Cisco Device Manager is used to manage a single switch. To open Device Manager, double-click the green icon for a switch in the Fabric Manager topology view.
Fabric Manager Features

Configuration, diagnosis, and monitoring (Device View, Fabric View, Summary View):
- Switch-embedded Java application: downloaded via web browser, installed and updated automatically by Java Web Start, runs on the client's workstation
- SNMPv3 secures communications with switches
- Discovers the FC fabric and displays a topology map
- Enables rapid multi-switch configuration
- Summary view includes in-line bar charting, sorting, and drill-down capabilities; chart, print, or output to file
Fabric Manager Features The Fabric Manager provides three management views and a Performance Manager traffic analysis interface. The Fabric View displays a map of your network fabric, including Cisco MDS 9000 switches, hosts, and storage devices. The Device View displays a graphic representation of the switch configuration and provides access to statistics and configuration information for a single switch. The Summary View displays a summary of xE_Ports (interswitch links), Fx_Ports (fabric ports), and Nx_Ports (attached hosts and storage) on a single switch. For more detailed information, Performance Manager included with the Fabric Manager Server license provides detailed traffic analysis by capturing data with the Cisco Port Analyzer Adapter. This data is compiled into various graphs and charts which can be viewed with any web browser.
Fabric Manager Features (Cont.)

- FM + DM = CLI - Debug
- All features supported: VSANs, zones, PortChannels, RMON alerts, event filters, SNMP users and roles
- Real-time and historical statistics monitoring
- Fabric troubleshooting and analysis tools: switch health, end-to-end connectivity, configuration, zone merge, traceroute
- Device View provides status at a glance: fan, power, supervisor, and switching module status indicators; port status indicators
Fabric Manager discovers network devices and creates a topology map with VSAN and zone visualization. VSAN/zone and switch trees are also available to simplify configuration. Immediately after the Fabric View is opened, the discovery process begins. Using information gathered from a seed switch (MDS 9000 Family), including name server registrations and FC-GS-3 fabric configuration server information, the Fabric Manager can draw a fabric topology in a user-customizable map. Because of the source of this information, any third-party devices, such as other fabric switches that support the FC-GS and FC-GS-3 standards, are discovered and displayed on the topology map. Vendor Organizational Unique Identifier (OUI) values are translated to derive the manufacturer of third-party devices, such as QLogic Corp., EMC Corp., or JNI Corp. Fabric Manager provides an intuitive user interface to a suite of network analysis and troubleshooting tools. One of those tools is the Device Manager, which is a complementary graphical user interface designed for configuring, monitoring, and troubleshooting specific switches within the SAN fabric.
System Setup and Configuration

Initial Setup
- The initial setup routine is performed via a connection to the switch console port (console port parameters: 9600 8-N-1)
- The initial setup routine prompts you for the IP address and other configuration information necessary for the switch to communicate over the management interface

Switch# setup
...
This setup utility will guide you through the basic configuration of the system. Setup configures only enough connectivity for management of the system.
Press Enter in case you want to skip any dialog. Use ctrl-c at anytime to skip away remaining dialogs.
Would you like to enter the basic configuration dialog (yes/no): yes
Initial Setup
The first time that you access a switch in the Cisco MDS 9000 Family, it runs a setup program that prompts you for the IP address and other configuration information necessary for the switch to communicate over the supervisor module Ethernet interface. This information is also required if you plan to configure and manage the switch. The IP address must first be set up in the CLI when the switch is powered up for the first time, so that the Cisco MDS 9000 Fabric Manager can reach the switch. The console needs a rollover RJ-45 cable. There is a switch on the supervisor module of the MDS 9500 Series switches that, if placed in the "out" position, allows the use of a straight-through cable; it is shipped in the "in" position by default and is located behind the LEDs. In order to set up a switch for the first time, you must obtain the administrator password, which is used to get network administrator access through the CLI. The Simple Network Management Protocol version 3 (SNMPv3) user name and password are used when you log on to the Fabric Manager and should be identified as soon as possible. The switch name becomes the prompt when the switch is initialized, and the management Ethernet port IP address and subnet mask need to be known for out-of-band access.
Setup Defaults

Configure default switchport interface state (shut/noshut) [shut]:
Configure default switchport trunk mode (on/off/auto) [on]:
Configure default zone policy (permit/deny) [deny]:
Enable full zoneset distribution (yes/no) [n]:
Enable FCID persistence in all the VSANs on this switch (yes/no) [n]:
Would you like to edit the configuration (yes/no) [no]:
Use this configuration and save it? (yes/no) [y]:

- The management interface is active at this point
- All Fibre Channel and Gigabit Ethernet interfaces are shut down
- Select yes to use and save the configuration
- The setup routine can be accessed from the EXEC mode of the CLI with the # setup command
Setup Defaults
It is recommended to have the switch interfaces come up administratively disabled, or "shut." This approach ensures that the administrator has to configure each interface as needed and then enable it with the no shut command, resulting in a more controlled environment. The switchport trunk mode should be on; two connected E_Ports will not trunk if one end port has trunk mode off. The default zoning policy of "deny" makes all interfaces on a switch inoperable until a zone is created and activated, because interfaces in the default zone cannot communicate with each other. This policy can be used for greater security. If the permit policy is enabled, then all ports in the default zone can communicate with each other.

The system asks if you would like to edit the configuration that was just printed out. Any configuration changes made to a switch are immediately enforced but are not saved. If no edits are needed, you are asked if you want to use this configuration and save it as well. Since [y] ("yes") is the default selection, pressing Return activates this function, and the configuration becomes part of the running-config and is copied to the startup-config. This also ensures that the kickstart and system boot images are automatically configured, so you do not have to run a copy command after this process. A power loss restarts the switch using the startup-config, which contains everything that has been configured to non-default values. If you do not save the configuration at this point, none of your changes will be preserved the next time the switch is rebooted.
Using the MDS 9000 Remote Storage Labs

MDS 9000 Remote Storage Labs
- 24x7x365 support for training events and customer demos
- Full console and desktop access
- 30 student "pods"
- 60 MDS 9000 switches
- Real live equipment, not a simulation
The labs are located in Nevada and are used extensively, 24 hours a day, during MDS lab-based training courses throughout the world. The lab interface provides login authentication and full access to switch consoles, as well as desktop access on each server. The labs currently contain over 30 "pods", each containing two MDS switches, two servers, JBOD storage, and a PAA for diagnostics.
Remote Storage Lab Interface

- Point a browser at www.labgear.net
- Enter your username and password
- Click Console to access the MDS CLI
- Click Desktop to access the W2K server
Remote Storage Lab Interface
Open a browser and, in the address bar, type www.labgear.net. This opens the main window to authenticate your session. Enter your labgear pod username and password; these will be assigned to you by your instructor. To access the MDS switch console, click on the green Console button. To access the desktop of either of the Win2K servers, click on the Desktop button and enter the username and password. The username is administrator and the password is cisco.
CSDF Labs

- Lab 1: Initial Switch Config
- Lab 2: Accessing Disks via Fibre Channel
- Lab 3: Configuring High Availability SAN Extension
- Lab 4: Configuring IVR for SAN Extension
- Lab 5: Exploring Fabric Manager Tools
- Lab 6: Implementing iSCSI
This slide shows the labs that you will perform in this course.
Lesson 6
Network-Based Storage Applications Overview This lesson explains how the MDS 9000 Storage Services Module (SSM) enables network-based storage applications.
Objectives Upon completing this lesson, you will be able to explain how the MDS 9000 Storage Services Module (SSM) enables network-based storage applications. This includes being able to meet these objectives:
Explain the basics of SAN-based storage virtualization
Explain the value of network-based storage virtualization
Describe the network-hosted application services supported by the SSM
Describe the network-assisted application services supported by the SSM
Describe the network-accelerated application services supported by the SSM
Describe Fibre Channel Write Acceleration
Storage Virtualization Overview

Storage Services for SANs Today

Server management:
- Mirror, stripe, concatenate, slice; coordinated across hosts
- Application integration and multipathing
- Individually managed; mirroring, striping, and concatenation coordinated with disk array groupings
- Each host has a different view of storage
- LUN Mapping and LUN Masking provide paths between initiators and targets

Volume management:
- Individually managed
- Just-in-case provisioning and stranded capacity
- Snapshot within a disk array

Array services:
- RAID
- HA upgrades
- Multiple paths
- Snapshot
- Array-to-array replication
Storage Services for SANs Today
Storage services for SANs today are usually a hodge-podge of ad hoc solutions. Managing individual volumes and multipathing at the host level adds to the complexity of SAN administration, and each server requires its own investment in management and attention. SAN administrators typically over-provision storage in this scenario as a strategy to reduce the amount of time spent on resource management; unfortunately, this results in a lot of underutilized and wasted storage. Redundancy and replication tasks are often handled at the array level, frequently in a same-box-to-same-box configuration or by using a third-party software utility to replicate across heterogeneous storage. This adds an additional layer of complexity to the information lifecycle and overall SAN management, and low-value data winds up living on expensive storage.
LUN Mapping and LUN Masking

LUN Masking: used in the storage array to mask or hide LUNs from servers that are denied access. The storage array makes specific LUNs "available" to server ports identified by their pWWN.

LUN Mapping (in the HBA): the server then maps some or all visible LUNs to volumes.

Target identification: the server FC driver identifies the SCSI target ID with the pWWN of the target port, then associates each port with its Fibre Channel FCID. Command frames are then sent by the SCSI initiator (server) to the SCSI target (storage device).

In a heterogeneous SAN, there may be several storage arrays and JBODs from different vendors:
- Difficult to configure
- Costly to manage
- Difficult to replicate and migrate data

[Slide: servers with FC HBAs (LUN Mapping) connect through MDS switches to storage arrays (RAID configuration and LUN Masking).]
LUN Mapping and LUN Masking
In most SAN environments it is essential that each individual LUN is discovered by only a single server HBA (Host Bus Adapter); otherwise the same volume will be accessed by more than one file system, leading to potential data loss or loss of security. There are basically three ways to ensure that this does not happen: LUN Masking in the storage array, LUN Mapping in the host, or LUN zoning in the MDS switch in the network. LUN Masking is a feature of enterprise storage arrays that provides basic LUN-level security by allowing LUNs to be seen only by selected servers, identified by their pWWN. Each storage array vendor has its own proprietary management techniques for LUN Masking, so in a heterogeneous environment with arrays from different vendors, LUN management becomes more difficult. JBODs (Just a Bunch of Disks) do not have a management function or controller, so they do not support LUN Masking. LUN Mapping is a feature of FC HBAs that allows the administrator to selectively map only some of the LUNs that have been discovered by the HBA. LUN Mapping must be configured on every HBA, so in a large SAN this is a huge management task; most administrators configure the HBA to automatically map all LUNs that it discovers and perform LUN management in the array (LUN Masking) or in the network (LUN zoning) instead. LUN zoning is a proprietary technique offered by Cisco MDS switches that allows LUNs to be selectively zoned to their appropriate host port. LUN zoning is usually used instead of LUN Masking in heterogeneous environments or where JBODs are installed.
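As a hedged sketch of the LUN zoning idea (the WWNs, LUN number, zone and zoneset names, and VSAN are placeholders; the member syntax follows the general MDS zoning pattern), a zone member can be restricted to a specific LUN behind a target port:

switch# config t
switch(config)# zone name Host1_Array1_LUN5 vsan 10
switch(config-zone)# member pwwn 21:00:00:e0:8b:05:76:28
switch(config-zone)# member pwwn 50:06:04:82:bf:d0:54:32 lun 0x5
switch(config-zone)# exit
switch(config)# zoneset name ZS1 vsan 10
switch(config-zoneset)# member Host1_Array1_LUN5
switch(config-zoneset)# exit
switch(config)# zoneset activate name ZS1 vsan 10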
What is Virtualization?

The process of presenting a logical grouping or subset of computing resources so that they can be accessed in ways that give benefits over the original configuration.

[Slide: a virtualization layer sits between the consumers of resources and the physical resources.]
- Server virtualization is a way of creating several virtual machines from one computing resource
- Storage virtualization is a logical grouping of LUNs creating a common storage pool
What is Virtualization? Virtualization is defined as the process of presenting a logical grouping or subset of computing resources so that they can be accessed in ways that give benefits over the original configuration. In a heterogeneous environment, LUN management can become very costly and time consuming. Storage Virtualization is sometimes used instead to create a common pool of all storage and perform LUN management within the network.
Server Virtualization is a way of creating several Virtual Machines from one computing resource
Storage Virtualization is a logical grouping of LUNs creating a common storage pool
Symmetric Virtualization

All server ports are zoned with the virtualization appliance's virtual target port (T), so servers only discover one target. All storage ports are zoned with the appliance's virtual initiator port (I), so all storage ports are controlled by one initiator. All control and data frames are sent to the virtual target and terminated; the CDB and LUN are remapped and a new frame is sent to the real target.

Advantages:
- Reduced complexity - single point of management

Disadvantages:
- All frames are terminated and remapped by the appliance and resent to their destination
- Adds latency per frame
- All traffic passes through the appliance
- Potential single point of failure
- Potential performance issue
Symmetric Virtualization In the symmetric approach, all I/Os and metadata are routed via a central virtualization storage manager. Data and control messages use the same path, which is architecturally simpler but has the potential to create a bottleneck. The virtualization engine does not have to live in a completely separate device. It may be embedded in the network, as a specialized switch, or it may run on a server. To provide alternate data paths and redundancy, there are usually two or more virtual storage management devices; this leads to issues of consistency between the metadata databases used to do the virtualization. The fact that all data I/Os are forced through the virtualization appliance restricts the SAN topologies that can be used and can cause bottlenecking. The bottleneck problem is often addressed by using caching and other techniques to maximize the performance of the engine; however, this again increases complexity and leads to consistency problems between engines.
Asymmetric Virtualization

Each server contains an agent that intercepts block I/O requests and sends the metadata (CDB and LUN) over the LAN to a Virtualization Manager. The Virtualization Manager re-maps the CDB and LUN and returns it to the server. The server then sends the modified control frame to the storage target port. All subsequent data and response frames flow directly between initiator and target.

Advantages:
- Data frames are sent directly to the storage port
- Low latency

Disadvantages:
- Requires an agent in the host to intercept control frames
- Remapping of the CDB and LUN adds latency to the first frame in an exchange
- The Virtualization Manager could be a single point of failure
Asymmetric Virtualization In the asymmetric approach, the I/O is split into three parts:
First, the server intercepts the Block I/O requests
Then queries the metadata manager to determine the physical location of the data.
Then, the server stores or retrieves the data directly across the SAN.
The metadata can be transferred in-band, over the SAN, or out-of-band, over an Ethernet link; the latter is more common as it avoids IP metadata traffic slowing the data traffic throughput on the SAN, and because it does not require Fibre Channel HBAs that support IP. Each server which uses the virtualized part of the SAN must have a special interface or agent installed to communicate with the metadata manager in order to translate the logical data access to physical access. This special interface may be software or hardware. Initial implementations will certainly be software, but later implementations might use specialized HBAs, or possibly an additional adapter which works with standard HBAs.
Network-Based Storage Virtualization

Single point of management:
- Insulate servers from storage changes
- Data migration
- Highly resilient storage upgrades
- Capacity on demand
- Increased utilization
- Consolidation
- Legacy investment protection
- Heterogeneous storage network
- Simplified data protection (snapshots, replication)
- Different classes of storage for different purposes

[Slide: hosts (application integration, multipathing) connect through MDS switches running virtualization (LUN abstraction, mirroring, striping, snapshot, replication) to storage arrays (RAID, HA upgrades, multiple paths).]
Network based virtualization offers substantial benefits that overcome the challenges of traditional SAN management solutions. Network based virtualization means that management is now consolidated into a single point and simplified - hosts and storage are now independent of the various management solutions.
Servers are no longer responsible for volume management and data migration
Network based virtualization enables real-time provisioning of storage, reducing the waste and overhead of over-provisioning storage.
Legacy and hetero storage assets can be consolidated and fully utilized
Data is better protected by simplified snapshot and replication techniques
Easier to assign different classes of data
What are some existing approaches to storage virtualization? How is the MDS series a superior solution?
Network-Based Storage Applications

Storage virtualization today:
- Host-based apps: app integration, multipathing, volume management
- Array-based apps: RAID/volume management, multiple paths, snapshot, replication

Network-based virtualization:
- Host-based apps: app integration, multipathing
- Network-based apps: volume management, snapshot, replication
- Array-based apps: RAID, multiple paths

Customer benefits: information lifecycle management, increased storage utilization, improved business continuance.

Proof points:
- Simplified management
- Non-disruptive data migration across tiered storage
- Heterogeneous storage pooling
- Flexible storage provisioning
- Supports point-in-time copy, replication
- Flexible data protection services
Network-Based Storage Applications
Network-based applications today are provided by the following vendors:
EMC Invista
Incipient Network Storage Platform
Veritas Storage Foundation™ for Networks
Benefits:

- Insulate servers: All storage changes, including upgrades to storage arrays, are seamless to the hosts.
- Consolidation: Different types of storage can accumulate through mergers and acquisitions, reorganizations, or vendor shifts within an IT department. Network-based virtualization allows you to incorporate new storage seamlessly and maintain the same services and scripts.
- Migration: The ability to move data seamlessly from one set of storage to another. (Note that some do this with a host-based volume manager; what if you have thousands of hosts?)
- Secure isolation: Instantiation of a Virtual LUN (VLUN) so it is only accessible within an administrator-defined VSAN or zone.
- Provisioning: Just-in-case provisioning (the problem) is replaced by just-in-time provisioning (the solution).
- Different classes of storage for different purposes.
- Central storage: A central tool to manage all storage.
The Storage Services Module (SSM)

- Fully distributed architecture provides huge aggregate performance
- Embedded ASICs: multiple CPPs (Control Path Processors) + DPPs (Data Path Processors)
- In-line SCSI processing up to 1.2 million IOPS
- Integrated 32 Fibre Channel ports, 1/2 Gbps
- Multiple paths from hosts through the virtualization engine down to the physical storage
- Remote mirroring in case of local disaster

SSM applications:
- FC-WA (Fibre Channel Write Acceleration)
- FAIS (Fabric Application Industry Standard)
- NASB (Network-Assisted Serverless Backup)
- SANTap protocol

[Slide: SSM block diagram showing the interface (I/F), microprocessor (uP), queues (Q), and processing engine blocks.]
The Storage Services Module (SSM)
The SSM is an intelligent module that not only contains 32 1/2-Gbps FC ports but also multiple Control Path Processors (CPPs) and Data Path Processors (DPPs) used for hosting or assisting storage applications provided by a number of different partners. Cisco is working with best-of-breed partners to achieve optimized hardware for a leveraged solution. The SSM provides support for a number of storage features:

FC-WA (Fibre Channel Write Acceleration) enhances the performance of write operations over long distances, e.g. array replication.

FAIS (Fabric Application Industry Standard) is a standards-based protocol used by external virtualization devices to communicate with the SSM through an open API (Application Programming Interface).

NASB (Network-Assisted Serverless Backup) is used with supporting backup software to move the data-mover function into the network and thereby reduce the CPU load on the application server or media server.

The SANTap protocol is used by a number of storage partners with external storage appliances to communicate with the SSM.
Network-Based Virtualization Techniques

Network-Hosted: partner software resides on the MDS (SSM Storage Services Module)
Network-Assisted: partner software resides on arrays, an external server, or an appliance
Network-Accelerated: partner software is accelerated by a Cisco engine or agent (e.g. Cisco X-Copy)

Potential network virtualization applications:
- Heterogeneous volume management
- Asynchronous replication
- Data migration
- Serverless backup
- Heterogeneous replication / copy services
- FC Write Acceleration
- Continuous Data Protection (CDP)
Network-Based Virtualization Techniques Three types of Network-Based storage virtualization techniques implemented by the MDS SSM module are Network-Hosted, Network-Assisted and Network-Accelerated.
The Network-hosted technique is implemented by installing Cisco partner software on the SSM module in the MDS. The “Network” device is hosting the software which performs the Virtualization function for the application.
The Network-Assisted technique is implemented by installing Cisco partner software on a separate appliance or external server. In this technique, the “Network” device is assisting the software which performs the Virtualization function for the application.
The Network-Accelerated technique uses a function on the Cisco SSM to accelerate the Partner application. Serverless Backup is a typical Network Application that is “Accelerated” by the X-copy function running on the MDS SSM. Another function is Fibre Channel Write Acceleration.
Network-Hosted Applications

Network-Hosted Services
MDS 9000 Storage Services Module (SSM):
- ASIC-based innovation
- Open, standards-based platform
- Hosts multiple partner applications

Network-Hosted (FAIS-based open API, T11): volume management, data migration, copy services (e.g. EMC Invista)
Network-Assisted (SANTap Protocol): heterogeneous storage replication, continuous log-based data protection, online data migration, storage performance/SLA monitoring
Network-Accelerated (standard FC protocols): serverless backup, FC Write Acceleration, synchronous acceleration
With the SSM, Cisco introduced an open, standards based platform for enabling intelligent fabric applications.
SSM hardware: Dual function module with 32 FC ports with embedded Virtualization Engine processors
Purpose-built ASICs optimize the virtualization functions, providing high performance with a highly available, scalable, and fully distributed architecture
Any-to-any virtualization (no need to connect hosts or storage directly into one of the FC ports)
Multiple best of breed partners for flexibility and investment protection
There are four key customer benefits of this intelligent fabric applications platform:
First, it is an open, standards-based solution for enabling multiple partner applications
Second, it provides feature velocity by reducing the development cycle
Third, it has a modular-software architecture for running multiple applications simultaneously
Finally, it provides investment protection by delivering real-world applications today with flexibility to enable advanced functions using software
Cisco and EMC Virtualization Example

[Slide: heterogeneous servers (Unix and Windows) connect through the Cisco MDS 9000 Series to EMC/IBM/HP/Hitachi storage arrays.]

Cisco provides:
- Scalable hardware, high performance, embedded diagnostics
- Multiprotocol platform for intelligent services
- VSAN scaling features
- Data Path Cluster (DPC)
- Cisco Storage Services Module (SSM)
- Cisco Storage Services Enabler license
- Cisco MDS 9000 Family of Fibre Channel switches

EMC provides:
- EMC Invista software
- Network-based volume management (creation, presentation and management of volumes)
- Online data mobility
- Heterogeneous clones and point-in-time copies
- Control Path Cluster (CPC) on CX700
- Cabinet
- Meta-storage
Cisco and EMC Virtualization Example
Cisco has partnered with major storage software vendors to enable disk virtualization on dedicated hardware in the MDS 9000. By providing dedicated hardware in the network to perform virtualization, these solutions can deliver greater performance and resilience. EMC's implementation is pictured here.
Cisco and EMC Virtualization Example (Cont.)

Control Processor:
- Volume management
- Data migration
- Copy services
- Independent control path
- Programs the data path
- Processes exceptions

Data Path (on the SSM): virtual-to-physical mapping; the control processor communicates with the SSM via FAIS. Data traffic flows through the data path, while control traffic goes to the control processor.

Cisco MDS benefits:
- High-performance fast path
- Fully distributed intelligence
- Integrated, HA architecture
- Multiprotocol integration
- Comprehensive security
- Troubleshooting and diagnostics
Invista requires two components: intelligent switches from vendors such as Cisco, Brocade, and McDATA, along with a separate appliance, or a set of them. This set, known as the Control Path Cluster (CPC), builds what amounts to routing tables and maintains metadata. The tables are used by the intelligent switches to rewrite FC and SCSI addresses at wire speed. That capability makes the architecture highly scalable, but more complex as well. EMC Invista is installed on an external Control Path Cluster (a CX700), providing:
Volume Management
Data Migration
Copy Services
EMC Invista manages the control path in the Control Processor, while data flows directly between host and storage through the Data Path Processors located on the SSM module in the Cisco MDS. The benefits of performing virtualization on the SSM module are:
Fully integrated into the fast high bandwidth redundant crossbar
High availability and redundancy
Minimal latency and high throughput
Comprehensive centralized security
Providing a centralized solution that is easier to manage.
Fabric Application Industry Standard (FAIS)
FAIS is an ANSI T11 standards-based effort to create a common application programming interface (API) for fabric applications to run on an underlying hardware platform. The objective of FAIS is to help developers move storage and data management applications off appliances, hosts, and storage arrays and onto intelligent storage fabric-based platforms. FAIS was co-authored by Cisco. It is pronounced "face."
Network-Assisted Applications

Network-Assisted Services
MDS 9000 Storage Services Module (SSM):
- ASIC-based innovation
- Open, standards-based platform
- Hosts multiple partner applications

Network-Hosted (FAIS-based open API, T11): volume management, data migration, copy services (e.g. EMC Invista)
Network-Assisted (SANTap Protocol): heterogeneous storage replication, continuous log-based data protection, online data migration, storage performance/SLA monitoring
Network-Accelerated (standard FC protocols): serverless backup, Write Acceleration, synchronous replication
Intelligent storage services are also provided on the SSM module by a large number of storage partners. Each network-based appliance communicates with the SSM module through the SANTap protocol. Network-Assisted applications include:
Heterogeneous Storage Replication
Continuous Data Protection
Data Migration
Storage Performance Monitoring
Out-of-Band Appliances

Advantages:
- The appliance is not in the primary I/O path

Limitations:
- The appliance requires host-based software agents, consuming CPU, memory and I/O
- The appliance adds latency to the initial I/O request
- Potentially compromises I/O performance by issuing twice as many I/Os
- Limited interoperability with other appliances or disk array features

[Slide: hosts with software agents; the host-based agent intercepts the I/O command and sends it to the appliance, and the host sends the I/O command to the target once the appliance has acknowledged it.]
Out-of-Band Appliances When a separate storage appliance is connected to the network, it has one prime advantage, in that the appliance is not in the main data path and so is not perceived as a bottleneck. The limitations of this approach are many:
Each host must have a software agent installed on the server to intercept I/O requests and redirect them to the appliance. If the administrator fails to install the agent on every server, then that server will attempt to communicate directly with its storage instead of via the appliance, possibly leading to data loss.
Each intercepted I/O request must be directed to an appliance that is usually connected on the LAN and therefore adds latency for every I/O operation
When the appliance is connected in-band over Fibre Channel, this results in additional I/O traffic across the FC SAN.
Every solution is proprietary and not defined by standards, so each appliance cannot interoperate with another.
In-Band Appliances

The appliance is in the primary data path.

Advantages:
- Does not require host agents

Limitations:
- Disruptive insertion of the appliance into the data path
- Potential performance bottleneck because all frames flow through the appliance
- The appliance adds latency to all frames
- Limited interoperability with other appliances or disk array features
- The appliance can be a single point of failure

[Slide: the host sends all I/O to the appliance; the appliance intercepts the I/O and sends it to the target.]
In-Band Appliances When a separate external storage appliance is connected in-band the advantage is that host based software agents are no longer required. However, there are several limitations to this approach:
The appliance cannot be added to the SAN without causing massive disruption.
All data between each of the servers and the storage must now pass through the appliance adding latency to every frame and becoming a potential bottleneck in a busy SAN.
The appliance becomes a Virtual Target for all SCSI based communication. It receives all SCSI I/O and sends it to the appropriate storage devices by creating a Virtual Initiator.
The appliance can become a single point of failure, although most solutions offered today are clustered.
Every solution is proprietary and not defined by standards, therefore each appliance cannot interoperate with other appliances.
Network-Assisted Solutions Using SANTap
• SANTap is a protocol between an MDS switch and a storage appliance
• SANTap sends a copy of the I/O transparently to a storage appliance without impacting the primary data path
• All SANTap communication is based upon industry-standard SCSI commands
Advantages:
• Eliminates disruption of adding an appliance to the SAN
• Eliminates the need for host agents
• Appliance is not in the primary I/O path
• No added latency, no bottleneck
• Enables on-demand storage services
(Slide diagram: the host issues the I/O command to the target as usual; SANTap sends a copy of the I/O to the appliance with no disruption to the primary I/O.)
Network-Assisted Solutions Using SANTap
SANTap is a protocol that is used to pass data between an MDS switch and a storage appliance. SANTap sends a copy of the FC frame containing the SCSI I/O transparently to a separate storage appliance. The original FC frame containing the SCSI payload is sent directly to its target with no additional latency or disruption. SANTap enables storage application appliances without impacting the primary I/O:
The integrity, availability and performance of the Primary I/O is maintained
Seamless insertion and provisioning of appliance based storage applications
Storage service can be added to any server/storage device in the network without any rewiring
Incremental model to deploy appliance based applications, easy to revert back to original configuration
No disruption of the Primary I/O from the server to the storage array (viz. integrity, availability & performance preserved)
Addresses the Scalability Issue for appliance based storage applications
Investment protection
Storage applications enabled by SANTap include:
Heterogeneous storage replication
Continuous log-based data protection
Online data migration
Storage performance/SLA monitoring
SANTap-Based Fabric Applications
Partner applications delivered over SANTap include:
• Heterogeneous asynchronous replication over extended distances with advanced data compression functionality
• Disk-based Continuous Data Protection (CDP) for zero-backup windows with the ability to restore to any point in time
• Heterogeneous asynchronous replication over extended distances with data consistency
• Heterogeneous asynchronous replication and CDP (two partner offerings)
• Heterogeneous asynchronous replication
SANTap-Based Fabric Applications Cisco is working through several storage partners to provide intelligent storage applications through externally connected appliances.
Network-Accelerated Applications
MDS 9000 Storage Services Module (SSM):
• ASIC-based innovation
• Open, standards-based platform
• Hosts multiple partner applications
The SSM supports three classes of intelligent services:
• Network-Hosted (FAIS-based open API, T11): volume management, data migration, copy services (for example, EMC Invista)
• Network-Assisted (SANTap protocol): heterogeneous storage replication, continuous log-based data protection, online data migration, storage performance/SLA monitoring
• Network-Accelerated (standard FC protocols): serverless backup, FC Write Acceleration, synchronous replication
The SSM module provides a number of network-accelerated intelligent services that enhance the standard Fibre Channel protocols. These are:
Network-Assisted Serverless Backup (NASB)
Fibre Channel Write Acceleration (FC-WA)
Network-based synchronous replication
SSM and NASB: Serverless Backup
• Today, backup is performed by media servers using LAN-based, LAN-free, or server-free models
• With Network-Accelerated Serverless Backup, the MDS (with SSM) moves data directly from disk to tape instead of the media servers
Comparison of backup models (from the slide table):
• LAN-based: data moved over the LAN; the application server moves the data
• LAN-free: data moved over the SAN; the application server moves the data
• Server-free: data moved over the SAN; the application server is not in the data path; a dedicated media server moves the data
• Serverless backup: data moved over the SAN; the application server is not in the data path; the fabric moves the data
(Slide diagrams: application servers, media servers, a disk array, and tape attached to the SAN; in the serverless model the SSM in the MDS performs the data movement.)
SSM and NASB
Instead of expensive dedicated media servers performing the function of backing up data from storage to tape, the SSM provides the media server function using standards-based SCSI 3rd Party Copy.
Customer benefits:
• Lower TCO
  Offload I/O and CPU work from media servers to the SSM
  Reduce server administration and management tasks
• Higher performance and reliability
  Each SSM delivers up to 16 Gbps of throughput
  The SSM is integrated in a highly available MDS platform
• Investment protection
  No changes to the existing backup environment
  SSM data movement can be enabled with software
Network-Accelerated Serverless Backup Development Partners
• CA BrightStor
• CommVault Systems Galaxy Backup and Recovery
• EMC Legato NetWorker
• VERITAS NetBackup
• IBM Tivoli Storage Manager
Cisco is working with five vendors who are all at different stages in qualifying their backup solution with the MDS 9000 network-accelerated serverless backup solution.
Fibre Channel Write Acceleration
• The performance of DR/BC applications is inhibited by distance across the WAN
• Latency degrades with greater distance
• Only write I/Os are affected
• FC-WA improves write performance over the WAN
• Databases are very sensitive to latency
• FC-based replication: how far can it go?
(Slide graphs: disk I/O service time, the round-trip response, increases with latency, and IOPS and application performance diminish with distance, down to a minimum tolerable performance level at the maximum tolerable distance. The accompanying sequence diagram shows the FCP_WRITE, XFER_RDY, FCP_DATA, and FCP_RSP exchange between initiator and target, with Write Acceleration removing one WAN round trip from the service time.)
SCSI standards define the way a SCSI Initiator shall communicate with a SCSI Target. This consists of four phases:
Initiator sends SCSI Write Command to SCSI Target LUN containing a CDB with the command, LBA and Block count
When the SCSI Target is ready to receive data it responds with Xfer Ready
When the SCSI Initiator receives Xfer Ready it starts sending data to the SCSI Target
Finally, when the SCSI Target has received all the data, it returns a Response or Status to the SCSI Initiator
This constitutes two round-trip journeys between the SCSI Initiator and SCSI Target: Command and Xfer Ready, then Data and Response.
In a data centre environment, distances are short, so the round-trip time is low and latency is minimal. In a WAN environment, where distances are much longer, the SCSI Initiator cannot send data until it receives Xfer Ready after the first round-trip journey. As distances increase, this considerably impacts write performance. Fibre Channel Write Acceleration spoofs Xfer Ready in the MDS switch. When the original SCSI Command is sent by the SCSI Initiator through the MDS switch to the SCSI Target, the MDS responds immediately with an Xfer Ready. The SCSI Initiator can then immediately send data to the SCSI Target instead of waiting for the true Xfer Ready to be received. Meanwhile the SCSI Command is received by the Target, which responds with the real Xfer Ready. When
the target-side MDS switch receives the Data, it passes the data on to the Target. Finally, the SCSI Target sends a Response or Status back to the Initiator in the normal way. In a typical environment, several SCSI operations are taking place between the SCSI Initiator and SCSI Target simultaneously, so these operations are interleaved, maximizing performance and minimizing latency.
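As a rough worked example (assuming the commonly used figure of about 5 microseconds of one-way propagation delay per kilometre of fibre), a 125 km link adds roughly 0.625 ms in each direction, or about 1.25 ms per round trip. A SCSI write that needs two round trips therefore spends about 2.5 ms waiting on the WAN, whereas with FC-WA spoofing the Xfer Ready locally only one round trip (about 1.25 ms) is exposed to the WAN, before any target-side processing time is added.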
SSM and Network-Accelerated FC-WA
• Fibre Channel Write Acceleration extends distances for DR/BC applications
• Reduces round-trip delays between the primary data center and the DR data center
• Optimizes bandwidth for DR
• Increases distance between the primary site and the remote site
• Minimizes application latency
• Investment protection: transport agnostic (DWDM, CWDM, SONET/SDH, dark fiber)
• Primary application: synchronous replication
• Up to 30% performance improvement seen by a major financial services company over a 125 km distance
(Slide diagram: SSMs in the MDS switches at the primary and DR data centers provide FC-WA across the link between sites.)
SSM and Network-Accelerated FC-WA
The primary application for FC-WA is synchronous replication of data between storage arrays in the main and remote data centres. Tests have shown that over a 125 km distance there is up to a 30% improvement in write performance.
Evolution to a Multilayer Storage Utility Model
• Phase 0: Isolated SANs and midrange DAS. Homogeneous "SAN islands" (for example separate Engineering, ERP, and HR SANs), with midrange applications (e.g., Microsoft) on direct-attached storage.
• Phase 1: High-end and midrange consolidation onto a multilayer storage network with pooled disk and tape.
• Phase 2: Network-hosted storage applications delivered by a multilayer storage utility.
(Slide capabilities shown across the phases include storage virtualization, multiprotocol support, VSANs, high availability, security, WAN/FCIP, QoS, scalability, dynamic provisioning, LAN-free backup, data mobility, management storage classes, and pooled disk and tape.)
The Multilayer Storage Utility
This slide describes Cisco's vision of storage networking evolution from the homogeneous SAN island model to the Multilayer Storage Utility. Historically, storage networks have been built in physically isolated islands (Phase 0) to address several technical and non-technical issues of older storage networking technologies, such as:
Maintain isolation from fabric events or configuration errors
Provide isolated management of island infrastructures
Driven by bad experiences of large multi-switch fabrics
This model is also associated with very high costs and a high level of complexity.
To help customers overcome the limitations of building homogeneous "SAN islands", Cisco has delivered new storage networking technologies and services aimed at enabling IT organizations to consolidate heterogeneous disk, tape, and hosts onto a common storage networking infrastructure (Phase 1). By introducing new intelligent storage network services, Cisco enables customers to scale storage networks beyond today's limitations while delivering the utmost in security and resiliency. An innovative infrastructure virtualization service called Virtual SANs (VSANs) alleviates the need to build isolated SAN islands by replicating such isolation virtually within a common, cost-optimized physical infrastructure.
The intelligent Multilayer Storage Utility (Phase 2) involves leveraging Cisco Multilayer Storage Networking as the base platform for delivering next-generation storage services. With the intelligent multilayer storage utility, the storage network is viewed as a system of distributed intelligent network components unified through a common API to deliver a platform for network-based storage services.
Network-based storage services offer several attractive opportunities for further cost optimization of the storage infrastructure. To achieve the Multilayer Storage Utility, Cisco is partnering with industry leaders as well as with the most promising start-up companies to offer complete solutions to customers. Network-hosted storage products from EMC, VERITAS, and IBM, as well as SANTap solutions developed in partnership with companies like Topio, Kashya, or Alacritus, are excellent examples of how Cisco is delivering in this space.
Lesson 7
Optimizing Performance
Overview
In this lesson, you will learn how to design high-performance SAN fabrics using FSPF traffic management, load balancing, Virtual Output Queues, Fibre Channel Congestion Control, and Quality of Service.
Objectives Upon completing this lesson, you will be able to engineer SAN traffic on an MDS 9000 fabric. This includes being able to meet these objectives:
Define oversubscription and blocking
Explain how Virtual Output Queues solve the head-of-line blocking problem
Explain how the MDS 9000 handles fabric congestion
Explain how QoS is implemented in an MDS 9000 fabric
Explain how port tracking mitigates performance issues due to failed links
Explain how to configure traffic load balancing on an MDS 9000 SAN fabric
Describe the MDS 9000 tools that simplify SAN performance management
Oversubscription and Blocking
• Oversubscription helps determine fabric design
• Blocking is avoidable with proper design and switch hardware
• Cisco MDS switches are completely non-blocking
(Slide diagram: Host A and Host B attach through 2 Gbps ports; congestion on the link to Array D causes head-of-line blocking of traffic from Host A to Array C, while a group of hosts fanned in to a single storage port illustrates 5:1 oversubscription.)
It is important to fully understand two fundamental SAN design concepts: oversubscription and blocking. Although these terms are often used interchangeably, they relate to very different concepts. Oversubscription and blocking considerations are critical when designing a fabric topology.
Oversubscription is a normal part of any SAN design and is essentially required to help reduce the cost of the SAN infrastructure. Oversubscription refers to the fan-in ratio of available resources, such as ISL bandwidth or disk array I/O capacity, to the consumers of the resource. For example, many SAN designs have inherent design oversubscription as high as 12:1 hosts-to-storage, as recommended by disk subsystem vendors. A general rule of thumb relates oversubscription to the cost of the solution: the higher the oversubscription, the less costly the solution.
Blocking, often referred to as head-of-line (HOL) blocking, within a SAN describes a condition where congestion on one link negatively impacts the throughput on a different link. In this example, congestion on the link connecting Array D is negatively impacting the flow of traffic between Host A and Array C. This is discussed in more detail later in this section.
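As a simple worked example (the numbers are illustrative only, not from the course): ten hosts, each attached with a 2 Gbps HBA, sharing a single 4-link PortChannel of 2 Gbps links toward the core can offer up to 20 Gbps of ingress traffic against 8 Gbps of ISL capacity, giving an oversubscription ratio of 20:8, or 2.5:1.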
Cisco MDS Performance Advantages
• Completely non-blocking
• Maximum throughput
• Consistent latency
• All ports and flows are serviced evenly
• Quality of Service
Non-Blocking Switch Architecture
A switch is said to be blocking when inefficient hardware design causes ingress traffic to be blocked or stalled due to preceding traffic that is destined to slower or congested receivers. Blocking represents a condition where ingress and egress bandwidth capacity exist, but the switch is unable to forward at the desired rate because of hardware or queuing inefficiencies. Through the use of a technology called Virtual Output Queuing (VOQ), this problem has been overcome, and none of the Cisco MDS 9000 Family of switches suffers from this blocking effect. The MDS 9000 platform is also the only Fibre Channel switching platform today to support Quality of Service (QoS).
Miercom Performance Validation
• "Overall, the Cisco MDS 9509 proved to be a non-blocking, fully redundant architecture with excellent performance in every topology used."
• "…excellent performance regardless of frame size…98.67% line rate for small frames, full line rate with large frames…"
• "Regardless of the load, minimum latency for both frame sizes was very consistent."
• "Most impressive was the ability to sustain traffic at a much higher throughput rate than other switches…" – Miercom
Consistent Latency
The MDS 9509 exhibited excellent throughput and latency for all frame sizes in this test. […] Regardless of the load, the minimum latency for both frame sizes was very consistent. For small frames, it varied from 7.2 to 52.9 µs under 100% intended load. For large frames, the latency ranged from 19.7 to 218.9 µs under 100% intended load. […] Whenever the other switches tested receive frames at a rate exceeding their capability, their buffers fill and their latency increases dramatically.
Maximum Throughput
The MDS 9509 showed excellent performance regardless of frame size. It achieved near line rate with small frames (98.67%) and full line rate with large frames, both with 100% intended load. […] Furthermore, the MDS 9509 was able to sustain traffic at a much higher throughput rate for minimum- and maximum-sized frames while maintaining a more consistent latency than the other switches tested. More impressive was the distribution of the traffic flows, which varied by +/- 0.01 MB/s for small frames and +/- 0.005 MB/s for large frames.
Source: Performance Validation Test - Cisco MDS 9509, by Miercom at Spirent SmartLabs, Calabasas, California, December 2002. http://cisco.com/application/pdf/en/us/guest/products/ps4358/c1244/cdccont_0900aecd800cbd65.pdf
Virtual Output Queues
Head-of-Line Blocking
• A storage array connected to one switch port wants to transfer data to three servers
(Slide diagram: frames destined for servers A, B, and C arrive on the single ingress port and share one input buffer.)
Head-of-Line Blocking
The example illustrates a scenario where a storage array is connected to a switch via a single link. Without VOQ technology, traffic from the storage array destined for three servers, A, B, and C, flows into the switch and is placed into a single input buffer, as shown. Assuming the servers are capable of receiving data transfers from the storage array at a sufficient rate, there should not be any problems.
Head-of-Line Blocking (Cont.)
• A slowdown on one server prevents the others from receiving data frames: the classic head-of-line blocking scenario
(Slide diagram: frames for the slow server back up in the shared input buffer, blocking frames destined for the other servers.)
Should there be a problem or slowdown with one of the servers, the storage array may be prevented from sending data to the remaining servers as quickly as it is capable of doing. This is a classic HOL blocking condition.
Virtual Output Queues
• Solution: the MDS 9000 implements 1024 virtual output queues (VOQs) for every ingress port on each linecard (256 destination queues with 4 levels of QoS per queue)
• VOQs alleviate congestion and head-of-line blocking conditions
(Slide diagram: at the ingress port, frames are queued per destination rather than in a single shared buffer.)
Virtual Output Queues
The MDS 9000 Family switches implement a sophisticated VOQ mechanism to alleviate HOL blocking conditions. Virtual Output Queuing occurs at each ingress port on the switch. There are effectively 1024 Virtual Output Queues available for every ingress port, providing support for four levels of Quality of Service and up to 256 egress ports for every ingress port.
Intelligent Central Arbiter
• A sophisticated and fast central arbiter provides fairness and selection among VOQs
• The central arbiter schedules over 1 billion frames per second
(Slide diagram: the central arbiter grants frames from the ingress VOQs access to the egress ports.)
Intelligent Central Arbiter
The MDS 9000 implements an intelligent central arbiter to:
Monitor the input queues and the egress ports.
Provide for fairness.
Allow unobstructed flow of traffic destined for un-congested ports.
Absorb bursts of traffic.
Alleviate conditions that might lead to HOL blocking.
Without a central arbiter, there would be a potential to starve certain modules and ports. The central arbiter maintains the traffic flow - like a traffic cop. The Cisco MDS Arbiter can schedule frames at over 1 billion frames per second. (1 billion = 1000 million)
Fibre Channel Congestion Control
• FCC is a feature that detects and reacts to network congestion
• The network self-adapts intelligently to the specific congestion event
• Maximizes throughput and avoids head-of-line blocking
• Protocol customized for lossless FC networks
(Slide diagram: three MDS 9000 switches in a path; traffic congestion at Switch 3 triggers a congestion control message sent from Switch 3 to Switch 1.)
Fibre Channel Congestion Control (FCC) is used to gracefully alleviate congestion using intelligent feedback mechanisms within the fabric. FCC is a feature designed to throttle data at its source if the destination port is not responding correctly. It is a Cisco proprietary protocol that makes the network react to a congestion situation: the network adapts intelligently to the specific congestion event, maximizing throughput and avoiding head-of-line (HOL) blocking. The protocol has been customized for lossless networks such as Fibre Channel. FCC consists of the following three elements (a minimal enablement sketch follows the list):
Congestion Detection—Performed by analyzing the congestion of each output port in the switch.
Congestion Signaling—Performed with special packets called Edge Quench (EQ).
Congestion Control—Performed through rate limiting of the incoming traffic.
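FCC is switched on fabric-wide from global configuration mode. A minimal sketch, assuming SAN-OS syntax of this era (the prompt and hostname are illustrative):

    switch# config terminal
    switch(config)# fcc enable        ! enable Fibre Channel Congestion Control on this switch
    switch(config)# end
    switch# show fcc                  ! verify the FCC state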
FCC Detection, Signaling, and Rate Limiting
(Slide diagram: sender S1 transmits into the fabric at 1 Gbps toward receiver R1, which drains frames at only 50 Mbps, while S2 sends to R2 at 1 Gbps. Congestion is detected at Switch B, a congestion signal is sent back to Switch A, and Switch A rate-limits the incoming traffic from S1.)
FCC Detection, Signaling, and Limiting
In the scenario shown on this slide:
S1 is sending frames into the fabric at 1 Gbps
R1 is only receiving frames at 50 Mbps and does not drain FC frames fast enough
Congestion occurs at the egress port of R1 as the buffers start to fill up
As the buffers fill, frames are backed up to the previous buffer and congestion is detected at the ingress port of Switch B
A congestion signal is sent to Switch A, where the source of the traffic destined for the troubled receiver is attached, on the appropriate linecard
Traffic from S1 is then rate limited to a level that R1 can sustain, matching the rate of flow into and out of the switch.
The MDS 9000 switch monitors traffic from each host for congestion and FCC is activated when and if congestion is detected. The “quench on edge” message is sent out and the offending host traffic will be cut in half for each “quench” message received. There is no need for an “un-quench” message, because traffic usually builds back up slowly.
Quality of Service
QoS Design Considerations
How can I provide priority for critical storage traffic flows?
• Quality of Service avoids and manages network congestion and sets traffic priorities across the network
• Provides predictable response times
• Manages delay- and jitter-sensitive applications
• Controls loss during bursty congestion
(Slide diagram: iSCSI hosts on the IP LAN and WAN and Fibre Channel hosts on the FC SANs all contend for shared fabric resources.)
Quality of Service (QoS) includes mechanisms that support the classification, marking, and prioritization of network traffic. QoS concepts and technology were originally developed for IP networks. The MDS 9000 family of switches extends QoS capabilities into the storage networking domain, for IP SANs as well as Fibre Channel SANs. No other switch on the market today is capable of prioritizing Fibre Channel traffic.
Classification involves identifying and splitting traffic into different classes. Marking involves setting bits in a frame or packet to let other network devices know how to treat the traffic. Prioritization involves queuing strategies designed to avoid congestion and provide preferential treatment. In a storage network, examples of classification, marking, and prioritization schemes might include:
Classification – Classify all traffic in a particular VSAN, or all traffic bound for a particular destination FCID, or all traffic entering a particular FCIP tunnel
Marking – Set particular bits in the IP header or the EISL VSAN header
Prioritization – Utilize queuing strategies such as Deficit Weighted Round Robin (DWRR) or Class Based Weighted Fair Queuing (CBWFQ) to give preference based on certain markings.
QoS features enable networks to control and predictably service a variety of networked applications and traffic types. The goal of QoS is to provide better and more predictable network service by providing dedicated bandwidth, controlled jitter and latency, and improved loss characteristics.
Applications deployed over storage networks increasingly require quality, reliability, and timeliness assurances. In particular, applications that use voice, video streams, or multimedia must be carefully managed within the network to preserve their integrity. QoS technologies allow IT managers and network managers to:
Predict response times for end-to-end network services
Manage jitter-sensitive applications, such as audio and video playbacks
Manage delay-sensitive traffic, such as real-time voice
Control loss in times of inevitable bursty congestion
Set traffic priorities across the network
Support dedicated bandwidth
Avoid and manage network congestion.
Managing QoS becomes increasingly difficult because many applications deliver unpredictable bursts of traffic. For example, usage patterns for web, e-mail, and file transfer applications are virtually impossible to predict, yet network managers need to be able to support mission-critical applications even during peak periods.
QoS for Fibre Channel
• Three priority queues for data traffic; absolute priority for control traffic
• Flows classified based on input interface, destination device alias, source/destination FCID, or pWWN
• QoS only functions during periods of congestion
• FCC must be enabled
• Follows the Differentiated Services (DiffServ) model defined in RFCs 2474 and 2475
(Slide diagram: during congestion on a shared link, a high-priority flow is serviced ahead of a low-priority flow.)
QoS for Fibre Channel
MDS 9000 Family switches support QoS for Fibre Channel through the classification and prioritization of FC control traffic and data traffic. The QoS implementation in the Cisco MDS 9000 Family follows the Differentiated Services (DiffServ) model. The DiffServ standard is defined in RFCs 2474 and 2475. Data traffic can be prioritized in three distinct levels of service differentiation: low, medium, or high, while control traffic is given absolute priority. You can apply QoS to ensure that FC data traffic for latency-sensitive applications receives higher priority than traffic for throughput-intensive applications such as data warehousing. Flows are classified based on one or more of the following attributes:
Input interface
Source FCID
Destination FCID
Source pWWN
Destination pWWN
QoS only functions during periods of congestion. To achieve the greatest benefit, QoS requires that FCC be enabled, and requires two or more switches in the path between the initiators and targets. Data traffic QoS for Fibre Channel is not enabled by default, and requires the Enterprise Package license. However, absolute priority for control traffic is included in the base SAN-OS license, and is enabled by default.
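As a minimal configuration sketch (the class name, policy name, pWWN, and VSAN number below are illustrative, not taken from the course), data traffic destined for a particular storage port can be classified and given high priority like this:

    switch(config)# qos enable
    switch(config)# qos class-map LATENCY-SENSITIVE match-any
    switch(config-cmap)# match destination-wwn 50:06:0e:80:03:4e:95:13    ! example target pWWN only
    switch(config-cmap)# exit
    switch(config)# qos policy-map PRIORITIZE-OLTP
    switch(config-pmap)# class LATENCY-SENSITIVE
    switch(config-pmap-c)# priority high
    switch(config-pmap-c)# exit
    switch(config-pmap)# exit
    switch(config)# qos service policy PRIORITIZE-OLTP vsan 10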
QoS for Fibre Channel (continued)
(Slide diagram: an OLTP server, whose traffic is low throughput, bursty, random, and latency sensitive, and a backup server, whose traffic is high throughput, sequential, streaming, and not latency sensitive, share a congested path to the same disk. At the ingress VOQs, control traffic receives absolute priority scheduling, while data traffic classified by a class map is placed into high, medium, and low priority DWRR queues, weighted in the example at 50%, 30%, and 20%, before reaching the transmit queue.)
Traffic classification can be based on:
• Source or destination pWWN
• Source or destination FCID
• Source interface
• Destination device alias
Class maps are mapped to DSCP values 0-63 (46 is reserved) or to a policy map.
Transaction processing, a low-volume, latency-sensitive application, requires quick access to requested information. Backup processing requires high bandwidth but is not sensitive to latency. In a network that does not support service differentiation, all traffic is treated identically: it experiences similar latency and gets similar bandwidth.
With the QoS capability of the MDS 9000 platform, data traffic can be prioritized in three distinct levels of service differentiation (low, medium, or high), while control traffic is given absolute priority. You can apply QoS to ensure that FC data traffic for latency-sensitive applications receives higher priority over traffic for throughput-intensive applications like data warehousing.
In the example, the Online Transaction Processing (OLTP) traffic arriving at the switch is marked with a high priority level through classification (class map) and marking (policy map). Similarly, the backup traffic is marked with a low priority level. The traffic is sent to the corresponding priority queue within a Virtual Output Queue (VOQ). A Deficit Weighted Round Robin (DWRR) scheduler configured in the first switch ensures that high priority traffic is treated better than low priority traffic. For example, DWRR weights of 60:30:10 imply that the high priority queue is serviced at six times the rate of the low priority queue. This guarantees lower delays and higher bandwidth to high priority traffic if congestion sets in. A similar configuration in the second switch ensures the same traffic treatment in the other direction.
If the ISL is congested when the OLTP server sends a request, the request is queued in the high priority queue and is serviced almost immediately, because the high priority queue is not congested. The scheduler assigns it priority over the backup traffic in the low priority queue. Note that the absolute priority queue always gets serviced first; there is no weighted round robin for it.
Traffic Classification and Queuing
The MDS 9000 supports one absolute priority queue and three DWRR queues. By default, control traffic is placed in the priority queue. When the priority queue is empty, the scheduler checks the DWRR queues. DWRR supports the weighted fair distribution of bandwidth when servicing queues that contain variable-length packets. DWRR transmits from the higher priority queues without starving the lower priority queues by keeping track of lower priority queue under-transmission and compensating in the next round.
In the classic DWRR algorithm, the scheduler visits each non-empty queue and determines the number of bytes in the packet at the head of the queue. The deficit counter is incremented by the value of the quantum. If the size of the packet at the head of the queue is greater than the deficit counter, the scheduler moves on to service the next queue. If the size of the packet at the head of the queue is less than or equal to the deficit counter, the deficit counter is reduced by the number of bytes in the packet and the packet is transmitted on the output port. The scheduler continues to dequeue packets until either the size of the packet at the head of the queue is greater than the deficit counter or the queue is empty. If the queue is empty, the value of the deficit counter is set to zero. When this occurs, the scheduler moves on to service the next non-empty queue. In short, DWRR provides preferential, or "weighted", round robin scheduling without starving the other queues.
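The relative service rates of the three data queues can be tuned from global configuration mode. A brief sketch, assuming the SAN-OS dwrr-q syntax (the 50/30/20 weights simply mirror the earlier figure and are not a recommendation):

    switch(config)# qos dwrr-q high weight 50      ! service the high queue most often
    switch(config)# qos dwrr-q medium weight 30
    switch(config)# qos dwrr-q low weight 20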
Zone-Based QoS
• Zone-based QoS complements the standard QoS data-traffic classification by WWN or FCID
• Zone-based QoS helps simplify configuration and administration by using the familiar zoning concept
• QoS parameters are distributed as a zone attribute
(Slide diagram: the members of Zone A carry the high-priority flow and the members of Zone B the low-priority flow across a congested link.)
Zone-Based QoS
With zone-based QoS, QoS parameters are distributed as a zone attribute. This simplifies the administration of QoS by providing the ability to classify and prioritize traffic by zone, instead of by initiator-target pair. Zone-based QoS is supported in both Basic and Enhanced zoning mode. QoS parameters are distributed as vendor-specific attributes. Zone-based QoS cannot be combined with flow-based QoS. A minimal configuration sketch follows the notes below.
Note: Zone-based QoS is a licensed feature; it requires the Enterprise Package.
Note: Zone-based QoS may cause traffic disruption upon a zone QoS configuration change (and activation) if in-order delivery is enabled.
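A brief sketch of attaching a QoS priority to a zone (the zone name, members, zone set name, and VSAN are illustrative; the attribute syntax is an assumption based on the SAN-OS zone configuration mode):

    switch(config)# zone name OLTP_Zone vsan 10
    switch(config-zone)# attribute qos priority high        ! distribute the priority as a zone attribute
    switch(config-zone)# member pwwn 21:00:00:e0:8b:05:74:a7
    switch(config-zone)# member pwwn 50:06:0e:80:03:4e:95:13
    switch(config-zone)# exit
    switch(config)# zoneset activate name ZS_Production vsan 10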
Port Rate Limiting
• Rate-limiting of ingress traffic: match the ingress flow to the available bandwidth
• Prevent devices from flooding the SAN
• Limit traffic contending for WAN links
• Limit traffic on oversubscribed interfaces
• Configured as a percentage of available ingress bandwidth
• The QoS feature must be enabled
• Supported on MDS 9100 Series switches, the MDS 9216i, and the MPS 14+2
(Slide diagram: input to fc1/1 on the MDS is limited to a maximum of 50%.)
Port Rate Limiting
A port rate limiting feature is available on 2nd-generation modules, e.g. the MPS 14+2, MDS 9216i, and MDS 9100 series switches, with SAN-OS 1.3 or higher. This feature helps control the bandwidth for individual FC ports. Rate limiting can be useful in the following situations:
Prevent malicious or malfunctioning devices from flooding the SAN.
Limit traffic contending for WAN links, e.g., storage replication ports.
Limit ingress traffic on over-subscribed mode interfaces.
Port rate limiting is also referred to as ingress rate limiting because it controls ingress traffic into an FC port. The feature controls traffic flow by limiting the number of frames that are transmitted out of the exit point on the MAC. Port rate limiting works on all Fibre Channel ports. Note: Port rate limiting can be configured only on Cisco MDS 9100 Series switches, MDS 9216i and MPS 14+2. This command can be configured only if the following conditions are true:
The QoS feature is enabled using the qos enable command.
The command is issued in a 2nd generation Cisco MDS 9216i or 9100 series switch.
The rate limit ranges from 1 to 100% and the default is 100%. To configure the port rate limiting value, use the switchport ingress-rate interface configuration command.
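A minimal sketch of the command named above (the interface number and percentage are illustrative):

    switch(config)# qos enable
    switch(config)# interface fc1/1
    switch(config-if)# switchport ingress-rate 50    ! limit ingress on fc1/1 to 50% of its bandwidth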
QoS in Single-Switch Fabrics
• QoS scheduling occurs in the VOQ of the ingress port
• Effective when multiple flows on one ingress "port" contend for the same egress port
• Can improve latency and/or bandwidth of higher-priority flows
(Slide diagram: three single-switch examples: OSM-attached hosts sending to a disk, JBOD disks sending to a host, and a host sending to JBOD disks.)
QoS Designs
Fibre Channel QoS is effective for some configurations, but not for others. To understand why, it is important to realize that the QoS scheduler operates within the Virtual Output Queue (VOQ) of an ingress port. Because QoS scheduling occurs at the ingress port, for QoS to be effective it is important that all competing traffic enter the switch through the same ingress port, somewhere before the common point of congestion. The diagram illustrates three configurations where FC QoS might be beneficial in a single-switch design:
Multiple devices attached to the same quad on a host optimized 32-port FC line card. In this configuration, the group of four “over subscribed mode” (OSM) ports is serviced by the same QoS scheduler, so internally they appear to be connected to the same ingress “port”. The common point of congestion would be the storage port.
A multi-disk JBOD attached to an FL port, sending data to a host on the same switch. In this configuration, there are multiple devices on the same ingress port, each with a unique FCID and pWWN that can be used for QoS classification. The common point of congestion would be the host, if for example we had a 2 Gbps JBOD and a 1 Gbps host HBA.
A host sending multiple flows, each of which enters on a common ingress port and is destined for a distinct FCID or pWWN within the JBOD. In this configuration, QoS can improve the latency of a higher priority flow, but cannot improve its bandwidth because the host is not QoS aware, so all of the flows get an equal share of the bandwidth regardless of the DWRR weights.
Because the QoS scheduler operates within the VOQ of the ingress ports, Fibre Channel QoS is not beneficial in all configurations. Two configurations where FC QoS is not effective in the current MDS 9000 QoS implementation are:
Multiple devices attached to "full rate mode" (FRM) ports on a 16-port FC line card contending for the same egress port. In this configuration, where two hosts are both sending data to the storage array, there would be no benefit to giving one host a higher QoS priority than the other, because the central arbiter already provides fairness to the two ingress ports that are contending for the common egress port.
Multiple devices with a common ingress port (the ISL on the rightmost switch) but multiple egress ports. QoS would not provide a benefit; however, FCC will still alleviate congestion on the ISL.
QoS Designs for FCIP
• Traffic going into the FCIP tunnel is marked with a DSCP value
• Marked at the egress of the FCIP interface
• Separate markings for control and data
• The downstream IP network must implement and enforce a QoS policy based on the marking
• DSCP can be any value from 0 to 63; the default is 0
• DSCP 46 is reserved for Expedited Forwarding
(Slide diagram: FCIP tunnel 10 across the IP WAN carries FC traffic marked with DSCP 46.)
QoS Designs for FCIP
In a SAN extension environment utilizing FCIP, it is possible for high-priority storage traffic to be contending for the same WAN resources as other, lower priority storage and/or data traffic. In such situations, it may be desirable to implement QoS in order to provide a higher level of service for particular storage traffic flows.
Traffic flowing into an FCIP tunnel can be classified at the egress FCIP interface and marked with a DSCP value. By default, the IPS module creates two TCP connections for each FCIP link. One connection is used for data frames and the other is used only for Fibre Channel control frames (i.e., Class F switch-to-switch protocol frames). This arrangement is used to provide low latency for all control frames. The FCIP QoS feature specifies the DSCP value used to mark all IP packets, using the ToS field in the IP header. The control DSCP value applies to all FCIP frames in the control TCP connection, and the data DSCP value applies to all FCIP frames in the data connection. If the FCIP link has only one TCP connection, the data DSCP value is applied to all packets in that connection.
Once marked, it is then up to the devices in the downstream IP network to implement and enforce the QoS strategy. The MDS 9000 does not implement IP QoS for ingress FCIP frames, since it is an end device in the IP network.
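A brief sketch of marking control and data frames on an FCIP interface (the interface number and DSCP values are illustrative, and the exact qos keyword form under the FCIP interface is an assumption from memory of SAN-OS of this era):

    switch(config)# interface fcip 10
    switch(config-if)# qos control 48 data 26    ! DSCP 48 for Class F control frames, DSCP 26 for data frames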
End-To-End QoS
• Priority for critical storage traffic flows
• VSANs and high-density switches allow for a collapsed-core design
• FCC performs congestion detection, signaling, and congestion control
• Traffic engineering makes the collapsed-core design a feasible solution
• End-to-end QoS priority schemes can be designed to meet customer requirements
• Hosts are assigned to Virtual SANs; each Virtual SAN is allocated fabric resources, such as bandwidth, independently
• Each VSAN has its own routing process and associated metrics per link and can therefore make independent routing decisions
(Slide diagram: QoS for iSCSI hosts on the IP LAN, QoS for Fibre Channel with multi-path load balancing over VSAN trunks in the FC SAN, and QoS for FCIP over DWDM/SONET to a remote FC SAN.)
End-to-End QoS
The Cisco MDS 9000 Family introduces VSAN technology for hardware-based intelligent frame processing, and advanced traffic management features such as Fibre Channel Congestion Control (FCC) and fabric-wide Quality of Service (QoS), enabling the migration from SAN islands to collapsed-core and enterprise-wide storage networks. The MDS 9000 family of switches provides several tools that allow SAN administrators to engineer resource allocation and recovery behavior in a fabric. These tools can be used to provide preferential service to a group of hosts, or to utilize cost-effective wide-area bandwidth first and use an alternate path during a fabric fault.
VSANs provide a way to group traffic. VSANs can be selectively grafted or pruned from EISL trunks.
PortChannels support link aggregation to create virtual EISL trunks.
FSPF provides deterministic routing through the fabric. FSPF can be configured on a per-VSAN basis to select preferred and alternate paths.
With the MDS 9000 family of switches, QoS concepts and technology that were originally developed for IP networks have been extended into the storage networking domain, for IP SANs as well as Fibre Channel SANs. No other switch on the market is capable of prioritizing Fibre Channel traffic as effectively or comprehensively. Classification involves identifying and splitting traffic into different classes. Marking involves setting bits in a frame or packet to let other network devices know how to treat the traffic. Prioritization involves queuing strategies designed to avoid congestion and provide preferential treatment. The Cisco MDS 9000 Family enables the design of comprehensive end-to-end traffic priority schemes that satisfy customer requirements.
Port Tracking
• Unique to the MDS 9000
• Failure of link 1 (the tracked port) immediately brings down link 2 (the linked port)
• Triggers faster recovery where redundant links exist
• Tracked ports are continually monitored; tracked ports can be FC, PortChannel, Gigabit Ethernet, or FCIP interfaces
• Linked ports must be FC
• Failover software responds faster to a link failure: milliseconds versus tens of seconds
• Not dependent on TOVs or RSCNs
(Slide diagram: link 1, the tracked port, crosses the WAN/MAN; link 2, the linked port, connects the local device.)
Tracking and Redirecting Traffic
The Port Tracking feature is unique to the Cisco MDS 9000 Family of switches. This feature uses information about the operational state of one link (usually an ISL) to initiate a failure in another link (usually one that connects to an edge device). This process of converting the indirect failure into a direct failure triggers a faster recovery process where redundant links exist. When enabled, the port tracking feature brings down the configured links based on the failed link and forces the traffic to be redirected to another, redundant link.
Generally, hosts and storage arrays can instantly recover from a link failure on a link that is immediately connected to a switch (a direct link). However, recovering from an indirect link failure between switches in a local, WAN, or MAN fabric with a keep-alive mechanism is dependent on factors such as the timeout values (TOVs) and registered state change notification (RSCN) information. In tests with port tracking enabled, failover occurred in approximately 150 milliseconds, compared to more than 25 seconds without the port tracking feature enabled.
In the diagram, when the direct link (2) to the storage array fails, recovery can be immediate. However, when the WAN/MAN link (1) fails, recovery depends on TOVs, RSCNs, and other factors. The port tracking feature monitors and detects failures that cause topology changes and brings down the links connecting the attached devices. When you enable this feature and explicitly configure the linked ports, the SAN-OS software monitors the tracked ports and alters the operational state of the linked ports upon detecting a link state change.
Port tracking is a feature of SAN-OS 2.0. It is included in the base license package at no additional cost.
Port Tracking Terminology
A tracked port is a port whose operational state is continuously monitored. The operational state of the tracked port is used to alter the operational state of one or more linked ports. Fibre Channel, PortChannel, FCIP, and Gigabit Ethernet interfaces can be tracked. Generally, interfaces in E and TE port modes are tracked, although Fx ports can also be tracked. A linked port is a port whose operational state is altered based on the operational state of one or more tracked ports. Only a Fibre Channel port can be linked.
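A minimal port-tracking sketch, assuming the SAN-OS 2.x port-track syntax (the interface numbers are illustrative): fc1/2 is the linked port toward the local array, and port-channel 1 is the tracked WAN/MAN path.

    switch(config)# port-track enable
    switch(config)# interface fc1/2
    switch(config-if)# port-track interface port-channel 1    ! bring fc1/2 down if the tracked PortChannel fails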
Load Balancing
Configuring Logical Paths
How can I provide preferred paths for a subset of hosts and storage devices in my SAN?
• SAN traffic engineering uses a combination of features to provide preferred selection of logical paths:
• VSANs can be selectively grafted onto or pruned from EISL trunks
• FSPF can be configured on a per-VSAN basis to select preferred and alternate paths
• PortChannels provide link aggregation to yield virtual EISL trunks
(Slide diagram: two switches are connected by parallel EISL trunks between TE_Ports, with per-VSAN FSPF metrics of 100 and 50 assigned so that VSAN 10 and VSAN 20 each prefer a different trunk and use the other as backup; a 4-link (8 Gbps) PortChannel between E_Ports is configured as an EISL.)
Configuring Logical Paths
The MDS 9000 Series switches provide a number of features that can be used alone or in combination to classify and select logical paths, including:
VSAN allowed lists, which permit VSANs to be selectively added to or removed from EISL trunks.
FSPF link cost, which can be configured on a per-VSAN basis for the same physical link, providing preferred and alternate paths.
PortChannels, which provide link aggregation and thus logical paths that can be preferred for routing purposes.
The implementation of VSANs gives the SAN designer more control over the flow of traffic and its prioritization through the network. Using the VSAN capability, different VSANs can be prioritized and given access to specific paths within the fabric on a per-application basis. Using VSANs, traffic flows can be engineered to provide efficient usage of network bandwidth.
One level of traffic engineering allows the SAN designer to selectively enable or disable a particular VSAN from traversing any given common VSAN trunk (EISL), thereby creating a restricted topology for that VSAN. A second level of traffic engineering is derived from independent routing configurations per VSAN. The implementation of VSANs dictates that each configured VSAN support a separate set of fabric services. One such service is the FSPF routing protocol, which can be independently configured per VSAN. Therefore, within each VSAN topology, FSPF can be configured to provide a unique routing configuration and resultant traffic flow.
Using the traffic engineering capabilities offered by VSANs allows greater control over traffic within the fabric and higher utilization of the deployed fabric resources.
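A short sketch combining these features on a trunking interface (the interface, VSAN IDs, and costs are illustrative only):

    switch(config)# interface port-channel 1
    switch(config-if)# switchport trunk allowed vsan 10        ! prune the EISL trunk to VSAN 10
    switch(config-if)# switchport trunk allowed vsan add 20    ! then graft VSAN 20 back onto it
    switch(config-if)# fspf cost 100 vsan 10                   ! lower cost: preferred path for VSAN 10
    switch(config-if)# fspf cost 200 vsan 20                   ! higher cost: backup path for VSAN 20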
Traffic Engineering Designs for SAN Extension
How can I utilize cost-effective, wide-area bandwidth first, while providing an alternate path during a fabric fault?
• Preferred paths can be configured per VSAN; if one path fails, the other path automatically takes over
• Use cost-effective wide-area bandwidth first; use the alternate path during a fabric fault
(Slide diagram: OLTP traffic in VSAN 10 and e-mail traffic in VSAN 20 connect two sites over an OC-48 and an OC-3 link. On the OC-48, VSAN 10 cost = 100 and VSAN 20 cost = 200; on the OC-3, VSAN 10 cost = 200 and VSAN 20 cost = 100.)
Traffic Engineering Designs for SAN Extension
In addition to tuning FSPF link costs for traffic engineering in local SAN fabrics, FSPF can be particularly beneficial in SAN extension environments where multiple paths exist between separate SAN fabrics. In the diagram, there are two equal-cost FCIP paths between the two VSANs on either side. The FCIP interfaces are configured as trunking interfaces so that they can carry traffic for multiple VSANs. By tuning the per-VSAN FSPF link costs, we are able to give each VSAN a dedicated preferred path, while allowing for transparent failover in the event of a link failure.
In other SAN extension environments, FSPF could be used to give preference to a high-speed link, for example an OC-48 link, while allowing a slower OC-3 link to provide a failover path. Or preference could be given to the more cost-effective link, allowing for failover to a more expensive path in the event of a fabric fault.
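Applied to the figure, a sketch of the per-VSAN FSPF costs on the two FCIP trunks might look like this (the fcip interface numbers are illustrative):

    switch(config)# interface fcip 1            ! tunnel over the OC-48
    switch(config-if)# fspf cost 100 vsan 10
    switch(config-if)# fspf cost 200 vsan 20
    switch(config-if)# exit
    switch(config)# interface fcip 2            ! tunnel over the OC-3
    switch(config-if)# fspf cost 200 vsan 10
    switch(config-if)# fspf cost 100 vsan 20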
SAN Performance Management
Ad Hoc SAN Performance Management
• Vendor-specific tools offer limited performance management capabilities
• Probes and analyzers in the data path are intrusive, disruptive, and expensive
(Slide diagram: today's ad hoc sources of performance data include switch counters and limited switch performance management tools, host-based metrics, a protocol analyzer in the data path, and QoS probes in the data path, spread across the IP LAN, IP WAN, and FC SANs.)
Ad Hoc SAN Performance Management
True end-to-end SAN performance management is a daunting task for system administrators. Host-based tools, switch-based tools, and expensive in-band probes and analyzers all provide data, but using vendor-specific tools becomes increasingly difficult and time consuming when the SAN and the applications it supports begin to scale. In-band appliances in the data path are expensive, disruptive, and may even mask some performance symptoms by retiming the signal, making analysis of performance data even more problematic.
SPAN
• Non-intrusive copy of all traffic from a port
• Directed to an SD_Port within the local or a remote MDS switch
• Traffic can be redirected to a Cisco Port Analyzer Adapter (PAA)
• Also compatible with off-the-shelf FC protocol analyzers
(Slide diagram: a SPAN source port on the source switch is copied, via an RSPAN tunnel encapsulation through an ST port across the MDS FC network, to an SD port on the destination switch where the FC analyzer is attached.)
Switched Port Analyzer (SPAN)
SPAN allows a user to make a copy of all traffic on a port and direct it to another port within the switch. This copy is not intrusive to any of the connected devices and is performed in hardware, thereby avoiding any unnecessary CPU load. Using the SPAN feature, a user can connect a Fibre Channel analyzer, such as a Finisar analyzer, to an unused port on the switch and then simply use SPAN to make a copy of the traffic from the port under analysis and send it to the analyzer in a non-intrusive and non-disruptive fashion. SPAN features include the following:
Non-intrusive and non-disruptive tool used with the Fibre Channel analyzer
Ability to copy all traffic from a port and direct it to another port within the switch
Totally hardware-driven—no CPU burden
Up to 16 SPAN sessions within a switch
Each session can have up to four unique sources and one destination port
Filter the SPAN source based on receive only traffic, transmit only traffic, or bidirectional traffic
The Fibre Channel port that is to be analyzed is designated the SPAN source port. A copy of all Fibre Channel traffic flowing through this port is sent to the SD_Port. This includes traffic traveling in or out of the Fibre Channel port, that is, in the ingress or egress direction. The SD_Port is an independent Fibre Channel port, which receives this forwarded traffic and in turn sends it out for analysis to an externally attached Fibre Channel analyzer.
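A minimal SPAN sketch (the interface numbers are illustrative): fc1/1 is the port under analysis and fc2/1 is the SD_Port connected to the analyzer.

    switch(config)# interface fc2/1
    switch(config-if)# switchport mode SD              ! dedicate this port as the SPAN destination
    switch(config-if)# no shutdown
    switch(config-if)# exit
    switch(config)# span session 1
    switch(config-span)# source interface fc1/1 rx     ! copy traffic received on fc1/1
    switch(config-span)# destination interface fc2/1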
SPAN Applications
Using the SPAN feature, you can conduct detailed troubleshooting on a particular device without any disruption. In addition, a user may want to take a sample of traffic from a particular application host for proactive monitoring and analysis, a process that can easily be accomplished with the SPAN feature.
Remote Switched Port Analyzer (RSPAN) further increases the capability of the SPAN feature. With RSPAN, a user has the ability to make a copy of traffic from a source port or VSAN to a port on another connected switch. Debugging protocols supported by SPAN and RSPAN include FSPF, PLOGI, exchange link parameters (ELP), and others. Examples of data analysis that can be performed with SPAN and RSPAN include:
Traffic on a particular VSAN on a TE_Port.
Application-specific analysis using an analyzer.
An important debugging feature SPAN provides is that multiple users can share an SD_Port and analyzer. Also, the MDS 9000 can copy traffic on a single port at line rates. MDS 9000 can use SPAN with both unicast and multicast traffic. Dropped frames are not SPAN-ed. SPAN-ed frames will be dropped if the sum bandwidth of sources exceeds the speed of the destination port.
Performance Manager
• Fabric-wide, historical performance reporting
• Browser-based display
• Summary and drill-down reports for detailed analysis
• Integrates with Cisco Traffic Analyzer
• Requires the Fabric Manager Server license
Performance Manager
Performance Manager, like Fabric Manager, runs as a Windows service. It monitors network device statistics and displays historical information in a web-based GUI. Performance Manager provides detailed traffic analysis by capturing data with the Cisco Port Analyzer Adapter (PAA). This data is compiled into various graphs and charts, which can be viewed with any web browser. It presents recent statistics in detail and older statistics in summary. Performance Manager has three parts:
Definition of traffic flows is done by manual edits or by using a Fabric Manager configuration wizard to create a configuration file;
Collection is where Performance Manager reads the configuration file and collects the desired information;
Presentation is where Performance Manager generates web pages to present the collected data.
Performance Manager can collect a variety of data about ISLs, host ports, storage ports, route flows, and site-specific statistical collection areas. It relies on captured data flows through the use of the PAA, Fabric Manager Server, and the Traffic Analyzer. Using it as an FC traffic analyzer, a user can drill down to the distribution of read vs. write I/O, average frame sizes, LUN utilization, and so on. Using it as an FC protocol analyzer, a user has access to frame-level information for analysis.
The Summary page presents the top 10 hosts, ISLs, storage ports, and flows by combined average bandwidth for the last 24-hour period. This period changes on every polling interval; this is unlikely to change the average by much, but it could affect the maximum value. The intention is to provide a quick summary of the fabric's bandwidth consumption and highlight any hot spots.
Performance Manager (Cont.)
Goals:
• Ability to scale to large fabrics with 12 months of data
• Provide early warning of traffic problems
• Ability to see all packets on an individual interface
• Ability to diagnose traffic problems
• Simple to set up and use
Hybrid approach:
• Aggregate traffic is collected using SNMP and stored persistently in a round-robin database
• SPAN, the PAA, and ntop are used to capture packets for diagnosing traffic problems
The purpose of Performance Manager is to monitor network device statistics over time and present this information graphically in a web browser. It presents recent statistics in detail and older statistics in summary. The deployment goals of Performance Manager are to scale to large fabrics with 12 months of data, provide an early warning system for potential traffic problems, see all packets on an individual interface, diagnose traffic problems, and be simple to set up and use. To achieve these goals, Cisco implemented a hybrid approach: first, aggregate fabric traffic information is retrieved using SNMP and stored persistently in a round-robin database; then SPAN, PAA, and NTOP are used to capture packets for diagnosing traffic problems. Performance Manager is a tool that can:
Scale to large fabrics
Scale to multi-year histories
Perform data collection without requiring inband local/proxy HBA access
Tolerate poor IP connectivity
Provide SNMPv3 support
Use zero-administration databases
Provide site customization capabilities
Accommodate fabric topological changes
Integrate and share data with external tools
Run on multiple operating systems
Integrate with Fabric Manager
Performance Manager – Collected Data
• Rx/Tx bytes for: ISLs, host ports, storage ports, flows
• Bytes (and frames) sent from sources to destinations
• Configured based on active zones, for example: Host1 → Storage1, Host1 → Storage2, Storage1 → Host1, Storage2 → Host1
• Flows need to be defined on the correct linecard(s)
Performance Manager – Collected Data Performance Manager collects receive and transmit byte data. This data is available for ISLs, host and storage ports, flows, etc. A flow is a count of bytes and frames sent from a particular source to a particular destination. Use the active zones to configure flows. For instance, given:
Zone A: Host1, Storage1, Storage2
Zone B: Host2, Storage1
Possible Flows:
Host1->Storage1
Host1->Storage2
Host2->Storage1
Storage1->Host1
Storage1->Host2
Storage2->Host1
Traffic Analyzer • Cisco customized version of ntop • Free for download from CCO • Live or Offline Analysis • Provides SCSI based Information about storage network devices • FC-enhanced public-domain tools
© 2006 Cisco Systems, Inc. All rights reserved.
35
Traffic Analyzer Cisco Traffic Analyzer is a Cisco customization of the popular ntop network traffic monitoring tool. Traffic Analyzer allows for live or offline analysis, and displays information about storage and the network. Traffic Analyzer is a Fibre Channel-enhanced version of public-domain tools. Traffic Analyzer is not well suited for accounting, because frames may be dropped on the SD_Port, by the PAA, or on the host. Traffic Analyzer can be downloaded free from Cisco Connection Online (CCO).
Traffic Analyzer – Statistics Collected
• Overall Network Statistics: Total Bandwidth Used, Tx/Rx Bandwidth per VSAN, Tx/Rx Bandwidth per N_Port
• Session-Based Statistics: SCSI Sessions (I_T_L Nexus), FICON Sessions (in progress), Other FC Sessions
• N_Port-Based Statistics: Per-LUN Statistics, Traffic Breakdown by Time, Class-based traffic breakdown
• VSAN-Based Statistics: Traffic Breakdown by VSAN, VSAN configuration Stats, Domain-based statistics
• And more…
Traffic Analyzer – Statistics Collected Traffic Analyzer collects a large amount of statistical information. Some of the statistics collected are:
Overall Network Statistics
Total Bandwidth Used
Tx/Rx Bandwidth per VSAN
Tx/Rx Bandwidth per N_Port
Session-based Statistics
SCSI Sessions (I_T_L Nexus)
FICON Sessions (in progress)
Other FC Sessions
N_Port-Based Statistics
Per-LUN Statistics
Traffic Breakdown by Time
Class-based traffic breakdown
VSAN-Based Statistics
Traffic Breakdown by VSAN
VSAN configuration Stats
Domain-based statistics
Traffic Analyzer – How It Works • Hooks up with Port Analyzer Adapter (PAA) Captures traffic much like Fabric Analyzer Different tool for analyzing traffic Modification of ntop
• Software runs on host (PC, Mac, etc.) No new switch software User Interface is Web Browser
• Requires modified Port Analyzer Adapter (PAA-2) Provides original length for truncated frames Captures more data with less bandwidth Preserves data privacy without compromising statistics accuracy
Traffic Analyzer – How It Works The Traffic Analyzer hooks up with the Port Analyzer Adapter (PAA). It captures traffic much like the Fabric Analyzer, and is yet another tool for analyzing traffic. The Traffic Analyzer is a modification of ntop (see ntop.org). TA software runs on the host (PC, Mac, etc.) rather than on the switch, so there is no new switch software. The User Interface is simply a web browser. Traffic Analyzer must be used with a modified Port Analyzer Adapter (PAA). The newer PAA2 provides the original length of truncated frames. It captures more data with less bandwidth, and preserves data privacy without compromising statistics accuracy. Older PAAs are not field-upgradeable.
Notification and Logging Services
Robust fault monitoring = quicker problem resolution
RMON
• Set alarms based on 1 or more parameters, such as: port utilization, CPU utilization, memory utilization
• Specify actions to be taken based on alarms: logging, SNMP traps, log-and-trap
Syslog
• Log information for monitoring and troubleshooting
• Capture accounting records
• See a complete picture of events
Call Home
• Notification of critical system events; example: alert when switch ports are congested
• Flexible message formats: email, pager, XML
• Integrates with RMON and Syslog (SAN-OS 2.0+)
Call Home Call Home can be configured to provide alerts when switch ports become congested, so performance can be monitored remotely and action taken promptly. The Call Home functionality is available directly through the Cisco MDS 9000 Family. It provides multiple Call Home profiles (also referred to as Call Home destination profiles), each with separate potential destinations. Each profile may be predefined or user-defined. A versatile range of message formats is supported, including standard email-based notification, pager services, and XML message formats for automated XML-based parsing applications. The Call Home function can even leverage support from Cisco Systems or another support partner: for example, if a component failure is detected, a replacement part can be on order before the SAN administrator is even aware of the problem. Flexible message delivery and format options make it easy to integrate specific support requirements. The Call Home feature offers the following advantages:
Integration with established monitoring systems like RMON and Syslog
Comprehensive and more robust fault monitoring
Aids in quicker problem-resolution
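A minimal sketch of a Call Home configuration, assuming email delivery through an SMTP relay (contact details, addresses, and the SMTP server are placeholders; option names may vary slightly by release):
switch# config terminal
switch(config)# callhome
! Contact information included in outgoing Call Home messages
switch(config-callhome)# email-contact sanadmin@example.com
switch(config-callhome)# phone-contact +1-800-555-0100
switch(config-callhome)# streetaddress 123 Example Street, Anytown
! Send full-text alerts to the support alias via the SMTP relay
switch(config-callhome)# destination-profile full-txt-destination email-addr support@example.com
switch(config-callhome)# transport email smtp-server 192.0.2.25
switch(config-callhome)# transport email from mds9509@example.com
! Turn the Call Home service on
switch(config-callhome)# enable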
RMON Threshold Manager Use the options on the Device Manager Events menu to configure and monitor Simple Network Management Protocol (SNMP), Remote Monitor (RMON), Syslog, and Call Home alarms and notifications. SNMP provides a set of preconfigured traps and informs that are automatically generated and sent to the destinations (trap receivers) chosen by the user. Use the RMON Threshold Manager to configure event thresholds that will trigger log entries or notifications. The RMON groups that have been adapted for use with Fibre Channel include the AlarmGroup and EventGroup. The AlarmGroup provides services to set alarms. Alarms can be set on one or multiple parameters within a device. For example, an RMON alarm can be set for a specific level of CPU utilization or crossbar utilization on a switch. The EventGroup allows configuration of events (actions to be taken) based on an alarm condition. Supported event types include logging, SNMP traps, and log-and-trap.
Syslog The system message logging software saves messages in a log file or directs the messages to other devices. This feature provides the following capabilities:
Logging information for monitoring and troubleshooting.
Selection of the types of logging information to be captured.
Selection of the destination of the captured logging information.
By default, the switch logs normal but significant system messages to a log file and sends these messages to the system console. Users can specify which system messages should be saved based on the type of facility and the severity level. Messages are time-stamped to enhance real-time debugging and management. Syslog messages are categorized into seven severity levels, from debug to critical events. Users can limit the severity levels that are reported for specific services within the switch. For example, Syslog can be configured to report only debug events for the FSPF service but record all severity-level events for the Zoning service. A unique feature within the Cisco MDS 9000 Family switches is the ability to send accounting records to the Syslog service. The advantage of this feature is consolidation of both types of messages for easier correlation. For example, when a user logs into a switch and changes an FSPF parameter, Syslog and RADIUS provide complementary information that portrays a complete picture of the event. Syslog can store a chronological log of system messages locally or send messages to a central Syslog server. Syslog messages can also be sent to the console for immediate use. These messages can vary in detail depending on the configuration chosen.
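For example, per-facility severity levels and a central Syslog server could be configured along these lines (the server address is a placeholder; this is a sketch rather than a complete logging policy):
switch# config terminal
! Forward messages at severity 6 (informational) and below to a central Syslog server
switch(config)# logging server 192.0.2.50 6
! Log FSPF messages up to debugging severity, but zone server messages only up to notifications
switch(config)# logging level fspf 7
switch(config)# logging level zone 5
! Keep console output limited to critical events
switch(config)# logging console 2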
Lesson 8
Securing the SAN Fabric Overview This lesson explains how to secure an MDS 9000 SAN fabric using zoning, port and fabric binding, device authentication, and management security.
Objectives Upon completing this lesson, you will be able to design an end-to-end SAN security solution. This includes being able to meet these objectives:
Describe the most common security issues facing SANs
Explain how zoning contributes to the security of a SAN solution
Explain how port and fabric binding contribute to the security of a SAN solution
Explain how device authentication contributes to the security of a SAN solution
Explain how to secure management data paths
Explain the best practices for end-to-end SAN security design
SAN Security Issues
SAN Security Challenges
• SAN security is often overlooked as an area of concern: application integrity and security are addressed, but not the back-end storage network carrying the actual data; SAN extension solutions now push SANs outside datacenter boundaries
• Not all compromises are intentional: accidental breaches can still have the same consequences
• SAN security is only one part of a complete datacenter solution: host access security (one-time passwords, auditing, VPNs); storage security (data-at-rest encryption, LUN security); datacenter physical security
(Diagram: threats to the SAN include external DoS or other intrusion, privilege escalation or unintended privilege, application tampering (trojans, etc.), unauthorized internal connections, data tampering, and theft.)
SAN Security Challenges Traditionally, SANs have been considered 'secure' primarily because SAN deployments had been limited to a subset of a single data center – in essence, an isolated network. Fibre Channel (FC) has been considered 'secure' on the basis that FC networks have been isolated from other networks. Application security has long been the focus of IT professionals, while back-end storage and storage networks have often been ignored from a security perspective. Today, SANs often span beyond a single datacenter. SAN extension technologies such as DWDM, CWDM, and FCIP can be used to connect devices in one datacenter to storage in another. Transport technologies such as iSCSI decrease the cost associated with attaching hosts to a SAN and therefore accelerate the rate at which devices are connected to a SAN. There are many potential threats to network security. Often these threats are perceived as external, whereby some outside entity (a hacker or cracker) attempts to break into a network to steal data, read confidential information, or simply wreak havoc with an organization's business operations. While these external entities pose a significant threat to network security, internal entities frequently pose a far greater threat and typically are not adequately addressed by network security defense mechanisms. SAN security is an important part of a complete datacenter security solution. SAN security attempts to protect both data in transport (storage networking security) and data at rest (storage data security).
SAN Security Vulnerabilities
• Fabric and target threats (protocol threats, unauthorized target access): compromised application data, compromised LUN integrity, compromised application performance, unplanned downtime, costly data loss
• Fabric protocol threats (unauthorized fabric service): compromised fabric stability, compromised data security, disruptive topology changes, unplanned downtime, instability, poor I/O performance, costly data loss
• SAN management threats (clear-text passwords, no audit of access attempts, out-of-band Ethernet management connection, accidental or intentional harmful management activity): disruption of switch processing, compromised fabric stability, compromised data integrity and secrecy, loss of service, LUN corruption, data corruption, data theft or loss
SAN Security Vulnerabilities Security for the SAN has often been a matter of “security by obscurity”, as IT organizations relied on the inherent security of the data center to protect its storage. With the expansion of SANs outside the heavily defended data center perimeter, more robust security is needed to protect storage resources and the storage fabric. In addition, other vendors’ platforms feature weak management security, with insecure management protocols and no ability to restrict management access.
Fabric Security Tiers
• Host-based: LUN mapping, standard OS security
• Fabric-based: VSANs, zoning, LUN zoning, read-only zones, port mode security, port binding, fabric binding, authentication, management security, role-based access control
• Array-based: LUN masking
Fabric Security Tiers SAN security can be implemented at three distinct tiers:
Host
Fabric
Array
Security measures at each of the three tiers can be used by storage administrators to achieve the level of security needed for a particular environment. If an entire SAN is contained within a single, physically secure data center, a more relaxed suite of security measures might be chosen. However, SANs are commonly being extended beyond the confines of the corporate data center, and thus implementing multiple security mechanisms at all three tiers is both warranted and necessary.
Fabric Security Limitations
• Host and/or array LUN security: no fabric enforcement
• Soft zoning: spoof the WWN or FCID and gain access
• WWN-based hard zoning: spoof the WWN and gain access
• Port-based zoning: occupy the port and gain access
• Port security (WWN binding): spoof the WWN and occupy the port to gain access
• DH-CHAP: need full authentication to gain access
Fabric Security Limitations Traditional SAN security methods include LUN security on hosts and storage arrays as well as zoning in the fabric. As a new generation of SAN technology becomes available, additional security features can be deployed to close long-standing security vulnerabilities. Each security mechanism and its limitations:
Host and/or Array LUN Security – Host and Array LUN security does not rely on fabric enforcement and thus has limited effectiveness. By itself LUN security is not adequate to safeguard a SAN, but host LUN security and array LUN security can be used in conjunction with other security measures to create an effective security policy.
Soft Zoning – Soft zoning is perhaps the oldest and most commonly deployed security method within SANs. Primarily it protects hosts from accidentally accessing targets with which they are unauthorized to communicate. However, soft zoning provides no fabric enforcement. If a host can learn the FCID of a target soft zoning will not prevent that host from accessing the target.
Port-based Zoning – Port-based zoning is applied to every FC frame that is switched, so it has a level of fabric enforcement not provided by soft zoning. However, the security provided by port-based zoning can be circumvented simply by gaining physical access to an authorized port.
WWN-based Zoning – WWN-based zoning applies switching logic to frames based on the factory burned-in WWN rather than the physical port the device is connected to. WWN-based zoning can be defeated through the spoofing of WWNs, which is relatively trivial to accomplish.
Port Security – Prevents unauthorized fabric access by binding specific WWNs to one or more given switch ports. In order to defeat port security, a hacker would need to both spoof the device WWN and access the specific port or ports that the device is authorized to use.
DH-CHAP – Enforces fabric and device access through an authentication method during the fabric login phase. DH-CHAP offers an excellent method of securing SAN fabrics, but it does not provide encryption services. Standardized encryption services for SANs will soon be available from multiple vendors, including Cisco.
Depending on the security protocols you have implemented, PPP authentication using MS-CHAP can be used with or without Authentication, Authorization, and Accounting (AAA) security services. If you have enabled AAA, PPP authentication using MS-CHAP can be used in conjunction with both TACACS+ and RADIUS.
MS-CHAP V2 authentication is the default authentication method used by the Microsoft Windows 2000 operating system. Support of this authentication method on Cisco routers enables users of the Microsoft Windows 2000 operating system to establish remote PPP sessions without needing to first configure an authentication method on the client.
MS-CHAP V2 authentication introduces an additional feature not available with MS-CHAP V1 or standard CHAP authentication: the change-password feature. This feature allows the client to change the account password if the RADIUS server reports that the password has expired.
Comprehensive Security Solutions
• Fabric and target security: VSAN-based security (only allow access to devices within the attached VSAN); hardware-enforced zoning complementary to VSAN segregation; LUN zoning and read-only zoning; port mode security; device authentication using DH-CHAP
• Fabric protocol security: port and fabric binding; FC-SP switch-to-switch authentication
• SAN management security: AAA security with RADIUS and TACACS+; SSH for secure console sessions; out-of-band Ethernet management connection; secure GUI access with SNMPv3 authentication and encryption; RBAC and logging for audit controls
Comprehensive Security Solutions While other SAN switch vendors have recently introduced features like hardware-enforced world-wide name (WWN) zoning, port and fabric binding, and device authentication, VSANs add a critical layer of protection by isolating ports on a per-department or per-application basis. The MDS supports standard Cisco role-based access control (RBAC) on a per-VSAN basis, providing a fine (but easily managed) level of granularity in assigning access permissions. Management data paths are secured with authenticated and encrypted SSH, SSL, and SNMPv3 sessions. All of this can be managed via centralized AAA services like Remote Authentication Dial In User Service (RADIUS) and Terminal Access Controller Access Control System (TACACS+), reducing management overhead and ensuring more consistent application of security policies across the enterprise.
Zoning
MDS Advanced Zoning Features
• Used to control host access to storage devices within a VSAN
• MDS 9000 supports both hard and soft zoning: soft zoning is enforced by Name Server query-responses; hard zoning is enforced on every frame by the forwarding ASIC
• Zoning options: pWWN (attached Nx_Port), FCID, FC Alias (within a VSAN), Device Alias (global within a SAN), fWWN (switch port-based zoning), interface (e.g. fc1/2), sWWN and port, LUN
• Fully compliant with FC-GS-3, FC-GS-4, FC-SW-2, FC-SW-3, and FC-MI
• Fabric Manager supports Zone Merge Analysis, which prevents fabric merge failures due to zone database mismatch
MDS Advanced Zoning Features Zoning is a mechanism to control access to devices within a Fibre Channel fabric. On the Cisco MDS 9000 family of switches and routers, zoning is enforced separately in each VSAN. The MDS 9000 supports both hard and soft zoning. Soft zoning is enforced through selective query responses made by the Fibre Channel name server. Hard zoning is applied to all data traffic by the forwarding ASIC. Zoning can be based on port and fabric WWN, FCID, interface, and LUNs. Cisco's implementation of zoning is fully compliant with FC standards including FC-GS-3, FC-SW-2, and FC-MI, and with SAN-OS 2.0 or higher the MDS supports the FC-GS-4 and FC-SW-3 standards, which allow greater consistency of zoning parameters across the fabric. Prior to bringing up an ISL between two switches, Fabric Manager can conduct a zone merge analysis to determine whether the two switches' zone databases can be successfully merged.
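A minimal sketch of a single-initiator zone and zoneset activation in one VSAN (WWNs, names, and the VSAN number are placeholders):
switch# config terminal
! Single-initiator zone: one host HBA and one storage port
switch(config)# zone name Host1_Storage1 vsan 10
switch(config-zone)# member pwwn 21:00:00:e0:8b:01:02:03
switch(config-zone)# member pwwn 50:06:01:60:aa:bb:cc:dd
! Add the zone to a zoneset and activate it in the VSAN
switch(config)# zoneset name Fabric_A vsan 10
switch(config-zoneset)# member Host1_Storage1
switch(config-zoneset)# exit
switch(config)# zoneset activate name Fabric_A vsan 10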
MDS Zoning Services
Hardware-Based Zoning Details
• All zoning services offered by Cisco are implemented in hardware ASICs; no dependence on whether a mix of WWNs and Port_IDs is used in a zone – all hardware based
• WWN-based zoning is implemented in software with hardware reinforcement; WWNs are translated to FCIDs to be frame-filtered
• Dedicated high-speed port filters called TCAMs (Ternary CAMs) filter each frame in hardware and reside in front of each port: support for up to 20,000 programmable entries consisting of zones and zone members (up to 8000 zones per fabric); very deep frame filtering for new innovative features; wire-rate filtering performance with no impact regardless of the number of zones or zone entries; optimized programming during zoneset activation (incremental zoneset updates)
• RSCNs contained within zones in a given VSAN
• Selective default zone behavior: default is deny, configured per VSAN
MDS Zoning Services Data plane traffic is secured with VSANs, guaranteeing segregation of traffic across shared fabrics, and with zoning to satisfy traffic segregation requirements within a VSAN. Hardware-based ACLs provide further granularity for advanced security options. The Cisco MDS 9509 leverages Cisco's experience securing the world's most sensitive data networks to deliver the industry's most secure storage networking platform. VSANs and zoning within the MDS 9000 Family of products are two powerful tools to aid the SAN designer in building robust, secure, and manageable networking environments while optimizing the use and cost of switching hardware. In general, VSANs are used to divide a redundant physical SAN infrastructure into separate virtual SAN islands, each with its own set of Fibre Channel fabric services. Because each VSAN supports an independent set of Fibre Channel services, a VSAN-enabled infrastructure can house numerous applications without concern for fabric resource or event conflicts between these virtual environments. Once the physical fabric has been divided, zoning is then used to implement a security layout within each VSAN that is tuned to the needs of each application within each VSAN.
Enhanced Zone Server (SAN-OS 2.0)
• Basic mode: represents the zone server behavior of the GS-3/SW-2 standards; supported in pre-2.0 SAN-OS releases
• Enhanced mode: represents the zone server behavior of the GS-4/SW-3 standards; available with SAN-OS 2.0 and greater
– QoS parameters distributed as part of zone attribute
– Consistent full-zone database across fabric
– Support for attributes in the standard
– Consistent zoning policy across fabric
– Unique vendor type
– Reduced payload for activation request
Enhanced Zone Server Starting with SAN-OS 2.0, the MDS supports the FC-GS-4 and FC-SW-3 standards, which allow greater consistency of zoning parameters across the fabric.
Securing Hosts and Storage
Host/array-based LUN security: LUN Mapping in the host, LUN Masking in the storage array
• Used to control host access to storage LUNs
• Not enforced in the fabric
• Prevents contention for storage resources and data corruption
• Protection from unintentional security breaches
• Lack of centralized management
(Diagram: hosts with pWWN 1 and pWWN 2 access storage ports pWWN 3 and pWWN 4; array-based LUN security controls access to LUNs A and B.)
Securing Hosts and Storage LUN security is used to determine which hosts are allowed to access which storage volumes. LUN-level security is required in order to allow fabric administrators the ability to control access to storage resources below the port level, such as accessing disks within a JBOD or logical volumes within a RAID array. LUN security can be enforced at the host or in the storage array. All HBAs sold today support LUN Mapping, and most intelligent storage arrays also allow administrators to restrict the hosts that can access each LUN with LUN Masking. LUN-level access could also be enforced by a router or switch in the SAN. LUN security is primarily used to prevent multiple hosts from accessing the same storage resources and thereby causing data corruption. Unless the storage array also supports LUN-level access control, these security techniques are "voluntary." Host-level LUN security does not prevent an unauthorized host from connecting to the fabric and accessing storage resources. Note
Different vendors use different terminology to describe LUN security. Typical terms used include: LUN security, LUN masking, LUN mapping, Storage Domains. In this course we use the generic term “LUN Security”.
Example of Host-Based LUN Security: The HBA utility on the red host could be configured to only communicate with the WWN associated with storage port A. Example of Array-Based LUN Security: Storage Port B could be configured to only accept frames from the WWN associated with the blue host port.
LUN Zoning (SAN-OS 1.2)
LUN Zoning is the ability to zone an initiator with a subset of LUNs offered by a target:
• Use with storage arrays that lack LUN masking capability
• Use instead of LUN masking to centralize management in heterogeneous storage environments
• Can be managed centrally from CFM
(Diagram: with LUN zoning in place, report_LUNs returns 10 LUNs available, report_size LUN_1 returns 50GB, and report_size LUN_3 returns LUN_3 is unavailable.)
LUN Zoning Disk arrays typically have multiple Logical Units on them. Standard FC Zoning extends down to the switch port level or down to the WWN of the port, but not down to the LUN level. This means that any fabric containing disk arrays with multiple LUNs needs security policies configured on both the disk array (or multiple disk arrays) and on the FC switches themselves. LUN Zoning is a feature specific to switches in the Cisco MDS 9000 Family, introduced in SAN-OS 1.2, that allows zoning to extend to individual LUNs within the same WWN. This means that the centralized zoning policy configured on the FC switches can extend to hardware-enforced zoning down to individual LUNs in disk arrays. In the top half of the diagram, LUN zoning allows the switch to grant the host access to disks 1 and 2 while preventing the host from accessing all other disks in the array.
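As a hedged sketch, the optional lun keyword on a pWWN zone member restricts the initiator to specific LUNs behind that target port (WWNs, the VSAN, and LUN numbers are placeholders):
switch(config)# zone name Host1_LUN01 vsan 10
! Host HBA (initiator)
switch(config-zone)# member pwwn 21:00:00:e0:8b:01:02:03
! Storage port, restricted to LUNs 0x0 and 0x1 only
switch(config-zone)# member pwwn 50:06:01:60:aa:bb:cc:dd lun 0x0
switch(config-zone)# member pwwn 50:06:01:60:aa:bb:cc:dd lun 0x1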
Read-Only Zoning (SAN-OS 1.2)
Read-Only Zoning leverages the hardware-based frame processing of the MDS 9000 Family:
• Use for backup servers and snapshots
• Especially useful for media servers that need high-speed access to rich content for broadcast; block-level access bypasses the NAS service
• Does not work for certain file system types, e.g. NTFS
(Diagram: a streaming media server is allowed FCP_READ and FCP_DATA, while FCP_WRITE is blocked.)
Read-Only Zoning Standard FC Zoning is used to permit devices to communicate with each other. Standard FC Zoning cannot perform any advanced filtering – for example, by blocking or allowing specific I/O operations such as a Write I/O command. The Cisco MDS 9000 Family provides the ability to enforce 'read only' zones in hardware. That is, the switch can enforce read-only access to a given device (e.g. a disk) and will block any write requests. Read-only zoning filters FC-4 command frames based on whether the command is a read or a write command. When used in conjunction with LUN Zoning, read-only or read-write access can be granted for specific hosts to specific LUNs. Read-only Zoning was introduced with SAN-OS 1.2. This functionality is available on every port across the entire Cisco MDS 9000 product family. On the bottom half of the diagram, a streaming video server is granted "read-only" access to the storage array, thus preventing inadvertent or malicious corruption of data. Certain operating systems use file systems that depend on the ability to write to disks (e.g., many Windows file systems). Such file systems may not function properly when placed in a read-only zone. Note
Read-Only Zones requires the Enterprise License Package.
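A sketch of a read-only zone, assuming the Enterprise package is installed (names and WWNs are placeholders):
switch(config)# zone name Snapshot_RO vsan 10
! Mark the zone read-only so that write commands to its targets are blocked in hardware
switch(config-zone)# attribute read-only
switch(config-zone)# member pwwn 21:00:00:e0:8b:01:02:03
switch(config-zone)# member pwwn 50:06:01:60:aa:bb:cc:ee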
VSAN Best Practices
• Use VSANs to isolate each application
• Use IVR to allow resource sharing across VSANs
• Suspend VSAN 1: move unused ports to VSAN 4094, do not configure zones in VSAN 1, and set the default zone policy to "deny"; this prevents WWN spoofing on an unused port
(Diagram: VSAN 201 HR_DB, VSAN 202 CUST_DB, and VSAN 1 sharing one physical fabric.)
VSAN Best Practices Cisco recommends the following VSAN best practices:
Use VSANs to isolate each application whenever feasible.
Use IVR to allow resource sharing across VSANs; this allows complete isolation of each application.
Place all unused ports in VSAN 4094.
Because ports are in VSAN 1 by default, suspend VSAN 1, do not configure any zones, and set the default zone policy to deny. This will prevent WWN spoofing on unconfigured ports.
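A minimal CLI sketch of the VSAN 1 hardening steps above (moving unused ports to the isolated VSAN is normally done from the VSAN database; the commands shown are representative):
switch# config terminal
! Set the default zone policy in VSAN 1 to deny
switch(config)# no zone default-zone permit vsan 1
! Suspend VSAN 1 so ports left in it cannot pass traffic
switch(config)# vsan database
switch(config-vsan-db)# vsan 1 suspend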
Zoning Best Practices
• Use zoning services to isolate servers: use Single Initiator Zoning (configure one zone per HBA); use read-only zones for read-only targets, e.g. snapshot volumes; use LUN zoning to centralize LUN security; set default-zone policies to "deny"
• Configure zones from only one or two switches: the active zoneset will propagate to all switches in the fabric; this prevents confusion and potential errors due to conflicting full zonesets; if only one zoneset is needed, configure it on one switch only; the full zoneset can be recovered from the active zoneset if that switch fails
Zoning Best Practices Cisco recommends the following zoning best practices:
Zoning should always be deployed in a FC fabric. Typically one zone will be configured per HBA communicating with storage. This is called Single Initiator Zoning.
Depending on the particular environment, port-based or WWN-based zoning may be selected, although WWN zoning provides more convenience and less security than port-based zoning. Port security features can be used to harden WWN-based zones.
Read-only zones should be applied to LUNs that will not be modified by initiators.
LUN zoning can be used to augment or replace array-based zoning.
Set the default zone policy to “deny” to prevent inadvertent initiator access to a target.
Only 1 or 2 switches should be used to configure zoning. This will help prevent confusion due to conflicting zonesets or the activation of an incomplete zoneset. If only one zoneset is needed (i.e. the active zoneset), you can configure the full zoneset on one switch only. In the event that switch goes down and the full zoneset is lost, you can easily recover the full zoneset from the active zoneset.
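If the switch holding the full zoneset is lost, the full zone database can be rebuilt from the active zoneset with a command along these lines (the VSAN number is a placeholder):
! Copy the currently active zoneset into the full (configured) zone database
switch# zone copy active-zoneset full-zoneset vsan 10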
Port and Fabric Binding
Port Mode Security
• Only allow edge ports to form F_Ports or FL_Ports
• Limit users who can change port mode via RBAC
• Port mode security best practices: use port mode security on all switch ports; shut down all unused ports; place unused ports in VSAN 4094
(Diagram: ISL ports locked to E_Port mode and edge ports locked to F_Port or Fx_Port mode, in contrast to Auto mode, which accepts any port type.)
Port Mode Security Security and convenience are often at odds when administering a SAN. For example, the most convenient port mode setting is "Auto", which allows any type of device to log in to a given switch port. While convenient, Auto mode could allow a user to intentionally or inadvertently misuse the fabric. A more secure practice is to specifically configure a switch port to only allow a connection from an expected device type. In the diagram, the top left port is configured to only function as an E_Port. If a host were to try to use this port to access the fabric, the switch would not allow the connection. Similarly, the storage array in the left half of the diagram is connecting through a switch port configured as either an F_Port or FL_Port; the switch will only allow an N_Port or NL_Port to connect to the fabric through this port. In high security environments, port mode security should be used on all switch ports.
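A minimal sketch of locking port modes and shutting down unused ports (interface numbers are placeholders):
switch# config terminal
! Lock an ISL-facing port to E_Port mode only
switch(config)# interface fc2/1
switch(config-if)# switchport mode E
! Lock a host-facing edge port to F_Port mode only
switch(config)# interface fc1/1
switch(config-if)# switchport mode F
! Shut down a range of unused ports
switch(config)# interface fc1/13 - 16
switch(config-if)# shutdown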
Port and Fabric Binding
• Security feature that binds a remote entity to a given switch port
• The remote entity can be a host, target, or a switch
• Remote entities are identified by WWN
• Checked on all VSANs where activated
• Failure results in link-level login failure
• Prevents S_ID spoofing
• Included with the Enterprise license
• Auto-learn mode to ease configuration
(Diagram: Port X bound to WWN 1, Port Y bound to WWN 2, Port Z bound to WWN 3.)
Port and Fabric Binding The port security feature restricts access to a switch port, allowing only authorized devices to connect to that port, and blocks all other access. Authorized devices may be hosts, targets, or other switches, and are identified by their World Wide Names (WWNs). Port security checks are conducted on all VSANs that have the feature activated. Port security is available upon installation of the Enterprise license. Typically, any Fibre Channel device in a SAN can attach to any SAN switch port and access SAN services based on zone membership. Port Security is a feature, introduced into the Cisco MDS 9000 family in SAN-OS 1.2, that prevents unauthorized access to a switch port by binding specific WWNs to one or more given switch ports. When Port Security is enabled on a switch port, all devices connecting to that port must be in the port-security database and must be listed in the database as bound to that port. If both of these criteria aren't met, the port will never achieve an operationally 'active' state and the devices connected to the port will be denied access to the SAN. In the case of a storage device or host, the port name (pWWN) or node name (nWWN) can be used to lock authorized storage devices to a specific switch port. In the case of an E_Port/TE_Port, the switch name (sWWN) is used to bind authorized switches to a given switch port. When Port Security is enabled on a port:
Login requests from unauthorized Fibre Channel devices (Nx ports) and switches (xE ports) are rejected.
All intrusion attempts are reported to the SAN administrator.
The auto-learn option allows for rapid migration to Port Security when it is activated for the first time. Rather than manually securing each port, auto-learn automatically populates the port-security database based on an inventory of currently connected devices.
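A hedged sketch of binding a host pWWN to a specific switch port and activating port security in a VSAN (WWNs, interfaces, and the VSAN number are placeholders; the database copy step assumes auto-learn has already inventoried the connected devices):
switch# config terminal
! Bind an authorized host pWWN to switch port fc1/3 in VSAN 10
switch(config)# port-security database vsan 10
switch(config-port-security)# pwwn 21:00:00:e0:8b:01:02:03 interface fc1/3
switch(config-port-security)# exit
! Activate the port-security database for the VSAN
switch(config)# port-security activate vsan 10
! Save auto-learned entries into the configuration database
switch(config)# port-security database copy vsan 10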
Port Security Best Practices • Use port mode assignments: Lock (E)ISL ports to E_Port mode Lock access ports to Fx_Port mode
• Use port security features everywhere: Bind devices to switch as a minimum level of security Bind devices to a port as an optimal configuration Consider binding to group of ports in case of port failure Bind switches together at ISL ports – bind to specific port, not just switch
• Use FC-SP authentication for switch-to-switch fabric access: Use device-to-switch when available
• Use unique passwords for each FC-SP connection
• Use RADIUS or TACACS+ for centralized FC-SP password administration
Port Security Best Practices Port security best practices include the use of port mode assignments:
Lock (E)ISL ports to only be (E)ISL ports
Lock initiator and target ports down to F or FL mode
When higher levels of security are desired, use port security features:
Bind devices to switch as a minimum level of security
Bind devices to a port as an optimal configuration
Consider binding to a group of ports in case of port failure
Bind switches together at ISL ports – bind to specific port, not just switch
Use FC-SP authentication for switch-to-switch fabric access:
Use device-to-switch when available
FC-SP-based authentication should be considered mandatory in a secure SAN in order to prevent access to unauthorized data via spoofed or hijacked WWNs where traditional Port Security would be vulnerable.
Use unique passwords for each FC-SP connection. Use RADIUS or TACACS+ for centralized FC-SP password administration:
RADIUS or TACACS+ authentication is recommended for fabrics with more than five FC-SP-enabled devices.
Authentication and Encryption WWN Identity Spoofing Zoning provides segregation, but lacks any form of authentication Circumventing zones through impersonation of a member (identity spoofing) is both possible and relatively trivial to do
http://www.emulex.com/ts/fc/docs/wnt2k/2.00/pu.htm
WWN Identity Spoofing While zones provide a good method of segregating groups of hosts and disks within a SAN, they do not effectively protect a SAN from a malicious attack. Most zones today are based on WWNs, which are relatively trivial to spoof. Here we see an HBA configuration utility that allows an administrator to override the factory-set device WWN. Such a utility might be used by a hacker to circumvent WWN-based zoning and thus gain unauthorized access to a fabric. Both Emulex and QLogic provide tools to change the WWN of the Host Bus Adapter.
Host and Switch Authentication
Prevents accidental and/or malicious devices from joining a secure SAN.
• Initial phase focused on authentication
• DH-CHAP between devices
• Centralized RADIUS or TACACS+ server
• MS-CHAP now supported in SAN-OS v3
Based on FC-SP security protocols.
(Diagram: trusted hosts and switches authenticate to the fabric with FC-SP (DH-CHAP), backed by a RADIUS or TACACS+ server; accidental or malicious hosts and switches are prevented from connecting to the fabric and storage subsystems.)
Host and Switch Authentication Support for device authentication was introduced into the Cisco MDS 9000 family in SAN-OS 1.3. This and subsequent releases support data integrity (tamper-proofing) and authentication (non-repudiation) for both switch-to-switch and host-to-switch communication. Authentication is based on the Challenge Handshake Authentication Protocol (CHAP) with Diffie-Hellman (DH) extensions (DH-CHAP). Cisco's implementation of DH-CHAP supports node-to-switch and switch-to-switch authentication. Authentication can be performed locally in the switch or remotely through a centralized RADIUS or TACACS+ server. If the authentication credentials cannot be ascertained or the authentication check fails, a switch or host is blocked from joining the FC fabric. Secure switch control protocols prevent accidental and/or malicious devices from joining a secure SAN. Because of the distributed nature of the FC protocol, a user can create a DoS condition by maliciously or accidentally connecting hosts and switches into an existing fabric. This is especially a concern when deploying a geographically dispersed, enterprise-wide or campus-wide fabric. The MDS 9000 addresses this with host-to-switch and switch-to-switch authentication features proposed by T11 in the FC-SP specification. Before two entities start exchanging control and data frames, they mutually authenticate each other by using an external RADIUS or TACACS+ server.
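As an illustrative sketch (the peer sWWN, passwords, and interface are placeholders), DH-CHAP can be enabled locally and required on an ISL:
switch# config terminal
! Enable the FC-SP feature
switch(config)# fcsp enable
! Local DH-CHAP password presented to authenticating peers
switch(config)# fcsp dhchap password LocalSecret123
! Password expected from a specific remote switch, identified by its sWWN
switch(config)# fcsp dhchap devicename 20:00:00:0d:ec:aa:bb:cc password PeerSecret456
! Require FC-SP authentication on the ISL interface
switch(config)# interface fc2/16
switch(config-if)# fcsp on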
Security for IP Storage Traffic
• iSCSI secured with IPsec; FCIP secured with IPsec
• CHAP provides authentication of iSCSI hosts
• IPsec provides end-to-end authentication, data integrity, and encryption: hardware-based, high-performance solution; MDS 9216i or MPS-14/2 Module
(Diagram: iSCSI hosts at the primary site and FCIP links across the IP WAN to the remote site are both protected by IPsec, with CHAP authentication for the iSCSI hosts.)
Security for IP Storage Traffic IP Security (IPsec) is available for FCIP and Small Computer System Interface over IP (iSCSI) over Gigabit Ethernet ports on the Multiprotocol Services modules and Cisco MDS 9216i. The proven IETF standard IPsec capabilities offer secure authentication, data encryption for privacy, and data integrity. Internet Key Exchange version 1 (IKEv1) and IKEv2 protocols are used to dynamically set up the security associations for IPsec, using pre-shared keys for remote-side authentication.
Management Security
SAN Management Security Vulnerabilities
• SAN management threats: disruption of switch processing; compromised fabric stability; compromised data integrity and secrecy; loss of service, LUN corruption, data corruption, data theft or loss
• SAN management vulnerabilities: unsecured console access; unsecured GUI application access; unsecured API access; clear-text passwords; no audit of access attempts; privilege escalation / unintended privilege; out-of-band Ethernet management connection; lack of audit mechanisms; accidental or intentional harmful management activity
SAN Management Security Vulnerabilities In an environment where security is only as strong as the weakest link, SAN management security is often overlooked as one of the most vulnerable and dangerous points of compromise. An attack that hijacks a management session or even accidental management activity can create serious consequences that impact the integrity of the fabric and the data assets it provides. Points of SAN management security exposure include:
Unsecured Console Access
Unsecured GUI application access
Unsecured API access
Privilege escalation / unintended privilege
Lack of audit mechanisms
Securing Management Access
Data path from management client to switch:
• Encryption (secure protocols)
• Firewall
• Management VLAN
• Management VSAN
• IP ACL
• RBAC
Securing Management Access To fully secure management data paths, you need to implement security measures at all points in the data path:
Use secure protocols. SNMPv3, SSH, and SSL provide strong authentication and encrypted sessions. Disable SNMPv2, Telnet, and HTTP.
Use VPNs for remote management.
Always implement firewalls between the management network and the Internet. Intrusion Detection Systems (IDS) should also be included in the solution. In a large company, consider implementing an internal firewall to isolate the management network from the rest of the company LAN.
Use a private management VLAN to isolate management traffic.
Implement IP ACLs to restrict access to mgmt0.
Management VSANs can be configured to create a logical SAN for management traffic only.
Use role-based access control (RBAC) to restrict user permissions.
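A hedged sketch of an IP ACL restricting mgmt0 to SSH from the management subnet (addresses are placeholders, and the exact ACL syntax varies across SAN-OS releases):
switch# config terminal
! Permit SSH (TCP port 22) only from the management subnet
switch(config)# ip access-list mgmtAcl permit tcp 192.0.2.0 0.0.0.255 any eq port 22
! Apply the ACL inbound on the management interface
switch(config)# interface mgmt0
switch(config-if)# ip access-group mgmtAcl in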
Secure Management Protocols • Simple Network Management Protocol (SNMP) SNMPv3 used to communicate between switch and GUI management applications Supports encryption and authentication SNMP v1 and v2 also supported for legacy applications More than 50 MIBs supported
• Secure Shell (SSH) v2 Encrypts and authenticates traffic between switch and management station Used for CLI sessions instead of telnet SSH Host Key Pair - RSA, RSA1, DSA, or AES
Secure Management Protocols The Cisco MDS 9000 Family of switches supports an extensive SNMP facility, including traps. MDS 9000 switches use SNMPv3, which supports encryption and authentication. SNMPv1 and v2 are also supported for legacy applications. The Cisco MDS 9000 Family of switches supports over 50 SNMP MIBs, allowing secure management from both Cisco GUI management applications and third-party applications. SSHv2 (Secure Shell version 2) encrypts CLI traffic between client and MDS 9000, authenticates communication between client and host, and prevents unauthorized access. The MDS 9000 platform supports the Rivest, Shamir, and Adleman (RSA1 and RSA) and Digital Signature Algorithm (DSA) public key algorithms, along with the Advanced Encryption Standard (AES). SSH should be used instead of Telnet.
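A minimal sketch of enabling the secure protocols and creating an SNMPv3 user (key size, role, and passphrases are placeholders):
switch# config terminal
! Generate an RSA host key, enable SSH, and disable Telnet
switch(config)# ssh key rsa 1024
switch(config)# ssh server enable
switch(config)# no telnet server enable
! SNMPv3 user with SHA authentication and AES-128 privacy, mapped to the network-admin role
switch(config)# snmp-server user snmpadmin network-admin auth sha AuthPass123 priv aes-128 PrivPass456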
Role-Based Access Control (RBAC)
• RBAC allows different users and groups to be granted appropriate levels of access to management interfaces
• Predefined roles: network-operator (read only) with exec-mode file system commands, show commands, and diagnostics; network-admin (read/write) with access to all CLI commands
• Customized roles: access to subsets of CLI commands
• VSAN-based RBAC: deploy on a VSAN basis, by department or by administrative function; the Network Administrator configures and manages the overall network, while VSAN Administrators configure and manage only their own VSAN (for example, separate Email, Finance, and Engineering VSANs)
Role-Based Access Control RBAC allows different administrative users and groups to be granted different levels of access to management interfaces. Some administrators might be given read-only access to permit device monitoring, others might be given the ability to change port configurations, while only a few trusted administrators are given the ability to change fabric-wide parameters. With SAN-OS version 1.3.1 and above, customers are able to define roles on a per-VSAN basis. This enhanced granularity allows different administrators to be assigned to manage different SAN domains.
Role-Based Security Best Practices Cisco supports RBAC for MDS switches, allowing different administrative users and groups to be granted different levels of access to management interfaces. Some administrators might be given read-only access to permit device monitoring, others might be given the ability to change port configurations, while only a few trusted administrators are given the ability to change fabric-wide parameters. Users have the union of access permissions from all roles assigned to them. These roles can be assigned to either CLI or SNMP users. Two roles are predefined: network-admin and network-operator. Other roles can be created, with CLI commands enabled or blocked selectively for that particular role.
VSAN-Based RBAC With SAN-OS version 1.3.1 and higher, customers are able to define roles on a per-VSAN basis. This enhanced granularity allows different administrators to be assigned to manage different SAN domains, as defined by VSANs. A Network Administrator is responsible for overall configuration and management of the network, including platform-specific configuration, configuration of roles, and role assignment. Matching the VSANs to the existing operational structure makes it easier to match user roles to realistic groupings of operational responsibility. VSAN-based roles limit the reach of individual VSAN Administrators to the resources within their logical domain. In addition, efficient grouping of commands into roles, and assignment of roles to users, allows mapping of user accounts to practical roles, which reduces the likelihood of password sharing among operational groups.
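A hedged sketch of a custom role scoped to a single VSAN (role name, rules, and the VSAN number are placeholders):
switch# config terminal
! Role that may configure and view, but only within VSAN 201
switch(config)# role name hr-db-admin
switch(config-role)# description Administers the HR_DB VSAN only
switch(config-role)# rule 1 permit config
switch(config-role)# rule 2 permit show
! Deny all VSANs by default, then permit the one this role manages
switch(config-role)# vsan policy deny
switch(config-role-vsan)# permit vsan 201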
AAA Services Authentication User access with ID and password
Authorization Role level or set of privileges
Accounting Log of user’s management session
Centrally stored access information covers the needs of various applications: CLI login (Telnet/SSH/console/modem), SNMP (authentication and accounting), iSCSI (CHAP authentication), FC-SP (DH-CHAP authentication)
AAA Services AAA services consist of authentication, authorization, and accounting facilities for CLI.
Authentication refers to the authentication of users to access a specific device. Within the Cisco MDS 9000 Family switches, RADIUS and TACACS+ can be used to centralize the user accounts for the switches. When a user tries to log on to the switch, the switch will validate the user via information gathered from the central RADIUS or TACACS+ server.
Authorization refers to the scope of access that users receive once they have been authenticated. Assigned roles for users can be stored in a RADIUS or TACACS+ server along with a list of actual devices that each user should have access to. Once the user has been authenticated, the switch can then refer to the RADIUS or TACACS+ server to determine the extent of access the user will have within the switched network.
Accounting refers to the ability to log all commands entered by a user. These command logs are sent to the RADIUS or TACACS+ server and placed in a master log. This log can then be parsed to trace a user's activity and create usage reports or change reports. All exchanges between a RADIUS or TACACS+ server and a RADIUS or TACACS+ client switch can be encrypted using a shared key for added security.
RADIUS and TACACS+ are protocols used for the exchange of attributes or credentials between a RADIUS or TACACS+ server and a client device (management station). RADIUS and TACACS+ cover authentication, authorization, and accounting needs for various applications, including: CLI login via Telnet, SSH, console, and modem; SNMP accounting; iSCSI CHAP authentication; and FC-SP DH-CHAP authentication. Separate policies can be specified for each application. The MDS 9000 also has the ability to send RADIUS accounting records to the system log (syslog) service. The advantage of this feature is the consolidation of messages for easier parsing and correlation.
Centralizing Administration
• Use RADIUS and/or TACACS+ for: SNMP and CLI users, iSCSI CHAP, FC-CHAP
• Improved security due to central control in applying access rules
• Use redundant servers
• Connect RADIUS/TACACS+ to LDAP or Active Directory servers to centralize all accounts enterprise-wide
(Diagram: RADIUS and TACACS+ deployments serve dial/VPN servers, datacenter routers and switches, terminal servers, network management stations, and Cisco MDS 9000 Family switches; the AAA servers, for example Windows 2000 IAS RADIUS and Linux TACACS+, are backed by Microsoft AD, LDAP, and RDBMS servers.)
Centralizing Administration SAN administration must be limited to qualified and authorized individuals to assure proper configuration of the devices and the fabric. Enterprise-wide security administration is enabled through support for RADIUS servers and TACACS+ servers for the MDS 9000 family. The use of RADIUS or TACACS+ allows user accounts and roles to be applied uniformly across the enterprise, both simplifying administrative tasks and increasing security by providing centralized control for application of access rules. In addition, the switch can record management accounting information, logging each management session in a switch. These records may then be used to generate reports for troubleshooting purposes and user accountability. Accounting data can be recorded locally, on the switch itself, or by RADIUS servers. RADIUS is a standards-based protocol defined by RFC 2865 and several associated RFCs. RADIUS uses UDP for transport. TACACS+ is a Cisco client-server protocol which uses TCP (TCP port 49) for transport. The addition of TACACS+ support in SAN-OS enables the following advantages over RADIUS authentication:
The TCP transport protocol provides reliable transfers with a connection-oriented protocol.
TACACS+ provides independent, modular AAA facilities—authorization can be done without authentication.
TACACS+ encrypts the entire protocol payload between the switch and the AAA server to ensure higher data confidentiality—the RADIUS protocol only encrypts passwords.
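For illustration, pointing switch authentication at central AAA servers might look like the following (server addresses and keys are placeholders; this is a sketch, not a complete AAA design):
switch# config terminal
! Define a RADIUS server and shared secret for authentication and accounting
switch(config)# radius-server host 192.0.2.10 key RadSecret789 authentication accounting
! Enable TACACS+ and define a TACACS+ server as an alternative
switch(config)# tacacs+ enable
switch(config)# tacacs-server host 192.0.2.11 key TacSecret789
! Use the RADIUS server group for login authentication (local login typically remains the fallback if the servers are unreachable)
switch(config)# aaa authentication login default group radius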
End-to-End Security Design
Intelligent SAN Security
• Device/SAN management security via SSH, SFTP, SNMPv3, RBAC
• Secure in-transit protocols: CHAP and IPsec for iSCSI
• RADIUS or TACACS+ server for authentication
• VSANs provide secure isolation
• Hardware-based zoning via port and WWN
• Port binding and DH-CHAP
• LUN zoning and read-only zones
(Diagram: iSCSI hosts on the IP LAN connect through the MDS to the FC SAN; two FC SANs are extended over FCIP, DWDM, or SONET, with port binding and DH-CHAP applied at the fabric edges.)
End-to-End Security Design and Best Practices The MDS 9000 platform provides a full suite of intelligent security functions that when deployed enable a truly secure SAN environment.
Secure SAN management is achieved via role-based access. It includes customizable roles that apply to CLI, SNMP, and web-based access, along with full accounting support.
Secure management protocols like SSH, SFTP, and SNMPv3 ensure that outside connection attempts to the MDS 9000 network are valid and secure.
Secure switch control protocols that leverage the IPsec ESP (Encapsulating Security Payload) specification yield SAN protocol security (FC-SP). DH-CHAP authentication is used between switches and devices.
MDS 9000 support for RADIUS and TACACS+ AAA services helps to ensure user, switch, and iSCSI host authentication for the SAN.
Secure VSANs and hardware-enforced zoning restrictions using port ID and World Wide Names provide layers of device access and isolation security to the SAN.
Security measures implemented in this scenario include:
DH-CHAP capable HBAs installed in all hosts to enable authenticated fabric access
Port-mode security on all switch ports
Port security on all switch ports
Database cluster server groups use their own VSAN to provide traffic isolation
Array-based LUN Security
MDS Intelligent Security Solutions
[Figure: Evolution of security solutions, by increasing level of security, across four areas: management access (SSHv2, SNMPv3, SSL, role-based access controls (RBAC), VSAN-based RBAC, IP ACLs); traffic isolation and device access controls (VSANs, port security, fabric binding, hardware zoning, LUN zoning, read-only zones); device authorization and authentication (host/switch authentication for FC and FCIP, iSCSI CHAP authentication, MS-CHAP authentication, digital certificates, centralized AAA with RADIUS and TACACS+); and data integrity and encryption for data in motion (IPsec for iSCSI and FCIP).]
MDS Intelligent Security Solutions Cisco offers the industry’s most comprehensive set of security features in the MDS 9000 Family:
No impact on switch performance
Data path features are all hardware-based
Traditional hard and soft zoning as well as advanced LUN and Read-Only zones are available on MDS devices
Port Mode Security is an excellent way to limit unauthorized access to the fabric
Port Security binds device WWNs with one or more switch ports
DH-CHAP provides device authentication services
IPsec provides integrity and security for in-transit data
All security features are easily managed through Cisco’s Fabric Manager application
Lesson 9
Designing SAN Extension Solutions Overview In this lesson, you will learn how to effectively deploy SAN extension solutions on the MDS 9000 platform, including key applications and environments, high availability features, performance enhancements, Inter-VSAN Routing, and optical solutions.
Objectives Upon completing this lesson, you will be able to identify issues and solutions for SAN extension. This includes being able to meet these objectives:
Identify applications for SAN extension
Identify network transports for SAN extension
Explain design configurations for SAN extension over DWDM and CWDM
Define FCIP
Explain design configurations for SAN extension using FCIP
Describe the features of the MDS 9000 IP Services Modules
Explain how to build highly available FCIP configurations
Explain how IVR increases the reliability of SAN extension links
Explain how to secure extended SANs
Explain the options available for optimizing performance of low-cost FCIP transports
SAN Extension Applications: Data Backup and Restore
• Data is backed up to a remote data center
• Backup is accessible directly over the MAN/WAN
• Reduces Recovery Time Objective (RTO)
• Much faster than standard offsite vaulting (trucking in tapes)
• Ensures data integrity, reliability, and availability
• Leverages the infrastructure of existing facilities
Data Backup and Restore Remote backup is a core application for FCIP. It is sometimes known as remote vaulting. In this approach, data is backed up using standard backup applications, such as Veritas NetBackup or Legato Celestra Power, but the backup site is located at a remote location. FCIP is an ideal solution for remote backup applications because:
FCIP is relatively inexpensive compared to optical storage networking
Enterprises and Storage Service Providers (SSPs) can provide remote vaulting services using existing IP WAN infrastructures
Backup applications are sensitive to high latency, but in a properly designed SAN the application can be protected from problems with the backup process by using techniques such as snapshots and split mirrors.
Data Replication
• Data is continuously synchronized across the network
• Data can be mirrored for multiple points of access
• Enables rapid failover to the remote datacenter for 24/7 data availability
• Reduces RTO as well as Recovery Point Objective (RPO)
Data Replication The primary type of application for an FCIP implementation is a disk replication application used for business continuance or disaster recovery. Examples of this type of application include:
Array-based replication schemes such as EMC Symmetrix Remote Data Facility (SRDF), Hitachi True Copy, IBM Peer-to-Peer Remote Copy (PPRC), or HP/Compaq Data Replication Manager (DRM).
Host-based application schemes such as VERITAS Volume Replicator (VVR).
Data Replication (Cont.)
• Asynchronous and synchronous replication need transport solutions that address different levels of bandwidth and latency requirements
• Example: multi-hop replication, with synchronous replication over DWDM to a nearby site and asynchronous replication over the WAN to a more distant site
Replication applications can be run in a synchronous mode, where an acknowledgement of a disk write is not sent until the remote copy is done, or in an asynchronous mode, where disk writes are acknowledged before the remote copy is completed. Applications that are using synchronous copy replication are very sensitive to latency delays and might be subject to unacceptable performance. Customer requirements should be carefully weighed when deploying an FCIP link in a synchronous environment. FCIP can be suitable for synchronous replication when run over local Metro Ethernet or short-haul WDM transport.
SAN Extension Transports: Dark Fiber
• Dark fiber is viable over data center or campus distances
• Single-mode fiber up to 10 km at 2 Gbps
• Multimode fiber up to 300 m at 2 Gbps
• Joined switches form a single fabric
• The fabric will segment if there is a link failure, which is a disruptive event
FCIP with Write Acceleration
[Figure: Without Write Acceleration, each FCP_WRITE waits for an XFER_RDY from the remote target before FCP_DATA is sent, adding a WAN round trip; with Write Acceleration the local MDS returns XFER_RDY immediately, roughly doubling throughput (2x). The FCP_RSP still crosses the WAN once per exchange.]
FCIP with Write Acceleration The protocol for Write Acceleration differs as follows:
After the initiator issues a SCSI FCP Write, an FCP_XFER_RDY is immediately returned to the initiator by the MDS 9000.
The initiator can now immediately send data to its target across the FCIP Tunnel. The data is received by the remote MDS and buffered.
At the remote end, the target, which has no knowledge of Write Acceleration, responds with an FCP_XFER_RDY. The MDS does not allow this to pass back across the WAN.
When the remote MDS receives FCP_XFER_RDY it allows the data to flow to the target.
Finally, when all data has been received, the target issues an FCP_RSP response or status, acknowledging the end of the operation (FC Exchange).
Write Acceleration will increase write I/O throughput and reduce I/O response time in most situations, particularly as the FCIP RTT increases.
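To make the round-trip saving concrete, the following Python sketch is a rough back-of-the-envelope model (not from the course material): the 2-versus-1 round-trip counts follow the exchange described above, while the local service time and RTT values are illustrative assumptions.

# Rough model of a single SCSI write over FCIP, with and without Write Acceleration.
# Without WA: FCP_WRITE -> XFER_RDY costs one WAN round trip, then data + FCP_RSP
# costs another, so roughly 2 x RTT per write I/O. With WA the local MDS spoofs
# XFER_RDY, leaving roughly 1 x RTT. Values below are illustrative assumptions.

def write_io_time_ms(rtt_ms, round_trips, local_service_ms=1.0):
    """Approximate elapsed time for one write I/O."""
    return round_trips * rtt_ms + local_service_ms

for rtt in (10, 30, 60):
    plain = write_io_time_ms(rtt, round_trips=2)
    accel = write_io_time_ms(rtt, round_trips=1)
    print(f"RTT {rtt:3d} ms: standard FCIP ~{plain:.0f} ms/IO, "
          f"with Write Acceleration ~{accel:.0f} ms/IO")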
FCIP Tape Acceleration
• Tape drives cannot handle high WAN latencies; they cannot keep the tape streaming, which causes "shoe-shining"
• Write Acceleration alone cannot keep the tape streaming, because tape drives allow only one outstanding I/O
• Tape Acceleration is an enhancement to Write Acceleration: it spoofs the FCP response so the next write operation is not delayed, extends tape buffering onto the IPS modules, and the IPS modules act as proxy tape device and backup host
[Figure: Throughput (MB/s) versus RTT (ms) over an FCIP tunnel for standard FCIP, FCIP with Write Acceleration, and FCIP with Tape Acceleration; Tape Acceleration sustains the highest throughput as RTT increases from 0 to 100 ms.]
FCIP Tape Acceleration Increasing numbers of customers are realizing the benefits of tape backup over WAN in terms of centralizing tape libraries and maintaining central control over backups. With increasing regulatory oversight of data retention, this is becoming increasingly important.
One issue that customers often face is that tape drives have limited buffering that is often not sufficient to handle WAN latencies. Even with Write Acceleration, each drive can support only one outstanding I/O. When the tape drive writes a block, it issues an FCP_RSP status command to tell the initiator to send more data. The initiator then responds with another FCP_Write command. If the latency is too high, the tape drive won't receive the next data block in time and must stop and rewind the tape. This "shoe-shining" effect not only increases the time it takes to complete the backup job, potentially preventing it from completing within any reasonable time frame, but it also decreases the life of the tape drive.
Write Acceleration alone is not sufficient to keep the tape streaming. It halves the total RTT for an I/O, but the initiator must still wait to receive FCP_RSP before sending the next FCP_Write.
FCIP Tape Acceleration is an enhancement to Write Acceleration that extends tape buffering onto the IPS modules. The local IPS module proxies as a tape library and the remote IPS module proxies as a backup server. The local IPS sends FCP_RSP back to the host immediately after receiving each block, and data is buffered on both IPS modules to keep the tape streaming. It includes a flow control scheme to avoid overflowing the buffers, which allows the IPS to compensate for changes in WAN latencies or the tape speed.
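The single-outstanding-I/O constraint can be turned into a rough throughput estimate with the Python sketch below. It is illustrative only; the block size, round-trip counts, and link ceiling are assumptions (not figures from the Legato test results described below), and it simply shows why throughput collapses with RTT unless the FCP_RSP is spoofed locally.

# Throughput of a tape write stream limited to one outstanding I/O.
# Standard FCIP: ~2 WAN round trips per block (XFER_RDY + RSP).
# Write Acceleration: ~1 round trip (XFER_RDY spoofed locally).
# Tape Acceleration: FCP_RSP is also spoofed, so the WAN RTT drops out and
# throughput is bounded by the link/tape ceiling instead (capped here).

BLOCK_MB = 0.256          # assumed 256 KB tape block
LINK_CAP_MBPS = 30.0      # assumed WAN/tape ceiling in MB/s

def throughput(rtt_ms, round_trips):
    if round_trips == 0:
        return LINK_CAP_MBPS
    return min(LINK_CAP_MBPS, BLOCK_MB / (round_trips * rtt_ms / 1000.0))

for rtt in (10, 30, 60, 100):
    print(f"RTT {rtt:3d} ms: "
          f"std {throughput(rtt, 2):5.1f} MB/s, "
          f"WA {throughput(rtt, 1):5.1f} MB/s, "
          f"TA {throughput(rtt, 0):5.1f} MB/s")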
The graph on this slide shows the effects of Write Acceleration and Tape Acceleration. The tests were conducted with Legato Networker 7.0 running on a Dual Xeon 3Ghz CPU with 2G Memory and Windows Advanced Server 2000 with an IBM Ultrium TD2-LTO2 Tape Drive. Cisco has tested the Tape Acceleration feature with tape devices from IBM, StorageTek, ADIC, Quantum, and Sony, as well as VERITAS NetBackup, Legato Networker, and CommVault, and is currently working with CA. Backup application vendors will provide matrices of supported tape libraries and drives.
FCIP Tape Read Acceleration
• Improves tape restore performance over the WAN
• The tape doesn't stop while waiting for the next read command
• The target-side MDS pre-fetches data from the tape and caches it on the MDS
• Data is continuously streamed across the FCIP WAN link
[Figure: The backup host issues FCP_READ commands; the remote MDS pre-fetches subsequent blocks from the tape and streams FCP_DATA and FCP_RSP across the FCIP tunnel, so that data is already buffered locally when the next FCP_READ arrives.]
FCIP Tape Read Acceleration Different performance issues occur when restoring data across a WAN.
A tape drive is a sequential storage medium, so the blocks stream off the tape as the tape passes the head
The backup server issues Read Commands to the tape target device requesting a number of 512-byte SCSI blocks.
The tape starts to move, reads the data into buffers, and then stops, waiting for the next command.
Meanwhile the backup server receives the data blocks and issues a new read command for the next x blocks in sequence. The tape starts up again, reads the blocks and so on.
FCIP Tape Read Acceleration performs a read-ahead to pre-fetch the data and keep the tape moving.
Let's assume that the backup server issues a Read Command to read the first x blocks.
This command is sent to the tape; the tape starts up, reads the blocks into its buffer, and the data is sent back to the backup server.
Meanwhile, before the tape has stopped moving, the MDS at the remote site issues another Read Command to read the next x blocks in sequence into the buffer, and these blocks are sent over the FCIP tunnel to buffers in the MDS at the local data center.
When the local MDS receives a command from the backup server to read the next x blocks, it consumes the command and sends the data that it has already buffered.
By pre-fetching data and keeping the tape moving, FCIP Tape Read Acceleration will dramatically improve read performance over a WAN.
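The read-ahead behavior can be pictured as a simple pipeline: the remote MDS keeps requesting the next blocks before the host asks for them, so the host's read is usually served from the local buffer. The Python sketch below is a conceptual model only, not the actual IPS implementation; the block names and read-ahead depth are invented for illustration.

# Illustrative sketch of FCIP Tape Read Acceleration's read-ahead.
# The "remote" side pre-fetches the next blocks while earlier ones are in
# flight, so the "local" side can answer the backup server's next FCP_READ
# from its buffer instead of waiting a WAN round trip.

from collections import deque

def tape_blocks(count):
    """Stand-in for sequential blocks streaming off the tape."""
    for i in range(count):
        yield f"block-{i}"

def restore_with_read_ahead(total_blocks, read_ahead=4):
    tape = tape_blocks(total_blocks)
    local_buffer = deque()
    delivered = []
    while len(delivered) < total_blocks:
        # Remote MDS keeps the local buffer primed with pre-fetched blocks.
        while len(local_buffer) < read_ahead:
            try:
                local_buffer.append(next(tape))
            except StopIteration:
                break
        # The backup server's next read is consumed from the local buffer.
        delivered.append(local_buffer.popleft())
    return delivered

print(restore_with_read_ahead(total_blocks=6)[:3])  # ['block-0', 'block-1', 'block-2']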
Dynamic TCP Windowing
• MWS = bandwidth x RTT
• The MDS IP Services Module dynamically calculates the MWS
• The administrator sets max-available-bw and an initial RTT
• The MDS recalculates the RTT during idle time
• IPS modules support a maximum MWS of 32 MB
[Figure: Source and destination GigE ports connected across a dedicated 45 Mbps link; RTT = end-to-end latency x 2, so a 12 ms one-way time gives a 24 ms RTT.]
Dynamic TCP Windowing The TCP Maximum Window Size (MWS) is derived from the product of the maximum bandwidth x RTT x 0.9375, plus 4 KB. In SAN-OS 1.3 and higher, you cannot configure the TCP MWS directly on the MDS 9000 IP Services module. Instead, you tell the IPS the maximum bandwidth of the link (the max-available-bw parameter) and configure the initial RTT value. The MDS automatically recalculates the RTT value during idle periods, so the RTT dynamically varies according to network conditions, such as IP routing changes. The MWS value is then dynamically recalculated based on the configured bandwidth and RTT. The IPS also automatically adjusts the MWS if FCIP compression is used.
The TCP MWS can vary up to 32 MB. This allows the IPS to support long distances at gigabit speeds. On earlier versions of SAN-OS, you must configure the RTT manually.
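As a worked example of the formula above, the following Python sketch applies the 45 Mbps / 24 ms figures used on the slide; the helper name mws_bytes and the gigabit comparison are my own illustration, while the 0.9375 factor, 4 KB constant, and 32 MB cap are as stated in the text.

# MWS = max bandwidth x RTT x 0.9375 + 4 KB (as described above).

def mws_bytes(max_bw_mbps, rtt_ms):
    bdp_bytes = (max_bw_mbps * 1_000_000 / 8) * (rtt_ms / 1000.0)
    return bdp_bytes * 0.9375 + 4 * 1024

# 45 Mbps dedicated link with a 24 ms RTT (12 ms one-way), per the slide example.
print(f"{mws_bytes(45, 24) / 1024:.0f} KB")                              # ~128 KB
# A gigabit link at the same RTT needs a much larger window (capped at 32 MB).
print(f"{min(mws_bytes(1000, 24), 32 * 1024**2) / 1024 / 1024:.1f} MB")  # ~2.7 MB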
Traditional TCP Congestion Avoidance
[Figure: Quick review of traditional (simplified) TCP. Packets sent per round trip (the congestion window) grow exponentially during slow start (2x packets per RTT), giving low throughput during this period; above the slow start threshold, growth becomes linear congestion avoidance (+1 cwnd per ACK); on packet loss the cwnd is halved, the retransmission signals congestion, and the slow start threshold is adjusted. The x-axis shows round trips 1 through 32.]
Traditional TCP Congestion Avoidance This diagram shows how the traditional window-sizing mechanism in TCP relates to the congestion window (cwnd) and RTT. This mechanism was designed for handling small numbers of packets in a lossy network. When the network becomes congested, packets are dropped. Traditional TCP has a tendency to overreact to packet drops by halving the TCP window size. The resulting reduction in speed that can occur in traditional TCP implementations is unacceptable to many storage applications.
TCP Packet Shaping
• The administrator must configure the min-available-bw parameter to enable the packet shaper and determine the "aggressiveness" of the recovery
[Figure: Packets sent per round trip (congestion window) over round trips 1 through 15: the shaper is engaged during the first RTT at min-available-bw, the slow start threshold is initialized to 95% of the maximum window size, the cwnd reaches 95% of MWS after one RTT, congestion avoidance then adds 2 cwnd per RTT up to the MWS, and on retransmission the minimum threshold is min-available-bw.]
TCP Packet Shaping The IPS module implements a modified TCP windowing algorithm called a packet shaper. When you configure a Gigabit Ethernet port, you specify the minimum available bandwidth, and TCP can then use this value as the slow start threshold. The packet shaper ramps up to this threshold within 1 RTT. From there, the TCP stack uses linear congestion avoidance, increasing throughput at the rate of 2 segments per RTT until the maximum window size is reached.
When congestion occurs, the MDS 9000 TCP implementation is more aggressive during recovery than traditional TCP: the congestion window drops to the min-available-bandwidth value rather than being halved. The degree of aggressiveness during recovery is therefore proportional to the min-available-bandwidth configuration.
Note that if conventional TCP traffic shares the same link with FCIP, the conventional TCP flows recover more slowly, and the bandwidth allocation will then strongly favor the FCIP traffic. To make FCIP behave more fairly, use a lower min-available-bandwidth value to force FCIP to start at a lower rate. Use the min-available-bandwidth parameter to determine how aggressively FCIP should behave.
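The difference in recovery behavior can be sketched in a few lines of Python. This is a simplification under assumed units (the sending rate stands in for the congestion window, and the 100/60 Mbps values are invented for illustration); it only contrasts the recovery floor, not the full ramp-up.

# After a packet loss on a 100 Mbps link (illustrative numbers):
# traditional TCP halves its sending rate, while the MDS packet shaper
# (as described above) restarts from the configured min-available-bandwidth
# floor and ramps back toward the maximum window within roughly one RTT.

MAX_BW = 100   # assumed max-available-bandwidth, Mbps
MIN_BW = 60    # assumed min-available-bandwidth, Mbps

def rate_after_loss(current_mbps, shaper=False):
    """Sending rate immediately after a loss event (simplified)."""
    return MIN_BW if shaper else current_mbps / 2

print("traditional TCP  :", rate_after_loss(MAX_BW), "Mbps")                # 50.0 Mbps
print("MDS packet shaper:", rate_after_loss(MAX_BW, shaper=True), "Mbps")   # 60 Mbps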
FCIP Tuning Hassles
• To achieve the desired throughput, a number of parameters need to be tuned: TCP parameters (maximum and minimum available bandwidth for the packet shaper, round-trip time), the number of outstanding I/Os for the application, and the SCSI transfer size
• How is this usually done? With standard traffic generation tools (e.g., Iometer), which requires testing with real hosts and targets
FCIP Tuning Hassles To maximize throughput on FCIP links, a number of parameters need to be tuned, including:
TCP parameters (max bandwidth, round-trip time RTT)
Number of outstanding I/Os for the application
SCSI transfer size
To determine these parameters, users need to use standard traffic generation tools like IOMeter to generate test data and measure response time and throughput. This requires test hosts and test targets, and must be redone every time the environment changes.
SAN Extension Tuner
• Assists in tuning by generating various SCSI traffic workloads
• Built into the IPS port
• Creates a virtual N-port on the IPS port that can act as both initiator and target, with user-specified I/O size, transfer size, and number of concurrent I/Os; can simulate targets that do multiple xfer-rdys for large write commands
• Measures throughput and response time per I/O over the FCIP tunnels
[Figure: Virtual N-ports (for example 10:00:00:00:00:00:00:01 and 11:00:00:00:00:00:00:03) on Gigabit Ethernet ports at either end of an FCIP tunnel across the WAN/MAN.]
SAN Extension Tuner The SAN Extension Tuner (SET) is a lightweight tool built into the IPS port itself to assist in tuning by generating various SCSI traffic workloads. The SET creates a virtual N_Port on the IPS port that can act as both initiator and target and mimics SCSI read/write commands. The user can specify the SCSI transfer size and number of outstanding I/Os, and can simulate targets that do multiple FCP_XFER_RDYs for large write commands. The SET measures throughput and I/O latency over the FCIP tunnels, and determines the optimal number of concurrent I/Os for maximum throughput. FICON is not supported.
MDS Performance Advantages
[Figure: Primary and backup data centers connected across the WAN/MAN by MDS switches with 14+2 modules, combining I/O performance features (Tape and Write Acceleration, packet shaper), WAN bandwidth utilization (compression), security (IPsec encryption), and traffic management (IVR, SAN Extension Tuner).]
MDS Performance Advantages The MDS platform provides three key features that are designed to squeeze the most performance out of cost-effective IP WANs:
Compression options add to implementation flexibility by allowing bandwidth to be used more effectively. Designed specifically to enable customers to leverage sub-gigabit transports in SAN-OS 1.3, compression can scale to gigabit speeds with SAN-OS 2.0 and the new 14+2 line card.
Write Acceleration increases performance by spoofing the SCSI XFER_READY command to reduce round-trips and lower latency. This feature can double the usable distance without increasing latency. For applications that allow few outstanding I/Os, like tape backup, Write Acceleration can double the effective throughput.
An optimized TCP MWS stack keeps the pipe full by dynamically recalculating the MWS based on changing conditions, and by implementing a packet shaping algorithm to allow fast TCP starts.
Lesson 10
Building iSCSI Solutions Overview In this lesson, you will learn how to effectively deploy iSCSI solutions on the MDS 9000 platform, including key applications and environments, high availability features, security features, and deployment considerations.
Objectives Upon completing this lesson, you will be able to explain how iSCSI can be used to enable migration of mid-range applications to the SAN. This includes being able to meet these objectives:
Explain the problems that iSCSI is designed to solve
Describe the iSCSI protocol
Describe how iSCSI is implemented on the MDS 9000 IP Services Modules
Explain how to deploy iSCSI effectively
Explain how to configure high availability for iSCSI
Explain how to secure iSCSI environments
Explain how to simplify management of iSCSI environments with target discovery
Explain where Wide Area File Services (WAFS) is effective
What's the Problem? Distributed Storage
Problem: Customers want to consolidate storage
• Distributed storage is difficult to manage
• As storage devices increase, backup windows increase
• The data center may have extra capacity that isn't being utilized
[Figure: Workgroup servers with error-prone, unreliable backups and exceeded backup windows contrast with a data center that has extra tape and disk capacity.]
Distributed Storage At a corporate headquarters, how is backup accomplished? In an environment where DAS storage dominates, someone has to load and collect tapes for each device, which easily constitutes a storage management nightmare. Backup windows can easily be exceeded, and normal operations can be affected as a result, delaying the opening of the business day. This is a growing problem for many businesses today. At the same time, the data center may have a good storage management scheme and applications already in place, as well as unallocated disk space. What is needed is a way to connect these distributed workgroup servers to the data center.
Branch Offices
Problem: Branch offices located at greater distances
• Lack of resources to manage storage
• Inconsistent backup at each site
• Compliance with data security and retention regulations
• E.g., banks, schools, clinics
[Figure: Branch offices with too few management resources and unmanaged backups connect to the data center, creating regulatory compliance issues.]
Branch Offices Branch offices can also pose a storage management issue for the enterprise. With typically too few management resources to manage storage at remote sites, backups are conducted on an ad hoc basis, often leaving the company out of compliance with data security and retention regulations.
Mid-Range Applications
Need a cost-effective SAN solution for mid-range applications
• Mid-range apps have low bandwidth requirements, typically 10-20 MB/s average
• Can't utilize FC resources: 2 Gb FC with 15 MB/s per port is only 7.5% bandwidth utilization
• Have higher latency tolerance
• FC attachment is costly
• Typical uses: web server farms, application server farms, branch offices
[Figure: Today's datacenter with N-tier applications (web, application, database, and mainframe servers) behind IP switching, security, and application optimization layers (firewall, content switch, cache, IDS, SSL), attached through an MDS 9500 to RAID and tape in the storage network.]
Mid-Range Applications While FC SANs have dramatically increased operational efficiency for high-end application storage, the high cost of FC has prevented these benefits from migrating down to mid-range applications. Mid-range applications don’t need the same high levels of bandwidth and low levels of latency as high-end applications, so it is often difficult to achieve ROI in a reasonable timeframe by implementing FC for mid-range applications. As a result, many applications in the enterprise, such as file, web, and messaging servers, are managed separately, either via DAS or NAS, keeping management costs high. At the same time, the customer’s investment in FC SANs is not fully realized. Inside the data center, there are a number of different tiers of servers. Two of those tiers are web server farms and application server farms. These servers are typically numerous, yet have low bandwidth requirements and can tolerate higher amounts of latency than database servers. It is often not considered cost-effective to migrate these servers to FC SANs. Assuming 2Gb FC ports, with each host sustaining an average of 15 MBps per port, only 7.5% of the available bandwidth is being utilized.
iSCSI Overview: What is iSCSI?
• internet Small Computer Systems Interface (iSCSI)
• SCSI transport protocol carried over TCP/IP
• Encapsulates SCSI commands and data into IP packets
• TCP is the underlying network layer transport, providing congestion control and in-order delivery of error-free data
• Allows iSCSI hosts to access native iSCSI targets
• Allows iSCSI hosts to access FC SAN storage targets via a gateway
• Provides seamless integration of mid-range servers into the SAN
• Can use standard Ethernet NICs or iSCSI HBAs
[Figure: iSCSI encapsulation: Ethernet header (18 bytes), IP header (20 bytes), TCP header (20 bytes), iSCSI header (48 bytes), then SCSI commands and data.]
What is iSCSI? Internet Small Computer Systems Interface (iSCSI) is a transport protocol that operates on top of TCP and encapsulates SCSI-level commands and data into a TCP/IP byte stream. It is a means of transporting SCSI packets over TCP/IP, providing an interoperable solution that can take advantage of existing IP-based infrastructures and management facilities, and that addresses distance limitations. Mapping SCSI I/O over TCP ensures that high-volume storage transfers have in-order delivery and error-free data with congestion control. This allows IP hosts to gain access to previously isolated Fibre Channel based storage targets. iSCSI is an end-to-end protocol with human-readable SCSI device (node) naming. It includes base components such as IPsec connectivity security, authentication for access configuration, discovery of iSCSI nodes, a process for remote boot, and iSCSI MIB standards. The iSCSI protocol was defined by an IP Storage Working Group within the Internet Engineering Task Force (IETF). Version 20 of the specification was recently approved by the Internet Engineering Steering Group (IESG).
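Using the header sizes from the slide above, the following Python sketch works out the protocol overhead per Ethernet frame; the 1500-byte MTU and the jumbo-frame figure are assumptions added for illustration.

# Per-frame overhead of iSCSI over TCP/IP over Ethernet, using the header
# sizes shown above: Ethernet 18 B, IP 20 B, TCP 20 B, iSCSI header 48 B.

HEADERS = {"ethernet": 18, "ip": 20, "tcp": 20, "iscsi": 48}
OVERHEAD = sum(HEADERS.values())          # 106 bytes

def payload_efficiency(mtu_bytes):
    """Fraction of the Ethernet frame left for SCSI commands and data."""
    payload = mtu_bytes - (HEADERS["ip"] + HEADERS["tcp"] + HEADERS["iscsi"])
    return payload / (mtu_bytes + HEADERS["ethernet"])

print(f"total headers: {OVERHEAD} bytes")
print(f"1500-byte MTU : {payload_efficiency(1500):.1%} payload")   # ~93%
print(f"9000-byte jumbo: {payload_efficiency(9000):.1%} payload")  # ~99%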
Advantages of iSCSI
• Cost-effective technology for connecting low-end and mid-range servers, clients, and storage devices
• Enables iSCSI hosts to communicate with iSCSI storage
• Enables iSCSI hosts to communicate with FC storage, through a gateway
• Builds on SCSI and TCP/IP technology, leveraging the benefits of IP: knowledge and skills, infrastructure, security tools, QoS and traffic engineering, network management tools, R&D investment, ubiquitous access
Advantages of iSCSI iSCSI leverages existing IP networks. Users can therefore benefit from their experience with IP as well as the industry’s experience with IP technologies. This includes:
Economies from using a standard IP infrastructure, products, and service across the organization
Experienced IP staff to install and operate these networks. With minimal additional training it is expected that IP staff in remote locations can maintain iSCSI based servers.
Management tools already exist for IP networks, which reduces the need to learn new tools or protocols. Traffic across the IP network can be secured using standards-based solutions such as IPsec. QoS is used to ensure that SAN traffic is not affected by the potentially unreliable nature of IP. QoS exists today in the IP infrastructure and can be applied end to end across the IP network to give SAN traffic priority over other, less time-sensitive traffic on the network.
iSCSI is compatible with existing IP LAN and WAN infrastructures. iSCSI devices support an Ethernet or Gigabit Ethernet interface to connect to standard LAN infrastructures.
The Software-Based iSCSI Model
• iSCSI is a network service enabled through the use of an iSCSI software driver and optional hardware
• The internal TCP/IP stack consumes CPU resources during data transfer
• Error handling is performed by the driver, consuming even more CPU resources
• Inexpensive solution
[Figure: Host software stack: applications and file system over a block device; the iSCSI software driver sits alongside the generic SCSI layer and uses the host TCP/IP stack and NIC driver to reach the NIC adapter, in contrast to the adapter driver and SCSI adapter path.]
The Software-Based iSCSI Model iSCSI drivers are normally free and provide a very low cost solution for customers that do not require high performance or low latency. The iSCSI driver performs all SCSI processing, TCP/IP processing, and error recovery. An iSCSI driver running on a 1 GHz CPU spends nearly 95% of its CPU cycles moving data through a NIC at 1 Gbps. However, host CPUs nowadays run at around 3 GHz, so they have more processing power and use only around 30% of their CPU cycles moving data at 1 Gbps, leaving the remaining 70% for running applications and the operating system.
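The CPU figures quoted above follow from a simple proportional model, sketched in Python below. This is a rough approximation of my own (it assumes a fixed cycle budget per gigabit of traffic and ignores memory bandwidth and other real-world effects).

# If moving 1 Gbps of iSCSI traffic in software costs a roughly fixed number of
# CPU cycles per second, utilization scales inversely with the clock rate.
# ~95% busy on a 1 GHz CPU implies roughly 0.95 GHz worth of cycles.

cycles_needed_ghz = 0.95 * 1.0        # from the 1 GHz / 95% figure above

def utilization(cpu_ghz):
    return cycles_needed_ghz / cpu_ghz

print(f"1 GHz CPU: {utilization(1.0):.0%}")   # 95%
print(f"3 GHz CPU: {utilization(3.0):.0%}")   # ~32%, close to the ~30% quoted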
TCP and iSCSI Offload Engines
• Hardware implementation of iSCSI within a specialized NIC
• Offloads TCP and iSCSI processing into hardware: full offload is iSCSI and TCP offload (iSCSI HBA); partial offload is TCP offload only (TOE)
• Relieves host CPU resources from iSCSI and TCP processing
• Does not necessarily increase performance; it helps only if the CPU is busy
• Wire-rate iSCSI performance, useful only when the host must support high sustained loads
[Figure: Host stack with dedicated offload hardware: the TOE adapter takes over the TCP/IP stack below the iSCSI and generic SCSI layers, alongside the traditional adapter driver and SCSI adapter path.]
TCP and iSCSI Offload Engines Some NICs have a TCP Offload Engine (TOE) to offload TCP/IP processing from the host CPU and reduce the CPU load. Some cards have partial offload and some have full offload.
Partial Offload TOE cards offload TCP/IP processing to the TOE but pass all errors (packet loss) to the driver running on the host CPU. In a lossy network, partial offload TOEs may perform worse than before.
Full Offload TOE cards offload both TCP/IP processing and error recovery to the TOE card. The host CPU is still responsible for SCSI and iSCSI processing.
iSCSI HBAs offload TCP/IP processing and iSCSI processing to co-processors and custom ASICs on the iSCSI HBA. Although relatively expensive, iSCSI HBAs provide lower latency and higher throughput than iSCSI software drivers or TOE cards. Now that host CPUs have more processing power, it is usually more cost-effective to use software iSCSI drivers than NICs with TOEs.
iSCSI Drivers and Offload Engines
• Different approaches to iSCSI initiators: an iSCSI driver with a standard network card, a NIC with a TCP offload engine (TOE), or an HBA that offloads both TCP and iSCSI
[Figure: Three protocol stacks compared: with a standard NIC, SCSI, iSCSI, TCP, and IP are all processed in the server; with a TOE, TCP and IP move into hardware; with an iSCSI HBA, iSCSI, TCP, and IP are all processed in hardware below the apps/file system and SCSI layers.]
iSCSI Drivers and Offload Engines iSCSI drivers running on the host perform all SCSI and iSCSI processing using the host CPU through the standard NIC. As I/O loads increase, the host consumes more CPU cycles and struggles to deliver throughput. In a congested IP network where packets are frequently discarded, the TCP stack running on the host CPU must also recover lost packets. As described in the previous section, some NICs have a TCP Offload Engine (TOE) to offload TCP/IP processing from the host CPU; partial-offload TOEs still leave error recovery to the host driver, while full-offload TOEs handle it in hardware.
To achieve maximum performance, it is necessary to offload TCP/IP processing, iSCSI processing, and error recovery from the host CPU onto the iSCSI HBA. The host is still responsible for SCSI processing.
iSCSI Concepts
• Network Entity: an iSCSI initiator or iSCSI target
• iSCSI Node: identified by an iSCSI node name; the initiator node is the host, the target node is the storage, and a target node contains one or more LUNs
• Network Portal: identified by an IP address and subnet mask; provides network access over TCP/IP (Ethernet, wireless, etc.)
[Figure: An initiator network entity containing an iSCSI node and network portal communicates across the IP network with target network entities, each containing network portals and iSCSI nodes.]
iSCSI Concepts SCSI standards define a client server relationship between the SCSI Initiator and the SCSI Target. iSCSI standards define these as the Network Entity. The iSCSI Network Entity contains an iSCSI Node which is either the Initiator or Target. iSCSI Nodes are identified by an iSCSI Node Name. If the Target Node is a storage array, it may contain one or more SCSI LUNs. iSCSI Initiator Nodes communicate with iSCSI Target Nodes through Network Portals. Network Portals connect to the IP network and are identified by an IP Address. It is worth noting that Network portals can also be wireless ethernet ports.
iSCSI Node Names
• iSCSI node names are associated with iSCSI nodes, not adapters; they are human-readable strings of up to 255 bytes (UTF-8 encoding) used for iSCSI login and target discovery
• iSCSI name types:
IQN (iSCSI Qualified Name), a unique identifier assigned by a naming authority, e.g. iqn.1987-05.com.cisco.storage.backup.server1, where the date is the yyyy-mm when the domain was acquired and the naming authority's domain name appears reversed
EUI (extended unique identifier, IEEE EUI-64), e.g. eui.0200123456789abc, a unique identifier assigned by the manufacturer
iSCSI Node Names Every iSCSI Node is identified by an iSCSI Node Name in one of two formats:
iqn: iSCSI Qualified Name, up to 255 bytes, human readable UTF-8 encoded string
eui: Extended Unique Identifier, 8 byte hexadecimal number defined and allocated by IEEE
Although both formats can be used, typically the iSCSI driver will use the iqn format and the eui format will be used by manufacturers of native iSCSI devices.
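The naming rules above can be exercised with a small helper sketch in Python. It is illustrative only: the function names are my own, and the checks are deliberately looser than the full iSCSI naming rules in the specification.

import re

def make_iqn(year_month, reversed_domain, suffix):
    """Compose an iSCSI Qualified Name, e.g. iqn.1987-05.com.cisco.storage.backup.server1."""
    return f"iqn.{year_month}.{reversed_domain}.{suffix}"

def looks_like_iscsi_name(name):
    """Very loose check: iqn.<yyyy-mm>.<reversed-domain>... or eui.<16 hex digits>."""
    if len(name.encode("utf-8")) > 255:
        return False
    iqn = re.match(r"^iqn\.\d{4}-\d{2}\.[a-z0-9.-]+", name)
    eui = re.match(r"^eui\.[0-9a-fA-F]{16}$", name)
    return bool(iqn or eui)

print(make_iqn("1987-05", "com.cisco", "storage.backup.server1"))
print(looks_like_iscsi_name("eui.0200123456789abc"))   # True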
MDS 9000 IP Services Modules: Native iSCSI Deployment
• iSCSI hosts can communicate directly with native iSCSI storage devices over an IP network through a standard Ethernet NIC
• iSCSI servers can be fitted with iSCSI HBAs to offload iSCSI and TCP processing from the host CPU
• Most NAS filers now support iSCSI as well as the NFS and CIFS protocols
• iSCSI is most suitable for hosts running applications that are not latency sensitive and have a low throughput requirement
[Figure: iSCSI hosts with NICs and iSCSI servers with iSCSI HBAs reach a NAS filer and native iSCSI storage across the IP network.]
Native iSCSI Deployment iSCSI hosts can communicate directly with native iSCSI storage devices over an IP network through a standard Ethernet NIC or through wireless. As the data load increases, the host CPU spends more time processing iSCSI and moving data byte by byte. iSCSI servers can be fitted with iSCSI HBAs to offload iSCSI and TCP processing from the host CPU. More and more mid-range native iSCSI storage arrays are coming onto the market, making native iSCSI deployment an inexpensive reality. Most NAS filers now support block I/O through iSCSI as well as file I/O through the NFS and CIFS protocols. iSCSI is suitable for hosts running applications that are not latency sensitive and do not have a large throughput requirement.
iSCSI Gateways
• iSCSI gateways allow iSCSI hosts and servers to communicate with Fibre Channel storage devices
• The MDS 9216i, MPS 14+2, and IPS line cards all provide iSCSI gateways
• iSCSI is provided for free in the standard license
[Figure: iSCSI hosts with NICs and iSCSI servers with iSCSI HBAs cross the IP network to an iSCSI gateway on the FC SAN, reaching FC storage alongside FC servers, a NAS filer, and native iSCSI storage.]
iSCSI Gateways Most enterprises already have data centers with Fibre Channel SANs and FC storage arrays, but these cannot be accessed directly from iSCSI hosts. iSCSI gateways allow iSCSI hosts and servers to communicate with Fibre Channel storage devices. The Cisco MDS 9216i, MPS 14+2, and IPS line cards all provide an iSCSI-to-FC gateway function. iSCSI is provided for free on MDS switches in the standard license.
Standalone Router Implementations
• The standalone router-based approach has been the most common so far
• Separate management interfaces
• Separate sets of security policies
• Less highly available
[Figure: iSCSI hosts reach FC-attached servers and storage through standalone iSCSI gateways/routers.]
Standalone Router Implementations Although some vendors now offer native iSCSI storage, most iSCSI implementations today use a gateway or router-based approach that makes FC storage available to iSCSI hosts. Typical iSCSI gateway/router implementations are appliance-based or are small multiprotocol switches with a handful of ports. The Cisco SN5428-2 was an example of this approach, as a standalone workgroup SAN switch that provided FC-to-iSCSI routing. Although this approach has some viable applications, like small companies and remote offices, it has scalability issues in the datacenter:
This approach requires implementing a new set of devices (at least two devices for high availability), and possibly adding more devices if more network capacity is needed.
It means separate management interfaces, and, even worse, separate security policies.
It is also typically less highly available, because the high availability hardware features that one expects in a data center SAN switch are often not viable for a small, low-cost router product.
It is potentially better for WAN-based branch offices. Gateways are not a good fit for metro-based branch offices, such as schools, clinics, and banks.
Integrated iSCSI
• Multiprotocol SAN switch
• Single SAN fabric
• Single management interface
• Single set of security policies
• Tightly integrated
• Highly available
• Designed for the data center and backup from remote offices
[Figure: FC-attached hosts and iSCSI hosts connect to MDS 9500s with IPS modules, forming a single fabric.]
Integrated iSCSI The Cisco iSCSI solution for data centers is the IP Services (IPS) Module series for the Cisco MDS 9000 platform. This approach integrates iSCSI and FC (along with FCIP and FICON) into a single multiprotocol SAN switch. This provides higher availability because iSCSI is supported on the highly available MDS 9000 platform. This provides a single management interface, a single point of control for security, and unifies iSCSI and FC storage into a single SAN fabric. This approach is designed to meet the availability, manageability and scalability requirements of the data center.
MDS iSCSI Gateway Function
[Figure: On the Ethernet side, an iSCSI initiator (iSCSI HBA) sends iSCSI messages framed as Ethernet (18 bytes), IP header (20 bytes), TCP header (20 bytes), iSCSI header (48 bytes), and SCSI commands and data. The IPS line card in the MDS presents an iSCSI virtual target that exchanges SCSI Command, R2T, Data, and Response messages with the initiator, and an FC virtual initiator that exchanges SCSI Command, XFER_RDY, Data, and Status FC frames (SOF, 24-byte FC header, SCSI commands and data up to 2048 bytes, CRC, EOF) with the FC target.]
MDS iSCSI Gateway Function The iSCSI Gateway function is included within the MDS 9216i, MPS 14+2 and IPS-8 linecard. The iSCSI Gateway provides a virtual iSCSI Target that communicates with the iSCSI Initiator. iSCSI messages are received, the SCSI commands and data are extracted and passed to a virtual FC Initiator which builds a FC frame around the payload. The virtual FC Initiator in the iSCSI Gateway communicates with the FC Target in the storage array or Tape drive. The iSCSI Gateway function is performed in ASICs to minimize latency and provide maximum throughput.
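The virtual-target/virtual-initiator relationship can be pictured as a simple lookup: the gateway keeps a table mapping each advertised iSCSI target name to the pWWN of the real FC target, strips the iSCSI framing from an incoming command, and forwards the SCSI payload on the FC side. The Python sketch below is a conceptual model only; the names, WWN, and data structures are invented for illustration, and the real gateway performs this translation in ASICs.

# Conceptual model of the iSCSI-to-FC gateway mapping described above.
# Hypothetical names and values; the real IPS does this in hardware.

VIRTUAL_TARGETS = {
    # iSCSI virtual target name       -> pWWN of the physical FC target port
    "iqn.1987-05.com.cisco:vt.disk1": "21:00:00:20:37:6f:db:dd",
}

def forward_iscsi_command(target_iqn, scsi_cdb, data=b""):
    """Strip the iSCSI framing and hand the SCSI payload to the FC virtual initiator."""
    fc_pwwn = VIRTUAL_TARGETS[target_iqn]          # virtual target lookup
    fc_frame = {"dest_pwwn": fc_pwwn, "cdb": scsi_cdb, "payload": data}
    return fc_frame                                 # would be sent as FC frames

print(forward_iscsi_command("iqn.1987-05.com.cisco:vt.disk1", scsi_cdb=b"\x2a"))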
IPS Modules: iSCSI Features
• iSCSI initiator to FC target mapping
• High availability options: VRRP and pWWN aliasing, with options for mid-range and high-end apps
• Security: RADIUS support, IP ACLs, VSANs and VLANs, integrated FC and iSCSI zoning, IPsec
• Ease of deployment: dynamic initiator and target discovery, proxy initiators, iSNS server
• Single management interface for FC and IP
[Figure: iSCSI-enabled IP hosts with iSCSI drivers connect through a Catalyst 6500 and the IP network to the Cisco IPS module in an MDS 9000, which attaches to the FC fabric and FC storage.]
IPS Modules – iSCSI Features The IPS modules support mapping of iSCSI initiators (hosts) to FC targets (storage). They provide initiator and target discovery and LUN mapping to simplify deployment, and integrate iSCSI and FC security policies by supporting iSCSI initiator membership in VSANs and zones. Unlike FC hosts, iSCSI hosts can belong to multiple VSANs. CHAP authentication is supported, with centralized account management via RADIUS. The IPS modules support a range of HA features for both mid-range and high-end storage, including VRRP, iSCSI trespass, proxy initiators, and Ethernet PortChannels. Cisco provides network boot drivers that work with the IPS module to support this key data center application. All MDS management services are supported on both FC and IP interfaces. Cisco Fabric Manager and Device Manager are used to manage the IPS modules. This provides a single management interface for the entire storage network.
When to Deploy iSCSI: iSCSI Fan-In, Scenario 1
Scenario 1: Few hosts, moderate bandwidth
• 30 hosts x 50 MB/s = 1500 MB/s of aggregate demand on either the iSCSI or the FC side
• iSCSI: 2:1 fan-in, 2 cards x 8 ports x 100 MB/s = 1600 MB/s
• FC: 60 device connections, 2 cards x 8 quads x 250 MB/s = 4000 MB/s
• Relative cost: iSCSI $, FC $
iSCSI Fan-In Scenario 1 For smaller applications requiring few ports, FC can be as cost-effective as iSCSI if there is an existing FC infrastructure. In the scenario shown here, with 30 ports requiring 50 MB/s per port, it would be somewhat less expensive to use iSCSI when the cost of FC HBAs is factored in, but the cost difference will not be that significant. Because each host needs 50 MB/s, the fan-in ratio of hosts to iSCSI ports is only 2:1. In addition, the two 32-port FC modules would provide more than twice the required bandwidth, which would allow for growth in I/O requirements.
iSCSI Fan-In, Scenario 2
Scenario 2: Many hosts, low bandwidth
• 100 hosts x 15 MB/s = 1500 MB/s of aggregate demand
• iSCSI: roughly 6:1 fan-in, still only 2 IP cards required
• FC: 200 device connections, 8 FC cards plus 200 HBA ports required
• Relative cost: iSCSI $, FC $$$$
iSCSI Fan-In Scenario 2 iSCSI is most cost-effective with high fan-in. In this scenario, the total bandwidth requirement is still 1500MB/s, but this time there are 100 hosts that each need 15MB/s of bandwidth. A FC-only solution would require eight 32-port FC modules (four 32-port modules per fabric in a redundant fabric configuration), and would also require 200 HBA ports. However, due to 6.25:1 fan-in across the IP network, you would only need 2 IPS blades, and host connectivity could be provided by standard Gigabit Ethernet (or even 100Base-T) NICs. In this scenario, it would be far more cost-effective to use iSCSI.
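The arithmetic behind the two scenarios can be captured in a small Python helper. The per-port rate of ~100 MB/s for GigE is an approximation, so the resulting port counts differ slightly from the 2-card (16-port) figures on the slides; the host counts and per-host rates come from the scenarios above.

# Fan-in arithmetic for the two scenarios above.
# A GigE IPS port is treated as roughly 100 MB/s of usable throughput.

import math

def scenario(name, hosts, mb_per_host, gige_mbps=100):
    total = hosts * mb_per_host
    ports = math.ceil(total / gige_mbps)
    print(f"{name}: {total} MB/s aggregate, ~{ports} GigE ports, "
          f"fan-in ~{hosts / ports:.1f} hosts per port")

scenario("Scenario 1 (30 hosts x 50 MB/s)", 30, 50)    # ~15 ports, ~2:1 fan-in
scenario("Scenario 2 (100 hosts x 15 MB/s)", 100, 15)  # ~15 ports, ~6.7:1 fan-in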
iSCSI Fan-In Ratios
• Fan-in/fan-out ratios are an important aspect of optimal SAN designs
• High fan-in ratios make IP SANs very cost-effective
• Typical iSCSI fan-in: 10:1 to 20:1 (10 MB/s to 20 MB/s per host)
• Typical FC fan-in: 4:1 to 10:1 (20 MB/s to 50 MB/s per host)
[Figure: Many iSCSI hosts fanning in to a single FC-attached port.]
It is desirable to have high fan-in ratios in an IP SAN design, in part because they are more cost-effective, and in part because of the low port density of IP gateways and line cards.
Cost-Effective DAS Consolidation
• The MDS IP line card provides seamless iSCSI / FC SAN integration
• iSCSI-enabled hosts capitalize on the existing IP infrastructure investment
• Less than half the cost of FC attachment
• A line card upgrade protects the MDS chassis investment
• FC SAN resources can be fully utilized
• Common management infrastructure
[Figure: Large numbers of iSCSI-enabled IP hosts connect through a Catalyst 6500 to the Cisco IPS module in an MDS 9509, joining FC-attached hosts, backup assets, and the FC SAN.]
Cost-Effective DAS Consolidation iSCSI is an ideal solution for many mid-range and low-end applications. It is a low-cost transport that leverages the existing investment in IP infrastructure. The MDS 9000 IP Services (IPS) line cards integrate iSCSI into the core SAN, allowing iSCSI hosts to utilize existing FC storage resources. The IPS line cards provide line-rate Gigabit Ethernet (GigE) performance, allowing high fan-in ratios (more iSCSI hosts per GigE port) and reducing the cost per host. The IPS provides options for both low-cost and high-end multipathing, providing a range of high-availability solutions to suit the needs of different applications. Lastly, operational management for iSCSI storage is integrated with FC storage management, instead of being isolated on separate boxes.
High-Availability iSCSI Configurations: GigE Interfaces
• Each GigE port supports three FCIP interfaces and an iSCSI interface
• An IPS-8 can support up to 24 FCIP tunnels plus iSCSI concurrently
• Each iSCSI interface will support approximately 200 connections
• GigE ports can be joined in an Ethernet PortChannel for HA, between odd/even pairs (1-2, 3-4, 5-6, 7-8)
[Figure: Each GigE port carries an FCIP profile with three FCIP interfaces plus an iSCSI interface; paired GigE ports form an Ethernet PortChannel into the IP network.]
GigE Interfaces Each GigE port supports three FCIP interfaces and an iSCSI interface simultaneously sharing 1 Gbps of bandwidth. Tests have shown that each iSCSI interface will support up to 200 iSCSI connections, although it is worth noting that all iSCSI hosts would share the same 1 Gbps of bandwidth. GigE ports can be joined using an Ethernet PortChannel for high availability. On the IPS-8 and MPS 14+2 line cards, odd/even pairs of ports share the same SiByte ASIC and resources.
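These capacity figures reduce to simple arithmetic, sketched in Python below; treating 1 Gbps as roughly 100 MB/s of usable throughput is an approximation of my own, while the port, tunnel, and connection counts come from the text above.

# Interface capacity of an IPS-8 line card, per the figures above.
GIGE_PORTS = 8
FCIP_PER_PORT = 3
ISCSI_CONNECTIONS_PER_INTERFACE = 200
USABLE_MBPS_PER_PORT = 100          # rough usable throughput of 1 Gbps

print("FCIP tunnels per IPS-8:", GIGE_PORTS * FCIP_PER_PORT)            # 24
print("iSCSI connections per card:", GIGE_PORTS * ISCSI_CONNECTIONS_PER_INTERFACE)
print("per-connection share if all 200 are busy:",
      round(USABLE_MBPS_PER_PORT / ISCSI_CONNECTIONS_PER_INTERFACE, 2), "MB/s")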
Low-End HA Design Topology
• Low-cost design
• VRRP provides redundancy across IPS modules
• pWWN aliasing provides redundancy on the FC side
• A single NIC is a single point of failure
• No load balancing
[Figure: A host with adapter teaming (client-side data network omitted) connects over iSCSI to two IPS modules using VRRP, which reach FC storage via pWWN aliasing.]
Low-End HA Design Topology In cost-sensitive environments, Cisco MDS 9000 Family features can be used to provide redundancy for iSCSI sessions. One of these features is the Virtual Router Redundancy Protocol (VRRP). VRRP provides redundant router gateway services whereby, should a Gigabit Ethernet port on the IPS module fail, another Gigabit Ethernet port on a redundant IPS module resumes the iSCSI service and continues to provide access for affected iSCSI sessions. Another feature provided by the IPS module is pWWN aliasing, which provides recovery capability on the Fibre Channel end of the solution. Using pWWN aliasing, failover capability is provided to a redundant FC port in the event of a failure on the active FC port that is connected to the actual physical storage target. A requirement for this solution is that both FC ports must have access to the same LUNs and provide redundant paths to the physical storage residing on the FC SAN. Not all storage arrays provide the required active-active LUN capability across multiple storage subsystem interfaces. You should consult your storage vendor for details on this feature.
High-End HA Design Topology
• Highly available, redundant fabric design
• Host multipathing software provides fully redundant paths
• Active-active load balancing
• Ethernet PortChannel added for link-level redundancy on the IPS module
[Figure: A host running multipathing software connects over iSCSI via Ethernet PortChannels to redundant VSANs and on to FC storage.]
High-End HA Design Topology Many Fibre Channel SAN designers believe that the highest levels of redundancy and availability are achieved through the use of redundant fabric design topologies, which provide pure isolation from fabric-wide disruptions. IP SAN designs can of course provide the same levels of redundancy and availability as Fibre Channel based SANs. With the Cisco family of switches, one can also implement this fabric isolation using VLAN and VSAN capabilities. Furthermore, when redundant fabric designs are combined with director-class switches, the levels of fault tolerance and availability are even higher. At the upper end of the HA design spectrum for IP SANs, redundant fabric designs are combined with multipathing software to provide active-active load balancing and nearly instantaneous failover in the event of a component failure. The use of Ethernet PortChannels can further enhance network resiliency by providing link-level redundancy in the event of a port failure on an MDS IPS module.
VRRP
• Two Gigabit Ethernet ports are in a VRRP group with one virtual IP address
• If the active VRRP port fails, the peer reconnects to the same virtual IP address across the second port
• Provides front-end redundancy
[Figure: iSCSI hosts iqn.host-1 and iqn.host-2 reach the FC SAN across the IP network through a VRRP virtual IP address (10.1.1.1) shared by two IPS ports.]
VRRP The Virtual Router Redundancy Protocol (VRRP) is a router-based protocol that dynamically handles redundant paths, making failures transparent to applications. Two ports are placed into a VRRP group that is assigned a single virtual IP address. The external router connects to the IPS via the virtual IP address. This enables transparent failover of an iSCSI volume from one IPS port to another IPS port, either locally or on another Cisco MDS 9000 Family switch. VRRP provides redundancy in front of the MDS switch but can take up to 20 seconds to fail over.
pWWN Aliasing
• Provides back-end redundancy
• Each FC storage port is mapped to a virtual iSCSI target
• pWWN aliasing maps a secondary pWWN to the same virtual target
• The trespass feature for mid-range storage arrays exports LUNs from the active to the passive port
[Figure: iSCSI hosts iqn.host-1 and iqn.host-2 access virtual targets iqn.disk-1 and iqn.disk-2 across the IP network; the virtual targets map to primary and secondary storage pWWNs P1 and P2 in the FC SAN.]
pWWN Aliasing Virtual iSCSI targets can be associated with a secondary pWWN on the FC target. This can be used when the physical Fibre Channel target is configured to have a LUN visible across redundant ports. When the active port fails, the secondary port becomes active and the iSCSI session switches to use the new active port. iSCSI transparently switches to using the secondary port without impacting the iSCSI host. All other I/Os are terminated with a check condition status and the host retries the I/O. If both the primary and secondary pWWNs are available, then both pWWNs can be used, and each session may use either pWWN.
For mid-range storage arrays, the trespass feature is available to enable the export of LUNs, on an active port failure, from the active to the passive port of a statically imported iSCSI target. In physical Fibre Channel targets that are configured to have LUNs visible over two Fibre Channel N-ports, when the active port fails, the passive port takes over. However, some physical Fibre Channel targets require that a trespass command be issued to export the LUNs from the active port to the passive port. When the active port fails, the passive port becomes active, and if the trespass feature is enabled, the MDS issues a trespass command to the target to export the LUNs on the new active port. The iSCSI session switches to use the new active port and the exported LUNs are accessed over the new active port.
pWWN aliasing and trespass provide redundancy behind the MDS switch.
Host-to-Storage Multipathing
• A redundant I/O design with multipathing software is a best practice: error detection, dynamic failover and recovery; active/active or active/passive operation; transparent to applications on the server
[Figure: A host with multiple (iSCSI) NICs and multipathing software connects through Ethernet switches to MDS 9000 redundant fabrics and an FC storage array with redundant controller ports.]
Host to Storage Multipathing Multipath storage products can provide a large spectrum of features and functions that affect the performance, availability, accessibility, configurability, and serviceability of the storage subsystem and system I/O. Due to the cost impact of redundancy, and stringent network requirements, administrators may choose to implement redundancy at only one component level. At each individual component level there must be robust management and monitoring techniques built into the network so the switchover can occur with minimal downtime. In a typical multipathing implementation, each path may traverse separate fabrics to complete the connection between initiator and target. Failure anywhere in a chosen path can cause a failover event to occur. Thus multipathing software must provide proactive monitoring and fast fail-over should an existing utilized path fail. During a failure event, it is important for either the network recovery mechanisms to maintain access to all devices (targets, LUNs) or the multipathing implementation to recognize and recover from any loss in connectivity. The key objective for redundancy design is to maintain access at the application layers and minimize any outages. A combination of multipathing software and iSCSI network redundancy is used to ensure true application layer protection.
iSCSI Server Load Balancing (iSLB)
• Create a pool of IPS ports
• Load balance servers to ports from the pool
[Figure: iSCSI servers connecting through IPS ports to FC arrays, shown without and with iSLB]
Without iSLB:
• Manually configure the iSCSI configuration on multiple switches
• Static assignment of hosts to IPS ports, with active/backup redundancy
• Manually zone iSCSI host WWNs with FC target WWNs
With iSLB:
• CFS automatically distributes the iSCSI configuration to multiple switches
• iSLB provides dynamic load distribution with active/active redundancy
• Simplified zoning by automating setup of iSCSI-specific attributes
iSCSI Server Load Balancing

iSLB provides dynamic load distribution across pooled Gigabit Ethernet ports on different line cards in the same MDS switch. This removes the requirement to manually assign iSCSI hosts to IPS ports or to manually zone iSCSI hosts with FC targets.
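To make the idea concrete, the sketch below distributes initiators across a pooled set of IPS ports using a simple least-loaded rule. This illustrates the concept of iSLB only; it is not the SAN-OS algorithm, and the port names are hypothetical.

```python
# Conceptual sketch of distributing iSCSI initiators across a pool of IPS GigE ports.
# This illustrates the idea of iSLB only; it is not the SAN-OS implementation.

def assign_initiators(initiators, port_pool):
    """Assign each initiator to the currently least-loaded port in the pool."""
    load = {port: 0 for port in port_pool}
    assignment = {}
    for iqn in initiators:
        port = min(load, key=load.get)      # least-loaded port wins
        assignment[iqn] = port
        load[port] += 1
    return assignment


hosts = [f"iqn.example.host-{n}" for n in range(1, 7)]
pool = ["gige2/1", "gige2/2", "gige3/1"]    # ports pooled across line cards
for iqn, port in assign_initiators(hosts, pool).items():
    print(f"{iqn} -> {port}")
```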
iSCSI Security: MDS 9000 Bridges Security Domains for IP SANs
• IP domain: VLANs, CHAP, ACLs, IPSec
• FC domain: VSANs, zoning, port security
• Management domain: SNMP, AAA, RBAC, SSH
Multiple levels of security
iSCSI Access Control

When considering security in IP SANs, it is important to consider the overall picture. IP SANs touch several overlapping security domains, and may therefore require the use of several security mechanisms, including:

IP domain – VLANs, ACLs, CHAP, IPSec

Management domain – AAA, SNMP, RBAC, SSH

Fibre Channel domain – VSANs, zoning, port security

The MDS 9000 family of switches provides the security features, intelligence capabilities, and processing capacity needed to bridge these security domains. While it is not a requirement to implement all of these security features, it is a recommended best practice to implement multiple levels of security. For example, iSCSI CHAP authentication is not required, but it can be used in combination with FC-based zoning to create a more secure IP SAN.
Centralized Security
• Local RADIUS server on MDS
• Centralized AAA services via RADIUS and TACACS+ servers
• Single AAA database for:
  – iSCSI CHAP authentication
  – FC-CHAP authentication
  – CLI/SNMP accounts (RBAC)
  – SNMPv3
[Figure: FC servers, iSCSI servers, FC targets, and a management server authenticating through the MDS against a central RADIUS server]
Centralized Security

The MDS 9000 platform provides centralized AAA services by supporting RADIUS and TACACS+ servers. With iSCSI, RADIUS can be used to implement a single, highly available AAA database for the following (a sketch of the underlying CHAP calculation follows this list):

iSCSI CHAP authentication

FC-CHAP authentication

CLI/SNMP accounts (RBAC)

SNMPv3
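The sketch below shows the CHAP calculation (RFC 1994) that a RADIUS or TACACS+ server would verify during iSCSI CHAP authentication: the response is an MD5 digest of the identifier, the shared secret, and the challenge. The secret shown here is only an example.

```python
# Sketch of the CHAP calculation (RFC 1994) used by iSCSI CHAP authentication:
# response = MD5(identifier || shared-secret || challenge).
import hashlib
import os

def chap_response(identifier, secret, challenge):
    return hashlib.md5(bytes([identifier]) + secret + challenge).digest()

secret = b"example-shared-secret"           # known to the initiator and the AAA server
challenge = os.urandom(16)                  # sent by the authenticator
identifier = 1

resp = chap_response(identifier, secret, challenge)           # computed by the initiator
assert resp == chap_response(identifier, secret, challenge)   # recomputed and checked by AAA
print("CHAP response:", resp.hex())
```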
iSCSI Access Control Model
• IPS module supports IP-based access controls:
  – IP ACLs
  – VLAN trunking
  – CHAP authentication for iSCSI initiators
• iSCSI virtual targets provide additional iSCSI access controls:
  – Advertise targets on specific interfaces
  – Permit access to specific iSCSI initiators
[Figure: a virtual target advertised only on interface gig 5/3 and permitting access only to iqn.host1, with CHAP authentication of the initiator]
iSCSI Access Control Model

During an iSCSI login, both the iSCSI initiator and target have the option to authenticate each other. By default, the IPS module allows either CHAP authentication or no authentication from iSCSI hosts. CHAP authentication can be enabled globally for all IPS module interfaces, or on a per-interface basis.

You can control access to each statically mapped iSCSI target by specifying a list of IPS ports on which it will be advertised and a list of iSCSI initiator node names allowed to access it. By default, iSCSI targets are advertised on all Gigabit Ethernet interfaces, subinterfaces, PortChannel interfaces, and PortChannel subinterfaces. By default, static virtual iSCSI targets are not accessible to any iSCSI host; you must explicitly configure accessibility before any host can access a virtual iSCSI target. The initiator access list can contain one or more initiators. Each initiator is identified by one of the following (a sketch of these checks follows the list):
iSCSI node name
IP address and subnet
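A minimal sketch of the two checks just described, advertising a static virtual target only on listed interfaces and permitting only listed initiators (by node name or by IP subnet), is shown below; the class and the values are hypothetical.

```python
# Minimal sketch (hypothetical names) of the two access checks described above:
# advertise the virtual target only on listed interfaces, and permit only listed initiators.
import ipaddress

class StaticVirtualTarget:
    def __init__(self, iqn, advertise_interfaces, permitted_initiators):
        self.iqn = iqn
        self.advertise_interfaces = set(advertise_interfaces)   # e.g. {"gige5/3"}
        self.permitted_initiators = permitted_initiators        # node names or subnets

    def is_advertised_on(self, interface):
        return interface in self.advertise_interfaces

    def permits(self, initiator_name, initiator_ip):
        for entry in self.permitted_initiators:
            if entry == initiator_name:
                return True
            try:
                if ipaddress.ip_address(initiator_ip) in ipaddress.ip_network(entry):
                    return True
            except ValueError:
                continue                      # entry was a node name, not a subnet
        return False


vt = StaticVirtualTarget("iqn.example.disk-1",
                         advertise_interfaces=["gige5/3"],
                         permitted_initiators=["iqn.example.host1", "10.1.1.0/24"])
print(vt.is_advertised_on("gige5/3"))                       # True
print(vt.permits("iqn.example.host2", "10.1.1.20"))         # True (matches the subnet)
print(vt.permits("iqn.example.host3", "192.168.9.9"))       # False
```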
FC Access Control Model
• By default, all iSCSI initiators belong to the port VSAN of their iSCSI interface (VSAN 1)
• iSCSI initiators can be assigned to VSANs by pWWN
• iSCSI initiators can belong to multiple VSANs
[Figure: an iSCSI host on VLAN_10 represented as a virtual FC host (pWWN) in VSAN_10]
FC Access Control Model

The iSCSI specifications do not define VSANs, and iSCSI hosts know nothing about VSANs. However, the MDS 9000 extends these concepts from the Fibre Channel domain into the iSCSI domain, providing an inherent transparency to both protocols.

By default, iSCSI initiators are members of the port VSAN of their iSCSI interface, which defaults to VSAN 1. The port VSAN of an iSCSI interface can be modified. iSCSI initiators can be members of more than one VSAN; the IPS module creates one Fibre Channel virtual N_Port in each VSAN to which the host belongs.
Zoning iSCSI Initiators
• VSANs and zones are FC access control mechanisms
• MDS 9000 extends VSANs and zoning into the iSCSI domain
• iSCSI initiator access is subject to VSAN and zoning rules
[Figure: an iSCSI host on VLAN_10 appearing as a virtual FC host (identified by IQN or IP address) in iSCSI_Zone1 within VSAN_10]
Zoning iSCSI Initiators

Zoning is a Fibre Channel access control mechanism for devices within a SAN, or in the case of the MDS 9000, within a VSAN. The MDS 9000 zoning implementation extends the VSAN and zoning concepts from the Fibre Channel domain to also cover the iSCSI domain. This extension includes both iSCSI and Fibre Channel features and provides uniform, flexible access control across a SAN.

iSCSI initiators are subject to the rules and enforcement of VSANs and zoning. By default, dynamically mapped iSCSI initiators are placed in VSAN 1. If the default zone policy in VSAN 1 is set to permit, it would be possible for iSCSI initiators to access any un-zoned targets in VSAN 1. Generally speaking, setting the default zone policy to permit is not recommended.
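The zoning rule can be sketched as a simple membership test: two members may communicate only if they share a zone in the active zoneset, and otherwise the default-zone policy decides. This is a conceptual illustration, not the MDS enforcement code, and the member names are hypothetical.

```python
# Sketch of the zoning rule described above: two members may communicate only if they
# share a zone in the active zoneset of their VSAN; otherwise the default-zone policy applies.

def zoned_together(member_a, member_b, active_zoneset, default_policy="deny"):
    for zone in active_zoneset.values():
        if member_a in zone and member_b in zone:
            return True
    return default_policy == "permit"       # un-zoned members fall back to the default zone


vsan10_zoneset = {
    "iSCSI_Zone1": {"iqn.example.host1", "pwwn:50:00:00:00:00:00:00:01"},
}
print(zoned_together("iqn.example.host1",
                     "pwwn:50:00:00:00:00:00:00:01", vsan10_zoneset))   # True
print(zoned_together("iqn.example.host2",
                     "pwwn:50:00:00:00:00:00:00:01", vsan10_zoneset))   # False with default deny
```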
IPSec
• IPSec for secure VPN tunnels:
  – Authentication and encryption
  – Site-to-site VPNs for FCIP tunnels
  – Site-to-site VPNs for iSCSI connections
• Hardware-based IPSec on the 14+2 module
[Figure: a site-to-site VPN protecting an iSCSI IP SAN, and a site-to-site VPN protecting an FCIP interconnect between FC fabrics]
IPSec

The IPSec protocol creates secure tunnels between a pair of hosts, between a pair of gateways, or between a gateway and a host. IPSec supports session-level and packet-level authentication using algorithms such as MD5, SHA-1, DES, and 3DES. Session-level authentication ensures that devices are authorized to communicate and verifies that devices "are who they say they are," while packet-level authentication ensures that data has not been altered in transit. Applications for IPSec VPNs in the SAN include the following (an illustration of packet-level authentication follows this list):
Site-to-site VPNs for FCIP SAN interconnects
Site-to-site VPNs for IP SANs (iSCSI hosts accessing remote FC storage)
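As an illustration of packet-level authentication only (not an IPSec implementation), the sketch below computes a keyed hash over each packet so the receiver can detect any alteration in transit; the key name and payload are hypothetical.

```python
# Illustration only (not an IPSec implementation): packet-level authentication verifies
# that data was not altered in transit by checking a keyed hash (HMAC) over each packet.
import hmac
import hashlib

key = b"negotiated-session-key"             # agreed during session setup (hypothetical value)

def protect(payload):
    """Return the payload plus an authentication tag computed with the shared key."""
    return payload, hmac.new(key, payload, hashlib.sha1).digest()

def verify(payload, tag):
    """Recompute the tag and compare; a mismatch means the packet was modified."""
    return hmac.compare_digest(tag, hmac.new(key, payload, hashlib.sha1).digest())


packet, tag = protect(b"SCSI write data")
print(verify(packet, tag))                  # True: packet arrived unmodified
print(verify(packet + b"tampered", tag))    # False: integrity check fails
```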
VLANs
• Use VLANs to secure data paths at the edge of the IP network
• VLAN-to-VSAN mapping
• Private VLANs
[Figure: iSCSI hosts on an iSCSI VLAN mapped to an iSCSI VSAN, and FCIP traffic on an FCIP VLAN mapped to an FCIP VSAN]
VLANs

Within each data center or remote site, VLANs can be used to provide dedicated paths for IP storage traffic. VLANs can be used to:
Protect iSCSI traffic along the data path from the hosts to the SAN fabric
Provide dedicated paths for FC extension over FCIP by extending VLANs from the SAN fabric to edge routers
In addition to providing security, using VLANs to isolate iSCSI and FCIP data paths enhances the network administrator’s ability to provide dedicated bandwidth to SAN devices and allows more effective application of QoS parameters.
iSCSI Target Discovery
• iSCSI target discovery:
  – Uses the iSCSI SendTargets command to query the target
  – When there are few devices and the target IP address is known, use static configuration (point-to-point)
  – In larger iSCSI designs with multiple IP connections, use iSNS (Internet Storage Name Service) or SLPv2 (Service Location Protocol)
• The SCSI protocol is still used for LUN discovery
[Figure: an iSCSI host queries across the IP network and is returned the target names iqn.your.target.disk1 and iqn.your.target.disk2]
iSCSI Target Discovery

The goal of iSCSI discovery is to allow an initiator to find the targets to which it has access, and at least one address at which each target may be accessed. Ideally, this should be done with as little configuration as possible. The iSCSI discovery mechanisms deal only with target discovery; the SCSI protocol is used for LUN discovery.

In order to establish an iSCSI session with an iSCSI target, the initiator needs the target's IP address, TCP port number, and iSCSI target name. The goal of the iSCSI discovery mechanisms is to provide low-overhead support for small iSCSI setups and scalable discovery solutions for large enterprise setups. There are therefore several methods for finding targets, ranging from configuring a list of targets and addresses on each initiator and doing no discovery at all, to configuring nothing on each initiator and allowing the initiator to discover targets dynamically. There are currently three basic ways to allow iSCSI host systems to discover the presence of iSCSI target storage controllers:
Static configuration
iSCSI SendTargets command
“Zero configuration” methods such as the Service Location Protocol (SLPv2) and/or the Internet Storage Name Service (iSNS)
The diagram above shows the SendTargets method, which is most often used today with simple iSCSI solutions. Going forward, an iSNS server can be used to scale iSCSI target discovery; SAN-OS 2.0 includes the iSNS server component.
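For reference, a SendTargets response is a list of TargetName and TargetAddress text keys (RFC 3720). The sketch below parses such a response into a table of targets and their portals; the target names and addresses are illustrative.

```python
# Sketch: a SendTargets response is a list of text key=value pairs (RFC 3720);
# each TargetName may be followed by one or more TargetAddress keys. Values are illustrative.

def parse_send_targets(response_text):
    targets, current = {}, None
    for line in response_text.splitlines():
        key, _, value = line.partition("=")
        if key == "TargetName":
            current = value
            targets[current] = []
        elif key == "TargetAddress" and current:
            targets[current].append(value)
    return targets


response = "\n".join([
    "TargetName=iqn.your.target.disk1",
    "TargetAddress=10.1.1.10:3260,1",
    "TargetName=iqn.your.target.disk2",
    "TargetAddress=10.1.1.10:3260,1",
])
for name, portals in parse_send_targets(response).items():
    print(name, "->", portals)
```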
iSNS Client and Server Support
• iSNS server is integrated into the MDS; an external iSNS server is also supported via the MDS iSNS client
• Enables an integrated solution to configure and manage both Fibre Channel and iSCSI devices:
  – Device registration, discovery, and state change notification
  – Discovery Domains mapped to FC zones
  – Discovery Domain Sets mapped to FC zonesets
  – No need for dual access-control configuration
  – Distributed, highly available solution
[Figure: iSCSI servers discovering FC targets either through the MDS-integrated iSNS server or through an external iSNS server populated by the MDS iSNS client]
iSNS Client and Server Support

Zero-configuration discovery methods allow iSCSI hosts to discover accessible targets without requiring an administrator to explicitly configure each host to point to its targets. iSNS is rapidly becoming the industry-standard zero-configuration protocol for iSCSI environments and is supported by the Microsoft iSCSI client. iSNS provides name services for iSCSI and iFCP SANs.

Without iSNS, each iSCSI host must be configured to point to each target portal; this can be very time-consuming and error-prone in a large deployment. With iSNS, iSCSI target devices register their addresses and attributes with a central iSNS server. Initiators can then query the iSNS server to identify accessible targets. iSNS also includes a state change notification protocol that notifies iSCSI devices when the list of accessible targets changes.

With native iSCSI storage, each target is a separate portal. The MDS 9000 IPS module acts as a portal for all virtual FC targets configured on that switch. This means that host configuration is relatively simple if you have only one IPS module, but it becomes increasingly complex as more IPS modules are added. When native iSCSI targets are added to the mix, iSNS is even more essential for scaling the iSCSI deployment.

The Cisco IPS module includes both iSNS client and iSNS server support. If an external iSNS server such as the Microsoft iSNS server is used, the MDS registers all virtual iSCSI targets with the external iSNS server. If the MDS iSNS server is used, iSCSI hosts discover targets by querying the iSNS server in the MDS switch. The iSNS databases are distributed and synchronized in a multi-switch fabric.

iSNS also supports Discovery Domains (DDs) and Discovery Domain Sets (DDSs), which are similar to zones and zonesets. One advantage of the MDS iSNS server over other iSNS servers is that the MDS automatically maps the active zoneset to the active iSNS DDS, eliminating the need for dual access-control configuration.
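Conceptually, an iSNS server is a registration and query service scoped by discovery domains. The sketch below is a toy model of that behavior, not the iSNS wire protocol; all names and portals are hypothetical.

```python
# Conceptual sketch (not the iSNS wire protocol) of what an iSNS server provides:
# targets register, initiators query, and discovery domains scope what each initiator sees.

class TinyIsnsServer:
    def __init__(self):
        self.registrations = {}             # iqn -> portal (ip:port)
        self.discovery_domains = {}         # domain name -> set of member iqn's

    def register(self, iqn, portal):
        self.registrations[iqn] = portal

    def add_to_domain(self, dd, *members):
        self.discovery_domains.setdefault(dd, set()).update(members)

    def query(self, initiator_iqn):
        """Return only the targets that share a discovery domain with the initiator."""
        visible = set()
        for members in self.discovery_domains.values():
            if initiator_iqn in members:
                visible |= members - {initiator_iqn}
        return {iqn: self.registrations[iqn] for iqn in visible if iqn in self.registrations}


isns = TinyIsnsServer()
isns.register("iqn.example.disk-1", "10.1.1.10:3260")
isns.register("iqn.example.disk-2", "10.1.1.11:3260")
isns.add_to_domain("DD_zone1", "iqn.example.host-1", "iqn.example.disk-1")
print(isns.query("iqn.example.host-1"))    # host-1 sees only disk-1, mirroring zone membership
```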
Wide Area File Services: Typical Enterprise
• Data protection risks
• Regulatory compliance issues
• Management challenges
• High costs: $20k-$30k/yr per branch
[Figure: branch, regional, and remote offices, each with its own IT staff, backup, NAS, and DAS ("islands of storage"), connected over the WAN to the data center]
Typical Enterprise

In a typical enterprise environment, several branch offices connect to the data center over the WAN. Each branch office is responsible for data protection and backup of critical data, which raises regulatory compliance concerns. Each branch office also requires local technical support and management of its infrastructure, leading to high deployment costs.
WAFS Solution and Data Migration
• Data migrated to the data center
• Reduced IT management costs
• Deploy WAAS at each office
[Figure: branch, regional, and remote offices with WAAS appliances; files and backup consolidated at the data center over the WAN]
WAFS Solution and Data Migration

To solve management issues and reduce IT management costs, data is migrated to the data center. WAAS appliances that provide WAFS services are deployed at each branch office and at the data center to provide access to files from each branch office.
Centralization and Consolidation
• Storage consolidated in the data center
• Centralized IT management and backup strategy
• Files cached in WAAS and locally accessed
• WAAS Manager (web-based)
[Figure: branch, regional, and remote offices accessing files cached by local WAAS appliances; storage, backup, and a WAAS cluster centralized in the data center]
Centralization and Consolidation

By consolidating storage in the data center, data can be centrally managed and backed up. However, without WAFS, file access from each branch office would be slow. The WAAS appliance caches files locally in each branch office, providing much faster access and improved performance.
Data Flows in the Data Center
• SCSI is a block I/O protocol
• NFS and CIFS are file I/O protocols
• The file system maps files to blocks, to access files on the storage device using block I/O
[Figure: Windows (CIFS) and UNIX (NFS) clients on the LAN reaching a NAS filer, a NAS head (protocol conversion and file system only, no local storage), an iSCSI gateway to the FC SAN, iSCSI storage, an FC storage array, FC tape, and an FC application server; markers show where the file system maps files to blocks in each path]
Data Flows in the Data Center

Many different file I/O and block I/O protocols are used throughout the data center. File I/O protocols like NFS and CIFS are used to transfer files between clients and NAS filers across the LAN.
Unix clients use NFS
Windows clients use CIFS
Block I/O protocols like SCSI are used to transfer blocks between SCSI Initiators and SCSI Targets.
iSCSI is used to transport SCSI commands and data across the LAN
Fibre Channel is used to transport SCSI commands and data across the SAN
The file system is a table that maps files to blocks. The data center is a complex environment with many different file I/O and block I/O protocols used to transfer data to and from storage devices. In this environment it is important to understand where the data is located and where the file system is located; a small sketch after the list below illustrates the file-to-block mapping.

NAS filers connect to the LAN and have their own file system and local storage. They respond to file I/O protocols like NFS and CIFS.

A NAS head is a NAS filer without local storage. NAS heads bridge the LAN and the SAN: they respond to file I/O protocols and map these requests through the file system to the block I/O protocols used to access FC storage on the SAN.

An iSCSI gateway allows iSCSI hosts on the LAN to access FC storage on the SAN. Note that the file system is now on the iSCSI host.
An FC application server responds to file I/O requests from clients on the LAN and retrieves data using block I/O from FC storage LUNs across the SAN. In this case, the FC application server contains the file system.
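The file-to-block mapping referred to above can be pictured as a small table lookup: file I/O names a file, the file system translates the name into block numbers, and block I/O fetches those blocks. The sketch below is purely illustrative; the block size, file names, and contents are made up.

```python
# Minimal sketch of the file system's role as a table mapping files to blocks:
# file I/O ("read this file") is translated into block I/O ("read these block numbers").

BLOCK_SIZE = 4096

# A toy block device: block number -> block contents.
blocks = {17: b"A" * BLOCK_SIZE, 18: b"B" * BLOCK_SIZE, 42: b"C" * 100}

# A toy file system table: file name -> the blocks that hold it, in order.
file_table = {"payroll.xls": [17, 18, 42]}

def read_file(name):
    """File I/O on top: look up the file, then issue block I/O for each mapped block."""
    return b"".join(blocks[bn] for bn in file_table[name])

data = read_file("payroll.xls")
print(len(data))                            # 2 full blocks + 100 bytes = 8292 bytes
```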
Data Flows across the WAN
• CIFS is a very chatty protocol: more than 1300 round-trip transactions to/from the file system to load a 1 MB file
• As the distance between client and file system increases, latency increases and files take longer to load
[Figure: Windows clients at a branch office accessing a NAS filer in the data center over the WAN via CIFS, UNIX clients accessing it via NFS, and an iSCSI host at a branch office accessing iSCSI storage in the data center]
Data Flows across the WAN

In a data center environment, distances are short, so latencies are relatively low. When the Windows client is located across the WAN at some distance from its server, latencies increase dramatically and files take longer to load. CIFS is a notoriously chatty protocol: over 1300 round-trip transactions take place between the client and the file system in the NAS filer just to load a 1 MB file.
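A quick back-of-the-envelope calculation shows why the round trips dominate. Assuming the 1300 round trips quoted above and an illustrative 80 ms WAN RTT, latency alone accounts for on the order of a hundred seconds if the transactions are serialized; actual times are lower because some requests overlap, but the point stands that RTT, not bandwidth, is the bottleneck.

```python
# Back-of-the-envelope check of the numbers above: with ~1300 round trips and an 80 ms
# WAN RTT, round-trip latency alone dominates the time to load a 1 MB file over CIFS.

round_trips = 1300
wan_rtt = 0.080                             # seconds (80 ms WAN, illustrative)
lan_rtt = 0.0005                            # ~0.5 ms on a local LAN (illustrative)

print(f"WAN: ~{round_trips * wan_rtt:.0f} s of round-trip latency")   # ~104 s if serialized
print(f"LAN: ~{round_trips * lan_rtt:.1f} s of round-trip latency")   # well under a second
```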
Cisco Wide-Area Application Services
• Unified management across all layers
• L7 application optimization: video, web, file services, local services, content distribution, and other applications
• L4 transport optimization: Data Redundancy Elimination (DRE), TCP Flow Optimizations (TFO), application classification and policy engine
• Network infrastructure: logical and physical integration, security, monitoring, quality of service, core routing and switching services
Cisco Wide Area Application Services (WAAS)

Cisco Wide-Area Application Services (WAAS) is a powerful combination of the Cisco Wide Area Application Engines (WAE) and integrated network modules. WAAS includes, and is the replacement for, Wide-Area File Services (WAFS). WAAS offers distributed enterprises with multiple branch offices the benefits of centralized infrastructure and simple remote access to applications, storage, and content.

Cisco WAAS includes best-in-class protocol optimizations, caching, content distribution, and streaming media technologies. The technology overcomes the bandwidth and latency limitations associated with TCP/IP and client-server protocols and allows you to consolidate distributed servers and storage into centrally managed data centers, while offering LAN-like access to remote users. Benefits include the following (a sketch of the data redundancy elimination idea follows this list):
Reduce TCO and improve asset management through centralized rather than distributed infrastructure.

Improve data management and protection by keeping a master copy of all files and content at the data center.

Improve the ability to meet regulatory compliance objectives.

Raise employee productivity by providing faster access to shared information.

Protect investment in existing WAN deployments.
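One of the transport optimizations named above, Data Redundancy Elimination, can be illustrated with a chunk-and-hash sketch: content the peer has already seen is replaced by a short signature so it crosses the WAN only once. This is a conceptual illustration, not Cisco's DRE implementation; the chunk size and data are arbitrary.

```python
# Conceptual sketch of data redundancy elimination (DRE): the sender replaces chunks the
# peer has already seen with short signatures, so repeated content crosses the WAN only once.
import hashlib

peer_cache = set()                          # signatures the far side already holds

def dre_encode(data, chunk_size=256):
    out = []
    for i in range(0, len(data), chunk_size):
        chunk = data[i:i + chunk_size]
        sig = hashlib.sha1(chunk).digest()
        if sig in peer_cache:
            out.append(("ref", sig))        # 20-byte reference instead of the chunk
        else:
            peer_cache.add(sig)
            out.append(("raw", chunk))      # first sight: send the chunk itself
    return out


first = dre_encode(b"quarterly report " * 100)
second = dre_encode(b"quarterly report " * 100)       # same file sent again
print(sum(1 for kind, _ in first if kind == "raw"),
      sum(1 for kind, _ in second if kind == "raw"))  # second pass sends nothing raw
```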
WAFS Performance
[Charts: time to open and time to save a 1 MB Word file and a 2 MB Excel file (seconds, T1 link, 80 ms RTT), comparing Native WAN, Cisco FE, and Native LAN]
Cisco WAAS shows 5x to 12x faster performance compared with the native WAN, and performance similar to the LAN, for typical operations on Office applications.
WAFS Performance

The charts above compare file open and file save times for a WAFS-enabled site versus a site with direct WAN access. Even when a file is not cached on the local Wide Area Application Engine, the WAFS performance enhancements consume roughly one-third of the WAN resources that a native WAN request for the same file would.

Note: All graphs and statistics are examples only; actual performance will vary depending on network design, server design, and application design.