DS4000 Maintenance Skill
IBM TotalStorage™
DS4000 series skill transfer
© 2009 IBM Corporation
Agenda
DS4000 series hardware introduction
DS4000 troubleshooting
DS4000 hardware maintenance
DS4000 firmware package
DS4000 data collection
DS4000 material
July 1, 2007
DS4000 hardware introduction
Product list
DS4300 (FAStT600)
DS4500/DS4400 (FAStT900/FAStT700)
DS4700
DS4800
DS4300 (FAStT600)
Base mode
– Up to two 2 Gbps hot-swappable RAID controllers with 512 MB of battery-backed cache (256 MB per controller).
– Support for up to three IBM TotalStorage DS4000 EXP700/EXP710 Expansion Units.
– Support for one storage partition in the standard configuration, with an option to expand to 4, 8, or 16 storage partitions.
Turbo mode
– Cache increased from 256 MB per controller on the base DS4300 to 1 GB per controller on Turbo.
– Support for up to seven IBM TotalStorage EXP710 Expansion Units. EXP810 enclosures can also be used behind the DS4300.
– The host interface on the base DS4300 is 2 Gbps; Turbo auto-senses to connect at 1 Gbps or 2 Gbps.
– Eight storage partitions standard, with an upgrade to 16 or 64.
DS4300 (FAStT600)
Green Power LED:
This LED indicates that the DC power status is OK.
Amber General-System-Fault LED:
When a storage server component fails (such as a disk drive, fan, or power supply), this LED will be on.
DS4300 (FAStT600)
Host loop LED (green):
This LED should be on, which means that the host connection loop is good. If it is off, one of the following problems might have occurred:
– The host loop is down, not turned on, or not connected.
– An SFP has failed, or the host port is not occupied.
– The RAID controller circuitry has failed, or the RAID controller has no power.
Cache activity LED (green):
This LED is on when data is in cache. If it is off, one of the following situations has occurred:
– There is no data in cache.
– The cache option is not selected for the array.
– The cache memory has failed, or the battery has failed.
Battery charged LED (green):
Normally, this LED should be on. If it is off, it indicates a battery fault. The LED blinks while the battery is charging or performing a self-test.
Expansion port bypass LED (amber):
This LED is on if nothing is plugged into the expansion port, or if the expansion unit is powering off.
Expansion loop link LED (green):
This LED is on when the drive-side Fibre Channel loop is operating normally.
Host side connection
Drive side connection
DS4500 (FAStT900)
– Dual, redundant 2 Gbps RAID controllers with 2 GB of Rambus cache memory (1 GB per RAID controller).
– The data in the cache is protected by battery backup for at least seven days.
– Supports up to sixteen EXP100 or EXP710 enclosures, or up to fourteen EXP810 enclosures.
– Has 16 storage partitions standard, with an upgrade option to 64.
DS4500 (FAStT900)
Speed LED (green):
This LED is on when the selected link speed is 2 Gbps and a link is up. It is off when the DS4500 RAID controller operates at 1 Gbps.
Fault LED (amber):
This LED should normally be off. If it is on, it indicates a fault in the mini-hub or in one of the SFP modules.
Two Bypass LEDs (amber):
There is one bypass LED for each SFP module. This LED should normally be off if no SFP module is installed. If an SFP module is present and a link error is detected (for example, no cable, a faulty cable, or a host that is not powered on), the LED goes on.
Loop good LED (green):
This LED should normally be on. It might be off if there are link errors.
Host side connection
Drive side connection
Only one port on each mini-hub of the DS4500 on the drive side is ever used. We recommend removing all the SFP modules from the mini-hub ports that are not connected to any device.
DS4700
– The IBM System Storage DS4700 Express storage server uses 4 Gbps technology.
– Model 70 contains 2 GB of cache memory (1 GB per controller), four 4 Gbps FC host ports (two ports per controller), and four shortwave small form-factor pluggable (SFP) modules.
– Model 72 contains 4 GB of cache memory (2 GB per controller), eight 4 Gbps FC host ports (four ports per controller), and six shortwave SFP modules.
– Supports up to six EXP810 enclosures.
– Both models 70 and 72 have selectable storage partitions up to 128.
DS4700
Locate LED (white or blue)
– On: Indicates storage subsystem locate.
– Off: This is the normal status.
Service action allowed LED (blue)
– On: The service action can be performed on the component with no adverse consequences.
– Off: This is the normal status.
Service action required LED (amber)
– On: There is a corresponding needs attention condition flagged by the controller firmware. Some of these conditions might not be hardware related.
– Off: This is the normal status.
Power LED (green)
– On: The subsystem is powered on.
– Off: The subsystem is powered off.
DS4700
LED #1-2 (green): Host channel speed
LED #3 (blue): Service action allowed
LED #4 (amber): Needs attention
LED #5 (green): Caching active
LED #8-11 (amber): Drive channel bypass
LED #9-10 (green): Drive channel speed
LED #12 (green/yellow): Numeric display (enclosure ID/diagnostic display)
DS4700
Service action allowed (blue)
Off: Normal status.
On: Safe to remove.
Battery charging (green)
On: Battery charged and ready.
Blinking: Battery is charging.
Off: Battery is faulted, discharged, or missing.
Needs attention or service action required (amber)
Off: Normal status.
On: Controller firmware or hardware requires attention.
DS4700
Power supply fan LED (AC power) (green)
– Off: Power supply fan is not providing AC power.
– On: Power supply fan is providing AC power.
Service action allowed (blue)
– On: Safe to remove.
– Off: Normal status.
Needs attention (amber)
– Off: Normal status.
– On: Power supply fan requires attention.
Power supply fan Direct Current Enabled (DC power) (green)
– Off: Power supply fan is not providing DC power.
– On: Power supply fan is providing DC power.
Host side connection
Drive side connection
DS4800
– The 1815-80A and 1815-82A come with 4 GB of total cache.
– The 1815-84A has 8 GB of total cache.
– The 1815-88A has 16 GB of total cache.
– Supports up to 16 EXP710 FC-only enclosures for a total of 224 disks.
– Supports up to 14 EXP810 enclosures for a total of 224 disks.
– Supports up to 512 host storage partitions.
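The two disk totals above follow from the enclosure limits and the drive bays per enclosure. The short Python sketch below reproduces the arithmetic; the per-enclosure bay counts (14 for EXP710, 16 for EXP810) are the values implied by the totals on this slide, and the names are illustrative only.

```python
# Maximum enclosures behind a DS4800, as stated on the slide.
MAX_ENCLOSURES = {"EXP710": 16, "EXP810": 14}

# Drive bays per enclosure, implied by the 224-disk totals (224/16 and 224/14).
DRIVES_PER_ENCLOSURE = {"EXP710": 14, "EXP810": 16}


def max_disks(enclosure_model):
    """Return the maximum disk count for one enclosure model."""
    return MAX_ENCLOSURES[enclosure_model] * DRIVES_PER_ENCLOSURE[enclosure_model]


if __name__ == "__main__":
    for model in ("EXP710", "EXP810"):
        print(model, max_disks(model))
```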
Host side connection
Drive side connection
The DS4800 supports four redundant drive channel pairs on which to place expansion enclosures.
– Ports 4 and 3 on controller A are channel group 1.
– Ports 2 and 1 on controller A are channel group 2.
– Ports 1 and 2 on controller B are channel group 3.
– Ports 3 and 4 on controller B are channel group 4.
The two ports on each drive channel group must run at the same speed.
Drive side connection
The best sequence (Figure 3-52) to populate drive channel pairs is:
1. Controller A, port 4/controller B, port 1 (drive channel pair 1)
2. Controller A, port 2/controller B, port 3 (drive channel pair 3)
3. Controller A, port 3/controller B, port 2 (drive channel pair 2)
4. Controller A, port 1/controller B, port 4 (drive channel pair 4)
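The population order above can be expressed as a small planning helper. This is only a sketch: the port pairings are copied from the list, while the function name and the round-robin handling of a fifth and later enclosure are assumptions made for illustration.

```python
from itertools import cycle

# (drive channel pair, controller A port, controller B port), in the order
# given on the slide.
POPULATION_ORDER = [(1, 4, 1), (3, 2, 3), (2, 3, 2), (4, 1, 4)]


def assign_enclosures(enclosures):
    """Map each enclosure name to a drive channel pair.

    Assumption: once all four pairs hold one enclosure, further enclosures
    are spread round-robin in the same order.
    """
    plan = {}
    for name, (pair, a_port, b_port) in zip(enclosures, cycle(POPULATION_ORDER)):
        plan[name] = {"pair": pair, "ctrl_a_port": a_port, "ctrl_b_port": b_port}
    return plan


if __name__ == "__main__":
    for name, slot in assign_enclosures(["enc1", "enc2", "enc3", "enc4", "enc5"]).items():
        print(name, slot)
```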
DS4000 troubleshooting
Basic tools
Recovery Guru
Major Event Log (MEL)
Other tools
Read Link Status (RLS)
etc.
Recovery Guru
If there is an error condition on your DS4000, the Recovery Guru explains the cause of the problem and provides the actions necessary to recover. It guides you through specific actions, depending on the event.
Major Event Log
The Major Event Log (MEL) is the primary source for troubleshooting a DS4000 storage server. To access the MEL, select Advanced → Troubleshooting → View Event Log. By default, only the last 100 critical events are shown, but you can choose how many events to list. The maximum number you can set is 8191. When troubleshooting your system, use the full event log, because it includes information about actions that took place before the actual critical event happened, giving you the complete history of the problem.
DS4000 hardware maintenance
Disk replacement
Battery replacement
…
Remember: back up the configuration (ASD, profile) before routine maintenance.
Remember: back up the data before any maintenance that may be harmful.
DS4000 disk replacement
Failed drive/Bypassed drive
Check Recovery Guru to verify the problem and the recovery method.
Replace the drive according to the service guide:
– Remove the failed drive (usually its amber LED is on).
– Wait about 30 seconds.
– Insert the new drive.
– Wait for the reconstruction to complete.
Impending failure drive
Check Recovery Guru to verify the problem and the recovery method.
Option 1: Wait for the drive to fail.
Option 2: Replace the drive directly:
– Unassign the hot spare.
– Manually fail the drive.
– Remove the failed drive (usually its amber LED is on).
– Wait about 30 seconds.
– Insert the new drive.
– Wait for the reconstruction to complete.
– Reassign the hot spare.
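The two replacement paths above differ only in the extra hot-spare and manual-fail steps. As a rough checklist helper (a sketch, not an IBM tool; the function name, status strings, and step wording are illustrative), the selection logic could look like this:

```python
# Steps shared by both procedures on the slide.
COMMON_STEPS = [
    "Remove the failed drive",
    "Wait about 30 seconds",
    "Insert the new drive",
    "Wait for the reconstruction to complete",
]


def replacement_steps(drive_status, replace_now=True):
    """Return the ordered checklist for a drive in the given status.

    drive_status: "failed", "bypassed", or "impending failure".
    replace_now:  for an impending failure, False selects option 1
                  (wait for the drive to fail).
    """
    status = drive_status.strip().lower()
    first = ["Check Recovery Guru"]
    if status in ("failed", "bypassed"):
        return first + COMMON_STEPS
    if status == "impending failure":
        if not replace_now:
            return first + ["Wait for the drive to fail"]
        return (first
                + ["Unassign the hot spare", "Manually fail the drive"]
                + COMMON_STEPS
                + ["Reassign the hot spare"])
    raise ValueError("unknown drive status: %r" % drive_status)


if __name__ == "__main__":
    for step in replacement_steps("impending failure"):
        print("-", step)
```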
DS4000 disk replacement
If multiple drives fail at almost the same time
Collect data and wait for L2's action plan.
If reconstruction fails
Order another replacement drive. If reconstruction fails again, a logical error may have occurred; collect data for L2 review.
Cache battery replacement
Check the cache setting in Storage Manager.
To ensure that no data remains in cache, it is recommended to disable the cache setting.
Replace the cache battery according to the service guide:
– DS4300: the controller must be taken offline.
– DS4400/DS4500/DS4700: the battery can be replaced directly.
– DS4800: ensure that controller A is optimal.
Wait for the battery self-test/charge to complete.
Reset the battery age.
Restore the cache setting.
DS4000 battery policy
With firmware above 6.60, the DS4000 cache battery age limit is changed to 10 years.
Replace the battery only if its status is 'failed'.
If the battery status is 'near expiration', update the firmware to above 6.60. After the upgrade, the warning is cleared.
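The policy above reduces to a two-way decision on battery status and firmware level. The sketch below is one way to encode it; the function names and return strings are hypothetical, and treating any 06.60.xx level as satisfying "above 6.60" is an assumption, since the slide does not define the boundary precisely.

```python
# Firmware level named in the policy, as (major, minor).
POLICY_FW = (6, 60)


def fw_major_minor(version):
    """Parse '06.12.16.00' or '6.60' into a (major, minor) tuple."""
    parts = version.split(".")
    return int(parts[0]), int(parts[1])


def battery_action(status, fw_version):
    """Return the action suggested by the battery policy.

    Assumption: firmware 06.60.xx and later counts as 'above 6.60'.
    """
    status = status.strip().lower()
    if status == "failed":
        return "replace battery"
    if status == "near expiration":
        if fw_major_minor(fw_version) < POLICY_FW:
            return "upgrade firmware"
        # The slide does not cover this case; do not guess an action.
        return "not covered by this policy"
    return "no action"


if __name__ == "__main__":
    print(battery_action("near expiration", "06.12.16.00"))
```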
DS4000 firmware
The firmware package includes the controller firmware, NVSRAM, ESM code, and DDM code.
The controller firmware and NVSRAM must match.
The ESM code and the controller firmware must match.
Pay attention when the DS4000 is attached to both EXP710 and EXP700/EXP810 enclosures.
Always check the firmware package readme file and the code compatibility before updating.
All hardware errors should be resolved before updating the firmware, except for the JFQ3/JFQ4 issue.
DS4000 firmware update
The normal process for a DS4000 firmware update is:
Update the ESM code
Update the controller/NVSRAM code
A DDM code update requires stopping I/O on the hosts
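The ordering above can be captured in a small planning helper that sorts the requested components and flags the one that needs host I/O stopped. This is an illustrative sketch: the names are hypothetical, and placing the DDM update last is an assumption drawn from the order in which the slide lists it.

```python
# Update order from the slide; DDM last is an assumption based on list order.
UPDATE_ORDER = ["ESM", "controller/NVSRAM", "DDM"]

# Components whose update requires host I/O to be stopped.
REQUIRES_HOST_IO_STOPPED = {"DDM"}


def plan_update(components):
    """Return [(component, stop_host_io)] in the recommended order."""
    unknown = set(components) - set(UPDATE_ORDER)
    if unknown:
        raise ValueError("unknown components: %s" % sorted(unknown))
    return [(c, c in REQUIRES_HOST_IO_STOPPED)
            for c in UPDATE_ORDER if c in components]


if __name__ == "__main__":
    for component, stop_io in plan_update(["DDM", "controller/NVSRAM", "ESM"]):
        note = " (stop host I/O first)" if stop_io else ""
        print("Update %s%s" % (component, note))
```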
Check the DS4000 firmware
ESM update
选选选选选 ESM 选选 选选 ESM 选选选选选 5 选选
Controller firmware update
Controller firmware and NVSRAM can be updated at the same time.
The update takes about 15-20 minutes.
DDM drive update
Hosts must stop I/O while the update runs.
Case scenarios
The DS4000 has JFQ3/JFQ4 disks and needs a firmware update.
The DS4000 has EXP710 enclosures attached and needs a firmware update from 06.12.16.00 to 06.60.08.00.
While the controller firmware is being updated, Storage Manager (SM) loses its connection to the DS4000.
DS4000 data collection
All support data
Serial port output
All support data
Connect to both controllers through a hub when collecting all support data (ASD).
If a hub is not available, collect two ASD files, one from controller A and one from controller B.
To make the drive link statistics more accurate, it is recommended to do … 15-30 before collecting ASD
clear allDriveChannels stats;
reset storagesubsystem RLSBaseline;
reset storagesubsystem SOCBaseline;
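To run the three script commands above the same way every time, they can be kept in a small script file. The sketch below only writes that file; the file name is hypothetical, and how the script is then submitted to the storage subsystem is deliberately left out because the slide does not show it.

```python
# The three script commands, copied verbatim from the slide.
BASELINE_COMMANDS = [
    "clear allDriveChannels stats;",
    "reset storagesubsystem RLSBaseline;",
    "reset storagesubsystem SOCBaseline;",
]


def build_baseline_script():
    """Return the script text, one command per line."""
    return "\n".join(BASELINE_COMMANDS) + "\n"


def write_baseline_script(path):
    """Write the script text to `path` (a hypothetical file name)."""
    with open(path, "w", encoding="ascii") as handle:
        handle.write(build_baseline_script())


if __name__ == "__main__":
    print(build_baseline_script(), end="")
```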
Serial port output
Use PuTTY and a serial cable to connect to the DS4000.
Connection parameters
Serial port output
Commands that should be collected
FW6:
– loadDebug
– moduleList 1
– arrayPrintSummary
– netCfgShow
– inetstatShow
– moduleShow
– cfgUnitList
– vdAll
– vdShow
– ghsList
– printBatteryAge
– cfgPhyList
– hwLogShow
– excLogShow
– spmShowMaps
– spmShow
– getObjectGraph_MT 1
– getObjectGraph_MT 4
– getObjectGraph_MT 8
– ccmStateAnalyze 8
– fcDevs 1
– i
– fc 111
– ionShow 99
– hdd 5
– fcAll
– socShow
– showEnclosures
– showEnclosuresPage81
– unld "ffs:Debug"
FW7:
– loadDebug
– moduleList 1
– evfShowOwnership
– cmgrShow
– vdmShowDriveList
– vdmShowRAIDVolList
– vdmDrmShowMgr
– vdmShowVGInfo
– evfShowAllVols
– bmgrShow 15
– bidShow 255
– tditnall
– iditnall
– fcnShow
– chall
– luall
– ionShow 12
– fcAll 10
– showSdStatus
– ionShow 99
– discreteLineTableShow
– ssmShowTree 2
– socShow
– showEnclosuresPage81
– excLogShow
– hwLogShow
– spmShowMaps
– spmShow
– fcHosts 3
– getObjectGraph_MT 1
– getObjectGraph_MT 4
– getObjectGraph_MT 8
– ccmShowState
– netCfgShow
– inetstatShow
– dqlist
– taskInfoAll 3
– tpgmShowSummary
– unld "ffs:Debug"
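With this many shell commands to capture, it is easy to miss one. The following sketch checks a saved terminal log against an expected command list. It assumes the log echoes each typed command on its own line after a "-> " prompt; that prompt string, the function name, and the sample log are assumptions for illustration and should be adjusted to the actual capture.

```python
PROMPT = "-> "  # assumed shell prompt in the captured log


def missing_commands(capture_text, expected):
    """Return the expected commands that never appear as a typed line."""
    typed = set()
    for line in capture_text.splitlines():
        # Keep only what follows the prompt, if the prompt is present.
        typed.add(line.split(PROMPT, 1)[-1].strip())
    return [cmd for cmd in expected if cmd not in typed]


if __name__ == "__main__":
    sample_log = "-> loadDebug\nvalue = 0\n-> spmShowMaps\n"
    print(missing_commands(sample_log, ["loadDebug", "spmShow", "spmShowMaps"]))
```

Exact matching is used rather than substring search so that a short command such as spmShow is not counted as collected just because spmShowMaps appears in the log.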
DS4000 material
Redbook
Firmware package
Q&A