IBM-AUSTRIA - PC-HW-Support 30 Aug 1999
Recovery Procedures When HSP is Present at Time of Failure
One DHS Drive, No RBL
Follow the steps below to bring the DHS drive back to HSP state if the following items are true:
- Only one drive is marked DHS and the rest are ONL.
- The RAID logical drive status is OKY because an HSP is present in the system. Either
the HSP drive is the hard drive that went DHS or the HSP has already automatically
taken over for the DHS drive and has been rebuilt successfully.
- There are no drives with an RBL status.
Once you verify the conditions above through either the RAID administration log or the RAID
administration utility, perform the following steps to bring the DHS drive back to HSP status.
1. Physically replace the hard drive in the DHS bay with a new one of the same capacity
or greater.
2. With a RAID-1 or RAID-5 array, the operating system is still functional at this point.
Use either NetFinity Manager 5.0 or the RAID administration utility to bring the drive
back to HSP status. With the RAID administration utility, open the options menu and
select Set Device State.
3. When you see the prompt to select a drive, highlight the drive you just replaced (it
should still be marked DHS in the utility), and press Enter. Be careful to select the
correct drive, because the utility lets you select any drive connected to the IBM
ServeRAID adapter, including ONL drives.
4. You now see a menu listing all possible drive states, but you can highlight only
DHS, HSP, or Standby Hot Spare (SHS). Highlight HSP (or SHS if
necessary), and press Enter.
5. The adapter issues a start unit command to the drive. Once the drive successfully spins
up, the adapter changes the drive's status to HSP (or SHS) and saves the new
configuration.
6. If you see an 'Error in starting drive' message, reseat the cables, the hard drive, and so on
to verify that they are connected properly, then go to step 2. If the error persists, go to step 1.
7. If the error still occurs with a known good hard drive, troubleshoot to determine the
defective part, which may be a cable, backplane, RAID adapter, and so on. Once you have
replaced the defective part so that there is a good connection between the adapter and the
hard drive, go to step 2.
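The conditions that must hold before starting this procedure can be sketched as a small check. This Python fragment is illustrative only: the function name `one_dhs_no_rbl` and the idea of passing drive states as a list of strings are assumptions made for the example, not part of the RAID administration utility or NetFinity Manager. The states would be copied by hand from the utility or its log.

```python
from collections import Counter

def one_dhs_no_rbl(drive_states, logical_status):
    """Return True if the 'One DHS Drive, No RBL' procedure applies.

    drive_states: hypothetical list of state codes ("ONL", "DHS", "RBL", ...)
                  for every physical drive in the array.
    logical_status: status of the RAID logical drive, e.g. "OKY" or "CRT".
    """
    counts = Counter(drive_states)
    exactly_one_dhs = counts["DHS"] == 1
    no_rebuilding = counts["RBL"] == 0
    # Every drive other than the single DHS drive must be online.
    rest_online = counts["ONL"] == len(drive_states) - 1
    return (exactly_one_dhs and no_rebuilding and rest_online
            and logical_status == "OKY")
```

For example, `one_dhs_no_rbl(["ONL", "ONL", "DHS"], "OKY")` is true, while any input containing an RBL or DDD drive is rejected and belongs to one of the other procedures.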
One DDD Drive, One DHS Drive, No RBL
If the system has a DDD drive and a DHS drive, and a defined hot spare existed prior to the drive
failures, then the system should still be up and running as long as the logical drives are
configured as RAID-5 or RAID-1. The logical drives in the array will be in the CRT state due to
one drive in the array being defunct. Perform the following steps to bring the logical drive from
CRT to OKY state:
NOTE: Because the operating system is functional, this procedure assumes you are using the
RAID administration utility within the operating system to recover.
1. Physically replace the drives that are marked DDD and DHS.
2. Click on the DDD drive from within the RAID administration utility and then click on
Rebuild Drive. You see a message confirming that the drive is starting. The drive
then starts the rebuild process. When this process is complete, the drive is marked
ONL.
3. After the rebuild is complete, click on the DHS drive from within the RAID
administration utility. Select Set Device State. You then see several options. Select
HSP (or SHS if necessary) and click on OK. The adapter issues a start unit command
to the drive, and you see a message confirming that the drive is starting. Once the
drive spins up and the adapter saves the drive's configuration, the drive is marked HSP
(or SHS, as applicable).
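The order of operations above (rebuild the DDD drive first, then restore the hot spare) can be written down as a short plan generator. This is a sketch under stated assumptions: the function name, the action names, and the dictionary of bay numbers to state codes are all hypothetical and exist only for illustration. Nothing here talks to a ServeRAID adapter.

```python
def plan_ddd_and_dhs_recovery(states):
    """Return the ordered actions for one DDD drive plus one DHS drive.

    states: hypothetical dict mapping a bay number to its state code.
    """
    ddd = [bay for bay, state in states.items() if state == "DDD"]
    dhs = [bay for bay, state in states.items() if state == "DHS"]
    if len(ddd) != 1 or len(dhs) != 1 or "RBL" in states.values():
        raise ValueError(
            "expected exactly one DDD drive, one DHS drive, and no RBL drive")
    return [
        ("replace_hardware", ddd[0]),
        ("replace_hardware", dhs[0]),
        ("rebuild", ddd[0]),           # drive is marked ONL when the rebuild completes
        ("set_state", dhs[0], "HSP"),  # or "SHS" if a standby hot spare is needed
    ]
```

The plan deliberately places the rebuild before the Set Device State action, matching the instruction to wait until the rebuild is complete.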
More than One DDD Drive, One DHS, No RBL
In this scenario, the operating system is no longer functional. Therefore, you must boot to the
RAID Option Diskette to recover the array. It is extremely important to confirm that either the
RAID administration utility or NetFinity Manager 5.0 was running before the drives were
marked defunct. If so, the utility or NetFinity Manager has logged the sequence of DDD events to
a log file on a diskette or on a local or network drive. You can view this log file on another
machine to determine the 'inconsistent' drive, that is, the original drive that failed. When you
know which drive is 'inconsistent', you can attempt to recover data.
Note: The previous paragraph states 'attempt to recover' because once you lose more than one
drive in a set of RAID-5 or RAID-1 logical drives, loss of data is definitely a possibility. The
steps below guide you through a recovery, if at all possible.
1. View the RAID log on another machine and write down the order in which the drives
went defunct.
2. Boot to the RAID configuration diskette, and select View Configuration. Make sure
that the template contains the correct information for the status of all drives, not just
those listed in the RAID log.
3. Using the RAID configuration utility, select Set Device State and choose a DDD
drive that is not listed in the RAID log. Set that drive to an ONL state. Repeat this step
until the only DDD drives remaining are those indicated in the RAID log file.
Note: The drives marked DDD that are not listed in the RAID log are the last ones to
go defunct. You must recover these drives first so that the information from them can
be used to rebuild the original drive that failed (the 'inconsistent' drive). If you do not
replace the 'inconsistent' drive last, then the system uses it to rebuild the last drive
that went defunct, resulting in corrupted data. Therefore, it is extremely important to
perform step 3 carefully.
4. Select Set Device State and then select the last drive to go defunct according to the
log file. Set that device to the ONL state. Repeat step 4 until there is only one DDD
drive remaining.
5. Select Set Device State and choose the DHS drive. Change its state from DHS to
HSP.
6. Select Rebuild and highlight the DDD drive.
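Because the order in which drives are brought back online decides whether the data survives, it can help to write the order out before touching the configuration utility. The following Python sketch derives that order from the log. It is an illustration only: `plan_multi_ddd_recovery`, the action names, and the drive labels are hypothetical, and the log is assumed to have been transcribed by hand into a list ordered from the first drive to go defunct to the last.

```python
def plan_multi_ddd_recovery(ddd_drives, defunct_log, dhs_drive):
    """Return the ordered actions for several DDD drives plus one DHS drive.

    ddd_drives:  set of drives currently marked DDD.
    defunct_log: drives in the order they went defunct according to the
                 RAID log; the first entry is the 'inconsistent' drive.
    dhs_drive:   the drive currently marked DHS.
    """
    logged = [d for d in defunct_log if d in ddd_drives]
    if not logged:
        raise ValueError("the log must identify at least one DDD drive")
    unlogged = sorted(d for d in ddd_drives if d not in defunct_log)

    plan = []
    # Step 3: drives missing from the log were the last to fail; online first.
    plan += [("set_state", d, "ONL") for d in unlogged]
    # Step 4: work backwards through the log, leaving the first failure DDD.
    plan += [("set_state", d, "ONL") for d in reversed(logged[1:])]
    # Step 5: restore the hot spare.
    plan.append(("set_state", dhs_drive, "HSP"))
    # Step 6: rebuild the one remaining DDD drive (the 'inconsistent' drive).
    plan.append(("rebuild", logged[0]))
    return plan
```

With DDD drives `{"bay1", "bay2", "bay3"}`, a log of `["bay1", "bay2"]`, and `"bay5"` marked DHS, the plan sets bay3 and then bay2 to ONL, sets bay5 to HSP, and rebuilds bay1 last.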
One DHS Drive, Zero or More DDD Drives, and One RBL Drive
Usually when you have an RBL drive after bringing up a system, it is because the data on the drive
was being rebuilt when the system went down. If there are DDD drives as well, then those drives
are more than likely the cause of the system crash. The following steps allow you to attempt to
recover the array:
1. Boot to the RAID configuration utility.
2. Select View Configuration and write down the current status of each drive.
3. Physically replace the DHS drive.
4. Return to the utility's Main Menu and choose Device Management. Select Set Device
State. If you see any DDD drives, highlight them and change their state to ONL. If
you do not see any DDD drives, then highlight the DHS drive and change its state to
HSP (or SHS). Repeat this step until there are no more drives marked DDD or DHS.
5. Select Rebuild and highlight the RBL drive. The rebuild process begins, and all data
will be rebuilt to the drive.
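The repeat-until loop in this procedure can be sketched as follows. As before, this is a hypothetical illustration: the function name, the action names, and the dictionary of bay numbers to state codes are assumptions made for the example, and the code only produces a list of actions to perform by hand in the configuration utility.

```python
def plan_rbl_recovery(states):
    """Return the ordered actions when one RBL drive is present.

    states: hypothetical dict mapping a bay number to its state code.
    """
    rbl = [bay for bay, state in states.items() if state == "RBL"]
    if len(rbl) != 1:
        raise ValueError("expected exactly one RBL drive")

    plan = []
    # All DDD drives are set to ONL before the DHS drive is touched.
    for bay in sorted(states):
        if states[bay] == "DDD":
            plan.append(("set_state", bay, "ONL"))
    # Only when no DDD drives remain is the DHS drive made a hot spare.
    for bay in sorted(states):
        if states[bay] == "DHS":
            plan.append(("set_state", bay, "HSP"))  # or "SHS" if required
    # Finally, restart the interrupted rebuild.
    plan.append(("rebuild", rbl[0]))
    return plan
```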