IBM-AUSTRIA - PC-HW-Support    30 Aug 1999

Recovery Procedures When HSP is Present at Time of Failure




One DHS Drive, No RBL 

Follow the steps below to bring the DHS drive back to HSP state if the following items are true:



Once you verify the conditions above through either the RAID administration log or the RAID administration utility, perform the following steps to bring the DHS drive back to HSP status.
  1.  Physically replace the hard drive in the DHS bay with a new one of the same capacity or greater.
  2.  With a RAID-1 or RAID-5 array, the operating system is still functional at this point. Use either NetFinity Manager 5.0 or the RAID administration utility to bring the drive back to HSP status. With the RAID administration utility, open the options menu and select Set Device State.
  3.  When you see the prompt to select a drive, highlight the drive you just replaced (it should still be marked DHS in the utility), and press Enter. Be careful to select the correct drive, because the utility lets you select any drive connected to the IBM ServeRAID adapter, including ONL drives.
  4.  You now see a menu listing all the possible drive states, but you are only able to highlight DHS, HSP, or Standby Hot Spare (SHS). Highlight HSP (or SHS if necessary), and press Enter.
  5.  The adapter issues a start unit command to the drive. Once the drive successfully spins up, the adapter changes the drive's status to HSP (or SHS) and saves the new configuration.
  6.  If you see an 'Error in starting drive' message, reseat the cables, the hard drive, and so on to verify that they are connected properly, then go to step 2. If the error persists, go to step 1.
  7.  If the error still occurs with a known good hard drive, troubleshoot to determine the defective part, which may be a cable, back plane, RAID adapter, etc. Once you have replaced the defective part so that there is a good connection between adapter and hard drive, go to step 2.
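
The state change in steps 4 through 6 can be sketched as a small model. This is only an illustration of the rules stated above, not code from the RAID administration utility: the function name set_device_state and the spins_up flag are hypothetical, and the flag stands in for whether the adapter's start unit command succeeds.

```python
# Illustrative model only; not part of any IBM ServeRAID tool.
# States selectable for a drive that is currently marked DHS (step 4).
SELECTABLE_FROM_DHS = {"DHS", "HSP", "SHS"}


def set_device_state(current, requested, spins_up=True):
    """Return the drive's state after a Set Device State request.

    current   -- state shown in the utility before the request
    requested -- state highlighted in the menu
    spins_up  -- hypothetical flag: True if the start unit command succeeds
    """
    if current != "DHS":
        raise ValueError("this sketch only covers drives marked DHS")
    if requested not in SELECTABLE_FROM_DHS:
        raise ValueError("only DHS, HSP, or SHS can be highlighted")
    if requested == "DHS":
        return "DHS"  # nothing changes
    if not spins_up:
        # Corresponds to the 'Error in starting drive' message (step 6):
        # the drive keeps its DHS marking until the connection is fixed.
        return "DHS"
    return requested  # adapter saves the new configuration (step 5)


if __name__ == "__main__":
    print(set_device_state("DHS", "HSP"))
    print(set_device_state("DHS", "HSP", spins_up=False))
```

The model keeps the drive in DHS on a failed start so that the retry in step 6 begins from the same state as the first attempt.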


One DDD Drive, One DHS Drive, No RBL 

If the system has a DDD drive and a DHS drive, and a defined hot spare existed prior to the drive failures, then the system should still be up and running as long as the logical drives are configured as RAID-5 or RAID-1. The logical drives in the array will be in the CRT state due to one drive in the array being defunct. Perform the following steps to bring the logical drive from CRT to OKY state:

NOTE: Because the operating system is functional, this procedure assumes you are using the RAID administration utility within the operating system to recover.

  1.  Physically replace the drives that are marked DDD and DHS.
  2.  Click on the DDD drive from within the RAID administration utility and then click on Rebuild Drive. You see a message confirming that the drive is starting. The drive then starts the rebuild process. When this process is complete, the drive is marked ONL.
  3.  After the rebuild is complete, click on the DHS drive from within the RAID administration utility. Select Set Device State. You then see several options. Select HSP (or SHS if necessary) and click on OK. The adapter issues a start unit command to the drive, and you see a message confirming that the drive is starting. Once the drive spins up and the adapter saves the drive's configuration, the drive is marked HSP (or SHS, as applicable).
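
The relationship between member drive states and the logical drive state in this scenario can be sketched as follows. This is a simplified model for a single RAID-1 or RAID-5 logical drive, not the adapter's actual logic. The function name is hypothetical, the "not functional" label is a placeholder rather than a state name taken from this document, and treating a drive that is still rebuilding as keeping the logical drive in CRT is an assumption.

```python
# Simplified model for one RAID-1 or RAID-5 logical drive.
# The DHS drive is a (defunct) hot spare, so it is not an array member
# and is not passed in here.
def logical_drive_state(member_states):
    """Map the states of the array's member drives to a logical drive state."""
    # Assumption: any member that is not ONL (DDD, or still RBL) counts
    # as missing redundancy.
    not_online = sum(1 for state in member_states if state != "ONL")
    if not_online == 0:
        return "OKY"
    if not_online == 1:
        return "CRT"  # one defunct member: data still available
    return "not functional"  # placeholder label, not a documented state


if __name__ == "__main__":
    before = ["ONL", "DDD", "ONL"]   # one member defunct
    during = ["ONL", "RBL", "ONL"]   # step 2 in progress
    after = ["ONL", "ONL", "ONL"]    # rebuild complete
    for members in (before, during, after):
        print(members, "->", logical_drive_state(members))
```

The sketch shows why the rebuild in step 2 comes first: the logical drive leaves CRT only once the replaced member is back to ONL, while the hot spare in step 3 does not affect the logical drive state.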


More than One DDD Drive, One DHS, No RBL 

In this scenario, the operating system is no longer functional. Therefore, you must boot to the RAID Option Diskette to recover the array. It is extremely important to confirm that either the RAID administration utility or NetFinity Manager 5.0 was running before the drives were marked defunct. If so, the utility or NetFinity Manager has logged the sequence of DDD events to a log file either on a diskette or on a local or network drive. You can view this log file on another machine to determine the 'inconsistent' drive, which is the original drive that failed. When you know which drive is 'inconsistent', you can attempt to recover data.

Note: The previous paragraph states 'attempt to recover' because once you lose more than one drive in a set of RAID-5 or RAID-1 logical drives, loss of data is definitely a possibility. The steps below guide you through a recovery, if at all possible.

  1.  View the RAID log on another machine and write down the order in which the drives went defunct.
  2.  Boot to the RAID configuration diskette, and select View Configuration. Make sure that the template contains the correct information for the status of all drives, not just those listed in the RAID log.
  3.  Using the RAID configuration utility, select Set Device State and choose a DDD drive that is not listed in the RAID log. Set that drive to an ONL state. Repeat this step until the only DDD drives remaining are those indicated in the RAID log file.

    Note: The drives marked DDD that are not listed in the RAID log are the last ones to go defunct. You must recover these drives first so that the information from them can be used to rebuild the original drive that failed (the 'inconsistent' drive). If you do not replace the 'inconsistent' drive last, then the system uses it to rebuild the last drive that went defunct, resulting in corrupted data. Therefore, it is extremely important to perform step 3 carefully.

  4.  Select Set Device State and then select the last drive to go defunct according to the log file. Set that device to the ONL state. Repeat step 4 until there is only one DDD drive remaining.
  5.  Select Set Device State and choose the DHS drive. Change its state from DHS to HSP.
  6.  Select Rebuild and highlight the DDD drive.
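
The ordering rule in steps 3 through 6 can be sketched as a small planning function. This is an illustration of the sequence described above, not a tool that talks to the adapter. The function name plan_recovery and the bay labels are hypothetical, and the sketch assumes the log lists the defunct drives oldest failure first and that every logged drive is still marked DDD.

```python
# Illustrative planner only; it produces a checklist, nothing more.
def plan_recovery(logged_failures, ddd_drives, dhs_drive):
    """Return the ordered (action, drive) steps for the procedure above.

    logged_failures -- drives from the RAID log, oldest failure first
    ddd_drives      -- every drive currently marked DDD
    dhs_drive       -- the drive currently marked DHS
    """
    if not logged_failures:
        raise ValueError("the RAID log is needed to find the 'inconsistent' drive")
    missing = [d for d in logged_failures if d not in ddd_drives]
    if missing:
        raise ValueError("logged drives not marked DDD: %s" % missing)

    inconsistent = logged_failures[0]  # the original drive that failed
    steps = []
    # Step 3: DDD drives absent from the log went defunct last; recover first.
    for drive in ddd_drives:
        if drive not in logged_failures:
            steps.append(("set ONL", drive))
    # Step 4: logged drives, most recent failure first, leaving one DDD drive.
    for drive in reversed(logged_failures[1:]):
        steps.append(("set ONL", drive))
    # Step 5: return the hot spare to service.
    steps.append(("set HSP", dhs_drive))
    # Step 6: rebuild the one remaining DDD drive.
    steps.append(("rebuild", inconsistent))
    return steps


if __name__ == "__main__":
    for action, drive in plan_recovery(
        logged_failures=["bay2", "bay4"],
        ddd_drives=["bay2", "bay4", "bay5"],
        dhs_drive="bay6",
    ):
        print(action, drive)
```

Writing the plan down before touching Set Device State makes it easier to check that the 'inconsistent' drive is the only one left for the rebuild.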


One DHS Drive, Zero or More DDD Drives, and One RBL Drive 

Usually when you have an RBL drive after bringing up a system, it is because the data on the drive was being rebuilt when the system went down. If there are DDD drives as well, then those drives are more than likely the cause of the system crash. The following steps allow you to attempt to recover the array:
  1.  Boot to the RAID configuration utility.
  2.  Select View Configuration and write down the current status of each drive. Physically replace the DHS drive.
  3.  Return to the utility's Main Menu and choose Device Management. Select Set Device State. If you see any DDD drives, highlight them and change their state to ONL. If you do not see any DDD drives, then highlight the DHS drive and change its state to HSP (or SHS). Repeat this step until there are no more drives marked DDD or DHS.
  4.  Select Rebuild and highlight the RBL drive. The rebuild process begins, and all data will be rebuilt to the drive.
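
The loop in step 3 and the rebuild in step 4 can be sketched as follows. This is a model of the decision order only, working on the drive states written down in step 2. The function name recovery_actions and the bay labels are hypothetical, and the sketch always picks HSP where the procedure also allows SHS.

```python
# Illustrative model of steps 3 and 4; it does not talk to the adapter.
def recovery_actions(drive_states):
    """Return ordered (action, bay) steps for the states noted in step 2.

    drive_states -- mapping of bay label to state, e.g. {"bay1": "ONL"}
    """
    states = dict(drive_states)  # work on a copy
    actions = []
    while True:
        ddd = sorted(bay for bay, s in states.items() if s == "DDD")
        if ddd:
            # DDD drives are handled before the DHS drive.
            for bay in ddd:
                states[bay] = "ONL"
                actions.append(("set ONL", bay))
            continue
        dhs = sorted(bay for bay, s in states.items() if s == "DHS")
        if dhs:
            states[dhs[0]] = "HSP"  # SHS is also permitted by the procedure
            actions.append(("set HSP", dhs[0]))
            continue
        break  # no drives marked DDD or DHS remain
    # Step 4: rebuild the drive that was mid-rebuild when the system went down.
    for bay in sorted(bay for bay, s in states.items() if s == "RBL"):
        actions.append(("rebuild", bay))
    return actions


if __name__ == "__main__":
    noted = {"bay1": "ONL", "bay2": "RBL", "bay3": "DDD", "bay4": "DHS"}
    for action, bay in recovery_actions(noted):
        print(action, bay)
```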

