IBM-AUSTRIA - PC-HW-Support    30 Aug 1999

Recovery Procedures When HSP is not Present at Time of Failure





One DDD Drive, No RBL 

Follow these steps to bring the DDD drive back to the ONL state if the following items are true:



Once the conditions above are verified through either the RAID administration log or the RAID administration utility, perform the following steps to bring the DDD drive back to ONL status.
  1.  If the drive has never been marked DDD, proceed to step 3 to software replace the drive using the ServeRAID Administration and Monitoring Utility or Netfinity RAID Manager.

    NOTE: Refer to the 'Software Replace vs. Physical Replace' section in this manual to understand the differences between software and physical replacement.

  2.  If the drive has been marked DDD before, proceed to step 6.
  3.  With a RAID-1 or RAID-5 array, the operating system will be functional. Use either Netfinity Manager or the RAID administration utility within the operating system to start the Rebuild process. With the RAID administration utility, click on the drive marked DDD, and select Rebuild from the menu that appears.
  4.  The adapter issues a start unit command to the drive. You receive a message  confirming that the drive is starting. The drive then begins the rebuild process. Once  the drive completes this process, the drive's status changes to ONL.
  5.  If you see an 'Error in starting drive' message, reseat the cables, hard drive, etc., to verify that there is a good connection, then go to step 3. If the error persists, go to step 6.
  6.  Physically replace the hard drive in the DDD bay with a new one of the same capacity  or greater and go to step 3.
  7.  If the error still occurs with a known good hard file, then troubleshoot to determine whether the cable, backplane, RAID adapter, etc., is defective.

    NOTE: In many cases the RAID adapter should not be replaced. If Hard Events are reported in the ServeRAID Device Error Table, which can be viewed by clicking on the logical drive from the ServeRAID Administration and Monitoring Utility, then contact your support representative to determine if the adapter needs replacement.

  8.  Once you have replaced the defective part so that there is a good connection between  the adapter and hard drive, go to step 3.
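
The escalation order in the steps above can be sketched as a small lookup. This is only an illustration of the decision flow, not a tool that talks to the adapter: the names Action and next_action are hypothetical, and the sketch assumes that each 'Error in starting drive' message moves you one rung further up the ladder.

```python
from enum import Enum


class Action(Enum):
    # Rungs of the escalation ladder, in the order the steps describe them.
    SOFTWARE_REBUILD = "start Rebuild from the RAID administration utility (step 3)"
    RESEAT_AND_RETRY = "reseat cables and drive, then retry Rebuild (step 5)"
    PHYSICAL_REPLACE = "replace the drive, then retry Rebuild (step 6)"
    TROUBLESHOOT_PATH = "check cable, backplane and adapter (step 7)"


LADDER = [
    Action.SOFTWARE_REBUILD,
    Action.RESEAT_AND_RETRY,
    Action.PHYSICAL_REPLACE,
    Action.TROUBLESHOOT_PATH,
]


def next_action(marked_ddd_before, start_errors_seen):
    """Return the next action for a single DDD drive.

    marked_ddd_before -- True if this drive was marked DDD on an earlier occasion
    start_errors_seen -- number of 'Error in starting drive' messages so far
    """
    # A drive that has been DDD before skips straight to physical replacement.
    start = LADDER.index(Action.PHYSICAL_REPLACE) if marked_ddd_before else 0
    # Each start error escalates one rung; the last rung is the ceiling.
    return LADDER[min(start + start_errors_seen, len(LADDER) - 1)]
```

For a drive that has never been DDD and has produced no start error, the sketch returns the software rebuild; after two start errors it returns physical replacement.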


Two DDD Drives, No RBL 

In this case, with no hot spare drive defined, the server has more than likely trapped (under OS/2 and NT) or the volume was dismounted (under NetWare). To recover from this scenario, examine the RAID log generated by the RAID administration utility and follow the steps below:

  1.  Boot to the RAID configuration utility for your RAID adapter.
  2.  Select Set Device State and highlight the drive marked DDD last by the RAID  adapter. Set this drive's state to ONL. The drive spins up and changes from DDD to  ONL status.

      IF YOU USE THE WRONG ORDER WHEN YOU SELECT SET DEVICE STATE TO CHANGE A DRIVE'S STATE TO ONL, DATA CORRUPTION RESULTS. SEE THE NOTE BELOW TO DETERMINE THE LAST DRIVE MARKED DDD BY THE RAID ADAPTER.

    NOTE: Refer to the 'Using and Understanding the RAID Administration Log' section of this document for details on obtaining and interpreting the RAID log. If only one drive is recorded in the RAID log because the RAID adapter was not able to log the defunct drive before the operating system went down, then the last drive that went defunct is the drive that is not recorded in the RAID log. If two drives are recorded in the RAID log, then the last drive to go defunct is the second drive listed in the log, that is, the drive with the most recent time stamp.

  3.  If the drive has been marked DDD before, proceed to step 8.
  4.  Proceed to step 5 to software replace the remaining DDD drive using the ServeRAID  Administration and Monitoring Utility or Netfinity RAID Manager.

    NOTE: Refer to the 'Software Replace vs. Physical Replace' section in this manual to understand the differences between software and physical replacement.

  5.  With a RAID-1 or RAID-5 array, the operating system will be functional. Use either Netfinity Manager or the RAID administration utility within the operating system to start the Rebuild process. With the RAID administration utility, click on the drive marked DDD, and select Rebuild from the menu that appears.
  6.  The adapter issues a start unit command to the drive. You receive a message  confirming that the drive is starting. The drive then begins the rebuild process. Once  the drive completes this process, the drive's status changes to ONL.
  7.  If you see an 'Error in starting drive' message, reseat the cables, hard drive, etc., to verify that there is a good connection, then go to step 5. If the error persists, go to step 8.
  8.  Physically replace the hard drive in the DDD bay with a new one of the same capacity  or greater and go to step 5.
  9.  If the error still occurs with a known good hard file, then troubleshoot to determine whether the cable, backplane, RAID adapter, etc., is defective.

    NOTE: In many cases the RAID adapter should not be replaced. If Hard Events are reported in the ServeRAID Device Error Table, which can be viewed by clicking on the logical drive from the ServeRAID Administration and Monitoring Utility, then contact your support representative to determine if the adapter needs replacement.

  10.  Once you have replaced the defective part so that there is a good connection between  the adapter and hard drive, go to step 5.
  11.  If software replacement brings all drives back ONL and makes the system operational, carefully inspect all cables, etc., to ensure that the cable or backplane is not defective. Check all backplane connectors and ensure that the backplane is not bowed. When multiple drives are marked defunct, the cause of the failure is often the communication channel (cable or backplane). If the backplane is bowed, the drives and backplane connectors may not seat properly, which results in a bad connection. Also, with hot-swap drives that are removed frequently, connectors can become damaged if too much force is exerted.
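
The rule in the note to step 2 for finding the last drive marked DDD can be written out as a short function. This is a minimal sketch for checking your own reading of the log, not a parser for the actual RAID log: the drive labels and the (drive, time stamp) record layout are hypothetical and assume that you have transcribed the defunct events by hand.

```python
from datetime import datetime


def last_defunct_drive(ddd_drives, log_entries):
    """Return the drive that went defunct last, following the note to step 2.

    ddd_drives  -- the two drive labels currently marked DDD (hypothetical labels)
    log_entries -- (drive_label, timestamp) pairs for defunct events, transcribed
                   by hand from the RAID log (hypothetical record layout)
    """
    if len(set(ddd_drives)) != 2:
        raise ValueError("this rule applies only when exactly two drives are DDD")

    # Keep the most recent defunct time stamp recorded for each DDD drive.
    latest = {}
    for drive, stamp in log_entries:
        if drive in ddd_drives and (drive not in latest or stamp > latest[drive]):
            latest[drive] = stamp

    if len(latest) == 1:
        # Only one drive was logged: the unlogged drive is the one that failed last.
        return next(d for d in ddd_drives if d not in latest)
    if len(latest) == 2:
        # Both drives were logged: the most recent time stamp marks the last failure.
        return max(latest, key=latest.get)
    raise ValueError("no DDD drive appears in the log; the order cannot be determined")
```

The drive returned here is the one to set to ONL with Set Device State in step 2; the other DDD drive is the one that is rebuilt.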


More than Two DDD Drives, No RBL 

  1.  View the RAID log on another machine and write down the order in which drives  went defunct.
  2.  Boot to the RAID Configuration Diskette and select View Configuration. Make sure  that the template contains the correct information for the status of all drives, not just  those listed in the RAID log.
  3.  Using the RAID configuration utility, select Set Device State and choose a DDD drive that is not listed in the RAID log to software replace it. Change the state of this drive to ONL. Repeat this step until only two DDD drives remain. One or both of these drives should be listed as the first two drives to go defunct as indicated in the RAID log.

      IF YOU USE THE WRONG ORDER WHEN YOU SELECT SET DEVICE STATE TO CHANGE DRIVES' STATES TO ONL, DATA CORRUPTION RESULTS. ENSURE THAT YOU ONLY CHANGE DEVICE STATES TO ONL OF DRIVES NOT LISTED AS DDD IN THE RAID LOG. THE FIRST DRIVE THAT WENT DEFUNCT REQUIRES REBUILDING, SO IT MUST BE REPLACED LAST.

    NOTE: Refer to 'Using and Understanding the RAID Administration Log' section  of this document, for details on obtaining and interpreting the RAID log. Refer to 'Software Replace vs. Physical Replace' section in this manual to understand  differences between software and physical replacement.

  4.  Follow the same procedure used to recover from two DDD drives, as outlined in the  previous section.
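
As a planning aid for step 3, the selection of drives can be sketched as follows. The function name and the drive labels are hypothetical, and the sketch only works on lists that you have written down from the RAID log and from View Configuration. The procedure does not define an order among drives that are missing from the log, so the sketch simply takes them in the order given, which is an arbitrary assumption.

```python
def plan_set_device_state(ddd_drives, logged_defunct):
    """Plan step 3 when more than two drives are DDD.

    ddd_drives     -- labels of all drives currently DDD (from View Configuration)
    logged_defunct -- labels of drives listed as defunct in the RAID log

    Returns (set_online, remaining): the drives to set to ONL with Set Device
    State, and the DDD drives left over for the two-drive procedure.
    """
    # Only drives that are NOT listed in the RAID log may be set to ONL here.
    unlogged = [d for d in ddd_drives if d not in logged_defunct]

    remaining = list(ddd_drives)
    set_online = []
    for drive in unlogged:
        if len(remaining) <= 2:
            break  # stop as soon as only two DDD drives remain
        remaining.remove(drive)
        set_online.append(drive)
    return set_online, remaining
```

If more than two drives are still in the remaining list afterwards, every unlogged drive has been used up and the steps above no longer apply as written; stop and review the RAID log before changing any further device states.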

