Monday, 15 October 2012

Failed Disk Replacement Under SVM



1.       Boot from the alternate root (/) submirror
ok boot disk2
2.       Determine which state database replicas and volumes are in an errored state, and save a copy of the output of the metadevice commands metadb and metastat.
#metastat d54
d54: Mirror
         Submirror 0: d4
           State: Needs maintenance 
         Submirror 1: d14
           State: Okay
         Resync in progress: 15 % done
         Pass: 1            
         Read option: roundrobin (default)
         Write option: parallel (default)
         Size: 2006130 blocks
#metastat -c
d54              m  5.0GB d4 d14
    d4           s  5.0GB c0t0d0s4
    d14          s  5.0GB c0t1d0s4
d53              m  8.0GB d3 d13
    d3           s  8.0GB c0t0d0s3
    d13          s  8.0GB c0t1d0s3
d56              m   24GB d6 d16
    d6           s   24GB c0t0d0s6
    d16          s   24GB c0t1d0s6
d55              m  5.0GB d5 d15
    d5           s  5.0GB c0t0d0s5
    d15          s  5.0GB c0t1d0s5
d51              m  8.0GB d1 d11
    d1           s  8.0GB c0t0d0s1
    d11          s  8.0GB c0t1d0s1
d50              m  8.0GB d0 d10
    d0           s  8.0GB c0t0d0s0
    d10          s  8.0GB c0t1d0s0

Needs maintenance: A problem has been detected. This requires that the system administrator replace the failed physical device. Volumes displaying Needs maintenance have incurred no data loss, although additional failures could risk data loss. Take action as quickly as possible.
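Every later step repeats the same mirror and submirror pairs, so it can help to derive them from the metastat -c listing instead of retyping them. The sketch below is one possible way to do that. The function name submirrors_on_disk is hypothetical (it is not an SVM command), and it assumes the column layout shown in the listing: name, type, size, then the components or the slice.

```shell
#!/bin/sh
# Hypothetical helper (not part of SVM): list the submirrors that live on one disk.
# Reads `metastat -c` output on stdin and prints "mirror submirror slice" lines.
# Assumes mirror lines have type "m" and submirror lines have type "s"
# with the slice name in the fourth column, as in the listing in this post.
submirrors_on_disk() {
    awk -v disk="$1" '
        $2 == "m" { mirror = $1; next }           # remember the parent mirror
        $2 == "s" && index($4, disk "s") == 1 {   # slice belongs to the given disk
            print mirror, $1, $4
        }
    '
}

# Example (on the Solaris host):
#   metastat -c | submirrors_on_disk c0t0d0
```

With the listing from this post, the function would print one line per submirror on c0t0d0, for example "d54 d4 c0t0d0s4".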
3.       Open an incident record in Peregrine and log a call with the vendor.
4.       Detach the submirrors on the bad hard drive using
#metadetach -f d54 d4
#metadetach -f d53 d3
#metadetach -f d56 d6
#metadetach -f d55 d5
#metadetach -f d51 d1
#metadetach -f d50 d0
5.       Remove the detached submirrors from SVM control using metaclear [the -f option deletes a metadevice even when it is in an error state].
#metaclear -f d4
#metaclear -f d3
#metaclear -f d6
#metaclear -f d5
#metaclear -f d1
#metaclear -f d0
6.       Check the status of the state database replicas
#metadb -i
        flags           first blk       block count
     a W   p  luo        16              8192            /dev/dsk/c0t0d0s7
     a W   p  luo        8208            8192            /dev/dsk/c0t0d0s7
     a W   p  luo        16400           8192            /dev/dsk/c0t0d0s7
     a m   p  luo        16              8192            /dev/dsk/c0t1d0s7
     a     p  luo        8208            8192            /dev/dsk/c0t1d0s7
     a     p  luo        16400           8192            /dev/dsk/c0t1d0s7

W - replica has device write errors
R - replica had device read errors
m - replica is master, this is replica selected as input
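Before deleting replicas it is worth confirming how many remain on the healthy disk. The following is a rough sketch for summarising the metadb listing per device. The name replica_summary is hypothetical, and the check assumes, as the W and R entries in the legend suggest, that upper-case flags mark replicas with errors.

```shell
#!/bin/sh
# Hypothetical helper (not part of SVM): summarise `metadb` output per device.
# Reads the listing on stdin. Assumes each replica line ends with
# "<first blk> <block count> <device>", as in the output shown in this post,
# and that any upper-case flag letter indicates an error on that replica.
replica_summary() {
    awk '
        $NF ~ /^\/dev\/dsk\// {
            dev = $NF
            total[dev]++
            flags = ""
            for (i = 1; i <= NF - 3; i++) flags = flags $i   # join the flag columns
            if (flags != tolower(flags)) bad[dev]++          # an upper-case flag is present
        }
        END {
            for (dev in total)
                printf "%s replicas=%d errored=%d\n", dev, total[dev], bad[dev] + 0
        }
    ' | sort
}

# Example (on the Solaris host):
#   metadb -i | replica_summary
```

The legend lines printed by metadb -i are ignored because they do not end in a /dev/dsk path.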
7.       Remove the state database replicas from the failed disk using
#metadb -f -d c0t0d0s7
8.       Save the partition table to a file [it can be taken from either the old disk or the working disk] [OPTIONAL]
#prtvtoc /dev/rdsk/cxtxdxs2 >vtoc.file
9.       Unconfigure the drive
#cfgadm -al
Ap_Id                          Type         Receptacle   Occupant     Condition
c0                             scsi-bus     connected    configured   unknown
c0::dsk/c0t0d0                 disk         connected    configured   unknown
c0::dsk/c0t1d0                 disk         connected    configured   unknown
#cfgadm  -c unconfigure c0::dsk/c0t0d0
The drive light turns blue
Pull the failed drive out
10.   Halt the system and replace the disk. Use the format command or the fmthard command to partition the disk as it was before the failure.
Or
Without halting, insert the new drive and configure the new disk
#cfgadm -c configure c0::dsk/c0t0d0
Now that the drive is configured and visible from within the format command, we can copy the partition table from the remaining mirror member to the new disk (c0t0d0)
#prtvtoc /dev/rdsk/c0t1d0s2 | fmthard -s - /dev/rdsk/c0t0d0s2
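After copying the label, the two partition tables can be compared to confirm that they match. This is only a sketch: the function names vtoc_slices and same_vtoc and the file names in the example are hypothetical, and it assumes that prtvtoc comment lines start with "*" and that the first six columns of each data line describe the slice.

```shell
#!/bin/sh
# Hypothetical helpers (not Solaris commands) for comparing two saved
# prtvtoc outputs. Comment lines (starting with "*") are dropped and only
# the first six columns (partition, tag, flags, first sector, sector count,
# last sector) are compared, so a mount directory column is ignored.
vtoc_slices() {
    grep -v '^\*' "$1" | awk 'NF >= 6 { print $1, $2, $3, $4, $5, $6 }'
}

same_vtoc() {
    [ "$(vtoc_slices "$1")" = "$(vtoc_slices "$2")" ]
}

# Example (on the Solaris host):
#   prtvtoc /dev/rdsk/c0t1d0s2 > /tmp/good.vtoc
#   prtvtoc /dev/rdsk/c0t0d0s2 > /tmp/new.vtoc
#   same_vtoc /tmp/good.vtoc /tmp/new.vtoc && echo "partition tables match"
```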
11.   Install bootblock on the new drive
#installboot /usr/platform/`uname -i`/lib/fs/ufs/bootblk /dev/rdsk/c0t0d0s2
12.   Initialise the disk for SVM usage by using
#metainit  -f d4 1 1 c0t0d0s4
#metainit  -f d3 1 1 c0t0d0s3
#metainit  -f d6 1 1 c0t0d0s6
#metainit  -f d5 1 1 c0t0d0s5
#metainit  -f d1 1 1 c0t0d0s1
#metainit  -f d0 1 1 c0t0d0s0
13.   Create state database replicas on the new disk by using
Note: whenever we use the -a option, the replica information is automatically updated in two files
a.       /kernel/drv/md.conf
b.      /etc/lvm/mddb.cf
Or
                We can define the replicas manually in the /etc/lvm/md.tab file. See the instructions in that file.
#metadb -a -c 3 c0t0d0s7
14.   Re-enable the submirrors by using
#metattach d54  d4
#metattach d53  d3
#metattach d56  d6
#metattach d55  d5
#metattach d51  d1
#metattach d50  d0
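The metainit and metattach lists follow mechanically from the mirror, submirror, and slice triples, so they can be printed for review rather than typed by hand. A minimal sketch, assuming input lines of the form "mirror submirror slice". The name rebuild_commands is hypothetical, and the function only prints the commands, it does not run them.

```shell
#!/bin/sh
# Hypothetical helper (not part of SVM): print the commands that recreate
# and reattach submirrors. Input on stdin: "mirror submirror slice" per line.
# Output: all metainit commands first, then all metattach commands, so that
# the state database replicas can be recreated between the two groups.
rebuild_commands() {
    awk '
        BEGIN { n = 0 }
        NF >= 3 {
            init[n]   = "metainit -f " $2 " 1 1 " $3   # one-way concat on the slice
            attach[n] = "metattach " $1 " " $2         # attach it to its mirror
            n++
        }
        END {
            for (i = 0; i < n; i++) print init[i]
            for (i = 0; i < n; i++) print attach[i]
        }
    '
}

# Example:
#   printf 'd54 d4 c0t0d0s4\nd50 d0 c0t0d0s0\n' | rebuild_commands
```

Printing the pairs from one table makes it harder to attach a submirror to the wrong mirror.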

15.   After the resync is done, boot the server from the repaired disk.
Check the sync status by using
#metastat d54
d54: Mirror
         Submirror 0: d4
           State: Resyncing
         Resync in progress: 15 % done
         Submirror 1: d14
           State: Okay
         Pass: 1         
         Read option: roundrobin (default)
         Write option: parallel (default)
         Size: 2006130 blocks
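To wait for the resync without rerunning metastat by hand, the progress line can be extracted in a loop. The sketch below assumes the "Resync in progress: N % done" wording shown in the output above. The names resync_percent and resync_done are hypothetical, and a missing progress line is treated as finished, so the submirror state should still be confirmed as Okay before rebooting.

```shell
#!/bin/sh
# Hypothetical helpers (not part of SVM). Both read metastat output on stdin.

# Print the percentage from every "Resync in progress: N % done" line.
resync_percent() {
    awk '/Resync in progress:/ {
        for (i = 1; i < NF; i++)
            if ($i == "progress:") print $(i + 1)
    }'
}

# Succeed when no resync progress line is present in the input.
resync_done() {
    [ -z "$(resync_percent)" ]
}

# Example (on the Solaris host):
#   until metastat d54 | resync_done; do sleep 60; done
#   metastat d54
```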
