Thursday, November 14, 2013

plex in RECOVER state, subdisk in RELOCATE state

After a 'crash', the plexes were in the RECOVER state and one subdisk was in the RELOCATE state (the customer had updated and rebooted his storage arrays while they were online, which caused Veritas to fail).
The system was running HP-UX 11.31. The situation looked like this:

# vxprint -g dbdg
...
v  dbvol        fsgen        DISABLED 251658240 -     ACTIVE   -       -
pl dbvol-02     dbvol        DISABLED 251658240 -     RECOVER  -       -
sd vnx2_lun0-02 dbvol-02     ENABLED  251658240 0     -        -       -
pl dbvol-03     dbvol        DISABLED 251658240 -     RECOVER  -       -
sd vnx1_lun0-02 dbvol-03     ENABLED  251658240 0     RELOCATE -       -
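
On a disk group with many volumes it can help to filter the vxprint output down to the records that need attention. The following is only a minimal sketch: vx_problem_records is a hypothetical helper name (not a VxVM command), and it assumes the nine-column vxprint layout shown above, where STATE is the 7th field.

```shell
# Sketch: list plex (pl) and subdisk (sd) records whose STATE column
# (7th field in the vxprint layout shown above) is neither "-" nor ACTIVE.
# vx_problem_records is a hypothetical helper name, not part of VxVM.
vx_problem_records() {
    awk '($1 == "pl" || $1 == "sd") && $7 != "-" && $7 != "ACTIVE" {
        print $1, $2, $7
    }'
}

# Usage on a live system (assumes the disk group is named dbdg):
#   vxprint -g dbdg | vx_problem_records
```

For the output above this would report both plexes (RECOVER) and the subdisk vnx1_lun0-02 (RELOCATE).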


Now try to set the plex (dbvol-02) that has the available subdisk (vnx2_lun0-02) to the STALE state:

# vxmend -g dbdg fix stale dbvol-02
# vxprint -g dbdg
...
v  dbvol        fsgen        DISABLED 251658240 -     ACTIVE   -       -
pl dbvol-02     dbvol        DISABLED 251658240 -     STALE    -       -
sd vnx2_lun0-02 dbvol-02     ENABLED  251658240 0     -        -       -
pl dbvol-03     dbvol        DISABLED 251658240 -     RECOVER  -       -
sd vnx1_lun0-02 dbvol-03     ENABLED  251658240 0     RELOCATE -       -


Once the plex dbvol-02 is in the STALE state, try to set it to the CLEAN state:

# vxmend -g dbdg fix clean dbvol-02
# vxprint -g dbdg
...
v  dbvol        fsgen        DISABLED 251658240 -     ACTIVE   -       -
pl dbvol-02     dbvol        DISABLED 251658240 -     CLEAN    -       -
sd vnx2_lun0-02 dbvol-02     ENABLED  251658240 0     -        -       -
pl dbvol-03     dbvol        DISABLED 251658240 -     RECOVER  -       -
sd vnx1_lun0-02 dbvol-03     ENABLED  251658240 0     RELOCATE -       -
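
The two vxmend steps above could be wrapped into one small function. This is a sketch, not a tested recovery tool: plex_stale_then_clean and the VX_RUN variable are hypothetical names introduced here. By default the commands are only printed, so the sequence can be reviewed before anything is changed. Which plex holds the good data is something you have to decide yourself.

```shell
# Sketch: mark a plex STALE, then CLEAN, as in the two steps above.
# plex_stale_then_clean and VX_RUN are hypothetical names for illustration.
# If VX_RUN is unset, the commands are only echoed (dry run).
# Set VX_RUN to an empty string to really execute them.
plex_stale_then_clean() {
    dg=$1      # disk group, e.g. dbdg
    plex=$2    # plex that holds the good data, e.g. dbvol-02
    ${VX_RUN-echo} vxmend -g "$dg" fix stale "$plex" &&
    ${VX_RUN-echo} vxmend -g "$dg" fix clean "$plex"
}

# Dry run:   plex_stale_then_clean dbdg dbvol-02
# Real run:  VX_RUN= plex_stale_then_clean dbdg dbvol-02
```

The second vxmend is only attempted if the first one succeeded.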


Next try to start the volume (this might take some time):

# vxvol -g dbdg start dbvol   
...


In another console, check for Veritas tasks and run vxprint again; note the atomic copy (ATCOPY) in progress:

# vxtask list
TASKID  PTID TYPE/STATE    PCT   PROGRESS
   275           ATCOPY/R 08.56% 0/251658240/21546240 PLXATT dbvol dbvol-03 dbdg
# vxprint -g dbdg
...
v  dbvol        fsgen        DISABLED 251658240 -     ACTIVE   ATT1    -
pl dbvol-02     dbvol        DISABLED 251658240 -     ACTIVE   -       -
sd vnx2_lun0-02 dbvol-02     ENABLED  251658240 0     -        -       -
pl dbvol-03     dbvol        DISABLED 251658240 -     STALE    ATT     -
sd vnx1_lun0-02 dbvol-03     ENABLED  251658240 0     RELOCATE -       -
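
To avoid rerunning vxtask by hand, the progress of the atomic copy can be pulled out of the task list with a small filter. This is a sketch under the assumption that the vxtask output looks like the sample above; atcopy_progress is a hypothetical helper name.

```shell
# Sketch: print "<taskid> <percent>" for every ATCOPY task found in
# `vxtask list` output read from stdin.
# atcopy_progress is a hypothetical helper name, not part of VxVM.
atcopy_progress() {
    awk '{
        # The PTID column may be empty, so search for the TYPE/STATE field
        # instead of relying on a fixed field position.
        for (i = 2; i < NF; i++) {
            if ($i ~ /^ATCOPY/) {
                pct = $(i + 1)
                sub(/%$/, "", pct)
                print $1, pct
            }
        }
    }'
}

# Possible polling loop on a live system (60 seconds is an arbitrary choice):
#   while vxtask list | atcopy_progress | grep -q .; do sleep 60; done
```

The loop ends as soon as no ATCOPY task is listed any more, which is the point where the state is rechecked with vxprint.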


When the atomic copy is done, recheck the state of the volume with vxprint. The volume and the plexes should be in the ACTIVE state:

# vxprint -g dbdg
...
v  dbvol        fsgen        DISABLED 251658240 -     ACTIVE   -       -
pl dbvol-02     dbvol        DISABLED 251658240 -     ACTIVE   -       -
sd vnx2_lun0-02 dbvol-02     ENABLED  251658240 0     -        -       -
pl dbvol-03     dbvol        DISABLED 251658240 -     ACTIVE   -       -
sd vnx1_lun0-02 dbvol-03     ENABLED  251658240 0     -        -       -


Finally try to mount the volume:

# mount -F vxfs /dev/vx/dsk/dbdg/dbvol /u01
UX:vxfs mount: ERROR: V-3-21268: /dev/vx/dsk/dbdg/dbvol is corrupted. needs checking


If you get the above error message, run fsck on the volume's raw device:

# fsck -F vxfs -y /dev/vx/rdsk/dbdg/dbvol
log replay in progress
replay complete - marking super-block as CLEAN


Now you should be able to mount the filesystem:

# mount -F vxfs /dev/vx/dsk/dbdg/dbvol /u01

See also:
http://schweitzeraaron.blogspot.com/2010/09/vxvm-recovering-volumes-after-disk.html