Friday, 25 October 2013
Saturday, 5 October 2013
Replacing a failed bootdisk in VXVM
Replacing a failed bootdisk:-
In the following example, the host has a failed boot disk (c0t0d0). Fortunately, the system is using Veritas Volume Manager (VxVM), with the boot disk mirrored to c0t1d0. The following sequence of steps restores the system to full redundancy.
System fails to boot
When the system attempts to boot, it fails to find a valid device as required by the boot-device path at device alias "disk". It then attempts to boot from the network:
screen not found.
Can't open input device.
Keyboard not present. Using ttya for input and output.
Sun Ultra 30 UPA/PCI (UltraSPARC-II 296MHz), No Keyboard
OpenBoot 3.27, 512 MB memory installed, Serial #9377973.
Ethernet address 8:0:20:8f:18:b5, Host ID: 808f18b5.
Initializing Memory
Timeout waiting for ARP/RARP packet
Timeout waiting for ARP/RARP packet
Timeout waiting for ARP/RARP packet
...
Boot from mirror
At this point, the administrator realizes that the boot disk has failed, and queries the device aliases to find the one corresponding to the Veritas mirror:
ok devalias
vx-rootmirror /pci@1f,4000/scsi@3/disk@1,0:a
vx-rootdisk /pci@1f,4000/scsi@3/disk@0,0:a
. . .
The administrator then boots the system from the mirror device "vx-rootmirror":
ok boot vx-rootmirror
As the system boots, Veritas Volume Manager detects that the plexes on rootdisk are not accessible and detaches them from their volumes. In spite of this, the system boots cleanly from the mirror device with no operator action required.
ok boot vx-rootmirror
Boot device: /pci@1f,4000/scsi@3/disk@1,0:a File and args:
SunOS Release 5.8 Version Generic_108528-16 64-bit
Copyright 1983-2001 Sun Microsystems, Inc. All rights reserved.
Starting VxVM restore daemon...
VxVM starting in boot mode...
/usr/sbin/prtconf: getexecname() failed
vxvm:vxconfigd: WARNING: Detaching plex rootvol-01 from volume rootvol
vxvm:vxconfigd: WARNING: Disk rootdisk in group rootdg: Disk device not found
configuring IPv4 interfaces: hme0.
Hostname: pegasus
VxVM starting special volumes ( swapvol rootvol var )...
VxVM general startup...
dumpadm: no swap devices could be configured as the dump device
The system is coming up. Please wait.
starting rpc services: rpcbind done.
Setting netmask of hme0 to 255.255.255.0
Setting default IPv4 interface for multicast: add net 224.0/4: gateway pegasus
Starting sshd...
This platform does not support both privilege separation and compression
Compression disabled
syslog service starting.
savecore: no dump device configured
savecore: no dump device configured
dumpadm: no swap devices could be configured as the dump device
Oct 28 14:06:20 pegasus savecore: no dump device configured
Print services started.
/dev/bd.off: not a serial device.
volume management starting.
No VVR license installed on the system; vradmind not started.
No VVR license installed on the system; in.vxrsyncd not started.
The system is ready.
pegasus console login:
Check extent of failures
Once the boot is complete, the administrator logs into the system and checks the state of the disks and volumes. Note that the disk rootdisk (previously on device c0t0d0s2) is listed as "failed", and all plexes on that disk are listed as "DISABLED/NODEVICE".
pegasus console login: root
Password:
Last login: Mon Oct 28 12:27:20 on console
Oct 28 14:06:52 pegasus login: ROOT LOGIN /dev/console
Sun Microsystems Inc. SunOS 5.8 Generic February 2000
You have new mail.
# vxdisk list
DEVICE TYPE DISK GROUP STATUS
c0t1d0s2 sliced rootmirror rootdg online
- - rootdisk rootdg failed was:c0t0d0s2
# vxprint -ht
Disk group: rootdg
DG NAME NCONFIG NLOG MINORS GROUP-ID
DM NAME DEVICE TYPE PRIVLEN PUBLEN STATE
RV NAME RLINK_CNT KSTATE STATE PRIMARY DATAVOLS SRL
RL NAME RVG KSTATE STATE REM_HOST REM_DG REM_RLNK
V NAME RVG KSTATE STATE LENGTH READPOL PREFPLEX UTYPE
PL NAME VOLUME KSTATE STATE LENGTH LAYOUT NCOL/WID MODE
SD NAME PLEX DISK DISKOFFS LENGTH [COL/]OFF DEVICE MODE
SV NAME PLEX VOLNAME NVOLLAYR LENGTH [COL/]OFF AM/NM MODE
DC NAME PARENTVOL LOGVOL
SP NAME SNAPVOL DCO
dg rootdg default default 0 1035555399.1025.pegasus
dm rootdisk - - - - NODEVICE
dm rootmirror c0t1d0s2 sliced 3359 17690400 -
v rootvol - ENABLED ACTIVE 13423200 ROUND - root
pl rootvol-01 rootvol DISABLED NODEVICE 13423200 CONCAT - RW
sd rootdisk-B0 rootvol-01 rootdisk 17690399 1 0 - NDEV
sd rootdisk-02 rootvol-01 rootdisk 0 13423199 1 - NDEV
pl rootvol-02 rootvol ENABLED ACTIVE 13423200 CONCAT - RW
sd rootmirror-01 rootvol-02 rootmirror 0 13423200 0 c0t1d0 ENA
v swapvol - ENABLED ACTIVE 2100000 ROUND - swap
pl swapvol-01 swapvol DISABLED NODEVICE 2100000 CONCAT - WO
sd rootdisk-01 swapvol-01 rootdisk 13423199 2100000 0 - NDEV
pl swapvol-02 swapvol ENABLED ACTIVE 2100000 CONCAT - RW
sd rootmirror-02 swapvol-02 rootmirror 13423200 2100000 0 c0t1d0 ENA
v var - ENABLED ACTIVE 2100000 ROUND - fsgen
pl var-01 var DISABLED NODEVICE 2100000 CONCAT - WO
sd rootdisk-03 var-01 rootdisk 15523199 2100000 0 - NDEV
pl var-02 var ENABLED ACTIVE 2100000 CONCAT - RW
sd rootmirror-03 var-02 rootmirror 15523200 2100000 0 c0t1d0 ENA
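The same check can be scripted rather than read by eye. The following is a minimal sketch, assuming the column layout shown in the output above; the function names failed_disks and unhealthy_plexes are hypothetical helpers, not VxVM commands.

```shell
# Hypothetical helpers that filter the output shown above.
# They assume the whitespace-separated column layout of this VxVM release.

# Reads `vxdisk list` output on stdin.
# Prints "<disk media name> <last known device>" for every failed disk.
failed_disks() {
    awk '$5 == "failed" { sub(/^was:/, "", $6); print $3, $6 }'
}

# Reads `vxprint -ht` output on stdin.
# Prints "<plex> <volume> <KSTATE>/<STATE>" for every plex that is not ENABLED/ACTIVE.
unhealthy_plexes() {
    awk '$1 == "pl" && !($4 == "ENABLED" && $5 == "ACTIVE") { print $2, $3, $4 "/" $5 }'
}
```

On the host they would be used as "vxdisk list | failed_disks" and "vxprint -ht | unhealthy_plexes". For the output above, the first reports rootdisk on c0t0d0s2 and the second reports rootvol-01, swapvol-01 and var-01.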
Replace failed disk and restore redundancy
The administrator replaces the failed disk with a new disk of the same geometry. Depending on the system model, the disk replacement may require that the system be powered down. Once the operating system can "see" the new disk c0t0d0 via the format command, the administrator tells Veritas volume manager to rescan the system via the "vxdctl enable" command.
# format
Searching for disks...done
AVAILABLE DISK SELECTIONS:
0. c0t0d0 <SEAGATE-ST19171W-0024 cyl 5266 alt 2 hd 20 sec 168>
/pci@1f,4000/scsi@3/sd@0,0
1. c0t1d0 <SEAGATE-ST19171W-0024 cyl 5266 alt 2 hd 20 sec 168>
/pci@1f,4000/scsi@3/sd@1,0
Specify disk (enter its number): ^D
# vxdctl enable
# vxdisk list
DEVICE TYPE DISK GROUP STATUS
c0t0d0s2 sliced - - error
c0t1d0s2 sliced rootmirror rootdg online
- - rootdisk rootdg failed was:c0t0d0s2
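Since the replacement disk should have the same geometry as the mirror, it can help to compare the two geometry strings directly. This is only a sketch: disk_geometry is a hypothetical helper, and it assumes the disk listing format shown above, with the device name in the second column and the geometry between angle brackets.

```shell
# Hypothetical helper: print the geometry string (the text between < and >)
# that the `format` disk listing shows for one device.
# Reads the listing on stdin; the device name (for example c0t0d0) is $1.
disk_geometry() {
    awk -v dev="$1" '$2 == dev {
        if (match($0, /<[^>]*>/)) print substr($0, RSTART + 1, RLENGTH - 2)
    }'
}
```

Feeding it the listing twice, once per device (for example by saving the listing to a file first), and comparing the two results with test(1) shows whether the disks match. In the listing above both devices report "SEAGATE-ST19171W-0024 cyl 5266 alt 2 hd 20 sec 168".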
Now the administrator can use "vxdiskadm" to replace the boot disk: option 4 removes the failed disk for replacement while keeping its disk name, and option 5 assigns the new device to that name.
# vxdiskadm
Volume Manager Support Operations
Menu: VolumeManager/Disk
1 Add or initialize one or more disks
2 Encapsulate one or more disks
3 Remove a disk
4 Remove a disk for replacement
5 Replace a failed or removed disk
6 Mirror volumes on a disk
7 Move volumes from a disk
8 Enable access to (import) a disk group
9 Remove access to (deport) a disk group
10 Enable (online) a disk device
11 Disable (offline) a disk device
12 Mark a disk as a spare for a disk group
13 Turn off the spare flag on a disk
14 Unrelocate subdisks back to a disk
15 Exclude a disk from hot-relocation use
16 Make a disk available for hot-relocation use
17 Prevent multipathing/Suppress devices from VxVM's view
18 Allow multipathing/Unsuppress devices from VxVM's view
19 List currently suppressed/non-multipathed devices
20 Change the disk naming scheme
21 Get the newly connected/zoned disks in VxVM view
list List disk information
? Display help about menu
?? Display help about the menuing system
q Exit from menus
Select an operation to perform: 4
Remove a disk for replacement
Menu: VolumeManager/Disk/RemoveForReplace
Use this menu operation to remove a physical disk from a disk
group, while retaining the disk name. This changes the state
for the disk name to a "removed" disk. If there are any
initialized disks that are not part of a disk group, you will be
given the option of using one of these disks as a replacement.
Enter disk name [<disk>,list,q,?] list
Disk group: rootdg
DM NAME DEVICE TYPE PRIVLEN PUBLEN STATE
dm rootdisk - - - - NODEVICE
dm rootmirror c0t1d0s2 sliced 3359 17690400 -
Enter disk name [<disk>,list,q,?] rootdisk
The following volumes will lose mirrors as a result of this
operation:
rootvol swapvol var
No data on these volumes will be lost.
The requested operation is to remove disk rootdisk from disk group
rootdg. The disk name will be kept, along with any volumes using
the disk, allowing replacement of the disk.
Select "Replace a failed or removed disk" from the main menu
when you wish to replace the disk.
Continue with operation? [y,n,q,?] (default: y) y
Removal of disk rootdisk completed successfully.
Remove another disk? [y,n,q,?] (default: n) n
Volume Manager Support Operations
Menu: VolumeManager/Disk
1 Add or initialize one or more disks
2 Encapsulate one or more disks
3 Remove a disk
4 Remove a disk for replacement
5 Replace a failed or removed disk
6 Mirror volumes on a disk
7 Move volumes from a disk
8 Enable access to (import) a disk group
9 Remove access to (deport) a disk group
10 Enable (online) a disk device
11 Disable (offline) a disk device
12 Mark a disk as a spare for a disk group
13 Turn off the spare flag on a disk
14 Unrelocate subdisks back to a disk
15 Exclude a disk from hot-relocation use
16 Make a disk available for hot-relocation use
17 Prevent multipathing/Suppress devices from VxVM's view
18 Allow multipathing/Unsuppress devices from VxVM's view
19 List currently suppressed/non-multipathed devices
20 Change the disk naming scheme
21 Get the newly connected/zoned disks in VxVM view
list List disk information
? Display help about menu
?? Display help about the menuing system
q Exit from menus
Select an operation to perform: 5
Replace a failed or removed disk
Menu: VolumeManager/Disk/ReplaceDisk
Use this menu operation to specify a replacement disk for a disk
that you removed with the "Remove a disk for replacement" menu
operation, or that failed during use. You will be prompted for
a disk name to replace and a disk device to use as a replacement.
You can choose an uninitialized disk, in which case the disk will
be initialized, or you can choose a disk that you have already
initialized using the Add or initialize a disk menu operation.
Select a removed or failed disk [<disk>,list,q,?] list
Disk group: rootdg
DM NAME DEVICE TYPE PRIVLEN PUBLEN STATE
dm rootdisk - - - - REMOVED
Select a removed or failed disk [<disk>,list,q,?] rootdisk
Select disk device to initialize [<address>,list,q,?] list
DEVICE DISK GROUP STATUS
c0t0d0 - - error
c0t1d0 rootmirror rootdg online
Select disk device to initialize [<address>,list,q,?] c0t0d0
The following disk device has a valid VTOC, but does not appear to have
been initialized for the Volume Manager. If there is data on the disk
that should NOT be destroyed you should encapsulate the existing disk
partitions as volumes instead of adding the disk as a new disk.
Output format: [Device_Name]
c0t0d0
Encapsulate this device? [y,n,q,?] (default: y) n
c0t0d0
Instead of encapsulating, initialize? [y,n,q,?] (default: n) y
The requested operation is to initialize disk device c0t0d0 and
to then use that device to replace the removed or failed disk
rootdisk in disk group rootdg.
Continue with operation? [y,n,q,?] (default: y)
Replacement of disk rootdisk in group rootdg with disk device
c0t0d0 completed successfully.
Replace another disk? [y,n,q,?] (default: n) n
Volume Manager Support Operations
Menu: VolumeManager/Disk
1 Add or initialize one or more disks
2 Encapsulate one or more disks
3 Remove a disk
4 Remove a disk for replacement
5 Replace a failed or removed disk
6 Mirror volumes on a disk
7 Move volumes from a disk
8 Enable access to (import) a disk group
9 Remove access to (deport) a disk group
10 Enable (online) a disk device
11 Disable (offline) a disk device
12 Mark a disk as a spare for a disk group
13 Turn off the spare flag on a disk
14 Unrelocate subdisks back to a disk
15 Exclude a disk from hot-relocation use
16 Make a disk available for hot-relocation use
17 Prevent multipathing/Suppress devices from VxVM's view
18 Allow multipathing/Unsuppress devices from VxVM's view
19 List currently suppressed/non-multipathed devices
20 Change the disk naming scheme
21 Get the newly connected/zoned disks in VxVM view
list List disk information
? Display help about menu
?? Display help about the menuing system
q Exit from menus
Select an operation to perform: q
Goodbye.
With the disk replaced in Veritas Volume Manager, the disk device is now listed as "online", and VxVM is attaching the replacement plexes to the original volumes.
# vxdisk list
DEVICE TYPE DISK GROUP STATUS
c0t0d0s2 sliced rootdisk rootdg online
c0t1d0s2 sliced rootmirror rootdg online
# vxprint -ht
Disk group: rootdg
DG NAME NCONFIG NLOG MINORS GROUP-ID
DM NAME DEVICE TYPE PRIVLEN PUBLEN STATE
RV NAME RLINK_CNT KSTATE STATE PRIMARY DATAVOLS SRL
RL NAME RVG KSTATE STATE REM_HOST REM_DG REM_RLNK
V NAME RVG KSTATE STATE LENGTH READPOL PREFPLEX UTYPE
PL NAME VOLUME KSTATE STATE LENGTH LAYOUT NCOL/WID MODE
SD NAME PLEX DISK DISKOFFS LENGTH [COL/]OFF DEVICE MODE
SV NAME PLEX VOLNAME NVOLLAYR LENGTH [COL/]OFF AM/NM MODE
DC NAME PARENTVOL LOGVOL
SP NAME SNAPVOL DCO
dg rootdg default default 0 1035555399.1025.pegasus
dm rootdisk c0t0d0s2 sliced 3359 17690400 -
dm rootmirror c0t1d0s2 sliced 3359 17690400 -
v rootvol - ENABLED ACTIVE 13423200 ROUND - root
pl rootvol-01 rootvol ENABLED STALE 13423200 CONCAT - WO
sd rootdisk-05 rootvol-01 rootdisk 2100000 13423200 0 c0t0d0 ENA
pl rootvol-02 rootvol ENABLED ACTIVE 13423200 CONCAT - RW
sd rootmirror-01 rootvol-02 rootmirror 0 13423200 0 c0t1d0 ENA
v swapvol - ENABLED ACTIVE 2100000 ROUND - swap
pl swapvol-01 swapvol DISABLED RECOVER 2100000 CONCAT - WO
sd rootdisk-06 swapvol-01 rootdisk 15523200 2100000 0 c0t0d0 ENA
pl swapvol-02 swapvol ENABLED ACTIVE 2100000 CONCAT - RW
sd rootmirror-02 swapvol-02 rootmirror 13423200 2100000 0 c0t1d0 ENA
v var - ENABLED ACTIVE 2100000 ROUND - fsgen
pl var-01 var DISABLED RECOVER 2100000 CONCAT - WO
sd rootdisk-04 var-01 rootdisk 0 2100000 0 c0t0d0 ENA
pl var-02 var ENABLED ACTIVE 2100000 CONCAT - RW
sd rootmirror-03 var-02 rootmirror 15523200 2100000 0 c0t1d0 ENA
# vxtask list
TASKID PTID TYPE/STATE PCT PROGRESS
161 PARENT/R 0.00% 3/0(1) VXRECOVER rootdisk
162 162 ATCOPY/R 01.22% 0/13423200/163680 PLXATT rootvol rootvol-01
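The PROGRESS column of the plex attach task appears to hold start/end/current offsets, so the percentage can be recomputed from it: (163680 - 0) / (13423200 - 0) is about 1.22%, which matches the PCT column. The sketch below assumes that interpretation and the layout shown above; resync_progress is a hypothetical helper name.

```shell
# Hypothetical helper: reads `vxtask list` output on stdin and prints
# "<plex> <percent done>" for each plex attach (PLXATT) task.
# Assumes the field just before PLXATT is start/end/current, as in the output above.
resync_progress() {
    awk '{
        for (i = 2; i <= NF; i++) {
            if ($i == "PLXATT" && split($(i - 1), p, "/") == 3 && p[2] > p[1]) {
                printf "%s %.2f\n", $NF, (p[3] - p[1]) / (p[2] - p[1]) * 100
            }
        }
    }'
}
```

Running "vxtask list | resync_progress" from time to time gives a rough idea of how far the resynchronization has progressed. An empty result means no plex attach task is running.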
After about an hour, all of the plexes have been synchronized, and full operating system redundancy has been restored:
# vxtask list
TASKID PTID TYPE/STATE PCT PROGRESS
# vxprint -ht
Disk group: rootdg
DG NAME NCONFIG NLOG MINORS GROUP-ID
DM NAME DEVICE TYPE PRIVLEN PUBLEN STATE
RV NAME RLINK_CNT KSTATE STATE PRIMARY DATAVOLS SRL
RL NAME RVG KSTATE STATE REM_HOST REM_DG REM_RLNK
V NAME RVG KSTATE STATE LENGTH READPOL PREFPLEX UTYPE
PL NAME VOLUME KSTATE STATE LENGTH LAYOUT NCOL/WID MODE
SD NAME PLEX DISK DISKOFFS LENGTH [COL/]OFF DEVICE MODE
SV NAME PLEX VOLNAME NVOLLAYR LENGTH [COL/]OFF AM/NM MODE
DC NAME PARENTVOL LOGVOL
SP NAME SNAPVOL DCO
dg rootdg default default 0 1035555399.1025.pegasus
dm rootdisk c0t0d0s2 sliced 3359 17690400 -
dm rootmirror c0t1d0s2 sliced 3359 17690400 -
v rootvol - ENABLED ACTIVE 13423200 ROUND - root
pl rootvol-01 rootvol ENABLED ACTIVE 13423200 CONCAT - RW
sd rootdisk-05 rootvol-01 rootdisk 2100000 13423200 0 c0t0d0 ENA
pl rootvol-02 rootvol ENABLED ACTIVE 13423200 CONCAT - RW
sd rootmirror-01 rootvol-02 rootmirror 0 13423200 0 c0t1d0 ENA
v swapvol - ENABLED ACTIVE 2100000 ROUND - swap
pl swapvol-01 swapvol ENABLED ACTIVE 2100000 CONCAT - RW
sd rootdisk-06 swapvol-01 rootdisk 15523200 2100000 0 c0t0d0 ENA
pl swapvol-02 swapvol ENABLED ACTIVE 2100000 CONCAT - RW
sd rootmirror-02 swapvol-02 rootmirror 13423200 2100000 0 c0t1d0 ENA
v var - ENABLED ACTIVE 2100000 ROUND - fsgen
pl var-01 var ENABLED ACTIVE 2100000 CONCAT - RW
sd rootdisk-04 var-01 rootdisk 0 2100000 0 c0t0d0 ENA
pl var-02 var ENABLED ACTIVE 2100000 CONCAT - RW
sd rootmirror-03 var-02 rootmirror 15523200 2100000 0 c0t1d0 ENA
Replace failed disk in VXVM
To replace a failed disk under Veritas Volume Manager, follow this procedure:
1. Identify the failed disk with 'vxdisk list'.
2. Run the 'format' command to see whether the disk is reported as offline or as not responding to selection.
3. Log a service call with the hardware vendor.
4. Remove the failed disk from Volume Manager control:
a. Run 'vxdiskadm' as root.
b. Choose option 4: Remove a disk for replacement.
c. Choose the logical name corresponding to the disk that has failed (for example, data02).
5. Have the vendor replace the disk.
6. Make sure the new disk appears in the 'format' output (there is no need to partition it).
7. Run 'vxdctl enable' so that vxconfigd detects the replaced device.
8. Run 'vxdiskadm' again and follow the steps below:
a. Choose option 5: Replace a failed or removed disk.
b. Choose the disk that was removed in step 4c (for example, data02).
c. Choose the device corresponding to the logical name (for example, c1t10d0).
d. Answer no to 'encapsulate' and yes to initialising the disk to replace the failed one.
e. Accept the default (no) for the FMR plex resync option.
f. Once the "completed successfully" message appears, exit vxdiskadm.
9. Check that the disks are online by running 'vxdisk list', and check the plex states with 'vxprint -ht'.
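For reference, the menu-driven procedure above can be approximated from the command line. The sketch below only prints the commands; it does not run them. The command sequence is an assumption about what vxdiskadm options 4 and 5 do for a data disk and should be checked against the vxdg, vxdisksetup and vxrecover manual pages for the installed release. The function name replace_disk_plan is hypothetical.

```shell
# Hypothetical helper: print (do not execute) a command-line equivalent of
# the vxdiskadm steps above.
#   $1 = disk group, $2 = disk media name, $3 = replacement device
# The sequence is an assumption; verify it against the man pages before use.
replace_disk_plan() {
    dg=$1 dm=$2 dev=$3
    echo "vxdg -g $dg -k rmdisk $dm"          # remove the disk but keep its name
    echo "vxdctl enable"                      # rescan after the hardware swap
    echo "/etc/vx/bin/vxdisksetup -i $dev"    # initialise the new device
    echo "vxdg -g $dg -k adddisk $dm=$dev"    # reuse the old disk media name
    echo "vxrecover -g $dg -sb $dm"           # start and resync volumes in the background
}
```

For example, "replace_disk_plan datadg data02 c1t10d0" prints the five commands for the data02 example used in the steps above. A boot disk needs additional handling, such as the boot block, which this sketch does not cover.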
Moving hot-relocated subdisks back to the original disk
# vxdiskadm
Choose option 14
Move hot-relocated subdisks back to a disk
Menu: VolumeManager/Disk/UnrelocateDisk
Use this operation to move subdisks which were hot-relocated back
onto the original disk that has been replaced due to a disk failure.
This operation takes, as input, the original disk name. If the
failed drive was replaced with a disk using a different name, this
operation also provides an option to specify the new name.
Enter the original disk name [<disk>,list,q,?] list
datadg0211
datadg03
Enter the original disk name [<disk>,list,q,?] datadg0211
Unrelocate to a new disk [y,n,q,?] (default: n)
Requested operation is to move all the subdisks which were hot-relocated
from datadg0211 back to datadg0211 of disk group datadg02.
Continue with operation? [y,n,q,?] (default: y)
Use -f option to unrelocate the subdisks if moving to the exact offset fails?
[y,n,q,?] (default: n)
Thanks To Elumalai M