Module 3: Troubleshooting Non-Bootable LVM Disks and LVM Structures

Recovering Lost or Damaged Structures

1. Device files (restore from tape backup, or re-create using insf and mknod)

Re-create PV device files:
    # ioscan -C disk
    # insf -eC disk

Re-create a VG device file:
    # mkdir /dev/vg01
    # mknod /dev/vg01/group c 64 0x010000
    # chmod 755 /dev/vg01

Re-create LV device files:
    # mknod /dev/vg01/data b 64 0x010001
    # mknod /dev/vg01/rdata c 64 0x010001
    # chmod 640 /dev/vg01/*

If you are not sure which logical volume names and minor numbers are already in use in the VG, a script can create the device files for all 255 possible logical volumes:

    #!/usr/bin/sh
    # Create the group file plus block and raw device files for LV minors 01..ff.
    vgname=vg01
    vgnum=01
    mkdir /dev/$vgname
    for m in 0 1 2 3 4 5 6 7 8 9 a b c d e f
    do
      for n in 0 1 2 3 4 5 6 7 8 9 a b c d e f
      do
        if [ "$m$n" = "00" ]; then
          # minor 00 is the VG group file
          mknod /dev/$vgname/group c 64 0x${vgnum}00$m$n
        else
          mknod /dev/$vgname/lvol$m$n b 64 0x${vgnum}00$m$n
          mknod /dev/$vgname/rlvol$m$n c 64 0x${vgnum}00$m$n
        fi
      done
    done

Then run vgdisplay -v. Logical volumes with size 0M may be safely removed. Mount the remaining logical volumes to determine their contents and guess their names. Then unmount them and rename the device files using the mv command:
    # mv /dev/vg01/lvol01 /dev/vg01/data
    # mv /dev/vg01/rlvol01 /dev/vg01/rdata

2. /etc/lvmtab (restore from tape, or rebuild from the information in the PVRAs/VGRAs via vgscan)

If /etc/lvmtab is corrupt:
    # cp /etc/lvmtab /etc/lvmtab.bkp
    # vgscan -p        (preview the likely result first)
    # vgscan

3. Kernel structures (re-scan the PVRAs/VGRAs using vgchange -a y)

4. PVRA/VGRA (restore from the /etc/lvmconf backup directory)

Side effects of vgscan and how to correct them

1. Run vgchange -a y to activate all volume groups.
2. Use lvlnboot with the -R option to correct the boot information on disk.
3. Use vgreduce to remove any alternate links that vgscan added to /etc/lvmtab but that are not needed.
4. If the original primary path of a disk has become an alternate path after /etc/lvmtab was reconstructed, the order can be reverted by using vgreduce to remove the primary path and vgextend to add the path back again.
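The minor numbers in the mknod examples above appear to follow the pattern 0xNN00MM, where NN is the volume group number and MM the logical volume number, both in hex (MM = 00 being the group file). The following is a minimal sketch under that assumption. The function names lvm_minor and print_mknod_plan are hypothetical, and the script only prints the mknod commands so they can be reviewed before anything is created.

```shell
# Dry-run helper: prints mknod commands, creates nothing.
# Assumption: minor numbers follow 0xNN00MM (NN = VG number, MM = LV number),
# as inferred from the vg01 examples in these notes.

# lvm_minor VGNUM LVNUM  (decimal arguments) -> e.g. 0x010001
lvm_minor() {
    printf '0x%02x00%02x\n' "$1" "$2"
}

# print_mknod_plan VGNAME VGNUM LVNAME LVNUM
print_mknod_plan() {
    vgname=$1 vgnum=$2 lvname=$3 lvnum=$4
    echo "mkdir /dev/$vgname"
    # LV number 0 is reserved for the VG group file
    echo "mknod /dev/$vgname/group c 64 $(lvm_minor "$vgnum" 0)"
    # block device, then the matching raw (character) device
    echo "mknod /dev/$vgname/$lvname b 64 $(lvm_minor "$vgnum" "$lvnum")"
    echo "mknod /dev/$vgname/r$lvname c 64 $(lvm_minor "$vgnum" "$lvnum")"
}

print_mknod_plan vg01 1 data 1
```

Printing the plan first makes it easy to compare the generated minor numbers against ll /dev/vg01 output on a healthy system before running the commands as root.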
When a volume group is activated, LVM uses the PVRA/VGRA structures on the disks that make up the volume group to build the LVM kernel structures. The LVM pseudo-driver then uses these kernel structures to track the system's LVM configuration. Commands such as vgdisplay and pvdisplay read the LVM kernel structures.

The kernel structures need to be re-scanned with vgchange -a y in two cases:
- When a PV that was non-responsive at activation time comes back online
- When a failed or corrupted PV is replaced without deactivating/reactivating the VG

    # ioscan
    # vgchange -a y vg01
    # vgdisplay -v vg01

Failed Disks

    # tail /var/adm/syslog/syslog.log

Look for POWERFAILED messages. If the log also contains "pv[#] returned to vg[#]" messages, the likely cause is heavy I/O combined with a timeout value that is too small. Raise the timeout:
    # pvchange -t 180 /dev/dsk/disk_device

lbolt messages in syslog.log:
    # tail /var/adm/syslog/syslog.log
    vmunix: SCSI: Request timeout; Abort -- lbolt: 137056, dev: 1f070500

The dev value decodes as follows:
    1f   hex for decimal 31; lsdev shows that 31 is the block major number of the sdisk SCSI driver
    07   controller c7 (two digits)
    0    target t0 (one digit)
    5    the d number, here d5 (a hex digit, so values up to d15 are possible)
    00   the last two digits relate to tape devices; for disks they are always 00

If vgdisplay and pvdisplay say "couldn't query disk …", that typically means the VGRA/PVRA headers were not accessible when the volume group was activated. See if you can access the disk using the dd command. If you get an error message, or if dd does not successfully read all 1024 records, the disk may be physically damaged.
    # dd if=/dev/rdsk/c0t1d0 of=/dev/null bs=1024k count=1024
    1024+0 records in
    1024+0 records out

Activating a Volume Group That Contains Failed Disks

Override quorum to activate the volume group:
    # vgchange -a y -q n vg01

This allows the VG to activate, but non-redundant LVs may be unavailable. For the VG to work normally without data loss, the failed disk must be recovered.

When the boot disk is mirrored, quorum can also be overridden at boot time:
    ISL> hpux -lq             (PA-RISC)
    HPUX> boot vmunix -lq     (IPF)

Restoring a Powerfailed Disk

If a disk powerfails after activation, no administrator intervention is required when the disk returns.
If a disk is powerfailed before activation, run ioscan, vgchange, and vgsync when the disk returns:
    # ioscan -fC disk
    # vgchange -a y vg01
    # vgsync vg01

Replacing a Failed Disk

1. Restore critical LVM structures
2. Recreate FS metadata
3. Restore user data

Back up the PVRA/VGRA ahead of time with vgcfgbackup:
    # vgcfgbackup vg01                      (back up the PVRA/VGRA to the default directory, /etc/lvmconf)
    # vgcfgbackup -f /tmp/vg01.conf vg01    (back up the PVRA/VGRA to a specified file)

When a disk is missing from the VG, creating, extending, or reducing a logical volume requires the -A n option; otherwise the command fails because a physical volume is missing. Doing this is not recommended.

Restoring the PVRA/VGRA

vgcfgrestore may be used to restore the PVRA/VGRA to a replacement disk:
    # vgcfgrestore -n vg01 /dev/rdsk/c*t*d*
    # vgcfgrestore -f /tmp/vg01.conf /dev/rdsk/c*t*d*

Restoring the PVRA/VGRA: Complete Procedure

1. Scan for the replacement disk and create its device files:
    # ioscan -fC disk ; insf -C disk

2. Restore the PVRA/VGRA headers to the replacement disk:
    # vgcfgrestore -n vg01 [-o /dev/dsk/c#t#d#] /dev/rdsk/c*t*d*
   The -o option gives the path of the original disk; the last argument is the path of the new disk. If the new disk is at the same hardware path as the original, -o can be omitted. When the disk is replaced at the same hardware path, step 4 can be skipped.

3. Activate the volume group:
    # vgchange -a y vg01

4. Extra steps are required if the replacement disk is at a different hardware location. The old hardware path must be removed from /etc/lvmtab via the vgexport command, and the new hardware path must be added via the vgimport command. Note that file systems in the volume group must be unmounted before you deactivate the volume group.
    # vgchange -a n vg01
    # vgexport -s -v -m /etc/lvmconf/vg01.map vg01
    # mkdir /dev/vg01
    # mknod /dev/vg01/group c 64 0x010000
    # vgimport -s -v -m /etc/lvmconf/vg01.map vg01
    # vgchange -a y vg01

5. Extra steps are also required if the replacement disk is larger than the original disk. Move all of the extents off the disk, take the disk out of the volume group, then add it back in again. If you wish, you can move the logical volumes back to the replacement disk after the vgextend.
    # pvmove -n /dev/vg01/data /dev/dsk/cxtxdx /dev/dsk/cytydy
    # vgreduce vg01 /dev/dsk/cxtxdx
    # pvcreate -f /dev/rdsk/cxtxdx
    # vgextend vg01 /dev/dsk/cxtxdx

6. Proceed with file system and user data recovery. The procedure described here only restores the LVM headers. File system metadata and user data must be restored using the separate procedures described in the pages that follow.

Restoring Unmirrored File System Data

1. Restore critical LVM structures using vgcfgrestore
2. Recreate FS metadata using newfs
3. Restore user data from backup tape (frecover -rv)

Restoring Mirrored File System Data

After a mirrored physical volume has been replaced and vgcfgrestored, the mirrors must be resynchronized.

"Stale" extents are resynchronized automatically if:
- the VG is deactivated and reactivated, or
- the PV simply powerfailed, then returned

"Stale" extents must be resynchronized manually with vgsync or lvsync if:
- a PV is replaced without deactivating/reactivating the VG
- a PV is powerfailed when the VG is initially activated, but later returns

To resynchronize manually:
    # vgchange -a y vg01
    # vgsync vg01
  or
    # lvsync /dev/vg01/data

Verify that all extents are synchronized:
    # lvdisplay -v /dev/vg01/data

Removing Corrupted Physical Volumes

1. Obtain a list of LVs in the VG and determine which ones touch the non-responsive disk:
    # vgdisplay -v vg01
    # lvdisplay -v /dev/vg01/data | grep "???"

2. Remove these LVs from the VG:
    # lvremove -A n /dev/vg01/data

3. Forcefully remove pointers to non-responsive disks from the other disks' VGRAs:
    # vgreduce -f vg01
   If this command fails, or if vgdisplay reports Cur PV and Act PV values that differ, fall back to vgexport and vgimport. This fallback is not recommended for vg00.
   a. Find the VG's minor number with ll /dev/vg01/group, and list the physical volumes in the VG with vgdisplay -v vg01.
   b. Deactivate the VG, which vgexport requires:
       # vgchange -a n vg01
   c. # vgexport -m /etc/lvmconf/vg01.map vg01
   d. # mkdir /dev/vg01
      # mknod /dev/vg01/group c 64 0x010000
   e. Import the VG, listing only the good PVs so that the damaged PV is excluded:
       # vgimport -m /etc/lvmconf/vg01.map vg01 /dev/dsk/c0t0d0
   f. Reactivate the VG (vgchange -a y vg01) and retry vgreduce -f vg01.

4. Rebuild /etc/lvmtab to reflect the new configuration:
    # mv /etc/lvmtab /etc/lvmtab.bkp
    # vgscan

5. Back up the new configuration:
    # vgcfgbackup vg01

6. If you removed any LVs in step 2, recreate them and restore their data from tape:
    # lvcreate -L 32 -n data vg01
    # newfs /dev/vg01/rdata
    # mount -a
    # frecover -xv -i /data

Removing Corrupted Volume Groups

1. umount /data
2. vi /etc/fstab
3. vgchange -a n vg01
4. vgexport vg01
5. dd if=/dev/zero of=/dev/rdsk/cxtxdx bs=8192
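The vgexport/vgimport fallback in step 3 of "Removing Corrupted Physical Volumes" is easy to get wrong under pressure, so it can help to print the sequence for review first. The following is a minimal dry-run sketch, not a tested recovery tool. The function name print_reimport_plan is hypothetical. It assumes the VG is deactivated before vgexport, that the same map file is used for export and import, and that the group file minor number has the form 0xNN0000.

```shell
# Dry-run sketch: prints the export/import sequence, runs nothing.
# Assumptions: the VG is deactivated before vgexport, and one map file
# (/etc/lvmconf/<vg>.map) is used for both vgexport and vgimport.

# print_reimport_plan VGNAME VGNUM GOOD_PV...
#   VGNUM is the decimal VG number; GOOD_PV... are the healthy disks to keep.
print_reimport_plan() {
    vg=$1 vgnum=$2
    shift 2
    map=/etc/lvmconf/$vg.map
    echo "vgchange -a n $vg"
    echo "vgexport -m $map $vg"
    echo "mkdir /dev/$vg"
    printf 'mknod /dev/%s/group c 64 0x%02x0000\n' "$vg" "$vgnum"
    # only the healthy PVs are listed, which leaves the damaged one out
    echo "vgimport -m $map $vg $*"
    echo "vgchange -a y $vg"
    echo "vgreduce -f $vg"
}

print_reimport_plan vg01 1 /dev/dsk/c0t0d0
```

Passing the healthy disks explicitly mirrors the vgimport step in the notes, where the damaged PV is excluded simply by not naming it.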