Monday, March 26, 2012

Update Firmware on 9111-285 (power5 System)

Download the required firmware. In this case the recommended level is SF240-417.
FTP the rpm file (01SF240_417_382.rpm) to the server that needs to be upgraded.

Run the command below to extract the flash image from the rpm file (the image is placed under /tmp/fwupdate):

rpm -Uvh --ignoreos /tmp/01SF240_417_382.rpm
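The image name appears to encode the level it installs: the update_flash output later in this post reports that the image 01SF240_417_382 would update the temporary image to SF240_417. A minimal sketch of deriving that level from a file name, useful for checking that the rpm on the command line is the one intended, might look like this. The function name fw_level_from_name is hypothetical, and the naming pattern is an assumption based only on the one example in this post.

```shell
# Hypothetical helper: derive the firmware level from an image or rpm name.
# Assumes names of the form 01<release>_<level>_<suffix>[.rpm], as in this post.
fw_level_from_name() {
    name=$(basename "$1" .rpm)   # drop directory and .rpm suffix
    name=${name#01}              # drop the leading "01"
    printf '%s\n' "${name%_*}"   # drop the trailing _NNN field
}

fw_level_from_name /tmp/01SF240_417_382.rpm
```

Comparing the result with the level you expect to install is a cheap guard against extracting the wrong package.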

Before installing, check the existing firmware level. From AIX, use the lsmcode command, which resides in the diagnostics directory. An example of its output is as follows:
----------------------------------------------------------------------------------------------------------
DISPLAY MICROCODE LEVEL                                                   802811
IBM,9111-285

The current permanent system firmware image is SF240_202
The current temporary system firmware image is SF240_202
The system is currently booted from the temporary firmware image.

Use Enter to continue.
--------------------------------------------------------------------------------------------------------------
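For scripted checks, the three status lines in the lsmcode output above can be reduced to a single record. The following is a rough sketch that parses text in that format from standard input; fw_levels is a hypothetical helper name, and the sample input is copied from the screen above rather than produced by a live command.

```shell
# Hypothetical helper: read lsmcode-style text on stdin and print
# "<permanent level> <temporary level> <booted side>".
fw_levels() {
    awk '
        /permanent system firmware image is/ { perm = $NF }
        /temporary system firmware image is/ { temp = $NF }
        /currently booted from the/          { side = $(NF - 2) }
        END {
            # Some tools end these lines with a period; strip it if present.
            sub(/\.$/, "", perm); sub(/\.$/, "", temp)
            print perm, temp, side
        }
    '
}

# Sample input taken from the output shown above.
fw_levels <<'EOF'
The current permanent system firmware image is SF240_202
The current temporary system firmware image is SF240_202
The system is currently booted from the temporary firmware image.
EOF
```

Recording this line before and after the update gives a simple before/after comparison.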
Next, run the update_flash command to upgrade firmware:
# ls /tmp/fwupdate
01SF240_417_382
# /usr/lpp/diagnostics/bin/update_flash -f /tmp/fwupdate/01SF240_417_382
The image is valid and would update the temporary image to SF240_417.
The new firmware level for the permanent image would be SF240_202.

The current permanent system firmware image is SF240_202.
The current temporary system firmware image is SF240_202.


***** WARNING: Continuing will reboot the system! *****

Do you wish to continue?
Enter 1=Yes or 2=No
SHUTDOWN PROGRAM
Mon Mar 26 11:01:35 EDT 2012
Stopping The LWI Nonstop Profile...
Waiting for The LWI Nonstop Profile to exit...
Waiting for The LWI Nonstop Profile to exit...
Stopped The LWI Nonstop Profile.
0513-044 The sshd Subsystem was requested to stop.
Wait for 'Rebooting...' before stopping.
Error logging stopped...
Advanced Accounting has stopped...
Process accounting stopped...
Stopping NFS/NIS Daemons
0513-044 The nfsd Subsystem was requested to stop.
0513-044 The biod Subsystem was requested to stop.
0513-044 The rpc.lockd Subsystem was requested to stop.
0513-044 The rpc.statd Subsystem was requested to stop.
0513-004 The Subsystem or Group, gssd, is currently inoperative.
0513-004 The Subsystem or Group, nfsrgyd, is currently inoperative.
0513-044 The rpc.mountd Subsystem was requested to stop.
0513-004 The Subsystem or Group, ypbind, is currently inoperative.


Connection to host lost.
After the system came back up, I ran lsmcode and found that it had booted from the new temporary firmware image, as shown below:
DISPLAY MICROCODE LEVEL                                                   802811
IBM,9111-285

The current permanent system firmware image is SF240_202
The current temporary system firmware image is SF240_417
The system is currently booted from the temporary firmware image.

Use Enter to continue.
To commit the temporary image to the permanent side:
# /usr/lpp/diagnostics/bin/update_flash -c
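Whether to commit is a judgment call, since it overwrites the older permanent image. One possible policy, sketched below, is to treat a commit as a candidate only when the system is running from the temporary side and the two levels differ. The function name should_commit and the policy itself are assumptions for illustration, not something the firmware tools enforce in this form.

```shell
# Hypothetical policy check: succeed only when booted from the temporary
# side and the permanent level differs from the temporary level.
should_commit() {
    perm=$1 temp=$2 side=$3
    [ "$side" = "temporary" ] && [ "$perm" != "$temp" ]
}

# Values taken from the lsmcode output after the reboot in this post.
if should_commit SF240_202 SF240_417 temporary; then
    echo "commit candidate: /usr/lpp/diagnostics/bin/update_flash -c"
else
    echo "nothing to commit"
fi
```

The sketch only prints the command; running it is left as a deliberate manual step after the new level has been validated.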

Friday, March 23, 2012

Rename hdisk in AIX

Recently we had an issue with two hdisks going bad in an AIX LPAR, each in a different volume group (hdisk15 and hdisk78). The volume groups had their disks mirrored (hdisk0 to hdisk39 mirrored to hdisk40 to hdisk79). After replacing the drives, I found that the hdisk names had been swapped, as the before and after output below shows. To avoid confusion in the future, I renamed them back to their original state before adding them to the volume groups.

Before Replacing

#lscfg -vl hdisk15
  hdisk15          U5791.001.992055M-P2-T5-L11-L0  16 Bit LVD SCSI Disk Drive (73400 MB)

#lscfg -vl hdisk78
  hdisk78          U5791.001.9920546-P2-T6-L10-L0  16 Bit LVD SCSI Disk Drive (73400 MB)

After Replacing

#lscfg -vl hdisk78
  hdisk78          U5791.001.992055M-P2-T5-L11-L0  16 Bit LVD SCSI Disk Drive (73400 MB)

#lscfg -vl hdisk15
  hdisk15          U5791.001.9920546-P2-T6-L10-L0  16 Bit LVD SCSI Disk Drive (73400 MB)

Swapping Back to Original State

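Removed both drives
#rmdev -dl hdisk78
#rmdev -dl hdisk15

Ran cfgmgr against each parent adapter in the order I wanted. The first disk discovered takes the lowest free hdisk number, so the adapter that should own hdisk15 (scsi3) is configured before the one that should own hdisk78 (scsi23).

#cfgmgr -l scsi3
#cfgmgr -l scsi23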
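The sequence above generalizes to any number of swapped disks: remove all the affected definitions first, then configure the parent adapters in ascending order of the hdisk names they should receive. The sketch below only prints the commands so they can be reviewed before anything is run. The function name plan_swap_back and the disk:adapter argument format are hypothetical.

```shell
# Hypothetical planner: takes disk:adapter pairs in ascending hdisk order and
# prints the rmdev and cfgmgr commands; it does not execute them.
plan_swap_back() {
    for pair in "$@"; do
        printf 'rmdev -dl %s\n' "${pair%%:*}"   # remove every definition first
    done
    for pair in "$@"; do
        printf 'cfgmgr -l %s\n' "${pair#*:}"    # then configure in the wanted order
    done
}

plan_swap_back hdisk15:scsi3 hdisk78:scsi23
```

This relies on the freed hdisk numbers being the lowest unused ones at the time cfgmgr runs, which was the case here.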

Verified that they are swapped back

#lsdev -Cc disk |grep hdisk15
hdisk15   Available 0P-08-00-11,0 16 Bit LVD SCSI Disk Drive
#lsdev -Cc adapter |grep 0P-08
scsi3     Available 0P-08 Wide/Ultra-3 SCSI I/O Controller
#lscfg -vl hdisk15
  hdisk15          U5791.001.992055M-P2-T5-L11-L0  16 Bit LVD SCSI Disk Drive (73400 MB)

#lsdev -Cc disk |grep hdisk78
hdisk78   Available 0t-08-00-10,0 16 Bit LVD SCSI Disk Drive
#lsdev -Cc adapter |grep 0t-08
scsi23    Available 0t-08 Wide/Ultra-3 SCSI I/O Controller
#lscfg -vl hdisk78
  hdisk78          U5791.001.9920546-P2-T6-L10-L0  16 Bit LVD SCSI Disk Drive (73400 MB)
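The same verification can be done mechanically by comparing each disk's physical location code with the one recorded before the replacement. This is a small sketch, assuming lscfg output in the format shown in this post; disk_locations is a hypothetical helper, and the sample input is pasted in rather than read from a live system.

```shell
# Hypothetical helper: read lscfg-style lines on stdin, print "name location".
disk_locations() {
    awk '$1 ~ /^hdisk[0-9]+$/ { print $1, $2 }'
}

# Expected mapping, taken from the "Before Replacing" output in this post.
expected='hdisk15 U5791.001.992055M-P2-T5-L11-L0
hdisk78 U5791.001.9920546-P2-T6-L10-L0'

# Sample input copied from the verification output above.
actual=$(disk_locations <<'EOF'
  hdisk15          U5791.001.992055M-P2-T5-L11-L0  16 Bit LVD SCSI Disk Drive (73400 MB)
  hdisk78          U5791.001.9920546-P2-T6-L10-L0  16 Bit LVD SCSI Disk Drive (73400 MB)
EOF
)

if [ "$actual" = "$expected" ]; then
    echo "disk names match their original locations"
else
    echo "mismatch:"
    echo "$actual"
fi
```

Saving the expected mapping before any hardware work makes this check possible afterwards.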

Replace Faulty Disk in a SCSI Mirrored VG in AIX

Found a faulty SCSI disk in a volume group on one of the LPARs on a p595. The volume group has 40 mirrored SCSI disks and one logical volume, which is marked stale.
The following steps were taken to replace the faulty disk.

Error Report Details

# errpt |more

IDENTIFIER TIMESTAMP  T C RESOURCE_NAME  DESCRIPTION
EAA3D429   0321085312 U S LVDD           PHYSICAL PARTITION MARKED STALE
EAA3D429   0321084712 U S LVDD           PHYSICAL PARTITION MARKED STALE

16F35C72   0321044812 P H hdisk78        DISK OPERATION ERROR
16F35C72   0321023812 P H hdisk78        DISK OPERATION ERROR
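In the summary above, the T and C columns carry the error type and class; P with H marks a permanent hardware error. A quick way to list the resources with such entries is sketched below. The function name failing_hw is hypothetical, and the input is the summary pasted from this post rather than live errpt output.

```shell
# Hypothetical helper: read errpt summary lines on stdin and print the
# distinct resource names that logged permanent (P) hardware (H) errors.
failing_hw() {
    awk '$3 == "P" && $4 == "H" { print $5 }' | sort -u
}

# Sample input copied from the summary above.
failing_hw <<'EOF'
IDENTIFIER TIMESTAMP  T C RESOURCE_NAME  DESCRIPTION
EAA3D429   0321085312 U S LVDD           PHYSICAL PARTITION MARKED STALE
EAA3D429   0321084712 U S LVDD           PHYSICAL PARTITION MARKED STALE

16F35C72   0321044812 P H hdisk78        DISK OPERATION ERROR
16F35C72   0321023812 P H hdisk78        DISK OPERATION ERROR
EOF
```

The stale-partition entries from LVDD are filtered out because they are a consequence of the disk failure, not a separate hardware fault.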

$ errpt -a -j F7DDA124 |more
---------------------------------------------------------------------------
LABEL:          LVM_SA_PVMISS
IDENTIFIER:     F7DDA124

Date/Time:       Wed Mar 21 04:48:29 EDT 2012
Sequence Number: 99327
Machine Id:      00CFEFAF4C00
Node Id:         test01t
Class:           H
Type:            UNKN
WPAR:            Global
Resource Name:   LVDD
Resource Class:  NONE
Resource Type:   NONE
Location:

Description
PHYSICAL VOLUME DECLARED MISSING

Probable Causes
POWER, DRIVE, ADAPTER, OR CABLE FAILURE

Detail Data
MAJOR/MINOR DEVICE NUMBER
8000 0013 0000 0051
SENSE DATA
00CF EFAF 0000 4C00 0000 0113 D553 FECC 00CF EFAF 9018 3EC4 0000 0000 0000 0000

$ errpt -a -j 16F35C72 |more
---------------------------------------------------------------------------
LABEL:          DISK_ERR2
IDENTIFIER:     16F35C72

Date/Time:       Wed Mar 21 04:48:29 EDT 2012
Sequence Number: 99325
Machine Id:      00CFEFAF4C00
Node Id:         test01t
Class:           H
Type:            PERM
WPAR:            Global
Resource Name:   hdisk78
Resource Class:
Resource Type:
Location:
VPD:
        Manufacturer................IBM
        Machine Type and Model......ST373454LC
        FRU Number..................00P2685
        ROS Level and ID............43373137
        Serial Number...............0005D90D
        EC Level....................H13092
        Part Number.................26K5280
        Device Specific.(Z0)........000004129F00013E
        Device Specific.(Z1)........0721C717
        Device Specific.(Z2)........0002
        Device Specific.(Z3)........05179
        Device Specific.(Z4)........0001
        Device Specific.(Z5)........22
        Device Specific.(Z6)........H13092

Description
DISK OPERATION ERROR

Probable Causes
DASD DEVICE

Failure Causes
DISK DRIVE
DISK DRIVE ELECTRONICS

$ lscfg -vl hdisk78
  hdisk78          U5791.001.9920546-P2-T6-L10-L0  16 Bit LVD SCSI Disk Drive (73400 MB)

        Manufacturer................IBM
        Machine Type and Model......ST373454LC
        FRU Number..................00P2685
        ROS Level and ID............43373137
        Serial Number...............0005D90D
        EC Level....................H13092
        Part Number.................26K5280
        Device Specific.(Z0)........000004129F00013E
        Device Specific.(Z1)........0721C717
        Device Specific.(Z2)........0002
        Device Specific.(Z3)........05179
        Device Specific.(Z4)........0001
        Device Specific.(Z5)........22
        Device Specific.(Z6)........H13092
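When ordering a replacement, the FRU number and serial number are the fields that matter most in the VPD listing above. A minimal sketch for pulling a single field out of that dotted layout follows; vpd_field is a hypothetical helper name, and the sample lines are copied from the output above.

```shell
# Hypothetical helper: read VPD lines on stdin and print the value of the
# field named in $1. Keys and values are separated by a run of two or more dots.
vpd_field() {
    awk -F '[.][.]+' -v want="$1" '
        {
            key = $1
            sub(/^[ \t]+/, "", key)          # strip leading indentation
        }
        key == want && !found { print $2; found = 1 }
    '
}

# Sample input copied from the lscfg output above.
vpd_field "FRU Number" <<'EOF'
        Manufacturer................IBM
        Machine Type and Model......ST373454LC
        FRU Number..................00P2685
        Serial Number...............0005D90D
        Device Specific.(Z0)........000004129F00013E
EOF
```

Splitting on two or more dots keeps the single dot in keys such as "Device Specific.(Z0)" intact.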