Disk broken?

Hi,

I'm new to the AIX environment and am testing a few things right now, but I can't test the LVM part, which is (to me) very important. I have 2 disks in my test machine, but it seems only 1 is working well. The output for the disks is below.
hdisk0 = in good condition
hdisk2 = broken?

# lspv
hdisk0 00086939bd9ffc58 rootvg active
hdisk2 00086939fbf7df92 None

# lscfg -vl hdisk2
hdisk2 P1/Z1-A6 16 Bit LVD SCSI Disk Drive (18200 MB)

    Manufacturer................IBM
    Machine Type and Model......DDYS-T18350N
    FRU Number..................07N3776
    ROS Level and ID............53393348
    Serial Number...............4EY72879
    EC Level....................F79851
    Part Number.................07N3811
    Device Specific.(Z0)........000003029F00013A
    Device Specific.(Z1)........07N4921
    Device Specific.(Z2)........0933
    Device Specific.(Z3)........00285
    Device Specific.(Z4)........0001
    Device Specific.(Z5)........22
    Device Specific.(Z6)........F79851

# lsdev -Cc disk
hdisk0 Available 10-60-00-5,0 16 Bit LVD SCSI Disk Drive
hdisk1 Defined 10-60-00-4,0 Other SCSI Disk Drive
hdisk2 Available 10-60-00-6,0 16 Bit LVD SCSI Disk Drive

# lsattr -El hdisk2
pvid 00086939fbf7df920000000000000000 Physical volume identifier False
queue_depth 3 Queue DEPTH False
size_in_mb 18200 Size in Megabytes False
max_transfer 0x40000 Maximum TRANSFER Size True
unique_id 23084EY728790CDDYS-T18350N03IBMscsi Unique device identifier False
PR_key_value none N/A True
reserve_policy single_path N/A True
PCM PCM/friend/scsiscsd N/A True
dvc_support Device Support False
algorithm fail_over Algorithm True

When running diag on the disk:
A PROBLEM WAS DETECTED ON do 23 nov 14:48:11 2006 801014

The Service Request Number(s)/Probable Cause(s)
(causes are listed in descending order of probability):

63C-132: A Disk Drive hardware error occurred.
hdisk2 FRU: 07N3776 P1/Z1-A6
16 Bit LVD SCSI Disk Drive (18200 MB)

Use Enter to continue.

When trying to make a VG that includes this disk:

# mkvg testvg hdisk2
0516-306 mkvg: Unable to find physical volume testvg in the Device
Configuration Database.
0516-862 mkvg: Unable to create volume group.

Is this disk really broken? Is it really the disk?

Thanks for replying!

cheerz

From the mkvg man page:

-y VolumeGroup Specifies the volume group name rather than having the name
generated automatically. Volume group names must be unique system wide and can
range from 1 to 15 characters. The name cannot begin with a prefix already
defined in the PdDv class in the Device Configuration database for other
devices. The volume group name created is sent to standard output.

Try this:

mkvg -y newvg hdisk2

Regards.

My bad... I was too fast. This is the right output:

# mkvg -y testvg hdisk2
0516-1182 mkvg Open Failure on hdisk2.
0516-862 mkvg: Unable to create volume group.

I think there are enough error messages in my previous post to go on...

cheerz

Have you tried an "rmdev" followed by a "cfgmgr"?
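
For anyone following along, that sequence might look roughly like the sketch below. The function name `reconfigure_disk` and the `RUN` variable are made up for illustration; by default the function only prints the commands. Using `-d` (which also deletes the device definition from the ODM) is an assumption about what "rmdev" means here.

```shell
# Sketch: remove a disk definition and let cfgmgr rediscover it.
# reconfigure_disk and RUN are illustrative names, not AIX tools.
reconfigure_disk() {
    disk="$1"
    run="${RUN:-echo}"          # default: dry run, only print each command
    $run rmdev -dl "$disk"      # unconfigure the disk and delete its definition
    $run cfgmgr                 # rescan and reconfigure devices
    $run lsdev -Cc disk         # list disks; a healthy one shows "Available"
}
```

Calling `reconfigure_disk hdisk2` prints the three commands; setting `RUN=command` first would run them for real (as root, on a disk that belongs to no volume group).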

Yes, but that was a "no go"...
It's weird that AIX can read so many details from the disk but is not able to use it...

cheerz

Yes that's weird...

  • What does errpt (-a) say?
  • And what does "no go" mean? Does it stall or something?
  • Are there any other disks on the same controller?
  • That disk already has a PVID. Have you tried an importvg? (though it really looks like a HW issue :))

errpt (errpt -a -N hdisk2 > hdisk2log.log) tells me a lot: many disk operation errors, it seems...

When I do an rmdev of this disk and then a cfgmgr, AIX recognizes the disk and configures it properly, but it's still not possible to use the disk...

There is another disk on the controller, hdisk0. Everything is OK with that disk, so it won't be the controller, I guess?

I've already tried an importvg, with this result:

# importvg hdisk2
0516-024 lqueryvg: Unable to open physical volume.
Either PV was not configured or could not be opened. Run
diagnostics.
0516-024 lqueryvg: Unable to open physical volume.
Either PV was not configured or could not be opened. Run
diagnostics.
0516-1140 importvg: Unable to read the volume group descriptor area
on specified physical volume.

I think this disk is really dead... although it's still weird :)

cheerz

Well, I would replace the disk :)

Hdisk problems mostly (not always, but very often) "surface out" on AIX systems. Before an hdisk device finally dies, there is usually an increasing number of relocations (basically, a small part of the disk's surface has become unusable), which show up in the error log as PERM-type errors (TEMP-type errors can safely be ignored). You can ignore one or two of them, but when they happen more and more often, prepare to replace the disk proactively.
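
One way to keep an eye on that trend is to count the PERM entries per disk. The sketch below is a filter over the errpt summary listing and assumes its usual column order (IDENTIFIER, TIMESTAMP, T, C, RESOURCE_NAME, DESCRIPTION), where the T column shows "P" for PERM; the function name `count_perm_errors` is made up for illustration.

```shell
# Sketch: count PERM-type entries (T column = "P") for one resource.
# Reads errpt summary output on stdin; count_perm_errors is an illustrative name.
count_perm_errors() {
    awk -v dev="$1" '$3 == "P" && $5 == dev { n++ } END { print n + 0 }'
}
```

On AIX it would be used as `errpt | count_perm_errors hdisk2`. A number that keeps growing from one check to the next is the signal to plan a replacement.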

AIX gets its information about the disk not from the hdisk itself, but from the electronics and specifically the firmware chip on it. This part can still work without any problem while the physical drive is already worn out.
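
A quick way to tell the two apart is to try reading a little data from the device itself. This is only a sketch: `read_test` is a made-up helper name, and `/dev/rhdisk2` is assumed to be the raw device for hdisk2.

```shell
# Sketch: check whether data can actually be read from a device.
# read_test is an illustrative name, not an AIX tool.
read_test() {
    dev="$1"
    # Read 8 blocks of 128 KB and throw them away; only the exit status matters.
    if dd if="$dev" of=/dev/null bs=128k count=8 2>/dev/null; then
        echo "$dev: readable"
    else
        echo "$dev: NOT readable"
        return 1
    fi
}
```

With a disk in this state, `read_test /dev/rhdisk2` would be expected to fail even though lscfg and lsattr still answer.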

bakunin