OEL5.5 hot add disk

Hi All,

I have an issue when trying to hot-add a SAN disk to an Oracle Enterprise Linux 5.5 server.

The storage array we are using is an HP 24000. We have no issues using the disks in either ASM or LVM setups; the only problem is that we cannot scan for a new disk when it is added without a reboot.

We have tried:

echo "- - -" > /sys/class/scsi_host/host(0 or 1)/scan

which errors with the following message:

unexpected response from lun 142, scan aborted

LUN 142 happens to be the newly added disk.

We have also found a script called rescan_scsi_bus.sh; it runs successfully but does not add any new SCSI devices.

I have a feeling that on a reboot the scan is forced while loading the qla2xxx module. However, we cannot remove the module and re-add it while the server is running, as this fails with a "Fatal: module is in use" error message.

We have also tried restarting the multipathing daemon to see if the disk gets picked up, and running xpinfo; neither of them seems to see the disk.

I have raised a call with Oracle support and have now been waiting 3 weeks without a response. If I do get a useful reply I will post it here.

Cheers,
Tom

What does dmesg say?

Also, try with ioscan and modprobe.

Hi verdepollo,

dmesg just gives the error message: unexpected response from lun 142, scan aborted

ioscan is not an installed program. However, I will try this when I can get some resource to do so on a test box (install ioscan, assign some SAN disk and try to add it). That may take a couple of weeks due to the environment we are running: everything, even for test environments, must go through change control. I actually thought ioscan was just an HP-UX command.

As for modprobe, we tried to take it out and add it back in, which it can't do because it says "Fatal: module is in use" as per my original post :wink:

Any other possibilities would be greatly appreciated, or even an explanation of why the scan command doesn't work. I have been told it was a problem for 5.3 distros and earlier, so why should I be having the issue with 5.5?

Cheers,
Tom

This is documentation we have tested and verified in our QA lab, and it's known to be working on RHEL 5.5 with a P2000 G3 FC. It's based on the RH online storage guide. It assumes you're using multipathing.

List the disks by typing
multipath -l

This will give you the current host and lun numbers (x:0:0:n, where x is the host number, n is the lun number).
A sample output is below:

[root@baschinfs01 ~]# multipath -l                                              
mpath14 (3600c0ff000118652f9351f4e01000000) dm-16 HP,P2000 G3 FC                
[features=1 queue_if_no_path][hwhandler=0][rw]                       
\_ round-robin 0 [prio=0][enabled]                                              
 \_ 2:0:0:6 sdy        65:128 [active][undef]                                   
 \_ 5:0:0:6 sdab       65:176 [active][undef]                                   
\_ round-robin 0 [prio=0][enabled]                                              
 \_ 3:0:0:6 sdz        65:144 [active][undef]                                   
 \_ 4:0:0:6 sdaa       65:160 [active][undef]     
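
If you have a lot of paths, pulling the host and lun numbers out by eye gets tedious. Below is a rough sketch of a helper that does it; list_paths is just a name made up for this example, and it assumes the path lines carry a host:channel:target:lun field like the sample above.

```shell
#!/bin/sh
# list_paths (hypothetical helper name): read `multipath -l` output on stdin
# and print one "host lun" pair per path line.
# Assumes each path line contains a field of the form H:C:T:L, e.g. 2:0:0:6.
list_paths() {
    awk '{
        for (i = 1; i <= NF; i++) {
            # Only a four-part numeric field counts; major:minor (65:128) has two parts.
            if ($i ~ /^[0-9]+:[0-9]+:[0-9]+:[0-9]+$/) {
                split($i, parts, ":")
                print parts[1], parts[4]
            }
        }
    }'
}
```

You would run it as: multipath -l | list_paths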

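Once you've created the LUN on the SAN, type the following (again, X in hostX is the host number and n is the LUN number):

echo "0 0 n" > /sys/class/scsi_host/hostX/scan

Do this for each host number that should have access to the LUN. The three values are channel, target and LUN, so this scans only channel 0, target 0 for that one LUN rather than everything.

So, going back to our sample, it would be: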

[root@baschinfs01 ~]# echo "0 0 6" > /sys/class/scsi_host/host2/scan            
[root@baschinfs01 ~]# echo "0 0 6" > /sys/class/scsi_host/host3/scan            
[root@baschinfs01 ~]# echo "0 0 6" > /sys/class/scsi_host/host4/scan            
[root@baschinfs01 ~]# echo "0 0 6" > /sys/class/scsi_host/host5/scan   
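
If you have several hosts, the same thing can be wrapped in a loop. This is only a sketch: scan_lun and the SYSFS_ROOT variable are names invented for this example (SYSFS_ROOT is there so you can point it at a scratch directory to try it out), and it assumes channel 0 and target 0 as in the commands above.

```shell
#!/bin/sh
# scan_lun LUN HOST... (hypothetical helper): ask each listed SCSI host
# to scan channel 0, target 0 for the given LUN.
# SYSFS_ROOT is an assumed override for dry runs; it defaults to /sys.
scan_lun() {
    lun=$1
    shift
    root=${SYSFS_ROOT:-/sys}
    for host in "$@"; do
        scan_file="$root/class/scsi_host/host$host/scan"
        if [ -w "$scan_file" ]; then
            # Same write as the manual command: "channel target lun"
            echo "0 0 $lun" > "$scan_file"
        else
            echo "skipping host$host: $scan_file is not writable" >&2
        fi
    done
}
```

For the sample above that would be: scan_lun 6 2 3 4 5 (run as root).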

Next type

multipath
multipath -l

and you should now see the new device. At that point you should be able to add the volume to Oracle ASM.

If that doesn't work, you can reference the online storage config docs here:
Online Storage Reconfiguration Guide

Hi msarro,

Sorry for such a long delay in replying, but I have been absolutely bogged down with work on other projects and other BAU, which means I haven't had a chance to even log in.

Thanks for your feedback. With the echo command, as I showed in my original post, we only ever used "- - -". When I get a chance I will try it with "0 0 lunid" to see if that works. However, that may prove difficult to test, as there is database installation, development and testing going on across all our current environments. The next chance I get for testing might be within the next 3-4 weeks, when we are building 4 new Linux servers, so I can only comment on how it works then.

I will send a response on either the success or failure when I can.

Cheers,
Tom