POWER6 VIO Failover

Hello,

Can someone point me to a document, or describe the method, for setting up failover between VIO servers?

I have one VIO server, VIO_SERVER_1, which serves 4 AIX LPARs:

  • From VIO_SERVER_1 I have assigned virtual SCSI disks to the four LPARs
  • For Ethernet I have used an LHEA

So, if VIO_SERVER_1 crashes, all the LPARs should move to VIO_SERVER_2.

1) How can I make this possible?

Second question:
From VIO_SERVER_1 I have assigned a disk, HDISK1, to LPAR_5.
From VIO_SERVER_2 I have assigned a disk, HDISK2, to LPAR_5.

In LPAR_5 I see two disks:
lspv
hdisk1
hdisk2

I load AIX 6 and mirror HDISK1 and HDISK2

If VIO_Server_1 goes down I still have HDISK2
Will this work?


Are the physical disks on a SAN or local?

The AIX 6 O/S is installed on local disks of the VIO.

Local disks of VIO_SERVER_1 have been assigned to the LPARs.

Also, does anyone know if there is a load-balancing option available for VIO?

Two VIO servers and One Lpar

can there be load balancing?

Thanks

If you mirror the disk from VIO1 to the disk from VIO2 then you'd be OK if one of the VIOS is down/rebooted. Is the LHEA presented from each VIO or is it assigned directly to the LPAR?

I have not mirrored VIO_1 with VIO_2.

Here is how it is set up at the moment.

VIO_2 = NOT ACTIVE

VIO_1 = Active

VIO_1 has 3 internal DISK
hdisk0 -- rootvg
hdisk1 -- NONE
hdisk2 -- lpars_rootvg

So what I have done is mirror hdisk0 with hdisk1:
hdisk0 -- rootvg
hdisk1 -- rootvg
hdisk2 -- lpars_rootvg

hdisk2 --> I have made small LVs on it and given them to the LPARs:
mklv -lv lpar_1_rootvg lpars_rootvg 50G
mklv -lv lpar_2_rootvg lpars_rootvg 50G
mklv -lv lpar_3_rootvg lpars_rootvg 50G
mklv -lv lpar_4_rootvg lpars_rootvg 50G

then assigned them to the LPARs:
mkvdev -vdev lpar_1_rootvg -vadapter vhost1
mkvdev -vdev lpar_2_rootvg -vadapter vhost2
mkvdev -vdev lpar_3_rootvg -vadapter vhost3
mkvdev -vdev lpar_4_rootvg -vadapter vhost4

so each LPAR now has an hdisk0

Then I created a second disk for each LPAR via:

mklv -lv aix1rootlvm rootvg 50G hdisk1
mklv -lv aix2rootlvm rootvg 50G hdisk1
mklv -lv aix3rootlvm rootvg 50G hdisk1
mklv -lv aix4rootlvm rootvg 50G hdisk1

and assigned them to the LPARs on different/new vhost adapters.

So now each LPAR has
HDISK0
HDISK1
and I mirrored them.

The LHEA is presented to each LPAR directly, not through the VIO.

1) How can I utilise VIO_SERVER_2?

2) How do I fail over to VIO_SERVER_2?

3) Is there a load-balancing concept in VIOS?

Thanks

As far as I know there isn't any automatic load balancing. Normally you present an hdisk from each VIO, as you've done, and then mirror them.

So if LPAR1 has hdisk0 from VIO1, and hdisk1 from VIO2, then when you mirror hdisk0 to hdisk1 you would be covered if you have to reboot VIO1 or VIO2.
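On the client LPAR, the mirroring itself is standard AIX LVM. A minimal sketch, assuming rootvg and assuming hdisk0 is backed by VIO1 and hdisk1 by VIO2 (check with lspv first):

```shell
# Run as root on the client LPAR, not on the VIOS.
extendvg rootvg hdisk1            # add the VIO2-backed disk to rootvg
mirrorvg -S rootvg hdisk1         # mirror rootvg onto it; -S syncs in background
bosboot -ad /dev/hdisk0           # rebuild the boot image on both copies
bosboot -ad /dev/hdisk1
bootlist -m normal hdisk0 hdisk1  # allow booting from either disk
```

Wait for the background sync to finish (lsvg rootvg should show 0 STALE PPs) before testing a VIOS reboot.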

thanks for your quick answer.

Okay, that would keep the LPAR alive via the AIX O/S.

How about the SAN disks?

From my DS8000 I have created logical disks and assigned them to VIO_SERVER_1, then created virtual SCSI adapters on VIO_SERVER_1 and on LPAR_1 + LPAR_2 + LPAR_3 + LPAR_4.

Allocation
VIO_SERVER_1 disk10 (From SAN) given to LPAR_1
VIO_SERVER_1 disk11 (From SAN) given to LPAR_2
VIO_SERVER_1 disk12 (From SAN) given to LPAR_3
VIO_SERVER_1 disk13 (From SAN) given to LPAR_4

Now if VIO_SERVER_1 goes down, all those disks will disappear from LPAR_1 + LPAR_2 + LPAR_3 + LPAR_4.

1) How can I make failover or redundancy for them with VIO_SERVER_2 (other than mirroring)?

2) Can I use Mobility under the HMC to move an LPAR under the same managed system to another VIO server (in my case VIO_SERVER_2)?

thanks

Unfortunately I haven't messed with Partition Mobility; we don't have the $$ for the license, maybe in next year's budget.

MPIO is what you want to use for multiple VIO redundancy. Here's a quick overview:

1 - Set up 2 VIOs and assign the same SAN drives to each system.
2 - Then on both VIOs change the hdisk attribute 'reserve_policy' to 'no_reserve'.
3 - Assign the hdisk to the LPAR (do this on each VIO). Step 2 is what allows the hdisk to be assigned to the same LPAR from each VIO.
4 - Run 'cfgmgr' on the LPAR to detect the drive(s)
5 - Run 'lspath' to verify that the LPAR sees the hdisk from each VIO
6 - Test by rebooting/shutdown one of the VIO servers.
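The steps above can be sketched as commands. The hdisk and vhost names here are placeholders; adjust them to your own lsmap output:

```shell
# On EACH VIOS (padmin restricted shell):
chdev -dev hdisk3 -attr reserve_policy=no_reserve    # step 2: release the SCSI reserve
mkvdev -vdev hdisk3 -vadapter vhost0 -dev lpar1_san  # step 3: map the LUN to the client

# On the client LPAR (root shell):
cfgmgr           # step 4: discover the virtual disk
lspath           # step 5: expect one Enabled path per VIOS for the hdisk
```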

For more detailed info you should download or order the IBM Redbook "PowerVM Virtualization on IBM System p: Introduction and Configuration" (SG24-7490-03).

IBM Redbooks | PowerVM Virtualization on IBM System p: Introduction and Configuration Fourth Edition

And the Best Practices book too: IBM Redbooks | IBM System p Advanced POWER Virtualization (PowerVM) Best Practices

Thanks for the information. The redbook you posted gives a few examples... I will try that and post here.

Also, for Partition Mobility on POWER6, check this out:

Live Partition Mobility is the next step in the IBM Power Systems virtualization continuum. It can be combined with other virtualization technologies, such as logical partitions, Live Workload Partitions, and the SAN Volume Controller, to provide a fully virtualized computing platform that offers the degree of system and infrastructure flexibility required by today's production data centers.
This IBM Redbooks publication discusses how Live Partition Mobility can help technical professionals, enterprise architects, and system administrators:

  • Migrate entire running AIX and Linux partitions and hosted applications from one physical server to another without disrupting services and loads.

  • Meet stringent service-level agreements.

  • Rebalance loads across systems quickly, with support for multiple concurrent migrations.

  • Use a migration wizard for single partition migrations.

This book can help you understand, plan, prepare, and perform partition migration on IBM Power Systems POWER6 technology-based servers that are running AIX.

Table of contents
Chapter 1. Overview
Chapter 2. Live Partition Mobility mechanisms
Chapter 3. Requirements and preparation
Chapter 4. Basic partition migration scenario
Chapter 5. Advanced topics
Chapter 6. Migration status
Chapter 7. Integrated Virtualization Manager for Live Partition Mobility
Appendix A. Error codes and logs

Live Partition Mobility is not for failover in case a VIO goes down; it's for planned movement, e.g. for maintenance or load balancing.

You need mirrored disks across two VIOs when using internal disks, or multipathing of the same LUN through two VIO servers with external disks, as homeyjoe wrote.

I have tried this with two new VIOs and it works fine.

I loaded the SDDPCM & MPIO drivers, then configured the SAN storage for both VIO servers.

However, I have an existing VIO server which already has the SAN storage, and when I install the MPIO and SDDPCM drivers I get the following errors.

Anyone come across this ?

$ oem_setup_env
# lspv
hdisk0    00c34be40095f9b8    rootvg        active
hdisk1    00c34be405cf31ab    rootvgpool    active
hdisk2    00c34be40a538498    rootvg        active
hdisk3    00c34be4288d3021    None
hdisk4    00c34be4288d94aa    None
hdisk5    00c34be4288a9545    None
hdisk6    00c34be4288cccbb    None
hdisk7    00c34be42887a299    None
hdisk8    00c34be428881a3f    None
hdisk9    00c34be4286abfd5    None
hdisk10   00c34be42869b11b    None

$ oem_setup_env
# lsdev -Cc disk
hdisk0    Available 02-08-00    SAS Disk Drive
hdisk1    Available 02-08-00    SAS Disk Drive
hdisk2    Available 02-08-00    SAS Disk Drive
hdisk3    Available 04-00-02    MPIO Other DS4K Array Disk
hdisk4    Available 04-00-02    MPIO Other DS4K Array Disk
hdisk5    Available 04-00-02    MPIO Other DS4K Array Disk
hdisk6    Available 04-00-02    MPIO Other DS4K Array Disk
hdisk7    Available 04-00-02    MPIO Other DS4K Array Disk
hdisk8    Available 04-00-02    MPIO Other DS4K Array Disk
hdisk9    Available 04-00-02    MPIO Other DS4K Array Disk
hdisk10   Available 04-00-02    MPIO Other DS4K Array Disk

# rmdev -dl hdisk10
Method error (/usr/lib/methods/ucfgdevice):
        0514-062 Cannot perform the requested function because the
        specified device is busy.

$ chdev -dev hdisk3 -attr reserve_policy=no_reserve
Some error messages may contain invalid information
for the Virtual I/O Server environment.
Method error (/usr/lib/methods/chgdisk):
        0514-062 Cannot perform the requested function because the
        specified device is busy.

# pcmpath query device
Kernel extension sdduserke was not loaded. Errno=8.
Please verify SDDPCM device configuration
Kernel extension sddapuserke was not loaded. Errno=8.
Please verify SDDPCM device configuration

prtconf
System Model: IBM,9117-MMA
Machine Serial Number: 6534BE4
Processor Type: PowerPC_POWER6
Processor Implementation Mode: POWER 6
Processor Version: PV_6_Compat
Number Of Processors: 1
Processor Clock Speed: 4208 MHz
CPU Type: 64-bit
Kernel Type: 64-bit
LPAR Info: 2 VIO_SERVER_1
Memory Size: 4096 MB
Good Memory Size: Not Available
Platform Firmware level: EM340_041
Firmware Version: IBM,EM340_041
Console Login: enable
Auto Restart: true
Full Core: false

errpt | more
IDENTIFIER TIMESTAMP  T C RESOURCE_NAME  DESCRIPTION
A6DF45AA   0525014909 I O RMCdaemon      The daemon is started.
D6A51BF7   0525014909 I H ent0           HEA PORT UP
2BFA76F6   0525014809 T S SYSPROC        SYSTEM SHUTDOWN BY USER
9DBCFDEE   0525014909 T O errdemon       ERROR LOGGING TURNED ON
192AC071   0525014609 T O errdemon       ERROR LOGGING TURNED OFF
A6DF45AA   0525012309 I O RMCdaemon      The daemon is started.
D6A51BF7   0525012309 I H ent0           HEA PORT UP
2BFA76F6   0525012209 T S SYSPROC        SYSTEM SHUTDOWN BY USER
9DBCFDEE   0525012309 T O errdemon       ERROR LOGGING TURNED ON
192AC071   0525011909 T O errdemon       ERROR LOGGING TURNED OFF
0C10BB8C   0525011809 I H hdisk9         ARRAY CONFIGURATION CHANGED
0C10BB8C   0525011809 I H hdisk10        ARRAY CONFIGURATION CHANGED
0C10BB8C   0525011809 I H hdisk6         ARRAY CONFIGURATION CHANGED
0C10BB8C   0525011809 I H hdisk4         ARRAY CONFIGURATION CHANGED
0C10BB8C   0525011809 I H hdisk7         ARRAY CONFIGURATION CHANGED
0C10BB8C   0525011809 I H hdisk8         ARRAY CONFIGURATION CHANGED
A6DF45AA   0525011309 I O RMCdaemon      The daemon is started.
D6A51BF7   0525011209 I H ent0           HEA PORT UP
A6DF45AA   0525011209 I O RMCdaemon      The daemon is started.
9DBCFDEE   0525011309 T O errdemon       ERROR LOGGING TURNED ON
A6DF45AA   0525005209 I O RMCdaemon      The daemon is started.
D6A51BF7   0525005209 I H ent0           HEA PORT UP

Also, lsdev -Cc disk should show something like:

hdisk10   Available 04-01-02    IBM MPIO DS4700 Array Disk
1 - My client partitions are OFF
2 - The SAN storage disks are allocated to HOST_GROUP_VIO
3 - HOST_GROUP_VIO has two VIO servers: VIO_1 and VIO_2

VIO_2, which is a scratch installation, works fine... the problem is with VIO_1.

Are the LUNs mapped to a vhost?

If yes, and the client partitions are off as you wrote, use rmdev -dev vtdname -ucfg on the VTD,
change the reserve_policy, and run cfgdev; then the VTDs are available again.

And you need to reboot the VIO server after installing the SDDPCM drivers.
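A sketch of that sequence on VIO_1, with placeholder device names (take the real VTD name from lsmap):

```shell
# padmin shell on VIO_1, with the client LPARs shut down
lsmap -vadapter vhost0                              # find the VTD name, e.g. vtscsi0
rmdev -dev vtscsi0 -ucfg                            # unconfigure (Defined), do NOT delete
chdev -dev hdisk3 -attr reserve_policy=no_reserve   # disk is no longer busy now
cfgdev                                              # VTD comes back Available
```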

1) In the case of SAN disks you can map the LUNs to both VIO servers and set the mapping up correctly, so that if VIO1 goes down the communication goes through VIO2.

> From the DS8K, make the same LUNs visible to both VIO servers.
> Map the LUNs to the LPAR from both VIOS (it's wise to use the PVID to identify the disk while mapping).

2) Yes, if you are using SAN disks it is possible to do Partition Mobility within the managed system.
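Matching the LUNs by PVID can be sketched like this; hdisk numbering usually differs between the two VIOS, so the PVID is the only safe identifier (the names below are placeholders):

```shell
# On VIO_1 and again on VIO_2 (padmin shell):
lspv            # same LUN = same PVID, e.g. 00c34be4288a9545,
                # even if it is hdisk5 on one VIOS and hdisk7 on the other

# Map the matching hdisk on each VIOS to that LPAR's vhost:
mkvdev -vdev hdisk5 -vadapter vhost0 -dev lpar5_san
```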

thanks for the quick answer.

1) yes LUNS are mapped to VTDs

2) Client Partitions are OFF

3) use rmdev -dev vtdname -ucfg
will this have any adverse effect on the data / SAN path / application?

4) I installed the MPIO and SDDPCM drivers and rebooted the VIO twice, but no luck.. the same problem occurs.

uninstalled MPIO and SDDPCM

rebooted

installed

rebooted

still same problem....

You just unconfigure the virtual target device, without deleting the hdisk or any data on it, so nothing will happen while your client partitions are off.

The SDDPCM drivers add the following line to /etc/inittab:

srv:2:wait:/usr/bin/startsrc -s pcmsrv > /dev/null 2>&1

Check whether the pcmsrv service is running.
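Something along these lines from the root shell (oem_setup_env):

```shell
lssrc -s pcmsrv     # status should show "active"
startsrc -s pcmsrv  # start it by hand if it shows "inoperative"
```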

Quick Question:

Can I set up VIO failover between two different managed servers?

Last time, I configured failover between VIOs on the same managed server.

What if I have an HMC which manages two different managed servers?

I create VIO_1 on managed_server_1
and VIO_2 on managed_server_2

Can I still configure the failover?

appreciate your replies...

No.
Not unless you get into Partition Mobility, but that is quite a different thing from VIOS failover.
A VIO client and its server must be on the same system.

-----Post Update-----

Just read the 2nd page of this post: are you sure you have a correct / supported version of SDDPCM and the HAS for the DS4K?
The "Other" in the configured disks' description suggests the drivers do not correctly recognise them, and the SDDPCM storage support matrix is a nightmare...


This is the reply I got from Support:

I haven't tried it yet... I will try out both methods: the simpler one posted above in this thread, and the one from Support.

I would suggest you do the following:

1) rmdev -Rdl every "Defined" fcs adapter.

2) Make the client LPARs' paths to VIO1 Defined (rmdev -dl vscsiX). If you have LPARs with disks mapped as LVs, you will have to consider stopping the LPARs that have those VTDs.

3) Shut down VIO1.

4) At the HMC, create a copy of the VIO1 LPAR profile with NO vhost slots; let's name it "NO_VHOST".

5) Start the VIO server with this "NO_VHOST" profile. Since there are no vhosts, the vhosts and VTDs will not be configured, and hence the hdisks will not be opened.

6) You should now be able to rmdev all the DS4K hdisks, since no VTDs are active.

7) Reboot the VIO server (keeping the NO_VHOST profile).

8) Verify you now have "IBM MPIO DS4700 Array Disk" disks, and change the reserve_policy to the value needed.

9) You may reboot once more, to check that the values are kept across reboots.

10) Shut down the server and boot it from the "Default" profile.

11) Enable the client paths again (cfgmgr).
11) Enable the client path again (cfgmgr).