Clustering Solaris Zones/Containers

Hello everyone. I'm working on a fairly large project at my company and have been looking for some guidance. I just happened to stumble on this forum when looking for help with Solaris, so I'm hoping that you all won't mind me bothering you with my questions. :slight_smile:

Anyway, here goes. Is it possible to cluster an application within Solaris 10 Zones/Containers so that we could fail an application and all of its filesystems from one container to another? We are currently using VCS (though would consider other clustering technologies) and their recommendation is to have 2 separate, but identical zones configured on the hosts. This allows you to patch the offline one while the other one runs production. But, I can't seem to find a way to migrate the app from one zone to another for failover. Is this possible or in this scenario is it necessary to also maintain 2 distinct copies of your app, configuration and log files?

yes, you can do that. with the new version of sun cluster (3.2) you can also cluster linux zones :wink:
also, you don't have to cluster the app... you can cluster the whole zone! but the best practice depends on what exactly you want to do!

hth,
DN2

one of the issues that we're trying to address with clustering the zone vs clustering the app has to do with patching. with app-level clustering as it is now, you can patch the OS on the offline node and then fail the service groups over once you are done. if there are any issues, you can go back to the previous system and recover the offline node; if not, you can proceed with patching the other system, which is now the offline node.

with zones, at some point you need to patch the zone itself since even with a sparse root you have files that need to be updated. during that time, you either just have the app down or hope that nothing happens while you patch.

i'm looking for maximum uptime including system maintenance (and management likes the option for application isolation).

imho this isn't the kind of knowledge you should search for in a forum. this is mission critical and should be done by specialised consultants!

however:
for maximum uptime you can use "live upgrade". you need another disk in your server, and then you can make an online copy of your running system. this copy can be patched, and all the downtime you need is "one reboot" (which can take a long time ;)).

hehe, yeah, when it comes to the implementation phase i'll probably be hitting up Sun for more assistance. at this point i'm trying to find out what others are doing out there, to see if it has been done before. my preference is to drop the whole virtualization idea entirely and just cluster applications with virtual IPs. i'm willing to give zones (and even LDOMs for that matter) a chance, but so far i don't think either one will give us the availability that we're looking for. basically i want an HA environment that meets the following needs:

1 - can handle hardware failures
2 - can handle OS issues such as panics
3 - needs minimal downtime for system maintenance
4 - provide ability to store application logs all together
5 - can dynamically expand filesystems or add storage to a live system.

so, 1 is straightforward and can be handled by a normal app cluster, clustered zones and clustered ldoms.
2 is straightforward and can be handled by an app cluster and by clustered zones in a particular configuration. ldoms have the issue that if anything happens to that OS, the app will be down until it can be repaired or restored.
3 can be handled by an app cluster and with clustered zones in a particular config. ldoms can't do this because you have one OS that you are stuck with through failures, patches, etc.
4. app cluster works with this again. if we configure zone cluster so that 2 and 3 are met, then this one can't be without some sort of shared storage option (say NFS or some sort of clustered filesystem). ldoms are fine with this because everything is stuck together anyway, app, os and everything.
5. app cluster is fine with this. ldoms completely fail at this because you need a reboot to modify filesystems or storage. i'm not sure about zones. since zones just remap local filesystems, will they recognize if a filesystem has grown?
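For anyone skimming, the comparison above can be written down as a small table. This is only a sketch of the assessment in this post, not a verdict on the technologies: the names `REQUIREMENTS`, `ASSESSMENT` and `gaps` are made up for illustration, "clustered zones" means the local-zone configuration, and `None` marks the one point that is still an open question.

```python
# Restates the poster's own assessment from the list above.
# True = met, False = not met, None = not known yet.
REQUIREMENTS = {
    1: "handle hardware failures",
    2: "handle OS issues such as panics",
    3: "minimal downtime for system maintenance",
    4: "store application logs all together",
    5: "expand filesystems or add storage on a live system",
}

ASSESSMENT = {
    "app cluster": {1: True, 2: True, 3: True, 4: True, 5: True},
    # 4 is unmet unless some shared storage (NFS, clustered FS) is added;
    # 5 is the open question about zones noticing a grown filesystem.
    "clustered zones": {1: True, 2: True, 3: True, 4: False, 5: None},
    "clustered ldoms": {1: True, 2: False, 3: False, 4: True, 5: False},
}


def gaps(option):
    """Return (unmet, unknown) requirement numbers for one option."""
    scores = ASSESSMENT[option]
    unmet = sorted(n for n, ok in scores.items() if ok is False)
    unknown = sorted(n for n, ok in scores.items() if ok is None)
    return unmet, unknown


if __name__ == "__main__":
    for option in ASSESSMENT:
        unmet, unknown = gaps(option)
        print(f"{option}: unmet={unmet} unknown={unknown}")
```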

By the way, the particular configuration that I mentioned for zones is that you have your zones local instead of on shared storage that you fail over. This is what Symantec recommends for clustered zones. So, you have zone A on server 1 with some hostname. Then you have zone B on server 2 with the same hostname. Then, you just start up the IP on whichever zone you want your end users/clients to connect to. Now you have two separate environments that you maintain which allows for patching of the offline zone while the other one is still servicing clients. The only problem I have then is that I can't seem to find a nice way to migrate the application data between these two zones with the IP and we can't have application logs spread between two zones.
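To make the procedure concrete, here is a toy model of that setup: two local zones with one service IP that lives on whichever zone is active, and a patch run that always touches the offline zone first. It is a sketch only. The `Zone`, `ZonePair` and `rolling_patch` names are hypothetical, nothing here calls VCS or Solaris, and it deliberately does not model the application data and logs, which is exactly the part still unsolved.

```python
class Zone:
    """One local zone; both zones in a pair share the same hostname."""

    def __init__(self, name, host):
        self.name = name
        self.host = host
        self.patch_level = 0
        self.has_service_ip = False


class ZonePair:
    """Two zones on different servers; exactly one holds the service IP."""

    def __init__(self, first, second):
        self.zones = [first, second]
        first.has_service_ip = True

    def active(self):
        return next(z for z in self.zones if z.has_service_ip)

    def standby(self):
        return next(z for z in self.zones if not z.has_service_ip)

    def fail_over(self):
        """Move the service IP from the active zone to the standby zone."""
        old, new = self.active(), self.standby()
        old.has_service_ip = False
        new.has_service_ip = True
        return new


def rolling_patch(pair, level):
    """Patch the offline zone, fail over, then patch the new offline zone."""
    log = []
    first = pair.standby()
    first.patch_level = level
    log.append(f"patched {first.name} while offline")
    pair.fail_over()
    log.append(f"service IP moved to {first.name}")
    second = pair.standby()
    second.patch_level = level
    log.append(f"patched {second.name} while offline")
    return log


if __name__ == "__main__":
    pair = ZonePair(Zone("zoneA", "server1"), Zone("zoneB", "server2"))
    for line in rolling_patch(pair, level=1):
        print(line)
```

If the first patch goes badly, the rollback is simply not calling `fail_over()`, so clients stay on the unpatched zone.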

hi,

I've already installed such clusters for my customers... so yes, it is possible... :wink:

At the last customer it was Solaris 10 (u4) on two nodes with SunCluster 3.2. We had about 10 zones, 3 local and the rest on a failover FS. The local zones followed the same idea: if the patching or updating of one zone fails, we are still able to switch to the other node...

but like DukeNuke2 said, just call your sun partner, they should have the right consultant or system engineer to answer your questions. that's too much to answer on a forum: you will have to design the cluster framework, the filesystems (global, failover or local) and the zones for your specific needs. in short, yes it's possible, but whether it's really the right solution for you, hmmm, good question...

since i am such a consultant from a sun partner, i know that there are a lot of questions about zones in a cluster :stuck_out_tongue:
give them a chance, consultants don't bite :D:D

gP

do you guys have any resources available (docs, articles, etc.) on this subject, especially on zone failovers? i've worked with zones quite a lot but have never implemented this before, and i don't know much about it. anything would be awesome.

just out of curiosity, have any of you guys done this on an m5000, m9000 or, say, a 6800? seeing as they are multi-domain architectures, i'm wondering how clustering and zones work on those platforms.

have a look on docs.sun.com:
Sun Cluster Data Service for Solaris Containers Guide

yes, on all of them. m5000 just with zones, 6800 with SunCluster and zones (well, actually it was a sf6900) and m9000 with VCS and zones. but it's the same as with a whole physical machine. solaris doesn't know that there are other OSes on the same hardware, that's the point of a hardware domain. a "cluster in a box" protects against software issues, and it's like working with two nodes, they just share the same backplane :wink:

gP