Solaris 9 Home Directory, Two Machines Sharing a NAS

Good Morning,

I have 2 Solaris 9 machines sharing a NAS, and I need users to be able to log in from the 2nd machine and get to all of their files on the NAS that were created on the 1st machine.

So far it's working OK, but when users log in to the second machine, their user IDs show up as numbers, and they are not considered the owners of their files. I used

usermod -u 1234 username

to get the user ID on the second machine to match the user ID on the first, but the user could not log in after that. I believe the issue is with the home directory: I think I need to change the owner of it recursively which makes me nervous. Should I:

chown -R username /home/username

?

I'm concerned because, looking over that home directory, there are 2 entries in the root group, and one of those has root as the owner. These appear to be for navigation, e.g. the entries

.

and

..

(These always appear at the top of directories.)

Would changing the directory recursively mess things up? Is there something else I should do as well?

Assuming I got what you want -
The problem is the OS on box 1. It thinks requests from box 2 are network only requests.

The user coming in from 2 is treated as a network login on 1. You need to have those directories mounted locally on 2 (via samba on machine 2, for example). That way all of the permissions for a user from 2 are valid "inside" the filesystem. In other words, accessing the directory on 1 by a user on 2 gets "treated" as if the origin of the request was local to 1.

The samba example is just one method, there are some others, IIRC. It has been 10 years since I did anything with Solaris 9.

BTW, if anything happens to box 1, the users on box 2 are locked out as well, because their home directories are not available. So your initial design is less than optimal. You may have some NAS options that bypass this problem - I do not know.


Normally a home directory is fully owned by the user, and you can do

chown -hR username /home/username

The -h is necessary because by default chown follows a symbolic link. Imagine the user has a symlink mypasswd -> /etc/passwd ...:eek:
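A quick, harmless way to see the danger for yourself (no root needed; this only creates and inspects a symlink in a temp directory, it doesn't chown anything):

```shell
# Create a symlink like the hypothetical mypasswd -> /etc/passwd and show
# what a chown without -h would actually operate on: the link target, not the link.
tmp=$(mktemp -d)
ln -s /etc/passwd "$tmp/mypasswd"
target=$(readlink "$tmp/mypasswd")
echo "a chown without -h here would hit: $target"
rm -rf "$tmp"
```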
If you want to be safe then use "find" to only change owner to "username" if current owner is "olduser".

find /home/username -user olduser -exec chown -h username {} +

Both chown and find -user take a numeric UID or a username (that they automatically map to a UID).
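Here's a small sanity check, safe to run as any user in a temp directory, showing that find -user accepts either form:

```shell
# find -user matches by name or by numeric UID; both forms find the same files.
tmp=$(mktemp -d)
touch "$tmp/a" "$tmp/b"
by_name=$(find "$tmp" -type f -user "$(id -un)" | wc -l | tr -d ' ')
by_uid=$(find "$tmp" -type f -user "$(id -u)" | wc -l | tr -d ' ')
echo "matched by name: $by_name, by numeric UID: $by_uid"
rm -rf "$tmp"
```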


Whoa!! Hang on a minute. Let's explain some things here.

Any one filesystem can only be mounted by one operating system at a time. Mounting the same filesystem on multiple machines is an instant recipe for corruption.

Sometimes storage systems (SANs, NAS's, etc) can be dual-tailed into two different machines but only when a suitable software suite (cluster software) is managing the ownership (at any one time) of each filesystem on the storage. Then, when a failover occurs, the (expensive) cluster suite will swap over the mounted filesystems from one system to another.

As Jim alluded to, the filesystem(s) can only be mounted by one O/S, which controls file locking, file access (read/write), etc, and any second node needs to access those filesystems over the network (via NFS). So, repeat, mounting a filesystem on more than one system at the same time is big trouble. Make no mistake about that.


My understanding is that the two Solaris boxes NFS-mount the same file system from another box, a NAS (Network Attached Storage) box.
Perfectly normal.
But there is no common naming service in place, so users were created on each Solaris box and got different UIDs.


Hmm- it was set up like that prior to my arrival- possibly for years- I simply moved files from machine 1 to the NAS and then added symbolic links on both machines to the directories on the NAS.

Not sure if it's important, but there are 7 or 8 other machines (Windows, RedHat) on that switch with access to this NAS as well.

I've shut the second machine down for the moment so I can get to the bottom of this.

I read Jim's response again. I have to admit I didn't quite grasp it. He seems to be saying that the first machine has some kind of priority over the files on the NAS and the second machine is lower priority, and that I should mount the NAS on machine two (which I did, but you disagree with). Not sure if I'm getting that right.

We've had regular weekly ufsdumps of both machines saved on the NAS for a long time without issue, for what it's worth.

Does this change or clarify things any? I guess I'm wondering what the purpose of a NAS is if only one machine can use it.

Thanks for your help!

Clarification:

We have two boxes. Box 1 is the parent: box 1 owns the filesystem and shares it via NFS or Samba or whatever. Box 1 does not care who connects to the filesystem and then remote-mounts it via NFS. So you really have a proxy acting in box 1's very own kernel space when a request comes in over the network. Box 1 controls the NFS-mounted disk entirely, because it is actually physically mounted on box 1, not box 2.

Box 2 now runs an NFS/Samba client that connects to box 1 (via the SMB protocol, for example). Box 2 has a symlink to the NFS mountpoint (which lives on box 2) that points to box 1. This mountpoint connects, as a proxy, to the real disk on box 1.

This works great. I do not know what MadeInGermany saw in your post, but what I described is, I think, clear. Samba or NFS works fine on Solaris 9. You will need to read a little on configuring your fileserver on box 1. You do not seem to be running NFS to make box 1 a fileserver and box 2 a client of that fileserver.
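For illustration only - the host and path names below are hypothetical, and you should check the Solaris 9 docs for the exact syntax - the classic NFS server/client setup looks roughly like this:

```shell
# On box1 (the NFS server) -- export the filesystem:
#   share -F nfs -o rw /export/home
#   (put the share line in /etc/dfs/dfstab and run shareall to make it persistent)
#
# On box2 (the NFS client) -- mount it:
#   mount -F nfs box1:/export/home /home
```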

The only weenie you need to know:
As of 2013 the NFS version on Solaris 9 has/had a bug.

Before rebooting:
If you have a user that stays logged on in spite of policy (I did), then you must kill all users processes on both systems. And any process that has the filesystem in question open. All logged on users and possibly system maintenance processes can have open files/directories there.

Why? If there is a file held open on box 2 (i.e., some user has a process with its current directory "inside" the NFS mount) then box 1 will hang on shutdown. Forever. If you force-kill box 1, it will not rebuild NFS, so you lose the connection when you do reboot. Forever. Forever = you have to destroy and rebuild the connection on both sides. And it fails sometimes. As of 2015 there was a patch for this on Solaris 10, Solaris 11 did not have the problem, and there was no patch for Solaris 9. Verify this with Oracle support, if you still have support for your Solaris 9 box.

See: Solaris Operating System - Releases


Reading MadeInGermany's post#5 I can see that he has a different perception of this problem than Jim and I do.

So, questions:

Are node 1 and node 2 both mounting the filesystem(s) on the NAS via NFS?

What type of NAS is this? Make/model?

Is this NAS intelligent enough to share NFS handles for filesystems AND control all file locking, file read/write, and file contention? If so, the NAS is itself acting as node 1 and your Solaris 9 machines are NFS clients node 2 and node 3, both mounting via NFS which is okay.

Question then is: When the filesystem(s) need checking/correcting, how is that done? Is the NAS capable of creating and formatting filesystem(s) itself, and running integrity checks?

Sorry for all the questions but MadeInGermany's post#5 has got me thinking that Jim and I have perhaps not comprehended the problem from your post#1 due to lack of detail.

Awaiting answers........


Thanks - I checked machine 1 only (since I have machine 2 shut down right now).

/etc/mnttab

shows nfs for the NAS mounts.

It's an Iomega StorCenter ix4-200d.

I don't know if it's intelligent enough to share handles, but I can say that most users' UIDs on machine 2 had already been made to match those on machine 1. I create accounts locally on each machine.
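For reference, a quick way to print the numeric IDs that have to match across machines (run as any user; substitute the account in question for the current user used here):

```shell
# Print a user's numeric UID and primary GID for comparison across machines.
u=$(id -un)          # current user; replace with the account name to check
uid=$(id -u "$u")
gid=$(id -g "$u")
echo "$u uid=$uid gid=$gid"
```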

I know I can log in to the NAS via a browser and create shares, and typically users create directories from the Windows machines.

Does this help?

------ Post updated at 10:02 AM ------

Even though I moved the files off of Box 1 and on to the NAS?

Physically mounted? Box 1, Box 2, and the NAS are all connected through the switch. Box 1 and box 2 mount the NAS in exactly the same way as far as I can tell.

Does this change anything?

How are the filesystems from the former server, which acted like a NAS, exported?
Is the same or similar configuration used during export from the new NAS storage?
Are any firewalls (ipfilter, ipnat) locally configured on the Solaris boxes (or any other clients of the former Solaris NAS box)?

Can you show output from following commands, issued from one of the solaris boxes.

showmount -e <your working acting NAS solaris box which you wish to migrate>
showmount -e <your NAS storage ip address>

When the NAS storage is up, does it work on other operating systems like CentOS, or does it exhibit the same behavior as on the Solaris boxes?

What is the actual error when user tries to, for instance, ssh to a box ?

In most cases, for home directories automount is used on Solaris.

This is a nice article about automount :
Less known Solaris Features: /export/home? /home? autofs? - c0t0d0s0.org
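For example, a hypothetical wildcard map in /etc/auto_home (assuming the NAS exported per-user directories under /export/home - the names here are made up) would look like:

```shell
# /etc/auto_home on each Solaris client; the & expands to the matched key (the username)
# *   nas-box:/export/home/&
```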

Regards
Peasant.


Your post#9 confirmed that my assumption in post#5 is right, and you should try my suggestion in post#3 to alter file ownership.
I forgot to mention that the "chown" requires "root=" permission on the NAS box (in the NFS exports), otherwise it will fail with error "Not owner".
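On a Solaris NFS server that is a root= share option; many NAS appliances run Linux, where the rough equivalent in /etc/exports is no_root_squash. The host and path names below are hypothetical:

```shell
# Solaris-style share with root access granted to the two client boxes:
#   share -F nfs -o rw,root=box1:box2 /export/home
#
# Linux-style /etc/exports line (common on NAS appliances):
#   /nfs/home  box1(rw,no_root_squash)  box2(rw,no_root_squash)
```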


So,

To change user jsmith from being user 67123 to 1012:

usermod -u 1012 jsmith
find /home/jsmith -user 67123 -exec chown -h 1012 {} +

I think I like this (find) one because it sounds completely reversible to me.

If I were to do a recursive change and files or directories owned by someone else happened to be in there, they would get changed, and then I would never know how to change them back, not knowing who they had belonged to.
What does {} + do?

While {} \; runs a chown for each file, {} + collects the filenames and, when the list grows long enough, runs chown with the collected argument list.
Fewer invocations of chown => greater speed.
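You can see the difference with a harmless echo in a temp directory (no chown involved):

```shell
# \; runs the command once per file; + batches all the names into one invocation.
tmp=$(mktemp -d)
touch "$tmp/a" "$tmp/b" "$tmp/c"
per_file=$(find "$tmp" -type f -exec echo {} \; | wc -l | tr -d ' ')   # 3 lines: one echo per file
batched=$(find "$tmp" -type f -exec echo {} + | wc -l | tr -d ' ')     # 1 line: one echo for all three
echo "per-file invocations: $per_file, batched invocations: $batched"
rm -rf "$tmp"
```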


Just to confirm- is this the best way to change someone's uid? (what I said in #12) I don't want to break the account.

Yes.
You are right - the find command is reversible.

Thanks. Haven't tried this one yet. I'm being overly cautious, so I wanted to ask: is this the cause of the login issue I was having when I changed the uid before, or could it be something else? Many posts I've seen about changing the uid just show how to change it, without any mention of this issue I'm having. And if this is the issue, do you know the specific files or directories within the user's home directory that are responsible for the login failure? If it's just a few, I could perhaps change only those to test the theory.

Hi Stellaman,

Your suggested methodology for changing the user ID and file ownership is the correct way to make the change. Bringing the user IDs and the group IDs into alignment will probably be required; depending on the user population on the estate, you may have to repeat this exercise for a number of users. Now would probably be a good time to plan a system for the management/alignment of the user IDs and group IDs over the whole estate.

This is a common issue in the world of larger estates; it is frequently down to organic growth, where no or poor forward planning is in place. Large enterprises tend to have management tools in place where user IDs can be correctly managed - this tends to happen after the problem has been encountered.

Regards

Gull04

OK, I ran the commands and the user was able to log in successfully after the uid change. He also looked around his files on the NAS and all was well. Thanks! Though I now notice that while his uid matches between the two machines, the group IDs do not match. Is this a problem?

Hi Stellaman;

The group IDs are not so much of a problem - however, it is best practice to keep them aligned, as some users may depend on group permissions for access to files/programs.

If you check the /etc/group file on each of the systems, you should be able to bring both accounts into alignment. This can be done in the same manner as you aligned the user IDs, but you should check the group memberships, because unlike a user ID, a group can have many members.

Regards

Gull04

Yes, even if it currently might not matter, it is better to have the GIDs in sync as well.
Analogous to the

usermod -u newuid username
find /home/username -user olduid -exec chown -h newuid {} +

you run

groupmod -g newgid groupname # changes /etc/group and perhaps referring entries in /etc/passwd
usermod -g newgid username # if not done by the previous groupmod
find /home/username -group oldgid -exec chgrp -h newgid {} +

The user must log out and back in on the target system, otherwise they would continue running with the old GID.
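After the chgrp pass you can verify nothing was missed with find -group. Here is a safe, self-contained check that only looks at files it just created (note: files created in a setgid directory would inherit the directory's group rather than your primary GID):

```shell
# Count files owned by the current primary group vs. any other group.
tmp=$(mktemp -d)
touch "$tmp/a" "$tmp/b"
mine=$(find "$tmp" -type f -group "$(id -g)" | wc -l | tr -d ' ')
others=$(find "$tmp" -type f ! -group "$(id -g)" | wc -l | tr -d ' ')
echo "in my primary group: $mine, in other groups: $others"
rm -rf "$tmp"
```

In the real case you would run `find /home/username -group oldgid` after the change; an empty result means the chgrp pass caught everything.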