Append users if GID exists

I have thousands of users assigned various roles. The role header defines the GID that they should have. If I can get all the GID information into /etc/group format I can upload the output file and easily assign all users into the desired GID. Right now I can get an output file like this:

oracle:x:4579:user1
oracle:x:4579:user2
sybase:x:9475:user1
sybase:x:9475:user2
oracle:x:4579:user3
oracle:x:4579:user4
sybase:x:9475:user5

There's nothing wrong with uploading the file like this, but it's pretty big. I produce it with the following loop:

for i in $(cat "$tmp_gid_file")
do
    TMP_GID=$(grep -w "$i" master_group_list | sed 's/:/:x:/')
    echo "$TMP_GID:$USER_ID" >> gid_file
done

tmp_gid_file has a list of GIDs, one per line. I have to look up each GID in a master file instead of using getent group (it's a long story). The lookup returns <group_name>:<GID>, which is why I replace the colon with :x:

I'd like to produce a GID file that has only 1 GID entry per line and does not duplicate. I would like it to look like this:

oracle:x:4579:user1,user2,user3,user4
sybase:x:9475:user1,user2,user5

I started with an if statement but I have a problem appending the user's ID to the end of the line where the GID is found:

GID_EXISTS=`grep -iw $i gid_file`
if [ ! -z $GID_EXISTS ] 
then
   sed '/$i/s/$/:$USER_ID/' gid_file  ##This seems to print the line, not append 
                                      ##the user's ID to the line where the GID is found
else
...

I'm using GNU/Linux (that's what I get from uname -a...not sure if it's rhel or suse or what) and my shell is bash.

It sounds like you could filter the list with egrep, sed, awk, or plain bash as it downloads, keeping only what you want for the list of users.

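Shell $variables are not expanded inside single quotes (not much is, which is why I generally prefer them), so your sed script sees the literal text $i and $USER_ID. Put that script in double quotes instead.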

There is no point in going through the file twice, once with grep and once with sed.
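
To illustrate the quoting point, here is a minimal sketch, assuming GNU sed (for -i) and a scratch copy of the file; the sample values of i and USER_ID are made up for the demonstration. It uses a comma rather than a colon, because /etc/group separates members with commas, and it edits the file in place, since plain sed only prints the result to stdout.

```shell
# Scratch file standing in for gid_file (assumption for the demo).
gid_demo=$(mktemp)
printf '%s\n' 'oracle:x:4579:user1' 'sybase:x:9475:user1' > "$gid_demo"

i=4579          # GID being processed (sample value)
USER_ID=user2   # user to add (sample value)

# Double quotes: the shell expands $i and $USER_ID before sed sees the script.
# \$ reaches sed as a literal $, meaning "end of line".
# -i (GNU sed) rewrites the file instead of printing to stdout.
sed -i "/:$i:/s/\$/,$USER_ID/" "$gid_demo"

cat "$gid_demo"
```

Only the line whose GID field matches gets the new user appended; the other lines pass through unchanged.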

On Red Hat, the exact release identity is in /etc/*red*hat* (typically /etc/redhat-release).

You could use something like the following:

awk -F':' '{ idx=$1":"$2":"$3":"; rec[idx]=rec[idx]","$4 } END {for (var in rec) print var substr(rec[var],2) }' YOURGIDFILE
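
One caveat: awk's "for (var in rec)" visits keys in an unspecified order, so the groups may come out shuffled. If the order matters, a variation like the following sketch keeps groups in the order they first appear. It runs against the sample lines from the question; the scratch file and the variable names are only for illustration.

```shell
# Sample input copied from the question, written to a scratch file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
oracle:x:4579:user1
oracle:x:4579:user2
sybase:x:9475:user1
sybase:x:9475:user2
oracle:x:4579:user3
oracle:x:4579:user4
sybase:x:9475:user5
EOF

merged=$(awk -F: '
    {
        key = $1 ":" $2 ":" $3
        if (!(key in users)) {       # first time this group is seen
            order[++n] = key         # remember arrival order
            users[key] = $4
        } else {
            users[key] = users[key] "," $4
        }
    }
    END {
        for (k = 1; k <= n; k++)
            print order[k] ":" users[order[k]]
    }
' "$sample")

printf '%s\n' "$merged"
```

On the sample data this prints the two lines asked for in the question: one oracle entry with user1 through user4, then one sybase entry with user1, user2 and user5.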

If you can decompose the installed user ids and each gid into lines in a simple file, and do the same for the desired user ids and gid, then you can sort both and pass them to comm to find out what is missing. On bash, and on ksh on systems with /dev/fd/[0-9]*, you can pipe it all and have less clutter and delay:

comm -23 <(
  gen-desired-list-to-stdout | sort -u
 ) <(
  gen-actual-list-to-stdout | sort -u
 )| create-any-missing-group-or-missing-id-or-id-group-affiliation
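
As a concrete sketch of the same idea, assume each list has been reduced to one "gid:user" pair per line. That layout and the sample data are made up for illustration; the generator commands above are placeholders, so plain files stand in for them here. This needs bash (or another shell with process substitution).

```shell
# Hypothetical desired and actual memberships, one gid:user pair per line.
desired=$(mktemp)
actual=$(mktemp)
printf '%s\n' '9475:user5' '4579:user1' '4579:user2' '9475:user1' > "$desired"
printf '%s\n' '4579:user1' '9475:user1' > "$actual"

# comm needs sorted input. -23 suppresses lines unique to the second input
# and lines common to both, leaving only pairs that are desired but absent.
missing=$(comm -23 <(sort -u "$desired") <(sort -u "$actual"))

printf '%s\n' "$missing"
```

Swapping -23 for -13 reports the opposite case: pairs that are present but not desired.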

I figured it out before I was able to consider some of your suggestions :) I grep the destination file to see if the GID already exists there, and store the result in a variable. If that variable is empty, I append the GID definition to the bulkload file. If it is not empty, I remove the line that contains the GID definition and replace it with the same line with the user's id appended at the end.
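
The exact code was not posted, so the following is only a rough sketch of the logic described above. The function name add_user_to_gid is hypothetical, and it assumes master_group_list holds <group_name>:<GID> lines and gid_file is the bulkload file, as in the question.

```shell
# Hypothetical helper: add_user_to_gid GID USER
add_user_to_gid() {
    local gid=$1 user=$2 existing
    existing=$(grep ":x:$gid:" gid_file)
    if [ -z "$existing" ]; then
        # GID not in the bulkload file yet: build the entry from the master list.
        existing=$(grep ":$gid\$" master_group_list | sed 's/:/:x:/')
        echo "$existing:$user" >> gid_file
    else
        # GID already present: drop the old line, then re-add it with the new user.
        grep -v ":x:$gid:" gid_file > gid_file.tmp
        echo "$existing,$user" >> gid_file.tmp
        mv gid_file.tmp gid_file
    fi
}

# Demo in a scratch directory with made-up data.
cd "$(mktemp -d)" || exit 1
printf '%s\n' 'oracle:4579' 'sybase:9475' > master_group_list
: > gid_file
add_user_to_gid 4579 user1
add_user_to_gid 9475 user1
add_user_to_gid 4579 user2
cat gid_file
```

Each call rescans gid_file, so the total work grows roughly with users times groups, which is the inefficiency mentioned in the thread.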

Probably not very efficient as DGPickett points out...I may have to try a different logic as we increase in numbers.

Yes, it is nice to avoid n^2 problems at the start! Solutions built on comm are very robust, scale well, can exploit multiprocessing, and detect both negative (missing) and positive (wrong) problems!