Checksum comparison and listing the differences

Hi All,

We have a requirement for which I have created a shell script that computes the checksums of the files in a remote directory and in a local directory,
and writes those details to two different files in the format below:

File1:

0bee89b07a248e27c83fc3d5951213c1 /home/test1/test1/test1.sh
d41d8cd98f00b204e9800998ecf8427e /home/test1/test1/test2.sh

File2 (remote server):

e48bc2344e4f2e2778309fa3abf8fb3f /home/test1/test1/test1.sh
d41d8cd98f00b204e9800998ecf8427e /home/test1/test1/test2.sh
d41d8cd98f00b204e9800998ecf8427e /home/test1/test1/test4

Here the first column is the checksum and the second column is the filename with its path.
Now I need to compare the two files by checksum so that the output looks like this:

/home/test1/test1/test1.sh differs
/home/test1/test1/test2.sh same
/home/test1/test1/test4 only present in file2_name

All of this is required for an overall code comparison for the client.

You could try something like:

awk '
NR == 1 {
        f1 = FILENAME
}
FNR == NR {
        cs[$2] = $1
        next
}
$2 in cs {
        if($1 == cs[$2])
                print $2, "same"
        else    print $2, "differs"
        delete cs[$2]
        next
}
{       print $2, "only present in", FILENAME
}
END {   for(i in cs)
                print i, "only present in", f1
}' File1 File2

If you want to try this on a Solaris/SunOS system, use /usr/xpg4/bin/awk, /usr/xpg6/bin/awk, or nawk instead of the default /usr/bin/awk.
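To check the awk program against the sample data from the first post, something like the following sketch could be used. The wrapper name compare_checksums and the temporary directory are only there for illustration and are not part of the original answer. Note that the program keys on $2, so it assumes paths without whitespace.

```shell
#!/bin/bash
# Hypothetical wrapper around the awk program above: compare_checksums FILE1 FILE2
compare_checksums() {
    awk '
    NR == 1   { f1 = FILENAME }            # remember the name of the first file
    FNR == NR { cs[$2] = $1; next }        # first file: store checksum by path
    $2 in cs  {                            # second file: path seen in first file
        if ($1 == cs[$2]) print $2, "same"
        else              print $2, "differs"
        delete cs[$2]
        next
    }
    { print $2, "only present in", FILENAME }
    END { for (i in cs) print i, "only present in", f1 }
    ' "$1" "$2"
}

# Recreate the two sample listings in a scratch directory.
tmp=$(mktemp -d)
printf '%s\n' \
    '0bee89b07a248e27c83fc3d5951213c1 /home/test1/test1/test1.sh' \
    'd41d8cd98f00b204e9800998ecf8427e /home/test1/test1/test2.sh' > "$tmp/File1"
printf '%s\n' \
    'e48bc2344e4f2e2778309fa3abf8fb3f /home/test1/test1/test1.sh' \
    'd41d8cd98f00b204e9800998ecf8427e /home/test1/test1/test2.sh' \
    'd41d8cd98f00b204e9800998ecf8427e /home/test1/test1/test4' > "$tmp/File2"

( cd "$tmp" && compare_checksums File1 File2 )
rm -rf "$tmp"
```

With that sample data it prints:

/home/test1/test1/test1.sh differs
/home/test1/test1/test2.sh same
/home/test1/test1/test4 only present in File2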

Hi Don,

Thanks, that worked like a charm.

I have developed my script. Now I need to format it to give it a professional look, as it needs to be presented to the client. Could you please help me with this?

Below is the script:

#!/bin/bash
function ssh_server_local()
{
if [ -r dir1 ];
then
for i in `cat dir1`
do
find $i -type f -exec md5sum {} +|sort -k2 > $name_local
done
else
echo "Please enter the folder path in dir1 file"
fi
}
function ssh_server_remote()
{
if [ -r dir2 ];
then
for i in `cat dir2`
do
ssh $username@$ip_remote "find $i -type f -exec md5sum {} +|sort -k2; exit" > $name_remote
done
else
echo "Please enter the folder path in dir2 file"
fi
}
function comparison()
{
awk '
NR == 1 {
f1 = FILENAME
}
FNR == NR {
cs[$2] = $1
next
}
$2 in cs {
if($1 == cs[$2])
print $2, "same"
else print $2, "differs"
delete cs[$2]
next
}
{ print $2, "only present in", FILENAME
}
END { for(i in cs)
print i, "only present in", f1
}' $name_remote $name_local >result.txt
}
 
echo "Please confirm if the directory path has been added in the dir1 y/n"
read d1
echo "Please confirm if the directory path has been added in the dir2 y/n"
read d2
echo "Please enter the user to which file belongs in the remote server"
read username
echo "Please enter the ip address of remote server"
read ip_remote
echo "Please enter the name of the remote server"
read name_remote
echo "Please enter the ip address of local server"
read ip_local
echo "Please enter the name of local server"
read name_local
ssh_server_local
ssh_server_remote
comparison

For starters, you can replace the useless use of cat and the dangerous backticks in for i in `cat dir1` ; do ... done with

while read -r i
do
...
done < dir1
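Applied to the local checksum step, that loop might look like the sketch below. The function name list_checksums and its two arguments are hypothetical. It also redirects once after the loop, because the > inside the original loop overwrites the output file on every pass, so only the last directory would survive.

```shell
#!/bin/bash
# Hypothetical helper: list_checksums DIRLIST OUTFILE
# Reads one directory per line from DIRLIST and writes a single
# path-sorted md5sum listing for all of them to OUTFILE.
list_checksums() {
    local dirlist=$1 outfile=$2 dir
    # The "|| [ -n ... ]" part also handles a last line with no trailing newline.
    while IFS= read -r dir || [ -n "$dir" ]
    do
        [ -n "$dir" ] || continue              # skip blank lines
        find "$dir" -type f -exec md5sum {} +
    done < "$dirlist" | sort -k2 > "$outfile"  # sort and write once, after the loop
}
```

It would be called as, say, list_checksums dir1 "$name_local". If you put ssh inside a loop like this for the remote side, use ssh -n (or redirect its stdin from /dev/null), otherwise ssh reads the rest of the directory list from the loop's standard input.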

And add more error checking, so the script quits instead of spitting errors when given nonsense values or when missing important files. I often use this kind of construct:

die() { # Print error message and quit, use 'die "string"' or 'die "string" exitcode'
        echo "$1" >&2
        exit ${2-1}
}

# Get input from user
...

[[ -z "$string" ]] && die "String is blank"
[[ -e "filename" ]] || die "filename does not exist"

...
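For the script in this thread, the same idea could be sketched as below. The function name check_inputs and the choice of exit code 2 for an unreadable file are my own assumptions, not something the original script defines.

```shell
#!/bin/bash
die() { # Print error message and quit, use 'die "string"' or 'die "string" exitcode'
    echo "$1" >&2
    exit "${2-1}"
}

# Hypothetical validator: check_inputs REMOTE_USER REMOTE_HOST DIRLIST
check_inputs() {
    [[ -n "$1" ]] || die "Remote user name is blank"
    [[ -n "$2" ]] || die "Remote host is blank"
    [[ -r "$3" ]] || die "Cannot read directory list: $3" 2
}
```

Called right after the read prompts, for example check_inputs "$username" "$ip_remote" dir1, it stops the script before find or ssh ever runs with empty values.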

No. This forum is intended to help you learn how to use the tools available on Linux and UNIX Systems so you can effectively use these wonderful platforms. We are not here to do your job for you.

On the other hand, if you like the code I write and would like to hire me to turn sample code into something that could be used in a production environment, send me private mail giving details of what job you want me to do for you and how much you will pay me to do the job.

I would say the most important step toward a professional look is indenting your code.

Writing nicely indented code shows that you have a clear sense of what you're writing and that you know how to "respect programming".

Can I also suggest:

1. Check command-line parameters and print quick usage info; consider a -h help parameter and the ability to pass the prompted values in as command options.

2. Properly validate entered data/parameters, e.g. ensure the response to the y/n prompts is valid (what are you using d1 and d2 for anyway?) and that IP addresses are pingable.

3. Use good names for functions and variables: try to describe what they do/store, e.g. "comparison" is less than informative.

4. Try to write a manual page that fully describes the script and any prerequisites.
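The first suggestion could be sketched with the getopts builtin roughly as follows. The script name compare_dirs.sh, the option letters, and the variable names are all made up for illustration.

```shell
#!/bin/bash
# Hypothetical usage text; script name and options are illustrative only.
usage() {
    cat <<'EOF'
Usage: compare_dirs.sh [-h] [-u remote_user] [-r remote_host]
  -h  show this help
  -u  user that owns the files on the remote server
  -r  name or IP address of the remote server
Values not given as options are prompted for.
EOF
}

# parse_args "$@": sets remote_user, remote_host and want_help.
parse_args() {
    local opt OPTIND=1            # reset so the function can be called again
    remote_user= remote_host= want_help=
    while getopts ':hu:r:' opt
    do
        case $opt in
            h)  want_help=yes ;;
            u)  remote_user=$OPTARG ;;
            r)  remote_host=$OPTARG ;;
            :)  echo "Option -$OPTARG needs a value" >&2; return 1 ;;
            \?) echo "Unknown option -$OPTARG" >&2; return 1 ;;
        esac
    done
}
```

The main part of the script would then call parse_args "$@", print usage and exit if want_help is set or parsing failed, and fall back to a read prompt only for values that are still empty.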