Comparing two directories with diff

Hi all,

I have two directories on two different servers. I am trying to find out what is missing from directory X and what is missing from directory Y. They should both contain exactly the same files.

I understand some files may be missing from both directories. I am not sure how to approach this. I already saved an ls of each directory to a text file; now I just need to compare and filter. Is there an easy way I can do this with 'diff' or do I need to take another approach?

Thanks in advance

Be lazy.
Create a new directory Z. Copy all the files in X to Z and all the files in Y to Z, then copy Z over the top of X and Y. After that, you only have to compare the list of files in Z to your master list to find any files that were missing from both.

I was thinking of doing that too, by far the easiest way.

I have some concerns about doing this on a production system, though. The differences between the directories will still be in the thousands of files, so a script will probably still be required. Thanks for the advice.

Is it the goal to know if there are missing files, or to keep the files in sync?

For the latter, I recommend rsync, which copies only missing files.

Are the files text or binary (executables)?

They are data files or international-language text.

Do you want to know about differing contents too, or is it enough that files with the right names exist?

For the latter (just an idea, not tested):

user@server1 $ ls >files-s1
user@server2 $ ls >files-s2
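# copy both lists to one machine, then create the list of all files
cat files-s1 files-s2 | sort -u >files-all

# missing from server1 (-F: fixed strings, -x: match whole lines only)
grep -Fxv -f files-s1 files-all

# missing from server2
grep -Fxv -f files-s2 files-all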
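
If both lists are sorted, comm can report each side's missing names directly, without building files-all first. A minimal sketch, assuming one filename per line; the sample names and the temporary working directory are made up for illustration:

```shell
# Hypothetical sample listings standing in for the real `ls` output.
workdir=$(mktemp -d)
printf '%s\n' a.dat b.dat c.dat > "$workdir/files-s1"
printf '%s\n' b.dat c.dat d.dat > "$workdir/files-s2"

# comm needs both inputs sorted in the same collation order.
LC_ALL=C sort "$workdir/files-s1" > "$workdir/s1.sorted"
LC_ALL=C sort "$workdir/files-s2" > "$workdir/s2.sorted"

# -13 keeps lines unique to the second list: on server2 but not on server1.
missing_from_s1=$(LC_ALL=C comm -13 "$workdir/s1.sorted" "$workdir/s2.sorted")
# -23 keeps lines unique to the first list: on server1 but not on server2.
missing_from_s2=$(LC_ALL=C comm -23 "$workdir/s1.sorted" "$workdir/s2.sorted")

echo "missing from server1: $missing_from_s1"
echo "missing from server2: $missing_from_s2"
```

Files missing from both servers will not show up in either result; those can only be found by comparing against a master list.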

As Stomp mentioned before, rsync is a good match for this type of problem.

It offers a --dry-run option that reports the differences and what would be done to resolve them, without changing anything. It also has many configuration options, for example: comparing only file size and modification time (the default) or doing a full checksum comparison of each file (--checksum), removing files no longer present on the master server (--delete), compressing data as it is transferred between the servers (--compress), capping bandwidth to reduce the impact on the network (--bwlimit), preserving file ownership and permissions (--archive), etc.
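
A dry run that lists only the files missing on the receiving side might look roughly like this. It is a sketch using two local temporary directories as stand-ins; in real use the destination would be a remote path such as user@server2:/path/to/Y/ (a made-up name), and rsync has to be installed on both ends:

```shell
# Hypothetical stand-ins for the two directories being compared.
src=$(mktemp -d)
dst=$(mktemp -d)
touch "$src/a.dat" "$src/b.dat" "$dst/a.dat"

# -r recurse, -n dry run (nothing is copied), --ignore-existing skips
# files already present on the receiving side, --out-format='%n' prints
# one name per item that would be transferred.
would_copy=$(rsync -rn --ignore-existing --out-format='%n' "$src/" "$dst/")

echo "missing from destination:"
echo "$would_copy"
```

Swapping the two directory arguments gives the list for the other direction. Dropping -n performs the copy for real, so it is worth reading the dry-run output first on a production system.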

Please always tell us what operating system you're using when you start a thread like this.

Some systems have a utility named dircmp that, among other things, can tell you which files are missing from either of two directories given as operands and, for files that are present in both directories, whether or not their contents are the same.
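
Where dircmp is not available, a recursive diff gives a similar report, assuming both directories can be reached from one machine (for example through a network mount). A rough sketch with made-up sample files; -q is supported by GNU and BSD diff but is not required by POSIX:

```shell
# Hypothetical stand-ins for directories X and Y.
x=$(mktemp -d)
y=$(mktemp -d)
echo same > "$x/common.dat";  echo same > "$y/common.dat"
echo one  > "$x/changed.dat"; echo two  > "$y/changed.dat"
echo only > "$x/only-in-x.dat"

# -r recurses into subdirectories; -q reports only whether files differ.
# diff exits with status 1 when it finds differences, hence the `|| true`.
report=$(LC_ALL=C diff -rq "$x" "$y" || true)
echo "$report"
```

Lines starting with "Only in" name files missing from the other directory; the remaining lines name files present in both whose contents differ.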