Compare two text files and output difference

Hi experts,

I am trying to compare two text files and output the difference to another file.
I'm not strictly looking for differences in text but additional text at the end of one file that isn't in another, so basically comparing the file 2 against file 1 and printing any additional text to file 3.

Code

# Tool number
toolnr=`uname -n | cut -c2-`

# Log Directory Path
LOG_path="/usr/asm/atl.1001/user_data/error_logs"

# Input file names
file1="ASML_LOGBOOK.TXT"
file2="ASML_LOGBOOK_2.TXT

diff ${file1} ${file2} |grep "^<" > ${LOG_path}/LOG${toolnr}.TXT

The following is the error output when I run the script:
Error code:
> /usr/asm/atl.1001/user_data/error_logs/LOG1001.TXT^J^J^J: cannot open

Any help greatly appreciated!

It appears that you do not have permission to create /usr/asm/atl.1001/user_data/error_logs/LOG1001.TXT if it doesn't already exist, or you do not have permission to erase the current contents and write new data to that file if it does already exist. What is the output of the command:

ls -l /usr/asm/atl.1001/user_data/error_logs/LOG1001.TXT

after you get that error message?

Just out of curiosity, why do you want to remove the first character of the system's node name when creating a log file for that system?

Hi Don, I took that code from another thread on here about comparing two text files, so I'm not sure what that bit is doing. I thought it was taking the differences in file 2 compared to file 1 and printing them to another file?

I don't have access to the machine at the moment, but I should have privileges to read and write files, as I do this all the time. After the first error I opened and saved LOGtoolnr.txt, so the file actually exists now, but it still can't be accessed.

How could I clean up my code to just print the differences to another file?

Thanks

In the error message, the filename that cannot be opened appears to have three trailing <newline> characters. Are you sure that this is what you want? You had better double-check how the filename is composed.

OK. I didn't realize how little experience you have writing scripts. When I saw that you were selecting specific pieces of the output of diff, I assumed you had more knowledge about shell programming than you seem to have.

I went back to your first message in this thread and looked at your script again. The immediate problem is that you are missing a closing quote on the line:

file2="ASML_LOGBOOK_2.TXT

Therefore, the quoted string runs on to the next double quote, and the value you assigned to the file2 variable is:

"ASML_LOGBOOK_2.TXT

diff ASML_LOGBOOK.TXT ${file2} |grep "

with the shell then treating the < that follows as an input redirection and ending up trying to read input from a file named:

 " > ${LOG_path}/LOG${toolnr}.TXT
 
 

with the name presumably being terminated by a " on a line you didn't show, or by your shell closing the string when it hit the end of file of your script.

If you just copied this code from another thread without understanding what you're doing, I strongly suggest that you just replace what you showed us with the command:

diff ASML_LOGBOOK.TXT ASML_LOGBOOK_2.TXT

to see if the output is what you're expecting. If it is, then change it to:

diff ASML_LOGBOOK.TXT ASML_LOGBOOK_2.TXT > LOG1001.TXT

If it isn't what you want, read the diff(1) man page to see if there is an option that will give you what you want. If not, explain in detail what you really want in LOG1001.TXT and how that is different from what the diff utility produces by default.
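The missing-quote problem described above can often be caught before the script is ever run. The following is only a sketch: it assumes a POSIX-style shell (bash, ksh, dash) that supports $( ) and a mktemp command, and the file name broken.sh is made up for the illustration. The -n option asks the shell to read the script without executing it. Some older shells may quietly close the string at end of file instead of complaining, so a clean result is not a guarantee.

```shell
# Write a small script with the same unterminated quote as in this thread.
# broken.sh is a hypothetical name used only for this illustration.
tmp=$(mktemp -d)
cat > "$tmp/broken.sh" <<'EOF'
file1="ASML_LOGBOOK.TXT"
file2="ASML_LOGBOOK_2.TXT
diff ${file1} ${file2} | grep "^<" > out.txt
EOF

# -n parses the script without running any of its commands.
if sh -n "$tmp/broken.sh" 2>/dev/null; then
    status=ok
else
    status=broken
fi
echo "syntax check: $status"

rm -rf "$tmp"
```

Dropping the 2>/dev/null shows the shell's own message, which usually names the unmatched quote.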

Hi again Don,

Thanks again for your reply and input. I'll have access to the UNIX machine later on so I'll try what you suggested.

To explain exactly what I am looking for in log1001.txt:

A logbook is kept on the tool, which is used to document work carried out on the tool. At 0700 and 1900 there is a pass-down where the actions from the last 12 hours need to be documented.

What I am trying to do is save a copy of the logbook (asml_logbook.txt) at 0700 and then at 1900 compare that to the current logbook (asml_logbook_2.txt) to see if anything has been added (i.e. any actions performed in the last 12 hours). Anything new will have been added to the end of the current file, so I want to copy this extra text to the third file, log1001.txt.

I have tried to find out if there is a way to search between two date + time stamps depending on current time but this doesn't seem to be possible.

M

---------- Post updated 08-24-12 at 08:40 AM ---------- Previous update was 08-23-12 at 11:26 PM ----------

Hi Again Don,

I just added the quote that I was missing at the end of the file name and now it seems to have worked. One small thing though: is there a way to get the output without a "<" at the start of each line?

Thanks,

M.

The pipeline that you're running is:

diff ${file1} ${file2} |grep "^<" > ${LOG_path}/LOG${toolnr}.TXT

Please just enter the command:

diff ASML_LOGBOOK.TXT ASML_LOGBOOK_2.TXT

and look at the output produced. Note that this command is the first command in your pipeline after the shell expands the file1 and file2 variables.
Note that there are lines starting with < which identify lines that are in file1, but not in file2; lines starting with > which identify lines that are in file2, but not in file1; and lines not starting with < or > that indicate where the lines starting with < and > appear in the files. The second stage in your pipeline:

grep "^<"

says that you want to discard all of the output produced by diff except for the lines that have < as the first character on the line. This means that you are throwing away all of the information that tells you whether the lines you have selected were changed in file2 or are only present in file1. You are also throwing away the lines that are in file2 but not present in file1, the lines that show how a changed line appears in file2, and all of the lines that indicate where lines were added, deleted, or changed in these files.

If this is what you want and you also want to throw away the first two characters (< and space) from the remaining lines, replace the

grep "^<" 

in your pipeline with:

awk  '!/^</ {next} {sub(/^../,"");print}'
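To see what the markers mean in practice, here is a small sketch with two made-up files (old.txt and new.txt are hypothetical stand-ins for the two logbook snapshots). In the default diff output, < marks lines taken from the first file named on the command line and > marks lines taken from the second, so the order of the arguments decides which marker the appended lines get. The sketch assumes a POSIX-style shell with $( ) and mktemp, and an awk that has sub().

```shell
# Hypothetical sample files standing in for the two logbook snapshots.
tmp=$(mktemp -d)
printf 'entry 1\nentry 2\n' > "$tmp/old.txt"
printf 'entry 1\nentry 2\nentry 3\nentry 4\n' > "$tmp/new.txt"

# Older snapshot first: the appended lines are marked with ">".
# diff exits with status 1 when the files differ, hence "|| true".
diff "$tmp/old.txt" "$tmp/new.txt" || true

# Newer snapshot first: the same lines are marked with "<", so the
# filter from this thread keeps them and strips the two-character marker.
added=$(diff "$tmp/new.txt" "$tmp/old.txt" |
        awk '!/^</ {next} {sub(/^../,"");print}')
printf '%s\n' "$added"

rm -rf "$tmp"
```

If the older snapshot is given first instead, the same filter with > in place of < selects the appended lines.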

Hi Don, thanks again for the help...

The following is what I currently have :

#!/bin/sh
# Shell Script to compare file 1 and file 2 and print differences to file 3


# Tool number
toolnr=`uname -n | cut -c2-`

# Log Directory Path
LOG_path="/usr/asm/atl.1001/user_data/error_logs"

# Input file names
file1="ASML_LOGBOOK.TXT"
file2="ASML_LOGBOOK_2.TXT"

diff ${file1} ${file2} |awk '!/^</ {next} {sub(/^../,"");print}' > ${LOG_path}/LOG1001.TXT


When I run this I get a message saying
awk: syntax error near line 1
awk: illegal statement near line 1

Am I running this in the wrong shell?

I tried changing the first line to

#!bin/awk

But that gives the same error messages as above, only for line 6. So I removed that variable line from the code and just wrote the output file name in by hand at the end, but then I get the same error as above for line 10.

I'm guessing I have one fundamental issue here that someone with any kind of knowledge in scripting would avoid!

M

I tried this script on my system (running OS X) using #!/bin/sh , #!/bin/bash , and #!/bin/ksh and all three worked just fine (after changing the setting of LOG_path to a directory that exists on my system). It won't work with csh or tcsh.

This is a shell command language script and awk is not a shell command language interpreter, so (as you have found) using /bin/awk as your command interpreter doesn't work.

What operating system are you using? I.e., what is the output from the command uname -a on your system?

Hi Don, just checked the output of

uname -a 

and the following is what I get:

SunOS mxxxx 5.10 where mxxxx is the machine number.

So I looked up the awk command on Solaris and replaced the awk command with /usr/xpg4/bin/awk, and it seems to work perfectly!

The following is my code, which does exactly what I want:

#!/bin/sh
# Shell Script to compare file 1 and file 2 and print differences to file 3


# Tool number
toolnr=`uname -n | cut -c2-`

# Log Directory Path
LOG_path="/usr/asm/atl.1001/user_data/error_logs"

# Input file names
file1="ASML_LOGBOOK.TXT"
file2="ASML_LOGBOOK_2.TXT"

diff ${file1} ${file2} |/usr/xpg4/bin/awk '!/^</ {next} {sub(/^../,"");print}' > ${LOG_path}/LOG${toolnr}.TXT

I'm pretty sure this will do exactly what I want.

Thanks very much for the help!
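A note on why the awk path mattered: on Solaris 10, /usr/bin/awk is the old awk, which does not have the sub() function, while /usr/xpg4/bin/awk (and nawk) do. If depending on a particular awk is a concern, the same filtering can be sketched with sed instead. This is only an alternative under the assumption that every wanted line starts with < followed by a space, as in default diff output; the sample input below is made up rather than taken from a real diff run.

```shell
# Hypothetical diff output, used here in place of a real diff run.
sample=`printf '3,4d2\n< entry 3\n< entry 4\n'`

# -n suppresses normal output; the p flag prints only lines where the
# substitution succeeded, i.e. lines that began with "< ".
stripped=`printf '%s\n' "$sample" | sed -n 's/^< //p'`
printf '%s\n' "$stripped"
```

In the script above, this would mean replacing the awk stage of the pipeline with sed -n 's/^< //p'.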