comparing two files using awk

Hit a brick wall while trying to knock up a script that takes each value from the "lookup" file, looks it up in the "target" file, and returns the values that appear in "lookup" but not in "target".

Just knocked something up using bits from previous threads, but there's gotta be something wrong with the syntax, as it will not execute properly.

It does print the output, but at the same time it also complains: "awk: cmd. line:2: (FILENAME=lookup FNR=4) fatal: `continue' outside a loop is not allowed"

First time I'm using awk, and I'm pretty much a novice at shell in general.

#!/bin/bash
awk 'FILENAME=="target" {arr[$0]++}
        FILENAME=="lookup" { if ($0 in arr) {continue}
        else {print $0}}' target lookup
exit 0

lookup contains
tom
ronnie
sam
naiz
mum
kash

target contains
naiz
mum
kash

output should be
tom
ronnie
sam

Welcome to the forum!
You can use the inverse:

awk 'FILENAME=="target" {arr[$0]++}
     FILENAME=="lookup" {if(!($0 in arr)) {print $0}}' target lookup

brilliant.

just need to output to a file now.
thought a simple > out would do, but:

#!/bin/bash
awk 'FILENAME=="target" {arr[$0]++}
        FILENAME=="lookup" { if (!($0 in arr))
        {print $0 > out}}' target lookup
exit 0

awk: cmd. line:3: (FILENAME=lookup FNR=1) fatal: expression for `>' redirection has null string value

I noticed and I adjusted my response as well :)

we keep stepping on each other's toes! see my 2nd post above :)

Your original would have worked if you had used next instead of continue, BTW.
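For anyone reading along, here is a rough sketch of the original script with next swapped in for continue. It rebuilds the sample lookup and target files from the first post in a scratch directory (the workdir and result names are just made up for the demo). next tells awk to stop processing the current line and move on to the next one, while continue is only valid inside a for/while loop.

```shell
#!/bin/bash
# Sketch only: recreate the sample files from the first post in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' tom ronnie sam naiz mum kash > lookup
printf '%s\n' naiz mum kash > target

# Same program as the original, with "next" instead of "continue".
# next skips straight to the next input line; continue needs an enclosing loop.
result=$(awk 'FILENAME=="target" {arr[$0]++}
     FILENAME=="lookup" { if ($0 in arr) {next}
     else {print $0}}' target lookup)

echo "$result"
```

With the sample data this should list tom, ronnie and sam, one per line.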

print $0 > "out"

ignore, fixed it by enclosing the filename in double quotes!
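A small sketch of why the quotes matter, using the sample files from the first post: inside an awk program a bare out is an awk variable that was never set, so it is an empty string, which is what the "null string value" error is complaining about. Quoting it makes it a string constant. The second variant passes the name in with -v; outfile and out2 are made-up names for illustration only.

```shell
#!/bin/bash
# Sketch only: recreate the sample files in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' tom ronnie sam naiz mum kash > lookup
printf '%s\n' naiz mum kash > target

# Quoted: "out" is a string constant, so awk writes to a file named out.
awk 'FILENAME=="target" {arr[$0]++}
     FILENAME=="lookup" { if (!($0 in arr)) {print $0 > "out"} }' target lookup

# Alternative: hand the file name to awk as a variable (outfile is hypothetical).
awk -v outfile="out2" 'FILENAME=="target" {arr[$0]++}
     FILENAME=="lookup" { if (!($0 in arr)) {print $0 > outfile} }' target lookup

cat out
```

Both out and out2 should end up with the same three names.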

haha, snap indeed!

thanks buddy

You can further shorten your script like this:

awk 'FILENAME=="target" {arr[$1]++}
     FILENAME=="lookup" && !($1 in arr) {print $0}' target lookup > out

and then:

awk 'FILENAME=="target" {arr[$1]++}
     FILENAME=="lookup" && !($1 in arr)' target lookup > out

I used $1 instead of $0 so that there are no mismatches due to spacing differences.
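To illustrate the spacing point, here is a small sketch with made-up data where one line in target has a trailing space. Comparing whole lines ($0) treats "naiz " and "naiz" as different, while comparing the first field ($1) ignores the surrounding whitespace. Note this assumes each line holds a single word; with multi-word lines $1 would only compare the first word.

```shell
#!/bin/bash
# Sketch only: hypothetical data with a stray trailing space after naiz in target.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'naiz \nmum\n' > target
printf 'naiz\nmum\ntom\n' > lookup

# Whole-line comparison: "naiz " and "naiz" do not match, so naiz is reported.
with_record=$(awk 'FILENAME=="target" {arr[$0]++}
     FILENAME=="lookup" && !($0 in arr)' target lookup)

# First-field comparison: whitespace around the word is ignored.
with_field=$(awk 'FILENAME=="target" {arr[$1]++}
     FILENAME=="lookup" && !($1 in arr)' target lookup)

printf 'using $0:\n%s\n' "$with_record"
printf 'using $1:\n%s\n' "$with_field"
```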

so just to read that back in my head

"if the value in $0 is not in the equivalent array position, print said value"

where the exclamation mark represents the "not" in the "not in" condition?

@#9 : indeed

interesting. how come the former doesn't print on screen first, despite the print $0?

having trouble reading back the red bit. what are we doing here?

It does not print on screen because awk's stdout is redirected to the file "out" (> out).
The red part (the pattern with no action block after it in the second version) makes use of the fact that {print $0} is the default action in awk, so you can leave it out.
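A quick sketch to check both points, again with the sample files from the first post (the variable names are only for the demo): the version with an explicit {print $0} and the version with the bare pattern should give identical output, and once > out is added nothing is left to show on the terminal.

```shell
#!/bin/bash
# Sketch only: recreate the sample files in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' tom ronnie sam naiz mum kash > lookup
printf '%s\n' naiz mum kash > target

# Explicit action.
explicit=$(awk 'FILENAME=="target" {arr[$1]++}
     FILENAME=="lookup" && !($1 in arr) {print $0}' target lookup)

# Pattern only: awk falls back to its default action, which prints the line.
implicit=$(awk 'FILENAME=="target" {arr[$1]++}
     FILENAME=="lookup" && !($1 in arr)' target lookup)

# With the shell redirect, the output lands in the file and nothing is captured here.
on_screen=$(awk 'FILENAME=="target" {arr[$1]++}
     FILENAME=="lookup" && !($1 in arr)' target lookup > out)

[ "$explicit" = "$implicit" ] && echo "same output"
[ -z "$on_screen" ] && echo "nothing on screen, see file out"
```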

makes sense. enhancing it further to allow user to provide a path for the target file. was pretty simple and got that working.

echo 'enter full path to eodfeed file'
read targetpath
echo $targetpath
awk '
FILENAME=="$targetpath" {arr[$1]++}
FILENAME=="lookup" && !($1 in arr) ' $targetpath lookup > out

now I'd like to let the user provide the path of a file that itself contains multiple paths to target files. the script would then look up against each target file in a loop and save the results to a separate output file (conveniently named after the target file the lookup was carried out against)

so the user would provide the location of a file that contains the following
/opt/ice/server/targetfile1
/opt/ice/server/targetfile2
/opt/ice/server/targetfile3

the script would use "lookup" file against each one of the above and output the results as
out_targetfile1
out_targetfile2
out_targetfile3

before I have a crack at it, I'm trying to figure out where I'd start.
I guess the first thing is to read the file the user has provided and store the 3 locations in an array (this is within the awk), then start a for loop: for each i in arr do <awk program that already works>

on the right tracks?

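The code in #13 will not work IMO, since $targetpath is not expanded by the shell inside the single quotes, so awk compares FILENAME against the literal string "$targetpath".
But I think we can leave targetpath out of the awk program entirely if we use a next statement: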

awk ' FILENAME=="lookup" && !($1 in arr){print; next} {arr[$1]++} ' "$targetpath" lookup
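If you do want to keep the FILENAME test on the target file, one option is to hand the shell variable to awk with -v. This is only a sketch: the awk variable name target and the eodfeed file name are assumptions for the demo, and -v interprets backslash escapes in the value, so paths containing backslashes would need extra care.

```shell
#!/bin/bash
# Sketch only: recreate the sample data, with the target under a user-style path.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' tom ronnie sam naiz mum kash > lookup
printf '%s\n' naiz mum kash > eodfeed
targetpath="$workdir/eodfeed"      # stands in for the value read from the user

# -v copies the shell value into an awk variable, so no shell expansion
# is needed inside the single-quoted program.
result=$(awk -v target="$targetpath" '
FILENAME==target {arr[$1]++}
FILENAME=="lookup" && !($1 in arr)' "$targetpath" lookup)

echo "$result"
```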

Now, for the user-provided file that contains the target paths, I would read its name from the user into targetfile and then just use this code in a while-read loop in the shell:

while read file
do
   if [ -r "$file" ]; then
     awk ' FILENAME=="lookup" && !($1 in arr){print; next} {arr[$1]++} ' "$file" lookup > "$file.out"
   fi
done < "$targetfile"
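Note that the loop above writes each result next to its target as <target>.out. If you want the out_targetfile1 style names asked for earlier, a variation along these lines might do it. It is only a sketch with made-up test data: the server directory and the targetlist file are hypothetical stand-ins for the real paths, and the results are written to the current directory using basename.

```shell
#!/bin/bash
# Sketch only: build a lookup file, two hypothetical targets and a list of their paths.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' tom ronnie sam naiz mum kash > lookup
mkdir server
printf '%s\n' naiz mum kash > server/targetfile1
printf '%s\n' tom ronnie > server/targetfile2
printf '%s\n' "$workdir/server/targetfile1" "$workdir/server/targetfile2" > targetlist

# IFS= and -r keep each path exactly as written in the list.
while IFS= read -r file
do
   if [ -r "$file" ]; then
     # basename strips the directory part, giving out_targetfile1 etc.
     awk 'FILENAME=="lookup" && !($1 in arr){print; next} {arr[$1]++}' \
         "$file" lookup > "out_$(basename "$file")"
   fi
done < targetlist

ls out_*
```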