Awk/bash one-liner replacement for an if condition

ctrld · December 2, 2019, 7:18am

Hi.

I wrote this small bash script. I want to compare the second column of file1 with the second column of file2 on the line where a pattern matches. The files are small, and I am sure the pattern occurs only once. I think this can be rewritten as an awk one-liner. I would appreciate it if someone could give me an idea. The whole NR/FNR thing confuses me :o

#!/bin/bash
var1=$(awk '/pattern/ { print $2 }' file1)
var2=$(awk '/pattern/ { print $2 }' file2)
if [[ "${var1}" != "${var2}" ]]; then
    echo "Second column is not matching"
fi

RudiC · December 2, 2019, 7:52am

If "I am sure that pattern occurs only once" (per file, I presume), how about

if awk '/pattern/ {T[$2]++}; END {for (t in T) if (T[t] > 1) exit 1}' file[12]
  then echo "Second column is not matching"
fi
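
To see what the exit status looks like in each case, here is a minimal sketch. The sample files and their contents are made up for illustration; only the awk program comes from the post above. awk counts every second-column value across both files, so a count above 1 means the same value appeared in both.

```shell
#!/bin/bash
# Hypothetical sample data: "pattern" occurs exactly once per file.
tmp=$(mktemp -d)
printf 'pattern 10\nother 1\n' > "$tmp/file1"
printf 'pattern 10\nother 2\n' > "$tmp/file2"   # same second column as file1
printf 'pattern 99\nother 3\n' > "$tmp/file3"   # different second column

# Exits 1 if some value was counted more than once, 0 otherwise.
check() {
    awk '/pattern/ {T[$2]++}; END {for (t in T) if (T[t] > 1) exit 1}' "$1" "$2"
}

check "$tmp/file1" "$tmp/file2"; same=$?
check "$tmp/file1" "$tmp/file3"; diff_status=$?

echo "equal columns:     awk exit status $same"
echo "different columns: awk exit status $diff_status"
rm -rf "$tmp"
```

With equal columns awk exits 1, so the then branch is skipped. With different columns it exits 0 and the then branch runs.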

ctrld · December 2, 2019, 8:52am

Awesome and super quick!

I just made a small adjustment so it looks nicer. Please correct me if I am wrong:

awk '/pattern/ {T[$2]++}; END {for (t in T) if (T[t] > 1){print "Second column is not matching"}}' file[12]
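
Note that T[t] > 1 is true when the same value was counted twice, so the variant above prints its message when the second columns are equal. A possible sketch that prints only on a mismatch, using the NR == FNR idiom from the original question: NR counts records across all input files while FNR restarts at 1 for each file, so NR == FNR holds only while the first file is being read (assuming it is not empty). The sample files and the variable name first are assumptions for illustration.

```shell
#!/bin/bash
# Hypothetical sample data.
tmp=$(mktemp -d)
printf 'pattern 10\n' > "$tmp/file1"
printf 'pattern 99\n' > "$tmp/file2"

compare() {
    # First file (NR == FNR): remember column 2.
    # Second file: print the message if column 2 differs.
    awk '/pattern/ {
        if (NR == FNR) { first = $2 }
        else if ($2 != first) { print "Second column is not matching" }
    }' "$1" "$2"
}

out_diff=$(compare "$tmp/file1" "$tmp/file2")   # different values
out_same=$(compare "$tmp/file1" "$tmp/file1")   # identical values

echo "different: [$out_diff]"
echo "same:      [$out_same]"
rm -rf "$tmp"
```

One caveat: awk compares fields that look numeric as numbers, so 10 and 10.0 would count as equal here, whereas the [[ ... != ... ]] test in the original script compares strings.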

rbatte1 · December 2, 2019, 8:56am

If you are hitting memory limits, you could almost go very old-school:-

awk '/pattern/ { print $2 }' file1 | sort > $workdir/file1_col2
awk '/pattern/ { print $2 }' file2 | sort > $workdir/file2_col2
diff $workdir/file1_col2 $workdir/file2_col2

It is probably slow because of the disk access, and if the files are too big, diff may also struggle.

Not the best, but a clunky way to be sure it will actually work. Perhaps this will give you a good basis to see what output from a better (more efficient) process should be.

Robin
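
If bash is available, the same idea can probably be written without the intermediate files by using process substitution. This is only a sketch with made-up sample files; it relies on diff exiting 0 when its inputs are identical and 1 when they differ.

```shell
#!/bin/bash
# Hypothetical sample data.
tmp=$(mktemp -d)
printf 'pattern 10\n' > "$tmp/file1"
printf 'pattern 99\n' > "$tmp/file2"

# <( ... ) lets diff read each pipeline's output as if it were a file.
if diff <(awk '/pattern/ { print $2 }' "$tmp/file1" | sort) \
        <(awk '/pattern/ { print $2 }' "$tmp/file2" | sort) > /dev/null
then result="Second column is matching"
else result="Second column is not matching"
fi

echo "$result"
rm -rf "$tmp"
```

The else branch would also run if diff itself failed (exit status above 1), so a stricter script might want to check the status explicitly.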

ctrld · December 2, 2019, 9:12am

I am extremely sorry! I should have mentioned earlier that I want the reverse of what this code is doing now:

if awk '/pattern/ {T[$2]++}; END {for (t in T) if (T[t] > 1) exit 1}' file[12]
  then echo "Second column is not matching"
fi

At present it prints when the columns are matching; I need it to print when the columns are NOT matching.

RudiC · December 2, 2019, 9:21am

ctrld:

Awesome and super quick!

I just made a small adjustment so it looks nicer. Please correct me if I am wrong:
awk '/pattern/ {T[$2]++}; END {for (t in T) if (T[t] > 1){print "Second column is not matching"}}' file[12]

Fine if that suits you. I was thinking you might want to perform further shell commands in the if branch...

ctrld:

I am extremely sorry! I should have mentioned earlier that I want the reverse of what this code is doing now:
if awk '/pattern/ {T[$2]++}; END {for (t in T) if (T[t] > 1) exit 1}' file[12]
  then echo "Second column is not matching"
fi
 
At present it prints when the columns are matching; I need it to print when the columns are NOT matching.

Well, try "if ! awk ..." then. Or use the else branch...
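
Both options might look like the sketch below. The sample files and the function names report and negated are made up for illustration; the awk program is the one from earlier in the thread, which exits 1 when a value occurs in both files and 0 otherwise. Which branch should carry which message depends on that convention, so it is worth testing against files whose contents are known.

```shell
#!/bin/bash
# Hypothetical sample data.
tmp=$(mktemp -d)
printf 'pattern 10\n' > "$tmp/file1"
printf 'pattern 10\n' > "$tmp/file2"   # same value as file1
printf 'pattern 99\n' > "$tmp/file3"   # different value

# Option 1: handle both outcomes with an else branch.
report() {
    if awk '/pattern/ {T[$2]++}; END {for (t in T) if (T[t] > 1) exit 1}' "$1" "$2"
    then echo "Second column is not matching"   # exit 0: no value counted twice
    else echo "Second column is matching"       # exit 1: a value occurred in both files
    fi
}

# Option 2: "!" inverts the exit status, so the then branch runs on a non-zero exit.
negated() {
    if ! awk '/pattern/ {T[$2]++}; END {for (t in T) if (T[t] > 1) exit 1}' "$1" "$2"
    then echo "awk exited non-zero"
    fi
}

res_same=$(report "$tmp/file1" "$tmp/file2")
res_diff=$(report "$tmp/file1" "$tmp/file3")
neg_same=$(negated "$tmp/file1" "$tmp/file2")
neg_diff=$(negated "$tmp/file1" "$tmp/file3")

echo "$res_same"
echo "$res_diff"
rm -rf "$tmp"
```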

ctrld · December 2, 2019, 10:45am

Thank you for confirming