Hi.
I need to filter lines based on matches in multiple tab-separated columns. For each group of lines that share the same value in column 1, check the corresponding values in column 4. If all column 4 entries in the group are identical, discard every line in the group. If even one column 4 entry differs, keep every line in the group.
How can I modify the following awk script to compare the 4th column and not the 2nd column:
FNR==NR {
    array[$0]++
    next
}
{
    counter = 0
    for (i in array) {
        split(i, holder, FS)
        if (holder[1] == $4) {
            counter++
        }
    }
    if (counter >= 2) {
        print
    }
}
$ awk -f script.awk file.txt{,}
The input data is the following:
DOG A B BIG
DOG C D BIG
DOG E F BIG
CAT G H SMALL
CAT I J SMALL
CAT K L BIG
CAT M N SMALL
The desired output is the following:
CAT G H SMALL
CAT I J SMALL
CAT K L BIG
CAT M N SMALL
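
One possible way to express the rule above is a two-pass awk script. This is only a sketch, assuming the fields really are tab-separated and that the file can be read twice, as in the `file.txt{,}` invocation. The function name `filter_groups` and the array names `first` and `mixed` are made up for illustration. The first pass records the first column 4 value seen for each column 1 key and flags the key once a different value shows up. The second pass prints only the lines whose key was flagged.

```shell
# Hypothetical helper: keep every line of a column-1 group whose
# column-4 values are not all identical. Reads the file twice.
filter_groups() {
    awk -F'\t' '
        # Pass 1 (first read of the file): remember the first column-4
        # value per column-1 key; flag the key when a different value appears.
        FNR == NR {
            if (!($1 in first))
                first[$1] = $4
            else if (first[$1] != $4)
                mixed[$1] = 1
            next
        }
        # Pass 2 (second read): print lines whose key was flagged.
        $1 in mixed
    ' "$1" "$1"
}
```

It would be called as `filter_groups file.txt`. Unlike the script in the question, it does not loop over every stored line for each input line, so the work grows linearly with the file size.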