The validation logic needs to consider the 1st and 2nd columns:
if the 1st column values are the same but the 2nd column values differ,
then the 1st column value needs to be picked up;
records in which both the first and the second column match need to be dropped.
While reading the input, build an associative array named u (unique) keyed by $1. The values are built/chosen based on the following expression:
k[$1,$2]++ ? x : 1
If the value of the auto-incremented associative array k, built in passing with $1 SUBSEP $2 as keys, is different from 0 (i.e. true in boolean context), meaning the pair has already been seen (remember Ed Morton's !arr[val]++?; ++ is a post-increment, so the test sees the count before the current record is added),
then return and assign the value of the variable x (never set, so auto-initialized -> null -> 0 in numeric context -> false in boolean context; if I had written 0, it would have been clearer :)), otherwise return and assign the value 1 (the opposite of the previous).
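To see the post-increment test in isolation, here is a minimal sketch. The function name mark_seen and the sample records are made up for illustration; only the k[$1,$2]++ test comes from the expression above.

```shell
# Hypothetical helper: label each record by whether its ($1,$2) pair was seen before.
mark_seen() {
    awk '{
        # k[$1,$2]++ yields the count *before* incrementing:
        # 0 (false) on the first occurrence, non-zero (true) afterwards.
        print $1, $2, (k[$1,$2]++ ? "seen" : "new")
    }'
}

printf 'a 1\na 1\na 2\n' | mark_seen
# a 1 new
# a 1 seen
# a 2 new
```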
END {
for (_ in u) if (u[_])
print _
}
After reading all the input, print only those u keys whose values are true when evaluated in boolean context (i.e. equal to 1).
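Putting the pieces together, the whole program might look like the sketch below. The main rule (u[$1] = ...) is reconstructed from the description above, so treat it as an assumption; the wrapper name pick_keys and the sample input are hypothetical as well.

```shell
# Hypothetical wrapper around the awk program described above.
pick_keys() {
    awk '
    # Main rule (assumed): the last record read for a given $1 decides its flag.
    # A pair ($1,$2) that was already seen stores x (unset, hence false), a new pair stores 1.
    { u[$1] = k[$1,$2]++ ? x : 1 }
    END {
        # Print only the keys whose flag is true.
        for (_ in u) if (u[_])
            print _
    }'
}

printf 'aaa 123\naaa 234\nbbb 111\nbbb 111\n' | pick_keys
# aaa
```

Note that the assignment overwrites u[$1] on every record, so with this sketch the result for a key depends on the last record read for it, and the order in which for (_ in u) visits several keys is not specified by awk.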
Hi,
thanks for the response.
Actually, as in my message, if there are 3 records like
aaa 123 233
aaa 234 222
aaa 242 222
then only ONE aaa
needs to be printed,
but the output is showing all 3 values.
Actually, my input file contains nearly 10 fields, each separated by a pipe symbol.
Will this solution work for that (by replacing k[$1,$2]++ with all the fields, like $3...), or do I have to use another approach?
I have to consider only the first 2 fields for validation; the remaining fields I can leave as they are.
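For pipe-separated input, a sketch along the same lines would set the field separator with -F'|' and leave the key as $1,$2, since only the first two fields take part in the validation. The wrapper name pick_keys_pipe is hypothetical, the main rule is the same assumed reconstruction of the program discussed above, and the sample is the three records from this post rewritten with pipes.

```shell
# Hypothetical wrapper: same logic, but fields are split on the pipe symbol.
pick_keys_pipe() {
    awk -F'|' '
    # Only $1 and $2 form the key; fields $3 and beyond are ignored.
    { u[$1] = k[$1,$2]++ ? x : 1 }
    END { for (_ in u) if (u[_]) print _ }'
}

printf 'aaa|123|233\naaa|234|222\naaa|242|222\n' | pick_keys_pipe
# aaa
```

One possible cause of the three output lines, which cannot be confirmed from the post alone: if the real file is pipe-delimited and awk runs with the default whitespace separator, each whole line becomes $1, so every line counts as a distinct key.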