Remove duplicate lines using awk

Hi,
I came to know that using

awk '!x[$0]++'

removes the duplicate lines. Can anyone please explain the above syntax? I want to understand how this awk expression removes the duplicates.

Thanks in advance,
sudvishw :confused:

x is an associative array whose elements start out uninitialized, which awk treats as 0. The index of x is $0, the whole line. The first time a given line is seen, x[$0] is 0. Because ++ here is the postfix form, x[$0]++ returns the old value, 0, and only then increments x[$0] to 1. So !x[$0]++ is true and $0 is printed by the default action. If the same line appears again, x[$0] is already nonzero, so !x[$0]++ is false and $0 is not printed.

I am sorry, I cannot understand. It would be great if you could explain with an example. Usually we sort and then pick the unique records. Is there any sorting built into this awk command? :wall:

Sorting is not necessary. All it does is create an (associative) array element with the entire line as the index, initially without a value (or 0, if you will). The exclamation mark negates that value, so the outcome is 1 (true). A pattern that evaluates to true, with no action given, makes awk perform the default action, which is {print $0}, so the entire line gets printed.

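Afterwards the ++ takes effect and 1 is added to the array value, which now becomes 1. The next time the same line is encountered, the value returned by the array is 1, which the exclamation mark negates to 0, so nothing gets printed. The counter keeps growing on every further occurrence (2, 3, and so on), and any nonzero value still negates to 0, so later repeats are suppressed as well.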
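To make the explanation above concrete, here is a minimal sketch that runs the one-liner next to a spelled-out version of the same logic. The sample input and the shell variable names (input, short, long) are only illustrative assumptions, not something from the original posts.

```shell
# Illustrative sample input; any text with repeated lines would do.
input=$(printf 'hi\nhi\nhii\nhi\n')

# The one-liner from the question.
short=$(printf '%s\n' "$input" | awk '!x[$0]++')

# The same logic written out step by step:
# print the line only if its counter is still 0, then bump the counter.
long=$(printf '%s\n' "$input" | awk '{ if (x[$0] == 0) print $0; x[$0] = x[$0] + 1 }')

printf '%s\n' "$short"
printf '%s\n' "$long"
```

Both commands should print the same lines, in their original order, with no sorting involved.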

Hi,
Thanks for your explanation. If I understand it right, suppose we have a file as shown below:

hi
hi
hii
hi

Here hi occurs 3 times.

The first time hi is read, x[hi] is initialized to 0, which is negated to 1, and the line is printed. The second time, x[hi] is 1, which is negated to 0, and the line is not printed.

If I am not wrong, the third time x[hi] will be 0 again, which is negated to 1, so the line should be printed. I think there is something I am missing here. Please clarify.
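One way to check this is to print the counter for every input line. The sketch below is only a diagnostic aid under the assumption that the file contains the four lines shown above; the awk variable name old and the labels in the output are made up for illustration.

```shell
# Trace the counter for the four sample lines from the post above.
trace=$(printf 'hi\nhi\nhii\nhi\n' | awk '{
    old = x[$0]++    # postfix ++ yields the value before incrementing
    printf "%s old=%d new=%d %s\n", $0, old, x[$0], (old ? "skipped" : "printed")
}')
printf '%s\n' "$trace"
```

For hi the old value goes 0, 1, 2 rather than 0, 1, 0: the counter is never reset, so the third hi sees 2, which negates to 0, and the line is skipped.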

Got it. Thanks!!!

I am getting an error while using this:

awk -F',' '!x[$1$2$3]++' UnixEg.dat
x[$1$2$3]++': Event not found

I found the solution.
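The post does not say which fix was used. For anyone hitting the same message: "Event not found" is what csh and tcsh report when history expansion trips over an unescaped !, which those shells expand even inside single quotes. Escaping it as \! is the usual workaround there. A sketch that sidesteps the problem in any shell is to write the test without ! at all. The sample data below is a hypothetical stand-in for UnixEg.dat, and the comma-separated subscript is a suggested variation, not the original command.

```shell
# Hypothetical stand-in for UnixEg.dat: comma-separated, key in fields 1-3.
data=$(printf 'a,b,c,1\na,b,c,2\nab,,c,3\nd,e,f,4\n')

# "x[...]++ == 0" is true only for the first occurrence of a key, just like
# "!x[...]++", but contains no "!" for csh/tcsh history expansion to grab.
# Using $1,$2,$3 (joined with SUBSEP) instead of $1$2$3 keeps keys such as
# "a","b","c" and "ab","","c" distinct.
unique=$(printf '%s\n' "$data" | awk -F',' 'x[$1,$2,$3]++ == 0')
printf '%s\n' "$unique"
```

With plain concatenation ($1$2$3) the third sample row would be treated as a duplicate of the first, because both keys collapse to abc.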