Help removing lines with duplicated columns

Hi Guys...

Could you please help me with the following?

aaaa bbbb cccc sdsd
aaaa bbbb cccc qwer

As you can see, the two lines match in three fields.
How can I delete this duplicate? I mean, how can I delete the second line if its first three fields duplicate those of an earlier line?

Thanks

i) Use useful topics!
ii)

awk -F" " '! something[$3]++' inputfile

Could you explain this command to me?

And what do you mean by "Use useful topics"?

Regards

You've named this thread - your question - simply "help". It would be much better to use something like "removing duplicate lines".

And well, I see you've finally used the board search http://www.unix.com/shell-programming-scripting/62574-finding-duplicates-columns-removing-lines.html#post302196106 .

Let's put it together:
i) you've used a meaningless topic
ii) you've posted your problem three times
iii) you've already found a thread in which your problem is answered and explained

Do you really believe this will motivate me to explain this to you? I don't.

Dear fabtagon.....

I posted this topic in 3 different places to make sure that my problem would be handled by someone who knows Unix well enough. As I expected, someone like you came along and gave a solution that makes no sense at all. That's why I had to post it in different places. GOT IT!!

Please, if you have a straight answer, be my guest; otherwise, go and practice some Unix commands.
Cappich ???

Warm regards Bozo

I went back and checked the problem stated here http://www.unix.com/shell-programmin...#post302196106 and, believe me, it's different.

I have 3 matched fields out of 4, NOT one out of 4.
Here is another example.

SSSSS DDDDDDD 10:10:00 15:22:22
XXXXX AAAAAAA 00:00:11 00:02:11
XXXXX AAAAAAA 00:00:11 06:02:10
EEEEEE VVVVVVV 04:12:00 01:10:02
EEEEEE VVVVVVV 04:12:00 05:12:00
SSSSS DDDDDDD 10:10:00 13:23:21
EEEEEE FFFFFFFF 20:20:20 24:00:00

I want the output to be like

XXXXX AAAAAAA 00:00:11 06:02:10
EEEEEE VVVVVVV 04:12:00 05:12:00
SSSSS DDDDDDD 10:10:00 13:23:21
EEEEEE FFFFFFFF 20:20:20 24:00:00

If 2 lines are matched by 3 fields, I want to delete the first.

Thanks...

If you reverse the file, the same solution can be used.

tac file |
awk '!a[$1 $2 $3]++'

This uses the first three fields to decide whether it has seen the same data before. Because the file is reversed first, the line that survives is the last one in the original file.
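A side note on the key: `$1 $2 $3` concatenates the three fields with nothing between them, so different field values can occasionally build the same key. The sketch below shows the effect and one common way around it, comma-separated subscripts, which awk joins with its SUBSEP character. It assumes a POSIX-style awk (nawk, gawk or mawk); the sample lines and the function names `concat_key` and `subsep_key` are made up for illustration and are not from the thread.

```shell
# Hypothetical sample: the first three fields differ between the two lines,
# but both concatenate to the same string "abcd".
sample=$(mktemp)
printf '%s\n' 'ab c d first' 'a bc d second' > "$sample"

# Plain concatenation: both lines yield the key "abcd",
# so the second line is wrongly treated as a duplicate.
concat_key() { awk '!a[$1 $2 $3]++' "$1"; }

# Comma-separated subscripts: awk inserts SUBSEP between the parts,
# so the two keys stay distinct and both lines are printed.
subsep_key() { awk '!a[$1,$2,$3]++' "$1"; }

concat_key "$sample"   # prints only "ab c d first"
subsep_key "$sample"   # prints both lines
```

For the data posted in this thread the two forms give the same result; the comma form just avoids surprises on other input.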

If you don't have the tac command, maybe you can sort the input before feeding it to awk.

There are certainly ways to make awk print the last instead of the first line; you can search the forums for a plethora of examples of this.
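One possible awk-only variant, for systems without tac: remember every line, note the line number at which each key was last seen, and print in the END block only the lines that are the last of their key. This is a sketch rather than the one true way; it assumes a POSIX-style awk (nawk, gawk or mawk), it holds the whole file in memory, and the function name `keep_last` and the temporary file are made up for illustration. The sample data is the example posted above.

```shell
# Sample data copied from the example earlier in the thread.
sample=$(mktemp)
cat > "$sample" <<'EOF'
SSSSS DDDDDDD 10:10:00 15:22:22
XXXXX AAAAAAA 00:00:11 00:02:11
XXXXX AAAAAAA 00:00:11 06:02:10
EEEEEE VVVVVVV 04:12:00 01:10:02
EEEEEE VVVVVVV 04:12:00 05:12:00
SSSSS DDDDDDD 10:10:00 13:23:21
EEEEEE FFFFFFFF 20:20:20 24:00:00
EOF

# keep_last FILE: for each combination of fields 1-3, print only the
# last line that carries it, in the original file order.
keep_last() {
  awk '
    {
      key = $1 SUBSEP $2 SUBSEP $3   # key built from the first three fields
      line[NR] = $0                  # remember every line by its number
      keyof[NR] = key                # and which key it belongs to
      last[key] = NR                 # later lines overwrite earlier ones
      n = NR                         # total number of lines read so far
    }
    END {
      for (i = 1; i <= n; i++)
        if (last[keyof[i]] == i)     # line i is the last one for its key
          print line[i]
    }
  ' "$1"
}

keep_last "$sample"
```

On the sample data this prints the four lines requested above, in the same order.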

Dear era,

I tried to use the tac command, but Unix didn't recognize it at all. Also, when I used the awk command alone, it gave an error ("Bailing out").
Could you tell me about the "a" in the awk command? What does it stand for?

Thanks so much for your kind help.

Have a look at the forum FAQ, "Simple rules of the UNIX.COM forums". Duplicating and crossposting is strongly discouraged.

If you aren't able to understand the above command line even after a quite similar one has been explained in detail in another thread, maybe you should really start practising shell programming (which consists of reading man pages and online resources) instead of demanding a solution from someone who is kind enough to sacrifice his free time for you.

Dear fabtagon,

Read the last example and you will find that it's not a duplicate; it's another question, NOT THE SAME AS THE ONE WHOSE ANSWER YOU COPIED AND PASTED INTO MINE. Look again if you are interested; otherwise, you have my best regards.

There are many different variants of awk. If your awk does not understand that script, see if you can find nawk or mawk or gawk instead. On some systems (Sun, HP-UX) you might be able to find a "XPG4" version of awk which is more modern than the bare-bones "old awk".
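A quick way to see which of these is installed is to probe for each name in turn. The sketch below is only a suggestion: the function name `find_awk` is made up, and `/usr/xpg4/bin/awk` is the usual Solaris location for the XPG4 awk, which may be somewhere else or missing on your system. Plain `awk` is tried last because it may be the old version.

```shell
# find_awk: print the first awk variant found, preferring the newer ones.
find_awk() {
  for candidate in nawk gawk mawk /usr/xpg4/bin/awk awk; do
    if command -v "$candidate" >/dev/null 2>&1; then
      echo "$candidate"
      return 0
    fi
  done
  return 1   # nothing found at all
}

AWK=$(find_awk) || { echo "no awk found" >&2; exit 1; }

# Try the one-liner from this thread with whatever was found.
printf '%s\n' 'aaaa bbbb cccc sdsd' 'aaaa bbbb cccc qwer' |
  "$AWK" '!a[$1 $2 $3]++'
```

If the last command prints only the first of the two lines, that awk understands the script.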

The name of awk comes from the family names of its creators Alfred Aho, Peter Weinberger, and Brian Kernighan.

If you are unable to abide by the forum rules in spite of several remarks by forum users, perhaps these forums are not for you.

I meant the "a" in the command you wrote (awk '!a[$1 $2 $3]++'), because it was not clear enough for me. I'm new to awk and I needed a quick solution.

And I do abide by the forum rules; see for yourself above. I dare you to find a thread similar to this one, or even close to it.

Nevertheless, thanks for your help,

The forums' own search tool stupidly treats "awk" as a stop word, so I took a detour via Google.

site:unix.com awk duplicate - Google Search

a is just the name of a variable; if the associative array already contains a value for the given key, we have already seen that key before, and suppress printing. (The default if no action is given is to print anything matching the condition.)
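To make that concrete, here is the same one-liner next to a spelled-out version that should behave the same way. It is meant as a reading aid only; it assumes nawk, gawk or mawk, and the names `dedupe_short`, `dedupe_long` and `seen` are made up for illustration. The sample lines are the two from the first post.

```shell
sample=$(mktemp)
printf '%s\n' 'aaaa bbbb cccc sdsd' 'aaaa bbbb cccc qwer' > "$sample"

# Short form: a[key]++ returns the old count and then increments it.
# The old count is 0 the first time, and !0 is true, so the line prints.
dedupe_short() { awk '!a[$1 $2 $3]++' "$1"; }

# Long form of the same logic, one step at a time.
dedupe_long() {
  awk '{
    key = $1 $2 $3          # key: the first three fields glued together
    if (seen[key] == 0)     # an unset array element compares equal to 0
      print                 # first time this key shows up: print the line
    seen[key]++             # count it, so later lines with this key are skipped
  }' "$1"
}

dedupe_short "$sample"   # prints "aaaa bbbb cccc sdsd"
dedupe_long "$sample"    # prints "aaaa bbbb cccc sdsd"
```

The array can have any name; `a`, `seen` and `something` all work the same way.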

I used nawk and it worked.
Thanks for your kind help, I really appreciate it.