Hello,
I need help writing the duplicate lines found within each section out to another file. Here is what I'm struggling with:
Using this file "data.txt":
ABC1 012345 header
ABC2 7890-000
ABC3 012345 Header Table
ABC4
ABC5 593.0000 587.4800
ABC5 593.5000 587.6580 <= dup need to remove
ABC5 593.5000 587.6580
ABC5 594.0000 588.0971
ABC5 594.5000 588.5361
ABC1 67890 header
ABC2 1234-0001
ABC3 67890 Header Table
ABC4
ABC5 594.5000 588.5361 <= to keep in this section
ABC5 601.0000 594.1603
ABC5 601.5000 594.6121
ABC5 602.0000 595.0642
ABC5 602.0000 595.0642 <= dup need to remove
Duplicates need to be placed in "data.dup":
ABC1 012345 header
ABC5 593.5000 587.6580
ABC1 67890 header
ABC5 602.0000 595.0642
I've been using the code below to find the duplicates, but I haven't found a way to write them to "data.dup" grouped under the section each duplicate was found in.
# This will remove the duplicate lines and place the unique lines in "data.out"
awk '$1 == "ABC1" { delete arr }   # reset the seen-lines array at each section header
     !arr[$0]++ { print }' data.txt > data.out
# This will extract the duplicates into the "data.dup" file. This works, but I need it
# to be by section, with a section header
awk '/ABC1/{ ABC1 = $2 } ++arr[ABC1,$0] > 1' data.txt > data.dup
I am using Bourne Shell (/bin/sh)
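For reference, here is a sketch of what I think the combined one-pass logic should do: reset the seen-lines array at each ABC1 line, write first occurrences to "data.out", and write repeats to "data.dup" with that section's ABC1 header printed once before its first duplicate. This is only a sketch against the sample data above, using awk's own output redirection; the portable `for (k in seen) delete seen[k]` loop is used in place of the non-POSIX whole-array `delete seen`.

```shell
# Recreate the sample input from the post (small excerpt, same structure).
cat > data.txt <<'EOF'
ABC1 012345 header
ABC2 7890-000
ABC3 012345 Header Table
ABC4
ABC5 593.0000 587.4800
ABC5 593.5000 587.6580
ABC5 593.5000 587.6580
ABC5 594.0000 588.0971
ABC5 594.5000 588.5361
ABC1 67890 header
ABC2 1234-0001
ABC3 67890 Header Table
ABC4
ABC5 594.5000 588.5361
ABC5 601.0000 594.1603
ABC5 601.5000 594.6121
ABC5 602.0000 595.0642
ABC5 602.0000 595.0642
EOF

# One pass: unique lines to data.out, duplicates (preceded once per section
# by that section's ABC1 header line) to data.dup.
awk '
$1 == "ABC1" {                        # new section: remember its header line,
    header = $0                       # clear the per-section seen array, and
    for (k in seen) delete seen[k]    # note the header is not yet in data.dup
    printed = 0
}
seen[$0]++ {                          # line already seen in this section
    if (!printed) { print header > "data.dup"; printed = 1 }
    print > "data.dup"
    next
}
{ print > "data.out" }                # first occurrence in this section: keep
' data.txt

cat data.dup
```

On the sample above this leaves each section's first "ABC5 594.5000 588.5361" in data.out (the array reset keeps sections independent) and puts only the true within-section repeats in data.dup, each under its ABC1 line.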
Any help would be GREATLY appreciated.