Hello,
I need help writing the duplicate lines found within each section out to another file. Here is what I'm struggling with:
Using this file "data.txt":
ABC1 012345 header
ABC2 7890-000
ABC3 012345 Header Table
ABC4
ABC5 593.0000 587.4800
ABC5 593.5000 587.6580 <= dup need to remove
ABC5 593.5000 587.6580
ABC5 594.0000 588.0971
ABC5 594.5000 588.5361
ABC1 67890 header
ABC2 1234-0001
ABC3 67890 Header Table
ABC4
ABC5 594.5000 588.5361 <= to keep in this section
ABC5 601.0000 594.1603
ABC5 601.5000 594.6121
ABC5 602.0000 595.0642
ABC5 602.0000 595.0642 <= dup need to remove
Duplicates need to be placed in "data.dup":
ABC1 012345 header
ABC5 593.5000 587.6580
ABC1 67890 header
ABC5 602.0000 595.0642
I've been using the code below to find the duplicates, but I haven't found a way to write them to "data.dup" grouped under the section each duplicate was found in.
# This will remove the duplicate lines and place the unique lines in "data.out"
awk '$1 == "ABC1" { delete arr }   # reset the seen-lines array at each section header
     !arr[$0]++ { print }' data.txt > data.out
# This will extract the duplicates into the "data.dup" file. This works, but I need it
# to be by section, with a section header
awk '/ABC1/{ ABC1 = $2 } ++arr[ABC1,$0] > 1' data.txt > data.dup
I am using Bourne Shell (/bin/sh)
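For reference, here is a sketch of what I think the combined one-pass logic should do: reset the seen-lines array at each ABC1 line, write first occurrences to "data.out", and write repeats to "data.dup" with that section's ABC1 header printed once before its first duplicate. This is only a sketch against the sample data above, using awk's own output redirection; the portable `for (k in seen) delete seen[k]` loop is used in place of the non-POSIX whole-array `delete seen`.

```shell
# Recreate the sample input from the post (small excerpt, same structure).
cat > data.txt <<'EOF'
ABC1 012345 header
ABC2 7890-000
ABC3 012345 Header Table
ABC4
ABC5 593.0000 587.4800
ABC5 593.5000 587.6580
ABC5 593.5000 587.6580
ABC5 594.0000 588.0971
ABC5 594.5000 588.5361
ABC1 67890 header
ABC2 1234-0001
ABC3 67890 Header Table
ABC4
ABC5 594.5000 588.5361
ABC5 601.0000 594.1603
ABC5 601.5000 594.6121
ABC5 602.0000 595.0642
ABC5 602.0000 595.0642
EOF

# One pass: unique lines to data.out, duplicates (preceded once per section
# by that section's ABC1 header line) to data.dup.
awk '
$1 == "ABC1" {                        # new section: remember its header line,
    header = $0                       # clear the per-section seen array, and
    for (k in seen) delete seen[k]    # note the header is not yet in data.dup
    printed = 0
}
seen[$0]++ {                          # line already seen in this section
    if (!printed) { print header > "data.dup"; printed = 1 }
    print > "data.dup"
    next
}
{ print > "data.out" }                # first occurrence in this section: keep
' data.txt

cat data.dup
```

On the sample above this leaves each section's first "ABC5 594.5000 588.5361" in data.out (the array reset keeps sections independent) and puts only the true within-section repeats in data.dup, each under its ABC1 line.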
Any help would be GREATLY appreciated.