I have a requirement where I need to remove duplicates from a fixed-width file that has multiple key columns. I also need to capture the duplicate records into another file.
File has 8 columns.
Key columns are col1 and col2.
Col1 has a length of 8 and col2 has a length of 3.
Please give a sample input file (showing field contents and separators), and provide the outputs that you expect to get from that input. Please use code tags when you post the input and output files.
Assuming your input file is named Input, the following awk script will create a file named Output containing the records with duplicates removed, and a file named Duplicates containing the duplicate records:
awk -v df=Duplicates -v of=Output '
# The key is the first 11 characters of each record: col1 (8) + col2 (3).
substr($0, 1, 11) in key {
        # Key already seen: write this duplicate record and skip to the next line.
        print > df
        next
}
{
        # First occurrence: referencing key[...] creates the array entry.
        key[substr($0, 1, 11)]
        print > of
}' Input
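Since you have not yet posted a sample input, here is a quick way to try the script with made-up data (the record contents below are hypothetical; only the key layout — 8 characters for col1 followed by 3 for col2 — matches your description):

```shell
# Create a small hypothetical fixed-width input file.
# Records 1 and 3 share the same key (col1="AAAAAAAA", col2="001").
printf '%s\n' \
    'AAAAAAAA001rest-of-record-1' \
    'BBBBBBBB002rest-of-record-2' \
    'AAAAAAAA001rest-of-record-3' > Input

# Run the de-duplication script.
awk -v df=Duplicates -v of=Output '
substr($0, 1, 11) in key {
        print > df
        next
}
{
        key[substr($0, 1, 11)]
        print > of
}' Input

cat Output      # first occurrence of each key
cat Duplicates  # later occurrences of an already-seen key
```

With this input, Output should hold the first and second records and Duplicates should hold the third, since its first 11 characters repeat a key already seen.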