Dear All,
I have multiple files, each containing a number of records with more than 10 columns. Some column values are duplicated, and I want to remove the lines containing these duplicate values from the files.
Duplicate values may occur across different files. All files are in a single directory.
I need help removing the lines that contain duplicate values and storing them in separate files with the same file name plus a .dup extension.
Sample files
Input_file_001.txt
AAAAAC01 0397fa AB2010120211200500000000200009904136515 099999999999 IUVSN11 MOB
AAAAAA01 03981d AB2010120211130100000007430009588004780 888888888888888 GGGCZ11 MOB 76457499048 3122
BBBBBBB01 03982f AB2010120211203400000000150009588000696 909090909090909 KKKKKG11 MOB 64325984725 4107
AAAAAC01 0396fa AB2010120211200500000000200009904136515 099999999999 IUVSN11 MOB ------ contains duplicate value
AAAAAA01 03901d AB2010120211130100000007430009588004780 888888888888888 GGGCZ11 MOB 76457499048 3122 ------ contains duplicate value
Input_file_002.txt
CCCCCCA01 03981d AB2010120211130100000007430009588004780 11111111111118 GGGCZ11 MOB 76457499048 3122
BBBBBBB01 03932f AB2010120211203400000000150009588000696 909090909090909 KKKKKG11 MOB 64325984725 4107 ------ contains duplicate values from the first file
I need output something like this:
Input_file_001.txt
AAAAAC01 0397fa AB2010120211200500000000200009904136515 099999999999 IUVSN11 MOB
AAAAAA01 03981d AB2010120211130100000007430009588004780 888888888888888 GGGCZ11 MOB 76457499048 3122
BBBBBBB01 03982f AB2010120211203400000000150009588000696 909090909090909 KKKKKG11 MOB 64325984725 4107
Input_file_001.txt.dup
AAAAAC01 0396fa AB2010120211200500000000200009904136515 099999999999 IUVSN11 MOB
AAAAAA01 03901d AB2010120211130100000007430009588004780 888888888888888 GGGCZ11 MOB 76457499048 3122
Input_file_002.txt
CCCCCCA01 03981d AB2010120211130100000007430009588004780 11111111111118 GGGCZ11 MOB 76457499048 3122
Input_file_002.txt.dup
BBBBBBB01 03932f AB2010120211203400000000150009588000696 909090909090909 KKKKKG11 MOB 64325984725 4107
Currently I'm using the following command to remove duplicates, but I'm not able to store the duplicate lines in a .dup file:
awk '!x [substr($0,38,93), substr($0,94,141)]++' * > all_files_
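One possible way to get the per-file .dup output is to let awk pick the output file name from FILENAME instead of redirecting everything to a single file. The following is only a sketch under assumptions: the function name split_dups and the .uniq suffix are made up for illustration, and the duplicate key is assumed to be whitespace-separated fields 3 and 4, which happens to reproduce the sample output above. If the real data is fixed-width, the substr() expressions from the command above would take the place of the key.

```shell
#!/bin/sh
# Sketch: split each input file into FILE.uniq (first occurrences) and
# FILE.dup (lines whose key was already seen in this or an earlier file).
# ASSUMPTION: the duplicate key is fields 3 and 4; replace the "key = ..."
# line with substr($0, start, length) expressions for fixed-width records.
# split_dups and the .uniq suffix are hypothetical names for this example.
split_dups() {
    awk '
        FNR == 1 {
            # A new input file starts: close the previous outputs so the
            # number of open files stays small, then pick new names.
            if (uniq != "") { close(uniq); close(dup) }
            uniq = FILENAME ".uniq"
            dup  = FILENAME ".dup"
        }
        {
            key = $3 SUBSEP $4          # assumed key, see note above
            if (seen[key]++) {
                print > dup             # key seen before: duplicate line
            } else {
                print > uniq            # first occurrence: keep it
            }
        }
    ' "$@"
}
```

It could be called as split_dups Input_file_*.txt from inside the directory. Unique lines go to a separate .uniq file rather than back into the input, because awk is still reading the input while it writes; once the results look right, each .uniq file can be moved over its original. A file without any duplicate lines gets no .dup file at all.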