To remove a particular record from File

SatishW · January 6, 2010, 1:50pm

Dear Friends,
Need your help to fix the issue we are facing.
We have a requirement to remove specific records from a comma-separated file and create a new file. The process should read "filter.csv", remove the matching records from the input file, and generate the output file. The matching should not be case sensitive.

Please see the example below.
filter.csv lists Model/Plant combinations. We want to remove every record that contains one of these combinations and write the remaining records to a new file.

filter.csv

TC-P42C1,c231
TC-P50C1,c231
TC-P42X1,C231

Original File - File/Directory: /interfaces/inputfile.csv

TC-P42C1,C131,0,0,0,0,0,0,0,0,0
TC-P42C1,C201,4261,4261,4261,4261,4261,4261,4261,4261,4261
TC-P42C1,C231,0,0,0,0,0,0,0,0,0
TC-P42C1,C271,0,0,0,0,0,0,0,0,0
TC-P42C1,C311,0,0,0,0,0,0,0,0,0
TC-P42C1,C321,0,0,0,0,0,0,0,0,0
TC-P50C1,C201,1450,4072,4072,4072,4072,4072,4072,4072,4072
TC-P50C1,C231,0,0,0,2,2,2,2,2,2
TC-P50C1,C271,0,0,0,0,0,0,0,0,0
TC-P50C1,C341,0,0,0,0,0,0,0,0,0
TC-P50C1,C501,0,0,0,7,7,7,7,7,7
TC-P50C1,C601,0,0,0,0,0,0,0,0,0
TC-P50C1,C941,0,0,0,0,0,0,0,0,0
TC-P42X1,C141,0,0,0,0,0,0,0,0,0
TC-P42X1,C231,0,0,0,7,7,7,7,7,7
TC-P42X1,C611,0,0,0,0,0,0,0,0,0
TC-P42X1,C621,0,0,0,0,0,0,0,0,0
TC-P42X1,C921,300,300,300,300,300,300,300,300,0
SR-MS102,C141,0,0,0,0,0,0,0,0,0
SR-MS182,C231,0,0,0,0,0,0,0,0,0
SR-TEG18,C322,0,0,0,0,0,0,0,0,0
TC-32LX14,C141,0,0,0,0,0,0,0,0,0
TC-32LX14,C921,0,0,0,0,0,0,0,0,0
TC-32LX14,C231,0,0,0,0,0,0,0,0,0

Output File - File/Directory: /interfaces/outputfile.csv

TC-P42C1,C131,0,0,0,0,0,0,0,0,0
TC-P42C1,C201,4261,4261,4261,4261,4261,4261,4261,4261,4261
TC-P42C1,C271,0,0,0,0,0,0,0,0,0
TC-P42C1,C311,0,0,0,0,0,0,0,0,0
TC-P42C1,C321,0,0,0,0,0,0,0,0,0
TC-P50C1,C201,1450,4072,4072,4072,4072,4072,4072,4072,4072
TC-P50C1,C271,0,0,0,0,0,0,0,0,0
TC-P50C1,C341,0,0,0,0,0,0,0,0,0
TC-P50C1,C501,0,0,0,7,7,7,7,7,7
TC-P50C1,C601,0,0,0,0,0,0,0,0,0
TC-P50C1,C941,0,0,0,0,0,0,0,0,0
TC-P42X1,C141,0,0,0,0,0,0,0,0,0
TC-P42X1,C611,0,0,0,0,0,0,0,0,0
TC-P42X1,C621,0,0,0,0,0,0,0,0,0
TC-P42X1,C921,300,300,300,300,300,300,300,300,0
SR-MS102,C141,0,0,0,0,0,0,0,0,0
SR-MS182,C231,0,0,0,0,0,0,0,0,0
SR-TEG18,C322,0,0,0,0,0,0,0,0,0
TC-32LX14,C141,0,0,0,0,0,0,0,0,0
TC-32LX14,C921,0,0,0,0,0,0,0,0,0
TC-32LX14,C231,0,0,0,0,0,0,0,0,0

Unix commands such as sed, cut, and awk might help here.
Thanks a lot. Happy New Year to everyone in the Unix world.
Satish

Corona688 · January 6, 2010, 1:55pm

[edit] I think I misunderstood your problem slightly. Let me rethink.

SatishW · January 6, 2010, 2:04pm

Thanks, Corona, for the quick reply.

The first two columns (model, plant) are maintained in the filter.csv file, which gets modified frequently.
The second column will not always have the value C231; it can differ.

Whatever is maintained in filter.csv, we would like to scan the first two columns of the input file, remove the records whose (model, plant) pair is found in filter.csv, and create the output file.

I hope this clarifies the requirement.

Franklin52 · January 6, 2010, 2:05pm

Try:

grep -i -v -f filter.csv inputfile.csv
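
As a hedged illustration of the command above: -f reads one pattern per line from filter.csv, -i ignores case, and -v prints only the lines that match none of the patterns. The sketch below runs it on a three-line excerpt of the sample data inside a scratch directory (the directory and the shortened files are assumptions made for the demo) and redirects the result into outputfile.csv, since the command on its own only prints to standard output.

```shell
# Work in a throwaway directory so no real files are touched (demo assumption).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Shortened copies of the files from the question.
printf '%s\n' 'TC-P42C1,c231' 'TC-P42X1,C231' > filter.csv
printf '%s\n' \
  'TC-P42C1,C201,4261,4261,4261,4261,4261,4261,4261,4261,4261' \
  'TC-P42C1,C231,0,0,0,0,0,0,0,0,0' \
  'TC-P42X1,C231,0,0,0,7,7,7,7,7,7' > inputfile.csv

# -f: read patterns from filter.csv, -i: ignore case, -v: keep non-matching lines.
grep -i -v -f filter.csv inputfile.csv > outputfile.csv

# Only the C201 row should remain.
cat outputfile.csv
```

Note that grep treats each filter line as a regular expression and matches it anywhere in the line, so a "." in the filter matches any character. Adding -F would make grep treat the filter lines as fixed strings instead.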

Corona688 · January 6, 2010, 2:07pm

Franklin got it before I figured it out. Beware that if there are any blank lines in the filter list, grep treats the empty pattern as matching every line, so with -v it strips out ALL lines, which is what baffled me briefly. Also keep in mind that -i is there to make the match case-insensitive, since the case of two letters in your example filter differs from the input lines you want them to match. If that was accidental, you can omit that flag.
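
One possible way to guard against the blank-line problem is to strip empty lines from the filter list before handing it to grep. This is only a sketch: the scratch directory, the shortened sample files, and the intermediate file name filter.clean are assumptions for the demo, not something from the thread.

```shell
# Work in a throwaway directory so no real files are touched (demo assumption).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# A filter list with a stray blank line in the middle.
printf '%s\n' 'TC-P42C1,c231' '' 'TC-P42X1,C231' > filter.csv
printf '%s\n' \
  'TC-P42C1,C201,4261,4261,4261,4261,4261,4261,4261,4261,4261' \
  'TC-P42C1,C231,0,0,0,0,0,0,0,0,0' \
  'TC-P42X1,C231,0,0,0,7,7,7,7,7,7' > inputfile.csv

# Unguarded: the empty pattern matches every line, so -v leaves nothing.
# grep exits non-zero when it selects no lines, hence the "|| true".
grep -i -v -f filter.csv inputfile.csv > unguarded.csv || true

# Guarded: drop empty or whitespace-only lines from the pattern list first.
# filter.clean is a hypothetical intermediate file name.
grep -v '^[[:space:]]*$' filter.csv > filter.clean
grep -i -v -f filter.clean inputfile.csv > outputfile.csv

wc -l unguarded.csv outputfile.csv
```

With the guard in place, only the two real filter entries are used as patterns, and the C201 row survives in outputfile.csv.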

quirkasaurus · January 6, 2010, 2:07pm

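# Join the filter lines into one alternation: pat1|pat2|pat3
re_string=$( tr '\012' '|' < filter.csv )
# Drop the trailing "|" left by the final newline
re_string=${re_string%\|}
# -i added: the filter entries differ in case from the input
grep -E -i -v "($re_string)" inputfile.csv | tee outputfile.csv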

shoot! nice solution, franklin. <<-- winnar
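
Since the question mentions awk and asks for a comparison on the first two columns only, here is a hedged awk sketch of that stricter reading. Unlike the grep approaches, it compares the (model, plant) pair exactly rather than as a substring or regular expression. The scratch directory, the shortened files, and the row with plant code C2310 are hypothetical and only serve to show the difference.

```shell
# Work in a throwaway directory so no real files are touched (demo assumption).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '%s\n' 'TC-P42C1,c231' > filter.csv
# The C2310 row is hypothetical: a substring match would remove it as well.
printf '%s\n' \
  'TC-P42C1,C231,0,0,0,0,0,0,0,0,0' \
  'TC-P42C1,C2310,0,0,0,0,0,0,0,0,0' \
  'TC-P50C1,C231,0,0,0,2,2,2,2,2,2' > inputfile.csv

awk -F, '
  # Build an upper-cased model,plant key so the comparison ignores case.
  { key = toupper($1) "," toupper($2) }
  # First file (filter.csv): remember each pair; lines without a comma are ignored.
  NR == FNR { if (NF >= 2) skip[key] = 1; next }
  # Second file (inputfile.csv): print rows whose pair is not in the list.
  !(key in skip)
' filter.csv inputfile.csv > outputfile.csv

cat outputfile.csv
```

One caveat with the NR == FNR idiom: if filter.csv is completely empty, awk cannot tell the two files apart and prints nothing, so that case would need separate handling.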