Count the delimiters in a file and delete a row if the delimiter count doesn't match.

I have a file containing about 5 million rows. Some records have an extra delimiter at a random position (we don't know the positions). I need to count the delimiters in each row, and if the count doesn't match, delete those rows from the file.

What is the most efficient way of doing it? Or rather, how can I do it at all?

The actual file has 105 columns, with an open bracket ( as the delimiter.

Thanks for your help in advance.

perl -ne '@m=m/\(/g;print if $#m+1==104' infile > outfile
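To see what this one-liner keeps, here is a toy run on a 3-column file (2 '(' delimiters per good row), using == 2 in place of the == 104 you would use on the real 105-column file. The file names are just placeholders for the demo:

```shell
# Three rows; the middle one has an extra '(' delimiter
printf 'a(b(c\nd(e(f(g\nh(i(j\n' > infile_demo

# Keep only rows with exactly 2 delimiters (3 columns)
perl -ne '@m = m/\(/g; print if $#m + 1 == 2' infile_demo > outfile_demo

cat outfile_demo
# outfile_demo now holds a(b(c and h(i(j; the 3-delimiter row is gone
```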

Many thanks for the reply.
However, using the command below I am able to extract the unwanted rows into another file, but I want those rows deleted from the main file. Any help?

perl -ne

Provide some sample input and your expected output.

I'm assuming the perl command was truncated in your post, but that it is something like the one suggested earlier. If so, you'll need to move (mv) the output file onto the original file in order to replace the original file with the filtered version.

For example:

perl -ne '@m=m/\(/g;print if $#m+1==104' infile > outfile  
mv infile infile.bak
mv outfile infile

You could move outfile straight over the original file, but in my experience keeping a backup copy of the original can be a life saver (especially while testing the command).

If the delimiter is whitespace

awk 'NF == 105'

In your case, with ( as the delimiter (a row with 105 fields and 104 delimiters has NF == 105):

awk 'BEGIN { FS = "(" }; NF == 105'
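The awk approach can be checked on a toy file as well. Note that NF counts fields, not delimiters, so a row with the correct number of fields is kept with NF == 3 here (NF == 105 on the real file). File names are placeholders:

```shell
# Three rows delimited by '('; the middle one has an extra delimiter
printf 'a(b(c\nd(e(f(g\nh(i(j\n' > infile_awk

# A single-character FS is taken literally, so "(" needs no escaping;
# print only rows that split into exactly 3 fields
awk 'BEGIN { FS = "(" }; NF == 3' infile_awk > outfile_awk

cat outfile_awk
# outfile_awk keeps a(b(c and h(i(j; the 4-field row is dropped
```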

Mike