Re: Deleting lines from big file.

Hi,

I have a big (2.7 GB) text file. Each line uses '|' as the separator between columns.

I want to delete the lines that contain text like '|0|0|0|0|0'.

I tried:

sed '/|0|0|0|0|0/d' test.txt

Unfortunately, it scans the file but does nothing.

file content sample:

/2010|1|2842|C|9999999900099|0|083|484.09|879|4004.59|1363|6277.43|1118|5020.95
/2010|2|2842|C|9999999900099|0|1083|184.49|889|0|0|0|0|0
/2010|3|2842|C|9999999900099|0|103|428.99|899|4004.59|1363|6277.43|1118|5020.95
/2010|9|842|F|9999999900099|0|183|14.41|19|0|0|0|0|0

I want to delete the 2nd and 4th lines (the original file has 47 million lines and there are many occurrences of '|0|0|0|0|0').

Thanks in advance
Dip

Try:

grep -v "|0|0|0|0|0" <test.txt >output.txt

It won't edit in place, but then, sed won't either. And besides, I think you'd want to check your data to make sure you didn't destroy it before overwriting it, hm?
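To make the "check your data first" advice concrete, here is a minimal sketch. It assumes a small stand-in file named sample.txt (a hypothetical name, built from the four sample lines in the question) and adds -F so the pattern is treated as a fixed string. With plain grep the '|' is already literal in a basic regular expression, so -F is a safety choice rather than a requirement.

```shell
# Build a small stand-in for test.txt (sample.txt is a hypothetical name).
cat > sample.txt <<'EOF'
/2010|1|2842|C|9999999900099|0|083|484.09|879|4004.59|1363|6277.43|1118|5020.95
/2010|2|2842|C|9999999900099|0|1083|184.49|889|0|0|0|0|0
/2010|3|2842|C|9999999900099|0|103|428.99|899|4004.59|1363|6277.43|1118|5020.95
/2010|9|842|F|9999999900099|0|183|14.41|19|0|0|0|0|0
EOF

# -v keeps the lines that do NOT match; -F treats the pattern as a fixed string.
grep -vF '|0|0|0|0|0' sample.txt > output.txt

# Sanity check: kept lines plus dropped lines should equal the original count.
# The $(( ... )) wrapper strips the padding some wc implementations add.
total=$(( $(wc -l < sample.txt) ))
kept=$(( $(wc -l < output.txt) ))
dropped=$(( $(grep -cF '|0|0|0|0|0' sample.txt) ))
echo "total=$total kept=$kept dropped=$dropped"
[ $((kept + dropped)) -eq "$total" ] && echo "counts add up"
```

On the real file the same three counts give a quick check that nothing was lost before output.txt replaces the original.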

Thank you.

  1. Always check that you are using the correct tools. Sed is a stream editor, and is wonderful for making on-the-fly edits. What you want, though, is to select or deselect lines to retain. Finding and filtering lines this way is a 'grep' kinda thing. Used in the form 'fgrep' (or 'grep -F' on current systems), it matches fixed strings rather than regular expressions, which makes it better suited to your purpose.
  2. Perl is also capable, but may be overkill for what you want to do.
  3. AWK could be made to do the same job, but would be awkward compared to grep. (pun intended: sorry)
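For comparison, points 1 and 3 can be sketched side by side. This is only an illustration on a small stand-in file (sample.txt, grep_out.txt and awk_out.txt are hypothetical names); the awk version uses index() for a plain substring test, which is one of several ways to write it.

```shell
# Build a small stand-in for test.txt (sample.txt is a hypothetical name).
cat > sample.txt <<'EOF'
/2010|1|2842|C|9999999900099|0|083|484.09|879|4004.59|1363|6277.43|1118|5020.95
/2010|2|2842|C|9999999900099|0|1083|184.49|889|0|0|0|0|0
/2010|3|2842|C|9999999900099|0|103|428.99|899|4004.59|1363|6277.43|1118|5020.95
/2010|9|842|F|9999999900099|0|183|14.41|19|0|0|0|0|0
EOF

# Fixed-string filter: keep every line that does not contain the pattern.
grep -vF '|0|0|0|0|0' sample.txt > grep_out.txt

# awk equivalent: index() returns 0 when the substring is absent,
# so the condition is true (and the line is printed) only for lines to keep.
awk 'index($0, "|0|0|0|0|0") == 0' sample.txt > awk_out.txt

# Both tools should produce byte-identical output.
cmp -s grep_out.txt awk_out.txt && echo "same result"
```

Both commands read the file once as a stream, so either should cope with a multi-gigabyte input; the grep form is simply shorter to type.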

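sed prints the filtered lines to standard output and leaves test.txt unchanged, which is why it seems to do nothing. The command itself works, whether the file is passed as an argument or redirected to standard input. Try the following: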

$ cat test.txt
/2010|1|2842|C|9999999900099|0|083|484.09|879|4004.59|1363|6277.43|1118|5020.95
/2010|2|2842|C|9999999900099|0|1083|184.49|889|0|0|0|0|0
/2010|3|2842|C|9999999900099|0|103|428.99|899|4004.59|1363|6277.43|1118|5020.95
/2010|9|842|F|9999999900099|0|183|14.41|19|0|0|0|0|0
$ sed -e '/|0|0|0|0|0/d' < test.txt
/2010|1|2842|C|9999999900099|0|083|484.09|879|4004.59|1363|6277.43|1118|5020.95
/2010|3|2842|C|9999999900099|0|103|428.99|899|4004.59|1363|6277.43|1118|5020.95
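To keep the result rather than just print it, one option is to write to a second file and swap it in only after a check passes. The sketch below assumes a stand-in file named sample.txt and a temporary name sample.filtered (both hypothetical), and it needs enough free disk space for a second copy of the data. GNU sed also has an -i option for in-place editing (BSD/macOS sed expects a suffix argument after -i), but the two-step form works with any sed and leaves the original intact until the check succeeds.

```shell
# Build a small stand-in for test.txt (sample.txt is a hypothetical name).
cat > sample.txt <<'EOF'
/2010|1|2842|C|9999999900099|0|083|484.09|879|4004.59|1363|6277.43|1118|5020.95
/2010|2|2842|C|9999999900099|0|1083|184.49|889|0|0|0|0|0
/2010|3|2842|C|9999999900099|0|103|428.99|899|4004.59|1363|6277.43|1118|5020.95
/2010|9|842|F|9999999900099|0|183|14.41|19|0|0|0|0|0
EOF

# Write the filtered lines to a new file; the original is not touched yet.
# Replace the original only if sed succeeded and no matching line survived.
if sed '/|0|0|0|0|0/d' sample.txt > sample.filtered &&
   ! grep -qF '|0|0|0|0|0' sample.filtered
then
    mv sample.filtered sample.txt
    echo "replaced sample.txt"
else
    echo "check failed, sample.txt left as it was" >&2
fi
```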