awk not working as expected with BIG files ...

videsh77 · February 24, 2005, 5:53am

I am facing a strange problem.

I know that there is only one record in the file 'test.txt' that starts with 'X'.

I confirm that with the following command:
awk '/^X/' test.txt | wc -l

This gives me the output '1'.

Now I extract this record from the file as follows:

awk '/^X/' test.txt > XRecord.txt

Then I extract all records that do not start with 'X'.
The command for that is:

awk '$0 !~ /^X/' test.txt > NotXRecord.txt

After doing this, the number of records in NotXRecord.txt is not one less than the number of records in 'test.txt'.
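A minimal sketch of the check described above, assuming a tiny sample file that is created here only for illustration (the real 'test.txt' has millions of rows). It runs the same two awk commands and compares the line counts; the variable names are hypothetical.

```shell
# Build a small sample file (hypothetical data, for illustration only).
printf 'A1\nX2\nB3\nC4\n' > test.txt

# Same sequence as in the post: X records, then everything else.
awk '/^X/' test.txt > XRecord.txt
awk '$0 !~ /^X/' test.txt > NotXRecord.txt

# Count the lines in each file.
total=$(wc -l < test.txt)
x=$(wc -l < XRecord.txt)
notx=$(wc -l < NotXRecord.txt)

# If the split is complete, the two parts add up to the total.
if [ "$((x + notx))" -eq "$((total))" ]; then
    echo "counts match"
else
    echo "counts differ: total=$total X=$x notX=$notx"
fi
```

On a small file like this the counts should agree; running the same comparison on the large file shows how many records are unaccounted for.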

The same sequence works fine with small files, say up to 100 thousand records.

But it does not work for a file with 4 million rows.

Does anyone have any hints to share?

bhargav · February 24, 2005, 3:15pm

What problem are you getting here?

awk may be too slow here.

Have you tried grep or sed for this problem?

My understanding is that you may get better results with sed or grep,
performance-wise.
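A rough sketch of what the grep and sed equivalents might look like, assuming the same file names as in the original post. The sample data and the '.sed.txt' output names are made up for illustration, and whether this is actually faster on a 4 million row file would need to be measured.

```shell
# Small sample file (hypothetical data, for illustration only).
printf 'A1\nX2\nB3\nC4\n' > test.txt

# Count the lines that start with X.
grep -c '^X' test.txt

# grep: lines starting with X, then everything else (-v inverts the match).
grep '^X' test.txt > XRecord.txt
grep -v '^X' test.txt > NotXRecord.txt

# sed: -n with p prints only the matching lines; d deletes them instead.
sed -n '/^X/p' test.txt > XRecord.sed.txt
sed '/^X/d' test.txt > NotXRecord.sed.txt
```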

Give it a try.