Optimize/speed up Perl extraction

Hi,

Is there a way I can extract my data faster? My data is a 1.2 GB text file with 8 million rows and 38 columns/fields. Imagine how huge this is.

How can I optimize the data extraction using Perl? That is why I'm writing a script to filter out only the information I need. Are there any modules available, or any other way to speed up the extraction? Thanks in advance.
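
For example, this is the kind of filtering script I have in mind. It is only a rough sketch: the file name, the tab delimiter, the column numbers, and the 'KEEP' condition are all placeholders.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Stream the file line by line rather than loading 1.2 GB into memory.
open my $fh, '<', 'data.txt' or die "open data.txt: $!";

while ( my $line = <$fh> ) {
    chomp $line;

    # Limit the split to 19 fields so Perl stops splitting right after
    # the last column we actually need (index 17), instead of building
    # all 38 fields for every row.
    my @f = split /\t/, $line, 19;

    # Placeholder filter: keep only rows whose 5th column says 'KEEP'.
    next unless defined $f[4] && $f[4] eq 'KEEP';

    print join( "\t", @f[ 0, 4, 17 ] ), "\n";
}
close $fh;
```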

Cheers! :slight_smile:

Br, Pete

"Faster" is an entirely relative term.

What is it that you are interested in extracting?

What is the problem with your current approach?

Could you please post sample input and output? That would help us understand much better what needs to be done! :slight_smile:

We don't need to imagine; you have just told us. :slight_smile:

How are you determining what you extract?

How do you extract it?

The fastest way would be to use a C program that reads each line into a single buffer, determines what to keep without any memory allocation/deallocation, then prints the required fields, again without any allocation/deallocation.
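
For instance, a rough sketch of that idea; the tab delimiter and the choice of fields 1 and 5 are only placeholders:

```c
#include <stdio.h>
#include <string.h>

int main(void)
{
    /* One fixed, reusable line buffer; no malloc/free anywhere.
     * It must be larger than the longest line in the file. */
    char line[65536];

    /* Read from stdin, e.g.:  ./extract < data.txt > out.txt */
    while (fgets(line, sizeof line, stdin) != NULL) {
        char *p       = line;
        int   field   = 1;   /* 1-based field counter */
        int   printed = 0;

        /* Walk the tab-separated fields in place and stop scanning
         * as soon as the last wanted field has been passed. */
        while (p != NULL && field <= 5) {
            char  *tab = strchr(p, '\t');
            size_t len = tab ? (size_t)(tab - p) : strcspn(p, "\r\n");

            if (field == 1 || field == 5) {   /* the wanted fields */
                if (printed)
                    fputc('\t', stdout);
                fwrite(p, 1, len, stdout);
                printed = 1;
            }
            p = tab ? tab + 1 : NULL;
            field++;
        }
        fputc('\n', stdout);
    }
    return 0;
}
```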

How is this possible?

Isn't there a size restriction on the program buffer or the kernel buffer?

If it were feasible for a single buffer to hold the entire contents, whatever the size, then a single flush could do the whole job (this is purely speculative).
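
For what it's worth, the stdio buffer can at least be enlarged explicitly with setvbuf. A small sketch; the 1 MiB size is arbitrary, and holding all 1.2 GB in one buffer would need that much memory:

```c
#include <stdio.h>

int main(void)
{
    /* Give stdout a 1 MiB fully buffered stdio buffer (the default
     * is typically only a few KiB), so output reaches the kernel in
     * large chunks.  setvbuf must be called before the first I/O on
     * the stream. */
    static char buf[1 << 20];

    if (setvbuf(stdout, buf, _IOFBF, sizeof buf) != 0) {
        fputs("setvbuf failed\n", stderr);
        return 1;
    }

    for (int i = 0; i < 1000000; i++)
        printf("row %d\n", i);

    return 0;   /* the buffer is flushed automatically at normal exit */
}
```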

:slight_smile: