I'm reading in some >41,000 line files and doing some manipulations of columns based on the values of other columns. Arrays make a ton of sense for this application.
To avoid slowing things down too much, I want to write the lines out efficiently, without another loop indexing through the entries of each line.
For each row,
echo "${array[@]}"
gives space delimiters (and some of the values contain spaces, so piping it through tr won't work).
IFS=','; echo "${array[@]}"
does not change anything; the output is still space delimited (I was not really expecting this to work).
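A minimal reproduction of what I'm seeing (the array contents here are made up for illustration):

arr=("a b" "c" "d")   # note the embedded space in the first element
IFS=','
echo "${arr[@]}"      # prints: a b c d
# Within double quotes, [@] expands each element as a separate word, and
# echo joins its arguments with single spaces, so IFS never enters into it.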
Is there a time- and processor-efficient way to do this without a loop indexing through all the columns?
Mike
P.S. A related question about this code:
IFS=','; while read -r -a array; do
    # do a bunch of stuff
    echo "${array[@]}" >> out_file
done < in_file
is much, much slower than
( IFS=','; while read -r -a array; do
    # do a bunch of stuff
    echo "${array[@]}"
done ) < in_file > out_file
I suspect this is because out_file is only opened and closed once, not 41,000 times. What are the risks of the subshell overrunning its available memory? I'm looking at ~10 MB files for just the Q1 2013 data, and I still need to port over data from 2010 on. This will have to be deployed to a Cygwin environment.
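For reference, my understanding is that bash also lets you put both redirections on the loop itself, so a sketch like the following (same made-up file names as above) should open each file only once without the explicit subshell; I have not benchmarked it against the versions above:

IFS=','
while read -r -a array; do
    # do a bunch of stuff
    echo "${array[@]}"
done < in_file > out_file   # both redirections apply to the whole loop,
                            # so each file is opened exactly once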