Help to simplify with awk

Hi,

I need your help, guys. I'm trying to tweak my current shell script to make it run faster. I think the part that takes too long is splitting the first field of my raw CSV file into a date and a time. Below is the first column of my CSV file:

$ awk -F"," {'print $1'} temp-*|head
20080826000059
20080826000325
20080826000400
20080826000044
20080826000332
20080826001014
20080826001932
20080826002833
20080826002107
20080826002148

When my script runs it becomes:

2008.08.26,00:00:59
2008.08.26,00:03:25
2008.08.26,00:04:00
2008.08.26,00:00:44
2008.08.26,00:03:32
2008.08.26,00:10:14
2008.08.26,00:19:32
2008.08.26,00:28:33
2008.08.26,00:21:07
2008.08.26,00:21:48

The part of my script that does this is below:

for LINE in `cat $INPUT/temp-out.rgp3`
do
YEAR=`echo $LINE|awk -F"," {'print $1'}|cut -c 1-4`
MONTH=`echo $LINE|awk -F"," {'print $1'}|cut -c 5-6`
DEY=`echo $LINE|awk -F"," {'print $1'}|cut -c 7-8`
HOUR=`echo $LINE|awk -F"," {'print $1'}|cut -c 9-10`
MIN=`echo $LINE|awk -F"," {'print $1'}|cut -c 11-12`
SEC=`echo $LINE|awk -F"," {'print $1'}|cut -c 13-14`
done

Is there a way that I can simplify this to make the script run faster? Maybe use awk {gsub} or something similar?

Appreciate your replies.

Thanks.

Try...

awk -F"," '{ 
       YEAR=substr($1,1,4)
       MONTH=substr($1,5,2)
       DEY=substr($1,7,2)
       HOUR=substr($1,9,2)
       MIN=substr($1,11,2)
       SEC=substr($1,13,2)
       print YEAR "." MONTH "." DEY "," HOUR ":" MIN ":" SEC
    }' $INPUT/temp-out.rgp3

Also take a look at this solution: http://www.unix.com/shell-programming-scripting/78426-timestamp-date.html#post302228573
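If the remaining CSV columns have to survive, the same substr() idea can rewrite the first field in place instead of printing only the timestamp. This is just a sketch: it assumes the fields are comma-separated and should pass through unchanged, and the function name reformat_first_field and the sample fields a,b are placeholders of mine, not anything from the original file.

```shell
# reformat_first_field (hypothetical name): reads CSV on stdin and rewrites
# field 1 from YYYYMMDDhhmmss to YYYY.MM.DD,hh:mm:ss, keeping the other fields.
reformat_first_field() {
  awk 'BEGIN { FS = OFS = "," }
  {
    # Assigning to $1 makes awk rebuild the record using OFS.
    $1 = substr($1,1,4) "." substr($1,5,2) "." substr($1,7,2) "," \
         substr($1,9,2) ":" substr($1,11,2) ":" substr($1,13,2)
    print
  }'
}

# Placeholder input line; the real file would be passed on stdin instead.
printf '%s\n' '20080826000059,a,b' | reformat_first_field
```

This runs one awk process for the whole file rather than several processes per line, which is where most of the time goes in the original loop.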

Or even just

sed -e 's/\(....\)\(..\)\(..\)\(..\)\(..\)\(..\).*/\1.\2.\3,\4:\5:\6/' $INPUT/temp-out.rgp3

If the objective is to have these variables available to your shell script in some further processing before the "done", then maybe something like

while IFS= read -r line; do
  # Quote $tail and the stripped prefix so they are matched literally, not as glob patterns
  tail=${line#????}; YEAR=${line%"$tail"};  line=${line#"$YEAR"}
  tail=${line#??};   MONTH=${line%"$tail"}; line=${line#"$MONTH"}
  tail=${line#??};   DAY=${line%"$tail"};   line=${line#"$DAY"}
  tail=${line#??};   HOUR=${line%"$tail"};  line=${line#"$HOUR"}
  tail=${line#??};   MIN=${line%"$tail"};   line=${line#"$MIN"}
  tail=${line#??};   SEC=${line%"$tail"}
  echo "$YEAR.$MONTH.$DAY,$HOUR:$MIN:$SEC"
  : more stuff here
done <"$INPUT/temp-out.rgp3"

This isn't terribly well encapsulated; I'd suggest you turn this into a function to make it prettier.
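One way the function version might look, as a rough sketch. The name split_stamp and the helper variables s and t are made up for illustration, and it assumes the first CSV field is always a 14-digit timestamp. The two sample lines are taken from the first column shown in the question.

```shell
# split_stamp (hypothetical name): sets YEAR, MONTH, DAY, HOUR, MIN and SEC
# from a YYYYMMDDhhmmss string, using only shell parameter expansion.
split_stamp() {
  s=$1
  t=${s#????}; YEAR=${s%"$t"};  s=$t   # first 4 characters
  t=${s#??};   MONTH=${s%"$t"}; s=$t   # then 2 characters at a time
  t=${s#??};   DAY=${s%"$t"};   s=$t
  t=${s#??};   HOUR=${s%"$t"};  s=$t
  t=${s#??};   MIN=${s%"$t"};   s=$t
  t=${s#??};   SEC=${s%"$t"}
}

# IFS=, makes read put the first CSV field in stamp and the remainder in rest.
while IFS=, read -r stamp rest; do
  split_stamp "$stamp"
  printf '%s.%s.%s,%s:%s:%s\n' "$YEAR" "$MONTH" "$DAY" "$HOUR" "$MIN" "$SEC"
done <<'EOF'
20080826000059
20080826000325
EOF
```

In the real script the here-document would be replaced by a redirect from the input file. No external process is started per line, so this should stay reasonably fast while still leaving the variables available inside the loop.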

echo 20080826000059 | sed 's/\(....\)\(..\)\(..\)\(..\)\(..\)\(..\)/\1\.\2\.\3,\4:\5:\6/'

It works like a charm!

Thanks!!!!