I want to transform a log file into input for a database.
Here's the log file:
Tue Aug 4 20:17:01 PDT 2009
Wireless users: 339
Daily Average: 48.4285
=
Tue Aug 11 20:17:01 PDT 2009
Wireless users: 295
Daily Average: 42.1428
=
Tue Aug 18 20:17:01 PDT 2009
Wireless users: 294
Daily Average: 42.0000
=
Tue Aug 25 20:17:01 PDT 2009
Wireless users: 289
Daily Average: 41.2857
=
I need to strip the descriptions for "Wireless users" and "Daily Average" but keep the date as is.
I thought I could use "=" as the record separator and "\n" as the field separator. Here's what I've got so far:
awk -F'\n' 'RS="\="{for(i=1;i<=NF;i++){gsub(/[^[:digit:].]/,"",$4)}}; 1' rotate1.log
The output confuses me:
Tue Aug 4 20:17:01 PDT 2009
Wireless users: 339
Daily Average: 48.4285
Tue Aug 11 20:17:01 PDT 2009 Wireless users: 295 42.1428
Tue Aug 18 20:17:01 PDT 2009 Wireless users: 294 42.0000
Tue Aug 25 20:17:01 PDT 2009 Wireless users: 289 41.2857
Tue Sep 1 20:17:01 PDT 2009 Wireless users: 379 54.1428
Why is it printing the first record as is, but printing the rest as specified by RS and FS?
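A likely explanation, offered as a sketch rather than a confirmed diagnosis: RS="\=" sits in the pattern position, so awk evaluates it only after it has already read the first record with the default newline separator. Moving the assignments into a BEGIN block makes them take effect before any input is read. The snippet below assumes a scratch file named sample.log (a hypothetical name) holding two records copied from the log. Every record after the first begins with the newline that followed "=", so that newline is stripped before the fields are used.

```shell
# Two records copied from the log, written to a scratch file (hypothetical name).
cat > sample.log <<'EOF'
Tue Aug 4 20:17:01 PDT 2009
Wireless users: 339
Daily Average: 48.4285
=
Tue Aug 11 20:17:01 PDT 2009
Wireless users: 295
Daily Average: 42.1428
=
EOF

# Separators are set in BEGIN so they apply to the very first record too.
out=$(awk 'BEGIN { RS = "="; FS = "\n"; OFS = "\t" }
{
    sub(/^\n/, "")            # drop the newline left over after the previous "="
    if ($1 == "") next        # skip the empty remainder after the final "="
    sub(/^[^:]*: */, "", $2)  # "Wireless users: 339"    becomes "339"
    sub(/^[^:]*: */, "", $3)  # "Daily Average: 48.4285" becomes "48.4285"
    print $1, $2, $3
}' sample.log)
printf '%s\n' "$out"
```

Each record should come out as one tab-separated line: the date unchanged, then the two bare numbers.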
Second, I need to gsub on field 3 as well. Here's a version with two gsub statements:
awk -F'\n' 'RS="\="{for(i=1;i<=NF;i++){gsub(/[^[:digit:].]/,"",$4)}{gsub(/[^[:digit:]]/,"",$3)}}; 1' rotate1.log
Output (still printing the first record unscathed):
Tue Aug 4 20:17:01 PDT 2009
Wireless users: 339
Daily Average: 48.4285
Tue Aug 11 20:17:01 PDT 2009 295 42.1428
Tue Aug 18 20:17:01 PDT 2009 294 42.0000
Tue Aug 25 20:17:01 PDT 2009 289 41.2857
Tue Sep 1 20:17:01 PDT 2009 379 54.1428
Is there a way to throw an "or" in there to reduce the gsubs to one?
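One way the "or" might be expressed is regex alternation with `|`, so a single gsub strips either label. This is a minimal sketch on two sample lines piped in with printf. It assumes the labels appear exactly as "Wireless users: " and "Daily Average: " at the start of a line, and it is not tied to the RS/FS setup above.

```shell
# A single gsub with alternation removes whichever label starts the line.
stripped=$(printf 'Wireless users: 339\nDaily Average: 48.4285\n' |
    awk '{ gsub(/^(Wireless users|Daily Average): */, ""); print }')
printf '%s\n' "$stripped"
```

With the default newline record separator, each line is its own record, so the `^` anchor matches at the start of every label line.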
So the output above is OK except for the printing of the first record "AS IS".
With OFS set as tab I've nearly got what I need:
awk -F'\n' 'RS="\="{for(i=1;i<=NF;i++){gsub(/[^[:digit:].]/,"",$4)}{gsub(/[^[:digit:]]/,"",$3)}};{OFS="\t"};1' rotate1.log
So what's up with printing the first record undigested?
Thanks for reading!
Bubnoff