printf in awk

clx · April 1, 2009, 12:14pm

Hi friends..

I am confused about awk printf option..

I have a comma separated file

88562848,21-JAN-08,2741079, -1188,-7433,TESTING
88558314,21-JAN-08,2741189, -1273,-7976,TESTING

and there is a line in my script ( written by someone else)

What does this command do?
I guess %i is for an integer, but what is %s for? Is it a char or something?

apart from this..

I have to modify this script to deal with a semicolon-separated file (which also contains a comma inside the 4th and 5th fields):

T900001;10-Dec-08;T245377;2,25;14,09;CM Inv;1001
900234;10-Jan-08;954201;0,44;2,78;CM Inv;1001

I tried the above awk syntax with "," replaced by ";" but it doesn't work. It gives me an awk syntax error.

Please help.

cfajohnson · April 1, 2009, 12:34pm

Did you remember to quote or escape the semi-colon?

awk -F\; '.....' FILE | grep ... ## No need for cat
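For illustration, here is a minimal sketch of the usual ways to keep the shell from treating the semicolon as a command separator. The sample line is taken from the question; the variable name line is just for this example.

```shell
# Sample record from the question (semicolon-separated)
line='T900001;10-Dec-08;T245377;2,25;14,09;CM Inv;1001'

# All three forms hand a literal ';' to awk as the field separator
printf '%s\n' "$line" | awk -F\; '{ print $2 }'    # backslash escape
printf '%s\n' "$line" | awk -F';' '{ print $2 }'   # single quotes
printf '%s\n' "$line" | awk -F";" '{ print $2 }'   # double quotes
```

Each command should print the second field, 10-Dec-08. An unquoted -F; would end the awk command at the semicolon, which is a likely cause of the syntax error.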

marcus121 · April 1, 2009, 4:10pm

Yes, %i (and %d) format the value as an integer. %s is a character string (whereas %c is a single character).
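A quick way to see the difference is to print the same field with different specifiers. This is only a sketch using the first sample line from the question, not the line from the original script; the brackets just make leading spaces visible.

```shell
# First sample record from the question (comma-separated)
line='88562848,21-JAN-08,2741079, -1188,-7433,TESTING'

# %i converts field 4 to an integer, %s prints it as-is (leading space kept),
# %c prints only the first character of the string in field 6
printf '%s\n' "$line" | awk -F, '{ printf "[%i] [%s] [%c]\n", $4, $4, $6 }'
# [-1188] [ -1188] [T]
```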

What this is doing is reading the csv file (separated by commas) and outputting all but the last field (TESTING) pretty much unchanged. The grep then EXCLUDES (-v) lines that have essentially null values across the record. I'm guessing the null values in the records might have inconsistent lengths, which is why he needs to force the format prior to the grep -v. In other words, these are both null but would require different greps:

00000000,, , 0, 0,TESTING
0,           ,,0000,0,TESTING

... with the awk printf, they would both have the format '0,,0,0,0,,' though.
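To illustrate the normalizing effect of %i on numeric fields, here is a reduced sketch. The three-field records and the helper name normalize are made up for this example, since the original script line is not shown in the thread.

```shell
# Hypothetical helper: print three comma-separated fields, each forced to an integer
normalize() {
  printf '%s\n' "$1" | awk -F, '{ printf "%i,%i,%i\n", $1, $2, $3 }'
}

# Different spellings of "zero or empty" all come out the same
normalize '00000000, 0,'   # 0,0,0
normalize '0,0000,   '     # 0,0,0
```

Once every variant is rendered identically, a single grep -v pattern is enough to drop such records.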

As cfa said, you need to escape the semi-colon because the shell treats it as a special character (it separates commands), like pipe, backtick, etc.

marcus121 · April 1, 2009, 4:23pm

Regarding the rest of your question, if it's always the case that you wish the commas were semi-colons in your new file, you could translate prior to your awk:

 tr ',' ';' <csvfile | awk -F\; '{ ... }'

Keep in mind that your old field 4 is now split across fields 4 and 5, your old field 5 across fields 6 and 7, and so on down the line. Here's your data with the new field numbers:

/msa1/profiles/mlibby/ksh> tr ',' ';' <malt | awk -F\; '{for(i=1;i<=NF;++i) print "Fld",i,$i}'
Fld 1 T900001
Fld 2 10-Dec-08
Fld 3 T245377
Fld 4 2
Fld 5 25
Fld 6 14
Fld 7 09
Fld 8 CM Inv
Fld 9 1001
Fld 1 900234
Fld 2 10-Jan-08
Fld 3 954201
Fld 4 0
Fld 5 44
Fld 6 2
Fld 7 78
Fld 8 CM Inv
Fld 9 1001

If you don't want to translate unilaterally, i.e., some OTHER fields might have embedded commas that you don't want to turn into semicolons, use the split function on fields 4 and 5 like so:

/msa1/profiles/mlibby/ksh> r
awk -F\; '{split($4,z,",");split($5,y,",");printf "%i,%s,%s,%i,%i,%i,%i,%s,%i\n",$1,$2,$3,z[1],z[2],y[1],y[2],$6,$7}' csvfile | grep -v '0,,,0,0,0,0,,0'
0,10-Dec-08,T245377,2,25,14,9,CM Inv,1001
900234,10-Jan-08,954201,0,44,2,78,CM Inv,1001
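Note that the first field of the first record came out as 0, because %i converts the non-numeric string T900001 to an integer. A small sketch of the difference, assuming you want to keep the ID as text:

```shell
# First sample record from the question
line='T900001;10-Dec-08;T245377;2,25;14,09;CM Inv;1001'

# %i turns the non-numeric ID into 0; %s keeps it unchanged
printf '%s\n' "$line" | awk -F\; '{ printf "%i,%s\n", $1, $2 }'   # 0,10-Dec-08
printf '%s\n' "$line" | awk -F\; '{ printf "%s,%s\n", $1, $2 }'   # T900001,10-Dec-08
```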

clx · April 2, 2009, 6:24am

Hi All,

Thank you very much for your explanation.

@marcus121: This is pretty clear now. Just one point.

It is clear that the above will print only 5 fields (excluding TESTING),
i.e.

Now grep -v '0,,0,0,0,,' is, I guess, matching:
1st field - 0
2nd - null
3rd, 4th, 5th - 0
6th - null

But in the resulting file there will be no 6th field (TESTING)! So what is the purpose of the last 2 commas in the grep pattern?

Also, regarding the comma in the 4th and 5th fields of the new file, my requirement is:

instead of 2,25 I want to print (2*100)+25, i.e. 225.
In general, for the 4th and 5th fields, I want the value below:

(value_before_comma*100)+value_after_comma

I am working on it, trying things like $4*=100, but I am having trouble splitting the two values so that I can do a different calculation on each.
I would appreciate any clue.

Thanks.

clx · April 2, 2009, 10:53am

This is what I tried..

Is that the right way to achieve this?

Thanks in advance.

siquadri · April 2, 2009, 11:07am

Hey, can you explain your requirement clearly?
Will you have two different files, one comma-separated and the other semicolon-separated?

Or do you want to create a semicolon-separated file from a comma-separated one and then do the transformations?

clx · April 2, 2009, 11:43am

Hi...
I already have a script made for a comma-separated file.

I have to modify that script for a semicolon-separated file,
including some extra requirements.

As I showed in the example,
my file has values containing a comma in the 4th and 5th fields.

what i want is...

input...

output....

i.e. in the 4th and 5th positions I want...

(value_before_comma*100)+value_after_comma

and to ignore the last two fields from the original file,

as I did in

though the resulting file is comma-separated instead of ";"-separated, but that is not a big deal.

marcus121 · April 2, 2009, 2:16pm

tr ',' ';' <filein | awk -F\; 'BEGIN{OFS=","} {print $1,$2,$3,$4*100+$5,$6*100+$7}'
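As a worked check against the two sample lines from the question, the same pipeline can be fed from printf instead of a file. The function name combine_fields is only for this illustration.

```shell
# Hypothetical wrapper around the pipeline above, reading records from stdin
combine_fields() {
  tr ',' ';' | awk -F\; 'BEGIN{OFS=","} {print $1,$2,$3,$4*100+$5,$6*100+$7}'
}

printf '%s\n' \
  'T900001;10-Dec-08;T245377;2,25;14,09;CM Inv;1001' \
  '900234;10-Jan-08;954201;0,44;2,78;CM Inv;1001' | combine_fields
# T900001,10-Dec-08,T245377,225,1409
# 900234,10-Jan-08,954201,44,278
```

For example, 14,09 becomes fields 6 and 7, so the result is 14*100 + 9 = 1409.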

siquadri · April 2, 2009, 2:36pm

Try this:

 
awk -F'[;,]' 'BEGIN {OFS=";"} {print $1, $2, $3, $4*100+$5, $6*100+$7}' filein

clx · April 7, 2009, 3:41am

Hi..
These solutions are working fine.
Thanks to all.

One query: if there is a negative field, let's say -2,25, my logic will fail.
-200+25=-175, whereas I still want -225.

thanks in advance.
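One possible way to handle the negative case, as a sketch only: take the sign from the leading "-" of the original field, do the arithmetic on the absolute value, and apply the sign at the end. The names combine_signed and combine are made up for this example, and it assumes the part after the comma is never negative itself.

```shell
# Hypothetical helper: reads semicolon-separated records from stdin
combine_signed() {
  awk -F\; '
    # Turn "whole,frac" into whole*100+frac, keeping the sign of the whole field
    function combine(s,    p, sign, whole) {
      split(s, p, ",")
      sign = (s ~ /^ *-/) ? -1 : 1     # a leading "-" marks a negative value
      whole = p[1] + 0
      if (whole < 0) whole = -whole    # work with the absolute value
      return sign * (whole * 100 + p[2])
    }
    BEGIN { OFS = "," }
    { print $1, $2, $3, combine($4), combine($5) }
  '
}

printf '%s\n' 'T900001;10-Dec-08;T245377;-2,25;14,09;CM Inv;1001' | combine_signed
# T900001,10-Dec-08,T245377,-225,1409
```

Checking the sign on the string rather than on the number also covers values like -0,44, which should come out as -44.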