Using double quotes in awk

Hi

I read somewhere that when using double quotes in awk; variables gets expanded else it doesn't.

So I tried to use the double quotes inside an awk statement as below:

 
from_instance_trans=`awk "/INPUT =\"$frm_inst\"/,/<\/TRANSFORMATION>/" $xml_object | grep -w "<TRANSFIELD" | awk -F\" "{ if ($0 ~ "$frm_inst") { print $14":"$16} }"`

And I'm having problem with the part I've colored in red.
When the script executes I get the following error:
[NOTE: $frm_inst=STAGE_TABLE]

 
awk: { if (./name_of_the_script.ksh ~ STAGE_TABLE) { print 4 6} }
awk:       ^ syntax error
 

While at the command line if I use the same syntax but with single quotes I get correct result

 
awk '/INPUT ="STAGE_TABLE"/,/<\/TRANSFORMATION>/' test_file.XML | grep -w '<TRANSFIELD' | awk -F\" '{ if ($0 ~ "EMP_KEY")  { print $14":"$16} }'
 
results into
-----------
15:0

The o/p of the command

awk '/INPUT ="STAGE_TABLE"/,/<\/TRANSFORMATION>/' test_file.XML | grep -w '<TRANSFIELD' 

is

 
<TRANSFIELD DATATYPE ="double" DEFAULTVALUE ="" DESCRIPTION ="" FIELD ="EMP_KEY" PICTURETEXT ="" PORTTYPE ="OUT" PRECISION ="15" SCALE ="0"/>
<TRANSFIELD DATATYPE ="string" DEFAULTVALUE ="" DESCRIPTION ="" FIELD ="DEPT_KEY" PICTURETEXT ="" PORTTYPE ="OUT" PRECISION ="7" SCALE ="0"/>
<TRANSFIELD DATATYPE ="string" DEFAULTVALUE ="" DESCRIPTION ="" FIELD ="MGR_NAME" PICTURETEXT ="" PORTTYPE ="OUT" PRECISION ="32" SCALE ="0"/>
<TRANSFIELD DATATYPE ="decimal" DEFAULTVALUE ="" DESCRIPTION ="" FIELD ="SAL_CODE" PICTURETEXT ="" PORTTYPE ="OUT" PRECISION ="3" SCALE ="0"/>
<TRANSFIELD DATATYPE ="string" DEFAULTVALUE ="" DESCRIPTION ="" FIELD ="PCT_RISE" PICTURETEXT ="" PORTTYPE ="IN" PRECISION ="7" SCALE ="0"/>

and I'm trying to extract PRECISION & SCALE field's values. :wall:
Any insight into this would be much appreciated.

-dips

Generally, when invoking awk from CL, you want to use single quotes to prevent shell from expanding.
If you use double quotes instead of single, shell will expand $0 into the name of the script. When you use single quotes, shell passes $0 to awk literally, and awk interprets it as the whole record, which is what you wanted.

Here is how I'd awk it out:

 
awk '/EMP_KEY/ {for(i=1;i<=NF;i++){if($i~"PRECISION|SCALE") { print gensub(/[^0-9]/,"","g",$(i+1)); } } }'

Processing only lines that contain "EMP_KEY", looping through all fields (in case PRECISION and SCALE can be on different places); if the field matches "PRECISION" or "SCALE", 'gensub' call will replace all non-numbers [^0-9] in next field ($i+1) with empty string. "g" is a "global replacement" flag.

Hi mirni,
ok got the point.
But the problem is that the below code line will be a part of a loop which reads a xml file and extracts the $frm_inst
And again in your code awk won't expand the variable $frm_inst. What I mean is that the pattern keeps on changing.

I encountered a post in this forum talking about the same issue. I tried this also but it doesn't work :(.

Please advice.

-dips

you can pass a variable to awk with -v switch:

awk -v "myVar=$frm_inst" '{if($0~myVar){for(i=1;i<=NF;i++){if($i~"PRECISION|SCALE") { print gensub(/[^0-9]/,"","g",$(i+1)); } } } }'

Or just plug in the shell variable directly, turning off the single quote beforehand and turning it back on after:

 awk  '/'$frm_inst'/ {for(i=1;i<=NF;i++){if($i~"PRECISION|SCALE") { print gensub(/[^0-9]/,"","g",$(i+1)); } } }'
1 Like

Hi mirni,

Brilliant! :slight_smile:

There's only one problem with the code you have provided it matches two fields say EMP_KEY and also I_EMP_KEY. Is there a way to restrict to an exact match?

-dips

awk -F'"' '{print $(NF-3),$(NF-1)}' infile
# cat tst
<TRANSFIELD DATATYPE ="double" DEFAULTVALUE ="" DESCRIPTION ="" FIELD ="EMP_KEY" PICTURETEXT ="" PORTTYPE ="OUT" PRECISION ="15" SCALE ="0"/>
<TRANSFIELD DATATYPE ="string" DEFAULTVALUE ="" DESCRIPTION ="" FIELD ="DEPT_KEY" PICTURETEXT ="" PORTTYPE ="OUT" PRECISION ="7" SCALE ="0"/>
<TRANSFIELD DATATYPE ="string" DEFAULTVALUE ="" DESCRIPTION ="" FIELD ="MGR_NAME" PICTURETEXT ="" PORTTYPE ="OUT" PRECISION ="32" SCALE ="0"/>
<TRANSFIELD DATATYPE ="decimal" DEFAULTVALUE ="" DESCRIPTION ="" FIELD ="SAL_CODE" PICTURETEXT ="" PORTTYPE ="OUT" PRECISION ="3" SCALE ="0"/>
<TRANSFIELD DATATYPE ="string" DEFAULTVALUE ="" DESCRIPTION ="" FIELD ="PCT_RISE" PICTURETEXT ="" PORTTYPE ="IN" PRECISION ="7" SCALE ="0"/>
# awk -F'"' '{print $(NF-3),$(NF-1)}' tst
15 0
7 0
32 0
3 0
7 0
# awk -F'"' '{print $(NF-3)":"$(NF-1)}' tst
15:0
7:0
32:0
3:0
7:0
# awk -F'"' '/EMP_KEY/{print $(NF-3)":"$(NF-1)}' tst
15:0

---------- Post updated at 02:31 PM ---------- Previous update was at 02:06 PM ----------

Example of exact match

# cat tst
<TRANSFIELD DATATYPE ="double" DEFAULTVALUE ="" DESCRIPTION ="" FIELD ="EMP_KEY" PICTURETEXT ="" PORTTYPE ="OUT" PRECISION ="15" SCALE ="0"/>
<TRANSFIELD DATATYPE ="double" DEFAULTVALUE ="" DESCRIPTION ="" FIELD ="I_EMP_KEY" PICTURETEXT ="" PORTTYPE ="OUT" PRECISION ="22" SCALE ="5"/>
<TRANSFIELD DATATYPE ="string" DEFAULTVALUE ="" DESCRIPTION ="" FIELD ="DEPT_KEY" PICTURETEXT ="" PORTTYPE ="OUT" PRECISION ="7" SCALE ="0"/>
<TRANSFIELD DATATYPE ="string" DEFAULTVALUE ="" DESCRIPTION ="" FIELD ="MGR_NAME" PICTURETEXT ="" PORTTYPE ="OUT" PRECISION ="32" SCALE ="0"/>
<TRANSFIELD DATATYPE ="decimal" DEFAULTVALUE ="" DESCRIPTION ="" FIELD ="SAL_CODE" PICTURETEXT ="" PORTTYPE ="OUT" PRECISION ="3" SCALE ="0"/>
<TRANSFIELD DATATYPE ="string" DEFAULTVALUE ="" DESCRIPTION ="" FIELD ="PCT_RISE" PICTURETEXT ="" PORTTYPE ="IN" PRECISION ="7" SCALE ="0"/>
# nawk -F'"' -v V="EMP_KEY" '$8==V{print $(NF-3)":"$(NF-1)}' tst
15:0
# nawk -F'"' -v V="I_EMP_KEY" '$8==V{print $(NF-3)":"$(NF-1)}' tst
22:5

Hi ctsgnb,

I don't have nawk command on my m/c :(.

Also I can't use NF as not necessarily PRECISION & SCALE fields would at NF-1 & NF-3 positions.

-dips

use awk instead, but the rest of the syntax should be similar.

# cat tst
<TRANSFIELD DATATYPE ="double" DEFAULTVALUE ="" DESCRIPTION ="" FIELD ="EMP_KEY" PICTURETEXT ="" PORTTYPE ="OUT" PRECISION ="15" SCALE ="0"/>
<TRANSFIELD DATATYPE ="double" DEFAULTVALUE ="" DESCRIPTION ="" FIELD ="I_EMP_KEY" PICTURETEXT ="" PORTTYPE ="OUT" PRECISION ="22" SCALE ="5"/>
<TRANSFIELD DATATYPE ="string" DEFAULTVALUE ="" DESCRIPTION ="" FIELD ="DEPT_KEY" PICTURETEXT ="" PORTTYPE ="OUT" PRECISION ="7" SCALE ="0"/>
<TRANSFIELD DATATYPE ="string" DEFAULTVALUE ="" DESCRIPTION ="" FIELD ="MGR_NAME" PICTURETEXT ="" PORTTYPE ="OUT" PRECISION ="32" SCALE ="0"/>
<TRANSFIELD DATATYPE ="decimal" DEFAULTVALUE ="" DESCRIPTION ="" FIELD ="SAL_CODE" PICTURETEXT ="" PORTTYPE ="OUT" PRECISION ="3" SCALE ="0"/>
<TRANSFIELD DATATYPE ="string" DEFAULTVALUE ="" DESCRIPTION ="" FIELD ="PCT_RISE" PICTURETEXT ="" PORTTYPE ="IN" PRECISION ="7" SCALE ="0"/>

# awk -F'"' -v V="I_EMP_KEY" '{for(i=1;i<=NF;i++) if ($i==V) print $(i+6)":"$(i+8)}' tst
22:5

or

# awk -F'"' -v V="EMP_KEY" '{for(i=1;i<(NF-8);i++) if ($i==V) print $(i+6)":"$(i+8)}' tst
15:0
1 Like

Stick a double quote just before the variable; that way you are matching against "EMP_KEY and I_EMP_KEY won't be a problem. You can put another double quote behind it for the same reason (won't match EMP_KEY_2).

 awk  '/"'$frm_inst'"/ {for(i=1;i<=NF;i++){if($i~"PRECISION|SCALE") { print gensub(/[^0-9]/,"","g",$(i+1)); } } }'

Hi,

Suberb guys!! :D. Thanks a ton!!

-dips