Bash variable expansion in awk script

Hello,
I need to split a file into two files in a different location, using output redirection in awk.

cat infle
aaa     1 3
bbb     2 4
aaa     3 3
bbb     4 4
aaa     5 3
bbb     6 4

cat /storage/tmp/group_a.gtf
aaa     1 3
aaa     3 3
aaa     5 3

cat /storage/tmp/group_b.gtf
bbb     2 4
bbb     4 4
bbb     6 4

The directory for the resulting files should be passed in through a shell variable. I have tried:

MY_DIR="/storage/tmp"
awk -v MY_DIR="${MY_DIR}" '{if ($1 == "aaa")  {print $0 >> "${MY_DIR}/group_a.gtf"} else if ($1 == "bbb") {print  $0 >> "${MY_DIR}/group_b.gtf"}}' infle  #A typo was $7=="bbb"
MY_DIR="/storage/tmp";  awk '{if ($1 == "aaa") {print $0 >>  ENVIRON[\"MY_DIR\"]/group_a.gtf"} else if ($1 == "bbb") {print $0  >> "ENVIRON[\"MY_DIR\"]/group_b.gtf"}}' infle
awk -v  awk_var=${MY_DIR} '{if ($1 == "aaa") {print $0 >>  "awk_var/group_a.gtf"} else if ($1 == "bbb") {print $0 >>  "awk_var/group_b.gtf"}}' infle

but none of them worked; they failed with errors like: cmd. line:1: (FILENAME=aaa FNR=1) fatal: can't redirect to `awk_var/group_a.gtf' (No such file or directory)

I checked the threads in this forum:
How to pass a variable from shell to awk
Pass shell Variable to awk
but they did not help, probably because my expansion involves a directory path. However, it works if the path is hard-coded, which is not what I want: the path should come from a shell variable shared with other parts of the shell script.

awk  '{if ($1 == "aaa") {print $0 >> "/storage/tmp/group_a.gtf"} else  if ($1 == "bbb") {print $0 >> "/storage/tmp/group_b.gtf"}}'  infle

It is also interesting that with the following one-liner the expansion seemed to work only at every other press of ENTER. I could not understand why.

MY_DIR="/storage/tmp"; awk -v awk_var="$MY_DIR" '{print awk_var}'

/storage/tmp

/storage/tmp

/storage/tmp

 ......

1) What are the rules for expanding shell (bash) variables (especially ones holding a directory path) inside an awk script?
2) Why did my last one-liner work only every other time I hit ENTER? Thanks a lot!

MY_DIR="/storage/tmp"
awk -v myDir="${MY_DIR}" '{if ($1 == "aaa")  out=(myDir "/group_a.gtf"); else if ($7 == "bbb") out=(myDir "/group_b.gtf"); print >> out;close(out)} ' infile

1) It looks like vgersh99 has already shown you how to expand shell variables for use in awk. Note, however, that his code appends to your two output files; it does not replace them if they already exist. It also looks like he had a typo, looking for "bbb" in field 7 instead of field 1. And if an input line has some value other than "aaa" or "bbb" in the first field, that line will be written to whichever of the two output files the previous input line selected, or awk will fail with an error if such a value appears on the first input line. If that isn't what you want, you might want to try:

MY_DIR="/storage/tmp"
awk -v myDir="${MY_DIR}" '
$1 == "aaa"{ print > (myDir "/group_a.gtf")}
$1 == "bbb"{ print > (myDir "/group_b.gtf")}' infile

The above form should work with any standard awk as long as you don't specify more than 9 output files in your awk script.
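
As for why the original attempts failed: inside the single-quoted awk program the shell never expands ${MY_DIR}, and inside an awk string such as "awk_var/group_a.gtf" the name awk_var is just literal text, which is what the error message shows. The awk variable has to sit outside the quotes and be joined to the rest of the path by concatenation. The following is only a sketch to illustrate that; the scratch directory from mktemp and the env_a.gtf/env_b.gtf names are made up for the demo. It also shows the ENVIRON form, which needs the variable to be exported. One difference worth knowing: a value assigned with -v has backslash escape sequences processed, while ENVIRON delivers the value unchanged.

```shell
# Hypothetical scratch directory and input file, created only for this demo.
MY_DIR=$(mktemp -d)
infile="$MY_DIR/infile"
printf 'aaa 1 3\nbbb 2 4\naaa 3 3\nbbb 4 4\naaa 5 3\nbbb 6 4\n' > "$infile"

# Variant 1: pass the shell value with -v; awk concatenates by juxtaposition.
awk -v myDir="$MY_DIR" '
$1 == "aaa" { print > (myDir "/group_a.gtf") }
$1 == "bbb" { print > (myDir "/group_b.gtf") }' "$infile"

# Variant 2: export the variable and read it through ENVIRON.
export MY_DIR
awk '
$1 == "aaa" { print > (ENVIRON["MY_DIR"] "/env_a.gtf") }
$1 == "bbb" { print > (ENVIRON["MY_DIR"] "/env_b.gtf") }' "$infile"

# Both variants should have written identical files.
cmp "$MY_DIR/group_a.gtf" "$MY_DIR/env_a.gtf" &&
cmp "$MY_DIR/group_b.gtf" "$MY_DIR/env_b.gtf" &&
echo "outputs match"
```

Which variant to use is mostly a matter of taste; -v keeps the dependency visible on the command line, while ENVIRON avoids escape processing of the value.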

2) Your one-liner worked every time you hit ENTER. Since no input file was given, awk read its input from the terminal. The blank lines you see are the terminal's echo of the empty lines you typed, and each /storage/tmp line is the output of the print statement in your awk script for one of those input lines.

This might be more obvious if you used:

MY_DIR="/storage/tmp"; printf '\ntext to be ignored by awk\n\n' | awk -v awk_var="$MY_DIR" '{print awk_var}'
/storage/tmp
/storage/tmp
/storage/tmp
$ 
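
If the goal is only to check that the variable arrived in awk, a BEGIN rule may be the simpler test. A minimal sketch, assuming nothing beyond the variable names used above (the shell variable out is introduced here just to hold the result): a program consisting only of BEGIN rules does not read any input, so awk prints once and exits without waiting for ENTER.

```shell
MY_DIR="/storage/tmp"

# A BEGIN rule runs before any input is read; with no other rules
# in the program, awk exits without touching standard input.
out=$(awk -v awk_var="$MY_DIR" 'BEGIN { print awk_var }')
echo "$out"
```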