Putting an awk variable name inside double quotes or between slashes in an awk script turns it into literal text, not the name of a variable to be expanded. Although extended regular expressions are usually written with slashes as delimiters, all that awk really requires on the right-hand side of ~ is a string: a constant string between double quotes, a constant regular expression between slashes, or an expression (such as a variable) that yields a string.
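A minimal sketch of the difference, assuming a made-up two-line input (the sample data and the shell variable names literal and dynamic are for illustration only):

```shell
# Hypothetical sample data: field 2 starts with a key followed by a colon.
data='a 1:100
b chr:200'

# /^chr:/ is a regex constant: it matches the literal text "chr",
# not the value of the awk variable chr.
literal=$(printf '%s\n' "$data" | awk -v chr=1 '$2 ~ /^chr:/')

# A string expression is used as a dynamic regex: "^" chr ":" becomes "^1:".
dynamic=$(printf '%s\n' "$data" | awk -v chr=1 '$2 ~ ("^" chr ":")')

printf 'literal: %s\ndynamic: %s\n' "$literal" "$dynamic"
```

With this input, the first command selects the line whose second field literally starts with chr:, while the second selects the line whose second field starts with 1:.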
Try:
for i in {1..22}
do
printf "$i\n"
awk -v chr=$i '
{
if ($2 ~ ("^" chr ":") )
{
print $0
}
}' more.txt
done
or, very slightly more efficiently:
for i in {1..22}
do
printf "$i\n"
awk -v ERE="^$i:" '
{
if ($2 ~ ERE)
{
print $0
}
}' more.txt
done
Did you consider using awk's default behaviour (a pattern without an action prints the matching line) to condense the script to:
for i in {1..22}
do printf "$i\n"
awk -v chr="^$i:" '$2 ~ chr' more.txt
done
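To check that the condensed form behaves like the explicit one, here is a small sketch on made-up data (the sample lines and the shell variable names long and short are assumptions, not taken from more.txt):

```shell
# Hypothetical sample data in the assumed layout: key before the colon in field 2.
data='a 1:100
b 2:200
c 1:300'

# Explicit form: test inside an action block, then print.
long=$(printf '%s\n' "$data" | awk -v chr='^1:' '{ if ($2 ~ chr) { print $0 } }')

# Condensed form: a bare pattern; the default action is to print the record.
short=$(printf '%s\n' "$data" | awk -v chr='^1:' '$2 ~ chr')

printf '%s\n' "$short"
```

Both variables should end up holding the same two lines, the ones whose second field starts with 1:.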
EDIT: Are you aware that your script creates 22 awk processes, each of which opens and reads more.txt, so the file is read 22 times? That's quite expensive in terms of resources. How about one single awk invocation and one single read of the file for all of them:
awk '
{TMP = $2
sub (/:.*$/, "", TMP)
BUF[TMP, ++CNT[TMP]] = $0
}
END {for (i=1; i<=22; i++) {print i
for (c=1; c<=CNT[i]; c++) print BUF[i, c]
}
}
' more.txt
awk '
{BUF[TMP=substr($2,1,index($2,":")-1), ++CNT[TMP]] = $0 # store the input file in memory with index based on $2 and in increasing order
}
END {for (i=1; i<=22; i++) {print i # create sequence No. (1 .. 22) and print it
for (c=1; c<=CNT[i]; c++) print BUF[i, c] # print the input for this sequence number - if it exists - in increasing order
} # if it does not exist, CNT[i] defaults to zero, and the loop is not entered.
}
' more.txt
You are right, that is a bit intricate...
We have the BUF array, which needs to be "multidimensional", i.e. indexed by two indices. awk doesn't provide real multidimensional arrays but approximates them with a "compound" index: the subindices of the different "dimensions" are written separated by commas and are concatenated into one string, joined by the value of the variable SUBSEP (cf. man awk ).
The first index is built from the beginning of $2 up to (but not including) the first : using the substr function, saving the result to the TMP variable at the same time for later use. awk allows an assignment to be used inside an expression like this.
The second index is just a pre-incremented (++ operator in front of the variable, cf. man awk ) element of the counter array CNT, indexed by that TMP .
So consecutive lines with identical TMP index (1, 3, and 22 in your above sample) will have an incremental / sequential integer second index.
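As a rough illustration of how the two subindices develop line by line, the following sketch prints them next to each input line. The three-line sample and the shell variable out are made up, and TMP is assigned in a separate statement here purely for readability:

```shell
# Hypothetical sample input (assumed layout: key before the colon in field 2).
data='x 3:10
y 1:20
z 3:30'

out=$(printf '%s\n' "$data" | awk '{
    TMP = substr($2, 1, index($2, ":") - 1)   # text before the first colon
    print TMP, ++CNT[TMP], $0                 # per-key counter, incremented before use
}')
printf '%s\n' "$out"
```

The first and third lines share the key 3, so they get the second indices 1 and 2, while the line with key 1 starts its own count at 1.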
You could demonstrate this behaviour by printing out the BUF array's indices:
for (b in BUF) print b
Please be aware that in awk , the order in which b traverses the indices of the array is not defined.
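If a predictable listing is wanted, one possible approach (a sketch, not the only way) is to split each compound index on SUBSEP and let sort order the result. The sample data, the array name part and the shell variable out are assumptions for illustration:

```shell
# Hypothetical sample input (assumed layout: key before the colon in field 2).
data='x 3:10
y 1:20
z 3:30'

out=$(printf '%s\n' "$data" | awk '
{
    TMP = substr($2, 1, index($2, ":") - 1)   # first subindex: key before the colon
    BUF[TMP, ++CNT[TMP]] = $0                 # second subindex: per-key counter
}
END {
    for (b in BUF) {                          # traversal order is unspecified
        split(b, part, SUBSEP)                # recover the two subindices
        print part[1], part[2]
    }
}' | sort -k1,1n -k2,2n)                      # impose a numeric order afterwards
printf '%s\n' "$out"
```

Sorting outside awk keeps the script portable; some awk implementations offer their own ways to control traversal order, but those are not assumed here.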