Don't understand AWK asort behaviour

jgilot · November 23, 2011, 8:21am

Hello,

I have the following script:

BEGIN {
print "1 ***";
  split("abc",T,"");
  T[5]="e";
  T[26]="z";
  T[25]="y";
  for (i in T) printf("%i:%s ",i,T[i]);  print "";
  for (i=1; i<=length(T); i++) printf(T[i]); print ""

print "2 ***";
  asort(T,U);
  for (i in U) printf("%i:%s ",i,U[i]);  print "";
  for (i=1; i<=length(U); i++) printf(U[i]); print ""
}

and I don't understand the line following "2 ***" in the resulting output:

$ awk -f tst
1 ***
26:z 5:e 1:a 2:b 25:y 3:c 
abceyz
2 ***
26:z 17: 4: 18: 5: 19: 6: 7: 8: 9: 10: 20: 11: 21:a 12: 22:b 13: 23:c 14: 1: 24:e 15: 2: 25:y 16: 3: 
abceyz

I thought the result should have been: 1:a 2:b 3:c 4:e 5:y 6:z.

Where am I wrong?

binlib · November 23, 2011, 1:57pm

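The loop

for (i=1; i<=length(T); i++) printf(T[i]);

filled the array T with many empty elements. At the start, length(T) was 6. When i reached 4, the reference to T[4] created that element, so length(T) became 7. The same thing happened for every other missing index, and the loop only stopped once T held 26 elements, 20 of them empty. asort then sorted those 20 empty strings ahead of a, b, c, e, y and z.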
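
As a rough illustration of the effect described above, the sketch below counts the elements of T before and after the same kind of loop. The shell wrapper count_demo and the helper count() are made up for this example and are not part of the original script. count() stands in for length(T), which not every awk accepts on arrays, and the array is filled by direct assignment rather than with split.

```sh
# Hypothetical demo: element count of T before and after the counting loop.
count_demo() {
  awk '
    # count(): portable element count, used here in place of length(array)
    function count(A,   k, n) { n = 0; for (k in A) n++; return n }
    BEGIN {
      T[1] = "a"; T[2] = "b"; T[3] = "c"; T[5] = "e"; T[25] = "y"; T[26] = "z"
      before = count(T)
      # Reading T[i] for a missing i creates that element, so the loop
      # bound grows while the loop is running.
      for (i = 1; i <= count(T); i++) s = s T[i]
      print before, count(T), s
    }'
}
count_demo
```

Run as written, this should print "6 26 abceyz": six elements before the loop, 26 after it, and the same visible string as in the original output because the 20 new elements are empty.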

fpmurphy · November 23, 2011, 6:00pm

In AWK, arrays are associative and array subscripts are always strings. Arrays can be sparse, as in your example. If you refer to an array element that has no recorded value, e.g. T[10], the value returned is "", the null string, and the element is created as a side effect.

You can test whether a particular index exists, without the side effect of creating that element if it is not already present, by using an

index in array

expression.
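
One possible way to apply that test to the loop from the question is sketched below. It is only an illustration under some assumptions: the wrapper safe_demo, the helper count() and the variable max are names invented here, and the loop runs up to the largest index found in T instead of up to the element count.

```sh
# Hypothetical demo: walk a sparse array in index order without creating elements.
safe_demo() {
  awk '
    # count(): portable element count, used here in place of length(array)
    function count(A,   k, n) { n = 0; for (k in A) n++; return n }
    BEGIN {
      T[1] = "a"; T[2] = "b"; T[3] = "c"; T[5] = "e"; T[25] = "y"; T[26] = "z"
      # Find the largest numeric index once, before the loop starts.
      max = 0
      for (k in T) if (k + 0 > max) max = k + 0
      # (i in T) only tests membership; it does not create T[i].
      for (i = 1; i <= max; i++)
        if (i in T) s = s (s == "" ? "" : " ") i ":" T[i]
      print s
      print count(T)
    }'
}
safe_demo
```

This should print "1:a 2:b 3:c 5:e 25:y 26:z" followed by 6, showing that T still holds only its six original elements after the loop. With T left untouched like this, a later asort(T,U) in gawk should give U exactly six elements, with the sorted values at indices 1 to 6.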

For more information, see Chapter 11 of Effective AWK Programming by Arnold Robbins.

jgilot · November 23, 2011, 6:39pm

Hi all,

Thanks