awk to indent file output

SkySmart · August 10, 2012, 7:07pm

I have a file that contains information such as this:

hostname.sky.net     ===     12.39.59.35
hostname.sky.net     ===     12.39.59.35
hostname.sky.net     ===     12.39.59.35
hostname-newyork.sky.net      ====      13.45.35.24
hostname-newyork.sky.net      ====      13.45.35.24
hostname-newyork.sky.net      ====      13.45.35.24

As you can see here, because the hostname containing "newyork" is longer, it messes up the alignment of the output.

How can I make awk tab each line so the columns are perfectly aligned?

The preferred output would be something like:

hostname.sky.net                ====      12.39.59.35
hostname.sky.net                ====      12.39.59.35
hostname.sky.net                ====      12.39.59.35
hostname-newyork.sky.net        ====      13.45.35.24
hostname-newyork.sky.net        ====      13.45.35.24
hostname-newyork.sky.net        ====      13.45.35.24

I tried this:

awk '{printf "\t% s\t% s\t% s\t% s\n",$1,$2,$3,$4}'

but I'm sure it is just wrong.

Don_Cragun · August 10, 2012, 8:53pm

In your input the second field is sometimes "===" and sometimes "====", but you show "====" on all of your output lines. Assuming you want to maintain what is in the input fields and just align the output, the following might work:

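awk '   {
        # Save each field and track the widest value seen in fields 1 and 2
        f1[NR]=$1
        f2[NR]=$2
        f3[NR]=$3
        if (length($1) > max1) max1=length($1)
        if (length($2) > max2) max2=length($2)
}
END     {
        # Build a format like "%-24s %-4s %s\n" from the maximum widths
        fmt=sprintf("%%-%ds %%-%ds %%s\n",max1,max2)
        for (i=1; i<=NR; i++) printf(fmt,f1[i],f2[i],f3[i])
}' file...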

but I would be nervous about doing this for "large" input files.

SkySmart · August 10, 2012, 9:39pm

Sorry about that. I meant for all of them to be "====".

But why would you be nervous about doing this for large input files? Is there a better solution?

Don_Cragun · August 10, 2012, 11:38pm

That makes it a little bit easier. You no longer have to save the $2 values as you read each record or calculate the maximum field width for $2. The output format string can just contain a literal "====" for the 2nd field instead of printing a string padded to the width of the longest $2 input value.
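
A minimal sketch of that simplification might look like the following. The wrapper function name align_fixed_sep is made up for illustration, and the script assumes the address is always the third whitespace-separated field. It still stores every line until END, so the memory concern is unchanged.

```shell
# Sketch only: align field 1, print a literal "====", then field 3.
# align_fixed_sep is a hypothetical name; input is assumed to be
# whitespace-separated with the address in field 3.
align_fixed_sep() {
    awk '
    {
        f1[NR] = $1                               # save the hostname
        f3[NR] = $3                               # save the address
        if (length($1) > max1) max1 = length($1)  # widest hostname so far
    }
    END {
        # Build a format such as "%-24s ==== %s\n" from the widest hostname
        fmt = sprintf("%%-%ds ==== %%s\n", max1)
        for (i = 1; i <= NR; i++) printf(fmt, f1[i], f3[i])
    }' "$@"
}
```

It can be run as "align_fixed_sep file" or at the end of a pipeline, since awk reads standard input when no file operand is given.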

What counts as a "large" file depends on the size of your input file, the amount of memory on the machine, and the load on the machine. The script above keeps every line in arrays until the END block runs, so its memory use grows with the input. If you can figure out the maximum field widths for each field before you start reading the data and can build in the output format string instead of computing it on the fly, you don't have to store the entire file in awk's address space.
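
Two rough sketches of how that could look, both printing each line as soon as it is read. The function names align_stream and align_two_pass are hypothetical. The first assumes you already know a safe column width and pass it in. The second is a swapped-in alternative, not something proposed earlier in this thread: it reads the file twice, once to measure and once to print, so it needs a regular file rather than a pipe.

```shell
# Sketch 1: width known in advance, single pass, nothing stored.
# Usage (hypothetical): align_stream WIDTH [file...]
align_stream() {
    width=$1
    shift
    awk -v w="$width" '
    BEGIN { fmt = sprintf("%%-%ds ==== %%s\n", w) }  # e.g. "%-32s ==== %s\n"
    { printf(fmt, $1, $3) }' "$@"
}

# Sketch 2: read the same file twice instead of storing it.
# Usage (hypothetical): align_two_pass file
align_two_pass() {
    awk '
    NR == FNR { if (length($1) > max1) max1 = length($1); next }  # pass 1: measure
    FNR == 1  { fmt = sprintf("%%-%ds ==== %%s\n", max1) }        # start of pass 2
    { printf(fmt, $1, $3) }                                       # pass 2: print
    ' "$1" "$1"
}
```

For example, "align_stream 32 file" pads every hostname to 32 columns, while "align_two_pass file" pads to the longest hostname actually present. Hostnames longer than the chosen width in the first sketch are not truncated, so they would push their line out of alignment.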