Bash formatting data into columns

mccmjc · December 16, 2014, 10:43am

Hi guys,

I'm trying to create a table of aggregated data using just bash commands. My data is in three columns, for example:

 2014-01-01  testA  64
 2014-01-01  testB  27
 2014-02-01  testA  31
 2014-02-02  testB  29
 2014-02-02  testC  12

And the result I am looking for is:

 2014-01-01  64    27    null
 2014-02-01  31    null  null
 2014-02-02  null  29    12

where columns 2, 3 and 4 map to testA, B or C.

So in effect, the first column is the y-axis, the next column is the x-axis label, and the final column is the value.

In other words, I'm trying to group on the first two columns using the third column as the value.

I'm thinking an array might be the best way to solve this problem, but I can't get my head around where to start.

Any ideas anyone?

Thanks,

Chris

RudiC · December 16, 2014, 11:02am

Try

awk     '       {LN[$1]; HD[$2]; MX[$1,$2]=$3}
         END    {               printf "%10s", ""; for (i in HD) printf "%10s", i; print "";
                 for (j in LN) {printf "%10s",j;   for (i in HD) printf "%10s", MX[j,i]; print ""}
                }
        ' file
               testA     testB     testC
2014-01-01        64        27          
2014-02-01        31                    
2014-02-02                  29        12

joeyg · December 16, 2014, 11:08am

Just some crude looking code, for a possible start (without awk)

copy file to file1
// loop thru while data in file
while ($wc -l file1 > 0)
 do
// extract data from that first column
 matchf = head -1 file1 | cut -f1
// pull out matching lines
 grep $matchf < file1 > file3
// now temp file3 has common entries for first unique column value
// put these into your column format
// write these to file4
//
// copy everything but that first column match to file2
 grep -v $matchf <file1 >file2
// copy temp file back to working file1
 cp file2 file1
 done

Yoda · December 16, 2014, 4:29pm

RudiC's approach in bash using Associative Arrays:-

#!/bin/bash

declare -A LN
declare -A HD
declare -A MX

while read v1 v2 v3
do
        LN["$v1"]="$v1"
        HD["$v2"]="$v2"
        MX["${v1}${v2}"]="$v3"
done < filename

for j in "${LN[@]}"
do
        printf "%10s" "$j"
        for k in "${HD[@]}"
        do
                [ -z "${MX[${j}${k}]}" ] && printf "%10s" "null" || printf "%10s" "${MX[${j}${k}]}"
        done
        printf "\n"
done

mccmjc · December 17, 2014, 5:03am

Thanks guys, it's amazing how many times this requirement has come up, and I've never been able to figure out how to approach the problem.

Appreciate the quick assistance from you all.

Cheers,

Chris