Transpose rows to columns complex

unme · December 7, 2014, 9:29am

Input:

IN,A,1
IN,B,3
IN,B,2
IN,C,7
BR,A,1
BR,A,5
BR,C,9
AR,C,9

Output:

CNTRY,A,B,C
IN,1,5,7
BR,6,0,9
AR,0,0,9

I've tried the below code, but unable to get the desired output, Please help me in this:

awk -F, '{a[$1","$2]+=$3} END {for (i in a) { print i "," a}}' input.txt

ritakadm · December 7, 2014, 10:32am

Not the most elegant solution, but try this,,untested..sorry i missed on the sum part...

awk -F, '{a[$1","$2]+=$3;}END{for(i in a)print i", "a;}' file > infile


awk     -F,  '{cntry[$1]; alpha[$2]; val[$1,$2]=$3}
         END    {               printf "%10s", ""; for (i in alpha) printf "%10s", i; print "";
                 for (j in cntry) {printf "%10s",j;   for (i in alpha) printf "%10s", val[j,i]; print ""}
                }
        '  infile

Scrutinizer · December 7, 2014, 12:42pm

Two suggested modifications to ritakadm's approach:

val[$1,$2]+=$3

and

val[j,i]+0

unme · December 7, 2014, 9:07pm

Modified slightly to delimited file:

awk     -F,  '{c[$1]; a[$2]; v[$1,$2]+=$3}
         END    {               printf "XX%s", ","; for (i in a) printf "%s,", i; print "";
                 for (j in c) {printf "%s",j;   for (i in a) printf ",%s", v[j,i]+0; print ""}
                }
        '   infile>infile2

sed '1 s/.$//' infile2>output

Here I am facing some difficulty, some times alphabetic order may vary:
A,B,C columns should be fixed always:

IN,A,1
IN,C,7
BR,A,1
MX,C,9

output should be:

XX,A,B,C
IN,1,0,7
BR,1,0,0
MX,0,0,9

Chubler_XL · December 7, 2014, 11:24pm

Not sure what order you want the columns in (order they appear in input or alphabetical):

This keeps order in input file:

awk     -F,  '{
            c[$1]
            if(!($2 in av)) {
               av[$2]
               a[++cols]=$2}
            v[$1,$2]+=$3
         }
         END {  
           printf "XX"
           for (i=1;i<=cols;i++) printf ",%s", a
           printf "\n"
           for (j in c) {
               printf "%s",j;
               for (i=1;i<=cols;i++)
                  printf ",%s", v[j,a]+0; print ""}
          }' infile>infile2

unme · December 8, 2014, 12:55am

The oder I am meaning is:

Per Country maximum rows I will get is 3 and minimum is 2.

For example, If I get only 2 rows per country above script will generate only 3 columns i.e.,

IN,A,5
IN,C,4
CA,A,4

I will get output as:

XX,A,C
IN,5,4
CA,4,0

Another example:

IN,A,5
IN,B,5
IN,C,4
CA,A,4

I will get output as:

XX,A,B,C
IN,5,5,4
CA,4,0,0

Column are varying based on the input, my aim to maintain the genric ie., if A & B records are exists in the source i need to populate C column with 0 value for all the rows.

If A & C records are exits in the source i need to populate B column with 0 value for all the rows.

Chubler_XL · December 8, 2014, 3:17am

how about this:

awk     -F,  '
BEGIN { cols=split("A,B,C", a);}
{ c[$1];v[$1,$2]+=$3 } 
END {
   printf "XX"
   for (i=1;i<=cols;i++) printf ",%s", a
   printf "\n"
   for (j in c) {
       printf "%s",j;
       for (i=1;i<=cols;i++)
       printf ",%s", v[j,a]+0; print ""}
}' infile>infile2

unme · December 8, 2014, 4:58am

Thanks a lot Its working Perfectly