I have the following 4 column data, which is a output of db2 select query:
Need to pivot the data.
sample Input:
Year Month Country Counts
2012 Aug Canada 114
2012 Aug USA 92
2012 Aug Mexico 3
2012 Aug Aus 20
2012 Sep India 158
2012 Sep Canada 106
2012 Sep USA 89
2012 Sep Mexico 3
2012 Sep Aus 13
2012 Oct India 1571953
2012 Oct Canada 1387428
2012 Oct USA 1330861
2012 Oct Mexico 75663
2012 Oct Aus 143765
2012 Oct Alpha 1
2012 Nov India 391
2012 Nov Canada 657
2012 Nov USA 894
2012 Nov Mexico 360
2012 Nov Aus 1034
Desired Output:
2012 Aug 2012 Sep 2012 Oct 2012 Nov
India 178 158 1571953 391
Canada 114 106 1387428 657
USA 92 89 1330861 894
Mexico 3 3 75663 360
Aus 22 13 143765 1034
Alpha 0 0 1 0
awk '{if(!X[$1,$2]++){ # Here we remove duplicates of date as $1 and $2. So it will take only single entry for any month.
FIR=FIR?FIR"_"$1" "$2:$1" "$2} # Here we append date $1 and $2 to variable FIR using FS as _.
{Y[$1,$2,$3]=$4;Z[$3]++}} # Here we Assign $4 to array using index as $1,$2,$3 and add country's to array Z. In array Z only unique countries will be added.
END{n=split(FIR,FCL,"_"); # Here we split FIR var(all unique dates) to variable FCL using FS as _ and n = number dates present in FIR variable.
for(i=1;i<=n;i++){s=s?s"\t"FCL:"\t"FCL} # Here we just append all the unique months to the variable s.
print s; # print variable s on first line.
for(j in Z){s=j; # Get values from array Z and assign it to s. array Z contains unique country names
for(i=1;i<=n;i++){ # Use for loop using variable n(which we got from spiting variable FIR)
split(FCL,PK," ") # Split variable FCL(which is has year and month) to array PK.
p=Y[PK[1],PK[2],j]?Y[PK[1],PK[2],j]:0 # Here we check array Y presents or not. PK[1] = Year, PK[2] = Month and j = Country. If for that particular field we have array value then assign it to variable s and if not then assign it to 0
s=s"\t\t"p}; # then append it variable s
print s # Print variable s
}}' file # Read file