Put repeated values into a table

ritakadm · November 25, 2013, 12:01pm

Hi all,

I have blocks of records like the following. Each block ends with = on a line of its own. I want to tabularize the entire output. The pattern is the same in every block, although not every type is present in every block.
For example, gine3 is absent from the second block but present in the first and third.

line1=ABC
pine2=XYZ
gine3=123 
=
line1=TYU
pine2=UYT
=
line1=BCD
pine2=GHY
gine3=786
tine4=RTY

Desired output:

line1 pine2 gine3 tine4
ABC XYZ 123 
TYU UYT 
BCD GHY 786 RTY

bartus11 · November 25, 2013, 12:42pm

Try:

awk -F"=" 'BEGIN{OFS=" ";print "line1 pine2 gine3 tine4"}$0=="="{print x;x=""}{x=(x?x" ":"")$2}END{print x}' file

ritakadm · November 25, 2013, 12:46pm

Thank you, but my columns are not restricted to only 4; there may be 20 or more. Can we make the header variable? It should have as many columns as the largest block has lines.

Akshay_Hegde · November 25, 2013, 1:10pm

For the given input, this will also work:

$ awk '!A[$1]++{s = s ? s OFS $1 : $1}{B[++i]=$2}END{print s; for(j=1;j<=i;j++)printf B[j]"%s", B[j] !~ /[[:alnum:]]/ || j == i ? RS : OFS }' FS="=" file

--edit--

$ awk '!A[$1]++{s = s ? s OFS $1 : $1}{p = !/^\=/ ? p ? p OFS $2 : $2 : p RS }END{gsub(/\n[[:space:]]/,"\n",p);print s RS p}' FS="=" file

line1 pine2 gine3  tine4
ABC XYZ 123  
TYU UYT 
BCD GHY 786 RTY

RavinderSingh13 · November 25, 2013, 1:17pm

Hello Akshay,

Thanks for the great code. Could you please explain it?

Thanks,
R. Singh

Akshay_Hegde · November 25, 2013, 1:47pm

awk '!A[$1]++{s = s ? s OFS $1 : $1}.... --> The first time each column 1 value is seen, it is appended to the variable s, so s ends up holding the unique column 1 values (the header)

{B[++i]=$2} --> Array B stores the column 2 value of every line, in input order

and finally, in the END block:

END{print s;.. --> Prints the unique column 1 values held in s

for(j=1;j<=i;j++) --> Loops from 1 to i

B[j] !~ /[[:alnum:]]/ || j == i ? RS : OFS --> If B[j] contains no alphanumeric character (the line was a lone "=") or j equals i (the last entry), print B[j] followed by the record separator RS ("\n"); otherwise print B[j] followed by the output field separator OFS
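
For anyone who finds the one-liner dense, the same steps can be sketched in Python. This is only an illustration of the logic explained above, not a translation of the awk code; the function name flatten_blocks and the inline sample list are made up for the example, and stripping whitespace around values is an added assumption.

```python
def flatten_blocks(lines):
    """Collect unique keys as the header and stream values into rows."""
    seen = set()     # plays the role of the awk array A
    header = []      # plays the role of the string s
    rows = [[]]      # values per block, roughly what B holds
    for line in lines:
        text = line.strip()          # assumption: surrounding whitespace is noise
        if not text:
            continue                 # assumption: blank lines are ignored
        if text == "=":
            rows.append([])          # a lone "=" closes the current block
            continue
        key, _, value = text.partition("=")
        if key not in seen:          # first time this key is seen
            seen.add(key)
            header.append(key)
        rows[-1].append(value)
    return header, rows


sample = ["line1=ABC", "pine2=XYZ", "gine3=123 ", "=",
          "line1=TYU", "pine2=UYT", "=",
          "line1=BCD", "pine2=GHY", "gine3=786", "tine4=RTY"]
header, rows = flatten_blocks(sample)
print(" ".join(header))
for row in rows:
    print(" ".join(row))
```

Note that values are appended in the order they appear, so a block missing a key in the middle (pine2, say) would have its later values shift left under the wrong header. For the sample input this does not matter, because only trailing keys are missing.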

RudiC · November 25, 2013, 2:24pm

Try this on a larger file and come back with results:

awk -F= 'BEGIN          {LnCnt=0; HdCnt=0}
         !$1 && !$2     {LnCnt++; next}
                        {for (i=0; i<HdCnt; i++) if (HD[i]==$1) break
                         if (i==HdCnt) HD[HdCnt++]=$1
                         A[LnCnt,$1] = $2}

         END            {for (i=0; i<HdCnt; i++) printf "%s\t", HD[i]; printf "\n"
                         for (j=0; j<=LnCnt; j++)
                           {for (i=0; i<HdCnt; i++) printf "%s\t", A[j,HD[i]]; printf "\n"}
                        }
        ' OFS="\t" file
line1    pine2    gine3    tine4    
ABC    XYZ    123         
TYU    UYT            
BCD    GHY    786    RTY
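
The idea behind this version (a header list in first-seen order plus a lookup keyed by block number and column name) keeps columns aligned even when a block lacks a key. A rough Python sketch of that idea follows, assuming tab-separated output; the function name tabulate and the sample list are hypothetical and not part of the awk script.

```python
def tabulate(lines, sep="\t"):
    """Build aligned rows from key=value blocks separated by lone '=' lines."""
    headers = []     # column names in first-seen order (HD in the awk script)
    cells = {}       # (block number, key) -> value (A in the awk script)
    block = 0        # current block number (LnCnt in the awk script)
    for line in lines:
        text = line.strip()          # assumption: surrounding whitespace is noise
        if not text:
            continue                 # assumption: blank lines are ignored
        if text == "=":
            block += 1               # separator line: start the next block
            continue
        key, _, value = text.partition("=")
        if key not in headers:
            headers.append(key)
        cells[block, key] = value
    out = [sep.join(headers)]
    for b in range(block + 1):
        # a missing (block, key) pair becomes an empty cell, keeping columns aligned
        out.append(sep.join(cells.get((b, h), "") for h in headers))
    return out


sample = ["line1=ABC", "pine2=XYZ", "gine3=123 ", "=",
          "line1=TYU", "pine2=UYT", "=",
          "line1=BCD", "pine2=GHY", "gine3=786", "tine4=RTY"]
table = tabulate(sample)
print("\n".join(table))
```

The header grows as new keys turn up, so 20 or more columns need no change to the code. The membership test on a list is linear, like the loop over HD in the awk script, which should be fine for a few dozen columns.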

summer_cherry · November 26, 2013, 3:10am

python

import re
counter=1
keys={}
values={}
with open("a.txt") as file: 
 for line in file:
  line=line.replace("\n","")
  if re.match('=',line):
   counter+=1
   continue  
  items = line.split("=")
  keys[items[0]]=1
  values.setdefault(counter,{})[items[0]]=items[1]
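# header row: keys ordered by their numeric suffix (line1, pine2, ...)
print(" ".join(sorted(keys,key=lambda x: int(re.sub('[a-zA-Z]*','',x)))))
for i in sorted(values,key=lambda x: int(x)):
 for j in sorted(keys,key=lambda x: int(re.sub('[a-zA-Z]*','',x))):
  # look the key up in block i; a missing key prints as an empty cell
  print(values[i].get(j,""),end=" ")
 print("")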

perl

my %keys;
my $counter=1;
my %values;
while(<DATA>){
	chomp;
	if (/^=$/){
		$counter++;
		next;
	}
	my @arr=split("=",$_);
	$keys{$arr[0]}=1;
	$values{$counter}->{$arr[0]}=$arr[1];
}
my @sorted_keys = sort {$a=~/(\d+)/;my $aa=$1;$b=~/(\d+)/;my $bb=$1; $aa<=>$bb} keys %keys;
print join " ", @sorted_keys;
print"\n";
for my $key (sort {$a <=> $b} keys %values){	# numeric sort so block 10 follows block 9
	for my $k (@sorted_keys){
		print $values{$key}->{$k}," ";
	}
	print "\n";
}
__DATA__
line1=ABC
pine2=XYZ
gine3=123 
=
line1=TYU
pine2=UYT
=
line1=BCD
pine2=GHY
gine3=786
tine4=RTY