Script changing row to column

Indra2011 · January 12, 2017, 11:57am

Hi Gurus,

I have an I/P file which looks like

I would liek to get O/P as

100 200 300
 1     1    4
 2     3    4
 9     8    7

Is it possible

Thanks,

RavinderSingh13 · January 12, 2017, 1:27pm

Hello Indra2011,

Could you please try following and let me know if this helps you.

awk '{if(!($1 in A)){C[++o]=$1;};++A[$1];B[$1,A[$1]]=$NF;P=P>A[$1]?P:A[$1]} END{for(f=1;f<=o;f++){printf("%d\t",C[f])};print X;for(j=1;j<=P;j++){for(l=1;l<=o;l++){printf("%d\t",B[C[l],j])};print X}}'   Input_file

Output will be as follows.

100    200    300    
1    1    4    
2    3    4    
9    8    7

EDIT: Adding a non-one liner form of solution too successfully now.

awk '{
        if(!($1 in A)){
                        C[++o]=$1;
                      };
        ++A[$1];
        B[$1,A[$1]]=$NF;
        P=P>A[$1]?P:A[$1]
     }
    END{
        for(f=1;f<=o;f++){
                                printf("%d\t",C[f])
                         };
        print X;
        for(j=1;j<=P;j++){
                                for(l=1;l<=o;l++){
                                                        printf("%d\t",B[C[l],j])
                                                 };
                         print X
                         }
       }
    '   Input_file

Thanks,
R. Singh

Scrutinizer · January 12, 2017, 1:49pm

Another one you could try:

awk '
  {
    if($1 in A) {
      if(!c) c=p
    } 
    else 
      h=h $1 OFS
  } 
  c && !(p%c) {
    if(h) print h
    print s
    h=s=x
  }
  {
    A[$1]
    s=s $2 OFS
    p=FNR
  }
  END {
    print s
  }
' OFS='\t' file

MadeInGermany · January 12, 2017, 2:32pm

@Ravinder, caution: (!A[$1]) is true if the contents is 0 . And has the side effect that it defines the A[$1] element (having no value).
A pure lookup (!($1 in A)) is often preferrable.

---------- Post updated at 14:32 ---------- Previous update was at 13:58 ----------

Here is my awk solution, with comments.

awk '
BEGIN { format=" %-5s" }
function printA() {
# print the A indices once
  if (!header) {
    header=1
    for (i in A) printf format,i
    printf "\n"
  }
# print the A values
  for (i in A) printf format,A
  printf "\n"
}

($1 in A) {
# repetition detected, print+clear the array
  printA()
  split("",A)
}

# store $2 in $1-indexed array
{ A[$1]=$2 }

# at the end print the array
END { printA() }
' inputfile

Indra2011 · January 12, 2017, 2:54pm

All of the Gurus, many thanks for your help

---------- Post updated at 03:41 PM ---------- Previous update was at 03:36 PM ----------

Hi Scrutinizer.

Could you be kindly explain what you did in here?

tHanks

---------- Post updated at 03:54 PM ---------- Previous update was at 03:41 PM ----------

scrutinizer:

Another one you could try:

awk '
  {
   if($1 in A) {
   if(!c) c=p
   } 
   else 
   h=h $1 OFS
  } 
  c && !(p%c) {
   if(h) print h
   print s
   h=s=x
  }
  {
   A[$1]
   s=s $2 OFS
   p=FNR
  }
  END {
   print s
  }
' OFS='\t' file

Could you be kindly explain what you did in here?

Don_Cragun · January 12, 2017, 9:21pm

Here is a copy of Scrutinizer's script with comments added. I hope this will help you understand how it works. Let us know if there is still something you don't understand after
looking at these comments (along with the awk man page on your system).

awk '			# Use awk to process the following script:
  {
    if($1 in A) {	# If the value in the 1st field has been seen before
      if(!c) c=p	#   if c has not been set, set it to p.  (Since p is
			#   later set to the previous input line number and $1
			#   will have been seen before and c will not have been
			#   set only when we see the 2nd occurrence of the 1st
			#   value in the 1st field, this sets c to the number
			#   of different values that appear in the 1st field.)
    } 
    else 		# otherwise (the value in the 1st field has not been
			# seen)
      h=h $1 OFS	#   add the 1st field and a field separator to the
			#   header string.
  } 
  c && !(p%c) {		# If c is non-zero and p is an even multiple of c
    if(h) print h	#   if h (the header line) is not an empty string print
			#   it,
    print s		#   print the string of accumulated $2 values, and
    h=s=x		#   clear the header line and the string of accumulated
			#   $2 values.
  }
  {
    A[$1]		# Record that we have seen this $1 value before,
    s=s $2 OFS		# add the current $2 value to the string of accumulated
			# $2 values, and
    p=FNR		# set p to the current input line number.
  }
  END {
    print s		# When we hit EOF on the input file, print the last
			# string of accumulated $2 values.
  }
' OFS='\t' file		# Mark the end of the awk script text, set the output
			# field separator to a tab and name the input file to
			# be processed.

Indra2011 · January 12, 2017, 11:50pm

Don,
Many many thanks Don for your detailed explaining. Excuse my lack of knowledge on this. I have couple of questions ---

what is

in this script?
2. what is this this logic for

If c is non-zero and p is an even multiple of c

in terms of real input file?

Sorry to bother you again on this
Thanks

Don_Cragun · January 13, 2017, 12:33am

I don't know how to be much clearer than the comment I made before on this:

    if($1 in A) {	# If the value in the 1st field has been seen before
      if(!c) c=p	#   if c has not been set, set it to p.  (Since p is
			#   later set to the previous input line number and $1
			#   will have been seen before and c will not have been
			#   set only when we see the 2nd occurrence of the 1st
			#   value in the 1st field, this sets c to the number
			#   of different values that appear in the 1st field.)

  c && !(p%c) {		# If c is non-zero and p is an even multiple of c

With your sample file, there are three distinct values in the 1st field: 100 , 200 , and 300 so c will be set to the line number in the input file that contained the last one of these the 1st time it was seen (which is the 300 on line 3). Since c is 3, the actions in this section will be performed on lines 3, 6, 9, 12, ... (i.e., each time three lines have been read) which gives you a complete line of output to print after reading 3 input lines.

Indra2011 · January 13, 2017, 10:41am

don cragun:

I don't know how to be much clearer than the comment I made before on this:
   if($1 in A) {	# If the value in the 1st field has been seen before
   if(!c) c=p	#   if c has not been set, set it to p.  (Since p is
			#   later set to the previous input line number and $1
			#   will have been seen before and c will not have been
			#   set only when we see the 2nd occurrence of the 1st
			#   value in the 1st field, this sets c to the number
			#   of different values that appear in the 1st field.)
  c && !(p%c) {		# If c is non-zero and p is an even multiple of c
With your sample file, there are three distinct values in the 1st field: 100 , 200 , and 300 so c will be set to the line number in the input file that contained the last one of these the 1st time it was seen (which is the 300 on line 3). Since c is 3, the actions in this section will be performed on lines 3, 6, 9, 12, ... (i.e., each time three lines have been read) which gives you a complete line of output to print after reading 3 input lines.

Million Thanks Don.