Hi,
Could somebody help me with code that passes each column through a command or set of commands, one column at a time? Like this:
cat data:
4 89 87
5 3 89
10 82 4
39 10 39
100 98 9
1 4 3
Code:
awk '{print $1}' data | sort | gstat > group1
awk '{print $2}' data | sort | gstat > group2
awk '{print $3}' data | sort | gstat > group3
Of course, the right code will not be three separate commands like these; it will be a single piece of code that goes through each column and sends each result to a different file.
Your code is essentially the same as mine. The problem is that if I have 100 columns, I do not want to write out each column by hand. I am looking for code that goes through each column one after the other, without me having to do it myself.
Thanks
#!/bin/sh
# Tmp files are created below for further use, and the awk script
# appends to them, so start by deleting any left over from a previous
# run in the same directory.
F=tmp
rm -f "$F"*
# awk script to separate the columns into one tmp file each.
# It also prints the number of columns, stored in n for later use.
n=$(awk -v f="$F" '
{x=0; while(x++<NF)A[x,FNR]=$x}
END{
	for(i=1;i<x;i++){
		for(j=1;j<=FNR;j++)print A[i,j] >> (f i)
	}
	print i-1
}
' data)
# Sort each tmp file into its group file, using n to count the tmp files
i=0
while [ "$((i+=1))" -le "$n" ]; do
	sort -n "$F$i" > "group$i"
done
exit 0
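If every row of the data file has the same number of fields, the same per-column split can be sketched without any temp files at all. This loop is only an illustration of the idea, not the script above: it reruns the question's awk | sort pipeline once per column, with the column count read from the first row.

```shell
#!/bin/sh
# Sketch: one pass per column, no temp files.
# Assumes every row of "data" has the same number of fields.
n=$(awk 'NR == 1 { print NF; exit }' data)  # column count from row 1
i=0
while [ "$((i += 1))" -le "$n" ]; do
	awk -v c="$i" '{ print $c }' data | sort -n > "group$i"
done
```

To match the original pipeline exactly, insert gstat (or whatever filter you need) between sort and the redirection.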
# Awk code for transposing and creating group files
awk 'BEGIN{t=1}
{
	for(i=1;i<=NF;i++)
		a[NR,i] = $i
}
NF>nf { nf = NF }
END {
	for (i=1;i<=nf;i++)
		for (j=1;j<=NR;j++) {
			file="group"t
			print a[j,i] > file
			t=(j==NR)?++t:t
		}
}' file
# Sorting the data for each group file
for file in group*
do
	sort -n "$file" > tmp; mv tmp "$file"
done
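As a quick sanity check of what a correctly built group file should contain, here is the expected result for column 2 of the sample data from the question (the printf line just recreates that file):

```shell
#!/bin/sh
# Recreate the sample "data" file from the question, then show what
# group2 should hold: column 2 in numeric order.
printf '%s\n' '4 89 87' '5 3 89' '10 82 4' '39 10 39' '100 98 9' '1 4 3' > data
awk '{ print $2 }' data | sort -n
# prints: 3 4 10 82 89 98 (one value per line)
```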
The following is similar to bipinajith's shell script, but it doesn't create the temp files, doesn't include empty lines in the data to be sorted if some rows in the data file have more fields than others, feeds the sorted output into your gstat command, and uses 3 digits in the group file names instead of a variable number of digits. If you have 100 columns in your data file, there is also a chance that bipinajith's script will run out of file descriptors on many implementations of awk.
Try:
awk '
{ for(i = 1; i <= NF; i++) a[NR,i] = $i
if(NF > nf) nf = NF
}
END { for(f = 1; f <= nf; f++) {
cmd = sprintf("sort -n | gstat > group%03d", f)
for(i = 1; i <= NR; i++)
if((i,f) in a)
printf("%s\n", a[i,f]) | cmd
close(cmd)
}
}' data
As always, if you're using a Solaris/SunOS system, use /usr/xpg4/bin/awk or nawk instead of awk.
Note that I've never heard of the gstat command, but as long as it is on your command search path, this should work. (When I tested it, I just used a shell script named gstat that reported that it had been called and listed the contents of the data it found on its standard input.)
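For reference, a test stub along those lines is only a few lines of shell. This is just a guess at the shape of such a stand-in, not the real gstat:

```shell
#!/bin/sh
# Hypothetical gstat stand-in for testing the pipeline above:
# announce that it was called, then echo back whatever arrives
# on standard input.
echo "gstat was called with: $*"
sed 's/^/  stdin: /'
```

Make it executable (chmod +x gstat) and put its directory first in PATH so the awk pipeline picks it up instead of any real command.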