Find Max value of line and print

ncwxpanther · February 10, 2016, 1:37pm

I need to find the max value across all columns except the 1st column, and print it along with the 1st column of the line it came from.

Input

123xyz   0    0    1    2    0    0    0    0    0    0    0     
234xyz   0    0    0    0    0    0    0    0    0    0    0    
345xyz   0    0    1    0    0    0    0    0   -9   -9   -9

Expected Output

123xyz   2

I have this code, which finds the max over all values in the entire file, but it does not ignore the 1st column.

awk '{for(x=1;x<=NF;x++)a[++y]=$x}END{c=asort(a);print "max:",a[c]}'

RudiC · February 10, 2016, 1:58pm

Why don't you start at x=2 if you're not interested in $1?
However, try

awk '
BEGIN   {MAX=-1E100
        }
        {for (x=2; x<=NF; x++) if ($x>MAX)      {MAX = $x
                                                 C1  = $1
                                                }
        }
END     {print C1, MAX
        }
' file
123xyz 2

MadeInGermany · February 10, 2016, 3:02pm

The original code can be extended like this

awk '{ for(x=2;x<=NF;x++) {a[++y]=$x; b[$x]=$1} } END { c=asort(a); print "max:",b[a[c]],a[c] }' file

But it is more efficient to remember only the current maximum and its field 1, in two variables, like RudiC did.
If you do not like the initialization MAX=-1E100, then you can do

awk '
NR==1 {MAX=$2; C1=$1}
{ for (x=2; x<=NF; x++) if ($x>MAX) { MAX=$x; C1=$1 } }
END { print C1, MAX } 
' file

ncwxpanther · February 10, 2016, 3:13pm

Thanks. I got both of these to work.

My first column contains a lot of unnecessary information. Is there a way to split it into space-separated fields taken from characters 1-11, 13-16, 18-19 and 20-27?

Input

USW00003812Y2016M02TMAXLE32 0    0    1    2    0    0    0    0    0    0    0

Desired output

USW00003812 201602 TMAXLE32 2

MadeInGermany · February 10, 2016, 3:31pm

Instead of

{ print C1, MAX }

You can do

{ print substr(C1,1,11), substr(C1,13,4), substr(C1,18,2), substr(C1,20,8), MAX }
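
For reference, here is a minimal sketch of the whole thing run on the sample line above. The function name max_split_key is made up for illustration, and the offsets assume an 11-character station ID as in that sample. Year and month are joined without a comma so that they come out as 201602, as in the desired output.

```shell
#!/bin/sh
# Sketch: the max search from this thread plus the substr split, reading from stdin.
# max_split_key is a hypothetical name; the offsets assume an 11-character station ID.
max_split_key() {
    awk '
    NR == 1 { MAX = $2; C1 = $1 }
    { for (x = 2; x <= NF; x++) if ($x > MAX) { MAX = $x; C1 = $1 } }
    END {
        # No comma between year and month, so awk concatenates them (2016 02 -> 201602)
        print substr(C1,1,11), substr(C1,13,4) substr(C1,18,2), substr(C1,20,8), MAX
    }'
}

echo 'USW00003812Y2016M02TMAXLE32 0    0    1    2    0    0    0    0    0    0    0' | max_split_key
```

With a comma between the two substr calls, awk would print year and month as separate fields instead.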

ncwxpanther · February 11, 2016, 9:21am

I have the following code set up to analyze several thousand files in multiple directories. The output is only a single max across all the $file in each $dir.

I am expecting a max value for each of the $file for each $dir to be returned.

What am I missing?

#!/bin/bash

for dir in a b c x y z
do

for file in USW*
do
awk '
BEGIN   {MAX=-1E100
        }
        {for (x=2; x<=NF; x++) if ($x>MAX)      {MAX = $x
                                                 C1  = $1
                                                }
        }
END     {print substr(C1,1,11), substr(C1,13,4), substr(C1,18,2), substr(C1,20,8), MAX
        }
' /dir/of/files/$dir/$file
done

done

RudiC · February 11, 2016, 11:20am

It should return one result line per file. Please show the directory/file structure and/or what your two for loops make of it.
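
One way to see what the two loops make of it is to print the loop variable and the arguments awk would receive, instead of running awk. This is only a debugging sketch: show_loop_args is a made-up helper, and its first argument stands in for /dir/of/files.

```shell
#!/bin/bash
# Debugging sketch: print what the two loops would hand to awk instead of running awk.
# show_loop_args is a hypothetical helper; $1 stands in for /dir/of/files.
show_loop_args() {
    local base=$1 dir file
    shift
    for dir in "$@"; do
        for file in USW*; do
            echo "loop variable: $file"
            # Left unquoted on purpose, as in the script above, so the shell expands it again here
            printf 'awk argument: %s\n' $base/$dir/$file
        done
    done
}
```

Called as show_loop_args /dir/of/files a b c from the directory the script runs in. If that directory holds no USW* files, bash (with default options) leaves the pattern unexpanded, so the loop variable prints as the literal USW* and the inner loop runs only once per directory.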

ncwxpanther · February 11, 2016, 11:59am

Dir Structure

/a /b /c /x /y /z

File Name Structure in dir /a

USW000004444.prpc.eq.0.str
USW000004445.prpc.eq.0.str

File Contents

USW000004444Y1991M10PRCPEQ00  11    0    0    0    0    0    1    0    0    1    0    0    0    0    0    1    2    3    4    5    6 
USW000004444Y1991M11PRCPEQ00  10    0    0    0    0    0    1    0    0    1    0    0    0    0    0    1    2    3    4    5    -9

USW000004445Y1991M10PRCPEQ00  0    0    0    0    0    0    1    0    0    1    0    0    0    0    0    1    2    3    4    5    6 
USW000004445Y1991M11PRCPEQ00  -9    0    0    0    0    0    1    0    0    1    0    0    0    0    0    1    2    3    4    5    -9

Output

USW000004444 1991 10 PRCPEQ00 11

Expected Output

USW000004444 1991 10 PRCPEQ00 11
USW000004445 1991 10 PRCPEQ00 6

RudiC · February 11, 2016, 1:09pm

As there's no file matching USW* in the current directory, the pattern is NOT expanded by the for loop but passed as is into the awk command line: awk '...' a/USW*. Only there does the shell expand it, to ALL files in a, which awk then reads as one single stream: awk '...' a/USW000004444.prpc.eq.0.str a/USW000004445.prpc.eq.0.str, with one single MAX etc. value.
Try instead:

for dir in a b c
  do    for file in $dir/USW*
          do  awk ' 
                BEGIN   {MAX=-1E100
                        }
                        {for (x=2; x<=NF; x++) if ($x>MAX)      {MAX = $x
                                                                 C1  = $1
                                                                }
                        }
                END     {print substr(C1,1,11), substr(C1,13,4), substr(C1,18,2), substr(C1,20,8), MAX
                        }
                ' $file
           done
  done
USW00000444 Y199 M1 0PRCPEQ0 11
USW00000444 Y199 M1 0PRCPEQ0 6
awk: cannot open b/USW* (No such file or directory)
awk: cannot open c/USW* (No such file or directory)

(I guess your substr extracts one char too few...)
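
Both points from that output (the "cannot open" messages for directories without matches, and the shifted fields) can be handled in the loop itself. The following is a sketch under assumptions, not a tested drop-in: max_per_file is a made-up name, its first argument stands in for the base directory, and the key is assumed to always look like station ID, then Y and a 4-digit year, then M and a 2-digit month, then the element name. The split positions are taken from where that Y....M.. marker is found, so 11- and 12-character station IDs both work.

```shell
#!/bin/bash
# Sketch: one result line per file, skipping directories where the glob matches nothing.
# max_per_file is a hypothetical helper; $1 is the base directory, the rest are subdirectories.
max_per_file() {
    local base=$1 dir file
    shift
    for dir in "$@"; do
        for file in "$base/$dir"/USW*; do
            # An unmatched glob stays literal, so skip anything that is not a regular file
            [ -f "$file" ] || continue
            awk '
            NR == 1 { MAX = $2; C1 = $1 }
            { for (x = 2; x <= NF; x++) if ($x > MAX) { MAX = $x; C1 = $1 } }
            END {
                if (NR == 0) exit    # empty file: nothing to report
                # Assumption: the key looks like <station>Y<4-digit year>M<2-digit month><element>
                if (match(C1, /Y[0-9][0-9][0-9][0-9]M[0-9][0-9]/))
                    print substr(C1, 1, RSTART - 1), substr(C1, RSTART + 1, 4), \
                          substr(C1, RSTART + 6, 2), substr(C1, RSTART + 8), MAX
                else
                    print C1, MAX    # key does not fit the assumed layout: print it unsplit
            }' "$file"
        done
    done
}
```

It would be called as max_per_file /dir/of/files a b c x y z. Quoting "$file" keeps the shell from expanding or splitting the name a second time on the awk command line.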