awk search for max and min while ignoring special character

ncwxpanther · July 29, 2015, 11:24am

I am trying to get a simple min/max script to work with the below input. Note the special character (">") within it.

Script

awk  'BEGIN{max=0}{if(($1)>max)  max=($1)}END {print max}'
awk  'BEGIN{min=0}{if(($2)<min)  min=($2)}END {print min}'

Input

-122.2840   42.0009
-119.9950   41.1777
>
-123.7540   39.5520
-123.7820   39.6872

The above scripts are not outputting the correct values. I've tried using "-FS" to ignore ">" but I don't think it's being used correctly.

Note that finding min on column 1 and max on column 2 does work properly.

awk  'BEGIN{max=0}{if(($2)>max)  max=($2)}END {print max}' 
awk  'BEGIN{min=0}{if(($1)<min)  min=($1)}END {print min}'

RudiC · July 29, 2015, 11:54am

Note that 0 will always be greater than negative values, so $1 will never exceed max in your code snippet. Same for the min values...
Try

awk  '
BEGIN           {max=-1E100}
NF > 1          {if (($1)>max)  max=($1)
                }
END             {print max}
' file

or even

awk  '
BEGIN           {max=-1E100
                 min=1E100
                }
NF > 1          {if (($1)>max)  max=($1)
                 if (($2)<min)  min=($2)
                }
END             {print max, min}
' file
-119.9950 39.5520
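
To make the point about the starting value concrete, here is a minimal sketch that runs both initializations over the numeric lines of the sample only. The variable names (data, max_from_zero, max_from_sentinel) and the printf-generated input are assumptions made for illustration; they are not part of the original scripts.

```shell
# Sample data lines only (the ">" separator is left out on purpose).
data=$(printf '%s\n' '-122.2840   42.0009' '-119.9950   41.1777' \
                     '-123.7540   39.5520' '-123.7820   39.6872')

# max starts at 0: no negative $1 is ever greater, so 0 is printed.
max_from_zero=$(printf '%s\n' "$data" |
    awk 'BEGIN {max = 0} $1 > max {max = $1} END {print max}')

# max starts far below any plausible value: the real maximum wins.
max_from_sentinel=$(printf '%s\n' "$data" |
    awk 'BEGIN {max = -1E100} $1 > max {max = $1} END {print max}')

echo "$max_from_zero $max_from_sentinel"
# prints: 0 -119.9950
```

The sentinel only works as long as every real value lies inside the range it assumes; initializing from the first data line avoids that assumption.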

RavinderSingh13 · July 29, 2015, 12:01pm

Hello ncwxpanther,

If you want the maximum and minimum values of both columns across the whole file, the following may help.

 awk 'BEGIN{col1_max=col2_max=-1E100;col1_min=col2_min=1E100} ($0 !~ />/){col1_max=col1_max > $1?col1_max : $1; col2_max=col2_max > $2?col2_max : $2; col1_min=col1_min < $1?col1_min:$1; col2_min=col2_min < $2?col2_min:$2;} END{print "COL1_max" OFS "COL2_max" OFS "COL1_min" OFS "COL2_min" ORS col1_max OFS col2_max OFS col1_min OFS col2_min}'  Input_file

Output will be as follows.

 COL1_max COL2_max COL1_min COL2_min
-119.9950 42.0009 -123.7820 39.5520

EDIT: Adding a multi-line form of the same solution.

cat min_max.ksh
awk 'BEGIN{
                col1_max=col2_max=-1E100;
                col1_min=col2_min=1E100
          }
                ($0 !~ />/){
                                        col1_max=col1_max > $1?col1_max : $1;
                                        col2_max=col2_max > $2?col2_max : $2;
                                        col1_min=col1_min < $1?col1_min : $1;
                                        col2_min=col2_min < $2?col2_min : $2;
                           }
                END        {
                                        print "COL1_max" OFS "COL2_max" OFS "COL1_min" OFS "COL2_min" ORS col1_max OFS col2_max OFS col1_min OFS col2_min
                           }
    ' Input_file

Thanks,
R. Singh

neutronscott · July 29, 2015, 12:08pm

The problem with max is the > and comparing strings to numbers.
The problem with min is we start with min being 0 and none of those are below zero.

Let's skip lines that don't have 2 columns, and also initialize using the first line.

$ awk  'NR==1 {min=$2; next} NF>1 && $2<min {min=$2} END {print min}' input
39.5520

$ awk  'NR==1 {max=$1; next} NF>1 && $1>max {max=$1} END {print max}' input
-119.9950
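
The string-versus-number effect can be reproduced in isolation. The sketch below is illustrative only: the variable names (input, unguarded, guarded) are made up here, and LC_ALL=C is set as an assumption so that string comparison follows plain byte order.

```shell
input=$(printf '%s\n' '-122.2840   42.0009' '-119.9950   41.1777' '>' \
                      '-123.7540   39.5520' '-123.7820   39.6872')

# No guard: on the ">" line $1 is not numeric, so awk falls back to a
# string comparison. ">" sorts after "0", gets stored in max, and then
# also sorts after every later value because those start with "-".
unguarded=$(printf '%s\n' "$input" |
    LC_ALL=C awk 'BEGIN {max = 0} $1 > max {max = $1} END {print max}')

# Guarded with NF>1 and seeded from line 1: only numeric fields are compared.
guarded=$(printf '%s\n' "$input" |
    LC_ALL=C awk 'NR==1 {max = $1; next} NF>1 && $1 > max {max = $1} END {print max}')

echo "unguarded: $unguarded"
echo "guarded:   $guarded"
# prints:
# unguarded: >
# guarded:   -119.9950
```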

vgersh99 · July 29, 2015, 12:19pm

awk 'FNR==1 {max=$1;min=$2;next} NF>1{if($1>max) max=$1; if ($2<min) min=$2} END {print max, min}' myFile

ncwxpanther · July 29, 2015, 1:08pm

Thanks. I got these 2 scripts to print the min and max of each column separately. What's the best way to combine these into a single line?

awk 'FNR==1 {min=$1;max=$1;next} NF>1{if($1<min) min=$1; if ($1>max) max=$1} END {print max, min}'
awk 'FNR==1 {min=$2;max=$2;next} NF>1{if($2<min) min=$2; if ($2>max) max=$2} END {print max, min}'

Output

MinColumn1 MaxColumn1 MinColumn2 MaxColumn2

vgersh99 · July 29, 2015, 1:29pm

something along these lines - for any number of fields:
awk -f nc.awk myFile where nc.awk is:

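# Track per-column minimum and maximum; min[] and max[] are indexed by field number.
function mm(fnr,   i)
{
  for(i=1; i<=NF;i++) {
    # the first line of the file seeds both arrays
    if (fnr==1) {min[i]=$i;max[i]=$i;continue}
    if ($i<min[i]) min[i]=$i
    if ($i>max[i]) max[i]=$i
  }
}

# skip separator lines such as ">" that have fewer than two fields
NF>1 {mm(FNR)}
END {
  for(i=1;i in min;i++) 
     printf("%s%s%s%s", min[i], OFS, max[i], ((i+1) in min)?OFS:ORS)
}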

Don_Cragun · July 29, 2015, 1:41pm

For two fields and assuming that line 1 in a file could also be your special character, you could try:

awk '
NF > 1 {if(!nf++) {
		m1 = M1 = $1
		m2 = M2 = $2
	} else {if($1 + 0 < m1)	m1 = $1
		if($1 + 0 > M1)	M1 = $1
		if($2 + 0 < m2)	m2 = $2
		if($2 + 0 > M2)	M2 = $2
	}
}
END {	print "MinColumn1 MaxColumn1 MinColumn2 MaxColumn2"
	printf("%10s %10s %10s %10s\n", m1, M1, m2, M2)
}' file

which, with your sample input produces the output:

MinColumn1 MaxColumn1 MinColumn2 MaxColumn2
 -123.7820  -119.9950    39.5520    42.0009

If you want to try this on a Solaris/SunOS system, change awk to /usr/xpg4/bin/awk .

PS. Just to be clear, the above code produces the same output if the input file is:

>
-122.2840   42.0009
-119.9950   41.1777
>
-123.7540   39.5520
-123.7820   39.6872

instead of the provided sample, while the other suggestions in this thread don't seem to produce the desired output in this case. It isn't clear to me from the provided description of the input file format whether or not this matters for the submitter's real world data.
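
For what it's worth, the same idea (seed from the first line that actually has data, not from line 1) can be sketched for any number of columns. This is only an illustration under a few assumptions: the names input, result and cols are made up here, values are forced to numbers with + 0, and the output is printed with a fixed %.4f format to match the sample data.

```shell
# Sample input with a leading ">" line, as in the PS above.
input=$(printf '%s\n' '>' '-122.2840   42.0009' '-119.9950   41.1777' '>' \
                      '-123.7540   39.5520' '-123.7820   39.6872')

result=$(printf '%s\n' "$input" | awk '
NF > 1 {
	for (i = 1; i <= NF; i++) {
		v = $i + 0                                  # force a numeric value
		if (!(i in min) || v < min[i]) min[i] = v   # first value seen seeds min
		if (!(i in max) || v > max[i]) max[i] = v   # first value seen seeds max
	}
	if (NF > cols) cols = NF                        # widest data line so far
}
END {
	for (i = 1; i <= cols; i++)
		printf("%.4f %.4f%s", min[i], max[i], (i < cols) ? " " : "\n")
}')

echo "$result"
# prints: -123.7820 -119.9950 39.5520 42.0009
```

Testing with (i in min) instead of a line counter means a column is seeded the first time it shows up, wherever the first data line happens to be.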