I am trying to get a simple min/max script to work with the below input. Note the special character (">") within it.
Script
awk 'BEGIN{max=0}{if(($1)>max) max=($1)}END {print max}'
awk 'BEGIN{min=0}{if(($2)<min) min=($2)}END {print min}'
Input
-122.2840 42.0009
-119.9950 41.1777
>
-123.7540 39.5520
-123.7820 39.6872
The above scripts are not outputting the correct values. I've tried using "-FS" to ignore ">", but I don't think it's being used correctly.
Note that finding min on column 1 and max on column 2 does work properly.
awk 'BEGIN{max=0}{if(($2)>max) max=($2)}END {print max}'
awk 'BEGIN{min=0}{if(($1)<min) min=($1)}END {print min}'
RudiC
July 29, 2015, 11:54am
Note that 0 will always be greater than negative values, so $1 will never exceed max in your code snippet. The same goes for min: 0 is always less than the positive values in $2, so $2 will never drop below min.
Try
awk '
BEGIN {max=-1E100}
NF > 1 {if (($1)>max) max=($1)
}
END {print max}
' file
or even
awk '
BEGIN {max=-1E100
min=1E100
}
NF > 1 {if (($1)>max) max=($1)
if (($2)<min) min=($2)
}
END {print max, min}
' file
-119.9950 39.5520
Hello ncwxpanther,
If you want to get the maximum and minimum values of both columns across the whole file, the following may help.
awk 'BEGIN{col1_max=col2_max=-1E100;col1_min=col2_min=1E100} ($0 !~ />/){col1_max=col1_max > $1?col1_max : $1; col2_max=col2_max > $2?col2_max : $2; col1_min=col1_min < $1?col1_min:$1; col2_min=col2_min < $2?col2_min:$2;} END{print "COL1_max" OFS "COL2_max" OFS "COL1_min" OFS "COL2_min" ORS col1_max OFS col2_max OFS col1_min OFS col2_min}' Input_file
Output will be as follows.
COL1_max COL2_max COL1_min COL2_min
-119.9950 42.0009 -123.7820 39.5520
EDIT: Adding a multi-line form of the same solution.
cat min_max.ksh
awk 'BEGIN{
col1_max=col2_max=-1E100;
col1_min=col2_min=1E100
}
($0 !~ />/){
col1_max=col1_max > $1?col1_max : $1;
col2_max=col2_max > $2?col2_max : $2;
col1_min=col1_min < $1?col1_min : $1;
col2_min=col2_min < $2?col2_min : $2;
}
END {
print "COL1_max" OFS "COL2_max" OFS "COL1_min" OFS "COL2_min" ORS col1_max OFS col2_max OFS col1_min OFS col2_min
}
' Input_file
Thanks,
R. Singh
The problem with max is the ">" line: awk compares it as a string rather than as a number.
The problem with min is that min starts at 0 and none of the values in column 2 are below zero.
Let's skip lines that don't have 2 columns, and also initialize using the first line.
$ awk 'NR==1 {min=$2; next} NF>1 && $2<min {min=$2} END {print min}' input
39.5520
$ awk 'NR==1 {max=$1; next} NF>1 && $1>max {max=$1} END {print max}' input
-119.9950
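To see the string-versus-number comparison in isolation, here is a small sketch that feeds the sample data from the question to the original max script. The `result` variable and the `LC_ALL=C` setting are assumptions added for this illustration; the locale is pinned so that the string comparison is byte-wise and reproducible. Once the ">" line is reached, ">" compares greater than "0" as a string, so max becomes ">" and every later comparison against it is also done as a string.

```shell
# Sample data from the question, piped straight into the original max script.
# LC_ALL=C is an assumption: it forces byte-wise string comparison.
result=$(printf '%s\n' '-122.2840 42.0009' '-119.9950 41.1777' '>' \
                       '-123.7540 39.5520' '-123.7820 39.6872' |
  LC_ALL=C awk 'BEGIN { max = 0 } { if ($1 > max) max = $1 } END { print max }')

# In the C locale this should print the ">" character, not a number.
echo "$result"
```

Adding the `NF>1` guard keeps the ">" line out of the comparison, which is why the versions above skip it.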
awk 'FNR==1 {max=$1;min=$2;next} NF>1{if($1>max) max=$1; if ($2<min) min=$2} END {print max, min}' myFile
Thanks. I got these 2 scripts to print the min and max of each column separately. What's the best way to combine these into a single line?
awk 'FNR==1 {min=$1;max=$1;next} NF>1{if($1<min) min=$1; if ($1>max) max=$1} END {print max, min}'
awk 'FNR==1 {min=$2;max=$2;next} NF>1{if($2<min) min=$2; if ($2>max) max=$2} END {print max, min}'
Output
MinColumn1 MaxColumn1 MinColumn2 MaxColumn2
Something along these lines, for any number of fields:
awk -f nc.awk myFile
where nc.awk is:
# Track min and max per field in arrays indexed by the field number.
function mm(fnr,    i)
{
for(i=1; i<=NF;i++) {
if (fnr==1) {min[i]=$i;max[i]=$i;continue}
if ($i<min[i]) min[i]=$i
if ($i>max[i]) max[i]=$i
}
}
NF>1 {mm(FNR)}
END {
for(i=1;i in min;i++)
printf("%s%s%s%s", min[i], OFS, max[i], ((i+1) in min)?OFS:ORS)
}
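As a self-contained illustration of the per-field idea, the sketch below keeps min and max in arrays indexed by field number and runs on the sample data from the question. It is a variant written for this example, not the script above verbatim: the `result` and `nf` names are assumptions, and each array slot is seeded from the first line that has more than one field rather than from line 1 of the file.

```shell
# Per-field min/max using arrays indexed by field number.
result=$(printf '%s\n' '-122.2840 42.0009' '-119.9950 41.1777' '>' \
                       '-123.7540 39.5520' '-123.7820 39.6872' |
awk '
NF > 1 {
    for (i = 1; i <= NF; i++) {
        # Seed both arrays the first time this field number is seen.
        if (!(i in min)) { min[i] = max[i] = $i; continue }
        # Adding 0 forces a numeric comparison.
        if ($i + 0 < min[i] + 0) min[i] = $i
        if ($i + 0 > max[i] + 0) max[i] = $i
    }
    if (NF > nf) nf = NF    # remember the widest data line
}
END {
    for (i = 1; i <= nf; i++)
        printf("%s%s", min[i] OFS max[i], (i < nf) ? OFS : ORS)
}')

echo "$result"
```

With the sample input this should print the values in the order MinColumn1 MaxColumn1 MinColumn2 MaxColumn2.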
For two fields and assuming that line 1 in a file could also be your special character, you could try:
awk '
NF > 1 {if(!nf++) {
m1 = M1 = $1
m2 = M2 = $2
} else {if($1 + 0 < m1) m1 = $1
if($1 + 0 > M1) M1 = $1
if($2 + 0 < m2) m2 = $2
if($2 + 0 > M2) M2 = $2
}
}
END { print "MinColumn1 MaxColumn1 MinColumn2 MaxColumn2"
printf("%10s %10s %10s %10s\n", m1, M1, m2, M2)
}' file
which, with your sample input produces the output:
MinColumn1 MaxColumn1 MinColumn2 MaxColumn2
-123.7820 -119.9950 39.5520 42.0009
If you want to try this on a Solaris/SunOS system, change awk to /usr/xpg4/bin/awk.
PS. Just to be clear, the above code produces the same output if the input file is:
>
-122.2840 42.0009
-119.9950 41.1777
>
-123.7540 39.5520
-123.7820 39.6872
instead of the provided sample, while the other suggestions in this thread don't seem to produce the desired output in this case. It isn't clear to me from the provided description of the input file format whether or not this matters for the submitter's real world data.
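A minimal sketch of that difference, restricted to the maximum of column 1 and using the input with a leading ">" line. The variable names `input`, `first_line` and `first_data` are assumptions for this illustration, and `LC_ALL=C` is set so that string comparisons are byte-wise. The first command initializes from line 1 of the file; the second initializes from the first line that has more than one field.

```shell
# Input whose first line is the ">" separator.
input=$(printf '%s\n' '>' '-122.2840 42.0009' '-119.9950 41.1777' '>' \
                      '-123.7540 39.5520' '-123.7820 39.6872')

# Initialize from line 1: max starts as the string ">" and never changes.
first_line=$(printf '%s\n' "$input" |
  LC_ALL=C awk 'NR==1 {max=$1; next} NF>1 && $1>max {max=$1} END {print max}')

# Initialize from the first data line: nf counts data lines seen so far.
first_data=$(printf '%s\n' "$input" |
  LC_ALL=C awk 'NF>1 { if (!nf++ || $1 + 0 > max + 0) max = $1 } END {print max}')

echo "first_line=$first_line first_data=$first_data"
```

The first result should be the ">" character and the second should be -119.9950, so the choice only matters if the real data can start with a separator line.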