From all lines that share the same value in the first column, I want to select the line with the smallest value in the second column. In other words, I would like an AWK script that produces the following file, output.txt:
20 23 54
20.5 33 11
21 22 21
I have already tried to find an answer on many forums, but without success. Can you help me?
The following produces the output you requested, but the order of the output lines is unspecified:
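awk '
NF < 2 { next }              # skip blank or incomplete lines
!($1 in m) || m[$1] > $2 {   # first line for this key, or a smaller 2nd field
    m[$1] = $2               # smallest 2nd field seen so far for this key
    o[$1] = $0               # the full line that holds it
}
END {
    for (i in o) print o[i]  # one line per key, in unspecified order
}' input.txt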
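If a predictable order matters, one option is to pipe the result through sort. The following is only a sketch: the function name min_per_key_sorted is made up for illustration, and it assumes the 1st field is a plain number that sort's numeric key comparison understands in the C locale.

```shell
# Hypothetical wrapper: keep the line with the smallest 2nd field for each
# 1st-field value, then order the result numerically by the 1st field.
min_per_key_sorted() {
    awk '
    NF < 2 { next }                  # skip blank or incomplete lines
    !($1 in m) || m[$1] > $2 {       # new key, or a smaller 2nd field
        m[$1] = $2
        o[$1] = $0
    }
    END { for (i in o) print o[i] }  # awk emits these in unspecified order
    ' "$@" | LC_ALL=C sort -k1,1n    # impose numeric order on the 1st field
}
```

It could be called as: min_per_key_sorted input.txt > output.txt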
If you are using a Solaris system, use /usr/xpg4/bin/awk, /usr/xpg6/bin/awk, or nawk instead of /usr/bin/awk or /bin/awk.
Your sample has all of the input with a given 1st column value grouped together, but your statement of requirements didn't say anything about this. The code above accepts input in any order.
If your input always has all lines with the same 1st column value on adjacent input lines, this script can be rewritten to produce output when the 1st column value changes. This would take fewer resources for large input files and would produce output in the same order as the input.
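A minimal sketch of that rewrite might look like the following. The function name min_per_group and the variable names are made up for illustration, and the script is only correct under the assumption stated above, namely that lines with the same 1st column value are adjacent.

```shell
# Hypothetical streaming version: assumes lines that share a 1st field
# are adjacent, so only the current group has to be remembered.
min_per_group() {
    awk '
    NF < 2 { next }                   # skip blank or incomplete lines
    !seen || $1 != key {              # 1st field changed: a new group starts
        if (seen) print best          # emit the minimum of the previous group
        key = $1; min = $2; best = $0; seen = 1
        next
    }
    $2 < min { min = $2; best = $0 }  # smaller 2nd field within the group
    END { if (seen) print best }      # emit the minimum of the last group
    ' "$@"
}
```

It could be called as: min_per_group input.txt > output.txt. Because a line is printed as soon as its group ends, memory use does not grow with the number of distinct 1st column values, and the output follows the input order.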
This provides an easy way to group 1st field values together, but it has two problems. First, it produces an empty output line that the OP doesn't seem to want. Second, it will only work correctly if all 2nd field values in each group have the same number of digits before the decimal point (if a decimal point occurs in any 2nd field value within a group) and have no leading plus signs (+), unless all non-negative values in a group have a leading plus sign. The last part of this can be fixed trivially by adding the -n option to sort. Getting rid of the blank line is also easy (if it matters):