Displaying lines of a file which have the highest number?

Hello

Wondering if anybody may be able to advise on how I can filter the contents of the following file:

 
<object_name>-<version>          <Instance>        
 
GM_GUI_code.fmb-4                        1
GM_GUI_code.fmb-5                        1
GM_GUI_code.fmx-4                        1
GM_GUI_code.fmx-5                        1
GM_GUI_code.fmx-4                        2
GM_GUI_code.fmx-5                        2
GM_Extract_dbcode.pkb-18              1
GM_Extract_dbcode.pks-18              1 
GM_Extract_dbcode.pkb-19              1 
GM_Extract_dbcode.pks-19              1 
GM_Extract_dbcode.pkb-20              1 
GM_Extract_dbcode.pks-20              1 

This is a list of the contents of a release. I need to display only the LATEST version (the version number comes after the file extension) of each object, so I would then have only the following displayed:

 
<object_name>-<version>          <Instance>
 
GM_Extract_dbcode.pkb-20              1 
GM_Extract_dbcode.pks-20              1
GM_GUI_code.fmb-5                        1
GM_GUI_code.fmx-5                        1
GM_GUI_code.fmx-5                        2

As you can see there are two instances of GM_GUI_code.fmx. I need to be able to display both instances. I'm a bit stumped as to how to do this though.

Thanks
Glyn

# awk -F'[ -]' 'NR==FNR{if(NR<3)print;x=($2>x && /GUI/)?$2:x;y=($2>y && /Extract/)?$2:y;next}$2==x || $2==y{a[$0]}END{for (i in a) print i}' file file
<object_name>-<version>          <Instance>

GM_Extract_dbcode.pkb-20              1
GM_Extract_dbcode.pks-20              1
GM_GUI_code.fmb-5                        1
GM_GUI_code.fmx-5                        1
GM_GUI_code.fmx-5                        2

Thanks Danmero. Unfortunately I need to be able to run the command on different files, in which the object names listed won't contain "GUI" in the name...

Feel free to adjust the code as you wish; unfortunately I can't give you a universal one-size-fits-all solution.
If you need something else, please post an extended data sample.

Try:

awk -F'[- \t]*' '{A[$1,$3]=$2} END{for (i in A){split (i, B, SUBSEP); printf "%-25s %s\n",B[1]"-"A[i],B[2]}}' infile
GM_GUI_code.fmx-5         1
GM_GUI_code.fmb-5         1
GM_Extract_dbcode.pkb-20  1
GM_Extract_dbcode.pks-20  1
GM_GUI_code.fmx-5         2
$ cat infile
GM_GUI_code.fmb-4                        1
GM_GUI_code.fmb-5                        1
GM_GUI_code.fmx-4                        1
GM_GUI_code.fmx-5                        1
GM_GUI_code.fmx-4                        2
GM_GUI_code.fmx-5                        2
GM_Extract_dbcode.pkb-18              1
GM_Extract_dbcode.pks-18              1
GM_Extract_dbcode.pkb-19              1
GM_Extract_dbcode.pks-19              1
GM_Extract_dbcode.pkb-20              1
GM_Extract_dbcode.pks-20              1

This works for me:

# awk -F'[- \t]*' '{if(NR<3){print;next}}{A[$1,$3]=$2} END{for (i in A){split (i, B, SUBSEP); printf "%-25s %s\n",B[1]"-"A[i],B[2]}}' file
<object_name>-<version>          <Instance>

GM_Extract_dbcode.pks-20  1
GM_Extract_dbcode.pkb-20  1
GM_GUI_code.fmx-5         1
GM_GUI_code.fmx-5         2
GM_GUI_code.fmb-5         1

Scrutinizer, many thanks.

I've had a problem when running this against another file though. The file contents are as follows:

bash-3.00$ cat release
ins_smo.sql-22                           1
lfa10ins.fmb-1                           1
lfa10ins.fmb-2                           1
lfa10ins.fmx-1                           1
lfa10ins.fmx-1                           2
lfa10ins.fmx-2                           1
lfa10ins.fmx-2                           2
pen_pack_05.pkb-8                        2
pkg_land_insp_capture_form.pkb-19        1
release_note_SAF10_Certification_1.doc-0 1
release_note_SAF10_Certification_1.doc-1 1
release_note_SAF10_Certification_1.doc-2 1
releasenote-1317                         1
saf09127.rdf-10                          1
saf09127.rdf-9                           1
saf09127.rep-10                          1
saf09127.rep-10                          2
saf09127.rep-9                           1
saf09127.rep-9                           2
saf1016c.fmb-1                           1
saf1016c.fmb-2                           1
saf1016c.fmx-1                           1
saf1016c.fmx-1                           2
saf1016c.fmx-2                           1
saf1016c.fmx-2                           2
saf1016m.fmb-1                           1
saf1016m.fmx-1                           1
saf1016m.fmx-1                           2
saf1016s.fmb-1                           1
saf1016s.fmx-1                           1
saf1016s.fmx-1                           2
bash-3.00$

When I run your command, I get the following returned:

saf09127.rep-9            2
releasenote-1317          1
ins_smo.sql-22            1
saf1016s.fmx-1            1
saf1016c.fmb-2            1
saf1016s.fmx-1            2
saf1016s.fmb-1            1
pen_pack_05.pkb-8         2
saf1016m.fmx-1            1
saf1016m.fmx-1            2
pkg_land_insp_capture_form.pkb-19 1
saf1016m.fmb-1            1
lfa10ins.fmx-2            1
lfa10ins.fmx-2            2
lfa10ins.fmb-2            1
release_note_SAF10_Certification_1.doc-2 1
saf1016c.fmx-2            1
saf1016c.fmx-2            2
saf09127.rdf-9            1
saf09127.rep-9            1

The problem is that objects

saf09127.rdf-9            1
saf09127.rep-9            1
saf09127.rep-9            2

... are being displayed, instead of

saf09127.rdf-10            1
saf09127.rep-10            1
saf09127.rep-10            2

Is there some problem when it's calculating double-figures?

I must say the awk I'm seeing in these solutions is a bit beyond my current knowledge! Thanks to both.

Hi Glyn_mo,

Try sorting the file first so that the records are in proper order.

sort -t- -k1,1 -k2,2n infile | awk -F'[- \t]*' '{A[$1,$3]=$2} END{for (i in A){split (i, B, SUBSEP); printf "%-25s %s\n",B[1]"-"A[i],B[2]}}'
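
(The sort matters because the first awk keeps whichever version it reads last for each object and instance, and your release file lists saf09127.rdf-10 before saf09127.rdf-9. A quick check on three lines taken from that file, with the sample variable being just an illustration:)

```shell
# Three sample lines taken from the release file above
sample='saf09127.rdf-10 1
saf09127.rdf-9 1
saf09127.rep-10 2'

# -t- splits on the hyphen; -k2,2n compares the version part as a number,
# so rdf-9 ends up before rdf-10 and the awk sees the highest version last
sorted=$(printf '%s\n' "$sample" | sort -t- -k1,1 -k2,2n)
printf '%s\n' "$sorted"
```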

or try this slightly revised awk:

awk -F'[- \t]*' '$2>A[$1,$3] {A[$1,$3]=$2} END{for (i in A){split (i, B, SUBSEP); printf "%-25s %s\n",B[1]"-"A[i],B[2]}}' infile
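
If it helps, here is a more spelled-out sketch of the same idea, not a replacement for the one-liners above. It assumes two whitespace-separated columns, takes the version to be the digits after the last hyphen, and skips lines without such a version (the header, for instance). The function name latest_versions and the 40-character column width are just my choices for illustration.

```shell
# Hypothetical helper: print the highest version of each object/instance pair
latest_versions() {
    awk '
    NF < 2 { next }                          # skip blank or one-column lines
    {
        n = split($1, part, "-")             # last piece is the version
        if (n < 2 || part[n] !~ /^[0-9]+$/) next
        ver = part[n] + 0                    # force a numeric value
        obj = substr($1, 1, length($1) - length(part[n]) - 1)
        key = obj SUBSEP $2                  # object name plus instance
        if (!(key in max) || ver > max[key]) max[key] = ver
    }
    END {
        for (key in max) {
            split(key, k, SUBSEP)
            printf "%-40s %s\n", k[1] "-" max[key], k[2]
        }
    }' "$@" | sort
}
```

Run it as: latest_versions release. Because the version is forced to a number before comparing, 10 beats 9 whatever order the lines arrive in, and object names that themselves contain a hyphen are still split at the last one.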