Displaying only the lines of a file which have the highest version number?

Glyn_Mo · April 19, 2010, 10:45am

Hello

Wondering if anybody may be able to advise on how I can filter the contents of the following file:

 
<object_name>-<version>          <Instance>        
 
GM_GUI_code.fmb-4                        1
GM_GUI_code.fmb-5                        1
GM_GUI_code.fmx-4                        1
GM_GUI_code.fmx-5                        1
GM_GUI_code.fmx-4                        2
GM_GUI_code.fmx-5                        2
GM_Extract_dbcode.pkb-18              1
GM_Extract_dbcode.pks-18              1 
GM_Extract_dbcode.pkb-19              1 
GM_Extract_dbcode.pks-19              1 
GM_Extract_dbcode.pkb-20              1 
GM_Extract_dbcode.pks-20              1

This is a list of the contents of a release. I need to display only the LATEST version of each object (the version number comes after the file extension), so that only the following is displayed:

 
<object_name>-<version>          <Instance>
 
GM_Extract_dbcode.pkb-20              1 
GM_Extract_dbcode.pks-20              1
GM_GUI_code.fmb-5                        1
GM_GUI_code.fmx-5                        1
GM_GUI_code.fmx-5                        2

As you can see there are two instances of GM_GUI_code.fmx. I need to be able to display both instances. I'm a bit stumped as to how to do this though.

Thanks
Glyn

danmero · April 19, 2010, 11:10am

# awk -F'[ -]' 'NR==FNR{if(NR<3)print;x=($2>x && /GUI/)?$2:x;y=($2>y && /Extract/)?$2:y;next}$2==x || $2==y{a[$0]}END{for (i in a) print i}' file file
<object_name>-<version>          <Instance>

GM_Extract_dbcode.pkb-20              1
GM_Extract_dbcode.pks-20              1
GM_GUI_code.fmb-5                        1
GM_GUI_code.fmx-5                        1
GM_GUI_code.fmx-5                        2

Glyn_Mo · April 19, 2010, 11:58am

Thanks Danmero. Unfortunately I need to be able to run the command on different files, in which the object names listed won't contain "GUI" in the name...

danmero · April 19, 2010, 12:36pm

Feel free to adjust the code as you wish; unfortunately I can't provide you with a universal one-size-fits-all solution.
If you need something else please post an extended data sample.

Scrutinizer · April 19, 2010, 1:08pm

Try:

awk -F'[- \t]*' '{A[$1,$3]=$2} END{for (i in A){split (i, B, SUBSEP); printf "%-25s %s\n",B[1]"-"A[i],B[2]}}' infile

GM_GUI_code.fmx-5         1
GM_GUI_code.fmb-5         1
GM_Extract_dbcode.pkb-20  1
GM_Extract_dbcode.pks-20  1
GM_GUI_code.fmx-5         2

$ cat infile
GM_GUI_code.fmb-4                        1
GM_GUI_code.fmb-5                        1
GM_GUI_code.fmx-4                        1
GM_GUI_code.fmx-5                        1
GM_GUI_code.fmx-4                        2
GM_GUI_code.fmx-5                        2
GM_Extract_dbcode.pkb-18              1
GM_Extract_dbcode.pks-18              1
GM_Extract_dbcode.pkb-19              1
GM_Extract_dbcode.pks-19              1
GM_Extract_dbcode.pkb-20              1
GM_Extract_dbcode.pks-20              1
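
For anyone newer to awk, here is roughly how that one-liner unrolls. Treat it as a sketch rather than the definitive form: the temp file from mktemp, the `latest` variable, the trailing `sort` (only there to make the output order predictable) and the `+` in the field separator are my own choices for illustration. It assumes each object's versions appear in ascending order in the file, because the last line read for a given name and instance is the one that is kept.

```shell
#!/bin/sh
# Sample data: a subset of the release list. The temp file name is arbitrary.
infile=$(mktemp)
cat > "$infile" <<'EOF'
GM_GUI_code.fmx-4                        1
GM_GUI_code.fmx-5                        1
GM_GUI_code.fmx-4                        2
GM_GUI_code.fmx-5                        2
GM_Extract_dbcode.pkb-19              1
GM_Extract_dbcode.pkb-20              1
EOF

latest=$(awk -F'[- \t]+' '
    # $1 = object name, $2 = version, $3 = instance.
    # The array is keyed on (name, instance); a later line overwrites an earlier one.
    { A[$1, $3] = $2 }
    END {
        for (i in A) {
            split(i, B, SUBSEP)      # B[1] = name, B[2] = instance
            printf "%-25s %s\n", B[1] "-" A[i], B[2]
        }
    }' "$infile" | sort)
rm -f "$infile"
printf '%s\n' "$latest"
```

Keying on both name and instance is what keeps the two instances of GM_GUI_code.fmx as separate output lines.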

danmero · April 19, 2010, 2:52pm

scrutinizer:

Try:

awk -F'[- \t]*' '{A[$1,$3]=$2} END{for (i in A){split (i, B, SUBSEP); printf "%-25s %s\n",B[1]"-"A[i],B[2]}}' infile

GM_GUI_code.fmx-5         1
GM_GUI_code.fmb-5         1
GM_Extract_dbcode.pkb-20  1
GM_Extract_dbcode.pks-20  1
GM_GUI_code.fmx-5         2

This works for me:

# awk -F'[- \t]*' '{if(NR<3){print;next}}{A[$1,$3]=$2} END{for (i in A){split (i, B, SUBSEP); printf "%-25s %s\n",B[1]"-"A[i],B[2]}}' file
<object_name>-<version>          <Instance>

GM_Extract_dbcode.pks-20  1
GM_Extract_dbcode.pkb-20  1
GM_GUI_code.fmx-5         1
GM_GUI_code.fmx-5         2
GM_GUI_code.fmb-5         1

Glyn_Mo · April 22, 2010, 2:03pm

Scrutinizer, many thanks.

I've had a problem when running this against another file though. The file contents are as follows:

bash-3.00$ cat release
ins_smo.sql-22                           1
lfa10ins.fmb-1                           1
lfa10ins.fmb-2                           1
lfa10ins.fmx-1                           1
lfa10ins.fmx-1                           2
lfa10ins.fmx-2                           1
lfa10ins.fmx-2                           2
pen_pack_05.pkb-8                        2
pkg_land_insp_capture_form.pkb-19        1
release_note_SAF10_Certification_1.doc-0 1
release_note_SAF10_Certification_1.doc-1 1
release_note_SAF10_Certification_1.doc-2 1
releasenote-1317                         1
saf09127.rdf-10                          1
saf09127.rdf-9                           1
saf09127.rep-10                          1
saf09127.rep-10                          2
saf09127.rep-9                           1
saf09127.rep-9                           2
saf1016c.fmb-1                           1
saf1016c.fmb-2                           1
saf1016c.fmx-1                           1
saf1016c.fmx-1                           2
saf1016c.fmx-2                           1
saf1016c.fmx-2                           2
saf1016m.fmb-1                           1
saf1016m.fmx-1                           1
saf1016m.fmx-1                           2
saf1016s.fmb-1                           1
saf1016s.fmx-1                           1
saf1016s.fmx-1                           2
bash-3.00$

When I run your command, I get the following returned:

saf09127.rep-9            2
releasenote-1317          1
ins_smo.sql-22            1
saf1016s.fmx-1            1
saf1016c.fmb-2            1
saf1016s.fmx-1            2
saf1016s.fmb-1            1
pen_pack_05.pkb-8         2
saf1016m.fmx-1            1
saf1016m.fmx-1            2
pkg_land_insp_capture_form.pkb-19 1
saf1016m.fmb-1            1
lfa10ins.fmx-2            1
lfa10ins.fmx-2            2
lfa10ins.fmb-2            1
release_note_SAF10_Certification_1.doc-2 1
saf1016c.fmx-2            1
saf1016c.fmx-2            2
saf09127.rdf-9            1
saf09127.rep-9            1

The problem is that objects

saf09127.rdf-9            1
saf09127.rep-9            1
saf09127.rep-9            2

... are being displayed, instead of

saf09127.rdf-10            1
saf09127.rep-10            1
saf09127.rep-10            2

Is there some problem when it's calculating double-figures?

I must admit the awk I'm seeing in these solutions is a bit beyond my current knowledge! Thanks to both.

Scrutinizer · April 22, 2010, 3:26pm

Hi Glyn_mo,

Try sorting the file first so that the records are in proper order.

sort -t- -k1,1 -k2,2n infile | awk -F'[- \t]*' '{A[$1,$3]=$2} END{for (i in A){split (i, B, SUBSEP); printf "%-25s %s\n",B[1]"-"A[i],B[2]}}'
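
To see why the order matters: the release file is in plain character order, where "10" comes before "9", so the last saf09127.rep line read is version 9. A small sketch of the two orderings on a few lines from your file follows. The temp file and the variable names are just for illustration, and LC_ALL=C is set only to make the result reproducible across locales.

```shell
#!/bin/sh
f=$(mktemp)
cat > "$f" <<'EOF'
saf09127.rep-10                          1
saf09127.rep-9                           1
saf09127.rdf-10                          1
saf09127.rdf-9                           1
EOF

# Plain character sort: "1" < "9", so version 10 lands before version 9.
lexical=$(LC_ALL=C sort "$f")

# Split on "-": key 1 is the object name, key 2 is compared numerically.
numeric=$(LC_ALL=C sort -t- -k1,1 -k2,2n "$f")
rm -f "$f"

printf 'lexical:\n%s\n\nnumeric:\n%s\n' "$lexical" "$numeric"
```

With the numeric key, version 10 is the last line for each object, which is what the awk part relies on. Note that `-t-` assumes the object names themselves contain no hyphen.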

or try this slightly revised awk:

awk -F'[- \t]*' '$2>A[$1,$3] {A[$1,$3]=$2} END{for (i in A){split (i, B, SUBSEP); printf "%-25s %s\n",B[1]"-"A[i],B[2]}}' infile
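
If you want the comparison spelled out, something along these lines might be easier to follow. It is only a sketch under a few assumptions of mine: versions are plain non-negative integers, the `+ 0` is added to force a numeric comparison, the `in` test is added so that an object whose only version is 0 is still recorded, and the temp file, the `result` variable and the final `sort` are purely illustrative.

```shell
#!/bin/sh
# Lines taken from the release file above, deliberately not in version order.
f=$(mktemp)
cat > "$f" <<'EOF'
saf09127.rdf-10                          1
saf09127.rdf-9                           1
saf09127.rep-10                          1
saf09127.rep-10                          2
saf09127.rep-9                           1
saf09127.rep-9                           2
releasenote-1317                         1
release_note_SAF10_Certification_1.doc-0 1
EOF

result=$(awk -F'[- \t]+' '
    # Keep the largest version seen for each (name, instance) pair.
    # "+ 0" forces a numeric comparison, so 10 > 9 whatever the line order.
    !(($1, $3) in A) || $2 + 0 > A[$1, $3] + 0 { A[$1, $3] = $2 }
    END {
        for (i in A) {
            split(i, B, SUBSEP)      # B[1] = name, B[2] = instance
            printf "%-25s %s\n", B[1] "-" A[i], B[2]
        }
    }' "$f" | LC_ALL=C sort)
rm -f "$f"
printf '%s\n' "$result"
```

Because each line is compared against the best version stored so far, no pre-sorting is needed with this form.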