Sort on two keys

Hello,

I am trying to sort a text file by two keys but the second key should be reversed.

I have tried -nt '|' -k 4 -rk 5 but it just sorts reversed on key 4.

Does anyone have any suggestions?

Thanks

You could sort the file on the first key and then, within each segment of the sorted file where the first key is the same, sort the lines on the second key.

One long-winded way of doing that would be to use a Perl script:

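#! /usr/bin/perl -wnl
# Expects input already sorted on field 4. Within each run of lines that
# share field 4, prints the lines in descending order of field 5.

# Only fields 4 and 5 are needed; the other fields are discarded.
(undef, undef, undef, $primary_key, $secondary_key) = split /,/;

if (!defined($previous_key)) {
  $previous_key = $primary_key;
}
else {
  # The primary key changed: print the finished segment and start a new one.
  if ($previous_key ne $primary_key) {
    for $key (sort {$b cmp $a} keys %segment) {
      print $segment{$key};
    }
    %segment = ();
  }
}

# Lines with the same secondary key are kept together in input order.
if (!defined($segment{$secondary_key})) {
  $segment{$secondary_key} = $_;
}
else {
  $segment{$secondary_key} .= "\n" . $_;
}
$previous_key = $primary_key;

# Print the last segment once the input is exhausted.
END {
  for $key (sort {$b cmp $a} keys %segment) {
    print $segment{$key};
  }
}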

to convert this unsorted file

bbb,ccc,ddd,eee,aaa,fff
ddd,eee,fff,aaa,ccc,bbb
bbb,ccc,ddd,eee,fff,aaa
bbb,ccc,ddd,eee,aaa,fff
fff,aaa,bbb,ccc,ddd,eee
eee,fff,aaa,bbb,ccc,ddd
aaa,bbb,ccc,ddd,eee,fff
ccc,ddd,eee,fff,aaa,bbb
ddd,eee,fff,aaa,bbb,ccc
ccc,ddd,eee,fff,aaa,bbb

using this command

sort -t, -k4 unsorted.txt | sort5.pl

to this sorted file

ddd,eee,fff,aaa,ccc,bbb
ddd,eee,fff,aaa,bbb,ccc
eee,fff,aaa,bbb,ccc,ddd
fff,aaa,bbb,ccc,ddd,eee
aaa,bbb,ccc,ddd,eee,fff
bbb,ccc,ddd,eee,fff,aaa
bbb,ccc,ddd,eee,aaa,fff
bbb,ccc,ddd,eee,aaa,fff
ccc,ddd,eee,fff,aaa,bbb
ccc,ddd,eee,fff,aaa,bbb

Thanks for the reply.

Unfortunately, I am calling Unix from within another program (Datastage). I really wanted to keep it to one Unix command, but if this is not possible I will have to work around it.

One way I was going to do this was to change the generation of the second key so that it is always in the right order, so I only have to use one key.

Thanks anyway

Please show what the input and the expected output look like.

BPC_TRANS_CHARG|000009|000|278|TGL0009|S1_C12 J60867SNCF0R n'est pas present dans PS_S1_C12_VW
BPC_TRANS_CHARG|000009|000|278|EGL0009|S1_C12 J60867SNCF0R n'est pas present dans PS_S1_C12_VW
BPC_TRANS_CHARG|000009|000|86|TGL0009|S1_C12 J60867SNCF0R n'est pas present dans PS_S1_C12_VW
BPC_TRANS_CHARG|000009|000|86|EGL0009|S1_C12 J60867SNCF0R n'est pas present dans PS_S1_C12_VW

Field 4 (e.g. 278) has to be ascending and field 5 (e.g. TGL0009) descending.

sort -t '|' -k4,4n -k5,5r input_file.txt
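To try this on the four sample lines above, here is a minimal sketch. The long message text is shortened to "msg" and LC_ALL=C is added only so the ordering does not depend on the locale; both are my assumptions, not part of the original data.

```shell
#!/bin/sh
# Sample rows from the post above; the trailing message is shortened to "msg"
# (an assumption made only to keep the example short).
input='BPC_TRANS_CHARG|000009|000|278|TGL0009|msg
BPC_TRANS_CHARG|000009|000|278|EGL0009|msg
BPC_TRANS_CHARG|000009|000|86|TGL0009|msg
BPC_TRANS_CHARG|000009|000|86|EGL0009|msg'

# Field 4 ascending and numeric, then field 5 descending within equal field 4.
# LC_ALL=C pins the collation order so the result is locale-independent.
sorted=$(printf '%s\n' "$input" | LC_ALL=C sort -t '|' -k4,4n -k5,5r)
printf '%s\n' "$sorted"
```

This should list the two 86 lines before the two 278 lines, with TGL0009 ahead of EGL0009 in each pair. As far as I can tell, the original attempt failed because -n and -r given outside -k apply to the whole sort rather than to a single key.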

Excellent. That works perfectly.

What does -k4,4n -k5,5r mean? I understand the 4n part, but what does the comma do?

I modified the command to

sort -t '|' -k4n -k5r BPC_TRANS_CHARG000009_000_Anomalie.csv

and this works as well.

Thanks.

Colin

The -k4,4n -k5,5r first sorts numerically on the fourth field and then sorts on the fifth field in reverse order. The fifth-field sort is a sub-sort: it only orders lines whose fourth fields compare equal. The comma in -k separates the start and stop fields, so -k4,4 means the key starts and ends at field 4. In -k5r the start field is the fifth one and, as no stop field is given, the key runs to the end of the line. Your -k4n version works here because a numeric comparison only reads the number at the start of the key, but giving the stop field is the safer habit. The sort manpage provides a better explanation.
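The effect of the stop field is easier to see with a non-numeric key. A small sketch on made-up three-line input (the data and variable names are only for illustration):

```shell
#!/bin/sh
# Made-up input: field 1 is a letter, field 2 is a digit.
data='a|1
a|2
b|1'

# No stop fields: the first key runs from field 1 to the end of the line,
# so it already decides the order. The reversed second key would only be
# consulted for lines that are identical from field 1 onwards.
open_ended=$(printf '%s\n' "$data" | LC_ALL=C sort -t '|' -k1 -k2r)

# With stop fields each key covers exactly one field, so field 2 is
# compared in reverse whenever field 1 ties.
bounded=$(printf '%s\n' "$data" | LC_ALL=C sort -t '|' -k1,1 -k2,2r)

printf 'open-ended keys:\n%s\n\nbounded keys:\n%s\n' "$open_ended" "$bounded"
```

With the open-ended keys the lines come out as a|1, a|2, b|1, so the r on the second key has no visible effect. With the bounded keys they come out as a|2, a|1, b|1.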

No, I think you are wrong there. Your explanation is far clearer than man sort.

English is my first language, man help is my fifth !!

Many Thanks

Colin