Hi MadeInGermany,
The sort man page description of sort keys (whether using -k keydef or +key_start -key_end) and of the options/flags is complicated because there are so many special cases in the way sort keys are specified. If you look at lines as C arrays of type char where the index of the first character in the array is 0, the old way of specifying key definitions will feel "right" to you. If you look at the way most other UNIX utilities number fields and characters within fields (e.g., awk, cut, fold, head, paste, and tail), the -k keydef key definitions will feel "right" to you. I can translate between the two, but I generally find I'm more likely to get the new form correct the first time. And I find it much easier to explain the new form to a UNIX utility newbie, whether or not they are an experienced C programmer. Most of the people I know who feel more comfortable with the old form learned it before the late 1980s, when the new form was invented.
The options -b, -d, -f, -i, -n, and -r can be given as options that apply to all sort keys that do not include any flags, or as flags that apply only to the key to which they are attached. The -t char option can only be specified as an option, not as a flag.
When used as a flag, b applies only to the start field or end field specification to which it is attached. All other flags can be applied to the start field specification, to the end field specification, or to both, and have the same effect.
If a key field definition has any flags attached, those flags override ALL options except -t char. So, for example, if I want to sort in reverse (decreasing) numeric order on unit price and in reverse case-insensitive alphabetic order by ingredient name within groups of identical unit prices, I could use any of the following:
sort -t, -k3r,3n -k1.9f,1r file
sort -t, -k3nr,3rn -k1.9fr,1rf file
sort -frt, -k3rn,3 -k1.9,1 file
sort -nrt, -k3,3 -k1.9,1fr file
but the following won't work:
sort -rft, -k3,3n -k1.9,1 file
because the n flag on the first key specification overrides both the -r and -f options for that key, so unit prices are sorted in increasing order.
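Here is a small sketch of that override rule. The data is made up for illustration (it is not the file from post #1), and I force the C locale so the comparisons are predictable; treat it as a rough demonstration rather than a definitive test of every sort implementation.

```shell
#!/bin/sh
# Hypothetical data: name,category,unit_price,quantity
LC_ALL=C
export LC_ALL
file=$(mktemp)
cat > "$file" <<'EOF'
apple,fruit,1.50,3
Banana,fruit,0.50,6
cherry,fruit,10.00,1
date,fruit,0.50,2
EOF

# Flags on both keys: price decreasing, then name decreasing ignoring case.
with_flags=$(sort -t, -k3,3nr -k1,1fr "$file")

# -r and -f given as options, but the first key carries an n flag, so that
# key ignores -r and -f and the prices come out in increasing order.
# The second key has no flags, so it still picks up -r and -f.
with_options=$(sort -rft, -k3,3n -k1,1 "$file")

printf 'flags on every key:\n%s\n\n' "$with_flags"
printf 'n flag overriding the -r option:\n%s\n' "$with_options"
rm -f "$file"
```

The first listing starts with the 10.00 line and the second starts with a 0.50 line, which is the difference the n flag makes.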
The main difference between the two forms is that field numbers and characters in fields are numbered from 1 when using -k keydef, but are counted from 0 when using +key_start -key_end. In addition, the -k form includes the last character it names, while the old form ends the key just before the position named by -key_end, so an end character position has the same value in both forms. For example, to sort alphabetically on the 7th character of the first field skipping over leading spaces in the sample data shown in post #1, the following four commands all specify the same sort key:
sort -t, -k1.7b,1.7b file
sort -t, +0.6b -0.7b file
sort -bt, -k1.7,1.7 file
sort -bt, +0.6 -0.7 file
and all produce the output (note that there are two <space>s between the line number and the ingredient name in the 1st field):
1  Potato,vegatable,0.89,5
4  green_onion,vegatable,0.99,3
11  onion,vegatable,0.89,2
12  bell_pepper,vegatable,0.89,2
21  pumpkin_pie_filling,vegatable,2.98,1
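As a rough check of the two -k forms, the sketch below feeds those five output lines back through sort in shuffled order, with two <space>s between the line number and the ingredient name as noted above. The temporary file and the variable names are my own, and I force the C locale. I left the +key_start -key_end forms out because some current sort implementations no longer accept them by default.

```shell
#!/bin/sh
LC_ALL=C
export LC_ALL
file=$(mktemp)
# The five lines shown above, shuffled.
cat > "$file" <<'EOF'
21  pumpkin_pie_filling,vegatable,2.98,1
11  onion,vegatable,0.89,2
1  Potato,vegatable,0.89,5
12  bell_pepper,vegatable,0.89,2
4  green_onion,vegatable,0.99,3
EOF

# b given as a flag on both the start and the end of the key.
flag_form=$(sort -t, -k1.7b,1.7b "$file")

# -b given as an option; the key has no flags, so -b applies to it.
option_form=$(sort -bt, -k1.7,1.7 "$file")

printf '%s\n' "$flag_form"

# No line here starts with a blank, so the sort key is simply the 7th
# character of each line; print it to see what was compared.
key_chars=$(printf '%s\n' "$flag_form" | cut -c7 | tr -d '\n')
printf 'key characters in sorted order: %s\n' "$key_chars"
rm -f "$file"
```

The key characters should come out in increasing order, and both commands should give the same listing.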
In the most general -k keydef option form:
-k field_start_number[.first_character_number][flag...][,field_end_number[.last_character_number][flag...]]
if the optional .first_character_number is omitted, it defaults to the first character in the given field. If the optional .last_character_number is omitted, it defaults to the last character in the given field. If the entire optional end field number specification (,field_end_number[.last_character_number][flag...]) is omitted, it defaults to the end of the current line.
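Here is a small sketch of how leaving off the end field specification changes a sort. The three-line data set is made up for illustration, and I force the C locale; the only thing that differs between the two commands is -k2,2 versus -k2.

```shell
#!/bin/sh
# Hypothetical data: id,group,count
LC_ALL=C
export LC_ALL
file=$(mktemp)
cat > "$file" <<'EOF'
x,b,1
y,a,1
z,a,2
EOF

# Key 1 is field 2 only, so y and z tie on "a" and the second key
# (field 1 in reverse order) decides: z comes before y.
field_only=$(sort -t, -k2,2 -k1,1r "$file")

# Key 1 runs from field 2 to the end of the line, so "a,1" and "a,2"
# are compared and there is no tie for the second key to break.
to_end_of_line=$(sort -t, -k2 -k1,1r "$file")

printf -- '-k2,2:\n%s\n\n' "$field_only"
printf -- '-k2:\n%s\n' "$to_end_of_line"
rm -f "$file"
```

With -k2,2 the z line comes first; with -k2 the y line comes first, because the count field has become part of the first key.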
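An equivalent sort key using the old-style key definition for the above -k option is:
+field_start_number-1[.first_character_number-1][flag...] [-field_end_number-1.last_character_number[flag...]]
when .last_character_number is given, or:
+field_start_number-1[.first_character_number-1][flag...] [-field_end_number[flag...]]
when it is omitted, where each -1 that follows a number is a numeric calculation on that number, not a literal string. The end position is adjusted differently from the start position because the old form ends the key just before the position named by -key_end, while the -k form includes the last character it names.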
I hope this helps. And I really hope I don't have any typos in this post that further confuse any readers trying to figure out how sort works.