How can I sort this, first by the 2nd field and then by the 1st field?
I tried:
sort -b -k 2,2
Input:
AS11 AB1
BD34 AB10
AF12 AC2
A345 AB10
R134 AB2
456 AC10
TTT2 BD12
Desired output:
AS11 AB1
R134 AB2
A345 AB10
BD34 AB10
AF12 AC2
456 AC10
TTT2 BD12
For your sample input, you want an alphabetic sort on the first two characters of the 2nd field as the primary key, a numeric sort on the remaining characters of the 2nd field as the secondary key, and an alphanumeric sort on the 1st field as the tertiary key. For that, the following should work:
sort -k2.1b,2.2b -k2.3bn,2 -k1,1 Input
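To try that command without creating a file, here is a minimal sketch that feeds the sample data from the question through a here-document. The function name sort_by_keys and the LC_ALL=C setting are assumptions added for illustration (the C locale only makes tie-breaking reproducible); the key specification is the one given above.

```shell
#!/bin/sh
# Hypothetical wrapper around the key specification shown above.
#   -k2.1b,2.2b : characters 1-2 of field 2, leading blanks skipped (alphabetic)
#   -k2.3bn,2   : character 3 to the end of field 2, compared numerically
#   -k1,1       : field 1 as the final tie-breaker
# LC_ALL=C is an assumption made here for reproducible ordering.
sort_by_keys() {
    LC_ALL=C sort -k2.1b,2.2b -k2.3bn,2 -k1,1 "$@"
}

# Sample data copied from the question.
result=$(sort_by_keys <<'EOF'
AS11 AB1
BD34 AB10
AF12 AC2
A345 AB10
R134 AB2
456 AC10
TTT2 BD12
EOF
)

printf '%s\n' "$result"
```

Calling sort_by_keys Input instead sorts the file named Input, since any arguments are passed straight through to sort.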
If the alphabetic string at the start of the 2nd field varies in length, or if you also want to split the 1st field into alpha and numeric portions and sort them as separate keys as well, the following should work:
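#!/bin/ksh
# Temporary file named after this script plus the shell PID.
TMPF=${0##*/}.$$
awk -v tmpf="$TMPF" '
BEGIN {
    # Keys: field-2 letters, field-2 digits (numeric),
    # then field-1 letters, field-1 digits (numeric).
    sort = "sort -t, -k3,3 -k4b,4bn -k1,1 -k2b,2bn > \"" tmpf "\""
}
{
    # Split each field at its first digit and send
    # f1-alpha,f1-num,f2-alpha,f2-num,line-number to sort.
    match($1, /[0-9]/)
    printf("%s,%s,", substr($1, 1, RSTART - 1), substr($1, RSTART)) | sort
    match($2, /[0-9]/)
    printf("%s,%s,%d\n", substr($2, 1, RSTART - 1), substr($2, RSTART), NR) | sort
    l[NR] = $0    # remember the original line
}
END {
    # Closing the pipe lets sort finish writing the temporary file.
    close(sort)
    FS = ","
    # Field 5 of each sorted record is the original line number.
    while((getline < tmpf) == 1)
        print l[$5]
    close(tmpf)
}' Input
rm -f "$TMPF"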
This was written and tested using a Korn shell, but should work with any POSIX-conforming shell. If you want to try this on a Solaris/SunOS system, change awk to /usr/xpg4/bin/awk or nawk.
The above should work on any system. Some versions of sort have simpler ways of sorting alphabetic and numeric parts of individual fields and some versions of awk have built-in sort features; but since you didn't bother telling us what operating system you're using, I limited my response to more portable code.
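As one example of such a simpler, less portable route: GNU sort (coreutils) has a V key modifier, the per-key form of -V/--version-sort, which compares embedded runs of digits numerically. The sketch below assumes GNU sort is installed; the V modifier is not part of POSIX, and the function name sort_hybrid and the LC_ALL=C setting are assumptions added for illustration.

```shell
#!/bin/sh
# Assumes GNU sort (coreutils); the V modifier is a GNU extension.
#   -k2b,2V : field 2, leading blanks skipped, "version" comparison
#             (letters compared as text, digit runs compared numerically)
#   -k1,1   : field 1 as a plain tie-breaker
# sort_hybrid is a hypothetical name; LC_ALL=C keeps the order reproducible.
sort_hybrid() {
    LC_ALL=C sort -k2b,2V -k1,1 "$@"
}

# Sample data copied from the question.
printf '%s\n' 'AS11 AB1' 'BD34 AB10' 'AF12 AC2' 'A345 AB10' \
    'R134 AB2' '456 AC10' 'TTT2 BD12' | sort_hybrid
```

On the sample data this should reproduce the desired output from the question. Unlike the fixed-position keys in the first command, it does not depend on the alphabetic prefix of the 2nd field being exactly two characters long.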
Hi.
The utility msort recognizes hybrid strings, as in this example:
#!/usr/bin/env bash
# @(#) s1 Demonstrate comparing hybrid strings, msort.
# Utility functions: print-as-echo, print-line-with-visual-space, debug.
# export PATH="/usr/local/bin:/usr/bin:/bin"
LC_ALL=C ; LANG=C ; export LC_ALL LANG
pe() { for _i;do printf "%s" "$_i";done; printf "\n"; }
pl() { pe;pe "-----" ;pe "$*"; }
db() { ( printf " db, ";for _i;do printf "%s" "$_i";done;printf "\n" ) >&2 ; }
db() { : ; }
C=$HOME/bin/context && [ -f $C ] && $C msort pass-fail
FILE=${1-data1}
pl " Input data file $FILE:"
cat $FILE
pl " Expected output:"
cat expected-output.txt
pl " Results:"
msort -qj --line -n 2,2 --comparison-type hybrid -n 1,1 --comparison-type hybrid $FILE |
tee f1
pass-fail f1 expected-output.txt
exit 0
producing:
$ ./s1
Environment: LC_ALL = C, LANG = C
(Versions displayed with local utility "version")
OS, ker|rel, machine: Linux, 3.16.0-4-amd64, x86_64
Distribution : Debian 8.4 (jessie)
bash GNU bash 4.3.30
msort 8.53
pass-fail - ( local: RepRev 1.2, ~/bin/pass-fail, 2012-06-14 )
-----
Input data file data1:
AS11 AB1
BD34 AB10
AF12 AC2
A345 AB10
R134 AB2
456 AC10
TTT2 BD12
-----
Expected output:
AS11 AB1
R134 AB2
A345 AB10
BD34 AB10
AF12 AC2
456 AC10
TTT2 BD12
-----
Results:
AS11 AB1
R134 AB2
A345 AB10
BD34 AB10
AF12 AC2
456 AC10
TTT2 BD12
-----
Comparison of 7 created lines with 7 lines of desired results:
Succeeded -- files have same content.
( Note that pass-fail is a local command; replace it with cmp, diff, etc. )
The msort code was in the GNU/Debian repository, as well as in Fedora, Ubuntu, MacOS (port), to mention a few. Also at MSORT
Best wishes ... cheers, drl