Hi.
Noting that in this specific dataset the "#" is essentially the record separator, we can add a dummy record at the top and use some of the features of msort, a flexible alternative to the standard sort.
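A minimal sketch of adding the dummy record, in case it helps. The function name add_dummy and the file name data1 (standing for the original, unmodified data) are my own placeholders; data2 is the file the script below reads by default, and the dummy line is the one shown at the top of its input listing.

```shell
#!/usr/bin/env bash
# Sketch: write a copy of a data file with a dummy record on top.
# "add_dummy" and "data1" are hypothetical names used for illustration.
add_dummy() {
  # $1 = original file, $2 = new file that starts with the dummy record
  { printf '%s\n' '# junk "0 "'; cat "$1"; } > "$2"
}

# Only runs when the hypothetical original file is present.
if [ -f data1 ]; then
  add_dummy data1 data2
fi
```

The original file is left untouched; only the copy carries the extra line.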
Here is a demonstration script with results:
#!/usr/bin/env bash
# @(#) s1 Demonstrate sort of multi-line record, msort, sed
# Utility functions: print-as-echo, print-line-with-visual-space, debug.
# export PATH="/usr/local/bin:/usr/bin:/bin"
LC_ALL=C ; LANG=C ; export LC_ALL LANG
pe() { for _i;do printf "%s" "$_i";done; printf "\n"; }
pl() { pe;pe "-----" ;pe "$*"; }
em() { pe "$*" >&2 ; }
db() { ( printf " db, ";for _i;do printf "%s" "$_i";done;printf "\n" ) >&2 ; }
db() { : ; }
C=$HOME/bin/context && [ -f $C ] && $C msort sed
FILE=${1-data2}
pl " Input data file $FILE:"
head $FILE
pl " Results:"
msort -q -r "#" -d '"' -n 2,2 -c h $FILE |
tee t1 |
sed '1d;$d'
pl " Temporary file t1:"
head -20 t1
exit 0
producing:
$ ./s1
Environment: LC_ALL = C, LANG = C
(Versions displayed with local utility "version")
OS, ker|rel, machine: Linux, 3.16.0-7-amd64, x86_64
Distribution : Debian 8.11 (jessie)
bash GNU bash 4.3.30
msort 8.53
sed (GNU sed) 4.2.2
-----
Input data file data2:
# junk "0 "
#INFO id="133 texas" logo="http://tx.yy.zz
http://11.22.48
#INFO id="21 michigan" logo="http://mx.yy.zz
http://11.22.55
#INFO id="18 london" logo="http://lx.yy.zz
http://11.22.77
#INFO id="299 paris" logo="http://px.yy.zz
http://11.22.00
-----
Results:
#INFO id="18 london" logo="http://lx.yy.zz
http://11.22.77
#INFO id="21 michigan" logo="http://mx.yy.zz
http://11.22.55
#INFO id="133 texas" logo="http://tx.yy.zz
http://11.22.48
#INFO id="299 paris" logo="http://px.yy.zz
http://11.22.00
-----
Temporary file t1:
junk "0 "
#INFO id="18 london" logo="http://lx.yy.zz
http://11.22.77
#INFO id="21 michigan" logo="http://mx.yy.zz
http://11.22.55
#INFO id="133 texas" logo="http://tx.yy.zz
http://11.22.48
#INFO id="299 paris" logo="http://px.yy.zz
http://11.22.00
#
Note that the raw results (file t1) have a beginning line (the dummy record) and an ending line (a lone "#") that should be removed, which is what the sed command does. The sorting mode is hybrid, a combination of alphabetic and numeric.
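For comparison, when msort is not installed, a similar ordering can be approximated with awk, sort, and tr. This is only a sketch under assumptions: every record begins with a "#" line, the data contain no tab characters, and a plain numeric sort on the leading number of the id is enough (it is not msort's hybrid mode). The function name sort_records and the temporary sample file are my own, for illustration.

```shell
#!/usr/bin/env bash
# Sketch: sort multi-line records led by "#" on the number in id="...".
# Assumes no tab characters in the data; "sort_records" is a made-up name.
LC_ALL=C ; export LC_ALL

# Step 1 (awk):  fold each record onto one line, joining lines with a tab.
# Step 2 (sort): order numerically on the second "-delimited field,
#                for example 133 texas, which sorts as 133.
# Step 3 (tr):   unfold the records back into their original lines.
sort_records() {
  awk '
    /^#/ { if (rec != "") print rec; rec = $0; next }
         { rec = rec "\t" $0 }
    END  { if (rec != "") print rec }
  ' "$1" |
  sort -t '"' -k 2,2n |
  tr '\t' '\n'
}

# Sample data with the same shape as the records in this post.
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
cat > "$sample" <<'EOF'
#INFO id="133 texas" logo="http://tx.yy.zz
http://11.22.48
#INFO id="21 michigan" logo="http://mx.yy.zz
http://11.22.55
#INFO id="18 london" logo="http://lx.yy.zz
http://11.22.77
#INFO id="299 paris" logo="http://px.yy.zz
http://11.22.00
EOF

sort_records "$sample"
```

No dummy record is needed here, because the folding step keys on the leading "#" of each record rather than treating it as a separator.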
If you have any odd sorting requirements that the standard sort does not address, it may be useful to consider msort.
Some more information on msort:
msort sort records in complex ways (man)
Path : /usr/bin/msort
Version : 8.53
Type : ELF 64-bit LSB executable, x86-64, version 1 (SYS ...)
Help : probably available with -h,--help
Repo : Debian 8.11 (jessie)
Home : http://www.billposer.org/Software/msort.html (pm)
Best wishes ... cheers, drl