Sort by id value in every two rows

Hello,
I am running Ubuntu 14.04. I want to sort the file below by the numeric value of id, even though the id field also contains alphanumeric text inside the double quotes. The file is not tab-separated. I am not sure whether this can be done...
mydata

#INFO id="133 texas" logo="http://tx.yy.zz
http://11.22.48
#INFO id="21 michigan" logo="http://mx.yy.zz
http://11.22.55
#INFO id="18 london" logo="http://lx.yy.zz
http://11.22.77
#INFO id="299 paris" logo="http://px.yy.zz
http://11.22.00

Expected output:

#INFO id="18 london" logo="http://lx.yy.zz
http://11.22.77
#INFO id="21 michigan" logo="http://mx.yy.zz
http://11.22.55
#INFO id="133 texas" logo="http://tx.yy.zz
http://11.22.48
#INFO id="299 paris" logo="http://px.yy.zz
http://11.22.00

What I tried:

awk '{print int((NR-1)/2), $0}' mydata | sort -n -k2,2 | cut -f2- -d' ' 

Thank you
Boris

EDIT:
Hello,
I have solved it in a somewhat roundabout way, but it works now.

for i in {10..900}
do
echo "id=\"$i"
done > output

Then I used grep inside a while read -r line loop over that output file.
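
A minimal sketch of how such a loop could look, assuming every id is unique and lies between 10 and 900. The temporary directory, the file names, and the trailing space in each pattern are additions for illustration (without the space, id="13 would also match id="133), and the counter loop stands in for {10..900} so the sketch also runs under plain sh. The -A option is a GNU/BSD grep extension.

```shell
#!/bin/sh
# Sketch only: recreate the sample data in a scratch directory.
dir=$(mktemp -d)
cat > "$dir/mydata" <<'EOF'
#INFO id="133 texas" logo="http://tx.yy.zz
http://11.22.48
#INFO id="21 michigan" logo="http://mx.yy.zz
http://11.22.55
#INFO id="18 london" logo="http://lx.yy.zz
http://11.22.77
#INFO id="299 paris" logo="http://px.yy.zz
http://11.22.00
EOF

# Generate one search pattern per candidate id, in ascending order.
# The trailing space (an assumption) stops id="13 from matching id="133.
i=10
while [ "$i" -le 900 ]; do
  printf 'id="%s \n' "$i"
  i=$((i + 1))
done > "$dir/patterns"

# For each pattern, print the matching #INFO line plus the line after it.
# IFS= keeps the trailing space; grep exits non-zero for absent ids,
# which is expected here, hence the "|| true".
result=$(while IFS= read -r line; do
  grep -F -A1 -- "$line" "$dir/mydata" || true
done < "$dir/patterns")

printf '%s\n' "$result"
rm -rf "$dir"
```

Because the patterns are tried in ascending order, the records come out sorted, at the cost of one grep run per candidate id.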

Thanks
Boris

You were very close with your awk, sort, cut pipeline:

awk -F\" 'NR%2{printf "%s\t%s\t",$2,$0; next }1' mydata |
  sort -n | cut -f2- | tr '\t' '\n'
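
Broken into steps, the pipeline above can be read as follows. This is only an annotated sketch: it recreates the sample from the question in a temporary file, and the variable names are chosen for illustration.

```shell
#!/bin/sh
# Sketch only: recreate the sample data in a temporary file.
data=$(mktemp)
cat > "$data" <<'EOF'
#INFO id="133 texas" logo="http://tx.yy.zz
http://11.22.48
#INFO id="21 michigan" logo="http://mx.yy.zz
http://11.22.55
#INFO id="18 london" logo="http://lx.yy.zz
http://11.22.77
#INFO id="299 paris" logo="http://px.yy.zz
http://11.22.00
EOF

# Step 1 (awk): split on double quotes, so $2 is the id value, e.g. "133 texas".
#   Odd lines: print "<id> TAB <line> TAB" with no newline.
#   Even lines: print as-is, which completes one tab-joined record per pair.
# Step 2 (sort -n): order records by the number at the start of each line.
# Step 3 (cut -f2-): drop the leading key field again.
# Step 4 (tr): turn the remaining tab back into a newline.
result=$(awk -F'"' 'NR%2 { printf "%s\t%s\t", $2, $0; next } 1' "$data" |
  sort -n | cut -f2- | tr '\t' '\n')

printf '%s\n' "$result"
rm -f "$data"
```

The approach relies on the data containing no tab characters of its own, which matches the statement in the question that the file is not tab-separated.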

Hi.

Noting that in this specific dataset the "#" is essentially the record separator, we can add a dummy record at the top and use some of the features of msort, a flexible alternative to the standard sort.

Here is a demonstration script with results:

#!/usr/bin/env bash

# @(#) s1       Demonstrate sort of multi-line record, msort, sed

# Utility functions: print-as-echo, print-line-with-visual-space, debug.
# export PATH="/usr/local/bin:/usr/bin:/bin"
LC_ALL=C ; LANG=C ; export LC_ALL LANG
pe() { for _i;do printf "%s" "$_i";done; printf "\n"; }
pl() { pe;pe "-----" ;pe "$*"; }
em() { pe "$*" >&2 ; }
db() { ( printf " db, ";for _i;do printf "%s" "$_i";done;printf "\n" ) >&2 ; }
db() { : ; }
C=$HOME/bin/context && [ -f $C ] && $C msort sed

FILE=${1-data2}

pl " Input data file $FILE:"
head $FILE

pl " Results:"
msort -q -r "#" -d '"' -n 2,2 -c h $FILE |
tee t1 |
sed '1d;$d'

pl " Temporary file t1:"
head -20 t1

exit 0

producing:

$ ./s1

Environment: LC_ALL = C, LANG = C
(Versions displayed with local utility "version")
OS, ker|rel, machine: Linux, 3.16.0-7-amd64, x86_64
Distribution        : Debian 8.11 (jessie) 
bash GNU bash 4.3.30
msort 8.53
sed (GNU sed) 4.2.2

-----
 Input data file data2:
# junk "0 "
#INFO id="133 texas" logo="http://tx.yy.zz
http://11.22.48
#INFO id="21 michigan" logo="http://mx.yy.zz
http://11.22.55
#INFO id="18 london" logo="http://lx.yy.zz
http://11.22.77
#INFO id="299 paris" logo="http://px.yy.zz
http://11.22.00

-----
 Results:
#INFO id="18 london" logo="http://lx.yy.zz
http://11.22.77
#INFO id="21 michigan" logo="http://mx.yy.zz
http://11.22.55
#INFO id="133 texas" logo="http://tx.yy.zz
http://11.22.48
#INFO id="299 paris" logo="http://px.yy.zz
http://11.22.00

-----
 Temporary file t1:
 junk "0 "
#INFO id="18 london" logo="http://lx.yy.zz
http://11.22.77
#INFO id="21 michigan" logo="http://mx.yy.zz
http://11.22.55
#INFO id="133 texas" logo="http://tx.yy.zz
http://11.22.48
#INFO id="299 paris" logo="http://px.yy.zz
http://11.22.00
#

Note that the raw results -- file t1 -- have a leading line and a trailing line that should be removed, which is what the sed command does. The sorting mode is hybrid, a combination of alphabetic and numeric.

If you have any odd sorting requirements that the standard sort does not address, it may be useful to consider msort.

Some more information on msort:

msort   sort records in complex ways (man)
Path    : /usr/bin/msort
Version : 8.53
Type    : ELF 64-bit LSB executable, x86-64, version 1 (SYS ...)
Help    : probably available with -h,--help
Repo    : Debian 8.11 (jessie) 
Home    : http://www.billposer.org/Software/msort.html (pm)

Best wishes ... cheers, drl


Try also

$ paste -sd"\t\n" file | sort -nt\" -k2 | tr $'\t' $'\n'
#INFO id="18 london" logo="http://lx.yy.zz
http://11.22.77
#INFO id="21 michigan" logo="http://mx.yy.zz
http://11.22.55
#INFO id="133 texas" logo="http://tx.yy.zz
http://11.22.48
#INFO id="299 paris" logo="http://px.yy.zz
http://11.22.00
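
For reference, here is the same idea spelled out step by step, as a sketch against a temporary copy of the sample data. The file and variable names are for illustration only, and -k2,2n is a slight tightening of -k2 that restricts the sort key to the id field.

```shell
#!/bin/sh
# Sketch only: recreate the sample data in a temporary file.
file=$(mktemp)
cat > "$file" <<'EOF'
#INFO id="133 texas" logo="http://tx.yy.zz
http://11.22.48
#INFO id="21 michigan" logo="http://mx.yy.zz
http://11.22.55
#INFO id="18 london" logo="http://lx.yy.zz
http://11.22.77
#INFO id="299 paris" logo="http://px.yy.zz
http://11.22.00
EOF

# paste -s joins all lines serially; the delimiter list "\t\n" is cycled,
# so line 1 and 2 are joined by a tab, line 2 and 3 by a newline, and so on.
# That yields one tab-joined record per pair of input lines.
# sort then splits on the double quote and compares field 2 numerically,
# and tr restores the original two-line layout.
result=$(paste -s -d '\t\n' "$file" |
  sort -t '"' -k2,2n |
  tr '\t' '\n')

printf '%s\n' "$result"
rm -f "$file"
```

As with any approach that joins line pairs with a tab, this assumes the input itself contains no tabs.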