Sort numbers that have a colon (:) in between

varu0612 · June 6, 2012, 4:09am

Although I tried multiple options, I couldn't find a way to get the right output.

Say I have the following data:

cat file.txt
 
[1] C request
[1:1]
[1:4]
[1:2]
[1:3]
[1] C response
[2:3]
[2] C request
[2:1]
[2:2]
[2] C response

The output should look like

[1] C request
[1:1]
[1:2]
[1:3]
[1:4]
[1] C response
[2] C request
[2:1]
[2:2]
[2:3]
[2] C response

I've tried

sort -t: -k1,1n -k2,2n or sort -t: +1.1 -1.3

but no luck.

Any ideas?

Cheers.

---------- Post updated 06-06-12 at 09:09 AM ---------- Previous update was 06-05-12 at 06:37 PM ----------

Sorry - I'll remember and use it next time.

Do you have any idea on my request? Maybe I need to use awk + sort?

Thanks in advance!

guruprasadpr · June 6, 2012, 5:04am

Hi

$ sort -n file.txt | sed -n '/request/{p;s/.*//;x;p;d;}; /response/{p;n;h;T}; /request/!{1h;1!H;}'
[1] C request
[1:1]
[1:2]
[1:3]
[1:4]
[1] C response
[2] C request
[2:1]
[2:2]
[2:3]
[2] C response

Guru.

varu0612 · June 6, 2012, 7:43am

Thanks a lot!! If you don't mind, can you please explain the sed command?

Many thanks!!

ygemici · June 6, 2012, 9:01am

# awk '/req||!~res/{while($0!~/res/){a[x++]=$0;getline}};{a[l]=$0;for(cc in a)if(a[cc]~/req/){print a[cc];a[cc]="";};
for(i=0;i<x;i++){split(a[i],s,":");sub("]","",s[2]);ll=s[1];f=0;s[2]=s[2];if(s[2]==1&&!f){print a[i];f=1;xx=s[2]+1;}
if(s[2]==xx&&a[i]){print s[1]":"s[2]"]";xx++;f=1;}if(a[i]&&f!=1)n[xxx++]=s[2]}for(lll=1;lll<20;lll++){
for(i=0;i<xxx;i++)if(n[i]==(ll+lll))print ll":"n[i]"]"}print a[l];;;x=xxx=0;xx=""}' unsorted_file
[1] C request
[1:1]
[1:2]
[1:3]
[1:4]
[1] C response
[2] C request
[2:1]
[2:2]
[2:3]
[2] C response

regards
ygemici

varu0612 · June 9, 2012, 12:58pm

Hi all,

Can anyone help me understand the above sed command?

Thanks!!

drl · June 10, 2012, 7:54am

Hi.

Here is an alternate solution. The utility msort has an interesting sort-key type called hybrid. Essentially, it breaks a key field apart into sub-fields, then compares each sub-field as characters unless it is numeric, in which case it compares numerically. The header and trailer lines in this data file have no second field, so a pair of perl scripts adds one to the request and response lines before the sort and removes it afterward. After the perl add step, those lines become:

[1:0] C request
[1:32000] C response

ready for input into msort.

#!/usr/bin/env bash

# @(#) s2	Demonstrate hybrid sort, msort.
# http://freecode.com/projects/msort

pe() { for _i;do printf "%s" "$_i";done; printf "\n"; }
pl() { pe;pe "-----" ;pe "$*"; }
db() { ( printf " db, ";for _i;do printf "%s" "$_i";done;printf "\n" ) >&2 ; }
db() { : ; }
C=$HOME/bin/context && [ -f $C ] && $C msort perl

FILE=${1-data1}
pl " Input data file $FILE, and expected output:"
head -n 11 $FILE expected-output.txt
pl " Helper perl scripts:"
head p*

pl " Results:"
./p1 $FILE |
tee f1 |
msort -q -l -n 1,1 -c hybrid |
tee f2 |
./p2

exit 0

producing:

% ./s2

Environment: LC_ALL = C, LANG = C
(Versions displayed with local utility "version")
OS, ker|rel, machine: Linux, 2.6.26-2-amd64, x86_64
Distribution        : Debian GNU/Linux 5.0.8 (lenny) 
bash GNU bash 3.2.39
msort 8.44
perl 5.10.0

-----
 Input data file data1, and expected output:
==> data1 <==
[1] C request
[1:1]
[1:4]
[1:2]
[1:3]
[1] C response
[2:3]
[2] C request
[2:1]
[2:2]
[2] C response

==> expected-output.txt <==
[1] C request
[1:1]
[1:2]
[1:3]
[1:4]
[1] C response
[2] C request
[2:1]
[2:2]
[2:3]
[2] C response

-----
 Helper perl scripts:
==> p1 <==
#!/usr/bin/env perl

# @(#) p1	Augment header and trailer lines with integers.

while ( <> ) {
  s/\[(\d+)\]/[${1}:0]/ if /request/;
  s/\[(\d+)\]/[${1}:32000]/ if /response/;
  print;
}

==> p2 <==
#!/usr/bin/env perl

# @(#) p2	Remove integers from head and trailer lines.

while ( <> ) {
  s/:\d+// if /request|response/;
  print;
}

-----
 Results:
[1] C request
[1:1]
[1:2]
[1:3]
[1:4]
[1] C response
[2] C request
[2:1]
[2:2]
[2:3]
[2] C response

Files f1 and f2 show the intermediate form of the data and can be discarded for production work.

See the link in the script for msort if it is not in an accessible repository. Fedora, Debian, FreeBSD have it, for example.

Best wishes ... cheers, drl

alister · June 10, 2012, 3:42pm

With the sample data, the sort -n in guruprasadpr's command is not performing a numerical sort. Since the first character of each line is not the beginning of a numeric string, all lines compare equal, as if they all had a leading zero, and are then compared lexicographically to break the tie. A line such as [10:1] will incorrectly precede [2:1].
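
A minimal sketch of that tie-break, using two made-up lines in the thread's format. The function names are hypothetical, and LC_ALL=C is an assumption added here so the fallback comparison is byte-wise regardless of locale:

```shell
# Two illustrative lines, deliberately given in the correct numeric order.
demo_lines() { printf '%s\n' '[2:1]' '[10:1]'; }

# With the leading "[", sort -n sees no number at the start of either line,
# so both keys count as 0 and the byte-wise fallback puts "[1..." before "[2...".
demo_bracketed() { demo_lines | LC_ALL=C sort -n; }

# Once the brackets are removed, the colon-separated fields really are numeric.
demo_stripped() { demo_lines | sed 's/[][]//g' | LC_ALL=C sort -t: -k1,1n -k2,2n; }

demo_bracketed   # [10:1] comes out before [2:1]
demo_stripped    # 2:1 comes out before 10:1
```

The same reasoning explains why sort -t: -k1,1n -k2,2n fails on the raw file: the first field is "[1" rather than "1", so it has no numeric value either.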

The T branch in the sed script will always be taken -- since the n command reads a new line and so resets the substitution flag that T tests -- so it could be replaced with d (since -n is in effect), in which case the sed script would be portable (instead of GNU-specific).
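
For reference, here is a sketch of that sed script with d in place of T, annotated command by command. The function names and the LC_ALL=C setting are assumptions added for this illustration (the script relies on the byte-wise tie-break placing each group's [n:m] lines ahead of its request line); the sed commands themselves are the ones from the thread:

```shell
# The sample data from the first post.
sample_data() {
  printf '%s\n' '[1] C request' '[1:1]' '[1:4]' '[1:2]' '[1:3]' '[1] C response' \
                '[2:3]' '[2] C request' '[2:1]' '[2:2]' '[2] C response'
}

# After the sort, each group arrives as: [n:m] lines, request line, response line.
# -n                 : print nothing unless a p command asks for it
# /request/{...}     : p prints the request line; s/.*// empties the pattern space;
#                      x swaps it with the hold space (the saved [n:m] lines, leaving
#                      the hold space empty); p prints them; d starts the next cycle
# /response/{...}    : p prints the response line; n reads the next line; h copies it
#                      into the hold space; d starts the next cycle (was T)
# /request/!{1h;1!H} : on line 1 copy the line into the hold space (h); any later
#                      line that reaches this point is appended to it (H)
regroup() {
  LC_ALL=C sort -n |
  sed -n -e '/request/{p;s/.*//;x;p;d;}' \
         -e '/response/{p;n;h;d;}' \
         -e '/request/!{1h;1!H;}'
}

sample_data | regroup
```

In short, the hold space collects a group's [n:m] lines until the request line shows up, at which point the request line is printed first and the collected lines after it.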

The following is a reimplementation of drl's suggestion. Instead of using perl and msort to decorate-sort-dedecorate, it uses sed and sort:

sed '/req/s/]/:0]/; /res/s/]/:9999]/; s/[][]//g' infile |
sort -nt: -k1,1 -k2,2 |
sed '/req/s/:0//; /res/s/:9999//; s/[:[:digit:]]*/[&]/'
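
To see what each stage of that pipeline does, it can be split into three functions and run on a few lines. This is only a sketch: the function names are hypothetical and the input is supplied with printf instead of infile. The sentinel 9999 assumes no group has that many [n:m] lines.

```shell
# Stage 1, decorate: give request lines sub-key 0 and response lines sub-key 9999,
# then delete every "[" and "]" so each line starts with digits.
#   "[1] C request" -> "1:0 C request"     "[1:4]" -> "1:4"
decorate() { sed '/req/s/]/:0]/; /res/s/]/:9999]/; s/[][]//g'; }

# Stage 2, sort: numeric sort on field 1, then field 2, with ":" as the separator.
sort_keys() { sort -nt: -k1,1 -k2,2; }

# Stage 3, undecorate: drop the added sub-keys, then wrap the leading run of
# digits and colons in brackets again.
#   "1:0 C request" -> "[1] C request"     "1:4" -> "[1:4]"
undecorate() { sed '/req/s/:0//; /res/s/:9999//; s/[:[:digit:]]*/[&]/'; }

printf '%s\n' '[1:4]' '[1] C response' '[1] C request' '[1:2]' |
  decorate | sort_keys | undecorate
```

Running any prefix of the pipeline (for example decorate alone) shows the intermediate form, much like the f1 and f2 files in drl's script.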

Regards,
Alister

varu0612 · September 8, 2012, 8:47am

sed '/req/s/]/:0]/; /res/s/]/:9999]/; s/[][]//g' infile |
sort -nt: -k1,1 -k2,2 |
sed '/req/s/:0//; /res/s/:9999//; s/[:[:digit:]]*/[&]/'

Hi Alister,

If you don't mind, can you please explain your sed command and also guruprasadpr's example:

sed -n '/request/{p;s/.*//;x;p;d;}; /response/{p;n;h;T}; /request/!{1h;1!H;}'

That way I'll better understand and learn each sed option.

Thanks a lot!

varu0612 · September 11, 2012, 4:41pm

Guys,

Am I asking too much for a bit of explanation of those sed options?

I'd very much appreciate it if anyone is willing to explain the options so I can better understand the whole line and learn from it as well.

Best regards

varu0612 · September 16, 2012, 3:31am

I gave up waiting for someone to help me understand the above sed logic and options.

Thx anyway.