What's going wrong with sort?

Sort isn't behaving as I expect. Can someone tell me what's going on?
Incidentally, the infile was ordered by 'sort' previously.
Why is 101m_lig1_frag1 before 101m_lig1_frag12 but after 101m_lig1_frag11?

$infile...
101m_lig1_frag10 124 ...
101m_lig1_frag11 124 ...
101m_lig1_frag12 124 ...
101m_lig1_frag13 124 ...
101m_lig1_frag1 124 ...
101m_lig1_frag14 124 ...
101m_lig1_frag15 124 ...
101m_lig1_frag16 124 ...
101m_lig1_frag17 124 ...
101m_lig1_frag18 124 ...

100> sort $infile > $outfile

$outfile
101m_lig1_frag10 124 ...
101m_lig1_frag11 124 ...
101m_lig1_frag1 124 ...
101m_lig1_frag12 124 ...
101m_lig1_frag13 124 ...
101m_lig1_frag14 124 ...
101m_lig1_frag15 124 ...
101m_lig1_frag16 124 ...
101m_lig1_frag17 124 ...
101m_lig1_frag18 124 ...

Use sort -d infile.

Thanks, but I'm still getting the same result... weird!

daisy 1807>cat head.fps | awk '{print $1,$2,"..."}' | sort -d
101m_lig1_frag10 124 ...
101m_lig1_frag11 124 ...
101m_lig1_frag1 124 ...
101m_lig1_frag12 124 ...
101m_lig1_frag13 124 ...
101m_lig1_frag14 124 ...
101m_lig1_frag15 124 ...
101m_lig1_frag16 124 ...
101m_lig1_frag17 124 ...
101m_lig1_frag18 124 ...

Try sort -n.
It may solve your problem.

Thanks, but I've already tried it. I get the same output as above.

You can give this a try:

awk '{print substr($1,15)" "$0}' infile | sort -n | cut -d" " -f2-

Regards

This is just the normal behaviour of sort. You can't use sort -n here because the numbers are embedded in an alphanumeric string rather than standing at the start of the line.
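
For what it's worth, sort can still be pointed at the embedded number with a key definition. This is only a sketch, and it assumes every identifier has the form 101m_lig1_fragN, so that the number starts at character 5 of the third underscore-separated field. The sample lines are hypothetical: frag2 does not appear in the thread and is added only to show the difference from a plain text sort.

```shell
# Hypothetical sample data; frag2 is added to make the numeric order visible.
sample=$(mktemp)
printf '%s\n' \
  '101m_lig1_frag10 124 ...' \
  '101m_lig1_frag11 124 ...' \
  '101m_lig1_frag1 124 ...' \
  '101m_lig1_frag2 124 ...' > "$sample"

# -t _    : split each line into fields at underscores
# -k 3.5n : key starts at character 5 of field 3 (just past "frag"),
#           and is compared numerically
numeric=$(sort -t _ -k 3.5n "$sample")
rm -f "$sample"

printf '%s\n' "$numeric"
```

With this key, frag2 lands before frag10, which a plain text comparison would not give you.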

One option would be:

for each in $(sed 's/101m_lig1_frag\(.*\) 124\(.*\)/\1/' infile | sort -n); do grep  "^101m_lig1_frag$each " infile; done

Output:

Thanks for the input guys. You've given me a bit to work with.
I've decided to accept this is what sort does and have added space characters after the identifier in the list file to get the same sorting.

This surprises me, mainly because of this...

daisy 1813>cat head.fps | awk '{print $1}' | sort -d
101m_lig1_frag1
101m_lig1_frag10
101m_lig1_frag11
101m_lig1_frag12
101m_lig1_frag13
101m_lig1_frag14
101m_lig1_frag15
101m_lig1_frag16
101m_lig1_frag17
101m_lig1_frag18

Why does a space character sort between a '1' and a '2', and why can't you sort by just the first field?
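
One likely explanation, offered as an assumption since the thread never shows the locale settings: in many non-C locales (en_US.UTF-8, for example) sort compares whole lines using the locale's collation rules, which give blanks and punctuation little or no weight. The line is then compared roughly as if the space were absent, so "frag1 124" falls between "frag11 124" and "frag12 124". A minimal sketch of two possible workarounds follows. The sample lines are copied from the thread and the variable names are hypothetical.

```shell
# Sample lines taken from the thread, in their original (unsorted) order.
infile=$(mktemp)
printf '%s\n' \
  '101m_lig1_frag10 124 ...' \
  '101m_lig1_frag11 124 ...' \
  '101m_lig1_frag12 124 ...' \
  '101m_lig1_frag1 124 ...' > "$infile"

# Workaround 1: force plain byte-order comparison for this one command.
# In the C locale a space (0x20) sorts before the digit '0' (0x30).
bytewise=$(LC_ALL=C sort "$infile")

# Workaround 2: restrict the sort key to the first field only,
# so the text after the identifier no longer takes part in the comparison.
first_field=$(sort -k 1,1 "$infile")

rm -f "$infile"
printf '%s\n' "$bytewise"
```

Both variants should put 101m_lig1_frag1 ahead of 101m_lig1_frag10. The second also answers the last question: -k 1,1 is how you tell sort to use just the first field.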