Sort Oddity

All,

I'm baffled by some sort behavior on SunOS 5.10 and looking for guidance.

Running the command:

sort -t'|' -k5,5 -k7,7 -k1,1 -k2,2 -k3,3 -k6,6 -k8,8 test

Against the file "test":

thirdA||||first||second|Data
thirdB||||first||second|
thirdC||||first||second|Data
thirdD||||first||second|Data

Yields:

thirdA||||first||second|Data
thirdC||||first||second|Data
thirdD||||first||second|Data
thirdB||||first||second|

Since the first (5) and second (7) sort keys are equal, I'm expecting the differentiating factor to be the third (1) and for the file to look as it did prior to the sort; however, because thirdB's last field (8) is blank, it appears at the bottom. If I put any content in this field it sorts as expected.

I get the same results whether my locale is "C" or "en_US.ISO8859-1." Can someone explain this behavior or how to achieve the sort I'm after?

Thanks much!

Solaris had a bad sort at one point! This is old HP-UX:

 
$ sort -t'|' -k5,5 -k7,7 -k1,1 -k2,2 -k3,3 -k6,6 -k8,8 <<!
thirdA||||first||second|Data
thirdB||||first||second|
thirdC||||first||second|Data
thirdD||||first||second|Data
!
thirdA||||first||second|Data
thirdB||||first||second|
thirdC||||first||second|Data
thirdD||||first||second|Data

$

For me it works ...

# cat tst
thirdA||||first||second|Data
thirdB||||first||second|
thirdC||||first||second|Data
thirdD||||first||second|Data
# sort -t'|' -k5,5 -k7,7 -k1,1 -k2,2 -k3,3 -k6,6 -k8,8 tst
thirdA||||first||second|Data
thirdC||||first||second|Data
thirdD||||first||second|Data
thirdB||||first||second|
# uname -a
SunOS <anonymized> 5.10 Generic_127111-10 sun4us sparc FJSV,GPUZC-M
#

Are you running SPARC ?
Is your system up to date ?
By the way ... it is a good habit not to use "test" as a file name since it is a command name.


B. Solaris 10 Operating System Patch List (Solaris 10 Release Notes) - Sun Microsystems

118824-01 SunOS 5.10: patch usr/bin/sparcv9/sort


We're SPARC and apparently lacking updates. I wonder if there's a workaround? I'll research further and see what we can do about updating. Thanks!

SunOS <anonymized> 5.10 Generic_137137-09 sun4v sparc SUNW,T5240

GNU sort is part of coreutils, so you can install that, or download the fix.

I found this funny note:

http://dlc.sun.com/pdf/819-7324/819-7324.pdf

Sort Capability in the European UTF-8 Locales Does Not Function Correctly (4307314)
The sort capability in the European UTF-8 locales does not work properly.
Workaround: Before you attempt to sort in a FIGGS UTF-8 locale, set the LC_COLLATE variable to the ISO1 equivalent.
# echo $LC_COLLATE
> es_ES.UTF-8
# LC_COLLATE=es_ES.ISO8859-1
# export LC_COLLATE
Then start sorting.
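
If you'd rather not export LC_COLLATE for the whole session, the same idea can be scoped to a single sort call. A rough sketch only: sort_with_collate is a made-up helper name, not something from the Sun note, and it assumes the locale you pass is actually installed (check with locale -a). It also clears LC_ALL for that one command, because LC_ALL overrides LC_COLLATE when it is set.

```shell
#!/bin/sh
# Hypothetical helper (not from the Sun note): run sort with a given
# collation locale for one command only.
# Usage: sort_with_collate LOCALE [sort arguments...]
sort_with_collate() {
    collate=$1
    shift
    # Subshell so the caller's environment is left untouched.
    # LC_ALL, when set, overrides LC_COLLATE, so clear it here.
    (
        unset LC_ALL
        LC_COLLATE=$collate sort "$@"
    )
}
```

For example, sort_with_collate es_ES.ISO8859-1 -t'|' -k5,5 somefile, where somefile stands for whatever you are sorting.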


Thanks DG. One last inquiry here: How do I find a detailed description of the bug (so I can pass it on to our admins)? I tried searching bugs.sun.com with the two numbers in your link ("118824-01" and "6178339"), but I'm not getting any results. I've searched plenty of non-Sun bug sites with success; what am I missing?

Oracle may have mangled the internal searches; I used Google.

You need to go to the SunSolve home page and download the patch. There may be prerequisites. It would be nicer to get the patch bundles that match the state of your host.

If you have an older SPARC OS around, try copying its sort binary over.

Or, you can find SPARC GNU coreutils out there and download it, at least as a workaround.
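
Whichever replacement you try, it is worth checking it against the sample data from the first post. Here is a minimal sketch of such a check. SORT_CMD is just a variable name I picked for illustration; it defaults to whatever sort is first in your PATH, and you can point it at another binary. The check forces the C locale so that only the key handling is being tested.

```shell
#!/bin/sh
# Check whether a sort binary orders the thread's sample data as expected.
# SORT_CMD is an assumption for illustration: set it to the path of an
# alternative binary (for example a GNU sort you installed) to test that.
SORT_CMD=${SORT_CMD:-sort}

sample='thirdA||||first||second|Data
thirdB||||first||second|
thirdC||||first||second|Data
thirdD||||first||second|Data'

check_sort() {
    # Keys 5 and 7 tie on every line, so key 1 should decide the order,
    # which leaves the input unchanged.
    actual=$(printf '%s\n' "$sample" |
        LC_ALL=C "$SORT_CMD" -t'|' -k5,5 -k7,7 -k1,1 -k2,2 -k3,3 -k6,6 -k8,8)
    [ "$actual" = "$sample" ]
}

if check_sort; then
    echo "sort OK"
else
    echo "sort affected"
fi
```

Run it once as is, then again with SORT_CMD set to the candidate replacement, and compare the two results.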

Here is a description of bug ID 6178339:
Bug ID: 6178339 "sort -n" command with 0x86 as separator failed on a 2.7 G, 70 000 000 lines file (not exactly what you experienced).

Are you on OpenSolaris, not the Oracle/Sun product?

SPARC, right?

You might try other LC_COLLATE values.

The bug is reported against Solaris 9 and was introduced with sunos_2.0 (!), so it is nothing OpenSolaris specific.

I find it surprising that the sort is messed up again, as I recall it being a Solaris 8 bug. I think we substituted the Solaris 7 sort for a while. Less performance, I guess, but the data was correct!