sort -t option causing code to fail; need ASCII character as field separator

Hello,

When I run this UNIX code without the -t option it gives me the desired results.

The code keeps the record with the greatest datetime based on the key columns.
I sort it first then sort it again with the -u option, that's it.

I need to have a variable to specify an ASCII character such as tab or the unit separator as the field separator. This is ASCII 009 for tab or 031 for Unit Separator.

In this case I'm trying tab.
My file is in fact tab-delimited.

When I run the script I get the error:

sort: empty tab
sort: empty tab

unix command

s=$(printf "\009")

sort -t "$s" -k1,1 -k2,2 -k4,4 -k5,5r input.txt > sortedinput.txt

sort -t "$s" -k1,1 -k2,2 -k4,4 sortedinput.txt -u > nodups.txt

input file - input.txt

21erescca    010240    8    10    sct_det3_10_20110516_143936.txt
41erescca    010240    7    10    sct_det3_10_20110516_143936.txt
21erescca    010240    4    10    sct_det3_10_20110517_143936.txt
41erescca    010240    6    10    sct_det3_10_20110517_143936.txt
11erescca    010240    2    10    sct_det3_10_20110516_143936.txt
11erescca    010240    2    10    sct_det3_10_20110516_143936.txt
21erescca    010245    8    10    sct_det3_10_20110516_143936.txt
11erescca    010240    1    10    sct_det3_901_20110516_143936.txt
41erescca    4010240    6    10    sct_det3_10_20110517_143936.txt
11erescca    010240    06    10    sct_det3_901_20110517_143936.txt

sortedinput.txt - desired file

11erescca    010240    06    10    sct_det3_901_20110517_143936.txt
11erescca    010240    1    10    sct_det3_901_20110516_143936.txt
11erescca    010240    2    10    sct_det3_10_20110516_143936.txt
11erescca    010240    2    10    sct_det3_10_20110516_143936.txt
21erescca    010240    4    10    sct_det3_10_20110517_143936.txt
21erescca    010240    8    10    sct_det3_10_20110516_143936.txt
21erescca    010245    8    10    sct_det3_10_20110516_143936.txt
41erescca    010240    6    10    sct_det3_10_20110517_143936.txt
41erescca    010240    7    10    sct_det3_10_20110516_143936.txt
41erescca    4010240    6    10    sct_det3_10_20110517_143936.txt

output file - nodups.txt - desired file

11erescca    010240    06    10    sct_det3_901_20110517_143936.txt
21erescca    010240    4    10    sct_det3_10_20110517_143936.txt
21erescca    010245    8    10    sct_det3_10_20110516_143936.txt
41erescca    010240    6    10    sct_det3_10_20110517_143936.txt
41erescca    4010240    6    10    sct_det3_10_20110517_143936.txt

By default, sort treats whitespace as the field delimiter. In your case the fields are delimited only by tabs, so you don't have to specify a delimiter. I ran it on Solaris and it worked fine for me:

user@host> (/home/user) $ sort  -k1,1 -k2,2 -k4,4 -k5,5r test.txt >sorted.txt
user@host> (/home/user) $ sort  -k1,1 -k2,2 -k4,4  sorted.txt -u
11erescca    010240    06    10    sct_det3_901_20110517_143936.txt
21erescca    010240    4    10    sct_det3_10_20110517_143936.txt
21erescca    010245    8    10    sct_det3_10_20110516_143936.txt
41erescca    010240    6    10    sct_det3_10_20110517_143936.txt
41erescca    4010240    6    10    sct_det3_10_20110517_143936.txt
user@host> (/home/user) $ cat sorted.txt
11erescca    010240    06    10    sct_det3_901_20110517_143936.txt
11erescca    010240    1    10    sct_det3_901_20110516_143936.txt
11erescca    010240    2    10    sct_det3_10_20110516_143936.txt
11erescca    010240    2    10    sct_det3_10_20110516_143936.txt
21erescca    010240    4    10    sct_det3_10_20110517_143936.txt
21erescca    010240    8    10    sct_det3_10_20110516_143936.txt
21erescca    010245    8    10    sct_det3_10_20110516_143936.txt
41erescca    010240    6    10    sct_det3_10_20110517_143936.txt
41erescca    010240    7    10    sct_det3_10_20110516_143936.txt
41erescca    4010240    6    10    sct_det3_10_20110517_143936.txt
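
A side note on the default splitting, as a small sketch with made-up data (not the file from this thread): without -t, a field starts at any run of blanks, so a space inside a column shifts the key positions. With an explicit tab separator the keys stay where expected.

```shell
# Two tab-separated records; the first column contains a space (made-up data).
data=$(printf 'b d\t1\na c\t2')

# Default splitting: field 2 begins at the blank inside the first column,
# so the comparison is " d" against " c".
default_first=$(printf '%s\n' "$data" | LC_ALL=C sort -k2,2 | head -n 1)

# Explicit tab separator: field 2 is the second tab-delimited column,
# so the comparison is "1" against "2".
tab_first=$(printf '%s\n' "$data" | LC_ALL=C sort -t "$(printf '\011')" -k2,2 | head -n 1)

printf 'default: %s\nwith -t: %s\n' "$default_first" "$tab_first"
```

The two commands pick different first lines, which is why an explicit -t can matter even when the file is tab-delimited.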

Hi!

Thanks for testing it :)

Yes, I know it works as it is, but I will be using it with different files, and I need to be able to change the field separator through a variable.

So I need to use the -t option and specify an ASCII code. Some files use the ASCII 031 Unit Separator. Some use tab which is ASCII 009.
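
Note that 009 and 031 are decimal values, while printf backslash escapes take octal digits. As a rough sketch, a small helper can turn a decimal code into the character. The name `ascii_char` is made up for illustration and is not part of the original script.

```shell
# ascii_char (hypothetical helper): print the character for a decimal ASCII code.
# The inner printf converts decimal to 3-digit octal (9 -> 011, 31 -> 037);
# the outer printf then interprets the resulting \NNN octal escape.
ascii_char() {
    printf "\\$(printf '%03o' "$1")"
}

tab=$(ascii_char 9)    # tab
us=$(ascii_char 31)    # Unit Separator
```

Command substitution strips trailing newlines only, so this works for tab and the Unit Separator but would not capture code 10 (newline).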

ASCII 031 doesn't show up well on the forum post so I put the example for ASCII tab 009.

This is definitely a requirement for me to always specify the field delimiter as an ASCII code.
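
A possible explanation for `sort: empty tab`, offered as a sketch rather than a confirmed diagnosis: `\009` is not a valid octal escape because 9 is not an octal digit, so `$s` probably does not contain a tab. Tab is octal 011 and the Unit Separator is octal 037. The helper name `dedupe_latest` below is hypothetical; the sort keys are the ones from the original commands.

```shell
# printf escapes are octal: tab = decimal 9 = octal 011,
# Unit Separator = decimal 31 = octal 037.
tab=$(printf '\011')
us=$(printf '\037')

# dedupe_latest (hypothetical helper): $1 = separator, $2 = input file, $3 = output file.
dedupe_latest() {
    sep=$1
    # First pass: order by the key columns, newest file name (field 5) first.
    # Second pass: keep one record per key. POSIX does not say which record
    # of an equal run -u keeps; the output in this thread shows the first.
    sort -t "$sep" -k1,1 -k2,2 -k4,4 -k5,5r "$2" |
        sort -u -t "$sep" -k1,1 -k2,2 -k4,4 > "$3"
}
```

Usage would be `dedupe_latest "$tab" input.txt nodups.txt` for tab-delimited files, or `dedupe_latest "$us" input.txt nodups.txt` for files that use the Unit Separator.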