Sorting a flat file based on multiple columns (using character position)

Hi,

I have an urgent task here. I need to sort a flat file on multiple columns that are defined by character position within each line. I have to use character positions rather than space-delimited fields, so I cannot do the sorting with sort +1 +2 etc.

I understand there is a previous post similar to my problem, but in that case the sorting could be done using sort +1 +2 etc.

In my case that does not work, because a column may be made up of several words separated by spaces, so a space is not a reliable delimiter for the columns.

I need help with this, using either Unix shell scripting or awk. Thanks a lot.

Can you post some sample data here? Thanks.

Faroe Island 20 island
japan 19 airline

The above is the sample data. As you can see, the intended first column is actually Faroe Island, not just Faroe. So it is necessary to use the char position to distinguish the columns.

I would appreciate some advice on how to do the sorting. For example, I want to sort on the first column and then on the last column.

PS: Assume that 20 and 19 start at the same char position, and likewise island and airline.

cat test
Faroe Island 20 island
japan 19 test
japan 19 airline
Alpha Zulu 21 island
Alpha 121 island

Try this:
sed '
s/ \([0-9]\)/,\1/
s/\([0-9]\) /\1,/
' test | sort -t, -k1,1 -k3,3

Alpha,121,island
Alpha Zulu,21,island
Faroe Island,20,island
japan,19,airline
japan,19,test

If you want to change "," back to a space:

sed '
s/ \([0-9]\)/,\1/
s/\([0-9]\) /\1,/
' test | sort -t, -k1,1 -k3,3 | sed 's/,/ /g'

Alpha 121 island
Alpha Zulu 21 island
Faroe Island 20 island
japan 19 airline
japan 19 test

Try this (assuming the numbers 19 and 20 start at column 15):

sort -k1.15n inputfile

Jean-Pierre.

But how do you sort using 2 columns? That is, if the first column compared is a tie, then compare the other column?

In general, how do you sort (order by) multiple columns?
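
One possible way to do this purely by character position is to give sort several -k keys written as field.char offsets. The sketch below is only an illustration under assumptions that are not in the original posts: the layout (name in chars 1-14, number in chars 15-18, last column from char 19) and the helper names make_sample and sort_by_position are made up, and a tab is assumed never to occur in the data, so that -t makes the whole line a single field and the offsets count from the start of the line.

```shell
#!/bin/sh
# Hypothetical fixed-width layout (an assumption for this sketch):
#   chars  1-14   name
#   chars 15-18   number
#   chars 19-end  last column

# Build aligned sample lines; printf reuses the format for each group of 3 args.
make_sample() {
    printf '%-14s%-4s%s\n' \
        'Faroe Island' 20  island \
        'japan'        19  test \
        'japan'        19  airline \
        'Alpha Zulu'   21  island \
        'Alpha'        121 island
}

# First key: chars 1-14. Second key (used on a tie): char 19 to end of line.
# The separator is a tab, assumed absent from the data, so every line is
# one single field and 1.N means "character N of the line".
sort_by_position() {
    LC_ALL=C sort -t "$(printf '\t')" -k1.1,1.14 -k1.19
}

make_sample | sort_by_position
```

With the sample above, the two japan lines tie on the first key and are then ordered by the last column, so airline comes before test. The character offsets would need to be adjusted to the real file.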

The sed and sort pipeline posted earlier is doing just that. E.g., for input:
Faroe Island 20 island
japan 19 test
japan 19 airline
Alpha Zulu 21 island
Alpha 121 island

japan occurs twice in the first column, so the 3rd column is used to break the tie, and the output is:

Alpha 121 island
Alpha Zulu 21 island
Faroe Island 20 island
japan 19 airline
japan 19 test

Isn't that what you wanted?

I understand what you are trying to explain. But the main problem is that a column cannot be identified by spacing alone. For example, Faroe Island can be the value of one column, so I need to use char positions to tell the real columns apart.

Do you know of any way to sort multiple columns based on char position?

Maybe I am not getting what you are trying to say. Could you explain it a bit more?

Faroe Island is being treated as one column below.
Alpha,121,island
Alpha Zulu,21,island
Faroe Island,20,island
japan,19,airline
japan,19,test

I assumed that you mainly have 3 columns: Text Number Text. The columns are space separated, but because columns 1 and 3 (text) can contain spaces, you can't use a space as the delimiter. The sed code above looks for the space before and the space after column 2 (the number) and changes each to ",". It leaves the spaces inside the 1st and 3rd columns alone. Once that's done, you can sort the values using "," as the delimiter (the -t option of sort). The most important assumption is that the second column is a number and is separated by a space from the 1st and 3rd columns. It also assumes that no text column contains a digit next to a space, and that "," does not occur in the data.

I am not aware of a way to sort based on char position. If someone knows one, I would be glad to learn.
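
Since awk was mentioned as an option, here is a rough sketch of a decorate, sort, undecorate approach: awk cuts the keys out by character position with substr, sort orders on those keys, and cut strips them off again. It assumes the same made-up layout as before (name in chars 1-14, number in chars 15-18), that "|" never occurs in the data, and the function name sort_name_then_number is hypothetical. Here the tie on the name is broken by the number, compared numerically.

```shell
#!/bin/sh
# Hypothetical layout (an assumption): name = chars 1-14, number = chars 15-18.

sort_name_then_number() {
    # Decorate: prefix each line with "name|number|" taken by position.
    # "|" is assumed not to occur in the data.
    awk '{ printf "%s|%d|%s\n", substr($0, 1, 14), substr($0, 15, 4), $0 }' |
        # Sort on the name as text, then on the number numerically.
        LC_ALL=C sort -t '|' -k1,1 -k2,2n |
        # Undecorate: drop the two key fields, keep the original line.
        cut -d '|' -f 3-
}

# Aligned sample lines fed on standard input.
printf '%-14s%-4s%s\n' \
    'japan'        121 airline \
    'Faroe Island' 20  island \
    'japan'        19  test \
    'Alpha Zulu'   21  island |
    sort_name_then_number
```

With this sample, the japan line with 19 is printed before the one with 121, which a plain text comparison of the number column would get the other way round.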