I have a .txt file containing filenames with various extensions, e.g. .gz, .dat, .CTL, .xml. I want to sort the filenames by extension and delete all the names with the .xml extension.
Please help.
PS: I am using SunOS Generic_122300-60.
Here are two other methods. The first uses msort, which allows fields to be specified from the right-hand side. The other uses a quickly written perl script that reverses the characters on each line:
First, you can use the traditional sort, and this will work fine for 99% of the cases:
ls -1 | sort -t. -k 3,3 -k 2,2 -k 1,1
You tell sort to order by the 3rd dot-separated field, then the 2nd, then the 1st, and sort treats non-existent fields as empty. The only problem with this sort method is that you get this kind of weird ordering:
bar
foo
bar.zip
foo.zip
foo.bar.jpg
foo.bar.zip
bar.foo.zip
That is, names with three dot-separated fields are ordered by their third field, while names with only two are ordered by their second. So the .zip files end up split into two groups and seem out of place.
So for the best ordering, the one most likely to be expected, you make the last extension "special" by inserting a special character before the final period. Then sort, then remove the special character. You can use a path separator because it is never part of a filename.
ls -1 | sed 's/\(\.[^.]*\)$/\/\1/' | sort -t/ -k 2,2 -k 1,1 | sed 's/\/\([^/]*\)$/\1/'
I'll admit: That's ugly for the command line. It could be a bit nicer if you don't need to worry about full path-names in your list.
Postscript: On Linux, you can find the "rev" command in the util-linux suite. It prints each line of the file with its characters reversed, so you can use drl's technique in that environment:
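A minimal sketch of that rev pipeline, assuming the names arrive one per line on stdin. The name by_suffix_rev is made up here, and LC_ALL=C is my addition to keep the ordering byte-wise and predictable.

```shell
# Reverse each line, sort, and reverse back again.
# by_suffix_rev is a hypothetical helper name.
by_suffix_rev() {
    rev | LC_ALL=C sort | rev
}

printf '%s\n' foo.zip bar foo.bar.jpg bar.zip | by_suffix_rev
# prints:
#   foo.bar.jpg
#   foo.zip
#   bar.zip
#   bar
```

Note where the name without an extension lands: it is sorted by its own last letters, so it is not kept apart from the rest.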
Not sure what ls there is on Solaris, but GNU ls has an -X option that does exactly that -- sorts by extension.
Edit: Solaris ls doesn't support -X. Never mind. How about this:
Prepend with the extension, sort on it, and then take it out (DSU = decorate-sort-undecorate):
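One way the decorate-sort-undecorate idea could look, as a sketch rather than the code originally posted. It assumes the names contain no tab characters, since a tab is used as the temporary separator. The name sort_by_ext is hypothetical, and on Solaris you may need nawk or /usr/xpg4/bin/awk instead of plain awk.

```shell
# Decorate: put the extension (empty if there is none) in front of each
# name, separated by a tab. Sort on the extension, then on the full name.
# Undecorate: cut the extension column off again.
sort_by_ext() {
    awk '{
        ext = ""
        n = split($0, part, ".")
        if (n > 1) ext = part[n]   # text after the last dot
        print ext "\t" $0
    }' |
        LC_ALL=C sort -t "$(printf '\t')" -k 1,1 -k 2 |
        cut -f 2-
}

printf '%s\n' bar foo bar.zip foo.zip foo.bar.jpg foo.bar.zip bar.foo.zip |
    sort_by_ext
```

Names without an extension get an empty key, so they sort together at the top, and all the .zip names stay in one group regardless of how many dots they contain.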
Both rev commands also reverse the suffixes, so the suffixes are compared from their last character backwards and do not come out in alphabetical order.
A different strategy would be to prepend each name with its suffix and a dot (or just a space and a dot if there is no suffix), sort, and remove the prefix afterwards. The sort would still need -t. (dot as the field separator).
Another option might be just to list the suffixes, if they are not too many:
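A rough sketch of that idea, assuming the names are stored one per line in a file and that the suffixes contain no regex metacharacters. The function name list_by_suffix is made up for illustration. Any suffix you leave out of the argument list (for example xml) is simply not printed, and so are names without a suffix.

```shell
# list_by_suffix FILE EXT...
# Print the names in FILE group by group, in the order the suffixes are given.
# Hypothetical helper; LC_ALL=C keeps the order inside each group byte-wise.
list_by_suffix() {
    file=$1
    shift
    for ext in "$@"; do
        grep "\.$ext\$" "$file" | LC_ALL=C sort
    done
}

# Example call (names.txt is a placeholder file name):
# list_by_suffix names.txt CTL dat gz
```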
So far I like those 2 solutions the best for utilizing standard tools, at least on the data files posted so far in this thread. The msort solution is a single-command (but non-standard) solution: the ability to specify fields from the right-hand-side is invaluable in this situation.
That is true; however, my impression was that the OP desired grouping rather than strict sorting. In that case the revs work, except when there are no extensions: the files without a suffix are not put in a group by themselves. The possibility of more than one dot does complicate the issue, and I'm glad that it was raised.
After some thought, a better re function for my script is:
Further to this, any sort must have the suffix as the primary sort key and, like drl suggests, have provisions for files without extensions, or these will be all over the place. Something like this, perhaps:
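As a sketch only (not the code originally posted), here is one way to cover the whole task with standard tools: drop the .xml names, sort with the suffix as the primary key and the name as the secondary key, and keep names without a suffix in a group of their own. It assumes the file holds bare filenames with no "/" in them, since "/" is used as the temporary separator. The name sort_names_file is hypothetical.

```shell
# sort_names_file FILE
# Print FILE's names without the .xml entries, ordered by suffix, then name.
sort_names_file() {
    grep -v '\.xml$' "$1" |
        # Decorate: "/name" when there is no dot, "suffix/name" otherwise.
        sed -e 's|^[^.]*$|/&|' -e 's|^.*\.\([^.]*\)$|\1/&|' |
        LC_ALL=C sort -t / -k 1,1 -k 2 |
        # Undecorate: strip everything up to the first slash.
        sed 's|^[^/]*/||'
}

# Example call (names.txt is a placeholder file name):
# sort_names_file names.txt > sorted.txt
```

With LC_ALL=C, uppercase suffixes such as CTL sort before lowercase ones. Drop that setting if you prefer your locale's collation.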