Need immediate help with sorting!!!

sasuke_uchiha · August 5, 2008, 1:18pm

Hey,
I have a file that looks something like this:
/--- abcd_0050 ---/
asdfjk
adsfkja
lkjljgafsd

/---abcd_0005 ---/
lkjkljbfkgj
ldfksjgf
dfkgfjb

/-- abcd_0055--/
klhfdghd
dflkjgd
jfdg

I would like it to be sorted so that it looks like this:
/---abcd_0005 ---/
lkjkljbfkgj
ldfksjgf
dfkgfjb

/--- abcd_0050 ---/
asdfjk
adsfkja
lkjljgafsd

/-- abcd_0055--/
klhfdghd
dflkjgd
jfdg

Basically, it sorts according to the numbers in the comment lines. What commands do I execute in the shell to get this?

pvr_satya · August 5, 2008, 1:21pm

Use:
sort filename

For help, see man sort.

sasuke_uchiha · August 5, 2008, 1:46pm

Well, I did use that. This is what I typed in the shell:
grep 'abcd' filename | sort -n

But the problem is that it sorts and prints only those 3 comment lines (the headers) and none of the stuff under them. How do I get it to print everything after sorting on those 3 headers?

joeyg · August 5, 2008, 1:56pm

I had to 'fix' your input file (the spacing and the number of - characters). I figured you intended the header lines to have a regular format.

> cat file1
/*--- abcd_0050 ---*/
asdfjk
adsfkja
lkjljgafsd

/*--- abcd_0005 ---*/
lkjkljbfkgj
ldfksjgf
dfkgfjb

/*--- abcd_0055 ---*/
klhfdghd
dflkjgd
jfdg

> cat file1 | sed 's/\/\*/#\/\*/' | tr "\n" "~" | tr "#" "\n" | sort | tr -s "~" | tr "~" "\n"

/*--- abcd_0005 ---*/
lkjkljbfkgj
ldfksjgf
dfkgfjb

/*--- abcd_0050 ---*/
asdfjk
adsfkja
lkjljgafsd

/*--- abcd_0055 ---*/
klhfdghd
dflkjgd
jfdg
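
The header clean-up mentioned at the top of this post could probably be scripted rather than done by hand. A minimal sketch, assuming every header is a slash, some dashes, a name_number tag, more dashes and a closing slash, and that no body line has that shape. The function name normalize_headers is made up for illustration:

```shell
# normalize_headers FILE: rewrite irregular headers such as "/---abcd_0005 ---/"
# into the regular form "/*--- abcd_0005 ---*/". Other lines pass through unchanged.
normalize_headers() {
    sed 's|^/-*[[:space:]]*\([A-Za-z]*_[0-9]*\)[[:space:]]*-*/$|/*--- \1 ---*/|' "$1"
}
```

Usage would be something like: normalize_headers filename > file1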

sasuke_uchiha · August 5, 2008, 3:02pm

That's pretty confusing; what's with so many 'tr' commands? Is this the only way, or are there other ways to get my desired output?

drl · August 5, 2008, 3:57pm

Hi.

For folks who need more features than standard sort provides, there is msort.

Here is a sample run with msort:

#!/bin/sh -

# @(#) s1       Demonstrate msort.

# See:
# http://billposer.org/Software/msort.html

set +o nounset
LC_ALL=C ; LANG=C ; export LC_ALL LANG
echo "Environment: LC_ALL = $LC_ALL, LANG = $LANG"
echo "(Versions displayed with local utility \"version\")"
version >/dev/null 2>&1 && version =o $(_eat $0 $1) msort
set -o nounset

FILE=${1-data1}
echo
echo " Data file $FILE:"
echo "12345678901234567890"
cat $FILE

echo
msort -b -e12,15 -c numeric -2 results -1 $FILE

echo
echo " Results from msort:"
echo
cat results

exit 0

producing (using joeyg's amended dataset on file data1):

$ ./s1
Environment: LC_ALL = C, LANG = C
(Versions displayed with local utility "version")
Linux 2.6.25-2-686
GNU bash, version 3.2.39(1)-release (i486-pc-linux-gnu)
msort - no version provided for /usr/bin/msort.

 Data file data1:
12345678901234567890
/*--- abcd_0050 ---*/
asdfjk
adsfkja
lkjljgafsd

/*--- abcd_0005 ---*/
lkjkljbfkgj
ldfksjgf
dfkgfjb

/*--- abcd_0055 ---*/
klhfdghd
dflkjgd
jfdg

Key 1 obligatory  character range 11 through 14 Increasing numeric
Reading from data1.
Records processed:                          3
Sorting...
Records written:                            3
Comparisons:                                2

 Results from msort:

/*--- abcd_0005 ---*/
lkjkljbfkgj
ldfksjgf
dfkgfjb

/*--- abcd_0050 ---*/
asdfjk
adsfkja
lkjljgafsd

/*--- abcd_0055 ---*/
klhfdghd
dflkjgd
jfdg

This illustrates the CLI use. There is also a GUI front-end available ... cheers, drl

joeyg · August 5, 2008, 4:06pm

> cat file1 | sed 's/\/\*/#\/\*/' | tr "\n" "~" | tr "#" "\n" | sort | tr -s "~" | tr "~" "\n"

display the file
change /* to #/* (to find later)
translate <new-line> to a ~
translate the # to a <new-line> (now, each record on one line only instead of 4)
sort
translate to get rid of (suppress) repetitive ~ characters
translate ~ to <new-line> (to put line back to four lines)

I often take multi-line input and tr the <new-line> characters to ~ so I can treat each multi-line record as a single line. This is useful for sort and grep, for example. But you must place some kind of marker on the lines so you can later restore them to their original state.
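
The same flatten, sort, unflatten idea can also be sketched with awk doing the flattening, which allows a numeric sort on the header number and should cope with the irregular headers from the original post. This assumes blocks are separated by blank lines, each block starts with its header line, and no line contains a tab or a ~ character. The name sort_blocks is hypothetical:

```shell
# sort_blocks FILE: sort blank-line-separated blocks by the number in each header.
sort_blocks() {
    awk '
        BEGIN { RS = ""; FS = "\n" }      # paragraph mode: one record per block
        {
            key = $1                      # the header is the first line of the block
            gsub(/[^0-9]/, "", key)       # keep only its digits, e.g. "0050"
            gsub(/\n/, "~")               # flatten the block onto a single line
            printf "%s\t%s\n", key, $0    # prefix the sort key, tab-separated
        }
    ' "$1" |
    sort -n |                             # numeric sort on the leading key
    cut -f2- |                            # drop the key again
    awk '{ gsub(/~/, "\n"); print; print "" }'   # restore newlines, blank line after each block
}
```

Usage would be something like: sort_blocks filename > filename.sorted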

sudhamacs · August 5, 2008, 4:48pm

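Assuming that you want to sort on the numbers in the /*-- header lines, and that the paragraphs below them don't contain numbers:

# -F makes grep treat /*-- as a literal string rather than a regular expression
for i in $(grep -F '/*--' file | sed 's/[^0-9]//g' | sort -n)
do
# print the paragraph whose text contains the current number
sed -e '/./{H;$!d;}' -e "x;/$i/!d;" file
done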

summer_cherry · August 5, 2008, 11:07pm

perl:

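use strict;
use warnings;

my (%hash, $key);
open(my $fh, '<', 'yourfilename') or die "Cannot open yourfilename: $!";
while (<$fh>) {
	# A header line such as "/*--- abcd_0050 ---*/" starts a new block.
	if (m{/\*---}) {
		my @arr = split(" ", $_);
		$key = substr($arr[1], 5);	# "abcd_0050" -> "0050"
	}
	# Append the line to the block of the most recent header.
	$hash{$key} .= $_ if defined $key;
}
close($fh);
# Compare keys numerically so the order does not depend on zero-padding.
for my $k (sort { $a <=> $b } keys %hash) {
	print $hash{$k};
}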