pchang
July 14, 2011, 10:23am
1
Hi forum.
I'm hoping someone can help me out with this problem. I tried to search online but couldn't come up with an exact solution.
I have the following data file:
H|20-May-2011|MF_FF.dat|77164|731374590.96|1|1|731374590.96|76586|77164|578|2988|Y
H|20-May-2011|MF_FF.dat|77164|731374590.96|1|4|731374590.96|76586|77164|578|2988|Y
H|21-Jul-2011|MF_FF.dat|77178|731374590.96|2|4|731374590.96|76600|77178|578|3002|Y
H|21-Jul-2011|MF_FF.dat|77178|731374590.96|2|4|731374590.96|76600|77178|578|3002|Y
H|21-May-2011|MF_FF.dat|77178|731374590.96|2|4|731374590.96|76600|77178|578|3002|Y
H|22-Aug-2011|MF_FF.dat|77195|731374590.96|3|4|731374590.96|76617|77195|578|3019|Y
H|22-May-2011|MF_FF.dat|77195|731374590.96|3|4|731374590.96|76617|77195|578|3019|Y
H|26-Mar-2011|MF_FF.dat|76444|713606875.00|1|1|713606875.00|75822|76444|622|3310|Y
I need to sort the file by the 2nd field in chronological order.
Thanks
panyam
July 14, 2011, 11:05am
3
sort -t"|" -k 2.8,2.11 -k 2.4,2.6M -k 2.1,2.2n
---------- Post updated at 08:05 AM ---------- Previous update was at 08:03 AM ----------
bartus11:
Try:
sort -Mt"-" -k2,2 file
Hello Bartus11,
It won't work if the data have "cahnge in year".
ex:
cat input_file
H|21-May-2011|MF_FF.dat|77178|731374590.96|2|4|731374590.96|76600|77178|578|3002|Y
H|21-Jul-2010|MF_FF.dat|77178|731374590.96|2|4|731374590.96|76600|77178|578|3002|Y
sort -Mt"-" -k2,2 input_file
H|21-May-2011|MF_FF.dat|77178|731374590.96|2|4|731374590.96|76600|77178|578|3002|Y
H|21-Jul-2010|MF_FF.dat|77178|731374590.96|2|4|731374590.96|76600|77178|578|3002|Y
pchang
July 14, 2011, 11:10am
4
Thanks guys for your quick responses.
My sort utility doesn't somehow recognize the "M" option.
sort -t"|" -k 2.8,2.11 -k 2.4,2.6M -k 2.1,2.2n test_file
Usage: sort [-Abcdfimnru] [-T Directory] [-t Character] [-o File]
[-y[Kilobytes]] [-z Recordsize] [-k Keydefinition]...
[[+Position1][-Position2]]... [File]...
Any other way to resolve this?
What operating system are you on?
pchang
July 14, 2011, 11:38am
6
AIX newdwdev 3 5 0007DA12D900
Write your own sort in awk using the second field as the key unless you know perl which has a builting sort function...
Assuming that the dates in the 2nd column do not repeat in the file, you could do something like this in Perl -
$
$ # show the content of the data file "f7"
$
$ cat -n f7
1 H|20-May-2011|MF_FF.dat|77164|731374590.96|1|1|731374590.96|76586|77164|578|2988|Y
2 H|21-Jul-2011|MF_FF.dat|77178|731374590.96|2|4|731374590.96|76600|77178|578|3002|Y
3 H|21-May-2011|MF_FF.dat|77178|731374590.96|2|4|731374590.96|76600|77178|578|3002|Y
4 H|22-Aug-2011|MF_FF.dat|77195|731374590.96|3|4|731374590.96|76617|77195|578|3019|Y
5 H|22-May-2011|MF_FF.dat|77195|731374590.96|3|4|731374590.96|76617|77195|578|3019|Y
6 H|26-Mar-2011|MF_FF.dat|76444|713606875.00|1|1|713606875.00|75822|76444|622|3310|Y
7 H|31-Oct-2009|MF_FF.dat|76444|713606875.00|1|1|713606875.00|75822|76444|622|3310|Y
8 H|07-Dec-2010|MF_FF.dat|76444|713606875.00|1|1|713606875.00|75822|76444|622|3310|Y
$
$ # run the Perl program on the data file "f7"
$
$ perl -F"\|" -lane 'BEGIN {%x = qw(Jan 01 Feb 02 Mar 03 Apr 04 May 05 Jun 06
Jul 07 Aug 08 Sep 09 Oct 10 Nov 11 Dec 12)}
($d,$m,$y) = unpack("A2xA3xA4", $F[1]);
$data{$y.$x{$m}.$d} = $_;
END {foreach $k (sort keys %data) {print $data{$k}}}
' f7
H|31-Oct-2009|MF_FF.dat|76444|713606875.00|1|1|713606875.00|75822|76444|622|3310|Y
H|07-Dec-2010|MF_FF.dat|76444|713606875.00|1|1|713606875.00|75822|76444|622|3310|Y
H|26-Mar-2011|MF_FF.dat|76444|713606875.00|1|1|713606875.00|75822|76444|622|3310|Y
H|20-May-2011|MF_FF.dat|77164|731374590.96|1|1|731374590.96|76586|77164|578|2988|Y
H|21-May-2011|MF_FF.dat|77178|731374590.96|2|4|731374590.96|76600|77178|578|3002|Y
H|22-May-2011|MF_FF.dat|77195|731374590.96|3|4|731374590.96|76617|77195|578|3019|Y
H|21-Jul-2011|MF_FF.dat|77178|731374590.96|2|4|731374590.96|76600|77178|578|3002|Y
H|22-Aug-2011|MF_FF.dat|77195|731374590.96|3|4|731374590.96|76617|77195|578|3019|Y
$
$
tyler_durden
And here's the awk version...
awk -F\| '{
n=split($2,a,"-")
for (i=n; i; i--) {
if (a=="Jan") a="01"
else if (a=="Feb") a="02"
else if (a=="Mar") a="03"
else if (a=="Apr") a="04"
else if (a=="May") a="05"
else if (a=="Jun") a="06"
else if (a=="Jul") a="07"
else if (a=="Aug") a="08"
else if (a=="Sep") a="09"
else if (a=="Oct") a="10"
else if (a=="Nov") a="11"
else if (a=="Dec") a="12"
s = sprintf("%s", a?s""a:a)
}
if (s in c) x=sprintf("%s",x"\n"$0)
else {c = s; b[NR] = s; x = $0}
s=""
} END {
for (i=1; i<NR; i++)
for (j=1; j<=NR-i; j++)
if (b[j] > b[j+1]) {
t = b[j]
b[j] = b[j+1]
b[j+1] = t
}
for (i=1; i<=NR; i++)
if (x[b]) print x
[b]}' file
1 Like