Sort Issue

These are two records (adr.txt) that I am trying to sort with the code below.
5423|336110|2730 Pierce St|Ste 300|Sioux City|IA|Woodbury|51104|3765||42518651|96405013|A|2|3|U|12/08/2009
5423|14462335|624 JONES ST|STE 5400|Sioux City|IA|Woodbury|51101|||42496648|96400644|A|8|2|U |12/24/2009

nawk -F'|' '{n=split($NF, a, "[ /]"); gsub(":", "", a[n]);printf("%d%02d%02d%s%s\n", \
a[3], a[1], a[2], a[n], OFS $0)}' OFS='|' adr.txt | sort -t '|' -k2n,2 -k17,17 -k16n,16 \
-k15n,15r -k1n,1r -k3n,3r | cut -d '|' -f2-

The result is the reverse of what I expect: the bottom record should come out on top.
The issue is that the last-but-one field is not consistent: it has a space after "U" in one row and not in the other. When both rows have the space, or neither does, the sort above works fine.
I just want to ignore the space in that field while sorting. Any ideas, please?
Thanks

Hi

nawk -F'|' '{n=split($NF, a, "[ /]"); gsub(":", "", a[n]);printf("%d%02d%02d%s%s\n", a[3], a[1], a[2], a[n], OFS $0)}' OFS='|' adr.txt \
| sort -t '|' -k2n,2 -k17.1,17.1 -k16n,16 -k15n,15r -k1n,1r -k3n,3r \
| cut -d '|' -f2-

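This assumes the column containing your "U" values is column 17 of the sort input (column 16 of adr.txt, shifted by one because nawk prepends a key). The key -k17.1,17.1 compares only the first character of that field, so a trailing space no longer matters.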
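To see why the character-position key helps, here is a minimal sketch on two made-up three-column rows (not the real adr.txt layout). It assumes a sort that follows the POSIX key syntax and sets LC_ALL=C so the byte-wise comparison is predictable; the variable names are only for illustration.

```shell
# Two hypothetical rows: field 2 is "U" in one row and "U " in the other.
rows='b|U|2
a|U |1'

# Whole-field key: "U" and "U " compare as different, so field 3 is never consulted.
whole=$(printf '%s\n' "$rows" | LC_ALL=C sort -t '|' -k2,2 -k3n,3)

# First-character key: both rows tie on "U", so the numeric field 3 decides.
first=$(printf '%s\n' "$rows" | LC_ALL=C sort -t '|' -k2.1,2.1 -k3n,3)

printf 'whole-field key:\n%s\n' "$whole"
printf 'first-character key:\n%s\n' "$first"
```

With the whole-field key the "b" row stays first; with the first-character key the "a" row moves to the top because its third field is smaller.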

Alternatively, use sed to strip the unnecessary whitespace before sorting.

I also noticed that your awk builds a first column like this:

200908142009|...

It won't do any damage, but perhaps you meant to build

20090814|...

instead?
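If the shorter key is what you want, one possible sketch is below. It rests on a guess: that the a[n] term was meant for an optional time-of-day part after the date, so it is only appended when the split yields more than three pieces. date_key is a hypothetical helper name, and plain awk is used here; nawk should behave the same.

```shell
# date_key: hypothetical helper; prints YYYYMMDD, plus HHMMSS when a time is present.
date_key() {
  printf '%s\n' "$1" | awk '{
    n = split($0, a, "[ /]")          # MM, DD, YYYY and optionally HH:MM:SS
    t = (n > 3) ? a[n] : ""           # only use a[n] when it is not the year
    gsub(":", "", t)                  # drop the colons from the time part
    printf("%d%02d%02d%s\n", a[3], a[1], a[2], t)
  }'
}

date_key '12/08/2009'            # prints 20091208
date_key '12/08/2009 14:30:00'   # prints 20091208143000
```

Either form sorts correctly as a number, so this is only about keeping the key tidy.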

BRgds
JG

JG,

It works absolutely fine...

Just one more issue, if you have a few minutes...

As you said, the 17th field is now sorted using 17.1 as you suggested, which works fine.

I don't want to take any chances with the space issue in the other fields either.

For example, the value in the 16th field can be anywhere from 1 to 99. I want to consider only the digits, i.e. one character for a single-digit value and two for a double-digit value, ignoring any surrounding spaces. I can't find any documentation on this.

Please let me know if you have any ideas. I hope I am clear.

Thanks

Hi

I would strip all unnecessary whitespace

sed -e "s/[ ]*|[ ]*/|/g" -e "s/^[ ]*//" -e "s/[ ]*$//"

and change the sort key for column 17 back to what you had in the first place, giving you something like (untested):

sed -e "s/[ ]*|[ ]*/|/g" -e "s/^[ ]*//" -e "s/[ ]*$//" adr.txt \
| nawk -F'|' '{n=split($NF, a, "[ /]"); gsub(":", "", a[n]);printf("%d%02d%02d%s%s\n", a[3], a[1], a[2], a[n], OFS $0)}' OFS='|' \
| sort -t '|' -k2n,2 -k17,17 -k16n,16 -k15n,15r -k1n,1r -k3n,3r \
| cut -d '|' -f2-
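As a quick check of the whitespace stripping on its own, here is a small sketch with two shortened, made-up rows rather than the full adr.txt records. strip_ws is a hypothetical wrapper around the same three sed expressions, and LC_ALL=C is set only to keep the text comparison predictable.

```shell
# strip_ws: the sed expressions from above, wrapped in a hypothetical helper.
strip_ws() {
  sed -e 's/[ ]*|[ ]*/|/g' -e 's/^[ ]*//' -e 's/[ ]*$//'
}

# Made-up rows: field 2 stands in for the numeric column, field 3 for the "U" column.
rows='a|12|U 
b| 7 |U'

# Remove blanks around every "|" and at both ends of each line.
cleaned=$(printf '%s\n' "$rows" | strip_ws)

# After cleanup both rows tie on "U", so the numeric field decides the order.
sorted=$(printf '%s\n' "$cleaned" | LC_ALL=C sort -t '|' -k3,3 -k2n,2)

printf '%s\n' "$sorted"
```

The numeric keys (-k16n,16 and the like) should already tolerate padding, since sort -n skips leading blanks and stops reading at the first non-numeric character; the cleanup mainly matters for text keys such as column 17.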

Hope this helps
JG