Hi,
Could someone please help me with this? I have a text file that has fields seperated by comma. The last column in it has multiple values seperated by "|". I need to sort values in the last column seperated by pipe..is there any way I can do this through script?
Sample text file -
TEST1,ABC,5|4|1|3|2
TEST2,DEF,10|7|25|11|52|19
TEST3,XYZ,5|11|52|1
The output should generate -
TEST1,ABC,1|2|3|4|5
TEST2,DEF,7|10|11|19|25|52
TEST3,XYZ,1|5|11|52
How can I achieve this?
---------- Post updated at 04:47 PM ---------- Previous update was at 03:46 PM ----------
Could anyone help please??
SFNYC
November 3, 2009, 5:39pm
2
Try this:
$ cat ./sort_fields.ksh
#!/bin/ksh
sort_array()
{
set -A LOCALARRAY $1
LOCALARRAY_COUNT=${#LOCALARRAY[*]}
IDX=0
TEMP=0
while (( $IDX < $LOCALARRAY_COUNT ))
do
IDX2=`expr $IDX + 1`
while (( $IDX2 < $LOCALARRAY_COUNT ))
do
if (( ${LOCALARRAY[$IDX]} > ${LOCALARRAY[$IDX2]} ))
then
TEMP=${LOCALARRAY[$IDX]}
LOCALARRAY[$IDX]=${LOCALARRAY[$IDX2]}
LOCALARRAY[$IDX2]=$TEMP
fi
IDX2=`expr $IDX2 + 1`
done
IDX=`expr $IDX + 1`
done
print "${LOCALARRAY[*]}"
}
while read LINE
do
LINEPT1=${LINE%,*}
set -A FIELDS $(echo $LINE | cut -d',' -f3 |cut -d'|' --output-delimiter=' ' -f1-)
LINEPT2=$(sort_array "${FIELDS[*]}"|cut -d' ' --output-delimiter='|' -f1-)
print "${LINEPT1},${LINEPT2}"
done < test.dat
exit 0
$ cat ./test.dat
TEST1,ABC,5|4|1|3|2
TEST2,DEF,10|7|25|11|52|19
TEST3,XYZ,5|11|52|1
$ ./sort_fields.ksh
TEST1,ABC,1|2|3|4|5
TEST2,DEF,7|10|11|19|25|52
TEST3,XYZ,1|5|11|52
I still consider myself to definitely be a beginner with awk but your question intrigued me so I worked at it a bit. The following doesn't meet 100% of your requirement but I just can't figure it out from this point.
$ awk -F, '{split($3,a,"|"); n=asort(a,b); for (i=1; i<=n; i++) sort=sort "|" b; print $1 "," $2 "," sort}' file
input
TEST1,ABC,5|4|1|3|2
TEST2,DEF,10|7|25|11|52|19
TEST3,XYZ,5|11|52|1
output
TEST1,ABC,|1|2|3|4|5
TEST2,DEF,|1|2|3|4|5|7|10|11|19|25|52
TEST3,XYZ,|1|2|3|4|5|7|10|11|19|25|52|1|5|11|52
As you can see the problem I'm having is getting the for loop to only include the parts of the array relevant to the current line.
I'm scratching my head for now, but I know I'll figure it out soon
Should be a simple solution, but stop to merge the lines. Anyone can help to continue it?
$ awk -F[\,\|] '{for(i=3;i<=NF;i++) {print $1,$2,$i}}' urfile |sort -k1,1n -k3,3n
TEST1 ABC 1
TEST1 ABC 2
TEST1 ABC 3
TEST1 ABC 4
TEST1 ABC 5
TEST2 DEF 7
TEST2 DEF 10
TEST2 DEF 11
TEST2 DEF 19
TEST2 DEF 25
TEST2 DEF 52
TEST3 XYZ 1
TEST3 XYZ 5
TEST3 XYZ 11
TEST3 XYZ 52
then I need merge the lines which $1 is same.
sortbars() {
echo "$1"|xargs -d'|' -n1|sort -n|xargs|tr ' ' '|'
}
while IFS=, read cola colb colc; do
echo "$cola,$colb,$(sortbars $colc)"
done < infile
TEST1,ABC,1|2|3|4|5
TEST2,DEF,7|10|11|19|25|52
TEST3,XYZ,1|5|11|52
Alternatively:
sortbars() {
IFS='|'; for i in $1; do
echo $i
done|sort -n|xargs|tr ' ' '|'
}
perl:
while(<DATA>){
chomp;
my @tmp=split(",",$_);
my @t1=split("[|]",$tmp[2]);
$tmp[2]=join "|", sort {$a<=>$b} @t1;
print join ",",@tmp;
print "\n";
}
__DATA__
TEST1,ABC,5|4|1|3|2
TEST2,DEF,10|7|25|11|52|19
TEST3,XYZ,5|11|52|1
Thank you all for awesome suggestions!
rdcwayx
November 15, 2009, 10:21pm
8
mkastin:
$ awk -F, '{split($3,a,"|"); n=asort(a,b); for (i=1; i<=n; i++) sort=sort "|" b; print $1 "," $2 "," sort}' file
output
TEST1,ABC,|1|2|3|4|5
TEST2,DEF,|1|2|3|4|5|7|10|11|19|25|52
TEST3,XYZ,|1|2|3|4|5|7|10|11|19|25|52|1|5|11|52
I think I fix your code.
awk -F, '{sort=""} {split($3,a,"|"); n=asort(a,b); for (i=1; i<=n; i++) sort=sort "|" b; print $1 "," $2 "," sort}' urfile
TEST1,ABC,|1|2|3|4|5
TEST2,DEF,|7|10|11|19|25|52
TEST3,XYZ,|1|5|11|52
Below is the code, which I try to make it shorter
$ awk -F, '{printf "\n"$1","$2","} {split($3,a,"|"); n=asort(a); for (i=1;i<=n;i++) printf "|"a} ' urfile
TEST1,ABC,|1|2|3|4|5
TEST2,DEF,|7|10|11|19|25|52
TEST3,XYZ,|1|5|11|52