Need help in writing a routine for sorting a CSV file

Hi,

I have a CSV file in the following format:

server1,env1,patch1
server1,env1,patch2
server1,env1,patch3
server1,env2,patch1
server1,env2,patch3
server2,env3,patch1
server2,env3,patch5
server2,env4,patch1
server3,env6,patch1
server3,env7,patch2
server3,env7,patch3

I want to sort this input by server and then execute the script, something like below:

sort these values and run the script
server1,env1,patch1
server1,env1,patch2
server1,env1,patch3
server1,env2,patch1

Once above operation is done

sort these values and run the script
server2,env3,patch1
server2,env3,patch5
server2,env4,patch1

followed by

sort these values and run the script:
server3,env6,patch1
server3,env7,patch2
server3,env7,patch3

The script that I am trying to execute has built-in logic written to take these sorted values as input and perform the operation.

Any help here would be really appreciated.

Thanks in advance,
Avikal Jain

Not sure if this is what you were asking for:

cut -f1 -d"," your_file.csv | sort -u | while read server
do
     grep "^${server}," your_file.csv | your_inbuilt_script.sh
done
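To see what this loop does, here is a self-contained sketch (the file name is a placeholder, and a line count stands in for the real downstream script): it extracts the unique server names, then pulls each server's rows back out of the file as one group.

```shell
# Hypothetical demo: build a small CSV, then for each unique server
# print how many rows belong to it instead of piping them to the script.
cat > /tmp/your_file.csv <<'EOF'
server1,env1,patch1
server1,env2,patch1
server2,env3,patch1
EOF

# cut extracts column 1, sort -u deduplicates, and grep re-selects all
# rows for that server so they can be fed to the script in one batch.
cut -f1 -d"," /tmp/your_file.csv | sort -u | while read server
do
    grep -c "^${server}," /tmp/your_file.csv
done
```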

Thanks Chubler_XL, this is exactly what I was looking for.

Hi,

I am now facing an issue with my script where I am using this routine; when running it, I get the following error:

/mpp/t24_delivery/RCM/menu/packlist.dat: line too long.

Any idea what might be causing this? It runs fine on my test machine, but in the actual environment the script gives the above error.

The complete command that I am running is:

cut -f1 -d"," "${CONFIG_FILE_PATH}" | sort -u | while read HOST_NAME
do
 grep "^${HOST_NAME}," "${CONFIG_FILE_PATH}" > /mpp/t24_delivery/RCM/menu/deployment/"${HOST_NAME}".dat
done

where ${CONFIG_FILE_PATH} = /mpp/t24_delivery/RCM/menu/packlist.dat

Thanks for help in advance.

-Avikal Jain

Sounds like grep is failing because your data file packlist.dat has a line with too many characters. From memory, grep does not support more than about 2K characters in one line.

awk -F, '!a[$1]++{print $1}' your_file.csv |while read server

Hi Chubler_XL, I do not think the grep command is the issue here. I replaced the grep command with just echo $HOST_NAME and I still get the same error message.

The code:

 
cut -f1 -d"," "${CONFIG_FILE_PATH}" | sort -u | while read HOST_NAME
do
echo $HOST_NAME
done

The error message I get using the above is:
stdin: line too long

I checked my input file; when I cat it, the output is something like below:

ROOT@XXXXXXXXXX:/tmp # cat packlist.dat
server1,SG11,PACK1
server1,SG11,PACK2
server1,SG11,PACK3
server1,SG12,PACK1
server1,SG12,PACK2
server1,SG12,PACK3
server2,SG21,PACK1
server2,SG21,PACK2
server2,SG21,PACK3
server2,SG22,PACK1
server2,SG22,PACK2
server3,SG31,PACK1ROOT@XXXXXXXXXXX:/tmp #

However, when I add an extra line at the end of the file, the output of cat looks like below:

ROOT@XXXXXXXXXX:/tmp # cat packlist.dat
server1,SG11,PACK1
server1,SG11,PACK2
server1,SG11,PACK3
server1,SG12,PACK1
server1,SG12,PACK2
server1,SG12,PACK3
server2,SG21,PACK1
server2,SG21,PACK2
server2,SG21,PACK3
server2,SG22,PACK1
server2,SG22,PACK2
server3,SG31,PACK1
ROOT@XXXXXXXXXXX:/tmp #

and the same code works fine. I guess I need a routine that can sort the CSV regardless of whether the file has an extra line at the end or not.
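One way to cope with this (a sketch, not from the thread): when the last line of a file has no trailing newline, `read` returns non-zero but still fills its variable, so testing the variable as a fallback condition lets the loop process that final line too.

```shell
# Hypothetical data file whose last line has NO trailing newline.
printf 'server1,SG11,PACK1\nserver3,SG31,PACK1' > /tmp/packlist_nonl.dat

# The "|| [ -n "$line" ]" fallback runs the loop body one more time
# when read hits EOF mid-line, so the unterminated last line is kept.
while read line || [ -n "$line" ]
do
    echo "$line"
done < /tmp/packlist_nonl.dat
```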

@rdcwayx : I will give your routine a try as well.

Thanks for your input, everyone, I really appreciate your help. I will post the results once I try rdcwayx's routine.

---------- Post updated at 11:25 PM ---------- Previous update was at 10:37 PM ----------

Folks, the awk utility provided by rdcwayx is working fine. The only question I have now is: will it be able to handle any kind of input file, i.e. one with an extra line at the end and one without?
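A quick sanity check with hypothetical data suggests it should: awk treats an unterminated final line as a normal record, so the dedup prints each distinct first field once whether or not the file ends with a newline.

```shell
# Two copies of the same data, one with and one without a final newline.
printf 'server1,e1,p1\nserver1,e1,p2\nserver2,e2,p1\n' > /tmp/with_nl.csv
printf 'server1,e1,p1\nserver1,e1,p2\nserver2,e2,p1'  > /tmp/without_nl.csv

# !a[$1]++ is true only the first time a given field-1 value is seen.
awk -F, '!a[$1]++{print $1}' /tmp/with_nl.csv
awk -F, '!a[$1]++{print $1}' /tmp/without_nl.csv
```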

Many thanks again to rdcwayx and Chubler_XL for their time and input.