How to use sort with null values?

viktor1985 · May 31, 2012, 1:16pm

Hello everyone

I am doing a join command. Obviously, before I need two files sorted first. ( Both files have headers and have about 2 million lines each one )

The problem is, one of the files has null values in the key to sort (which is the first filed ). For example I have the original file1.csv

imsi;hlr;subIndex;customer
700309879;2;66;tmobile
300123456;1;55;americamovil
;3;77;telefonica

I need the output to be ( first the header, then null values , then numeric values ) or ( header, numeric values, null values):

imsi;hlr;subIndex;customer
;3;77;telefonica
300123456;1;55;americamovil
700309879;2;66;tmobile

I have tried many combinations of sort flags using -b -t -d -n -g -k1 , but without success.

I am using a fedora 16

Appreciate your comments.

Best Regards

joeyg · May 31, 2012, 1:24pm

Since it looks like you want this sorted as a zero rather than a blank...
why not replace all lines that begin with

with

??

donadarsh · May 31, 2012, 1:43pm

total_rec=`wc -l $file`
#subtract 1 for header record 
total_rec=`expr $total_rec - 1`
#put the header record first in output file
head -1 $file > somefile
#sort based on first filed except header and append to output file
tail -$total_rec $file|awk '{print $0 | "sort -t\; -k 1,1"} ' >>somefile

shamrock · May 31, 2012, 4:04pm

Try this...

sort -t\; -k1,1n -r file

viktor1985 · May 31, 2012, 10:48pm

Thank you. I think it will work it out, but I need something faster. my files are about 1.5 GB. I am using process substitution with 2 sort and 1 join.

---------- Post updated at 09:48 PM ---------- Previous update was at 09:46 PM ----------

Thank you very much shamrock !! This is what I was looking for !!