thousands separator

Hi,
I'm trying to print numbers with a thousands separator in AWK:

echo 1 12 123 1234 12345 123456 1234567 | awk --re-interval '{print gensub(/([[:digit:]])([[:digit:]]{3})/,"\\1,\\2","g")}' 

  1 12 123 1,234 1,2345 1,23456 1,234567

Any idea what is wrong here?

I would use this (as it seems you have GNU Awk):

% cat sep.awk
{ printf "%'d ", $1 } END { print "" }
% print 1 12 123 1234 12345 123456 1234567 |gawk -f sep.awk RS=" "
1 12 123 1,234 12,345 123,456 1,234,567

Your locale must define a thousands grouping character:

% print 1 12 123 1234 12345 123456 1234567 |LC_ALL=C gawk -f sep.awk RS=" "
1 12 123 1234 12345 123456 1234567
% print 1 12 123 1234 12345 123456 1234567 |LC_ALL=en_US.UTF-8 gawk -f sep.awk RS=" "
1 12 123 1,234 12,345 123,456 1,234,567
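
To see why LC_ALL=C prints no separators, it can help to ask the locale itself which grouping character it defines. This is a minimal sketch, assuming the standard `locale` utility is available; `show_sep` is a hypothetical helper name, and en_US.UTF-8 only gives a non-empty result if that locale is installed on your system.

```shell
# show_sep is a hypothetical helper: print the thousands separator a locale defines.
# An empty pair of brackets means the locale defines no grouping character.
show_sep() {
    # $1 is a locale name, e.g. C or en_US.UTF-8
    printf '%s: [%s]\n' "$1" "$(LC_ALL="$1" locale thousands_sep 2>/dev/null)"
}

show_sep C
show_sep en_US.UTF-8
```

The C locale reports an empty separator, which matches the ungrouped output above.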

I get the following output:

%'d %'d %'d %'d %'d %'d %'d 

I am using Red Hat release 4.

I suppose it's version/environment specific.

A User's Guide for GNU Awk
Edition 3
June, 2004

Maybe it works for the manual, but it still doesn't work for me :(

Not only for the manual, it works fine on my Ubuntu 7.10 :)

The

printf "%'d "

solution did not work for me either. I have GNU AWK 3.1.5. The man page doesn't mention the apostrophe among the printf format options, and I don't have a thousands.awk file.

This solution

echo 1 12 123 1234 12345 123456 1234567 | awk --re-interval '{print gensub(/([[:digit:]])([[:digit:]]{3})/,"\\1,\\2","g")}'

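will not work, because gensub with "g" scans left to right and never revisits text it has already matched. Each match consumes four digits and puts a comma after the first one, so only the #,### pattern gets repeated: 12345 becomes 1,2345 and 12345678 becomes 1,2345,678. It becomes clearer when you add a few longer numbers to the list.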

I created the following solution:

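#!/bin/sh
# Sample integers, one per line (printf instead of the non-portable echo -e)
nums=$(printf ' %s\n' 1 12 123 1234 12345 123456 1234567 12345678 123456789 1234567890)
echo "$nums" | awk --re-interval '{
        # Numbers of up to three digits need no separator and are skipped
        if (length($1) > 3)
        {
                # a = number of digits in front of the first full group of three
                a = length($1) % 3

                if (a == 0)
                {
                        # Add a comma after every group of three, then drop the trailing one
                        p1 = gensub(/([[:digit:]]{3})/, "\\1,", "g")
                        printf "%-20d %s \n", $1, gensub(/,$/, "", "g", p1)
                }

                if (a == 1)
                {
                        # Split off the single leading digit first
                        q1 = gensub(/\<([[:digit:]])/, "\\1,", "g")
                        q2 = gensub(/([[:digit:]]{3})/, "\\1,", "g", q1)
                        printf "%-20d %s \n", $1, gensub(/,$/, "", "g", q2)
                }

                if (a == 2)
                {
                        # Split off the two leading digits first
                        r1 = gensub(/\<([[:digit:]]{2})/, "\\1,", "g")
                        r2 = gensub(/([[:digit:]]{3})/, "\\1,", "g", r1)
                        printf "%-20d %s \n", $1, gensub(/,$/, "", "g", r2)
                }
        }
}'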

Note! This will not work with non-integers (I don't need it for my script), but it can be extended with some effort!
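
One possible extension to non-integers, as a minimal sketch rather than a drop-in replacement: it sets aside an optional leading minus sign and anything from the decimal point onward, then groups the remaining digits from the right. It uses only plain awk string functions, so it does not need gensub or locale support. The names `group_number` and `group` are hypothetical, and exponent notation or a leading "+" are not handled.

```shell
# group_number is a hypothetical wrapper: reads one number per line on stdin.
group_number() {
    awk '
    # group() is a hypothetical helper: insert commas into the integer part only
    function group(n,    sign, frac, dot, out) {
        sign = ""
        if (substr(n, 1, 1) == "-") { sign = "-"; n = substr(n, 2) }
        frac = ""
        dot = index(n, ".")
        if (dot > 0) { frac = substr(n, dot); n = substr(n, 1, dot - 1) }
        out = ""
        # Peel three digits at a time off the right end of the integer part
        while (length(n) > 3) {
            out = "," substr(n, length(n) - 2) out
            n = substr(n, 1, length(n) - 3)
        }
        return sign n out frac
    }
    { print group($1) }'
}

printf '%s\n' 1234567.891 -1234 12.5 | group_number
```

For the three sample values this prints 1,234,567.891, -1,234 and 12.5, one per line.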

Hi.

The solution of radoulov worked for me, but the apostrophe copied and pasted in as an odd character -- it came in as a "?" in vi. I replaced it with a not-so-special single quote and, with the locale assignments and GNU Awk 3.1.4, it worked as shown above:

% ./s1

(Versions displayed with local utility "version")
Linux 2.6.11-x1
GNU bash, version 2.05b.0(1)-release (i386-pc-linux-gnu)
GNU Awk 3.1.4

 Results from awk, locale C:
1234567

 Results from awk, locale en_US.UTF-8:
1,234,567

cheers, drl

Hi.

I didn't find the apostrophe flag description in man awk or Effective AWK Programming, 2nd, but in printf(3), we see:

cheers, drl

Yes, this works fine:

awk 'BEGIN{printf "%'"'"'d\n", 1234567890}'

With sed:

sed -e :a -e 's/\(.*[0-9]\)\([0-9]\{3\}\)/\1,\2/;ta'
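
For anyone wondering how the sed version works: `:a` defines a label, and `ta` jumps back to it whenever the substitution succeeded. Because `.*` is greedy, each pass inserts the rightmost comma that is still missing, and the loop stops once no run of four digits is left. A small usage sketch, with `sed_commas` as a hypothetical wrapper name:

```shell
# sed_commas is a hypothetical wrapper around the one-liner above.
sed_commas() {
    # :a  label; s/.../  insert one comma; ta  repeat while a substitution was made
    sed -e :a -e 's/\(.*[0-9]\)\([0-9]\{3\}\)/\1,\2/;ta'
}

echo 1 12 123 1234 12345 123456 1234567 | sed_commas
```

This prints 1 12 123 1,234 12,345 123,456 1,234,567. Note that it groups any run of four or more digits, so digits after a decimal point get commas too (1.23456 comes out as 1.23,456).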

Regards

Hi.

Who could live without a perl version:

perl -wpe '1 while s/(.*\d)(\d{3})/$1,$2/'

I like the brevity of the radoulov awk version, and the small size of the sed executable:

-rwxr-xr-x  1   41048 Nov 30  2004 /bin/sed*
-rwxr-xr-x  1  311308 Nov 26  2004 /usr/bin/awk*
-rwxr-xr-x  2 1057324 Mar  8  2005 /usr/bin/perl*

cheers, drl
