Problem in formatting output in sed / awk

adam1969in · April 7, 2017, 10:39pm

I have a file like this :

!    1 ! 542255 !    50,140.00  !
!    2 ! 551717 !     5,805.00  !
!    3 ! 551763 !     8,130.00  !
!    4 ! 551779 !       750.00  !
!    5 ! 551810 !    56,580.00  !
!    6 ! 551816 !     1,350.00  !
!    7 ! 551876 !       360.00  !
!    8 ! 551898 !     6,580.00  !
!    9 ! 557285 !    69,295.00  !
!   10 ! 557508 !     6,685.00  !

I have used sed to remove the thousands separator

sed 's/,//g' file

The amount column is now misaligned, like this:

!    1 ! 542255 !    50140.00  !
!    2 ! 551717 !     5805.00  !
!    3 ! 551763 !     8130.00  !
!    4 ! 551779 !       750.00  !
!    5 ! 551810 !    56580.00  !
!    6 ! 551816 !     1350.00  !
!    7 ! 551876 !       360.00  !
!    8 ! 551898 !     6580.00  !
!    9 ! 557285 !    69295.00  !
!   10 ! 557508 !     6685.00  !

The numbers with a thousands separator get shifted one space to the left. Numbers in lakhs have more separators, so they shift even further. I need the output like this:

!    1 ! 542255 !    50140.00  !
!    2 ! 551717 !     5805.00  !
!    3 ! 551763 !     8130.00  !
!    4 ! 551779 !      750.00  !
!    5 ! 551810 !    56580.00  !
!    6 ! 551816 !     1350.00  !
!    7 ! 551876 !      360.00  !
!    8 ! 551898 !     6580.00  !
!    9 ! 557285 !    69295.00  !
!   10 ! 557508 !     6685.00  !

Please help me do this using either sed or awk. I prefer awk.

Aia · April 8, 2017, 1:25am

awk -F! '{if(sub(",", "", $4)){$4=" "$4}}1' OFS=! adam1969in.file

 sed 's/\([0-9]*\),/ \1/'  adam1969in.file

perl -pe 's/(\d+),/ $1/' adam1969in.file

Scrutinizer · April 8, 2017, 2:19am

A different approach that should also work if there is more than one thousands separator:

awk '{$4=sprintf("%*s",gsub(/,/,x,$4),x)$4}1' FS=! OFS=! file

--
x is an uninitialized variable, so it represents an empty string here. The number of commas replaced (the return value of gsub) determines how many spaces of padding are added on the left.
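
To make the padding step concrete, here is a minimal sketch of the same idea split into separate statements. The function name pad_commas and the third input row (a made-up amount in lakhs) are assumptions for illustration only, not part of the original file.

```shell
# Sketch: count the commas removed from field 4 and pad with that many spaces.
# pad_commas is a hypothetical helper name; it reads the table on stdin.
pad_commas() {
  awk 'BEGIN { FS = OFS = "!" }
  {
    n = gsub(/,/, "", $4)            # n = number of commas removed
    $4 = sprintf("%*s", n, "") $4    # n spaces, then the cleaned field
    print
  }'
}

# The third row is made-up sample data with two separators.
printf '%s\n' '!    1 ! 542255 !    50,140.00  !' \
              '!    4 ! 551779 !       750.00  !' \
              '!   11 ! 557600 !  1,50,140.00  !' | pad_commas
```

Because every removed comma is replaced by one leading space, the amount field should keep its original width on each line, however many separators it had.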

--
As an alternative, if the column width is known beforehand, then one could also use:

awk '{gsub(/,/,x,$4); $4=sprintf("%15s",$4)}1' FS=! OFS=! file

If it is not, it can be determined like this:

awk '{l=length($4); gsub(/,/,x,$4); $4=sprintf("%*s",l,$4)}1' FS=! OFS=! file

--
Note:
Parsing of numbers that use the locale's thousands separator is broken in many applications, so the value needs to be treated as a string instead of a number. For example, bash and awk fail to interpret the number in column 4 correctly.

ksh93 for one does it perfectly however (with the right locale):

LC_NUMERIC="en_US.UTF-8"
while IFS=! read -A f
do
  printf "%s!%s!%s!%13.2f  !\n" "${f[@]}"
done < file
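
To illustrate the note about awk: a small sketch showing that awk's string-to-number conversion stops at the first comma, which is why the value has to be handled as a string. The helper name as_number is hypothetical, and LC_ALL=C is set as an assumption so that the result does not depend on the caller's locale.

```shell
# Sketch: awk converts only the leading numeric prefix of a string.
# as_number is a hypothetical helper; LC_ALL=C pins the locale for the demo.
as_number() {
  LC_ALL=C awk -v s="$1" 'BEGIN { print s + 0 }'
}

as_number '50,140.00'    # only the prefix before the comma is used
as_number '50140.00'     # the whole value, once the comma is gone
```

The first call should print 50 and the second 50140, so any arithmetic or %f formatting done in awk on the unmodified field would silently use the wrong value.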

adam1969in · April 8, 2017, 5:10am

 sed 's/\([0-9]*\),/ \1/'  adam1969in.file

This one is short and sweet. It does what I need. What if I have more than one column with thousands separators?

Thank you very much sir.

RudiC · April 8, 2017, 5:14am

sed might not be the tool of choice, then. Use either of the awk proposals and loop across the columns in question.

adam1969in · April 8, 2017, 6:56am

Yes, you are right, sir. Using the formula given by @Scrutinizer, I have written this for loop. It is working perfectly. However, please check if I have used it correctly.

awk 'BEGIN{
FS="!"; OFS="!"
}
{
for (i=1;i<=NF;i++){
l=length($i); gsub(/,/,x,$i); $i=sprintf("%*s",l,$i)}1
print}' file

Scrutinizer · April 8, 2017, 8:17am

Small correction:

awk '
  BEGIN {
    FS="!"; OFS="!"
  }
  {
    for (i=1;i<=NF;i++) {
      l=length($i)
      gsub(/,/,x,$i)
      $i=sprintf("%*s",l,$i)
    }
    print
  }
' file

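(the "1" was removed: inside an action block it is only a no-op expression, and the explicit print already prints the line)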
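
If only some columns hold amounts, the loop can be limited to those fields. A minimal sketch, assuming the field numbers are passed as a comma-separated list; the function name strip_commas_in, the cols variable and the sample row are hypothetical. Note that with FS="!" and a leading "!" on each line, $1 is empty, so the first visible column is $2.

```shell
# Sketch: same length-preserving idea, limited to chosen columns.
# strip_commas_in and the cols variable are hypothetical names.
strip_commas_in() {
  awk -v cols="$1" 'BEGIN { FS = OFS = "!"; n = split(cols, c, ",") }
  {
    for (k = 1; k <= n; k++) {
      i = c[k] + 0
      if (i < 1 || i > NF) continue   # ignore fields this line lacks
      l = length($i)                  # width before the commas go
      gsub(/,/, "", $i)
      $i = sprintf("%*s", l, $i)      # right-justify to the old width
    }
    print
  }'
}

# Made-up row with two amount columns (fields 3 and 4).
printf '%s\n' '! 542255 !    50,140.00  !  1,200.50 !' | strip_commas_in 3,4
```

Fields that are not listed are left untouched, which avoids stripping commas from text columns that happen to contain them.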

adam1969in · April 8, 2017, 9:18am

Thank you very much sir.
And thanks to all.

drl · April 8, 2017, 10:05am

Hi.

Using the free utility align :

#!/usr/bin/env bash

# @(#) s1       Demonstrate automatic field alignment, align.

# Utility functions: print-as-echo, print-line-with-visual-space, debug.
# export PATH="/usr/local/bin:/usr/bin:/bin"
LC_ALL=C ; LANG=C ; export LC_ALL LANG
pe() { for _i;do printf "%s" "$_i";done; printf "\n"; }
pl() { pe;pe "-----" ;pe "$*"; }
em() { pe "$*" >&2 ; }
db() { ( printf " db, ";for _i;do printf "%s" "$_i";done;printf "\n" ) >&2 ; }
db() { : ; }
C=$HOME/bin/context && [ -f $C ] && $C align dixf

FILE=${1-data1}

pl " Input data file $FILE:"
cat $FILE

pl " Results, default:"
align $FILE

pl " Results, wider gutter:"
align -g 3 $FILE

pl " More details for align:"
dixf align

exit 0

producing:

$ ./s1 

Environment: LC_ALL = C, LANG = C
(Versions displayed with local utility "version")
OS, ker|rel, machine: Linux, 3.16.0-4-amd64, x86_64
Distribution        : Debian 8.7 (jessie) 
bash GNU bash 4.3.30
align 1.7.0
dixf (local) 1.42

-----
 Input data file data1:
!    1 ! 542255 !    50140.00  !
!    2 ! 551717 !     5805.00  !
!    3 ! 551763 !     8130.00  !
!    4 ! 551779 !       750.00  !
!    5 ! 551810 !    56580.00  !
!    6 ! 551816 !     1350.00  !
!    7 ! 551876 !       360.00  !
!    8 ! 551898 !     6580.00  !
!    9 ! 557285 !    69295.00  !
!   10 ! 557508 !     6685.00  !

-----
 Results, default:
!  1 ! 542255 ! 50140.00 !
!  2 ! 551717 !  5805.00 !
!  3 ! 551763 !  8130.00 !
!  4 ! 551779 !   750.00 !
!  5 ! 551810 ! 56580.00 !
!  6 ! 551816 !  1350.00 !
!  7 ! 551876 !   360.00 !
!  8 ! 551898 !  6580.00 !
!  9 ! 557285 ! 69295.00 !
! 10 ! 557508 !  6685.00 !

-----
 Results, wider gutter:
!    1   !   542255   !   50140.00   !
!    2   !   551717   !    5805.00   !
!    3   !   551763   !    8130.00   !
!    4   !   551779   !     750.00   !
!    5   !   551810   !   56580.00   !
!    6   !   551816   !    1350.00   !
!    7   !   551876   !     360.00   !
!    8   !   551898   !    6580.00   !
!    9   !   557285   !   69295.00   !
!   10   !   557508   !    6685.00   !

-----
 More details for align:
align   Align columns of text. (what)
Path    : ~/p/stm/common/scripts/align
Version : 1.7.0
Length  : 270 lines
Type    : Perl script, ASCII text executable
Shebang : #!/usr/bin/perl
Help    : probably available with --help
Home    : http://kinzler.com/me/align/
Modules : (for perl codes)
 Getopt::Std    1.10

Best wishes ... cheers, drl

MadeInGermany · April 8, 2017, 3:26pm

With sed and a loop

sed '
:L
s/\([0-9]\{1,\}\),/ \1/
tL
' file
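
A short sketch of how the loop behaves on a number with two separators. The function name sed_pad and the sample row are assumptions for illustration. Each pass turns one "digits," into " digits", and "t L" jumps back to the label for as long as a substitution succeeded.

```shell
# Sketch: repeat the substitution until no "digits," remains on the line.
# sed_pad is a hypothetical helper name; it reads the table on stdin.
sed_pad() {
  sed -e ':L' -e 's/\([0-9]\{1,\}\),/ \1/' -e 't L'
}

# Made-up row with an amount in lakhs (two commas).
printf '%s\n' '!   11 ! 557600 !  1,50,140.00  !' | sed_pad
```

Unlike the awk versions, this is not limited to one column: it acts on every "digits," sequence anywhere in the line, so it assumes that commas after digits only ever appear as thousands separators.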

adam1969in · April 8, 2017, 10:47pm

Thank you sir:)
