Using awk to place decimal points at proper position

Hi,

I have one input file which is delimited by pipe. I want to put decimal points in this input file at particular position in particular column and also get the negative sign (if any) at start of that column.

$ cat Input_file.txt
11|10102693|1|20151202|10263204|20151127|N|0001
11|20100142-|2|20151202|21411783-|20151127|3|0005

And, There is one comma separated file which mentions "Column number","length of the column" and "position of decimal point".

$ cat decimal.csv
2,8,3
5,8,4

Expected output is as follows:

11|101.02693|1|20151202|1026.3204|20151127|N|0001
11|-201.00142|2|20151202|-2141.1783|20151127|3|0005

I have tried below awk code but, I am not getting expected output:

$ awk 'FNR == NR {A[NR]=$1;B[NR]=$2;C[NR]=$3;CNT=NR;next}{for(i=1;i<=CNT;i++) $A=substr($A,$B+1,1) substr($A,1,$C) "." substr($A,$C+1,$B-$C)} {print $0}' FS="," decimal.csv FS="|" OFS="|" Input_file.txt
11|01.|1|20151202|010263204.|20151127|N|0001
11|120.100|2|20151202|721411783-.|20151127|3|0005

Can someone please help me to figure out what I am doing wrong here and correct me.

Hello Prathmesh,

Could you please try following and let me know if this helps you.

awk 'FNR==NR{A[++i]=$0;next} {for(j=1;j<=i;j++){split(A[j], array,",");if(length($array[1])==array[2]){$array[1]=substr($array[1],1,array[3]) "." substr($array[1],array[3]+1);} else {q="-" substr($array[1],1,array[2]);$array[1]=substr(q,1,array[3]) "." substr(q,array[3]+1);}};print}' decimal.csv  FS="|" OFS="|" Input_file

Output will be as follows.

11|101.02693|1|20151202|1026.3204|20151127|N|0001
11|-20.100142|2|20151202|-214.11783|20151127|3|0005

Thanks,
R. Singh

1 Like

Thanks. But, Second line of the output seems to be wrong. Output should be as follows:

11|101.02693|1|20151202|1026.3204|20151127|N|0001
11|-201.00142|2|20151202|-2141.1783|20151127|3|0005

Hello Prathmesh,

A very minor change with code as follows may help you in same.

awk 'FNR==NR{A[++i]=$0;next} {for(j=1;j<=i;j++){split(A[j], array,",");if(length($array[1])==array[2]){$array[1]=substr($array[1],1,array[3]) "." substr($array[1],array[3]+1);} else {q="-" substr($array[1],1,array[2]);$array[1]=substr(q,1,array[3]+1) "." substr(q,array[3]+2);}};print}' decimal.csv  FS="|" OFS="|" Input_file

Output will be as follows.

11|101.02693|1|20151202|1026.3204|20151127|N|0001
11|-201.00142|2|20151202|-2141.1783|20151127|3|0005

EDIT: Adding a non-one liner form for solution on same now.

awk 'FNR==NR{
                A[++i]=$0;
                next
            }
            {
                for(j=1;j<=i;j++){
                                        split(A[j], array,",");
                                        if(length($array[1])==array[2]){
                                                                        $array[1]=substr($array[1],1,array[3]) "." substr($array[1],array[3]+1);
                                                                       }
                                        else                           {
                                                                        q="-" substr($array[1],1,array[2]);$array[1]=substr(q,1,array[3]+1) "." substr(q,array[3]+2);}};
                                                                        print
                                                                       }
    ' decimal.csv  FS="|" OFS="|" Input_file
 

Thanks,
R. Singh

1 Like

Thanks a lot. Can you please explain the code so that, it will be easier to understand.

Hello Prathmesh,

Could you please go through following and let me know if this is helpful for you.

awk 'FNR==NR{                                                                                     #### FNR==NR condition will be TRUE when first file named decimal.csv is being read.
A[++i]=$0;                                                                                        #### When above condition is TRUE then create an array named A with index of variable i whose index is increasing every occurance whenever above condition is TRUE, so means till first file is being read completly.
next                                                                                              #### skipping all other statements now by using next keyword.
}
{
for(j=1;j<=i;j++){                                                                                #### starting for loop now which will start only when second file will be read and wen first file will be completly read then we will get a final valur of variable i so running the loop from j=1 to till i's value.
split(A[j], array,",");                                                                           #### splitting the value of array A whose delimiter is comma(,). This array is having the values of decimal.csv files, which by splitting into array I wil use it later in code.
if(length($array[1])==array[2]){                                                                  #### Now I am comparing the value of field $array[1] which is field mentioned in decimal.csv and value is in file Input_file with length of field provided in decimal.csv file if they are equal then do following actions.
$array[1]=substr($array[1],1,array[3]) "." substr($array[1],array[3]+1);                          #### Here I am RE-FORMING the field let's say 2nd and 5th(as per your decimal.csv) file by using substr utility of awk. which works on substr(LINE,starting point, ending point) etc.
}
else{                                                                                             #### In case length of field is NOT equal to length provided in decimal.csv file then do following actions.
q="-" substr($array[1],1,array[2]);$array[1]=substr(q,1,array[3]+1) "." substr(q,array[3]+2);}};  #### creating a variable named q whose value I am constructing by field values by file Input_file, more important placing dot(.) here as per requirement in between field's values.
print                                                                                             #### finally printing the line, not here it will print the new formatted values of fields which we have done in above steps.
}
' decimal.csv  FS="|" OFS="|" Input_file                                                          #### mentioning first file named decimal.csv then Field separator as | and Output field separator as | for Input_file here.
 

Thanks,
R. Singh

1 Like

Thanks for very good explanation.

How about

awk '
FNR == NR       {A[NR] = $1
                 B[NR] = $2 + 1
                 C[NR] = $2 - $3
                 CNT=NR
                 next
                }
                {for (i=1; i<=CNT; i++) $A = sprintf ("%*.*f", B, C, $A / 10^(C) * ($A ~ /-$/?-1:1))
                }
1
' FS="," decimal FS="|" OFS="|"  file
11|101.02693|1|20151202|1026.3204|20151127|N|0001
11|-201.00142|2|20151202|-2141.1783|20151127|3|0005
2 Likes