awk to search for specific line and replace nth column

ncwxpanther · December 6, 2013, 8:22am

I need to be able to search for a string in the first column and if that string exists than replace the nth column with "-9.99".

AW12000012012   2.38   1.51   3.01   1.66   0.90   0.91   1.22   0.82   0.57   1.67   2.31   3.63   0.00
AW12000012013   1.52   0.90   1.20   1.34   1.21   0.67   1.18   1.22   2.75   0.84   1.52   0.27  -9.99
Q120100010001   6.35   1.70   2.59   1.88   3.08   0.70   0.71   0.31   2.27   0.19   2.42   6.05   0.00

In the case above I would need to search for "2013" and when it is present, replace the 13th column in that line only with "-9.99".

Thanks!

RudiC · December 6, 2013, 8:35am

Try

awk '$1 ~ /2013$/ {$13 = -9.99}1' file

I guess you want to keep the format? Try

 awk '$1 ~ /2013$/ {$13 = -9.99; for (i=2; i<=NF; i++) $i=$i+0}1' CONVFMT="%6.2f" file
AW12000012012   2.38   1.51   3.01   1.66   0.90   0.91   1.22   0.82   0.57   1.67   2.31   3.63   0.00
AW12000012013   1.52   0.90   1.20   1.34   1.21   0.67   1.18   1.22   2.75   0.84   1.52  -9.99  -9.99
Q120100010001   6.35   1.70   2.59   1.88   3.08   0.70   0.71   0.31   2.27   0.19   2.42   6.05   0.00

ncwxpanther · December 6, 2013, 9:02am

Thanks RudiC

"2013" may exist in various places in column 1. So I need the script to only look for it in the last 4 fields. I recently got a tip on searching for that string -

awk 'substr($1,length($1)-3) <= 2013'

Would something like that work in this case?

---------- Post updated at 09:02 AM ---------- Previous update was at 08:43 AM ----------

I was able to get this command to work for the most part

awk 'substr($1,length($1)-3) == 2013 {$13 = -9.99; for (i=2; i<=NF; i++) $i=$i+0}1' CONVFMT="%6.2f" file

But some of the formatting is not correct.

For instance -

AA02700012013   1.98   3.09   2.29   2.44   5.48   7.04   5.90   3.34   5.07   1.55   3.64  -9.99  41.87
BB02800012013 3   2.89   2.89   2.60   3.82   9.49   5.27   4.66   2.13   1.99   2.79  -9.99  41.53
CC02900012013   0.50   0.39   0.21   0.18   0.22   0.65   3.50   2.16   4.16   0.43   0.88  -9.99  13.28
DD03000012013   2.12   2.38   1.80   3.07   4.31   7.45   4.62   3.84   3.46   3.18   3.78  -9.99  40.03
EE03100012013   4.69   4.09   2.67   4.68   3.96 9   8.41   4.77   2.80   2.20   3.46  -9.99  50.74

How do I correct so that the whole digits have 2 decimal precision?

RudiC · December 6, 2013, 9:20am

/2013$/ will match 2013 at field end only.
Looks like integers are not formatted with CONVFMT. Try

awk '$1 ~ /2013$/ {$13 = -9.99; for (i=2; i<=NF; i++) $i=sprintf("%6.2f", $i)}1' file

Akshay_Hegde · December 6, 2013, 10:31am

Looks like duplicate post

ncwxpanther · December 9, 2013, 11:48am

To be more efficient what is the best way to also remove lines that contain integers greater than 2013.

I was using this command separately, but would rather combine it with the above code.

awk 'substr($1,length($1)-3) <= 2013 file

Akshay_Hegde · December 9, 2013, 11:53am

combine like this

$ awk 'substr($1,length($1)-3) == 2013{$13 = -9.99; for (i=2; i<=NF; i++) $i=sprintf("%6.2f", $i)}1'  file
AW12000012012   2.38   1.51   3.01   1.66   0.90   0.91   1.22   0.82   0.57   1.67   2.31   3.63   0.00
AW12000012013   1.52   0.90   1.20   1.34   1.21   0.67   1.18   1.22   2.75   0.84   1.52  -9.99  -9.99
Q120100010001   6.35   1.70   2.59   1.88   3.08   0.70   0.71   0.31   2.27   0.19   2.42   6.05   0.00

if you don't care about spacing then this also will work

$ awk 'substr($1,length($1)-3) == 2013{$13 = -9.99}{$1=$1}1'  file
AW12000012012 2.38 1.51 3.01 1.66 0.90 0.91 1.22 0.82 0.57 1.67 2.31 3.63 0.00
AW12000012013 1.52 0.90 1.20 1.34 1.21 0.67 1.18 1.22 2.75 0.84 1.52 -9.99 -9.99
Q120100010001 6.35 1.70 2.59 1.88 3.08 0.70 0.71 0.31 2.27 0.19 2.42 6.05 0.00

ncwxpanther · December 9, 2013, 1:04pm

Akshay

I tried

awk 'substr($1,length($1)-3) == 2013{$13 = -9.99; for (i=2; i<=NF; i++) $i=sprintf("%6.2f", $i)}1'

but it only does the last command not the first.

Akshay_Hegde · December 9, 2013, 1:06pm

post what you are getting.

ncwxpanther · December 9, 2013, 1:08pm

AA00101062012   1.54   0.76  -1.39  -2.18  -2.59  -2.84  -2.28  -1.96  -1.02   1.42   0.61   0.73 -99.99
AA00101062013   1.36   0.75   0.65   0.95   1.22   1.21   2.90   3.08   2.94   2.22 1.59  -9.99 -99.99
AA00101062014 -99.99 -99.99 -99.99 -99.99 -99.99 -99.99 -99.99 -99.99 -99.99 -99.99 -99.99 -99.99 -99.99

Akshay_Hegde · December 9, 2013, 1:10pm

ncwxpanther:

AA00101062012   1.54   0.76  -1.39  -2.18  -2.59  -2.84  -2.28  -1.96  -1.02   1.42   0.61   0.73 -99.99
AA00101062013   1.36   0.75   0.65   0.95   1.22   1.21   2.90   3.08   2.94   2.22 1.59  -9.99 -99.99
AA00101062014 -99.99 -99.99 -99.99 -99.99 -99.99 -99.99 -99.99 -99.99 -99.99 -99.99 -99.99 -99.99 -99.99

what do you expect ?

ncwxpanther · December 9, 2013, 1:12pm

AA00101062012   1.54   0.76  -1.39  -2.18  -2.59  -2.84  -2.28  -1.96  -1.02   1.42   0.61   0.73 -99.99
AA00101062013   1.36   0.75   0.65   0.95   1.22   1.21   2.90   3.08   2.94   2.22  1.59  -9.99 -99.99

Akshay_Hegde · December 9, 2013, 1:22pm

ncwxpanther:

AA00101062012   1.54   0.76  -1.39  -2.18  -2.59  -2.84  -2.28  -1.96  -1.02   1.42   0.61   0.73 -99.99
AA00101062013   1.36   0.75   0.65   0.95   1.22   1.21   2.90   3.08   2.94   2.22  1.59  -9.99 -99.99

Try:

$ awk 'substr($1,length($1)-3) <= 2013{$13 = -9.99; for (i=2; i<=NF; i++) $i=sprintf("%6.2f", $i);print}' file

RudiC · December 9, 2013, 1:28pm

The 1 at the end of the script will print every single line, regardless of the modification of $13 has taken place or not. If you want to modify lines with 2013 only but want to print every line that has a year less or equal 2013, you need to introduce two conditiones/patterns, one for modification, one for print.

ncwxpanther · December 9, 2013, 1:29pm

Now it appears that the 13th column in every line is replaced....even the line that do not contain 2013.

AL00101062012   1.54   0.76  -1.39  -2.18  -2.59  -2.84  -2.28  -1.96  -1.02   1.42   0.61  -9.99 -99.99
AL00101062013   1.36   0.75   0.65   0.95   1.22   1.21   2.90   3.08   2.94   2.22    1.59  -9.99 -99.99
AL00102061895   0.69  -0.66  -0.28  -0.77   0.06   0.34   0.45   0.96  -0.41  -0.41  -0.93  -9.99 -99.99

should be

AA00101062012   1.54   0.76  -1.39  -2.18  -2.59  -2.84  -2.28  -1.96  -1.02   1.42   0.61 2.48 -99.99
AA00101062013   1.36   0.75   0.65   0.95   1.22   1.21   2.90   3.08   2.94   2.22    1.59  -9.99 -99.99
AA00102060001   0.69  -0.66  -0.28  -0.77   0.06   0.34   0.45   0.96  -0.41  -0.41  -0.93  0.86 -99.99