String Replacement in a column

ctrld · August 28, 2014, 3:56am

I have a text file with the following contents. I am trying to check the column "COL5". If the value is not "0" then I need to replace the value to NA only in COL5.
The problem is that I don't have a clear delimiter here, I have only the column names.

cat input.txt

-08/27/14:08:30:01--

NAME                 COl1   COL2 COL3   COL4  COL5
-------------        ------ ---- ------ ----- ----
NAME1234            NY      NA    777   77    0

NAME1235            RU      NA    100   90    NA

NAME1244            IN      NA    -     30    5
NAME3126            SG      NA    -     50    7

I need the output as below:

cat input.txt

-08/27/14:08:30:01--

NAME                 COl1   COL2 COL3   COL4  COL5
-------------        ------ ---- ------ ----- ----
NAME1234            NY      NA    777   77    0

NAME1235            RU      NA    100   90    NA

NAME1244            IN      NA    -     30    NA
NAME3126            SG      NA    -     50    NA

Makarand_Dodmis · August 28, 2014, 4:44am

try

nawk '{if ($6 != "0" && $6 && $6 != "COL5" && $6 != "----") {gsub($6"$","NA",$0)}; { print $0}}' input.txt

RudiC · August 28, 2014, 5:07am

The problem with above proposal is, it will replace any column having a value equal $6 with "NA". Try (with e.g col3 set to 5)

awk '$6 && D   {gsub($6"$","NA",$0)}
     $1~/---/  {D=1}
     1
    ' file

-08/27/14:08:30:01--

NAME                 COl1   COL2 COL3   COL4  COL5
-------------        ------ ---- ------ ----- ----
NAME1234            NY      NA    777   77    0

NAME1235            RU      NA    100   90    NA

NAME1244            IN      NA    5     30    NA
NAME3126            SG      NA    -     50    NA

---------- Post updated at 11:07 ---------- Previous update was at 10:53 ----------

@Makarand Dodmis: Please don't edit your original script

after someone pointed out a possible quirk.

ctrld · August 28, 2014, 11:39pm

Thanks Makarand and RudiC.

Unfortunately I am facing one issue. It was my mistake. I didn't tell that for some rows I have few more columns to the end.
The solution provided by RudiC is working only for rows with exactly six columns. If I have more columns the row is untouched.

Chubler_XL · August 29, 2014, 12:18am

This matches column #6 in the table, you were not clear if the headings should be used to match the column or it's ordinal position.

Here I'm using the ----- ------ line to get the columns start and length:

awk '$6 && S {$0=substr($0,1,S) sprintf("%-*s", L, "NA") substr($0,S+L+1) }
/^--/ {
L=length($6)
F=x
for(i=1;i<6;i++) F=F $i" +"
match($0,F)
S=RLENGTH-1
print "S=" S " L=" L

}
1' infile

-08/27/14:08:30:01--

NAME                 COl1   COL2 COL3   COL4  COL5  COL..     COLn
S=45 L=4
-------------        ------ ---- ------ ----- ----  -------   ----
NAME1234            NY      NA    777   77    0     0         PI

NAME1235            RU      NA    100   90   NA     NA        0

NAME1244            IN      NA    5     30   NA     5         NA
NAME3126            SG      NA    -     50   NA     0         7

ctrld · August 29, 2014, 1:28am

Thanks Chubler! It worked fine.