Replace only a few strings

Diya123 · April 28, 2014, 6:03pm

Hi,

I have a file whose first column holds gene symbols and whose remaining columns hold expression values. For each gene symbol, if a value is anything other than NA, replace it with the gene symbol.

input file:

RPS6	14.26939	13.7448	14.18972	13.45445	14.47969	13.60643	13.5248
BASP1	10.49898	11.35968	6.051649	8.769745	13.07223	11.28016	11.93768
LDHB	14.2268	12.93219	13.97726	12.44734	13.77943	14.32173	14.2813
HEPACAM2	5.681814	8.440248	NA	NA	11.85384	12.37137	11.33899

output file:

RPS6 RPS6 RPS6 RPS6 RPS6 RPS6 RPS6
BASP1 BASP1 BASP1 BASP1 BASP1 BASP1 BASP1
LDHB LDHB LDHB LDHB LDHB LDHB LDHB
HEPACAM2 HEPACAM2   HEPACAM2 HEPACAM2 HEPACAM2

Is it easy to do it in awk?

Thanks,

Corona688 · April 28, 2014, 6:22pm

awk '($1 == "NA") { for(N=2; N<=NF; N++) $N=$1 } 1' OFS="\t" inputfile > outputfile

Diya123 · April 28, 2014, 6:26pm

Thanks for the command, but when I use it my output file is the same as the input file. It does not perform the replacement.

Corona688 · April 28, 2014, 6:27pm

I made one mistake.

awk '($1 != "NA") { for(N=2; N<=NF; N++) $N=$1 } 1' OFS="\t" inputfile > outputfile

Diya123 · April 28, 2014, 6:28pm

OK. When I change it to != "NA" it performs the replacement, but where specific columns contain NA, those fields should become blank or stay NA. That is not happening.

Corona688 · April 28, 2014, 6:50pm

awk '{ for(N=2; N<=NF; N++) if($N == "NA") {$N=""} else { $N=$1 } } 1' OFS="\t" inputfile > outputfile
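For anyone trying this on a small sample first, here is a minimal sketch of the same per-field logic. The file name sample.tsv, its values, and the function name fill_symbols are made up for illustration. The -F'\t' option is an addition that is not in the command above: it assumes the input is strictly tab-separated and keeps emptied fields in their columns if the output is read by awk again.

```shell
# Hypothetical two-row sample; values are illustrative only.
printf 'RPS6\t14.26939\t13.7448\nHEPACAM2\t5.681814\tNA\t11.85384\n' > sample.tsv

# fill_symbols: for every field after the first, blank it if it is "NA",
# otherwise overwrite it with the gene symbol from column 1.
# -F'\t' splits on tabs only; OFS='\t' rebuilds each line with tabs.
fill_symbols() {
    awk -F'\t' -v OFS='\t' '{ for (N = 2; N <= NF; N++) $N = ($N == "NA") ? "" : $1 } 1' "$@"
}

fill_symbols sample.tsv
```

With the default field separator, runs of whitespace count as one separator, so the blanked columns would disappear if the output were fed through awk a second time.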

MadeInGermany · April 28, 2014, 7:18pm

awk '{ for (N=2; N<=NF; N++) $N = ($N == "NA") ? "" : $1 } 1' inputfile

or

awk '{ for (N=2; N<=NF; N++) if ($N != "NA") printf " %s",$1; print ""}' inputfile
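The two commands above do not produce the same layout, which a quick side-by-side sketch may make clearer. The row contents and the function names in_place and skip_na are hypothetical, chosen only for this illustration.

```shell
# One tab-separated row with an NA in the middle (made-up values).
row=$(printf 'GENE\t1.5\tNA\t2.5')

# First variant: reassigns fields in place. With the default OFS the line is
# rebuilt with single spaces, so the emptied NA field leaves two adjacent spaces.
in_place() { awk '{ for (N=2; N<=NF; N++) $N = ($N == "NA") ? "" : $1 } 1'; }

# Second variant: prints the symbol once per non-NA value and skips NA values
# entirely. Each symbol is preceded by a space and column 1 is not printed.
skip_na() { awk '{ for (N=2; N<=NF; N++) if ($N != "NA") printf " %s", $1; print "" }'; }

printf '%s\n' "$row" | in_place
printf '%s\n' "$row" | skip_na
```

The first variant keeps every column position, while the second shortens rows that contain NA, so which one fits depends on whether the columns need to stay aligned.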

Diya123 · April 29, 2014, 2:14pm

Thanks Corona. It worked.