Reverse complement

I want to reverse some DNA sequences and complement them at the same time, so that A becomes T, C becomes G, T becomes A, and G becomes C.
Example:
infile:

>GHL8OVD01CMQVT SHORT1
TTGATGT
>GHL8OVD01CMQVT SHORT2
TTGATGT

outfile:

>GHL8OVD01CMQVT SHORT1
ACATCAA
>GHL8OVD01CMQVT SHORT2
ACATCAA

The identifier lines (> XXXXX) should not be modified.
This is the code I want to modify:

awk ' !(NR%2) ' infile | rev | tr ACGT TGCA

However, the IDs are not being printed. If I include NR%2, the IDs will also be reverse complemented.
I know I can always use perl:

perl -nle'BEGIN {
  @map{ A, C, G, T } = ( T, G, C, A )
  }
  print /^>/ ?
    $_ :
      join //, map $map{ $_ }, split //, scalar reverse
  ' infile

But I am trying to simplify the script so that I can explain it better.

Can't you apply what you learned from your thread Cut & awk four days ago to this thread? You are making exactly the same mistake: assuming that some elements of a pipeline will process only some of the lines they are fed, or that lines thrown away by one element of a pipeline will still magically appear in your output.

You didn't ask any questions about the suggestions you were given there, so we assume that you understand how those suggestions work.
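To make the point concrete, here is a minimal sketch using the sample input from the first post. The variable and function names are made up for illustration only.

```shell
# The sample input from the question, held in a variable for illustration.
sample='>GHL8OVD01CMQVT SHORT1
TTGATGT
>GHL8OVD01CMQVT SHORT2
TTGATGT'

# Keep only even-numbered lines: the ">" header lines are discarded here,
# so nothing later in the pipeline (rev, tr, ...) can ever print them.
drop_headers() { awk '!(NR % 2)'; }

printf '%s\n' "$sample" | drop_headers
```

Only the two sequence lines come out. Whatever does the reversing therefore has to receive every line and decide for itself which ones to leave alone.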

perl -ple 'y/ACGT/TGCA/ and $_ = reverse unless /^>/' infile
>GHL8OVD01CMQVT SHORT1
ACATCAA
>GHL8OVD01CMQVT SHORT2
ACATCAA

sed approach (not necessarily easier to explain...):

sed '/>/n; y/ATCG/TAGC/;s/^.*$/X&X/;:x;s/\(X.\)\(.*\)\(.X\)/\3\2\1/;tx;s/X//g' file
>GHL8OVD01CMQVT SHORT1
ACATCAA
>GHL8OVD01CMQVT SHORT2
ACATCAA

or, if you have GNU sed with its extensions:

sed '/>/n; y/ATCG/TAGC/;s/^/echo /;s/$/ | rev/;e' file

Hi Xterra,
You can use system() to run shell commands inside awk, but invoking a shell to invoke rev and tr once for each even-numbered line in your file will take at least two orders of magnitude longer to run than building equivalent functionality into your awk script. If we write an awk script to print odd-numbered lines and feed even-numbered lines through rev and tr:

#!/bin/ksh
IAm=${0##*/}
tmpf="$IAm.$$"
awk -v tmpf="$tmpf" '
FNR % 2
!(FNR % 2) {
	print > tmpf
	close(tmpf)
	system("rev \"" tmpf "\" | tr ACGT TGCA")
}
' "${1:-infile}"
rm -f "$tmpf"

it is easy to understand and, with an input file containing 10,000 copies of your sample input file, the average time over 10 runs (with output redirected to a file) is about:

real	1m5.37s
user	0m41.09s
sys	0m49.33s
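For anyone who wants to repeat the measurement, a test input like the one described could be built roughly as follows. This is only a sketch: the function name and the file names in the usage comment are hypothetical, and it assumes the sample file is small enough to hold in memory.

```shell
# Write N copies of a small sample file to an output file.
# $1 = sample file, $2 = number of copies, $3 = output file
make_big_input() {
	awk -v n="$2" '
	{ line[NR] = $0 }			# remember every input line
	END {
		for (c = 1; c <= n; c++)	# repeat the whole file n times
			for (i = 1; i <= NR; i++)
				print line[i]
	}' "$1" > "$3"
}

# Possible usage (hypothetical names):
#   make_big_input infile 10000 big_infile
#   time ./script big_infile > /dev/null
```

With the four-line sample and 10,000 copies this gives a 40,000-line file, half of which are sequence lines.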

A similar awk script building the rev and tr functionality into an internal function:

#!/bin/ksh
awk '
BEGIN {	c["A"] = "T"; c["C"] = "G"; c["G"] = "C"; c["T"] = "A" }
function revcomp(	i, o) {
	o = ""
	for(i = length; i > 0; i--)
		o = o c[substr($0, i, 1)]
	return(o)
}
!(FNR % 2) {$0 = revcomp()}
1' "${1:-infile}"

produces exactly the same output and takes about:

real	0m0.16s
user	0m0.15s
sys	0m0.00s

In other words, this awk script processes a little more than 800 lines in the time it takes the first script to process 2 lines by firing up a pipeline for each even-numbered line.
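The figure of roughly 800 lines follows from the two "real" times reported above. A quick check of the arithmetic (the function name is just for illustration):

```shell
# Ratio of the two elapsed times, and how many lines the fast script
# handles while the slow one handles 2 (one header plus one sequence).
ratio_lines() {
	awk -v slow="$1" -v fast="$2" 'BEGIN {
		r = slow / fast
		printf "%.1f %.0f\n", r, 2 * r
	}'
}

ratio_lines 65.37 0.16	# prints: 408.6 817
```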

The average timing for Aia's perl suggestion was:

real	0m0.03s
user	0m0.02s
sys	0m0.01s
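Aia's perl suggestion decides what to do by looking for a leading > rather than by counting lines. The same test can be used in awk; the following is only a sketch of that idea (the function name is made up, and it was not part of the timings above). It assumes upper-case A, C, G and T only; any other character in a sequence line is silently dropped.

```shell
# Reverse complement every line that is not a ">" header line.
# Reads the named files, or standard input if none are given.
revcomp_fasta() {
	awk '
	BEGIN { c["A"] = "T"; c["C"] = "G"; c["G"] = "C"; c["T"] = "A" }
	/^>/ { print; next }		# header: print unchanged
	{
		out = ""
		# walk the line backwards, appending the complement of each base
		for (i = length($0); i > 0; i--)
			out = out c[substr($0, i, 1)]
		print out
	}' "$@"
}
```

Because it does not depend on line parity, this form should still leave headers untouched if a file happens not to alternate strictly between header and sequence lines.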

The BSD-based sed on OS X produced the wrong output when running RudiC's sed script, without producing any diagnostics: the even-numbered lines had leading and trailing X characters and had been translated but not reversed. The most likely cause is that this sed, as POSIX allows, treats everything after the : up to the end of the line as the label name, so the loop and the final s/X//g are never seen as commands. An equivalent command (splitting on semicolons into separate sed editing commands):

sed -e '/>/n' -e 'y/ATCG/TAGC/' -e 's/^.*$/X&X/' -e ':x' -e 's/\(X.\)\(.*\)\(.X\)/\3\2\1/' -e 'tx' -e 's/X//g' infile

produced the expected output, with an average timing of:

real	0m0.09s
user	0m0.09s
sys	0m0.00s
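Another way to write the same sed script, which may also be easier to explain, is to put each command on its own line. POSIX sed reads the label of a : or t command up to the end of the line, so ending those lines with a newline instead of a semicolon should work in both GNU and BSD sed. This is a sketch wrapped in a hypothetical function name, not something that was timed above.

```shell
# Reverse complement the line after each ">" header, one sed command per line.
revcomp_sed() {
	sed '
	/>/n
	y/ATCG/TAGC/
	s/^.*$/X&X/
	:x
	s/\(X.\)\(.*\)\(.X\)/\3\2\1/
	tx
	s/X//g
	' "$@"
}
```

The n command prints the header and loads the sequence line; y complements it; the two X markers then walk inward from both ends, swapping the characters next to them on each pass, until they meet and are deleted.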