awk: seeking to bytes

karyn1617 · February 7, 2005, 2:16pm

can I seek to a particular byte in a file and replace it using awk?
if so, how?

vgersh99 · February 7, 2005, 2:41pm

To substitute the THIRD character in each record/line with 'A':

nawk '{$0=substr($0,1,2) "A" substr($0,4); print}'
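
The same substr arithmetic should generalise to any column. A minimal sketch, assuming a POSIX awk; the function name replace_char and the variables n and c are made up for illustration and are not from the thread:

```sh
#!/bin/sh
# replace_char N C: replace the Nth character of every input line with C.
# (replace_char, n and c are illustrative names.)
replace_char() {
    awk -v n="$1" -v c="$2" '
        # Lines shorter than n characters are left alone
        length($0) >= n { $0 = substr($0, 1, n - 1) c substr($0, n + 1) }
        { print }'
}

printf 'hello\n' | replace_char 3 A    # prints heAlo
```

Passing the position with -v keeps the awk program itself fixed, so the same one-liner serves for column 3 or column 24.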

karyn1617 · February 7, 2005, 4:08pm

What if I want to go to byte 24 and replace it, instead of knowing which character?

zazzybob · February 7, 2005, 4:38pm

Using sed, you can navigate to a particular offset in a line and make a substitution....

sed 's/./A/24' myfile

This'll replace the 24th character on each line with 'A'
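
One thing worth knowing about the numeric flag: a line with fewer than 24 characters has no 24th match, so sed passes it through unchanged. A quick check, assuming any POSIX sed (patch24 and the sample lines are made up for the demonstration):

```sh
#!/bin/sh
# patch24: replace the 24th character of each input line with A.
patch24() {
    sed 's/./A/24'
}

# The first line has 26 characters, the second only 5.
printf '%s\n' 'abcdefghijklmnopqrstuvwxyz' 'short' | patch24
# prints:
#   abcdefghijklmnopqrstuvwAyz
#   short
```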

If you want to change a particular byte offset within a multiline file only once, try something like

#!/bin/sh

replace="B"
offset=24

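awk -v r="$replace" -v o="$offset" '
   # Collect the whole file, newlines included, so the offset can span lines.
   # (RS="" would split the input at blank lines instead of reading it all.)
   { buf = buf $0 "\n" }
   END {
      printf("%s%s%s", substr(buf, 1, o - 1), r, substr(buf, o + 1));
   }
' testfile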

exit 0

But beware, each newline is also a byte in itself.
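
To turn a line and column into a byte offset while counting those newlines, something like this could work. It is only a sketch under assumptions: byte_offset is a made-up name, every line ends in a single newline, and LC_ALL=C is set so that awk's length() counts bytes (this holds for gawk and mawk; other awks may behave differently).

```sh
#!/bin/sh
# byte_offset FILE LINE COL: print the 1-based byte offset of column COL on line LINE.
# (byte_offset, l, c and n are illustrative names.)
byte_offset() {
    LC_ALL=C awk -v l="$2" -v c="$3" '
        NR < l { n += length($0) + 1 }   # +1 for the newline ending each earlier line
        END    { print n + c }' "$1"
}

tmp=$(mktemp) || exit 1
printf 'abc\ndef\n' > "$tmp"
byte_offset "$tmp" 2 1    # prints 5: "abc" and its newline come first
rm -f "$tmp"
```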

Cheers
ZB

vgersh99 · February 7, 2005, 6:03pm

I believe the above 'awk':

will not work for multi-byte character sets, as 'substr' is a character-based operation [not a byte-oriented one].
will not work on LARGE files, as most awks [at least Solaris' stock ones] have inherent limits on the length of records, the number of fields [NF], etc.

It might work in some isolated cases, but it is not a generic solution.
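
Whether one byte equals one character for a given file can be checked before trusting substr. A rough sketch, assuming a POSIX wc (-c counts bytes, -m counts characters in the current locale); single_byte_only is an invented name:

```sh
#!/bin/sh
# single_byte_only FILE: succeed if FILE has as many characters as bytes
# in the current locale, i.e. it contains no multi-byte characters.
single_byte_only() {
    bytes=$(wc -c < "$1" | tr -d ' ')   # tr strips the padding some wc versions add
    chars=$(wc -m < "$1" | tr -d ' ')
    [ "$bytes" -eq "$chars" ]
}

tmp=$(mktemp) || exit 1
printf 'plain ascii\n' > "$tmp"
if single_byte_only "$tmp"; then
    echo "byte offsets and character offsets agree"
fi
rm -f "$tmp"
```

In the C locale every byte counts as one character, so the check always passes there; it is only informative when run under the locale the file was written in.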

karyn1617 · February 7, 2005, 6:58pm

I tried the sed command, but it did not output anything???

karyn1617 · February 7, 2005, 7:00pm

Do you know the limit on awks for the length of records? I am only interested in the first 80 bytes of the line; does that make a difference? Also, can't I assume that 1 byte = 1 character?

vgersh99 · February 7, 2005, 7:31pm

Some limits of classic awk (maybe you need this):
100 fields
3000 chars per input record
3000 chars per output record
1024 chars per field
3000 chars per printf string
400 chars max literal string
15 open files

Also browsing 'comp.lang.awk' [on Google] can yield some results as well.

If you are ONLY interested in the first 80 chars on the FIRST line, you can easily use either the sed OR the nawk solution(s) posted.

For efficiency, you might want to modify either one of them to quit AFTER the first line is processed.
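
Quitting after the first line might look like this with sed. It is a sketch that assumes the replacement character A and column 24 from earlier in the thread; patch_first_line is a made-up name. Note that q also discards the rest of the input, which is only acceptable if the first line is all you need.

```sh
#!/bin/sh
# patch_first_line: replace the 24th character of line 1, print that line, then quit.
patch_first_line() {
    sed -e 's/./A/24' -e q
}

printf '%s\n' 'abcdefghijklmnopqrstuvwxyz0123' 'second line' | patch_first_line
# prints abcdefghijklmnopqrstuvwAyz0123
```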

zazzybob · February 7, 2005, 10:51pm

The only simple generic solution to this entire problem (for files that exceed the limits imposed by awk, for example) is to fire up a hex editor and make the change manually. You could also write a C program to do it, or use Perl/PHP (note that this is UNTESTED Perl code that I just found by Googling around):

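#!/usr/bin/perl
# Replace one byte in a binary file, in place.
# OFFSET is zero-based; OFFSET and BYTE may be decimal, hex (0x41) or octal (0101).
use strict;
use warnings;

@ARGV == 3 or die "Usage: $0 FILE OFFSET BYTE\n";
my ($file, $offset, $byte) = @ARGV;
# oct() understands 0x.., 0b.. and leading-zero octal; plain decimals are kept as-is
$_ = /^0/ ? oct($_) : $_ for $offset, $byte;
defined(my $size = -s $file) or die "Cannot stat $file: $!\n";
die "No such offset $offset in $file\n" if $offset >= $size;
# Open read/write without truncating, then overwrite just the one byte
open(my $fh, '+<:raw', $file) or die "Cannot open $file: $!\n";
seek($fh, $offset, 0) or die "Cannot seek to $offset: $!\n";
print {$fh} pack('C', $byte);
close($fh) or die "Cannot write $file: $!\n";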
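
Another route that avoids awk's record limits altogether is dd, which can seek to a byte offset and overwrite in place. This is a different technique from the Perl script above, sketched under assumptions: patch_byte is a made-up helper, OFFSET is 1-based to match the "byte 24" wording earlier in the thread, and conv=notrunc is what keeps dd from truncating the file after the write.

```sh
#!/bin/sh
# patch_byte FILE OFFSET CHAR: overwrite the byte at 1-based OFFSET with CHAR, in place.
# (patch_byte is an illustrative name.)
patch_byte() {
    printf '%s' "$3" |
        dd of="$1" bs=1 count=1 seek=$(( $2 - 1 )) conv=notrunc 2>/dev/null
}

tmp=$(mktemp) || exit 1
printf 'abcdefghijklmnopqrstuvwxyz\n' > "$tmp"
patch_byte "$tmp" 24 B
cat "$tmp"    # prints abcdefghijklmnopqrstuvwByz
rm -f "$tmp"
```

Because dd works on raw bytes, newlines and multi-byte characters are simply counted as the bytes they occupy.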

Note that the sed solution also operates on characters rather than bytes, as does the original nawk solution you posted.

I've tried the sed 's/./A/24' myfile on a very old version of sed (shipped with UNICOS 9.0) and it seems to work okay, as it does with newer seds too.

Cheers
ZB