Find and increment value in string of hex

I have a long string of hex (from ASN.1 data) in which I need to find one particular hex value and increment it. The hex pairs on either side of the value to increment ( 84 and a7 ) will remain constant.

i.e. " 84 <length> <value_to_increment> a7 ", with the value starting at 00 .

So end result:

840100a7
840101a7
...
84020080a7 (length changes to 02 when value reaches 80, 128dec)
84020081a7

Any ideas how best to do this?

sed 's/84/\n84/g; s/a7/a7\n/g' infile | awk --non-decimal-data '
   /84[0-9][0-9]..*[aA]7/ {
      a=substr($0, 1, 4);
      b=substr($0, 5, length($0)-6);
      c=substr($0, length($0)-1);
      l=substr($0, 3, 2)*2;
      b=sprintf("%d", "0x" b); b=sprintf("%0*x", l, b+1)
      $0=a b c
   }
   {printf "%s", $0}
   '

Not clear.
Do you have a string with above structure and want to produce a file from it, incrementing the value indicated line by line? Then, where to stop?
Or, do you have a file with above structured values in the lines, and want to just increment each line's value?
I guess, the value 0x7F needs special treatment, as the length has to be incremented, and the value will need four places, then?

Sorry, first-time poster trying to explain... :)

So I have a file containing one long line of hex (converted from BER ASN.1), i.e.

30820170a15c80070400020205010d8119877532221275e5727de311be4e938334297fb650ff00ff00ff82024e4ca31fa01580083033313035343030a2098107831341090000018102689582024e4c840100a70b80045940f1598103058390880102a282010ea4820...

I want to find all occurrences between the hex pairs 84 and a7 and replace these values by (length)(value) where length = 02 if value > 127dec, otherwise 01. And value should be incremented (beginning at 0) for each occurrence.

value will be equal to 2 bytes (4 places) where length = 02.
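The length/value rule described above can be sketched as a small shell function. This is only an illustration: `encode_counter` is a hypothetical name, and it assumes the counter stays between 0 and 65535, since the thread does not say what should happen beyond two value bytes.

```sh
# encode_counter N: print the 84 <length> <value> a7 group for counter N.
# Hypothetical helper; assumes 0 <= N <= 65535.
encode_counter() {
    awk -v n="$1" 'BEGIN {
        if (n > 127) printf "8402%04xa7\n", n   # two value bytes, four hex digits
        else         printf "8401%02xa7\n", n   # one value byte, two hex digits
    }'
}

encode_counter 0      # 840100a7
encode_counter 127    # 84017fa7
encode_counter 128    # 84020080a7
```

Using fixed `%02x` and `%04x` formats avoids relying on the `*` width specifier, which may not be available in every awk.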

Now, this is completely different from what you posted in #1. And, still not clear.
There's only one occurrence of the pattern in the above sample, with value 00 . That should become 01 , understood. But you don't say anything about the next occurrence. Should that become 02 regardless of its initial value? Or its initial value + 1? And you didn't answer the question about the value 0x7F, should that ever occur.

It's the same question but worded differently. I appreciate it's still not clear, though.

Ok, new example, original file:

30820170a15c800704000202050840100a710d8119877532221275e5727de31840100a71be4e938334297fb650ff00ff00ff820840100a724e4ca31fa01580083033313840100a7035343030a2098107831341090000018102689582024e4c840100a70b80045940f15981030583908

should become:

30820170a15c800704000202050840100a710d8119877532221275e5727de31840101a71be4e938334297fb650ff00ff00ff820840102a724e4ca31fa01580083033313840103a7035343030a2098107831341090000018102689582024e4c840104a70b80045940f15981030583908

So regardless of the initial value, it should be replaced, beginning with 00 for the first occurrence and incremented for every occurrence thereafter.

Where value > 0x7F, the length (after the 84) will be set to 02 instead of 01 and the value will be 2 bytes (4 hex digits), for example 84020080a7

Does that make sense? Thanks for trying to understand!

Why should 0xF7 be decremented to 0x80 ?

Sorry, typo... I should have said 0x7F (corrected it now).

So all values greater than this, starting with 0x80, will be represented as 4 digits and the length set to 02, i.e. 84020080a7
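A minimal sketch of the rule as restated here: every 84 <length> <value> a7 group gets a counter that starts at 00, and from 0x80 on the length becomes 02 with a four-digit value. The function name `renumber_tags` is hypothetical. It assumes lowercase hex, a counter below 65536, and it does not check byte alignment, so it can be fooled by data that merely looks like a group.

```sh
# renumber_tags: read hex text on stdin and renumber every
# 84 01 vv a7 or 84 02 vvvv a7 group with a counter starting at 0.
# Hypothetical helper; no byte-alignment check is done.
renumber_tags() {
    awk 'BEGIN { n = 0 }
    {
        rest = $0
        out = ""
        # "." stands for one hex digit of the old value.
        while (match(rest, /84(01..|02....)a7/)) {
            out = out substr(rest, 1, RSTART - 1)
            if (n > 127) out = out sprintf("8402%04xa7", n)
            else         out = out sprintf("8401%02xa7", n)
            n++
            rest = substr(rest, RSTART + RLENGTH)   # continue after the a7
        }
        print out rest
    }'
}
```

It would be run as `renumber_tags < infile`. On the sample line posted above, it leaves the first group at 00 and sets the following ones to 01, 02, 03 and 04.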

Try

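awk '
{
        while (match($0, /84[^a]*a7/)) {
                # Copy everything up to and including the leading 84.
                printf "%s", substr($0, 1, RSTART + 1)
                # First match: take the start value from the data (how the "0x..."
                # string converts depends on the awk in use); later matches: count up.
                if (TMP == "")  TMP = sprintf("%.f", "0x" substr($0, RSTART + 4, RLENGTH - 6))
                else            TMP++
                # Length byte is 02 once the value exceeds 127, otherwise 01.
                LG = (TMP > 127) + 1
                printf "%02x%0*x", LG, 2 * LG, TMP
                # Continue scanning from the trailing a7.
                $0 = substr($0, RSTART + RLENGTH - 2)
        }
        print $0
}
' file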

Be aware that with lines longer than the system config parameter LINE_MAX (my system: 2048) the result may not be what you expect.

Hi securegooner,
If you increment 7F to 80 and on subsequent runs to 81, 82, 83, and 84, you then have an ambiguity as to where the string starting with 84 starts. Should the string 84017Fa7 be incremented to 84020080a7 or to 84020100a7? And can there only be values 01 and 02 after the leading 84? Or should 8402FFFFa7 or 84027F7Fa7 be incremented to 8403010000a7?

And will you really have an uppercase 7F and a lowercase a7 ?

Please be very sure that your specification is clear and complete and accurately specifies what you are trying to do.

Hi RudiC,
Note also that using match($0,/84[^a]*a7/) may get a false match on even byte boundaries such as 0840a7 , and RSTART and RLENGTH won't be set correctly for cases like 08012aa7 and 08020a0aa7 .

To get proper alignment, we have to be sure that characters are matched in pairs starting on odd character boundaries.
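One way to match in pairs on odd character positions is to walk the line two characters at a time instead of using a regex. The sketch below is only an illustration under stated assumptions: `renumber_aligned` is a hypothetical name, the hex is lowercase, the line starts on a byte boundary, and the counter stays below 65536.

```sh
# renumber_aligned: renumber 84 <len> <value> a7 groups, looking only at
# whole bytes, i.e. hex pairs starting at character positions 1, 3, 5, ...
# Hypothetical helper; assumes lowercase hex and a counter below 65536.
renumber_aligned() {
    awk 'BEGIN { n = 0 }
    {
        out = ""
        i = 1
        while (i <= length($0)) {
            vlen = 0                      # number of value digits, 0 = no group here
            if (substr($0, i, 2) == "84") {
                # Length byte 01 means 2 value digits, 02 means 4.
                if (substr($0, i + 2, 2) == "01" && substr($0, i + 6, 2) == "a7")
                    vlen = 2
                else if (substr($0, i + 2, 2) == "02" && substr($0, i + 8, 2) == "a7")
                    vlen = 4
            }
            if (vlen) {
                if (n > 127) out = out sprintf("8402%04xa7", n)
                else         out = out sprintf("8401%02xa7", n)
                n++
                i += 6 + vlen             # skip 84, length, value and a7
            } else {
                out = out substr($0, i, 2)   # ordinary byte, copy it through
                i += 2
            }
        }
        print out
    }'
}
```

With input `0840100a70840107a7`, the first `840100a7` starts at character 2 and is left alone, while the aligned `840107a7` is rewritten to `840100a7`.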

Thanks Don, valuable caveats indeed, although I regarded my attempt as more of a proof of concept than a bulletproof program.
False matches can be eliminated by having the regex count exactly the characters between 84 and a7 , which could be 4 or 6 according to the spec in post #1. And matches on full-byte boundaries can be identified by checking whether RSTART is odd or even. See below:

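awk '
{
        print                           # show the original line first
        while (match($0, /84....(..)?a7/)) {
                printf "%s", substr($0, 1, RSTART + 1)
                if (RSTART % 2) {
                        # Odd start: the match begins on a byte boundary, so rewrite it.
                        if (!TMP)       TMP = sprintf("%.f", "0x" substr($0, RSTART + 4, RLENGTH - 6))
                        else            TMP++
                        LG = (TMP > 127) + 1
                        printf "%02x%0*x", LG, 2 * LG, TMP
                } else {
                        # Even start: false match across bytes. Copy it through, but
                        # keep its last three characters for the next scan.
                        RLENGTH--
                        printf "%s", substr($0, RSTART + 2, RLENGTH - 4)
                }
                $0 = substr($0, RSTART + RLENGTH - 2)
        }
        print
}
' file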

I appreciate there are still some pitfalls that are not yet covered, but I think that's as far as we can get with the spec as given.

Thanks very much for your comments and suggestions RudiC and Don.

I will try the posted code on my data to see if it produces the output I wanted to achieve and get back to you...