Find and increment value in string of hex

securegooner · December 19, 2017, 6:33am

I have a long string of hex (from ASN.1 data) where I need to find and change a particular hex value only and increment it. The hex pairs either side ( 84 and a7 ) of the value to increment will remain constant.

i.e. " 84 <length> <value_to_increment> a7 " starting with 00 .

So end result:

840100a7
840101a7
...
84020080a7 (length changes to 02 when value reaches 80, 128dec)
84020081a7

Any ideas how best to do this?

rdrtx1 · December 19, 2017, 7:19am

sed 's/84/\n84/g; s/a7/a7\n/g' infile | awk --non-decimal-data '
   /84[0-9][0-9]..*[aA]7/ {
      a=substr($0, 1, 4);
      b=substr($0, 5, length($0)-6);
      c=substr($0, length($0)-1);
      l=substr($0, 3, 2)*2;
      b=sprintf("%d", "0x" b); b=sprintf("%0*x", l, b+1)
      $0=a b c
   }
   {printf "%s", $0}   # print $0 as data, not as a printf format string
   '

RudiC · December 19, 2017, 8:15am

Not clear.
Do you have a string with above structure and want to produce a file from it, incrementing the value indicated line by line? Then, where to stop?
Or, do you have a file with above structured values in the lines, and want to just increment each line's value?
I guess, the value 0x7F needs special treatment, as the length has to be incremented, and the value will need four places, then?

securegooner · December 19, 2017, 9:58am

Sorry first poster trying to explain...

So I have a file containing one long line of hex (converted from BER ASN.1), i.e.

30820170a15c80070400020205010d8119877532221275e5727de311be4e938334297fb650ff00ff00ff82024e4ca31fa01580083033313035343030a2098107831341090000018102689582024e4c840100a70b80045940f1598103058390880102a282010ea4820...

I want to find all occurrences between the hex pairs 84 and a7 and replace these values by (length)(value) where length = 02 if value > 127dec, otherwise 01. And value should be incremented (beginning at 0) for each occurrence.

value will be equal to 2 bytes (4 places) where length = 02.
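A minimal sketch of that length rule on its own, assuming lower-case hex and values no larger than 0xffff. The function name `encode_field` is made up for illustration. If the field is a BER INTEGER, the jump to two bytes at 128 is what you would expect, since BER integers are signed and a lone 0x80 byte would read as negative; that is background, not something stated in this thread.

```shell
#!/bin/sh
# encode_field N: print "84 <length> <value> a7" for decimal N (hypothetical helper).
# Length is 01 with a 2-digit value up to 127, 02 with a 4-digit value above that.
encode_field() {
    if [ "$1" -gt 127 ]; then
        printf '8402%04xa7\n' "$1"
    else
        printf '8401%02xa7\n' "$1"
    fi
}
```

For example, `encode_field 127` prints 84017fa7 and `encode_field 128` prints 84020080a7.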

RudiC · December 19, 2017, 10:19am

Now, this is completely different from what you posted in #1. And, still not clear.
There's only one pattern occurrence in the above sample, with value 00 . That should become 01 , understood. But you don't say anything about the next pattern occurrence. Should that become 02 regardless of its initial value? Or its initial value + 1? And you didn't answer the question about the 0x7F value, should that ever occur.

securegooner · December 19, 2017, 10:33am

It's the same question but worded differently, though I appreciate it's still not clear.

Ok, new example, original file:

30820170a15c800704000202050840100a710d8119877532221275e5727de31840100a71be4e938334297fb650ff00ff00ff820840100a724e4ca31fa01580083033313840100a7035343030a2098107831341090000018102689582024e4c840100a70b80045940f15981030583908

should become:

30820170a15c800704000202050840100a710d8119877532221275e5727de31840101a71be4e938334297fb650ff00ff00ff820840102a724e4ca31fa01580083033313840103a7035343030a2098107831341090000018102689582024e4c840104a70b80045940f15981030583908

So regardless of the initial value it should be replaced, beginning with 00 for the first occurrence and incremented for every occurrence thereafter.

Where value > 0x7F the length (after the 84) will be set to 02 instead of 01 and the value will be 2 bytes (4 hex digits), for example 84020080a7

Does that make sense? Thanks for trying to understand!
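The request as restated here (ignore the old value, number the occurrences from 0, switch to length 02 above 127) could be sketched roughly as below. This is only an illustration under assumptions: lower-case hex, an existing length byte of 01 or 02, a counter that stays below 0x10000, and no check of byte alignment. The function name `renumber` and its optional start argument are made up for this sketch.

```shell
#!/bin/sh
# renumber [START]: read hex lines on stdin and rewrite each
# 84 01 vv a7 or 84 02 vvvv a7 run with a running counter (hypothetical helper).
renumber() {
    awk -v n="${1:-0}" '
    BEGIN { n += 0 }                      # force the counter to be numeric
    {
        out = ""
        rest = $0
        # Find the next candidate, copy what precedes it, emit the new field.
        while (match(rest, /84(01..|02....)a7/)) {
            out = out substr(rest, 1, RSTART - 1)
            if (n > 127) out = out sprintf("8402%04xa7", n)
            else         out = out sprintf("8401%02xa7", n)
            n++
            # Continue after the old field so new output is never rescanned.
            rest = substr(rest, RSTART + RLENGTH)
        }
        print out rest
    }'
}
```

Called as `renumber < file`, this numbers the occurrences 00, 01, 02 and so on across the whole input; `renumber 127` starts the counter at 127 to exercise the switch to 84020080a7.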

RudiC · December 19, 2017, 11:15am

Why should 0xF7 be decremented to 0x80 ?

securegooner · December 19, 2017, 11:25am

Sorry, typo... should have said 0x7F (corrected it now).

So all values greater than this, starting with 0x80, will be represented as 4 digits and the length set to 02, i.e. 84020080a7

RudiC · December 19, 2017, 11:30am

Try

awk '
        {while (match ($0, /84[^a]*a7/))        {printf "%s", substr ($0, 1, RSTART + 1)
                                                 if (TMP == "") TMP = sprintf ("%.f", "0x" substr ($0, RSTART + 4 , RLENGTH - 6))
                                                   else         TMP++
                                                 LG = (TMP > 127) + 1
                                                 printf "%02x%0*x", LG, 2*LG, TMP
                                                 $0 = substr ($0, RSTART + RLENGTH - 2)
                                                }
         print $0
        }
' file

Be aware that with lines longer than the system config parameter LINE_MAX (my system: 2048) the result may not be what you expect.

Don_Cragun · December 19, 2017, 1:27pm

Hi securegooner,
If you increment 7F to 80 and on subsequent runs to 81, 82, 83, and 84; you then have an ambiguity as to where the string starting with 84 starts. Should the string 84017Fa7 be incremented to 84020080a7 or to 84020100a7 ??? And, can there only be values 01 and 02 after the leading 84? Or, should 8402FFFFa7 or 84027F7Fa7 be incremented to 8403010000a7 ?

And will you really have an uppercase 7F and a lowercase a7 ?

Please be very sure that your specification is clear and complete and accurately specifies what you are trying to do.

Hi RudiC,
Note also that using match($0,/84[^a]*a7/) may get a false match on even byte boundaries such as 0840a7 and RSTART and RLENGTH won't be set correctly for cases like 08012aa7 and 08020a0aa7 .

To get proper alignment, we have to be sure that characters are matched in pairs starting on odd character boundaries.
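One way to keep matches on byte boundaries is to walk the line two characters at a time and only test for the pattern at the start of a pair. The sketch below does that, assuming lower-case hex, a length byte of 01 or 02, and a counter below 0x10000. The name `renumber_aligned` and its optional start argument are made up for illustration. It is still not a real BER parse, so an 84 that is actually a length or content byte can be mistaken for the tag.

```shell
#!/bin/sh
# renumber_aligned [START]: like a plain search and replace, but candidates
# are only accepted when they start on an odd character position (hypothetical helper).
renumber_aligned() {
    awk -v n="${1:-0}" '
    BEGIN { n += 0 }                      # force the counter to be numeric
    {
        out = ""
        len = length($0)
        i = 1
        while (i <= len) {
            chunk = substr($0, i, 10)     # longest possible field is 10 chars
            if (chunk ~ /^8401..a7/)        width = 8
            else if (chunk ~ /^8402....a7/) width = 10
            else                            width = 0
            if (width) {
                if (n > 127) out = out sprintf("8402%04xa7", n)
                else         out = out sprintf("8401%02xa7", n)
                n++
                i += width                # skip the whole old field
            } else {
                out = out substr($0, i, 2)
                i += 2                    # copy one byte and move on
            }
        }
        print out
    }'
}
```

With the input 0840100a7b840105a7, the characters 840100a7 that straddle byte boundaries near the start are left alone and only the aligned field at the end is renumbered.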

RudiC · December 19, 2017, 3:07pm

Thanks Don, valuable caveats indeed, although I regarded my attempt more a proof of concept than a bullet proof program.
Eliminating false matches can be done by counting, in the regex, exactly the characters between 84 and a7 , which could be 4 or 6 according to the spec in post #1. And finding even boundaries, equivalent to full bytes, can be done by evaluating the evenness or oddness of RSTART. See below:

awk '
        {print
         while (match ($0, /84....(..)?a7/))    {printf "%s", substr ($0, 1, RSTART + 1)
                                                 if (RSTART%2)  {if (!TMP)      TMP = sprintf ("%.f", "0x" substr ($0, RSTART + 4 , RLENGTH - 6))
                                                                    else        TMP++
                                                                 LG  = (TMP > 127) + 1
                                                                 printf "%02x%0*x", LG, 2*LG, TMP
                                                                }
                                                        else    {RLENGTH--
                                                                 printf "%s", substr ($0, RSTART + 2, RLENGTH - 4)
                                                                }
                                                 $0 = substr ($0, RSTART + RLENGTH - 2)
                                                }
         print
        }
' file

I appreciate there are still some pitfalls that are not yet covered, but I think that's as far as we can get with the spec as given.

securegooner · December 20, 2017, 9:02am

Thanks very much for your comments and suggestions RudiC and Don.

I will try the posted code on my data to see if it produces the output I wanted to achieve and get back to you...