Convert Binary File To Hex In Linux

Hi,
I have the attached file (the actual file can be extracted by unzipping it) and I am trying to convert it to hex format with the command below.
Each record starts with the hex value 84 and is terminated by 00 00, which is followed by the 84 of the next record. I can see this clearly in the dump produced by the od command below, but I want each record on its own line so that further processing can be done.

od -t x1 DDI15.09.02.C

Some lines of output from the above command are as follows:

0000000 84 7d 00 0c 09 40 11 8a 14 93 51 51 72 74 0f 09
0000020 02 0f 05 26 0d 65 06 51 55 21 67 01 00 00 6e 06
0000040 06 a0 5a 6e 7a 0b 00 82 05 10 00 00 8e 0a 03 13
0000060 8a 14 93 51 51 72 98 09 03 04 46 04 04 32 04 9a
0000100 07 01 05 00 20 00 a8 08 01 10 06 51 55 21 c6 0f
0000120 01 05 b0 00 00 02 08 29 00 00 a9 00 00 cc 18 01
0000140 03 01 03 04 92 18 04 04 92 18 05 08 02 92 18 00
0000160 01 47 06 03 01 96 08 af 00 cf 14 00 00 84 86 00
0000200 0c 08 40 11 6a 14 15 10 77 60 74 0f 09 02 0f 03
0000220 01 01 65 0a 94 14 10 92 15 67 03 00 00 6c 4f 56
0000240 4f 44 41 4c 07 00 09 6e 06 4f b1 85 aa 7a 2a 00
0000260 80 0c 03 12 0e 41 03 94 14 10 92 15 82 05 10 00
0000300 00 8e 0a 03 13 6a 14 15 10 77 60 a8 0a 01 10 0a
0000320 94 14 10 92 15 c6 0f 01 05 d8 06 00 02 08 26 00
0000340 00 80 00 00 cc 17 01 03 01 02 03 30 03 04 4b 18
0000360 05 08 02 4b 18 00 00 26 06 03 01 96 08 b0 00 cf
0000400 14 00 00 84 86 00 0c 08 40 11 6a 29 15 10 40 85
0000420 74 0f 09 02 0f 05 0f 01 65 0a 98 28 54 86 51 67
0000440 01 00 00 6c 4f 56 4f 44 41 4c 08 00 1f 6e 06 9d
0000460 90 0a 18 7a 2a 00 80 0c 03 12 0e 41 03 98 28 54
0000500 86 51 82 05 10 00 00 8e 0a 03 13 6a 29 15 10 40
0000520 85 a8 0a 01 10 0a 98 28 54 86 51 c6 0f 01 05 9b
0000540 01 00 02 08 54 00 00 60 00 00 cc 17 01 03 01 02
0000560 03 30 03 04 f9 17 05 08 02 f9 17 00 03 42 06 03
0000600 01 96 08 b1 00 cf 14 00 00 84 91 00 0c 10 40 11
0000620 6a 14 15 15 32 24 74 0f 09 02 0f 00 27 0b 65 0a
0000640 93 51 68 48 70 66 02 00 00 67 06 00 00 6c 42 52
0000660 4c 4c 4f 43 03 00 0e 6e 06 37 f1 59 fe 7a 2a 00
0000700 80 0c 03 12 0e 31 25 93 51 68 48 70 82 05 10 00
0000720 00 8e 0a 03 13 6a 14 15 15 32 24 a8 0a 01 10 0a
0000740 93 51 68 48 70 ac 0b 12 03 10 6a 14 15 15 32 24
0000760 c6 0f 01 05 63 0c 00 02 08 05 00 00 5f 00 00 cc
0001000 13 01 03 04 02 03 30 05 08 00 04 00 37 01 12 06
0001020 03 01 96 08 b2 00 cf 14 00 00 84 91 00 0c 10 40
0001040 11 6a 14 15 15 32 25 74 0f 09 02 0f 05 0a 05 65

The required output is as follows:

84 7d 00 0c 09 40 11 8a 14 93 51 51 72 74 0f 09 02 0f 05 26 0d 65 06 51 55 21 67 01 00 00 6e 06 8a 14 93 51 51 72 98 09 03 04 46 04 04 32 04 9a 07 01 05 00 20 00 a8 08 01 10 06 51 55 21 c6 0f 01 05 b0 00 00 02 08 29 00 00 a9 00 00 cc 18 01 03 01 03 04 92 18 04 04 92 18 05 08 02 92 18 00 01 47 06 03 01 96 08 af 00 cf 14 00 00 
84 00 0c 08 40 11 6a 14 15 10 77 60 74 0f 09 02 0f 03 01 01 65 0a 94 14 10 92 15 67 03 00 00 6c 4f 56 4f 44 41 4c 07 00 09 6e 06 4f b1 85 aa 7a 2a 00 00 8e 0a 03 13 6a 14 15 10 77 60 a8 0a 01 10 0a 94 14 10 92 15 c6 0f 01 05 d8 06 00 02 08 26 00 00 80 00 00 cc 17 01 03 01 02 03 30 03 04 4b 18 05 08 02 4b 18 00 00 26 06 03 01 96 08 b0 00 cf 14 00 00

and so on....

Hello siramitsharma,

Could you please attach the actual file? I was having issues opening it.

Thanks,
R. Singh

PFA the zip file again.

Do you have the command hexdump?

Something like:-

 ARRAY=( `hexdump -v -e '1/1 "%02x "' /path/to/filename` ) 

For a space-delimited flat array, or:-

 VAR=`hexdump -v -e '1/1 "%02x "' /path/to/filename` 

This gives the complete string of hex pairs, with a space between each pair, in a single flat variable.
Once you have either, __editing__ is easy...

EDIT:
If you have only od then try:-

 ............ od -tx1 -An ............ 

Note the -An part and again using a similar approach to 'hexdump' above...
(Check that your od supports these options using 'man od'.)
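
A minimal sketch of the od-only route, assuming a POSIX od and awk; the function name hex_stream and the sample file are made up for illustration. It adds -v, which stops od from collapsing runs of identical output lines into a single '*', and joins od's 16-byte lines into one space-separated stream similar to what the hexdump command above produces:

```shell
# hex_stream FILE: print every byte of FILE as two hex digits, space separated,
# on a single line (hypothetical helper, not part of od).
hex_stream() {
    od -An -v -tx1 "$1" |
    awk '{ for (i = 1; i <= NF; i++) printf "%s%s", (c++ ? " " : ""), $i }
         END { if (c) printf "\n" }'
}

# Usage: three bytes 0x84 0x7d 0x00 written with octal escapes.
sample="${TMPDIR:-/tmp}/hex_stream_sample.bin"
printf '\204\175\000' > "$sample"
hex_stream "$sample"        # prints: 84 7d 00
rm -f "$sample"
```

The result can then be captured in a variable or array in the same way as the hexdump output.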

Not sure if this will work with your large files:

od -tx1 -w655360 -An | sed 's/\(00 00\) \(84\)/\1\n\2/g'

od might not want a width that large, and sed might be overburdened with lines that long.
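
One way around both concerns, sketched under the assumption that a record boundary is any 84 directly preceded by 00 00 (the same rule as the sed expression above): feed awk one byte per line, so that no tool ever sees a line longer than one record. hex_records is a hypothetical name, and the sample bytes are invented.

```shell
# hex_records FILE: print one record per line, starting a new line before
# every 84 that directly follows the two bytes 00 00.
hex_records() {
    od -An -v -tx1 "$1" |
    tr -s ' ' '\n' |                 # one byte per line, blank runs squeezed
    awk '
        NF == 0 { next }             # skip the empty line tr may leave behind
        $1 == "84" && p1 == "00" && p2 == "00" { printf "\n"; n = 0 }
        { printf "%s%s", (n++ ? " " : ""), $1; p2 = p1; p1 = $1 }
        END { if (n) printf "\n" }
    '
}

# Usage with a tiny invented file holding three short records.
sample="${TMPDIR:-/tmp}/hex_records_sample.bin"
printf '\204\175\000\000\204\206\001\000\000\204\221' > "$sample"
hex_records "$sample"
rm -f "$sample"
```

Like the sed version, this also splits on a 00 00 84 sequence that happens to occur inside a record.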

Thanks wisecracker & RudiC..

Thanks Rudi & wisecracker, I have used the following code:

hexdump -v -e '1/1 "%02x "' DDI15.09.02.C|sed 's/\(00 00\) \(84\)/\1\n\2/g' >DDI15.09.02.C_HEX

After this, I have a problem producing the processed output from the hex file generated above. Below is the code for reference:


awk 'function ano(i)
{
$str=sprintf("%d%d%d%d%d",$i,$(i+1),$(i+2),$(i+3),$(i+4),$(i+5),$(i+6))
return $str
}
function timestamp(j)
{
$s=sprintf("%02d/%02d/%02d %02d:%02d:%02d",strtonum("0x"$j),strtonum("0x"$(j+1)),strtonum("0x"$(j+2)),strtonum("0x"$(j+3)),strtonum("0x"$(j+4)),strtonum("0x"$(j+5)),strtonum("0x"$(j+6)))
return $s
}
{
print ano(9),timestamp(15)
} ' DDI15.09.02.C_HEX

The output is as follows:

1493515172 00/00/00 00:00:00
1415107760 00/00/00 00:00:00
2915104085 00/00/00 00:00:00
1415153224 00/00/00 00:00:00
1415153225 00/00/00 00:00:00
1415109892 00/00/00 00:00:00
1415118252 00/00/00 00:00:00
2945107422 00/00/00 00:00:00
291515250 00/00/00 00:00:00
1415166271 00/00/00 00:00:00

whereas when I run the timestamp function alone, the required output is produced. Can you please help?

What exactly are you trying to achieve?

Hi RudiC,
I am trying to write the required output (sequence number and timestamp) from the hex-converted file (DDI15.09.02.C_HEX), which was generated from the original binary file (DDI15.09.02.C) attached earlier.

As long as you keep us guessing (e.g. on your file structure) and don't post exact specifications, I'm afraid we (at least I) can't help further.

OK, let me restate the whole thing.

I have converted the binary file to hex using the following command:

hexdump -v -e '1/1 "%02x "' DDI15.09.02.C|sed 's/\(00 00\) \(84\)/\1\n\2/g' >DDI15.09.02.C_HEX

The format of DDI15.09.02.C_HEX is as follows:

84 7d 00 0c 09 40 11 8a 14 93 51 51 72 74 0f 09 02 0f 05 26 0d 65 06 51 55 21 67 01 00 00 6e 06 8a 14 93 51 51 72 98 09 03 04 46 04 04 32 04 9a 07 01 05 00 20 00 a8 08 01 10 06 51 55 21 c6 0f 01 05 b0 00 00 02 08 29 00 00 a9 00 00 cc 18 01 03 01 03 04 92 18 04 04 92 18 05 08 02 92 18 00 01 47 06 03 01 96 08 af 00 cf 14 00 00 
84 00 0c 08 40 11 6a 14 15 10 77 60 74 0f 09 02 0f 03 01 01 65 0a 94 14 10 92 15 67 03 00 00 6c 4f 56 4f 44 41 4c 07 00 09 6e 06 4f b1 85 aa 7a 2a 00 00 8e 0a 03 13 6a 14 15 10 77 60 a8 0a 01 10 0a 94 14 10 92 15 c6 0f 01 05 d8 06 00 02 08 26 00 00 80 00 00 cc 17 01 03 01 02 03 30 03 04 4b 18 05 08 02 4b 18 00 00 26 06 03 01 96 08 b0 00 cf 14 00 00

From the hex file above, I want to extract the data from the 9th field onwards, which is handled by function ano(i) in the code below, and the timestamp from the 15th field onwards, which is handled by function timestamp(j). I get the required output from ano, but timestamp returns all zeros.
I hope this clarifies the requirement.

awk 'function ano(i)
{
$str=sprintf("%d%d%d%d%d",$i,$(i+1),$(i+2),$(i+3),$(i+4),$(i+5),$(i+6))
return $str
}
function timestamp(j)
{
$s=sprintf("%02d/%02d/%02d %02d:%02d:%02d",strtonum("0x"$j),strtonum("0x"$(j+1)),strtonum("0x"$(j+2)),strtonum("0x"$(j+3)),strtonum("0x"$(j+4)),strtonum("0x"$(j+5)),strtonum("0x"$(j+6)))
return $s
}
{
print ano(9),timestamp(15)
} ' DDI15.09.02.C_HEX

Hi siramitsharma...

I have expanded your zip file and the start bytes are nothing like yours, that helps - not.

So $9, $10, $11, $12 and $13 are BCD, and $15, $16, $17 are HEX values for the date in the file you gave, that is 15\09\03; and similarly $18, $19, $20 for the time...

Why did you not tell us that part of the info is BCD and the other part needs HEX to decimal conversion?

Longhand, for the second part only and for just ONE set of values, with a newline added just to make it clear...
The BCD part is easy...
OSX 10.7.5, default bash terminal...

Last login: Mon Sep  7 13:58:50 on ttys000
AMIGA:barrywalker~> IFS=" "
AMIGA:barrywalker~> ARRAY=( $( cat /tmp/hexstring ) )
AMIGA:barrywalker~> printf "%02d\%02d\%02d %02d:%02d:%02d\n" 0x${ARRAY[14]} 0x${ARRAY[15]} 0x${ARRAY[16]} 0x${ARRAY[17]} 0x${ARRAY[18]} 0x${ARRAY[19]}
15\09\03 18:43:11
AMIGA:barrywalker~> _

Thanks Wisecracker & RudiC for the help, and apologies for not sharing the info earlier. I thought I would be able to do it myself.
Anyway, can you please suggest an alternative that fits my requirement? Thanks in advance.

Please take way more care when specifying your request! Some comments:
a) the two lines in your sample file in post#11 don't have the same structure, so anything developed for one line will fail on the other.
b) in your "ano" function you use bytes 9 up to 9+6=15, and in "timestamp" you use bytes 15+. There's an overlap: byte 15 is used/interpreted/worked upon twice; it's beyond my imagination that this is correct.
c) you didn't spend a second explaining the structure of the file; wisecracker (as well as others like me) did a good job of guessing/determining it, but this is certainly part of the specification and not the coder's job.
d) the strtonum function is not offered by ALL awk versions, and I think you only need it in very special cases due to the string/number dualism of awk's variables.
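
On point d): whether a string such as "0x0f" converts to 15 depends on the awk implementation; gawk, for one, treats it as 0 unless strtonum() or --non-decimal-data is used. A sketch of a conversion that relies only on POSIX awk features follows. hex2dec and decode_ts are illustrative names, and the field positions 15 to 20 are the ones identified earlier in the thread.

```shell
# decode_ts: read space-separated hex records on stdin and print fields 15..20
# of each record as a date and time, converting hex to decimal by hand.
decode_ts() {
    awk '
        # hex2dec: value of a hex string such as "0f" or "0F"; -1 on bad input.
        function hex2dec(h,    i, n, d) {
            h = tolower(h); n = 0
            for (i = 1; i <= length(h); i++) {
                d = index("0123456789abcdef", substr(h, i, 1)) - 1
                if (d < 0) return -1
                n = n * 16 + d
            }
            return n
        }
        { printf "%02d/%02d/%02d %02d:%02d:%02d\n",
                 hex2dec($15), hex2dec($16), hex2dec($17),
                 hex2dec($18), hex2dec($19), hex2dec($20) }
    '
}

# Usage with the first 20 bytes of the first sample record from the thread.
echo '84 7d 00 0c 09 40 11 8a 14 93 51 51 72 74 0f 09 02 0f 05 26' | decode_ts
# prints: 15/09/02 15:05:38
```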

---------- Post updated at 20:52 ---------- Previous update was at 18:18 ----------

However, the main problem in your awk script is the assignments to $str and $s . In awk, $str is a field reference, not a variable: as neither str nor s is defined, they evaluate to "" or 0 , so both assignments overwrite $0 , i.e. the source line being worked upon. That's why, after the call to "ano", "timestamp" has nothing left to read.
Try

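awk '
# Return six consecutive fields, starting at field i, as zero-padded decimal pairs.
function ano(i)
        {return sprintf("%02d%02d%02d%02d%02d%02d",$i,$(i+1),$(i+2),$(i+3),$(i+4),$(i+5))
        }
# Return six consecutive hex fields, starting at field j, as a date and a time.
function timestamp(j)
        {return sprintf("%02d/%02d/%02d %02d:%02d:%02d", "0x"$j,"0x"$(j+1),"0x"$(j+2),"0x"$(j+3),"0x"$(j+4),"0x"$(j+5))
        }

        {print ano(9),timestamp(15)
        }
' file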
149351517274 15/09/02 15:05:38
151077607400 09/02/15 03:01:01
141510776074 15/09/02 15:03:01

The second output line is from the second line of your sample file as posted; the third is the same line after correcting the byte sequence by inserting a 0x7D in the second position of the line.
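
The overwriting of $0 described above can be reproduced in isolation. In this small sketch, buggy and fixed are made-up function names and the three input fields are arbitrary:

```shell
# buggy: str is never assigned, so $str means $0 and the assignment
# replaces the whole input record before $3 is read.
buggy() { awk '{ $str = sprintf("%s%s", $1, $2); print $3 }'; }

# fixed: str is an ordinary variable, so the input record stays intact.
fixed() { awk '{ str = sprintf("%s%s", $1, $2); print $3 }'; }

echo '14 93 0f' | buggy   # prints an empty line: the record is now "1493"
echo '14 93 0f' | fixed   # prints: 0f
```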

Just as a helper: the byte after the '0x84', that is '0x7D', is the number of bytes to the NEXT '0x84'.

Therefore the next '0x84' is followed by a byte of '0x86', pointing to the '0x84' after that, and so on.
This is easy to code for...

EDIT:
Forgot to add: do not include the byte '0x84' in the byte count per section...
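
A sketch of how that could be coded, assuming the length byte counts every byte from one '0x84' up to, but not including, the next one. That reading fits the od dump in the first post: the length byte is 0x7d = 125, and the second '0x84' sits at octal offset 0175 = 125. split_by_length and hex2dec are illustrative names, and the sample bytes are invented.

```shell
# split_by_length FILE: print one record per line, taking the record length
# from the second byte of each record instead of searching for 00 00 84.
split_by_length() {
    od -An -v -tx1 "$1" |
    tr -s ' ' '\n' |
    awk '
        # hex2dec: value of a lowercase hex string as printed by od.
        function hex2dec(h,    i, n) {
            n = 0
            for (i = 1; i <= length(h); i++)
                n = n * 16 + index("0123456789abcdef", substr(h, i, 1)) - 1
            return n
        }
        NF == 0 { next }                       # skip the empty line from tr
        {
            pos++                              # position inside the record
            if (pos == 2) len = hex2dec($1)    # second byte is the length
            printf "%s%s", (pos > 1 ? " " : ""), $1
            if (pos >= 2 && pos >= len) { printf "\n"; pos = 0 }
        }
        END { if (pos) printf "\n" }           # flush a truncated last record
    '
}

# Usage: the first invented record contains 00 00 84 in its middle.
sample="${TMPDIR:-/tmp}/split_by_length_sample.bin"
printf '\204\006\000\000\204\001\204\004\000\000\204\003\252' > "$sample"
split_by_length "$sample"
rm -f "$sample"
```

Unlike splitting on 00 00 84, this does not break a record when those three bytes happen to occur inside it.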