Checksum differs for the same file created by two processes

siramitsharma · February 11, 2016, 12:20pm

I have a VB utility that generates the attached file (circle.14.mdn_range.properties_VB), and I have created another file (circle.14.mdn_range.properties_UTLFILE) with the same contents using UTL_FILE (Oracle running on Solaris). But the checksums of the two files are different even though the contents are the same. Can you suggest the reason?

CertUtil -hashfile circle.14.mdn_range.properties_VB MD5
MD5 hash of file circle.14.mdn_range.properties_VB:
1b 43 b6 a4 44 d0 4d 8a 8b 91 3e 7b d7 aa 4f 4e
CertUtil: -hashfile command completed successfully.
 
CertUtil -hashfile circle.14.mdn_range.properties_UTLFILE MD5  
MD5 hash of file circle.14.mdn_range.properties_UTLFILE:
ae 0c 75 b8 d5 19 05 64 d5 c6 54 ec 55 bc dc e0
CertUtil: -hashfile command completed successfully.

hicksd8 · February 11, 2016, 12:34pm

How do you know that the files are identical?

If you compare them using the "diff" command, does it say that they are the same?

If diff says that they're different, I suggest you look at them with a hex editor if they're not too big.
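
If no hex editor is at hand, a byte-level comparison can be sketched in a few lines of Python. This is only an illustration: the names first_difference and hex_context and the sample bytes are made up for the example and are not taken from the attached files.

```python
def first_difference(a: bytes, b: bytes):
    """Return the offset of the first differing byte, or None if a == b."""
    for offset, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return offset
    # One input may simply be a prefix of the other.
    if len(a) != len(b):
        return min(len(a), len(b))
    return None


def hex_context(data: bytes, offset: int, width: int = 8) -> str:
    """Hex dump of the bytes surrounding offset (needs Python 3.8+)."""
    start = max(0, offset - width)
    return data[start:offset + width].hex(" ")


# Hypothetical sample data: same visible text, different line endings.
sample_a = b"a=1\r\nb=2\r\n"
sample_b = b"a=1\nb=2\n"

pos = first_difference(sample_a, sample_b)
if pos is None:
    print("no byte-level difference")
else:
    print("first difference at offset", pos)
    print("file A:", hex_context(sample_a, pos))
    print("file B:", hex_context(sample_b, pos))
```

For real files, read each one with open(path, "rb").read() and pass the two byte strings to first_difference. Opening in binary mode matters, because text mode may translate line endings and hide the difference.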

siramitsharma · February 11, 2016, 12:36pm

Hi Hick,
Yes, diff says both files are the same. That is why I don't understand why the checksums are different.

Any ideas?

RudiC · February 11, 2016, 12:52pm

Those files are not identical.

hicksd8 · February 11, 2016, 1:07pm

I agree with RudiC. Those files cannot be identical (unless your md5 calculator is faulty).

When you compared them, did you do a binary compare or only an ASCII one?

siramitsharma · February 12, 2016, 1:14am

Yes, the files differ in the data they contain; apart from that there is no difference. My questions are:

Does having the data in ascending order in one file and in descending order in the other give different checksums?
If diff reports this difference, how do I compare the files in binary or ASCII mode using hexdump?

fpmurphy · February 12, 2016, 2:16am

Yes. You will only get the same hash if both files are IDENTICAL in every way.
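
A quick way to see that record order matters is to hash the same lines in two different orders. The sketch below uses Python's hashlib; the sample records are invented for illustration and the helper name md5_hex is hypothetical.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of data as a hex string."""
    return hashlib.md5(data).hexdigest()


# Hypothetical records: the same three lines, in opposite order.
ascending = b"1\n2\n3\n"
descending = b"3\n2\n1\n"

print("ascending :", md5_hex(ascending))
print("descending:", md5_hex(descending))
print("same set of lines:", sorted(ascending.split()) == sorted(descending.split()))
print("same digest:", md5_hex(ascending) == md5_hex(descending))
```

The digest is computed over the exact byte sequence, so two files holding the same records in a different order are different inputs as far as MD5 is concerned.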

achenle · February 13, 2016, 11:26am

That's not STRICTLY correct, as any hash function effectively "compresses" data down to a fixed number of bits so there have to be collisions where different data produces identical hash results.

Hash functions try to make it impossible to modify data in a way that predictably produces the same hash result as the unmodified data. (Which is why, when that protective ability is cracked, the hash function gets abandoned for more secure ones. FWIW, cracks started appearing in MD5 a long time ago and now it's effectively completely cracked. It's still useful for comparing data, though it's not really secure for validating it any more.)

So to be STRICTLY correct, when the hash results differ then the data differs. If the hash results are the same you need to check if the data differs.
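
That rule can be sketched as a two-step check: treat differing hashes as proof of a difference, and fall back to a byte-for-byte comparison when the hashes match. This is a minimal sketch, assuming Python; file_md5 and same_content are hypothetical helper names, and filecmp.cmp with shallow=False is the standard-library call that compares actual file contents.

```python
import filecmp
import hashlib


def file_md5(path, chunk_size=65536):
    """MD5 of a file, read in binary mode one chunk at a time."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(path_a, path_b):
    """True only if both files hold exactly the same bytes."""
    # Different hashes: the data definitely differs.
    if file_md5(path_a) != file_md5(path_b):
        return False
    # Equal hashes: confirm by comparing the bytes themselves.
    return filecmp.cmp(path_a, path_b, shallow=False)
```

If the two files are both on hand, the byte comparison alone already gives the answer; the hash step earns its keep when only a digest of one of the files is available.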

(Yes, I've been spending a lot of time recently dealing with cryptography and hashes for a customer...)