Differentiate between MS Word and Excel files in Unix

Hi,

I want to differentiate between an MS Word file and an MS Excel file in Unix (not by extension). The condition we currently check for is the pattern "\320\317\021\340" within the first 40 bytes of the file. However, this signature is the same in all MS Office files. Can somebody tell me which bytes or special characters we could check to tell an MS Word file from an MS Excel file?

Regards,
Rajan
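
For reference, the check described above could be sketched roughly as follows. This is only an illustration in Python, not the poster's actual script: the function name looks_like_ms_office and the window parameter are made up for the example, and the 40-byte window is taken from the description in the question.

```python
# Hypothetical sketch of the check described in the question.
# The octal pattern "\320\317\021\340" is the same as hex D0 CF 11 E0.
OLE2_PREFIX = b"\xd0\xcf\x11\xe0"


def looks_like_ms_office(path, window=40):
    """Return True if the pattern occurs within the first `window` bytes.

    `window=40` mirrors the question; the name and parameter are assumptions.
    """
    with open(path, "rb") as f:
        head = f.read(window)
    return OLE2_PREFIX in head
```

As the question says, this only tells you the file is some kind of MS Office binary document; it cannot tell Word from Excel.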

Excel files are BIFF format files - they don't have any special 'identifying' string.
Attached is the WORD binary file format.

Neither is very helpful. You can try Wotsit.org for more details.
AFAIK Windows actually uses the extension to determine which app to use to open
XLS and DOC files.

phatak_rajanm, can you tell us what exactly the use case is?
Can you use the Unix/Linux 'file' utility?
For example, if I create an empty file with "touch test.xls" and then run "file test.xls", it tells me: empty file. If I enter some text, it tells me: ascii text. If I pass it a real XLS file, it says: "Microsoft Office Document". So it seems to me that the 'file' command does tell these cases apart, and not based on the extension.

Here's one natural use case: Open Excel files with Gnumeric but Word files in OpenOffice.

Hm... I just noticed that the 'file' utility does not distinguish between Excel files and Word files: both are reported as "Microsoft Office Document". I'm out of suggestions, but relying on the file's extension can also work; you can just warn the users to pay attention to file names and especially extensions.
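
One possible heuristic, offered as a rough sketch rather than a definitive answer: both formats are OLE2 compound files, and the names of the streams stored inside them differ. A Word file normally has a stream named "WordDocument", and an Excel 97 or later file normally has one named "Workbook", with the names stored as UTF-16LE text. The Python below simply scans the raw bytes for those names. The function name guess_office_type and the returned labels are invented for this example, and the scan can be fooled (for instance by a Word file with an embedded spreadsheet), so treat the result as a guess.

```python
# Heuristic sketch: guess Word vs. Excel from OLE2 stream names.
# Assumptions: stream names appear as UTF-16LE text somewhere in the file;
# the function name and return labels are hypothetical.
OLE2_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"  # full 8-byte OLE2 signature
WORD_STREAM = "WordDocument".encode("utf-16-le")
EXCEL_STREAM = "Workbook".encode("utf-16-le")


def guess_office_type(path):
    """Return 'word', 'excel', 'unknown', or 'not-ole2'."""
    with open(path, "rb") as f:
        data = f.read()
    if not data.startswith(OLE2_MAGIC):
        return "not-ole2"
    has_word = WORD_STREAM in data
    has_excel = EXCEL_STREAM in data
    if has_word and not has_excel:
        return "word"
    if has_excel and not has_word:
        return "excel"
    # Neither name found, or both found: do not pretend to know.
    return "unknown"
```

Usage would be something like guess_office_type("report.doc"). As far as I know, older Excel 5/95 files name the stream "Book" instead of "Workbook", so those would come back as "unknown" here. A proper OLE2 parser that reads the directory entries would be more reliable than a raw byte scan.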