Keyword Searching

Hi all,

I am building a shell script as part of an auditing utility. It searches a specified directory for keywords and reports the path of each matching file and the line number the word was found on. I built a test script (shown below) that does just this, but egrep apparently cannot read MS Word, Excel, and similar documents. Could someone point me in an alternate direction that would let me search these types of documents as well? (wordlist is a file that is created elsewhere and contains the list of words to search for, e.g. bus.)

Thanks!

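# HOSTNAME must be set before the here-document below, which expands it when scanit is written.
HOSTNAME=${HOSTNAME:-`hostname`}

cat << EOF > ${TMPDIR}/scanit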
rm -f ${TMPDIR}/strings
# Match wordlist entries plus SSN-style numbers (ddd-dd-dddd).
strings "\$1" | egrep -n -i -f ${TMPDIR}/wordlist -e '^[0-9]{3}-[0-9]{2}-[0-9]{4}\$' >> ${TMPDIR}/strings
if [ -s ${TMPDIR}/strings ]
then
	echo >> ${TMPDIR}/${HOSTNAME}.o
	echo "File:  \$1" >> ${TMPDIR}/${HOSTNAME}.o
	file "\$1"  >> ${TMPDIR}/${HOSTNAME}.o
	cat ${TMPDIR}/strings >> ${TMPDIR}/${HOSTNAME}.o
fi
rm -f ${TMPDIR}/strings
EOF

HOSTNAME=`hostname`
export HOSTNAME

if [ $# -eq 0 ]
then
	echo "You must specify the start of the directory tree to search" >&2
	exit 1
fi

find "$1" -type f 2> ${TMPDIR}/${HOSTNAME}_find_errors | tee ${TMPDIR}/${HOSTNAME}_filelist | \
head -100 |\
sed -e "s+^+sh -x ${TMPDIR}/scanit \"+" -e 's/$/"/' > ${TMPDIR}/scanitnow

sh -x ${TMPDIR}/scanitnow 1> ${TMPDIR}/${HOSTNAME}_scan_run 2>&1

cd ${TMPDIR}
if [ -s ${HOSTNAME}.o ]
then
	date "+%Y%m%d_%H:%M:%S: indicators found on ${HOSTNAME}" > ${HOSTNAME}_scan_results.csv
	cat ${HOSTNAME}.o >> ${HOSTNAME}_scan_results.csv
else
	date "+%Y%m%d_%H:%M:%S:  No indicators found on ${HOSTNAME}" > ${HOSTNAME}_scan_results.csv
fi

zip ${HOSTNAME}_scan.zip ${HOSTNAME}_find_errors ${HOSTNAME}_filelist ${HOSTNAME}_scan_run ${HOSTNAME}_scan_results.csv
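
One possible direction for the newer Office formats: .docx and .xlsx files are zip archives of XML parts, so the text can be pulled out with unzip and searched with the same wordlist. The following is only a rough sketch, not a tested replacement for scanit. The function names xml_to_text and search_ooxml are made up for illustration, it assumes unzip is installed, and it does not cover the legacy binary .doc/.xls formats or decode XML entities such as &amp;.

```shell
#!/bin/sh
# Hypothetical helpers, not part of the original script.

# Break the XML at every tag, drop the tag itself, and delete empty lines,
# leaving one text fragment per line.
xml_to_text() {
	tr '<' '\n' | sed -e 's/^[^>]*>//' -e '/^[[:space:]]*$/d'
}

# search_ooxml FILE WORDLIST
# Assumes FILE is a .docx or .xlsx and that unzip is available.
search_ooxml() {
	file=$1
	wordlist=$2
	case $file in
		*.docx) part='word/document.xml' ;;    # main document body
		*.xlsx) part='xl/sharedStrings.xml' ;; # shared cell strings
		*) return 1 ;;
	esac
	unzip -p "$file" "$part" 2>/dev/null | xml_to_text | grep -n -i -f "$wordlist"
}
```

It could be called as search_ooxml "\$1" ${TMPDIR}/wordlist from inside scanit. Note that the numbers printed by grep -n count text fragments, not lines in the original document, since the XML inside these files is usually stored on one or two very long lines.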