Hello,
I am sorry if the title is confusing, but I need a script to grep a list of Names from a Source file in a Master database in which all the homophonic variants of the name are listed along with a single indexing key and store all of these in an output file. I need this because I am testing the accuracy of a Homophone algorithm which I have written.
An example will make this clear.Let us assume that the source file has the following entries:
John
Mary
and the Master file has the following
Jon<Tab>2003
Jean<Tab>2003
John<Tab>2003
Johan<Tab>2003
Johann<Tab>2003
Mary<Tab>21978
Marie<Tab>21978
Mariam<Tab>21978
Marium<Tab>21978
Each indexed entry is separated by a Space.
The output file would identify all homophones of the Word found in the master file and which are linked by the common index and store them.
The source file has around 30,00+ entries.At present I have to open both files in a text editor. Select a word in the source file and search for it in the master database, copy to clipboard and store in the Output file. Since this is a long and tedious operation, I was wondering if there is a PERL or AWK script which could do the job.
My OS is Windows and all the wonderful UNIX tools don't help.
Many thanks for your help.