Compare two files using awk

Hi. I'm new to awk and have searched for a solution to my problem, but haven't found the right answer yet. I have two files that look like this:

file1
Delete,3105551234
Delete,3105551236
Delete,5625559876
Delete,5625556789
Delete,5625553456
Delete,5625551234
Delete,5625556956
Delete,5625556643
Delete,6265552486
Delete,6265559365
Add,7755559833
Add,9515550087

file2
93,170334,0,-1,-1,,AAA,,5625556643,6465550987,,,-1,,581,93,-1
94,170335,0,-1,-1,,AAA,,7145550167,6465550987,,,-1,,581,93,-1
107,170239,0,-1,-1,,AAA,,6265559999,6465550987,,,-1,,581,93,-1
109,170240,0,-1,-1,,AAA,,5205558723,6465550987,,,-1,,581,93,-1
110,170241,0,-1,-1,,AAA,,3105551236,6465550987,,,-1,,581,93,-1
111,170348,0,-1,-1,,AAA,,6195550178,6465550987,,,-1,,581,93,-1
114,170256,0,-1,-1,,AAA,,5625559876,6465550987,,,-1,,581,93,-1
118,170336,0,-1,-1,,AAA,,3105551234,6465550987,,,-1,,581,93,-1
119,170337,0,-1,-1,,AAA,,5125559812,6465550987,,,-1,,581,93,-1
120,170338,0,-1,-1,,AAA,,5125559083,6465550987,,,-1,,581,93,-1
121,101,1,-1,-1,,AAA,,,2135559126,,,-1,,0,85,-1
122,170339,0,-1,-1,,AAA,,5625559067,6465550987,,,-1,,581,93,-1
125,999996,1,-1,-1,,AAA,,,6265559365,,,-1,,0,2561,-1
127,170340,0,-1,-1,,AAA,,5625551234,6465550987,,,-1,,581,93,-1
128,170341,0,-1,-1,,AAA,,5625559148,6465550987,,,-1,,581,93,-1
129,170342,0,-1,-1,,AAA,,5625556789,6465550987,,,-1,,581,93,-1
130,170343,0,-1,-1,,AAA,,5625559210,6465550987,,,-1,,581,93,-1
133,100,1,-1,-1,,AAA,,,6265552486,,,-1,,0,85,-1
134,170344,0,-1,-1,,AAA,,5625553456,6465550987,,,-1,,581,93,-1
135,170345,0,-1,-1,,AAA,,7605559809,6465550987,,,-1,,581,93,-1
137,170257,0,-1,-1,,AAA,,5625556956,6465550987,,,-1,,581,93,-1

For every entry in file1 that has "Delete" in $1, I would like to look up its $2 in file2. Then I want to create a third file, file3, containing "D," followed by $1 of each matching line in file2. With the examples above, the output would look like this:

file3
D,93
D,110
D,114
D,118
D,125
D,127
D,129
D,133
D,134
D,137

I hope I'm making sense. Any help would be appreciated. Thanks.

Not to quibble, but you are not clear. Your example does not match what you said.
Take
114,170256,0,-1,-1,,AAA,,5625559876,6465550987,,,-1,,581,93,-1
and
Delete,5625559876

This means 'do not print' the 114,...... line.

Your output
D,114

has the 114 line in it. Several other lines are like this. Did you mean the reverse of what you said?

He means something like:

awk -F, 'NR==FNR && /^D/ {a[$2]++;next}
$9 in a || $10 in a {print "D," $1}' file1 file2
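
Since the goal is a file named file3, the output of this command can simply be redirected. The following is a minimal sketch on a trimmed copy of the sample data; the scratch directory and the shortened files are assumptions made for illustration, and on Solaris the command name would be nawk rather than awk.

```shell
# Work in a scratch directory so no existing file is overwritten (assumption).
dir=$(mktemp -d) && cd "$dir" || exit 1

# Trimmed copies of the sample files from the question.
cat > file1 <<'EOF'
Delete,3105551236
Delete,5625556643
Delete,6265559365
Add,7755559833
EOF

cat > file2 <<'EOF'
93,170334,0,-1,-1,,AAA,,5625556643,6465550987,,,-1,,581,93,-1
94,170335,0,-1,-1,,AAA,,7145550167,6465550987,,,-1,,581,93,-1
110,170241,0,-1,-1,,AAA,,3105551236,6465550987,,,-1,,581,93,-1
125,999996,1,-1,-1,,AAA,,,6265559365,,,-1,,0,2561,-1
EOF

# Same program as above; the redirection writes the result to file3.
awk -F, 'NR==FNR && /^D/ {a[$2]++;next}
$9 in a || $10 in a {print "D," $1}' file1 file2 > file3

cat file3
```

With this trimmed input, file3 ends up with the three lines D,93, D,110 and D,125, in the order the matching lines appear in file2.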

Sorry about that. file1 is a list of numbers that need to be deleted or added. file2 is a list of current numbers and corresponding information. I want file3 to be just the "D," along with the first column of file2 associated with the number marked for deletion in file1.

I tried the script Franklin posted, but I got "syntax error near line 2". I forgot to mention I'm using Solaris 8 if that makes a difference. Thanks.

Use nawk or /usr/xpg4/bin/awk on Solaris.

Regards
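
If the same script has to run on Solaris and on other systems, one option is to pick the interpreter at run time. This is only a sketch: it assumes the two locations named above, and the variable name AWK is arbitrary.

```shell
# Prefer the XPG4 awk, then nawk, then whatever "awk" is found on the PATH.
if [ -x /usr/xpg4/bin/awk ]; then
    AWK=/usr/xpg4/bin/awk
elif type nawk >/dev/null 2>&1; then
    AWK=nawk
else
    AWK=awk
fi

# Quick check that the chosen awk accepts the "in" test the script relies on.
"$AWK" 'BEGIN { a["x"]; if ("x" in a) print "ok" }'
```

The script can then be called as "$AWK" -F, '...' file1 file2 instead of naming awk or nawk directly.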

Yes, nawk worked, thank you very much. If you don't mind, could you break down the script so I can understand exactly how it works? I'd like to learn as much of this as I can. Thanks.

awk -F, 'NR==FNR && /^D/ {a[$2]++;next}
$9 in a || $10 in a {print "D," $1}' file1 file2

Here we go:

awk -F, 

Set the field separator to a comma.

NR==FNR && /^D/ 

If we are reading the 1st file and the line starts with a "D"

{a[$2]++;next}

Create an element of array a with the 2nd field as its index, then skip to the next input line (the rest of the script is not run for this line).

$9 in a || $10 in a {print "D," $1}'

For each line of the 2nd file, if the 9th or the 10th field exists as an index of the array a, print "D," followed by the 1st field.

Regards
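
The steps above can also be laid out as one commented program. The version below is a sketch rather than the script exactly as posted: it moves the /^D/ test inside the first block, so every line of file1 (including the "Add" lines) reaches next and is never tested by the second rule. The two-line sample files and the output file name result are assumptions for illustration; on Solaris use nawk.

```shell
# Scratch directory and tiny sample files (assumptions for illustration).
dir=$(mktemp -d) && cd "$dir" || exit 1
printf '%s\n' 'Delete,5625556643' 'Add,7755559833' > file1
printf '%s\n' \
  '93,170334,0,-1,-1,,AAA,,5625556643,6465550987,,,-1,,581,93,-1' \
  '94,170335,0,-1,-1,,AAA,,7145550167,6465550987,,,-1,,581,93,-1' > file2

awk -F, '
# 1st file: NR and FNR are still equal here.
NR == FNR {
    # Remember only the numbers on lines that start with "D".
    # Referencing a[$2] is enough to create the element.
    if (/^D/) a[$2]
    # Skip the rule below for every line of the 1st file.
    next
}
# 2nd file: print the ID when field 9 or field 10 is a remembered number.
($9 in a) || ($10 in a) { print "D," $1 }
' file1 file2 > result

cat result
```

With this input only the first line of file2 matches, so result contains the single line D,93.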

Thanks for the info and thank you very much for your help. I really appreciate it. How does the script know when to use the first file and when to use the second file? Sorry if that's a dumb question.

NR is the number of the current input record, counted across all input files, and FNR is the record number within the current file.
FNR starts again at 1 each time a new input file is opened, so NR==FNR is true only while the 1st file is being processed.

Regards
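
To watch the two counters, it can help to print them for every line of two small files. This is a sketch; the file names first and second, their contents, and the output file demo.out are made up for the demonstration.

```shell
# Scratch directory with a 2-line file and a 3-line file (assumptions).
dir=$(mktemp -d) && cd "$dir" || exit 1
printf '%s\n' a b > first
printf '%s\n' c d e > second

# Print both counters for each record, plus the result of the NR==FNR test.
awk '{ print FILENAME, "NR=" NR, "FNR=" FNR, (NR == FNR ? "first file" : "later file") }' \
    first second > demo.out

cat demo.out
```

The third output line reads "second NR=3 FNR=1 later file": NR keeps counting while FNR has started over. One caveat with this idiom is that if the first file is empty, NR==FNR also holds while the second file is read.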