The code you supplied produces the following output for my 3 test files:
D,5,5; 2
B,2,1; 4
NEW:A,1,1; 2
If we simplify your requirement (1.) to "only look at the first 2 files" (i.e. a LOOK value of 2), the output changes to:
NEW:D,5,5; 1
B,2,1; 3
NEW:A,1,1; 2
Requirement (2.) that NEW should check all available files (i.e. ciscostats_08032012 is checked as well) will produce:
B,2,1; 3
NEW:A,1,1; 2
This is because "D,5,5" is in ciscostats_08032012, so it's not new.
This output matches the output of the script I supplied in post #16. You have said that #16 is wrong, but I still can't see what it's doing that you don't like.
---------- Post updated at 11:50 AM ---------- Previous update was at 09:27 AM ----------
Looking back over this thread, I suspect you are reading the code I have supplied and deciding it's not doing what you want, rather than trying it out with actual data. So it's probably time for me to explain what it does:
$files is populated with a list of data files, most recent first, e.g.:
ciscostats_02012012
ciscostats_01012012
ciscostats_31122011
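To illustrate the "most recent first" ordering, here is a rough Python sketch (not the script from this thread). It assumes the suffix after the underscore is a DDMMYYYY date, which the example names suggest but the thread does not state; newest_first is a made-up helper name.

```python
from datetime import datetime

def newest_first(names):
    """Sort file names by the date after the last underscore, newest first.

    Assumption: the suffix is DDMMYYYY (e.g. 31122011 = 31 Dec 2011).
    """
    def stamp(name):
        return datetime.strptime(name.rsplit("_", 1)[1], "%d%m%Y")
    return sorted(names, key=stamp, reverse=True)

files = newest_first([
    "ciscostats_31122011",
    "ciscostats_02012012",
    "ciscostats_01012012",
])
print("\n".join(files))
```

A plain alphabetical sort would not work here, because the day comes before the year in the name.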
a[] contains a count of how many times each ID appears in the first (most recent) file.
b[] contains a count of how many times an ID from a[] appears in files 2 through LOOK.
c[] contains a count of how many times an ID from a[] appears in any other file (i.e. the files after LOOK).
At the end we print any ID that appears in both a[] and b[] and has an a[]+b[] count >= MATCH.
Otherwise, a "NEW" record is output if the ID appears in a[] and not in c[].
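The same logic can be sketched in Python, purely as an illustration of the description above and not as the actual script. The function name report, the in-memory file contents, and the choice to print a[]+b[] as the count on NEW records are all assumptions; the sample data is invented, since the real test files are not shown here.

```python
from collections import Counter

def report(files, look, match):
    """files: one list of IDs per data file, most recent file first."""
    a = Counter(files[0])                                      # first file
    b = Counter(i for f in files[1:look] for i in f if i in a) # files 2..LOOK
    c = Counter(i for f in files[look:] for i in f if i in a)  # any other file
    lines = []
    for ident in a:
        total = a[ident] + b[ident]  # a missing key counts as 0
        if ident in b and total >= match:
            lines.append(f"{ident}; {total}")
        elif ident not in c:
            # Assumption: NEW records show the same a[]+b[] count.
            lines.append(f"NEW:{ident}; {total}")
    return lines

# Invented sample data, most recent file first.
sample = [
    ["A,1,1", "A,1,1", "B,2,1", "B,2,1", "D,5,5"],
    ["B,2,1", "C,3,3"],
    ["B,2,1", "D,5,5"],
]
result = report(sample, look=2, match=2)
print("\n".join(result))
```

With this made-up data, "D,5,5" is in the first file but also in the third, so it lands in c[] and is neither matched nor reported as NEW.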