actually that's the least of my problems now. i believe i'll be able to figure that out at the end. but the only other question i have is: let's say the first time i run this command, i get an output similar to this:
first run:
/data/projects/file01,300lines,130lines matching 'Customer.*Processed'
(note, this is just one file out of many that would be in the output.)
now, the above output is saved to a file called /tmp/results.txt
the second time i run this command, say 5 minutes later, there'd be a line in the output similar to:
second run:
/data/projects/file01,410lines,139lines matching 'Customer.*Processed'
now, i don't want to search through each file again. i want to begin from the point where the last scan left off.
in the first run, there were 300 lines in the file named '/data/projects/file01'. i want it so that, the next time i run the script, awk can begin from line 301 and read to the end of the file. and i want this to happen for all the files it finds in the directory. this way, only the first run will be slow; all runs after that will be fast.
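for a single file, i understand the skip itself is just an FNR test. something like this sketch (the pattern and path are the ones from my example above; the saved line count is passed in as an argument):

```shell
# count matches that appear after the first SKIP lines of one file
# (SKIP = the line count saved from the previous run)
resume_count() {
    awk -v SKIP="$1" '
        FNR > SKIP && /Customer.*Processed/ { nl++ }
        END { print FILENAME "," FNR "lines," nl+0 "lines matching pattern" }
    ' "$2"
}

# e.g. resume_count 300 /data/projects/file01   # start from line 301
```

but that only handles one file with a hard-coded offset, which is why i tried to generalize it below.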
here's my attempt to modify your code:
lastlinenumber=$(awk -F"," '{print $2}' /tmp/results.txt | sed 's/lines//g')
awk -v LLNUM="${lastlinenumber}" 'FNR == 1 {if (NR > 1) {print fn, "text1", fnr, "text2", nl}
fn = FILENAME; fnr = 1; nl = 0}
{fnr = FNR}
/Customer.*Processed/ && FNR > LLNUM {nl++}
END {print fn, "text1", fnr, "text2", nl}
' file?
(i realize lastlinenumber would hold one value per line of /tmp/results.txt, so this probably only works when results.txt contains a single file.)
also, if, while comparing the most recent list of files against the previous scan, it finds a file that didn't exist in the previous scan, it should scan that file in its entirety, because it would be considered new.
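in other words, what i'm after is something like this per-file version. it's only a sketch, assuming the results format shown above (`path,<N>lines,<M>lines matching ...`), the `Customer.*Processed` pattern, and that the paths recorded in results.txt are exactly the paths passed to awk. note it counts only matches found past each saved offset, not a running total:

```shell
# scan RESULTSFILE DATAFILE... : resume each data file at the line count
# recorded for it in RESULTSFILE; files not listed there start from line 1.
scan() {
    results=$1; shift
    awk '
        FILENAME == ARGV[1] {             # first argument: previous results
            split($0, a, ",")
            sub(/lines$/, "", a[2])       # "300lines" -> "300"
            offset[a[1]] = a[2] + 0       # lines already scanned, per path
            next
        }
        FNR == 1 {                        # starting a new data file
            if (fn != "") print fn "," fnr "lines," nl "lines matching pattern"
            fn = FILENAME; fnr = 0; nl = 0
        }
        { fnr = FNR }
        # offset[FILENAME] is 0 for a path absent from the results file,
        # so a brand-new file gets scanned in its entirety
        /Customer.*Processed/ && FNR > offset[FILENAME] { nl++ }
        END { if (fn != "") print fn "," fnr "lines," nl "lines matching pattern" }
    ' "$results" "$@"
}

# e.g. scan /tmp/results.txt /data/projects/file?
```

the output could then be saved over /tmp/results.txt for the next run; to keep the match counts cumulative like in my example, the previous counts would also have to be read back and added in.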