gawk -v sw="error|fail|panic|accepted" '
BEGIN {
    # split the alternation into individual search strings a[1..c]
    c = split(sw, a, "[|]")
}
NR > 1 && NR <= 128500 {
    # bump the counter for every string that matches this line
    for (w = 1; w <= c; w++)
        if ($0 ~ a[w])
            d[a[w]]++
}
END {
    # emit "string=count" pairs in the original order
    for (i = 1; i <= c; i++)
        o = o a[i] "=" (d[a[i]] ? d[a[i]] : 0) ","
    sub(/,$/, "", o)
    print o
}' /var/log/treg.test
the above code works majestically when searching for multiple strings in a log.
the problem is that as the log gets bigger (e.g. 5MB), the time it takes to search for all the strings grows with it: it took 2 seconds to search a 5MB file with this code, and a bigger file, say 10MB, would take proportionally longer.
so i'm wondering, can this code be optimized at all to make it run faster? maybe it would help if the strings were read from a separate file? rough sketches of both ideas are below. fwiw, this runs on Red Hat and Ubuntu Linux.
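for example, one idea (just a sketch, not benchmarked): since most log lines probably match none of the strings, test the whole alternation once per line and only run the inner loop when a line matches something. the sw string itself already works as the combined regex:

gawk -v sw="error|fail|panic|accepted" '
BEGIN {
    c = split(sw, a, "[|]")
}
# the extra $0 ~ sw test skips non-matching lines before the inner loop
NR > 1 && NR <= 128500 && $0 ~ sw {
    for (w = 1; w <= c; w++)
        if ($0 ~ a[w])
            d[a[w]]++
}
END {
    for (i = 1; i <= c; i++)
        o = o a[i] "=" (d[a[i]] ? d[a[i]] : 0) ","
    sub(/,$/, "", o)
    print o
}' /var/log/treg.test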
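and this is roughly what i meant by reading the strings from a separate file: one string per line in a plain text file (the name patterns.txt is just an example, and it assumes no blank lines), loaded instead of the -v variable:

gawk '
# first file: collect one search string per line
NR == FNR { a[++c] = $0; next }
# second file (the log): same counting as before
FNR > 1 && FNR <= 128500 {
    for (w = 1; w <= c; w++)
        if ($0 ~ a[w])
            d[a[w]]++
}
END {
    for (i = 1; i <= c; i++)
        o = o a[i] "=" (d[a[i]] ? d[a[i]] : 0) ","
    sub(/,$/, "", o)
    print o
}' patterns.txt /var/log/treg.test

not sure that buys any speed by itself, since the per-line matching work is the same, but it would keep the strings easier to maintain.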