awk -v sw="lemons|dogs" 'NR>100 && NR<200 BEGIN { c=split(sw,a,"[|]"); } { for (w in a) { if ($0 ~ a[w]) d[a[w]]++; } }
END { for (i in a) { o=o (a[i]"="(d[a[i]]?d[a[i]]:0)","); }
sub(",*$","",o); print o;
}' /home/jahitt/data.txt
What am I doing wrong with the above code? I'm pretty sure the issue is in the NR>100 && NR<200 part at the start. How can this be fixed?
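BEGIN and END are special patterns. BEGIN and END rules can be intermixed with other rules in the program, but neither can be combined with another pattern in the same rule; the condition needs a rule (and an action) of its own. So the following is wrong: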
'NR>100 && NR<200 BEGIN ..
Correction:
awk -v sw="lemons|dogs" '
NR > 100 && NR < 200 {
    for (w in a)
    {
        if ($0 ~ a[w])
            d[a[w]]++
    }
}
BEGIN {
    c = split(sw,a,"[|]")
}
END {
    for (i in a)
    {
        o = o (a[i]"="(d[a[i]]?d[a[i]]:0)",")
    }
    sub(",*$","",o)
    print o
}
' /home/jahitt/data.txt
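A scaled-down way to check this without a 200-line file. This is only a sketch: the six input lines are made up, the range is shrunk to lines 3 through 5, and the loops run by index (1 to c) instead of "for (i in a)" so that the output order is fixed.

```shell
# Hypothetical six-line input; only lines 3, 4 and 5 satisfy NR > 2 && NR < 6.
result=$(printf '%s\n' 'lemons' 'dogs' 'lemons and dogs' 'cats' 'lemons' 'dogs' |
awk -v sw="lemons|dogs" '
BEGIN {
    c = split(sw, a, "[|]")        # a[1] = "lemons", a[2] = "dogs"
}
NR > 2 && NR < 6 {                 # the condition is a rule of its own
    for (w = 1; w <= c; w++)
        if ($0 ~ a[w])
            d[a[w]]++
}
END {
    for (i = 1; i <= c; i++)
        o = o a[i] "=" (d[a[i]] ? d[a[i]] : 0) (i < c ? "," : "")
    print o
}')
echo "$result"
```

Line 3 matches both words and line 5 matches "lemons", so this should print lemons=2,dogs=1. The lines outside the range are ignored.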
Yoda's fix will give you a working program that looks only at lines 101 through 199, counts how many of those lines contain "lemons" and how many contain "dogs", and prints both counts at the end. But you didn't tell us what this script is supposed to do.
Another way to read what you were trying to do would be: print lines 101 through 199 from your input file, and at the end print the number of lines in the entire file that contained "dogs" and the number of lines in the entire file that contained "lemons". If that was your intent, a one-character change to your original script (the semicolon added after NR<200 below) should work:
awk -v sw="lemons|dogs" 'NR>100 && NR<200;BEGIN { c=split(sw,a,"[|]"); } { for (w in a) { if ($0 ~ a[w]) d[a[w]]++; } }
END { for (i in a) { o=o (a[i]"="(d[a[i]]?d[a[i]]:0)","); }
sub(",*$","",o); print o;
}' /home/jahitt/data.txt
Although I prefer more readable code like:
awk -v sw="lemons|dogs" '
NR>100 && NR<200
BEGIN { c=split(sw,a,"[|]")
}
{       for (w in a) {
                if ($0 ~ a[w])
                        d[a[w]]++
        }
}
END {   for (i in a) {
                o=o (a[i]"="(d[a[i]]?d[a[i]]:0)",")
        }
        sub(",*$","",o)
        print o
}' /home/jahitt/data.txt
If your input file contained:
lemons and dogs
lemons only
cats and dogs
dogs only
cats only
lemons and cats and dogs
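As a rough check, those six lines can be piped through a variant of the script. The only changes are assumptions made for the demo: input comes from printf rather than data.txt, and the loops run by index so the order of the counts is fixed.

```shell
# The six sample lines, piped in instead of being read from data.txt.
result=$(printf '%s\n' \
    'lemons and dogs' 'lemons only' 'cats and dogs' \
    'dogs only' 'cats only' 'lemons and cats and dogs' |
awk -v sw="lemons|dogs" '
NR>100 && NR<200                   # no action: print lines 101 through 199
BEGIN { c = split(sw, a, "[|]") }
{   for (w = 1; w <= c; w++)       # every line is counted, whatever its number
        if ($0 ~ a[w])
            d[a[w]]++
}
END {
    for (i = 1; i <= c; i++)
        o = o a[i] "=" (d[a[i]] ? d[a[i]] : 0) (i < c ? "," : "")
    print o
}')
echo "$result"
```

With only six lines nothing falls in the 101 to 199 range, so no input lines are echoed; the single output line should be lemons=3,dogs=4, because the counting rule has no condition and sees the whole file.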
But if you decide to change the basic logic of your original requirements, you should consider whether you need to redesign everything so that each counted pattern carries a list of zero or more exclusions. (And I'm not going to try to guess at your new requirements and propose a new syntax for your sw variable to make that happen.)
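Purely to illustrate what "a pattern with a list of exclusions" could look like, here is a sketch. The syntax is made up for this example and is not a proposal for the real requirements: entries in sw are still separated by "|", and within an entry anything after a "-" is treated as an exclusion for that pattern. It counts over the whole input and reuses the six sample lines.

```shell
# Hypothetical syntax: "lemons-only" means count lines matching "lemons"
# unless they also match "only"; "dogs" has no exclusions.
result=$(printf '%s\n' \
    'lemons and dogs' 'lemons only' 'cats and dogs' \
    'dogs only' 'cats only' 'lemons and cats and dogs' |
awk -v sw="lemons-only|dogs" '
BEGIN {
    c = split(sw, entry, "[|]")
    for (i = 1; i <= c; i++) {
        # part[1] is the pattern; part[2..n] are its exclusions
        nex[i] = split(entry[i], part, "-") - 1
        pat[i] = part[1]
        for (j = 1; j <= nex[i]; j++)
            ex[i, j] = part[j + 1]
    }
}
{
    for (i = 1; i <= c; i++) {
        if ($0 !~ pat[i])
            continue
        skip = 0
        for (j = 1; j <= nex[i]; j++)
            if ($0 ~ ex[i, j])
                skip = 1           # an exclusion matched: do not count this line
        if (!skip)
            d[i]++
    }
}
END {
    for (i = 1; i <= c; i++)
        o = o pat[i] "=" (d[i] + 0) (i < c ? "," : "")
    print o
}')
echo "$result"
```

On the sample lines this should print lemons=2,dogs=4: "lemons only" is dropped from the lemons count, while "dogs only" still counts for dogs because that entry has no exclusions.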
Quick prototyping works well sometimes. But, sitting down and clearly defining your requirements before you start programming will usually give you a much more coherent, maintainable piece of software that works better and does what you want.
#!/usr/bin/perl -w
use strict;

my (@lem_lines, @dogs_lines);
while (<>) {
    # $. is the current input line number; look only at lines 101 through 199
    if (($. > 100) && ($. < 200)) {
        # line starts with "lemons " and is not immediately followed by "only"
        push @lem_lines, $_ if /^lemons (?!only)/;
        push @dogs_lines, $_ if /dogs/;
    }
}
my $lem  = scalar @lem_lines;
my $dogs = scalar @dogs_lines;
print "Number of lemons excluding ( lemons only ) in lines from 101 to 199 is : $lem\n";
print "count of dogs in lines from 101 to 199 is: $dogs\n";
Run as
perl try.pl input_file
I tried it on the input given by Don Cragun and it works well.
Thanks, Don C, for your explanation of the problem.
Of course, the regular expressions would have to be updated (to look for "dogs only", "lemons only", etc.) if skysmart is looking for matches at specific positions in the line, for example.