perl script to list filenames that do not contain given string

Hi,

I need to write a Perl script that recursively searches all '*.txt' files for the string "ALL -Tcb" and prints only the file names that do not contain this string.

Why Perl? It can be done, of course, but it sounds easier in shell.

find . -type f -name '*.txt' -print |
while read file; do
  grep -l "ALL -Tcb" "$file" >/dev/null && continue
  echo "$file"
done

If you have a grep which understands the -q option, use grep -q "ALL -Tcb" "$file" instead of -l with the redirection to /dev/null.

Thanks era, but I am not using a Unix or Linux machine; rather, Cygwin on Windows. And it is a requirement for me to develop it only in Perl.

Hmm, this is a Unix forum ...?

perl -MFile::Find -e 'find(sub {
  return 0 unless (m/\.txt$/);  # skip file names which don't match this regex
  open (F, $File::Find::name) || warn "could not open $File::Find::name: $!\n";
  my $match = grep { /ALL -Tcb/ } <F>;
  close F;
  print "$File::Find::name\n" unless $match;
  return ! $match; }, ".")'

See the File::Find documentation for a bit of background. The grep will return a list of matching lines; because it is invoked in scalar context, that list will be turned into the number of elements in the list of matches. If that number is zero, there were none, and we print the file name.
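To see the scalar-context grep in isolation, here is a tiny standalone sketch (the data is made up):

my @lines = ("foo\n", "something ALL -Tcb here\n", "bar\n");
my $match = grep { /ALL -Tcb/ } @lines;   # scalar context: $match is 1, the number of matching lines
print "no match found\n" unless $match;   # only printed when that count is zero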

The final parameter is the list of directories to traverse; simply "." will traverse the current directory and its subdirectories.
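For example (the directory names here are placeholders, and the callback is assumed to live in a sub named wanted), you can pass several starting points in one call:

find(\&wanted, "src", "docs", "/some/other/tree");   # traverses all three trees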

Thanks a lot, era. If I run this script at the Cygwin command line, it spews a lot of errors. So could you change it and give it to me as a Perl program, so that I can run it from a file?

what errors do you see?

could not open ./api/group/file.txt: No such file or directory

like this, continuously, wherever .txt files are found. I ran it by copying and pasting each line of the script (with the comment removed).

Do you run it in a different directory than where you search? Do the files it complains about actually exist, under the current directory? Do you have permission to read them?

I did run it from the top of the directory tree; the files to search exist in subdirectories under the current directory. Also, I have read permission, because I can do

find . -type f -name "*.txt" -exec grep -lv "ALL -Tcb" {} \;

And the file ./api/group/file.txt actually exists? If not, any idea where find could find it?

I just gave a single error entry as an example. Of course the .txt file exists at the aforementioned path. I guess this Perl script is not searching for the files recursively. Yet it's an awesome script, a brainchild and masterpiece of era.

File::Find does the recursive searching; the fact that it complains about a file several directories deep should prove that it basically works. I tested the script and it works for me, and I can't really help troubleshoot it on Cygwin, because I'm not too familiar with that platform. Do you get the same error message if you try to open the file in Perl directly?

perl -e 'open (F, "./api/group/file.txt") or die "could still not open it: $!\n"'

If you have a facility like strace (dunno what that would be called in Windows, spy something?) then perhaps you could figure out why the open fails.

I hope it's okay to continue to use this file as an example, as I can't guess what other files you have.

The piece of code you have given here successfully opens the file without error. I guess Cygwin may not have the 'File::Find' package, and that could be the cause of the recursive file-opening problem. The Perl script works perfectly if I run it in the directory where the *.txt files exist. Hats off, era!!

How about the script below:

 for %a in (*.mmp) do @perl -ne "if (!m/ALL -Tcb/) { print qq(%a); }" %a 

but it gives "bash: !m/ALL: event not found" error :rolleyes: :stuck_out_tongue:

The bash error is because the exclamation mark is special to bash. I don't know if you can use single quotes in Cygwin; I guess if you are running bash you can (and must). But then the DOS-style for loop is not going to work.

Anyway, your script simulates grep -v; it will print the file's name if there is a line which doesn't match, even if there are also lines which do match.
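To make the difference concrete, here is a hedged sketch, with $file and @lines standing in for a file name and its slurped contents:

print "$file\n" if grep { !/ALL -Tcb/ } @lines;      # grep -v style: prints if ANY line fails to match
print "$file\n" unless grep { /ALL -Tcb/ } @lines;   # what you want: prints only if NO line matches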

If the "open" call gets invoked at all, it means File::Find is doing its thing, because it is being passed this function as a callback -- it would not execute if File::Find wasn't invoking it. Also, the fact that you get a warning which uses $File::Find::name in the message should prove to you that File::Find is indeed traversing the directory and finding files (and setting this variable's value to the files it finds). Anyway, if the module wasn't available at all, perl would die with an error (try perl -MFlie::Fnid -e 0, with the module name deliberately misspelled, to see what that looks like).
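As a minimal illustration of the callback mechanism (this snippet is separate from the actual search):

use File::Find;
sub wanted { print "visiting $File::Find::name\n" }   # File::Find calls this once per directory entry
find(\&wanted, ".");                                   # and sets $File::Find::name before each call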

Hi era,

I checked your script on Linux (Ubuntu, x86 machine). It still gives the same error that I got in Cygwin, i.e.,

could not open ./ngilogger-1.0/src/ngilogger.txt: No such file or directory
./ngilogger-1.0/src/ngilogger.txt
could not open ./showimage-1.0/showimage.txt: No such file or directory
./showimage-1.0/showimage.txt

when running it from the top-level directory. I don't have any Unix machine to check with either :frowning:

The script era posted has to be run from the directory you want the search to start in. I think he made that clear. If the start directory is not the same, then feed the script a start directory. See File::Find for details. If the script is run from the start directory, it looks to me like it should work.
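In other words, something along these lines (a sketch; it assumes the callback is in a sub named wanted):

my $start = shift @ARGV || ".";   # take the start directory from the command line, default to "."
find(\&wanted, $start);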

Well in any event, it "finds" those files because otherwise it would not be trying to open them.

Having Perl modules is good precisely because reinventing the wheel is not much fun, but here goes: a quick and dirty, ugly, half-assed, broken replacement for File::Find.

/me secretly imagines you will have the same No such file or directory errors with this too, but at least it's simple enough that you can try to debug it yourself.

#!/usr/bin/perl

use strict;
use warnings;

die "usage: $0 dirs ...\n" unless (@ARGV);

while (@ARGV)
{
    curse(shift @ARGV);
}

sub curse   # recursive directory walk, a minimal stand-in for File::Find
{
    my ($arg) = @_;

    if (-d $arg)   # a directory: descend into each of its entries
    {
	local *D;
	unless (opendir (D, "$arg"))
	{
	    warn "$0: could not open directory $arg: $!\n";
	    return 0;
	}
	while (my $dir = readdir D)
	{
	    next if ($dir eq "." || $dir eq "..");
	    curse("$arg/$dir");
	}
	closedir D;
	return 0;
    }

    elsif (-f _)   # a plain file; the bare _ reuses the stat info from the -d test above
    {
	local *F;
	unless (open (F, $arg))
	{
	    warn "$0: could not open $arg: $!\n";
	    return 0;
	}
	my $matches = grep { /ALL -Tcb/ } <F>;
	close F;
	print "$arg\n" unless $matches;
	return ! $matches;
    }

    # else
    return 0;
}
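Save it as, say, nomatch.pl (the file name is just an example) and run it from the top of the tree with perl nomatch.pl . so it has a starting directory to descend from.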

Try changing this line in the original code:

open (F, $File::Find::name) || warn "could not open $File::Find::name: $!\n";

to:

open (F, $_) || warn "could not open $File::Find::name: $!\n";

File::Find does a chdir into each directory it visits by default, and $_ holds just the bare filename within that directory (while $File::Find::name is a path relative to the starting directory, which no longer resolves after the chdir), so using $_ should hopefully solve this problem
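For completeness, the whole one-liner with just that one change applied would look roughly like this:

perl -MFile::Find -e 'find(sub {
  return 0 unless (m/\.txt$/);
  open (F, $_) || warn "could not open $File::Find::name: $!\n";
  my $match = grep { /ALL -Tcb/ } <F>;
  close F;
  print "$File::Find::name\n" unless $match;
  return ! $match; }, ".")'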

KevinADC: good catch! I spotted the difference between $_ and $File::Find::name in the docs, but interpreted it precisely the other way around (didn't notice there would be a chdir)! And I obviously didn't test it in a deep-enough directory hierarchy, duh.

Yes, KevinADC, you are right. Your solution works perfectly, without errors. Thanks a lot to you and to era for the prompt responses/suggestions. Indeed I owe you both a lot.