Perl questions

I'm new to Perl and have a few questions. I'm used to shell scripting, but I have to write a script to check a few things on a Windows server and need help.

  1. Check for a file missing in a sequence.

i.e. the list of files in the directory is as follows:

file0001.txt
file0002.txt
file0004.txt
file0005.txt
file0006.txt

How do I identify that file0003.txt is missing in the sequence?

  2. Search files for a pattern and count the matching occurrences?

i.e. file1 contains
test
test2
test4
test2
test1

I want to know how to search for test2 and count the number of matches.

I would prefer it if these were as simple as possible. Simple questions, I am sure, but I haven't used Perl before.

  1. If your files are strictly numbered like that, you can use opendir() and readdir() to get the list of files in the directory, sort it, and then walk through it while keeping a running counter that goes up by 1 for each expected number. If the counter matches the number in the next file name read, nothing is missing at that point; if it doesn't, the file for that number is missing. I can think of a couple more options, but this one is probably the easiest.

  2. The grep() function will do it.

grep - perldoc.perl.org
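
As a rough sketch of how grep() could count the matches: in scalar context it returns the number of items that matched. The helper name count_matching_lines and the script name count_lines.pl are made up for illustration. Note that this counts matching lines, not every occurrence within a line.

```perl
use strict;
use warnings;

# Hypothetical helper: count the lines of a file that contain a string.
sub count_matching_lines {
    my ($path, $pattern) = @_;
    open(my $fh, '<', $path) or die "Cannot open $path: $!";
    # grep in scalar context returns the number of matching lines.
    # \Q...\E makes the pattern match literally rather than as a regex.
    my $count = grep { /\Q$pattern\E/ } <$fh>;
    close($fh);
    return $count;
}

# Only runs when called as: perl count_lines.pl FILE STRING
if (@ARGV == 2) {
    printf "%s found on %d lines\n", $ARGV[1], count_matching_lines(@ARGV);
}
```

With the file1 contents from the question, searching for test2 should report 2 lines.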

Since you are working on Windows, here's a VBScript, if you don't mind:

Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objFolder = objFSO.GetFolder("c:\temp1")
Dim counter
Dim nStore()
maxNum = 0
i = 0
'Store file numbers into array and remember the highest one
For Each myFile In objFolder.Files
	If Left(myFile.Name,4) = "file" And objFSO.GetExtensionName(myFile.Name) = "txt" Then
		m = Right(objFSO.GetBaseName(myFile),4)
		ReDim Preserve nStore(i)
		nStore(i) = m
		If CLng(m) > maxNum Then maxNum = CLng(m)
		i = i + 1
	End If
Next
'go through 1..maxNum, report the numbers that are not in the array
For counter = 1 To maxNum
	n = Right("0000" & CStr(counter), 4)   'zero-pad to 4 digits
	MyIndex = Filter(nStore, n)
	If UBound(MyIndex) = -1 Then
		WScript.Echo "file" & n & ".txt not there"
	End If
Next

Usage at the command prompt:

c:\pathofyourscript> cscript  myscript.vbs 

For the second question (search for test2 and count the number of matches):

Set objFSO = CreateObject("Scripting.FileSystemObject")
myFile = "c:\temp1\file0001.txt"
Set objFile = objFSO.OpenTextFile(myFile,1)
c = 0
Do Until objFile.AtEndOfStream
	line=objFile.ReadLine
	If InStr(1,line,"test2") >0 Then
		c=c+1
	End If
Loop
objFile.Close
WScript.Echo "total count: ", c

Thanks - can you help me with how to use readdir, increment the counter, and identify that I am missing a file?

As I say, I am new to Perl, so I know only the very basics.

Separately, I will have a look at the grep function.

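Check this Perl script:

#!/usr/bin/perl
# count_this.pl
# Counts every occurrence of a pattern in the given files (or STDIN).
use strict;
use warnings;
my $to_search = shift;
die "Usage: perl count_this.pl PATTERN [FILE...]\n" unless defined $to_search;
my $count = 0;
while (<>) {
    # The pattern is used as a regex; /g finds every match on the line
    while (m/$to_search/g) {
        $count++;
    }
}
print "$to_search found $count times \n";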

Run this script as:

[ysawant@in-gcs-nas1 temp]$ perl count_this.pl Name sample.xml 
Name found 6 times 
[ysawant@in-gcs-nas1 temp]$ 

Can someone help me with a script to check for a missing file in the sequence, using readdir and a counter as suggested above?

use strict;
use warnings;

# Collect the names that look like fileNNNN.txt, sorted
opendir(my $dir, "R:/example") or die "Cannot opendir: $!";
my @files = sort grep { /^file\d+\.txt$/ } readdir($dir);
closedir($dir);

my $min=($files[0] =~ /^file(\d+)\.txt$/ && $1) or die "No matching files found";
my $max=($files[-1] =~ /^file(\d+)\.txt$/ && $1) or die "No matching files found";

my $thisfileidx = 0;
for (my $i=$min; $i<=$max; $i++) {
	if ($files[$thisfileidx] eq sprintf("file%04d.txt", $i)) {
		$thisfileidx++;
	} else {
		print STDERR "File " . sprintf("file%04d.txt", $i) . " is missing!\n";
	}
}

(This is a Windows machine and R:\example is a directory on a ramdisk with file0001.txt, file0002.txt, file0003.txt, file0005.txt and file0006.txt)

It will tell you file0004.txt is missing.
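
If the names are already in a list, a hash lookup is another option. This is only a sketch under the same naming assumption (fileNNNN.txt with four digits); missing_in_sequence is a made-up helper name, and the list passed to it is the one from the original question.

```perl
use strict;
use warnings;

# Hypothetical helper: given names like file0001.txt, return the names
# missing between the lowest and highest number seen.
sub missing_in_sequence {
    my @names = @_;
    my %seen;
    for my $name (@names) {
        # Remember the numeric part of every name that fits the pattern.
        $seen{ $1 + 0 } = 1 if $name =~ /^file(\d+)\.txt$/;
    }
    my @numbers = sort { $a <=> $b } keys %seen;
    return () unless @numbers;
    my @missing;
    for my $n ($numbers[0] .. $numbers[-1]) {
        push @missing, sprintf("file%04d.txt", $n) unless $seen{$n};
    }
    return @missing;
}

# Prints: file0003.txt is missing!
print "$_ is missing!\n" for missing_in_sequence(
    qw(file0001.txt file0002.txt file0004.txt file0005.txt file0006.txt)
);
```

Unlike the counter approach, this does not depend on the list being sorted first.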

Perfect - thanks very much for that... gets me out of another hole.

Thanks for the reply.