Hi everyone
I have a quick Perl matching question. I have the following file, and I want to use Perl to search the 2nd column and check whether it contains one of the month names (e.g. Jan, Feb, Mar, ... Dec).
Here's the file I'm trying to search, followed by the code I have so far. Any help would be appreciated! Thanks
lu Aug 2006 -122.24 48.76
lu AuG 2006 -122.24 48.76
21 2 2006 -122.24 48.76
2A Jul 2008 -117.8617 34.7017
21 Ma2 2006 -112.24 48.76
#!/usr/bin/perl -w
#
use strict;
use warnings;
my $file = 'sample1.xyt';
open my $info, '<', $file or die "Could not open $file: $!";
while (<$info>) {
    my $x1 = $_;
    my @cols = split(" ", $x1);
    #Test: check that col2 matches one of the following EXACTLY: Jan, Feb, Mar, Apr, May, Jun, Jul, Aug, Sep, Oct, Nov, Dec AND ALSO check that there are no numbers in this column. If these criteria are not met, print the bad line
    if ($cols[1] =~ /^[0-9]+$/) { #checks for all numbers in the month
        print "I have a bad month here - $x1";
    } elsif ($cols[1] =~ /^[a-zA-Z]+$/) { #checks for all letters
        #check in here for matches!
        my @months = ("Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec");
        ##Not sure what to do here! :-(
    } else {
        print "I have a bad month here - $x1";
    }
}
close $info;
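Try something like this, which uses the "smart match" operator (~~) to perform an "in list" test. Note that smartmatch has been flagged as experimental since Perl 5.18, so newer perls warn about it, and that the comparison is case-sensitive: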
use strict;
use warnings;
my @months = ( "Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec" );
my $file = 'sample1.xyt';
open my $info, $file or die "Could not open $file: $!";
while ( <$info> ) {
    chomp;
    my @cols = split(" ");
    if ( $cols[1] ~~ @months ) {
        print "Valid month found: $cols[1]\n";
    } else {
        print "Invalid month found: $cols[1]\n";
    }
}
Reference: "Perl: if ( element in list )" on Stack Overflow.
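Because smartmatch (~~) has been flagged as experimental since Perl 5.18, a hash lookup is a common alternative for an "in list" test that behaves the same on any Perl 5. Here is a minimal sketch, assuming the sample records from the question are held in an array instead of being read from sample1.xyt. The helper name has_valid_month is hypothetical, and the comparison is case-sensitive, so AuG is rejected:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build a lookup table once: month abbreviation => 1.
my %is_month = map { $_ => 1 }
    qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);

# Hypothetical helper (not from the thread): returns 1 if the second
# whitespace-separated field is exactly one of the twelve abbreviations.
sub has_valid_month {
    my ($line) = @_;
    my @cols  = split ' ', $line;
    my $month = $cols[1];
    return 0 unless defined $month;    # blank or one-column line
    return exists $is_month{$month} ? 1 : 0;
}

# Sample records copied from the question.
my @records = (
    'lu Aug 2006 -122.24 48.76',
    'lu AuG 2006 -122.24 48.76',
    '21 2 2006 -122.24 48.76',
    '2A Jul 2008 -117.8617 34.7017',
    '21 Ma2 2006 -112.24 48.76',
);

for my $line (@records) {
    my $status = has_valid_month($line) ? 'Valid month' : 'Invalid month';
    print "$status: $line\n";
}
```

To accept AuG as well, one option is to normalise the field with ucfirst(lc($month)) before the lookup.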
How about this:
#!/usr/bin/perl -w
use strict;
use warnings;
my @months = ("Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec");
open my $fh, '<', 'sample1.xyt' or die "Could not open sample1.xyt: $!";
while (my $line = <$fh>) {
    my @cols = split(" ", $line);
    next unless defined $cols[1]; #skip blank or one-column lines
    if ($cols[1] =~ /^[0-9]+$/) { #checks for all numbers in the month
        print "I have a bad month (numbers) here - $cols[1]\n";
    } elsif (grep /^\Q$cols[1]\E$/i, @months) { #case-insensitive; \Q...\E matches the field literally
        print "Month is ok - $cols[1]\n";
    } else {
        print "I have a bad month here - $cols[1]\n";
    }
}
close($fh);
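One thing to keep in mind with the grep test: the field read from the file is interpolated into the pattern, so any regex metacharacters in it are interpreted unless they are escaped with \Q...\E. Here is a minimal sketch of the difference. The helper names and the inputs "..." and "J.." are made up for illustration and are not from the sample file:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @months = qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);

# Unquoted: the field is compiled as a pattern, so "..." means
# "any three characters". Returns the number of months matched.
sub matches_unquoted {
    my ($field) = @_;
    return scalar( grep { /^$field$/i } @months );
}

# Quoted: \Q...\E escapes metacharacters, so the field is matched
# literally. Returns the number of months matched.
sub matches_quoted {
    my ($field) = @_;
    return scalar( grep { /^\Q$field\E$/i } @months );
}

# 'Aug' and 'AuG' appear in the sample file; '...' and 'J..' are
# hypothetical bad inputs.
for my $field ( 'Aug', 'AuG', '...', 'J..' ) {
    printf "%-4s unquoted=%d quoted=%d\n",
        $field, matches_unquoted($field), matches_quoted($field);
}
```

A non-zero count means the field would be reported as a valid month, so the unquoted form accepts "..." while the quoted form does not.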
Alternation is probably faster than array comparison.
#!/usr/bin/perl
use strict;
use warnings;
open (my $file, '<', 'sample1.xyt') or die "Could not open sample1.xyt: $!";
while (<$file>){
    my @rec=split/\s+/,$_;
    if ($rec[1]=~/^(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)$/){
        ...
    }
    else{
        print "Dud month in record $.:\n\t$_";
    }
}
close $file;
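Whether alternation is actually faster depends on the Perl version and the data, so it is worth measuring. Here is a rough sketch using the core Benchmark module. The sub names and the five test fields (the second-column values from the sample file) are assumptions for illustration, and the timings will vary from machine to machine:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my @months   = qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);
my $month_re = qr/^(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)$/;

# Second-column values taken from the sample file in the question.
my @fields = qw(Aug AuG 2 Jul Ma2);

# Count valid fields by scanning the month array with grep.
sub count_by_grep {
    my $ok = 0;
    for my $field (@fields) {
        $ok++ if grep { $_ eq $field } @months;
    }
    return $ok;
}

# Count valid fields with a single precompiled alternation.
sub count_by_alternation {
    my $ok = 0;
    for my $field (@fields) {
        $ok++ if $field =~ $month_re;
    }
    return $ok;
}

# Run each sub for at least one CPU second and print a comparison table.
cmpthese( -1, {
    'array_grep'  => \&count_by_grep,
    'alternation' => \&count_by_alternation,
} );
```

Both subs do an exact, case-sensitive comparison, so they count the same fields as valid and only the speed differs.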
Even simpler with a regexp and no array:
#!/usr/bin/perl
use strict;
use warnings;
open (my $file, '<', 'sample1.xyt') or die "Could not open sample1.xyt: $!";
while (<$file>){
    if ($_=~/.+\s+(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)\s+.+/i){
        print "month found $1\n";
    }
    else{
        print "Dud month in record $.:\t$_";
    }
}
close $file;
$ ./test.pl
month found Aug
month found AuG
Dud month in record 3: 21 2 2006 -122.24 48.76
month found Jul
Dud month in record 5: 21 Ma2 2006 -112.24 48.76
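Note that the pattern in this last script accepts a month name in any column after the first, not just the 2nd. If only column 2 should count, the pattern can be anchored to the start of the line. Here is a minimal sketch, assuming the records are held in an array. The helper name month_in_col2 is hypothetical, and the last record is made up to show a month sitting in column 3:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Anchored pattern: first field, whitespace, month abbreviation, whitespace.
my $col2_month =
    qr/^\s*\S+\s+(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)\s/i;

# Hypothetical helper: returns the month found in column 2, or undef.
sub month_in_col2 {
    my ($line) = @_;
    return $line =~ $col2_month ? $1 : undef;
}

# The first five records come from the question; the last one is a
# made-up record with the month in column 3.
my @records = (
    'lu Aug 2006 -122.24 48.76',
    'lu AuG 2006 -122.24 48.76',
    '21 2 2006 -122.24 48.76',
    '2A Jul 2008 -117.8617 34.7017',
    '21 Ma2 2006 -112.24 48.76',
    '21 2 Aug -122.24 48.76',
);

for my $line (@records) {
    my $month = month_in_col2($line);
    print( defined($month) ? "month found $month\n" : "Dud month: $line\n" );
}
```

The /i flag is kept so that AuG is still accepted, as in the script above; drop it if the match has to be exact.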
This was super useful and incredibly instructive... thanks, guys!