Now you're starting to get tricky.
How about a command line that looks like this:
filter excludeFile 4 file1 file2 file3...
Would that work? The exclude file comes first, followed by the field number (starting at 1 for the first field), followed by a list of one or more files that use that particular field number. If you use a negative field number, it counts from the end of the line instead of the front, so a field of -2 means the second-to-last field on every line, even if each line has a different number of fields.
In the code below I have told Perl to rename the original files so that they end in .bak and then write the changes to the original name. For the command above, you'd end up with file1 and file1.bak for example.
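If that works for you, try the following. Note the extra -i.bak option on the first line (lowercase i; an uppercase -I would only add a module search directory) and the extra $field variable.
#!/usr/bin/perl -i.bak
# -i.bak edits each file in place and keeps the original as <name>.bak
my (@a, %exclude);
my $file = shift;
open(EXCLUDE_LIST, '<', $file) or die "Cannot open $file: $!\n";
chomp( @a = <EXCLUDE_LIST> );
close(EXCLUDE_LIST);
@exclude{@a} = @a;    # hash slice: one key per excluded value
my $field = shift;
# Fall back to field 4 unless we were given an integer (a leading minus is allowed)
unless (defined $field && $field =~ /^-?\d+$/) {
    $field = 4;
}
die "Field specifier may not be zero.\n" if $field == 0;
$field-- if $field > 0;    # 1-based on the command line, 0-based in Perl
while (<>) {
    chomp(my $line = $_);    # keep the newline out of the last field
    my $key = (split(/,/, $line, -1))[$field];    # -1 keeps trailing empty fields
    print unless defined $key && exists $exclude{$key};
}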
If there are other options you want to add (such as using a different delimiter between fields), then it's time to start using Getopt::Std and specifying options using the same techniques other commands use: a dash followed by a letter.
@ripat: That's a cute trick with the NR==FNR for awk. I'm going to have to remember that one. Only useful for a single file, but still... (The file handling in awk is terrible!)
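For anyone who hasn't met the NR==FNR idiom: NR counts records across all input files while FNR restarts at 1 for each file, so the two are equal only while awk is reading the first file named on the command line. The following is just my own minimal sketch of the idea, not ripat's actual command; the file names, the sample data, and the choice of field 4 are all made up for illustration.

```shell
#!/bin/sh
# Hypothetical sample files, created in a scratch directory
tmp=$(mktemp -d)
printf '%s\n' 'bob' 'eve' > "$tmp/exclude.txt"
printf '%s\n' '1,x,y,alice' '2,x,y,bob' '3,x,y,carol' > "$tmp/data.csv"

# While reading the first file (NR == FNR), remember each line as a key.
# For the data file, print only lines whose 4th field is not a remembered key.
kept=$(awk -F, 'NR == FNR { skip[$0]; next } !($4 in skip)' \
    "$tmp/exclude.txt" "$tmp/data.csv")

rm -r "$tmp"
printf '%s\n' "$kept"
```

With this sample data, the line for bob is dropped and the other two are printed unchanged.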
---------- Post updated at 04:00 PM ---------- Previous update was at 03:52 PM ----------
It's a really bad idea to use variables without putting double quotes around them! I can screw up that awk command pretty badly by passing the script a filename with a space or wildcard character in it, especially as the third parameter.
Please put double quotes around ALL variable substitutions. Out of a thousand uses, quoting will be the wrong choice only 3-4 times, so you've got a 99.6% chance of getting it right. Those are pretty good odds.
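Here is a small sketch of what goes wrong, in case it helps. It assumes a POSIX-style shell (sh, bash, ksh); count_args is a throwaway helper I made up that just reports how many arguments it was handed, and the file names are invented.

```shell
#!/bin/sh
# Hypothetical helper: prints the number of arguments it received
count_args() { echo "$#"; }

file='my data.csv'                      # a filename containing a space
unquoted=$(count_args $file)            # word splitting: two arguments
quoted=$(count_args "$file")            # quoted: one argument

# Wildcards: an unquoted * is expanded against the current directory
tmp=$(mktemp -d)
touch "$tmp/a" "$tmp/b" "$tmp/c"
star='*'
glob_unquoted=$(cd "$tmp" && count_args $star)     # expands to a b c
glob_quoted=$(cd "$tmp" && count_args "$star")     # stays a literal *
rm -r "$tmp"

echo "space: unquoted=$unquoted quoted=$quoted"
echo "glob:  unquoted=$glob_unquoted quoted=$glob_quoted"
```

Unquoted, the one "filename" turns into two or three separate arguments, which is exactly how a parameter ends up in the wrong position of a command.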