I've got the code below. It does its job, but it's scrappy. Can someone explain why grep -v -f doesn't work against an empty file?
Basically, I have a file of presumed-good data, and I want to remove any lines that match entries in a file I know are bad. When the bad file is empty, the output file is also empty. It's weird.
It works fine when the bad file has some data, so you'll see the cheap hack fix I've put in.
if [ -s tmp.merge.nos ]
then
    join -t, -v 1 tmp.both tmp.req.merges | sed 's/=/,/' | grep -v -f tmp.merge.nos > "$infile.ready"
else
    join -t, -v 1 tmp.both tmp.req.merges | sed 's/=/,/' > "$infile.ready"
fi
Maybe comm would be better, but I'd still like to know why grep -v -f doesn't work. Cheers.
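One way to keep the -s test without writing the join/sed pipeline twice is to move the test into a small function. This is only a sketch: filter_bad is a made-up name, and the sample files are stand-ins for tmp.merge.nos and the output of join.

```shell
# filter_bad (hypothetical helper): drop lines from stdin that match any
# pattern in file $1; if that file is empty or missing, pass stdin through.
filter_bad() {
    if [ -s "$1" ]; then
        grep -v -f "$1"
    else
        cat
    fi
}

# Stand-in data for the demonstration (not the real files).
work=$(mktemp -d)
printf 'a,1\nb,2\nc,3\n' > "$work/good"
printf 'b,2\n' > "$work/bad"
: > "$work/empty"

# Non-empty bad file: the matching line is removed.
with_bad=$(filter_bad "$work/bad" < "$work/good")
# Empty bad file: everything passes through unchanged.
with_empty=$(filter_bad "$work/empty" < "$work/good")

rm -rf "$work"
```

In the original script the pipeline would then end in `| filter_bad tmp.merge.nos > "$infile.ready"`, with the if-else living in one place.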
On my HP-UX, a single empty pattern file works OK with -v (returns all lines). But when I specify multiple pattern files, the first being empty, I get no data returned. An empty pattern file in any but the first position works fine:
grep -v -f pat1 -f patempty -f pat2 mydata
will return all lines except those represented in pat1 and pat2. So even though my unix handles this better than yours, it still fails when the first of multiple pattern files is empty. Don't know what the deal is with that.
Maybe yours will work with a non-empty first pattern file, which contains one weird pattern that would never occur in your data:
| grep -v -f patdummy -f tmp.merge.nos
Yeah, it's still a workaround, but if it works for you, at least it would get rid of your if-else.
Good idea - but I'm in an environment where a dummy file like that would live a very short and painful life before it was taken to the vet and deleted.... I'd rather stick with the if-then - which although not beautiful code is at least effective.
Still would like to know any explanations for the behaviour....
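If the objection is to the dummy file rather than the dummy pattern, the pattern could go on the command line with -e instead. A sketch, assuming your grep accepts -e and -f together (POSIX allows it) and that the sentinel string never appears in the data. Whether this avoids the empty-file behaviour on your particular system is untested, and the file names here are stand-ins.

```shell
# Stand-in data for the demonstration (not the real files).
work=$(mktemp -d)
printf 'a,1\nb,2\nc,3\n' > "$work/good"
printf 'b,2\n' > "$work/bad"
: > "$work/empty"

# __NO_SUCH_LINE__ is an arbitrary sentinel assumed never to occur in the data.
# It guarantees grep always has at least one non-empty pattern to work with.
kept_all=$(grep -v -e '__NO_SUCH_LINE__' -f "$work/empty" "$work/good")
kept_some=$(grep -v -e '__NO_SUCH_LINE__' -f "$work/bad" "$work/good")

rm -rf "$work"
```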
Yeah - that works..... the only thing is that's also a workaround - and I already have one that works - which is probably less of a workaround.
I'm not averse to using if-else; after all, it's valid code and it's there for a good reason.... and I would think that the if-else I have is more 'pure' code than creating and deleting a dummy file.
So I'm not all that worried about the if-else - happy to leave it there as it's no more inefficient than other methods you guys have come up with....really just curious about the empty file thing.
I played with that grep -v "" < file and got the same behaviour you suggested....do you know why the null string is considered a match for everything else?
Every string can be considered to have a null string in it. Think about how you would write grep. Unless you put in special code for a null string, it's gotta work this way. And who wants to slow down a program to ensure that something goofy like 'grep -v ""' works?
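A quick way to see this for yourself, using a throwaway three-line file (the contents are made up for the example):

```shell
# Scratch file with three sample lines.
data=$(mktemp)
printf 'alpha\nbeta\ngamma\n' > "$data"

# The empty pattern matches every line, so -c counts all of them.
matched=$(grep -c "" "$data")

# Inverting that match with -v therefore leaves nothing to print.
inverted=$(grep -v "" "$data" | wc -l | tr -d ' ')

echo "matched=$matched inverted=$inverted"
rm -f "$data"
```

Every line is selected by the empty pattern, so -v has nothing left to print.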