Remove words from file

Beeser · December 19, 2008, 10:14am

Hello,

I have a question:
I have two different files, let's call them file1 and file2. file1 contains a list of words, each on a separate line:

word1
word2
word3
word4
etc...

file2 also contains a list of words, separated in the same way as file1.

What I want to do is remove the words that are in both file1 and file2 from file2. Does anyone know if this is possible?

I tried some sed stuff, but I just can't get the desired result.

Many thanks!

Christoph_Spohr · December 19, 2008, 10:20am

Hi,

grep -v -f file2 file1

-v -- print only the lines that do not match any pattern
-f file2 -- read the list of patterns from file2, one per line.

HTH Chris
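
A note on the grep approach: as posted, the command filters file1 using patterns taken from file2, which is the reverse of what the question asks for. Plain grep patterns also match substrings, so a short word such as "a" would match any line containing that letter. The sketch below swaps the two files and adds -x and -F. The sample words and the scratch directory are made up for illustration and are not the poster's real data.

```shell
# Made-up sample data in a scratch directory (hypothetical contents)
workdir=$(mktemp -d)
printf '%s\n' a the I an to be > "$workdir/file1"
printf '%s\n' the cat sat on a mat to be happy > "$workdir/file2"

# -v  print only lines that do NOT match
# -x  a pattern must match the whole line, so "a" does not hit "cat"
# -F  treat patterns as fixed strings, not regular expressions
# -f  read the patterns (the words to remove) from file1
remaining=$(grep -v -x -F -f "$workdir/file1" "$workdir/file2")
printf '%s\n' "$remaining"
```

The lines of file2 keep their original order, and the comparison is case-sensitive, so "The" and "the" count as different words.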

SFNYC · December 19, 2008, 10:25am

$ cat file1
word1
word2
word3
word4
word5
word6

$ cat file2
word1
word2
word3
word4
word7
word8
word9

$ comm -13 file1 file2
word7
word8
word9

Beeser · December 19, 2008, 10:56am

Thanks for your replies, but I still don't get the desired result.

I still see the words that occur in file1 in file2 after using these commands.

file1 contains for example words that occur often in a text like:
a
the
I
an
to
be

Those words also occur in file2, but I want to strip them out of file2, so I have a list of words that don't occur that much.

I actually don't see why comm won't work.

Are there other options to solve my problem?
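
A likely reason comm seems not to work here (an assumption, since the real files are not shown): comm expects both inputs to be sorted, and a word list ordered by frequency or by appearance in a text usually is not. A minimal sketch that sorts both files first, using made-up sample words:

```shell
# Made-up sample data in a scratch directory (hypothetical contents)
workdir=$(mktemp -d)
printf '%s\n' a the I an to be > "$workdir/file1"
printf '%s\n' the cat sat on a mat to be happy > "$workdir/file2"

# sort and comm must agree on collation; the C locale keeps it predictable
export LC_ALL=C
sort "$workdir/file1" > "$workdir/file1.sorted"
sort "$workdir/file2" > "$workdir/file2.sorted"

# -1 hides lines found only in file1, -3 hides lines found in both,
# which leaves the lines that occur only in file2
rare=$(comm -13 "$workdir/file1.sorted" "$workdir/file2.sorted")
printf '%s\n' "$rare"
```

The result comes out in sorted order rather than in the original order of file2, which may or may not matter for a word list.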

vgersh99 · December 19, 2008, 11:28am

nawk 'FNR==NR {a[$0];next} !($0 in a)' file1 file2
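
For readers wondering how this one-liner works, here is a commented sketch of the same idea. It uses plain awk instead of nawk (nawk is not installed everywhere), the array name seen is just an illustrative choice, and the sample words are made up.

```shell
# Made-up sample data in a scratch directory (hypothetical contents)
workdir=$(mktemp -d)
printf '%s\n' a the I an to be > "$workdir/file1"
printf '%s\n' the cat sat on a mat to be happy > "$workdir/file2"

kept=$(awk '
  # FNR restarts at 1 for each file while NR keeps counting, so
  # FNR == NR is true only while the first file is being read.
  # Store each line of the first file as an array key, then skip ahead.
  FNR == NR { seen[$0]; next }
  # Second file: print the line only if it was not stored above.
  !($0 in seen)
' "$workdir/file1" "$workdir/file2")
printf '%s\n' "$kept"
```

Unlike comm, this needs no sorting and keeps the order of file2. One caveat: if file1 is empty, FNR == NR also holds for every line of file2, so nothing is printed.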

Beeser · December 19, 2008, 11:45am

Thanks for your help! You just solved my problem.