Delete file2.txt from file1.txt using scripting

Hi,

I'm a total newbie. My requirement is that I have 2 files.

I want to identify which countries I do not currently have in my db.

How can I use grep or another command to find them?

I want to match all-countries.txt against countries-in-db.txt so that the output is the countries which are not currently in my db.

Any command that I can use to achieve this would be helpful.

Thanks

Hi Beanbaby,

I think you can use diff file1.txt file2.txt. It will show you the difference between the two files.

$ cat db.txt
US
SG
AU
$ cat country.txt 
US
IN
GB
JP
SG
AU
$ awk 'NR==FNR{a[$0]++;next}{if (!($1 in a))print}' db.txt country.txt
IN
GB
JP

Thank you so much.

However, my db has 127 countries and the whole-world list has 260 countries. The result from the above command

awk 'NR==FNR{a[$0]++;next}{if (!($1 in a))print}' db.txt country.txt

is 219.

Should it not be close to 130 (the countries that are not listed in db.txt)?

Thanks!

Post some contents of your files and the expected output.

If your country names contain spaces, like "Saudi Arabia", then you need to try this instead:

awk 'NR==FNR{a[$0]++;next}{if (!($0 in a))print}' db.txt country.txt
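To see why the first version over-reports, here is a small sketch comparing the two lookups. The file contents and the scratch directory are made up for illustration. The array keys are always whole lines of db.txt, so looking up only the first word ($1) of a multi-word name never finds a match.

```shell
# Work in a scratch directory so no real files are touched.
tmp=$(mktemp -d)
printf '%s\n' 'Uganda' 'United States' > "$tmp/db.txt"
printf '%s\n' 'Uganda' 'United Kingdom' 'United States' > "$tmp/country.txt"

# $1 is only the first word, so "United" is looked up and never found.
by_first_field=$(awk 'NR==FNR{a[$0]++;next}{if (!($1 in a))print}' "$tmp/db.txt" "$tmp/country.txt")

# $0 is the whole line, so "United States" matches its db entry.
by_whole_line=$(awk 'NR==FNR{a[$0]++;next}{if (!($0 in a))print}' "$tmp/db.txt" "$tmp/country.txt")

printf 'with $1:\n%s\n\nwith $0:\n%s\n' "$by_first_field" "$by_whole_line"
rm -rf "$tmp"
```

With $1, "United States" is reported even though it is in db.txt, which would explain a count that is too high.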

Hello,

Yes, there are country names containing one or two spaces, like the example below

allcountries.txt

Uganda
United Arab Emirates
United Kingdom
United States
Uruguay

db.txt

Uganda
United States
Uruguay

result.txt

should contain the countries not listed in my database, like the below

United Arab Emirates
United Kingdom

Many Thanks!

 
$ awk 'NR==FNR{a[$0]++;next}{if (!($0 in a))print}' db.txt country.txt
United Arab Emirates
United Kingdom

try this...

grep -vxFf all-countries.txt countries-in-db.txt 

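This gives the records from countries-in-db.txt which are not present in all-countries.txt. To list the countries that are missing from the db instead, swap the two file names.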
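For what it's worth, with -f the first file name is the pattern list and the second is the file being filtered, so the order decides which side gets printed. A small sketch using the sample data from earlier in the thread, written to a scratch directory (the mktemp directory is only for illustration):

```shell
tmp=$(mktemp -d)
printf '%s\n' 'Uganda' 'United Arab Emirates' 'United Kingdom' 'United States' 'Uruguay' > "$tmp/all-countries.txt"
printf '%s\n' 'Uganda' 'United States' 'Uruguay' > "$tmp/countries-in-db.txt"

# -f FILE  read the patterns from FILE (here: the db list)
# -F       treat patterns as fixed strings, not regular expressions
# -x       a pattern must match the whole line
# -v       print the lines that do NOT match
missing_from_db=$(grep -vxFf "$tmp/countries-in-db.txt" "$tmp/all-countries.txt")

printf '%s\n' "$missing_from_db"
rm -rf "$tmp"
```

This prints the lines of all-countries.txt that have no exact counterpart in countries-in-db.txt.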

Sweet, it looks like it is getting the right results for countries with spaces in the name, like United States, but not for single-word countries: Thailand is in both .txt files but still shows up in the results?

Thanks

awk 'FNR==NR{sub(/[ \t]*$/,"");a[$0];next}
{sub(/[ \t]*$/,"")}
!($0 in a)' db.txt allcountries.txt
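A likely cause of the Thailand symptom is invisible trailing whitespace in one of the files, which is what the sub() calls above strip. Here is a rough sketch with made-up data to reproduce and check it. The trailing space after "Thailand" is an assumption about what the real files look like.

```shell
tmp=$(mktemp -d)
# Hypothetical data: "Thailand" carries a trailing space in one file only.
printf '%s\n' 'Thailand ' 'United Kingdom' > "$tmp/allcountries.txt"
printf '%s\n' 'Thailand' > "$tmp/db.txt"

# sed -n l prints each line unambiguously and marks its end with "$",
# so a trailing space shows up as "Thailand $".
sed -n l "$tmp/allcountries.txt"

# Plain whole-line comparison: "Thailand " and "Thailand" differ.
without_trim=$(awk 'NR==FNR{a[$0];next} !($0 in a)' "$tmp/db.txt" "$tmp/allcountries.txt")

# Same comparison after stripping trailing spaces and tabs from both files.
with_trim=$(awk 'FNR==NR{sub(/[ \t]*$/,"");a[$0];next}
{sub(/[ \t]*$/,"")}
!($0 in a)' "$tmp/db.txt" "$tmp/allcountries.txt")

printf 'untrimmed:\n%s\n\ntrimmed:\n%s\n' "$without_trim" "$with_trim"
rm -rf "$tmp"
```

Running sed -n l on the real files should show whether they have the same problem.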

Thanks, that seems to have worked perfectly.

Many Thanks!

Try

$ grep -vf db.txt allcountries.txt
United Arab Emirates
United Kingdom
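One caveat with plain grep -vf: each line of db.txt is used as a regular expression that may match anywhere in a line, so a short name can hide a longer one. A small sketch with made-up data (Niger, Nigeria and Norway are only illustrative entries):

```shell
tmp=$(mktemp -d)
printf '%s\n' 'Niger' 'Nigeria' 'Norway' > "$tmp/allcountries.txt"
printf '%s\n' 'Niger' > "$tmp/db.txt"

# Pattern "Niger" also matches inside "Nigeria", so Nigeria is dropped too.
loose=$(grep -vf "$tmp/db.txt" "$tmp/allcountries.txt")

# -x forces whole-line matches and -F disables regex interpretation.
exact=$(grep -vxFf "$tmp/db.txt" "$tmp/allcountries.txt")

printf 'grep -vf:\n%s\n\ngrep -vxFf:\n%s\n' "$loose" "$exact"
rm -rf "$tmp"
```

Adding -x and -F keeps Nigeria in the result, which is probably what is wanted for a list of country names.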

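If stray spaces lead to inconsistencies, try tr -d " " <file | grep -vf <(tr -d " " <file2). Note that <( ) is process substitution, so this needs bash, ksh or zsh rather than plain sh, and the names in the output will have their spaces removed too.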