Comparing two files and printing 2nd column if match found

spirm8 · November 9, 2010, 8:35am

Hi guys,

I'm rather new to UNIX-based systems, and even newer when it comes to scripting.

I have two files which I need to compare.

file1: (some random IDs)

451245
451288
136588
784522

file2: (random IDs + e-mail assigned to each ID)

123888 xc@xc.com
451245 a@a.com
122112 adsadas@asd.com
451288 b@b.com
136588 c@c.com
784522 d@d.com

My thought was to have the script print $2 of file2 if there's a match on that particular ID in file1.

So it can't do a line-by-line comparison, I guess; it needs to look through the whole file and search for a match.
I would also like it to store the output in a .txt file. I know you can do that with > randomfile.txt, right?

I hope you get what I'm talking about

Best regards
Jesper

gc_sw · November 9, 2010, 8:39am

try

awk 'NR==FNR{a[$1,$2]=$3;next} a[$1,$2]{print $2}' file2 file1

spirm8 · November 9, 2010, 8:44am

Hi, thanks for answering. I'm getting this when trying to run it.

awk: syntax error near line 1
awk: illegal statement near line 1
awk: syntax error near line 1
awk: bailing out near line 1

gc_sw · November 9, 2010, 8:45am

Do you have Solaris? If so, try:

nawk 'NR==FNR{a[$1,$2]=$3;next} a[$1,$2]{print $2}' file2 file1


spirm8 · November 9, 2010, 8:48am

Yeah, I'm using Solaris, and nawk did the trick. I'm not getting any errors now, but it doesn't seem to print anything out. Is this wrong?

nawk 'NR==FNR{a[$1,$2]=$3;next} a[$1,$2]{print $2}' file2 file1 > output.txt

gc_sw · November 9, 2010, 8:49am

you should see the result. something is wrong. try:

nawk 'NR==FNR {a[$1$2]++;next} a[$1$2] { print $2 }' file2 file1

you can find similar detailed sol'n at:
http://www.unix.com/shell-programming-scripting/147237-easily-search-file-get-line-little-problem-post302466738.html

Franklin52 · November 9, 2010, 8:51am

Use grep with the -f option. Check your man page.

gc_sw:

you should see the result. something is wrong. try:
nawk 'NR==FNR {a[$1$2]++;next} a[$1$2] { print $2 }' file2 file1
you can find similar detailed sol'n at:
http://www.unix.com/shell-programming-scripting/147237-easily-search-file-get-line-little-problem-post302466738.html

That shouldn't work since the first file has only one column.

spirm8 · November 9, 2010, 9:09am

Hi, I think it might be too advanced for me to understand. I'm not getting anything out of the grep man page about -f.

Franklin52 · November 9, 2010, 9:19am

If your grep version supports the -f option:

grep -f file1 file2

Otherwise with awk:

nawk 'NR==FNR{a[$0]; next}$1 in a' file1 file2

spirm8 · November 9, 2010, 9:36am

Using
nawk 'NR==FNR{a[$0]; next}$1 in a' file1 file2 > test.txt

Doesn't seem to do the desired thing.

file1 is a small group of IDs whose associated emails I need to get hold of (926 lines/entries/users).
file2 is the complete list holding all IDs with their email addresses (1045 lines/entries/users).

When I use the command, it creates a file containing 842 lines/entries/users where both the ID AND the email are listed.

I only need the email address for each ID if there's a match, and an output file that contains only the emails that matched.

Best regards

Franklin52 · November 9, 2010, 9:45am

With the given sample files I get:

$ cat file1
451245
451288
136588
784522
$ cat file2
123888 xc@xc.com
451245 a@a.com
122112 adsadas@asd.com
451288 b@b.com
136588 c@c.com
784522 d@d.com
$ awk 'NR==FNR{a[$0]; next}$1 in a' file1 file2
451245 a@a.com
451288 b@b.com
136588 c@c.com
784522 d@d.com
$

Am I missing something?

Scrutinizer · November 9, 2010, 9:56am

IMO Franklin52's suggestion just needs a print $2 action. Plus, I would tend to use $1 instead of $0 in the first part to avoid possible mismatches due to spurious spacing...

awk 'NR==FNR{a[$1]; next}$1 in a{print $2}' file1 file2
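
For anyone following along, here is a minimal sketch of that command run against the sample data from the first post, with the result redirected to a file as originally asked. The array name ids and the file name output.txt are only illustrative choices, and the snippet overwrites file1 and file2 in the current directory, so it assumes an empty scratch directory.

```shell
# Recreate the sample files from the first post (run in a scratch directory).
printf '%s\n' 451245 451288 136588 784522 > file1
printf '%s\n' '123888 xc@xc.com' '451245 a@a.com' '122112 adsadas@asd.com' \
              '451288 b@b.com' '136588 c@c.com' '784522 d@d.com' > file2

# NR==FNR is true only while awk reads the first file named (file1):
# remember each ID as an array key, then skip to the next input line.
# For file2, print the second column only when its ID was remembered.
# On Solaris, use nawk instead of awk, as noted earlier in the thread.
awk 'NR==FNR { ids[$1]; next } $1 in ids { print $2 }' file1 file2 > output.txt

cat output.txt
```

With the sample data this leaves the four matching addresses (a@a.com, b@b.com, c@c.com, d@d.com) in output.txt, in the order they appear in file2.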

spirm8 · November 10, 2010, 2:45am

Thanks guys, I've got it working now

spirm8 · November 11, 2010, 5:05am

Let's say I now need to search for missing entries between two files. I would like to print $1 from file1 if the number isn't found in file2.

file1 (total)
1
2
3
4
5
6

file2 (partial)
1
2
3

output (numbers missing)
4
5
6

Thanks

Scrutinizer · November 11, 2010, 5:11am

awk 'NR==FNR{a[$1]; next}!($1 in a)' file2 file1

spirm8 · November 11, 2010, 5:17am

Hi, thanks for the quick reply.

When I use:

awk 'NR==FNR{a[$1]; next}!($1 in a)' file2 file1

I get:
3
4
5
6

But 3 is in file2... is there a way to fix that?

Scrutinizer · November 11, 2010, 5:27am

Strange. With your input I get:
4
5
6

--

With the samples you provided, this should work too:

grep -vf file2 file1
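
The "with the samples you provided" caveat matters: by default grep treats each line of file2 as a pattern that can match anywhere in a line. The sketch below uses hypothetical extra entries (10 and 13, not part of the original samples) to show the difference, and assumes a POSIX-conforming grep that supports -F and -x. It overwrites file1 and file2 in the current directory, so run it in a scratch directory.

```shell
# Hypothetical data: file1 gains 10 and 13 to expose the substring problem.
printf '%s\n' 1 2 3 4 5 6 10 13 > file1
printf '%s\n' 1 2 3 > file2

# Plain -f: each line of file2 is a pattern that may match anywhere in a
# line of file1, so "1" and "3" also remove 10 and 13.
grep -vf file2 file1 > loose.txt

# -F takes the patterns as fixed strings; -x requires a whole-line match.
grep -vxFf file2 file1 > exact.txt

cat loose.txt
cat exact.txt
```

Here loose.txt ends up with 4, 5, 6 only, while exact.txt keeps 4, 5, 6, 10, 13.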

spirm8 · November 11, 2010, 5:39am

Very weird... I don't know why it's doing that o_O

The grep -vf works as it should, though... heh

ygemici · November 11, 2010, 5:46am

# for i in `sed "" file2`; do sed "/$i/d" file1 1>testf  && mv -f testf file1; done ; more file1
4
5
6

# comm -3 file2 file1
      4
      5
      6

# diff file2 file1 | sed '1d;s/[>]* //'
4
5
6

spirm8 · November 11, 2010, 8:48am

I searched the forum for a one-liner that can remove the matches between two files, but without luck.

file1:
1
2
3
4
5
6

file2:
1
2
3

Now it's the other way around: if '3' is found in file1, using file2 as a "delete these" list, it should be removed from my > output.txt

Basically, my file2 contains all the entries that should be removed from file1.
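
This looks like the same set difference as the "missing entries" question earlier in the thread, so a minimal sketch (assuming one entry per line, and with the array name drop chosen only for illustration) would reuse that awk command and redirect the result. It overwrites file1 and file2 in the current directory, so run it in a scratch directory.

```shell
# Recreate the sample files from the post above.
printf '%s\n' 1 2 3 4 5 6 > file1
printf '%s\n' 1 2 3 > file2

# Load the delete list (file2) first, then print only those file1 lines
# whose first field is not a key in that list.
# On Solaris, use nawk instead of awk.
awk 'NR==FNR { drop[$1]; next } !($1 in drop)' file2 file1 > output.txt

cat output.txt
```

With these samples output.txt contains 4, 5 and 6. The thread never establishes why 3 slipped through for the original poster; dumping both files with od -c to look for stray characters is one way to investigate.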