Long-time listener, first-time poster. Hope someone can advise.
I have two files, 1000+ lines in each, two fields in each file.
After performing a sort, what is the best way to find exact matches, where fields $1 and $2 of a line in file1 also appear together on one line in file2, and then output only those lines into file3?
Field 1 will always be unique within each file, but field 2 will not; the same text could appear in numerous entries.
So where field 1 and field 2 on a line in file 1 both exist on the same line in file 2, output that line to file 3.
Thanks in advance.
File 1
abcdefgh name1
bcdefgha name2
nmrsthji namei
nmdherya wood
kasjfhsw moon
pqoweiru sun
wershgsy other
iundhstw tree
gfhskwyt mine
alskalak hoover
File 2
abcdefgh name1
bcdefgha name5
hjgdrnja namej
nmdherya notwood
kasjfhsw moon
wershgsy other
pqoweiru sun
gfhskwyt mine
alskalak hoover
--- Post updated at 02:25 PM ---
Confusing myself.
I don't mean that field 1 and field 2 on line 3 in file 1 have to be on line 3 in file 2.
Rather, fields 1 and 2 from line 3 must match together in file 2, on any line; then output to file 3.
Being a long time listener, you might be aware that the preferred approach in here is to show your own efforts, like code attempts, or thoughts on a solution. The posted problem has been addressed umpteen times in these forums, and you might find a good starting point searching, e.g. the links given in the lower left of this screen, under "More UNIX and Linux Forum Topics You Might Find Helpful".
On top of that, showing the desired output would clarify the situation.
The output should go into file 3, consisting of field 1 and field 2 from the lines that match in both files.
If that produces duplicate output, I can live with that.
Hi bstaff,
I think we already understand that you want the results to be stored in a third file (rather than just letting the output be sent to the program's standard output so you could redirect it wherever you wanted it to go). The question is: given the contents of File 1 and File 2 that you showed us in post #1 in this thread, exactly what output do you hope will be stored in your output file?
Note also that the description of your problem talks about file 1 and file 2 (with entirely lowercase "file" in both cases), but when you showed us sample contents for those files you showed labels with initial caps on both filenames. On most BSD, Linux, and UNIX filesystems, case matters in filenames.
I'm not sure why your search in here has not yet been successful, as many of the links show promising starting points, and with some - not too much - fiddling your problem can be solved:
awk 'NR == FNR {T[$1,$2]; next} ($1,$2) in T' file[12]
abcdefgh name1
kasjfhsw moon
wershgsy other
pqoweiru sun
gfhskwyt mine
alskalak hoover
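For anyone landing here later, the one-liner above can be spelled out as the annotated sketch below. It assumes whitespace-separated fields and the lowercase names file1, file2 and file3 (file3 being the output name asked for in post #1); the scratch directory and the here-documents are only illustrative setup, recreating the sample data so the script runs on its own.

```shell
#!/bin/sh
# Work in a scratch directory so no existing files are overwritten
# (illustrative setup only, not part of the original one-liner).
cd "$(mktemp -d)" || exit 1

# Sample data copied from post #1.
cat > file1 <<'EOF'
abcdefgh name1
bcdefgha name2
nmrsthji namei
nmdherya wood
kasjfhsw moon
pqoweiru sun
wershgsy other
iundhstw tree
gfhskwyt mine
alskalak hoover
EOF

cat > file2 <<'EOF'
abcdefgh name1
bcdefgha name5
hjgdrnja namej
nmdherya notwood
kasjfhsw moon
wershgsy other
pqoweiru sun
gfhskwyt mine
alskalak hoover
EOF

# NR == FNR holds only while awk reads the first file named (file1):
#   store each field1/field2 pair as a key in array T, then skip to the next line.
# For file2, the bare pattern "($1,$2) in T" prints a line only when
#   its pair was stored while reading file1.
awk 'NR == FNR { T[$1,$2]; next } ($1,$2) in T' file1 file2 > file3

cat file3
```

Because the lookup goes through an awk array, neither file needs to be sorted first, and the matches come out in file2's order. One caveat: the NR == FNR test misbehaves if file1 is empty, since file2 would then be treated as the first file.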