Join two files with matching columns

Hi,

I need to join two files together with one common value in a column. I think I can use awk or join or a combination but I can't quite get it.

Basically my data looks like this, with the TICKER columns matching up in each file

File1

TICKER,column 1, column, 2, column, 3, column 4

File2.

TICKER,STATUS

I need them to join up like this into file3.

TICKER,STATUS,column 1, column, 2, column, 3, column 4

thank you

This type of question is one of frequently asked. Did you search this forum before posting?

Yes I did. I am read through several threads and am currently trying this:

join -t\, file1 file2 

which works, but it only prints out 5 lines for some reason, even though the first column for each file is identical.

You could try using awk:

awk -F, 'NR==FNR{A[$1]=$0;next}$1 in A{$0=$0","A[$1];print}' file1 file2

That is very helpful, thank you!

Can you help me understand what is being matched up with that line of code, though? For example if I wanted file2 to be matched with column 2 instead, but only print column 1 to the output, what would I have to change around?

thanks!!

The join command would work too, but it requires that the lines in the inout files are in sorted order..

Then you could try something like this:

awk 'NR==FNR{A[$2]; next} $2 in A{print $1}' file1 file2

If that does not work, you would need to provide a better data sample..

Hey guys

thanks for the help with this, I was able to accomplish my goal by doing the following:

sort -t "," file1.csv > temp1.csv
sort -t "," -k 2 file2.csv > temp2.csv
join -t\, -1 1 -2 2 -o0 1.2,2.1,1.3,1.4,1.5,1.6 temp1.csv temp2.csv > OUTPUT.csv

and all my data goes to the correct spot! I am seeing trailing commas at the end of my OUTPUT.csv, though. Any idea of why that happens? It's probably not a big deal but I'd like my script to output the data as cleanly as possible.

thanks again for the help with this!