Hi,
I need to join two files together with one common value in a column. I think I can use awk or join or a combination but I can't quite get it.
Basically my data looks like this, with the TICKER columns matching up in each file
File1
TICKER,column 1, column, 2, column, 3, column 4
File2.
TICKER,STATUS
I need them to join up like this into file3.
TICKER,STATUS,column 1, column, 2, column, 3, column 4
thank you
Yoda
June 24, 2013, 6:54pm
2
This type of question is one of frequently asked. Did you search this forum before posting?
Yes I did. I am read through several threads and am currently trying this:
join -t\, file1 file2
which works, but it only prints out 5 lines for some reason, even though the first column for each file is identical.
Yoda
June 24, 2013, 7:21pm
4
You could try using awk:
awk -F, 'NR==FNR{A[$1]=$0;next}$1 in A{$0=$0","A[$1];print}' file1 file2
That is very helpful, thank you!
Can you help me understand what is being matched up with that line of code, though? For example if I wanted file2 to be matched with column 2 instead, but only print column 1 to the output, what would I have to change around?
thanks!!
unkleruckus:
Yes I did. I am read through several threads and am currently trying this:
join -t\, file1 file2
which works, but it only prints out 5 lines for some reason, even though the first column for each file is identical.
The join command would work too, but it requires that the lines in the inout files are in sorted order..
unkleruckus:
That is very helpful, thank you!
Can you help me understand what is being matched up with that line of code, though? For example if I wanted file2 to be matched with column 2 instead, but only print column 1 to the output, what would I have to change around?
thanks!!
Then you could try something like this:
awk 'NR==FNR{A[$2]; next} $2 in A{print $1}' file1 file2
If that does not work, you would need to provide a better data sample..
Hey guys
thanks for the help with this, I was able to accomplish my goal by doing the following:
sort -t "," file1.csv > temp1.csv
sort -t "," -k 2 file2.csv > temp2.csv
join -t\, -1 1 -2 2 -o0 1.2,2.1,1.3,1.4,1.5,1.6 temp1.csv temp2.csv > OUTPUT.csv
and all my data goes to the correct spot! I am seeing trailing commas at the end of my OUTPUT.csv, though. Any idea of why that happens? It's probably not a big deal but I'd like my script to output the data as cleanly as possible.
thanks again for the help with this!