xshang
September 30, 2012, 1:57am
1
I have two file like follows. I want to merge them according the first field of file1. The separator of file1 is tab, while the second one is ",". I tried some code, but can't get the results.
File1:
AABB 6072 28 5922
BBCC 316 147 162
CCDD 907 71 231
File2:
CCDD,hTRBV12-4,hTRBJ2-3,319895
DDEE,hTRBV20-1,hTRBJ1-5,252661
Output:
CCDD 907 71 231 hTRBV12-4 hTRBJ2-3 319895
awk 'NR==FNR{a[$1];next}$1 in a{print $0}' file1 FS, file2
mirni
September 30, 2012, 3:01am
2
Try
join -t' ' -j 1 file1 <(sed 's/,/ /g' file2)
Input the tab as ctrl+v ctrl+i.
Your awk line can also be made to work:
awk -F' ' 'NR==FNR{a[$1];next}$1 in a' file1 <(sed 's/,/ /g' file2)
1 Like
RudiC
September 30, 2012, 3:31am
3
Slight addition to your own awk cmd will make it work:
awk 'NR==FNR{a[$1]=$0;next}$1 in a{print a[$1], $2, $3, $4}' file1 FS="," OFS="\t" file2
1 Like
another awk version:
awk 'NR==FNR{A[$1]=$0; next} $1 in A && $1=A[$1]' file1 FS=, OFS='\t' file2
1 Like
rudic:
NICE!
Thanks It later crossed my mind that it could be reduced a bit further still:
awk 'NR==FNR{A[$1]=$0; next} $1=A[$1]' file1 FS=, OFS='\t' file2
2 Likes
xshang
September 30, 2012, 9:15pm
7
scrutinizer:
Thanks It later crossed my mind that it could be reduced a bit further still:
awk 'NR==FNR{A[$1]=$0; next} $1=A[$1]' file1 FS=, OFS='\t' file2
Nice code. It works, but another problem appears.
The fields in file2 cover some fields in file1's in output, however, the value of "NF" show the right number. Confused.
Besides, what should I do if I want to change the position of file1 and file2?
I try the following code but it don't work.
awk 'NR==FNR{A[$1]=$0; next} $1=A[$1]' file2 FS='\t' OFS=, file1
With fields covering other fields, I presume, you mean it is as if the fields are written over the fields of the other file.. This coulld be due to carriage returns in the input files, which can happen if files are in DOS format instead of UNIX format.
To convert use:
tr -d '\r' < file > file.new
first.
To flip the files around, you could try:
awk 'NR==FNR{A[$1]=$0; next} $1=A[$1]' FS=, file2 FS='\t' OFS=, file1
1 Like