Ubuntu, Bash 4.3.48
Hi,
I have several files and I want to join them line by line wherever the lines start with the same field (an ID).
INPUT FILE 1 (tab delimited)
aa_12_12_v_c aaa,asf,afgas,eg
bb_12_43_a_d dad,ada,adaf,afa
cc_56_75_d_f asd,thh,ert,rtertet
INPUT FILE 2 (tab delimited)
aa_12_12_v_c 1:1:1:1:1
cc_56_75_d_f 2:2:2:2:2
INPUT FILE 3 (tab delimited)
bb_12_43_a_d 3:3:3:3:3
Using join
join -t $'\t' -a1 FILE1 FILE2 > OUTPUT1
OUTPUT1 (tab delimited)
aa_12_12_v_c aaa,asf,afgas,eg 1:1:1:1:1
bb_12_43_a_d dad,ada,adaf,afa
cc_56_75_d_f asd,thh,ert,rtertet 2:2:2:2:2
Since -e ND doesn't work in my case, I have to do this:
awk 'FNR==NR{if(m<NF)m=NF;next}{for(i=NF;i<m;i++)$(i+1)="ND"}1' OUTPUT1 OUTPUT1 > XFILE; sed 's/ /\t/g' XFILE > OUTPUT2
OUTPUT2 (tab delimited)
aa_12_12_v_c aaa,asf,afgas,eg 1:1:1:1:1
bb_12_43_a_d dad,ada,adaf,afa ND
cc_56_75_d_f asd,thh,ert,rtertet 2:2:2:2:2
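A possible explanation, stated as an assumption about GNU join rather than something confirmed for this setup: -e only fills fields that are named in an -o output format, so with -a1 alone the unpaired line is printed without padding. The sketch below uses -o auto, a GNU extension (not POSIX) that takes the number of columns from the first line of each file. The scratch directory and the workdir variable are illustrative only; the rows are copied from the post.

```shell
# Sketch only: assumes a GNU join that supports "-o auto" (a GNU extension)
# and inputs that are already sorted on the first field.
tab=$(printf '\t')
workdir=$(mktemp -d)   # scratch directory, illustrative only

# Sample rows copied from the post.
printf 'aa_12_12_v_c\taaa,asf,afgas,eg\nbb_12_43_a_d\tdad,ada,adaf,afa\ncc_56_75_d_f\tasd,thh,ert,rtertet\n' > "$workdir/FILE1"
printf 'aa_12_12_v_c\t1:1:1:1:1\ncc_56_75_d_f\t2:2:2:2:2\n' > "$workdir/FILE2"

# -a1 keeps unpaired lines of FILE1; -o auto fixes the column count,
# which lets -e ND fill the columns that would otherwise be missing.
join -t "$tab" -a1 -e ND -o auto "$workdir/FILE1" "$workdir/FILE2" > "$workdir/OUTPUT2"
cat "$workdir/OUTPUT2"
```

If this behaves as assumed, the separate awk and sed padding step would no longer be needed.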
Then for the third file...
join -t $'\t' -a1 OUTPUT2 FILE3 > OUTPUT3
OUTPUT3 (tab delimited)
aa_12_12_v_c aaa,asf,afgas,eg 1:1:1:1:1
bb_12_43_a_d dad,ada,adaf,afa ND 3:3:3:3:3
cc_56_75_d_f asd,thh,ert,rtertet 2:2:2:2:2
Since -e ND doesn't work in my case, I have to do this again:
awk 'FNR==NR{if(m<NF)m=NF;next}{for(i=NF;i<m;i++)$(i+1)="ND"}1' OUTPUT3 OUTPUT3 > XFILE; sed 's/ /\t/g' XFILE > OUTPUT4
OUTPUT4 (tab delimited)
aa_12_12_v_c aaa,asf,afgas,eg 1:1:1:1:1 ND
bb_12_43_a_d dad,ada,adaf,afa ND 3:3:3:3:3
cc_56_75_d_f asd,thh,ert,rtertet 2:2:2:2:2 ND
--- --- ---
The point is that my code seems a little complicated. Also, often but not always, I have problems with sorting: sometimes join reports sorting errors when I run it. I read that if I'm sure my files are sorted I can bypass the sort check of the join command, but I would prefer new code that runs without warnings.
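On the sorting messages: join expects both inputs to be sorted on the join field, in the collation order of the current locale. A common way to make that predictable is to force the C locale for both sort and join and to sort on the first field only. The following is a minimal sketch under that assumption; the scratch directory and the .sorted file names are illustrative, and the rows are reused from the post in a deliberately shuffled order.

```shell
# Sketch only: forces byte-wise (C locale) ordering for both sort and join.
export LC_ALL=C
tab=$(printf '\t')
workdir=$(mktemp -d)   # scratch directory, illustrative only

# Deliberately unsorted input, reusing rows from the post.
printf 'cc_56_75_d_f\tasd,thh,ert,rtertet\naa_12_12_v_c\taaa,asf,afgas,eg\nbb_12_43_a_d\tdad,ada,adaf,afa\n' > "$workdir/FILE1"
printf 'cc_56_75_d_f\t2:2:2:2:2\naa_12_12_v_c\t1:1:1:1:1\n' > "$workdir/FILE2"

# Sort on the join field only (-k1,1), with tab as the field separator.
sort -t "$tab" -k1,1 "$workdir/FILE1" > "$workdir/FILE1.sorted"
sort -t "$tab" -k1,1 "$workdir/FILE2" > "$workdir/FILE2.sorted"

# sort -c exits non-zero if the file is not in the requested order.
sort -c -t "$tab" -k1,1 "$workdir/FILE1.sorted" && echo "FILE1.sorted is ordered"

join -t "$tab" -a1 "$workdir/FILE1.sorted" "$workdir/FILE2.sorted" > "$workdir/OUTPUT1"
cat "$workdir/OUTPUT1"
```

Sorting and joining under the same locale setting is the design choice here; skipping join's order check would hide the message but not fix a real ordering mismatch.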
Do you know any other command? Any help is welcome: commands, code, scripts.
With N files, I would like to do this in a loop.
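One alternative that avoids both the sorting requirement and an explicit loop is a single awk pass over all files. This is only a sketch and it makes assumptions that may not hold for the real data: every file has exactly two tab-separated fields (ID and value), each ID appears at most once per file, and no input file is empty. Unlike join -a1, it also keeps IDs that appear only in later files. The variable names (val, order, nfiles) and the scratch directory are illustrative.

```shell
# Sketch only: assumes two tab-separated fields per line (ID, value),
# unique IDs within each file, and no empty input files.
workdir=$(mktemp -d)   # scratch directory, illustrative only

# Sample rows copied from the post.
printf 'aa_12_12_v_c\taaa,asf,afgas,eg\nbb_12_43_a_d\tdad,ada,adaf,afa\ncc_56_75_d_f\tasd,thh,ert,rtertet\n' > "$workdir/FILE1"
printf 'aa_12_12_v_c\t1:1:1:1:1\ncc_56_75_d_f\t2:2:2:2:2\n' > "$workdir/FILE2"
printf 'bb_12_43_a_d\t3:3:3:3:3\n' > "$workdir/FILE3"

awk -F '\t' -v OFS='\t' '
FNR == 1 { nfiles++ }                 # a new input file starts
{
    if (!($1 in seen)) {              # remember IDs in first-seen order
        seen[$1] = 1
        order[++nkeys] = $1
    }
    val[$1, nfiles] = $2              # value of this ID in this file
}
END {
    for (k = 1; k <= nkeys; k++) {
        key = order[k]
        line = key
        for (f = 1; f <= nfiles; f++) {
            if ((key, f) in val) {
                line = line OFS val[key, f]
            } else {
                line = line OFS "ND"  # ID missing from file f
            }
        }
        print line
    }
}' "$workdir/FILE1" "$workdir/FILE2" "$workdir/FILE3" > "$workdir/OUTPUT4"
cat "$workdir/OUTPUT4"
```

Any number of files can be listed on the command line, so N files need no intermediate outputs; rows come out in the order the IDs are first seen.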
Many thanks!
echo manolis