Merging txt files of different sizes

Hi,

I have two .txt files that look like

"baseMean" "log2FoldChange" "lfcSE" "stat" "pvalue" "padj"
"c104215_g2_i4" 202.057864855455 5.74047973414006 1.14052672909697 5.03318299141063 4.8240223910525e-07 0.00234905721174879
"c91544_g1_i1" 373.123487095726 5.62496675850204 1.15060014539303 4.88872418539511 1.01491573830736e-06 0.00234905721174879
"c104937_g1_i2" 127.674619831286 5.06648438344161 1.16615265871181 4.34461504297243 1.39520111651921e-05 0.0183546569529002
"c105753_g1_i3" 134.024403708584 4.97002237479055 1.17052222688412 4.24598718472911 2.17633069924804e-05 0.0207469473745466
"c108287_g1_i4" 116.154777394681 4.94489963165783 1.17057311887466 4.22434066862194 2.39641321103628e-05 0.0207469473745466
"c103430_g2_i1" 113.778003847288 4.90828138271733 1.17197935474572 4.18802717201639 2.81389833386134e-05 0.0216545109559152
"c83301_g1_i1" 103.09657725435 4.73959088799424 1.17819090507568 4.0227698818383 5.75176873742156e-05 0.0355851897968018
"c99520_g2_i1" 79.96763602061 4.35095490449958 1.19150491788504 3.65164661864985 0.000260564268983195 0.0751945052907338
"c69876_g1_i1" 552.790165445229 -3.97960824220711 1.11639782163909 -3.56468649890793 0.000364291348238595 0.0901100670678753

and

Hit	Name	signature_desc	Ontology_term
c48374_g1_i2	PF02874,PF00006,PF00306	ATP synthase alpha/beta family, beta-barrel domain,ATP synthase alpha/beta family, nucleotide-binding domain,ATP synthase alpha/beta chain, C terminal domain	GO:0005524,GO:0015992,GO:0046034,GO:0015991,GO:0016820,GO:0033178	
c99520_g2_i1	PF10168,PF00487,PF00173	Nuclear pore component,Fatty acid desaturase,Cytochrome b5-like Heme/Steroid binding domain	GO:0020037,GO:0006629	
c105882_g1_i3	PF03638	Tesmin/TSO1-like CXC domain, cysteine-rich domain		
c83301_g1_i1	PF01694	Rhomboid family	GO:0004252,GO:0016021	
c94400_g1_i1	PF01419	Jacalin-like lectin domain		
c55961_g1_i1	PF00030	Beta/Gamma crystallin		
c104646_g2_i1	PF00217	ATP:guanido phosphotransferase, C-terminal catalytic domain	GO:0016301,GO:0016772	
c103430_g2_i1	PF02991	Autophagy protein Atg8 ubiquitin like		
c104937_g1_i2	PF13499,PF04377	Arginine-tRNA-protein transferase, C terminus,EF-hand domain pair	GO:0004057,GO:0016598,GO:0005509

They are different sizes (the first one is longer).

What I need to do is "add" the information from the second file to the first file, keeping only the rows whose ID appears in both of them. The IDs are in the first column of each file. I also need to keep the rows sorted as in the first file.

I guess I can probably do it with awk, but I honestly don't know how.

Can anyone help me?
Thank you for your time.

Alicia

Try this:

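awk -F'\t' '
# First line of each input file: count the files and handle the headers.
FNR==1 {
   if(++file==2) {
      # file1 is space-separated; the new FS applies from the next record on.
      OFS=FS=" "
      print $0,heading
   } else {
      qt="\""
      heading = qt $2 qt " " qt $3 qt " " qt $4 qt
   }
   next
}
# file2 (read first): store each annotation column under the quoted ID.
file==1 {k=qt $1 qt; name[k]=qt $2 qt; sig[k]=qt $3 qt; term[k]=qt $4 qt}
# file1: print only rows whose ID was seen in file2, with the annotations appended.
file==2 && ($1 in name) { print $0,name[$1],sig[$1],term[$1] }' file2 file1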

Hi, try something like:

awk 'NR==FNR{A["\"" $1 "\""]; next} $1 in A' file2 file1

Output:

"c104937_g1_i2" 127.674619831286 5.06648438344161 1.16615265871181 4.34461504297243 1.39520111651921e-05 0.0183546569529002
"c103430_g2_i1" 113.778003847288 4.90828138271733 1.17197935474572 4.18802717201639 2.81389833386134e-05 0.0216545109559152
"c83301_g1_i1" 103.09657725435 4.73959088799424 1.17819090507568 4.0227698818383 5.75176873742156e-05 0.0355851897968018
"c99520_g2_i1" 79.96763602061 4.35095490449958 1.19150491788504 3.65164661864985 0.000260564268983195 0.0751945052907338

---
Or, to add the extra information at the end, try:

awk 'NR==FNR{i="\"" $1 "\""; $1=x; A[i]=$0; next} $1 in A{print $0, A[$1]}' FS='\t' file2 FS=" " file1
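
If you also want to keep the header line and have the added columns tab-separated (so that descriptions containing spaces stay in one field), something along these lines might work. It is only a sketch: the two sample files are shortened, made-up copies of the ones in the question, and the names tmp, merged, info, header and ann are just illustrative.

```shell
# Shortened sample copies of the two files from the question (values truncated).
tmp=$(mktemp -d)
printf '%s\n' \
  '"baseMean" "log2FoldChange"' \
  '"c104215_g2_i4" 202.05 5.74' \
  '"c104937_g1_i2" 127.67 5.06' \
  '"c99520_g2_i1" 79.96 4.35' > "$tmp/file1"
printf 'Hit\tName\tsignature_desc\tOntology_term\n' > "$tmp/file2"
printf 'c99520_g2_i1\tPF10168\tNuclear pore component\tGO:0020037\n' >> "$tmp/file2"
printf 'c104937_g1_i2\tPF13499\tEF-hand domain pair\tGO:0004057\n' >> "$tmp/file2"

merged=$(awk '
    # file2 (read first, tab-separated): remember the annotation columns.
    NR == FNR {
        info = $2 "\t" $3 "\t" $4
        if (FNR == 1) header = info          # column names of file2
        else ann["\"" $1 "\""] = info        # key is the ID wrapped in quotes, as in file1
        next
    }
    # file1 header: append the annotation column names.
    FNR == 1 { print $0 "\t" header; next }
    # file1 rows: keep only IDs seen in file2, in the order of file1.
    $1 in ann { print $0 "\t" ann[$1] }
' FS='\t' "$tmp/file2" FS=' ' "$tmp/file1")

printf '%s\n' "$merged"
rm -r "$tmp"
```

The original columns of file1 stay space-separated and only the appended part uses tabs, so the result has mixed separators; if that is a problem for whatever reads the file next, the first part of each row would need converting as well.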