Joining two files by retaining the headers.

Hi Experts,

I want to join two files(file1 ,file2) which are having tab separated values,by sorting them on key column(ID) and want to redirect the output to other file(output file) along with the headers from both the files.

cat file1:
ID	Name	phone_no
205	mno	90808
200	xyz	32003
100	abc	55555

cat file2:

ID	city	email
205	bangalore	blr@abc.com
100	delhi	dlh@abc.com

Required output:

ID	Name	phone_no	city	email
205	mno	90808	bangalore	blr@abc.com
100	abc	55555	delhi	dlh@abc.com

Need your help experts

Try :

$ awk 'FNR==NR{A[$1]=$2 FS $3;next}($1 in A){$0=$0 FS A[$1]}A[$1]' file2 file1
ID Name phone_no city email
205 mno 90808 bangalore blr@abc.com
100 abc 55555 delhi dlh@abc.com

Please use codetag not icode

Try:

awk 'NR==FNR{a[$1]=$0; next} $1 in a {print a[$1], $2, $3}' file1 file2
1 Like

Thanks Akshay:)It worked perfectly:b:
One small doubt, what if file1 contains headers and file2 is without headers. In that case will this command work or will there be any other changes in the command

---------- Post updated at 06:37 AM ---------- Previous update was at 06:36 AM ----------

Thank you Frank :slight_smile:

what if file1 contains headers and file2 is without headers. In that case will this command work or will there be any other changes in the command

Then Header will not be printed

Sorry for asking too many doubts.

The below code works for fine for files with 3 columns. what if it contains unknown number of columns.In that case how to print all the columns

If you just want to print file1 header you can use this

$ awk 'FNR==NR{A[$1]=$2 FS $3;next}FNR==1{print}($1 in A){$0=$0 FS A[$1]}A[$1]' file2 file1
ID Name phone_no 
205 mno 90808 bangalore blr@abc.com
100 abc 55555 delhi dlh@abc.com

if you want to print your own header you can do like this

$ awk 'FNR==NR{A[$1]=$2 FS $3;next}FNR==1{print "My Headers blah blah"}($1 in A){$0=$0 FS A[$1]}A[$1]'  file2  file1
My Headers blah blah
205 mno 90808 bangalore blr@abc.com
100 abc 55555 delhi dlh@abc.com

---------- Post updated at 07:00 AM ---------- Previous update was at 06:54 AM ----------

You can select required column

$0 ---> to print entire line
$1 ---> to print 1st column
$2 ---> to print 2nd column
........................................
$n ---> to print nth column

1 Like

You can print the header in the BEGIN section and if you have unknown number of columns you can try something like this:

awk 'BEGIN{print "ID Name phone_no city email"} NR==FNR{a[$1]=$0; next} $1 in a {$1=a[$1]}1' file1 file2

This works if we have less number of columns, But what if we have around 1000+ columns

Use nawk or /usr/xpg4/bin/awk on Solaris.