Merging two files in UNIX

Hi Experts,
Need urgent solution for a problem.

I have two files file1 and file2. file1 is tab separated and file2 is comma separated.
I need to merge both the files into single file based on CUST_ID by retaining the headers of file1

Matching CUST_IDs should be placed one below the other in the output file.If any correspnding column is missing in file 2 it has to be replaced with 'NA';

cat file1:(tab separated)

CUST_ID	Name	Place	acc_no
1	abc	blr|chn|dlh	500|600|700
2	xyz	hyd|blr	888|999
3	mno	blr|hyd	111|222|333


cat file2:(coma separated)
CUST_ID,Name,acc_no
1,abc,500|600|700
2,xyz,111|222|333


Required output:

CUST_ID,Name,Place,acc_no
1,abc,blr|chn|dlh,500|600|700
1,abc,NA,500|600|700
2,xyz,hyd|blr,888|999
2,xyz,NA,111|222|333

Try:

$ awk -F'[ ,]' 'FNR==NR{gsub(/[[:space:]]/,",");A[$2]=$0;next}A[$2]{$3="NA,"$3;$0=FNR==1 ? A[$2]: A[$2] RS $0}A[$2]'  OFS=\, file1 file2

Resulting

CUST_ID,Name,Place,acc_no
1,abc,blr|chn|dlh,500|600|700
1,abc,NA,500|600|700
2,xyz,hyd|blr,888|999
2,xyz,NA,111|222|333

Hi akshay ,
let me post you the actual data. PFA the data for the above specification.
Kindly help me !

If you could explain relationship between sample data posted in #1 and actual data attached in #3 I will try to understand what you are going to do.

The tab separated file is the master file which contains all the columns and CSV file is the slave file which contains only few columns from master.

Requirement.
The final output should contain all columns(headers and its value) from master file and matching columns from slave file

Both the files have to be merged together based on CUSTSEQ . If any column is missing in slave file it has to be append as 'NA' under the header column

PFA the sample.

 awk -F"[\t,]" 'NR==FNR{if (NR==1) {for (i=1;i<=NF;i++) {d=$i }}
line[$1]=$0;next }
{if(FNR==1) { for (j=1;j<=NF;j++) {gsub("\"","",$j);c[j]=$j;}}
if ( line[$1] ) { print line[$1]
for ( x=1;x<i;x++ ) {
for ( y=1;y<j;y++ ) {
if (d[x] == c[y] ) { flg=1;printf $y"\t"} else { flag=0}
}
} if ( flag == 0 ) { printf "NA" } print "\n";
} }' master slave 

Hi pravin,

The above code will output the headers from both the files and display values one below the other.But the output should contain headers from master file alone and there respective value from slave file.

awk -F"[\t,]" 'NR==FNR{if (NR==1) {print $0;for (i=1;i<=NF;i++) { d=$i } } else {line[$1]=$0;next }}
{if(FNR==1) {for (j=1;j<=NF;j++) { gsub("\"","",$j);c[j]=$j;}} else {
if ( line[$1] ) {
for ( x=1;x<i;x++ ) {
for ( y=1;y<j;y++ ) {
if (d[x] == c[y] ) { flg=1;printf $y"\t"} else { flag=0}
}
} if ( flag == 0 ) { printf "NA" };printf "\n";
} }}' master slave

Its not working pravin. :frowning: The values from slave file are not placed under correct headers. Also values from master file are not displayed.

If you are trying with data other than which you attached here, our code will not work. we noticed in last 2/3 thread of yours that you are giving sample data which is completely different from your actual data.

Sorry for creating confusion.:confused:
I have attached the actual data,and this is the one which I am actually working on.

You may want to adapt this proposal - which works for your sample data - to your master and slave files:

awk -F, 'NR>1 {$2=$2 OFS "NA"; print}' OFS="\t" file2 | cat - file1 | sort | tr '\t' ','
1,abc,blr|chn|dlh,500|600|700
1,abc,NA,500|600|700
2,xyz,hyd|blr,888|999
2,xyz,NA,111|222|333
3,mno,blr|hyd,111|222|333
CUST_ID,Name,Place,acc_no

EDIT: sort -n will bring the header to line#1...