Greetings Experts,
As part of automating the sql generation, I have the source table name, target table name, join condition stored in a file join_conditions.txt which is a delimited file (I can edit the file if for any reason). The reason I needed to store is I have built SELECT list without FROM clause according to source and target tables taken from another file and stored it in an array and now I need to build the complete sql query of select list and the from list;
Src_tbl Tgt_tbl join condition
Src_1 @Tgt_1 @FROM ( SELECT SRC_1.* FROM SRC_1 INNER JOIN SRC_TEMP_1 ON
SRC_1.COL1=SRC_TEMP_1.COL1) SRC JOIN (SELECT TGT_1.* FROM TGT_1
INNER JOIN TGT_TEMP_1 ON TGT_1.COL1=TGT_TEMP_1.COL1) TGT ON
SRC.COL1=TGT.COL1;
Src_2 @Tgt_2 ...................
Please note that I have aligned the 3rd column LPAD with spaces for better view; the provided join condition column can span multiple lines;
Similarly there are corresponding join conditions for other source and target tables in the file which is not mentioned here.
awk -F "@" 'BEGIN{RS=";"}
{
join_cond=$3
a[$1 OFS $2]=join_cond
}
END {
for (i in a) {
print a
}
}' < join_conditions.txt > sql_query.txt
Expected assignment:
FROM ( SELECT SRC_1.* FROM SRC_1 INNER JOIN SRC_TEMP_1 ON
SRC_1.COL1=SRC_TEMP_1.COL1) SRC JOIN (SELECT TGT_1.* FROM TGT_1
INNER JOIN TGT_TEMP_1 ON TGT_1.COL1=TGT_TEMP_1.COL1) TGT ON
SRC.COL1=TGT.COL1;
Current assignment:
FROM ( SELECT SRC_1.* FROM SRC_1 INNER JOIN SRC_TEMP_1 ON
Even though I specified the RS=; I think it is still considering the \n as record separator; How can I assign the multi-line content present in
the join_conditions.txt to awk variable jon_cond for frame the sql query; I am struck up with this as I have no clue on how to proceed;
I can break the file manually to contain the join_condition with source and target repeating multiple times as
Src_tbl1 Tgt_tbl1 join condition
Src_1 Tgt_1 FROM ( SELECT SRC_1.* FROM SRC_1 INNER JOIN
Src_1 Tgt_1 SRC_TEMP_1 ON SRC_1.COL1=SRC_TEMP_1.COL1) SRC
Src_1 Tgt_1 JOIN (SELECT TGT_1.* FROM TGT_1 INNER JOIN TGT_TEMP_1 ON
Src_1 Tgt_1 TGT_1.COL1=TGT_TEMP_1.COL1) TGT ON SRC.COL1=TGT.COL1;
and then carry out the required operation. But I assume that there is more elegant approach
Many thanks for your time..
---------- Post updated at 11:27 AM ---------- Previous update was at 10:52 AM ----------
Hi RavinderSingh,
I have a input delimited file whose layout is
Src_tbl@Tgt_tbl@join_condition
Src_1@Tgt_1@FROM ( SELECT SRC_1.* FROM SRC_1 INNER JOIN SRC_TEMP_1 ON SRC_1.COL1=SRC_TEMP_1.COL1) SRC JOIN (SELECT TGT_1.* FROM TGT_1 INNER JOIN TGT_TEMP_1 ON TGT_1.COL1=TGT_TEMP_1.COL1) TGT ON SRC.COL1=TGT.COL1;
The data in join_condition can span multiple lines here.
I need to get the data in the join_condition column and assign it to an array for further operations;
awk -F "@" ' BEGIN{RS=";"}
{
a[$1 OFS $2]=$3
print a[$1 OFS $2]
}'
Expected Output:
FROM ( SELECT SRC_1.* FROM SRC_1 INNER JOIN SRC_TEMP_1 ON SRC_1.COL1=SRC_TEMP_1.COL1) SRC JOIN (SELECT TGT_1.* FROM TGT_1 INNER JOIN TGT_TEMP_1 ON TGT_1.COL1=TGT_TEMP_1.COL1) TGT ON SRC.COL1=TGT.COL1;
Output what I am seeing:
FROM ( SELECT SRC_1.* FROM SRC_1 INNER JOIN
How to get the expected output here please.