I am using a function to replace the values dynamically to frame sql query by reading a file. My file will have column names like
file.txt
col_1
col_2
expected output:
select id,col_1,col_2 from (
select a.id,
a.col_1,
rank() over (ORDER BY cast(a.col_1 AS double) DESC) AS col_1_rank,
a.col_2,
rank() over (ORDER BY cast(a.col_2 AS double) DESC) AS col_2_rank
from table_name a )t
join
(
SELECT count(case when col_1='.' then null else 1 end) AS col_1_count,
count(case when col_2='.' then null else 1 end) AS col_2_count
FROM table_name
) f
where t.col_1_rank < 0.0001*f.col_1_count or t.col_2_rank < 0.0001*f.col_2_count;
where ever i am using col_1 and col_2 that should be read from the file and get replaced , there is a possibility that more columns will be added in file.txt but the query structure remains the same
sql_query {
while read -r col do
printf 'select id,%s,%s from (
select a.id,
a.%s,
rank() over (ORDER BY cast(a.%s AS double) DESC) AS %s_rank,
a.%s,
rank() over (ORDER BY cast(a.%s AS double) DESC) AS %s_rank
from table_name a )t
join
(
SELECT count(case when %s='.' then null else 1 end) AS %s_count,
count(case when %s='.' then null else 1 end) AS %s_count
FROM table_name
) f
where t.%s_rank < 0.0001*f.%s_count or t.%s_rank < 0.0001*f.%s_count;'
done < $1 # this will be file.txt as parameter
}
Yes the number of columns in the file can increase but the structure of the query remains the same and it gets added for e.g. if col_3 is there in file.txt. Then below will be the output
select id,col_1,col_2,col_3 from (
select a.id,
a.col_1,
a.col_2,
a.col_3,
rank() over (ORDER BY cast(a.col_1 AS double) DESC) AS col_1_rank,
rank() over (ORDER BY cast(a.col_2 AS double) DESC) AS col_2_rank,
rank() over (ORDER BY cast(a.col_3 AS double) DESC) AS col_3_rank,
from table_name a )t
join
(
SELECT count(case when col_1='.' then null else 1 end) AS col_1_count,
count(case when col_2='.' then null else 1 end) AS col_2_count,
count(case when col_3='.' then null else 1 end) AS col_3_count
FROM table_name
) f
where t.col_1_rank < 0.0001*f.col_1_count or t.col_2_rank < 0.0001*f.col_2_count or t.col_3_rank < 0.0001*f.col_3_count ;