looping and awk/sed help

I am pretty new to this, but imagine what I am trying to do is possible

iI am trying to make an automated DB comparison tool that selects all columns in all tables and compares them to the same thing in another DB.

anyway I have created 2 files to help with this

the first file is a distinct list of table_names (used in the first for loop)

the second file contains all the column_names for all tables
table_name column_name
table_1 column_1
table_1 column_2
table_2 column_1
...etc

#! /bin/sh

#for each table in tables_list get all column_names
for table in `cat tables_list`
do
        ctr=0
        echo "\ntable name is $table"
        for col in `grep $table columns_list | awk {'print $2'}`
        do
                ((ctr=ctr+1))
                echo "column $ctr name is $col"
        done
done

I am stuck here trying to get all the column_names in one long string so I can insert them into an SQL statement.

like this "col_1 , col_2, col_3 ..."

can anyone help me save the column_names like above?

another potential problem is that a few of the tables have over 300 columns!!

Thanks,
Zelp

get all the columns in a file and use this :-

cat column_list_file | tr '\n' ','

nawk -f selp.awk table_namesFile column_namesFile

here's zelp.awk:

BEGIN { SEPlist="," }
FNR==NR { tbl[$1]; next}
{ 
  if ( $1 in tbl )
     out[$1] = ($1 in out) ? out[$1] SEPlist $2 : $2
}
END {
   for ( i in out)
      print i, out
}

okay that gets the output I need, but I am not sure how to use it the way I need to.

I have a template file I created that contains the sql statements and placeholders for columns_names and table_names

select * from #@connection1
minus
select * from #@connection2;

select * from #@connection1
minus
select * from #@connection3;
...etc for 25 diff DB's

i want to insert the column_name list for all the *'s in the file
and I want to replace the table_name for all the #'s in the file

then I want to save 1 set of sql's for each table so I can run them and create a log of any diffs form all DB's

sorry for so many questions, I am still very new to awk/sed wasn't sure which to use or how, thanks so much.
Zelp

how do you map a 'tableName' to the 'connectionNumber'?
can you put that mapping in your 'table_namesFile' file like so:

table_1 connection1
table_2 connection2
table_3 connection3
BEGIN { SEPlist="," }
FNR==NR { tbl[$1]; next}
{ 
  if ( $1 in tbl )
     out[$1] = ($1 in out) ? out[$1] SEPlist $2 : $2
}
END {
   for ( i in out)
      printf("select %s from %s@connection1\n",  out, i)
}

I guess there is more explanation needed.
We have 25 DB's that have identical schema (the same table exists in all of them) but the data is different.

we are hoping to combine all of them into 1 DB, first I am trying to find all the differences between each table in each DB's to investigate them before we just put them all together.

so you see the template I have already contains all the connections in the SQLs, I just want to replace all the *'s and #'s and save that as a new sql named after the table all the sql's are for. So now the new sql created will select the same columns from the same tables in all 25 DB's for the comparison.

am I making any sense anymore?

maybe in the second portion instead of a print, I can do a replace of the table_names and column_names into a new file based on my template?

something like this?

BEGIN { 
    SEPlist="," 
    schemasN=split("connection1 connection2 connection3 connection4", schemasA, " ")
}
FNR==NR { tbl[$1]; next}
{ 
  if ( $1 in tbl )
     out[$1] = ($1 in out) ? out[$1] SEPlist $2 : $2
}
END {
   for(s=2; s <= schemasN; s++) 
     for ( i in out)
        printf("select %s from %s@%s\nminus\nselect %s from %s@%s\n",  out, i, schemasA[1], out, i, schemasA)
}

yes the select statements come out great, but not the connections

you see I have 25 DB's and I want to compare them each to each other, and I figure the best way is like this

1-2
2-1
2-3
3-2
3-4
4-3
etc...

but I also want to just compare 1 table at a time in each of the 25 DB, what gets created now contains all tables.

I guess I could just grep this output intto a seperate sql file for each table_name?

Thanks so much for all the help so far,
Zelp

ok, so do you want to permute amoung ALL the 25 tables and create a 'minus' selects for ALL of them?
Say you have 3 tables: t1, t2, t3
Do you need 6 permutations like so ?:
t1
minus
t2

t1
minus
t3

t2
minus
t1

t2
minus
t3

t3
minus
t1

t3
minus
t2

FYI: I've also fixed the script in the last posting - you may try it now.

here's the version with the permutations - you might want to improve the permutations part of as it is not the most efficient:

BEGIN {
    SEPlist=","
    schemasN=split("connection1 connection2 connection3 connection4", schemasA, " ")

  for(i=1; i <= schemasN; i++)
    for(j=1; j <= schemasN; j++)
      if ( i != j ) {
        perm[schemasA SUBSEP schemasA[j]]
        perm[schemasA[j] SUBSEP schemasA]
      }

}
FNR==NR { tbl[$1]; next}
{
  if ( $1 in tbl )
     out[$1] = ($1 in out) ? out[$1] SEPlist $2 : $2
}
END {
     for ( permI in perm ) {
       split(permI, from_toA, SUBSEP)
       for ( i in out) {
          outFile = i ".sql"
          printf("select %s from %s@%s\nminus\nselect %s from %s@%s\n\n#-----\n\n",  out, i, from_toA[1], out, i, from_toA[2] ) >> outFile
        }
     }
}

WOW, you did it!

Thanks a ton.

now I can just awk that into a file, then grep the resulting file per each table_name and save that to individual SQLs per table.

you can do this without using grep - I've modified the last code posting.

Thanks again, you are really on the ball!

Hi Zelp,

F.Y.I ,Oracle has already created a comparison tool between schemas and databases.
You can invoke the module "Change Manager" from the OEM (Oracle Enterprise Manager) and run any comparison you want.

Best regards,
Nir