Dynamically checking rules in database tables using SQL

I want to check for rows in a table where all values (except the key) is empty. I am using MySQL 5.5.
I plan to do this mechanically, so the approach should work for any table in my database schema.

Suppose for illustration purposes I start with the following table:

CREATE TABLE `sources` (
  `source` varchar(3) NOT NULL,
  `description` varchar(24) DEFAULT NULL,
  `pricedata` varchar(24) DEFAULT NULL,
  `tablename` varchar(24) DEFAULT NULL,
  `reference` varchar(64) DEFAULT NULL,
  PRIMARY KEY (`source`)
) DEFAULT CHARSET=latin1;

Then my initial steps are:

USE information_schema;
SET @myTable = 'sources';
PREPARE statement FROM "SELECT GROUP_CONCAT(column_name SEPARATOR ', ') AS columns FROM columns WHERE table_name = @myTable AND column_key = ' '";
EXECUTE statement;

This yields:

+----------------------------------------------+
| columns                                      |
+----------------------------------------------+
| description, pricedata, tablename, reference |
+----------------------------------------------+

How do I use this result to be able to construct and execute a statement as follows?:

SELECT * FROM sources WHERE [description] IS NULL AND [pricedata] IS NULL AND [tablename] IS NULL AND [reference] IS NULL;

That's obviously pseudo-code for illustration purposes, but the idea is that the approach should work for all tables.

I tried something as follows:

SELECT CONCAT(SELECT * FROM @myTable WHERE ', GROUP_CONCAT(column_name SEPARATOR ' IS NULL AND '), ' IS NULL;) FROM columns WHERE table_name = @myTable;

but that was a little too clever and resulted in an error message. Within the CONCAT a nested prepare-statement might work, but the syntax evades me.

So, how can I dynamically check for rows having all nulls?

mysql>
mysql> --
mysql> select column_name
    ->   from columns
    ->  where table_name = 'sources'
    ->    and column_key = '';
+-------------+
| column_name |
+-------------+
| description |
| pricedata   |
| tablename   |
| reference   |
+-------------+
4 rows in set (0.04 sec)

mysql>
mysql>
mysql> --
mysql> select concat ('select * from sources where ', substring(group_concat(concat('AND ',column_name,' IS NULL') separator ' '),5), ';') as x
    ->   from columns
    ->  where table_name = 'sources'
    ->    and column_key = '';
+--------------------------------------------------------------------------------------------------------------------+
| x                                                                                                                  |
+--------------------------------------------------------------------------------------------------------------------+
| select * from sources where description IS NULL AND pricedata IS NULL AND tablename IS NULL AND reference IS NULL; |
+--------------------------------------------------------------------------------------------------------------------+
1 row in set (0.01 sec)

mysql>
mysql>
mysql>
mysql> --
mysql> -- Use a variable instead of hard-coding the table name
mysql> --
mysql> set @x = 'sources';
Query OK, 0 rows affected (0.00 sec)

mysql>
mysql>
mysql> --
mysql> select concat ('select * from ', @x, ' where ', substring(group_concat(concat('AND ',column_name,' IS NULL') separator ' '),5), ';') as x
    ->   from columns
    ->  where table_name = @x
    ->    and column_key = '';
+--------------------------------------------------------------------------------------------------------------------+
| x                                                                                                                  |
+--------------------------------------------------------------------------------------------------------------------+
| select * from sources where description IS NULL AND pricedata IS NULL AND tablename IS NULL AND reference IS NULL; |
+--------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

mysql>
mysql>

---------- Post updated at 12:24 AM ---------- Previous update was at 12:15 AM ----------

Or another one:

mysql>
mysql> --
mysql> select concat ('select * from ', @x, ' where ', group_concat(column_name separator ' is null and '), ' is null;') as x
    ->   from columns
    ->  where table_name = @x
    ->    and column_key = '';
+--------------------------------------------------------------------------------------------------------------------+
| x                                                                                                                  |
+--------------------------------------------------------------------------------------------------------------------+
| select * from sources where description is null and pricedata is null and tablename is null and reference is null; |
+--------------------------------------------------------------------------------------------------------------------+
1 row in set (0.04 sec)

mysql>
mysql>

---------- Post updated at 12:53 AM ---------- Previous update was at 12:24 AM ----------

If you want to do something similar for a bunch of tables, then you could assign the table list to a variable and use it in your query.

Over here, I've generated relevant queries for a few tables of the Sakila database:

mysql>
mysql> --
mysql> set @set_of_tables = 'actor,city,country';
Query OK, 0 rows affected (0.00 sec)

mysql> set @schema = 'sakila';
Query OK, 0 rows affected (0.00 sec)

mysql>
mysql> --
mysql> select table_name, column_name, column_key
    ->   from columns
    ->  where find_in_set (table_name, @set_of_tables)
    ->    and table_schema = @schema;
+------------+-------------+------------+
| table_name | column_name | column_key |
+------------+-------------+------------+
| actor      | actor_id    | PRI        |
| actor      | first_name  |            |
| actor      | last_name   | MUL        |
| actor      | last_update |            |
| city       | city_id     | PRI        |
| city       | city        |            |
| city       | country_id  | MUL        |
| city       | last_update |            |
| country    | country_id  | PRI        |
| country    | country     |            |
| country    | last_update |            |
+------------+-------------+------------+
11 rows in set (0.04 sec)

mysql>
mysql> --
mysql> select concat ('select * from ', table_name, ' where ', group_concat(column_name separator ' is null and '), ' is null;') as x
    ->   from columns
    ->  where find_in_set (table_name, @set_of_tables)
    ->    and table_schema = @schema
    ->    and column_key != 'PRI'
    ->  group by table_name;
+---------------------------------------------------------------------------------------------+
| x                                                                                           |
+---------------------------------------------------------------------------------------------+
| select * from actor where first_name is null and last_name is null and last_update is null; |
| select * from city where city is null and country_id is null and last_update is null;       |
| select * from country where country is null and last_update is null;                        |
+---------------------------------------------------------------------------------------------+
3 rows in set (0.18 sec)

mysql>
mysql>

Thank you for your response. So how do I then execute the generated SELECT statements?

(1) Create a script to generate such queries.
(2) Pass the script to the mysql command and redirect the output to a file.
(3) And then pass the queries.sql containing the generated queries to the mysql command.

Yes, I was contemplating the export and then an import also. I see that more of a last resort option, because it involves writing to a file together with all its shortcomings.

The other solution we are investigating is through the use of cursors. This looks roughly as follows:

  • create a list of all tables in the schema
  • loop through these tables to create a list of all columns per table which are not keys
  • loop through these columns to pick up rows where all values are NULL

Disadvantage here is that it requires the use of a procedural language as well as being less portable to other database systems.

Are you saying it is impossible to do entirely within MySQL in (possibly nested) SQL statements?