Motivation
There has been some discussions over at meta discourse lately about PostgresSQL 10 and 12 indexes being corrupt. We have yet to see this in practice using Discourse; however, it is certainly a good idea check indexes and verify they are not corrupt, especially before doing a backup and restore.
The amcheck
module provides functions that allow you to verify the logical consistency of the structure of PostgreSQL indexes
Installing amcheck
In this Discourse configuration, we have both web-app and data containers, so we will install amcheck
in the data container.
# ./launcher enter data
# apt install postgresql-10-amcheck
Verify the Indexes
We need to create the amcheck_next
extension (one time only) and then run the query:
# su postgres
$ psql discourse
psql (10.12 (Debian 10.12-2.pgdg100+1))
Type "help" for help.
discourse=# CREATE EXTENSION amcheck_next;
CREATE EXTENSION
discourse=#
Now that the initial setup is done, we can run this query:
discourse=# SELECT bt_index_check(c.oid), c.relname, c.relpages
discourse-# FROM pg_index i
discourse-# JOIN pg_opclass op ON i.indclass[0] = op.oid
discourse-# JOIN pg_am am ON op.opcmethod = am.oid
discourse-# JOIN pg_class c ON i.indexrelid = c.oid
discourse-# JOIN pg_namespace n ON c.relnamespace = n.oid
discourse-# WHERE am.amname = 'btree' AND n.nspname = 'pg_catalog'
discourse-# -- Don't check temp tables, which may be from another session:
discourse-# AND c.relpersistence != 't'
discourse-# -- Function may throw an error when this is omitted:
discourse-# AND i.indisready AND i.indisvalid
discourse-# ORDER BY c.relpages DESC LIMIT 10;
bt_index_check | relname | relpages
----------------+-----------------------------------+----------
| pg_depend_depender_index | 150
| pg_attribute_relid_attnam_index | 147
| pg_depend_reference_index | 123
| pg_attribute_relid_attnum_index | 101
| pg_class_tblspc_relfilenode_index | 42
| pg_class_relname_nsp_index | 38
| pg_proc_proname_args_nsp_index | 38
| pg_class_oid_index | 27
| pg_description_o_c_o_index | 24
| pg_type_typname_nsp_index | 20
This Discourse PostgreSQL 10 query shows a session that performs verification of catalog indexes. Verification of the presence of heap tuples as index tuples is requested for unique indexes only.
As we can see, no error is raised, all indexes tested appear to be logically consistent.
The Query (Copy & Paste)
SELECT bt_index_check(index => c.oid, heapallindexed => i.indisunique),
c.relname,
c.relpages
FROM pg_index i
JOIN pg_opclass op ON i.indclass[0] = op.oid
JOIN pg_am am ON op.opcmethod = am.oid
JOIN pg_class c ON i.indexrelid = c.oid
JOIN pg_namespace n ON c.relnamespace = n.oid
WHERE am.amname = 'btree' AND n.nspname = 'pg_catalog'
-- Don't check temp tables, which may be from another session:
AND c.relpersistence != 't'
-- Function may throw an error when this is omitted:
AND c.relkind = 'i' AND i.indisready AND i.indisvalid
ORDER BY c.relpages DESC LIMIT 10;
Summary
No error concerning corruption raised by amcheck
should ever be a false positive. In practice, amcheck
is more likely to find software bugs than problems with hardware. amcheck
raises errors in the event of conditions that, by definition, should never happen, and so careful analysis of amcheck
errors is often required.
Repairing Postgres Index Corruption
There is no general method of repairing problems that amcheck
detects.
An explanation for the root cause of an invariant violation should be sought. pageinspect may play a useful role in diagnosing corruption that amcheck
detects.
A REINDEX
may not be effective in repairing corruption, but you can try and run the above query again.
Final Thoughts
I have performed this procedure on two Discourse installations, both running PostgreSQL 10 with a "two container" configuration and a single container configuration. The process went well in both instances. I recommend anyone who in concerned about corrupt PostgreSQL indexes consider this procedure.
https://access.crunchydata.com/documentation/amcheck-next/1.5/