Character collations determine the sort order and classification of characters. When creating a database with initdb, PostgreSQL normally sets the collation based on the operating system’s locale settings, but other special collations, such as “C,” “POSIX,” and “ucs_basic,” are available as alternatives.

On Linux systems, updates to glibc can bring changes to collation rules. Normally, these updates are small, but they can lead to index corruption and need attention when planning upgrades and updates. PostgreSQL can be configured to use collation rules from different sources. When updating PostgreSQL and the operating system you run PostgreSQL on, you need to be alerted for changes to collation rules. 

This blog covers the pg_upgrade and lc_collate mismatch issues and solutions that might appear while running pg_upgrade

pg_upgrade error for lc_collate

  pg_upgrade check fails with the below error:

lc_collate values for database “postgres” do not match

 Example:

Here, the lc_collate/locale value might be different in the message depending on the environment setup.

The root cause of the pg_upgrade error

When the default collation of the old and new PostgreSQL databases does not match, the pg_upgrade consistency check will fail, and the message lc_collate values for the database “postgres” do not match.

Here is an example of a locale issue.

 On source PostgreSQL version:

A problem appears if we install a new PostgreSQL version and initialize the data directory using initdb. The information from the OS will be used to set the locale.

Solution

To fix this, reinitialized the new version of PGDATA Directory with the same –encoding and –locale as the original PostgreSQL cluster by explicitly specifying it as the command-line argument for initdb as shown in the below example.

Furthermore, there is no issue with pg_upgrade:

Start a new version of PostgreSQL, verify the encoding, and collate:

Encoding and collate look the same for all databases as it was in the old version of PostgreSQL.

Points to remember for pg_upgrade and lc_collate mismatch issues

  • It is always preferable to explicitly mention the coalition to be used while initializing a data directory, avoiding leaving that to the system defaults.
  • Binary collations (C/POSIX) are preferred over other collations because they stabilize the collation rules and avoid future troubles.
  • Binary collation also has performance benefits in addition to collation stability.

Our services and software are designed to make running high-performance, highly available PostgreSQL in critical environments practically effortless, enhancing PostgreSQL with security, high availability, and performance features that are all certified and tested to work seamlessly together.

 

Learn Why Customers Choose Percona for PostgreSQL

Subscribe
Notify of
guest

0 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments