Hello friends,
In Managed Services, we have the opportunity to work with many different technologies and topologies, which makes the job fascinating and challenging at the same time.
This time, I’m going to tell you about a particular case: a client with a dual-primary topology plus a replica, as detailed below:
PS-primary-1=192.168.0.14 [RW]
|___ PS-primary-2=192.168.0.59 [RW] (Slave_delay: 0)
     |___ PS-replica-1=192.168.0.99 [R] (Slave_delay: 0)
[RW] means Read/Write access.
[R] means Read Only access.
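(If you want to double-check a topology like this yourself, a minimal sketch, in the same one-liner style used throughout this post and assuming credentials in .my.cnf, is to loop over the nodes and grep the relevant replication fields:)

$ for SERVE in 192.168.0.14 192.168.0.59 192.168.0.99; do echo ; echo $SERVE ; mysql -h $SERVE -e "SHOW SLAVE STATUS\G" | grep -E "Master_Host|Slave_IO_Running|Slave_SQL_Running|Seconds_Behind_Master"; done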
We received an alert of this kind on PS-primary-2:
Last_SQL_Errno: 1032
Last_SQL_Error: Could not execute Update_rows event on table foo.persons; Can't find record in 'persons', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log mysql-bin.000360, end_log_pos 58177
Seconds later, we noticed that someone (not us) had resolved the replication issue, possibly by skipping the error in some way (e.g., with pt-slave-restart) or by simply executing something like:
mysql> STOP SLAVE; SET GLOBAL SQL_SLAVE_SKIP_COUNTER=1; START SLAVE;
Or by injecting an empty transaction.
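(With GTID-based replication, SQL_SLAVE_SKIP_COUNTER does not apply; the equivalent trick is to commit an empty transaction in place of the failing event. A sketch, with a purely hypothetical GTID value:)

STOP SLAVE;
SET GTID_NEXT='aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee:1234';  -- hypothetical: the GTID of the failing event
BEGIN;
COMMIT;  -- the empty transaction takes the place of the offending event
SET GTID_NEXT='AUTOMATIC';
START SLAVE;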
(Disclaimer: We won’t deeply analyze the root cause here, but rather the surprise we found during the analysis.)
Having seen this, the most advisable action is to run pt-table-checksum, and if any differences appear, use pt-table-sync to fix the data drift.
Therefore, we executed pt-table-checksum as follows:
$ pt-table-checksum h=192.168.0.14 --port=3306 --no-check-binlog-format --no-check-replication-filters --replicate percona.checksums --recursion-method=hosts --max-load Threads_running=50 --max-lag=100 --databases=foo --tables=persons
Checking if all tables can be checksummed ...
Starting checksum ...
            TS ERRORS  DIFFS     ROWS  DIFF_ROWS  CHUNKS SKIPPED    TIME TABLE
07-28T12:40:31      0      1        6          1       1       0   0.193 foo.persons
(Second disclaimer: We have populated the table with a small amount of data to simplify and improve visibility in this case. With larger volumes, the same scenario also applies.)
(For security reasons, credentials are stored in the file /home/user/.my.cnf, which is why you don’t see them in any of the executed commands. If you want to know how to create it, here’s an example.)
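(A minimal sketch of such a file; the user name matches the one seen later in the pt-table-sync output, and the password is a placeholder:)

# /home/user/.my.cnf -- keep it readable only by its owner (chmod 600)
[client]
user=percona
password=<your_password_here>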
At a glance, there seem to be differences. Let’s check the other primary instance and the replica as well.
PS-primary-2> SELECT max(ts), db, tbl, SUM(this_cnt) AS total_rows, COUNT(*) AS chunks
           -> FROM percona.checksums
           -> WHERE (master_cnt <> this_cnt OR master_crc <> this_crc OR ISNULL(master_crc) <> ISNULL(this_crc))
           -> GROUP BY db, tbl;
+---------------------+-----+---------+------------+--------+
| max(ts)             | db  | tbl     | total_rows | chunks |
+---------------------+-----+---------+------------+--------+
| 2023-07-28 08:40:31 | foo | persons |          5 |      1 |
+---------------------+-----+---------+------------+--------+
1 row in set (0.00 sec)

PS-replica-1> SELECT max(ts), db, tbl, SUM(this_cnt) AS total_rows, COUNT(*) AS chunks
           -> FROM percona.checksums
           -> WHERE (master_cnt <> this_cnt OR master_crc <> this_crc OR ISNULL(master_crc) <> ISNULL(this_crc))
           -> GROUP BY db, tbl;
+---------------------+-----+---------+------------+--------+
| max(ts)             | db  | tbl     | total_rows | chunks |
+---------------------+-----+---------+------------+--------+
| 2023-07-28 09:53:10 | foo | persons |          4 |      1 |
+---------------------+-----+---------+------------+--------+
1 row in set (0.00 sec)
The next step is to execute pt-table-sync with the --print option to review what the utility will actually do. The following output appears:
$ pt-table-sync --print h=192.168.0.59 --sync-to-master --database=foo --table=persons --replicate percona.checksums
DELETE FROM `foo`.`persons` WHERE `personid`='3' LIMIT 1 /*percona-toolkit src_db:foo src_tbl:persons src_dsn:P=3306,h=192.168.0.14 dst_db:foo dst_tbl:persons dst_dsn:h=192.168.0.59 lock:1 transaction:1 changing_src:percona.checksums replicate:percona.checksums bidirectional:0 pid:1519 user:percona host:localhost.localdomain*/;
REPLACE INTO `foo`.`persons`(`personid`, `lastname`, `firstname`, `address`, `city`, `insert_timestamp`) VALUES ('1', 'Iglesias', 'Par', 'IND', 'Jaipur', '2023-07-27 07:10:10') /*percona-toolkit src_db:foo src_tbl:persons src_dsn:P=3306,h=192.168.0.14 dst_db:foo dst_tbl:persons dst_dsn:h=192.168.0.59 lock:1 transaction:1 changing_src:percona.checksums replicate:percona.checksums bidirectional:0 pid:1519 user:percona host:localhost.localdomain*/;
REPLACE INTO `foo`.`persons`(`personid`, `lastname`, `firstname`, `address`, `city`, `insert_timestamp`) VALUES ('7', 'Ewd', 'Woo', 'CAN', 'Vancouver', '2023-07-28 07:10:10') /*percona-toolkit src_db:foo src_tbl:persons src_dsn:P=3306,h=192.168.0.14 dst_db:foo dst_tbl:persons dst_dsn:h=192.168.0.59 lock:1 transaction:1 changing_src:percona.checksums replicate:percona.checksums bidirectional:0 pid:1519 user:percona host:localhost.localdomain*/;
A DELETE statement has appeared. How is that possible?
As this is a dual-primary topology, it is advisable to perform a checksum on the other primary server as well, which is PS-primary-2. This will help identify any discrepancies and ensure data consistency between both primary servers.
$ pt-table-checksum h=192.168.0.59 --port=3306 --no-check-binlog-format --no-check-replication-filters --replicate percona.checksums --recursion-method=hosts --max-load Threads_running=50 --max-lag=100 --databases=foo --tables=persons
Checking if all tables can be checksummed ...
Starting checksum ...
            TS ERRORS  DIFFS     ROWS  DIFF_ROWS  CHUNKS SKIPPED    TIME TABLE
07-28T13:08:12      0      1        5          1       1       0   0.147 foo.persons
That’s when we noticed that the row counts do not match anywhere: one primary has six rows, the other five, and the replica only four. How is that possible? Let’s review:
PS-primary-1 (none)> SELECT max(ts), db, tbl, SUM(this_cnt) AS total_rows, COUNT(*) AS chunks
                  -> FROM percona.checksums
                  -> WHERE (master_cnt <> this_cnt OR master_crc <> this_crc OR ISNULL(master_crc) <> ISNULL(this_crc))
                  -> GROUP BY db, tbl;
+---------------------+-----+---------+------------+--------+
| max(ts)             | db  | tbl     | total_rows | chunks |
+---------------------+-----+---------+------------+--------+
| 2023-07-28 10:03:00 | foo | persons |          6 |      1 |
+---------------------+-----+---------+------------+--------+

PS-replica-1 (none)> SELECT max(ts), db, tbl, SUM(this_cnt) AS total_rows, COUNT(*) AS chunks
                  -> FROM percona.checksums
                  -> WHERE (master_cnt <> this_cnt OR master_crc <> this_crc OR ISNULL(master_crc) <> ISNULL(this_crc))
                  -> GROUP BY db, tbl;
+---------------------+-----+---------+------------+--------+
| max(ts)             | db  | tbl     | total_rows | chunks |
+---------------------+-----+---------+------------+--------+
| 2023-07-28 10:03:00 | foo | persons |          6 |      1 |
+---------------------+-----+---------+------------+--------+
Let’s see what data the table currently contains:
$ for SERVE in 192.168.0.14 192.168.0.59 192.168.0.99; do echo ; echo $SERVE ; mysql -h $SERVE -e "select * from foo.persons"; done

192.168.0.14
+----------+-----------+-----------+---------+------------+---------------------+
| PersonID | LastName  | FirstName | Address | City       | Insert_timestamp    |
+----------+-----------+-----------+---------+------------+---------------------+
|        1 | Iglesias  | Par       | IND     | Jaipur     | 2023-07-27 07:10:10 |
|        2 | Rooster   | War       | IND     | Tamil Nadu | 2023-07-27 07:10:10 |
|        4 | SoS       | Syl       | CAN     | Ontario    | 2023-07-27 07:10:10 |
|        5 | Al-lot    | Pat       | IND     | Delhi      | 2023-07-27 07:10:10 |
|        6 | The Crazy | Vai       | IND     | Vadodara   | 2023-07-27 07:10:10 |
|        7 | Ewd       | Woo       | CAN     | Vancouver  | 2023-07-28 07:10:10 |
+----------+-----------+-----------+---------+------------+---------------------+

192.168.0.59
+----------+-----------+-----------+---------+------------+---------------------+
| PersonID | LastName  | FirstName | Address | City       | Insert_timestamp    |
+----------+-----------+-----------+---------+------------+---------------------+
|        2 | Rooster   | War       | IND     | Tamil Nadu | 2023-07-27 07:10:10 |
|        3 | Colonel   | Vil       | PER     | Lima       | 2023-07-27 07:10:10 |
|        4 | SoS       | Syl       | CAN     | Ontario    | 2023-07-27 07:10:10 |
|        5 | Al-lot    | Pat       | IND     | Delhi      | 2023-07-27 07:10:10 |
|        6 | The Crazy | Vai       | IND     | Vadodara   | 2023-07-27 07:10:10 |
+----------+-----------+-----------+---------+------------+---------------------+

192.168.0.99
+----------+-----------+-----------+---------+------------+---------------------+
| PersonID | LastName  | FirstName | Address | City       | Insert_timestamp    |
+----------+-----------+-----------+---------+------------+---------------------+
|        2 | Rooster   | War       | IND     | Tamil Nadu | 2023-07-27 07:10:10 |
|        4 | SoS       | Syl       | CAN     | Ontario    | 2023-07-27 07:10:10 |
|        5 | Al-lot    | Pat       | IND     | Delhi      | 2023-07-27 07:10:10 |
|        6 | The Crazy | Vai       | IND     | Vadodara   | 2023-07-27 07:10:10 |
+----------+-----------+-----------+---------+------------+---------------------+
As you can see, there is data drift everywhere.
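With a table this small, the drift is visible at a glance; on real-world volumes, a quick way to pin down exactly which primary-key values differ between hosts is to dump the keys and diff them. A sketch along the lines of the loop above (file paths are just an example):

$ for SERVE in 192.168.0.14 192.168.0.59 192.168.0.99; do mysql -h $SERVE -N -e "SELECT PersonID FROM foo.persons ORDER BY PersonID" > /tmp/ids_$SERVE.txt; done
$ diff /tmp/ids_192.168.0.14.txt /tmp/ids_192.168.0.59.txt   # shows the IDs present on one primary but not the other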
And how can this be possible? Data drift and the integrity of replica data are complex topics. Let me outline some of the most common ways it can happen (generally speaking); this is by no means an exhaustive list:
- A statement was executed on a primary with SET SESSION sql_log_bin = OFF
- A statement was executed directly on the replica:
  - This can happen if the replica was not in super_read_only and a SUPER user executed a write
  - This can happen if the replica was not in read_only
- A statement was executed on a replica, and the replica was later promoted to a primary without GTID in place
- A primary server is not configured for full ACID compliance, and it crashed
- At some point, the primary was not configured for row-based replication (even briefly)
These are the most common ways that data drift occurs, but certainly not the only ways that it can happen. More exotic cases can involve bugs, engine differences, version differences, etc.
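To make the first cause concrete: a single privileged session can silently create a row that never reaches the binlog, and therefore never reaches the other nodes. A minimal sketch (this is one plausible way the drift for ID 7 could have been introduced, not a confirmed root cause):

-- on one primary only; requires a privileged session
SET SESSION sql_log_bin = OFF;
INSERT INTO foo.persons (PersonID, LastName, FirstName, Address, City, Insert_timestamp)
VALUES (7, 'Ewd', 'Woo', 'CAN', 'Vancouver', '2023-07-28 07:10:10');  -- this row now exists only on this node
SET SESSION sql_log_bin = ON;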
Coming back to the issue: if we run pt-table-sync in the opposite direction, this time taking PS-primary-2 as the source of truth, we see the following result:
$ pt-table-sync --print h=192.168.0.14 --sync-to-master --database=foo --table=persons --replicate percona.checksums
DELETE FROM `foo`.`persons` WHERE `personid`='1' LIMIT 1 /*percona-toolkit src_db:foo src_tbl:persons src_dsn:P=3306,h=192.168.0.59 dst_db:foo dst_tbl:persons dst_dsn:h=192.168.0.14 lock:1 transaction:1 changing_src:percona.checksums replicate:percona.checksums bidirectional:0 pid:5732 user:percona host:localhost.localdomain*/;
DELETE FROM `foo`.`persons` WHERE `personid`='7' LIMIT 1 /*percona-toolkit src_db:foo src_tbl:persons src_dsn:P=3306,h=192.168.0.59 dst_db:foo dst_tbl:persons dst_dsn:h=192.168.0.14 lock:1 transaction:1 changing_src:percona.checksums replicate:percona.checksums bidirectional:0 pid:5732 user:percona host:localhost.localdomain*/;
REPLACE INTO `foo`.`persons`(`personid`, `lastname`, `firstname`, `address`, `city`, `insert_timestamp`) VALUES ('3', 'Colonel', 'Vil', 'PER', 'Lima', '2023-07-27 07:10:10') /*percona-toolkit src_db:foo src_tbl:persons src_dsn:P=3306,h=192.168.0.59 dst_db:foo dst_tbl:persons dst_dsn:h=192.168.0.14 lock:1 transaction:1 changing_src:percona.checksums replicate:percona.checksums bidirectional:0 pid:5732 user:percona host:localhost.localdomain*/;
Here, the output is no more reassuring: it suggests deleting the rows with IDs 1 and 7 and performing a REPLACE on ID 3.
Once again, note that it asks us to delete rows, which is not acceptable here. So, what to do? After a discussion with the customer (we shared with them everything we had gathered), we agreed on the following:
- We need to add the row with ID=3 to the instances where it’s missing.
- We need to add the row with ID=1 to the instances where it’s missing.
- We need to add the row with ID=7 to the instances where it’s missing.
- We DO NOT need to delete any rows; no DELETES are necessary.
Important note: The client informed us that the rows that exist on only one instance are all necessary, and they verified that the actions we proposed were correct. Remember: as DBAs, we are the guardians of the data and its integrity, but not necessarily its owners, and for ethical reasons we should not even assume knowledge of its contents.
How do we achieve this, then?
Unfortunately, in this case, using pt-table-sync is not recommended since, as we saw with the --print option, it would perform DELETEs no matter which primary instance we consider to be the source of truth.
We also cannot simply execute the REPLACE statements on PS-primary-1 and let them replicate, because the REPLACE command follows a specific logic, explained here:
MySQL uses the following algorithm for REPLACE (and LOAD DATA … REPLACE):
- Try to insert the new row into the table
- While the insertion fails because a duplicate-key error occurs for a primary key or unique index:
  - Delete from the table the conflicting row that has the duplicate key value
  - Try again to insert the new row into the table
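In other words, when the key already exists, REPLACE is a DELETE plus an INSERT, which the affected-rows count makes visible. A tiny self-contained sketch (the demo table is hypothetical):

CREATE TABLE foo.replace_demo (id INT PRIMARY KEY, v VARCHAR(10));
INSERT INTO foo.replace_demo VALUES (1, 'old');   -- "1 row affected": a plain insert
REPLACE INTO foo.replace_demo VALUES (1, 'new');  -- "2 rows affected": delete of the old row + insert of the new one
SELECT * FROM foo.replace_demo;                   -- returns (1, 'new')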
That means that if we execute a REPLACE on PS-primary-1 for a row that does not exist there but does exist on PS-primary-2 (the row with ID 3, in our case), the statement succeeds locally as a plain insert, but when it replicates as a row event, it breaks replication on PS-primary-2 with a duplicate key error, since the row already exists there. I’ll show you below:
PS-primary-1 > REPLACE INTO `foo`.`persons`(`personid`, `lastname`, `firstname`, `address`, `city`, `insert_timestamp`) VALUES ('3', 'Colonel', 'Vil', 'PER', 'Lima', '2023-07-27 09:20:13');
Query OK, 1 row affected (0.01 sec)
We connected to PS-primary-2, and indeed, we found the duplicate key error:
PS-primary-2 (none)> PAGER grep "Last_Err"; SHOW SLAVE STATUS\G
PAGER set to 'grep "Last_Err"'
                Last_Errno: 1062
                Last_Error: Could not execute Write_rows event on table foo.persons; Duplicate entry '3' for key 'PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log mysql-bin.000359, end_log_pos 26840
1 row in set (0.01 sec)
The commands we should execute would be the following:
set sql_log_bin=0;
REPLACE INTO `foo`.`persons`(`personid`, `lastname`, `firstname`, `address`, `city`, `insert_timestamp`) VALUES ('3', 'Colonel', 'Vil', 'PER', 'Lima', '2023-07-27 07:10:10');
REPLACE INTO `foo`.`persons`(`personid`, `lastname`, `firstname`, `address`, `city`, `insert_timestamp`) VALUES ('1', 'Iglesias', 'Par', 'IND', 'Jaipur', '2023-07-27 07:10:10');
REPLACE INTO `foo`.`persons`(`personid`, `lastname`, `firstname`, `address`, `city`, `insert_timestamp`) VALUES ('7', 'Ewd', 'Woo', 'CAN', 'Vancouver', '2023-07-28 07:10:10');
Note the set sql_log_bin=0 at the top: as we just observed, if these statements were allowed to replicate, the first one would break replication again with a duplicate key error. So, how do we resolve this? The safest option, in my humble opinion, is to execute the three REPLACE statements on each of the instances, always in a session where binary logging has first been disabled with that command:
PS-primary-1 > set sql_log_bin=0;
Query OK, 0 rows affected (0.00 sec)

PS-primary-1 > REPLACE INTO `foo`.`persons`(`personid`, `lastname`, `firstname`, `address`, `city`, `insert_timestamp`) VALUES ('3', 'Colonel', 'Vil', 'PER', 'Lima', '2023-07-27 07:10:10');
Query OK, 1 row affected (0.01 sec)

PS-primary-1 > REPLACE INTO `foo`.`persons`(`personid`, `lastname`, `firstname`, `address`, `city`, `insert_timestamp`) VALUES ('1', 'Iglesias', 'Par', 'IND', 'Jaipur', '2023-07-27 07:10:10');
Query OK, 1 row affected (0.01 sec)

PS-primary-1 > REPLACE INTO `foo`.`persons`(`personid`, `lastname`, `firstname`, `address`, `city`, `insert_timestamp`) VALUES ('7', 'Ewd', 'Woo', 'CAN', 'Vancouver', '2023-07-28 07:10:10');
Query OK, 1 row affected (0.01 sec)

PS-primary-1 > Bye

PS-primary-2 > set sql_log_bin=0;
Query OK, 0 rows affected (0.00 sec)

PS-primary-2 > REPLACE INTO `foo`.`persons`(`personid`, `lastname`, `firstname`, `address`, `city`, `insert_timestamp`) VALUES ('3', 'Colonel', 'Vil', 'PER', 'Lima', '2023-07-27 07:10:10');
Query OK, 1 row affected (0.00 sec)

PS-primary-2 > REPLACE INTO `foo`.`persons`(`personid`, `lastname`, `firstname`, `address`, `city`, `insert_timestamp`) VALUES ('1', 'Iglesias', 'Par', 'IND', 'Jaipur', '2023-07-27 07:10:10');
Query OK, 1 row affected (0.01 sec)

PS-primary-2 > REPLACE INTO `foo`.`persons`(`personid`, `lastname`, `firstname`, `address`, `city`, `insert_timestamp`) VALUES ('7', 'Ewd', 'Woo', 'CAN', 'Vancouver', '2023-07-28 07:10:10');
Query OK, 1 row affected (0.01 sec)

PS-primary-2 > Bye

PS-replica-1 > set sql_log_bin=0;
Query OK, 0 rows affected (0.00 sec)

PS-replica-1 > REPLACE INTO `foo`.`persons`(`personid`, `lastname`, `firstname`, `address`, `city`, `insert_timestamp`) VALUES ('3', 'Colonel', 'Vil', 'PER', 'Lima', '2023-07-27 07:10:10');
Query OK, 1 row affected (0.01 sec)

PS-replica-1 > REPLACE INTO `foo`.`persons`(`personid`, `lastname`, `firstname`, `address`, `city`, `insert_timestamp`) VALUES ('1', 'Iglesias', 'Par', 'IND', 'Jaipur', '2023-07-27 07:10:10');
Query OK, 1 row affected (0.01 sec)

PS-replica-1 > REPLACE INTO `foo`.`persons`(`personid`, `lastname`, `firstname`, `address`, `city`, `insert_timestamp`) VALUES ('7', 'Ewd', 'Woo', 'CAN', 'Vancouver', '2023-07-28 07:10:10');
Query OK, 1 row affected (0.01 sec)

PS-replica-1 > set global super_read_only=1; set global read_only=1;
Query OK, 0 rows affected (0.00 sec)
Finally, we verify that all instances have the same data, either with the SELECT statement or the pt-table-checksum command.
$ for SERVE in 192.168.0.14 192.168.0.59 192.168.0.99; do echo ; echo $SERVE ; mysql -h $SERVE -e "select * from foo.persons"; done

192.168.0.14
+----------+-----------+-----------+---------+------------+---------------------+
| PersonID | LastName  | FirstName | Address | City       | Insert_timestamp    |
+----------+-----------+-----------+---------+------------+---------------------+
|        1 | Iglesias  | Par       | IND     | Jaipur     | 2023-07-27 07:10:10 |
|        2 | Rooster   | War       | IND     | Tamil Nadu | 2023-07-27 07:10:10 |
|        3 | Colonel   | Vil       | PER     | Lima       | 2023-07-27 07:10:10 |
|        4 | SoS       | Syl       | CAN     | Ontario    | 2023-07-27 07:10:10 |
|        5 | Al-lot    | Pat       | IND     | Delhi      | 2023-07-27 07:10:10 |
|        6 | The Crazy | Vai       | IND     | Vadodara   | 2023-07-27 07:10:10 |
|        7 | Ewd       | Woo       | CAN     | Vancouver  | 2023-07-28 07:10:10 |
+----------+-----------+-----------+---------+------------+---------------------+

192.168.0.59
+----------+-----------+-----------+---------+------------+---------------------+
| PersonID | LastName  | FirstName | Address | City       | Insert_timestamp    |
+----------+-----------+-----------+---------+------------+---------------------+
|        1 | Iglesias  | Par       | IND     | Jaipur     | 2023-07-27 07:10:10 |
|        2 | Rooster   | War       | IND     | Tamil Nadu | 2023-07-27 07:10:10 |
|        3 | Colonel   | Vil       | PER     | Lima       | 2023-07-27 07:10:10 |
|        4 | SoS       | Syl       | CAN     | Ontario    | 2023-07-27 07:10:10 |
|        5 | Al-lot    | Pat       | IND     | Delhi      | 2023-07-27 07:10:10 |
|        6 | The Crazy | Vai       | IND     | Vadodara   | 2023-07-27 07:10:10 |
|        7 | Ewd       | Woo       | CAN     | Vancouver  | 2023-07-28 07:10:10 |
+----------+-----------+-----------+---------+------------+---------------------+

192.168.0.99
+----------+-----------+-----------+---------+------------+---------------------+
| PersonID | LastName  | FirstName | Address | City       | Insert_timestamp    |
+----------+-----------+-----------+---------+------------+---------------------+
|        1 | Iglesias  | Par       | IND     | Jaipur     | 2023-07-27 07:10:10 |
|        2 | Rooster   | War       | IND     | Tamil Nadu | 2023-07-27 07:10:10 |
|        3 | Colonel   | Vil       | PER     | Lima       | 2023-07-27 07:10:10 |
|        4 | SoS       | Syl       | CAN     | Ontario    | 2023-07-27 07:10:10 |
|        5 | Al-lot    | Pat       | IND     | Delhi      | 2023-07-27 07:10:10 |
|        6 | The Crazy | Vai       | IND     | Vadodara   | 2023-07-27 07:10:10 |
|        7 | Ewd       | Woo       | CAN     | Vancouver  | 2023-07-28 07:10:10 |
+----------+-----------+-----------+---------+------------+---------------------+

$ pt-table-checksum h=192.168.0.14 --port=3306 --no-check-binlog-format --no-check-replication-filters --replicate percona.checksums --recursion-method=hosts --max-load Threads_running=50 --max-lag=100 --databases=foo --tables=persons
Checking if all tables can be checksummed ...
Starting checksum ...
            TS ERRORS  DIFFS     ROWS  DIFF_ROWS  CHUNKS SKIPPED    TIME TABLE
07-28T14:12:32      0      0        7          0       1       0   0.185 foo.persons

$ pt-table-checksum h=192.168.0.59 --port=3306 --no-check-binlog-format --no-check-replication-filters --replicate percona.checksums --recursion-method=hosts --max-load Threads_running=50 --max-lag=100 --databases=foo --tables=persons
Checking if all tables can be checksummed ...
Starting checksum ...
            TS ERRORS  DIFFS     ROWS  DIFF_ROWS  CHUNKS SKIPPED    TIME TABLE
07-28T14:12:50      0      0        7          0       1       0   0.148 foo.persons
Conclusion
Maintaining this type of topology (dual primaries) can be a headache, as it is prone to exactly this kind of problem, and resolving such problems is not always as simple as it may seem. The pt-table-checksum and pt-table-sync tools are essential for detecting and fixing these issues in most cases, and for verifying the result efficiently, as they were here.
Remember that if you have any questions or concerns, you can always contact us.
Lastly, don’t forget to run pt-table-sync with the --print option, and to verify with the client or data owner that the data to be corrected and the proposed corrections are appropriate. Always.
Let’s stay synchronized, and I’ll see you in the next blog!