Short answer: No.
Detailed explanation: Semi-sync SYNC status does not guarantee the old master is replication-consistent with the cluster after a crash or shutdown.
Known issues:
What semi-sync guarantees: No client applications have seen transactions that didn't reach a replica, but the master's binary log may contain additional events not yet replicated.
Impact: In heavy write scenarios, crashed masters often require re-provisioning from another node rather than rejoining the cluster.
Recommendation: Keep rpl_semi_sync_master_wait_point = AFTER_COMMIT (the MariaDB default) for a safer client-visible view of transactions, accepting that it may leave more transactions in the old master's binary log after a crash.
Reference: /pages/07.howto/01.replication-best-practice/docs.md:44
Problem: Rejoining slaves during switchover can fail when expire_logs_days is set and the cluster has gone an extended period without writes.
Cause: Binary logs are automatically purged based on expire_logs_days, which may remove logs needed for slave rejoin after the cluster has been idle.
Related bug: MDEV-10869
Solution:
Increase the expire_logs_days value to retain logs longer
Use binlog_expire_logs_seconds (MariaDB 10.6+) for finer control
Workaround: If switchover fails, you may need to re-provision affected slaves from the new master.
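The retention arithmetic behind this issue can be sketched in a few lines of Python. This is a simplified model, not how replication-manager or the server decides anything: it assumes the binary logs a slave needs are roughly as old as the idle period, and the function names are hypothetical.

```python
from datetime import timedelta

SECONDS_PER_DAY = 86_400


def days_to_expire_seconds(days: float) -> int:
    """Convert an expire_logs_days value to the equivalent number of seconds."""
    return int(round(days * SECONDS_PER_DAY))


def rejoin_at_risk(idle: timedelta, expire_logs_days: float) -> bool:
    """Return True when the idle period is at least as long as the retention window.

    Assumption: the logs a slave needs are about as old as the idle period.
    A value of 0 disables automatic purging, so expiry poses no risk.
    """
    if expire_logs_days <= 0:
        return False
    return idle.total_seconds() >= days_to_expire_seconds(expire_logs_days)
```

For example, `rejoin_at_risk(timedelta(days=12), 10)` flags a cluster that has been idle for 12 days with a 10-day retention, which suggests raising the retention before attempting a switchover.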
Reference: Current FAQ
Parameter: rpl_semi_sync_master_wait_point
AFTER_COMMIT (recommended):
AFTER_SYNC:
Recommendation: Use AFTER_COMMIT for a safer client experience.
Reference: /pages/07.howto/01.replication-best-practice/docs.md:50
Problem: Applications with SUPER privileges can write to a read-only master during switchover.
Cause: MariaDB does not have MySQL's super_read_only protection. The READ_ONLY flag does not block SUPER users from writing.
Related bug: MDEV-9458
Impact: During switchover:
READ_ONLY
FLUSH TABLES WITH READ LOCK
Mitigation:
READ_ONLY slave
max_connections during switchover to limit queued connections
Best practice: Don't grant SUPER privileges to application users.
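One way to audit the best practice is to scan the output of SHOW GRANTS for application accounts. The sketch below is a simplification: it only looks at global (ON *.*) grants, treats ALL PRIVILEGES as including SUPER, and does not consider finer-grained privileges that newer server versions may offer. The function name is hypothetical.

```python
import re

# Captures the privilege list of a global (ON *.*) GRANT line as printed by SHOW GRANTS.
_GLOBAL_GRANT = re.compile(
    r"^GRANT\s+(?P<privs>.+?)\s+ON\s+\*\.\*\s+TO\s", re.IGNORECASE
)


def has_super(grant_lines):
    """Return True if any global grant line carries SUPER or ALL PRIVILEGES."""
    for line in grant_lines:
        match = _GLOBAL_GRANT.match(line.strip())
        if not match:
            continue  # database- or table-level grant: cannot carry SUPER
        privs = {p.strip().upper() for p in match.group("privs").split(",")}
        if "SUPER" in privs or "ALL PRIVILEGES" in privs:
            return True
    return False
```

Feeding it the lines returned by `SHOW GRANTS FOR 'app'@'%'` gives a quick yes/no answer per account before a switchover.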
Reference: Current FAQ
Problem: MySQL server hangs during shutdown when using GTID with autocommit=0 and super_read_only=ON.
Affected versions:
Fixed in:
Cause: The transaction that saves GTIDs to the mysql.gtid_executed table fails because super_read_only=ON prevents the update. With autocommit=0, that transaction never completes, which blocks shutdown.
Solution: Upgrade to MySQL 5.7.25/8.0.14 or later.
Workaround (if upgrade not possible): Set autocommit=1 or avoid super_read_only on slaves.
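The conditions above can be combined into a small pre-flight check. This is a sketch under a stated assumption: because the list of affected versions is not given here, it treats every 5.7 or 8.0 release older than the fixed version as potentially affected. The function names are hypothetical.

```python
from typing import Optional

# First releases containing the fix, per the "Solution" line above.
FIXED_IN = {(5, 7): (5, 7, 25), (8, 0): (8, 0, 14)}


def parse_version(text: str):
    """Parse strings such as '5.7.24-log' into a (major, minor, patch) tuple."""
    core = text.split("-", 1)[0]
    major, minor, patch = (int(part) for part in core.split(".")[:3])
    return major, minor, patch


def may_hit_shutdown_hang(version: str, autocommit: bool,
                          super_read_only: bool, gtid_enabled: bool) -> Optional[bool]:
    """Return True/False for the 5.7 and 8.0 series, or None for any other series."""
    v = parse_version(version)
    fixed = FIXED_IN.get(v[:2])
    if fixed is None:
        return None  # series not covered by this note: make no claim
    lacks_fix = v < fixed  # assumption: all older releases in the series are affected
    return bool(lacks_fix and gtid_enabled and not autocommit and super_read_only)
```

A `True` result means the server matches every condition described in the cause, so either the upgrade or the workaround applies.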
Bug reference: Bug #28183718
Reference: Current FAQ
Problem: Semi-sync timeout causes workload changes and increased failover risk.
Behavior: When rpl_semi_sync_master_timeout (default: 10 seconds) is reached:
Impact before timeout: Semi-sync slows workload to network replication speed, creating backpressure on writes.
Impact after timeout:
Monitoring: replication-manager tracks "In Sync" status and SLA metrics to determine when safe failover windows exist.
Reference: /pages/07.howto/01.replication-best-practice/docs.md:46
Problem: Relay slaves cannot automatically reconnect in multi-tier replication when their intermediate master fails.
Cause: replication-manager does not automatically manage relay node failures in multi-tier topologies.
Limitation: If you have:
Master → Relay → Slave
and the Relay node dies, the Slave cannot automatically reconnect to the Master.
Workaround 1: Manually repoint slaves to the new topology after relay node failure.
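A manual repoint usually comes down to three statements run on the orphaned slave. The sketch below only builds those statements as strings. It assumes MariaDB GTID replication (MASTER_USE_GTID=slave_pos) and that replication credentials are already configured on the slave. The helper name and the host-name validation rule are illustrative, not part of replication-manager.

```python
import re

# Conservative host-name check so the value can be embedded in a SQL string safely.
_HOST_RE = re.compile(r"^[A-Za-z0-9._-]+$")


def repoint_statements(master_host: str, master_port: int = 3306):
    """Build the statements that point a slave at a new master using GTID."""
    if not _HOST_RE.match(master_host):
        raise ValueError(f"unexpected characters in host name: {master_host!r}")
    port = int(master_port)
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {master_port!r}")
    return [
        "STOP SLAVE",
        f"CHANGE MASTER TO MASTER_HOST='{master_host}', "
        f"MASTER_PORT={port}, MASTER_USE_GTID=slave_pos",
        "START SLAVE",
    ]
```

Calling `repoint_statements("db-master", 3306)` returns the statements in execution order, which can be reviewed before running them on each slave that sat behind the failed relay.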
Workaround 2 (recommended): Use multi-domain child clusters instead of multi-tier replication. replication-manager supports defining each level of the tree as its own master-slave cluster, where a slave of the parent cluster is the master of the child cluster. Each cluster has independent failover management, so a failure at any level is handled automatically within that cluster.
[parent-cluster]                   [child-cluster]
Master → Slave1 (= child Master) → ChildSlave1
         Slave2                    ChildSlave2
Limitation: Child cluster replication does not support replication filtering (replicate-do-db, replicate-ignore-table, etc.) — all databases and tables are replicated to the child level.
Design consideration: Multi-domain child clusters provide fully automated HA at every level of the tree, at the cost of no filtering. Multi-tier relay topologies allow filtering but require manual intervention when relay nodes fail.
Reference: /pages/05.configuration/05.replication/docs.md
Restriction: Do not use server-id = 1000 on any database node in your cluster.
Reason: replication-manager reserves server-id = 1000 for binlog server operations during crash recovery.
Impact: Using server-id 1000 in your cluster will cause:
Solution: Use any server-id except 1000. Common practice is sequential IDs: 3306, 3307, 3308, etc.
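A quick sanity check over a cluster's server-ids can catch the reserved value before deployment. This is a minimal sketch: the host-to-id mapping is a hypothetical input format, and the duplicate check is an extra added here because server-ids must also be unique within a replication topology.

```python
RESERVED_SERVER_ID = 1000  # reserved by replication-manager, per the text above


def check_server_ids(server_ids):
    """Return a list of problems found in a {host: server_id} mapping."""
    problems = []
    seen = {}
    for host, sid in server_ids.items():
        if sid == RESERVED_SERVER_ID:
            problems.append(f"{host}: server-id {sid} is reserved by replication-manager")
        if sid in seen:
            problems.append(f"{host}: server-id {sid} duplicates {seen[sid]}")
        else:
            seen[sid] = host
    return problems
```

An empty list means the mapping passes both checks, for example `check_server_ids({"db1": 3306, "db2": 3307})`.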
Reference: /pages/05.configuration/03.failover/02.crash-recovery/docs.md