Database server clustering

Posted Posted by Patrice Guay in Web hosting     Comments 7 comments
Mar
23

database server clusterA cluster is a group of two or more independent servers operating as a single system. A database server cluster could help you achieve high availability or improve I/O throughput for your database service. In this article, we discuss the different approaches to database clustering for MySQL and MSSQL.

MySQL DRBD

The MySQL database service can be transformed into an highly available service with DRBD. This software technology, only available under Linux, mirrors data of a service from the active node to the standby node of a highly available cluster. DRBD works on top of block devices (e.g. hard disk partitions, LVM’s logical volumes). It mirrors each data block that it is written to disk to the standby node.

Mirroring can be done in a tightly coupled fashion called protocol C (synchronous mirroring). In this mode, the file system on the active node is notified that the writing of the block was finished only when the block is written to both disks of the cluster. It is the good choice for highly available clusters when losing a single transaction in case of a complete crash of the active node must be avoided. The other available option is asynchronous mirroring. In this mode, the file system on the active node is notified that the writing of the block was finished as soon as the data is written to the local disk. It is the good choice when building mirrors over long distances.

The diagram below presents a solution formed by a web server cluster and a MySQL DRBD cluster connected on a local network. Both MySQL server nodes are inter-connected by a DRBD network link. This network interface is used to synchronize data block between the servers and avoid private network congestion.

A consequence of mirroring data at the block device level is that you can only access your data on the active node. This is caused by the nature of most Linux file systems (ext3, ext4, XFS, JFS, etc). These file systems do not support multiple servers accessing the same shared disk for read and write operations. Their design only allow one server accessing each disk. As a consequence, the performance of a MySQL DRBD cluster is roughly equal to the performance of a single MySQL server. DRBD provides high availability but no performance enhancement for a MySQL database service.

MySQL replication

A cluster composed of two or more MySQL servers can also be created using a redundant load balancing solution and MySQL database replication feature.

MySQL master/master replication

A cluster with MySQL master/master replication consists of two MySQL servers in an active/active mode. The load balancing appliance must define two separate IP addresses:

  • Write operation IP
  • Read operation IP

Write operations to the database must be performed on a single MySQL server to prevent data corruption. A server is defined as the primary server in the load balancing appliance configuration. The second server is configured as a backup. In case the primary server fails, write operations will be directed automatically to the second server. The data contained in each server database is synchronized between them using replication. Read operations can be performed on both MySQL servers.

In a master/master replication setup, both servers are on an equal footing. If one server fails, the second can take the relay without having to change the configuration of the web servers. For write operations, such a configuration provides performance equal to that of a single server. For read operations, the theoretical performance of such a cluster is two times higher than that of a single server. However, to achieve higher performance during read operations, the web application must be coded to make a distinction between SQL write and read requests.

MySQL master/slave replication

A cluster with MySQL master/slave replication consists of two MySQL servers in an active/active mode. The load balancing appliance must define two separate IP addresses:

  • Write operation IP
  • Read operation IP

Write operations to the database must be performed on the master server exclusively. The data contained in the slave server is synchronized with the master server. Read operations can be performed on both MySQL servers.

In a master/slave replication setup, only the master server is able to properly manage the write operations. In case of failure of the master server, the web application will stop working properly. For write operations, such a configuration provides performance equal to that of a single server. For read operations, the theoretical performance of such a cluster is two times higher than that of a single server. However, to achieve higher performance during read operations, the web application must be coded to make a distinction between SQL write and read requests.

Comparing master/master and master/slave replications shows there is no apparent advantage to use a master/slave configuration. However, while a master/master replication consists of two servers, the master/slave replication can be formed of several servers with one master server and several slave server. Therefore, a master/slave configuration is interesting when the number of slave servers is greater than one.

MSSQL high availability cluster

A Microsoft Windows cluster consists of two servers in an active/passive configuration, at least one domain controller (DC) and a data storage solution. The diagram below presents a MSSQL cluster formed by two database servers, two redundant domain controller servers and an iSCSI SAN storage solution. The iSCSI SAN is connected to the database servers with multipath connections to provide redundant connectivity links.

This Microsoft Windows based setup provides high availability to your web solution. If the active server stops working, the passive server will take the relay. The performance of the database cluster depends mainly on the following elements:

  • database server CPU
  • SAN disks and RAID configuration
  • SAN data caching mechanism
  • communication link between the servers and the SAN

MSSQL replication

A cluster composed of two or more MSSQL servers can also be created using a redundant load balancing solution and MSSQL database replication feature.

MSSQL master/master replication

A cluster with MSSQL master/master replication consists of two MSSQL servers in an active/active mode. The load balancing appliance must define two separate IP addresses:

  • Write operation IP
  • Read operation IP

Write operations to the database must be performed on a single MSSQL server to prevent data corruption. A server is defined as the primary server in the load balancing appliance configuration. The second server is configured as a backup. In case the primary server fails, write operations will be directed automatically to the second server. The data contained in each server database is synchronized between them using the transactional publication with updatable subscriptions mechanism. Read operations can be performed on both MSSQL servers.

The transactional publication with updatable subscriptions mechanism adds a column named msrepl_tran_version on each published table to keep track of transactions. All SQL INSERT commands of your application must be adapted to accommodate this change.

In a master/master replication setup, both servers are on an equal footing. If one server fails, the second can take the relay without having to change the configuration of the web servers. For write operations, such a configuration provides performance equal to that of a single server. For read operations, the theoretical performance of such a cluster is two times higher than that of a single server. However, to achieve higher performance during read operations, the web application must be coded to make a distinction between SQL write and read requests.

MSSQL master/slave replication

A cluster with MSSQL master/slave replication consists of two MSSQL servers in an active/active mode. The load balancing appliance must define two separate IP addresses:

  • Write operation IP
  • Read operation IP

Write operations to the database must be performed on the master server exclusively. The data contained in the slave server is synchronized with the master server using the transactional publication mechanism. Read operations can be performed on both MSSQL servers.

In a master/slave replication setup, only the master server is able to properly manage the write operations. In case of failure of the master server, the web application will stop working properly. For write operations, such a configuration provides performance equal to that of a single server. For read operations, the theoretical performance of such a cluster is two times higher than that of a single server. However, to achieve higher performance during read operations, the web application must be coded to make a distinction between SQL write and read requests.

Comparing master/master and master/slave replications shows there is no apparent advantage to use a master/slave configuration except the requirement to use modified SQL INSERT statements in master/master mode. However, while a master/master replication consists of two servers, the master/slave replication can be formed of several servers with one master server and several slave server. Therefore, a master/slave configuration is interesting when the number of slave servers is greater than one.

7 Comments to “Database server clustering”

  • The “transactional publication with updatable subscriptions” mecanism will be removed in a future version of Microsoft SQL Server. Avoid using this feature in new development work, and plan to modify applications that currently use this feature. The “Peer-to-Peer Transactional Replication” mecanism, available in Microsoft SQL Server 2008 Enterprise and above, is recommended.

    http://msdn.microsoft.com/en-us/library/ms151718.aspx

    http://msdn.microsoft.com/en-us/library/ms151196.aspx

  • Le mécanisme de “publication transactionnelle avec des abonnements qui peuvent être mis à jour” sera supprimé dans une prochaine version de Microsoft SQL Server. Évitez d’utiliser cette fonctionnalité dans de nouveaux travaux de développement et prévoyez de modifier les applications qui utilisent actuellement cette fonctionnalité. Le mécanisme “Réplication transactionnelle d’égal à égal”, disponible dans Microsoft SQL Server 2008 Enterprise et plus récent, est recommandé.

    http://msdn.microsoft.com/fr-fr/library/ms151718.aspx

    http://msdn.microsoft.com/fr-fr/library/ms151196.aspx

  • […] are many many ways in which you could cluster a database. MySQL and MS SQL Server have their way of clustering while No SQL databases like MongoDB do it slightly differently. The important thing to understand […]

  • […] are many many ways in which you could cluster a database. MySQL and MS SQL Server have their way of clustering while No SQL databases like MongoDB do it slightly differently. The important thing to understand […]

  • The ebook “Web scaling vol. I” could be a good starting point for you. See this blog article for more details: http://www.patriceguay.com/webhosting/web-scaling-vol1-ebook

  • […] Database server clustering « Patrice Guay, Eng. – A cluster is a group of two or more independent servers operating as a single system. A database server cluster could help you achieve high availability or improve I … […]

  • Hello
    I am dba I would use the iSCSI solution for duplicating backups (.bak file) using robocopy
    Is this solution is useful for transferring files of size 25 GB by the network
    thank you in advance

Post comment

Advertisements