Tuesday 14 October 2008

High Availability System (P.A.S.S) - Database Mirroring - SQL Server 2008

Whats is database mirroring ?

Database mirroring maintains a hot standby database (known as the mirror database) that can quickly assume client connections in the event of a principal database outage.

Database mirroring involves two copies of a single database that reside on different computers. At any given time, only one copy of the database is available to clients. This copy is known as the principal database. Unlike log shipping which works by applying full transaction log backups to a warm standby database, database mirroring works by transferring and applying a stream of database log records from the principal database to the other copy of the database (the mirror database).

Database mirroring applies every database modification that is made on the principal database to the mirror database. This includes data changes as well as changes to the physical and logical structures in the database, such as database files, tables, and indexes.

Following are some basic database mirroring terms.

Principal: In a database mirroring configuration, there are two copies of a single database, but only one copy is accessible to the clients at any given time. The copy of the database that the applications connect to is called the principal database. The server that hosts the principal database is known as the principal server.

Mirror: The mirror is the copy of the principal database. The mirror is always in a restoring state; it is not accessible to the applications. To keep this database up-to-date, the log records are transferred from the principal and applied on the mirror database. The server that hosts the mirror database is known as the mirror server.

Witness: The optional witness is an SQL Server instance in a database mirroring configuration. It is separate from the principal and mirror instances. When database mirroring is used in synchronous mode, the witness provides a mechanism for automatic failover.

Send Queue: While sending the log records from the principal to the mirror, if the log records can’t be sent at the rate at which they are generated, a queue builds up at the principal. This is known as the send queue. The send queue does not use extra storage or memory. It exists entirely in the transaction log of the principal. It refers to the part of the log that has not yet been sent to the mirror.

Redo Queue: While applying log records on the mirror, if the log records can’t be applied at the rate at which they are received, a queue builds up at the mirror. This is known as the redo queue. Like the send queue, the redo queue does not use extra storage or memory. It exists entirely in the transaction log of the mirror. It refers to the part of the hardened log that remains to be applied to the mirror database to roll it forward.

Endpoint: An endpoint is a SQL Server object that enables SQL Server to communicate over the network. It encapsulates a transport protocol and a port number.

Failover: When the principal database (or the server hosting it) fails, database mirroring provides a mechanism to fail over to the mirror database.


Some important aspects of database mirroring are:
• The granularity of database mirroring is a database. Mirroring is configured for one database at a time. The whole instance is not mirrored.

• Two copies of the same database are involved in database mirroring, but only one database is accessible to the applications at any given time. You can create a snapshot on the mirror and use it for read-only purposes (a good solution for reporting requirements). However, you cannot directly access the mirror database or back up the mirror database.
• You cannot mirror the master, msdb, temp, or model databases.
• Database mirroring requires that the database use the full recovery model. You can’t use the simple or bulk-logged recovery models.
SQL Server 2005/2008 allows only one mirror database for each principal database.
• One instance can serve as the principal for one database, the mirror for another database, and the witness for yet another database.
• Multiple databases in an instance can be mirrored.
• Applications that connect to a database with ADO.NET or the SQL Native Client (SNAC) can automatically redirect connections when the database fails over to the mirror.
• Data between the principal and the mirror is transferred encrypted by default.
• A database which is mirrored to another server can also be the source database for a log-shipping scenario

Network bandwidth can be a bottleneck for mirrored databases where the rate of transaction log generation is high enough to saturate the network. On synchronously mirrored databases (where a transaction cannot commit on the principal until all the transaction log for it has been hardened in the log of the mirror database), this can even lead to workload throughput degradation.

Log compression which is a new feature of SQL Server 2008 is a way to fix this problem. The downside of this is that compression is not free - extra CPU must be used. The obvious CPU load increases comes from having to compress the transaction log on the principal and then decompress it again on the mirror before hardening it in the log. The non-obvious increase will come because the principal and mirror should end up being able to process more transactions/second, which means more CPU is needed.

This extra CPU could be a problem for some systems that are already pegged in terms of CPU load, and so they may actually see a drop in performance when log stream compression is enabled.

Apart from the obvious benefits of this feature for systems that can afford the extra CPU, it may allow more mirrored databases to perform effective database maintenance - in the form of fragmentation control and index maintenance. When database mirroring is enabled, the database has to be running in the full recovery mode. This means that operations such as index rebuilds are fully logged - that's a lot of extra log being generated if you're used to switching to bulk-logged recovery mode to perform index operations. Of course, this isn't an issue for reorganizing an index (with ALTER INDEX ... REORGANIZE) as this has been fully logged (and has to be that way because of how it works)

Anyway, for some people the fact that these two operations are fully logged means that running them produces so much additional log that, in conjunction with the log from the workload itself, the network becomes a bottleneck for mirroring and it slows down both the maintenance and reduces the workload throughput. The addition of log stream compression means that for these systems, the log may be compressed enough so that the network is no longer a bottleneck and more regular fragmentation control can take place. This in turn will increase workload throughput as well - a double benefit!

No comments: