Wednesday, 1 October 2008

High Availability System (P.A.S.S) - Hardware Methods

SQL Server Database Solutions P.A.S.S Checklist

There are many ways to design a database solution for SQL Server and I try to make sure that I stick to my P.A.S.S checklist
  • High Performance
  • High Availability
  • High Security
  • High Scalability
In the following articles, I will go through my checklist to show you how you can use it within your database designs and implementations.

High Availability

Some people may judge that a high availability system is a system that never has any unscheduled downtime. This can be a mistake as even a scheduled downtime can make your system a non-available system.

To judge if a database solution is a highly available system, you should look at it from a users point a view. A user should be able to do what ever he/she wants to do any time of the day and if that’s not the case, then your system in not a highly available solution and should be redesigned for any potential problems that could occur.

Other things can make your system non-available, from index rebuilding, backing up the database to locking of tables. All these potential problems should be thought of when designing a database solution to make your users more productive.


How can you make your database system, highly available?

One method is to understand your environment and what is meant by the recovery models of a SQL Server database. The longer it takes to recover a database from a failover, the more unavailable your system is. Recovery models can impact on how long your database solution can take to recover.

Another method of making your system more available is to have disk redundancy and placing your databases file on RAID which can also shorten the recovery time.

Other methods to make your system more available is using one or more of the following solutions

Hardware Methods
* Remote Mirroring
* Failover Clustering

Software Methods
* Database Mirroring
* Log Shipping
* Peer to Peer replication

Each one of the have pro and cons and I will try to explain the benefits of each

Remote Mirroring





Remote mirroring is the method of mirroring your database to another SAN (Storage Area Network). This is a good idea if you have two business locations and you want to make sure that if one building is affected by a unspecified problem, the second database at another location would be able to takeover. What happens is that two or more SANS are connected and through this connection, they can make sure that they are in sync. Depending on hardware being used, there are different ways for this method to be implemented.

As remote mirroring can be expensive, you should only use this technique if your organisation uses the same hardware throughout your organisation as each remote mirroring solution is different between vendors. Also you need a dedicated dark fibre connection between the SANS as DSL shared connections will not do.
A negative feature about remote mirroring is that there is not an automatic failover process, so you would have to manual recover the secondary database when a failure occurs.


Failover Clustering

The main features of Failover Clustering are :-

  • Windows Feature
  • Prevents Data Loss
  • Automatic Client Redirection
  • Requires Specialised Hardware (Certified Clustering Software)
  • Not as fast as Database Mirroring
  • Can be geographically dispensed

The way failover clusting works is that you have nodes (SQL Servers) that all connect using a shared hard drive(Shared Disk Array). and on the shared disk array, each node would be given its own resources to be the owners of

i.e.

Node A is the Owner of Drive D
Node B is the Owner of Drive E

Only one node can be the owner of one resource at anytime.

Each node listens to the other node via a heartbeat and if one node detects that the other node, is not functioning, the functioning node would take over the ownership of the non functioning node. This way the functioning node will be able to bring back the SQL Server instance back online.

i.e
Node B is not functioning
Node A is the Owner of Drive D
Node A is the Owner of Drive E

All uncommitted transactions are rolled back and can be restarted on the new server.

The good thing about failover clustering is that if a failure happens on a Node, the other nodes can easily take ownership and business tasks can continue.

Future parts will contain information on
* Database Mirroring
* Log Shipping

* Peer to Peer replication


No comments: