The Definitive Guide To Scaling Out SQL Server 2005
Scaling Out
SQL Server 2005
Don Jones
Introduction to Realtimepublishers
by Sean Daily, Series Editor
The book you are about to enjoy represents an entirely new modality of publishing and a major
first in the industry. The founding concept behind Realtimepublishers.com is the idea of
providing readers with high-quality books about today’s most critical technology topics—at no
cost to the reader. Although this feat may sound difficult to achieve, it is made possible through
the vision and generosity of a corporate sponsor who agrees to bear the book’s production
expenses and host the book on its Web site for the benefit of its Web site visitors.
It should be pointed out that the free nature of these publications does not in any way diminish
their quality. Without reservation, I can tell you that the book that you’re now reading is the
equivalent of any similar printed book you might find at your local bookstore—with the notable
exception that it won’t cost you $30 to $80. The Realtimepublishers publishing model also
provides other significant benefits. For example, the electronic nature of this book makes
activities such as chapter updates and additions or the release of a new edition possible in a far
shorter timeframe than is the case with conventional printed books. Because we publish our titles
in “real-time”—that is, as chapters are written or revised by the author—you benefit from
receiving the information immediately rather than having to wait months or years to receive a
complete product.
Finally, I’d like to note that our books are by no means paid advertisements for the sponsor.
Realtimepublishers is an independent publishing company and maintains, by written agreement
with the sponsor, 100 percent editorial control over the content of our titles. It is my opinion that
this system of content delivery not only is of immeasurable value to readers but also will hold a
significant place in the future of publishing.
As the founder of Realtimepublishers, my raison d’être is to create “dream team” projects—that
is, to locate and work only with the industry’s leading authors and sponsors, and publish books
that help readers do their everyday jobs. To that end, I encourage and welcome your feedback on
this or any other book in the Realtimepublishers.com series. If you would like to submit a
comment, question, or suggestion, please send an email to [email protected],
leave feedback on our Web site at http://www.realtimepublishers.com, or call us at 800-509-
0532 ext. 110.
Thanks for reading, and enjoy!
Sean Daily
Founder & Series Editor
Realtimepublishers.com, Inc.
Table of Contents
Introduction to Realtimepublishers.................................................................................................. i
Chapter 1: An Introduction to Scaling Out ......................................................................................1
What Is Scaling Out? .......................................................................................................................1
Why Scale Out Databases? ..............................................................................................................3
Microsoft SQL Server and Scaling Out ...............................................................................4
SQL Server 2005 Editions .......................................................................................6
General Technologies ..............................................................................................6
General Scale-Out Strategies ...............................................................................................7
SQL Server Farms....................................................................................................7
Distributed Partitioned Databases............................................................................9
Scale-Out Techniques ........................................................................................................10
Distributed Partitioned Views................................................................................11
Distributed Partitioned Databases and Replication ..............................13
Windows Clustering...............................................................................................13
High-Performance Storage.....................................................................................14
Hurdles to Scaling Out Database Solutions ...................................................................................14
Database Hurdles ...............................................................................................................19
Manageability Hurdles.......................................................................................................19
Server Hardware: Specialized Solutions........................................................................................19
Comparing Server Hardware and Scale-Out Solutions .................................................................21
Categorize the Choices ......................................................................................................21
Price/Performance Benchmarks.........................................................................................22
Identify the Scale-Out Solution .........................................................................................23
Calculate a Total Solution Price ........................................................................................23
Take Advantage of Evaluation Periods..............................................................................23
Industry Benchmarks Overview ....................................................................................................24
Summary ........................................................................................................................................25
Chapter 2: Scaling Out vs. Better Efficiency.................................................................................26
Addressing Database Design Issues...............................................................................................26
Logically Partitioning Databases .......................................................................................31
Addressing Bottlenecks Through Application Design ..................................................................34
Minimize Data Transfer.....................................................................................................34
Avoid Triggers and Use Stored Procedures.......................................................................35
Real-World Testing............................................................................................................69
Benchmarking ....................................................................................................................70
Summary ........................................................................................................................................70
Chapter 4: Distributed Partitioned Views ......................................................................................72
Pros and Cons ................................................................................................................................72
Distributed Partitioned View Basics..................................................................................74
Distributed Partitioned View Details .................................................................................75
Design and Implementation ...........................................................................................................81
Linked Servers ...................................................................................................................81
Partitioned Tables ..............................................................................................................87
The Distributed Partitioned View ......................................................................................88
Checking Your Results ......................................................................................................88
Best Practices .................................................................................................................................89
Grouping Data....................................................................................................................89
Infrastructure......................................................................................................................90
Database Options ...............................................................................................................90
Queries and Table Design..................................................................................................90
Sample Benchmark Walkthrough ..................................................................................................91
Sample Benchmark ............................................................................................................91
Conducting a Benchmark...................................................................................................93
Summary ........................................................................................................................................94
Chapter 5: Distributed and Partitioned Databases .........................................................................95
Pros and Cons ................................................................................................................................95
Distributed Databases ........................................................................................................95
Partitioned Databases.........................................................................................................99
Design and Implementation .........................................................................................................103
Designing the Solution.....................................................................................................103
Distributed Databases ..........................................................................................103
Partitioned Databases...........................................................................................107
Implementing the Solution...............................................................................................108
Distributed Databases ..........................................................................................108
Partitioned Databases...........................................................................................113
Best Practices ...............................................................................................................................116
Benchmarks..................................................................................................................................117
Summary ......................................................................................................................................119
Chapter 6: Windows Clustering...................................................................................................120
Clustering Overview ....................................................................................................................120
Clustering Terminology ...................................................................................................120
How Clusters Work..........................................................................................................121
Cluster Startup .....................................................................................................122
Cluster Operations ...............................................................................................123
Cluster Failover....................................................................................................124
Active-Active Clusters.........................................................................................126
Clusters for High Availability......................................................................................................128
Clusters for Scaling Out...............................................................................................................128
Setting Up Clusters ......................................................................................................................132
SQL Server and Windows Clusters .............................................................................................137
Clustering Best Practices .............................................................................................................138
Optimizing SQL Server Cluster Performance .............................................................................140
Case Study ...................................................................................................................................140
Database Mirroring ......................................................................................................................142
Summary ......................................................................................................................................143
Chapter 7: Scale-Out and Manageability.....................................................................................144
Manageability Problems in a Scale-Out Environment.................................................................144
Monitoring .......................................................................................................................144
Maintenance.....................................................................................................................145
Management.....................................................................................................................147
Monitoring Solutions for Scale-Out.............................................................................................147
Microsoft Operations Manager ........................................................................................150
Third-Party Solutions.......................................................................................................152
Symantec Veritas i3 for SQL Server ....................................................................152
Unisys Application Sentinel for SQL Server.......................................................153
ManageEngine Applications Manager.................................................................153
Nimsoft NimBUS for Database Monitoring ........................................................155
NetIQ AppManager for SQL Server....................................................................155
Maintenance Solutions for Scale-Out ..........................................................................................157
Copyright Statement
© 2005 Realtimepublishers.com, Inc. All rights reserved. This site contains materials that
have been created, developed, or commissioned by, and published with the permission
of, Realtimepublishers.com, Inc. (the “Materials”) and this site and any such Materials are
protected by international copyright and trademark laws.
THE MATERIALS ARE PROVIDED “AS IS” WITHOUT WARRANTY OF ANY KIND,
EITHER EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE,
TITLE AND NON-INFRINGEMENT. The Materials are subject to change without notice
and do not represent a commitment on the part of Realtimepublishers.com, Inc. or its web
site sponsors. In no event shall Realtimepublishers.com, Inc. or its web site sponsors be
held liable for technical or editorial errors or omissions contained in the Materials,
including without limitation, for any direct, indirect, incidental, special, exemplary or
consequential damages whatsoever resulting from the use of any information contained
in the Materials.
The Materials (including but not limited to the text, images, audio, and/or video) may not
be copied, reproduced, republished, uploaded, posted, transmitted, or distributed in any
way, in whole or in part, except that one copy may be downloaded for your personal, non-
commercial use on a single computer. In connection with such use, you may not modify
or obscure any copyright or other proprietary notice.
The Materials may contain trademarks, service marks and logos that are the property of
third parties. You are not permitted to use these trademarks, service marks or logos
without prior written consent of such third parties.
Realtimepublishers.com and the Realtimepublishers logo are registered in the US Patent
& Trademark Office. All other product or service names are the property of their
respective owners.
If you have any questions about these terms, or if you would like information about
licensing materials from Realtimepublishers.com, please contact us via e-mail at
[email protected].
[Editor’s Note: This eBook was downloaded from Content Central. To download other eBooks
on this topic, please visit http://www.realtimepublishers.com/contentcentral/.]
Chapter 1: An Introduction to Scaling Out
Why is scaling out even necessary in a Web farm? After all, most Web servers are fairly
inexpensive machines that employ one or two processors and perhaps 2GB of RAM or so. Surely
one really pumped-up server could handle the work of three or four smaller ones for about the
same price? The reason is that individual servers can handle only a finite number of incoming
connections. Once a server is dealing with a couple of thousand Web requests (more or less,
depending on the server operating system (OS), hardware, and so forth), adding more RAM,
network adapters, or processors doesn't increase the server's capacity enough to meet the
demand. There is always a performance bottleneck, and it's generally the server's internal I/O
buses that move data between the disks, processors, and memory. That bottleneck is built
into the motherboard, leaving you few options other than scaling out. Figure 1.2 illustrates
the bottleneck, and how adding processors, disks, or memory—the features you can generally
upgrade easily—can only address the problem up to a point.
There is a limit to how much scaling up can improve an application’s performance and ability to
support more users. To take a simplified example, suppose you have a database that has the sole
function of performing a single, simple query. No joins, no real need for indexes, just a simple,
straightforward query. A beefy SQL Server computer—say, a quad-processor with 4GB of RAM
and many fast hard drives—could probably support tens of thousands of users who needed to
simultaneously execute that one query. However, if the server needed to support a million users,
it might not be able to manage the load. Scaling up wouldn’t improve the situation because the
simple query is as finely tuned as possible—you’ve reached the ceiling of scaling up, and it’s
time to turn to scaling out.
Scaling out is a much more complicated process than scaling up, and essentially requires you to
split a database into various pieces, then move the various pieces to independent SQL Server
computers. Grocery stores provide a good analogy for comparing scale up and scale out.
Suppose you drop by your local supermarket and load up a basket with picnic supplies for the
coming holiday weekend. Naturally, everyone else in town had the same idea, so the store’s a bit
crowded. Suppose, too, that the store has only a single checkout lane open. Very quickly, a line
of unhappy customers and their shopping carts would stretch to the back of the store.
One solution is to improve the efficiency of the checkout process: install faster barcode scanners,
require everyone to use a credit card instead of writing a check, and hire a cashier with the fastest
fingers in the world. These measures would doubtless improve conditions, but they wouldn’t
solve the problem. Customers would move through the line at a faster rate, but there is still only
the one line.
A better solution is to scale out by opening additional checkout lanes. Customers could now be
processed in parallel by completely independent lanes. In a true database scale-out scenario, you
might have one lane for customers purchasing 15 items or less, because that lane could focus on
the “low-hanging fruit” and process those customers quickly. Another lane might focus on
produce, which often takes longer to process because it has to be weighed in addition to being
scanned. An ideal, if unrealistic, solution might be to retain a single lane for each customer, but
to divide each customer’s purchases into categories to be handled by specialists: produce, meat,
boxed items, and so forth. Specialized cashiers could minimize their interactions with each other,
keeping the process moving speedily along. Although unworkable in a real grocery store, this
solution illustrates a real-world model for scaling out databases.
In Chapter 2, we’ll explore scale up versus scale out in more detail. Scaling up is often a prerequisite
for scaling out, so I’ll also provide some tips and best practices for fine-tuning your databases into
scaled-up (or scale-out-able) shape.
Scaling out is generally appropriate only for very large databases. For fairly small databases that
seem to need scaling out to improve performance, consider scaling up first: improve the hardware on
which the database is running, improve the procedures and queries used to manage the database,
or correct fundamental database design flaws. Scaling out isn't a last resort, but it is best reserved
for well-designed databases being accessed by a large number of well-designed clients.
General Technologies
There are several core technologies, most of which are common to all high-end database
platforms, that make scale-out possible. These technologies include:
• Data access abstraction—Often implemented as views and stored procedures, these
abstractions allow client and middle-tier applications to access data in a uniform fashion
without an understanding of how the data is actually stored on the back-end. This allows
a client application to, for example, query data from a single view without realizing that
the data is being assembled from multiple servers (see the sketch later in this section).
• Replication—A suite of technologies that allows multiple read or read-write copies of the
same data to exist, and for those copies to undergo a constant synchronization process
that seeks to minimize the time during which the copies (or replicas) are different from
one another.
• Clustering—A set of technologies that allows multiple servers running complex
applications (such as SQL Server) to appear as a single server. In some cases, the servers
distribute the incoming workload among themselves; in other scenarios, the servers act as
backups to one another, picking up workload when a single server fails.
Clustering is an overused, overloaded term; in this context, I’m referring to clustering in a general
sense. Windows Clustering is a specific technology that provides high availability. Although it is not in
and of itself a scale-out technology (it does nothing to distribute workload), it does have a useful place
in helping scaled-out solutions become more reliable and more highly available.
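To make the data access abstraction bullet more concrete, here is a minimal sketch; the table, column, view, and procedure names are assumptions invented for illustration, not objects from any particular database. Clients query the view or call the stored procedure, so the underlying tables can later be moved or split without changing client code:

-- A view hides how the data is physically stored and joined.
CREATE VIEW dbo.CustomerAddresses
AS
SELECT c.CustomerID, c.CustomerName, a.StName, a.StTyp
FROM dbo.Customers AS c
JOIN dbo.Addresses AS a
  ON a.CustomerID = c.CustomerID;
GO

-- A stored procedure gives clients a single, stable entry point.
CREATE PROCEDURE dbo.GetCustomerAddress
    @CustomerID int
AS
SELECT CustomerID, CustomerName, StName, StTyp
FROM dbo.CustomerAddresses
WHERE CustomerID = @CustomerID;
GO

If the Addresses table later moves to another server, only the view definition changes; the client's queries and the stored procedure's interface stay the same.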
This strategy is perhaps the simplest means of scaling out SQL Server. Although SQL Server
replication isn’t simple to set up and maintain, the strategy works well even with many different
servers and copies of the database. However, this setup has drawbacks. Latency is the primary
drawback: no copy of the database will ever be exactly like the others. As new
records are added to each copy, a period of time elapses before replication begins. With only two
servers in the company, each server might be as much as an hour out of sync with the other,
depending upon how you set up replication. Adding more servers, however, involves difficult
replication decisions. For example, consider the six-office setup that Figure 1.4 depicts.
In this example, each of the six offices has an independent SQL Server, which is a useful and
scalable design. However, latency might be very high. If each SQL Server replicated with its
partners just once every hour, then total system latency could be 3 hours or more. For example, a
change made in the Los Angeles office would replicate to New York and Las Vegas in about an
hour. An hour later, the change would make it to London and Denver. Another hour later, and
the change would finally reach Orlando. With such high latency, it’s unlikely that the entire
system would ever be totally in sync.
Latency can be reduced, but at the cost of performance. For example, if each of the six servers
replicated with each of the other five servers, the system could converge, or be universally in
sync, about once an hour (assuming again that replication occurred every hour). Figure 1.5
shows this fully enmeshed design.
Figure 1.5: A fully enmeshed six-server farm (the green lines represent replication).
The consequence of this design is decreased performance. Each server must maintain replication
agreements with five other servers and must perform replication with each other server every
hour. So much replication, particularly in a busy database application, would likely slow
performance so much that the performance gain achieved by creating a server farm would be
lost. Each office might require two servers just to maintain replication and meet users’ needs.
Therefore, the server farm technique, although fairly easy to implement, has a point of
diminishing returns.
Administrators can use two basic approaches to implement this strategy. The first approach is to
modify the client application so that it understands the division of the database across multiple
servers. Although fairly straightforward, if somewhat time-consuming, this solution does not
work well for the long term. Future changes to the application could result in additional
divisions, which would in turn require additional reprogramming.
A better approach is to program the client application to use stored procedures, views, and other
server-side objects—an ordinary best practice for a client-server application—so that the client
application does not need to be aware of the data’s physical location. SQL Server offers different
techniques to handle this setup, including distributed partitioned views.
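As a rough sketch of that second approach, suppose (purely for illustration; the object, database, and server names are assumptions) that an Orders table has been split between the local server and a linked server named ServerB. The view that clients already query can be redefined to span both servers without any client change:

-- Clients keep selecting from dbo.AllOrders; only the view knows
-- that the rows now live on two servers.
ALTER VIEW dbo.AllOrders
AS
SELECT OrderID, CustomerID, OrderDate
FROM dbo.Orders
UNION ALL
SELECT OrderID, CustomerID, OrderDate
FROM ServerB.Sales.dbo.Orders;
GO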
Scale-Out Techniques
SQL Server and Windows offer several techniques to enable scaling out, including SQL Server–
specific features such as distributed databases and views and Windows-specific functions such as
Windows Clustering (which, as I’ve mentioned, isn’t specifically a scale-out technology
although it does have a use in scale-out scenarios).
Figure 1.7: Distributed partitioned views enable a client to view three unrelated tables as one table.
Distributed partitioned views are a powerful tool for creating scaled-out applications. Each
server participating in the distributed partitioned view is a node, and the entire group of servers is
a shared-nothing cluster. Each server’s copy of the distributed table (or tables) has the same
schema as the other copies—the same columns, constraints, and so forth—but each server
contains different rows.
It is crucial that tables are horizontally partitioned in such a way that each server handles
approximately the same load. For example, an application that frequently adds new rows to the
most recently added node places an unequal amount of INSERT traffic on that server, partially
defeating the purpose of the scale-out strategy. If database reads primarily deal with newer rows,
the new node will handle most of the traffic associated with the table, leaving the other nodes
relatively idle. Appropriately redistributing the rows across the available nodes will alleviate the
problem and more evenly distribute the traffic (and therefore the workload) across them.
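The following hedged sketch shows the general shape of a distributed partitioned view across two nodes; the server, database, table names, and key ranges are assumptions for illustration. Each node holds a member table whose CHECK constraint on the partitioning column defines the rows that node owns, and a view on each node unions the local and remote member tables:

-- On ServerA: the member table for CustomerID values 1 through 999,999.
CREATE TABLE dbo.Customers_A (
    CustomerID   int NOT NULL PRIMARY KEY
        CHECK (CustomerID BETWEEN 1 AND 999999),
    CustomerName varchar(100) NOT NULL
);
GO
-- ServerB holds dbo.Customers_B with the same schema and a CHECK
-- constraint covering CustomerID 1,000,000 through 1,999,999.

-- On each node, the distributed partitioned view unions the member tables.
CREATE VIEW dbo.Customers
AS
SELECT CustomerID, CustomerName FROM dbo.Customers_A
UNION ALL
SELECT CustomerID, CustomerName FROM ServerB.Sales.dbo.Customers_B;
GO

Queries against dbo.Customers that filter on CustomerID allow SQL Server to use the CHECK constraints to touch only the node that owns the requested range.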
Despite their utility, distributed partitioned views are only rarely used as a result of the difficulty
inherent in evenly distributing rows across multiple servers. Failing to achieve an even
distribution can result in disastrous performance, because certain participating servers will need
to hold query results in memory while waiting for other servers that have more rows to process
to catch up.
In Chapter 4, we’ll explore distributed partitioned views—how to set them up and how to use them.
Chapter 5 covers distributed partitioned databases in detail, including how-to information for creating
and maintaining distributed partitioned databases.
Windows Clustering
Windows Clustering does not by itself improve performance, but it can be a useful technique for
scaling out without increasing the risk of a server failure. For example, a two-node active/active
cluster has two independent SQL Server machines. You can configure these nodes as a server
farm, in which each server contains a complete copy of the database and users are distributed
between them, or as a distributed database architecture, in which each server contains one logical
half of the entire database. In either architecture, a failure of one server is not a catastrophe
because Windows Clustering enables the other server to transparently take over and act as two
servers.
Over-engineering is the key to a successful active/active cluster. Each node should be designed
to operate at no more than about 60 percent of its capacity. That way, if one node fails, the surviving
node takes on a combined load of roughly 120 percent of a single node's capacity and runs with about
a 20 percent shortfall. Even with this efficiency loss, performance generally remains within an
acceptable range, which is reasonable considering that after failover the applications are running on
half as much hardware.
Setting up clusters can be extremely complex. In the case of Windows Clustering, the software is
not difficult to use, but the underlying hardware must be absolutely compatible with Windows
Clustering—and most hardware vendors have exacting requirements for cluster setups. To
prevent confusion, it’s advisable to buy an HCL-qualified cluster that is well documented from a
major server vendor. This simplifies cluster setup and the vendor (and Microsoft) can provide
cluster-specific technical support, if necessary.
Chapter 6 delves into Windows Clustering, along with a complete tutorial about how clustering works
and how clustered SQL Server systems can be a part of your scale-out solution.
High-Performance Storage
High-performance storage offers an often-overlooked performance benefit for SQL Server—
particularly external storage area networks (SANs) that rely on Fibre Channel technology rather
than traditional SCSI disk subsystems. Because high-performance storage enables an existing
server to handle a greater workload, this type of storage is an example of scaling up rather than
out.
SQL Server is a highly disk-intensive application. Although SQL Server includes effective
memory-based caching techniques to reduce disk reads and writes, database operations require
significant data traffic between a server’s disks and its memory. The more quickly the disk
subsystem can move data, the faster SQL Server will perform. Industry estimates suggest that a
considerable amount of idle time in SQL Server results from waiting for the disk subsystem to
deliver data. Improving the speed of the disk subsystem can, therefore, markedly improve overall
SQL Server performance.
Moving to additional RAID-5 arrays on traditional copper SCSI connections is a simple way to
add disk capacity and improve throughput. However, high-speed Fibre Channel SANs offer the best speed, as well as
myriad innovative recovery and redundancy options—making them a safer place to store
enterprise data.
Chapter 7 is all about high-performance storage and how it can help improve your scale-out solution.
Figure 1.9 illustrates the database distributed across three servers. The client, however, continues
to access a view—a distributed view. The view pulls information from three tables on three
servers, but no changes are necessary to the client.
To clearly illustrate this point, let’s consider the alternative. Figure 1.10 shows a client directly
accessing data from a server. This setup is inefficient, as ad-hoc queries require the most work
for SQL Server to process.
After the data has been distributed to three different servers (see Figure 1.11), the situation will
degrade further. This situation would require a new or at least revised client application that is
aware of the data distribution. In addition, the data will take longer to query, slowing application
performance.
To avoid the unpopular task of revising client applications, spend time analyzing the way your
clients interact with a database. Use the results of this analysis to fine-tune the database before
you consider a distributed data solution as part of a scale-out scenario.
Chapter 3 will contain a more in-depth discussion of the challenges of scaling out SQL Server,
including tips for getting your databases in shape for the move.
Of course, there’s no step-by-step guide to scale-out success. Several hurdles exist that can make
scaling out more difficult, either from a database standpoint or from a manageability standpoint.
Solutions and workarounds exist for many of these hurdles, but it’s best to put everything on the
table up front so that you know what you’re getting into.
Database Hurdles
Some databases are simply not built to scale out easily. Perhaps the data can't be easily
partitioned and divided across servers, or perhaps your circumstances make replication
impractical. For many organizations, the database design and how it’s used present the biggest
hurdles to scaling out, forcing those organizations to do the best they can with scale-up solutions.
Even if a drastic redesign is required, however, SQL Server offers solutions that can make such a
redesign possible and even relatively painless. Integration Services can be used to transform a
problematic database design into one that’s more amenable to scale-out, while views and stored
procedures can help mask the database changes to client applications and middle-tier
components, eliminating a cascade of design changes in complex multi-tier applications. Scale-
out capability and performance start with a solid database design, but SQL Server's toolset
recognizes that few databases are initially built with scale-out in mind.
Manageability Hurdles
Manageability problems are often a concern with many scale-out solutions. Let’s face it—scale-
out solutions, by definition, involve adding more servers to your environment. You’re then faced
with the reality of managing multiple servers that have a complex set of relationships (such as
data replication) with one another. These multiple servers require patch and software
management, performance and health monitoring, change and configuration management,
business continuity operations, and other complex tasks. SQL Server 2005 provides management
capabilities that address some of these hurdles; proper management technique and third-party
tools and utilities can help address other hurdles and make scale-out manageability easier and
more comprehensible.
Chapter 8 will tackle the complex topic of scale-out manageability, including its many difficulties and
solutions.
The ultimate server purchase (in terms of expense, at least) is Microsoft Datacenter Server,
which runs on high-end, proprietary hardware. Available in both Windows 2000
Datacenter Server and Windows Server 2003 Datacenter Edition, this hardware is sold in
specially certified configurations with the OS preloaded. Generally, at minimum, these systems
provide a large amount of data storage and are at least 4-way machines. This option is the most
expensive and offers the most proprietary type of server. However, the hardware, OS, and drivers
are certified as a package, so these systems are also the most stable Windows machines, earning
99.999 percent uptime ratings. Carefully examine your scale-out strategy to determine whether
you need this much horsepower and this level of availability.
Microsoft’s Datacenter program is not intended to equate with raw server power. Datacenter is
intended primarily for reliability, by combining well-tested drivers, hardware, and OS components.
However, due to the price, most Datacenter machines are also beefy, intended for heavy-duty
enterprise applications.
I should also point out that Datacenter Server isn’t specifically designed for scale-out solutions;
servers running Datacenter are typically fully-loaded in terms of processor power, memory, and other
features, and they are specifically designed for high availability. However, most scale-out solutions
include a need for high-availability, which is where Windows Clustering can also play a role (as I’ll
discuss throughout this book).
Obviously, the money required for a Datacenter Server system—let alone several of them—is
significant, and your solution would need extremely high requirements for availability and horsepower
in order to justify the purchase. I suspect that most scale-out solutions can get all the power they
need using PC-based hardware (also called non-proprietary or commodity hardware), running
Standard or Enterprise Editions of Windows, using Windows Clustering to achieve a high level of
availability.
For example, for the cost of a single Datacenter server from one manufacturer, you can often
configure a similarly equipped two-node cluster from other manufacturers. The clustering
provides you with similar reliability for your application because it is less likely that both cluster
nodes will fail at the same time as the result of a device driver issue. As the goal of scale-out
solutions is to spread the workload across multiple servers, having more servers might be more
beneficial than relying on one server. This decision depends upon your particular strategy and
business needs. Scaling out allows for flexibility in the types of server hardware that you use,
allowing you to find a solution that is specifically suited to your needs.
There is a whole new breed of server available to you now—64-bit. Although Intel’s original Itanium
64-bit architecture remains a powerful choice, a more popular choice is the newer x64 architecture
(called AMD64 by AMD and EM64T by Intel). I’ll cover this new choice in detail in Chapter 7.
Price/Performance Benchmarks
Start with TPC benchmarks that are appropriate to your application: TPC-C, TPC-H, or TPC-W.
(We’ll explore these benchmarks later in this chapter.) At this stage, focus on the
price/performance ratio rather than raw performance. Use an Excel spreadsheet to create a chart
like the one that Figure 1.12 shows; such a chart is a useful tool for comparison.
In this chart, higher values on the Y axis mean better performance; dots further to the right cost
more. The chart reveals that Servers B and C, which offer high performance at a lower cost, are
good choices. Servers D and E fall into the high-cost category while also providing low
performance. This type of graph, called a scatter graph, can help you quickly compare the
price/performance ratio of servers.
If TPC hasn’t published results for the particular server in which you’re interested, try grouping
your option by manufacturer. Manufacturers typically have similar performance, especially
within a product family. Take an average score for a manufacturer or product family as
representative of the manufacturer or family’s overall performance.
Although these tests provide an “apples to apples” comparison of different platforms and
manufacturers, it is important to note that the TPC configurations are not always applicable in real
world environments. Therefore, using these test results for sizing is difficult. For sizing, it’s best to ask
the sales representative from one of the major vendors for tools—most major vendors have tools that
will provide ballpark sizing information.
Manufacturers provide different support plans and maintenance agreements, making it a difficult task
to compare servers’ actual prices. I price solutions based entirely on the hardware, then add the cost
of service or maintenance plans after a primary evaluation of the different solutions.
Earlier, this chapter noted that Web server farms are nearly all built with inexpensive hardware. The
theory is that you can buy many inexpensive servers, and many servers are better than one.
Scaling out SQL Server means building a SQL Server farm, which doesn’t require you to pay extra for
the most fine-tuned machine on the market. You can save money by buying machines that are based
on standardized components (rather than the proprietary components used by most highly tuned
machines).
TPC publishes results on its Web site at http://www.tpc.org. Results are broken down by
benchmark, and within each benchmark, they are divided into clustered and non-clustered
systems. However, TPC benchmarks are not available for every server, and TPC doesn’t test
every server with SQL Server—the organization publishes results for other database software as
well. TPC also publishes a price estimate that is based on manufacturers' listed retail prices. This
estimate enables you to quantify how much extra you’re paying for performance.
Separate benchmarks are available for many database applications. Although some of these
applications can be based upon SQL Server, you’ll obtain more application-specific results by
checking out the specific benchmarks.
Additional benchmarks exist, though few are geared specifically toward SQL Server. As you’re
shopping for server hardware for a scale-out solution, ask manufacturers for their benchmark
results. Most will happily provide them to you along with any other performance-related
information that you might need.
Summary
In some cases, large database applications reach the point where scale up is not practical, and
scaling out is the only logical solution. You might find, however, that scaling out is a more
intelligent next step even when you could potentially squeeze more out of your existing design.
The next chapter will explore tips and best practices for increasing efficiency on your existing
servers, which is an important first step to creating a useful scaled-out database application.
Chapter 2: Scaling Out vs. Better Efficiency
In addition to the methods for improving the structure of the current database discussed in this
chapter, there are low-cost ways to scale out the database hardware that incorporate dual-core
processors as well as the benefits of the x64 architecture. These methods are explained in the Appendix: 64-Bit
and High Performance Computing.
SQL Server 2005 includes many performance enhancements that earlier versions don’t offer.
However, most of these improvements are integrated throughout the SQL Server engine, making it
difficult to point to any one particular T-SQL language element or database programming technique
and say that it will give you a performance boost. Many existing applications may run faster simply by
upgrading to SQL Server 2005. However, that doesn’t mean there is no room for improvement,
especially in efficiency. In fact, the tips in this chapter apply equally to SQL Server 2005 and SQL
Server 2000 (except where I’ve specifically noted otherwise by calling out a specific version of the
product).
For example, suppose you have two tables in your database and each relies on a lookup table.
The two main tables, Table 1 and Table 2, each have a foreign key constraint on one column that
forms a relationship with a table named Lookup Table. As the following tables show (the
examples provide only the relevant columns), Table 1 contains the addresses for properties that
are for sale, and Table 2 contains the addresses for real estate transactions that are in progress.
To ensure that addresses are accurate and consistent, Lookup Table provides pieces of the
address including street types (for example, Road, Street, Avenue, Drive). Including this
information in a lookup table prevents users from inconsistently entering data such as Rd for
Road and Dr for Drive.
Table 1:
AdrID  StName  StTyp
1      Main    2
2      Elm     2
3      Forest  3

Lookup Table:
TypID  StTypName
1      Road
2      Street
3      Avenue

Table 2:
AdrID  StName  StTyp
8      Sahara  1
9      Major   1
10     Pierce  3
The problem with this design is that Table 1 and Table 2 do not contain the street type name;
instead, they contain the primary key value pointing to the appropriate row in Lookup Table.
Thus, querying a complete address from the database requires a multi-table join, which is
inherently less efficient than simply querying a single table. This common database design,
which is employed by many applications, is inefficient. If this design were used in a real estate
application, for example, address-based queries would be very common, and having to join three
tables to make this very common query wouldn’t be the most efficient way to build the database.
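As a rough illustration of the extra work this design forces on every address query, the following sketch assumes the tables are named Table1 and LookupTable; the actual object names in your database would differ:

-- Under the normalized design, assembling a readable address always
-- requires joining to the lookup table.
SELECT a.AdrID,
       a.StName,
       l.StTypName
FROM dbo.Table1 AS a
JOIN dbo.LookupTable AS l
  ON l.TypID = a.StTyp;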
When you distribute this database across servers in a scale-out project, you will run into
additional problems. For example, suppose you put Table 1 and Table 2 on different servers; you
must then decide on which server to place Lookup Table (see Figure 2.1).
Figure 2.1: Using lookup tables can restrict your scale-out options.
Ideally, as both servers rely on Lookup Table, the table should be on the same physical server as
the tables that use it, which is impossible in this case. To work around this problem as well as
improve performance, simply denormalize the tables, as the following examples illustrate.
Table 1:
AdrID  StName  StTyp
1      Main    Street
2      Elm     Street
3      Forest  Avenue

Lookup Table:
TypID  StTypName
1      Road
2      Street
3      Avenue

Table 2:
AdrID  StName  StTyp
8      Sahara  Road
9      Major   Road
10     Pierce  Avenue
Lookup Table is still used to create user input choices. Perhaps the address-entry UI populates a
drop-down list box based on the rows in Lookup Table. However, when rows are saved, the
actual data, rather than the primary key, from Lookup Table is saved. There is no SQL Server
constraint on the StTyp column in Table 1 or Table 2; the UI (and perhaps middle-tier business
rules) ensures that only data from Lookup Table makes it into these columns. The result is that
the link between Lookup Table and Table 1 and Table 2 is now broken, making it easy to
distribute Table 1 and Table 2 across two servers, as Figure 2.2 illustrates.
When a user needs to enter a new address into Table 2 (or edit an existing one), the user’s client
application will query Table 2 from ServerB and Lookup Table from ServerA. There is no need
for SQL Server to maintain cross-server foreign key relationships which, while achievable through workarounds such as triggers, can
create performance obstacles. This example illustrates denormalization: Some of the data
(“Street,” “Road,” and so forth) is duplicated rather than being maintained in a separate table
linked through a foreign key relationship. Although this configuration breaks the rules of data
normalization, it provides for much better performance.
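In the denormalized design, the same information comes back from a single table, and the lookup table is touched only when the UI needs its list of valid values. A minimal sketch, again using the assumed object names from the example, with each query run against the server that holds the table in question:

-- Run against ServerB: a complete address is now a single-table read.
SELECT AdrID, StName, StTyp
FROM dbo.Table2;

-- Run against ServerA: the lookup table only feeds UI elements such as
-- a drop-down list of street types.
SELECT StTypName
FROM dbo.LookupTable
ORDER BY StTypName;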
You must always keep in mind why the normalization rules exist. The overall goal is to reduce data
redundancy—primarily to avoid multiple-row updates whenever possible. In this real estate example,
it’s unlikely that the word “Road” is going to universally change to something else, so it’s more
permissible—especially given the performance gains—to redundantly store that value as a part of the
address entity rather than making it its own entity.
Normalization is a useful technique; however, this method is generally used to reduce data
redundancy and improve data integrity at the cost of performance.
If you have many detached lookup tables after denormalizing, you can create a dedicated SQL
Server system to host only those tables, further distributing the data workload of your
application. Figure 2.3 shows an example of a scaled-out database with a dedicated “lookup table
server.”
Figure 2.3: Creating a dedicated lookup table server can allow you to scale out further.
Client- or middle-tier applications would query ServerB for acceptable values for various input
fields; ServerA and ServerC store the actual business data, including the values looked up from
ServerB. Because querying lookup tables in order to populate UI elements (such as drop-down
list boxes) is a common task, moving this data to a dedicated server helps to spread the
application’s workload out across more servers.
For the first approach, you need to have some means of horizontally partitioning your tables. All
tables using a unique primary key—such as an IDENTITY column—must have a unique range
of values assigned to each server. These unique values permit each server to create new records
with the assurance that the primary key values won’t conflict when the new rows are distributed
to the other servers. For software development purposes, another helpful practice is to have a
dedicated column that simply indicates which server owns each row. Figure 2.4 shows a sample
table that includes a dedicated column for this purpose.
Figure 2.4: Using a column to indicate which server owns each row.
Note that the CustomerID column—the primary key for this table—has different identity ranges
assigned to each server as well.
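A minimal sketch of this idea follows; the table name, column names, and range values are assumptions rather than a prescription. Each server seeds its IDENTITY column in a different range and records which server owns the row:

-- On ServerA: CustomerID values begin at 1.
CREATE TABLE dbo.Customers (
    CustomerID   int IDENTITY(1,1) NOT NULL PRIMARY KEY,
    OwningServer varchar(16) NOT NULL DEFAULT ('ServerA'),
    CustomerName varchar(100) NOT NULL
);

-- On ServerB: the same table starts its range at 1,000,000, so rows
-- created on either server never collide when they are replicated.
CREATE TABLE dbo.Customers (
    CustomerID   int IDENTITY(1000000,1) NOT NULL PRIMARY KEY,
    OwningServer varchar(16) NOT NULL DEFAULT ('ServerB'),
    CustomerName varchar(100) NOT NULL
);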
In the other approach, you need to identify logical divisions in your database tables. Ideally, find
tables that are closely related to one another by job task or functional use. In other words, group
tables that are accessed by the same users who are performing a given job task. Consider the
database illustrated in Figure 2.5.
It’s likely that customer service representatives will access both customer and order data, so it
makes sense to distribute those tables to one server. Vendors and inventory data are primarily
accessed at the same time, so this data can exist on a separate server. Customer service
representatives might still need access to inventory information, and techniques such as views
are an easy way to provide them with this information.
If your database doesn’t already provide logical division in its tables, work to create one.
Carefully examine the tasks performed by your users for possible areas of division for tables.
Breaking a database into neat parts is rarely as straightforward as simply picking tables. Sometimes
you must make completely arbitrary decisions to put a particular table on a particular server. The goal
is to logically group the tables so that tables often accessed together are on the same physical
server. Doing so will improve performance; SQL Server offers many tools to enable a distributed
database even when there aren’t clear divisions in your tables.
You can also combine these two scale-out techniques. For example, you might have one server
with vendor and inventory information, and four servers with customer and order information.
The customer and order tables are horizontally partitioned, allowing, for example, different call
centers in different countries to maintain customer records while having access to records from
across the entire company.
I’ll describe some of these scale-out decisions in more detail in the next chapter.
Chapter 10 will focus on application design issues in much greater detail, with more specific focus on
SQL Server 2005-compatible techniques for improving performance. The tips provided are useful
general practices that apply to any database application.
The number of rows you query depends entirely on the type of application you’re writing. For
example, if users frequently query a single product and then spend several minutes updating its
information, then simply querying one product at a time from SQL Server is probably the right choice.
If users are examining a list of a dozen products, querying the entire list makes sense. If users are
paging through a list of a thousand products, examining perhaps 20 at a time, then querying each
page of 20 products probably strikes the right balance. The point is to query just what the user will
need right at that time or in the next couple of moments.
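SQL Server 2005's ROW_NUMBER() function is one way to implement that kind of page-at-a-time querying. The following is a minimal sketch; the Products table and the paging variables are assumptions for illustration:

DECLARE @PageNumber int, @PageSize int;
SET @PageNumber = 3;   -- the page the user is viewing
SET @PageSize = 20;    -- rows per page

SELECT ProductID, ProductName
FROM (
    SELECT ProductID, ProductName,
           ROW_NUMBER() OVER (ORDER BY ProductName) AS RowNum
    FROM dbo.Products
) AS Paged
WHERE RowNum BETWEEN (@PageNumber - 1) * @PageSize + 1
              AND @PageNumber * @PageSize;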
On the same note, make sure that client applications are designed to minimize database locks and
to keep database transactions as short as possible. Applications should also be designed to handle
a failed query, deadlock, or other error condition gracefully; a sketch of one such approach follows.
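One hedged sketch of graceful deadlock handling, using the TRY...CATCH support added in SQL Server 2005; the stored procedure name and its parameters are assumptions for illustration:

DECLARE @Retries int;
SET @Retries = 0;

WHILE @Retries < 3
BEGIN
    BEGIN TRY
        BEGIN TRANSACTION;
        EXEC dbo.UpdateOrder @OrderID = 12345, @Status = 'Shipped';
        COMMIT TRANSACTION;
        BREAK;  -- success; leave the retry loop
    END TRY
    BEGIN CATCH
        IF XACT_STATE() <> 0
            ROLLBACK TRANSACTION;
        IF ERROR_NUMBER() = 1205  -- chosen as a deadlock victim
        BEGIN
            SET @Retries = @Retries + 1;
            WAITFOR DELAY '00:00:01';  -- brief pause before retrying
        END
        ELSE
        BEGIN
            RAISERROR('Query failed and will not be retried.', 16, 1);
            BREAK;
        END
    END CATCH
END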
These guidelines are especially true in a database that will be scaled out across multiple servers.
For example, if your scale-out project involves distributed views, locking a row in a view can
lock rows in tables across multiple servers. This situation requires all the usual workload of
maintaining row locks plus the additional workload required to maintain those locks on multiple
servers; an excellent example of how a poor practice in a single-server environment can become
a nightmare in a distributed database.
SQL Server cannot determine whether a query is poorly written or will cause a major server
problem. SQL Server will dutifully accept and execute every query, so the application’s designer
must make sure that those queries will run properly and as quickly as possible.
Avoid giving users ad-hoc query capabilities. These queries will often be the worst-running ones on
your system because they are not optimized, are not backed by specific indexes, and are not running
from a stored procedure. Instead, provide your users with some means of requesting new reports or
queries, and allow trained developers and database administrators (DBAs) to work together to
implement those queries in the most efficient way possible.
In SQL Server 7.0 and SQL Server 2000, stored procedures could be written only in SQL Server’s
native T-SQL language. SQL Server 2005, however, embeds the .NET Framework’s Common
Language Runtime (CLR) within the SQL Server engine. The practical upshot of this inclusion is that
you can write stored procedures in any .NET language, such as VB.NET or C#. Visual Studio 2005
provides an integrated development environment that makes creating and debugging these .NET-
based stored procedures easier.
The additional power and flexibility offered by .NET provide even more reason to use stored
procedures for absolutely every query that SQL Server executes. Stored procedures will become an
even better single point of entry into SQL Server, eliminating the need for triggers.
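The T-SQL side of deploying such a CLR stored procedure looks roughly like the following sketch; the assembly path, assembly name, and class and method names are assumptions:

-- CLR execution is off by default in SQL Server 2005.
EXEC sp_configure 'clr enabled', 1;
RECONFIGURE;
GO

-- Register the compiled .NET assembly with the database.
CREATE ASSEMBLY OrderRules
FROM 'C:\Assemblies\OrderRules.dll'
WITH PERMISSION_SET = SAFE;
GO

-- Expose a .NET method as an ordinary T-SQL stored procedure.
CREATE PROCEDURE dbo.ValidateOrder
    @OrderID int
AS EXTERNAL NAME OrderRules.[OrderRules.StoredProcedures].ValidateOrder;
GO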
The middle tier should still access SQL Server exclusively through stored procedures, but these
stored procedures can now be simplified because they don’t need to incorporate as much
business logic. Thus, they will execute more quickly, allowing SQL Server to support more
clients. As your middle-tier servers become overloaded, you simply add more, meeting the client
demand. A typical way in which middle-tier servers help offload work from SQL Server is by
performing data validation. That way, all queries executed on SQL Server contain valid,
acceptable data, and SQL Server simply has to put it in the right place (rather than having stored
procedures or other objects validate the data against your business rules). SQL Server is no
longer in the business of analyzing data to determine whether it’s acceptable. Of course, client
applications can perform data validation as well, but middle-tier servers help to better centralize
business logic to make changes in that logic easier to implement in the future.
Working with XML? Consider a business tier. SQL Server 2000 and its various Web-based feature
releases, as well as SQL Server 2005, support several native XML features that can be very useful.
For example, if you’re receiving XML-formatted data from a business partner, you can have SQL
Server translate, or shred, the XML into a relational data format and use it to perform table updates.
Unfortunately, doing so can be very inefficient (particularly on SQL Server 2000). If you will be
working with a great deal of XML-based data, build a middle tier to do so. The middle tier can perform
the hard work of shredding XML data and can submit normal T-SQL queries to SQL Server.
Execution will be more efficient and you’ll have a more scalable middle tier to handle any growth in
data traffic.
If you don’t need to parse (or “shred”) the XML, SQL Server 2005 provides a new native XML data
type, making it easier to store XML directly within the database. This stored XML data can then be
queried directly (using the XQuery syntax, for example), allowing you to work with an XML column as
if it were a sort of complex, hierarchical sub-table.
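To make the idea concrete, here is a small sketch of the native xml data type and an XQuery-style query against it; the table, column, and sample document are hypothetical.
-- Store partner documents natively as XML (hypothetical table).
CREATE TABLE dbo.PartnerOrders
(
    OrderID  int PRIMARY KEY,
    OrderDoc xml NOT NULL
);

INSERT INTO dbo.PartnerOrders (OrderID, OrderDoc)
VALUES (1, '<order><item sku="A100" qty="3" /></order>');

-- Query inside the XML column without shredding it into relational tables.
SELECT OrderID,
       OrderDoc.query('/order/item')                  AS Items,
       OrderDoc.value('(/order/item/@qty)[1]', 'int') AS FirstQty
FROM dbo.PartnerOrders;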
The overall goal is to try to identify intensive processes—such as data validation or XML parsing—
and move those to dedicated servers within the middle tier of your overall application. The less of that work SQL Server has to do, the more database work it can do, because SQL Server is ultimately the only
tier capable of querying or updating the database. Moving work away from SQL Server will help
maximize its ability to do database work efficiently.
Using multiple tiers can be especially effective in distributed databases. The client application is
the most widely distributed piece of software in the application, so the most efficient practice is
to avoid making changes to the client application that will require the distribution of an update.
By forcing the client to talk only to a middle tier, you can make many changes to the database
tier—such as distributing the database across multiple servers or redistributing the database to
fine-tune performance—without changing the client application. Figure 2.7 illustrates this
flexibility.
Figure 2.7: Multi-tier applications allow for data tier redesigns without client application changes.
In this configuration, changes to the data tier—such as adding more servers—don’t affect the
client tier. Instead, the smaller middle tier is updated to understand the new back-end structure.
This technique makes it easier to change the data tier to meet changing business demands,
without a time-consuming deployment of a new client application.
Another way to reduce the database server's workload is to queue long-running queries, and their results, using Microsoft Message Queuing (MSMQ), as Figure 2.8 shows.
Figure 2.8: MSMQ allows long-running queries (and their results) to be queued.
This technique allows for strict control over long-running queries, and enables requestors to
continue working on other projects while they wait for their query results to become available. It
isn’t necessary for the requestor to be available when the query completes; MSMQ will store the
results until the requestor is ready to retrieve them.
SQL Server’s Data Transformation Services (DTS) include an MSMQ task that allows DTS to
place query results and other information onto an MSMQ message queue. MSMQ is also
accessible from COM-based languages, such as Visual Basic, and from .NET Framework
applications.
Using MSMQ is an especially effective scale-out technique. For example, if your company
routinely prepares large reports based on your database, you might create a standalone server that
has a read-only copy of the data. Report queries can be submitted to that server via MSMQ, and
the results later retrieved by clients. Use DTS to regularly update the reporting server’s copy of
the database, or use replication techniques such as snapshots, log shipping, mirroring, and so
forth. By offloading the reporting workload to a completely different server, you can retain
additional capacity on your online transaction processing (OLTP) servers, improving
productivity.
Although SQL Server 2005 can still utilize MSMQ, it also provides a more advanced set of
services called Service Broker. Service Broker is a built-in set of functionality that provides
message queuing within SQL Server 2005. Service Broker uses XML formatting for messages,
and makes it especially straightforward to pass messages between SQL Server 2005 computers.
Service Broker can be particularly helpful in scale-out scenarios because it helps to facilitate
complex communications between multiple SQL Server 2005 computers.
Service Broker—specifically, its ability to support scale-out scenarios—will be explored in more detail
throughout this book.
You can create views that combine the current and archived databases into a single set of virtual
tables. Client applications can be written to query the current tables most of the time and to query the
views when users need to access archived data. It’s the easiest way to implement a distributed-
archive architecture.
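As a hedged illustration, the following view presents current and archived orders as one virtual table. The database, table, and column names are hypothetical, and the archive database could just as easily live on another server and be referenced through a linked server name.
-- Combine the current and archive copies of a table into one virtual table.
CREATE VIEW dbo.AllOrders
AS
    SELECT OrderID, CustomerID, OrderDate
    FROM Sales.dbo.Orders            -- current database
    UNION ALL
    SELECT OrderID, CustomerID, OrderDate
    FROM SalesArchive.dbo.Orders;    -- archive database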
Why bother with archiving? You archive primarily for performance reasons. The larger your
databases, the larger your indexes, and the less efficiently your queries will run. If you can
minimize database size while meeting your business needs, SQL Server will have a better chance
at maintaining a high level of performance over the long haul.
Make sure that your users are aware of the consequences of querying archived data. For
example, your client application might present a message that warns, "Querying archived data will result in a longer-running query. Your results might take several minutes to retrieve. Are you sure you want to continue?" If you're running queries synchronously, make sure that you give
users a way to cancel their queries if the queries require more time than users anticipated. For
queries that will take several minutes to complete, consider running the queries asynchronously,
perhaps using the MSMQ method described earlier.
Don’t forget to update statistics! After you’ve made a major change, such as removing archived data,
be sure to update the statistics on your tables (or ensure that the database's auto update statistics option is
enabled) so that SQL Server’s query optimizer is aware that your database has shrunk.
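For example, after a large archival purge you might refresh statistics with something like the following; the table and database names are hypothetical.
-- Refresh statistics on a specific table after removing archived rows.
UPDATE STATISTICS dbo.Orders WITH FULLSCAN;

-- Or refresh statistics for every table in the current database.
EXEC sp_updatestats;

-- Confirm that automatic statistics updates are enabled for the database.
ALTER DATABASE Sales SET AUTO_UPDATE_STATISTICS ON;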
Archiving is another great way to scale out a SQL Server application. Some applications spend a
considerable portion of their workload querying old data rather than processing new transactions.
By keeping an archive database on one or more separate servers, you can break off that workload
and preserve transaction-processing capacity on your primary servers. If your application design
includes a middle tier, only the middle tier needs to be aware of the distribution of the current
and archived data; it can contact the appropriate servers on behalf of clients as required. Figure
2.9 illustrates how an independent archival server can be a part of your scale-out design.
Figure 2.9: Archived data can remain accessible in a dedicated SQL Server computer.
Fine-Tuning SQL
Tuning indexes and improving the efficiency of T-SQL queries will help improve SQL Server’s
performance. These improvements will help a single server support more work, and they are
crucial to scaling out across multiple database servers. Inefficient indexes and inefficient queries
can have a serious negative impact on SQL Server’s distributed database capabilities, so fine-
tuning a database on a single server will help improve performance when the database is spread
across multiple servers.
Tuning Indexes
More often than not, indexes are the key to database performance. Thus, you should expect to
spend a lot of time fussing with indexes to get them just right. Also, learn to keep track of which
indexes you have in each table. When you begin to divide your database across multiple servers, you will need to ensure that each index resides on the same physical server as the table it supports.
Smart Indexing
Many new DBAs throw indexes on every column in a table, hoping that one or two will be
useful. Although indexes can make querying a database faster, they slow changes to the
database. The more write-heavy a table is, the more careful you need to be when you add your
indexes.
Use SQL Server’s Index Tuning Wizard (in SQL Server 2000; in SQL Server 2005 it’s part of
the new Database Engine Tuning Advisor) to get the right indexes on your tables to handle your
workload. Used in conjunction with SQL Profiler and a representative query workload, the Index
Tuning Wizard is your best first weapon in the battle to properly index your tables.
Indexing isn't a one-time event, though. As your database grows, you'll need to reevaluate your indexing strategy, and indexes will need to be periodically rebuilt to ensure the best performance. Changes to client applications, database design, or even your server's hardware can also require changes to your indexing strategy.
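As a quick sketch (the table and index names are hypothetical), you can check fragmentation and rebuild an index with either the SQL Server 2000 or SQL Server 2005 syntax:
-- Check fragmentation (SQL Server 2000 and 2005).
DBCC SHOWCONTIG ('dbo.Orders');

-- Rebuild a fragmented index: SQL Server 2000 syntax.
DBCC DBREINDEX ('dbo.Orders', 'IX_Orders_CustomerID');

-- Rebuild a fragmented index: SQL Server 2005 syntax.
ALTER INDEX IX_Orders_CustomerID ON dbo.Orders REBUILD;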
Learn to practice what I call smart indexing. Constantly review your indexes for appropriateness.
Experiment, when possible, with different index configurations. One way to safely experiment
with indexes is to create a testing server that is as close as possible in its configuration to your
production server. Use SQL Profiler to capture a day’s worth of traffic from your production
server, then replay that traffic against your test server. You can change index configurations and
replay the day’s workload as often as necessary, monitoring performance all the while. When
you find the index configuration that works best, you can implement it on your production server
and check its performance for improvements.
There are some unique best practices for composite indexes that you should keep in mind (in
addition, you should be aware of a SQL Server composite index bug; see the sidebar “The
Composite Index Bug” for more information):
• Keep indexes as narrow as possible. In other words, use the absolute minimum number of
columns necessary to get the effect you want. The larger the composite index, the harder
SQL Server will work to keep it updated and to use it in queries.
• The first column you specify should be as unique as possible, and ideally should be the
one used by most queries’ WHERE clauses.
• Composite indexes that are also covering indexes are always useful. These indexes are
built from more than one column, and all the columns necessary to satisfy a query are
included in the index, which is why the index is said to cover the query (see the short
example that follows this list).
• Avoid using composite indexes as a table’s clustered index. Clustered indexes don’t do as
well when they’re based on multiple columns. Clustered indexes physically order the
table’s data rows and work best when they’re based on a single column. If you don’t have
a single useful column, consider creating an identity column and using that as the basis
for the clustered index.
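Here is a small, hedged sketch of a narrow composite index that also covers a common query; the table, index, and column names are hypothetical, and the INCLUDE clause is available only in SQL Server 2005.
-- A narrow composite index whose key columns match the common WHERE clause.
CREATE NONCLUSTERED INDEX IX_Orders_Customer_Date
    ON dbo.Orders (CustomerID, OrderDate)
    INCLUDE (TotalDue);   -- non-key column carried in the index (SQL Server 2005)

-- Every column this query touches is in the index, so the index "covers" the
-- query and the base table never has to be read.
SELECT CustomerID, OrderDate, TotalDue
FROM dbo.Orders
WHERE CustomerID = 42 AND OrderDate >= '20050101';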
The Composite Index Bug
A fairly well-known SQL Server bug relates to how the query optimizer uses composite indexes in large
queries. The bug exists in SQL Server 7.0 and SQL Server 2000; SQL Server 2005’s query optimizer
corrects the bug.
When you issue a query that includes a WHERE clause with multiple OR operators, and some of the
WHERE clauses rely on a composite index, the query optimizer might do a table scan instead of using
the index. The bug occurs only when the query is executed from an ODBC application or from a stored
procedure.
Microsoft has documented the bug and provides suggested workarounds in the article “BUG: Optimizer
Uses Scan with Multiple OR Clauses on Composite Index” at
http://support.microsoft.com/default.aspx?scid=KB;en-us;q223423. Workarounds include using index
hints to take the choice away from the optimizer and force it to use the index.
How can you tell if this bug is affecting you? Pay close attention to your production query execution plans,
which you can view in SQL Profiler. You might also try running an affected query on both SQL Server
2000 and SQL Server 2005 to see how each handles the query.
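If you suspect the bug is biting you, the index-hint workaround mentioned in the article looks roughly like the following sketch; the table, index, and column names are hypothetical.
-- Force the optimizer to use the composite index rather than a table scan.
SELECT OrderID, CustomerID, OrderDate
FROM dbo.Orders WITH (INDEX(IX_Orders_Customer_Date))
WHERE (CustomerID = 42 AND OrderDate = '20050101')
   OR (CustomerID = 57 AND OrderDate = '20050102');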
Avoid Cursors
Cursors are detrimental from a performance perspective. Consider the code sample that Listing
2.1 shows, which is adapted from a sample on http://www.sql-server-performance.com.
DECLARE @LineTotal money
DECLARE @InvoiceTotal money
SET @LineTotal = 0
SET @InvoiceTotal = 0
-- The cursor declaration was dropped from the original listing; it is
-- reconstructed here from Listing 2.2, which reads the same rows.
DECLARE Line_Item_Cursor CURSOR FOR
SELECT UnitPrice*Quantity
FROM [order details]
WHERE orderid = 10248
OPEN Line_Item_Cursor
FETCH NEXT FROM Line_Item_Cursor INTO @LineTotal
WHILE @@FETCH_STATUS = 0
BEGIN
SET @InvoiceTotal = @InvoiceTotal + @LineTotal
FETCH NEXT FROM Line_Item_Cursor INTO @LineTotal
END
CLOSE Line_Item_Cursor
DEALLOCATE Line_Item_Cursor
SELECT @InvoiceTotal InvoiceTotal
Listing 2.1: Sample code that uses a cursor to total the line items on an invoice.
This code locates an invoice (10248), adds up all the items on that invoice, and presents a total for the invoice. The cursor steps through each line item on the invoice, fetching each item's price into the @LineTotal variable and adding it to the @InvoiceTotal running total. Listing 2.2 shows an easier way that doesn't involve a cursor.
DECLARE @InvoiceTotal money
SELECT @InvoiceTotal = sum(UnitPrice*Quantity)
FROM [order details]
WHERE orderid = 10248
SELECT @InvoiceTotal InvoiceTotal
Listing 2.2: The sample code modified so that it doesn’t involve a cursor.
The new code uses SQL Server’s aggregate functions to sum up the same information in fewer
lines of code and without using a slower-performing cursor. These aggregate functions can be a
big timesaver and return the same results faster than complex cursor operations.
Summary
These best practices and tips, combined with unlimited hours of fine-tuning and tweaking, still won't give a single server as much raw power as multiple servers. That is exactly why you consider scaling out: single-server efficiency only gets you so far. However, the work you invest in fine-tuning the design and performance of a database on a single server pays off again in a scale-out environment. In other words, maximize your efficiency on a single server and you'll reap performance benefits in a scale-out scenario. Databases need to be
efficiently designed, queries fine-tuned, indexes put in order, and application architecture
cleaned up before beginning the scale-out process. Otherwise, inefficiencies that exist on one
server will be multiplied when the database is distributed—effectively sabotaging a scale-out
project.
Chapter 3: Scale-Out Decisions
Exactly how you use your data and how your data is structured will greatly impact which scale-
out options are available to you. Most large database applications not only deal with a lot of data
and many users but also with widely distributed users (for example, several offices that each
accommodate thousands of users). One solution is to simply put a dedicated database server in
each location rather than one central server that handles all user requests. However, distributing
servers results in multiple copies of the data and the associated problem of keeping each copy
updated. These and other specifics of your environment—such as the need for real-time data—
will direct you to a particular scale-out technique.
Real-Time Data
The need for real-time data, frankly, complicates the scale-out process. If we could all just live
with slightly out-of-date data, scaling out would be simple. For example, consider how a Web
site is scaled out. When the number of users accessing the site becomes too much for one server
to handle, you simply add Web servers. Each server maintains the same content and users are
load balanced between the servers. Users aren’t aware that they are using multiple Web servers
because all the servers contain the same content. In effect, the entire group of servers—the Web
farm—appears to users as one gigantic server. A certain amount of administrative overhead
occurs when the Web site’s content needs to be updated because the new content must be quickly
deployed to all the servers so that they are in sync with one another, but there are many tools and
utilities that simplify this process.
Microsoft includes network load-balancing (NLB) software with all editions of Windows Server 2003
(WS2K3) and Win2K Advanced Server. This software load balances incoming TCP/IP connections
across a farm (or cluster) of servers.
Why not simply scale out SQL Server in the same way? The real-time data requirement makes
this option unfeasible. Suppose you copied your database to three new servers, giving you a total
of four SQL Server computers that each maintains a copy of your database. As long as you
ensure that users are accessing SQL Server only via TCP/IP and you implement Windows NLB
to load-balance connections across the SQL Server computers, everything would work
reasonably well—as long as your users only query records (each server would have an identical
copy of the records). The minute someone needed to change a record, though, the situation
would change. Now, one server would have a different copy of the database than the other three.
Users would get different query results depending on which of the four servers they queried. As
users continued to make changes, the four database copies would get more and more out of sync
with one another, until you would have four completely different databases.
SQL Server includes technology to help with the situation: replication. The idea behind
replication is that SQL Server can accept changes on one server, then copy those changes out to
one or more other servers. Servers can both send and receive replication traffic, allowing
multiple servers to accept data updates and distribute those updates to their partner servers.
SQL Server 2005 has a new technology called database mirroring, which is conceptually similar to replication in that it creates copies of the database. However, the mirror copy isn't intended for production use, so it doesn't fulfill the same business needs that replication does.
However, replication doesn’t occur in real-time. Typically, a server will save up a batch of
changes, then replicate those changes in order to maintain a high level of efficiency. Thus, each
server will always be slightly out of sync with the other servers, creating inconsistent query
results. The more changes that are made, the more out-of-sync the servers will become. In some
environments, this lag time might not matter, but in corporate transactional applications,
everyone must see the same results every time, and even a “little bit” out of sync is too much.
SQL Server offers many types of replication including snapshot, log shipping, merge, and
transactional. Each of these provides advantages and disadvantages in terms of replication traffic,
overhead, and the ability to maintain real-time copies of data on multiple servers.
What if you could make replication take place immediately—the second someone made a
change, it would replicate to the other servers? Unfortunately, this real-time replication would
defeat the purpose of scaling out. Suppose you have one server that supports ten thousand users
who each make one change every 5 minutes—that is about 120,000 changes per hour. Suppose
you copied the database across a four-server farm and evenly load balanced user connections
across the four servers. Now, each server will need to process only one-quarter of the traffic
(about 30,000 changes per hour). However, if every server immediately replicates every change,
each of the four servers will still need to process 120,000 changes per hour—their own 30,000
plus the 30,000 apiece from the other three servers. In effect, you’ve bought three new servers to
exactly duplicate the original problem. Ultimately, that’s the problem with any replication
technology: There’s no way to keep multiple copies of a frequently-updated database up-to-date
without putting an undesirable load on every copy.
As this scenario illustrates, the need for real-time data across the application is too great to allow
a scale-out scenario to take significant advantage of replication. Thus, one of the first scale-out
project considerations is to determine how up-to-date your data needs to be at any given moment.
Later in the chapter, I’ll explore a scale-out environment that employs replication. If you don’t have a
need for real-time data (and not all applications do), replication does offer some interesting
possibilities for scale-out.
Cross-Database Changes
Another scale-out project consideration is whether you can split your database into functionally
separate sections. For example, in a customer orders application, you might have several tables
related to customer records, vendors, and orders that customers have placed. Although
interrelated, these sections can stand alone—changes to a customer record don't require changes
to order or vendor records. This type of database—one which can be split along functional
lines—is the best candidate for a scale-out technique known as vertical partitioning.
However, if your database tables are heavily cross-linked—updates to one set of tables
frequently result in significant updates to other sets of tables—splitting the database across
multiple servers will still require a significant number of cross-database changes, which might
limit the effectiveness of a scale-out technique.
Vertical partitioning breaks the database into discrete sections that can then be placed on dedicated servers (technically, both a large table that is partitioned by column and several tables spread onto different servers qualify as vertical partitioning—just at different levels).
Ideally, vertical partitioning will help distribute the load of the overall database application
across these servers without requiring replication. However, if a large number of cross-database
changes are regularly required by your application, splitting the database might not actually help.
Each server participating in the scheme will still be required to process a large number of
updates, which may mean that each server can only support the same (or close to the same)
number of users as your original, single-server architecture.
Analyze your database to determine whether it can be logically split into different groups of
functionally related tables. There will nearly always be some relationship between these sets. For
example, a set of tables for customer orders will likely have a foreign key relationship back to the
customer records, allowing you to associate orders with specific customers. However, adding an
order wouldn’t necessarily require changes to the customer tables, making the two sets of tables
functionally distinct.
In Chapter 2, I presented an overview of techniques that you can use to fine-tune the performance of
your single-server databases. For a more detailed look at fine-tuning performance on a single server,
read The Definitive Guide to SQL Server Performance Optimization (Realtimepublishers.com),
available from a link at http://www.realtimepublishers.com.
In this example, each server contains a complete, identical copy of the database schema and data.
When a user adds a row to one server, replication updates the copy of the data on the other
server. Users can then query either server and get essentially the same results.
The time it takes for ServerB to send its update to ServerA is referred to as latency. The types of
replication that SQL Server supports offer tradeoffs among traffic, overhead, and latency:
• Log shipping isn’t truly a form of replication but can be used to similar effect. This
technique copies the transaction log from one server to another server, and the log is then
applied to the second server. This technique offers very high latency but very low
overhead. It’s also only available for an entire database; you can’t replicate just one table
by using log shipping.
• Similar to log shipping, snapshot replication essentially entails sending a copy of the
database from one server to another. This replication type is a high-overhead operation,
and locks the source database while the snapshot is being compiled, so snapshot
replication is not a method you want to use frequently on a production database. Most
other forms of replication start with a snapshot to provide initial synchronization between
database copies.
• Transactional replication copies only transaction log entries from server to server.
Assuming two copies of a database start out the same, applying the same transactions to
each will result in identical final copies. Because the transaction data is often quite small,
this technique offers fairly low overhead. However, to achieve low latency, you must
constantly replicate the transactions, which can create a higher amount of cumulative
overhead. Transactional replication also essentially ignores conflicts when the same data
is changed in two sources—the last change is kept regardless of whether that change
comes from a direct user connection or from an older, replicated transaction.
• Merge replication works similarly to transactional replication but is specifically designed
to accommodate conflicts when data is changed in multiple sources. You must specify
general rules for handling conflicts or write a custom merge agent that will handle
conflicts according to your business rules.
• Mirroring, a new option introduced in SQL Server 2005, is primarily designed for high
availability. Unlike replication, which allows you to replicate individual tables from a
database, mirroring is configured for an entire database. Mirroring isn’t appropriate to
scale-out solutions because the mirror copy of the database isn’t intended for production
use; its purpose is primarily as a “hot spare” in case the mirror source fails.
Chapter 5 will provide more details about distributed databases, including information about how to
build them.
However, for applications for which each copy of the data needs to support many write
operations, replication becomes less suitable. As Figure 3.3 illustrates, each change made to one
server results in a replicated transaction to every other server if you need to maintain a low
degree of latency. This fully enmeshed replication topology can quickly generate a lot of
overhead in high-volume transactional applications, reducing the benefit of the scale-out project.
Figure 3.3: Replication traffic can become high in distributed, write-intensive applications.
To work around this drawback, a less-drastic replication topology could be used. For example,
you might create a round-robin topology in which each server simply replicates with its right-
hand neighbor. Although this setup would decrease overhead, it would increase latency, as
changes made to one server would need to replicate three times before arriving at the original
server’s left-hand neighbor. When you need to scale out a write-intensive application such as
this, a distributed, partitioned database—one that doesn’t use replication—is often a better
solution.
Partitioned Databases
Partitioning is simply the process of logically dividing a database into multiple pieces, then
placing each piece on a separate server. Partitioning can be done along horizontal or vertical
lines, and techniques such as replication and distributed partitioned views can be employed to
help reduce the complexity of the distributed database. Figure 3.4 shows a basic, horizontally
partitioned database.
In this example, the odd- and even-numbered customer IDs are handled by different servers. The
client application (or a middle tier) includes the logic necessary to determine the location of the
data and where changes should be made. This particular example is especially complex because
each server only contains its own data (either odd or even customer IDs); the client application
must not only determine where to make changes but also where to query data. Figure 3.5 shows
how replication can be used to help alleviate the complexity.
In this example, when a client makes a change, the client must make the change to the
appropriate server. However, all data is replicated to both servers, so read operations can be
made from either server. Prior to SQL Server 2000, this configuration was perhaps the best
technique for scaling out and using horizontally partitioned databases. SQL Server 200x’s
(meaning either SQL Server 2000 or SQL Server 2005) distributed partitioned views, however,
make horizontally partitioned databases much more practical. I’ll discuss distributed partitioned
views in the next section.
Vertically partitioned databases are also a valid scale-out technique. As Figure 3.6 shows, the
database is split into functionally related tables and each group of tables has been moved to an
independent server.
In this example, each server contains a portion of the database schema. Client applications (or
middle-tier objects) contain the necessary logic to query from and make changes to the
appropriate server. Ideally, the partitioning is done across some kind of logical functional line so
that—for example—a customer service application will primarily deal with one server and an
order-management application will deal with another. SQL Server views can be employed to
help recombine the disparate database sections into a single logical view, making it easier for
client applications to access the data transparently.
Chapter 5 will provide more details about partitioned databases and how to build them.
Chapter 4 covers distributed partitioned views, including details about how to create them.
In this scenario, client applications are not aware of the underlying distributed partitioned
database. Instead, the applications query a distributed partitioned view, which is simply a kind of
virtual table. On the back end, the distributed partitioned view queries the necessary data from
the servers hosting the database, constructing a virtual table. The benefit of the distributed
partitioned view is that you can repartition the physical data as often as necessary without
changing your client applications. The distributed partitioned view makes the underlying servers
appear as one server rather than several individual ones.
Some environments use distributed partitioned views and NLB together for load balancing. A copy of
the distributed partitioned view is placed on each participating server and incoming user connections
are load balanced through Windows’ NLB software across those servers. This statistically distributes
incoming requests to the distributed partitioned view, helping to further distribute the overall workload
of the application.
However, because of the difficult-to-predict performance impact of horizontal partitioning and
distributed partitioned views (which I’ll discuss next), it is not easy to determine whether the NLB
component adds a significant performance advantage.
When are distributed partitioned views a good choice for scaling out? When your data can be
horizontally partitioned in such a way that most users’ queries will be directed to a particular
server, and that server will have most of the queried data. For example, if you partition your table
so that East coast and West coast data is stored on two servers—knowing that West coast users
almost always query West coast data only and that East coast users almost always query East
coast data only—then distributed partitioned views provide a good way to scale out the database.
In most cases, the view will pull data from the local server, while still providing a slower-
performance means of querying the other server.
In situations in which a distributed partitioned view would constantly be pulling data from multiple
servers, expect a significant decrease in performance. In those scenarios, distributed partitioned
views are less effective than an intelligent application middle tier, which can direct queries directly to
the server or servers containing the desired data. This technique is often called data-dependent
routing, and it effectively makes the middle tier, rather than a distributed partitioned view, responsible
for connecting to the appropriate server in a horizontally-partitioned database.
Windows Clustering
I introduced Windows Clustering in Chapter 1, and Chapter 6 is devoted entirely to the topic of
clustering. Clustering is becoming an increasingly popular option for scale-out scenarios because
it allows you to employ many servers while maintaining a high level of redundancy and fault
tolerance in your database server infrastructure. WS2K3 introduces the ability to support 4-way
and 8-way clusters in the standard and enterprise editions of the product, making clustering more
accessible to a larger number of companies (8-way clusters are available only on SQL Server
Enterprise 64-bit Edition).
As I noted in Chapter 1, Microsoft uses the word cluster to refer to several technologies. NLB
clusters, for example, are included with all editions of WS2K3 and are used primarily to create load-
balanced Web and application farms in pure TCP/IP applications. Although such clusters could
theoretically be used to create load-balanced SQL Server farms, there are several barriers to getting
such a solution to work.
To complicate matters further, Windows Server 2003, x64 Edition, supports a new clustering
technology called compute cluster, which is completely different from Windows Cluster Server-style
clustering. I’ll cover that in Chapter 7.
In this book, I’ll use the term cluster to refer exclusively to what Microsoft calls a Windows Cluster
Server. This type of cluster physically links multiple servers and allows them to fill in for one another
in the event of a total hardware failure.
The idea behind clustering is to enlist several servers as a group to behave as a single server.
With Windows Cluster Server, the purpose of this union isn’t to provide load balancing or better
performance or to scale out; it is to provide fault tolerance. If one server fails, the cluster
continues to operate and provide services to users. Figure 3.8 shows a basic 2-node cluster.
This diagram shows the dedicated LAN connection used to talk with the corporate network, and
the separate connection used between the two cluster nodes (while not strictly required, this
separate connection is considered a best practice, as I’ll explain in Chapter 6). Also shown is a
shared external SCSI disk array. Note that each node also contains its own internal storage,
which is used to house both the Windows OS and any clustered applications. The external
array—frequently referred to as shared storage even though both nodes do not access it
simultaneously—stores only the data used by the clustered applications (such as SQL Server
databases) and a small cluster configuration file.
Essentially, one node is active at all times and the other is passive. The active node sends a
heartbeat signal across the cluster’s private network connection; this signal informs the passive
node that the active node is active. The active node also maintains exclusive access to the
external SCSI array. If the active node fails, the passive node becomes active, seizing control of
the SCSI array. Users rarely notice a cluster failover, which can occur in as little as 30 seconds.
Although the service comes online fairly quickly, the user databases must go through a recovery
phase. This phase, depending on pending transactions at time of failover, can take a few seconds or
much longer.
In an active-active cluster, a separate external SCSI array is required for each active node. Each
node “owns” one external array and maintains a passive link to the other node. In the event that
one node fails, the other node becomes active for both, owning both arrays and basically
functioning as two complete servers.
Active-active is one of the most common types of SQL Server clusters because both cluster
nodes—typically higher-end, pricier hardware—are serving a useful purpose. In the event of a
failure, the databases from both servers remain accessible through the surviving node.
Why not cluster? If you’re planning to create a partitioned or distributed database, you will already be
investing in high-end server hardware. At that point, it doesn’t cost much more to turn them into a
cluster. You’ll need a special SCSI adapter and some minor extra networking hardware, but not much
more. Even the standard edition of WS2K3 supports clustering, so you won’t need specialized
software. You will need to run the enterprise edition of SQL Server 2000 in order to cluster it, but the
price difference is well worth the extra peace of mind.
Four-Node Clusters
If you’re buying three or four SQL Server computers, consider clustering all of them. Windows
clustering supports as many as 8-way clusters (on the 64-bit edition), meaning you can build
clusters with three, four, or more nodes, all the way up to eight (you will need to use the
enterprise or datacenter editions for larger clusters). As Figure 3.10 shows, 4-node clusters use
the same basic technique as an active-active 2-node cluster.
I’ve left the network connections out of this figure to help clarify what is already a complex
situation: each node must maintain a physical connection to each external drive array, although
under normal circumstances, each node will only have an active connection to one array.
It is very uncommon for clusters of more than two nodes to use copper SCSI connections to their
drive arrays, mainly because of the complicated wiring that would be involved. As Figure 3.11
shows, a storage area network (SAN) makes the situation much more manageable.
In this example, the external disk arrays and the cluster nodes are all connected to a specialized
network that replaces the copper SCSI cables. Many SANs use fiber-optic based connections to
create a Fibre Channel (FC) SAN; in the future, it might be more common to see the SAN
employ new technologies such as iSCSI over less-expensive Gigabit Ethernet (GbE)
connections. In either case, the result is streamlined connectivity. You can also eliminate the
need for separate external drive arrays, instead relying on external arrays that are logically
partitioned to provide storage space for each node.
Users access the virtual servers by using a virtual name and IP address (to be very specific, the
applications use the virtual name, not the IP address, which is resolved through DNS).
Whichever node “owns” those resources will receive users’ requests and respond appropriately.
Chapter 6 will cover clustering in more detail, including specifics about how clusters work and how to
build SQL Server clusters from scratch.
Discussing clustering in SQL Server terminology can be confusing. For example, you might
have a 2-node cluster that represents one logical SQL Server (meaning one set of databases, one
configuration, and so forth). This logical SQL Server is often referred to as an instance. Each
node in the cluster can “own” this instance and respond to client requests, meaning each node is configured with the virtual SQL Server (that is, each node has the physical SQL Server software installed on its own disk).
A 2-node cluster can also run two instances in an active-active configuration, as I’ve discussed.
In this case, each node typically “owns” one instance under normal conditions; although if one
node fails, both instances would naturally run on the remaining node. A 2-node cluster can run
more instances, too. For example, you might have a 2-node cluster acting as four logical SQL
Server computers (four instances). Each node would “own” two instances, and either node could,
in theory, run all four instances if necessary. Each instance has its own virtual server name,
related IP address, server configuration, databases, and so forth.
This capability for clusters to run multiple instances often makes the active-passive and active-
active terminology imprecise: Imagine an 8-node cluster running 12 instances of SQL Server,
where half of the nodes are “doubly active” (running two instances each) and the others are
merely “active” (running one instance apiece). SQL Server clusters are thus often described in
terms of nodes and instances: An 8 × 12 cluster, for example, has 8 nodes and 12 instances.
Buy pre-built, commodity clusters. Microsoft’s Windows Cluster Server can be a picky piece of
software and has its own Hardware Compatibility List (HCL). Although building a cluster isn’t
necessarily difficult, you need to be careful to get the correct mix of software in the correct
configuration. An easier option is to buy a preconfigured, pre-built cluster (rather than buying pieces
and building your own). Many manufacturers, including Dell, IBM, and Hewlett-Packard, offer cluster-
class hardware and most will be happy to work with you to ship a preconfigured cluster to you. Even if
you don’t see a specific offer for a cluster, ask your sales representative; most manufacturers can
custom-build a cluster system to your specifications.
Also look for clusters built on commodity hardware, meaning servers built on the basic PC platform
without a lot of proprietary hardware. In addition to saving a significant amount of money, commodity
hardware offers somewhat less complexity to cluster configuration because the hardware is built on
the basic, standard technologies that the Windows Cluster Server supports. Manufacturers such as
Dell and Gateway offer commodity hardware.
Real-World Testing
Perhaps the toughest part of conducting a scale-out pilot is getting enough data and users to make
it realistic. Try to start with a recent copy of the production database by pulling it from a backup
tape because this version will provide the most realistic data possible for your tests. If you’re
coming from a single-server solution, you’ll need to do some work to get your database backups
into their new scaled-out form.
Whenever possible, let SQL Server's Integration Services (called Data Transformation Services, or DTS, prior to SQL Server 2005) restructure your databases, copy rows, and perform the other
tasks necessary to load data into your test servers. That way, you can save the DTS packages and
rerun them whenever necessary to reload your servers for additional testing with minimal effort.
It can be difficult to accurately simulate real-world loads on your servers in a formal stress test to
determine how much your scaled-out solution can handle. For stress tests, there are several
Microsoft and third-party stress-test tools available (you can search the Web for the most recent
offerings).
For other tests, you can simply hit the servers with a good-sized portion of users and multiply the
results to extrapolate very approximate performance figures. One way to do so is to run a few
user sessions, then capture them using SQL Server’s profiling tool (SQL Profiler in SQL Server
2000). The profiling tool allows you to repeatedly replay the session against the SQL Server, and
you can copy the profile data to multiple client computers so that the session can be replayed
multiple times simultaneously. Exactly how you do all this depends a lot on how your overall
database application is built, but the idea is to hit SQL Server with the same type of data and
traffic that your production users will. Ideally, your profile data should come from your
production network, giving you an exact replica of the type of traffic your scaled-out solution
will encounter.
Benchmarking
The Transaction Processing Performance Council (TPC) is the industry's official source of database performance benchmarks. However, TPC benchmarks are based upon specific, lab-oriented scenarios, not your company's day-to-day operations. You'll need to conduct your own
benchmarks and measurements to determine which scale-out solutions work best for you.
Exactly what you choose to measure will depend on what is important to your company; the
following list provides suggestions:
• Overall processor utilization
• Number of users (real or simulated)
• Number of rows of data
• Number of transactions per second
• Memory utilization
• Network utilization
• Row and table locks
• Index hits
• Stored procedure recompiles
• Disk activity
By tracking these and other statistics, you can objectively evaluate various scale-out solutions as
they relate to your environment, your users, and your database applications.
Summary
In this chapter, you’ve learned about the various scale-out techniques and the decision factors
that you’ll need to consider when selecting one or more techniques for your environment. In
addition, we’ve explored the essentials of building a lab to test your decisions and for
benchmarking real-world performance results with your scale-out pilot.
A key point of this chapter is to establish a foundation of terminology, which I will use
throughout the rest of the book. The terms distributed and partitioned come up so frequently in
any scale-out discussion that it can be easy to lose track of what you’re talking about. The
following points highlight key vocabulary for scale-out projects:
• Partitioned refers to a database that has been broken up across multiple servers.
• A vertically partitioned database breaks the schema across multiple servers so that each
server maintains a distinct part of the database, such as customer records on one server,
and order records on another server. This can also be referred to simply as a partitioned
database.
• A horizontally partitioned database breaks the data rows across multiple servers, which
each share a common schema.
Chapter 4
Figure 4.1: You can use a view to limit users’ ability to see columns in a table.
You can also use views to pull columns from multiple tables into a single virtual table. As Figure
4.2 shows, this type of view is most often created by using JOIN statements to link tables that
have foreign key relationships. For example, you might create a view that lists a product’s
information along with the name of the product’s vendor rather than the vendor ID number that
is actually stored in the product table.
Figure 4.2: You can use a view to combine information from multiple tables.
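A hedged sketch of such a view follows; the table and column names are hypothetical.
-- Present products with the vendor's name instead of the vendor ID.
CREATE VIEW dbo.ProductsWithVendors
AS
    SELECT p.ProductID, p.ProductName, v.VendorName
    FROM dbo.Products AS p
    INNER JOIN dbo.Vendors AS v
        ON p.VendorID = v.VendorID;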
Views can access tables that are located on multiple servers, as well. In Figure 4.3, a view is used
to pull information from tables located on different servers. This example illustrates a sort of
distributed view, although the view isn’t really doing anything drastically different than a view
that joins tables located on the same server.
Figure 4.3: You can use a view to combine information from tables on different servers.
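The cross-server version looks almost identical; the only difference is that one table is referenced by its four-part linked server name (the server, database, and table names here are hypothetical).
-- Join a local table to a table on a linked server.
CREATE VIEW dbo.ProductsWithRemoteVendors
AS
    SELECT p.ProductID, p.ProductName, v.VendorName
    FROM dbo.Products AS p
    INNER JOIN ServerB.Sales.dbo.Vendors AS v
        ON p.VendorID = v.VendorID;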
Figure 4.4: Using a distributed partitioned view to combine information from horizontally partitioned tables
on different servers.
To illustrate the power behind a distributed partitioned view, consider the following example.
Suppose you have a table with several billion rows (putting the table in the terabyte range), and
you need to frequently execute a query that returns just 5000 rows, based (hopefully) on some
criteria in an indexed column. One server will expend a specific amount of effort and will require
a specific amount of time to gather those rows.
In theory, two servers—each holding only half of the rows and returning only a portion of the results—could
complete the query in half the time that one server would require to return the same number of
rows. Again in theory, four servers could return the results in one-quarter the time. The
distributed partitioned view provides “single point of contact” to a back-end, load-balanced
cluster of database servers. The distributed partitioned view is implemented on each of the
servers, allowing clients to connect to any one server to access the distributed partitioned view;
the distributed partitioned view makes it appear as if all the rows are contained on that server,
when in fact the distributed partitioned view is coordinating an effort between all the back-end
servers to assemble the requested rows.
Microsoft uses the term shared-nothing cluster to refer to the group of back-end servers that work
together to fulfill a distributed partitioned view query. The term describes the fact that the servers are
all required in order to complete the query but that they are not interconnected (other than by a
network) and share no hardware resources. This term differentiates distributed partitioned views and
Microsoft Cluster Service clusters, which typically share one or more external storage devices.
Shared-nothing clusters are also different than the clusters created by Network Load Balancing
(NLB); those clusters contain servers that have a complete, identical copy of data (usually Web
pages) and independently service user requests without working together.
What makes distributed partitioned views in SQL Server truly useful is that the views can be
updated. The distributed partitioned view accepts the updates and distributes INSERT, DELETE,
and UPDATE statements across the back-end servers as necessary, meaning that the distributed
partitioned view is keeping track—under the hood—of which servers contain which rows. The
distributed partitioned view truly becomes a virtual table, not just a read-only representation of
the data.
Figure 4.5: Unbalanced updates place an uneven load on a single server in the federation.
How the data is queried also affects performance. Suppose the users of your products table
primarily query products in sequential blocks of one hundred. Thus, any given query is likely to
be filled by just one, or possibly two, servers. Now suppose that your users tend to query for
newer products—with ID numbers of 3000 and higher—a lot more often than older products.
Again, that fourth server will handle most of the work, as Figure 4.6 shows.
Figure 4.6: Unbalanced partitioning places an uneven load on one server in the federation.
Are the three uninvolved servers actually doing nothing? The answer depends on where the query is
executed. If the clients in Figure 4.6 are submitting their queries to the copy of the distributed
partitioned view on ServerD, then, yes, the other three servers won’t even know that a query is
happening. The reason is that the distributed partitioned view on ServerD knows that only ServerD is
necessary to complete the query.
However, had the clients submitted their query to a copy of the distributed partitioned view on
ServerC, for example, ServerC would execute the query, submit a remote query to ServerD, then
provide the results to the clients. ServerC would be doing a lot of waiting while ServerD pulled
together the results and sent them over, so it might be more efficient for the client application to have
some knowledge of which rows are located where so that the application could submit the query
directly to the server that physically contained the rows.
Don’t be tempted to look at the task of partitioning from a single-query viewpoint. To design a
properly partitioned federation, you want the workload of all your users’ queries to be distributed
across the federation's servers as much as possible. However, individual queries that can be
satisfied from a single server will tend to execute more quickly.
Obviously, the ultimate form of success for any database project is reduced end-user response times.
Balancing the load across a federation should generally help improve response times, but improved response time, rather than load distribution for its own sake, is the metric to measure.
The following example illustrates this point. Suppose your users typically query all of the
products made by a particular vendor. There is no concentration on a single vendor; users tend to
query each vendor’s products several times each day. In this case, partitioning the table by
vendor makes sense. Each single query will usually be serviced by a single server, reducing the
amount of cooperation the servers must handle. Over the course of the day, all four servers will
be queried equally, thus distributing the overall workload across the entire federation. Figure 4.7
illustrates how two queries might work in this situation.
Figure 4.7: Each query is fulfilled by an individual server, but the overall workload is distributed.
For the very best performance, queries would be sent to the server that physically contains the
majority of the rows to be queried. This partially bypasses some of the convenience of a distributed
partitioned view but provides better performance by minimizing inter-federation communications and
network traffic. However, implementing this type of intelligently directed query generally requires
some specific logic to be built-in to client or middle-tier applications. Keep in mind that any server in
the federation can fulfill distributed partitioned view queries; the performance benefit is recognized
when the server that is queried actually contains the majority of the rows needed to respond.
One way to implement this kind of intelligence is to simply provide a lookup table that client
applications can query when they start. This table could provide a list of CHECK values and the
associated federation members, allowing client applications to intelligently submit queries—when
possible—to the server associated with the desired range of values. Although not possible for every
query and every situation, this configuration is worth considering as you determine which column to
use for partitioning your data.
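A minimal sketch of such a lookup table might look like the following; all names and ranges are hypothetical.
-- Map partition ranges to the federation member that owns them.
CREATE TABLE dbo.PartitionMap
(
    RangeStart varchar(50) NOT NULL,
    RangeEnd   varchar(50) NOT NULL,
    ServerName sysname     NOT NULL,
    CONSTRAINT PK_PartitionMap PRIMARY KEY (RangeStart, RangeEnd)
);

INSERT INTO dbo.PartitionMap VALUES ('AAAAA', 'MZZZZ', 'ServerA');
INSERT INTO dbo.PartitionMap VALUES ('NAAAA', 'ZZZZZ', 'ServerB');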
An uneven distribution of rows can cause more subtle problems. Using the four-server products
table as an example, suppose that one of the servers hasn’t been maintained as well as the other
three—its database files are heavily fragmented, indexes have a large number of split pages, and
so forth. That server will typically respond to any given query somewhat more slowly than the
other, better-maintained servers in the federation. When a distributed partitioned view is
executed, the other three servers will then be forced to hold their completed query results in
memory—a very expensive concept for a database server—until the lagging server catches up.
Only when all the servers have prepared their results can the final distributed partitioned view
response be assembled and provided to the client. This type of imbalance—especially if it occurs
on a regular basis—can cause significant performance degradation of an application. Figure 4.8
shows a timeline for a distributed partitioned view execution, and how a lagging server can hold
up the entire process.
Figure 4.8: A slow server in the federation causes unnecessary wait periods in distributed partitioned view
execution.
A lagging server is not always caused by maintenance issues—a server with significantly less
powerful hardware or one that must consistently perform a larger share of query processing than
the others in a federation might hold up the query. The key to eliminating this type of delay is,
again, proper distribution of the data across the federation members.
This example reinforces the idea that step one in a scale-out strategy is to ensure that your servers
are properly tuned. For more information, see Chapter 2.
Linked Servers
A distributed partitioned view requires multiple servers to communicate, so you need to provide
the servers with some means of communication. To do so, you use SQL Server’s linked servers
feature, which provides authentication and connection information to remote servers. Each server
in the federation must list all other federation members as a linked server. Figure 4.9 shows how
each of four federation members have pointers to three partners.
To begin, open SQL Server Management Studio. In the Object Explorer, right-click Linked Servers, and select New Linked Server, as Figure 4.10 shows.
In a dialog box similar to that shown in Figure 4.11, type the name of the server to which you want this server linked, and indicate that it is a SQL Server computer. SQL Server provides built-in
linking functionality, so most of the rest of the dialog box isn’t necessary.
You should also specify the collation compatible option (which I describe later under Best
Practices). Doing so will help improve performance between the linked servers, which you
should set to use the same collation and character set options. Figure 4.12 shows how to set the
option.
Finally, as Figure 4.13 shows, you can set various security options. You can use these settings to
specify pass-through authentication and other options so that logins on the local server can map
to logins on the linked server. Ideally, set up each server to have the same logins—especially if
you’re using Windows Integrated authentication; doing so would make this tab unnecessary.
You’ll make your life significantly easier by maintaining consistent logins across the members of the
federation. I prefer to use Windows Integrated authentication so that I can add domain groups as SQL
Server logins, then manage the membership of those groups at the domain level. By creating task- or
role-specific domain groups, you can add the groups to each federation member and ensure
consistent authentication and security across all your SQL Server computers.
Be sure to set a complex password for SQL Server’s built-in sa login even if you’re setting SQL
Server to use Windows Integrated authentication. That way, if SQL Server is accidentally switched
back into Mixed Mode authentication, you won’t have a weak sa login account as a security hole.
You can also set up linked servers by using the sp_addlinkedserver stored procedure. For more
information on its syntax, consult the Microsoft SQL Server Books Online.
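For example, a rough T-SQL equivalent of the graphical steps described earlier might look like the following; the server name ServerB is only a placeholder, and the login mapping shown assumes consistent Windows Integrated authentication across the federation:

-- Add a federation partner (placeholder name ServerB) as a linked server
EXEC sp_addlinkedserver
    @server = 'ServerB',
    @srvproduct = 'SQL Server';

-- Map local logins to the linked server using their own credentials,
-- which works well when Windows Integrated authentication is used throughout
EXEC sp_addlinkedsrvlogin
    @rmtsrvname = 'ServerB',
    @useself = 'true';

You would repeat these statements on each federation member, once for every partner server.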
Once all of your links are established, you can move on to creating the partitioned tables.
Partitioned Tables
Partitioned tables start out just like any other table, but they must contain a special column that
will be the partitioning column. SQL Server will look at this column to see which of the
federation’s servers contain (or should contain, in the case of added rows) specific rows. The
partitioning column can be any normal column with a specific CHECK constraint applied. This
CHECK constraint must be different on each member of the federation so that each member has
a unique, non-overlapping range of valid values for the column.
A UNION statement is used to combine the tables into an updateable view—the distributed
partitioned view. Keep in mind that SQL Server can only create updateable views from a
UNION statement under certain circumstances; in order to create a distributed partitioned view
on SQL Server 2000, you’ll have to adhere to the following rules:
• The partitioning column must be part of the primary key for the table, although the
primary key can be a composite key that includes multiple columns. The partitioning
column must not allow NULL values, and it cannot be a computed column.
• The CHECK constraint on the column can only use the BETWEEN, OR, AND, <, <=, =,
>, and >= comparison operators.
• The table that you’re partitioning cannot have an identity or timestamp column, and none
of the columns in the table can have a DEFAULT constraint applied.
Here is an example CHECK constraint that you might include in a CREATE TABLE statement:
CONSTRAINT CHK_VendorA_M CHECK (VendorName BETWEEN 'AAAAA' AND 'MZZZZ')
Again, this constraint must exist in the table in each member of the federation, although each
member must supply a different, non-overlapping range of values.
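To put the constraint in context, here is a minimal sketch of how the partitioned table might be created on the member that owns vendor names beginning with A through M; the table name and the Commission column are illustrative assumptions rather than part of the chapter's sample schema:

-- Member table for the A-M range; the partitioning column is part of the
-- primary key, has no DEFAULT, and is not an identity or computed column
CREATE TABLE dbo.Vendors_A_M (
    VendorName varchar(50) NOT NULL
        CONSTRAINT CHK_VendorA_M CHECK (VendorName BETWEEN 'AAAAA' AND 'MZZZZ'),
    Commission money NOT NULL,
    CONSTRAINT PK_Vendors_A_M PRIMARY KEY (VendorName)
);
-- The other federation members create the same structure with a different,
-- non-overlapping CHECK range (for example, N through Z)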
It is important to use UNION ALL rather than specifying some additional criteria because you want all
of the data in each table to be included in the distributed partitioned view.
Each server’s version of the distributed partitioned view will be slightly different because each
server will start with the local copy of the table, then link to the other three (or however many)
members of the federation.
There is no requirement that the tables on each federation member have different names; SQL
Server uses the complete server.database.owner.object naming convention to distinguish between
them. However, from a human-readable point of view, coming up with a suffix or some other indicator
of which server the table is on will help tremendously when you’re creating your distributed partitioned
views and maintaining those tables.
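As a sketch of what the view definition itself might look like on one member of a two-server federation, assuming placeholder names (ServerB for the partner, SalesDB for the database, and range suffixes on the member tables as suggested above):

-- Distributed partitioned view as defined on the first member; the local
-- table is referenced directly, the remote table through its four-part name
CREATE VIEW dbo.Vendors AS
    SELECT * FROM dbo.Vendors_A_M
    UNION ALL
    SELECT * FROM ServerB.SalesDB.dbo.Vendors_N_Z;

On ServerB, the definition is the mirror image: the N-Z table is referenced locally and the A-M table is referenced through the linked server.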
Clients will realize better performance if they query directly against the partitioning column because
SQL Server can make an immediate and certain determination as to which server will handle the
query. In the previous example, only one server could possibly contain vendors with the name
ACMEA because that is the column that was used to partition the table and the servers must have
non-overlapping ranges.
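In T-SQL terms, a query such as the following (using the hypothetical view sketched earlier) lets SQL Server compare the literal against each member's CHECK constraint and send the work to a single server:

SELECT VendorName, Commission
FROM dbo.Vendors
WHERE VendorName = 'ACMEA';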
Best Practices
Designing databases and distributed partitioned views can be a complex task, often filled with
contradictory goals, such as improving performance and reducing database size (which are rarely
completely compatible). To help you create the best design for your situation, and to configure
your SQL Server computers to execute distributed partitioned views as quickly as possible,
consider the following best practices.
Grouping Data
It is not enough to simply partition your primary tables across the servers in the federation.
Ideally, each server should also contain a complete copy of any lookup tables to enable each
server to more readily complete queries on its own. You’ll likely need to make some design
decisions about which lookup tables are treated this way. For example, tables with a small
number of rows or ones that aren’t updated frequently are good candidates to be copied to each
server in the federation. Tables that change frequently, however, or that contain a lot of rows,
may need to be partitioned themselves. The ideal situation is to horizontally partition the primary
table that contains the data your users query most and to include complete copies of all
supporting tables (those with which the primary table has a foreign key relationship). Not every
situation can meet that ideal, but, in general, it is a good design strategy. The fewer external
servers any particular server has to contact in order to complete its share of a distributed
partitioned view query, the better the queries’ performance.
Infrastructure
SQL Server computers that are part of a federation must have the highest possible network
connectivity between one another. Gigabit Ethernet (GbE) is an inexpensive and readily
available technology that provides today’s fastest LAN connectivity speeds and should be part of
the base configuration for any server in a federation. Use fast, efficient network adapters that
place as little processing overload on the servers’ CPUs as possible. In the future, look for
network adapters that implement TCP/IP offload engines (TOE) to minimize CPU impact.
As I’ve already mentioned, try to keep servers in a federation as equal as possible in terms of
hardware. Having one server with a memory bottleneck or with slower hard drives makes it more
difficult for the federation to cooperate efficiently.
Also consider connecting the servers in a federation via a storage area network. SANs provide
the best speed for storage operations and can eliminate some of the bottlenecks often associated
with the heavy-load data operations in a federation.
Database Options
Ensure that each server participating in a federation has the same collation and character set
options. Then set the server option collation compatible to true, telling SQL Server to assume
compatible collation order. Doing so allows SQL Server to send comparisons on character
columns to the data provider rather than performing a conversion locally. To set this option, run
the following command using SQL Query Analyzer:
sp_serveroption 'server_name', 'collation compatible', true
Another option you can set is lazy schema validation. This option helps improve performance by
telling SQL Server’s query processor not to request metadata for linked tables until the data is
actually needed from the remote server. That way, if data isn’t required to complete the query,
the metadata isn’t retrieved. Simply run the following command to turn on the option:
sp_serveroption 'server_name', 'lazy schema validation', true
Of course, you’ll need to set both of these options for each server in the federation.
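For example, on a member whose three partners are named ServerB, ServerC, and ServerD (placeholder names), the complete set of commands might look like this:

-- Repeat both options for every linked federation partner
EXEC sp_serveroption 'ServerB', 'collation compatible', 'true';
EXEC sp_serveroption 'ServerB', 'lazy schema validation', 'true';
EXEC sp_serveroption 'ServerC', 'collation compatible', 'true';
EXEC sp_serveroption 'ServerC', 'lazy schema validation', 'true';
EXEC sp_serveroption 'ServerD', 'collation compatible', 'true';
EXEC sp_serveroption 'ServerD', 'lazy schema validation', 'true';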
Also avoid using the bit, timestamp, and uniqueidentifier data types in tables that sit behind a
distributed partitioned view; distributed partitioned views deal less effectively with these data
types than with others. Also, when using binary large object (blob) data types—such as text,
ntext, or image—be aware that SQL Server can incur a significant amount of additional
processing simply to transmit the large amount of data between the servers in the federation.
Although these object types aren’t forbidden in a distributed partitioned view, they certainly
won’t provide the best possible performance for queries.
Sample Benchmark
I conducted an informal benchmark of a database on both a single server and a 2-node federation
using distributed partitioned views. My database was fairly small at a mere ten million rows. For
hardware, I utilized identical Dell server computers, each with a single Pentium 4 processor and
1GB of RAM. Obviously not high-end server hardware, but it is the relative difference between
single-server and distributed partitioned view performance that I wanted to examine, not the
overall absolute performance values. In fact, using this hardware—as opposed to newer 64-bit
hardware—makes the performance differences a bit easier to measure. I created a simple table
structure within the database using the SQL statements included in Listing 4.1.
Listing 4.1: Example SQL statements used to create a simple table structure within the database.
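The listing itself is not reproduced in this extract. Based on the queries used later in the benchmark, a minimal reconstruction might resemble the following; only the VendorName and Commission columns are confirmed by the text, so the key column and data types are assumptions:

USE TestDB;
GO
-- Simple benchmark table; VendorName and Commission are the columns
-- referenced by the benchmark queries, VendorID is an assumed surrogate key
CREATE TABLE dbo.VendorData (
    VendorID int NOT NULL PRIMARY KEY,
    VendorName varchar(50) NOT NULL,
    Commission money NOT NULL
);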
For the first query, I queried the base table directly. The base table was located on a third
identical server containing all ten million rows, and the query completed in an average of 58
seconds. Next, I queried the view. Each of the two servers in the federation contained half of the
ten million rows, more or less at random. All my queries were returning all the rows, so
achieving a fair distribution of rows wasn’t particularly important. The view responded in an
average of 40 seconds, which is about 30 percent faster. So, two servers are able to query five
million rows apiece faster than one server is able to query ten million rows. Keep in mind that
my response time also includes the time necessary for the server executing the view to compile
the results and provide them to my client.
My second query was simpler:
SELECT DISTINCT Commission
FROM TestDB.dbo.VendorData
The base table responded in 42 seconds; the view responded in 30 seconds, which is about 29
percent faster. Although far from a formal test of scale-out capability, my tests indicated that
distributed partitioned views provide an average 20 to 30 percent faster response time than a
standalone server. Not incredibly impressive, but this example is fairly simplistic—more
complex queries (and databases) would generate a greater performance increase as the complex
query operations became spread across multiple servers. My example doesn’t have much in the
way of complex query operations, so I realized a fairly minimal performance gain. Keep in mind
that my goal with this test wasn’t to measure absolute performance, but rather to see whether any
basic difference existed between distributed partitioned views and a straight, single-server query.
As you can see, distributed partitioned views provide a performance benefit. Also notice that
my queries were specifically chosen to pull all the rows in the database (or to at least make every
row a candidate for selection), creating a fairly even distribution of work across the two servers. I
deliberately avoided using queries that might be more real-world because they might also be
affected by my less-than-scientific partitioning of the data.
Conducting a Benchmark
Make an effort to perform your own tests using data that is as real-world as possible, ideally
copied from your production environment. In addition, run queries that reflect actual production
workloads, allowing you to try several partitioning schemes until you find one that provides the
best performance improvement over a single-server environment. Carefully document the
environment for each test so that you can easily determine which scenario provides the best
performance gains. Your tests should be easily repeatable—ideally, for example, running queries
from saved files to ensure that each one is identical—so that you can make an even comparison
of results. Your benchmarks should also be real-world, involving both queries and updates to
data. You will be building a scale-out solution based on your benchmark results, so make sure
that those results accurately reflect the production loads that your application will see.
Summary
Distributed partitioned views are a powerful tool for scaling out, providing shared-nothing
clustering within the base SQL Server product. Distributed partitioned views require careful
planning, testing, and maintenance to ensure an even overall distribution of querying workload
across federation members, but the planning effort is worth it; distributed partitioned views
allow you to grow beyond the limits of a single server. Using less-expensive, “commodity” PC-
based servers, you can create increasingly large federations to spread the workload of large
database applications—a tactic already employed to great effect in Web server farms throughout
the industry.
In the next chapter, I’ll look at additional scale-out techniques that distribute and partition data
across multiple servers. These techniques are a bit more free-form and can be adapted to a
variety of situations. We will explore how to set up replication and other SQL Server features to
implement various additional scale-out scenarios.
Chapter 5
Distributed Databases
Distributed databases are an easy way to bring more processing power to a database application.
There are two reasons to distribute:
• To place data in closer physical proximity to more users. For example, you might
distribute a database so that a copy exists in each of your major field offices, providing a
local database for each office’s users.
• To absorb a greater amount of traffic than a single database server can handle. For
example, a Web site might use multiple read-only copies of a database for a sales catalog,
helping to eliminate the database back end as a bottleneck in the number of hits the Web
site can handle.
Replication is used to keep the databases in sync. For example, Figure 5.1 shows an example
distributed database.
In this example, the database exists on two servers, ServerA and ServerB. Each server contains
an identical copy of the database, including the database schema and the actual data contained
within the database.
Suppose that a user adds a new database row to ServerB, which then replicates the changes to
ServerA. Both servers again have an identical copy of the data. The downside to this
arrangement is that the two database servers will always be slightly out of sync with one another,
particularly in a busy environment in which data is added and changed frequently. SQL Server
offers multiple types of replication, which I'll cover later in this chapter; each type uses a
different method to strike a balance between overhead and synchronization latency. (SQL Server
2005 also adds database mirroring, which is conceptually similar to replication.)
The design of your distributed database will affect its latency as well. For example, consider the
four-server distributed database in Figure 5.2. In this example, the administrator has created a
fully enmeshed replication topology, which means that each server replicates directly with every
other server. Changes made on any one server are pushed out to the other three servers. This
technique reduces latency because only one “hop” exists between any two servers. However, this
technique also increases overhead, because each server must replicate each change three times.
Another technique is to have ServerA replicate changes only to ServerB; ServerB to ServerC;
ServerC to ServerD; and ServerD to ServerA. This circular topology ensures that every server
replicates each change only once, which reduces overhead. However, latency is increased
because as many as three “hops” exist between any two servers. For example, a change made on
ServerA must replicate to ServerB, then to ServerC, and then to ServerD—creating a much
longer lag time before ServerD comes into sync with the rest of the servers. The amount of
overhead and latency you are willing to tolerate will depend on how complex you are willing to
make your environment, and how much overhead and latency your business applications and
users can handle.
Latency is the most important consideration, from a business perspective, in designing replication. At
the very least, users need to be educated so that they understand that the database exists in multiple
copies, and that the copies won’t always be in sync. Make users aware of average replication times
so that they have reasonable expectations of, for example, the time necessary for their changes to be
replicated.
Your business needs will determine how much latency is acceptable. For example, latency of a
couple of minutes might not matter to most applications. However, applications that depend on real-
time data might not tolerate even a few seconds of latency; in such cases, an alternative, third-party
solution for synchronizing data will be required.
The previous examples are geared toward a database that is distributed across multiple physical
locations; another technique, which Figure 5.3 shows, is to create multiple database servers to
support multiple Web servers.
In this example, one database server holds a writable copy of the database. Internal users make
changes to this copy, and the changes are then replicated to the read-only databases accessed by
the Web servers. This model is infinitely scalable; if you determine that each database server can
support, for example, 50 Web servers, then you simply deploy a new database server for every
50 Web servers you add to your environment. The technique works well primarily for read-only
data, such as an online product catalog. Typically, the Web servers would access a second
database server with data changes, such as new orders.
Any data that doesn’t change very frequently or isn’t changed by a large number of users is an
excellent candidate for this type of replication. A single, writable copy eliminates any possibility of
conflicts, which can happen if data is changed in multiple locations. Multiple read-only copies provide
an easy scale-out method for large applications, particularly Web sites that must support tens of
thousands of users.
Partitioned Databases
The previous Web site example makes a nice segue into the pros and cons of partitioned
databases. Figure 5.4 shows an evolution of the Web site example that includes a fourth database
server used to store customer orders. This server is written to by the Web servers.
This example illustrates a form of partitioned database. Part of the database—the catalog
information—is stored on one set of database servers; another part—customer orders—is stored
on another server. The databases are interrelated, as customers place orders for products that are
in the catalog. In this example, the purpose of the partitioning is to distribute the overall
workload of the application across multiple servers; because the server storing orders doesn’t
need to serve up product information, its power is conserved for processing new orders. In a
particularly large Web site, multiple servers might be required to handle orders, and they might
replicate data between one another so that each server contains a complete copy of all orders,
making it easier for customers to track order status and so forth.
Partitioning a database in this fashion presents challenges to the database administrator and the
application developer. In this Web site example, the developer must know that multiple servers
will be involved for various operations so that the Web servers send queries and order
information to the appropriate server. Each Web server will maintain connections to multiple
back-end database servers.
This complexity can be dispersed—although not eliminated—by creating multi-tier applications.
As Figure 5.5 shows, the Web servers deal exclusively with a set of middle-tier servers. The
middle-tier servers maintain connections to the appropriate back-end database servers,
simplifying the design of the Web application. This design introduces an entirely new application
tier—the middle tier—which has to be developed and maintained, so the complexity hasn’t been
eliminated; it has merely been shifted around a bit.
The point is that partitioned databases always increase complexity. Data has multiple paths
across which it can flow, and different servers are designated with specific tasks, such as serving
up catalog data or storing order data. These database designs can allow you to create staggeringly
large database applications, but you will pay for the power in more complex maintenance and
software development. This Web site scenario is an example of a vertically partitioned database,
in which different tables of the database are handled by different servers.
Figure 5.6 is a simpler model of vertical partitioning in which different tables are split between
two servers. Again, the problem with this technique is that it places a burden on the software
developer to know where specific bits of data are being stored.
SQL Server offers components that help to reduce this complexity. For example, you can create
views that pull from multiple tables on different servers. Views work similarly to distributed
partitioned views, which I covered in the previous chapter. Distributed partitioned views are
designed to work with horizontally partitioned databases; you can also create regular views that
help to consolidate vertically partitioned data.
Views become a key to helping make the multiple servers appear to be one large server, a
technique I’ll discuss later in this chapter when I show you how to implement partitioned
databases. However, views don’t always work well if the servers are physically separated;
partitioning a database usually precludes physically distributing the servers across WAN links
for simple performance reasons.
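As a simple sketch, a regular view on the server that holds the customer data might present a table stored on a second, linked server as if it were local; all object names here are placeholders:

-- Defined on the server that stores Customers; Orders lives on ServerB
CREATE VIEW dbo.CustomerOrders AS
    SELECT c.CustomerID, c.CustomerName, o.OrderID, o.OrderDate
    FROM dbo.Customers AS c
    JOIN ServerB.SalesDB.dbo.Orders AS o
        ON o.CustomerID = c.CustomerID;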
Distributed Databases
A basic design rule is that a distributed database is useful when you need to make multiple
copies of data available. Perhaps you want the copies to be physically distributed so that the
copies are close to individual user populations, or perhaps you need multiple copies to support
the back-end requirements for a large application. In either case, multiple copies of a database
create specific problems:
• Changes to the copies must somehow be reconciled.
• Reconciliation has processing overhead associated with it.
• Reconciliation has a time factor, referred to as latency, associated with it.
SQL Server’s replication features are designed to handle data reconciliation with varying degrees
of overhead, latency, and ability to handle conflicting changes.
One way to neatly avoid most of the problems raised by distributed databases is to allow the copies
of the database to be read-only. If changes are made only on one copy, then those changes are
distributed to read-only copies, and you only need to be concerned about the latency in pushing out
changes to the read-only copies. Some applications lend themselves to this approach; many do not.
To begin, let’s cover some basic SQL Server replication terminology. First, an article is the
smallest unit of data that SQL Server can replicate. You can define an article to be a table, a
vertical or horizontal partition of data, or an entire database. Articles can also represent specific
stored procedures, views, and other database objects.
Articles are made available from a publisher, which contains a writable copy of the data. A
subscriber receives replication changes to the article. A distributor is a special middleman role
that receives replication data from a publisher and distributes copies to subscribers, helping to
reduce the load of replication on the publisher. A publication is a collection of articles and a
definition of how the articles will be replicated; a subscription is a request to receive a
publication's data. Push subscriptions are generated by the
publisher and sent to subscribers; pull subscriptions are made available to subscribers, which
must connect to receive the subscription’s data.
In a case in which multiple servers will contain writable copies of the data, each server will act
both as publisher and subscriber. In other words, ServerA might publish any changes made to its
copy of the data while simultaneously subscribing to changes that occur on ServerB, ServerC,
and ServerD. SQL Server has no problem with a single server both sending and receiving
changes to a database. SQL Server supports different types of replication:
• Snapshot replication is designed to copy an entire article of data at once. SQL Server
must be able to obtain an exclusive lock on all the data contained in the article, and can
compress the replicated data to conserve network bandwidth. Because of the requirement
for an exclusive lock, snapshot replication isn’t suitable for high-volume transactional
databases; this replication type is used primarily for data that is mostly static. Snapshots
can be high-overhead when the snapshot is taken, meaning you’ll schedule snapshots to
occur infrequently. Subscribers to the snapshot replace their copy of the data with the
snapshot, meaning there is no capability to merge copies of the database and handle
conflicts. Snapshots are often a required first step in establishing other types of
replication so that multiple copies of the database are known to be in the same condition
at the start of replication.
Snapshot replication is most useful for distributing read-only copies of data on an infrequent basis.
• Transactional replication begins with an initial snapshot of the data. From there,
publishers replicate individual transactions to subscribers. The subscribers replay the
transactions on their copies of the data, which results in the copies of the database being
brought into synchronization. No facility for handling conflicts is provided; if two
publishers make changes to the same data, their published transactions will be played on
all subscribers, and the last one to occur will represent the final state of the replicated
data. Transactional replication is fairly low-bandwidth, low-overhead, and low-latency,
making it ideal for most replication situations. It is often paired with a form of horizontal
partitioning, which might assign specific database rows to specific copies of the database.
Doing so helps to reduce data conflicts; you might, for example, assign different blocks
of customer IDs to different field offices so that the different offices avoid making
changes to each others’ data.
Transactional replication offers the easiest setup and ongoing maintenance. It deals poorly with
conflicting changes, so it is best if the database is horizontally partitioned so that each publisher tends
to change a unique group of rows within each table. Transactional replication is also well-suited to
data that doesn't change frequently or that is changed by a small number of users connecting to a
particular publisher.
• Merge replication is perhaps the most complex SQL Server replication technique. Also
starting with a snapshot, merge replication works similarly to transactional replication
except that interfaces are provided for dealing with conflicting changes to data. In fact,
you can develop customized resolvers—or use one of SQL Server’s built-in resolvers—to
automatically handle changes based on rules. Merge replication offers low-latency and
creates an environment in which changes can be made to data in multiple places and
resolved across the copies into a synchronized distributed database.
Merge replication offers the most flexibility for having multiple writable copies of data. However, this
replication type can have higher administrative and software development overhead if SQL Server’s
built-in default resolver isn’t adequate for your needs.
For merge replication, SQL Server includes a default resolver; its behavior can be a bit complex.
Subscriptions can be identified as either global or local, with local being the default. For local
subscriptions, changes made to the publisher of an article will always win over changes made by
a subscriber. You might use this method if, for example, a central office’s copy of the database is
considered to be more authoritative than field office copies. However, care must be taken in
client applications to re-query data for changes, and users must be educated to understand that
their changes to data can be overridden by changes made by other users.
Subscriptions identified as global carry a priority—from 0.01 to 99.99. In this kind of
subscription, subscribers are synchronized in descending order of priority, and changes are
accepted in that order. Thus, you can define levels of authority for your data and allow certain
copies of your data to have a higher priority than other copies.
Merge replication was designed to understand the idea of changes occurring at both the subscriber
and publisher, so you don’t need to create a fully enmeshed replication topology in which each copy
of the data is both a publisher and subscriber. Instead, select a central copy to be the publisher and
make all other copies subscribers; merge resolvers then handle the replication of changes from all
copies.
SQL Server also includes an interactive resolver, which simply displays conflicting changes to
data and allows you to select which change will be applied. It is unusual to use this resolver in an
enterprise application, however; it is far more common to write a custom resolver if the default
resolver doesn’t meet your needs. Custom resolvers can be written in any language capable of
producing COM components, including Microsoft Visual C++. Of course, SQL Server 2005
integrates the Microsoft .NET Common Language Runtime, making .NET a possibility for
writing merge resolvers.
While SQL Server 2000 only supported COM-based resolvers, SQL Server 2005 supports both COM-
based custom resolvers and business logic handlers written in managed (.NET) code.
As I mentioned earlier, transactional replication is by far the most popular form of replication in
SQL Server, in no small part because it is so easy to set up and an excellent choice when creating
distributed databases. To help avoid the problem of conflicting changes, transactional replication
is often paired with horizontal partitioning of data. For example, Figure 5.7 shows how a table
has been divided so that one server contains all even-numbered primary keys, and a second
server contains odd-numbered keys. This partitioning represents how the data is used—perhaps
one office only works with odd-numbered clients and another focuses on the evens—reducing
the number of data conflicts.
A more common technique is to create a partitioning column. For example, customer records
might have a Region column that contains a value indicating which regional field office deals
with that customer the most. Conflicting changes to the customer’s data will be rare, as most
changes will be made to only that region’s data, with the change then replicated to other regions’
database servers.
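As a rough sketch, that kind of row filter can be applied when an article is added to a transactional publication; the publication name, table, and region value shown here are hypothetical:

-- Publish only the rows owned by the East region's office
EXEC sp_addarticle
    @publication = 'CustomersPub',
    @article = 'Customers',
    @source_owner = 'dbo',
    @source_object = 'Customers',
    @filter_clause = N'Region = ''East''';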
Partitioned Databases
Partitioning a database is usually performed to accomplish one of two goals:
• Distribute processing workload so that different database servers handle different
portions of the database. This setup is usually accomplished through vertical partitioning.
• Segregate portions of the database so that, although copies exist on multiple servers,
certain parts of the data are “owned” by only a single server. This setup is usually
accomplished through horizontal partitioning and is often used in conjunction with
replication, as I’ve already described.
Horizontal partitioning is a simpler matter, so I’ll cover it first. It is simply a matter of separating
the rows of your database so that particular rows can be “owned” by a specific server. To do so,
you follow the same process used to create distributed partitioned views (see Chapter 4 for more
information about this process). You might have a specific partitioning column, as I’ve already
described, which assigns rows based on criteria that are appropriate to your business (for
example, a regional code, a range of customer IDs, a state, and so on).
Vertical partitioning is more difficult because you’re splitting a database across multiple servers,
as Figure 5.6 shows. Usually, you will split the database along table lines so that entire tables
exist on one server or another. The best practice for this technique is to minimize the number of
foreign key relationships that must cross over to other servers. Figure 5.8 shows an example.
In this example, three tables dealing with orders and customers are kept on one server, and a
table containing product information is stored on another server. This example shows only one
foreign key relationship crossing between servers—between the Products and OrderLines tables.
Depending on your needs, full partitioning might not be the best answer. For example, suppose you
use the database design that Figure 5.8 shows. The reason for partitioning the database is so that the
servers containing the product and order information can each handle a higher workload than if all
that information was contained on a single server.
An alternative technique is to keep a copy of the product information on the server that contains the
order information. Doing so would improve performance for that server because the server could
maintain its foreign key relationship locally. The second server could handle actual queries for
product information and replicate product changes to the order server’s read-only copy of the table.
Distributed Databases
One of the first things you’ll want to set up is replication publishing and distribution. The
publisher of a subscription isn’t necessarily the same server that distributes the data to
subscribers; the role of distributor can be offloaded to another SQL Server computer. To
configure a server as a publisher or distributor, open SQL Server Management Studio (in SQL
Server 2005; for SQL Server 2000, you use SQL Enterprise Manager and the steps are slightly
different). From the Object Explorer, right-click Replication, then select Configure Distribution.
As Figure 5.9 shows, a wizard will walk you through the necessary steps. You can either have
the publisher be its own distributor (as shown), or select one or more other servers as
distributors.
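The wizard is the simplest route, but the same configuration can be scripted. As a rough sketch, the following statements make a server its own distributor and create the conventional distribution database; the password is a placeholder, and further steps (enabling publishing and creating publications) follow as described in SQL Server Books Online:

-- Configure the local server as its own distributor
DECLARE @me sysname;
SET @me = @@SERVERNAME;
EXEC sp_adddistributor
    @distributor = @me,
    @password = N'PlaceholderStrongPassword1';

-- Create the distribution database on the distributor
EXEC sp_adddistributiondb @database = N'distribution';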
When configuring replication, ensure that the SQL Server Agent is configured to start using a user
account that is valid on all computers that will participate in replication; generally, that will mean using
a domain user account. SQL Server Agent handles much of the work involved in replication and
cannot be running under the default LocalSystem account if replication is to work.
To quickly select all tables, click the Publish All checkbox in the right-hand window, next to Tables.
4. Finish by specifying a name for the publication. You can also specify additional
properties for the publication, including data filters, anonymous subscribers, and so forth.
For more information about these additional properties, refer to SQL Server Books
Online.
The Local Publications list should be updated to reflect the new publication. You can also right-
click the Local Publications folder to examine the Publication Databases list (see Figure 5.11).
There are several caveats associated with complex publications that involve multiple publishers. For
example, by default, IDENTITY columns in a publication are not replicated as IDENTITY columns;
they are simply replicated as normal INT columns. This default setting doesn’t allow the subscribers
to update the tables and create new IDENTITY values; although SQL Server can certainly handle
publications in which subscribers can create new IDENTITY values, setting up these publications
requires more manual effort and is beyond the scope of this discussion. For more details, consult
SQL Server Books Online.
As an alternative, you can generate globally unique identifiers (GUIDs) to replace IDENTITY columns
as unique keys. SQL Server can generate GUIDs for you, and will replicate GUIDs across servers
with no conflict.
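For example, a table destined for replication might use a uniqueidentifier key in place of an IDENTITY column. This is a generic sketch, not the book's sample schema; NEWSEQUENTIALID requires SQL Server 2005 and can be used only in a DEFAULT constraint (on SQL Server 2000 you would use NEWID instead), and the pattern suits replicated tables rather than tables behind a distributed partitioned view, which cannot carry DEFAULT constraints, as Chapter 4 explained:

CREATE TABLE dbo.Orders (
    OrderID uniqueidentifier ROWGUIDCOL NOT NULL
        CONSTRAINT DF_Orders_OrderID DEFAULT NEWSEQUENTIALID()
        CONSTRAINT PK_Orders PRIMARY KEY,
    CustomerID int NOT NULL,
    OrderDate datetime NOT NULL
);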
To subscribe to the publication, you will follow similar steps. For example, right-click Local
Subscriptions to create a new subscription. As Figure 5.12 shows, a Wizard walks you through
the entire process.
To create a pull subscription, open Management Studio on the subscriber. From the Replication
sub-menu, select Pull Subscription. You will see a dialog box similar to the one in Figure 5.12
listing current subscriptions. Click Pull New Subscription to create a new subscription.
Once replication is set up, it occurs automatically. SQL Server includes a Replication Monitor
within Management Studio (see Figure 5.13) that you can use to monitor the processes involved
in replication. In this case, the Log Reader agent is the service that monitors the SQL Server
transaction log for new transactions to published articles; when it finds transactions, it engages
the distributor to distribute the transactions to subscribers of the published articles.
Partitioned Databases
Vertically partitioned databases are very easy to create—simply move tables from one server to
another. Deciding which tables to move is the difficult part of the process, and reprogramming
client applications to deal with the new distribution of data can be a major undertaking.
Unfortunately, there are no tools or rules for designing the partitioning of a database. You will
need to rely on your own knowledge of how the database works, and perhaps performance
numbers that tell you which tables are most often accessed as a set. Spreading commonly-
accessed tables across multiple servers is one way to help ensure a performance benefit in most
situations.
There are also no tools for reprogramming your client applications to deal with the newly
partitioned database. However, SQL Server does make it possible to create an abstraction
between the data a client application sees and the way in which that data is physically stored,
partitioned, or distributed.
One technique to help make it easier for programmers to deal with partitioned databases is views.
Figure 5.14 shows an example of a vertically partitioned database in which different tables exist
on different servers. A view can be used to combine the two tables into a single virtual table,
which programmers can access as if it were a regular table. Stored procedures can provide a
similar abstraction of the underlying, physical data storage. Applications could be written to deal
entirely with the actual, physical tables; the virtual tables represented by views; or a combination
of the two, depending on your environment. Keep in mind that the server hosting the view uses a
bit more overhead to collect the distributed data and assemble the view; be sure to plan for this
additional overhead in your design and place the views accordingly.
It's also possible to use SQL Server as a middle tier in partitioned database schemes. For example,
you might have tables spread across ServerA and ServerB, and construct views on ServerC. Client
applications would deal solely with ServerC, and ServerC would assemble virtual tables from the data
on ServerA and ServerB. This setup requires significant planning but can provide a useful abstraction
so that software developers don’t need to be concerned with how the data is physically distributed. In
addition, this configuration prevents either ServerA or ServerB from hosting all the views related to
the database application.
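A stored procedure on the middle-tier server can provide the same kind of abstraction. In this sketch, the procedure is defined on ServerC and reaches through linked servers to ServerA and ServerB; every object name is a placeholder:

-- Defined on ServerC, the middle-tier server
CREATE PROCEDURE dbo.GetOrderDetails
    @OrderID int
AS
    SELECT o.OrderID, o.OrderDate, p.ProductName, ol.Quantity
    FROM ServerA.SalesDB.dbo.Orders AS o
    JOIN ServerA.SalesDB.dbo.OrderLines AS ol
        ON ol.OrderID = o.OrderID
    JOIN ServerB.SalesDB.dbo.Products AS p
        ON p.ProductID = ol.ProductID
    WHERE o.OrderID = @OrderID;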
Best Practices
Creating best practices for distributed and partitioned databases is difficult; every business
situation has unique needs and challenges that make it difficult to create a single set of beneficial
rules. However, there are certainly guidelines that have proven effective in a wide variety of
situations. Don’t consider these hard and fast rules—take them as a starting point for your
designs:
• Reduce the number of subscribers that a publisher must deal with when it is also acting as
a database server for users or database applications. If necessary, create a standalone
distributor so that the publisher only needs to replicate data once (to the distributor), after
which the distributor handles the brunt of the replication work to the subscribers.
• If latency is an issue, employ transactional or merge replication and create a fully
enmeshed replication topology. If latency is not an issue—for example, a product catalog
being distributed to read-only copies might only need to be replicated once a week—then
use snapshot replication.
• As I’ve already mentioned, minimize the number of cross-server foreign key
relationships and other cross-server object references when vertically partitioning a
database. Cross-server references pass through SQL Server’s Linked Servers
functionality (which I described in Chapter 4) and can have a negative impact on overall
performance if overused.
• Minimize the potential for data conflicts in replication so that you can use simpler
transactional replication rather than the more complex merge replication. Horizontally
partitioning tables so that each copy of the database “owns” particular rows can go a long
way toward reducing data collisions (or conflicts) and can make transactional replication
more viable in an environment with multiple writable copies of a database.
• Reduce the programming complexity of vertically partitioned databases by making use of
views and stored procedures. These objects can abstract the underlying physical database
structure so that software developers deal with a single set of objects (views and stored
procedures) regardless of where the underlying data is actually situated.
Working with distributed or partitioned databases can be especially difficult for software
developers, so make sure you include them in your initial scale-out design processes. They will
need to understand what will need to change, if anything, in their client applications. In addition,
perform basic benchmark testing to determine whether your proposed scale-out solution provides
tangible performance benefits for your end users; how client applications function will play a
major role in that performance. Including software developers in the planning and testing stages
will help ensure more accurate results.
Benchmarks
Measuring the performance of a scale-out solution that uses distributed and/or partitioned
databases can be complex because it is difficult to determine what to measure. For example,
suppose you’ve created a distributed database like the one that Figure 5.3 illustrates. The purpose
is to allow more Web servers to exist by having multiple copies of a database. All hardware
being equal, a new database server should double the potential throughput of your Web site,
because the new database server can support the same number of servers as the original database
server. Similarly, if your existing Web farm can handle 10,000 users per hour with one back-end
database and 10 Web servers, having two back-end database servers and 20 Web servers should
provide the power for 20,000 users per hour.
The main thing to measure is end-user response time because that metric is ultimately the sign of
success or failure in any IT project.
This type of calculation becomes less straightforward when you move into more complex—and
realistic—scenarios like the one that Figure 5.4 shows. In this case, the central Orders database
server could serve as a performance bottleneck, preventing you from exactly doubling your site’s
overall user capacity.
You could also be using distributed databases in a scenario like the one I showed you in Figure
5.2, with multiple database servers housed in different physical locations. Again, hardware being
equal, each database server should be able to handle an equal number of users. However, the
actual performance gain from such a scenario can be greater than simply providing more power
at the database tier. For example, suppose you start out with a single database server located in a
central office, and field office users connect via WAN. And suppose that your database server is
approaching its performance limits with several thousand company users connecting each day.
Adding a server at your two major field offices would provide two performance benefits: the
workload of the database application would be distributed across three servers (which will allow
each server to maintain peak efficiency) and users will be accessing data across a LAN—rather
than a WAN—which will create at least the perception of improved application performance.
Figure 5.15 illustrates how network speed provides the performance gain.
Figure 5.15: Local SQL Server computers have an impact on perceived performance.
To illustrate this concept with another example, suppose your original server, located at one of
your company’s two offices, can support all 5000 of your company users, which is far from the
server’s limit. Half of the users access the data across a WAN link. Now suppose you get another
identical server and place it in your other office. Neither server will be working close to its
capacity, but the second office will definitely see a performance benefit from the distributed
database because they are now accessing data across the LAN instead of across the slower WAN
link. The first office’s users won’t see any performance change at best; at worst, they might see a
slight decrease in performance as a result of the additional load of replication (performance
degradation is unlikely in this case; replication isn’t that big of a burden in a scenario such as
this). This setup illustrates how it can be difficult to measure the performance gains of a
distributed database scale-out solution—there are several factors completely unrelated to SQL
Server that can affect users’ perception of performance.
Measuring the success of a vertically partitioned database can be even more difficult. It’s nearly
impossible to measure the performance each table contributes to an application’s performance.
For example, if you were to divide a database between two servers so that exactly half the tables
were on each server, it’s unlikely that you would double performance. The reason is that some
tables are more heavily used than others. Additionally, a poorly designed partitioning scheme
can hurt performance by forcing servers to rely too much on remote foreign key tables, which
must be queried across the LAN.
The only accurate way to measure the performance benefits—or drawbacks—of a vertical
partitioning scheme is to objectively measure the performance of the database application as a
whole. In other words, construct metrics such as maximum number of users or average response
time for specific user activities. By measuring these end user-based metrics, you will be able to
account for all of the various factors that can affect performance, and arrive at an objective
performance measurement for the application as a whole.
Summary
Distributing and partitioning databases are time-tested, flexible ways to increase the performance
of a database application. In fact, distributed partitioned views, which I discussed in the previous
chapter, are an outgrowth and refinement of the database distribution and partitioning techniques
I’ve discussed in this chapter. Distributing a database gives you the flexibility to place multiple
copies of data in a single location and balance workload between the copies. Alternatively, you
can distribute data across locations to provide faster access to different groups of users.
Partitioning—both horizontal and vertical—can also provide a performance gain, particularly for
well-designed databases that offer logical divisions in either tables or rows.
It is not a straightforward task to predict performance gains from distributing and partitioning
databases. It’s difficult to fire off sample queries against a non-distributed copy of a database and
compare the results to the performance of a distributed copy; the nature of distribution is to
increase potential capacity, not necessarily to increase the performance of individual queries.
When making performance comparisons, consider the total activity of an entire application to
determine the effectiveness of your scale-out solution.
In the next chapter, I’ll focus on Windows Clustering. Clustering is a common addition to scale-
out solutions, as it prevents the single point of failure that a database server can represent. By
clustering SQL Server computers, you can create a multiple-server scale-out solution that isn’t
vulnerable to the failure of a single piece of server hardware.
Chapter 6
I want to emphasize that Windows Clustering isn’t a scale-out solution in and of itself; it is, however, a
common addition to scale-out solutions because it provides the high availability that scale-out
solutions often require. SQL Server 2005 also provides database mirroring, a high-availability solution
that provides similar capabilities. In addition to Windows Clustering, this chapter will briefly discuss
database mirroring.
Clustering Overview
Microsoft has offered clustering as an option since NT 4.0, Enterprise Edition. In Win2K, only
the Advanced Server and Datacenter Server editions include clustering capabilities; in WS2K3,
the Cluster Service is likewise limited to the Enterprise and Datacenter editions.
There are several non-Microsoft solutions for clustering SQL Server, many of which also provide SQL
Server-specific advantages such as real-time replication capabilities. However, for this chapter we’ll
focus on the Windows Cluster Service software provided with Windows.
Clustering Terminology
Before we explore clustering in more detail, it is important to define some basic terminology to
prevent confusion. The following list highlights the essential clustering terms:
• Node—A single server within a cluster. It’s called a node to distinguish it from other
non-clustered servers.
• Cluster—Any collection of one or more nodes. Even a cluster with just one node is
considered a cluster: If you have a 2-node cluster, and one node fails, you’re left with a 1-
node cluster.
• Virtual server—End users and client applications don’t connect directly to cluster nodes;
they connect to virtual servers, which represent specific services—such as file sharing,
Exchange Server, and SQL Server—that the cluster can provide. Virtual servers can be
passed back and forth across cluster nodes, allowing the service to remain available even
if a particular node isn’t.
In this example, two servers are nodes in the cluster. Each node provides private storage for the
OS and any applications the cluster will run, such as SQL Server. This private storage can be in
the form you prefer, such as internal hard drives or an external drive array.
The nodes are also connected to a single external drive array. All of the nodes are connected to
this array, but they can’t all access the array at the same time. The external array can be as
simple as a RAID cabinet provided by your server vendor or more powerful, such as an EMC
storage cabinet. Regardless, the external storage must be configured to have at least two logical
volumes: one large volume will be used to store data from clustered applications such as SQL
Server, and a small volume is required for the cluster’s quorum resource, a file that describes the
cluster’s configuration.
The “shared” external drive array provides a single SCSI bus. Both nodes connect to this bus,
although their controllers must have different SCSI device ID numbers. Only one computer can
successfully communicate over the bus at a time; thus, the Windows Cluster Service controls the
nodes’ communications over the bus. As a result of the special level of control required, only certain
SCSI array controllers that provide cluster-compatible drivers can be used in a cluster configuration.
Also note that the nodes share a network connection to one another. This connection can be as
simple as a crossover cable, or the connection can be run through a more traditional hub or
switch. This private network connection will carry the heartbeat signal, a continuous pulse that
proves the cluster’s active node is still functioning. You could run this heartbeat over the regular
network connection that connects the nodes to the rest of your network, but you run the risk of
occasional spikes in network traffic delaying the heartbeat. A delayed heartbeat could result in
unnecessary failovers; thus, it is best to use a dedicated connection.
Cluster Startup
When you start up the first node in the cluster, it runs Windows normally. When the Windows
Cluster Service starts, the service performs a bus reset on the shared SCSI array. After
performing the reset, the service pauses to determine whether another attached node performs a
similar reset. When none does (because no other node is turned on yet), the node determines that
it is the first node in the cluster and immediately begins starting all clustered resources.
Clustered resources typically include the external storage array, one or more virtual computer
names, one or more virtual IP addresses, and clustered applications such as DHCP, Exchange
Server, and SQL Server. Clustered applications’ executable files are stored on the node’s private
storage; their data is stored on the cluster’s shared external array. Clients use the virtual
computer names and IP addresses to talk to the cluster and the clustered applications; because
any node in the cluster can potentially respond to these virtual names and addresses, clients will
always be able to contact the cluster even if a particular node isn’t available.
When the second (and subsequent) nodes are started up, they also perform a SCSI bus reset.
However, the first node owns the external storage resource at this time, so it immediately
performs its own bus reset. The second node sees this reset, determines that the cluster is already
running, and assumes a passive role. In this role, the second node simply monitors the incoming
heartbeat signal and waits for the active node to fail. Any clustered services—such as SQL
Server—are held in a suspended state rather than started normally. Figure 6.2 illustrates the
cluster’s condition.
The passive node is only passive with regard to the cluster. In other words, it’s possible for the
passive node to perform useful work because it is a fully fledged Windows server. The node
simply focuses on non-clustered applications. For example, you can run a reporting application
on the passive “spare” node, allowing the node to be useful while acting as a backup for the
functioning active node.
Cluster Operations
While the cluster is working, incoming traffic is sent to the cluster’s virtual IP addresses. The
active node responds to this traffic, routing it to the appropriate applications. In fact, the only
practical difference between the active node and any other Windows server is that the node is
sending a heartbeat signal to the other nodes in the cluster.
Interestingly, the cluster doesn't use a virtual MAC address to respond to incoming traffic. When
clients (or a router) need to forward traffic to the active node, the Address Resolution Protocol
(ARP) sends out a request for the MAC address, including the requested cluster IP address in the
request. The active node sees the request and responds with its own MAC address. Should the
active node fail, a few requests might be sent before clients (and routers) realize that they are not
getting a response; in that case, the clients (and routers) resend the ARP request, and
whichever node has taken over for the failed node responds with its own MAC address.
Thus, clustered applications’ client components must be willing to resend requests in order to re-
establish connectivity when a cluster node fails and the passive node takes over. One reason that
SQL Server works so well as a clustered application is that Microsoft wrote both the client and
server end: Although you might have a custom client application running on your users’
computers, that application is probably using Microsoft’s ActiveX Data Objects (ADO), Open
Database Connectivity (ODBC), or ADO.NET in order to connect to SQL Server. Those
database connectivity objects will automatically resend requests to the server as needed, instantly
making your custom client applications cluster-aware.
While the cluster is running, you can transfer the cluster’s active node responsibility from node
to node. Although similar to a failover, this process is much more controlled. Essentially, you
transfer a group of cluster resources—such as a virtual computer name, IP address, and SQL
Server service—from one node to another. On the newly active node, those services will begin to
start, while at the same time they begin to shut down on the now-passive node. Generally, a
transfer of resources from one node to another takes about half a minute or less, depending upon
the specific applications and services involved. Transferring services in this fashion allows you
to perform maintenance on cluster nodes while keeping the overall clustered application
available to your users.
Cluster Failover
At some point, an active cluster node will fail. When it does, its heartbeat signal stops, telling the
passive node that there is a problem. The passive node performs a bus reset on the external,
shared SCSI bus array. When the formerly active node doesn’t perform its own reset, the passive
node determines that the other node has failed.
The SCSI bus reset step is an extra precaution. If the heartbeat signal had failed momentarily due to
a network problem, the SCSI bus reset would keep the passive node from seizing control of the
cluster when the active node is still working. When both steps—the heartbeat and the bus reset—fail,
the passive node knows it’s time to step in and take over.
The passive node now seizes control of the cluster, appointing itself active node. It quickly reads
the quorum resource to determine how the cluster is currently configured, and begins starting
clustered services and applications. It also begins responding to the cluster’s virtual IP addresses
and names. Within about 30 seconds, the passive node is the active node. Figure 6.3 illustrates
failover in a sample cluster.
Windows Clustering supports a concept called failback, where the cluster will attempt to shift
clustered resources back to the original, preferred node. For example, suppose you built your
cluster so that the passive node is performing other, non-clustered work, and you don’t want it
being the active node in the cluster for any longer than necessary. To configure this preference,
you designate within the Cluster Service that one node is the preferred node for clustered
resources. Whenever that node is online, all resources will be transferred to it. If it fails and is
subsequently restarted, all cluster services will transfer back to it once it is online again.
However, the back-and-forth of clustered resources across nodes can prove annoying and
disruptive to users. To prevent disruption, you can configure failback policy to only occur during
evening hours, and to stop occurring if the preferred node fails a certain number of times.
Active-Active Clusters
To eliminate the waste of having one server sitting around in case the other fails, you have the
option to create active-active clusters. In an active-active cluster, each node performs useful
work and is backed up by the other nodes. For example, consider the cluster in Figure 6.4.
This figure illustrates two logical clusters implemented across two nodes. Each logical cluster
has a set of resources, including an external drive array that is connected to both nodes, a virtual
IP address, a virtual name, and one or more clustered applications. Under normal circumstances,
each node owns one set of resources, just as a standalone server might. One cluster is highlighted
in yellow, and the other in blue.
When a failure occurs, one node can own both sets of clustered resources, effectively becoming
two servers on one machine. Depending on how you design the cluster, its performance might be
much lower than it was when both nodes were running; but poor performance is often better than
no performance. Figure 6.5 shows the failed-over configuration, with one server handling both
logical clusters.
Windows Clustering in the higher-end editions of Windows, such as WS2K3 Enterprise Edition and
Datacenter Edition, can handle more than two nodes in a cluster. In fact, it's not uncommon
to have 4-node clusters running SQL Server in an active-active-active-active configuration. In
these configurations, each node runs a functioning SQL Server instance, and any node can take over
for the failure of any other node. Indeed, it's theoretically possible for three of the four servers
to fail and for all four logical clusters to continue serving clients. However, to actually achieve
this level of redundancy, you would need to engineer each node to run at no more than 20 to 30
percent of its maximum capacity under normal conditions. That level of over-engineering can be
expensive, which is why most cluster designers target 60 percent utilization, allowing each node
to carry the load of two nodes with only somewhat degraded overall application performance.
Figure 6.6: Distributed partitioned views allow multiple servers to contribute to a single set of query results.
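To make the idea concrete, here is a minimal T-SQL sketch of such a view; the server, database, table, and key ranges are illustrative assumptions rather than values taken from the figure. Each federation member holds one horizontally partitioned member table, constrained by a CHECK constraint on the partitioning column, and the view unions the local table with the member tables reached through linked servers:

-- Defined on each federation member; the other members are reached as linked servers.
-- CHECK constraints on CustomerID in each member table let SQL Server send a query
-- only to the server(s) that can actually hold the requested rows.
CREATE VIEW dbo.Customers
AS
SELECT * FROM ServerA.Sales.dbo.Customers_1   -- CustomerID 1 through 99999
UNION ALL
SELECT * FROM ServerB.Sales.dbo.Customers_2   -- CustomerID 100000 through 199999
UNION ALL
SELECT * FROM ServerC.Sales.dbo.Customers_3;  -- CustomerID 200000 and up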
Another technique is data-dependent routing. In this technique, a custom middle tier of your
application takes the place of the distributed partitioned view, handling the routing of queries to
the server or servers that contain the required data. This technique is useful in cases in which the
data can't be horizontally partitioned in such a way that most users query most of their data
directly from the server that contains it. In other words, when a distributed partitioned view
would have to acquire a significant amount of data from another server on a regular basis,
data-dependent routing provides better performance. However, data-dependent routing requires
significantly more development effort and might accommodate back-end repartitioning less
readily.
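As a minimal sketch of the idea (the table, column names, and customer key are assumptions, not taken from the book), the middle tier typically consults a small routing table to decide which federation member should receive a given query, and then opens a connection to that server:

-- Hypothetical routing metadata the middle tier reads at startup or on demand.
CREATE TABLE dbo.CustomerRouting (
   RangeStart INT NOT NULL,
   RangeEnd   INT NOT NULL,
   ServerName SYSNAME NOT NULL  -- federation member that holds this key range
);

-- Given a customer key (a literal here for illustration), find the server that owns it;
-- the middle tier then sends the real query directly to that server.
SELECT ServerName
FROM dbo.CustomerRouting
WHERE 123456 BETWEEN RangeStart AND RangeEnd;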
Distributed partitioned views and federated servers, however, are extremely vulnerable to
hardware failure. A single hardware failure in any of the four servers that the figure shows could
render the entire application unusable because one-fourth of the application's data would be
unavailable. Even a minor failure, such as a processor power module or server power supply
failure, could thus leave all four servers, and the entire application, completely useless.
Clustering can help. By implementing the four servers in an active-active-active-active cluster
(or even as two independent active-active clusters), the failure of a single piece of hardware
won’t affect the availability of the overall application. In this case, clustering isn’t providing a
scale-out solution by itself, but it is contributing to the reliability of an existing scale-out
solution. Figure 6.7 shows a more detailed view of how the federated servers might be built into
a 4-node cluster.
In the event that a single server fails, one of the others can take over for it, acting as two virtual
SQL Server computers while the one node is offline. Figure 6.8 illustrates how the cluster
failover process ensures that all of the application’s data is available, even when a single
federation member is unavailable.
The distributed partitioned views will probably run somewhat slower when one server must carry
the workload of two, but slower performance is usually more acceptable than the entire
application simply being unavailable due to a single server failure.
Setting Up Clusters
Setting up a cluster is a fairly straightforward process. You can perform the setup completely
remotely using WS2K3’s Cluster Administrator console. Simply launch the console, and when
prompted, select the action to create a new cluster, as Figure 6.9 shows.
Next, you'll provide some basic information about the new cluster, including its domain. Cluster
nodes must belong to a Windows domain so that the nodes can communicate by using a single
user account; without a domain, it's impossible for two servers to share a user account. Figure
6.10 shows the dialog box in which you'll enter the domain information and the proposed name
of the new cluster.
Next, you'll provide the name of the first node in the cluster. This node must be an existing
server, and it must meet the prerequisites for being a cluster node. Figure 6.11 shows the dialog
box in which you enter this information.
Next, the wizard will attempt to verify the information you’ve entered and determine whether a
cluster can be created. A status dialog box, which Figure 6.12 shows, keeps you apprised of the
wizard’s status.
Figure 6.13 shows the dialog box that the wizard presents when the node you specified isn’t
suitable to be a cluster node. This dialog box is common because Windows is picky about cluster
requirements.
Some third-party products allow for the use of dynamic disks in a cluster; search the Microsoft
Knowledge Base for “server cluster dynamic disks” for updated information.
• Each node must have a static IP address, and the cluster itself must have an additional
static IP address. You will need yet another static IP address for each virtual SQL Server
instance you create.
Assuming your nodes meet the requirements, the cluster creation wizard will complete
successfully. You will run the wizard again to add the second and subsequent nodes to the
cluster, and when you’re finished, you’ll have a complete cluster containing however many
nodes you specified.
When completed, your cluster will contain basic resources. These resources are initially assigned
to the first node in the cluster, although you can transfer them as you like. The resources are
organized into resource groups, and, generally speaking, all of the resources in a resource group
are dependent upon one another and must be transferred as a group.
For example, one of the most important resources is the Cluster Disk resource, which represents
the cluster’s external disk. The quorum resource, which represents the cluster’s quorum
configuration, resides on the Cluster Disk, and so must be transferred with it. Likewise, the
cluster’s virtual name and IP address also depend on the Cluster Disk and must be transferred
with it. In an active-active cluster, you’ll have multiple cluster disks, names, and IP addresses,
which can be transferred, as a group, independent of the other groups, as the following example
illustrates.
In this active-active configuration, either node can own either resource group or one node can
own both resource groups if the other node happens to be offline. SQL Server is installed as
another set of resources, which are dependent upon cluster disks, IP addresses, network names, and so
forth.
You can create multiple disk resources, IP addresses, and other shared resources within a cluster.
Suppose, for example, that you have a cluster running both SQL Server and Exchange Server. The
cluster contains two nodes. Node A contains an instance of SQL Server and an instance of Exchange
Server. Node B contains an active instance of SQL Server.
With the right combination of disks, network names, IP addresses, and other resources, you could transfer Exchange
Server, for example, to Node B independently of Node A’s SQL Server instance. Or, if Node A fails,
both its SQL Server instance and Exchange Server instance could be transferred to Node B—which
would then be running Exchange Server and two copies of SQL Server. Such a configuration is not
recommended, but is useful to illustrate the cluster’s capabilities.
This behavior isn’t unusual. Remember that you can install multiple named instances of SQL Server
on any Windows computer, allowing that computer to respond as if it were multiple SQL Server
computers. Clustering simply coordinates the failover and service startup between multiple
computers.
For example, it’s possible—although it sounds complicated—to have a 2-node cluster with four
instances of SQL Server. Node A might normally run Instances 1 and 2; Node B would normally run
Instances 3 and 4. You would run SQL Server Setup four times to create this configuration, and if
either node failed, Windows Clustering would move all available instances to the surviving node.
In this age of heightened security, it’s worth mentioning cluster security best practices:
• Don’t expose cluster members to the Internet without the use of a firewall or other
protective measures.
• Do not add the Cluster Service account to the Domain Admins group or use a member of
that group to start the service. It isn’t necessary.
• Do not assign a normal user account to the Cluster Service. The service’s account
shouldn’t be used for interactive logons by administrators or other users.
• Applications installed in the cluster should have their own service accounts; don’t reuse
the Cluster Service’s account.
• If you have more than one cluster, use a different service account for the cluster services
in each cluster.
• Keep the quorum disk completely private to the cluster. Don’t share files or store
application data on it.
• Don’t mess with the permissions on HKEY_LOCAL_MACHINE. Loosening security on
this registry hive can give attackers an easy way to gain administrative or system
permissions on cluster members.
Case Study
I worked with an Internet e-commerce company that needed to implement a scale-out solution
for its SQL Server back end. The organization’s business database was extremely complex and
difficult to break out vertically, so they decided to go with a federation of servers and use
distributed partitioned views. They settled on two servers to start, each of which contained an
identical copy of their database schema and about one-quarter of the database’s data. A number
of distributed partitioned views were created to support their Web servers, which queried the
database servers for catalog, customer, and other information.
The company also had a data warehousing application that they used for business reporting. The
reporting application was important, but not considered mission-critical; they could live for
several days without running reports, if necessary. They already had a server, running SQL
Server 2000 (at the time), dedicated to the data warehouse.
They decided to create a 3-node cluster. Nodes A and B would each run an active instance of
SQL Server 2000 and would be members of a server federation serving the Web farm that ran
the company’s Internet site. Node C would run a standalone, non-clustered instance of SQL
Server to support the reporting application. Node C would be capable of taking over for either
Node A or Node B, although doing so would limit their ability to run reports because Node C
wouldn’t have sufficient free resources to handle the data warehouse under those circumstances.
What they built was technically an active-active-passive cluster, although the “passive” node was
still performing useful, albeit non-clustered, work.
In a worst-case scenario, any one of the three servers could handle the Web farm. The servers
were each built to run at about 70 percent capacity under normal conditions, so if two of them
failed, the Web farm would run about 40 to 50 percent below its usual performance. But, low
performance is better than a site that’s completely shut down.
One clustering best practice the company decided to forgo was using identical hardware in every
cluster node. Nodes A and B were purchased with the intent of being cluster nodes and
contained absolutely identical hardware. Node C, however, was their existing reporting server,
which ran similar but not entirely identical hardware. Windows Clustering runs perfectly well
under such circumstances, although you must be a bit more careful with maintenance and
performance estimates because you're working with non-homogeneous hardware.
The addition of the server federation boosted their overall site performance by about 20 percent;
the use of clustering ensured that a single server failure wouldn’t take the site completely offline.
The federated database—no longer the performance bottleneck of the application—highlighted
additional performance problems in the Web tier, allowing the developers to begin a project to
improve performance there as well.
Interestingly, the company’s original plan was to simply turn their existing single SQL Server
into a 2-node, active-passive cluster. They planned to buy two servers and move the database to
one of them, while using the other solely for failover purposes. I argued that this configuration
was a waste of resources and suggested that an active-active cluster acting as a server federation
would provide both fault tolerance and a potential performance benefit. Because their Web
application had been created to use views for almost all queries, it was fairly easy to change
those views to distributed partitioned views and create a distributed, partitioned database. Their
desire to add fault-tolerance to their site was met, along with a significant improvement in site
performance.
Database Mirroring
As mentioned earlier, database mirroring provides a high-availability capability similar to
Windows Clustering. Unlike Windows Clustering, however, database mirroring protects your data
in addition to providing failover in the event of a hardware failure. Table 6.1 will help you better
understand the differences and similarities between database mirroring and Windows Clustering.
• Provides redundancy for data: Windows Clustering, No; Database Mirroring, Yes.
• When hardware fails, all of the server's databases are included in the failover: Windows Clustering, Yes; Database Mirroring, No.
• Requires a special hardware configuration: Windows Clustering, Yes; Database Mirroring, No.
• One backup server can serve multiple production servers: Windows Clustering, Yes; Database Mirroring, Yes.
• Redirection to the failover server is automatic: Windows Clustering, Yes; Database Mirroring, Sometimes (requires a specified client-side network library and connection string).
Database mirroring has been available for SQL Server 2000 through third-party solutions, but
SQL Server 2005 is the first version in which Microsoft has offered this high-availability feature
right out of the box. SQL Server performs mirroring by continuously sending a database's transaction log
changes to a backup, or mirror, server. That server applies the transaction log changes, creating
an identical copy of the production database. This process is essentially like transactional
replication, except that no changes are made directly on the mirror and then replicated back to
the production server; the “replication” is strictly one-way. Mirroring does not use SQL Server’s
actual replication features, even though the end result is similar; in mirroring, log changes are
sent in blocks to the mirror, and the mirror’s goal is to get those changes committed to disk as
quickly as possible. Since the mirror isn’t being used by any other processes or users—in fact, it
can’t be, since it’s perpetually in a “recovery” state—changes can usually be committed to disk
quite rapidly.
The production copy of the database is referred to as the principal, while the copy is, of course,
called a mirror. Automatic failover is provided by an optional third server, called the witness.
The witness’ job is simply to watch the principal for a failure. When a failure occurs, the mirror
can confirm that with the witness, and take on the role of principal—usually within a few
seconds. The purpose of the witness is to help ensure that network irregularities don't leave the
mirroring partners confused about which server should be the principal: In order for either the
principal or the mirror to remain (or become) the principal, two servers have to agree to it. For
example, if the mirror and the principal are online, they can agree that the
principal is, in fact, the principal. However, if the mirror goes down, the witness can step in to
confirm that the principal is still online and can remain the principal. Similarly, if the principal
goes down, the witness and the mirror form a quorum and can agree that the mirror should take
over as principal. The witness does not contain a copy of the database being mirrored.
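To give a sense of how the roles are assigned, here is a minimal T-SQL sketch; the database name, server names, and port are assumptions, and it presumes the database has already been restored on the mirror (WITH NORECOVERY) and that database mirroring endpoints exist on all three servers:

-- On the mirror server: point the recovering copy at the principal.
ALTER DATABASE Sales SET PARTNER = 'TCP://principal.example.com:5022';

-- On the principal: point the database at the mirror and, optionally, add a witness
-- to enable automatic failover.
ALTER DATABASE Sales SET PARTNER = 'TCP://mirror.example.com:5022';
ALTER DATABASE Sales SET WITNESS = 'TCP://witness.example.com:5022';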
On the client side, mirroring works best with ADO.NET or the SQL Native Client. Both of these
client libraries recognize server-side mirroring and can automatically redirect if the principal
changes. These libraries accept a connection string which specifies a failover partner:
"Data Source=A;Failover Partner=B;Initial
Catalog=MyData;Integrated Security=True;"
All editions of SQL Server 2005 can function as a witness. However, only the Enterprise, Developer,
and Standard Editions can be a principal or mirror.
Mirroring might sound like a form of replication, and in some ways it works similarly to transactional
replication, but it is not designed to mirror changes from one production database into another
production database; the mirror copy can’t be used for production without breaking the mirror and
bringing the “hot spare” into active service. And, while replication isn’t suitable for every type of
database, mirroring can be used with any type of SQL Server database.
Summary
Although Windows Clustering doesn’t offer a standalone scale-out solution, it can be an
important part of an overall scale-out solution when properly used with SQL Server. Because
scale-out solutions often deal with mission-critical data, Windows Clustering can offer additional
fault-tolerance to your solution, helping to meet your mission-critical uptime requirement. Plus,
Windows Clustering doesn’t have to cost a lot more. Scale-out solutions generally involve
multiple servers, so adding Windows Clustering on top of everything just provides added peace
of mind. Database mirroring in SQL Server 2005 can provide similar high-availability
capabilities, without the complexity often introduced by Windows Clustering.
Chapter 7
When it comes to solutions, this chapter will focus almost entirely on SQL Server 2005, rather than
SQL Server 2000. While most of the add-on tools from Microsoft and third parties are available for
SQL Server 2000, the built-in manageability capabilities I’ll describe are almost entirely unique to SQL
Server 2005.
Monitoring
There are a few major goals of server monitoring, and it’s important to really spell them out in
order to understand how they’re impacted by a scale-out scenario:
• Health. One main goal of monitoring is to keep an eye on server—or, more accurately,
application—health. Health is differentiated from performance by the level of context it
uses. For example, monitoring CPU performance requires very little analysis; the CPU is
what it is, and if performance is sitting at 60% utilization, then that’s your performance
metric. There’s no context; the utilization is simply 60%. Health, however, places that
number into context, and answers the question, “is this server (or application) healthy or
not?” In other words, is 60% processor utilization—along with other performance
metrics—good or bad?
• Availability. One particularly important goal of monitoring is to measure the availability
of a server (or application), and to notify the appropriate people if the server (or
application) becomes unavailable.
• Trending. Another important goal of monitoring is to develop trend reports, which help
predict future workload requirements based on past workload and observed growth.
A scale-out solution makes these goals more difficult to achieve. For example, if you have a
federated database consisting of three SQL Server computers, the health of the application is
governed by the combined health of all three servers. You can’t simply take an average of
performance metrics; one server consistently running at 100% utilization, for example, will drag
down the application performance even if the other two servers are only at 40% utilization. You
essentially need a solution that can monitor metrics of the application itself—total response time
to key queries, for example—rather than individual servers. However, you still do need to
monitor the health of individual servers, because some issues—such as poor disk throughput or
high memory utilization—can be an indicator of server-specific issues that you can troubleshoot
and address appropriately.
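One simple way to approximate that kind of application-level metric is to time a representative query and record the result for trending. The following T-SQL sketch assumes hypothetical table names (dbo.OrderSummary and dbo.AppHealthLog) purely for illustration:

-- Time a key application query and log the elapsed milliseconds.
DECLARE @start DATETIME;
DECLARE @elapsed_ms INT;
SET @start = GETDATE();

SELECT COUNT(*) FROM dbo.OrderSummary WHERE OrderDate >= '20050101';

SET @elapsed_ms = DATEDIFF(ms, @start, GETDATE());
INSERT INTO dbo.AppHealthLog (ProbeTime, MetricName, ValueMs)
VALUES (GETDATE(), 'OrderSummary key query', @elapsed_ms);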
Maintenance
Maintenance is one of the most complex and difficult areas of a scale-out solution. Maintenance
consists of ongoing tasks designed to keep servers (and the application) healthy, secure, and
available, such as:
• Applying hotfixes and patches
• Applying service packs
• Scanning for viruses and other malware
• Inventorying hardware and software
• Defragmenting hard disks or databases
• Maintaining security settings
• Rebuilding indexes
• Updating database statistics used by the SQL Server query optimizer
I’m not including hardware-level maintenance, which typically involves shutting a server down, in this
list because that type of maintenance is always conducted per-server. In other words, if you need to
upgrade the memory in four SQL Server computers, it’s going to require physical service on all four
servers. Software-level maintenance, however (such as the items in the above list), can often be
conducted at an application level by using tools that help to automatically apply the maintenance task
across all of the application’s servers.
I categorize these tasks into two broad areas: operating system-level and SQL Server-level.
Operating system-level maintenance involves taking care of Windows itself, and the two areas
sometimes parallel one another. For example, applying patches is something you'll do for both
Windows and SQL Server, whereas rebuilding indexes is a SQL Server-specific task and
inventorying hardware typically applies only to Windows.
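As a quick illustration of the SQL Server-level tasks, the following sketch rebuilds the indexes on one table and refreshes statistics; the table name is hypothetical, and in SQL Server 2005 the ALTER INDEX syntax replaces the older DBCC DBREINDEX approach:

-- Rebuild every index on a table and refresh its statistics (table name is illustrative).
ALTER INDEX ALL ON dbo.Orders REBUILD;
UPDATE STATISTICS dbo.Orders;

-- Or refresh out-of-date statistics across the entire database.
EXEC sp_updatestats;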
Some of these maintenance tasks—such as patch management or defragmentation—are difficult
enough on a single server. However, the need for consistency across all the servers in an
application makes these tasks doubly difficult in a scale-out scenario. For example, if you need
to make changes to security settings, it’s absolutely essential that the same change be made, at
nearly the same time, to all of the servers in the solution. Otherwise, users could experience
inconsistent results.
Many of these maintenance tasks are time-consuming, as well. For example, keeping track of
index status—a monitoring task—and rebuilding indexes when necessary—a maintenance
task—requires a lot of continual time and attention from valuable administrative resources. In
fact, one of the major objections to any solution which entails adding more servers to the
environment—such as a scale-out solution—is the amount of additional administrative overhead
the new server will require simply due to its existence. In order for scale-out solutions to be
feasible, they must not only function, but they must also create as little additional administrative
overhead as possible.
The DSI Solution
The biggest problem in any scale-out solution is the concept of managing a group of servers as a unit,
rather than managing individual servers. For decades, IT management has been performed more or less
at the server level; solutions that manage groups of servers as a unit are rare. Microsoft Application
Center 2000 was one such solution, allowing you to make a change to one Web server and automatically
replicating that change to every server in a Web farm. However, Application Center 2000 was specific to
Web servers.
Microsoft’s long-term solution to the problem is their Dynamic Systems Initiative, or DSI. A core part of
DSI is the System Definition Format, or SDF, an XML format that describes a configuration. In its fully-
realized implementation (which is still years away), DSI will help better manage application—rather than
server—configurations from initial provisioning throughout the application lifecycle.
It’s supposed to work something like this: When you decide you need a new SQL Server application (for
example), you’ll use a configuration tool to create your desired application configuration. This may include
installing and configuring SQL Server, IIS, and a number of other components. You won’t actually perform
these tasks; you’ll just specify them. The result is an SDF file describing exactly what a server in your
application should look like. You’d feed that SDF file to a provisioning tool (perhaps a successor to the
current Windows Automated Deployment System), which would actually install and configure the
necessary software on a new server for you.
DSI would then ensure that your intended configuration remained in place. For example, if another
administrator modified the server’s local firewall settings, DSI might reconfigure the server—
automatically—back to the settings required by the SDF file. If you need to make an approved change to
the server’s configuration, you’d make the change in the SDF file, using some sort of configuration tool.
The change to the file would trigger DSI—which would be implemented throughout Windows, IIS, SQL
Server, and any other products you’re using—to physically reconfigure the server to match the revised
file. In essence, the SDF file serves as your configuration standard, and DSI works to automatically
configure servers to match that standard at all times.
You can see how some elements of Microsoft’s current product line—notably Systems Management
Server (SMS) and Operations Manager (MOM)—might evolve over time to encompass some of DSI’s
functionality. And you can also see how DSI would make managing multiple servers easier: You simply
use DSI to initially provision however many servers you need according to a single standard. Any
changes that need to be made are made once, to the standard, and DSI implements those changes on
any servers which are set to follow that standard. It’s true policy-based management rather than server-
based management.
As I mentioned, the full vision of DSI won’t be realized for years to come, but understanding that DSI is
the eventual goal can help you make smarter decisions about management techniques, technologies,
and practices now.
Other types of maintenance tasks present unique problems in a scale-out solution. For
example, backing up servers and databases is a common maintenance task that's made more
complicated in a scale-out solution. You're faced with two problems: first, the task of backing
up increasingly large databases, which is difficult enough in and of itself; and second, the task of
ensuring that your entire application—no matter how many servers or databases are involved—is
backed up as a unit (to the degree possible), so that any major recovery effort can bring the
entire application back online in a consistent, usable state.
Management
Management is the process of making periodic changes to your servers or applications. It’s
closely related to maintenance; maintenance, however, generally consists of the management
tasks that you can always expect to occur in some quantity over any given period of time. Patch
management, for example, is a true maintenance task: You know it’s going to be necessary. I
distinguish true management tasks as those which aren’t always predictable, but which occur in
response to some business condition. Reconfiguring servers to meet a new business need, for
example, is a one-time task that isn’t performed on a regular basis.
Management tasks face many of the same challenges as maintenance tasks: Changes need to be
applied consistently, and more or less simultaneously, across all of the servers in the solution. If
anything, management tasks tend to involve more sweeping, major changes, meaning that
mistakes in performing these tasks can have more serious consequences. Unfortunately, today’s
technologies tend to still focus on server-based management, making application-based
management difficult.
SQL Server 2005's dynamic management views (DMVs) expose detailed information about the
server's internal state, including:
• Memory allocations
• Cache entries
• Threads
• Wait statistics
• Replication articles
• Transaction locks
• Active transactions
And so forth. In terms of single-server management, DMVs make a wealth of performance and
health data available to an administrator. However, they don’t do anything to provide
consolidated data across a group of servers participating in a scale-out solution. Still, DMVs have
a place in scale-out solutions: Consider how difficult it would be to obtain the information a DMV
provides for a single server without using the DMV; compiling that information for multiple
servers by hand would be harder still. Although gathering index statistics for multiple servers
requires you to query the DMV on each server, that's a far better approach than trying to
assemble the same information without the DMVs.
You could write a stored procedure that queried information from multiple servers’ DMVs to present
the information in a somewhat consolidated query result.
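For example, a rough sketch of such a consolidation (the linked server name SERVER_B and the database name are assumptions) might union the local index-fragmentation DMV with the same DMV queried on a remote federation member:

-- Local fragmentation data, labeled with a server name for the consolidated result.
SELECT 'SERVER_A' AS ServerName, object_id, index_id, avg_fragmentation_in_percent
FROM sys.dm_db_index_physical_stats(DB_ID('Sales'), NULL, NULL, NULL, 'LIMITED')

UNION ALL

-- The same DMV on a remote federation member, reached through a linked server.
SELECT 'SERVER_B', object_id, index_id, avg_fragmentation_in_percent
FROM OPENQUERY(SERVER_B,
   'SELECT object_id, index_id, avg_fragmentation_in_percent
    FROM sys.dm_db_index_physical_stats(DB_ID(''Sales''), NULL, NULL, NULL, ''LIMITED'')');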
From a more traditional performance perspective, SQL Server provides performance objects and
counters for Windows’ built-in Performance Monitor console. As Figure 7.2 shows, dozens of
objects and hundreds of counters are available, allowing you to measure even the most detailed
portions of a SQL Server computer’s performance.
Again, however, this is just per-server performance monitoring. For somewhat higher-level
monitoring, you’ll need to turn to other products.
MOM doesn’t have any built-in capability for monitoring specific applications. However, you
can build your own “management packs,” of a sort, using MOM’s capabilities. This would allow
you to, for example, configure MOM to monitor an entire application consisting of multiple
servers, and to test response times to (for example) specific queries or other operations. You
would define the thresholds of what was considered healthy or not, allowing MOM to monitor
your application—rather than just individual servers—and provide feedback about the state of
the application’s health.
For more information about MOM, visit www.microsoft.com/mom. Note that, as of this writing,
management packs specific to SQL Server 2005 have not yet been made available to the public.
Third-Party Solutions
The third-party software market is a rich source of solutions for monitoring SQL Server
applications, particularly in a scale-out environment.
These solutions are simply examples; many manufacturers offer similar solutions in the same
categories.
The product also provides resource consumption information on a per-table basis, which can help
you identify tables that need to be distributed, or kept together, in a scale-out scenario.
Applications like MOM and AppManager can’t typically measure direct client application response
times. However, you can get a good measurement of overall application performance by making
critical queries accessible through Web pages or Web services. MOM, AppManager, and most other
performance applications can measure Web request response time, and that response time would
include (and in fact would primarily consist of) the query response time. While raw query response
time isn't quite the same thing as overall application performance, measuring the response times for
queries that really impact your application (such as queries based on a distributed partitioned view or
other scale-out element) provides a good indicator of application performance.
The key with WSUS is the client-side Automatic Updates client software, which can be
configured (again, via Group Policy) to look for updates on the local WSUS server rather than
the Microsoft Update Web site. Automatic Updates can be configured to look for updates
automatically, on a regular basis, and to automatically download and install updates. This
capability helps to remove patch management as an active administrative task and instead makes
it passive; adding multiple servers in a scale-out solution no longer requires the additional
overhead of managing patches on additional servers.
A large number of third-party solutions also exist to help make patch management easier. In most
cases, however, WSUS is all you need, and it’s completely free. Unlike prior versions (SUS), WSUS
can provide updates for most of Microsoft’s business products, including SQL Server.
However, the current version of SMS is designed primarily for inventorying and software
deployment; it isn’t designed to push configuration changes to managed servers, a capability
that’s sorely needed in scale-out scenarios to help maintain consistent configurations across
multiple servers. For example, SMS can’t help manage password changes for SQL Server
service accounts, an absolutely crucial capability in managing multiple servers. Fortunately, a
number of third-party solutions exist to help with various critical maintenance tasks.
Third-Party Solutions
Third-party software developers can often provide point solutions that help solve specific
problems, particularly in a scale-out environment where you’re managing multiple servers and
trying to achieve a high degree of configuration consistency.
These solutions are simply examples; many manufacturers offer similar solutions in the same
categories.
Diskeeper
Disk fragmentation affects SQL Server performance as much as it affects any other application.
SQL Server actually deals with two types of fragmentation: physical and in-database. SQL Server
handles in-database fragmentation on its own, reorganizing pages to keep data contiguous.
Periodically compacting databases can help maintain their performance, especially in online
transaction processing (OLTP) databases with frequent row additions and deletions. Physical
fragmentation, however, refers to the database file itself becoming non-contiguous across the
server's storage devices. Software like Diskeeper can help reorganize these files, and can be
centrally managed to help reduce fragmentation on the servers in a scale-out solution. As
Figure 7.11 shows, Diskeeper can analyze fragmentation and tell you how much slower disk
access is as a result (in the example shown, disk access is almost 25% slower).
Diskeeper is smart enough not to try and defragment open files, which presents a special
challenge for database files, since they’re always open. You will need to close the databases in
your scale-out solution in order to properly conduct a disk-level defragmentation. However, you
can also take steps in advance to reduce or even eliminate disk-based defragmentation of your
database files:
• Prior to creating your databases, thoroughly defragment the server's disks.
• Create the database with a large enough initial size to handle near-term growth, as the sketch
after this list shows. Doing so ensures that the database file occupies contiguous disk space and
contains enough empty room to support database expansion.
• Once created in contiguous disk space, the database file cannot become fragmented (at the
disk level, at least) until the database fills and needs to expand. At that point, you should again
defragment the server's disks to provide sufficient contiguous free space, and expand the
database manually to a size that will accommodate all near-term growth.
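The sketch below shows what that pre-sizing might look like at creation time; the database name, file paths, and sizes are illustrative assumptions and should be adjusted to your own growth estimates:

-- Pre-size the data and log files so they occupy contiguous disk space and
-- won't need to autogrow (and fragment) in the near term.
CREATE DATABASE Sales
ON PRIMARY
   (NAME = Sales_data, FILENAME = 'D:\SQLData\Sales.mdf', SIZE = 20GB, FILEGROWTH = 5GB)
LOG ON
   (NAME = Sales_log, FILENAME = 'E:\SQLLogs\Sales_log.ldf', SIZE = 5GB, FILEGROWTH = 1GB);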
You can still periodically defragment server disks while SQL Server is running, provided your
solution knows to leave SQL Server's open database files alone (in other words, to treat them as
unmovable, in much the same way that the Windows pagefile is usually treated). Diskeeper and
similar solutions can be set to defragment automatically on a regular basis, helping to make this
important maintenance task passive rather than requiring your active participation.
Multi-target jobs are available in SQL Server 2000 as well as in SQL Server 2005.
For more complex operations, you can create your own management tools and scripts using SQL
Management Objects (SMO), a completely managed application programming interface upon
which SQL Server Management Studio itself is built. SMO is a programmatic way of controlling SQL
Server, and it’s as easy to write scripts or tools that target multiple servers as it is to target a
single server.
SMO replaces SQL Distributed Management Objects (SQL-DMO) from SQL Server 2000, and is
designed primarily for use with the .NET Framework.
Using SMO isn’t for the faint of heart, and a complete discussion of its capabilities is beyond the
scope of this book; consult the SQL Server 2005 Books Online, or Microsoft’s MSDN Library,
for a complete reference to SMO as well as examples. Briefly, however, SMO is a set of
managed classes that are accessible to the .NET Framework (VB.NET, for example) languages,
and which expose management functionality for SQL Server 2005. For example, the following
VB.NET snippet uses SMO to initiate a backup of the AdventureWorks database, backing it up
to a file named C:\SMOTest.bak:
Imports Microsoft.SqlServer.Management.Smo
Module SMOTest
Sub Main()
Dim svr As Server = New Server()
Dim bkp As Backup = New Backup()
bkp.Action = BackupActionType.Database
bkp.Database = "AdventureWorks"
bkp.DeviceType = DeviceType.File
bkp.Devices.Add("c:\SMOTest.bak")
bkp.SqlBackup(svr)
End Sub
End Module
For the original text of this example, as well as a C# example and a longer discussion on using SMO,
visit http://www.sqldbatips.com/showarticle.asp?ID=37.
SMO isn’t accessible exclusively from .NET; it’s available to Component Object Model (COM)
based languages, as well, through .NET’s COM interoperability interfaces. For example, here’s a
VBScript version of the previous example:
Const BackupActionType_Database = 0
Const DeviceType_File = 2

' Create the SMO objects through COM interop; these ProgIDs assume SMO's default
' COM registration (Namespace.ClassName) and are shown here as an example.
Set svr = CreateObject("Microsoft.SqlServer.Management.Smo.Server")
Set bkp = CreateObject("Microsoft.SqlServer.Management.Smo.Backup")

bkp.Action = BackupActionType_Database
bkp.Database = "AdventureWorks"
bkp.DeviceType = DeviceType_File
bkp.Devices.Add("c:\SMOTest.bak")
bkp.SqlBackup(svr)
Blade Computing
One of the biggest problems with a scale-out solution is hardware maintenance and the sheer
space required by large rack mount servers. Blade servers, such as the Dell PowerEdge 1855 or
the HP ProLiant BL series, can help with these problems.
Blade computing begins with a chassis, which provides power, cooling, keyboard-mouse-
monitor connectivity, and other shared services. The actual blades are essentially super-
motherboards, each containing all the core elements of a server: processor, memory, and typically
some form of local storage. The blades fit within the chassis and function as entirely
independent servers: Each blade has its own network connectivity, its own operating system
installation, and so forth. However, because each server—blade, that is—lacks an independent
chassis, power supply, cooling system, and so forth, it’s much smaller. In all, blade computing
can usually fit about 50% more computing into the same space as traditional rack mount servers.
Blade computing is often bundled with centralized systems management software, which allows
single-seat administration of the entire chassis (monitoring for power, cooling, and other
infrastructure services), as well as software for managing the blades themselves (that software
often provides agents for multiple operating systems), such as providing remote control,
installation, monitoring, and overall management.
Just because blades are smaller than traditional servers doesn’t mean they’re less powerful. In
fact, blade servers are often equipped with the same high-end processors you might find in any
other server suitable for a scale-out scenario: Fast x64 processors (such as the Intel Xeon series
or the AMD Opteron series), 32GB of RAM (depending on the blade model), a local 15,000RPM
SCSI hard drive (often attached directly to the blade), and other high-end features. Daughter
cards—the blade equivalent of a PCI expansion card—provide Fibre Channel connectivity,
gigabit Ethernet, and other advanced functions. While massive scale-up is not possible within a
blade—there are no 64-way blade computers, for example—the very purpose of scale-out is to
spread workload across multiple servers, and blade computing makes that easier to do while
helping to reduce overall administrative overhead as well as data center real estate.
Storage Solutions
I mentioned before that storage solutions can provide answers to some major maintenance and
management problems in scale-out solutions, particularly data backup. And many scale-out
solutions do rely heavily on high-end storage solutions to make tasks like data backup easier, and
to make it possible to get a single, consistent backup of the entire application’s data set.
Figure 7.13 illustrates the basic concept. A single external storage system—likely a SAN—
provides partitions for three SQL Server computers. The storage system uses its own
functionality to mirror all three partitions to a fourth area, which isn’t directly accessible to any
of the servers. This feature is fairly common in high-end storage systems, and can be used (for
example) as a form of fault tolerance (whether the three server-accessible partitions are mirrored
to one large partition or each to their own individual mirror is an implementation detail that
differs depending on the storage solution in use and the goals of the mirroring). In this example,
the mirror can be periodically broken, making it a point-in-time snapshot of the application’s
overall data store. The servers continue using their own accessible partitions, but the mirror is
used as the source for a backup operation, which can take however long it needs to write the data
to tape, magneto-optical storage, or whatever medium is appropriate. Once the backup operation
is complete, the mirror is restored, and the storage system brings it up-to-date with the three
servers’ partitions.
If both backup and fault tolerance capabilities are desired, then two mirrors might be used: Mirror set
1 would provide fault tolerance for the servers’ partitions, and would never be broken except in case
of a disaster; mirror 2 would be periodically broken and then restored, and would be used by the
backup solution.
This technique is one way in which creative use of a high-end storage system can help solve
otherwise tricky management problems in a scale-out solution. By leveraging the storage
solution’s own capabilities for mirroring data, both fault tolerance and a large-scale backup
solution can be put into place. SQL Server’s own backup capabilities wouldn’t be required, since
the backup would be taking place entirely behind the scenes, without impacting SQL Server in
any way.
Summary
In this chapter, I’ve covered some of the biggest challenges facing administrators in a scale-out
solution, including challenges related to maintenance, management, and monitoring. Since scale-
out solutions by definition involve multiple servers, and since much of the world’s SQL Server
management practices are single-server oriented, you do need to exercise some creativity in
researching and selection techniques and solutions to reduce the overhead of managing multiple
servers. It’s entirely possible, though, as I’ve pointed out in this chapter, to minimize the
additional administrative overhead imposed by having multiple SQL Server computers in a
solution. By using SQL Server’s native features, commercial solutions, and by rolling your own
solutions when necessary, scale-out administration can be made nearly as straightforward as
single-server administration.
Chapter 8
Storage Overview
Storage is, of course, the basis for most server applications. Ideally, we would be able to store all
data in high-speed memory; but the cost of RAM is simply too high. Disks are much cheaper,
albeit orders of magnitude slower—RAM response times are measured in nanoseconds (ns) and
disk response times are measured in milliseconds (ms). Disks are mechanical devices with
moving parts, so they are also subject to more frequent failure than solid-state RAM, meaning
loss of data is also a strong concern. Today’s storage subsystems attempt to strike a balance
between fault tolerance and performance.
Performance
Performance is based on the idea of reading and writing data as quickly as possible to serve the
users you need to support. As physical devices, storage subsystems have a number of elements
that can impede performance:
• The disk, which spins at a fixed speed and cannot transfer data beyond that speed.
• The disk heads, which can be in only one place at a time and must waste milliseconds
seeking the data that is desired. (Seeking is the process of moving the heads to the
appropriate location of the disk so that the data spinning by underneath the heads can be
magnetically read or modified.)
• Bus speed—the bus carries data between the disk and the controller hardware to which
the disk is attached. Another bus connects the controller to the server, allowing Windows
to communicate with the controller and move data to and from the disks.
• Device drivers are the actual software that communicates between Windows and the
controller hardware. Poorly written device drivers can impede performance and are very
difficult to pin down as a bottleneck.
Other elements—such as the fault tolerance scheme in use—can also impede performance by
creating additional overhead within the storage subsystem. Optimal performance can be achieved
in part by using better-performing components: faster disks, bigger busses, and less-obstructive
fault-tolerance schemes. However, you can quickly reach the point at which you have acquired
the fastest drives and widest data transfer bus and have implemented the fastest fault tolerance
scheme available.
In such a case, better performance is possible through parallelism. For example, adding more
disks to the system allows the server to save data to an idle drive when others are busy; adding
controllers provides additional, parallel paths for data to come onto and off of disks. Many of the
industry’s top-performing storage solutions, in fact, rely heavily on sheer number of physical
drives, utilizing multiple controllers per server, dozens of drives in cooperative arrays, and so on.
Not all operations are disk-intensive; however, SQL Server is heavily dependent on good input/output
performance. Thus, although not critical for every application, a good disk subsystem is an important
asset.
First, SQL Server must read the root page of the index, then decide which page to read next.
Index pages are read one at a time, creating a large number of storage operations, although each
operation is relatively small—less than 10KB—in size. A typical index read in a relatively
efficient index might require 100 read operations, which with a well-performing storage
subsystem might take anywhere from half a second to a full second. In a slow subsystem,
however, speeds can be as slow as a half second per read operation, meaning the same index
search could take nearly a minute to execute. This scenario illustrates the level of performance
difference a storage subsystem can have on SQL Server.
Less-efficient table scans can exacerbate the problems caused by a poor storage subsystem.
Imagine a table scan that is going through a million rows of data—not uncommon in a large
database—at the slow pace of 50ms per read. That could require SQL Server to spend an
enormous amount of time—half an hour or more—to complete the table scan. You can test this
sort of performance easily by installing SQL Server and a large database on a notebook
computer. Notebooks typically have incredibly poor throughput on the hard drive—storage
performance isn’t really what notebooks are designed for. Fire off a query that table scans a
million rows, and you’ll see how bad things can get. Now image that a couple of thousand users
are trying to do the same thing all at once, and you’ll get an idea of how poor application
performance would be as a result of an inefficient SQL Server storage system. As a result,
database administrators and developers try to minimize table scans by providing SQL Server
with appropriate indexes to use instead: Even on a speedy disk subsystem, table scans can rack
up a lot of time.
So what constitutes good performance? Take a 15,000rpm disk drive, which is pretty standard
equipment for high-end servers. The drive has a fixed performance level because it can only pull
data off the drive as quickly as the drive’s platters are spinning. A high-end drive might take 6ms
to move the drive heads from one location on the platter to another, on average, which creates an
approximate maximum throughput of about 160 read operations per second. The closer you get
to this maximum, the poorer overall performance will be, so you should aim to stay within about
75 percent of the maximum, or about 120 operations per second.
Consider again an index read example with 100 operations; you can see that you’re only going to
get about one and a quarter index reads per second while staying within the safe range. This
example illustrates how easy it is to reach the performance capacity for a single drive and why
it’s so important to use arrays of drives that work together rather than storing data on a single
drive.
SQL Server 2005 introduces built-in partitioning, which allows you to more easily spread a database
across several disks while still managing the database as a single set of objects. However, in large
databases simply spreading the database across two or more single disks still won’t provide a
significant performance increase. Instead, you’ll find yourself spreading the database across multiple
disk arrays, or, more commonly, using large arrays (such as in Storage Area Networks—SANs, which
I’ll discuss in a moment) and simply ignoring SQL Server’s partitioning capabilities.
RAID
RAID is the cornerstone of most high-performance storage solutions. The idea behind RAID is
to utilize multiple disks in concert to improve both redundancy and performance. RAID defines
several different levels, which each provide a tradeoff between redundancy and performance. In
a production environment, you’re likely to encounter RAID 1, RAID 5, and RAID 10.
RAID 0
RAID 0 uses a technique called disk striping, which Figure 8.1 illustrates.
As data is streamed to the controller, it is divided more or less evenly between all the drives in
the array (two are required). The idea is to get more drives involved in handling data to increase
overall throughput. When data is sent to Drive 0, the controller would normally have to wait until
Drive 0 accepted that data and finished writing it. In RAID 0, the controller can move on to
Drive 1 for the next chunk of data. RAID 0 improves both read and write speeds but has an
important tradeoff: if a single drive fails, the entire array is pretty much useless because no one
drive contains any entire file or folder. Thus, although RAID 0 improves performance, it doesn’t
improve redundancy. This concept of disk striping is an important one that comes up again in
other RAID levels.
The performance of RAID 0 improves as you add drives. In a 2-drive system, for example, the
controller might submit data to both Drive 0 and Drive 1 and still need to wait a few milliseconds
for Drive 0 to catch up and be ready for more data. With a 4-drive array, Drive 0 is much more
likely to be waiting on the controller, instead. With 8 drives, the odds improve even more,
practically ensuring that Drive 0 will be ready and waiting when the controller gets back to it. It
is possible to reach an upper limit: many lower-end controllers, for example, reach their own
maximum throughput with 7 to 10 drives attached, meaning you’re not going to see a
performance improvement by attaching any drives beyond that point.
RAID 1
RAID 1 is also called mirroring, and is illustrated in Figure 8.2.
In this level of RAID, the controller writes data to and reads data from a single disk. All written
data is also written—or mirrored—to a second disk. Should the first disk fail, the second is
available essentially as an online, up-to-date backup. Most array controllers will allow the server
to function normally off of the mirror until the failed disk is replaced.
RAID 1 is almost the opposite of RAID 0, in that it provides no performance advantage, but does
provide redundancy; the failure of a single disk won’t harm the server’s operations. RAID 1 can,
in fact, reduce performance slightly in a write-heavy application such as SQL Server because all
data is written twice. Most high-end RAID controllers, however, can minimize this performance
impact through creative management of the data bus used to communicate between the controller
and the disks.
Increasing the drives in a RAID 1 array does improve fault tolerance. For example, 3 drives in an
array would survive the failure of any 2 drives. That’s pretty expensive insurance, however, and
not at all common in production environments. In addition, more drives in a RAID 1 array can
reduce performance due to the additional write operations required.
RAID 4
RAID 4 takes RAID 0 one step further by adding parity to disk striping. Figure 8.3 shows how
RAID 4 works.
RAID 4 uses the same disk striping mechanism that RAID 0 uses. However, RAID 4 also utilizes
an additional disk to store parity information. If a single data disk fails, the parity information,
combined with the data on the remaining disks, can be used to reconstruct the data from the
missing disk. This technique allows the array to continue operating if a single disk fails.
However, because a single disk is used to store all parity information, incoming write operations
can’t be easily interleaved. In other words, the parity disk creates a bottleneck that, in write-
intensive applications, can partially or totally defeat the performance improvement offered by
disk striping.
RAID 4 requires a minimum of 3 drives: 2 for striping and 1 for parity. Like most other RAID levels, all
drives in the array must be of equal size.
RAID 5
RAID 5 offers a better performance-and-redundancy compromise than RAID 4 offers. Rather
than using a single drive for parity information, RAID 5 rotates the parity information across the
drives in the array, essentially striping the parity information along with the actual data being
written. So for the first chunk of data sent to the controller, the disk operations look like those
that Figure 8.3 shows. Figure 8.4 shows how subsequent write operations rotate the drive
containing the parity information.
The net effect of RAID 5 is that a single drive can fail and the array can remain functional by
using the parity information to reconstruct the missing data. RAID 5 can handle interleaved write
operations, improving its performance over RAID 4. However, because the parity information
still represents an extra write operation, write performance is still slightly slower than read
performance in a RAID 5 array. In practice, RAID 5 offers perhaps the best tradeoff between
performance, cost, and redundancy for most server operations.
Although RAID 5 requires a minimum of 3 drives to operate, it works better with more drives.
Common RAID 5 arrays will have 7 or 8 drives in total, and can have many more, depending on the
capabilities of your hardware.
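Under the hood, the parity that RAID 4 and RAID 5 write is essentially a bytewise XOR across the data blocks in a stripe, which is what makes reconstruction possible when a drive dies. The C# sketch below uses tiny, made-up blocks purely to illustrate the idea; real controllers do this in hardware on full stripe units.

using System;

class ParitySketch
{
    static void Main()
    {
        // Three data blocks in one stripe (4-byte blocks, for illustration only).
        byte[] d0 = { 0x10, 0x22, 0x34, 0x46 };
        byte[] d1 = { 0xA1, 0xB2, 0xC3, 0xD4 };
        byte[] d2 = { 0x0F, 0x1E, 0x2D, 0x3C };

        // The controller writes the XOR of the data blocks as the parity block.
        byte[] parity = Xor(Xor(d0, d1), d2);

        // Simulate losing the drive that held d1, then rebuild it from the survivors plus parity.
        byte[] rebuilt = Xor(Xor(d0, d2), parity);
        Console.WriteLine(BitConverter.ToString(rebuilt));   // A1-B2-C3-D4, matching d1
    }

    static byte[] Xor(byte[] a, byte[] b)
    {
        byte[] result = new byte[a.Length];
        for (int i = 0; i < a.Length; i++)
            result[i] = (byte)(a[i] ^ b[i]);
        return result;
    }
}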
RAID 10
RAID 10 is a combination of RAID 1 and RAID 0—hence the name. Figure 8.5 shows how
RAID 10 works.
A RAID 10 array is essentially two parallel arrays. The first array is a RAID 0 array, which uses
disk striping—without parity—for maximum performance. The second array is a mirror of the
first, providing the high level of fault tolerance offered by RAID 1. Parity information isn’t
required because each drive in the RAID 0 array has its own dedicated mirror; this type of array
can theoretically survive the failure of every single drive in the main RAID 0 array, because each
is backed up by a dedicated mirror drive.
RAID 10 also underlies many high-performance backup systems designed specifically for SQL
Server. For example, although SQL Server provides the ability to back up databases
while they are in use, doing so reduces server performance. An integrated storage and backup
solution can implement a third array as a second mirror set in a RAID 10 configuration. When a
backup is required, the third array is detached (or the mirror is broken) and becomes a static
snapshot of the data in the array. This third array can be backed up at leisure, then reattached (the
mirror repaired) when the backup is complete.
RAID 10 provides the best possible performance and redundancy for SQL Server applications,
but it does so at a hefty price: You must buy twice as many drives as you need to store your
databases. Still, in a high-end database, the price is usually well worth the benefit.
Various levels of RAID offer price tradeoffs. In RAID 1, for example, you "lose" the space of an entire
drive to the fault-tolerance scheme. With RAID 4, the parity drive is "lost" to you, because it is
dedicated to parity data. RAID 5 spreads out parity information, but you still "lose" the space equal to
one entire drive. With RAID 10, you "lose" half of the drive space you bought—in exchange for better
performance and fault tolerance.
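These space tradeoffs reduce to simple arithmetic. The short C# sketch below compares usable capacity for a hypothetical array; the drive count and per-drive size are placeholders, so substitute your own numbers.

using System;

class RaidCapacitySketch
{
    static void Main()
    {
        int drives = 8;           // example array size
        int driveSizeGB = 146;    // example per-drive capacity

        Console.WriteLine("RAID 0  : " + (drives * driveSizeGB) + "GB usable, no redundancy");
        Console.WriteLine("RAID 1  : " + driveSizeGB + "GB usable from a 2-drive mirror");
        Console.WriteLine("RAID 4/5: " + ((drives - 1) * driveSizeGB) + "GB usable; one drive's worth goes to parity");
        Console.WriteLine("RAID 10 : " + (drives * driveSizeGB / 2) + "GB usable; every drive has a mirror");
    }
}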
Performance can vary significantly between different RAID 10 implementations. For example,
some implementations use a single controller that issues write commands individually to each
drive in the arrays. So for each write operation, two write commands are issued: one to the main
RAID 0 array and another to each drive’s mirror. More advanced implementations provide better
performance by eliminating the extra write command. In this implementation, the two arrays
function independently; the second array responds only to write requests and simply watches for
commands sent to the first array, then carries out those commands in parallel. Because most SQL
Server applications tend to be write-heavy, this latter, more advanced type of array is preferred.
It’s actually uncommon to see RAID 5 in use in actual, production high-end databases. RAID 10 is a
much more common solution, as it provides both excellent performance and fault tolerance.
Another technique is to use dual RAID controller cards. Windows issues write commands, which
the cards’ device driver accepts and passes on to the controllers in parallel. The controllers then
direct their attached drives independently, improving performance because write commands
don’t need to be duplicated. In some implementations, read commands are also carried out in
parallel; the driver accepts read data from whichever array responds first, ensuring that the
failure of even an entire array creates no additional lag time for the application. The device
drivers can also implement a level of error correction by comparing the read results from both
arrays to ensure that they’re identical. This error-correcting can create a performance hit for read
operations, because the controllers’ driver must wait for both arrays to respond before delivering
data to Windows.
The lesson to be learned—particularly from RAID 10, which offers the broadest number of
implementation variations—is that you need to study and understand how various
implementations work, then select the one that best fits your business needs. When combined
with options such as SANs (which I'll cover later in this chapter), the possible implementations
can become perplexing, so don’t hesitate to ask vendors and manufacturers to explain exactly
how their solutions work.
Software RAID
Windows includes its own software-based RAID capabilities. These capabilities allow you to use
standard IDE or SCSI drives attached on a non-RAID controller card and still have the fault
tolerance benefits of a RAID 1 or RAID 5 array or the drive flexibility of a RAID 0 array. Figure
8.6 illustrates how Windows logically implements software RAID.
As this figure illustrates, Windows implements the RAID logic in software. As data is written,
Windows decides to which drives the data will be written, and sends each data stream to the
controller card. The controller writes the data to the specified disk.
Because Windows itself is performing the RAID logic, it’s fairly inefficient. Windows is a
software application running on the server’s processors; the RAID logic must pass through
several instruction sets before finally having an effect on the underlying server hardware. In
terms of performance, Windows RAID is about the worst single thing you can do to any SQL
Server computer, particularly one participating in a scale-out solution that is supposed to offer
improved performance.
Hardware RAID
In a hardware RAID solution, the RAID logic is moved from Windows to a dedicated processor
on a RAID-capable controller card. This card presents the array to Windows as a single physical
disk and handles the task of splitting the incoming data stream across the drives in the array.
Because the RAID logic is executed directly in hardware—and dedicated hardware, at that—its
performance is much, much faster than software RAID. Figure 8.7 illustrates the logical flow of
data.
Notice that, in this case, the array controller is a card installed directly in the server. This
scenario is common for a traditional SCSI array; Network Attached Storage (NAS) and SANs—
which I’ll discuss momentarily—work a bit differently than a traditional SCSI array.
SCSI Arrays
SCSI arrays are perhaps the most common type of arrays in most data centers. These can take the
physical form of an external array box, filled with drives and connected by a copper-based SCSI
cable. Many servers also offer internal SCSI arrays that can hold anywhere from 2 to 12 drives.
SCSI arrays hold all the basic concepts for arrays. For example, Figure 8.8 shows how arrays are
created from physical disks—in this case, 6 of them—then logically partitioned by the array
controller. Each partition is presented to Windows as a single physical device, which Windows
can then format and assign a drive letter to.
In this example, the 6 physical drives might be configured as a single RAID 5 array, which is
then divided into logical partitions to provide different areas of space for an application. One
restriction common to most SCSI arrays is that the array must be physically attached to the SCSI
controller, and only one computer—the one containing the controller card—can utilize the array.
Such is not necessarily the case with NAS and SANs.
NAS
NAS consists of a dedicated storage device that attaches directly to a network. Generally, these
devices contain some kind of embedded server OS, such as Linux or Windows Embedded, which
enables them to act as a file server. Because they function just like a file server, they are
accessible by multiple users. In theory, SQL Server can store databases on NAS (see Figure 8.9).
However, NAS has some major drawbacks for use in any SQL Server solution, particularly a
scale-out solution. First, NAS devices are usually designed and built to replace file servers; they
might offer a level of redundancy by incorporating RAID, but they aren’t meant for blazing disk
performance. Another drawback is the fact that data must reach the NAS device by passing over
an Ethernet network, which is perhaps one of the least efficient ways to move mass amounts of
data. Thus, although a NAS device might make a convenient place to store oft-used SQL Server
scripts or database backups, it is a very bad location to store SQL Server databases.
SANs
SANs appear superficially to be very much like NAS, as Figure 8.10 shows. However, there is a
world of difference between SANs and NAS.
In a SAN, disks reside in dedicated array chassis, which contain their own internal controller
boards. These boards are often manageable through an external user interface, such as an
embedded Web server or a configuration utility. The boards control the primary operations of the
array, including its RAID level and how the available space is partitioned. The array is
connected—often via fiber-optic cabling in a typical fiber channel SAN—to a hub. Servers
contain a fiber channel controller card and are connected to the hub. In effect, the SAN is a sort
of dedicated, specialized network that in many ways resembles the infrastructure of an Ethernet
LAN.
Windows reads and writes data by sending normal storage requests to the fiber channel controller
card driver. The driver communicates with the controller hardware, and places storage requests
onto the fiber channel network. The other devices on the network pick up these requests—just
like a client computer might pick up network traffic—and respond appropriately. Even at this
logical level of operation, SANs don’t appear to be incredibly different from NAS devices; the
main difference is in speed. Fiber channel networks can carry data hundreds of times faster than
common Ethernet networks can, and fiber channel networks are optimized for dealing with high
volumes of data, such as that from a SQL Server application. SAN implementations also tend to
come with exceedingly large memory-based caches—in the gigabyte ranges—allowing them to
accept data quickly, then spool it out to disks.
SAN arrays can be shared between multiple servers. As Figure 8.11 shows, the array’s controller
board determines which server will have access to which portion of the available data. In effect,
the SAN array partitions itself and makes various partitions available to the designated servers on
the fiber channel network.
This ability for a SAN to be “shared” (the space isn’t really shared, but partitioned between
servers using the SAN) makes SANs a very effective tool for reducing management overhead:
The entire SAN can be managed (and often backed up, depending on the solution) as a whole,
and can often be re-partitioned somewhat dynamically, providing you with the ability to
provision more space for servers that need it.
SANs require software in order to run; look for SAN solutions that embed as much of their operating
software as possible in the controllers or in the array hardware. This setup avoids placing any
server-based software—and thus, overhead—onto your SQL Server computers, conserving as much
processing power as possible for SQL Server itself.
Also, look for SAN solutions that offer large caches. A cache—an area of fast memory that accepts
and holds data until the relatively slower hard drives can accept it—allows SQL Server to write data
quickly, faster even than the storage solution’s drives can really accept it. Simple RAID controllers
typically offer caches, too, but often in the megabyte range. A high-end SAN solution can offer
tremendously larger caches, increasing the amount of data that can be cached and improving SQL
Server’s performance considerably.
Interestingly, the advent of high-speed Gigabit Ethernet (GbE) technologies is making Ethernet a
contender for creating SANs. Ethernet offers significantly lower pricing than many fiber
channel-based SANs because Ethernet is a decades-old technology that runs over much less
expensive copper cabling (fiber channel can also be implemented over copper but, for
performance reasons, almost never is). A technology called Internet SCSI (iSCSI), which is
supported in Windows Server 2003 (WS2K3—with the appropriate drivers), allows a server to
address networked SCSI-based arrays over an Ethernet connection. Generally, this Ethernet
network would be dedicated to SAN purposes, and servers would have an additional network
adapter for communicating with clients.
Other emerging technologies include network adapters that have the ability to move data from a
networked array directly into the server’s memory. Referred to as TCP/IP offloading, this
technology promises significant performance improvements for storage operations (as well as
other data-intensive applications) because the server’s processor can be bypassed completely,
leaving it free to work on other tasks while data is moved around. Keep an eye on these
emerging technologies as they mature to see what the next generation of high-performance
storage will offer.
Specialized Storage
You can now purchase preconfigured specialty storage solutions. These are generally packaged
systems that combine basic RAID technology with proprietary controllers to improve
performance, provide better redundancy, and so forth. For example, EMC offers several SAN
solutions. These solutions include high-end manageability, meaning they can be incorporated
into enterprise management frameworks such as Hewlett-Packard OpenView, or by monitoring
and health solutions such as Microsoft Operations Manager (MOM), or monitoring solutions
from companies such as Altiris. They can also include high-end fault protection, such as the
ability to send an alert to a service technician when a drive fails. In fact, the first you hear about a
failed drive might be when a technician shows up that afternoon with a replacement.
These high-end storage systems can, in fact, provide additional scale-out capabilities,
redundancy, and performance above and beyond simple RAID levels. EMC’s CLARiiON CX
systems include proprietary software that provides full and incremental replication capabilities—
even across long-distance WAN links—allowing you to maintain a mirror of your data at a
remote location for additional fault tolerance and business continuity.
Dell offers branded versions of several EMC solutions, such as the Dell/EMC AX100, which comes in
both Fibre Channel and iSCSI versions, and the Dell/EMC CX700. Other vendors offer proprietary
solutions with similar capabilities.
Design Principles
To ensure an optimal high-performance storage design for your environment, consider a few
basic design principles. By reviewing the basic storage technologies, you can see that there are
several potential performance bottlenecks that any design should strive to work around:
• Controllers—Controllers represent a single point of contact between a storage subsystem
and Windows (and, therefore, SQL Server). In other words, all data has to pass through a
controller; if you have only one controller, it will represent a potential bottleneck for data
throughput.
• Drives—Drives have a fixed performance maximum, which simply cannot be overcome.
Controllers can help alleviate this bottleneck by caching data in the controller’s onboard
RAM. In addition, you can implement arrays of disks to remove the bottleneck
represented by a single disk.
• Bandwidth—The path that carries data from the controller to the drives is another
potential bottleneck. IDE, the most common drive technology in client computers, is
extremely slow compared with other technologies and isn’t suited for use in servers.
Newer versions of SCSI are faster, and some SAN technologies are faster still.
Your high-performance storage design should also consider the various types of data that SQL
Server deals with:
• Databases—Tend to be read-heavy in online analytical processing (OLAP) applications,
or mixed read- and write-heavy in online transaction processing (OLTP) applications
• Transaction logs—Write-heavy in OLTP applications
• OS files—For example, the page file, which is read- and write-heavy, especially in
systems with insufficient RAM
The ideal SQL Server storage subsystem, from a performance and fault tolerance point of view,
might include the following:
• A RAID 1 array for the OS files, including the page file—You could increase
performance by placing the page file on a RAID 0 array, but you’ll lose fault tolerance. If
a drive in the RAID 0 array fails, you probably won’t lose much data but the server will
go offline. Use a dedicated controller—servers will often have one built-in—for this
array.
• A RAID 5 array for the transaction log—Although RAID 5 imposes performance
penalties for writes (as do many RAID levels), it provides a good balance between fault
tolerance—essential for transaction logs—and write performance. If money is no object,
create a small RAID 10 array for transaction logs and you’ll get better performance and
fault tolerance. Regardless of which you choose, use a dedicated controller for this array.
• Multiple RAID 10 arrays for databases, or RAID 5 if RAID 10 is too expensive—The
goal is to figure out how much space you need, then use a larger number of smaller drives
to improve overall throughput. Ideally, each array should be on a dedicated controller to
provide independent bandwidth. For large databases, try to split the database’s tables into
multiple secondary database files and spread them across the available arrays so that the
workload tends to be evenly distributed across the arrays. Ideally, these RAID 10 arrays
can be implemented in fiber channel SANs, providing better throughput than copper-
based SCSI connections.
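To illustrate the last point, a database's files can be placed explicitly on separate arrays when the database is created, and SQL Server will then spread I/O across those files. The sketch below is only an example: the drive letters are assumed to map to different arrays (ideally on different controllers), and the database name, paths, and sizes are placeholders.

using System.Data.SqlClient;

class CreateSpreadDatabase
{
    static void Main()
    {
        // E:, F:, and G: are assumed to be separate arrays; adjust to match your hardware.
        const string createDb = @"
            CREATE DATABASE SalesDemo
            ON PRIMARY
                (NAME = SalesDemo_Data1, FILENAME = 'E:\SQLData\SalesDemo_Data1.mdf', SIZE = 10GB),
                (NAME = SalesDemo_Data2, FILENAME = 'F:\SQLData\SalesDemo_Data2.ndf', SIZE = 10GB)
            LOG ON
                (NAME = SalesDemo_Log, FILENAME = 'G:\SQLLogs\SalesDemo_Log.ldf', SIZE = 2GB);";

        using (SqlConnection conn = new SqlConnection("Server=.;Integrated Security=SSPI"))
        using (SqlCommand cmd = new SqlCommand(createDb, conn))
        {
            conn.Open();
            cmd.ExecuteNonQuery();
        }
    }
}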
Best Practices
Over the years, best practices have been developed with regard to storage performance in SQL
Server. These practices mainly come from long experimentation and reflect a sort of combined
experience from within the SQL Server community. By following these practices, you can help
ensure the best possible storage performance from your SQL Server scale-out solution.
• Do not under any circumstances use Windows’ built-in software RAID capabilities.
They’re simply too slow. Use only hardware RAID controllers that implement the RAID
array and present Windows with what appears to be a single physical disk that represents
the entire array. All major server manufacturers offer these controllers in their servers.
Or, simply use a SAN, which internalizes all the RAID capabilities and presents a single
logical drive to Windows.
• Use the fastest disks possible. 5400rpm and 7200rpm disks are commonly available,
although less than suitable for server use, where 10,000rpm is generally the accepted
minimum. SCSI disks running at 15,000rpm are common in server configurations. The
ability of the disk to spin its platters faster also gives it the ability to transfer data on and
off those platters faster. Provided the drive is connected to a fast controller capable of
handling the drive’s throughput, faster drives will always result in increased performance.
• Don’t mix drives in an array. Whenever possible, every drive in an array should be the
same size, speed, and brand, ensuring the most consistent possible performance across
the array. Mixing drives will lead to inconsistent performance, which can create
performance hits that are difficult, if not impossible, to positively identify.
• The more disks, the merrier. Computers can generally stream data to and from disks
faster than even the fastest disk can handle; by implementing arrays with more disks, the
computer will be able to move on to the next device in the array while the last device
“catches up.” Arrays of at least seven to eight disks are recommended, and larger arrays
are common in specialized storage solutions such as those sold by EMC.
• Disk controllers are a bigger bottleneck than many administrators realize. Select a
controller that has been tested and demonstrated high performance numbers. The
controller should have its own CPU, and, ideally, should be bus-mastering, providing it
with dedicated access to the server’s memory and offloading work from the server’s
CPUs. As I’ve already mentioned, fiber channel controllers have a distinct advantage
over traditional SCSI controllers in that the fiber network is able to carry data to and from
the controller much more rapidly than SCSI.
• If you have the money, consider a separate array controller for each major storage
category: OS, databases, log files, and so forth. Doing so will permit SQL Server to
maintain parallel data paths and can create a significant performance improvement.
• Purchase controllers with their own onboard RAM cache and battery backup. These
features allow the controller to report data as “written,” and allow SQL Server (and
Windows) to go about other tasks. The controller then streams data to the physical disks.
The battery backup ensures that a loss of power won’t result in a loss of data; when
power is restored, the controller will generally write anything left in RAM as soon as the
server starts, before Windows even loads.
• Arrays or disks used for SQL Server data shouldn’t contain any other devices. For
example, don’t use your array controller to run a tape backup unit, CD-ROM, or other
device; dedicate the controller to the task of moving data for SQL Server. Multi-
purposing a controller simply divides its attention and reduces the throughput available to
SQL Server. In fact, when possible, don’t install tape backups and other devices on a
SQL Server computer. Even though you can use a separate controller for these devices,
you’re still running the risk of placing an unnecessary burden on the server’s processors.
If you’re interested in testing your storage performance, download one of the free performance
utilities from http://www.raid5.com. These utilities can perform a fairly thorough test of raw throughput
and let you know how your system is doing. These utilities don't test SQL Server-specific
performance, but measure general, raw disk performance. Microsoft provides a tool named SQLIO,
which writes data in 8KB and 64KB blocks, mimicking SQL Server's own disk usage patterns. This
tool is useful to benchmark SQL Server-specific performance.
Aside from the server configuration, a number of SQL Server configuration best practices can
help improve performance. For example, high-end installations typically provide separate RAID
arrays for each type of data, often storing the OS and SQL Server on a RAID-1 (mirrored) array,
data on various RAID 5 (or, more commonly, RAID 10) arrays, log files on RAID 1 arrays, and
so forth. SQL Server’s temporary database, Tempdb, is often stored on an independent RAID 5
or RAID 10 array to improve performance, for example; that may seem like overkill, but keep in
mind that SQL Server can’t function without Tempdb (meaning you’ll need to allocate plenty of
space for it, too). You’ll need to create a configuration that not only provides the best
performance but also meets your availability and fault tolerance requirements.
Summary
In earlier chapters, I introduced you to scale-out concepts and compared scaling out to improved
efficiency. I’ve outlined several scale-out techniques and technologies, including replication,
federated databases, distributed partitioned views, and more. I’ve also discussed Windows
Clustering, which can provide a level of server fault tolerance to a scale-out solution. Finally, in
this chapter, I discussed high-performance storage, which provides a foundation for better-
performing scale-out projects.
High-performance storage is critical to many server operations, not just SQL Server. However,
because SQL Server is one of the most disk-intensive applications you can install under
Windows, high-performance storage becomes a critical design component of any scale-out
solution. Careful attention to the design of your storage subsystem is critical to the success of a
scale-out project, as even a well-designed database will suffer if stored on an underperforming
storage subsystem.
Although scale-out projects can be complex and require a significant design investment, they are
possible and can provide equally significant performance improvements for a variety of large
SQL Server applications. In addition, so-called commodity hardware can be used in scale-out
solutions to provide a more cost-effective solution, in many cases, than single-server solutions
that utilize more expensive, proprietary hardware designs.
Chapter 9
This chapter won’t address prepackaged applications. For the most part, packaged applications aren’t
subject to your reprogramming or rearchitecture, meaning you’re pretty much stuck with what you get.
Some packaged applications—such as SAP—have tremendous flexibility and can be re-architected
to achieve better scale-out. However, most such applications have very specific and often proprietary
guidelines for doing so, far beyond the scope of what this chapter can cover.
In the first scenario, suppose you’ve designed your SQL Server database so that each server
maintains a complete copy of the data, and that replication is used to keep the various copies in
sync with one another. So which database server does a given client computer access? Is it a
simple matter of selecting the one closest to it? How does it go about doing so? Would you
prefer that it somehow select the server that is least busy at the time? That's even harder:
static server selection is something you could put into the client's configuration, but
dynamically selecting a server based on its current workload requires additional logic. You
can't simply use technologies like Network Load Balancing (NLB), because that
technology assumes that every server has completely identical content. In a replication scenario,
servers won’t have completely identical content—not all the time. Consider how NLB might
work in this scenario:
• Client needs to add a row to a table. NLB directs client to Server1.
• Client immediately needs to retrieve that row (which has probably had some unique
identifier applied, likely through an Identity column). NLB directs client to Server2 this
time, but Server2 doesn’t have the new row, yet, due to replication latency.
Clients would instead need some logic of their own to select a server and then stick with it (a
technique referred to as affinity) through a series of operations; that’s not something NLB (which
was designed to work with Web farms) is designed to do.
Consider a second scenario, in which each server contains a portion of the overall database, and
objects like distributed partitioned views (DPVs) are used by clients to access the database as if it
were contained on a single server. As I explained in Chapter 4, the server physically containing
most of the requested data can best handle the query; should the client try to figure that out and
query the DPV from that server? If not, which server—as all of them are technically capable of
handling the query to the DPV—should the client select? If all clients select a single particular
server, you’re going to bottleneck at that server eventually, so you do want some way to spread
them all out.
In the next few sections, I’ll discuss some of the specific components that make client
applications more difficult in a scale-out solution.
Actually, that’s not completely accurate. Connection strings can provide support for alternate servers
in a failover scenario: "DSN=MyData;
AlternateServers=(Database=DB2:HostName=Server2,Database=DB1:HostName=Server3)" This is
still server-centric, however, as the client will always connect to the first server that’s available.
The problem with this technique in a scale-out solution is that it restricts the client to just a single
SQL Server. If the client is expected to connect to different SQL Server computers (if the client
is running in a different office, for example, which has its own server), the client either has to be
changed, or has to be written from the outset to have multiple connection strings to choose from.
And just because you have a multi-tier application doesn’t really change this problem; while
clients in a multi-tier application aren’t usually server-centric from a SQL Server viewpoint, they
do tend to be designed to work with a single middle-tier server, which in turn uses a standard,
server-centric connection string to work with a single SQL Server computer.
In a Web farm—which is the most common model of a scale-out application—this problem
would be solved by using load balancing. Clients—or middle tier servers or whatever—would
connect to a single virtual host name or IP address, which would be handled by some load
balancing component (such as NLB, or a hardware load balancing device). The load balancing
component would then redirect the client to one of the back-end servers, often in a simple round-
robin technique where incoming connections are directed, in order, to the next server in
sequence. I’ve already discussed why this doesn’t work in a SQL Server environment: Clients
often need to perform several tasks with a given server in short order before being redirected.
Sometimes, opening a connection and leaving it open will maintain a connection with the same
server, but that can become difficult to manage in middle-tier servers where dozens or hundreds
of connections are open at once, and where connections are pooled to improve performance.
What’s the solution to server-centric connections? Well, it depends on your SQL Server scale-
out design. Because at some point somebody has to use a connection string—whether it be a
client or a middle-tier—that somebody is going to have to incorporate logic to figure out which
connection string to use (or, more accurately, which server to put into the connection string). One
straightforward example of this might work for an environment where multiple SQL Servers
contain the entire database and use replication to stay in sync; as illustrated in Figure 9.3, clients
(or middle-tier servers) might examine their own IP address, match it to a list of server IP
addresses, and thereby connect to the server nearest them (similar to the way in which a
Windows client selects an Active Directory domain controller).
Of course, this technique requires that the application have a complete list of servers. To make
the application more robust and longer-lasting, you might have it actually query the list of
servers from a central database, enabling the server lineup itself to change over time without
having to deploy a new application.
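Here's a minimal C# sketch of that idea. The subnet prefixes and server names are invented for illustration, and a production version would likely pull the mapping from a central database, as just described.

using System;
using System.Collections.Generic;
using System.Net;
using System.Net.Sockets;

class NearestServerSelector
{
    // Hypothetical mapping of office subnets to the local SQL Server at each site.
    static readonly Dictionary<string, string> SubnetToServer = new Dictionary<string, string>
    {
        { "10.1.", "SQLNYC01" },
        { "10.2.", "SQLLON01" },
        { "10.3.", "SQLTOK01" }
    };

    public static string BuildConnectionString()
    {
        return "Server=" + PickServer() + ";Database=Sales;Integrated Security=SSPI";
    }

    static string PickServer()
    {
        foreach (IPAddress address in Dns.GetHostAddresses(Dns.GetHostName()))
        {
            if (address.AddressFamily != AddressFamily.InterNetwork)
                continue;   // only consider IPv4 addresses in this sketch
            foreach (KeyValuePair<string, string> entry in SubnetToServer)
            {
                if (address.ToString().StartsWith(entry.Key))
                    return entry.Value;
            }
        }
        return "SQLNYC01";   // fall back to a default server if no subnet matches
    }
}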
A more advanced solution might be to build your own equivalent of a network load balancing
solution, however. Figure 9.4 illustrates this technique.
In this example, the redirector is able to determine which server is least busy, located closest,
or best according to whatever other criteria you want to use. It then informs the client which server to use. The
client then makes a direct connection to that server, and maintains the connection for however
long it wants, allowing it to complete entire transactions with that server. The redirector service
might provide a timeout value; once the timeout expires, the client would be required to go back
and get a new server reference. This helps ensure that the redirector can continually rebalance
load across servers (for example). If the list of available servers evolves over time, only the
redirector needs to be updated, which helps reduce long-term maintenance.
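A redirector like this isn't a built-in SQL Server or Windows feature—you'd build it yourself. The C# sketch below shows one possible client-side shape, in which a hypothetical IRedirector contract hands back a server name along with a lease expiration; the interface, class names, and the transport behind them are all assumptions for illustration.

using System;

// Hypothetical contract exposed by a home-grown redirector service; it might be
// implemented as a Web service, a remoting component, or something similar.
public interface IRedirector
{
    ServerLease GetServer(string clientId);
}

public class ServerLease
{
    public string ServerName;      // for example, "SQL02"
    public DateTime ExpiresUtc;    // the client must ask again after this time
}

public class RedirectedConnectionFactory
{
    private readonly IRedirector _redirector;
    private ServerLease _lease;

    public RedirectedConnectionFactory(IRedirector redirector)
    {
        _redirector = redirector;
    }

    public string GetConnectionString(string clientId)
    {
        // Renew the lease only when it expires, so the client "sticks" to one
        // server for an entire series of related operations (affinity).
        if (_lease == null || DateTime.UtcNow >= _lease.ExpiresUtc)
            _lease = _redirector.GetServer(clientId);

        return "Server=" + _lease.ServerName + ";Database=Sales;Integrated Security=SSPI";
    }
}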
This redirector service can also be implemented in a scenario where your scale-out solution uses
a federated database. Clients might be designed to submit queries to the redirector first, which
might conduct a brief analysis and direct clients to the server best capable of handling that
particular query. That would require significantly more logic, and you wouldn’t necessarily want
the redirector to try and figure out which server contained the most data (that would reduce
overall solution performance, in most cases), but the redirector might be able to realize that a
client was trying to query a lookup table’s contents, and direct the client to the server or servers
that physically contain that data.
The idea, overall, is to find a way to remove the single-server view of the network, and to give
your solution some intelligence so that it can make smarter decisions about which server to
contact for various tasks. As much as possible, those decisions should be centralized into some
middle-tier component (such as the redirector service I’ve proposed), so that long-term
maintenance of the decision-making logic can be centralized, rather than spread across a widely-
distributed client application.
In this example, clients use message queuing to submit data requests. An application running on
the server (or a middle tier) retrieves the requests and executes them in order, placing the results
back on the queue for the client to retrieve. While this obviously isn’t appropriate for typical
online transaction processing (OLTP) data requests, it’s perfectly appropriate for ad-hoc reports
and other data that isn’t needed instantly. By moving these types of data requests into an
asynchronous model, you can ensure that they don’t consume excessive server resources, and by
building your client applications around this model you can give users an immediate response
(“Your request has been submitted”) and delayed results (“Your report is now ready to view”) in
a more acceptable fashion than simply having users stare at an hourglass.
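On Windows, Microsoft Message Queuing (MSMQ)—exposed to .NET through the System.Messaging namespace—is one way to implement this pattern (a reference to System.Messaging.dll is required). The queue path and message shape below are illustrative assumptions; the queues would be created by an administrator ahead of time.

using System.Messaging;

class ReportRequestQueue
{
    const string RequestQueuePath = @".\private$\ReportRequests";   // assumed queue

    // Client side: drop the request on the queue and return immediately.
    public static void SubmitReportRequest(string reportName, string parameters)
    {
        using (MessageQueue queue = new MessageQueue(RequestQueuePath))
        {
            queue.Send(parameters, reportName);   // message body plus a label
        }
    }

    // Middle-tier side: pull the next request, run it against SQL Server,
    // and post the finished results to a response queue for later pickup.
    public static void ProcessOneRequest()
    {
        using (MessageQueue queue = new MessageQueue(RequestQueuePath))
        {
            queue.Formatter = new XmlMessageFormatter(new[] { typeof(string) });
            Message request = queue.Receive();    // blocks until a request arrives
            string parameters = (string)request.Body;
            // ...execute the query and send the results to a response queue...
        }
    }
}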
Even certain OLTP applications can use this technique. For example, in an events-ticketing
application, submitting ticket purchases to a queue helps ensure that tickets are sold in a first-
come, first-served fashion. Customers might not receive instant confirmation of their purchase,
especially if the queue has a lot of requests on it for a popular event, but confirmation wouldn’t
take long. Because the actual processing would be accomplished by a middle-tier application,
rather than the client, the business logic of connecting to the scaled-out back-end could be more
easily centralized, as well.
My preference, as you’ll see throughout this chapter, is to never have client applications connecting
directly to the scaled-out back-end. Instead, have clients use a middle-tier, and allow that tier to
connect to the back-end for data processing. This model provides much more efficient processing,
eliminates the need for client applications to understand the scaled-out architecture, and helps to
centralize the connectivity logic into a more easily-maintained application tier.
While it’s obviously possible to build effective, scaled-out, 2-tier (client and SQL Server)
applications, it’s not the most efficient or logical approach.
Keep in mind that Web servers and Web browsers each represent distinct application tiers; even if
you have a scaled-out Web application where Web servers are directly contacting SQL Server
computers, you’ve still got a three-tier application, with the Web servers acting as a middle tier of
sorts.
Figure 9.8 illustrates how this application might need to evolve to work well in a scale-out
scenario.
Unfortunately, this sort of change is definitely nontrivial: Every page in the ASP.NET application, based on
the example you saw, will need significant modifications. An entire middle tier will have to be
constructed, as well. Essentially, much of the application will have to be rewritten from scratch.
This is why I refer to client applications as the bottleneck in a scale-out solution: Creating the
scaled-out SQL Server tier can seem easy compared to what you have to do to make a robust
client (and middle) tier that’s compatible with it.
This scenario might be appropriate in a solution where the users are geographically distributed.
Each location could have its own server, using WAN-based replication to stay in sync. Benefits
include the ability for users to always access a local database server, and the remainder of the
solution wouldn’t be terribly different from a single-server solution. In fact, this is probably one
of the easiest scale-out solutions to retrofit. However, downsides to this approach can include
significant WAN utilization and high replication latency. That means users at each location have
to be accepting of the fact that any data frequently updated by users at other locations may be out
of date a great deal of the time.
Another possible use for this technique is load balancing. In this example, the servers would all
reside in the same location, and users would be directed between them to help distribute the
workload. This is also relatively easy to retrofit an existing solution into, although changes
obviously need to be made to accommodate the load balancing (I discussed these points earlier in
the chapter). Replication could be conducted over a private, high-speed network between the
servers (a private Gigabit Ethernet connection might be appropriate), although particularly high-
volume applications would still incur noticeable replication latency, meaning each server would
rarely, in practice, be completely up-to-date with the others.
Figure 9.10 illustrates a different approach. Here, each server contains only a portion of the
database. Queries are conducted through DPVs, which exist on each server. As needed, the
server being queried enlists the other servers—through linked servers—to provide the data
necessary to complete the query. This is a federated database.
Figure 9.11 shows a minor variation on this theme. Here, a fourth server contains the DPVs and
enlists the three servers containing data to complete the queries. The fourth server might not
actually contain any data; its whole function is to serve as kind of an intermediary. The fourth
server might contain tables for primarily static data, such as lookup tables, which are frequently
read but rarely changed. That would help the three main servers focus on the main database
tables. I refer to the fourth server as a query proxy, since, like an Internet proxy server, it appears
to be handling the requests even though it doesn’t contain the data.
Finally, the last scale-out model is a distributed database, as pictured in Figure 9.12. Here, the
database is distributed in some fashion across multiple servers, but the servers don’t work
together to federate, or combine, that data. Instead, anyone accessing the database servers knows
what data is stored where, and simply accesses it directly.
This model has two main permutations. First, the database might be broken up by tables, so that
(for example) customer data resides on one server, while order data lives on another. The second
way is for the data to be manually partitioned in some fashion. Customers “A” through “M”
might be on one server, while the remainder are on another server.
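Whichever component accesses a manually partitioned database—ideally a middle tier, as discussed earlier—has to embed the routing rule somewhere. Here's a minimal C# sketch, using an invented alphabetical split and invented server names:

class CustomerRouter
{
    // Assumed split: customers A through M live on one server, N through Z on another.
    public static string ServerForCustomer(string customerName)
    {
        char first = char.ToUpperInvariant(customerName[0]);
        return (first >= 'A' && first <= 'M') ? "SQLCUST01" : "SQLCUST02";
    }
}

Keeping a rule like this in a single middle-tier component (or in a lookup table it reads) means the partitioning scheme can change later without redeploying every client.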
These models don’t necessary stand alone, either. For example, you might create a solution
where customer data is federated between three servers, and uses DPVs to present a single,
combined view of the data. Vendor data, however, might only exist on the second server, while
lookup tables live on the third server. This model would combine a distributed database model
with a federated database model. You can be creative to help your database perform at its best.
A second request (in red) goes to a second middle-tier server. This request might be for data
which isn’t handled by a DPV, but is rather distributed across two back-end servers. The client
application doesn’t need to understand this at all; it simply instantiates a remote component on
the middle tier server (or accesses a Web service, or something similar), and it gets its data. The
middle tier knows where the data is located, and retrieves it.
• The middle tier is often easy to scale out. Simply create an identical middle-tier server
and find a way to load-balance clients across it (perhaps hardcoding some clients to use a
particular server, or by using an automated load balancing solution).
• The middle tier can contain business and operational logic that would otherwise require
more complex client applications, or would place unnecessary load on SQL Server. For
example, the middle tier can be designed to understand the back-end data layout,
allowing it to access the data it needs. This removes the need for client applications to
have this logic, and allows the back-end to change and evolve without having to redesign
and redeploy the client. Instead, the middle tier—which is a much smaller installed
base—is reprogrammed. Similarly, operations like basic data validation can take place on
the middle tier, helping to ensure that all data sent to SQL Server is valid. That way, SQL
Server doesn’t waste time validating and rejecting improper data. If business rules change,
the middle tier represents a smaller installed base (than the client tier) that has to be
modified.
• You can get creative with the middle tier to help offload work from SQL Server. For
example, the middle tier might cache certain types of data—such as mainly-static lookup
tables—so that SQL Server doesn’t need to be queried each time. Or, clients could cache
that information, and use middle-tier functionality to determine when the data needed to
be re-queried.
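As a rough sketch of the lookup-table caching idea from the last bullet, the middle tier might hold mostly static rows in memory and refresh them only periodically; the refresh interval and the loader method here are assumptions.

using System;
using System.Collections.Generic;

class StateLookupCache
{
    static Dictionary<int, string> _states;                    // cached lookup rows
    static DateTime _loadedUtc = DateTime.MinValue;
    static readonly TimeSpan RefreshInterval = TimeSpan.FromMinutes(30);   // arbitrary
    static readonly object _sync = new object();

    public static Dictionary<int, string> GetStates()
    {
        lock (_sync)
        {
            if (_states == null || DateTime.UtcNow - _loadedUtc > RefreshInterval)
            {
                _states = LoadStatesFromSqlServer();   // SQL Server is hit only on a cache miss
                _loadedUtc = DateTime.UtcNow;
            }
            return _states;
        }
    }

    static Dictionary<int, string> LoadStatesFromSqlServer()
    {
        // Placeholder: in practice this would call a stored procedure through
        // SqlConnection/SqlCommand and populate the dictionary from the results.
        return new Dictionary<int, string>();
    }
}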
Middle tier applications used to be somewhat complex to write, and involved fairly complicated
technologies such as Distributed COM (DCOM). However, with today’s .NET Framework, Web
services, and other technologies, middle tiers are becoming markedly easier to create and
maintain, giving you all the more reason to utilize them in your scale-out application.
A middle tier can, in fact, be an excellent way of migrating to a scale-out solution. If you can take the
time to redesign client applications to use a middle tier, and create the middle tier properly, then the
back-end can be scaled out without having to change the client again.
It’s very important that Web applications follow the same best practices as any other client
application: Minimizing data queried, no ad-hoc queries, and so forth.
There are a few Web applications that are special cases. For example, a SQL Server Reporting
Services Web site typically needs direct connectivity to SQL Server, rather than accessing data
through a middle tier. When this is the case, you can typically make the back-end more robust to
accommodate the direct access. For example, reports might be pulled from a static copy of the
database that’s created each night (or each week, or however often), rather than querying the OLTP
servers.
As your Web tier scales out—Web farms being one of the easiest things to create and expand,
thanks to the way Web servers and browsers work—always take into consideration the effect on
the middle and back-end tiers. For example, you might determine that each middle-tier server
can support ten Web servers; so as you scale out the Web tier, scale out the middle tier
appropriately. Always pay attention to the resulting effect on the back-end, which is more
difficult to scale out, so that you can spot performance bottlenecks before they hit, and take
appropriate measures to increase the back-end tier’s capacity.
There are a number of pieces of functionality which typically exist in client applications, but
which can and should, whenever possible, be moved to the middle tier:
• Data validation. When possible, move this functionality to the middle-tier. The middle-
tier might provide functionality that allows clients to query data requirements (such as
maximum field lengths, allowed formats, and so forth), so that clients can provide
immediate feedback to their users, but in general try to avoid hardcoding data validation
in the client tier. As the most widely-deployed tier, the client tier is the most difficult to
maintain, so eliminating or reducing data validation—which can change over time—
helps to improve long-term maintenance.
• Business rules. As with data, client-tier maintenance will be easier over the long term if
business logic exists primarily on the middle tier.
• Data access logic. Clients should have no idea what the data tier looks like. Instead, data
access should all be directed through the middle tier, allowing back-end structural
changes to occur without affecting how clients operate.
Clients should not use (and the middle tier should not allow the use of) ad-hoc queries. Instead,
clients should be programmed to use middle-tier components (or Web services, which amounts
to the same thing) to query the exact data they require. This helps to ensure that clients are fully
abstracted from the data tier and have no dependencies on anything, including table names,
column names, and so forth. This technique provides the maximum flexibility for the data tier,
and truly makes the middle tier a “wall” between the clients and the data.
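The difference is easiest to see as a contract: rather than handing the middle tier a SQL string, the client calls a narrowly defined operation and gets back typed results. The interface below is purely hypothetical, but it shows the shape such a contract might take—nothing in it exposes table names, column names, or server locations.

using System;
using System.Collections.Generic;

// Hypothetical middle-tier contract: clients ask for exactly the data they need,
// with no knowledge of the schema or of which back-end server holds the data.
public interface ICustomerService
{
    CustomerSummary GetCustomerSummary(int customerId);
    IList<OrderHeader> GetOpenOrders(int customerId);
}

public class CustomerSummary
{
    public int CustomerId;
    public string DisplayName;
    public decimal CreditLimit;
}

public class OrderHeader
{
    public int OrderId;
    public DateTime OrderDate;
    public decimal Total;
}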
It probably goes without saying, but just in case: Applications should use all the best practices that I
discussed in Chapter 2, such as using stored procedures rather than ad-hoc queries, retrieving the
minimum amount of data, and so forth. These practices help applications perform better no matter
what kind of environment you’re working in.
Key Weaknesses
Applications already written for a 3- (or more) tier environment are less likely to have significant
weaknesses with regard to scale-out operations, although the tier which accesses data will likely
need a decent amount of work to accommodate a scaled-out SQL Server solution. However,
many applications are simple, client-server applications that may require extensive work. Here
are some of the key weaknesses usually found in these applications, which you’ll need to address
during your conversion:
• Direct connectivity. Applications connecting directly to a data source will need to have
that connectivity removed or modified, as appropriate, to understand your new solution
architecture.
• Ad-hoc queries. Many client applications make use of ad-hoc queries, which are out of
place in any database application, but especially in a scale-out scenario. Replace these
with calls to stored procedures or to middle-tier components.
• Caching. Client applications rarely cache data, although in a scale-out solution—when
retrieving data might require the participation of multiple servers—doing so can help
improve overall throughput. Clients may be able to cache, for example, relatively static
data used for drop-down lists and other lookups, helping to improve overall throughput of
the solution.
• Poor use of connection objects. Client applications often make poor use of ADO or
ADO.NET connection objects, either leaving them open and idle for too long or too
frequently creating and destroying them. A middle tier, which can help pool connections,
makes connection resources more efficient (see the brief connection-handling sketch after this list).
• Intolerance for long-running operations. While scale-out solutions are designed to
improve performance, sometimes long-running operations are inevitable. Client
applications must be designed not to error out, or to use asynchronous processing when
possible.
• Dependence on data tier. Client applications are often highly dependent on specific data
tier attributes, such as the database schema. Clients should be abstracted from the data
tier, especially the database schema, to improve solution flexibility.
• Multi-query operations. Clients typically perform interrelated queries, requiring them to
remain connected to a single server while each successive query completes. This creates a
connectivity dependence and eliminates the possibility of the client being load-balanced
to multiple servers throughout its run time.
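As a brief sketch of the connection-object weakness called out above: with ADO.NET, the usual cure is to open the connection as late as possible and dispose it immediately, letting the built-in connection pool do the heavy lifting. The server, database, and stored procedure names here are placeholders.

using System.Data;
using System.Data.SqlClient;

class OrderData
{
    public static DataTable GetOpenOrders(string connectionString, int customerId)
    {
        using (SqlConnection conn = new SqlConnection(connectionString))
        using (SqlCommand cmd = new SqlCommand("dbo.GetOpenOrders", conn))
        {
            cmd.CommandType = CommandType.StoredProcedure;
            cmd.Parameters.AddWithValue("@CustomerId", customerId);

            DataTable result = new DataTable();
            conn.Open();                                    // opened as late as possible
            using (SqlDataReader reader = cmd.ExecuteReader())
            {
                result.Load(reader);
            }
            return result;      // disposing the connection returns it to the pool
        }
    }
}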
Conversion Checklist
Here’s a short checklist of things you’ll need to change when converting an existing application
to work in a scaled-out environment:
I’m not assuming, in this checklist, that you’ll be using a multi-tier application, although I strongly
recommend that you consider it.
• Remove all direct connections to servers and implement logic to connect to the proper
server in the back-end. In a multi-tier application, all database connectivity will need to
be replaced by use of remote middle-tier components.
• Examine the application for data which can be locally cached and updated on demand.
Implement components that check for updated data (such as lookup data) and re-query it
as necessary.
• Redesign applications to use asynchronous processing whenever possible and practical.
This provides the middle- and back-end tiers with the most flexibility, and allows you to
maximize performance.
• Remove schema-specific references. For example, references to specific column names
or column ordinals should be removed, or rewritten so that the column names and
ordinals are created by a middle tier, stored procedure, or other abstraction. The
underlying database schema should be changeable without affecting client applications.
• Make operations as short and atomic as possible. If a client needs to execute a series of
interrelated queries, try to make that a single operation on the client, and move more of
the logic to the back-end or middle tier. By making every major client operation a “one
and done” operation, you make it easier to re-load balance clients to a different middle-
tier or SQL Server computer (if that’s how your scale-out solution is architected).
Summary
This chapter focused on the last topics needed in a scale-out solution—the actual applications
that will use your data. In general, a good practice is to apply a multi-tier approach to help isolate
clients from the data, thus providing the maximum flexibility for the back-end data tier.
Although the remainder of this book has focused primarily on SQL Server itself, you can’t
ignore the fact that SQL Server is only part of an application solution, and the design of the
remainder of the solution plays an equally important role in the solution’s overall scalability.
Throughout this book, the focus has been on scalability and flexibility. This guide has presented
you with options for scaling out the back end, explained technologies that can help SQL Server
perform better and more consistently, and introduced you to techniques that can help in both
scale-up and scale-out scenarios. You’ve learned a bit about how high availability can be
maintained in a scale-out solution, and about how critical subsystems—particularly storage—
lend themselves to a better-performing solution. Although building a scale-out solution is never
easy, hopefully, this guide has given you some pointers in the right direction, and as SQL Server
continues to evolve as a product, we’ll doubtless see new technologies and techniques dedicated
to making scale-out easier and more efficient. In the meantime, the very best of luck with your
scale-out efforts.
Appendix
That isn’t to say that there are no other 64-bit architectures available; quite the contrary, in fact. The
DEC Alpha processor, for example, was a 64-bit design. However, the Itanium and x64 platforms are
the only mass-production architectures currently (or slated to be) supported by Windows and SQL
Server.
However, while still in production and available in servers from several manufacturers, the IA64
platform hasn’t caught on as well as Intel undoubtedly hoped. The entirely new architecture
meant that existing 32-bit applications couldn’t be guaranteed to run as well, and a compatible
version of Windows took some time to produce, slowing adoption of the new processor. The all-
new architecture required significant training for server administrators, and—because the
processors never shipped in quantities close to that of Intel’s Pentium family—per-processor
pricing remained fairly high for some time. Purchasers were also put off by clock speeds of less
than 2GHz. Although the IA64 processors running at these speeds could significantly outperform
a faster-rated Pentium processor, the industry has a habit of “shopping by numbers” and
uninformed purchasers tended to be put off by the perceived lower speeds.
The “shopping by numbers” problem is one that most processor manufacturers are dealing with. Even
in 32-bit processors, new processor technologies such as dual-core and multi-pipelining make raw
CPU clock speed a fairly poor indicator of overall performance, which is why companies such as AMD
and Intel no longer use processor speed as a major identifying factor in product advertisements and
specification sheets.
Today, IA64’s long-term future is in some question. It’s undoubtedly a powerful platform, but
new processor models have not been introduced for some time, and Intel seems to be focusing
more on its EM64T architecture. Hewlett-Packard, originally a partner in the IA64’s
development, has since withdrawn from that effort, although the company still offers a
few IA64-based servers (other manufacturers, such as Dell, offer a few Itanium 2-based servers
as well). Microsoft has also announced that some key 64-bit technologies, including Windows
Compute Cluster, will not be made available (at least initially) on the IA64. However, Microsoft
continues to support Windows on the IA64 platform, positioning it as a preferred platform for
high-end applications such as large databases. Also, as I’ll point out later, some of the “biggest
iron” currently being used to run high-end SQL Server installations are running Itanium 2
processors.
The remainder of this appendix will use the term x64 to refer generically to both Intel EM64T and
AMD AMD64 64-bit offerings. The majority of this appendix will focus on Windows and SQL Server
support for x64, although the next section will provide an overview of the underlying x64 technology.
With a single, compatible platform from two vendors, Microsoft was able to quickly produce
compatible versions of Windows in several editions, and plans to make the x64 platform the
primary 64-bit platform for Windows in the future. For example, although it’s a sure thing that
Longhorn (the next version of Windows, now named Windows Vista and expected in 2006 or
2007) will ship 32-bit versions, it’s a strong possibility that it will be the last version of Windows
to do so. By the time the subsequent version of Windows is ready to ship, 32-bit processors may
have ceased production or may be considered only suitable for lower-end personal computers
rather than servers.
32-Bit: Not Dead Yet
Although both Intel and AMD are moving full speed ahead on 64-bit processors, the 32-bit processor is
far from dead and can still make a compelling case for high-end database servers. Intel, for example,
produces processors that can run two threads in parallel; these appear to the OS as two logical
processors and can deliver some of the benefit of a "true" dual-processor system. Intel's trade
name for this technology is Hyper-Threading. Both companies also produce dual-core processors,
which essentially pack two processors into a single package; Intel's Pentium D is an example of this
technology. Dual-core models that also support Hyper-Threading essentially allow a single processor
package to function more or less as four processors (dual-core processors are also available in 64-bit
versions, such as the AMD Athlon 64 X2 Dual-Core). A limitation of all these processors is their memory
support: traditionally, 32-bit processors have supported a maximum of 4GB of installed physical RAM.
Intel and AMD processors support a technology called Physical Address Extension
(PAE), which allows 32-bit editions of Windows to address more than 4GB of physical RAM. This
technology comes into play, for example, when running 32-bit Windows on an x64 processor, or on 32-bit
systems capable of having more than 4GB of physical RAM installed. However, PAE uses a
memory-paging technique that is somewhat slower than the native memory access provided by 64-bit
Windows running on an x64 processor, meaning applications don't perform quite as well. In addition, each
32-bit application is still limited to a logical address space of 2GB (or 3GB using a startup switch in Windows'
Boot.ini file), regardless of how much memory the server contains and Windows is capable of addressing.
To sum up, powerful 32-bit processors continue to be produced. Technologies such as PAE allow
Windows to address memory beyond the 4GB limit, but they do not allow applications such as SQL
Server to break the 2GB or 3GB boundary (although Address Windowing Extensions—AWE—does allow
SQL Server to store data pages beyond the 2GB or 3GB limit), and therefore do not provide as much
performance (in terms of memory, at least) as a fully-native, 64-bit solution.
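For reference, enabling AWE on a 32-bit instance is purely a configuration change. The following is a minimal T-SQL sketch using the documented sp_configure options; the 6144MB cap is an assumed example for a server with 8GB of RAM rather than a recommendation, and the SQL Server service account must also hold the Windows "Lock pages in memory" privilege.

-- Minimal sketch: allow a 32-bit SQL Server instance to use AWE memory.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'awe enabled', 1;          -- takes effect after a service restart
EXEC sp_configure 'max server memory', 6144; -- assumed cap in MB for an 8GB server
RECONFIGURE;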
The x64 platform provides much of its backward compatibility by supporting the entire x86
instruction set, then extending that instruction set to support new 64-bit capabilities. For this
reason, Intel refers to EM64T as 64-bit extensions. Key components of x64 technology include:
• Flat 64-bit address space—This component permits Windows to address up to 1TB of
RAM in a flat (non-paged) memory space. WS2K3 x64 also provides a significantly
larger address space to individual applications. For example, even on a modestly
equipped server with 32GB of RAM (the maximum supported by WS2K3 Standard
Edition), it’s entirely feasible for even large databases to reside completely in memory,
helping to eliminate or sharply reduce disk access, one of SQL Server’s major
performance bottlenecks. Figure A.1 shows how the SQL Server address space can help
expand all of SQL Server’s memory structures, including the caches.
Figure A.1: A flat address space provides unrestricted memory for all of SQL Server’s needs.
You won’t run across too many servers physically capable of handling 1TB of memory; a common
physical limit right now is 32GB (which is still a lot of memory). As memory prices continue to fall and
density continues to rise, hardware will be introduced that supports the installation of ever-greater
amounts of RAM.
• 64-bit pointers—These pointers allow the processor to natively access all installed
memory for both program instructions as well as data.
• 64-bit wide general purpose registers—These registers allow the processor to work
with a full 64 bits of data at once, twice that of a 32-bit processor.
• 64-bit integer support—Support for larger integers provides extended mathematical
and processing capabilities, as well as supporting 64-bit memory addressing.
Of course, these enhancements are only available under an x64 OS and only to x64-compatible
applications. 32-bit applications are supported on x64 editions of WS2K3, but they run under a
special Windows on Windows64 (WOW64) subsystem, and retain a 2GB (or 3GB) logical
address space limitation.
The 2GB limit is hard coded in Windows. Some editions of Windows allow this limit to be changed to
3GB, but I’ll continue to refer to the 2GB number for clarity.
The significance of the flat memory address space cannot be overstated. For example, consider
the block diagram in Figure A.2. In this diagram, the server has the maximum of 4GB of RAM
installed. Each running application is given its own 2GB logical address space, and all of those
address spaces must be backed by that 4GB of physical RAM. Of course, each
server runs dozens of applications (primarily background services), so there isn't sufficient
physical RAM for all of them. Although background services don't typically use the full 2GB they're
allocated, they do use memory. Add in major applications such as SQL Server and IIS, which are
far more likely to use the full 2GB they're allotted, and physical memory truly becomes scarce. The
result is that additional memory must come from the vastly slower page file. As
applications need to access memory, the appropriate chunk, or page, is moved from the page file
into physical RAM; at the same time, another page must be moved from physical RAM into the
disk-based page file to make room.
Remember, anything living in the page file is going to be slower to access. The goal is to get as much
as possible into physical RAM.
Because SQL Server may well need to work with more than 2GB of data (including caches for
stored procedures, execution plans, and so on), it implements its own paging scheme, moving
database data between disk and the 2GB of address space it has been allocated. Of course, if SQL
Server can't get a full 2GB of physical RAM backing that space, it has even less to work with, and winds
up paging even more data to and from disk. It's this slow disk access that represents the major bottleneck in
many database applications. Contrast this allocation with Figure A.3, which depicts the same
situation running in a fully native, 64-bit environment.
Here, applications can be allocated more than 2GB of memory. SQL Server is depicted with
6GB, enough to hold all of the data for the three databases it contains. There’s plenty of RAM
left over. In fact, in many x64 servers with plenty of RAM, it’s not unusual for disk activity to
fall off significantly, simply because so much data can fit into faster RAM.
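If you're curious how much of each database actually ends up resident in memory, SQL Server 2005's dynamic management views let you measure it directly. The following is a minimal sketch (each buffer holds one 8KB page, so the arithmetic simply converts page counts to megabytes):

-- Approximate buffer cache usage per database (SQL Server 2005).
SELECT DB_NAME(database_id) AS database_name,
       COUNT(*) * 8 / 1024  AS cached_mb
FROM sys.dm_os_buffer_descriptors
GROUP BY database_id
ORDER BY cached_mb DESC;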
The presence of all this database data living in RAM does not present safety or integrity issues.
In a 32-bit system, SQL Server runs several threads of execution. Understand that only data
resident in physical memory can be queried or changed; typically, one thread runs the query or
change while another thread works to bring needed pages of data into physical memory from
disk. The thread running the query often has to wait while the data-retrieval thread does its work,
and that’s where the slowdown occurs. A third thread works to write changed pages to disk,
where they’re protected from a power outage or server crash. Just because pages are written to
disk, however, doesn’t mean they’re removed from RAM; SQL Server leaves pages in RAM in
case it needs to modify them again. Only when it needs to bring more pages from disk are
unneeded pages dropped from RAM to make room.
In a 64-bit scenario, the same process occurs. However, it’s much less likely that pages will need
to be dropped from RAM to make room for more. So eventually, every page needed by the
database applications winds up in memory and stays there, reducing further disk activity (which
begins to consist more of background writes for changed pages).
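One easy way to watch this effect is the Buffer Manager's Page life expectancy counter, which reports how many seconds SQL Server expects a page to remain in the buffer cache before being evicted; on a memory-rich 64-bit server the value typically climbs dramatically. A minimal SQL Server 2005 sketch:

-- Seconds a data page is expected to stay in the buffer cache;
-- a sustained low value indicates buffer memory pressure.
SELECT cntr_value AS page_life_expectancy_seconds
FROM sys.dm_os_performance_counters
WHERE object_name LIKE '%Buffer Manager%'
  AND counter_name LIKE 'Page life expectancy%';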
Of course, extensive RAM isn’t the whole performance story. To take Intel x64 processors as an
example, the whole package of 64-bit computing provides some impressive statistics:
• As much as 8MB of L3 cache, helping to reduce the need for the processor to access
memory
• More than three times the bus bandwidth of older processors, with front-side bus speeds
of 667MHz
• PCI Express, a replacement for the PCI peripheral interface, which provides vastly
enhanced peripheral throughput capacities
• Quad-channel, DDR2-400 memory, offering larger capacity and lower latency
• Memory RAID capabilities, bringing disk-based redundancy technology to the memory
subsystem
64-bit Intel Xeon MP processors have been benchmarked at 38 percent faster than previous-
generation processors in the Transaction Processing Performance Council's TPC-C benchmark, and 52
percent faster in the TPC-H benchmark, two important benchmarks for database performance.
Coming Soon: 64-Bit and Then Some
Neither Intel nor AMD is standing still; both companies continue to push aggressively to develop
competitive new technologies. For example, both companies have released specifications for in-
processor virtualization technologies. These technologies, often referred to as hypervisor technologies,
have been spurred by the success of software products such as VMware and Microsoft Virtual Server.
Essentially, in-processor hypervisor technologies allow virtualization at a deeper hardware level, vastly
improving the performance of virtual machines.
For example, today’s blade solutions are all-hardware solutions, essentially multiple high-density
computers grouped together in a common, managed chassis. In the future, hypervisor technology may
extend blade computing’s definition to include a single, powerful traditional server running multiple virtual
machines as effectively as if each virtual machine were in fact a physical blade.
Although some of these new technologies aren't expected to begin shipping until 2006, it's useful to note
that the age of the processor is again upon us. For much of the past decade, processor technologies
have not shown significant advancements, instead relying on incremental increases in speed. Now, x64
computing, hypervisor technologies, and other advancements are making microprocessors one of the
most exciting and fast-evolving components of a server.
Example Processors
The lineup of Intel and AMD processors is constantly evolving, but the following list provides a
selection of processor lines that include x64 models. Note that not every processor in these lines
is necessarily x64; consult the appropriate manufacturer’s product guides for specifics. Also, this
list is intended to focus on server-suitable processors, as suggested by these manufacturers,
rather than being a comprehensive list of all x64 processors:
• Intel Pentium D Processor (dual-core)
• Intel Xeon Processor
• Intel Xeon Processor MP
• AMD Opteron for Servers
These processors are available now, in servers from most leading server manufacturers. In fact,
I've run across a few companies that have x64-based servers but are still running 32-bit
Windows, without realizing that they've got more power under the hood that isn't being put to
use. Check your servers and see; it may be that an upgrade to a 64-bit OS and software is all
you need to gain some extra performance for your database applications.
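A quick way to check what an instance is actually running is to ask it; the version string reports whether the installed SQL Server build is 32-bit, x64, or IA64 (it describes the installed software, not what the underlying hardware could support, which is exactly the gap described above). A minimal sketch:

-- Report the running SQL Server build, edition, and version; on 64-bit
-- builds the version string and edition include the platform (for example, "(64-bit)").
SELECT @@VERSION                        AS version_string,
       SERVERPROPERTY('Edition')        AS edition,
       SERVERPROPERTY('ProductVersion') AS product_version;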
64-Bit Windows
WS2K3 is the first version of Windows Server to offer editions that natively support x64
computing (a 64-bit version of Win2K Advanced Server was offered but was used only on the
IA64 platform). Going forward, x64 editions of Windows will more or less parallel 32-bit
editions until such time as x64 sales surpass 32-bit sales and make 32-bit editions of Windows a
thing of the past.
An x64 edition of Windows XP Professional is also available, but this appendix will focus on the
server products.
A current drawback to x64 Windows is that a native x64 version of Microsoft's .NET Framework isn't yet
available (as of this writing, a version is expected in 2005).
Interestingly, x64 editions of Windows cost the same as their 32-bit cousins, meaning you’ll only
need to spend a bit more on hardware to get extra power. Most server manufacturers are offering
x64 processors as standard equipment on all but their lowest-end servers. Dell, for example,
offers Intel Xeon or Itanium processors in the majority of its new servers. It’s actually becoming
difficult to find servers that use 32-bit processors, and few manufacturers include any in the top
or middle tiers of their product lines.
X64 hardware isn't appreciably more expensive than the last generation of similarly equipped high-end 32-bit
hardware; what usually winds up boosting the bill is the large quantity of additional memory you can
add.
Regarding software support, Microsoft today lists nearly 150 native x64 applications, with more
on the way; the company will ship x64-compatible editions of SQL Server 2005. Producing a
native x64 application when an existing 32-bit application's source code is available isn't usually
difficult; typically, simply recompiling the 32-bit source code is enough. However, software that
truly takes advantage of the x64 platform generally requires some additional coding and
optimization.
Microsoft’s product naming is getting a bit difficult to follow. SQL Server 2000 Enterprise Edition and
SQL Server 2000 Enterprise Edition (64-bit) are clear enough. As this is being written, Microsoft
hasn’t adopted an official naming strategy for SQL Server 2005. We know we’ll have SQL Server
2005 Standard Edition and SQL Server 2005 Enterprise Edition; how their 32- and 64-bit versions will
be identified isn’t yet clear.
Notably, Microsoft’s per-processor licensing is the same (as of this writing) for both 32-bit and
64-bit editions, and that licensing is per socket. In other words, there’s no extra charge for dual-
core processors. That’s currently Microsoft’s standard for licensing, and it extends across other
products as well.
You can find more benchmark results at http://www.tpc.org, the Transaction Processing Performance
Council's Web site. You can also read detailed information about what each benchmark, such as TPC-H,
addresses.
Such clusters are often referred to as High Performance Compute Clusters (HPCCs).
Obviously, special software is required to divvy up the workload between cluster elements, and
to reassemble the results. Microsoft’s HPC initiative seeks to provide some of that software for
the Windows platform.
The general idea is that one or more master nodes interact with the end user or application, while
accessing one or more compute nodes, all of which are interconnected by a private, high-speed
network. Figure A.4 illustrates the architecture, including software elements that provide
coordination and control within the compute cluster.
Compute nodes are not directly accessible from the main corporate network; instead, they work
exclusively with one another and the master node to solve whatever computing task they’re
given. In fact, compute nodes are typically installed without mice, monitors, or keyboards, and
HPC solutions often use automated deployment techniques to install the OS on the “bare metal”
machines. Microsoft’s HPC initiative supports this through the use of Emergency Management
Services (EMS; provides emergency console access to servers without a mouse or keyboard
installed) and Automated Deployment Services (ADS).
The Master Node is often clustered for high availability, using a technology such as Windows Cluster
Service.
The degree to which the data is segmented would depend on the implementation of SQL Server
in the HPC environment. For example, Microsoft could design it so that individual columns were
partitioned across the compute nodes, or the company might find it more efficient to horizontally
partition tables.
Earlier, this book stressed the need to fix application problems before attempting a scale-out solution.
With 64-bit computing, application flaws can become even more apparent. With memory (and,
therefore, disk throughput) a significantly smaller problem, issues such as inappropriate locking, poor
process management, and so forth can quickly become the limiting factor in your application’s overall
performance.
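One way to confirm whether locking, rather than memory or disk, has become the limiting factor is to examine SQL Server 2005's accumulated wait statistics. A minimal sketch; interpreting individual wait types is a larger topic:

-- Top accumulated waits since the instance last started (SQL Server 2005).
-- Lock waits (LCK_M_*) dominating the list suggest blocking problems
-- that additional memory alone will not solve.
SELECT TOP 10 wait_type, waiting_tasks_count, wait_time_ms
FROM sys.dm_os_wait_stats
ORDER BY wait_time_ms DESC;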
When even your new 64-bit server is overwhelmed, scale out again comes into play. With 64-bit
computing, however, it’s actually possible to make some preliminary judgment calls about what
a scale-out solution might look like. For example, consider the solution depicted in Figure A.6.
In this scenario (which is simply an example of what can be done; it’s not an illustration of any
particular production implementation), Server 1 is hosting frequently accessed lookup tables.
These are all designated as in-memory, ensuring rapid responses to queries. Servers 2 and 3 are
normal servers in a federation, hosting a distributed database. The 64-bit memory architecture can
help ensure that the majority of their portion of the database can fit in memory. Server 4 is
providing views to clients, and obtaining data from the federation members. Thanks to caching, a
lot of this data will remain accessible right at Server 4 (at least for commonly accessed views),
helping to improve throughput.
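To make the federation portion of this scenario concrete, the following sketch shows the general shape of a distributed partitioned view on one member (Server 2). All table, database, column, and linked-server names here are hypothetical; the essential ingredients are a non-overlapping CHECK constraint on the partitioning column at each member and an identical UNION ALL view defined on every member.

-- Hypothetical member table on Server 2; SERVER3 is assumed to be
-- configured as a linked server pointing at the other federation member.
CREATE TABLE dbo.Orders_Range1 (
    OrderID    INT      NOT NULL,
    CustomerID INT      NOT NULL,
    OrderDate  DATETIME NOT NULL,
    CONSTRAINT PK_Orders_Range1 PRIMARY KEY (OrderID),
    -- Each member enforces a unique, non-overlapping range of OrderID values.
    CONSTRAINT CK_Orders_Range1 CHECK (OrderID BETWEEN 1 AND 4999999)
);
GO
-- The same view is defined on every member so that any server can satisfy
-- a query against the full logical Orders table.
CREATE VIEW dbo.Orders
AS
SELECT OrderID, CustomerID, OrderDate FROM dbo.Orders_Range1
UNION ALL
SELECT OrderID, CustomerID, OrderDate FROM SERVER3.SalesDB.dbo.Orders_Range2;
GO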
Sizing can take into account a server’s limitations. If, for example, you have a 2TB database, you
might build a federation of four servers, each equipped with perhaps four processors and 512GB
of memory. Although all of the database will not constantly be in RAM, much of it will be, thus
vastly improving response times. Write-heavy applications—and most transactional database
applications are write-heavy—will still need high-performance storage to better keep up with the
amount of data being committed to disk.
How much of a “win” can SQL Server running on 64-bit provide? In a test of a 500-database
consolidation in an accounting application, the scarcest resource on a 32-bit implementation was
the procedure cache. 64-bit computing can help reduce processing cycles in SQL Server
operations, particularly in servers on which the procedure cache is under pressure. This can
happen in large applications with a large number of stored procedures that are all used
frequently. The procedure cache is unable to hold everything, so SQL Server winds up
unnecessarily recompiling stored procedures, which increases processing cycles. With 64-bit's
larger memory capacity, the procedure cache is more likely to hold everything, reducing
recompiles and processing cycles. This can allow a 64-bit processor to achieve up to double
the throughput of a 32-bit processor running at twice the clock speed.
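If you suspect similar procedure cache pressure on your own servers, SQL Server 2005 exposes the plan cache through a dynamic management view. A minimal sketch:

-- Summarize the plan cache by object type (SQL Server 2005). A small total
-- combined with frequent recompilation suggests the cache is under pressure.
SELECT objtype,
       COUNT(*)                                     AS cached_plans,
       SUM(CAST(size_in_bytes AS BIGINT)) / 1048576 AS cache_mb
FROM sys.dm_exec_cached_plans
GROUP BY objtype
ORDER BY cache_mb DESC;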
During the test migration of all 500 databases, 64-bit reduced CPU utilization simply because
stored procedures were recompiled much less frequently. Figure A.7 shows the comparison
charts; note that these numbers were gathered on a SQL Server 2000 implementation. As you can
see, the number of transactions per minute showed a marked increase, even with a shorter think
time.
In another example, a large document management application for a title and tax company
showed major improvements after a 64-bit migration. This high-volume application involves
complex transactions and searches; the refinancing book increased CPU utilization on a 32-way,
32-bit server to more than 70 percent at peak. The transaction processing application was moved
to a 32-way, 64-bit machine, reducing CPU utilization to a mere 30 percent and improving
reindexing times by 30 percent. Given that 64-bit servers are becoming the norm, rather than the
exception, these examples stress how easy the decision to “go 64” should be.
Summary
64-bit computing, particularly on x64-based servers built using commodity technologies, offers the
promise of immediate performance improvements, particularly on systems fully loaded with
performance-boosting memory. Dual-core processors, which cost the same as traditional
processors in terms of SQL Server software licensing, provide significant performance
boosts as well. Overall, if you're looking to successfully scale out (or up, for that matter),
there's little reason to even consider 32-bit servers. SQL Server 2005 running on a 64-bit
platform is definitely the way to go, and new x64 systems promise significantly higher
performance than legacy 32-bit systems at a much lower price point than costlier IA64-based
systems.
Glossary and List of Acronyms
64-bit
Refers to a 64-bit, rather than a traditional 32-bit, microprocessor, including processors featuring
Intel EM64T technology or AMD AMD64 architecture; also refers to Intel microprocessors in
the Itanium or Itanium2 families
Active-active cluster
In this cluster type, each node performs useful work and is backed up by the other nodes
ADO
ActiveX Data Objects
ARP
Address Resolution Protocol
Article
The smallest unit of data that SQL Server can replicate; defined to be a table, a vertical or
horizontal partition of data, or an entire database; can also represent specific stored procedures,
views, and other database objects
CLR
Common Language Runtime
Clustered index
Controls the physical order of the rows in a table; if a nonclustered index is created and a
clustered index doesn’t already exist, SQL Server creates a “phantom” clustered index because
nonclustered indexes always point to clustered index keys
Composite index
Groups several columns together; for example, neither a first name nor a last name column will
usually be very selective on its own in a customer table, but the combination of first name and
last name will be much more selective
Database Mirroring
A fault-tolerance technology in SQL Server 2005 that continually copies database transactions to
a backup, or mirror, server. The mirror server is capable of standing in for the primary server if
the primary server fails
Data-dependent routing
The use of an intelligent application middle tier that directs queries directly to the server or
servers containing the desired data; this technique effectively makes the middle tier, rather than a
distributed partitioned view, responsible for connecting to the appropriate server in a
horizontally partitioned database
DBA
Database administrator
dbo
The built-in database owner user that is present in all SQL Server databases
Disk striping
As data is streamed to the controller, it is divided more or less evenly between all the drives in
the array (two are required); the idea is to get more drives involved in handling data to increase
overall throughput
Distributed partitioned database
A scale out strategy in which a database is partitioned and the pieces exist on different servers
Distributed partitioned views
This scale out technique enables horizontal partitioning of a table so that several servers each
contain different rows from the table; the distributed partitioned view is stored on all the servers
involved, and combines the rows from each server to create a single, virtual table that contains
all the data
Distributor
A special middleman role that receives replication data from a publisher and distributes copies to
subscribers, helping to reduce the load of replication on the publisher
DTC
Distributed Transaction Coordinator
DTS
Data Transformation Services
Failback
A concept supported by Windows Clustering in which the cluster will attempt to shift clustered
resources back to the original, preferred node
FC
Fibre Channel
Federation
A group of servers that coordinate to service clients’ requests
Fillfactor
Specified when a new index is created; SQL Server stores indexes in 8KB pages; the fillfactor
specifies how full each 8KB page is when the index is created or rebuilt
Fully enmeshed replication design
A SQL Server farm replication setup that attempts to reduce latency by enabling each server in
the farm to replicate with each of the other servers in the farm; this latency reduction comes at
the cost of performance
GbE
Gigabit Ethernet
GUID
Globally unique identifier
HCL
Hardware compatibility list
High-performance storage
A scale out strategy consideration that enables an existing server to handle a greater workload;
for example, high-speed Fibre Channel SANs
Horizontal partitioning
Breaks the database into multiple smaller tables that contain the same number of columns but
fewer rows than the original database; these sections can then be placed on dedicated servers
Index
A lookup table, usually in the form of a file or component of a file, that relates the value of a
field in the indexed file to its record or page number and location in the page
Latency
The time a packet takes to travel from source to destination
Linked servers
A means of communication for the servers in a federation; provides authentication and
connection information to remote servers; each server in the federation must list all other
federation members as a linked server
Log shipping
This technique copies the transaction log from one server to another server, and the log is then
applied to the second server; this technique offers very high latency but very low overhead; it’s
also only available for an entire database
Merge replication
Works similarly to transactional replication but is specifically designed to accommodate
conflicts when data is changed in multiple sources; for handling conflicts, general rules must be
specified or a custom merge agent must be written that will handle conflicts according to your
business rules
Mirrors
Online copies of data that are updated in real time, providing a duplicate copy of the data that
can be used if the original copy fails
MSMQ
Microsoft® Message Queuing services
NAS
Network attached storage
NLB
Network load balancing
Node
Each server participating in a distributed partitioned view
Normalize
A useful technique of organizing data to minimize data redundancy and improve data integrity;
for example, divide a database into two or more tables and define relationships between the
tables; however, this method generally comes at some cost in performance
ODBC
Open Database Connectivity
OLAP
Online analytical processing
OLTP
Online transaction processing
Parity
The result of a calculation performed on stored data; this information can be used to reconstruct
portions of the stored data in a failure situation
Partition
The process of logically dividing a database into multiple pieces, then placing each piece on a
separate server; partitioning can be done along horizontal or vertical lines, and techniques such
as replication and distributed partitioned views can be employed to help reduce the complexity of
the distributed database; in SQL Server 2005, databases can be partitioned across multiple files,
allowing those files to all be managed as a single unit
Partitioning column
A requirement for partitioned tables; SQL Server looks at this column to see which of the
federation’s servers contain (or should contain, in the case of added rows) specific rows; this
column can be any normal column with a specific CHECK constraint applied, but this CHECK
constraint must be different on each member of the federation so that each member has a unique,
non-overlapping range of valid values for the column
Preferred node
A designation within the Cluster Service; whenever this node is online, all resources will be
transferred to it; if it fails and is subsequently restarted, all cluster services will transfer back to it
once it is online again
Publisher
Makes articles available; contains a writable copy of the data
Quorum resource
A file that describes the cluster’s configuration
RAID
Redundant Array of Inexpensive Disks
RDBMS
Relational database management system
Replication
Enables SQL Server to accept changes on one server, then copy those changes out to one or more
other servers; servers can both send and receive replication traffic, allowing multiple servers to
accept data updates and distribute those updates to their partner servers
SAN
Storage area network
Sargeable
Database administrator slang (from "search argument") for query predicates that compare a
column with a constant value and can therefore take advantage of an index
Scale out
The process of making multiple servers perform the work of one logical server or of dividing an
application across multiple servers
Schema
The organization or structure of a database
SCSI
Small computer systems interface
Seeking
The process—performed by the disk heads of physical storage systems—of moving the heads to
the appropriate location of the disk so that the data spinning underneath the heads can be
magnetically read or modified
Shared-nothing cluster
The group of back-end servers that work together to fulfill a distributed partitioned view query;
none of the cluster nodes has access to the same resources at the same time
Shared storage
The external array that stores only the data used by the clustered applications (such as SQL
Server databases) and a small cluster configuration file
Smart indexing
The process of constantly reviewing indexes for appropriateness and experimenting with
different index configurations to ensure that the current setup is the most effective
Snapshot replication
Essentially entails sending a copy of the database from one server to another; this replication
type is a high-overhead operation, and locks the source database while the snapshot is being
compiled; most other forms of replication start with a snapshot to provide initial synchronization
between database copies
Stored procedure
Code that implements application logic or a business rule and is stored on the server; more
efficient than triggers and centralizes critical operations into the application’s data tier; SQL
Server retains its execution plan for future use
Subscriber
Receives replication changes to the article
Subscription
A collection of articles and a definition of how the articles will be replicated; push subscriptions
are generated by the publisher and sent to subscribers; pull subscriptions are made available to
subscribers, which must connect to receive the subscription’s data
TOE
TCP/IP offload engine
TPC
Transaction Processing Performance Council, which publishes benchmark results for several
server platforms, providing an independent indication of the relative strengths of different
servers
TPC-App
TPC benchmark for overall application performance
TPC-C
TPC benchmark for basic transaction processing in any database application
TPC-H
TPC benchmark for decision support (data warehousing) databases
TPC-R
TPC benchmark that analyzes performance for standardized report generation; no longer used
TPC-W
TPC benchmark for Web-connected databases, particularly databases supporting an e-commerce
application; replaced by the newer TPC-App benchmark
Transactional replication
Copies only transaction log entries from server to server
Trigger
A database object that can be used to intercept data to ensure that it is clean and to cascade
referential integrity changes throughout a hierarchy of table relationships; triggers represent a
way to centralize business logic in the data tier
T-SQL
Transact SQL
Vertical partitioning
Breaks the database into multiple smaller tables that have the same number of rows but fewer
columns than the original database; the sections can then be placed on dedicated servers
(technically, both a large database that is partitioned by column and several tables spread onto
different servers qualify as vertical partitioning, just at different levels)
View
The method by which database data is presented to the user; allows redistribution of databases
transparently to the end users and their business applications (as long as client applications are
designed to use the views rather than the direct tables, the tables themselves can be rearranged
and scaled out as necessary without the client application being aware of any change)
Web farm
Each server in the farm is completely independent and hosts an identical copy of the entire Web
site; users are load balanced across the servers, although the users rarely realize that more than
one server exists; a good example of scaling out
X64
A generic term used to refer to both the Intel EM64T and AMD AMD64 microprocessor
architectures; X64 processors are backward-compatible with 32-bit x86 processors
Content Central
Content Central is your complete source for IT learning. Whether you need the most current
information for managing your Windows enterprise, implementing security measures on your
network, learning about new development tools for Windows and Linux, or deploying new
enterprise software solutions, Content Central offers the latest instruction on the topics that are
most important to the IT professional. Browse our extensive collection of eBooks and video
guides and start building your own personal IT library today!