SQL Server Database Configuration Best Practices


Author: Jean Joseph


Blog: datadrivencommunity.com
Table Of Contents
• Brief intro to SQL Server Database Architecture
  • SQL Server Database Engine
  • Relational Engine
  • Storage Engine
  • SQLOS
• Database Data Files
• Database Compatibility Level
• Database Data Files Page Verification
• Database Auto Create Statistics and Auto Update Statistics
• Database Recovery Model
• Database Scoped Configuration
• Key Takeaways


Intro to SQL Server Database Architecture


A database is a software product whose primary function is to store and retrieve data as requested by other software applications, and it depends on a database management system, which in our case is Microsoft SQL Server. The database management system relies on the server's hardware resources and can only process as much information as that hardware is capable of handling.
Below is a brief recap of the Database Engine, the Storage Engine, and SQLOS.

SQL Server Database Engine


The core component of SQL Server is the Database Engine. The Database Engine consists of a relational engine that processes queries and a storage engine that manages database files, pages, indexes, and so on. Database objects such as stored procedures, views, and triggers are also created and executed by the Database Engine.
Knowing the type of application that will access your database helps you determine how to configure the database after installing SQL Server. For instance, if it is a SharePoint database application, you should set the max degree of parallelism (MAXDOP) value to 1; limiting the number of processors used in a parallel execution means a single SQL Server process serves each request, and SharePoint underperforms otherwise. If it is not a SharePoint database, configure MAXDOP based on your workload, as in the sketch below.
Relational Engine
The relational engine, or query processor, has the components that determine the best way to execute a query. It requests data from the storage engine based on the input query and processes the results. Tasks of the relational engine include query processing, memory management, thread and task management, buffer management, and distributed query processing.
Storage Engine
The storage engine oversees the storage and retrieval of data from storage systems such as disks and SANs.
SQLOS
Under the relational engine and storage engine is the SQL Server Operating System or SQLOS. SQLOS provides many operating system
services such as memory and I/O management. Other services include exception handling and synchronization services.

The goal of this post is not to explore SQL Server settings in great depth, but instead to walk through some of the database settings you should look at after installing SQL Server, and when architecting or troubleshooting SQL Server performance issues.

Database Data Files
When we configure hardware for SQL Server, we want to consider four things: 1) data files; 2) log files; 3) tempdb; and 4) indexes. When a user inserts a record, the SQL Server engine must first write an entry to the log file describing what it is about to do, then write the new record into the data file, and finally write an entry into the log file saying it has finished working in the data file.

As a rule of thumb, put the data and log files on separate drives, and put the log files on the fastest-writing drives you can afford. The more often you run deletes, updates, and inserts, the faster your log file drives need to react.

For performance, you also need to think about tempdb configuration and index strategy. SQL Server does an excellent job of using memory to avoid hitting storage. Unfortunately, in addition to the log files, two other parts of SQL Server also like to do a great deal of writing at the exact same time we are accessing the data file: tempdb and indexes. When these components need to write data, they need to write it immediately, and we cannot throw memory at the problem to make it faster.

To sum up, we’ve identified several things that might need separate storage due to their constant simultaneous access when architecting our
SQL Server databases:
• Data files
• Log files
• TempDB
• Indexes (full text indexes)

We can customize the model database's default settings in SQL Server to achieve better consistency and performance from our databases across production and non-production environments, as sketched below. However, make it a point to document the changes you make in your environment so that you can apply the same settings if you ever need to rebuild the environment from scratch in the event of a disaster.
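A minimal sketch, assuming the default logical file names of the model database (modeldev and modellog) and purely illustrative sizes; size these for your own workload:

USE master;
GO
-- Pre-size model so new databases start with sensible files and fixed growth increments.
ALTER DATABASE model MODIFY FILE (NAME = 'modeldev', SIZE = 512MB, FILEGROWTH = 256MB);
ALTER DATABASE model MODIFY FILE (NAME = 'modellog', SIZE = 256MB, FILEGROWTH = 128MB);
-- New databases will also inherit this recovery model.
ALTER DATABASE model SET RECOVERY FULL;
GO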

Note: tempdb is also created by using the model database as a template, so keep in mind that any object created in the model database will also exist in tempdb the next time the SQL Server instance restarts, since tempdb is recreated every time the SQL Server service starts.

Database Data Files (Continued)

SQL Server Database Autogrowth

SQL Server database autogrowth is the procedure the SQL Server engine uses to expand a database file when it runs out of space. If the autogrowth setting for a database is not configured correctly, the database may experience frequent auto-grow events. Each time SQL Server must grow a file, transactions stop and wait until the file-growth operation is complete before continuing. These events can introduce unpredictable performance hits at random times, especially if the disks are performing slowly.

Make sure you enable instant file initialization (IFI), as it allows SQL Server to skip the zero-writing step and begin using the allocated space immediately for data files. It does not affect growth of your transaction log files; those still need all the zeroes. The larger the growth operation, the more noticeable the performance improvement is with IFI enabled. The script below sets fixed autogrowth values.

USE [master];
GO
-- Set fixed autogrowth increments for the data and log files of the sample DemoDB database.
ALTER DATABASE DemoDB MODIFY FILE (NAME='DemoDB_data', FILEGROWTH = 512MB);
ALTER DATABASE DemoDB MODIFY FILE (NAME='DemoDB_log', FILEGROWTH = 256MB);
GO
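To review the current size and growth settings before changing them, a hedged check against sys.master_files might look like this (DemoDB is the same sample database name used above):

-- growth is in 8 KB pages unless is_percent_growth = 1.
SELECT name, type_desc, size, growth, is_percent_growth
FROM sys.master_files
WHERE database_id = DB_ID('DemoDB');
GO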

Database Compatibility Level
The database compatibility level, one of the database-level settings, impacts how a database functions. Each new version of Microsoft SQL Server introduces many new features, most of which require new keywords and change certain behaviors that existed in earlier versions. To provide maximum backward compatibility, Microsoft lets us set the compatibility level according to our needs.

By default, every database inherits the compatibility level of the model database from which it was created. When you restore a database backup taken on an older version of SQL Server, the database compatibility level remains the same as it was on the instance from which you took the backup, unless the source compatibility level is lower than the minimum supported level. Note that the compatibility levels of the tempdb, model, msdb, and resource databases are set to the current compatibility level after an upgrade, whereas the master system database retains the compatibility level it had before the upgrade.

Keep in mind that migrating to a modern version of SQL Server (meaning SQL Server 2016 or newer) is significantly more complicated than it was with legacy versions. Because of the changes associated with the various database compatibility levels and cardinality estimator versions, it is very important to put some thought, planning, and actual testing into which database compatibility level you want to use on the new version of SQL Server to which you are migrating your existing databases.

Determine the compatibility level


USE <database name>;
GO
SELECT compatibility_level
FROM sys.databases
WHERE name = '<database name>';
GO

To change to a different compatibility level, use the ALTER DATABASE command as shown in the following example:

USE master
GO
ALTER DATABASE <database name> SET COMPATIBILITY_LEVEL = <compatibility-level>;

Database Compatibility Level (Continued)

Before changing the compatibility level

To change the compatibility level safely while users are connected to the SQL Server instance, it is preferable to follow these steps (see the sketch after this list):

• Put the database in single-user mode.
• Change the compatibility level.
• Put the database back in multi-user mode.
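A minimal sketch of those three steps, assuming a hypothetical database named DemoDB and a target compatibility level of 150 (SQL Server 2019):

USE master;
GO
-- Force other sessions out so the change is not blocked.
ALTER DATABASE DemoDB SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
ALTER DATABASE DemoDB SET COMPATIBILITY_LEVEL = 150;
-- Let users back in.
ALTER DATABASE DemoDB SET MULTI_USER;
GO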

How does this affect me?

Compatibility levels are very useful, and changing them should not be undertaken lightly. We might have an application that runs under SQL Server 2008 R2; that version of SQL Server is no longer supported by Microsoft, which means we need a higher, supported version. We can migrate the databases on that server to a brand-new SQL Server 2019 instance and maintain compatibility with 2008 R2 by keeping the compatibility level at 100.

Even better, we can still benefit from a number of features in SQL Server 2019, including Query Store and Accelerated Database Recovery, while maintaining compatibility with applications that expect the Transact-SQL behavior of SQL Server 2008. Over the lifetime of that SQL Server 2019 instance, we can gradually adopt newer features in our database and increase the compatibility level until we get to 150.

Database Data Files Page Verification
When SQL Server writes data to disk, it simply assumes everything is fine; it does not check whether the data is good or not. It is when data is read back from storage that SQL Server discovers whether that data is still good, and that is where page verification methods come in.

If databases have page verification set to NONE or TORN_PAGE, we may not be able to completely recover from storage corruption. When TORN_PAGE_DETECTION is specified, SQL Server can detect when a page was not successfully written to disk, but it cannot tell you whether the data stored in each 512-byte sector is correct, because a couple of bytes may have been written incorrectly. CHECKSUM, on the other hand, calculates a checksum value as the final step before passing the page to the I/O system to be written to disk, which guarantees that SQL Server had no part in corrupting the page. When SQL Server reads the page back, if even a single bit is different, the mismatch is caught and a checksum error (824) is generated.

On SQL Server 2005 and newer, we usually want to be on CHECKSUM page verification. Torn page detection is the more lightweight of the two, but CHECKSUM is safer, and its overhead is still small enough that it is the better option. Even if your databases have CHECKSUM enabled, you still need to verify the checksums during the backup process with the WITH CHECKSUM option to make sure your data has not been corrupted. This is not a 100 percent guarantee, but it increases your confidence that pages with checksums are less likely to be corrupted.
Use the script below to generate an ALTER DATABASE statement for every database that is not already using CHECKSUM page verification:
SELECT 'ALTER DATABASE ' + QUOTENAME(s.name) + ' SET PAGE_VERIFY CHECKSUM WITH NO_WAIT;'
FROM sys.databases AS s
WHERE s.page_verify_option_desc <> 'CHECKSUM';
GO
It is very important to rebuild every index and table in the database right after you enable CHECKSUM if you really want to protect your data, because checksums are only added as pages are written back to disk. This can be an I/O- and CPU-intensive operation, but it is safer.
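As a hedged illustration of verifying checksums during the backup process (DemoDB and the backup path are hypothetical placeholders):

-- Back up the database and recheck every page checksum as it is read.
BACKUP DATABASE DemoDB
TO DISK = N'D:\Backups\DemoDB_full.bak'
WITH CHECKSUM, STATS = 10;
GO
-- Optionally verify the finished backup file without restoring it.
RESTORE VERIFYONLY
FROM DISK = N'D:\Backups\DemoDB_full.bak'
WITH CHECKSUM;
GO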

Database Auto Create Statistics and Auto Update Statistics
Statistics are lightweight objects that the SQL Server query optimizer uses to determine the optimal way to retrieve data from a table. The optimizer uses the histogram of column statistics to choose the optimal query execution plan. If a query uses a predicate that already has statistics, the query optimizer can get all the required information from those statistics to determine the optimal way to execute the query.

By default, SQL Server creates and updates statistics automatically for your databases; however, you do have the option to disable these features manually. Disabling auto create and auto update statistics should be done carefully.

There are three ways SQL Server statistics get created:

• Automatically, when the AUTO_CREATE_STATISTICS option is enabled (it is enabled by default)
• Manually, with the CREATE STATISTICS statement
• Automatically, when a new index is created

It is generally recommended to leave this option enabled. Ideally, statistics are managed by a scheduled job, and the automatic option is used as a safety net, available to update statistics in the event a scheduled update does not occur or accidentally does not include all existing statistics. A quick way to check and enable these options is sketched below.
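A minimal sketch, assuming a hypothetical database named DemoDB; the sys.databases columns show whether the options are currently on:

-- Check whether automatic statistics are enabled for the database.
SELECT name, is_auto_create_stats_on, is_auto_update_stats_on
FROM sys.databases
WHERE name = 'DemoDB';
GO
-- Re-enable them if they were turned off.
ALTER DATABASE DemoDB SET AUTO_CREATE_STATISTICS ON;
ALTER DATABASE DemoDB SET AUTO_UPDATE_STATISTICS ON;
GO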

Database Recovery Model
The database recovery model controls how a SQL Server database can be backed up and restored. I will explain the three recovery models and what to think about when choosing a recovery model for a database.
Simple Recovery Model
The simple recovery model is the most basic of the recovery models. Each transaction is still written to the transaction log, but the transaction log records are eventually removed automatically; that removal happens for all completed transactions when a checkpoint occurs. Because log records are removed at checkpoint, transaction log backups are not supported when using the simple recovery model.
Note that with the simple recovery model you can only perform full and differential backups. Point-in-time and page restores are not supported; only the restoration of secondary read-only files is supported. Reasons to select the simple recovery model include:
• It is most suitable for development and test databases.
• Simple reporting or application databases, where some data loss is tolerable.
• Recovery is only possible to the end of the most recent full or differential backup (no point-of-failure recovery).
• Minimal administrative overhead, since there are no log backups to manage.
Full Recovery Model
With the full recovery model, SQL Server preserves the transaction log records until you back them up. Unlike the simple recovery model, the records stay in the transaction log after the transaction is completed, and they remain there until a log backup is performed. The full recovery model supports point-in-time restores, meaning a fully logged database can be restored to any point in time. When a transaction log backup is performed against a database in full recovery mode, the log records are written to the transaction log backup, and the completed transaction log records are removed from the transaction log.

Therefore, when a database uses the full recovery model, you need to ensure transaction log backups are taken frequently enough to remove the completed transactions from the transaction log before it fills up; a minimal sketch follows.
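A minimal sketch of switching to full recovery and starting a log-backup chain, assuming the hypothetical DemoDB database and illustrative backup paths; in practice the log backup would run on a frequent schedule:

ALTER DATABASE DemoDB SET RECOVERY FULL;
GO
-- A full backup is required before the first log backup can be taken.
BACKUP DATABASE DemoDB TO DISK = N'D:\Backups\DemoDB_full.bak';
GO
-- Frequent log backups keep the transaction log from filling up and enable point-in-time restore.
BACKUP LOG DemoDB TO DISK = N'D:\Backups\DemoDB_log.trn';
GO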

Database Recovery Model (Continued)

When should you use the full recovery model?
Reasons to select the full database recovery model:
• To support mission-critical applications.
• To support high availability solutions.
• To facilitate the recovery of all the data with zero or minimal data loss.
• If the database is designed with multiple filegroups and you want to perform a piecemeal restore of read/write secondary filegroups and, optionally, read-only filegroups.
• To allow restores to an arbitrary point in time.
• To restore individual pages.
• You can sustain the higher administrative overhead of regular log backups.

Bulk-Logged Recovery Model

The bulk-logged recovery model minimizes transaction log space usage when bulk operations like BULK INSERT, SELECT INTO, or CREATE INDEX are executed. The bulk-logged recovery model functions similarly to the full recovery model, except that bulk operations are minimally logged while they are running. Minimal logging keeps the log smaller by not logging as much information, and it improves the performance of large bulk-load operations by reducing the amount of logging performed.
The bulk-logged recovery model is a great way to minimize transaction log space and improve the performance of large bulk-load operations. But keep in mind that a point-in-time restore cannot be done into a log backup that contains a bulk-logged operation. Therefore, to minimize data loss when using bulk-load operations, take a transaction log backup just before the bulk-load operation and another one right after it completes.

Database Scoped Configuration
Database scoped configuration options were introduced in SQL Server 2016; they give you the ability to control some behaviors that were formerly configured only at the SQL Server instance level. These options include MAXDOP, LEGACY_CARDINALITY_ESTIMATION, PARAMETER_SNIFFING, and QUERY_OPTIMIZER_HOTFIXES. There is also a CLEAR PROCEDURE_CACHE option that lets you clear the entire plan cache for a single database.

Database scoped configuration allows you to override the related server-level default and configure each database with the setting that meets that database's or application's requirements. These configurations can also be isolated at the replica level, so you can configure the primary replica with one setting and a secondary replica that handles another workload type with a different setting.

The sys.database_scoped_configurations catalog view lists all the options available in your version of SQL Server, and ALTER DATABASE SCOPED CONFIGURATION lets you turn the available options ON or OFF.
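A quick, hedged way to see the current values for the database you are connected to is to query the catalog view directly:

-- value applies to the primary replica; value_for_secondary applies to readable secondaries.
SELECT configuration_id, name, value, value_for_secondary
FROM sys.database_scoped_configurations;
GO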

I will not be able to cover all the available options here, but one of my next posts will be dedicated mainly to database scoped configuration. Let us discuss the options below:

• MAXDOP
• LEGACY_CARDINALITY_ESTIMATION
• PARAMETER_SNIFFING
• QUERY_OPTIMIZER_HOTFIXES
• PROCEDURE_CACHE
• OPTIMIZE_FOR_AD_HOC_WORKLOADS
• GLOBAL_TEMPORARY_TABLE_AUTODROP
• ROW_MODE_MEMORY_GRANT_FEEDBACK

Database Scoped Configuration (Continued)

Maximum Degree of Parallelism (MAXDOP)


The degree of parallelism is the number of workers, or processors, assigned to a parallel plan to accomplish the work. By default, SQL Server will use all available CPUs during query execution. This is great for large queries (data warehouse workloads), but it can cause performance problems and limit concurrency. A better approach is to limit parallelism to the number of physical cores in a single CPU socket.

Starting with SQL Server 2016 (13.x), use the following guidelines when you configure the max degree of parallelism server configuration value:

• Server with a single NUMA node and 8 or fewer logical processors: keep MAXDOP at or below the number of logical processors.
• Server with a single NUMA node and more than 8 logical processors: keep MAXDOP at 8.
• Server with multiple NUMA nodes and 16 or fewer logical processors per NUMA node: keep MAXDOP at or below the number of logical processors per NUMA node.
• Server with multiple NUMA nodes and more than 16 logical processors per NUMA node: keep MAXDOP at half the number of logical processors per NUMA node, with a maximum value of 16.

NUMA node in the guidance above refers to soft-NUMA nodes automatically created by SQL Server 2016 (13.x) and higher versions, or hardware-based NUMA nodes if soft-NUMA has been disabled.

These are advanced options and should be changed only by an experienced database administrator or certified SQL Server professional.

The SQL Server instance default for MAXDOP is 0, which may not be appropriate for your workload. You now have the option to override it at either the database level or the instance level; the choice is yours and should be based on your workload requirements.
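A minimal sketch of overriding MAXDOP at the database level; the value 8 is only an illustrative assumption, so choose one that fits the guidance above:

USE DemoDB;  -- DemoDB is a hypothetical database name
GO
ALTER DATABASE SCOPED CONFIGURATION SET MAXDOP = 8;
-- Optionally let readable secondaries inherit the primary's setting.
ALTER DATABASE SCOPED CONFIGURATION FOR SECONDARY SET MAXDOP = PRIMARY;
GO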

Database Scoped Configuration (Continued)

Cost Threshold for Parallelism


Speaking of parallelism, what about the cost threshold for parallelism setting? The default is 5; is that a good number? Not so much. The optimizer uses that cost threshold to decide when it should start evaluating plans that can use multiple threads. Although there is no single right number, 5 is too low a setting for most modern workloads. It may be appropriate for purely OLTP applications, but as soon as you add a bit of complexity, things change.

A best practice is to increase this setting to between 25 and 50 depending on the workload, as in the sketch below. Heavy OLTP can use a lower value, while heavy OLAP will benefit from a higher value.
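Cost threshold for parallelism is an instance-level setting changed with sp_configure; a hedged sketch, where 50 is simply the upper end of the range suggested above:

EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
-- Only queries whose estimated cost exceeds this value are considered for parallel plans.
EXEC sp_configure 'cost threshold for parallelism', 50;
RECONFIGURE;
GO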

Database Scoped Configuration (Continued)

Parameter Sniffing (PARAMETER_SNIFFING)
Parameter sniffing is the process of looking at the first parameter values passed when a stored procedure is compiled, in order to create an execution plan optimized for those values and then reuse it for all subsequent values. The generated execution plan may not be optimal for every parameter value, which can lead to performance problems in some cases.

When the SQL Server database engine compiles a stored procedure, it looks at the parameter values being passed and creates an execution plan based on those parameters. This process of looking at parameter values during compilation is commonly called "parameter sniffing". Parameter sniffing can sometimes lead to inefficient execution plans, especially when a stored procedure is called with parameter values that have very different cardinalities.

The important point for us is that those parameters passed are used to determine how SQL Server will process the query. An optimal execution
plan for one set of parameters might be an index scan operation, whereas another set of parameters might be better resolved using an index
seek operation.

How have we traditionally dealt with parameter sniffing?

There are several ways to deal with parameter sniffing issues. Keep in mind that not every option below is acceptable in every situation.
• OPTION (RECOMPILE)
• OPTION (OPTIMIZE FOR (@variable = value))
• OPTION (OPTIMIZE FOR (@variable UNKNOWN))
• Declare local variables
• Use dynamic SQL
• Create multiple stored procedures

However, with the arrival of database scoped configuration, we can now turn parameter sniffing ON or OFF at the database level. You should understand your workload well before deciding whether to keep parameter sniffing enabled or disable it.

Script to turn parameter sniffing ON or OFF:

ALTER DATABASE SCOPED CONFIGURATION SET PARAMETER_SNIFFING = { ON | OFF };
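A hedged example of disabling it for one database, plus a per-query alternative using a hint; DemoDB, dbo.Orders, and dbo.GetOrdersByCustomer are hypothetical names used only for illustration:

USE DemoDB;
GO
ALTER DATABASE SCOPED CONFIGURATION SET PARAMETER_SNIFFING = OFF;
GO
-- Per-query alternative: compile this statement as if the parameter value were unknown.
CREATE OR ALTER PROCEDURE dbo.GetOrdersByCustomer @CustomerID int
AS
BEGIN
    SELECT OrderID, OrderDate
    FROM dbo.Orders
    WHERE CustomerID = @CustomerID
    OPTION (USE HINT ('DISABLE_PARAMETER_SNIFFING'));
END;
GO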



Database Scoped Configuration (Continued)

GLOBAL_TEMPORARY_TABLE_AUTODROP

Applies to: Azure SQL Database (feature is in public preview)

When GLOBAL_TEMPORARY_TABLE_AUTODROP is set to ON, global temporary tables are automatically dropped when they are no longer in use by any session. When it is set to OFF, global temporary tables must be explicitly dropped with a DROP TABLE statement, or they will be automatically dropped on server restart. The default is ON.

On an Azure SQL Database logical server, this option can be set in the individual user databases of the logical server. In SQL Server and Azure SQL Database Managed Instance, this option is set in tempdb, and the setting of the individual user databases has no effect.

Script to turn GLOBAL_TEMPORARY_TABLE_AUTODROP ON or OFF:

ALTER DATABASE SCOPED CONFIGURATION SET GLOBAL_TEMPORARY_TABLE_AUTODROP = {ON | OFF} ;



Database Scoped Configuration (Continued)

SQL Server uses memory to store in-transit rows for hash join and sort operations. When a query execution plan is compiled for a statement, SQL Server estimates both the minimum memory required for execution and the ideal memory grant size needed to hold all rows in memory. The memory grant size is based on the estimated number of rows for the operator and the associated average row size. If the cardinality estimates are inaccurate, performance can suffer.

SQL Server 2017 brought new query processing methods designed to mitigate cardinality estimation errors in query plans and to adapt plan execution based on execution results. This innovation is called adaptive query processing and consists of three features: 1) batch mode adaptive memory grant feedback; 2) interleaved execution; and 3) adaptive joins.

ROW_MODE_MEMORY_GRANT_FEEDBACK
Applies to: Azure SQL Database and SQL Server 2019 (originally released as a public preview feature)
When the optimizer does not estimate the correct amount of memory for a query, either memory is wasted that could be used by other processes or some operations spill to disk. Microsoft added memory grant feedback to help overcome this issue.
The ROW_MODE_MEMORY_GRANT_FEEDBACK database scoped option allows you to enable or disable row mode memory grant feedback at the database or statement scope while keeping database compatibility level 150 or higher. Row mode memory grant feedback is part of the adaptive query processing family introduced in SQL Server 2019, and it is a nice enhancement for resolving excessive memory grant issues.
Note: memory grant feedback for batch mode has been around for a while, but it was not until the rollout of version 15.x that memory grant feedback became available for row mode queries.
Script to turn ROW_MODE_MEMORY_GRANT_FEEDBACK ON or OFF:
ALTER DATABASE SCOPED CONFIGURATION SET ROW_MODE_MEMORY_GRANT_FEEDBACK = { ON | OFF };
Script to disable ROW_MODE_MEMORY_GRANT_FEEDBACK for a single query using a query hint:
OPTION (USE HINT ('DISABLE_ROW_MODE_MEMORY_GRANT_FEEDBACK'));

Key Takeaways
• A best practice is to increase the cost threshold for parallelism setting to between 25 and 50, depending on the workload. Heavy OLTP can use a lower value, while heavy OLAP will benefit from a higher value.
• The degree of parallelism is the number of workers, or processors, assigned to a parallel plan to accomplish the work.
• Leverage database scoped configuration options to optimize your database application.
• Leverage adaptive query processing if it fits your environment.
• Consider different drives for data files, log files, tempdb, and indexes.
• The database compatibility level, one of the database-level settings, impacts how a database functions.
• Pre-size SQL Server database autogrowth and make sure instant file initialization (IFI) is enabled.
• Choose full recovery when point-in-time recovery matters, bulk-logged when you have a heavy ETL environment, and simple when you want less administration and do not care about the transaction log.

I hope you enjoyed this post! I will keep you posted on my next weekly post.

Your feedback would be greatly appreciated; please comment in the comments section.

This post was written by Jean Joseph, Data Engineer/DBA.
datadrivencommunity.com / bigdatadriven.org
