This chapter is a concise overview of T-SQL’s building blocks. It was written with newcomers to SQL Server in mind, as well as
battleaxes needing either a refresher or a leg up on the new elements of the language. This chapter should also be considered a
“briefing” or, better, an introduction to the extensive T-SQL reference in SQL Server Books Online and to the advanced features
discussed in later chapters.
No matter what API or environment you use, communication between client and server is via T-SQL. Knowing and using T-SQL is
independent of the type of software you are running on the clients, be they fat, thin, thick, or rich. T-SQL is also the language you use
to manage the DBMS, which is discussed in later chapters.
General office applications, line-of-business applications, report generators, SQL Server Management Studio, and so on do not require
a deep knowledge of T-SQL, because the code is usually embedded in the application. However, some applications need to support an
end user’s ability to query SQL Server databases, for instance, to find all accounts 30 days past due. This usually requires the user to
know some menial T-SQL query syntax, but you usually allow your end users to visually construct a SQL query, which creates a
statement under the hood and on the fly, so to speak.
SQL Server tools, such as SQL Server Management Studio and the OSQL utility, require a deep knowledge of T-SQL because they
accept direct T-SQL syntax for transmission to the server. The server returns results directly to these tools, usually in the form of a
result set or as tabulated data displayed in a grid or text window. As you learned in the earlier chapters, SQL Server Management
Studio is the essential tool for building T-SQL statements.
Application developers programming access to SQL Server data need a thorough understanding of T-SQL. They need to know as
much as possible about the language, which requires comprehensive study and a lot of practice (beyond the scope of this book). This
chapter covers the basics, such as syntax style, operators, and data types. Later chapters cover the more complex subject matter: stored
procedure programming, triggers, user-defined functions, and so on.
T-SQL knowledge is essential in Internet applications or services. Even if you are going to latch onto XML, T-SQL is essentially still
the facilitator for XML, especially when it comes to returning result set data as XML documents or inserting XML data into SQL
Server databases as XML documents. T-SQL statements or queries can also be transmitted to SQL Server using URLs. However,
most of the discussion in the next seven chapters relates to native use of T-SQL.
T-SQL is a procedural language with all the gravy and relish you might be accustomed to having in a language. Architecturally
speaking, it can be compared to database programming languages like Clipper and dBase because it comes with all the basic elements
of a programming language: variables, flow-control structures, logic evaluation, function and procedure call capability, and so on.
(Yes, even GOTO lives on here.) That’s the “Transact” or “T” part of the language. However, T-SQL is neither compiled like C nor
interpreted like a p-code language. Rather, it is parsed like a just-in-time script language, and its intent and logic are converted into a
native “sublanguage” that stokes the SQL Server engines.
The SQL in T-SQL supports SQL-92 through SQL-2003 DDL and DML, allowing the wide range of database programmers who are up
to speed on SQL to obtain a broad range of database server functionality, and then some. If you have never studied T-SQL before
reading this book, but you know SQL, then you are certainly not far from being able to create applications that access SQL
Server. Many database programmers coming over from the Access, FoxPro, Delphi, PowerBuilder, and JDBC worlds, for instance,
are usually up to speed with SQL, and so getting up to speed with SQL Server is very easy. And because SQL is so widely used, I
have left the SQL-native facilities like SELECT, UPDATE, INSERT, and JOIN for discussion in later chapters, where it is assumed
you already know how to program in SQL.
T-SQL also provides access to DBMS mechanisms such as stored procedures and triggers. These are not defined by the SQL standard
(which is the base language for all SQL extended DBMS interfaces), although some attempt at adding stored procedure-like facilities
in SQL has been proposed in recent years. But hold your horses; we’ll be getting to the extended stuff like stored procedures in later
chapters.
T-SQL Constants
T-SQL constants are literal or scalar values that represent a data type. The following constants are supported by the language (the data
types are discussed later in this chapter):
Character strings
Unicode strings
Binary constants
Bit constants
Datetime constants
Integer constants
Decimal constants
Float and real constants
Money constants
Uniqueidentifier constants
Character string constants are surrounded by single quotation marks and can include alphanumeric characters (a–z, A–Z, and 0–9) and
additional characters, such as the exclamation point (!), at sign (@), and pound sign (#). The bounding single quotation marks are the default
delimiter recognized by SQL Server. Setting the QUOTED_IDENTIFIER option for a connection to OFF can, however, change this,
if using single quotation marks causes problems in your development environment, where strings are usually bounded by single quote
marks, as in Visual Basic 2005 (VB), or by double quotes, as in C#.
The OLE DB drivers automatically set QUOTED_IDENTIFIER to ON upon connection, and often an apostrophe can trash an
application because SQL Server raises hell when it sees the apostrophe and thinks it’s an identifier. In this case, the “official” solution
is to double the embedded quote, so that a string like St. Elmo’s Fire is sent to the server as 'St. Elmo''s Fire'.
Asking your end users to do that, however, is a cockamamie solution, to say the least, because it is unacceptable for your data entry
people to have to remember to type an apostrophe twice. If you have this problem, and you most likely do, you can use a function like
REPLACE(), which is a VB function (and there are equivalent functions in all languages), to add the second quote mark under the
“sheets.” You could also use a data-bound “text” control (which I am not fond of) to make the necessary adjustments automatically.
Also, if the QUOTED_IDENTIFIER option has been set OFF for a connection, character strings can also be enclosed in double
quotation marks, but the OLE DB provider and ODBC driver automatically use SET QUOTED_IDENTIFIER ON when they connect
to SQL Server. The use of single quotation marks is, however, recommended.
If a character string enclosed in single quotation marks contains an embedded quotation mark, represent the embedded single
quotation mark with two single quotation marks. This is not necessary in strings embedded in double quotation marks.
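The following sketch (my example, using arbitrary literals) shows both techniques:

SELECT 'St. Elmo''s Fire'    -- doubled quotation mark; returns St. Elmo's Fire

SET QUOTED_IDENTIFIER OFF
SELECT "St. Elmo's Fire"     -- double-quoted string; legal only while the option is OFF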
Collations and code pages are also important considerations when it comes to strings. The character string constants are assigned the
default collation of the current database attached to in the connection. However, you can use the COLLATE clause (discussed a little
later in this chapter) to specify a different collation. The character strings you enter at the client usually conform to the code page of
the computer. They are translated to the database code page, if necessary, upon transmission to the server.
Empty strings are represented as two single quotation marks with nothing in between. However, if you are working in database
compatibility mode 6.x, an empty string is treated as a single space.
Unicode strings have a format very similar to character strings, but they are preceded by the N identifier. The N
stands for National Language in the SQL standard. Usage requires that the N prefix be uppercase. In the following example, “Jeffrey”
is the character constant, but in order to provide a Unicode constant, I would have to write N'Jeffrey'.
Unicode constants are interpreted as Unicode data. They are not evaluated using a code page, but they do have a collation, which
primarily controls comparisons and case sensitivity. When you use the Unicode constant, you are assigned the default collation of the
database you are connected to. But you can change this with the COLLATE clause to specify a collation. (See “Nchar and Nvarchar”
later in this chapter.) SQL Server Unicode strings support the enhanced collations available in SQL Server 2005.
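For example, the following sketch contrasts the two constants:

SELECT 'Jeffrey'     -- character constant, evaluated against a code page
SELECT N'Jeffrey'    -- Unicode constant, evaluated against a collation only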
Tip Consider replacing all char, varchar, and text data types with their Unicode equivalents. This will help you avoid code page
conversion issues.
Binary Constants
Binary constants are identified with the prefix 0x (0x on its own is an empty binary string) and are strings composed of hexadecimal
numbers. They are not enclosed in quotation marks.
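For example (arbitrary values of my own):

SELECT 0xAE     -- a binary constant
SELECT 0x12Ef   -- hexadecimal digits are not case sensitive
SELECT 0x       -- an empty binary string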
Bit Constants
The number zero or one represents a bit constant. These do not get enclosed in quotation marks. If you use a number larger than 1,
SQL Server converts it to 1.
Datetime Constants
You can use the datetime constants as character date values, in specific formats. They are enclosed in single quotation marks as
follows:
'October 9, 1959'
'9 October, 1959'
'591009'
'10/09/59'
The formats for the datetime constants are discussed later in this chapter.
Integer Constants
The integer constants are represented by strings of numbers and must be whole numbers. They do not get enclosed in quotation marks
like strings and cannot contain decimal points. Integer constants are illustrated as follows:
2006
6
Decimal Constants
The decimal constants are represented by strings of numbers that are not enclosed in quotation marks but can contain a decimal point.
The following examples represent decimal constants:
146.987
5.1
Float and Real Constants
The float and real constants are represented using scientific notation (see “SQL Server Data Types” later in this chapter). They are not
enclosed in single quotes and appear as follows:
101.5E5
2E+100
Money Constants
The money constants are represented as strings of numbers. They can be whole numbers, or they can include the optional decimal
point. You can also use a currency symbol as a prefix. They are not enclosed in quotation marks. Examples of money constants are as
follows:
1200.08
$500.00
R35.05
Uniqueidentifier Constants
The uniqueidentifier is a string that represents a globally unique identifier (GUID), pronounced “gwid” or often “goo ID.” These
constants can be specified in either character or binary string notation. The following examples show the two notations:
'82B7A80F-0BD5-4343-879D-C6DDDCF4CF16'
0xFE4B4D38D5539C45852DD4FB4C687E47
You can use either notation, but the character string requires single quotes, as demonstrated here.
Signing Constants
To sign a numeric constant, apply the + or − unary operator to it. The default sign is positive if no operator is applied. The
following examples are signed:
+$500.00
−2001
T-SQL Expressions
An expression is a syntactical element or clause composed of identifiers, operators, and values that can be evaluated to obtain a result.
Like a sentence, which needs a subject, verb, and object to convey an action, an expression must be logically complete before it can
compute. In other words, the elements of an expression must “add up.” In general programming environments, an expression
always evaluates to a single result. Transact-SQL expressions, however, are evaluated individually for each row in the result set. In
other words, a single expression may have a different value in each row of the result set, but each row has only one value for the
expression.
A function, such as DB_ID(), is an expression because it computes to return a value that represents a database ID.
CASE, NULLIF, and COALESCE are expressions (discussed a little later in this chapter).
These elements (a constant, a variable, a column name, a scalar function, or a CASE, NULLIF, or COALESCE expression) are known
as simple expressions. When you combine two or more simple expressions with operators, you get a complex expression. A good
example of a complex expression is your average SELECT statement. For example, SELECT * FROM CITY is a complex or
compound expression because the statement can return the name of a city for each row in the table.
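The distinction is easy to see in a sketch (CITY is the same hypothetical table used above):

SELECT DB_ID()        -- a simple expression: one function call, one value
SELECT * FROM CITY    -- a compound expression, evaluated for every row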
It is possible to combine expressions using an operator, but only if the expressions have data types that are supported by the operator.
In addition, the following rules also apply:
A data type that has a lower precedence can be implicitly converted to the data type with the higher data type precedence.
Using the CAST function, you are able to explicitly convert the data type with the lower precedence to the data type with the
higher precedence. Alternatively, you should be able to use CAST to convert the source data type to an intermediate data type,
and then convert the intermediate data type to the data type with the higher precedence.
If you are unable to perform either an implicit or explicit conversion, then you cannot combine the two expressions to form a
compound expression.
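A short sketch (arbitrary values of my own) illustrates both the implicit and the explicit paths:

DECLARE @Amount int
SET @Amount = 100
SELECT @Amount + 1.5                               -- implicit: int converts up to decimal
SELECT 'Amount: ' + CAST(@Amount AS varchar(10))   -- explicit: CAST avoids a conversion error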
Expression Results
In addition to the preceding rules, the following also applies to SQL Server expressions:
When you create a simple expression comprising a single variable, a constant, a scalar function, or a column name, the data
type, the collation, the precision and scale, and the value of the expression are the data type, collation, precision, scale, and
value of the referenced element.
When you combine two expressions with comparison or logical operators, the resulting data type is Boolean and the value is
one of TRUE, FALSE, or UNKNOWN. (See “Comparison Operators” in the next section).
When you combine two expressions with arithmetic, bitwise, or string operators, the operator determines the resulting data type.
When you create compound expressions, comprising many operators, the data type, collation, precision, and value of the
resulting expression are determined by combining the component expressions, two at a time, until a final result is reached. The
sequence in which the expressions are combined is defined by the precedence of the operators in the expression.
T-SQL Operators
T-SQL supports several operators that can be used to specify actions that are performed on one or more expressions. The following is
a list of the operators that are supported in SQL Server 2005:
Arithmetic operators
Assignment operators
Bitwise operators
Comparison operators
Logical operators
String concatenation operator
Unary operators
Arithmetic Operators
These operators are used to perform mathematical operations on two expressions of any numeric data type. Table 10–1 lists the
arithmetic operators. As indicated in the table, the + and − operators can also be used with the date data types discussed later in this
chapter.
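For example (a quick sketch), adding an integer to a datetime value advances the date by that number of days:

SELECT GETDATE() + 7    -- the date and time seven days from now
SELECT 17 % 5           -- modulo: returns 2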
Assignment Operator
As in most programming languages, there is a single assignment operator. In T-SQL it is the equal sign. (This is unfortunate for
experts in other languages where the equal sign is used only to equate one expression [comparison] with another; Delphi, for instance,
uses the colon-equal [:=] for assignment.) In the following example, a simple use of T-SQL demonstrates assigning a numeric value to
a variable:
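-- (reconstructed example; the original was lost in conversion)
DECLARE @TheNumber int
SET @TheNumber = 29    -- the equal sign assigns the value to the variable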
You can also use the assignment operator to assign a string to provide a name for a column heading when you display a result set. The
equal sign is also a T-SQL comparison operator.
Bitwise Operators
T-SQL provides bitwise operators that you can use within T-SQL statements to manipulate the bits between two expressions of any
integer or binary string-based data types (except image). Also, operands cannot both be of the binary string data type. Table 10–2 lists
the bitwise operators and their purposes. It also lists the Bitwise NOT (~) operator, which applies to one operand. (See also the unary
operators discussed later in this section.)
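A short sketch (my values) shows the operators from Table 10–2 at work:

DECLARE @a int, @b int
SET @a = 5          -- binary 0101
SET @b = 3          -- binary 0011
SELECT @a & @b      -- Bitwise AND: returns 1 (0001)
SELECT @a | @b      -- Bitwise OR: returns 7 (0111)
SELECT @a ^ @b      -- Bitwise Exclusive OR: returns 6 (0110)
SELECT ~@a          -- Bitwise NOT: returns -6 (two's complement)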
Comparison Operators
The comparison operators test equality between two expressions and are often used in WHERE clauses to test for a column value.
They can be used on all expressions except expressions of the text and image data types. Table 10–4 lists the comparison operators in
SQL Server and their functions.
The result of a comparison expression is a Boolean data type with a value of TRUE, FALSE, or UNKNOWN. It is
also important to take into consideration that when SET ANSI_NULLS is ON, an operator between two NULL expressions returns
UNKNOWN. If you switch SET ANSI_NULLS to OFF, the equal operator returns TRUE when comparing two NULLs.
You can also use the AND keyword to combine multiple comparison expressions like the following:
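-- (reconstructed example; the original statement was lost in conversion)
USE NORTHWIND
SELECT CompanyName FROM Customers
WHERE Country = 'UK' AND City = 'London'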
Note You should also be aware that comparisons may be affected by the collations you are using.
Logical Operators
The logical operators test for the truth of some expression. Like comparison operators, they also return a Boolean data type with a
value of TRUE or FALSE. These operators, listed in Table 10–5, are used extensively in queries and are most common in WHERE
clauses. For example, the statement
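-- (a sketch; the original statement was lost in conversion, and the
-- LastName column is illustrative)
SELECT * FROM Customers WHERE LastName LIKE 'Shapiro'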
returns rows where the entire value (the operator tests the whole value) looks like “Shapiro.”
Table 10–5: Logical Operators
Operator Purpose
ALL TRUE if all of a set of comparisons are TRUE.
AND TRUE if both Boolean expressions are TRUE.
ANY TRUE if any one of a set of comparisons are TRUE.
BETWEEN TRUE if the operand is within a range.
EXISTS TRUE if a subquery contains any rows.
IN TRUE if the operand is equal to one of a list of expressions.
LIKE TRUE if the operand matches a pattern.
NOT Reverses the value of any other Boolean expression.
OR TRUE if either Boolean expression is TRUE.
SOME TRUE if some of a set of comparisons are TRUE.
String Concatenation Operator
The string concatenation operator is the addition sign (+), which is used to concatenate one substring to another to create a third
derivative string. In other words, the expression 'the small bro' + 'wn fox' evaluates to “the small brown fox.” However, be aware
that concatenation behavior can vary from database compatibility level to compatibility level.
A version 6.5 database (the database compatibility levels are referenced as 65, 70, 80, or 90) treats an empty constant as a single blank
character. For example, if you run 'the small bro' + '' + 'wn fox' on a 65 database, you’ll end up with the following being stored: “the
small bro wn fox.” (See also the string manipulation functions discussed later in this chapter and also the discussion of collation
precedence.)
Unary Operators
The unary operators perform an operation on only one expression of any numeric data type. Table 10–6 lists the unary operators (see
the bitwise operators discussed earlier in this section).
Operator Precedence
As in all modern programming languages, T-SQL operators are governed according to rules of precedence. An operator of a higher
level is evaluated before an operator of a lower level. In compound expressions, the operator precedence determines the order of
operations.
The order of execution or computation can significantly influence the result. The following list is in order of precedence, from highest
to lowest:
~ (Bitwise NOT)
* (Multiply), / (Divide), % (Modulo)
+ (Positive), − (Negative), + (Add), + (Concatenate), − (Subtract), & (Bitwise AND)
=, >, <, >=, <=, <>, !=, !>, !< (Comparison operators)
^ (Bitwise Exclusive OR), | (Bitwise OR)
NOT
AND
ALL, ANY, BETWEEN, IN, LIKE, OR, SOME
= (Assignment)
Operators used in an expression that have the same operator precedence level are evaluated according to their position in the
expression from left to right. For example, in the expression used in the SET statement of this example, the subtraction operator is
evaluated before the addition operator.
SET @TheNumber = 3 - 3 + 9
You can also use parentheses to override the defined precedence of the operators in an expression. The expression within the
parentheses is evaluated first to obtain a single value. You can then use the value outside of the parentheses.
5 * (3 + 2)
is the same as
5 * 5
In expressions that contain expressions in parentheses (nesting), the deepest expression is evaluated first.
Often it becomes necessary to convert a constant or variable of one data type to another, and you would use the CONVERT() function
described later in this chapter to do this. However, when you combine two data types with an operator, a data type precedence rule
decides which data type gets converted to the data type of the other.
The data type precedence rule dictates that if an implicit conversion is supported (you do not require the use of the conversion
function), the data type that has the lower precedence is converted to the data type with the higher precedence. Table 10–7 lists the
base data types in order of precedence; the first entry in the table has the highest order, and the last entry has the lowest order.
Naturally, when both data types combined by the operator are of the same precedence, no precedence ruling is required.
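For example (a minimal sketch), when int meets decimal, the int operand is implicitly converted because decimal has the higher
precedence:

SELECT 5 / 2      -- both operands int: integer division returns 2
SELECT 5 / 2.0    -- int converts to decimal: returns 2.500000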
SQL Server Data Types
The following sections explain the data types. This discussion is not in any order of preference or precedence, as discussed previously.
Integer Data Types
bigint Integer (whole number) data from −2^63 (−9,223,372,036,854,775,808) through 2^63−1 (9,223,372,036,854,775,807). The
storage size is 8 bytes. Use bigint for large numbers that exceed the range of int. This integer costs more in terms of storage
footprint. Your functions will return this data type only if the argument passed is a bigint data type. The smaller integer types
are not automatically converted to bigint.
int Integer, a whole number, data from −2^31 (−2,147,483,648) through 2^31−1 (2,147,483,647). The storage size is 4 bytes.
This integer should suffice for most needs and remains the primary integer type in use on SQL Server.
smallint Integer data from −2^15 (−32,768) through 2^15–1 (32,767). The storage size is 2 bytes.
bit This is an integer data type that takes 1, 0, or NULL. You can create columns of type bit, but they cannot be indexed. Also,
if there are 8 or fewer bit columns in a table, the columns are stored as 1 byte by SQL Server; if there are from 9 through 16
bit columns, they are stored as 2 bytes; and so on. This is a SQL Server conservation feature at work.
Decimal and Numeric
The decimal[(p[, s])] and numeric[(p[, s])] types are data types with fixed precision and scale (p = precision and s = scale), as listed in
Table 10–8. When maximum precision is used, the valid values are from −10^38+1 through 10^38−1.
The precision (p) specifies the maximum total number of decimal digits that can be stored, both to the left and to the right of the
decimal point. The precision must be a value from 1 through the maximum precision. The maximum precision is 38. The scale (s)
specifies the maximum number of decimal digits that can be stored to the right of the decimal point. Scale must be a value from 0
through p. The default scale is 0; therefore, 0 <= s <= p. The maximum storage sizes vary according to the precision.
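For example (arbitrary precision and scale of my choosing):

DECLARE @Price decimal(9, 2)    -- up to 9 digits in total, 2 to the right of the point
SET @Price = 1234567.89
SELECT @Price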
Monetary Data Types
The monetary data types are used for representing monetary or currency values, as follows:
Money Values from −2^63 (−922,337,203,685,477.5808) through 2^63−1 (+922,337,203,685,477.5807). This type has
accuracy to one ten-thousandth of a monetary unit. The storage footprint is 8 bytes.
Smallmoney Values from −214,748.3648 through +214,748.3647. This type has accuracy to one ten-thousandth of a monetary
unit. The storage footprint is 4 bytes.
Float and Real
These are the approximate-number data types you use with floating-point numeric data. Floating-point data is approximate;
not all values in the data type range can be precisely represented.
float[(n)] This is for floating-point number data from −1.79E+308 through 1.79E+308. The n parameter is the number of bits
used to store the mantissa of the float number in scientific notation and thus dictates the precision and storage size; it must
be a value in the range 1–53. The float[(n)] data type conforms to the SQL-92 standard for all values of n in the range 1–53. Table
10–9 represents values for n and the corresponding precision and memory costs.
The real is a floating-point number data type that ranges from −3.40E+38 through 3.40E+38. The footprint of real is 4 bytes.
Datetime and Smalldatetime
The datetime data type represents dates and times from January 1, 1753, through December 31, 9999, to an accuracy of one
three-hundredth of a second (equivalent to 3.33 milliseconds or 0.00333 seconds). These values are rounded to increments of .000,
.003, or .007 seconds. The footprint for datetime is two four-byte integers. The first four bytes store the number of days before or after
the base date, January 1, 1900. The base date is the system reference date (values for datetime earlier than January 1, 1753, are not
permitted). The remaining four bytes store the time of day, represented as the number of milliseconds after midnight.
The smalldatetime type from January 1, 1900, through June 6, 2079, has accuracy to the minute. With smalldatetime, values with
29.998 seconds or lower are rounded down to the nearest minute. Values with 29.999 seconds or higher are rounded up to the nearest
minute.
The smalldatetime data type stores dates and times of day with less precision than datetime. SQL Server stores smalldatetime values
as two two-byte integers, as opposed to two four-byte values in datetime. The first two bytes store the number of days after January 1,
1900. The remaining two bytes store the number of minutes since midnight.
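The rounding behavior is easy to demonstrate (arbitrary date values of my own):

SELECT CAST('2006-05-08 12:35:29.998' AS smalldatetime)    -- rounds down to 12:35:00
SELECT CAST('2006-05-08 12:35:29.999' AS smalldatetime)    -- rounds up to 12:36:00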
Char and Varchar
char[(n)] Fixed-length non-Unicode character data with a length of n bytes, where n must be a value from 1 through 8,000.
Storage size is n bytes. The SQL-92 synonym for char is character.
varchar[(n | MAX)] Variable-length non-Unicode character data with a length of n bytes, where n must be a value from 1
through 8,000. Storage size is the actual length in bytes of the data entered, not n bytes. The data entered can be 0 characters in
length. The SQL-2003 synonyms for varchar are char varying or character varying. MAX indicates that the maximum storage
size is 2^31–1 bytes. The storage size is the actual length of data entered plus two bytes.
When n is not specified in a data definition or variable declaration statement, the default length is 1. When n is not specified with the
CAST function, the default length is 30. Objects using char or varchar are assigned the default collation of the database, unless a
specific collation is assigned using the COLLATE clause. The collation controls the code page used to store the character data.
Sites supporting multiple languages should consider using the Unicode nchar or nvarchar data types to minimize character conversion
issues. If you use char or varchar:
Use char when the data values in a column are expected to be consistently close to the same size.
Use varchar when the data values in a column are expected to vary considerably in size.
Use varchar(max) when the data values in a column vary considerably, and the size might exceed 8,000 bytes.
If SET ANSI_PADDING is OFF when either CREATE TABLE or ALTER TABLE is executed, a char column that is defined as
NULL is handled as varchar.
When collation code pages use double-byte characters, the storage size is still n bytes. Also, with character strings, the storage size of
n bytes can be less than n characters.
Variable Usage Variables of these types are used to store non-Unicode characters. The first data type is a fixed-length char variable
that is padded with spaces to the length specified in n. The second stores data of variable length and is not padded. You can use
either variable for storing strings that do not exceed 8,000 characters. Both will truncate strings that exceed the
number of characters declared in n.
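The padding difference is easy to see in a sketch (my variable names):

DECLARE @Fixed char(10), @Variable varchar(10)
SET @Fixed = 'abc'       -- padded with spaces to 10 characters
SET @Variable = 'abc'    -- stored as entered
SELECT DATALENGTH(@Fixed), DATALENGTH(@Variable)    -- returns 10 and 3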
Nchar and Nvarchar
Nchar and nvarchar represent Unicode character data types that are either fixed-length (nchar) or variable-length (nvarchar). They
make use of the Unicode UCS-2 character set.
nchar(n) This is a fixed-length Unicode character data type comprising n characters; n must be a value in
the range 1–4,000. The storage size of the string is two times n bytes. The SQL-92 synonym for nchar is
national char or national character. Best practice suggests using nchar when the data entries in a column are expected to be
consistently in the same size range.
nvarchar(n | MAX) This is a variable-length Unicode character data type comprising n characters; n must be a value in the
range 1–4,000. MAX indicates that the maximum storage size is 2^31−1 bytes. The storage size of
the string, in bytes, is two times the number of characters entered, and you are not required to actually enter characters. The
SQL-2003 synonyms for nvarchar are national char varying and national character varying. Best practice suggests using the
nvarchar type when the sizes of the data entries in a column are expected to vary considerably.
If you do not specify n in a data definition or a variable declaration statement, the length defaults to 1. When n is not specified with
the CAST function, the default length is 30. When you use nchar or nvarchar without specifying a collation in the COLLATE clause,
the default database collation is used. Also, SET ANSI_PADDING OFF has no effect on these data types; SET ANSI_PADDING is
always ON.
Binary and Varbinary
These are binary data types that are either fixed-length, which is the binary type, or variable-length, which is represented by the
varbinary data type.
binary[(n)] A fixed-length binary data type of n bytes. The value for n must be from 1 through 8,000, and the storage size is n
bytes. You would use binary when column data size remains constant.
varbinary[(n | MAX)] This is the variable-length binary data type consisting of n bytes. The value for n must be a value from
1 through 8,000; however, the storage size is the actual length of the data entered plus 2 bytes, not the value for n bytes. MAX
indicates that the maximum storage size is 2^31−1 bytes. The varbinary type can cater to a 0-byte length. The SQL-2003
synonym for varbinary is binary varying; you use this data type to hold values that vary in size.
Note If you do not specify n in a data definition or variable declaration statement, the default length is 1. When n is not specified with
the CAST function, however, the default length of the data type is 30.
Text, Ntext, and Image
These are fixed and variable-length data types for storing large non-Unicode and Unicode character and binary data. Unicode data
uses the Unicode UCS-2 character set. These types are being phased out and will be removed from a future version of SQL Server.
The ntext type is variable-length Unicode data with a maximum length of 2^30–1 (1,073,741,823) characters. The storage size, in
bytes, is twice the number of characters entered. The SQL-2003 synonym for ntext is national text.
The text type is variable-length non-Unicode data in the code page of the server, with a maximum length of 2^31−1
(2,147,483,647) characters. When the server code page uses double-byte characters, the storage is still 2,147,483,647 bytes.
Depending on the character string, the storage size may be less than 2,147,483,647 bytes.
The image type is variable-length binary data from 0 through 2^31–1 (2,147,483,647) bytes.
Cursor
The cursor is a data type for variables or stored procedure OUTPUT parameters that contain a reference to a cursor. Any variables
created with the cursor data type are nullable. You should also take note that the cursor data type cannot be used for a column in a
CREATE TABLE statement.
The operations that can reference variables and parameters having a cursor data type are the DECLARE @local_variable and SET
@local_variable statements; the OPEN, FETCH, CLOSE, and DEALLOCATE cursor statements; stored procedure OUTPUT
parameters; and the CURSOR_STATUS function.
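A minimal sketch (using the Northwind Customers table referenced elsewhere in this chapter):

DECLARE @cur CURSOR
SET @cur = CURSOR FOR SELECT CompanyName FROM Customers
OPEN @cur
FETCH NEXT FROM @cur
CLOSE @cur
DEALLOCATE @cur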
Sql_Variant
The sql_variant is a SQL Server data type that stores the values of all standard SQL Server data types, except the data types
containing large objects (LOBs), such as the “MAX” types, text, ntext, and image, and the data types timestamp and sql_variant
(itself). This type may be used in columns, parameters, and variables, as well as in return values of user-defined functions. The
following rules apply to this data type:
A column of type sql_variant may contain the values of several different data types. For example, a column defined as
sql_variant can hold int, binary, and char values.
A sql_variant data type must first be cast to its base data type value before participating in operations such as addition and
subtraction. You may also assign it a default value. This data may also hold NULL as its underlying value. NULL values,
however, will not have an associated base type, but that will change when you replace the NULL with data.
You can use the sql_variant in columns that have been defined as UNIQUE, primary, or foreign keys. However, the total length
of the data values composing the key of a given row should not be greater than the maximum length of an index, which is
currently 900 bytes.
ODBC does not support sql_variant because it has no facility to cater to the notion of a variant data type. You should check out
the specifics of the limitations in the SQL Server documentation. For example, queries of sql_variant columns are returned as
binary data when using the Microsoft OLE DB Provider for ODBC.
Precedence of sql_variant values goes according to the rules of precedence for the base data types they represent. For example,
when you compare two values of sql_variant and the base data types are in different data type families (say int and bigint), the
value whose data type family is higher in the precedence hierarchy is considered the higher of the two values.
The precedence rule discussed previously applies to conversion as well. In other words, when sql_variant values of different
base data types are compared, the value of the base data type that is lower in the hierarchy chart is implicitly converted to the
other data type before comparison is made.
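A short sketch (my values) shows the base type tracking the assigned value and the cast required before string operations:

DECLARE @var sql_variant
SET @var = 46
SELECT SQL_VARIANT_PROPERTY(@var, 'BaseType')    -- returns int
SET @var = 'forty-six'
SELECT SQL_VARIANT_PROPERTY(@var, 'BaseType')    -- returns varchar
SELECT CAST(@var AS varchar(20)) + ' dollars'    -- cast first, then concatenate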
Table
The table data type (introduced in SQL Server 2000) can be used to store a result set for later processing. Its primary use is for
temporary storage of a set of rows. Use DECLARE @local_variable to declare variables of type table. The syntax for the table type is as
follows:
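-- (reconstructed syntax sketch; see Books Online for the full grammar)
DECLARE @local_variable TABLE
    ( { column_definition | table_constraint } [ ,...n ] )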
The parameters being passed to create a variable of type table make up the same subset of information used to create a persistent table
object in CREATE TABLE. The table declaration includes column definitions, names, data types, and constraints. Note that the only
constraint types allowed are PRIMARY KEY, UNIQUE, and NULL.
Functions and variables can be declared to be of type table, and the variables can be used in functions, stored procedures, and batches.
A table variable also behaves like a local variable. It has a well-defined scope, which is the function, stored procedure, or batch in
which it is declared.
Within its scope, a table variable may be used like a regular table. It may be applied anywhere a table or table-expression is used in
SELECT, INSERT, UPDATE, and DELETE statements.
Collations also need to be taken into consideration when creating this variable (see Chapter 4). The type also cannot be used with
SELECT ... INTO or INSERT INTO ... EXEC, and you cannot assign one table variable to another. Bear in mind that because the table
type is not a persistent table in the database per se, it is unaffected by any transaction rollbacks.
Timestamp
This data type exposes automatically generated binary numbers that are guaranteed to be unique within a database. Timestamp is used
typically as a mechanism for version-stamping table rows. The storage footprint is eight bytes.
The Transact-SQL timestamp data type is not the same as the timestamp data type defined in the SQL-92 standard. The SQL-92
timestamp data type is equivalent to the Transact-SQL datetime data type.
SQL Server provides a rowversion synonym for the timestamp data type. Use rowversion instead of timestamp wherever
possible in DDL statements. This practice will ease migration to a future release of SQL Server, in which rowversion is expected to
be introduced as a distinct data type.
In a CREATE TABLE or ALTER TABLE statement, you do not have to supply a column name for the timestamp data type. For
example, the statement
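-- (reconstructed example; the original was lost in conversion)
CREATE TABLE ExampleTable (PriKey int PRIMARY KEY, timestamp)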
is devoid of a column name, so SQL Server will generate a column name of timestamp. The rowversion data type synonym does not
follow this behavior. You must supply a column name when you use the rowversion synonym.
Naturally a table can have only one timestamp column. The value in the timestamp column is updated every time a row containing a
timestamp column is inserted or updated. This property makes a timestamp column a poor candidate for keys, especially primary
keys. Any update made to the row changes the timestamp value, thereby changing the key value. If the column is in a primary key, the
old key value is no longer valid, and foreign keys referencing the old value are no longer valid. If the table is referenced in a dynamic
cursor, all updates change the positions of the rows in the cursor. If the column is in an index key, all updates to the data row also
generate updates of the index.
A nonnullable timestamp column is semantically equivalent to a binary(8) column. A nullable timestamp column is semantically
equivalent to a varbinary(8) column.
Uniqueidentifier
This data type represents the globally unique identifier (GUID). A column or local variable of uniqueidentifier data type can be
initialized to a value in two ways:
Using the NEWID() function.
Converting from a string constant in the form xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx, in which each x is a
hexadecimal digit in the range 0–9 or a–f. For example, '82B7A80F-0BD5-4343-879D-C6DDDCF4CF16' is a valid
uniqueidentifier value.
The comparison operators can be used with uniqueidentifier values. However, ordering is not implemented by comparing the bit
patterns of the two values. The only operations that are allowed against a uniqueidentifier value are comparisons (=, <>, <, >, <=, >=)
and checking for NULL (IS NULL and IS NOT NULL). No other arithmetic operators are allowed. All column constraints and
properties except IDENTITY are allowed on the uniqueidentifier data type. (See Chapters 13 and 14 for examples and tips using
uniqueidentifier.)
XML
XML is a data type that lets you store XML data in a column, or a variable of xml type. The stored representation of xml data type
instances cannot exceed 2 gigabytes (GB) in size. The T-SQL syntax is as follows:
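-- (reconstructed syntax sketch)
xml [ ( [ CONTENT | DOCUMENT ] xml_schema_collection ) ]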
The CONTENT keyword, the default, restricts the xml instance to a well-formed XML fragment. The XML data can contain
zero or more elements at the top level, and text nodes are also allowed at the top level.
The DOCUMENT keyword restricts the xml instance to a well-formed XML document. The XML data must have one and only
one root element, and text nodes are not allowed at the top level. The xml_schema_collection argument is the name of an XML schema
collection; to create a typed xml column or variable, you can optionally specify the XML schema collection name.
Here is an example:
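-- (a sketch of my own; MySchemaCollection is a hypothetical schema collection)
DECLARE @doc xml
SET @doc = '<order id="1"><item>widget</item></order>'    -- untyped, CONTENT by default

CREATE TABLE OrderDocs (doc xml(DOCUMENT MySchemaCollection))    -- typed; one root element required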
Collation Precedence
The character string data types, char, varchar, text, nchar, nvarchar, and ntext are also governed by collation precedence rules. These
rules determine the following:
The collation of the final result, the returned character string expression.
The collation used by collation-sensitive operators that use character string arguments but do not return character strings.
Operators such as LIKE and IN are examples.
Data Type Synonyms
SQL Server 2005 provides data type synonym support for SQL-92 compatibility. Table 10–10 lists the SQL-92 types and the SQL
Server 2005 synonyms.
These data type synonyms can be used in place of the corresponding base data type names in data definition language (DDL)
statements, such as CREATE TABLE, CREATE PROCEDURE, or DECLARE @variable. The synonym has no use after the object
is created because SQL Server references the base data type and has no notion of the high-level label.
This behavior also applies to metadata operations, such as sp_help and other system stored procedures, the information schema views,
or the various data access API metadata operations that report the data types of table or result set columns.
The data type synonyms are expressed only in T-SQL statements. There is no support for them in the graphical administration
utilities, such as SQL Server Management Studio. The following code demonstrates the creation of a table specifying national
character varying:
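-- (reconstructed from the description that follows)
CREATE TABLE ExampleTable (A_Varcharcolumn national character varying(10))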
The column A_Varcharcolumn is actually assigned an nvarchar(10) data type. It is referenced in the catalog as an nvarchar(10)
column, not according to the synonym supplied on creation of the object. In other words, metadata does not represent it as a national
character varying(10) column.
T-SQL Variables
One of the first big surprises to befall an experienced Visual Basic or C# programmer is that T-SQL variables must be explicitly
declared in a T-SQL module before they can be used. This is a requirement of T-SQL that makes it similar to a .NET language,
Delphi, or Java when dealing with private or local variables that are not exposed to the other modules.
T-SQL variables are declared using the DECLARE statement. T-SQL identifies a variable with the at sign (@) character, as
follows:
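-- (reconstructed syntax sketch)
DECLARE @variable_name data_type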
Note in this syntax that you cannot declare and use the variable without declaring the variable data type. For example,
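-- (reconstructed from the description that follows)
DECLARE @Var1 char(10)
DECLARE @Var2 varchar(20)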
This code declares two variables, the first a char string 10 characters long and the second a varchar 20 characters long. In Chapter 2, I
listed the built-in data types.
T-SQL Functions
Past versions of SQL Server included numerous constructions and elements that were provided to obtain certain programming results
and values, process the various data types, and implement various operations and conditions in the DBMS. These were persistent and
were created to obviate the need to repeatedly recode such constructions. These elements, and many new ones, have now been
brought together in a unified function collection in SQL Server 2005.
Note Several administrative functions are included in the SQL Server function arsenal; see the appendix.
Although they are built into SQL Server, functions can be referenced from the outside, in T-SQL statements passed through the APIs
and through profilers and query tools, like SQL Server Management Studio and the OSQL utility. Functions in SQL Server 2005 also
replace the so-called “global variables” that were prefixed with the double at sign (like @@CONNECTIONS).
SQL Server functions are true to the definition of “function” in that they can take a number of input values and return scalar values or
result sets to the calling process. They can also take no argument at all and return a specific predefined or predetermined value. For
example, used with the SELECT statement, the function DB_NAME() returns a value that represents the current database you are
connected to. The function GETDATE() returns the current date and time. And NEWID() takes “nothing” as an argument and
returns a system-generated GUID.
SQL Server 2005 also lets you create your own user-defined functions. But before we get to the user-defined functions, let’s first
investigate the myriad built-in functions, many of which you will want to start using the moment you put this book down.
Determinism of Functions
SQL Server functions can be deterministic or nondeterministic. A function that always returns the same result any time it is called
with a specific set of input values is deterministic; a function that can return different results each time it is called with the same set
of input values is nondeterministic. This is known as the determinism of the function.
For example, a function like DATEADD is deterministic because it always returns the same result for any given set of argument
values for the parameters passed to it. GETDATE, on the other hand, is not deterministic. It is always invoked with the same
argument, but the return value, the current date and time, changes with each call of the function.
Function determinism was introduced in SQL Server 2000, and thus nondeterministic functions are constrained by the following
usage rules:
You cannot create an index on a computed column if the computed_column_expression references any nondeterministic
functions.
You cannot create a clustered index on a view if the view also references nondeterministic functions.
SQL Server’s built-in functions are either deterministic or nondeterministic according to how the function is implemented, and you
cannot change their determinism. The aggregate and string built-in functions are deterministic, with the exception of CHARINDEX
and PATINDEX; the configuration, cursor, metadata, security, and system statistical functions are all nondeterministic. See BOL for
a list of functions that are always deterministic.
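For example (a sketch of my own; the table and column names are illustrative), an index can be built on a computed column only
when its expression is deterministic:

CREATE TABLE Shipments
(
    OrderDate datetime NOT NULL,
    DueDate AS DATEADD(day, 7, OrderDate)    -- deterministic, so it can be indexed
    -- Stamped AS GETDATE()                  -- nondeterministic; indexing would fail
)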
String Functions
T-SQL provides a rich collection of string manipulation and string management functions. Many of these functions have equivalent
functionality in generic programming languages such as VB, C++, and Delphi.
ASCII(character)
The ASCII(character) function returns the ASCII code value of type int of the leftmost character of the character expression
evaluated. For example, the expression
SELECT ASCII('d')

returns the value 100.
CHAR(integer)
The CHAR(integer) function is used to return the ASCII character of its integer code assignment. For example, the expression
SELECT CHAR(100)
returns “d,” which is the converse of the ASCII() function. Remember, the ASCII codes run from 0 to 255, and thus a null value is
returned if you are outside this range. Not every code has a displayable character, either: CHAR(13), for example, is the carriage
return code and has no displayable value. On the other hand, you can use the CHAR() function to insert control
characters into character strings. For example, the expression
SELECT CHAR(100)+CHAR(9)+CHAR(68)
returns the values d and D separated by a tab. The most common control characters used are Tab (CHAR(9)), line feed (CHAR(10)),
and carriage return (CHAR(13)).
CHARINDEX(expression1, expression2[, startlocation])
This function can be used to determine the starting position of a specified phrase in a character string. It is useful for searching
textual data. The syntax for this function is CHARINDEX(expression1, expression2, startlocation). The arguments taken are as
follows:
expression1 Represents the expression that contains the sequence of characters to be found.
expression2 Represents the expression, typically a column, to be searched.
startlocation Represents the starting position in expression2 from which to begin searching for expression1. If the start location
is a negative number or zero, then the search begins at the beginning of expression2. In my testing of this function, omitting the
start location after a comma delimiter causes an error. If you exclude the delimiter, the function computes. For example, the
expression
returns the value 10. Using this function, you can easily search for the starting point of a string in a particular record in your table. For
example, the expression
USE NORTHWIND
SELECT CHARINDEX('Connection', CompanyName)
FROM Customers WHERE CustomerID = 'EASTC'
returns the value 9. This type of expression is valuable when you need to get into a value and extract a subexpression. For example, in
my record numbers for call center systems I tag an agent ID onto an order number and save the entire record as a record number. If a
manager needs to check which agent worked the record, the system can extract the agent ID by first locating the starting point in the
record where the agent ID begins and then running the SUBSTRING() function on the rest of the record. This saves having to create
another table or column that references the record number entity with the agent ID entity.
It should be noted that if either expression1 or expression2 is of a Unicode data type, the nvarchar and nchar types respectively, and
the other is not, the other is converted to Unicode. In other words, if ex1 is Unicode and ex2 is not, then ex2 is converted to Unicode.
Also, if either expression1 or expression2 is NULL, the CHARINDEX() function returns NULL if the database compatibility level
you have set is 70. If the database compatibility level is 65 or earlier, the CHARINDEX() function returns NULL only when both
expression1 and expression2 are NULL.
DATALENGTH(variable expression)
This “function” deserves a place in a list of string manipulation functions. It is similar to LEN(), referenced later in this section, but
returns the number of bytes used to represent the value of a variable or field. For example, the expression
USE NORTHWIND
SELECT DATALENGTH(Phone)
FROM CUSTOMERS WHERE CustomerID = 'EASTC'
returns the integer value of “28” (the phone number is stored in a Unicode column, at 2 bytes per character), while LEN() returns
“14,” the actual length of the string in the variable or field.
DIFFERENCE(expression1, expression2)
This function is used to determine the difference between the SOUNDEX algorithm values of two char or varchar strings represented
in expression1 and expression2 (see SOUNDEX() later in this section). The return value is an integer on a scale of 0 to 4. The lowest
value of “0” indicates the highest difference between the two strings. The highest value of “4” indicates that the two strings “sound”
very similar. For example, the expression
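-- (a sketch; the original operands were lost in conversion; 'bar' is named
-- in the text below, and 'foo' is my stand-in for the first string)
SELECT DIFFERENCE('foo', 'bar')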
returns a value of “2,” indicating the two strings do not sound the same but are similar in construction. Upon changing “bar” to “boo,”
the difference value increases to 3.
LEFT(string, integer)
This function returns the leftmost portion of a character string, up to the specified number of characters. It can be
used in conjunction with the CHARINDEX() function to return the value of a string at a specified index. For example, the
expression
USE NORTHWIND
SELECT LEFT(CompanyName, 5)
FROM CUSTOMERS WHERE CustomerID = 'EASTC'
returns the varchar value “Easte.” If you need to simply evaluate a string literal, just substitute it for the column name used in the
expression; for example, LEFT('codered', 4) returns “code.”
The expression can be of character or binary data, a constant, a variable, or a column. It must be of a data type that is
implicitly convertible to varchar; otherwise, use the CAST function to explicitly convert the string to varchar before you evaluate it.
LEN(string)
This function returns the number of characters, not the number of bytes, of the given string expression, excluding trailing blanks. For
example, the expression
USE NORTHWIND
SELECT LEN(Phone)
FROM CUSTOMERS WHERE CustomerID = 'EASTC'
returns the integer value of “14,” which will help us clean up the telephone number column in the Customer table of the Northwind
database (see also DATALENGTH()).
LOWER(string)
This function converts all uppercase characters of character or binary data in the expression argument to lowercase and then returns
the new expression. For example, the expression
USE NORTHWIND
SELECT LOWER(CompanyName)
FROM CUSTOMERS WHERE CustomerID = 'EASTC'
returns the value “eastern connection.” The string in expression can be a constant, a variable, or a column, as shown in the example.
It must be of a type that can be implicitly converted to varchar; otherwise, use CAST to explicitly convert it.
See also UPPER() later in this section.
LTRIM(string)
This function returns a character expression after first removing leading blanks. For example, the expression
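SELECT ' my bunny lies over the hillside'    -- (reconstructed; no LTRIM applied)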
returns the value “ my bunny lies over the hillside”, with spaces, but the
expression
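SELECT LTRIM(' so bring back my bunny to me')    -- (reconstructed)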
returns the value “so bring back my bunny to me,” sans the spaces. The expression must be an expression of character or binary data.
It can be a constant, a variable, or a column, but it must be of a data type that is implicitly convertible to varchar. Otherwise, use
CAST to convert the expression to varchar.
NCHAR(integer)
This function returns the Unicode character with the given integer code, as defined by the Unicode standard. For example, the
statement
PRINT NCHAR(167)
returns the character “§.” The value must be a positive number in the range 0–65535. If you specify a value outside this range, NULL
is returned.
QUOTENAME(string, quote_character)
This function returns a Unicode string with the delimiters surrounding the string. For example, the statements
SELECT QUOTENAME('PHONES','"')
PRINT QUOTENAME('PHONES','[')
return the values “PHONES” and [PHONES], respectively. The ', ", [, ], {, and } characters are valid quote characters.
REPLACE(expression1, expression2, expression3)
This function finds all occurrences of the second string in the first string and then replaces them with the string in the third expression.
For example, the statement
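-- (a sketch; the original statement was lost in conversion; the literals are mine)
SELECT REPLACE('the small brown fox', 'brown', 'red')    -- returns: the small red fox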
This function can be used with both character and binary data.
REPLICATE(character, integer)
This function repeats a character expression for a specified number of times. It is useful for padding if you replicate a space instead of
a character. For example, the statement
PRINT REPLICATE('0', 2)
returns the value ‘00’. The int expression must be a positive whole number. If it is negative, a null string is returned.
REVERSE(string)
This function returns the reverse of a character expression. For example, the statement
SELECT REVERSE('evol')

returns the value “love.”
RIGHT(string, integer)
This function returns the part of a character string starting a specified number of characters from the right. For example, the statement
SELECT RIGHT('evol', 1)

returns the value “l.”
RTRIM(string expression)
This function is the converse of LTRIM. It snips all trailing blanks from the expression passed in the argument placeholder. For
example, the statement
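-- (a sketch; the original example was lost in conversion)
SELECT RTRIM('my bunny lies over the hillside   ')

returns the string with the trailing blanks removed.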
SOUNDEX(string expression)
This function returns the four-character code of the SOUNDEX algorithm that is used to evaluate the similarity of two strings (see the
DIFFERENCE() function discussed earlier). For example, the statement
SELECT SOUNDEX('WASH')

returns the value “W200.”
The SOUNDEX() function converts an alpha string to a four-character code to find similar-sounding words or names. You can then
use this value and compare it to another SOUNDEX() using the DIFFERENCE() function. The first character of the SOUNDEX code
is the first character of the argument, and the second through fourth characters of the code are numbers. Vowels in the argument are
ignored unless they are the first letter of the string. String functions can be nested.
SPACE(value)
This function returns a string of repeated spaces, the number of which is indicated by the integer passed in the argument. For example,
the statement
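-- (a sketch; the original statement was lost in conversion and is
-- reconstructed here to match the result shown below)
SELECT 'Y' + SPACE(0) + '=1'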
returns the expression “Y=1.” If you are adding spaces to Unicode data, use the REPLICATE() function instead of SPACE().
STR(float[, length[, decimal]])
This function returns character data converted from numeric data. For example, the statement
SELECT STR(42393.78, 8, 1)
returns the value “42393.8.” The float expression must be an expression of an approximate numeric (float) data type with a decimal
point. The length argument is the total length of the returned value including the decimal point, sign, digits, and spaces. The default is
10. The decimal argument is the number is the number of places to the right of the decimal point, rounded off as in the preceding
example.
If you supply values for the length and decimal parameters to the STR() function, they must be positive. The specified length you
provide should be greater than or equal to the length of the part of the number before the decimal point plus any number sign you provide. A short
float expression is right-aligned in the specified length, while a long float expression is truncated to the specified number of decimal
places. For example, STR(12, 10) yields the result 12, right-aligned in the output. However, STR(1223, 2) truncates the
result to **.
STUFF(string, start, length, string)
This function deletes a specified length of characters in a string and "stuffs" another set of characters at a specified starting point. You
can use it to delete the characters in the middle of a string and replace them with new characters. For example, the statement

SELECT STUFF(PHONE, 10, 1, '-') FROM CUSTOMERS

returns all telephone numbers from the PHONE column in the CUSTOMERS table with the period removed at the tenth character and
the dash inserted instead. The value is changed from "(800) 555.1212" to "(800) 555-1212".
SUBSTRING(string, start, length)
This function returns part of a character, binary, text, or image expression. For example, the statement

SELECT SUBSTRING('334281234', 1, 5)

returns the value "33428," representing the first five digits of the nine-digit ZIP code.
The argument in the string expression can be a character string, a binary string, text, an image, a column, or an expression that
includes a column (but not an expression that includes aggregate functions). The start parameter is an integer that specifies where the
substring begins, while the length parameter takes an integer that specifies the length of the substring (the number of characters or
bytes to return). (See Books Online for more information on using this function with the other data types.)
UNICODE(unicode expression)
This function returns the integer value, as defined by the Unicode standard, for the first character of the input expression. For
example, the statement

SELECT UNICODE('§')

returns the value 167 (see also NCHAR() earlier in this section).
UPPER(character expression)
This function returns a character expression with the lowercase characters converted to uppercase. For example, the statement

SELECT UPPER('noodle')

returns the value "NOODLE" (see also LOWER() earlier in this section).
Mathematical Functions
The T-SQL mathematical functions are scalar functions that compute the values passed as arguments and then return a numeric value.
All the functions are deterministic-in other words, they always return the same value for any given value passed as an argument-with
the exception of the RAND() function, which returns a random value. The RAND() function, however, becomes deterministic when
you use the same seed value as an argument.
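A quick sketch of that seeding behavior (the actual random values vary, so none are shown here):

SELECT RAND(100) AS Seeded1  -- seeded: repeatable for the same seed
SELECT RAND(100) AS Seeded2  -- returns the same value as Seeded1
SELECT RAND() AS Unseeded    -- nondeterministic: varies from call to call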
In addition (no pun intended), functions such as LOG, LOG10, EXP, SQUARE, and SQRT cast the input value to
a float before computing and then return the value as a float. Table 10–11 lists the mathematical functions and provides brief
explanations of how to use them. For a complete reference to these functions, consult SQL Server Books Online.
SIN(float) Returns the trigonometric sine of the given angle (in radians) in an approximate numeric (float) expression.
SQRT(float) Returns the square root of the given expression.
SQUARE(float) Returns the square of the given expression.
TAN(float) Returns the tangent of the input expression.
Aggregate Functions
The aggregate functions are used to perform a calculation on a set of values and then return a single value to the caller. Typically
these functions ignore NULLs, with the exception of COUNT(*), which counts rows regardless, because technically NULL is a value.
Aggregate functions are often used with the GROUP BY clause in a SELECT statement.
The aggregate functions are deterministic and thus return the same value when they are called with a given set of input values. These
functions can be used only in the following situations:
In the select list of a SELECT statement (in either a subquery or an outer query)
In a COMPUTE or COMPUTE BY clause
In a HAVING clause
The Transact-SQL programming language provides these aggregate functions, listed in Table 10–12.
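As a brief sketch, assuming the Northwind Orders table used elsewhere in this chapter, the following query aggregates per customer and filters the groups in a HAVING clause:

USE NORTHWIND
SELECT CustomerID, COUNT(*) AS Orders, MAX(OrderDate) AS LastOrder
FROM Orders
GROUP BY CustomerID
HAVING COUNT(*) > 10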
Date and Time Functions
These scalar functions perform an operation on a date and time input value and return a string, numeric, or date and time value. Table
10–13 lists the date and time functions and the information they return.
DAY() Returns an integer representing the day datepart of the specified date.
GETDATE() Returns the current system date and time in the standard internal format for datetime values.
GETUTCDATE() Returns the datetime value representing the current UTC time (Universal Time Coordinate or Greenwich Mean
Time). The current UTC time is derived from the current local time and the time zone setting in the operating
system of the computer on which SQL Server is running.
MONTH() Returns an integer that represents the month part of a specified date.
YEAR() Returns an integer that represents the year part of a specified date.
Interval Value Range
Year yy, yyyy 1753–9999
Quarter qq, q 1–4
Month mm, m 1–12
Dayofyear dy, y 1–366
Day dd, d 1–31
Week wk, ww 1–53
Weekday dw 1–7
Hour hh 0–23
Minute mi, n 0–59
Second ss, s 0–59
Millisecond ms 0–999
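These interval abbreviations are the datepart arguments taken by functions such as DATEPART() and DATEADD(); a minimal sketch:

SELECT DATEPART(ww, GETDATE()) AS WeekOfYear    -- the week interval
SELECT DATEADD(qq, 1, GETDATE()) AS NextQuarter -- adds one quarter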
Text and Image Functions
These are scalar, nondeterministic functions that can perform an operation on a text or image argument. The following functions are
supported in T-SQL:
PATINDEX()
TEXTPTR()
TEXTVALID()
PATINDEX('%pattern%', expression)
This function returns the starting position of the first occurrence of a pattern in the specified expression. It returns zero if the pattern
is not found. The function works on all valid text and character data types. For example, the expression

SELECT PATINDEX('%.%', PHONE) FROM CUSTOMERS

returns 11 records that contain a period in the string. Nine of these are reported to be in position three, which indicates that an IP
address has been inserted into the PHONE column. Records that do not qualify are returned as a zero value in the returned result set. In
the preceding example, I used a single character as the pattern, but your pattern could be any combination of characters and spaces
that form the pattern. In other words, the pattern is a literal string.
You can also use wildcard characters, but you must remember to insert the % character at the beginning and end of the pattern to be
evaluated (except when searching for first or last characters). You can also use PATINDEX() in a WHERE clause. For example, the
following statement

SELECT * FROM CUSTOMERS WHERE PATINDEX('Shapiro%', LastName) > 0

returns all customers with a last name of Shapiro in the CUSTOMERS table.
TEXTPTR(column)
This function returns the text-pointer value that corresponds to a text, ntext, or image column in varbinary format. The retrieved text
pointer value can then be used in READTEXT, WRITETEXT, and UPDATETEXT statements. For example, the statement

DECLARE @ptr varbinary(16)
SELECT @ptr = TEXTPTR(Picture) FROM Categories WHERE CategoryID = 1
READTEXT Categories.Picture @ptr 0 0

returns the image data you can then use in the client application.
For tables with in-row text, TEXTPTR returns a handle for the text to be processed. You can obtain a valid text pointer even if the text
value is null. If the table does not have in-row text, and if a text, ntext, or image column has not been initialized by an UPDATE
statement, TEXTPTR returns a null pointer.
TEXTVALID()
The TEXTVALID() function is used to check whether a text pointer exists. You cannot use UPDATETEXT, WRITETEXT, or
READTEXT without a valid text pointer. Chapter 17 provides an example of the TEXTVALID() function.
Conversion Functions
SQL Server 2005 supports two conversion functions, CONVERT() and CAST(), that let you convert a variable or column of one type
to another. Use of these functions is called explicit casting or conversion, as opposed to the implicit (automatic) conversion SQL Server 2005
performs on several data types. In other words, the conversion functions are used when you have no choice but to convert manually, or when your
application demands it for some reason.
CAST() does not do anything more than CONVERT(), but it is provided for compatibility with the SQL-92 standard. This discussion
thus focuses on CONVERT(), and I will make mention of features that CAST() does not support. The syntax for this function is
CONVERT(data_type, variable, style). The arguments are as follows:
Data_type This is the target of the conversion, for example, to convert a money value to a character data type for use in
the construction of a financial report, perhaps an invoice.
Variable This is the value to be converted-a constant, a variable, or a column expression of a convertible data type.
Style This is the optional argument used when the target data type can take one or more style changes.
You can use either of the functions in SELECT statements, in the WHERE clause, and anywhere else you can provide an expression.
The following example converts a column from 30 to 25 characters:

SELECT CONVERT(char(25), LastName) FROM CUSTOMERS
The data type argument you use in CONVERT() can be any valid data type supported by SQL Server. If you use a data type that takes
a length argument (nchar, nvarchar, char, varchar, binary, or varbinary), you can pass the length in parentheses after the data type
name.
You can also use the CONVERT() function to obtain a variety of special data formats. For example, the style argument (not supported
by CAST()) is used to specify a particular date format required when you convert datetime and smalldatetime variables to character
types. Table 10–14 lists the style values and the date formats returned.
12 112 yymmdd
13 113 dd mon yyyy hh:mi:ss:mmm (24h)
14 114 hh:mi:ss:mmm (24h)
20 120 yyyy-mm-dd hh:mi:ss (24h)
21 121 yyyy-mm-dd hh:mi:ss.mmm (24h)
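For instance, a quick sketch of two of the styles in the table (output shown for an arbitrary date):

SELECT CONVERT(varchar(30), GETDATE(), 112) -- yyyymmdd, e.g. 20050706
SELECT CONVERT(varchar(30), GETDATE(), 120) -- yyyy-mm-dd hh:mi:ss, e.g. 2005-07-06 18:20:00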
The following example illustrates the differences between CAST() in the first SELECT statement and CONVERT() in the second
SELECT statement. The result set is the same for both queries:
CAST():
USE NORTHWIND
SELECT CAST(regiondescription AS char(2)), regionid
FROM region
CONVERT():

USE NORTHWIND
SELECT CONVERT(char(2), regiondescription), regionid
FROM region

Both statements return the following result set:
Ea 1
We 2
No 3
So 4
In the preceding example, we converted the region description column from 50 to 2 characters. A better example would be to convert
a first name column to one character and compile a report listing of first name initials and full last names. CONVERT() is also useful
when using LIKE in the WHERE clause.
As mentioned earlier, SQL Server automatically converts certain data types. If, for example, you compare a char expression and a
datetime expression, or a smallint expression and an int expression, or char expressions of different lengths, SQL Server will convert
them automatically. This is called an implicit conversion.
SQL Server reports an error when you attempt a conversion that is not possible. For example, trying to convert a char value containing
letters to an integer will raise an exception.
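A minimal sketch of both cases (the exact error message text varies):

SELECT 1 + '2'             -- implicit conversion: '2' becomes an int, returns 3
SELECT CONVERT(int, 'abc') -- fails: letters cannot be converted to an integer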
If you do not specify a length when converting, SQL Server will supply a length of 30 characters by default.
SQL Server will reject all values it cannot recognize as dates (including dates earlier than January 1, 1753) when you try to
convert to datetime or smalldatetime. You can only convert datetime to smalldatetime when the date is in the proper date
range (from January 1, 1900, through June 6, 2079). The time value will be rounded to the nearest minute.
When you convert to money or smallmoney, any integers in the conversion expression are assumed to be monetary units. For
example, let’s say you pass an integer value of 5 in the expression; SQL Server will convert it to the money equivalent of five
dollars-expressed as U.S. dollars if us_english is the default language.
When converting floating-point values to money, the digits to the right of the decimal point are rounded to four decimal places by default.
Expressions of data types char or varchar that are being converted to an integer data type must consist only of digits and an
optional plus or minus sign (+ or −). The leading blanks are ignored. Any expressions of data types char or varchar converted to
money can also include an optional decimal point and leading currency sign.
You can include optional exponential notation (e or E, followed by an optional + or − sign, and then a number) in data types
char or varchar that are being converted to float or real.
When you pass character strings for conversion to a data type of a different size, any values too long for the new data type are
truncated, and SQL Server displays an asterisk (*). This is the default display in both the OSQL utility and Management Studio.
Any numeric expression that is too long for the new data type to display is truncated.
You can also explicitly convert any text data to char or varchar, and image data to binary or varbinary. As discussed earlier,
these data types are limited to 8,000 characters, and so you are limited to the maximum length of the character and binary data
types; that is, 8,000 characters. When you explicitly convert ntext data to nchar or nvarchar, the output is confined to the
maximum length of 4,000 characters. Remember that when you do not specify the length, the converted value has a default
length of 30 characters. Implicit conversion is not supported with these functions.
When you convert between data types in which the target data type has fewer decimal places than the source data type, the
resulting value is rounded. For example, the result of CAST(10.3496847 AS money) is $10.3497.
Style
The number you supply as the style argument determines how the datetime data is displayed. For starters, the year can
be displayed in either two or four digits. By default, SQL Server supplies a two-digit year, which may be a problem in certain
transactions. For example, the statement

SELECT CONVERT(char(8), GETDATE(), 1)

returns the date 07/06/00. Table 10–14 provides the values for the style argument.
T-SQL Flow-Control
The T-SQL language supports basic flow-control logic that allows you to perform program flow and branching according to
certain conditions you provide the switching routines. The routines allow you to test one thing or another in simple either/or
constructions, or test for multiple values in an easy-to-use CASE facility. The T-SQL flow-control options are as follows:
If…Else
CASE
While
Continue/Break
GOTO/Return
If…Else
This branching or condition-switching statement will execute an isolated block of code in a routine according to a qualifying
condition. If the condition qualifies, the code in the If block is executed. If it does not qualify, the program moves to the block of code
in the Else section of the routine. The block of code in the Else section can contain something of substance or very little.
IF condition
Begin
{do something here}
End
Else
Begin
{do something here}
End
This syntax is a little like Pascal; however, notice that no “end ifs” are required, but you should enclose your code in the Begin…End
blocks. I say “should” because you can get away with omitting the Begin…End blocks in simple code segments. However, the
Begin...End is essential when you need to make sure that all lines in the code segment are processed.
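As a concrete sketch of the construct (the variable and threshold here are arbitrary):

DECLARE @Qty int
SET @Qty = 5
IF @Qty > 10
BEGIN
   PRINT 'Fast mover'
END
ELSE
BEGIN
   PRINT 'Slow mover'
END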
CASE
The CASE statement works the same as the CASE statements you find in all modern programming languages, such as Visual Basic,
Delphi, or Java. The T-SQL CASE statement can compare a variable or a field against several variables or fields. You could
technically do this with multiple If…Else blocks, but that would be ugly to say the least, and you would have no way to escape such a
construction after a condition finds a match or tests true.
T-SQL CASE statements test a variable to be true by using the WHEN…THEN clause. For example, "WHEN the banana is yellow"
THEN "eat it." After the WHEN tests true, the THEN condition is applied and execution flow continues through the CASE block. For
example, the statement

SELECT ProductName,
   CASE
      WHEN UnitsInStock > 100 THEN 'Slow mover'
      ELSE 'Fast mover'
   END
FROM Products

Obviously the preceding statement might make more sense if the query also checked restock dates and other factors, because an item
could be considered a slow mover an hour after a new shipment arrived. However, it adequately illustrates a simple CASE usage.
You can do a lot with CASE, such as assign the obtained value in a CASE statement and then pass that out to a stored procedure or
another construction. For example, consider the following statement:

DECLARE @CouponCode char(3), @Discount real
SET @CouponCode = 'C75'
SELECT @Discount =
   CASE @CouponCode
      WHEN 'C75' THEN 7.5
      ELSE 0
   END

I use the discount variable obtained at the end of the CASE and apply it to an item for which the customer has a discount coupon I can
identify with a coupon code. In this case, the discount is 7.5 percent. The variable @CouponCode could change from item to item.
This can be wrapped up in a trigger, as demonstrated in the next chapter, allowing the server to appropriately apply the discount.
WHILE
The WHILE loop is a flow-control statement that executes a single statement or block of code between BEGIN and END keywords.
For example, the following is a simple WHILE statement that increments a value:

DECLARE @Counter int
SET @Counter = 0
WHILE @Counter < 10
   SET @Counter = @Counter + 1
To repeatedly execute more than just a single line of code, enclose the code between the BEGIN and END blocks, as demonstrated in
If…Else. We will revisit WHILE in later chapters to demonstrate some advanced T-SQL features, such as triggers, cursors, and stored
procedures.
Continue or Break
Use CONTINUE and BREAK to change or stop the execution of the WHILE loop. The CONTINUE keyword restarts a WHILE loop,
and the BREAK terminates the innermost loop it is in.
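A brief sketch of both keywords (the counter is arbitrary):

DECLARE @Counter int
SET @Counter = 0
WHILE @Counter < 10
BEGIN
   SET @Counter = @Counter + 1
   IF @Counter < 5 CONTINUE -- restarts the WHILE test
   BREAK                    -- terminates the innermost loop
END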
GOTO and RETURN
These two flow-control statements let you jump out of your current segment and move to another location in the procedure, similar to
the GOTO in VB or DBase. GOTO moves to a line identified by a label followed by a colon (ArrivedHere:). RETURN ends a
procedure unconditionally and can optionally return a result.
The GOTO command is confined to a control-of-flow statement, statement blocks, or procedures, but it cannot go to a label outside of
the current process. However, the GOTO branch can alter the flow and reroute it to a label defined before or after GOTO. The
following example emulates a WHILE loop, and the RETURN is used to break out of the loop when a certain value is reached:
DECLARE @Counter int
SET @Counter = 0
Counter:
SET @Counter = @Counter + 1
IF @Counter = 10
BEGIN
   PRINT 'You have reached ' + CAST(@Counter AS CHAR)
   RETURN
END
ELSE
   GOTO Counter
WAITFOR
The WAITFOR statement suspends procedure execution until a certain time or time interval has passed. The following example prints
the time exactly as prescribed in the argument, but notice the conversion and trimming that is needed to return the system time in a
simple time format of 00:00 hours:
BEGIN
WAITFOR TIME '18:20'
PRINT 'THE TIME IS '+ LEFT (CONVERT (CHAR(20), GETDATE(), 14), 5)
END
RAISERROR
RAISERROR is a facility supported by SQL Server 2005 as a flow-control feature, which is why I tacked it onto this section.
However, you will use this facility in many places, such as triggers, stored procedures, transaction processing, and so on.
However, the simplest syntax to observe at this point for client information is simply RAISERROR(message, severity, state). Thus
the syntax for a simple message to the user would be RAISERROR('This is a non-severe error', 1, 1). More about RAISERROR later.
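RAISERROR also accepts printf-style substitution parameters; a minimal sketch (the variable is arbitrary):

DECLARE @Qty int
SET @Qty = -1
IF @Qty < 0
   RAISERROR('Quantity %d is invalid', 16, 1, @Qty)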
TRY…CATCH
The new TRY…CATCH support in T-SQL implements an error-handling process for Transact-SQL similar to the exception handling
in the .NET Framework (such as in Visual Basic, C#, and C++). You can enclose your Transact-SQL statements in a TRY block, and if an
error occurs within the TRY block, control is passed to another group of statements enclosed in the CATCH block. The following
syntax is the standard TRY…CATCH construct for T-SQL:
BEGIN TRY
{ sql_statement | statement_block }
END TRY
BEGIN CATCH
{ sql_statement | statement_block }
END CATCH
[ ; ]
Here is an example:
BEGIN TRY
-- Test the impossible divide-by-zero to force an exception.
SELECT 1/0;
END TRY
BEGIN CATCH
EXECUTE MyErrorHandler
END CATCH;
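If you do not have a handler procedure such as MyErrorHandler on hand, the CATCH block can interrogate the error functions directly; a minimal sketch:

BEGIN TRY
   SELECT 1/0;
END TRY
BEGIN CATCH
   SELECT ERROR_NUMBER() AS ErrorNumber, ERROR_MESSAGE() AS ErrorMessage;
END CATCH;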
Identifiers
The names that you give to databases and database objects when you create them are called identifiers. You don’t have to supply an
identifier with the same fanfare and excitement that you went through when your parents named you, but identifiers and how you use
them are important.
Most of the time you need to supply an identifier at the instant you create, or define, an object. In some cases SQL Server does this for
you. A good example of an object that automatically gets its own identifier from SQL Server is the default constraint object. In the
next chapter I talk about referencing this identifier in your code.
The statement

CREATE TABLE Items (ItemNumber int)
GO
SELECT ItemNumber FROM Items

is an example of naming the table and then referencing it by the identifier in the same stroke.
Often it becomes necessary to delimit identifiers with open/close square brackets ([]) when the identifiers do not conform to the rules
for well-formed identifiers in T-SQL. For example, the table ImportedCoffee is fine, but Imported Coffee needs to be delimited.
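For instance (both table names here are illustrative):

SELECT * FROM ImportedCoffee    -- well-formed, no delimiters needed
SELECT * FROM [Imported Coffee] -- embedded space, so delimiters are required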
It also makes good business sense as a T-SQL programmer to help the optimizer reuse execution plans, and qualifying identifiers with the
SQL Server database namespace is important to that end. In the following code examples both styles are accepted by SQL Server, but the latter is
preferred:
--one way
USE Northwind
SELECT CustomerID FROM Customers
--better way
SELECT CustomerID FROM Northwind.dbo.Customers
As discussed earlier, the at sign (@) at the beginning of an identifier is reserved by SQL Server to denote a variable, and a leading pound
(#) sign denotes a temporary object. Subsequent characters can include letters, digits, and the @, $, #, and underscore (_) signs.
Moving On
This chapter covers much of the "bits and pieces" you need to code in T-SQL. With what we discussed, you'll be able to get cracking
and, within a few chapters, come up with some smart queries. You might not be ready to code the query that calculates the age of the
universe, so a good "complete reference" to T-SQL or SQL is a good idea. This chapter, the ones that follow, and the goodies you will
find in the Appendix go a long way toward providing that reference.
While we dealt with a lot of the basic ingredients, we also touched on some new data types and functions, which I will refer to in the
chapters ahead. So, let's move on to stored procedures and triggers and put some of the stuff we learned in this chapter to work.
The Common Language Runtime
Overview
This chapter describes the .NET Framework and its Common Language Runtime (CLR) and how SQL Server 2005 now integrates
this environment into its processing engine. The incorporation of a version of the CLR directly into SQL Server (in-process) has
catapulted the product into the future, years ahead of itself. To fully understand how it works, this chapter looks at the various aspects
of this runtime environment, because it is important to us, going in from the ground up, to create functions, stored procedures, triggers,
data structures (such as arrays), and .NET SQL Server integrated applications, components, and other services.
With this concise coverage of the .NET Framework's runtime, you will be able to design and code applications with the runtime in
mind, especially in the area of memory management, which represents the biggest change in the way we write any SQL Server-based
applications. Knowing about the runtime is also especially important for programming with the correct security model, exception
handling, referencing the correct assemblies to target namespaces, debugging assemblies, and otherwise managing assemblies
(deployment and maintenance). All of these subjects make programming the SQL Server CLR far more complex than standard T-
SQL.
We will first look at the following elements of the Framework:
The Common Type System (CTS) The system that provides the type architecture of the Framework and type safety.
The Common Language Specification (CLS) The specification all .NET language adopters and compiler makers adhere to so
that their languages can be seamlessly integrated into the .NET Framework.
The Common Language Runtime (CLR) The runtime and managed execution environment in which all .NET managed
applications are allowed to process.
We will then break down the Common Language Runtime into several components to be discussed as follows:
Managed execution This section discusses what managed execution means, as well as how it differs from other execution
environments such as VBRUN, SmallTalk’s runtime, and the Java Virtual Machine (JVM). It also introduces the garbage
collector.
The runtime environment This section discusses how the CLR works with metadata and Microsoft Intermediate Language
(MSIL) to execute code. It also investigates the just-in-time (JIT) compilation architecture. We also briefly look at application
domains and what they mean for your deployment requirements. And we also touch on the subject of attributes-a facility for
allowing programmers to have more control over the execution and management of their code in the runtime environment.
Assemblies This section goes into assemblies in some depth and examines how .NET applications, class libraries, and
components are packaged.
CLR and security This section introduces the security architecture of the CLR and how it affects your code and ability to
deploy.
Even if you have not had any experience writing and compiling a .NET application outside the SQL Server engine, this chapter will
give you the necessary information to hit the ground running-writing, compiling, and executing any stored procedure, function,
trigger, or user-defined data type (UDT).
The Common Type System
The Common Type System is the formal definition of how all types in the .NET Framework are constructed, how all types are
declared and used, and how they are managed. The CTS also lays the ground rules for protecting the integrity of executing code.
Generally we talk about an object model in object-oriented programming, but the Common Type System is more than just an object
model.
The CTS also specifies how types-classes-are referenced, and how applications and class libraries are packaged for execution on the
CLR. The CTS describes class declaration, inheritance, referencing, and type management, not so much as SQL Server idioms but
rather as .NET Framework idioms. In other words, all .NET development environments must walk the same walk and talk the same
talk, if they hope to be tightly integrated with the platform.
In particular the Common Type System provides the following foundations for the .NET Framework:
It provides an object-oriented model that is supported by all programming languages that have adopted the .NET Framework. In
this regard it is responsible for the Common Language Specification, and how it is implemented by .NET adherents. This means
you can use even COBOL to program objects for SQL Server.
It establishes the foundations and reference framework for cross-language integration, interoperation, type safety, security, and
high-performance code execution.
It defines rules that languages must follow, which helps ensure that objects written in different languages can interact with each
other.
You could also consider the subject of assemblies and namespaces, to be discussed later in this chapter, but let’s look at the CTS
object model to get our bearings and gain some perspective.
Later in this chapter you will come across references to the root of the object model, Object, and how it functions as the so-called
“ultimate” object of the Framework. Figure 11–1 illustrates the model.
Figure 11–1: The CTS type model, which is the basis for the object model and hierarchy
The Common Language Specification
Language interoperability is considered to be one of the Holy Grails of software development-and the .NET Framework has risen to
the challenge admirably. By writing “CLS-compliant code,” you assure that the classes you construct in one language can be used as
is by other languages and their respective IDEs and development tools. Imagine that-you can now create components that can be used
by any language or development tool without complex COM and ActiveX interfaces and registration details, and upload them into the
SQL Server CLR. To achieve the magic, the CLS requires that class and component providers only expose to consumers the features
that are common to all the .NET languages.
The CLS is really a subset of the Common Type System (CTS), as I mentioned earlier. In other words, all the rules specified by the
Common Type System in the runtime environment, like type safety, drive how the CLS governs compliance at the code construction
and compilation levels. The CTS lays down the rules to protect the integrity of code by ensuring type safety. When the CTS was being
created, the code constructs that risked type safety were excluded from the CLS. Thus your code is always checked for type safety:
as long as you produce CLS-compliant code, it will be verified by the CTS.
The old cliché that rules can be broken is likely to be echoed in various far-flung shops. But when you program against the specs in
the CLS, you ensure language interoperability for your intended audience and then some. CLS compliance ensures that third parties
can rely on your code, and you obtain the assurance that the facilities you want exposed are available across the entire spectrum of
developers.
Table 11–1 provides an abridged list of software development features that must meet CLS compliance rules. The table summarizes
the features that are in the CLS and indicates whether the feature applies to both developers and compilers (All) or only compilers.
The CLS includes the language constructs that are needed by all developers, of all .NET languages. That may seem like a tall order,
but the specification is not so big or complex that a .NET language will find it very difficult to support. After all, many of
the languages at the source-code level are as different from each other as fish are from birds. Just take a look at Smalltalk and
compare it to Pascal, or compare C# or the managed extensions of C++ to Visual Basic.
Visual Basic does things in its own peculiar way. Thus, writing Visual Basic 2005 code to achieve one end may actually produce
some strange nuances when packaged and then accessed in the C# side of the house. A good example is the big difference between
the way properties are implemented in Visual Basic and how they are implemented in C# (see the section "Understanding Assemblies"
later in this chapter). Keep in mind that the CLR for SQL Server executes a fully functional subset of assemblies, so many of the
issues you will have on standard CLR will not impact SQL Server.
Classes that were produced in one language can be inherited by classes used in other languages.
Objects instantiated from the classes of a sender written in one language can be passed to the methods of receiver objects whose
classes were created in other languages. The receiving objects accept your arguments and process them as if they were written
in the same language as the receiver.
Exception handling, tracing, and profiling are language agnostic. In other words, you can debug across languages, and even
across processes. Exceptions can be raised in an object from one language and understood by an object created in another
language.
Language interop helps maximize code reuse, which is one of the founding principles of all object-oriented languages, and something
we shout out loud. The interoperability is achieved by the provision of metadata in executables and class assemblies that describe the
makeup of assemblies, and the intermediate stage code that is understood across the entire Framework.
Note Components that adhere to the CLS rules and use only the features included in the CLS may be labeled as CLS-compliant
components.
Although the members of most types defined in the .NET Framework class library are CLS-compliant, some may have one or more
members that are not CLS-compliant. These members are provided to enable support for non-CLS-compliant features, such as
function pointers. C#, for example, can be used to access these so-called unsafe features while the architects of Visual Basic have
decided to stay clear of unsafe code. The noncompliant types and members are identified as such in the reference documentation.
More information about them can be found in the .NET Framework Reference. In all cases, however, a CLS-compliant alternative to a
non-CLS compliant construct is available.
The code you write to target the CLR is called managed code. This means that the execution of the code in the runtime environment is
managed by the CLR. What exactly the CLR manages is discussed shortly.
When I started programming in Java and Visual Basic in the mid-nineties, I was perplexed by the need to pay so much attention to the
runtime environment. It took an effort to gather up all the runtime elements and make sure they were properly installed just to run the
simplest application. I was always shocked to have to build a CD just to ship an application that could fit on a quarter of the space of a
floppy disk.
As a Delphi programmer I did not need to concern myself with the need to ensure that a runtime layer on the target operating system
would be able to support my application. But then the big Delphi applications produced executables and dynamic link library
(DLL) files that became rather bloated.
When I moved to VB and Java, I found it disturbing that a tiny executable of no more than 100K needed many megabytes of
supporting libraries just to run. In the early days of Java making sure I had the support of the correct VM was a painful chore. Since I
was only writing Windows applications, I learned to rather program against the Java components of Internet Explorer, to be sure that
my Visual J++ apps would work. Testing for IE’s JVM was actually the easiest way to deploy VJ++ apps back in 1997 or 1998.
After a few years, however, it became clear that the target operating systems my clients were running already had the supporting
runtime environment I needed. This was more the case with the JVM than for VBRUN, mind you, because just about everyone
already had the latest versions of Internet Explorer on their machines. In the new millennium, as long as your operating systems are
well patched and service packs are kept up to date by your IT staff, worrying about the runtime for classic apps is a thing of the past.
This is how it is with SQL Server. You don’t need to worry about any supporting engine components, as we will soon see.
When you compile your SQL Server code, it is changed to an intermediate code that the CLR understands and all other .NET
development environments understand. All .NET languages compile code to this IL, which is known as Microsoft Intermediate
Language, better known as MSIL or just IL for convenience in various places in this book. The idea of compiling to an IL is not new.
As you know, two popular languages compile to an intermediate language (or level): Java and Smalltalk.
There are many advantages to IL (and several disadvantages we will discuss shortly). For starters, on the pro side, compilation is
much quicker because your code does not have to be converted to machine code. Another of the major advantages of IL is that the
development environments of the other .NET languages can consume components and class libraries from the other languages
because at the IL level all .NET code is the same.
Note MSIL represents a major paradigm shift in the compilation of code for the Windows platform. Gone are the days when vendors
touted compiler speeds, robustness of linkers, and so on. Today thanks to Java and .NET, most of the code we write is first
compiled to IL, and we don’t have to do anything else as programmers to compile our code to the machine code level.
Cross-language debugging and profiling is also possible, and so is cross-platform debugging as long as the CLR is the code
management authority end to end. Exceptions caused by code that was originally written in Visual Basic can be handled by a C#
application, and vice versa. Specifically, IL provides the following benefits:
It provides cross-language integration. This includes cross-language inheritance, which means that you can create a new class
by deriving it from a base class written in another language.
It facilitates automatic memory management, which is fondly known as garbage collection. Garbage collection manages object
lifetimes, rendering reference counting obsolete.
It provides for self-describing objects, which means that complex APIs, like those requiring Interface Definition Language
(IDL) for COM components, are now unnecessary.
It provides for the ability to compile code once and then run it on any CPU and operating system that supports the runtime.
Figure 11–3 shows what happens to your code from the time you write and compile it in Visual Studio to execution.
Metadata
Once you have built an application, a class library, or a component and compiled it, the IL code produced is packaged up with its
metadata in an assembly. The assemblies will have either an .exe or a .dll extension, depending on whether they are executables or
class libraries.
But the code cannot be executed just yet, because before the CLR can compile it to machine code, it first needs to decide how to work
with the assembly. The metadata in the IL directs how all the objects in your code are laid out; what gets loaded; how it is stored;
which methods get called; and contains a whole slew of data on operations, control-flow, exception handling, and so on.
The metadata also describes the classes used, the signatures of methods, and the referencing required at runtime (which is what gives
you such powerful stuff as reflection and delegation, with its AddressOf operator). It also describes the assembly by exposing the
following information about the IL code in the assembly:
The identity of the assembly (name, version, public key, culture context, and so on)
Attributes, which are additional elements used on types and their members at runtime.
All this data is expressed in the metadata and essentially allows the assembly contents to be self-describing to the CLR. Self-
describing code makes all the hassles of registration, type libraries, and Interface Definition Language (IDL), as discussed, a thing of
the past. But metadata does much more.
Self-describing files do not need to be identified or registered with the operating system. By packaging metadata within the executable
file itself, the identification is a self-describing ability on the part of the assembly. You can also trust a self-describing assembly more
implicitly than you can a file that publicizes itself in the registry, because registry entries date rapidly and their integrity can be easily
compromised. Registry entries and their implementation counterparts (the DLLs and executables installed on the system) also can
become easily separated.
If you intend your classes to be totally language agnostic, they need to conform to the CLS and not include elements not supported by
all CLS languages. Because so many CLS languages are here now, and because many more CLS languages are on their way, you
might want to further study the CLS in the .NET SDK.
Executable Code
Assemblies do not have carte blanche run of the CLR. Code is not always passed directly to the just-in-time (JIT) compiler. First, the
IL code may undergo a thorough inspection if deemed necessary by the platform administrator. The code is given a verification test
that is carried out according to the wishes of the network administrator, who might have specified that all .NET code on the machine
must be executed according to a certain security policy. The IL code is also checked to make sure nothing malicious has been
included. How the checking is carried out is beyond the scope of this book, but we will look at various security settings a little later in
the chapter.
The code is also checked to determine whether it is type safe, that the code does not try to access memory locations it is restricted
from accessing, and that references reference what they are supposed to reference. Objects have to meet stringent safety checks to
ensure that objects are properly isolated from one another and do not access each other’s data. In short, if the verification process
discovers that the IL code is not what it claims to be, it is terminated and security exceptions are thrown.
Managed Execution
The .NET JIT compiler has been engineered to conserve both memory and resources while performing its duties. It is able, through
the code inspection process and self-learning, to figure out what code needs to be compiled immediately and what code can be
compiled later, or when it is needed. This is what we mean by JIT compilation-the code is compiled as soon as we need it.
Applications and services thus may appear to be slow to start up the first time, but subsequent execution obviates the need to pass
the code through the "JIT'er" again. You can also force compilation or precompile code if necessary. But for the most part, or at least
until you have a substantial .NET project underway, you will not need to concern yourself with cranking up the JIT compiler or
keeping it idling.
During execution, the CLR manages the execution processes that allocate resources and services to the executable code. Such services
include memory management, security services, cross-language interop, debugging support, and deployment and versioning.
Managed execution also entails a lot more than reading IL, verification, JIT, and so on. It also describes what the CLR does once it
has loaded and executed an application. Three sophisticated operations of the CLR worth noting are side-by-side execution, isolating
applications and services into application domains, and garbage collection.
Side-by-Side Execution
The autonomous, independent, self-describing, unique, versioned nature of an assembly allows you to execute multiple versions of the
same assembly simultaneously. This phenomenon is known as side-by-side execution. It is not something that has never been
done before; it is, rather, something that could never be done easily, and it could not be done with just any application.
Side-by-side execution has brought about the end of DLL hell, because you no longer have to maintain backward compatibility of
libraries and components when new applications and assemblies are installed on a machine. Instead, applications that depend on
yesterday’s version of Smee’s component will not break because a new application was installed with today’s version of Smee’s
component. And when you need to junk the various versions of Smee’s component when they are no longer being used, you can hit
DELETE. However, you will need to explicitly re-register a DLL in SQL Server every time you make changes to the code, and then
re-create the component you are going to call, such as a function, a stored procedure, or a trigger.
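A hypothetical sketch of that re-registration cycle in T-SQL (the names MyLib, dbo.MyFunc, MyLib.MyClass, and the path are placeholders):

DROP FUNCTION dbo.MyFunc
DROP ASSEMBLY MyLib
CREATE ASSEMBLY MyLib FROM 'C:\assemblies\MyLib.dll'
WITH PERMISSION_SET = SAFE
GO
CREATE FUNCTION dbo.MyFunc(@Value int) RETURNS int
AS EXTERNAL NAME MyLib.[MyLib.MyClass].MyMethod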
Side-by-side execution is possible because an executable assembly expresses a dependence on a particular assembly (the old Smee
component). So as long as the old component is still around, any application that needs it still works. However, versioning on .NET is
a little more intelligent than simple version numbers and assemblies that can be gone in a SHIFT-DELETE. Version policy can
specifically force an application to upgrade to the new version of Smee’s component.
Note Just because you can run applications and assemblies side by side on the same computer, and even in the same process, it
doesn’t mean that conflicts won’t crop up. You need good application design and proven patterns of software development to
ensure that code is safe and reentrant.
Garbage Collection
A boon for developers coding to .NET is the automatic memory management that it provides. This has been achieved using a
sophisticated memory-management algorithm called a garbage collector (GC).
Let's set the scene with an analogy. If you are a single person, you know what a drag it is to schlep the garbage out in the morning. If
you are not single, you may also know what a drag it is to be asked to schlep the garbage out in the morning. And if you have kids,
you know what it is like to argue with them and still have to take the garbage out yourself.
See yourself in that picture? Programming and managing memory without a GC can be a drag. Now imagine that every morning, the
garbage bag simply dissolves and you no longer have to worry about it. This is what the GC does for you. It eliminates the chores of
managing memory in programming.
When you no longer need the object and nix the reference variable, when you assign the reference variable to another object, or when
something just happens to cut the reference variable from the object, the object gets lost (it has gone out of scope). This means that
you no longer have a way of referencing the object to reuse it.
In VB 6.0 and earlier days, objects that went out of scope, got lost, or simply were not needed anymore had to be explicitly disposed
of (remember Terminate events [VB], Destroy or Free [Delphi], or DeleteRef [C++]). The problem in manual memory management is
that when you have a lot of objects, you sometimes forget to dispose of them, you lose track of how many were created, and so on. So
some of these objects never get cleaned up and you slowly start to “leak out memory.” The .NET GC does not let this happen, because
these “lost” objects are removed and the memory they occupied is freed.
This, of course, could mean that you can write a heck of a lot of code without having to worry about memory management. However,
we need to say “yes, but” and add a big disclaimer: You can write a lot of code and never have to worry about freeing objects. And
you will see that in the examples provided in this book. But the concept of not having to worry about memory management ever again
is simply untrue-untrue for the .NET languages and untrue for Java.
To demonstrate, let’s say you create an application with GC that is opening up sockets all over the Internet and about ten threads are
running, each in its own little “slice” on the system, activating objects and basically being very, very busy. The algorithms in the
application that work the threads need to create objects, work with them, and then dump them (for whatever reason, they cannot be
reused by the thread). In this case, chances are that you are going to run out of memory just as quickly as you would in the unmanaged
world because the GC cannot clean up after your threads as quickly as you need.
You might think that you could just call Finalize after each object is done with. But, sorry folks, GC algorithms do not work that way.
You see, the finalization of objects in the GC world of managed execution is nondeterministic, which means that you cannot predict
exactly when an object will be cleaned out. Objects aren’t removed chronologically, so those that died earlier than others may end up
getting removed out of order. GCs do not just stop and rush over to do your bidding. Like kids, they don’t come running immediately
when the garbage bag is near bursting.
There is something else you need to think about. Garbage collection can itself be a bottleneck. The boon of not having to set objects
free has this trade-off: The GC is under the control of the CLR, and when the collector stops to take out the garbage, your threads
have to mark time. This means that not only do you have garbage stinking up the place, but your threads get put on hold while the
GC's dumpster pulls up at your back door. So now you no longer have memory leaks to worry about, but you might have "time leaks"
instead.
Before you tear up this book and decide to go into shrimp farming, know this: The CLR allows you some management over the GC.
A collection of GC classes and methods are at your disposal. This does not mean that you can force collection or make the cleanup
deterministic, but it does mean that you can design your applications and algorithms in such a way that you have some degree of
control over resource cleanup.
Here is something else to consider. Just because managed code is garbage-collected does not mean you can ignore application design
and common sense. If you are coding applications that lose or nix objects, the GC is not going to work for you. In fact, you should
return this book to the store (don’t tear it up, though) and go into shrimp farming. Your patterns and design should be using the
objects you create until the application or service shuts down. And objects that have to be removed should be kept to a minimum.
Despite our warnings, the GC is actually very fast. The time you might lose to collection is measured in milliseconds in the life of the
average application on a fast machine. In addition, the GC can be deployed on multiprocessor machines, allowing its threads to be
allocated to one processor while yours run on the other. And because the GC is such an important part of the CLR, you can bet that
Microsoft will often send it back to the workshop for tune-ups, oil changes, tire-rotation, and so on.
Understanding Assemblies
While namespaces can be understood as a logical grouping or encapsulation of classes, the assembly is a “physical” container for at
least one built (compiled) executable or class file or module or some other resource, like an icon. If the assembly is a class library,
then the class or classes it harbors are referenced by the fully qualified namespace (FQNS) name described in the preceding section. If
the assembly is an executable file, an application, you reference it by the name of the physical file, which needs an entry point to
allow the operating system to initiate its execution.
Note Assembly names and namespace names should not be confused. The two names, while often similar and sometimes identical,
have very little to do with each other.
At the physical level an assembly is many things, and the organization of its contents- Microsoft Intermediate Language code (MSIL)
and metadata-is quite complex. While you don’t need to know the ins and outs of the contents of the assembly, you need to fully
understand what an assembly is and how to build it, name it, distribute it, and manage it in order to be effective in your development
efforts. This section will help you achieve that so that you can navigate your software development results and the chapters of this
book more easily. You will understand assemblies better if we separate them into the four types of units that the Visual Basic
compiler can produce:
Console executable This assembly is the standard, GUI-less, console Window that we have been compiling to so far in this
chapter. Console assemblies have the .exe extension. OS entry into the executable is through Main. Console executable code is
not supported in SQL Server.
Windows executable This assembly is the standard .NET Windows executable file. The assemblies are also given the .exe
extension. OS entry into the executable is through WinMain. Windows executable code is not supported in SQL Server.
Class library This assembly is your standard .NET class library, which can be dynamically linked. These assemblies are given
the .dll extension. They can contain one class or many. OS entry into the library is via DLLMain.
Class module This assembly is your standard class module, which is used as a container for compiled classes that still need to
be linked into a project or as part of a formal class library before it can be used. These assemblies are given the .netmodule
extension. No entry into this file is required because entry is via the DLLMain of the assembly it is linked to.
SQL Server's use of the CLR needs only to work with class libraries and class modules that have been specifically tailored to SQL
Server. SQL Server obviously does not need to run Windows GUI applications or console applications. It can be considered
good .NET programming practice to name an assembly such that it describes the purpose and provides a hint of the types inside it and
the purposes of these classes. The System.dll file that ships with the Framework is a good example. However, also naming the
assembly “System” tends to blur the distinction between the assembly name and the namespace name (such as System.Data, which
refers to both the namespace and the assembly name). I think it’s better to give your assembly a name that does not “clash” with the
root namespace name.
Before we discuss the four types of output files further and how they are produced, let’s take a closer look at how assemblies are
located by the runtime, the actual makeup of an assembly, and the roles they play in .NET Framework software development.
Most of the time, the assemblies you create-executable applications, functionality, or resources-reside in a folder you create with an
installation routine or utility. The default location when you are building assemblies is the project folder for Visual Studio.
The assemblies can be stored in the root folder of your application, or in subfolders. You have a lot of flexibility in where you house
your assemblies and how you get them to their folders.
The other location for your assemblies is the Global Assembly Cache, or GAC (pronunciation rhymes with wack). Assemblies placed
into the GAC must be shared and given strong names (described later in this section), so these assemblies would typically be used by
more than one application or user, even concurrently. The concept of “registering” with the GAC is similar to registering with the
registry, just not as fragile a process or as hard to maintain. For SQL Server you might create a folder for assemblies inside the SQL
Server installation directory.
There are ways of overriding the default methods for locating assemblies. You can also redirect the path to an assembly. Assemblies
can also interoperate with the COM and COM+ world and are accessible from unmanaged clients, something you would not typically
do for SQL Server assemblies.
Microsoft suggests keeping assemblies private and thus out of the GAC if they do not need to be shared, which is a good practice for
SQL Server CLR code.
What’s in an Assembly
In the early days of developing for the Microsoft operating systems (usually one of the early shades of Windows), the compilers
produced a file that was compliant with two standards, the Microsoft Portable Executable (PE) format and the Microsoft Common
Object File Format (COFF). The two standards were created to enable the operating system to load and execute your applications, or
link in the DLLs.
The formats specified how the compiled files were laid out, so that the OS found what it expected to find when it executed or loaded
your files. The .NET assemblies adopt the PE/COFF combination to enable the runtime to process your files in the same fashion as
the standard executable files you compile, and this is true on SQL Server CLR as well.
Tip You can't ignore this section if you are in charge of deploying, packaging, or installing assemblies to SQL Server.
Metadata
Assemblies carry metadata so that they can describe themselves to the runtime environment (the CLR). The metadata describes code
and class data, and other information like security. .NET assemblies are not compiled to machine code, like their native brethren, but
rather to MSIL.
Metadata provides us with a simpler programming model than the one we have been accustomed to for so many decades. We no longer
need to work with complex and finicky Interface Definition Language (IDL) files, dozens of cryptic header files that are tedious and
time-consuming to prepare, and external dependencies for code and components alike. This is why a .NET assembly is a no-brainer to run
on SQL Server.
When a .NET (PE) file is executed or loaded, the CLR scans the assembly for the metadata manifest that will allow it to interpret,
process, JIT-compile (down to machine code) and then run the file. The metadata is not only for the benefit of the CLR but it
identifies the assembly-allowing it to describe itself-to the SQL Server .NET environment or Framework, even across process
boundaries.
Figure 11–4 illustrates how the contents of the PE/COFF assembly are assembled (hence the term assembly, which is not a new term
to computer language boffins). While it is convenient to keep calling the .exe files the compiler produces executables, they are not
really executable without the presence of the Common Language Runtime on the computer, an issue that is likely to disappear within
a few years, much as the need to install the Java Virtual Machine became a non-issue.
When you build or compile your file into the PE format, metadata is inserted into one portion of the file while your code is compiled
down to MSIL and inserted into another portion of the file. Everything in the file is described by the metadata that is packed into the
assembly, including inheritance, class code, class members, access restrictions, and so on.
When you execute an application and a class is referenced, the CLR loads the metadata of the respective assembly and then studies
this payload to learn everything it has to know to successfully accommodate the assembly, its resources, and the requests of the
contents.
The metadata comprises three broad descriptions:
Description of the assembly This metadata describes the identity of the assembly, such as name, version, culture, public key,
and so on. It also holds references to types that are exported, the assembly’s dependencies, and security permissions.
Description of the assembly’s types This metadata describes the types in the assembly. The description includes the name, the
visibility of the class, the base class, and any interfaces implemented. It also describes class members, such as methods, data
fields, properties, events, and type composition or nesting.
Description of attributes This metadata describes the additional descriptive modifiers that alter types and their members.
The metadata just described provides a sophisticated mechanism for allowing assemblies to describe themselves to the CLR. In other
words, the metadata includes everything the CLR needs to know about a module and its execution and interaction with other modules
in the CLR. Since assemblies do not require explicit registration with the operating system, application reliability is greatly
increased.
The metadata also facilitates language interoperability and allows component code to be accessed equally by any CLS-compliant
language. You can inherit from classes written in other languages by virtue of the BCL, which is mostly written in C#.
The PE file is divided into a section for metadata and a section for the MSIL code. The metadata section references the MSIL sections
via a collection of tables and heap structures, which point to tokens that are embedded in the MSIL code.
This also means that you cannot change the contents of the assemblies or “fix” the MSIL code without the assembly metadata
knowing about it. This provides a consistent means of checking up on the integrity of the assembly contents, verifying that it has not been
compromised.
The metadata token is a four-byte number that identifies what the token references in the MSIL-a method, a field, and so on.
In addition to the logical types of assembly described earlier, assemblies can be either static or dynamic and private or shared:
Static assembly This assembly is the .NET PE file you create whenever you compile and build a class library or some type of
application. The namespaces we discussed earlier are typically partitioned across such assemblies; a namespace can reside in one
assembly or be partitioned across multiple assemblies.
Dynamic assembly This assembly is a memory-resident module that gets loaded at runtime to provide specific runtime
services. A good example of dynamic assemblies is the Reflection class collection, which allows you to reference and access
runtime type information.
Private assembly This assembly is a static assembly that can only be accessed by a specific application. This assembly is
visible only to the application or other assemblies in its private folder or subfolder.
Shared assembly This assembly is given a unique or strong name and public key data so that it can be uniquely identified by
the CLR. It can be used by any application. A dynamic assembly can also be shared.
Let's now take a closer look at the contents of an assembly, among other things its IL code. The quickest way to do that (besides
reading this book) is to run the IL disassembler that ships with the .NET Framework Software Development Kit (SDK).
The file is called ILDASM.exe; double-click it and the application will load.
Go to File | Open and aim the application at any assembly you might already have created. Let’s first check out the assembly manifest
so that we know what we are looking at.
The manifest is the critical requirement of the assembly because it contains the assembly metadata. However, you can compile an
assembly to MSIL without a manifest to produce a netmodule (see the section on module assemblies later in this chapter). In a
single-file assembly the manifest is stored in that file; in a multifile assembly it can be stored in a stand-alone file.
The assembly manifest's metadata satisfies the CLR's version and security identity requirements, establishes the scope of the
assembly, and provides for the resolution of resources and types. Specifically, the manifest contains the following:
Metadata that identifies the assembly, which includes the name, version number, culture (language and culture), public key,
digital signature, and so on
Metadata that identifies all the files that compose the assembly, as a single file or as many files that form a logical unit
Metadata that provides for the resolution of the assembly’s types, their declarations, and implementations
Metadata that resolves dependencies (other assemblies on which this one depends)
Metadata that allows the assembly to describe itself to the runtime environment
The following is an excerpt of a manifest as displayed by ILDASM:
.module SQLcr.dll
// MVID: {}
.imagebase 0x11000000
.subsystem 0x00000002
.file alignment 512
.corflags 0x00000001
// Image base: 0x03680000
.namespace SQLcr.Ch11
{
  .class /*02000002*/ private auto ansi sealed Welcome
         extends [mscorlib/* 23000001 */]System.Object/* 01000001 */
  {
    .custom /*0C000001:0A000003*/ instance void [Microsoft.VisualBasic/* 23000002 */]
            Microsoft.VisualBasic.Globals/* 01000003 *//StandardModuleAttribute/* 01000004 */
            ::.ctor() /* 0A000003 */ = ( 01 00 00 00 )
    .language '{}',
              '{994B45C4-E6E9-11D2-903F-00C04FA302A1}', '{00000000-0000-0000-0000-000000000000}'
So now you have seen what goes into the assembly and what the manifest achieves. But what does the assembly do for you? Without
getting lost in the minutiae of the Framework, let's investigate the essential roles of an assembly. An assembly is
A type boundary
A unit of deployment
A unit of execution
A version boundary
A security boundary
On the file system the assembly looks like any other dynamic link library and, as discussed earlier, usually goes by the .dll extension,
although it can also be a cabinet file (with the .cab extension).
First of all, you can build a class and make its source code available to any application. But you would mostly do that for your own
use, and maybe for your development team members. However, I don’t suggest you provide “raw” classes to your team members
either, because with access to the actual source code there's no telling what problems can be introduced. You would only supply the
raw source files if your users specifically requested or needed them, as do readers of this book, or if your customers have opted to buy
the source code of your components (usually as a safeguard against your going out of business).
The best examples of assemblies, as mentioned earlier, are the ones that contain the base class libraries that essentially encompass
the .NET Framework. As mentioned earlier, SQL Server uses a subset of these. To compile a class to IL and package it up into an
assembly is very straightforward. You simply build the class and specify to the compiler which assembly you want to put it in and
under what namespace.
Classes (or types, as they are known once they have been reduced to IL) are separated by the assembly in which they reside, which is
why the assembly is known as a type boundary. In other words, two types can be placed in the same namespace yet exist in separate
assemblies. A problem arises when you try to reference both types in the IDE, because you can import only one fully qualified
namespace; the IDE will not let you reference the second class a second time but will report that you have already made the
reference.
The manifest metadata specifies the level of exposure a type and its resources have outside the assembly, the dependencies of the
assembly (other assemblies on which it depends), and how types are resolved and resource requests satisfied.
If the assembly depends on other assemblies that are statically linked to it, then their names and metadata are included in the manifest.
Data such as the referenced assembly’s name, version, and so on are stored in the manifest.
The reference scopes of the types in the assembly are also listed in the manifest. The types can be accessible outside the assembly,
which lets you reference them by their fully qualified namespaces (FQNS), or they can be given friend access, which means they are
hidden from the outside world, accessible only to the types within the same assembly in which the friend resides.
When you execute an application, the application assembly calls into any other assemblies it depends on. These assemblies are
either visible to the application assembly (the .exe file) in the same folder or in subfolders, or they are visible in the runtime
environment because they have been installed in the GAC.
Assemblies installed in the GAC are shared, which exposes them to other assemblies that may need access to their internals. You
might also ship utility classes, culture and localization classes, components, and so on, and these can be installed in the application's
installation folder or into the GAC. These assemblies let you build very thin application assemblies and allow you to keep successive
deployments small, where you just need to swap out the assembly that is outdated.
Also, versioning in .NET lets you or your users install new versions of your assemblies without breaking the assemblies from
previous installations, and so without breaking applications that have already been installed on the system.
The CLR lets all shared assemblies execute side by side or be accessed side by side. What that means is that as long as you create a
shared assembly, with a strong identity and a unique version number, and you register it in the GAC, the CLR will be able to
execute the assembly alongside another version of it. The DLL conflicts of the past are thus abolished under the CLR, because the
version number and unique public key data allow the CLR to distinguish between the assemblies.
You will also likely avoid the problem of a new assembly overwriting an older one, thereby “breaking” the previous installation.
The CLR also has no problem referencing any dependent assemblies, because each assembly's manifest carries all the information
the CLR needs to be sure it is executing or linking in the correct files. This is known as side-by-side execution; the two assemblies
may differ in nothing but their version numbers.
The assembly is the smallest versionable unit in the CLR, which means that the types and other resources it encapsulates are
versioned with the assembly as a unit. A class cannot stand alone and be accessed outside of the assembly architecture because there
is no way to reference it. The class or type can be either part of the application assembly or stand alone in its own assembly, which
provides the version data for it.
The version number is encapsulated in the assembly manifest, as shown earlier. The CLR uses the version number and the assembly’s
public key data to find the exact assembly it needs to execute and any assemblies that may be dependent on the specific version.
In addition, the CLR provides the infrastructure to allow you to enforce specific version rules.
The assembly is a security unit that facilitates access control to the data, type functionality, and resources it encapsulates. As a class
provider, the CLR allows you to control access to your assembly's objects by allowing you to specify a collection of permissions on
an assembly. The client process (rich clients, thin clients, Web forms, or otherwise) must have the permissions you specify in order to
access the objects in the assembly.
This level of security is known as code access security. When an assembly is accessed, the CLR very quickly determines the level of
code access allowed on the assembly. If you have authorization, you get code; if not, you’re history. The idea of controlling code
access is fairly new and in line with the model of distributed functionality that is becoming so widespread. Code access security also
employs a role-based security model, which specifies to the CLR what a client is allowed to do with the code it can access.
The security identifier of an assembly is its strong name, which is discussed in the next section.
Besides client access to assemblies, system resources also require protection from assemblies. The SQL Server CLR security model
secures access to system resources by checking credentials, and proxies of credentials, against the Windows file system's security
architecture.
Strong Names
Assemblies can be given strong names, which will guarantee their uniqueness and provide security attributes. The strong name is
made up of the assembly’s standard name (such as codetimes.sqlserver.system), its version number, culture, public key data, and
digital signature. The strong name is generated from all this data, which is stored in the assembly manifest. If the CLR were to
encounter two assemblies with the same strong name, it would know that the two files are 100 percent identical.
Strong names are issued by Visual Studio and by development tools that ship with the .NET SDK. The idea behind strong names is
mainly to protect the version lineage of an assembly, because the guaranteed uniqueness ensures that no one else can substitute their
assembly for yours, which otherwise would be a major security loophole. In other words, a strong name ensures that no other
assembly, possibly packed with a hostile payload, can masquerade as your assembly.
The strong name also protects your consumers and allows them to use the types and resources of your assemblies with the knowledge
that your assemblies have not been tampered with. This is a built-in integrity check that will allow consumers to trust your code.
Combined with supporting certificates, this offers you the ultimate security system for the protection of enterprise and distributed
code.
Security in the .NET Framework is provided through three mechanisms:
ASP.NET Web application security This mechanism provides the means for controlling access to a Web or Internet site
through authentication. Credentials are compared against the file system or against an XML file that contains lists of authorized
users, authorized roles, and HTTP verbs.
Code access security This mechanism uses permissions to control assembly access to resources and operations. By setting
permissions, you can protect the system from malicious code while at the same time allowing bona fide code to run safely. This
form of evidence-based security is managed by administrators.
Role-based security This mechanism grants an assembly access based on what it, as the impersonator of the user, is
allowed to do. This is determined by user identity, role membership (like the roles you have in SQL Server 2005), or both.
As a SQL Server CLR developer, you need to consider security on a number of levels. You need to determine how your code will run
in the target environment, how it will resist attack, and how you can handle security exceptions that are raised when your code is
blocked.
Note We don’t condone writing assemblies for malicious or hostile use, but nevertheless there are developers out there with less than
amicable intent who will be reviewing the .NET security model to figure out how they can get assemblies onto the .NET runtime.
Tip You can protect your assemblies from invasion through the technique of strong naming or digital signing. If your assemblies are
going to find their way into the public domain, it is recommended that you both sign and strongly name them. A strong name is a
unique name that is generated from the contents of an assembly, such as version numbers, simple names, digital signatures,
culture information, and so on.
You should fully investigate both strong-naming techniques and digital signing of the assembly, which is achieved through public key
encryption technology via the services of a Public Key Infrastructure (PKI), because most Chief Technical Officers (CTOs) are going
to insist on them.
To facilitate interoperability, the .NET classes and types are compliant with the CLS. Any language compiler that targets the CLR can
thus use them. However, as mentioned earlier, not all compilers support the class libraries equally.
The .NET Framework classes provide an extensive array of functionality related to basic I/O, threading, networking, security, data
access, forms, Web services, and so on. You can use the classes and data types to build sophisticated applications and services, and
components or controls that can be simply plugged into any .NET-compliant environment. Later chapters throughout this book delve
into advanced use of the class libraries for stored procedures, triggers, functions, and the like.
You can derive from the .NET classes and extend functionality where permitted, or you can implement in your own code an
interface defined by a runtime-based class.
The .NET Framework types are named using dot notation that denotes a hierarchy or a namespace. This is not unlike the naming
notation used by Java, or the namespaces notation used by the Internet Domain Name System or the Active Directory namespace. For
example, the System.Data.ADO namespace refers to the hierarchy of classes that represent the functionality of the ActiveX Data
Object technology (as well as CLR database objects for SQL Server 2005). To gain access to ADO in .NET, you would reference
System.Data.ADO, where ADO, the last name in the namespace "chain," is the actual class. If you were to reference System.Data,
you would reference not only ADO but also all the other classes in the System.Data namespace. If you do not need any other data
class in your application, you would be wasting a lot of resources compiling in the other resources in the data namespace, such as the
SQL types.
The dot notation syntax has no effect on the visibility and access characteristics of classes and their members. It also has no influence
on inheritance or binding or the interfaces available. In fact, the namespaces can also be partitioned across assemblies, and a single
assembly may contain multiple class namespaces.
It really doesn’t take long to understand the nuances of the CLR and its various components. In fact, most of the developers on your
team writing code targeting the CLR need never really worry about the CLR at all. Instead, you can elect one or two people to be CLR
“diligent.”
CLR specifics-especially the GC, application domains, and security-need to be hashed out in the design and modeling stage. Provide
specific support for exception handling (by delegating the duty of adding security exceptions to your custom exception classes), and
not only will projects come in ahead of schedule, but you can take Fridays off to go sailing or horseback riding.
You are now ready to create some .NET components to run on the SQL Server CLR.
So now that you know how the CLR works and that SQL Server has its own in-process version of it for code execution, what exactly
can you do? The following list describes the database objects that run on the SQL Server CLR:
Stored Procedures
Triggers
User-Defined Functions
Aggregates
User-Defined Types
As mentioned earlier, the functionality to create and deploy these objects is exposed in an assembly called System.Data.dll, which is
part of the .NET Framework. This assembly can be found in the Global Assembly Cache (GAC) as well as in the .NET Framework
directory. When you create the class for any of the objects listed above, you will need to reference the System.Data.dll assembly
(version 2.0 of the Framework), which contains the following namespaces:
System.Data
System.Data.Sql
Microsoft.SqlServer.Server
System.Data.SqlTypes
The following code demonstrates a simple CLR trigger that sends notification when the table has been modified (the class is a
sketch against a hypothetical Orders table; a production version might alert an administrator via Database Mail):
using System;
using System.Data;
using System.Data.Sql;
using Microsoft.SqlServer.Server;
using System.Data.SqlClient;
using System.Data.SqlTypes;
using System.Xml;
using System.Text.RegularExpressions;

public partial class Triggers
{
    // "Orders" is a hypothetical table; a production version might alert
    // the administrator via Database Mail rather than the caller's pipe.
    [SqlTrigger(Name = "NotifyAdmin", Target = "Orders", Event = "FOR INSERT, UPDATE, DELETE")]
    public static void NotifyAdmin()
    {
        SqlContext.Pipe.Send("The Orders table has been modified.");
    }
}
Once you have created a component to run on the CLR, you can either deploy the project using Visual Studio 2005 or manually copy
the assembly to the target server and then load it using T-SQL code in Management Studio, as follows:
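A minimal sketch, assuming the assembly was compiled as SQLcr.dll, copied to C:\Assemblies on the server, and contains the Triggers class shown earlier (all names hypothetical):
CREATE ASSEMBLY SQLcr
FROM 'C:\Assemblies\SQLcr.dll'
WITH PERMISSION_SET = SAFE  -- SAFE is the most restrictive permission set
GO
-- Bind the CLR method to the hypothetical Orders table as a trigger
CREATE TRIGGER NotifyAdmin ON Orders
FOR INSERT, UPDATE, DELETE
AS EXTERNAL NAME SQLcr.Triggers.NotifyAdmin
GO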
If you need to nix the assembly, you can either run the following T-SQL code
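(a minimal sketch, assuming the names used above; the trigger must be dropped before the assembly it references):
DROP TRIGGER NotifyAdmin
GO
DROP ASSEMBLY SQLcr
GO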
or drill down to \\MYSQLSERVE\Databases\MYDB\Programmability\Assemblies in Object Explorer, right-click the assembly, and select Delete.
That’s all there is to creating a CLR database object for SQL Server 2005. The complexity will be in your code, not in the act of
installing the assembly and registering the object.
Going Further
Further discussion of the CLR database objects is beyond the scope of this book, save for a short discussion on CLR stored procedures
and functions in Chapter 14. See the Microsoft SQL Server 2005 Developer’s Guide by Michael and Denielle Otey,
Osborne/McGraw-Hill, 2005, for a more advanced discussion of objects such as UDTs and aggregates.
This book is certainly not the forum for a discussion of data integrity, and this is about as far as I want to venture in discussing
relational database theory. But without exploring some concepts and accepting the only feasible definition of data integrity, you will
not benefit from all the tools and new features that SQL Server 2005 supports with respect to data integrity modeling and
programming.
Data integrity is definitely not a practice, or a discipline, that ensures that data stored in a database is correct, only that it is believable or
plausible. There is no way between this life and the hereafter that SQL Server 2005, or any other RDBMS, can guarantee that data in
a database is correct. Get "correct" out of your vocab now. SQL Server 2005 has no way of knowing, and thus ensuring, that my area
code is not 209 but rather 299, or that my last name is Shapiro and not Schapiro. I have even heard of a girl named Jeffrey. You need
to start thinking, modeling, and programming SQL Server in terms of data plausibility, not in terms of data being right or wrong.
Only if you accept this definition will you be able to use the tools and techniques supported by SQL Server 2005 to ensure the
integrity of your data, and thus its value as an asset to your enterprise. And after you start focusing on integrity in scalar terms and not
correctness in absolute terms, you will have a lot more faith in the data in your database, and you will be able to afford it the trust and
respect it deserves. After all, data that is not plausible or believable is a liability.
As I discussed in Chapter 1, human error caused my wife extreme grief when, after changing medical insurance companies, she was
denied coverage for some time because the last name of her doctor, instead of Shapiro, was entered in the spouse's last name field. To
my wife, the data integrity issue thus became a life-threatening one. To the medical insurance company, the issue almost exploded
into a liability problem.
Consider just a few of the scenarios that can land a questionable last name in a database:
3. The couple just got divorced but agreed to maintain the coverage.
4. A child is covered by a stepfather but still goes by the last name of his or her biological father.
6. The last name is typed incorrectly (Shapiro becomes Ahaoeuei with just a few slips of the finger).
7. The handwriting on the application form is poor, or the last name is omitted and the data entry person makes a wrong
assumption.
This list could go on and on. And I am sure you could come up with dozens of scenarios that would also create questionable data, not
only in last name values but also in many other places. Numbers, for example, present incredible opportunities to enter problematic
data into a database.
But is this a question of integrity? If we accept that we program the DBMS to ensure that the data is as believable as possible, then it
is. If we try to ensure that the data is correct, then it is not. Any value may in fact be correct when it is assumed to be wrong, and it
may in fact be wrong when it is assumed to be correct. The only thing you can do to help ensure that data is believable is to help
ensure that it was believable when it was entered into the database.
The best I can think of doing at the data tier to help ensure that a value, such as the spouse’s last name, is believable is to force the
client to go back and check the data before it can be entered, or to compare the data against known values. It is possible to even refer
the record back to the client and request it to be entered by another user, possibly a supervisor who would take the fact checking to the
next level. Asking Web surfers to fill in application forms over the Internet is a good idea because it cuts out the middle data entry
person, the paper trail, and delay. And it puts the onus of ensuring data plausibility on the client, who is more likely to ensure that
his or her information can be relied upon.
I recently watched a horrifying story on CNN about an American pharmacist who gave a child a fatal overdose of a drug, contrary to
what had been correctly prescribed by the pediatrician. The excuse was human error: failure of the supervisor to double-check
prescriptions while filling hundreds of prescriptions a day. Why, in heaven's name, in this day and age, are pharmacists still using
typewriters and word processors to provide instructions about dosage and administration of dangerous drugs? A database should have
been used to check that the dosage did not exceed safe levels for the drug prescribed. No computer program checked the dosage,
and so a mother sent her child to bed and he never woke up. Now, whenever we buy drugs, we check the label and wonder, "Can we
trust our lives to this data?"
Obviously, the subject of human error is beyond the scope of this book, other than to discuss what possible means we might have of
preventing humans from entering questionable data into a database. Joe Celko touched on the subject in his marvelous book, Joe
Celko’s Data & Databases: Concepts in Practice (New York: Morgan Kaufmann, 1999). In a section titled “Models Versus Reality,”
he talks about errors in models, describing Type I and Type II error levels. A Type I error is accepting as false something that is true,
and a Type II error is accepting as true something that is false.
I agree without equivocation that the subject of errors in models is very important for database people to understand. Generations of
people have been wiped out because of this problem. Sub-Saharan Africa, where I spent my childhood, is going to be wiped out
because of AIDS. This could have been prevented, but the population there still believes, by and large, that AIDS is not sexually
transmitted and that the publicity is just “Western propaganda.” The fraud is in fact self-perpetuating or self-fulfilling, because
millions of Africans still have unprotected sex.
Yes, we can use fancy programming tricks and system features such as triggers and stored procedures to lessen the likelihood of
implausible data; we can even build more advanced human integrity checking into the client applications. How can we avoid
problems like the one just described and still program SQL Server 2005 as wisely as possible? To arrive at a possible solution, let’s
first explore the integrity assurance features and functions of SQL Server 2005. After this discussion, we can redress the last name
integrity issue and offer my medical insurance company some ideas before they get sued.
A check constraint or trigger can easily be used to prevent a customer from spending more than $500 on credit. You might agree that
any number above $500 is risky, but another business might not. To apply this reasoning to the real world, for example,
an airline booking system may be programmed to resist assigning seats to frequent-flier passengers who try to redeem miles toward a
ticket, because the rules dictate that a seat should first be assigned to a cash customer, as opposed to a liability customer. All airlines
maintain seat-assignment rules when it comes to frequent fliers, although the rules vary widely.
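Returning to the $500 rule for a moment, such a rule reduces to a declarative check constraint. A minimal sketch, assuming a hypothetical Orders table with PaymentType and Total columns:
ALTER TABLE Orders
ADD CONSTRAINT CK_Orders_CreditLimit
-- rows paid on credit may not exceed $500
CHECK (PaymentType <> 'credit' OR Total <= 500)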
Another rule, at a lower level than the one just described, would be that all rows in a given table must be unique. This rule is one of
the core tenets of the relational model as defined by C. J. Date. According to Chris Date, one of the world's foremost database experts, the
relational model should not allow for any NULLs or duplicates whatsoever. In fact, Date is outspokenly against NULLs and declares
that they should never have been introduced into relational theory.
The Date rule declares that entity values (column values in a row) should never be NULL (unknown or missing). SQL Server 2005
lets you decide whether to abide by the Date rule or code to your own business rules, which may in certain circumstances allow both
duplicate rows in a table and even NULL values.
Ensuring integrity is very much part of the relational database modeling, whether it is expressed in terms akin to calculus and algebra
or according to Boolean logic or some other form of analysis. But data integrity, or the extent to which you manage it, as alluded to
earlier, is also up to you. And this brings us to the subject of rules, specifically business rules.
Business rules are a hot topic in the new millennium. Yet they are really the abstract declaration of the data integrity requirements
demanded by business owners and enterprise and data analysts. The “frequent fliers get last choice” rule discussed earlier is exactly
the type of business rule about which we are talking.
We can look at this another way. We can say that the data integrity constraint logic is the formal definition of a business rule applied
to corporate data. After all, you can scream about business rules and data integrity until the cows come home, but that will do nothing
to a database that knows only unfettered character-based data, has more duplicates than bottle tops, and can do little about enforcing
integrity (see Chapter 3). You will find, as we move from operational data support to analytical and temporal data support (discussed
in Part III) in SQL Server 2005, that the formulation of business rules becomes more of a requirement than a luxury. Analytical data
comes from operational data, so the more lax your integrity control in the OLTP system, the more effort you will have to expend
when you need to get analysis data scrubbed before it can be copied to the data warehouse. Values like N/A, TBA, or “unknown”
lessen the value in the data mine, and the information extraction becomes extremely time-consuming and expensive.
Figure 12–1 represents this discussion in conceptual terms. At the highest level-that is, the conceptual level-the enterprise and data
analysts formulate rules with the business owners. This is also the requirements formulating level. In the middle is the modeling level
that translates the business rules into database integrity requirements by database analysts and even DBAs. And at the lowest level is
the development model that implements the integrity requirement as constraints, checks, and procedures in SQL Server 2005,
implemented by DBAs and SQL Server developers.
Figure 12–1: Modeling the database for integrity and adherence to business rules
Now that our philosophical (and emotional) banks are charged, we can look at the level you are probably most interested in:
implementing the integrity requirements. To do so, we must classify integrity into several governing sections as follows:
Referential integrity
Entity integrity
Type integrity
Domain integrity
Transition integrity
Transaction integrity
Database and table constraint mechanisms are the broadest form of integrity control because they relate to the relationships between
entities, multiple columns in a table, and multiple tables in a database. A good example of a database integrity violation is allowing a
customer with bad credit to buy more goods and to put the order on account, or even allowing payment to be made by anything other
than cash, letter of credit, or a credit card. The following code, a sketch against hypothetical Customers and Customer_Credit tables, would thus enforce the rules of database integrity for the given
example:
-- look up the customer's credit rating (the tables and rating scale are hypothetical)
DECLARE @CustomerID int, @credit int
SET @CustomerID = 1
SELECT @credit = Credit FROM Customer_Credit WHERE CustomerID = @CustomerID
IF @credit < 10
BEGIN
    RAISERROR ('Customer must pay with cash or credit card', 10, 1)
    GOTO Cash
END
ELSE
BEGIN
    RAISERROR ('Customer is cleared for credit', 10, 1)
    GOTO Account
END
Cash:    -- handle the cash or credit card sale here
    RETURN
Account: -- post the order to the customer's account here
    RETURN
Another table-level integrity violation is the case of the orphaned row. If we delete an entity related to one or more
entities through primary-foreign key relationships, we are deemed (to use a legal expression) to have violated the rules of referential
integrity. Referential integrity rules should be adhered to in every database deployment, which is why we devote a separate section to
this topic.
Referential Integrity
By deleting a member row that is referenced by other entities or rows, we in fact leave the remaining undeleted entities orphaned and
the record in tatters. In fact, the database is now in a poor state because no easy means now exists of identifying or finding the
orphaned entities and removing the entire record. And if you have data in a database that can never be accessed or relied upon, the
entire database becomes questionable.
I liken referential integrity violations to the habit some people have of not finishing an apple. If you cut the apple, take out a wedge, or
only eat half of it, the remaining fruit darkens and quickly goes bad. So it is when you delete a row and violate referential integrity:
over time the database goes bad also. As mentioned in the chapters in Part I, if you regard all the rows in linked tables as combining to
represent a complete record, deleting a row in one table and leaving the related rows intact is akin to taking a bite out of an apple and
then leaving it to go brown and rot.
On the other hand, referential integrity violation can also be said to have occurred when an entity references a nonexistent entity in the
database. Referential integrity requires that there be no references to nonexistent values and that if a key value changes, all references
to it change consistently throughout the database.
In the past, referential integrity was maintained using triggers and many lines of code; we call this approach procedural
referential integrity (PRI). Most modern DBMS products now support declarative referential integrity (DRI), which is essentially the
opposite of PRI. The two approaches compare as follows:
DRI The integrity criteria are defined in the definitions of the various database objects, such as the existence of a foreign key (FK),
and built-in mechanisms ensure or maintain the integrity.
PRI The integrity criteria are defined and enforced in your T-SQL code. The constraints are implemented in triggers and stored
procedures (see Chapters 13 and 14), or in other manual T-SQL constraints, and the responsibility rests with the database modeler
or developer.
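A foreign key declaration is DRI in action. A minimal sketch, assuming hypothetical Customers and Orders tables keyed on CustomerID:
ALTER TABLE Orders
ADD CONSTRAINT FK_Orders_Customers
FOREIGN KEY (CustomerID) REFERENCES Customers (CustomerID)
-- A delete of a customer that would orphan Orders rows is now rejected;
-- ON DELETE CASCADE could instead remove the dependent rows automatically.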
Entity Integrity
Entity constraints, also known as relational constraints, define rules that govern rows and columns in tables and the tables as units.
Uniqueness is one of the primary entity integrity rules to ensure. Of course, you can maintain a database that enforces no uniqueness
whatsoever, but that would certainly render a table devoid of any integrity. For example, you can create and populate a table of orders
and insert duplicate orders into the table willy-nilly. Your reasoning might be that, since customers can make duplicate orders, these
duplicate orders should be stored without any constraints.
But is there such a thing as a duplicate order? Only in error, I believe, and a constraint should be designed to catch such an error. Each
order is placed at a certain time and date. No order can be entered twice at exactly the same time. The description, quantity, price, and
discount of an item might be duplicated, but the time the entry was made cannot. Besides, each order is a new transaction that differs
from another transaction only in the time it was entered and the order number assigned to it. It is in fact unique.
Caution Although highly unlikely, it is possible for two users to enter the same record or data at exactly the same time. An identity
column and table locks would prevent the two rows from competing on the question of their uniqueness.
The tools we use to constrain entities are the primary keys (on identified columns), which can impose primary indexes, UNIQUE
constraints, IDENTITY properties, and so forth. Table constraints are checked before any data is allowed into the table. If the
data violates a constraint rule (for instance, if it contains a duplicate row or a duplicate value on a column over which we have installed
a primary index), it is rejected.
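A minimal sketch of these entity constraints, assuming a hypothetical Orders table:
CREATE TABLE Orders (
   OrderID    int IDENTITY(1,1) NOT NULL,
   CustomerID int NOT NULL,
   EnteredAt  datetime NOT NULL,
   CONSTRAINT PK_Orders PRIMARY KEY (OrderID),
   -- no two orders from the same customer at exactly the same instant
   CONSTRAINT UQ_Orders_Entry UNIQUE (CustomerID, EnteredAt)
)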
Type Integrity
Type constraint discussions generally cover domains, although data types and data domains are two separate concepts, and that is
gospel. I have thus added domain constraints to the list, and we discuss domain integrity next. A type constraint enforces rules that
govern correct use of the data type of a given value. Ensuring the consistent use of a date data type, for example, is a form of type
constraint. Dates, decimals, currency, and the like are data types recorded in several formats that govern precision, but the data will
become impossible to work with if such scalar values are not constrained to just one of the several formats required by your rules.
Assigning NULL values is another form of type constraint, even though we talk about NULL with respect to database and entity
constraints. By allowing NULL values in an attribute (field), you are explicitly allowing the storage of missing or unknown values. If you
disavow NULL values, you must either supply a value through some formula or supply a default value for the column.
A NULL value, for that matter, is an oxymoron in a manner of speaking. You might ask yourself how a NULL value can be a value
and replace something that is missing or unknown. It is, however, a convenient means of allowing a row to be inserted into the
database even if it contains missing or unknown values. If your rules allow NULLs to be a temporary expedient, in they go. If not, out
they go.
The rules of integrity should guide your use of the NULL. Ask yourself if your data is reliable if it is made up of missing or unknown
values. If not knowing middle initials will not break your data, then the NULL represents a convenient placeholder until the initial
becomes available. Inserting a default value would not be feasible, because you would be inserting a Type II error that could spell
disaster: accepting as true something that may be false.
In this regard, you might consider NULL values to be a violation of database integrity as well. My rule allows NULL if it does not
violate the integrity of my data or adversely affect the analytical value of the data. If it does, I must install a default value and code the
logic, based on the business rule, that obtains the default. If you do not have the data for a value for a given row, you should instead
assign a default value or obtain a value through an alternate formula. There is a lot of debate surrounding NULL. Nevertheless, SQL
Server 2005 permits you the flexibility of either allowing or disallowing NULL values according to your needs…or ignorance.
Type constraints are defined in code and in the definitions of the data types, such as specifying NULL (allowed) or NOT NULL
(disallowed). Integrity is ensured with automatic checks and procedural code.
Domain Integrity
Data domains are the logical grouping of values. For example, in my items table I have a color column that can take only one of five
colors. Notice the emphasis on only because that is the focus of my domain constraints. The domain rule for the item dictates that it
can only be red, blue, mustard, lime, or black. You might argue that mustard and lime belong to the domain flavors, and if you do,
you have grasped the concept of a data domain, and you have joined the debate.
I tend not to agree that domain constraints govern data types; that is the work of type constraints. However, the check and constraint
mechanisms might be the same for both, and many domain constraints have been cast along data type lines. For example, it is very
convenient to constrain a numeric value by the integer type. Domains can be represented with data types, which is what leads to
the incorrect definition that domains are data types, period. Although a domain is often implemented over a data type, the definition
of a domain should refer to a logical collection or grouping of data values and entities, not data types. As long as you understand the
difference, you'll be okay.
Besides checks, you can use stored procedures and triggers to ensure domain integrity. For example, a stored procedure can populate a
list or table at the client with allowable values from which to choose.
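The five-color domain just described reduces to a simple check constraint. A minimal sketch, assuming a hypothetical Items table:
ALTER TABLE Items
ADD CONSTRAINT CK_Items_Color
CHECK (Color IN ('red', 'blue', 'mustard', 'lime', 'black'))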
Transition Integrity
One of my clients has a simple rule that was more important to the IS managers than anything in the database: “Call center agents are
only allowed to take an order and receive payment.” The agent is not allowed access to any functions in the client application that
debits items from inventory or causes a picker in the warehouse to go pack the items to send to shipping. Only a second-level stock
manager is allowed to do that. My client manages its stock levels like a squirrel manages its acorns. Only stock managers can produce
the pick and shipping data that will translate the order into a shippable collection of items. That was the business rule; I did not
question it, I only implemented it.
Rules such as these are translated into what we call transition constraints. These constraints ensure that the status of records changes
according to predetermined rules. Inventory levels and accounting databases need to adhere to strict transition constraints. For
example, you should never credit to one table without debiting from another. Inventory cannot be debited if shipping is not credited.
There are several levels on which you can define or specify transition states. In most order-taking databases, these can be defined as
follows:
1. Order entered.
3. Item back-ordered.
6. Items shipped.
7. Obligation completed.
This list relates to the various transitions in a database. For example, an order changes from entered to taken only when either credit is
approved or the items have been paid for. In other words, if transition integrity is maintained or enforced on the database, then an
order entered can only be considered a de facto liability (the company owes the client the items) if money has changed hands or the
customer is in good credit standing. A check on cash or credit will allow the order to go from a state of entered to a state of taken.
Some companies do not consider an obligation completed until the order is on the road to the client. Only then do they actually debit
the credit card.
Items also move through various states. An AllowBackorders constraint, for example, can enforce a rule that either allows or disallows
part of an order from being back-ordered; a customer might request that the order not ship until every part in the shipment is
available for immediate delivery.
The aforementioned client also has another very important business rule. The items cannot be shipped and the software cannot
produce the shipping label unless the shipping department has called the client and obtained a verbal agreement to accept the order. If
the client agrees, only at that instant will the credit card be run or a debit applied to the account.
My client advised me that the main reason for this rule is that about 95 percent of the shipments rejected by customers come from
customers paying on credit accounts or credit cards. The customer changes his or her mind after placing the order and then refuses the
delivery. My client then has to eat the loss on the shipping costs (often UPS Red or FedEx delivery) because the shipper has fulfilled
its obligation.
Transaction Integrity
A transaction in the sense described here is a collection of operations on data that must be completed as a unit according to business
rules, or else the transaction is completely canceled. In this regard the transaction constraint is similar to the formal transaction monitoring
in which SQL Server engages automatically and thus displays elements of atomicity (see Chapter 16). Transaction integrity in the
preceding discussion is more a procedural integrity mechanism (something you usually have to code a solution for) that ensures that
all of the components required for a complete transaction are entered and satisfy all of the preceding integrity rules before the
transaction is committed.
Such transactions can, however, be applied over long periods, depending on the business needs and rules. Transactions can also
happen over a short period, and several states can make up several transactions. For example, an order might be broken up into several
transactions, one for each state an order is in.
Planning Integrity
In Chapter 6 we tackled the CREATE TABLE statement, but the primary objective of the methods discussed and the code
demonstrated was to create tables, not to demonstrate the installation of integrity mechanisms. In this chapter, we go a step further and
code the formal or declarative integrity constraint definitions into the CREATE TABLE and ALTER TABLE statements, which give
you the following capabilities (a combined sketch follows this list):
The capability to code check constraint expressions that evaluate the values in the SQL statement to determine if they are
allowed to be saved to the column
The capability to code referential integrity constraints (cross-referencing foreign key columns)
The capability to declare primary and unique keys that ensure uniqueness in a column or through the combination of columns
The capability to use triggers as a form of constraint, which can also provide a trans- or intra-database integrity checking
facility. Triggers are covered in the next chapter.
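Here is the combined sketch promised above, pulling the declarative mechanisms together in one statement (all names are hypothetical):
CREATE TABLE OrderItems (
   OrderItemID int IDENTITY(1,1) CONSTRAINT PK_OrderItems PRIMARY KEY,
   OrderID     int NOT NULL
               CONSTRAINT FK_OrderItems_Orders REFERENCES Orders (OrderID),
   Quantity    int NOT NULL CONSTRAINT CK_OrderItems_Qty CHECK (Quantity > 0)
)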
When you first create a table, you are in fact laying down the first integrity constraints in your database because you create columns
that have different data types for different data. For example, you would not store a noncharacter value in a character data type, and
you would not try to store a character in an integer data type.
When you design a database, one of the basic rules to follow is to use common sense. If you are storing integer data, then store it in
integer columns; store dates in date columns, characters in character columns, and so forth. Naturally you will come to situations that will
require you to decide between variations of a data type (small integers, integers, or big integers), or to make decisions that rely on the
precision and scale of data, such as values of type real or float, currency, and so on. Other times, your choice of data type will be
related to storage requirements, system resources, and so forth. An example would be deciding to switch to the bigint data type
because of the need to store very large numbers. But you would not incur the storage overhead of a big integer (bigint) if you were
storing nothing larger than 99.
The format of the data being stored is also a consideration, because modern database systems do not store only characters and
numerals any more. They also store binary information, images, objects, bitmaps, large amounts of text, and so forth. And SQL Server
2005 also allows us to build our own user-defined data types, which are discussed in Chapter 15.
The integrity plan is one of the most important sections of your overall database definition, model, or architecture. You can use the
flow diagram in Figure 12–2 to build your integrity plan.
This section of the integrity plan identifies business rules that will impact the database model and architecture. These rules, as
discussed earlier, differ from the integrity issues that are built in to the relational model from the get-go, such as referential integrity.
Sit down with the people responsible for establishing the business rules, or provide the facilities in your model and code for later easy
incorporation of constraints to cater to the business rules.
These issues will be referenced in the architecture or will become patently part of it. For example, you might want to list attributes of
the data that have legal implications if they render the data questionable. It is also important to plan for the prospect of warehousing
the data, and constraining data in the operational databases with analysis in mind will make it easier to transform the data at a later
stage.
This step is closely related to the preceding step, and they can be combined. Here you should list the precise integrity needs of the
databases and tables right down to the values required.
You could list each table and link it to a list of integrity constraints required on its internals. For example, you will specify referential
integrity requirements with linked tables, the primary key column, type constraints, and so on.
This section covers how you propose to deal with each integrity requirement. For example, do you plan to create unique constraints
directly in T-SQL code or interactively in Management Studio (or possibly through the DMO object library)? Here you would also
determine which integrity mechanism would be most suited for the task. For example, would a check constraint work or would you
have to code a trigger?
Also document and establish procedures for maintaining and revising the constraints. If constraints are implemented in code, then you
need to maintain source, version control, and so forth. You will also need to manage access, permissions to the constraint objects, and
so on.
Do Cost Analysis
There is a cost attached to every constraint. The costs are both direct, in terms of their consumption of SQL Server resources, and
indirect, in terms of the costs of programming, maintenance, documentation, and so on. For example, using a stored procedure to
check integrity is more expensive in all terms than using a trigger (although there are situations that only a trigger caters to). Using a
trigger, on the other hand, is more expensive than using a built-in cascade. And default or constraint objects are more expensive than
check constraints, and so on.
When performing cost analysis of constraints, you should also consider adherence to standards as a factor. For example, you should
use built-in constraints because they are based on ANSI compliance and also conform to relational database rules.
Triggers, on the other hand, can be used to meet ANSI recommendations, but since triggers are entirely procedural, they cannot be
considered ANSI-compliant. SQL Server can check code for errors, but it is not able to check trigger code for standards compliance.
In fact, even if trigger code is syntactically correct, it might still nuke your data beyond recovery.
When integrity is violated or a constraint traps problem data, SQL Server will report errors. You need to determine how best to check
these errors and establish procedures for using the error logs to trap faults in design and development, data entry, or business-rule
integration. In short, error checking and tracking can be used to reduce errors down the road.
As discussed in Chapter 11, we don’t want to build a client tier that collapses every time we make changes in the data tier, which is
why we move the data processing to the data tier and code just about everything in triggers, UDFs, stored procedures, and functions.
If you plan well and manage the database model and architecture properly, the constraints in the data tier should have little impact on
the clients. However, you will have a lot to think about if you need to maintain legacy client code that still maintains a lot of client-
side data processing and logic.
It is also possible to code constraint or integrity checking in the client tier without adversely affecting development costs and
management. For example, it is more useful in terms of server resource conservation to check for format or type and domain integrity
violations before the data gets transmitted to the server. A mask over the telephone number in a client application, even a Web
browser, will obviate the need for SQL Server to bounce back with an integrity violation that incurs an additional round trip (see the
section “Check Constraints” later in this chapter).
Default Values
The default value is used in a column that may or may not forbid NULL values. For example, a column of type integer might take a 0
as the default, while a character-based column might take “unknown” as a default value-or even something like “pending.”
Note A default value is not really a constraint per se because it does not restrict you from entering a certain value; it only kicks in
when you do not provide a value.
Default values do not just happen automatically when you insert new rows in a table. As demonstrated in Chapter 16, you must
explicitly tell SQL Server to apply the default value when performing a row insert. SQL Server can also create the row in the INSERT
statement and apply all the default values for the row (in which case you would not specify columns in the INSERT statement). This
is achieved using the DEFAULT VALUES clause in your INSERT statement.
The following code demonstrates the provisioning of a default value in the CREATE TABLE statement (a minimal sketch using the NewOrders table and custName column that appear later in this section):
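CREATE TABLE NewOrders (
   OrderID   int IDENTITY(1,1) PRIMARY KEY,
   custName  varchar(50) NOT NULL DEFAULT 'unknown',
   OrderDate datetime NOT NULL DEFAULT GETDATE()
)
-- Apply every default in a single insert:
INSERT INTO NewOrders DEFAULT VALUES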
You are not limited in terms of T-SQL code as to how you can concoct a default value, as long as the default value does not violate
the data type. For example, you can use system-supplied values, such as those that are returned from built-in functions, to provide a
default value. The following code automatically provides information on the user who inserted the row:
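-- a sketch; SUSER_SNAME() returns the login of the connection performing
-- the insert (the insertedBy column is hypothetical)
ALTER TABLE NewOrders
ADD insertedBy nvarchar(128) NOT NULL DEFAULT SUSER_SNAME()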
However, changing the default value is another matter altogether when you need to do things in T-SQL code, because you can’t easily
reference the default name, at least not according to SQL-92 or Microsoft specifications. The CREATE TABLE statement also
provides no facility for providing a custom name that would be easy to reference in a script. SQL Server, on the other hand, provides a
name for the default property, but it too cannot be referenced easily from front-end applications or T-SQL scripts.
So you cannot willy-nilly change the default when the MIS manager walks in and asks you to do so. You have to first delete it and then
recreate it. Deleting the default can be achieved using the following ALTER TABLE statement:
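-- a sketch; DF__NewOrders__custN__1ABEEF0C stands in for the name SQL Server
-- generated automatically (look up the actual name in your own database)
ALTER TABLE NewOrders
DROP CONSTRAINT [DF__NewOrders__custN__1ABEEF0C]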
The constraint name in the preceding code was provided automatically by SQL Server, so you can see how difficult it is to reference
it. This code works, but I had to open a query window in Management Studio to look it up and then script it out to a new query
window.
This is a real pain. A better method is to look up the default name in the system tables. After looking for the information in the system
tables and tinkering around with QA for an hour, I arrived at the following code to delete the default programmatically:
USE Customers -- the database holding the table (yours will differ)
DECLARE @dfltname varchar(100), @cmd varchar(1000)
SET @dfltname =
   (SELECT name FROM sysobjects sysobs
    JOIN sysconstraints scons ON sysobs.id = scons.constid
    WHERE object_name(sysobs.parent_obj) = 'NewOrders'
    AND sysobs.xtype = 'D' AND scons.colid =
      (SELECT colid FROM syscolumns WHERE id = object_id('dbo.NewOrders')
       AND name = 'custName'))
SET @cmd = 'ALTER TABLE dbo.NewOrders DROP CONSTRAINT [' + @dfltname + ']'
EXEC (@cmd)
This is by no means an easy statement to comprehend, especially if you are new to T-SQL, so you might want to come back to it after
you have gone through the next couple of chapters, which discuss SELECT, JOIN, aliases, EXEC, and stored procedures. In Chapter
14, I put the code in a stored procedure so that it’s two parameters away from me when I need it.
As an alternative, you can create a default object (or several objects), install a default value as the object’s property, and then bind and
unbind the object to the column as needed. There are, however, a number of problems associated with default objects:
They are not ANSI-compliant. The default objects have been dragged up through the ages from the romance between Microsoft
and Sybase.
The code to create the default object (CREATE DEFAULT) and the ensuing execution of system stored procedures
sp_bindefault and sp_unbindefault is tedious.
They are not true SQL DDL and as such are not as efficient as ANSI default constraints.
You will be limited in the expression you use to provide a default value. As demonstrated earlier, a T-SQL ANSI default value
can be derived from a sophisticated query/calculation providing a unique value each time the default is invoked. This makes the
ANSI default more powerful by an order of magnitude.
Managing defaults can drive you nuts. If you need to drop a column that has a default object fused to it, you first need to unbind
the default from the column. You cannot simply delete the default either, because it might be bound to other columns.
As I am sure you are aware, you can create and manage tables from Management Studio, which essentially provides you with an
interactive, visual means of applying integrity to databases and tables.
You can create the default in Management Studio by opening the table in the Design Table console and entering a default value in the
Default Value or Binding property for the selected column. You can easily change the defaults in this manner. Adding or changing a default value in the
console is illustrated in Figure 12–3. To add or change a default interactively, take the following steps:
1. Drill down to the database in question and expand the tree to access the table node. Expand the table node.
2. Select the table, and expand it further to expose the tree of columns. Right-click the target column and choose Modify. The
properties of the column open in edit mode in the details pane, giving you access to the column properties (see Figure 12–3).
3. Enter the value in the Default Value or Binding option (under General). Close the dialog box to save the new
settings.
The SMO object model provides similar access to the default property, but you will encounter the same difficulty in accessing the
default objects in code. The following points about default constraints should be taken into account:
You can obviously have only one default per column; there is no way of specifying a member of a default collection.
Check Constraints
The check constraint is useful because a single constraint definition can be applied to more than one column. You can also define as
many check constraints as you need. For example, if you want to ensure that data for a telephone number is inserted or updated as
(XXX) XXX-XXXX, you can create a check to prevent a value of any other form from being committed to the database.
The first action to take in defining checks is to list the check requirements and definitions in your integrity plan as described earlier. A
check constraints list might contain the following items:
You can attach a check constraint in a database diagram (see Chapter 11) or the table designer as follows:
2. Go to the toolbar and select the Manage Check Constraints button (the last one). Note the three buttons at the far right of the
toolbox on this console: These relate to the application of indexing, integrity, and constraints. Each button represents a different
dialog box.
4. Click Add to create a new check constraint. Enter the check code in the constraint expression window. (You cannot check the
expression syntax in this dialog box, and you will not be able to close the dialog, and thus save the check expression, if the code
is incorrect. You might consider building the code in a query window first and testing it as T-SQL script, which for many
architects, including me, is much slicker than fiddling in dialog boxes.) If the expression works, click the Save button on the
toolbar and close the table designer.
If you were just setting out to create the preceding Orders table, you might script in the constraint at the same time. The following
creates the Orders table and then applies the constraint at the table level:
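-- a sketch; the column names are illustrative, and CK_states is the
-- constraint discussed next
CREATE TABLE Orders
   (OrderID int IDENTITY(1,1) NOT NULL,
    ShipState char(2) NOT NULL,
    CONSTRAINT CK_states CHECK (ShipState NOT IN ('HI', 'AK')))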
The T-SQL syntax is rich, and so you can code sophisticated expressions for the preceding checks and the other constraint objects.
The preceding code added another U.S. state to check for in the constraint CK_states demonstrated earlier. The code obviates the need
to add a second check for Alaska to keep it out of the UPS ground column in my table. Instead I used a comma-separated list, and the
expression tells SQL Server that neither Hawaii nor Alaska is allowed in the column. I repeated the list technique in the ALTER
TABLE statement as follows:
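-- a sketch, assuming the Orders table above; a check cannot be altered
-- in place, so it is dropped and re-added with the new list
ALTER TABLE Orders DROP CONSTRAINT CK_states
ALTER TABLE Orders
ADD CONSTRAINT CK_states CHECK (ShipState NOT IN ('HI', 'AK'))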
The following points about check constraints should also be taken into account:
They can reference more than one column in a table, in the check expression.
They cannot be used on identity columns, timestamp columns, and unique identifier columns.
You can attach as many constraints as you like to a column or collection of columns.
You do not necessarily have to include the keyword CONSTRAINT in your code, but I would do so for clarity.
You do not need to provide a name for the constraint. Knowing SQL Server’s affinity for inventing names that look and sound
like ancient Greek, however, I prefer to use a naming convention that makes it easier to read the code and document our team’s
solutions.
Tip Check constraints enforce domain integrity and I recommend you install them whenever necessary to keep garbage out of the
database and ensure the integrity of the database. Using check constraints without regard for the client environment is a bad
idea, especially in Internet applications where people connect through their Web browsers, because all manner of badly
formatted strings, formats, and contradictory values will come flying at SQL Server, causing it to balk and flame the client, which
causes network roundtrips and a degradation of resources. You should use client or middle-tier constraint mechanisms to avoid
this, such as masked edit fields, allowable values lists, and so on wherever you can, leaving SQL Server as the last line of defense.
Foreign Keys
If you examine the T-SQL CREATE TABLE and ALTER TABLE syntax, you will see that you can create the foreign key (FK)
constraints to enforce and ensure referential integrity and cascading updates when you create the table. Also note that they can be
created and managed when you alter tables and their columns.
The foreign key must reference a primary key or a unique key in the partner tables.
Permissions apply, so you need to ensure that users have SELECT and DRI permissions enabled on the tables (this is discussed
in detail in Chapter 16, with code examples).
The following code adds a foreign key constraint to the ShipTo table in my Customers database, which I called
“FK_ShipTo_CustDetails,” and the constraint references the CustDetails table:
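-- a sketch; the custID column name is an assumption
ALTER TABLE ShipTo
ADD CONSTRAINT FK_ShipTo_CustDetails
FOREIGN KEY (custID) REFERENCES CustDetails (custID)
ON DELETE CASCADE
ON UPDATE CASCADE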
In this code, when I update or delete the row in the CustDetails table, the constraint ensures the ShipTo table’s corresponding row is
likewise deleted (referential integrity) or updated.
1. Select Modify and then click the Relationships button on the toolbar. The Relationships dialog box loads, as shown in Figure
12–5.
2. Select the corresponding columns that represent the primary and foreign keys.
4. Change the name of the constraint, if you need to, and then close the dialog box to commit the changes to the table.
As discussed earlier, entity integrity ensures that rows in a table are unique. The tools to use to enforce uniqueness are the primary key
constraints (PK), the unique key constraints (UK), and the identity property (discussed in Chapter 10). The primary and unique keys
are also referenced in referential integrity constraints.
The primary key was discussed in Chapter 3 and again in Chapter 10, so I will not go over it again here.
If you examine the CREATE TABLE and ALTER TABLE syntax, you will see that you can create the primary key when you create
the table and that it can be created and managed when you alter the table and its columns.
When working with the ALTER TABLE statement, you need to remember that you can only have one primary key in a table. In other
words, there can be only one primary key constraint per table, although the key itself may comprise more than one column.
While you can modify the primary key interactively with the graphical tools, such as Management Studio, you have to delete the key
and then recreate it in T-SQL. The ALTER TABLE statement can only be used to drop or add a primary key to the table, not to alter
the key itself.
Also, when adding the key, remember that the target column must have no duplicate data, nor can it accept NULL values. While
NULL values are hardly duplicates, SQL Server doesn’t see it that way, because it cannot reference a unique value if the value is
technically missing.
Note Chapter 10 also looks at the CREATE TABLE statement in more depth and highlights the differences between table-level
definitions and column-level definitions.
To add a primary key constraint, open the Design Table dialog box and take the following steps (remember you can also do this in a
database diagram as discussed in Chapter 10):
1. Select Modify and click the Manage Indexes And Keys button on the toolbar. The Indexes/Keys dialog box opens, as illustrated
in Figure 12–6.
2. Select the column name and sort order for the key.
3. Select the options to be used by SQL Server on the primary key clustering.
Unique keys, or unique key constraints, are very similar in function to primary keys, but the difference is that the unique key can be
used to generate a nonclustered index and can live in a table that already has a primary key installed on another column. The unique
key can be created in T-SQL and interactively (or the SMO object model).
ROWGUIDCOL This argument indicates that the new column can hold globally unique identifiers. It is a constraint because
only one such uniqueidentifier column (set as such) per table can be designated as the ROWGUIDCOL column.
The ROWGUIDCOL property, however, does not enforce uniqueness of the values stored in the column. It also does not
automatically generate values for new rows inserted into the table. To generate unique values for each column, either use the
NEWID function on INSERT statements or use the NEWID function as the default for the column.
CONSTRAINT This is an optional keyword indicating the beginning of a PRIMARY KEY, NOT NULL, UNIQUE,
FOREIGN KEY, or CHECK constraint definition.
NULL | NOT NULL These are keywords that determine if null values are allowed in the column. NULL is not strictly a
constraint but can be specified in the same manner as NOT NULL, which is a constraint.
PRIMARY KEY This is a constraint that enforces entity integrity for a given column or columns through a unique index.
Only one PRIMARY KEY constraint can be created per table.
UNIQUE This is a constraint that provides entity integrity for a given column or columns through a unique index. A table can
have multiple UNIQUE constraints.
CLUSTERED | NONCLUSTERED These are keywords to indicate that a clustered or nonclustered index is created for the
PRIMARY KEY or UNIQUE constraint. PRIMARY KEY constraints default to CLUSTERED, and UNIQUE constraints
default to NONCLUSTERED.
You can specify CLUSTERED for only one constraint in a CREATE TABLE statement. If you specify CLUSTERED for a
UNIQUE constraint and also specify a PRIMARY KEY constraint, the PRIMARY KEY defaults to NONCLUSTERED.
FOREIGN KEY…REFERENCES These are constraints that provide referential integrity for the data in the column or
columns. FOREIGN KEY constraints require that each value in the column exist in the corresponding referenced column(s) in
the referenced table. FOREIGN KEY constraints can reference only columns that are PRIMARY KEY or UNIQUE constraints
in the referenced table.
ON DELETE {CASCADE | NO ACTION} These arguments specify what action takes place in a row in the table created, if
that row has a referential relationship and the referenced row is deleted from the parent table. The default is NO ACTION.
If CASCADE is specified, a row is deleted from the referencing table if that row is deleted from the parent table. If NO
ACTION is specified, SQL Server raises an error and the delete action on the row in the parent table is rolled back (see Chapter
16, where this constraint is further discussed).
On the other hand, if NO ACTION is specified, SQL Server raises an error and rolls back the delete action on the Customers
row if there is at least one row in the Orders table that references it.
ON UPDATE {CASCADE | NO ACTION} This argument specifies what action takes place in a row in the table created, if
that row has a referential relationship and the referenced row is updated in the parent table. The default is NO ACTION.
If CASCADE is specified, the row is updated in the referencing table if that row is updated in the parent table. If NO ACTION
is specified, SQL Server raises an error and the update action on the row in the parent table is rolled back.
If NO ACTION is specified, SQL Server raises an error and rolls back the update action on the Customers row if there is at
least one row in the Orders table that references it.
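For example, a minimal sketch of the Customers/Orders relationship referred to here (the column names are assumptions):
CREATE TABLE Orders
   (OrderID int PRIMARY KEY,
    CustomerID int NOT NULL
       REFERENCES Customers (CustomerID)
       ON DELETE NO ACTION
       ON UPDATE CASCADE)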
CHECK This is a constraint that enforces domain integrity by limiting the possible values that can be entered into a column or
columns.
NOT FOR REPLICATION This is not really a constraint but an “anti-constraint” that is important for integrity
consideration. The argument is used to prevent the CHECK constraint from being enforced during the distribution process used
by replication. When tables are subscribers to a replication publication, do not update the subscription table directly; instead,
update the publishing table and let replication distribute the data back to the subscribing table.
A CHECK constraint can be defined on the subscription table to prevent users from modifying it. Unless the NOT FOR
REPLICATION clause is added, however, the CHECK constraint also prevents the replication process from distributing
modifications from the publishing table to the subscribing table. The NOT FOR REPLICATION clause means the constraint is
enforced on user modifications but not on the replication process.
The NOT FOR REPLICATION CHECK constraint is applied to both the before and after images of an updated record to prevent
records from being added to or deleted from the replicated range. All deletes and inserts are checked; if they fall within the replicated
range, they are rejected.
When this constraint is used with an identity column, SQL Server does not reseed the identity column values when the
replication process updates the identity column. See also Chapter 8 for more specifics regarding replication between SQL Server
instances.
Alias types are easier to implement because they are an abstraction of already existing system types. Their integrity utility becomes
apparent when you need to store the same type of data in a column in a number of tables and you need to be certain that the columns
across all the tables maintain the identical data type, length, and nullability.
USE SCRATCH
GO
CREATE TYPE dbo.Telephone
FROM nvarchar(10) NOT NULL;
You can now use the type "Telephone" in your application, and the type will conform to the system type, length, and nullability of the
underlying specification…and it will inherit any defaults and so on that you later bind to the alias. Using Management
Studio, the type will appear in your selection of Data Types when you add columns on the Modify option. Or you can use the alias
type in T-SQL as follows:
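-- a sketch; the Contacts table is hypothetical
CREATE TABLE Contacts
   (ContactID int PRIMARY KEY,
    Phone dbo.Telephone)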
In Review
Data integrity is probably the most important subject discussed in this book, which is the reason I devoted this chapter entirely to
integrity. Configuring and managing integrity can also consume a lot of your time and can be stressful. I recommend you fully
document your integrity issues and requirements in a document akin to the integrity plan I provide in this chapter. The document can
then be circulated among the business’s managers for input, and it will become your working document detailing all data integrity
efforts.
The subject of data integrity has also been covered in Chapter 13. Chapter 10 discusses integrity and the constraints in managing
databases and working with the table designer and database diagrams; and Chapter 16 provides some advanced programming
instruction related to integrity.
Stored procedures and triggers are very closely related. They are programmed in T-SQL or in managed code with a language like C#,
and both are objects that are attached to your databases. When it boils down to what you can do with either object, it pays rather to
list what you cannot do; the rest is up to your imagination.
Note The SQL-SMO object model provides objects for creation, editing, and management of triggers and stored procedures. It is,
however, more important for you to master these elements in T-SQL, using a query window in Management Studio.
While the code in a trigger can be identical in function to the code of a stored procedure, the major difference between the two is that
the trigger is connected to a table or view object, while a stored procedure is exposed as a database object and has to be explicitly
called, like a function, and be passed parameters. If you think of your table or view as an object, which you should be doing, then
think of the trigger as an event (like OnClick) that fires when an inbound DML statement "triggers" the table or the view.
Triggers are central to ensuring integrity and business rule adherence procedurally, while stored procedures are central to providing
functionality and business services and functions on a broad scale. How each object is used and created will now be covered in its
respective section. Because triggers are the easier of the two constructs to grasp, and because they follow up our treatise on integrity
in the preceding chapter, let's start with them first. By the way, we'll also discuss some of the new features of triggers in SQL Server
2005, and I’ll point these out as we progress. From here on this chapter will deal exclusively with triggers, while Chapter 14 is
devoted to stored procedures.
Triggers
The SQL Server 2005 trigger is secondary to the primary built-in mechanisms for enforcing business rules and data integrity of any
application or enterprise requirement, as we discussed in the preceding chapter. A trigger, however, is a lot more than a constraint
check or a rule; it packs a lot more punch.
A trigger can do a lot. For all intents and purposes, it is a stored procedure that in itself is a full-blown SQL statement or batch that
can query tables, calculate, evaluate, communicate, and provide complex functionality. SQL Server relates to a trigger in the same
way it relates to an inline SQL statement that comes down the TDS wire. Triggers are treated as single transactions, which means that
if they create an undesirable result, they can be rolled back (see Chapter 16).
As cascades Triggers can be used to cascade events and changes through related tables in a database. A good example is the
manual cascade deletes or updates we would program in triggers in earlier versions of SQL Server (pre-Y2K) before these
capabilities became built-in features in SQL Server 2000 (the predecessor of SQL Server 2005). In many cases, cascading
triggers can be used in data integrity or business rule requirements. You should, however, only consider this if the built-in
cascade features and declarative integrity functions do not supply the desired end result or don’t exist. Naturally the built-in
stuff is more efficient (referential integrity is a good example of a constraint effort that would be wasted in a trigger).
As checks (and then some) Triggers can do the work of the check constraints we discussed in Chapter 12, but when you need
more bang for your buck, triggers come to take the lead. A trigger, created as a super constraint, is essentially a check constraint
on steroids. You can reference columns in other tables (which you can't do in a check definition). You can conjure up a result
set in a trigger and then run through the result set to analyze inserted or modified data. Triggers can also talk to your
users, fire off e-mail, or wake the DBA in the middle of the night. However, while constraints are proactive “filters” so to
speak, triggers are reactive processes. Even the INSTEAD OF trigger fires in reaction to the DML statement sent to SQL
Server.
As rule enforcers Like badgers to honey, the existence of a database will invariably tempt users to attempt contradictory data
access procedures on your data. Some users may just do things that the developer or DBA did not expect, while others may
attempt to access the database in ways contrary to corporate or organization rules…and in many cases with criminal intent. A
trigger can be used to ensure that a certain action cannot be attempted on a table. For example, an attempt to retrieve all credit
card numbers from a table can be blocked in a trigger. The trigger can also send out alerts and even capture connection
information and perform certain auditing functions. (This chapter includes the code for a highly efficient database object
auditing system.) And because you can install multiple triggers on a table, you can pretty much take care of any situation that is
contrary to both business rules and change control procedures.
As an evaluation mechanism You can use a trigger to evaluate the state of your data before and after a DML statement does
its work on a table or view. If the state is not up to par or compliance, then a trigger can be used to fill in the “missing links” or
take some corrective action, such as rolling back, without requiring additional user input. Here’s a drastic, but entirely possible,
sequence of events a trigger can initiate:
An important attribute of triggers is that they have a long reach. While checks and rules are limited to tables in the current database, a
trigger can reference beyond its parent table to other tables and other databases. But all good things have their limitations.
The T-SQL statements listed here are not allowed in trigger code:
ALTER DATABASE
CREATE DATABASE
DROP DATABASE
LOAD DATABASE
RESTORE DATABASE
DISK INIT
DISK RESIZE
RECONFIGURE
LOAD LOG
RESTORE LOG
While you can use SELECT statements in a trigger and generate result set returns, the client connections that fire the trigger usually
have no means of interpreting or working with the returned data. You should accordingly avoid returning results by avoiding open
SELECT statements in trigger code. By "open" I mean that the result set is not assigned to an internal structure. In addition, you
should issue SET NOCOUNT ON at the beginning of the trigger to obviate the return of any data to the client connection. You should also
refrain from using cursors in triggers, because overuse can be a drain on server resources. In any event, you should be working with
rowset functionality if you need to work with multiple rows in trigger code (see the discussion of cursors in Chapter 16).
Also, keep in mind that during a TRUNCATE TABLE operation (which is a delete en masse that empties a table by deallocating the
table’s data pages), trigger firing is suppressed. This applies to the database owner and should not be a concern of users. And the
WRITETEXT statement does not activate a trigger.
AFTER This trigger is fired only after the statement that fired it completes. This is the default for SQL Server 2005. On an
UPDATE statement, for example, the trigger will be activated only after the UPDATE statement has completed (and the data has
been modified). If the DML statement fails, the AFTER trigger is never fired. You can have multiple AFTER triggers on a table
(views are not supported, by the way) and list the triggers in an order of execution (see “Managing Triggers” later in this chapter).
(By-the-by, AFTER triggers are never executed if a constraint violation arises.)
INSTEAD OF This trigger is fired instead of the actual triggering action. For example, if an UPDATE arrives on the wire and an
INSTEAD OF trigger is defined, the UPDATE is never executed but the trigger statement is executed instead. By contrast with its
AFTER sibling, you can define an INSTEAD OF trigger for either a table or a view. INSTEAD OF triggers can be used for many
scenarios, but one of the fanciest features is the capability to update view data that is not normally updatable. As explained in
Chapter 15, it is not a simple matter to just obtain a fresh view of data using a view that has been around for a while.
Although the INSTEAD OF triggers are fired instead of the DML statement sent to the server, they fire before anything else,
including any constraints that may have been defined (which is a big difference between the INSTEAD OF trigger and the AFTER
trigger). The triggers are also not refired by their own actions, because SQL Server checks for recursion that might develop as the
result of the actions the trigger itself takes on the table. In other words, it is certainly possible to create a trigger that practically
replaces exactly what the original DML had intended to do to the table or view. It might thus seem logical that the trigger would
cause itself to be refired, but this unintended recursion is dampened.
Note While trigger overhead is very low, a time may come when you need to squeeze every drop of bandwidth out of your application.
It thus makes sense to write flow-control logic (IF, CASE) into the trigger that checks whether the trigger really
needs to run its course. For example, an INSTEAD OF trigger might check the underlying table state before it performs
anything, and exit out if it determines the entire statement is not required. Also, if the constraint can be catered to using the
check constraints described in Chapter 12, go with those because they incur much less overhead than triggers for simple
integrity constraints.
Identify the business rules that can be best addressed by triggers. This section of the database plan identifies business rules that will
impact the database model and architecture. Sit down with the people responsible for establishing the business rules and determine
how best to cater to their needs using triggers.
Issues to be catered to by triggers must be referenced or become part of the database architecture. If you are the DBA but do not write
triggers, or if you are assigning the trigger writing to SQL Server developers or third parties, then it is important to list key issues that
will impact the development plan. For example, note how you plan to maintain triggers. While most trigger code is straight up and
down, a need may arise for the creation of a complex trigger, and the code needs to be documented and maintained and protected like
all source code. Trigger code can also be dangerous if it falls into the wrong hands. And if you decide that encrypting the code is an
option, then you need to be sure the source code is stored in a backup system, in documentation, or in source files that are secured. (I
personally do not like encrypting objects like triggers and stored procedures. Some DBAs encrypt to protect the system against
malicious or careless individuals who may have rights in the database. This is simply poor management.)
In this section, list the precise objective of each trigger you need to create. As with constraints and stored procedures, an extensive
system of triggers can hold a lot of trigger objects, and it will not take long for you to lose track of what triggers are installed where
and for what reason. This problem is further compounded by an order of magnitude when more than one trigger is installed or defined
for a table, or when you create a trigger cascade or nest.
Once you have written a trigger or a collection of triggers, then each trigger, including the actual trigger code and the trigger’s relation
to other triggers, constraints, checks, and stored procs, should be fully documented. This should not only help you see the trigger trees
for the trigger forest but help you pinpoint areas that could use improvement.
Most important, trigger documentation-both externally in supporting documents and internally using inline comments-can help other
parties read your code. The documentation will also help you debug problems. Remember that triggers are regarded as transactions, so
documenting them as such will allow you to debug and recover from problems created by rogue triggers. It is not normal to have a
trigger go nuts on you and do substantial damage, but when I started out writing triggers, I once created a cascading delete that
cascaded my entire database down the drain. To this day I am not sure how it happened, but it did.
In addition to documenting the solution arrived at for the trigger, it is also imperative to fully document the progress being made on
the trigger or triggers under assignment. Some triggers may be extensive, and many may require the trigger writer to be supervised
over several weeks or even on a permanent basis. To put it in more direct terms, you should not regard the management of a trigger or
T-SQL project, in terms of managing the software creation process, as any different from a C# or Visual Basic 2005 project.
Do Cost Analysis
There are also costs attached to every trigger. As in the case of constraints, the costs are both direct, in terms of their consumption of
SQL Server resources and usefulness, and indirect, in terms of the costs of programming, maintenance, documentation, and so on.
As mentioned in connection with the integrity plan, triggers are procedural and their code cannot be guaranteed by SQL Server to be
ANSI-compliant or even safe. SQL Server can check code for errors, but it is not able to check trigger code for standards compliance
on behalf of the trigger author or warn you that you are about to take unpaid permanent leave from your very fine DBA job.
Cost analysis of triggers should also cater to resource costs, and you should thus thoroughly test trigger code (in execution plans) and
in the profiler (see Chapters 15 and 16).
Determine a system for checking errors that materialize from trigger code. Errors can be directly caused by triggers, or they can result
from the correct trigger execution with unpredictable or unintended results. Investigate the feasibility of deploying an error database,
something like a bug recording system that records error messages that derive from both triggers and stored procedures. If you follow
the advice dispensed in Chapter 12 and move either all or a substantial part of the processing to SQL Server, this is an essential
practice.
You can also set up logging in the profiler and enable a security system to track who does what and when they did it. But the profiler
does not cater to errors that occur in T-SQL code and that create problems with the database or the data.
Understanding the effects in the client tier or what reporting and alerting is required at the client is important. What we do in the data
tier can affect the client tier, especially if the client processes have been implemented in a middle tier, especially in Web services. If
you still need to maintain legacy client code, which means client-side data processing and logic, trigger implementation and especially
stored procedure implementation can have unpredictable results, and these need to be considered.
Creating Triggers
Triggers, like most SQL Server objects, are created using T-SQL code, in Management Studio, and the SQL-SMO triggers collection.
You can also write triggers in C# or any other .NET language and run them on the SQL Server CLR. This is covered in Chapters 11
and 14. Depending on your needs, the SQL-SMO and .NET Framework object model provides an alternative to T-SQL, and a very
useful one at that. For the most part, however, the typical path to trigger creation and management is via T-SQL, so whip out
Management Studio, or whatever tool you use to write and test T-SQL code, and get cracking.
Trigger Deployment
Deploying a trigger requires more than just writing the code and assigning the trigger to a table or view and then crossing your fingers
that your landing gear is down. This is engineering, so you need to approach this as an engineer. The following steps, illustrated in the
flow chart in Figure 13–2, document the process of trigger creation and deployment from beginning to end. Create your own
deployment plan, which can act as a checklist that will take you from concept to deployment in a logical, well-controlled manner. I do
trigger work for a number of clients, and thus each one has a file and trigger deployment plan for one or more triggers (and the overall
trigger plan).
Step 1: Obtain trigger requirements The requirement specs are obtained from the trigger plan. Your system may be large enough
to warrant formal trigger assignment, as would large OLTP or e-commerce systems that will require several developers working on
the project.
The following specification is an example of a trigger requirement on a call center application that logs the date and time an
agent connects to SQL Server, on a number of tables. I could use the logging capability of the profiler, but the output is difficult to
work with from the viewpoint of the call center equipment, such as the ACD scheduler that needs to have information about the
CSR’s open case load, and from the call center analyst who needs to export the data to decision support systems.
CSR Shift Log AFTER Trigger Record the date and time the agent (via the Web service proxy) initially logs on to SQL
Server and runs Open-Shift. The Open-Shift session gets the agent oriented, reviews shift objectives, considers past shift
performance, and so on. The trigger also records the Close-Shift data…the date and time when the agent has concluded Close-
Shift and logs off the system.
The trigger can only be fired when the OpenCloseShift table is accessed, so the information records when the CSR logged on to
the CRM application and when the agent logged off the CRM application. The data allows the call center manager to monitor
shift duration and (not shown here) check how long agents are spending in Open-Shift and Close-Shift sessions.
Check Credit INSTEAD OF Trigger Check the credit rating of the client. If the credit rating is green, allow the CSR to take
the order on account. If the credit rating is red, the CSR must advise the caller that only a money order or credit card can be
accepted. The trigger can then return call center scripting (by calling a procedure) that the CSR can read back to the caller.
Step 2: Craft trigger logic or solution in pseudocode This may be done in any custom or preferred pseudocode. The idea is to
outline the trigger and sketch the scope and flow of the trigger.
In the CSR shift log specification just described, the pseudocode could be as follows:
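(One possible sketch; the names and level of detail are illustrative.)
ON INSERT into OpenCloseShift:
   record the agent ID and the date and time of the Open-Shift logon
ON UPDATE of OpenCloseShift:
   record the date and time of the Close-Shift logoff
   compute the shift duration and store it for the call center manager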
Step 3: Model, write, and test trigger against development system Once all pseudocode is written and you have cross-checked it
with IS managers, supervisors, or yourself, the stage is set for modeling and writing the trigger in T-SQL, testing against the
development system, checking the performance of the trigger (especially under load), and so on. The next section goes into the actual
trigger code.
Step 4: Deploy trigger to target system This step entails installing the trigger (the job of a DBA) to the target system if your
processes have been approved by quality assurance and your trigger testing program. You can copy objects to the production system
or script out the code for execution against the target system from Management Studio.
Step 5: Encrypt trigger Encrypt your trigger in the development system. This is obviously not essential, but it is advisable if the
systems might come under attack or cannot be secured. One of my projects entails installing SQL Server in a number of data centers
spread throughout the United States where I cannot guarantee that the servers are off-limits or that they are safe from access by
unauthorized use of query tools.
Step 6: Verify permissions You do not install permissions on triggers per se, but you need to verify that users on the connections
that fire the triggers have permission either to query the table or to insert, update, or delete from it. This also applies to permissions on
a view on which INSTEAD OF triggers are installed. This stage of your trigger deployment plan is critical. When I first started out
installing triggers, I was so concerned with the actual trigger and how impressed my clients would be that, after testing it with my
super-DBA/developer rights, I forgot to make sure the users would be able to access the tables on the production system. SQL Server
schema-level security is thus a great improvement in this area.
The principal T-SQL statement is CREATE TRIGGER, as follows (the full explanation and usage of the arguments are documented
in SQL Server Books Online):
CREATE TRIGGER trigger_name
ON {table | view}
[WITH <dml_trigger_option> [ ,...n]]
{FOR | AFTER | INSTEAD OF}
{[INSERT] [,] [UPDATE] [,] [DELETE]}
[NOT FOR REPLICATION]
AS {sql_statement [ ;] [ ,...n] | EXTERNAL NAME <method_specifier> [ ;]}

<dml_trigger_option> ::=
[ENCRYPTION]
[EXECUTE AS Clause]

CREATE TRIGGER trigger_name
ON {ALL SERVER | DATABASE}
[WITH <ddl_trigger_option> [ ,...n]]
{FOR | AFTER} {event_type | event_group} [ ,...n]
AS {sql_statement [ ;] [ ,...n] | EXTERNAL NAME <method_specifier> [ ;]}

<ddl_trigger_option> ::=
[ENCRYPTION]
[EXECUTE AS Clause]

<method_specifier> ::=
assembly_name.class_name.method_name
You need to provide each trigger with a name and define it for a particular table or view in a database. You also have the option of
encrypting the trigger so that no one (not even you, ever) can look at the original code. Triggers are secured with permissions (see
Chapter 6) so that only you, the trigger creator, or the schema owner can alter or drop the trigger.
After you have specified which table or view is to be the “beneficiary” of the trigger, you need to define the trigger as an AFTER or
INSTEAD OF trigger. This specification may seem a little late in the syntax because you cannot define an AFTER trigger for a view.
Another important argument, NOT FOR REPLICATION, has serious implications in distributed database scenarios. This argument
specifies that the trigger should not be executed when replication is the cause of table manipulation (see the chapter that covers
replication, Chapter 8).
Following the AFTER or INSTEAD OF argument, you must specify the DML event the trigger fires on. This can be DELETE,
INSERT, or UPDATE, or any combination of the three. Finally, following the AS keyword, you enter the segment of T-SQL code to be executed every time the
DML statement lands on the table or view. So your basic create trigger statement will look something like the following:
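-- a minimal sketch; the table name and trigger body are illustrative
CREATE TRIGGER OrderAlert ON dbo.Orders
FOR INSERT
AS
SET NOCOUNT ON
PRINT 'A new order has arrived'
GO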
This same trigger code specified as an AFTER or INSTEAD OF trigger would look like this in its most basic form:
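-- as an AFTER trigger (FOR and AFTER are synonymous for a DML trigger)
CREATE TRIGGER OrderAlertAfter ON dbo.Orders
AFTER INSERT
AS
SET NOCOUNT ON
PRINT 'A new order has arrived'
GO
-- and as an INSTEAD OF trigger, which fires in place of the INSERT itself
CREATE TRIGGER OrderAlertInstead ON dbo.Orders
INSTEAD OF INSERT
AS
SET NOCOUNT ON
PRINT 'A new order was intercepted'
GO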
To create a trigger, drill down to the table on which you want to apply the trigger. Expand the table so that the Trigger folder is
exposed. Right-click the folder and select New Trigger…. The script for creating a trigger is loaded into a query window. (If you want
to add header and revision information, you can edit the template and install new template parameters, as discussed in Chapter 11.)
Write and test your trigger here to a development system or table. When you are ready to install the trigger to a table, all you need to
do is extract the script and execute the query to the production table.
To alter the trigger at any time, you can drill down to the trigger as you did to create it and either choose the Modify option or the
Script Trigger As option. Both routes load the Alter Trigger code into a query window. To replace the old trigger with the new one,
simply execute the query.
Note You can step through trigger code in the Visual Studio debugger.
To create and manage triggers on views, simply repeat the process just described on the views in SSMS.
You can use the UPDATE(column_name) and COLUMNS_UPDATED() functions, respectively, to check for the completion of
updates that apply to certain columns.
You can use the IF UPDATE(column_name) clause in your trigger code to determine if the DML (INSERT or UPDATE) statement
that fired the trigger, or an earlier one, actually made any changes to write home about. This function will return true if indeed the
column was assigned a value. The IF UPDATE() trigger will look like this:
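-- a sketch; the NewOrders table and custName column are carried over
-- from the examples in the preceding chapter
CREATE TRIGGER CheckNameChange ON dbo.NewOrders
AFTER INSERT, UPDATE
AS
SET NOCOUNT ON
IF UPDATE(custName)
   PRINT 'custName was assigned a value'
GO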
Or you can use the IF COLUMNS_UPDATED() clause to check which columns in a table were updated by an INSERT or UPDATE
statement. This clause makes use of an integer bitmask to specify the columns to test. And the COLUMNS_UPDATED() trigger looks
like this:
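-- a sketch; the bitmask 3 (1 + 2) tests the first two columns of the table
CREATE TRIGGER CheckColumnChanges ON dbo.NewOrders
AFTER INSERT, UPDATE
AS
SET NOCOUNT ON
IF (COLUMNS_UPDATED() & 3) > 0
   PRINT 'One or both of the first two columns were assigned values'
GO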
Unfortunately, a limitation of both of the preceding functions is that they cannot test to see if a specific value in a field has been
deleted; thus, neither function will return any result on a delete statement. However, you can check for row deletion using the
@@ROWCOUNT function, which returns the number of rows affected by the last query.
The function @@ROWCOUNT returns the number of rows that were deleted. Thus you can test for rows
and make flow choices based on the results (see the stored procedure example in the next section). If, for example, the row count
returns 0, you could exit the trigger or take some other action. If the row count is greater than 0, you could switch to a different code
segment or even use a GOTO and code in a series of GOTO labels.
The Examples
Let’s now look at a trigger from an actual deployment plan in which the business owner required being alerted to unusual sales
activity:
/*
Script Name: Something Fishy
Description: Trigger to report unusually high sales
Usage: Placed on customers.dbo.orders for canship
Return Code: N/A
Author: Jeffrey R.Shapiro
Version: 1.00
Date Created: 9/25/2005
Revision History:
*/
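-- A minimal sketch of the trigger body; the Orders table, Units column,
-- and 1,000-unit threshold are assumptions for illustration.
CREATE TRIGGER SomethingFishy ON dbo.Orders
AFTER INSERT
AS
SET NOCOUNT ON
IF (SELECT MAX(Units) FROM inserted) > 1000
   RAISERROR ('Unusually high sales activity detected', 10, 1) WITH LOG
GO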
As a second example, make use of the RAISERROR message function. As discussed in Chapter 11, the RAISERROR is similar to the
message dialog facility in the Win32 API (and wrapped by the .NET Framework), which has been exposed (or wrapped) by every
major language capable of producing Windows applications. If you look at RAISERROR in Chapter 10 or Books Online you will see
that it can take first a message string or a message ID that can get returned to the client. You can also add as parameters severity levels
and replacement parameters. The replacement parameter placeholder in the function is that old familiar %d visiting T-SQL from the C
language, as demonstrated next.
In the following example the trigger code tests for the number of items scheduled for shipping, and if it is over 10,000 units, an error
message or alert is committed to the Windows Server 2003 application log by using the WITH LOG option. In addition I have also
added the SET NOCOUNT line, which will ensure that the trigger does not return the “n rows affected” message to the user. The
trigger code will not modify any rows, but SET NOCOUNT will suppress anything that might slip in when later versions of this
trigger are installed.
GO
SET QUOTED_IDENTIFIER OFF
GO
SET ANSI_NULLS ON
GO
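-- A sketch of the trigger just described; the table and column names are
-- assumptions. WITH LOG writes the alert to the application log.
CREATE TRIGGER CheckShipmentSize ON dbo.Orders
AFTER INSERT, UPDATE
AS
SET NOCOUNT ON -- suppress the 'n rows affected' message
DECLARE @units int
SELECT @units = MAX(UnitsToShip) FROM inserted
IF @units > 10000
   -- %d is replaced by the value of @units at run time
   RAISERROR ('Shipment of %d units scheduled; please verify', 16, 1, @units) WITH LOG
GO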
You should clearly understand, however, that the trigger-ordering feature allows you to define a first or last attribute on a trigger for each DML
event that fires triggers. In other words, you can specify for TableX that trigger triggerA is first after the FOR INSERT event, that
triggerB is first after the FOR UPDATE event, and that triggerC is last after the FOR DELETE event.
Here are some nuances to the application of the first/last attributes you should be aware of:
The first AFTER trigger cannot be the last AFTER trigger as well.
The triggers between first and last are not executed according to any order.
The first and last triggers must be fired by DML statements (INSERT, UPDATE, and DELETE).
If you alter a first or last trigger, its status as first or last is dropped.
Replicated tables will define a first trigger automatically for any table that is an immediate or queued update subscriber. In other
words, the replication trigger will be positioned as the first trigger, regardless of any other trigger attributed as a first trigger,
and if you try to reassign the first manually, after configuring for update subscription, SQL Server will generate an error.
If you use an INSTEAD OF trigger on a table, it will fire before any updates on the base table fire the AFTER triggers.
Also, if you define an INSTEAD OF trigger on a view, and that trigger updates a base table that has AFTER triggers defined,
these triggers will fire before any manipulation on the base table takes place as a result of the INSTEAD OF trigger fired at the
view level.
You use the sp_settriggerorder stored procedure to specify the first and last attributes for an AFTER trigger. The options that are
available are as follows:
First This makes an AFTER trigger the first trigger fired after a fire event.
Last This makes an AFTER trigger the last trigger fired after a fire event.
None This cancels the first or last attribute on a trigger. Use None to reset the trigger to fire in any order.
The following example demonstrates the use of the trigger positioning stored proc:
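-- a sketch, using the hypothetical triggerA mentioned above
EXEC sp_settriggerorder @triggername = 'triggerA',
   @order = 'First', @stmttype = 'INSERT'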
Trigger Recursion
SQL Server 2005 provides a feature known as recursive invocation. Recursion can thus be invoked on two levels, indirect recursion
and direct recursion. The two types permit the following behaviors:
Indirect A statement triggers TableA, Trigger1, which causes an event that fires TableB, Trigger1. TableB, Trigger1 then
causes TableA, Trigger1 to fire again.
Direct A statement triggers TableA, Trigger1, which causes an event that fires TableA, Trigger2. TableA, Trigger2 then
causes TableA, Trigger1 to fire again.
The recursion types can work for or against you and can break your code. You can set direct recursion off using the sp_dboption
stored procedure, but that will leave indirect recursion enabled, which you may want. To disable both recursion types, you need to use
sp_configure.
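For example (a sketch; the database name is hypothetical):
-- disable direct recursion for one database
EXEC sp_dboption 'Customers', 'recursive triggers', 'FALSE'
-- disable nested (and thus indirectly recursive) triggers server-wide
EXEC sp_configure 'nested triggers', 0
RECONFIGURE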
Trigger Nesting
Triggers can be nested (in an arrangement also described as a trigger cascade). In other words, a trigger on TableA can update TableB
that fires a trigger on TableC that fires a trigger on TableD that…. SQL Server will prevent a chain of triggers from forming an
infinite loop, and you cannot nest to more than 32 levels.
You can also disable nested trigger execution, on a server-wide basis, using the sp_configure stored procedure. The default is that
trigger nesting is allowed. Nesting and recursion are controlled by the same argument in sp_configure, so if you turn off nesting,
recursion goes as well, and vice versa; this is regardless of the setting you have used in the recursion attribute set by sp_dboption,
discussed in the section on recursion. Trigger nesting also terminates if any trigger executes a ROLLBACK TRANSACTION
statement.
You can also manage the nesting behavior of triggers from Management Studio, from the SQL-SMO object model, or using the
sp_configure procedure (see Chapter 8).
Managing Triggers
Triggers are a powerful and essential attribute of any DBMS, but they can be a headache to manage, especially when you have a lot of
them. For this reason, if you have a big system, you will want to architect triggers in a modeling system and provide access to trigger
metadata. The following sections explain how to alter and drop triggers.
Altering Triggers
To alter a trigger in T-SQL, you need to use the ALTER TRIGGER statement. The basic statement is as follows:
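ALTER TRIGGER trigger_name
ON {table | view}
[WITH <dml_trigger_option> [ ,...n]]
{FOR | AFTER | INSTEAD OF}
{[INSERT] [,] [UPDATE] [,] [DELETE]}
AS sql_statement [ ;] [ ,...n]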
Note Management Studio adds the alter trigger code automatically the first time you open the trigger code for editing.
The code to apply after the ON line takes the same syntax and choice of arguments as the CREATE TRIGGER statement described
earlier (see the SQL Server 2005 Books Online for the full explanation and usage of the arguments).
Dropping Triggers
Dropping a trigger in T-SQL requires the DROP TRIGGER statement followed by the trigger name. Consider, for example, the
following code:
USE MYDB
IF EXISTS (SELECT name FROM sysobjects
WHERE name = 'SecurityViolation' AND type = 'TR')
DROP TRIGGER SecurityViolation
You can specify multiple triggers in a DROP TRIGGER statement by separating the trigger names with commas, as here:
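-- drops two triggers in one statement (the second name is hypothetical)
DROP TRIGGER SecurityViolation, SomethingFishy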
You should make sure to check for trigger dependencies with the sp_depends stored procedure before dropping a trigger.
To drop a trigger interactively, simply drill down to the tables in your database and the Triggers folder. Then expand the list of
triggers in the folder, right-click the trigger you wish to manage, and choose Delete from the context menu. You can also disable a
trigger at this point.
To obtain information from SQL Server about the triggers installed on a table, you should execute the system stored procedure
sp_helptrigger as follows:
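-- lists the triggers installed on the (hypothetical) Orders table
EXEC sp_helptrigger 'Orders'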
The sp_helptrigger procedure returns information about the triggers on the table, the trigger owners, and the DML statements they are
defined for. The full syntax of this stored procedure is as follows:
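sp_helptrigger [@tabname =] 'table'
   [, [@triggertype =] 'type']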
where the @triggertype option specifies the type of triggers (INSERT, UPDATE, or DELETE) you require information on. With the
code EXEC sp_helptrigger 'Orders', 'UPDATE', for example, you get back only the UPDATE triggers defined on the table.
1. Use triggers when necessary to enforce business rules and integrity not adequately handled by the built-in constraints.
2. Keep trigger code simple. If there is a lot you need to accomplish in a trigger, then consider breaking your code into more than
one trigger, which is akin to how you write code in traditional programming environments.
4. Use SET NOCOUNT ON to suppress the "n rows affected" message returned to the connection. And don't leave result sets open or
unassigned. Use them only for the benefit of the trigger, such as by using SELECT to find values or to compare values in
multiple tables.
6. If your triggers begin to look like general procedural code, requiring the return of result sets to clients, and functionality beyond
integrity and enforcement of business rules, then you need to switch to a stored procedure, a function, or managed code.
set ANSI_NULLS ON
set QUOTED_IDENTIFIER ON
go
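-- A sketch of the trigger described next; the Messages table, its columns,
-- and the details of the business rule are assumptions.
CREATE TRIGGER EnforceMessageDeleteRule ON dbo.Messages
INSTEAD OF DELETE
AS
SET NOCOUNT ON
-- only messages that have already been heard may be deleted
DELETE FROM dbo.Messages
WHERE MessageID IN (SELECT MessageID FROM deleted)
   AND Heard = 1
GO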
This second trigger is a simple mechanism on a voice mail system that enforces a business rule on the process of deleting messages.
To Recap
We covered a lot of ground together in this chapter dealing with triggers. As you can see, trigger writing and management can
consume substantial resources, and without proper planning, documentation, change control, archiving, source code maintenance,
modeling, and so on, you can create a lot of problems for yourself or the team.
If you are new to trigger writing, the change in development style and philosophy can put a lot of strain on mental and physical
resources. And the conversion of client-side, inline SQL code to server-side triggers (and stored procedures) can be expensive in
terms of both time and materials.
If you have not already done so, you should read Chapter 6 carefully, or read it again, because trigger and stored procedure
deployment require you to manage permissions and security so that your users can exploit the code you have written. Also, query
plans are discussed in Chapter 16, and understanding them is an essential prerequisite to writing and testing stored procedures and
triggers.
This chapter covers both legacy stored procedures and functions written in T-SQL as well as the procs and functions you can now
install as part of the .NET Framework's common language runtime (CLR) support (see Chapter 11). We have not covered trigger
creation using the .NET Framework because the process of writing the code, compiling and installing the assembly, and installing the
trigger to SQL Server is identical for all CLR "objects."
For the most part we will be discussing stored procedures because you will be creating and using them more. If you are unfamiliar
with the concept of a stored procedure, you will find that the following list sheds some light on these critical SQL Server features:
Stored procedures are collections of Transact-SQL statements, or .NET Framework assemblies containing inline T-SQL, that
can be referenced and executed by name from a client connection. They encapsulate functionality that executes remotely from
the calling connection (the client), which is interested in exploiting the result of the remote execution.
Stored procedures encapsulate repetitive tasks. Often in client applications a large number of SQL statements all do the same
thing. One stored procedure that accepts variables from the client can satisfy more than one query at the client, executed
concurrently or at different times. More than one client can call the same stored procedure. Variable parameters that identify
columns and values can replace almost all query code at the client.
Stored procedures share application logic and code. In this respect they have a reuse benefit similar to that of classes in object-
oriented software. A good example of an application that can greatly benefit from stored procedures is Report Server.
Stored procedures hide database schema and catalog details. When you query a database using client-side SQL code, you need
to know specifics of the tables and columns you are querying. This exposes the schema to the client connection and the user,
especially in Internet applications. The stored procedure does not allow the client to have the proverbial foot in the door. The
only information the client or connection has is the procedure name to call. In this regard, stored procedures provide a layer of
security because the client also needs appropriate permissions to execute the procedure.
Stored procedures conserve network bandwidth and allow you to concentrate processing needs at the data tier, which can be
appropriately scaled up or out as needed (see Chapter 9).
Like functions, stored procedures return values and error messages. But they can also return result sets from server-side queries
that can be further processed on the server before being sent to the client. The return values can be used to indicate success or
failure of stored procedure functionality, and the status can be returned to the client or used to control the flow and scope of the
procedure logic.
Stored procedures are created using the CREATE PROCEDURE statement, edited or updated using the ALTER PROCEDURE
statement, and executed by the client connections; they return the result to the clients. The flow chart in Figure 14–1 illustrates
the life-cycle (abridged) of the stored procedure.
Database developers need intimate knowledge of the workings of stored procedures. For all intents and purposes, they are to the
DBMS and its databases what classes are to languages like C# and Java. Stored procedures are not inherited, derived, or cloned, nor
do they sport inherited properties, methods, and the like, but they share many other valuable attributes of object-based programming
such as code isolation, reuse, and sharing (by both developers and users). You cannot build any form of effective application that
relies on SQL Server, nor can you be an effective DBA, without having an intimate knowledge of how to code and manage stored
procedures.
The several types of stored procedure supported by SQL Server are as follows:
System The system stored procedures are built into SQL Server and cannot be altered or tampered with short of destroying the
catalog. They provide information about the database schema, object names, constraints, data types, permissions, and so on.
There are several collections of system stored procedures: the catalog stored procedures, SQL Server Agent stored procedures,
replication stored procedures, and so on. The system stored procedures are discussed in their respective chapters and in the
Appendix.
Local The local stored procedures, written by the DBA and SQL Server developer, are the focus of this chapter.
Temporary These provide the same or similar functionality as the local stored procedures discussed in this chapter, but as
explained further a little later in this chapter, they only exist for the life of the connection.
Remote These stored procedures exist in remote servers and can be referenced by an originating server. These stored
procedures are used in distributed applications.
Extended The extended stored procedures are similar in function to the local stored procedures but can reference functionality
external to SQL Server, such as calling routines and functions in remote libraries and processes compiled, for example, in DLLs
or object storehouses. For the most part extended stored procedures will be replaced by .NET Framework stored procedures.
Stored procedures are processed in two stages. In the first stage the procedure is parsed by the SQL Server database engine (see
Chapter 2) upon creation, after which two things happen. SQL Server stores the definition of the procedure, name and code, in the
catalog. It also pushes the code through the Query Optimizer, as discussed in Chapter 4, and determines the best execution plan for the
code.
Next the code is compiled and placed in the procedure cache. The only time the plan is flushed from the cache is when an explicit
recompile is called by the client connection or the plan no longer exists in the cache, which means it had aged and had to be expelled.
The cache can also be flushed via the DBCC FREEPROCCACHE command discussed in Chapter 10.
In the second stage the query plan is retrieved when the stored procedure’s name is referenced in code. The procedure code is then
executed in the context of each connection that called the procedure. Any result sets or return values are returned to each connection.
Stored procedure names, like trigger names, are stored in sysobjects, and the code is stored in syscomments. To inspect the stored
procedure code, execute sp_helptext in the parent database of the stored procedure. More on sp_helptext later.
The words PROCEDURE and PROC can be used interchangeably, and SQL Server recognizes both. The statements CREATE
PROC, DROP PROC, and ALTER PROC are thus also valid.
The following CREATE statements cannot be used in a stored procedure: CREATE DEFAULT, CREATE PROCEDURE,
CREATE TRIGGER, CREATE RULE, CREATE VIEW.
You can create any other database object from a stored procedure and even reference it in the stored procedure, as long as you
create it before you reference it. You can even reference temporary tables in a stored procedure.
You cannot use remote stored procedures in remote transaction scenarios. If you execute a remote stored procedure, the
transaction on the remote instance cannot be rolled back.
Stored procedures can spawn stored procedures that can access any object created by the parent stored procedure. However, if
you create a local temporary table, it only exists for the stored procedure that created it. If you exit the stored procedure, the
temporary table is lost.
We have looked at some of the reasons to use stored procedures earlier and at length in Chapter 10. This is the section of your plan
(especially if you are motivated to move away from a desktop solution like Microsoft Access) to list and discuss the needs and
reasons.
The issues to be catered to by stored procedures, or the solutions to be obtained using stored procedures, must be referenced or
become part of the database architecture. (In the trigger plan, I also listed a number of issues you should cover here, so go back and
apply the points there to the stored procedure plan.)
One important area to address is the development tools you use to create and debug stored procedures. You can use query windows in
Management Studio, which you probably will do to write, edit, debug, and performance-test your code as described in this chapter. A
few third-party SQL Integrated Development Environments (IDEs) on the market are specifically suited to SQL Server. They are
worthy tools for the DBA or SQL Server developer who writes a lot of stored procedures, and perhaps some very complex ones as
well. The serious developer who needs to fully step through code will use Visual Studio.
The following list provides an example of some stored procedure issues to consider in the stored procedure plan:
How do we handle error messages and track bugs raised by the code?
When would we need to use an extended stored procedure, and who will create it?
In this section, list the precise objective of each procedure you need to create. If you are converting legacy in-line SQL statements still
buried in your client applications, a good way to start is to go through all the client procedures and copy the SQL statements from
them. Paste each statement into a document, noting exactly where in the client code the statement is buried, and mark the statement
for replacement by the stored procedure. Doing it this way can help you identify SQL statements in your code that are the same or
similar and could thus make use of one stored procedure, or a variation of one or more.
In many cases, you will find that one stored procedure just needs new parameters and can be used to satisfy a number of areas in the
client code. This exercise is rewarding because the amount of code in the client application can drop dramatically. For example, a
few dozen one-line EXEC PROC statements can reduce the number of code lines by several thousand.
This section is as important for stored procedures as triggers. If you have not read the trigger plan earlier and you intend to prepare a
stored procedure plan, you should read the trigger plan first.
Do Cost Analysis
Stored procedures assume all the data processing overhead that was once dispersed among the clients, so a complex procedure that
consumes a certain amount of processing overhead compounds the overhead for every connection associated with a procedure
currently executing in the server. If you have not yet mastered the art of profiling your stored procedures and analyzing your queries,
you should spend some time with the Profiler, as discussed in Chapter 18, before embarking on an extensive stored procedure
development effort.
Another cost of stored procedures is the indirect cost of adopting a new style and philosophy of programming. To obtain the data from
your client that you send to the stored procedure, you will likely have to change the way your client does business with the user.
Bound controls, for example, are history in many cases (you don’t have them on the Internet), and so you might have to add code in
some places where data-bound controls have been excised. The following Java code is a good example of how you would code the
assignment of data in place of a data-bound control, especially in middle-tier solutions:
String ProcParam1 = "Jeffrey";
String ProcParam2 = "Shapiro";
// or, pulling the values from text components:
String ProcParam1 = FNText.getText();
String ProcParam2 = LNText.getText();
Note I cannot think of a better word than excise to describe getting rid of data-bound controls in a client application. Honestly, they
were good in the early days of Delphi and VB, when the database engine squatted on your local PC like a warthog trapped in
mud. With SQL Server solutions, they truly are a thing of the past. Since I have been working exclusively with SQL Server, I
have removed all of my data-bound controls from my applications.
See the trigger plan discussed earlier for this section of the stored procedure plan. Error handling in stored procedures is also a lot
more involved due to the code’s inclination to say something to the client. I have thus discussed error handling in stored procedures in
more detail later in this chapter.
One of the positive effects of getting rid of SQL code in the client tier is that you’ll end up with clients that look a lot thinner than
usual. Your code will also be cleaner and easier to document in the client. However, the downside is that the process of converting to
a client/server system and adopting stored procedures can be long and involved. For example, you might make extensive use of
ADO.NET visual controls that need to have their dataset methods changed from SQL to stored procedure calls, and so on. I have
simply dumped data-bound grids and the like and chosen to work only at the object level in ADO.NET, pulling back a result set from
a stored procedure and then looping the data up into a dataset.
To me, creating a stored procedure to replace a data-bound ADO grid is like guzzling a Bud on a hot August afternoon on Miami
Beach. You’ll likely have to spend a morning to code a complex procedure that returns the same data as the data-bound grid. But once
you have tested the procedure in QA or whatever tool you use, seeing the data appear in a simple grid in the client is a wonderful
feeling, knowing that all the client had to do was issue a single line of code to call the proc.
The following steps, illustrated in the flow chart in Figure 14–3, document the process of stored procedure creation and deployment
from beginning to end. Create your own deployment plan, which will be your checklist that will take you from concept to deployment
in a logical, well-controlled manner.
Step 1: Obtain the stored procedure requirements The requirement specs are obtained from the stored procedure plan as
discussed earlier.
An example of a stored procedure requirement on an order-taking system is a procedure that debits stock items from the
inventory or warehouse table and credits them to the customer’s account.
Step 2: Craft stored procedure logic or solution in pseudocode You should work on this section with the idea of sketching
the scope, functionality, and final result of the procedure.
Step 3: Model, write, and test the stored procedure against your development system This work is done in Management
Studio or the IDE of your choice (see Chapter 4). Before you begin coding, however, stored procedures should be defined in a
modeling language, which would aim to capture procedure-related metadata.
Step 4: Deploy the stored procedure to the target system Deployment of the stored procedure entails executing the
CREATE PROCEDURE statement against the target system.
Step 5: Encrypt the stored procedure The same motivation used to justify trigger encryption applies here. Just as with
triggers, this step is done inside the CREATE PROCEDURE statement. You must not encrypt your procedure in the development
system unless you have a separate and secure version of the source code elsewhere. The WITH ENCRYPTION clause in the
CREATE PROCEDURE statement is like a loaded Uzi with the safety off. One slip and off goes your foot. Be sure that you are
connected to the target production system to install and encrypt the procedure. More on encryption later in this chapter.
Step 6: Verify permissions Permission verification is the last step you take before allowing users to obtain service from your
stored procedure. Unlike with triggers, users must have direct permission to execute a stored procedure. This can be done in T-
SQL code as explained in Chapter 5.
However, the permissions issue does not stop with the right to call EXEC. You also need to verify that users on the connection
that calls the proc have the DML permissions the procedure requires, be it to query the table or to insert, update, or delete from it.
You can use any of several methods to create stored procedures; there are a few tools floating around. The principal method, of
course, is to use the T-SQL statements CREATE PROC or CREATE PROCEDURE. You can also use the SQL-SMO object model to
create, alter, and manage stored procedures.
To code and test stored procedures in T-SQL, you will use a query window in Management Studio. The stored procedure’s CREATE
and ALTER templates are useful and will save you some coding time. The following syntax represents the CREATE PROC statement
(the full explanation and usage of the arguments are documented in SQL Server Books Online):
CREATE PROC [ EDURE ] procedure_name [ ; number ]
    [ { @parameter data_type }
        [ VARYING ] [ = default ] [ OUTPUT ]
    ] [ ,...n ]
[ WITH
    { RECOMPILE | ENCRYPTION | RECOMPILE , ENCRYPTION } ]
[ FOR REPLICATION ]
AS T-SQL statements [ …n ]
A stored procedure must be named. It is a good idea to name a stored procedure with a prefix (such as the acronym of a process or
module, as in jrs_storedproc). However, you should not name your stored procedures using the sp_ prefix, because that prefix is
reserved for system stored procedures; SQL Server searches the master database first for procedures so named, which costs an
unnecessary lookup and risks a collision with a system procedure of the same name.
You can create one or more parameters in a stored procedure. The client must supply the values for the parameters in the execution
statement that calls the stored procedure. If a stored procedure expecting a parameter value does not receive it (and no default has
been defined), the stored procedure will fail and return an error. It is thus especially important when you code stored procedures
that you handle all parameter errors properly. You can code flow-control statements, return codes, and the like, as long as errors
raised in the stored procedure are properly handled.
Object names get resolved when a stored procedure is executed. If you reference object names inside a stored procedure and do not
reference the object by ownership (name qualification), the ownership defaults to the owner of the stored procedure, and the stored
procedure is thus restricted to the stored procedure owner. In other words, only the owner gets to execute the procedure.
Also, objects referenced by DBCC, ALTER TABLE, CREATE TABLE, DROP TABLE, TRUNCATE TABLE, CREATE INDEX,
DROP INDEX, and UPDATE STATISTICS must be qualified with the object owner’s name so that users can execute the stored
procedure. If you create a stored procedure and do not reference the table objects in it with qualified names, then access to the
tables, during execution of the stored procedure, is restricted to the owner of the stored procedure (see how permissions affect this
in Chapter 5).
Encryption
As discussed in the trigger section and in the stored procedure deployment plan, you can hide the code of your stored procedure using
encryption (by using the WITH ENCRYPTION clause) as you do when you create triggers. However, once the procedure is
encrypted, there is no way to decrypt it; not even the SA account or an Administrator can do so. Encrypting the code is a good idea if
the code you have defined in the stored procedure exposes highly sensitive data, as long as you keep a copy of the unencrypted code
that only you can access. Although it has never happened to me, I did hear of someone who spent a month writing the mother of all
procedures and then encrypted the code on the development system by mistake, before he made a copy of the final source.
Encryption is also useful for a turnkey product that ships with the SQL Server engine as the data store. Your product will then be in
the hands of third parties, and you’ll have no means of preventing them from checking out, tampering with, and even stealing the data
store code. Encryption prevents all that. In fact if the product, such as a voice mail system, cannot operate without the data store, an
encrypted stored procedure might obviate the need for one of those clumsy “dongles” you shove onto the parallel port to control
access to the system, or prevent it from being pirated.
Grouping
You can create a group of stored procedures, each one having the same name to identify them as part of a group, by assigning each
stored procedure in the group an identification number. Grouping the procedures like this allows you to maintain collections of stored
procedures that pertain to a particular function or purpose. For example, I have a group of stored procedures that are all part of the
accounts payable database, as follows:
accpay; 1
accpay; 2
accpay; 3
accpay; 4…
Creating each member in the group is easy: Just specify the number of the individual procedure in your CREATE PROC code. One
caveat: you cannot drop an individual member. When you are done with the group, the DROP PROCEDURE statement destroys the
whole group.
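A minimal sketch of creating, executing, and dropping such a group follows (the procedure bodies are illustrative):
CREATE PROCEDURE accpay;1
AS
SELECT * FROM Invoices --illustrative body
GO
CREATE PROCEDURE accpay;2
AS
SELECT * FROM Vendors --illustrative body
GO
EXEC accpay;2 --executes member 2 of the group
DROP PROCEDURE accpay --destroys the entire group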
The steps you take to opening a CREATE PROCEDURE template or an ALTER PROCEDURE template are almost identical to what
I described for triggers earlier, so forgive me if I don’t repeat those steps here. Let’s instead go directly to debugging the stored
procedure, which means you need to open Visual Studio 2005.
Once the proc has been written and your syntax is clear of errors, the procedure code is ready to be observed in the debugger. As you
step through the code, you can see parameters change and statements executed without affecting underlying table data.
Connect to the server holding your stored procedures. Drill down to the Programmability folder and expand the list of procedures.
Double-click the proc or select OPEN from the context menus so that the procedure code window opens in Visual Studio. You can
then set break points in the window as you would any .NET code.
1. Right-click the procedure and select Step Into Stored Procedure from the context menu; the Run Procedure dialog box loads
to allow you to enter parameter values to test with. This is demonstrated in Figure 14–4.
(Notice the check box Auto Roll Back; enable this to roll back all changes made to the data while debugging a stored
procedure).
2. Step into the code and watch the execution of each statement in the transaction. Two tables are operated on in this procedure,
and both operations must complete or nothing must complete. So the code I am stepping through is enclosed in a transaction
that I can roll back if I detect a failure anywhere in the transaction. (The full code of this stored procedure is listed later in this
chapter, in the section “The Example.”) This illustrates that the initial queries have run and the local variables (see the
parameters now for @Amt and @Debit) have been changed accordingly.
3. If I keep stepping through the code, as soon as I update either of the tables in this procedure the appropriate triggers will fire.
We have left the procedure and entered the trigger code after the DML statement has been executed in the transaction. I can
now step into the trigger. Notice the trigger (fishy) is now in the call stack. Depending on what the trigger is looking for, or set up
to do, it might or might not cause the remainder of the procedure to execute in the debugger. Naturally, if you want to step through
the proc without the intervention of the trigger, just drop the trigger from the table and reinstall it later.
If you change stored procedure (or trigger) code, you must execute the ALTER query to install the latest version of the procedure to
the database. You are not editing the installed procedure when you run Edit or script out the procedure code, so failing to re-execute
after changing the code will not help you, and you’ll be as confused as an apple in a peach tree when the bug you just squished returns
to the debugger. Think of the Execute Query as the build or compile button on your traditional IDE.
It is also a good idea to save the query out to a text file or version control system as often as possible, because if you lose power or
your system crashes before you have a chance to re-execute the ALTER query, the source code will be lost.
For example, the code sp_who and nothing else in the statement will execute the stored procedure by that name. There is one caveat to
just naming the procedure in a batch: the name must be the first statement in the batch, even if a billion lines of code follow it. If
you stick a statement above the procedure name, the code will break.
Using EXEC or EXECUTE in Front of the Stored Procedure Name
With this method, the procedure name, followed by the rest of its arguments, can appear anywhere in the batch. As long as you
prefix the procedure name with EXEC or EXECUTE, the code will execute.
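For example (sp_who is a system procedure, so both calls work as shown):
sp_who --legal only as the first statement in the batch
GO
SELECT GETDATE()
EXEC sp_who --the EXEC prefix lets the call appear anywhere in the batch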
You can execute a stored procedure, as described here, that is grouped with a collection of stored procedures by specifying the
procedure number assigned to the member procedure as follows:
EXECUTE accpay_proc; 4
You can call a stored procedure in an INSERT statement, a topic that is further discussed in Chapter 20. What happens is that the
result set returned from the stored procedure is inserted into the reference table. The following code provides an example of an
INSERT … EXEC.
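A minimal sketch, assuming a hypothetical GetRecentOrders procedure whose result set matches the target table’s columns:
CREATE TABLE #RecentOrders (OrderID int, CustID char(8), Amount money)
INSERT INTO #RecentOrders (OrderID, CustID, Amount)
EXEC GetRecentOrders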
When a client calls a stored procedure (using one of the methods described in the earlier section), it can pass values to the procedure
through parameters. These values are then used in the stored procedure to achieve the desired result. I mentioned earlier that you can
write stored procedures that can contain as many as 2,100 parameters. To expand further, each parameter can be named, associated
with a data type, given a direction, and even assigned a default value (even NULL).
You can pass parameters by name and by position. The following example demonstrates passing parameters by name:
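(The jrs_AddOrder procedure and its parameter names here are illustrative.)
EXEC jrs_AddOrder @CustID = 'AB240900', @Item = 'Roses', @Quantity = 12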
This is the more robust form of coding stored procedures because it means you can use the named parameters in a stored procedure
without worrying about the order in which the parameters are received. The following positional form, by contrast, will get hairy if
you start going wild on parameters:
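(The same illustrative call passed by position; every value must arrive in the order the parameters were declared.)
EXEC jrs_AddOrder 'AB240900', 'Roses', 12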
When you execute stored procedures that are expecting parameters, SQL Server will report an error if none is received in your
connection. If your connection might not transmit a parameter, you could code default values in the stored procedure, or even use
NULL. SQL Server will then make use of the default values or place NULL into the statement’s parameter placeholder instead of
returning an error.
In keeping with our discussion of NULL values in the last chapter, your first choice should be to provide a default value in the
parameter that makes the most sense and will enhance rather than break your code. For example, the following procedure will use
your parameter values, or a default parameter:
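(A sketch; the procedure, table, and column names are illustrative.)
CREATE PROC jrs_GetOrders @Status char(10) = 'OPEN'
AS
SELECT * FROM Orders WHERE Status = @Status
GO
EXEC jrs_GetOrders --uses the default value 'OPEN'
EXEC jrs_GetOrders 'CLOSED' --overrides the default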
The following example takes NULL when and if the circumstances can live with it:
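(Again a sketch with illustrative names; when @AgentID is not supplied, the procedure simply returns all orders.)
CREATE PROC jrs_GetAgentOrders @AgentID char(6) = NULL
AS
IF @AgentID IS NULL
  SELECT * FROM Orders
ELSE
  SELECT * FROM Orders WHERE AgentID = @AgentID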
The following code is a simple order-adding stored procedure demonstrating the use of parameters:
USE rainbow
DECLARE @returncode int
DECLARE @CustID char (8)
DECLARE @AgentID char (6)
DECLARE @Item char (50)
DECLARE @Price money, @Discount real, @Quantity int
INSERT [Orders] (CustID, AgentID, Item, Quantity, Price, Discount)
VALUES (@CustID, @AgentID, @Item, @Quantity, @Price, @Discount)
This stored procedure is kept simple to demonstrate the use of parameters. Notice that you can also group parameter declarations
behind one DECLARE. The following list provides some parameter tips:
Use default values in your parameters in preference to NULL. Use the NULL value when it makes sense.
Write code that checks for inconsistent or missing values early in your stored procedure code.
Use easy-to-remember names that make it easier to pass the value by name rather than position.
You need to identify data intended for output with the OUTPUT keyword, which also has a short form OUT. This specifies the data
intended for return to the client. The following stored procedure code returns a single integer value:
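(A sketch with illustrative names; the integer stock level is returned through the OUTPUT parameter.)
CREATE PROC jrs_GetStockLevel @SKU int, @Quantity int OUTPUT
AS
SELECT @Quantity = Quantity FROM Items WHERE ItemNumber = @SKU
GO
DECLARE @Qty int
EXEC jrs_GetStockLevel 1001, @Qty OUTPUT
SELECT @Qty --the value captured from the procedure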
Returning result sets to clients from stored procedures is simply a matter of coding a SELECT statement into the stored procedure.
The tabulated or multirow data is returned automatically. If you need to perform complex SELECT routines (such as SELECT INTO)
in your stored procedure, you can also enclose the SELECT statement into conditional or flow-control logic to prevent any result set
from being returned to the client. The following code demonstrates a simple stored procedure that returns a result set to the client:
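(A sketch; table and column names are illustrative.)
CREATE PROC jrs_ListOrders @CustID char(8)
AS
SET NOCOUNT ON
SELECT OrderItem, Item, Quantity, Price
FROM Orders
WHERE CustID = @CustID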
Despite the nest limit of 32, you can also spawn as many stored procedures as you want from within the nest chain or cascade. The
called procedure does not increment the nest level counter as long as it completes without spawning another stored procedure.
Automatically executing stored procedures on startup is also highly useful for certain management applications or functionality that a
server needs to have soon after startup. I call this warming up SQL Server. You should take note of a few nuances and rules before
you consider using this “warm-up” capability. These nuances are as follows:
The creator or owner of an autoexecuted stored procedure can only be the system administrator or a member of the sysadmin
fixed server role.
The stored procedure must be a background process and cannot take any input parameters.
Each autoexecuted, or startup, stored procedure makes a connection to the DBMS. You can have as many autoexecuted stored
procedures as you like, but if each one consumes a connection, this could be a significant drain on resources. If the stored procedures
do not need to be executed concurrently, you can nest them. Thus, you could create a single stored procedure that calls a list of stored
procedures synchronously. The cascade of stored procedures only consumes one connection, the one incurred by the autoexecuting
stored procedure.
By cascading or nesting stored procedures, you could thus call user-defined stored procedures and even pass parameters into them.
This can be useful for an application that requires certain information and objects to be available to users when the databases come
back online.
In one of my call centers, if I need to restart or “IPL” a server for any reason, or we suffer a server or host crash, an autoexecuted
stored procedure sends a message to database users when the server is ready and they can reconnect. This saves the help desk from
having to call or e-mail users that the server is back up and can be accessed again.
If you have a problem and you ever need to delay the autoexecuting of stored procedures, you can start the instance of SQL Server
2005 with the -f flag. This will start the server in a minimal configuration (like Safe mode on Windows Server 2003) and allow you to
debug the problem. You can also specify trace flag 4022 as a startup parameter, which forces the startup to bypass autoexecution.
To create a startup or autoexecuted stored procedure, you need to be logged in as a member of the sysadmin role and you must create
the procedure in the master database.
You can also use sp_procoption to designate an existing stored procedure as a startup stored procedure, to reset the startup option on a
stored procedure, or to view a list of all stored procedures that execute on startup.
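For example (the procedure name is illustrative; the procedure must already exist in the master database):
EXEC sp_procoption @ProcName = 'usp_WarmUp',
    @OptionName = 'startup',
    @OptionValue = 'on'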
To alter a stored procedure in T-SQL, you need to use the ALTER PROCEDURE statement. The basic statement is as follows:
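ALTER PROC [ EDURE ] procedure_name [ ; number ]
    [ { @parameter data_type } [ VARYING ] [ = default ] [ OUTPUT ] ] [ ,...n ]
AS T-SQL statements [ …n ]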
The code that follows the procedure name takes the same syntax and choice of arguments as the CREATE PROC statement described
earlier (see the SQL Server 2005 Books Online for the full explanation and usage of the arguments).
To alter a stored procedure in Management Studio, follow the steps described earlier for creating a stored procedure in Management
Studio and edit the stored procedure accordingly. Altering a stored procedure in Management Studio is demonstrated shortly.
Dropping a stored procedure in T-SQL requires the DROP PROCEDURE statement followed by the procedure name. For example,
the following code drops the DebitStores procedure if it exists:
USE Stores
IF EXISTS (SELECT name FROM sysobjects
WHERE name = 'DebitStores')
DROP PROCEDURE DebitStores
You should make sure to check for stored procedure dependencies with the sp_depends stored procedure before dropping the stored
procedure.
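For example:
EXEC sp_depends 'DebitStores' --reports dependency information for the procedure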
To drop a stored procedure interactively in Management Studio, simply drill down to the database and select the Stored Procedures
node from the console tree. Select the stored procedure, right-click, and select Delete.
To obtain information from SQL Server about the stored procedures attached to the database, you should execute the system stored
procedure sp_helptext. This system stored procedure returns the code for database objects like stored procedures, rules, and defaults.
It queries the code in syscomments as mentioned earlier. Call sp_helptext as follows:
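EXEC sp_helptext 'DebitStores' --returns the source of the procedure from the earlier example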
The Example
Let’s now look at the debit/credit example we documented in the stored procedure deployment plan discussed earlier.
SET QUOTED_IDENTIFIER ON
GO
SET ANSI_NULLS ON
GO
CREATE PROC jrs_CRDR
/*
Script Name: jrs_CRDR
Description: Credit/Debit for Items/Orders
Usage: For stock picking
Return Code: -1 to -10
Author: Jeffrey R. Shapiro
Version: 1.1
Date Created: 9/25/2005
Revision History:
*/
@SKU int, @IN int --the parameter list here is a sketch inferred from the code that follows
AS
BEGIN TRANSACTION
DECLARE @Amt int, @Debit int --Value for current number of Items in stock
SET @Amt = 0 /*assignment not necessary but nice to see how it changes
in the debugger*/
--First get the stock level for the sku and see if we can debit
SELECT @Amt = (SELECT Quantity FROM Customers.dbo.Items
WHERE ItemNumber = @SKU)
IF @Amt IS NULL
BEGIN
ROLLBACK TRANSACTION
RAISERROR
('Bad SKU. Please call stock controller about %d', 16, 1, @SKU)
RETURN (-4)
END
--A sketch of the debit and credit statements; the Filled column is illustrative
UPDATE Customers.dbo.Items
SET Quantity = Quantity - 1 --debit one item from stock
WHERE ItemNumber = @SKU
UPDATE Orders
SET Filled = 1 --credit the item to the customer's order
WHERE OrderItem = @IN --at this item number
IF @@ERROR <> 0
BEGIN
ROLLBACK TRANSACTION
RETURN (-8)
END
COMMIT
PRINT 'Item posted'
GO
SET QUOTED_IDENTIFIER OFF
GO
SET ANSI_NULLS ON
GO
In the preceding code I used PRINT to report that the item posted okay (thanks to the commit that only gets called if the transaction is
kosher). But I coded in various return codes that would get returned on an error. These could be suppressed and confined to the stored
procedure or returned as an output value to the client. In other words you can loop through return codes in the procedure and send a
related message to the user, or just send the return code to the client and let logic on the client decide how to proceed. I prefer to keep
the error codes local to the procedure, which means I can change, at any time, what I tell the client about the errors or what course of
action to take.
--The function header below is a sketch; the function name is illustrative
CREATE FUNCTION dbo.udf_MonthName (@MONNUMBER char(2))
RETURNS char(3)
AS
begin
DECLARE @fData char(2), @val1 char(3), @returnval char(3)
select @fData=@MONNUMBER
SELECT
@val1=CASE @fData
WHEN '01' THEN 'JAN'
WHEN '02' THEN 'FEB'
WHEN '03' THEN 'MAR'
WHEN '04' THEN 'APR'
WHEN '05' THEN 'MAY'
WHEN '06' THEN 'JUN'
WHEN '07' THEN 'JUL'
WHEN '08' THEN 'AUG'
WHEN '09' THEN 'SEP'
WHEN '10' THEN 'OCT'
WHEN '11' THEN 'NOV'
WHEN '12' THEN 'DEC'
END
begin
select @returnval=@val1
end
return @returnval
end
Use ALTER FUNCTION to change this function. UDFs written in C# or VB that run on the CLR are more exciting; we discuss these next.
The first thing you have to do before you can execute anything on the SQL Server CLR is enable it (the scope is server-wide). To
enable the CLR, execute the following T-SQL code in Management Studio:
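sp_configure 'clr enabled', 1 --enables CLR integration for the whole server
GO
RECONFIGURE
GO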
Now you can use Visual Studio to create your code. Create a solution for creating class libraries, and set up a new project specifically
for stored procedures. Call the class StoredProcedures or something similar. In the class that is created, you will need the following
directives:
using System;
using System.Data;
using System.Data.Sql;
using System.Data.SqlTypes;
using Microsoft.SqlServer.Server;
Now add the definition for a partial class (if Visual Studio has not already done so).
public partial class StoredProcedures
{
    [Microsoft.SqlServer.Server.SqlProcedure]
    public static void GetCurrentDate()
    {
        // Send the current date back to the caller through the SqlPipe
        SqlPipe p = SqlContext.Pipe;
        p.Send(System.DateTime.Today.ToString());
    }
}
The stored procedure class is now ready and must be installed to SQL Server. This can be done as follows in T-SQL using
Management Studio:
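--A sketch; the assembly name and path are illustrative
CREATE ASSEMBLY StoredProcedures
FROM 'C:\Assemblies\StoredProcedures.dll'
WITH PERMISSION_SET = SAFE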
This code installs your assembly to SQL Server. Once the assembly has been installed (you can see it under the Assemblies folder in
your specific database), you can add the stored procedure into SQL Server as follows:
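--A sketch; the two-part class reference assumes the class has no namespace
CREATE PROCEDURE GetCurrentDate
AS EXTERNAL NAME StoredProcedures.[StoredProcedures].GetCurrentDate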
The above stored procedure simply returns the current date (of course, the built-in function GetDate() does that as well, but we
needed a simple example). Now let’s get a little more sophisticated and create a stored procedure that takes a parameter. Note the use
of the SqlPipe construct, which is used to return data to the client that called the stored procedure.
[Microsoft.SqlServer.Server.SqlProcedure]
public static void GetFormattedDate(int Option)
{
string s = System.DateTime.Today.ToString();
SqlPipe p = SqlContext.Pipe;
switch (Option)
{
case 1:
s = s.Remove(10);
p.Send(s);
break;
case 2:
//other options
default:
break;
}
}
To install this parameter-driven procedure, you need to specify the parameter in T-SQL as follows:
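--A sketch following the earlier naming
CREATE PROCEDURE GetFormattedDate
@Option int
AS EXTERNAL NAME StoredProcedures.[StoredProcedures].GetFormattedDate
You can then execute both procedures: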
exec GetCurrentDate
exec GetFormattedDate 1
Let’s now look at CLR-based UDFs. The GetFormattedDate stored procedure is actually a better candidate for a CLR function than it
is as a stored procedure. So let’s reimplement it accordingly.
[SqlFunction(DataAccess = DataAccessKind.Read)]
public static SqlString GetFormattedDate(int Option)
{
string s = System.DateTime.Today.ToString();
switch (Option)
{
case 1:
s = s.Remove(10);
break;
case 2:
//other options
default:
break;
    }
    return s;
}
Creating the assembly and installing it to SQL Server is the same process as described earlier for stored procedures (see also Chapter
11). The T-SQL code for installing the function is a little different.
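--A sketch; note the RETURNS clause required for a function
CREATE FUNCTION GetFormattedDate (@Option int)
RETURNS nvarchar(64)
AS EXTERNAL NAME StoredProcedures.[StoredProcedures].GetFormattedDate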
Remember, always make sure you specify the exact path to the function when installing the function; that is, the namespace, class,
and method. Otherwise you will get an error that the function cannot be found.
To Recap
We covered a lot of ground together in this chapter dealing with triggers and stored procedures. As you can see, trigger and stored
procedure writing and management can consume substantial resources, and without proper planning, documentation, change control,
archiving, source code maintenance, modeling, and so on, you can create a lot of problems for yourself or the team.
If you are new to trigger and stored procedure writing, the change in development style and philosophy can put a lot of strain on
mental and physical resources. And the conversion of client-side, inline SQL code to server-side triggers and stored procedures can be
expensive in terms of both time and materials.