Microsoft SQL Database Analytics Paper
Microsoft SQL Database Analytics Paper
Microsoft SQL Database Analytics Paper
Accelerating Large-Scale
Business Analytics
Contents Technology
Executive Summary . . . . . . . . . . . . . . 1 Intel Xeon Processor-Based Platforms
Technology . . . . . . . . . . . . . . . . . . . . . . 2 In our configuration, we rely on the advantage of in-memory computing for fast
Intel Xeon Processor- insights and decision making. Infrastructure built on the Intel Xeon processor
Based Platforms. . . . . . . . . . . . . . . . 2 E7-8800 v4 product family can deliver real-time analytics services and open up
Intel Data Center SSDs . . . . . . . . 2 new data-driven business opportunities. Designed for the most mission-critical
workloads and the always-on enterprise, the processor combines large memory
Intel RAID Controllers . . . . . . . . . 2 capacities with high performance, reliability, and virtualization capabilities to keep
Microsoft SQL Server 2016: Data data centers supplying business advantage without interruption. 2
Warehousing with Improved
Intel Data Center SSDs
Columnstore Technology . . . . . . . . . 2
System Configuration. . . . . . . . . . . . . 3 To ensure that data centers can keep up with the speed of business, all thats
needed is an I/O-optimized infrastructure using the extraordinary performance
Database Creation and and reliability of Intel Data Center SSDs. 3 By pairing Intel Data Center SSDs with
Performance Testing . . . . . . . . . . . . . 6 Intel CPUs, chipsets, firmware, and drivers, we built a seamless system, enabling
Building the Database . . . . . . . . . . 6 large amounts of data to be transferred to processors quickly, eliminating I/O
bottlenecks. Faster data servers give organizations a competitive edge as they
Test Queries . . . . . . . . . . . . . . . . . . . 6
deal with ever-increasing infrastructure demands. 2
Results & Analysis. . . . . . . . . . . . . . . . 7
Intel RAID Controllers
The New Intel Xeon Processor
E7 v4 Family Performance SAS-based Intel RAID controllers, featuring the dual-core LSI SAS3108
Combined with Improved RAID-on-Chip (ROC) processor, offer significant performance enhancements
SQL Server 2016 for solutions architected with 12Gb/s or 6Gb/s SAS drives.4 The read/write
Columnstore Engine. . . . . . . . . . . . 7 performance ideally suits the controllers for a broad range of application
workloads, such as enterprise data center applications, cloud computing, and
SQL Server 2016 Optimized
content applications. The massive growth of cloud and big data applications
for Intel AVX2. . . . . . . . . . . . . . . . . 8
requires enterprise features to manage, optimize, and improve the efficiency of
Summary . . . . . . . . . . . . . . . . . . . . . . . . 9 growing data centers. The 12Gb/s SAS Intel RAID controller StorCLI utility
Acknowledgements . . . . . . . . . . . . . 10 features server storage management, either locally or remotely, from a single
pane of glass with the RAID Web Console application. This generation of storage,
Appendix A . . . . . . . . . . . . . . . . . . . . . 11 spurred by the increased adoption and trending lower costs of flash, requires
Test Queries . . . . . . . . . . . . . . . . . . . . 11 more bandwidth and I/O per second (IOPS) capability than the previous generation.
In addition to outstanding performance, the Intel 12Gb/s SAS product family
Test Query 1 . . . . . . . . . . . . . . . . . . 11
supports advanced features such as FastPath I/O and RAID SSD Cache, dual-level
Test Query 2 . . . . . . . . . . . . . . . . . . 11 advanced RAID types, and revertible hot spare. 2
Test Query 3 . . . . . . . . . . . . . . . . . . 11
Test Query 4 . . . . . . . . . . . . . . . . . . 12
Microsoft SQL Server 2016: Data Warehousing
with Improved Columnstore Technology5
Test Query 5 . . . . . . . . . . . . . . . . . . 12
Microsoft SQL Server 2016 has made significant improvements in data
Test Query 6 . . . . . . . . . . . . . . . . . . 13 warehousing technologies and performance, including columnstore features,
Test Query 7 . . . . . . . . . . . . . . . . . . 13 as well as many other improvements.
Test Query 8 . . . . . . . . . . . . . . . . . . 14 Columnstore indices offer great advantages over traditional row stores for analytics
Test Query 9 . . . . . . . . . . . . . . . . . . 14 and data warehousing queries. They are ideally suited for star schemas and tables
with billions of rows, which are commonly seen. Among their advantages for
Test Query 10. . . . . . . . . . . . . . . . . 15 analytics are:
Test Query 11 . . . . . . . . . . . . . . . . . 15
Up to 10x Compression in Data Size
Test Query 12. . . . . . . . . . . . . . . . . 15
Data warehouses are very large by nature, and the compression offered by
Appendix B . . . . . . . . . . . . . . . . . . . . . 16 columnstore index technologies offers not only space and cost savings, but also
Building the Database . . . . . . . . . 16 significantly increased performance. These benefits are due to the dramatically
reduced I/O requirements given by the compression, coupled with the ability to
Performance Tests with the Intel
scan only the specific columns required by each query. This compression also
Xeon Processor E7-8890 v3. . . 16
reduces the amount of memory required to hold a given number of rows from
In-Memory Performance of the source data warehouse.
Columnstore Engine. . . . . . . . . . . . . 17
Performance Scaling with
Data Size. . . . . . . . . . . . . . . . . . . . . . . . 17
Performance of Running
Queries Concurrently. . . . . . . . . . . . 18
2
White Paper | Accelerating Large-Scale Business Analytics
Additional Indices
SQL Server 2016 adds the capability to add additional (B-Tree) indices to columnstore-based tables, which enables efficient
single-row lookup. In addition to these architectural features, we have further optimized the processing of queries in
columnstore indices in the following ways:
Tempdb
Backup
1 Intel Xeon processor
Data E7-8800 v3
Dual
Backup SAS SF 8088
2 connectors
Intel RAID
Controller
Log
System Configuration
The Intel technology-powered white box used is a 4U
Intel Server System that supports four Intel Xeon
processor E7-8890 v4 product family processors.
Attached to the platform, we use 8 LSI 620J JBOD storage
enclosures (see Figure 16), which are dual-connected to
the Intel RAID controllers via SAS SF-8088 cables, which
provides 4 external lanes with a 12Gb/s transfer rate per
lane. We used the directly attached storage units for storing
data (i.e., raw data and database files), tempdb (i.e., SQL
Server system database used for query execution), and
backups. The most crucial performance aspect of our storage
subsystem for our workload was read bandwidth for data and
tempdb drives. Before sizing and configuring the drives, we
ran Iometer tests for 512K sequential reads to measure I/O
bandwidth of each storage unit, populated by 1.6TB Intel
SSDs. In the graph below, you can observe the amount of I/O
bandwidth that can be attained as we add more disks to
the enclosure.
3
White Paper | Accelerating Large-Scale Business Analytics
4000
3500
3000
2500
2000
1500
1000
500
0
0 1 2 3 4 5 6 7 8 9 10 11 12
Number of Disks
As can be seen from Figure 2, Intel RAID cards can easily As for system memory, we fully populated our DIMM sockets
scale up to an approximate 4.5GB/s data transfer rate with 8 with 64GB DDR4 LRDIMMs, giving us a total memory capacity
SSDs. Beyond that point, we are limited by SAS transfer rates. of 6TB. To make use of the whole physical address space,
Based on our observation, we used 8 SSDs for our data drives we used Microsoft Windows Server 2016 RTM build, which
per enclosure to provide peak I/O bandwidth and ample supports memory capacity up to 12TB.
storage (about 11TB per data drive, 88TB total capacity). We
Table 1 lists the specifics about the components we used in
measured total I/O bandwidth of over 25GB/s across all data
our configuration.
drives with our configuration detailed below.
Our storage configuration is as follows: Each storage bay is
populated with 24 x 1.6TB 2.5 Intel SSD DC S3500 Series
drives. We created RAID0 across 8 SSDs for data drives
and 4 SSDs for tempdb drives. We created RAID6 over
the remaining 12 SSDs to hold our database backup files,
ensuring good performance with adequate redundancy
(i.e., around 14.5TB per bay, which allows us to keep 2 sets of
backups). Lastly, for SQL log files, we included a storage unit
with 24 x 800GB 2.5 Intel SSDs daisy-chained to one of the
main storage bays. Figure 1 illustrates the storage layout of
our system.
4
White Paper | Accelerating Large-Scale Business Analytics
CPU
Processor Intel Xeon processor E7-8890 v4
# Cores/Threads 24/48
Last Level Cache 60MB
Base/Turbo Frequency 2.2GHz/3.4GHz
QPI 3 links, 9.6GT/s
Instruction Set Extensions AVX2
PLATFORM
# Sockets 4
Chipset C600/X79 Express Chipset
# PCI Express (PCIe) ports 12
Networking 2x Intel Network Adapter X540-T2 (10Gb/s)
STORAGE
Intel SSD DC S3500 Series
SSDs Data + Tempdb + Backups x192
(1.6TB, 2.5 SATA 3.0)
Intel SSD DC S3700 Series
SSDs Log x24
(800GB, 2.5 SATA 3.0)
Intel RAID Controller RS3SC008
RAID Controllers x8
PCIe x8 Gen3
LSI LSI00217 / 620J 24-Bay 2U
Storage Enclosure x9
Rackmount JBOD
MEMORY
Memory Type DDR4
DIMM Size and Type 64GB LRDIMM, dual rank
Memory Frequency (MHz) 2133
DIMMs/Channel 3
System Memory Capacity (GB) 6144
SOFTWARE
Microsoft SQL Server 2016 Enterprise RTM
Database Software
64-bit Edition (build no: 13.0.1601.5)
Microsoft Windows Server 2016 TP5
OS Distribution
64-bit (build no: 14300)
5
White Paper | Accelerating Large-Scale Business Analytics
Columnstore Database
Total File Size: 50TB Backup
Bulk Insert
into CCI
Backup
35TB
Raw Data
100TB Restore
6
White Paper | Accelerating Large-Scale Business Analytics
900
914
Query Run Time (seconds)
2.21
800 2.0
700
Speed Up
1.53
600 1.5
1.41
1.17
400 1.06 1.0
413
401
300
200 .5
161
138
100
105
5.31
4.97
79
74
52
0 0
TQ1 TQ2 TQ3 TQ4 TQ5 TQ6 TQ7
Intel Xeon processor E7-8890 v3 @ 2.5GHz Intel Xeon processor E7-8890 v4 @ 2.2GHz Speed Up
7
White Paper | Accelerating Large-Scale Business Analytics
8
White Paper | Accelerating Large-Scale Business Analytics
700 2.31
Query Run Time (seconds)
671
2.0
600 1.31
1.31
594
540
500
511
Speed Up
1.5
496
452
400
1.19
1.16 1.15
300 1.09 1.0
200
.5
150
143
138
100 122
114
99
4.97
12
0 0
TQ1 TQ2 TQ8 TQ9 TQ10 TQ11 TQ12
TQ2 is the perfect value-proposition scenario in our product family and enterprise-class Intel SSDs and RAID
experiments. It is a query with a relatively simple query plan, controllers. We demonstrated that it is possible to build and
and it spends most of its query execution time in Intel AVX2- run a 100TB data warehouse on a symmetric multiprocessor
optimized scan operator. This query calculates an aggregate (SMP) configuration using technology that is readily available
value over a dataset from a given year with some additional today from Intel and Microsoft. Our system with four Intel
predicates (i.e., processing roughly 12 billion rows). Hence, Xeon processor E7-8800 v4 product family processors fuels
performance differences in query run times can be directly the advanced business analytics capabilities of Microsoft
attributed to improvements from vectorizing the algorithms. SQL Server 2016 to deliver stunning performance, processing
As query plans get more complicated, measurable benefits 100TB worth of data to bring key business insights in a
from utilizing instruction-level parallelism will also reduce matter of minutes. SQL Server 2016, optimized for Intel
in accordance with Amdahls law, but are still impactful, architecture, makes use of large system memory that can
showing a reduction of 25% in query run times on average fit terabytes of data, eliminating costly I/O operations.
with Intel AVX2.10 Enterprise-level Intel SSDs ensure fast data access,
delivering performance as needed.
We encourage readers of this paper to explore the
possibilities with Intel SSE for different use cases in Through continuous innovation, Intel and Microsoft are
the online Intel 64 and IA-32 Architectures Software committed to bringing exciting technology to customers and
Developer Manuals and easy-to-use Intel Intrinsics more power to their business analytics solutions. Stay tuned.
Guide tool.
Acknowledgements
Summary We would like to acknowledge:
In this paper, we presented our findings on running a large
Lindsey Allen for spearheading the project
(100TB) data warehousing application using Microsoft
SQL Server 2016 on the Intel Xeon processor E7-8800 v4 Jamie Reding for his TPC-H benchmark expertise
9
White Paper | Accelerating Large-Scale Business Analytics
This paper has been updated to reflect the changes to our test platform and our most recent results using the latest-generation Intel Xeon processor E7-8890 v4. In the earlier version of the
paper, we used the Intel Xeon processor E7-8890 v3 to build and run performance tests. We have retained sections of the earlier paper to guide users on how we set up our experiments and
updated performance results obtained using the latest processors. For previous experiment results, please refer to Appendix B.
Disclaimer: The workload used in this paper is derived from portions of the TPC-H Benchmark* and, as such, is not comparable to published TPC-H Benchmark results.
1. http://www.intel.com/content/www/us/en/processors/xeon/xeon-processor-e7-family.html
2. I ntel technologies' features and benefits depend on system configuration and may require enabled hardware, software or service activation. Performance varies depending on system
configuration. No computer system can be absolutely secure. Check with your system manufacturer or retailer or learn more at http://www.intel.com. For more details, please refer to the Legal
Notice section in this paper.
3. https://www-ssl.intel.com/content/www/us/en/solid-state-drives/data-center-family.html
4. http://www.intel.com/content/www/us/en/servers/raid/raid-controller-rs3sc008.html
5. Microsoft claims based on SQL Server customer case studies. See https://customers.microsoft.com for more details.
6. Images used for illustration purposes only. They may not reflect actual components.
7. Since both the Intel Xeon processor E7-8890 v4 and the final Windows Server 2016 were not publicly available at the time of this test, we used the previous-generation Intel Xeon processor
E7-8890 v3 and an earlier build of Windows Server 2016 for data generation.
8. 36% more performance average on 7 ad-hoc data warehousing queries, comparing the Intel Xeon processor E7-8890 v4 to the Intel Xeon Processor E7-8890 v3 on a 4-socket server with
6TB DDR4 memory, 100TB storage for data (64 Intel SSD DC S3500 Series) running SQL Server 2016 RTM on Windows Server 2016.
9. SQL Server 2016 supports Intel SSE4.2, Intel AVX, and Intel AVX2 instruction set extensions.
10. Up to 2.3x performance improvement and 25% average reduction in query run times on 7 ad-hoc data warehousing queries, comparing customized SQL Server 2016 RTM with Intel AVX2
instruction set support to same engine without Intel AVX2 support, on a 4-socket server with Intel Xeon processor E7-8890 v4 processors, 6TB DDR4 memory, 100TB storage for data
(64 x Intel SSD DC S3500 Series) running on Windows Server 2016. Reduction in query run time calculated as: ([1 (AVX2 run time / scalar run time)] x 100). Average reduction in run time is
geometric mean of reduction in query run times for the 7 queries.
Copyright 2017 Intel Corporation. All rights reserved. Intel, the Intel logo, Intel Inside, the Intel Inside logo, and Xeon are trademarks of Intel Corporation or its subsidiaries in the U.S. and/or
other countries.
*Other names and brands may be claimed as the property of others
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark* and MobileMark*, are
measured using specific computer systems, components, software, operations, and functions. Any change to any of those factors may cause the results to vary. You should consult other
information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.
For more information, visit intel.com/performance.
10
White Paper | Accelerating Large-Scale Business Analytics
Appendix A
Test Queries
In this section, we present the queries used in our tests.
Test Query 1
Test Query 1 creates a report on pricing on sales up to approximately the last 2.5 months of sales data.
This query is derived from TPC-H Q1.
SELECT L_RETURNFLAG,
L_LINESTATUS,
SUM(L_QUANTITY) AS SUM_QTY,
SUM(L_EXTENDEDPRICE) AS SUM_BASE_PRICE,
SUM(L_EXTENDEDPRICE*(1-L_DISCOUNT)) AS SUM_DISC_PRICE,
SUM(L_EXTENDEDPRICE*(1-L_DISCOUNT)*(1+L_TAX)) AS SUM_CHARGE,
AVG(L_QUANTITY) AS AVG_QTY,
AVG(L_EXTENDEDPRICE) AS AVG_PRICE,
AVG(L_DISCOUNT) AS AVG_DISC,
COUNT_BIG(*) AS COUNT_ORDER
FROM LINEITEM
WHERE L_SHIPDATE <= dateadd(dd, -73, cast(1998-12-01as date))
GROUP BY L _RETURNFLAG,
L_LINESTATUS
ORDER BY L _RETURNFLAG,
L_LINESTATUS
Test Query 2
Test Query 2 estimates the amount of revenue increase assuming that discounts within a range were not applied within a year.
This query is derived from TPC-H Q6.
SELECT SUM(L_EXTENDEDPRICE*L_DISCOUNT) AS REVENUE
FROM LINEITEM
WHERE L_SHIPDATE >= 1993-01-01 AND
L_SHIPDATE < dateadd (yy, 1, cast(1993-01-01 as date)) AND
L_DISCOUNT BETWEEN 0.04 - 0.01 AND 0.04 + 0.01 AND
L_QUANTITY < 25
Test Query 3
Test Query 3 lists the most important set of suppliers within a nation. This query is derived from TPC-H Q11.
SELECT PS_PARTKEY,
SUM(PS_SUPPLYCOST*PS_AVAILQTY) AS VALUE
FROM PARTSUPP,
SUPPLIER,
NATION
WHERE PS_SUPPKEY = S_SUPPKEY AND
S_NATIONKEY = N_NATIONKEY AND
N_NAME = INDIA
GROUP BY PS_PARTKEY
HAVING SUM(PS_SUPPLYCOST*PS_AVAILQTY) >
( S
ELECT SUM(PS_SUPPLYCOST*PS_AVAILQTY) * 0.0000000010
FROM P ARTSUPP,
SUPPLIER,
NATION
S_SUPPKEY = S_SUPPKEY AND
WHERE P
S_NATIONKEY = N_NATIONKEY AND
N_NAME = INDIA
)
ORDER BY VALUE DESC
11
White Paper | Accelerating Large-Scale Business Analytics
Test Query 4
Test Query 4 monitors the percentage of promotional revenue within a given month. This query is derived from TPC-H Q14.
Test Query 5
Test Query 5 finds the supplier that made the biggest contribution to overall revenue within a quarter.
This query is derived from TPC-H Q15.
CREATE VIEW REVENUE0 (SUPPLIER_NO, TOTAL_REVENUE)
AS
SELECT L_SUPPKEY,
SUM(L_EXTENDEDPRICE*(1-L_DISCOUNT))
FROM LINEITEM
WHERE L_SHIPDATE >= 1993-07-01 AND
L_SHIPDATE < dateadd(mm, 3, cast (1993-07-01 as date))
GROUP BY L_SUPPKEY
GO
SELECT S_SUPPKEY,
S_NAME,
S_ADDRESS,
S_PHONE,
TOTAL_REVENUE
FROM SUPPLIER,
REVENUE0
WHERE S_SUPPKEY = SUPPLIER_NO AND
TOTAL_REVENUE = ( SELECT MAX(TOTAL_REVENUE)
FROM REVENUE0
)
ORDER BY S_SUPPKEY
DROP VIEW REVENUE0
12
White Paper | Accelerating Large-Scale Business Analytics
Test Query 6
Test Query 6 displays the amount of profit on certain parts, grouped by suppliers nation for each year.
This query is derived from TPC-H Q9.
SELECT NATION,
O_YEAR,
SUM(AMOUNT) AS SUM_PROFIT
FROM ( SELECT N_NAME AS NATION,
datepart(yy, O_ORDERDATE) AS O_YEAR,
L_EXTENDEDPRICE*(1-L_DISCOUNT)-PS_SUPPLYCOST*L_QUANTITY AS AMOUNT
FROM PART,
SUPPLIER,
LINEITEM,
PARTSUPP,
ORDERS,
NATION
WHERE S_SUPPKEY = L_SUPPKEY AND
PS_SUPPKEY = L_SUPPKEY AND
PS_PARTKEY = L_PARTKEY AND
P_PARTKEY = L_PARTKEY AND
O_ORDERKEY = L_ORDERKEY AND
S_NATIONKEY = N_NATIONKEY AND
P_NAME LIKE %%smoke%%
) AS PROFIT
GROUP BY N ATION,
O_YEAR
ORDER BY N ATION,
O_YEAR DESC
Test Query 7
Test Query 7 displays customers that caused high revenue loss due to returns. This query is derived from TPC-H Q10.
SELECT TOP 20
C_CUSTKEY,
C_NAME,
SUM(L_EXTENDEDPRICE*(1-L_DISCOUNT)) AS REVENUE,
C_ACCTBAL,
N_NAME,
C_ADDRESS,
C_PHONE,
C_COMMENT
FROM CUSTOMER,
ORDERS,
LINEITEM,
NATION
WHERE C_CUSTKEY = O_CUSTKEY AND
L_ORDERKEY = O_ORDERKEY AND
O_ORDERDATE >= 1993-05-01 AND
O_ORDERDATE < dateadd(mm, 3, cast(1993-05-01 as date )) AND
L_RETURNFLAG = R AND
C_NATIONKEY = N_NATIONKEY
GROUP BY C_CUSTKEY,
C_NAME,
C_ACCTBAL,
C_PHONE,
N_NAME,
C_ADDRESS,
C_COMMENT
ORDER BY REVENUE DESC
13
White Paper | Accelerating Large-Scale Business Analytics
Test Query 8
Test query 8 finds lowest-cost supplier for a part in a region. This query is derived from TPC-H Query 2.
SELECT TOP 1
00
S_ACCTBAL,
S_NAME,
N_NAME,
P_PARTKEY,
P_MFGR,
S_ADDRESS,
S_PHONE,
S_COMMENT
FROM P
ART,
SUPPLIER,
PARTSUPP,
NATION,
REGION
WHERE P
_PARTKEY = PS_PARTKEY AND
S_SUPPKEY = PS_SUPPKEY AND
P_SIZE = 50 AND
P_TYPE LIKE %%COPPER AND
S_NATIONKEY = N_NATIONKEY AND
N_REGIONKEY = R_REGIONKEY AND
R_NAME = AFRICA AND
PS_SUPPLYCOST = ( SELECT MIN(PS_SUPPLYCOST)
FROM PARTSUPP,
SUPPLIER,
NATION,
REGION
WHERE P_PARTKEY = PS_PARTKEY AND
S_SUPPKEY = PS_SUPPKEY AND
S_NATIONKEY = N_NATIONKEY AND
N_REGIONKEY = R_REGIONKEY AND
R_NAME = AFRICA
)
ORDER BY S_ACCTBAL DESC,
N_NAME,
S_NAME,
P_PARTKEY
Test Query 9
Test query 9 displays top 10 unshipped orders with the highest value. This query is derived from TPC-H Query 3.
SELECT TOP 10
L_ORDERKEY,
SUM(L_EXTENDEDPRICE*(1-L_DISCOUNT)) AS REVENUE,
O_ORDERDATE,
O_SHIPPRIORITY
FROM CUSTOMER,
ORDERS,
LINEITEM
WHERE C_MKTSEGMENT = BUILDING AND
C_CUSTKEY = O_CUSTKEY AND
L_ORDERKEY = O_ORDERKEY AND
O_ORDERDATE < 1995-03-05 AND
L_SHIPDATE > 1995-03-05
GROUP BY L_ORDERKEY
O_ORDERDATE,
O_SHIPPRIORITY
ORDER BY REVENUE DESC,
O_ORDERDATE
14
White Paper | Accelerating Large-Scale Business Analytics
Test Query 10
Test query 10 lists revenue volume from local suppliers. This query is derived from TPC-H Query 5.
SELECT
N_NAME,
SUM(L_EXTENDEDPRICE*(1-L_DISCOUNT)) AS REVENUE
FROM CUSTOMER,
ORDERS,
LINEITEM,
SUPPLIER,
NATION,
REGION
WHERE C_CUSTKEY = O_CUSTKEY AND
L_ORDERKEY = O_ORDERKEY AND
L_SUPPKEY = S_SUPPKEY AND
C_NATIONKEY = S_NATIONKEY AND
S_NATIONKEY = N_NATIONKEY AND
N_REGIONKEY = R_REGIONKEY AND
R_NAME = MIDDLE EAST AND
O_ORDERDATE >= 1993-01-01 AND
O_ORDERDATE < DATEADD(YY, 1, cast (1993-01-01as date))
GROUP BY N_NAME
ORDER BY REVENUE DESC
Test Query 11
Test query 11 checks if shipping mode impacts delivery dates of high-priority orders. This query is derived from TPC-H Query 12.
SELECT L_SHIPMODE,
SUM( CASE WHEN O_ORDERPRIORITY = 1-URGENT OR
O_ORDERPRIORITY = 2-HIGH
THEN 1
ELSE 0
END) AS HIGH_LINE_COUNT,
SUM( CASE WHEN O_ORDERPRIORITY <> 1-URGENT AND
O_ORDERPRIORITY <> 2-HIGH
THEN 1
ELSE 0
END) AS LOW_LINE_COUNT
FROM ORDERS,
LINEITEM
WHERE O_ORDERKEY = L_ORDERKEY AND
L_SHIPMODE IN (FOB,MAIL) AND
L_COMMITDATE < L_RECEIPTDATE AND
L_SHIPDATE < L_COMMITDATE AND
L_RECEIPTDATE >= 1997-01-01 AND
L_RECEIPTDATE < dateadd(yy, 1,cast (1997-01-01 as date))
GROUP BY L_SHIPMODE
ORDER BY L_SHIPMODE
Test Query 12
Test query 12 calculates average yearly revenue from small-shipment orders. This query is derived from TPC-H Query 17.
SELECT SUM(L_EXTENDEDPRICE)/7.0 AS AVG_YEARLY
FROM LINEITEM,
PART
WHERE P_PARTKEY = L_PARTKEY AND
P_BRAND = Brand#44 AND
P_CONTAINER = JUMBO CASE AND
L_QUANTITY < ( SELECT 0.2 * AVG(L_QUANTITY)
FROM LINEITEM
WHERE L_PARTKEY = P_PARTKEY
15
White Paper | Accelerating Large-Scale Business Analytics
Appendix B
Building the Database
It took about 6 hours to load 10TB of data onto our system. Table 4 provides a summary of the amount of data processed
This is a very impressive result, showing that SQL Server for each step and how SQL Server 2016 performed for the
2016 is capable of loading 100TB worth of data into a whole build.
columnstore database in 60 hours at a rate of 1.6 TB/hour.
Table 4. ETL and Database Management Statistics with Intel Xeon Processors E7-8890 v3
Performance Tests with the Intel Xeon Processor
E7-8890 v3
In this section, we describe three performance tests we
conducted with the test queries TQ17 with the Intel Xeon
processor E7-8890 v3:
1- In-memory performance of SQL Server columnstore
engine We use this test to demonstrate the benefits
of performance gains from utilizing data caching in
memory, reducing I/O requirements. Note that some
queries (even with 100TB data scale) can fit all data in
memory. In cases where data is too big to fit in memory,
we show that SQL Server can still run queries, utilizing
I/O subsystems efficiently. For this test, we ran each
query twice in isolation. Before the query execution, we
ensured that there was no data caching (i.e., we issued
DBCC DROPCLEANBUFFERS command). By doing so, we
ensured that for the first execution, all data was read from
disk. For the second run, data was cached in SQL Server
columnstore object pool. Comparing the two runs, we can
illustrate the benefits of in-memory execution.
2- Performance scaling with respect to data size We
use this test to compare how query execution time scales
with increased amounts of data. To be able to test this, we
built a reference database with the same schema, sized at
3TB. We then compared a second run from queries that
run in memory. Note that on the same system, all queries
running on a 3TB database will be in memory in a second
run, whereas only 5 queries run in memory on the 100TB
database.
3- Performance of queries when they are run
concurrently We use this test to loosely simulate a use
case where multiple clients interact with the database
simultaneously. In this test, we started all queries at the
same time and measured the time to complete all requests.
We then compared how long it took to execute queries
sequentially.
16
White Paper | Accelerating Large-Scale Business Analytics
In-Memory Performance of Apart from 2 queries (TQ6 and TQ7), all queries can fit
Columnstore Engine in memory thanks to data compression. Caching data in
memory provides up to a 30x performance improvement
Table 5 shows execution time for the test queries on 100TB, and 2.5x on average on subsequent operations on the same
demonstrating the benefits of in-memory columnstore data. Note that in cases where the I/O subsystem is not as
technology. performant as our set up, benefits will further increase with
larger system memory.
17
White Paper | Accelerating Large-Scale Business Analytics
TQ7
TQ6
Concurrent
TQ5 1.26x
TQ4 SpeeD Up
TQ3
TQ2
TQ1
Sequential
0 1000 2000 3000 4000 5000 6000 7000 8000 9000 10000
18