In-memory Computing
with SAP HANA on IBM
eX5 Systems
IBM Systems Solution
for SAP HANA
SAP HANA overview
and use cases
Basic in-memory
computing principles
Gereon Vey
Tomas Krojzl
Ilya Krutov
ibm.com/redbooks
SG24-8086-00
Note: Before using this information and the product it supports, read the information in
Notices on page vii.
Contents
Notices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vii
Trademarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . viii
Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ix
The team who wrote this book . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . x
Now you can become a published author, too! . . . . . . . . . . . . . . . . . . . . . . . . xii
Comments welcome. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xii
Stay connected to IBM Redbooks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiii
Chapter 1. History of in-memory computing at SAP . . . . . . . . . . . . . . . . . . 1
1.1 SAP Search and Classification (TREX). . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.2 SAP liveCache . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.3 SAP NetWeaver Business Warehouse Accelerator . . . . . . . . . . . . . . . . . . 3
1.3.1 SAP BusinessObjects Explorer Accelerated . . . . . . . . . . . . . . . . . . . . 5
1.3.2 SAP BusinessObjects Accelerator . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.4 SAP HANA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
Chapter 2. Basic concepts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.1 Keeping data in-memory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.1.1 Using main memory as the data store . . . . . . . . . . . . . . . . . . . . . . . 10
2.1.2 Data persistence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.2 Minimizing data movement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.2.1 Compression . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.2.2 Columnar storage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.2.3 Pushing application logic to the database . . . . . . . . . . . . . . . . . . . . . 19
2.3 Divide and conquer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
2.3.1 Parallelization on multi-core systems . . . . . . . . . . . . . . . . . . . . . . . . 20
2.3.2 Data partitioning and scale-out . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
Chapter 3. SAP HANA overview. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
3.1 SAP HANA overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
3.1.1 SAP HANA architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
3.1.2 SAP HANA appliance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
3.2 SAP HANA delivery model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
3.3 Sizing SAP HANA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
3.3.1 The concept of T-shirt sizes for SAP HANA . . . . . . . . . . . . . . . . . . . 26
3.3.2 Sizing approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
3.4 SAP HANA software licensing. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
Notices
This information was developed for products and services offered in the U.S.A.
IBM may not offer the products, services, or features discussed in this document in other countries. Consult your
local IBM representative for information on the products and services currently available in your area. Any
reference to an IBM product, program, or service is not intended to state or imply that only that IBM product,
program, or service may be used. Any functionally equivalent product, program, or service that does not infringe
any IBM intellectual property right may be used instead. However, it is the user's responsibility to evaluate and
verify the operation of any non-IBM product, program, or service.
IBM may have patents or pending patent applications covering subject matter described in this document. The
furnishing of this document does not grant you any license to these patents. You can send license inquiries, in
writing, to:
IBM Director of Licensing, IBM Corporation, North Castle Drive, Armonk, NY 10504-1785 U.S.A.
The following paragraph does not apply to the United Kingdom or any other country where such
provisions are inconsistent with local law: INTERNATIONAL BUSINESS MACHINES CORPORATION
PROVIDES THIS PUBLICATION "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR
IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF NON-INFRINGEMENT,
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some states do not allow disclaimer of
express or implied warranties in certain transactions, therefore, this statement may not apply to you.
This information could include technical inaccuracies or typographical errors. Changes are periodically made to the
information herein; these changes will be incorporated in new editions of the publication. IBM may make
improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time
without notice.
Any references in this information to non-IBM websites are provided for convenience only and do not in any
manner serve as an endorsement of those websites. The materials at those websites are not part of the materials
for this IBM product and use of those websites is at your own risk.
IBM may use or distribute any of the information you supply in any way it believes appropriate without incurring any
obligation to you.
Any performance data contained herein was determined in a controlled environment. Therefore, the results
obtained in other operating environments may vary significantly. Some measurements may have been made on
development-level systems and there is no guarantee that these measurements will be the same on generally
available systems. Furthermore, some measurements may have been estimated through extrapolation. Actual
results may vary. Users of this document should verify the applicable data for their specific environment.
Information concerning non-IBM products was obtained from the suppliers of those products, their published
announcements or other publicly available sources. IBM has not tested those products and cannot confirm the
accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the
capabilities of non-IBM products should be addressed to the suppliers of those products.
This information contains examples of data and reports used in daily business operations. To illustrate them as
completely as possible, the examples include the names of individuals, companies, brands, and products. All of
these names are fictitious and any similarity to the names and addresses used by an actual business enterprise is
entirely coincidental.
COPYRIGHT LICENSE:
This information contains sample application programs in source language, which illustrate programming
techniques on various operating platforms. You may copy, modify, and distribute these sample programs in any
form without payment to IBM, for the purposes of developing, using, marketing or distributing application programs
conforming to the application programming interface for the operating platform for which the sample programs are
written. These examples have not been thoroughly tested under all conditions. IBM, therefore, cannot guarantee or
imply reliability, serviceability, or function of these programs. You may copy, modify, and distribute these sample
programs in any form without payment to IBM for the purposes of developing, using, marketing, or distributing
application programs conforming to IBM's application programming interfaces.
Trademarks
IBM, the IBM logo, and ibm.com are trademarks or registered trademarks of International Business
Machines Corporation in the United States, other countries, or both. These and other IBM trademarked
terms are marked on their first occurrence in this information with the appropriate symbol (® or ™),
indicating US registered or common law trademarks owned by IBM at the time this information was
published. Such trademarks may also be registered or common law trademarks in other countries. A current
list of IBM trademarks is available on the Web at http://www.ibm.com/legal/copytrade.shtml
The following terms are trademarks of the International Business Machines Corporation in the United States,
other countries, or both:
AIX
BladeCenter
DB2
Global Business Services
Global Technology Services
GPFS
IBM
Intelligent Cluster
Passport Advantage
POWER
PureFlex
RackSwitch
Redbooks
Redpaper
Redbooks (logo)
System x
System z
Tivoli
z/OS
Linux is a trademark of Linus Torvalds in the United States, other countries, or both.
Microsoft, Windows, and the Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both.
Other company, product, or service names may be trademarks or service marks of others.
Preface
This IBM Redbooks publication describes in-memory computing appliances
from IBM and SAP that are based on IBM eX5 flagship systems and SAP HANA.
We first discuss the history and basic principles of in-memory computing, then
we describe the SAP HANA offering, its architecture, sizing methodology,
licensing policy, and software components. We also review the IBM eX5 hardware offerings. Then we describe the architecture and components of the
IBM Systems solution for SAP HANA and its delivery, operational, and support
aspects. Finally, we discuss the advantages of using IBM infrastructure platforms
for running the SAP HANA solution.
This book is intended for SAP administrators and technical solution architects. It
is also for IBM Business Partners and IBM employees who want to know more
about the SAP HANA offering and other available IBM solutions for SAP
customers.
Kevin Barnes
Tamikia Barrow
Mary Comianos
Shari Deiana
Cheryl Gera
Linda Robinson
David Watts
Erica Wazewski
KaTrina Love
From IBM:
Guillermo B. Vazquez
Irene Hopf
Dr. Oliver Rettig
Sasanka Vemuri
Tag Robertson
Thomas Prause
Volker Fischer
Comments welcome
Your comments are important to us!
We want our books to be as helpful as possible. Send us your comments about
this book or other IBM Redbooks publications in one of the following ways:
Use the online Contact us review Redbooks form found at:
ibm.com/redbooks
Send your comments in an email to:
[email protected]
Mail your comments to:
IBM Corporation, International Technical Support Organization
Dept. HYTD Mail Station P099
2455 South Road
Poughkeepsie, NY 12601-5400
Chapter 1. History of in-memory computing at SAP
In-memory computing has a long history at SAP. This chapter provides a short overview of that history. It describes the evolution of SAP in-memory computing and gives an overview of the SAP products involved in this process.
Formerly named SAP NetWeaver Business Intelligence Accelerator; SAP changed the name of the solution in 2009 to SAP NetWeaver Business Warehouse Accelerator. The functions of the solution remain the same.
WinterCorp white paper: "Large-Scale Testing of the SAP NetWeaver BW Accelerator on an IBM
Platform," available at
ftp://ftp.software.ibm.com/common/ssi/sa/wh/n/spw03004usen/SPW03004USEN.PDF
SAP BusinessObjects Explorer offers a simpler, search-like user interface than the standard SAP NetWeaver BW front ends can provide. This broadens the user base towards less experienced BI users.
Founded in 1998 by Hasso Plattner, one of the founders of SAP AG, chairman of the board until
2003, and currently chairman of the supervisory board of SAP AG
Figure 1-1 The evolution of SAP in-memory computing: TREX, BWA 7.0 and BWA 7.20, SAP BusinessObjects Explorer (BOE accelerated), SAP BusinessObjects Data Services, BOA, MaxDB, P*TIME, liveCache, and SAP HANA 1.0
Chapter 2. Basic concepts
In-memory computing is a technology that allows the processing of massive quantities of data in main memory to provide immediate results from analyses and transactions. The data to be processed is ideally real-time data (that is, data that is available for processing or analysis immediately after it is created).
To achieve the desired performance, in-memory computing follows these
basic concepts:
Keep data in main memory to speed up data access.
Minimize data movement by leveraging the columnar storage concept,
compression, and performing calculations at the database level.
Divide and conquer. Leverage the multi-core architecture of modern
processors and multi-processor servers, or even scale out into a distributed
landscape, to be able to grow beyond what can be supplied by a single server.
In this chapter, we describe those basic concepts with the help of a few
examples. We do not describe the full set of technologies employed with
in-memory databases, such as SAP HANA, but we do provide an overview of
how in-memory computing is different from traditional concepts.
Figure 2-1 Data access times of various storage types, relative to RAM (logarithmic scale)
The main memory is the fastest storage type that can hold a significant amount
of data. While CPU registers and CPU cache are faster to access, their usage is
limited to the actual processing of data. Data in main memory can be accessed
more than a hundred thousand times faster than data on a spinning hard disk,
and even flash technology storage is about a thousand times slower than main
memory. Main memory is connected directly to the processors through a
high-speed bus, whereas hard disks are connected through a chain of buses
(QPI, PCIe, SAN) and controllers (I/O hub, RAID controller or SAN adapter, and
storage controller).
Compared with keeping data on disk, keeping the data in main memory can
dramatically improve database performance just by the advantage in access
time.
Figure 2-2 Data savepoints and logs (committed transactions) written to persistent storage over time, up to a power failure
After a power failure, the database can be restarted like a disk-based database.
The database pages are restored from the savepoints and then the database
logs are applied (rolled forward) to restore the changes that were not captured in
the savepoints. This ensures that the database can be restored in memory to
exactly the same state as before the power failure.
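The following minimal Python sketch, which is purely illustrative and not SAP HANA code, shows the idea behind this recovery sequence: restore the pages of the last savepoint, then roll the log forward for all changes written after that savepoint.

def recover(savepoint_pages, log_entries, savepoint_position):
    """Rebuild the in-memory image from the persistent structures."""
    memory = dict(savepoint_pages)            # restore pages from the last savepoint
    for position, key, value in log_entries:
        if position > savepoint_position:     # only changes after the savepoint ...
            memory[key] = value               # ... are rolled forward from the log
    return memory

# The savepoint missed one committed transaction; replaying the log restores it.
pages = {"row1": "A", "row2": "B"}
log = [(1, "row2", "B"), (2, "row3", "C")]    # (log position, key, new value)
print(recover(pages, log, savepoint_position=1))  # {'row1': 'A', 'row2': 'B', 'row3': 'C'}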
2.2.1 Compression
Even though today's memory capacities allow keeping enormous amounts of data in-memory, compressing the data in-memory is still desirable. The goal is to compress data in a way that does not use up the performance gained, while still minimizing data movement from RAM to the processor.

By working with dictionaries that represent text as integer numbers, the database can compress data significantly and thus reduce data movement, while not imposing additional CPU load for decompression; it can even add to the performance.¹ Figure 2-3 on page 13 illustrates this with a simplified example.
Figure 2-3 Example of dictionary compression: the original table with the material and customer name attributes as text (left), the dictionaries assigning an integer value to each distinct attribute value (upper right), and the same table with the text attributes replaced by the corresponding integer values
On the left side of Figure 2-3, the original table is shown containing text attributes
(that is, material and customer name) in their original representation. The text
attribute values are stored in a dictionary (upper right), assigning an integer value
to each distinct attribute value. In the table, the text is replaced by the
corresponding integer value, as defined in the dictionary. The date and time
attribute was also converted to an integer representation. Using dictionaries for text attributes reduces the size of the table because each distinct attribute value only has to be stored once, in the dictionary; each additional occurrence in the table is simply replaced by the corresponding integer value.
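As a simple illustration of this concept, the following Python sketch (illustrative only, not SAP HANA code) dictionary-encodes the customer name column from Figure 2-3. For simplicity, it assigns integer codes in order of first occurrence, whereas a real column store keeps its dictionary in a sorted form, as discussed later for the delta merge.

def dictionary_encode(values):
    """Replace each distinct string with an integer; return dictionary and codes."""
    dictionary = {}                                        # value -> integer code
    encoded = []
    for value in values:
        code = dictionary.setdefault(value, len(dictionary))
        encoded.append(code)
    return dictionary, encoded

customers = ["Dubois", "Di Dio", "Miller", "Newman", "Dubois", "Miller", "Chevrier"]
dictionary, codes = dictionary_encode(customers)
print(dictionary)  # each distinct name is stored only once
print(codes)       # the column becomes a list of small integers: [0, 1, 2, 3, 0, 2, 4]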
The compression factor achieved by this method is highly dependent on the data being compressed. Attributes with few distinct values compress well, whereas attributes with many distinct values do not benefit as much.
While there are other, more effective compression methods that can be employed with in-memory computing, to be useful they must strike the correct balance between compression effectiveness (which gives you more data in memory, or less data movement, that is, higher performance), the resources needed for decompression, and data accessibility (that is, how much unrelated data has to be decompressed to get to the data that you need).
Figure 2-4 Row-based and column-based storage: the row-based store keeps all values of a record together, whereas the column-based store keeps all values of a column together
Both storage models have benefits and drawbacks, which are listed in Table 2-1.

Table 2-1 Benefits and drawbacks of row-based and column-based storage

Row-based storage:
Benefits: All values of a record are stored together, so complete records are easy to insert, update, and retrieve.
Drawbacks: All columns of a record must be read even when a query needs only a few of them, and the data compresses less effectively.

Column-based storage:
Benefits: Only the columns that a query actually needs have to be read, the values within a column compress very well, and column operations such as scans and aggregations are fast.
Drawbacks: Retrieving or updating a complete record means accessing every column of the table, which makes inserts and updates more expensive.
The drawbacks of column-based storage are not as grave as they seem. In most cases, not all attributes (that is, column values) of a row are needed for processing, especially in analytic queries. Also, inserts or updates to the data are less frequent in an analytical environment.² SAP HANA implements both a row-based storage and a column-based storage; however, its performance originates in the use of column-based storage in memory. The following sections describe how column-based storage is beneficial to query performance and how SAP HANA handles the drawbacks of column-based storage.

² An exception is bulk loads (for example, when replicating data in the in-memory database), which can be handled differently.
Figure 2-5 Example of a query (get all records with customer name Miller and material Refrigerator) on dictionary-compressed columns: the query strings are looked up in the dictionary only once, the Customer and Material columns are scanned for the corresponding integer values, the resulting bitmaps are combined with a bitwise AND, and the matching records can be assembled quickly from the column stores because their positions are known (here, the 6th position in every column)
The query asks to get all records with Miller as the customer name and
Refrigerator as the material.
First, the strings in the query condition are looked up in the dictionary. Miller is
represented as the number 4 in the customer name column. Refrigerator is
represented as the number 3 in the material column. Note that this lookup has to
be done only once. Subsequent comparisons with the values in the table are based on integer comparisons, which are less resource intensive than string comparisons.
In a second step, the columns that are part of the query condition (that is, the Customer and Material columns) are read. The other columns of the table are not needed for the selection process. The columns are then scanned for values matching the query condition: in the Customer column, all occurrences of 4 are marked as selected, and in the Material column, all occurrences of 3 are marked.
These selection marks can be represented as bitmaps, a data structure that allows efficient Boolean operations. A bitwise AND combines the bitmaps of the individual columns into a single bitmap representing the selection of records matching the entire query condition. In our example, record number 6 is the only matching record. Depending on the columns selected for the result, additional columns must now be read to compile the entire record. But because the position within the columns is known (record number 6), only those parts of the columns that contain the data for this record have to be read.
This example shows how compression can not only limit the amount of data that must be read for the selection process, but can even simplify the selection itself, while the columnar storage model further reduces the amount of data needed for the selection process. Although the example is simplified, it illustrates the benefits of dictionary compression and columnar storage.
See "Efficient Transaction Processing in SAP HANA Database - The End of a Column Store Myth" by Sikka, Färber, Lehner, Cha, Peh, and Bornhövd, available at:
http://dl.acm.org/citation.cfm?id=2213946
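The following Python sketch replays this query on the dictionary-encoded columns. The dictionaries and column contents are taken from Figure 2-3 and Figure 2-5, and the bitmaps are simplified to plain lists of 0s and 1s.

customer_dict = {"Chevrier": 1, "Di Dio": 2, "Dubois": 3, "Miller": 4, "Newman": 5}
material_dict = {"MP3 Player": 1, "Radio": 2, "Refrigerator": 3, "Stove": 4, "Laptop": 5}

# Encoded column stores (row positions 1..7, see Figure 2-3)
customer_col = [3, 2, 4, 5, 3, 4, 1]   # Dubois, Di Dio, Miller, Newman, Dubois, Miller, Chevrier
material_col = [2, 5, 4, 1, 2, 3, 4]   # Radio, Laptop, Stove, MP3 Player, Radio, Refrigerator, Stove

# The query strings are looked up in the dictionary exactly once ...
want_customer = customer_dict["Miller"]        # 4
want_material = material_dict["Refrigerator"]  # 3

# ... then each column is scanned for the integer value, yielding a bitmap ...
customer_bits = [int(c == want_customer) for c in customer_col]  # [0,0,1,0,0,1,0]
material_bits = [int(m == want_material) for m in material_col]  # [0,0,0,0,0,1,0]

# ... and the bitmaps are combined with a bitwise AND.
result = [c & m for c, m in zip(customer_bits, material_bits)]   # [0,0,0,0,0,1,0]
print([pos + 1 for pos, bit in enumerate(result) if bit])        # [6] -> record number 6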
Figure 2-6 illustrates the lifecycle management for database records in the
column-store.
Figure 2-6 Lifetime management of a data record in the SAP HANA column-store: updates, inserts, and deletes go to the L1 Delta storage, bulk inserts go to the L2 Delta storage, merge operations move the data on towards the Main store, and reads see the unified table across all three storages
The process of moving the data of a table from one storage to the next is called delta merge, and it is an asynchronous process. During the merge operations, the columnar table is still available for read and write operations.
Moving records from L1 Delta storage to L2 Delta storage involves reorganizing
the record in a columnar fashion and compressing it, as illustrated in Figure 2-3
on page 13. If a value is not yet in the dictionary, a new entry is appended to the
dictionary. Appending to the dictionary is faster than inserting, but results in an
unsorted dictionary, which impacts the data retrieval performance.
Eventually, the data in the L2 Delta storage must be moved to the Main storage.
To accomplish that, the L2 Delta storage must be locked, and a new L2 Delta
storage must be opened to accept further additions. Then a new Main storage is
created from the old Main storage and the locked L2 Delta storage. This is a resource-intensive task and has to be scheduled carefully.
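The following Python sketch is a strongly simplified illustration of this record lifetime; it is not the actual SAP HANA implementation. Single inserts land in the L1 Delta storage, an asynchronous merge moves them into the columnar L2 Delta storage, and a second merge freezes the L2 Delta storage and rebuilds the Main storage.

class ColumnTable:
    """Toy model of the three storages from Figure 2-6."""

    def __init__(self):
        self.l1_delta = []   # row-oriented, write-optimized
        self.l2_delta = []   # columnar, unsorted appended dictionary
        self.main = []       # columnar, sorted, highly compressed

    def insert(self, record):
        self.l1_delta.append(record)           # fast single insert or update

    def merge_l1_to_l2(self):
        # Reorganize the records in a columnar fashion (simplified here).
        self.l2_delta.extend(self.l1_delta)
        self.l1_delta = []

    def merge_l2_to_main(self):
        # Lock L2, open a new (empty) L2 for further additions, and build a
        # new Main storage from the old Main and the locked L2 Delta.
        frozen, self.l2_delta = self.l2_delta, []
        self.main = sorted(self.main + frozen)  # resource-intensive rebuild

    def read(self):
        # During the merges the table stays readable: a read sees all storages.
        return self.main + self.l2_delta + self.l1_delta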
Chapter 3. SAP HANA overview
Figure 3-1 The SAP HANA database architecture: the SQL, MDX, and SQL Script interfaces and the calculation engine on top of the relational engines (row store and column store), supported by the transaction, authorization, and metadata managers, and the persistency layer with page management and a logger writing to the data volumes and log volumes on persistent storage. The appliance also includes the SAP HANA client, the SAP HANA studio repository, the Software Update Manager, the LM structure, a JVM, and SAP CAR.
The engine used to store data can be selected on a per-table basis at the time a table is created, and an existing table can be converted from one type to the other. Tables in the row-store are loaded at startup time, whereas tables in the column-store can be loaded either at startup or on demand, during normal operation of the SAP HANA database.
Both engines share a common persistency layer, which provides data
persistency consistent across both engines. There is page management and
logging, much like in traditional databases. Changes to in-memory database
pages are persisted through savepoints written to the data volumes on persistent
storage, which are usually hard drives. Every transaction committed in the SAP
HANA database is persisted by the logger of the persistency layer in a log entry
written to the log volumes on persistent storage. The log volumes use flash
technology storage for high I/O performance and low latency.
The relational engines can be accessed through a variety of interfaces. The SAP
HANA database supports SQL (JDBC/ODBC), MDX (ODBO), and BICS (SQL
DBC). The calculation engine allows calculations to be performed in the
database without moving the data into the application layer. It also includes a
business functions library that can be called by applications to do business
calculations close to the data. The SAP HANA-specific SQL Script language is
an extension to SQL that can be used to push down data-intensive application
logic into the SAP HANA database.
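As a simple illustration of these interfaces, the following Python sketch accesses the SAP HANA database through ODBC. It is a sketch under assumptions: the host name, port, user, password, and sales table are placeholders, and it presumes that the pyodbc module and the SAP HANA client (which provides the HDBODBC driver) are installed on the workstation.

import pyodbc

# Connection parameters are placeholders for a real SAP HANA system.
conn = pyodbc.connect("DRIVER=HDBODBC;SERVERNODE=hanahost:30015;UID=MYUSER;PWD=secret")
cursor = conn.cursor()

# The storage engine is chosen per table when the table is created:
cursor.execute("CREATE COLUMN TABLE sales (id INTEGER, material NVARCHAR(40), quantity INTEGER)")
cursor.execute("INSERT INTO sales VALUES (1, 'Radio', 2)")

# Aggregations are calculated in the database, close to the data:
cursor.execute("SELECT material, SUM(quantity) FROM sales GROUP BY material")
for material, total in cursor.fetchall():
    print(material, total)

conn.commit()
conn.close()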
Table 3-1 SAP HANA T-shirt sizes

T-shirt size                  XS        S and S+    M and M+    L
Compressed data in memory     64 GB     128 GB      256 GB      512 GB
Server main memory            128 GB    256 GB      512 GB      1024 GB
Number of CPUs                2         2           4           8
In addition to the T-shirt sizes listed in Table 3-1, you might come across the
T-shirt size XL, which denotes a scale-out configuration for SAP HANA.
The T-shirt sizes S+ and M+ denote upgradable versions of the S and M sizes:
S+ delivers capacity equivalent to S, but the hardware is upgradable to an M
size.
M+ delivers capacity equivalent to M, but the hardware is upgradable to an L
size.
These T-shirt sizes are used when relevant growth of the data size is expected.
For more information about T-shirt size mappings to the IBM Systems Solution building blocks, refer to 6.3.2, "SAP HANA T-shirt sizes" on page 108.
The sizing methodology for SAP HANA is described in detail in the following SAP Notes and their attached presentations:
Note 1514966 - SAP HANA 1.0: Sizing SAP In-Memory Database
Note 1637145 - SAP NetWeaver BW on HANA: Sizing SAP In-Memory
Database
The following sections provide a brief overview of sizing for SAP HANA.
1. The uncompressed total size of all the tables (without DB indexes) storing the required information in the source database is denoted as A.
2. Although the compression ratio achieved by SAP HANA can vary depending
on the data distribution, a working assumption is that, in general, a
compression factor of 7 can be achieved:
B = ( A / 7 )
B is the amount of RAM required to store the data in the SAP HANA
database.
3. Use only 50% of the total RAM for the in-memory database. The other 50% is
needed for temporary objects (for example, intermediate results), the
operating system, and the application code:
C = B * 2
C is the total amount of RAM required.
Round the total amount of RAM up to the next T-shirt configuration size, as
described in 3.3.1, The concept of T-shirt sizes for SAP HANA on page 26, to
get the correct T-shirt size needed.
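The following small Python sketch applies this rule of thumb and rounds the result up to a T-shirt size using the main memory values from Table 3-1 (the input volume is an assumed example):

def hana_tshirt_size(source_data_gb):
    b = source_data_gb / 7.0        # B: RAM for the compressed data (factor 7 assumed)
    c = b * 2                       # C: total RAM (50% data, 50% temporary objects/OS/code)
    for size, ram_gb in (("XS", 128), ("S", 256), ("M", 512), ("L", 1024)):
        if c <= ram_gb:
            return size, c
    return "scale-out (XL)", c      # beyond a single L server

size, ram = hana_tshirt_size(1500)          # 1500 GB of uncompressed source tables
print(size, round(ram), "GB RAM required")  # M 429 GB RAM required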
The SSD building block, as described in 6.3, "Custom server models for SAP HANA" on page 104, combines Disk_persistence and Disk_log on a single SSD array with sufficient capacity.
The certified hardware configurations already take these rules into account, so
there is no need to perform this disk sizing. However, we still include it here for
your understanding.
Just as in the previous case, only the size of tables is relevant. All associated
indexes can be ignored.
If the data in the source system is compressed, the calculated volume must be adjusted by an estimated compression factor for the given database. Only for DB2 databases, which store the actual compression rates in the data dictionary, does the script calculate the required corrections automatically.

If the source system is a non-Unicode system, a Unicode conversion is part of the migration scenario. In this case, the volume of data must be adjusted by an assumed 10% overhead, because the majority of the data is expected to be numerical values.
Alternatively, an ABAP report can be used to estimate the table sizes. SAP
Note 1736976 has a report attached that calculates the sizes based on the
data present in an existing SAP NetWeaver BW system.
The uncompressed total size of all the column tables (without DB indexes) is denoted as A_column. The uncompressed total size of all the row tables (without DB indexes) is referred to as A_row.

2. The average compression factor is approximately 4 for column-based data and around 1.5 for row-based data. Additionally, an SAP NetWeaver BW system requires about 40 GB of RAM for additional caches and about 10 GB of RAM for SAP HANA components:

B_column = ( A_column / 4 )
B_row = ( A_row / 1.5 )
B_other = 50

For a fully cleansed SAP NetWeaver BW system having 60 GB of row store data, we can therefore assume a requirement of about 40 GB of RAM for row-based data:

B_row = 40

B is the amount of RAM required to store the data in the SAP HANA database for a given type of data.

3. Additional RAM is required for objects that are populated with new data and for queries. This requirement applies to column-based tables:

C = B_column * 2 + B_row + B_other

For fully cleansed BW systems, this formula can be simplified:

C = B_column * 2 + 90

C is the total amount of RAM required.

The total amount of RAM must be rounded up to the next T-shirt configuration size, as described in 3.3.1, "The concept of T-shirt sizes for SAP HANA" on page 26, to get the correct T-shirt size needed.
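Expressed as a small Python function (the input sizes are assumed example values):

def bw_hana_ram_gb(a_column_gb, a_row_gb):
    b_column = a_column_gb / 4.0    # compression factor ~4 for column-based data
    b_row = a_row_gb / 1.5          # compression factor ~1.5 for row-based data
    b_other = 50                    # ~40 GB BW caches + ~10 GB SAP HANA components
    return b_column * 2 + b_row + b_other

# Fully cleansed BW system with 60 GB of row store data: C = B_column * 2 + 90
print(bw_hana_ram_gb(a_column_gb=800, a_row_gb=60))  # 490.0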
3.4 SAP HANA software licensing

Table 3-2 Licensable memory per T-shirt size

T-shirt size    Server main memory    Licensable memory (a)
XS              128 GB                64 - 128 GB
S               256 GB                128 - 256 GB
M               512 GB                256 - 512 GB
L               1024 GB (= 1 TB)      512 - 1024 GB

a. In steps of 64 GB
As you can see from Table 3-2 on page 33, the licensing model allows you to
have a matching T-shirt size for any licensable memory size between 64 GB and
1024 GB.
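A small Python sketch of this stepping rule (the required memory value is an assumed example):

import math

def licensable_memory_gb(required_gb):
    # Licensable memory is bought in steps of 64 GB (see Table 3-2).
    return 64 * math.ceil(required_gb / 64.0)

print(licensable_memory_gb(300))  # 320 -> covered by the M T-shirt size (256 - 512 GB)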
Chapter 4.
Figure 4-1 SAP HANA software components: the server runs the SAP HANA database (with row store, column store, and data modeling), the SAP HANA client, the SAP host agent, the SAP HANA studio repository, the SAP HANA LM structure, the Software Update Manager, and, only in case of replication using the Sybase Replication Server, the SAP HANA load controller, Sybase Replication Server, and Sybase EDCA (Enterprise Connect Data Access); the user workstation optionally runs the SAP HANA studio and the SAP HANA client
The following main function areas are provided by the SAP HANA studio (each
function area is also illustrated by a corresponding figure of the user interface):
Database administration
The key functions are stopping and starting the SAP HANA databases, status
overview, monitoring, performance analysis, parameter configuration, tracing,
and log analysis.
Figure 4-2 shows the SAP HANA studio user interface for database
administration.
Security management
This provides tools that are required to create users, to define and assign
roles, and to grant database privileges.
Figure 4-3 shows an example of the user definition dialog.
Data management
Functions to create, change, or delete database objects (such as tables, indexes, and views), and commands to manipulate data (for example, insert, update, delete, and bulk load).
Figure 4-4 shows an example of the table definition dialog.
Modeling
This is the user interface to work with models (metadata descriptions of how source data is transformed into the resulting views), including the possibility to define new custom models and to adjust or delete existing models.
Figure 4-5 shows a simple analytic model.
Content management
Functions to organize models in packages, to define delivery units for transport into a subsequent SAP HANA system, or to export and import individual models or whole packages.
Content management functions are accessible from the main window in the
modeler perspective, as shown in Figure 4-6.
Figure 4-6 SAP HANA studio: Content functions on the main panel of modeler perspective
Replication management
Data replication into the SAP HANA database is controlled from the Data Provisioning dialog in the SAP HANA studio, where new tables can be scheduled for replication, replication can be suspended, or the replication of a particular table can be interrupted.
Figure 4-7 shows an example of a data provisioning dialog.
The SAP HANA database queries are consumed indirectly using front-end
components, such as SAP BusinessObjects BI 4.0 clients. Therefore the SAP
HANA studio is required only for administration or development and is not
needed for end users.
The SAP HANA studio runs on the Eclipse platform; therefore, every user must have a Java Runtime Environment (JRE) 1.6 or 1.7 installed with the same architecture as the studio (the 64-bit SAP HANA studio requires a 64-bit JRE).
Currently supported platforms are Windows 32-bit, Windows 64-bit, and Linux
64-bit.
Just like the SAP HANA client, the SAP HANA studio is backwards compatible: the revision level of the SAP HANA studio must be the same as or higher than the revision level of the SAP HANA database.
Figure 4-9 Interaction of Software Update Manager (SUM) for SAP HANA with other software components: based on the stack.xml definition and the downloaded archives (IMCE_SERVER*.SAR, IMCE_CLIENT*.SAR, IMC_STUDIO*.SAR, HANALDCTR*.SAR, SAPHOSTAGENT*.SAR, and SUMHANA*.SAR), SUM first performs a self-update through the SAP host agent and then updates the SAP HANA database, SAP HANA client, SAP HANA studio repository, and SAP HANA load controller
The Software Update Manager can download support package stack information
and other required files directly from the SAP Service Marketplace (SMP).
If a direct connection from the server to the SAP Service Marketplace is not
available, the support package stack definition and installation packages must be
downloaded manually and then uploaded to the SAP HANA server. In this case,
the stack generator at the SAP Service Marketplace can be used to identify
required packages and to generate the stack.xml definition file (a link to the stack
generator is located in the download section, subsection Support packages in
the SAP HANA area).
The SUM update file (SUMHANA*.SAR archive) is not part of the stack definition
and needs to be downloaded separately.
The Software Update Manager will first perform a self-update as soon as the
Lifecycle Management perspective is opened in the SAP HANA studio.
After the update is started, all SAP HANA software components are updated to
their target revisions, as defined by the support package stack definition file. This
operation needs downtime; therefore, a maintenance window is required, and the
database must be backed up before this operation.
This scenario is preconfigured during installation using the Unified Installer (see
the document SAP HANA Installation Guide with Unified Installer - section SUM
for SAP HANA Default Configuration for more details). If both the SAP HANA
studio and the Software Update Manager for SAP HANA are running on SPS04,
no further steps are required.
Otherwise, a last configuration step, installing the server certificate into the Java keystore, must be performed on the remote workstation where the SAP HANA studio is located.
For more information about installation, configuration, and troubleshooting of
SUM updates, see the guides:
SAP HANA Installation Guide with Unified Installer
SAP HANA Automated Update Guide
The most common problem during configuration of automatic updates using SUM is a host name mismatch between the server installation (the fully qualified host name that was used during the installation of SAP HANA using the Unified Installer) and the host name used in the SAP HANA studio. For more details, see the troubleshooting section in the SAP HANA Automated Update Guide.
Figure 4-10 Interaction of the Software Update Manager (SUM) for SAP HANA with other software components during update of a remote SAP HANA studio: the studio reads the updated components from the SAP HANA studio repository in the SAP HANA database
If the Unified Installer was used to install SAP HANA software components, no
actions need to be performed on the server.
The only configuration step needed is to adjust the SAP HANA studio
preferences to enable updates and to define the location of the update server.
Extractor-based replication
This method reuses the available extractors of the source system and redirects the write operation to the SAP HANA database instead of the local Persistent Staging Area (PSA).
Log-based replication
This method is based on reading the transaction logs from the source
database and re-applying them to the SAP HANA database.
Figure 4-11 illustrates these replication methods.
Figure 4-11 Data replication methods for SAP HANA: trigger-based replication and ETL-based replication from the application layer of the source SAP ERP system, extractor-based replication through the embedded BW, and log-based replication from the database log file
The following sections discuss these replication methods for SAP HANA in more
detail.
SAP ERP application logic can be reused by reading extractors or utilizing SAP
function modules.
It offers options for the integration of third-party data providers and supports
replication from virtually any data source.
Data transfers are done in batch mode, which limits the real-time capabilities of
this replication method.
SAP BusinessObjects Data Services provides several kinds of data quality and
data transformation functionality. Due to the rich feature set available,
implementation time for the ETL-based replication is longer than for the other
replication methods. SAP BusinessObjects Data Services offers integration with
SAP HANA: SAP HANA is available as a predefined data target for the load process.
The ETL-based replication is the ideal solution for SAP HANA customers who need data replication from non-SAP data sources.
Only certain versions of IBM DB2 on AIX, Linux, and HP-UX are supported with this replication
method.
Figure 4-12 Comparison of the replication methods for SAP HANA: trigger-based replication with the SAP LT system (real-time capabilities, preferred by SAP), extractor-based replication with the Direct Extractor Connection (near real-time, Unicode only), ETL-based replication (data conversion capabilities), and log-based replication with the Sybase Replication Server (real real-time, very limited DB support)
The replication method that you choose depends on the requirements. When real-time replication is needed to provide benefit to the business, and the replication source is an SAP system, trigger-based replication is the best choice. Extractor-based replication might keep project costs down by reusing existing transformations. ETL-based replication provides the most flexibility regarding data source, data transformation, and data cleansing options, but it does not provide real-time replication.
Chapter 5.
Figure 5-1 Basic use case scenarios defined by SAP in session EIM205: technology platform, operational reporting, accelerator, in-memory products, and next generation applications
These five basic use case scenarios describe the elementary ways SAP HANA
can be integrated. We cover each of these use case scenarios in a dedicated
section within this chapter.
SAP maintains a SAP HANA Use Case Repository with specific examples for
how SAP HANA can be integrated. This repository is online at the following web
address:
http://www.experiencesaphana.com/community/resources/use-cases
The use cases in this repository are divided into categories based on their
relevance to a specific industry sector. It is a good idea to review this repository
to find inspiration about how SAP HANA can be leveraged in various scenarios.
Figure 5-2 SAP HANA as a technology platform: data from SAP and non-SAP data sources is modeled in SAP HANA and consumed by non-SAP applications and by SAP reporting and analytics
SAP HANA is not technologically dependent on other SAP products and can be used independently as the only SAP component in the customer's information technology (IT) landscape. On the other hand, SAP HANA can be easily integrated with other SAP products, such as the SAP BusinessObjects BI platform for reporting or SAP BusinessObjects Data Services for ETL-based replication, which gives customers the possibility to use only the components that are needed.
There are many ways that SAP HANA can be integrated into a customer
landscape, and it is not possible to describe all combinations. Software
components around the SAP HANA offering can be seen as building blocks, and
every solution must be assembled from the blocks that are needed in a particular
situation.
This approach is extremely versatile, and the number of possible combinations keeps growing as SAP constantly adds new components to its SAP HANA-related portfolio.
IBM offers consulting services that help customers to choose the correct solution
for their business needs. For more information, see section 8.4.1, A trusted
service partner on page 154.
Figure 5-3 Examples of SAP HANA deployment options with regard to data acquisition: replacing the existing custom database with SAP HANA, or keeping the custom database and replicating its data into SAP HANA
Each of these three solutions has both advantages and disadvantages, which we highlight to show the aspects of a given solution that might need more detailed consideration:
Replacing the existing database with SAP HANA
The advantage of this solution is that the overall architecture is not going to be
significantly changed. The solution will remain simple without the need to
include additional components. Customers might also save on license costs
for the original database.
A disadvantage of this solution is that the custom application must be adjusted to work with the SAP HANA database. If ODBC or JDBC is used for database access, this is not a big problem. Also, the whole setup must be tested properly. Because the original database is being replaced, a certain amount of downtime is inevitable.
Customers considering this approach must be familiar with the features and
characteristics of SAP HANA, especially when certain requirements must be
met by the database that is used (for example in case of special purpose
databases).
Populating SAP HANA with data replicated from the existing database
The second option is to integrate SAP HANA as a side-car database to the
primary database and to replicate required data using one of the available
replication techniques.
An advantage of this approach is that the original solution is not touched and
therefore no downtime is required. Also only the required subset of data has
to be replicated from the source database, which might allow customers to
minimize acquisition costs because SAP HANA acquisition costs are directly
linked to the volume of stored data.
The need for implementing replication technology can be seen as the only
disadvantage of this solution. Because data is only delivered into SAP HANA
through replication, this component is a vital part of the whole solution.
Customers considering this approach must be familiar with various replication
technologies, including their advantages and disadvantages, as outlined in
section 4.2, Data replication methods for SAP HANA on page 51.
Customers must also be aware that replication might cause additional load on
the existing database because modified records must be extracted and then
transported to the SAP HANA database. This aspect is highly dependent on
the specific situation and can be addressed by choosing the proper replication
technology.
Figure 5-4 An example of SAP HANA as a source for other applications: the custom application works with both the custom database and SAP HANA, which also feeds SAP analytic tools and SAP BusinessObjects reporting
The initial situation is schematically visualized in the left part of Figure 5-4: a customer-specific application runs queries against a custom database, and this functionality must be preserved.

A potential solution is shown in the right part of Figure 5-4: the customer-specific application runs the problematic queries against the SAP HANA database. If the existing database is still part of the solution, queries that do not need acceleration can still be executed against the original database.
Specialized analytic tools, such as the SAP BusinessObjects Predictive Analysis,
can be used to run statistical analysis on data that is stored in the SAP HANA
database. This tool can run analysis directly inside the SAP HANA database,
which helps to avoid expensive transfers of massive volumes of data between the
application and the database. The result of this analysis can be stored in SAP
HANA, and the custom application can use these results for further processing,
for example, to facilitate decision making.
SAP HANA can be easily integrated with products from the SAP
BusinessObjects family; therefore, these products can be part of the solution,
responsible for reporting, monitoring critical KPIs using dashboards, or for data
analysis.
These tools can also be used without SAP HANA; however, SAP HANA enables them to process much bigger volumes of data and still provide results in a reasonable time.
Figure 5-5 A typical business intelligence landscape before SAP HANA: corporate BI on its own database, plus local BI data marts that are fed through ETL from the databases of the SAP ERP systems and of a non-SAP business application
With the introduction of SAP HANA 1.0, SAP provided an in-memory technology aiming to support business intelligence at a business unit level. SAP HANA, combined with business intelligence tools (such as the SAP BusinessObjects tools) and with data replication mechanisms feeding data from the operational systems into SAP HANA in real-time, brought in-memory computing to the business unit level. Figure 5-6 shows such a landscape with the local data marts replaced by SAP HANA.
Figure 5-6 SAP vision after the introduction of SAP HANA 1.0: the local BI data marts are replaced by SAP HANA 1.0 systems that are synchronized with the databases of the SAP ERP systems and of the non-SAP business application
Figure 5-7 illustrates the role of SAP HANA in an operational reporting use case
scenario.
Figure 5-7 SAP HANA in the operational reporting use case scenario: data is replicated from the RDBMS of the SAP Business Suite into the SAP HANA column store and is consumed through SAP reporting and analytics
Usually, the first step is the replication of data, typically originating from the SAP Business Suite, into the SAP HANA database. However, some solution packages are also built for non-SAP data sources.
Sometimes source systems need to be adjusted by implementing modifications
or by performing specific configuration changes.
Data is typically replicated using the SAP Landscape Transformation replication;
however, other options, such as replication using SAP BusinessObjects Data
Services or SAP HANA Direct Extractor Connection (DXC), are also possible.
The replication technology is usually chosen as part of the package design and
cannot be changed easily during implementation.
A list of tables to replicate (for SAP Landscape Transformation replication) or
transformation models (for replication using Data Services) are part of the
package.
SAP HANA is loaded with models (views) that are either static (designed by SAP
and packaged) or automatically generated based on customized criteria. These
models describe the transformation of source data into the resulting column
views. These views are then consumed by SAP BusinessObjects BI 4.0 reports or dashboards that are delivered either as final products or as pre-made templates that can be finished as part of the implementation process.
Some solution packages are based on additional components (for example, SAP
BusinessObjects Event Insight). If required, additional content that is specific to
these components can also be part of the solution package.
Individual use cases, required software components, prerequisites, configuration
changes, including overall implementation processes, are properly documented
and attached as part of the delivery.
The currently available Rapid Deployment Solutions, SAP Notes with more
information, and links to related web content (where available) are:
SAP Bank Analyzer Rapid-Deployment Solution for Financial Reporting with
SAP HANA (see SAP Note 1626729)
http://service.sap.com/rds-hana-finrep
SAP CRM rapid-deployment solution for analytics with SAP HANA (see SAP
Note 1680801)
http://service.sap.com/rds-crm-bwhana
SAP Customer Usage Analysis rapid deployment solution (see SAP Note
1729467)
http://service.sap.com/rds-hana-cua
SAP rapid-deployment solution for implementation of data services, BI
platform, and rapid marts to SAP HANA (see SAP Note 1678910)
http://service.sap.com/rds-eim
SAP Grid Infrastructure Analytics rapid-deployment solution (see SAP Note
1703517)
http://service.sap.com/rds-grid-ana
SAP rapid-deployment solution for sales pipeline analysis with SAP HANA
(see SAP Note 1637113)
http://service.sap.com/rds-crm-pipeline
SAP ERP rapid-deployment solution for profitability analysis with SAP HANA
(see SAP Note 1632506)
http://service.sap.com/rds-hana-copa
SAP ERP rapid-deployment solution for accelerated finance and controlling
with SAP HANA (see SAP Note 1656499)
http://service.sap.com/rds-hana-fin
SAP ERP rapid-deployment solution for operational reporting with SAP HANA
(see SAP Note 1647614 for SAP HANA SP03, or SAP Note 1739432 for SAP
HANA SP04)
http://service.sap.com/rds-hana-erp
SAP Global Trade Services rapid-deployment solution for sanctioned-party
list screening with SAP HANA (see SAP Note 1689708)
http://service.sap.com/rds-gts
SAP Situational Awareness rapid-deployment solution for public sector with
SAP HANA (see SAP Note 1681090)
http://service.sap.com/rds-saps
Figure 5-8 SAP HANA in the accelerator use case scenario: data is replicated from the RDBMS of the SAP Business Suite into SAP HANA, the SAP Business Suite reads the accelerated data directly from SAP HANA through the SAP user interface, and the data can also be consumed by SAP reporting and analytics
The accelerated SAP system must meet specific prerequisites. Before this
solution can be implemented, installation of specific support packages or
implementation of SAP Notes might be required. This introduces the necessary
code changes in the source system.
The SAP HANA client must be installed on a given server, and the SAP kernel
must be adjusted to support direct connectivity to the SAP HANA database.
As a next step, replication of data from the source system is configured. Each
specific use case has a defined replication method and a list of tables that must
be replicated. The most common is the SAP Landscape Transformation replication; however, some solutions offer alternatives. For example, for the SAP CO-PA Accelerator, replication can also be performed by an accelerator-specific ABAP report in the source system.
The source system is configured to have direct connectivity into SAP HANA as
the secondary database. The required scenario is configured according to the
specifications and then activated. During activation, the source system automatically deploys the required column views into SAP HANA and activates new ABAP code that was installed in the source system as the solution prerequisite. This new code runs the time-consuming queries against the SAP HANA database, which leads to significantly shorter execution times.
Because SAP HANA is populated with valuable data, it is easy to extend the
accelerator use case by adding operational reporting functions. Additional
(usually optional) content is delivered for SAP HANA and for SAP
BusinessObjects BI 4.0 client tools, such as reports or dashboards.
The SAP HANA as an accelerator and SAP HANA for operational reporting use case scenarios can be nicely combined in a single package. Here is a list of SAP Rapid Deployment Solutions (RDS) implementing SAP HANA as an accelerator:
SAP Bank Analyzer Rapid-Deployment Solution for Financial Reporting with
SAP HANA (see SAP Note 1626729)
http://service.sap.com/rds-hana-finrep
SAP rapid-deployment solution for customer segmentation with SAP HANA
(see SAP Note 1637115)
http://service.sap.com/rds-cust-seg
SAP ERP rapid-deployment solution for profitability analysis with SAP HANA
(see SAP Note 1632506)
http://service.sap.com/rds-hana-copa
SAP ERP rapid-deployment solution for accelerated finance and controlling
with SAP HANA (see SAP Note 1656499)
http://service.sap.com/rds-hana-fin
SAP Global Trade Services rapid-deployment solution for sanctioned-party
list screening with SAP HANA (see SAP Note 1689708)
http://service.sap.com/rds-gts
Figure 5-9 SAP products running on SAP HANA: SAP Business Warehouse (SAP NetWeaver BW, available) and SAP ERP Central Component (SAP ECC, planned), with the traditional extraction from SAP ECC to SAP NetWeaver BW remaining in place
Figure 5-10 SAP HANA as the database for SAP NetWeaver Business Warehouse
Figure 5-11 Classic star schema of an InfoCube in SAP NetWeaver BW: the fact table /BI0/F0COPC_C08 references dimension tables such as /BI0/D0COPC_C081 (enterprise structure) and /BI0/D0COPC_C082 (material) through dimension IDs (DIMIDs); the dimension tables reference SID tables such as /BI0/SCOMP_CODE and /BI0/SPLANT, which map the surrogate IDs to the master data tables /BI0/PCOMP_CODE and /BI0/PPLANT
The core part of every InfoCube is the fact table. This table contains dimension
identifiers (IDs) and corresponding key figures (measures). This table is
surrounded by dimension tables that are linked to fact tables using the dimension
IDs.
Dimension tables are usually small tables that group logically connected combinations of characteristics, usually representing master data. Logically connected means that the characteristics are highly related to each other, for example, company code and plant. Combining unrelated characteristics leads to a large number of possible combinations, which can have a negative impact on performance.
Because master data records are located in separate tables outside of the
InfoCube, an additional table is required to connect these master data records to
dimensions. These additional tables contain a mapping of auto-generated
Surrogate IDs (SIDs) to the real master data.
This complex structure is required on classical databases; however, with SAP
HANA this requirement is obsolete. SAP therefore introduced the SAP HANA
Optimized Star Schema, illustrated in Figure 5-12.
Figure 5-12 SAP HANA Optimized Star Schema in an SAP NetWeaver BW system: the fact table /BI0/F0COPC_C08 contains the SID values (for example, SID_0COMP_CODE and SID_0PLANT) directly, so only the Data Package dimension table /BI0/D0COPC_C08P remains, and the SID tables map directly to the master data tables
The content of all dimensions (except for the Data Package dimension) is incorporated into the fact table. This modification brings several advantages:
Simplified modeling: Poorly designed dimensions (wrong combinations of characteristics) can no longer affect performance. Moving characteristics from one dimension to another is no longer a physical operation; instead, it is just a metadata update.
Faster loading: Because dimension tables do not exist, all overhead workload related to identifying existing combinations or creating new combinations in the dimension tables is no longer required. Instead, the required SID values are inserted directly into the fact table.
The SAP HANA Optimized Star Schema is automatically used for all newly
created InfoCubes on the SAP NetWeaver BW system running on the SAP
HANA database.
Existing InfoCubes are not automatically converted to this new schema during
the SAP HANA conversion of the SAP NetWeaver BW system. The conversion of
standard InfoCubes to in-memory optimized InfoCubes must be done manually
as a follow-up task after the database migration.
SAP HANA can calculate all aggregations in real time. Therefore, aggregates are no longer required, and the roll-up activity related to aggregate updates is obsolete. This also reduces the overall execution time of update operations.
If SAP NetWeaver BW Accelerator was used, the update of its indexes is also no longer needed. Because SAP HANA is based on technology similar to that of SAP NetWeaver BW Accelerator, all queries are accelerated. Query performance with SAP HANA is comparable to a situation in which all cubes are indexed by SAP NetWeaver BW Accelerator. In reality, query performance can be even faster than with SAP NetWeaver BW Accelerator because additional features are available for SAP NetWeaver BW running on SAP HANA, for example, the possibility to remove an InfoCube and instead run reports against in-memory optimized DataStore Objects (DSOs).
The supported database versions for SAP NetWeaver BW 7.30 and 7.31 are:

Database         | BW 7.30  | BW 7.31
Oracle           | 11.2     | 11.2
MaxDB            | 7.8      | 7.9
MS SQL Server    | 2008     | 2008
IBM DB2 LUW      | 9.7      | 9.7
IBM DB2 for i    | 6.1, 7.1 | 6.1, 7.1
IBM DB2 for z/OS | V9, V10  | V9, V10
Sybase ASE       | n/a      | 15.7
SAP HANA is currently not a supported database for any SAP NetWeaver Java stack; therefore, dual-stack installations (ABAP+Java) must be separated into two individual stacks using the Dual-Stack Split Tool from SAP.
Because some existing installations are still non-Unicode installations, another important prerequisite step might be a conversion of the database to Unicode encoding. This Unicode conversion can be done as a separate step or as part of the conversion to the SAP HANA database.
All InfoCubes with data persistency in the SAP NetWeaver Business Warehouse Accelerator are set as inactive during conversion, and their content in SAP NetWeaver BW Accelerator is deleted. These InfoCubes must be reloaded again.
At the time of writing this book, the following operating systems (see Table 5-2) were available to host the ABAP part of an SAP NetWeaver BW system:
Table 5-2 Operating systems available to host the ABAP part of SAP NetWeaver BW

HP-UX 11.31 on IA64 (64-bit)
Solaris 10 on SPARC (64-bit)
Solaris 10 on x86_64 (64-bit)
Linux RHEL 5 on x86_64 (64-bit)
Linux RHEL 6 on x86_64 (64-bit)
IBM i 7.1 on Power (64-bit)

Support for these platforms differs in detail between SAP NetWeaver BW 7.30 and SAP NetWeaver BW 7.31; most, but not all, combinations are supported by both releases.
The next step is the database import. It includes the installation of SAP NetWeaver BW on the primary application server and the import of data into the SAP HANA database. The import occurs remotely from the primary application server as part of the installation process.
Parallel export/import using a socket connection, as well as the FTP and NFS exchange modes, is not supported. Currently, only the asynchronous file-based export/import method is available.
After the mandatory post-activities, the conversion of InfoCubes and DataStore objects to their in-memory optimized form must be initiated to take full advantage of the benefits that the SAP HANA database can offer. This can be done either manually for each object or as a mass operation using a special report.
Customers must plan a sufficient amount of time to perform this conversion. This step can be time consuming because the content of all InfoCubes must be copied into temporary tables that have the new structure.
After all post-activities are finished, the system is ready to be tested.
Here is some important information that is relevant to the cloned system. Refer to
the content in SAP Note 886102 to understand the full procedure that must be
applied on the target BW system. The SAP Note states:
Caution: This step deletes all transfer rules and PSA tables of these source
systems, and the data is lost. A message is generated stating that the source
system cannot be accessed (since you deleted the host of the RFC connection).
Choose Ignore.
It is important to understand the consequences of this action and to plan the
required steps to reconfigure the target BW system so that it can again read data
from the source systems.
Persistent Staging Area (PSA) tables can be regenerated by the replication of DataSources from the source systems, and transfer rules can be transported from the original BW system. However, the content of these PSA tables is lost and needs to be reloaded from the source systems.
This step might cause problems where DataStore objects are used and PSA tables contain the complete history of data.
An advantage of creating a cloned SAP NetWeaver BW system is that the
original system is not impacted and can still be used for productive tasks. The
cloned system can be tested and results compared with the original system
immediately after the clone is created and after every important project
milestone, such as a release upgrade or the conversion to SAP HANA itself.
Both systems are fully synchronized because both systems periodically extract data from the same source systems. Therefore, after the entire project is finished, and the new SAP NetWeaver BW system running on SAP HANA meets the customer's expectations, the new system can fully replace the original system.
A disadvantage of this approach is the additional load imposed on the source systems, which is caused by both SAP NetWeaver BW systems performing extraction from the same source system, and certain limitations mentioned in the following SAP Notes:
Note 775568 - Two and more BW systems against one OLTP system
Note 844222 - Two OR more BW systems against one OLTP system
procedures, which allows logic to be directly processed inside the SAP HANA
database.
A new software component can be integrated with SAP HANA either directly, or it can be built on top of the SAP NetWeaver stack, which can work with the SAP HANA database using client libraries.
Because of its breadth and depth, this use case scenario is not discussed in
detail as part of this publication.
Chapter 6.
The x3850 X5 and the workload-optimized x3950 X5 are the logical successors to the x3850 M2 and x3950 M2, featuring the IBM eX4 chipset. Compared with previous generation servers, the x3850 X5 offers:
High memory capacity: up to 64 DIMMs standard and 96 DIMMs with the MAX5 memory expansion per 4-socket server
Intel Xeon processor E7 family: exceptional scalable performance with advanced reliability for your most data-demanding applications
Extended SAS capacity with eight HDDs and 900 GB 2.5" SAS drives, or 1.6 TB of hot-swappable RAID 5 with eXFlash technology
Standard dual-port Emulex 10 Gb Virtual Fabric adapter
Ten-core, 8-core, and 6-core processor options with up to 2.4 GHz (10-core), 2.13 GHz (8-core), and 1.86 GHz (6-core) speeds with up to 30 MB L3 cache
Scalable to a two-node system with eight processor sockets and 128 DIMM sockets
Seven PCIe x8 high-performance I/O expansion slots to support hot-swap capabilities
Optional embedded hypervisor
The x3850 X5 and x3950 X5 both scale to four processors and 2 terabytes (TB) of RAM. With the MAX5 attached, the system can scale to four processors and 3 TB of RAM. Two x3850 X5 servers can be connected together for a single system image with eight processors and 4 TB of RAM.
With their massive memory capacity and computing power, the IBM System
x3850 X5 and x3950 X5 rack-mount servers are the ideal platform for
high-memory demanding, high-workload applications, such as SAP HANA.
The IBM System x3690 X5 is a two-socket, scalable system that offers up to four times the memory capacity of current two-socket servers. It supports the following specifications:
Up to two sockets for Intel Xeon E7 processors. Depending on the processor model, processors have six, eight, or ten cores.
Scalable from 32 to 64 DIMM sockets with the addition of a MAX5 memory expansion unit.
Advanced networking capabilities with a Broadcom 5709 dual Gb Ethernet controller standard in all models and an Emulex 10 Gb dual-port Ethernet adapter standard on some models, optional on all others.
Up to 16 hot-swap 2.5-inch SAS HDDs, and up to 16 TB of maximum internal storage with RAID 0, 1, or 10 to maximize throughput and ease installation. RAID 5 is optional. The system comes standard with one HDD backplane that can hold four drives; a second and third backplane are optional for an additional 12 drives.
New eXFlash high-IOPS solid-state storage technology.
Five PCIe 2.0 slots.
Integrated Management Module (IMM) for enhanced systems management capabilities.
The x3690 X5 features the IBM eXFlash internal storage using solid state drives
to maximize the number of I/O operations per second (IOPS). All configurations
for SAP HANA based on x3690 X5 use eXFlash internal storage for high IOPS
log storage or for both data and log storage.
The x3690 X5 is an excellent choice for a memory-demanding and
performance-demanding business application, such as SAP HANA. It provides
maximum performance and memory in a dense 2U package.
For more information about Hyper-Threading Technology, see the following web
page:
http://www.intel.com/technology/platform-technology/hyper-threading/
Figure 6-3 System architecture: four processors, each with local memory, interconnected and attached to two I/O hubs through QPI links
In previous designs, the entire range of memory was accessible through the core
chipset by each processor, which is called a shared memory architecture. This
new design creates a non-uniform memory access (NUMA) system in which a
portion of the memory is directly connected to the processor where a given
thread is running, and the rest must be accessed over a QPI link through another
processor. Similarly, I/O can be local to a processor or remote through another
processor.
For more information about QPI, see the following web page:
http://www.intel.com/technology/quickpath/
Figure 6-4 Intel Machine Check Architecture (MCA) with SAP HANA
Using the knowledge of its internal data structures, SAP HANA can decide what
course of action to take. If the corrupted memory space is occupied by one of the
SAP in-memory tables, SAP HANA reloads the associated tables. In addition, it
analyzes the failure and checks whether it affects other stored or committed data,
in which case it uses savepoints and database logs to reconstruct the committed
data in a new, unaffected memory location.
With the support of MCA, SAP HANA can take appropriate action at the level of
its own data structures to ensure a smooth return to normal operation and avoid
a time-consuming restart or loss of information.
I/O hubs
The connection to I/O devices (such as keyboard, mouse, and USB) and to I/O
adapters (such as hard disk drive controllers, Ethernet network interfaces, and
Fibre Channel host bus adapters) is handled by I/O hubs, which then connect to
the processors through QPI links. Figure 6-3 on page 93 shows the I/O Hub
connectivity. Connections to the I/O devices are fault tolerant because data can
be routed over either of the two QPI links to each I/O hub.
For optimal system performance in the four processor systems (with two I/O
hubs), balance high-throughput adapters across the I/O hubs. The configurations
used for SAP HANA contain several components that require high throughput
I/O:
Dual-port 10 Gb Ethernet adapters
ServeRAID controllers to connect the SAS drives
High IOPS PCIe Adapters
To ensure optimal performance, the placement of these components in the PCIe
slots was optimized according to the I/O architecture outlined above.
6.1.4 Memory
For an in-memory appliance such as SAP HANA, a system's main memory, its capacity, and its performance play an important role. The Intel Xeon processor E7 family (Figure 6-5 on page 97) has a memory architecture that is well suited to the requirements of such an appliance.
The E7 processors have two SMIs. Therefore, memory needs to be installed in
matched pairs. For better performance, or for systems connected together,
memory must be installed in sets of four. The memory used in the eX5 systems is
DDR3 SDRAM registered DIMMs. All of the memory runs at 1066 MHz or less,
depending on the processor.
Figure 6-5 Memory architecture of the Intel Xeon processor E7 family: two memory controllers per processor, each attached through memory buffers to the DIMMs
Figure 6-6 Relative memory performance based on DIMM placement (one processor and two memory cards shown). Using both memory controllers of a processor with two DIMMs per channel and eight DIMMs per memory controller gives the best performance (relative performance 1.0); configurations with fewer DIMMs per channel or only one memory controller degrade to relative performance values as low as 0.29.
processor in the system and must be accessed through a QPI link (Figure 6-3 on
page 93). However, using remote memory adds latency. The more such latencies
add up in a server, the more performance can degrade. Starting with a memory
configuration where each CPU has the same local RAM capacity is a logical step
toward keeping remote memory accesses to a minimum.
In a NUMA system, each processor has fast, direct access to its own memory
modules, reducing the latency that arises due to bus-bandwidth contention. SAP
HANA is NUMA-aware, and thus benefits from this direct connection.
Hemisphere mode
Hemisphere mode is an important performance optimization of the Intel Xeon
processor E7, 6500, and 7500 product families. Hemisphere mode is
automatically enabled by the system if the memory configuration allows it. This
mode interleaves memory requests between the two memory controllers within
each processor, enabling reduced latency and increased throughput. It also
allows the processor to optimize its internal buffers to
maximize memory throughput.
Hemisphere mode is enabled only when the memory configuration behind each
memory controller on a processor is identical. In addition, because eight DIMMs
per processor are required for using all memory channels, eight DIMMs per
processor must be installed at a time for optimized memory performance.
eXFlash
IBM eXFlash is the name given to the eight 1.8-inch solid state drives (SSDs),
the backplanes, SSD hot swap carriers, and indicator lights that are available for
the x3850 X5/x3950 X5 and x3690 X5. Each eXFlash can be put in place of four
SAS or SATA disks. The eXFlash units connect to the same types of ServeRAID
disk controllers as the SAS/SATA disks. Figure 6-7 shows an eXFlash unit, with
the status light assembly on the left side.
Figure 6-7 IBM eXFlash unit with status lights and solid state drives (SSDs)
In addition to using less power than rotating magnetic media, the SSDs are more
reliable and can service many more I/O operations per second (IOPS). These
attributes make them suited to I/O-intensive applications, such as transaction
processing, logging, backup and recovery, and business intelligence. Built on
enterprise-grade MLC NAND flash memory, the SSD drives used in eXFlash
deliver up to 30,000 IOPS per single drive. Combined into an eXFlash unit, these
drives can deliver up to 240,000 IOPS and up to 2 GBps of sustained read
throughput per eXFlash unit.
In addition to its superior performance, eXFlash offers superior uptime with three
times the reliability of mechanical disk drives. SSDs have no moving parts to fail.
Each drive has its own backup power circuitry, error correction, data protection,
and thermal monitoring circuitry. They use Enterprise Wear-Leveling to extend
their use even longer.
A single eXFlash unit accommodates up to eight hot-swap SSDs and can be connected to up to two performance-optimized controllers. The x3690 X5-based models for SAP HANA enable RAID protection for the SSD drives by using two ServeRAID M5015 controllers with the ServeRAID M5000 Performance Accelerator Key for the eXFlash units.
System x and BladeCenter. These adapters are alternatives to disk drives and
are available in several sizes, from 160 GB to 1.2 TB. Designed for
high-performance servers and computing appliances, these adapters deliver
throughput of up to 900,000 I/O operations per second (IOPS), while providing
the added benefits of lower power, cooling, and management overhead and a
smaller storage footprint. Based on standard PCIe architecture coupled with
silicon-based NAND clustering storage technology, the High IOPS adapters are
optimized for System x rack-mount systems and can be deployed in blades
through the PCIe expansion units. They are available in storage capacities up to
2.4 TB.
These adapters use NAND flash memory as the basic building block of
solid-state storage and contain no moving parts. Thus, they are less sensitive to
issues associated with vibration, noise, and mechanical failure. They function as
a PCIe storage and controller device, and after the appropriate drivers are
loaded, the host operating system sees them as block devices. Therefore, these
adapters cannot be used as bootable devices.
The IBM High IOPS PCIe Adapters combine high IOPS performance with low
latency. As an example, with 512 KB block random reads, the IBM 1.2TB High
IOPS MLC Mono Adapter can deliver 143,000 IOPS, compared with 420 IOPS
for a 15 K RPM 146 GB disk drive. The read access latency is about 68
microseconds, which is one hundredth of the latency of a 15 K RPM 146 GB disk
drive (about 5 ms or 5000 microseconds). The write access latency is even less,
with about 15 microseconds.
Reliability features include the use of Enterprise-grade MLC (eMLC), advanced
wear-leveling, ECC protection, and Adaptive Flashback redundancy for RAID-like
chip protection with self-healing capabilities, providing unparalleled reliability and
efficiency. Advanced bad-block management algorithms enable taking blocks out
of service when their failure rate becomes unacceptable. These reliability
features provide a predictable lifetime and up to 25 years of data retention.
The x3950 X5-based models of the IBM Systems Solution for SAP HANA come
with IBM High IOPS adapters, either with 320 GB (7143-H1x), 640 GB
(7143-H2x, -H3x), or 1.2 TB storage capacity (7143-HAx, -HBx, -HCx).
Figure 6-8 shows the IBM 1.2TB High IOPS MLC Mono adapter, which comes
with the x3950 based 2012 models (7143-HAx, -HBx, -HCx).
GPFS can intelligently prefetch data into its buffer pool, issuing I/O requests in
parallel to as many disks as necessary to achieve the peak bandwidth of the
underlying storage-hardware infrastructure. GPFS recognizes multiple I/O
patterns, including sequential, reverse sequential, and various forms of striped
access patterns. In addition, for high-bandwidth environments, GPFS can read or
write large blocks of data in a single operation, minimizing the overhead of I/O
operations.
Expanding beyond a storage area network (SAN) or locally attached storage, a single GPFS file system can be accessed by nodes using a TCP/IP or InfiniBand connection. Using this block-based network data access, GPFS can outperform network-based sharing technologies such as NFS, and even local file systems such as the EXT3 journaling file system for Linux or the Journaled File System (JFS). Network block I/O (also called network shared disk (NSD)) is a software layer that transparently forwards block I/O requests from a GPFS client application node to an NSD server node to perform the disk I/O operation and then passes the data back to the client. Using network block I/O, a configuration can be more cost-effective than a full-access SAN.
Storage pools enable you to transparently manage multiple tiers of storage
based on performance or reliability. You can use storage pools to transparently
provide the appropriate type of storage to multiple applications or different
portions of a single application within the same directory. For example, GPFS
can be configured to use low-latency disks for index operations and high-capacity
disks for data operations of a relational database. You can make these
configurations even if all database files are created in the same directory.
For optimal reliability, GPFS can be configured to eliminate single points of failure. The file system can be configured to remain available automatically in the event of a disk or server failure. A GPFS file system is designed to transparently fail over token (lock) operations and other GPFS cluster services, which can be distributed throughout the entire cluster to eliminate the need for dedicated metadata servers. GPFS can be configured to automatically recover from node, storage, and other infrastructure failures.
GPFS provides this functionality by supporting the following:
Data replication to increase availability in the event of a storage media failure
Multiple paths to the data in the event of a communications or server failure
File system activity logging, enabling consistent fast recovery after system
failures
In addition, GPFS supports snapshots to provide a space-efficient image of a file
system at a specified time, which allows online backup and can help protect
against user error.
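As a rough illustration of these capabilities, the following commands sketch how replication and snapshots can be handled on a GPFS file system. The device name sapmntdata and the snapshot name are assumptions for this example, and the exact option set can vary between GPFS releases:

# Display the current default replication factors of file system "sapmntdata"
# (-m: metadata replicas, -r: data replicas)
mmlsfs sapmntdata -m -r

# Set the default replication to two copies of all data and metadata,
# provided the maximum replication factors of the file system allow it
mmchfs sapmntdata -m 2 -r 2

# Re-replicate existing files to match the new default replication settings
mmrestripefs sapmntdata -R

# Create a space-efficient snapshot, for example as the basis of an online backup
mmcrsnapshot sapmntdata snap_20120601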
building blocks available for one T-shirt size. In some cases, two building blocks have to be combined to build a specific T-shirt size. Table 6-1 shows all building blocks announced in 2011 and their features.
Table 6-1 IBM System x workload-optimized models for SAP HANA, 2011 models

Building block | Server (MTM) | CPUs | Main memory | Log storage | Data storage | Preload
XS | x3690 X5 (7147-H1x a) | 2x Intel Xeon E7-2870 | 128 GB DDR3 (8x 16 GB) | 8x 50 GB 1.8" MLC SSD | 8x 300 GB 10 K SAS HDD | Yes
S | x3690 X5 (7147-H2x) | 2x Intel Xeon E7-2870 | 256 GB DDR3 (16x 16 GB) | 8x 50 GB 1.8" MLC SSD | 8x 300 GB 10 K SAS HDD | Yes
SSD | x3690 X5 (7147-H3x) | 2x Intel Xeon E7-2870 | 256 GB DDR3 (16x 16 GB) | combined data and log storage on eXFlash SSD | Yes
S+ | x3950 X5 (7143-H1x) | 2x Intel Xeon E7-8870 | 256 GB DDR3 (16x 16 GB) | 320 GB High IOPS adapter | 8x 600 GB 10 K SAS HDD | Yes
M | x3950 X5 (7143-H2x) | 4x Intel Xeon E7-8870 | 512 GB DDR3 (32x 16 GB) | 640 GB High IOPS adapter | 8x 600 GB 10 K SAS HDD | Yes
L Option | x3950 X5 (7143-H3x) | 4x Intel Xeon E7-8870 | 512 GB DDR3 (32x 16 GB) | 640 GB High IOPS adapter | 8x 600 GB 10 K SAS HDD | No

a. x = Country-specific letter (for example, the EMEA MTM is 7147-H1G, and the US MTM is 7147-H1U). Contact your IBM representative for regional part numbers.
In addition to the models listed in Table 6-1, there are models specific to a
geographic region:
Models 7147-H7x, -H8x, and -H9x are for Canada only and are the same
configurations as H1x, H2x, and H3x, respectively.
Models 7143-H4x and -H5x are for Canada only and are the same
configuration as H1x and H2x, respectively.
Table 6-2 IBM System x workload-optimized models for SAP HANA, 2012 models

Building block | Server (MTM) | CPUs | Main memory | Log storage | Data storage | Preload
XS | x3690 X5 (7147-HAx a) | 2x Intel Xeon E7-2870 | 128 GB DDR3 (8x 16 GB) | combined data and log storage on eXFlash SSD | Yes
S | x3690 X5 (7147-HBx) | 2x Intel Xeon E7-2870 | 256 GB DDR3 (16x 16 GB) | combined data and log storage on eXFlash SSD | Yes
S+ | x3950 X5 (7143-HAx) | 2x Intel Xeon E7-8870 | 256 GB DDR3 (16x 16 GB) | 1.2 TB High IOPS adapter | 8x 900 GB 10 K SAS HDD | Yes
M | x3950 X5 (7143-HBx) | 4x Intel Xeon E7-8870 | 512 GB DDR3 (32x 16 GB) | 1.2 TB High IOPS adapter | 8x 900 GB 10 K SAS HDD | Yes
L Option | x3950 X5 (7143-HCx) | 4x Intel Xeon E7-8870 | 512 GB DDR3 (32x 16 GB) | 1.2 TB High IOPS adapter | 8x 900 GB 10 K SAS HDD | No

a. x = Country-specific letter (for example, the EMEA MTM is 7147-HAG, and the US MTM is 7147-HAU). Contact your IBM representative for regional part numbers.
All models (except for 7143-H3x and 7143-HCx) come with a preload comprising SUSE Linux Enterprise Server for SAP Applications (SLES for SAP) 11 SP1, IBM GPFS, and the SAP HANA software stack. Licenses and maintenance fees (for three years) for SLES for SAP and GPFS are included. Section GPFS license information on page 160 has an overview of which type of GPFS license comes with a specific model and the amount of Processor Value Units (PVU) included. The licenses for the SAP software components have to be acquired separately from SAP.
The L-Option building blocks (7143-H3x or 7143-HCx) are intended as an extension to an M building block (7143-H2x or 7143-HBx). When building an L-size SAP HANA system, one M building block has to be combined with an L-Option building block, leveraging eX5 scalability. Both systems then act as one single eight-socket, 1 TB server. Therefore, the L-Option building blocks do not require a software preload; they do, however, come with the required additional software licenses for GPFS and SLES for SAP.
The building blocks are configured to match the SAP HANA sizing requirements. The main memory sizes match the number of CPUs to give the correct balance between processing power and data volume. Also, the storage devices in the systems provide the storage capacity required to match the amount of main memory.
All systems come with storage for both the data volume and the log volume
(Figure 6-9). Savepoints are stored on a RAID protected array of 10 K SAS hard
drives, optimized for data throughput. The SAP HANA database logs are stored
on flash technology storage devices:
RAID-protected, hot swap eXFlash SSD drives on the models based on IBM
System x3690 X5
Flash-based High IOPS PCIe adapters for the models based on IBM System
x3950 X5
These flash technology storage devices are optimized for high IOPS
performance and low latency to provide the SAP HANA database with a log
storage that allows the highest possible performance. Because a transaction in
the SAP HANA database can only return after the corresponding log entry is
written to the log storage, high IOPS performance and low latency are key to
database performance.
The building blocks based on the IBM System x3690 X5 (except for the older
7147-H1x and 7147-H2x), come with combined data and log storage on an array
of RAID-protected, hot-swap eXFlash SSD drives. Optimized for throughput, high
IOPS performance, and low latency, these building blocks give extra flexibility
when dealing with large amounts of log data, savepoint data, or backup data.
Figure 6-9 SAP HANA data persistence on server-local storage: data savepoints are written to SAS drives optimized for throughput, and database logs of committed transactions are written to flash devices optimized for IOPS
The model numbers given might have to be replaced by a region-specific equivalent by changing the x to a region-specific letter identifier. See 6.3.1, IBM System x workload-optimized models for SAP HANA on page 104.
Table 6-3 gives an overview of the SAP HANA T-shirt sizes and their relation to the IBM custom models for SAP HANA.
Table 6-3 SAP HANA T-shirt sizes and their relation to the IBM custom models

SAP T-shirt size | Compressed data in memory | Server main memory | Number of CPUs | Mapping to building blocks a
XS | 64 GB | 128 GB | 2 | 7147-HAx or 7147-H1x
S | 128 GB | 256 GB | 2 | 7147-HBx or 7147-H3x or 7147-H2x
S+ | 128 GB | 256 GB | 2 | 7143-HAx or 7143-H1x
M and M+ | 256 GB | 512 GB | 4 | 7143-HBx or 7143-H2x
L | 512 GB | 1024 GB | 8 | Combine 7143-HBx or 7143-H2x with 7143-HCx or 7143-H3x

a. For a region-specific equivalent, see 6.3.1, IBM System x workload-optimized models for SAP HANA on page 104.
6.3.3 Scale-up
This section discusses upgradability, or scale-up, and shows how the IBM custom models for SAP HANA can be upgraded to accommodate the need to grow into bigger T-shirt sizes.
To accommodate growth, the IBM Systems Solution for SAP HANA can be scaled in these ways:
Scale-up approach: Increase the capabilities of a single system by adding more components.
Scale-out approach: Increase the capabilities of the solution by using multiple systems working together in a cluster.
We discuss the scale-out approach in 6.4, Scale-out solution for SAP HANA on page 110.
The building blocks of the IBM Systems Solution for SAP HANA, as described previously, were designed with extensibility in mind. The following upgrade options exist:
An XS building block can be upgraded to an S-size SAP HANA system by adding 128 GB of main memory to the system.
An S+ building block can be upgraded to an M-size SAP HANA system by adding two more CPUs and another 256 GB of main memory. For the 7143-H1x, another 320 GB High IOPS adapter needs to be added to the system; the newer 7143-HAx has the required flash capacity already included.
An M building block (7143-H2x or 7143-HBx) can be extended with the L option (7143-H3x or 7143-HCx) to resemble an L-size SAP HANA system. The 2011 models can be combined with the 2012 models; for example, the older 7143-H2x can be extended with the new 7143-HCx.
With the option to upgrade S+ to M, and M to L, IBM can provide an unmatched upgrade path from a T-shirt size S up to a T-shirt size L, without the need to retire a single piece of hardware.
Of course, upgrading server hardware requires system downtime. However, due to GPFS's capability to add storage capacity to an existing GPFS file system by just adding devices, data residing on the system remains intact. We nevertheless recommend that you do a backup of the data before changing the system's configuration.
Figure 6-10 Single-node SAP HANA configuration: one database partition with index server, statistic server, and SAP HANA studio, persisting data on HDD and logs on flash
Figure 6-11 Three-node scale-out SAP HANA configuration: three worker nodes, each holding one database partition and persisting its data (HDD) and logs (flash) on node-local storage
To an outside application connecting to the SAP HANA database, this looks like a
single instance of SAP HANA. The SAP HANA software distributes the requests
internally across the cluster to the individual worker nodes, which process the
data and exchange intermediate results, which are then combined and sent back
to the requestor. Each node maintains its own set of data, persisting it with
savepoints and logging data changes to the database log.
GPFS combines the storage devices of the individual nodes into one big file
system, making sure that the SAP HANA software has access to all data
regardless of its location in the cluster, while making sure that savepoints and
database logs of an individual database partition are stored on the appropriate
storage device of the node on which the partition is located. While GPFS
provides the SAP HANA software with the functionality of a shared storage
system, it ensures maximum performance and minimum latency by using locally
attached disks and flash devices. In addition, because server-local storage
devices are used, the total capacity and performance of the storage within the
cluster automatically increases with the addition of nodes, maintaining the same
per-node performance characteristics regardless of the size of the cluster. This
kind of scalability is not achievable with external storage systems.
Figure 6-12 illustrates a four-node cluster with the fourth node being a standby
node.
Figure 6-12 Four-node scale-out SAP HANA configuration: three worker nodes, one standby node, and GPFS replicas of data and logs spread across the cluster
To be able to take over the database partition from the failed node, the standby
node has to load the savepoints and database logs of the failed node to recover
the database partition and resume operation in place of the failed node. This is
possible because GPFS provides a global file system across the entire cluster,
giving each individual node access to all the data stored on the storage devices
managed by GPFS.
In case a node has an unrecoverable hardware error, the storage devices holding the node's data might become unavailable or even destroyed. In contrast to the solution without high-availability capabilities, here the GPFS file system replicates the data of each node to the other nodes, to prevent data loss in case one of the nodes goes down. Replication is done in a striping fashion. That is, every node has a piece of the data of all other nodes. In the example illustrated in Figure 6-12, the contents of the data storage (that is, the savepoints, here data01) and the log storage (that is, the database logs, here log01) of node01 are replicated to node02, node03, and node04, each holding a part of the data on the matching device (that is, data on HDD, log on flash). The same is true for all nodes carrying data, so that all information is available twice within the GPFS file system, which makes it tolerant to the loss of a single node. The replication occurs synchronously. That is, the write operation only finishes when the data is both written locally and replicated. This ensures consistency of the data at any point in time. Although GPFS replication is done over the network and in a synchronous fashion, this solution still exceeds the performance requirements for validation by SAP.
Using replication, GPFS provides the SAP HANA software with the functionality
and fault tolerance of a shared storage system while maintaining its performance
characteristics. Again, due to the fact that server-local storage devices are used,
the total capacity and performance of the storage within the cluster automatically
increases with the addition of nodes, maintaining the same per-node
performance characteristics regardless of the size of the cluster. This kind of
scalability is not achievable with external storage systems.
Figure 6-13 Node failure in the scale-out configuration: node03 is defunct, and node04 has taken over DB partition 3 as a worker node
The data that node04 was reading was the data of node03, which failed,
including the local storage devices. For that reason GPFS had to deliver the data
to node04 from the replica spread across the cluster using the network. Now
when node04 starts writing savepoints and database logs again during the
normal course of operations, these are not written over the network, but to the
local drives, again with a replica striped across the cluster.
After fixing the cause for the failure of node03, it can be reintegrated into the
cluster as the new standby system (Figure 6-14 on page 118).
Figure 6-14 node03 reintegrated into the cluster as the new standby node
Figure 6-15 Network setup of the scale-out solution: the cluster nodes and the SAP BW system attached through redundant 10 GbE switches
All network connections within the scale-out solution are fully redundant. Both the internal GPFS network and the internal SAP HANA network are connected to both switches.
Figure 6-16 The back of an M building block with the network interfaces available
Each building block comes with one (2011 models) or two (2012 models) dual-port 10 Gb Ethernet network interface cards (NICs). To provide enough ports for a fully redundant network connection to the 10 Gb Ethernet switches, an additional dual-port 10 Gb Ethernet NIC can be added to the system (see also section 6.4.4, Hardware and software additions required for scale-out on page 121). An exception to this is an L configuration, where each of the two chassis (the M building block and the L option) holds one or two dual-port 10 Gb Ethernet NICs. Therefore, an L configuration does not need an additional 10 Gb Ethernet NIC for the internal networks, even for the 2011 models.
The six available 1 Gb Ethernet interfaces (a, b, e, f, g, h) on the system can be used to connect the systems to other networks or systems, for example, for client access, application management, systems management, data management, and so on. The interface denoted with the letter i is used to connect to the Integrated Management Module (IMM).
http://service.sap.com/pam
SAP HANA system, access the SAP Online Service System (SAP OSS) website
at:
https://service.sap.com
When you reach the website, create a service request ticket using a
subcomponent of BC-HAN or BC-DB-HDB as the problem component. IBM
support works closely with SAP and SUSE and is dedicated to supporting SAP
HANA software and hardware issues.
Send all questions and requests for support to SAP using their OSS messaging
system. A dedicated IBM representative is available at SAP to work on this
solution. Even if it is clearly a hardware problem, an SAP OSS message should
be opened to provide the best direct support for the IBM Systems solution for
SAP HANA.
When opening an SAP support message, we recommend using the text template provided in the Quick Start Guide when it is obvious that you have a hardware problem. This procedure expedites all hardware-related problems within the SAP support organization. Otherwise, the SAP support teams will gladly help you with questions regarding the SAP HANA appliance in general.
Before you contact support, make sure that you have taken these steps to try to
solve the problem yourself:
Check all cables to make sure that they are connected.
Check the power switches to make sure that the system and any optional
devices are turned on.
Use the troubleshooting information in your system documentation, and use
the diagnostic tools that come with your system. Information about diagnostic
tools is available in the Problem Determination and Service Guide on the IBM
Documentation CD that comes with your system.
Go to the following IBM support website to check for technical information,
hints, tips, and new device drivers or to submit a request for information:
http://www.ibm.com/supportportal/
For SAP HANA software-related issues you can search the SAP Online
Service System (OSS) website for problem resolutions. The OSS website has
a knowledge database of known issues and can be accessed here:
https://service.sap.com/notes
The main SAP HANA information source is available here:
https://help.sap.com/hana_appliance
CRM, SAP ERP EhP5, SAP NetWeaver 7.3, SAP BusinessObjects, and more, along with the robust IBM DB2 database. The SAP business process platform, which is a part of the SAP Discovery system, helps organizations discover ways to accelerate business innovation and respond to changing business needs by designing reusable process components that make use of enterprise services.
The SAP BusinessObjects portfolio of tools and applications on the SAP
Discovery system were designed to help optimize information discovery and
delivery, information management and query, reporting, and analysis. For
business users, the SAP Discovery system helps bridge the gap between
business and IT and serves as a platform for future upgrade planning and
functional trial and gap analysis.
The SAP Discovery system includes sample business scenarios and
demonstrations that are preconfigured and ready to run. It is a preconfigured
environment with prepared demos and populated with Best Practices data. A list
of detailed components, exercises, and SAP Best Practices configuration is
available online at:
http://www.sdn.sap.com/irj/sdn/discoverysystem
The IBM Systems Solution with SAP Discovery system uses the IBM System
x3650 M4 server to provide a robust, compact and cost-effective hardware
platform for the SAP Discovery System, using VMware ESXi software with
Microsoft Windows and SUSE Linux operating systems. IBM System x3650 M4
servers offer an energy-smart, affordable, and easy-to-use rack solution for data
center environments looking to significantly lower operational and solution costs.
Figure 6-17 shows the IBM Systems solution with SAP Discovery system.
Figure 6-17 The IBM Systems solution with SAP Discovery System
The combination of the IBM Systems solution for SAP HANA and the IBM
Systems solution with SAP Discovery System is the ideal platform to explore,
develop, test, and demonstrate the capabilities of an SAP landscape including
SAP HANA. Figure 6-18 illustrates this.
Figure 6-18 IBM Systems solution with SAP Discovery System combined with SAP HANA
Whether you plan to integrate new SAP products into your infrastructure or are
preparing for an upgrade, the IBM Systems solution with SAP Discovery system
can help you thoroughly evaluate SAP applications and validate their benefits.
You gain hands-on experience, the opportunity to develop a proof of concept,
and the perfect tool for training your personnel in advance of deploying a
production system. The combination of the IBM Systems solution with SAP
Discovery system with one of the SAP HANA models based on IBM System
x3690 X5 gives you a complete SAP environment including SAP HANA in a
compact 4U package.
More information about the IBM Systems solution with SAP Discovery System is
available online at:
http://www.ibm.com/sap/discoverysystem
Chapter 7.
7.1, Backing up and restoring data for SAP HANA on page 130
7.2, Disaster Recovery for SAP HANA on page 137
7.3, Monitoring SAP HANA on page 142
7.4, Sharing an SAP HANA system on page 144
7.5, Installing additional agents on page 146
7.6, Software and firmware levels on page 147
Backing up
A backup of the SAP HANA database has to be triggered through the SAP HANA Studio or, alternatively, through the SAP HANA SQL interface. SAP HANA then creates a consistent backup, consisting of one file per cluster node. Simply saving away the savepoints and the database logs does not constitute a consistent backup that can be recovered from. SAP HANA always performs a full backup; incremental backups are currently not supported by SAP HANA.
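As a sketch of the SQL variant, a full data backup can be triggered with the BACKUP DATA statement; hdbsql is the SAP HANA command-line client, and the instance number, user, and backup file prefix below are assumptions:

# Trigger a full data backup through the SAP HANA SQL interface
# (instance number, user, and destination prefix are examples)
hdbsql -i 00 -u SYSTEM -p <password> \
  "BACKUP DATA USING FILE ('/backup/data/full_monday')"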
SAP HANA internally maintains transaction numbers, which are unique within a
database instance, also and especially in a scale-out configuration. To be able to
create a consistent backup across a scale-out configuration, SAP HANA
chooses a specific transaction number, and all nodes of the database instance
write their own backup files including all transactions up to this transaction
number.
The backup files are saved to a defined staging area that might be on the internal
disks, an external disk on an NFS share, or a directly attached SAN subsystem.
In addition to the data backup files, the configuration files and backup catalog
files have to be saved to be recovered. For point in time recovery, the log area
also has to be backed up.
With the IBM Systems solution for SAP HANA, one of the 1 Gbit network interfaces of the server can be used for NFS connectivity; alternatively, an additional 10 Gbit network interface can be added (if a PCI slot is available). It is also supported to add a Fibre Channel HBA for SAN connectivity. The Quick Start Guide for the IBM Systems solution for SAP HANA lists the supported hardware additions that provide additional connectivity.
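As an example, an NFS share used as the backup staging area might be mounted as follows; the server name, export path, and mount point are assumptions:

# Mount an NFS export as the backup staging area (names are examples)
mkdir -p /backup
mount -t nfs backupserver:/export/hana_backup /backup

# Alternatively, make the mount persistent with an /etc/fstab entry:
# backupserver:/export/hana_backup  /backup  nfs  defaults  0 0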
Restoring a backup
It might be necessary to recover the SAP HANA database from a backup in the
following situations:
The data area is damaged
If the data area is unusable, the SAP HANA database can be recovered up to the latest committed transaction if all the data changes after the last complete data backup are still available in the log backups and log area. After the data and log backups have been restored, the SAP HANA database uses the data and log backups and the log entries in the log area to restore the data and replay the logs. It is also possible to recover the database using an older data backup and log backups, as long as all relevant log backups made after the data backup are available. For more information, see SAP Note 1705945 (Determining the files needed for a recovery).
The log area is damaged.
If the log area is unusable, the only possibility to recover is to replay the log
backups. In consequence, any transactions committed after the most recent
log backup are lost, and all transactions that were open during the log backup
are rolled back.
After restoring the data and log backups, the log entries from the log backups
are automatically replayed in order to recover. It is also possible to recover the
database to a specific point in time, as long as it is within the existing log
backups.
The database needs to be reset to an earlier point in time because of a logical
error.
To reset the database to a specific point in time, a data backup from before
the point in time to recover to and the subsequent log backups must be
restored. During recovery the log area might be used as well, depending on
the point in time the database is reset to. All changes made after the recovery
time are (intentionally) lost.
You want to create a copy of the database.
It can be desirable to create a copy of the database for various purposes,
such as creating a test system.
A database recovery is initiated from the SAP HANA studio.
A backup can only be restored to an identical SAP HANA system with regard to the number of nodes, node memory size, host names, and SID. Changing the host names and SID during recovery is, however, possible since SAP HANA 1.0 SPS04.
When restoring a backup image from a single node configuration into a scale-out
configuration, SAP HANA does not repartition the data automatically. The correct
way to bring a backup of a single-node SAP HANA installation to a scale-out
solution is as follows:
1. Back up the data from the stand-alone node.
2. Install SAP HANA on the master node.
3. Restore the backup into the master node.
4. Install SAP HANA on the slave and standby nodes as appropriate, and add
these nodes to the SAP HANA cluster.
5. Repartition the data across all worker nodes.
More detailed information about the backup and recovery processes for the SAP
HANA database is provided in the SAP HANA Backup and Recovery Guide,
available online at:
http://help.sap.com/hana_appliance
See SAP Note 1730932 - Using backup tools with Backint for more details.
Figure 7-1 Backup process with Data Protection for SAP HANA: the backup.sh script triggers the backup of all database partitions to backup files, which are then moved to the Tivoli Storage Manager server
Figure 7-2 Backup process with Data Protection for SAP HANA, using external storage for backup files
Running log and data backups requires the DP for SAP HANA backup.sh
command to be executed as the SAP HANA administration user (<sid>adm).
backup.sh --logs
By using this command, a backup of the SAP HANA database into TSM can be
fully automated.
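For example, a crontab entry for the <sid>adm user could schedule regular log backups; the schedule and the script location are assumptions:

# Run a log backup every hour at minute 15 (crontab of the <sid>adm user;
# the path to backup.sh is an example)
15 * * * * /usr/sap/<SID>/HDB00/backup/backup.sh --logs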
To restore data backups, including SAP HANA configuration files and log file backups, TSM's BACKUP-Filemanager is used. Figure 7-3 shows a sample panel of the BACKUP-Filemanager.
BACKUP-Filemanager V6.4.0.0, Copyright IBM 2001-2012
.------------------+---------------------------------------------------------------.
| Backup ID's
| Files stored under TSM___A0H7K1C4QI
|
|------------------+---------------------------------------------------------------|
| TSM___A0H7KM0XF4 | */hana/log_backup/log_backup_2_0_1083027170688_1083043933760 |
| TSM___A0H7KLYP3Z | */hana/log_backup/log_backup_2_0_1083043933760_1083060697664 |
| TSM___A0H7KHNLU6 | */hana/log_backup/log_backup_2_0_1083060697664_1083077461376 |
| TSM___A0H7KE6V19 | */hana/log_backup/log_backup_2_0_1083077461376_1083094223936 |
| TSM___A0H7K9KR7F | */hana/log_backup/log_backup_2_0_1083094223936_1083110986880 |
| TSM___A0H7K7L73W | */hana/log_backup/log_backup_2_0_1083110986880_1083127750848 |
| TSM___A0H7K720A4 | */hana/log_backup/log_backup_2_0_1083127750848_1083144513792 |
| TSM___A0H7K4BDXV | */hana/log_backup/log_backup_2_0_1083144513792_1083161277760 |
| TSM___A0H7K472YC | */hana/log_backup/log_backup_2_0_1083161277760_1083178040064 |
| TSM___A0H7K466HK | */hana/log_backup/log_backup_2_0_1083178040064_1083194806336 |
| TSM___A0H7K1C4QI | */hana/log_backup/log_backup_2_0_1083194806336_1083211570688 |
| TSM___A0H7JX1S77 | */hana/log_backup/log_backup_2_0_1083211570688_1083228345728 |
| TSM___A0H7JSRG2B | */hana/log_backup/log_backup_2_0_1083228345728_1083245109824 |
| TSM___A0H7JOH1ZP | */hana/log_backup/log_backup_2_0_1083245109824_1083261872960 |
| TSM___A0H7JK6ONC | */hana/log_backup/log_backup_2_0_1083261872960_1083278636608 |
| TSM___A0H7JJWUI8 | */hana/log_backup/log_backup_2_0_1083278636608_1083295400384 |
| TSM___A0H7JJU5YN | */hana/log_backup/log_backup_2_0_1083295400384_1083312166016 |
| TSM___A0H7JFWAV4 | */hana/log_backup/log_backup_2_0_1083312166016_1083328934016 |
| TSM___A0H7JBG625 | */hana/log_backup/log_backup_2_0_1083328934016_1083345705856 |
| TSM___A0H7JBAASN | */hana/log_backup/log_backup_2_0_1083345705856_1083362476352 |
| TSM___A0H7J7BLDK | */hana/log_backup/log_backup_2_0_1083362476352_1083379244416 |
| TSM___A0H7J5U8S7 | */hana/log_backup/log_backup_2_0_1083379244416_1083396008064 |
| TSM___A0H7J5T92O | */hana/log_backup/log_backup_2_0_1083396008064_1083412772928 |
| TSM___A0H7J4TWPG | */hana/log_backup/log_backup_2_0_1083412772928_1083429538688 |
|
| */hana/log_backup/log_backup_2_0_1083429538688_1083446303424 |
|
| */hana/log_backup/log_backup_2_0_1083446303424_1083463079488 |
|
| */hana/log_backup/log_backup_2_0_1083463079488_1083479846528 V
|------------------+---------------------------------------------------------------|
| 24 BID's
| 190 File(s) - 190 marked
|
`------------------+---------------------------------------------------------------'
TAB change windows   F2 Restore   F3 Mark all   F4 Unmark all   F5 reFresh   F6 fileInfo   F7 redireCt   F8 Delete   F10 eXit   ENTER mark file
Figure 7-3 The BACKUP-Filemanager interface
Desired data and log backups can be selected and then restored to the desired
location. If no directory is specified for the restore, the BACKUP-Filemanager
restores the backups to the original location from which the backup was done.
After the backup files have been restored, the recovery process has to be started
using SAP HANA Studio. More information about this process and the various
options for a recovery is contained in the SAP HANA Backup and Recovery
Guide, available online at:
http://help.sap.com/hana_appliance
After completing the recovery process successfully and the backup files are no
longer needed, they must be removed from disk manually.
Figure 7-4 illustrates the concept of using backup and restore as a basic disaster
recovery solution.
Figure 7-4 Using backup and restore as a basic disaster recovery solution
Overview
For a disaster recovery setup, it is necessary to have identical scale-out configurations on both the primary and the secondary site. In addition, there needs to be a third site that has the sole responsibility to act as a quorum site. In the configuration described here, the distance between the primary and secondary data centers has to be within a range that allows for synchronous replication with limited impact on the overall application performance (also referred to as metro-mirror distance).
The major difference between a single-site solution (as described in 6.4.2, Scale-out solution with high-availability capabilities on page 114) and a multi-site solution is the placement of the replicas within GPFS. Whereas in a single-site configuration there is only one replica of each data block in one cluster (in addition to the primary data; in GPFS terminology these are already two replicas, that is, the primary data and the first copy; to avoid confusion, we do not count the primary data as a replica), a multi-site solution holds an additional replica in the remote or secondary site. This ensures that, when the primary site fails, a complete copy of the data is available in the second site and operation can be resumed on this site.
A two-site solution implements the concept of synchronous data replication on the file system level between both sites, leveraging the replication capabilities of GPFS. Synchronous data replication means that any write request issued by the application is only committed to the application after it has been successfully written on both sides. In order to maintain the application performance within reasonable limits, the network latency (and therefore the distance) between the sites has to be limited to metro-mirror distances. The maximum achievable distance depends on the performance requirements of the SAP HANA system and on the network configuration in the customer environment.
Basic setup
During normal operation, there is an active SAP HANA instance running. The SAP HANA instance on the secondary site is not active. The implementation on each site is identical to a standard scale-out cluster with high availability, as described in section 6.4.2, Scale-out solution with high-availability capabilities on page 114. It therefore has to include standby servers for high availability. A server failure is handled completely within one site and does not force a site failover. Figure 7-5 illustrates this setup.
Figure 7-5 Basic setup of the disaster recovery solution using GPFS synchronous replication
The connection between the two main sites A and B depends on the customer's network infrastructure. It is recommended to have a dual-link dark fibre connection to allow for redundancy also on the network switch side of each site. For full redundancy, an additional link pair is required for cross-connecting the switches. Within each site, the 10 Gb Ethernet network connections for both the internal SAP HANA and the internal GPFS network are implemented in a redundant layout.
As with a standard scale-out implementation, the disaster recovery configuration relies on GPFS functionality to enable the synchronous data replication between sites. Whereas a single-site solution holds one replica of each data block, the dual-site disaster recovery implementation adds a second replica, and a stretched GPFS cluster is implemented across the two sites. Figure 7-5 illustrates that there is a combined cluster on the GPFS level spanning both sites, whereas the SAP HANA installations are independent of each other. GPFS file placement policies ensure that there is one replica on the primary site and a second replica on the secondary site. In case of a site failure, the file system can therefore stay active with a complete data replica on the secondary site. The SAP HANA database can then be made operational through a manual procedure based on the persistency and log files available in the file system.
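As an illustration of one way such cross-site placement can be expressed, the following sketch builds a GPFS NSD stanza file with one failure group per site and creates a file system with two data and metadata replicas, so that GPFS keeps one copy of each block in each failure group. All node, device, and file system names are hypothetical; the procedure actually validated for the appliance is documented in the IBM Quick Start Guide:

    # Sketch: expressing "one replica per site" through GPFS failure groups.
    # Node and device names are placeholders, not the validated configuration.
    import subprocess

    # One failure group per site; GPFS places the two copies of each data
    # block in different failure groups, that is, one replica per site.
    SITE_FAILURE_GROUPS = {"siteA": 1, "siteB": 2}

    nodes = [("node01", "siteA"), ("node02", "siteA"),
             ("node05", "siteB"), ("node06", "siteB")]

    nsd_stanzas = []
    for idx, (node, site) in enumerate(nodes, start=1):
        nsd_stanzas.append(
            "%nsd: device=/dev/fioa nsd=nsd{0:02d} servers={1} "
            "usage=dataAndMetadata failureGroup={2}".format(
                idx, node, SITE_FAILURE_GROUPS[site]))

    with open("nsd.stanza", "w") as f:
        f.write("\n".join(nsd_stanzas) + "\n")

    # Define the NSDs, then create the file system with two metadata (-m/-M)
    # and two data (-r/-R) replicas spread across the failure groups.
    subprocess.run(["mmcrnsd", "-F", "nsd.stanza"], check=True)
    subprocess.run(["mmcrfs", "sapmntdata", "-F", "nsd.stanza",
                    "-m", "2", "-M", "2", "-r", "2", "-R", "2"], check=True)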
Site failover
During normal operation there is a running SAP HANA instance active on the primary site. The secondary site has an installed SAP HANA instance that is inactive. A failover to the remote SAP HANA installation has to be initiated manually. Depending on the reason for the site failover, it can be decided whether the secondary site becomes the production site or both sites stay offline until the reason for the failover is removed and the primary site becomes active again.

During normal operation the GPFS file system is not mounted on the secondary site, ensuring that there is neither read nor write access to the file system. In case of a failover, first ensure that a second replica of the data is available on the secondary site before the file system is used. This replication process is initiated manually; as soon as it completes correctly, the file system is mounted. From there on, the SAP HANA instance can be started and the data loaded into memory. The SAP HANA database is restored to the latest savepoint and the available logs are recovered.

Any switch from one site to the other entails downtime of SAP HANA operations, because the two independent instances on either site must not run at the same time, due to the sharing of the persistency and log files on the file system.
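The following sketch outlines these manual failover steps as they might be scripted on a surviving node of the secondary site. The command names are standard GPFS and SAP HANA utilities, but the file system name and administrative user are placeholders, and the authoritative ordering of the recovery steps is the IBM/SAP runbook:

    # Sketch of the manual site failover sequence; run as root on the
    # secondary site. "sapmntdata" and "hdbadm" are placeholder names.
    import subprocess

    def run(cmd):
        print("+", " ".join(cmd))
        subprocess.run(cmd, check=True)

    # 1. Mount the GPFS file system on the secondary site's nodes.
    run(["mmmount", "sapmntdata", "-a"])

    # 2. Re-establish the second replica of all data within the surviving
    #    site (mmrestripefs -r restores replication for files that lost a
    #    copy) before the database is brought into production use.
    run(["mmrestripefs", "sapmntdata", "-r"])

    # 3. Start the inactive SAP HANA instance as the <sid>adm user; on
    #    startup it restores the latest savepoint and replays the logs.
    run(["su", "-", "hdbadm", "-c", "HDB start"])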
Summary
The disaster recovery solution for the IBM Systems solution for SAP HANA exploits the advanced replication features of GPFS, creating a cross-site cluster that ensures availability and consistency of data across two sites. It does not impose the need for additional storage systems, but builds completely upon the scale-out solution for SAP HANA. This simple architecture reduces the complexity of maintaining such a solution.
Monitoring data provided by the statistics server can also be used by other monitoring tools. Figure 7-6 shows this data integrated into IBM Tivoli Monitoring.

Tivoli Monitoring also provides agents to monitor the operating system of the SAP HANA appliance. Hardware monitoring of the SAP HANA appliance servers can be achieved with IBM Systems Director, which can also be integrated into a Tivoli Monitoring landscape.

By integrating the monitoring data collected by the statistics server, the Tivoli Monitoring agent for the operating system, and the hardware information provided by IBM Systems Director, Tivoli Monitoring can provide a holistic view of the SAP HANA appliance.
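As an illustration, monitoring data of this kind can also be read directly with SQL. The following sketch assumes the SAP HANA Python driver (hdbcli) and placeholder connection details, and queries M_SERVICE_MEMORY, one of the standard monitoring views whose data the statistics server collects:

    # Sketch: reading memory usage per service from a monitoring view.
    # Host, port, user, and password are placeholders.
    from hdbcli import dbapi

    conn = dbapi.connect(address="hanahost", port=30015,
                         user="MONITORING_USER", password="***")
    try:
        cur = conn.cursor()
        cur.execute(
            "SELECT host, service_name, "
            "ROUND(total_memory_used_size / 1024 / 1024 / 1024, 2) AS used_gb "
            "FROM SYS.M_SERVICE_MEMORY ORDER BY used_gb DESC")
        for host, service, used_gb in cur.fetchall():
            print(f"{host:<12} {service:<16} {used_gb:>8} GB")
    finally:
        conn.close()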
test and sandbox systems, possibly for multiple application scenarios, regions, or lines of business. Therefore, the consolidation of SAP HANA instances, at least for non-production systems, seems desirable. There are, however, major drawbacks to consolidating multiple SAP HANA instances on one system (a system here can consist of one single server or of multiple servers in a clustered configuration). For this reason, consolidation is generally not supported for production systems. For non-production systems, the support status depends on the scenario:
Multiple Components on One System (MCOS)
Having multiple SAP HANA instances on one system, also referred to as MCOS (Multiple Components on One System), poses conflicts between the different SAP HANA databases on a single server: for example, common data and log volumes, possible performance degradation, and interference of the systems with each other. While SAP supports this under certain conditions (see SAP Note 1681092), IBM does not support such a configuration.
Multiple Components on One Cluster (MCOC)
Running multiple SAP HANA instances on one scale-out cluster (for the sake of similarity to the other abbreviations, we call this MCOC) is supported as long as each node of the cluster runs only one SAP HANA instance. A development and a QA instance can run on one cluster, but with dedicated nodes for each of the two SAP HANA instances; that is, each node runs either the development instance or the QA instance, but not both. Only the GPFS file system is shared across the cluster.
Multiple Components in One Database (MCOD)
Having one SAP HANA instance contain multiple components, schemas, or application scenarios, also referred to as Multiple Components in One Database (MCOD), is supported. It means, however, that all data resides within a single database, which is also maintained as a single database; this can lead to limitations in operations, database maintenance, backup and recovery, and so on. For example, bringing down the SAP HANA database affects all of the scenarios; it is impossible to bring it down for only one scenario. SAP Note 1661202 documents the implications.
Things to consider when consolidating SAP HANA instances on one system include:

An instance filling up the log volume causes all other instances on the system to stop working properly. This can be addressed by monitoring the system closely (see the sketch after this list).
Installation of an additional instance might fail when other instances are already installed and active on the system. The installation procedure checks the available space on the storage and refuses to install when there is less free space than expected. This might also happen when trying to reinstall an already installed instance.
Installing a new SAP HANA revision for one instance might affect other instances already installed on the system. For example, new library versions coming with the new installation might break the already installed instances.

The performance of the SAP HANA system becomes unpredictable because the individual instances on the system share resources such as memory and CPU.
When asking for support for such a system, you might be asked to remove the additional instances and to re-create the issue on a single-instance system.
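A minimal watchdog sketch for the log volume consideration above, using only the Python standard library; the mount point and threshold are assumptions to be adapted per installation:

    # Warn before a shared log volume fills up and stops every instance.
    import shutil

    LOG_MOUNT = "/sapmnt/log"       # shared log volume (placeholder path)
    WARN_BELOW_FRACTION = 0.20      # warn when less than 20% is free

    usage = shutil.disk_usage(LOG_MOUNT)
    free_fraction = usage.free / usage.total
    if free_fraction < WARN_BELOW_FRACTION:
        print(f"WARNING: only {free_fraction:.0%} free on {LOG_MOUNT}; "
              "a full log volume stops every instance on this system")
    else:
        print(f"OK: {free_fraction:.0%} free on {LOG_MOUNT}")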
Tolerated

Prohibited
Solutions that must not be used on the IBM Systems solution for SAP HANA; using these solutions might compromise the performance, stability, or data integrity of the SAP HANA appliance.
Do not install additional software on the SAP HANA appliance that is classified as prohibited for use on the SAP HANA appliance. As an example, initial tests show that some agents (for example, virus scanners) can decrease performance or even corrupt the SAP HANA database.

In general, all additionally installed software must be configured so as not to interfere with the functionality or performance of the SAP HANA appliance. If any issue with the SAP HANA appliance occurs, you might be asked by SAP to remove all additional software and to reproduce the issue.

The list of agents that are supported, tolerated, or prohibited for use on the SAP HANA appliance is published in the Quick Start Guide for the IBM Systems Solution for SAP HANA appliance, available online at:
http://www-947.ibm.com/support/entry/myportal/docdisplay?lndocid=MIGR-5087035
Firmware
Operating system
Hardware drivers
Software

The IBM System x SAP HANA support team, after being informed, reserves the right to perform basic system tests on these levels when it is deemed that they have a direct impact on the SAP HANA appliance. In general, IBM does not give specific recommendations as to which levels are allowed for the SAP HANA appliance.
The IBM System x SAP HANA development team provides new images for the SAP HANA appliance at regular intervals. Because these images have dependencies regarding hardware, operating system, and drivers, use the latest image for maintenance and installation of SAP HANA systems. These images can be obtained through IBM support. Part number information is contained in the Quick Start Guide.
If firmware level recommendations for the IBM components of the SAP HANA appliance are given through the individual IBM System x support teams to fix known code bugs, it is the customer's responsibility to upgrade or downgrade to the recommended levels as instructed by IBM Support.
If operating system recommendations for the SUSE Linux components of the SAP HANA appliance are given through the SAP, SUSE, or IBM support teams to fix known code bugs, it is the customer's responsibility to upgrade or downgrade to the recommended levels, as instructed by SAP through an explicit SAP Note or as allowed through a customer OSS message. SAP describes its operational concept, including the updating of operating system components, in SAP Note 1599888 - SAP HANA: Operational Concept. If the Linux kernel is updated, take extra care to recompile the IBM High IOPS drivers and the IBM GPFS software as well.
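A small sketch of a post-update check along these lines: it verifies whether a GPFS kernel module exists for the kernel that is now running. The module path follows common GPFS portability-layer conventions and may differ by release, so treat this purely as an illustration:

    # After a kernel update, check whether the GPFS portability layer was
    # built for the running kernel. Paths are conventional, not guaranteed.
    import os
    import subprocess

    running_kernel = subprocess.run(["uname", "-r"], capture_output=True,
                                    text=True, check=True).stdout.strip()

    module_dir = f"/lib/modules/{running_kernel}/extra"
    gpfs_module = os.path.join(module_dir, "mmfslinux.ko")

    if os.path.exists(gpfs_module):
        print(f"GPFS portability layer found for kernel {running_kernel}")
    else:
        print(f"No GPFS kernel module under {module_dir}: rebuild the "
              "portability layer (and the High IOPS driver) before starting GPFS")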
If a recommendation to update the IBM High IOPS driver or the IBM GPFS software is given through the individual IBM support teams (System x, Linux, GPFS) to fix known code bugs, it is not recommended to update these drivers without first asking the IBM System x SAP HANA support team through an SAP OSS customer message.
If other hardware or software recommendations for IBM components of the SAP HANA appliance are given through the individual IBM support teams to fix known code bugs, it is the customer's responsibility to upgrade or downgrade to the recommended levels as instructed by IBM Support.
Chapter 8.
Summary

This chapter summarizes the benefits of in-memory computing and the advantages of the IBM infrastructure for running the SAP HANA solution.
designed and certified by SAP. They are delivered preconfigured, with key software components preinstalled, to help speed delivery and deployment of the solution. The x3690 X5-based configurations offer 128 - 256 GB of memory and the choice of either solid-state disk only or a combination of spinning disk and solid-state disk. The x3950 X5-based configurations leverage the scalability of eX5 and offer the capability to pay as you grow, starting with a 2-processor, 256 GB configuration and growing to an 8-processor, 1 TB configuration. The x3950 X5-based configurations integrate High IOPS SSD PCIe adapters. The 8-socket configuration uses a scalability kit that combines the 7143-H2x or 7143-HBx with the 7143-H3x or 7143-HCx to create a single 8-socket, 1 TB system.

IBM offers the appliance in a box with no need for external storage. With the x3690 X5-based SSD-only models, IBM has a unique offering with no spinning hard drives, providing greater reliability and performance.
8.3.4 Scalability
IBM offers configurations allowing customers to start with a 2 CPU/256 GB RAM
model (S+), which can scale up to a 4 CPU/512 GB RAM model (M), and then to
an 8 CPU/1024 GB configuration (L). With the option to upgrade S+ to M, and M+
to L, IBM can provide an unmatched upgrade path from a T-shirt size S up to a
T-shirt size L, without the need to retire a single piece of hardware.
If you have large database requirements, you can scale the workload-optimized solutions to multi-server configurations. IBM and SAP have validated configurations of up to sixteen nodes with high availability, each node holding either 256 GB, 512 GB, or 1 TB of main memory. This scale-out support enables support for databases as large as 16 TB, able to hold the equivalent of about 56 TB of uncompressed data. While the IBM solution is certified for up to 16 nodes, its architecture is designed for extreme scalability and can grow even beyond that.
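The 56 TB figure follows from a common sizing rule of thumb, shown here as a back-of-the-envelope calculation. The 50% data fraction and the 7:1 compression ratio are assumptions typical for columnar in-memory storage, not guarantees:

    # Back-of-the-envelope check of the 56 TB figure.
    nodes = 16
    memory_per_node_tb = 1.0
    data_fraction = 0.5        # assumption: about half the RAM holds table data
    compression_factor = 7     # assumption: typical columnar compression ratio

    raw_capacity_tb = nodes * memory_per_node_tb                 # 16 TB
    uncompressed_tb = raw_capacity_tb * data_fraction * compression_factor
    print(f"{raw_capacity_tb:.0f} TB of RAM ~ {uncompressed_tb:.0f} TB "
          "of uncompressed source data")   # about 56 TB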
The IBM solution does not require external storage for either the stand-alone or the scale-out solution. The solution is easy to grow by simply adding nodes to the network. There is no need to reconfigure a storage area network for failover; that is all covered by GPFS under the hood.
IBM uses the same base building blocks from stand-alone servers to scale out,
providing investment protection for customers who want to grow their SAP HANA
solution beyond a single server.
IBM or IBM Business Partners can provide these scale-out configurations
preassembled in a rack, helping to speed installation and setup of the SAP
HANA appliance.
After the SAP HANA implementation roadmap is accepted by the customer, IBM
expert teams work with the customer to implement the roadmap.
Project preparation
Project kick-off
Blueprint
Realization
Testing
Go-live preparation
Go-live
This methodology is focused on helping both IBM and the customer keep the defined and agreed scope under control, and on helping with issue classification and resolution management. It also gives all involved stakeholders the required visibility into the current progress of development and testing.
SAP HANA as the underlying database for SAP Business Suite products
An SAP NetWeaver BW system running on SAP HANA is currently the only released solution in this category. Offerings for other products will be announced as they are released to run on the SAP HANA database.

IBM Global Business Services uses a facility called the IBM SAP HANA migration factory, designed specifically for this purpose. Local experts who work directly with the clients cooperate with remote teams performing the required activities, based on a defined conversion methodology agreed upon with SAP. This facility has the required number of trained experts covering all key positions needed for a smooth transition from a traditional database to SAP HANA.
The migration service related to the conversion of an existing SAP NetWeaver BW system to run on the SAP HANA database has the following phases:
Initial assessment
Local teams perform an initial assessment of the existing systems, their relations, and their technical status. Required steps are identified, and an implementation roadmap is developed and presented to the customer.

Conversion preparation
IBM remote teams perform all required preparations for the conversion. BW experts clean the BW systems to remove unnecessary objects. If required, the system is cloned and upgraded to the required level.

Migration to the SAP HANA database
In this phase, IBM remote teams perform the conversion to the SAP HANA database, including all related activities. Existing InfoCubes and DataStore objects are converted to an in-memory optimized format. After successful testing, the system is released back for customer usage.
Appendix A.
Appendix
This appendix provides information about the GPFS license.
Table A-1 GPFS licenses included in the custom models for SAP HANA

MTM        PVUs included
7147-H1x   1400
7147-H2x   1400
7147-H3x   1400
7147-H7x   1400
7147-H8x   1400
7147-H9x   1400
7147-HAx   1400
7147-HBx   1400
7143-H1x   1400
7143-H2x   4000
7143-H3x   5600
7143-H4x   1400
7143-H5x   4000
7143-HAx   4000
7143-HBx   4000
7143-HCx   5600
Licenses for IBM GPFS on x86 Single Server for Integrated Offerings, V3 (referred to as Integrated in the table) cannot be ordered independently of the hardware with which they are included. This type of license provides file system capabilities for single-node integrated offerings. The model 7143-HAx therefore includes 4000 PVUs of GPFS on x86 Single Server for Integrated Offerings, V3 licenses, so that an upgrade to the 7143-HBx model does not require additional licenses. The PVU rating of the 7143-HAx model to consider when purchasing other GPFS license types is 1400 PVUs.
Clients with highly available, multi-node clustered scale-out configurations must
purchase the GPFS on x86 Server and GPFS File Placement Optimizer product,
as described in 6.4.4, Hardware and software additions required for scale-out
on page 121.
Abbreviations and acronyms

ABAP  Advanced Business Application Programming
ACID  Atomicity, Consistency, Isolation, Durability
APO
BI  Business Intelligence
BICS  BI Consumer Services
BM  bridge module
BW  Business Warehouse
CD  compact disc
CPU
CRC
CRM  Customer Relationship Management
CRU
DB  database
DEV  development
DIMM
DR  Disaster Recovery
DSOs  DataStore Objects
DXC
ECC
ERP
ETL
FTSS
GB  gigabyte
GBS
GPFS  General Parallel File System
GTS
HA  high availability
HDD
HPI
I/O  input/output
IBM  International Business Machines
ID  identifier
IDs  identifiers
IMM  Integrated Management Module
IOPS
ISICC
ITSO  International Technical Support Organization
JDBC
JRE
KPIs
LM  landscape management
LUW
MB  megabyte
MCA
MCOD  Multiple Components in One Database
MCOS  Multiple Components on One System
MDX  Multidimensional Expressions
NOS
NSD
NUMA
ODBC
ODBO
OLAP
OLTP
OS  operating system
OSS
PAM
PC  personal computer
PCI  Peripheral Component Interconnect
POC  proof of concept
PSA
PVU
PVUs
QA  quality assurance
QPI  QuickPath Interconnect
RAID  Redundant Array of Independent Disks
RAM
RAS
RDS
RPM
RPO
RTO
SAN
SAPS
SAS
SATA  Serial ATA
SCM
SCM  software configuration management
SD
SDRAM
SLD
SLES
SLO  System Landscape Optimization
SMI
SQL
SSD
SSDs
STG
SUM
TB  terabyte
TCO
TCP/IP  Transmission Control Protocol/Internet Protocol
TDMS
TREX
TSM
UEFI
Related publications
The publications listed in this section are considered particularly suitable for a
more detailed discussion of the topics covered in this book.
IBM Redbooks
The following IBM Redbooks publications provide additional information about
the topic in this document. Note that some publications referenced in this list
might be available in softcopy only.
The Benefits of Running SAP Solutions on IBM eX5 Systems, REDP-4234
IBM eX5 Portfolio Overview: IBM System x3850 X5, x3950 X5, x3690 X5, and
BladeCenter HX5, REDP-4650
Implementing the IBM General Parallel File System (GPFS) in a Cross
Platform Environment, SG24-7844
You can search for, view, download, or order these documents and other
Redbooks, Redpapers, Web Docs, draft and additional materials, at the following
website:
ibm.com/redbooks
Other publications
This publication is also relevant as a further information source:
Prof. Hasso Plattner, Dr. Alexander Zeier, In-Memory Data Management,
Springer, 2011
Online resources
These websites are also relevant as further information sources:
IBM Systems Solution for SAP HANA
http://www.ibm.com/systems/x/solutions/sap/hana/
IBM Systems and Services for SAP HANA
http://www.ibm-sap.com/hana
IBM and SAP: Business Warehouse Accelerator
http://www.ibm-sap.com/bwa
SAP In-Memory Computing - SAP Help Portal
http://help.sap.com/hana
Back cover
INTERNATIONAL TECHNICAL SUPPORT ORGANIZATION
BUILDING TECHNICAL INFORMATION BASED ON PRACTICAL EXPERIENCE

IBM Redbooks are developed by the IBM International Technical Support Organization. Experts from IBM, customers, and partners from around the world create timely technical information based on realistic scenarios. Specific recommendations are provided to help you implement IT solutions more effectively in your environment.
ISBN 073843762X