IBM TS7700 Release 4.1 and 4.1.2 Guide
Larry Coyne
Katja Denefleh
Derek Erdmann
Joe Hew
Alberto Barajas Ortiz
Aderson Pacini
Michael Scott
Takahiro Tsuda
Chen Zhu
Redbooks
International Technical Support Organization
May 2018
SG24-8366-01
Note: Before using this information and the product it supports, read the information in “Notices” on page xv.
This edition applies to Version 4, Release 1, Modification 2 of IBM TS7700 (product number 3957-AGK0).
© Copyright International Business Machines Corporation 2017, 2018. All rights reserved.
Note to U.S. Government Users Restricted Rights -- Use, duplication or disclosure restricted by GSA ADP Schedule
Contract with IBM Corp.
Contents
Notices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xv
Trademarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xvi
Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xvii
Summary of contents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xviii
Authors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xix
Now you can become a published author, too . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xxii
Comments welcome. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xxii
Stay connected to IBM Redbooks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xxii
Chapter 4. Preinstallation planning and sizing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
4.1 Hardware installation and infrastructure planning . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
4.1.1 System requirements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
4.1.2 TS7700 specific limitations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 144
4.1.3 TCP/IP configuration considerations. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146
4.1.4 Factors that affect performance at a distance. . . . . . . . . . . . . . . . . . . . . . . . . . . 153
4.1.5 Host attachments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154
4.1.6 Planning for LDAP for user authentication in your TS7700 subsystem . . . . . . . 159
4.1.7 Cluster time coordination . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159
4.2 Planning for a grid operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 160
4.2.1 Autonomic Ownership Takeover Manager considerations . . . . . . . . . . . . . . . . . 161
4.2.2 Defining grid copy mode control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161
4.2.3 Defining scratch mount candidates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163
4.2.4 Retain Copy mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164
4.2.5 Defining cluster families . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164
4.2.6 TS7720 and TS7760 cache thresholds and removal policies . . . . . . . . . . . . . . . 164
4.2.7 Data management settings (TS7740/TS7700T CPx in a multi-cluster grid) . . . . 168
4.2.8 High availability considerations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169
4.3 Planning for software implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171
4.3.1 Host configuration definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171
4.3.2 Software requirements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174
4.3.3 System-managed storage tape environments . . . . . . . . . . . . . . . . . . . . . . . . . . 174
4.3.4 Sharing and partitioning considerations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175
4.3.5 Sharing the TS7700 by multiple hosts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 176
4.3.6 Partitioning the TS7700 between multiple hosts . . . . . . . . . . . . . . . . . . . . . . . . . 176
4.3.7 Logical path considerations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177
4.4 Planning for logical and physical volumes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 178
4.4.1 Volume serial numbering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 178
4.4.2 Virtual volumes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 179
4.4.3 Logical WORM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 181
4.4.4 Physical volumes for TS7740, TS7720T, and TS7760T . . . . . . . . . . . . . . . . . . . 182
4.4.5 Data compression . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 184
4.4.6 Secure Data Erase function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185
4.4.7 Planning for tape encryption in a TS7740, TS7720T, and TS7760T . . . . . . . . . 186
4.4.8 Planning for cache disk encryption in the TS7700 . . . . . . . . . . . . . . . . . . . . . . . 188
4.5 Tape analysis and sizing the TS7700 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 189
4.5.1 IBM tape tools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 190
4.5.2 BatchMagic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193
4.5.3 Workload considerations. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193
4.5.4 Education and training . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 197
4.5.5 Implementation services . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 198
Chapter 8. Migration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 299
8.1 Migration to a TS7700 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 300
8.1.1 Host-based migration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 301
8.1.2 Tape-based migration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 302
8.2 Migration between TS7700s . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 302
8.2.1 Join and Copy Refresh processing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 303
8.2.2 Copy Export and Copy Export Recovery / Merge . . . . . . . . . . . . . . . . . . . . . . . . 305
8.2.3 Grid to Grid Migration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 306
8.3 Methods to move data for host-based migration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 309
8.3.1 Phased method of moving data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 309
8.3.2 Quick method of moving data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 311
8.3.3 Products to simplify the task . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 313
8.3.4 Combining methods to move data into the TS7700 . . . . . . . . . . . . . . . . . . . . . . 314
8.4 Moving data out of the TS7700 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 315
8.4.1 Host-based copy tools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 315
8.4.2 Copy Export and Copy Export Recovery / Merge . . . . . . . . . . . . . . . . . . . . . . . . 315
8.4.3 DFSMShsm aggregate backup and recovery support . . . . . . . . . . . . . . . . . . . . 315
8.5 Migration of DFSMShsm-managed data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 317
8.5.1 Volume and data set sizes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 318
8.5.2 TS7700 implementation considerations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 321
8.5.3 DFSMShsm task-related considerations. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 323
8.6 DFSMSrmm and other tape management systems . . . . . . . . . . . . . . . . . . . . . . . . . . 325
8.7 IBM Spectrum Protect . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 327
8.7.1 Native or virtual drives. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 329
8.7.2 IBM Tivoli Storage Manager parameter settings. . . . . . . . . . . . . . . . . . . . . . . . . 330
8.8 DFSMSdss . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 330
8.8.1 Full volume dumps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 330
8.8.2 Stand-Alone Services . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 331
8.9 Object access method. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 332
8.10 Database backups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 333
8.10.1 DB2 data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 333
8.10.2 CICS and IMS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 335
8.10.3 Batch data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 336
11.3.5 Parameters and customization of the TS7700 . . . . . . . . . . . . . . . . . . . . . . . . . 622
11.3.6 Terminology of throughput . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 622
11.3.7 Throttling in the TS7700 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 623
11.4 Monitoring TS7700 performance. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 626
11.4.1 Base information: Types of statistical records. . . . . . . . . . . . . . . . . . . . . . . . . . 627
11.4.2 Using the TS4500 Management GUI . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 629
11.4.3 Using the TS3500 Tape Library Specialist for monitoring. . . . . . . . . . . . . . . . . 630
11.4.4 Using the TS7700 Management Interface to monitor IBM storage . . . . . . . . . . 633
11.5 Cache capacity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 643
11.5.1 Interpreting Cache Usage: MI . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 644
11.5.2 Interpreting Cache Usage: VEHSTATS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 644
11.5.3 Interpreting Cache Usage: LI REQ,distlib,CACHE . . . . . . . . . . . . . . . . . . . . . . 644
11.5.4 Tuning cache usage - Making your cache deeper . . . . . . . . . . . . . . . . . . . . . . 645
11.5.5 Tuning cache usage - Management of unwanted copies . . . . . . . . . . . . . . . . . 646
11.6 Cache throughput / Cache bandwidth . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 647
11.6.1 Interpreting Cache throughput: Performance graph . . . . . . . . . . . . . . . . . . . . . 647
11.6.2 Interpreting cache throughput: VEHSTATS HOURFLOW . . . . . . . . . . . . . . . . 648
11.6.3 Tuning Cache bandwidth: Premigration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 648
11.6.4 Premigration and premigration throttling values . . . . . . . . . . . . . . . . . . . . . . . . 649
11.7 TS7700 throughput: Host I/O increments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 651
11.7.1 Host I/O in the performance graphs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 652
11.7.2 Host I/O in the VEHSTATS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 652
11.7.3 Host Throughput Feature Codes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 653
11.7.4 Tuning for Host I/O . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 654
11.8 Grid link and replication performance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 655
11.8.1 Installed grid link hardware: Mixing of different Grid link adapters . . . . . . . . . . 655
11.8.2 Bandwidth and quality of the provided network . . . . . . . . . . . . . . . . . . . . . . . . 655
11.8.3 Selected replication mode. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 656
11.8.4 Tuning possibilities for copies: COPYCOUNT Control . . . . . . . . . . . . . . . . . . . 660
11.8.5 Tuning possibilities for copies: Deferred Copy Throttling . . . . . . . . . . . . . . . . . 661
11.8.6 Grid link performance monitoring . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 663
11.9 Considerations for the backend TS7740 / TS7700T . . . . . . . . . . . . . . . . . . . . . . . . . 664
11.9.1 Amount of Back-end drives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 664
11.9.2 Monitor Backend drives in the MI . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 666
11.9.3 Monitor Backend drives in the VEHSTATS. . . . . . . . . . . . . . . . . . . . . . . . . . . . 666
11.9.4 Monitor Backend drives with a LI REQ command. . . . . . . . . . . . . . . . . . . . . . . 667
11.9.5 Tune the usage of back-end drives. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 668
11.9.6 Number of back-end cartridges. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 671
11.9.7 Monitor the usage of back-end cartridges on the MI. . . . . . . . . . . . . . . . . . . . . 671
11.9.8 Monitor the usage of back-end cartridges with VEHSTATS . . . . . . . . . . . . . . . 672
11.9.9 Tuning of the usage of Back-end cartridges with VEHSTATS . . . . . . . . . . . . . 673
11.10 Throttling the TS7700 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 673
11.10.1 Monitoring throttling with the MI . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 673
11.10.2 Monitoring throttling with VEHSTATS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 674
11.10.3 Tuning to avoid the throttling. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 674
11.11 Adjusting parameters in the TS7700 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 675
11.12 Monitoring after service or outage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 676
11.13 Performance evaluation tool: Plotting cache throughput from VEHSTATS. . . . . . . 676
11.14 Bulk Volume Information Retrieval . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 678
11.14.1 Overview of the BVIR function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 679
11.14.2 Prerequisites . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 681
11.14.3 Request data format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 681
13.3 DR general considerations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 772
13.3.1 The z/OS test environment represents a point in time . . . . . . . . . . . . . . . . . . . 772
13.3.2 The data that is available in the DR cluster. . . . . . . . . . . . . . . . . . . . . . . . . . . . 772
13.3.3 Write Protect Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 772
13.3.4 Protection of your production data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 773
13.3.5 Separating production and disaster recovery hosts: Logical volumes . . . . . . . 773
13.3.6 Creating data during the disaster recovery test from the DR host: Selective Write
Protect . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 775
13.3.7 Creating data during the disaster recovery test from the disaster recovery host:
Copy policies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 776
13.3.8 Restoring the DR host from a production host . . . . . . . . . . . . . . . . . . . . . . . . . 776
13.3.9 Scratch runs during the disaster recovery test from the production host . . . . . 776
13.3.10 Scratch runs during the disaster recovery test from the DR host . . . . . . . . . . 776
13.3.11 Cleanup phase of a disaster recovery test . . . . . . . . . . . . . . . . . . . . . . . . . . . 777
13.3.12 Considerations for DR tests without Selective Write Protect mode . . . . . . . . 777
13.3.13 Returning to scratch without using Selective Write Protect. . . . . . . . . . . . . . . 780
13.4 DR for FlashCopy concepts and command examples . . . . . . . . . . . . . . . . . . . . . . . 781
13.4.1 Basic requirements and concepts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 782
13.4.2 DR Family . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 783
13.4.3 LIVECOPY enablement in a DR Family . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 784
13.4.4 Stopping FlashCopy and Write Protect Mode for a DR Family . . . . . . . . . . . . . 785
13.5 DR testing methods examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 791
13.5.1 Method 1: DR Testing using FlashCopy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 791
13.5.2 Method 2: Using Write Protect Mode on DR clusters . . . . . . . . . . . . . . . . . . . . 795
13.5.3 Method 3: DR Testing without Write Protect Mode . . . . . . . . . . . . . . . . . . . . . . 797
13.5.4 Method 4: Breaking the grid link connections . . . . . . . . . . . . . . . . . . . . . . . . . . 798
13.6 Expected failures during a DR test . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 801
Appendix B. IBM TS7700 implementation for IBM z/VM, IBM z/VSE, and
IBM z/TPF environments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 813
Software requirements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 814
Software implementation in z/VM and z/VSE. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 814
General support information . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 814
z/VM native support that uses DFSMS/VM. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 815
Native z/VSE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 817
VM/ESA and z/VM guest support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 818
z/VSE as a z/VM guest using a VSE Guest Server . . . . . . . . . . . . . . . . . . . . . . . . . . . 819
IDCAMS example to change the TCDB . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 875
JCL to change volumes in RMM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 875
REXX EXEC to update the library name. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 875
Notices
This information was developed for products and services offered in the US. This material might be available
from IBM in other languages. However, you may be required to own a copy of the product or product version in
that language in order to access it.
IBM may not offer the products, services, or features discussed in this document in other countries. Consult
your local IBM representative for information on the products and services currently available in your area. Any
reference to an IBM product, program, or service is not intended to state or imply that only that IBM product,
program, or service may be used. Any functionally equivalent product, program, or service that does not
infringe any IBM intellectual property right may be used instead. However, it is the user's responsibility to
evaluate and verify the operation of any non-IBM product, program, or service.
IBM may have patents or pending patent applications covering subject matter described in this document. The
furnishing of this document does not grant you any license to these patents. You can send license inquiries, in
writing, to:
IBM Director of Licensing, IBM Corporation, North Castle Drive, MD-NC119, Armonk, NY 10504-1785, US
This information could include technical inaccuracies or typographical errors. Changes are periodically made
to the information herein; these changes will be incorporated in new editions of the publication. IBM may make
improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time
without notice.
Any references in this information to non-IBM websites are provided for convenience only and do not in any
manner serve as an endorsement of those websites. The materials at those websites are not part of the
materials for this IBM product and use of those websites is at your own risk.
IBM may use or distribute any of the information you provide in any way it believes appropriate without
incurring any obligation to you.
The performance data and client examples cited are presented for illustrative purposes only. Actual
performance results may vary depending on specific configurations and operating conditions.
Information concerning non-IBM products was obtained from the suppliers of those products, their published
announcements or other publicly available sources. IBM has not tested those products and cannot confirm the
accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the
capabilities of non-IBM products should be addressed to the suppliers of those products.
Statements regarding IBM's future direction or intent are subject to change or withdrawal without notice, and
represent goals and objectives only.
This information contains examples of data and reports used in daily business operations. To illustrate them
as completely as possible, the examples include the names of individuals, companies, brands, and products.
All of these names are fictitious and any similarity to actual people or business enterprises is entirely
coincidental.
COPYRIGHT LICENSE:
This information contains sample application programs in source language, which illustrate programming
techniques on various operating platforms. You may copy, modify, and distribute these sample programs in
any form without payment to IBM, for the purposes of developing, using, marketing or distributing application
programs conforming to the application programming interface for the operating platform for which the sample
programs are written. These examples have not been thoroughly tested under all conditions. IBM, therefore,
cannot guarantee or imply reliability, serviceability, or function of these programs. The sample programs are
provided “AS IS”, without warranty of any kind. IBM shall not be liable for any damages arising out of your use
of the sample programs.
The following terms are trademarks or registered trademarks of International Business Machines Corporation,
and might also be trademarks or registered trademarks in other countries.
AIX®, CICS®, DB2®, DS8000®, EnergyScale™, FICON®, FlashCopy®, GDPS®, Geographically Dispersed Parallel Sysplex™, Global Technology Services®, IBM®, IBM Spectrum™, IBM Spectrum Protect™, IBM Z®, IBM z Systems®, IBM z13®, IMS™, OS/400®, Parallel Sysplex®, POWER®, POWER7®, POWER8®, RACF®, Redbooks®, Redbooks (logo)®, S/390®, System i®, System Storage®, Tivoli®, WebSphere®, z Systems®, z/OS®, z/VM®, z/VSE®, z13®
Linux is a trademark of Linus Torvalds in the United States, other countries, or both.
Linear Tape-Open, LTO, the LTO Logo and the Ultrium logo are trademarks of HP, IBM Corp. and Quantum in
the U.S. and other countries.
Microsoft, Windows, and the Windows logo are trademarks of Microsoft Corporation in the United States,
other countries, or both.
Other company, product, or service names may be trademarks or service marks of others.
Preface

This IBM® Redbooks® publication covers IBM TS7700 R4.1 through R4.1.2. The IBM
TS7700 is part of a family of IBM Enterprise tape products. This book is intended for system
architects and storage administrators who want to integrate their storage systems for optimal
operation.
This publication explains the all-new hardware that is introduced with IBM TS7700 release
R4.1 and the concepts associated with it. TS7700 R4.1 can be installed only on the IBM
TS7720, TS7740, and the all-new, hardware-refreshed TS7760 Models. The IBM TS7720T
and TS7760T (tape attach) partition mimics the behavior of the previous TS7740, but with
higher performance and capacity.
The IBM TS7700 offers a modular, scalable, and high-performance architecture for
mainframe tape virtualization for the IBM Z® environment. It is a fully integrated, tiered
storage hierarchy of disk and tape. This storage hierarchy is managed by robust storage
management microcode with extensive self-management capability. It includes the following
advanced functions:
Policy management to control physical volume pooling
Cache management
Redundant copies, including across a grid network
Copy mode control
The TS7760T writes data by policy to physical tape through attachment to high-capacity,
high-performance IBM TS1150 and IBM TS1140 tape drives installed in an IBM TS4500 or
TS3500 tape library.
The TS7760 models are based on high-performance and redundant IBM POWER8®
technology. They provide improved performance for most IBM Z tape workloads when
compared to the previous generations of IBM TS7700.
In addition to the material in this book, other IBM publications are available to help you better
understand the IBM TS7700.
If you have limited knowledge of the IBM TS7700, see the documentation for TS7700:
http://www.ibm.com/support/knowledgecenter/STFS69/welcome
A series of technical documents and white papers that describe many aspects of the IBM
TS7700 are available. Although the basics of the product are described in this book, more
detailed descriptions are provided in these documents. For that reason, most of these
detailed record descriptions are not in this book, although you are directed to the appropriate
technical document. For these additional technical documents, go to the IBM Techdocs
Technical Sales Library website and search for TS7700:
http://www.ibm.com/support/techdocs/atsmastr.nsf/Web/TechDocs
Familiarize yourself with the contents of Chapter 1, “Introducing the IBM TS7700” on page 3,
Chapter 2, “Architecture, components, and functional characteristics” on page 15, and
Chapter 3, “IBM TS7700 usage considerations” on page 111. These chapters provide a
functional description of all of the major features of the product, and they are a prerequisite for
understanding the other chapters.
If you are planning for the IBM TS7700, see Chapter 4, “Preinstallation planning and sizing”
on page 135 for hardware information. Information about planning for software begins in 4.3,
“Planning for software implementation” on page 171. Chapter 6, “IBM TS7700
implementation” on page 225 describes the implementation and installation tasks to set up an
IBM TS7700.
If you already have an IBM TS7700 or even an IBM 3494 Virtual Tape Server (VTS) installed,
see Chapter 7, “Hardware configurations and upgrade considerations” on page 243.
Chapter 8, “Migration” on page 299 describes migrating to a TS7700 environment.
Chapter 9, “Operation” on page 339 provides information about the operational aspects of the
IBM TS7700. This information includes the layout of the MI windows to help with daily
operational tasks. Chapter 10, “Host console operations” provides information about commands and procedures that are initiated from the host operating system.
If you have a special interest in the performance and monitoring tasks as part of your
operational responsibilities, see Chapter 11, “Performance and monitoring” on page 613.
Although this chapter gives a good overview, more information is available in the technical
documents on the Techdocs website.
For availability and disaster recovery specialists, and those individuals who are involved in the
planning and operation that is related to availability and disaster recovery, see Chapter 12,
“Copy Export” on page 735.
Information that is related to disaster recovery can be found in Chapter 5, “Disaster recovery”
on page 201 and Chapter 13, “Disaster recovery testing” on page 767.
Authors
This book was produced by a team working at IBM Tucson, Arizona.
Katja Denefleh works in the Advanced Technical Skill group in
Germany. She is responsible for providing second-level support for
high-end tape products for Europe, the Middle East, and Africa
(EMEA). Katja has worked more than 15 years as an IBM Z
systems programmer, and more than 10 years as a Mainframe
Architect for outsourcing clients. Her areas of expertise cover all
IBM Z hardware, IBM Parallel Sysplex®, and operations aspects of
large mainframe installations. Before joining IBM in 2003, she
worked for companies using IBM systems and storage in Germany.
Aderson Pacini works in the Tape Support Group in the IBM Brazil
Hardware Resolution Center. He is responsible for providing
second-level support for tape products in Brazil. Aderson has
extensive experience servicing a broad range of IBM products. He
has installed, implemented, and supported all of the IBM Tape
Virtualization Servers, from the IBM VTS B16 to the IBM TS7700
Virtualization Engine. Aderson joined IBM in 1976 as a Service
Representative, and his entire career has been in IBM Services.
Thanks to the following people for their contributions to this project:

Norbert Schlumberger
IBM SO Delivery, Server Systems Operations
Felipe Barajas, Michelle Batchelor, Ralph Beeston, Erika Dawson, Lawrence M. (Larry) Fuss,
Charles House, Katsuyoshi Katori, Khanh Ly, Kohichi Masuda, Takeshi Nohta, Kerri Shotwell,
Sam Smith, Joe Swingler, George Venech
IBM Systems
Tom Koudstaal
E-Storage B.V.
Thanks to the authors of the previous edition, which was published in January 2017:
Larry Coyne, Katja Denefleh, Derek Erdmann, Joe Hew, Sosuke Matsui, Aderson Pacini,
Michael Scott, Chen Zhu
Now you can become a published author, too
Here’s an opportunity to spotlight your skills, grow your career, and become a published
author, all at the same time. Join an ITSO residency project and help write a book in your area
of expertise, while honing your experience using leading-edge technologies. Your efforts will
help to increase product acceptance and customer satisfaction, as you expand your network
of technical contacts and relationships. Residencies run 2 - 4 weeks in length, and you can
participate either in person or as a remote resident working from your home base.
Learn more about the residency program, browse the residency index, and apply online:
ibm.com/redbooks/residencies.html
Comments welcome
Your comments are important to us.
We want our books to be as helpful as possible. Send us your comments about this book or
other IBM Redbooks publications in one of the following ways:
Use the online Contact us review Redbooks form:
ibm.com/redbooks
Send your comments in an email:
[email protected]
Mail your comments:
IBM Corporation, International Technical Support Organization
Dept. HYTD Mail Station P099
2455 South Road
Poughkeepsie, NY 12601-5400
This section describes the technical changes that are made in this edition of the book and in
previous editions. This edition might also include minor corrections and editorial changes that
are not identified.
Summary of Changes
for SG24-8366-01
for IBM TS7700 Release 4.1 and 4.1.2 Guide
as created or updated on August 24, 2018.
The IBM TS7700, which was introduced in 2006, is now in its fifth generation of IBM Tape
Virtualization products for mainframes. It replaces the highly successful IBM TotalStorage
Virtual Tape Server (VTS).
This publication explains the all-new hardware that is introduced with IBM TS7700 release
R4.1 and the concepts associated with it. TS7700 R4.1 can be installed only on the IBM
TS7720, TS7740, and the all-new, hardware-refreshed TS7760 Models. The IBM TS7720T
and TS7760T (tape attach) partition mimics the behavior of the previous TS7740, but with
higher performance and capacity.
The TS7700 is a modular, scalable, and high-performance architecture for mainframe tape
virtualization. This is a fully integrated, tiered storage hierarchy of disk and tape. It
incorporates extensive self-management capabilities consistent with IBM Information
Infrastructure initiatives.
These capabilities can improve performance and capacity. Better performance and capacity
help lower the total cost of ownership for tape processing, and help avoid human error. A
TS7700 can improve the efficiency of mainframe tape operations by efficiently using disk
storage, tape capacity, and tape speed. It can also improve efficiency by providing many tape
addresses.
TS7700 provides tape virtualization for the IBM z environment. Tape virtualization can help
satisfy the following requirements in a data processing environment:
Improved reliability
Reduction in the time that is needed for the backup and restore process
Reduction of services downtime that is caused by physical tape drive and library outages
Reduction in cost, time, and complexity by moving primary workloads to virtual tape
More efficient procedures for managing daily backup and restore processing
Infrastructure simplification through reduction of the number of physical tape libraries,
drives, and media
Figure: IBM Z hosts attach through FICON channels to the TS7700 Tape Volume Cache, with IBM 3592 tape drives providing the physical back end.
Emulated tape drives are also called virtual drives. To the host, virtual IBM 3490E tape drives
look the same as physical 3490E tape drives. Emulation is not apparent to the host and
applications. The host always writes to and reads from virtual tape drives. It never accesses
the physical tape drives (commonly referred to as the back-end tape drives) attached to
TS7740, TS7720T, and TS7760T configurations. In fact, the host does not even need to be aware that these tape drives exist.
Even an application that supports only 3490E tape technology can use the TS7700 without
any changes. Therefore, the application benefits from the high capacity and
high-performance tape drives in the back end. For TS7720 VEB and TS7760 VEC (disk-only)
configurations, no physical tape attachment exists. However, the virtual tape drives work the
same for the host.
Because the host exclusively accesses the virtual tape drives, all data must be written to or
read from emulated volumes in the disk-based TVC. These emulated tape volumes in the
TVC are called virtual volumes.
When the host requests a volume that is still in disk cache, the volume is virtually mounted.
No physical mount is required. After the virtual mount is complete, the host can access the
data at disk speed. Mounting scratch tapes is also virtual, and does not require a physical
mount.
Another benefit of tape virtualization is the large number of drives available to applications.
Each IBM TS7700 can support a maximum of 496 virtual tape devices. Often,
applications contend for tape drives, and jobs must wait because no physical tape drive is
available. Tape virtualization efficiently addresses these issues by providing many virtual tape
drives. The TS7740, TS7720T, and TS7760T manage the physical tape drives and physical
volumes in the tape library. They also control the movement of data between physical and logical volumes.
In the TS7740, TS7720T, and TS7760T, data that is written from the host into the TVC is
scheduled for copying to tape later. The process of copying data to tape that exists only in
cache is called premigration. When a volume is copied from cache to tape, the volume on the
tape is called a logical volume.
A physical volume can contain many logical volumes. The process of putting several logical
volumes on one physical tape is called stacking. A physical tape that contains logical volumes
is referred to as a stacked volume. This concept does not apply to TS7720 VEB and TS7760
VEC because no physical tape devices are attached to these models.
Without a TS7740, TS7720T, or TS7760T, many applications would be unable to fill the high-capacity media of modern tape technology, and you might end up with many under-used cartridges. This wastes space and requires an excessive number of cartridge slots. Tape virtualization eliminates unused volume capacity and fully uses physical tape capacity when present. It also lets you take advantage of the full potential of modern tape drive and tape media technology, without changes to your applications or job control language (JCL).
When space is required in the TVC of a TS7740, TS7720T, or TS7760T for new data,
volumes that were copied to tape are removed from the cache. By default, removal is based
on a least recently used (LRU) algorithm. Using this algorithm ensures that no new data or
recently accessed data is removed from cache. The process of deleting volumes in cache that
were premigrated to tape is called migration. Volumes that were deleted in the cache and
exist only on tape are called migrated volumes.
In a TS7720 and TS7760 (disk-only) configuration, no migrated volumes exist because there
is no physical tape attachment. Instead, logical volumes are maintained in disk until they
expire. For this reason, cache capacity for the TS7720 and TS7760 is larger than the capacity
for the TS7740.
When a TS7720 and TS7760 is a member of a multicluster hybrid grid, virtual volumes in the
TS7720 and TS7760 cache can be automatically removed. This removal is done by using a
Volume Removal Policy if another valid copy exists elsewhere in the grid. A TS7700 grid
refers to two or more physically separate TS7700 clusters that are connected to one another
by using a customer-supplied Internet Protocol network.
On the TS7740, TS7720T, and TS7760T, a previously migrated volume must be copied back
from tape into the TVC to be accessed. It must be copied because the host has no direct
access to the physical tapes. When the complete volume is copied back into the cache, the
host can access the data. The process of copying data back from tape to the TVC is called
recall.
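The lifecycle that these paragraphs describe (write into the TVC, premigration to a stacked volume, migration out of cache, and recall on a later mount) can be summarized in a small model. The following Python sketch is purely illustrative; the class, state, and method names are assumptions made for this example and are not part of any TS7700 interface.

```python
from enum import Enum, auto

class State(Enum):
    CACHE_ONLY = auto()     # virtual volume exists only in the TVC
    PREMIGRATED = auto()    # copy in the TVC and on a stacked physical volume
    MIGRATED = auto()       # removed from the TVC, exists only on physical tape

class Volume:
    """Illustrative model of a TS7740/TS7700T logical volume lifecycle."""
    def __init__(self, volser):
        self.volser = volser
        self.state = State.CACHE_ONLY   # a host write lands in the TVC first

    def premigrate(self):
        # copy the cache-resident volume to a stacked physical volume
        if self.state is State.CACHE_ONLY:
            self.state = State.PREMIGRATED

    def migrate(self):
        # space management deletes the cache copy of a premigrated volume
        if self.state is State.PREMIGRATED:
            self.state = State.MIGRATED

    def mount(self):
        # a migrated volume must be recalled into the TVC before host access
        if self.state is State.MIGRATED:
            self.state = State.PREMIGRATED   # back in cache, still on tape
        return self.state

vol = Volume("ALS124")
vol.premigrate()
vol.migrate()
print(vol.mount())   # State.PREMIGRATED: recall completed, host can read the data
```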
Figure 1-2 TS7740, TS7720T, and TS7760T Tape Volume Cache processing (host read and write to virtual volumes in the TVC, premigration to stacked volumes, and recall of migrated volumes back into the TVC)
With a TS7720 VEB and TS7760 VEC (disk-only), the virtual volumes are accessed by the
host within the TVC.
Figure 1-3 shows the IBM TS7720 and IBM TS7760 TVC processing.
Figure 1-3 TS7720 and TS7760 Tape Volume Cache processing (host read and write to virtual volumes directly in the TVC)
The clusters in a grid are interconnected through grid link adapters over a customer-supplied IP network; these sets of links form a multi-cluster grid configuration. Adapter types cannot be mixed in
a cluster. They can vary within a grid, depending on your network infrastructure. Logical
volume attributes and data are replicated across the clusters in a grid. Any data that is
replicated between the clusters is accessible through any other cluster in the grid
configuration. Through remote volume access, you can reach any virtual volume through any
virtual device. You can reach volumes even if a replication has not been made.
Setting policies on the TS7700 defines where and when you have multiple copies of your
data. You can also specify for certain kinds of data, such as test data, that you do not need a
secondary or tertiary copy.
You can group clusters within a grid into families. Grouping enables the TS7700 to make
improved decisions for tasks, such as replication or TVC selection.
Depending on the configuration, multiple TS7700 tape products that form a grid provide the
following types of solutions:
High availability (HA)
Disaster recovery (DR)
HA and DR
Metro and global business continuance
Before R3.2, a multi-cluster grid configuration presented itself to the attached hosts as one
large library with the following maximums:
512 virtual devices for a two-cluster grid
768 virtual tape devices for a three-cluster grid
1024 virtual tape devices for a four-cluster grid
1536 virtual devices for a six-cluster grid
These numbers can now be exceeded in steps of 16 virtual drives, up to 496 virtual devices
per cluster.
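As a simple illustration of these device counts, the following sketch computes the number of virtual devices that a grid presents to the host. It is only an arithmetic aid based on the values quoted above (a 256-device default per cluster, expandable in 16-device increments up to 496); the function name is an assumption for this example.

```python
DEFAULT_DEVICES = 256   # default virtual devices per cluster
MAX_DEVICES = 496       # maximum per cluster with additional features
INCREMENT = 16          # devices are added in steps of 16

def cluster_devices(extra_increments=0):
    """Virtual devices on one cluster after adding 16-device increments."""
    return min(DEFAULT_DEVICES + extra_increments * INCREMENT, MAX_DEVICES)

# A four-cluster grid at the default device count presents 1024 devices;
# the same grid fully expanded presents 4 x 496 = 1984 devices.
print(4 * cluster_devices())                      # 1024
print(4 * cluster_devices(extra_increments=15))   # 1984
```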
The copying of the volumes in a grid configuration is handled by the clusters, and it is not
apparent to the host. By intermixing TS7720, TS7740, and TS7760 models, you can build a hybrid grid of two, three, four, five, six, seven, or eight clusters.
Figure 1-4 Multiple TS7700 tape products that depict possible host and grid connections
In a grid configuration, each TS7740, TS7720T, and TS7760T manages its own set of physical volumes, and maintains the relationship between the logical volumes and the physical volumes on which they are located.
The clusters in a TS7700 grid can be, but do not need to be, geographically dispersed. In a
multiple cluster grid configuration, two TS7700 clusters are often located within 100
kilometers (km) or 62 miles of each other, whereas the remaining clusters can be located
more than 1000 km (621.37 miles) away. This configuration provides both a highly available
and redundant regional solution. It also provides a remote DR solution outside of the region.
A multi-cluster grid supports the concurrent growth and reduction of cluster counts.
Certain information services must be high speed to support websites and databases. In some
cases, information services must be multiplexed to multiple locations, or require extra
encryption and overwrite protection. IBM Information Infrastructure helps you apply the
correct services and service levels so that vital information can be delivered. IBM Information
Infrastructure solutions are designed to help you manage this information explosion. They
also address challenges of information compliance, availability, retention, and security.
This approach helps your company move toward improved productivity and reduced risk
without driving up costs. The IBM TS7700 is part of the IBM Information Infrastructure. This
strategy delivers information availability, supporting continuous and reliable access to data. It
also delivers information retention, supporting responses to legal, regulatory, or investigatory
inquiries for information.
You can expect the following types of benefits from tape virtualization:
Brings efficiency to the tape operation environment
Reduces the batch window
Provides HA and DR configurations
Provides fast access to data through caching on disk
Provides optional use of current tape drive, tape media, and tape automation technology
Provides optional use of filling high capacity media to 100%
Provides many tape drives for concurrent use
Provides data consolidation, protection, and sharing
Requires no additional software
Reduces the total cost of ownership
The TS7700 also includes a set of commands and enhanced statistical reporting.
To prevent confusion, IBM uses a convention to differentiate between binary and decimal
units. At the kilobyte level, the difference between decimal and binary units of measurement is
relatively small (2.4%). This difference grows as data storage values increase. When values
reach terabyte levels, the difference between decimal and binary units approaches 10%.
Both decimal and binary units are available throughout the TS7700 Tape Library
documentation. Table 1-1 compares the names, symbols, and values of the binary and
decimal units.
Table 1-1 Names, symbols, and values of the binary and decimal units
Decimal Binary
Table 1-2 Increasing percentage of difference between binary and decimal units
Decimal value Binary value Percentage difference
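The growing gap between decimal and binary units can be verified with a short calculation. The following sketch is illustrative only; it computes the percentage by which each binary unit exceeds its decimal counterpart, matching the approximately 2.4% and 10% figures quoted above.

```python
# Percentage by which a binary unit exceeds its decimal counterpart.
units = ["kilo", "mega", "giga", "tera", "peta"]
for i, name in enumerate(units, start=1):
    decimal = 10 ** (3 * i)   # for example, 1 TB = 10**12 bytes
    binary = 2 ** (10 * i)    # for example, 1 TiB = 2**40 bytes
    diff = (binary - decimal) / decimal * 100
    print(f"{name}: {diff:.1f}% difference")

# Output: kilo 2.4%, mega 4.9%, giga 7.4%, tera 10.0%, peta 12.6%
```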
First, stand-alone and clustered configuration features are explained, followed by features
that apply only to multi-cluster grid configurations.
Though there are some differences between these models, the underlying architecture is the
same. If a function or feature is unique or behaves differently for a given model, it is clearly
stated. If not, you can assume that it is common across all models.
When the TS7700 is referenced, it implies all models and types, including the TS7760D,
TS7760T, TS7720D, TS7720T, and TS7740. When the function is only applicable to models
that are disk-only, then TS7700D is used, if they are only applicable to tape attached models,
then TS770T is used. If the function is only applicable to a specific version of the TS7700,
TS7760D, TS7760T, TS7720D, TS7720T, TS7740 or a subset, the product name or names
are referenced.
IBM decided that it was time to create a next-generation solution with a focus on scalability
and business continuance. Many components of the original VTS were retained, although
others were redesigned. The result was the TS7700 Virtualization Engine.
Nodes
Nodes are the most basic components in the TS7700 architecture. A node has a separate
name, depending on the role that is associated with it. There are three types of nodes:
Virtualization nodes
Hierarchical data storage management nodes
General nodes
Virtualization node
A vNode is a code stack that presents the virtual image of a library and drives to a host
system. When the TS7700 is attached as a virtual tape library, the vNode receives the tape
drive and library requests from the host. The vNode then processes them as real devices
process them. It then converts the tape requests through a virtual drive and uses a file in the
cache subsystem to represent the virtual tape image. After the logical volume is created or
altered by the host system through a vNode, it is in disk cache.
Hierarchical data storage management node
The hnode is the only node that is aware of physical tape resources and the relationships
between the logical volumes and physical volumes. It is also responsible for any replication of
logical volumes and their attributes between clusters. An hnode uses standardized interfaces,
such as Transmission Control Protocol/Internet Protocol (TCP/IP), to communicate with
external components.
General node
A general node (gnode) can be considered a vNode and an hnode sharing a physical
controller. The current implementation of the TS7700 runs on a gnode. The engine has both a
vNode and hnode that are combined in an IBM POWER8 processor-based server.
Figure: vNode and hNode code stacks on physical controllers; a gNode combines a vNode and an hNode on one controller.
Cluster
The TS7700 cluster combines the TS7700 server with one or more external (from the server’s
perspective) disk subsystems. This subsystem is the TS7700 cache controller. This
architecture enables expansion of disk cache capacity.
Figure 2-3 TS7700 cluster
A TS7700 cluster provides Fibre Channel connection (IBM FICON) host attachment, and a
default count of 256 virtual tape devices. Features are available that enable the device count
to reach up to 496 devices per cluster. The IBM TS7740 and IBM TS7700T cluster also
includes the assigned TS3500 or TS4500 tape library partition, fiber switches, and tape
drives. The IBM TS7720 and TS7760 can include one or more optional cache expansion
frames.
TS7740 cache controller
The TS7700 Cache Controller and associated disk storage media act as cache storage for
data. The capacity of each disk drive module (DDM) depends on your configuration.
The TS7700 Cache Drawer acts as an expansion unit for the TS7700 Cache Controller. One
or more controllers and their expansion drawers are collectively referred to as the TS7700
Tape Volume Cache, or often named the TVC. The amount of cache available per TS7700
Tape Volume Cache depends on your configuration.
The TS7760 Cache (CSA, CXA) provides a new TVC protection, called Dynamic Disk Pooling
(DDP).
The TS7740 Cache provided a RAID 6 (since CC9) and RAID 5 (up to CC8) protected TVC to
temporarily contain compressed virtual volumes before they are offloaded to physical tape.
The TS7720 and TS7720T CS9/CS9 use RAID 6 protection. If an existing installation is
upgraded, the existing cache remains protected by RAID 6, whereas the new CSA/CXA cache uses DDP for protection.
The limited Peer-to-Peer (P2P) design was one of the main reasons that the previous VTS needed to be redesigned. The new TS7700 replaced the P2P concepts with an industry-leading new technology referred to as a grid.
Fast path: Seven and eight cluster grid configurations are available with a request for price
quotation (RPQ).
A grid configuration and all virtual tape drives emulated in all configured clusters appear as
one large library to the attached IBM Z hosts.
Logical volumes that are created within a grid can be selectively replicated to one or more
peer clusters by using a selection of different replication policies. Each replication policy or
Copy Consistency Point provides different benefits, and can be intermixed. The grid
architecture also enables any volume that is located within any cluster to be accessed
remotely, which enables ease of access to content anywhere in the grid.
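As a hedged illustration of how replication policies can differ per cluster and per workload, the following sketch represents Copy Consistency Points as a simple table. The dictionary form and the Management Class names are assumptions for this example only; in practice, these settings are assigned through Management Class definitions on the TS7700 MI.

```python
# Illustrative Copy Consistency Point settings for a three-cluster grid.
# 'R' = RUN (copy consistent at rewind/unload), 'D' = Deferred,
# 'S' = Synchronous mode copy, 'N' = No copy.
management_classes = {
    "MCPROD": {"cluster0": "R", "cluster1": "R", "cluster2": "D"},  # two immediate copies, one deferred
    "MCHSM":  {"cluster0": "S", "cluster1": "S", "cluster2": "N"},  # synchronous pair, no third copy
    "MCTEST": {"cluster0": "R", "cluster1": "N", "cluster2": "N"},  # test data: single copy only
}

def copy_targets(mc_name):
    """Clusters that hold a copy of volumes assigned the given Management Class."""
    policy = management_classes[mc_name]
    return [cluster for cluster, mode in policy.items() if mode != "N"]

print(copy_targets("MCTEST"))   # ['cluster0'] - no replication for test data
```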
In general, any data that is initially created or replicated between clusters is accessible
through any available cluster in a grid configuration. This concept ensures that data can still
be accessed even if a cluster becomes unavailable. In addition, it can reduce the need to
have copies in all clusters because the adjacent or remote cluster’s content is equally
accessible.
A grid can consist of all one TS7700 model type, or any mixture of model types, including TS7760D, TS7760T, TS7720D, TS7720T, and TS7740. When a mixture of models is present
within the same grid, it is referred to as a hybrid grid.
The term multi-cluster grid is used for a grid with two or more clusters. For a detailed
description, see 2.3, “Multi-cluster grid configurations: Components, functions, and features”
on page 61.
The TS7760 provides a disk-only model and an option to attach to a physical tape library with
TS1100 tape drives, as did its predecessor the TS7720. To support the IBM TS4500, R4.0
and later needs to be installed on a TS7720T. Both models deliver a maximum of 2.5 PB
usable data in cache.
The TS7740 provides up to 28 terabytes (TB) of usable disk cache space, and supports the
attachment to the IBM TS3500 tape library and TS1100 family of physical tape drives.
The TS7720 was introduced in response to the need for a larger disk cache. Hybrid grid configurations combined
the benefits of both the TS7720 (with its large disk cache) and the TS7740 (with its
economical and reliable tape store). Through Hybrid grids, large disk cache repositories and
physical tape offloading were all possible. The next evolution of the TS7700 was combining
the benefits of both TS7720 and TS7740 models into one solution.
Through the combination of the technologies, the TS7760, TS7720, TS7740, and Hybrid grid
benefits can now be achieved with a single product. All features and functions of the TS7720,
the TS7740, and hybrid grid have been maintained, although additional features and
functions have been introduced to further help with the industry’s evolving use of IBM Z virtual
tape.
In addition to the features and functions that are provided on the TS7700D and TS7740, two
key, unique features were introduced as part of the R3.2 TS7720T product release:
Disk Cache Partition, which provides better control of how workloads use the disk cache
Delay Premigration, or the ability to delay movement to tape
The TS7700T supports the ability to create 1 - 7 tape-managed partitions. Each partition is
user-defined in 1 TB increments. Workloads that are directed to a tape-managed partition are
managed independently concerning disk cache residency. After you create 1 - 7
tape-managed partitions, the disk cache capacity that remains is viewed as the resident-only
partition. Partitions can be created, changed, and deleted concurrently from the Management
Interface (MI).
Within this document, the tape-managed partitions are referred to as CP1 - CP7, or
generically as cache partitions (CPx). The resident-only partition is referred to as CP0. The
partitions are logical, and have no direct relationship to one or more physical disk cache
drawers or types. All CPx partitions can use back-end physical tape, but the CP0 partition has
no direct access to back-end physical tape. In addition, CPx partitions have no direct
relationship to physical tape pools. Which partition and which pool are used for a given
workload is independent.
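A minimal sketch of the partition layout follows. The cache size and partition sizes are assumed example values, not a recommended configuration; the point is simply that CP1 - CP7 are defined in 1 TB increments and that CP0 is whatever disk cache remains.

```python
TOTAL_CACHE_TB = 100   # assumed usable disk cache of a TS7700T (example value)

# Tape-managed partitions CP1-CP7 are user-defined in 1 TB increments.
cpx_partitions = {"CP1": 20, "CP2": 10, "CP3": 5}   # sizes in TB (example values)

def resident_only_partition(total_tb, cpx):
    """CP0 is whatever disk cache is not assigned to a CPx partition."""
    assigned = sum(cpx.values())
    if assigned > total_tb:
        raise ValueError("CPx partitions exceed the installed disk cache")
    return total_tb - assigned

print(resident_only_partition(TOTAL_CACHE_TB, cpx_partitions))   # CP0 = 65 TB
```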
Storage Class (SC) is used to direct workloads to a given partition. There is no automatic
method to have content move between partitions. However, it can be achieved through
mount/demount sequences, or through the LIBRARY REQUEST command.
Workloads that are directed to a given CPx partition are handled similarly to a TS7740,
except that the hierarchal storage management of the CPx content is only relative to
workloads that target the same partition. For example, workloads that target a particular CPx
partition do not cause content in a different CPx partition to be migrated. This enables each
workload to have a well-defined disk cache residency footprint.
Content that is replicated through the grid accepts the SC of the target cluster, and uses the
assigned partition. If more than one TS7700T exists in a grid, the partition definitions of the
two or more TS7700Ts do not need to be the same.
Content queued for premigration is already compressed, so the premigration queue size is
based on post-compressed capacities. For example, if you have a host workload that
compresses at 3:1, 6 TB of host workload results in only 2 TB of content queued for
premigration.
PMPRIOR and PMTHLVL are LIBRARY REQUEST-tunable thresholds that are used to help manage
and limit content in the premigration queue. As data is queued for premigration, premigration activity is minimal until the PMPRIOR threshold is crossed. When crossed, the
premigration activity increases based on the defined premigration drive count.
If the amount of content in the premigration queue continues to increase, the PMTHLVL
threshold is crossed, and the TS7700T intentionally begins to throttle inbound host and copy
activity into all CPx partitions to maintain the premigration queue size. This is when the
TS7700T enters the sustained state of operation. The PMPRIOR and PMTHLVL thresholds can be
no larger than the FC5274 resulting premigration queue size. For example, if three FC5274
features are installed, PMTHLVL must be set to a value of 3 TB or smaller.
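The interplay of compression, the premigration queue, and the PMPRIOR and PMTHLVL thresholds can be sketched as follows. The numbers and the function are illustrative assumptions only (actual behavior is controlled by the TS7700 microcode and the LIBRARY REQUEST settings), but they mirror the examples above: a 3:1 compression ratio and three FC5274 features, which give a 3 TB maximum premigration queue.

```python
FC5274_FEATURES = 3
MAX_QUEUE_TB = FC5274_FEATURES * 1.0   # each FC5274 enables 1 TB of premigration queue
PMPRIOR_TB = 2.0                       # assumed tuning values; must not exceed MAX_QUEUE_TB
PMTHLVL_TB = 3.0

def premigration_state(host_written_tb, compression_ratio=3.0):
    """Classify the premigration queue for a given amount of host data."""
    queued_tb = host_written_tb / compression_ratio   # the queue counts post-compression data
    if queued_tb >= PMTHLVL_TB:
        return queued_tb, "throttling host and copy activity (sustained state)"
    if queued_tb >= PMPRIOR_TB:
        return queued_tb, "premigration ramped up to the defined drive count"
    return queued_tb, "minimal premigration activity (peak state)"

# 6 TB of host data at 3:1 compression results in only 2 TB queued, reaching PMPRIOR.
print(premigration_state(6.0))    # (2.0, 'premigration ramped up ...')
print(premigration_state(9.5))    # (3.17, 'throttling host and copy activity ...')
```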
After a logical volume is premigrated to tape, it is no longer counted against the premigration
queue. The volume exists in both disk cache and physical tape until the migration policies
determine whether and when the volume should be deleted from disk cache.
How many FC5274 features should be installed is based on many factors. The IBM tape
technical specialists can help you determine how many are required based on your specific
configuration.
Another reason that you might want to delay premigration is to run the TS7700T longer in the
peak mode of operation, which can help reduce your job run times. By delaying premigration,
the amount of content in the premigration queue can be reduced, which helps eliminate any
throttling that can occur if the PMTHLVL threshold is crossed while running your workloads.
The delay normally is enough to get you through your daily job window. However, this is only
valid for environments that have a clearly defined window of operation. The delayed
premigration content is eventually queued, and any excessive queuing past the PMTHLVL
threshold might result in heavy throttling. If workloads continue throughout the day, this might
not be a feasible option.
The delay period is specified in hours and is an attribute of the SC. Independent of which CPx
partition the data is assigned to, the delay period can be unique per workload.
If CP0 has no remaining free space, further overspill is prevented, and the CPx partitions are not
allowed to overcommit any further. An LI REQUEST option that was introduced in R4.0 can be used
to reserve space in CP0; this reserved space cannot be used for overspill purposes.
In either case, logical volumes can be moved from CP0 to CPx, from CPx to CP0, and from
CPx to a different CPx partition. Movement rules are as described.
When a volume is written from load point, the eight-character SMS construct names (as
assigned through your automatic class selection (ACS) routines) are passed to the library. At
the library’s MI, you can then define policy actions for each construct name, enabling you and
the TS7700 to better manage your volumes. For other IBM Z platforms, constructs can be
associated with the volumes when the volume ranges are defined through the library’s MI.
Each of these constructs is used to determine specific information about the data that must
be stored. All construct names are also presented to the TS7700, and they need an equivalent
definition at the library. You can define these constructs in advance on the TS7700 MI. For more
information, see “Defining TS7700 constructs” on page 555. If a construct name is sent to the
TS7700 without having been predefined there, the TS7700 creates the construct with default
parameters.
Tip: Predefine your SMS constructs on the TS7700. The constructs that are created
automatically might not be suitable for your requirements.
Connectivity is defined at both the library level and the SG level. If an SG is connected to
certain systems, any libraries that are associated with that SG must be connected to the
same systems. You can direct allocations to a local or remote library, or to a specific library by
assigning the appropriate SG in the SG ACS routine.
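As an illustration only, a minimal SG ACS routine fragment might route a backup workload to the
SG that is connected to the TS7700 composite library, while all other allocations go elsewhere.
The data set mask and the SG names SGTS7700 and SGOTHER are hypothetical:

PROC STORGRP
  FILTLIST BKUPDSN INCLUDE(PROD.BACKUP.**)
  SELECT
    WHEN (&DSN = &BKUPDSN)
      SET &STORGRP = 'SGTS7700'
    OTHERWISE
      SET &STORGRP = 'SGOTHER'
  END
END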
Important: The DATACLAS assignment is applied to all clusters in a grid when a volume is
written from beginning of tape. Given that SG, SC, and MC can be unique per cluster, they
are independently recognized at each cluster location for each mount/demount sequence.
Host commands
Several commands to control and monitor your environment are available. They are described
in detail in Chapter 6, “IBM TS7700 implementation” on page 225, Chapter 8, “Migration” on
page 299, Chapter 9, “Operation” on page 339, and Appendix F, “Library Manager volume
categories” on page 877. These major commands are available:
D SMS,LIB Display library information for composite and distributed libraries.
D SMS,VOLUME Display volume information for logical volumes.
LI REQ The LIBRARY REQUEST command, also known as the Host Console
Request function, is initiated from a z/OS host system to a TS7700
composite library or a specific distributed TS7700 library within a grid.
Use the LIBRARY REQUEST command to request information that is
related to the current operational state of the TS7700, its logical and
physical volumes, and its physical resources.
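The following entries illustrate typical invocations. The library name COMPLIB1 and the VOLSER
are placeholders, and STATUS,GRID is only one of many LI REQ keyword combinations:

D SMS,LIB(COMPLIB1),DETAIL
D SMS,VOLUME(VT0999)
LI REQ,COMPLIB1,STATUS,GRID

The first command displays composite or distributed library status, the second displays the status
of a single logical volume, and the third requests grid status information from the composite library.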
There is a subtle but important difference to understand. The DS QLIB command can
return different data depending on which host it is entered from. An LI command returns the
same data regardless of the host, if both hosts have full accessibility.
The changes to the event notification settings are grid wide and persist when new
microcode levels are installed.
In addition, you can back up these settings and restore them independently on other grids,
which simplifies maintenance of a multi-grid environment.
For information about content-based retrieval (CBRxxxx) messages, see the IBM TS7700 Series
Operator Informational Messages white paper at:
http://www.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/WP101689
Tools
Many helpful tools are provided for the TS7700. For more information, see Chapter 9,
“Operation” on page 339.
Status information is transmitted to the IBM Support Center for problem evaluation. An IBM
Service Support Representative (IBM SSR) can be dispatched to the installation site if
maintenance is required. Call Home is part of the service strategy that is adopted in the
TS7700 family. It is also used in a broad range of tape products, including VTS models and
tape controllers, such as the IBM System Storage® 3592-C07.
The Call Home information for the problem is transmitted with the appropriate information to
the IBM product support group. This data includes the following information:
Overall system information, such as system serial number and Licensed Internal Code
level
Details of the error
Error logs that can help to resolve the problem
After the Call Home is received by the assigned IBM support group, the associated
information is examined and interpreted. Following analysis, an appropriate course of action
is defined to resolve the problem. For instance, an IBM SSR might be sent to the site location
to take the corrective actions. Alternatively, the problem might be repaired or resolved
remotely by IBM support personnel through a broadband (if available) or telephone (if
necessary) connection.
The TS3000 Total Storage System Console (TSSC) is the subsystem component responsible
for placing the service call or Call Home when necessary. Since model 93p and TSSC release
V4.7, only a broadband connection is supported.
Next, general information is provided about the components, functions, and features used in a
TS7700 environment. The general concepts and information are also in 2.2, “Stand-alone
cluster: Components, functions, and features” on page 30. Only deviations and additional
information for multi-cluster grid are in 2.3, “Multi-cluster grid configurations: Components,
functions, and features” on page 61.
You must be able to identify not only the logical entity that represents the virtual drives and
volumes, but also address the single entity of a physical cluster. Therefore, two types of libraries exist: a
composite library and a distributed library. Each type is associated with a library name and a
Library ID.
Composite library
The composite library is the logical image of the stand-alone cluster or grid that is presented
to the host. All logical volumes and virtual drives are associated with the composite library. In
a stand-alone TS7700, the host sees a logical tape library with up to 31 3490E tape CUs.
These CUs each have 16 IBM 3490E tape drives, and are connected through 1 - 8 FICON
channels. The composite library is defined through the Interactive Storage Management
Facility (ISMF). A composite library is made up of one or more distributed libraries.
Important: A composite library ID must be defined both for a multi-cluster grid and a
stand-alone cluster. For a stand-alone cluster, the composite library ID must not be the
same as the distributed library ID. For a multiple grid configuration, the composite library ID
must differ from any of the unique distributed library IDs. Both the composite library ID and
distributed library ID are five-digit hexadecimal strings.
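For example, a two-cluster grid might use a composite library ID of CA010 with distributed library
IDs of DA01A and DA01B. A stand-alone cluster still needs two distinct IDs, such as CA020 for the
composite library and DA020 for the single distributed library. These values are illustrative only;
use IDs that fit your own naming conventions.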
The Library ID is used to tie the host’s definition of the library to the actual hardware.
The host operating system (OS) sees the TVC as virtual IBM 3490E Tape Drives, and the
3490 tape volumes are represented by storage space in a fault-tolerant disk subsystem. The
host never writes directly to the physical tape drives attached to a TS7740 or TS7700T.
Originally, the TS7760 was delivered with 4 TB disk drive support. Since August 2017, only
8 TB disk drives are available. You can mix 4 TB and 8 TB drives within a frame, but you cannot
mix them within a drawer or enclosure.
The following fault-tolerant TVC options are available. The TS7760 CSA/XSA cache is
protected by Dynamic Disk Pooling (DDP). For TS7740 configurations that use CC6, CC7,
or CC8 technology, the TVC is protected with RAID 5. For all TS7720D, TS7720T, or TS7740
configurations that use CC9 technology, RAID 6 is used.
DDP on the TS7760 models provides not only a higher protection level, but also faster
rebuild times. A DDP is built from either one or two drawers. In a single-drawer DDP, the
data can be re-created when up to two disks become unavailable. In a two-drawer
DDP configuration, up to four disks can become unavailable and the data can still be
re-created, but only two disks can be rebuilt at the same time.
Whether a DDP is built from a single drawer or from two drawers depends on your
configuration. With an even number of drawers, all DDPs are two-drawer DDPs. If an odd number
of drawers is installed, the last drawer in the frame is configured as a single-drawer DDP.
DDP no longer uses a global spare concept. Instead, free space is provided on each of
the 12 DDMs in a CSA/XSA drawer. In case of a DDM failure, the data is read from all
remaining DDMs and written into the free space on those remaining DDMs. This procedure
is called reconstruction.
As opposed to a RAID-protected system, the data is not copied back to the replacement DDM after
the failing DDM has been replaced. Instead, newly arriving data is used to rebalance the
usage of the DDMs. This behavior uses fewer internal resources and allows a faster return to
normal processing.
For older cache models, the RAID configurations provide continuous data availability to users.
If up to one data disk (RAID 5) or up to two data disks (RAID 6) in a RAID group become
unavailable, the user data can be re-created dynamically from the remaining disks by using
parity data that is provided by the RAID implementation. The RAID groups contain global hot
spare disks to take the place of a failed hard disk drive (HDD).
Using parity, the RAID controller rebuilds the data from the failed disk onto the hot spare as a
background task. This process enables the TS7700 to continue working while the IBM SSR
replaces the failed HDD in the TS7700 Cache Controller or Cache Drawer.
The TS7720T and the TS7760T support cache partitions. Virtual volumes in the resident-only
partition (CP0) are treated as they are in a TS7720. Virtual volumes in the tape-attached
partitions (CP1 - CP7) are treated as they are in a TS7740. For a detailed description of
cache partitions, see 2.1.6, “Introduction of the TS7700T” on page 22.
Each logical volume, like a real volume, has the following characteristics:
Has a unique volume serial number (VOLSER) known to the host and to the TS7700.
Is loaded and unloaded on a virtual device.
Supports all tape write modes, including Tape Write Immediate mode.
Contains all standard tape marks and data blocks.
Supports an IBM, International Organization for Standardization (ISO), or American
National Standards Institute (ANSI) standard label.
Prior to R3.2, Non-Initialized tapes or scratch mounts required that the tape be written
from beginning of tape (BOT) for the first write. Appends could then occur at any legal
position.
With R3.2 and later, Non-Initialized auto-labeled tapes allow the first write to occur at any
position between BOT and just after the first tape mark after the volume label.
The application is notified that the write operation is complete when the data is written to a
buffer in vNode. The buffer is implicitly or explicitly synchronized with the TVC during
operation. Tape Write Immediate mode suppresses write data buffering.
Each host-written record has a logical block ID.
The default logical volume sizes of 400 MiB or 800 MiB are defined at insert time. These
volume sizes can be overwritten at every individual scratch mount, or any private mount
where a write from BOT occurs, by using a DC construct option.
Virtual volumes can exist only in a TS7700. You can direct data to a virtual tape library by
assigning a system-managed tape SG through the ACS routines. SMS passes DC, MC, SC,
and SG names to the TS7700 as part of the mount operation. The TS7700 uses these
constructs outboard to further manage the volume. This process uses the same policy
management constructs defined through the ACS routines.
Beginning with TS7700 R2.0, a maximum of 2,000,000 virtual volumes per stand-alone
cluster or multi-cluster grid was introduced. With a model V07/VEB server at R3.0, and later
with the model VEC server at R4.0, a maximum of 4,000,000 virtual volumes per
stand-alone cluster or multi-cluster grid is supported.
The default maximum number of supported logical volumes is still 1,000,000 per grid. Support
for extra logical volumes can be added in increments of 200,000 volumes by using FC5270.
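For example, reaching the 4,000,000-volume maximum on a VEC-based configuration requires
15 FC5270 increments (1,000,000 + 15 x 200,000 = 4,000,000).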
Larger capacity volumes (beyond 400 MiB and 800 MiB) can be defined through DC and
associated with CST (MEDIA1) or ECCST (MEDIA2) emulated media.
The VOLSERs for the logical volumes are defined through the MI when inserted. Virtual
volumes go through the same cartridge entry processing as native cartridges inserted into a
tape library that is attached directly to an IBM Z host.
After virtual volumes are inserted through the MI, they are placed in the insert category and
handled exactly like native cartridges. When the TS7700 is varied online to a host, or after an
insert event occurs, the host operating system interacts with the library by using the object
access method (OAM).
Depending on the definitions in the DEVSUPxx and EDGRMMxx parmlib members, the host
operating system assigns newly inserted volumes to a particular scratch category. The host
system requests a particular category when it needs scratch tapes, and the TS7700 knows
which group of volumes to use to satisfy the scratch request.
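As a hedged illustration, the scratch categories are assigned in the DEVSUPxx parmlib member.
The category values that are shown here are placeholders and must match the scratch categories
that are defined for your TS7700:

DEVSUPxx:
  MEDIA1=0011,
  MEDIA2=0012

DFSMSrmm or another TMS then performs return-to-scratch processing against these categories.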
Before R4.1.2, compression was based only on the IBMLZ1 algorithm within the FICON
channel adapter in a TS7700. Two additional TS7700 CPU-based algorithms can now be
selected:
LZ4
ZSTD
To use the new compression algorithms, all clusters in a grid must have R4.1.2 or later
microcode. This microcode can also be installed on TS7740 (V07) and TS7720 (VEB) if
32 GB main memory is configured. Note that, especially with ZSTD compression, there might
be performance implications for TS7700 throughput with heavily loaded existing
workloads running on older hardware. VEB/V07 clients should test how the new algorithms
work on their TS7700 configuration with small workloads before putting them into production.
To avoid any negative effect, analyze the VEHSTATS performance reports in advance. For
most V07 and VEB installations, LZ4 provides a good compromise, reaching a higher
compression ratio without exhausting the CPU capacity of these TS7700 models.
Using the new compression algorithms will have a positive impact on these areas:
Cache resources (cache bandwidth and cache space) required
Grid link bandwidth
Physical tape resources
Premigration queue length (FC 5274)
It can also have a positive impact on your recovery point objective, depending on your
configuration.
The actual amount of host data that is stored on a virtual CST or ECCST volume is displayed by the
LI REQ commands and in the MI. Depending on the selected logical volume size (400 MiB to
25 GB), the uncompressed size varies from 1,200 MiB to 75,000 MiB (assuming a 3:1
compression ratio).
Scratch volumes at the mounting cluster are chosen by using the following priority order:
1. All volumes in the source or alternative source category that are owned by the local
cluster, not currently mounted, and do not have pending reconciliation changes against a
peer cluster
2. All volumes in the source or alternative source category that are owned by any available
cluster, not currently mounted, and do not have pending reconciliation changes against a
peer cluster
3. All volumes in the source or alternative source category that are owned by any available
cluster and not currently mounted
4. All volumes in the source or alternative source category that can be taken over from an
unavailable cluster that has an explicit or implied takeover mode enabled
The first volumes that are chosen in the preceding steps are the volumes that have been in
the source category the longest. Volume serials are also toggled between odd and even
serials for each volume selection.
For all scratch mounts, the volume is temporarily initialized as though the volume was
initialized by using the EDGINERS or IEHINITT program. The volume has an IBM-standard label
that consists of a VOL1 record, an HDR1 record, and a tape mark.
Important: In Release 3.0 or later of the TS7700, all categories that are defined as scratch
inherit the Fast Ready attribute. There is no longer a need to use the MI to set the Fast
Ready attribute to scratch categories. However, the MI is still needed to indicate which
categories are scratch.
When the Fast Ready attribute is set or implied, no recall of content from physical tape is
required in a TS7740 or TS7700T. No mechanical operation is required to mount a logical
scratch volume. In addition, the volume’s current consistency is ignored because a scratch
mount requires a write from BOT.
The TS7700 with SAA function activated uses policy management with z/OS host software to
direct scratch allocations to specific clusters within a multi-cluster grid.
The recalled virtual volume remains in the TVC until it becomes the least recently used (LRU)
volume, unless the volume was assigned a Preference Group of 0 or the Recalls Preferred to
be Removed from Cache override is enabled by using the TS7700 Library Request command.
If the mounted virtual volume was modified, the volume is again premigrated.
If modification of the virtual volume did not occur when it was mounted, the TS7740 or
TS7700T does not schedule another copy operation, and the current copy of the logical
volume on the original stacked volume remains active. Furthermore, copies to remote
TS7700 clusters in a grid configuration are not required if modifications were not made. If the
primary or secondary pool location has changed, it is recognized now, and one or two new
copies to tape are queued for premigration.
In this case, DFSMS Removable Media Manager (DFSMSrmm) or other TMS fails the mount
operation, because the expected last written data set for the private volume was not found.
Because no write operation occurs, the original volume’s contents are left intact, which
covers the case in which categories are incorrectly configured as scratch (Fast Ready) within the MI.
The LWORM implementation of the TS7700 emulates physical WORM tape drives and
media. TS7700 provides the following functions:
Provides an advanced function DC construct property that enables volumes to be
assigned as LWORM-compliant during the volume’s first mount, where a write operation
from BOT is required, or during a volume’s reuse from scratch, where a write from BOT is
required
Generates, during the assignment of LWORM to a volume’s characteristics, a temporary
worldwide identifier that is surfaced to host software during host software open and close
processing, and then bound to the volume during the first write from BOT
Generates and maintains a persistent Write-Mount Count for each LWORM volume, and
keeps the value synchronized with host software
Enables only appends to LWORM volumes by using physical WORM append guidelines
Provides a mechanism through which host software commands can discover LWORM
attributes for a given mounted volume
TS7700 reporting volumes (BVIR) cannot be written in LWORM format. For more information,
see 11.14.1, “Overview of the BVIR function” on page 679.
Clarification: Cohasset Associates, Inc. has assessed the LWORM capability of the
TS7700. The conclusion is that the TS7700 meets all US Securities and Exchange
Commission (SEC) requirements in Rule 17a-4(f), which expressly enables records to be
retained on electronic storage media.
Each virtual drive has the following characteristics of physical tape drives:
Uses host device addressing
Is included in the I/O generation for the system
Is varied online or offline to the host
Signals when a virtual volume is loaded
Responds and processes all IBM 3490E I/O commands
Becomes not ready when a virtual volume is rewound and unloaded
Supports manual stand-alone mount processing for host initial program load (IPL) when
initiated from the MI
For software transparency reasons, the functions of the 3490E integrated cartridge loader
(ICL) are also included in the virtual drive’s capability. All virtual drives indicate that they have
an ICL. For scratch mounts, using the emulated ICL in the TS7700 to preinstall virtual
cartridges is of no benefit.
With FC 5275, you can add one LCU (16 drives) at a time, up to the maximum of 496 logical
drives (31 LCUs x 16 devices) per cluster.
Note: 8 Gigabit (Gb) and 16 Gigabit (Gb) FICON adapters must be installed in a cluster
before these additional devices can be defined. Existing configurations with 4 Gb FICON
adapters do not support these additional devices.
This is valuable in a setup where you have a production system and a test system with
different security settings on the hosts, and you want to separate the access to the grid in a
more secure way. It can also be used in a multi-tenant service provider to prevent tenants
from accessing each other’s data, or when you have different IBM Z operating systems that
share the TS7700, such as z/OS, IBM z/VSE, IBM z/Transaction Processing Facility (IBM
z/TPF), and IBM z/VM.
Hard partitioning is a way to give a fixed number of LCUs to a defined host group, and
connect the units to a range of logical volumes that are dedicated to a particular host or hosts.
SDAC is a useful function when multiple partitions have the following characteristics:
Separate volume ranges
Separate TMS
Separate tape configuration database
Implementing SDAC requires planning and orchestration with other system areas, to map the
wanted access for the device ranges from individual servers or LPARs, and consolidate this
information in a coherent input/output definition file (IODF) or HCD. From the TS7700
subsystem standpoint, SDAC definitions are set up using the TS7700 MI.
Remember: Do not change the assignment of physical tape drives attached to a TS7740
or TS7700T in the IBM TS3500 or IBM TS4500 tape library web interface. Consult
your IBM SSR for configuration changes.
Before Release 3.3, all attached physical drives had to be homogeneous. With Release 3.3,
support was added for the use of a mix between the TS1150 and one other tape drive
generation. This is called heterogeneous tape drive support and is for migration purposes
only. Although the TS1150 does not support JA and JB cartridges, it might be necessary to
read the existing data with a tape drive from the previous generation, and then write the data
with the TS1150 to a JC or JD cartridge. No new data can be placed on the existing JA and
JB cartridges when the heterogeneous support is used. These older cartridges are referred to
as sunset media.
To support the heterogeneous tape drives, additional controls were introduced to handle the
reclaim value for sunset media differently from the rest of the tape media. Also, two more
SETTING ALERTS were introduced to allow the monitoring of the sunset drives.
For more information, see 7.1.5, “TS7700 tape library attachments, drives, and media” on
page 263.
After the host closes and unloads a virtual volume, the storage management software inside
the TS7740 or TS7700T schedules the virtual volume to be copied (also known as
premigration) onto one or more physical tape cartridges. The TS7740 or TS7700T attempts
to maintain a minimal amount of stacked volume to which virtual volumes are copied.
Therefore, mount activity is reduced because a minimal number of physical cartridges are
mounted to service multiple virtual volume premigration requests that target the same
physical volume pool. How many physical cartridges for premigration per pool can be
mounted in parallel is defined within the MI as part of the pool property definitions. Virtual
volumes are already compressed and are written in that compressed format to the stacked
volume. This procedure maximizes the use of a cartridge’s storage capacity.
A logical volume that cannot fit in the currently filling stacked volume does not span across
two or more physical cartridges. Instead, the stacked volume is marked full, and the logical
volume is written on another stacked volume from the assigned pool.
Due to business reasons, it might be necessary to separate logical volumes from each other
(selective dual write, multi-client environment, or encryption requirements). Therefore, you
can influence the location of the data by using volume pooling. For more information, see
“Using physical volume pools” on page 51.
Through the TS3500 or TS4500 web interface, physical cartridge ranges should be assigned to
the appropriate library partition that is associated with your TS7700. This enables them to become
visible to the correct TS7700. The TS7700 MI must further be used to define which pool
physical tapes are assigned to when initially inserted into the TS3500 or TS4500, which
includes the common scratch pool. How physical tapes can move between pools for scratch
management is also defined by using the MI.
With the Selective Dual Copy function, storage administrators can selectively create two
copies of logical volumes within two pools of a TS7740 or TS7700T. The Selective Dual Copy
function can be used with the Copy Export function to provide a secondary offsite physical
copy for DR purposes. For more information about Copy Export, see 2.2.26, “Copy Export
function” on page 56.
The second copy of the logical volume is created in a separate physical pool to ensure
physical cartridge separation. Control of Dual Copy is through the MC construct (see
“Management Classes window” on page 456). The second copy is created when the original
volume is premigrated.
The second copy that is created through the Selective Dual Copy function is only available
when the primary volume cannot be recalled or is inaccessible. It cannot be accessed
separately, and cannot be used if the primary volume is being used by another operation. The
second copy provides a backup if the primary volume is damaged or inaccessible.
Selective Dual Copy is defined to the TS7740/TS7700T and has the following characteristics:
The selective dual copy feature is enabled by the MC setting through the MI where you
define the secondary pool.
Secondary and primary pools can be intermixed:
– A primary pool for one logical volume can be the secondary pool for another logical
volume unless the secondary pool is used as a Copy Export pool.
– Multiple primary pools can use the same secondary pool.
At Rewind Unload (RUN) time, the secondary pool assignment is determined, and the
copy of the logical volume is scheduled. The scheduling of the backup is determined by
the premigration activity occurring in the TS7740 or TS7700T.
The secondary copy is created before the logical volume is migrated to the primary pool.
How you control the content of the TVC (TS7700T and TS7740)
You control the content through the SC construct. Through the MI, you can define one or
more SC names. If the selected cluster possesses a physical library, you can assign
Preference Level 0 or 1. If the selected cluster does not possess a physical library, volumes in
that cluster’s cache display a Level 1 preference.
(Figure: The ACS routines assign the DC, SC, MC, and SG construct names, and the TS7700
records them in its database for the volume at rewind/unload time. In the example shown,
VOLSER VT0999 is selected with Storage Group BACKUP and Storage Class NOCACHE.)
If the host passes a previously undefined SC name to the TS7700 during a scratch mount
request, the TS7700 adds the name by using the definitions for the default SC.
Define SCs: Ensure that you predefine the SCs. The default SC might not support your
needs.
For environments other than z/OS (SMS) environments, an SC can be assigned to a range of
logical volumes during insert processing by using the MI. The SC can also be updated for a
range of volumes through the MI after they have been inserted.
To be compatible with the IART method of setting the preference level, the SC definition also
enables a Use IART selection to be assigned. Even before Outboard Policy Management was
made available for the previous generation VTS, you could assign a preference level to virtual
volumes by using the IART attribute of the SC. The IART is an SC attribute that was originally
added to specify the wanted response time (in seconds) for an object by using the OAM.
If you wanted a virtual volume to remain in cache, you assigned an SC to the volume whose
IART value was 99 seconds or less. Conversely, if you wanted to give a virtual volume preference
to be out of cache, you assigned an SC to the volume whose IART value was 100 seconds or
more. Assuming that the Use IART selection is not specified, the TS7700 sets the preference
level for the volume based on the Preference Level 0 or 1 of the SC assigned to the volume.
When space is needed in cache, the TS7740 first determines whether there are any PG0
volumes that can be removed. If not, the TS7740 selects PG1 volumes to remove based on
an LRU algorithm. With this process, volumes that have been copied to physical tape and
have been in cache the longest without access are removed first.
When a preference level is assigned to a volume, that assignment persists until the
volume is reused for scratch and a new preference level is assigned, or until the policy is
changed and a mount/demount occurs, at which point the new policy takes effect.
However, there might be use cases where volumes recalled into cache are known to be
accessed only once, and should be removed from disk cache as soon as they are read (for
example, during a multi-volume data set restore). In this case, you do not want the volumes
to be kept in cache, because keeping them forces other, more important cache-resident data to
be migrated.
Based on your current requirements, you can set or modify this control dynamically through
the LI REQ SETTING RECLP0 option:
When DISABLED, which is the default, logical volumes that are recalled into cache are
managed by using the actions that are defined for the SC construct associated with the
volume as defined at the TS7700.
When ENABLED, logical volumes that are recalled into cache are managed as PG0
(preferable to be removed from cache). This control overrides the actions that are defined
for the SC associated with the recalled volume.
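A hedged sketch of this control follows, using a placeholder distributed library name. The
operand spelling follows the text above; verify it against the Host Console Request
documentation for your microcode level:

LI REQ,DISTLIB1,SETTING,CACHE,RECLP0,ENABLE

Issuing the same command with DISABLE returns the cluster to the default behavior.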
In a TS7700D stand-alone cluster, you can influence the TVC content only with the Delete
Expired and EJECT settings. No further cache management is available. For a TS7700T, the
LI REQ PARTRFSH command can be used to move data between partitions.
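The following line is an assumption-based sketch only; the PARTRFSH operand order and scope
are not confirmed here, so check the Host Console Request documentation before use. DISTLIB1
and the VOLSER are placeholders:

LI REQ,DISTLIB1,PARTRFSH,VT0999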
If a TS7700D runs out of cache space or a TS7700T runs out of CP0 space, warning messages
and critical messages are shown. If the TS7700D enters the out-of-cache condition, it moves
to a read-only state. If a TS7720T CP0 partition becomes full, it becomes read-only for
workloads that target CP0.
Consider a situation where the amount of delayed premigration content, which has also not
yet met its delay criteria, exceeds a partition’s configured maximum delay-premigration limit.
In this case, delayed content that has been present in the partition for the longest time is
moved to the premigration queue proactively to maintain the configured limit. If the defined
delay premigration limit is too small, data is always pushed out to tape too early, which can
create excess back-end tape usage and associated activity.
One important aspect of using delay premigration is that the content that is delayed for
premigration is not added to the premigration queue until its delay criteria has been met. This
means that if a large amount of delayed content meets its criteria at the same time, the
premigration queue can rapidly increase in size. This rapid increase can result in unexpected
host throttling.
Ensure that your FC5274 feature counts can accommodate these large increases in
premigration activity. Alternatively, try to ensure that multiple workloads that are delayed for
premigration do not reach their criteria at the same time.
Assume that you have three different tape partitions and a unique SC for each one. The
following list describes the SC definitions:
CP1: Delay premigration 12 hours after volume creation
CP2: Delay premigration 6 hours after volume creation
CP3: Delay premigration 3 hours after volume creation
In CP1 at 22:00, 6 TB are written every night. The 12-hour delay ensures that they are
premigrated later in the day when the workload is lower. To keep the example simple, assume
that none of the data compresses.
In CP2 at 04:00, 2 TB are written. The six-hour delay makes them eligible for premigration
also at 10:00 in the morning.
In CP3 at 07:00, 1 TB is written. The three-hour delay has them eligible for premigration at the
same time as the other two workloads.
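The timing of this example lines up as follows:

Partition   Written at   Delay      Eligible for premigration
CP1         22:00        12 hours   10:00 the next morning
CP2         04:00         6 hours   10:00
CP3         07:00         3 hours   10:00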
Therefore, all 9 TB of the workload is meeting its delay criteria at roughly the same time,
producing a large increase in premigration activity. If the premigration queue size is not large
enough, workloads into the TS7700T are throttled until the premigration process can reduce
the queue size. Ensure that the number of FC 5274 features is suitable, or plan the delay
times so that they do not all expire at the same time.
To help manage the expired content, the TS7700 supports a function referred to as delete
expire. When enabling delete expire processing against a configured scratch category, you
can set a grace period for expired volumes ranging from 1 hour to 144 weeks (the default is 24
hours). If the volume has not already been reused when the delay period has passed, the
volume is marked as a candidate for auto deletion or delete expire.
When deleted, its active space in TVC is freed. If it was also stacked to one or more physical
tapes, that region of physical tape is marked inactive.
The start timer for delete expire processing is set when the volume is moved to a designated
scratch category, or a category with the Fast Ready attribute set, which has defined a delete
expire value. If the scratch category has no delete expire value, the timer is not set.
During the delete expire process, the start timer and the delete expire value are used to
determine whether the logical volume is eligible for the delete expire processing. If so, the
content is deleted immediately.
If the logical volume is reused during a scratch mount before the expiration delete time
expires, the existing content is immediately deleted at the time of first write.
It does not matter whether the volume is in cache or on back-end tape; after the delete expire
time passes, the volume is no longer accessible without IBM SSR assistance. The default
behavior is to Delete Expire up to 1000 delete-expire candidates per hour. This value can be
modified by using the LI REQ command.
For more information about expired volume management, see “Defining the logical volume
expiration time” on page 553. A volume can be explicitly moved out of a category that is
configured for delete expire before the volume’s expiration.
Important: Disregarding the Delete Expired Volumes setting can lead to an out-of-cache
state in a TS7700D. With a TS7740 or TS7700T, it can cause excessive tape usage. In an
extreme condition, it can cause an out-of-physical scratch state.
The disadvantage of not having this option enabled is that scratched volumes needlessly use
TVC and physical stacked volume resources, so they demand more TVC active space while
also requiring more physical stacked volumes in a TS7740 or TS7700T. The time that it takes
a physical volume to fall below the reclamation threshold is also increased, because the data
is still considered active. This delay in data deletion also causes scratched stale logical
volumes to be moved from one stacked volume to another during reclamation.
This additional option is made available to prevent any malicious or unintended overwriting of
scratched data before the duration elapses. After the grace period expires, the volume is
simultaneously removed from a held state and made a deletion candidate.
Remember: Volumes in the Expire Hold state are excluded from DFSMS OAM scratch
counts, and are not candidates for TS7700 scratch mounts.
Delete Expired data that was previously stacked onto physical tape remains recoverable
through an IBM services salvage process if the physical tape has not yet been reused, or if
the secure erase process was not performed against it. Contact your IBM SSR if these
services are required. Also, disabling reclamation as soon as any return to scratch mistake is
made can help retain any content still present on physical tape.
Important: When Delete Expired is enabled for the first time against a scratch category,
volumes that are already contained within that category are not candidates for delete expire
processing. Only volumes that move to the scratch category after the enablement of
Delete Expired are candidates for delete expire processing.
Changes to the Delete Expired values are effective for all logical volumes that are
candidates for delete expire processing.
However, the cache sizes in a TS7700T are much bigger and depend on the installed
configuration, so this behavior might not be appropriate. Therefore, Release 3.3 introduced a
new LI REQ option that defines how a TS7700T behaves in such a condition.
You can now specify whether a TS7700T CPx reacts the same as a TS7740 or accepts the
incoming write in a stand-alone mode until the cache resources are exhausted.
TVC encryption is turned on for the whole disk cache. You cannot encrypt a disk cache
partially. Therefore, all DDMs in all strings must be full disk encryption (FDE)-capable to
enable the encryption. The disk cache encryption is supported for all TS7760 models with
CSA, all TS7720 models with 3956-CS9 cache or higher, and for TS7740 with 3956-CC9.
Encryption can be enabled in the field at any time, and retroactively encrypts all existing
content that is stored within the TVC. Because the encryption is done at the HDD level,
encryption is not apparent to the TS7700 and has no effect on performance.
With R3.0, only local key management was supported. Local key management is
automated; there are no encryption keys (EKs) for the user to manage. Release 3.3 added
support for the external key manager IBM Security Key Lifecycle Manager (SKLM, formerly
IBM Tivoli Key Lifecycle Manager).
If you want to use an external key manager for both TVC and physical tape, you must use the
same external key manager instance for both of them.
There are two differences between the usage of local or external key management:
If you have no connection to the external key manager, TS7700 will not run. Therefore, you
must plan carefully to have a primary and an alternative key manager that are reachable in
a disaster situation.
If a cluster that uses disk encryption with an external key manager is unjoined from a grid,
the encryption must be disabled during this process. Otherwise, the TS7700 cannot be
reused. Therefore, during the unjoin, the cluster is securely erased.
The following list includes some examples of why physical volume pools are helpful:
Data from separate customers on the same physical volume can compromise certain
outsourcing contracts.
Customers want to be able to “see, feel, and touch” their data by having only their data on
dedicated media.
Customers need separate pools for different environments, such as test, user acceptance
test (UAT), and production.
Traditionally, users are charged by the number of volumes they have in the tape library.
With physical volume pooling, users can create and consolidate multiple logical volumes
on a smaller number of stacked volumes, and reduce their media charges.
Recall times depend on the media length. Small logical volumes on the tape cartridges
(JA, JB, and JC) can take a longer time to recall than volumes on the economy cartridge
(JJ or JK). Therefore, pooling by media type is also beneficial.
Some workloads have a high expiration rate, which causes excessive reclamation. These
workloads are better suited in their own pool of physical volumes.
Protecting data through encryption can be set on a per pool basis, which enables you to
encrypt all or some of your data when it is written to the back-end tapes.
Migration from older tape media technology.
Reclaimed data can be moved to a different target pool, which enables aged data to move
to a specific subset of physical tapes.
Second dedicated pool for key workloads to be Copy Exported.
There are benefits to using physical volume pools, so plan for the number of physical pools.
See also “Relationship between reclamation and the number of physical pools” on page 55.
Each TS7740/TS7700T that is attached to an IBM TS4500 or IBM TS3500 tape library has its
own set of pools.
Each pool can be defined to borrow a single media type (for example, JA, JB, JC, or JD), borrow
mixed media, or have a first choice and a second choice. The borrowing options can be set by
using the MI when you are defining stacked volume pool properties.
Remember: The common scratch pool must have at least three scratch cartridges
available; when it has fewer, the TS7700 reports low scratch count warnings.
Those pools can have their properties tailored individually by the administrator for various
purposes. When initially creating these pools, it is important to ensure that the correct
borrowing properties are defined for each one. For more information, see “Stacked volume
pool properties” on page 53.
By default, there is one pool, Pool 01, and the TS7740/TS7700T stores virtual volumes on
any stacked volume available to it. This creates an intermix of logical volumes from differing
sources (for example, different LPARs and applications) on a physical cartridge.
The user cannot influence the physical location of the logical volume within a pool. Having all
of the logical volumes in a single group of stacked volumes is not always optimal.
Using this facility, you can also perform the following tasks:
Separate different clients or LPAR data from each other.
Intermix or segregate media types.
Map separate SGs to the same primary pools.
Set up specific pools for Copy Export.
Set up pool or pools for encryption.
Set a reclamation threshold at the pool level.
Set reclamation parameters for stacked volumes.
Set up reclamation cascading from one pool to another.
Set the maximum number of devices to use for concurrent premigration on a per-pool basis.
Assign or eject stacked volumes from specific pools.
Figure 2-8 TS7740/TS7700T Logical volume allocation to specific physical volume pool flow
Through the MI, you can add an SG construct, and assign a primary storage pool to it.
Stacked volumes are assigned directly to the defined storage pools. The pool assignments
are stored in the TS7740/TS7700T database. During a scratch mount, a logical volume is
assigned to a selected SG.
This SG is connected to a storage pool with assigned physical volumes. When a logical
volume is copied to tape, it is written to a stacked volume that belongs to this storage pool. In
addition, MC can be used to define a secondary pool when two copies on physical tape are
required.
Physical VOLSER ranges can be defined with a home pool at insert time. Changing the home
pool of a range has no effect on existing volumes in the library. If borrow/return is also
disabled for that pool, this provides a method to have a specific range of volumes that is used
exclusively by a specific pool.
Tip: Primary Pool 01 is the default private pool for TS7740/TS7700T stacked volumes.
With borrowing, stacked volumes can move from pool to pool and back again to the original
pool. In this way, the TS7740/TS7700T can manage out-of-scratch and low scratch scenarios,
which can occur within any TS7740/TS7700T from time to time.
You need at least two empty stacked volumes in the common scratch pool (CSP) to avoid an
out-of-scratch condition. Empty physical volumes in other pools (regardless of the pool properties)
are not considered. Ensure that non-borrowing active pools have at least two scratch volumes.
A single physical pool with an out-of-stacked-volume condition results in an out-of-stacked-volume
condition for the whole TS7740 or TS7700T cluster. Therefore, it is necessary to monitor all
active pools.
Lower capacity JJ, JK, or JL cartridges can be designated to a pool to provide consistently
faster access to application data, such as hierarchical storage management (HSM) or
Content Manager. Higher capacity JA, JB, JC, or JD cartridges that are assigned to a pool
can address archival requirements, such as full volume dumps.
The data that is associated with a logical volume is considered invalidated if any of the
following conditions are true:
A host has assigned the logical volume to a scratch category. Later, the volume is selected
for a scratch mount, and data is written to the volume. The older version of the volume is
now invalid.
A host has assigned the logical volume to a scratch category. The category has a nonzero
delete-expired data parameter value. The parameter value was exceeded, and the
TS7740/TS7700T deleted the logical volume.
A host has modified the contents of the volume. This can be a complete rewrite of the
volume or an append to it. The new version of the logical volume is premigrated to a
separate physical location and the older version is invalidated.
The logical volume is ejected, in which case the version on physical tape is invalidated.
The pool properties change during a mount/demount sequence and a new pool is chosen.
One reclamation task needs two physical tape drives to run. At the end of the reclaim, the
source volume is empty, and it is returned to the specified reclamation pool as an empty
(scratch) volume. The data that is being copied from the reclaimed physical volume does not
go to the TVC. Instead, it is transferred directly from the source to the target tape cartridge.
During the reclaim, the source volume is flagged as being in read-only mode.
Physical tape volumes become eligible for space reclamation when they cross the occupancy
threshold level that is specified by the administrator in the home pool definitions where those
tape volumes belong. This reclaim threshold is set for each pool individually according to the
specific needs for that client, and is expressed in a percentage (%) of tape usage.
Volume reclamation can be concatenated with a Secure Data Erase for that volume, if
required. This configuration causes the volume to be erased after the reclamation. For more
information, see 2.2.25, “Secure Data Erase function” on page 55.
Consider not running reclamation during peak workload hours of the TS7740/TS7700T. This
ensures that recalls and migrations are not delayed due to physical drive shortages. You must
choose the best period for reclamation by considering the workload profile for that
TS7740/TS7700T cluster, and inhibit reclamation during the busiest period for the system.
A physical volume that is being ejected from the library is also reclaimed in a similar way
before it can be ejected. The active logical volumes that are contained in the cartridge are
moved to another physical volume, according to the policies defined in the volume’s home
pool, before the physical volume is ejected from the library.
Reclamation can also be used to migrate older data from one pool to another while it is being
reclaimed, but only by targeting a separate specific pool for reclamation.
With Release 3.3, it is possible to deactivate reclamation on a per-pool basis by
specifying a value of 0 for the Reclaim Threshold.
With the introduction of heterogeneous tape drive support for migration purposes, the data
from the old cartridges (for example, JA and JB) is reclaimed to the new media (for example,
JC and JD). To support a faster migration, the reclaim values for the sunset media can be
different from the reclaim values for the current tape media. To allow the reclaim of sunset
media, at least 15 scratch cartridges of the newer tape media need to be available. For
more information, see “Physical Volume Pools” on page 427.
The number of physical pools, physical drives, stacked volumes in the pools, and the available
time tables for reclaim schedules must be considered and balanced.
You can limit the number of reclaim tasks that run concurrently by using an LI REQ setting.
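As an assumption-based sketch only (the RECLAIM keyword group and the RCLMMAX operand
are not confirmed here), such a limit might be set as follows, with DISTLIB1 and the task count
as placeholders:

LI REQ,DISTLIB1,SETTING,RECLAIM,RCLMMAX,4

Verify the exact keywords in the Host Console Request documentation for your microcode level.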
A Long Erase operation on a TS11xx drive writes a repeating pattern from the beginning to the
end of the physical tape, making all data previously present inaccessible through traditional
read operations. The erasure writes a single random pattern repeatedly in one pass, which
might not be as secure as the multi-pass fixed-pattern methods described by the
US Department of Defense (DoD).
Therefore, the logical volumes that are written on this stacked volume are no longer readable.
As part of this data erase function, an extra reclaim policy is added. The policy specifies the
number of days that a physical volume can contain invalid logical volume data before the
physical volume becomes eligible to be reclaimed.
When a physical volume contains encrypted data, the TS7740/TS7700T is able to run a fast
erase of the data by erasing the EKs on the cartridge. Basically, it erases only the portion of
the tape where the key information is stored. This form of erasure is referred to as a
cryptographic erase.
Without the key information, the rest of the tape cannot be read. This method significantly
reduces the erasure time. Any physical volume that has a status of read-only is not subject to
this function, and is not designated for erasure as part of a read-only recovery (ROR).
If you use the eject stacked volume function, the data on the volume is not erased before
ejecting. The control of expired data on an ejected volume is your responsibility.
Volumes that are tagged for erasure cannot be moved to another pool until they are erased.
However, they can be ejected from the library, because such a volume is typically removed for
recovery actions.
Using the Move function also causes a physical volume to be erased, even though the number
of days that are specified has not yet elapsed. This process includes returning borrowed
volumes.
The Copy Export function enables a copy of selected logical volumes that are written to
secondary pools within the TS7740/TS7700T to be removed and taken offsite for DR
purposes. The benefits of volume stacking, which places many logical volumes on a physical
volume, are retained with this function. Because the physical volumes that are being exported
are from a secondary physical pool, the primary logical volume remains accessible to the
production host systems.
The Copy Export sets can be used to restore data at a location that has equal or newer tape
technology and equal or newer TS7700 Licensed Internal Code. A TS7700T Copy Export set
can be restored into both TS7740 and TS7700T. A TS7740 Copy Export set can also be
restored into both TS7740 and TS7700T. However, some rules apply:
TS7700T exported content that is restored to a TS7740 loses all knowledge of partitions.
TS7700T to TS7700T retains all partition information.
TS7740 exported content that is restored into a TS7700T has all content target the
primary tape partition.
There is an offsite reclamation process for copy-exported stacked volumes. This process
does not require the movement of physical cartridges. Rather, the logical volumes are written
again to another copy-exported stacked volume, and the original copy-exported stacked volume
is marked invalid. For more information, see 12.1.5, “Reclaim process for Copy Export physical
volumes” on page 743.
This book uses the general term key manager for all EK managers.
Important: The EKM is no longer available and does not support the TS1140 and TS1150.
If you need encryption support for the TS1140 or higher, you must install either IBM
Security Key Lifecycle Manager or IBM Security Key Lifecycle Manager for z/OS.
IBM Security Key Lifecycle Manager replaces Tivoli Key Lifecycle Manager.
The key manager is the central point from which all EK information is managed and served to
the various subsystems. The key manager server communicates with the TS7740/TS7700T
and tape libraries, CUs, and Open Systems device drivers. For more information, see 4.4.7,
“Planning for tape encryption in a TS7740, TS7720T, and TS7760T” on page 186.
The TS7740/TS7700T
The TS7740/TS7700T provides the means to manage the use of encryption and the keys that
are used on a storage pool basis. It also acts as a proxy between the tape drives and the key
manager servers, by using redundant Ethernet to communicate with the key manager servers
and FICONs to communicate with the drives. Encryption must be enabled in each of the
tape drives.
The storage pools were originally created for management of physical media, and they have
been enhanced to include encryption characteristics. Storage pool encryption parameters are
configured through the TS7740/TS7700T MI under Physical Volume Pools.
For encryption support, all drives that are attached to the TS7740/TS7700T must be
Encryption Capable, and encryption must be enabled. If TS7740/TS7700T uses TS1120
Tape Drives, they must also be enabled to run in their native E05 format. The management of
encryption is performed on a physical volume pool basis. Through the MI, one or more of the
32 pools can be enabled for encryption.
Each pool can be defined to use specific EKs or the default EKs defined at the key manager
server:
Specific EKs
Each pool that is defined in the TS7740/TS7700T can have its own unique EK. As part of
enabling a pool for encryption, enter two key labels for the pool and an associated key
mode. The two keys might or might not be the same. Two keys are required by the key
manager servers during a key exchange with the drive. A key label can be up to 64
characters. Key labels do not have to be unique per pool.
For logical volumes that contain data that is to be encrypted, host applications direct them to
a specific pool that has been enabled for encryption by using the SG construct name. All data
that is directed to a pool that is enabled for encryption is encrypted when it is premigrated to
the physical stacked volumes, or reclaimed to a stacked volume during the
reclamation process. The SG construct name is bound to a logical volume when it is mounted
as a scratch volume.
Through the MI, the SG name is associated with a specific pool number. When the data for a
logical volume is copied from the TVC to a physical volume in an encryption-enabled pool, the
TS7740/TS7700T determines whether a new physical volume needs to be mounted. If a new
cartridge is required, the TS7740/TS7700T directs the drive to use encryption during the
mount process.
The TS7740/TS7700T also provides the drive with the key labels specified for that pool.
When the first write data is received by the drive, a connection is made to a key manager and
the key that is needed to perform the encryption is obtained. Physical scratch volumes are
encrypted with the keys in effect at the time of first write to BOT.
Any partially filled physical volumes continue to use the encryption settings in effect at the
time that the tape was initially written from BOT. The encryption settings are static until the
volumes are reclaimed and rewritten again from BOT.
The request for an EK is directed to the IP address of the primary key manager. Responses
are passed through the TS7740/TS7700T to the drive. If the primary key manager does not
respond to the key management request, the optional secondary key manager IP address is
used. After the TS11x0 drive completes the key management communication with the key
manager, it accepts data from the TVC.
When a logical volume needs to be read from a physical volume in a pool that is enabled for
encryption, either as part of a recall or reclamation operation, the TS7740/TS7700T uses the
key manager to obtain the necessary information to decrypt the data.
The affinity of the logical volume to a specific EK, or the default key, can be used as part of
the search criteria through the TS7700 MI.
Remember: If you want to use external key management for both cache and physical
tapes, you must use the same external key manager instance.
You can use this user management to specify independent User IDs. Each User ID is
assigned a role. The role identifies the access rights for this user. You can use this method to
restrict the access to specific tasks.
You should consider restricting access to specific items. In particular, access to Tape Partition
management and to the LIBRARY REQUEST command should be considered carefully.
Starting with R3.0, when LDAP is enabled, access to the TS7700 MI is controlled by the LDAP
server. The local actions that are run by the IBM SSR are also secured by the LDAP server.
Standard IBM user IDs can no longer access the system without a valid LDAP user ID and
password. To communicate with the TS7700, you must have a valid account on the LDAP
server with the appropriate roles assigned to your user ID.
If the LDAP server is not available, no one can interact with the TS7700, not even the IBM
SSR or an operator.
Important: Create at least one external authentication policy for IBM SSRs before a
service event.
With R3.2, IBM RACF® can also be used to control access. In this case, all users are defined
to RACF and, at each access, the password is verified against the RACF database. Roles and
profiles must still be maintained in the TS7700 because RACF performs only the password
authentication.
Release 3.2 also introduced a change that allows specific access without the use of LDAP
(IBM SSR and second-level dial-in support).
Before R4.1.2, in rare cases a stand-alone cluster could initiate a reboot without any
notification to the customer. Starting with R4.1.2, the attached hosts are notified before the
reboot is run, as long as the cluster is still able to do so. The advantage of the notification is
that an error reason is provided and that the customer can react to these messages. In
addition, more data is collected and interpreted to trigger a local fence. Local fence is
automatically enabled, cannot be disabled by the customer, and has no parameters or
options.
For more information, see the IBM TS7700 Series Grid Resiliency Improvements User’s
Guide at:
https://www.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/WP102742
Remember: Seven- and eight-cluster grid configurations are available with an RPQ.
In a multi-cluster grid, some rules for virtual and logical volumes apply:
You can store a logical volume or virtual volume in the following ways:
– Single instance in only one cluster in a grid.
– Multiple instances (two, three, four, five, or six) in different clusters in the grid, up to the
number of clusters in the grid.
In a multi-cluster grid, the following rules for access to the virtual and logical volumes apply:
A logical volume can be accessed from any virtual device in the system.
Any logical volume (replicated or not) is accessible from any other cluster in the grid.
Each distributed library has access to any logical volumes within the composite library.
Note: You can still restrict access to clusters by using host techniques (for example, HCD).
With this flexibility, the TS7700 grid provides many options for business continuance and data
integrity, meeting requirements for a minimal configuration up to the most demanding
advanced configurations.
Grid enablement
FC4015 must be installed on all clusters in the grid.
Grid network
A grid network is the client-supplied TCP/IP infrastructure that interconnects the TS7700 grid.
Each cluster has two Ethernet adapters that are connected to the TCP/IP infrastructure. The
single-port 10 gigabits per second (Gbps) long-wave optical fiber adapter is supported. This
configuration accounts for two or four grid links, depending on the cluster configuration. See
7.1.1, “Common components for the TS7700 models” on page 246.
Earlier TS7740 clusters might still have single-port adapters for copper connections and
shortwave (SW) 1 Gbps connections. A miscellaneous equipment specification (MES) is
available to upgrade the single-port to dual-port adapters.
Tip: Enabling grid encryption significantly affects the replication performance of the
TS7700 grid.
However, if the grid is not connected to an external time source, the time that is presented
from the grid (VEHSTATS and so on) might not show the same time as your LPARs, which
can lead to some confusion during problem determination or for reporting, because the
different time stamps do not match.
The NTP server address is configured into the system vital product data (VPD) on a
system-wide scope. Therefore, all nodes access the same NTP server. All clusters in a grid
need to be able to communicate with the same NTP server that is defined in VPD. In the
absence of an NTP server, all nodes coordinate time with Node 0 or the lowest cluster index
designation. The lowest index designation is Cluster 0, if Cluster 0 is available. If not, it uses
the next available cluster.
Ownership
Any logical volume, or any copies of it, can be accessed by a host from any virtual device that
is participating in a common grid, even if the cluster associated with the virtual device does
not have a local copy. The access is subject to volume ownership rules. At any point in time, a
logical volume is owned by only one cluster. The owning cluster controls access to the data
and the attributes of the volume.
Remember: The volume ownership protects the volume from being accessed or modified
by multiple clusters simultaneously.
Ownership can change dynamically. If a cluster needs to mount a logical volume on one of its
virtual devices and it is not the owner of that volume, it must obtain ownership first. When
required, the TS7700 node transfers the ownership of the logical volume as part of mount
processing. This action ensures that the cluster with the virtual device that is associated with
the mount has ownership.
If a TS7700 Cluster has a host request for a logical volume that it does not own, and it cannot
communicate with the owning cluster, the operation against that volume fails unless more
direction is given.
Ownership can also be transferred manually for special purposes by using the LI REQ
OTCNTL function. For more information, see the IBM TS7700 Series z/OS Host Command
Line Request User’s Guide on the following website:
http://www.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/WP101091
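As a sketch only, such a manual transfer is issued as a host console request against a
distributed library. The OTCNTL keyword is taken from this section; the remaining operands
shown here are placeholders, so check the User’s Guide for the exact keywords and their
order before use:
LI REQ,distributed_library,OTCNTL,TAKE,volser
In this sketch, distributed_library stands for your distributed library name, TAKE is an
assumed action keyword, and volser is the logical volume whose ownership is to be moved.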
Tokens
Tokens are used to track changes to the ownership, data, or properties of a logical volume.
The tokens are mirrored at each cluster that participates in a grid and represent the current
state and attributes of the logical volume. Tokens have the following characteristics:
Every logical volume has a corresponding token.
The grid component manages updates to the tokens.
Tokens are maintained in an IBM DB2® database that is coordinated by the local hnodes.
Each cluster’s DB2 database has a token for every logical volume in the grid.
Tokens are internal data structures that are not directly visible to you. However, they can be
retrieved through reports that are generated with the Bulk Volume Information Retrieval
(BVIR) facility.
Tokens are part of the architecture of the TS7700. Even in a stand-alone cluster, they exist
and are used in the same way as they are used in the grid configuration (with only one cluster
running the updates and keeping the database). In a grid configuration, every member of the
grid mirrors the token information for all logical volumes within the composite library. Token
information is updated in real time at all clusters in a grid.
Ownership takeovers
In some situations, the ownership of a volume might not be transferable, such as during a
cluster outage. Without AOTM, you must initiate the takeover manually. The following options
are available:
Read-only Ownership Takeover
When Read-only Ownership Takeover is enabled for a failed cluster, ownership of a
volume is taken from the failed TS7700 Cluster. Only read access to the volume is allowed
through the other TS7700 clusters in the grid. After ownership for a volume has been
taken in this mode, any operation that attempts to modify data on that volume or change
its attributes fails. The mode for the failed cluster remains in place until another mode is
selected or the failed cluster is restored.
You can set the level of ownership takeover, Read-only or Write, through the TS7700 MI.
Important: You cannot set a cluster in service preparation after it has already failed.
For more information about an automatic takeover, see 2.3.34, “Autonomic Ownership
Takeover Manager” on page 96.
When a TVC other than the local TVC at the actual mount point is chosen, the mount is called
a remote mount. The TVC is then accessed through the grid network. You have several ways
to influence the TVC selection.
During the logical volume mount process, the best TVC for your requirements is selected,
based on the following considerations:
Availability of the cluster
Copy Consistency policies and settings
Scratch allocation assistance (SAA) for scratch mount processing
DAA for specific mounts
Override settings
Cluster family definitions
Consistency point management is controlled through the MC storage construct. Using the MI,
you can create MCs and define where copies are placed and when they are synchronized
relative to the host job that created them. Depending on your business needs for more than
one copy of a logical volume, multiple MCs, each with a separate set of definitions, can be
created.
The following key questions help to determine copy management in the TS7700:
Where do you want your copies to be placed?
When do you want your copies to become consistent with the originating data?
Do you want logical volume copy mode retained across all grid mount points?
For different business reasons, data can be synchronously created in two places, copied
immediately, or copied asynchronously. Immediate and asynchronous copies are pulled and
not pushed within a grid configuration. The cluster that acts as the mount cluster informs the
appropriate clusters that copies are required and the method they need to use. It is then the
responsibility of each target cluster to choose an optimum source and pull the data into its
disk cache.
For more information, see 2.3.5, “Copy consistency points” on page 68.
Remember: The mount point (allocated virtual device) and the actual TVC used might be
in different clusters. The Copy Consistency Policy is one of the major parameters that are
used to control the TVC.
The concept of families was introduced to help with the I/O TVC selection process, and to
help make distant replication more efficient. For example, two clusters are at one site, and the
other two are at a remote site. When the two remote clusters need a copy of the data, cluster
families enforce that only one copy of the data is sent across the long grid link.
Also, when a cluster determines where to source a volume, it gives higher priority to a cluster
in its family over another family. A cluster family establishes a special relationship between
clusters. Typically, families are grouped by geographical proximity to optimize the use of grid
bandwidth. Family members are given higher weight when determining which cluster to prefer
for TVC selection.
Figure 2-12 on page 70 illustrates how cooperative replication occurs with cluster families.
Cooperative replication is used for Deferred copies only. When a cluster needs to pull a copy
of a volume, it prefers a cluster within its family. The example uses Copy Consistency Points
of Run, Run, Deferred, Deferred [R,R,D,D].
With cooperative replication, one of the family B clusters at the DR site pulls a copy from one
of the clusters in production family A. The second cluster in family B waits for the other cluster
in family B to finish getting its copy, then pulls it from its family member. This way the volume
travels only once across the long grid distance.
Because each family member pulls a copy of a separate volume, this process makes a
consistent copy of all volumes available to the family more quickly. With cooperative
replication, a family prefers retrieving a new volume that the family does not yet have a copy
of over copying a volume within the family. Only when fewer than 20 copies (or the configured
number of replication tasks) remain to be sourced from outside the family does the family
begin to replicate among itself.
Second copies of volumes within a family are deferred in preference to new volume copies
into the family. Without families, a source cluster attempts to keep the volume in its cache until
all clusters that need a copy have received their copy. With families, a cluster’s responsibility
to keep the volume in cache is released after all families that need a copy have it. This
process enables PG0 volumes in the source cluster to be removed from cache sooner.
Another benefit is the improved TVC selection in cluster families. For cluster families already
using cooperative replication, the TVC algorithm favors using a family member as a copy
source. Clusters within the same family are favored by the TVC algorithm for remote (cross)
mounts. This favoritism assumes that all other conditions are equal for all the grid members.
For more information about cluster families, see IBM Virtualization Engine TS7700 Series
Best Practices - TS7700 Hybrid Grid Usage, found at the following website:
http://www.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/WP101656
Note: Remember, Synchronous mode copy is not subject to copy policy override settings.
In these GDPS use cases, you must set the Force Local TVC override to ensure that the local
TVC is selected for all I/O. This setting includes the following options:
Prefer Local for Fast Ready Mounts
Prefer Local for non-Fast Ready Mounts
Force Local TVC to have a copy of the data
Composite library
The composite library is the logical image of all clusters in a multi-cluster grid, and is
presented to the host as a single library. The host sees a logical tape library with up to 96 CUs
in a standard six-cluster grid, or up to 186 CUs if all six clusters have been upgraded to
support 496 drives.
The virtual tape devices are defined for the composite library only.
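From z/OS, the composite library is what you display and operate against. For example, the
following command (shown with the library name MYGRID that is used in the configuration
example that follows; substitute the library name from your SCDS) returns the composite
library status, including device and scratch counts:
DISPLAY SMS,LIBRARY(MYGRID),DETAIL
The same command form can also be issued against a distributed library name to see the
state of an individual cluster.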
The original figure (not reproduced here) shows a three-cluster grid as the hosts see it:
Cluster 0, Cluster 1, and Cluster 2 are the distributed libraries with library sequence numbers
A1111, B2222, and C3333, attached to Host A, Host B, and Host C. The composite library
MYGRID has library sequence number F0000, categories 002*, and the volume ranges
A00000-A99999, B00000-B99999, and C00000-C99999. Host C addresses the composite
library through UCBs 0x3000-0x30FF, its SCDS defines library name MYGRID with composite
library sequence number F0000, and its TCDB maps the volume range C00000-C99999 to
library name MYGRID. The EDGRMMnn member of Host C contains the following partitioning
statements, so that Host C manages only the C* volume range and ignores the other ranges:
OPENRULE VOLUME(*) TYPE(NORMM) ANYUSE(REJECT)
PRTITION VOLUME(*) TYPE(NORMM) SMT(IGNORE) NOSMT(IGNORE)
PRTITION VOLUME(C*) TYPE(NORMM) SMT(ACCEPT) NOSMT(IGNORE)
Distributed library
Each cluster in a grid is a distributed library, which consists of a TS7700. In a
TS7740/TS7700T, it is also attached to a physical tape library. Each distributed library can
have up to 31 3490E tape controllers per cluster. Each controller has 16 IBM 3490E tape
drives, and is attached through up to four FICON channel attachments per cluster. However,
the virtual drives and the virtual volumes are associated with the composite library.
However, in a multi-cluster grid, the different TVCs from all clusters are potential candidates
for containing logical volumes. The group of TVCs can act as one composite TVC to your
storage cloud, which can influence the following areas:
TVC management
Out of cache resources conditions
Selection of I/O cache
For more information, see 2.3.20, “General TVC management in multi-cluster grids” on
page 79 and 2.3.25, “Copy Consistency Point: Copy policy modes in a multi-cluster grid” on
page 85.
Remember: Starting with V07/VEB servers and R3.0, the maximum number of supported
virtual volumes is 4,000,000 virtual volumes per stand-alone cluster or multi-cluster grid.
The default maximum number of supported logical volumes is still 1,000,000 per grid.
Support for extra logical volumes can be added in increments of 200,000 volumes by using
FC5270.
Important: All clusters in a grid should have the same quantity of FC5270 instances
installed. If clusters with different numbers of FC5270 features are combined into a grid,
the cluster that supports the lowest number of virtual volumes constrains all of the other
clusters: only this number of virtual volumes is then available in the grid.
To optimize your environment, DAA can be used. See “Device allocation assistance” on
page 76.
If the virtual volume was modified during the mount operation, it is premigrated to back-end
tape (if present), and has all copy policies acknowledged. The virtual volume is transferred to
all defined consistency points. If you do not specify the Retain Copy Mode, the copy policies
from the mount cluster are chosen at each close process.
The exception is if the Retain Copy policy is not set, and the MC at the mounting cluster has
different consistency points defined compared to the volume’s previous mount. If the
consistency points are different, the volume inherits the new consistency points and creates
more copies within the grid, if needed. Existing copies are not removed if already present.
Remove any non-required copies by using the LIBRARY REQUEST REMOVE command.
With the new FC 5275, you can add LCUs (each with 16 drives) up to the maximum of 496
logical drives per cluster. Table 2-2 shows the resulting maximum numbers of virtual drives.
Table 2-2 Maximum number of virtual drives in a multi-cluster grid with FC 5275 installed
Cluster type Maximum number of virtual drives
To support this number of virtual drives, specific authorized program analysis reports (APARs)
are needed to install the appropriate program temporary fixes (PTFs) from the Preventive
Service Planning (PSP) bucket.
DAA is enabled, by default, in all TS7700 clusters. If random allocation is preferred, it can be
disabled by using the LIBRARY REQUEST command for each cluster. If DAA is disabled for the
cluster, DAA is disabled for all attached hosts.
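A minimal sketch of the corresponding host console requests follows. The
SETTING,DEVALLOC,PRIVATE keywords are assumed from the z/OS Host Command Line
Request User’s Guide, and library_name is a placeholder for the library name to which the
setting applies, so verify both against the guide for your code level:
LI REQ,library_name,SETTING,DEVALLOC,PRIVATE,DISABLE
LI REQ,library_name,SETTING,DEVALLOC,PRIVATE,ENABLE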
SAA was introduced in TS7700 R2.0, and is used to help direct new allocations to specific
clusters within a multi-cluster grid. With SAA, clients identify which clusters are eligible for the
scratch allocation and only those clusters are considered for the allocation request. SAA is
tied to policy management, and can be tuned uniquely per defined MC.
SAA is disabled, by default, and must be enabled by using the LIBRARY REQUEST command
before any SAA MC definition changes take effect. Also, the allocation assistance features
might not be compatible with Automatic Allocation managers based on offline devices. Verify
the compatibility before you introduce either DAA or SAA.
Important: Support for the allocation assistance functions (DAA and SAA) was first added
to the job entry subsystem 2 (JES2) environment. Starting with z/OS V2R1, DAA and SAA
are also available to JES3.
If the mount is directed to a cluster without a valid copy, a remote mount can be the result.
Therefore, in special cases, even if DAA is enabled, remote mounts and recalls can still occur.
Later, host processing attempts to allocate a device from the first cluster that is returned in the
list. If an online, non-active device is not available within that cluster, it moves to the next
cluster in the list and tries again until a device is chosen. This process enables the host to
allocate a device on the cluster that is best suited for the mount, typically one that already
holds a valid copy of the volume in its cache.
DAA improves a grid’s performance by reducing the number of cross-cluster mounts. This
feature is important when copied volumes are treated as Preference Group 0 (removed from
cache first), and when copies are not made between locally attached clusters of a common
grid. With DAA, using the copy policy overrides to Prefer local TVC for Fast Ready mounts
provides the best overall performance. Configurations that include the TS7760 and TS7720
deep cache dramatically increase their cache hit ratio.
Without DAA, configuring the cache management of replicated data as PG1 (prefer to be kept
in cache with an LRU algorithm) is the best way to improve private (non-Fast Ready) mount
performance by minimizing cross-cluster mounts. However, this performance gain includes a
reduction in the effective grid cache size, because multiple clusters are maintaining a copy of
a logical volume. To regain the same level of effective grid cache size, an increase in physical
cache capacity might be required.
DAA (JES2) requires updates in host software (APAR OA24966 for z/OS V1R8, V1R9, and
V1R10). DAA functions are included in z/OS V1R11 and later. DAA (JES3) is available
starting with z/OS V2R1.
SAA functions extend the capabilities of DAA to the scratch mount requests. SAA filters the
list of clusters in a grid to return to the host a smaller list of candidate clusters that are
designated as scratch mount candidates. By identifying a subset of clusters in the grid as sole
candidates for scratch mounts, SAA optimizes scratch mounts to a TS7700 grid.
When queried by the host that is preparing to issue a scratch mount, the TS7700 considers
the candidate list that is associated with the MC, and considers cluster availability. The
TS7700 then returns to the host a filtered, but unordered, list of candidate clusters suitable for
the scratch mount operation.
The z/OS allocation process then randomly chooses a device from among those candidate
clusters to receive the scratch mount. If all candidate clusters are unavailable or in service, all
clusters within the grid become candidates. In addition, if the filtered list returns clusters that
have no devices that are configured within z/OS, all clusters in the grid become candidates.
Be aware that SAA (and therefore this behavior) influences only the mount selection for the
logical volume. If the Management Class defines the unavailable cluster as the only cluster
where the data can be written (TVC selection), the mount is still processed. However, the job
cannot run because the selected TVC is unavailable. You see CBR4000I and CBR4171I
messages, and a CBR4196D message that requires a reply.
If either of the following events occurs, the mount enters the mount recovery process and
does not use non-candidate cluster devices:
All devices in the selected cluster are busy.
Too few or no devices in the selected cluster are online.
You can use a new LIBRARY REQUEST option to globally enable or disable the function
across the entire multi-cluster grid. Only when this option is enabled does the z/OS software
run the additional routines that are needed to obtain the candidate list of mount clusters from
a certain composite library. This function is disabled by default.
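As a sketch, and under the same assumption that the DEVALLOC keywords of the z/OS Host
Command Line Request User’s Guide apply, enabling or disabling SAA for the grid takes the
following general form (composite_library is a placeholder for your composite library name):
LI REQ,composite_library,SETTING,DEVALLOC,SCRATCH,ENABLE
LI REQ,composite_library,SETTING,DEVALLOC,SCRATCH,DISABLE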
All clusters in the multi-cluster grid must be at release 2.0 level before SAA is operational. A
supporting z/OS APAR OA32957 is required to use SAA in a JES2 environment of z/OS. Any
z/OS environment with earlier code can exist, but it continues to function in the traditional way
in relation to scratch allocations. SAA is also supported in a JES3 environment, starting with
z/OS V2R1.
To use this effective cache size, you need to manage the cache content. This is done by copy
policies (how many copies of the logical volume need to be provided in the grid) and the
cache management and removal policy (which data to keep preferably in the TVC). If you
define your copy and removal policies in a way that every cluster maintains a copy of every
logical volume, the effective cache size is no larger than a single cluster.
Therefore, you can configure your grid to take advantage of removal policies and a subset of
consistency points to provide a much larger effective capacity without losing availability or
redundancy. Any logical volume that is stacked on physical tape can be recalled into the TVC,
making it available to any cluster in the grid.
Replication order
Volumes that are written to an I/O TVC that is configured for PG0 have priority, based on the
peer TS7700 replication priority. Therefore, copy queues within TS7700 clusters handle
volumes with I/O TVC PG0 assignments before volumes configured as PG1 within the I/O
TVC. This behavior is designed to enable those volumes that are marked as PG0 to be
flushed from cache as quickly as possible, and not left resident for replication purposes.
This behavior overrides a pure FIFO-ordered queue. There is a new setting in the MI under
Copy Policy Override, Ignore cache Preference Groups for copy priority, to disable this
function. When selected, it causes all PG0 and PG1 volumes to be treated in FIFO order.
Tip: These settings in the Copy Policy Override window override default TS7700 behavior,
and can be different for every cluster in a grid.
2.3.22 TVC management for TS7740 and TS7700T CPx in a multi-cluster grid
In addition to the TVC management features from a stand-alone cluster, you can decide the
following information in a multi-cluster grid:
How copies from other clusters are treated in the cache
How recalls are treated in the cache
For example, in a two-cluster grid, consider that you set up a Copy Consistency Point policy of
RUN, RUN, and that the host has access to all virtual devices in the grid. After that, the
selection of virtual devices that are combined with I/O TVC selection criteria automatically
balances the distribution of original volumes and copied volumes across the TVCs.
The original volumes (newly created or modified) are preferred to be in cache, and the copies
are preferred to be removed from cache. The result is that each TVC is filled with unique
newly created or modified volumes, roughly doubling the effective amount of cache available
to host operations.
This behavior is controlled by the LI REQ SETTING CACHE COPYFSC option. When this option is
disabled (default), logical volumes that are copied into cache from a Peer TS7700 are
managed as PG0 (prefer to be removed from cache).
Note: COPYFSC is a cluster-wide control. All incoming copies to that specific cluster are
treated in the same way. All clusters in the grid can have different settings.
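A sketch of the corresponding console requests, assuming the SETTING,CACHE,COPYFSC
keywords named above (distributed_library is a placeholder for the cluster’s distributed
library name):
LI REQ,distributed_library,SETTING,CACHE,COPYFSC,ENABLE
LI REQ,distributed_library,SETTING,CACHE,COPYFSC,DISABLE
With DISABLE (the default), incoming copies are treated as PG0 as described above; with
ENABLE, they follow the cache preference that is assigned by the Storage Class.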
Because the TS7700D has a maximum capacity (the size of its TVC), after this cache fills, the
Volume Removal Policy enables logical volumes to be automatically removed from this
TS7700D TVC while a copy is retained within one or more peer clusters in the grid. When
coupled with copy policies, TS7700D Enhanced Removal Policies provide various automatic
data migration functions between the TS7700 clusters within the grid. This is also true for a
TS7700T CP0.
In addition, when the automatic removal is run, it implies an override to the current Copy
Consistency Policy in place, resulting in a lowered number of consistency points compared
with the original configuration defined by the user.
When the automatic removal starts, all volumes in scratch categories are removed first,
because these volumes are assumed to be unnecessary. To account for any mistake where
private volumes are returned to scratch, these volumes must meet the same copy count
criteria in a grid as the private volumes. The pinning option and minimum duration time
criteria described next are ignored for scratch (Fast Ready) volumes.
To ensure that data will always be in a TS7700D or TS7700T CP0, or be there for at least a
minimal amount of time, a volume retention time can be associated with each removal policy.
This volume retention time (in hours) enables volumes to remain in a TS7720 TVC for a
certain time before they become candidates for removal. The time can be 0 - 65,536 hours. A
volume retention time of zero means that there is no minimum requirement.
Prefer Remove and Prefer Keep policies are similar to cache preference groups PG0 and
PG1, except that removal treats both groups as LRU versus using their volume size. In
addition to these policies, volumes that are assigned to a scratch category, and that were not
previously delete-expired, are also removed from cache when the free space on a cluster falls
below a threshold. Scratch category volumes, regardless of their removal policies, are always
removed before any other removal candidates in descending volume size order.
Volume retention time is also ignored for scratch volumes. Only if the removal of scratch
volumes does not satisfy the removal requirements are PG0 and PG1 candidates analyzed
for removal. If an appropriate number of volume copies exist elsewhere, scratch removal can
occur. If one or more peer copies cannot be validated, the scratch volume is not removed.
Host command-line query capabilities are supported that help override automatic removal
behaviors and disable automatic removal within a TS7700D cluster, or for the CP0 in a
TS7700T. For more information, see the IBM Virtualization Engine TS7700 Series z/OS Host
Command Line Request User’s Guide on Techdocs. It is available at the following website:
http://www.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/WP101091
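As a sketch under stated assumptions, the removal control is expected to be a
SETTING,CACHE request of the following general form; the REMOVE keyword and its values
shown here are assumptions, so confirm the exact syntax in the User’s Guide before use:
LI REQ,distributed_library,SETTING,CACHE,REMOVE,DISABLE
LI REQ,distributed_library,SETTING,CACHE,REMOVE,ENABLE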
Delayed Replication in R3.1 changed the auto-removal algorithm so that removal of volumes
where one or more delayed replication consistency points exist can take place only after
those delayed replications have completed. If families are defined, only delayed consistency
points within the same family must have completed.
This restriction prevents the removal of the only copy of a group before the delayed
replications can complete. If no candidates are available for removal, any delayed replication
tasks that have not had their grace period elapse replicate early, enabling candidates for
removal to be created.
During this degraded state, if a private volume mount is issued to the affected cluster, all TVC
candidates are considered, even when the mount point cluster is in the Out of Cache
Resources state. The grid function chooses an alternative TS7700 cluster with a valid
consistency point and, if you have a TS7700D or TS7700T CP0, available cache space.
Scratch mounts that involve a TVC candidate that is Out of Cache Resources fail only if no
other TS7700 cluster is eligible to be a TVC candidate. Private mounts are only directed to a
TVC in an Out of Cache Resources state if there is no other eligible (TVC) candidate. When
all TVCs within the grid are in the Out of Cache Resources state, private mounts are mounted
with read-only access.
When all TVC candidates are either in the Paused, Out of Physical Scratch Resource, or Out
of Cache Resources state, the mount process enters a queued state. The mount remains
queued until the host issues a dismount command, or one of the distributed libraries exits the
unwanted state. This behavior can be influenced by a new LI REQ,distlib,SETTING,PHYSLIB
command.
Any mount that is issued to a cluster that is in the Out of Cache Resources state, and also
has Copy Policy Override set to Force Local Copy, fails. The Force Local Copy setting
excludes all other candidates from TVC selection.
Tip: Ensure that Removal Policies, Copy Consistency Policies, and threshold levels are
applied to avoid an out-of-cache-resources situation.
A temporary removal threshold is used to free enough space in the TS7700D or TS7700T
CP0 cache in advance so that it does not fill up while another TS7700 cluster is in service.
This temporary threshold is typically used when you plan to take down one TS7700 cluster for
a considerable amount of time.
In addition, the temporary removal threshold can also be used to free up space before a
disaster recovery test with FlashCopy. During the disaster recovery test, no auto removal or
delete expire processing is allowed. Therefore, use the temporary removal threshold to
ensure that enough free space is available in the clusters in the DR family for the usual
production workload and the additional FlashCopies.
Copy management is controlled through the MC storage construct. Using the MI, you can
create MCs, and define where copies exist and when they are synchronized relative to the
host job that created them.
When a TS7700 is included in a multi-cluster grid configuration, the MC definition window lists
each cluster by its distributed library name, and enables a copy policy for each. For example,
assume that three clusters are in a grid:
LIBRARY1
LIBRARY2
LIBRARY3
A portion of the MC definition window includes the cluster name and enables a Copy
Consistency Point to be specified for each cluster. If a copy is to exist on a cluster’s TVC, you
indicate a Copy Consistency Point. If you do not want a cluster to have a copy of the data, you
specify the No Copy option.
Note: The default MC is deferred at all configured clusters, including the local. The default
settings are applied whenever a new construct is defined through the MI, or to a mount
command where MC was not previously defined.
With Synchronous mode copy, data is written into one TVC and simultaneously written to the
secondary cluster. Unlike a RUN or DEFERRED copy, the data is not first written to the cache
of the I/O TVC and then read again from that cache to produce the copy. Instead, the data is
written directly to the synchronous mode copy cluster through a remote mount. One or both
locations can be remote.
All remote writes use memory buffering to get the most effective throughput across the grid
links. Only when implicit or explicit sync operations occur does all data at both locations get
flushed to persistent disk cache, providing a zero RPO of all data up to that point on tape.
Mainframe tape operations do not require that each tape block is synchronized, enabling
improved performance by only hardening data at critical sync points.
Applications that use data set-style stacking, and migrations, are the expected use cases for
Synchronous mode copy (SMC). However, any application that requires a zero RPO at sync
point granularity can benefit from the feature.
Important: The Synchronous mode copy takes precedence over any Copy Override
settings.
Meeting the zero RPO objective can be a flexible requirement for certain applications and
users. Therefore, a series of extra options are provided if the zero RPO cannot be achieved.
For more information, see the IBM TS7700 Series Best Practices - Synchronous Mode Copy
white paper that is available at the following website:
http://www.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/WP102098
Several new options are available with the synchronous mode copy. These options are
described in the following sections.
Enable this option to allow update operations to continue to any valid consistency point in the
the grid. If there is a write failure, the failed S locations are set to a state of
synchronous-deferred. After the volume is closed, any synchronous-deferred locations are
updated to an equivalent consistency point through asynchronous replication. If the
Synchronous Deferred On Write Failure option is not selected, and a write failure occurs at
either of the S locations, host operations fail.
During allocation, an R or D site is chosen as the primary consistency point only when both S
locations are unavailable.
Some applications open the virtual volume with a DISP=OLD parameter, and still append the
volume. In this case, the append is successful, and a synchronous-deferred copy is produced.
Tip: With the introduction of the new z/OS implied update option, we advise you to use
this option for DFSMShsm or equivalent products.
If multiple clusters have a Copy Consistency Point of RUN, all of their associated TVCs must
have a copy of the data before command completion is indicated for the Rewind/Unload
command. These copies are produced in parallel. Options are available to override this
requirement for performance tuning purposes.
Deferred
If a Copy Consistency Point of Deferred is defined, the copy to that cluster’s TVC can occur
any time after the Rewind/Unload command has been processed for the I/O TVC.
In the past, customers normally chose the always-copy option and accepted the processor
burden of replicating data that might soon expire. With the Time Delayed Replication Policy,
you can now specify when the replication is done. A deferred copy is made to all T sites after
X hours have passed since volume creation or last access. The process that identifies newly
created T volumes runs every 15 minutes. You can specify only one T time for all Time
Delayed replication target clusters in the MC. You can specify 1 - 65,535 hours.
Data already expired is still copied to the target clusters in these circumstances:
The TMS has not yet returned the volume to scratch.
The logical volume is scratch but not reused, and the scratch category has no expire
delete definition.
The logical volume is scratch but not reused, and the scratch category has an expire
delete setting, which has not yet been reached for this specific logical volume. Ensure that
the expire delete setting for the scratch category and the Time Replication time
combination fit together.
Using the Time Delayed policy, the automatic removal in the TS7700D or TS7700T CP0 can
be influenced. The following rules apply:
In a grid without cluster families, all T copies need to be processed before an automatic
removal can occur on any TS7700D or TS7700T CP0 in the grid.
If cluster families are defined, all T copies in the family must be processed before auto
removal can occur on any TS7700D or TS7700T CP0 in the cluster family. However, a
logical volume in a TS7700D or TS7700T CP0 can be removed even if all T copies in a
different family have not been processed.
A TS7700D or TS7700T CP0 might run out of removal candidates when the only remaining
candidates are delayed replications whose time has not yet expired. In this case, the
TS7700D or TS7700T CP0 detects the condition and triggers a subset of time-delayed
copies to replicate early to create removal candidates. These copies are prioritized and
replicated as quickly as possible. To avoid this situation, configure delay times that are early
enough to provide enough removal candidates to complete production workloads.
No Copy
No copy to this cluster is performed.
For examples of how Copy Consistency Policies work in different configurations, see 2.4,
“Grid configuration examples” on page 103.
A mixture of Copy Consistency Points can be defined for an MC, enabling each cluster to
have a unique consistency point.
Tip: The Copy Consistency Point is considered for both scratch and specific mounts.
You might want to have two separate physical copies of your logical volumes on one of the
clusters and not on the others. Through the MI associated with the cluster where you want the
second copy, specify a secondary pool when defining the MC. For the MC definition on the
other clusters, do not specify a secondary pool. For example, you might want to use the Copy
Export function to extract a copy of data from the cluster to take to a DR site.
Important: During mount processing, the Copy Consistency Point information that is used
for a volume is taken from the MC definition for the cluster with which the mount vNode is
associated.
Define the Copy Consistency Point definitions of an MC to be the same on each cluster to
avoid confusion about where copies are. You can devise a scenario in which you define
separate Copy Consistency Points for the same MC on each of the clusters. In this scenario,
the location of copies and when the copies are consistent with the host that created the data
differs, depending on which cluster a mount is processed.
In these scenarios, use the Retain Copy mode option. When the Retain Copy mode is
enabled against the currently defined MC, the previously assigned copy modes are retained
independently of the current MC definition.
On systems where DAA is not supported, 50% of the time, the host allocates to the cluster
that does not have a copy in its cache. When the alternative cluster is chosen, the existing
copies remain present, and more copies are made to the new Copy Consistency Points
defined in the Management Class, resulting in more copies. If host allocation selects the
cluster that does not have the volume in cache, one or two extra copies are created on
Cluster 1 and Cluster 3 because the Copy Consistency Points indicate that the copies need to
be made to Cluster 1 and Cluster 3.
Figure 2-17 Four-cluster grid without DAA, Retain Copy mode disabled
With the Retain Copy mode option set, the original Copy Consistency Points of a volume are
used rather than applying the Management Class with the corresponding Copy Consistency
Points of the mounting cluster. A mount of a volume to the cluster that does not have a copy in
its cache results in a cross-cluster (remote) mount instead.
Figure 2-18 Four-cluster grid without DAA, Retain Copy mode enabled
Another example of the need for Retain Copy mode is when one of the production clusters is
not available. All allocations are made to the remaining production cluster. When the volume
exists only in Cluster 0 and Cluster 2, the mount to Cluster 1 results in a total of three or four
copies. This applies to JES2 and JES3 without Retain Copy mode enabled (Figure 2-19).
Figure 2-19 Four-cluster grid, one production cluster down, Retain Copy mode disabled
Figure 2-20 Four-cluster grid, one production cluster down, Retain Copy mode enabled
For more information, see the IBM Virtualization Engine TS7700 Series Best Practices -
TS7700 Hybrid Grid Usage white paper at the Techdocs website:
http://www.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/WP101656
The list is ordered favoring the clusters that are thought to provide the optimal performance.
With Release 3.3, two new LI REQ parameter settings are introduced that influence the TVC
selection. You can use the SETTING2,PHYSLIB parameter to determine how a shortage or
unavailability condition is treated in a TS7700T.
In addition, you can use the LI REQ parameter LOWRANK to give a cluster a lower ranking in
the TVC selection. This parameter can be used under special conditions, for example before
you enter Service mode. It influences the TVC selection for host I/O and the copy and mount
behavior. It is a persistent setting, and can be set on every cluster independently. To avoid a
negative impact on your data availability, set LOWRANK back to the default after the
maintenance is done.
For more information, see IBM TS7700 Series z/OS Host Command Line Request User’s
Guide, found on the following website:
http://www.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/WP101091
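As a sketch, the LOWRANK control is issued as a host console request of the following
general form. Whether it is grouped under SETTING or SETTING2 and the exact value
keywords are assumptions here, so verify them in the User’s Guide:
LI REQ,distributed_library,SETTING2,LOWRANK,ENABLE
LI REQ,distributed_library,SETTING2,LOWRANK,DISABLE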
In a grid, you can define that the cluster is treated as degraded, which means that this cluster
has a lower priority in the TVC selection. However, all TVC selection criteria are
acknowledged, and if no other cluster can fulfill the selection criteria, the degraded cluster is
chosen as the TVC cluster.
In addition, you can specify whether this cluster pulls copies from other clusters.
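Inbound copy activity for a single cluster is typically controlled with the GRIDCNTL function of
the host console request, as sketched below (keyword set assumed from the z/OS Host
Command Line Request User’s Guide; verify before use). Disabling COPY stops the cluster
from pulling new copies from its peers, and enabling it resumes replication:
LI REQ,distributed_library,GRIDCNTL,COPY,DISABLE
LI REQ,distributed_library,GRIDCNTL,COPY,ENABLE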
The amount of data that is transferred depends on many factors, one of which is the data
compression ratio provided by the host FICON adapters. To minimize grid bandwidth
requirements, only compressed data that is used or provided by the host is transferred across
the network. Read-ahead and write buffering is also used to get the maximum from the
remote cluster mount.
Important: Ensure that scheduled Copy Export operations are always run from the
same cluster for the same recovery set. Other clusters in the grid can also initiate
independent copy export operations if their exported tapes are kept independent and
used for an independent restore. Exports from two different clusters in the same grid
cannot be joined.
Recovery that is run by the client is only to a stand-alone cluster configuration. After
recovery, the Grid MES offering can be applied to re-create a grid configuration.
When a Copy Export operation is initiated, only the following logical volumes are
considered for export:
– They are assigned to the secondary pool specified in the Export List File Volume.
– They are also on a physical volume of the pool or in the cache of the TS7700 running
the export operation.
For a Grid configuration, if a logical volume is to be copied to the TS7700 that will run the
Copy Export operation, but that copy has not yet completed when the export is initiated, it
is not included in the current export operation. Ensure that all logical volumes that need to
be included have completed replication to the cluster where the export process is run.
A service from IBM is available to merge a Copy Export set in an existing grid. Talk to your
IBM SSR.
AOTM uses the TS3000 TSSC associated with each TS7700 in a grid to provide an
alternative path to check the status of a peer TS7700. Therefore, every TS7700 in a grid must
be connected to a TSSC. To take advantage of AOTM, you must provide an IP communication
path between the TS3000 TSSCs at the cluster sites. Ideally, the AOTM function uses an
independent network between locations, but this is not a requirement.
With AOTM, the user-configured takeover mode is enabled if normal communication between
the clusters is disrupted, and the cluster that is running the takeover can verify that the other
cluster has failed or is otherwise not operational. For more information, see 9.3.11, “The
Service icon” on page 508.
When a cluster loses communication with another peer cluster, it prompts the attached local
TS3000 to communicate with the remote failing cluster’s TS3000 to confirm that the remote
TS7700 is down. If it is verified that the remote cluster is down, the user-configured takeover
mode is automatically enabled. If it cannot validate the failure, or if the system consoles
cannot communicate with each other, AOTM does not enable a takeover mode. In this
scenario, ownership takeover mode can be enabled only by an operator through the MI.
Without AOTM, an operator must determine whether one of the TS7700 clusters has failed,
and then enable one of the ownership takeover modes. This process is required to access the
logical volumes that are owned by the failed cluster. It is important that write ownership
takeover (WOT) is enabled only when a cluster has failed, and not when there is only a
problem with communication between the TS7700 clusters.
Therefore, manually enabling ownership takeover when only network issues are present
should be limited to only those scenarios where host activity is not occurring to the
inaccessible cluster. If two conflicting versions are created, the condition is detected when
communications are resolved, and the volumes with conflicting versions are moved into an
error state. When in this error state, the MI can be used to choose which version is most
current.
Even if you do not want a takeover mode to be enabled automatically, configure AOTM to
provide protection from a manual takeover mode being selected while an unreachable cluster
is still functional. The additional TS3000 TSSC path is used to determine whether an
unavailable cluster is still operational. This path helps prevent the user from forcing a cluster
online when it must not be, or from enabling a takeover mode that can result in dual volume
use.
Also, the new function enables any volume that is assigned to one of the categories that are
contained within the configured list to be excluded from the general cluster’s write protect
state. The volumes that are assigned to the excluded categories can be written to or have
their attributes modified. In addition, those scratch categories that are not excluded can
optionally have their Fast Ready characteristics ignored, including Delete Expire and hold
processing. This enables the DR test to mount volumes as private that the production
environment has since returned to scratch (they are accessed as read-only).
One exception to the write protect is those volumes in the insert category. To enable a volume
to be moved from the insert category to a write protect-excluded category, the source
category of insert cannot be write-protected. Therefore, the insert category is always a
member of the excluded categories.
Be sure that you have enough scratch space when Expire Hold processing is enabled to
prevent the reuse of production scratched volumes when you are planning for a DR test.
Suspending the volumes’ Return-to-Scratch processing during the DR test is also advisable.
For more information, see Chapter 13, “Disaster recovery testing” on page 767.
For the DR host, the FlashCopy function provides data on a time consistent basis (Time
zero). The production data continues to replicate during the entire test. The same volumes
can be mounted at both sites at the same time, even with different data. To differentiate
between read-only production data at time zero and fully read/write-enabled content that is
created by the DR host, the selective write protect features must be used.
All access to write-protected volumes involves a snapshot from the time zero FlashCopy. Any
production volumes that are not yet replicated to the DR location at the time of the snapshot
cannot be accessed by the DR host, which mimics a true disaster.
Through selective write protect, a DR host can create new content to segregated volume
ranges. There are 32 write exclusion categories now supported, versus the previous 16. Write
protected media categories cannot be changed (by the DR host) while the Write Protection
mode is enabled. This is true not only for the data, but also for the status of the volumes.
Therefore, it is not possible (by the DR host) to set production volumes from scratch to private
or vice versa. When the DR site has just TS7700Ds, the flash that is initiated during the DR
test is across all TS7700Ds in the DR-Family. As production returns logical volumes to
scratch, deletes them, or reuses them, the DR site holds on to the old version in the flash.
Therefore, return to scratch processing can now run at the production side during a test, and
there is no need to defer it or use expire hold.
For more information about FlashCopy setup, see Chapter 9, “Operation” on page 339. For
DR testing examples, see Chapter 13, “Disaster recovery testing” on page 767.
The following items are extra notes for R3.1 FlashCopy for DR Testing:
Only TS7700 Grid configurations where all clusters are running R3.1 or later, and at least
one TS7720 or TS7760 cluster exists, are supported.
Disk cache snapshot occurs to one or more TS7720 and TS7760 clusters in a DR family
within seconds. TS7740 clusters do not support snapshot.
All logical volumes in a TS7700T CP0 partition, and all logical volumes from CPx kept in
cache, are part of the DR-Flash.
If a TS7740 cluster is present within a DR family, an option is available enabling the
TS7740 live copy to be accessed if it completed replication before time zero of the DR test.
Although the initial snapshot itself does not require any extra space in cache, this might
apply if the TS7720 or TS7760 has its live copy removed for some reason.
Volumes in the TS7720T that are stored in CPx partitions, and that are already migrated to
physical tape, are not part of the DR-Flash. They can still be accessed if the LIVECOPY
Option is enabled and the logical volume was created before time zero.
TS7720 clusters within the DR location should be increased in size to accommodate the
delta space retained during the test:
– Any volume that was deleted in production is not deleted in DR.
– Any volume that is reused in production results in two DR copies (old at time zero
and new).
Automatic removal is disabled within TS7720 clusters during DR test, requiring a
pre-removal to be completed before testing.
LI REQ DR Family settings can be completed in advance, enabling a single LI REQ
command to be run to initiate the flash and start DR testing.
DR access introduces its own independent ownership, and enables DR read-only volumes
to be mounted in parallel to the production-equivalent volumes.
The following terminology is used for FlashCopy for DR Testing:
Live copy A real-time instance of a virtual tape within a grid that can be modified
and replicated to peer clusters.
This is the live instance of a volume in a cluster, the most current
true version of the volume. It can be altered by a production host, or it
can be the content that is created during a DR test.
FlashCopy A snapshot of a live copy at time zero. The content in the FlashCopy is
fixed, and does not change even if the original copy is modified. A
FlashCopy might not exist if a live volume was not present at time zero.
In addition, a FlashCopy does not imply consistency, because the live
copy might have been an obsolete or incomplete replication at time
zero.
Grid-wide problems might occur because a single cluster in the grid experiences a problem.
When problems in one cluster cause it to be sick or unhealthy, but not completely dead (Sick
But Not Dead, or SBND), the peer clusters might be greatly affected, and customer jobs end
up being affected (long mount times, failed sync mode writes, and much more than simply
degraded performance).
Grid Resiliency Improvements are the functions to identify the symptoms and make the grid
more resilient when a single cluster experiences a problem by removing the sick or unhealthy
cluster from the grid, either explicitly or implicitly through different methods. By removing it,
the rest of peer clusters can then treat it as “dead” and avoid further handshakes with it until it
can be recovered.
The grid resiliency function is designed to detect permanent impacts. It is not designed to
react to these situations:
Temporary impacts (such as small network issues)
Performance issues due to high workload
Note that due to the nature of a TS7700 grid, this isolation is not comparable to mechanisms
in disk subsystems, such as HyperSwap or similar techniques. Those techniques are based on
locally installed devices, whereas TS7700 grids can span thousands of miles. Therefore, the
detection can take much longer than in a disk environment, and the resulting actions can also
take longer.
The customer can specify different thresholds (for example, mount timing, handshake and
token timings, and error counters) and other parameters to influence the level of sensitivity to
events that affect performance. To avoid a false fence condition, use the defaults for the
thresholds at the beginning, and adjust the parameters only if necessary.
Local Fence
As in a stand-alone environment, the cluster decides, based on hardware information, that it
has suffered an SBND condition and fences itself. The function is automatically enabled after
R4.1.2 is installed on the cluster, even if other clusters in the grid are not yet running R4.1.2 or
a higher level. The local fence has no parameters or options and cannot be disabled.
Remote Fence
Depending on the parameters, one of the clusters in a grid might detect an unhealthy state of
a peer cluster. In a grid with three or more clusters, all the clusters need to concur that a
specific cluster is unhealthy for it to be fenced.
In a two cluster grid, both clusters need to agree that the same cluster is SBND. Otherwise,
no remote fence occurs.
The remote fence is per default disabled. If the customer enables the remote fence action,
multiple parameters need to be defined:
Primary Action:
– ALERT: An Alert message will be sent to the attached hosts, and the cluster will be
fenced. However, the cluster will remain online and will still be part of the grid. So the
unhealthy situation will not be solved automatically. You might consider this option if
you want to be notified about that SBND condition occurred, but want to execute the
necessary actions manually.
– REBOOT: The SBND cluster will be rebooted. If the reboot is successful, the cluster
will automatically be varied back online to the grid. If the reboot is not successful, the
reboot action will be repeated twice before the cluster remains offline. You might
consider this option if availability of the complete grid is the main target, such as when
the remaining grid resources cannot handle the workload during peak times.
– REBOFF: The SBND cluster will be rebooted, but stays in an offline mode. You might
consider this option if an analysis of the situation is always requested before the cluster
come back to the grid.
– OFFLINE,FORCE: The SBND cluster will be set to offline immediately with no reboot.
This option provides the quickest shutdown, but the reboot action might take longer.
Secondary Action: The customer can enable a secondary option. If it is enabled and the
primary option fails (for example, the primary action cannot be executed), the cluster will
be isolated from the grid. That means that only the grid link ports are disabled, so
there is no communication between the SBND cluster and the other clusters in
the grid. Be aware that there are no actions against the virtual devices in Release 4.1.2. If
virtual devices are still online to the connected IBM Z LPARs, the cluster can still accept
mounts. However, this has multiple negative side effects:
– Replication to and from this cluster is not possible.
– Private mounts can be routed to a cluster where a drive is available, but the ownership
might not be transferable to this cluster because of the grid link isolation. Such a mount is not
executed, and the job hangs.
– Scratch mounts can be successfully executed only if ownership for scratch volumes is
available. To avoid these effects, vary the devices of the isolated cluster offline from all
attached IBM Z LPARs as soon as possible, as shown in the example after this list.
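The following sketch assumes a hypothetical device range of 2F00-2F1F for the isolated cluster; the actual device numbers depend on your I/O configuration. Issue the command on each attached LPAR:
V (2F00-2F1F),OFFLINE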
For more information see IBM TS7700 Series Grid Resiliency Improvements User’s Guide at:
https://www.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/WP102742
Other distributed libraries within the composite library remain available. The host device
addresses that are associated with the site in service send Device State Change alerts to the
host, enabling those logical devices that are associated with the service preparation cluster to
enter the pending offline state.
If a cluster enters service prep, the following copy actions are processed:
All copies in flight (currently running), regardless of whether they are going to or from the
cluster, are finished.
No copies from other clusters to the cluster that is entering the service mode are started.
All logical volumes that have not been copied yet, and that need to have at least one copy
outside the cluster that is entering the service mode, are copied to at least one other
cluster in the grid. This is true for all copies except those that are Time Delayed.
For time delayed copies, all data that should be copied in the next 8 hours to the target
clusters is copied. All other data is not copied, even if the data is only in the cluster that is
entering the service mode.
When service prep completes and the cluster enters service mode, nodes at the site in
service mode remain online. However, the nodes are prevented from communicating with
other sites. This stoppage enables service personnel to run maintenance tasks on the site’s
nodes, run hardware diagnostics, and so on, without affecting other sites.
Only one service prep can occur within a composite library at a time. If a second service prep
is attempted at the same time, it fails. Put only one cluster into service mode at a time.
A site in service prep automatically cancels and reverts to an ONLINE state if any ONLINE
peer in the grid experiences an unexpected outage. The last ONLINE cluster in a multicluster
configuration cannot enter the service prep state. This restriction includes a stand-alone
cluster. Service prep can be canceled by using the MI, or by the IBM SSR at the end of the
maintenance procedure. Canceling service prep returns the subsystem to a normal state.
Important: We advise you not to place multiple clusters at the same time in service or
service preparation. If you must put multiple clusters in service at once, wait for a cluster to
be in final service mode before you start the service preparation for the next cluster.
Even if you have multiple SAA candidates defined, you might consider disabling SAA. Disabling
it is necessary if the number of SAA-selectable devices would otherwise not be sufficient to
run all of the jobs concurrently.
After SAA is enabled again, you should restart all attached OAM address spaces to ensure
that the changed SAA state is recognized by the attached z/OS. If you do not restart the OAM
address spaces, the system might react as though SAA is still disabled.
In smaller grid configurations, put only a single cluster into service at a time to retain the
redundancy of the grid. This is only a suggestion, and does not prevent the action from taking
place, if necessary.
If it is necessary to put multiple clusters in service mode, it is mandatory to bring them back to
normal state together. In this situation a cluster cannot come back online if another cluster is
still in service mode. Using the MI, you need to select each cluster independently and select
Return to normal mode. The clusters wait until all clusters in service mode are brought back
to “normal mode” before they exit the service mode.
Tip: Ensure that you can log on to the MIs of the clusters directly. A direct logon is
possible, but you cannot navigate to or from other clusters in the grid when the cluster is in
service mode.
The TS7700 tracks the grouped devices for all path groups that reported CUIR support,
and does not enter service until they are all varied offline.
The customer can decide whether an automatic online (AONLINE) action is executed when the
cluster returns to an operational state from service. In that case, the logical paths that were
established from each IBM Z LPAR to which the cluster surfaced the original unsolicited attention
receive another unsolicited attention that requests the LPAR to vary the devices back online.
Note: If a device is varied offline for CUIR reasons and is unintentionally left in this state,
the existing MVS VARY XXXX,RESET command can be used to reset the CUIR offline reason,
as shown in the following example. Use this command only for devices that were left in this
state and should no longer be in it.
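A minimal sketch, assuming a hypothetical device number of 2F00 that was left in the CUIR offline state:
V 2F00,RESET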
The following APARs need to be installed: OA52398, OA52390, OA52376, OA52379, and
OA52381.
New LI REQ commands are provided to enable or disable CUIR and AONLINE, and to get an
overview of the current logical drive and path group information. By default, CUIR is disabled.
For more information, see the IBM TS7700 Series Control Unit Initiated Reconfiguration
(CUIR) User’s Guide at:
https://www.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/WP102743
Although TS7760 disk-only configurations can store over 1.3 PB of post-compressed content
in disk cache, your capacity needs might far exceed that amount, especially when a large portion of
your workload does not demand the highest read-hit percentage. This is when the
introduction of a TS7700T makes sense.
These clusters are connected through a local area network (LAN). If one of them becomes
unavailable because it failed, is being serviced, or is being updated, data can be accessed
through the other TS7700 cluster until the unavailable cluster is recovered. The assumption is
that continued access to data is critical, and no single point of failure, repair, or upgrade can
affect the availability of data.
For these configurations, the multi-cluster grid can act as both an HA and DR configuration
that assumes that all host and disk operations can recover at the metro distant location.
However, metro distances might not be ideal for DR, because some disasters can affect an
entire metro region. In this situation, a third location is ideal.
Depending on the distance to your DR data center, consider connecting your grid members in
the DR location to the host in the local site.
As part of planning a TS7700 grid configuration to implement this solution, consider the
following information:
Plan for the necessary WAN infrastructure and bandwidth to meet the copy requirements
that you need. You generally need more bandwidth if you are primarily using a Copy
Consistency Point of SYNC or RUN, because any delays in copy time that are caused by
bandwidth limitations can result in an elongation of job run times.
If you have limited bandwidth available between sites, copy critical data with a consistency
point of SYNC or RUN, with the rest of the data using the Deferred Copy Consistency
Point. Consider introducing cluster families only for three or more cluster grids.
Depending on the distance, the latency might not support the use of RUN or SYNC at all.
Under certain circumstances, you might consider the implementation of an IBM
SAN42B-R SAN Extension Switch to gain higher throughput over large distances.
This example is a two-site scenario where the sites are separated by a distance of 10 km
(6.2 miles). The customer has large capacity needs and limited read activity, so two
TS7760Ts were installed, one at each site. Because of the limited distance, both clusters are
FICON-attached to each host.
The client chooses to use Copy Export to store a third copy of the data in an offsite vault
(Figure 2-21).
Figure 2-21 Two clusters (TS7740 or TS7700T), Cluster 0 at Site/Campus A and Cluster 1 at Site/Campus B, each FICON-attached to the hosts and connected through a WAN of less than 10 km
The client runs many OAM and HSM workloads, so the large cache of the TS7760 provides
the necessary bandwidth and response times. Also, the client wanted to have a third copy on
physical tape, which is provided by the TS7760T in the remote location.
Figure: Three-cluster hybrid grid with two TS7700D clusters (Cluster 0 and Cluster 1) and a TS7740 or TS7700T (Cluster 2), Host A and Host B FICON-attached (optionally through DWDM or channel extension), an optional additional host, and the clusters interconnected through the WAN
Figure: A variation of the same three-cluster hybrid grid, with Host A and Host B FICON-attached to both TS7700D clusters, optional DWDM or channel extension, an optional host, and the TS7740 or TS7700T (Cluster 2) reachable through the WAN
By merging environments, the client can address the requirements for DR and still use the
existing environment.
Figure: Merged environment with Host A, Host B, and Host C FICON-attached to a multi-cluster hybrid grid of TS7700D clusters and a TS7740 or TS7700T, interconnected through the WAN
The IBM TS7700 offers a great variety of configuration options, features, functions, and
implementation parameters. Many of the options serve different purposes, and their
interactions with other options can affect how they contribute to achieving your business
goals.
For some environments, these features and functions are mandatory, whereas for other
environments, they only raise the level of complexity. There is no “one configuration and
implementation fits all needs” solution. Therefore, you need a plan to build an environment to
meet your requirements.
This chapter summarizes what to consider during the planning phase, especially for
introducing new features. It also provides some general suggestions about considerations for
the day-to-day operation of a TS7700 environment.
In each new release of the IBM Virtual Tape Server (VTS), IBM has delivered new features to
support your most demanding client needs. Consider that although some of these functions
are needed in your environment, others are not:
Some features are independent from all others, others are not.
Certain features have a strong effect on the behavior of your environment, for example,
performance or data availability.
Specific features influence your setup of the environment.
Some features can be overruled by Override settings.
Your batch planners ensured that, if multifile volumes were used, the files belonged to the same
application, and that the same rules (duplexing and moving) applied. Sharing resources
between multiple different logical partitions (LPARs), multiple IBM Z operating systems, or
even multiple clients was mostly not wanted or needed. You always knew where your
data on tape was located. Another important aspect was that users did not expect fast read
response times for data on tape.
Due to cost pressures, businesses are enforcing tiered storage environments. Older or
infrequently used data must be on less expensive storage, whereas highly accessed data must stay
on primary storage, which enables fast access. Applications such as Content Manager,
Hierarchical Storage Manager (HSM), or output archivers are rule-based, and are able to shift
data from one storage tier to another, which can include tape. If you are considering using
such applications, the tier concept needs to be planned carefully.
Also, the TS7700 itself decides which workloads must be prioritized. Depending on the
cluster availability in the grid, actual workload, or other storage conditions, the copy queues
might be delayed. In addition, the TS7700 automates many decisions to provide the most
value. This dynamic behavior can sometimes result in unexpected behaviors or delays.
Understanding how your environment behaves, and where your data is stored at any point in
time, is key to having a successful implementation, including the following considerations:
During a mount, a remote Tape Volume Cache (TVC) was chosen over a local TVC.
Copies are intentionally delayed due to configuration parameters, yet they were expected
to complete sooner.
Copy Export sets do not include all of the expected content because the export was
initiated from a cluster that was not configured to receive a replica of all the content.
Other features, such as scratch allocation assistance (SAA) and device allocation assistance
(DAA), might affect your methodology of drive allocation, whereas some customizing
parameters must always be used if you are a Geographically Dispersed Parallel Sysplex
(GDPS) user.
So, it is essential for you to understand these mechanisms to choose the best configuration
and customize your environment. You need to understand the interactions and dependencies
to plan for a successful implementation.
Important: There is no “one solution that fits all requirements.” Do not introduce complexity
when it is not required. Allow IBM to help you look at your data profile and requirements so
that the best solution can be implemented for you.
It can be difficult to get all of the required information from the owners of the data and the
owners of the applications to best manage the data. Using service level agreement (SLA)
requirements and an analysis of your existing tape environment can help with the process.
If your approach is that each data center can host the total workload, plan your environment
accordingly. Consider the possible outage scenarios, and verify whether any potential
degradations for certain configurations can be tolerated by the business until the full
equipment is available again.
However, more advanced TS7700 configurations can be implemented that treat availability
and data protection as equally important, for example, a four-cluster grid.
Consider what type of data you store in your TS7700 environment. Depending on your type of
data, you have multiple configuration choices. This section starts with a general view before
looking closer at the specific types of data.
Data from your sandbox, test, UAT, or production system might share the tape environment,
but it can be treated differently. That is important for sizing, upgrades, and performance
considerations as well.
Backup data
The data on tape is only a backup. Under normal conditions, it will not be accessed again. It
might be accessed again only if there are problems, such as direct access storage device
(DASD) hardware problems, logical database failures, and site outages.
Expiration
The expiration is mostly a short or medium time frame.
Availability requirements
If the tape environment is not available for a short time, the application workload can still run
without any effect. While the solution is unavailable, the backup to tape cannot be processed.
Retrieval requirements
Physical tape recall can normally be tolerated, or at least for previous generations of
the backup.
Multiple copies
Depending on your overall environment, a single copy (not in the same place as the primary
data) might be acceptable, perhaps on physical tape. However, physical media might fail or a
storage solution or its site might experience an outage. Therefore, one or more copies are
likely needed. These copies might exist on more media within the same location or ideally at a
distance from the initial copy.
If you use multiple copies, a Copy Consistency Point of Deferred might suffice, depending on
your requirements.
Expiration
The expiration depends on your application.
Availability requirements
When the tape environment is not available, your original workload might be severely
affected.
Retrieval requirements
Physical tape recalls might not be tolerated, depending on your data source (sandbox, test, or
production) or the type of application. Older, less-accessed active data might tolerate physical
tape recalls.
Depending on the recovery point objectives (RPO) of the data, choose an appropriate
Consistency Point Policy. For example, synchronous mode replication is a good choice for
these workloads because it can achieve a “zero point RPO at sync point” granularity.
Especially for DFSMShsm ML2, HSM backups, and OAM objects, use the synchronous mode
copy.
Expiration
The expiration depends on your application, but it is usually many years.
Availability requirements
Archive data is seldom accessed for read. If the tape environment is not available, your
original workload might still be affected because you cannot write new archive data.
Retrieval requirements
Physical tape recalls might be tolerated.
Multiple copies
Although the tape is the primary source, a single copy is not suggested because even a single
media failure results in data loss. Store multiple copies in different locations to be prepared for a data
center loss. In a stand-alone environment, dual copies on physical tape are suggested.
Depending on the criticality of the data, choose an appropriate Copy Consistency Point
Policy.
Archive data sometimes must be kept for 10 - 30 years. During such long time periods, the
technology progresses, and data migration to newer technology might need to take place. If
your archive data is on physical tapes in a TS7740/TS7700T, you must also consider the life
span of physical tape cartridges. Some vendors suggest that you replace their cartridges
every five years, other vendors, such as IBM, offer tape cartridges that have longer lifetimes.
If you are using a TS7740/TS7700T and you store archive data in the same storage pools
with normal data, there is a slight chance that, due to the reclaim process, the number of
stacked volumes that contain only archive data will increase. In this case, these cartridges
might not be used (either for cartridge reads or reclaim processing) for a longer time. Media
failures might not be detected. If you have more than one copy of the data, the data can still
be accessed. However, you have no direct control over where this data is stored on the
stacked volume, and the same condition might occur in other clusters, too.
Therefore, consider storing data with such long expiration dates on a specific stacked volume
pool. Then, you can plan regular migrations (even in a 5 - 10-year algorithm) to another
stacked volume pool. You might also decide to store this data in the common data pool.
Depending on your choice, the tape environment is more or less critical to your DB2
application. This depends also on the number of active DB2 logs that you define in your DB2
environment. In some environments, due to peak workload, logs are switched every two
minutes. If all DB2 active logs are used and they cannot be archived to tape, DB2 stops
processing.
Scenario
You have a four-cluster grid, spread over two sites. A TS7760D and a TS7760T are at each
site. You store one DB2 archive log directly on tape and the other archive log on disk. Your
requirement is to have two copies on tape:
Using the TS7760 can improve your recovery (no recalls from physical tape needed).
Having a consistency point of R, N, R, N provides two copies, which are stored in both
TS7760s. If one TS7760 is available, DB2 archive logs can be stored to tape. However, if
one TS7760 is not available, you have only one copy of the data. In a DR situation where
one of the sites is not usable for a long time, you might want to change your policies to
replicate this workload to the local TS7760T as well.
If the TS7760D enters the Out of cache resources state, new data and replications to that
cluster are put on hold. To avoid this situation, consider having this workload also target
the TS7760T and enable the Automatic Removal policy to free space in the TS7760D.
Until the Out of cache resources state is resolved, you might have fewer copies than
expected within the grid.
If one TS7760D is not available, all mounts must be run on the other TS7760D cluster.
In the unlikely event that both TS7760Ds are not reachable, DB2 stops working as soon as
all DB2 logs on the disk are used.
Having a consistency point of R, N, R, D provides you with three copies, which are stored
in both TS7760Ds and in the TS7760T of the second location. That exceeds your original
requirement, but in an outage of any component, you still have two copies. In a loss of the
primary site, you do not need to change your DB2 settings because two copies are still
written. In an Out of Cache resources condition, the TS7760D can remove the data from
cache because there is still an available copy in the TS7760T.
Note: Any application with the same behavior can be treated similarly.
Ideally, DFSMShsm ML2 workloads should be created with synchronous mode copy to
ensure that a data set is copied to a second cluster before the DFSMShsm migration
processes the next data set. The DFSMShsm application marks the data set candidates for
deletion in DASD. With z/OS 2.1, MIGRATION SUBTASKING enables DFSMShsm to offload
more than one data set at a time, so it can do batches of data sets per sync point.
Using TS7700 replication mechanisms rather than DFSMShsm local duplexing can save
input/output (I/O) bandwidth, improve performance, reduce the number of logical volumes that
are used, and reduce the complexity of bringing up operations at a secondary location.
Other vendor applications might support similar processing. Contact your vendor for more
information.
Tip: To gain an extra level of data protection, run ML2 migration only after a DFSMShsm
backup runs.
Users accessing the data on tape (in particular on the TS7700T or TS7740) might have to wait
for their document until it is read from physical media. The TS7700D or the TS7700T CP0 is
traditionally a better option for such a workload because the disk cache residency can be much
longer and even indefinite.
For OAM primary objects, use Synchronous mode copy on two clusters and depending on
your requirements, additional immediate or deferred copies elsewhere if needed.
With OAM, you can also have up to two backup copies of your object data. Backup copies of
your data (managed by OAM) are in addition to any replicated copies of your primary data
that are managed by the TS7700. Determine the copy policies for your primary data and any
additional OAM backup copies that might be needed. The backups that are maintained by
OAM are only used if the primary object is not accessible. The backup copies can be on
physical or virtual tape.
Rarely accessed data that does not demand quick access times can be put on a TS7700T
tape partition with PG0 and no premigration delay, or on the TS7700T in a second cluster.
Data that becomes less important with time can also use the TS7700D or TS7700T CP0
auto-removal policies to benefit from both technologies.
Assume that you have the same configuration as the DB2 archive log example:
With a Consistency Copy Point policy of [N,R,N,R], your data is stored only on the
TS7700T CPx or TS7740s (fast access is not critical).
With a Consistency Copy Point policy of [R,N,R,N], your data is stored only on the
TS7700Ds (fast access is critical).
With a Consistency Copy Point policy of [R,D,R,D], your copy is on the TS7700Ds first and
then also on the TS7700Ts, enabling the older data to age off the TS7700Ds by using the
auto-removal policy.
Data that needs a 100% cache hit (for example, OAM objects as primary data, or HSM ML2): use a TS7700 disk-only cluster with Pinned or PG1, or a TS7700T CP0 with Pinned or PG1.
Data that benefits from a longer period in disk cache, depending on the user requirements (for example, OAM objects as primary data, or HSM ML2): use a TS7700D with PG1 and autoremoval, or a TS7700T CPx with PG1.
Data that is needed in cache for a specific time, but then should be kept on tape (for example, DB2 log files, depending on your requirements, or active batch data): use a TS7700T CPx with PG1 and delay premigration, or a TS7740 with PG1.
If you cannot tolerate any of these items, consider implementing a grid environment.
All applications within a Parallel Sysplex can use the same logical device ranges and logical
volume pools, simplifying sharing resources. When independent sysplexes are involved,
device ranges and volume ranges are normally independent, but are still allowed to share the
disk cache and physical tape resources.
Of all the sharing use cases, most share the FICON channels into the TS7700. Although the
channels can also be physically partitioned, it is not necessary because each FICON channel
has access to all device and volume ranges within the TS7700.
When independent sysplexes are involved, the device ranges and corresponding volume
ranges can be further protected from cross-sysplex access through the SDAC feature.
When device partitioning is used, consider assigning the same number of devices per cluster
per sysplex in a grid configuration so that the availability for a given sysplex is equal across all
connected clusters.
Override policies set in the TS7700 apply to the whole environment and cannot be enabled or
disabled by an LPAR or client.
For more considerations, see the Guide to Sharing and Partitioning IBM Tape Library Data,
SG24-4409.
Note: Some parameters can be updated by the Library Request command. This
command changes the cluster behavior. This is not only valid for the LPAR where the
command was run, but for all LPARs that use this cluster.
Ensure that only authorized personnel can use the Library Request command.
If you share a library for multiple customers, establish regular performance and resource
usage monitoring. See 3.4, “Features and functions available only for the TS7700T” on
page 128.
Note: With APAR OA49373 (z/OS V2R1 and later), the individual IBM MVS LIBRARY
command functions (EJECT, REQUEST, DISPDRV, and so on) can be protected using a
security product such as RACF. This APAR adds security product resource-names for each
of the LIBRARY functions.
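As an illustrative sketch only, using a generic MVS.LIBRARY.** profile in the OPERCMDS class and a hypothetical TAPEADM group (the exact resource names that the APAR adds are described in its documentation), the LIBRARY command functions could be restricted along these lines:
RDEFINE OPERCMDS MVS.LIBRARY.** UACC(NONE)
PERMIT MVS.LIBRARY.** CLASS(OPERCMDS) ID(TAPEADM) ACCESS(UPDATE)
SETROPTS RACLIST(OPERCMDS) REFRESH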
TVC selection is also influenced by some LI REQ parameters. For more information about the
parameters LOWRANK and SETTINGS,PHYSLIB, see the Library Request Command white paper,
found at:
http://www.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/WP101091
TVC selection might also influence the Copy Export. See 12.1, “Copy Export overview and
considerations” on page 736.
If you do not care which TVC is chosen and you prefer a balanced grid, use [D, D, D, D].
With the new Time Delayed Replication policy, you can now decide that certain data is only
copied to other clusters after a specified time. This policy is designed for data that usually has
a short lifecycle, and is replaced shortly with more current data, such as backups and
generation data groups (GDGs). In addition, this policy can also be used for data with an
unknown retention time, and where the data should be copied only to another cluster when
this data is still valid after the given time. Time Delayed Replication policy is targeted
especially for multi-cluster grids (3 or more).
However, plan to have at least two copies for redundancy purposes, such as on a local
TS7700D/TS7700T and a remote TS7700D/TS7700T.
This consistency point is ideal for applications that move primary data to tape, such as
DFSMShsm or OAM Object Support, which can remove the primary instance in DASD after
issuing an explicit sync point.
Therefore, you should use Synchronous mode copy for these types of applications.
The synchronous mode copy offers three options for how to handle private mounts:
Always open both instances on private mount.
Open only one instance on private mount.
Open both instances on z/OS implied update.
Plan the usage of this option. Dual open is necessary for workloads that can append to
existing tapes. When only reads are taking place, the dual open can introduce unnecessary
resource use, especially when one of the instances requires a recall from a physical tape.
Using the dual open z/OS implied update helps reduce any resource use to only those
mounts where an update is likely to take place.
In addition, synchronous mode copy provides an option to determine its behavior when both
instances cannot be kept in sync. One option is to move to the synch-deferred state. Another
option is to fail future write operations. Depending on your requirements, determine whether
continued processing is more important than creating synchronous redundancy of the
workload. For more information, see 2.3.5, “Copy consistency points” on page 68.
The Override policies are cluster-based. They cannot be influenced by the attached hosts or
policies. With Override policies, you can influence how the TS7700 cluster selects the TVC
during the mount operation, and whether a copy needs to be present in that cluster (for
example, favoring the local mount point cluster).
Therefore, grid configurations with three or more clusters can benefit from cluster families.
A scratch category can have a defined expiration time, enabling the volume contents for those
volumes that are returned to scratch to be automatically deleted after a grace period passes.
The grace period can be configured from 1 hour to many years. Volumes in the scratch
category are then either expired with time or reused, whichever comes first.
If physical tape is present, the space on the physical tape that is used by the deleted or
reused logical volume is marked inactive. Only after the physical volume is later reclaimed or
marked fully inactive are the tape and all of its inactive space reused. After the volume is deleted or
reused, content that was previously present is no longer accessible.
An inadvertent return to scratch might result in loss of data, so a longer expiration grace
period is suggested to enable any return-to-scratch mistakes to be corrected within your host
environment. To prevent reuse during this grace period, enable the additional hold option.
This provides a window of time during which a host-initiated mistake can be
corrected, enabling the volume to be moved back to a private category while retaining the
previously written content.
Plan ahead if you are also using Copy Export or plan to use the Grid to Grid migration tool in
the future. The receiving clusters/grids need to be capable of reading the compressed logical
volumes, so they need to have the same capability (R4.1.2 microcode or later).
Using the software compression, at least during the migration period, can affect capacity
planning. The reduction applies not only to the GB stored in cache or on physical tape. It also reduces the
amount of data in the premigration queue (FC 5274), the amount of data that needs to be copied
(which might result in a better RPO), and the necessary bandwidth for the physical tapes.
3.3.10 Encryption
Depending on your legal requirements and your type of business, data encryption might be
mandatory.
This allocation routine is aware of whether a grid crossed the virtual tape scratch threshold and takes
this into account for the mount distribution. All other information (such as TVC usage, premigration
queue length, and TVC LOWRANK) is not available at this point in time and is not used
for grid selection.
Remember: Support for the allocation assistance functions (DAA and SAA) was initially
only supported for the job entry subsystem 2 (JES2) environment. With z/OS V2R1, JES3
is also supported.
If you use the allocation assistance, the device allocation routine in z/OS is influenced by
information from the grid environment. Several aspects are used to find the best mount point
in a grid for this mount. For more information, see 2.3.15, “Allocation assistance” on page 76.
Depending on your configuration, your job execution scheduler, and any automatic allocation
managers you might use, the allocation assist function might provide value to your
environment.
If you use any dynamic tape manager, such as the IBM Automatic Tape Allocation Manager,
plan the introduction of SAA and DAA carefully. Some dynamic tape managers manage
devices in an offline state. Because allocation assist functions assume online devices, issues
can surface.
Therefore, consider keeping some drives always online to a specific host, and leave only a
subset of drives to the dynamic allocation manager. Alternatively, discontinue working with a
dynamic tape allocation manager.
Automatic tape switching (ATSSTAR), which is included with z/OS, works with online devices,
and is compatible with DAA and SAA.
To avoid any performance effect, review your installation before you use the 25 GB volumes.
Unlike other IBM Z availability functions, such as system failure management for z/OS LPARs
or HyperSwap for disks, this feature does not react in seconds. The grid technology is
designed not only for local implementations, but also for remote data placement, often
thousands of miles away. Therefore, timeouts and retries must be much longer to cover
temporary network issues.
Although the function supports defining very small threshold parameters, change the defaults
only after you analyze the TS7700 grid environment, to prevent a false fence.
The secondary action (isolate the cluster from the network) should be considered only if a
clear automated action plan is defined and the effect on your production is fully understood.
Note: The z/OS host must include APAR OA52376 with code level V2R2 and later.
If software products such as Automatic Tape Allocation Manager (ATAM) or other third-party
vendor products are used, review whether CUIR is beneficial for your environment. After
CUIR is used to vary the drives offline, the usual z/OS online command cannot be used to vary
the devices online after the service has finished.
In addition, delay premigration was introduced to help manage the movement of data to
physical tape. By using policies that can delay premigration of specific workloads from one to
many hours, only content that has not yet expired when the delay period passes ends up on
tape. This creates a solution where the aged or archive component of a workload is the only
content that moves to tape. Until then, the data is only resident in disk cache.
When the data expires from a host perspective while it is still in cache, it is not premigrated or
migrated to a tape. That reduces your back-end activities (migrate and reclaim).
In addition, optional checks might be useful, especially after complex migrations or changes
in your environment.
For a complete list of all possible CBR3750 submessages, see the IBM Virtualization Engine
TS7700 Series Operator Informational Messages white paper:
https://www.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/WP101689
Identify the appropriate action that an operator or your automation tool must run. Introduce
the new messages into your automation tool (with the appropriate action) or flag the messages
for human intervention.
In addition, you can enhance the messages and extend the message text with
user-defined content.
If you modify your environment in this way, back up the modifications to make sure you can reload your
changes if a microcode issue occurs.
The TS7700 keeps performance data for the last 90 days. If more than 90 days of history is required,
run tools periodically to collect and store the information. Set up
regular Bulk Volume Information Retrieval (BVIR) runs and keep the data. Check this data on
a periodic basis to see the usage trends, especially for shortage conditions.
Consider running copy audits after major changes in the environment, such as joins and merges,
and before the removal of one or more clusters. You can also run the Copy Audit periodically
as a method to audit your expected business continuance requirements.
TS7700 Release 3.3 introduces a new data migration method that is called Grid to Grid
Migration (GGM), which is offered as a service from IBM.
Although the command is submitted from a host, the data is copied internally through the
grid links. There is no host I/O through the FICON adapters, and all entries in the TCDB and
the tape management system remain unchanged.
The GGM tool should be considered if the following situations are true:
There are already six clusters installed in the grid.
The Join and Copy Refresh processing cannot be used (there are floor space
requirements, microcode restrictions, or other considerations).
The source and target grids are maintained by different providers.
The GGM tool also provides several different options, such as how the new data (new device
categories) and the old data (keep or delete the data in the source grid) is treated.
To access the data in the new grid, TCDB and the TMC must be changed. These changes are
the responsibility of the customer, and must be processed manually.
The GGM is controlled by the LI REQ command, and reporting is provided by additional BVIR
reports. A summary of this command can be found in the following white paper:
http://www.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/PRS5328
In addition, several supporting tools to create the necessary input control statements, and the
necessary TCDB entry changes and TMC entry changes, are provided at the IBM Tape Tool
website:
ftp://public.dhe.ibm.com/storage/tapetool
For more information, see Chapter 8, “Migration” on page 299, or ask your local IBM SSR.
With Release 3.3, you now can mix the TS1150 with only one older drive technology. This
intermix is for migration purposes because a TS1150 cannot read content from JA and JB
cartridges.
With reclamation, the data from the discontinued media is moved to the new media. If you do
not want that to occur, set the “Sunset Media Reclaim Threshold Percentage
(%)” for the specific physical pool on the MI to 0, so that no reclaim runs occur for the discontinued
media inside that pool.
Remember: For this chapter, the term tape library refers to the IBM TS3500 and TS4500
tape libraries.
Figure: TS7700 V4.1 components at Site-A and Site-B, with z/VM, z/OS, z/VSE, and z/TPF hosts attached through FICON directors (SAN Fabric 0 and SAN Fabric 1, 2-4 4 Gb or 2-8 8 Gb connections) and channel extenders with 2x 1 Gbit links
For a detailed listing of system requirements, see IBM TS7700 R4.1 at IBM Knowledge
Center:
https://www.ibm.com/support/knowledgecenter/STFS69_4.1.0/ts7740_system_requirements.html
Power: 240 Vac, 15 amp (single phase)
Unit height: 36 U or 40 U, depending on the frame
Operating environment (high altitude): 10°C - 28°C (50°F - 82.4°F), 5001 - 7000 ft AMSL, 20% - 80%, 23°C (73°F)
Power considerations
Your facility must have ample power to meet the input voltage requirements for the TS7700.
The standard 3952 Tape Frame includes one internal power distribution unit. However,
feature code 1903, Dual AC power, is required to provide two power distribution units to
support the high availability (HA) characteristics of the TS7700. The 3952 Storage Expansion
Frame has two power distribution units and requires two power feeds.
Table 4-4 TS7720 Storage Expansion Frame maximum input power requirements: current 20 A
Table 4-7 TS7760 Storage Expansion Frame maximum input power requirements: current 24 A
The TS7740, TS7720T, and TS7760T support the 3592 Extended Tape Cartridge (JB) media
and require TS1120 Model E05 tape drives in E05 mode, TS1130 Model E06/EU6 tape
drives, or TS1140 Model E07 or EH7 tape drives. Alternatively, they require a heterogeneous
setup involving the TS1150 Model E08 or EH8 tape drives and either of the TS1120 Model
E05, TS1130 Model E06/EU6, or TS1140 Model E07 or EH7 tape drives, depending on the
library generation.
In a TS3500 tape library, all of these tape drives and media are supported. In a TS4500 tape library,
only TS1140 and TS1150 drives with the corresponding media are supported.
The TS7740, TS7720T, and TS7760T tape encryption (FC 9900) requires that all back-end
drives are encryption capable. TS1130 Model E06/EU6, TS1140 Model E07 or EH7, and
TS1150 Model E08 or EH8 drives are encryption capable. TS1120 Model E05 tape drives in
E05 mode are encryption capable, with either FC 9592 from the factory or FC 5592 as a field
upgrade.
Support for the fourth generation of the 3592 drive family is included in TS7700 Release 2.0
PGA1. At this code level, the TS1140 tape drive that is attached to a TS7740 and TS7720T
cannot read JA or JJ media. Ensure that data from all JA and JJ media has been migrated to
JB media before you replace older-generation tape drives with TS1140 drives. Starting with
Release 2.1 PGA0 the reading of JA and JJ media by the TS1140 drive is supported. The
client can choose to keep the data on the JA/JJ media or can plan to migrate the data to
newer generations of media.
TS1150 tape drives can be intermixed with some other drive types. The E08 drives are
supported in a limited heterogeneous configuration.
The new media types JD and JL are supported by the TS7700. Up to 10 TB of data can be
written to the JD cartridge in the 3592 E08 tape drive recording format. Up to 2 TB of data can
be written to the JL cartridge in the 3592 E08 tape drive recording format. The 3592 E08 tape
drive also supports writing to the prior-generation media types JK and JC. When the 3592 E08
recording format is used starting at the beginning of tape, up to 7 TB of data can be written to
a JC cartridge and up to 900 GB of data can be written to a JK cartridge.
The 3592 E08 tape drive does not support reading or writing to media type JJ, JA, and JB
cartridges. The TS7700 does not support any type of Write Once Read Many (WORM)
physical media.
3592-J1A drives read and write only the oldest media generation, and 3592-E05 drives read and write only the two oldest generations (the E05 can read and write EFMT1 when operating in native or J1A emulation mode). Model E07/EH7 drives can read JA and JJ cartridge types only with a tape drive firmware level of D3I3_5CD or higher, and of the older media they write to cartridge type JB only. Model E08/EH8 drives read and write only the two newest media generations.
Table 4-9 summarizes the tape drive models, capabilities, and supported media by tape drive
model.
Table 4-9 3592 Tape Drive models and characteristics versus supported media and capacity (columns: 3592 drive type, supported media type, encryption support, capacity, and data rate)
Notes:
To use tape encryption, all drives that are associated with the TS7740, TS7720T, or TS7760T
must be Encryption Capable and encryption-enabled.
Encryption is not supported on 3592 J1A tape drives.
The media type is the format of the data cartridge. The media type of a cartridge is shown by
the last two characters on standard bar code labels. The following media types are supported:
JA: An Enterprise Tape Cartridge (ETC)
A JA cartridge can be used in native mode in a 3592-J1A drive or a 3592-E05 Tape Drive
operating in either native mode or J1A emulation mode. The native capacity of a JA tape
cartridge that is used in a 3592-J1A drive or a 3592-E05 Tape Drive in J1A emulation
mode is 300 GB, equivalent to 279.39 gibibytes (GiB). The native capacity of a JA tape
cartridge that is used in a 3592-E05 Tape Drive in native mode is 500 GB (465.6 GiB). The
native capacity of a JA tape cartridge that is used in a 3592-E06 drive in native mode is
640 GB (596.04 GiB).
The following media identifiers are used for diagnostic and cleaning cartridges:
CE: Customer Engineer diagnostic cartridge for use only by IBM SSRs. The VOLSER for
this cartridge is CE xxxJA, where a space occurs immediately after CE and xxx is three
numerals.
CLN: Cleaning cartridge. The VOLSER for this cartridge is CLN xxxJA, where a space
occurs immediately after CLN and xxx is three numerals.
Important: WORM cartridges, including JW, JR, JX, JY, and JZ, are not supported.
Capacity scaling of 3592 tape media is also not supported by TS7740, TS7720T, and
TS7760T.
When you change the model of the 3592 tape drives of an existing TS7740, TS7720T or
TS7760T, the change must be in the later version direction, from an older 3592 tape drive
model to a newer 3592 tape drive model.
3592 E08 drives can be mixed with one other previous generation tape drive through
heterogeneous tape drive support, which allows a smooth migration of existing TS7700 tape
drives with older tape drives to TS1150 tape drives.
For more information, see 7.2.6, “Upgrading drive models in an existing TS7740 or TS7700T”
on page 275.
For this reason, during the code upgrade process, one grid can have clusters that are
simultaneously running three different levels of code. Support for three different levels of
code is available on a short-term basis (days or a few weeks), which should be long
enough to complete the Licensed Internal Code upgrade in all clusters in a grid. The
support for two different levels of code in a grid enables an indefinite coexistence of
V06/VEA and V07/VEB/VEC clusters within the same grid.
Figure 4-2 shows you the different networks and connections that are used by the TS7700
and associated components. This two-cluster TS7740/TS7720T/TS7760T grid shows the
TS3500 and TS4500 tape library connections (not present in a TS7720D and TS7760D
configuration).
The TS7700 grid IP network infrastructure must be in place before the grid is activated so that
the clusters can communicate with one another as soon as they are online. Two or four 1-GbE
or 10-GbE connections must be in place before grid installation and activation.
The default configuration for a R4.1 TS7700 server from manufacturing (3957-VEC) is two
dual-ported PCIe 1-GbE adapters. You can use FC 1038 (10 Gb dual port grid optical LW
connection) to choose support for two 10-Gb optical LW Ethernet adapters instead.
If the TS7700 server is a 3957-V07, 3957-VEB or 3957-VEC, two instances of either FC 1036
(1 Gb grid dual port copper connection) or FC 1037 (1 Gb dual port optical SW connection)
must be installed. You can use FC 1034 to activate the second port on dual-port adapters.
Clusters that are configured with four 10-Gb, two 10-Gb, four 1-Gb, or two 1-Gb links
can be interconnected within the same TS7700 grid, although the explicit same port-to-port
communications still apply.
Important: Identify, order, and install any new equipment to fulfill grid installation and
activation requirements. The connectivity and performance of the Ethernet connections
must be tested before grid activation. You must ensure that the installation and testing of
this network infrastructure is complete before grid activation.
To avoid performance issues, the network infrastructure should not add packet metadata
(increase its size) to the default 1500-byte maximum transmission unit (MTU), such as with
an encryption device or extender device.
The network between the TS7700 clusters in a grid must have sufficient bandwidth to account
for the total replication traffic. If you are sharing network switches among multiple TS7700
paths or with other devices, the total bandwidth of the network must be sufficient to account
for all of the network traffic.
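As a rough, illustrative calculation only (your actual replication profile determines the real requirement): replicating 10 TB of compressed data per day, spread evenly over 24 hours, needs approximately 10 TB x 8 bits per byte / 86,400 seconds, or about 0.93 Gbps of sustained grid bandwidth, before protocol overhead and peak-period spikes are considered.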
The TS7700 clusters attempt to drive the grid network links at the full speed that is allowed by
the adapter (1 Gbps or 10 Gbps rate), which might exceed the network infrastructure
capabilities. The TS7700 supports the IP flow control frames so that the network paces the
level at which the TS7700 attempts to drive the network. The preferred performance is
achieved when the TS7700 can match the capabilities of the underlying grid network,
resulting in fewer dropped packets.
Remember: When the grid network capabilities are below TS7700 capabilities, packets
are lost. This causes TCP to stop, resync, and resend data, resulting in a less efficient use
of the network. Flow control helps to reduce this behavior. 1-Gb and 10-Gb clusters can be
within the same grid, but compatible network hardware must be used to convert the signals
because 10 Gb cannot negotiate down to 1 Gb.
Note: It is advised to enable flow control in both directions to avoid grid link performance
issues.
To maximize throughput, ensure that the underlying grid network meets these requirements:
Has sufficient bandwidth to account for all network traffic that is expected to be driven
through the system to eliminate network contention.
Can support the flow control between the TS7700 clusters and the switches, which
enables the switch to pace the TS7700 to the WAN capability. Flow control between the
switches is also a potential factor to ensure that the switches can pace their rates to one
another. The performance of the switch should be capable of handling the data rates that
are expected from all of the network traffic.
Latency can be defined as the time interval elapsed between a stimulus and a response. In the
network world, latency can be understood as how much time it takes for a data package to
travel from one point to another in a network infrastructure. This delay is introduced by some
factors, such as the electronic circuitry used in processing the data signals, or plainly by the
universal physics constant, the speed of light. Considering the current speed of data
processing, this is the most important element for an extended distance topology.
In short, latency between the sites is the primary factor. However, packet loss due to bit error
rates or insufficient network capabilities can cause TCP to resend data, which multiplies the
effect of the latency.
The TS7700 uses the client's LAN/WAN to replicate virtual volumes, access virtual volumes
remotely, and run cross-site messaging. The LAN/WAN must have adequate bandwidth to
deliver the throughput necessary for your data storage requirements.
The cross-site grid network is 1 GbE with either copper (RJ-45) or SW fiber (single-ported or
dual-ported) links. For copper networks, CAT5E or CAT6 Ethernet cabling can be used, but
CAT6 cabling is preferable to achieve the highest throughput. Alternatively, two or four 10-Gb
LW fiber Ethernet links can be provided. Internet Protocol Security (IPSec) is now supported
on grid links to support encryption.
For TS7700 clusters configured in a grid, the following extra assignments must be made for
the grid WAN adapters. For each adapter port, you must supply the following information:
A TCP/IP address
A gateway IP address
A subnet mask
Tip: In a TS7700 multi-cluster grid environment, you must supply two or four IP addresses
per cluster for the physical links that are required by the TS7700 for grid cross-site
replication.
The TS7700 provides up to four independent 1 Gb copper (RJ-45) or SW fiber Ethernet links
for grid network connectivity, or up to four 10 Gb LW links. To be protected from a single point
of failure that can disrupt all WAN operating paths to or from a node, connect each link
through an independent WAN interconnection.
Note: It is a strongly preferred practice that the primary and alternative grid interfaces exist
on separate subnets. Plan different subnets for each grid interface. If the grid interfaces are
directly connected (without using Ethernet switches), you must use separate subnets.
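As an illustration only, with hypothetical addresses, a two-link grid interface plan that follows this practice might look like the following example:
Grid link 1: IP address 10.10.1.21, subnet mask 255.255.255.0, gateway 10.10.1.1
Grid link 2: IP address 10.10.2.21, subnet mask 255.255.255.0, gateway 10.10.2.1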
Use the third IP address to access the TS7700 MI. It automatically routes between the two
addresses that are assigned to the physical links. The virtual IP address enables access to the
TS7700 MI by using redundant paths, without the need to specify IP addresses manually for
each of the paths. If one path is unavailable, the virtual IP address automatically connects
through the remaining path.
You must provide one gateway IP address and one subnet mask address.
Important: All three provided IP addresses are assigned to one TS7700 cluster for MI
access.
Each cluster in the grid must be configured in the same manner as explained previously, with
three TCP/IP addresses providing redundant paths between the local intranet and cluster.
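For illustration only, with hypothetical addresses, the three MI addresses of one cluster might be assigned as follows:
Physical address 1: 192.168.1.3
Physical address 2: 192.168.2.3
Virtual address: 192.168.0.3 (the address entered in the browser, which is routed over either physical path)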
Supported web browsers include Mozilla Firefox 31.x ESR, 38.x ESR, and 45.x ESR (45.0 ESR).
For the list of required TSSC TCP/IP port assignments, see Table 4-11 on page 152.
The MI in each cluster can access all other clusters in the grid through the grid links. From the
local cluster menu, select a remote cluster. The MI goes automatically to the selected cluster
through the grid link. Alternatively, you can point the browser to the IP address of the target
cluster that you want.
Figure: A local TS7700 cluster and a remote TS7700 cluster connected through x GigE grid links (example grid link addresses 192.168.1.3 and 192.168.2.3)
IPv6 support
All network interfaces that support monitoring and management functions are now able to
support IPv4 or IPv6:
Management Interface (MI)
Key manager server: IBM Security Key Lifecycle Manager
Simple Network Management Protocol (SNMP) servers
Rocket-fast System for Log processing (RSYSLOG) servers
Lightweight Directory Access Protocol (LDAP) server
Network Time Protocol (NTP) server
Important: All of these client interfaces must be either IPv4 or IPv6 for each cluster. Mixing
IPv4 and IPv6 is not supported within a single cluster. For grid configurations, each cluster
can be either all IPv4 or IPv6 unless an NTP server is used, in which case all clusters
within the grid must be all one or the other.
Note: The TS7700 grid link interface does not support IPv6.
Each component of your TS7700 tape subsystem that is connected to the TSSC uses at least
one Ethernet port in the TSSC Ethernet hub. For example, a TS7700 cluster needs two
connections (one from the primary switch and the other from the alternative switch). If your
cluster is a TS7740, TS7720T, or TS7760T, you need a third port for the TS3500 or TS4500
tape library. Depending on the size of your environment, you might need to order a console
expansion for your TSSC. For more information, see FC2704 in the IBM TS7700 R4.1 IBM
Knowledge Center:
https://www.ibm.com/support/knowledgecenter/STFS69_4.1.0/ts7700_feature_codes_all.html
Generally, there should be at least one TSSC available per location in proximity of the tape
devices, such as TS7700 clusters and TS3500 tape libraries. Apart from the internal TSSC
network, the TSSC can also have another two Ethernet physical connections:
External Network Interface
Grid Network Interface
Those two Ethernet adapters are used by advanced functions, such as AOTM, LDAP, Assist
On-site (AOS), and Call Home (not using a modem). If you plan to use them, provide one or
two Ethernet connections and the corresponding IP addresses for the TSSC. The ports in the
table must be opened in the firewall for the interface links to work properly. Table 4-11 shows
the network port requirements for the TSSC.
Table 4-11 Network port requirements for the TSSC: TCP ports 22, 443, and 9666, and ICMP
Clarification: These requirements apply only to the LAN/WAN infrastructure. The TS7700
internal network is managed and controlled by internal code.
Native SW Fibre Channel transmitters have a maximum distance of 150 m with 50-micron
diameter, multi-mode, optical fiber (at 4 Gbps). Although 62.5-micron, multimode fiber can be
used, the larger core diameter has a greater dB loss and maximum distances are shortened
to 55 meters. Native LW Fibre Channel transmitters have a maximum distance of 10 km
(6.2 miles) when used with 9-micron diameter single-mode optical fiber. See Table 4-13
on page 156 for a comparison.
Link extenders provide a signal boost that can potentially extend distances to up to about
100 km (62 miles). These link extenders act as a large, fast pipe. Data transfer speeds over
link extenders depend on the number of buffer credits and efficiency of buffer credit
management in the Fibre Channel nodes at either end. Buffer credits are designed into the
hardware for each Fibre Channel port. Fibre Channel provides flow control that protects
against collisions.
This configuration is important for storage devices, which do not handle dropped or
out-of-sequence records. When two Fibre Channel ports begin a conversation, they
exchange information about their number of supported buffer credits. A Fibre Channel port
sends only the number of buffer frames for which the receiving port has given credit.
This approach both avoids overruns and provides a way to maintain performance over
distance by filling the pipe with in-flight frames or buffers. The maximum distance that can be
achieved at full performance depends on the capabilities of the Fibre Channel node that is
attached at either end of the link extenders.
This relationship is vendor-specific. There must be a match between the buffer credit
capability of the nodes at either end of the extenders. A host bus adapter (HBA) with a buffer
credit of 64 communicating with a switch port with only eight buffer credits is able to read at
full performance over a greater distance than it is able to write because, on the writes, the
HBA can send a maximum of only eight buffers to the switch port.
On the reads, the switch can send up to 64 buffers to the HBA. Until recently, a rule has been
to allocate one buffer credit for every 2 km (1.24 miles) to maintain full performance.
Buffer credits within the switches and directors have a large part to play in the distance
equation. The buffer credits in the sending and receiving nodes heavily influence the
throughput that is attained in the Fibre Channel. Fibre Channel architecture is based on a flow
control that ensures a constant stream of data to fill the available pipe. Generally, to maintain
acceptable performance, one buffer credit is required for every 2 km (1.24 miles) distance
covered. See IBM SAN Survival Guide, SG24-6143, for more information.
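As a minimal illustration of this rule of thumb, the following Python sketch estimates the buffer credits needed for a given distance and the distance that a given credit count can sustain at full speed. The function names and the simple one-credit-per-2-km model are assumptions for this example only, not part of any Fibre Channel product interface.

import math

def required_buffer_credits(distance_km, km_per_credit=2.0):
    # Roughly one buffer credit per 2 km of link distance (rule of thumb above)
    return math.ceil(distance_km / km_per_credit)

def max_full_speed_distance_km(buffer_credits, km_per_credit=2.0):
    # Distance that a port's credits can keep filled with in-flight frames
    return buffer_credits * km_per_credit

# The HBA/switch example above: 64 credits on the HBA, 8 on the switch port
print(required_buffer_credits(100))       # about 50 credits for 100 km
print(max_full_speed_distance_km(64))     # reads: roughly 128 km at full speed
print(max_full_speed_distance_km(8))      # writes: roughly 16 km at full speed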
Figure 4-4 shows a sample diagram that includes the DWDM and FICON Directors
specifications. For more information, see “FICON Director support” on page 156.
Figure 4-4 The IBM Z host attachment to the TS7700 (at speed of 8 Gbps)
Table 4-13 shows the relationship between connection speed and distance by cable type.
Figure 4-4 on page 155 shows the supported distances using different fiber cables for
single-mode long wave laser and multimode short wave laser.
Note: Long wave cables attach only to long wave adapters and short wave cables attach
only to short wave adapters. There is no intermixing.
You cannot mix FICON Directors from different vendors, such as Brocade (formerly McData, CNT, and
InRange) and Cisco, but you can mix models from one vendor.
Using the frame shuttle or tunnel mode, the extender receives and forwards FICON frames
without performing any special channel or control unit (CU) emulation processing. The
performance is limited to the distance between the sites and the normal round-trip delays in
FICON channel programs.
Emulation mode can span unlimited distances, and it monitors the I/O activity to devices. The
channel extender interfaces emulate a CU by presenting command responses and channel
end (CE)/device end (DE) status ahead of the controller, and they emulate the channel
when running the pre-acknowledged write operations to the real remote tape device.
Therefore, data is accepted early and forwarded to the remote device to maintain a full pipe
throughout the write channel program.
The supported channel extenders between the IBM Z host and the TS7700 are in the same
matrix as the FICON switch support under the following URL (see the FICON Channel
Extenders section):
http://www.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/FQ116133
Cascaded switches
The following list summarizes the general configuration rules for configurations with cascaded
switches:
Director Switch ID
This is defined in the setup menu.
The inboard Director Switch ID is used on the SWITCH= parameter in the CHPID
definition. The Director Switch ID does not have to be the same as the Director Address.
Although the example uses a different ID and address for clarity, keep them the same to
reduce configuration confusion and simplify problem determination work.
The following allowable Director Switch ID ranges have been established by the
manufacturer:
– McDATA range: x'61' - x'7F'
– CNT/Inrange range: x'01' - x'EF'
– Brocade range: x'01' - x'EF'
Director Address
This is defined in the Director GUI setup.
The Director Domain ID is the same as the Director Address that is used on the LINK
parameter in the CNTLUNIT definition. The Director Address does not have to be the
same as the Director ID, but again, keep them the same to reduce configuration confusion
and simplify PD work.
Port type and port-to-port connections are defined by using the available setup menu in the
equipment. Figure 4-5 shows an example of host connection that uses DWDM and cascaded
switches.
Figure 4-5 Host connectivity that uses DWDM and cascaded switches
Important: Enabling LDAP requires that all users must authenticate with the LDAP server.
All interfaces to the TS7700, such as MI, remote connections, and even the local serial
port, are blocked. The TS7700 might be inaccessible if the LDAP server is unreachable.
Enabling authentication through an LDAP server means that all personnel with access to the
TS7700 subsystem, such as computer operators, storage administrators, system
programmers, and IBM SSRs (local or remote), must have a valid account in the LDAP
server, along with the roles assigned to each user. The role-based access control (RBAC) is
also supported. If the LDAP server is down or unreachable, it can render a TS7700
inaccessible from the outside.
Important: Create at least one external authentication policy for IBM SSRs before a
service event.
When LDAP is enabled, access to the TS7700 MI is controlled by the LDAP server. Record the
Direct LDAP policy name, user name, and password that you created for IBM SSRs and keep
this information easily available in case you need it. Service access requires the IBM SSR to
authenticate through the normal service login and then to authenticate again by using the IBM
SSR Direct LDAP policy.
For more information about how to configure LDAP availability, see “Defining security
settings” on page 561.
The preferred method to keep nodes in sync is with a Network Time Protocol (NTP) Server.
The NTP server can be a part of the Grid and WAN infrastructure, it can be a part of a
customer intranet, or it can be a public server on the internet.
The NTP server address is configured into system VPD on a system-wide scope, so that all
clusters access the same NTP server. All of the clusters in a grid need to be able to
communicate with the same NTP server that is defined in VPD. In the absence of an NTP
server, all nodes coordinate time with Cluster 0 (or the lowest-numbered available cluster in
the grid).
Without AOTM, an operator must determine if one of the TS7700 clusters has failed, and then
enable one of the ownership takeover modes. This is required to access the virtual volumes
that are owned by the failed cluster. It is very important that write ownership takeover be
enabled only when a cluster has failed, and not when there is a problem only with
communication between the TS7700 clusters.
If it is enabled and the cluster in question continues to operate, data might be modified
independently on other clusters, resulting in a corruption of the data. Although there is no
data corruption issue with the read ownership takeover mode, it is possible that the remaining
clusters might not have the latest version of the virtual volume and present previous data.
Even if AOTM is not enabled, it is advised that it be configured. Doing so provides protection
from a manual takeover mode being selected when the other cluster is still functional.
With AOTM, one of the takeover modes is enabled if normal communication between the
clusters is disrupted and the cluster to perform takeover can verify that the other cluster has
failed or is otherwise not operating. If a TS7700 suspects that the cluster that owns a volume
it needs has failed, it asks the TS3000 System Console to which it is attached to query the
System Console attached to the suspected failed cluster.
If the remote system console can validate that its TS7700 has failed, it replies back and the
requesting TS7700 enters the default ownership takeover mode. If it cannot validate the
failure, or if the system consoles cannot communicate, an ownership takeover mode can only
be enabled by an operator.
To take advantage of AOTM, the customer should provide IP communication paths between
the TS3000 System Consoles at the cluster sites. For AOTM to function properly, these paths
should not share the same network as the grid interconnection between the TS7700s.
Note: When the TSSC code level is Version 5.3.7 or higher, the AOTM and Call Home IP
addresses can be on the same subnet. However, earlier levels of TSSC code require the
AOTM and Call Home IP addresses to be on different subnets. It is advised to use different
subnets for those interfaces.
AOTM can be enabled through the MI, where you can also set the default ownership
takeover mode.
For more information, see the following links for details about this subject:
IBM TS7700 Series Best Practices - TS7700 Hybrid Grid Usage
http://www.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/WP101656
IBM TS7700 Series Best Practices - Copy Consistency Points
http://www.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/WP101230
IBM TS7700 Series Best Practices - Synchronous Mode Copy
http://www.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/WP102098
Force volumes that are mounted on this cluster to be copied to the local cache
For a private (non-Fast Ready) mount, this override causes a copy to be created on the
local cluster as part of mount processing. For a scratch (Fast Ready) mount, this setting
overrides the specified MC with a Copy Consistency Point of Rewind-Unload for the
cluster. This does not change the definition of the MC, but serves to influence the
Replication policy.
Enable fewer RUN consistent copies before reporting RUN command complete
If selected, the value that is entered for Number of required RUN consistent copies,
including the source copy, is used to determine the number of copies to override before
the RUN operation reports as complete. If this option is not selected, the MC definitions
are used explicitly. Therefore, the number of RUN copies can be from one to the number of
clusters in the grid.
Ignore cache preference groups for copy priority
If this option is selected, copy operations ignore the cache preference group when
determining the priority of volumes that are copied to other clusters.
If you have a grid with two or more clusters, you can define scratch mount candidates. For
example, in a hybrid configuration, the scratch allocation assist (SAA) function can be used to
direct certain scratch allocations (workloads) to one or more TS7700D clusters or to cache
partition 0 (CP0) of TS7700T clusters for fast access, while other workloads can be directed to
TS7740 clusters or to a tape-managed cache partition (CPx) of TS7760T clusters for archival purposes.
Clusters not included in the list of scratch mount candidates are not used for scratch mounts
at the associated MC unless those clusters are the only clusters that are known to be
available and configured to the host. If you have enabled SAA, but not selected any cluster as
SAA candidates in the Management Class, all clusters are treated as SAA candidates.
Before SAA is operational, the SAA function must be enabled in the grid by using the LI REQ
SETTING SCRATCH ENABLE command.
This function introduces a concept of grouping clusters together into families. Using cluster
families, you can define a common purpose or role to a subset of clusters within a grid
configuration. The role that is assigned, for example, production or archive, is used by the
TS7700 Licensed Internal Code to make improved decisions for tasks, such as replication
and TVC selection. For example, clusters in a common family are favored for TVC selection,
or replication can source volumes from other clusters within its family before using clusters
outside of its family.
The following list describes the three thresholds in ascending order of occurrence:
Automatic Removal
The policy removes the oldest logical volumes from the TS7720 or TS7760 cache if a
consistent copy exists elsewhere in the grid. This state occurs when the cache is 3 TB
below the out-of-cache-resources threshold. In the automatic removal state, the TS7720
or TS7760 automatically removes volumes from the disk-only cache to prevent the cache
from reaching its maximum capacity.
This state is identical to the limited-free-cache-space-warning state unless the Temporary
Removal Threshold is enabled. You can also lower the removal threshold in the LI REQ.
The default is 4 TB.
Clarification: Host writes to the TS7720 or TS7760 and inbound copies continue
during this state.
If all valid clusters are in this state or unable to accept mounts, the host allocations fail.
Read mounts can choose the TS7720 or TS7760 in this state, but modify and write
operations fail. Copies inbound to this TS7720 or TS7760 are queued as Deferred until
the TS7720 or TS7760 exits this state.
Table 4-14 displays the start and stop thresholds for each of the active cache capacity states
that are defined.
Limited free cache space warning (CP0 for a TS7720T): start threshold < 3 TB; stop threshold
> 3.5 TB or 15% of the size of CP0, whichever is less. CBR3792E is issued upon entering the
state and CBR3793I upon exiting the state.
To ensure that data is always in a TS7720 or TS7760, or is in for at least a minimal amount of
time, a volume copy retention time must be associated with each removal policy. This volume
retention time, in hours, enables a volume to remain in a TS7720 or TS7760 TVC for at least x
hours before it becomes a candidate for removal, where x is 0 - 65,536. A volume retention
time of zero implies no minimum requirement.
In addition to pin time, three policies are available for each volume within a TS7720D or
TS7760D and for CP0 within a TS7720T or TS7760T. For more information, see Chapter 2,
“Architecture, components, and functional characteristics” on page 15.
Removal threshold
The default, or permanent, removal threshold is used to prevent a cache overrun condition in
a TS7720 or TS7760 cluster that is configured as part of a grid. By default, it is a 4 TB (3 TB
fixed plus 1 TB) value that, when taken with the amount of used cache, defines the upper size
limit for a TS7720 or TS7760 cache, or for a TS7720T or TS7760T CP0.
Note: Virtual volumes are only removed if there is another consistent copy within the grid.
Virtual volumes are removed from a TS7720 or TS7760 cache in this order:
1. Volumes in scratch categories
2. Private volumes that are least recently used by using the enhanced removal policy
definitions
After removal begins, the TS7720 or TS7760 continues to remove virtual volumes until the
stop threshold is met. The stop threshold is a value that is the removal threshold minus
500 GB.
A particular virtual volume cannot be removed from a TS7720 or TS7760 cache until the
TS7720 or TS7760 verifies that a consistent copy exists on a peer cluster. If a peer cluster is
not available, or a volume copy has not yet completed, the virtual volume is not a candidate
for removal until the appropriate number of copies can be verified later. Time delayed
replication can alter the removal behavior.
Tip: This field is only visible if the selected cluster is a TS7720 or TS7760 in a grid
configuration.
Virtual volumes might need to be removed before one or more clusters enter service mode.
When a cluster in the grid enters service mode, remaining clusters can lose their ability to
make or validate volume copies, preventing the removal of enough logical volumes. This
scenario can quickly lead to the TS7720 or TS7760 cache reaching its maximum capacity.
The lower threshold creates more free cache space, which enables the TS7720 or TS7760 to
accept any host requests or copies during the service outage without reaching its maximum
cache capacity.
The temporary removal threshold value must be greater than or equal to (>=) the expected
amount of compressed host workload that is written, copied, or both to the TS7720 or
TS7760 during the service outage. The default temporary removal threshold is 4 TB, which
provides 5 TB (4 TB plus 1 TB) of existing free space. You can lower the threshold to any
value from 2 TB to full capacity minus 2 TB.
All TS7720 or TS7760 clusters in the grid that remain available automatically lower their
removal thresholds to the temporary removal threshold value that is defined for each one.
Each TS7720 or TS7760 cluster can use a different temporary removal threshold. The default
temporary removal threshold value is 4 TB or 1 TB more data than the default removal
threshold of 3 TB. Each TS7720 or TS7760 cluster uses its defined value until the originating
cluster in the grid enters service mode or the temporary removal process is canceled. The
cluster that is initiating the temporary removal process does not lower its own removal
threshold during this process.
Note: A detailed description of the Host Console Request functions and their responses is
available in IBM Virtualization Engine TS7700 Series z/OS Host Command Line Request
User’s Guide, which is available at the Techdocs website:
http://www.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/WP101091
Note: Construct names are assigned to virtual volumes by the attached Host System, and
they are used to establish Data Management policies to be executed by the TS7700
against specific volumes. Constructs (and associated policies) are defined in advance
using the TS7700 MI. For more information, see “Defining TS7700 constructs” on
page 555. If the Host System assigns a construct name without first defining it, the TS7700
will create the construct with the default parameters.
In the default Storage Class case, for a TS7700 running in a dual production multi-cluster grid
configuration, virtual tape drives in both TS7700 clusters are selected as the I/O TVCs, and have the
original volumes (newly created or modified) preferred in cache. The copies to the other
TS7700 are preferred to be removed from cache. Therefore, each TS7700 TVC is filled with
unique, newly created, or modified volumes, roughly doubling the amount of cache seen by
the host.
However, for a TS7700 running in a multi-cluster grid configuration that is used for business
continuance, particularly when all I/O is preferred to the local TVC, this default management
method might not be wanted. If the remote site of the multi-cluster grid is used for recovery,
the recovery time is minimized by having most of the needed volumes already in cache. What
is needed is to have the most recent copy volumes remain in the cache, not being preferred
out of cache.
If the remote TS7700 is used for recovery, the recovery time is minimized by having most of
the needed volumes in cache. However, it is not likely that all of the volumes to restore will be
resident in the cache, so some number of recalls is required. Unless you can explicitly control
the sequence of volumes to be restored, it is likely that recalled volumes will displace cached
volumes that have not yet been used for restores, resulting in further recalls later in the recovery
process.
After a restore completes from a recalled volume, that volume is no longer needed. These
volumes must be removed from the cache after they have been accessed so that they
minimally displace other volumes in the cache.
Based on business requirements, this behavior can be modified using the RECLPG0 setting:
LI REQ, <distributed-library>, SETTING, CACHE, RECLPG0, <ENABLE/DISABLE>
By following these guidelines, the TS7700 grid configuration supports the availability and
performance goals of your workloads by minimizing the effect of the following outages:
Planned outages in a grid configuration, such as Licensed Internal Code or hardware
updates to a cluster. While one cluster is being serviced, production work continues with
the other cluster in the grid configuration after virtual tape device addresses are online to
the cluster.
Unplanned outage of a cluster. For the logical volumes with an Immediate or Synchronous
Copy policy effective, all jobs that completed before the outage have a copy of their data
available on the other cluster. For jobs that were in progress on the cluster that failed, they
can be reissued after virtual tape device addresses are online on the other cluster (if they
were not already online) and an ownership takeover mode has been established (either
manually or through AOTM).
If it is necessary, access existing data to complete the job. For more details about AOTM,
see 2.3.34, “Autonomic Ownership Takeover Manager” on page 96. For jobs that were
writing data, the written data is not accessible and the job must start again.
Important: Scratch category and Data Class (DC) settings are defined at the
system (grid) level. Therefore, if you modify them in one cluster, the change applies to all
clusters in that grid.
On the host side, definitions must be made in HCD and in the SMS. For an example, see
Table 4-15, and create a similar one during your planning phase. It is used in later steps. The
Library ID must contain only hexadecimal characters (0 - 9 and A - F).
Table 4-15 Sample of library names and IDs in a four-cluster grid implementation
TS7700 virtual library names | SMS name | LIBRARY-ID | Defined in HCD | Defined in SMS
Use the letter “C” to indicate the composite library names and the letter “D” to indicate the
distributed library names. The composite library name and the distributed library name cannot
start with the letter “V”.
The distributed library name and the composite library name are not directly tied to the
configuration parameters that are used by the IBM SSR during the installation of the TS7700.
These names are not defined to the TS7700 hardware. However, to make administration
easier, associate the LIBRARY-IDs with the SMS library names through the nickname setting
in the TS7700 MI.
Remember: Match the distributed and composite library names that are entered at the
host with the nicknames that are defined at the TS7700 MI. Although they do not have to
be the same, this guideline simplifies the management of the subsystem.
Tip: Specify the LIBRARY-ID and LIBPORT-ID in your HCD/IOCP definitions, even in a
stand-alone configuration. This configuration reduces the likelihood of having to reactivate
the IODF when the library is not available at IPL, and provides enhanced error recovery in
certain cases. It might also eliminate the need to have an IPL when you change your I/O
configuration. In a multicluster configuration, LIBRARY-ID and LIBPORT-ID must be
specified in HCD, as shown in Table 4-15.
Distributed library ID
During installation planning, each cluster is assigned a unique, five-digit hexadecimal number
(that is, the sequence number). This number is used during subsystem installation
procedures by the IBM SSR. This is the distributed library ID. The sequence number is
arbitrary and can be selected by you; a common convention is to start it with the letter D.
Following the letter D, you can use the last four digits of the hardware serial number, if they
consist only of hexadecimal characters, so that each distributed library ID ends with the last
four digits of the TS7700 serial number.
The composite library ID for this four-cluster grid can then be CA010.
Important: Whether you are using your own or IBM nomenclature, the important point is
that the subsystem identification must be clear. Because the identifier that appears in all
system messages is the SMS library name, it is important to distinguish the source of the
message through the SMS library name.
Composite library ID
The composite library ID is defined during installation planning and is arbitrary. The
LIBRARY-ID is entered by the IBM SSR into the TS7700 configuration during hardware
installation. All TS7700 tape drives participating in a grid have the same composite library ID.
In the example in “Distributed library ID”, the composite library ID starts with a “C” for this five
hex-character sequence number.
The last four characters can be used to identify uniquely each composite library in a
meaningful way. The sequence number must match the LIBRARY-ID that is used in the HCD
library definitions and the LIBRARY-ID that is listed in the Interactive Storage Management
Facility (ISMF) Tape Library definition windows.
LIBPORT-ID
Each logical control unit (LCU), or 16-device group, must present a unique subsystem
identification to the IBM Z host. This ID is a 1-byte field that uniquely identifies each LCU
within the cluster, and is called the LIBPORT-ID. The value of this ID cannot be 0.
Table 4-16 shows the definitions of the LIBPORT-IDs in a multi-cluster grid. For Cluster 0, 256
devices use LIBPORT-IDs X'01' - X'10', and 496 devices use X'01' - X'1F'. The LIBPORT-ID is
always one more than the CUADD value.
Cluster | CUADD | LIBPORT-IDs
0 | 0 - 1E | X'01' - X'1F'
1 | 0 - 1E | X'41' - X'5F'
2 | 0 - 1E | X'81' - X'9F'
3 | 0 - 1E | X'C1' - X'DF'
4 | 0 - 1E | X'21' - X'3F'
5 | 0 - 1E | X'61' - X'7F'
6 | 0 - 1E | X'A1' - X'BF'
7 | 0 - 1E | X'E1' - X'FF'
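The following Python sketch is an illustrative helper that derives a LIBPORT-ID from a cluster number and CUADD value by using the per-cluster base values shown in Table 4-16. The helper name and structure are assumptions for this example; they are not a TS7700 or z/OS interface.

# Per-cluster LIBPORT-ID base values, taken from Table 4-16
LIBPORT_BASE = {
    0: 0x01, 1: 0x41, 2: 0x81, 3: 0xC1,
    4: 0x21, 5: 0x61, 6: 0xA1, 7: 0xE1,
}

def libport_id(cluster, cuadd):
    # Return the LIBPORT-ID (as X'..') for a given cluster and CUADD (0x00 - 0x1E)
    if cluster not in LIBPORT_BASE or not 0x00 <= cuadd <= 0x1E:
        raise ValueError("cluster must be 0-7 and CUADD 0x00-0x1E")
    return f"X'{LIBPORT_BASE[cluster] + cuadd:02X}'"

print(libport_id(0, 0x00))   # X'01' (one more than CUADD on cluster 0)
print(libport_id(3, 0x1E))   # X'DF'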
In general, install the current host software support. See the VTS, PTP, and 3957 Preventive Service
Planning (PSP) topics on the IBM Support and Downloads web page (ibm.com/support) for
the current information about Software Maintenance.
The automatic class selection (ACS) routines process every new tape allocation in the
system-managed storage (SMS) address space. The production ACS routines are stored in
the active control data set (ACDS). These routines allocate to each volume a set of classes
(DC, SC, MC, and SG) that reflect your installation’s policies for the data on that volume.
The ACS routines are started for every new allocation. Tape allocations are passed to the
OAM, which uses its Library Control System (LCS) component to communicate with the
Integrated Library Manager.
For SMS-managed requests, the SG routine assigns the request to an SG. The assigned SG
determines which tape libraries can be used for the allocation. Through the SG construct, you can
direct logical volumes to specific tape libraries.
Data Facility System Managed Storage Removable Media Manager (DFSMSrmm) is one
such TMS that is included as a component of z/OS. The placement and access to the disk
that contains the DFSMSrmm control data set (CDS) determines whether a standard or
client/server subsystem (RMMplex) should be used. If all z/OS hosts have access to a shared
disk, an RMMplex is not necessary.
Review the Redbooks publication DFSMSrmm Primer, SG24-5983 and the z/OS DFSMSrmm
Implementation and Customization Guide, SC23-6874 for further information about which
RMM subsystem is correct for your environment.
The OAM is a component of DFSMSdfp that is included with z/OS as part of the storage
management system (SMS). Along with your TMS, OAM uses the concepts of
system-managed storage to manage, maintain, and verify tape volumes and tape libraries
within a tape storage environment. OAM uses the tape configuration database (TCDB), which
consists of one or more volume catalogs, to manage volume and library entries.
If tape libraries are shared among hosts, they must all have access to a single TCDB on
shared disk, and they can share the DEVSUPxx parmlib member. If the libraries are to be
partitioned, each set of sharing systems must have its own TCDB. Each such TCDBplex must
have a unique DEVSUPxx parmlib member that specifies library manager categories for each
scratch media type, error, and private volumes.
Planning what categories are used by which hosts is an important consideration that needs to
be addressed before the installation of any tape libraries. For more information about OAM
implementation and category selection, see z/OS DFSMS OAM Planning, Installation, and
Storage Administration Guide for Tape Libraries, SC23-6867.
Sharing of an IBM automated tape library (ATL) means that all attached hosts have the same
access to all volumes in the tape library. To achieve this sharing, you need to share the host
CDSs, that is, the TMS inventory and the integrated catalog facility (ICF) catalog information,
among the attached hosts.
Additionally, you need to have the same categories defined in the DEVSUPxx member on all
hosts. In a non-SMS environment, all systems must share the ICF catalog that contains the
Basic Tape Library Support (BTLS) inventory.
This partitioning is implemented through values that are updated in the DEVSUPxx category
definitions. Until now, to modify a category value you needed to change the DEVSUPxx
member and restart the system. A new command, DS QLIB, CATS, enables you to query and
modify these category values without an initial program load (IPL). However, this command
must be used with great care because a discrepancy in this area causes scratch mounts
to fail.
SDAC is enabled by using FC 5271, Selective Device Access Control. This feature license
key must be installed on all clusters in the grid before SDAC is enabled. You can specify one
or more LIBPORT-IDs per SDAC group. Each access group is given a name and assigned
mutually exclusive VOLSER ranges. Use the Library Port Access Groups window on the
TS7700 MI to create and configure Library Port Access Groups for use with SDAC.
To calculate the number of logical paths that are required in an installation, use the following
formula:
Number of logical paths per FICON channel = number of LPARs x number of CUs
This formula assumes that all LPARs access all CUs in the TS7700 with all channel paths. For
example, if one LPAR has 16 CUs defined, you are using 16 logical paths of the 512 logical
paths available on each FICON adapter port.
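As a minimal sketch of this calculation, assuming that every LPAR accesses every CU through the same FICON port:

def logical_paths_per_ficon_port(num_lpars, num_cus):
    # Logical paths consumed on one FICON adapter port
    return num_lpars * num_cus

used = logical_paths_per_ficon_port(num_lpars=1, num_cus=16)
print(f"{used} of 512 logical paths used on this port")   # 16 of 512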
The FICON Planning and Implementation Guide, SG24-6497, covers the planning and
implementation of FICON channels, and describes operating in FICON native (Fibre Channel
(FC)) mode. It also describes the FICON and FC architectures, terminology, and supported
topologies.
Define one tape CU in the HCD dialog for every 16 virtual devices. Up to eight channel paths
can be defined to each CU. A logical path might be thought of as a three-element entity:
A host port
A TS7700 port
A logical CU in the TS7700
Remember: A reduction in the number of physical paths reduces the throughput capability
of the TS7700 and the total number of available logical paths per cluster. A reduction in
CUs reduces the number of virtual devices available to that specific host.
The VOLSER of the virtual and physical volumes must be unique throughout a
system-managed storage complex (SMSplex) and throughout all storage hierarchies, such as
DASD, tape, and optical storage media. To minimize the risk of misidentifying a volume, the
VOLSER should be unique throughout the grid and across different clusters in different
TS3500 or TS4500 tape libraries.
The VOLSERs must be unique throughout an SMSplex and throughout all storage
hierarchies. It must also be unique across LPARs connected to the grid. Have independent
plexes use unique ranges in case volumes ever need to be shared. In addition, future merges
of grids require that their volume ranges be unique.
Tip: Avoid inserting an excessive number of scratch volumes that are unlikely to be used
within a few months, because a large scratch inventory can add processor burden to
allocations, especially when expire with hold is enabled. Volumes can always be inserted
later if scratch counts become low.
You insert volumes by providing starting and ending volume serial number range values.
The TS7700 determines how to establish increments of VOLSER values based on whether
the character in a particular position is a number or a letter. For example, inserts starting with
ABC000 and ending with ABC999 add logical volumes with VOLSERs of ABC000, ABC001,
ABC002…ABC998, and ABC999 into the inventory of the TS7700. You might find it helpful to
plan for growth by reserving multiple ranges for each TS7700 that you expect to install.
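The following Python sketch illustrates the concept of range expansion: each VOLSER position is treated as either a digit (0 - 9) or a letter (A - Z), and the range is enumerated as a mixed-radix counter. This is a simplified illustration only, not the TS7700 Licensed Internal Code algorithm.

import string

def expand_volser_range(start, end):
    # Each position uses its own alphabet: digits stay digits, letters stay letters
    alphabets = [string.digits if c.isdigit() else string.ascii_uppercase for c in start]
    def to_tuple(v):
        return [a.index(c) for a, c in zip(alphabets, v)]
    def to_volser(t):
        return "".join(a[i] for a, i in zip(alphabets, t))
    cur, last = to_tuple(start), to_tuple(end)
    while True:
        yield to_volser(cur)
        if cur == last:
            return
        for pos in range(len(cur) - 1, -1, -1):   # increment the rightmost position
            cur[pos] += 1
            if cur[pos] < len(alphabets[pos]):
                break
            cur[pos] = 0

print(list(expand_volser_range("ABC000", "ABC005")))
# ['ABC000', 'ABC001', 'ABC002', 'ABC003', 'ABC004', 'ABC005']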
If you have multiple partitions, it is better to plan which ranges will be used in which partitions,
for example, A* for the first sysplex and B* for the second sysplex. If you need more than one
range, you can select A* and B* for the first sysplex, C* and D* for the second sysplex, and so
on.
Tip: For 3957-V06/VEA, the maximum is 2,000,000 virtual volumes, which will also be the
limit for the overall grid that contains clusters corresponding to one of those models.
The TS7700 supports logical WORM (LWORM) volumes. Consider the size of your logical
volumes, the number of scratch volumes you need per day, the time that is required for
return-to-scratch processing, how often scratch processing is run, and whether you need to
define LWORM volumes.
Depending on the virtual volume sizes that you choose and the media size from which you are
converting, the number of volumes that are required to store your data might grow or shrink.
If you have data sets that fill native 3590 volumes, you need more TS7700 virtual volumes to
store the data, even with 6000 MiB virtual volumes, because the data is stored as
multivolume data sets.
The 400 MiB cartridge system tape (CST)-emulated cartridges or the 800 MiB enhanced
capacity cartridge system tape (ECCST)-emulated cartridges are the two types that
you can specify when adding volumes to the TS7700. You can use these sizes directly, or use
policy management to override them to provide for the 1000, 2000, 4000, 6000, or
25,000 MiB sizes.
A virtual volume size can be set by VOLSER, and can change dynamically by using the
DFSMS DC storage construct when a job requires a scratch volume or writes from the
beginning of tape (BOT). The amount of data that is copied to the stacked cartridge is only the
amount of data that was written to a logical volume. The choice between all available virtual
volume sizes does not affect the real space that is used in either the TS7700 cache or the
stacked volume.
In general, unless you have a special need for CST emulation (400 MiB), specify the ECCST
media type when you insert volumes in the TS7700.
In planning for the number of logical volumes that is needed, first determine the number of
private volumes that make up the current workload that you will be migrating. One way to do
this is by looking at the amount of data on your current volumes and then matching that to the
supported logical volume sizes. Match the volume sizes, accounting for the compressibility of
your data. If you do not know the average ratio, use the conservative value of 2:1.
If you choose to use only the 800 MiB volume size, the total number that is needed might
increase depending on whether current volumes that contain more than 800 MiB compressed
need to expand to a multivolume set. Take that into account for planning the number of logical
volumes required. Consider using smaller volumes for applications such as DFSMShsm and
larger volumes for backup and full volume memory dumps.
Now that you know the number of volumes you need for your current data, you can estimate
the number of empty scratch logical volumes you must add. Based on your current
operations, determine a nominal number of scratch volumes from your nightly use. If you have
an existing VTS installed, you might have already determined this number, and are therefore
able to set a scratch media threshold with that value through the ISMF Library Define window.
Next, multiply that number by the value that provides a sufficient buffer (typically 2×) and by
the frequency with which you want to perform returns to scratch processing.
The following formula is suggested to calculate the number of logical volumes needed:
Vv = Cv + Tr + (Sc)(Si + 1)
For example, assuming the current volume requirements (that use all the available volume
sizes), that use 2500 scratch volumes per night, and running return-to-scratch processing
every other day, you need to plan on the following number of logical volumes in the TS7700:
75,000 (current, rounded up) + 2,500 + 2,500 (1+1) = 82,500 logical volumes
If you plan to use the expired-hold option, take the maximum planned hold period into account
when calculating the Si value in the previous formula.
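The following Python sketch evaluates the suggested formula. The variable meanings in the comments are inferred from the worked example above and should be treated as assumptions rather than definitions from the TS7700 documentation.

def logical_volumes_needed(cv, tr, sc, si):
    # cv: current private volumes (rounded up), tr: transition/buffer volumes,
    # sc: scratch volumes used per night, si: days between return-to-scratch runs
    return cv + tr + sc * (si + 1)

# Worked example from the text: 75,000 + 2,500 + 2,500 * (1 + 1) = 82,500
print(logical_volumes_needed(cv=75_000, tr=2_500, sc=2_500, si=1))   # 82500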
If you define more volumes than you need, you can always eject the additional volumes.
Unused logical volumes do not use space, but excessive scratch counts (100,000 or more)
might add processor burden to scratch allocation processing.
The default number of logical volumes that is supported by the TS7700 is 1,000,000. You can
add support for more logical volumes in 200,000 volume increments, up to a total of
4,000,000 logical volumes. This is the maximum number either in a stand-alone or grid
configuration.
To make this upgrade, see how to use FC 5270 in the Increased logical volumes bullet in
7.2.1, “TS7700 concurrent system component upgrades” on page 266.
Return-to-scratch processing
Return-to-scratch processing involves running a set of tape management tools that identify
the logical volumes that no longer contain active data, and then communicating with the
TS7700 to change the status of those volumes from private to scratch.
The amount of time the process takes depends on the type of TMS being employed, how
busy the TS7700 is when it is processing the volume status change requests, and whether a
grid configuration is being used. You can see elongated elapsed time in any TMSs
return-to-scratch process when you migrate to or install a multicluster configuration solution.
If the number of logical volumes that is used daily is small (fewer than a few thousand), you
might choose to run return-to-scratch processing only every few days. A good rule is to plan
for no more than a 4-hour time period to run return to scratch. By ensuring a nominal run time
of 4 hours, enough time exists during first shift to run the process twice if problems are
encountered during the first attempt. Unless there are specific reasons, run return-to-scratch
processing one time per day.
With z/OS V1.9 or later, return-to-scratch in DFSMSrmm has been enhanced to speed up this
process. To reduce the time that is required for housekeeping, it is now possible to run several
return-to-scratch processes in parallel. For more information about the enhanced
return-to-scratch process, see the z/OS DFSMSrmm Implementation and Customization
Guide, SC23-6874.
Tip: The expire-hold option might delay the time that the scratch volume becomes usable
again, depending on the defined hold period.
Under this preferred migration, hierarchical storage management (HSM) first migrates all
volumes in a scratch category according to size (largest first). Only when all volumes (PG0 or
PG1) in a scratch category have been migrated and the PG1 threshold is still unrelieved does
HSM operate on private PG1 volumes in LRU order.
Note: You must define all scratch categories before using the preferred migration
enhancement.
If the number of scratch physical volumes in your system is fewer than these thresholds, the
following situations can occur:
Reclamation of sunset media does not occur
Reclamation runs more frequently
The following is a suggested formula to calculate the number of physical volumes needed:
For each workload, calculate the number of physical volumes needed:
Px = (Da/Cr)/(Pc × Ut/100)
Next, add in physical scratch counts and the Px results from all known workloads:
Pv = Ps + P1 + P2 + P3 + ...
Using the suggested formula and the assumptions, plan to use the following number of
physical volumes in your TS7700:
Example 1 by using the following assumptions:
– Da = 100 TB
– Cr = 2
– Ut = 67.5%
– Pc = 10 TB (capacity of a JD volume)
If the number of physical volumes in the common scratch pool is Ps = 15, you would need to
plan on the following number of physical volumes in the TS7740, the TS7720T, or the
TS7760T:
Pv = Ps + P1 + P2 = 15 + 8 + 16 = 39 physical volumes
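The following Python sketch reproduces the Example 1 arithmetic. Rounding each workload up to a whole cartridge is an assumption made here so that the result matches the P1 = 8 value that is used in the text.

import math

def physical_volumes_for_workload(da_tb, cr, pc_tb, ut_pct):
    # Px = (Da/Cr) / (Pc * Ut/100), rounded up to whole cartridges
    return math.ceil((da_tb / cr) / (pc_tb * ut_pct / 100))

p1 = physical_volumes_for_workload(da_tb=100, cr=2, pc_tb=10, ut_pct=67.5)   # -> 8
p2 = 16   # second workload, taken as given in the text
ps = 15   # common scratch pool
print(ps + p1 + p2)   # Pv = 39 physical volumes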
If you need dual copied virtual volumes in a single cluster, you need to double the number of
physical volumes for that workload. If a workload uses dedicated pools with the borrow/return
sharing disabled, then each workload must have its own dedicated additional scratch count
versus the shared Ps count.
The default value can be adjusted through the MI (use the Copy Export Settings window) to a
maximum value of 10,000. After your Copy Export operations reach a steady state,
approximately the same number of physical volumes is being returned to the library for
reclamation as there are those being sent offsite as new members of the Copy Export set of
volumes.
No special feature code is needed for these options to be available, but all clusters in the Grid
must run R4.1.2 or later (the presence of lower levels of code in the grid will prevent the use of
this feature).
Different virtual volumes in the same grid can use different compression algorithms,
depending on the Data Class construct assignment (which contains the selected algorithm
information). Because Data Class definitions apply across the grid, this is effectively a
grid-scope setting. Which option to select depends on the desired compression ratio and speed.
Only virtual volumes that are written from BOT can go through LZ4/ZSTD
processing (as long as their associated Data Class is configured for it). Previously written
data in existing virtual volumes keeps the initially applied compression method, even if
Data Class parameters are changed. There is no available method to convert old data to the
new compression algorithms.
Note: The uncompressed data size allowed to be written to a single virtual volume has a
“logical” limit of 68 GB for channel byte counters tracking the amount of written data. This
limit can be surpassed when using compression ratios equal to or higher than 2.7:1 (either
with traditional FICON compression or the new enhanced algorithms) against volumes
assigned to Data Classes configured for 25,000 MiB volume sizes (after compression).
Taking that circumstance into consideration, Data Classes can now be configured to
decide how to handle that event, using the new attribute 3490 Counters Handling with the
following available options:
Surface EOT: Set this option to surface EOT (End Of Tape) when channel bytes written
reaches maximum channel byte counter (68 GB).
Wrap Supported: Set this option to allow channel bytes written to exceed maximum
counter value and present the counter overflow unit attention to the attached LPAR,
which will then be able to collect and reset the counters in the TS7700 by using the RBL
(X'24') command.
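As a rough, illustrative calculation of when this limit comes into play, the following Python sketch approximates the channel (uncompressed) byte count as the post-compression volume size multiplied by the compression ratio. Treating the 68 GB limit as approximately 68,000 MiB is a simplifying assumption for this sketch only.

def channel_bytes_written_mib(volume_size_mib, compression_ratio):
    # Uncompressed channel bytes ~= data stored on the volume * compression ratio
    return volume_size_mib * compression_ratio

LIMIT_MIB = 68_000   # "68 GB" channel byte counter limit, approximated in MiB

for ratio in (2.0, 2.7, 3.0):
    written = channel_bytes_written_mib(25_000, ratio)
    print(f"ratio {ratio}: ~{written:,.0f} MiB of channel bytes (limit ~{LIMIT_MIB:,} MiB)")
# A 2.7:1 ratio already brings the counter to roughly its limit on a 25,000 MiB
# volume; higher ratios exceed it, which is when 3490 Counters Handling applies.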
Important: To achieve optimum throughput when FICON compression is in use, verify
your definitions to ensure that you specified compression for data that is written to the
TS7700.
TS7740, TS7720T, and TS7760T implement the Secure Data Erasure on a pool basis. With
the Secure Data Erase function, all reclaimed physical volumes in that pool are erased by
writing a random pattern across the whole tape before reuse. If a physical volume has
encrypted data, the erasure is accomplished by deleting Encryption Keys on the physical
volume, rendering the data unrecoverable on this cartridge. A physical cartridge is not
available as a scratch cartridge if its data is not erased.
Consideration: If you choose this erase function and you are not using tape encryption,
the TS7740, TS7720T, or TS7760T needs time to erase every physical tape. Therefore, the
TS7740, TS7720T, or TS7760T needs more time and more back-end drive activity every
day to complete reclamation and erase the reclaimed cartridges afterward. With tape
encryption, the Secure Data Erase function is relatively fast.
The Secure Data Erase function also monitors the age of expired data on a physical volume
and compares it with the limit set by the user in the policy settings. Whenever the age
exceeds the limit that is defined in the pool settings, the Secure Data Erase function forces a
reclaim and subsequent erasure of the volume.
In a heterogeneous drive configuration, older generations of tape drives are used for
read-only operation. However, the Secure Data Erase function uses older generations of tape
drives to erase older media (discontinued media) that cannot be written by 3592-E08 tape
drives.
For more information about the Secure Data Erase function, see 2.2.25, “Secure Data Erase
function” on page 55 and “Defining physical volume pools in the TS7700T” on page 540.
When compared to the normal or long erasure operation, EK shredding is much faster.
Normal erasure is always used for non-encrypted tapes, and EK shredding is the default that
is used for encrypted tapes. The first time an encrypted tape is erased, a normal erasure is
performed, followed by an EK shredding. A TS7700 can be configured to perform a normal
erasure with every data operation, but this function must be configured by an IBM SSR.
Encryption on the TS7740, TS7720T, and TS7760T is controlled on a storage pool basis. SG
and MC DFSMS constructs specified for logical tape volumes determine which physical
volume pools are used for the primary and backup (if used) copies of the logical volumes. The
storage pools, originally created for the management of physical media, have been enhanced
to include encryption characteristics.
The tape encryption solution in a TS7740, TS7720T, and TS7760T consists of several
components:
The TS7740, TS7720T, and TS7760T tape encryption solution uses either the IBM
Security Key Lifecycle Manager (SKLM) or the IBM Security Key Lifecycle Manager for
z/OS (ISKLM) as a central point from which all EK information is managed and served to
the various subsystems.
The TS1120, TS1130, TS1140, or TS1150 encryption-enabled tape drives are the other
fundamental piece of TS7740, TS7720T, and TS7760T tape encryption, providing
hardware that runs the cryptography function without reducing the data-transfer rate.
The TS7740, TS7720T, or TS7760T provides the means to manage the use of encryption
and the keys that are used on a storage-pool basis. It also acts as a proxy between the
tape drives and the IBM Security Key Lifecycle Manager (SKLM) or IBM Security Key
Lifecycle Manager for z/OS (ISKLM) by using Ethernet to communicate with the SKLM or
ISKLM (or in-band through FICONs) to the tape drives. Encryption support is enabled with
FC9900.
Rather than user-provided key labels per pool, the TS7740, TS7720T, and TS7760T can also
support the use of default keys per pool. After a pool is defined to use the default key, the
management of encryption parameters is run at the key manager. The tape encryption
function in a TS7740, TS7720T, or TS7760T does not require any host software updates
because the TS7740, TS7720T, or TS7760T controls all aspects of the encryption solution.
Although the feature for encryption support is client-installable, check with your IBM SSR for
the prerequisites and related settings before you enable encryption on your TS7740,
TS7720T, or TS7760T.
Note: The IBM Encryption Key Manager is not supported for use with TS1140 3592 E07
and TS1150 3592 E08 tape drives. Either the IBM Security Key Lifecycle Manager (SKLM)
or the IBM Security Key Lifecycle Manager for z/OS (ISKLM) is required.
You also need to create the certificates and keys that you plan to use for encrypting your
back-end tape cartridges.
Although it is possible to operate with a single key manager, configure two key managers for
redundancy. Each key manager needs to have all of the required keys in its respective
keystore. Each key manager must have independent power and network connections to
maximize the chances that at least one of them is reachable from the TS7740, TS7720T, and
TS7760T when needed.
If the TS7740, TS7720T, and TS7760T cannot contact either key manager when required,
you might temporarily lose access to migrated logical volumes. You also cannot move logical
volumes in encryption-enabled storage pools out of cache.
IBM Security Key Lifecycle Manager waits for and responds to key generation or key retrieval
requests that arrive through TCP/IP communication. This communication can be from a tape
library, tape controller, tape subsystem, device drive, or tape drive.
Additional information can be obtained from the IBM Security Key Lifecycle Manager website:
http://www.ibm.com/support/knowledgecenter/SSWPVP_2.5.0/com.ibm.sklm.doc_2.5/welco
me.htm
ISKLM manages EKs for storage, simplifying deployment and maintaining availability to data
at rest natively on the IBM Z mainframe environment. It simplifies key management and
compliance reporting for the privacy of data and compliance with security regulations.
Note: The IBM Security Key Lifecycle Manager for z/OS (ISKLM) external key manager
supports TS7700 physical tape but does not support TS7700 disk encryption.
Additional information can be obtained from the IBM Security Key Lifecycle Manager for z/OS
website:
https://www.ibm.com/support/knowledgecenter/en/SSB2KG
The TS7740, TS7720T, and TS7760T must not be configured to force the TS1120 drives into
J1A mode. This setting can be changed only by your IBM SSR. If you need to update the
Licensed Internal Code level, be sure that the IBM SSR checks and changes this setting, if
needed.
For a comprehensive TS7740, TS7720T, and TS7760T encryption implementation plan, see
“Implementing TS7700 Tape Encryption” in IBM System Storage Tape Encryption Solutions,
SG24-7320.
Authentication occurs after the FDE disk powers on, at which point it is in a locked state. If
encryption was never enabled (the lock key is not initially established between the CC9/CS9/CSA
cache controller and the disk), the disk is considered unlocked, with unrestricted access just like
a non-FDE drive.
Disk-based encryption is activated with the purchase and installation of Feature Code 5272:
Enable Disk Encryption, which is installable on the TS7720-VEB, TS7740-V07, and
TS7760-VEC (Encryption Capable Frames, as listed in the previous required feature
code list).
Key management for FDE does not use the IBM Security Key Lifecycle Manager (SKLM), or
IBM Security Key Lifecycle Manager for z/OS (ISKLM). Instead, the key management is
handled by the disk controller, either the 3956-CC9, 3956-CS9, or 3956-CSA. There are no
keys to manage by the user, because all management is done internally by the
cache controllers.
Disk-based encryption FDE is enabled on all HDDs that are in the cache subsystem (partial
encryption is not supported). It is an “all or nothing” proposition. All HDDs, disk cache
controllers, and drawers must be Encryption Capable as a prerequisite for FDE enablement.
FDE is enabled at a cluster TVC level, so you can have clusters with TVC encrypted along
with clusters with TVC that are not encrypted as members of the same grid.
When disk-based encryption is enabled on a system already in use, all previously written user
data is encrypted retroactively, without a performance penalty. After disk-based encryption is
enabled, it cannot be disabled again.
A page opens to a list of .TXT, .PDF, and .EXE files. To start, open the OVERVIEW.PDF file to see
a brief description of all the various tool jobs. All jobs are in the IBMTOOLS.EXE file, which is a
self-extracting compressed file that, after it has been downloaded to your PC, can expand into
four separate files:
IBMJCL.XMI: Job control language (JCL) for current tape analysis tools
IBMCNTL.XMI: Parameters that are needed for job execution
IBMLOAD.XMI: Load library for executable load modules
IBMPAT.XMI: Data pattern library, which is only needed if you run the QSAMDRVR utility
Two areas of investigation can assist you in tuning your current tape environment by
identifying the factors that influence the overall performance of the TS7700. Examples of such
factors are small block sizes (smaller than 16 KB) and low compression ratios, both of
which can affect performance negatively.
By keeping historical SMF data and studying its trends, an installation can evaluate changes
in the configuration, workload, or job scheduling procedures. Similarly, an installation can use
SMF data to determine wasted system resources because of problems, such as inefficient
operational procedures or programming conventions.
The examples that are shown in Table 4-17 show the types of reports that can be created
from SMF data. View the examples primarily as suggestions to assist you in planning
SMF reports.
Among the SMF record types that are used for these reports are type 04 (step end) and
type 05 (job end) records.
The following job stream was created to help analyze these records. See the installation
procedure in the member $$INDEX file:
EREPMDR: JCL to extract MDR records from the EREP history file
TAPECOMP: A program that reads either SYS1.LOGREC or the EREP history file and
produces reports on the current compression ratios and MB transferred per hour
The SMF type 21 records contain both channel-byte and device-byte information. The TAPEWISE
tool calculates data compression ratios for each volume. The following reports show
compression ratios:
HRS
DSN
MBS
VOL
TAPEWISE
The TAPEWISE tool is available from the IBM Tape Tools FTP site. TAPEWISE can, based on
input parameters, generate several reports that can help with various items:
Tape activity analysis
Mounts and MBs processed by hour
Input and output mounts by hour
Mounts by SYSID during an hour
Concurrent open drives used
Long VTS mounts (recalls)
The following job stream was created to help analyze these records. See the installation
procedure in the member $$INDEX file:
EREPMDR: JCL to extract MDR records from EREP history file
BADBLKSZ: A program that reads either SYS1.LOGREC or the EREP history file, finds
volumes writing small block sizes, and then gathers the job name and data set name from
a TMS copy
Collect the stated SMF records for all z/OS systems that share the current tape configuration
and might have data that is migrated to the TS7700. The data that is collected must span one
month (to cover any month-end processing peaks) or at least those days that represent the
peak load in your current tape environment. Check the SMFPRMxx member in SYS1.PARMLIB to see
whether the required records are being collected. If they are not being collected, arrange for
their collection.
The data-gathering flow is as follows: the TMS data is processed by the FORMCATS job into a
...FORMCATS.TMCATLG data set, and the SMF data is processed by the SORTSMF job (SMFILTER)
into a ...SMFDATA.SORTED data set. Both outputs are then packed by the BMPACKT and
BMPACKS (BMPACK) jobs.
In addition to the extract file, the following information is useful for sizing the TS7700:
Number of volumes in current tape library
This number includes all the tapes (located within automated libraries, on shelves, and
offsite). If the unloaded Tape Management Catalog (TMC) data is provided, there is no
need to collect the number of volumes.
4.5.2 BatchMagic
The BatchMagic tool provides a comprehensive view of the current tape environment and
predictive modeling of workloads and technologies. The general methodology behind this tool
involves analyzing SMF type 14, 15, 21, and 30 records, and data extracted from the TMS.
The TMS data is required only if you want to make a precise forecast of the cartridges to be
ordered based on the current cartridge usage that is stored in the TMS catalog.
When you run BatchMagic, the tool extracts data, groups data into workloads, and then
targets workloads to individual or multiple IBM tape technologies. BatchMagic examines the
TMS catalogs and estimates cartridges that are required with new technology, and it models
the operation of a TS7700 and 3592 drives (for TS7740, TS7720T, or TS7760T) and
estimates the required resources.
The reports from BatchMagic give you a clear understanding of your current tape activities.
They make projections for a TS7700 solution together with its major components, such as
3592 drives, which cover your overall sustained and peak throughput requirements.
BatchMagic is specifically for IBM internal and IBM Business Partner use.
This section highlights several important considerations when you are deciding what
workload to place in the TS7700:
Throughput
The TS7700 has a finite bandwidth capability, as does any other device that is attached to
a host system. With 8 Gb FICON channels and large disk cache repositories that operate
at disk speeds, most workloads are ideal for targeting a TS7700.
Compared with a non-TS7700 (native tape) mount, in this example a TS7700 cache hit results
in a savings in tape processing elapsed time of 40 seconds.
The time reduction in the tape processing has two effects:
– It reduces the elapsed time of the job that is processing the tape.
– It frees up a drive earlier, so the next job that needs a tape drive can access it sooner
because there is no rewind or unload and robotics time after closing the data set.
When a job attempts to read a volume that is not in the TS7740, TS7720T and TS7760T
TVC, the logical volume is recalled from a stacked physical volume back into the cache.
When a recall is necessary, the time to access the data is greater than if it were already in
the cache. The size of the cache and the use of the cache management policies can
reduce the number of recalls. Too much recall activity can negatively affect the overall
throughput of the TS7740, TS7720T, and TS7760T.
Remember: The TS7720 and TS7760 resident-only partition (CP0) features a large
disk cache and no back-end tape drives. These characteristics result in a fairly
consistent throughput at peak performance most of the time, operating with 100% of
cache hits.
During normal operation of a TS7700 grid configuration, logical volume mount requests
can be satisfied from the local TVC or a remote TVC. TS7700 algorithms can evaluate the
mount request and determine the most effective way to satisfy the request from within the
TS7700 grid.
Notes:
The term local means the TS7700 cluster that is running the logical mount to
the host.
The term remote means any other TS7700 that is participating in the same grid as
the local cluster.
The acronym TVC means tape volume cache.
With the wide range of capabilities that the TS7700 provides, unless the data sets are large or
require interchange, the TS7700 is likely a suitable place to store data.
Training must also cover how the TS7740, TS7720T, or TS7760T relates to the TS3500 or
TS4500, so that operational personnel understand which tape drives belong to the TS7740,
TS7720T, or TS7760T, and which logical library and assigned cartridge ranges are
dedicated to it.
The operational staff must be able to identify an operator intervention, and perform the
necessary actions to resolve it. They must be able to perform basic operations, such as
inserting new volumes in the TS7740, TS7720T, or TS7760T, or ejecting a stacked cartridge
by using the MI.
Storage administrators and system programmers must receive the same training as the
operations staff, plus the following information:
Software choices and how they affect the TS7700
Disaster recovery considerations
For more information about storage services and IBM Global Services, contact your IBM
marketing representative, or see the following website:
http://www.ibm.com/services
References in this publication to IBM products or services do not imply that IBM intends to
make them available in all countries in which IBM operates.
Table 4-18 can help you when you plan the preinstallation and sizing of the TS7700. Use the
table as a checklist for the main tasks that are needed to complete the TS7700 installation.
Postinstallation tasks (if any): see 11.3.1, “TS7700 components and task distribution” on
page 618.
Whether a copy is available at another TS7700 cluster in a multi-cluster grid depends on the
Copy Consistency Policy that was assigned to the logical volume when it was written. The
Copy Consistency Policy is set through the Management Class (MC) storage construct. It
specifies whether and when a copy of the data is made between the TS7700 clusters in the
grid configuration. The following Copy Consistency Policies can be assigned:
Synchronous Copy (Synch): Data that is written to the cluster is compressed and
simultaneously written to another specified cluster.
Rewind Unload (RUN): Data that is created on one cluster is copied to the other cluster as
part of successful RUN command processing.
Deferred Copy (Deferred): Data that is created on one cluster is copied to the specified
clusters after successful RUN command processing.
No Copy (None): Data that is created on one cluster is not copied to the other cluster.
Consider when the data is available on the cluster at the DR site. With Synchronous Copy, the
data is written to a secondary cluster. If the primary site is unavailable, the volume can be
accessed on the cluster that specified Synch. With RUN, unless the Copy Count Override is
enabled, any cluster that specified RUN has a copy of the volume available. With None, no
copy is written to that cluster. With Deferred, the copy is made later, so it might not yet be
available at the cluster that specified Deferred.
When you enable Copy Count Override, you can limit the number of RUN consistency
points that are required before the application is given back device end, which can result in
fewer copies of the data being available than your copy policies specify.
The Volume Removal policy for hybrid grid configurations is available in any grid configuration
that contains at least one TS7720 or TS7720T cluster and should be considered as well. The
TS7720 Disk-Only solution has a maximum storage capacity that is the size of its TVC, and
TS7720T CP0 works like TS7720. Therefore, after the cache fills, this policy enables logical
volumes to be removed automatically from cache while a copy is retained within one or more
peer clusters in the grid. If the cache is filling up, it is possible that fewer copies of the volume
exist in the grid than is expected based on the copy policy alone.
With the standard settings, host application I/O always has a higher priority than the Deferred
Copy Queue. It is normally expected that the configuration and capacity of the grid are such
that the entire Deferred Copy Queue is completed each day; otherwise, the incoming copies
cause the Deferred Copy Queue to grow continually and the RPO might not be fulfilled.
When a cluster becomes unavailable due to broken grid links, error, or disaster, the incoming
copy queue might not be complete, and the data might not be available on other clusters in
the grid. You can use BVIR to analyze the incoming copy queue, but the possibility exists that
volumes are not available. For backups, this might be acceptable, but for primary data, it
might be preferable to use a Synch copy policy rather than Deferred.
At any time, however, a logical volume is owned by a single cluster, which is called the owning
cluster. The owning cluster controls access to the volume and changes to the
attributes that are associated with the volume (such as category or storage constructs). The
cluster that has ownership of a logical volume can surrender it dynamically to another cluster
in the grid configuration that is requesting a mount of the volume.
When a mount request is received on a virtual device address, the cluster for that virtual
device must have ownership of the volume to be mounted, or must obtain the ownership from
the cluster that owns it. If the clusters in a grid configuration and the communication paths
between them are operational (grid network), the change of ownership and the processing of
logical volume-related commands are transparent to the operation of the TS7700.
However, if a cluster that owns a volume is unable to respond to requests from other clusters,
the operation against that volume fails unless more direction is given. Clusters do not
automatically assume or take over ownership of a logical volume without being directed.
This restriction prevents a failure of the grid network communication paths between the
clusters from leaving both clusters believing that they have ownership of the volume. If more
than one cluster has ownership of a volume, the volume’s data or attributes might be changed
differently on each cluster, resulting in a data integrity issue with the volume.
If a cluster fails, is known to be unavailable (for example, a power fault in the IT center), or
must be serviced, its ownership of logical volumes is transferred to the other cluster through
one of the following modes.
Guidance: The links between the TSSCs must not be the same physical links that are
also used by cluster grid gigabit (Gb) links. AOTM must have a different network to be
able to detect that a missing cluster is down, and that the problem is not caused by a
failure in the grid gigabit wide area network (WAN) links.
When AOTM is enabled by the IBM SSR, suppose that a cluster cannot obtain ownership from
the other cluster because it does not get a response to an ownership request. In this case, a
check is made through the TSSCs to determine whether the owning cluster is inoperable, or
whether the communication paths to it are not functioning. If the TSSCs determine that the
owning cluster is inoperable, they enable either read ownership takeover or WOT, depending
on what was set by the IBM SSR.
AOTM enables an ownership takeover mode after a grace period, and can be configured only
by an IBM SSR. Therefore, jobs can fail intermittently, with an option to try again, until AOTM
enables the configured takeover mode. The grace period is set to 20 minutes by default and
starts when a cluster detects that a remote cluster has failed. That detection itself can take
several minutes.
The following OAM messages can be displayed when AOTM enables the ownership takeover
mode:
CBR3750I Message from library libname: G0013 Library libname has experienced an
unexpected outage with its peer library libname. Library libname might be
unavailable or a communication issue might be present.
CBR3750I Message from library libname: G0009 Autonomic ownership takeover
manager within library libname has determined that library libname is
unavailable. The Read/Write ownership takeover mode has been enabled.
CBR3750I Message from library libname: G0010 Autonomic ownership takeover
manager within library libname determined that library libname is unavailable.
The Read-Only ownership takeover mode has been enabled.
CBR4196D Job OAM, drive xxx, volser vvvvvv, error code 140394. Reply 'R' to retry
or 'C' to cancel.
The response to this message should be ‘C’ and the logical drives in the failing cluster should
then be varied offline with VARY devicenumber, OFFLINE to prevent further attempts from the
host to mount volumes on this cluster.
A failure of a cluster causes the jobs that use its virtual device addresses to end abnormally
(abend). To rerun the jobs, host connectivity to the virtual device addresses in the other
cluster must be enabled (if it is not already), and an appropriate ownership takeover mode
selected. If the other cluster has a valid copy of a logical volume, the jobs can be tried again.
If a logical volume is being accessed in a remote cache through the Ethernet link and that link
fails, the job accessing that volume also fails. If the failed job is attempted again, the TS7700
uses another Ethernet link. If all links fail, access to any data in a remote cache is not
possible.
After the failed cluster comes back online and establishes communication with the other
clusters in the grid, the following message will be issued:
CBR3750I Message from library libname: G0011 Ownership takeover mode within
library libname has been automatically disabled now that library libname has
become available.
At this point, you can issue VARY devicenumber,ONLINE to bring the logical drives on the
cluster back into use.
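The following commands are a minimal sketch of this sequence; the device range 1C00-1C0F
is an assumed set of virtual drive addresses on the affected cluster and must be replaced with
your own device numbers:
V 1C00-1C0F,OFFLINE
After the cluster has rejoined the grid and message G0011 has been issued, the devices can
be returned to use:
V 1C00-1C0F,ONLINE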
After the cluster comes back into operation, if there are any volumes that are in a conflicting
state because they were accessed on another cluster, the following message will be issued:
CBR3750I Message from library libname: OP0316 The TS7700 Engine has detected
corrupted tokens for one or more virtual volumes.
If this occurs, see “Repair Virtual Volumes window” on page 512 for the process for repairing
the corrupted tokens.
It is now possible to change the impact code and impact text that are issued with the
CBR3750I messages. For more information, see 10.2.1, “CBR3750I Console Message” on
page 598.
The white paper documents a series of TS7700 Grid failover test scenarios for z/OS that were
run in an IBM laboratory environment. Simulations of single failures of all major components
and communication links, and some multiple failures, were run.
To switch over to the DR clusters, a simple vary online of the DR devices is all that is needed
by the production hosts to enable their usage. Another alternative is to have a separate IODF
ready with the addition of the DR devices. However, that requires an IODF activation on the
production hosts.
With a stand-alone system, a single cluster is installed. If the site at which that system is
installed is destroyed, the data that is associated with the TS7700 might be lost unless COPY
EXPORT was used and the tapes were removed from the site. If the cluster goes out of
service due to failures, whether the data is recoverable depends on the failure type.
Remember: The DR process is a joint exercise that requires your involvement and that of
your IBM SSR to make it as comprehensive as possible.
For many clients, the potential data loss or the recovery time that is required with a
stand-alone TS7700 is not acceptable because the COPY EXPORT method might take some
time to complete. For those clients, the TS7700 grid provides a near-zero data loss and
expedited recovery-time solution. With a multi-cluster grid configuration, up to six clusters are
installed, typically at two or three sites, and interconnected so that data is replicated among
them. The way that the sites are used then differs, depending on your requirements.
In a two-cluster grid, one potential use case is that one of the sites is the local production
center and the other site is a backup or DR center, which is separated by a distance that is
dictated by your company’s requirements for DR. Depending on the physical distance
between the sites, it might be possible to have two clusters be both a high availability and DR
solution.
In a three-cluster grid, the typical use is that two sites are connected to a host and the
workload is spread evenly between them. The third site is strictly for DR and there probably
are no connections from the production host to the third site. Another use for a three-cluster
grid might consist of three production sites, which are all interconnected and holding the
backups of each other.
In a four or more cluster grid, DR and high availability can be achieved. The high availability is
achieved with two local clusters that keep RUN or SYNC volume copies, with both clusters
attached to the host. The third and fourth (or more) remote clusters can hold deferred volume
copies for DR. This design can be configured in a crossed way, which means that you can run
two production data centers, with each production data center serving as a backup for the
other.
The only connection between the production sites and the DR site is the grid interconnection.
There is normally no host connectivity between the production hosts and the DR site’s
TS7700. When client data is created at the production sites, it is replicated to the DR site as
defined through Outboard policy management definitions and storage management
subsystem (SMS) settings.
Two-cluster grid
With a two-cluster grid, you can configure the grid for DR, high availability, or both.
Configuration considerations for two-cluster grids are described. The scenarios that are
presented are typical configurations. Other configurations are possible, and might be better
suited for your environment.
In a DR configuration, a natural or human-caused event has made the local site’s cluster
unavailable. The two clusters are in separate locations, which are separated by a distance
that is dictated by your
company’s requirements for DR. The only connections between the local site and the DR site
are the grid interconnections. There is no host connectivity between the local hosts and the
DR site cluster.
(Figure: production hosts at the local site and a backup DR host, with the production and DR
clusters connected through the WAN grid links.)
In a high-availability configuration, both clusters are within metro distance of each other.
These clusters are connected through a LAN. If one of them becomes unavailable because it
has failed, or is undergoing service or being updated, data can be accessed through the other
cluster until the unavailable cluster is made available.
As part of planning a grid configuration to implement this solution, consider the following
information:
Plan for the virtual device addresses in both clusters to be configured to the local hosts. In
this way, a total of 512 or 992 virtual tape devices are available for use (256 or 496 from
each cluster).
Set up a Copy Consistency Point of RUN for both clusters for all data to be made highly
available. With this Copy Consistency Point, as each logical volume is closed, it is copied
to the other cluster.
Design and code the DFSMS ACS routines and the MCs on the TS7700 to set the necessary
Copy Consistency Points (a sample ACS fragment follows this list).
Ensure that AOTM is configured for an automated logical volume ownership takeover
method in case a cluster becomes unexpectedly unavailable within the grid configuration.
Alternatively, prepare written instructions for the operators that describe how to perform
the ownership takeover manually, if necessary. See 2.3.34, “Autonomic Ownership
Takeover Manager” on page 96 for more details about AOTM.
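The following ACS routine fragment is a minimal sketch of the MC assignment step. The MC
name MCRUNRUN (assumed to be defined at the TS7700 MI with a RUN,RUN Copy Consistency
Point) and the data set mask are assumptions; adapt both to your own naming conventions:
PROC MGMTCLAS
  FILTLIST HAVOL INCLUDE(PROD.BKUP.**)   /* assumed data set mask        */
  IF &DSN = &HAVOL THEN
    SET &MGMTCLAS = 'MCRUNRUN'           /* MC with RUN,RUN copy mode    */
END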
(Figures: a high-availability two-cluster configuration with Cluster 0 and Cluster 1 connected
through Ethernet at the local site, both assigned RUN,RUN (RR) Copy Consistency Points; and a
configuration with production hosts attached through FICON channels or DWDM, and a backup DR
host and its TS7700 cluster reached through the WAN grid links.)
The planning considerations for a two-cluster grid also apply to a three-cluster grid.
Host data that is written to Cluster 0 is copied to Cluster 1 at RUN time, or earlier with
Synchronous mode copy. Host data that is written to Cluster 1 is copied to Cluster 0 at RUN time.
Host data that is written to Cluster 0 or Cluster 1 is copied to Cluster 2 on a Deferred basis.
The Copy Consistency Points at the DR site (NNR or NNS) are set to create a copy only of
host data at Cluster 2. Copies of data are not made to Cluster 0 and Cluster 1. This enables
DR testing at Cluster 2 without replicating to the production site clusters.
Figure 5-4 shows an optional host connection that can be established to the remote Cluster 2
by using DWDM or channel extenders. With this configuration, you must define an extra 256
or 496 virtual devices at the host.
(Figure 5-4: three-cluster grid with an optional DWDM or channel extension host connection to
Cluster 2; Copy Consistency Points of RRD at Cluster 0 and Cluster 1 and NNR at Cluster 2, with
the clusters interconnected through the WAN.)
All virtual devices in Cluster 0 and Cluster 1 are online to the host; Cluster 2 devices are offline.
The virtual devices in Cluster 0 are online to Host A and the virtual devices in Cluster 1 are
online to Host B. The virtual devices in Cluster 2 are offline to both hosts. Host A and Host B
access their own set of virtual devices that are provided by their respective clusters. Host data
that is written to Cluster 0 is not copied to Cluster 1. Host data that is written to Cluster 1 is
not written to Cluster 0. Host data that is written to Cluster 0 or Cluster 1 is copied to Cluster 2
on a Deferred basis.
The Copy Consistency Points at the DR site (NNR or NNS) are set to create only a copy of
host data at Cluster 2. Copies of data are not made to Cluster 0 and Cluster 1. This enables
DR testing at Cluster 2 without replicating to the production site clusters.
Figure 5-5 shows an optional host connection that can be established to remote Cluster 2
using DWDM or channel extenders.
(Figure 5-5: three-cluster grid with Host A attached to Cluster 0 and Host B attached to
Cluster 1, an optional DWDM or channel extension connection to Cluster 2, and Copy Consistency
Points of RND at Cluster 0, NRD at Cluster 1, and NNR at Cluster 2.)
This configuration, which provides high-availability production cache if you choose to run
balanced mode with three copies (R-R-D for both Cluster 0 and Cluster 1), is depicted in
Figure 5-6.
Figure 5-6 Three-cluster high availability and disaster recovery with two TS7700Ds (Cluster 0 and
Cluster 1) and one TS7740/TS7700T (Cluster 2) tape library
Another variation of this model uses a TS7700D and a TS7740/TS7700T for the production
site, as shown in Figure 5-7, both replicating to a remote TS7740/TS7700T.
Figure 5-7 Three-cluster high availability and disaster recovery with two TS7740/TS7700T tape
libraries (Cluster 1 and Cluster 2) and one TS7700D (Cluster 0)
In both models, if a TS7700D reaches the upper threshold of usage, the PREFER REMOVE
data, which has already been replicated to the TS7740/TS7700T, is removed from the
TS7700D cache followed by the PREFER KEEP data. PINNED data can never be removed
from a TS7700D cache or a TS7700T CP0.
In the example that is shown in Figure 5-7, you can have particular workloads that favor the
TS7740/TS7700T, and others that favor the TS7700D, suiting a specific workload to the
cluster best equipped to perform it.
Four-cluster grid
A four-cluster grid that can have both sites for dual purposes is described. Both sites are
equal players within the grid, and any site can play the role of production or DR, as required.
(Figure 5-8: four-cluster grid with production hosts and a backup DR host; data is replicated
from each production cluster to one DR cluster.)
You can have host workload balanced across both clusters (Cluster 0 and Cluster 1 in
Figure 5-8). The logical volumes that are written to a particular cluster are only replicated to
one remote cluster. In Figure 5-8, Cluster 0 replicates to Cluster 2 and Cluster 1 replicates to
Cluster 3. This task is accomplished by using copy policies. For the described behavior, copy
mode for Cluster 0 is RDRN or SDSN and for Cluster 1 is DRNR or DSNS.
This configuration delivers high availability at both sites, production and DR, without four
copies of the same tape logical volume throughout the grid.
If this example were not within Metro Mirror distances, use copy policies of RDDN on Cluster 0
and DRND on Cluster 1.
Figure 5-9 Four-cluster grid high availability and disaster recovery - Cluster 0 outage
During the outage of Cluster 0 in the example, new jobs for write use only one half of the
configuration (the unaffected partition in the lower part of Figure 5-9). Jobs for read can
access content in all available clusters. When power is normalized at the site, Cluster 0 starts
and rejoins the grid, reestablishing the original balanced configuration.
In a DR situation, the backup host in the DR site operates from the second high availability
pair, which is the pair of Cluster 2 and Cluster 3 in Figure 5-9. In this case, copy policies can
be RNRD for Cluster 2 and NRNR for Cluster 3.
If these sites are more than Metro Mirror distance, you can have Cluster 2 copy policies of
DNRD and Cluster 3 policies of NDDR.
You might also want to update the library nicknames that are defined through the MI for the
grid and cluster to match the library names defined to DFSMS. That way, the names that are
shown on the MI windows match those names that are used at the host for the composite
library and distributed library.
To set up the composite name that is used by the host to be the grid name, complete the
following steps:
1. Select Configuration → Grid Identification Properties.
2. In the window that opens, enter the composite library name that is used by the host in the
grid nickname field.
3. You can optionally provide a description.
For a bank in its batch window, with no alternative way to bypass a 12-hour TS7700 outage,
this situation can be a real disaster. However, if the bank has a three-cluster grid (two local
clusters and one remote), the same situation is less dire because the batch window can continue
by accessing the second local TS7700.
Because no set of fixed answers exists for all situations, you must carefully and clearly define
which situations can be considered real disasters, and which actions to perform for all
possible situations.
Several differences exist between a DR test situation and a real disaster situation. In a real
disaster situation, you do not have to do anything to be able to use the DR TS7700, which
makes your task easier. However, this easy-to-use capability does not mean that you have all
the cartridge data copied to the DR TS7700.
In a real disaster scenario, the whole primary site is lost. Therefore, you need to start your
production systems at the DR site. To do this, you need a copy of all your information at the
DR site: not only the tape data, but also all the DASD data.
After you can start the z/OS partitions, from the TS7700 perspective, you must be sure that
your hardware configuration definition (HCD) “sees” the DR TS7700. Otherwise, you cannot
put the TS7700 online.
You must also change the ownership takeover setting. To perform that task, go to the MI and
enable ownership takeover for read and write.
None of the customizations that you made for DR testing are needed during a real disaster.
Production tape ranges, scratch categories, SMS definitions, the RMM inventory, and so on, are
part of the real configuration that is on DASD and is copied from the primary site.
After you are in a stable situation at the DR site, you need to start the tasks that are required
to recover your primary site or to create a new site. The old DR site is now the production site,
so you must create a new DR site.
The default behavior of the TS7740 in selecting which TVC is used for the I/O is to follow the
MC definitions and considerations to provide the best overall job performance. However, it
uses a logical volume in a remote TS7740’s TVC, if required, to perform a mount operation
unless override settings on a cluster are used.
To direct the TS7740 to use its local TVC, complete the following steps:
1. For the MC that is used for production data, ensure that the local cluster has a Copy
Consistency Point. If it is important to know that the data is replicated at job close time,
specify a Copy Consistency Point of RUN or Synchronous mode copy.
If some amount of data loss after a job closes can be tolerated, a Copy Consistency Point
of Deferred can be used. You might have production data with different data loss
tolerance. If that is the case, you might want to define more than one MC with separate
Copy Consistency Points. In defining the Copy Consistency Points for an MC, it is
important that you define the same copy mode for each site because in a site switch, the
local cluster changes.
The TS7700 provides a capability called Bulk Volume Information Retrieval (BVIR). If there is
an unplanned interruption to tape replication, GDPS uses this BVIR capability to
automatically collect information about all volumes in all libraries in the grid where the
replication problem occurred. In addition to this automatic collection of in-doubt tape
information, it is possible to request GDPS to perform BVIR processing for a selected library
by using the GDPS window interface at any time.
GDPS supports a physically partitioned TS7700. For more information about the steps that
are required to partition a TS7700 physically, see Appendix I, “Case study for logical
partitioning of a two-cluster grid” on page 919.
The complete instructions for implementing GDPS with the TS7700 can be found in the
GDPS manuals.
For more information about defining a tape subsystem in a DFSMS environment, see IBM
TS3500 Tape Library with System z Attachment A Practical Guide to Enterprise Tape Drives
and TS3500 Tape Automation, SG24-6789, and IBM TS4500 R4 Tape Library Guide,
SG24-8235.
The TS7720D and TS7760D do not have a tape library that is attached, so the
implementation steps that are related to a physical tape library, for IBM TS4500 or IBM
TS3500, do not apply.
You can install the TS7760T, TS7740, or TS7720T together with your existing TS3500 tape
library, or install them with a new TS4500 to serve as the physical back-end tape library.
When using a TS3500 as the physical back-end tape library to either the TS7760T, TS7740,
or TS7720T, the IBM 3953 Library Manager is no longer required because the Library
Manager functions are provided by the TS7700 Licensed Internal Code.
These three groups of implementation tasks can be done in parallel or sequentially. HCD and
host definitions can be completed before or after the actual hardware installation.
Your IBM Service Support Representative (IBM SSR) installs the TS7760T, TS7740, or
TS7720T hardware, its associated tape library, and the frames. This installation does not
require your involvement other than the appropriate planning. For more information, see
Chapter 4, “Preinstallation planning and sizing” on page 135.
Clarification: The steps that are described in this section relate to the installation of a new
IBM TS4500/TS3500 tape library with all of the required features, such as Advanced
Library Management System (ALMS), installed. If you are attaching an existing IBM
TS3500 tape library that is already attached to Open Systems hosts to IBM Z hosts, see
IBM TS3500 Tape Library with System z Attachment A Practical Guide to Enterprise Tape
Drives and TS3500 Tape Automation, SG24-6789, or IBM TS4500 R4 Tape Library Guide,
SG24-8235 for extra actions that might be required.
The following tasks are for TS4500/TS3500 library definition. For the detailed procedure, see
9.5.1, “The tape library with the TS7700T cluster” on page 521.
Defining a logical library
– Ensuring that ALMS is enabled
– Creating a new logical library with ALMS
– Setting the maximum cartridges for the logical library
Adding drives to the logical library
Each tape-attached TS7700 is associated with only one logical library in the physical
library, and each logical library can be associated with only one TS7700. A TS7700T
requires a minimum of four installed tape drives to be operational, and a maximum of
16 can be used.
Defining control path drives
Each TS7760T, TS7740, or TS7720T requires that four of the installed tape drives be defined
as control path drives.
Defining the Encryption Method for the new logical library
The TS7700T supports encryption only when the logical library is configured to use the
System-Managed encryption method.
Defining CAPs (Cartridge Assignment Policies)
Inserting TS7760T, TS7740, or TS7720T physical volumes
Assigning cartridges in the TS4500/TS3500 tape library to the logical library partition
This procedure is necessary only if a cartridge was inserted, but a Cartridge Assignment
Policy (CAP) was not provided in advance.
The tasks that are listed in this section are for TS7760T, TS7740, or TS7720T only:
Defining VOLSER ranges for physical volumes
Defining physical volume pools:
– Reclaim threshold setting
– Inhibit Reclaim schedule
CUADD   Unit addresses   LIBPORT-ID
0       00-0F            01
1       00-0F            02
2       00-0F            03
3       00-0F            04
4       00-0F            05
5       00-0F            06
6       00-0F            07
7       00-0F            08
8       00-0F            09
9       00-0F            0A
A       00-0F            0B
B       00-0F            0C
C       00-0F            0D
D       00-0F            0E
E       00-0F            0F
F       00-0F            10
10      00-0F            11
11      00-0F            12
12      00-0F            13
13      00-0F            14
14      00-0F            15
15      00-0F            16
16      00-0F            17
17      00-0F            18
18      00-0F            19
19      00-0F            1A
1A      00-0F            1B
1B      00-0F            1C
1C      00-0F            1D
1D      00-0F            1E
1E      00-0F            1F
Table 6-2 CUADD and LIBPORT-ID for the first set of 256 virtual devices
CU 1 2 3 4 5 6 7 8
CUADD 0 1 2 3 4 5 6 7
LIBPORT-ID 01 02 03 04 05 06 07 08
For the ninth to sixteenth CUs, use CUADD=8 - CUADD=F and LIBPORT-IDs of 09 - 10, as
shown in Table 6-3.
Table 6-3 CUADD and LIBPORT-ID for the second set of virtual devices
CU 9 10 11 12 13 14 15 16
CUADD 8 9 A B C D E F
LIBPORT-ID 09 0A 0B 0C 0D 0E 0F 10
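As an illustration only, the following IOCP-style statements sketch the definition of the first
of these CUs and its 16 devices. The CU number 0440, the CHPID, the link address, and the
device numbers are assumptions; the LIBRARY-ID and LIBPORT-ID values are supplied later
through the HCD OS device parameters rather than in these statements:
CU0440   CNTLUNIT CUNUMBR=0440,PATH=((CSS(0),50)),LINK=((CSS(0),D6)),  *
               UNIT=3490,CUADD=0,UNITADD=((00,16))
DEV0440  IODEVICE ADDRESS=(0440,16),CUNUMBR=(0440),UNIT=3490,          *
               UNITADD=00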
Figure 6-1 and Figure 6-2 on page 231 show the two important windows for specifying a
tape CU. To define devices by using HCD, complete the following steps:
1. Specify the CU number and the type (3490), as shown in Figure 6-1. Press Enter.
Connected to switches . . . 01 01 01 01 __ __ __ __ +
Ports . . . . . . . . . . . D6 D7 D8 D9 __ __ __ __ +
If connected to a switch:
Unit address . . . . . . 00 __ __ __ __ __ __ __ +
Number of units . . . . 16 ___ ___ ___ ___ ___ ___ ___
Tip: When the TS7700 is not attached through Fibre Channel connection (FICON)
directors, the link address fields are blank.
3. Repeating the previous process, define the 2nd - 16th TS7700 virtual tape CUs, specifying
the logical unit address (CUADD)=1 - F, in the Add Control Unit windows. The Add Control
Unit summary window is shown in Figure 6-2.
4. To define the TS7700 virtual drives, use the Add Device window that is shown in
Figure 6-3.
Connected to CUs . . 0440 ____ ____ ____ ____ ____ ____ ____ +
Preferred CHPID . . . . . . . . __ +
Explicit device candidate list . No (Yes or No)
6. After you enter the required information and specify to which operating systems the
devices are connected, the window in Figure 6-5 is displayed, where you can update the
device parameters.
Parameter/Feature   Value   P Req.   Description
OFFLINE Yes Device considered online or offline at IPL
DYNAMIC Yes Device supports dynamic configuration
LOCANY No UCB can reside in 31 bit storage
LIBRARY Yes Device supports auto tape library
AUTOSWITCH No Device is automatically switchable
LIBRARY-ID CA010 5-digit library serial number
LIBPORT-ID 01 2 digit library string ID (port number)
MTL No Device supports manual tape library
SHARABLE No Device is Sharable between systems
COMPACT Yes Compaction
***************************** Bottom of data ****************************
F1=Help F2=Split F4=Prompt F5=Reset F7=Backward
F8=Forward F9=Swap F12=Cancel F22=Command
To define the remaining TS7700 3490E virtual drives, repeat this process for each CU in your
implementation plan.
As an alternative to the procedures described next, you can always perform an initial program
load (IPL) or restart of the system.
After activation, you can check the details by using the DEVSERV QTAPE command. See 10.1.2,
“MVS system commands” on page 585.
If LIBRARY-ID (LIBID) and LIBPORT-ID are not coded, after you delete the library’s dynamic
control blocks, complete the following steps:
1. Run MVS console command VARY ONLINE to vary on the devices in the library. This
creates some control blocks, and you see the following message:
IEA437I TAPE LIBRARY DEVICE(ddd), ACTIVATE IODF=xx, IS REQUIRED
2. Activate an IODF that defines all of the devices in the modified library.
3. Use QLIB LIST to verify that the ACTIVE control blocks are properly defined.
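A minimal sketch of the command sequence follows; the device range 0440-044F, the IODF
suffix xx, and the device count of 16 are assumptions that must match your configuration:
V 0440-044F,ONLINE
ACTIVATE IODF=xx
DS QL,LIST
DS QT,0440,16
DS QL,LIST shows the library control blocks in the active configuration, and DS QT displays
the emulated 3490E devices so that you can verify that they were defined as intended.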
The MIH timeout value applies only to the virtual 3490E drives and not to the real IBM
TS1150/TS1140/TS1130/TS1120/3592 drives that the TS7740 manages in the back end.
The host knows only about logical 3490E devices.
Table 6-4 describes traditionally used MIH values, which can be adjusted depending on
specific operational factors. For example, for a TS7700 multi-cluster grid with 3490E
emulation drives that does not use Rewind Unload (RUN) copy policies, the traditional
value is 20 minutes.
If adjustment is needed, the MIH value can be specified in the PARMLIB member IECIOSxx.
Alternatively, you can set the MIH value by using the IBM Z operator command SETIOS. A
user-specified MIH value overrides the default mentioned above and remains in effect until it
is manually changed or until the system is initialized.
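As a sketch only, with the 20-minute value from Table 6-4 and an assumed virtual device range
of 0440-04DF, the setting can be coded in IECIOSxx or entered from the console:
MIH TIME=20:00,DEV=(0440-04DF)
SETIOS MIH,DEV=(0440-04DF),TIME=20:00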
For more information about MIH settings, see MVS Initialization and Tuning Reference,
SA23-1380.
During IPL (if the device is defined to be ONLINE) or during the VARY ONLINE in process, some
devices (such as the IBM 3590/3592 physical tape devices) might present their own MIH
timeout values through the primary/secondary MIH timing enhancement that is contained in
the self-describing data for the device. The Primary MIH Time OUT value is used for most I/O
commands, but the Secondary MIH Time OUT value can be used for special operations, such
as long-busy conditions or long-running I/O operations.
Any time that a user specifically sets a device or device class to an MIH timeout value that is
different from the default for the device class that is set by IBM, that value overrides the
device-established Primary MIH Time OUT value. This implies that if an MIH timeout value
that is equal to the MIH default for the device class is explicitly requested, IOS does not
override the device-established Primary MIH Time OUT value. To override the
device-established Primary MIH Time OUT value, you must explicitly set a timeout value that
is not equal to the MIH default for the device class.
This type of failure is known as Sick But Not Dead (SBND). Even if only a single cluster is in
this condition, the failure can degrade the performance or functionality (or both) of the other
clusters that are present in the same grid.
Grid-wide consequences of this type of problem can be prevented by a “fence” mechanism
that is available when all machines in the grid are running code level R4.1.2 or later. The
mechanism isolates the failing cluster so that it can receive the appropriate repair action, while
the other machines in the grid keep running without being affected by the unscheduled
peer outage.
Note: A detailed description of the Host Console Request functions and their
responses is available in the IBM Virtualization Engine TS7700 Series z/OS Host
Command Line Request User’s Guide, which is available at the Techdocs website:
http://www.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/WP101091
To use the TS7700, at least one SG must be created to enable the TS7700 tape library virtual
drives to be allocated by the storage management subsystem (SMS) ACS routines. Because
all of the logical drives and volumes are associated with the composite library, only the
composite library can be defined in the SG.
See the following resources for information about host software implementation tasks for IBM
tape libraries:
z/OS DFSMS OAM Planning, Installation, and Storage Administration Guide for Tape
Libraries, SC23-6867
IBM TS3500 Tape Library with System z Attachment A Practical Guide to Enterprise Tape
Drives and TS3500 Tape Automation, SG24-6789
IBM TS4500 R4 Tape Library Guide, SG24-8235
If your TMS is DFSMS Removable Media Manager (DFSMSrmm), the following manuals can
be useful:
DFSMSrmm Primer, SG24-5983
z/OS DFSMSrmm Implementation and Customization Guide, SC23-6874
If this installation is the first SMS tape library at this host, additional steps are required. The
full product documentation for your TMS must be consulted, in addition to the OAM Planning,
Installation, and Storage Administration Guide that is listed in the previous bullets.
Complete the following steps to define the TS7700 tape library in an existing z/OS SMStape
environment:
1. Use the ISMF Library Management → Tape Library → Define panel to define the tape
library as a DFSMS resource. Define the composite library and one or more distributed
libraries. Library names cannot start with a “V”. In this example, we define one composite
library that is named IBMC1 and a single distributed library that is named IBMD1.
Remember: Library ID is the only field that applies for the distributed libraries. All
other fields can be blank or left as the default.
2. Using the Interactive Storage Management Facility (ISMF), create or update the DCs,
SCs, and MCs for the TS7700. Ensure that these defined construct names are the same
as those that you defined at the TS7700 MI.
3. Using ISMF, create the SGs for the TS7700. Ensure that these defined construct names
are the same as those that you defined at the TS7700 MI.
The composite library must be defined in the SG. Do not define the distributed libraries in
the SG.
4. Update the ACS routines to assign the constructs that are needed to use the TS7700 and
then convert, test, and validate the ACS routines.
5. Customize your TMS to include the new volume ranges and library name. For
DFSMSrmm, this involves EDGRMMxx updates for VLPOOL and LOCDEF statements, in
addition to any OPENRULE and PRTITION statements that are needed (sample statements
follow this list). If you leave REJECT ANYUSE(*) in your EDGRMMxx member, you cannot
use any tape volume serial numbers (VOLSERs) that were not previously defined to RMM.
6. Consider whether your current TMS database or control data set (CDS) has sufficient
space for the added library, data set, and volume entries. For DFSMSrmm, consult the
IBM Redbooks Publication DFSMSrmm Primer, SG24-5983 under the topic “Creating the
DFSMSrmm CDS.”
7. Modify the SYS1.PARMLIB member DEVSUPxx for the new categories, as described in
4.3.4, “Sharing and partitioning considerations” on page 175 (a sample follows this list). The
DEVSUPxx default categories should always be changed to prevent disruption of library
operations. For more information, see the “Changing the library manager category
assignments in an ATLDS” topic in z/OS V2R2.0 DFSMS OAM Planning, Installation, and
Storage Administration Guide for Tape Libraries, SC23-6867.
8. Consider updating COMMNDxx to vary the library online at IPL if you have not already
set it to come online during the definition of the library through ISMF. For more information,
see 10.1.1, “DFSMS operator commands” on page 582. The OAM address space must be
active for the vary to complete successfully.
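The following statements are a minimal sketch for steps 5, 7, and 8. The VOLSER prefix A*,
the category values, and the library name IBMC1 are assumptions; replace them with your own
values and coordinate them across all attached systems:
EDGRMMxx:
VLPOOL PREFIX(A*) TYPE(S) DESCRIPTION('TS7700 LOGICAL VOLUMES')
DEVSUPxx:
MEDIA2=0012,ERROR=001E,PRIVATE=001F
COMMNDxx:
COM='VARY SMS,LIBRARY(IBMC1),ONLINE'
The LOCDEF, OPENRULE, and PRTITION statements depend on how the library is shared and
partitioned, so they are not shown here.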
This section describes the hardware components that are part of the TS7700. These
components include the TS7720, TS7740, and TS7760, which can be attached to an IBM
TS3500 or TS4500 tape library that is configured with IBM 3592 tape drives.
1. A TS7760 Cache Controller and up to five attached TS7760 Cache Drawers are referred to as a string, with each
TS7760 Cache Controller acting as the “head of [the] string.” A single TS7700 can have up to three “strings”
attached: the first in the base frame (base string) and the next two in the expansion frames (string 1 and string 2).
The 3952 Tape Frame can be designated as a TS7700 Storage Expansion Frame when
ordered with FC 7334, TS7700 Encryption-capable expansion frame.
Any lock on the 3952 Tape Frame prevents access to the TS7700 Emergency Power Off
(EPO) switch. If a lock (FRU 12R9307) is installed on the 3952 Tape Frame, an external EPO
switch or circuit breaker must be installed near the TS7700 to allow an emergency shutdown.
Additionally, the emergency contact label that is included with the Installation Instruction RPQ
8B3585 (Front/Rear and Side Panel Locking Procedure), PN 46X6208, must be completed
and affixed to the 3952 Tape Frame door in an immediately visible location. This label must
clearly indicate the location of the external EPO switch or circuit breaker.
If a lock is installed on the 3952 Tape Frame and the original key is not available, any 3952
Tape Frame key can be used to open the lock. If no frame key is available and immediate
access is required to get inside the frame, you must contact a locksmith to open the lock. If
the key is still unavailable after the lock is opened, you can contact your IBM service
representative to order a new lock and key set (FRU 12R9307).
The IBM 3952 Model F05 Tape Base Frame provides up to 36U (rack units, or Electronic
Industries Alliance (EIA) units) of usable space. The IBM 3952 Model F06 Tape Base Frame
provides up to 40U of usable space. The rack units contain the components of the defined
tape solution.
The 3952 Tape Base Frame is not a general-purpose frame. 3952 Model F05 is designed to
contain the components of specific tape offerings, such as the TS7740, TS7720, and
TS7720T. 3952 Model F06 is designed to contain the components of TS7760.
Only components of one solution family can be installed in a 3952 Tape Frame. The 3952
Tape Frame is configured with Dual AC Power Distribution units for redundancy.
Note: Available by RPQ, feature 8B3670 allows the cable to exit from the top of the 3952
F06 frame.
Ethernet switches
Primary and alternate switches are used in the TS7700 internal network communications for
redundancy.
The communications to the external network use a set of dedicated Ethernet ports on
adapters in the 3957 server. Internal network communications (interconnecting TS7700
switches, TSSC, Disk Cache System, and TS3500 or TS4500 when present) use their own
set of Ethernet ports on adapters in the I/O Expansion Drawers.
Depending on your bandwidth and availability needs, TS7700 can be configured with two or
four 1-Gb links. Feature Code 1034 (FC1034) is needed to enable the second pair of ports in
the grid adapters. These ports can be either fiber SW or copper. Optionally, there is a choice
of two longwave (LW) single-ported Optical Ethernet adapters (FC1035) for two or four 10-Gb
links. Your network infrastructure must support 10 Gbps for this selection. The adapter does
not scale down to 1 Gbps.
The Ethernet adapters cannot be intermixed within the same cluster; they must be of the
same type (same feature code).
FDE simplifies security planning and provides unparalleled security assurance with
government-grade encryption. FDE uses the Advanced Encryption Standard (AES) 256 bit
encryption to protect the data. This algorithm is approved by the US government for
protecting secret-level classified data. Data is protected through the hardware lifecycle and
enables return of defective drives for servicing. It also allows for quick decommission or
repurposing of drives with instant cryptographic erase.
FDE preserves performance because the encryption is hardware-based in the disk drive.
FDE does not slow the system down, because the encryption engine matches the drive’s
maximum port speed and scales as more drives are added. For drive-level encryption
to provide this value, the key that enables encryption must be protected and managed.
There are two types of keys that are used with the FDE drives. The data encryption key is
generated by the drive and never leaves the drive, so it always stays secure. It is stored in an
encrypted form within the drive and performs symmetric encryption and decryption of data at
full disk speed with no effect on disk performance. Each FDE drive uses its own unique
encryption key, which is generated when the disk is manufactured and regenerated when
required by the SSR.
The lock key or security key is a 32-byte random number that authenticates the drive with the
CC9/CS9/CSA Cache Controller by using asymmetric encryption for authentication. When
the FDE drive is secure enabled, it must authenticate with the CC9/CS9/CSA Cache
Controller or it does not return any data and remains locked.
After the drive is authenticated, access to the drive operates like an unencrypted drive. One
security key is created for all FDE drives attached to the CC9/CS9/CSA cache controller and
CX9/XS9/CSA Cache Expansion drawers. The authentication key is generated, encrypted,
and hidden in the subsystem (NVSRAM) of the CC9/CS9/CSA Cache Controller in each of
the two CECs. The TS7700 stores a third copy in the Vxx persistent storage disks. A method
is provided to securely export a copy to DVD.
Authentication typically occurs only after the FDE drive starts, at which point it is in a
“locked” state. If encryption was never enabled (the lock key was not initially established
between the CC9/CS9/CSA Cache Controller and the disk), the disk is considered unlocked with
unlimited access, as in a non-FDE drive.
The lock key or security key is set up by the SSR using the SMIT panels. There are two
feature codes that are required to enable FDE. Feature Code 7404 is required on all
3956-CC9, 3956-CX9, 3956-CS9, 3956-XS9, 3956-CSA, and 3956-XSA cache drawers.
Through the SMIT menus, the SSR can “Re-generate Encryption Key” on the cache
subsystem disks by requesting the FDE drives to erase their data encryption key and
generate a new one. After FDE is enabled, it cannot be disabled.
Note: Only IBM Security Key Lifecycle Manager (SKLM) supports both external disk
encryption and TS1140 and TS1150 tape drives. The settings for Encryption Server are
shared for both tape and external disk encryption.
The IBM Security Key Lifecycle Manager for z/OS (ISKLM) external key manager supports
TS7700 physical tape, but does not support TS7700 disk encryption.
The switches can be in a frame that contains some of the associated back-end drives, or can
reside in a frame that does not contain any of the associated drives. The switches are placed
at the bottom of the tape library frame. The fibre patch panel must be removed from the frame
if it has one.
A frame that contains the back-end switches can still house up to 12 or 16 drives, depending
on whether it is a TS3500 or a TS4500. Feature code 4879 supplies the mounting hardware for
the back-end switches and a pair of dressed eight-fiber cable trunks to connect the back-end
switches to the associated back-end drives in the frame.
Only eight pairs are supplied in the trunks because the preferred practice for
TS7740/TS7720T/TS7760T drive placement is to split the drives evenly between two frames.
For drives that are not in the frame that contains the back-end switches, cables must be run
directly to the drives, because the patch panel has been removed.
Note: The TS7760T does not support 4 Gbps and 8 Gbps fiber switches for connection to
the back-end drives. Currently, a TS7760T must use the 16 Gbps fiber switch to connect to
the back-end drives.
Table 7-1 shows the feasible combinations of TS7700, TSx500, and the necessary switches.
Figure 7-2 Single frame layout of a TS7760 with a manufacturing installed 3957-VEC, 3956-CSA, and
3956-XSA
Figure 7-3 Layout of a TS7760 Storage Expansion Frame with 3956-CSA and 3956-XSA
The TS7760 Storage Expansion Frame is a 3952 Tape Frame that is designated as a cache
expansion frame for use with a fully configured TS7700 Base Frame.
The distance between a TS7760 Storage Expansion Frame and the TS7700 Base Frame
cannot exceed 10 meters. This distance enables connection of the frames by using a
30-meter fibre cable.
Each Expansion Unit I/O adapter drawer offers the following features:
Six additional hot-pluggable PCIe cartridge style slots (used to house FICON adapters for
host attachment, Fibre Channel adapters for cache drawer attachment, Fibre Channel
adapters for tape communication, and Ethernet adapters for grid communication).
Redundant AC power
Redundant cooling
Concurrent maintenance of:
– PCIe or PCI-X adapters
– Two power supplies
– Two fans
Figure 7-6 shows the TS7700 Server Expansion Unit I/O drawer.
Figure 7-6 TS7700 Server Expansion Unit I/O drawer (rear view)
Figure 7-7 shows the TS7760 Cache Controller from the front.
Figure 7-8 shows the TS7760 Cache Controller from the rear.
The TS7760 Cache Drawer expands the capacity of the TS7760 Cache Controller by
providing additional dynamic disk pools-protected disk storage. Each TS7760 Cache Drawer
offers the following features:
Two Environmental Services Module (ESM) cards
Two AC power supplies with embedded enclosure cooling units
12 DDMs, each with a storage capacity of 4 TB or 8 TB
Supports Advanced Encryption Standard (AES) 256-bit encryption
Attachment to the TS7760 Cache Controller
Figure 7-10 shows the TS7760 Cache drawer from the rear.
The TS7720 consists of a 3952 Model F05 Encryption Capable Base Frame and one or two
optional 3952 Model F05 Encryption Capable Storage Expansion Frames. FC5272 enables
FDE on the VEB. FC7404 is needed to enable FDE on each cache drawer. After it is enabled,
FDE cannot be disabled.
The 3952 Model F05 Tape Base Frame houses the following components:
One TS7720 Server, 3957 Model VEB.
One TS7700 I/O Expansion Drawer (primary and alternative).
One TS7720 Encryption Capable 3956-CS9 Cache Controller Drawer. The controller
drawer has 0 - 9 TS7720 Encryption Capable 3956-XS9 Cache Expansion Drawers. The
base frame must be fully configured before you can add a first storage expansion frame.
Two Ethernet switches.
The 3952 Model F05 Storage Expansion Frame houses one TS7720 Encryption Capable
3956-CS9 Cache Controller Drawer. Each controller drawer can have 0 - 15 TS7720
Encryption Capable 3956-XS9 Cache Expansion Drawers. The first expansion frame must be
fully configured before you can add a second storage expansion frame.
The base frame, first expansion frame, and second expansion frame are not required to be of
the same model and type. Only when the base frame is of the CS9 type is it required to be
fully populated when you add an expansion frame. When you add a second expansion frame,
the first expansion frame must be fully populated if it contains CS9 technology.
Each I/O expansion Drawer offers six extra PCI Express adapter slots:
One or two 4 Gb FICON adapters per I/O Expansion Drawer, for a total of two or four
FICON adapters per cluster. Adapters can work at 1, 2, or 4 Gbps. FICON cards must be of
the same type within one cluster.
Figure 7-14 TS7720 Encryption Capable Cache Controller, 3956-CS9 (front and rear views)
The TS7720 Cache Controller provides RAID 6 protection for virtual volume disk storage,
enabling fast retrieval of data from cache.
Figure 7-15 TS7720 Encryption Capable Cache Drawer (front and rear views)
The TS7720 Cache Drawer expands the capacity of the TS7720 Cache Controller by
providing extra RAID 6-protected disk storage. Each TS7720 Cache Drawer offers the
following features:
Two 8 Gb Fibre Channel processor cards
Two power supplies with embedded enclosure cooling units
Eleven DDMs, each with a storage capacity of 3 TB, for a usable capacity of 24 TB
per drawer
The TS7720 disk-only can be used to write tape data that does not need to be copied to
physical tape, which enables access to the data from the Tape Volume Cache (TVC) until the
data expires.
The TS7720T enables a TS7720 to act like a TS7740 to form a virtual tape subsystem to
write to physical tape. Full disk and tape encryption are supported. It contains the same
components as the TS7720 disk-only. In a TS7720 disk-only configuration, the Fibre Channel
ports are used to communicate with the attached cache, whereas in a TS7720T configuration,
two of the Fibre Channel ports are used to communicate with the attached tape drives.
The total usable capacity of a TS7740 with one 3956-CC9 and two 3956-CX9s is
approximately 28 TB before compression.
Figure 7-17 shows the front and rear views of the TS7740 Encryption Capable 3956-CC9
Cache Controller Drawer.
Figure 7-17 TS7740 Encryption Capable Cache Controller Drawer (front and rear views)
The TS7740 Encryption Capable Cache Controller Drawer provides RAID 6-protected virtual
volume disk storage. This storage temporarily holds data from the host before writing it to
physical tape. When the data is in the cache, it is available for fast retrieval from the disk.
Figure 7-19 shows the front view and the rear view of the TS7740 Encryption Capable
3956-CX9 Cache Expansion Drawer.
Figure 7-19 TS7740 Encryption Capable Cache Drawer (front and rear views)
The TS7740 Encryption Capable Cache Expansion Drawer expands the capacity of the
TS7740 Cache Controller Drawer by providing extra RAID 6 disk storage. Each TS7740
Cache Expansion Drawer offers the following features:
Two Environmental Service Modules (ESMs)
Two power supplies with embedded enclosure cooling units
22 DDMs, each with 600 GB of storage capacity, for a total usable capacity of 9.58 TB per
drawer
Attachment to the TS7740 Encryption Capable 3956-CC9 Cache Controller Drawer
Tape libraries
A TS7740, TS7720T, or a TS7760T attached to a TS3500 or TS4500 tape library interfaces
directly with tape drives in the library.
When attached to a TS3500 or TS4500 tape library, the TS7700 can attach only to 3592 Tape
Drives. Up to 16 3592 Tape Drives can be attached.
Communication, control, and data signals travel along Fibre Channel connections between
the TS7700 and tape drives contained in the TS3500 or TS4500 tape library. A pair of Fibre
Channel switches routes the data to and from the correct tape drive.
Note: TS1140 EH7 tape drives and TS1150 EH8 tape drives are used with the TS4500
tape library. All other tape drives are used with the TS3500 tape library.
Tape drives
The 3592 Tape Drives supported for use with the TS7740, TS7720T, and TS7760T include:
TS1150 Tape Drive
TS1140 Tape Drive
TS1130 Tape Drives
TS1120 Tape Drives (in native mode and emulating 3592 J1A Tape Drives)
3592 J1A Tape Drives
For more information, see “Tape drives and media support (TS7740,TS7720T, and
TS7760T)” on page 141.
All of these devices are connected to a dedicated, private local area network (LAN) that is
owned by TSSC. Remote data monitoring of each one of these subsystems is provided for
early detection of unusual conditions. The TSSC sends this summary information to IBM if
something unusual is detected and the Call Home function has been enabled.
Note: For Call Home and remote support since TS7700 R3.2, an internet connection is
necessary.
For IBM TS7700 R4.1, the following features are available for installation:
FC 2704, TS3000 System Console expansion 26-port Ethernet switch/rackmount
FC 2715, Console attachment
FC 2725, Rackmount TS3000 System Console
FC 2748, Optical drive
For more information, see Appendix A, “Feature codes and RPQ” on page 805.
7.1.7 Cables
This section describes the cable feature codes for attachment to the TS7700, extra cables,
fabric components, and cabling solutions.
A TS7700 Server with the FICON Attachment features (FC 3441, FC 3442, FC 3443, FC
3438, FC 3439, FC 3402 or FC 3403) can attach to FICON channels of IBM Z components by
using FICON cable features ordered on the TS7700 Server.
Requirement: 8-Gb FICON adapters require FC 3462 (16-GB memory upgrade) and
TS7700 Licensed Internal Code R3.1 or later.
See the IBM Virtualization Engine TS7700 Introduction and Planning Guide, GA32-0568, for
Fibre Channel cable planning information.
If Grid Enablement (FC4015) is ordered, Ethernet cables are required for the copper/optical 1
Gbps and optical LW adapters to attach to the communication grid.
Note: All host data transfers through the TS7740 Cluster are considered for the data
transfer limit regardless of which TS7740 Cluster initiated or received the data transfer.
Consideration: The feature must be installed on all clusters in the grid before the
function becomes enabled.
Remember: The number of logical volumes that are supported in a grid is set by the
cluster with the smallest number of FC 5270 increments installed.
When joining a cluster to an existing grid, the joining cluster must meet or exceed the
currently supported number of logical volumes of the existing grid.
When merging one or more clusters into an existing grid, all clusters in the ending grid
configuration must contain enough FC 5270 increments to accommodate the sum of all
post-merged volumes.
Disk encryption.
You can encrypt the DDMs within a TS7700 disk storage system.
TS7700 Storage Expansion frame.
You can add up to two cache expansion frames to a fully configured TS7760 using FC
9323 (Expansion frame attachment) and applying FC 7334 (TS7700 Encryption-capable
expansion frame) to a 3952 F06 Tape Frame.
Note: The adapter installation (FC 5241, Dual port FC HBA) is non-concurrent.
– Copper Ethernet
You can add a 1 Gbps copper Ethernet adapter for grid communication between
TS7700 clusters. On a 3957-V07, 3957-VEB, or 3957-VEC, use FC 1036 (1 Gbps
grid dual port copper connection) to achieve this upgrade.
Clarification: On a TS7700, you can have two 1 Gbps copper Ethernet adapters or
two 1 Gbps SW fiber Ethernet adapters or two 10 Gbps LW fiber Ethernet adapters
(3957-V07, VEB, and VEC only) installed. Intermixing different types of Ethernet
adapters within one cluster is not supported.
For the data storage values in TB versus TiB, see 1.6, “Data storage values” on page 12. The
TS7760 Base Frame minimum cache configuration is one CSA Cache Controller containing
12 DDMs, each with a storage capacity of 4 TB (3.63 TiB) or 8 TB (7.27 TiB). Up to nine
TS7760 Cache Drawers can optionally be attached in a TS7760 Base Frame.
The TS7760 Storage Expansion Frame consists of one TS7760 Cache Controller
(3956-CSA) containing 12 DDMs, each of which has a storage capacity of 4 TB or 8 TB. Up
to 15 TS7760 Cache Drawers (3956-XSA), each containing 12 DDMs with a storage capacity
of 4 TB, or up to 14 TS7760 Cache Drawers (3956-XSA), each containing 12 DDMs with a
storage capacity of 8 TB, can optionally be attached.
The following configurations use 4 TB capacity drives. The columns show the cache units (a)
in the new TS7760 Storage Expansion Frame, followed by the total cache units (a) and
available capacity with the first Storage Expansion Frame (including the TS7760 Base
Frame), and the total cache units (a) and available capacity with the second Storage
Expansion Frame (including the TS7760 Base Frame):
4 | 14 | 439.35 TB (399.59 TiB) | 30 | 941.88 TB (856.64 TiB)
5 | 15 | 470.78 TB (428.17 TiB) | 31 | 973.31 TB (885.22 TiB)
6 | 16 | 502.21 TB (456.76 TiB) | 32 | 1004.74 TB (913.80 TiB)
7 | 17 | 533.64 TB (485.34 TiB) | 33 | 1036.17 TB (942.39 TiB)
8 | 18 | 565.06 TB (513.92 TiB) | 34 | 1067.60 TB (970.97 TiB)
9 | 19 | 596.49 TB (542.51 TiB) | 35 | 1099.02 TB (999.56 TiB)
10 | 20 | 627.92 TB (571.09 TiB) | 36 | 1130.45 TB (1028.14 TiB)
11 | 21 | 659.35 TB (599.67 TiB) | 37 | 1161.88 TB (1056.72 TiB)
12 | 22 | 690.78 TB (628.26 TiB) | 38 | 1193.31 TB (1085.31 TiB)
13 | 23 | 722.21 TB (656.84 TiB) | 39 | 1224.74 TB (1113.89 TiB)
14 | 24 | 753.63 TB (685.43 TiB) | 40 | 1256.17 TB (1142.48 TiB)
15 | 25 | 785.06 TB (714.01 TiB) | 41 | 1287.59 TB (1171.06 TiB)
16 | 26 | 816.49 TB (742.59 TiB) | 42 | 1319.02 TB (1199.64 TiB)
a. The term Total cache units refers to the combination of cache controllers and cache drawers.
Table 7-3 TS7760 Storage Expansion Frame configurations using 8 TB capacity drives
The columns show the cache configuration in a new TS7760 Storage Expansion Frame (one
cache controller (3956-CSA) plus optional cache drawers (3956-XSA)), followed by the total
cache units (a) and available capacity with the first Storage Expansion Frame (including the
TS7760 Base Frame), and the total cache units (a) and available capacity with the second
Storage Expansion Frame (including the TS7760 Base Frame):
4 | 14 | 865.19 TB (786.89 TiB) | 29 | 1792.52 TB (1630.28 TiB)
5 | 15 | 927.03 TB (843.13 TiB) | 30 | 1854.36 TB (1686.53 TiB)
6 | 16 | 988.87 TB (899.38 TiB) | 31 | 1916.20 TB (1742.77 TiB)
7 | 17 | 1050.72 TB (955.62 TiB) | 32 | 1978.04 TB (1799.02 TiB)
8 | 18 | 1112.56 TB (1011.86 TiB) | 33 | 2039.88 TB (1855.26 TiB)
9 | 19 | 1174.40 TB (1068.11 TiB) | 34 | 2101.72 TB (1911.50 TiB)
10 | 20 | 1236.24 TB (1124.35 TiB) | 35 | 2163.56 TB (1967.75 TiB)
11 | 21 | 1298.08 TB (1180.60 TiB) | 36 | 2225.40 TB (2023.99 TiB)
12 | 22 | 1359.92 TB (1236.84 TiB) | 37 | 2287.24 TB (2080.24 TiB)
13 | 23 | 1421.76 TB (1293.08 TiB) | 38 | 2349.08 TB (2136.48 TiB)
14 | 24 | 1483.60 TB (1349.33 TiB) | 39 | 2410.93 TB (2192.72 TiB)
15 | 25 | 1545.44 TB (1405.57 TiB) | 40 | 2472.77 TB (2248.97 TiB)
a. The term Total cache units refers to the combination of cache controllers and cache drawers.
For the data storage values in TB versus TiB, see 1.6, “Data storage values” on page 12.
An existing TS7720 frame operating with a 3956-CS7/CS8 controller drawer can be expanded
by adding another expansion frame that contains CSA/XSA cache drawers. Empty CS7/CS8
slots remain empty.
An existing TS7720 frame operating with a 3956-CS9 controller drawer can be expanded by
adding XS9 expansion drawers, or by adding CSA-based expansion frames after the existing
CS9 frames are completely populated.
Note: CS9/XS9 MES will be withdrawn in the future, so consult with your IBM service
support representative for the upgrade options.
Subsets of total cache and peak data throughput capacity are available through incremental
features FC 5267, 1 TB cache enablement, and FC 5268, 100 MBps increment. These
features enable a wide range of factory-installed configurations, and enable you to enhance
and update an existing system.
They can help you meet specific data storage requirements by increasing cache and peak
data throughput capability to the limits of your installed hardware. Increments of cache and
peak data throughput can be ordered and installed concurrently on an existing system
through the TS7740 MI.
The capacity of the system is limited to the number of installed 1 TB increments, but the data
that is stored is evenly distributed among all physically installed disk cache. Therefore, larger
drawer configurations provide improved cache performance even when usable capacity is
limited by the 1 TB installed increments. Extra cache can be installed up to the maximum
capacity of the installed hardware.
Table 7-4 on page 275 displays the maximum physical capacity of the TS7740 Cache
configurations and the instances of FC 5267, 1 TB cache enablement, required to achieve
each maximum capacity. Install the cache increments by using the TS7740 MI.
Table 7-4 shows the maximum physical capacity of the TS7740 Cache configurations by
using the 3956-CC9 cache controller.
Table 7-4 Supported TS7740 Cache configurations that use the 3956-CC9 cache controller
Configuration | Physical capacity | Maximum usable capacity | Maximum quantity of FC 5267
Consideration: Drive model changes can be made only in an upward direction (from an
older to a newer model). Fallback to the older models is not supported.
Note: Throughout this section, the term TS7700 refers to either the TS7740, the TS7720
Tape Attach, or the TS7760 Tape Attach.
– JA and JJ media can be ejected by using the TS7700 Management Interface after their
active data is reclaimed onto newer media.
Note: JA and JJ media should not be inserted if the volumes do not exist in the TS7700
database.
If JB media contains data that is written in E05 format, it is marked full and is supported as
READ-ONLY data. After the data is reclaimed or written in E06 or E07 format, it is
supported for read/write operations. The IBM Encryption Key Manager is not supported for
use with TS1140 Tape Drives. If encryption is used, either the IBM Security Key Lifecycle
Manager (ISKLM) or Security Key Lifecycle Manager (SKLM) must be used.
3592 EU6 Tape Drives cannot be converted to TS1140 Tape Drives.
Note: JA, JJ, and JB media should not be inserted if the volumes do not exist in the
TS7700 database.
The IBM Encryption Key Manager is not supported for use with a TS1150 Tape Drive. If
encryption is used, either the IBM Security Key Lifecycle Manager (ISKLM) or the Security
Key Lifecycle Manager (SKLM) must be used.
TS1140 Tape Drives cannot be converted to TS1150 Tape Drives.
Another possible use case is to use the existing TS7700T to complete the following steps:
1. Change the Storage Class definitions to point to the resident-only partition.
2. Run a PARTRFSH command to move the logical volumes from the tape-attached
partition to the resident-only partition.
3. Recall all of the migrated data from the legacy tapes into the cache-resident partition,
and then have the MES performed to install the TS1150 tape drives.
4. After the MES, change the Storage Class again, run a PARTRFSH, and push the
logical volumes back to the back-end tapes.
After the first TS1150 is installed and configured, the TS7700 detects the new drive
generation during the online process and acknowledges the cartridge types. Make sure that
all of the logical volumes on sunset media, such as JJ, JA, or JB, are migrated to JC or JK
media before the TS1150 installation. If the TS7700 detects logical volumes on sunset media,
the online process fails. The TS7700 comes online only if no logical volumes exist on sunset
media. The sunset media can be ejected from the TS7700 MI after the TS7700 is online.
When a TS7700 has all TS1150 tape drives, the following 3592 media types can be used as
scratch media for read/write:
JK - Advanced Type K Economy (ATKE) (900 GB)
JC - Advanced Type C Data (ATCD) (7000 GB)
JL - Advanced Type L Economy (ATLE) (2000 GB)
JD - Advanced Type D Data (ATDD) (10000 GB)
Empty sunset media, such as JJ, JA, or JB, are marked as sunset read-only after TS7700
comes online. This media cannot be used as scratch tapes.
For a storage pool that is not a copy export pool, the E08 recording format is used from the
beginning when writing tapes. If the storage pool is a copy export pool, the recording format
must be selected through the TS7700 MI.
Support for limited heterogeneous tape drives seamlessly helps customers move from older
media and drives to JK, JC, JL, and JD media, and TS1150 tape drives. This option enables
you to add TS1150 tape drives to the existing TS7700 so that all new workloads can be
directed to the TS1150 tape drives, and to leave at least one of the legacy tape drive
generations (3592-J1A, TS1120, TS1130, or TS1140) to handle legacy media.
Note: Only one generation of legacy tape drives can be together with the newly installed
TS1150.
During the TS7700 online processing, media types JJ, JA, and JB are marked as sunset
read-only. Those volumes are mounted only for recall or idle reclamation according to the
reclaim percentage pool settings. Make sure at least 15 scratch JC, JK, JD, or JL media are
inserted to run reclamation of sunset media. After the reclaim process moves the data to
TS1150 supported media, the operator can eject sunset media by using the TS7700 MI.
In a heterogeneous drive configuration, legacy tape drives are defined as read-only drives.
They are used for reading logical volumes from sunset media, and are not used for writing
new data. However, there are two exceptions when read-only drives are used for writing data:
One is Secure Data Erase (SDE) for sunset media. If SDE is enabled, previous
generation 3592 tape drives write a repeating pattern to the legacy media to erase data.
The other is a Copy Export operation. If legacy media exist in a Copy Export pool,
previous generation 3592 tape drives write a database backup to the legacy media.
Clarification: This section provides high-level information about the subject. Do not use
this information as a step-by-step guide, but work with your IBM SSR to prepare for the
update.
Remember: In the previous scenario, all cartridges in the filling state are closed if the new
drives do not support writing in the original tape format. Otherwise, the cartridges continue
to be written in the same format to the end. Scratch tapes that are in use after the change
are automatically reinitialized to the new tape format.
You can apply the same procedure when changing the tape emulation mode in the TS7740 or
TS7700T from 3592-J1A emulation to TS1120-E05 native mode. All steps apply except the
steps that relate to physically changing drives and changing the drive emulation mode within
the TS3500. Drive emulation is changed in the TS3500 web interface (see Figure 9-93 on
page 531 for a reference) and by a specific command that the IBM SSR runs in the TS7740 or
TS7700T. The media format is handled as described in the previous scenario.
Migrating TS7740 or TS7700T data from sunset media type after upgrading
heterogeneous drive configuration
This procedure can be helpful when upgrading your tape drives to the TS1150 3592-E08 tape
drives, or when replacing the existing media cartridges with a newer type to increase the
storage capacity of your library. The E08 drives cannot read JA, JB, or JJ tapes, so you must
have JC, JK, JL, or JD media for the TS7740 or TS7700T to support new logical volumes that
are written to TS1150 drives. In addition, you must have at least two drives of a sunset
generation that are available to read logical volumes from JA, JB, or JJ tapes.
In this scenario, coming from a previous 3592 tape drive model to the E08, all JA, JB, or JJ
media are sunset, which means that after reclaiming the active logical volumes still contained
in it, the JA, JB, or JJ media can be ejected from the library. In this case, you must have a
working pool of stacked volumes of the supported media, such as JC, JK, JD, or JL. Your
data, formerly in a JA, JB, or JJ media, is forcibly migrated into the supported media.
There are two alternatives to introduce the new media. You can either use one physical
volume pool or two physical volume pools. In the second scenario, complete the following
steps:
1. Create a range of physical volumes in the TS7740 or TS7700T for the new media, as
shown in Figure 9-59 on page 427.
2. Create a Cartridge Assignment Policy (CAP) for the new range and assign it to the correct
TS7740 or TS7700T logical partition (LPAR), as described in “Defining Cartridge
Assignment Policies” on page 532.
3. Insert the new cartridges in the TS3500/TS4500 tape library, as described in “Inserting
TS7700T physical volumes” on page 533.
These settings cause the TS7740 or TS7700T to start using the new media for stacking newly
created logical volumes. Existing physical volumes are reclaimed onto the new JD media and
become empty.
You might prefer to keep your pool definitions unchanged throughout the media change
process. In this case, you can just run the previous steps 1-3. No further change is necessary
if you are migrating from JJ/JA/JB cartridges to JD cartridges. If you are migrating from JC/JK
to JD/JL cartridges, you should set the new media as “First media” in the Pool Properties
table. This way, cartridges of the previous media type are not available for selection in the
common scratch pool.
You can keep the previous media type as the secondary media type as a precaution to not
run out of scratch media. For more information, see “Defining physical volume pools in the
TS7700T” on page 540. After the old-type cartridges are emptied, they can be ejected from
the tape library.
Clarification: You might use the new Host Console Request Resident on Recall for
Sunsetting RRCLSUN (ReCaLl SUNset) to expedite the replacement of the sunset media
with newer media. In this case, ensure that the common scratch pool has the new media
type available, and that the storage pools are set to borrow from the common scratch pool.
Otherwise, the storage pools run out of scratch.
This function invalidates the logical volume on the sunset media just after recall, regardless
of whether the logical volume is updated. As a result, any recalled volume is premigrated to
newer media. The library request command is shown:
LI REQ, lib_name,RRCLSUN ENABLE/DISABLE/STATUS
Where:
ENABLE Activates the force residency on recall function.
DISABLE Deactivates the force residency on recall function.
STATUS Displays the current setting.
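For example, to enable the function for a library named LIBA (a hypothetical name used only
for illustration), the command might be entered as follows:
LI REQ,LIBA,RRCLSUN,ENABLE
Issuing the same request with the STATUS keyword first is a safe way to confirm the current
setting before you change it.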
If you are changing existing drives to new drive models that use the same media type, use the
Library Request (LI REQ) command to accelerate the media type conversion. Use this
process to reclaim capacity from your existing media. In this scenario, you are not changing
the existing cartridges already in use. There are no changes that are needed regarding the
existing physical volume pools.
To accelerate the media type conversion, modify the pool property to set a high value to
Sunset Media Reclaim Threshold Percentage by using TS7700 MI. Whenever the active data
percentage of sunset media goes below the threshold, the sunset media is reclaimed and
active data is migrated to the newer media.
If you want to limit the resources that are used for this data migration, a Host Console
Request command that was introduced in Release 3.3 is available to influence the
performance of the sunset media replacement.
Clarification: Use the Host Console Request Reclaim Maximum Tasks Limit For Sunset
Media (RCLMSMAX) so that the TS7700 runs fewer reclaims. You change the format of the
sunset media to the newest one, and use the service resource for other activities in the
cluster.
This function provides a method to limit the maximum number of concurrent reclamation
tasks that run against sunset media. The maximum is the number of installed sunset drives
minus 1, which you can set by specifying “0” (the default value). Here is the library
request command:
LI REQ, <lib_name>, SETTING, RECLAIM, RCLMSMAX, <number_of_drives>
The values that can be set for these new keywords are the same as the existing keywords
PDRVLOW and PDRVCRIT, except that the existing keywords are used for non-sunset drives and
the new keywords are for sunset drives.
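For example, assuming a distributed library named DISTLIB1 (a hypothetical name) and a
limit of two concurrent sunset reclamation tasks, the setting might be entered as follows:
LI REQ,DISTLIB1,SETTING,RECLAIM,RCLMSMAX,2
Specifying 0 again returns the setting to its default behavior.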
You might want to replace old hardware (V06, V07, or VEB) with the new hardware (VEC) to
have significant performance improvement and increased disk cache capacity over the old
ones. R4.0 PGA1 and later supports frame replacement of a cluster at R2.1 and later code
level.
To replace the old hardware, all the logical volumes need to be premigrated to physical
volumes, transferred to the new frame, and recalled from the physical volumes. There are
several considerations before performing frame replacement:
If the frame you want to replace is TS7720T (VEB), all the private logical volumes in CP0
need to be moved to CPx before frame replacement.
If the frame you want to replace is TS7720T (VEB), all the resident logical volumes with
delayed premigration setting need to be premigrated before frame replacement.
All the physical volumes that you want to migrate to the new frame must be able to be read by
the new tape drives. See 4.1.2, “TS7700 specific limitations” on page 144.
Existing V06 and VEA systems do not support Release 3.1 or higher levels of Licensed
Internal Code.
When you are updating code on a cluster in a grid configuration, plan the upgrade to minimize
the time that the grid operates with clusters at different code levels. The time in service mode
is also important.
Before you start a code upgrade, all devices in this cluster must be varied offline. A cluster in
a grid environment must be put into service mode and then varied offline for the code update.
You might consider making more devices within other clusters in the grid available because
you are losing devices for the code upgrade.
Consideration: Within the grid, some new functions or features are not usable until all
clusters within the grid are updated to the same Licensed Internal Code (LIC) level and
feature codes.
The MI in the cluster that is being updated is not accessible during installation. You can use
a web browser to access the remaining clusters, if necessary.
Apply the required software support before you perform the Licensed Internal Code upgrade
and have the IBM service representative apply any required maintenance associated with the
LIC level you are installing.
Important: Ensure that you check the D/T3957 Preventive Service Planning (PSP) bucket
for any recommended maintenance before performing the LIC upgrade.
Migrations to a TS7700 multi-cluster grid configuration require the use of the Internet Protocol
network. In a two-cluster grid, the grid link connections can be direct-connected (in a
point-to-point mode) to clusters that are located within the supported distance for the
adapters present in the configuration.
Alternatively, two 10 Gbps LW fiber Ethernet links on a 3957-V07/VEB server, or four 10 Gbps
LW fiber Ethernet links on a 3957-VEC server, can be provided. Be sure to connect each one
through an independent WAN interconnection to be protected from a single point of failure
that disrupts service to both WAN paths from a node. See 4.1.3, “TCP/IP configuration
considerations” on page 146 for more information.
The merging cluster can be a stand-alone cluster or it can be a cluster in an existing grid.
Similarly, the existing cluster can be a stand-alone cluster or it can be a cluster in an existing
grid.
Note: An RPQ is required before you can implement a seven or eight cluster configuration.
A stand-alone cluster can join or merge with an existing cluster or grid over the Gigabit
LAN/WAN grid links. Join is used when the new cluster is empty (no logical volumes were
previously inserted); the joining cluster can be at an equal or higher level of code than the
existing cluster. Merge is used when the new cluster contains client data; the merging cluster
must be at the exact same level of code as the existing cluster.
Preparation
When performing a join, the actual data does not get copied from one cluster to another. This
process instead creates only placeholders for all of the logical volume data in the final grid.
When joining to an existing grid, the process is initiated to a single cluster in the grid and the
information is populated to all members of the grid.
TS7700 constructs, such as Management Class (MC), Data Class (DC), Storage Class (SC),
and Storage Group (SG), are copied over from the existing cluster or grid to the joining
cluster.
Licensed Internal Code supported levels and feature code for join
Release 4.1 supports the ability to have V07/VEB/VEC clusters (new from manufacturing or
emptied through a manufacturing clean-up process) join an existing grid with a restricted
mixture of Release 2.1, Release 3.x, and Release 4.1 clusters. Up to three different code
levels can exist across the existing clusters and the joining system during the MES, where
R2.1 can be the lowest of the three levels.
The joining cluster must be at an equal or later code level than the existing clusters. One or
more Release 4.1, Release 3.x, or Release 2.1 clusters can exist in the grid if the total of all
levels, including the joining cluster, does not exceed three unique levels.
When you join one cluster to a cluster in an existing grid, all clusters in the existing grid are
automatically joined. Before you add an empty cluster to an existing cluster or grid, ensure
that you have addressed the following restrictions for the join process:
The joining cluster must be empty (contain no data, no logical volumes, and no
constructs).
If the existing cluster to be joined to is a member of a grid, it must be at the most current
code level of any member in the grid.
The joining cluster must be at an equal or later code level than the existing clusters.
The joining cluster and existing cluster must have FC 4015 installed.
The joining cluster must support at least the number of logical volumes that are supported
by the grid by using FC 5270.
The joining cluster must contain FC 5271 if the existing cluster to be joined has this feature
code installed.
If the joining cluster has FC 1035 installed, the client’s infrastructure must support 10 Gb.
Join steps
Complete the following steps to join the cluster:
1. Arrange for these join cluster tasks to be performed by the IBM SSR:
a. Verify the feature code.
b. Establish the cluster index number on the joining cluster.
c. Configure the grid IP address on both clusters and test.
d. Configure and test Autonomic Ownership Takeover Manager (AOTM) when needed.
See Chapter 2, “Architecture, components, and functional characteristics” on page 15
for more information.
2. Change HCD channel definitions.
Define the new channels and the device units’ addresses in HCD.
Consideration: If the new source control data set (SCDS) is activated before the new
library is ready, the host cannot communicate with the new library yet. Expect message
CBR3006I to be generated:
CBR3006I Library <library-name> with Library ID <library-ID> unknown in I/O
configuration.
5. Vary devices online to all connected hosts. After a new cluster is joined to a cluster in an
existing grid, all clusters in the existing grid are automatically joined. Now, you are ready to
validate the grid.
6. Modify Copy Policies and Retain Copy mode in the MC definitions according to your
needs. Check all constructs on the MI of both clusters and ensure that they are set
properly for the grid configuration. See 2.3.25, “Copy Consistency Point: Copy policy
modes in a multi-cluster grid” on page 85 for more information.
7. Review your family definitions, and decide if the cluster needs to be included in one of the
families. In specific situations, you might want to introduce a new family, for example if a
new site is populated.
8. Review your SDAC definition, and include the new LIBPORT ID statements if necessary.
9. Review the cluster settings with LI REQ SETTING for the new distributed library. Enable
the alerts according to the TS7700 model, and review the COPYFSC and RECLPG0 settings
(especially if the new cluster is used for DR purposes). A sample query is shown after this
list.
10.For a disk-only model, check the REMOVE and RMVTHR settings.
11.For a tape-attached model, check the same settings, but also review the PMTHLVL setting
and set it to the amount of installed FC 5274, for example, 6 x FC5274 = 6000. For more
information about these settings, see Chapter 11, “Performance and monitoring” on
page 613.
12.If the joined cluster is a tape attach model, also define the tape partitions or resize CP1,
and revisit the Storage Classes. Also review the Storage Groups and the inhibit reclaim
schedule. If you use multiple physical pools, you might also want to influence the maximum
number of premigration drives that are used or the reclaim value. For more information, see
Chapter 11, “Performance and monitoring” on page 613.
13.Run test jobs to read and write to volumes from all of the clusters.
14.Test the write and read capabilities with all of the clusters, and validate the copy policies to
match the previously defined Copy Consistency Points and other constructs.
15.Consider creating a new MC for BVIR purposes to be able to run specific cluster reports.
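The sample query that is referenced in step 9 might look like the following command, where
DISTLIBA is a hypothetical name for the new distributed library. Issued with no further
keywords, it returns the current settings, including the alert thresholds and the COPYFSC and
RECLPG0 values, so that you can review them before making changes:
LI REQ,DISTLIBA,SETTING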
To produce a new copy, the data needs to be in cache. If your source cluster is a TS7740 or a
TS7700T, consider sorting the logical volumes into a copy order that maps to the physical
volume layout. This sorting improves the performance of the copy action. The COPYRFSH
processing enables you to specify a source cluster.
Also, prestaging the data to the cache helps to improve the performance. To simplify these
actions, IBM provides support in the IBM Tape Tools suite. For more information about
the performance, see Chapter 11, “Performance and monitoring” on page 613.
You can merge two existing TS7700 grids to create a larger grid. This solution enables you to
keep redundant copies of data in both grids during the entire merge process versus needing
to remove one or more clusters first and exposing them to a single copy loss condition.
When performing a merge, it is important to note that the actual data does not get copied
from one cluster to another. This process creates only placeholders for all of the logical
volumes in the final grid. When merging grids, the process is initiated to a single cluster in the
grid and the information is populated to all members of the grid.
Schedule this process during a low activity time on the existing cluster or grid. The grid or
cluster that is chosen to be inaccessible during the merge process has its indexes changed to
not conflict with the other grid or cluster. Check with your IBM SSR for planning information.
Ensure that no overlapping logical volume ranges or physical volume ranges exist. The merge
process detects that situation. You need to check for duplicate logical volumes and, on
TS7740, TS7720T, or TS7760T clusters, for duplicate physical volumes. Logical volume
ranges in a TS7700 must be unique. If duplicate volumes are identified during the merge
process, the process stops before the actual merge process begins.
Figure 7-24 shows the MC definition of the merging cluster and the two-cluster grid before the
merge.
(The figure depicts Cluster 2 merging with the two-cluster grid of Cluster 0 and Cluster 1,
with MC A defined with scratch mount candidate enabled, and MCs D and E.)
If categories and constructs are already defined on the merging cluster, verify that the total
number of each category and construct that will exist in the grid does not exceed 256. If
necessary, delete existing categories or constructs from the joining or merging clusters before
the grid upgrade occurs. Each TS7700 grid supports a maximum of 256 of each of the
following categories and constructs:
Scratch Categories
Management Classes
Data Classes
Storage Classes
Storage Groups
If the current combined number of logical volumes in the clusters to be joined exceeds the
maximum number of supported logical volumes, some logical volumes must be moved to
another library or deleted to reach the allowed grid capacity. To use the full number of
logical volumes that is supported on the grid, all clusters must have the same quantity of
FC 5270 increments installed. If the feature counts do not match and the final merged
volume count exceeds a particular cluster’s feature count, further inserts are not allowed until
the feature counts on those clusters are increased.
Merge steps
Complete these steps to merge all of the clusters or grids into a grid:
1. Arrange for these merge cluster tasks to be performed by the IBM SSR:
a. Verify the feature code.
b. Configure the grid IP address on all clusters and test.
c. Configure and test AOTM, when needed. For more information, see Chapter 2,
“Architecture, components, and functional characteristics” on page 15.
2. Change HCD channel definitions.
Define the new channels and the device units’ addresses in HCD. For more information
about HCD, see 4.3.1, “Host configuration definition” on page 171 and 6.4, “Hardware
configuration definition” on page 228.
3. Change SMS and TCDB.
With the new grid, you need one composite library and up to six distributed libraries. All
distributed libraries and cluster IDs must be unique. You must now define the new added
distributed library in SMS. Make sure to enter the correct Library-ID that was delivered by
the IBM SSR.
4. Activate the IODF and the SMS definitions and issue an OAM restart (if it was not done
after the SMS activation). If you are merging a cluster that was previously part of an
existing grid, you might need to delete the services control blocks of that cluster’s devices
using the DS QL,nnnn,DELETE command, where nnnn is the LIBID of the cluster.
5. Vary devices online to all connected hosts. After a cluster is merged to a cluster in an
existing grid, all clusters in the existing grid are automatically merged. Now, you are ready
to validate the grid.
6. Run test jobs to read and write to volumes from all of the clusters. Remember, you must
verify all LPARs in the sysplex.
7. Modify copy policies and Retain Copy mode in the MC definitions according to your needs.
Check all constructs on the MI of both clusters and ensure that they are set correctly for
the new configuration. For more information, see 2.3.25, “Copy Consistency Point: Copy
policy modes in a multi-cluster grid” on page 85.
9. If you want part or all of the existing logical volumes to be replicated to the new cluster, the
same methods can be used as after a join processing. See “Population of a new cluster
(COPYRFSH)” on page 289.
After the removal, FC 4017, Cluster Cleanup, can be run. FC 4017 is required if the removed
cluster is going to be reused. A Cluster Cleanup removes the previous data from cache and
returns the cluster to a usable state, similar to a new TS7700 from manufacturing, keeping the
existing feature codes in place. Both feature codes (FC 4016 and FC 4017) are one-time use
features.
You can delay the cluster cleanup for a short period while the TS7700 grid continues
operation to ensure that all volumes are present after the removal of the TS7700 cluster.
The client is responsible for determining how to handle the volumes that have only a Copy
Consistency Point at the cluster that is being removed (eject them, move them to the scratch
category, or activate an MC change on a mount/demount to get a copy on another cluster).
This process needs to be done before you start the removal process. A new Bulk Volume
Information Retrieval (BVIR) option Copy Audit or COPYRFSH is provided for generating a list
of inconsistent volumes to help you.
The removal of the cluster from the grid is concurrent with client operations on the remaining
clusters, but some operations are restricted during the removal process. During this time,
inserts, ejects, and exports are inhibited. Generally, run the removal of a cluster from the grid
during off-peak hours.
No data, on cache or tapes, on the removed cluster is available after the cluster is removed
with the completion of FC 4016. The cluster cannot normally be rejoined with the existing
data. However, there is a special service offering to rejoin a cluster with existing data, if this
particular operation is wanted. Contact your IBM sales representative for details.
No secure erase or low-level format is done on the tapes or the cache as part of FC 4016 or
FC 4017. If the client requires data secure erase of the TVC contents, it is a contracted
service for a fee. Consider delaying the cluster cleanup for a short time while the TS7700 grid
continues operation to ensure that all volumes are present after the removal of the TS7700
cluster.
After all of their data is copied to the primary data center TS7700 tape drives, the client can
remove the third cluster from the remote data center and clean up the data from it. This
TS7700 can now be relocated and the process can be repeated.
TS7700 reuse
A client has a multi-site grid configuration, and the client no longer requires a TS7700 at one
site. The client can remove this cluster (after all required data is copied, removed, or expired)
and use this resource in another role. Before the cluster can be used, it must be removed
from the grid domain and cleaned up by using FC4017.
A copy consistency check is run at the beginning of the process. Do not skip
consistency checks unless it is a disaster recovery (DR) unjoin or you can account for
why a volume is inconsistent. Failure to do this can result in data loss when the only
valid copy was present on the removed cluster.
After a cluster is removed, you might want to modify the host configuration to remove
the LIBPORT IDs associated with the removed cluster.
Chapter 8. Migration
This chapter explains aspects of migrating to a TS7700 environment from an IBM Virtual Tape
Server (VTS) or from other tape drive technologies. It presents various options that can be
tailored to your current environment.
Guidance is provided to help you achieve the migration scenario that best fits your needs. For
this reason, methods, tools, and software products that can help make the migration easier
are highlighted.
New models that were introduced with TS7700 Release 4.1 are also described in this section.
The most efficient approach depends on the source configuration, which can be an IBM
3494 B10/B20 VTS, or IBM or original equipment manufacturer (OEM) native tape drives, and
on the target configuration, which can be tape attached or disk only. Table 8-1 shows which
migration method is available in each case.
Table 8-1 Available migration methods based on migration source and target configurations
Table 8-2 Available combinations of source tape media and target tape drives for tape-based migration
Tip: The TMS can be Removable Media Management (DFSMSrmm) or other products
from other vendors.
Information is provided about the TS7700 family replacement procedures that are available
with the new hardware platform and the TS7700 R4.1 Licensed Internal Code (LIC) level.
With the availability of the new generation hardware, an upgrade path is provided for existing
TS7700 users to migrate to this new hardware.
This section covers upgrading tape drive models in an existing TS7740 or TS7700T to get
more capacity from your existing media, or to provide encryption support. It also details the
hardware upgrade procedure and the cartridge migration aspects.
Even though both migration source and target have physical tapes, migration from VTS with
3590 tape drives or native tape drives to a TS7740 or TS7700T always requires host-based
migration to copy the data into the TS7700. This requirement is because there is no data
compatibility between the physical tape media used by migration source and target solutions.
Migration to a TS7700D can be performed only by using host-based migration because the
TS7700D does not have any attached back-end tape drives. Therefore, data must be copied
into the TS7700D by using host programs.
For more information about the methods you can use for host-based migration, see 8.3,
“Methods to move data for host-based migration” on page 309.
Host-based migration is also available for migration from a VTS with 3592 tape drives to a
TS7740 or TS7700T. However, the TS7740 and TS7700T have attached back-end tape drives,
so tape-based migration might be a choice. See 8.1.2, “Tape-based migration” on page 302.
Work with your IBM service support representative (IBM SSR) for more details about IBM
Migration Services for Tape Systems. These services are available from an IBM migration
team and can assist you in the preparation phase of the migration. The migration team
performs the migration on the hardware.
When migrating data from a VTS to a new TS7740 or TS7700T installed in the same tape
library, the process is called data migrate without tape move. If a source VTS is attached to
one tape library and the new target TS7740 or TS7700T is attached to another tape library,
the process is called data migration with tape move. When a source VTS is migrated to an
existing TS7740 or TS7700T, or two VTSs are migrated to the same target TS7740 or
TS7720T, the process is called merge.
If data will be moved inside the same grid (after a join of a cluster, or a merge), COPYRFSH is
the preferred method. For more information, see “Population of a new cluster (COPYRFSH)”
on page 289.
If you want to move to a new data center or do a technical refresh, use this method to migrate
the data to a new cluster without using host-based migration. This method can be used only
when the data migration is done inside a grid. Within a grid, it is the fastest and a proven
method. The procedure consists of the following steps:
1. Join a new cluster to a grid:
– How to join a new cluster to a grid is described in 7.4, “Adding clusters to a grid” on
page 284.
2. Change the Copy Mode in the Management Class (MC) to allow copies to the new cluster:
– The MC can be configured from the MI, as described in 9.3.8, “The Constructs icon” on
page 454.
– Copy Mode is described in 4.2.2, “Defining grid copy mode control” on
page 161.
3. Generate a list of the logical volumes that need to be copied to the new cluster:
– You can list the status of logical volumes by using the following Bulk Volume Information
Retrieval (BVIR) command. You can process its output to create a list of the logical
volumes that need to be copied to the new cluster.
• VOLUME STATUS
– The following IBM Tape Tools are available. A combination of these tools can list the
logical volumes that need to be copied to the new cluster and easily generate the Copy
Refresh command list for them. This process is described in “How to generate Copy
Refresh commands list” on page 304.
• BVIRMES
• VESYNC
• BVIRVTS
• COPYRFSH
4. Run Copy Refresh for each logical volume in the list that was created in step 3 to produce
a new copy of the data in the new cluster:
– Copy Refresh can be run with the LI REQ COPYRFSH command (a sample is shown
after these steps).
5. If you want to remove the old cluster from the grid after the necessary logical volumes are
copied to the new cluster, check that no valid copy exists only on the old cluster:
– You can verify whether any logical volumes have copies only on the cluster to be
removed by using the following Bulk Volume Information Retrieval (BVIR) command (do
not specify COPYMODE so that volumes that exist only on the specified cluster are
reliably detected).
• COPY AUDIT INCLUDE <Old Cluster Lib ID>
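The following sketches illustrate steps 4 and 5. The library name DISTLIB1, the volume serial
A00001, and the library sequence ID CA010 are hypothetical values; substitute your own. The
Copy Refresh request is typically issued against the distributed library of the cluster that
requires the refreshed copy:
LI REQ,DISTLIB1,COPYRFSH,A00001
For step 5, the BVIR request volume might contain the following two request records, where
the second record names the library sequence ID of the cluster that you plan to remove:
VTS BULK VOLUME DATA REQUEST
COPY AUDIT INCLUDE CA010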
See 11.16, “IBM Tape Tools” on page 698 for a brief description of available tools.
See the white paper IBM TS7700 Series Bulk Volume Information Retrieval Function User's
Guide for more details about the aforementioned commands. It is available at:
https://www.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/WP101094
See the white paper IBM TS7700 Series z/OS Host Command Line Request User's Guide for
more details about the aforementioned LI REQ COPYRFSH command. It is available at:
https://www.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/WP101091
Figure 8-1 shows how to generate a Copy Refresh command list by using the IBM Tape
Tools. The BVIRMES tool takes the copy target TS7700 cluster ID as input, issues the BVIR
VTS BULK VOLUME DATA REQUEST with a VOLUME STATUS request, and produces a
BVIR Volume Status report (MESFILE). The COPYRFSH tool then takes as input the list of
logical volumes that the copy target cluster needs to copy, the mapping between logical and
physical volumes on the copy source cluster, and the list of logical volumes that exist in cache
on the copy source cluster, and it generates the Copy Refresh command list.
Figure 8-1 How to generate a Copy Refresh command list by using the IBM Tape Tools
The Copy Export function can export a copy of selected logical volumes that are stored on
physical tapes in a TS7740 or TS7700T. The exported physical tapes can be removed from
the TS7740 or TS7700T and taken offsite. Typically, the exported data is used for disaster
recovery. However, you can also move data into another TS7740 or TS7700T by using the
exported data.
To restore the exported data, the TS7740 or TS7700T must have physical tape drives that are
capable of reading the exported physical tape volumes. Therefore, keep in mind that when
the target TS7740 or TS7700T cluster has TS1150 tape drives only, the source TS7740 or
TS7700T must use JK or JC cartridges with TS1140 or TS1150 tape drives for the export.
You can restore the exported data into another TS7740 or TS7700T by using one of the
following methods:
Copy Export Recovery
– Can move exported data into a stand-alone, empty TS7740 or TS7700T.
You can find more details about Copy Export and Copy Export Merge in Chapter 12, “Copy
Export” on page 735.
The following list describes the main components in the GGM overview in Figure 8-2:
CSG Copy Source Grid.
CTG Copy Target Grid.
Proxy server A binary program that is installed in the CSG through vtd_exec, which
enables the CTG to communicate with the CSG.
The GGM tool should be considered if the following situations are true:
There are already six clusters installed in the grid.
The Join and Copy Refresh processing cannot be used (there are floor space
requirements, microcode restrictions, or other considerations).
CSG and CTG are maintained by different providers.
In this case, normally the TCDB and TMS are not changed during the copy process, but need
to be adjusted at cutover time, or before if cutover tests are made.
Also, it might be necessary to copy the same logical volume multiple times because the
lifecycle processing of “create - expire - delete - create” continues to run in the origin system.
In this case, consider using the lifecycle information (expiration and retention) as input
to avoid copying data with a very short lifetime.
To ensure that during tests of the new environment the original copied data to the new grid is
only read but not modified, you should put all clusters in the target grid in write protect mode
for the origin categories and use different logical volume ranges and categories for the
testing. While the cluster is in write protect, no GGM copies can be performed.
Using a CSG where all data resides in cache enables you to concentrate on copying the data
by lifecycle information or, if needed, by application, especially for multivolume data
sets.
If only a TS7740 or a TS7700T can be chosen, you also need to consider the back-end
resources and the available cache for the GGM copy processing. In this case, we strongly
advise you to prestage the data and to copy the data based on physical volume to avoid too
many back-end movements. Consider that this approach might not match the
TMS selection for retention or application purposes.
If the data is recalled by the GGM process itself, the recall process might take much
longer and affect your overall GGM performance. Prestaging the data helps you to improve
this performance. Consider changing your RECLPG0 value to allow recalled data to reside
with the original Storage Class settings in the cluster. Otherwise, recalled data might already
be migrated again before GGM can start the copy processing.
For more information, see the IBM TS7700 Series Grid To Grid Migration User's Guide. It
contains a detailed description of the GGM process, requirements, and user setup and
preparation, as well as examples and LI REQ commands used in GGM process.
http://www.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/PRS5328
In all other scenarios, migrating data into the TS7700 requires that the TS7700 and the
existing environment remain installed in parallel until the data has been migrated through the
host attached to them.
Examples of this type of configuration are native tape drives, VTS with 3590 Tape Drives, or
other vendor tape solutions. Although they can all be migrated to a TS7700, the process
requires host involvement to copy the data into the TS7700.
This section describes techniques for moving data into a TS7700. You can start using the
TS7700 by moving data into it. The best method depends on the application you want to
manage with the TS7700.
Hints about how to move data out of the TS7700 are provided in 8.4, “Moving data out of the
TS7700” on page 315. However, the TS7700 is a closed-storage method, so you must be
careful about selecting data to move into it. You do not want to store a large amount of data in
the TS7700 that must be moved back out.
You can select data based on data set name, by application, or by any other variable that you
can use in the automatic class selection (ACS) routines. You can also select data based on
type, such as System Management Facilities (SMF) data or DASD DUMP data.
If you use database data, such as logs or image copy, direct new allocations into the TS7700
by updating the following definitions:
DC ACS routines (if used)
MC ACS routines (if used)
SC ACS routines (required)
SG ACS routines (required)
For data other than DFSMShsm and DFSMSdss, if you are using SMS tape, update the ACS
routines to include the data that you want to move. You decide which data to filter and how
you write the ACS routines. You can also migrate based on the UNIT parameter in the JCL to
reflect the applicable unit for the TS7700.
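As an illustration only, the following minimal Storage Group ACS routine fragment sketches
how new allocations might be filtered into a TS7700 Storage Group. The FILTLIST patterns
and the Storage Group names SGTS7700 and SGNATIVE are assumptions for this sketch
and are not taken from this book; your filter criteria and construct names will differ:
PROC STORGRP
  /* Route selected tape allocations to a TS7700 Storage Group */
  FILTLIST TS7700DS INCLUDE(DB2.ARCHLOG*.**,PROD.IMAGCOPY.**)
  SELECT
    WHEN (&DSN = &TS7700DS)
      SET &STORGRP = 'SGTS7700'
    OTHERWISE
      SET &STORGRP = 'SGNATIVE'
  END
END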
Certain applications have knowledge of the VOLSER where the data is stored. There are
special considerations for these applications. If you change the VOLSER on which the data is
stored, the application has no way of knowing where the data is. For more information about
this topic, see “Implementing Outboard Policy Management for non-z/OS hosts” on page 825.
An easy method is to obtain information from the TMS database. Reports can give you details
about the data you have in the tape shop, which helps you select the input volumes.
If you are using DFSMSrmm, you can easily acquire data from a Removable Media
Management (RMM) EXTRACT file, which is normally created as part of the regular
maintenance. Then, using a REXX EXEC or ICETOOL JCL program, you extract the needed
information, such as data set name, VOLSER, and file sequence of the input volumes.
When using SMS tape, the first step is to update the ACS routines to direct all new data to the
TS7700. With this change, new data on tapes gets created in the TS7700 so that moving it
again later is not necessary.
If you move OAM-owned data, you can use the OAM recycle process, the OAM Storage
Management Component (OSMC), or the OAM MOVEVOL utility to move the data to the
TS7700. If you move DFSMShsm-owned data, use the RECYCLE command to move
incremental backup and migration volumes. Use a COPYDUMP job to move DFSMSdss data to
the TS7700. The utility to use depends on the data selected. In most cases, it is sequential
data that can be copied by using the IEBGENER utility or DITTO/ESA. If you have DFSORT,
ICEGENER and ICETOOL perform better.
You must use a specific utility when the input data is in a special format, for example,
DFSMSdss dump data. DFSMSdss uses blocks with up to a 256 KB block size, and only the
proper DFSMSdss utility, such as COPYDUMP, can copy data with that block size. Be careful
when copying multifile and multivolume chains.
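As a sketch only, a COPYDUMP job might look like the following JCL. The data set names, the
input volume serial, and the esoteric unit name VTS3490 for the TS7700 virtual drives are
assumptions for illustration:
//COPYDMP  JOB (ACCT),'COPY DSS DUMP',CLASS=A,MSGCLASS=X
//STEP1    EXEC PGM=ADRDSSU
//SYSPRINT DD SYSOUT=*
//TAPEIN   DD DSN=PROD.DSS.DUMP001,DISP=OLD,
//            UNIT=3590-1,VOL=SER=NAT001,LABEL=(1,SL)
//TAPEOUT  DD DSN=PROD.DSS.DUMP001.COPY,DISP=(NEW,CATLG),
//            UNIT=VTS3490,LABEL=(1,SL)
//SYSIN    DD *
  COPYDUMP INDDNAME(TAPEIN) OUTDDNAME(TAPEOUT)
/*
Because COPYDUMP copies the dump records as they are, the large DFSMSdss block size is
preserved on the output volume.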
In RMM, this step can be done with a CHANGEDATASET command that is issued with the special
authority to update O/C/EOV-recorded fields. For detailed information about this command,
see z/OS DFSMSrmm Managing and Using Removable Media, SC23-6873.
To avoid this time-consuming process, use a tape copy tool because it can make all the
necessary changes in the TMS.
Tape copy tools recatalog tapes during movement without the need for manual intervention.
Using this quick-method sequence, you can copy every kind of tape data, including
generation data groups (GDGs), without modifying the generation number.
In an RMM environment, you can use RMM CLIST variables, RMM REXX variables, or both,
as well as RMM commands, listing data from the input volumes and then using the RMM
REXX variables with the CD (CHANGEDATASET) command to update the output. Then, call
IDCAMS to update the integrated catalog facility (ICF) catalog. For more information, see
z/OS DFSMS Access Method Services Commands, SC23-6846.
When the operation completes and all errors are corrected, use the RMM DELETEVOLUME
command to release the input volumes. For more information about RMM commands and
REXX variables, see z/OS DFSMSrmm Managing and Using Removable Media, SC23-6873.
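The following TSO subcommands are a minimal sketch of that flow; the volume serial NAT001
and the search operands are illustrative only, so verify the operand details in z/OS DFSMSrmm
Managing and Using Removable Media, SC23-6873, before use:
RMM SEARCHVOLUME VOLUME(*) OWNER(*) LIMIT(*)
RMM DELETEVOLUME NAT001
The first command helps build the list of input volumes; the second releases an input volume
after its data is copied and the catalog updates are verified.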
If you are using a TMS other than RMM, see the appropriate product functions to obtain the
same results.
With DFSMShsm, you can change the ARCCMDxx tape device definitions to an esoteric
name with TS7700 virtual drives (in a BTLS environment) or change SMS ACS routines to
direct DFSMShsm data in the TS7700. The DFSMShsm RECYCLE command can help speed
the movement of the data.
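For example, after new DFSMShsm output is directed to the TS7700, recycling the existing
ML2 tapes might be started with a command like the following one, where the PERCENTVALID
threshold is only an illustrative value:
HSEND RECYCLE ML2 EXECUTE PERCENTVALID(25)
The same command with BACKUP instead of ML2 processes the backup tapes.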
A similar process can be used with IBM Tivoli Storage Manager, changing the device class
definitions for the selected data to put in the TS7700 and then starting the space reclamation
process.
If you are moving DB2 data into the TS7700, ensure that, when copying the data, the DB2
catalog is also updated with the new volume information. You can use the DB2 MERGECOPY
utility to speed up processing, using TS7700 virtual volumes as output.
In general, DB2 image copies and Archlog are not retained for a long time. After all new write
activity goes to the TS7740 or TS7720T, you can expect that this data is moved by the
everyday process.
Do not use the tool to copy tape data sets owned by Hierarchical Storage Manager
(DFSMShsm), IBM Tivoli Storage Manager, or similar products, where information of old
VOLSERs is kept within the product and not reflected after a copy is made. This challenge
typically applies to products where tapes are not cataloged in an ICF catalog, but kept in the
product’s own database.
The DFSMSrmm Tape Copy Tool cannot be used when you have a TMS other than
DFSMSrmm. You must choose another Tape Copy Tool from Table 8-3 on page 314.
Consider the following factors when you evaluate a tape copy product:
Interaction with your TMS
Degree of automation of the process
Speed and efficiency of the copy operation
Flexibility in using the product for other functions, such as duplicate tape creation
Ease of use
Ability to create a pull list for any manual tape mounts
Ability to handle multivolume data sets
Ability to handle volume size changes, whether from small to large, or large to small
Ability to review the list of data sets before submission
Audit trail of data sets already copied
Table 8-3 lists several common tape copy products. You can choose one of these products or
perhaps use your own utility for tape copy. You do not need any of these products, but a tape
copy product can make your job easier if you have many tapes to move into the TS7700.
Tape Copy Tool/DFSMSrmm (IBM): Contact your IBM SSR for more information about
this service offering. Do not confuse this with the
Tape Analysis Tools that are mentioned in 11.16.2,
“Tools download and installation” on page 700,
which can be downloaded from IBM for no extra fee.
In addition to using one of these products, consider using IBM Global Technology Services
(GTS) to assist you in planning and moving the data into the TS7700.
Static data is information that will be around for a long time. This data can be moved into the
TS7700 only with the quick method. You must decide how much of this data will be moved
into the TS7700. One way to decide this is to examine expiration dates. You can then set a
future time when all volumes, or a subset, are copied into the TS7700. There might be no
reason to copy volumes that are going to expire in two months. By enabling these volumes to
go to SCRATCH status, you can save yourself some work.
Dynamic data is of a temporary nature. Full volume backups and log tapes are one example.
These volumes typically have a short expiration period. You can move this type of data with
the phased method. There is no reason to copy these volumes if they are going to expire
soon.
With this method, the data is reprocessed by the host and copied to another medium. This
method is described in 8.3.1, “Phased method of moving data” on page 309. The only
difference is that you need to address the TS7700 as input and the non-TS7700 drives as
output.
You can find more details about Copy Export and Copy Export Merge in this book in
Chapter 12, “Copy Export” on page 735.
Using the DFSMShsm ABARS function, group the data you want to move outside the
TS7700. Then, start addressing other tape drives outside the TS7700, or use the Copy
Export function. In this way, you obtain an exportable copy of the data that can be put in an
offsite location.
You can identify the data set names in a single selection data set, or you can divide the
names among as many as five selection data sets. You can specify six types of data set lists
in a selection data set. The type that you specify determines which data sets are backed up
and how they are recovered.
An INCLUDE data set list is a list of data sets to be copied by aggregate backup to a tape
data file where they can be transported to the recovery site and recovered by aggregate
recovery. The list can contain fully qualified data set names or partially qualified names with
placeholders. DFSMShsm expands the list to fully qualified data set names.
Using a selection data set with the names of the data sets you want to export from the
TS7700, obtain a list of files on logical volumes that the ABARS function copies to
non-TS7700 drives.
You can also use the Copy Export function to move the ABARS tapes to a data recovery site
outside of the library.
Define the aggregate group and MC used for aggregate backup to DFSMS through ISMF
panels.
The aggregate group lists the selection data set names, instruction data set name, and extra
control information that is used by the aggregate backup to determine which data sets to
back up.
With the PROCESSONLY(USERTAPE) keyword, only tape data sets are processed. In this way, you
can be sure that only the input data from TS7700 logical volumes is used.
When you enter the ABACKUP command with the EXECUTE option, the following tape files are
created for later use as input for aggregate recovery:
Data file: Contains copies of the data sets that have been backed up.
Control file: Contains control information that is needed by aggregate recovery to verify or
recover the application’s data sets.
Instruction/activity log file: Contains the instruction data set, which is optional.
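A minimal invocation sketch follows, where PAYAGG is a hypothetical aggregate group name:
HSEND ABACKUP PAYAGG EXECUTE PROCESSONLY(USERTAPE)
The VERIFY option can be used first to check the data set selection without creating the
output files.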
The effects of recall takeaway can be a real disadvantage when writing Migration Level 2 data
onto native, high-capacity cartridges, because the space management task must set aside its
output tape to make it available to the recall task. Although the partially filled output tape
remains eligible for subsequent selection, the next time that space management runs, it is
possible to accumulate several partial tapes beyond DFSMShsm needs if recall takeaway
activity occurs frequently.
Excess partial tapes created by recall takeaway activity result in poor use of native cartridges.
In addition, because recall takeaway activity does not cause the set-aside tape to be marked
full, it is not automatically eligible for recycling, despite its poor utilization.
High-capacity cartridges are more likely to experience both frequent recall takeaway activity
and frequent piggy-back recall activity, in which recalls for multiple data sets on a single
tape are received while the tape is mounted. However, piggy-back recalls have a positive
effect by reducing the number of mounts that are required to run several recalls. You must
also consider that multiple recalls from the same tape must be performed serially by the same
recall task.
If those same data sets are on separate tapes, the recalls can potentially be performed in
parallel, given enough recall tasks. In addition, the persistence of the virtual tape in the Tape
Volume Cache (TVC) after it has been unmounted enables DFSMShsm to run ML2 recalls
from the disk cache without requiring that a physical tape be mounted.
Other reasons also exist for directing DFSMShsm data into a TS7700. The number of native
drives limits the number of DFSMShsm tasks that can run concurrently. With up to 496 virtual
drives in a stand-alone cluster configuration or 992 virtual drives in a two-cluster grid
configuration, you can dedicate a larger number of virtual drives to each DFSMShsm function
and achieve higher throughput during your limited backup and space management window.
Other reasons for using the TS7700 with DFSMShsm are the greatly reduced run times of
DFSMShsm operations that process the entire volume, such as AUDIT MEDIACONTROLS and
TAPECOPY.
DFSMShsm can benefit from the TS7700 tape drive’s high throughput and from its large TVC
size, which enables long periods of peak throughput.
DFSMShsm data is well-suited for the TS7700 because of the appropriate tailoring of those
parameters that can affect DFSMShsm performance. The subsequent sections describe this
tailoring in more detail.
For more information, see the z/OS DFSMShsm Storage Administration Guide, SC23-6871.
Table 8-4 lists the maximum data set sizes that are supported by DFSMShsm in z/OS
environments.
Important: A single DFSMShsm user data set can span up to 40 tapes (with z/OS V2R1,
this limit is now 254). This limit is for migration, backup, and recycle.
After DFSMShsm writes a user data set to tape, it checks the volume count for the
DFSMShsm tape data set. If the volume count is greater than 215, the DFSMShsm tape data
set is closed, and the currently mounted tape is marked full and is de-allocated.
The number 215 is used so that a data set spanning 40 tapes fits within the 255-volume
allocation limit. DFSMShsm selects another tape, and then starts a different DFSMShsm tape
data set. Data set spanning can be reduced by using the SETSYS TAPESPANSIZE command.
In z/OS V2.1 and higher, the limit is 254 volumes for HSM user data sets, so the maximum
user data set becomes 508,000 MiB:
800 MiB x 2.5 x 254 = 508000 MiB
Assume that you have a very large data set of 300 GiB. This data set does not fit on
40 volumes of 800 MiB each, but it can fit on 6000 MiB large virtual volumes, as shown in the
following example:
6000 MiB x 2.5 x 40 = 600000 MiB
However, in z/OS V2.1 and higher, this data set also fits on 800 MiB volumes. Any single user
data set larger than 3.81 TiB on z/OS 1.13, or 15.875 TiB on z/OS 2.1 and higher, is a
candidate for native 3592 tape drives. Even assuming a compression ratio of 2.5:1, such data
sets might not fit onto the supported number of volumes. In this case, consider using native
3592-E06 (TS1130) or 3592-E07 (TS1140) tape drives rather than the TS7700.
IDCAMS DCOLLECT BACKUPDATA can be used to determine the maximum size of backed-up data
sets in DFSMShsm inventory. MIGRATEDATA can be used to determine the maximum size of
migrated data sets in DFSMShsm inventory.
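The following JCL is a minimal sketch of such a DCOLLECT job. The data set names, space
allocation, and DCB attributes are placeholders that must be adapted to your environment:
//DCOLL    EXEC PGM=IDCAMS
//SYSPRINT DD SYSOUT=*
//MCDS     DD DSN=HSM.MCDS,DISP=SHR
//BCDS     DD DSN=HSM.BCDS,DISP=SHR
//DCOUT    DD DSN=HSM.DCOLLECT.OUTPUT,DISP=(NEW,CATLG),
//            UNIT=SYSDA,SPACE=(CYL,(5,5)),
//            DCB=(RECFM=VB,LRECL=1020,BLKSIZE=0)
//SYSIN    DD *
  DCOLLECT OUTFILE(DCOUT) MIGRATEDATA BACKUPDATA
/*
The DCOLLECT output records can then be processed with a report tool or a simple program to
find the largest migrated and backed-up data sets.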
Each instance of DFSMShsm can have a unique MIGUNIT specified. For instance, one
host can specify MIGUNIT(3590-1) and another MIGUNIT(TS7700). The same is true for
BUUNIT.
The DFSMShsm host that has 3590-1 specified as a migration or backup unit should
process only space management or automatic backup for the SGs where your large data
sets, such as z/FS, are. The other DFSMShsm hosts can then migrate and back up SGs
containing the smaller data sets to the TS7700.
To direct a command to a specific instance of DFSMShsm, you can use an MVS MODIFY
command with the started task name of the instance of DFSMShsm that you want to
process the command. For example, “F DFSMS2, BACKDS...” or “F DFSMS2, BACKVOL
SG(SG)...”.
The following commands affect which output device is used by a specific function:
SETSYS TAPEMIGRATION(ML2TAPE(TAPE(unittype)))
SETSYS RECYCLEOUTPUT(MIGRATION(unittype))
SETSYS BACKUP(TAPE(unittype))
SETSYS RECYCLEOUTPUT(BACKUP(unittype))
Each tape that is identified as being empty or partially filled must be marked full by using one
of the following DFSMShsm commands:
DELVOL volser MIGRATION(MARKFULL)
DELVOL volser BACKUP(MARKFULL)
As DFSMShsm migrates data and creates backup copies, it prefers to add to an existing
migration/backup volume. As the volume nears full, it handles spanning of data sets, as
described in “Tape spanning” on page 322. If a data set spans across DFSMShsm volumes, it
becomes a connected set in DFSMShsm terms.
However, a key point is that if the data set spans, DFSMShsm uses Force end-of-volume
(FEOV) processing to get the next volume mounted. Therefore, the system thinks that the
volume is part of a multivolume set regardless of whether DFSMShsm identifies it as a
connected set. Because of the end-of-volume (EOV) processing, the newly mounted
DFSMShsm volume uses the same DC and other SMS constructs as the previous volume.
With the DFSMShsm SETSYS PARTIALTAPE MARKFULL option, DFSMShsm marks the last
output tape full, even though it has not reached its physical capacity. By marking the last
volume full, the next time processing starts, DFSMShsm will use a new volume, starting a
new multivolume set and enabling the use of a new DC and other SMS constructs. If the
volume is not marked full, the existing multivolume set continues to grow and to use the
old constructs.
This is relevant to Outboard policy management and the implementation of different logical
volume sizes. If all volumes have been marked full, you can simply update your ACS routines
to assign a new DC and other SMS constructs. From then on, each new migration or backup
volume uses the new size.
Installations in which recalls from ML2 are rare, and installations in which very large data sets
are migrated that might result in reaching the 40 or 254-volume limits, should use the
maximum capacity of the virtual volume. Write your ACS routines to select a different SMS
DATACLAS for backup and migration activities that is based on the optimum volume size.
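The following DATACLAS ACS routine fragment is a sketch of this approach; the data set name
filters and the data class name are placeholders, not a definitive implementation:
PROC DATACLAS
  /* Assign a large logical volume size to DFSMShsm ML2 and      */
  /* backup tape data sets (filters and DC name are examples).   */
  FILTLIST HSMTAPE INCLUDE(HSM.HMIGTAPE.**,HSM.BACKTAPE.**)
  IF &DSN = &HSMTAPE THEN
    SET &DATACLAS = 'DC25000'
END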
See Table 8-5 on page 323 when you customize the ARCCMDxx SETSYS parameters. DFSMShsm is
aware of the large virtual volume capacity, so it is not necessary to use high PERCENTFULL
values to tune tape capacity from a DFSMShsm point of view. The maximum PERCENTFULL value
that can be defined is 110%, but it is no longer necessary to go above 100%.
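For example, assuming that all DFSMShsm tape output goes to SMS-managed TS7700 virtual
drives, ARCCMDxx settings such as the following (the values are illustrative) let DFSMShsm use
the full logical volume capacity without over-tuning:
SETSYS TAPEUTILIZATION(LIBRARYMIGRATION PERCENTFULL(100))
SETSYS TAPEUTILIZATION(LIBRARYBACKUP PERCENTFULL(100))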
In OAM’s Object Tape Support, the TAPECAPACITY parameter in the SETOAM statement of the
CBROAMxx PARMLIB member is used to specify the larger logical volume sizes. Because
OAM also obtains the size of the logical volume from the TS7700, defining TAPECAPACITY in
the CBROAMxx PARMLIB member is not necessary. For more information about Outboard
policy management, see the z/OS DFSMS Object Access Method Planning, Installation, and
Storage Administration Guide for Tape Libraries, SC23-6867.
Scratch volumes
The default volume size is overridden at the library through the DC policy specification, and is
assigned or reassigned when the volume is mounted for a scratch mount or rewritten from
load point as a specific mount. Using a global scratch pool, you benefit from a fast mount time
by establishing your scratch categories, as explained in “Defining scratch categories” on
page 550. Consider using the following definitions to benefit from the fast scratch mount
times:
SETSYS SELECTVOLUME(SCRATCH): Requests DFSMShsm to use volumes from the common
scratch pool.
SETSYS TAPEDELETION(SCRATCHTAPE): Defines that DFSMShsm returns tapes to the
common scratch pool.
SETSYS PARTIALTAPE(MARKFULL): Defines that a DFSMShsm task marks the last tape
that it used in a cycle as full, avoiding a specific mount during the next cycle.
When you use the MARKFULL parameter, the stacked volume contains only the written data of
each logical volume that is copied, and the same applies to the TVC.
Tape spanning
You can use the optional TAPESPANSIZE parameter of the SETSYS command to reduce the
spanning of data sets across migration or backup tape volumes, for example:
SETSYS TAPESPANSIZE(4000)
The value in parentheses represents the maximum number of megabytes of tape (ML2 or
backup) that DFSMShsm might leave unused while it tries to eliminate the spanning of data
sets. To state this in another way, this value is the minimum size of a data set that is allowed to
span tape volumes. Data sets whose size is less than the value do not normally span
volumes. Only those data sets whose size is greater than or equal to the specified value are
allowed to span volumes.
This parameter offers a trade-off: It reduces the occurrences of a user data set spanning
tapes in exchange for writing less data to a given tape volume than its capacity otherwise
enables. The amount of unused media can vary from 0 to nnnn physical megabytes, but roughly
averages 50% of the median data set size. For example, if you specify 4000 MiB and your
median data set size is 2 MiB, on average, only 1 MiB of media is unused per cartridge.
Installations that currently experience an excessive number of spanning data sets need to
consider specifying a larger value in the SETSYS TAPESPANSIZE command. Using a high value
reduces tape spanning. In a TS7700, this value reduces the number of virtual volumes that
need to be recalled to satisfy DFSMShsm recall or recover requests.
You can be generous with the value because no space is wasted. For example, a
TAPESPANSIZE of 4000 means that any data set smaller than 4000 MiB that does not fit in the
remaining space of a virtual volume is started on a new virtual volume.
Volume dumps
When using TS7700 as output for the DFSMShsm AUTODUMP function, do not specify the
following parameters:
DEFINE DUMPCLASS(dclass STACK(nn))
BACKVOL SG(sgname)|VOLUMES(volser) DUMP(dclass STACK(10))
These parameters were introduced to force DFSMShsm to use the capacity of native physical
cartridges. If used with TS7700, they cause unnecessary multivolume files and reduce the
level of parallelism possible when the dump copies are restored. Use the default value, which
is NOSTACK.
For example, if your installation often has more than 10 tape recall tasks at one time, you
probably need 12 back-end drives to satisfy this throughput request because all migrated
data sets might already have been removed from the TVC and need to be recalled from tape.
TAPECOPY
The DFSMShsm TAPECOPY function requires that original and target tape volumes are of
the same media type and use the same recording technology. Using a TS7700 as the target
for the TAPECOPY operation from an original volume that is not a TS7700 volume might
cause problems in DFSMShsm because TS7700 virtual volumes have different volume sizes.
For example, if you are planning to put DFSMShsm alternative copies into a TS7700, a tape
capacity of 45% might not be enough for the input non-TS7700 ECCST cartridges.
TAPECOPY fails if the (virtual) output cartridge encounters EOV before the input volume has
been copied completely.
However, using TS7700 logical volumes as the original and native 3490E cartridges as the
TAPECOPY target might cause EOV on the alternative volume, because the LZ compression
algorithm (IBMLZ1) on the virtual drive typically achieves a higher compression ratio than the
improved data-recording capability (IDRC) compression on the native drive.
For special situations where copying from standard to enhanced capacity media is needed,
the following patch command can be used:
PATCH .MCVT.+4F3 BITS(.....1..)
DUPLEX TAPE
For duplexed migration, both output tapes must be of the exact same size and unit type. A
preferred practice is to use a multi-cluster grid and the new Synchronous mode copy support,
and enable the hardware to run the duplex rather than the DFSMShsm software function. This
method also enables you to more easily manage the disaster side. You can use
Geographically Dispersed Parallel Sysplex (GDPS) and switch to the remote DASD side and
the tape VOLSER itself does not need to be changed. No TAPEREPL or SETSYS DISASTERMODE
commands are needed.
When DFSMShsm writes ML2 data to tape, it deletes the source data as it goes along, before
the rewind-unload (RUN) is sent to the TS7700. Therefore, until the copy is made, only one
copy of the ML2 data might exist. The reason is that the TS7700 grid, even with a Copy
Consistency Point of [R,R], makes the second copy only at RUN time.
By using the appropriate MC settings in SMS, you can ensure that a data set is not migrated
to ML2 before a valid backup copy of this data set exists. This way, there are always two valid
instances from which the data set can be retrieved: One backup and one ML2 version. After
the second copy is written at rewind-unload time, two copies of the ML2 data will exist in the
grid.
RECYCLE
The DFSMShsm RECYCLE function reduces the number of logical volumes inside the
TS7700, but when started, it can cause bottlenecks in the TS7740 or TS7700T recall process.
If you have a TS7740 or TS7700T with four physical drives, use a maximum of two concurrent
DFSMShsm RECYCLE tasks. If you have a TS7740 or TS7700T with six physical drives, use
no more than five concurrent DFSMShsm RECYCLE tasks.
Use a RECYCLEPERCENT value that depends on the logical volume size, for example:
5 for 1000 MiB, 2000 MiB, 4000 MiB, or 6000 MiB volumes
10 for 400 MiB or 800 MiB volumes
You can use the following commands to limit which volumes can be selected for DFSMShsm
RECYCLE processing. For instance, you might want to limit RECYCLE to only your old
technology, and exclude the newer tape technology from RECYCLE until the conversion is
complete. You can use the following commands to limit which tape volume ranges are
selected for RECYCLE:
RECYCLE SELECT (INCLUDE(RANGE(nnnnn:mmmmm)))
RECYCLE SELECT (EXCLUDE(RANGE(nnnnn:mmmmm)))
You can also use the SETSYS RECYCLEOUTPUT parameter to determine which tape unit to use for
the RECYCLE output tapes. You can use your ACS routines to route RECYCLEOUTPUT allocations
to the desired library by using the &UNIT variable.
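The following sketch shows the idea; the esoteric unit name VTS3490 and the storage group
name SGTS7700 are placeholders:
SETSYS RECYCLEOUTPUT(MIGRATION(VTS3490))
SETSYS RECYCLEOUTPUT(BACKUP(VTS3490))
In the storage group ACS routine, the RECYCLEOUTPUT unit can then be tested:
IF &UNIT = 'VTS3490' THEN
  SET &STORGRP = 'SGTS7700'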
See IBM z/OS DFSMShsm Primer, SG24-5272 for more information about implementing
DFSMShsm.
DFSMSrmm accepts the logical volume capacity from the open/close/end-of-volume (OCE)
modules, so DFSMSrmm can now always list the actual capacity reported by the TS7700.
Note: Prior to APAR OA49373, the CBR3660A message was deleted when the scratch
count was 2X+1 above the threshold. With this APAR (z/OS V2R1 and later), the point at
which CBR3660A is deleted can be customized by using the CBROAMxx PARMLIB member and
the SETTLIB command.
When you direct allocations inside the TS7700, the vital record specifications (VRSs), or vault
rules, indicate to the TMS that the data set will never be moved outside the library. During
VRSEL processing, each data set and volume is matched to one or more VRSs, and the
required location for the volume is determined based on priority. The volume’s required
location is set.
The volume is not moved unless DSTORE is run for the location pair that includes the current
volume location and its required location. For logical volumes, this required location can be
used to determine which volume must be exported. For Copy Export, the required location is
only used for stacked volumes that have been Copy Exported.
Other TMSs must modify their definitions in a similar way. For example, CA-1 Tape
Management must modify its RDS and VPD definitions in the CA-1 PARMLIB. Control-M/Tape
(Control-T) must modify its rules definitions in the Control-T PARMLIB.
The DFSMSrmm return-to-scratch process has been enhanced to enable more parallelism in
the return-to-scratch process. EDGSPLCS is a new option for the EDGHSKP SYSIN file
EXPROC command that can be used to return to scratch tapes in an asynchronous way. With
the most recent software support changes, EDGSPLCS can be used to run scratch
processing in parallel across multiple libraries, or in parallel within a library.
The only necessary step is to run different instances of CBRSPLCS. For more information
about the enhanced return-to-scratch process, see z/OS DFSMSrmm Implementation and
Customization Guide, SC23-6874.
Stacked volumes cannot be used by the host; they are managed exclusively by the TS7740 or
TS7720T. Do not enable any host to either implicitly or explicitly address these stacked
volumes. To indicate that the stacked VOLSER range is reserved and cannot be used by any
host system, define the VOLSERs of the stacked volumes to RMM.
Use the following PARMLIB parameter, assuming that VT is the prefix of your stacked TS7700
cartridges:
REJECT ANYUSE(VT*)
This parameter causes RMM to deny any attempt to read or write those volumes on native
drives. There are no similar REJECT parameters in other TMSs.
You do not need to explicitly define the virtual volumes to RMM. During entry processing, the
active RMM automatically records information about each volume in its CDS. RMM uses the
defaults that you specified in ISMF for the library entry values if there is no existing RMM
entry for an inserted volume. Set the default entry status to SCRATCH.
When adding 1,000,000 or more virtual volumes, the size of the RMM CDS and the amount of
secondary space available must be checked. RMM uses 1 MB for every 1,000 volumes
defined in its CDS. An extra 1,000,000 volumes need 1,000 MB of space. However, do not
add all the volumes initially. See “Inserting virtual volumes” on page 549 for more information.
Other TMSs, such as the BrightStor CA-1 Tape Management Copycat Utility (BrightStor CA-1
Copycat) and the BrightStor CA-Dynam/TLMS Tape Management Copycat Utility (BrightStor
CA-Dynam/TLMS Copycat), must reformat their databases to add more volumes. Therefore,
they must be stopped while the additional cartridges are defined.
Additionally, some TMSs do not enable the specification of tape volumes with alphanumeric
characters or require user modifications to do so. See the correct product documentation for
this operation.
In both RMM and the other TMSs, the virtual volumes do not have to be initialized. The first
time that a VOLSER is used, TS7700 marks the virtual volume with VOL1, HDR1, and a tape
mark, as though it had been done by EDGINERS or IEHINITT.
Tivoli Storage Manager 6.1, released in 2009, had no Tivoli Storage Manager Server support
for z/OS. IBM Tivoli Storage Manager for z/OS Media V6.3 and IBM Tivoli Storage Manager
for z/OS Media Extended Edition V6.3 are replacement products for Tivoli Storage Manager
V5.5 and Tivoli Storage Manager Extended Edition for z/OS V5.5, with new functions
available in Tivoli Storage Manager V6, while maintaining the ability to access Fibre Channel
connection (FICON)-attached storage on a z/OS system.
IBM Tivoli Storage Manager for z/OS Media and IBM Tivoli Storage Manager for z/OS Media
Extended Edition, introduced with Version 6.3, are designed to enable IBM Tivoli Storage
Manager V6.3 servers that are running on IBM AIX and Linux on z Systems to access various
FICON-attached tape libraries on z/OS, including the TS7700 family.
Tip: Beginning with Version 7.1.3, IBM Tivoli Storage Manager is now IBM Spectrum
Protect™. Some applications, such as the software fulfillment systems and IBM License
Metric Tool, use the new product name. However, the software and its product
documentation continue to use the Tivoli Storage Manager product name. To learn more
about the rebranding transition, see the following website:
http://www.ibm.com/support/docview.wss?uid=swg21963634
For the current Tivoli Storage Manager supported levels for Linux on z Systems, see IBM
Knowledge Center:
http://www.ibm.com/support/docview.wss?uid=swg21243309
Figure 8-3 Data flow from the backup-archive client to z/OS media server storage
If you plan to store IBM Tivoli Storage Manager data in the TS7700, consider the following
suggestions for placing data on your TS7700:
Use the TS7740 or TS7700T for IBM Tivoli Storage Manager archiving, and for backing up
large files or databases for which you do not have a high-performance requirement during
backup and restore. The TS7740 or TS7700T is ideal for IBM Tivoli Storage Manager archive
or long-term storage because archive data is infrequently retrieved. Archives and
restorations of large files are less affected by the staging.
For more information about setting up Tivoli Storage Manager, see the following web page:
http://www.ibm.com/support/knowledgecenter/SSGSG7_7.1.1/com.ibm.itsm.srv.common.do
c/t_installing_srv.html
8.8 DFSMSdss
This section describes the uses of DFSMSdss with the TS7700.
With TS7740 or TS7700T, you fill the stacked cartridge without changing JCL by using
multiple virtual volumes. The TS7740 or TS7700T then moves the virtual volumes that are
created onto a stacked volume.
By using the COMPRESS keyword of the DUMP command, you obtain software compression of the
data at the host level. Because data is compressed at the TS7700 before being written into
the TVC, host compression is not required unless channel utilization is already high.
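The following JCL is a sketch of a full-volume dump that is directed to a TS7700 virtual
volume without the COMPRESS keyword; the esoteric unit name, data set name, and volume serial
are placeholders:
//DSSDUMP  EXEC PGM=ADRDSSU
//SYSPRINT DD SYSOUT=*
//TAPE     DD DSN=BACKUP.PROD01.DUMP,UNIT=VTS3490,
//            DISP=(NEW,CATLG),LABEL=(1,SL)
//SYSIN    DD *
  DUMP FULL INDYNAM(PROD01) OUTDDNAME(TAPE) ALLDATA(*) ALLEXCP
/*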
Stand-Alone Services support the 3494 and 3584 (TS3500) tape library and the VTS. You
can use it to restore from native and virtual tape volumes in a TS7700. With Stand-Alone
Services, you specify the input volumes on the RESTORE command and send the necessary
mount requests to the tape library.
You can use an initial program load (IPL) of the Stand-Alone Services core image from a
virtual tape device and use it to restore dump data sets from virtual tape volumes.
Stand-Alone Services are provided as a replacement to the previous DFDSS V2.5 and
DFSMS V1 stand-alone functions. The installation procedure for Stand-Alone Services
retains, rather than replaces, the existing stand-alone restore program so you do not have to
immediately change your recovery procedures. Implement the procedures as soon as you
can and start using the enhanced Stand-Alone Services.
To use Stand-Alone Services, create a stand-alone core image suitable for IPL by using the
new BUILDSA command of DFSMSdss. Create a new virtual tape as non-labeled and then put
the stand-alone program on it.
For more information about how to use the TS7700 MI to set a device in stand-alone mode,
see “Modify Virtual Volumes window” on page 415.
Complete these steps to use an IPL of the Stand-Alone Services program from a virtual
device and restore a dump data set from virtual volumes:
1. Ensure that the virtual devices you will be using are offline to other host systems. Tape
drives to be used for stand-alone operations must remain offline to other systems.
2. Set the virtual device from which you will load the Stand-Alone Services program in
stand-alone mode by selecting Virtual Drives on the TS7700 MI of the cluster where you
want to mount the logical volume.
3. Follow the sequence that is described for stand-alone mount in “Virtual tape drives” on
page 396.
4. Load the Stand-Alone Services program from the device you set in stand-alone mode. As
part of this process, select the operator console and specify the input device for entering
Stand-Alone Services commands.
5. Enter the RESTORE command for the dump data set, specifying the source virtual volumes
on the TAPEVOL parameter. In this example, L00001 and L00002 are the virtual volumes that
contain the dump data set to be restored, 0A40 is the virtual device that is used for
reading source volumes L00001 and L00002, and 0900 is the device address of the DASD
target volume to be restored.
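The following command is a sketch of what this step might look like with those values; see
the z/OS DFSMSdss Storage Administration, SC23-6868, for the authoritative operand syntax:
RESTORE FROMDEV(TAPE) FROMADDR(0A40) TOADDR(0900) NOVERIFY TAPEVOL((L00001),(L00002))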
Stand-Alone Services request the TS7700 to mount the source volumes in the order in
which they are specified on the TAPEVOL parameter. It automatically unloads each
volume, then requests the TS7700 to unmount it and to mount the next volume.
6. When the restore is complete, unload and unmount the IPL volume from the virtual device
by using the TS7700 MI’s Setup Stand-alone Device window.
7. In the Virtual Drives window in Figure 9-45 on page 397, click Actions → Unmount
logical volume to unload the virtual drive and finish the Stand-alone Mount operation.
Stand-Alone Services send the necessary mount and unmount orders to the library. If you are
using another stand-alone restore program that does not support the mounting of library
resident volumes, you must set the source device in stand-alone mode and manually instruct
the TS7700 to mount the volumes by using the Setup Stand-alone Device window.
For more information about how to use Stand-Alone Services, see the z/OS DFSMSdss
Storage Administration, SC23-6868.
Enabling objects to be stored on tape volumes with DASD and optical media provides
flexibility and more efficiency within the storage management facility.
OAM stores objects on a TS7700 as they are stored in a normal TS3500 tape library, with up
to 496 virtual drives and many virtual volumes available.
When using the TS7740 or TS7700T, consider using the TAPEPERCENTFULL parameter
with object tape data because the retrieval time of an OAM object is important. The recall time
for smaller logical volumes can be reduced considerably.
The OAM TAPECAPACITY parameter is no longer needed when you use the TS7700 with OAM
because OAM now obtains the capacity of the logical volume from the library.
Virtual volumes in a TS7700 can be used to store your object data (OBJECT or OBJECT
BACKUP SG data). With the data in cache, the TS7700D (or TS7700T CP0) can be ideal for
your primary OBJECT SG needs. Your OBJECT BACKUP SGs can then be in a TS7700 or
TS7700T CPx, depending on your recovery needs.
As with DFSMShsm, the Synchronous mode copy option can be used to replicate your data
to other clusters in the grid. This replicated copy (and others in the grid) is then managed by
the TS7700. The replication capabilities in the grid can be used in addition to any
OAM-managed backup copies.
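As an illustration only, a SETOAM statement in the CBROAMxx member might direct an object
storage group to TS7700 virtual drives as follows; the storage group name and the esoteric
unit name are placeholders, and no TAPECAPACITY specification is needed:
SETOAM STORAGEGROUP(OBJBKUP) TAPEUNITNAME(VTS3490)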
Archive logs
DB2 tracks database changes in its active log. The active log uses up to 31 DASD data sets
(up to 62 with dual logging) in this way: When a data set becomes full, DB2 switches to the
next one and automatically offloads the full active log to an archive log.
Archive logs are sequential data sets that are allocated on either DASD or tape. When
archiving to tape, a scratch tape volume is requested each time.
Archive logs contain unique information necessary for DB2 data recovery. Therefore, to
ensure DB2 recovery, make backups of archive logs. You can use general backup facilities or
the DB2 dual archive logging function.
When creating a Dual Copy of the archive log, usually one is local and the other is for disaster
recovery. The local copy can be written to DASD, then moved to tape, by using Tape Mount
Management (TMM). The other copy can be written directly to tape and then moved to an
offsite location.
With TS7700, you can write the local archive log directly inside the TS7700. Avoiding the use
of TMM saves DASD space, saves DFSMShsm CPU cycles, and simplifies the process. The
disaster recovery copy can be created by using Copy Export capabilities in the TS7740 and
TS7700T, or by using native tape drives, so that it can be moved offsite.
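For example, on the DB2 installation panel DSNTIPA (archive log parameters), the two archive
log copies can point to different device types; the esoteric names below are placeholders for
a TS7700 unit and a native drive unit:
DEVICE TYPE 1 ===> VTS3490   (local archive copy written to the TS7700)
DEVICE TYPE 2 ===> T3592     (second copy on native drives for offsite movement)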
The size of an archive log data set varies from 150 MB to 1 GB. The size of a virtual volume
on a TS7700 can be up to 25,000 MiB, which corresponds to about 75,000 MiB of host data
assuming a 3:1 compression ratio, so be sure that your archive log is directed to a virtual
volume size that can hold the entire log. Use a single volume when unloading an archive log
to tape.
Limiting data set size might increase the frequency of offload operations and reduce the
amount of active log data on DASD. However, this is not a problem with the TS7700 because
it requires no manual operation. Even with the TS7740 (or TS7700T), the archive logs stay in
the TVC for some time and are available for fast recovery.
One form of DB2 recovery is backward recovery, typically done after a processing failure,
where DB2 backs out uncommitted changes to resources. When doing so, DB2 processes
log records in reverse order, from the latest back toward the oldest.
If the application that is being recovered has a large data set and makes only a few commit
operations, you probably need to read the old archive logs that are on tape. When archive
logs are on tape, DB2 uses read-backward channel commands to read the log records.
Read-backward is a slow operation on tape cartridges that are processed on real IBM 3480 (if
improved data-recording capability (IDRC) is enabled) and IBM 3490 tape drives.
On a TS7700, it is only about 20% slower than a normal I/O because data is retrieved from
the TVC, so the tape drive characteristics are replaced by the random access disk
characteristics. Another benefit that TS7700 can provide for DB2 operations is the availability
of up to 496 (stand-alone cluster) or 2976 virtual drives (six-cluster grid configuration)
because DB2 often needs many drives concurrently to run recovery or backup functions.
Image copies
Image copies are backup copies of table spaces in a DB2 database. DB2 can create both full
and incremental image copies. A full image copy contains an image of the whole table space
at the time the copy was taken. An incremental image copy contains only those pages of a
table space that changed since the last full image copy was taken. Incremental image copies
are typically taken daily. Full image copies are typically taken weekly.
DB2 provides the option for multiple image copies. You can create up to four identical image
copies of a table space, one pair for local recovery use and one pair for offsite storage.
The size of the table spaces to be copied varies from a few megabytes to several gigabytes.
The TS7700 solution is best for small-sized and medium-sized table spaces because you
need a higher bandwidth for large table spaces.
When a database is recovered from image copies, a full image copy and the subsequent
incremental image copies need to be allocated at the same time. This can potentially tie up
many tape drives and, in smaller installations, can prevent other work from being run. With
one TS7700 and 496 virtual drives, this is not an issue.
The large number of tape drives is also important for creating DB2 image copies. Having
more drives available enables you to run multiple copies concurrently and to use the
DB2 MERGECOPY utility without affecting other work. An advisable solution is to run a full image copy of
the DB2 databases once a week outside the TS7700, and run the incremental image copies
daily by using TS7700. The smaller incremental copy fits better with the TS7700 volume
sizes.
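The following JCL is a sketch of a daily incremental image copy that is directed to the
TS7700; the database, table space, data set, and unit names are placeholders:
//COPYTS   EXEC DSNUPROC,SYSTEM=DB2P,UID='ICDAILY'
//SYSCOPY  DD DSN=DB2P.IC.DBASE1.TSPACE1(+1),UNIT=VTS3490,
//            DISP=(NEW,CATLG),LABEL=(1,SL)
//SYSIN    DD *
  COPY TABLESPACE DBASE1.TSPACE1 COPYDDN(SYSCOPY)
       FULL NO SHRLEVEL CHANGE
/*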
CICS is only a data communication product, whereas IMS provides both the data communication
and the database function (IMS DL/1). CICS uses the same DL/1 database function to store its data.
CICS and IMS logs are sequential data sets. When offloading these logs to tape, you must
request a scratch volume every time.
The logs contain the information necessary to recover databases and usually those logs are
offloaded, as with DB2, in two copies, one local and one remote. You can write one local copy
and then create the second for disaster recovery purposes later, or you can create the two
copies in the same job stream.
With TS7700, you can create the local copy directly on TS7700 virtual volumes, and then
copy those volumes to non-TS7700 tape drives, or to a remote TS7700.
Having a local copy of the logs written inside the TS7700 enables faster recovery
because the data stays in the TVC for some time.
When recovering a database, you can complete back out operations in less time with the
TS7700 because when reading logs from tape, IMS uses the slow read backward operation
(100 KBps) on real tape drives. With the TS7700, the same operation is much faster because
the data is read from TVC.
Lab measurements show little difference between read forward and read backward in a
TS7700; both perform much better than on physical drives. The reason is not only that the
data is in the TVC, but also that in read-backward mode the TS7700 code fully buffers the
records in the reverse of the order in which they are on the volume.
Another benefit TS7700 provides to recovery operations is the availability of up to 496 virtual
drives per cluster. This configuration enables you to mount several logs concurrently and to
back out the database to be recovered faster.
The IMS change accumulation utility is used to accumulate changes to a group of databases
from several IMS logs. This implies the use of many input logs that will be merged into an
output accumulation log. With the TS7700, you can use more tape drives for this function.
Image copies
Image copies are backup copies of the IMS databases. IMS can create only full image copies.
To create an image copy of a database, use a batch utility to copy one or more databases
to tape.
With the TS7700, you do not have to stack multiple small image copies to fill a tape cartridge.
Using one virtual volume per database does not waste space because the TS7700 then
groups these copies into a stacked volume.
The TS7700 volume stacking function is the best solution for every database backup because
it is transparent to the application and does not require any JCL procedure change.
The amount of data from these applications can be huge if your environment does not use
TMM or if you do not have DFSMShsm installed. All of this data benefits from using the
TS7700 for output.
With TS7700, the application can write one file per volume, by using only part of the volume
capacity. The TS7740 or TS7700T takes care of completely filling the stacked cartridge for
you, without JCL changes.
The only step that you must remember is that if you need to move the data offsite, you must
address a device outside the local TS7700, or use other techniques to copy TS7700 data
onto other movable tapes, as described in 8.4, “Moving data out of the TS7700” on page 315.
Part 3 Operation
This part describes daily operations and the monitoring tasks related to the IBM TS7700. It
also provides you with planning considerations and scenarios for disaster recovery, and for
disaster recovery testing.
Chapter 9. Operation
This chapter provides information about how to operate and configure the IBM TS7700 by
using the Management Interface (MI).
For general guidance regarding TS3500 or TS4500 tape libraries, see the following IBM
Redbooks publications:
IBM TS3500 Tape Library with System z Attachment A Practical Guide to Enterprise Tape
Drives and TS3500 Tape Automation, SG24-6789
IBM TS4500 R4 Tape Library Guide, SG24-8235
The logical view is named the host view. From the host allocation point of view, there is only
one library, called the composite library. The logical view includes virtual volumes and virtual
tape drives.
With R4.1, a composite library can have up to 3,968 virtual addresses for tape mounts,
considering an eight-cluster grid with support for 496 virtual devices in each cluster (available
with FC5275 and z/OS APAR OA44351). For more information, see Chapter 2,
“Architecture, components, and functional characteristics” on page 15. The host is only aware
of the existence of the underlying physical libraries because they are defined through
Interactive Storage Management Facility (ISMF) in a z/OS environment. The term distributed
library is used to denote the physical libraries and TS7700 components that are part of one
cluster of the multi-cluster grid configuration.
The physical view shows the hardware components of a stand-alone cluster or a multi-cluster
grid configuration. In a TS7700 tape-attached model, it includes the currently configured
physical tape library and tape drives:
The TS4500 tape library with supported tape drives TS1140 (3592 EH7) and TS1150
(3592 EH8) models
The TS3500 tape library, which supports the 3592 J1A, TS1120, TS1130, TS1140, or
TS1150 tape drive models
Note: TS7760-VEC tape attach configurations must use CISCO switches (16 Gbps)
Release 4.0 introduced support for the TS4500 tape library when attached to models
TS7740-V07, TS7720T-VEB, and TS7760. TS3500 tape library can still be attached to all
TS7700 tape attach models.
Release 3.3 introduced support for TS1150 along with heterogeneous support for two
different tape drive models at the same time, as described in 7.1.5, “TS7700 tape library
attachments, drives, and media” on page 263.
The following operator interfaces for providing information about the TS7700 are available:
Object access method (OAM) commands are available at the host operator console.
These commands provide information about the TS7700 in stand-alone and grid
environments. This information represents the host view of the components within the
TS7700. Other z/OS commands can be used against the virtual addresses. This interface
is described in Chapter 10, “Host Console operations” on page 581.
Web-based management functions are available through web-based user interfaces (UIs).
The following browsers can be used to access the web interfaces:
– Firefox ESR 31.x, 38.x, 45.x
– Microsoft Internet Explorer Version 9.x, 10.x, and 11
– Chrome 39.x and 42.x
Call Home Interface: This interface is activated on the TS3000 System Console (TSSC)
and provides helpful information to IBM Service, Support Center, and Development
personnel. It also provides a method to connect IBM storage systems with IBM remote
support, also known as Electronic Customer Care (ECC). No user data or content is
included in the call home information.
Figure 9-1 shows the TS3500 tape library GUI initial window with the System Summary.
The tape library management GUI windows are used during the hardware installation phase
of the TS7700 tape attach models. The installation activities are described in 9.5.1, “The tape
library with the TS7700T cluster” on page 521.
The current TS7700 graphical user interface (GUI) implementation has an appearance and
feel that is similar to the management interfaces adopted in other IBM Storage products.
4. If a local name server is used, where names are associated with the virtual IP address,
then the cluster name rather than the hardcoded address can be used for reaching the MI.
5. The login window for the MI displays as shown in Figure 9-3. Enter the default login name
as admin and the default password as admin.
After logging in, the user is presented with the Grid Summary page, as shown in Figure 9-4 on
page 344.
After security policies are implemented locally at the TS7700 cluster or by using centralized
role-based access control (RBAC), a unique user identifier and password can be assigned by
the administrator. The user profile can be modified to provide only functions applicable to the
role of the user. Not all users have access to the same functions or views through the MI.
For more information, see 9.3.9, “The Access icon” on page 464.
Each cluster is represented by an image of the TS7700 frame, displaying the cluster’s
nickname and ID, and the composite library name and Library ID.
The health of the system is checked and updated automatically at times that are determined
by the TS7700. Data that is displayed in the Grid Summary window is not updated in real
time. The Last Refresh field, in the upper-right corner, reports the date and time that the
displayed data was retrieved from the TS7700. To populate the summary with an updated
health status, click the Refresh icon near the Last Refresh field in the upper-right corner of
Figure 9-4.
The health status of each cluster is indicated by a status sign next to its icon. The legend
explains the meaning of each status sign. To obtain additional information about a specific
cluster, click that component’s icon. In the example shown in Figure 9-4, the TS7720T has a
Warning or Degraded indication on it.
Login window
Each cluster in a grid uses its own login window, which is the first window that opens when
the cluster URL is entered in the browser address field. The login window shows the name
and number of the cluster to be accessed. After logging in to a cluster, other clusters in the
same grid can be accessed from the same web browser window.
Banner
The banner is common to all windows of the MI. The banner elements can be used to
navigate to other clusters in the grid, run some user tasks, and locate additional information
about the MI. The banner is located across the top of the MI web page, and allows a
secondary navigation scheme for the user.
The banner elements include the product name, the current MI user, the navigation sequence
(breadcrumb trail), and the MI screen information.
The last field to the right of the banner (the question mark symbol) provides information about
the current MI window. In addition, you can open the learning and tutorials and the knowledge
center, and check the level of the installed knowledge center, by hovering the mouse over the
symbol and clicking the desired option.
Figure 9-6 shows some examples of status and events that can be displayed from the Grid
Summary window.
Figure 9-6 Status and Events indicators in the Grid Summary pane
All cluster indicators provide information for the accessing cluster only, and are displayed only
on MI windows that have a cluster scope. MI also provides ways to filter, sort, and change the
presentation of different tables in the MI. For example, the user can hide or display a specific
column, modify its size, sort the table results, or download the table row data in a
comma-separated value (CSV) file to a local directory.
The LI REQ panel is minimized and docked at the bottom of the MI window. The user only needs
to click it (at the lower right) to open the LI REQ command pane. Figure 9-7 shows the new
LI REQ command panel and operation.
By default, the only user role that is allowed to run LI REQ commands is the Administrator. LI
REQ commands are logged in the Tasks list.
Figure 9-8 shows an example of a library request command reported in the Tasks list, and
shows how to get more information about the command by selecting Properties or clicking See
details in the MI window.
Important: LI REQ commands that are issued from this window are not presented in the
host console logs.
For a complete list of available LI REQ commands, their usage, and respective responses, see
the current IBM TS7700 Series z/OS Host Command Line Request User’s Guide
(WP101091), found at:
http://www.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/WP101091
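For example (GRIDLIB is a placeholder composite library name), a grid status request of the
kind that can be entered here corresponds to the following host console command:
LIBRARY REQUEST,GRIDLIB,STATUS,GRID
The same request can be abbreviated as LI REQ,GRIDLIB,STATUS,GRID.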
More items might also show, depending on the actual cluster configuration:
Systems icon: This window shows the cluster members of the grid and grid-related
functions.
Monitor icon: This window gathers the events, tasks, and performance information
about one cluster.
Light cartridge icon: Information that is related to virtual volumes is available here.
MI Navigation
Use this window (Figure 9-9) for a visual summary of the TS7700 MI Navigation.
Tip: The Systems icon is only visible when the accessed TS7700 Cluster is part of a grid.
This window shows a summary view of the health of all clusters in the grid, including family
associations, host throughput, and any incoming copy queue. Figure 9-10 shows an example
of a Grid Summary window, including the pop-up windows.
There is a diskette icon on the right of the Actions button. Clicking the icon saves a
CSV-formatted file with a summary of the grid components information.
Note: The number that is shown in parentheses in breadcrumb navigation and cluster
labels is always the cluster ID.
Order by Families
Select this option to group clusters according to their family association.
Show Families
Select this option to show the defined families on the grid summary window. Cluster
families are used to group clusters in the grid according to a common purpose.
Cluster Families
Select this option to add, modify, or delete cluster families used in the grid.
Modify Grid Identification
Use this option to change the grid nickname or description.
This new function is introduced by the R4.1.2 level of code. This menu option is available
only if control unit initiated reconfiguration (CUIR) is enabled by a LI REQ command and
the Automatic Vary Online (AONLINE) notification is disabled. For more information about
the use of the LI REQ commands, see 10.8, “CUIR for Tape (R4.1.2 Enhancement)” on
page 606.
Fence Cluster, Unfence Cluster
Select Fence Cluster to place a selected cluster in a fence state. If a cluster is already in
a fence state, the option Unfence Cluster is shown instead.
Select Unfence Cluster to unfence a selected cluster that is currently in a fence state.
R4.1.2 introduces this function as part of the grid resilience improvements package.
The Fence Cluster option in the Actions menu allows a user with the Administrator role (the
default) to manually remove (fence) from the grid a cluster that has been determined to be
sick or not functioning properly. Fencing a cluster isolates it from the rest of the grid. The
administrator can fence the local cluster (the one being accessed by the MI) or a remote
cluster in the grid from this window.
The user can decide what action will be taken by the sick cluster after the fence cluster
action is selected:
– Options for the local cluster:
• Forced offline
• Reboot
• Reboot and stay offline
– Options for a remote cluster (from any other cluster in the grid besides the cluster
under suspicion):
• Send an alert
• Force cluster offline
• Reboot
• Reboot and stay offline or isolate from the grid
Figure 9-13 shows the TS7700 MI sequence to manually fence a cluster.
See 2.3.37, “Grid resiliency functions” on page 99 for more information about cluster fence
function and proper usage.
To view or modify cluster family settings, first verify that these permissions are granted to the
assigned user role. If the current user role includes cluster family permissions, select Modify
to run the following actions:
Add a family: Click Add to create a new cluster family. A new cluster family placeholder is
created to the right of any existing cluster families. Enter the name of the new cluster
family in the active Name text box. Cluster family names must be 1 - 8 characters in length
and composed of Unicode characters. Each family name must be unique. Clusters are
added to the new cluster family by relocating a cluster from the Unassigned Clusters area
by using the method that is described next in the Move a cluster function.
Move a cluster: One or more clusters can be moved by dragging, between existing cluster
families, to a new cluster family from the Unassigned Clusters area, or to the Unassigned
Clusters area from an existing cluster family:
– Select a cluster: A selected cluster is identified by its highlighted border. Select a
cluster from its resident cluster family or the Unassigned Clusters area by using one of
these methods:
• Clicking the cluster with the mouse.
• Using the Spacebar key on the keyboard.
• Pressing and holding the Shift key while selecting clusters to select multiple clusters
at one time.
• Pressing the Tab key on the keyboard to switch between clusters before selecting
one.
Delete a family: To delete an existing cluster family, click the X in the upper-right corner of
the cluster family to delete it. If the cluster family to be deleted contains any clusters, a
warning message is displayed. Click OK to delete the cluster family and return its clusters
to the Unassigned Clusters area. Click Cancel to abandon the delete action and retain the
selected cluster family.
Save changes: Click Save to save any changes that are made to the Cluster Families
window and return it to read-only mode.
Remember: Each cluster family must contain at least one cluster. An attempt to save a
cluster family that does not contain any clusters results in an error message. No
changes are made, and the Cluster Families window remains in edit mode.
To change the grid identification properties, edit the available fields and click Modify. The
following fields are available:
Grid nickname: The grid nickname must be 1 - 8 characters in length and composed of
alphanumeric characters with no spaces. The characters at (@), period (.), dash (-), and
plus sign (+) are also allowed.
Grid description: A short description of the grid. Up to 63 characters can be used.
Exceptions in the cluster state are represented in the Grid Summary window by a little icon at
the lower right side of the cluster’s picture. Additional information about the status can be
viewed by hovering your cursor over the icon. See Figure 9-6 on page 346 for a visual
reference of the icons and how they show up on the Grid Summary page.
Figure 9-18 shows the icons for other possible statuses for a cluster that can be viewed on the
TS7700 Cluster or Grid Summary windows.
For a complete list of icons and meanings, see the following website:
http://www.ibm.com/support/knowledgecenter/STFS69_4.1.2/ts7700_ua_grid_summary_main.html
For an in-depth explanation of the throttling in a TS7700 grid, see the IBM TS7700 Series
Best Practices - Understanding, Monitoring, and Tuning the TS7700 Performance white
paper at:
http://www.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/WP101465
Figure 9-20 Cluster Summary window with TS7760T and TS4500 tape library
The Cluster Information can be displayed by hovering the cursor over the components, as
shown in Figure 9-20. In the resulting box, the following information is available:
Cluster components health status
Cluster Name
Family to which this cluster is assigned
Cluster model
Licensed Internal Code (LIC) level for this cluster
Description for this cluster
Disk encryption status
Cache size and occupancy (Cache Tube)
There is a diskette icon to the right of the Actions button. Clicking that icon downloads a
CSV-formatted file with the meaningful information about that cluster.
From the Action menu, the Cluster State can be changed to a different one to perform a
specific task, such as preparing for a maintenance window, performing a disaster recovery
drill, or moving machines to a different IT center. Depending on the current cluster state,
different options display.
Service Pending (option shown: Force Service): Use this option when an operation stalls and is
preventing the cluster from entering Service Prep. Select Force Service to confirm this
change. All but one cluster in a grid can be placed into service mode, but it is advised that
only one cluster be in service mode at a time. If more than one cluster is in service mode,
and service mode is canceled on one of them, that cluster does not return to normal operation
until service mode is canceled on all clusters in the grid.
Shutdown (offline) (user interface not available): After an offline cluster is powered on, it
attempts to return to normal. If no other clusters in the grid are available, skip hot token
reconciliation can be tried.
Each cluster in the grid keeps its own copy of the collection of tokens, representing all of the
logical volumes that exist in the grid, and those copies are kept updated at the same level by the
grid mechanism. When coming back online, a cluster needs to reconcile its own collection of
tokens with the peer members of the grid, making sure that it represents the status of the grid
inventory. This reconcile operation is also referred to as a token merge.
Note: After a shutdown or force shutdown action, the targeted cluster (and associated
cache) are powered off. A manual intervention is required on site where the cluster is
physically located to power it up again.
A cluster shutdown operation that is started from the TS7700 MI also shuts down the
cache. The cache must be restarted before any attempt is made to restart the TS7700
cluster.
Remember: Service mode is only possible for clusters that are members of a grid.
Service mode enables the subject cluster to leave the grid gracefully, surrendering the
ownership of its logical volumes to the peer clusters in the grid as required to serve the
tasks being performed by the client. The user continues operating smoothly from the other
members of the grid, provided that consistent copies of the volumes that reside in this cluster
also exist elsewhere in the grid and the host also has access to those clusters.
Before changing a cluster state to Service, the user needs to vary offline all logical devices
that are associated with this cluster on the host side. No host access is available to a cluster
that is in service mode.
The R4.1.2 level of code introduces CUIR. CUIR helps to reduce client involvement and simplify
the process that is necessary to start service preparation on a grid member.
Before code level R4.1.2, the user needed to vary offline all logical drives associated with the
cluster going into service before changing the cluster state to Service. This process had to be
done across all LPARs and system plexes attached to the cluster. Any long-running jobs using
these pending offline devices continue to run to the end, so the user should issue SWAP
commands for these jobs, causing them to move to a different logical drive in a different
cluster of the grid. After the cluster maintenance is completed and the IBM CSR cancels
service for the cluster, the user needed to vary all the devices online again across all
LPARs and system plexes.
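For example, assuming that the logical devices of the cluster entering service are in the range
0A40-0A7F (a placeholder range), commands such as the following had to be entered on each
system, or routed to all systems in the sysplex:
V 0A40-0A7F,OFFLINE
RO *ALL,V 0A40-0A7F,OFFLINE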
CUIR can automate this entire process when fully implemented. For more information about
CUIR, see 10.8, “CUIR for Tape (R4.1.2 Enhancement)” on page 606.
Important: Forcing Service Mode causes jobs that are currently mounted or that use
resources that are provided by targeted cluster to fail.
Whenever a cluster state is changed to Service, it first enters service preparation mode,
and then, when the preparation stage finishes, it goes automatically into service mode.
During the service preparation stage, the cluster monitors the status of current host mounts
and sync copy mounts targeting the local Tape Volume Cache (TVC), monitors and finishes the
copies that are currently running, and makes sure that there are no remote mounts
targeting the local TVC. When all running tasks have ended, and no more pending activities are
detected, the cluster finishes the service preparation stage and enters Service mode.
In a TS7700 grid, service preparation can occur on only one cluster at any one time. If service
prep is attempted on a second cluster before the first cluster has entered Service mode, the
attempt fails. After service prep has completed for one cluster, and that cluster has entered
Service mode, another cluster can be placed in service prep. A cluster in service prep
automatically cancels service prep if a peer in the grid experiences an unexpected outage
while the service prep process is still active. Although all clusters except one can be in
Service mode at the same time within a grid, the preferred approach is to have only one
cluster in service mode at a time.
Be aware that when multiple clusters are in service mode simultaneously, they need to be
brought back to Normal mode at the same time. Otherwise, a TS7700 does not reach the
ONLINE state, but waits until the remaining clusters also leave service mode. Only then do
those clusters merge their tokens and rejoin the grid as ONLINE members.
Remember: If more than one cluster is in Service mode, and service is canceled on one of
them, that cluster does not return to an online state until Service mode is canceled on all
other clusters in this grid.
For a disk-only TS7700 cluster or CP0 partition in a grid, click Lower Threshold to lower the
required threshold at which logical volumes are removed from cache in advance. See
“Temporary removal threshold” on page 167 for more information about the Temporary Removal Threshold.
Depending on the mode that the cluster is in, a different action is presented by the button
under the Cluster State display. This button can be used to place the TS7700 into service
mode or back into normal mode:
Prepare for Service Mode: This option puts the cluster into service prep mode and
enables the cluster to finish all current operations. If allowed to finish service prep, the
cluster enters Service mode. This option is only available when the cluster is in normal
mode. To cancel service prep mode, click Return to Normal Mode.
Return to Normal Mode: Returns the cluster to normal mode. This option is available if the
cluster is in service prep or service mode. A cluster in service prep mode or Service mode
returns to normal mode if Return to Normal Mode is selected.
A window opens to confirm the decision to change the Cluster State. Click Service Prep or
Normal Mode to change to the new Cluster State, or Cancel to abandon the change operation.
This window is visible from the TS7700 MI whether the TS7700 is online or in service. If the
cluster is offline, the MI is not available, and the error HYDME0504E The cluster you selected
is unavailable is presented.
Note: After a shutdown or force shutdown action, the targeted cluster and its associated
cache are powered off. Manual intervention is required at the site where the cluster is
physically located to power it up again.
Only the cluster where a connection is established can be shut down by the user. To shut
down another cluster, drop the current cluster connection and log in to the cluster that must
be shut down.
Before the TS7700 can be shut down, decide whether the circumstances provide adequate
time to perform a clean shutdown. A clean shutdown is not mandatory, but it is suggested for
members of a TS7700 grid configuration. A clean shutdown requires putting the cluster in
Service mode first. Make sure that no jobs or copies are targeting or being sourced from this
cluster during shutdown.
Attention: A forced shutdown can result in lost access to data and job failure.
A cluster shutdown operation that is started from the TS7700 MI also shuts down the cache.
The cache must be restarted before any attempt is made to restart the TS7700.
If Shutdown is selected from the action menu for a cluster that is still online, as shown at the
top of Figure 9-22 on page 366, a message alerts the user to put the cluster in service mode
first before shutting down, as shown in Figure 9-23.
Note: In normal situations, placing the cluster into Service mode before shutdown is
always recommended.
Figure 9-23 Warning message and Cluster Status during forced shutdown
It is still possible to force a shutdown without entering Service mode by entering the password
and clicking the Force Shutdown button if needed (for example, during a DR test that
simulates a cluster failure, where placing the cluster in service does not apply).
In Figure 9-23, the Online State and Service State fields in the message show the operational
status of the TS7700 and appear over the button that is used to force its shutdown. The
lower-right corner of the picture shows the cluster status that is reported by the message.
When a shutdown operation is in progress, the Shutdown button is disabled and the status
of the operation is displayed in an information message. The following list shows the
shutdown sequence:
1. Going offline
2. Shutting down
3. Powering off
4. Shutdown completes
Verify that power to the TS7700 and to the cache is shut down before attempting to restart the
system.
A cluster shutdown operation that is started from the TS7700 MI also shuts down the cache.
The cache must be restarted first and allowed to achieve an operational state before any
attempt is made to restart the TS7700.
The following information that is related to cluster identification is displayed. To change the
cluster identification properties, edit the available fields and click Modify. The following fields
are available:
Cluster nickname: The cluster nickname must be 1 - 8 characters in length and composed
of alphanumeric characters. Blank spaces and the characters at (@), period (.), dash (-),
and plus sign (+) are also allowed. Blank spaces cannot be used in the first or last
character position (a validation sketch follows the note below).
Cluster description: A short description of the cluster. Up to 63 characters can be used.
Note: Copy and paste might bring in invalid characters. Manual input is preferred.
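As an illustration of the nickname rules above, the following Python sketch encodes them as a single check. It is illustrative only and is not part of the MI.

import re

# Illustrative check of the cluster nickname rules: 1 - 8 characters,
# alphanumeric plus blank, @, ., -, and +, with no blank in the first
# or last character position.
NICKNAME_PATTERN = re.compile(r'^[A-Za-z0-9@.+-]([A-Za-z0-9@.+ -]{0,6}[A-Za-z0-9@.+-])?$')

def is_valid_nickname(nickname: str) -> bool:
    return bool(NICKNAME_PATTERN.match(nickname))

print(is_valid_nickname("GRID-A"))       # True
print(is_valid_nickname(" cluster"))     # False: leading blank
print(is_valid_nickname("verylongname")) # False: longer than 8 characters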
Figure 9-20 on page 360, the Cluster Summary page, depicts a TS7760T with a TS4500 tape
library attached. Within the cluster front view page, the cluster badge (at the top of the picture)
provides a general description of the cluster, such as model, name, family, Licensed Internal
Code level, cluster description, and cache encryption status. Hovering the cursor over the
locations within the picture of the frame shows the health status of different components, such
as the network gear (at the top), the TVC controller and expansion enclosures (at the bottom
and halfway up), and the engine server along with the internal 3957-Vxx disks (the middle
section). The summary of cluster health is shown in the lower-right status bar, and also in the
badge health status (over the frame).
Figure 9-24 shows the back view of the cluster summary window and health details. The
components that are depicted in the back view are the Ethernet ports and host Fibre Channel
connection (FICON) adapters for this cluster. Under the Ethernet tab, the user can see the
ports that are dedicated to the internal network (the TSSC network) and those that are
dedicated to the external (client) network. The assigned IP addresses are displayed. Details
about the ports are shown (IPv4, IPv6, and the health). In the grid Ethernet ports, information
about links to the other clusters, data rates, and cyclic redundancy check (CRC) errors are
displayed for each port in addition to the assigned IP address and Media Access Control
(MAC) address.
The host FICON adapter information is displayed under the Fibre tab for a selected cluster, as
shown in Figure 9-24. The available information includes the adapter position and general
health for each port.
Figure 9-24 Back view of the cluster summary with health details
To display the different area health details, hover the cursor over the component in the picture.
Tip: The expansion frame icon is only displayed if the accessed cluster has an expansion
frame.
Figure 9-25 shows the Cache Expansion frame details and health view through the MI.
Consideration: If the cluster is not a tape-attached model, the tape library icon does not
display on the TS7700 MI.
The library details and health are displayed as explained in Table 9-2.
Physical library type - virtual library name: The type of physical library (the type is always
TS3500), accompanied by the name of the virtual library that is established on the physical
library.
Tape Library Health, Fibre Switch Health, Tape Drive Health: The health states of the library
and its main components. The following values are possible: Normal, Degraded, Failed,
Unknown.
Operational Mode: The library operational mode. The following values are possible: Auto,
Paused.
Virtual I/O Slots: Status of the I/O station that is used to move cartridges into and out of the
library. The following values are possible: Occupied, Full, Empty.
Physical Cartridges: The number of physical cartridges assigned to the identified virtual
library.
Tape Drives: The number of physical tape drives available, as a fraction of the total. Click this
detail to open the Physical Tape Drives window.
The Physical Tape Drives window shows the specific details about each physical tape drive,
such as its serial number, drive type, whether the drive has a cartridge mounted on it, and
what the cartridge is mounted for. To see the same information, such as drive encryption and
tape library location, for the other tape drives, select a specific drive and choose Details in the
Select Action menu.
Events
Use the window that is shown in Figure 9-28 on page 373 to view all meaningful events that
occurred within the grid or a stand-alone TS7700 cluster. Events encompass every significant
occurrence within the TS7700 grid or cluster, such as a malfunction alert, an operator
intervention, a parameter change, a warning message, or a user-initiated action.
The R4.1.2 level of code improves the presentation and handling of the cluster state and
alerting mechanisms, providing new capabilities to the user.
The user can now customize the characteristics of each of the CBR3750I messages,
including adding custom text to each of them. This is accomplished through the new
TS7700 MI Notification Settings window that is implemented by the R4.1.2 level of code.
Information is displayed on the Events table for 30 days after the operation stops or the event
becomes inactive.
Note: The Date & Time column reports the time of the events in the local time of the
computer where the MI was started. If the DATE/TIME in the TS7700 is changed from
Coordinated Universal Time during installation, the event times in the Events display on
the MI are offset by the same difference. Use Coordinated Universal Time in all TS7700
clusters whenever possible.
For more information about the Events page, see IBM Knowledge Center for TS7700,
available locally by clicking the question mark symbol at the right of the banner on the
TS7700 MI, or online at:
https://www.ibm.com/support/knowledgecenter/STFS69_4.1.2
Table 9-3 describes the column names and descriptions of the fields, as shown in the Event
window.
Table 9-3 Field name and description for the Events window
Column name Description
ID: The unique number that identifies the instance of the event. This number consists of a
locally generated ID (for example, 923) and the type of event, E (event) or T (task). An event
ID based on these examples appears as 923E. (A parsing sketch follows this table.)
System Clearable: Whether the event can be cleared automatically by the system. The
following values are possible:
Yes. The event is cleared automatically by the system when the condition that is causing
the event is resolved.
No. The event requires user intervention to clear. The event needs to be cleared or
deactivated manually after resolving the condition that is causing the event.
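As a small illustration of the ID format in Table 9-3, the following Python sketch splits an identifier such as 923E into its locally generated number and its type. The helper is illustrative only and is not part of the MI.

# Illustrative parser for event IDs such as "923E" or "107T",
# following the format described in Table 9-3.
def parse_event_id(event_id: str):
    number, kind = int(event_id[:-1]), event_id[-1]
    return number, {"E": "event", "T": "task"}[kind]

print(parse_event_id("923E"))  # (923, 'event')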
Table 9-4 lists actions that can be run on the Events table.
Deactivate or clear one or more alerts:
1. Select at least one but no more than 10 events.
2. Click Mark Inactive.
If a selected event is normally cleared by the system, confirm the selection. Other selected
events are cleared immediately. A running task can be cleared, but if the task later fails, it is
displayed again as an active event.
Enable or disable host notification for alerts: Select Actions → [Enable/Disable] Host
Notification. This change affects only the accessing cluster. Tasks are not sent to the host.
Filter the table data: To filter by using a string of text:
1. Click in the Filter field.
2. Enter a search string.
3. Press Enter.
To filter by column heading:
1. Click the down arrow next to the Filter field.
2. Select the column heading to filter by.
3. Refine the selection.
9.3.5 Performance
This section presents information for viewing IBM TS7700 Grid and Cluster performance and
statistics.
All graphical views, except the Historical Summary, are from the last 15 minutes. The
Historical Summary presents a customized graphical view of the different aspects of the
cluster operation, in a 24-hour time frame. This 24-hour window can be slid back up to 90
days, which covers three months of operations.
For information about the steps that you can take to achieve peak performance, see IBM
TS7700 Series Best Practices - Understanding, Monitoring and Tuning the TS7700
Performance, WP101465, available online at:
https://www.ibm.com/support/techdocs
The MI is enhanced to accommodate the functions that are introduced by this level of code.
Figure 9-31 shows the Performance Historical Summary and related chart selections that are
available for this item.
Y axes The left vertical axis measures either throughput (MiB/s) or copies (MiB), depending
on the selected data sets. The right vertical axis measures the number of mounted
virtual drives, reclaimed physical volumes, and data set size to copy. Possible
measurements include MiB, GiB, milliseconds, seconds, hours, and percentage.
X axis The horizontal axis measures in hours the time period for the data sets shown.
Measurements are shown in 15-minute increments by default. Click the time span
(located at the top-center of the chart) to change the display increments. The
following are the possible values:
1 day
12 hours
1 hour
30 minutes
15 minutes
custom
Last 24 hours Click the icon in the top right corner of the page to reset the time span that is shown
on the chart to the past 24-hour period. A change to the time span does not alter the
configuration of data sets displayed in the main area of the chart.
Data sets Data sets displayed in the main area of the chart are shown as lines or stacked
columns.
Data sets related to throughput and copy queues can be grouped to better show
relationships between these sets. See Table 9-6 on page 380 for descriptions of all
data sets.
Legend The area below the X axis lists all data sets selected to display on the chart, along
with their identifying colors or patterns. The legend displays a maximum of 10 data
sets. Click any data set shown in this area to toggle its appearance on the chart.
Note: To remove a data set from the legend, you must clear it using the Select metrics
option.
Time span The time period from which displayed data is drawn. This range is shown at the top
of the page.
Note: Dates and times that are displayed reflect the time zone in which your browser
is located. If your local time is not available, these values are shown in Coordinated
Universal Time (UTC).
Click the displayed time span to modify its start or end values. Time can be selected
in 15-minute increments.
Start date and time: The default start value is 24 hours before the present date
and time. You can select any start date and time within the 90 days that precede
the present date and time.
End date and time: The default end value is 24 hours after the start date or the
last valid date and time within a 24-hour period. The end date and time cannot be
later than the current date and time. You can select any end date and time that is
between 15 minutes and 24 hours later than the start value.
Presets Click one of the Preset buttons at the top of the vertical toolbar to populate the chart
by using one of three common configurations. The established time span is not
changed when a preset configuration is applied.
Note: The preset options that are available depend on the configuration of the
accessing cluster. See Table 9-6 on page 380 for existing restrictions.
Select metrics Click the Select metrics button on the vertical toolbar to add or remove data sets
displayed on the Historical Summary chart. See Table 9-6 on page 380 for
descriptions of all data sets.
Download spreadsheet Click the Download spreadsheet button on the vertical toolbar to download a
comma-separated values (.csv) file to your web browser for the period shown on the
graph. In the .csv file, time is shown in 15-minute intervals (a parsing sketch follows
this table).
Note: The time reported in the CSV file is shown in UTC. You might find time
differences if the system that you are using to access the Management Interface is
configured for a different time zone.
Chart settings Click the Chart settings button on the vertical toolbar to enable the low graphics
mode for improved performance when many data points are displayed. Low graphics
mode disables hover-over tool tips and improves chart performance in older
browsers. If cookies are enabled on your browser, this setting is retained when you
exit the browser.
Note: Low graphics mode is enabled by default when the browser is Internet Explorer,
version 8 or earlier.
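Because the downloaded .csv file reports time in UTC at 15-minute intervals, it is often convenient to convert the timestamps to local time before further analysis. The following Python sketch shows one way to do that; the file name, the Time column name, and the ISO-like timestamp format are assumptions made for this example, so adjust them to match the actual export.

import csv
from datetime import datetime, timezone

# Minimal sketch: read a Historical Summary export and convert its UTC
# timestamps to the local time zone of the workstation. The file name,
# column name, and timestamp format are assumptions for illustration.
with open("historical_summary.csv", newline="") as f:
    for row in csv.DictReader(f):
        utc_time = datetime.fromisoformat(row["Time"]).replace(tzinfo=timezone.utc)
        local_time = utc_time.astimezone()  # convert to the local time zone
        print(local_time.isoformat(), {k: v for k, v in row.items() if k != "Time"})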
Use the Select metrics button to open the Select metrics window to add or remove data sets
displayed on the Historical Summary chart.
The Select metrics window organizes data sets by sections and categories.
Throughput I/O Channel R/W MiB/s Transfer rate (MiB/s) of host data on the FICON channel, which
includes this information:
Host raw read: Rate that is read between the HBA and host.
Host raw write: Rate that is written to the virtual drive from the
host.
Throughput I/O Primary Cache Read Data transfer rate (MiB/s) read between the virtual drive and HBA
for the primary cache repository.
Throughput I/O Primary Cache Write Data transfer rate (MiB/s) written to the primary cache repository
from the host through the HBA.
Throughput I/O Remote Write Data transfer rate (MiB/s) to the cache of a remote cluster from the
cache of the accessing cluster as part of a remote read operation.
This data set is only visible if the access cluster is part of a grid.
Throughput Copies Link Copy Out Data transfer rate (MiB/s) for operations that copy data from the
accessing cluster to one or more remote clusters. This is data
transferred between legacy TS7700 Grid links.
This data set is only visible if the access cluster is part of a grid.
Throughput Copies Link Copy In Data transfer rate (MiB/s) for operations that copy data from one or
more remote clusters to the accessing cluster. This is data
transferred between legacy TS7700 Grid links. This data set is only
visible if the access cluster is part of a grid.
Throughput Copies Copy Queue Size The maximum size of the incoming copy queue for the accessing
cluster, which is shown in MiBs, GiBs, or TiBs. Incoming copy queue
options include the following:
Immediate
Synchronous-deferred
Immediate-deferred
Deferred
Family deferred
Copy refresh
Time delayed
Total
This data set is only visible if the accessing cluster is part of a grid.
Throughput Copies Average Copy Life Span The average age of virtual volumes to be copied to
the distributed library for the accessing cluster. The following are the available options:
Immediate Mode Copy
Time Delayed Copy
All other deferred type copies
This data set is only visible if the accessing cluster is part of a grid.
Storage Cache Cache to Copy The number of GiBs that reside in the incoming copy queue of a
remote cluster, but are destined for the accessing cluster. This value
is the amount of data that is being held in cache until a copy can be
made.
This data set is only visible if the accessing cluster is part of a grid.
Storage Cache Cache Hit The number of completed mount requests where data is resident in
the TVC.
Storage Cache Cache Miss The number of completed mount requests where data is recalled
from a physical stacked volume.
Storage Cache Cache Hit Mount Time The average time (ms) to complete Cache Hit mounts.
This data set is visible only if the accessing cluster is attached to a tape library. If the cache
is partitioned, this value is displayed according to partition.
Storage Cache Cache Miss Mount Time The average time (ms) to complete Cache Miss mounts.
This data set is visible only if the accessing cluster is attached to a tape library. If the cache
is partitioned, this value is displayed according to partition.
Storage Cache Partitions If the accessing cluster is a TS7720 or TS7760 attached to a tape
library, a numbered tab exists for each active partition. Each tab displays check boxes for
these categories:
Cache Hit
Cache Miss
Mount Time Hit
Mount Time Miss
Data in Cache
Storage Cache Data Waiting for Premigration The amount of data in cache assigned to volumes
waiting for premigration.
Storage Cache Data Migrated The amount of data in cache that has been migrated.
Storage Cache Data Waiting for Delayed Premigration The amount of data in cache assigned to
volumes waiting for delayed premigration.
Storage Virtual Tape Drives Maximum Virtual Mounted The greatest number of mounted virtual
drives. This value is a mount count.
Storage Physical Tape Write to Tape Data transfer rate (MiB/s) written to physical media from
cache. This value typically represents premigration to tape.
This data set is not visible when the selected cluster is not attached to a library.
Storage Physical Tape Recall from Tape Data transfer rate (MiB/s) read from physical media to
cache. This value is recalled data.
This data set is not visible when the selected cluster is not attached to a library.
Storage Physical Tape Reclaim Mounts Number of physical mounts that are completed by the
library for the physical volume reclaim operation. This value is a mount count.
This data set is not visible when the selected cluster is not attached to a library.
Storage Physical Tape Recall Mounts Number of physical mounts that are completed by the
library for the physical volume recall operation.
This data set is not visible when the selected cluster is not attached to a library.
Storage Physical Tape Premigration Mounts Number of physical mount requests completed by
the library that are required to satisfy pre-migrate mounts.
This data set is not visible when the selected cluster is not attached to a library.
Storage Physical Tape Physical Drives Mounted The maximum, minimum, or average number of
physical devices of all device types concurrently mounted. The average number displays only
when you hover over a data point.
This data set is only visible when the selected cluster attaches to a library.
Storage Physical Tape Physical Mount Times The maximum, minimum, or average number of
seconds required to complete the execution of a mount request for a physical device. The
average number displays only when you hover over a data point.
This data set is only visible when the selected cluster attaches to a library.
System Throttling Average Copy Throttle The average time delay as a result of copy throttling,
which is measured in milliseconds. This data set contains the averages of nonzero throttling
values where copying is the predominant reason for throttling.
This data set is only visible if the selected cluster is part of a grid.
System Throttling Average Deferred Copy Throttle The average time delay as a result of deferred
copy throttling, which is measured in milliseconds. This data set contains the averages of
30-second intervals of the deferred copy throttle value.
This data set is only visible if the selected cluster is part of a grid.
System Throttling Average Host Write Throttle for Tape Attached Partitions The average write
overrun throttle delay for the tape attached partitions. This data set is the average of the
non-zero throttling values where write overrun was the predominant reason for throttling.
System Throttling Average Copy Throttle for Tape Attached Partitions The average copy throttle
delay for the tape attached partitions. The value presented is the average of the non-zero
throttling values where copy was the predominant reason for throttling.
System Throttling Average Deferred Copy Throttle for Tape Attached Partitions The average
deferred copy throttle delay for the tape attached partitions. This value is the average of
30-second intervals of the deferred copy throttle value during the historical record.
This data set is only visible if the selected cluster is part of a grid and is a TS7720 or TS7760
attached to a tape library.
System Utilization Maximum CPU Primary Server The maximum percentage of processor use for
the primary TS7700 server.
System Utilization Maximum Disk I/O Usage Primary Server The maximum percentage of disk
cache I/O use as reported by the primary server in a TS7700.
For an explanation of the values and what to expect in the resulting graphs, see Chapter 11,
“Performance and monitoring” on page 613. For a complete description of the window and
available settings, see the TS7700 IBM Knowledge Center, which is available both locally on
the TS7700 MI (by clicking the question mark icon at the upper right corner of the window)
and on the following website:
http://www.ibm.com/support/knowledgecenter/STFS69_4.1.2/ts7740_ua_performance.html
Also, see IBM TS7700 Series Best Practices - Understanding, Monitoring, and Tuning the
TS7700 Performance, WP101465, which is available on the IBM Techdocs Library website:
http://www.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/WP101465
Virtual mounts
This page is used to view the virtual mount statistics for the TS7700 Grid. Virtual mount
statistics are displayed for activity on each cluster during the previous 15 minutes. These
statistics are presented in bar graphs and tables and are organized according to number of
virtual mounts and average mount times.
Number of virtual mounts: This section provides statistics for the number of virtual mounts
on a given cluster during the most recent 15-minute snapshot. Snapshots are taken at
15-minute intervals. Each numeric value represents the sum of values for all active
partitions in the cluster. Information displayed includes the following:
– Cluster: The cluster name.
– Fast-Ready: The number of virtual mounts that were completed using the Fast-Ready
method.
– Cache Hits: The number of virtual mounts that were completed from cache.
– Cache Misses: The number of mount requests that are unable to be fulfilled from
cache.
Note: This field is visible only if the selected cluster possesses a physical library.
For information about how to achieve better performance, check IBM TS7700 Series Best
Practices - Understanding, Monitoring and Tuning the TS7700 Performance, WP101465,
which is available on the IBM Techdocs Library website:
http://www.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/WP101465
This page is not visible on the TS7700 MI if the grid does not possess a physical library (no
tape attached member).
Host throughput
Use this page to view statistics for each cluster, vNode, host adapter, and host adapter port in
the grid. At the top of the page is a collapsible tree that allows you to view statistics for a
specific level of the grid and cluster:
Click the grid hyperlink to display information for each cluster.
Click the cluster hyperlink to display information for each vNode.
Click the vNode hyperlink to display information for each host adapter.
Click a host adapter link to display information for each of its ports.
The host throughput data is displayed in two bar graphs and one table. The bar graphs
represent raw data coming from the host to the host bus adapter (HBA) and compressed
data going from the HBA to the virtual drive on the vNode (an example compression-ratio
calculation follows later in this section).
Note: See 1.6, “Data storage values” on page 12 for information related to the use of
binary prefixes.
In the table, the letters in the heading correspond to letter steps in the diagram above the
table. Data is available for a cluster, vNode, host adapter, or host adapter port. The letters
referred to in the table can be seen on the diagram in Figure 9-32.
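Because the graphs show both the raw host-to-HBA rate and the compressed HBA-to-vNode rate, comparing the two gives a rough view of the compression being achieved. The following small Python calculation is illustrative only, and the sample values are assumptions rather than measurements.

# Approximate compression ratio from the two host throughput graphs:
# raw data rate (host to HBA) versus compressed data rate (HBA to vNode).
raw_mib_s = 400.0         # assumed raw host data rate
compressed_mib_s = 160.0  # assumed compressed data rate
print(f"Approximate compression ratio: {raw_mib_s / compressed_mib_s:.1f}:1")  # 2.5:1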
Cache throttling
This window shows the statistics of the throttling values that are applied on the host write
operations and RUN copy operations throughout the grid.
Throttling refers to the intentional slowing of data movement to balance or re-prioritize system
resources in a busy TS7700. Throttling can be applied to host write and inbound copy
operations. Throttling of host write and inbound copy operations limits the amount of data
movement into a cluster. This is typically done for one of two reasons:
The amount of unused cache space is low
The amount of data in cache that is queued for premigration has exceeded a threshold.
Host write operations can also be throttled when RUN copies are being used and it is
determined that a throttle is needed to prevent pending RUN copies from changing to the
immediate-deferred state. A throttle can be applied to a host write operation for these
reasons:
The amount of unused cache space is low.
The amount of data in cache that needs to be premigrated is high.
For RUN copies, an excessive amount of time is needed to complete an immediate copy.
The Cache Throttling graph displays the throttling that is applied to host write operations and
to inbound RUN copy operations. The delay represents the time delay, in milliseconds, per
32 KiB of transferred post-compressed data (a worked example follows the list below). Each
numeric value represents the average of values for all active partitions in the cluster.
Information shown includes the following:
Cluster: The name of the cluster affected
Copy: The average delay, in milliseconds, applied to inbound copy activities
Write: The average delay, in milliseconds, applied to host write operations, both locally and
from remote clusters
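Because the reported delay is expressed in milliseconds per 32 KiB of post-compressed data, its effect on throughput can be estimated with simple arithmetic. The following Python sketch is a rough illustration; both the throttle value and the unthrottled transfer time are assumed values.

# Rough illustration of how a throttle delay (milliseconds per 32 KiB of
# post-compressed data) limits the effective data rate. Values are assumptions.
increment_mib = 32 / 1024          # 32 KiB expressed in MiB
delay_ms = 2.0                     # assumed average write throttle from the MI
base_ms = 0.5                      # assumed unthrottled time to move 32 KiB

unthrottled_mib_s = increment_mib / (base_ms / 1000)
throttled_mib_s = increment_mib / ((base_ms + delay_ms) / 1000)
print(f"Unthrottled: {unthrottled_mib_s:.1f} MiB/s, throttled: {throttled_mib_s:.1f} MiB/s")
# Unthrottled: 62.5 MiB/s, throttled: 12.5 MiB/s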
For information about the steps you can take to achieve peak network performance, see IBM
TS7700 Series Best Practices - Understanding, Monitoring and Tuning the TS7700
Performance, WP101465, available online at:
http://www.ibm.com/support/techdocs
Cache utilization
Cache utilization statistics are presented for clusters that have one resident-only or tape-only
partition, and for clusters with partitioned cache. Models TS7720 disk-only and TS7740 have
only one resident or tape partition, which accounts for the entire cache. For the TS7720T
(tape attach) cluster model, up to eight cache partitions (one CP0 cache resident and up to
seven tape partitions) can be defined, and represented in the cache utilization window.
Cache Partition
The Cache Partition window presents the cache use statistics for the TS7720 or TS7760
tape-attached models, in which the cache is made up of multiple partitions. Figure 9-34
shows a sample of the Cache Partition (multiple partitions) window. This window can be
reached by using the Monitor icon (as described here) or by using the Virtual icon. Both
ways lead to the same window. In this window, the user can display the existing cache
partitions, and can also create a new partition, reconfigure an existing one, or delete a
partition as needed.
Tip: Consider limiting the MI user roles who are allowed to change the partition
configurations through this window.
For a complete description of the window, see IBM Knowledge Center, either locally on the
TS7700 MI (by clicking the question mark icon) or on the following website:
http://www.ibm.com/support/knowledgecenter/STFS69_4.1.2/ts7740_ua_cache_utilization.html
For more information about this window, see the TS7700 R4.1 section in IBM Knowledge
Center. Learn about data flow within the grid and how those numbers vary during the
operation in Chapter 11, “Performance and monitoring” on page 613.
Pending Updates
The Pending Updates window is only available if the TS7700 cluster is a member of a grid.
The Pending Updates window can be used to monitor the status of outstanding updates per
cluster throughout the grid. Pending updates can be caused by one cluster being offline, in
service preparation, or in Service mode while other grid peers were busy with normal client
production work.
A faulty grid link communication also might cause a RUN or SYNC copy to become Deferred
Run or Deferred Sync. The Pending Updates window can be used to follow the progress of
those copies.
The Download button at the top of the window saves a comma-separated values (.csv) file
that lists all volumes or grid global locks that are targeted during an ownership takeover. The
volume or global pending updates are listed, along with hot tokens and stolen volumes.
Tokens are internal data structures that are used to track changes to the ownership, data, or
properties of each of the existing logical volumes in the grid. Hot tokens occur when a
cluster attempts to merge its own token information with the other clusters, but the clusters
are not available for the merge operation (tokens that cannot be merged become “hot”).
A stolen volume is a volume whose ownership was taken over while the owner cluster was in
Service mode or offline, or when an unexpected cluster outage occurred and the volume
ownership was taken over under an operator’s direction or by using AOTM.
For more information about copy mode and other concepts referred to in this section, see
Chapter 2, “Architecture, components, and functional characteristics” on page 15. For other
information about this MI function, see IBM Knowledge Center on the following website:
http://www.ibm.com/support/knowledgecenter/STFS69_4.0.0/ts7740_ua_pending_logical_volume_updates.html
Tasks are listed by starting date and time. Tasks that are still running are shown on the top of
the table, and the completed tasks are listed at the bottom. Figure 9-35 shows an example of
the Tasks window. Notice that the information on this page and the task status pods are of
grid scope.
Note: The Start Time column reports the time when a task started in the local time of the
computer where the MI was started. If the DATE/TIME in the TS7700 is changed from
Coordinated Universal Time during installation, the time that is shown in the Start Time
field is offset by the same difference from the local time of the MI. Use Coordinated
Universal Time in all TS7700 clusters whenever possible.
The available items under the Virtual icon are described in the following topics.
Cache Partitions
In the Cache Partitions window in the MI, you can create a cache partition, or reconfigure or
delete an existing cache partition for the TS7720 or TS7760 tape-attached models. Also, you
can use this window to monitor the cache and partitions occupancy and usage.
Figure 9-38 on page 392 shows a sequence for creating a new partition. There can be as
many as eight partitions, from the Resident partition (partition 0) to Tape Partition 7, if needed.
The allocated size of a tape partition is subtracted from the actual resident partition capacity,
provided that there is at least 2 TB of free space in the resident partition (CP0). For the
complete set of rules and allowed values in effect for this window, see the TS7700 IBM
Knowledge Center. Also, learn about the tape attach TS7700, cache partitions, and usage in
Chapter 2, “Architecture, components, and functional characteristics” on page 15.
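The sizing rule described above can be sketched as a simple check: a tape partition's allocation is taken from the resident partition (CP0), which must have at least 2 TB of free space. The Python fragment below is illustrative only and does not reproduce the full set of rules documented in IBM Knowledge Center.

# Illustrative check only, not the MI's full rule set: a tape partition's
# allocated size is subtracted from CP0, and CP0 must have at least 2 TB
# of free space for the allocation to proceed.
MIN_CP0_FREE_TB = 2.0

def can_allocate_tape_partition(cp0_free_tb: float, requested_tb: float) -> bool:
    return cp0_free_tb >= MIN_CP0_FREE_TB and requested_tb <= cp0_free_tb

print(can_allocate_tape_partition(cp0_free_tb=10.0, requested_tb=5.0))  # True
print(can_allocate_tape_partition(cp0_free_tb=1.5, requested_tb=1.0))   # False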
Figure 9-39 shows an example of successful creation in the upper half. The lower half shows
an example where the user failed to observe the amount of free space available in CP0.
Notice that redefining the size of existing partitions in an operational TS7720T might create
an unexpected load peak in the overall premigration queue, causing host write throttling to be
applied to the tape partitions.
For instance, consider the following example, where a tape-attached partition is downsized
and becomes instantly overcommitted. In this example, the TS7720T premigration queue is
flooded by volumes that were dislodged when this cache partition became smaller. The
partition adapts to the new size by migrating the excess volumes to physical tape.
Figure 9-41 shows tape partition 1 being downsized to 8 TB. Note the initial warning and
subsequent overcommit statement that shows up when resizing the tape partition results in
overcommitted cache size.
Accepting the Overcommit statement initiates the resizing action. If this is not the best-suited
time for the partition resizing (such as during a peak load period), the user can click No to
decline the action, and then resume it at a more appropriate time. Figure 9-42 shows the final
sequence of the operation.
Read more about this subject in Chapter 11, “Performance and monitoring” on page 613 and
Chapter 2, “Architecture, components, and functional characteristics” on page 15.
Tip: Consider limiting the MI user roles that are allowed to change the partition
configurations through this window.
Policies and settings specify on which clusters (if any) copies are placed, and how quickly
copy operations should occur. Each cluster maintains its own list of copies to acquire, and
then satisfies that list by requesting copies from other clusters in the grid according to
queue priority.
Table 9-7 shows the values that are displayed in the copy queue table.
Copy Type The type of copy that is in the queue. The following values are possible:
Immediate: Volumes can be in this queue if they are assigned to a
Management Class (MC) that uses the Rewind Unload (RUN) copy mode.
Synchronous-deferred: Volumes can be in this queue if they are assigned to
an MC that uses the Synchronous mode copy and some event (such as the
secondary cluster going offline) prevented the secondary copy from
occurring.
Immediate-deferred: Volumes can be in this queue if they are assigned to an
MC that uses the RUN copy mode and some event (such as the secondary
cluster going offline) prevented the immediate copy from occurring.
Deferred: Volumes can be in this queue if they are assigned to an MC that
uses the Deferred copy mode.
Time Delayed: Volumes can be in this queue if they are eligible to be copied
based on either their creation time or last access time.
Copy-refresh: Volumes can be in this queue if the MC assigned to the
volumes changed and a LI REQ command was sent from the host to initiate
a copy.
Family-deferred: Volumes can be in this queue if they are assigned to an MC
that uses RUN or Deferred copy mode and cluster families are being used.
Last TVC Cluster The name of the cluster where the copy last was in the TVC. Although this might
not be the cluster from which the copy is received, most copies are typically
obtained from the TVC cluster.
This column is only shown when View by Last TVC is selected.
Using the upper-left option, choose between View by Type and View by Last TVC Cluster.
Use the Actions menu to download the Incoming Queued Volumes list.
Recall queue
The Recall Queue window of the MI displays the list of virtual volumes in the recall queue.
Use this window to promote a virtual volume or filter the contents of the table. The Recall
Queue item is visible but disabled on the TS7700 MI if there is no physical tape attachment to
the selected cluster, but there is at least one tape-attached cluster (a TS7740 or TS7720T
connected to a TS3500 tape library) within the grid. Trying to access the Recall Queue link
from a cluster with no tape attachment causes a message to display.
Tip: This item is not visible on the TS7700 MI if there is no TS7700 tape-attached cluster in
the grid.
A recall of a virtual volume retrieves the virtual volume from a physical cartridge and places it
in the cache. A queue is used to process these requests. Virtual volumes in the queue are
classified into three groups:
In Progress
Scheduled
Unscheduled
In addition to changing the recall table’s appearance by hiding and showing some columns,
the user can filter the data that is shown in the table by a string of text, or by the column
heading. Possible selections are by Virtual Volume, Position, Physical Cartridge, or by Time in
Queue. To reset the table to its original appearance, click Reset Table Preferences.
Another interaction that is now available in the Recall Queue window is that the user can
promote an unassigned volume recall to the first position in the unscheduled portion of the
recall queue. To do so, select an unassigned volume in the table, and click Actions →
Promote Volume.
The Cache Mount Cluster field in the virtual tape drives page identifies the cluster TVC to
which the volume is mounted. The user can recognize remote (crossed) or synchronous
mounts simply by looking at this field. Remote mounts show another cluster that is being
used by a mounted volume instead of the local cluster (the cluster to which the virtual tape
drives on the displayed page belong), whereas synchronous mounts show both clusters that
are used by the mounted volume.
The user can perform a stand-alone mount of a logical volume against a TS7700 logical drive
for special purposes, such as to perform an initial program load (IPL) of a stand-alone
services core image from a virtual tape drive. Also, the MI allows the user to manually
unmount a logical volume that is mounted and in the idle state. The Unmount function is
available not only for volumes that have been manually mounted, but also for occasions when
a logical volume has been left mounted on a virtual drive by an incomplete operation or a test
rehearsal, which creates the need to unmount it through MI operations.
The user can mount only virtual volumes that are not already mounted, on a virtual drive that
is online.
If there is a need to unmount a logical volume that is currently mounted to a virtual tape drive,
follow the procedure that is shown in Figure 9-47.
The user can unmount only those virtual volumes that are mounted and have a status of Idle.
Virtual volumes
The topics in this section present information about monitoring and manipulating virtual
volumes in the TS7700 MI.
A tutorial about the virtual volume display and how to interpret the windows is available
directly from the MI window. To watch it, click the View Tutorial link on the Virtual Volume
Detail window.
This graphical summary presents details of the present status of the virtual volume within the
grid, plus the current operations that are taking place throughout the grid concerning that
volume. The graphical summary helps you understand the dynamics of a logical mount,
whether the volume is in the cache at the mounting cluster, or whether it is being recalled
from tape in a remote location.
Note: The physical resources are shown in the virtual volume summary, virtual volume
details table, or the cluster-specific virtual volume properties table for the TS7720T and
TS7740 cluster models.
The cluster that owns the logical volume being displayed is identified by the blue border
around it. For instance, referring to Figure 9-48 on page 400, volume Z22208 is owned by
cluster 1, where the volume and a FlashCopy are in cache. Volume Z22208 is not mounted or
available in the primary cache at cluster 2. At the same time, Z22208 is in the deferred
incoming copy queue for cluster 0.
Figure 9-49 shows how the icons are distributed through the window, and where the pending
actions are represented. The blue arrow icon over the cluster represents data that is being
transferred from another cluster. The icon in the center of the cluster indicates data that is
being transferred within the cluster.
Figure 9-50 Legend list for the graphical representation of the virtual volume details
Volser The VOLSER of the virtual volume. This value is a six-character number that uniquely
represents the virtual volume in the virtual library.
Media Type The media type of the virtual volume. The following are the possible values:
Cartridge System Tape
Enhanced Capacity Cartridge System Tape
Maximum Volume Capacity The maximum size (MiB) of the virtual volume. This capacity is set upon insert by the
Data Class of the volume.
Note: If an override is configured for the Data Class 's maximum size, it is only applied
when a volume is mounted and a load-point write (scratch or fast ready mount)
occurs. During the volume close operation, the new override value is bound to the
volume and cannot change until the volume is reused. Any further changes to a Data
Class override are not inherited by a volume until it is written again during a fast ready
mount and then closed.
Note: When the host mounts a scratch volume and unmounts it without completing
any writes, the system might report that the virtual volume's current size is larger than
its maximum size. This result can be disregarded.
Current Volume Size Size of the data (MiB) for this unit type (channel or device).
Note: See 1.6, “Data storage values” on page 12 for additional information about the
use of binary prefixes.
Current Owner The name of the cluster that currently owns the latest version of the virtual volume.
Currently Mounted Whether the virtual volume is currently mounted in a virtual drive. If this value is Yes,
these qualifiers are also shown:
vNode: The name of the vNode that the virtual volume is mounted on.
Virtual Drive: The ID of the virtual drive that the virtual volume is mounted on.
Cache Copy Used for Mount The name of the cluster associated to the TVC selected for mount and I/O operations
to the virtual volume. This selection is based on consistency policy, volume validity,
residency, performance, and cluster mode.
Mount State The mount state of the logical volume. The following are the possible values:
Mounted: The volume is mounted.
Mount Pending: A mount request was received and is in progress.
Recall Queued/Requested: A mount request was received and a recall request
was queued.
Recalling: A mount request was received and the virtual volume is being staged
into tape volume cache from physical tape.
Cache Management Preference Group The preference level for the Storage Group. This setting
determines how soon volumes are removed from cache following their copy to tape. This
information is only displayed if the owner cluster is a TS7740 or if the owner cluster is a
TS7700T and the volume is assigned to a cache partition that is greater than 0. The following
are the possible values:
0: Volumes in this group have preference to be removed from cache over other
volumes.
1: Volumes in this group have preference to be retained in cache over other
volumes. A “least recently used” algorithm is used to select volumes for removal
from cache if there are no volumes to remove in preference group 0.
Last Accessed by a Host The most recent date and time a host accessed a live copy of the virtual volume. The
recorded time reflects the time zone in which the user's browser is located.
Last Modified The date and time the virtual volume was last modified by a host mount or demount.
The recorded time reflects the time zone in which the user's browser is located.
Category The number of the scratch category to which the virtual volume belongs. A scratch
category groups virtual volumes for quick, non-specific use.
Storage Group The name of the Storage Group that defines the primary pool for the premigration of
the virtual volume.
Management Class The name of the Management Class applied to the volume. This policy defines the
copy process for volume redundancy.
Storage Class The name of the Storage Class applied to the volume. This policy classifies virtual
volumes to automate storage management.
Data Class The name of the Data Class applied to the volume. This policy classifies virtual
volumes to automate storage management.
Volume Data State The state of the data on the virtual volume. The following are the possible values:
New: The virtual volume is in the insert category or a non-scratch category and
data has never been written to it.
Active: The virtual volume is currently located within a private category and
contains data.
Scratched: The virtual volume is currently located within a scratch category and
its data is not scheduled to be automatically deleted.
Pending Deletion: The volume is currently located within a scratch category and
its contents are a candidate for automatic deletion when the earliest deletion
time has passed. Automatic deletion then occurs sometime thereafter.
Note: The volume can be accessed for mount or category change before the
automatic deletion and therefore the deletion might be incomplete.
Pending Deletion with Hold: The volume is currently located within a scratch
category configured with hold and the earliest deletion time has not yet passed.
The volume is not accessible by any host operation until the volume has left the
hold state. After the earliest deletion time has passed, the volume then becomes
a candidate for deletion and moves to the Pending Deletion state. While in this
state, the volume is accessible by all legal host operations.
Deleted: The volume is either currently within a scratch category or has
previously been in a scratch category where it became a candidate for automatic
deletion and was deleted. Any mount operation to this volume is treated as a
scratch mount because no data is present.
FlashCopy Details of any existing flash copies. The following are the possible values, among
others:
Not Active: No FlashCopy is active. No FlashCopy was enabled at the host by an
LI REQ operation.
Active: A FlashCopy that affects this volume was enabled at the host by an LI
REQ operation. Volume properties have not changed since FlashCopy time zero.
Created: A FlashCopy that affects this volume was enabled at the host by an LI
REQ operation. Volume properties between the live copy and the FlashCopy
have changed. Click this value to open the Flash Copy Details page.
Earliest Deletion On The date and time when the virtual volume will be deleted. Time that is recorded
reflects the time zone in which your browser is located.
This value displays as “—” if no expiration date is set.
Volume Format ID The volume format ID that belongs to the volume. The following are the possible values:
-2: Data has not been written yet
5: Logical Volume old format
6: Logical Volume new format
3490 Counters Handling 3490 Counters Handling value that belongs to the volume.
The Cluster-specific Virtual Volume Properties table displays information about requesting
virtual volumes on each cluster. These are properties that are specific to a cluster. Virtual
volume details and the status that is displayed include the following properties shown on
Table 9-9.
Cluster The cluster location of the virtual volume copy. Each cluster
location occurs as a separate column header.
Device Bytes Stored The number of bytes used (MiB) by each cluster to store the
volume. Actual bytes can vary between clusters, based on settings
and configuration.
Primary Physical Volume The physical volume that contains the specified virtual volume.
Click the VOLSER hyperlink to open the Physical Stacked Volume
Details page for this physical volume. A value of None means that
no primary physical copy is to be made. This column is only visible
if a physical library is present in the grid. If there is at least one
physical library in the grid, the value in this column is shown as “—”
for those clusters that are not attached to a physical library.
Secondary Physical Volume Secondary physical volume that contains the specified virtual
volume. Click the VOLSER hyperlink to open the Physical Stacked
Volume Details page for this physical volume. A value of None
means that no secondary physical copy is to be made. This column
is only visible if a physical library is present in the grid. If there is at
least one physical library in the grid, the value in this column is
shown as “—” for those clusters that are not attached to a physical
library.
Copy Activity Status information about the copy activity of the virtual volume
copy. The following are the possible values:
Complete: This cluster location completed a consistent copy of
the volume.
In Progress: A copy is required and currently in progress.
Required: A copy is required at this location but has not
started or completed.
Not Required: A copy is not required at this location.
Reconcile: Pending updates exist against this location's
volume. The copy activity updates after the pending updates
get resolved.
Time Delayed Until [time]: A copy is delayed as a result of
Time Delayed Copy mode. The value for [time] is the next
earliest date and time that the volume is eligible for copies.
Queue Type The type of queue as reported by the cluster. The following are the
possible values:
Rewind Unload (RUN): The copy occurs before the
rewind-unload operation completes at the host.
Deferred: The copy occurs some time after the rewind-unload
operation completes at the host.
Sync Deferred: The copy was set to be synchronized,
according to the synchronized mode copy settings, but the
synchronized cluster could not be accessed. The copy is in the
deferred state. For more information about synchronous mode
copy settings and considerations, see Synchronous mode
copy in the Related information section.
Immediate Deferred: A RUN copy that was moved to the
deferred state due to copy timeouts or TS7700 Grid states.
Time Delayed: The copy will occur sometime after the delay
period has been exceeded.
Copy Mode The copy behavior of the virtual volume copy. The following are the
possible values:
Rewind Unload (RUN): The copy occurs before the
rewind-unload operation completes at the host.
Deferred: The copy occurs some time after the rewind-unload
operation at the host.
No Copy: No copy is made.
Sync: The copy occurs on any synchronization operation. For
more information about synchronous mode copy settings and
considerations, see 3.3.5, “Synchronous mode copy” on
page 124.
Time Delayed: The copy occurs sometime after the delay
period has been exceeded.
Exist: A consistent copy exists at this location, although No
Copy is intended. A consistent copy existed at this location at
the time the virtual volume was mounted. After the volume is
modified, the Copy Mode of this location changes to No Copy.
Deleted The date and time when the virtual volume on the cluster was
deleted. Time that is recorded reflects the time zone in which the
user's browser is located. If the volume has not been deleted, this
value displays as “—”.
Removal Residency The automatic removal residency state of the virtual volume. In a
TS7700 Tape Attach configuration, this field is displayed only when
the volume is in the disk partition. This field is not displayed for
TS7740 Clusters. The following are the possible values:
—: Removal Residency does not apply to the cluster. This
value is displayed if the cluster attaches to a physical tape
library, or inconsistent data exists on the volume.
Removed: The virtual volume has been removed from the
cluster.
No Removal Attempted: The virtual volume is a candidate for
removal, but the removal has not yet occurred.
Retained: An attempt to remove the virtual volume occurred,
but the operation failed. The copy on this cluster cannot be
removed based on the configured copy policy and the total
number of configured clusters. Removal of this copy lowers
the total number of consistent copies within the grid to a value
below the required threshold. If a removal is expected at this
location, verify that the copy policy is configured and that
copies are being replicated to other peer clusters. This copy
can be removed only after enough replicas exist on other peer
clusters.
Deferred: An attempt to remove the virtual volume occurred,
but the operation failed. This state can result from a cluster
outage or any state within the grid that disables or prevents
replication. The copy on this cluster cannot be removed based
on the configured copy policy and the total number of available
clusters capable of replication. Removal of this copy lowers
the total number of consistent copies within the grid to a value
below the required threshold. This copy can be removed only
after enough replicas exist on other available peer clusters. A
subsequent attempt to remove this volume occurs when no
outage exists and replication is allowed to continue.
Pinned: The virtual volume is pinned by the virtual volume
Storage Class. The copy on this cluster cannot be removed
until it is unpinned. When this value is present, the Removal
Time value is Never.
Held: The virtual volume is held in cache on the cluster at least
until the Removal Time has passed. When the removal time
has passed, the virtual volume copy is a candidate for
removal. The Removal Residency value becomes No Removal
Attempted if the volume is not accessed before the Removal
Time passes. The copy on this cluster is moved to the
Resident state if it is not accessed before the Removal Time
passes. If the copy on this cluster is accessed after the
Removal Time has passed, it is moved back to the Held state.
Removal Time This field is displayed only if the grid contains a disk-only cluster.
Values displayed in this field depend on values displayed in the
Removal Residency field. Possible values include:
—: Removal Time does not apply to the cluster. This value is
displayed if the cluster attaches to a physical tape library.
Removed: The date and time the virtual volume was removed
from the cluster.
Held: The date and time the virtual volume becomes a
candidate for removal.
Pinned: The virtual volume is never removed from the cluster.
No Removal Attempted, Retained, or Deferred: Removal Time
is not applicable.
The time that is recorded reflects the time zone in which the user's
browser is located.
Volume Copy Retention Group The name of the group that defines the preferred auto removal
policy applicable to the virtual volume. This field is displayed only
for a TS7700 Cluster or for partition 0 (CP0) on a TS7700 Tape
Attach Cluster, and only when the grid contains a disk-only cluster.
Retention Time The length of time, in hours, that must elapse before the virtual
volume named in Volume Copy Retention Group can be removed.
Depending on the Volume Copy Retention Reference setting, this
time period can be measured from the time when the virtual
volume was created or when it was most recently accessed.
Volume Copy Retention Reference The basis for calculating the time period defined in
Retention Time. The following are the possible values:
Volume Creation: Calculate retention time starting with the
time when the virtual volume was created. This value refers to
the time a write operation was performed from the beginning
of a tape. It can be either a scratch mount or a private mount
where writing began at record 0.
Volume Last Accessed: Calculate retention time starting with
the time when the virtual volume was most recently accessed.
(A worked example follows this table.)
TVC Preference The group containing virtual volumes that have preference for
premigration. This field is displayed only for a TS7740 Cluster or for
partitions 1 through 7 (CP1 through CP7) in a TS7720/TS7760
Tape Attach cluster. The following are the possible values:
PG0: Volumes in this group have preference to be premigrated
over other volumes.
PG1: Volumes in this group have preference to be premigrated
over other volumes. A “least recently used” algorithm is used
to select volumes for premigration if there are no volumes to
premigrate in preference group 0.
Time-Delayed Premigration Delay The length of time, in hours, that must elapse before a
delayed premigration operation can begin for the virtual volumes
designated by the TVC Preference parameter. Depending on the
Time-Delayed Premigration Reference setting, this time period can
be measured from the time when the virtual volume was created or
when it was most recently accessed.
Time-Delayed Premigration Reference The basis for calculating the time period defined in
Time-Delayed Premigration Delay. The following are the possible values:
Volume Creation: Calculate premigration delay starting with
the time when the virtual volume was created. This value
refers to the time a write operation was performed from the
beginning of a tape. It can be either a scratch mount or a
private mount where writing began at record 0.
Volume Last Accessed: Calculate premigration delay starting
with the time when the virtual volume was most recently
accessed.
Storage Preference The priority for removing a virtual volume when the cache reaches
a predetermined capacity. The following are the possible values:
Prefer Remove: These virtual volumes are removed first when
the cache reaches a predetermined capacity and no
scratch-ready volumes exist.
Prefer Keep: These virtual volumes are the last to be removed
when the cache reaches a predetermined capacity and no
scratch-ready volumes exist.
Pinned: These virtual volumes are never removed from the
TS7700 Cluster.
Partition Number The partition number for a TS7700 Tape Attach Cluster. Possible
values are CP0 through CP7. “Inserted” logical volumes are those
with a -1 partition, meaning that there is no consistent data yet.
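The Retention Time, Volume Copy Retention Reference, Time-Delayed Premigration Delay, and
Time-Delayed Premigration Reference fields all describe the same simple calculation: a delay
measured from either volume creation or last access. The following minimal Python sketch
illustrates that calculation; the function and variable names are hypothetical and are not part
of the TS7700 MI or its host interfaces.

from datetime import datetime, timedelta

def earliest_action_time(delay_hours, reference, creation_time, last_access_time):
    """Return the earliest time at which an action (removal candidacy or a
    delayed premigration) can occur, measured from the chosen reference.

    reference mirrors the two MI values: "Volume Creation" or
    "Volume Last Accessed"."""
    base = creation_time if reference == "Volume Creation" else last_access_time
    return base + timedelta(hours=delay_hours)

# Example: a 48-hour retention time measured from the last access.
created = datetime(2018, 5, 1, 8, 0)
accessed = datetime(2018, 5, 3, 14, 30)
print(earliest_action_time(48, "Volume Last Accessed", created, accessed))
# 2018-05-05 14:30:00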
This window is available only for volumes that have a created FlashCopy of a virtual volume. In
this context, a created FlashCopy is an existing FlashCopy that differs from the live virtual
volume because the live volume was modified after FlashCopy time zero. For volumes with an
active FlashCopy (meaning no difference between the FlashCopy and the live volume), as in
Figure 9-51 on page 403, only the Virtual Volume details window is available because the
FlashCopy and the live volume are identical.
Figure 9-53 shows a FlashCopy details page in the MI compared with the output of the LI REQ
Lvol flash command.
The virtual volume details and status are displayed in the Virtual volume details table:
Volser. The VOLSER of the virtual volume, which is a six-character value that uniquely
represents the virtual volume in the virtual library.
Media type. The media type of the virtual volume. Possible values are:
– Cartridge System Tape
– Enhanced Capacity Cartridge System Tape
Maximum Volume Capacity. The maximum size in MiB of the virtual volume. This
capacity is set upon insert, and is based on the media type of a virtual volume.
Current Volume Size. Size of the data in MiB for this virtual volume.
Current Owner. The name of the cluster that currently owns the latest version of the
virtual volume.
Currently Mounted. Whether the virtual volume is mounted in a virtual drive. If this value
is Yes, these qualifiers are also displayed:
– vNode. The name of the vNode that the virtual volume is mounted on.
– Virtual drive. The ID of the virtual drive the virtual volume is mounted on.
The Insert Virtual Volume window shows current availability across the entire grid in a table.
This table shows the total number of volumes already inserted, the maximum number of volumes
allowed in the grid, and the available slots (the difference between the maximum allowed and
the currently inserted numbers). Clicking Show/Hide under the table shows or hides the
information box with the already inserted volume ranges, quantities, media types, and capacities.
Figure 9-55 shows the inserted ranges box.
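As a simple illustration of the arithmetic behind this table, the following sketch computes the
available slots from hypothetical counts; the numbers and variable names are examples only, not
values from a real grid.

maximum_volumes_allowed = 4_000_000   # hypothetical maximum for the grid
currently_inserted = 1_250_000        # hypothetical count of inserted volumes

available_slots = maximum_volumes_allowed - currently_inserted
print(f"Available slots: {available_slots}")   # Available slots: 2750000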
Note: You can use the Modify Virtual Volume function to manage virtual volumes that
belong to a non-MVS host that is not aware of constructs.
MVS hosts automatically assign constructs for virtual volumes, and manual changes are
not recommended. The Modify Virtual Volumes window acts on any logical volume
belonging to the cluster or grid regardless of the host that owns the volume. The changes
that are made on this window take effect only on the modified volume or range after a
mount-demount sequence, or by using the LI REQ COPYRFSH command.
To display a range of existing virtual volumes, enter the starting and ending VOLSERs in the
fields at the top of the window and click Show.
To modify constructs for a range of logical volumes, identify a Volume Range, and then, click
the Constructs menu to select construct values and click Modify. The menus have these
options:
Volume Range: The range of logical volumes to be modified:
– From: The first VOLSER in the range.
– To: The last VOLSER in the range.
Constructs: Use the following menus to change one or more constructs for the identified
Volume Range. From each menu, the user can select a predefined construct to apply to
the Volume Range, No Change to retain the current construct value, or dashes (--------)
to restore the default construct value:
– Storage Groups: Changes the SG for the identified Volume Range.
– Storage Classes: Changes the SC for the identified Volume Range.
– Data Classes: Changes the DC for the identified Volume Range.
– Management Classes: Changes the MC for the identified Volume Range.
Note: Only unused logical volumes can be deleted through this window, meaning
volumes in the insert category (FF00) that have never been mounted and have never had their
category, constructs, or attributes modified by a host. All other logical volumes can be
deleted only from the host.
The normal way to delete several virtual scratch volumes is by initiating the activities from the
host. With Data Facility Storage Management Subsystem (DFSMS)/Removable Media
Management (RMM) as the tape management system (TMS), it is done by using RMM
commands.
To delete unused virtual volumes, select one of the options described next, and click Delete
Volumes. A confirmation window is displayed. Click OK to delete or Cancel to cancel. To
view the current list of unused virtual volume ranges in the TS7700 grid, enter a virtual
volume range at the bottom of the window and click Show. A virtual volume range deletion
can be canceled while in progress at the Cluster Operation History window.
To cancel a move request, select the Cancel Move Requests link. The following options to
cancel a move request are available:
Cancel All Moves: Cancels all move requests.
Cancel Priority Moves Only: Cancels only priority move requests.
Cancel Deferred Moves Only: Cancels only Deferred move requests.
Select a Pool: Cancels move requests from the designated source pool (1 - 32), or from all
source pools.
After defining the move operation parameters and clicking Move, confirm the request to move
the virtual volumes from the defined physical volumes. If Cancel is selected, you return to the
Move Virtual Volumes window.
A maximum of 10 search query results or 2 GB of search data can be stored at one time. If
either limit is reached, the user must delete one or more stored queries from the Previous
Virtual Volume Searches window before creating a new search.
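The following short Python sketch illustrates how such limits could be checked before a new
query is stored; the data structure and function name are hypothetical and do not represent
TS7700 code.

MAX_QUERIES = 10
MAX_TOTAL_BYTES = 2 * 1024**3          # 2 GB of stored search data

def can_save_new_query(saved_queries):
    """saved_queries: list of dicts, each with a hypothetical 'result_bytes'
    entry. Another search can be stored only if both limits allow it."""
    total_bytes = sum(q["result_bytes"] for q in saved_queries)
    return len(saved_queries) < MAX_QUERIES and total_bytes < MAX_TOTAL_BYTES

# Nine stored queries of roughly 150 MB each still leave room for one more.
queries = [{"result_bytes": 150 * 1024**2}] * 9
print(can_save_new_query(queries))     # True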
To view the results of a previous search query, select the Previous Searches hyperlink to see
a table containing a list of previous queries. Click a query name to display a list of virtual
volumes that match the search criteria.
To clear the list of saved queries, select the check box next to one or more queries to be
removed, select Clear from the menu, and click Go. This operation does not clear a search
query already in progress.
Confirm the decision to clear the query list. Select OK to clear the list of saved queries, or
Cancel to retain the list of queries.
To create a new search query, enter a name for the new query. Enter a value for any of the
fields and select Search to initiate a new virtual volume search. The query name, criteria,
start time, and end time are saved along with the search results.
To search for a specific VOLSER, enter parameters in the New Search Name and Volser
fields and then click Search.
When looking for the results of earlier searches, click Previous Searches on the Virtual
Volume Search window, which is shown in Figure 9-56.
Use this table to define the parameters for a new search query. Only one search can be
executed at a time. Define one or more of the following search parameters:
Volser (volume serial number). This field can be left blank. The following wildcard
characters are valid in this field (a short pattern-conversion sketch follows this list):
– Percent sign (%): Represents zero or more characters.
– Asterisk (*): Converted to % (percent). Represents zero or more characters.
– Period (.): Converted to _ (single underscore). Represents one character.
– A single underscore (_): Represents one character.
– Question mark (?): Converted to _ (single underscore). Represents one character.
Category: The name of the category to which the virtual volume belongs. This value is a
four-character hexadecimal string. For instance, 0002/0102 (scratch MEDIA2), 000E (error),
000F/001F (private), FF00 (insert) are possible values for Scratch and Specific categories.
Wildcard characters shown in previous topic also can be used in this field. This field can
be left blank.
Media Type: The type of media on which the volume exists. Use the menu to select from
the available media types. This field can be left blank.
Current Owner: The cluster owner is the name of the cluster where the logical volume
resides. Use the drop-down menu to select from a list of available clusters. This field is
only available in a grid environment and can be left blank.
Expire Time: The amount of time in which virtual volume data expires. Enter a number.
This field is qualified by the values Equal to, Less than, or Greater than in the preceding
menu and defined by the succeeding menu under the heading Time Units. This field can
be left blank.
Removal Residency: The automatic removal residency state of the virtual volume. This
field is not displayed for TS7740 clusters. In a TS7720T (tape attach) configuration, this
field is displayed only when the volume is in partition 0 (CP0). The following values are
possible:
– Blank (ignore): If this field is empty (blank), the search ignores any values in the
Removal Residency field. This is the default selection.
– Removed: The search includes only virtual volumes that have been removed.
– Removed Before: The search includes only virtual volumes that are removed before a
specific date and time. If this value is selected, the Removal Time field must be
complete as well.
– Removed After: The search includes only virtual volumes that are removed after a
certain date and time. If this value is selected, the Removal Time field must be
complete as well.
– In Cache: The search includes only virtual volumes in the cache.
– Retained: The search includes only virtual volumes that are classified as retained.
– Deferred: The search includes only virtual volumes that are classified as deferred.
– Held: The search includes only virtual volumes that are classified as held.
– Pinned: The search includes only virtual volumes that are classified as pinned.
– No Removal Attempted: The search includes only virtual volumes that have not
previously been subject to a removal attempt.
Tip: To avoid cache overruns, plan ahead when assigning volumes to this group.
– “-”: Volume Copy Retention does not apply to a TS7740 cluster or to a TS7720T (for
volumes in CP1 through CP7). This value (a dash indicating an empty value) is displayed if
the cluster attaches to a physical tape library.
Storage Group: The name of the SG to which the virtual volume belongs. The user can enter a
name in the empty field, or select a name from the adjacent menu. This field can be left
blank.
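The wildcard conversions listed for the Volser field amount to a small character translation.
The sketch below applies those documented mappings; it is an illustration only and not code from
the TS7700.

def to_search_pattern(user_pattern):
    """Translate the wildcards accepted by the Volser field:
    '*' and '%' mean zero or more characters (stored as '%'),
    '.', '?', and '_' mean exactly one character (stored as '_')."""
    mapping = {"*": "%", ".": "_", "?": "_"}
    return "".join(mapping.get(ch, ch) for ch in user_pattern)

print(to_search_pattern("A0*"))      # A0%
print(to_search_pattern("B.100?"))   # B_100_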
Remember: The user can print or download the results of a search query by using Print
Report or Download Spreadsheet on the Volumes found table at the end of the Search
Results window.
Use this table to select the properties that are displayed on the Virtual volume search results
window:
Click the down arrow that is adjacent to the Search Results Option, shown in Figure 9-56
on page 418 to open the Search Results Options table.
Select the check box that is adjacent for each property that should be included on the
Virtual Volume Search Results window. The following properties can be selected for
display:
– Category
– Current Owner (Grid only)
– Media Type
– Expire Time
– Storage group
– Management Class
– Storage Class
– Data Class
– Compression Method
– Mounted Tape Drive
– Removal Residency
– Removal Time
– Volume Copy Retention Group
– Storage Preference
– Logical WORM
Click the Search button to start a new virtual volume search.
When search is complete, the results are displayed in the Virtual Volume Search Results
window. The query name, criteria, start time, and end time are saved along with the search
results. A maximum of 10 search queries can be saved. The following subwindows are
available:
Previous virtual volume searches: Use this window to view previous searches of virtual
volumes on the cluster that the MI is currently accessing.
Virtual volume search results: Use this window to view a list of virtual volumes on this
cluster that meet the criteria of an executed search query.
Categories
Use this page to add, modify, or delete a scratch category of virtual volumes.
This page also can be used to view the total number of logical volumes classified in each
category, grouped under the Damaged, Scratch, and Private groups. Clicking the plus sign (+)
adjacent to a category expands the information about that category, showing how many volumes
in that category exist in each cluster of the grid (as shown in Figure 9-58 on page 423). A
category is a grouping of virtual volumes for a predefined use. A scratch category groups
virtual volumes for non-specific use. This grouping enables faster mount times because the
TS7700 can order category mounts without recalling data from a stacked volume (fast ready).
You can display the already defined categories, as shown in the Figure 9-58.
Categories The type of category that defines the virtual volume. The following values are
valid:
Scratch: Categories within the user-defined private range 0x0001 through
0xEFFF that are defined as scratch. Click the plus sign (+) icon to expand this
heading and reveal the list of categories that are defined by this type. Expire
time and hold values are shown in parentheses next to the category number.
See Table 9-10 for descriptions of these values.
Private: Custom categories that are established by a user, within the range of
0x0001 - 0xEFFF. Click the plus sign (+) icon to expand this heading and
reveal the list of categories that are defined by this type.
Damaged: A system category that is identified by the number 0xFF20. Virtual
volumes in this category are considered damaged.
Insert: A system category that is identified by the number 0xFF00. Inserted
virtual volumes are held in this category until moved by the host into a scratch
category.
If no defined categories exist for a certain type, that type is not displayed on the
Categories table.
Owning Cluster Names of all clusters in the grid. Expand a category type or number to display.
This column is visible only when the accessing cluster is part of a grid.
Counts The total number of virtual volumes according to category type, category, or
owning cluster.
Scratch Expired The total number of scratch volumes per owning cluster that are expired. The total
of all scratch expired volumes is the number of ready scratch volumes.
The user can use the Categories table to add, modify, or delete a scratch category, or to
change the way information is displayed.
Tip: The total number of volumes within a grid is not always equal to the sum of all
category counts. Volumes can change category multiple times per second, which makes
the snapshot count obsolete.
Modify a scratch category The user can modify a scratch category in two ways:
Select a category on the table, and then, select Actions →
Modify Scratch Category.
Right-click a category on the table and select
Modify Scratch Category from the menu.
The user can modify the following category values:
Expire
Set Expire Hold
The user can modify one category at a time.
Delete a scratch category The user can delete a scratch category in two ways:
1. Select a category on the table, and then, select Actions →
Delete Scratch Category.
2. Right-click a category on the table and select Delete Scratch
Category from the menu.
The user can delete only one category at a time.
Filter the table data Follow these steps to filter by using a string of text:
1. Click in the Filter field.
2. Enter a search string.
3. Press Enter.
If EXPIRE HOLD is set, the virtual volume cannot be mounted during the expire time
duration and will be excluded from any scratch counts surfaced to the host. The volume
category can be changed, but only to a private category, allowing accidental scratch
occurrences to be recovered to private. If EXPIRE HOLD is not set, then the virtual volume can
be mounted or have its category and attributes changed within the expire time duration. The
volume is also included in scratch counts surfaced to the host.
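The EXPIRE HOLD behavior described above can be summarized in a small decision sketch. The
function and return values are hypothetical and intended only to restate the rules.

def allowed_actions(expire_hold, target_category):
    """What is permitted for a scratch volume that is still within its
    expire time duration, depending on the EXPIRE HOLD setting."""
    if expire_hold:
        return {
            "mount": False,                     # cannot be mounted
            "counted_as_scratch": False,        # excluded from host scratch counts
            # the category can change, but only back to a private category
            "category_change": target_category == "private",
        }
    return {
        "mount": True,
        "counted_as_scratch": True,             # included in host scratch counts
        "category_change": True,                # category and attributes can change
    }

print(allowed_actions(True, "private"))   # recover an accidental scratch
print(allowed_actions(False, "scratch"))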
Note: There is no cross-check between defined categories in the z/OS systems and the
definitions in the TS7700.
Tip: Pools 1 - 32 are preinstalled and initially set to default attributes. Pool 1 functions as
the default pool and is used if no other pool is selected.
Figure 9-60 on page 428 shows an example of the Physical Volume Pools window. A link is
available to a tutorial that shows how to modify pool encryption settings; click the link
to see the tutorial material. This window is visible but disabled on the TS7700 MI if the grid
possesses a physical library, but the selected cluster does not. The following message is
displayed:
The cluster is not attached to a physical tape library.
The Physical Volume Pool Properties table displays the encryption setting and media
properties for every physical volume pool that is defined in a TS7740 and TS7720T. This table
contains two tabs: Pool Properties and Physical Tape Encryption Settings. The information
that is displayed in a tape-attached cluster depends on the current configuration and media
availability.
Important: The reclaim pool that is designated for the Copy Export pool needs to be
set to the same value as the Copy Export pool. If the reclaim pool is modified, Copy
Export disaster recovery capabilities can be compromised.
If there is a need to modify the reclaim pool that is designated for the Copy Export pool,
the reclaim pool cannot be set to the same value as the primary pool or the reclaim
pool that is designated for the primary pool. If the reclaim pool for the Copy Export pool
is the same as either of the other two pools that are mentioned, the primary and
backup copies of a virtual volume might exist on the same physical media. If the
reclaim pool for the Copy Export pool is modified, it is the user’s responsibility to Copy
Export volumes from the reclaim pool.
– Maximum Devices: The maximum number of physical tape drives that the pool can use
for premigration.
Note: You can use identical values in Key Label 1 and Key Label 2, but you must
define each label for each key.
If the encryption state is Disabled, this field is blank. If the default key is used, the value
in this field is default key.
– Key Mode 2: Encryption mode that is used with Key Label 2. The following values
are valid:
Clear Label The data key is specified by the key label in clear text.
Hash Label The data key is referenced by a computed value corresponding
to its associated public key.
None Key Label 2 is disabled.
“-” The default key is in use.
– Key Label 2: The current EK Label 2 for the pool. The label must consist of ASCII
characters and cannot exceed 64 characters. Leading and trailing blanks are removed,
but an internal space is allowed. Lowercase characters are internally converted to
uppercase upon storage. Therefore, key labels are reported by using uppercase
characters.
If the encryption state is Disabled, this field is blank. If the default key is used, the value
in this field is default key.
Note: Pools 1-32 are preinstalled. Pool 1 functions as the default pool and is used if no
other pool is selected. All other pools must be defined before they can be selected.
Important: The reclaim pool designated for the copy export pool should be set to
the same value as the copy export pool. If the reclaim pool is modified, copy export
disaster recovery capabilities can be compromised.
If there is a need to modify the reclaim pool designated for the copy export pool, the
reclaim pool cannot be set to the same value as the primary pool or the reclaim pool
designated for the primary pool. If the reclaim pool for the copy export pool is the same
as either of the other two pools mentioned, then primary and backup copies of a virtual
volume might exist on the same physical media. If the reclaim pool for the copy export
pool is modified, it is your responsibility to copy export volumes from the reclaim pool.
– Maximum Devices: The maximum number of physical tape drives that the pool can use
for premigration.
– Export Pool: The type of export supported if the pool is defined as an Export Pool (the
pool from which physical volumes are exported). The following are the possible values:
• Not Defined: The pool is not defined as an Export pool.
• Copy Export: The pool is defined as a Copy Export pool.
– Export Format: The media format used when writing volumes for export. This function
can be used when the physical library recovering the volumes supports a different
media format than the physical library exporting the volumes. This field is only enabled
if the value in the Export Pool field is Copy Export. The following are the possible
values:
• Default: The highest common format supported across all drives in the library. This
is also the default value for the Export Format field.
• E06: Format of a 3592 E06 Tape Drive.
• E07: Format of a 3592 E07 Tape Drive.
• E08: Format of a 3592 E08 Tape Drive.
Note: This control is applied to the reclamation of both sunset and R/W media.
– Age of Last Data Written: The number of days the pool has persisted without write
access to set a physical stacked volume as a candidate for reclamation. Each physical
stacked volume has a timer for this purpose, which is reset when a virtual volume is
accessed. The reclamation occurs at a later time, based on an internal schedule. The
valid range of possible values is 1-365. Clearing the check box deactivates this
function.
Note: This control is applied to the reclamation of both sunset and R/W media.
– Days Without Data Inactivation: The number of sequential days that the data ratio of
the pool has been higher than the Maximum Active Data used to set a physical stacked
volume as a candidate for reclamation. Each physical stacked volume has a timer for
this purpose, which is reset when data inactivation occurs. The reclamation occurs at a
later time, based on an internal schedule. The valid range of possible values is 1-365.
Clearing the check box deactivates this function. If deactivated, this field is not used as
a criteria for reclaim.
Note: This control is applied to the reclamation of both sunset and R/W media.
– Maximum Active Data: The ratio of the amount of active data in the entire physical
stacked volume capacity. This field is used with Days Without Data Inactivation. The
valid range of possible values is 5-95(%). This function is disabled if Days Without Data
Inactivation is not selected.
– Reclaim Threshold: The percentage used to determine when to perform reclamation of
free storage on a stacked volume. When the amount of active data on a physical
stacked volume drops below this percentage, a reclaim operation is performed on the
stacked volume.
Physical volumes hold between the threshold value and 100% of active data. For example, if
the threshold value is 35% (the default), full physical volumes hold, on average,
(100% + 35%)/2, or 67.5%, active data. Setting the threshold too low results in more
physical volumes being needed. Setting the threshold too high might affect the ability of
the TS7700 Tape Attach to perform host workload because it is using its resources to
perform reclamation. Experiment to find a threshold that matches your needs (the sketch
after this procedure illustrates the arithmetic).
The valid range of possible values is 0 - 95%, which can be entered in 1% increments.
The default value is 35%. If the system is in a heterogeneous tape drive environment,
this threshold applies to R/W media.
Note: If the system contains TS1140 or TS1150 tape drives, the system requires at
least 15 scratch physical volumes to run reclamation for sunset media.
6. To complete the operation click OK. To abandon the operation and return to the Physical
Volume Pools window, click Cancel.
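To make the Reclaim Threshold arithmetic concrete, the following sketch estimates the average
utilization of full stacked volumes for a given threshold and flags reclaim candidates. The
function names are illustrative only; the TS7700 performs this evaluation internally.

def average_full_volume_utilization(reclaim_threshold_pct=35.0):
    """Full stacked volumes hold between the reclaim threshold and 100% of
    active data, so on average they sit midway between the two values."""
    return (reclaim_threshold_pct + 100.0) / 2.0

def is_reclaim_candidate(active_data_pct, reclaim_threshold_pct=35.0):
    """A stacked volume becomes a reclaim candidate when its active data
    drops below the pool's reclaim threshold (default 35%)."""
    return active_data_pct < reclaim_threshold_pct

print(average_full_volume_utilization(35.0))   # 67.5 (% active data on average)
print(is_reclaim_candidate(28.0))              # True
print(is_reclaim_candidate(60.0))              # False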
To watch a tutorial showing how to modify pool encryption settings, click the View tutorial link
on the Physical Volume Pools page.
To modify encryption settings for one or more physical volume pools, complete these steps:
1. From the Physical Volume Pools page, click the Encryption Settings tab.
2. Select the check box next to each pool to be modified.
3. Select Modify Encryption Settings from the Select Action drop-down menu.
4. Click Go to open the Modify Encryption Settings window.
5. Modify values for any of the following fields:
– Encryption: The encryption state of the pool. The following are the possible values:
• Enabled: Encryption is enabled on the pool.
• Disabled: Encryption is not enabled on the pool. When this value is selected, key
modes, key labels, and check boxes are disabled.
– Use encryption key server default key: Check this box to populate the Key Label field
using a default key provided by the encryption key server.
Note: Your encryption key server software must support default keys to use this
option.
This check box appears before both the Key Label 1 and Key Label 2 fields. You must select
it for each label that is to be defined by using the default key. If this box is selected,
the following fields are disabled:
• Key Mode 1
• Key Label 1
Note: You can use identical values in Key Label 1 and Key Label 2, but you must
define each label for each key.
– Key Mode 2: Encryption Mode used with Key Label 2. The following are the possible
values for this field:
• Clear Label: The data key is specified by the key label in clear text.
• Hash Label: The data key is referenced by a computed value corresponding to its
associated public key.
• None: Key Label 2 is disabled.
• -: The default key is in use.
– Key Label 2: The current encryption key Label 2 for the pool. The label must consist of
ASCII characters and cannot exceed 64 characters. Leading and trailing blanks are
removed, but an internal space is allowed. Lowercase characters are internally
converted to uppercase upon storage. Therefore, key labels are reported using
uppercase characters (see the sketch after this procedure).
6. To complete the operation click OK. To abandon the operation and return to the Physical
Volume Pools window, click Cancel.
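The key label rules in step 5 (ASCII only, at most 64 characters, trimmed of leading and
trailing blanks, stored in uppercase) can be mirrored in a short validation sketch. This is
only an illustration of the documented rules, not the product's validation code.

def normalize_key_label(label):
    """Apply the documented key label rules: strip leading and trailing
    blanks, keep internal spaces, enforce ASCII and a 64-character limit,
    and convert to uppercase as done upon storage."""
    trimmed = label.strip()
    if len(trimmed) > 64:
        raise ValueError("key label exceeds 64 characters")
    if not trimmed.isascii():
        raise ValueError("key label must contain only ASCII characters")
    return trimmed.upper()

print(normalize_key_label("  my backup key 01  "))   # MY BACKUP KEY 01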
Physical volumes
The topics in this section present information that is related to monitoring and manipulating
physical volumes in the TS7700T and TS7740. This window is visible but disabled on the
TS7700 MI if the grid possesses a physical library, but the selected cluster does not.
Tip: This window is not visible on the TS7700 MI if the grid does not possess a physical
library.
The following information is displayed when details for a physical stacked volume are
retrieved:
VOLSER. Six-character VOLSER number of the physical stacked volume.
Type. The media type of the physical stacked volume. The following values are possible:
– JA (ETC). Enterprise Tape Cartridge.
– JB (ETCL). Enterprise Extended-Length Tape Cartridge.
– JC (ATCD). Advanced Type C Data cartridge.
– JD (ATDD). Advanced Type D Data cartridge.
– JJ (EETC). Enterprise Economy Tape Cartridge.
– JK (ATKE). Advanced Type K Economy cartridge.
– JL (ATLE). Advanced Type L Economy cartridge.
Recording Format. The format that is used to write the media. The following values are
possible:
– Undefined. The recording format that is used by the volume is not recognized as a
supported format.
– J1A
– E05
– E05E. E05 with encryption.
– E06
– E06E. E06 with encryption.
– E07
– E07E. E07 with encryption.
– E08
– E08E. E08 with encryption.
Volume State The following values are possible:
– Read-Only. The volume is in a read-only state.
– Read/write. The volume is in a read/write state.
– Unavailable. The volume is in use by another task or is in a pending eject state.
– Destroyed. The volume is damaged and unusable for mounting.
– Copy Export Pending. The volume is in a pool that is being exported as part of an
in-progress Copy Export.
– Copy Exported. The volume has been ejected from the library and removed to offsite
storage.
– Copy Export Reclaim. The host can send a Host Console Query request to reclaim a
physical volume currently marked Copy Exported. The data mover then reclaims the
virtual volumes from the primary copies.
– Copy Export No Files Good. The physical volume has been ejected from the library
and removed to offsite storage. The virtual volumes on that physical volume are
obsolete.
– Misplaced. The library cannot locate the specified volume.
– Inaccessible. The volume exists in the library inventory but is in a location that the
cartridge accessor cannot access.
– Manually Ejected. The volume was previously present in the library inventory, but
cannot currently be located.
Capacity State. Possible values are empty, filling, and full.
Key Label 1/Key Label 2. The EK label that is associated with a physical volume. Up to
two key labels can be present. If there are no labels present, the volume is not encrypted.
If the EK used is the default key, the value in this field is default key.
Encrypted Time. The date the physical volume was first encrypted using the new EK. If
the volume is not encrypted, the value in this field is “-”.
The Select Move Action menu provides options for moving physical volumes to a target pool.
The following options are available to move physical volumes to a target pool:
Move Range of Physical Volumes. Moves physical volumes to the target pool physical
volumes in the specified range. This option requires you to select a Volume Range, Target
Pool, and Move Type. The user can also select a Media Type.
Move Range of Scratch Only Volumes. Moves physical volumes to the target pool
scratch volumes in the specified range. This option requires you to select a Volume Range
and Target Pool. The user can also select a Media Type.
Move Quantity of Scratch Only Volumes. Moves a specified quantity of physical
volumes from the source pool to the target pool. This option requires you to select the
Number of Volumes, Source Pool, and Target Pool. The user can also select a Media Type.
Move Export Hold to Private. Moves all Copy Export volumes in a source pool back to a
private category if the volumes are in the Export/Hold category but are not selected to be
ejected from the library. This option requires you to select a Source Pool.
Cancel Move Requests. Cancels any previous move request.
Note: This option applies only to private media, not scratch tapes.
If the user selects Move Range of Physical Volumes or Move Range of Scratch Only
Volumes from the Select Move Action menu, the user must define a volume range or select
an existing range, select a target pool, and identify a move type. A media type can be
selected as well.
If the user selects Move Export Hold to Private from the Select Move Action menu, the
user must identify a source pool.
After the user defines move operation parameters and clicks Move, the user confirms the
request to move physical volumes. If the user selects Cancel, the user returns to the Move
Physical Volumes window. To cancel a previous move request, select Cancel Move
Requests from the Select Move Action menu. The following options are available to cancel
a move request:
Cancel All Moves. Cancels all move requests.
Cancel Priority Moves Only. Cancels only priority move requests.
Cancel Deferred Moves Only. Cancels only deferred move requests.
Select a Pool. Cancels move requests from the designated source pool (0 - 32), or from
all source pools.
The Select Eject Action menu provides options for ejecting physical volumes.
Note: Before a stacked volume with active virtual volumes can be ejected, all active logical
volumes in it are copied to a different stacked volume.
If the user selects Eject Range of Physical Volumes or Eject Range of Scratch Only
Volumes from the Select Eject Action menu, the user must define a volume range or select
an existing range and identify an eject type. A media type can be selected as well.
If the user selects Eject Quantity of Scratch Only Volumes from the Select Eject Action
menu, the user must define the number of volumes to be ejected and identify a source
pool. A media type can be selected as well.
If the user selects Eject Export Hold Volumes from the Select Eject Action menu, the user
must select the VOLSERs of the volumes to be ejected. To select all VOLSERs in the Export
Hold category, select Select All from the menu. The eject operation parameters include these
parameters:
Volume Range. The range of physical volumes to eject. The user can use either this
option or the Existing Ranges option to define the range of volumes to eject, but not both.
Define the range:
– From. VOLSER of the first physical volume in the range to eject.
– To. VOLSER of the last physical volume in the range to eject.
Existing Ranges. The list of existing physical volume ranges. The user can use either this
option or the Volume Range option to define the range of volumes to eject, but not both.
Eject Type. Used to determine when the eject operation will occur. The following values
are possible:
– Deferred Eject. The eject operation occurs based on the first Reclamation policy that
is triggered for the applied source pool. This operation depends on reclaim policies for
the source pool and can take some time to complete.
After the user defines the eject operation parameters and clicks Eject, the user must confirm
the request to eject physical volumes. If the user selects Cancel, the user returns to the Eject
Physical Volumes window.
To cancel a previous eject request, select Cancel Eject Requests from the Select Eject
Action menu. The following options are available to cancel an eject request:
Cancel All Ejects. Cancels all eject requests.
Cancel Priority Ejects Only. Cancels only priority eject requests.
Cancel Deferred Ejects Only. Cancels only deferred eject requests.
When working with volumes recently added to the attached TS3500 tape library that are not
showing in the Physical Volume Ranges window, click Inventory Upload. This action
requests the physical inventory from the defined logical library in the tape library to be
uploaded to the TS7700T, repopulating the Physical Volume Ranges window.
Tip: When inserting a VOLSER that belongs to a defined tape attach TS7700 range, it is
presented and inventoried according to the setup in place. If the newly inserted VOLSER
does not belong to any defined range in the TS7700T, an intervention-required message is
generated, requiring the user to correct the assignment for this VOLSER.
Important: If a physical volume range contains virtual volumes with active data, those
virtual volumes must be moved or deleted before the physical volume range can be moved
or deleted.
Note: JA and JJ media are supported only for read-only operations with 3592 E07 tape
drives. 3592-E08 does not support JA, JJ, or JB media.
Home Pool. The home pool to which the VOLSER range is assigned.
Use the menu on the Physical Volume Ranges table to add a VOLSER range, or to modify
or delete a predefined range.
Unassigned Volumes
The Unassigned Volumes table displays the list of unassigned physical volumes that are
pending ejection for a cluster. A VOLSER is removed from this table when a new range
that contains the VOLSER is added. The following status information is displayed in the
Unassigned Volumes table:
VOLSER. The VOLSER associated with a given physical volume.
Media Type. The media type for all volumes in a VOLSER range. The following values are
possible:
– JA (ETC). Enterprise Tape Cartridge.
– JB (ETCL). Enterprise Extended-Length Tape Cartridge.
– JC (ATCD). Advanced Type C Data cartridge.
– JD (ATDD). Advanced Type D Data cartridge.
– JJ (EETC). Enterprise Economy Tape Cartridge.
– JK (ATKE). Advanced Type K Economy cartridge.
– JL (ATLE). Advanced Type L Economy cartridge.
Note: JA and JJ media are supported only for read-only operations with 3592 E07 tape
drives. 3592-E08 does not support JA, JJ, or JB media.
Pending Eject. Whether the physical volume associated with the VOLSER is awaiting
ejection.
Use the Unassigned Volumes table to eject one or more physical volumes from a library that
is attached to a TS7720T or TS7740.
Note: Only one search can be run at a time. If a search is in progress, an information
message displays at the top of the Physical Volume Search window. The user can cancel a
search in progress by clicking Cancel Search within this message.
Use this table to select the properties that are displayed on the Physical Volume Search
Results window.
Click the down arrow next to Search Results Options to open the Search Results Options
table. Select the check box next to each property that should display on the Physical Volume
Search Results window.
Review the property definitions from the Search Options table section. The following
properties can be displayed on the Physical Volume Search Results window:
Media Type
Recording Format
Home Pool
Current Pool
Pending Actions
Volume State
Mounted Tape Drive
Encryption Key Labels
Export Hold
Read Only Recovery
Copy Export Recovery
Database Backup
Click Search to initiate a new physical volume search. After the search is initiated but before it
completes, the Physical Volume Search window displays the following information message:
The search is currently in progress. The user can check the progress of the search
on the Previous Search Results window.
To check the progress of the search being run, click the Previous Search Results hyperlink
in the information message. To cancel a search in progress, click Cancel Search. When the
search completes, the results are displayed in the Physical Volume Search Results window.
The query name, criteria, start time, and end time are saved along with the search results. A
maximum of 10 search queries can be saved.
The tables in this page show the number of physical volumes that are marked as full in each
physical volume pool, according to % of volume used. The following fields are displayed:
Pool. The physical volume pool number. This number is a hyperlink; click it to display a
graphical representation of the number of physical volumes per utilization increment in a
pool. If the user clicks the pool number hyperlink, the Active Data Distribution subwindow
opens.
Tip: This percentage is a hyperlink; click it to open the Modify Pool Properties window,
where the user can modify the percentage that is used for this threshold.
Number of Volumes with Active Data. The number of physical volumes that contain
active data.
Pool n Active Data Distribution. This graph displays the number of volumes that contain
active data per volume utilization increment for the selected pool. On this graph, utilization
increments (x axis) do not overlap.
Pool n Active Data Distribution (cumulative). This graph displays the cumulative
number of volumes that contain active data per volume utilization increment for the
selected pool. On this graph, utilization increments (x axis) overlap, accumulating as
they increase (a worked sketch follows this list).
The Active Data Distribution subwindow also displays utilization percentages for the
selected pool, excerpted from the Number of Full Volumes at Utilization Percentages
table.
Media Type. The type of cartridges that are contained in the physical volume pool. If more
than one media type exists in the pool, each type is displayed, separated by commas. The
following values are possible:
– Any 3592. Any media with a 3592 format.
– JA (ETC). Enterprise Tape Cartridge.
– JB (ETCL). Enterprise Extended-Length Tape Cartridge.
– JC (ATCD). Advanced Type C Data cartridge.
– JD (ATDD). Advanced Type D Data cartridge.
– JJ (EETC). Enterprise Economy Tape Cartridge.
– JK (ATKE). Advanced Type K Economy cartridge.
– JL (ATLE). Advanced Type L Economy cartridge.
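The following sketch illustrates the difference between the two Active Data Distribution graphs
by bucketing hypothetical volume utilization values into increments and then accumulating the
counts. The utilization values are invented for the example.

from collections import Counter
from itertools import accumulate

# Hypothetical active-data percentages for the full volumes of one pool.
utilization = [12, 18, 22, 37, 41, 44, 58, 63, 71, 77, 84, 92]

increment = 10   # width of each utilization bucket, in percent
buckets = Counter((u // increment) * increment for u in utilization)

# Non-cumulative: volumes per increment (increments do not overlap).
per_increment = [buckets.get(b, 0) for b in range(0, 100, increment)]

# Cumulative: counts accumulate as the increments increase.
cumulative = list(accumulate(per_increment))

for start, n, c in zip(range(0, 100, increment), per_increment, cumulative):
    print(f"{start:2d}-{start + increment - 1:2d}%: {n} volumes (cumulative {c})")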
This window is visible but disabled on the TS7700 MI if the grid possesses a physical library,
but the selected cluster does not. The following message is displayed:
The cluster is not attached to a physical tape library.
Tip: This window is not visible on the TS7700 MI if the grid does not possess a physical
library.
The Physical Tape Drives table displays status information for all physical drives accessible by
the cluster, including the following information:
Serial Number. The serial number of the physical drive.
Drive Type. The machine type and model number of the drive. The following values are
possible:
– 3592J1A.
– 3592E05.
– 3592E05E. A 3592 E05 drive that is Encryption Capable.
– 3592E06.
– 3592E07.
– 3592E08.
Online. Whether the drive is online.
Note: If the user is monitoring this field while changing the encryption status of a
drive, the new status does not display until you bring the TS7700 Cluster offline and
then back online.
This window is visible but disabled on the TS7700 MI if the grid possesses a physical library,
but the selected cluster does not. The following message is displayed:
The cluster is not attached to a physical tape library.
Tip: This window is not visible on the TS7700 MI if the grid does not possess a physical
library.
The following physical media counts are displayed for each media type in each storage pool:
Pool. The storage pool number.
Media Type. The media type defined for the pool. A storage pool can have multiple media
types and each media type is displayed separately. The following values are possible:
– JA (ETC). Enterprise Tape Cartridge.
– JB (ETCL). Enterprise Extended-Length Tape Cartridge.
– JC (ATCD). Advanced Type C Data cartridge.
– JD (ATDD). Advanced Type D Data cartridge.
– JJ (EETC). Enterprise Economy Tape Cartridge.
– JK (ATKE). Advanced Type K Economy cartridge.
– JL (ATLE). Advanced Type L Economy cartridge.
Empty. The count of physical volumes that are empty for the pool.
Filling. The count of physical volumes that are filling for the pool. This field is blank for
pool 0.
Full. The count of physical volumes that are full for the pool. This field is blank for pool 0.
Tip: A value in the Full field is displayed as a hyperlink; click it to open the Active Data
Distribution subwindow. The Active Data Distribution subwindow displays a graphical
representation of the number of physical volumes per utilization increment in a pool. If
no full volumes exist, the hyperlink is disabled.
Queued for Erase. The count of physical volumes that are reclaimed but need to be
erased before they can become empty. This field is blank for pool 0.
The SGs table displays all existing SGs available for a cluster.
The user can use the SGs table to create an SG, modify an existing SG, or delete an SG.
Also, the user can copy selected SGs to the other clusters in this grid by using the Copy to
Clusters action available in the menu.
Use the menu in the SGs table to add an SG, or modify or delete an existing SG.
To add an SG, select Add from the menu. Complete the fields for information that will be
displayed in the SGs table.
Consideration: If the cluster does not possess a physical library, the Primary Pool field is
not available in the Add or Modify options.
To modify an existing SG, select the radio button from the Select column that appears next to
the name of the SG that needs to be modified. Select Modify from the menu. Complete the
fields for information that must be displayed in the SGs table.
The Secondary Copy Pool column is shown only for a tape-attached TS7700 cluster. A secondary
copy pool is required for using the Copy Export function.
The user can use the MCs table to create, modify, and delete MCs. The default MC can be
modified, but cannot be deleted. The default MC uses dashes (--------) for the symbolic
name.
Remember: If the cluster does not possess a physical library, the Secondary Pool field
is not available in the Add option.
You can use the Copy Action option to copy any existing MC to each cluster in the TS7700
Grid.
You can view SCs from any TS7700 in the grid, but TVC preferences can be altered only from
a tape-attached cluster. Figure 9-68 shows the window in a TS7720T model.
The SCs table lists defined SCs that are available to control data sets (CDSs) and objects
within a cluster.
The default SC can be modified, but cannot be deleted. The default SC has dashes
(--------) as the symbolic name.
Important: Scratch categories and DCs work at the system level and are unique for all
clusters in a grid. Therefore, if they are modified in one cluster, they are applied to all
clusters in the grid.
The DC table (Figure 9-70) displays the list of DCs defined for each cluster of the grid.
The user can use the DCs table to create a DC or modify, copy, or delete an existing DC. The
default DC can be modified, but cannot be deleted. The default DC has dashes (--------) as
the symbolic name.
Use the menu in the DCs table to add a DC, or modify or delete an existing DC.
Tip: The user can create up to 256 DCs per TS7700 grid.
The user can access the following options through the User Access (blue man icon) link, as
depicted in the Figure 9-71.
The TS7700 management interface (MI) pages collected under the Access icon help the
user to view or change security settings, roles and permissions, passwords, and
certificates. The user can also update the IBM Knowledge Center (InfoCenter) files from this
menu.
Security Settings. Use this window to view security settings for a TS7700 grid. From this
window, the user can also access other pages to add, modify, assign, test, and delete
security settings.
Roles and Permissions. Use this window to set and control user roles and permissions
for a TS7700 grid.
Note: Although the term “InfoCenter” is still used in the interface of the product, the term
“IBM Knowledge Center” is the current correct term.
Authentication Policies
The user can add, modify, assign, test, and delete the authentication policies that determine
how users are authenticated to the TS7700 Management Interface. Each cluster is assigned
a single authentication policy. The user must be authorized to modify security settings before
changing authentication policies.
There are two categories of authentication policies: Local, which replicates users and their
assigned roles across a grid, and External, which stores user and group data on a separate
server and maps relationships between users, groups, and authorization roles when a user
logs in to a cluster. External policies include Storage Authentication Service policies and
Direct LDAP (lightweight directory access protocol) policies1.
Note: A restore of cluster settings (from a previously taken backup) will not restore or
otherwise modify any user, role, or password settings defined by a security policy.
Policies can be assigned on a per cluster basis. One cluster can employ local authentication,
while a different cluster within the same grid domain can employ an external policy.
Additionally, each cluster in a grid can operate its own external policy. However, only one
policy can be enabled on a cluster at a time.
Type. The policy type, which can be one of the following values:
– Local
A policy that replicates authorization based on user accounts and assigned roles. It is
the default authentication policy. When enabled, it is enforced for all clusters in the grid.
If Storage Authentication Service is enabled, the Local policy is disabled. This policy
can be modified to add, change, or delete individual accounts, but the policy itself
cannot be deleted.
1. When operating at code level 8.20.x.x or 8.21.0.63 - 8.21.0.119 with Storage Authentication Service enabled, a
5-minute web server outage occurs when a service person logs in to the machine.
Important: If this field is blank for an enabled policy, then IBM service
representatives must log in using LDAP login credentials obtained from the
system administrator. If the LDAP server is inaccessible, IBM service representatives
cannot access the cluster.
Tip: Passwords for the users are changed from this window also.
To modify a user account belonging to the Local Authentication Policy, complete these steps:
1. On the TS7700 MI, click Access (blue man icon) → Security Settings from the left
navigation window.
2. Click Select next to the Local policy name on the Authentication Policies table.
3. Select Modify from the Select Action menu and click Go.
Note: The user cannot modify the user name or Group Name. Only the role and the
clusters to which it is applied can be modified.
In the Cluster Access table, select the Select check box to toggle all the cluster check boxes
on and off.
Important: When a Storage Authentication Service policy is enabled for a cluster, service
personnel are required to log in with the setup user or group. Before enabling storage
authentication, create an account that can be used by service personnel.
Remember: If the Primary or alternative Server URL uses the HTTPS protocol, a
certificate for that address must be defined on the SSL Certificates window.
d. Server Authentication: Values in the following fields are required if IBM WebSphere
Application Server security is enabled on the WebSphere Application Server that is
hosting the Authentication Service. If WebSphere Application Server security is
disabled, the following fields are optional:
• User ID: The user name that is used with HTTP basic authentication for
authenticating to the Storage Authentication Service.
• Password: The password that is used with HTTP basic authentication for
authenticating to the Storage Authentication Service.
4. To complete the operation, click OK. To abandon the operation and return to the Security
Settings window, click Cancel.
Note: Generally, select the Allow IBM Support options to grant the IBM service
representative access to the TS7700.
Click OK to confirm the creation of the Storage Authentication Policy. In the Authentication
Policies table, no clusters are assigned to the newly created policy, so the Local
Authentication Policy is enforced. When the newly created policy is in this state, it can be
deleted because it is not applied to any of the clusters.
To delete a Storage Authentication Service Policy from a TS7700 grid, complete the following
steps:
1. On the TS7700 MI, click Access → Security Settings from the left navigation window.
2. From the Security Settings window, go to the Authentication Policies table and complete
the following steps:
a. Select the radio button next to the policy that must be deleted.
b. Select Delete from the Select Action menu.
c. Click Go to open the Confirm Delete Storage Authentication Service policy window.
d. Click OK to delete the policy and return to the Security Settings window, or click
Cancel to abandon the delete operation and return to the Security Settings window.
3. Confirm the policy deletion: Click OK to delete the policy.
Tip: The policy needs to be configured to an LDAP server before being added in the
TS7700 MI. External users and groups to be mapped by the new policy are checked in
LDAP before being added.
Note: If the user name that is entered belongs to a user who is not included in the policy, the
test results show success, but the result comments show a null value for the role and access
fields. Additionally, that user name cannot be used to log in to the MI.
4. Click OK to complete the operation. If you must abandon the operation, click Cancel to
return to the Security Settings window.
When the authentication policy test completes, the Test Authentication Policy results window
opens to display results for each selected cluster. The results include a statement indicating
whether the test succeeded or failed, and if it failed, the reason for the failure. The Test
Authentication Policy results window also displays the Policy Users table. Information that is
shown on that table includes the following fields:
Username. The name of a user who is authorized by the selected authentication policy.
Role. The role that is assigned to the user under the selected authentication policy.
Cluster Access. A list of all the clusters in the grid for which the user and user role are
authorized by the selected authentication policy.
To return to the Test Authentication Policy window, click Close Window. To return to the
Security Settings window, click Back at the top of the Test Authentication Policy results
window.
Important: When a Direct LDAP policy is enabled for a cluster, service personnel are
required to log in with the setup user or group. Before enabling LDAP authentication,
create an account that can be used by service personnel. Also, the user can enable an
IBM SSR to connect to the TS7700 through physical access or remotely by selecting those
options in the DIRECT LDAP POLICY window.
Note: LDAP external authentication policies are not available for backup or recovery
through the backup or restore settings operations. Record the policy settings, keep them safe,
and have them available for a manual recovery as dictated by your security standards.
The values in the following fields are required if secure authentication is used or anonymous
connections are disabled on the LDAP server:
User Distinguished Name: The user distinguished name is used to authenticate to the
LDAP authentication service. This field supports a maximum length of 254 Unicode
characters, for example:
CN=Administrator,CN=users,DC=mycompany,DC=com
Password: The password is used to authenticate to the LDAP authentication service. This
field supports a maximum length of 254 Unicode characters.
When modifying an LDAP Policy, the LDAP attributes fields also can be changed:
Base Distinguished Name: The LDAP distinguished name (DN) that uniquely identifies a set
of entries in a realm. This field is required but blank by default. The value in this field
consists of 1 - 254 Unicode characters.
User Name Attribute: The attribute name that is used for the user name during
authentication. This field is required and contains the value uid by default. The value in
this field consists of 1 - 61 Unicode characters.
Password: The attribute name that is used for the password during authentication. This
field is required and contains the value userPassword by default. The value in this field
consists of 1 - 61 Unicode characters.
Group Member Attribute: The attribute name that is used to identify group members. This
field is optional and contains the value member by default. This field can contain up to 61
Unicode characters.
Group Name Attribute: The attribute name that is used to identify the group during
authorization. This field is optional and contains the value cn by default. This field can
contain up to 61 Unicode characters.
User Name filter: Used to filter and verify the validity of an entered user name. This field is
optional and contains the value (uid={0}) by default, where {0} is replaced with the entered
user name (see the sketch after these steps). This field can contain up to 254
Unicode characters.
Group Name filter: Used to filter and verify the validity of an entered group name. This field
is optional and contains the value (cn={0}) by default. This field can contain up to 254
Unicode characters.
Click OK to complete the operation. Click Cancel to abandon the operation and return to the
Security Settings window.
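The {0} placeholder in the User Name filter and Group Name filter is replaced with the value
entered at login. A minimal sketch of that substitution, including basic RFC 4515 escaping of
special characters, is shown below; it is illustrative only and is not the TS7700 implementation.

def escape_ldap_value(value):
    """Escape the characters that RFC 4515 requires escaping in filter values."""
    replacements = {"\\": r"\5c", "*": r"\2a", "(": r"\28", ")": r"\29", "\0": r"\00"}
    return "".join(replacements.get(ch, ch) for ch in value)

def build_filter(template, entered_value):
    """Substitute the login value into a filter template such as '(uid={0})'
    or '(cn={0})'."""
    return template.replace("{0}", escape_ldap_value(entered_value))

print(build_filter("(uid={0})", "jsmith"))        # (uid=jsmith)
print(build_filter("(cn={0})", "tape(admins)"))   # (cn=tape\28admins\29)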
As shown in Figure 9-73 on page 475, a new policy is created, which is called RACF_LDAP. The
Primary Server URL is that of the IBM Security Directory Server, the same way any regular
LDAP server is configured.
In the screen capture, the Group Member Attribute was set to racfgroupuserids (it is shown
truncated in the MI text box).
The item User Distinguished Name should be specified with all of the following parameters:
racfid
profiletype
cn
When the previous setup is complete, more users can be added to the policy, or clusters can
be assigned to it, as described in the topics to follow. There are no specific restrictions for
these RACF/LDAP user IDs, and they can be used to secure the MI, or the IBM service login
(for the IBM SSR) just as any other LDAP user ID.
See IBM Knowledge Center, available locally on the MI window by clicking the question mark
on the upper-right bar and selecting Help, or on the following website:
http://www.ibm.com/support/knowledgecenter/STFS69_3.3.0/ts7700_security_ldap_racf.
html
The Manage users and assign user roles link opens the Security Settings window where
the user can add, modify, assign, test, and delete the authentication policies that determine
how users are authenticated to the TS7700 Management Interface.
Note: Valid characters for the Name of Custom Role field are A-Z, 0-9, $, @, *, #, and %.
The first character of this field cannot be a number.
To view the Roles and Assigned Permissions table, complete the following steps:
1. Select the check box to the left of the role to be displayed. The user can select more than
one role to display a comparison of permissions.
2. Select Properties from the Select Action menu.
3. Click Go.
The first column of the Roles and Assigned Permissions table lists all the tasks available to
users of the TS7700. Subsequent columns show the assigned permissions for the selected
role (or roles). A check mark denotes permitted tasks for a user role. A null dash (-) denotes
prohibited tasks for a user role.
Permissions for predefined user roles cannot be modified. The user can name and define up
to 10 different custom roles, if necessary. The user can modify permissions for custom roles
in the Roles and Assigned Permissions table. The user can modify only one custom role at a
time.
Note: Valid characters for this field are A-Z, 0-9, $, @, *, #, and %. The first character of
this field cannot be a number.
Remember: The user can apply the permissions of a predefined role to a custom role
by selecting a role from the Role Template menu and clicking Apply. The user can then
customize the permissions by selecting or clearing tasks.
The Certificates table displays the following identifying information for SSL certificates on the
cluster:
Alias: A unique name to identify the certificate on the system.
Issued To: The distinguished name of the entity requesting the certificate.
Fingerprint: A number that specifies the Secure Hash Algorithm (SHA) of the certificate.
This number can be used to verify the hash for the certificate at another location, such as
the client side of a connection.
To import a new SSL certificate, select New Certificate from the top of the table, which
displays a wizard dialog.
To retrieve a certificate from the server, select Retrieve certificate from server and click
Next. Enter the host and port from which the certificate is retrieved and click Next. The
certificate information is retrieved. The user must set a unique alias in this window. To
import the certificate, click Finish. To abandon the operation and close the window, click
Cancel. The user can also go back to the Retrieve Signer Information window.
To upload a certificate, select Upload a certificate file and click Next. Click the Upload
button, select a valid certificate file, and click Next. Verify that the certificate information
(serial number, issued to, issued by, fingerprint, and expiration) is displayed on the wizard.
Fill the alias field with valid characters. When the Finish button is enabled, click Finish.
Verify that the trusted certificate was successfully added in the SSL Certificates table. To
abandon the operation and close the window, click Cancel. The user can also go back to
the Retrieve Signer Information window.
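The Fingerprint value can be checked independently of the MI. The sketch below, which assumes a hypothetical host name and port, retrieves a server certificate with Python's standard library and prints a SHA-256 fingerprint for comparison (the MI may display a different SHA variant).

# Sketch: retrieve a server certificate and compute a SHA fingerprint
# for comparison with the value shown in the MI Certificates table.
# Host and port are hypothetical placeholders.
import ssl
import hashlib

def certificate_fingerprint(host, port=443):
    pem = ssl.get_server_certificate((host, port))   # PEM text
    der = ssl.PEM_cert_to_DER_cert(pem)               # raw DER bytes
    digest = hashlib.sha256(der).hexdigest()
    # Format as colon-separated pairs, the usual fingerprint notation
    return ":".join(digest[i:i + 2].upper() for i in range(0, len(digest), 2))

print(certificate_fingerprint("keyserver.example.com", 443))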
The user can back up these settings as part of the TS7700_cluster<cluster ID>.xmi file and
restore them for later use or use with another cluster.
Customer IP addresses tab
Use this tab to set or modify the MI IP addresses for the selected cluster. Each cluster is
associated with two routers or switches. Each router or switch is assigned an IP address
and one virtual IP address is shared between routers or switches.
The address values can be in IPv4 or IPv6 format. A maximum of three DNS servers can be
added. Any spaces that are entered in this field are removed.
To submit changes, click Submit. If the user changes apply to the accessing cluster, a
warning message is displayed that indicates that the current user access will be interrupted.
To accept changes to the accessing cluster, click OK. To reject changes to the accessing
cluster and return to the IP addresses tab, click Cancel.
To reject the changes that are made to the IP addresses fields and reinstate the last
submitted values, select Reset. The user can also refresh the window to reinstate the last
submitted values for each field.
Encrypt Grid Communication tab
Use this tab to encrypt grid communication between specific clusters.
Note: These settings can be backed up by using the Backup Settings function under the
Cluster Settings tab and restored for later use. When the backup settings are restored,
new settings are added but no settings are deleted. The user cannot restore feature
license settings to a cluster that is different from the cluster that created the
ts7700_cluster<cluster ID>.xmi backup file. After restoring feature license settings on a
cluster, log out and then log in to refresh the system.
Use the menu on the Currently activated feature licenses table to activate or remove a feature
license. The user can also use this menu to sort and filter feature license details.
Use this page to configure SNMP traps that log events, such as logins, configuration
changes, status changes (vary on, vary off, or service prep), shutdown, and code updates.
SNMP is a networking protocol that enables a TS7700 to automatically gather and transmit
information about alerts and status to other entities in the network.
SNMP settings
Use this section to configure global settings that apply to SNMP traps on an entire cluster.
The following settings are configurable:
SNMP Version. The SNMP version. It defines the protocol that is used in sending SNMP
requests and is determined by the tool that is used to monitor SNMP traps. Different
versions of SNMP traps work with different management applications. The following
values are possible:
– V1. The suggested trap version that is compatible with the greatest number of
management applications. No alternative version is supported.
– V2. An alternative trap version.
– V3. An alternative trap version.
Enable SNMP Traps. A check box that enables or disables SNMP traps on a cluster. A
checked box enables SNMP traps on the cluster; a cleared box disables SNMP traps
on the cluster. The check box is cleared, by default.
Trap Community Name. The name that identifies the trap community and is sent
along with the trap to the management application. This value behaves as a password;
the management application will not process an SNMP trap unless it is associated with
the correct community. This value must be 1 - 15 characters in length and composed of
Unicode characters. The default value for this field is public.
Send Test Trap. Select this button to send a test SNMP trap to all destinations listed in the
Destination Settings table by using the current SNMP trap values. The Enable SNMP
Traps check box does not need to be checked to send a test trap. If the SNMP test trap is
received successfully and the information is correct, click Submit Changes.
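Send Test Trap exercises the path from the cluster to the listed destinations. As a complementary check from a monitoring host, a sketch like the following can emit a comparable SNMPv1 test trap with the pysnmp (4.x) library; the destination address, port 162, and community name are assumptions that must match the Destination Settings.

# Sketch: send an SNMPv1 test trap (generic coldStart) to a trap receiver.
# Destination address, port, and community are hypothetical placeholders.
from pysnmp.hlapi import (SnmpEngine, CommunityData, UdpTransportTarget,
                          ContextData, ObjectIdentity, NotificationType,
                          sendNotification)

errorIndication, errorStatus, errorIndex, varBinds = next(
    sendNotification(
        SnmpEngine(),
        CommunityData('public', mpModel=0),         # mpModel=0 selects SNMPv1
        UdpTransportTarget(('203.0.113.10', 162)),  # trap destination and port
        ContextData(),
        'trap',
        NotificationType(ObjectIdentity('1.3.6.1.6.3.1.1.5.1'))  # coldStart
    )
)
print(errorIndication or "test trap sent")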
Submit Changes. Select this button to submit changes to any of the global settings,
including the fields SNMP Version, Enable SNMP Traps, and Trap Community Name.
Destination Settings. Use the Destination Settings table to add, modify, or delete a
destination for SNMP trap logs. The user can add, modify, or delete a maximum of 16
destination settings at one time.
Note: A user with read-only permissions cannot modify the contents of the Destination
Settings table.
Tip: A valid IPv4 address is 32 bits long, consists of four decimal numbers, each 0 -
255, separated by periods, such as 98.104.120.12.
A valid IPv6 address is a 128-bit long hexadecimal value that is separated into 16-bit
fields by colons, such as 3afa:1910:2535:3:110:e8ef:ef41:91cf. Leading zeros can be
omitted in each field so that :0003: can be written as :3:. A double colon (::) can be
used once per address to replace multiple fields of zeros.
Port. The port to which the SNMP trap logs are sent. This value must be
0 - 65535. A value in this field is required.
Use the Select Action menu on the Destination Settings table to add, modify, or delete an
SNMP trap destination. Destinations are changed in the vital product data (VPD) as soon as
they are added, modified, or deleted. These updates do not depend on clicking Submit
Changes.
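Destination values can be sanity-checked before they are entered. The following sketch applies the IPv4/IPv6 rules from the tip above and the 0 - 65535 port range by using Python's standard ipaddress module; it mirrors the documented constraints and is not the MI's own validation code.

# Sketch: validate an SNMP trap destination address (IPv4 or IPv6)
# and port before entering it in the Destination Settings table.
import ipaddress

def valid_destination(address, port):
    try:
        ipaddress.ip_address(address)      # accepts IPv4 and IPv6 forms,
    except ValueError:                     # including :: zero compression
        return False
    return 0 <= port <= 65535

print(valid_destination("98.104.120.12", 162))                        # True
print(valid_destination("3afa:1910:2535:3:110:e8ef:ef41:91cf", 162))  # True
print(valid_destination("3afa::200:2535:e8ef:91cf", 70000))           # False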
Tip: This window is visible only if at least one instance of FC5271 (selective device access
control (SDAC)) is installed on all clusters in the grid.
The user can use the Access Groups table to create a library port access group. Also, the
user can modify or delete an existing access group.
Use the Select Action menu on the Access Groups table to add, modify, or delete a
library port access group.
Description. A description of the access group (a maximum of 70 characters).
Access Groups Volume Ranges. The Access Groups Volume Ranges table displays
VOLSER range information for existing library port access groups. The user can also use
the Select Action menu on this table to add, modify, or delete a VOLSER range that is
defined by a library port access group.
Start VOLSER. The first VOLSER in the range that is defined by an access group.
End VOLSER. The last VOLSER in the range that is defined by an access group.
Access Group. The identifying name of the access group, which is defined by the Name
field in the Access Groups table.
Use the Select Action menu on the Access Group Volume Ranges table to add, modify,
or delete a VOLSER range that is associated with a library port access group. The user
can show the inserted volume ranges. To view the current list of virtual volume ranges in
the TS7700 cluster, enter the start and end VOLSERs and click Show.
Note: Access groups and access group ranges are backed up and restored together.
For additional information, see “Backup settings” on page 497 and “Restore Settings
window” on page 500.
Cluster Settings
You can use the Cluster Settings to view or change settings that determine how a cluster runs
copy policy overrides, applies Inhibit Reclaim schedules, uses an EKS, implements write
protect mode, and runs backup and restore operations.
For an evaluation of different scenarios and examples where those overrides benefit the
overall performance, see 4.2, “Planning for a grid operation” on page 160.
Reminder: The items on this window can modify the cluster behavior regarding local copy
and certain I/O operations. Some LI REQUEST commands also can do this action.
Note: This override can be enabled independently of the status of the copies in the
cluster.
Note: These settings override the default TS7700 behavior and can be different for
every cluster in a grid.
This window is visible but disabled on the TS7700 MI if the grid possesses a physical library
but the selected cluster does not. In that case, a message is displayed indicating that the
cluster is not attached to a physical tape library.
Tip: This window is not visible on the TS7700 MI if the grid does not possess a physical
library.
Reclamation can improve tape usage by consolidating data on some physical volumes, but it
uses system resources and can affect host access performance. The Inhibit Reclaim
schedules function can be used to disable reclamation in anticipation of increased host
access to physical volumes.
Use the menu on the Schedules table to add a new Inhibit Reclaim schedule or to modify or
delete an existing schedule.
The values are the same as for the Add Inhibit Reclaim Schedule.
Note: Plan the Inhibit Reclaim schedules carefully. Running reclamation during peak times
can affect production, and inhibiting reclamation for too long increases media
consumption.
In the TS7700 subsystem, user data can be encrypted on tape cartridges by the encryption
capable tape drives that are available to the TS7700 tape-attached clusters. Also, data can be
encrypted by the full data encryption (FDE) DDMs in TS7700 TVC cache.
With R4.0, the TVC cache encryption can be configured either for local or external encryption
key management with 3956-CC9, 3956-CS9, and 3956-CSA cache types. Tape encryption
uses an out-of-band external key management. For more information, see Chapter 2,
“Architecture, components, and functional characteristics” on page 15.
Note: Only IBM Security Key Lifecycle Manager (SKLM) supports both external disk
encryption and TS1140 and TS1150 tape drives. The settings for Encryption Server are
shared for both tape and external disk encryption.
The IBM Security Key Lifecycle Manager for z/OS (ISKLM) external key manager supports
TS7700 physical tape but does not support TS7700 disk encryption.
There is a tutorial available on MI that shows the properties of the EKS. To watch it, click the
View tutorial link in the MI window. Figure 9-78 shows the Encryption Key Server Addresses
setup window.
If the cluster has the feature code for disk or tape encryption enabled, this window is visible
on the TS7700 MI.
Note: Some IP address ranges (for example, 10.x.x.x, 192.168.x.x, and 172.16.x.x through
172.31.x.x) are reserved for local (private) networks and are not routed over the Internet. If the
TS7700 and the key manager reside in different networks that are separated by the Internet,
those subnets must be avoided.
The EKS assists encryption-enabled tape drives in generating, protecting, storing, and
maintaining EKs that are used to encrypt information that is being written to, and to decrypt
information that is being read from, tape media (tape and cartridge formats). Also, EKS
manages the EK for the cache disk subsystem, with the External key management disk
encryption feature installed, which removes the responsibility of managing the key from the
3957 V07 or VEB engine, and from the disk subsystem controllers. To read more about data
encryption with TS7700, see Chapter 2, “Architecture, components, and functional
characteristics” on page 15.
The following settings are used to configure the TS7700 connection to an EKS:
Primary key server address
The key server name or IP address that is primarily used to access the EKS. This address
can be a fully qualified host name or an IP address in IPv4 or IPv6 format. This field is not
required if the user does not want to connect to an EKS.
A valid IPv6 address is a 128-bit long hexadecimal value separated into 16-bit fields by
colons, such as 3afa:1910:2535:3:110:e8ef:ef41:91cf. Leading zeros can be omitted
in each field so that :0003: can be written as :3:. A double colon (::) can be used once
per address to replace multiple fields of zeros. For example,
3afa:0:0:0:200:2535:e8ef:91cf can be written as: 3afa::200:2535:e8ef:91cf.
A fully qualified host name is a domain name that uniquely and absolutely names a
computer. It consists of the host name and the domain name. The domain name is one or
more domain labels that place the computer in the DNS naming hierarchy. The host name
and the domain name labels are separated by periods and the total length of the host
name cannot exceed 255 characters.
Primary key server port
The port number of the primary key server. Valid values are any whole number 0 - 65535;
the default value is 3801. This field is only required if a primary key address is used.
Secondary key server address
The key server name or IP address that is used to access the EKS when the primary key
server is unavailable. This address can be a fully qualified host name or an IP address in
IPv4 or IPv6 format. This field is not required if the user does not want to connect to an
EKS. See the primary key server address description for IPv4, IPv6, and fully qualified
host name value parameters.
Secondary key server port
The port number of the secondary key server. Valid values are any whole number
0 - 65535; the default value is 3801. This field is only required if a secondary key address
is used.
Using the Ping Test
Use the Ping Test buttons to check the cluster network connection to a key server after
changing a cluster’s address or port. If the user changes a key server address or port and
does not submit the change before using the Ping Test button, the user receives the following
message:
To perform a ping test you must first submit your address and/or port changes.
After the ping test starts, one of the following two messages is displayed:
– The ping test against the address “<address>” on port “<port>” was
successful.
– The ping test against the address “<address>” on port “<port>” from
“<cluster>” has failed. The error returned was: <error text>.
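Independently of the MI Ping Test, basic reachability of a key server can be verified from any host in the same network with a TCP connection attempt, as sketched below. The host names are hypothetical and 3801 is the documented default port; a successful connection shows only that the port is open, not that the key manager protocol works end to end.

# Sketch: out-of-band reachability check for an encryption key server.
# Addresses are hypothetical placeholders; 3801 is the documented default port.
import socket

def key_server_reachable(address, port=3801, timeout=5.0):
    try:
        with socket.create_connection((address, port), timeout=timeout):
            return True            # TCP connection accepted
    except OSError:
        return False               # refused, timed out, or unreachable

for host in ("keysrv1.example.com", "keysrv2.example.com"):
    print(host, "reachable:", key_server_reachable(host))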
Tip: The user can back up these settings as part of the ts7700_cluster<cluster ID>.xmi
file and restore them for later use or use with another cluster. If a key server address is
empty at the time that the backup is run, when it is restored, the port settings are the same
as the default values.
With FlashCopy in progress, no modifications are allowed on the Write Protect Mode window
until the FlashCopy testing is completed. When Write Protect Mode is enabled on a cluster,
host commands fail if they are sent to virtual devices in that cluster and attempt to modify a
volume’s data or attributes.
Note: FlashCopy is enabled through the LI REQ (Library Request Host Console) command.
Meanwhile, host commands that are sent to virtual devices in peer clusters are allowed to
continue with full read and write access to all volumes in the library. Write Protect Mode is
used primarily for client-initiated disaster recovery testing. In this scenario, a recovery host
that is connected to a non-production cluster must access and validate production data
without any risk of modifying it.
A cluster can be placed into Write Protect Mode only if the cluster is online. After the mode is
set, the mode is retained through intentional and unintentional outages and can be disabled
only through the same MI window that is used to enable the function. When a cluster within a
grid configuration has Write Protect Mode enabled, standard grid functions, such as virtual
volume replication and virtual volume ownership transfer, are unaffected.
Virtual volume categories can be excluded from Write Protect Mode. Up to 32 categories can
be identified and set to include or exclude from Write Protect Mode by using the Category
Write Protect Properties table. Additionally, write-protected volumes in any scratch category
can be mounted as private volumes if the Ignore Fast Ready characteristics of write-protected
categories check box is selected.
Table 9-12 Write Protect Mode settings for the active cluster
Setting Description
Write Protect State Displays the status of Write Protect Mode on the active cluster.
The following are the possible values:
Disabled: Write Protect mode is disabled. No Write Protect
settings are in effect.
Enabled: Write Protect Mode is enabled. Any host command
to modify volume data or attributes by using virtual devices in
this cluster will fail, subject to any defined category exclusions.
Write protect for Flash Copy enabled: Write Protect Mode is
enabled by the host. The Write Protect for Flash Copy function
was enabled through the LI REQ z/OS command and a DR test
is likely in progress.
Important: Write Protect Mode cannot be modified while
LI-REQ-initiated write protection for Flash Copy is enabled.
The user must disable it first by using the LI REQ command before
trying to change any Write Protect Mode settings on the
TS7700 MI:
– DR family: The name of the disaster recovery (DR) family
the Write Protect for Flash Copy was initiated against.
– Flash time: The date and time at which the Flash Copy was
enabled by the host. This mimics the time at which a real
disaster occurs.
Disable Write Protect Mode Select this to disable Write Protect Mode for the cluster.
Enable Write Protect Mode Select this option to enable Write Protect Mode for devices that are
associated with this cluster. When enabled, any host command
fails if it attempts to modify volume data or volume attributes
through logical devices that are associated with this cluster,
subject to any defined category exclusions. After Write Protect
Mode is enabled through the TS7700 Management Interface, it
persists through any outage and can be disabled only through the
TS7700 Management Interface.
Note: Write Protect Mode can only be enabled on a cluster if it is
online and no Write Protect Flash copy is in progress.
Ignore fast ready characteristics of write protected categories
Check this box to permit write-protected volumes that were returned to a scratch or a fast
ready category to be recognized as private volumes.
Note: Categories configured and displayed on this table are not replicated to other
clusters in the grid.
When Write Protect Mode is enabled, any categories added to this table must display a
value of Yes in the Excluded from Write Protect field before the volumes in that category
can be modified by an accessing host.
The following category fields are displayed in the Category Write Protect Properties table:
– Category Number: The identifier for a defined category. This is an alphanumeric
hexadecimal value between 0x0001 and 0xFEFF (0x0000 and 0xFFxx cannot be
used). Values that are entered do not include the 0x prefix, although this prefix is
displayed on the Cluster Summary window. Values that are entered are padded up to
four places. Letters that are used in the category value must be capitalized.
Use the menu on the Category Write Protect Properties table to add a category, or modify or
delete an existing category. The user must click Submit Changes to save any changes that
were made to the Write Protect Mode settings.
The user can add up to 32 categories per cluster when all clusters in the grid operate at code
level R3.1 or later.
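The category-number rules described above (hexadecimal 0x0001 - 0xFEFF, entered without the 0x prefix, uppercase letters, padded up to four places) can be expressed in a short validation sketch such as the following; it reflects the documented constraints and is not the MI's own code.

# Sketch: normalize and validate a write-protect category number using
# the rules stated above (0x0001 - 0xFEFF, entered without the 0x prefix,
# uppercase hex, padded to four places). Not the MI's actual validation.
def normalize_category(value):
    category = value.strip().upper().rjust(4, "0")   # pad up to four places
    number = int(category, 16)                       # ValueError if not hex
    # The 0x0001 - 0xFEFF range also excludes 0x0000 and all 0xFFxx values
    if not 0x0001 <= number <= 0xFEFF:
        raise ValueError(f"category {category} is outside the allowed range")
    return category

print(normalize_category("2"))      # 0002
print(normalize_category("fe00"))   # FE00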
The following Knowledge Center tips regarding disaster recovery tests can be valuable when
planning for DR tests. Also, see Chapter 13, “Disaster recovery testing” on
page 767.
Use the Write Protect Mode during a DR test to prevent any accidental DR host-initiated
changes to your production content.
During a DR test, avoid running housekeeping (return to scratch processing) within the DR
test host configuration unless the process specifically targets volumes only within the DR host
test range. Otherwise, even with the Selective Write Protect function enabled, the DR host can
attempt to return production volumes to scratch. This problem can occur because the tape
management system snapshot that is used for the DR test can interpret the volumes as
expired and ready for processing.
Never assume the return to scratch process acts only on DR test volumes. If Write Protect
Mode is enabled before DR testing and return to scratch processing is run on the DR host,
then the Selective Write Protect function prevents the return to scratch from occurring on
protected categories. Further, options in the tape management system can be used to
limit which volumes within the DR host are returned to scratch.
For example, in the DFSMSrmm tape management system, the VOLUMES or VOLUMERANGES
options can be used on the EXPROC command to limit volumes returned to scratch. When
tape management and write protect safeguards are used, protection against data loss
occurs both on the TS7700 and at the host.
Backup settings
Use this selection to back up the settings from a TS7700 cluster.
Important: Backup and restore functions are not supported between clusters operating at
different code levels. Only clusters operating at the same code level as the accessing
cluster (the one addressed by the web browser) can be selected for Backup or Restore.
Clusters operating different code levels are visible, but the options are disabled.
Table 9-13 lists the cluster settings that are available for backup (and restore) in a TS7700
cluster.
Among the settings listed are Library Port Access Groups (backed up from and restored to
any TS7700 cluster), Categories, Storage Groups, Management Classes, Session Timeout,
Account Expirations, Account Lock, and SNMP.
The Backup Settings table lists the cluster settings that are available for backup:
Constructs: Select this check box to select all of the following constructs for backup:
– Storage Groups: Select this check box to back up defined Storage Groups.
– Management Classes: Select this check box to back up defined Management
Classes.
– Storage Classes: Select this check box to back up defined Storage Classes.
– Data Classes: Select this check box to back up defined Data Classes.
Tape Partitions: Select this check box to back up defined tape partitions. Resident
partitions are not considered.
Inhibit Reclaim Schedule: Select this check box to back up the Inhibit Reclaim
Schedules that are used to postpone tape reclamation. If the cluster does not have an
attached tape library, then this option will not be available.
Library Port Access Groups: Select this check box to back up defined library port
access groups. Library port access groups and access group ranges are backed up
together.
Categories: Select this check box to back up scratch categories that are used to group
virtual volumes.
Physical Volume Ranges: Select this box to back up defined physical volume ranges. If
the cluster does not have an attached tape library, then this option will not be available.
Important: A restore operation after a backup of cluster settings does not restore or
otherwise modify any user, role, or password settings defined by a security policy.
Encryption Key Server Addresses: Select this check box to back up defined encryption
key server addresses, including the following:
– Primary key server address: The key server name or IP address that is primarily used
to access the encryption key server.
– Primary key server port: The port number of the primary key server.
– Secondary key server address: The key server name or IP address that is used to
access the encryption server when the primary key server is unavailable.
– Secondary key server port: The port number of the secondary key server.
Feature Licenses: Select this check box to back up the settings for currently activated
feature licenses.
Note: The user can back up these settings as part of the ts7700_cluster<cluster
ID>.xmi file and restore them for later use on the same cluster. However, the user
cannot restore feature license settings to a cluster different from the cluster that created
the ts7700_cluster<cluster ID>.xmi backup file.
Important: If the user navigates away from this window while the backup is in progress,
the backup operation is stopped and the operation must be restarted.
When the backup operation is complete, the backup file ts7700_cluster<cluster ID>.xmi is
created. This file is an XML Meta Interchange file. The user is prompted to open the backup
file or save it to a directory. Save the file. When prompted to open or save the file to a
directory, save the file without changing the .xmi file extension or the file contents.
Any changes to the file contents or extension can cause the restore operation to fail. The user
can modify the file name before saving it, if the user wants to retain this backup file after
subsequent backup operations. If the user chooses to open the file, do not use Microsoft Excel
to view or save it. Microsoft Excel changes the encoding of an XML Meta Interchange file,
and the changed file is corrupted when used during a restore operation.
Record these settings in a safe place and recover them manually if necessary.
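If there is any doubt whether a saved backup file was altered, a quick check such as the following sketch can be run before a restore is attempted. It only confirms that the file keeps its .xmi extension and still parses as XML, which does not guarantee that the settings inside are valid.

# Sketch: sanity-check a saved ts7700_cluster<cluster ID>.xmi backup file
# before attempting a restore. This only confirms the extension and that
# the file still parses as XML; it does not validate the settings inside.
import sys
import xml.etree.ElementTree as ET

def backup_file_looks_intact(path):
    if not path.lower().endswith(".xmi"):
        return False
    try:
        ET.parse(path)        # fails if the encoding or structure was damaged
        return True
    except ET.ParseError:
        return False

if __name__ == "__main__":
    print(backup_file_looks_intact(sys.argv[1]))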
Note: Backup and restore functions are not supported between clusters operating at
different code levels. Only clusters operating at the same code level as the current cluster
can be selected from the Current Cluster Selected graphic. Clusters operating at different
code levels are visible, but not available, in the graphic.
See Table 9-13 on page 497 for a quick Backup and Restore settings reference. Follow these
steps to restore cluster settings:
1. Use the banner bread crumbs to navigate to the cluster where the restore operation is
applied.
2. On the Restore Settings window, click Browse to open the File Upload window.
3. Go to the backup file used to restore the cluster settings. This file has an .xmi extension.
4. Add the file name to the File name field.
5. Click Open or press Enter from the keyboard.
6. Click Show file to review the cluster settings that are contained in the backup file.
The backup file can contain any of the following settings, but only those settings that are
defined by the backup file are shown:
Categories: Select this check box to restore scratch categories that are used to group
virtual volumes.
Physical Volume Pools: Select this check box to restore physical volume pool definitions.
Constructs: Select this check box to restore all of the displayed constructs. When these
settings are restored, new settings are added and existing settings are modified, but no
settings are deleted.
– Storage Groups: Select this check box to restore defined SGs.
– Management Classes: Select this check box to restore defined MCs.
MC settings are related to the number and order of clusters in a grid. Take special care
when restoring this setting. If an MC is restored to a grid that has more clusters than
the grid had when the backup was run, the copy policy for the new cluster or clusters
is set to No Copy.
If an MC is restored to a grid that has fewer clusters than the grid had when the backup
was run, the copy policy for the now-nonexistent clusters is changed to No Copy. The
copy policy for the first cluster is changed to RUN to ensure that one copy exists in the
cluster.
If cluster IDs in the grid differ from cluster IDs present in the restore file, MC copy
policies on the cluster are overwritten with those from the restore file. MC copy policies
can be modified after the restore operation completes.
If the backup file was created by a cluster that did not define one or more scratch
mount candidates, the default scratch mount process is restored. The default scratch
mount process is a random selection routine that includes all available clusters. MC
scratch mount settings can be modified after the restore operation completes.
– Storage Classes: Select this check box to restore defined SCs.
– Data Classes: Select this check box to restore defined DCs.
If this setting is selected and the cluster does not support logical Write Once Read
Many (LWORM), the Logical WORM setting is disabled for all DCs on the cluster.
– Inhibit Reclaim Schedule: Select this check box to restore Inhibit Reclaim schedules
that are used to postpone tape reclamation.
A current Inhibit Reclaim schedule is not overwritten by older settings. An earlier Inhibit
Reclaim schedule is not restored if it conflicts with an Inhibit Reclaim schedule that
currently exists. Media type volume sizes are restored based on the restrictions of the
restoring cluster. The following volume sizes are supported:
• 1000 MiB
• 2000 MiB
• 4000 MiB
• 6000 MiB
• 25000 MiB
Note: If the backup file was created by a cluster that did not possess a physical library,
the Inhibit Reclaim schedules settings are reset to default.
Library Port Access Groups: Select this check box to restore defined library port access
groups.
This setting is only available if all clusters in the grid are operating with Licensed Internal
Code levels of 8.20.0.xx or higher.
Library port access groups and access group ranges are backed up and restored together.
Important: Changes to network settings affect access to the TS7700 MI. When these
settings are restored, routers that access the TS7700 MI are reset. No TS7700 grid
communications or jobs are affected, but any current users are required to log back on
to the TS7700 MI by using the new IP address.
Feature Licenses: Select this check box to restore the settings for currently activated
feature licenses. When the backup settings are restored, new settings are added but no
settings are deleted. After restoring feature license settings on a cluster, log out and then
log in to refresh the system.
Note: The user cannot restore feature license settings to a cluster that is different from
the cluster that created the ts7700_cluster<cluster ID>.xmi backup file.
After selecting Show File, the name of the cluster from which the backup file was created is
displayed at the top of the window, along with the date and time that the backup occurred.
A warning window opens and prompts you to confirm the decision to restore settings. Click
OK to restore settings or Cancel to cancel the restore operation.
Important: If the user navigates away from this window while the restore is in progress, the
restore operation is stopped and the operation must be restarted.
The restore cluster settings operation can take 5 minutes or longer. During this step, the MI is
communicating the commands to update settings. If the user navigates away from this
window, the restore settings operation is canceled.
Restoring to or from a cluster without a physical library: If either the cluster that
created the backup file or the cluster that is performing the restore operation do not
possess a physical library, upon completion of the restore operation all physical tape
library settings are reset to default. One of the following warning messages is displayed on
the confirmation page:
The file was backed up from a system with a physical tape library attached
but this cluster does not have a physical tape library attached. If you
restore the file to this cluster, all the settings for physical tape library
will have default values.
The file was backed up from a cluster without a physical tape library
attached but this cluster has a physical tape library attached. If you
restore the file to this cluster, all the settings for physical tape library
will have default values.
The following settings are affected:
– Inhibit Reclaim Schedule
– Physical Volume Pools
– Physical Volume Ranges
Confirm restore settings: This page confirms that a restore settings operation is in
progress on the IBM TS7700 cluster.
Note: If this value is a Domain Name Server (DNS) address, then you must activate
and configure a DNS on the Cluster Network Settings window.
Note: To send logs to a syslog server, RSyslog must be enabled and the status of the
remote target must be Active. Otherwise, no logs will be sent.
Enable or Disable RSyslog: To change the state, select either Enable RSyslog or Disable
RSyslog.
Change the order of the remote targets: Click the Target Number column heading.
Table 9-15 lists the facilities used to collect the different log types that are sent using RSyslog.
Items 0 through 15 are system defaults, and items 16 through 18 are specific to TS7700.
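To confirm from the network side that a remote syslog target is reachable and accepts messages, a test message can be sent with Python's standard logging module, as sketched below. The target address, UDP port 514, and local0 facility are assumptions and must be adapted to the configured remote target.

# Sketch: send a test message to a remote syslog target to confirm that
# it is reachable. Address, port, and facility are hypothetical and must
# match the remote target configured for RSyslog forwarding.
import logging
import logging.handlers

handler = logging.handlers.SysLogHandler(
    address=("203.0.113.20", 514),                       # remote syslog target (UDP)
    facility=logging.handlers.SysLogHandler.LOG_LOCAL0)  # example facility
logger = logging.getLogger("rsyslog-test")
logger.addHandler(handler)
logger.warning("TS7700 RSyslog connectivity test message")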
The Number of physical volumes to export is the maximum number of physical volumes that
can be exported. This value is an integer 1 - 10,000. The default value is 2000. To change the
number of physical volumes to export, enter an integer in the described field and click
Submit.
Note: The user can modify this field even if a Copy Export operation is running, but the
changed value does not take effect until the next Copy Export operation starts.
For more information about Copy Export, see Chapter 12, “Copy Export” on page 735.
This page displays information for an entire grid if the accessing cluster is part of a grid, or for
only the accessing cluster if that cluster is a stand-alone machine. Use this page to modify
settings of the notifications that are generated by the system, such as Event, Host Message,
and Call Home Microcode Detected Error (MDE).
There are three types of notifications that are generated by the system:
Events (OPxxxx messages in a CBR3750I message on the host)
Host Message (Gxxxx, ALxxxx, EXXXX, Rxxxx in a CBR3750I message on the host)
Call Home MDE in SIM (in an IEA480E message on the host)
The Notification Settings page was introduced with the R4.1.2 level of code as part of the System
Events Redesign package. The new window allows the user to adjust different characteristics
for all CBR3750I messages. Also, the user is able to add personalized text to the messages,
making it easier to implement or manage automation-based monitoring in IBM z/OS.
The Notification Settings window also works as a catalog that can be used to search for
descriptions of the Event, Host Message, or Call Home MDE. From this window, the user can
send the text messages to all hosts attached to the subsystem by enabling the Host
Notification option. Figure 9-79 shows the Notification Settings window and options.
Custom Severity Custom severity of the notification. The following are the possible values:
Default
Informational
Warning
Impact
Serious/Error
Critical
Type The type of notification that is listed: Event, Host Message, or Call Home
MDE.
Notification State Whether the notification is active or inactive. If inactive, it will not be sent to
any notification channel.
Comments Field available to add user comments. The comments are sent with the
message, through the notification channels, when the message is triggered
by the system.
Note: Comments for Call Home MDEs will not be sent to Host. Only the
MDE code is sent to the Host.
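Because the user comments travel inside the CBR3750I message text, host automation can key on both the notification code and the personalized text. The following sketch assumes a simplified CBR3750I layout (library name, then a code such as OPxxxx, then free text) purely for illustration; the sample message, library name, and code are hypothetical.

# Sketch: pull the notification code and any custom comment out of a
# CBR3750I console message for automation. The layout assumed here
# (code followed by free text) is an illustration, not the documented format.
import re

SAMPLE = "CBR3750I Message from library GRIDLIB: OP0160 Custom comment text here."

def parse_cbr3750i(message):
    match = re.search(r"CBR3750I Message from library (\S+): (\w+)\s*(.*)", message)
    if not match:
        return None
    library, code, text = match.groups()
    return {"library": library, "code": code, "text": text}

print(parse_cbr3750i(SAMPLE))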
Table 9-17 shows the actions available from the Notifications table.
Enable or disable host notifications for alerts:
1. Select Actions → Modify Host Notification State by Cluster.
2. Select the cluster in the Modify Host Notification State by Cluster box.
3. Select Active, then OK to enable notifications, or select Inactive, then OK to disable
notifications.
Filter the table data by using a column heading:
1. Click the down arrow next to the Filter field.
2. Select the column heading to filter by.
3. Refine the selection.
The Ownership Takeover Mode option shows only when a cluster is a member of a grid,
whereas the Copy Export Recover and Copy Export Recovery Status options appear for a
single-cluster TS7700T configuration (that is, one connected to a physical library).
Note: The Copy Export Recover and Copy Export Recover Status options are only
available in a single cluster configuration for a tape-attached cluster.
Note: Keep the IP addresses for the clusters in the configuration available for use during a
failure of a cluster. In this way, the MI can be accessed from a surviving cluster to initiate the
ownership takeover actions.
When a cluster enters a failed state, enabling Ownership Takeover Mode enables other
clusters in the grid to obtain ownership of logical volumes that are owned by the failed cluster.
Normally, ownership is transferred from one cluster to another through communication
between the clusters. When a cluster fails or the communication links between clusters fail,
the normal means of transferring ownership is not available.
Read/write or read-only takeover should not be enabled when only the communication path
between the clusters has failed, and the isolated cluster remains operational. The integrity of
logical volumes in the grid can be compromised if a takeover mode is enabled for a cluster
that is only isolated from the rest of the grid (not failed) and there is active host access to it.
A takeover decision should be made only for a cluster that is indeed no longer operational.
AOTM, when available and configured, verifies the real status of the non-responsive cluster
by using an alternative communication path other than the usual connection between
clusters. AOTM uses the TSSC associated with each cluster to determine whether the cluster
is alive or failed, enabling the ownership takeover only in case the unresponsive cluster has
indeed failed. If the cluster is still alive, AOTM does not initiate a takeover, and the decision is
up to the human operator.
When Read Only takeover mode is enabled, those volumes requiring takeover are read-only,
and fail any operation that attempts to modify the volume attributes or data. Read/write
takeover enables full read/write access of attributes and data.
If an ownership takeover mode is enabled when only a WAN/LAN failure is present, read/write
takeover should not be used because it can compromise the integrity of the volumes that are
accessed by both isolated groups of clusters.
If full read/write access is required, one of the isolated groups should be taken offline to
prevent any use case where both groups attempt to modify the same volume. Figure 9-81
shows the Ownership Takeover Mode window.
Figure 9-81 shows the local cluster summary, the list of available clusters in the grid, and the
connection state between the local (accessing) cluster and its peers. It also shows the current
takeover state for the peer clusters (whether enabled or disabled by the accessing cluster) and
the current takeover mode.
Here is the manual procedure to start an ownership takeover against a failed cluster:
1. Log on to the MI of a surviving cluster to perform the takeover intervention.
2. Go to the Ownership Takeover Mode by clicking Service → Ownership Takeover Mode.
3. Select the failed cluster (the one to be taken over).
4. In the Select Action box, select the appropriate Ownership takeover mode (RW or RO).
5. Click Go, and retry the host operation that failed.
Figure 9-81 on page 510 shows that AOTM was previously configured in this grid (for
Read/Write, with a grace period of 405 minutes for Cluster 0). In this case, automatic
ownership takeover takes place at the end of that period (6 hours and 45 minutes). A human
operator can override that setting manually by taking the actions that are described
previously, or by changing the AOTM settings to more suitable values. You can use the
Configure AOTM button to configure the values that are displayed in the previous AOTM
Configuration table.
Important: An IBM SSR must configure the TSSC IP addresses for each cluster in the grid
before AOTM can be enabled and configured for any cluster in the grid.
Table 9-18 compares the operation of read/write and read-only ownership takeover modes.
Read/write ownership takeover mode:
Operational clusters in the grid can run these tasks:
– Perform read and write operations on the virtual volumes that are owned by the failed
cluster.
– Change virtual volumes that are owned by the failed cluster to private or SCRATCH status.
A consistent copy of the virtual volume must be available on the grid, or the virtual volume
must exist in a scratch category. If no cluster failure occurred (grid links down) and the
ownership takeover was started by mistake, the possibility exists for two sites to write data to
the same virtual volume.
Read-only ownership takeover mode:
Operational clusters in the grid can run these tasks:
– Perform read operations on the virtual volumes that are owned by the failed cluster.
Operational clusters in the grid cannot run these tasks:
– Change the status of a volume to private or scratch.
– Perform read and write operations on the virtual volumes that are owned by the failed
cluster.
If no cluster failure occurred, it is possible that a virtual volume that is accessed by another
cluster in read-only takeover mode contains older data than the one on the owning cluster.
This situation can occur if the virtual volume was modified on the owning cluster while the
communication path between the clusters was down. When the links are reestablished, those
volumes are marked in error.
See the TS7700 IBM Knowledge Center available locally by clicking the question mark icon in
the upper right corner of the MI window, or online at the following website:
https://www.ibm.com/support/knowledgecenter/STFS69_4.1.2/hydra_c_ichome.html
In these cases, the volume is moved to the FF20 (damaged) category by the TS7700
subsystem, and the host cannot access it. If access is attempted, messages like CBR4125I
Valid copy of volume volser in library library-name inaccessible are displayed.
To repair virtual volumes in the damaged category for the TS7700 Grid, use the Repair virtual
volumes page.
The user can print the table data by clicking Print report. A comma-separated value (.csv) file
of the table data can be downloaded by clicking Download spreadsheet. The following
information is displayed on this window:
Repair policy. The Repair policy section defines the repair policy criteria for damaged
virtual volumes in a cluster. The following criteria are shown:
– Cluster’s version to keep. The selected cluster obtains ownership of the virtual
volume when the repair is complete. This version of the virtual volume is the basis for
repair if the Move to insert category keeping all data option is selected.
– Move to insert category keeping all data. This option is used if the data on the virtual
volume is intact and still relevant. If data has been lost, do not use this option. If the
cluster that is chosen in the repair policy has no data for the virtual volume to be
repaired, choosing this option is the same as choosing Move to insert category
deleting all data.
– Move to insert category deleting all data. The repaired virtual volumes are moved to
the insert category and all data is erased. Use this option if the volume is returned to
scratch or if data loss has rendered the volume obsolete. If the volume has been
returned to scratch, the data on the volume is no longer needed. If data loss has
occurred on the volume, data integrity issues can occur if the data on the volume is not
erased.
– Damaged Virtual Volumes. The Damaged Virtual Volumes table displays all the
damaged virtual volumes in a grid. The following information is shown:
• Virtual Volume. The VOLSER of the damaged virtual volume. This field is also a
hyperlink that opens the Damaged Virtual Volumes Details window, where more
information is available.
Damaged virtual volumes cannot be accessed; repair all damaged virtual volumes
that appear on this table. The user can repair up to 10 virtual volumes at a time.
Figure 9-82 shows the navigation to the Network Diagnostics window and a ping test
example.
– IP Address/Hostname: The target IP address or host name for the selected network
test. The value in this field can be an IP address in IPv4 or IPv6 format or a fully
qualified host name.
If the user is experiencing a performance issue on a TS7700, the user has two options to
collect system data for later troubleshooting. The first option, System Snapshot, collects a
summary of system data that includes the performance state. This option is useful for
intermittently checking the system performance. This file is built in approximately 5 minutes.
The second option, TS7700 Log Collection, enables you to collect historical system
information for a time period up to the past 12 hours. This option is useful for collecting data
during or soon after experiencing a problem. Based on the number of specified hours, this file
can become large and require over an hour to build.
Note: Periods that are covered by TS7700 Log Collection files cannot overlap. If the
user attempts to generate a log file that includes a period that is covered by an existing
log file, a message prompts the user to select a different value for the hours field.
Note: Data that is collected during this operation is not automatically forwarded to IBM.
The user must contact IBM and open a problem management report (PMR) to manually
move the collected data off the system.
When data collection is started, a message is displayed that contains a button linking to
the Tasks window. The user can click this button to view the progress of data collection.
Important: If data collection is started on a cluster that is in service mode, the user
might not be able to check the progress of data collection. The Tasks window is not
available for clusters in service mode, so there is no link to it in the message.
Data Collection Limit Reached. This dialog box opens if the maximum number of
System Snapshot or TS7700 Log Collection files exists. The user can save a maximum
number of 24 System Snapshot files or two TS7700 Log Collection files. If the user
attempted to save more than the maximum of either type, the user is prompted to delete
the oldest existing version before continuing. The name of any file to be deleted is
displayed.
Click Continue to delete the oldest files and proceed. Click Cancel to abandon the data
collection operation.
Problem Description. Optional: In this field, enter a detailed description of the conditions or
problem that was experienced before data collection was initiated. Include
symptoms and any information that can assist IBM Support in the analysis process,
including the description of the preceding operation, VOLSER ID, device ID, any host error
codes, any preceding messages or events, time and time zone of the incident, and any PMR
number (if available). The number of characters in this description cannot exceed 1000.
Copy Export enables the export of all virtual volumes and the virtual volume database to
physical volumes, which can then be ejected and saved as part of a data retention policy for
disaster recovery. The user can also use this function to test system recovery.
For a detailed explanation of the Copy Export function, see Chapter 12, “Copy Export” on
page 735. Also, see the IBM TS7700 Series Copy Export Function User's Guide, available
online at:
https://www.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/WP101092
Reminder: The recovery cluster needs tape drives that are compatible with the exported
media. Also, if encrypted tapes are used for export, access to the EKs must be provided.
Before the user attempts a Copy Export recovery, ensure that all physical media that is used in the
recovery is inserted. During a Copy Export recovery, all current virtual and physical volumes
are erased from the database and virtual volumes are erased from the cache. Do not attempt
a Copy Export recovery operation on a cluster where current data is to be saved.
The original physical volume copy is deleted without overwriting the virtual volumes. When
the Copy Export operation is rerun, the new, active version of the data is used.
The following fields and options are presented to the user to help testing recovery or running
a recovery:
Volser of physical stacked volume for Recovery Test. The physical volume from which
the Copy Export recovery attempts to recover the database.
Disaster Recovery Test Mode. This option determines whether a Copy Export Recovery
is run as a test or to recover a system that has suffered a disaster. If this box contains a
check mark (default status), the Copy Export Recovery runs as a test. If the box is cleared,
the recovery process runs in normal mode, as when recovering from an actual disaster.
When the recovery is run as a test, the content of exported tapes remains unchanged.
Additionally, primary physical copies remain unrestored and reclaim processing is
disabled to halt any movement of data from the exported tapes.
Any new volumes that are written to the system are written to newly added scratch tapes,
and do not exist on the previously exported volumes. This ensures that the data on the
Copy Export tapes remains unchanged during the test.
In contrast to a test recovery, a recovery in normal mode (box cleared) rewrites virtual
volumes to physical storage if the constructs change so that the virtual volume’s data can
be put in the correct pools. Also, in this type of recovery, reclaim processing remains
enabled and primary physical copies are restored, requiring the addition of scratch
physical volumes.
A recovery that is run in this mode enables the data on the Copy Export tapes to expire in
the normal manner and those physical volumes to be reclaimed.
Note: The number of virtual volumes that can be recovered depends on the number of
FC5270 licenses that are installed on the TS7700 that is used for recovery. Additionally,
a recovery of more than 2 million virtual volumes must be run by a TS7740 operating
with a 3957-V07 and a code level of 8.30.0.xx or higher.
Erase all existing virtual volumes during recovery. This check box is shown if virtual
volume or physical volume data is present in the database. A Copy Export Recovery
operation erases any existing data. No option exists to retain existing data while running
the recovery. The user can check this check box to proceed with the Copy Export
Recovery operation.
Submit. Click this button to initiate the Copy Export Recovery operation.
Important: The Copy Export recovery status is only available for a stand-alone TS7700T
cluster.
The table on this window displays the progress of the current Copy Export recovery operation.
This window includes the following information:
Total number of steps. The total number of steps that are required to complete the Copy
Export recovery operation.
Current step number. The number of steps completed. This value is a fraction of the total
number of steps that are required to complete, not a fraction of the total time that is
required to complete.
Start time. The time stamp for the start of the operation.
Duration. The amount of time the operation has been in progress, in hours, minutes, and
seconds.
Status. The status of the Copy Export recovery operation. The following values are
possible:
– No task. No Copy Export operation is in progress.
– In progress. The Copy Export operation is in progress.
– Complete with success. The Copy Export operation completed successfully.
– Canceled. The Copy Export operation was canceled.
– Complete with failure. The Copy Export operation failed.
– Canceling. The Copy Export operation is in the process of cancellation.
Operation details. This field displays informative status about the progress of the Copy
Export recovery operation.
Cancel Recovery. Click the Cancel Recovery button to end a Copy Export recovery
operation that is in progress and erase all virtual and physical data. The Confirm Cancel
Operation dialog box opens to confirm the decision to cancel the operation. Click OK to
cancel the Copy Export recovery operation in progress. Click Cancel to resume the Copy
Export recovery operation.
These interfaces and facilities are part of the IBM System Storage Data Protection and
Retention (DP&R) storage system. The main objective of this mechanism is to provide a safe
and efficient way to deliver the System Call Home (outbound) and Remote Support (inbound)
connectivity capabilities.
For a complete description of the connectivity mechanism and related security aspects, see
IBM Data Retention Infrastructure (DRI) System Connectivity and Security, found at:
https://www.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/WP102531
The Call Home function generates a service alert automatically when a problem occurs with
one of the following:
TS7720
TS7740
TS7760
TS3500 tape library
TS4500 tape library
Error information is transmitted to the TSSC for service, and then to the IBM Support Center
for problem evaluation. The IBM Support Center can dispatch an IBM SSR to the client
installation. Call Home can send the service alert to a window service to notify multiple
people, including the operator. The IBM SSR can deactivate the function through service
menus, if required.
The TSSC attaches to multiple subsystems, such as TS3500 and TS4500 tape libraries
(including a TS3500 shuttle complex), and TS7740, TS7700T, TS7700D, TS7650, and
TS7650G systems.
The TSSC can be ordered as a rack mount feature for a range of products. Feature Code
2725 provides the enhanced TS3000 TSSC. Physically, the TS3000 TSSC is a standard rack
1U mountable server that is installed within the 3592 F05 or F06 frame.
Feature code 2748 provides an optical drive, which is needed for the Licensed Internal Code
changes and log retrieval. With the new TS3000 TSSC provided by FC2725, remote data link
or call home by using an analog telephone line and modem is no longer supported. Dial-in
function through Assist On-site (AOS) and Call Home with ECC functions are both available
using an HTTP/HTTPS broadband connection.
Note: Modem option is no longer offered with the latest TSSC FC 2725. All modem-based
call home and remote support was discontinued at the end of 2017.
The TSSC enables the use of a proxy server or direct connection. Direct connection implies
that there is not an HTTP proxy between the configured TS3000 and the outside network to
IBM. Selecting this method requires no further setup. ECC supports customer-provided HTTP
proxy. Additionally, a customer might require all traffic to go through a proxy server. In this
case, the TSSC connects directly to the proxy server, which initiates all communications to
the Internet.
Note: All inbound connections are subject to the security policies and standards that are
defined by the client. When a Storage Authentication Service, Direct Lightweight Directory
Access Protocol (LDAP), or RACF policy is enabled for a cluster, service personnel (local
or remote) are required to use the LDAP-defined service login.
Important: Be sure that local and remote authentication is allowed, or that an account is
created to be used by service personnel, before enabling storage authentication, LDAP, or
RACF policies.
The outbound communication that is associated with ECC call home can be through an
Ethernet connection, a modem, or both in a failover setup. A modem is not supported on the
new TS3000 TSSC. The local subnet LAN connection between the TSSC and the attached
subsystems remains the same: It is still isolated, without any outside access.
ECC adds another Ethernet connection to the TSSC, bringing the total number to three.
These connections are labeled:
The External Ethernet Connection, which is the ECC Interface
The Grid Ethernet Connection, which is used for the TS7700 Autonomic Ownership
Takeover Manager (AOTM)
The Internal Ethernet Connection, which is used for the local attached subsystem’s subnet
Note: The AOTM and ECC interfaces should be in different TCP/IP subnets. This setup
prevents both types of communication from using the same network connection.
All of these connections are set up using the Console Configuration Utility User Interface that
is on the TSSC. TS7700 events that start a Call Home are displayed in the Events window
under the Monitor icon.
AOS uses the same network as broadband call home, and works on either HTTP or HTTPS.
Although the same physical Ethernet adapter is used for these functions, different ports must
be opened in the firewall for the different functions. For more information, see 4.1.3, “TCP/IP
configuration considerations” on page 146. The AOS function is disabled by default.
All AOS connections are outbound, so no connection is initiated from the outside to the
TSSC. It is always the TSSC that initiates the connection. In unattended mode, the TSSC
periodically connects to the regional AOS relay servers and checks whether a session
request exists. When a session request exists, AOS authenticates and establishes the
connection, allowing remote access to the TSSC.
Assist On-site uses current security technology to ensure that the data that is exchanged
between IBM Support engineers and the TSSC is secure. Identities are verified and protected
with industry-standard authentication technology, and Assist On-site sessions are kept secure
and private by using randomly generated session keys and advanced encryption.
Note: All authentications are subject to the authentication policy that is in effect, as
described in 9.3.9, “The Access icon” on page 464.
Important:
Each tape attach TS7700 cluster requires its own logical library in the tape library.
The ALMS feature must be installed and enabled to define a logical library partition in
both TS3500 and TS4500 tape libraries.
You can check the status of ALMS with the TS3500 tape library GUI by clicking Library →
ALMS, as shown in Figure 9-84.
Figure 9-84 TS3500 tape library GUI Summary and ALMS window
When ALMS is enabled for the first time in a partitioned TS3500 tape library, the contents of
each partition are migrated to ALMS logical libraries. When enabling ALMS in a
non-partitioned TS3500 tape library, cartridges that are already in the library are migrated to
the new ALMS single logical library.
Notice that the TS4500 GUI features selected presets, which help in the setup of the new
logical library. For the TS7700 library, use the TS7700 option that is highlighted in
Figure 9-86. This option uses the 3592 tape drives that are not assigned to any existing
logical library within the TS4500 tape library. Also, it selects up to four drives as control paths,
distributing them in two separate frames, when possible.
Note: The TS7700 preset is disabled when fewer than four unassigned tape drives are
available to create a new logical library.
The preset also indicates the System Managed encryption method for the new TS7700 logical
library.
Figure 9-88 Defining the new Logical Library for the TS7700T
After the configuration of the logical library is completed, your IBM service representative can
complete the TS7700 tape-attached cluster installation, and the tape cartridges can be
inserted in the TS4500 tape library.
Figure 9-89 Creating a Logical Library with the TS3500 tape library
Make sure that the new logical library has the eight-character Volser reporting option set.
Another item to consider is VIO usage: whether VIO is enabled and, if so, how many cells
should be defined. For more information about virtual I/O slots and their applicability, see the
TS3500 tape library documentation:
http://www.ibm.com/support/knowledgecenter/STCMML8/com.ibm.storage.ts3500.doc/ipg_
3584_a69p0vio.html
Assigning drives
Now, the TS7700T tape drives should be added to the logical library.
From the Logical Libraries window that is shown in Figure 9-91 on page 529, use the work
items on the left side of the window to go to the required window by clicking Drives →
Drive Assignment. This link takes you to a filtering window where you can choose to have the
drives displayed by drive element or by logical library.
Note: For the 3592 J1A, E05, E06, and E07 drives, an intermix of tape drive models is not
supported by TS7720T or TS7740, except for 3592-E05 tape drives working in J1A
emulation mode and 3592-J1A tape drives (the first and second generation of the 3592
tape drives).
Upon selection, a window opens so that a drive can be added to or removed from a library
configuration. Also, you can use this window to share a drive between Logical Libraries and
define a drive as a control path.
Figure 9-91 on page 529 shows the drive assignment window of a logical library that has all
drives assigned.
Unassigned drives appear in the Unassigned column with the box checked. To assign them,
select the appropriate drive box under the logical library name and click Apply.
TS7700 R4.0 works with the TS1150 tape drives in a homogeneous or heterogeneous
configuration. A heterogeneous configuration of the tape drives means a mix of TS1150 (3592
E08) drives and one previous generation of the 3592 tape drives to facilitate data migration
from legacy media. Tape drives from the previous generation are used only to read legacy
media (JA/JB), while the TS1150 drives read and write the newer media types. Because there
are no writes to the legacy media types, the support for a heterogeneous configuration of the
tape drives is deemed limited.
You can read more about heterogeneous drive support in Chapter 2, “Architecture,
components, and functional characteristics” on page 15 and Chapter 8, “Migration” on
page 299.
Note: Do not change drive assignments that belong to an operating TS7700T or tape
controller. Work with your IBM SSR, if necessary.
In an IBM Z environment, a tape drive always attaches to one tape control unit (CU) only. If it
is necessary to change the assignment of a tape drive from a TS7700T, the CU must be
reconfigured to reflect the change. Otherwise, the missing resource is reported as defective to
the MI and hosts. Work with your IBM SSRs to perform these tasks in the correct way,
avoiding unplanned outages.
In addition, never disable ALMS at the TS3500 tape library after it is enabled for IBM Z host
support and IBM Z tape drive attachment.
Reminders:
When using encryption, tape drives must be set to Native mode.
To activate encryption, FC9900 must have been ordered for the TS7740 or the
TS7720T, and the license key must be installed. In addition, the associated tape drives
must be Encryption Capable 3592-E05, 3592-E06, 3592-E07, or 3592-E08.
3. In the next window that opens, select the native mode for the drive. After the drives are in
the wanted mode, proceed with the Encryption Method definition.
4. In the TS3500 MI, click Library → Logical Libraries, select the logical library with which
you are working, select Modify Encryption Method, and then click Go. See Figure 9-94.
To make encryption fully operational in the TS7740 configuration, more steps are necessary.
Work with your IBM SSR to configure the Encryption parameters in the TS7740 during the
installation process.
To add, change, and remove policies, select Cartridge Assignment Policy from the
Cartridges work items. The maximum quantity of CAPs for the entire TS3500 tape library
must not exceed 300 policies.
The TS3500 tape library enables duplicate VOLSER ranges for different media types only. For
example, Logical Library 1 and Logical Library 2 contain Linear Tape-Open (LTO) media, and
Logical Library 3 contains IBM 3592 media. Logical Library 1 has a CAP of ABC100-ABC200.
The library rejects an attempt to add a CAP of ABC000-ABC300 to Logical Library 2 because
the media type is the same (both LTO). However, the library does enable an attempt to add a
CAP of ABC000-ABC300 to Logical Library 3 because the media (3592) is different.
Tip: The CAP does not reassign an already assigned tape cartridge. If needed, you must
first unassign it, then manually reassign it.
These procedures ensure that TS7700 back-end cartridges are never assigned to a host by
accident. Figure 9-97 shows the flow of physical cartridge insertion and assignment to logical
libraries for TS7740 or TS7720T.
Important: Cartridges that are not in a CAP range (TS3500) or associated to any logical
library (TS4500) are not assigned to any logical library.
After completing the new media insertion, close the doors. After approximately 15 seconds,
the tape library automatically inventories the frame or frames of the door you opened.
Tip: Only place cartridges in a frame whose front door is open. Do not add or remove
cartridges from an adjacent frame.
Basically, with VIO enabled, the tape library moves the cartridges from the physical I/O station
into the physical library by itself. Initially, the cartridge leaves the physical I/O station
and goes into a slot that is mapped as a VIO slot (a SCSI element between 769 (X’301’)
and 1023 (X’3FF’)) for the logical library that is designated by the VOLSER association or CAP.
Each logical library has its own set of up to 256 VIO slots, as defined during logical library
creation or later.
With VIO disabled, the tape library does not move cartridges from the physical I/O station
unless it receives a command from the TS7700T or any other host in control.
In both cases, the tape library detects the presence of cartridges in the I/O station when it
transitions from open to closed, and scans all I/O cells by using the bar code reader. The CAP
or VOLSER assignment determines to which logical library those cartridges belong and then
runs one of the following tasks:
Moves them to the VIO slots of the designated logical library, with VIO enabled.
Waits for a host command in this logical library. The cartridges stay in the I/O station after
the bar code scan.
The volumes being inserted should belong to the range of volumes that is defined in the
tape library (CAP or VOLSER range) for the TS7700 logical library, and those ranges also should
be defined in the TS7700 Physical Volume Range as described in “Defining VOLSER ranges
for physical volumes” on page 539. Both conditions must be met for a physical cartridge to be
successfully inserted into the TS7700T.
If any VOLSER is not in the range that is defined by the policies, the cartridges need to be
assigned to the correct logical library manually by the operator.
Note: Make sure that CAP ranges are correctly defined. Insert Notification is not supported
on a high-density library. If a cartridge outside the CAP-defined ranges is inserted, it
remains unassigned without any notification, and it might be checked in by any logical
library of the same media type.
When volumes that belong to a logical library are found unassigned, correct the CAP or VOLSER
assignment definitions, and reinsert the volumes. Optionally, the cartridges can be manually
assigned to the correct logical library by the operator through the GUI.
We strongly suggest correctly defining the CAP or VOLSER assignment policies in the tape
library for the best operation of the tape system.
Clarifications:
Insert Notification is not supported in a high-density library for TS3500. The CAP must
be correctly configured to provide automated assignment of all the inserted cartridges.
A cartridge that has been manually assigned to the TS7700 logical library does not
display automatically in the TS7700T inventory. An Inventory Upload is needed to
refresh the TS7700 cluster inventory. The Inventory Upload function is available on the
Physical Volume Ranges menu as shown in Figure 9-99.
Cartridge assignment to a logical library is available only through the tape library GUI.
To assign a data cartridge to a logical library in the TS3500 tape library, complete these steps:
1. Open the TS3500 tape library GUI (go to the library’s Ethernet IP address or the library
URL by using a standard browser). The Welcome window opens.
2. Click Cartridges → Data Cartridges. The Data Cartridges window opens.
3. Select the logical library to which the cartridge is assigned and select a sort view of the
cartridge range. The library can sort the cartridge by volume serial number, SCSI element
address, or frame, column, and row location. Click Search. The Cartridges window opens
and shows all the ranges for the specified logical library.
4. Select the range that contains the data cartridge that should be assigned.
5. Select the data cartridge and then click Assign.
6. Select the logical library partition to which the data cartridge should be assigned.
7. Click Next to complete the function.
8. For a TS7700T cluster, click Physical → Physical Volumes → Physical Volume Ranges
and click Inventory Upload, as shown in Figure 9-102 on page 540.
Use the cartridge magazine to insert cleaning cartridges into the I/O station, and then into the
TS4500 tape library. The TS4500 can be set to move expired cleaning cartridges to the I/O
station automatically. Figure 9-100 shows how to set it.
Figure 9-100 TS4500 tape library moves expired cleaning cartridge to I/O station automatically
Also, there are TS4500 tape library command-line interface commands that can be used to
check the status of the cleaning cartridges or alter settings in the tape library. Read about this
in the documentation for TS4500, available locally by clicking the question mark icon at the
top bar in the GUI, or online at:
https://www.ibm.com/support/knowledgecenter/STQRQ9/com.ibm.storage.ts4500.doc/ts45
00_ichome.html
Use the window that is shown in Figure 9-102 to add, modify, and delete physical volume
ranges. Unassigned physical volumes are listed in this window. If a volume is listed as
unassigned and it belongs to this TS7700, add a new range that includes that volume to fix it.
If an unassigned volume does not belong to this TS7700 cluster, it should be ejected and
reassigned to the proper logical library in the physical tape library.
Click Inventory Upload to upload the inventory from the TS3500 tape library and update any
range or ranges of physical volumes that were recently assigned to that logical library. The
VOLSER Ranges table displays the list of defined VOLSER ranges for a specific component.
The VOLSER Ranges table can be used to create a VOLSER range, or to modify or delete a
predefined VOLSER range.
See “Physical Volume Ranges window” on page 442 for more information about how to insert
a new range of physical volumes by using the TS7700 Management Interface.
Items under Physical Volumes in the MI apply only to tape attach clusters. Trying to access
those windows from a TS7720 results in the following HYDME0995E message:
This cluster is not attached to a physical tape library.
The Physical Volume Pool Properties table displays the encryption setting and media
properties for every physical volume pool that is defined for TS7700T clusters in the grid.
See “Physical Volume Pools” on page 427 for detailed information about how to view, create,
or modify physical volume tape pools by using the TS7700 management interface. To modify
encryption settings for one or more physical volume pools, complete the following steps.
Figure 9-104 on page 542 illustrates the sequence. For more information, see “Physical
Volume Pools” on page 427.
1. Open the Physical Volume Pools window.
Tip: A tutorial is available in the Physical Volume Pools window to show how to modify
encryption properties.
See the “Physical Volume Pools” on page 427 for parameters and settings, on the Modify
Encryption settings window.
When the amount of active data on a physical stacked volume drops below this percentage,
the volume becomes eligible for reclamation. Reclamation values can be in the range of
0% - 95%, with a default value of 35%. Selecting 0% deactivates this function.
Note: Subroutines of the Automated Read-Only Recovery (ROR) process are started to
reclaim space in the physical volumes. Those cartridges are made read-only momentarily
during the reclaim process, returning to normal status at the end of the process.
Throughout the data lifecycle, new logical volumes are created and old logical volumes
become obsolete. Logical volumes are migrated to physical volumes, occupying real space
there. When a logical volume becomes obsolete, that space becomes wasted capacity on
that physical tape. Therefore, the active data level of that volume decreases over time.
The TS7700T actively monitors the active data in its physical volumes. Whenever this active
data level crosses the reclaim threshold that is defined for the physical volume pool to which
that volume belongs, the TS7700 places that volume on a candidate list for reclamation.
Clarification: Each reclamation task uses two tape drives (source and target) in a
tape-to-tape copy function. The TS7700 TVC is not used for reclamation.
Multiple reclamation processes can run in parallel. The maximum number of reclaim tasks is
limited by the TS7700T, based on the number of available drives as shown in Table 9-19.
Available drives    Maximum reclaim tasks
3                   1
4                   1
5                   1
6                   2
7                   2
8                   3
9                   3
10                  4
11                  4
12                  5
13                  5
14                  6
15                  6
16                  7
You might want to have fewer reclaims running, sparing the resources for other activities in
the cluster. The user can set the maximum number of drives that are used for reclamation on
a per-pool basis. Also, reclaim settings can be changed by using LI REQ commands, as shown
in the example that follows.
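As a sketch only (the exact keywords and the settings that they control depend on your code
level and are documented in the IBM TS7700 Series z/OS Host Command Line Request User's
Guide), the current cluster settings can be displayed from the host by issuing the LI REQ
SETTING command against the distributed library; the library name that is shown here is
hypothetical:
LIBRARY REQUEST,DISTLIB1,SETTING
The output lists the currently active setting values, which can then be altered with the
corresponding SETTING keywords that are valid for your release.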
The reclamation level for the physical volumes must be set by using the Physical Volume
Pools window in the TS7700 MI. Select a pool and click Modify Pool Properties in the menu
to set the reclamation level and other policies for that pool.
No more than four drives can be used for premigration in pool 3. The reclaim threshold
percentage has been set to 35%, meaning that when a physical volume in pool 3 drops below
the threshold of 35% occupancy with active data, the stacked cartridge becomes a candidate
for reclamation. The other way to trigger a reclamation in this example is Days Without Data
Inactivation for tape cartridges with up to 65% occupancy.
See “Physical Volume Pools” on page 427 for more details about parameters and settings.
Reclamation enablement
To minimize any effect on TS7700 activity, the storage management software monitors
resource use in the TS7700, and enables or disables reclamation. Optionally, reclamation
activity can be prevented at specific times by specifying an Inhibit Reclaim Schedule in the
TS7700 MI (Figure 9-106 on page 547 shows an example).
However, the TS7700T determines whether reclamation is enabled or disabled once an hour,
depending on the number of available scratch cartridges. It disregards the Inhibit Reclaim
Schedule if the TS7700T drops below a minimum number of available scratch cartridges. In
that case, reclamation is enforced by the tape attach TS7700 cluster.
Even when reclamation is enabled, stacked volumes might not be going through the process
all the time. Other conditions must be met, such as stacked volumes that meet one of the
reclaim policies and drives available to mount the stacked volumes.
Reclamation for a volume is stopped by the TS7700 internal management functions if a tape
drive is needed for a recall or copy (because these are of a higher priority) or a logical volume
is needed for recall off a source or target tape that is in the reclaim process. If this happens,
reclamation is stopped for this physical tape after the current logical volume move is
complete.
Pooling is enabled as a standard feature of the TS7700, even if only one pool is used.
Reclamation can occur on multiple volume pools at the same time, and process multiple tasks
for the same pool. One of the reclamation methods selects the volumes for processing based
on the percentage of active data.
Individual pools can have separate reclaim policies set. The number of pools can also
influence the reclamation process because the TS7700 tape attach always evaluates the
stacked media starting with Pool 1.
The scratch count for physical cartridges also affects reclamation. The scratch state of pools
is assessed in the following manner:
1. A pool enters the Low scratch state when it has access to fewer than 50, but two or more,
empty cartridges (scratch tape volumes).
2. A pool enters the Panic scratch state when it has access to fewer than two empty cartridges
(scratch tape volumes).
Panic Reclamation mode is entered when a pool has fewer than two scratch cartridges and
no more scratch cartridges can be borrowed from any other pool that is defined for borrowing.
Borrowing is described in “Using physical volume pools” on page 51.
Important: A physical volume pool that is running out of scratch cartridges might stop
mounts in the TS7740 or TS7720T tape attach partitions, affecting host tape operations.
Mistakes in pool configuration (media type, borrow and return, home pool, and so on) or
operating with an empty common scratch pool might lead to this situation.
Consider that one reclaim task uses two drives for the data move, and processor cycles.
When a reclamation starts, these drives are busy until the volume that is being reclaimed is
empty. If the reclamation threshold level is raised too high, the result is larger amounts of data
to be moved, with a resultant penalty in resources that are needed for recalls and
premigration. The default setting for the reclamation threshold level is 35%.
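As a simple illustration (assuming, for example, 10 TB JD media written by TS1150 drives), a
reclamation threshold of 35% means that up to about 3.5 TB of active data might need to be
moved when a single volume is reclaimed, whereas raising the threshold to 60% could require
moving up to about 6 TB per volume, with a corresponding increase in drive time and
processor use.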
Ideally, the reclaim threshold level should be 10% - 35%. Read more about how to fine-tune
this function and about the available host functions in 4.4.4, “Physical volumes for TS7740,
TS7720T, and TS7760T” on page 182. Pools in either scratch state (Low or Panic) get priority
for reclamation.
Priority 3 (pool in Low scratch state): At least one reclaim task is started, regardless of idle
drives. If more idle drives are available, more reclaims are started, up to the maximum limit.
Volumes that are subject to reclaim because of Maximum Active Data, Days Without Access,
Age of Last Data Written, or Days Without Data Inactivation use priority 3 or 4 reclamation.
To define the Inhibit Reclaim schedule, click Management Interface → Settings → Cluster
Settings, which opens the window that is shown in Figure 9-106.
For more information, see “Inhibit Reclaim Schedules window” on page 489.
The EKS assists encryption-enabled tape drives in generating, protecting, storing, and
maintaining encryption keys (EKs) that are used to encrypt information being written to, and
decrypt information being read from, tape media (tape and cartridge formats). Also, the EKS
manages the EK for the TVC cache disk subsystem when the external key management disk
encryption feature is installed. This removes the responsibility of managing the keys from the
3957-Vxx and from the disk subsystem controllers.
Note: The settings for Encryption Server are shared for both tape and external disk
encryption.
See “Encryption Key Server Addresses window” on page 490 for more details.
To work around this limitation, ensure that at least one device is online (or has been online) to
each host, or use the LIBRARY RESET,CBRUXENT command to initiate cartridge entry processing
from the host. This task is especially important if only one host is attached to the library that
owns the volumes being entered. In general, after the volumes are entered into the library,
CBR36xxI cartridge entry messages are expected. The LIBRARY RESET,CBRUXENT command
from z/OS can be used to reinitiate cartridge entry processing, if necessary. This command
causes the host to ask for any volumes in the insert category.
Previously, as soon as OAM started for the first time with volumes in the Insert category,
entry processing started without allowing for operator intervention. The LI DISABLE,CBRUXENT
command can be issued before starting the OAM address space. This approach allows entry
processing to be held before the OAM address space initially starts, as shown in the following
example.
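A minimal sketch of the operator sequence follows. The commands themselves are the
standard z/OS library commands that are referenced above; the point at which the OAM
address space is started depends on your installation's started task or procedure name, which
is not shown here:
LI DISABLE,CBRUXENT        (hold cartridge entry processing before OAM starts)
LIBRARY RESET,CBRUXENT     (later, reinitiate cartridge entry processing)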
See “Insert Virtual Volumes window” on page 413 for operational details about this TS7700
MI page.
Note: Up to 10,000 logical volumes can be inserted at one time. This applies to both
inserting a range of logical volumes and inserting a quantity of logical volumes.
Note: The Fast Ready attribute provides a definition of a category to supply scratch
mounts. For z/OS, the category values depend on the DEVSUPxx definitions. The TS7700 MI
provides a way to define one or more scratch categories. A scratch category can be added by
using the Add Scratch Category menu.
The MOUNT FROM CATEGORY command is not exclusively used for scratch mounts. Therefore,
the TS7700 cannot assume that any MOUNT FROM CATEGORY is for a scratch volume.
When defining a scratch category, an expiration time can be set, and it can be further defined
as an Expire Hold time.
The category hexadecimal number depends on the software environment and on the
definitions in the SYS1.PARMLIB member DEVSUPxx for library partitioning. Also, the
DEVSUPxx member must be referenced in the IEASYSxx member to be activated.
Tip: Do not add a scratch category by using MI that was previously designated as a
private volume category at the host. Categories should correspond to the defined
categories in the DEVSUPxx from the attached hosts.
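As a minimal sketch only (the member suffix and category values are hypothetical and must
match the scratch categories that are defined in the TS7700 MI), the host-side definitions
consist of a DEVSUP reference in IEASYSxx and the category assignments in the DEVSUPxx
member, for example:
In IEASYSxx:   DEVSUP=T7
In DEVSUPT7:   MEDIA1=0011,MEDIA2=0012
Keep these category values in sync with the scratch categories that are added on the MI, as
noted in the previous Tip.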
Number of virtual volumes: Adding all of the volume counts that are shown in the Counts
column does not always result in the total number of virtual volumes because some rare,
internal categories are not displayed on the Categories table. Additionally, movement of
virtual volumes between scratch and private categories can occur multiple times per second,
and any snapshot of volumes on all clusters in a grid is obsolete by the time a total count
completes.
The Categories table can be used to add, modify, and delete a scratch category, and to
change the way that information is displayed.
Expire
The amount of time after a virtual volume is returned to the scratch category before its
data content is automatically delete-expired.
A volume becomes a candidate for delete-expire after all the following conditions are met:
– The amount of time since the volume entered the scratch category is equal to or
greater than the Expire Time.
– The amount of time since the volume’s record data was created or last modified is
greater than 12 hours.
– At least 12 hours has passed since the volume was migrated out of or recalled back
into disk cache.
Select an expiration time from the drop-down menu shown on Figure 9-110.
Tip: Add a comment to DEVSUPnn to ensure that the scratch categories are updated
when the category values in DEVSUPnn are changed. They always need to be in sync.
With the Delete Expired Volume Data setting, the data that is associated with volumes that
have been returned to scratch is expired after a specified time period, and its physical space
on tape can be reclaimed.
The Expire Time parameter specifies the amount of time, in hours, days, or weeks, that the
data continues to be managed by the TS7700 after a logical volume is returned to scratch
before the data that is associated with the logical volume is deleted. A minimum of 1 hour and
a maximum of 2,147,483,647 hours (approximately 244,983 years) can be specified.
Specifying a value of zero means that the data that is associated with the volume is to be
managed as it was before the addition of this option. This means that it is never deleted. In
essence, specifying a value (other than zero) provides a “grace period” from when the virtual
volume is returned to scratch until its associated data is eligible for deletion. A separate
Expire Time can be set for each category that is defined as scratch.
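For example (the category value and times are hypothetical), assume that scratch category
X’0012’ is defined with an Expire Time of 24 hours. A volume that is returned to scratch at
08:00 on Monday becomes a candidate for delete-expire at about 08:00 on Tuesday, provided
that the 12-hour conditions that are listed above are also satisfied; until then, the data is still
present and the volume can be reassigned to a private category if it must be recovered.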
Remember:
Scratch categories are global settings within a multi-cluster grid. Therefore, each
defined scratch category and the associated Delete Expire settings are valid on each
cluster of the grid.
The Delete Expired Volume Data setting applies also to disk only clusters. If it is not
used, logical volumes that have been returned to scratch are still considered active
data, allocating physical space in the TVC. Therefore, setting an expiration time on a
disk only TS7700 is important to maintain an effective cache usage by deleting expired
data.
Note: The value 0 is not a valid entry on the dialog box for Expire Time on the Add
Category page. Use No Expiration instead.
Establishing the Expire Time for a volume occurs as a result of specific events or actions. The
following list shows the possible events or actions and their effect on the Expire Time of a volume:
A volume is mounted
The data that is associated with a logical volume is not deleted, even if it is eligible, if the
volume is mounted. Its Expire Time is set to zero, meaning it will not be deleted. It is
reevaluated for deletion when its category is assigned.
A volume’s category is changed
Whenever a volume is assigned to a category, including assignment to the same category
in which it currently exists, it is reevaluated for deletion.
Expiration
If the category has a nonzero Expire Time, the volume’s data is eligible for deletion after
the specified time period, even if its previous category had a different nonzero Expire
Time.
No action
If the volume’s previous category had a nonzero Expire Time or even if the volume was
already eligible for deletion (but has not yet been selected to be deleted) and the category
to which it is assigned has an Expire Time of zero, the volume’s data is no longer eligible
for deletion. Its Expire Time is set to zero.
A category’s Expire Time is changed
If a user changes the Expire Time value through the scratch categories menu on the
TS7700 MI, the volumes that are assigned to that category are reevaluated for deletion.
Expire Time is changed from nonzero to zero
If the Expire Time is changed from a nonzero value to zero, volumes that are assigned to
the category that currently have a nonzero Expire Time are reset to an Expire Time of
zero. If a volume was already eligible for deletion, but had not been selected for deletion,
the volume’s data is no longer eligible for deletion.
Expire Time is changed from zero to nonzero
Volumes that are assigned to the category continue to have an Expire Time of zero.
Volumes that are assigned to the category later will have the specified nonzero Expire
Time.
Expire Time is changed from nonzero to nonzero
Volumes assigned for that category are reevaluated for deletion. Volumes that are
assigned to the category later will have the updated nonzero Expire Time.
After a volume’s Expire Time is reached, it is eligible for deletion. Not all data that is eligible for
deletion is deleted in the hour that it is first eligible. Once an hour, each TS7700 in the library
selects up to 1,000 eligible volumes for data deletion. The volumes are selected based on the
time that they became eligible, with the oldest ones being selected first.
These construct names are passed down from the z/OS host and stored with the logical
volume. The actions that are defined for each construct are performed by the TS7700. For
non-z/OS hosts, the constructs can be manually assigned to logical volume ranges.
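For illustration only, the following storage group ACS routine fragment sketches how a z/OS
host might assign the construct name that the TS7700 then maps to a physical volume pool.
The Data Class and Storage Group names are hypothetical, and the surrounding filtering logic
of a production routine is omitted:
PROC STORGRP
  SELECT
    WHEN (&DATACLAS = 'DCTS7700')    /* hypothetical Data Class for TS7700 workloads */
      SET &STORGRP = 'SGTS7700'      /* hypothetical SG mapped to a pool in the MI   */
    OTHERWISE
      SET &STORGRP = 'SGOTHER'       /* hypothetical SG for all other tape workloads */
  END
END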
Storage Groups
On the z/OS host, the SG construct determines into which tape library a logical volume is
written. Within the TS7700T, the SG construct defines the storage pool to which the logical
volume is placed.
Even before the first SG is defined, there is always at least one SG present. This is the default
SG, which is identified by eight dashes (--------). This SG cannot be deleted, but it can be
modified to point to another storage pool. Up to 256 SGs, including the default, can be
defined.
Use the window that is shown in Figure 9-111 to add, modify, and delete an SG used to define
a primary pool for logical volume premigration.
The SGs table displays all existing SGs available for a selected cluster. See “Storage Groups
window” on page 455 for details about this page.
However, in a stand-alone configuration, the dual copy capability can be used to protect
against media failures. The second copy of a volume can be in a pool that is designated as a
Copy Export pool. For more information, see 2.3.32, “Copy Export” on page 95.
If you want to have dual copies of selected logical volumes, you must use at least two storage
pools because the copies cannot be written to the same storage pool as the original logical
volumes.
The Current Copy Policy table displays the copy policy in force for each component of the
grid. If no MC is selected, this table is not visible. You must select an MC from the MCs table
to view copy policy details.
The MCs table (Figure 9-112) displays defined MC copy policies that can be applied to a
cluster. You can use the MCs table to create a new MC, modify an existing MC, and delete
one or more existing MCs. See “Management Classes window” on page 456 for information
about how to use the Management Classes window.
Use the window that is shown in Figure 9-113 to define, modify, or delete an SC that is used
by the TS7700 to automate storage management through the classification of data sets and
objects.
The SCs table displays the defined SCs that are available to data sets and objects within a cluster. Although
SCs are visible from all TS7700 clusters, only those clusters that are attached to a physical
library can alter TVC preferences. A stand-alone TS7700 cluster that does not possess a
physical library does not remove logical volumes from the tape cache, so the TVC preference
for the disk-only clusters is always Preference Level 1.
Data Classes
From a z/OS perspective (SMS-managed tape), the DFSMS DC defines the following
information:
Media type parameters
Recording technology parameters
Compaction parameters
For the TS7700, only the Media type, Recording technology, and Compaction parameters
are used. The use of larger logical volume sizes is controlled through DC. Also, LWORM
policy assignments are controlled from Data Classes.
Starting with R4.1.2, you can select the compression method for the logical volumes per Data
Class policy. The compression method that you choose will be applied to the z/OS host data
being written to logical volumes that belong to a specific Data Class. The following are the
available options:
FICON compression: This is the traditional method in place for the TS7700 family, where
the compression is performed by the FICON adapters (also known as hardware
compression). This algorithm uses no cluster processing resources, but has a lower
compression ratio compared with the newer compression methods.
Use the window that is shown in Figure 9-114 to define, modify, or delete a TS7700 DC. The
DC is used to automate storage management through the classification of data sets.
See “Data Classes window” on page 462 to see more details about how to create, modify, or
delete a Data Class.
See “Feature licenses” on page 484 for more details about how to use this window.
Enabling IPv6
IPv6 and Internet Protocol Security (IPSec) are supported by the TS7700 clusters.
Tip: The client network must use either IPv4 or IPv6 for all functions, such as the MI, key
manager server, SNMP, Lightweight Directory Access Protocol (LDAP), and NTP. Mixing
IPv4 and IPv6 is not currently supported.
Figure 9-117 shows the Cluster Network Settings window from which you can enable IPv6.
For more information about how to use Cluster Network Settings window, see “Cluster
network settings” on page 482.
Caution: Enabling grid encryption significantly affects the performance of the TS7700.
Figure 9-118 shows how to enable the IPSec for the TS7700 cluster.
In a multi-cluster grid, the user can choose which link is encrypted by selecting the boxes in
front of the beginning and ending clusters for the selected link. Figure 9-118 shows a
two-cluster grid, which is the reason why there is only one option to select.
For more information about IPSec configuration, see “Cluster network settings” on page 482.
Also, see IBM Knowledge Center at:
http://www.ibm.com/support/knowledgecenter/STFS69_4.1.2/ts7740_infrastructure_requ
irements_network_switches_tcp_ip_ports.html?lang=en
The TS7700 supports RBAC through the System Storage Productivity Center or by native
LDAP by using Microsoft Active Directory (MSAD) or IBM Resource Access Control Facility
(RACF).
For information about setting up and checking the security settings for the TS7700 grid, see
“Security Settings window” on page 465.
The data consistency point is defined in the MCs construct definition through the MI. This task
can be performed for an existing grid system. In a stand-alone cluster configuration, the
Modify MC definition will only display the lone cluster.
See “Management Classes window” on page 456 for details about how to modify the copy
consistency by using the Copy Action table.
Figure 9-119 shows an example of how to modify the copy consistency by using the Copy
Action table, and then clicking OK. In the figure, the TS7700 is part of a three-cluster grid
configuration. This additional menu is displayed only if a TS7700 is part of a grid environment
(options are not available in a stand-alone cluster).
Reminder: The items on this window can modify the cluster behavior regarding local copies
and certain I/O operations. Some LI REQ commands can also modify these settings.
The settings are specific to a cluster in a multi-cluster grid configuration, which means that
each cluster can have separate settings, if needed. The settings take effect for any mount
requests that are received after the settings were saved. Mounts already in progress are not
affected by a change in the settings. The following settings can be defined and set:
Prefer local cache for scratch mount requests
Prefer local cache for private mount requests
Force volumes that are mounted on this cluster to be copied to the local cache
Enable fewer RUN consistent copies before reporting RUN command complete
Ignore cache preference groups for copy priority
See “Copy Policy Override” on page 487 for information about how to view or modify those
settings.
Scratch mount candidates can be defined in a grid environment with two or more clusters. For
example, in a hybrid configuration, the SAA function can be used to direct certain scratch
allocations (workloads) to one or more TS7720 clusters for fast access. Other workloads
can be directed to a TS7740 or TS7720T for archival purposes.
Clusters not included in the list of scratch mount candidates are not used for scratch mounts
at the associated MC unless those clusters are the only clusters that are known to be
available and configured to the host.
See Chapter 10, “Host Console operations” on page 581 for information about software levels
that are required by SAA and DAA to function properly, in addition to the LI REQ commands
that are related to the SAA and DAA operation.
Each cluster in a grid can provide a unique list of candidate clusters. Clusters with an ‘N’ copy
mode, such as cross-cluster mounts, can still be candidates. When defining the scratch
mount candidates in an MC, normally you want each cluster in the grid to provide the same
list of candidates for load balancing. See “Management Classes window” on page 456 for
more details about how to create or change a MC.
Note: The scratch mount candidate list that is defined in the MI (Figure 9-120) is honored
only after it is enabled by using the LI REQ setting, as sketched in the following example.
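The following command form is commonly documented for enabling scratch allocation
assistance at the composite library. Treat the keywords as an assumption and verify the exact
syntax for your code level in the IBM TS7700 Series z/OS Host Command Line Request User's
Guide before use; the library name is hypothetical:
LIBRARY REQUEST,COMPLIB1,SETTING,DEVALLOC,SCRATCH,ENABLE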
Figure 9-120 shows the Retain copy mode check box in the TS7700 MI window.
Note: The Retain Copy mode option is effective only on private (non-scratch) virtual
volume mounts.
This function introduces a concept of grouping clusters together into families. Using cluster
families, a common purpose or role can be assigned to a subset of clusters within a grid
configuration. The role that is assigned, for example, production or archive, is used by the
TS7700 Licensed Internal Code to make improved decisions for tasks, such as replication
and TVC selection. For example, clusters in a common family are favored for TVC selection,
or replication can source volumes from other clusters within its family before using clusters
outside of its family.
Use the Cluster Families option on the Actions menu of the Grid Summary window to add,
modify, or delete a cluster family.
Figure 9-121 shows an example of how to create a cluster family using the TS7700 MI.
See “Cluster Families window” on page 355 for information about how to create, modify, or
delete families on the TS7700 MI.
Clarification: Host writes to the disk only TS7700 cluster and inbound copies continue
during this state.
Clarification: New host allocations do not choose a disk only cluster in this state as a
valid TVC candidate. New host allocations that are sent to a TS7700 cluster in this state
choose a remote TVC instead. If all valid clusters are in this state or cannot accept
mounts, the host allocations fail. Read mounts can choose the disk only TS7700 cluster
in this state, but modify and write operations fail. Copies inbound to this cluster are
queued as Deferred until the disk only cluster exits this state.
Table 9-21 displays the start and stop thresholds for each of the active cache capacity states
defined.
State: Out of cache resources (CP0 for a TS7700 tape attach)
Enter threshold: less than 1 TB, or less than or equal to 5% of the size of cache partition 0, whichever is less
Exit threshold: more than 3.5 TB, or more than 17.5% of the size of cache partition 0, whichever is less
Messages: CBR3794A upon entering the state; CBR3795I upon exiting the state
To add or change an existing SC, select the appropriate action in the menu, and click Go. See
Figure 9-123.
Removal Threshold
The Removal Threshold is used to prevent a cache overrun condition in a disk only TS7700
cluster that is configured as part of a grid. By default, it is a 4 TB value (3 TB fixed, plus 1 TB)
that, when taken with the amount of used cache, defines the upper limit of a TS7700 cache
size. Above this threshold, logical volumes begin to be removed from a disk only TS7700
cache.
Note: Logical volumes are only removed if there is another consistent copy within the grid.
Logical volumes are removed from a disk only TS7700 cache in this order:
1. Volumes in scratch categories
2. Private volumes least recently used, by using the enhanced Removal policy definitions
Tip: This field is only visible if the selected cluster is a disk only TS7700 in a grid
configuration.
When a cluster in the grid enters service mode, the ability of the remaining clusters to make or
validate copies and to perform automatic removal of logical volumes can be affected. Over an
extended period, this situation might, in the worst case, result in a disk-only cluster running out
of cache resources. The Temporary Removal Threshold is instrumental in helping to prevent
this possibility.
The lower threshold creates extra free cache space, which enables the disk-only TS7700 to
accept any host requests or copies during the DR testing or service outage without reaching
its maximum cache capacity. The Temporary Removal Threshold value must be equal to or
greater than the expected amount of compressed host workload written, copied, or both to the
disk-only cluster or CP0 partition during the service outage.
The default Temporary Removal Threshold is 4 TB, which provides 5 TB (4 TB plus 1 TB) of
existing free space. The threshold can be set to any value between 2 TB and full capacity
minus 2 TB.
All disk-only TS7700 clusters or CP0 partitions in the grid that remain available automatically
lower their Removal Thresholds to the Temporary Removal Threshold value that is defined for each.
Each cluster can use a different Temporary Removal Threshold. The default Temporary
Removal Threshold value is 4 TB (an extra 1 TB more data than the default removal threshold
of 3 TB).
Each disk-only TS7700 cluster or CP0 partition uses its defined value until the cluster within
the grid in which the removal process has been started enters service mode or the temporary
removal process is canceled. The cluster that is initiating the temporary removal process
(either a cluster within the grid that is not part of the DR testing, or the one scheduled to go
into Service) does not lower its own removal threshold during this process.
Note: The cluster that is elected to initiate Temporary Removal process is not selectable in
the list of target clusters for the removal action.
Figure 9-124 Selecting cluster to start removal process and temporary removal threshold levels
Even when the temporary removal action is started from a disk-only cluster, that cluster
is still not selectable on the drop-down menu of the TS7700 List Subject to Auto
Removal because the removal action does not affect that cluster.
This area of the window contains each disk-only TS7700 cluster or CP0 partition in the
grid and a field to set the temporary removal threshold for that cluster.
Note: The Temporary Removal Threshold task ends when the originator cluster enters
in Service mode, or the task is canceled on the Tasks page in MI.
Note: Use Coordinated Universal Time in all TS7700 clusters whenever possible.
The TS4500 tape library time can be set from the management interface, as shown in
Figure 9-125. Notice that the TS4500 can be synchronized with an NTP server, when available.
More information about the TS4500 tape library can be found locally in the TS4500 GUI by
clicking the question mark icon, or online at:
https://www.ibm.com/support/knowledgecenter/STQRQ9/com.ibm.storage.ts4500.doc/ts45
00_ichome.html
During Pause mode, all recalls and physical mounts are held up and queued by the TS7740
or TS7720T for later processing when the library leaves the Pause mode. Because both
scratch mounts and private mounts with data in the cache are allowed to run, but not physical
mounts, no more data can be moved out of the cache after the currently mounted stacked
volumes are filled.
During an unusually long period of pause (such as during a physical tape library outage), the
cache continues to fill up with data that has not been migrated to physical tape volumes.
In extreme cases, this might lead to significant throttling and the stopping of any mount activity
in the TS7740 cluster or in the tape partitions in the TS7700T cluster.
For this reason, it is important to minimize the amount of time that is spent with the library in
Pause mode.
Here is the message posted to all hosts when the TS7700 Grid is in this state:
CBR3788E Service preparation occurring in library library-name.
Tip: Before starting service preparation at the TS7700, all virtual devices on this cluster
must be in the offline state with regard to the accessing hosts. Pending offline devices
(logical volumes that are mounted to the local or remote TVC) with active tasks should be
allowed to finish execution and to unload their volumes, completing the transition to the
offline state. Virtual devices in other clusters should be made online to provide mount points
for new jobs, shifting the workload to other clusters in the grid before service preparation
starts. After the scheduled maintenance finishes and the TS7700 is taken out of service, the
virtual devices can be varied back online to the accessing hosts.
When service is canceled and the local cluster comes online, a Distributed Library
Notification is surfaced from the cluster to prompt the attached host to vary on the devices
automatically after the following conditions are met:
All clusters in the grid are at microcode level 8.41.200.xx or later.
The attached host logical partition supports CUIR function.
The CUIR function is enabled from the command line, by using the following LI REQ
command: LIBRARY REQUEST,library-name,CUIR,SETTING,SERVICE,ENABLE
The AONLINE notification is enabled from the command line, by using the following LI REQ
command:
LIBRARY REQUEST,library-name,CUIR,AONLINE,SERVICE,ENABLE
For more information about service preparation, see 10.2, “Messages from the library” on
page 598.
If CUIR is not in place, all the host actions such as varying devices offline or online must be
performed manually across all LPARs and system plexes attached to the cluster.
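For example, assuming that the cluster's virtual devices are defined at addresses 1000-10FF
on a particular LPAR (a hypothetical device range), the devices would be varied offline on each
attached system before the service action and varied back online afterward:
V 1000-10FF,OFFLINE
V 1000-10FF,ONLINE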
There might be cases where the best option is to prepare the cluster (TS7740 or TS7700T)
for service before servicing the TS3500 tape library. In addition, there might be other
scenarios where the preferred option is to service the tape library without bringing the
associated cluster in service.
Work with the IBM SSR to identify which is the preferred option in a specific case.
For information about how to set the TS7700 to service preparation mode, see “Cluster
Actions menu” on page 360.
A partial inventory can be performed in any specific frame of the tape library: Left-click the
desired frame on the tape library image to select it (it changes colors), and right-click to
display the options; then, select Inventory Frame from the list.
The Scan tier 0 and tier 1 option will check cartridges on the doors and the external layer of
the cartridges on the walls of the library. This option will only scan other tiers if a discrepancy
is found. This is the preferred option for normal tape library operations, and it can be
performed concurrently.
The option Scan all tiers will perform full library inventory, shuffling and scanning all
cartridges in all tiers. This option is not concurrent (even when selected for a specific frame)
and can take a long time to complete, depending on the number of cartridges in the library.
Use scan all tiers only when a full inventory of the tape library is required.
Click Inventory Upload to synchronize the physical cartridge inventory from the attached
tape library with the TS7700T database.
Note: Perform the Inventory Upload from the TS3500 tape library to all TS7700T clusters
that are attached to that tape library whenever a library door is closed, a manual inventory
or an Inventory with Audit is run, or a TS7700 cluster is varied online from an offline state.
The meaningful information that is provided by the tape library (the TS7700 in this book) is
contained in the message-text field, which can have 240 characters. This field includes a
five-character message ID that can be examined by the message automation software to
filter the events that should get operator attention. The message ID classifies the event being
reported by its potential impact on operations. The categories are critical, serious, impact,
warning, and information. For more information, see the IBM TS7700 4.0 IBM Knowledge
Center.
For the list of informational messages, see the IBM TS7700 Series Operator Informational
Messages White Paper, available at:
https://www.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/WP101689
Operation of the TS7700 continues with a reduced number of drives until the repair action on
the drive is complete. To recover, the IBM SSR repairs the failed tape drive and makes it
available for the TS7700 to use it again.
By default, this cartridge is corrected by an internal function of the TS7700 named Automated
ROR. Make sure that the IBM SSR has enabled Automated ROR in the cluster. Automated
ROR is the process by which hierarchical storage management (HSM) recalls all active data
from a particular physical volume that has exceeded its error thresholds, encountered a
permanent error, or is damaged.
This process extracts all active data (in the active logical volumes) that is contained in that
read-only cartridge. When all active logical volumes are successfully retrieved from that
cartridge, the Automated ROR process ejects the suboptimal physical cartridge from the tape
library, ending the recovery process with success. The message OP0100 A read-only status
physical volume xxxxxx has been ejected or OP0099 Volser XXXXXX was ejected during
recovery processing reports that the volume was ejected successfully.
After the ejection is complete, the cartridge VOLID is removed from the TS7700 physical
cartridges inventory.
The ROR ejection task runs at a low priority to avoid causing any impact on the production
environment. The complete process, from cartridge being flagged Read-Only to the OP0100 A
read-only status physical volume xxxxxx has been ejected message, signaling the end of
the process, can take several hours (typically one day to complete).
If the process fails to retrieve the active logical volumes from that cartridge due to a damaged
media or unrecoverable read error, the next actions depend on the current configuration that
is implemented in this cluster, whether stand-alone or part of a multi-cluster grid.
In a grid environment, the ROR reaches into other peer clusters to find a valid instance of the
missing logical volume, and automatically copies it back into this cluster, completing the
active data recovery.
If recovery fails because there is no other consistent copy that is available within the grid, or
this is a stand-alone cluster, the media is not ejected and message OP0115 The cluster
attempted unsuccessfully to eject a damaged physical volume xxxxxx is reported, along
with OP0107 Virtual volume xxxxxx was not fully recovered from damaged physical
volume yyyyyy for each logical volume that failed to be retrieved.
In this situation, the physical cartridge is not ejected. A decision must be made regarding the
missing logical volumes that are reported by the OP0107 messages. Also, the defective cartridge
contents can be verified through the MI Physical Volume Details window by clicking the
Download List of Virtual Volumes for that damaged physical volume. Check the list of the
logical volumes that are contained in the cartridge, and work with the IBM SSR if data
recovery from that damaged tape should be attempted.
If those logical volumes are not needed anymore, they should be made into scratch volumes
by using the TMS on the IBM Z host. After this task is done, the IBM SSR can redo the ROR
process for that defective cartridge (which is done from the TS7700 internal maintenance
window, and through an MI function). This time, because the logical volumes that were not
retrieved no longer contain active data, the Automated ROR completes successfully, and the
cartridge is ejected from the library.
Note: Subroutines of the same Automated ROR process are started to reclaim space in
the physical volumes and to perform some MI functions, such as eject or move physical
volumes or ranges from the MI. Those cartridges are made read-only momentarily during
the running of the function, returning to normal status at the end of the process.
Power failure
User data is protected during a power failure because it is stored on the TVC. Any host jobs
reading or writing to virtual tapes fail as they would with a real IBM 3490E, and they must be
restarted after the TS7700 is available again. When power is restored and stable, the TS7700
must be started manually. The TS7700 recovers access to the TVC by using information that
is available from the TS7700 database and logs.
The MI has improved the accuracy and comprehensiveness of Health Alert messages and
Health Status messages. For example, new alert messages report that a DDM failed in a
specific cache drawer, which is compared to a generic message of degradation in previous
levels. Also, the MI shows enriched information in graphical format.
Important: In a TS7700T cluster, only the tape attached partitions are affected.
Important: Repairing a 3592 tape must be done only for data recovery. After the data is
moved to a new volume, eject the repaired cartridge from the TS7700 library.
Broken tape
If a 3592 tape cartridge is physically damaged and unusable (the tape is crushed, or the
media is physically broken, for example), a TS7740 or TS7700T that is configured as a
stand-alone cluster cannot recover the contents. If this TS7700 cluster is part of a grid,
the damaged tape contents (active logical volumes) are retrieved from other clusters, and the
TS7700 brings those logical volumes in automatically (given that those logical volumes had
another valid copy within the grid).
With the prior generation of VTS, the VTS always indicated to the host that the mount
completed even if a problem had occurred. When the first I/O command is sent, the VTS fails
that I/O because of the error. This results in a failure of the job without the opportunity to try to
correct the problem and try the mount again.
With the TS7700 subsystem, if an error condition is encountered during the execution of the
mount, rather than indicating that the mount was successful, the TS7700 returns completion
and reason codes to the host indicating that a problem was encountered. With DFSMS, the
logical mount failure completion code results in the console messages shown in Example 9-2.
Reason codes provide information about the condition that caused the mount to fail.
For example, look at CBR4171I. Reason codes are documented in IBM Knowledge Center.
As an exercise, assume RSN=32. In IBM Knowledge Center, the reason code is described as follows:
Reason code x’32’: Local cluster recall failed; the stacked volume is
unavailable.
CBR4196D: The error code is shown in the format 14xxIT:
– 14 is the permanent error return code.
– xx is 01 if the function was a mount request, or 03 if the function was a wait request.
– IT is the permanent error reason code, which determines the recovery action to be taken.
– In this example, a value of 140194 for the error code means the following:
xx=01: Mount request failed.
IT=94: Logical volume mount failed. An error was encountered during the execution of
the mount request for the logical volume. The reason code that is associated with the
failure is documented in CBR4171I. The title of the first book that is referenced below
includes the acronyms (message ID prefixes) that it covers, but the acronyms are not
defined in the book.
For CBR messages, see z/OS MVS System Messages, Vol 4 (CBD-DMO), SA38-0671,
for an explanation of the reason code and for specific actions that should be taken to
correct the failure. See z/OS DFSMSdfp Diagnosis, SC23-6863, for OAM return and
reason codes. Take the necessary corrective action and reply ‘R’ to try again. Otherwise,
reply ‘C’ to cancel.
The host is notified that intervention-required conditions exist. Investigate the reason for the
mismatch. If possible, relabel the volume to use it again.
The TS7700 internal recovery procedures handle this situation and restart the TS7700. For
more information, see Chapter 13, “Disaster recovery testing” on page 767.
Information from the IBM TS4500/TS3500 Tape Library is contained in some of the outputs.
However, you cannot switch the operational mode of the TS4500/TS3500 Tape Library with
z/OS commands.
Consideration: DFSMS and MVS commands apply only to SMS-defined libraries. The
library name that is defined during the definition of a library in Interactive Storage
Management Facility (ISMF) is required for libname in the DFSMS commands. The
activation of a source control data set (SCDS) with this libname must have already been
performed for SMS to recognize the library.
DISPLAY SMS,VOLUME(volser)
This command displays all of the information that is stored about a volume in the tape
configuration database (TCDB), also known as the VOLCAT, as well as information
obtained directly from the library, such as the LIBRARY CATEGORY, LM constructs (SMS
constructs stored in the library), and LM CATEGORY. See Example 10-3.
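As a usage sketch, assuming the illustrative volser HYD101 (reused from the logical volume
example later in this chapter), the command can be entered in its short form:
D SMS,VOL(HYD101)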
D SMS,OAM
RESPONSE=MZPEVS2
CBR1100I OAM status: 744
TAPE TOT ONL TOT TOT TOT TOT TOT ONL AVL TOTAL
LIB LIB AL VL VCL ML DRV DRV DRV SCRTCH
3 1 0 0 3 0 1280 2 2 45547
There are also 7 VTS distributed libraries defined.
CBRUXCUA processing ENABLED.
CBRUXEJC processing ENABLED.
CBRUXENT processing ENABLED.
CBRUXVNL processing ENABLED.
CBROAM: 00
VARY SMS,LIBRARY(libname),ONLINE/OFFLINE
From the host standpoint, the vary online and vary offline commands for a TS7700 library
always use the library name as defined through ISMF.
This command acts on the SMS library, which is referred to as libname. Using this
command with the OFFLINE parameter stops tape library actions and gradually makes all
of the tape units within this virtual library unavailable. This simple form is a single-system
form. The status of the library remains unaffected in other MVS systems.
Note: A composite and distributed IBM Virtual Tape Server (VTS) library can be varied
online and offline like any VTS library, though varying a distributed library offline from
the host really has no meaning (does not prevent outboard usage of the library).
Message CBR3016I warns the user when a distributed library is offline during OAM
initialization or is varied offline while OAM is active.
Using this command with the ONLINE parameter is required to bring the SMS-defined
library back to operation after it has been offline.
VARY SMS,LIBRARY(libname,sysname,...),ON/OFF and VARY
SMS,LIBRARY(libname,ALL),ON/OFF
This extended form of the VARY command can affect more than one system. The first
form affects one or more named MVS systems. The second form runs the VARY action on
all systems within the SMSplex.
The VARY SMS command enables the short forms ON as an abbreviation for ONLINE and
OFF as an abbreviation for OFFLINE.
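For example, assuming an SMS-defined library named LIBVTS1 and system names SYS1 and
SYS2 (all names are illustrative), the extended forms might be entered as follows:
V SMS,LIBRARY(LIBVTS1,SYS1,SYS2),ON
V SMS,LIBRARY(LIBVTS1,ALL),OFF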
In Example 10-6, the command is directed to a specific distributed library in the grid. You
see the mounts for devices only in that specific distributed library.
In Example 10-7, the ALL keyword has been added to the command. It now includes all
mounts where the DISTLIB, PRI-TVC, or SEC-TVC is the distributed library that is
specified on the command.
Example 10-7 LIBRARY DISPDRV MOUNTED ALL command against a distributed library
LIBRARY DISPDRV,DISTLIB3,MOUNTED,ALL
CBR1230I Mounted status:
DRIVE COMPLIB ON MOUNT DISTLIB PRI-TVC SEC-TVC
NUM NAME VOLUME NAME DISTLIB DISTLIB
1D03 COMPLIB1 Y A00118 DISTLIB2 DISTLIB1 DISTLIB3
1D1C COMPLIB1 Y A00124 DISTLIB2 DISTLIB2 DISTLIB3
1D22 COMPLIB1 Y A00999 DISTLIB2 DISTLIB2 DISTLIB3
1D35 COMPLIB1 Y A00008 DISTLIB2 DISTLIB2 DISTLIB3
1D42 COMPLIB1 Y A00075 DISTLIB2 DISTLIB1 DISTLIB3
1E1F COMPLIB1 Y A00117 DISTLIB3 DISTLIB3
1E21 COMPLIB1 Y A02075 DISTLIB3 DISTLIB1
1E30 COMPLIB1 Y A01070 DISTLIB3 DISTLIB1
1E56 COMPLIB1 Y A00004 DISTLIB3 DISTLIB2
1E68 COMPLIB1 Y A00576 DISTLIB3 DISTLIB3
For a complete description of this command and its output, see APAR OA47487.
DISPLAY M=DEV(xxxx)
The D M=DEV command is useful for checking the operational status of the paths to the
device. See Example 10-8.
DISPLAY U
The DISPLAY U command displays the status of the requested unit. If the unit is part of a
tape library (either manual or automated), device type 348X is replaced by 348L. An IBM
3490E is shown as 349L, and a 3590 or 3592 is shown as 359L.
MOUNT devnum, VOL=(NL/SL/AL,serial)
The processing of MOUNT has been modified to accommodate automated tape libraries and
the requirement to verify that the correct volume has been mounted and is in private status
in the TCDB.
UNLOAD devnum
The UNLOAD command enables you to unload a drive, if the Rewind Unload (RUN) process
was not successful initially.
With the 3.2 code release, the TS7700 Management Interface (MI) enables an operator to
issue a Library Request host console command as though it were issued from the z/OS host.
The result of the command is displayed in the MI window.
The specified keywords are passed to the TS7700 identified by the library name to instruct it
about what type of information is being requested or which operation is to be run. Based on
the operation that is requested through the command, the TS7700 then returns information to
the host that is displayed in multiline write to operator (WTO) message CBR1280I.
Note: The information that is presented in the CBR1280I message comes directly from the
hardware as a response to the LI REQ command. If you have a question about the
information that is presented to the host in the CBR1280I messages that is generated as a
response to an LI REQ command, engage IBM hardware support.
This section describes some of the more useful and common LI REQ commands that a client
uses. A detailed description of the Host Console Request functions and responses is
available in IBM TS7700 Series z/OS Host Command Line Request User’s Guide, which is
available at the Techdocs website:
http://www.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/WP101091
LIBRARY REQUEST,library-name,keyword1[,keyword2][,keyword3][,keyword4][,L={a|name|name-a}]
The following parameters are optional. The optional parameters depend on the first keyword
specified. Based on the first keyword that is specified, zero or more of the additional keywords
might be appropriate:
keyword2 Specifies additional information in support of the operation that is
specified with the first keyword.
keyword3 Specifies additional information in support of the operation that is
specified with the first keyword.
keyword4 Specifies additional information in support of the operation that is
specified with the first keyword.
L={a | name | name-a}
Specifies where to display the results of the inquiry: the display area
(L=a), the console name (L=name), or both the console name and the
display area (L=name-a). The name parameter can be an
alphanumeric character string.
LIBRARY REQUEST,ATL3484F,STATUS,GRID
CBR1020I Processing LIBRARY command: REQUEST,ATL3484F,STATUS,GRID.
IEF196I IEF237I 0801 ALLOCATED TO SYS00093
CBR1280I Library ATL3484F request. 467
Keywords: STATUS,GRID
----------------------------------------------------------------------
GRID STATUS V2 .0
COMPOSITE LIBRARY VIEW
IMMED-DEFERRED OWNERSHIP-T/O RECONCILE HCOPY
LIBRARY STATE NUM MB MODE NUM NUM ENB
cluster0 ON 0 0 - 0 0 Y
cluster1 ON 0 0 - 0 0 Y
cluster2 ON 0 0 - 0 0 Y
----------------------------------------------------------------------
COMPOSITE LIBRARY VIEW
SYNC-DEFERRED
LIBRARY NUM MB
cluster0 0 0
cluster1 0 0
cluster2 0 0
----------------------------------------------------------------------
DISTRIBUTED LIBRARY VIEW
RUN-COPY-QUEUE DEF-COPY-QUEUE LSTATE PT FAM
LIBRARY STATE NUM MB NUM MB
cluster0 ON 0 0 0 0 A Y -
cluster1 ON 0 0 0 0 A N -
cluster2 ON 0 0 0 0 A N -
----------------------------------------------------------------------
ACTIVE-COPIES
LIBRARY RUN DEF
cluster0 0 0
cluster1 0 0
cluster2 0 0
----------------------------------------------------------------------
LIBRARY CODE-LEVELS
cluster0 8.41.269.6810
cluster1 8.41.200.41
cluster2 8.41.269.6810
LI REQ,ATL3484F,LVOL,HYD101
CBR1020I Processing LIBRARY command: REQ,ATL3484F,LVOL,HYD101.
CBR1280I Library ATL3484F request. 478
Keywords: LVOL,HYD101
----------------------------------------------------------------------
LOGICAL VOLUME INFORMATION V4 .1
LOGICAL VOLUME: HYD101
MEDIA TYPE: ECST
COMPRESSED SIZE (MB): 13
MAXIMUM VOLUME CAPACITY (MB): 1000
CURRENT OWNER: cluster0
MOUNTED LIBRARY:
MOUNTED VNODE:
MOUNTED DEVICE:
TVC LIBRARY: cluster0
MOUNT STATE:
CACHE PREFERENCE: PG0
CATEGORY: 000F
LAST MOUNTED (UTC): 2017-10-02 14:49:11
LAST MODIFIED (UTC): 2017-10-02 14:46:32
LAST MODIFIED VNODE: 00
LAST MODIFIED DEVICE: 00F0
TOTAL REQUIRED COPIES: 2
KNOWN CONSISTENT COPIES: 2
KNOWN REMOVED COPIES: 0
IMMEDIATE-DEFERRED: N
DELETE EXPIRED: N
RECONCILIATION REQUIRED: N
LWORM VOLUME: N
FLASH COPY: NOT ACTIVE
---------------------------------------------------------------------
LIBRARY RQ CACHE PRI PVOL SEC PVOL COPY ST COPY Q COPY CP REM
cluster0 N N A02014 ------ CMPT - SYNC N
cluster1 N Y ------ ------ CMPT - SYNC N
cluster2 N N ------ ------ NOT REQ - NO COPY N
---------------------------------------------------------------------
LIBRARY CP
cluster0 0
cluster1 0
cluster2 0
The results of this command are specified in the text section of message CBR1086I. To verify
the policy name settings and to see whether the CBRUXCUA installation exit changed the
policy names you set, display the status of the volume.
The syntax of the LIBRARY LMPOLICY command to assign or change volume policy names is
shown in Example 10-12.
If the request is successful, the construct name is changed to the requested name. If you
specify the *RESET* keyword, you are requesting that OAM set this construct to the default,
which is blanks.
The values that you specify for the SG, SC, MC, and DC policy names must meet the storage
management subsystem (SMS) naming convention standards:
Alphanumeric and national (special) characters only
Name must begin with an alphabetical or national (special) character ($, *, @, #, or %)
No leading or embedded blanks
Eight characters or less
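As a hedged sketch only (the volser and construct names are illustrative, and the exact
operand list should be verified against Example 10-12 and the OAM documentation), an
invocation might look like the following:
LI LMPOLICY,A00123,SG=SGTS7700,SC=SCPG1,MC=MCRUN2,DC=DC25GB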
Note: The DS QT command with the QHA operand can display which systems and sysplexes
are connected to a specific device in a cluster. For more information, see 10.8.2,
“Additional commands built to support CUIR functionality” on page 606.
DS QT,1C01,RDC
IEE459I 15.03.41 DEVSERV QTAPE 570
UNIT DTYPE DSTATUS CUTYPE DEVTYPE CU-SERIAL DEV-SERIAL ACL LIBID
1C01 3490L ON-RDY 3957C2A 3592 * 0178-272BP 0178-272BP I 3484F
READ DEVICE CHARACTERISTIC
34905434905400E0 1FD88080B61B41E9 00045AC000000000 3957413592410002
03484F0101000000 4281000004000000 0400000000000000 0000000000000000
**** 1 DEVICE(S) MET THE SELECTION CRITERIA
**** 1 DEVICE(S) WITH DEVICE EMULATION ACTIVE
01 – Distributed LIBRARY-ID
01 – LIBPORT-ID
Clarification: The distributed library number or cluster index number for a given logical
drive can be determined with the DS QT command. As identified in Figure 10-1, the
response shows LIBPORT-ID 01 for logical drive 9600. LIBPORT-ID 01 is associated with
Cluster 0. The association between distributed libraries and LIBPORT-IDs is discussed in
6.4.1, “Defining devices through HCD” on page 230.
Tip: You can get the device type of the backend physical drives of a distributed library from
the following LI REQ command:
LI REQ,<distributed library name>,PDRIVE
On the z/OS host, the new option QHA (Query Host Access to volume) has been added to the
existing DEVSERV QTAPE command. This option allows the command to surface which systems
and sysplexes are connected to a specific device in a cluster. For a full description of the
syntax and output of the new command, see “DEVSERV QTAPE,xxxx,QHA” on page 610.
After you have the actual categories, you can change them. To perform this task, you can
change the first 3 digits of the category. However, the last digit must remain unchanged
because it represents the media type. Example 10-14 shows the command that changes all
categories to 111 for the first 3 digits.
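Although Example 10-14 is not reproduced here, the following line is a sketch of the intent
only; the operand format is an assumption, so verify the exact syntax in Appendix D,
“DEVSERV QLIB command”. The trailing asterisk is intended to leave the media-type digit
unchanged:
DS QLIB,CATS(111*)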
Ensure that this change is also made in the DEVSUPxx PARMLIB member. If it is not, the
next IPL reverts categories to what they were in DEVSUPxx. For a further description of
changing categories, see 10.5, “Effects of changing volume categories” on page 603.
For a complete description of all of the DS QLIB commands, see Appendix D, “DEVSERV
QLIB command” on page 849.
USE ATR of S indicates that the volume is still in SCRATCH status and has not yet been
reused. Therefore, you have a chance to recover the volume contents if there is a consistent
copy in the TS7700. If the display for this command says USE ATR of P, it has been reused
and you cannot recover the contents of the volume by using host software procedures.
If KNOWN CONSISTENT COPIES is zero, you cannot recover this volume because it has
been DELETE EXPIRED already.
To start this process, use DFSMSrmm to search on a volume string. Then, put all of the
scratch volumes matching that string into a file with a TSO subcommand to change their
status back to MASTER, and set an Expiration Date to some future value (to prevent the next
run of DFSMSrmm Housekeeping from sending the volume back to SCRATCH), as shown in
Example 10-18.
Use the job control language (JCL) shown in Example 10-19 to run the previously generated
CLIST. This process can be done in the same job as the RMM SV command if no editing of the
generated list was needed to remove volumes without a consistent copy found. Altering the
status of such volumes to MASTER needlessly uses a scratch volser because the volume
contents have already been expire-deleted.
The D SMS,VOL command can now be used to verify that the VOLSER was changed from S to
P, as shown in Example 10-20.
Because of the permanent nature of EJECT, the TS7700 allows you to EJECT only a logical
volume that is in either the INSERT or SCRATCH category. If a logical volume is in any other
status, the EJECT fails. If you eject a scratch volume, you cannot recover the data on that
logical volume.
Tip: If a logical volume is in the error category (which is by default 000E), it must first be
moved back to a scratch category before an EJECT can be successful. To do so, use
ISMF ALTER to change the use attribute from SCRATCH to SCRATCH.
Volumes that are in the INSERT status can also be ejected by the resetting of the return code
through the CBRUXENT exit. This exit is provided by your tape management system vendor.
Another way to EJECT cartridges in the INSERT category is by using the MI. For more
information, see “Delete Virtual Volumes window” on page 416.
After the tape is in SCRATCH status, follow the procedure for EJECT processing specified by
your tape management system vendor. For DFSMSrmm, issue the RMM CHANGEVOLUME volser
EJECT command.
If your tape management system vendor does not specify how to do this, you can use one of
the following commands:
The z/OS command LIBRARY EJECT,volser
ISMF EJECT line operator for the tape volume
The EJECT process fails if the volume is in another status or category. For libraries managed
under DFSMS system-managed tape, the system command LIBRARY EJECT,volser sent to a
logical volume in PRIVATE status might fail with this message:
CBR3726I Function incompatible error code 6 from library <library-name> for volume
<volser>
If your tape management system is DFSMSrmm, you can use the commands that are shown
in Example 10-21 to clean up the Removable Media Management (RMM) control data set
(CDS) for failed logical volume ejects, and to resynchronize the TCDB and RMM CDS.
EXEC EXEC.RMM
The first RMM command asks for a list of volumes that RMM thinks it has ejected, and
writes a record for each in a sequential data set called prefix.EXEC.RMM.CLIST. The CLIST
then checks whether the volume is still resident in the VTS library and, if so, it corrects the
RMM CDS.
Issuing a large number of ejects at one time can cause some resource effect on the host. A
good limit for the number of outstanding eject requests is no more than 10,000 per system.
More ejects can be initiated when others complete. The following commands can be used on
the IBM Z hosts to list the outstanding and the active requests:
F OAM,QUERY,WAITING
F OAM,QUERY,ACTIVE
This indicates that a message has been sent from library library-name. Either the operator at
the library manager console has entered a message that is to be broadcast to the host, or the
library itself has broadcast a message to the host to relay status information or report an
error condition. A list of the messages that can be broadcast from the library to the host is
contained in the IBM TS7700 Series Operator Informational Messages White Paper, which
can be accessed at the following website:
http://www.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/WP101689
With OAM APAR OA52376 and R4.1.2 installed on all the clusters in a grid, this message has
been enhanced to surface the severity impact text (INFORMATION, WARNING, IMPACT,
SERIOUS, or CRITICAL) associated with each library message as documented in the white
paper noted above. This text prevents you from having to reference the white paper to
determine the severity associated with a particular library message, and results in the format
for the CBR3750I message being updated as follows:
CBR3750I Message from library library-name: message. Severity impact: severity
impact text.
You can optionally modify the severity impact text for each message through the TS7700 MI if
you feel a different severity impact text is more applicable in your environment. If you choose
to do so, an “*” is added to the severity impact text as follows:
CBR3750I Message from library library-name: message. Severity impact: severity
impact text*.
In addition, each library message can now optionally be associated with a free-form customer
impact text through the TS7700 MI. You can use this functionality to surface additional
information that might be relevant for a particular library message in your environment. If a
customer impact text is defined to a library message, the associated CBR3750I message is
surfaced as follows:
CBR3750I Message from library library-name: message. Severity impact: severity
impact text. Customer impact: customer-provided-impact-text.
In this message, an error has occurred during the processing of volume volser in library
library-name. The library returned a unit check with an error code error-code, which
indicates that an incompatible function has been requested. A command has been entered
that requests an operation that is understood by the subsystem microcode, but cannot be run.
The explanation for the error-code can be found in the TS7700 Customer IBM Knowledge
Center under Reference → Perform library function codes → Error recovery action
codes → Function Incompatible.
When the cache situation is resolved, the following messages are shown:
CBR3793I Library library-name has left the limited cache free space warning state.
CBR3795I Library library-name has left the out of cache resources critical state.
Empty physical volumes are needed in a pool or, if a pool is enabled for borrowing, in the
common scratch pool, for operations to continue. If a pool runs out of empty physical volumes
and there are no volumes that can be borrowed, or borrowing is not enabled, operations that
might use that pool on the distributed library must be suspended.
If one or more pools run out of empty physical volumes, the distributed library enters the Out
of Physical Scratch state. The Out of Physical Scratch state is reported to all hosts attached
to the cluster associated with the distributed library and, if included in a grid configuration, to
the other clusters in the grid.
The following MVS console message is generated to inform you of this condition:
CBR3789E VTS library library-name is out of empty stacked volumes.
Library-name is the name of the distributed library in the state. The CBR3789E message
remains on the MVS console until empty physical volumes are added to the library, or the pool
that is out has been enabled to borrow from the common scratch pool and there are empty
physical volumes to borrow. Intervention-required conditions are also generated for the
out-of-empty-stacked-volume state, and for the pool that is out of empty physical volumes.
If the option to send intervention conditions to attached hosts is set on the TS7700 that is
associated with the distributed library, the following console messages are also generated to
provide specifics about the pool that is out of empty physical volumes:
CBR3750I Message from library library-name: OP0138 The Common Scratch Pool (Pool
00) is out of empty media volumes.
CBR3750I Message from library library-name: OP0139 Storage pool xx is out of
scratch volumes.
The OP0138 message indicates the media type that is out in the common scratch pool. These
messages do not remain on the MVS console. The intervention conditions can be viewed
through the TS7700 MI.
Monitor the number of empty stacked volumes in a library. If the library is close to running out
of a physical volume media type, either expedite the reclamation of physical stacked volumes
or add more volumes. You can use the Bulk Volume Information Retrieval (BVIR) function to
obtain the physical media counts for each library. The information that is obtained includes the
empty physical volume counts by media type for the common scratch pool and each
defined pool.
If your Pool properties have a Second Media that is defined, and the primary media type is
exhausted, the library does not go into degraded status for out of scratch.
However, this change affects only new volumes that are going into the SCRATCH pool. The
existing volumes in the pool continue to be held until the original EXPIRE time has passed.
However, if EXPIRE HOLD is cleared, these volumes can then be added to the candidate list
for SCRATCH mounts. Therefore, clearing the EXPIRE HOLD option immediately helps to
alleviate the low on scratch condition, but it no longer protects data that has inadvertently
been sent to SCRATCH. The recovery of user data on volumes in the SCRATCH pool might
no longer be certain.
Once per hour, a task runs in the library that processes some number of volumes from this
list, and reduces cache utilization by deleting the expired volumes from cache. The number of
volumes that are deleted per hour is by default 1000. The number of volumes that are moved
to this candidate list is customizable (1000 - 2000), and is controlled by using the
LI REQ,composite-library,SETTING,DELEXP,COUNT,value command that is documented in
10.1.3, “Host Console Request function” on page 588, and in the following white paper:
http://www.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/WP101091
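For example, to raise the hourly delete-expire count to 1500 on the composite library that is
used in the earlier examples (the library name and value are illustrative, with the value within
the documented 1000 - 2000 range):
LI REQ,ATL3484F,SETTING,DELEXP,COUNT,1500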
The EXPIRE time is the grace period that enables the recovery of the data in the case of
procedural error. Careful consideration needs to be made to ensure that this value is long
enough to allow for such errors to be detected, and the data recovered, before the DELETE
EXPIRE process removes the logical volume permanently.
When EXPIRE HOLD is in effect, the total number of scratch volumes differs from the total
number of usable SCRATCH volumes because volumes for which the EXPIRE time has not
yet elapsed are not eligible to be selected to satisfy a SCRATCH mount request. For this
reason, the most accurate source of scratch counts for a TS7700 is always the output from
the D SMS,LIBRARY(libname),DETAIL command.
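For example, against the composite library that is used in the earlier LI REQ examples (the
library name is illustrative):
D SMS,LIBRARY(ATL3484F),DETAIL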
The categories to be assigned by the host can be changed dynamically by using the DEVSERV
QLIB,CATS command or by modifying the values specified in the DEVSUPxx member and
IPLing.
Special consideration should be given to the effects that the change will have on the host. The
most common problem is that, after the change, users might forget that all of the logical
volumes in the scratch pool still belong to the initially defined categories. If no volumes have
been assigned the new categories, requests for scratch mounts will fail and OAM will surface
the CBR4105I and CBR4196D messages.
There are several ways to resolve such an issue. If the categories were changed because
there is a desire to partition the library, then a new scratch pool must be created for this host
by adding a range of volumes that will be accepted by the TMS during cartridge entry
processing. If the old scratch pool was intended to be used by this host, then the category of
a single volume or range of volumes can be updated by using the ISMF panels to ALTER one
or more use attributes from SCRATCH to SCRATCH.
This step sets the category of the volume in the library manager to match the associated
newly defined category on the host. If a large range of volumes needs to be changed,
consider using the CBRSPLCS utility to perform such a change. For more information about
how to use this utility, see z/OS DFSMS Object Access Method Planning, Installation, and
Storage Administration Guide for Tape Libraries, SC23-6867.
A modification of the volume entries in the TCDB using IDCAMS does not change the
categories that are stored in the library manager and should not be used for this purpose.
The advised action for these messages is to contact the service center and engage support
to determine why the call home was being attempted.
The CBR4196D message is issued along with messages describing the error condition, such
as the following example:
CBR4195I LACS retry possible for job VTNCMP: 015
IEE763I NAME= CBRLLACS CODE= 140169
CBR4000I LACS MOUNT permanent error for drive 0BCA.
CBR4105I No MEDIA2 scratch volumes available in library TVL10001.
IEE764I END OF CBR4195I RELATED MESSAGES
*06 CBR4196D Job VTNCMP, drive 0BCA, volser SCRTCH, error code 140169.
Reply ‘R’ to retry or ‘C’ to cancel.
With OAM APAR OA52376 applied, you can instruct OAM to automate the mount’s retry
function. This function is specified by using the existing SETTLIB command in the CBROAMxx
startup parmlib member, and is controlled by three new parameters:
LACSRETRYTIMES(1-9): Specifies the number of times to automatically retry the mount
LACSRETRYMINUTES(1-9): Specifies the number of minutes between each automatic
retry
LACSRETRYFAIL(YES|NO): Specifies whether the mount should be failed or the
CBR4196D surfaced
You can use different combinations of these three parameters to control how OAM will
automatically respond to retry-able mount failures. The following are the possible OAM
responses:
Retry the mount a number of times (LACSRETRYTIMES), every number of minutes
(LACSRETRYMINUTES). If the mount has not yet succeeded at the end of the retry
processing, surface the CBR4196D message (LACSRETRYFAIL(NO)).
As shown above, a new message, CBR4197D, will also be surfaced so you can cancel out of
the automatic retry. If the mount is canceled, the LACSRETRYFAIL specification is not used
and the mount will be failed immediately.
If, after the six retries every two minutes (for a total of 12 minutes of retry processing), the
mount has not been satisfied, the following message will be issued if LACSRETRYFAIL is set
to YES:
IEE763I NAME=CBRLLACS CODE= 140169
CBR4000I LACS MOUNT permanent error for drive 0BCA.
CBR4105I No MEDIA2 scratch volumes available in library TVL10001.
IEE764I END OF IEC147I RELATED MESSAGES
This is the same set of messages that would be issued if the mount is not considered to be
retry-able, or if the CBR4196D message is responded to with “C”. Otherwise, if
LACSRETRYFAIL is set to NO, the CBR4196D message is issued as shown above.
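The following line is a minimal sketch of how the six-retries-every-two-minutes example above
might be coded on the SETTLIB statement in the CBROAMxx member; the exact member coding
conventions should be verified in the OAM documentation:
SETTLIB LACSRETRYTIMES(6) LACSRETRYMINUTES(2) LACSRETRYFAIL(NO)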
When R4.1.2 is installed on each cluster in a grid, CUIR can detect that a cluster in the grid is
entering service. It then signals to each host connected to that cluster that the host needs to
vary offline any devices that the host has online to that cluster. If the z/OS host has OA52376
installed, it receives this signal and attempts to vary offline the devices in the cluster. Any long
running jobs will need to be canceled or swapped to another cluster to ensure that the device
transitions from “pending offline” to “offline”.
When a device is varied offline by this CUIR functionality, it is known as being “offline for CUIR
reasons”. After the cluster is no longer in service mode, it can notify each connected host that
the status has changed and each host can vary those devices that were “offline for CUIR
reasons”, back online.
The signaling of the host by the cluster when the cluster is no longer in service mode is
enabled/disabled through another new LIBRARY REQUEST command that has the following
syntax:
LIBRARY REQUEST,library-name,CUIR,AONLINE,{SERVICE|FENCE|ALL},{ENABLE|DISABLE}
In each command above, library-name is the name of the composite library as defined in
SMS on the z/OS host. For the 4.1.2 release, the keyword FENCE is not yet supported. If it is
used on one of these LIBRARY REQUEST commands, the grid returns an error message.
Note: The FENCE keyword is not supported in R4.1.2. If ALL is specified, only the SERVICE
setting is affected.
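For example, to enable the automatic online notification after service for a composite library
named ATL3484F (the library name is illustrative and reused from earlier examples):
LIBRARY REQUEST,ATL3484F,CUIR,AONLINE,SERVICE,ENABLE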
This section discusses each of these commands and provides the syntax, output, and
keyword descriptions for each.
The output for the command has the following syntax when issued against a composite
library:
Logical Drive Status Information V1 .0
Composite Library View
Current Time (UTC): yyyy-mm-dd hh:mm:ss
Service Vary: Enabled|Disabled Auto Online: Enabled|Disabled
Unhealthy Vary: Enabled|Disabled Auto Online: Enabled|Disabled
XXXXX (CL0)
Assigned/Grouped/Total Devices: nnnn/nnnn/nnnn
Assigned/Grouped/Total LPARS: nnnn/nnnn/nnnn SSV: nnnn SUV: nnnn
Active Service Vary: Y|N
Active Unhealthy Vary: Y|N
XXXXX (CL1)
Assigned/Grouped/Total Devices: nnnn/nnnn/nnnn
Assigned/Grouped/Total LPARS: nnnn/nnnn/nnnn SSV: nnnn SUV: nnnn
Active Service Vary: Y|N
Active Unhealthy Vary: Y|N
... ...
XXXXX (CL7)
Assigned/Grouped/Total Devices: nnnn/nnnn/nnnn
Assigned/Grouped/Total LPARS: nnnn/nnnn/nnnn SSV: nnnn SUV: nnnn
Active Service Vary: Y|N
Active Unhealthy Vary: Y|N
Note: Anything pertaining to “unhealthy vary” is there for future support and is not used
with the R4.1.2 CUIR support.
The output for the command has the following syntax when issued against a distributed
library:
Logical Drive Status Information V1 .0
Distributed Library View
Current Time (UTC): yyyy-mm-dd hh:mm:ss
XXXXX (CLx)
Service Vary: Enabled|Disabled Auto Online: Enabled|Disabled
Unhealthy Vary: Enabled|Disabled Auto Online: Enabled|Disabled
Assigned/Grouped/Total Devices: nnnn/nnnn/nnnn
Assigned/Grouped/Total LPARS: nnnn/nnnn/nnnn SSV: nnnn SUV: nnnn
Active Service Vary: Y|N
Active Unhealthy Vary: Y|N <cluster index list>
Active Path Group Indexes
list of indexes from 0-4095 separated by a single space (multiple rows possible)
The keywords for this command when issued against a distributed library are the same as
when issued against a composite library, with the following addition:
Active Path Group Indexes: The list of Active PGIDs
For more information, see the IBM TS7700 Series Control Unit Initiated Reconfiguration
(CUIR) User’s Guide at:
http://www.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/WP102743
LIBRARY REQUEST,library-name,LDRIVE,GROUP,index
When the additional GROUP parameter is specified on this command (along with an
associated index listed from the distributed library output above), additional information can
be obtained about an index (LPAR) attached to that distributed library. This version of the
command has the following syntax:
LIBRARY REQUEST,library-name,LDRIVE,GROUP,index
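For example, assuming a distributed library named DISTLIB3 and a path group index of 12
taken from the Active Path Group Indexes list (both values are illustrative):
LI REQ,DISTLIB3,LDRIVE,GROUP,12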
DEVSERV QTAPE,xxxx,QHA
On the z/OS host, a new option, QHA (Query Host Access to volume) has been added to the
existing DEVSERV QTAPE command. This option allows the command to surface which systems
and SYSplexs are connected to a specific device in a cluster. The command has the following
syntax:
DS QT,xxxx,QHA
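For example, against device 1C01 that is shown in the earlier DEVSERV QTAPE example:
DS QT,1C01,QHA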
The FLAGS field contains information about how the host has configured the device and
whether the host supports CUIR. The contents of the FLAGS field are defined as follows:
2000: The host/LPAR supports the automatic service notification through the distributed
library notification attention.
4000: The host/LPAR is grouped to the device.
6000: The host/LPAR supports the automatic service notification and the host/LPAR is
grouped to the device.
8000: Device Explicitly Assigned by Host/LPAR.
C000: The host/LPAR is grouped to the device and Device Explicitly Assigned by
Host/LPAR.
E000: The host/LPAR supports the automatic service notification and the host/LPAR is
grouped to the device and Device Explicitly Assigned by Host/LPAR.
For more information, see the IBM TS7700 Series Control Unit Initiated Reconfiguration
(CUIR) User’s Guide at:
http://www.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/WP102743
LIBRARY DISPDRV,library-name
The existing LIBRARY DISPDRV,library-name command has been updated with a new field,
CU, that was added to account for the new offline reason code for CUIR. The output for the
command has the new syntax:
CBR1220I Tape drive status:
DRIVE DEVICE LIBRARY ON OFFREASN LM ICL ICL MOUNT
NUM TYPE NAME LI OP PT CU AV CATEGRY LOAD VOLUME
devnum devtyp libname b c d e f g hhhhhhh i mntvol
If a device is offline for CUIR reasons, the field contains a ‘Y’. Otherwise, it contains an ‘N’.
APAR OA48240 (z/OS V1R13+) can eliminate the second I/O call. After the APAR is applied,
this change can be done with the LIBRARY DISABLE,CATCOUNT command. This change can
decrease the overall duration of return-to-scratch processing. Re-enabling the second I/O call
can be done with the LIBRARY RESET,CATCOUNT command.
Even though the second I/O call might be disabled, OAM is able to stay updated as to the
current count of scratch tapes in the library through the use of a monitoring task in the OAM
address space. This task queries the library for the current scratch count every 10 minutes. In
addition, OAM continues to update the scratch count when a volume is changed from scratch
to private.
The following steps describe how to create the volume entry for the volume (if it is not already
present in the TCDB), ALTER the volume to SCRATCH status, and EJECT the volume from
the host using different methods:
1. Use the following JCL to invoke IDCAMS to create the volume entry in the TCDB:
//CREATVOL JOB ...
//STEP1 EXEC PGM=IDCAMS
//SYSPRINT DD SYSOUT=*
//SYSIN DD *
CREATE VOLUMEENTRY -
(NAME(Vxxxxxx) -
LIBRARYNAME(libname) -
MEDIATYPE(mediatype) -
LOCATION(LIBRARY))
/*
2. Use ISMF to ALTER the use attribute from SCRATCH to SCRATCH. This command
invokes the CBRUXCUA exit to communicate with the TMS. If the TMS indicates that the
change is allowed to process, the category is changed in the library to the category
defined for the corresponding media type in this host’s DEVSUPxx parmlib member.
Note: At the time of updating this book the performance measurements were not available
for TS7700 V4.1.2. The IBM TS7700 R4 (TS7760) Performance White Paper contains the
most recent updates:
http://www.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/WP102652
With R3.2, the TS7700 tape attach (TS7700T) was introduced. In general, the tape attach
models are based on the same physical hardware as the disk-only models, and the basic
performance numbers are identical. However, in addition to what a TS7700D does, the
TS7700T needs to write data from the cache to the back-end drives, process recalls, and use
additional resources for reclaims. It is necessary to understand how these actions affect the
overall performance of the TS7700T.
R3.1 introduced the next generation 8-gigabit (Gb) Fibre Channel connection (FICON)
adapters. The TS7700 4-port 8 Gb FICON adapter is the same type as the IBM System
Storage DS8700 family 8-port 8 Gb FICON adapter. This adapter provides the same cyclic
redundancy check (CRC) protection and inline compression as previous TS7700 FICON
adapters provided.
R3.1 supports two ports per 8-Gb FICON adapter only. In a fully populated TS7700, with four
8-Gb adapters, there are eight ports available for TS7700 host connections.
Each port on the 8-Gb FICON adapter supports 512 logical paths, which is twice the number
of logical paths that are supported by the 4-Gb FICON adapters. When fully configured with
eight 8-Gb FICON channels, the TS7700 supports 4096 logical paths.
This means that you have more flexibility when connecting large numbers of logical partitions
(LPARs).
This chapter includes the newest overall performance information, especially for the TS7760
models:
An overview of the shared tasks that are running in the TS7700 server
A description of a TS7700 monitoring and performance evaluation methodology
Understanding the specifics of the TS7700T regarding the different throttling impacts on
CP0 and CPx
Performance monitoring with the TS7700 GUI
Additional information about performance alerting and thresholds
Information about the handling of Sync-Deferred and Immediate-Deferred copies
A review of bulk volume information retrieval (BVIR) and VEHSTATS reporting
VEHAUDIT and BVIRAUDIT usage
A detailed description of the device allocation possibilities regarding the TS7700
capabilities
A brief overview of the tasks in the TS7700 is provided so that you can understand the effect
that contention for these resources has on the performance of the TS7700.
The monitoring section can help you understand the performance-related data that is
recorded in the TS7700. It discusses the performance issues that might arise with the
TS7700. This chapter can also help you recognize the symptoms that indicate that the
TS7700 configuration is at or near its maximum performance capability. The information that
is provided can help you evaluate the options available to improve the throughput and
performance of the TS7700.
You might experience deviations from the presented figures in your environment. The
measurements are based on a theoretical workload profile, and cannot be fully compared
with a varying workload. The performance factors and numbers for configurations are shown
in the following pages.
Based on initial modeling and measurements, and assuming a 2.66:1 compression ratio,
Figure 11-1 on page 616 shows the evolution in the write performance with the TS7700
family, which is also described in more detail in IBM TS7700 R4 (TS7760) Performance,
WP102652. The following charts are for illustrative purposes only. Always use the most
recently published performance white papers available on the Techdocs website at the
following address:
http://www.ibm.com/support/techdocs/atsmastr.nsf/Web/TechDocs
Figure 11-1 shows the evolution of performance in the TS7700 IBM family compared with the
previous member of the IBM Tape Virtualization family, the IBM Virtual Tape Server (VTS). All
runs were made with 128 concurrent jobs, using 32 kibibyte (KiB) blocks, and queued
sequential access method (QSAM) BUFNO = 20. Before R 3.2, volume size is 800 mebibytes
(MiB), made up of 300 MiB volumes @ 2.66:1 compression. In R 3.2, the volume size is 2659
MiB (1000 MiB volumes @ 2.66:1 compression).
Figure 11-2 VTS and TS7700 maximum host read hit throughput
The numbers that are shown in Figure 11-2 were obtained with 128 concurrent jobs in all
runs, each job using 32 KiB blocks, and QSAM BUFNO = 20. Before R 3.2, volume size is
800 MiB (300 MiB volumes @ 2.66:1 compression). Since R 3.2, the volume size is 2659 MiB
(1000 MiB volumes @ 2.66:1 compression).
These are the five major aspects that influence the overall performance:
TS7700 components and task distribution
Replication modes and grid link considerations
Workload profile from your hosts
Lifecycle Management of your data
Parameters and customization of the TS7700
See Figure 11-3 for an overview of all of the tasks. The tasks that TS7700 runs, the
correlation of the tasks to the components that are involved, and tuning points that can be
used to favor certain tasks over others are all described.
All of these tasks share resources, especially the TS7700 Server processor, the TVC, and the
physical tape drives attached to a TS7700T or TS7740. Contention might occur for these
resources when high workload demands are placed on the TS7700. To manage the use of
shared resources, the TS7700 uses various resource management algorithms, which can
have a significant impact on the level of performance that is achieved for a specific workload.
In general, the administrative tasks (except premigration) have lower priority than host-related
operations. In certain situations, the TS7700 grants higher priority to activities to solve a
problem state, including the following scenarios:
Panic reclamation: The TS7740 or TS7700T detects that the number of empty physical
volumes has dropped below the minimum value, and reclaims need to be done
immediately to increase the count.
Cache fills with copy data: To protect from uncopied volumes being removed from cache,
the TS7700T and TS7740 throttle data coming into the cache. For the TS7700T, this type
of throttling occurs only to Host I/O related to the CPx partitions. Data that is written to
CP0 is not throttled in this situation.
A complete description of the tasks processing can be found in the IBM Virtualization Engine
TS7700 Series Best Practices - Understanding, Monitoring and Tuning the TS7700
Performance white paper:
http://www.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/WP101465
In addition, the chosen copy mode might also have a direct influence on the performance of
the TS7700 grid. Although some of the modes enable the copies to be run at a non-peak time
(all types of deferred copies), other copy modes are enforced to run before job end
(Synchronous, RUN).
This implies that resources from the TS7700 (cache bandwidth, CPU, and grid link
bandwidth) need to be available at this point in time.
Clarification: Cross-cluster mounts to other clusters do not move data through local
cache. Also, reclaim data does not move through the cache.
Family members are given higher weight when deciding which cluster to prefer for TVC
selection.
Members of a family source their copies within the family when possible. In this manner, data
does not have to travel across the long link between sites, optimizing the use of the data link
and shortening the copy time. Also, the family members cooperate among themselves, each
pulling a copy of separate volume and exchanging them later among family members.
With cooperative replication, a family prefers retrieving a new volume that the family does not
have a copy of yet, over copying a volume within a family. When fewer than 20 new copies are
to be made from other families, the family clusters copy among themselves. Therefore,
second copies of volumes within a family are deferred in preference to new volume copies
into the family.
When a copy within a family is queued for 12 hours or more, it is given equal priority with
copies from other families. This prevents family copies from stagnating in the copy queue.
For details about cluster families, see IBM Virtualization Engine TS7700 Series Best
Practices -TS7700 Hybrid Grid Usage at:
http://www.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/WP101656
Unplanned read workload might have peaks that affect, on the one hand, the response times
of these actions (read and recall times). On the other hand, these actions can also influence
the deferred copy times and, in a TS7740, the reclamation execution. Changes in the workload
profile might affect the replication time of deferred copies and can lead to throttling situations.
Therefore, review the performance charts of the TS7700 to identify workload profile changes,
and take appropriate performance tuning measures if necessary.
If you change the preconfigured values, review your adjustment with the performance
monitoring tools.
For more information, see the IBM Virtualization Engine TS7700 Series Best Practices -
Understanding, Monitoring and Tuning the TS7700 Performance white paper:
http://www.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/WP101465
Because the TS7740 and TS7700T contain physical tapes to which the cache data is
periodically written, recalls from tape to cache occur, and Copy Export and reclaim activities
occur, the TS7740 and TS7700T exhibit four basic throughput rates:
Peak write
Sustained write
Read-hit
Recall
This throttling mechanism operates to achieve a balance between the amount of data coming
in from the host and the amount of data being copied to physical tape. The resulting data rate
for this mode of behavior is called the sustained data rate, and can theoretically continue on
forever, given a constant supply of logical and physical scratch tapes.
This second threshold is called the premigration throttling threshold, and has a default value
of 2000 GB. These two thresholds can be used with the peak data rate to project the duration
of the peak period. Both the priority and throttling thresholds can be increased through a host
command-line request, which is described later in this chapter.
Read-hit data rates are typically higher than recall data rates.
Summary
The two read performance metrics, along with peak and sustained write performance, are
sometimes referred to as the four corners of virtual tape performance. Performance depends
on several factors that can vary greatly from installation to installation, such as number of
physical tape drives, spread of requested logical volumes over physical volumes, location of
the logical volumes on the physical volumes, and length of the physical media.
This control is accomplished by delaying the launch of new tasks and prioritizing more
important tasks over the other tasks. After the tasks are dispatched and running, control over
the execution is accomplished by slowing down a specific functional area by introducing
calculated amounts of delay in the operations. This alleviates stress on an overloaded
component, or leaves extra central processor unit (CPU) cycles to another needed function,
or waits for a slower operation to finish.
The subsystem has a series of self-regulatory mechanisms that try to optimize the shared
resources usage. Subsystem resources, such as CPU, cache bandwidth, cache size, host
channel bandwidth, grid network bandwidth, and physical drives, are limited, and they must
be shared by all tasks moving data throughout the subsystem.
Important: Resident partition (CP0) and Tape Partitions (CP1 - CP7) are monitored and
handled separately in a TS7700T.
However, even if the PMTHLVL throttling does not apply to the CP0 of a TS7700T, there is still
an indirect influence because of the shared cache modules.
Consider the following items when you configure or monitor a TS7700T resource usage:
Workloads that create data (either host I/O, remote writes, or copy processes from other
clusters) in CP0 use cache bandwidth resources (write to cache).
After PMTHLVL is crossed for a CPx, the host I/O creating data in the CPx is throttled,
but there is no throttling of the jobs creating data in CP0.
In small configurations (for example, up to four drawers), this can lead to the situation
where the jobs running to CP0 use the cache bandwidth resources, and resources for
premigration might be limited. This is especially true for a TS7720T because of the cache
that is used.
If the unpremigrated amount of data still increases, the throttling of the workload into the
CPx increases too. The delays might become large enough that the jobs creating data in
CPx are seriously impacted.
The availability of TS7740/TS7700T physical tape drives for certain functions can significantly
affect TS7740/TS7700T performance.
The TS7740/TS7700T manages the internal allocation of these drives as required for various
functions, but it usually reserves at least one physical drive for recall and one drive for
premigration.
TVC management algorithms also influence the allocation of physical tape drives, as
described in the following examples:
Cache freespace low: The TS7740/TS7700T increases the number of drives available to
the premigration function, and reduces the number of drives available for recalls.
Premigration threshold crossed: The TS7740/TS7700T reduces the number of drives
available for recall down to a minimum of one drive to make drives available for the
premigration function.
The number of drives available for recall or copy is also reduced during reclamation.
The number of drives for premigration can be restricted on a physical pool basis. If the number
of drives available for premigration is restricted, or the physical drives are already used by
other processes, the number of virtual volumes in the cache that can be premigrated might be
limited. This might lead to premigration throttling (host I/O is throttled), and later it can
lead to free space or copy queue throttling.
If no physical drive is available when a recall is requested, elongated virtual mount times for
logical volumes that are being recalled can be the result.
Recall performance is highly dependent on both the placement of the recalled logical volumes
on stacked volumes, and the order in which the logical volumes are recalled. To minimize the
effects of volume pooling on sustained write performance, volumes are premigrated by using
a different distribution algorithm.
This algorithm chains several volumes together on the same stacked volume for the same
pool. This can change recall performance, sometimes making it better, sometimes making it
worse. Other than variations in performance because of differences in distribution over the
stacked volumes, recall performance should be fairly constant.
Reclaim policies must be set in the Management Interface (MI) for each volume pool.
Reclamation occupies drives and can affect performance. Using multiple physical pools can
cause a higher usage of physical drives for premigration and reclaim.
In general, the more pools are used, the more drives are needed. If all drives are busy and a
recall is requested, the reclaim process is interrupted. That can take from a few seconds up to
minutes, because the logical volume that is currently being moved needs to be finished, and
then the cartridge needs to be exchanged.
The Inhibit Reclaim schedule is also set from the MI, and it can prevent reclamation from
running during specified time frames during the week. If Secure Data Erase is used, fewer
physical tape drives might be available even during times when reclamation is inhibited.
If Secure Data Erase is used, limit it to a specific group of data. Inhibit Reclaim
specifications only partially apply to Secure Data Erase.
Note: Secure Data Erase does not fully acknowledge your Inhibit Reclaim settings and can
run erasure operations whenever there are physical volumes to be erased.
The TS7700 provides a MI based on open standards through which a storage management
application can request specific information that the TS7700 maintains. The open standards
are not currently supported for applications running under IBM z/OS, so an alternative
method is needed to provide the information to mainframe applications.
You can use the following interfaces, tools, and methods to monitor the TS7700:
IBM TS4500 and TS3500 tape library Specialist (TS7740/TS7700T only)
TS7700 MI
Bulk Volume Information Retrieval function (BVIR)
IBM Tape Tools: VEHSTATS, VEHAUDIT, and VEHGRXCL
Host Console Request Commands
The specialist and MI are web-based. With the BVIR function, various types of monitoring and
performance-related information can be requested through a host logical volume in the
TS7700. Finally, the VEHSTATS tools can be used to format the BVIR responses, which are
in a binary format, to create usable statistical reports.
With the VEHSTATS data, there are now performance evaluation tools available on Techdocs
that quickly create performance-related charts. Performance tools are provided to analyze 24
hours worth of 15-minute data, seven days worth of one-hour interval data, and 90 days worth
of daily summary data. For spreadsheets, data collection requirements, and trending
evaluation guides, see the following Techdocs website:
http://www.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/PRS4717
Figure 11-4 Interfaces, tools, and methods to monitor the TS7700
Point-in-time statistics
These statistics are performance-related. The point-in-time information is intended to supply
information about what the system is doing the instant that the request is made to the system.
This information is not persistent on the system. The TS7700 updates these statistics every
15 seconds, but it does not retain them.
This information focuses on the individual components of the system and their current activity.
These statistics report operations over the last full 15-second interval. You can retrieve the
point-in-time statistics from the TS7700 at any time by using the BVIR facility. A subset of
point-in-time statistics is also available on the TS7700 MI.
The response records are written in binary undefined (U) format of maximum 24,000 bytes.
Tips:
If a cluster or node is not available at the time that the point-in-time statistics are
recorded, except for the headers, all the data fields for that cluster or node are zeros.
The request records are written in fixed-block architecture (FB) format. To read the
response records, use the Undefined (U) format with a maximum blocksize of 24,000.
The response records are variable in length.
The user can also retrieve these records by using BVIR. A subset of the historical statistics is
also available on the TS7700 MI. More information is available in 11.5, “Cache capacity” on
page 643.
The historical statistics for all clusters are returned in the response records. In a TS7700 grid
configuration, this means that the request volume can be written to any cluster to obtain
the information for the entire configuration. The response records are written in a binary
undefined (U) format of a maximum of 24,000 bytes.
Tips:
If a cluster or node is not available at the time that the historical statistics are recorded,
except for the headers, all the data fields for that cluster or node are zeros.
The TS7700 retains 90 days worth of historical statistics. If you want to keep statistics
for a longer period, be sure that you retain the logical volumes that are used to obtain
the statistics.
The request records are written in FB format. To read the response records, use the
undefined (U) format with a maximum blocksize of 24,000. The response records are
variable in length.
Both point-in-time statistics and historical statistics are recorded. The point-in-time records
present data from the most recent interval, providing speedometer-like statistics. The
historical statistics provide data that can be used to observe historical trends.
These statistical records are available to a host through the BVIR facility. For more
information about how to format and analyze these records, see 11.15, “Alerts and exception
and message handling” on page 692.
Each cluster in a grid has its own set of point-in-time and historical statistics for both the
vNode and hNode.
For a complete description of the records, see IBM TS7700 Series Statistical Data Format
White Paper and TS7700 VEHSTATS Decoder reference, which are available on the following
Techdocs web pages:
http://www.ibm.com/support/techdocs/atsmastr.nsf/Web/Techdocs
http://www.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/WP100829
http://www.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/TD105477
In the TS4500 Logical Library view, you can find information about the number of
cartridges, drives, and maximum cartridges. Use the FILTER option to select the columns.
Figure 11-8 TS3500 Tape Library Specialist Operator Panel Security window
Some information that is provided by the TS3500 Tape Library Specialist is in a display-only
format and there is no option to download data. Other windows provide a link for data that is
available only when downloaded to a workstation. The data, in comma-separated value
(CSV) format, can be downloaded directly to a computer and then used as input for snapshot
analysis for the TS3500. This information refers to the TS3500 and its physical drive usage
statistics from a TS3500 standpoint only.
For more information, including how to request and use this data, see IBM TS3500 Tape
Library with System z Attachment A Practical Guide to Enterprise Tape Drives and TS3500
Tape Automation, SG24-6789.
Consideration: These statistics do not provide information about data transfer from the host
to the TS7700 or from the host to the controller.
The TS7700 MI is based on a web server that is installed in each TS7700. You can access
this interface with any standard web browser.
A link is available to the physical tape library management interface from the TS7700 MI, as
shown at the lower left corner of Figure 11-9 on page 634. This link might not be available if
not configured during TS7740/TS7700T installation, or for a TS7700D.
The navigation pane is available on the left side of the MI, as shown in the Grid Summary
window shown in Figure 11-9.
Historical Summary
This window (Figure 11-10 on page 635) shows the various performance statistics over a
period of 24 hours. Data is retrieved from the Historical Statistic Records. It presents data in
averages over 15-minute periods.
Multiple views can be selected, depending also on the installed cluster type, such as
displaying the following information for a maximum of a whole day:
Throughput performance
Throttling information
Copy Queue
Figure 11-10 shows the Historical Summary "Throughput" window for a TS7760T.
The precustomized "Throughput" view enables you to see all cache bandwidth-relevant information:
Primary cache device write
Primary cache device read
Remote read
Remote write
Link copy in
Link copy out
Write to tape
Recall from tape
A pre-customized “Throttling” graph is shown in Figure 11-11. It shows which type of throttling
happened during the selected interval, and the throttling impact in milliseconds.
This copy queue view shows how many MiB are sitting in the queue for a specific copy
consistency policy.
The displayed metrics can be changed by selecting different metrics. Remember that you can
select only 10 items for one graph, as shown in Figure 11-13.
Although this is a snapshot of one day or less, the performance evaluation tools on
Techdocs provide you with a 24-hour or 90-day overview of these numbers. Review the
numbers to help you with these tasks:
Identify your workload peaks and possible bottlenecks.
See trends to identify increasing workload.
Identify schedule times for reclaim.
For more information about using the tool, see 11.6.1, “Interpreting Cache throughput:
Performance graph” on page 647. The Performance evaluation tool does not support new
content regarding the TS7700T yet.
The “Number of logical mounts during last 15 minutes” table has the following information:
Cluster The cluster name
Fast Ready Number of logical mounts that are completed by using the scratch
(Fast Ready) method
Cache Hits Number of logical mounts that are completed from cache
Cache Misses Number of mount requests that were not fulfilled from cache
Total Total number of logical mounts
The “Average mount times (ms) during last 15 minutes” table has the following information:
Cluster The cluster name
Fast Ready Average mount time for scratch (Fast Ready) logical mounts
Cache Hits Average mount time for logical mounts that are completed from cache
Cache Misses Average mount time for requests that are not fulfilled from cache
This view only gives you an overview of whether you run out of virtual drives in a cluster.
Depending on your environment, it does not show whether a specific LPAR or sysplex has a
shortage of virtual drives. Especially if you define virtual drives statically to an LPAR
(without an allocation manager), a certain LPAR might not have enough drives. To ensure that
a specific LPAR has enough virtual drives, analyze your environment with the IBM Tape Tools
MOUNTMON utility.
Review the used numbers of physical drives to help you with the following tasks:
Identify upcoming bottlenecks.
Determine whether it is appropriate to add or reduce physical pools. Using a larger
number of pools requires more physical drives to handle the premigration, recall, and
reclaim activity.
Determine possible timelines for Copy Export operations.
Use this window to view statistics for each cluster, vNode, host adapter, and host adapter port
in the grid. At the top of the window is a collapsible tree where you view statistics for a specific
level of the grid and cluster. Click the grid to view information for each cluster. Click the cluster
link to view information for each vNode. Click the vNode link to view information for each host
adapter. Click a host adapter link to view information for each port.
The host throughput data is displayed in two bar graphs and one table. The bar graphs are for
raw data that is coming from the host to the host bus adapter (HBA), and for compressed data
that is going from the HBA to the virtual drive on the vNode.
The letter next to the table heading corresponds with the letter in the diagram above the table.
Data is available for a cluster, vNode, host adapter, and host adapter port. The table cells
include the following items:
Cluster The cluster or cluster component for which data is being
displayed (vNode, host adapter, or host adapter port)
Compressed Read (A) Amount of data that is read between the virtual drive and
the HBA
Raw Read (B) Amount of data that is read between the HBA and host
Read Compression Ratio Ratio of compressed read data to raw read data
Compressed Write (D) Amount of data that is written from the HBA to the virtual
drive
Raw Write (C) Amount of data that is written from the host to the HBA
Write Compression Ratio Ratio of compressed written data to raw written data
Although this is a snapshot, the performance evaluation tools on Techdocs provide you with a
24-hour, 7-day, or 90-day overview of these numbers.
Cache throttling is a time interval that is applied to TS7700 internal functions to improve
throughput performance to the host. The cache throttling statistics for each cluster that relate
to copy and write are displayed both in a bar graph form and in a table. The table shows the
following items:
Cluster The cluster name
Copy The amount of time that is inserted between internal copy operations
Write The amount of time that is inserted between host write operations
This example is from a TS7740 cluster that is part of a multi-cluster grid configuration
(four-cluster grid).
The cache utilization statistics can be selected for each cluster. Various aspects of cache
performance are displayed for each cluster. Select them from the Select cache utilizations
statistics menu. The data is displayed in both bar graph and table form, and can be displayed
also by preference groups 0 and 1.
Review this data with the performance evaluation tool from Techdocs to identify the following
conditions:
Cache shortages, especially in your TS7700D
Improvement capabilities for your cache usage through the adjustment of copy policies
Cache Partitions
You can use this window (Figure 11-19) to view tape cache partitions and their utilization.
The output that you see depends on your filter list.
This window presents information about cross-cluster data transfer rates. This selection is
present only in a multi-cluster grid configuration. If the TS7700 grid has only one cluster, there
is no cross-cluster data transfer through the Ethernet adapters.
The table displays data for cross-cluster data transfer performance (MBps) during the last 15
minutes. The table cells show the following items:
Cluster The cluster name
Outbound Access Data transfer rate for host operations that move data from the specified
cluster into one or more remote clusters
Inbound Access Data transfer rate for host operations that move data into the specified
cluster from one or more remote clusters
Copy Outbound Data transfer rate for copy operations that pull data out of the specified
cluster into one or more remote clusters
Copy Inbound Data transfer rate for copy operations that pull data into the specified
cluster from one or more remote clusters
Review this data with the performance evaluation tools on Techdocs to accomplish the
following tasks:
Identify upcoming performance problems due to grid link usage.
Identify the amount of transferred data to review your settings, such as DAA, SAA,
override policies, and Copy Consistency Points.
For the TS7700D or the TS7700T CP0, the aim is that all data is kept in cache. However, it is
possible to use the automatic removal function to allow data in a TS7700D or TS7700T CP0 to
be removed if a shortage of cache space would otherwise occur.
In the TS7740 and TS7700T CPx, the Storage Class with the storage preference determines whether
a logical volume is kept in cache (PG1) or is migrated as soon as possible (PG0).
In a TS7700T, the CP0 usage is reported in the H30TVC1, and CP1 is reported in H30TVC2
and so on.
There is no explicit usage of the cache capacity for each partition reported. The total TVC
usage is reported in TOTAL TVC_GB USED. Also, you find information regarding the following
data:
How many logical volumes are kept in the different preference classes, depending on the
models
How long logical volumes are kept in the different storage preferences (4 hours, 48 hours
and 35 days for TS7740 or TS7700T CPx)
How many logical volumes have been removed with autoremoval
This setting works on a distributed library level. It needs to be specified on each cluster. The
preferred keyword depends on your requirements. ENABLE is the best setting if it is likely that
the recalled logical volumes are used only once. With the setting DISABLE, the logical volume
stays in cache for further retrieval if the SC is defined as PG1 in the cluster that is used for the
I/O TVC.
With R3.1, an HCR command was introduced to control how unwanted "E" copies are treated. In
previous releases, these copies were deleted at mount/demount time if the copy was
inconsistent. If the copy was consistent, it was kept in the cluster.
With R3.2, this command was enhanced, and now enables you to determine not only how
unwanted copies are treated, but also when this command is run. The command has the
following options:
LI REQ,distributed library,SETTING,EXISTDEL,CRITERIA,[STALE|ALL|NONE]
STALE: The "E" copy is deleted if this copy is inconsistent. This is the default.
ALL: The "E" copy is deleted from the cluster if all other non-"E" copy mode sites are
consistent.
NONE: The "E" copy is never deleted.
The deletion of an “E” copy can be processed only if all clusters in the grid are available. That
is necessary because the status of all copies needs to be determined.
These commands are only available if all clusters are on R3.2 or higher.
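For example, the following command changes the deletion criteria on one cluster so that an "E" copy is removed only when all non-"E" copy mode sites hold a consistent copy. This is a minimal sketch: the distributed library name DTS7720 is only a placeholder (it is also used in the LI REQ examples later in this chapter), so substitute your own distributed library name:
LI REQ,DTS7720,SETTING,EXISTDEL,CRITERIA,ALL
You can review the active value afterward in the MI or by displaying the cluster settings with LI REQ,DTS7720,SETTING.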
This TVC bandwidth is shared between the host I/O (compressed), copy activity (from and to
other clusters), premigration activity, recalls for read, and remote writes from other clusters.
The TS7700 balances these tasks by using various thresholds and controls to prefer host I/O.
This section gives you information about how to determine whether your cache bandwidth is a
limitation, which actions use the cache bandwidth, and which tuning actions you can
consider.
The MiB/s Total Xfer is an average value. Notice that some peaks may be higher.
In this example, the maximum value is 497 MBps, but that is driven by the premigration task.
If that causes a performance issue, review the PMPRIOR and PMTHLVL settings for tuning.
As stated before, the workload creating data (Host I/O, Copy, or remote write) in the TS7700T
CP0 is not subject to throttling for these values.
There is no guideline about the values of PMPRIOR and PMTHLVL. The following aspects
need to be considered:
Installed number of FC for Cache enablement for a TS7740
Installed number of FC 5274 for premigration queue size for a TS7700T
Workload profile
Requirement regarding how long data should stay in cache unpremigrated
You need to determine the balance. If throttling occurs, it can be monitored with the MI or the
VEHSTATS reports. You should review the values periodically.
To adjust the parameters, use the Host Console Request command. If you try to define a
value that is not allowed, the TS7700 automatically uses an appropriate value instead. Details about these
settings are described in the IBM Virtualization Engine TS7700 Series z/OS Host Command
Line Request User’s Guide, which is available on the Techdocs website:
http://www.ibm.com/support/techdocs/atsmastr.nsf/Web/TechDocs
PMPRIOR setting
If the value of PMPRIOR is crossed, the TS7740/TS7700T starts the premigration. This might
decrease the resources available for other tasks in the TS7740/TS7700T and shorten the
peak throughput period of a workload window.
Raising this threshold increases the time during which the TS7740/TS7700T can run in peak
mode. However, the exposure is that more non-premigrated data resides in cache.
Having a low PMPRIOR causes data to be premigrated faster and avoids hitting the
premigration throttling threshold.
The default is 1600 GB, and the maximum is the value of PMTHLVL minus 500 GB.
PMTHLVL setting
After the PMTHLVL is crossed, the Host I/O, remote writes, and copies into a TS7740 and
TS7700T are throttled. In a TS7700T, only workload to the Tape partitions is subject to
throttling. If the data continues to increase after you hit the PMTHLVL, the amount of delay for
throttling will continue to increase.
Raising the threshold avoids the application of the throttles, and keeps host and copy
throughput higher. However, the exposure is that more non-premigrated data resides in cache.
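As an illustration only (not a recommendation), the following commands raise PMPRIOR and PMTHLVL on a cluster. The distributed library name DTS7720 and the values (in GB) are placeholders, and the keyword path SETTING,CACHE shown here is an assumption that should be verified against the z/OS Host Command Line Request User's Guide. Remember that PMPRIOR must stay at least 500 GB below PMTHLVL, and that on a TS7700T the recommendation is to set PMTHLVL equal to the installed FC 5274 premigration queue size:
LI REQ,DTS7720,SETTING,CACHE,PMPRIOR,2000
LI REQ,DTS7720,SETTING,CACHE,PMTHLVL,3000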
How to determine the Premigration Queue Size Feature Codes or Tape Cache
Enablement
The size of the installed feature codes can be determined in the following ways:
The MI, in the window “Feature Code License”
The VEHSTATS Report H30TVCx:
– TVC_SIZE: For a TS7740, the enabled tape cache (not the installed size) is displayed.
For a TS7700D or TS7700T the installed cache size is displayed.
– PARTITION SIZE: For a TS7740, the enabled tape cache size is displayed again; for the
TS7700T, the partition size is reported.
– PRE-MIG THROT VALUE: Shows the PMTHLVL value. In a TS7700T, the
recommendation is to set the PMTHLVL equal to the amount of FC 5274. If that
recommendation was used, this is the amount of FC installed.
Especially in a TS7720T, the cache bandwidth can also be a limit for the Host I/O.
To determine information about the Host I/O, you have multiple choices. For example, the
Host I/O is displayed in the performance graph, as shown in Figure 11-25.
The most important fields on this page give you information regarding your Host I/O
(throughput) behavior. The MAX THRPUT is the enabled Host Throughput limit with FC 5268.
The ATTMPT THRPUT is the amount of MBps the host wanted to deliver to the TS7700. In
combination with the next three values, you can determine whether the installed FC 5268 is
sufficient, or if an upgrade would provide a better performance. The DELAY MAX, DELAY
AVG, and PCT of 15 Sec Intervals fields tell you how much delay is applied to each second
during the 15-second sample interval. VEHSTATS reports the delay in thousandths of a
second (that is, in milliseconds). Statistics records are created every 15 minutes
(900 seconds), so 60 of the 15-second intervals are used to report the 15-minute interval
values. In particular, the PCT of 15 SEC INTVLS field tells you how often a delay occurred.
This calculation is only an indicator. If you want to enhance the number of host I/O
increments, talk to your IBM representative for sizing help.
The previous 4-Gb FICON system had a maximum of 10 throughput increments; any data
rate above 1 gigabyte per second (GBps) was given for free. With the new 8-Gb cards (or
after an upgrade occurs), the limit is now 25 throughput increments (2.5 GBps).
If a cluster was able to achieve speeds faster than 1 GBps with 4-Gb FICON cards and 10
throughput increments in the past, that is no longer true: the TS7700 limits it to exactly
1 GBps if 8-Gb FICON cards are installed and the same 10 throughput increments are
licensed. Thus, consider purchasing enough throughput increments (up to 25) to allow the
TS7700 cluster to run at unthrottled speeds.
See Chapter 7, “Hardware configurations and upgrade considerations” on page 243 for more
details.
Figure 11-28 shows the installed increments (Feature Code (FC) 5268). In this example, four
increments are installed. The throughput is limited to 400 megabytes per second (MBps)
because of the number of installed 100-MBps increments.
If the Host I/O is limited because the cache bandwidth is on the limit, see 11.6.3, “Tuning
Cache bandwidth: Premigration” on page 648.
The grid link and replication performance depends on the following aspects:
Installed grid link hardware
Sufficient bandwidth and quality of the provided network
Chosen replication mode
Defined amount of concurrent copy tasks
Number of remote write/read operations
Remember that cache bandwidth is always an influencing factor, but it was already described
in the previous sections, so it is not repeated here.
11.8.1 Installed grid link hardware: Mixing of different Grid link adapters
It is not supported to have different grid link adapter types in one single cluster. However, in a
grid, there can be a situation in which some clusters are connected to the grid link with 10 GB
adapters, and other clusters are connected with 1 GB adapters. That is especially true for
migration or upgrade scenarios.
In the TS7700 grid, there is a 1:1 relationship between the primary adapters and between the
secondary adapters of the clusters. For that reason, in a mixed environment of 2 x 10-Gb
and 4 x 1-Gb adapters, the clusters with the 4 x 1-Gb links cannot use the full speed of the
installed grid link adapters.
Remember that 4 x 10 Gb can be installed only in a VEC. A VEB/C07 with R4.0 cannot be
upgraded to use 4 x 10 Gb.
The TS7700 uses the TCP/IP protocol for moving data between each cluster. In addition to
the bandwidth, other key factors affect the throughput that the TS7700 can achieve. The
following factors directly affect performance:
Latency between the TS7700s
Network efficiency (packet loss, packet sequencing, and bit error rates)
Network switch capabilities
Flow control to pace the data from the TS7700s
Inter-switch link (ISL) capabilities, such as flow control, buffering, and performance
The TS7700s attempt to drive the network links at the full 1-Gb rate for the two or four 1-Gbps
links, or at the highest possible load at the two 10-Gbps links, which might be much higher
than the network infrastructure is able to handle. The TS7700 supports the IP flow control
frames to have the network pace the rate at which the TS7700 attempts to drive the network.
The best performance is achieved when the TS7700 is able to match the capabilities of the
underlying network, resulting in fewer dropped packets.
To maximize network throughput, you must ensure the following items regarding the
underlying network:
The underlying network must have sufficient bandwidth to account for all network traffic
that is expected to be driven through the system. Eliminate network contention.
The underlying network must be able to support flow control between the TS7700s and
the switches, allowing the switch to pace the TS7700 to the wide-area LANs (WANs)
capability.
Flow control between the switches is also a potential factor to ensure that the switches are
able to pace with each other’s rate.
Be sure that the performance of the switch can handle the data rates that are expected
from all of the network traffic.
Latency between the sites is the primary factor. However, packet loss, because of bit error
rates or because the network is not capable of the maximum capacity of the links, causes
TCP to resend data, which multiplies the effect of the latency.
In addition, synchronous mode copy does not adhere to the same rules as the other copy
modes, as shown in Table 11-1. For example, the data direction differs: with synchronous
mode copy, data is pushed from the primary cluster, whereas with the other copy modes, data
is pulled from the secondary cluster.
Figure 11-29 Virtual Tape Drive view for synchronous mode copy
In the following figure, you can see that in total 44 scratch mounts (FAST NUM MNTS) were
made. In addition, you see the same number in the SYNC NUM MNTS field, which means that
the same number of mounts to a remote cluster was run.
There is no further information (such as the number of concurrent sync copies, inconsistency
at interval, and so on) available for the synchronous mode, because none of it is applicable.
Only if the synchronous mode copy could not be processed, and sync-deferred was activated,
are reports written. However, then these copies are reported with “DEFERRED” and there is
no possibility for a further drill down.
To understand whether RUN copies were queued for processing in that interval, look at
AV_RUN_QUEAGE --MINUTES--. Numbers in this field mean that RUN copies could not be
processed accordingly. Having RUN logical volumes waiting also means that the job cannot
proceed further. If the report shows multiple indications of this behavior, take a closer look at
the number of concurrent copy activities and the grid link usage.
You might want to consider increasing the number of concurrent RUN tasks. Also, check
whether all receiving clusters were available, or whether a cluster went to service or had an
outage during that interval.
Figure 11-31 H33Grid report to see copy behavior for RUN and Deferred copies
To understand whether that throttling applies and is the reason for the increase, look in the
H30TVCx report. The H30TVC1 report contains the information for CP0, and the H30TVC2-8
reports contain the CP1 - CP7 information.
The deferred copy throttling is identical for all partitions, so it is sufficient to look into one of
the H30TVCx reports. As shown in Figure 11-32, the H30TVC report contains detailed
information about the throttling that occurred.
Looking at Figure 11-32 on page 659, you find one interval where deferred copy throttling
occurred in all 120 samples, which results in the maximum penalty of 0.125 seconds for each
copy operation. In the report shown, there was no interval without any deferred copy
throttling, but in some intervals only limited throttling was measured.
Be aware that, depending on your network, a throttling value higher than 20 ms normally
results in little or no deferred copy activity. For more information about how to influence the
deferred copy throttling, see 11.8.5, "Tuning possibilities for copies: Deferred Copy Throttling" on page 661.
Values can be set for the number of concurrent RUN copy threads and the number of
Deferred copy threads. The allowed values for the copy thread count are 5 - 128. The default
value is 20 for clusters with two 1-Gbps Ethernet links, and 40 for clusters with four 1-Gbps or
two 10-Gbps Ethernet links. Use the following parameters with the LIBRARY command:
SETTING, COPYCNT, RUN
SETTING, COPYCNT, DEF
In this case, an increase of the copy count might reduce the RPO.
Be aware that usually one grid link with 1 Gbps can be saturated by 10 copies running in
parallel. If the logical volumes are small, you might see gaps in the grid link usage when only
a few copies are running, because some volumes finished and new logical volumes were
selected for copy processing. In this situation, it might be beneficial to have more copies
running concurrently.
Take into account that if too many copies are running concurrently, you overflow the grid
link. That can result in packet loss and retries, and can lead overall to lower performance
of the grid link environment.
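The copy thread counts can be adjusted with the LIBRARY REQUEST command keywords that were introduced above. A minimal sketch, again assuming the placeholder distributed library name DTS7720 and values that must stay within the allowed range of 5 - 128:
LI REQ,DTS7720,SETTING,COPYCNT,RUN,40
LI REQ,DTS7720,SETTING,COPYCNT,DEF,60
After such a change, monitor the grid link usage and the copy queues, because too many concurrent copies can saturate the grid links, as described above.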
The performance of the grid links is also affected by the latency time of the connection. The
latency has a significant influence on the maximum grid throughput. For example, with a
one-way latency of 20 - 25 milliseconds (ms) on a 2 x 1 Gb grid link with 20 copy tasks on the
receiving cluster, the maximum grid bandwidth is approximately 140 MBps. Increasing the
number of copy tasks on the receiving cluster increases the grid bandwidth closer to
200 MBps.
The default DCT is 125 ms. The effect on host throughput as the DCT is lowered is not linear.
Field experience shows that the knee of the curve is at approximately 30 ms. As the DCT
value is lowered toward 30 ms, the host throughput is affected somewhat and deferred copy
performance improves somewhat. At and below 30 ms, the host throughput is affected more
significantly, as is the deferred copy performance.
If the DCT needs to be adjusted from the default value, try an initial DCT value of 30 - 40 ms.
Favor the value toward 30 ms if the client is more concerned with deferred copy performance,
or toward 40 ms if the client is concerned about sacrificing host throughput.
After you adjust the DCT, monitor the host throughput and Deferred Copy Queue to see
whether the wanted balance of host throughput and deferred copy performance is achieved.
Lowering the DCT improves deferred copy performance at the expense of host throughput.
A DCT of 0 eliminates the penalty completely, and deferred copies are treated equally with
host I/O. Depending on your RPO requirements, that can also be a feasible setting.
The DCTAVGTD – DCT 20-Minute Average Threshold looks at the 20-minute average of the
compressed host read and write rate. The threshold defaults to 100 MBps. The Cache Write
Rate – Compressed writes to disk cache includes host write, recall write, grid copy-in write,
and cross-cluster write to this cluster. The threshold is fixed at 150 MBps. Cluster Utilization
looks at both the CPU usage and the disk cache usage. The threshold is when either one is
85% busy or more.
The preceding algorithm was added in R2.0. The reason to introduce the cache write rate
at R2.0 was due to the increased CPU power on the IBM POWER7 processor. The CPU
usage is often below 85% during peak host I/O periods.
Before R2.0, the cache write rate was not considered. Use the following parameters with the
LIBRARY command to modify the DCT value and the DCTAVGTD:
SETTING, THROTTLE, DCOPYT
SETTING, THROTTLE, DCTAVGTD
Note: The recommendation is to not change DCTAVGTD; instead, use only DCOPYT for
tuning. Changing DCTAVGTD might not reduce the throttling as expected.
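For example, to lower the DCT from the default of 125 ms to the knee of the curve at 30 ms on one cluster, a command of the following form can be used. DTS7720 is a placeholder distributed library name, and the value is assumed to be specified in milliseconds (verify against the Host Command Line Request User's Guide):
LI REQ,DTS7720,SETTING,THROTTLE,DCOPYT,30
A value of 0 removes the deferred copy throttle completely, as described previously.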
Figure 11-34 on page 663 shows the compressed host I/O dropping below the 100 MBps
threshold. As a result, the rate of deferred copies to other clusters is increased substantially.
In Figure 11-34, you see the effect when DCT is “turned off” because the host throughput
drops under 100 MBps (green bar). The number of deferred copy writes in MBps increases
(light blue bar).
The grid link degraded threshold also includes two other values that can be set by the SSR:
Number of degraded iterations: The number of consecutive 5-minute intervals that link
degradation was detected before reporting an attention message. The default value is 9.
Generate Call Home iterations: The number of consecutive 5-minute intervals that link
degradation was detected before generating a Call Home. The default value is 12.
The default values are set to 60% for the threshold, nine iterations before an attention
message is generated, and 12 iterations before a Call Home is generated. Use the default
values unless you are receiving intermittent warnings and support indicates that the values
need to be changed. If you receive intermittent warnings, let the SSR change the threshold
and iteration to the suggested values from support.
For example, suppose that clusters in a two-cluster grid are 2000 miles apart with a round-trip
latency of approximately 45 ms. The normal variation that is seen is 20 - 40%. In this
example, the threshold value is at 25% and the iterations are set to 12 and 15.
If there are insufficient back-end drives, the performance of the TS7740/TS7700T diminishes.
In a TS7740, there was a direct dependency between Host I/O Increments throughput and
the number of backend drives. This was understandable, because the TS7740 had a limited
cache and all data that was written to the cache also needed to be premigrated to backend
cartridges.
As a guideline for the TS7740, use the ranges of back-end drives that are listed in Table 11-2
based on the host throughput that is configured for the TS7740. The lower number of drives in
the ranges is for scenarios that have few recalls. The upper number is for scenarios that have
numerous recalls. Remember, these are guidelines, not rules.
Host throughput (MBps)        Back-end drives
Up to 400                     4 - 6
800 - 1200                    8 - 12
1200 - 1600                   10 - 16
1600 or higher                16
So, if the host throughput increments cannot be used for guidance anymore, the question is
how to determine the number of back-end drives that are needed.
As described previously, there is no overall rule. Here are some general statements:
The more physical backend pools you use, the more physical drives you need. Each
physical pool is treated independently (premigration, reclaim).
Depending on the used physical cartridge type and the reclaim value, the amount of still
valid data can be high. Therefore, a reclaim has to copy a high amount of data from one
physical tape to another tape. This uses two drives, and the more data that has to be
transferred, the longer these drives will be occupied. Reclaim is usually a low priority task,
but if not enough reclaims can be run, the number of necessary tapes increases.
Tape drives are also needed for Copy Export and Secure data overwrite.
Data expires in cache without premigration and data in CP0 does not need tape drives.
Low hit ratio requires more recalls from the backend.
Installing the correct number of back-end drives is important, along with the drives being
available for use. Available means that they are operational and might be idle or in use. The
Host Console Request function can be used to set up warning messages for when the
number of available drives drops. Setting the Available Physical Drive Low and High warning
levels is discussed in detail in the IBM Virtualization Engine TS7700 Series z/OS Host
Command Line Request User’s Guide, which is available on Techdocs. Use these keywords:
SETTING, ALERT, PDRVLOW
SETTING, ALERT, PDRVCRIT
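A minimal sketch of setting these alerts is shown below, assuming the placeholder distributed library name DTS7720 and alert thresholds of eight (low warning) and five (critical) available physical drives; these values are examples only and should fit your installed drive count:
LI REQ,DTS7720,SETTING,ALERT,PDRVLOW,8
LI REQ,DTS7720,SETTING,ALERT,PDRVCRIT,5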
The following fields are the most important fields in this report:
PHYSICAL_DRIVE_MOUNTED_AVG: If this value is equal or close to the maximum
drives available during several hours, this might mean that more physical tape drives
are required.
MOUNT_FOR (RCL MIG RCM SDE): This field presents the reason for each physical
mount. If the percentage value in the Recall (RCL) column is high compared to the total
number of mounts, this might indicate a need to evaluate the cache size or cache
management policies. However, this is not a fixed rule and further analysis is required. For
example, if HSM migration is into a TS7740, you might see high recall activity during the
morning, which can be driven by temporary development or user activity. This is normal
and not a problem in itself.
Be aware that the number of mounted tape drives might be misleading. The report does not
recognize the IDLE state. Idle means that the tape drive is mounted, but not in use. Therefore,
you might see a maximum, or even an average, usage that is equal to the installed drives.
That might or might not be a performance issue. To identify whether it really is a bottleneck,
it is necessary to take a closer look at the overall situation.
To do so, first review which mounts are run. If reclaims were still being processed, there was
probably no performance issue (except if a panic reclaim occurred).
The H32GUPxx (General Pool Use) report is shown in Example 11-2. A single report always
shows two pools. In this example, the report shows Pool 01 and Pool 02. You can see the
following details per pool for each recorded time frame:
The number of active logical volumes
The amount of active data in GB
The amount of data written in MB
The amount of data read in MB
The current reclamation threshold and target pool
Check the ODERV12 statements from the BVIR jobs to select which types of cartridges are
used in your environment. Only four different types of media can be reported at the same
time.
LI REQ,DTS7720,PDRIVE
CBR1020I Processing LIBRARY command: REQ,DTS7720,PDRIVE.
CBR1280I Library DTS7720 request. 523
Keywords: PDRIVE
-----------------------------------------------------------------
PHYSICAL DRIVES V2 .1
SERIAL NUM TYPE MODE AVAIL ROLE POOL PVOL LVOL
0000078D1224 3592E07 Y IDLE 00
0000078D0BAA 3592E07 Y IDLE 00
0000078DBC65 3592E08 Y IDLE 00
0000078DBC95 3592E08 E08 Y MIGR 01 R00011 A51571
0000078DBC5C 3592E08 E08 Y MIGR 01 R00001 A66434
0000078DBC87 3592E08 E08 Y MIGR 01 R00002 A66462
Figure 11-35 LI REQ PDRIVE command example
Reclaim operations
Reclaim operations use two drives per reclaim task. Reclaim operations also use CPU MIPS,
but they do not use any cache bandwidth resources, because the data is copied directly from
physical tape to physical tape. If needed, the TS7740/TS7700T can allocate pairs of
idle drives for reclaim operations, making sure to leave one drive available for recall.
Reclaim operations affect host performance, especially during peak workload periods. Tune
your reclaim tasks by using both the reclaim threshold and Inhibit Reclaim schedule.
Table 11-3 shows the reclaim threshold and the amount of data that must be moved,
depending on the stacked tape capacity and the reclaim percentage. When the threshold is
reduced from 40% to 20%, only half of the data needs to be reclaimed. This change cuts the
time and resources that are needed for reclaim in half. However, it raises the needed number
of backend cartridges and slots in the library.
Stacked tape capacity     10%        20%        30%        40%
300 GB                    30 GB      60 GB      90 GB      120 GB
With the Host Library Request command, you can limit the number of reclaim tasks in the
TS7740/TS7700T. The second keyword RECLAIM can be used along with the third keyword of
RCLMMAX. This expansion applies only to the TS7740/TS7700T. Also, the Inhibit Reclaim
schedule is still acknowledged.
Available back-end drives     Maximum number of reclaim tasks
3                             1
4                             1
5                             1
6                             2
7                             2
8                             3
9                             3
10                            4
11                            4
12                            5
13                            5
14                            6
15                            6
16                            7
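Based on the RECLAIM and RCLMMAX keywords described above, the number of concurrent reclaim tasks can be capped with a command of the following form. This is a sketch only: DTS7720 is a placeholder distributed library name, and the value is the maximum number of reclaim tasks to allow:
LI REQ,DTS7720,RECLAIM,RCLMMAX,2
The Inhibit Reclaim schedule is still acknowledged in addition to this limit.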
If this ramping up is causing too many back-end drives to be used for premigration tasks, you
can limit the number of premigration drives in the Pool Properties window. For a V06, the
maximum number of premigration drives per pool must not exceed 6. Extra drives do not
increase the copy rate to the drives. For a V07, TS7720T, or TS7760T, premigration can
benefit from having 8 - 10 drives available for premigration; the default value is 10. There is no
benefit to having more than 10 drives running premigration.
The limit setting is in the TS7740/TS7700T MI. For Copy Export pools, set the maximum
number of premigration drives. If you are exporting a small amount of data each day (one or
two cartridges’ worth of data), limit the premigration drives to two. If more data is being
exported, set the maximum to four. This setting limits the number of partially filled export
volumes.
To monitor your current situation, and also the trend of cartridge usage, you have several
possibilities.
Review your Active Data Distribution. A low utilization percentage results in a higher number
of stacked volumes. Also, ensure that you monitor the number of empty stacked volumes to
avoid an “out of stacked volumes” condition. If you have defined multiple physical pools, you
might need to check this on a per pool basis, depending on your Borrow/Return policies. In
this example, Pool 3 has the borrow,return parameter enabled.
Remember that it is not sufficient to check only the scratch cartridges in the common scratch
pool (CSP). In addition, you need to check that all pools can borrow from the CSP and return
the empty cartridges to the CSP. If a pool is set to no borrow, you need to ensure that enough
empty cartridges are always inside this pool. This number is reflected in the H32GUPxx reports.
Keep in mind that in a heterogeneous environment, back-level cartridges (JA/JB) can be used
only for read and not for write purposes.
Reclaim value
As explained in the back-end cartridge section, you can change the reclaim value to gain more
empty cartridges. The lower the reclaim percentage, the more cartridges are needed. The
higher this value is, the more valid data needs to be transferred and the more physical
tape drives are required.
To find a good balance, review the active data distribution. Sometimes, it is sufficient to
change to a slightly higher reclaim value.
In addition, the pool properties should be reviewed. No borrow/Keep has a negative influence.
Keep in mind that a short delete expire value might reduce your cartridge usage, but a short
value does not enable you to rescue any unintentionally deleted data. We suggest not using a
value below 5 days. A best practice is to use 7 days.
The reasons for using DCT and how to tune the DCT have already been explained in 11.8.5,
“Tuning possibilities for copies: Deferred Copy Throttling” on page 661.
In addition, you can see the throttling on the performance graph, as explained in Figure 11-17
on page 640.
If a value of x03 is shown in the H30TVC, that means reasons X01 and X02 are applicable at
the same time.
To understand whether the throttling is a real performance issue, analyze how many samples
of the interval experienced throttling, and the relative throttling impact value (%RLTV IMPAC
VALUE). Even if throttling occurred, it might be only in a few samples during the interval,
which means that the real impact on the writes might or might not influence the overall
production run time.
Some of these tuning actions are parameter changes (PMTHLVL adjustments, ICOPYT),
whereas other issues can be solved only by providing higher bandwidth or more resources
(cache, drives). Although adjusting a parameter might be beneficial for a specific situation, it
might have an impact on other behaviors in the grid. Therefore, we recommend discussing
such tuning measures with your IBM representative up front.
The reason for this approach is to avoid triggering the 45-minute missing interrupt handler
(MIH) on the host. When a copy is changed to immediate-deferred, the RUN task is
completed, and the immediate copy becomes a high priority deferred copy. See
“Immediate-copy set to immediate-deferred state” on page 694 for more information.
You might decide to turn off host write throttling because of immediate copies taking too long
(if having the immediate copies take longer is acceptable). However, avoid the 40-minute limit
where the immediate copies are changed to immediate-deferred.
In grids where a large portion of the copies is immediate, better overall performance has been
seen when the host write throttle because of immediate copies is turned off. You are trading
off host I/O for length of time that is required to complete an immediate copy. The enabling
and disabling of the host write throttle because of immediate copies is discussed in detail in
the IBM Virtualization Engine TS7700 Series z/OS Host Command Line Request User’s
Guide, which is available on Techdocs. Use the keywords SETTING, THROTTLE, ICOPYT:
http://www.ibm.com/support/techdocs/atsmastr.nsf/Web/TechDocs
Some of these settings change the cluster behavior, whereas others influence the grid
allocation behavior (for example, SAA, DAA, LOWRANK).
Although these settings can be modified by using z/OS or the MI, we suggest that you first
check the parameter settings in case of any issue, and determine whether they have been
modified. Remember that most of the settings are persistent, so they are still active after a
service action or a power-off of the cluster.
Important: Library commands change the behavior of the whole cluster. If a cluster is
attached to multiple LPARs from the same client, or to a multi-tenant environment, the
change that is run from one LPAR influences all attached LPARs.
If you have a shared TS7700, consider restricting the usage of the Library command.
This window provides a summary of the number of outstanding updates for each cluster in an
IBM TS7700 Grid. You can also use this window to monitor the progress of pending
immediate-deferred copies, which, like pending updates, normally result from changes that
are made while a cluster is Offline, in service prep mode, or in service mode.
With Release 3.2, the download section also includes the tapes with a hot token. Hot tokens
are volumes that were changed during an unavailability of the cluster and now need
reconciliation. The reconciliation is run when the cluster is brought back online.
There are performance tools available on Techdocs that take 24 hours of 15-minute
VEHSTATS data, seven days of 1-hour VEHSTATS data, or 90 days of daily summary data
and create a set of charts for you.
The following material does not contain any information about the specifics of a Tape Attach
model. However, the other information is still valuable and has not changed, especially how
to create the Excel spreadsheet and the charts.
See the following Techdocs site for the performance tools, and the class replay for detailed
information about how to use the performance tools:
Tools:
http://www.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/PRS4717
Class replay:
http://www.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/PRS4872
The 24-hour, 15-minute data spreadsheets include the cache throughput chart. The cache
throughput chart has two major components: The uncompressed host I/O line and a stacked
bar chart that shows the cache throughput.
The cache throughput chart includes the following components (all values are in MiB/s):
Compressed host write: This is the MiB/s of the data that is written to cache. This bar is
hunter green.
Compressed host read: This is the MiB/s of the data read from cache. This bar is lime
green.
Data that is copied out from this cluster to other clusters: This is the rate at which copies of
data to other clusters are made. This cluster is the source of the data and includes copies
to all other clusters in the grid. The DCT value that is applied by this cluster applies to this
data.
This tool contains spreadsheets, data collection requirements, and a 90-day trending
evaluation guide to assist you in the evaluation of the TS7700 performance. Spreadsheets for
a 90-day, 1-week, and 24-hour evaluation are provided.
One 90-day evaluation spreadsheet can be used for one-cluster, two-cluster, three-cluster, or
four-cluster grids and the other evaluation spreadsheet can be used for five-cluster and
six-cluster grids. There is an accompanying data collection guide for each. The first
worksheet in each spreadsheet has instructions for populating the data into the spreadsheet.
A guide to help with the interpretation of the 90-day trends is also included.
These spreadsheets are intended for experienced TS7700 users. Detailed knowledge of the
TS7700 is expected, along with familiarity with using spreadsheets.
The TS7700 converts the format and storage conventions of a tape volume into a standard
file that is managed by a file system within the subsystem. With BVIR, you are able to obtain
information about all of the logical volumes that are managed by a TS7700.
For more information, see the IBM TS7700 Series Bulk Volume Information Retrieval
Function User’s Guide at the following URL:
http://www.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/WP101094
Remember: Some information that is obtained through this function is specific to the
cluster on which the logical volume is written, for example, cache contents or a
logical-physical volume map. In a TS7700 grid configuration with multiple clusters, use
an MC for the volume to obtain statistics for a specific cluster. Historical statistics for a
multi-cluster grid can be obtained from any of the clusters.
2. The request volume is again mounted, this time as a specific mount. Seeing that the
volume was primed for a data request, the TS7700 appends the requested information to
the data set. The process of obtaining the information and creating the records to append
can take up to several minutes, depending on the request and, from a host’s viewpoint, is
part of the mount processing time.
After the TS7700 has completed appending to the data set, the host is notified that the
mount is complete. The requested data can then be accessed like any other tape data set.
In a job entry subsystem 2 (JES2) environment, the job control language (JCL) to
complete the two steps can be combined into a single job. However, in a JES3
environment, they must be run in separate jobs because the volume will not be unmounted
and remounted between job steps in a JES3 environment.
Note: Due to the two-step approach, BVIR volumes cannot be written with LWORM
specifications. You need to assign a Data Class without LWORM for BVIR volumes.
The BVIR process flow is as follows: mount a logical scratch volume and write the BVIR
request data; rewind/unload the BVIR volume; mount the BVIR volume again as a specific
mount, during which the requested data is appended to the logical volume (processing
occurs during mount time); read the BVIR data from the logical volume; and demount the
BVIR volume, either keeping it or returning it to scratch.
The building of the response information requires a small amount of resources from the
TS7700. Do not use the BVIR function to "poll" for a specific set of information, and issue only
one request at a time. Certain requests, for example, the volume map, might take several
minutes to complete.
To prevent “locking” out another request during that time, the TS7700 is designed to handle
two concurrent requests. If more than two concurrent requests are sent, they are processed
as previous requests are completed.
The general format for the request/response data set is shown in Example 11-4.
Example 11-4 BVIR output format
123456789012345678901234567890123456789012345
VTS BULK VOLUME DATA REQUEST
VOLUME MAP
11/20/2008 12:27:00 VERSION 02
S/N: 0F16F LIB ID: DA01A
Clarification: When records are listed in this chapter, there is an initial record showing
“1234567890123...” This record does not exist, but it is provided to improve readability.
Record 0 is identical for all requests and is not part of the output; it is shown only to support
reading records 1 - 5. Records 6 and higher contain the requested output, which differs
depending on the request:
Records 1 and 2 contain the data request commands.
Record 3 contains the date and time when the report was created, and the version
of BVIR.
Record 4 contains both the hardware serial number and the distributed library ID of the
TS7700.
Record 5 contains all blanks.
Records 6 - N and higher contain the requested data. The information is described in general
terms. Detailed information about these records is in the IBM TS7700 Series Bulk Volume
Information Retrieval Function User’s Guide at the following URL:
http://www.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/WP101094
11.14.2 Prerequisites
Any logical volume that is defined to a TS7700 can be used as the request/response volume.
Logical volumes in a TS7700 are formatted as IBM standard-labeled volumes. Although a
user can reformat a logical volume with an ANSI standard label or as an unlabeled tape
volume, those formats are not supported for use as a request/response volume. There are no
restrictions regarding the prior use of a volume that is used as a request/response volume,
and no restrictions regarding its subsequent use for any other application.
Use normal scratch allocation methods for each request (that is, use the DISP=(NEW,CATLG)
parameter). In this way, any of the available scratch logical volumes in the TS7700 can be
used. Likewise, when the response volume’s data is no longer needed, the logical volume
must be returned to SCRATCH status through the normal methods (typically by deletion of
the data set on the volume and a return-to-scratch policy based on data set deletion).
Although the request data format uses fixed records, not all response records are fixed. For
the point-in-time and historical statistics responses, the data records are of variable length
and the record format that is used to read them is the Undefined (U) format. See Appendix E,
“Sample job control language” on page 853 for more information.
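The following JCL is a minimal sketch of the two-step BVIR flow in a JES2 environment (in a JES3 environment, run the two steps as separate jobs, as noted earlier). The job card, the data set name HLQ.BVIR.REQUEST, and the esoteric unit name VTS7700 are placeholders for your own naming and SMS routing; the request asks for the volume map, and the read step assumes the 80-byte FB response format of that report. See Appendix E, "Sample job control language" on page 853 and the BVIR User's Guide for the official samples.
//BVIRJOB  JOB (ACCT),'BVIR VOLUME MAP',CLASS=A,MSGCLASS=X
//* Step 1: Allocate a scratch logical volume in the TS7700 and
//*         write the two BVIR request records (FB, LRECL=80).
//WRITEREQ EXEC PGM=IEBGENER
//SYSPRINT DD SYSOUT=*
//SYSIN    DD DUMMY
//SYSUT2   DD DSN=HLQ.BVIR.REQUEST,DISP=(NEW,CATLG),
//            UNIT=VTS7700,LABEL=(1,SL),
//            DCB=(RECFM=FB,LRECL=80,BLKSIZE=80)
//SYSUT1   DD *
VTS BULK VOLUME DATA REQUEST
VOLUME MAP
/*
//* Step 2: Mount the same volume again (specific mount). The TS7700
//*         appends the response records during mount processing.
//READRESP EXEC PGM=IEBGENER
//SYSPRINT DD SYSOUT=*
//SYSIN    DD DUMMY
//SYSUT1   DD DSN=HLQ.BVIR.REQUEST,DISP=OLD
//SYSUT2   DD SYSOUT=*
When the response is no longer needed, delete the data set so that the logical volume returns to scratch through your normal return-to-scratch policy.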
In a multi-site TS7700 grid configuration, the request volume must be created on the cluster
for which the data is being requested. The MC assigned to the volume needs to specify the
particular cluster that is to have the copy of the request volume.
The format for the request data set records is listed in the following sections.
Record 1
Record 1 must contain the command exactly as shown in Example 11-5, that is, the request
identifier VTS BULK VOLUME DATA REQUEST. The format of the request data set records is
shown in Table 11-5.
Table 11-5 BVIR request record 1
Record 1: Request identifier
Record 2
With Record 2, you can specify which data you want to obtain. The following options are
available:
VOLUME STATUS zzzzzz
CACHE CONTENTS
VOLUME MAP
POINT IN TIME STATISTICS
HISTORICAL STATISTICS FOR xxx
HISTORICAL STATISTICS FOR xxx-yyy
PHYSICAL MEDIA POOLS
PHYSICAL VOLUME STATUS VOLUME zzzzzz
PHYSICAL VOLUME STATUS POOL xx
COPY AUDIT COPYMODE INCLUDE/EXCLUDE libids
For the Volume Status and Physical Volume Status Volume requests, zzzzzz specifies the
volume serial number mask to be used. By using the mask, one to thousands of volume
records can be retrieved for the request. The mask must be six characters in length, with the
underscore character ( _ ) representing a positional wildcard mask.
For example, assuming that volumes in the range ABC000 - ABC999 have been defined to
the cluster, a request of VOLUME STATUS ABC1_0 returns database records that exist for
ABC100, ABC110, ABC120, ABC130, ABC140, ABC150, ABC160, ABC170, ABC180, and
ABC190.
For the Historical Statistics request, xxx specifies the Julian day that is being requested.
Optionally, -yyy can also be specified and indicates that historical statistics from xxx through
yyy are being requested. Valid days are 001 - 366 (to account for leap year). For leap years,
February 29 is Julian day 060 and December 31 is Julian day 366. For other years, Julian day
060 is March 1, and December 31 is Julian day 365. If historical statistics do not exist for the
day or days that are requested, that will be indicated in the response record.
This can occur if a request is made for a day before the day the system was installed, day or
days the system was powered off, or after the current day before a rolling year has been
accumulated. If a request spans the end of the year, for example, a request that specified as
HISTORICAL STATISTICS FOR 364-002, responses are provided for days 364, 365, 366, 001,
and 002, regardless of whether the year was a leap year.
For Copy Audit, INCLUDE or EXCLUDE is specified to indicate which TS7700’s clusters in a
grid configuration are to be included or excluded from the audit. COPYMODE is an option for
taking a volume’s copy mode for a cluster into consideration. If COPYMODE is specified, a
single space must separate it from INCLUDE or EXCLUDE.
The libid parameter specifies the library sequence numbers of the distributed libraries that
are associated with each of the TS7700 clusters either to include or exclude in the audit. The
parameters are separated by a comma. At least one libid parameter must be specified.
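For illustration, a Copy Audit request (record 2) that audits only the clusters with distributed library IDs DA01A (taken from Example 11-4) and DA01B (a hypothetical second cluster), and that takes the copy mode into account, could look like the following line; verify the exact spacing rules against the BVIR User's Guide:
COPY AUDIT COPYMODE INCLUDE DA01A,DA01B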
For the Physical Volume Status Pool request, xx specifies the pool for which the data is to be
returned. If there are no physical volumes that are assigned to the specified pool, that is
indicated in the response record. Data can be requested for pools 0 - 32.
For point-in-time and historical statistics requests, any additional characters that are provided
in the request record past the request itself are retained in the response data, but otherwise
ignored. In a TS7700 grid configuration, the request volume must be valid only on the specific
cluster from which the data is to be obtained.
Human-readable appended records can vary in length, depending on the reports that are
requested and can be 80 - 640 bytes. Binary data appended records can be variable in length
of up to 24,000 bytes. The data set is now a response data set. The appropriate block counts
in the end of file (EOF) records are updated to reflect the total number of records written to
the volume.
These records contain the specific response records based on the request. If the request
cannot be understood or was invalid, that is indicated. The record length is fixed; the record
length of each response data is listed in Table 11-7, for example:
Response data          Record length (bytes)
CACHE CONTENTS         80
VOLUME MAP             80
After appending the records and updating the EOF records, the host that requested the
mount is signaled that the mount is complete and can read the contents of the volume. If the
contents of the request volume are not valid, one or more error description records are
appended to the data set or the data set is unmodified before signaling the host that the
mount completed, depending on the problem encountered.
All human-readable response records begin in the first character position of the record and
are padded with blank characters on the right to complete the record. All binary records are
variable in length and are not padded.
The response data set contains the request records that are described in 11.14.3, "Request
data format" on page 681, three explanatory records (Records 3 - 5), and, starting with
Record 6, the actual response to the data request.
The detailed description of the record formats of the response record is in the following white
papers:
IBM TS7700 Series Bulk Volume Information Retrieval Function User’s Guide
http://www.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/WP101094
IBM TS7700 Series Statistical Data Format White Paper
http://www.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/WP100829
Clarification: When records are listed in this chapter, an initial record shows
“1234567890123...”. This record does not exist, but is provided to improve readability.
The volume status information that is returned represents the status of the volume on the
cluster the requested volume was written to. In a TS7700 grid configuration, separate
requests must be sent to each cluster to obtain the volume status information for the
individual clusters. Using the volume serial number mask that is specified in the request, a
response record is written for each matching logical volume that exists in the cluster.
A response record consists of the database fields that are defined as described in the white
paper. Fields are presented in the order that is defined in the table and are comma-separated.
The overall length of each record is 640 bytes with blank padding after the last field, as
needed. The first few fields of the record that is returned for VOLSER ABC123 are shown in
Example 11-6.
The response records are written in 80-byte fixed block (FB) format.
Remember:
The generation of the response might take several minutes to complete depending on
the number of volumes in the cache and how busy the TS7700 cluster is at the time of
the request.
The contents of the cache typically are all private volumes. However, some might have
been returned to SCRATCH status soon after being written. The TS7700 does not filter
the cache contents based on the private or SCRATCH status of a volume.
Even with inconsistencies, the mapping data is useful if you want to design jobs that recall
data efficiently from physical volumes. If the logical volumes that are reported on a physical
volume are recalled together, the efficiency of the recalls is increased. If a logical volume with
an inconsistent mapping relationship is recalled, it recalls correctly, but an extra mount of a
separate physical volume might be required.
The physical volume to logical volume mapping that is associated with the physical volumes
managed by the specific cluster to which the request volume is written is returned in the
response records. In a TS7700 grid configuration, separate requests must be sent to each
cluster to obtain the mapping for all physical volumes.
Tip: The generation of the response can take several minutes to complete depending on
the number of active logical volumes in the library and how busy the TS7700 cluster is at
the time of the request.
For pool 0 (common scratch pool), because it contains only empty volumes, only the empty
count is returned. Volumes that have been borrowed from the common pool are not included.
For pools 1 - 32, a count of the physical volumes that are empty, are empty and waiting for
erasure, are being filled, and have been marked as full, is returned. The count for empty
includes physical volumes that have been assigned to the pool and volumes that were
borrowed from the common scratch pool but have not yet been returned.
The count of volumes that are marked as Read Only or Unavailable (including destroyed volumes) is also returned. Because full data volumes contain a mixture of valid and invalid data, response records are also provided for the distribution of active data on the data volumes that are marked as full for a pool.
Information is returned for the common pool and all other pools that are defined and have
physical volumes that are associated with them.
The physical media pool information that is managed by the specific cluster to which the
request volume is written is returned in the response records. In a TS7700 grid configuration,
separate requests must be sent to each cluster to obtain the physical media pool information
for all clusters.
The response records are written in 80-byte FB format. Counts are provided for each media
type associated with the pool (up to a maximum of eight).
In a TS7700 grid configuration, separate requests must be sent to each cluster to obtain the
physical volume status information for the individual clusters. A response record is written for
each physical volume, selected based on the volume serial number mask or pool number that
is specified in the request that exists in the cluster. A response record consists of the
database fields that are defined as shown in Example 11-7 for Volume A03599. The overall
length of each record is 400 bytes with blank padding after the last field, as needed.
Copy audit
A database is maintained on each TS7700 cluster that contains status information about the
logical volumes that are defined to the grid. Two key pieces of information are whether the
cluster contains a valid copy of a logical volume and whether the copy policy for the volume
indicates that it must have a valid copy.
This request runs an audit of the databases on a set of specified TS7700 distributed libraries
to determine whether any volumes do not have a valid copy on at least one of them.
If no further parameter is specified, the audit checks only whether a logical volume has a copy on the specified clusters; it does not validate whether those clusters are supposed to have a copy. To take the copy modes into account, specify the second parameter, COPYMODE.
Using COPYMODE
If the COPYMODE option is specified, whether the volume is supposed to have a copy on the
distributed library is considered when determining whether that distributed library has a valid
copy. If COPYMODE is specified and the copy policy for a volume on a specific cluster is “S”,
“R”, “D”, or “T”, that cluster is considered during the audit.
If the copy policy for a volume on a specific cluster is “N”, the volume’s validity state is ignored
because that cluster does not need to have a valid copy.
The request then returns a list of any volumes that do not have a valid copy, subject to the
copy mode if the COPYMODE option is specified, on the TS7700 clusters specified as part of
the request.
The specified clusters might not have a copy for several reasons:
The copy policy that is associated with the volume did not specify that any of the clusters
that are specified in the request were to have a copy and the COPYMODE option was not
specified. This might be because of a mistake in defining the copy policy or because it
was intended.
For example, volumes that are used in a disaster recovery test need to be only on the
disaster recovery TS7700 and not on the production TS7700s. If the request specified only the production TS7700 clusters, all of the volumes that are used in the test are returned in the list.
The copies have not yet been made from a source TS7700 to one or more of the specified
clusters. This can be because the source TS7700 or the links to it are unavailable, or
because a copy policy of Deferred was specified and a copy has not been completed
when the audit was run.
Each of the specified clusters contained a valid copy at one time, but has since removed it
as part of the TS7700 hybrid automated removal policy function. Automatic removal can
take place on TS7700D or TS7700T clusters in all configuration scenarios (hybrid or
homogeneous). In a TS7700T, only data in CP0 is subject to automatic removal.
In the Copy Audit request, you need to specify which TS7700 clusters are to be audited. The
clusters are specified by using their associated distributed library ID (this is the unique
five-character library sequence number that is defined when the TS7700 Cluster was
installed). If more than one distributed library ID is specified, they are separated by a comma.
The following rules determine which TS7700 clusters are to be included in the audit:
When the INCLUDE parameter is specified, all specified distributed library IDs are
included in the audit. All clusters that are associated with these IDs must be available or
the audit fails.
When the EXCLUDE parameter is specified, all specified distributed library IDs are
excluded from the audit. All other clusters in the grid configuration must be available or the
audit fails.
The specified distributed library IDs are checked for validity in the grid configuration. If one or more of the specified distributed library IDs are invalid, the Copy Audit fails and the response indicates which IDs are considered invalid.
Distributed library IDs must be specified or the Copy Audit fails.
Here are examples of valid requests (assume a three-cluster grid configuration with
distributed library IDs of DA01A, DA01B, and DA01C):
– COPY AUDIT INCLUDE DA01A: Audits the copy status of all volumes on only the
cluster that is associated with distributed library ID DA01A.
– COPY AUDIT COPYMODE INCLUDE DA01A: Audits the copy status of volumes that
also have a valid copy policy on only the cluster that is associated with distributed
library ID DA01A.
– COPY AUDIT INCLUDE DA01B,DA01C: Audits the copy status of volumes on the
clusters that are associated with distributed library IDs DA01B and DA01C.
– COPY AUDIT EXCLUDE DA01C: Audits the copy status of volumes on the clusters in
the grid configuration that is associated with distributed library IDs DA01A and DA01B.
On completion of the audit, a response record is written for each logical volume that did not
have a valid copy on any of the specified clusters. Volumes that have never been used, have
had their associated data deleted, or have been returned to scratch are not included in the
response records. The record includes the volume serial number and the copy policy
definition for the volume. The VOLSER and the copy policy definitions are comma-separated,
as shown in Example 11-8. The response records are written in 80-byte FB format.
If the request file contains the correct number of records and the first record is correct but the
second is not, the response file indicates in Record 6 that the request is unknown, as shown
in Example 11-9.
If the request file contains the correct number of records, the first record is correct, and the
second is recognized but includes a variable that is not within the range that is supported for
the request, the response file indicates in record 6 that the request is invalid, as shown in
Example 11-10.
Most threshold alerts allow two thresholds per setting. The first, higher limit issues the “alarm” message that the threshold has been crossed. A second, lower limit informs you when the condition is resolved and the amount of data has fallen back below the threshold.
Alert values that are set too low lead to unnecessary messages and operator actions. Values that are set too high might not give you enough time to react in a critical situation.
The default value for each parameter is 0, which indicates that no warning limit is set and
messages are not generated.
These alerts are applicable for all TS7700s, and can be set for each distributed library
independently.
Important: The monitored values are different for each TS7700 model.
For a TS7700D and TS7700T CP0, if RESDLOW/RESDHIGH is exceeded, check the amount of pinned data and the amount of data that is subject to automatic removal. If the cache is filled with pinned data and that data will not be expired by the host in the near future, you might run out of cache.
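For example, assuming the RESDHIGH and RESDLOW keywords that are named above, the limits are set with the Host Console Request function per distributed library. The library name and the values in this sketch are placeholders; the exact keyword syntax and units are described in the IBM Virtualization Engine TS7700 Series z/OS Host Command Line Request User's Guide:
LI REQ,distributed_library,SETTING,ALERT,RESDHIGH,3500
LI REQ,distributed_library,SETTING,ALERT,RESDLOW,3000
Setting a value back to 0 returns the parameter to its default, in which no warning limit is set and no messages are generated.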
To ensure that production is not impacted, a switch from the original replication mode to “Sync-Deferred” or “Immed-Deferred” is possible. For the synchronous mode copy, the user can define whether the synchronous mode copy changes to a “Sync-Deferred” state or whether the job fails.
There are numerous reasons why a volume might enter the immediate-deferred state. For example, the copy might not complete within 40 minutes, or one or more clusters that are targeted to receive an immediate copy might not be available. Regardless of why a volume enters the immediate-deferred state, the host application or job that is associated with the volume is not aware that its previously written data has entered the immediate-deferred state.
The reasons why a volume moves to the immediate-deferred state are contained in the Error
Recovery Action (ERA) 35 sense data. The codes are divided into unexpected and expected
reasons. From a z/OS host view, the ERA is part of message IOS000I (Example 11-11).
New failure content is introduced into the CCW(RUN) ERA35 sense data:
Byte 14 FSM Error. If set to 0x1C (Immediate Copy Failure), the additional new fields are
populated.
Byte 18 Bits 0:3. Copies Expected: Indicates how many RUN copies were expected for
this volume.
Byte 18 Bits 4:7. Copies Completed: Indicates how many RUN copies were verified as
successful before surfacing Sense Status Information (SNS).
Byte 19. Immediate Copy Reason Code:
– Unexpected - 0x00 to 0x7F: The reasons are based on unexpected failures:
• 0x01 - A valid source to copy was unavailable.
• 0x02 - Cluster that is targeted for a RUN copy is not available (unexpected outage).
• 0x03 - 40 minutes have elapsed and one or more copies have timed out.
• 0x04 - Is reverted to immediate-deferred because of health/state of RUN target
clusters.
• 0x05 - Reason is unknown.
– Expected - 0x80 to 0xFF: The reasons are based on the configuration or a result of
planned outages:
• 0x80 - One or more RUN target clusters are out of physical scratch cache.
• 0x81 - One or more RUN target clusters are low on available cache
(95%+ full).
• 0x82 - One or more RUN target clusters are in service-prep or service.
• 0x83 - One or more clusters have copies that are explicitly disabled through the
Library Request operation.
• 0x84 - The volume cannot be reconciled and is “Hot” against peer clusters.
The additional data that is contained within the CCW(RUN) ERA35 sense data can be used
within a z/OS custom user exit to act on a job moving to the immediate-deferred state.
After the volume is closed, any synchronous-deferred locations are updated to an equivalent
consistency point through asynchronous replication. If the SDWF option is not selected
(default) and a write failure occurs at either of the “S” locations, host operations fail and you
must view only content up to the last successful sync point as valid.
For example, imagine a three-cluster grid and a copy policy of Sync-Sync Deferred (SSD),
Sync Copy to Cluster 0 and Cluster 1, and a deferred copy to Cluster 2. The host is connected
to Cluster 0 and Cluster 1. With this option disabled, both Cluster 0 and Cluster 1 must be
available for write operations. If either one becomes unavailable, write operations fail. With the
option enabled, if either Cluster 0 or Cluster 1 becomes unavailable, write operations
continue. The second “S” copy becomes a synchronous-deferred copy.
In the previous example, if the host is attached to Cluster 2 only and the option is enabled, the
write operations continue even if both Cluster 0 and Cluster 1 become unavailable. The “S”
copies become synchronous-deferred copies.
For more information about Synchronous mode copy, see the IBM Virtualization Engine
TS7700 Series Best Practices - Synchronous Mode Copy white paper on Techdocs.
Therefore, a Host Console Request Command is provided, which enables you to define
whether these conditions report the composite library as degraded or not:
LI REQ,composite library,SETTING,ALERT,DEFDEG,[ENABLE|DISABLE]
For example, a setting of 60% means that if one link is running at 100%, the remaining links
are marked as degraded if they are running at less than 60% of the 100% link. The grid link
performance is available with the Host Console Request function, and on the TS7700 MI. The
monitoring of the grid link performance by using the Host Console Request function is
described in detail in the IBM Virtualization Engine TS7700 Series z/OS Host Command Line
Request User’s Guide, which is available on Techdocs:
http://www.ibm.com/support/techdocs/atsmastr.nsf/Web/TechDocs
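For example, the current grid link performance and retransmission percentages can be displayed from the host with the following command, where libname is a placeholder for the library name that is used in your environment:
LI REQ,libname,STATUS,GRIDLINK
The same information is available on the TS7700 MI, as noted above.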
The grid link degraded threshold also includes two other values that can be set by the SSR:
Number of degraded iterations: The number of consecutive 5-minute intervals that link
degradation was detected before reporting an attention message. The default value is 9.
Generate Call Home iterations: The number of consecutive 5-minute intervals that link
degradation was detected before generating a Call Home. The default value is 12.
The default values are set to 60% for the threshold, nine iterations before an attention
message is generated, and 12 iterations before a Call Home is generated. Use the default
values unless you are receiving intermittent warnings and support indicates that the values
need to be changed. If you receive intermittent warnings, let the SSR change the threshold
and iteration to the suggested values from support.
For example, suppose that clusters in a two-cluster grid are 2000 miles apart with a round-trip
latency of approximately 45 ms. The normal variation that is seen is 20 - 40%. In this
example, the threshold value is at 25% and the iterations are set to 12 and 15.
Example 11-12 lists the content of the Readme.txt file that provides basic information about
the tape tools.
IMPORTANT IMPORTANT
Program enhancements will be made to handle data format changes when they occur.
If you try to run new data with old program versions, results will be
unpredictable. To avoid this situation, you need to be informed of these
enhancements so you can stay current. To be informed of major changes to any of
the tools that are distributed through this FTP site, send an email message to:
In the subject, specify NOTIFY. Nothing else is required in the body of the note.
This will add you to our change distribution list.
The UPDATES.TXT file will contain a chronological history of all changes made to
the tools. You should review that file on a regular basis, at least monthly,
perhaps weekly, so you can see whether any changes apply to you.
If you feel that the JCL or report output needs more explanation, send an email to
the address above indicating the area needing attention.
Most of these tools are z/OS-based and included in the ibmtools.exe file. A complete list of
all tools that are included in the ibmtools.exe file is available in the overview.pdf file.
The following list summarizes each tool: its function, the benefit it provides, its input, and its output.
BADBLKSZ - Function: Identify small VTS blocksizes. Benefit: Improve VTS performance, make jobs run faster. Input: Logrec MDR & CA1, TLMS, RMM, ZARA, CTLT. Output: VOLSER, Jobname, and Dsname for VTS volumes with small blocksizes.
BVIRHSTx - Function: Get historical stats from TS7700. Benefit: Creates U, VB, SMF format. Input: TS7700. Output: Statistics file.
BVIRPOOL - Function: Identify available scratch by pool. Benefit: Reports all pools at the same time. Input: BVIR file. Output: Physical media by pool.
BVIRPRPT - Function: Reclaim Copy Export volumes. Benefit: Based on active GB, not %. Input: BVIR file. Output: Detailed report of data on volumes.
BVIRRPT - Function: Identify VTS virtual volumes by owner. Benefit: Determine which applications or users have virtual volumes. Input: BVIR data & CA1, TLMS, RMM, ZARA, CTLT. Output: Logical volumes by jobname or dsname, logical to physical reports.
BVPITRPT - Function: Point in Time stats as write to operator (WTO). Benefit: Immediately available. Input: TS7700. Output: Point in Time stats as WTO.
COPYVTS - Function: Copy lvols from old VTS. Benefit: Recall lvols based on selected applications. Input: BVIR data & CA1, TLMS, RMM, ZARA, CTLT. Output: IEBGENER to recall lvols and copy to new VTS.
DIFFEXP - Function: Identify multi-file volumes with different expiration dates. Benefit: Prevent single file from not allowing volume to return to scratch. Input: CA1, TLMS, RMM, ZARA, CTLT. Output: List of files not matching file 1 expiration date.
FSRMATCH - Function: Replace *.HMIGTAPE.DATASET in SMF 14 with actual recalled dsname. Benefit: Allows TapeWise and other tools that use SMF 14/15 data to report the actual recalled data set. Input: FSR records plus SMF 14, 15, 21, 30, 40, and so on. Output: Updated SMF 14s plus all other SMF records as they were.
GETVOLS - Function: Get VOLSERs from list of dsns. Benefit: Automate input to PRESTAGE. Input: CA1, TLMS, RMM, ZARA, CTLT. Output: VOLSERs for requested dsns.
IOSTATS - Function: Report job elapsed times. Benefit: Show runtime improvements. Input: SMF 30 records. Output: Job-step detailed reporting.
MOUNTMON - Function: Monitor mount pending and volume allocations. Benefit: Determine accurate mount times and concurrent drive allocations. Input: Samples tape UCBs. Output: Detail, summary, distribution, hourly, TGROUP, and system reporting.
ORPHANS - Function: Identify orphan data sets in the Tape Management Catalog (TMC). Benefit: Cleanup tool. Input: CA1, TLMS, RMM, ZARA, CTLT. Output: Listing file showing all multiple occurrence generation data groups (GDGs) that have not been created in the last nn days.
PRESTAGE - Function: Recall lvols to VTS. Benefit: Ordered and efficient. Input: BVIR VOLUME MAP. Output: Jobs that are submitted to recall lvols.
SMFILTER - Function: IFASMFDP exit or E15 exit. Benefit: Filters SMF records to keep just tape activity; generates “tape” records to simulate optical activity. Input: SMF data. Output: Records for tape activity plus optional TMM or optical activity.
TAPECOMP - Function: Show current tape compression ratios. Benefit: See how well data will compress in VTS. Input: Logrec MDR or EREP history file. Output: Shift and hourly reports showing current read and write compression ratios.
TAPEWISE - Function: Identify tape usage improvement opportunities. Benefit: Shows UNIT=AFF, early close, UNIT=(TAPE,2), multi-mount, DISP=MOD, recalls. Input: SMF 14, 15, 21, 30, and 40. Output: Detail, summary, distributions, hourly, TGROUP, and system reporting.
TCDBMCH - Function: Identify tape configuration database (TCDB) versus Tape Catalog mismatches. Benefit: List VOLSER mismatches. Input: CA1, TLMS, RMM, ZARA, CTLT. Output: ERRRPT with mismatched volumes.
TMCREUSE - Function: Identify data sets with create date equal to last ref date. Benefit: Get candidate list for VTS PG0. Input: CA1, TLMS, RMM, ZARA, CTLT. Output: Filter list of potential PG0 candidates.
VEHGRXCL - Function: Graphing package. Benefit: Graphs TS7700 activity. Input: VEHSTATS flat files. Output: Many graphs of TS7700 activity.
VEHSCAN - Function: Dump fields in historical statistics file. Benefit: Individual field dump. Input: BVIR stats file. Output: DTLRPT for selected interval.
VEHSTATS - Function: TS7700 historical performance reporting. Benefit: Show activity on and performance of TS7700. Input: BVIRHSTx file. Output: Reports showing mounts, data transfer, and box usage.
VEPSTATS - Function: TS7700 point-in-time statistics. Benefit: Snapshot of last 15 seconds of activity plus current volume status. Input: BVIRPIT data file. Output: Reports showing current activity and status.
VESYNC - Function: Synchronize TS7700 after new cluster added. Benefit: Identify lvols that need copies. Input: BVIR data and CA1, TLMS, RMM, ZARA, CTLT. Output: List of all VOLSERs to recall by application.
VOLLIST - Function: Show all active VOLSERs from TMC; also get volume counts by group, size, and media. Benefit: Get a picture of user data set naming conventions and see how many volumes are allocated to different applications. Input: CA1, TLMS, RMM, ZARA, CTLT. Output: Dsname, VOLSER, create date, and volseq; group name, counts by media type.
IBM employees can access IBM Tape Tools on the following website:
http://w3.ibm.com/sales/support/ShowDoc.wss?docid=SGDK749715N06957E45&node=brands,B5000|brands,B8S00|clientset,IA
IBM Business Partners can access IBM Tape Tools on the following website:
http://www.ibm.com/partnerworld/wps/servlet/ContentHandler/SGDK749715N06957E45
For most tools, a text file is available. In addition, each job to run a tool contains a detailed
description of the function of the tool and parameters that need to be specified.
Important: For the IBM Tape Tools, there are no warranties, expressed or implied,
including the warranties of merchantability and fitness for a particular purpose.
To obtain the tape tools, download the ibmtools.exe file to your computer or use FTP from
Time Sharing Option (TSO) on your z/OS system to directly upload the files that are
contained in the ibmtools.exe file.
The ibmtools.txt file contains detailed information about how to download and install the
tools libraries.
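As an illustration only, the files can also be retrieved with a batch FTP job on z/OS. This sketch assumes that anonymous FTP to the site shown above is permitted from your system; the receiving data set name and the password (your email address) are placeholders, and the ibmtools.txt file remains the authoritative description of the procedure:
//GETTOOLS EXEC PGM=FTP,PARM='ftp.software.ibm.com (EXIT'
//* Batch FTP sketch: log in anonymously and retrieve ibmtools.txt.
//OUTPUT   DD SYSOUT=*
//INPUT    DD *
anonymous
your.email@yourcompany.com
cd /storage/tapetool
ascii
get ibmtools.txt 'USERID.IBMTOOLS.TXT'
quit
/*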
After you have created the three or four libraries on the z/OS host, be sure that you complete
the following steps:
1. Copy, edit, and submit userid.IBMTOOLS.JCL($$CPYLIB) to create a new JCL library that
has a unique second node (&SITE symbolic). This step creates a private JCL library for
you from which you can submit jobs while leaving the original as is. CNTL and LOAD can
then be shared by multiple users who are running jobs from the same system.
2. Edit and submit userid.SITENAME.IBMTOOLS.JCL($$TAILOR) to tailor the JCL according to
your system requirements.
The updates.txt file contains all fixes and enhancements made to the tools. Review this file
regularly to determine whether any of the programs that you use have been modified.
To ensure that you are not working with outdated tools, the tools are controlled through an
EXPIRE member. Every three months, a new EXPIRE value is issued that is good for the next
12 months. When you download the current tools package any time during the year, you have
at least nine months remaining on the EXPIRE value. New values are issued in the middle of
January, April, July, and October.
If your IBM tools jobs stop running because the expiration date has passed, download the
ibmtools.exe file again to get the current IBMTOOLS.JCL(EXPIRE) member.
IOSTATS
The IOSTATS tool is part of the ibmtools.exe file, which is available at the following URL:
ftp://ftp.software.ibm.com/storage/tapetool/
You can use the IOSTATS tool to measure job execution times. For example, you might want to
compare the TS7700 performance before and after configuration changes.
IOSTATS can be run for a subset of job names for a certain period before the hardware
installation. SMF type 30 records are required as input. The reports list the number of disk
and tape I/O operations that were done for each job step, and the elapsed job execution time.
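For example, the SMF type 30 records for the measurement period can be extracted with the standard IFASMFDP utility before they are passed to the IOSTATS job that is distributed in the tools JCL library. The data set names in this sketch are placeholders:
//SMF30    EXEC PGM=IFASMFDP
//* Select only SMF type 30 records from an existing SMF dump data set.
//SYSPRINT DD SYSOUT=*
//DUMPIN   DD DISP=SHR,DSN=YOUR.SMF.DUMP.DATA
//DUMPOUT  DD DSN=YOUR.SMF.TYPE30.SUBSET,DISP=(NEW,CATLG),
//            DCB=(RECFM=VBS,LRECL=32760),
//            SPACE=(CYL,(100,100),RLSE)
//SYSIN    DD *
  INDD(DUMPIN,OPTIONS(DUMP))
  OUTDD(DUMPOUT,TYPE(30))
/*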
TAPEWISE
As with IOSTATS, the TAPEWISE tool is available from the IBM Tape Tools FTP site. Based on input parameters, TAPEWISE can generate several reports:
Tape activity analysis, including reads and Disp=mod analysis
Mounts and MBs processed by hour
Input and output mounts by hour
Mounts by SYSID during an hour
Concurrent open drives used
Long VTS mounts (recalls)
MOUNTMON
As with IOSTATS, MOUNTMON is available from the IBM Tape Tools FTP site. MOUNTMON
runs as a started task or batch job and monitors all tape activity on the system. The program
must be authorized program facility (APF)-authorized and, if it runs continuously, it writes
statistics for each tape volume allocation to SMF or to a flat file.
Based on data that is gathered from MOUNTMON, the MOUNTRPT program can report on
the following information:
How many tape mounts are necessary
How many are scratch
How many are private
How many by host system
How many by device type
How much time is needed to mount a tape
How long tapes are allocated
How many drives are being used at any time
What is the most accurate report of concurrent drive usage
Which jobs are allocating too many drives
Also, some so-called “Top” reports can be produced to get an overview of the tape usage. For
more information, see z/OS DFSMS Using the Volume Mount Analyzer, SC23-6895.
To convert the binary response record from BVIR data to address your requirements, you can
use the IBM tool VEHSTATS when working with historical statistics. When working with
point-in-time statistics, you can use the IBM tool VEPSTATS. See 11.16.2, “Tools download
and installation” on page 700 for specifics about where to obtain these tools. Details about
using BVIR are in the IBM Virtualization Engine TS7700 Series Bulk Volume Information
Retrieval function User’s Guide.
The most recently published white papers are available at the Techdocs website by searching
for TS7700 at the following address:
http://www.ibm.com/support/techdocs/atsmastr.nsf/Web/TechDocs
With the record layout of the binary BVIR response data, you can decode the binary file or
you can use the record layout to program your own tool for creating statistical reports.
Some of the VEHSTATS reports were already discussed in previous sections. This section
contains only information that is not yet covered.
Both sets of statistics can be obtained through the BVIR functions (see Appendix E, “Sample
job control language” on page 853).
Because both types of statistical data are delivered in binary format from the BVIR functions,
you must convert the content into a readable format.
Alternatively, you can use an existing automation tool. IBM provides a historical statistics tool
called VEHSTATS. Like the other IBM Tape Tools, the program is provided as-is, without
official support, for the single purpose of showing how the data might be reported. There is no
guarantee of its accuracy, and there is no additional documentation available for this tool.
Guidance for interpretation of the reports is available in 11.18.3, “VEHSTATS reports” on
page 707.
You can use VEHSTATS to monitor TS7700 virtual and physical tape drives, and TVC activity
to do trend analysis reports, which are based on BVIR binary response data. The tool
summarizes TS7700 activity on a specified time basis, up to 90 days in time sample intervals
of 15 minutes or 1 hour, depending on the data reported.
Figure 11-43 on page 701 highlights three files that might be helpful in reading and
interpreting VEHSTATS reports:
The TS7700.VEHSTATS.Decoder file contains a description of the fields that are listed in the
various VEHSTATS reports.
The VEHGRXCL.txt file contains the description for the graphical package that is contained
in VEHGRXCL.EXE.
The VEHGRXCL.EXE file contains VEHSTATS_Model.ppt and VEHSTATS_Model.xls. You can
use these files to create graphs of cluster activity based on the flat files that are created
with VEHSTATS. Follow the instructions in the VEHSTATS_Model.xls file to create these
graphs.
In addition to the VEHSTATS tool, sample BVIR jobs are included in the IBMTOOLS libraries.
These jobs help you obtain the input data from the TS7700. With these jobs, you can control
where the historical statistics are accumulated for long-term retention.
Three specific jobs in IBMTOOLS.JCL are designed to fit your particular needs:
BVIRHSTS To write statistics to the SMF log file
BVIRHSTU To write statistics to a RECFM=U disk file
BVIRHSTV To write statistics to a RECFM=VB disk file
BVIR volumes cannot be written with LWORM attributes. Ensure that the BVIR logical
volumes have a Data Class without LWORM specification.
The VEHSTATS reporting program accepts any or all of the various formats of BVIR input.
Define which input is to be used through a data definition (DD) statement in the VEHSTATS
job. The three input DD statements are optional, but at least one of the statements that are
shown in Example 11-14 must be specified.
The SMF input file can contain all SMF record types that are kept by the user. The SMFNUM
parameter defines which record number is processed when you specify the STATSMF
statement.
The fields that are shown in the various reports depend on which ORDER member in
IBMTOOLS.JCL is being used. Use the following steps to ensure that the reports and the flat file
contain the complete information that you want in the reports:
1. Review which member is defined in the ORDER= parameter in the VEHSTATS job member.
2. Verify that none of the fields that you want to see have been deactivated as indicated by
an asterisk in the first column. Example 11-15 shows sample active and inactive
definitions in the ORDERV12 member of IBMTOOLS.JCL. The sample statements define
whether you want the amount of data in cache to be displayed in MB or in GB.
If you are planning to create graphics from the flat file by using the graphics package from the
IBM Tape Tools FTP site, specify the ORDERV12 member because it contains all the fields
that are used when creating the graphics, and verify that all statements are activated for all
clusters in your environment.
VEHSTATS gives you a huge amount of information. The following list shows the most
important reports available for the TS7700, and the results and analysis that can help you
understand the reports better:
H20VIRT: Virtual Device Historical Records
H21ADP0x: vNode Adapter Historical Activity
H21ADPXX: vNode Adapter Historical Activity combined (by adapter)
H21ADPSU: vNode Adapter Historical Activity combined (total)
H30TVC1: hnode HSM Historical Cache Partition:
– For a TS7700D and TS7740, this report represents the cache.
– For a TS7700T, multiple TVCs (TVC2 and TVC3) are presented. TVC1 contains the
data from CP0, TVC2 contains the data of CP1, and so on.
H31IMEX: hNode Export/Import Historical Activity
H32TDU12: hNode Library Historical Drive Activity
H32CSP: hNode Library Hist Scratch Pool Activity
H32GUPXX: General Use Pools 01/02 through General Use Pools 31/32
H33GRID: hNode Historical Peer-to-Peer (PTP) Activity
AVGRDST: Hrs Interval Average Recall Mount Pending Distribution
DAYMRY: Daily Summary
MONMRY: Monthly Summary
COMPARE: Interval Cluster Comparison
HOURFLAT: 15-minute interval or 1-hour interval
DAYHSMRY: Daily flat file
Tip: Be sure that you have a copy of TS7700 VEHSTATS Decoder and the TS7700
Statistical white paper available when you familiarize yourself with the VEHSTATS
reports.
Clarification: The report is provided per cluster in the grid. The report title includes the
cluster number in the DIST_LIB_ID field.
Example 11-17 VEHSTATS report for Virtual Drive Activity - second half
HISTORICAL RECORDS RUN ON 13JUL2015 @ 13:40:49 PAGE 24
VE_CODE_LEVEL=008.033.000.0025 UTC NOT CHG
____CLUSTER VS FICON CHANNEL______
AHEAD AHEAD BEHIND BEHIND ---------CHANNEL_BLOCKS_WRITTEN_F
MAX AVG MAX AVG <=2048 <=4096 <=8192
------------R3.1.0073+----------->
7585 3551 1064 242 17540 0 25650
7626 4600 1239 326 22632 0 32400
7638 4453 958 325 21943 0 31350
7491 4553 974 353 22913 0 32700
7664 2212 1564 387 14092 0 19500
0 0 0 0 0 0 0
0 0 0 0 0 0 0
8521 4534 713 108 19101 0 32063
With R3.1, new information was included. CLUSTER VS FICON CHANNEL shows whether the TS7700 can take more workload (called ahead), or whether the FICON channels try to deliver more data than the TS7700 can accept at that specific point in time (called behind). Normally, you see numbers in both columns.
Use the ratio of both numbers to understand the performance of your TS7700. In our
example, the TS7700 is behind only 8% of the time in an interval. The TS7700 can handle
more workload than is delivered from the host.
In addition, the report shows the CHANNEL BLOCKS WRITTEN FOR BLOCKSIZES. In general, the largest number of blocks is written at a blocksize of 65,536 or higher, but this is not a fixed rule. For example, DFSMShsm writes a 16,384 blocksize, and DB2 writes blocksizes up to 256,000. The report contains more blocksize breakdowns, but they are not shown in the example. From an I/O point of view, analysis of the effect of blocksize on performance is outside the scope of this book.
The following fields are the most important fields in this report:
GBS_RTE: This field shows the actual negotiated speed of the FICON Channel.
RD_MB and WR_MB: The amount of uncompressed data that is transferred by this
FICON Channel.
The host adapter activity is summarized per adapter, and as a total of all adapters. This result
is also shown in the vNode adapter Throughput Distribution report shown in Example 11-19.
The provided example is an extract of the report; it summarizes the overall host throughput and shows how many one-hour intervals fall into each throughput range.
The report also shows information about the Preference Groups. The following fields are the
most important fields in this report:
The ratio between FAST_RDY_MOUNTS, CACHE_HIT_MOUNTS, and
CACHE_MISS_MOUNTS. In general, a high number of CACHE_MISSES might mean
that additional cache capacity is needed, or cache management policies need to be
adjusted. For a TS7700T, you might also reconsider the Tape Partition sizes.
FAST_RDY_AVG_SECS and CACHE_HIT_AVG_SECS need to show only a few
seconds. CACHE_MIS_AVG_SECS can list values higher than a few seconds, but higher
values (more than 2 - 3 minutes) might indicate a lack of physical tape drives. For more
information, see Example 11-20.
Peer-to-Peer Activity
The Peer-to-Peer Activity report that is shown in Example 11-21 provides various
performance metrics of grid activity. This report can be useful for installations working in
Deferred copy mode. This report enables, for example, the analysis of subsystem
performance during peak grid network activity, such as determining the maximum delay
during the batch window.
For the time of the report, you can identify, in 15-minute increments, the following items:
The number of logical volumes to be copied (valid only for a multi-cluster grid
configuration)
The amount of data to be copied (in MB)
Tip: Analyzing the report that is shown in Example 11-21, you see three active clusters
with write operations from a host. This might not be a common configuration, but it is an
example of a scenario to show the possibility of having three copies of a logical volume
in a multi-cluster grid.
The following fields are the most important fields in this report:
MB_TO_COPY: The amount of data pending a copy function to other clusters (outbound).
MB_FR: The amount of data (MB) copied from the cluster (inbound data) identified in the
column heading. The column heading 1-->2 indicates Cluster 1 is the copy source and
Cluster 2 is the target.
CALC_MB/SEC: This number shows the true throughput that is achieved when replicating
data between the clusters that are identified in the column heading.
Summary reports
In addition to daily and monthly summary reports per cluster, VEHSTATS also provides a
side-by-side comparison of all clusters for the entire measurement interval. Examine this
report for an overall view of the grid, and for significant or unexpected differences between
the clusters.
The following steps describe the general sequence of actions to produce the graphs of your grid environment:
1. Run the BVIRHSTV program to collect the TS7700 BVIR History data for a selected
period (suggested 31 days). Run the VEHSTATS program for the period to be analyzed (a
maximum of 31 days is used).
2. Select one day during the analysis period to analyze in detail, and run the VEHSTATS
hourly report for that day. You can import the hourly data for all days and then select the
day later in the process. You also decide which cluster will be reported by importing the
hourly data of that cluster.
3. File transfer the two space-separated files from VEHSTATS (one daily and one hourly) to
your workstation.
4. Start Microsoft Excel and open the VEHSTATS_Model.xls workbook, which must be in the directory C:\VEHSTATS.
5. Import the VEHSTATS daily file into the “Daily data” sheet by using a special parsing
option.
The following examples of PowerPoint slides give an impression of the type of information that
is provided with the tool. You can easily update these slides and include them in your own
capacity management reports.
Figure 11-44 gives an overview of all of the sections included in the PowerPoint presentation.
Agenda
This presentation contains the following sections: In PowerPoint, right click
on the section name and then “Open Hyperlink” to go directly to the
beginning of that section.
– Overview
– Data transfer
– Virtual mounts
– Virtual mount times
– Virtual Drive and Physical Drive usage
– Physical mounts
– Physical mount times
– Data compression ratios
– Blocksizes
– Tape Volume Cache performance
– Throttling
– Multi cluster configuration (Grid)
– Import/Export Usage
– Capacities: Active Volumes and GB stored
– Capacities: Cartridges used
– Pools (Common Scratch Pool and up to 4 Storage Pools )
Overview
Two sample charts: Customers Grid February (daily values, 1-Feb through 27-Feb; chart data not reproduced here)
Figure 11-47 Sample VEHGRXCL: All physical mounts
The report contains not only missing copies, but also inconsistent data levels, data corruption of a logical volume (detected only at read), and existing (but unwanted) copies.
For each logical volume that is reported, it also reports the following information:
Type of issue
Constructs
Creation, last reference, and expiration dates
Category
Size
STALE
This report contains logical volumes that are expired, but not yet deleted. This report should
usually be empty.
TOSYNC
If fewer copies exist for a logical volume than are requested in the Management Class, this report is generated. The tool can exclude scratch volumes from the list. The list that is produced in this report can then be used as input for copy refresh processing.
D SMS,VOL(A13051)
CBR1180I OAM tape volume status: 146
VOLUME MEDIA STORAGE LIBRARY USE W C SOFTWARE LIBRARY
TYPE GROUP NAME ATR P P ERR STAT CATEGORY
A13051 MEDIA1 *SCRTCH* HYDRAG S N N NOERROR SCRMED1
-------------------------------------------------------------------
RECORDING TECH: 36 TRACK COMPACTION: YES
SPECIAL ATTRIBUTE: NONE ENTER/EJECT DATE: 2014-02-12
CREATION DATE: 2014-02-12 EXPIRATION DATE:
LAST MOUNTED DATE: 2014-02-12 LAST WRITTEN DATE: 2014-02-12
SHELF LOCATION:
OWNER:
LM SG: LM SC: LM MC: LM DC:
LM CATEGORY: 0011
-------------------------------------------------------------------
Logical volume.
----------------------------------------------------------------------
For more information about these displays and the LIBRARY command, see Chapter 10, “Host Console operations” on page 581 and z/OS DFSMS Object Access Method Planning, Installation, and Storage Administration Guide for Tape Libraries, SC23-6867.
All virtual drives online - How: LI DD,libname. Scope: Display each composite library and each system. Frequency: Each shift. Action: Report or act on any missing drive.
TS7700 health check - How: TS7700 MI. Scope: Display each composite library. Frequency: Each shift. Action: Report any offline or degraded status.
Library online and operational - How: D SMS,LIB(ALL),DETAIL. Scope: Display each composite library and each system. Frequency: Each shift. Action: Verify availability to systems.
Exits enabled - How: D SMS,OAM. Scope: Display each system. Frequency: Each shift. Action: Report any disabled exits.
Virtual scratch volumes - How: D SMS,LIB(ALL),DETAIL. Scope: Display each composite library. Frequency: Each shift. Action: Report each shift.
Physical scratch tapes - How: D SMS,LIB(ALL),DETAIL. Scope: Display each composite library. Frequency: Each shift. Action: Report each shift.
Interventions - How: D SMS,LIB(ALL),DETAIL. Scope: Display each composite library. Frequency: Each shift. Action: Report or act on any interventions.
Grid link status - How: LI REQ,libname,STATUS,GRIDLINK. Scope: Display each composite library. Frequency: Each shift. Action: Report any errors or elevated Retransmit%.
Number of volumes on the deferred copy queue - How: TS7700 MI → Logical Volumes → Incoming Copy Queue. Scope: Display for each cluster in the grid. Frequency: Each shift. Action: Report and watch for gradual or sudden increases.
Copy queue depths - How: TS7700 MI. Scope: Display for each system. Frequency: Each shift. Action: Report if queue depth is higher than usual.
Available slot count - How: D SMS,LIB(ALL),DETAIL. Scope: Rolling weekly trend. Frequency: Daily. Action: Capacity planning and general awareness.
Data distribution - How: BVIRPOOL job. Scope: Watch for healthy distribution. Frequency: Weekly. Action: Use for reclaim tuning.
Most checks that you need to make in each shift ensure that the TS7700 environment is
operating as expected. The checks that are made daily or weekly are intended for tuning and
longer-term trend analysis. The information in this table is intended as a basis for monitoring.
You can tailor this information to best fit your needs.
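For example, the host-based checks in each shift can be covered with the following commands, where libname is a placeholder for your composite library name:
D SMS,LIB(ALL),DETAIL            (library state, scratch counts, interventions, and slot count)
D SMS,OAM                        (OAM status and installation exits)
LI DD,libname                    (virtual drive status for the composite library)
LI REQ,libname,STATUS,GRIDLINK   (grid link status and retransmission percentage)
The TS7700 MI checks, such as the health check and the incoming copy queue, are done from the MI as listed above.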
Various scenarios are presented, showing the influence of the algorithms involved in virtual
device allocation. A configuration with two 3-cluster grid configurations, named GRID1 and
GRID2, is used. Each grid has a TS7700D (Cluster 0) and a TS7740 (Cluster 1) at the
primary Production Site, and a TS7740 (Cluster 2) at the Disaster Site. The TS7700D Cluster
0 in the Production Site can be considered a deep cache for the TS7740 Cluster 1 in the
scenarios that are described next.
Figure 11-48 (diagram): GRID1 and GRID2, each with a TS7700D (Cluster 0) and a TS7740 or TS7700T (Cluster 1) at the Production Site and a TS7740 or TS7700T (Cluster 2) at the Disaster Site. The tape-attached clusters have back-end drives/library attached, the clusters are interconnected through LAN/WAN grid links, and the host attaches through the FICON fabric.
In Figure 11-48, the host in the Production Site has direct access to the local clusters in the
Production Site, and has access over the extended FICON fabric to the remote clusters in the
Disaster Site. The extended FICON fabric can include dense wavelength division multiplexing
(DWDM) connectivity, or can use FICON tape acceleration technology over IP. Assume that
connections to the remote clusters have a limited capacity bandwidth.
Furthermore, there is a storage management subsystem (SMS) Storage Group (SG) per grid. The groups are defined in the SMS SG routine as GRID1 and GRID2. SMS manages the SGs equally; the order in the definition statement does not influence the allocations.
For example, if two libraries are eligible for a scratch allocation and each library has 128
devices, over time, each library will receive approximately half of the scratch allocations. If
one of the libraries has 128 devices and the other library has 256 devices, each of the
libraries still receives approximately half of the scratch allocations. The allocations are
independent of the number of online devices in the libraries.
Remember: With EQUAL allocation, the scratch allocations are randomized across the
libraries. EQUAL allocation is not influenced by the number of online devices in the
libraries.
In this first scenario, both DAA and SAA are assumed to be disabled. With the TS7700, you
can control both assistance functions with the LIBRARY REQUEST command. DAA is
ENABLED by default and can be DISABLED with the command. SAA is DISABLED by default
and can be ENABLED with the command. Furthermore, none of the TS7700 override settings
are used.
Assuming that the MC for the logical volumes has a Copy Consistency Point of [R,R,R] in all
clusters and that the number of available virtual drives are the same in all clusters, the
distribution of the allocation across the two grids (composite libraries) is evenly spread. The
multi-cluster grids are running in BALANCED mode, so there is no preference of one cluster
above another cluster.
Alternatively, the distribution depends on whether the library port IDs in the list tend to favor the library port IDs in one cluster first, followed by the next cluster, and so on. The order in which the library port IDs are initialized and appear in this DEVSERV list can vary across IPLs or IODF activations, and can influence the randomness of the allocations across the clusters.
So with the default algorithm EQUAL, there might be times when device randomization within
the selected library (composite library) appears unbalanced across clusters in a TS7700 that
have online devices. As the number of eligible library port IDs increases, the likelihood of this
imbalance occurring also increases. If this imbalance affects the overall throughput rate of the
library, consider enabling the BYDEVICES algorithm described in 11.21.2, “BYDEVICES
allocation” on page 724.
Remember: Exceptions to this can also be caused by z/OS JCL backward referencing
specifications (UNIT=REF and UNIT=AFF).
With z/OS V1R11 and later, and z/OS V1R8 through V1R10 with APAR OA26414 installed, it
is possible to change the selection algorithm to BYDEVICES. The algorithm EQUAL, which is
the default algorithm that is used by z/OS, can work well if the libraries (composite libraries)
under consideration have an equal number of online devices and the previous cluster
behavior is understood.
(Diagram: the same GRID1 and GRID2 configuration with the scenario settings ALLOC “EQUAL”, DAA disabled, SAA disabled, and CCP [R,R,R].)
For specific allocations (DAA DISABLED in this scenario), it is first determined which of the
composite libraries, GRID1 or GRID2, has the requested logical volume. That grid is selected
and the allocation can go to any of the clusters in the grid. If it is assumed that the logical
volumes were created with the EQUAL allocation setting (the default), it can be expected that
specific device allocation to these volumes will be distributed equally among the two grids.
However, how well the allocations are spread across the clusters depends on the order in
which the library port IDs were initialized, and whether this order was randomized across
the clusters.
In a TS7740 multi-cluster grid configuration, only the original copy of the volume stays in
cache, normally in the mounting cluster’s TVC for a Copy Consistency Point setting of
[R,R,R]. The copies of the logical volume in the other clusters are managed as a TVC
Preference Level 0 (PG0 - remove from cache first) unless an SC specifies Preference Level
1 (PG1 - stay in cache) for these volumes.
In the hybrid multi-cluster grid configuration that is used in the example, there are two cache
allocation schemes, depending on the I/O TVC cluster selected when creating the logical
volume. Assume an SC setting of Preference Level 1 (PG1) in the TS7740 Cluster 1 and
Cluster 2.
If the mounting cluster for the non-specific request is the TS7700D Cluster 0, only the copy
in that cluster stays. The copies in the TS7740 Cluster 1 and Cluster 2 will be managed as
Preference Level 0 (PG0) and will be removed from cache after placement of the logical
volume on a stacked physical volume. If a later specific request for that volume is directed
to a virtual device in one of the TS7740s, a cross-cluster mount from Cluster 1 or Cluster 2
occurs to Cluster 0’s cache.
If the mounting cluster for the non-specific request is the TS7740 Cluster 1 or Cluster 2,
not only the copy in that cluster stays, but also the copy in the TS7700D Cluster 0. Only
the copy in the other TS7740 cluster will be managed as Preference Level 0 (PG0) and will
be removed from cache after placement of the logical volume on a stacked physical
volume.
Cache preferencing is not valid for the TS7700D cluster. A later specific request for that
logical volume creates only a cross-cluster mount if the mount point is the vNode of the
TS7740 cluster that is not used at data creation of that volume.
With the EQUAL allocation algorithm that is used for specific mount requests, there are
always cross-cluster mounts when the cluster where the device is allocated is not the cluster
where the data is. Cache placement can limit the number of cross-cluster mounts but cannot
avoid them. Cross-cluster mounts over the extended fabric are likely not acceptable, so vary
the devices of Cluster 2 offline.
However, if one of the libraries has 128 devices and the other library has 256 devices, over
time, the library that has 128 devices will receive 1/3 of the scratch allocations and the library
that has 256 devices will receive approximately 2/3 of the scratch allocations. This is different
compared to the default algorithm EQUAL, which does not take the number of online devices
in a library into consideration.
Clarification: With BYDEVICES, the scratch allocation randomizes across all devices in
the libraries, and is influenced by the number of online devices.
With z/OS V1R11 and later, and z/OS V1R8 through V1R10 with APAR OA26414 installed, it
is possible to influence the selection algorithm. The BYDEVICES algorithm can be enabled
through the ALLOCxx PARMLIB member by using the SYSTEM
TAPELIB_PREF(BYDEVICES) parameter, or it can be enabled dynamically through the
SETALLOC operator command by entering SETALLOC SYSTEM,TAPELIB_PREF=BYDEVICES.
Consideration: The SETALLOC operator command support is available only in z/OS V1R11
or later releases. In earlier z/OS releases, BYDEVICES must be enabled through the
ALLOCxx PARMLIB member.
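For example, both methods look like the following sketch; the xx suffix of the ALLOCxx member and the moment of activation are installation-specific:
In the ALLOCxx member of PARMLIB:
  SYSTEM TAPELIB_PREF(BYDEVICES)
Dynamically, from the console (z/OS V1R11 or later):
  SETALLOC SYSTEM,TAPELIB_PREF=BYDEVICES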
Assume that GRID1 has a total of 60 virtual devices online and GRID2 has 40 virtual devices
online. For each grid, the distribution of online virtual drives is 50% for Cluster 0, 25% for
Cluster 1, and 25% for Cluster 2.
(Diagram: the same GRID1 and GRID2 configuration with the scenario settings ALLOC “BYDEVICES”, DAA disabled, SAA disabled, and CCP [R,R,R].)
As stated in 11.21.1, “EQUAL allocation” on page 721, DAA is ENABLED by default and was
DISABLED by using the LIBRARY REQUEST command. Furthermore, none of the TS7700
override settings are activated.
With the BYDEVICES allocation algorithm that is used for specific mount requests, there are
always cross-cluster mounts when the cluster where the device is allocated is not the cluster
where the data is. Cache placement can limit the number of cross-cluster mounts but cannot
avoid them. Cross-cluster mounts over the extended fabric are likely not acceptable, so vary
the devices of Cluster 2 offline.
For more information about Copy Consistency Points, see the IBM TS7700 Best Practices -
Synchronous Mode Copy and IBM TS7700 Best Practices - Copy Consistency Point white
papers:
http://www.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/WP102098
http://www.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/WP101230
For non-specific (scratch) allocations, the BYDEVICES algorithm randomizes across all
devices, resulting in allocations on all three clusters of each grid. Subsequently, I/O TVC
selection assigns the TVC of Cluster 0 as the I/O TVC due to the Copy Consistency Point
setting. Many factors can influence this selection, as explained in 2.2.2, “Tape Volume Cache”
on page 32.
Normally, the cluster with a Copy Consistency Point of R(un) gets preference over other
clusters. As a consequence, the TVC of Cluster 0 is selected as the I/O TVC and
cross-cluster mounts are issued from both Cluster 1 and Cluster 2.
By activating the override setting “Prefer Local Cache for Fast Ready Mount Requests” in
both clusters in the Disaster Site, cross-cluster mounts are avoided but the copy to Cluster 0
is made before the job ends, caused by the R(un) Copy Consistency Point setting for this
cluster. Therefore, by further defining a family for the Production Site clusters, Cluster 1
retrieves its copy from Cluster 0 in the Production Site location, avoiding using the remote
links between the locations.
The method to prevent device allocations at the Disaster Site that is mostly implemented today is simply varying all of the remote virtual devices offline. The disadvantage is that if a cluster in the Production Site is lost, an operator action is required to manually vary online the virtual devices of Cluster 2 of the grid with the failing cluster. With the TS7700 R2.0, an alternative solution is using scratch allocation assistance (SAA), which is described in 11.21.5, “Allocation and scratch allocation assistance” on page 731.
Cluster 0 is likely to have a valid copy of the logical volume in the cache due to the Copy
Consistency Point setting of [R,D,D]. If the vNodes of Cluster 1 and Cluster 2 are selected as
mount points, it results in cross-cluster mounts. It might happen that this volume has been
removed by a policy in place for TS7700D Cluster 0, resulting in the mount point TVC as the
I/O TVC.
In the TS7700, activating the Force Local TVC to have a copy of the data override first results in a recall of the virtual volume from a stacked volume. If there is no valid copy in the cluster, or if the recall fails, a copy is retrieved from one of the other clusters before the mount completes.
Activating the Prefer Local Cache for non-Fast Ready Mount Requests override setting recalls
a logical volume from tape instead of using the grid links for retrieving the data of the logical
volume from Cluster 0. This might result in longer mount times.
With the TS7700, an alternative solution can be considered by using device allocation
assistance (DAA) that is described in 11.21.4, “Allocation and device allocation assistance”
on page 728. DAA is enabled by default.
Cross-cluster mounts might apply for the specific allocations in Cluster 1 because it is likely
that only the TS7700D Cluster 0 will have a valid copy in cache. The red arrows show the data
flow as a result of these specific allocations.
(Diagram: the same GRID1 and GRID2 configuration with the scenario settings ALLOC “BYDEVICES”, DAA disabled, SAA disabled, CCP [R,D,D], and no mount points in Cluster 2.)
The selection algorithm orders the clusters first by those having the volume already in cache,
then by those having a valid copy on tape, and then by those without a valid copy. Later, host
processing attempts to allocate a device from the first cluster that is returned in the list.
If an online device is not available within that cluster, it will move to the next cluster in the list
and try again until a device is chosen. This enables the host to direct the mount request to the
cluster that will result in the fastest mount, typically the cluster that has the logical volume
resident in cache.
For JES2, if the default allocation algorithm EQUAL is used, it supports an ordered list for the
first seven library port IDs returned in the list. After that, if an eligible device is not found, all of
the remaining library port IDs are considered equal. The alternative allocation algorithm
BYDEVICES removes the ordered library port ID limitation.
With the TS7700, install the additional APAR OA30718 before enabling the new BYDEVICES
algorithm. Without this APAR, the ordered library port ID list might not be acknowledged
correctly, causing specific allocations to appear randomized.
In the scenario that is described in 11.21.3, “Allocation and Copy Consistency Point setting”
on page 726, if you enable DAA (this is the default) by entering the command LIBRARY
REQUEST,GRID[1]/[2],SETTING,DEVALLOC,PRIVATE,ENABLE, it influences the specific requests
in the following manner. The Copy Consistency Point is defined as [R,D,D]. It is assumed that
there are no mount points in Cluster 2.
It is further assumed that the data is not in the cache of the TS7740 Cluster 1 anymore
because this data is managed as TVC Preference Level 0 (PG0), by default. It is first
determined which of the composite libraries, GRID1 or GRID2, has the requested logical
volume. Subsequently, that grid is selected and the allocation over the clusters is determined
by DAA. The result is that all allocations select the TS7700D Cluster 0 as the preferred
cluster.
You can influence the placement in cache by setting the CACHE COPYFSC option with the
LIBRARY REQUEST,GRID[1]/[2],SETTING,CACHE,COPYFSC,ENABLE command. When the ENABLE
keyword is specified, the logical volumes that are copied into the cache from a peer TS7700
cluster are managed by using the actions that are defined for the SC construct associated
with the volume as defined at the TS7740 cluster receiving the copy.
Therefore, a copy of the logical volume also stays in cache in each non-I/O TVC cluster where
an SC is defined as Preference Level 1 (PG1). However, because the TS7700D is used as a
deep cache, there are no obvious reasons to do so.
There are two major reasons why Cluster 0 might not be selected:
No online devices are available in Cluster 0, but are in Cluster 1.
The defined removal policies in the TS7700D caused Cluster 0 to not have a valid copy of
the logical volume anymore.
In both situations, DAA selects the TS7740 Cluster 1 as the preferred cluster:
When the TS7740 Cluster 1 is selected due to lack of online virtual devices on Cluster 0,
cross-cluster mounts might happen unless the TS7700 override settings, as described in
11.21.3, “Allocation and Copy Consistency Point setting” on page 726, are preventing this
from happening.
When the TS7740 Cluster 1 is selected because the logical volume is no longer in the TS7700D
Cluster 0 cache, its cache is selected as the I/O TVC. Because the Copy Consistency Point
setting is [R,D,D], a copy to the TS7700D Cluster 0 is made as part of successful RUN
processing.
The MARKFULL option can be specified in DFSMShsm to mark migration and backup tapes
that are partially filled during tape output processing as full. This forces a scratch tape to be
selected the next time that the same function begins.
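A minimal sketch of how MARKFULL is typically specified in the DFSMShsm ARCCMDxx parmlib member (verify the operand against your DFSMShsm documentation):
SETSYS PARTIALTAPE(MARKFULL)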
Figure 11-52 shows the allocation result of specific allocations. The devices of the remote
clusters in the Disaster Site are not online. GRID1 has in total 60% of specific logical volumes
and GRID2 has 40% of the specific logical volumes. This was the result of earlier
BYDEVICES allocations when the logical volumes were created.
The expected distribution of the specific allocations is as shown. Cross-cluster mounts might
apply in situations where DAA selects the vNode of Cluster 1 as the mount point. The red
arrows show the data flow for both the creation of the copy of the data for scratch allocations
and for specific allocations.
[Figure 11-52: Two multi-cluster grids (GRID1 and GRID2), each consisting of a TS7700D Cluster 0 and TS7740 or TS7700T Clusters 1 and 2 with their drives and libraries, connected to the host through the FICON fabric and to each other through the LAN/WAN. Annotations: ALLOC “BYDEVICES”, DAA enabled, SAA disabled, CCP [R,D,D], no mount points in Cluster 2.]
With DAA, you can vary the devices in the Disaster Site Cluster 2 online without changing the
allocation preference for the TS7700D cache if the logical volumes exist in this cluster and
this cluster is available. If these conditions are not met, DAA manages local Cluster 1 and
remote Cluster 2 as equal and cross-cluster mounts over the extended fabric are issued in
Cluster 2.
If you plan to have an alternative MC setup for the Disaster Site (perhaps for the Disaster Test
LPARs), you must carefully plan the MC settings, the device ranges that must be online, and
whether DAA is enabled. You will probably read production data and create test data by using
a separate category code.
If you do not want the grid links overloaded with test data, vary the devices of Cluster 0 and
Cluster 1 offline on the disaster recovery (DR) host only and activate the TS7700 Override
Setting “Force Local TVC” to have a copy of the data. A specific volume request enforces a
mount in Cluster 2 even if there is a copy in the deep cache of the TS7700D Cluster 0.
SAA is an extension of the DAA function for scratch mount requests. SAA filters the list of
clusters in a grid to return to the host a smaller list of candidate clusters that are designated
as scratch mount candidates. By identifying a subset of clusters in the grid as sole candidates
for scratch mounts, SAA optimizes scratch mounts to a TS7700 grid.
When a composite library supports and enables the SAA function, the host sends an SAA
handshake to all SAA-enabled composite libraries and provides the MC that will be used for
the upcoming scratch mount. A cluster is designated as a candidate for scratch mounts by
using the Scratch Mount Candidate option on the MC construct, which is accessible from the
TS7700 MI, as shown in Figure 11-53. By default, all clusters are considered candidates.
Also, if SAA is enabled and no SAA candidate is selected, all clusters are considered
candidates.
The targeted composite library uses the provided MC definition and the availability of the
clusters within the same composite library to filter down to a single list of candidate clusters.
Clusters that are unavailable or in service are excluded from the list. If the resulting list has
zero clusters present, the function then views all clusters as candidates.
Because this function introduces extra processing into the z/OS scratch mount path, a new
LIBRARY REQUEST option is introduced to globally enable or disable the function across the
entire multi-cluster grid. SAA is disabled by default. When this option is enabled, the z/OS
JES software obtains the candidate list of mount clusters from a given composite library.
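As a hedged illustration (GRID1 is a placeholder for the composite library name), SAA is enabled or disabled grid-wide with the scratch variant of the DEVALLOC setting that is used for DAA:
LIBRARY REQUEST,GRID1,SETTING,DEVALLOC,SCRATCH,ENABLE
LIBRARY REQUEST,GRID1,SETTING,DEVALLOC,SCRATCH,DISABLE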
Assume that there are two main workloads. The application workload consists of logical
volumes that are created and then retrieved on a regular, daily, weekly, or monthly basis. This
workload can best be placed in the TS7700D deep cache. The backup workload is normally
never retrieved and can best be placed directly in the TS7740 Cluster 1. SAA helps direct the
mount point to the most efficient cluster for the workload:
The application workload can best be set up in the following manner. In the MC construct,
the MC is defined with a Copy Consistency Point of [R,D,D]. Cluster 0 is selected in all
clusters as Scratch Mount Candidate. In Cluster 1, the SC can best be set as TVC
Preference Level 1. This is advised because in cases where Cluster 0 is not available or
no online devices are available in that cluster, Cluster 1 can be activated as the mount
point. In Cluster 2, the SC can be set to TVC Preference Level 0.
You can control the placement in cache per cluster by setting the SETTING CACHE COPYFSC
option. When the ENABLE keyword is specified, the logical volumes that are copied into the
cache from a peer TS7700 cluster are managed by using the actions that are defined for
the SC construct associated with the volume as defined at the TS7740 cluster receiving
the copy. The SC in Cluster 0 needs to have a Volume Copy Retention Group of Prefer
Keep. Logical volumes can be removed from the TS7700D deep cache if more space is
needed.
The Backup workload can best be set up in the following manner. In the MC construct, the
MC is defined with a Copy Consistency Point of [D,R,D] or [N,R,D]. Cluster 1 is selected in
all clusters as Scratch Mount Candidate. In Cluster 1 and Cluster 2, the SC can best be
set as TVC Preference Level 0. There is no need to keep the data in cache.
The SC in Cluster 0 can have a Volume Copy Retention Group of Prefer Remove. If Cluster 0 is
activated as the mount point because of the unavailability of Cluster 1 or because there are no
online devices in that cluster, the logical volumes with this MC can be removed first when
cache removal policies in the TS7700D require the removal of volumes from cache.
With these definitions, the scratch allocations for the application workload are directed to
TS7700D Cluster 0 and the scratch allocations for the Backup workload are directed to
TS7740 Cluster 1. The devices of the remote clusters in the Disaster Site are not online.
Allocation “BYDEVICES” is used. GRID1 has in total 60 devices online and GRID2 has 40
devices online. For each grid, the distribution of the scratch allocations within the grid is now
determined not by the number of online devices, as in the earlier BYDEVICES scenario, but by
the SAA setting of the MC.
[Figure: Two multi-cluster grids (GRID1 and GRID2), each consisting of a TS7700D Cluster 0 and TS7740 or TS7700T Clusters 1 and 2 with their drives and libraries, connected through the LAN/WAN, showing how scratch allocations are distributed when SAA is enabled.]
Clusters not included in the list are never used for scratch mounts unless those clusters are
the only clusters that are known to be available and configured to the host. If all candidate
clusters have either all their devices varied offline to the host or have too few devices varied
online, z/OS will not revert to devices within non-candidate clusters. Instead, the host goes
into allocation recovery. In allocation recovery, the existing z/OS allocation options for device
allocation recovery (WTOR | WAITHOLD | WAITNOH | CANCEL) are used.
Any time that a service outage of candidate clusters is expected, the SAA function needs to
be disabled during the entire outage by using the LIBRARY REQUEST command. If left enabled,
the devices that are varied offline can result in zero candidate devices, causing z/OS to enter
the allocation recovery mode. After the cluster or clusters are again available and their
devices are varied back online to the host, SAA can be enabled again.
If you vary offline too many devices within the candidate cluster list, z/OS might have too few
devices to handle all concurrent scratch allocations. When many devices are taken offline,
first disable SAA by using the LIBRARY REQUEST command, and then re-enable SAA after the
devices have been varied back online.
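A hedged sketch of the sequence for a planned device vary-off (GRID1 is a placeholder for the composite library name):
LI REQ,GRID1,SETTING,DEVALLOC,SCRATCH,DISABLE (issue before the devices are varied offline)
LI REQ,GRID1,SETTING,DEVALLOC,SCRATCH,ENABLE (issue after the devices are varied back online)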
If you plan to have an alternative MC setup for the Disaster Site (perhaps for the Disaster Test
LPARs), carefully plan the MC settings, the device ranges that need to be online, and whether
SAA will be used. Read production data and create test data that uses a separate category
code. If you use the same MC as used in the Production LPAR and define in Cluster 2 the MC
with SAA for Cluster 2 and not for Cluster 0 or 1 (as determined by the type of workload),
Cluster 2 might be selected for allocations in the Production LPARs.
Furthermore, the Copy Consistency Point for the MCs in the Disaster Site can be defined as
[D,D,R] or even [N,N,R] if it is used for testing only. If it is kept equal to the setting in the
Production Site, with an [R] for Cluster 0 or Cluster 1, cross-cluster mounts might occur. If you
do not want the grid links overloaded with test data, update the Copy Consistency Point setting
or use the TS7700 override setting “Prefer Local Cache for Fast Ready Mount Requests”
in Cluster 2 in the Disaster Site.
Cross-cluster mounts are avoided, but the copy to Cluster 0 or 1 is still made before the job
ends because of the Production R(un) Copy Consistency Point setting for these clusters. By
further defining a family for the Production Site clusters, the clusters source their copies from
the other clusters in the Production Site location, optimizing the usage of the remote links
between the locations.
Be aware that SAA influences only the mount processing. If you have multiple clusters
defined as SAA candidates, but the unavailable cluster is defined as the only TVC cluster (for
example, [R,N,N] or [N,N,R]), the job is not able to run. The mount is processed, but the job
hangs because no TVC can be selected.
An option on the MI window enables designation of a secondary pool as a Copy Export pool.
As logical volumes are written, the secondary copy of the data is written to stacked volumes
in the Copy Export pool.
The TS7700 pre-migrates any logical volumes in the Copy Export pool that have not been
pre-migrated. Any new logical volumes that are written after the Copy Export operation is
initiated are not included in the Copy Export set of physical volumes. These volumes will be
copy exported in the next run because Copy Export is an incremental process. Therefore,
you need all Copy Export physical volumes from all Copy Export operations to do a full
recovery. In each Copy Export session, the TS7700 writes a complete TS7700 database to
each of the physical volumes in the Copy Export set. You can choose to write the
database backup to all of the physical volumes or to a limited number of physical volumes. For
recovery, you should use the database from the last Copy Export session.
During a Copy Export operation, all of the physical volumes with active data on them in a
specified secondary pool are candidates to be exported. Only the logical volumes that are
valid on that TS7700 are considered during the running of the operation. Logical volumes that
are currently mounted during a Copy Export operation are excluded from the export set, as
are any volumes that are not currently in the Tape Volume Cache (TVC) of the export cluster.
The host that initiates the Copy Export operation first creates a dedicated export list volume
on the TS7700 that runs the operation. The export list volume contains instructions about the
execution of the operation, and a reserved file that the TS7700 uses to provide completion
status and export operation information.
As part of the Copy Export operation, the TS7700 creates response records in the reserved
file. These records list the logical volumes that are exported and the physical volumes on
which they are located. This information can be used as a record for the data that is offsite.
The TS7700 also writes records in the reserved file on the export list volume that provide the
status for all physical volumes with a state of Copy Exported.
The choice to eject the physical volumes as part of the Copy Export job or to eject them later
from the export-hold category is based on your operational procedures. The ejected Copy Export
set is then transported to a disaster recovery site or vault. Your recovery point objective (RPO)
determines the frequency of the Copy Export operation.
In heterogeneous drive configurations, the previous generation of drives is normally used for
read-only operations. However, the Copy Export operation uses the previous generation of 3592
tape drives to append the database backup to physical volumes so that previous-generation
cartridges can also be exported.
Note: If the reclaim pool for the Copy Export pool is the same as either the Copy Export
primary pool or its reclaim pool, the primary and backup copies of a logical volume can
exist on the same physical tape.
Table 12-1 Exportable physical volumes based on tape drives, export format, and LMTDBPVL option
Installed tape drives | Export format of Copy Export pool | LMTDBPVL enabled? | Exportable physical volumes
E08 and E07 | Default | Yes | Any media types in any recording format
A failure reason code of X'32' indicates that there is more than one valid copy of the specified
export list volume in the TS7700 grid configuration.
In the following example, assume that the TS7700 that is to perform the Copy Export
operation is Cluster 1 in a two-cluster grid environment. The pool on that cluster to export is pool 8.
You need to set up an MC for the data that is to be exported so that it has a copy on Cluster 1
and a secondary copy in pool 8. To ensure that the data is on that cluster and is consistent
with the close of the logical volume, you want to have a copy policy of Rewind Unload (RUN).
You define the following information:
Define an MC, for example, MCCEDATA, on Cluster 1:
– Secondary Pool: 8
– Cluster 0 Copy Policy: RUN
– Cluster 1 Copy Policy: RUN
Define this same MC on Cluster 0 without specifying a secondary pool.
To ensure that the export list volume is written to Cluster 1 and exists only there, define an
MC, for example, MCELFVOL, on Cluster 1:
– Cluster 0 Copy Policy: No Copy
– Cluster 1 Copy Policy: RUN
Define this MC on Cluster 0:
– Cluster 0 Copy Policy: No Copy
– Cluster 1 Copy Policy: RUN
A Copy Export operation can be initiated through any virtual tape drive in the TS7700 grid
configuration. It does not have to be initiated on a virtual drive address in the TS7700 that is
to perform the Copy Export operation. The operation is internally routed to the TS7700 that
has the valid copy of the specified export list volume. Operational and completion status is
broadcast to all hosts attached to all of the TS7700s in the grid configuration.
It is possible that one or more logical volumes might become inaccessible because they were
modified on a TS7700 other than the one that performed the Copy Export operation, and the
copy did not complete before the start of the operation. Each copy-exported physical volume
remains under the management of the TS7700 from which it was exported.
Normally, you return the empty physical volumes to the I/O station of the library that is associated
with the source TS7700. They are then reused by that TS7700. If you want to move them to
another TS7700, whether in the same grid configuration or another, consider two important
points:
Ensure that the VOLSER ranges you define for that TS7700 match the VOLSERs of the
physical volumes that you want to move.
Have the original TS7700 stop managing the copy-exported volumes by entering the
following command from the host:
LIBRARY REQUEST,libname,COPYEXP,volser,DELETE
Figure 12-1 on page 744 shows how the Reclaim Threshold Percentage is set in Physical
Volume Pool Properties. If the ratio between active data size and total bytes written to the
physical volume is lower than the Reclaim Threshold Percentage, the physical volume
becomes eligible for reclamation. The ratio between active data size and media capacity is
not used for the comparison with Reclaim Threshold Percentage.
Exported physical volumes that are to be reclaimed are not brought back to the source
TS7700 for processing. Instead, a new secondary copy of the remaining valid logical volumes
is made by using the primary logical volume copy as a source. It is called Offsite Reclaim.
Offsite Reclaim does not start while Copy Export is running, and it follows the Inhibit Reclaim
Schedule. If more than one volume is eligible for Offsite Reclaim, the TS7700 tries to empty
the exported physical volumes one by one in ascending order of their active data size.
Figure 12-1 shows the Reclaim Threshold Percentage for a normal Offsite Reclaim.
Figure 12-1 Reclaim Threshold Percentage is set in Physical Volume Pool Properties
The next time that the Copy Export operation is performed, the physical volumes with the new
copies are also exported. After the Copy Export completes, the physical volumes that were
reclaimed (which are offsite) are no longer considered to have valid data (empty), and can be
returned to the source TS7700 to be used as new scratch volumes.
Tip: If a physical volume is in Copy Export hold state and becomes empty, it is
automatically moved back to the common scratch pool (or the defined reclamation pool)
when the next Copy Export operation completes.
E0000 EXPORT OPERATION STARTED FOR EXPORT LIST VOLUME XXXXXX
Explanation: This message is generated when the TS7700 begins the Copy Export operation.
Operator action: None.

E0002 OPENING EXPORT LIST VOLUME XXXXXX FAILED
Explanation: This message is generated when opening the export list volume failed during the Copy Export operation.
Operator action: Check whether the export list volume or cache file system is in a bad state.

E0005 ALL EXPORT PROCESSING COMPLETED FOR EXPORT LIST VOLUME XXXXXX
Explanation: This message is generated when the TS7700 completes an export operation.
Operator action: None.

E0006 STACKED VOLUME YYYYYY FROM LLLLLLLL IN EJECT
Explanation: This message is generated during Copy Export operations when an exported stacked volume ‘YYYYYY’ has been assigned to the eject category at R3.1 or earlier code level. The physical volume is placed in the convenience I/O station. The ‘LLLLLLLL’ field is replaced with the distributed library name of the TS7700 performing the export operation.
Operator action: Remove ejected volumes from the convenience I/O station.

E0006 STACKED VOLUME YYYYYY FROM LLLLLLLL QUEUED FOR EJECT
Explanation: This message is generated during Copy Export operations when an exported stacked volume ‘YYYYYY’ has been assigned to the eject category at code level from R3.2 to R4.1.1. The physical volume is placed in the convenience I/O station. The ‘LLLLLLLL’ field is replaced with the distributed library name of the TS7700 performing the export operation.
Operator action: Remove ejected volumes from the convenience I/O station.

E0006 STACKED VOLUME YYYYYY FROM LLLLLLLL IN EJECT-QUEUE
Explanation: This message is generated during Copy Export operations when an exported stacked volume ‘YYYYYY’ has been assigned to the eject category at R4.1.2 or later code level. The physical volume is placed in the convenience I/O station. The ‘LLLLLLLL’ field is replaced with the distributed library name of the TS7700 performing the export operation.
Operator action: Remove ejected volumes from the convenience I/O station.

E0013 EXPORT PROCESSING SUSPENDED, WAITING FOR SCRATCH VOLUME
Explanation: This message is generated every 5 minutes when the TS7700 needs a scratch stacked volume to continue export processing and there are none available.
Operator action: Make one or more physical scratch volumes available to the TS7700 performing the export operation. If the TS7700 does not get access to a scratch stacked volume in 60 minutes, the operation is ended.

E0015 EXPORT PROCESSING TERMINATED, WAITING FOR SCRATCH VOLUME
Explanation: This message is generated when the TS7700 ends the export operation because scratch stacked volumes were not made available to the TS7700 within 60 minutes of the first E0013 message.
Operator action: The operator must make more TS7700 stacked volumes available, perform an analysis of the export status file on the export list volume, and reissue the export operation.

E0018 EXPORT TERMINATED, EXCESSIVE TIME FOR COPY TO STACKED VOLUMES
Explanation: The export process has been ended because one or more cache resident-only logical volumes that are needed for the export were unable to be copied to physical volumes in the specified secondary physical volume pool within a 10-hour period from the beginning of the export operation.
Operator action: Call for IBM support.

E0024 XXXXXX LOGICAL VOLUME WITH INVALID COPY ON LLLLLLLL
Explanation: This message is generated when the TS7700 performing the export operation has determined that one or more (XXXXXX) logical volumes that are associated with the auxiliary storage pool that is specified in the export list file do not have a valid copy resident on the TS7700. The ‘LLLLLLLL’ field is replaced by the distributed library name of the TS7700 performing the export operation. The export operation continues with the valid copies.
Operator action: When the export operation completes, perform an analysis of the export status file on the export list volume to determine the logical volumes that were not exported. Ensure that they have completed their copy operations and then perform another export operation.

E0025 PHYSICAL VOLUME XXXXXX NOT EXPORTED, PRIMARY COPY FOR YYYYYY UNAVAILABLE
Explanation: This message is generated when the TS7700 detected a migrated-state logical volume ‘YYYYYY’ with an unavailable primary copy. The physical volume ‘XXXXXX’ on which the secondary copy of the logical volume ‘YYYYYY’ is stored was not exported. This message is added at code level R1.7.
Operator action: The logical volume and the physical volume will be eligible for the next Copy Export operation after the logical volume is mounted and unmounted from the host. An operator intervention is also posted.

R0000 RECLAIM SUCCESSFUL FOR EXPORTED STACKED VOLUME YYYYYY
Explanation: This message is generated when the TS7700 has successfully completed reclaim processing for an exported stacked volume that was exported during a previous copy export operation. Note: A copy exported physical volume can become eligible for reclaim based on the reclaim policies that are defined for its secondary physical volume pool, or through the host console request command.
Operator action: The exported physical volume no longer contains active data and can be returned from its offsite location for reuse. If it is placed in export-hold, it should be returned when the next copy export is completed.
To have DFSMSrmm policy management manage the retention and movement for volumes
that are created by Copy Export processing, you must define one or more volume vital record
specifications (VRSs). For example, assume that all Copy Exports are targeted to a range of
volumes STE000 - STE999. You can define a VRS as shown in Example 12-1.
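A minimal sketch of such a VRS definition by using the RMM TSO ADDVRS subcommand follows. The volume mask, retention count, and location name are assumptions for illustration only; see Example 12-1 and the DFSMSrmm documentation for the exact operands:
RMM ADDVRS VOLUME(STE*) COUNT(99999) LOCATION(VAULT)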
As a result, all matching stacked volumes that are set in AUTOMOVE have their destination
set to the required location, and your existing movement procedures can be used to move
and track them.
In addition to the support listed, a copy-exported stacked volume can become eligible for
reclamation based on the reclaim policies that are defined for its secondary physical volume
pool or through the Host Console Request function (LIBRARY REQUEST). When it becomes
eligible for reclamation, the exported stacked volume no longer contains active data and can
be returned from its offsite location for reuse.
For users of DFSMSrmm, when stacked volume support is enabled, DFSMSrmm automatically
handles and tracks the stacked volumes that are created by Copy Export. However, there is no
way to track which logical volume copies are on the stacked volume. Retain the updated export
list file, which you created and the library updated, so that you have a record of the logical
volumes that were exported and of the exported stacked volumes on which they are stored.
For more information and error messages that are related to the Copy Export function in
RMM, see the z/OS DFSMSrmm Implementation and Customization Guide, SC23-6874.
Tip: For the physical volumes that you use for Copy Export, defining a specific VOLSER
range to be associated with a secondary pool on a source TS7700 can simplify the task
of knowing the volumes to use in recovery, and of returning a volume that no longer has
active data on it to the TS7700 that manages it.
For details about how to define the VOLSER ranges, see “Defining VOLSER ranges for
physical volumes” on page 539.
5. Define the characteristics of the physical volume pools used for Copy Export.
For the pool or pools that you plan to use for Copy Export and that you have specified
previously in the MC definition, and, optionally, in the VOLSER range definition, select
Copy Export in the Export Pool field.
For more information about how to change the physical volume pool properties, see
“Defining physical volume pools in the TS7700T” on page 540.
6. Code or modify the MC automatic class selection (ACS) routine.
Add selection logic to the MC ACS routine to assign the new MC name, or names.
7. Activate the new construct names and ACS routines.
Before new allocations are assigned to the new MC, the Source Control Data Set (SCDS)
with the new MC definitions and ACS routines must be activated by using the SETSMS SCDS
command.
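For example (the SCDS data set name is a placeholder for your own):
SETSMS SCDS(SMS.SCDS1.SCDS)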
In the response that is shown in Example 12-2, you can see the following information:
– Four E08 drives and two E07 drives are defined.
– All six drives are available (AVAIL=Y).
– The ROLE column describes which drive is performing. The following values can
be indicated:
• IDLE: The drive is not in use for another role or is not mounted.
• SECE: The drive is being used to erase a physical volume.
• MIGR: The drive is being used to copy a logical volume from the TVC to a physical
volume. In this display, logical volume SO0006 is being copied to physical volume
JD0402.
Pool 0 is the Common Scratch Pool. Pool 9 is the pool that is used for Copy Export in this
example. Example 12-3 shows the command POOLCNT. The response is listed per pool:
– The media type used for each pool
– The number of empty physical volumes that are available for Scratch processing
– The number of physical volumes in the filling state
– The number of full volumes
– The number of physical volumes that have been reclaimed, but need to be erased
– The number of physical volumes in read-only recovery (ROR) state
– The number of volumes unavailable or in a destroyed state (1 in Pool 1)
– The number of physical volumes in the copy-exported state (45 in Pool 9)
Use the MI to modify the maximum-allowed number of volumes in the copy-exported state
(Figure 12-2).
For more information about the Library Request command, see 10.1.3, “Host Console
Request function” on page 588.
If you use a multi-cluster grid, be sure to create the export list volume only on the same
TS7700 that is used for Copy Export, but not in the same physical volume pool that is used
for Copy Export. If more than one TS7700 in a multi-cluster grid configuration contains the
export list volume, the Copy Export operation fails.
Ensure that all volumes that are subject for copy export are in the TVC of the TS7700 where
the copy export will be run. If there are copies from other clusters that have not been
processed, you can promote them in the copy queue. Use a host console request (HCR)
command with the COPY,KICK option to do so:
LI REQ,distributed library,LVOL,A08760,COPY,KICK
//****************************************
//* FILE 2: RESERVED FILE
//****************************************
//STEP2 EXEC PGM=IEBGENER,COND=(4,LT)
//SYSPRINT DD SYSOUT=*
//SYSIN DD DUMMY
//SYSUT2 DD DSN=HILEVELQ.RESERVED,MGMTCLAS=MCNOCOPY,
// UNIT=VTS1,DISP=(NEW,KEEP),LABEL=(2,SL),
The information that is required in the export list file is, as for BVIR, provided by writing a
logical volume that fulfills the following requirements:
– That logical volume must have a standard label and contain three files:
• An export list file, as created in step 1 in Example 12-4 on page 752. In this
example, you are exporting Pool 09. Option EJECT in record 2 tells the TS7700 to
eject the stacked volumes upon completion.
With only OPTIONS1,COPY (without EJECT), the physical volumes are placed in
the export-hold category and are left in the library for later handling by an operator.
• A reserved file, as created in step 2 in Example 12-4 on page 752. This file is
reserved for future use.
• An export status file, as created in step 3 in Example 12-4 on page 752. In this file,
the information is stored from the Copy Export operation. You must keep this file
because it contains information related to the result of the Export process and must
be reviewed carefully.
– All records must be 80 bytes.
– The export list file must be written without compression. Therefore, you must assign a
Data Class (DC) that specifies COMPACTION=NO or you can overwrite the DC
specification by coding TRTCH=NOCOMP in the JCL.
Important: Ensure that the files are assigned an MC that specifies that only the local
TS7700 has a copy of the logical volume. You can either have the ACS routines assign
this MC, or you can specify it in the JCL. These files need to have the same expiration
dates as the longest of the logical volumes you export because they must be kept for
reference.
Figure 12-3 Management Class settings for the export list volume
2. The Copy Export operation is initiated by running the LIBRARY EXPORT command. In this
command, logical VOLSER is a variable: it is the logical volume that was used to create
the export list file.
The command syntax is shown in Example 12-5; an illustrative form of the command is also
shown after this list.
3. The host sends a command to the composite library. From there, it is routed to the TS7700
where the export list volume is.
4. The running TS7700 validates the request, checking for required resources, and if all is
acceptable, the Copy Export continues.
5. Logical volumes that are related to the exported pool and that still exist only in cache can delay
the process. They are copied to physical volumes in the pool as part of the Copy Export
run.
6. Messages about the progress are sent to the system console. All messages are in the
format that is shown in Example 12-6. See Table 12-2 on page 745 for an explanation of
Library Message Text.
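An illustrative form of the LIBRARY EXPORT command follows (EXPLST is a placeholder for the VOLSER of your export list volume):
LIBRARY EXPORT,EXPLST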
To obtain a list of the virtual volumes that were exported during the COPY EXPORT
operation, use the Physical Volumes Details selection in the MI. Specify the volume or
volumes that were written to during the EXPORT. Those VOLSERs are listed in the
CBR3750I messages on the syslog. Click Download List of Virtual Volumes.
Figure 12-4 Physical volume details selection for list of exported volumes
Sample member CBRSPX03 writes the three required files on the export list volume by using
a private volume and export list format 03, and it has a fourth step (STEP4) that starts
CBRSPLCS to initiate the Copy Export. CBRSPLCS is an example program that calls the
CBRXLCS programming interface and needs to be modified to suit your business needs.
When modified, it must be assembled and link-edited on your system before it can be used
through JCL.
A request to cancel an export operation can be initiated from any host that is attached to the
TS7700 subsystem by using one of the following methods:
Use the host console command LIBRARY EXPORT,XXXXXX,CANCEL, where XXXXXX is the
volume serial number of the export list volume.
Use the Program Interface of the Library Control System (LCS) external services
CBRXLCS.
If an export operation must be canceled and there is no host that is attached to the TS7700
that can run the CANCEL command, you can cancel the operation through the TS7700 MI. After
confirming the selection, a cancel request is sent to the TS7700 that is processing the Copy
Export operation.
Messages differ depending on what the TS7700 encountered during the execution of the
operation:
If no errors or exceptions were encountered during the operation, message CBR3855I is
generated. The message has the format that is shown in Example 12-7.
If message CBR3856I is generated, examine the export status file to determine what errors
or exceptions were encountered.
Either of the completion messages provides statistics about what was processed during the
operation. The following statistics are reported:
Requested-number: This is the number of logical volumes that are associated with the
secondary volume pool that is specified in the export list file. Logical volumes that are
associated with the specified secondary volume pool that were previously exported are
not considered part of this count.
Exportable-number: This is the number of logical volumes that are considered exportable.
A logical volume is exportable if it is associated with the secondary volume pool that is
specified in the export list file and it has a valid copy on the TS7700 performing the export.
Logical volumes that are associated with the specified secondary volume pool that were
previously exported are not considered to be resident in the TS7700.
Clarification: The number of megabytes (MB) exported is the sum of the MB integer
values of the data that is stored on each Exported Stacked Volume. The MB integer
value for each Exported Stacked Volume is the full count by bytes divided by 1,048,576
bytes. If the result is less than 1, the MB integer becomes 1, and if greater than 1 MB,
the result is truncated to the integer value (rounded down).
MBytes Moved: For Copy Export at code release level R1.4 and later, this value is 0.
It is possible that multiple physical cartridges are written to during the COPY EXPORT even if
a small amount of data was exported. This effect is primarily due to the optimization of the
operation by using multiple available drives that are configured by Maximum Devices in Pool
Properties in the MI for the Copy Export pool.
Consideration: Clients can run a Copy Export Recovery process only in a stand-alone
cluster. After the recovery process completes, you can create a multi-cluster grid by joining
the recovered cluster with another stand-alone cluster. However, there is an IBM service
offering to recover to an existing grid.
The following instructions for how to implement and run Copy Export Recovery also apply if
you are running a DR test. If it is a test, it is specified in each step.
Copy Export Recovery can be used to restore previously created and copy-exported tapes to
a new, empty TS7700 cluster. The same subset of tapes can be used to restore a TS7700 in
an existing grid if the new empty restore cluster replaces the source cluster that is no longer
present.
This enables data that might have existed only within a TS7740 or TS7700T in a hybrid
configuration to be restored while maintaining access to the still existing TS7720 clusters.
This form of extended recovery must be carried out by IBM support personnel.
Figure 12-5 Copy Export Recovery window with erase volume option
3. Ensure that you are logged in to the correct TS7700. Then, select Erase all existing
volumes before the recovery and click Submit. A window opens that provides you with
the option to confirm and continue the erasure of data on the recovery TS7700 or to
abandon the recovery process. It describes the data records that are going to be erased
and informs you of the next action to be taken.
To erase the data, enter your login password and click Yes. The TS7700 begins the
process of erasing the data and all database records. As part of this step, you are logged
off from the MI.
Note: If an error occurs during the erasure process, the task detail window provides a
list of errors that occurred and indicates the reason and any action that needs to be
taken.
5. Starting with an empty TS7700, you must perform several setup tasks by using the MI that
is associated with the recovery TS7700. For many of these tasks, you might have to verify
only that the settings are correct because the settings are not deleted as part of the
erasure step:
a. Verify or define the VOLSER range or ranges for the physical volumes that are to be
used for and after the recovery. The recovery TS7700 must know the VOLSER ranges
that it owns. This step is done through the MI that is associated with the recovery
TS7700.
b. If the copy-exported physical volumes were encrypted, set up the recovery TS7700 for
encryption support and have it connected to an external key manager that has access
to the keys used to encrypt the physical volumes. If you write data to the recovery
TS7700, you must also define the pools to be encrypted and set up their key label or
labels or define to use default keys.
c. If you are running the Copy Export Recovery operations to be used as a test of your
disaster recovery plans and have kept the Disaster Recovery Test Mode check box
selected, the recovery TS7700 does not perform reclamation.
If you are running Copy Export Recovery because of a real disaster, verify or define the
reclamation policies through the MI.
6. With the TS7700 in its online state, but with all virtual tape drives varied offline to any
attached hosts, log in to the MI and select Service → Copy Export Recovery.
The TS7700 determines that it is empty and enables the operation to proceed. Load the
copy-exported physical volumes into the library. Multiple sets of physical volumes have
likely been exported from the source TS7700 over time. All of the exported stacked
volumes from the source TS7700 must be loaded into the library. If multiple pools were
exported and you want to recover with the volumes from these pools, load all sets of the
volumes from these pools.
7. After you add all of the physical volumes into the library and they are now known to the
TS7700, enter the volume serial number of one of the copy-exported volumes from the last
set that was exported from the source TS7700. It contains the last database backup copy,
which is used to restore the recovery TS7700 database. The easiest place to find a
volume to enter is from the export status file on the export list volume from the current
Copy Export operation.
Remember: If you specified the LMTDBPVL option when performing the export, only a
subset of the tapes that were exported have a valid database backup that can be used
for recovery. If a tape that is selected for recovery does not have the backup, the user
gets the following error: “The database backup could not be found on the specified
recovery volume”.
If you are using the Copy Export Recovery operation to perform a disaster recovery test,
keep the Disaster Recovery Test Mode check box selected. The normal behavior of the
TS7700 storage management function, when a logical volume in the cache is unloaded, is
to examine the definitions of the storage management constructs associated with the
volume. If the volume was written to while it was mounted, the actions defined by the
storage management constructs are taken.
If the volume was not modified, actions are only taken if there has been a change in the
definition of the storage management constructs since the last time that the volume was
unloaded. For example, suppose that a logical volume is assigned to an SG, which had
last had the volume written to pool 4. Furthermore, either the SG was not explicitly defined
on the recovery TS7700 or it specified a different pool.
In this case, on the unload of the volume, a new copy of it is written to the pool determined
by the new SG definition, even though the volume was only read. If you are merely
accessing the data on the recovery TS7700 for a test, you do not want the TS7700 to
recopy the data. Keeping the check box selected causes the TS7700 to bypass its
checking for a change in storage management constructs.
Another consideration when merely running a test is reclamation. Running reclamation
while performing a test requires scratch physical volumes and enables the copy-exported
volumes to be reused after they are reclaimed. By keeping the Disaster Recovery
Test Mode check box selected, the reclaim operation is not performed.
With the Disaster Recovery Test Mode check box selected, the physical volumes that
are used for recovery maintain their status of Copy Exported so that they cannot be
reused or used in a subsequent Copy Export operation. If you are using Copy Export
Recovery because of a real disaster, clear the check box.
Enter the volume serial number, select the check box, and then click Submit.
If an error occurs, various possible error texts with detailed error descriptions can help you
solve the problem. For more information and error messages that are related to the Copy
Export Recovery function, see the IBM TS7700 Series Copy Export Function User’s Guide
white paper, which is available at the following URL:
http://www.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/WP101092
If everything is completed, you can vary the virtual devices online, and the tapes are ready to
read.
Tip: For more general considerations about DR testing, see Chapter 5, “Disaster recovery”
on page 201.
To set up the composite name that is used by the host to be the grid name, complete the
following steps:
1. Select Configuration → Grid Identification Properties.
2. In the window that opens, enter the composite library name that is used by the host in the
grid nickname field.
3. You can optionally provide a description.
The fortunate thing is that, in many cases, recovering from a disaster is easier and requires
fewer steps than having to simulate a disaster and then clean up your disaster environment
as though the simulation had never happened. While Chapter 5, “Disaster recovery” on
page 201 discussed disaster recovery concepts in general, this chapter focuses on concepts
related to disaster recovery testing specifically, providing examples where needed and
including step-by-step walkthroughs for four methods that clients can use to accomplish DR
testing in a TS7700 grid environment. Those methods are:
1. DR Testing using FlashCopy
2. DR Testing using Write Protect Mode on DR cluster(s)
3. DR Testing without Using Write Protect Mode on DR cluster(s)
4. DR Testing by Breaking the Grid Links connections to DR cluster(s)
All of these methods have advantages and disadvantages. Before you decide which method to
use, weigh the advantages and disadvantages of each method against your environment and
resources, and then choose the method that best fits your DR testing needs and capabilities.
The description of each method makes the assumption that you are familiar with the DR
concepts presented in Chapter 5. The end of this chapter contains a step-by-step list on how
to perform a DR test using each method. While it might be tempting to jump right to these
lists, it is recommended that you review this chapter in its entirety before DR testing to ensure
that you are familiar with the concepts and options available for DR testing in a TS7700 grid
environment.
With FlashCopy and the underlying Write Protect Mode for DR testing, DR test volumes can
be written to and read from while production volumes are protected from modification by the
DR host. All access by a DR host to write-protected production volumes is provided by using
a snapshot in time (a flash) of the logical volumes. Because of this, a DR host continues to
have read access to any production volumes that have been returned to scratch while the
FlashCopy is active.
If you determine that FlashCopy for DR is not suitable for your DR environment, using this
method (Write Protect Mode on the DR clusters) is the recommended alternative.
If your choice is between using Write Protect Mode and not using Write Protect Mode, it is
suggested to use Write Protect Mode (Method 2), to provide an additional level of
write-protection in case the TMS on the DR host is not configured correctly to prevent writes
to the production volumes.
As with the previous methods, this method has its advantages and disadvantages:
Advantages:
– After the grid links have been broken, you are assured that any production data that is
accessed from a DR cluster by the DR host is data that had been copied to the DR
cluster before the grid links were broken.
– Return-to-scratch processing that is initiated by a production host against production volumes
on production clusters does not affect the copy of the volumes on the DR clusters. The
copy on the DR clusters can continue to be accessed for read by the DR host.
– DR volumes that are created for use during the DR test are not copied to the
production clusters.
Disadvantages:
– If a real disaster occurs while the DR test is in progress, data that was created by the
production site after the grid links were broken is lost.
– The disaster clusters must be allowed to take over read-only volume ownership from
the production clusters. Normally, the takeover function is used only in the event of a
real disaster.
– Breaking the grid links must be done by your CE (SSR). Do not simply disable a grid link
with the Library Request command to run this method; disabling the grid link with the
command does not stop synchronous mode copies or the exchange of status
information.
The concern about losing data in a real disaster during a DR test is the major drawback to
using this DR method. Because of this, if it is possible to use one of the DR methods
described earlier (using FlashCopy or Write Protect Mode), it is advised to use one of those
methods.
Important: Do not use logical drives in the DR site from the production site.
If you decide to break links during your DR test, you must carefully review your everyday
workload. For example, if you have 3 TB of cache and you write 4 TB of new data every day, you
are a good candidate for a large amount of throttling, probably during your batch window. To
understand throttling, see 11.3.7, “Throttling in the TS7700” on page 623.
After the test ends, you might have many virtual volumes in the pending copy status. When
the TS7700 grid links are restored, communication is restarted, and the first task that the TS7700
runs is to make a copy of the volumes that were created during the window when the links were
broken. This task can affect TS7700 performance.
If your DR test runs over several days, you can minimize the performance degradation by
suspending copies by using the GRIDCNTL Host Console command. After your DR test is over
and your CE has brought back the grid links, you can enable the copy again during a low
activity workload to avoid or minimize performance degradation. See 10.1.3, “Host Console
Request function” on page 588 for more information.
The main effect is that a volume that has been returned to SCRATCH status by the production
system might be used in a test. The test system’s catalogs and TMS do not reflect
that change. If the “Ignore fast ready characteristics of write protected categories” option is
selected when Write Protect Mode is enabled on the DR clusters, the data can still be
accessed, regardless of whether the logical volume is defined as scratch.
During your DR test, production data is updated on the remaining production clusters.
Depending on your selected DR testing method, this updated data can be copied to the DR
clusters. The DR testing method also determines whether this updated data is presented to the
DR host, or whether a FlashCopy from Time Zero is available.
Without the FlashCopy option, both alternatives (updating the data versus not updating the
data) have advantages and disadvantages. For more information, see 13.5.4, “Method 4:
Breaking the grid link connections” on page 798.
Also, the DR host might create some data in the DR clusters. For more information, see
13.3.6, “Creating data during the disaster recovery test from the DR host: Selective Write
Protect” on page 775.
When a cluster is write-protect-enabled, all volumes that are protected cannot be modified or
have their category or storage construct names changed. As with the TS7700 write-protect
setting, the option has grid partition scope (a single cluster) and is configured through the MI.
Settings are persistent, except for DR FLASH, and are saved in a special repository.
Also, the new function enables any volume that is assigned to one of the categories that are
contained within the configured list to be excluded from the general cluster’s write-protect
state. The volumes that are assigned to the excluded categories can be written to or have
their attributes modified.
One exception to write protection is the volumes in the insert category. To enable a volume
to be moved from the insert category to a write-protect-excluded category, the source
category of insert cannot be write-protected. Therefore, the insert category is always a
member of the excluded categories.
When planning for a DR test, be sure that you have enough scratch volumes if Expire Hold
processing is enabled, to prevent the reuse of production volumes that were returned to scratch.
Suspending return-to-scratch processing during the DR test is also advisable.
However, during a DR test you need to ensure that the actions on the DR site do not influence
the data from production. Therefore, the DR host must not have any connections
to the clusters in production. Ensure that all devices that are attached to the remaining
production clusters are offline on the DR host (if they are FICON attached to the DR site).
The Write Protect mode prevents any host action (write data, host command) sent to the test
cluster from creating new data, modifying existing data, or changing volume attributes such
as the volume category. The Write Protect mode still enables logical volumes to be copied
from the remaining production clusters to the DR cluster.
As an alternative to Write Protect Mode, or as an additional safeguard to prevent overwriting
production data, you can use the TMS on the DR host to allow only read access to the volumes
in the production VOLSER ranges. For more information,
see 13.3.12, “Considerations for DR tests without Selective Write Protect mode” on page 777.
Figure 13-1 shows the process to insert cartridges in a DR site to perform a DR test.
[Figure 13-1: Insert processing for a DR test. The production host continues running while the DR host is running and the DR test has not yet started.]
After these settings are done, insert the new TST* logical volumes. It is important that the DR
volumes that are inserted by using the MI are associated with the DR host so that the TS7700
at the DR site has ownership of the inserted volumes. The DR host must be running before
the insertion is performed.
Important: Ensure that at least one logical device is or has been online on the test system
before you insert logical volumes.
Any new allocations for output that are performed by the DR host use only the logical volumes
that are defined for the DR test. At the end of the DR test, the volumes can be returned to
SCRATCH status and left in the library. Or, if you prefer, they can be deleted by using the
EJECT command in ISMF on the DR host.
The second approach is the most practical in terms of cost. It involves defining the VOLSER
range to be used, defining a separate set of categories for scratch volumes in the DFSMS
DEVSUP parmlib, and inserting the volume range into the DR cluster before the start of the
test.
Important: The test volumes that are inserted by using the MI must be associated with the
cluster that is used as DR cluster so that cluster has ownership of the inserted volumes.
If you require that the DR host be able to write new data, you can use the Write Protect Mode
for DR testing function that enables you to write to volumes belonging to certain categories
during DR testing. With Selective Write Protect, you can define a set of volume categories on
the TS7700 that are excluded from the Write Protect Mode. This configuration enables the
test host to write data onto a separate set of logical volumes without jeopardizing normal
production data, which remains write-protected.
This requires that the DR host use a separate scratch category or categories from the
production environment. If DR volumes also must be updated or if you want to run a TMS
housekeeping process that is limited to the DR volumes, the DR host’s private category must
also be different from the production environment to separate the two environments.
You must determine the production categories that are being used and then define separate,
not yet used categories on the DR host by using the DEVSUPxx member. Be sure that you
define a minimum of four categories in the DEVSUPxx member: MEDIA1, MEDIA2, ERROR,
and PRIVATE.
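A minimal sketch of the corresponding DEVSUPxx statements follows. The hexadecimal category values are assumptions for illustration only; choose values that are not in use by the production environment:
MEDIA1=0021
MEDIA2=0022
ERROR=002E
PRIVATE=002F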
In addition to the DR host specification, you must also define on the DR clusters those volume
categories that you are planning to use on the DR host and that need to be excluded from
Write-Protect mode.
For more information about the necessary definitions for DR testing with a TS7700 grid that
uses Selective Write Protect, see 13.5.2, “Method 2: Using Write Protect Mode on DR
clusters” on page 795.
The Selective Write Protect function enables you to read production volumes and to write new
volumes from the beginning of tape (BOT) while protecting production volumes from being
modified by the DR host. Therefore, you cannot modify or append to volumes in the
production hosts’ PRIVATE categories, and DISP=MOD or DISP=OLD processing of those
volumes is not possible.
At the end of the DR test, be sure to clean up the data that was written to DR volumes during
the DR test.
Using a copy mode of No Copy for the production clusters prevents the DR clusters from
making a copy of the DR test data. It does not interfere with the copying of production data.
Remember to set the content of the MCs back to the original contents during the cleanup
phase of a DR test.
13.3.9 Scratch runs during the disaster recovery test from the production host
If return-to-scratch processing runs on a production host for a production volume, that volume
can no longer be read by the production host while it is in scratch status. However, it can still
be read by a DR host from a DR cluster that has Write Protect Mode active (either with or
without DR FlashCopy) if the category that the volume is in is write-protected and the
cluster has “Ignore fast ready characteristics of write protected categories” enabled.
During DR testing, you might want to either turn off return-to-scratch processing on the
production hosts or configure a long expire-hold time for the production tapes that might be
returned to scratch, to ensure that the data can still be accessed during the DR test.
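One hedged way to suspend return-to-scratch processing on the production host during the test window is to omit the EXPROC parameter from the RMM housekeeping (EDGHSKP) job until the test completes. The step name and remaining DD statements are assumptions and are omitted here; see your existing EDGHSKP procedure:
//HSKP EXEC PGM=EDGHSKP,PARM='VRSEL,DSTORE'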
For scratch processing run during the DR test from the production host without using
Selective Write Protect, see 13.3.12, “Considerations for DR tests without Selective Write
Protect mode” on page 777.
13.3.10 Scratch runs during the disaster recovery test from the DR host
Depending on the selected method, a return-to-scratch procedure that is run on the DR host
should be carefully considered. If Write Protect Mode is enabled and the production category
is set to Write Protect Excluded, you can run a scratch procedure on the DR host. It is
advised to limit the scratch procedure to the DR volume serial range inserted on the DR host.
If you choose not to use Write Protect or define the production categories as excluded from
write protect, a return-to-scratch procedure that is run on a DR host might lead to data loss. If
possible, it is best to avoid running any housekeeping process during a DR test.
If this data is not deleted (set to scratch and EJECTed by using ISMF) after the DR test, this
unneeded data will continue to occupy cache or tape space. Because the volumes this data
resides on remain in a PRIVATE category, they will never expire and will continue to occupy
space indefinitely.
For this reason, be sure to return to scratch those DR volumes that are written to (converted
from SCRATCH to PRIVATE) during the DR test and, at the very least (if you do not want to
delete the volumes), ensure that the scratch category that they are assigned to has an
expiration time specified in the TS7700 MI. Otherwise, space on the TS7700 will continue to
be wasted because these logical volumes will not be overwritten.
Ownership takeover
If you perform the DR test with the links broken between sites, you must enable Read
Ownership Takeover so that the test site can access the data on the production volumes
owned by the production site. Because the production volumes are created by mounting them
on a production cluster, that cluster has volume ownership.
If you attempt to mount one of those volumes from the DR host without ownership takeover
enabled, the mount fails because the DR cluster cannot request ownership transfer from the
production cluster. By enabling ROT, the test host can mount the production logical volumes
and read their contents.
The DR host is not able to modify the production site-owned volumes or change their
attributes. The volume appears to the DR host as a write-protected volume. Because the
volumes that are going to be used by the DR host for writing data were inserted through the
MI that is associated with the DR cluster, that DR cluster already has ownership of those
volumes. The DR host has complete read and write control of these volumes.
Important: Never enable Write Ownership Takeover mode for a test. WOT mode must be
enabled only during a loss or failure of the production TS7700.
If you are not going to break the links between the sites, normal ownership transfer occurs
whenever the DR host requests a mount of a production volume.
With REJECT OUTPUT in effect, products and applications that append data to an existing
tape with DISP=MOD must be handled manually to function correctly. If the product is
DFSMShsm, tapes that are filling (seen as not full) from the test system control data set
(CDS) must be modified to full by running commands. If DFSMShsm then later needs to write
data to tape, it requires a scratch volume that is related to the test system’s logical volume
range.
Figure 13-2 helps you understand how you can protect your tapes in a DR test while your
production system continues running.
(Figure 13-2 highlights the DFSMSrmm housekeeping (HSKP) parameters VRSEL, EXPROC, and DSTORE, and cautions about DISP=MOD processing with DFSMShsm.)
This includes stopping any automatic short-on-scratch process, if enabled. For example,
RMM has an emergency short-on-scratch procedure.
To illustrate the implications of running the HSKP task in a DR host, see the example in
Table 13-1, which displays the status and definitions of a production volume in a normal
situation.
Table 13-1 VOLSER AAAAAA before returned to scratch from the disaster recovery site
Environment DEVSUP TCDB RMM MI VOLSER
In this example, volume AAAAAA is in master status in both environments. However, due to a
procedural error, it is returned to scratch by the DR host. You can see its status in Table 13-2.
Table 13-2 VOLSER AAAAAA after returned to scratch from the disaster recovery site
Environment DEVSUP TCDB RMM MI VOLSER
Volume AAAAAA is now in scratch category 0012. This presents two issues:
If you need to access this volume from a production host, you must first change its status
back to master (000F) by using ISMF ALTER from SCRATCH to PRIVATE on the production
host. Otherwise, you lose the data on the volume, which can have serious consequences, for
example, if 1,000 production volumes are accidentally returned to scratch by the DR host.
On the DR host, RMM is set to reject using the production volumes for output. If this
volume is mounted in response to a scratch mount on the DR host, it will be rejected by
RMM. Imagine the scenario where the TS7700 must mount 1,000 scratch volumes before
the TS7700 mounts a volume that RMM does not reject. This would not be a desirable
situation.
To provide maximum protection from a system operator perspective, perform these tasks to
protect production volumes from unwanted return-to-scratch processing:
Ensure that the RMM HSKP procedure is not running on any host during the test window
of the DR host. There is a real risk of data loss if the DR host returns production volumes
to scratch and you have defined in the TS7700 that the expiration time for the
corresponding category is 24 hours. After this time, volumes can become unrecoverable.
Ensure that the RMM short-on-scratch procedure does not start. The results can be the
same as running an HSKP.
In addition to the protection options that are described, you can also use the following RACF
commands to protect the production volumes:
RDEFINE TAPEVOL x* UACC(READ) OWNER(SYS1)
SETR GENERIC(TAPEVOL) REFRESH
In the command, x is the first character of the VOLSER of the volumes to protect.
After the DR test is finished, you have a set of volumes in the TS7700 that belong to DR test
activities. You need to decide what to do with these tapes. As a test ends, the RMM database
and VOLCAT will probably be discarded (along with all of the data that was used in the DR test).
However, until an action is taken, the volumes remain defined in the MI database: some are in
master status, and the others are in SCRATCH status.
If the volumes are not needed anymore, manually release the volumes and then run
EXPROC to return the volumes to scratch under RMM control. If the tapes will be used for
future test activities, manually release these volumes. The cartridges remain in SCRATCH
status and are ready for use. Remember to use a scratch category with an expiration time to
ensure that no space is wasted.
Important: Although volumes in the MI remain ready to use, you must ensure that, the next
time you create the DR test environment, these volumes are defined to RMM and the TCDB.
Otherwise, you cannot use them.
For a detailed technical description, see IBM Virtualization Engine TS7700 Series Best
Practices - FlashCopy for Disaster Recovery Testing, which is available at the Techdocs
website (search for the term TS7700):
http://www.ibm.com/support/techdocs/atsmastr.nsf/Web/TechDocs
The FlashCopy for DR testing function is supported on TS7700 Grid configurations where at
least one TS7760 or TS7720 cluster exists within the DR location. The function cannot be
supported under TS7740-only grids or where a TS7740 is the only applicable DR cluster. A
TS7740 might be present and used as part of the DR test if at least one TS7760 or TS7720 is
also present in the DR site.
Volumes in the Write Protect exclusion categories are not included in the flash. For these
categories, only a live copy exists.
During an enabled Flash, the autoremoval process is disabled for the TS7760/TS7720
members of the DR Family. A TS7760/TS7720 within a DR location requires extra capacity to
accommodate the reuse of volumes and any DR test data that is created within an excluded
category. Volumes that are not modified during the test require no additional TS7760/TS7720
disk cache capacity. The extra capacity requirement must be considered when planning the
size of the TS7760/TS7720 disk cache.
If you are using Time Delay Replication Policy, also check the cache usage of the remaining
production cluster TS7760/TS7720. Volumes can be removed from the TS7760/TS7720 only
when the T copies are processed (either in the complete grid, or in the family).
At least one TS7760 or TS7720 must be part of the DR Family. You can optionally include one
or more TS7740s. The TS7740 does not have the same functions in a DR Family that the
TS7760/TS7720 has. The Write Protect excluded media categories need to be consistent on
all clusters in a DR Family. If they are not consistent, the FlashCopy cannot be enabled.
The settings for a DR Family can be checked by using the following command
(Example 13-2):
LI REQ,<COMPOSITE>,DRSETUP,SHOW,<FAMILYNAME>
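For example, with the composite library name HYDRAG that is used in the examples later in this section and a hypothetical DR family name of DRFAM1, the command might look like this:
LI REQ,HYDRAG,DRSETUP,SHOW,DRFAM1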
LIVECOPY allows read access from a DR host to production volumes that were consistent
before time zero of a FlashCopy but that do not exist in the cache of the flashed
TS7760/TS7720, and instead exist only on a physical back-end tape that is attached to a
TS7700 or in the cache of a TS7740. If a volume in this state is accessed from a DR host and
LIVECOPY is enabled, the mount is satisfied. If LIVECOPY is not enabled, the mount fails.
To ensure that only data from time zero is used during a DR test, all mounts need to be run
on the TS7760/TS7720.
Important: Use the TS7740 in a DR Family only for remote mounts. Do not vary online the
TS7740 devices directly to the DR host.
The option is disabled by default. If you choose to enable this functionality, you must explicitly
enable the option by using the library request command with the LIVECOPY keyword, as follows
(Example 13-3):
LI REQ,<clib_name>,DRSETUP,<family_name>,LIVECOPY,FAMILY
To disable the LIVECOPY option, you must run the following command (Example 13-4):
LI REQ,<clib_name>,DRSETUP,<family_name>,LIVECOPY,NONE
Note: A FlashCopy cannot be enabled if Write Protect Mode was enabled from the MI.
Do not enable the FlashCopy if production hosts with tape processing have device allocations
on the clusters where the flash will be enabled. Failures might occur because the read-only
mode does not allow subsequent mounts.
You can use the following commands to identify whether a FlashCopy exists for a specific
volume, and to display the status of the live copy and the FlashCopy.
Example 13-7 Display of a logical volume after modification from production - Livecopy
LI REQ,HYDRAG,LVOL,A08760
CBR1020I Processing LIBRARY command: REQ,HYDRAG,LVOL,A08760.
CBR1280I Library HYDRAG request. 883
Keywords: LVOL,A08760
-------------------------------------------------------------
LOGICAL VOLUME INFORMATION V3 0.0
LOGICAL VOLUME: A08760
MEDIA TYPE: ECST
COMPRESSED SIZE (MB): 2763
MAXIMUM VOLUME CAPACITY (MB): 4000
CURRENT OWNER: cluster1
MOUNTED LIBRARY:
MOUNTED VNODE:
MOUNTED DEVICE:
TVC LIBRARY: cluster1
MOUNT STATE:
CACHE PREFERENCE: PG1
CATEGORY: 000F
LAST MOUNTED (UTC): 2014-03-11 10:19:47
LAST MODIFIED (UTC): 2014-03-11 10:18:08
LAST MODIFIED VNODE: 00
LAST MODIFIED DEVICE: 00
TOTAL REQUIRED COPIES: 2
KNOWN CONSISTENT COPIES: 2
KNOWN REMOVED COPIES: 0
IMMEDIATE-DEFERRED: N
DELETE EXPIRED: N
RECONCILIATION REQUIRED: N
LWORM VOLUME: N
FlashCopy: CREATED
----------------------------------------------------------------
Example 13-8 shows the flash instance of the same logical volume.
Example 13-8 Display of a logical volume after modification from production - Flash volume
LI REQ,HYDRAG,LVOL,A08760,FLASH
CBR1020I Processing LIBRARY command: REQ,HYDRAG,LVOL,A08760,FLASH
CBR1280I Library HYDRAG request. 886
Keywords: LVOL,A08760,FLASH
-----------------------------------------------------------------
LOGICAL VOLUME INFORMATION V3 0.0
FlashCopy VOLUME: A08760
MEDIA TYPE: ECST
COMPRESSED SIZE (MB): 0
MAXIMUM VOLUME CAPACITY (MB): 4000
CURRENT OWNER: cluster2
MOUNTED LIBRARY:
MOUNTED VNODE:
MOUNTED DEVICE:
TVC LIBRARY: cluster1
MOUNT STATE:
CACHE PREFERENCE: ---
CATEGORY: 000F
LAST MOUNTED (UTC): 1970-01-01 00:00:00
LAST MODIFIED (UTC): 2014-03-11 09:05:30
LAST MODIFIED VNODE:
LAST MODIFIED DEVICE:
TOTAL REQUIRED COPIES: -
KNOWN CONSISTENT COPIES: -
KNOWN REMOVED COPIES: -
IMMEDIATE-DEFERRED: -
DELETE EXPIRED: N
RECONCILIATION REQUIRED: N
LWORM VOLUME: -
---------------------------------------------------------------
LIBRARY RQ CACHE PRI PVOL SEC PVOL COPY ST COPY Q COPY CP
cluster2 N Y ------ ------ CMPT - RUN
Only the clusters from the DR Family are shown (in this case only a TS7720 was defined in
the DR Family). This information is also available on the MI.
To see the information from the created FlashCopy instance, select the FlashCopy CREATED
field. This action opens a second view, as shown in Figure 13-4.
The following Host Console Request (HCR) command provides information about the space
that is used by the FlashCopy at the bottom of its output. See Example 13-9.
LI REQ,<distributed library name>,CACHE
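For example, assuming a distributed library named HYDRAD1 (a hypothetical name), the command might be entered as follows:
LI REQ,HYDRAD1,CACHE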
You can find the same information on the MI as well. You can select the following display
windows:
Monitor
Performance
Cache Usage
Also, you can control the usage of your virtual drives. You can select these displays on the MI:
Virtual
Virtual Tape Drives
Figure 13-6 Virtual Tape Drive window during a FlashCopy for disaster recovery test
All of these methods have advantages and disadvantages. Before you decide which method to
use, weigh the advantages and disadvantages of each method against your environment and
resources, and then choose the method that best fits your DR testing needs and capabilities.
Note: Each method assumes an independent DR site (DR host and at least one DR
cluster). That is, it is assumed that no production hosts have had any devices online to the
disaster clusters to read/write production data on those clusters.
The next section describes the steps that can be used to run DR testing using the FlashCopy
functionality. For a detailed description of all commands, see IBM Virtualization Engine
TS7700 Series Best Practices - FlashCopy for Disaster Recovery Testing, which is available
at the Techdocs website (search for the term TS7700):
http://www.ibm.com/support/techdocs/atsmastr.nsf/Web/TechDocs
These steps are written in a checklist format to provide a reference of the steps that are
needed to accomplish this method. Review all of these steps before an actual DR exercise,
and again during the DR exercise. Because the steps are written to apply to more than one
TS7700 grid configuration, make sure that you understand each step and how it applies to
your environment before you run it.
If you determine that FlashCopy for DR is not suitable to your DR environment, using the
‘Write Protect Mode on DR clusters’ method is the suggested alternative.
The following sections describe the steps that you can use to accomplish this method of DR
testing. As with the previous method, these steps are written in a checklist format to provide
a reference of the steps that are needed to accomplish this method. Review all of these steps
before an actual DR exercise, and again during the DR exercise. Because the steps are
written to apply to more than one TS7700 grid configuration, make sure that you understand
each step and how it applies to your environment before you run it.
If your choice is between using Write Protect Mode and not using Write Protect Mode, it is
recommended to use Write Protect Mode (Method 2), to provide an additional level of
write-protection in case the TMS on the disaster host is not configured correctly to prevent
writes to the production volumes.
Described in the following sections are the steps that you can use to accomplish this method
of DR testing. As with the previous method, these steps are written in a checklist format to
provide a reference of the steps that are needed to accomplish this method. Review all of
these steps before an actual DR exercise, and again during the DR exercise. Because the
steps are written to apply to more than one TS7700 grid configuration, make sure that you
understand each step and how it applies to your environment before you run it.
The concern about losing data in a real disaster during a DR test is the major drawback to
using this DR method. Because of this, if it is possible to use one of the DR methods
described earlier (using FlashCopy or Write Protect Mode), it is suggested to use one of
those methods.
Described below are the steps that you can use to accomplish this method of DR testing. As
with the previous methods, these steps are written in a checklist format to provide a
reference of the steps needed to accomplish this method. Review all of these steps before an
actual DR exercise, and again during the DR exercise. Because the steps are written to apply
to more than one TS7700 grid configuration, make sure that you understand each step and
how it applies to your environment before you run it.
The messages in Example 13-10 might appear if you try to read a logical volume that was not
present at time zero in the DR Family.
The message in Example 13-11 might also appear if you want to modify a volume that is in a
write protect media category.
The message in Example 13-12 might occur if a job was running on the cluster while the
FlashCopy was enabled.
Example 13-12 Message for job running on the cluster while FlashCopy was enabled
IEF233A M 2507,A10088,,DENEKA8,STEP2,DENEKA.HG.TEST1.DUMP1
IEC518I SOFTWARE ERRSTAT: WRITPROT 2507,A10088,SL,DENEKA8,STEP2
IEC502E RK 2507,A10088,SL,DENEKA8,STEP2
IEC147I 613-24,IFG0194F,DENEKA8,STEP2,AUS1,2507,,DENEKA.HG.TEST1.DUMP1
Part 4 Appendixes
This part offers management and operational information for your IBM TS7700.
Exception: This appendix provides a general description of the feature codes and where
they apply. Use the following link to the IBM Knowledge Center for technical information
related to the TS7700:
https://www.ibm.com/support/knowledgecenter/en/STFS69_4.1.0/ts7740_feature_codes.html
TS7700 R4.1.1: Seven- and eight-way grid support through RPQ.
Clarification: The symbol “†” means that the specific feature has been withdrawn.
Table B-1 Supported tape solutions for non-z/OS platforms in IBM Z environments
Platform/Tape system IBM TS3500 tape library TS7700 3592 drives
z/VM V5.4, V6.1, V6.2, and V6.3 native Yes Yesa Yes
Although z/VM and z/VSE can use the TS7700, you must consider certain items. For
information about support for TPF, see “Software implementation in z/OS Transaction
Processing Facility” on page 821.
APAR VM65789 introduced the ability for the RMS component of DFSMS/VM to use the COPY
EXPORT function of a TS7700. COPY EXPORT allows a copy of selected logical volumes that
are written on back-end physical tape that is attached to a TS7700 to be removed and taken
offsite for disaster recovery purposes. For more information, review the memo that is bundled
with VM65789 or see z/VM: DFSMS/VM Removable Media Services, SC24-6185.
For DR tests involving a TS7700 grid that is connected to hosts running z/VM or z/VSE,
Release 3.3 of the TS7700 microcode introduced a new keyword on the DRSETUP command
called SELFLIVE. This keyword provides a DR host the ability to access its self-created
content that has been moved into a write-protected category when flash is enabled. For more
information, see the Library Request Command white paper found at:
http://www.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/WP101091
Tape management
Although the RMS functions themselves do not include tape management system (TMS)
services, such as inventory management and label verification, RMS functions are designed
to interface with a TMS that can perform these functions. Additional information about
third-party TMSs that support the TS7700 in the z/VM environment is in IBM TotalStorage
3494 Tape Library: A Practical Guide to Tape Drives and Tape Automation, SG24-4632.
Figure B-1 shows the z/VM native support for the TS7700.
Figure B-1 TS7700 in a native z/VM environment using DFSMS/VM
When you use the TS7740, TS7720T, or TS7760T in a VM environment, consider that many
VM applications or system utilities use specific mounts for scratch volumes. With specific
mounts, every time a mount request is sent from the host, the logical volume might need to be
recalled from the stacked cartridge if it is not already in the Tape Volume Cache (TVC).
Instead, you might want to consider the use of a TS7760, TS7760T, TS7720, or TS7720T
(cache resident partition CP0) for your VM workload. This keeps the data in the TVC for faster
access. In addition, consider whether a TS7700 with its replication capabilities to remote
sites provides what your VM backup processing needs, or whether physical tape is needed to
move data offsite.
DFSMS/VM
After you define the new TS7700 tape library through HCD, you must define the TS7700 to
DFSMS/VM if the VM system is to use the TS7700 directly. You define the TS7700 tape
library through the DFSMS/VM DGTVCNTL DATA control file. Also, you define the available
tape drives though the RMCONFIG DATA configuration file. See z/VM V6R2 DFSMS/VM
Removable Media Services, SC24-6185, for more information.
You have access to RMS as a component of DFSMS/VM. To enable RMS to run automatic
insert bulk processing, you must create the RMBnnnnn data file in the VMSYS:DFSMS CONTROL
directory, where nnnnn is the five-character tape library sequence number that is assigned to
the TS7700 during hardware installation.
For more information about implementing DFSMS/VM and RMS, see DFSMS/VM Function
Level 221 Removable Media Services User’s Guide and Reference, SC35-0141. If the
TS7700 is shared by your VM system and other systems, more considerations apply. For
more information, see Guide to Sharing and Partitioning IBM Tape Library Data, SG24-4409.
z/VSE supports the TS3500 Tape Library/3953 natively through its Tape Library Support
(TLS). In addition to the old Tape Library Support, a function has been added to enable the
Tape Library to be supported through the IBM S/390® channel command interface
commands. This function eliminates any XPCC/APPC communication protocol that is
required by the old interface. The external interface (LIBSERV JCL and LIBSERV macro)
remains unchanged.
For native support under VSE, where TS7700 is used only by z/VSE, select TLS. At least one
tape drive must be permanently assigned to VSE.
LIBSERV
The communication from the host to the TS7700 goes through the LIBSERV JCL or macro
interface. Example B-2 shows a sample job that uses LIBSERV to mount volume 123456 for
write on device address 480 and, in a second step, to release the drive again.
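Because the example itself is not reproduced here, the following lines are only a sketch of what such a job might look like; verify the exact LIBSERV operands for your z/VSE release in the manuals referenced below:
// JOB LIBSERV
// LIBSERV MOUNT,UNIT=480,VOL=123456
// LIBSERV RELEASE,UNIT=480
/&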
For more information, see z/VSE System Administration Guide, SC34-2627, and z/VSE
System Macros Reference, SC34-2638.
Tip: When z/OS is installed as a z/VM guest on a virtual machine, you must specify the
following statement in the virtual machine directory entry for the VM user ID under which
the z/OS guest operating system is started for the first time:
STDEVOPT LIBRARY CTL
NOCTL specifies that the virtual machine is not authorized to send commands to a tape
library, which results in an I/O error (command reject) when MVS tries to send a command to
the library. For more information about the STDEVOPT statement, see z/VM V6.2 Resources:
http://www.vm.ibm.com/zvm620/
z/VSE guests
Some VSE TMSs require VGS support and also DFSMS/VM RMS for communication with the
TS7700.
If the VGS is required, define the LIBCONFIG file and FSMRMVGC EXEC configuration file on the
VGS service system’s A disk. This file cross-references the z/VSE guest’s tape library names
with the names that DFSMS/VM uses. To enable z/VSE guest exploitation of inventory
support functions through the LIBSERV-VGS interface, the LIBRCMS part must be installed
on the VM system.
If VGS is to service inventory requests for multiple z/VSE guests, you must edit the LIBRCMS
SRVNAMES cross-reference file. This file enables the inventory support server to access
Librarian files on the correct VSE guest system. For more information, see 7.6, “VSE Guest
Server Considerations” in Guide to Sharing and Partitioning IBM Tape Library Data,
SG24-4409.
Therefore, some z/VSE guest scenarios use the CMS service system, called the VGS, to
communicate with RMSMASTR. VGS uses the standard facilities of RMS to interact with the
TS7700 and the virtual drives.
Figure B-2 shows the flow and connections of a TS7700 in a z/VSE environment under a VM.
VSE uses original equipment manufacturer (OEM) tape management products that support
scratch mounts. So, if you are using VSE under VM, you have the benefit of using the scratch
(Fast Ready) attribute for the VSE library’s scratch category.
For more information about z/VSE, see z/VSE V6R1.0 Administration, SC34-2627.
Because z/TPF does not have a TMS or a tape catalog system, z/OS manages this function.
In a z/TPF environment, most tape data is passed between the systems. In general, 90% of
the tapes are created on z/TPF and read on z/OS, and the remaining 10% are created on
z/OS and read on z/TPF.
Be sure to use the normal z/OS and TS7700 installation processes. For more information,
see the following white paper, which describes some of the leading practices for implementing
the TS7700 with z/TPF:
http://www.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/WP102117
After a volume is loaded into a z/TPF drive, have an automated solution in place that passes
the volume serial number (VOLSER), the tape data set name, and the expiration date over to
z/OS to process it automatically.
On z/OS, you must update the TMS’s catalog and the TCDB so that z/OS can process virtual
volumes that are created by z/TPF. After the z/TPF-written volumes are added to the z/OS
TMS catalog and the TCDB, normal expiration processing applies. When the data on a virtual
volume expires and the volume is returned to scratch, the TS7700 internal database is
updated to reflect the volume information maintained by z/OS.
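As a hedged sketch of what such automation might issue on z/OS (the volume serial TPF001, library name COMPLIB1, and construct name SGTPF are placeholders, and the exact operands depend on your installation), the volume could be added to DFSMSrmm and the TCDB with commands similar to the following:
RMM ADDVOLUME TPF001 STATUS(MASTER)
and, with IDCAMS:
CREATE VOLUMEENTRY (NAME(VTPF001) -
       LIBRARYNAME(COMPLIB1) -
       USEATTRIBUTE(PRIVATE) -
       MEDIATYPE(MEDIA2) -
       STORAGEGROUP(SGTPF))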
Advanced Policy Management is supported in z/TPF through a user exit. The exit is called
any time that a volume is loaded into a drive. Then, the user can specify, through the z/TPF
user exit, whether the volume will inherit the attributes of an existing volume by using the
clone VOLSER attribute. Or, the code can elect to specifically set any or all of the Storage
Group (SG), Management Class (MC), Storage Class (SC), or DC construct names. If the exit
is not coded, the volume attributes remain unchanged because the volume is used by z/TPF.
Library interface
z/TPF has only one operator interface with the TS7700, which is a z/TPF functional message
called ZTPLF. The various ZTPLF functions enable the operator to manipulate the tapes in the
library as operational procedures require. These functions include Reserve, Release, Move,
Query, Load, Unload, and Fill. For more information, see IBM TotalStorage 3494 Tape
Library: A Practical Guide to Tape Drives and Tape Automation, SG24-4632.
SIMs and MIMs are represented in z/TPF by EREP reports and the following messages:
CEFR0354
CEFR0355W
CEFR0356W
CEFR0357E
CEFR0347W
CDFR0348W
CDFR0349E
The issue with z/TPF arises when the period that clusters wait before recognizing that
another cluster in the grid failed exceeds the timeout values on z/TPF. This issue also means
that during this recovery period, z/TPF cannot run any ZTPLF commands that change the
status of a volume. This restriction includes loading tapes or changing the category of a
volume through a ZTPLF command, or through the tape category user exit in segment CORU.
The recovery period when a response is still required from a failing cluster can be as long as
6 minutes. Attempting to send a tape library command to any device in the grid during this
period can render that device inoperable until the recovery period has elapsed even if the
device is on a cluster that is not failing.
To protect against timeouts during a cluster failure, z/TPF systems must be configured to
avoid sending tape library commands to devices in a TS7700 grid along critical code paths
within z/TPF. This task can be accomplished through the tape category change user exit in
the segment CORU. To isolate z/TPF from timing issues, the category for a volume must not
be changed if the exit is called for a tape switch. Be sure that the exit changes the category
when a volume is first loaded by z/TPF, and that the category is not changed again afterward.
To further protect z/TPF against periods in which a cluster is failing, z/TPF must keep enough
volumes loaded on drives that are varied on to z/TPF so that the z/TPF system can operate
without the need to load an extra volume on any drive in the grid until the cluster failure is
recognized. z/TPF must have enough volumes that are loaded so that it can survive the
6-minute period where a failing cluster prevents other devices in that grid from loading any
new volumes.
Important: Read and write operations to devices in a grid do not require communication
between all clusters in the grid. Eliminating the tape library commands from the critical
paths in z/TPF helps z/TPF tolerate the recovery times of the TS7700 and read or write
data without problems if a failure of one cluster occurs within the grid.
In this manner, z/TPF has access to a scratch tape that is owned by the cluster that was given
the request for a scratch volume. If all of the volumes in a grid are owned by one cluster, a
failure on that cluster requires a cluster takeover (which can take tens of minutes) before
volume ownership can be transferred to a surviving cluster.
Guidelines
When z/TPF applications use a TS7700 multi-cluster grid that is represented by the
composite library, the following usage and configuration guidelines can help you meet the
TPF response-time expectations on the storage subsystems:
The best configuration is to have the active and standby z/TPF devices and volumes on
separate composite libraries (either single-cluster or multi-cluster grid). This configuration
prevents a single event on a composite library from affecting both the primary and
secondary devices.
Clarification: In a z/TPF environment, manipulation of construct names for volumes can
occur when they are moved from scratch through a user exit. The user exit enables the
construct names and clone VOLSER to be altered. If the exit is not implemented, z/TPF
does not alter the construct names.
z/TPF use of categories is flexible. z/TPF enables each drive to be assigned a scratch
category. Relating to private categories, each z/TPF has its own category to which
volumes are assigned when they are mounted.
For more information about this topic, see the z/TPF section of IBM Knowledge Center:
http://www.ibm.com/support/knowledgecenter/SSB23S/welcome
Because these hosts do not know about constructs, they ignore static construct assignments,
and the assignments are kept even when the logical volume is returned to scratch. Static
assignment means that logical volumes are assigned construct names at insert time.
Construct names can also be assigned later at any time.
Tip: In a z/OS environment, OAM controls the construct assignment and resets any static
assignment that is made before using the TS7700 MI. Construct assignments are also
reset to blank when a logical volume is returned to scratch.
4. If you want to modify existing VOLSER ranges and assign the required static construct
names to the logical volume ranges through the change existing logical volume function,
select Logical Volumes → Modify Logical Volumes to open the window.
Define groups of logical volumes with the same construct names assigned and during insert
processing, direct them to separate volume categories so that all volumes in one LM volume
category have identical constructs assigned.
Host control is given by using the appropriate scratch pool. By requesting a scratch mount
from a specific scratch category, the actions that are defined for the constructs that are
assigned to the logical volumes in this category are run at the Rewind Unload (RUN) of the
logical volume.
Appendix C
DFSMS has support that provides JES3 allocation with the appropriate information to select a
tape library device by referencing device strings with a common name among systems within
a JES3 complex.
All tape library devices can be shared between processors in a JES3 complex. They must
also be shared among systems within the same storage management subsystem complex
(SMSplex).
Consideration: Tape drives in the TS3500 tape library cannot be used by JES3 dynamic
support programs (DSPs).
Define all devices in the libraries through DEVICE statements. All TS3500 tape library drives
within a complex must be either JES3-managed or non-JES3-managed. Do not mix managed
and non-managed devices. Mixing might prevent non-managed devices from use for new
data set allocations and reduce device eligibility for existing data sets. Allocation failures or
delays in the job setup can result.
Neither JES3 nor DFSMS verifies that a complete and accurate set of initialization statements
is defined to the system. Incomplete or inaccurate TS3500 tape library definitions can result
in jobs failing to be allocated.
During converter/interpreter (CI) processing for a job, the LDG names are passed to JES3 by
DFSMS for use by MDS in selecting library tape drives for the job. Unlike a JES2
environment, a JES3 operating environment requires the specification of esoteric unit names
for the devices within a library. These unit names are used in the required JES3 initialization
statements.
Important: Even if the LDG definitions are defined as esoterics in HCD, they are not used
in the job control language (JCL). There is no need for any UNIT parameter in JES3 JCL for
libraries. The allocation goes through the automatic class selection (ACS) routines. Coding
a UNIT parameter might cause problems.
The only need for coding the LDG definitions in HCD as esoteric names is for the
HWSNAME definitions in the JES3 INISH deck.
3490E: LDG3490E
3592-J1A: LDG359J
3592-E05: LDG359K
It also enables you to address a specific device type in a specific tape library. In a
stand-alone grid, or in a multiple-cluster TS7700 grid, the previous references to the
five-digit library identification number are to the composite library.
To set up a TS3500 tape library in a JES3 environment, complete the following steps:
1. Define LDGs. Prepare the naming conventions in advance. Clarify all the names for the
LDGs that you need.
2. Include the esoteric names from step 1 in the hardware configuration definition (HCD) and
activate the new Esoteric Device Table (EDT).
There is no default or specific naming convention for this statement. This name is used in
other JES3 init statements to group the devices together for certain JES3 processes (for
example, allocation). Therefore, it is necessary that all the devices with the same XTYPE
belong to the same library and the same device type.
The letters CA in the XTYPE definition indicate to you that this is a CARTRIDGE device, as
shown in Figure C-1.
Exception: When Dump Job (DJ) is used with the SERVER=YES keyword. When this
keyword is used, DJ uses MVS dynamic allocation to allocate the device, which uses
XUNIT.
Therefore, do not specify DTYPE, JUNIT, and JNAME parameters on the DEVICE
statements. No check is made during initialization to prevent tape library drives from definition
as support units, and no check is made to prevent the drives from allocation to a DSP if they
are defined. Any attempt to call a tape DSP by requesting a tape library fails because the
DSP cannot allocate a tape library drive.
SETNAME,XTYPE=LB1359K,
NAMES=(LDGW3495,LDGF4001,LDG359K,LDKF4001)
The NAMES parameter lists, in order, the complex-wide library name, the library-specific
library name, the complex-wide device name, and the library-specific device name.
Tip: Do not specify esoteric and generic unit names, such as 3492, SYS3480R, and
SYS348XR. Also, never use esoteric names, such as TAPE and CART.
Library F4001: 4 x 3592-E05 (UADD 1104-1107) and 4 x 3592-J1A (UADD 1100-1103)
Library F4006: 4 x 3592-E05 (UADD 2004-2007) and 4 x 3592-E05 (UADD 2000-2003)
Complex-wide device type: one definition for each installed device type:
LDG359J Represents the 3592-J1A devices
LDG359K Represents the 3592-E05 devices
Library-specific device type: one definition for each device type in each library:
LDJF4001 Represents the 3592-J1A in library F4001
LDKF4001 Represents the 3592-E05 in library F4001
LDKF4006 Represents the 3592-E05 in library F4006
DEVICE,XTYPE=(LB13592J,CA),XUNIT=(1000,*ALL,,OFF),numdev=4
DEVICE,XTYPE=(LB13592K,CA),XUNIT=(1104,*ALL,,OFF),numdev=4
DEVICE,XTYPE=(LB23592K,CA),XUNIT=(2000,*ALL,,OFF),numdev=8
SETNAME,XTYPE=(LB13592J,CA),NAMES=(LDGW3495,LDGF4001,LDG359J,LDJF4001)
SETNAME,XTYPE=(LB13592K,CA),NAMES=(LDGW3495,LDGF4001,LDG359K,LDKF4001)
SETNAME,XTYPE=(LB23592K,CA),NAMES=(LDGW3495,LDGF4006,LDG359K,LDKF4006)
For this example, you need three SETNAME statements for these reasons:
One library with two different device types = Two SETNAME statements
One library with one device type = One SETNAME statement
HWSNAME,TYPE=(LDGW3495,LDGF4001,LDGF4006,LDG359J,LDG359K,LDJF4001,LDKF4001,LDKF4006)
HWSNAME,TYPE=(LDGF4001,LDJF4001,LDKF4001,LDG359J)
HWSNAME,TYPE=(LDGF4006,LDKF4006)
HWSNAME,TYPE=(LDJF4001,LDG359J)
HWSNAME,TYPE=(LDG359J,LDJF4001)
HWSNAME,TYPE=(LDG359K,LDKF4001,LDGF4006,LDKF4006)
Library 3 has a LIBRARY-ID of 22051 and only a TS7700 installed with a composite library
LIBRARY-ID of 13001. There are no native tape drives in that library.
(The figure for the second configuration example shows the device types and unit address ranges that are used: 8 x 3592-J1A (UADD 1100-1107) and 8 x 3592-E05 (UADD 1107-110F), 8 x 3592-E05 (UADD 2000-2007) and 8 x 3592-E06 (UADD 4000-4007), plus 256 x 3490E virtual devices (UADD 0100-01FF), spread across libraries F4001 and F4006 and the TS7700 grids with composite library IDs 13001 and 47110.)
LDG definitions that are needed for the second configuration example
Table C-4 shows all the LDG definitions that are needed in the HCD of the second
configuration example.
Library-specific name: LDGF4001, LDGF4006, LDG13001, LDG47110. One definition for each
library and for each stand-alone cluster TS7700 grid. For a single-cluster or multiple-cluster
TS7700 grid, only the composite library LIBRARY-ID is specified.
Complex-wide device type: one definition for each installed device type:
LDG3490E Represents the 3490 devices in the TS7700
LDG359J Represents the 3592-J1A
LDG359K Represents the 3592-E05
LDG359L Represents the 3592-E05 with Encryption
LDG359M Represents the 3592-E06
Library-specific device type: one definition for each device type in each library, except for the multi-cluster TS7700 grid:
LDE13001 Represents the virtual drives in the stand-alone cluster TS7700 grid in library 22051
LDE47110 Represents the virtual drives in the multicluster TS7700 grid in libraries F4001 and F4006
LDJF4001 Represents the 3592-J1A in library F4001
LDKF4001 Represents the 3592-E05 in library F4001
LDLF4006 Represents the encryption-enabled 3592-E05 in library F4006
LDMF4006 Represents the 3592-E06 in library F4006
DEVICE,XTYPE=(LB13592J,CA),XUNIT=(1100,*ALL,,OFF),numdev=8
DEVICE,XTYPE=(LB13592K,CA),XUNIT=(1107,*ALL,,OFF),numdev=8,
DEVICE,XTYPE=(LB2359M,CA),XUNIT=(4000,*ALL,,OFF),numdev=8
DEVICE,XTYPE=(LB2359L,CA),XUNIT=(2000,*ALL,,OFF),numdev=8
DEVICE,XTYPE=(LB3GRD1,CA),XUNIT=(3000,*ALL,,OFF),numdev=256
DEVICE,XTYPE=(LB12GRD,CA),XUNIT=(0110,*ALL,S3,OFF)
DEVICE,XTYPE=(LB12GRD,CA),XUNIT=(0120,*ALL,S3,OFF)
DEVICE,XTYPE=(LB12GRD,CA),XUNIT=(0130,*ALL,S3,OFF)
DEVICE,XTYPE=(LB12GRD,CA),XUNIT=(0140,*ALL,S3,OFF)
DEVICE,XTYPE=(LB12GRD,CA),XUNIT=(0111,*ALL,S3,OFF)
DEVICE,XTYPE=(LB12GRD,CA),XUNIT=(0121,*ALL,S3,OFF)
DEVICE,XTYPE=(LB12GRD,CA),XUNIT=(0131,*ALL,S3,OFF)
DEVICE,XTYPE=(LB12GRD,CA),XUNIT=(0141,*ALL,S3,OFF)
...
DEVICE,XTYPE=(LB12GRD,CA),XUNIT=(01FF,*ALL,S3,OFF)
The same restriction applies to the virtual devices of the clusters of a multi-cluster grid
configuration. If you want to balance the workload across the virtual devices of all clusters,
do not code the NUMDEV parameter.
SETNAME,XTYPE=LB1359J,NAMES=(LDGW3495,LDGF4001,LDG359J,LDJF4001)
SETNAME,XTYPE=LB1359K,NAMES=(LDGW3495,LDGF4001,LDG359K,LDKF4001)
SETNAME,XTYPE=LB2359L,NAMES=(LDGW3495,LDGF4006,LDG359L,LDKF4006)
SETNAME,XTYPE=LB2359M,NAMES=(LDGW3495,LDGF4006,LDG359M,LDMF4006)
SETNAME,XTYPE=LB3GRD1,NAMES=(LDGW3495,LDG13001,LDG3490E,LDE13001)
SETNAME,XTYPE=LB12GRD,NAMES=(LDGW3495,LDG47110,LDG3490E,LDE47110)
HWSNAME,TYPE=(LDGW3495,LDGF4001,LDGF4006,LDG13001,LDG47110,LDG3490E,
LDG359J,LDG359K,LDG359L,LDG359M,LDE13001,LDE47110,LDJF4001,
LDKF4001,LDLF4006,LDMF4006)
HWSNAME,TYPE=(LDGF4001,LDJF4001,LDKF4001)
HWSNAME,TYPE=(LDGF4006,LDLF4006,LDMF4006)
HWSNAME,TYPE=(LDG47110,LDE47110)
HWSNAME,TYPE=(LDG13001,LDE13001)
HWSNAME,TYPE=(LDG3490E,LDE47110,LDE13001)
HWSNAME,TYPE=(LDG359J,LDJF4001)
HWSNAME,TYPE=(LDG359K,LDKF4001)
HWSNAME,TYPE=(LDG359L,LDLF4006)
HWSNAME,TYPE=(LDG359M,LDMF4006)
Figure C-10 High watermark setup statements for the second example
Additional examples
For additional examples, see the topic about the JES3 sample initialization deck definition in
IBM TotalStorage Virtual Tape Server: Planning, Implementing, and Monitoring, SG24-2229.
Figure C-11 on page 841 shows the JES3 processing phases for CI and MDS. The
processing phases include the support for system-managed direct access storage device
(DASD) data sets.
The following major differences exist between TS3500 tape library deferred mounting and
tape mounts for non-library drives:
Mounts for non-library drives by JES3 are only for the first use of a drive. Mounts for the
same unit are sent by IBM z/OS for the job. All mounts for TS3500 tape library drives are
sent by z/OS.
If all mounts within a job are deferred because there are no non-library tape mounts, that
job is not included in the setup depth parameter (SDEPTH).
MDS mount messages are suppressed for the TS3500 tape library.
JES3/DFSMS processing
DFSMS is called by the z/OS interpreter to perform these functions:
Update the scheduler work area (SWA) for DFSMS tape requests.
Call ACS exits for construct defaults.
DFSMS system-managed tape devices are not selected by using the UNIT parameter in the
JCL. For each data definition (DD) request requiring a TS3500 tape library unit, a list of device
pool names is passed, and from that list, an LDG name is assigned to the DD request. This
results in an LDG name passed to JES3 MDS for that request. Device pool names are never
known externally.
DFSMS-managed DISP=MOD data sets are assumed to be new until locate processing
determines otherwise. If a catalog locate determines that the data set is old by the VOLSER
specified, a new LDG name is determined based on the rules for old data sets.
DFSMS catalog services, a subsystem interface call to catalog locate processing, is used for
normal locate requests. DFSMS catalog services are started during locate processing. It
starts supervisor call (SVC) 26 for all existing data sets when DFSMS is active.
Locates are required for all existing data sets to determine whether they are DFSMS
managed, even if VOL=SER= is present in the DD statement. If the request is for an old data set,
catalog services determine whether it is for a library volume. For multivolume requests that
are system-managed, a check is made to determine whether all volumes are in the same
library.
Fetch messages
When tape library volumes are mounted and unmounted by the library, fetch messages to an
operator are unnecessary and can be confusing. With this support, all fetch messages
(IAT5110) for TS3500 tape library requests are changed to be the non-action informational
USES form of the message. These messages are routed to the same console destination as
other USES fetch messages. The routing of the message is based on the UNITNAME.
Another difference between JES3 MDS allocation and z/OS allocation is that MDS considers
the resource requirements for all the steps in a job for all processors in a loosely coupled
complex. z/OS allocation considers job resource requirements one step at a time in the
running processor.
MDS processing also determines which processors are eligible to run a job based on
resource availability and connectivity in the complex.
z/OS allocation interfaces with JES3 MDS during step allocation and dynamic allocation to
get the JES3 device allocation information and to inform MDS of resource deallocations. z/OS
allocation is enhanced by reducing the allocation path for mountable volumes.
JES3 supplies the device address for the tape library allocation request through a subsystem
interface (SSI) request to JES3 during step initiation when the job is running under the
initiator. This support is not changed from previous releases.
DFSMS and z/OS provide all of the tape library support except for the interfaces to JES3 for
MDS allocation and processor selection.
JES3 MDS continues to select tape units for the tape library. MDS no longer uses the UNIT
parameter for allocation of tape requests for tape library requests. DFSMS determines the
appropriate LDG name for the JES3 setup from the SG and DC assigned to the data set, and
replaces the UNITNAME from the JCL with that LDG name. Because this action is after the ACS
routine, the JCL-specified UNITNAME is available to the ACS routine.
Consideration: An LDG name that is specified as a UNITNAME in JCL can be used only to
filter requests within the ACS routine. Because DFSMS replaces the externally specified
UNITNAME, it cannot be used to direct allocation to a specific library or library device type
unless SMSHONOR is specified on the unit parameter. For a description of SMSHONOR and
changes that are needed for JES3 see z/OS JES3 Initialization and Tuning Guide,
SA32-1003, and z/OS MVS JCL Reference, SA32-1005.
All components within z/OS and DFSMS request tape mounting and unmounting inside a
tape library. They call a Data Facility Product (DFP) service, Library Automation
Communication Services (LACS), rather than sending a write to operator (WTO), which is
done by z/OS allocation, so all mounts are deferred until job run time. The LACS support is
called at that time.
MDS allocates an available drive from the available unit addresses for LDGW3495. It passes
that device address to z/OS allocation through the JES3 allocation SSI. At data set OPEN
time, LACS is used to mount and verify a scratch tape. When the job finishes with the tape,
either CLOSE or deallocation issues an unmount request through LACS, which removes the
tape from the drive. MDS does normal breakdown processing and does not need to
communicate with the tape library.
Consider the following aspects, especially if you are using a multi-cluster grid with more than
two clusters and not all clusters contain copies of all logical volumes:
Retain Copy mode setting
If you do not copy logical volumes to all of the clusters in the grid, JES3 might, for a
specific mount, select a drive that does not have a copy of the logical volume. If Retain
Copy mode is not enabled on the mounting cluster, an unnecessary copy might be forced
according to the Copy Consistency Points that are defined for this cluster in the
Management Class (MC).
Copy Consistency Point
Copy Consistency Point has one of the largest influences on which cluster’s cache is used
for a mount. The Copy Consistency Point of Rewind Unload (R) takes precedence over a
Copy Consistency Point of Deferred (D). For example, assuming that each cluster has a
consistent copy of the data, if a virtual device on Cluster 0 is selected for a mount and the
Copy Consistency Point is [R,D], the CL0 cache is selected for the mount. However, if the
Copy Consistency Point is [D,R], CL1’s cache is selected.
For workload balancing, consider specifying [D,D] rather than [R,D]. This more evenly
distributes the workload to both clusters in the grid.
You can find detailed information about these settings and other workload considerations in
Chapter 6, “IBM TS7700 implementation” on page 225 and Chapter 11, “Performance and
monitoring” on page 613.
Note: This support is available starting with z/OS V2R1 plus APARs OA46747 and
OA47041.
The first set of steps is common for device allocation assistance (specific mounts) and scratch
allocation assistance (scratch mounts). Device allocation assistance can be used
independently of the scratch allocation assistance support, and vice versa. The following LDG
naming conventions are used:
complex-wide name: Always LDGW3495. Indicates every device and device type in every
library.
library-specific name: An eight-character string that is composed of LDG prefixing the
five-digit library identification number. Indicates every device and device type in that specific
library (for example, LDG12345). In a TS7700, the library-specific name refers to the
composite library.
library-specific device name: An eight-character string that is composed of LDx prefixing the
five-digit library identification number. Indicates every device with device type x in that
specific library (for example, LDE12345, where “E” represents all 3490E devices in library
12345). In a TS7700, the library-specific device name refers to the composite library.
In this example, each distributed library in the grid has 256 devices for a total of 512. The
changes that must be made to the INISH deck to use the optional allocation assist support in
JES3 are shown in bold italic text. The INISH deck changes are needed only if the allocation
assist functions are to be enabled by specifying JES3_ALLOC_ASSIST=YES in the DEVSUPxx
PARMLIB member.
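For reference, the DEVSUPxx entry that the text refers to would look like the following fragment (shown in isolation; the rest of your DEVSUPxx member remains unchanged):
JES3_ALLOC_ASSIST=YES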
Before you enable the allocation assist functions, ensure that all TS7700 tape drives in the
INISH deck are defined with the necessary LDXxxxxx names, even if the TS7700 is a
stand-alone configuration consisting of one distributed library. In this example, rather than the
device statement representing the composite library (as a whole), the device statements are
defined at the distributed (or cluster) level and LDXxxxxx names are added (as needed) for
each distributed library in a TS7700 that has devices that are connected to the JES3 host.
These device statements are suggested examples that can be used. However, depending on
the contiguous device ranges that are available, more than one device statement can be used
to represent all of the devices in a composite library. Also, more than one device statement
might be needed to represent the devices in a distributed library (and a device can occur in
only one device statement). For example, if there are not 256 contiguous device addresses
that start with 1100, the devices might be split as follows:
DEVICE,XTYPE=(DLB10001,CA),XUNIT=(1000,*ALL,,OFF),NUMDEV=128
DEVICE,XTYPE=(DLB10001,CA),XUNIT=(1100,*ALL,,OFF),NUMDEV=128
Also, one of the factors that are used by JES3 in selecting devices for volume mounting is the
ADDRSORT parameter on the SETPARAM initialization statement. This parameter specifies that
devices are either allocated in the same order as the DEVICE statement defining them
(ADDRSORT=NO) or allocated by the order of their device numbers in ascending order
(ADDRSORT=YES, which is the default).
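If you decide that devices should be allocated in DEVICE-statement order instead, the corresponding SETPARAM fragment might look like the following sketch (any other SETPARAM operands that your installation uses remain unchanged):
SETPARAM,ADDRSORT=NO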
SETNAME statements
The following list illustrates the SETNAME statements:
For the 3490E devices in composite library 12345, distributed library (10001):
SETNAME,XTYPE=DLB10001,NAMES=(LDGW3495,LDG12345,LDG3490E,LDE12345,LDX10001)
For the 3490E devices in composite library 12345, distributed library (10002):
SETNAME,XTYPE=DLB10002,NAMES=(LDGW3495,LDG12345,LDG3490E,LDE12345,LDX10002)
High-watermark statements
The following list illustrates the HWSNAME statements:
HWSNAME,TYPE=(LDGW3495,LDG12345,LDG3490E,LDE12345,LDX10001,LDX10002)
HWSNAME,TYPE=(LDG12345,LDE12345,LDG3490E,LDX10001,LDX10002)
HWSNAME,TYPE=(LDE12345,LDG12345,LDG3490E,LDX10001,LDX10002)
HWSNAME,TYPE=(LDG3490E,LDE12345,LDG12345,LDX10001,LDX10002)
HWSNAME,TYPE=(LDX10001)
HWSNAME,TYPE=(LDX10002)
Note: The DLB10001 and DLB10002 device statement names are used here for
illustration purposes. When defining the device statement names, any name (up to 8
characters) can be used.
For a full explanation of these commands with examples, see z/OS MVS System Commands,
SA22-7627.
Important: Do not use this state save command for testing purposes. It affects the
performance of your IBM Virtual Tape Server (VTS) automated tape library (ATL)
because it takes time to get the memory dump in the hardware.
When using the DEVSERV QLIB command to display the subsystems (port-IDs) and drives
associated with the specified Library-ID, if the Library-ID specified is for a composite library,
the command also displays the distributed Library-IDs associated with the composite library.
If the Library-ID specified is for a distributed library, the command also displays the composite
Library-ID that is associated with the distributed library.
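For example, to display the composite library with LIBRARY-ID 12345 that is used in the sample INISH deck above, a command similar to the following might be entered (the abbreviated form DS QL is assumed here):
DS QL,12345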
Notes: You can find tailored JCL to run BVIR jobs and to analyze the data by using
VEHSTATS in the IBMTOOLS libraries. To access the IBM Tape Tools, go to the following
website:
ftp://ftp.software.ibm.com/storage/tapetool/
For the current information about BVIR, see the IBM Virtualization Engine TS7700 Series
Bulk Volume Information Retrieval Function User’s Guide, at the following address:
http://www.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/WP101094
These jobs are also available as members in userid.IBMTOOLS.JCL after you have installed
the IBMTOOLS.exe on your host. See 11.15, “Alerts and exception and message handling” on
page 692.
After you run one of these jobs, you can create various reports by using VEHSTATS. See
“VEHSTATS reports” on page 868.
BVIRHSTS
Example E-1 lists the JCL in userid.IBMTOOLS.JCL member BVIRHSTS.
BVIRHSTV
Example E-3 lists the JCL in userid.IBMTOOLS.JCL member BVIRHSTV.
Example E-4 shows the JCL to obtain the Volume Map report, which is also contained in the
userid.IBMTOOLS.JCL member BVIRVTS.
You can use the same JCL as shown in Example E-4 on page 859 for the cache report by
replacing the last statement (written in bold) with the statement listed in Example E-5. This
statement creates a report for Cluster 0.
Change the following parameters to obtain this report from each of the clusters in the grid:
VTSID=
MC=
Clarification: The Cache Contents report refers to the specific cluster to which the request
volume was written. In a TS7700 grid configuration, separate requests must be sent to
each cluster to obtain the cache contents of all of the clusters.
To obtain the Copy Audit report, use the same JCL shown in Example E-4 on page 859, but
replace the last statement (written in bold) with the statement shown in Example E-6, and
update the following parameters:
VTSID=
MC=
These three VEHSTATS jobs are also in userid.IBMTOOLS.JCL. Example E-12 lists the sample
JCL for VEHSTPO.
Example E-13 serves as input for a Copy Export function, which enables you to do offsite
vaulting of physical volumes from a TS7700.
Example E-14 Verify information in RMM CDS, Library Manager database, and TCDB
//EDGUTIL EXEC PGM=EDGUTIL,PARM='VERIFY(ALL,VOLCAT)'
//SYSPRINT DD SYSOUT=*
//MASTER DD DSN=your.rmm.database.name,DISP=SHR
//VCINOUT DD UNIT=3390,SPACE=(CYL,(900,500))
After running EDGUTIL, you receive information about all volumes with conflicting information.
Resolve discrepancies before the migration. For more information about this utility, see z/OS
DFSMSrmm Implementation and Customization Guide, SC23-6874. The job must be run
before the migration starts.
Even if you specify the FORCE parameter, it takes effect only when necessary. This parameter
requires you to be authorized to a specific IBM Resource Access Control Facility (RACF)
FACILITY class profile named STGADMIN.EDG.FORCE. Verify that you have the required authorization.
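As a hedged sketch (the user ID STGADM1 is a placeholder, and the access level that is required should be verified in the DFSMSrmm documentation), the authorization could be granted with RACF commands similar to the following:
RDEFINE FACILITY STGADMIN.EDG.FORCE UACC(NONE)
PERMIT STGADMIN.EDG.FORCE CLASS(FACILITY) ID(STGADM1) ACCESS(UPDATE)
SETROPTS RACLIST(FACILITY) REFRESH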
Example E-19 REXX EXEC for updating the library name in the TCDB
/* REXX */
/*************************************************************/
/* ALTERVOL */
/* */
/* Usage: ALTERVOL DSN(volserlist) LIB(libname) */
/* */
/* Before this EXEC is run, you must create the */
In Release 3.2 of the TS7700, all categories that are defined as scratch inherit the Fast
Ready attribute. There is no longer a need to use the Management Interface (MI) to set the
Fast Ready attribute to scratch categories. However, the MI is still needed to indicate which
categories are scratch.
Remember: IBM z/OS users can define any category from 0x0001 - 0xEFFF (0x0000 and 0xFFxx cannot be used) with the DEVSUPxx member of SYS1.PARMLIB. IEASYSxx must point to the appropriate member. If you use the library with other operating systems, or with multiple z/OS sysplexes in a partitioned environment, review your category usage to avoid potential conflicts.
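For illustration only, a DEVSUPxx fragment of the following form assigns a scratch category to each media type; the category values shown are placeholders chosen from the allowed range, not recommendations:
MEDIA5=0015,MEDIA6=0016,MEDIA7=0017,MEDIA8=0018,MEDIA9=0019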
0000  Null category: This pseudo-category is used in certain library commands to specify that the category that is already associated with the volume is to be used by default, or that no category is specified. Use of the null category does not affect the volume's order within the category to which it is assigned. No volumes are associated with this category.
0003  DFSMS (HPCT): Indicates scratch MEDIA3. MEDIA3 is the IBM 3590 High-Performance Cartridge Tape.
0004  DFSMS (EHPCT): Indicates scratch MEDIA4. MEDIA4 is the IBM 3590 Extended High-Performance Cartridge Tape.
0005  DFSMS (ETC): Indicates scratch MEDIA5. MEDIA5 is the IBM 3592 tape cartridge.
0006  DFSMS (EWTC): Indicates scratch MEDIA6. MEDIA6 is the IBM 3592 tape cartridge Write Once, Read Many (WORM).
0007  DFSMS (EETC): Indicates scratch MEDIA7. MEDIA7 is the IBM 3592 tape cartridge Economy.
0008  DFSMS (EEWTC): Indicates scratch MEDIA8. MEDIA8 is the IBM 3592 tape cartridge Economy WORM.
0009  DFSMS (EXTC): Indicates scratch MEDIA9. MEDIA9 is the IBM 3592 tape cartridge Extended.
0010-007F  DFSMS: Reserved. These volume categories can be used for library partitioning.
0080  DFSMS/VM (including VSE guest): Indicates that the volume belongs to the VM category SCRATCH0.
0081  DFSMS/VM (including VSE guest): Indicates that the volume belongs to the VM category SCRATCH1.
0082  DFSMS/VM (including VSE guest): Indicates that the volume belongs to the VM category SCRATCH2.
0083  DFSMS/VM (including VSE guest): Indicates that the volume belongs to the VM category SCRATCH3.
0084  DFSMS/VM (including VSE guest): Indicates that the volume belongs to the VM category SCRATCH4.
0085  DFSMS/VM (including VSE guest): Indicates that the volume belongs to the VM category SCRATCH5.
0086  DFSMS/VM (including VSE guest): Indicates that the volume belongs to the VM category SCRATCH6.
0087  DFSMS/VM (including VSE guest): Indicates that the volume belongs to the VM category SCRATCH7.
0088  DFSMS/VM (including VSE guest): Indicates that the volume belongs to the VM category SCRATCH8.
0089  DFSMS/VM (including VSE guest): Indicates that the volume belongs to the VM category SCRATCH9.
008A  DFSMS/VM (including VSE guest): Indicates that the volume belongs to the VM category SCRATCHA.
008B  DFSMS/VM (including VSE guest): Indicates that the volume belongs to the VM category SCRATCHB.
008C  DFSMS/VM (including VSE guest): Indicates that the volume belongs to the VM category SCRATCHC.
008D  DFSMS/VM (including VSE guest): Indicates that the volume belongs to the VM category SCRATCHD.
008E  DFSMS/VM (including VSE guest): Indicates that the volume belongs to the VM category SCRATCHE.
008F  DFSMS/VM (including VSE guest): Indicates that the volume belongs to the VM category SCRATCHF.
00A0  Native z/VSE: Indicates that the volume belongs to the VSE category SCRATCH00.
00A1  Native z/VSE: Indicates that the volume belongs to the VSE category SCRATCH01.
00A2  Native z/VSE: Indicates that the volume belongs to the VSE category SCRATCH02.
00A3  Native z/VSE: Indicates that the volume belongs to the VSE category SCRATCH03.
00A4  Native z/VSE: Indicates that the volume belongs to the VSE category SCRATCH04.
00A5  Native z/VSE: Indicates that the volume belongs to the VSE category SCRATCH05.
00A6  Native z/VSE: Indicates that the volume belongs to the VSE category SCRATCH06.
00A7  Native z/VSE: Indicates that the volume belongs to the VSE category SCRATCH07.
00A8  Native z/VSE: Indicates that the volume belongs to the VSE category SCRATCH08.
00A9  Native z/VSE: Indicates that the volume belongs to the VSE category SCRATCH09.
00AA  Native z/VSE: Indicates that the volume belongs to the VSE category SCRATCH10.
00AB  Native z/VSE: Indicates that the volume belongs to the VSE category SCRATCH11.
00AC  Native z/VSE: Indicates that the volume belongs to the VSE category SCRATCH12.
00AD  Native z/VSE: Indicates that the volume belongs to the VSE category SCRATCH13.
00AE  Native z/VSE: Indicates that the volume belongs to the VSE category SCRATCH14.
00AF  Native z/VSE: Indicates that the volume belongs to the VSE category SCRATCH15.
00B0  Native z/VSE: Indicates that the volume belongs to the VSE category SCRATCH16.
00B1  Native z/VSE: Indicates that the volume belongs to the VSE category SCRATCH17.
00B2  Native z/VSE: Indicates that the volume belongs to the VSE category SCRATCH18.
00B3  Native z/VSE: Indicates that the volume belongs to the VSE category SCRATCH19.
00B4  Native z/VSE: Indicates that the volume belongs to the VSE category SCRATCH20.
00B5  Native z/VSE: Indicates that the volume belongs to the VSE category SCRATCH21.
00B6  Native z/VSE: Indicates that the volume belongs to the VSE category SCRATCH22.
00B7  Native z/VSE: Indicates that the volume belongs to the VSE category SCRATCH23.
00B8  Native z/VSE: Indicates that the volume belongs to the VSE category SCRATCH24.
00B9  Native z/VSE: Indicates that the volume belongs to the VSE category SCRATCH25.
00BA  Native z/VSE: Indicates that the volume belongs to the VSE category SCRATCH26.
00BB  Native z/VSE: Indicates that the volume belongs to the VSE category SCRATCH27.
00BC  Native z/VSE: Indicates that the volume belongs to the VSE category SCRATCH28.
00BD  Native z/VSE: Indicates that the volume belongs to the VSE category SCRATCH29.
00BE  Native z/VSE: Indicates that the volume belongs to the VSE category SCRATCH30.
00BF  Native z/VSE: Indicates that the volume belongs to the VSE category SCRATCH31.
0100  IBM OS/400® (MLDD): Indicates that the volume has been assigned to category *SHARE400. Volumes in this category can be shared with all attached IBM System i® and AS/400 systems.
0101  OS/400 (MLDD): Indicates that the volume has been assigned to category *NOSHARE. Volumes in this category can be accessed only by the OS/400 system that assigned it to the category.
012C  Tivoli Storage Manager for AIX: Indicates a private volume. Volumes in this category are managed by Tivoli Storage Manager.
012D  Tivoli Storage Manager for AIX: Indicates an IBM 3490 scratch volume. Volumes in this category are managed by Tivoli Storage Manager.
012E  Tivoli Storage Manager for AIX: Indicates an IBM 3590 scratch volume. Volumes in this category are managed by Tivoli Storage Manager.
0FF2  Basic Tape Library Support (BTLS): Indicates a scratch volume. Volumes in this category belong to the optional scratch pool SCRTCH2.
FF01  VTS and IBM TS7700: Stacked Volume Insert category for a VTS and TS7700. A volume is set to this category when its volume serial number is in the range that is specified for stacked volumes for any VTS library partition.
FF02  VTS: Stacked Volume Scratch category 0 for a VTS and TS7700. This category is reserved for future use for scratch stacked volumes.
FF03  VTS: Stacked Volume Scratch category 1 for a VTS and TS7700. This category is used by the VTS for its scratch stacked volumes. This category is not used if the Licensed Internal Code (LIC) is 527 or later.
FF04  VTS and TS7700: Stacked Volume Private category for a VTS and TS7700. This category includes both scratch and private volumes (since VTS LIC level 527).
FF05  VTS and TS7700: Stacked Volume disaster recovery category for a VTS and TS7700. A volume is set to this category when its volume serial number is in the range that is specified for stacked volumes for any VTS library partition and the Library Manager is in disaster recovery mode.
FF06  VTS and TS7700: This category is used by the VTS as a temporary category for disaster recovery. After a stacked volume in category FF05 is processed, it is put into this category. This category is also used by the Product Field Engineering (PFE) tool called "movedata" as a temporary category.
FF07  VTS and TS7700: This category is reserved for future hardware functions.
FF08  VTS: This category is used by the VTS to indicate that a Read-Only-Recovery Stacked Volume with active data cannot be recovered.
FF09  TS7700: Stacked Volume Copy Export category for the TS7700. This category is used by the TS7700 to represent which physical stacked volumes are being copy exported or have already been copy exported as part of a program-initiated Copy Export operation.
FF0A  TS7700: Stacked Volume Copy Export Hold category for the TS7700. This category is used by the TS7700 to represent which physical stacked volumes have been moved to the Copy Export hold state as part of a program-initiated Copy Export operation.
This appendix describes four examples of how consistency policies work and how certain
parameters influence the behavior of the configurations. This appendix explains the usage of
the following objects:
Different Copy Consistency Policies
Scratch allocation assistance (SAA) and device allocation assistance (DAA)
Retain Copy Mode
Override settings
Synchronous deferred on Write Failure Option
Cluster family
The examples are meant as a drill to exercise some setting options and evaluate the effects
on the grid. They are only hypothetical implementations, for the sake of the settings exercise.
Although the distance between data centers has an influence on latency, the distance has no
influence on the function.
These examples show no Time Delay Replication copy policy. A Time Delay Replication copy is made only after a certain amount of time has expired (after creation or after last access). Until the timer expires, this type of copy behaves like a No Copy setting with regard to the parameters that are discussed in these examples. After the timer expires, the copy behaves like a copy that is produced in Deferred copy mode. Therefore, no specific examples need to be added.
The data-in-cache statement applies only when all clusters are available. During outages, the normal rules apply: Synchronous copies change to the synchronous-deferred state (if the synchronous write failure option is enabled), and RUN copies go to the immediate-deferred copy queue. As soon as the failing cluster is recovered and is available in the grid again, the copies are made according to their policies.
Each of the examples also shows one example of the specific influence of SAA, DAA,
override policies, the synchronous write failure option, and the service preparation mode of a
cluster. They also describe the copy policy behavior if a disaster occurs.
Without DAA, there is no pre-selection of the mount point for a non-scratch mount. This is
addressed only in the four-cluster grid example.
The Tape Volume Cache (TVC) selection for scratch mounts depends on the Copy
Consistency Policy. For non-scratch mounts, there is a general rule that if a cluster has a valid
copy of the logical volume in cache, this cluster TVC is selected as the I/O TVC.
Table G-1 shows the influence of the features as explained in the following examples.
Figure G-1 shows a homogeneous TS7740 cluster grid. You can also choose to introduce a
TS7720 only, or a hybrid cluster. See 2.4.1, “Homogeneous versus hybrid grid configuration”
on page 103 to choose the best configuration to meet your demands.
In Figure G-1, both hosts attach through FICON to TS7700 Cluster 0 and TS7700 Cluster 1, which are interconnected over the WAN. The Management Classes define these Copy Consistency Points (Cluster 0/Cluster 1): MC A = S/S, MC B = R/R, MC C = R/D, and MC D = R/N on Cluster 0 and N/R on Cluster 1.
SAA: Disabled
DAA for private volumes: Disabled (the default for DAA for private volumes is Enabled; it is disabled here for educational purposes only)
Synchronous Deferred on Write Failure option: On
Note: Because a copy is located only in one of the two clusters, private mounts might
fail in an outage if the targeted volume is not in the cluster that remains available.
These MCs are necessary for BVIR processing, DR volume testing, and Copy Export runs. In this case, the original MC (R,N) is selected, and no additional copy is created by a specific mount.
DAA for private volumes might be chosen to ensure that a cluster with a valid copy is selected. In this example, the benefits are limited. For MCs S/S, R/R, and R/D, a valid copy must always be available in both clusters, so DAA makes no difference. However, the MC D example (R/N and N/R) benefits because a private mount is directed to the cluster that holds the valid copy.
Because this is a two-cluster grid, cluster families do not provide value in this configuration.
It is better to use D/D with the Prefer local cache for Fast Ready mounts override, which prevents any unexpected immediate copies from occurring.
The figure for this example shows the same two-cluster configuration: both hosts attach through FICON to TS7700 Cluster 0 and Cluster 1 over the WAN, with the same Management Class definitions (MC A = S/S, MC B = R/R, MC C = R/D, and MC D = R/N and N/R).
SAA: Disabled
Note: Because a copy is located only within one of the two clusters, private mounts
might fail in an outage if the targeted volume is not in the cluster that remains available.
TVC selection for scratch mounts: As described in MCA, the local TVC is used. No remote
mount occurs.
Important: These MCs are necessary for BVIR processing, DR volume testing, and Copy
Export runs.
Special considerations for non-Fast Ready mounts and Copy Policy Override
Assume that you have a private non-Fast Ready mount for a logical volume. MC C with R/D is
selected. Prefer local cache for non-fast ready mounts is selected on Cluster 1, but not on
Cluster 0.
If Cluster 0 is selected as the virtual drive mount point, the TVC might be selected where a
valid copy exists. That can result in a remote write to Cluster 1, and the data will be in Cluster
1 after RUN. If the data was modified, a copy is processed to Cluster 0 at RUN time. If the
data was not modified, no copy occurs.
If Cluster 1 has no valid copy of the data (in cache or on a stacked volume), Cluster 0 is
selected for TVC.
During a stand-alone cluster outage, the three-cluster grid solution has no single point of failure that prevents you from accessing your data, assuming that copies exist on other clusters as defined in the Copy Consistency Points.
In this example, Cluster 0 and Cluster 1 are the HA clusters and are local to each other (less
than 10 kilometers (6.2 miles) apart). Cluster 2 is at a remote site that is away from the
production site or sites. The virtual devices in Cluster 0 and Cluster 1 are online to the host
and the virtual devices in Cluster 2 are offline to the hosts on Site A. The optional host is not
installed. The host accesses the 512 virtual devices that are provided by Cluster 0 and
Cluster 1.
Figure G-3 on page 896 shows an optional host connection that can be established to remote
Cluster 2 using DWDM or channel extenders. With this configuration, you need to define an
extra 256 virtual devices at the host for a total of 768 devices.
In this configuration, each TS7720 replicates to both its local TS7720 peer and to the remote
TS7740, depending on their Copy Consistency Points. If a TS7720 reaches the upper
threshold of usage, the oldest data that has already been replicated to the TS7740 might be
removed from the TS7720 cache, depending on the Copy Consistency Policy.
If you enable the TS7720 to remove data from cache, consider applying the selective dual
copy in the TS7740. In this case, the TS7720 can remove the data from its cache, and then,
the copy in the TS7740 is the only valid copy. Therefore, consider protecting this last valid
copy against a physical media failure.
Copy Export can be used from the TS7740 to have a second copy of the migrated data, if
required.
In this configuration, Host A attaches through FICON to TS7700D Cluster 0 and TS7700D Cluster 1; an optional Host B can attach to the remote TS7740 or TS7700T Cluster 2 through DWDM or channel extension. All three clusters are interconnected over the WAN. Copy Consistency Points (Cluster 0/Cluster 1/Cluster 2): MC A = S/S/R, MC B = R/R/R, MC C = R/N/D, and MC D = R/N/N, N/R/N, and N/N/R.
Figure G-3 Three-cluster HA and DR with two TS7700Ds and one TS7700T
SAA: Disabled
DAA: Disabled
Synchronous Deferred on Write Failure option: On
MCD has parameters RNN, NRN, and NNR. These MCs are necessary for BVIR processing,
DR volume testing, and Copy Export runs.
DAA might be chosen to ensure that a cluster with a valid copy is selected. In this example,
the data associated with MCC and MCD benefits.
The usage of override settings for local cache influences only the TVC selection, as described
in “Example 2: Two-cluster grid for HA and DR” on page 892.
The figure for this example shows two hosts attached through FICON to a four-cluster grid that combines two TS7740s with two TS7700Ds or TS7700Ts (Cluster 0 through Cluster 3), all interconnected over the WAN. Copy Consistency Points (Cluster 0/1/2/3): MC A = S/D/S/D, MC B = R/R/R/D, MC C = R/D/R/N, and MC D = R/N/N/N, N/R/N/N, N/N/R/N, and N/N/N/R.
In this example, if a TS7720 reaches the upper threshold of usage, the oldest data that has
already been replicated to the TS7740 can be removed from the TS7720 cache. Copy Export
can be used from the TS7740 to have a second copy of the migrated data, if required.
Synchronous Deferred on Write Failure option: On
MCD has these parameter settings: R/N/N/N, N/R/N/N, N/N/R/N, and N/N/N/R. These MCs
are necessary for BVIR processing, DR volume testing, and Copy Export runs.
Cluster families have a major influence on the behavior and are introduced in the next
example.
The configuration in Figure G-5 is the same four-cluster grid, with Cluster 0 and Cluster 1 grouped into Family A and Cluster 2 and Cluster 3 grouped into Family B, and with the same Management Class definitions as in the previous example.
Figure G-5 Examples of a four-cluster grid for HA and DR with cluster families
SAA: Disabled
Synchronous Deferred on Write Failure option: On
For deferred copies, only one copy is transferred between the two sites. All other
deferred copies are produced inside the defined family.
Mount behavior in normal conditions: The mount behavior itself remains the same as in the example without cluster families. It is under your control to select the appropriate scratch mount candidates or to disable virtual drives. Because of the use of SAA, Cluster 0 and Cluster 2 are selected for scratch mounts. For private mounts, DAA selects a cluster according to the rules of DAA.
TVC selection: Normally, a cluster with R (RUN) is preferred over a cluster with D (Deferred). However, if a remote mount occurs within a cluster family, the family overrules this behavior. Therefore, if a cluster needs a remote mount, it prefers a cluster inside its family over a cluster with a RUN consistency point outside the family. In the example, that can lead to the following situation.
Cluster 2 receives a private mount. Assume that Cluster 2 has no valid copy and initiates a
remote mount. Without a cluster family, Cluster 0 (TS7720) is selected if Cluster 0 has a
valid copy in the cache. Instead, Cluster 2 prefers to select Cluster 3 as the TVC, which
might result in a recall from a stacked volume. If the volume is modified, Cluster 3 already
has a valid copy and transfers this copy to all other clusters because of the Copy
Consistency Point.
MCD has these parameter settings: R/N/N/N, N/R/N/N, N/N/R/N, and N/N/N/R. These MCs
are used for BVIR processing, DR volume testing, and Copy Export runs. For this Copy
Consistency Point, the definition of cluster families has no influence.
All data volumes are calculated without growth for a basic calculation in this example.
Table G-7 shows the type of data, the likelihood of read, and the daily and total volume of TB
stored.
Other data, like SMF: read likelihood is high during the next week, then low; 0.5 TB per day; 730 TB total (4 years).
Especially for Example 3 and Example 4, there are several different options. In this appendix,
only two options are covered to explain the theory.
The night batch jobs that produce the 15 TB from "other backups" run for 5 hours each day. A throughput of 3 TB per hour is therefore necessary. Assuming that this is a steady workload, a compressed host I/O rate of 875 MBps is expected.
Example 2: All data on physical tape (premigrated ASAP), only one tape partition
First, determine the minimum cache requirement. It must be at least the amount of data that is written in one day. Adding the numbers, a one-drawer TS7760 with 31.42 TB would be sufficient. However, that does not satisfy the requirements of the customer.
A two-drawer configuration (approximately 63 TB) would also allow some HSM data to be kept in cache.
Example 3: HSM ML2 will be kept in cache only, all other data will be
premigrated, and tape partitions will be used
To determine the configuration, we first calculate the minimum cache capacity:
1. CP0: HSM ML2, 150 TB + 10% free space and contingency = 165 TB.
2. CP1: DB2: 25 TB + 10% contingency = 28 TB (rounded).
3. CP2: Other data = 15.8 TB. SMF data will be treated with PG1. All other data will be
treated as PG0.
If RUN or SYNC is needed (and we strongly suggest that you use them when HSM ML2 synchronous mode copy is used), a seven-drawer configuration is not sufficient, or no premigration could run during the night batch. This can become an issue because 15 TB does not fit in the premigration queue. Therefore, we do not recommend this configuration if RUN or SYNC is needed.
Regarding the premigration queue depth, we cannot follow the recommendation of keeping a full day of premigration data in the queue, because the customer produces 22 TB a day. In the 5-hour peak, approximately 15 TB is written. That means that either two TS7700s need to be installed, or the customer must accept that premigration during the peak time is essential to avoid throttling. In addition, the customer should consider reviewing the LI REQ,SETTING2 values for PRETHDEG, COPYWDEG, and TVCWDEG to be prepared for a situation in which the physical library is unavailable.
That means that we have to provide 703 TB, which results in 23 drawers. This number of drawers is capable of running the host I/O in sustained mode and also performing some copy activity.
Because all data from the daily backup will expire in cache, the amount of FC 5274 for the premigration queue needs to be recalculated.
In this example, the daily amount of compressed data is 7 TB, which results in 7 x FC 5274. Depending on the workload profile, you might consider installing only 5 TB to cover the DB2 portion only.
In this example, the number of back-end cartridges, and possibly even back-end drives, would also be lower than in Example 3.
If 7 x FC 5274 is installed, an unavailability of the physical tape library of up to 24 hours can also be tolerated without any throttling issue.
This is also a feasible solution if the customer accepts that no physical copy exists for a part of the backup data. Keep in mind that this data has only a short lifecycle anyway.
A CHPID is a two-digit value from 00 to FF depending on how many and which CHPIDs are
defined to the processor. Physical channel IDs (PCHIDs) represent a physical location on the
central processor complex (CPC) and are hardcoded. CHPIDs are arbitrary and plug into the
PCHIDs.
PATH is an IOCP parameter that specifies a set of CHPIDs (a path group of up to eight) that is defined to the processor.
If a CU definition specifies LINKs, multiple CPCs can talk to that cluster if the proper LINKs are used in the IOCP/IODF. The LINK values are the same on each CPC, even though the CHPIDs are probably different. The CHPIDs go to the switch that the links come from. The CHPID definition specifies which switch that particular CHPID goes to, and the CU definition specifies which outbound port (LINK) goes to that device. A single switch or multiple switches (up to eight) can be used for the CHPIDs in the PATH statement; it depends on which switch each CHPID runs to.
A CPC can use fewer than eight ports. The LINKs are still the outbound switch ports; you simply have a PATH statement with fewer CHPIDs and LINKs. The cables from the outbound switch ports are arbitrary. They can be connected in no particular order, or they can be connected in a way that means something to whoever is plugging in the cables, such as to aid in troubleshooting.
Example H-2 Control unit definition example that uses switches on eight channels
*ZORO/0
CNTLUNIT CUNUMBR=2611, X
PATH=((CSS(0),61,63,65,67,69,6B,6D,6F)), X
LINK=((CSS(0),2B,AC,3A,4F,CD,BB,5F,DD)), X
UNITADD=((00,16)),UNIT=3490,CUADD=0
*$HCDC$ DESC='ZORO BARR39'
TAPED300 IODEVICE ADDRESS=(D300,16),UNIT=3490,CUNUMBR=(2611),UNITADD=00
Directly connecting
In a direct connect situation, there is no switch and the CHPID channel cable connects only to
that device and no others. If all the CHPIDs are direct, you can forgo the link statement. The
link fields in the IOCP definition are all asterisks as shown in Example H-3.
Example H-3 Control unit definition example that uses a direct connection
*ZORO/0
CNTLUNIT CUNUMBR=2611, X
PATH=((CSS(0),61,63,65,67,69,6B,6D,6F)), X
LINK=((CSS(0),**,**,**,**,**,**,**,**)), X
UNITADD=((00,16)),UNIT=3490,CUADD=0
*$HCDC$ DESC='ZORO BARR39'
TAPED300 IODEVICE ADDRESS=(D300,16),UNIT=3490,CUNUMBR=(2611),UNITADD=00
Example H-4 IOCP statements for increasing device count to 496 on 4 channels
*ELWOOD/0
CNTLUNIT CUNUMBR=0C11, X
PATH=(CSS(1),C1,C5,DA,F1), X
LINK=(CSS(1),1D,3D,4F,66), X
UNITADD=((00,16)),UNIT=3490,CUADD=0
*$HCDC$ DESC='ELWOOD CLUSTER 0 BARR74'
TAPEE100 IODEVICE ADDRESS=(E100,16),UNIT=3490,CUNUMBR=(0C11), X
UNITADD=00,PART=(CSS(1),MVSC7,VMT07)
Example H-5 MVSCP with 496 devices connected by using the switch
IODEVICE ADDRESS=(E100,16),UNIT=3490,FEATURE=COMPACT, *
OFFLINE=YES,DYNAMIC=YES,LOCANY=YES, *
USERPRM=((LIBRARY,YES),(AUTOSWITCH,YES),(LIBRARY-ID,BA07*
4),(LIBPORT-ID,01),(MTL,NO)),CUNUMBR=0C11
IODEVICE ADDRESS=(E110,16),UNIT=3490,FEATURE=COMPACT, *
OFFLINE=YES,DYNAMIC=YES,LOCANY=YES, *
USERPRM=((LIBRARY,YES),(AUTOSWITCH,YES),(LIBRARY-ID,BA07*
4),(LIBPORT-ID,02),(MTL,NO)),CUNUMBR=0C12
IODEVICE ADDRESS=(E120,16),UNIT=3490,FEATURE=COMPACT, *
OFFLINE=YES,DYNAMIC=YES,LOCANY=YES, *
USERPRM=((LIBRARY,YES),(AUTOSWITCH,YES),(LIBRARY-ID,BA07*
4),(LIBPORT-ID,03),(MTL,NO)),CUNUMBR=0C13
IODEVICE ADDRESS=(E130,16),UNIT=3490,FEATURE=COMPACT, *
OFFLINE=YES,DYNAMIC=YES,LOCANY=YES, *
USERPRM=((LIBRARY,YES),(AUTOSWITCH,YES),(LIBRARY-ID,BA07*
4),(LIBPORT-ID,04),(MTL,NO)),CUNUMBR=0C14
IODEVICE ADDRESS=(E140,16),UNIT=3490,FEATURE=COMPACT, *
OFFLINE=YES,DYNAMIC=YES,LOCANY=YES, *
USERPRM=((LIBRARY,YES),(AUTOSWITCH,YES),(LIBRARY-ID,BA07*
4),(LIBPORT-ID,05),(MTL,NO)),CUNUMBR=0C15
IODEVICE ADDRESS=(E150,16),UNIT=3490,FEATURE=COMPACT, *
OFFLINE=YES,DYNAMIC=YES,LOCANY=YES, *
USERPRM=((LIBRARY,YES),(AUTOSWITCH,YES),(LIBRARY-ID,BA07*
4),(LIBPORT-ID,06),(MTL,NO)),CUNUMBR=0C16
IODEVICE ADDRESS=(E160,16),UNIT=3490,FEATURE=COMPACT, *
OFFLINE=YES,DYNAMIC=YES,LOCANY=YES, *
USERPRM=((LIBRARY,YES),(AUTOSWITCH,YES),(LIBRARY-ID,BA07*
4),(LIBPORT-ID,07),(MTL,NO)),CUNUMBR=0C17
IODEVICE ADDRESS=(E170,16),UNIT=3490,FEATURE=COMPACT, *
OFFLINE=YES,DYNAMIC=YES,LOCANY=YES, *
USERPRM=((LIBRARY,YES),(AUTOSWITCH,YES),(LIBRARY-ID,BA07*
4),(LIBPORT-ID,08),(MTL,NO)),CUNUMBR=0C18
IODEVICE ADDRESS=(E180,16),UNIT=3490,FEATURE=COMPACT, *
OFFLINE=YES,DYNAMIC=YES,LOCANY=YES, *
USERPRM=((LIBRARY,YES),(AUTOSWITCH,YES),(LIBRARY-ID,BA07*
4),(LIBPORT-ID,09),(MTL,NO)),CUNUMBR=0C19
IODEVICE ADDRESS=(E190,16),UNIT=3490,FEATURE=COMPACT, *
OFFLINE=YES,DYNAMIC=YES,LOCANY=YES, *
USERPRM=((LIBRARY,YES),(AUTOSWITCH,YES),(LIBRARY-ID,BA07*
4),(LIBPORT-ID,0A),(MTL,NO)),CUNUMBR=0C1A
IODEVICE ADDRESS=(E1A0,16),UNIT=3490,FEATURE=COMPACT, *
OFFLINE=YES,DYNAMIC=YES,LOCANY=YES, *
USERPRM=((LIBRARY,YES),(AUTOSWITCH,YES),(LIBRARY-ID,BA07*
4),(LIBPORT-ID,0B),(MTL,NO)),CUNUMBR=0C1B
IODEVICE ADDRESS=(E1B0,16),UNIT=3490,FEATURE=COMPACT, *
OFFLINE=YES,DYNAMIC=YES,LOCANY=YES, *
USERPRM=((LIBRARY,YES),(AUTOSWITCH,YES),(LIBRARY-ID,BA07*
4),(LIBPORT-ID,0C),(MTL,NO)),CUNUMBR=0C1C
IODEVICE ADDRESS=(E1C0,16),UNIT=3490,FEATURE=COMPACT, *
OFFLINE=YES,DYNAMIC=YES,LOCANY=YES, *
USERPRM=((LIBRARY,YES),(AUTOSWITCH,YES),(LIBRARY-ID,BA07*
4),(LIBPORT-ID,0D),(MTL,NO)),CUNUMBR=0C1D
Sharing ports
One CPC can use four ports and another can use the other four ports. You can have up to
eight CPCs, each connected to a switched or direct port. Or you can connect all CPCs to all
ports (switched). You can also have one CPC that uses all eight ports and another that uses
fewer than eight.
Example H-5 on page 915 uses only four CHPIDs/LINKs in each PATH statement. To use the other four ports that are available for a second CPC, keep those values on the first CPC, and change the values as shown in Example H-6 on the second CPC. The only differences are the PATH and LINK values, which differ between the second CPC and the first CPC.
Example H-6 IOCP statement for using four ports on a second CPC
*ELWOOD/0
CNTLUNIT CUNUMBR=0C11, X
PATH=(CSS(1),C0,C4,D9,F0), X
LINK=(CSS(1),1C,3C,4E,65), X
UNITADD=((00,16)),UNIT=3490,CUADD=0
*$HCDC$ DESC='ELWOOD CLUSTER 0 BARR74'
TAPEE100 IODEVICE ADDRESS=(E100,16),UNIT=3490,CUNUMBR=(0C11), X
UNITADD=00,PART=(CSS(1),MVSC7,VMT07)
*
Cluster  CUADD   LIBPORT-ID
0        0-1E    X'01'-X'1F'
1        0-1E    X'41'-X'5F'
2        0-1E    X'81'-X'9F'
3        0-1E    X'C1'-X'DF'
4        0-1E    X'21'-X'3F'
5        0-1E    X'61'-X'7F'
It also gives you a practical guideline for aspects of the project, such as naming conventions
and checklists. The solution is based on standard functions from IBM z/OS, Data Facility
Storage Management Subsystem Removable Media Manager (DFSMSrmm), IBM Resource
Access Control Facility (RACF), and functions available in the TS7700 2-cluster grid. A similar
implementation can be done in any single-cluster or multi-cluster grid configuration.
TS7700 R2.0 extended the manageability and usability of the cluster or grid by introducing Selective Device Access Control (SDAC). SDAC enables you to split the grid or cluster into hard partitions that are accessible by independent hosts or applications.
SDAC, also known as hard partitioning, can isolate and secure environments with various requirements and objectives, shielding them from unintended or malicious interference between hosts. This is accomplished by granting access to defined ranges of logical volumes only to selected groups of devices, at a logical control unit (LCU) granularity (also referred to as LIBPORT-ID).
An example of a real implementation of this function is provided, going through the necessary
steps to separate the environments Production (named PROD) and Test (named TEST) from
each other despite sharing the TS7700 2-cluster grid.
The setup must be as complete as possible and established in a way that generates the best
possible protection against unauthorized access to logical volumes dedicated to the other
partition. You must protect against unauthorized user access for logical volumes on PROD
from TEST and vice versa.
The function SDAC, introduced with R2.0, is used. It can be ordered as Feature Code 5271.
Establish defined naming conventions before making your definitions. This makes it easier to
logically relate all definitions and structures to one host when updates are needed.
Figure I-1 gives an overview of the setup. Updates are needed in many places. Adapt your current naming standards to your setup.
Definitions in HCD
In HCD, you define the devices that are needed for each of the hosts and connect them to the
LPARS. This case study defines 28 out of 32 CUs for PROD (2 x 224 logical devices) and 4
out of 32 CUs (2 x 32 logical devices) for TEST. The devices for Cluster 0 have addresses
from 1000 - 10FF and for Cluster 1 the values are 2000 - 20FF (Table I-1).
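For illustration, the IOCP-style definition of the first PROD logical control unit on Cluster 0 might resemble the following sketch; the CUNUMBR, PATH, and LINK values are placeholders, and only the address range 1000-100F and CUADD 0 follow the case study (compare the examples in Appendix H):
CNTLUNIT CUNUMBR=1011,                                          X
      PATH=((CSS(0),50,51)),LINK=((CSS(0),20,21)),              X
      UNITADD=((00,16)),UNIT=3490,CUADD=0
TAPE1000 IODEVICE ADDRESS=(1000,16),UNIT=3490,CUNUMBR=(1011),UNITADD=00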
Normally, you can activate the new definitions dynamically. Details regarding HCD definitions
are described in 6.4, “Hardware configuration definition” on page 228.
See the following resources for complete information about options in PARMLIB, definitions,
and commands to activate without IPL:
MVS System Commands, SA22-7627
MVS Initialization and Tuning Reference, SA22-7592
The updates are made in the following members of PARMLIB, where the suffix and the exact name of the PARMLIB data set depend on your naming standards. It is important to make the changes according to your normal change rules. If the updates are not implemented correctly, severe problems can occur at the next IPL.
IEFSSNxx. These updates apply for TEST and PROD:
– If OAM is new to the installation, the definitions in Example I-1 are required.
Tip: V 1000,AS,ON makes the specified address available for AS support. When
followed by V 1000,ONLINE, it varies the device online. Both commands must be
entered on all hosts that need device 1000 online and auto-switchable.
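If the hosts are members of the same sysplex, a sketch of entering both commands everywhere at once uses the ROUTE command (device 1000 is reused from the Tip and is only an example address):
RO *ALL,V 1000,AS,ON
RO *ALL,V 1000,ONLINE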
Note: A Parallel Sysplex must have a shared tape management system plex (TMSplex), SMSplex, and TCDB. In this case, no partitioning can be defined.
IECIOSxx. In this member, you define missing-interrupt handler (MIH) settings for specific device ranges, and you must separate the TEST updates from the PROD updates (see the sketch after this list):
– TEST updates are in Example I-4, one line for each range of devices. The
MOUNTMSG parameters ensure that the console receives the Mount Pending
message (IOS070E) if a mount is not complete within 10 minutes. You can adjust this
value. It depends on many factors, such as read/write ratio on the connected host and
available capacity in the grid.
– PROD updates are in Example I-5, one line for each range of devices.
DEVSUPxx. In this member, you define the scratch categories for each partition. You must be specific and separate the TEST updates from the PROD updates:
– DEVSUPxx for TEST is shown in Example I-6 for the categories that apply for TEST.
– DEVSUPxx for PROD is shown in Example I-7 for the categories that apply for PROD.
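As a sketch only, an IECIOSxx MIH statement of the following form enables the 10-minute mount-pending notification described above; the device-range statements themselves follow the values shown in Examples I-4 and I-5:
MIH MOUNTMSG=YES,MNTS=10:00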
DFSMSrmm definitions
In this case study, you have DFSMSrmm as the TMS. Equivalent definitions must be created if you prefer to use another vendor's TMS. These definitions can be created by using the DFSMSrmm PRTITION and OPENRULE options. PRTITION is the preferred method for partitioning. REJECT commands, although still supported, must not be used in new installations. If you use REJECT commands, convert from the REJECT commands to the PRTITION and OPENRULE commands.
See z/OS DFSMSrmm Implementation and Customization Guide, SC23-6874, for complete
information about options for DFSMSrmm. Table I-2 shows the definitions that are needed in
this specific case study. You must define the volume range that is connected to the host.
Reject use and insertion of volumes that are connected to the other host.
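The following EDGRMMxx sketch illustrates the idea for the TEST host: accept the TEST volume prefix and ignore or reject the PROD prefix. The operand combinations shown are assumptions; the definitions in Table I-2 and the DFSMSrmm guide take precedence:
PRTITION VOLUME(B*) TYPE(ALL) SMT(ACCEPT) NOSMT(ACCEPT)
PRTITION VOLUME(A*) TYPE(ALL) SMT(IGNORE) NOSMT(IGNORE)
OPENRULE VOLUME(A*) TYPE(ALL) ANYUSE(REJECT)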
JES2 definitions
If OAM is new to the hosts, OAM JCL must be defined in one of the JES2 procedure libraries.
These JCL definitions apply for TEST and for PROD as shown in Example I-10.
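As a minimal sketch only (Example I-10 contains the complete procedure; the REGION and TIME values here are assumptions), the OAM procedure starts the CBROAM program:
//OAM      PROC
//IEFPROC  EXEC PGM=CBROAM,REGION=0M,TIME=NOLIMIT
//SYSABEND DD SYSOUT=*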
The naming convention in this case defines that all TEST definitions are prefixed with TS and all PROD definitions with PR.
ACS constructs and definitions for TEST are in Table I-3. Ensure that the construct names
match the names that you define on the Management Interface (MI).
The same rules apply for access to run updates of the content in DFSMSrmm. Various
solutions can be implemented.
For more information about options, security options, and access rules, see z/OS
DFSMSrmm Implementation and Customization Guide, SC23-6874.
Automation activities
If OAM and TS7700 are new on the host, evaluate these concerns:
OAM must start after the IPL.
New messages are introduced.
Hardware errors and operator interventions occur and must be handled.
See the IBM TS7700 Series Operator Informational Messages White Paper, which is
available at the Techdocs library website:
http://www.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/WP101689
Figure I-3 Management Interface updates for logical partitioning of TEST and PROD
Defining constructs
Define the constructs on all clusters in the grid with definitions consistent with your policies.
Consider the following information:
1. Define an SC for TEST named TSSCPG0. This is an SC for volumes unlikely to be used
(level 0 or PG0), as shown in Figure I-5. The other SCs must be defined in a similar way.
Ensure that Tape Volume Cache Preference is set according to the defined SC name.
Remember: Without the use of MC from z/OS, the default is a copy in both clusters.
Figure I-8 Data Class for TEST with 6000 MB volume size
5. Define an SG for TEST (named TSCOMP1) as shown in Figure I-9. Remember your
requirements for secluded physical volumes as described in “Physical volume pools” on
page 928. The definition for PROD is not shown, but it must relate to volume pool 1. Define
the SG on both clusters in the grid.
5. Check to see that each access group’s ranges are not overlapping, as shown in
Figure I-14.
7. You can see that the ranges are assigned to their correct hosts, as shown in Figure I-16.
Volumes for PROD are in the range A00000-A99999 and volumes for TEST are in the range B00000-B99999; each partition uses its own dedicated physical volume pool (pool 1 or pool 2). Any attempt to access volumes that belong to the other host is a hard error, and the job fails with this message:
CBR4175I VOLUME 100066 LIBRARY BARR64 ACCESS GROUP DENIES MOUNT.
You can run several procedures to ensure that the setup is correct and ready for production.
Be sure that you cover the following points:
Control that all settings are as expected on z/OS and on the MI.
Configure the channels online and vary the devices online on your system.
Enter one logical volume and ensure that the volume is entered in the correct partition.
Look up the volume by using the started task OAM, or by entering the command:
D SMS,VOL,volser,DETAIL
Check the values of the scratch tape in DFSMSrmm by using dedicated windows.
Create a job that creates a tape and evaluate that the constructs are assigned correctly.
Issue D SMS,VOL,volser,DETAIL again to check the assignment of constructs from the grid.
Use the Library Request host console commands to evaluate the status on the created
private volume and the status of the physical volume to which it was copied.
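For illustration, such commands might look like the following sketch; BARR64 is the library name from the message example above, the volume serials are placeholders, and the LVOL and PVOL keywords report logical and physical volume status:
LI REQ,BARR64,LVOL,B00001
LI REQ,BARR64,PVOL,P00001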
The publications that are listed in this section are considered suitable for a more detailed
description of the topics covered in this book.
You can search for, view, download, or order these documents and other Redbooks,
Redpapers, Web Docs, draft and additional materials, at the following website:
ibm.com/redbooks
Other publications
These publications are also relevant as further information sources:
DFSMS/VM Function Level 221 Removable Media Services User’s Guide and Reference,
SC35-0141
FICON Planning and Implementation Guide, SG24-6497
IBM Encryption Key Manager component for the Java platform Introduction, Planning, and
User’s Guide, GA76-0418
IBM System Storage TS1120 and TS1130 Tape Drives and TS1120 Controller Introduction
and Planning Guide, GA32-0555
IBM System Storage TS1120 and TS1130 Tape Drives and TS1120 Controller Operator
Guide, GA32-0556
IBM TotalStorage Enterprise Tape System 3592 Operators Guide, GA32-0465
IBM TotalStorage UltraScalable Tape Library 3584 Operator Guide, GA32-0468
IBM TS3500 Tape Library with ALMS Introduction and Planning Guide, GA32-0593
Online resources
These web pages are helpful for more information:
Common Information Model (CIM)
http://www.dmtf.org/standards/cim/
IBM Business Continuity and Recovery Services
http://www.ibm.com/services/continuity
IBM Security Key Lifecycle Manager at IBM Knowledge Center:
https://www.ibm.com/support/knowledgecenter/SSWPVP
IBM TS3500 tape library at IBM Knowledge Center:
https://www.ibm.com/support/knowledgecenter/en/STCMML8/com.ibm.storage.ts3500.d
oc/ts3500_ichome.html
The documents in IBM Techdocs are active. The content is constantly changing and new
documents are being created. To ensure that you reference the newest document, search on
the Techdocs website:
http://www.ibm.com/support/techdocs/atsmastr.nsf/Web/TechDocs
From the Search drop-down list, select All of the Techdocs Library and enter TS7700 to
perform a search for all related documents.
SG24-8366-01
ISBN 0738443077
Printed in U.S.A.
ibm.com/redbooks