Vsam 1.0
History
An access method is an interface between the application program and the physical
operation of storage devices. It is a component of the operating system. VSAM
(Virtual Storage Access Method) is the first access method that efficiently uses the
virtual storage of MVS. It can manipulate only data that resides on a DASD (Direct
Access Storage Device).
KSDS (Key Sequenced Data Set) replaced ISAM (Indexed Sequential Access Method)
RRDS (Relative Record Data Set) replaced BDAM (Basic Direct Access Method)
ESDS (Entry Sequenced Data Set) provides the same function as a normal sequential
QSAM (Queued Sequential Access Method) file.
Initially VSAM had only ESDS and KSDS. RRDS and the alternate index to KSDS
were introduced in 1975. DF/EF (Data Facility Extended Function) VSAM was
introduced in 1979 with the Integrated Catalog Facility (ICF) to replace the old VSAM
catalog of the previous versions.
The latest version, DFP/VSAM 3.3, was released in 1991. It contains enhancements
such as variable-length record support for RRDS and additional DFSMS facilities.
Advantages of VSAM
1. Data retrieval is faster because of an efficiently organized index. The index is
small because it uses a key compression algorithm.
2. Insertion of records is easy because of the free space embedded in the cluster.
3. Records can be physically deleted, and the space they occupied can be reused for
other records without reorganization.
4. VSAM datasets can be shared across regions and systems.
5. Datasets can be physically distributed over several volumes based on key ranges.
6. VSAM is independent of storage device types.
7. Information about VSAM datasets is stored centrally in the VSAM catalog, so a
reference to a VSAM dataset need not be detailed in JCL.
Disadvantages of VSAM
1. To allow easy manipulation of records, free space should be left in the dataset,
and this increases the space required.
2. Integrity of the dataset across regions and systems needs to be controlled by the
user.
Mainframe Refresher Part-1 VSAM-Page:2
CLUSTER
A cluster can be thought of as a logical dataset consisting of two separate
physical datasets:
1. The data component (contains the actual data).
2. The index component (contains the actual index).
All types of VSAM datasets are called clusters even though KSDS is the only
type that fulfills the cluster concept. ESDS and RRDS do not have an index component.
Data Component.
RDF – Record Definition Field – a 3-byte field. For fixed-length records there
are two RDFs: the first contains the number of records in the control interval
and the second contains the record length. For variable-length records, the
number of RDFs varies depending on how many adjacent records in the CI have
the same length. If no two adjacent records are of the same length, one RDF is
needed to describe each record.
Example
1. In the example below, there are four control areas and every control area contains
two control intervals. Control fields are not shown in the diagram. There is one
sequence set for every control area, so there are four sequence sets. There are two
levels of index set. The second level of the index set contains pointers to the
sequence sets.
2. Control Interval Split: When a record with key 22 is added by the program, it
should physically be stored between the existing records 21 and 23. Records 21 and 23
are in the first control interval of control area-3, and there is no more space
available there to store the new record. So a control interval split occurs. Records
20 and 21 remain in the current control interval. Records 22 and 23 are moved to any
free control interval. In our case control interval 2 is free, so they are moved there
and the index set is updated accordingly.
3. Control Area Split: When a record with key 3 is added by the program, it should be
placed between 2 and 4. These records are in the first control interval of the first
control area and there is no free space, so a control interval split is expected. But
there is no free control interval in control area-1, so a control area split occurs.
A new control area is allocated, half the records of control area-1 are moved there,
and the indexes are updated accordingly.
To answer this question, you should know how indexes are organized and
read. We have already seen how they are organized. In the example below, to read
the 50th record, the root index is read first, and it shows that the record lies on
the right-hand side. The second I/O gets the sequence set and the third I/O gets
the location of the record. I get my record in the fourth I/O instead of 51 I/Os (50
I/Os in the index and 1 I/O to get the data).
If I am accessing the first record, a sequential read needs only one I/O, but
my random read obviously needs more. So we prefer indexed organization only when
the number of records is significant.
[Figure: KSDS example. First-level index entries: *8 *17 *23 *50. Data component
records with keys 1, 2, 4, 5, 7, 8, 12, 14, 17, 20, 21, 23, 33, 35 and 50.]
VSAM LDS Properties: Linear datasets have no records. They are just long strings of
bytes. Because of this, an LDS has no FSPC, US, CIDF or RDF. The CISZ is always 4K
bytes. The main use of an LDS is to implement other data access organizations,
especially a relational database system like DB2.
As the data in an LDS is manipulated, the operating system automatically
pages the portions of the dataset being worked on in and out. The dataset is
addressed by RBA as if it were in memory, and the system pages the needed
pages in and out. Thus the process is very simple and fast.
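As a rough sketch of how such a dataset might be defined, the IDCAMS step below uses the LINEAR keyword of DEFINE CLUSTER. The dataset name HLQ.SAMPLE.LDS, the volume VOL001 and the space values are hypothetical placeholders, and a JOB statement suited to your site is assumed to precede the step.

```jcl
//DEFLDS   EXEC PGM=IDCAMS
//SYSPRINT DD SYSOUT=*
//SYSIN    DD *
  /* Hypothetical name, volume and space values, for illustration only */
  DEFINE CLUSTER -
         (NAME(HLQ.SAMPLE.LDS) -
          LINEAR -
          CYLINDERS(5 1) -
          VOLUMES(VOL001))
/*
```

No KEYS, RECORDSIZE or FREESPACE is coded because a linear dataset has no records.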
2. The keyword ‘DATASET’ points directly to the physical dataset, and IDCAMS
allocates this file dynamically for the operation.
DEFINE CLUSTER
This command is used to create and name a VSAM Cluster.
DEFINE CLUSTER - NAME
This parameter specifies the name of the VSAM cluster. The cluster name
becomes the dataset name in any JCL that invokes the cluster.
When the DATA and INDEX parameters are coded to create the data and index
components, the NAME parameter is coded for them as well. If the NAME parameter
is omitted for the data and index components, VSAM tries to append .DATA or .INDEX
(or part of it), as appropriate, as the low-level qualifier, depending on how many
characters the dataset name already contains, while staying within the 44-character
limit.
If the data and index components are named, parameter values can be applied to
them separately. This gives performance advantages for large datasets.
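The following step is a minimal sketch of a KSDS definition in which the data and index components are named explicitly, so that parameters such as the control interval size and free space can be set on the data component alone. All dataset names (HLQ.CUST.MASTER and its components), the volume VOL001 and the numeric values are assumptions chosen for illustration, not recommendations.

```jcl
//DEFKSDS  EXEC PGM=IDCAMS
//SYSPRINT DD SYSOUT=*
//SYSIN    DD *
  /* Hypothetical names and sizes, for illustration only */
  DEFINE CLUSTER -
         (NAME(HLQ.CUST.MASTER) -
          INDEXED -
          KEYS(8 0) -
          RECORDSIZE(80 80) -
          CYLINDERS(5 1) -
          VOLUMES(VOL001)) -
         DATA -
         (NAME(HLQ.CUST.MASTER.DATA) -
          CONTROLINTERVALSIZE(4096) -
          FREESPACE(20 10)) -
         INDEX -
         (NAME(HLQ.CUST.MASTER.INDEX))
/*
```

In JCL the dataset is still referenced by the cluster name, HLQ.CUST.MASTER in this sketch.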
VSAM calculates the control area size internally. A control area of one
cylinder, the largest permitted by VSAM, usually yields the best performance. So it is
always better to allocate space in cylinders because this ensures a CA size of one
cylinder.
The RECORDS parameter is used to allocate space in units of records for small
datasets. When this is done, the RECORDSIZE parameter must be specified.
If allocation is specified in units of KILOBYTES or MEGABYTES, VSAM reserves space
on the minimum number of tracks it needs to satisfy the request.
For an ESDS (since it is processed sequentially) the CISZ should be relatively large
depending on the size of the record.
FREESPACE(ci% ca%)   control interval and control area
FREESPACE(ci%)       control interval only
FREESPACE(0 ca%)     control area only
FREESPACE(0 0)       the default
FREESPACE(100 100) means only one record will be loaded in every control interval
and only one control interval will be loaded in every control area.
In order to effectively allocate FREESPACE the following factors have to be taken into
consideration.
1. The expected rate of growth: If even growth is expected apply FREESPACE to both
CI and CA. If uneven growth is expected apply FREESPACE only to the CA.
2. The expected number of records to be deleted.
3. How often the dataset will be reorganized with REPRO.
4. The performance requirements.
DEFINE CLUSTER -
  (NAME(NTCI.V.UE4.W20000.T30.AV.DW200006) -
   CYL(5 1) -
   KEYS(8 0) -
   RECSZ(80 80) -
   KEYRANGES ((00000001 29999999) -
              (30000000 47000000) -
              (47000001 99999999)) -
   VOLUMES (NTTSOB -
            NTTSOJ -
            NTTSO5) -
   ORDERED -
   NOREUSE -
   INDEXED
_ _ _ more parameters
When the ORDERED parameter is coded the number of VOLUMES and KEYRANGES
must be the same.
Password Protection
VSAM datasets can be password protected at four different levels. Each level
gives a different access capability to the dataset. The levels are:
1. READPW - provides read-only capability.
2. UPDATEPW - records may be read, updated, added or deleted at this level.
3. CONTROLPW - provides the programmer with the access capabilities of
READPW and UPDATEPW.
4. MASTERPW - provides all the above operations plus the authority to delete
the dataset.
Passwords provided at the cluster level protect the dataset only if access requires
using the cluster’s name as the dataset name. It is therefore advisable to protect the
data and index components with passwords as well, because someone could otherwise
access them by name. Another feature of MVS, the Resource Access Control Facility
(RACF), ignores VSAM passwords and imposes its own security. For most VSAM
datasets RACF security is sufficient.
The ATTEMPTS parameter, coded with the password parameters, specifies the
number of attempts the operator is permitted to enter the password before the
step abends.
The CODE parameter specifies a code to display to the operator in place of the
entry name prompt.
The AUTHORIZATION parameter provides additional security by naming an
assembler User Security Verification Routine (USVR). Its subparameter, enclosed in
parentheses, is the entry point of the routine.
Command Syntax
REPRO -
INFILE(DDNAME) | INDATASET(DATASET-NAME) -
OUTFILE(DDNAME) | OUTDATASET(DATASET-NAME) -
optional parameters
When loading a KSDS using REPRO, the input data should be first sorted in
ascending sequence by the field that will become the primary key in the output
dataset. When loading an ESDS the sort step can be eliminated since the records are
loaded in entry sequence. For an RRDS, the records are loaded in relative record
sequence starting with 1. The dataset should be sorted on the field that correlates to
the relative record number.
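A minimal sketch of a KSDS load with REPRO might look like the step below. It assumes the input file has already been sorted in ascending key order and that the target KSDS has already been defined and is empty. The DD names SEQIN and KSDSOUT and both dataset names are hypothetical.

```jcl
//LOADKSDS EXEC PGM=IDCAMS
//SYSPRINT DD SYSOUT=*
//* Input: sequential file already sorted on the primary key field
//SEQIN    DD DSN=HLQ.CUST.SORTED,DISP=SHR
//* Output: previously defined, empty KSDS
//KSDSOUT  DD DSN=HLQ.CUST.MASTER,DISP=OLD
//SYSIN    DD *
  REPRO INFILE(SEQIN) -
        OUTFILE(KSDSOUT)
/*
```

INDATASET and OUTDATASET could be coded instead of the two DD statements, in which case IDCAMS allocates the datasets dynamically.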
HEX format prints each character in the dataset as two hexadecimal digits.
A maximum of 120 hexadecimal digits are printed on each line, an equivalent of 60
characters.
PRINT INDATASET(MM01.CUSTOMER.MASTER) -
CHARACTER -
SKIP(28) -
COUNT(3)
DELETE - FORCE(FRC)/NOFORCE(NFRC)
It specifies whether objects that are not empty should be deleted.
FORCE allows you to delete data spaces, generation data groups, and user
catalogs without first ensuring that these objects are empty.
NOFORCE causes the DELETE command to terminate when you request the
deletion of a data space, generation data group, or catalog that is not empty.
DELETE - FILE(DDNAME)
It specifies the name of the DD statement that identifies:
1. The volume that contains a unique data set to be deleted.
2. The partitioned data set from which a member (or members) is to be deleted.
3. The data set to be deleted if ERASE is specified.
4. The volume that contains the data space to be deleted.
5. The catalog recovery volume(s) when the entry being deleted is in a recoverable
catalog. If the volumes are of a different device type, concatenated DD statements
must be used. The catalog recovery volume is the volume whose recovery space
contains a copy of the entry being deleted.
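The two commands below sketch how the ERASE and FORCE options might be coded on DELETE. The entry names HLQ.CUST.MASTER and HLQ.PAYROLL.GDG are hypothetical, and the step assumes the entries can be located through the catalog, so no FILE parameter is coded.

```jcl
//DELSTEP  EXEC PGM=IDCAMS
//SYSPRINT DD SYSOUT=*
//SYSIN    DD *
  /* Delete a KSDS and overwrite its data component with binary zeros */
  DELETE HLQ.CUST.MASTER -
         CLUSTER -
         ERASE
  /* Delete a generation data group even though it is not empty */
  DELETE HLQ.PAYROLL.GDG -
         GENERATIONDATAGROUP -
         FORCE
/*
```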
LISTCAT stands for LISTing a CATalog entry. It lists the attributes and
characteristics of all VSAM and non-VSAM objects cataloged in a VSAM or ICF
catalog. Such objects can be the catalog itself, its aliases, the volumes it owns,
clusters, alternate indexes, paths, GDGs, non-VSAM files and so on. The listing also
provides statistics about a VSAM object from the time of its allocation, namely the
number of CI and CA splits, the number of I/Os on the index and data components,
and the number of records added, deleted and retrieved, besides other useful
information.
Syntax:
LISTCAT ENTRIES(OBJECT-NAME) ALL|ALLOCATION|VOLUME|HISTORY|NAME

Parameter    Meaning
NAME         Lists the name and type of entry.
HISTORY      Lists reference information for the object including name, type of
             entry, creation and expiration date and the release of VSAM under
             which it was created.
VOLUME       Lists the device type and one or more volume serial numbers of the
             storage volumes where the dataset resides. HISTORY information
             is also listed.
ALLOCATION   Lists information that has been specified for space allocation
             including the unit (cylinders, tracks etc.), number of allocated
             units of primary and secondary space and actual extents. This is
             displayed only for data and index component entries. If
             ALLOCATION is specified, VOLUME and HISTORY are included.
ALL          All the above details are listed.
Example
//SMSXL861 JOB (36512),'MUTHU',NOTIFY=&SYSUID
//LISTCAT  EXEC PGM=IDCAMS
//SYSPRINT DD SYSOUT=*
//SYSIN    DD *
  LISTCAT ENTRIES(SMSXL86.PAYROLL.MASTER) -
          GDG -
          ALL
/*
Entry-name is the name of the object to be exported. OUTFILE names the DD
statement of the dataset that receives the exported copy.
EXPORT - INHIBITSOURCE|NOINHIBITSOURCE
It specifies whether the original data records (the data records of the source
cluster or alternate index) can be accessed for any operation other than retrieval
after a copy is exported. This specification can later be altered through the ALTER
command.
INHIBITSOURCE (INHS) - cannot be accessed for any operation other than retrieval.
NOINHIBITSOURCE - original data records in the original system can be accessed for
any kind of operation.
EXPORT - TEMPORARY|PERMANENT
It specifies whether the cluster or alternate index to be exported is to
be deleted from the original system.
TEMPORARY specifies that the cluster or alternate index is not to be deleted
from the original system. The object in the original system is marked as temporary
to indicate that another copy exists and that the original copy can be replaced.
PERMANENT specifies that the cluster or alternate index is to be deleted from
the original system. Its storage space is freed. If its retention period has not yet
expired, you must also code PURGE. PERMANENT is the default.
EXPORT - ERASE|NOERASE
This specifies whether the data component of the cluster or alternate index to
be exported is to be erased (overwritten with binary zeros) or not.
With ERASE specification, the data component is overwritten with binary
zeros when the cluster or alternate index is deleted.
With NOERASE specification, the data component is not overwritten with
binary zeros when the cluster or alternate index is deleted.
Example:
//EXPORT   EXEC PGM=IDCAMS
//SYSPRINT DD SYSOUT=*
//DD2      DD DSN=SMSXL86.LIB.KSDS.BACKUP(+1),
//            DISP=(NEW,CATLG,DELETE),UNIT=TAPE,
//            VOL=SER=121212,LABEL=(1,SL),
//            DCB=(RECFM=FB,LRECL=80)
//SYSIN    DD *
  EXPORT A2000.LIB.KSDS.CLUSTER -
         OUTFILE(DD2)
/*
IMPORT - OBJECTS
When the OBJECTS parameter is coded the attributes of the new target
dataset can be changed. These attributes include VOLUMES and KEYRANGES.
IDCAMS-VERIFY Command
If a job terminates abnormally and a VSAM dataset is not closed, the catalog
entry for the dataset is flagged to indicate that the dataset may be corrupt (the
end of file or last key may not have been updated properly in the index). In such a
case you would get a VSAM status code of ‘97’ when the program opens the file.
Before the dataset can be opened again, the VERIFY command must be used to
correctly identify the end of the dataset and reset the catalog entry. An alternative
is to open the file in File-AID in edit mode and simply save it, which updates the
index with the end of file.
Both the base cluster and an alternate index can be verified. Verifying the base
cluster does not verify its alternate indexes, so each of them must be verified
separately.
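A minimal sketch of such a VERIFY step is shown below. The dataset names HLQ.CUST.MASTER and HLQ.CUST.MASTER.AIX are hypothetical; the second command illustrates that an alternate index needs its own VERIFY.

```jcl
//VERIFY   EXEC PGM=IDCAMS
//SYSPRINT DD SYSOUT=*
//SYSIN    DD *
  /* Hypothetical base cluster */
  VERIFY DATASET(HLQ.CUST.MASTER)
  /* Each alternate index is verified separately */
  VERIFY DATASET(HLQ.CUST.MASTER.AIX)
/*
```

VERIFY FILE(ddname) could be coded instead if the dataset is allocated through a DD statement.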
Example
JCL for an AMS job with comments that runs an ALTER command
LASTCC contains the condition code of the most recently executed functional
command. MAXCC contains the maximum condition code from the previously
executed functional commands. MAXCC is returned as the step return code in JCL.
The SET command is used to reset MAXCC or LASTCC within AMS.
IF-THEN-ELSE statements are used to control the command execution
sequence.
In the example below, the REPRO command loads data from a sequential dataset
into a KSDS. The LISTCAT command is executed only if the condition code of the
REPRO command is zero. Otherwise the KSDS is deleted. MAXCC is set to zero at
the end to avoid a non-zero return code.
//SYSIN    DD *
  REPRO INDATASET(SMSXL86.DATA.TEST) -
        OUTDATASET(SMSXL86.TEST.KSDS)
  IF LASTCC = 0 -
    THEN -
      LISTCAT -
        ENTRIES(SMSXL86.TEST.KSDS) ALL
    ELSE -
      DELETE SMSXL86.TEST.KSDS
  SET MAXCC = 0
/*
ALTERNATE INDEX
Steps Involved
Parameter      Meaning
RELATE         Relates the AIX to the base cluster.
NONUNIQUEKEY/  Duplicates are allowed / not allowed in the alternate key.
UNIQUEKEY
KEYS           Defines the length and offset of the alternate key in the base
               cluster.
UPGRADE        Adds the AIX cluster to the upgrade set of the base cluster.
               Whenever the base is modified, its upgrade set is also modified.
               UPGRADE is the default. NOUPGRADE does not add the AIX to the
               upgrade set of the base cluster.
RECORDSIZE     Specifies the record size of the alternate index record. It is
               calculated using the formula in the next table.
Byte-1      Type of cluster; X'00' indicates ESDS and X'01' indicates KSDS.
Byte-2      Length of the base cluster pointers in the alternate index; the
            primary key length for KSDS and X'04' for ESDS.
Byte-3 and  Halfword binary number of occurrences of primary key pointers
Byte-4      in the alternate index record. X'0001' for a unique alternate key.
Byte-5      Length of the alternate key.
Step2. BLDINDEX
The alternate index should contain all the alternate keys with their corresponding
primary key pointers. After the AIX is defined, this information must be loaded
from the base cluster. Only then can the records be accessed through the AIX.
BLDINDEX performs this load operation.
1. INFILE and OUTFILE point to the base cluster and the alternate index cluster respectively.
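A BLDINDEX step along these lines could be sketched as follows. The DD names BASEDD and AIXDD and the dataset names are hypothetical; the step assumes the base cluster is already loaded and the alternate index has already been defined.

```jcl
//BLDAIX   EXEC PGM=IDCAMS
//SYSPRINT DD SYSOUT=*
//* Base cluster (input to BLDINDEX)
//BASEDD   DD DSN=HLQ.EMP.MASTER,DISP=OLD
//* Alternate index cluster (output of BLDINDEX)
//AIXDD    DD DSN=HLQ.EMP.MASTER.AIX,DISP=OLD
//SYSIN    DD *
  BLDINDEX INFILE(BASEDD) -
           OUTFILE(AIXDD)
/*
```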
Example:
RELATE(MMA2.EMPMAST) -
KEYS(9 12) -
UNIQUEKEY -
UPGRADE -
REUSE -
VOLUMES(MPS800) ) -
DATA ( NAME(MMA2.EMPMAST.SSN.AIX.DATA) -
CYLINDERS(1 1) ) -
INDEX ( NAME(MMA2.EMPMAST.SSN.AIX.INDEX) )
The order of build index and definition of PATH does not matter.
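As a minimal sketch, a path over an alternate index might be defined as below. The path name HLQ.EMP.MASTER.PATH and the alternate index name HLQ.EMP.MASTER.AIX are hypothetical and would be replaced by the names used when the AIX was defined.

```jcl
//DEFPATH  EXEC PGM=IDCAMS
//SYSPRINT DD SYSOUT=*
//SYSIN    DD *
  /* PATHENTRY names the alternate index that the path goes through */
  DEFINE PATH -
         (NAME(HLQ.EMP.MASTER.PATH) -
          PATHENTRY(HLQ.EMP.MASTER.AIX) -
          UPDATE)
/*
```

An application program then refers to the path name in its DD statement to read the base cluster by alternate key.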
AMP is most often used to allocate I/O buffers for the index and data components for
optimizing performance.
AMORG
This parameter, which stands for Access Method ORGanization, indicates that
the particular DD statement refers to a VSAM dataset.
BUFND
This parameter gives the number of I/O buffers needed for the data
component of the cluster. The size of each buffer is the size of the data CI.
The default value is two data buffers one of which is used only during CI/CA splits.
Therefore the number of data buffers left for normal processing is one.
If more data buffers are allocated, then performance of sequential processing will
improve significantly.
BUFNI
This parameter gives the number of I/O buffers needed for the index
component of the cluster. The size of each buffer is the size of the index CI. This
sub-parameter may be coded only for a KSDS because ESDS and RRDS do not have
index components. The default value is one index buffer.
If more index buffers are allocated, the performance of random processing
will improve significantly.
BUFSP
This parameter indicates the number of bytes for data and index component
buffers. If this value is more than the value given in the BUFFERSPACE parameter of
DEFINE CLUSTER, it overrides BUFFERSPACE. Otherwise BUFFERSPACE takes
precedence. The value of BUFSP is calculated as
BUFSP = (BUFND x data CI size) + (BUFNI x index CI size)
However, it is recommended not to code this parameter and to let VSAM perform the
calculation from the BUFND and BUFNI values instead.
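A DD statement that requests extra buffers through AMP might be sketched as follows. The DD name CUSTMAST, the dataset name and the buffer counts are arbitrary assumptions for illustration; suitable values depend on the access pattern and on the CI sizes of the dataset.

```jcl
//* Hypothetical KSDS opened with 10 data buffers and 4 index buffers
//CUSTMAST DD DSN=HLQ.CUST.MASTER,DISP=SHR,
//            AMP=('BUFND=10,BUFNI=4')
```

BUFSP is left out here so that VSAM derives the buffer space from BUFND and BUFNI.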
If SMS is active, then VSAM datasets can be created in JCL without using IDCAMS as
below:
//KSDSFILE DD DSN=DEVI.CUST.MASTER,DISP=(NEW,CATLG,DELETE),
// SPACE=(CYL,(10,10)),
// LRECL=100,KEYOFF=10,KEYLEN=12,RECORG=KS
RECORG can also be ES (for entry sequenced datasets), RR (for relative record
datasets) and LS (for linear datasets). Other parameters of DEFINE CLUSTER are
assigned default values, or you can additionally code the SMS parameter DATACLAS
to name a data class that is defined with predefined values.
INTERVIEW Questions