This article is part of the SHARE intro to the mainframe series. If you would like to contribute to this series, please reach out to editor@share.org.
Have you ever wondered how the “system” finds all of the data sets that you use and need to run your applications and jobs? There are thousands of data sets on hundreds of disk volumes or tape volumes. How does the “system” keep track of what’s where?
The answer is a concept that is commonly called a “catalog.” The term “catalog” is a generic term that represents a capability that allows the system to look up the location and characteristics of data sets that are needed. A catalog can be thought of as a data set containing location pointers to other data sets. Often, it is metaphorically thought of as a phone book.
A Short History of Catalogs
Originally, there were OS SYSCTLGs for early versions of the operating system for the 360-model computers. When Virtual Storage Access Method (VSAM) was designed, more information had to be kept for VSAM data sets; VSAM catalogs were designed to contain this information. There were many inherent problems and limitations in the VSAM catalog design.
A newer design, called Integrated Catalog Facility (ICF) Catalogs, was implemented. This new design leveraged VSAM data structures for their performance and resilience characteristics. OS SYSCTLGs and VSAM Catalogs were phased out in the year 2000 (Y2K). Today, only ICF catalogs remain.
There have been three types of z/OS (MVS) Catalogs:
- OS SYSCTLGs (also called CVOLs)
- VSAM Catalogs
- ICF Catalogs
Going beyond the concept of a catalog, which is a logical construct, we need to explore catalogs in more detail. A catalog is technically more complex and has several physical components. Currently, z/OS implements the catalog using the Integrated Catalog Facility (ICF) Catalog. The processes that access and manipulate these entries are called “Catalog Management,” and these processes have been segregated into their own Address Space called “Catalog Address Space” (CAS). ICF Catalogs, Catalog Management, and the CAS are an integral part of the z/OS operating system.
- Catalog refers to data sets that contain data set pointer entries, some data set characteristics, and System Managed Storage (SMS) characteristics.
- Catalog Management is the set of programs that create, access, and change the information about data sets in a Catalog.
- CAS is the address space that processes requests to create, access, or change catalog information.
- CAS contains all of the Catalog Management code.
- CAS interfaces with the user program and schedules work to be processed in the Catalog Address Space on the user’s behalf.
Where Do Catalogs Fit in the Mainframe Landscape?
As noted earlier, ICF Catalogs are an integral part of z/OS. If you are referencing data sets only by name, you are utilizing catalog services. Said another way, unless you specify the precise location of a data set, i.e., which volume (volume serial number) and what device type (disk or tape) a data set resides on, the operating system has to find that information in a catalog.
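The difference between naming a data set and pinpointing it can be sketched in a few lines of Python. This is only a toy model, not how z/OS implements catalog services: the `Location` class, the `resolve` function, the `TOY_CATALOG` dictionary, and every data set name and volume serial in it are hypothetical names invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Location:
    volser: str       # volume serial number
    device_type: str  # "disk" or "tape"


# Hypothetical toy catalog: data set name -> where the data set lives.
TOY_CATALOG = {
    "PAYROLL.MASTER.DATA": Location("PRD001", "disk"),
    "PAYROLL.ARCHIVE.Y2023": Location("T00042", "tape"),
}


def resolve(dsname, volser=None, device_type=None, catalog=TOY_CATALOG):
    """Return the location of a data set.

    If the caller supplies both the volume serial and the device type,
    no catalog lookup is needed. Otherwise the name is looked up.
    """
    if volser is not None and device_type is not None:
        return Location(volser, device_type)
    try:
        return catalog[dsname.upper()]
    except KeyError:
        raise LookupError(f"{dsname} is not cataloged") from None


# Referenced by name only: the catalog supplies the location.
by_name = resolve("PAYROLL.MASTER.DATA")

# Location given explicitly: the catalog is never consulted.
explicit = resolve("TEMP.WORK.FILE", volser="WORK01", device_type="disk")
```

In this sketch, a name that is neither cataloged nor accompanied by an explicit location raises an error, which mirrors the point above: without one or the other, there is nothing to tell the system where to look.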
The catalog is essential to streamlining the process of locating data. Because it is such a critical component of the operating system, several management topics need to be understood and implemented, including catalog performance, backup and recovery, and disaster recovery. These disciplines are rather involved. For now, let’s look at basic catalog structure, function, and terminology.
ICF Catalog Parts
ICF Catalogs can be divided into two structural parts: Basic Catalog Structure (BCS) and VSAM Volume Data Set (VVDS).
The BCS contains the logical information for the data set. In the case of VSAM data sets, the BCS describes all of the components associated with the VSAM cluster and its Alternate Indices (AIXs), if any. The BCS points to VVDSs and VTOCs via fields called volume cells. Without going into a detailed discussion of VSAM, suffice it to say that the BCS is a VSAM key-sequenced data set (KSDS). A BCS resides on only one volume, but entries in it can point to many volumes.
The VVDS contains the physical description of the data set. The entries in the VVDS are called VSAM Volume Records (VVRs) for VSAM data sets and Non-VSAM Volume Records (NVRs) for non-VSAM data sets. VVDSs are VSAM ESDSs (entry-sequenced data sets). The VVDS also contains SMS information for the data sets (e.g., storage class). There is one VVDS on each volume.
The BCS has the logical information about the data set and lists the volumes on which the data set resides. The VVDS contains the physical information about the data set, including such things as the control interval size of VSAM components and the physical block size of the VSAM data set. Thus, the BCS contains the location and logical characteristics, and the VVDS contains the physical characteristics.
Once the data set is identified as residing on a specific volume, the physical characteristics and Storage Management Subsystem (SMS) characteristics are retrieved from the VVDS and the Volume Table of Contents (VTOC) by the system in order to access the data set. The VTOC is a related structure that contains physical information about non-VSAM data sets (data set type, data set organization, and block size), as well as extent information for both VSAM and non-VSAM data sets, and their physical locations on the disk.
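The division of labor among the BCS, the VVDS, and the VTOC can be pictured with a small Python model. It is a deliberately simplified sketch, not the real record layouts: the class names, the `describe` function, the attribute keys, the storage class name, and the extent numbers are all assumptions made up for this example.

```python
from dataclasses import dataclass, field


@dataclass
class BcsEntry:
    """Logical view: the data set name and the volumes it resides on."""
    dsname: str
    volsers: list  # stands in for the volume cells


@dataclass
class Volume:
    """One disk volume with its own VVDS and VTOC (both heavily simplified)."""
    volser: str
    vvds: dict = field(default_factory=dict)  # dsname -> VVR/NVR-style attributes
    vtoc: dict = field(default_factory=dict)  # dsname -> list of (start, end) extents


def describe(dsname, bcs, volumes):
    """Follow the BCS entry to each volume, then read that volume's VVDS and VTOC."""
    entry = bcs[dsname]
    details = []
    for volser in entry.volsers:
        vol = volumes[volser]
        details.append({
            "volser": volser,
            "vvds": vol.vvds.get(dsname),        # physical and SMS attributes
            "extents": vol.vtoc.get(dsname, []),  # where the data sits on disk
        })
    return details


# Hypothetical two-volume VSAM component.
bcs = {"PAYROLL.KSDS.DATA": BcsEntry("PAYROLL.KSDS.DATA", ["PRD001", "PRD002"])}
volumes = {
    "PRD001": Volume(
        "PRD001",
        vvds={"PAYROLL.KSDS.DATA": {"record": "VVR", "ci_size": 4096, "storage_class": "FAST"}},
        vtoc={"PAYROLL.KSDS.DATA": [(100, 199)]},
    ),
    "PRD002": Volume(
        "PRD002",
        vvds={"PAYROLL.KSDS.DATA": {"record": "VVR", "ci_size": 4096, "storage_class": "FAST"}},
        vtoc={"PAYROLL.KSDS.DATA": [(0, 49), (300, 349)]},
    ),
}

info = describe("PAYROLL.KSDS.DATA", bcs, volumes)
```

Note how the single BCS entry fans out to two volumes, each of which answers for its own portion of the data set. That is the sense in which a BCS lives on one volume while its entries can point to many.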
z/OS Catalog Structure
The z/OS Catalog structure is only two levels deep. This is in contrast to other operating systems that use hierarchical directories to locate files.
There is one master catalog for a system, and there can be many user catalogs. The master catalog is identified by an entry in SYS1.NUCLEUS or SYS1.PARMLIB at IPL time. There is nothing else that identifies the catalog as a master catalog. Ideally, a master catalog contains only pointers to user catalogs and system data sets needed at IPL time. Typically, SYS1 data sets are cataloged in the master catalog.
User catalogs contain all other entries for VSAM and non-VSAM data sets. Data sets are often spread across several user catalogs to make the entries easier to manage, to optimize performance and backup and recovery, and to minimize single points of failure.
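The two-level search can be illustrated with a short Python sketch. It assumes that the master catalog's pointers to user catalogs are keyed on the first qualifier of the data set name (an alias), which is a simplification of what z/OS supports. The `Catalog` class, the `locate` function, the catalog names, and the volume serials are hypothetical.

```python
class Catalog:
    """A toy catalog: direct entries plus (on the master only) aliases."""

    def __init__(self, name):
        self.name = name
        self.entries = {}  # data set name -> volume serial
        self.aliases = {}  # high-level qualifier -> user Catalog


def locate(dsname, master):
    """Search at most two levels: the master, then one user catalog."""
    hlq = dsname.split(".", 1)[0]
    # If the high-level qualifier is an alias, search that user catalog;
    # otherwise the entry is expected in the master catalog itself.
    target = master.aliases.get(hlq, master)
    volser = target.entries.get(dsname)
    if volser is None:
        raise LookupError(f"{dsname} not found in {target.name}")
    return target.name, volser


# Hypothetical setup: one master catalog and one user catalog.
master = Catalog("CATALOG.MASTER")
payroll = Catalog("UCAT.PAYROLL")

master.entries["SYS1.LINKLIB"] = "SYSRES"   # system data set in the master
master.aliases["PAYROLL"] = payroll          # pointer to the user catalog
payroll.entries["PAYROLL.MASTER.DATA"] = "PRD001"

system_hit = locate("SYS1.LINKLIB", master)
user_hit = locate("PAYROLL.MASTER.DATA", master)
```

The lookup never descends further than one user catalog, unlike a path walk through nested directories. Keeping the master small, as recommended above, means most requests pass through it only long enough to find the right user catalog.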
Who Uses Catalogs?
The typical user only needs to know that the system locates data sets using catalog services. Most commonly, z/OS systems programmers, z/OS storage administrators, and z/OS system administrators need to understand how ICF catalogs work and how to maintain and manage them to ensure optimal performance and resiliency. Proper catalog placement and sizing, in addition to distribution of high-level indices, can balance I/O patterns and minimize bottlenecks.
How Would You Recommend Learning More About Catalogs?
IBM manuals and Redbooks offer deep dives and key information about ICF catalogs. There are also several training courses offered by IBM and other vendors, and there are sessions in the SHARE proceedings that discuss various aspects of ICF catalogs.
The primary resources for learning more about ICF catalogs are the following IBM publications:
Janet Sun is a past SHARE president and IBM Champion.