Data File Conversion
File Contains Multiple Record Types
http://www.simotime.com
When technology complements business
Copyright © 1987-2012 SimoTime Enterprises All Rights Reserved
This document provides information about converting the content of a sequential file from EBCDIC to ASCII while maintaining mainframe numeric encoded formats. The sequential file also contains multiple record types. The programs are COBOL and will run on an IBM Mainframe. With Micro Focus technology the programs will run on the Windows and UNIX platforms that are supported by Micro Focus technology. JCL members are provided for execution on the mainframe. Command files are provided for executing on a Windows platform. UNIX script files have not been provided.
We would like to thank Larry Simmons of Micro Focus for providing us with much of the information in this document. His knowledge and expertise were greatly appreciated.
This section will focus on how to optionally prepare a data file for transfer on the sending computer system, how to transfer a data file and how to restore a data file to the format required on the receiving computer system.
If Micro Focus Mainframe Access (MFA) is available, the file transfer process is greatly simplified. Since MFA may not be available during a Proof of Concept effort, this document will focus on using FTP as the file transfer vehicle. Since many of the data files have packed and binary fields, the file content conversion between EBCDIC and ASCII will need to be done at the field level.
If the requirement is to do the conversion on a Windows client machine that is used for development and testing, and the data is limited to sequential or indexed files, then the Data File Converter (DFCONV) from Micro Focus may offer a cost-effective and convenient solution since it is included in the Mainframe Express and Net Express products. However, Micro Focus would need to include all of the DFCONV components in its various run-time offerings to allow implementation in production environments. Also, at the time of writing, DFCONV was not available on the Mainframe or UNIX platforms.
The common technology used to transfer files between computer systems is the File Transfer Protocol (FTP). When transferring sequential files, source code or VSAM data sets between a Mainframe and a WinTel system the Mainframe Access (MFA) provided by Micro Focus offers a better solution than FTP. Both of these technologies and the methodologies are discussed in more detail in the following two sections.
Mainframe Access (MFA) is provided by Micro Focus and has a client component and a mainframe server component. It can transfer source code, sequential files and VSAM data sets between a Mainframe (running MVS, OS/390 or z/OS) and a WinTel environment. MFA has both a Graphical User Interface and a Command Line Interface.
The Graphical User Interface offers a Windows Explorer type of view of the Mainframe. With a simple point-and-click, a source member, sequential file or VSAM data set may be dragged and dropped from the mainframe into a directory on the WinTel system. To automate this function the Command Line Interface may be used. The source members are automatically converted between EBCDIC and ASCII. The data files are transferred with their EBCDIC-encoded content and mainframe numeric-encoding schemes maintained in their original formats.
Note: The MFA Server component is not available for mainframes running VSE.
FTP is the common methodology used to exchange files between computers on the Internet. The FTP technology is available for WinTel, UNIX and the Mainframe (MVS, OS/390, z/OS and VSE).
For sequential file transfers FTP offers an easy and simple solution. A VSAM Keyed Sequential Data Set (KSDS) must be converted (or copied) to a sequential file before FTP can be used to transfer the file to a WinTel or UNIX system. There are a number of ways to do this but it is usually done using the REPRO function of IDCAMS. This process is discussed in more detail in a following section of this document.
For sequential files with variable length records the use of special FTP statements for the mainframe will be required in order to maintain the Record Descriptor Word (RDW) and optional Block Descriptor Word (BDW). This process is discussed in more detail in a following section of this document.
The following is a simple FTP script for transferring (from a mainframe system to a WinTel system) a sequential data file with text, packed and binary values. The file will be created on the WinTel system with EBCDIC encoded content and mainframe numeric formats.
    userid
    password
    CD ..
    PWD
    BINARY
    GET SIMOTIME.DATA.DATAFILE c:\CONLIB01\FTPLIB01\DATAFILE.DAT
    QUIT
The preceding script is executed from the WinTel system. To execute from the mainframe the GET statement would need to be changed to a PUT statement.
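The same binary download can be scripted rather than typed into an FTP client. The following is a minimal sketch in Python using the standard ftplib module; it is not part of the SimoTime samples. The host name, the function names and the local file name are assumptions made for illustration.

```python
from ftplib import FTP


def download_binary(ftp, remote_name, out_file):
    """Fetch remote_name in binary (image) mode and write the raw bytes to out_file.

    retrbinary issues TYPE I before the RETR, so the server performs no
    EBCDIC/ASCII translation and packed or binary fields arrive unchanged.
    """
    ftp.retrbinary("RETR " + remote_name, out_file.write)


def main():
    # Hypothetical host and credentials, shown only for illustration.
    ftp = FTP("mainframe.example.com")
    ftp.login("userid", "password")
    try:
        ftp.cwd("..")  # mirrors the CD .. statement in the script above
        with open("DATAFILE.DAT", "wb") as out_file:
            download_binary(ftp, "SIMOTIME.DATA.DATAFILE", out_file)
    finally:
        ftp.quit()
```

Calling main() with real connection details performs the transfer. The local file is opened in "wb" mode so that the client side does not alter the bytes either.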
This type of conversion may be based on a business requirement but is quite often required prior to the file transfer process. For example, when using FTP it will be necessary to convert a VSAM KSDS to a sequential file in order to download the data. Once the sequential file is downloaded it must be converted to a keyed, indexed file in Micro Focus format.
The need to do queries of data is driving format conversions from VSAM Keyed Sequential Data Sets to relational database models. This type of conversion is beyond the scope of this document.
Other types of format conversions usually involve changes in field sizes that result in a change in the total record size or the key field of a keyed-indexed data set.
File format conversions may be required for two reasons. The first is based on the requirements of the file transfer process. For example, FTP needs the data to be stored in a sequential file. The second is based on the file formats available on the receiving system.
Changing the content or encoding of data within a record is referred to as File Content conversion. In some cases file content conversion may result in a concurrent file format conversion.
With the exception of source code, control files and files that contain printed information most of the mainframe data files will contain packed or binary data. This creates the requirement for being able to do content conversion of the individual records within a file at the field level. The conversion of the text based data between EBCDIC and ASCII is the most requested type of content conversion.
An example of the type of content conversion that is driven by business would be to personalize the Name, Address information stored on the mainframe in upper case to an upper and lower case format.
Since the mainframe is an EBCDIC-encoded system and the WinTel or UNIX systems are ASCII-encoded systems it may be necessary to convert the data files from EBCDIC to ASCII.
Note: Micro Focus can run a COBOL-oriented application on a WinTel or UNIX platform in EBCDIC mode. However, if the data (or a subset of the data) will be imported into an Excel spreadsheet or exported to other non-COBOL applications or technologies it will be necessary to convert the data from EBCDIC to ASCII.
This section describes the three (3) file types that are the focus of this document and provides references to coding examples. The effort involved to transfer a file varies depending on the file type.
Since many of the data files on the mainframe contain packed, binary and signed numeric fields it is necessary to do most of the content conversions at the field level. The effort of converting the file content (i.e. EBCDIC to ASCII) depends on the number of numeric fields and the format of the numeric fields. To convert the content of any of the three file types from EBCDIC to ASCII will require additional processing. The recommended approach for all the file types is to read the EBCDIC-encoded file and create a new ASCII-encoded file.
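The field-level approach can be sketched as follows. This is a minimal illustration in Python, not the COBOL conversion program used by the samples. The 15-byte record layout, the function names and the choice of code page 037 (Python's "cp037" codec) are assumptions; the actual layout comes from the COBOL copy file and the code page from the mainframe configuration.

```python
# Hypothetical 15-byte record layout, for illustration only:
#   bytes 0-9    customer name, PIC X(10)           -> translate to ASCII
#   bytes 10-12  amount, PIC S9(5) COMP-3 (packed)  -> copy unchanged
#   bytes 13-14  count, PIC S9(4) COMP (binary)     -> copy unchanged
TEXT_FIELDS = [(0, 10)]


def convert_record(record, text_fields=TEXT_FIELDS, ebcdic_codec="cp037"):
    """Translate only the text fields; packed and binary bytes are left alone."""
    out = bytearray(record)
    for start, end in text_fields:
        text = record[start:end].decode(ebcdic_codec)
        # Characters with no ASCII equivalent become '?', so the length is kept.
        out[start:end] = text.encode("ascii", errors="replace")
    return bytes(out)


def unpack_comp3(raw):
    """Decode a packed-decimal field: two digits per byte, sign in the last nibble."""
    value = 0
    for byte in raw[:-1]:
        value = value * 100 + (byte >> 4) * 10 + (byte & 0x0F)
    value = value * 10 + (raw[-1] >> 4)
    # Sign nibbles 0xB and 0xD mean negative; 0xA, 0xC, 0xE and 0xF mean positive.
    return -value if (raw[-1] & 0x0F) in (0x0B, 0x0D) else value


record = "SMITH     ".encode("cp037") + b"\x12\x34\x5c" + b"\x00\x2a"
converted = convert_record(record)
```

After the call the name field holds ASCII text while the packed field still decodes with unpack_comp3 exactly as it did before the conversion. Translating the whole record as if it were text would instead rewrite the packed and binary bytes and corrupt their values.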
A sequential file could be converted in place but this is not recommended. A keyed-index file conversion process must read the EBCDIC-encoded file and write a new ASCII-encoded file.
An attempt to update in place would result in new records with the new ASCII-encoded key being added to the file. An example of this conversion process is provided in a later section of this document.
This is the easiest of file types to transfer. The file must be downloaded in BINARY. This will maintain the EBCDIC-encoding and the mainframe numeric formats for packed and binary data.
Once the file is downloaded it may be accessed using the Data File Editor in Mainframe Express or Net Express. COBOL programs that access the file may also be downloaded and compiled with Mainframe Express or Net Express. For Mainframe Express the directives are already set and the default compile option is EBCDIC mode. For Net Express it will be necessary to use the IBMCOMP, NOTRUNC, ASSIGN(EXTERNAL) and CHARSET(EBCDIC) directives.
This is a simple FTP transfer but will require an FTP statement that is unique to the mainframe in order to transfer the file and retain the information about the length of each record within the file. The file must be downloaded in BINARY. This will maintain the EBCDIC-encoded content and the mainframe numeric formats for packed and binary data.
Once the file is downloaded it is a mirror of the mainframe format and must be converted to a Micro Focus format for variable length files. See the examples section of this document for more information.
If MFA is available, simply download the file and it will be converted from a mainframe VSAM KSDS to a Micro Focus Keyed Indexed file.
A VSAM Keyed Sequential Data Set (KSDS) must be converted to a sequential file before FTP can be used to transfer the data to a WinTel or UNIX system. There are a number of ways to do this but it is usually done using the REPRO function of IDCAMS. The new sequential file can then be transferred using FTP. Once the sequential file has been transferred it must be converted to a Micro Focus Keyed Index file. If the key is an alpha-numeric key then special considerations must be taken to allow for the differences in the sorting or collating sequence of EBCDIC and ASCII. See the examples section of this document for more information.
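The collating-sequence difference is easy to demonstrate. The following Python sketch sorts the same three keys by their ASCII values and by their EBCDIC values. The sample keys and the use of code page 037 ("cp037") are assumptions chosen for illustration.

```python
def ebcdic_sort_key(key, ebcdic_codec="cp037"):
    """Return the EBCDIC byte string for a key so sorting follows mainframe order."""
    return key.encode(ebcdic_codec)


keys = ["a1", "A1", "11"]

# ASCII order: digits sort before upper case, upper case before lower case.
ascii_order = sorted(keys)

# EBCDIC order: lower case sorts before upper case, upper case before digits.
ebcdic_order = sorted(keys, key=ebcdic_sort_key)
```

Here ascii_order is ["11", "A1", "a1"] while ebcdic_order is ["a1", "A1", "11"]. Records unloaded from a KSDS arrive in EBCDIC key order, so once the keys are converted to ASCII they may no longer be in ascending sequence for the new indexed file.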
This section will describe the examples available for doing the file transfers and file conversions between EBCDIC and ASCII for each of the three (3) file types.
The primary directory is called ConLib01 and contains two sub-directories called DEVL and PROD.
The DEVL directory contains the executable members and the following sub-directories.
Sub-Directory | Description |
CobCpy1 | Contains COBOL copy files |
COBOL | Contains COBOL source code. |
DataAsc1 | Contains the ASCII-encoded Data Files. |
DataEbc1 | Contains the EBCDIC-encoded Data Files. |
DataFtp1 | Contains the original data downloaded from the mainframe using FTP. |
DataMfa1 | Contains the files downloaded by Mainframe Access (MFA). |
DataTxt1 | Contains the ASCII/Text files. |
DataWrk1 | Used as a working directory when generating COBOL conversion programs. This directory is also used for testing the conversion process. |
The PROD directory contains the executable members and the following sub-directories.
Sub-Directory | Description |
DataAsc1 | Contains the sequential and indexed files that use ASCII-encoded content. |
DataEbc1 | Contains the sequential and indexed files that use EBCDIC-encoded content. |
DataFtp1 | Contains the original data downloaded from the mainframe using FTP. |
DataMfa1 | Contains the files downloaded by Mainframe Access (MFA). |
DataTxt1 | Contains the ASCII/Text files. |
DataWrk1 | Used as a working directory when generating COBOL conversion programs. This directory is also used for testing the conversion process. |
Sequential files with fixed-length records are sometimes referred to as flat sequential files or flat files. Other file types are quite often converted to this file type to facilitate the file or data transfer process. If FTP is used as the file transfer medium this format is easily accessed and transferred with standard FTP statements.
The following is a sample script to FTP a sequential file with fixed-length records.
    userid
    password
    CD ..
    PWD
    BINARY
    GET SIMOTIME.DATA.ZDDFSA01 c:\MFI01\FTPLIB01\ZDDFSA01.DAT
    QUIT
Once this file type has been downloaded a file format conversion is not necessary. This file is still in its mainframe EBCDIC-encoded format. The Micro Focus File Handling technology within Mainframe Express (MFE) and Net Express has the capability of accessing this file and processing the data in EBCDIC mode.
If it is a requirement to provide the file with an ASCII-encoded content then it will be necessary to do a file content conversion.
Work-in-Progress...WIP...
Sequential files with variable-length records can be difficult to manage. On the mainframe each record in this type of file is preceded by a four-byte Record Descriptor Word (RDW) and each block is optionally preceded by a Block Descriptor Word (BDW). When transferring this type of file the RDW and BDW must be retained.
The following is a sample script to FTP a sequential file with variable-length records. This script is for the client or receiving system.
    userid
    password
    CD ..
    PWD
    BINARY
    QUOTE SITE RDW
    GET SIMOTIME.DATA.RPLFTE01 c:\MFI01\FTPLIB01\FTPRPL01.DAT
    QUIT
The file must be downloaded in BINARY. This will maintain the EBCDIC-encoding and the mainframe numeric formats for packed and binary data. One of the following FTP statements must be used prior to a GET or PUT statement in order to download the Record-Descriptor-Word (RDW) and the possible Block-Descriptor-Word (BDW). The BDW and RDW are in binary format so it is critical to download in binary even if the records in the file are all text strings.
Prior to a GET use one of the following statements.
QUOTE SITE RDW
or
LITERAL SITE RDW
Prior to a PUT use the following statement.
LOCSITE RDW
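When the download is scripted, the QUOTE or LITERAL statement corresponds to sending the raw SITE RDW command to the server before the retrieve. The following is a minimal Python sketch using the standard ftplib module; the function name is an assumption and the ftp argument is expected to be an ftplib.FTP object that is already connected and logged in.

```python
def fetch_with_rdw(ftp, remote_name):
    """Ask the mainframe FTP server to keep the RDWs, then fetch in binary mode."""
    # Sends the same raw command that QUOTE SITE RDW or LITERAL SITE RDW sends.
    ftp.sendcmd("SITE RDW")
    data = bytearray()
    # retrbinary issues TYPE I first, so the binary RDW bytes are not translated.
    ftp.retrbinary("RETR " + remote_name, data.extend)
    return bytes(data)
```

The order matters: the SITE command has to reach the server before the RETR, just as the QUOTE statement precedes the GET in the script above.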
Once the file is transferred it is a mirror of the mainframe format and must be converted to a Micro Focus format for variable length files. It cannot be accessed using the Data File Editor or a COBOL program using the standard SELECT and FD syntax.
The newly transferred file can only be accessed using byte-stream I/O. A callable routine (VRECRTN1.DLL) is provided with the following Application Programming Interface (API).
    01  PASSRTN1-AREA.
        05  RTN1-REQUEST    pic X(8).
        05  RTN1-RESPOND    pic 9(4).
        05  RTN1-LENGTH     pic 9(5).
        05  RTN1-BUFFER     pic X(32760).
The following is an example of how to initialize the pass area. This only needs to be done one time prior to the first call.
    move 'GET '  to RTN1-REQUEST
    move ZERO    to RTN1-RESPOND
    move ZERO    to RTN1-LENGTH
    move SPACES  to RTN1-BUFFER
The following is an example of a call statement for the callable routine.
call 'VRECRTN1' using PASSRTN1-AREA
It is not necessary to do an explicit open of the input, byte-stream file. The first call to the routine will open the file and read the first record. Subsequent calls will return a logical record in the buffer with its record length in the RTN1-LENGTH field. When a call results in an end of file condition the routine will close the file. If the plan is to convert the file content from EBCDIC-encoded content to ASCII-encoded content this may be done in the same step as the format conversion.
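The logic of splitting a transferred byte stream into logical records can be sketched in a few lines. The following Python example is not the VRECRTN1 routine and makes no claim about how that routine is implemented. It assumes the stream contains only RDW-prefixed records (no BDWs) and that the first two bytes of each RDW hold the record length as a big-endian binary value that includes the four RDW bytes. The function name is hypothetical.

```python
def iter_rdw_records(data):
    """Yield the data portion of each record from an RDW-prefixed byte stream."""
    offset = 0
    while offset < len(data):
        if offset + 4 > len(data):
            raise ValueError("truncated RDW at offset %d" % offset)
        # Bytes 0-1 of the RDW: record length, including the 4-byte RDW itself.
        length = int.from_bytes(data[offset:offset + 2], "big")
        if length < 4 or offset + length > len(data):
            raise ValueError("bad RDW length %d at offset %d" % (length, offset))
        yield data[offset + 4:offset + length]
        offset += length


sample = b"\x00\x07\x00\x00" + b"\xc1\xc2\xc3" + b"\x00\x05\x00\x00" + b"\xe9"
records = list(iter_rdw_records(sample))
```

The sample stream holds a three-byte record followed by a one-byte record, so records contains two entries with lengths 3 and 1. Each record could then be passed through a field-level content conversion in the same pass.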
Work-in-Progress...WIP...
A VSAM KSDS will need to be converted to a sequential file prior to being transferred using FTP. The following is a sample JCL member that uses the REPRO function of IDCAMS to copy from the VSAM KSDS to a sequential file.
    //IDCAMSJ1 JOB SIMOTIME,ACCOUNT,CLASS=1,MSGCLASS=0,NOTIFY=CSIP1
    //         EXEC PGM=IDCAMS
    //SYSPRINT DD  SYSOUT=A
    //KSDGET01 DD  DSN=SIMOTIME.DATA.KSDGET01,DISP=OLD
    //SEQPUT01 DD  DSN=SIMOTIME.DATA.SEQPUT01,UNIT=SYSDA,
    //             SPACE=(TRK,(10,10)),
    //             DISP=(NEW,CATLG),
    //             DCB=(LRECL=80,BLKSIZE=0,RECFM=FB)
    //SYSIN    DD  *
      REPRO -
        INFILE(KSDGET01) -
        OUTFILE(SEQPUT01)
    /*
The following is a sample script to FTP a sequential file with fixed-length records (this is the same as the FTP sample for sequential files that was described in an earlier section of this document).
    userid
    password
    CD ..
    PWD
    BINARY
    GET SIMOTIME.DATA.KSDDDS01 c:\MFI01\FTPLIB01\KSDDDSZ1.DAT
    QUIT
Once the sequential file has been transferred it will need to be converted to a Micro Focus Indexed File. If the plan is to convert the file from EBCDIC-encoded content to ASCII-encoded content this may be done in the same step as the format conversion.
Work-in-Progress...WIP...
The following is a list of questions that focus on file transfer and file conversion.
Work-in-Progress...WIP...
1. I want to do the conversion of a VSAM KSDS with alphameric keys on the mainframe and have the ASCII-formatted file staged on the mainframe for download from clients using either Windows or UNIX and FTP. How do I do this?
2. As we move toward a service-oriented architecture there is a requirement emerging for the ability to convert data at the record level instead of converting an entire file. Can you recommend an approach for doing this?
3. Older file and data base structures that are still processed via COBOL on the mainframe need to be updated and/or converted prior to downloading. Can we do the EBCDIC/ASCII conversion in the same program and in a single pass?
4. Can you describe how to process multiple record types that require the inclusion or replication of user logic based on the contents of a field or multiple fields within a record?
5. Can you describe how to process variable-length records that require the inclusion or replication of user logic based on the content of a field or multiple fields?
6. Can you describe how to read an EBCDIC-encoded file and add or update records in an existing ASCII-encoded file?
7. Can you describe how to read an EBCDIC-encoded file with mainframe numeric-encoded packed and binary fields and create a Comma-Separated-Value (CSV), ASCII-encoded file with expanded numeric fields with a separate leading sign that may be easily imported into an Excel spreadsheet?
There are many ways to address the technical challenges presented when attempting to use computer systems to provide solutions to business requirements. This document provides some solutions to some of the data migration challenges.
Check out The COBOL Connection for more examples of mainframe COBOL coding techniques and sample code.
Check out The JCL Connection for more mainframe JCL examples.
Check out The VSAM - QSAM Connection for more examples of mainframe VSAM and QSAM accessing techniques and sample code.
This document provides a quick summary of the File Status Key for VSAM data sets and QSAM files.
Check out The SimoTime Library for a wide range of topics for Programmers, Project Managers and Software Developers.
To review all the information available on this site start at The SimoTime Home Page.
Check out The SimoTime Glossary for a list of terms and definitions used in the documents provided by SimoTime.
If you have any questions, suggestions or comments please call or send an e-mail to: helpdesk@simotime.com
Founded in 1987, SimoTime Enterprises is a privately owned company. We specialize in the creation and deployment of business applications using new or existing technologies and services. We have a team of individuals that understand the broad range of technologies being used in today's environments. This includes the smallest thin client using the Internet and the very large mainframe systems. There is more to making the Internet work for your company's business than just having a nice looking WEB site. It is about combining the latest technologies and existing technologies with practical business experience. It's about the business of doing business and looking good in the process. Quite often, to reach larger markets or provide a higher level of service to existing customers it requires the newer Internet technologies to work in a complementary manner with existing corporate mainframe systems. Whether you want to use the Internet to expand into new market segments or as a delivery vehicle for existing business functions simply give us a call or check the web site at http://www.simotime.com