Data File Migration
File Transfer and Conversion
This suite of programs provides examples of how to move data between an IBM Mainframe System and a Linux, UNIX or Windows environment. The focus is on migrating data from its EBCDIC encoded format on a Mainframe to an ASCII encoded format for Linux, UNIX or Windows platforms. This document provides information about transferring or converting the sequential files and VSAM Keyed Sequential Data Sets (KSDS). Both "file format" and "record content" conversion will be discussed.
A suite of sample programs is included to show the detail of the processes. Most of the programs are COBOL and will run on an IBM Mainframe. The programs will also run on Windows and UNIX platforms that are supported by Micro Focus technology. A few of the programs are unique to the Micro Focus environment. JCL members are provided for execution on the mainframe. Command files are provided for executing on a Windows platform. UNIX script files have not been provided. This document provides detailed information for migrating data files between an IBM Mainframe system and a Micro Focus environment running on a Linux, UNIX or Windows system.
This document does not provide information for migrating relational databases (e.g. DB2 and SQL Server). This document will describe various approaches to the data migration and conversion processes.
Anyone considering a data conversion should seek additional assistance from a consulting or programming services organization that has experience in this area.
We would like to thank Larry Simmons of Micro Focus for providing us with much of the information in this document. His assistance, knowledge and expertise were greatly appreciated.
We have made a significant effort to ensure the documents and software technologies are correct and accurate. We reserve the right to make changes without notice at any time. The function delivered in this version is based upon the enhancement requests from a specific group of users. The intent is to provide changes as the need arises and in a timeframe that is dependent upon the availability of resources.
Copyright © 1987-2022
SimoTime Technologies and Services
All Rights Reserved
The following is an overview of the data transfer alternatives and the types of data conversion.
1. File Transfer
2. File Format Conversion
3. Record Content Conversion
Note: The preceding Data Migration Tasks will be discussed in more detail in the following sections of this document.
This document will focus on three (3) traditional file types.
1. Sequential Files with Fixed-Length records
2. Sequential Files with Variable-Length records
3. VSAM, KSDS or Keyed Indexed
Note: The preceding non-Relational Data Types will be discussed in more detail in the following sections of this document.
The sequential files with fixed length records are usually a very simple transfer. The sequential files with variable length records require special handling to transfer and then convert to the format required by the receiving system.
The following are additional comments and thoughts. There are two categories of data conversion efforts for migrating data and associated programs from an IBM Mainframe System to a Linux, UNIX or Windows System.
1. Migrate then retire the application on the Mainframe. This is typically a one-time, one-way conversion (with "one-time" being a few times for testing and one last time to deploy).
2. Migrate then coexist with or complement the Mainframe. This is an on-going transfer or exchange of data. This may require bi-directional translation (i.e. download/upload).
It is important to provide platform flexibility as to where the conversion is done.
1. Do the conversion on the mainframe.
2. Do the conversion on the server.
3. Do the conversion on the client machine.
4. Do the conversion during the transfer (i.e. download/upload) process.
This section will focus on how to optionally prepare a data file for transfer on the sending computer system, how to transfer a data file and how to restore a data file to the format required on the receiving computer system.
If Micro Focus Mainframe Access (MFA) is available the file transfer process is greatly simplified. Since MFA may not be available during a Proof of Concept effort, this document will focus on using FTP as the file transfer vehicle. Since many of the data files have packed and binary fields, the file content conversion between EBCDIC and ASCII will need to be done at the field level.
If the requirement is to do the data conversion on a Windows client machine that is used for development and testing, and the data is limited to sequential or indexed files, then the Data File Converter (DFCONV) from Micro Focus may offer a cost-effective and convenient solution since it is included in the Mainframe Express and Net Express products. However, Micro Focus will need to include all of the components for DFCONV in its various run-time offerings to allow implementation in production environments. Also, at the time of the writing of this document DFCONV was not available on the Mainframe or UNIX platforms.
The common technology used to transfer files between computer systems is the File Transfer Protocol (FTP). When transferring sequential files, source code or VSAM data sets between a Mainframe and a WinTel system the Mainframe Access (MFA) provided by Micro Focus offers a better solution than FTP. Both of these technologies and the methodologies are discussed in more detail in the following two sections.
Mainframe Access (MFA) is provided by Micro Focus and has a client component and a mainframe server component. It has the capability of transferring source code, sequential files and VSAM data sets between a Mainframe (running MVS, OS/390 or z/OS) and a WinTel environment. MFA has both a Graphical User Interface and a Command Line Interface.
The Graphical User Interface offers a Windows Explorer type of view of the Mainframe. With a simple point-and-click, a source member, sequential file or VSAM data set may be dragged and dropped from the mainframe into a directory on the WinTel system. To automate this function the Command Line Interface may be used. The source members are automatically converted between EBCDIC and ASCII. The data files are transferred and the EBCDIC-encoded content and mainframe numeric-encoding schemes are maintained in their original formats.
Note: The MFA Server component is not available for mainframes running VSE.
FTP is the common methodology used to exchange files between computers on the Internet. The FTP technology is available for WinTel, UNIX and the Mainframe (MVS, OS/390, z/OS and VSE).
For sequential file transfers FTP offers an easy and simple solution. A VSAM Keyed Sequential Data Set (KSDS) must be converted (or copied) to a sequential file before FTP can be used to transfer the file to a WinTel or UNIX system. There are a number of ways to do this but it is usually done using the REPRO function of IDCAMS. This process is discussed in more detail in a following section of this document.
For sequential files with variable length records the use of special FTP statements for the mainframe will be required in order to maintain the Record Descriptor Word (RDW) and optional Block Descriptor Word (BDW). This process is discussed in more detail in a following section of this document.
The following is a simple FTP script for transferring (from a mainframe system to a WinTel system) a sequential data file with text, packed and binary values. The file will be created on the WinTel system with EBCDIC encoded content and mainframe numeric formats.
userid
password
CD ..
PWD
BINARY
GET SIMOTIME.DATA.DATAFILE c:\CONLIB01\FTPLIB01\DATAFILE.DAT
QUIT
The preceding script is executed from a Windows System. To execute from an IBM Mainframe System the GET statement would need to be changed to a PUT statement.
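The same binary download can also be driven from a program instead of a script. The following is a minimal sketch in Python (used here for illustration only, it is not part of the SimoTime program suite) that assumes a logged-in `ftplib.FTP` connection. The function name `download_binary` and the `keep_rdw` parameter are hypothetical. The optional `SITE RDW` command corresponds to the QUOTE SITE RDW statement described later in this document for variable-length records.

```python
def download_binary(ftp, dataset, local_path, keep_rdw=False):
    """Download a mainframe data set in binary mode, with no translation.

    ftp is assumed to be a logged-in ftplib.FTP object (or any object
    with the same cwd, sendcmd and retrbinary methods).
    """
    # Mirror the "CD .." statement of the script.
    ftp.cwd("..")
    if keep_rdw:
        # Same effect as QUOTE SITE RDW in a command-line FTP client.
        ftp.sendcmd("SITE RDW")
    with open(local_path, "wb") as out:
        # retrbinary sets the transfer type to binary (TYPE I) first,
        # so EBCDIC text and packed or binary fields arrive unchanged.
        ftp.retrbinary("RETR " + dataset, out.write)


# Typical use (host name is a placeholder, not a real system):
#   from ftplib import FTP
#   ftp = FTP("mainframe.example.com")
#   ftp.login("userid", "password")
#   download_binary(ftp, "SIMOTIME.DATA.DATAFILE", "DATAFILE.DAT")
#   ftp.quit()
```

Passing the connection in as a parameter keeps the transfer logic separate from the login details, which may differ between a test and a production system.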
A file format conversion may be based on a business requirement but is quite often required prior to the file transfer process. For example, when using FTP it will be necessary to convert a VSAM KSDS to a sequential file in order to download the data. Once the sequential file is downloaded it must be converted to a Keyed Indexed file in Micro Focus format.
The need to do queries of data is driving format conversions from VSAM Keyed Sequential Data Sets to relational database models. This type of conversion is beyond the scope of this document.
Other types of format conversions usually involve changes in field sizes that result in a change in the total record size or the key field of a keyed-indexed data set.
File format conversions may be required for two reasons. The first is based on the requirements of the file transfer process. For example, FTP needs the data to be stored in a sequential file. The second is based on the file formats available on the receiving system.
Changing the content or encoding of data within a record is referred to as Record Content conversion. In some cases the Record content conversion may result in a concurrent file format conversion.
With the exception of source code, control files and files that contain printed information most of the mainframe data files will contain packed or binary data. This creates the requirement for being able to do content conversion of the individual records within a file at the field level. The conversion of the text based data between EBCDIC and ASCII is the most requested type of content conversion.
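The need for field-level conversion can be illustrated with a short sketch. The following Python functions (illustration only, the program suite itself uses COBOL) decode one text field, one packed-decimal field and one binary field. The function names are hypothetical, and code page `cp037` is an assumption; the actual EBCDIC code page depends on the mainframe configuration.

```python
from decimal import Decimal


def ebcdic_text_to_ascii(raw, codec="cp037"):
    """Translate an EBCDIC text field to ASCII (cp037 is an assumed code page)."""
    return raw.decode(codec).encode("ascii", errors="replace")


def unpack_packed(raw, scale=0):
    """Decode a packed-decimal field: two digits per byte, sign in the last nibble."""
    digits = []
    for byte in raw[:-1]:
        digits.append(byte >> 4)
        digits.append(byte & 0x0F)
    digits.append(raw[-1] >> 4)
    if any(d > 9 for d in digits):
        raise ValueError("invalid packed-decimal digit")
    value = int("".join(str(d) for d in digits))
    # Sign nibbles X'D' and X'B' are negative; the others are positive or unsigned.
    if raw[-1] & 0x0F in (0x0D, 0x0B):
        value = -value
    return Decimal(value).scaleb(-scale)


def unpack_binary(raw, signed=True):
    """Decode a mainframe binary field (big-endian two's complement)."""
    return int.from_bytes(raw, byteorder="big", signed=signed)


print(ebcdic_text_to_ascii(b"\xc1\xc2\xc3"))       # b'ABC'
print(unpack_packed(b"\x12\x34\x5c", scale=2))     # 123.45
print(unpack_binary(b"\xff\xfe"))                  # -2
```

Running the text translation over the packed bytes `X'12345C'` would change their values, which is why a single translate pass over the whole record cannot be used when a record contains packed or binary fields.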
An example of the type of content conversion that is driven by business would be to personalize the Name, Address information stored on the mainframe in upper case to an upper and lower case format.
Since the mainframe is an EBCDIC-encoded system and the WinTel or UNIX systems are ASCII-encoded systems it may be necessary to convert the data files from EBCDIC to ASCII.
Note: Micro Focus can run a COBOL-oriented application on a Linux, UNIX or Windows platform in EBCDIC mode. However, if the data (or a subset of the data) will be imported into an Excel spreadsheet or exported to other non-COBOL applications or technologies it will be necessary to convert the data from EBCDIC to ASCII.
This section describes the three (3) file types that are the focus of this document and provides references to coding examples. The effort involved to transfer a file varies depending on the file type.
Since many of the data files on the mainframe contain packed, binary and signed numeric fields it is necessary to do most of the content conversions at the field level. The effort of converting the file content (i.e. EBCDIC to ASCII) depends on the number of numeric fields and the format of the numeric fields. To convert the content of any of the three file types from EBCDIC to ASCII will require additional processing. The recommended approach for all the file types is to read the EBCDIC-encoded file and create a new ASCII-encoded file.
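The recommended read-and-create-new approach can be sketched as follows. This is a Python illustration under stated assumptions, not the generated COBOL that SimoTime provides: the record layout is passed as a hypothetical list of `(offset, length, kind)` tuples, `cp037` is an assumed code page, and packed and binary fields are copied through unchanged. Signed zoned-decimal fields would need their own handling and are not covered here.

```python
def convert_record(record, layout, codec="cp037"):
    """Translate only the text fields of one fixed-length record.

    layout is a list of (offset, length, kind) tuples. Fields with
    kind "text" are translated; all other bytes are left as they are.
    """
    out = bytearray(record)
    for offset, length, kind in layout:
        if kind == "text":
            field = record[offset:offset + length]
            # EBCDIC and ASCII are both single-byte, so the length is unchanged.
            out[offset:offset + length] = field.decode(codec).encode(
                "ascii", errors="replace")
    return bytes(out)


def convert_file(src_path, dst_path, lrecl, layout):
    """Read an EBCDIC file of fixed-length records and write a new ASCII file."""
    count = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            record = src.read(lrecl)
            if len(record) < lrecl:
                break  # end of file (a short trailing record is ignored)
            dst.write(convert_record(record, layout))
            count += 1
    return count
```

Because the input file is never modified, the conversion can be repeated after a layout correction without a new download.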
A sequential file could be converted in place but this is not recommended. A keyed-index file conversion process must read the EBCDIC-encoded file and write a new ASCII-encoded file.
An attempt to update in place would result in new records with the new ASCII-encoded key being added to the file. An example of this conversion process is provided in a later section of this document.
A sequential file with fixed-length records is the easiest of the file types to transfer. The file must be downloaded in BINARY. This will maintain the EBCDIC-encoding and the mainframe numeric formats for packed and binary data.
Once the file is downloaded it may be accessed using the Data File Editor in Mainframe Express or Net Express. COBOL programs that access the file may also be downloaded and compiled with Mainframe Express or Net Express. For Mainframe Express the directives are set and the default compile option is EBCDIC mode. For Net Express it will be necessary to use IBMCOMP, NOTRUNC, ASSIGN(EXTERNAL) and CHARSET(EBCDIC).
Transferring a sequential file with variable-length records is a simple FTP transfer but will require an FTP statement that is unique to the mainframe in order to transfer the file and retain the information about the length of each record within the file. The file must be downloaded in BINARY. This will maintain the EBCDIC-encoded content and the mainframe numeric formats for packed and binary data.
Once the file is downloaded it is a mirror of the mainframe format and must be converted to a Micro Focus format for variable length files. See the examples section of this document for more information.
If MFA is available simply download the VSAM KSDS and it will be converted from a mainframe VSAM KSDS to a Micro Focus Keyed Indexed file.
If MFA is not available, a VSAM Keyed Sequential Data Set (KSDS) must be converted to a sequential file before FTP can be used to transfer the data to a WinTel or UNIX system. There are a number of ways to do this but it is usually done using the REPRO function of IDCAMS. The new sequential file can then be transferred using FTP. Once the sequential file has been transferred it must be converted to a Micro Focus Keyed Indexed file. If the key is an alphanumeric key then special consideration must be given to the differences in the sorting or collating sequence of EBCDIC and ASCII. See the examples section of this document for more information.
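The collating sequence difference can be seen with a few sample keys. The following Python sketch (illustration only; the key values are made up and `cp037` is an assumed EBCDIC code page) sorts the same keys by their EBCDIC byte values and by their ASCII byte values.

```python
def ebcdic_order(keys, codec="cp037"):
    """Sort text keys by their EBCDIC byte values."""
    return sorted(keys, key=lambda k: k.encode(codec))


def ascii_order(keys):
    """Sort text keys by their ASCII byte values."""
    return sorted(keys, key=lambda k: k.encode("ascii"))


keys = ["A100", "a100", "1000"]
print(ebcdic_order(keys))   # ['a100', 'A100', '1000']
print(ascii_order(keys))    # ['1000', 'A100', 'a100']
```

In EBCDIC the lower case letters sort before the upper case letters and the digits sort last, while in ASCII the digits sort first. Records unloaded from the KSDS in EBCDIC key order may therefore be out of sequence once the key is ASCII-encoded, so a sequential load of the new indexed file may need a sort step first.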
This section will describe the examples available for doing the file transfers and file conversions between EBCDIC and ASCII for each of the three (3) file types.
For this suite of Application and utility programs the primary directory is C:\SIMOSAM1. This directory contains three sub-directories named DEVL, TEST and PROD.
Note: The TEST and PROD sub-directories have the same structure. The following shows the directory structures.
Explore a Directory Structure for a Development Environment that supports the execution of business applications using Micro Focus Enterprise Server.
Explore a Directory Structure for a Production Environment that supports the execution of business applications using Micro Focus Enterprise Server.
Sequential files with fixed length records are sometimes referred to as flat sequential files or flat files. Other file types are quite often converted to this file type to facilitate the file or data transfer process. If FTP is used as the file transfer mechanism this format is easily accessed and transferred with standard FTP statements and binary mode.
The following is a sample script to FTP a sequential file with fixed-length records.
userid
password
CD ..
PWD
BINARY
GET SIMOTIME.DATA.ZDDFSA01 c:\MFI01\FTPLIB01\ZDDFSA01.DAT
QUIT
Once this file type has been downloaded a file format conversion is not necessary. This file is still in its mainframe EBCDIC-encoded format. The Micro Focus File Handling technology within Mainframe Express (MFE) and Net Express has the capability of accessing this file and processing the data in EBCDIC mode.
If it is a requirement to provide the file with an ASCII-encoded content then it will be necessary to do a file content conversion.
Sequential files with variable length records can be difficult to manage. The format of this type of file on the mainframe has each record in the file preceded by a four byte Record Descriptor Word (RDW) and each block (this is optional) preceded by a Block Descriptor Word (BDW). When transferring this type of file the RDW and BDW must be retained.
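The RDW layout can be sketched in a few lines. The following Python generator (illustration only) assumes the downloaded file contains RDW-prefixed records with no BDWs, and that each RDW holds a two-byte big-endian length that includes the four bytes of the RDW itself, followed by two bytes that are normally zero. The function name `read_rdw_records` is hypothetical.

```python
import io
import struct


def read_rdw_records(stream):
    """Yield the data portion of each record from an RDW-prefixed byte stream."""
    while True:
        rdw = stream.read(4)
        if not rdw:
            return  # clean end of file
        if len(rdw) < 4:
            raise ValueError("truncated RDW")
        # First halfword: record length including the RDW. Second: segment info.
        length, _segment = struct.unpack(">HH", rdw)
        if length < 4:
            raise ValueError("invalid RDW length %d" % length)
        data = stream.read(length - 4)
        if len(data) < length - 4:
            raise ValueError("truncated record")
        yield data


# Two records: X'C1C2C3' (3 bytes) and X'F1' (1 byte).
sample = io.BytesIO(b"\x00\x07\x00\x00\xc1\xc2\xc3" + b"\x00\x05\x00\x00\xf1")
for record in read_rdw_records(sample):
    print(len(record), record.hex())
```

The record data is still EBCDIC-encoded at this point. Only the length prefix has been interpreted.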
The following is a sample script to FTP a sequential file with variable-length records. This script is for the client or receiving system.
userid
password
CD ..
PWD
BINARY
GET SIMOTIME.DATA.RPLFTE01 c:\MFI01\FTPLIB01\FTPRPL01.DAT
QUIT
The file must be downloaded in BINARY. This will maintain the EBCDIC-encoding and the mainframe numeric formats for packed and binary data. One of the following FTP statements must be used prior to a GET or PUT statement in order to download the Record-Descriptor-Word (RDW) and the possible Block-Descriptor-Word (BDW). The BDW and RDW are in binary format so it is critical to download in binary even if the records in the file are all text strings.
Prior to a GET use one of the following statements.
QUOTE SITE RDW
LITERAL SITE RDW
Prior to a PUT use the following statement.
Once the file is transferred it is a mirror of the mainframe format and must be converted to a Micro Focus format for variable length files. It cannot be accessed using the Data File Editor or a COBOL program using the standard SELECT and FD syntax.
The newly transferred file can only be accessed using byte-stream I/O. A callable routine (VRECRTN1.DLL) is provided with the following Application Programming Interface (API).
       01  PASSRTN1-AREA.
           05  RTN1-REQUEST    pic X(8).
           05  RTN1-RESPOND    pic 9(4).
           05  RTN1-LENGTH     pic 9(5).
           05  RTN1-BUFFER     pic X(32760).
The following is an example of how to initialize the pass area. This only needs to be done one time prior to the first call.
           move 'GET ' to RTN1-REQUEST
           move ZERO   to RTN1-RESPOND
           move ZERO   to RTN1-LENGTH
           move SPACES to RTN1-BUFFER
The following is an example of a call statement for the callable routine.
call 'VRECRTN1' using PASSRTN1-AREA
It is not necessary to do an explicit open of the input, byte-stream file. The first call to the routine will open the file and read the first record. Subsequent calls will return a logical record in the buffer with its record length in the RTN1-LENGTH field. When a call results in an end of file condition the routine will close the file. If the plan is to convert the file content from EBCDIC-encoded content to ASCII-encoded content this may be done in the same step as the format conversion.
A VSAM, KSDS will need to be converted to a sequential file prior to being transferred using FTP. The following (IDCAMSJ1.jcl) is a sample JCL member that uses the REPRO function of IDCAMS to copy from the VSAM, KSDS to a sequential file.
//IDCAMSJ1 JOB SIMOTIME,ACCOUNT,CLASS=1,MSGCLASS=0,NOTIFY=CSIP1
//* *******************************************************************
//*       This program is provided by SimoTime Technologies           *
//*       (C) Copyright 1987-2019 All Rights Reserved                 *
//*       Web Site URL:   http://www.simotime.com                     *
//*       e-mail:         email@example.com                           *
//* *******************************************************************
//*
//* TEXT   - COPY (OR REPRO) A KSDS TO A SEQUENTIAL FILE
//* AUTHOR - SIMOTIME TECHNOLOGIES
//* DATE   - JANUARY 01, 1989
//*
//         EXEC PGM=IDCAMS
//SYSPRINT DD  SYSOUT=A
//KSDGET01 DD  DSN=SIMOTIME.DATA.ITEMMAST,DISP=OLD
//ITEMRSEQ DD  DSN=SIMOTIME.DATA.ITEMRSEQ,UNIT=SYSDA,
//             SPACE=(TRK,(10,10)),
//             DISP=(NEW,CATLG),
//             DCB=(LRECL=512,RECFM=FB)
//SYSIN    DD  *
  REPRO  -
    INFILE(KSDGET01)  -
    OUTFILE(ITEMRSEQ)
/*
The following is a sample script to FTP a sequential file with fixed-length records (this is the same as the FTP sample for sequential files that was described in an earlier section of this document).
userid
password
CD ..
PWD
BINARY
GET SIMOTIME.DATA.KSDDDS01 c:\MFI01\FTPLIB01\KSDDDSZ1.DAT
QUIT
Once the sequential file has been transferred it will need to be converted to a Micro Focus Indexed File. If the plan is to convert the file from EBCDIC-encoded content to ASCII-encoded content this may be done in the same step as the format conversion.
The following is a list of questions that focus on file transfer and file conversion.
Question: I want to do the conversion of a VSAM KSDS with alphanumeric keys on the mainframe and have the ASCII formatted file staged on the mainframe for download from clients using FTP from a Linux, UNIX or Windows System. How do I do this?
Answer: First, the VSAM, KSDS will need to be copied/converted to a record sequential file in order to transfer between systems using FTP. Next, the record content will need to be converted from EBCDIC-encoding to ASCII-encoding while maintaining numeric integrity for the various numeric formats.
Question: As we move toward a service oriented architecture there is a requirement emerging for the ability to convert data at the record level instead of converting an entire file. Can you recommend an approach for doing this?
Answer: One of the conversion alternatives provided by SimoTime is to generate the source code for two COBOL programs. The primary program does the file I/O and calls the secondary program to do the record content conversion. This secondary program may be called from a user program.
Question: Older files and VSAM data structures that are still processed via COBOL on the mainframe need to be updated and/or converted prior to downloading. Can we do the file format conversion and the EBCDIC/ASCII conversion in the same program and in a single pass?
Answer: The SimoTime approach to Data File Conversion is to generate COBOL programs that do the actual file format and record content conversion. The generated COBOL source code is OS390 compliant and may be compiled and executed on an IBM Mainframe System or a Linux, UNIX or Windows System with Micro Focus. User code may be easily added to the generated COBOL source code.
Question: Can you describe how to process multiple record types that require the inclusion or replication of user logic based on the contents of a field or multiple fields within a record?
Answer: The following link provides an example of processing a file with multiple record types.
Explore an example of processing a file with multiple record types using COBOL programs. The records contain a combination of text strings and numeric values stored in various formats such as signed-zoned-decimal, packed and binary.
Question: Can you describe how to process (i.e. convert the record content between EBCDIC and ASCII) variable length records that require the inclusion or replication of user logic based on the content of a field or multiple fields?
Answer: A typical use of files with variable length records incorporates a record structure that includes a base segment and a variable number of user-defined segments preceded by a user-identifier. SimoTime provides the base technology and professional services for this type of data conversion.
Question: Can you describe how to read an EBCDIC encoded file and add or update records in an existing ASCII encoded file?
Answer: The SimoTime Technologies will generate a variety of Conversion programs. The most common is to do a sequential (or ordered) load of a new file. An alternative would be to generate a conversion program that reads an EBCDIC-encoded file and does updates or adds to an existing ASCII-encoded file. SimoTime Professional Services can assist in this effort.
Question: Can you describe how to read an EBCDIC encoded file with mainframe numeric encoded packed and binary fields and create a Comma-Separated-Value (CSV), ASCII encoded file with expanded numeric fields with a separate leading sign that may be easily imported into an Excel spreadsheet?
Answer: This suite of sample programs describes how to read a column oriented file of fixed length records and fixed length fields and create a comma-delimited file (filename.CSV, Comma-Separated-Value) of variable length fields with the leading and trailing spaces removed from each of the fields. If a field (or data string) contains a delimiter character then enclose the field in double quotes. The program may be adjusted to create a delimited file using a tab, semicolon or other character as the delimiter.
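The approach described in the preceding answer can be sketched with the standard Python `csv` module (illustration only, the sample programs in the suite are COBOL). The function name `ffl_to_csv` and the field widths are hypothetical, and the input records are assumed to be ASCII text already.

```python
import csv
import io


def ffl_to_csv(lines, widths, delimiter=","):
    """Convert fixed-field-length text records to delimited records."""
    out = io.StringIO()
    # QUOTE_MINIMAL encloses a field in double quotes only when it
    # contains the delimiter, a quote or a line break.
    writer = csv.writer(out, delimiter=delimiter,
                        quoting=csv.QUOTE_MINIMAL, lineterminator="\n")
    for line in lines:
        fields = []
        pos = 0
        for width in widths:
            # Remove the leading and trailing spaces from each field.
            fields.append(line[pos:pos + width].strip())
            pos += width
        writer.writerow(fields)
    return out.getvalue()


record = "000123" + "Smith, John".ljust(16) + "CA"
print(ffl_to_csv([record], [6, 16, 2]), end="")
# 000123,"Smith, John",CA
```

Passing a different `delimiter` value (for example a tab or a semicolon) produces the other delimited formats mentioned above.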
Explore how to Create a File with CSV Formatted Records. The conversion from a Fixed-Field-Length (FFL) format to a Comma-Separated-Values (CSV) format is included in this example.
This suite of programs provides examples of how to move data between a Mainframe System and a Linux, UNIX or Windows environment. The focus is on migrating data from its EBCDIC encoded format on a Mainframe to an ASCII encoded format for Linux, UNIX or Windows platforms. This document may be used as a tutorial for new programmers or as a quick reference for experienced programmers.
In the world of programming there are many ways to solve a problem. This documentation and software were developed and tested on systems that are configured for a SIMOTIME environment based on the hardware, operating systems, user requirements and security requirements. Therefore, adjustments may be needed to execute the jobs and programs when transferred to a system of a different architecture or configuration.
SIMOTIME Services has experience in moving or sharing data or application processing across a variety of systems. For additional information about SIMOTIME Services or Technologies please contact us using the information in the Contact, Comment or Feedback section of this document.
Software Agreement and Disclaimer
Permission to use, copy, modify and distribute this software, documentation or training material for any purpose requires a fee to be paid to SimoTime Technologies. Once the fee is received by SimoTime the latest version of the software, documentation or training material will be delivered and a license will be granted for use within an enterprise, provided the SimoTime copyright notice appears on all copies of the software. The SimoTime name or Logo may not be used in any advertising or publicity pertaining to the use of the software without the written permission of SimoTime Technologies.
SimoTime Technologies makes no warranty or representations about the suitability of the software, documentation or learning material for any purpose. It is provided "AS IS" without any expressed or implied warranty, including the implied warranties of merchantability, fitness for a particular purpose and non-infringement. SimoTime Technologies shall not be liable for any direct, indirect, special or consequential damages resulting from the loss of use, data or projects, whether in an action of contract or tort, arising out of or in connection with the use or performance of this software, documentation or training material.
This section includes links to documents with additional information that are beyond the scope and purpose of this document. The first group of documents may be available from a local system or via an internet connection, the second group of documents will require an internet connection.
Note: A SimoTime License is required for the items to be made available on a local system or server.
The following links may be to the current server or to the Internet.
Note: The latest versions of the SimoTime Documents and Program Suites are available on the Internet and may be accessed using the icon. If a user has a SimoTime Enterprise License the Documents and Program Suites may be available on a local server and accessed using the icon.
Explore An Enterprise System Model that describes and demonstrates how Applications that were running on a Mainframe System and non-relational data that was located on the Mainframe System were copied and deployed in a Microsoft Windows environment with Micro Focus Enterprise Server.
Explore The ASCII and EBCDIC Translation Tables. These tables are provided for individuals that need to better understand the bit structures and differences of the encoding formats.
Explore The File Status Return Codes to interpret the results of accessing VSAM data sets and/or QSAM files.
The following links provide additional detail about file transfer alternatives.
Explore the alternatives for transferring data files between systems. This link provides access to a repository of information that includes the transferring and/or sharing of data between Mainframe (z/OS or VSE), Linux, UNIX and Windows Systems.
Explore the File Transfer Protocol (FTP) commands using an interactive or scripted batch interface. This document describes a typical process for an interactive and automated, batch FTP session running on a Windows System and connecting to another Windows System, a Linux or UNIX System or an IBM Mainframe System.
Explore Sample FTP Scripts and Windows Command Files(FTP) that will transfer files between a Mainframe Host System and a Windows Client System.
The following links will require an Internet connection.
A good place to start is The SimoTime Home Page for access to white papers, program examples and product information. This link requires an Internet Connection.
Explore The Micro Focus Web Site for more information about products (including Micro Focus COBOL) and services available from Micro Focus. This link requires an Internet Connection.
Explore the GnuCOBOL Technologies available from SourceForge. SourceForge is an Open Source community resource dedicated to helping open source projects be as successful as possible. GnuCOBOL (formerly OpenCOBOL) is a COBOL compiler with run time support. The compiler (cobc) translates COBOL source to executable using intermediate C, designated C compiler and linker. This link will require an Internet Connection.
Explore the Glossary of Terms for a list of terms and definitions used in this suite of documents and white papers.
This document was created and is maintained by SimoTime Technologies and Services. If you have any questions, suggestions, comments or feedback please use the following contact information.
1. Send an e-mail to our helpdesk.
2. Our telephone numbers are as follows.
2.1. 1 415 763-9430 office-helpdesk
2.2. 1 415 827-7045 mobile
We appreciate hearing from you.
SimoTime Technologies was founded in 1987 and is a privately owned company. We specialize in the creation and deployment of business applications using new or existing technologies and services. We have a team of individuals that understand the broad range of technologies being used in today's environments. Our customers include small businesses using Internet technologies to corporations using very large mainframe systems.
Quite often, to reach larger markets or provide a higher level of service to existing customers it requires the newer Internet technologies to work in a complementary manner with existing corporate mainframe systems. We specialize in preparing applications and the associated data that are currently residing on a single platform to be distributed across a variety of platforms.
Preparing the application programs will require the transfer of source members that will be compiled and deployed on the target platform. The data will need to be transferred between the systems and may need to be converted and validated at various stages within the process. SimoTime has the technology, services and experience to assist in the application and data management tasks involved with doing business in a multi-system environment.
Whether you want to use the Internet to expand into new market segments or as a delivery vehicle for existing business functions simply give us a call or check the web site at http://www.simotime.com