Data File Convert Data Management Series |
Ever since the second computer architecture was introduced the task of data conversion in preparation for data migration and data sharing has been a never-ending process. Data conversions may be driven by business requirements or system requirements such as changes in system architectures. The SimoTime Data File Convert process is part of the larger Data Management and Data Validation processes.
SimoTime provides a Conversion Utility program (UTCONVRT). This program does not do the data file conversion. UTCONVRT is a "pre-edit and execute" program that runs on a Windows System with Micro Focus Enterprise Developer and generates conversion programs (COBOL Source Code) that may be compiled and executed on a distributed Linux, UNIX or Windows (LUW) platform with Micro Focus or an IBM Mainframe System. The UTCONVRT program is capable of generating programs that do File Format, Record Format and Record Content Conversion. UTCONVRT does this by performing a pre-edit of a set of user specifications then generates the conversion programs that will do the actual conversion. For record content conversion an optional COBOL copy file may be used to do the conversion at the field level and maintain mainframe numeric integrity.
This document is an introduction or overview of the data file conversion aspects of an application migration between a mainframe system and a Windows system running a Micro Focus Enterprise Server (or Enterprise Developer). The following discussion will divide the data file conversion tasks into two categories.
1. | File Format Conversion: changes the format of the file (e.g. Key-Sequenced-Data-Set to Sequential file) and leaves the record content unchanged. |
2. | Record Content Conversion: changes the content of the records within the file (e.g. EBCDIC to ASCII, fixed or variable length fields, CSV format). |
Note: The file format and record content conversions may be accomplished in a single pass of reading and writing the data files.
The document and links to other documents will cover the following items.
1. | File format conversion varies widely across Mainframes, Wintel, UNIX and Linux systems. The least common denominator for file formats is a record sequential file of fixed length records. It is a common practice to take a proprietary file format on one system and copy it to a record sequential file. The record sequential file is then transferred to the target platform (usually FTP in binary mode) and used as the base to create a new file in a format that is native to the receiving system. |
2. | Record content conversion may present a bigger challenge. It is the exception when a file on a mainframe contains records that are all text and may be converted between EBCDIC and ASCII as a single string. The reality is that records in mainframe files contain a mix of text strings and numeric fields that may be signed or un-signed, zoned-decimal, packed-decimal, binary or floating point formats. Therefore, the record content conversion must be done at the field level. A first step is to determine the encoding schema (i.e. ASCII or EBCDIC) for the target environment. The following may be helpful in making this decision. |
2.1. | If the application being migrated (or replicated) from a mainframe platform to a non-mainframe platform will continue to complement or coexist with applications or data that will continue to reside on the mainframe then configuring the new environment to use an EBCDIC-encoding schema is a viable and practical approach. |
2.2. | If the application is being migrated (or moved) from a mainframe platform to a non-mainframe platform with little or no dependencies on the mainframe then configuring the new environment to use an ASCII-encoding schema is a viable and practical approach. |
2.3. | In some situations the use of third-party software may influence or determine the encoding schema for a new, non-mainframe environment. For example, the use of a relational data base combined with data sharing to other systems or applications that are ASCII encoded may drive a decision to convert to an ASCII-encoded environment. |
3. | When approaching a record content conversion (i.e. EBCDIC-encoding to ASCII-encoding) it is a good practice to use an existing data definition of the record layout if one exists. Creating a new or proprietary set of data definition specifications increases the risk of introducing new errors in the conversion process. |
4. | A COBOL copy file that defines the record layouts within a file is usually available or a COBOL program with a working storage definition may be "cut-and-pasted" into a new COBOL copy file that is then used to create a data conversion process at the field level. |
5. | It can be a challenge to get EBCDIC-encoded data with fixed length fields interspersed with numeric fields of binary or packed values into a format that is suitable for an ASCII-encoded Excel spreadsheet. |
5.1. | Converting the format of a numeric field from Packed or Binary to a numeric field of display or print text will increase the size of the new field and the new record. |
5.2. | The Data Conversion Technologies available from SimoTime Enterprises may be used to do the numeric field format conversion along with the file format and record content conversion. |
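The field-level approach described in the list above can be sketched in a few lines of Python. This is only an illustration, not the code that UTCONVRT generates. The record layout (CUST-NAME, CUST-BAL), the sample data and the helper names are hypothetical, and code page 037 is assumed for the EBCDIC side.

```python
# Hypothetical record layout, invented for this sketch:
#   CUST-NAME  PIC X(10)          10 bytes of EBCDIC text
#   CUST-BAL   PIC S9(5) COMP-3    3 bytes of packed decimal
LAYOUT = [("CUST-NAME", 0, 10, "text"), ("CUST-BAL", 10, 3, "packed")]


def unpack_comp3(raw: bytes) -> int:
    """Decode a packed-decimal field: two digits per byte, sign in the last nibble."""
    nibbles = []
    for byte in raw:
        nibbles.append(byte >> 4)
        nibbles.append(byte & 0x0F)
    sign = nibbles.pop()  # 0xD (or 0xB) marks a negative value
    value = int("".join(str(n) for n in nibbles))
    return -value if sign in (0x0B, 0x0D) else value


def convert_record(record: bytes) -> bytes:
    """Translate text fields from EBCDIC (cp037) to ASCII; copy packed fields as-is."""
    out = bytearray()
    for _name, offset, length, kind in LAYOUT:
        field = record[offset:offset + length]
        if kind == "text":
            out += field.decode("cp037").encode("ascii")
        else:
            out += field
    return bytes(out)


# Sample record: name "SMITH" and a balance of -12345 packed as X'12345D'.
ebcdic_record = "SMITH     ".encode("cp037") + bytes([0x12, 0x34, 0x5D])

converted = convert_record(ebcdic_record)
balance = unpack_comp3(converted[10:13])

# Translating the whole record as a single string damages the packed field.
naive = ebcdic_record.decode("cp037").encode("latin-1")

# Expanding the 3-byte packed field to display text needs 6 bytes (sign plus 5 digits).
display_balance = f"{balance:+06d}"
```

In this sketch the packed field survives the field-level conversion unchanged, while the single-string translation alters its bytes. The expanded display form is twice the size of the packed form, which is the record growth mentioned in item 5.1.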
The data conversion process should be a repeatable process with an audit or validation trail. The process should be executable as an automated, unattended process. Requiring operator input during the conversion process introduces an exposure point for error.
We have made a significant effort to ensure the documents and software technologies are correct and accurate. We reserve the right to make changes without notice at any time. The function delivered in this version is based upon the enhancement requests from a specific group of users. The intent is to provide changes as the need arises and in a timeframe that is dependent upon the availability of resources.
Copyright © 1987-2025
SimoTime Technologies and Services
All Rights Reserved
There are two categories of data conversion requirements based upon the target environment configuration.
1. | Migrate (or move) the application and then retire the application on the Mainframe. |
2. | Migrate (or replicate) the application and then coexist with or complement the mainframe by sharing data and processes. |
This section provides a list of questions that will aid in determining the scope of effort for creating the process to do data file conversions. It is important to provide platform flexibility as to where the conversion is done. The following questions should be answered at the beginning of the process to migrate an application and its associated data.
The following is used to determine the number of files and the file types and characteristics.
1. | How many Key-Sequenced-Data-Sets (KSDS or Indexed Files) will be converted? | ______ |
2. | How many Sequential files will be converted? | ______ |
3. | Do you have COBOL copy files that define the record layouts? | Y or N |
4. | Do you use duplicate field names across group items (for example, FIELD-A of GROUP-01 and FIELD-A of GROUP-02)? | Y or N |
5. | Do you have packed (i.e. COMP-3) and binary (i.e. COMP) fields? | Y or N |
6. | Do you have signed, zoned-decimal fields? | Y or N |
7. | Do your files have Floating Point fields (i.e. COMP-1 or COMP-2)? | Y or N |
8. | Will you be using Line Sequential (i.e. ASCII/Text) files in the Windows environment? | Y or N |
The following is used to determine the basic requirements for the data file conversion effort.
1. | Is there a requirement to do the conversion on the mainframe? | Y or N |
2. | Is there a requirement to do the conversion on the server? | Y or N |
3. | Is there a requirement to do the conversion on the client machine? | Y or N |
4. | Is there a requirement to do the conversion during the transfer (i.e. download/upload) process? | Y or N |
5. | Is there a requirement to do the conversion at the file level? | Y or N |
6. | Is there a requirement to do the conversion at the record level? | Y or N |
7. | Is there a requirement for the conversion routine to be a callable module? | Y or N |
8. | Is there a requirement for the conversion routine to handle variable length records? | Y or N |
9. | Is there a requirement for the conversion routine to handle multiple record types? | Y or N |
10. | Is there a requirement for the conversion process to be used or adapted to convert other data structures (e.g. IDMS, DataCom, Adabas, etc.)? | Y or N |
The following is used to determine the effort of transferring the data files.
1. | File Transfer Protocol (FTP) will be used as the transfer medium. | Y or N |
2. | Micro Focus Mainframe Access will be used as the transfer medium. | Y or N |
3. | Machine Readable Media will be used as the transfer medium. | Y or N |
4. | Other Comments |
This section provides additional detail about the process for generating, compiling and executing a data file conversion program. This discussion is limited to the Windows environment. However, the process is very similar for the mainframe and UNIX environments once the COBOL source code has been generated on a Windows system.
If the requirement is to do the data conversion on a Windows client machine that is used for development and testing, and the data is limited to sequential or indexed files, then the Data File CONVerter included in Net Express (DFCONV) offers a cost-effective and convenient solution. However, Micro Focus does not include the components for DFCONV in the various run-time (or production) offerings (Application Server and Enterprise Server), so DFCONV cannot be fully implemented in production environments.
Note: Both DFCONV and the Data File Editor are delivered as technologies for a development environment running on a Windows System.
Note: DFCONV is only available on Windows. It is not available on UNIX.
In many situations the requirement goes beyond just a simple file conversion that is performed on a Windows platform. The following may require more function than provided by DFCONV.
Possible Requirements List for Data File Conversion
The following is a reasonable three-step guideline for approaching a data file conversion effort.
Three Possible Approaches to Data File Conversion
With the exception of files with variable length records that are dynamically created at execution time a COBOL copy file that defines a record layout is usually available. If a copy file is not available a COBOL working storage definition can usually be "cut-and-pasted" to create a COBOL copy file.
With DFCONV the COBOL copy file or a working storage definition may be used as a record definition for the data file conversion.
With SimoTime technologies the COBOL copy file may be used to generate the COBOL source code that may be compiled and executed on a Mainframe system (MVS or VSE), a Wintel System with Micro Focus or a UNIX system with Micro Focus. Using a COBOL copy file is not an absolute requirement since an optional feature of the SimoTime Technologies provides for data conversion based on position within record.
If the requirement is to only convert the file format in order to FTP between systems then the REPRO function of IDCAMS may be the solution.
The following IDCAMSJ1.jcl is an example of how to use IDCAMS to convert a VSAM, KSDS to a record sequential file.
//IDCAMSJ1 JOB SIMOTIME,ACCOUNT,CLASS=1,MSGCLASS=0,NOTIFY=CSIP1
//* *******************************************************************
//*         This program is provided by SimoTime Technologies         *
//*            (C) Copyright 1987-2019 All Rights Reserved            *
//*               Web Site URL: http://www.simotime.com               *
//*                   e-mail: helpdesk@simotime.com                   *
//* *******************************************************************
//*
//* TEXT   - COPY (OR REPRO) A KSDS TO A SEQUENTIAL FILE
//* AUTHOR - SIMOTIME TECHNOLOGIES
//* DATE   - JANUARY 01, 1989
//*
//         EXEC PGM=IDCAMS
//SYSPRINT DD  SYSOUT=A
//KSDGET01 DD  DSN=SIMOTIME.DATA.ITEMMAST,DISP=OLD
//ITEMRSEQ DD  DSN=SIMOTIME.DATA.ITEMRSEQ,UNIT=SYSDA,
//             SPACE=(TRK,(10,10)),
//             DISP=(NEW,CATLG),
//             DCB=(LRECL=512,RECFM=FB)
//SYSIN    DD  *
  REPRO -
    INFILE(KSDGET01) -
    OUTFILE(ITEMRSEQ)
/*
The following IDCAMSJ2.jcl is an example of how to use IDCAMS to convert a record sequential file to a VSAM, KSDS.
//IDCAMSJ2 JOB SIMOTIME,ACCOUNT,CLASS=1,MSGCLASS=0,NOTIFY=CSIP1
//* *******************************************************************
//*         This program is provided by SimoTime Technologies         *
//*            (C) Copyright 1987-2019 All Rights Reserved            *
//*               Web Site URL: http://www.simotime.com               *
//*                   e-mail: helpdesk@simotime.com                   *
//* *******************************************************************
//*
//* TEXT   - COPY (OR REPRO) A SEQUENTIAL FILE TO A VSAM, KSDS
//* AUTHOR - SIMOTIME TECHNOLOGIES
//* DATE   - JANUARY 01, 1989
//*
//         EXEC PGM=IDCAMS
//SYSPRINT DD  SYSOUT=A
//ITEMRSEQ DD  DSN=SIMOTIME.DATA.ITEMRSEQ,DISP=(SHR)
//ITEMKSD2 DD  DSN=SIMOTIME.DATA.ITEMKSD2,
//             SPACE=(TRK,(10,1),RLSE),
//             DISP=(NEW,CATLG,DELETE),
//             LRECL=512,KEYOFF=0,KEYLEN=12,RECORG=KS
//SYSIN    DD  *
  REPRO -
    INFILE(ITEMRSEQ) -
    OUTFILE(ITEMKSD2)
/*
When converting a Sequential file that is downloaded from a Mainframe System (EBCDIC-encoded) to an Indexed file that is created on a Windows System (ASCII-encoded) it may be necessary to make adjustments for the differences in the ASCII and EBCDIC collating sequences. This is especially true if the field that determines the sequence of the file is alpha-numeric (if the key field is all numeric then sequencing should not be a problem).
If both the file format conversion (Sequential to Indexed with an alpha-numeric key) and the file content conversion (EBCDIC to ASCII) are being done in a single pass then an unordered load of the new Indexed file must be done to avoid getting an out-of-sequence error when adding new records into the new Indexed file.
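The collating sequence difference can be shown with a small Python sketch. The three keys are hypothetical and code page 037 is assumed for the EBCDIC side. In EBCDIC, lowercase letters sort before uppercase letters and digits sort last. In ASCII, digits sort first and lowercase letters sort last.

```python
def is_ascending(keys):
    """Return True when every key is less than or equal to the next one."""
    return all(a <= b for a, b in zip(keys, keys[1:]))


# Hypothetical alpha-numeric keys in the order a mainframe KSDS would hold them,
# that is, sorted by their EBCDIC (cp037) byte values.
ebcdic_keys = sorted(k.encode("cp037") for k in ["1000", "A100", "a100"])

# Convert each key to ASCII while keeping the original record order.
converted_keys = [k.decode("cp037").encode("ascii") for k in ebcdic_keys]

in_sequence_before = is_ascending(ebcdic_keys)
in_sequence_after = is_ascending(converted_keys)

# Order that an ASCII-encoded indexed file expects.
ascii_order = sorted(converted_keys)
```

Here the converted keys arrive as a100, A100, 1000, which is descending in ASCII. An ordered load would reject the second record, which is why the load must be unordered. Sorting the converted file before loading is another possibility, although that alternative is an assumption of this sketch and not something the text above prescribes.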
Explore a suite of sample programs that describes the ASCII and EBCDIC sorting or collating sequences and provides an example of programming logic that will work in an EBCDIC environment (i.e. Mainframe System with z/OS) but fail in an ASCII environment (i.e. Linux, UNIX or Windows with Micro Focus).
An alternate index should not be created until after the file has been converted. Because the conversion changes the value (i.e. encoding format) of the alternate key (or index), the alternate index should only be built after the data that makes up the alternate key has been converted.
Building the alternate index may be done using IDCAMS and JCL (Mainframe Express or ES/MTO with the Batch Facility) or with the Micro Focus BLDINDEX utility.
Each record type within a file should be treated as if the record type were a separate file. If a file contains five (5) different record types then a process must be implemented that will determine the record type and then pass control to the appropriate conversion routine for the record type. The technique used by the SimoTime technology greatly simplifies this effort. Since the SimoTime technology generates a callable routine based on a COBOL copy file it will be necessary to create a callable routine for each record type. Once this is done the user logic that determines the record type may be "cut" from an existing COBOL program that accesses the file and pasted into the SimoTime generated COBOL source code that performs the file I/O. The file I/O program will then call the appropriate conversion routine based on the "cut-and-pasted" user logic.
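The dispatch logic described above can be sketched as follows. This is a minimal Python illustration, not the COBOL that the SimoTime technology generates. The one-byte record type code, the record layouts and the function names are hypothetical. In practice the type-determination logic would come from an existing program that accesses the file.

```python
def translate_text(data: bytes) -> bytes:
    """Translate a text field from EBCDIC (cp037) to ASCII."""
    return data.decode("cp037").encode("ascii")


def convert_header(record: bytes) -> bytes:
    # Hypothetical header record: all text.
    return translate_text(record)


def convert_detail(record: bytes) -> bytes:
    # Hypothetical detail record: 5 bytes of text, then a packed field copied unchanged.
    return translate_text(record[:5]) + record[5:]


# One conversion routine per record type, keyed by the record type code.
CONVERTERS = {"H": convert_header, "D": convert_detail}


def convert_record(record: bytes) -> bytes:
    """Determine the record type and pass control to the matching routine."""
    record_type = record[:1].decode("cp037")
    try:
        converter = CONVERTERS[record_type]
    except KeyError:
        raise ValueError(f"unknown record type {record_type!r}") from None
    return converter(record)


header_out = convert_record("HFILE01".encode("cp037"))
detail_out = convert_record("DITEM".encode("cp037") + bytes([0x12, 0x3C]))
```

Raising an error for an unknown record type, instead of copying the record through, keeps an unattended run from writing unconverted data without anyone noticing.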
If a file contains variable length records created by using blank truncation techniques then the conversion process is similar to the process used to convert files with fixed length records.
If the variable length records are built dynamically by appending a variety of structured data segments to a base segment of a record then each segment must be treated as a separate entity and converted accordingly. In other words a separate routine should be generated for each segment of data and the user logic from an existing program should be used to determine the appended segment type, length and structure and do the segment conversion accordingly. Additional time should be allocated to convert this type of record content structure.
The IBM Mainframe has a variety of numeric formats or encoding schemes. This mix of numeric formats has challenged mainframe programmers since the formats were introduced. When migrating data the numeric formats require special consideration. The "BINARY" or "PACKED" numeric fields should not be converted (or translated) between EBCDIC and ASCII. These fields are supported on a Mainframe System and in a Micro Focus environment running on Windows, Linux or UNIX when using a mainframe dialect.
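As a hedged illustration of why numeric fields need special handling, the Python sketch below decodes a signed zoned-decimal field and a binary field directly from their mainframe byte values. The field definitions and values are hypothetical, and the helper names are not part of any SimoTime or Micro Focus API. It assumes code page 037 and the usual mainframe conventions: the sign is carried in the zone nibble of the last zoned byte, and binary fields are stored big-endian.

```python
def decode_zoned(raw: bytes) -> int:
    """Decode an EBCDIC zoned-decimal field; the last zone nibble carries the sign."""
    digits = "".join(str(byte & 0x0F) for byte in raw)
    sign_zone = raw[-1] >> 4  # 0xD (or 0xB) marks a negative value
    value = int(digits)
    return -value if sign_zone in (0x0B, 0x0D) else value


def decode_binary(raw: bytes) -> int:
    """Decode a signed binary field stored in big-endian byte order."""
    return int.from_bytes(raw, byteorder="big", signed=True)


zoned_field = bytes([0xF1, 0xF2, 0xD3])  # hypothetical PIC S9(3), value -123
binary_field = bytes([0xFF, 0x85])       # hypothetical PIC S9(4) COMP, value -123

zoned_value = decode_zoned(zoned_field)
binary_value = decode_binary(binary_field)

# Treating the zoned field as plain text turns the signed digit into a letter.
translated_as_text = zoned_field.decode("cp037")
```

The text translation of the zoned field yields the string 12L, which no longer looks like a number. A check like this is one way to confirm that a field-level routine preserved the numeric value.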
The following list provides external links to reference material and examples for the various types of numeric formats used with the COBOL programming language and/or the Mainframe System.
Additional Information for the various Types of Numeric Formats used with COBOL
The following items should be considered when thinking about expanding packed or binary numeric fields.
Considerations for Expanding Numeric Fields
Binary, Packed or Floating Point numeric fields may be used with "Record Sequential" files that are EBCDIC or ASCII encoded. However, these fields will cause a problem if they are included in a "Line Sequential" (or ASCII/Text) file.
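A short Python sketch shows one way this problem appears. The 10-byte record layout is hypothetical, and a single line-feed byte is assumed as the record delimiter for simplicity. A binary field holding the value 10 contains the byte X'0A', which a line-oriented reader treats as an end-of-record marker.

```python
RECORD_LENGTH = 10  # hypothetical: 8 bytes of text plus a 2-byte binary field


def read_record_sequential(data: bytes, record_length: int = RECORD_LENGTH):
    """Split a record sequential file into fixed-length records."""
    return [data[i:i + record_length] for i in range(0, len(data), record_length)]


def read_line_sequential(data: bytes):
    """Split a line sequential file on the line-feed delimiter."""
    return data.split(b"\n")[:-1]  # drop the empty piece after the final delimiter


records = [
    b"ITEM0001" + (10).to_bytes(2, "big"),   # binary value 10 is X'000A'
    b"ITEM0002" + (300).to_bytes(2, "big"),  # binary value 300 is X'012C'
]

record_file = b"".join(records)
line_file = b"".join(record + b"\n" for record in records)

fixed_records = read_record_sequential(record_file)
line_records = read_line_sequential(line_file)
```

The record sequential reader returns the two original records. The line sequential reader returns three pieces because the first record is split at the X'0A' byte inside its binary field.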
A basic requirement is to ensure that the number of records read from the input file equals the number of records written to the output file. The SimoTime technology has an option to provide a read and write count for an ordered load of an indexed file. When doing an unordered load it will provide a read, write and update count.
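The following Python sketch shows how such counts might be kept during an unordered load. It is an assumption about what the counts mean, not a description of the SimoTime implementation: a record whose key is already present is counted as an update, and any other record is counted as a write. The in-memory dictionary stands in for the indexed file, and the record and key lengths are hypothetical.

```python
import io


def load_indexed(source, record_length: int, key_length: int):
    """Load fixed-length records into a keyed store and count reads, writes and updates."""
    counts = {"read": 0, "write": 0, "update": 0}
    indexed = {}
    while True:
        record = source.read(record_length)
        if not record:
            break
        if len(record) != record_length:
            raise ValueError("truncated record at end of input")
        counts["read"] += 1
        key = record[:key_length]
        if key in indexed:
            counts["update"] += 1  # key already loaded, so the record is replaced
        else:
            counts["write"] += 1   # first record for this key
        indexed[key] = record
    return indexed, counts


# Three hypothetical 8-byte records with 4-byte keys, not in key sequence.
input_file = io.BytesIO(b"0002AAAA" + b"0001BBBB" + b"0002CCCC")
indexed_file, counts = load_indexed(input_file, record_length=8, key_length=4)

# Audit check: every record read is accounted for as a write or an update.
balanced = counts["read"] == counts["write"] + counts["update"]
```

Writing the counts to a log at the end of each run gives the unattended process a simple audit trail.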
Management and/or the auditors/consultants should be involved in the data migration (and data file content conversion) process as early as possible. This will help ensure the process will meet the requirements and maintain the necessary level of data integrity at each step in the process.
When converting data files an existing definition (such as a COBOL copy file) of the data structures (or record layouts) should be used. This will avoid introducing errors in a process that requires the data to be defined by a new or proprietary format.
The data conversion process should be a repeatable process with an audit trail. The process should be executable as an automated, unattended process. Operator intervention during the conversion process should be considered as an exposure to introducing errors into the process.
Here is what we have done over the years when faced with converting data files between ASCII and EBCDIC. In the world of programming there are many ways to solve a problem. This document and the links to other documents are intended to provide a choice of alternatives.
The Evolution of Data File Conversion Techniques
Today, we take the following approach.
Data File Conversion Techniques used in Today's Environments
Anyone considering a data conversion should seek additional assistance from a consulting or programming services organization that has experience in this area.
In the world of programming there are many ways to solve a problem. This documentation and software were developed and tested on systems that are configured for a SIMOTIME environment based on the hardware, operating systems, user requirements and security requirements. Therefore, adjustments may be needed to execute the jobs and programs when transferred to a system of a different architecture or configuration.
SIMOTIME Services has experience in moving or sharing data or application processing across a variety of systems. For additional information about SIMOTIME Services or Technologies please contact us using the information in the Contact or Feedback section of this document.
Software Agreement and Disclaimer
Permission to use, copy, modify and distribute this software, documentation or training material for any purpose requires a fee to be paid to SimoTime Technologies. Once the fee is received by SimoTime the latest version of the software, documentation or training material will be delivered and a license will be granted for use within an enterprise, provided the SimoTime copyright notice appear on all copies of the software. The SimoTime name or Logo may not be used in any advertising or publicity pertaining to the use of the software without the written permission of SimoTime Technologies.
SimoTime Technologies makes no warranty or representations about the suitability of the software, documentation or learning material for any purpose. It is provided "AS IS" without any expressed or implied warranty, including the implied warranties of merchantability, fitness for a particular purpose and non-infringement. SimoTime Technologies shall not be liable for any direct, indirect, special or consequential damages resulting from the loss of use, data or projects, whether in an action of contract or tort, arising out of or in connection with the use or performance of this software, documentation or training material.
This section includes links to documents with additional information that are beyond the scope and purpose of this document. The first group of documents may be available from a local system or via an Internet connection, the second group of documents will require an Internet connection.
Note: A SimoTime License is required for the items to be made available on a local system or server.
The following links may be to the current server or to the Internet.
Note: The latest versions of the SimoTime Documents and Program Suites are available on the Internet. If a user has a SimoTime Enterprise License the Documents and Program Suites may also be available on a local server.
Explore How to Generate a Data File Convert Program using simple specification statements in a Process Control File (PCF). This link to the User Guide includes the information necessary to create a Process Control File and generate the COBOL programs that will do the actual data file conversion. The User Guide contains a list of the PCF statements that are used for the data file convert process.
Explore how to convert between the various numeric formats (or data types such as DISPLAY, COMP, COMP-3 or DECIMAL, BINARY and PACKED) used with COBOL and on an IBM Mainframe System. This example illustrates how to display an actual hexadecimal (or Hex Dump) content of a numeric field using a callable dump routine.
Explore a Series of White Papers for non-relational data files. This includes information about data file management in a diverse or mixed systems environment.
Explore a quick overview of the data file management tasks for data file transfer, conversion and comparison. Ever since the second computer was introduced into the world the file management tasks of data file transfer, share, convert and compare (or data file validation) have been technically challenging.
Explore the alternatives for transferring data files between systems. This link provides access to a repository of information that includes the transferring and/or sharing of data between Mainframe (z/OS or VSE), Linux, UNIX and Windows Systems.
Explore the Principles of Data File Conversion. This link includes guidelines for defining requirements and determining the scope of effort for a data conversion effort.
Explore the Principles of Data File Comparison. This link includes guidelines for defining requirements and determining the scope of effort for a data comparison effort.
Explore An Enterprise System Model that describes and demonstrates how Applications that were running on a Mainframe System and non-relational data that was located on the Mainframe System were copied and deployed in a Microsoft Windows environment with Micro Focus Enterprise Server.
Explore The ASCII and EBCDIC Translation Tables. These tables are provided for individuals that need to better understand the bit structures and differences of the encoding formats.
Explore The File Status Return Codes that are used to interpret the results of accessing VSAM data sets and/or QSAM files.
The following links will require an Internet connection.
A good place to start is The SimoTime Home Page for access to white papers, program examples and product information. This link requires an Internet Connection.
Explore The Micro Focus Web Site for more information about products (including Micro Focus COBOL) and services available from Micro Focus. This link requires an Internet Connection.
Explore the GnuCOBOL Technologies available from SourceForge. SourceForge is an Open Source community resource dedicated to helping open source projects be as successful as possible. GnuCOBOL (formerly OpenCOBOL) is a COBOL compiler with run time support. The compiler (cobc) translates COBOL source to executable using intermediate C, designated C compiler and linker. This link will require an Internet Connection.
Explore the Glossary of Terms for a list of terms and definitions used in this suite of documents and white papers.
This document was created and is maintained by SimoTime Technologies. If you have any questions, suggestions, comments or feedback please use the following contact information.
1. | Send an e-mail to our helpdesk. |
1.1. | helpdesk@simotime.com. |
2. | Our telephone numbers are as follows. |
2.1. | 1 415 763-9430 office-helpdesk |
2.2. | 1 415 827-7045 mobile |
We appreciate hearing from you.
SimoTime Technologies was founded in 1987 and is a privately owned company. We specialize in the creation and deployment of business applications using new or existing technologies and services. We have a team of individuals that understand the broad range of technologies being used in today's environments. Our customers include small businesses using Internet technologies to corporations using very large mainframe systems.
Quite often, reaching larger markets or providing a higher level of service to existing customers requires the newer Internet technologies to work in a complementary manner with existing corporate mainframe systems. We specialize in preparing applications and the associated data that are currently residing on a single platform to be distributed across a variety of platforms.
Preparing the application programs will require the transfer of source members that will be compiled and deployed on the target platform. The data will need to be transferred between the systems and may need to be converted and validated at various stages within the process. SimoTime has the technology, services and experience to assist in the application and data management tasks involved with doing business in a multi-system environment.
Whether you want to use the Internet to expand into new market segments or as a delivery vehicle for existing business functions simply give us a call or check the web site at http://www.simotime.com