Binary or COMP Format
Description and Discussion
This document provides an overview (description and discussion of content and size) of binary (or COMP) fields as they are implemented on an IBM Mainframe System or a Micro Focus environment implemented on a Linux, UNIX or Windows System.
Note: The items in this document apply to applications that are written in COBOL, Mainframe Assembler (HLASM) or PL/I. The IBM Mainframe architecture drove many of the numeric formats that existed in the early ANSI specifications for COBOL, and these formats have been carried forward to the current COBOL ANSI specifications.
The following table shows the structure of a three digit numeric field using the Binary format that is used on an IBM Mainframe System. The COBOL syntax would be USAGE IS COMPUTATIONAL. The field contains a value of one-hundred-twenty-three (or 123). Since the binary format stores the number as an actual binary value the field will only be two (2) bytes in length.
Note 1: A binary field that is defined as "Unsigned" (e.g. PIC 999) is an implied positive value. A two (2) byte unsigned binary field may contain a range of implied positive values from 0 to 65,535.
Note 2: A binary field that is defined as "Signed" (e.g. PIC S999) uses the high-order, leftmost bit as the sign. A zero (0) is a positive sign and a one (1) is a negative sign. Negative values are stored in two's complement form, so a two (2) byte signed binary field may contain a range of values from -32,768 to +32,767.
The BINARY Format for a Numeric Field
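The layout described above can be sketched in a few lines. Python is used here purely for illustration and is not part of the original material. The helper name `to_halfword` is hypothetical, and the sketch assumes the full 16 bits of the field are available to hold the value.

```python
def to_halfword(value: int, signed: bool) -> bytes:
    """Store an integer in a 2-byte field, most significant byte first."""
    return value.to_bytes(2, byteorder="big", signed=signed)

# The value 123 occupies two bytes: x'007B'.
field = to_halfword(123, signed=False)
print(field.hex().upper())

# Range of a two-byte field when all 16 bits are available.
unsigned_range = (0, 2**16 - 1)          # 0 through 65,535
signed_range = (-(2**15), 2**15 - 1)     # -32,768 through +32,767
print(unsigned_range, signed_range)

# A negative value is held in two's complement form: -123 is x'FF85'.
negative_field = to_halfword(-123, signed=True)
print(negative_field.hex().upper())
```

The leftmost bit of `negative_field` is one (1), which is what marks the value as negative.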
We have made a significant effort to ensure the documents and software technologies are correct and accurate. We reserve the right to make changes without notice at any time. The function delivered in this version is based upon the enhancement requests from a specific group of users. The intent is to provide changes as the need arises and in a timeframe that is dependent upon the availability of resources.
Copyright © 1987-2023
SimoTime Technologies and Services
All Rights Reserved
The creation and processing of COMP or BINARY data on a Windows or UNIX platform must be done in the same manner as on the mainframe. On the mainframe COMP or BINARY fields must be 2, 4, or 8 bytes in length (the mainframe was originally a half-word, full-word and double-word centric system). On Linux, UNIX or Windows (using Micro Focus COBOL with Net Express or Server Express) the COMP or BINARY fields may be 1 through 8 bytes in length.
Note: The syntax for COBOL is "USAGE IS COMPUTATIONAL". However, this is usually abbreviated to COMP or may be coded as BINARY.
The record layout for the Item Master File, shown below, contains two (2) COMP or BINARY fields. These fields are defined as PIC 9(7) and may contain values from zero (0) through 9,999,999, or the binary values x'000000' through x'98967F', which fit in a three (3) byte binary field.
Herein lies the problem. In the EBCDIC-encoded, half-word, full-word, double-word mainframe environment each of these fields is allocated with an actual length of four (4) bytes. In the Micro Focus, ASCII-encoded, byte-oriented environment each of these fields is allocated with an actual length of three (3) bytes.
       01  ITEM-RECORD.
           05  ITEM-NUMBER                  PIC X(12).
           05  ITEM-DATA.
               10  ITEM-DESCRIPTION         PIC X(48).
               10  ITEM-QTY-ONHAND          PIC 9(7)       COMP.
               10  ITEM-QTY-ALLOCATED       PIC 9(7)       COMP.
               10  ITEM-UNIT-OF-MEASURE     PIC X(16).
               10  ITEM-COST                PIC S9(7)V9(2) COMP-3.
               10  ITEM-PRICE               PIC S9(7)V9(2) COMP-3.
               10  ITEM-LADATE              PIC X(8).
               10  ITEM-LATIME              PIC X(8).
               10  ITEM-TOKEN               PIC X(3).
               10  ITEM-D-CODE-1            PIC X.
               10  ITEM-D-PERCENT-1         PIC S9(3)V9(4).
               10  ITEM-D-CODE-2            PIC X.
               10  ITEM-D-PERCENT-2         PIC S9(3)V9(4).
               10  ITEM-D-CODE-3            PIC X.
               10  ITEM-D-PERCENT-3         PIC S9(3)V9(4).
               10  FILLER                   PIC X(375).
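To see why the one-byte difference matters, the following sketch (Python, for illustration only) adds up the field lengths of the record layout under both allocation rules. The sizes in the table are assumptions based on the usual storage rules: a nine-digit COMP-3 field takes five bytes, and a seven-digit zoned-decimal field without a separate sign takes seven bytes. The names `FIELDS` and `layout` are hypothetical.

```python
# (field name, bytes on the mainframe, bytes with byte-oriented allocation)
FIELDS = [
    ("ITEM-NUMBER",          12,  12),
    ("ITEM-DESCRIPTION",     48,  48),
    ("ITEM-QTY-ONHAND",       4,   3),   # PIC 9(7) COMP
    ("ITEM-QTY-ALLOCATED",    4,   3),   # PIC 9(7) COMP
    ("ITEM-UNIT-OF-MEASURE", 16,  16),
    ("ITEM-COST",             5,   5),   # PIC S9(7)V9(2) COMP-3
    ("ITEM-PRICE",            5,   5),   # PIC S9(7)V9(2) COMP-3
    ("ITEM-LADATE",           8,   8),
    ("ITEM-LATIME",           8,   8),
    ("ITEM-TOKEN",            3,   3),
    ("ITEM-D-CODE-1",         1,   1),
    ("ITEM-D-PERCENT-1",      7,   7),   # zoned decimal, sign overpunched
    ("ITEM-D-CODE-2",         1,   1),
    ("ITEM-D-PERCENT-2",      7,   7),
    ("ITEM-D-CODE-3",         1,   1),
    ("ITEM-D-PERCENT-3",      7,   7),
    ("FILLER",              375, 375),
]

def layout(column: int):
    """Return ({field name: starting offset}, record length) for one size column."""
    table, position = {}, 0
    for entry in FIELDS:
        table[entry[0]] = position
        position += entry[column]
    return table, position

mainframe_offsets, mainframe_length = layout(1)
byte_offsets, byte_length = layout(2)

print(mainframe_length, byte_length)                    # 512 510
print(mainframe_offsets["ITEM-UNIT-OF-MEASURE"],
      byte_offsets["ITEM-UNIT-OF-MEASURE"])             # 68 66
```

Under these assumptions the record is 512 bytes on the mainframe and 510 bytes with byte-oriented allocation, and every field after the two COMP items starts at a different offset. A program compiled with one rule would therefore misread a file written under the other.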
The solution to this problem for the Micro Focus environment is to use the IBMCOMP and NOTRUNC directives when compiling the programs. This will enforce the mainframe rules for COMP or BINARY fields and the field lengths will be the same as the mainframe.
An alternative solution is to modify the copy file and change the PIC 9(7) to PIC 9(9). However, this solution requires a source code change and is not recommended during the first phase of a data migration.
The following table shows the COBOL picture clause, the number of digits, the length of a packed field, the length of a binary (COMP) field for an IBM Mainframe and the length of a binary field for the Linux, UNIX and Windows (LUW) environments running Micro Focus COBOL.
The differences in field (or data string) length are highlighted in red.
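The lengths for each picture clause can also be computed. The sketch below is in Python, for illustration only, and rests on stated assumptions: a packed field needs one nybble per digit plus one for the sign; a mainframe binary field is a half-word, full-word or double-word; and the byte-oriented allocation picks the smallest number of bytes that can hold the largest value the picture allows. The last rule is inferred from the three-byte PIC 9(7) case discussed earlier, so check the compiler documentation for the authoritative values. All function names are hypothetical.

```python
def packed_length(digits: int) -> int:
    """Bytes for a packed-decimal (COMP-3) field: one nybble per digit plus a sign nybble."""
    return digits // 2 + 1

def mainframe_binary_length(digits: int) -> int:
    """Bytes for a mainframe COMP field: half-word, full-word or double-word."""
    if digits <= 4:
        return 2
    if digits <= 9:
        return 4
    return 8

def minimal_binary_length(digits: int, signed: bool = False) -> int:
    """Smallest byte count that can hold the largest value of an n-digit picture."""
    largest = 10**digits - 1
    size = 1
    # A signed field gives up one bit to the sign.
    while largest > 2 ** (8 * size - (1 if signed else 0)) - 1:
        size += 1
    return size

print("digits  packed  binary(mainframe)  binary(byte-oriented, unsigned)")
for digits in range(1, 19):
    mainframe = mainframe_binary_length(digits)
    minimal = minimal_binary_length(digits)
    marker = "  <-- differs" if mainframe != minimal else ""
    print(f"{digits:>6}  {packed_length(digits):>6}  {mainframe:>17}  {minimal:>8}{marker}")
```

Passing `signed=True` to `minimal_binary_length` shows how the sign bit shifts some boundaries, for example a signed seven-digit field needs four bytes rather than three.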
It is important to note that Micro Focus Mainframe Express supports the mainframe format for binary (or COMP) fields. This is accomplished by pre-setting the compiler directives IBMCOMP and NOTRUNC to force this behavior. These directives may also be used with Net Express, but they must be set manually since the default for Net Express is to allow binary fields to be any length.
Information is usually processed by executing programs that were created using a programming language that separates the user/programmer from the underlying hardware structure. However, this separation is not one-hundred percent. Therefore, some level of awareness or understanding of the hardware may be required. The hardware techniques used to define, process, save and retrieve numeric values are typically an area where some level of understanding is required.
First, let's review how units of information are structured in a typical computer system.
A bit is a unit of information. A bit may be in an "OFF" or "ON" condition that is traditionally referred to as "0" or "1". Four (4) bits make a nybble and eight bits (or 2 nybbles) make a byte.
Note: The spelling "nibble" is the more common one. The variant "nybble" mirrors the spelling of "byte".
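As a small illustration (Python, not part of the original material), a byte can be split into its two nybbles with a shift and a mask. The sample value x'7B' is an arbitrary choice.

```python
byte = 0x7B                      # bit pattern 0111 1011

high_nybble = byte >> 4          # leftmost four bits:  0111 (x'7')
low_nybble = byte & 0x0F         # rightmost four bits: 1011 (x'B')

print(f"{byte:08b} -> {high_nybble:04b} {low_nybble:04b}")

# Putting the two nybbles back together restores the original byte.
rebuilt = (high_nybble << 4) | low_nybble
print(rebuilt == byte)
```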
Next, let's review how units of information are stored in memory and processed by the system.
In order to understand the concept of Big and Little Endian we need to understand memory (typically referred to as RAM or Random Access Memory). We may think of RAM as one large array with many one-byte elements. An "Index" is typically used to access a specific element within an array. An "Address" is typically used to access a specific location within RAM (or memory).
Note: For this discussion we are using a RAM architecture that stores one byte in each RAM location. There are some RAM architectures where each memory location stores something besides a byte. However, these are rare so we will limit this discussion to RAM architectures that are byte oriented.
An IBM Mainframe System has the hardware capability of performing arithmetic tasks using a variety of different formats. Since we are currently focused on Big and Little Endian formats we will limit this discussion to 32-bit (or 4-byte) integers. For COBOL programmers this would be "USAGE IS COMPUTATIONAL".
The question: "How are these 4 bytes placed in RAM for processing?"
The answer: "It depends." The IBM Mainframe System uses Big Endian, while the hardware typically used to run Linux, UNIX and Windows (such as Intel x86) uses Little Endian. Note that Micro Focus COBOL has a compiler directive (IBMCOMP) that provides support for the mainframe format of COMP fields.
The following shows how the 4 bytes of a 32 bit integer are arranged in RAM starting at an address location of 100.
Note: Notice the bytes of the little endian are in the reverse order when compared to big endian format. With little endian the least significant byte is stored first. With big endian the most significant byte is stored first.
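The two arrangements can be sketched as follows (Python, for illustration only). The sample value x'11223344' and the helper name `ram_image` are assumptions made for this sketch. The starting address of 100 is taken from the discussion above.

```python
BASE_ADDRESS = 100   # starting address used in the discussion above

def ram_image(value: int, byteorder: str) -> dict:
    """Map each RAM address to the byte stored there for a 4-byte integer."""
    stored = value.to_bytes(4, byteorder=byteorder)
    return {BASE_ADDRESS + offset: byte for offset, byte in enumerate(stored)}

value = 0x11223344   # assumed sample value
for order in ("big", "little"):
    cells = "  ".join(f"{address}:{byte:02X}"
                      for address, byte in ram_image(value, order).items())
    print(f"{order:<6} endian  {cells}")
```

With big endian, address 100 holds x'11' (the most significant byte). With little endian, address 100 holds x'44' (the least significant byte).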
Next, let's review how data files are used to save and retrieve units of information.
To understand the importance of "endianness" let's take a look at the following example.
Attention: This would result in a difference of 856,747,725 between the expected value and the actual value.
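A minimal sketch of this kind of mismatch follows (Python, for illustration only). The value x'11223344' is an assumption, since the original example is not reproduced in this text. It was chosen because it reproduces the difference quoted above. The sketch writes a 4-byte integer in big endian order, as a mainframe would, and then reads the same bytes back as little endian.

```python
import struct

expected = 0x11223344                     # assumed sample value (287,454,020)

# Bytes as they would be written to a file by a big endian system.
record = struct.pack(">i", expected)

# The same bytes interpreted by a program that assumes little endian.
actual = struct.unpack("<i", record)[0]

print(expected, actual, actual - expected)   # 287454020 1144201745 856747725

# Reading with the byte order that was used for writing restores the value.
restored = struct.unpack(">i", record)[0]
print(restored == expected)
```

The bytes in the file are identical in both cases. Only the interpretation differs, which is why the byte order of binary fields must be agreed upon when files move between systems.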
The purpose of this document is to provide an overview of binary formats for numeric data strings or fields. This document may be used as a tutorial for new programmers or as a quick reference for experienced programmers.
In the world of programming there are many ways to solve a problem. This documentation and software were developed and tested on systems that are configured for a SIMOTIME environment based on the hardware, operating systems, user requirements and security requirements. Therefore, adjustments may be needed to execute the jobs and programs when transferred to a system of a different architecture or configuration.
SIMOTIME Services has experience in moving or sharing data or application processing across a variety of systems. For additional information about SIMOTIME Services or Technologies please contact us using the information in the Contact, Comment or Feedback section of this document.
Permission to use, copy, modify and distribute this software, documentation or training material for any purpose requires a fee to be paid to SimoTime Technologies. Once the fee is received by SimoTime the latest version of the software, documentation or training material will be delivered and a license will be granted for use within an enterprise, provided the SimoTime copyright notice appears on all copies of the software. The SimoTime name or Logo may not be used in any advertising or publicity pertaining to the use of the software without the written permission of SimoTime Technologies.
SimoTime Technologies makes no warranty or representations about the suitability of the software, documentation or learning material for any purpose. It is provided "AS IS" without any expressed or implied warranty, including the implied warranties of merchantability, fitness for a particular purpose and non-infringement. SimoTime Technologies shall not be liable for any direct, indirect, special or consequential damages resulting from the loss of use, data or projects, whether in an action of contract or tort, arising out of or in connection with the use or performance of this software, documentation or training material.
Downloads & Links
This section includes links to documents with additional information that are beyond the scope and purpose of this document. The first sub-section requires an internet connection, the second sub-section references locally available documents.
Note: A SimoTime License is required for the items to be made available on a local server.
Current Server or Internet Access
The following links may be to the current server or to the Internet.
Note: The latest versions of the SimoTime Documents and Program Suites are available on the Internet. If a user has a SimoTime Enterprise License the Documents and Program Suites may also be available on a local server.
Explore The Binary or COMP format for numeric data strings. This numeric structure is supported by COBOL and may be explicitly defined with the "USAGE IS COMP" or "USAGE IS BINARY" clause.
Explore The Edited for Display format for numeric data strings. This numeric structure is supported by COBOL and may be used with an edit-mask to prepare the presentation for readability by human beings.
Explore The Packed-Decimal or COMP-3 format for numeric data strings. This numeric structure is supported by COBOL and may be explicitly defined with the "USAGE IS COMP-3" clause.
Explore The Zoned-Decimal format for numeric data strings. This numeric structure is the default numeric for COBOL and may be explicitly defined with the "USAGE IS DISPLAY" clause.
Explore commonly used formats and processing techniques for managing various numeric formats available on the mainframe.
Explore the Numbers Connection for additional information about the structure and processing of numeric data items (or numeric fields).
Explore How to Generate a Data File Convert Program using simple specification statements in a Process Control File (PCF). This link to the User Guide includes the information necessary to create a Process Control File and generate the COBOL programs that will do the actual data file conversion. The User Guide contains a list of the PCF statements that are used for the data file convert process.
Explore a typical data file conversion process that may be required when working in a multi-system environment. This suite of documents describes a model for managing non-relational data structures (Sequential Files and VSAM Data Sets) that contain ASCII or EBCDIC text strings and various numeric formats such as BINARY, PACKED-Decimal and ZONED-Decimal. This model has the capability of creating a test file for an ASCII or EBCDIC encoded environment. This suite of documents will address many of the challenges of doing a record content conversion of a file that will be transferred between an EBCDIC-encoded Mainframe System and an ASCII-encoded Linux, UNIX or Windows System.
Explore The ASCII and EBCDIC Translation Tables. These tables are provided for individuals that need to better understand the bit structures and differences of the encoding formats.
Explore The File Status Return Codes to interpret the results of accessing VSAM data sets and/or QSAM files.
Internet Access Required
The following links will require an Internet connection.
A good place to start is The SimoTime Home Page for access to white papers, program examples and product information. This link requires an Internet Connection
Explore The Micro Focus Web Site for more information about products (including Micro Focus COBOL) and services available from Micro Focus. This link requires an Internet Connection.
Explore the GnuCOBOL Technologies available from SourceForge. SourceForge is an Open Source community resource dedicated to helping open source projects be as successful as possible. GnuCOBOL (formerly OpenCOBOL) is a COBOL compiler with run time support. The compiler (cobc) translates COBOL source to an executable using intermediate C code, a designated C compiler and a linker. This link will require an Internet Connection.
Glossary of Terms
Explore the Glossary of Terms for a list of terms and definitions used in this suite of documents and white papers.
Comments or Feedback
This document was created and is maintained by SimoTime Technologies. If you have any questions, suggestions, comments or feedback please use the following contact information.
We appreciate hearing from you.
SimoTime Technologies was founded in 1987 and is a privately owned company. We specialize in the creation and deployment of business applications using new or existing technologies and services. We have a team of individuals that understand the broad range of technologies being used in today's environments. Our customers include small businesses using Internet technologies to corporations using very large mainframe systems.
Quite often, reaching larger markets or providing a higher level of service to existing customers requires the newer Internet technologies to work in a complementary manner with existing corporate mainframe systems. We specialize in preparing applications and the associated data that are currently residing on a single platform to be distributed across a variety of platforms.
Preparing the application programs will require the transfer of source members that will be compiled and deployed on the target platform. The data will need to be transferred between the systems and may need to be converted and validated at various stages within the process. SimoTime has the technology, services and experience to assist in the application and data management tasks involved with doing business in a multi-system environment.
Whether you want to use the Internet to expand into new market segments or as a delivery vehicle for existing business functions simply give us a call or check the web site at http://www.simotime.com