What you actually need to know before reading COBOL source cold - get DIVISION / PIC / COMP-3 / COPY / PERFORM straight first

COBOL, Legacy Technology, Business Systems, Maintenance, Mainframe

The short version

  • COBOL is a record-definition language before it is a logic language
  • reading only PROCEDURE DIVISION gets you less than half of it. Look at DATA DIVISION first
  • PIC is the shape of an item, USAGE is its internal representation
  • COMP-3 is packed decimal. You see it constantly on amounts and counts
  • 88 is not a separate variable — it is a condition name attached to the value of the item just above it
  • REDEFINES lets you view the same memory in a different shape (it is not a copy)
  • if there is a COPY, you cannot see the whole picture without opening the copybook
  • old source uses fixed format, where column position matters. Whitespace is not decoration

The four DIVISIONs

DIVISION                   what to look for
IDENTIFICATION DIVISION    program name
ENVIRONMENT DIVISION       files, I/O assumptions
DATA DIVISION              record definitions, work areas, arguments
PROCEDURE DIVISION         the actual processing steps

Inside DATA DIVISION the parts that matter most are:

  • FILE SECTION - record definitions for input/output files
  • WORKING-STORAGE SECTION - variables, flags, counters
  • LINKAGE SECTION - arguments passed in from outside (the entry point for subprograms)

How to read fixed format

In old COBOL, column position has meaning.

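  • columns 1-6: sequence number
  • column 7: indicator (* = comment, / = comment with page eject, - = continuation, D = debug line)
  • columns 8-11: Area A (division, section, and paragraph headers; 01 and 77 levels)
  • columns 12-72: Area B (statements and subordinate data items)
  • columns 73-80: identification area, ignored by the compiler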

Tab conversion or left-aligning the file will break it. Check whether the source is fixed format or free format before anything else.
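To make the column rules concrete, here is a minimal sketch of a complete fixed-format program. The program name COLDEMO is made up for illustration, and the sketch assumes a compiler that accepts traditional fixed format (GnuCOBOL in fixed-format mode, for example).

```cobol
      * Columns 1-6 (blank here) are the sequence-number area.
      * The '*' in column 7 makes each of these lines a comment.
       IDENTIFICATION DIVISION.
       PROGRAM-ID. COLDEMO.
       PROCEDURE DIVISION.
      * Division headers and paragraph names start in Area A.
       MAIN-PARA.
      * Statements start in Area B (column 12 or later).
           DISPLAY 'HELLO FROM AREA B'
           STOP RUN.
```

If an editor shifts every line left by even one column, the comment asterisks land in the sequence area and the headers fall out of Area A, which is why the format check comes first.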

DATA DIVISION — the minimum

Level numbers

01  WS-ORDER.                          *> top-level record
    05  WS-ORDER-ID  PIC 9(8).
    05  WS-AMOUNT    PIC S9(7)V99 COMP-3.
    05  WS-STATUS    PIC X.
        88  WS-OK    VALUE '0'.        *> condition name (true when WS-STATUS is '0')
        88  WS-ERROR VALUE '9'.

77  WS-COUNT         PIC 9(4).         *> standalone single item

  • 01: top of a group
  • 02-49: subordinate items (05, 10, 15 ... by convention)
  • 77: standalone single item
  • 88: condition name (a name attached to a value of the item just above)

Do not read 88 as a separate boolean variable. It is just a name for a state of the value.
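A small sketch of how an 88 behaves in practice, reusing the WS-STATUS item from the record above. The program name COND88 is hypothetical, and the sketch assumes a fixed-format COBOL 85 compiler.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. COND88.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-STATUS          PIC X VALUE '0'.
           88  WS-OK          VALUE '0'.
           88  WS-ERROR       VALUE '9'.
       PROCEDURE DIVISION.
       MAIN-PARA.
      * IF WS-OK is shorthand for IF WS-STATUS = '0'.
           IF WS-OK
               DISPLAY 'STATUS IS OK'
           END-IF
      * SET on an 88 stores its VALUE into the parent item.
           SET WS-ERROR TO TRUE
           DISPLAY 'WS-STATUS NOW HOLDS: ' WS-STATUS
           IF WS-ERROR
               DISPLAY 'STATUS IS ERROR'
           END-IF
           STOP RUN.
```

The only storage here is the single byte of WS-STATUS. WS-OK and WS-ERROR occupy nothing; they are tests and setters for that byte.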

PICTURE (PIC)

notation    meaning
X           character
9           digit
S           signed
V           decimal point (logical only; there is no . in the data)
X(10)       10 characters
9(5)        5-digit number
S9(7)V99    signed, 7 integer digits + 2 decimal places

V does not occupy a byte. The decimal position exists only in the definition, so if you look at the file as plain text you see a run of digits with no . and the values look misaligned.
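The implied decimal point can be seen with a short sketch. The names VDEMO, WS-REC, and WS-PRICE are invented for illustration. Displaying the enclosing group shows the raw stored characters, because a group item is treated as alphanumeric.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. VDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-REC.
           05  WS-PRICE       PIC 9(3)V99 VALUE 123.45.
       PROCEDURE DIVISION.
       MAIN-PARA.
      * A group item is treated as plain characters, so this
      * shows the five stored digits with no decimal point.
           DISPLAY 'STORED CHARACTERS: ' WS-REC
      * Arithmetic still honours the implied decimal position.
           ADD 0.55 TO WS-PRICE
           DISPLAY 'AFTER ADDING 0.55: ' WS-REC
           STOP RUN.
```

The two lines should show 12345 and then 12400: the value 123.45 becomes 124.00, and in both cases only the digits are stored.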

USAGE (internal representation)

notation         meaning                          watch out for
DISPLAY          decimal as visible characters    on a mainframe this can mean EBCDIC
COMP / BINARY    binary                           the visible digit count and the internal layout are different things
COMP-3           packed decimal                   amounts and counts. Looks like garbage if you read it as text

DISPLAY does not automatically mean ASCII. On z/OS it is EBCDIC by default.
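As a rough illustration of how USAGE changes storage while PIC stays the same, the sketch below declares one picture three ways and prints the byte length of each. The item names are hypothetical. The DISPLAY item should report 9 bytes and the COMP-3 item 5 bytes (nine digits plus a sign nibble, two nibbles per byte). The COMP size is compiler-dependent but is typically 4 bytes for nine digits. How the numbers are formatted on output also varies by compiler.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. USGDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-DISP            PIC S9(7)V99.
       01  WS-BIN             PIC S9(7)V99 COMP.
       01  WS-PACK            PIC S9(7)V99 COMP-3.
       PROCEDURE DIVISION.
       MAIN-PARA.
      * Same PIC, three different storage sizes.
           DISPLAY 'DISPLAY BYTES: ' FUNCTION LENGTH(WS-DISP)
           DISPLAY 'COMP    BYTES: ' FUNCTION LENGTH(WS-BIN)
           DISPLAY 'COMP-3  BYTES: ' FUNCTION LENGTH(WS-PACK)
           STOP RUN.
```

This is why record lengths computed by counting PIC digits are usually wrong: the byte count comes from PIC and USAGE together.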

REDEFINES / OCCURS / COPY / FILLER

  • REDEFINES: view the same area in a different shape (close to a C union). Not a copy
  • OCCURS: array. OCCURS DEPENDING ON is a variable-length table
  • COPY: a compile-time include. You cannot see the finished record without the copybook
  • FILLER: an unnamed item. Used for reserved space or to pad record length. It still takes bytes
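The sketch below puts REDEFINES, OCCURS, and FILLER side by side. All names (REDEFDEM, WS-DATE, WS-LINE, and so on) are made up for illustration, and COPY is left out because it needs a separate copybook file.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. REDEFDEM.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-DATE            PIC 9(8) VALUE 20240131.
      * Same 8 bytes as WS-DATE, viewed as three fields.
       01  WS-DATE-PARTS REDEFINES WS-DATE.
           05  WS-YYYY        PIC 9(4).
           05  WS-MM          PIC 9(2).
           05  WS-DD          PIC 9(2).
       01  WS-LINE.
           05  WS-QTY         PIC 9(3) OCCURS 3 TIMES.
           05  FILLER         PIC X(5).
       01  WS-IDX             PIC 9.
       PROCEDURE DIVISION.
       MAIN-PARA.
      * Changing the month through the overlay changes WS-DATE.
           MOVE 12 TO WS-MM
           DISPLAY 'WS-DATE IS NOW: ' WS-DATE
      * Subscripts start at 1, not 0.
           PERFORM VARYING WS-IDX FROM 1 BY 1 UNTIL WS-IDX > 3
               COMPUTE WS-QTY (WS-IDX) = WS-IDX * 100
           END-PERFORM
           DISPLAY 'SECOND QTY: ' WS-QTY (2)
      * 3 x 3 bytes of table plus 5 bytes of FILLER.
           DISPLAY 'WS-LINE BYTES: ' FUNCTION LENGTH(WS-LINE)
           STOP RUN.
```

WS-DATE should come out as 20241231 even though no statement names it as a target, and WS-LINE is 14 bytes long even though only 9 of them have a name.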

PROCEDURE DIVISION — the minimum

PERFORM

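PERFORM is both the call-and-return construct (PERFORM paragraph-name) and the loop construct (PERFORM UNTIL ... END-PERFORM). A typical read loop, where EOF is an 88 condition name: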

PERFORM UNTIL EOF
    READ IN-FILE
        AT END SET EOF TO TRUE
        NOT AT END PERFORM PROCESS-REC
    END-READ
END-PERFORM
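Besides the inline loop, PERFORM also appears in out-of-line forms that name a paragraph. The sketch below shows three common shapes. The paragraph names STEP-A and STEP-B are hypothetical.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. PERFDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-I               PIC 9 VALUE 0.
       PROCEDURE DIVISION.
       MAIN-PARA.
      * Out-of-line: run STEP-A, then come back here.
           PERFORM STEP-A
      * Out-of-line loop: run STEP-B three times.
           PERFORM STEP-B 3 TIMES
      * Inline loop: the body sits between PERFORM and END-PERFORM.
           PERFORM VARYING WS-I FROM 1 BY 1 UNTIL WS-I > 2
               DISPLAY 'INLINE PASS ' WS-I
           END-PERFORM
      * Without STOP RUN, control would fall through into STEP-A.
           STOP RUN.
       STEP-A.
           DISPLAY 'IN STEP-A'.
       STEP-B.
           DISPLAY 'IN STEP-B'.
```

When tracing a real program, the out-of-line PERFORMs in the first paragraph usually give the outline of the whole run.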

IF / EVALUATE

  • IF is the basic conditional
  • EVALUATE is the switch/case equivalent
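EVALUATE shows up in two shapes that are worth recognizing on sight. This is a minimal sketch with an invented status item. Unlike a C switch, there is no fall-through between WHEN branches.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. EVALDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-STATUS          PIC X VALUE '9'.
           88  WS-OK          VALUE '0'.
           88  WS-ERROR       VALUE '9'.
       PROCEDURE DIVISION.
       MAIN-PARA.
      * Form 1: compare one item against literal values.
           EVALUATE WS-STATUS
               WHEN '0'
                   DISPLAY 'OK'
               WHEN '9'
                   DISPLAY 'ERROR'
               WHEN OTHER
                   DISPLAY 'UNKNOWN'
           END-EVALUATE
      * Form 2: EVALUATE TRUE picks the first true condition.
           EVALUATE TRUE
               WHEN WS-OK
                   DISPLAY 'OK'
               WHEN WS-ERROR
                   DISPLAY 'ERROR'
               WHEN OTHER
                   DISPLAY 'UNKNOWN'
           END-EVALUATE
           STOP RUN.
```

Both blocks take the ERROR branch here. The EVALUATE TRUE form is the one that pairs naturally with 88 condition names.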

Watch the period .

A period ends the sentence and closes every open scope at once. Old code relies on it instead of END-IF, so a single stray period changes the range of an IF or an AT END clause. NEXT SENTENCE jumps to the statement after the next period, not to the matching END-IF.
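The classic period trap, as a minimal sketch with a made-up item WS-A. The indentation suggests both DISPLAY statements are guarded, but the compiler goes by the period, not the indentation.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. DOTDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-A               PIC 9 VALUE 2.
       PROCEDURE DIVISION.
       MAIN-PARA.
      * The period after the first DISPLAY ends the IF, so the
      * second DISPLAY runs even though WS-A is not 1.
           IF WS-A = 1
               DISPLAY 'A IS 1'.
               DISPLAY 'LOOKS GUARDED BUT ALWAYS RUNS'.
      * With END-IF the scope is explicit.
           IF WS-A = 1
               DISPLAY 'A IS 1'
               DISPLAY 'REALLY GUARDED'
           END-IF.
           STOP RUN.
```

With WS-A set to 2, the only line printed is LOOKS GUARDED BUT ALWAYS RUNS.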

READ / WRITE / CALL

  • READ ... AT END ... is the standard pattern
  • CALL 'SUBPGM' USING ... jumps into another program
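A sketch of how CALL ... USING lines up with LINKAGE SECTION on the receiving side. The programs CALLER and DOUBLER are hypothetical. Real subprograms are usually separate source files; the subprogram is nested here only so the example fits in one file.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. CALLER.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-AMT             PIC 9(4) VALUE 10.
       PROCEDURE DIVISION.
       MAIN-PARA.
      * Arguments are passed by reference unless stated otherwise.
           CALL 'DOUBLER' USING WS-AMT
           DISPLAY 'AFTER CALL: ' WS-AMT
           STOP RUN.
       IDENTIFICATION DIVISION.
       PROGRAM-ID. DOUBLER.
       DATA DIVISION.
       LINKAGE SECTION.
      * LS-AMT has no storage of its own; it maps the caller's item.
       01  LS-AMT             PIC 9(4).
       PROCEDURE DIVISION USING LS-AMT.
       DOUBLE-PARA.
           MULTIPLY 2 BY LS-AMT
           GOBACK.
       END PROGRAM DOUBLER.
       END PROGRAM CALLER.
```

The caller should print 0020. The definitions on both sides must agree in size and USAGE, because nothing checks them at the call boundary. This is one more reason to read LINKAGE SECTION early.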

What lives outside the COBOL

COBOL rarely stands alone.

  • FILE STATUS - the result code after every I/O. Essential for diagnosing file failures
  • EXEC SQL - embedded SQL. The COBOL is just a holder for host variables; the real work is on the SQL side
  • EXEC CICS - CICS transaction context. You need to read the screens and COMMAREA together with the code
  • JCL - on a mainframe batch, which datasets get assigned is decided outside the COBOL itself
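For FILE STATUS, a minimal sketch of the usual pattern: a two-character status item, 88s for the codes that matter, and a check after each I/O. The file name INPUT.DAT and all data names are assumptions for illustration. On a mainframe the ASSIGN clause would normally name a DD from the JCL, not a literal path.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. FSDEMO.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT IN-FILE ASSIGN TO 'INPUT.DAT'
               ORGANIZATION IS SEQUENTIAL
               FILE STATUS IS WS-FS.
       DATA DIVISION.
       FILE SECTION.
       FD  IN-FILE.
       01  IN-REC             PIC X(80).
       WORKING-STORAGE SECTION.
       01  WS-FS              PIC XX.
           88  FS-OK          VALUE '00'.
           88  FS-EOF         VALUE '10'.
           88  FS-NOT-FOUND   VALUE '35'.
       PROCEDURE DIVISION.
       MAIN-PARA.
           OPEN INPUT IN-FILE
           EVALUATE TRUE
               WHEN FS-OK
                   PERFORM READ-ALL
                   CLOSE IN-FILE
               WHEN FS-NOT-FOUND
                   DISPLAY 'INPUT FILE NOT FOUND (35)'
               WHEN OTHER
                   DISPLAY 'OPEN FAILED, FILE STATUS = ' WS-FS
           END-EVALUATE
           STOP RUN.
       READ-ALL.
      * Loop ends on end-of-file (10) or on any other status.
           PERFORM UNTIL NOT FS-OK
               READ IN-FILE
                   AT END CONTINUE
                   NOT AT END DISPLAY IN-REC
               END-READ
           END-PERFORM
           IF NOT FS-EOF
               DISPLAY 'READ FAILED, FILE STATUS = ' WS-FS
           END-IF.
```

When reading existing code, finding the status item named in the SELECT clause and its 88s usually explains most of the error-handling branches.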

A minimum reading order

  1. Walk every COPY - open the copybooks or find the post-expansion source
  2. Pick up the 01-level record definitions - in FILE SECTION, WORKING-STORAGE, and LINKAGE SECTION
  3. Read PIC and USAGE - identify amounts, dates, counts, codes
  4. Search for READ/WRITE/CALL/EXEC SQL/EXEC CICS - locate the I/O and the external boundaries
  5. Follow only the first main path - trace the PERFORM chain from the top of PROCEDURE DIVISION
  6. Look at 88s and status items - the meaning of EOF and ok/error becomes much easier to read
  7. Mark every REDEFINES, OCCURS DEPENDING ON, and COMP-3
  8. Check FILE STATUS - it cuts down misreads around I/O errors

Quick reference

if you see this        think this first
01                     top of a record or group
88                     a named meaning for a flag or status code
PIC X(…)               character item
PIC 9(…)               numeric item; check digits and decimal position
COMP                   binary
COMP-3                 packed decimal; likely an amount or count
REDEFINES              the same area being read a different way
OCCURS                 array
OCCURS DEPENDING ON    variable length; watch what comes after it too
FILLER                 no name, but it has length
COPY                   you need the copybook to see the full record
PERFORM                skeleton of the main path
READ/WRITE             file I/O
EXEC SQL               DB work
EXEC CICS              transaction work
FILE STATUS            I/O result code

Things that trip people up

  1. Treating REDEFINES as a separate variable - it is just the same area viewed differently
  2. Treating 88 as a standalone bool - it is a name attached to a value of the item above
  3. Reading the body and ignoring COPY - half the field definitions live outside the file
  4. Treating MOVE as plain assignment - type-driven conversion, truncation, and decimal alignment happen during the move
  5. Underestimating the period (.) - a stray or missing one silently moves where an IF ends
  6. Mistaking packed decimal or EBCDIC for mojibake - it was never plain text, or it is just not ASCII
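The MOVE point in item 4 is easy to check with a small sketch. The item names are invented. The receiving field's PIC decides what survives the move.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. MOVEDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-BIG-NUM         PIC 9(5) VALUE 12345.
       01  WS-SMALL-NUM       PIC 9(3).
       01  WS-LONG-TEXT       PIC X(7) VALUE 'ABCDEFG'.
       01  WS-SHORT-TEXT      PIC X(3).
       01  WS-PADDED-TEXT     PIC X(5).
       PROCEDURE DIVISION.
       MAIN-PARA.
      * Numeric move: aligned on the decimal point, so the
      * high-order digits are the ones that get dropped.
           MOVE WS-BIG-NUM TO WS-SMALL-NUM
           DISPLAY 'NUMERIC: ' WS-SMALL-NUM
      * Alphanumeric move: left-justified, truncated on the right.
           MOVE WS-LONG-TEXT TO WS-SHORT-TEXT
           DISPLAY 'TEXT:    ' WS-SHORT-TEXT
      * A shorter source is padded with spaces on the right.
           MOVE 'AB' TO WS-PADDED-TEXT
           DISPLAY 'PADDED:  [' WS-PADDED-TEXT ']'
           STOP RUN.
```

The numeric move keeps 345 and the text move keeps ABC. No error is raised in either case, and a compiler may at most warn about the size mismatch.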

Wrap-up

COBOL is not hard because it is old. It is hard because data definitions, external files, and the execution context are tightly bound together, which makes the entry point hard to find.

  • get the map from the DIVISIONs
  • read DATA DIVISION first
  • read item shapes through PIC and USAGE
  • mark COMP-3, REDEFINES, OCCURS, 88, and COPY
  • follow PERFORM, READ, WRITE, and CALL
  • pin down the external boundary with FILE STATUS, EXEC SQL, and EXEC CICS

Once the scale of the map matches the territory, it reads more normally than you would expect.
