Dynamically Reading COBOL Redefines with C#
I can maybe help you, as 2 years ago I have accomplished exactly what you are doing now.
I had to design a MySQL Datawarehouse, including the ETL system, based exclusively on files from a RM COBOL ERP application running on Linux. The application had more than 600 files, and it was still unclear how much of them would finally end up in the database. Most of the important files were indexed, on COMP fields to make it harder, and one of the obvious requirement was that all relationships between files and their indexed keys could be reproduced on the database. So I potentially needed every field of every file.
Giving the number of files, it was out of question to treat all the files, manually and one by one.
I saw only one pragmatic solution to my problem: applying automatic programming. Ie coding a program that would generate programs, from only one source: the cobol copybooks.
I had some restrictions (set by the client) on the technology that I was allowed to use. I finally ended up with a VB.NET application that take the COBOL copybooks in input, and :
- Generates COBOL programs that convert the data in something exploitable, by reading the original indexed files and writing the records in a sequential text file.
- Generates VBA modules with all the code needed to import those data files from MS Access into MySQL (including CREATE TABLE and Indexes)
At the beginning of the project, I ran into exactly the same issues than you now, notably those damn REDEFINES. I found the task of listing and coding all copybook possibilities, if not impossible, at least hazardous. So I looked into another way, and found this :
CB2XML
COBOL copybook to XML converter: SourceForge
This saved me weeks of hard work on copybook parsing and interpreting. It can parse COBOL copybooks to change them into an XML file describing perfectly all PICTURE with a lot of useful attributes, like length or type. It fully support COBOL'86 standards.
Example with an Invoice file ( Facture in french)
000001 FD FACTURE.
000006 01 REC-FACTURE.
000011 03 FS1 PIC X.
000016 03 FS2.
000021 05 FS2A PIC 9.
05 RFS2B PIC X(8).
000026 05 FS2B REDEFINES RFS2B PIC 9(8).
000031 03 FS3.
000036 05 FS3A PIC 9.
000041 05 FS3B PIC X(10).
000046 03 FS4.
000051 05 FS4A PIC 99.
000056 05 FS4B PIC 99.
000061 05 FS4C PIC 99.
000066 03 FS5 PIC X(5).
000071 03 FS6 PIC X(20).
000076 03 FS7 PIC 9.
000081 03 FS8 PIC S9(9)V99 COMP-3.
000086 03 FS9 PIC S9(9)V99 COMP-3.
000091 03 FS10 PIC 9.
000096 03 FS11 PIC S9(9)V99 COMP-3.
000101 03 FS12 PIC S9(9)V99 COMP-3.
000106 03 FS13 PIC S9(9)V99 COMP-3.
000111 03 FS14-15 OCCURS 10.
000116 05 FS14 PIC 9.
000121 05 FS15 PIC S9(9)V99 COMP-3.
000126 05 FS16 PIC S9(9)V99 COMP-3.
000131 03 FS17 OCCURS 10 PIC S9(9)V99 COMP-3.
000136 03 FS18 PIC 9(6).
000141 03 FS19 PIC 9.
000241 03 FILLER PIC X.
Turns into this :
<copybook filename="FD8.COP.CLEAN">
<item display-length="428" level="01" name="REC-FACTURE" position="1" storage-length="428">
<item display-length="1" level="03" name="FS1" picture="X" position="1" storage-length="1"/>
<item display-length="9" level="03" name="FS2" position="2" storage-length="9">
<item display-length="1" level="05" name="FS2A" numeric="true" picture="9" position="2" storage-length="1"/>
<item display-length="8" level="05" name="RFS2B" picture="X(8)" position="3" redefined="true" storage-length="8"/>
<item display-length="8" level="05" name="FS2B" numeric="true" picture="9(8)" position="3" redefines="RFS2B" storage-length="8"/>
</item>
<item display-length="11" level="03" name="FS3" position="11" storage-length="11">
<item display-length="1" level="05" name="FS3A" numeric="true" picture="9" position="11" storage-length="1"/>
<item display-length="10" level="05" name="FS3B" picture="X(10)" position="12" storage-length="10"/>
</item>
<item display-length="6" level="03" name="FS4" position="22" storage-length="6">
<item display-length="2" level="05" name="FS4A" numeric="true" picture="99" position="22" storage-length="2"/>
<item display-length="2" level="05" name="FS4B" numeric="true" picture="99" position="24" storage-length="2"/>
<item display-length="2" level="05" name="FS4C" numeric="true" picture="99" position="26" storage-length="2"/>
</item>
<item display-length="5" level="03" name="FS5" picture="X(5)" position="28" storage-length="5"/>
<item display-length="20" level="03" name="FS6" picture="X(20)" position="33" storage-length="20"/>
<item display-length="1" level="03" name="FS7" numeric="true" picture="9" position="53" storage-length="1"/>
<item display-length="11" level="03" name="FS8" numeric="true" picture="S9(9)V99" position="54" scale="2" signed="true" storage-length="6" usage="computational-3"/>
<item display-length="11" level="03" name="FS9" numeric="true" picture="S9(9)V99" position="60" scale="2" signed="true" storage-length="6" usage="computational-3"/>
<item display-length="1" level="03" name="FS10" numeric="true" picture="9" position="66" storage-length="1"/>
<item display-length="11" level="03" name="FS11" numeric="true" picture="S9(9)V99" position="67" scale="2" signed="true" storage-length="6" usage="computational-3"/>
<item display-length="11" level="03" name="FS12" numeric="true" picture="S9(9)V99" position="73" scale="2" signed="true" storage-length="6" usage="computational-3"/>
<item display-length="11" level="03" name="FS13" numeric="true" picture="S9(9)V99" position="79" scale="2" signed="true" storage-length="6" usage="computational-3"/>
<item display-length="13" level="03" name="FS14-15" occurs="10" position="85" storage-length="13">
<item display-length="1" level="05" name="FS14" numeric="true" picture="9" position="85" storage-length="1"/>
<item display-length="11" level="05" name="FS15" numeric="true" picture="S9(9)V99" position="86" scale="2" signed="true" storage-length="6" usage="computational-3"/>
<item display-length="11" level="05" name="FS16" numeric="true" picture="S9(9)V99" position="92" scale="2" signed="true" storage-length="6" usage="computational-3"/>
</item>
<item display-length="11" level="03" name="FS17" numeric="true" occurs="10" picture="S9(9)V99" position="215" scale="2" signed="true" storage-length="6" usage="computational-3"/>
<item display-length="6" level="03" name="FS18" numeric="true" picture="9(6)" position="275" storage-length="6"/>
<item display-length="1" level="03" name="FS19" numeric="true" picture="9" position="281" storage-length="1"/>
List of all XML attributes
I will be lazy here and just copy/paste my VB.NET code, there's a comment that explains clearly each attribute
For Each Attribute As Xml.XmlAttribute In itemNode.Attributes
Select Case Attribute.Name
Case "name" ' FIeld name
Case "level" ' PICTURE level
Case "numeric" ' True if numeric data type
Case "picture" ' COmplete PICTURE string
Case "storage-length" ' Variable storage lenght
Case "usage" ' If COMP field, give the original COMP type ("computational-x")
Case "signed" ' true if PIC S...
Case "scale" ' Give number of digits afeter decimal point
Case "redefined" ' true if the field is redifined afterwards
Case "redefines" ' If REDEFINES : give the name of the redefined field
Case "occurs" ' give the number of occurences if it's an ARRAY
Case "position" ' Give the line position in the original copybook
Case "display-length" ' Give the display size
Case "filename" ' Give the FD name
With the help of this XML structure I have achieved all the goals and beyond.
The generated COBOL programs that convert the indexed files (readable only with RM cobol runtime) into flat files deals with every field, ARRAYS and REDEFINES included.
- For REDEFINES: I create a field for both the "primary" PICTURE, and all its REDEFINES alterations, and their type matches their COBOL PICTURE
- For ARRAYs, I create a field for each element, and also a huge field containing the whole array "line"
- For COMPUTATIONAL fields, I just move the original COMP into the exact same DISPLAY PICTURE
Not all the fields have a purpose when they are in the database but at least everything is available all the time
With the invoice file above, the SEQUENTIAL text file copybook becomes this :
Auto generated COBOL
FILE SECTION.
* -----------------------------------------------------------
* INPUT FILE
COPY "FD8.COP" .
* -----------------------------------------------------------
* OUTPUT FILE
FD FACTURE-DWH.
01 REC-FACTURE-DWH.
03 FS1-DWH PIC X.
03 FS2-DWH PIC X(9).
03 FS2A-DWH PIC 9.
03 RFS2B-DWH PIC X(8).
03 FS2B-DWH PIC 9(8).
03 FS3-DWH PIC X(11).
03 FS3A-DWH PIC 9.
03 FS3B-DWH PIC X(10).
03 FS4-DWH PIC X(6).
03 FS4A-DWH PIC 99.
03 FS4B-DWH PIC 99.
03 FS4C-DWH PIC 99.
03 FS5-DWH PIC X(5).
03 FS6-DWH PIC X(20).
03 FS7-DWH PIC 9.
03 FS8-DWH PIC -9(9)V99.
03 FS9-DWH PIC -9(9)V99.
03 FS10-DWH PIC 9.
03 FS11-DWH PIC -9(9)V99.
03 FS12-DWH PIC -9(9)V99.
03 FS13-DWH PIC -9(9)V99.
03 FS14-15-1-DWH PIC X(13).
03 FS14-15-2-DWH PIC X(13).
03 FS14-15-3-DWH PIC X(13).
03 FS14-15-4-DWH PIC X(13).
03 FS14-15-5-DWH PIC X(13).
03 FS14-15-6-DWH PIC X(13).
03 FS14-15-7-DWH PIC X(13).
03 FS14-15-8-DWH PIC X(13).
03 FS14-15-9-DWH PIC X(13).
03 FS14-15-10-DWH PIC X(13).
03 FS14-1-DWH PIC 9.
03 FS14-2-DWH PIC 9.
03 FS14-3-DWH PIC 9.
03 FS14-4-DWH PIC 9.
03 FS14-5-DWH PIC 9.
03 FS14-6-DWH PIC 9.
03 FS14-7-DWH PIC 9.
03 FS14-8-DWH PIC 9.
03 FS14-9-DWH PIC 9.
03 FS14-10-DWH PIC 9.
03 FS15-1-DWH PIC -9(9)V99.
03 FS15-2-DWH PIC -9(9)V99.
03 FS15-3-DWH PIC -9(9)V99.
03 FS15-4-DWH PIC -9(9)V99.
03 FS15-5-DWH PIC -9(9)V99.
03 FS15-6-DWH PIC -9(9)V99.
03 FS15-7-DWH PIC -9(9)V99.
03 FS15-8-DWH PIC -9(9)V99.
03 FS15-9-DWH PIC -9(9)V99.
03 FS15-10-DWH PIC -9(9)V99.
03 FS16-1-DWH PIC -9(9)V99.
03 FS16-2-DWH PIC -9(9)V99.
03 FS16-3-DWH PIC -9(9)V99.
03 FS16-4-DWH PIC -9(9)V99.
03 FS16-5-DWH PIC -9(9)V99.
03 FS16-6-DWH PIC -9(9)V99.
03 FS16-7-DWH PIC -9(9)V99.
03 FS16-8-DWH PIC -9(9)V99.
03 FS16-9-DWH PIC -9(9)V99.
03 FS16-10-DWH PIC -9(9)V99.
03 FS17-1-DWH PIC -9(9)V99.
03 FS17-2-DWH PIC -9(9)V99.
03 FS17-3-DWH PIC -9(9)V99.
03 FS17-4-DWH PIC -9(9)V99.
03 FS17-5-DWH PIC -9(9)V99.
03 FS17-6-DWH PIC -9(9)V99.
03 FS17-7-DWH PIC -9(9)V99.
03 FS17-8-DWH PIC -9(9)V99.
03 FS17-9-DWH PIC -9(9)V99.
03 FS17-10-DWH PIC -9(9)V99.
03 FS18-DWH PIC 9(6).
03 FS19-DWH PIC 9.
MOVE instructions
* ============================================================
PROG.
MOVE FS1 TO FS1-DWH
MOVE FS2 TO FS2-DWH
MOVE FS2A TO FS2A-DWH
MOVE RFS2B TO RFS2B-DWH
MOVE FS2B TO FS2B-DWH
MOVE FS3 TO FS3-DWH
MOVE FS3A TO FS3A-DWH
MOVE FS3B TO FS3B-DWH
MOVE FS4 TO FS4-DWH
MOVE FS4A TO FS4A-DWH
MOVE FS4B TO FS4B-DWH
MOVE FS4C TO FS4C-DWH
MOVE FS5 TO FS5-DWH
MOVE FS6 TO FS6-DWH
MOVE FS7 TO FS7-DWH
MOVE FS8 TO FS8-DWH
MOVE FS9 TO FS9-DWH
MOVE FS10 TO FS10-DWH
MOVE FS11 TO FS11-DWH
MOVE FS12 TO FS12-DWH
MOVE FS13 TO FS13-DWH
MOVE FS14-15(1) TO FS14-15-1-DWH
MOVE FS14-15(2) TO FS14-15-2-DWH
MOVE FS14-15(3) TO FS14-15-3-DWH
MOVE FS14-15(4) TO FS14-15-4-DWH
MOVE FS14-15(5) TO FS14-15-5-DWH
MOVE FS14-15(6) TO FS14-15-6-DWH
MOVE FS14-15(7) TO FS14-15-7-DWH
MOVE FS14-15(8) TO FS14-15-8-DWH
MOVE FS14-15(9) TO FS14-15-9-DWH
MOVE FS14-15(10) TO FS14-15-10-DWH
MOVE FS14(1) TO FS14-1-DWH
MOVE FS14(2) TO FS14-2-DWH
MOVE FS14(3) TO FS14-3-DWH
MOVE FS14(4) TO FS14-4-DWH
MOVE FS14(5) TO FS14-5-DWH
MOVE FS14(6) TO FS14-6-DWH
MOVE FS14(7) TO FS14-7-DWH
MOVE FS14(8) TO FS14-8-DWH
MOVE FS14(9) TO FS14-9-DWH
MOVE FS14(10) TO FS14-10-DWH
MOVE FS15(1) TO FS15-1-DWH
MOVE FS15(2) TO FS15-2-DWH
MOVE FS15(3) TO FS15-3-DWH
MOVE FS15(4) TO FS15-4-DWH
MOVE FS15(5) TO FS15-5-DWH
MOVE FS15(6) TO FS15-6-DWH
MOVE FS15(7) TO FS15-7-DWH
MOVE FS15(8) TO FS15-8-DWH
MOVE FS15(9) TO FS15-9-DWH
MOVE FS15(10) TO FS15-10-DWH
MOVE FS16(1) TO FS16-1-DWH
MOVE FS16(2) TO FS16-2-DWH
MOVE FS16(3) TO FS16-3-DWH
MOVE FS16(4) TO FS16-4-DWH
MOVE FS16(5) TO FS16-5-DWH
MOVE FS16(6) TO FS16-6-DWH
MOVE FS16(7) TO FS16-7-DWH
MOVE FS16(8) TO FS16-8-DWH
MOVE FS16(9) TO FS16-9-DWH
MOVE FS16(10) TO FS16-10-DWH
MOVE FS17(1) TO FS17-1-DWH
MOVE FS17(2) TO FS17-2-DWH
MOVE FS17(3) TO FS17-3-DWH
MOVE FS17(4) TO FS17-4-DWH
MOVE FS17(5) TO FS17-5-DWH
MOVE FS17(6) TO FS17-6-DWH
MOVE FS17(7) TO FS17-7-DWH
MOVE FS17(8) TO FS17-8-DWH
MOVE FS17(9) TO FS17-9-DWH
MOVE FS17(10) TO FS17-10-DWH
MOVE FS18 TO FS18-DWH
MOVE FS19 TO FS19-DWH
Once the flat files are written, they can be processed to MySQL by the VBA code, also generated by the VB.NET application.
Auto generated VBA
Type def declaration to deal with the text file importation
Note the original PICTURE in comments next to each field
'-------------------------------------------------------------
' REC_FC8 Record
'-------------------------------------------------------------
Private Type REC_FC8
FS1 as string*1 ' 03 FS1 PIC X
FS2 as string*9 ' 03 FS2 PIC
FS2A as string*1 ' 05 FS2A PIC 9
RFS2B as string*8 ' 05 RFS2B PIC X(8)
FS2B as string*8 ' 05 FS2B PIC 9(8)
FS3 as string*11 ' 03 FS3 PIC
FS3A as string*1 ' 05 FS3A PIC 9
FS3B as string*10 ' 05 FS3B PIC X(10)
FS4 as string*6 ' 03 FS4 PIC
FS4A as string*2 ' 05 FS4A PIC 99
FS4B as string*2 ' 05 FS4B PIC 99
FS4C as string*2 ' 05 FS4C PIC 99
FS5 as string*5 ' 03 FS5 PIC X(5)
FS6 as string*20 ' 03 FS6 PIC X(20)
FS7 as string*1 ' 03 FS7 PIC 9
FS8 as string*12 ' 03 FS8 PIC S9(9)V99 computational-3
FS9 as string*12 ' 03 FS9 PIC S9(9)V99 computational-3
FS10 as string*1 ' 03 FS10 PIC 9
FS11 as string*12 ' 03 FS11 PIC S9(9)V99 computational-3
FS12 as string*12 ' 03 FS12 PIC S9(9)V99 computational-3
FS13 as string*12 ' 03 FS13 PIC S9(9)V99 computational-3
FS14_15_1 as string*13 ' 03 FS14-15 PIC
FS14_15_2 as string*13 ' 03 FS14-15 PIC
FS14_15_3 as string*13 ' 03 FS14-15 PIC
FS14_15_4 as string*13 ' 03 FS14-15 PIC
FS14_15_5 as string*13 ' 03 FS14-15 PIC
FS14_15_6 as string*13 ' 03 FS14-15 PIC
FS14_15_7 as string*13 ' 03 FS14-15 PIC
FS14_15_8 as string*13 ' 03 FS14-15 PIC
FS14_15_9 as string*13 ' 03 FS14-15 PIC
FS14_15_10 as string*13 ' 03 FS14-15 PIC
FS14_1 as string*1 ' 05 FS14 PIC 9
FS14_2 as string*1 ' 05 FS14 PIC 9
FS14_3 as string*1 ' 05 FS14 PIC 9
FS14_4 as string*1 ' 05 FS14 PIC 9
FS14_5 as string*1 ' 05 FS14 PIC 9
FS14_6 as string*1 ' 05 FS14 PIC 9
FS14_7 as string*1 ' 05 FS14 PIC 9
FS14_8 as string*1 ' 05 FS14 PIC 9
FS14_9 as string*1 ' 05 FS14 PIC 9
FS14_10 as string*1 ' 05 FS14 PIC 9
FS15_1 as string*12 ' 05 FS15 PIC S9(9)V99 computational-3
FS15_2 as string*12 ' 05 FS15 PIC S9(9)V99 computational-3
FS15_3 as string*12 ' 05 FS15 PIC S9(9)V99 computational-3
FS15_4 as string*12 ' 05 FS15 PIC S9(9)V99 computational-3
FS15_5 as string*12 ' 05 FS15 PIC S9(9)V99 computational-3
FS15_6 as string*12 ' 05 FS15 PIC S9(9)V99 computational-3
FS15_7 as string*12 ' 05 FS15 PIC S9(9)V99 computational-3
FS15_8 as string*12 ' 05 FS15 PIC S9(9)V99 computational-3
...
FS17_8 as string*12 ' 03 FS17 PIC S9(9)V99 computational-3
FS17_9 as string*12 ' 03 FS17 PIC S9(9)V99 computational-3
FS17_10 as string*12 ' 03 FS17 PIC S9(9)V99 computational-3
FS18 as string*6 ' 03 FS18 PIC 9(6)
FS19 as string*1 ' 03 FS19 PIC 9
FC8LF As String * 2 ' LF 11
End Type
Create table procedure
Each field has become an object (from a custom class I created), and the method SQLtypeFull
used below returns the MySQL datatype of each field
'========================================================================
Private Function Create_Table_MySQL() As Boolean
On Error GoTo Erreur
Dim Rs As Recordset
Dim SQL As String
SQL = "CREATE TABLE IF NOT EXISTS `TBL_DAT_FACTURE` ( `ID` INT(11) NOT NULL auto_increment, `RECID` INT(11)"
SQL = SQL & ", `FS1` " & FS1.SQLtypeFull
SQL = SQL & ", `FS2` " & FS2.SQLtypeFull
SQL = SQL & ", `FS2A` " & FS2A.SQLtypeFull
SQL = SQL & ", `RFS2B` " & RFS2B.SQLtypeFull
SQL = SQL & ", `FS2B` " & FS2B.SQLtypeFull
SQL = SQL & ", `FS3` " & FS3.SQLtypeFull
SQL = SQL & ", `FS3A` " & FS3A.SQLtypeFull
SQL = SQL & ", `FS3B` " & FS3B.SQLtypeFull
SQL = SQL & ", `FS4` " & FS4.SQLtypeFull
SQL = SQL & ", `FS4A` " & FS4A.SQLtypeFull
SQL = SQL & ", `FS4B` " & FS4B.SQLtypeFull
SQL = SQL & ", `FS4C` " & FS4C.SQLtypeFull
SQL = SQL & ", `FS5` " & FS5.SQLtypeFull
SQL = SQL & ", `FS6` " & FS6.SQLtypeFull
SQL = SQL & ", `FS7` " & FS7.SQLtypeFull
SQL = SQL & ", `FS8` " & FS8.SQLtypeFull
SQL = SQL & ", `FS9` " & FS9.SQLtypeFull
SQL = SQL & ", `FS10` " & FS10.SQLtypeFull
SQL = SQL & ", `FS11` " & FS11.SQLtypeFull
SQL = SQL & ", `FS12` " & FS12.SQLtypeFull
SQL = SQL & ", `FS13` " & FS13.SQLtypeFull
SQL = SQL & ", `FS14_15_1` " & FS14_15_1.SQLtypeFull
SQL = SQL & ", `FS14_15_2` " & FS14_15_2.SQLtypeFull
SQL = SQL & ", `FS14_15_3` " & FS14_15_3.SQLtypeFull
SQL = SQL & ", `FS14_15_4` " & FS14_15_4.SQLtypeFull
SQL = SQL & ", `FS14_15_5` " & FS14_15_5.SQLtypeFull
SQL = SQL & ", `FS14_15_6` " & FS14_15_6.SQLtypeFull
SQL = SQL & ", `FS14_15_7` " & FS14_15_7.SQLtypeFull
SQL = SQL & ", `FS14_15_8` " & FS14_15_8.SQLtypeFull
SQL = SQL & ", `FS14_15_9` " & FS14_15_9.SQLtypeFull
SQL = SQL & ", `FS14_15_10` " & FS14_15_10.SQLtypeFull
SQL = SQL & ", `FS14_1` " & FS14_1.SQLtypeFull
SQL = SQL & ", `FS14_2` " & FS14_2.SQLtypeFull
SQL = SQL & ", `FS14_3` " & FS14_3.SQLtypeFull
SQL = SQL & ", `FS14_4` " & FS14_4.SQLtypeFull
SQL = SQL & ", `FS14_5` " & FS14_5.SQLtypeFull
SQL = SQL & ", `FS14_6` " & FS14_6.SQLtypeFull
SQL = SQL & ", `FS14_7` " & FS14_7.SQLtypeFull
SQL = SQL & ", `FS14_8` " & FS14_8.SQLtypeFull
SQL = SQL & ", `FS14_9` " & FS14_9.SQLtypeFull
SQL = SQL & ", `FS14_10` " & FS14_10.SQLtypeFull
SQL = SQL & ", `FS15_1` " & FS15_1.SQLtypeFull
SQL = SQL & ", `FS15_2` " & FS15_2.SQLtypeFull
SQL = SQL & ", `FS15_3` " & FS15_3.SQLtypeFull
SQL = SQL & ", `FS15_4` " & FS15_4.SQLtypeFull
SQL = SQL & ", `FS15_5` " & FS15_5.SQLtypeFull
...
SQL = SQL & ", `FS17_9` " & FS17_9.SQLtypeFull
SQL = SQL & ", `FS17_10` " & FS17_10.SQLtypeFull
SQL = SQL & ", `FS18` " & FS18.SQLtypeFull
SQL = SQL & ", `FS19` " & FS19.SQLtypeFull
SQL = SQL & ", PRIMARY KEY (`ID`)"
SQL = SQL & ") ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE utf8_bin;"
MySQLcon.Execute (SQL)
Create_Table_MySQL = True
Exit_Sub:
Exit Function
Erreur:
Create_Table_MySQL = False
Resume Exit_Sub
End Function
Final SQL statement
CREATE TABLE IF NOT EXISTS `FACTURE`
( `ID` INT(11) NOT NULL auto_increment, `RECID` INT(11), `FS1` CHAR(1), `FS2` CHAR(9), `FS2A` TINYINT(1) UNSIGNED, `RFS2B` CHAR(8), `FS2B` INT(8) UNSIGNED, `FS3` CHAR(11), `FS3A` TINYINT(1) UNSIGNED, `FS3B` CHAR(10), `FS4` CHAR(6), `FS4A` TINYINT(2) UNSIGNED, `FS4B` TINYINT(2) UNSIGNED, `FS4C` TINYINT(2) UNSIGNED, `FS5` CHAR(5), `FS6` CHAR(20), `FS7` TINYINT(1) UNSIGNED, `FS8` DECIMAL(11,2), `FS9` DECIMAL(11,2), `FS10` TINYINT(1) UNSIGNED, `FS11` DECIMAL(11,2), `FS12` DECIMAL(11,2), `FS13` DECIMAL(11,2), `FS14_15_1` CHAR(13), `FS14_15_2` CHAR(13), `FS14_15_3` CHAR(13), `FS14_15_4` CHAR(13), `FS14_15_5` CHAR(13), `FS14_15_6` CHAR(13), `FS14_15_7` CHAR(13), `FS14_15_8` CHAR(13), `FS14_15_9` CHAR(13), `FS14_15_10` CHAR(13), `FS14_1` TINYINT(1) UNSIGNED, `FS14_2` TINYINT(1) UNSIGNED, `FS14_3` TINYINT(1) UNSIGNED, `FS14_4` TINYINT(1) UNSIGNED, `FS14_5` TINYINT(1) UNSIGNED, `FS14_6` TINYINT(1) UNSIGNED, `FS14_7` TINYINT(1) UNSIGNED, `FS14_8` TINYINT(1) UNSIGNED, `FS14_9` TINYIN
T(1) UNSIGNED, `FS14_10` TINYINT(1) UNSIGNED, `FS15_1` DECIMAL(11,2), `FS15_2` DECIMAL(11,2), `FS15_3` DECIMAL(11,2), `FS15_4` DECIMAL(11,2), `FS15_5` DECIMAL(11,2), `FS15_6` DECIMAL(11,2), `FS15_7` DECIMAL(11,2), `FS15_8`
...
DECIMAL(11,2), `FS17_10` DECIMAL(11,2), `FS18` DATE, `FS19` TINYINT(1) UNSIGNED,
PRIMARY KEY (`ID`)) ENGINE=MyISAM
DEFAULT CHARSET=utf8 COLLATE utf8_bin;
I have much more in the generated VBA modules, and the level of detail and accuracy of the generated xml helped a lot for all of them:
- I have created a Class that manage all aspects of fields, and notably take care of the conversions VBA/MySQL following the original PICTURE and VBA types (date, long, doubles, currency, etc...) and there's a hook in case you want to force another type.
- it deals entirely with the metadata creation (also in MySQL)
- it deals with the errors when importing data, logging everything at file and fields level
I have probably shown enough to give you some ideas so I will stop there.
The most important: When we transpose the complex computations made within the COBOL ERP, into SQL statements run against the new datawarehouse, the numbers returned by the DB are exactly identical to those into the ERP. On several millions of records, There's not a single digit loss on computations. This is the best way to prove that it works :-)
If you wonder why I used Access/VBA and not .NET for the importation: it was a non-negotiable requirement.
On a last note : I am not affiliated in any way with CB2XML and this is not an advertisement for it. It's just a great and helpful piece of software, which deserves some attention.
REDEFINES
is going to make your task more difficult. It is not the the "compiler" knows which particular field to use, intuitively, it is that the code in the existing COBOL system knows which field to use. There will be some indication, some value in another field, which will indicate which of the fields to use at which particular time.
Taking your second example, as the first is devoid of context:
05 ADDRESS-PO-BOX-FLAG PIC X.
That field will be interrogated before the data is used. Either directly (you can find lots of horrible code out there) or with an 88-level Condition Name:
88 ADDRESS-IS-A-PO-BOX VALUE "Y". (an example only)
IF ADDRESS-IS-A-PO-BOX
some code relating to PO Boxes
ELSE
some code relating to other types of addresses
END-IF
Your first example will be dealt with in a similar manner.
It is an "old style" use of REDEFINES, to use the same storage locations on a record for mutually-exclusive situations. Saves storage, which was expensive. The system you are working with is either "old", or the design of it was infected by false "experience".
You have two broad choices: to replicate all the conditional selection of data (so that you have two sets of business-logic to keep in step); to get the file changed so that each field occupies its own storage.
The presence of COMP-3 (or PACKED-DECIMAL) or COMP/COMP-4/COMP-5/BINARY data-types also complicate things for you. You'd need to then do your EBCDIC-to-ASCII at the field level, for actual EBCDIC data, and do whatever would be necessary to convert or simply acquire the "computational" data.
Also be aware that any signed-DISPLAY-numeric fields (numeric fields with a PICture beginning with an S but without an explicit "computational" usage) will apparently contain "character" data in the final byte, as the sign is held as an "overpunch" of the final byte.
Note that the binary data-types will be Big Endian.
It will be massively simpler for you if you receive files which have no REDEFINES, no "computational" fields, and no embedded signs (or implicit decimal-places). All your data would be character, and you can EBCDIC-to-ASCII at the record-level (or at the file level, with your file-transfer mechanism).
If you look at questions here tagged COMP-3, you'll find further discussion of this, and if you decide that the ridiculous route (your program understanding native Mainframe COBOL data-items rather than plain "text") is the only possible way to go, then there are a number of things in the discussions you may find useful and be able to use or apply.
If your company is "regulated" externally, then ensure your Compliance, Audit and Accounting departments are happy with your design before you code one line. Whoops. Late for that. Let's hope it is manufacturing.