#420 Add experimental support for 'FB' format #421

yruslan · 2021-09-17T06:54:30Z

Resolves #420
Resolves #422

Added experimental support for fixed block record formats (FB). When the record format is 'FB', you need to specify either the block length or the number of records per block. As with 'F', if record_length is not specified, it is determined from the copybook.

.option("record_format", "FB")
.option("record_length", "250")
.option("block_length", "500")

or

.option("record_format", "FB")
.option("record_length", "250")
.option("records_per_block", "2")

More on fixed-length record formats: https://www.ibm.com/docs/en/zos/2.3.0?topic=sets-fixed-length-record-formats
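To illustrate how the two alternatives relate, here is a minimal Python sketch, not Cobrix code. It assumes that `block_length` equals `record_length * records_per_block`, that blocks carry no headers, and that the last block may be short. The function name `fb_records` is hypothetical.

```python
from typing import Iterator, Optional


def fb_records(data: bytes,
               record_length: int,
               block_length: Optional[int] = None,
               records_per_block: Optional[int] = None) -> Iterator[bytes]:
    """Yield fixed-length records from a headerless fixed-block byte string."""
    if record_length <= 0:
        raise ValueError("record_length must be positive")
    if (block_length is None) == (records_per_block is None):
        raise ValueError("specify exactly one of block_length or records_per_block")
    if records_per_block is not None:
        # Assumption: a block is exactly records_per_block records long.
        block_length = record_length * records_per_block
    if block_length <= 0 or block_length % record_length != 0:
        raise ValueError("block_length must be a positive multiple of record_length")

    for block_start in range(0, len(data), block_length):
        # The last block may hold fewer records than the others.
        block = data[block_start:block_start + block_length]
        for offset in range(0, len(block), record_length):
            record = block[offset:offset + record_length]
            if len(record) < record_length:
                raise ValueError("trailing partial record")
            yield record
```

Under these assumptions, `fb_records(data, 250, block_length=500)` and `fb_records(data, 250, records_per_block=2)` yield the same records.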

sree018 · 2021-09-17T13:49:59Z

Hi @yruslan

I have four types of files in EBCDIC format:
1. Variable block with variable-length records (VB -> able to parse the file)
2. Variable block with fixed-length records (not able to parse)
   See "Custom record extractor implementation for VBVR data files, should it not use rawRecordExtractor?" #412.
   I am using the logic from that issue to convert the file to fixed block and read it as F.
3. Fixed block with variable-length records (V -> able to parse the file)
4. Fixed block with fixed-length records (F -> able to parse the file)

File_Type   BDW   RDW
VBVR        yes   yes
VBFR        yes   no
FBVR        no    yes
FBFR        no    no

It would help us if you could provide VBVR, VBFR, FBVR and FBFR as record_format values.
Unfortunately, due to security issues, I can't provide dummy files for variable block files with fixed-length records.

yruslan · 2021-09-17T15:52:35Z

@sree018, I see that you can't provide dummy data. But could you please provide a spec for VBFR?

The way I see it:
VBVR: BDW RDW DATA RDW DATA RDW DATA BDW RDW DATA RDW DATA RDW DATA (aka VB)
VBFR: ? (has BDW, but not RDW??? I couldn't find a spec for this kind of file in IBM's docs)
FBVR: RDW DATA RDW DATA RDW DATA RDW DATA RDW DATA RDW DATA (aka V, no BDW)
FBFR: DATA DATA DATA DATA DATA DATA (aka F and FB)

Currently Cobrix supports this directly:
VBVR : option("record_format", "VB")
VBFR : ?
FBVR : option("record_format", "V")
FBFR : option("record_format", "F")

Alternatively, if you can't provide dummy data or a spec, you can implement VBFR as a custom record extractor and submit a PR. We will integrate it with Cobrix and add direct support.

yruslan · 2021-09-17T15:59:05Z

Does this make any sense?
VBFR: BDW DATA DATA DATA BDW DATA DATA DATA

I can add support for BDWs for .option("record_format", "FB") and this might solve the problem.

sree018 · 2021-09-17T19:57:42Z

@yruslan thanks for the info.
My file follows this pattern:
VBFR: BDW DATA DATA DATA BDW DATA DATA DATA
DATA is fixed length.
But in my copybook we flattened the RDW.
After adjusting the copybook, I am able to parse VBFR -> BDW + RDW(2,2)

yruslan · 2021-09-18T10:17:48Z

I'm glad you found a solution. I'll add support for
VBFR: BDW DATA DATA DATA BDW DATA DATA DATA
as well, for completeness.

yruslan · 2021-09-21T07:07:54Z

@sree018, just added support for BDW without RDW. Use

.option("record_format", "FB")
.option("record_length", "250")

(where 250 is the record length)
or

.option("record_format", "FB")
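
For readers unfamiliar with the layout `BDW DATA DATA DATA BDW DATA DATA DATA`, the following Python sketch shows one way such a stream could be split into records. It is not the Cobrix implementation. It assumes a 4-byte BDW whose first two bytes hold a big-endian block length that includes the BDW itself, and it ignores the last two bytes. Real files may use a different byte order or length convention. The function name `bdw_fixed_records` is hypothetical.

```python
import struct
from typing import Iterator

BDW_SIZE = 4  # assumption: 4-byte block descriptor word


def bdw_fixed_records(data: bytes, record_length: int) -> Iterator[bytes]:
    """Yield fixed-length records from blocks that have a BDW but no RDWs."""
    if record_length <= 0:
        raise ValueError("record_length must be positive")
    pos = 0
    while pos < len(data):
        if pos + BDW_SIZE > len(data):
            raise ValueError("truncated BDW at offset %d" % pos)
        # Assumption: bytes 0-1 are the big-endian block length, BDW included.
        (block_length,) = struct.unpack_from(">H", data, pos)
        payload_length = block_length - BDW_SIZE
        if payload_length <= 0 or payload_length % record_length != 0:
            raise ValueError("unexpected block length %d at offset %d" % (block_length, pos))
        payload_start = pos + BDW_SIZE
        payload_end = payload_start + payload_length
        if payload_end > len(data):
            raise ValueError("truncated block at offset %d" % pos)
        for offset in range(payload_start, payload_end, record_length):
            yield data[offset:offset + record_length]
        pos = payload_end
```

Because each block announces its own length, blocks in this sketch may contain different numbers of records.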

sree018 · 2021-09-27T18:49:55Z

Hi @yruslan

In the case of a multi-segment hierarchical copybook, how do I parse the file using a custom record extractor?

sree018 · 2022-03-06T16:15:03Z

Hi @yruslan

I have a multi-segment binary file and am able to parse it when a single segment column is present. How do I parse a multi-segment file with two segment columns present?

Example copybook:

       01 MASTER-DATA.
          05 KEY-TYPE     PIC X(01).
          05 KEY-IND      PIC X(01).
      * KEY=A AND IND=0
          05 KEY-A0-REC.
             10 DATA-FIELDS  PIC X(998).
      * KEY=B AND IND=0
          05 KEY-B0-REC REDEFINES KEY-A0-REC.
             10 DATA-FIELDS  PIC X(998).
      * KEY='A'-'F' AND IND=1
          05 KEY-01-REC REDEFINES KEY-B0-REC.
             10 DATA-FIELDS  PIC X(998).
      * KEY='A'-'F' AND IND=2
          05 KEY-02-REC REDEFINES KEY-01-REC.
             10 DATA-FIELDS  PIC X(998).
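
The selection rule stated in the copybook comments can be sketched in plain Python as below. This only illustrates the two-field dispatch and does not use any Cobrix option or API. It assumes the data is encoded in code page 037, that KEY-IND holds the characters '0', '1' or '2', and that the range 'A'-'F' is inclusive. The function name `select_segment` is hypothetical.

```python
from typing import Optional


def select_segment(record: bytes, encoding: str = "cp037") -> Optional[str]:
    """Return the name of the redefine that applies to a raw record, or None."""
    # KEY-TYPE is byte 0 and KEY-IND is byte 1, both PIC X(01).
    key_type = record[0:1].decode(encoding)
    key_ind = record[1:2].decode(encoding)

    if key_ind == "0":
        if key_type == "A":
            return "KEY-A0-REC"
        if key_type == "B":
            return "KEY-B0-REC"
        return None
    if key_ind in ("1", "2") and "A" <= key_type <= "F":
        return "KEY-01-REC" if key_ind == "1" else "KEY-02-REC"
    return None
```

Combinations that the comments do not cover, such as KEY='C' with IND=0, return `None` in this sketch.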

yruslan · 2022-04-05T12:20:12Z

Hi @sree018,

Please create a new issue for this question.

yruslan added 8 commits September 14, 2021 14:35

Fix compilation of the examples collection app

e3dd5fd

Extract BDW options into a separate class.

20586db

#420 Add implementation of raw record extractor for FB format.

63f06b1

#420 Add an integration test suite for FB record format.

4cbbbbb

#420 Suppress logger in tests.

229250e

#420 Make Scala 2.11 compiler happy

76ed1fb

Build: Select Spark version based on Scala version if not provided explicitly.

82f46b9

#420 Add FB usage description to README.

8c66531

yruslan mentioned this pull request Sep 17, 2021

Custom record extractor implementation for VBVR data files, should it not use rawRecordExtractor? #412

Closed

yruslan added 3 commits September 17, 2021 13:38

#422 Fix decoding of the 'broken pipe' character from EBCDIC.

d303eba

Update minor version of Scala 2.12

028806e

Update version of sbt

7a652d7

Fix formatting

0359706

#420 Add support for FB format with BDW only.

6ba8304

yruslan merged commit dd52448 into master Sep 21, 2021

yruslan deleted the feature/420-fb-record-format branch September 21, 2021 10:18
