Does the Cobrix handle the Easytrieve layout.? #516

AnveshAeturi · 2022-09-09T14:26:21Z

Background [Optional]

I have an Easytrieve layout that contains packed unsigned fields (data type U in Easytrieve), binary unsigned fields (data type B in Easytrieve), and alphanumeric fields (data type A in Easytrieve, storing Hexbit). The data file that we are trying to convert is EBCDIC data.

Question

Is there a way to convert this data through Cobrix by providing the above-mentioned Easytrieve layout? @yruslan

yruslan · 2022-09-12T13:30:19Z

Hi, could you attach an example copybook and a link to the documentation for the data type, please?

AnveshAeturi · 2022-09-14T07:59:58Z

The COBOL copybook says X(2); however, the data itself comes from Easytrieve with a data type of U (Packed Unsigned).
For example: VARIABLE PIC X(2). The data stored is actually an unsigned packed field (the definition of U in Easytrieve).

Data Type Link: https://www.mvsforums.com/manuals/EZT_PL_APP_63_MASTER.pdf
Page#35 - Library 2-11 is the footer on the page

AnveshAeturi · 2022-09-19T08:02:09Z

Easyterieve_Layout_sample.xlsx

Hi @yruslan, this is the Excel file that we created from the Easytrieve layout. Only sample fields are included here.

yruslan · 2022-09-22T14:16:25Z

I see. The data types look parsable at first glance. The only thing is that you need a proper copybook that matches the data in order to parse records like that. And for that you would need a mapping between Easytrieve data types and COBOL data types.
For instance, an Easytrieve type U with length 4 can have a PIC 9(4) (or PIC 9(9) if the encoding is binary).

Do I understand correctly that the fields specified in the Excel file are not all the fields of the record? Field 'CRSCON' with length 1 at offset 10 is followed by CRADTR at offset 20. This means there are other fields between CRSCON and CRADTR that fill the remaining 9 bytes.

mike-childs · 2022-12-07T22:33:17Z

Hello. I am adding a comment because I also need to request this same support for Unsigned Packed fields in the mainframe records.
Here is what is meant by "Unsigned Packed" :
An Easytrieve U (Unsigned Packed) field is the same as a normal Packed field, but without the sign-nibble on the end.

For example, let's say we have an account date value of '20220425'.
As a Packed number, that field would be defined in COBOL like this:
ACCT-DATE PIC 9(8) COMP-3.
. . .and in memory, that field would contain this:
X'020220425F'

As a U (Unsigned Packed) number, that field would be defined in COBOL like this:
ACCT-DATE PIC X(4).
. . .and in memory, that field would contain this:
X'20220425'

Unsigned Packed (U) fields must be defined in COBOL as PIC X fields because COBOL does not support the Unsigned Packed format.
It is invalid data to COBOL.
Therefore, when COBOL programmers encounter Unsigned Packed fields in their data, they have to write special code to convert it to a normal Packed value by inserting the sign nibble at the end, then processing it as a Packed field.
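
To illustrate the workaround described above, here is a minimal Python sketch of the conversion (not the COBOL code such programs actually use). The function name `unsigned_packed_to_comp3` is hypothetical, and the sketch assumes the positive sign nibble F, as in the X'020220425F' example.

```python
def unsigned_packed_to_comp3(raw: bytes) -> bytes:
    """Turn an unsigned packed field into a normal packed (COMP-3) field."""
    digits = raw.hex()  # each nibble of an unsigned packed field is one decimal digit
    if not digits.isdigit():
        raise ValueError(f"not an unsigned packed value: X'{digits.upper()}'")
    # Appending the sign nibble gives an odd number of nibbles,
    # so a leading zero nibble is added to keep whole bytes.
    return bytes.fromhex("0" + digits + "f")


converted = unsigned_packed_to_comp3(bytes.fromhex("20220425"))
print(converted.hex().upper())  # 020220425F
```

The 4-byte input grows to 5 bytes, which matches the PIC 9(8) COMP-3 layout shown earlier.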

The Unsigned Packed field cannot be declared as a COBOL BINARY (COMP) field because it does not contain a binary value. It contains a Packed value without the sign nibble.
If you took our example data above and defined it as Binary in COBOL . . . 'PIC 9(8) COMP', the X'20220425' value is now treated as a Binary value, which is 539,100,197.
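
The difference between the two interpretations can be checked with a few lines of Python. This is only an illustration of the arithmetic, assuming the usual big-endian byte order of mainframe binary fields.

```python
raw = bytes.fromhex("20220425")

# Read as a big-endian binary integer (what PIC 9(8) COMP would do)
as_binary = int.from_bytes(raw, byteorder="big")

# Read as unsigned packed: every nibble is one decimal digit
as_unsigned_packed = int(raw.hex())

print(as_binary)           # 539100197
print(as_unsigned_packed)  # 20220425
```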

Adding support for Unsigned Packed fields would be pretty simple in Cobrix. You could add an "Unsigned Packed" flag to the 'decodeBCDIntegralNumber' function that handles Packed (COMP-3) values, and just skip the sign nibble handling if it's the Unsigned Packed format.
You could add a Cobrix special parm, like COMP-UP (similar to what you did for COMP-9).
Then, users could code this in their COBOL copybook for the Unsigned Packed field:
ACCT-DATE PIC X(4) COMP-UP.
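
The idea of a flag on the packed decoder can be sketched in Python as follows. This is not Cobrix's Scala implementation; `decode_bcd_integral` and its `unsigned_packed` parameter are hypothetical names used only to show the logic.

```python
POSITIVE_SIGNS = {"a", "c", "e", "f"}  # C is the preferred positive sign, F means unsigned
NEGATIVE_SIGNS = {"b", "d"}            # D is the preferred negative sign


def decode_bcd_integral(raw: bytes, unsigned_packed: bool = False) -> int:
    """Decode a packed decimal field, with or without a trailing sign nibble."""
    nibbles = raw.hex()
    if unsigned_packed:
        digits, sign = nibbles, None               # every nibble is a digit
    else:
        digits, sign = nibbles[:-1], nibbles[-1:]  # last nibble is the sign
    if not digits.isdigit():
        raise ValueError(f"invalid packed digits: X'{nibbles.upper()}'")
    value = int(digits)
    if sign is None or sign in POSITIVE_SIGNS:
        return value
    if sign in NEGATIVE_SIGNS:
        return -value
    raise ValueError(f"invalid sign nibble: {sign!r}")


print(decode_bcd_integral(bytes.fromhex("020220425F")))                   # 20220425
print(decode_bcd_integral(bytes.fromhex("20220425"), unsigned_packed=True))  # 20220425
```

Both layouts of the example date decode to the same number; only the handling of the last nibble differs.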

Please let me know if you'd like to chat more about this. Thank you very much.

yruslan · 2022-12-08T08:31:27Z

Hi @mike-childs,

Makes sense. I might ask a couple more questions as we go.

The first one,

When you have ACCT-DATE PIC X(4). in unsigned packed format, does this mean that the maximum number of digits of the packed number is 4, or does it mean that the field occupies 4 bytes, so it can have 8 digits?

yruslan · 2022-12-08T09:14:31Z

I see the answer to the question in your description. Sorry.
I think adding a special USAGE like COMP-UP would indeed be the best way to do it.
Or maybe COMP-3U (since it is like COMP-3, just without the sign nibble).

mike-childs · 2022-12-08T12:38:54Z

Hi @yruslan,
Yes, COMP-3U would also be excellent. And yes, the 'X(4)' length refers to 4 bytes in memory (8 digits). And please do feel free to ask questions. I have experience with this topic.
Thank you very much for accepting this request. It will be extremely helpful for us (and others).

yruslan · 2022-12-08T13:09:01Z

Great, thanks for the answer and for such a detailed description!

Will implement it soon.

yruslan · 2022-12-08T14:14:56Z

One more question. Would it be okay if the PIC for packed numerics were required to be

PIC 9(4) COMP-3U.

not

PIC X(4) COMP-3U.

?
This is because the parser relies heavily on the use of '9' in the PIC clause to recognize numeric data types.

mike-childs · 2022-12-08T14:26:11Z

Yes, requiring the '9' (as in 'PIC 9(4) COMP-3U') makes perfect sense because the field should contain only numeric data. The field would have all the same rules as a normal Packed field, other than the lack of a sign nibble.
Thank you.

yruslan · 2022-12-15T16:21:04Z

This is added. You can try building spark-cobol from master. Let me know if it works as expected.

mike-childs · 2022-12-19T13:48:09Z

Thank you very much @yruslan! We have a story in our backlog to pull in the latest Cobrix version and do thorough testing with the new COMP-3U type parm. I will add an update here once we have done that work. We really appreciate you adding this functionality.

mike-childs · 2023-01-05T15:38:04Z

Hello @yruslan. We have finished our testing with the new COMP-3U parm, and it correctly converted the Unsigned Packed fields. I have attached a screenshot showing my input, output, and test results. Please let me know if you need any further information. Thank you very much.

yruslan · 2023-01-05T16:09:13Z

Hi @mike-childs , Thanks a lot for confirming! Glad it works as expected.

diddyp20 · 2023-03-22T15:34:28Z

@yruslan I am getting the below error when updating the copybook to COMP-3U.

za.co.absa.cobrix.cobol.parser.exceptions.SyntaxErrorException: Syntax error in the copybook at line 28: Invalid input 'COMP-3U' at position 28:64

yruslan · 2023-03-22T15:55:50Z

Use spark-cobol 2.6.4.
If you are already using the latest Cobrix, let me know what your copybook statement looks like for that field.

diddyp20 · 2023-03-22T17:13:59Z

@yruslan I have upgraded to spark-cobol 2.6.4 and am now getting this error:

java.lang.NoClassDefFoundError: scala/$less$colon$less

here is the command:

class_poc_df = (
    spark.read.format("cobol")
    .option("copybook", class_copybook)
    .option("record_format", "D")
    .option("schema_retention_policy", "collapse_root")
    .option("drop_value_fillers", "false")
    .load(class_data)
)

yruslan · 2023-03-22T18:25:10Z

The error suggests that you are using a spark-cobol build for a Scala version different from the one in your Spark environment.

Use the artifact that matches your Scala version:

spark-cobol_2.11
spark-cobol_2.12
spark-cobol_2.13

or build the one that matches your environment exactly using 'sbt assembly' (the full command is in README)
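
As a rough aid for picking the right artifact, the sketch below builds the Maven coordinates from a Scala version string. The helper `spark_cobol_artifact` is hypothetical, and the default version 2.6.4 is simply the one mentioned in this thread. The Scala version of a Spark installation is printed by `spark-submit --version`.

```python
SUPPORTED_SCALA = {"2.11", "2.12", "2.13"}  # the artifact suffixes listed above


def spark_cobol_artifact(scala_version: str, cobrix_version: str = "2.6.4") -> str:
    """Return Maven coordinates of the spark-cobol build for a Scala version."""
    # Only the major.minor part (e.g. "2.12" from "2.12.15") selects the artifact
    major_minor = ".".join(scala_version.split(".")[:2])
    if major_minor not in SUPPORTED_SCALA:
        raise ValueError(f"no spark-cobol artifact listed for Scala {scala_version}")
    return f"za.co.absa.cobrix:spark-cobol_{major_minor}:{cobrix_version}"


print(spark_cobol_artifact("2.12.15"))  # za.co.absa.cobrix:spark-cobol_2.12:2.6.4
```

The resulting string can be passed to the `--packages` option of `spark-submit` or `pyspark`.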

AnveshAeturi · 2023-03-22T19:41:29Z

Hi @diddyp20, I have faced a similar error (copybook at line 28: Invalid input 'COMP-3U' at position 28:64) with my copybooks in the past. Check the field alignment in the copybook. It should be aligned properly with respect to the data inside the file. Hopefully that solves the issue.

AnveshAeturi added the question label Sep 9, 2022

AnveshAeturi changed the title ~~Does the Cobrix handle the Binary unsigned fields~~ Does the Cobrix handle the Easytrieve layout.? Sep 9, 2022

yruslan added enhancement accepted and removed question labels Dec 8, 2022

yruslan added a commit that referenced this issue Dec 8, 2022

#516 Add new COMP-3U option to the copybook parser.

1f332dc

This parses 'unsigned packed' format, that is BCD without the sign nibble.

yruslan added a commit that referenced this issue Dec 9, 2022

#516 Implement COMP-3U decoding for 'spark-cobol'.

7079c4a

yruslan added a commit that referenced this issue Dec 12, 2022

#516 Implement COMP-3U decoding for 'spark-cobol'.

ea037b9

yruslan added a commit that referenced this issue Dec 12, 2022

#516 Add info about the new extension to README.

d690236

yruslan mentioned this issue Dec 15, 2022

Feature/516 add support for packed unsigned #546

Merged

yruslan closed this as completed in #546 Dec 15, 2022

yruslan added a commit that referenced this issue Dec 15, 2022

#516 Add new COMP-3U option to the copybook parser.

2af4a9b

This parses 'unsigned packed' format, that is BCD without the sign nibble.

yruslan added a commit that referenced this issue Dec 15, 2022

#516 Implement COMP-3U decoding for 'spark-cobol'.

5fa69a3

yruslan added a commit that referenced this issue Dec 15, 2022

#516 Add info about the new extension to README.

b682532
