copybook meta data for RDBMS #634
Comments
Spark does not have [...] However, it could be possible to retain metadata after schema flattening. How do you flatten the schema?
SparkUtils.flattenSchema(df, useShortFieldNames = false)
I've tested whether retaining the metadata is possible, and it is. This PR makes SparkUtils.flattenSchema() retain metadata: #635. It is already merged into [...]
The new feature is working. Thanks for the feature.
Awesome! Thanks for letting me know.
Background
Currently, copybook metadata is exposed as a Spark schema. We need the schema at the RDBMS level, that is, column types together with their lengths.
Example [Optional]
'''
01  MASTER-RECORD.
    02  RDT-TLF-MTHD-NM                    PIC X(08).
    02  RDT-ADJ-ORGN-TRAN-DT               PIC 9(06).
    02  FILLER                             PIC X(03).
    02  RDT-ADDL-DATA-GROUP.
        05  RDT-ADDL-DATA OCCURS 0 TO 2 TIMES
                DEPENDING ON RDT-ADDL-SEGS-NO.
            10  RDT-ADDL-SEG-KEY.
                15  RDT-ADDL-SEG-KEY-PROD  PIC X(02).
                15  RDT-ADDL-SEG-KEY-TYPE  PIC S9(15)V99 COMP-3.
'''
Current schema:
root
 |-- RDT-TLF-MTHD-NM String
 |-- RDT-ADJ-ORGN-TRAN-DT integer
 |-- RDT-ADDL-DATA-GROUP
 |    |-- RDT-ADDL-SEG-KEY
 |    |    |-- RDT-ADDL-SEG-KEY-PROD String
 |    |    |-- RDT-ADDL-SEG-KEY-TYPE DECIMAL (15,2)
Expected output:
root
 |-- RDT-TLF-MTHD-NM VARCHAR(08)
 |-- RDT-ADJ-ORGN-TRAN-DT integer (06)
 |-- RDT-ADDL-DATA-GROUP
 |    |-- RDT-ADDL-SEG-KEY
 |    |    |-- RDT-ADDL-SEG-KEY-PROD VARCHAR(02)
 |    |    |-- RDT-ADDL-SEG-KEY-TYPE DECIMAL (15,2)
We are able to get element lengths only at the parent level, and only before flattening:
df.schema.fields(0).metadata.getLong("maxLength")
Is there any option to get the expected schema?
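For illustration, here is a rough sketch of how RDBMS-style type names might be derived from a Spark schema once the "maxLength" metadata is available on each field. It relies only on the standard Spark types API. The object name RdbmsSchemaSketch, the helpers rdbmsType and describe, the sample field names, and the type-name mapping are all hypothetical and not part of the library. It also assumes that only string fields carry "maxLength", so numeric lengths such as integer (06) are not covered.

```scala
import org.apache.spark.sql.types._

// Hypothetical helper object, not part of any library.
object RdbmsSchemaSketch {

  // Maps a single Spark field to an RDBMS-style type string.
  // Assumption: string fields may carry a "maxLength" metadata entry.
  def rdbmsType(field: StructField): String = field.dataType match {
    case StringType if field.metadata.contains("maxLength") =>
      s"VARCHAR(${field.metadata.getLong("maxLength")})"
    case StringType     => "VARCHAR"
    case d: DecimalType => s"DECIMAL(${d.precision},${d.scale})"
    case IntegerType    => "INTEGER"
    case LongType       => "BIGINT"
    case other          => other.sql // fall back to Spark's own SQL type name
  }

  // Describes every top-level field of an (already flattened) schema.
  def describe(schema: StructType): Seq[String] =
    schema.fields.toSeq.map(f => s"${f.name} ${rdbmsType(f)}")

  def main(args: Array[String]): Unit = {
    // Hand-built sample schema standing in for a flattened DataFrame schema.
    val sample = StructType(Seq(
      StructField("RDT_TLF_MTHD_NM", StringType, nullable = true,
        new MetadataBuilder().putLong("maxLength", 8L).build()),
      StructField("RDT_ADDL_SEG_KEY_TYPE", DecimalType(15, 2))
    ))
    describe(sample).foreach(println)
  }
}
```

With a real DataFrame, the input would be the schema of the result of SparkUtils.flattenSchema(df, useShortFieldNames = false), using a library version that includes the change from #635 so that the metadata survives flattening.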