segment filter not working in 2.6.2 version #560

Closed
saikumare-a opened this issue Jan 3, 2023 · 8 comments
Labels
accepted (Accepted for implementation), bug (Something isn't working)

Comments

@saikumare-a

saikumare-a commented Jan 3, 2023

Background [Optional]

Hi,
We receive a multisegment ASCII file and would like to filter the data for a particular segment based on a column.

As per the documentation, we tried using option("segment_filter").

Even with this filter set, no filtering of the data takes place. Could you help look into this?

saikumare-a added the question (Further information is requested) label on Jan 3, 2023
@saikumare-a
Author

Hi @yruslan,

I just tested this in 2.6.1 and it works fine, but it does not work in 2.6.2. Can you look into this?

@yruslan
Collaborator

yruslan commented Jan 3, 2023

Hi,
What's your full spark.read code snippet?

@saikumare-a
Author

saikumare-a commented Jan 4, 2023

Hi @yruslan,

Below are the options used:

final_options = {'copybook': '<copybook_path>', 'generate_record_id': 'false', 'drop_value_fillers': 'false', 'drop_group_fillers': 'false', 'pedantic': 'true', 'encoding': 'ascii', 'variable_size_occurs': 'true', 'record_format': 'D', 'segment_field': 'BASE_RCRD_ID', 'segment_filter': 'ABC'}

df=spark.read.format("cobol").options(**final_options).load()
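One way to confirm whether the filter took effect is to list the distinct values of the segment field in the loaded data frame. The following is a minimal sketch, assuming `df` is a PySpark DataFrame; `unexpected_segments` is a hypothetical helper name and not part of the library.

```python
def unexpected_segments(df, segment_field, expected_value):
    """Return the set of segment IDs in df other than the expected one.

    Assumes df is a PySpark DataFrame. An empty set suggests the filter
    was applied; a non-empty set suggests it was ignored.
    """
    rows = df.select(segment_field).distinct().collect()
    # Each collected Row is indexable by position; column 0 is the segment field.
    return {row[0] for row in rows if row[0] != expected_value}


# Usage, with the field and value from this issue:
# print(unexpected_segments(df, "BASE_RCRD_ID", "ABC"))
```

Note that `collect()` brings the distinct values to the driver, which should be fine for a segment ID column with only a handful of values.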

saikumare-a changed the title from 'segment filter not working in "D" record_format' to 'segment filter not working in 2.6.2 version' on Jan 4, 2023
@yruslan
Collaborator

yruslan commented Jan 4, 2023

Yeah, I can see why it is happening. For now, you can work around it by filtering your data frame using .filter(col("BASE_RCRD_ID") === "ABC").
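The `===` operator in the snippet above is Scala syntax. The reporter's code is PySpark, where column equality is written with `==`. Below is a minimal sketch of the same workaround in Python, assuming `df` is the data frame loaded with the options shown earlier in the thread; `filter_segment` is a hypothetical helper name.

```python
def filter_segment(df, segment_field, segment_value):
    """Keep only rows whose segment field equals segment_value.

    On a PySpark DataFrame, df[segment_field] yields a Column, so no
    extra import is needed; col(segment_field) from
    pyspark.sql.functions would work equally well.
    """
    return df.filter(df[segment_field] == segment_value)


# Usage, with the field and value from this issue:
# abc_df = filter_segment(df, "BASE_RCRD_ID", "ABC")
```

This filters after the records have been parsed, so it should return the rows the option was meant to select, though it may not give any read-time benefit that filtering at the source would.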

yruslan added the bug (Something isn't working) and accepted (Accepted for implementation) labels and removed the question (Further information is requested) label on Jan 4, 2023
@saikumare-a
Author

Hi @yruslan,

Has this issue been fixed, or is there a timeline for when it could be fixed?

@saikumare-a
Author

Hi @yruslan,

Any luck looking into this? Thanks in advance!

@yruslan
Collaborator

yruslan commented Jan 20, 2023

Not yet. Please use the workaround for now.

@yruslan
Collaborator

yruslan commented Feb 2, 2023

This should be fixed in 2.6.3, released yesterday.

yruslan closed this as completed on Feb 2, 2023