Enable dropping of columns in dataset.schema.translate. #2387

Merged 1 commit on Oct 21, 2022

@jaheba jaheba commented Oct 21, 2022

Issue #, if available:

Description of changes:

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.


Comment on lines 343 to +349
def __call__(self, item):
    result = dict(item)
    if self.drop:
        keys = item.keys() - self.get_fields()
        result = {key: item[key] for key in keys}
    else:
        result = dict(item)

@codingWhale13 codingWhale13 Oct 21, 2022

What is supposed to happen when we call the Translator with drop=True? I could imagine two things:

  • Dropping item from fields or
  • returning the fields that are not present in item

It seems to me like item would be a subset of fields so item.keys() - self.get_fields() would always be empty...
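For reference, the set difference in question can be tried on plain dicts. This is only a sketch: the `selected` set is a hypothetical stand-in for whatever self.get_fields() returns, which is not shown in this thread.

```python
# Hypothetical stand-in for self.get_fields(): the input keys that
# some translation reads from. The real method is not shown here.
selected = {"b"}

item = {"b": 1, "c": 42}

# dict.keys() returns a set-like view, so `-` is a set difference.
remaining = item.keys() - selected
kept = {key: item[key] for key in remaining}

# If every key of the item is selected, nothing is left over.
empty = {"b": 1}.keys() - selected
```

The difference is empty only when every key of item is in the selected set; any extra key such as "c" survives.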

Contributor Author

item is the input, so I'm not sure what you mean by dropping item from fields.

The idea is that if drop is enabled, only those fields that are not selected by some translation are copied to the result.

This is what it does:

t = Translator.parse(a="b", drop=True)
t(
    {
        "b": 1,  # removed if `drop` is set, because it is selected by a
        "c": 42,  # always kept
    }
)
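The behaviour described above can be sketched in a self-contained way. The class below is a hypothetical, simplified stand-in and not the library's Translator: the name SketchTranslator, the plain output-name-to-input-key mapping, and the step that writes translated values into the result are all assumptions made for illustration. Only the drop branch mirrors the lines under review. The docstring shows one possible way to word what drop does.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Set


@dataclass
class SketchTranslator:
    """Hypothetical, simplified stand-in; not the library's Translator.

    ``fields`` maps an output name to the input key it reads from.
    If ``drop`` is true, input keys that are read by some translation
    are not copied to the result; all other input keys are copied
    unchanged.
    """

    fields: Dict[str, str] = field(default_factory=dict)
    drop: bool = False

    def get_fields(self) -> Set[str]:
        # Input keys that are selected by at least one translation.
        return set(self.fields.values())

    def __call__(self, item: Dict[str, Any]) -> Dict[str, Any]:
        if self.drop:
            # Copy only the keys that no translation selects.
            keys = item.keys() - self.get_fields()
            result = {key: item[key] for key in keys}
        else:
            result = dict(item)

        # Assumption: translated values are stored under their new names.
        for name, source in self.fields.items():
            result[name] = item[source]
        return result


item = {"b": 1, "c": 42}
with_drop = SketchTranslator(fields={"a": "b"}, drop=True)(item)
without_drop = SketchTranslator(fields={"a": "b"}, drop=False)(item)
```

Under these assumptions, with_drop contains a and c, while without_drop additionally keeps the original b.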

Contributor

I see, that makes sense. 👍
Could you add a similar explanation to the docstring, explaining what drop does?

@jaheba jaheba merged commit 9d55b6c into awslabs:dev Oct 21, 2022
@jaheba jaheba deleted the translate-drop branch October 21, 2022 11:51
@lostella lostella added the new feature (one of pr required labels) label Oct 24, 2022