Add dataset.schema.Schema + types. #2391

Merged
merged 9 commits into from
Oct 25, 2022


Contributor

@jaheba jaheba commented Oct 24, 2022

Issue #, if available:

Description of changes:

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

Please tag this pr with at least one of these labels to make our release process faster: BREAKING, new feature, bug fix, other change, dev setup

@jaheba jaheba changed the title Add dataset.schema.Schema + types. Add dataset.schema.Schema + types. Oct 24, 2022
Jasper Zschiegner added 4 commits October 24, 2022 16:13

@lostella lostella added the new feature (one of pr required labels) label Oct 24, 2022
Comment on lines 57 to 71
    `time_dim` is just a marker, indicating which axis notes the time-axis,
    useful for splitting. If `time_dim` is none, the array is time invariant.
    """

    ndim: int
    dtype: typing.Optional[typing.Type[T]] = None
    time_dim: typing.Optional[int] = None

    def apply(self, data):
        arr = np.asarray(data, dtype=self.dtype)

        if arr.ndim < self.ndim:
            to_expand = self.ndim - arr.ndim
            new_shape = (1,) * to_expand + arr.shape
            arr = arr.reshape(new_shape)
Contributor

It looks like `time_dim` ends up referring to the reshaped array. If `to_expand > 0` (so reshaping kicks in), one then needs to track which axis `time_dim` refers to. Not sure how to fix this, though.
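
A minimal sketch of the concern, assuming `time_dim` is given relative to the input array rather than the reshaped one. The helper `expand_and_track` is hypothetical and not part of this PR; it only illustrates that prepending axes shifts a non-negative axis index by `to_expand`, while a negative index still points at the same axis.

```python
import numpy as np


def expand_and_track(data, ndim, time_dim=None):
    """Hypothetical helper: prepend axes and shift `time_dim` to match."""
    arr = np.asarray(data)
    to_expand = max(ndim - arr.ndim, 0)

    if to_expand:
        # Same reshape as in the snippet under review: add leading axes.
        arr = arr.reshape((1,) * to_expand + arr.shape)
        # A non-negative index moves right by the number of new axes;
        # a negative index counts from the end and stays valid as-is.
        if time_dim is not None and time_dim >= 0:
            time_dim += to_expand

    return arr, time_dim


arr, t = expand_and_track([1.0, 2.0, 3.0], ndim=2, time_dim=0)
print(arr.shape, t)  # (1, 3) 1
```

Whether the index should be interpreted before or after reshaping is a design decision for the schema; the sketch only shows one consistent choice.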

Contributor

Maybe reshaping should not happen at all, and a badly shaped list should simply fail to convert to the desired array type?
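
A rough sketch of that stricter behaviour, assuming a dimension mismatch should raise instead of being padded with leading axes. The function name `strict_apply` and the choice of `ValueError` are assumptions for illustration, not what the PR implements.

```python
import numpy as np


def strict_apply(data, ndim, dtype=None):
    """Hypothetical strict variant: convert, but never reshape."""
    arr = np.asarray(data, dtype=dtype)

    # Reject input whose dimensionality differs from the declared one,
    # so `time_dim` always refers to the axes the caller provided.
    if arr.ndim != ndim:
        raise ValueError(f"expected {ndim} dimensions, got {arr.ndim}")

    return arr


ok = strict_apply([[1.0, 2.0, 3.0]], ndim=2)
print(ok.shape)  # (1, 3)

try:
    strict_apply([1.0, 2.0, 3.0], ndim=2)
except ValueError as err:
    print(err)  # expected 2 dimensions, got 1
```

With this variant, `time_dim` needs no adjustment, at the cost of requiring callers to supply correctly shaped data.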

Contributor Author

Maybe that's easiest. I don't think we allow 1d arrays for dynamic_feat, or do we?

Another option is to use translate to add dimensions on the fly.

@jaheba jaheba merged commit 267e8f5 into awslabs:dev Oct 25, 2022
@jaheba jaheba deleted the schema-type branch October 25, 2022 10:15
Labels
new feature (one of pr required labels)

2 participants