Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Undocumented change: preservation of class attribute #4051

Closed
heavywatal opened this issue Dec 22, 2018 · 5 comments · Fixed by #4751
Closed

Undocumented change: preservation of class attribute #4051

heavywatal opened this issue Dec 22, 2018 · 5 comments · Fixed by #4751

Comments

@heavywatal
Copy link

The class attribute is lost in the released 0.7.8, and preserved in the dev version:

x = as_tibble(iris)
class(x) = c("myclass", class(x))
x %>% dplyr::filter(Sepal.Length < 5) %>% class()

# 0.7.8
# [1] "tbl_df"     "tbl"        "data.frame"

# 0.7.99.9000
# [1] "myclass"    "tbl_df"     "tbl"        "data.frame"

It can be a side effect of the changes in grouped_df (or slice()), and happens with the stable tibble 1.4.2 and other verbs like mutate(). Anyway it is favorable to me and seems to be consistent with the upcoming tibble with subclass support. I just hope it is documented more explicitly so that I can give a clear direction to issues like YuLab-SMU/tidytree#7.

@elinw
Copy link

elinw commented Jan 5, 2019

This is a breaking change for skimr because it means that the print function no longer is print.data.frame() but rather print.myclass() so the output needs to be piped to as.data.frame() in order to avoid an error. Thanks @romainfrancois for alerting us to the breakage. I feel this is an extremely significant change that, if kept, should be documented in the list of major breaking changes. I'm actually not sure what to do that would work with both versions except for overloading the impacted dplyr verbs.

@elinw
Copy link

elinw commented Jan 5, 2019

Based on #4051 I guess we should do mutate also; it's just not in any of our examples or vignettes. Are you aware of this being an issue for other verbs?

@yutannihilation
Copy link
Member

Is the preservation of class really an intended change? If so, I'll file the below as a new issue; for regrouping purpose, group_by(..., add = TRUE) might be better replaced because group_by() doesn't respect the original classes.

library(dplyr, warn.conflicts = FALSE)

x <- as_tibble(iris)
class(x) <- c("myclass", class(x))
x %>%
  add_count(Species) %>%
  class()
#> [1] "tbl_df"     "tbl"        "data.frame"

Created on 2019-01-07 by the reprex package (v0.2.1)

@hadley
Copy link
Member

hadley commented Jan 7, 2019

Yes, it's definitely intended although it won't be until dplyr 0.9.0 (where we have full vctrs integration) that we expect every function to do the right thing. The behaviour of add_count() etc is going to need quite a lot of thought, so an issue would be helpful.

@yutannihilation
Copy link
Member

Thanks, it's really a good news to me then :) I'll file a new issue.

@romainfrancois romainfrancois added this to the 0.9.0 milestone Jun 10, 2019
hadley added a commit that referenced this issue Jan 14, 2020
hadley added a commit that referenced this issue Jan 15, 2020
This PR starts to develop the dplyr "interface", i.e. the set of generics that you need to provide methods for if you want to extend dplyr to work with new data frame subclasses. It also uses those methods (along with a count_regroups()) to ensure that the existing grouped_df implementations are not needlessly regrouping data.

Fixes #4086 because count() can now use dplyr_reconstruct() to restore the original class
Fixes #4051 because I've carefully documented the return value of the major verbs
Fixes #4711 since implementing with_groups() is now easy.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants