I have a data frame in which, for each grouping variable, there are two types of variables: one set for which I need the mean within each group, and another for which I need the sum within each group. That is, I want to apply two different summary functions to two different sets of variables in a data frame after applying some chain functions (such as `filter` and `select`, because the original problem is more complicated than this).

Ideally I want to use dplyr's `summarize_at` function twice in the same chain, applying `mean` to variable set 1 and `sum` to variable set 2 in two different operations. For obvious reasons this fails: after the first `summarize_at`, the returned grouped df no longer contains the second set of variables. The chain I tried included `filter(x2 != 20)` and `select(group.var, x1:xn, y1:yn)` (both just for reference) and stopped with an error beginning `Error in is_character(x, encoding = encoding, n = 1L) :`.

I can write two snippets which do the same grouping, selecting and filtering but different summarizing using the `summarize_all` function, and then join the grouped df's using `group.var`, but I'm looking for a more efficient method. The end result I want has the columns:

group.var x1 x2 x3 y1 y2 y3

Any suggestions, preferably using dplyr or data.table?

One way would be with `mutate` and then `distinct`.

Another way would be to make both summaries for all variables, and then select only the relevant combinations (`mean` for x, and `sum` for y), ending the chain with `select(group.var, matches("x\\d_mean"), matches("y\\d_sum"))`. The result has the columns:

group.var x1_mean x2_mean x3_mean y1_sum y2_sum y3_sum

If you're bothered by the summary names appended to the column names, you can add something like `%>% rename_all(function(x) gsub("_.*", "", x))` at the end.

And last but not least, there is also a way with purrr, after `library(tidyverse)`, which would give the same output as the first approach here.

Note that decimals disappeared in the examples above because this is how a tibble displays them. They are still there, and you can display them in the console by adding `%>% as.data.frame()` at the end of each snippet.

On summaries in R more generally: the `tapply()` function allows us to create group summaries based on factor levels. The `summary()` function invokes specific methods that depend on the class of its first argument, which is why its output for linear regression models has its own key components. You may also use custom functions to summarize regression models that do not currently have broom tidiers.
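For the grouped-summary question in this thread, the following is a minimal sketch of how the suggested approaches might look. The toy data frame and the result names (`res_mutate`, `res_select`, `res_across`) are hypothetical and only follow the naming used in the question. The first two chains are one possible reading of the `mutate`/`distinct` and summarise-then-`select` ideas, not the original snippets. The third uses `across()`, which assumes dplyr 1.0.0 or later and does not appear in the thread.

```r
library(dplyr)

# Hypothetical toy data; column names follow the question (group.var, x1..x3, y1..y3).
df <- data.frame(
  group.var = c("a", "a", "b", "b"),
  x1 = c(1, 2, 3, 4),
  x2 = c(10, 30, 50, 70),
  x3 = c(0.5, 1.5, 2.5, 3.5),
  y1 = c(1, 1, 2, 2),
  y2 = c(5, 6, 7, 8),
  y3 = c(0, 1, 0, 1)
)

# Approach 1: overwrite every column with its group summary,
# then keep one row per group with distinct().
res_mutate <- df %>%
  group_by(group.var) %>%
  mutate_at(vars(starts_with("x")), mean) %>%
  mutate_at(vars(starts_with("y")), sum) %>%
  ungroup() %>%
  distinct()

# Approach 2: compute both summaries for all columns, keep only the
# relevant combinations, then strip the "_mean" / "_sum" suffixes.
res_select <- df %>%
  group_by(group.var) %>%
  summarise_all(list(mean = mean, sum = sum)) %>%
  select(group.var, matches("x\\d_mean"), matches("y\\d_sum")) %>%
  rename_all(function(x) gsub("_.*", "", x))

# Alternative (assumes dplyr >= 1.0.0): one summarise() call with a
# separate across() for each set of variables.
res_across <- df %>%
  group_by(group.var) %>%
  summarise(
    across(starts_with("x"), mean),
    across(starts_with("y"), sum),
    .groups = "drop"
  )

# Print as a plain data frame so decimals are not abbreviated by the tibble display.
print(as.data.frame(res_across))
```

All three chains should produce one row per group with the x columns averaged and the y columns summed. The `across()` version avoids computing summaries that are later discarded, which may matter on wide data.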