summarise() is typically used on grouped data created by group_by().
The output will have one row for each group.
summarise(.data, ...)
summarize(.data, ...)
.data | A tbl. All main verbs are S3 generics and provide methods for |
---|---|
... | Name-value pairs of summary functions. The name will be the name of the variable in the result. The value should be an expression that returns a single value. These arguments are automatically quoted and evaluated in the context of the data frame. They support unquoting and splicing. |
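As a minimal sketch of splicing, the following builds a list of quoted summary expressions and passes them to summarise() with `!!!`. The object name `summaries` and the column names `avg_disp` and `max_hp` are illustrative only; the data is the built-in mtcars.

```r
library(dplyr)

# A named list of quoted summary expressions; the names become
# the column names in the result. (Names here are illustrative.)
summaries <- quos(
  avg_disp = mean(disp),
  max_hp   = max(hp)
)

# !!! splices each element of the list in as its own argument,
# as if the two name-value pairs had been typed directly.
result <- mtcars %>%
  group_by(cyl) %>%
  summarise(!!!summaries)
```

Splicing can be convenient when the set of summaries is assembled programmatically rather than written out by hand.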
An object of the same class as .data. One grouping level will be dropped.
Center: mean(), median()
Spread: sd(), IQR(), mad()
Range: min(), max(), quantile()
Count: n(), n_distinct()
Logical: any(), all()
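The functions listed above can be combined in a single call. The sketch below, using the built-in mtcars data, picks one or two from each category; the output column names (`centre`, `spread`, and so on) and the `qsec < 16` threshold are arbitrary choices made for illustration.

```r
library(dplyr)

stats <- mtcars %>%
  group_by(cyl) %>%
  summarise(
    centre   = median(mpg),                # center
    spread   = IQR(mpg),                   # spread
    lowest   = min(mpg),                   # range
    highest  = max(mpg),
    p90      = quantile(mpg, probs = 0.9), # one probability, so one value
    count    = n(),                        # count
    gears    = n_distinct(gear),
    any_fast = any(qsec < 16)              # logical
  )
```

Note that quantile() is given a single probability here so that it returns one value per group, in line with the requirement that each expression return a single value.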
Data frames are the only backend that supports creating a variable and using it in the same summary. See examples for more details.
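A small sketch of what this looks like with a local data frame: the third summary below refers to the two created just before it. The column names `total`, `n`, and `mean` are illustrative, and per the note above this pattern should not be assumed to work on other backends.

```r
library(dplyr)

by_cyl <- mtcars %>%
  group_by(cyl) %>%
  summarise(
    total = sum(disp),
    n     = n(),
    mean  = total / n   # uses the two summaries created just above
  )
```

Because later expressions see earlier results, the order of the name-value pairs matters; giving a summary a name that differs from the input column avoids accidentally overwriting a variable that a later expression still needs.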
When applied to a data frame, row names are silently dropped. To preserve them, convert to an explicit variable with tibble::rownames_to_column().
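For example, the car names in mtcars are stored as row names. A minimal sketch of keeping them available to a summary, where the column names `model` and `heaviest_model` are chosen here purely for illustration:

```r
library(dplyr)

heaviest <- mtcars %>%
  tibble::rownames_to_column(var = "model") %>%  # car names become a column
  group_by(cyl) %>%
  summarise(
    heaviest_model = model[which.max(wt)],       # name of the heaviest car
    n = n()
  )
```

Without the rownames_to_column() step, the `model` information would not be available inside summarise().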
# A summary applied to ungrouped tbl returns a single row
mtcars %>%
  summarise(mean = mean(disp), n = n())
#>       mean  n
#> 1 230.7219 32

# Usually, you'll want to group first
mtcars %>%
  group_by(cyl) %>%
  summarise(mean = mean(disp), n = n())
#> # A tibble: 3 x 3
#>     cyl  mean     n
#>   <dbl> <dbl> <int>
#> 1     4  105.    11
#> 2     6  183.     7
#> 3     8  353.    14

# Each summary call removes one grouping level (since that group
# is now just a single row)
mtcars %>%
  group_by(cyl, vs) %>%
  summarise(cyl_n = n()) %>%
  group_vars()
#> [1] "cyl"

# Note that with data frames, newly created summaries immediately
# overwrite existing variables
mtcars %>%
  group_by(cyl) %>%
  summarise(disp = mean(disp), sd = sd(disp))
#> # A tibble: 3 x 3
#>     cyl  disp    sd
#>   <dbl> <dbl> <dbl>
#> 1     4  105.    NA
#> 2     6  183.    NA
#> 3     8  353.    NA

# summarise() supports quasiquotation. You can unquote raw
# expressions or quosures:
var <- quo(mean(cyl))
summarise(mtcars, !!var)
#>   mean(cyl)
#> 1    6.1875