Supporting Statistical Analysis for Research
2.10 Summarizing data
The summarise() function transforms a tibble by applying functions that produce statistics of its variables. The result has one row per group, or a single row when the tibble is not grouped.
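As a minimal sketch of this behavior, the code below applies summarise() to a small made-up tibble. The tibble `toy` and its values are hypothetical stand-ins, not the cps data used in the examples; the only assumption is that the dplyr and tibble packages are installed.

```r
library(dplyr)
library(tibble)

# Hypothetical toy data standing in for cps; the values are made up.
toy <- tibble(
  age  = c(25, 35, 45),
  earn = c(10000, 20000, 30000)
)

# summarise() collapses the three rows into a single row of statistics.
toy_summary <- toy %>%
  summarise(
    mean_age    = mean(age),
    std_dev_age = sd(age),
    n_obs       = n()
  )

toy_summary
```

Each argument to summarise() names a new column and gives the expression that computes it, so the result has one column per statistic requested.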
2.10.1 Examples
Summarizing one variable of the cps tibble
We will find the mean and standard deviation of age.

    cps %>%
      summarise(
        mean_age = mean(age),
        std_dev_age = sd(age)
      )

    # A tibble: 1 x 2
      mean_age std_dev_age
         <dbl>       <dbl>
    1     33.2        11.0
Summarizing multiple columns
We will find the mean earnings in years 74, 75, and 78 for each ethnicity.

    cps_eth_earn <- cps %>%
      group_by(ethnicity) %>%
      summarise_at(
        vars(real_earn_74:real_earn_78),
        mean
      )

    cps_eth_earn

    # A tibble: 3 x 4
      ethnicity      real_earn_74 real_earn_75 real_earn_78
      <fct>                 <dbl>        <dbl>        <dbl>
    1 white_non_hisp       14376.       13999.       15213.
    2 black                11427.       10941.       12007.
    3 hisp                 12402.       12290.       13397.
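
In dplyr 1.0.0 and later, summarise_at() is marked as superseded in favor of across(), although it continues to work. The sketch below shows what an across() version of the same grouped summary might look like. The tibble `toy` and its values are hypothetical; only the column names are borrowed from cps.

```r
library(dplyr)
library(tibble)

# Hypothetical toy data; only the column names mirror cps.
toy <- tibble(
  ethnicity    = c("a", "a", "b", "b"),
  real_earn_74 = c(10, 20, 30, 50),
  real_earn_75 = c(12, 18, 28, 52),
  real_earn_78 = c(14, 26, 40, 60)
)

# across() selects the range of columns and applies mean() to each,
# once per ethnicity group.
toy_eth_earn <- toy %>%
  group_by(ethnicity) %>%
  summarise(across(real_earn_74:real_earn_78, mean))

toy_eth_earn
```

The column range `real_earn_74:real_earn_78` selects by position, so it picks up every column that sits between the two endpoints in the tibble.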
Summarizing with multiple grouping variables
We will find the mean earnings in year 78 for each combination of ethnicity and marital status.

    cps_eth_marr_earn <- cps %>%
      group_by(ethnicity, marr) %>%
      summarise(
        mean_earn_78 = mean(real_earn_78)
      )

    cps_eth_marr_earn

    # A tibble: 6 x 3
    # Groups:   ethnicity [3]
      ethnicity       marr mean_earn_78
      <fct>          <dbl>        <dbl>
    1 white_non_hisp     0       11319.
    2 white_non_hisp     1       16742.
    3 black              0        9199.
    4 black              1       13728.
    5 hisp               0       10138.
    6 hisp               1       14607.
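
The `# Groups: ethnicity [3]` line in the output shows that the result is still grouped by ethnicity: by default summarise() drops only the last grouping variable. If an ungrouped result is wanted, one option in dplyr 1.0.0 and later is the `.groups` argument. The sketch below uses a hypothetical tibble `toy` with made-up values, not the cps data.

```r
library(dplyr)
library(tibble)

# Hypothetical toy data; only the column names mirror cps.
toy <- tibble(
  ethnicity    = c("a", "a", "a", "b", "b", "b"),
  marr         = c(0, 1, 1, 0, 0, 1),
  real_earn_78 = c(10, 20, 30, 40, 60, 80)
)

# .groups = "drop" returns a plain tibble with no grouping left over.
toy_eth_marr <- toy %>%
  group_by(ethnicity, marr) %>%
  summarise(
    mean_earn_78 = mean(real_earn_78),
    .groups = "drop"
  )

toy_eth_marr
```

Calling ungroup() on the result is an equivalent alternative; dropping the grouping avoids surprises when later steps are meant to operate on the whole table.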