SSCC - Social Science Computing Cooperative Supporting Statistical Analysis for Research

5.3 Numeric variables

These exercises use the mtcars.csv data set.

  1. Import the mtcars.csv data set.

    library(tidyverse)  # provides read_csv(), rename(), mutate(), separate(), str_pad(), ...

    mtcars_path <- file.path("..", "datasets", "mtcars.csv")
    mtcars_in <- read_csv(mtcars_path, col_types = cols())
    Warning: Missing column names filled in: 'X1' [1]
    mtcars_in <- rename(mtcars_in, make_model = X1)
    mtcars <- mtcars_in
    
    glimpse(mtcars)
    Observations: 32
    Variables: 12
    $ make_model <chr> "Mazda RX4", "Mazda RX4 Wag", "Datsun 710", "Hornet...
    $ mpg        <dbl> 21.0, 21.0, 22.8, 21.4, 18.7, 18.1, 14.3, 24.4, 22....
    $ cyl        <dbl> 6, 6, 4, 6, 8, 6, 8, 4, 4, 6, 6, 8, 8, 8, 8, 8, 8, ...
    $ disp       <dbl> 160.0, 160.0, 108.0, 258.0, 360.0, 225.0, 360.0, 14...
    $ hp         <dbl> 110, 110, 93, 110, 175, 105, 245, 62, 95, 123, 123,...
    $ drat       <dbl> 3.90, 3.90, 3.85, 3.08, 3.15, 2.76, 3.21, 3.69, 3.9...
    $ wt         <dbl> 2.620, 2.875, 2.320, 3.215, 3.440, 3.460, 3.570, 3....
    $ qsec       <dbl> 16.46, 17.02, 18.61, 19.44, 17.02, 20.22, 15.84, 20...
    $ vs         <dbl> 0, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, ...
    $ am         <dbl> 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
    $ gear       <dbl> 4, 4, 4, 3, 3, 3, 3, 4, 4, 4, 4, 3, 3, 3, 3, 3, 3, ...
    $ carb       <dbl> 4, 4, 1, 1, 2, 1, 4, 2, 2, 4, 4, 3, 3, 3, 4, 4, 4, ...
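
    The warning above reflects how the blank first header cell was auto-named. The generated name can differ between readr versions (newer releases typically produce `...1` rather than `X1`), so renaming by position may be more portable. The following is a minimal sketch under that assumption; the temporary file and its three rows are hypothetical stand-ins for mtcars.csv.

    ```r
    library(readr)
    library(dplyr)

    # Hypothetical stand-in for mtcars.csv: the first header cell is blank
    tmp <- tempfile(fileext = ".csv")
    writeLines(c(",mpg,cyl", "Mazda RX4,21,6", "Datsun 710,22.8,4"), tmp)

    # Silence the name-repair warning or message; its wording depends on the readr version
    cars_in <- suppressWarnings(suppressMessages(read_csv(tmp, col_types = cols())))

    # Rename the first column by position instead of by its auto-generated name
    cars <- rename(cars_in, make_model = 1)
    names(cars)
    ```

    Renaming by position avoids hard-coding the auto-generated name, at the cost of assuming the unnamed column is always first.
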
  2. The wt variable is measured in thousands of pounds. Change this variable to a character variable that has a comma separating the thousands digit from the hundreds digit, e.g. 2.14 becomes 2,140.

    Hint: for one of the possible solutions, you may find it useful to look for a tidyverse string function that pads. Padding adds characters to a string until it reaches a fixed width.

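    mtcars <-
      mtcars %>%
      mutate(
        wt_1000s = wt,                 # keep the original numeric weight
        wt = as.character(wt_1000s)
      ) %>%
      # split at the decimal point into the thousands digit and the remaining digits
      separate(
        wt, into = c("wt_d4", "wt_d123"), sep = "\\.", remove = TRUE
      ) %>%
      mutate(
        # restore the trailing zeros that as.character() dropped
        wt_d123 = str_pad(wt_d123, width = 3, side = "right", pad = "0")
      ) %>%
      unite(wt, wt_d4, wt_d123, sep = ",")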
    
    mtcars %>%
      select(make_model, wt, wt_1000s) %>%
      head()
    # A tibble: 6 x 3
      make_model        wt    wt_1000s
      <chr>             <chr>    <dbl>
    1 Mazda RX4         2,620     2.62
    2 Mazda RX4 Wag     2,875     2.88
    3 Datsun 710        2,320     2.32
    4 Hornet 4 Drive    3,215     3.22
    5 Hornet Sportabout 3,440     3.44
    6 Valiant           3,460     3.46
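
    The padding step is needed because as.character() drops trailing zeros, so the digits after the decimal point do not all have the same width. The following is a small sketch of that step in isolation; the three weights are hypothetical values chosen to show the effect, and str_split_fixed() is used here only for illustration in place of separate().

    ```r
    library(stringr)

    # Hypothetical weights in thousands of pounds
    wt <- c(2.62, 2.875, 3.44)

    # Digits after the decimal point: "62" "875" "44"
    frac <- str_split_fixed(as.character(wt), fixed("."), 2)[, 2]

    # Pad on the right with zeros to a width of 3: "620" "875" "440"
    padded <- str_pad(frac, width = 3, side = "right", pad = "0")
    padded
    ```
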

    This is another possible solution.

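    # Restore the numeric wt so this solution starts from the original data
    mtcars <-
      mtcars %>%
      mutate(
        wt = wt_1000s
      ) %>%
      select(-wt_1000s)
    
    mtcars <-
      mtcars %>%
      mutate(
        wt_1000s = wt,
        # convert to pounds, then to text
        wt = as.character(wt * 1000),
        # insert a comma before the last three characters
        wt = str_replace(wt, "(...$)", ",\\1")
      )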
    
    mtcars %>%
      select(make_model, wt, wt_1000s) %>%
      head()
    # A tibble: 6 x 3
      make_model        wt    wt_1000s
      <chr>             <chr>    <dbl>
    1 Mazda RX4         2,620     2.62
    2 Mazda RX4 Wag     2,875     2.88
    3 Datsun 710        2,320     2.32
    4 Hornet 4 Drive    3,215     3.22
    5 Hornet Sportabout 3,440     3.44
    6 Valiant           3,460     3.46
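
    A third option, not part of the solutions above, is to let base R insert the grouping mark with format(). This sketch uses hypothetical weights, including a whole number, because as.character(3) is "3" and splitting that string at the decimal point would leave no fractional digits to pad.

    ```r
    # Hypothetical weights in thousands of pounds, including a whole number
    wt_1000s <- c(2.62, 2.875, 3)

    # Convert to whole pounds; round() guards against floating-point noise
    wt_lbs <- round(wt_1000s * 1000)

    # big.mark inserts the comma; trim = TRUE drops the padding format() adds
    wt_chr <- format(wt_lbs, big.mark = ",", trim = TRUE)
    wt_chr
    # "2,620" "2,875" "3,000"
    ```

    Within mutate(), the same call could be written as wt = format(round(wt_1000s * 1000), big.mark = ",", trim = TRUE).
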
  3. Convert the character variable you created in the prior exercise to a new numeric variable. Make the unit of measure for this new variable thousands of pounds.

    mtcars <-
      mtcars %>%
      mutate(
        wt_1000s_new = parse_number(wt) / 1000
      )
    
    mtcars %>%
      select(make_model, wt, wt_1000s_new, wt_1000s) %>%
      head()
    # A tibble: 6 x 4
      make_model        wt    wt_1000s_new wt_1000s
      <chr>             <chr>        <dbl>    <dbl>
    1 Mazda RX4         2,620         2.62     2.62
    2 Mazda RX4 Wag     2,875         2.88     2.88
    3 Datsun 710        2,320         2.32     2.32
    4 Hornet 4 Drive    3,215         3.22     3.22
    5 Hornet Sportabout 3,440         3.44     3.44
    6 Valiant           3,460         3.46     3.46
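
    The printed columns look identical, but the tibble display rounds to three significant digits. One way to confirm the round trip, sketched here with hypothetical values rather than the full data set, is to compare the new and original variables with a tolerance.

    ```r
    library(readr)

    # Hypothetical formatted weights in pounds and their original values
    wt <- c("2,620", "2,875", "3,440")
    wt_1000s <- c(2.62, 2.875, 3.44)

    # parse_number() drops the grouping mark before converting to numeric
    wt_1000s_new <- parse_number(wt) / 1000

    # Compare with a tolerance rather than ==, since these are doubles
    round_trip_ok <- isTRUE(all.equal(wt_1000s_new, wt_1000s))
    round_trip_ok
    ```
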