4.4 Dropping unneeded variables

SSCC - Social Science Computing Cooperative

These exercises use the PSID.csv data set that was imported in the prior section.

Import the PSID.csv data set.

```
library(tidyverse)

psid_path <- file.path("..", "datasets", "PSID.csv")
psid_in <- read_csv(psid_path, col_types = cols())
```

```
Warning: Missing column names filled in: 'X1' [1]
```

```
psid_in <-
  rename(
    psid_in,
    obs_num = X1,
    intvw_num = intnum,
    person_id = persnum,
    marital_status = married
    )

psid <- psid_in
glimpse(psid)
```

```
Observations: 4,856
Variables: 9
$ obs_num        <dbl> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, ...
$ intvw_num      <dbl> 4, 4, 4, 4, 5, 6, 6, 7, 7, 7, 10, 10, 10, 11, 1...
$ person_id      <dbl> 4, 6, 7, 173, 2, 4, 172, 4, 170, 171, 3, 171, 1...
$ age            <dbl> 39, 35, 33, 39, 47, 44, 38, 38, 39, 37, 48, 47,...
$ educatn        <dbl> 12, 12, 12, 10, 9, 12, 16, 9, 12, 11, 13, 12, 1...
$ earnings       <dbl> 77250, 12000, 8000, 15000, 6500, 6500, 7000, 50...
$ hours          <dbl> 2940, 2040, 693, 1904, 1683, 2024, 1144, 2080, ...
$ kids           <dbl> 2, 2, 1, 2, 5, 2, 3, 4, 3, 5, 98, 3, 0, 0, 2, 0...
$ marital_status <chr> "married", "divorced", "married", "married", "m...
```

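Drop the first variable in the data frame. This is the unnamed column that was filled in as X1 when the file was read; it is called obs_num here because it was renamed after loading. If you did not rename it, drop X1 instead.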
```
psid <- select(psid, -obs_num)
```
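The same `select()` call can drop a variable by position or drop several variables at once. The following is a minimal sketch, assuming a small made-up tibble named `toy` (hypothetical, not part of the PSID data) so that it runs without the CSV file.

```r
library(tidyverse)

# Hypothetical toy data standing in for psid; the values are made up.
toy <- tibble(
  obs_num = 1:3,
  age = c(39, 35, 33),
  hours = c(2940, 2040, 693)
)

# Drop one variable by name, as in the exercise.
toy_by_name <- select(toy, -obs_num)

# Drop the same variable by position (it is the first column).
toy_by_pos <- select(toy, -1)

# Drop several variables in one call.
toy_age_only <- select(toy, -c(obs_num, hours))

names(toy_by_name)
names(toy_age_only)
```

Dropping by name is usually the safer choice, since a positional drop silently removes a different variable if the column order changes.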
Make the age variable the first variable in the data frame.
```
psid <- select(psid, age, everything())
```
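In `select(psid, age, everything())`, `everything()` fills in all the variables not already named, in their existing order, so no variable is dropped. As a hedged alternative, dplyr 1.0.0 and later also provides `relocate()`, which moves the named variables to the front by default. The sketch below compares the two on a made-up tibble named `toy` (hypothetical, not the PSID data) and assumes a dplyr version recent enough to include `relocate()`.

```r
library(tidyverse)

# Hypothetical toy data; the values are made up.
toy <- tibble(
  obs_num = 1:3,
  hours = c(2940, 2040, 693),
  age = c(39, 35, 33)
)

# select() with everything(): age first, then all remaining columns.
toy_select <- select(toy, age, everything())

# relocate(): moves age to the front and keeps the other columns in order.
# Assumes dplyr >= 1.0.0.
toy_relocate <- relocate(toy, age)

names(toy_select)
identical(names(toy_select), names(toy_relocate))
```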