
4.4 Dropping unneeded variables
These exercises use the PSID.csv data set that was imported in the prior section.
Import the PSID.csv data set.

from pathlib import Path
import pandas as pd

psid_path = Path('..') / 'datasets' / 'PSID.csv'
psid_in = pd.read_csv(psid_path)
psid_in = (
    psid_in
    .rename(
        columns={
            'Unnamed: 0': 'obs_num',
            'intnum': 'intvw_num',
            'persnum': 'person_id',
            'married': 'marital_status'}))
psid = psid_in.copy(deep=True)

print(psid.dtypes)
obs_num             int64
intvw_num           int64
person_id           int64
age                 int64
educatn           float64
earnings            int64
hours               int64
kids                int64
marital_status     object
dtype: object
Drop the first variable in the data frame. You may have renamed it after it was loaded.
psid = psid.drop(columns='obs_num')
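If the first column was never renamed, or its name is not known in advance, it can also be dropped by position rather than by name. The following is a minimal sketch of both approaches. It uses a small made-up data frame called demo as a stand-in for the PSID data, so the values shown are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical stand-in for the PSID data; the values are made up.
demo = pd.DataFrame({
    'obs_num': [1, 2, 3],
    'age': [31, 45, 28],
    'hours': [2000, 1500, 0]})

# Drop by name. This raises a KeyError if the column does not exist.
by_name = demo.drop(columns='obs_num')

# Drop by position: look up the label of the first column, then drop it.
by_position = demo.drop(columns=demo.columns[0])

print(list(by_position.columns))
```

Both calls return a new data frame and leave demo unchanged, which is why the exercise assigns the result back to psid.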
Make the age variable the first variable in the data frame.
psid = psid.loc[:, [
    'age', 'intvw_num', 'person_id', 'educatn',
    'earnings', 'hours', 'kids', 'marital_status']]
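Typing out every column name works for a data frame this small, but it becomes tedious with many columns. One alternative, sketched here on a made-up data frame called demo (an assumption for illustration, not the PSID data), is to build the column order from the existing column index.

```python
import pandas as pd

# Hypothetical stand-in for the PSID data; the values are made up.
demo = pd.DataFrame({
    'intvw_num': [4, 4, 5],
    'person_id': [1, 2, 1],
    'age': [31, 45, 28]})

# Put 'age' first, then keep every other column in its existing order.
first = ['age']
rest = [col for col in demo.columns if col not in first]
demo = demo.loc[:, first + rest]

print(list(demo.columns))
```

This approach only reorders the columns; the values in each column stay the same.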