SSCC - Social Science Computing Cooperative Supporting Statistical Analysis for Research

2.2 Reading csv files and other delimited data

  1. Import the “mtcars.csv” data set.

    Set RStudio to Python mode by running these two lines in the R console:

    library(reticulate)
    repl_python()

    Then, at the Python prompt:

    from pathlib import Path
    import pandas as pd
    mtcars_path = Path('..') / 'datasets' / 'mtcars.csv'
    mtcars = pd.read_csv(mtcars_path)
  2. What is the type of each variable of the mtcars data set?

    print(mtcars.head())
              Unnamed: 0   mpg  cyl   disp   hp  ...   qsec  vs  am  gear  carb
    0          Mazda RX4  21.0    6  160.0  110  ...  16.46   0   1     4     4
    1      Mazda RX4 Wag  21.0    6  160.0  110  ...  17.02   0   1     4     4
    2         Datsun 710  22.8    4  108.0   93  ...  18.61   1   1     4     1
    3     Hornet 4 Drive  21.4    6  258.0  110  ...  19.44   1   0     3     1
    4  Hornet Sportabout  18.7    8  360.0  175  ...  17.02   0   0     3     2
    
    [5 rows x 12 columns]
  3. Import the “cane.csv” data set.

    cane_path = Path('..') / 'datasets' / 'cane.csv'
    cane = pd.read_csv(cane_path)
  4. What is the type of each variable of the cane data set?

    print(cane.head())
       Unnamed: 0    n   r   x  var block
    0           1   87  76  19    1     A
    1           2  119   8  14    2     A
    2           3   94  74   9    3     A
    3           4   95  11  12    4     A
    4           5  134   0  12    5     A
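
The `head()` output above only lets you infer the types by eye. A more direct check is the data frame's `dtypes` attribute. The sketch below is a minimal illustration, not part of the original solution: it rebuilds the five cane rows printed above as in-memory CSV text (the names `cane_text`, `cane_sample`, and `cane_indexed` are hypothetical) so that it runs without the datasets folder. It also assumes the unnamed first column is only a row label, in which case `index_col=0` keeps it out of the variables.

```python
import io
import pandas as pd

# The five cane rows printed above, rebuilt as CSV text so the sketch
# runs without the datasets folder (the real file has more rows).
cane_text = (
    ",n,r,x,var,block\n"
    "1,87,76,19,1,A\n"
    "2,119,8,14,2,A\n"
    "3,94,74,9,3,A\n"
    "4,95,11,12,4,A\n"
    "5,134,0,12,5,A\n"
)

# Default read: the unnamed first column comes in as "Unnamed: 0".
cane_sample = pd.read_csv(io.StringIO(cane_text))
print(cane_sample.dtypes)  # one dtype per column

# Assumption: the first column is only a row label, so use it as the index.
cane_indexed = pd.read_csv(io.StringIO(cane_text), index_col=0)
print(cane_indexed.dtypes)
```

The same two calls should work on the full files, for example `mtcars.dtypes` or `pd.read_csv(cane_path, index_col=0)`. The numeric columns here are read as integers, while `block` is read as text; the exact name pandas prints for the text dtype depends on the pandas version.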