SSCC - Social Science Computing Cooperative Supporting Statistical Analysis for Research

Chapter Roadmap

When working with dataframes, there are two major dialects of R: “base” R and the “tidyverse”. Both build on the data type and vector concepts already discussed.

The main differences between the dialects are in the vocabulary (function names), grammar (function arguments), and syntax (function order) when working with dataframes. The two dialects can accomplish the same tasks, just as two languages can express the same concepts, but they have different ways of doing so.
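As a rough illustration of these differences, the sketch below performs one task in both dialects: keeping the rows with a score above 80 and the columns `name` and `score`. The `students` dataframe and its values are made up for this example, and the tidyverse version assumes the dplyr package is installed.

```r
# Hypothetical example data, invented for illustration only.
students <- data.frame(
  name  = c("Ana", "Ben", "Cal", "Dee"),
  score = c(91, 78, 85, 96),
  year  = c(1, 2, 1, 2),
  stringsAsFactors = FALSE
)

# Base R: a single bracket expression, with the row condition first
# and the column names second.
base_result <- students[students$score > 80, c("name", "score")]

# Tidyverse (dplyr): named verbs chained left to right with the pipe.
library(dplyr)
tidy_result <- students %>%
  filter(score > 80) %>%
  select(name, score)
```

Both results contain the same rows and columns. What differs is the function names, how the arguments are written, and the order in which the steps appear.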

You only need to be proficient in writing one of these dialects, but you should be able to at least understand both.

The next two chapters, Reading Text Data and First Steps with Dataframes, will get you started with reading in a dataset, writing a script, and making basic changes to the dataset.

After that, the R dialects diverge, and you will need to choose which one to learn for the dataframe operations of subsetting, merging, aggregating, and reshaping.

If you are primarily interested in how to manage data using base R functions, read chapters 12-15, supplementing with chapters 16-19.

If you are primarily interested in how to manage data using tidyverse functions, read chapters 16-19, supplementing with chapters 12-15.

The final chapter covers Formulas. You can read it regardless of which dialect you specialized in.