SSCC - Social Science Computing Cooperative Supporting Statistical Analysis for Research

1 Introduction

This article will teach you how to use data visualizations to understand your data and communicate your results. We will go through the most basic features of ggplot2 (commonly abbreviated as just “ggplot”).

This article is organized by the number and kinds of variables we would like to plot. After discussing the basic building blocks of ggplot, we will plot univariate, bivariate, and multivariate data, including values obtained from fitted models. Then we will discuss some of ggplot’s options for customizing the appearance of plots, and we will finish with a brief look at saving plots for use in other applications. You are strongly encouraged to follow along by running the code on your own computer.

Plotting is especially useful in the early stages of data analysis, as you seek to understand your data. Plots can also be used to communicate descriptive statistics and model results to readers. Producing publication-quality plots takes more work and care (some of which is covered in Making Plots Pretty), though it is not uncommon to see published articles that use ggplot’s default settings.

A reference you will come back to again and again while plotting with ggplot is the Data Visualization Cheatsheet, a quick guide to ggplot syntax and options. It is available on RStudio’s website alongside other cheatsheets. The R Graph Gallery is another great resource: it shows what is possible with ggplot and provides example code. You will undoubtedly also rely on the many other websites and Stack Exchange discussions that turn up when you Google something like “how to change axis font size ggplot.”
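
As a taste of what such a search typically turns up, here is a minimal sketch of one common way to change axis font sizes. It assumes the ggplot2 package is installed and uses the built-in mtcars dataset purely for illustration; the font sizes are arbitrary, and the object name p is our own choice rather than anything the article prescribes.

```r
library(ggplot2)

# mtcars ships with R; wt (weight) and mpg (miles per gallon) are two of its columns
p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  theme(
    axis.title = element_text(size = 14),  # axis titles
    axis.text  = element_text(size = 12)   # tick labels
  )

# Draw the plot
print(p)
```

The theme() call is only one of several ways to adjust text size, so treat this as a starting point rather than the recommended approach.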