10 Using Packages

R's functionality is organized into modules called packages, a few of which were included when you first installed R.

Packages can contain all sorts of objects, but generally they are sources of new functions, datasets, example scripts, and documentation.

Anyone can develop and submit a package to CRAN, the central repository. CRAN packages must meet certain benchmarks to be accepted and distributed.

CRAN packages vary considerably in style and the quality of their documentation, even after meeting the CRAN benchmarks.

There are two main steps to using a package:

  • installing the package on your computer (with install.packages())
  • loading the package so that R can find its objects (functions, data) (with library())

While you only need to install a package once, you need to tell R to use that package any time you start a new R session.

In the SSCC, you will find that many packages are already installed for you. You can also install or update packages yourself; these will automatically be installed in a folder on your U:/ drive.
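
To see which folders R treats as package libraries on your own machine, you can ask R directly. This is a minimal sketch using the base function .libPaths(); the folders it prints will differ from computer to computer, and the variable names are only for illustration.

```r
# List the folders (libraries) where R looks for installed packages.
lib_folders <- .libPaths()
print(lib_folders)

# When no 'lib' argument is given, install.packages() installs into
# the first folder in this list.
default_lib <- lib_folders[1]
print(default_lib)
```

Checking this first folder is a quick way to confirm where a newly installed package will end up.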

10.1 What packages are already installed?

If you are working in RStudio you can see the installed packages in the Packages pane, tabbed in the lower right of RStudio with Files, Plots, and Help.

You can scroll through the list, or use the search box in the upper right of the pane. The search box works much like the one in the Help pane.

You can click on a package name to see a help page listing all of the functions and other objects in that package.
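
If you prefer the console to the Packages pane, roughly the same information is available from code. The sketch below uses parallel as the example because it ships with R; the variable names are only for illustration.

```r
# Names of every installed package, across all library folders.
installed <- rownames(installed.packages())
head(installed)

# parallel ships with R, so it should appear in the list.
"parallel" %in% installed

# Functions and other objects that the parallel package exports.
parallel_exports <- sort(getNamespaceExports("parallel"))
head(parallel_exports)
"makeCluster" %in% parallel_exports
```

Running help(package = "parallel") should bring up the same kind of index page you get by clicking the package name.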

For example, suppose we are looking for documentation on a function to specify the number of cores to use when running R in parallel. If we already know it is in the parallel package, we could

  • Open the Packages pane
  • Search for or scroll down to parallel
  • Scroll through the list of functions to find makeCluster()
  • Click on the function name to read its help page

Alternatively, if we already know the function name, we can search for a help page from the console with the pattern ?package::function. We can omit the package name if the package is already loaded. By default, parallel is not, so we can either load it first (library(parallel)) or find the page by typing this into the console: ?parallel::makeCluster
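
The ? shorthand has a function form, help(), which can be handier inside scripts. Here is a small sketch, assuming only base R; the variable name page is made up for illustration.

```r
# Look up a help page by naming the package explicitly. This works
# whether or not parallel has been loaded with library().
page <- help("makeCluster", package = "parallel")

# The result holds the location of each matching help page, so a
# non-zero length means a page was found.
print(length(page) > 0)

# Typing the object's name at the console (here: page) displays the page.
```

If you do not know which package a function belongs to, ??makeCluster (short for help.search("makeCluster")) searches the documentation of all installed packages.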

Try it!


10.2 Installing Additional Packages

You can install a package with the Install icon on the Packages toolbar. By default this installs packages from CRAN. If you have a package from another source in the form of a downloaded archive file, you can also install from that.

You can also install packages by using code. The following code installs the faraway package from CRAN:

install.packages("faraway")
Installing package into 'U:/R/4.1.2'
(as 'lib' is unspecified)
package 'faraway' successfully unpacked and MD5 sums checked

The downloaded binary packages are in
    C:\Users\jstruck2\AppData\Local\Temp\RtmpW2gSY6\downloaded_packages
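
Because a package only needs to be installed once, scripts often check before installing. The following is one possible sketch; install_if_missing() is a hypothetical helper written for this example, not a function from R or from any package.

```r
# Hypothetical helper (not part of R or any package): install only the
# packages that are not already in one of your libraries.
install_if_missing <- function(pkgs) {
  missing_pkgs <- setdiff(pkgs, rownames(installed.packages()))
  if (length(missing_pkgs) > 0) {
    # Downloads from the configured repository, typically CRAN.
    install.packages(missing_pkgs)
  }
  invisible(missing_pkgs)  # the packages that had to be installed
}

# Both of these ship with R, so nothing is downloaded here.
install_if_missing(c("stats", "parallel"))
```

Calling install_if_missing("faraway") would then install faraway only when it is absent, and do nothing on later runs.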

10.3 Using a Package

To actually use the material in the package you must load it using the library() function. hsb is a dataset in the faraway package. Notice the difference in the output of summary(hsb) before and after loading faraway.

summary(hsb)
Error in summary(hsb): object 'hsb' not found
library(faraway)
summary(hsb)
       id            gender              race         ses         schtyp   
 Min.   :  1.00   female:109   african-amer: 20   high  :58   private: 32  
 1st Qu.: 50.75   male  : 91   asian       : 11   low   :47   public :168  
 Median :100.50                hispanic    : 24   middle:95                
 Mean   :100.50                white       :145                            
 3rd Qu.:150.25                                                            
 Max.   :200.00                                                            
       prog          read           write            math          science     
 academic:105   Min.   :28.00   Min.   :31.00   Min.   :33.00   Min.   :26.00  
 general : 45   1st Qu.:44.00   1st Qu.:45.75   1st Qu.:45.00   1st Qu.:44.00  
 vocation: 50   Median :50.00   Median :54.00   Median :52.00   Median :53.00  
                Mean   :52.23   Mean   :52.77   Mean   :52.65   Mean   :51.85  
                3rd Qu.:60.00   3rd Qu.:60.00   3rd Qu.:59.00   3rd Qu.:58.00  
                Max.   :76.00   Max.   :67.00   Max.   :75.00   Max.   :74.00  
     socst      
 Min.   :26.00  
 1st Qu.:46.00  
 Median :52.00  
 Mean   :52.41  
 3rd Qu.:61.00  
 Max.   :71.00  
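
The same before-and-after difference can be checked in code, and the package::name form offers a way to use a single object without loading the whole package. This sketch uses parallel, which ships with R, so it runs without installing anything; it assumes a fresh session in which parallel has not yet been loaded.

```r
# In a fresh session parallel is installed but not loaded, so R cannot
# find its functions by their bare names.
found_before <- exists("detectCores")

# The package::name form reaches into a package without loading it.
n_cores <- parallel::detectCores()

# After library(), the bare name works too.
library(parallel)
found_after <- exists("detectCores")

print(c(before = found_before, after = found_after))
```

By the same logic, summary(faraway::hsb) should work without calling library(faraway) first, provided faraway is installed.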

10.4 Undoing things

You will rarely, if ever, need to unload or uninstall packages, but we can do these operations with the detach() and remove.packages() functions.

detach() is the opposite of library(). It disassociates the package from your current session. After detaching a package, you can no longer refer to its functions and datasets directly, as we did with hsb, until you load it again.

detach(package:faraway, unload = TRUE)
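
One way to confirm that a package has been detached is to look at the search path, which lists everything currently attached to the session. This is a small sketch using parallel in place of faraway so that it runs without any installation; the variable names are only for illustration.

```r
library(parallel)
attached_before <- "package:parallel" %in% search()

# Detach by name; unload = TRUE also asks R to unload the namespace.
detach("package:parallel", unload = TRUE)
attached_after <- "package:parallel" %in% search()

print(c(before = attached_before, after = attached_after))
```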

remove.packages() reverses install.packages(), and it removes a package from your computer. To use it again, you will have to reinstall it.

remove.packages("faraway")
Removing package from 'U:/R/4.1.2'
(as 'lib' is unspecified)
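
Since removal cannot be undone without reinstalling, it can be worth checking that a package is actually installed before removing it. A possible sketch follows; remove_if_installed() is a hypothetical helper written for this example, not a function from R or from any package.

```r
# Hypothetical helper (not part of R): remove a package only if it is
# currently installed, and report whether anything was removed.
remove_if_installed <- function(pkg) {
  pkg_dir <- find.package(pkg, quiet = TRUE)
  if (length(pkg_dir) == 0) {
    message("Package '", pkg, "' is not installed; nothing to remove.")
    return(invisible(FALSE))
  }
  # Remove it from the library folder where it was found.
  remove.packages(pkg, lib = dirname(pkg_dir[1]))
  invisible(TRUE)
}

# A made-up package name, so this call removes nothing.
remove_if_installed("notARealPackage")
```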

10.5 Exercises

  1. Load the dplyr package, which has many functions useful for data wrangling. One of these is count(), which returns a data frame of frequencies of grouping variable combinations. After loading dplyr, run this code.
count(mtcars, cyl, am)
  2. Install the package stargazer. This package contains a function of the same name, stargazer(), which can write tables of model results and summary statistics to a file that Word can open. After installing and loading stargazer, run this code, and take a look at the file it produces.
mod <- lm(mpg ~ am * wt, data = mtcars)

stargazer(mod, type = "html", out = "mod.doc")