3 Data Type

3.1 Warm-Up

Add together values of the same and different types: character + character, character + numeric, etc. For example, try "a" + 1.

  • Character: quoted text, such as "a" or "rstudio"
  • Numeric: numbers with or without decimals, such as 10 or 2.72
  • Logical: TRUE or FALSE

Which combinations return errors? Which combinations return expected results? Which combinations return unexpected results?

3.2 Outcomes

Objective: To find and modify an object’s type.

Why it matters: Type is a fundamental property of data objects in R, and understanding how R handles various data types will help you identify sources of errors and perform useful operations, such as summarizing indicator variables.

Learning outcomes:

Fundamental Skills

  • Name the three basic data types in R.

  • Change an object’s type through coercion.

Extended Skills

  • Understand how information is preserved or lost when moving up or down the hierarchy of data types.

Key functions and operators:

typeof()
as.numeric()
as.character()
as.logical()
[ ]

3.3 Basic Types

Data comes in different types, and the type of data is simply the kind of data it is. Is it a number, text, or something else?

First make four objects:

w <- 1:5
x <- 1
y <- "sscc"
z <- TRUE

The three basic types we will use in R are numeric, character, and logical. We can find the type of an object with the typeof() function:

typeof(w)
[1] "integer"
typeof(x)
[1] "double"
typeof(y)
[1] "character"
typeof(z)
[1] "logical"

Notice w is an integer while x is a double. For our purposes, both integers and doubles are “numeric.” We do not usually need to distinguish between these two variations on numeric data.
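
As a quick check of that claim, base R's is.numeric() returns TRUE for both integers and doubles. The sketch below uses two illustrative objects (int_vals and dbl_val are names chosen here, not objects from this chapter) so that it can be run on its own:

```r
# Illustrative objects; the names are assumptions made for this sketch
int_vals <- 1:5   # integer, like w
dbl_val  <- 1     # double, like x

typeof(int_vals)  # "integer"
typeof(dbl_val)   # "double"

# is.numeric() treats both variations as numeric
int_is_num <- is.numeric(int_vals)  # TRUE
dbl_is_num <- is.numeric(dbl_val)   # TRUE
chr_is_num <- is.numeric("1")       # FALSE: a quoted number is character
```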

3.4 Changing Types

In R, the type of a data object can be changed at any point. Changing an object’s type is called “coercion” and can happen explicitly or implicitly.

3.4.1 Explicit Coercion

Explicit coercion occurs when we ask R to change the type of an object with one of the as.*() functions, such as as.character(), as.numeric(), or as.logical(). Coerce x to character:

as.character(x)
[1] "1"

Now, what is the type of x?

typeof(x)
[1] "double"

The type did not change because we did not assign the result back to x. Try again but with assignment:

x <- as.character(x)
x
[1] "1"
typeof(x)
[1] "character"

Coercing numeric to character results in quoted numbers. Try coercing character to numeric:

as.numeric(y)
Warning: NAs introduced by coercion
[1] NA

Some coercions result in missing data (NA). Here, "sscc" cannot be parsed as a number, so we get NA.

If we have quoted numeric values, R can parse the number:

as.numeric("1")
[1] 1
as.numeric("3.14")
[1] 3.14

However, if we have commas or currency symbols, we will get NA:

as.numeric("1,000")
Warning: NAs introduced by coercion
[1] NA
as.numeric("$3.14")
Warning: NAs introduced by coercion
[1] NA

In those cases, we need to first clean up our character data. See the chapter on character vectors.
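
As a rough preview of what that cleanup might look like, the sketch below strips commas and dollar signs with base R's gsub() before coercing. The raw values are made up for illustration, and the chapter on character vectors may well take a different approach:

```r
# Hypothetical raw values, made up for this sketch
raw <- c("1,000", "$3.14", "2,500.75")

# Remove every comma and dollar sign; inside [ ] the $ is a literal character
cleaned <- gsub("[$,]", "", raw)
cleaned   # "1000" "3.14" "2500.75"

# The cleaned strings now coerce without warnings
nums <- as.numeric(cleaned)
nums      # the values 1000, 3.14, and 2500.75
```

This pattern only handles the two symbols shown. Other currency symbols, percent signs, or stray whitespace would need their own handling.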

Coercing data into and out of the logical type yields interesting results:

as.numeric(TRUE)
[1] 1
as.numeric(FALSE)
[1] 0
as.logical(5)
[1] TRUE
as.logical(0)
[1] FALSE
as.logical(-5.75)
[1] TRUE

We will discuss logical values more in the chapter on logical vectors. Coercing numeric to logical and back to numeric will allow us to quickly quantify missing data and summarize variable distributions.
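
A minimal sketch of that idea, using a made-up vector named scores: is.na() and comparisons return logical values, and sum() and mean() treat TRUE as 1 and FALSE as 0, so they yield counts and proportions.

```r
# Hypothetical data with two missing values
scores <- c(4, NA, 7, 10, NA, 2)

# is.na() returns a logical vector: TRUE where a value is missing
is_missing <- is.na(scores)
as.numeric(is_missing)            # 0 1 0 0 1 0

n_missing    <- sum(is_missing)   # 2 missing values
prop_missing <- mean(is_missing)  # 2 out of 6, about 0.33

# Proportion of the non-missing values that are greater than 5
prop_above_5 <- mean(scores > 5, na.rm = TRUE)  # 2 out of 4, so 0.5
```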

3.4.2 Implicit Coercion

We can also change the type of an object indirectly, which is called implicit coercion. w has the numbers 1 to 5:

w
[1] 1 2 3 4 5

We can reference the individual values with square brackets. Get the first number with w[1], which coincidentally is 1:

w[1]
[1] 1

Add the first and second elements together to get 3:

w[1] + w[2]
[1] 3

Now assign some character data to the fifth element:

w[5] <- "a"

R produces no warning or message at this stage. Try adding together the first two elements again:

w[1] + w[2]
Error in w[1] + w[2]: non-numeric argument to binary operator

The type of w changed!

typeof(w)
[1] "character"

Just like in our warm-up, we get an error when trying to add together character values.
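
To see what happened to the individual elements, and one way to do the arithmetic anyway, the sketch below recreates the situation in a fresh object (v is a name chosen for this example) and explicitly coerces the two elements back to numeric before adding them:

```r
# Recreate the situation from this section in a fresh object
v <- 1:5
v[5] <- "a"   # implicit coercion: every element becomes character
v             # "1" "2" "3" "4" "a"

# Explicitly coerce the elements we want to add
total <- as.numeric(v[1]) + as.numeric(v[2])
total         # 3
```

Coercing the whole vector with as.numeric(v) would also work for the first four elements, but the fifth would become NA with a warning.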

w is a vector, and vectors can only contain a single type. When we tried to put both numeric and character data into a vector together, could we have predicted that we would have character data as a result?

Yes! The help page for the c() function (see ?c) tells us about the hierarchy of data types. A simplified version of that hierarchy is “logical < numeric < character.” Combining different types promotes every value to the highest type present. Logical and numeric will result in numeric. Logical and character will result in character. All three will result in character:

typeof(c(TRUE, 1, "a"))
[1] "character"
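
The pairwise cases described above can be checked the same way. This is a small sketch with illustrative object names, not part of the original examples:

```r
# Each pair is promoted to the higher of its two types
lgl_num <- c(TRUE, 1)     # logical + numeric
lgl_chr <- c(TRUE, "a")   # logical + character
num_chr <- c(1, "a")      # numeric + character

typeof(lgl_num)   # "double"
typeof(lgl_chr)   # "character"
typeof(num_chr)   # "character"

# The promoted values themselves
lgl_num           # 1 1
lgl_chr           # "TRUE" "a"
num_chr           # "1" "a"
```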

3.5 Exercises

3.5.1 Fundamental

  1. What is the type of each of these objects?

    a <- mtcars[1, 1]
    b <- letters[5]
    d <- (WorldPhones[4, 3] > 50000)
    e <- names(airquality)[3]
    f <- max(airquality$Day) == 31
    g <- mean(airquality$Temp)
  2. Use the six objects created in exercise 1. Coerce each one to the other two types with as.logical(), as.numeric(), and as.character().

3.5.2 Extended

  1. Review the hierarchy of data types described in the Details section of ?c. Then, revisit exercise 2 above. When is information preserved or lost: when moving up or down the hierarchy?

    • Can you find any values that “survive” coercions up and down the hierarchy of logical-numeric-character?