w <- 1:5
x <- 1
y <- "sscc"
z <- TRUE
3 Data Type
3.1 Warm-Up
Add together values of the same and different types: character + character, character + numeric, etc. For example, try "a" + 1.
- Character: quoted text, such as "a" or "rstudio"
- Numeric: numbers with or without decimals, such as 10 or 2.72
- Logical: TRUE or FALSE
Which combinations return errors? Which combinations return expected results? Which combinations return unexpected results?
3.2 Outcomes
Objective: To find and modify an object’s type.
Why it matters: Type is a fundamental property of data objects in R, and understanding how R handles various data types will help you identify sources of errors and perform useful operations, such as summarizing indicator variables.
Key functions and operators:
typeof()
as.numeric()
as.character()
as.logical()
[ ]
3.3 Basic Types
Data comes in different types, and the type of data is simply the kind of data it is. Is it a number, text, or something else?
First make four objects: w holds the integers 1 to 5, x holds the number 1, y holds the text "sscc", and z holds TRUE.
The three basic types we will use in R are numeric, character, and logical. We can find the type of an object with the typeof() function:
typeof(w)
[1] "integer"
typeof(x)
[1] "double"
typeof(y)
[1] "character"
typeof(z)
[1] "logical"
Notice w is an integer while x is a double. For our purposes, both integers and doubles are “numeric.” We do not usually need to distinguish between these two variations on numeric data.
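One way to confirm that both storage types count as numeric is is.numeric(), a base R function not otherwise used in this chapter. The sketch below uses new, purely illustrative object names (int_vals and dbl_val) so that w and x stay untouched.

```r
# Hypothetical example objects; the names are illustrative only
int_vals <- 1:5   # the colon operator creates integers
dbl_val <- 1      # a bare number is stored as a double

typeof(int_vals)      # "integer"
typeof(dbl_val)       # "double"

# is.numeric() is TRUE for both storage types
is.numeric(int_vals)  # TRUE
is.numeric(dbl_val)   # TRUE

# An L suffix asks R for an integer instead of a double
typeof(1L)            # "integer"
```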
3.4 Changing Types
In R, the type of a data object can be changed at any point. Changing an object’s type is called “coercion” and can happen explicitly or implicitly.
3.4.1 Explicit Coercion
Explicit coercion occurs when we ask R to change the type of an object with one of the as.type() functions, such as as.numeric() or as.character(). Coerce x to character:
as.character(x)
[1] "1"
Now, what is the type of x?
typeof(x)
[1] "double"
The type did not change because we did not assign the result back to x. Try again but with assignment:
x <- as.character(x)
x
[1] "1"
typeof(x)
[1] "character"
Coercing numeric to character results in quoted numbers. Try coercing character to numeric:
as.numeric(y)
Warning: NAs introduced by coercion
[1] NA
Some coercions result in missing data (NA). Here, "sscc" is not associated with a number, so we get NA.
If we have quoted numeric values, R can parse the number:
as.numeric("1")
[1] 1
as.numeric("3.14")
[1] 3.14
However, if we have commas or currency symbols, we will get NA:
as.numeric("1,000")
Warning: NAs introduced by coercion
[1] NA
as.numeric("$3.14")
Warning: NAs introduced by coercion
[1] NA
In those cases, we need to first clean up our character data. See the chapter on character vectors.
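As a minimal sketch of such a cleanup, base R's gsub() can strip the offending symbols before coercion. The object names and the pattern are assumptions chosen for illustration, and the approach assumes commas are thousands separators rather than decimal marks.

```r
# Hypothetical price strings; not taken from the text above
prices <- c("1,000", "$3.14", "42")

# Remove every comma and dollar sign; inside [ ] the $ is a literal character
cleaned <- gsub("[$,]", "", prices)
cleaned
# [1] "1000" "3.14" "42"

# With the symbols gone, coercion succeeds without introducing NA
amounts <- as.numeric(cleaned)
typeof(amounts)  # "double"
anyNA(amounts)   # FALSE
```

The cleaned strings are stored in a new object rather than overwriting prices, so the original text remains available if the pattern turns out to remove too much.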
Coercing data into and out of the logical type yields interesting results:
as.numeric(TRUE)
[1] 1
as.numeric(FALSE)
[1] 0
as.logical(5)
[1] TRUE
as.logical(0)
[1] FALSE
as.logical(-5.75)
[1] TRUE
We will discuss logical values more in the chapter on logical vectors. Coercing numeric to logical and back to numeric will allow us to quickly quantify missing data and summarize variable distributions.
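To make that last point concrete, here is a small sketch with made-up data. Functions such as sum() and mean() treat TRUE as 1 and FALSE as 0, so a logical vector can be counted or averaged directly. The vector scores and the cutoff of 80 are hypothetical; is.na() is the base R test for missing values.

```r
# Hypothetical data with two missing values
scores <- c(90, NA, 75, 82, NA, 68)

# is.na() returns a logical vector; summing it counts the TRUE values
n_missing <- sum(is.na(scores))
n_missing      # 2

# The mean of a logical vector is the proportion of TRUE values
share_missing <- mean(is.na(scores))
share_missing  # 0.3333333

# An indicator for scores above 80, made numeric explicitly
above_80 <- as.numeric(scores > 80)
above_80       # 1 NA 0 1 NA 0
```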
3.4.2 Implicit Coercion
We can also change the type of an object indirectly, which is called implicit coercion. w has the numbers 1 to 5:
w
[1] 1 2 3 4 5
We can reference the individual values with square brackets. Get the first number with w[1], which coincidentally is 1:
w[1]
[1] 1
Add the first and second elements together to get 3:
w[1] + w[2]
[1] 3
Now assign some character data to the fifth element:
w[5] <- "a"
R produces no warning or message at this stage. Try adding together the first two elements again:
w[1] + w[2]
Error in w[1] + w[2]: non-numeric argument to binary operator
The type of w changed!
typeof(w)
[1] "character"
Just like in our warm-up, we get an error when trying to add together character values.
w is a vector, and vectors can only contain a single type. When we tried to put both numeric and character data into a vector together, could we have predicted that we would have character data as a result?
Yes! The help page for the c() function (see ?c) tells us about the hierarchy of data types. A simplified version of that hierarchy is “logical < numeric < character.” Combining different types promotes every value to the highest type present. Logical and numeric will result in numeric. Logical and character will result in character. All three will result in character:
typeof(c(TRUE, 1, "a"))
[1] "character"
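The other pairings in the hierarchy can be checked the same way. This sketch also repeats the earlier assignment on a fresh vector (v, a hypothetical name) to show what happens to the existing values when a single character element arrives.

```r
# Logical + numeric is promoted to numeric (double)
typeof(c(TRUE, 1))    # "double"
c(TRUE, 1)            # 1 1

# Logical + character is promoted to character
typeof(c(TRUE, "a"))  # "character"
c(TRUE, "a")          # "TRUE" "a"

# Assigning one character value converts the whole vector
v <- 1:5              # hypothetical vector with the same starting values as w
v[5] <- "a"
v                     # "1" "2" "3" "4" "a"
typeof(v)             # "character"
```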
3.5 Exercises
3.5.1 Fundamental
What is the type of each of these objects?
a <- mtcars[1, 1]
b <- letters[5]
d <- (WorldPhones[4, 3] > 50000)
e <- names(airquality)[3]
f <- max(airquality$Day) == 31
g <- mean(airquality$Temp)
Use the six objects created in exercise 1. Coerce each one to the other two types with as.logical(), as.numeric(), and as.character().
3.5.2 Extended
Review the hierarchy of data types in the details in ?c. Then, revisit exercise 2 above. When is information preserved or lost: when moving up or down the hierarchy?
- Can you find any values that “survive” coercions up and down the hierarchy of logical-numeric-character?