6 Basic Data Types and Type Casting

What This Chapter Covers

This chapter introduces the atomic data types that every value in R is built from, and shows how to move values between those types on purpose with explicit casting. You will meet R’s six atomic types (logical, integer, double, complex, character, and raw), the functions that tell you what type a value has (typeof(), class(), is.logical(), is.numeric(), etc.), and the matching family of casting functions (as.logical(), as.integer(), as.numeric(), as.character()). You will also learn how R performs automatic type coercion when values of different types meet, and how to tell the three special values NA, NaN, and NULL apart. By the end of this chapter you will be able to look at any R value and say precisely what it is and what it can safely become.

Figure: An R value has exactly one atomic type: logical (TRUE / FALSE), integer (1L, 2L, 3L), double (1.5, 3.14, 1e6), complex (1+2i), character ('hello'), or raw (byte-level data).

6.1 The Six Atomic Types

Core Concept: Every R Value Starts as an Atomic Type

Every value in R is ultimately built from one of six atomic types. Higher-level structures such as matrices and data frames are built from atomic vectors, and all elements of a single atomic vector share one type.

Type	Example	Stores
`logical`	`TRUE`, `FALSE`, `NA`	Boolean truth values.
`integer`	`1L`, `42L`, `-7L`	Whole numbers, stored exactly.
`double`	`3.14`, `1.5`, `2e6`	Real numbers (floating-point). The default numeric type.
`complex`	`1+2i`	Complex numbers (real + imaginary).
`character`	`"hello"`, `'R'`	Text strings.
`raw`	`as.raw(255)`	Bytes. Rarely used outside low-level file or network work.

Expert Insight: “Numeric” Is an Umbrella Term

The word numeric in R is a bit confusing. It is an umbrella for both integer and double. is.numeric() returns TRUE for both. When R prints numbers without a decimal point it usually still stores them as double, not integer. To force an integer, add the suffix L, as in 42L.

6.2 Logical Values

The Smallest Type: Truth Values

Logical values are either TRUE or FALSE, plus a special missing value NA. They are the result of every comparison (x > 3, name == "Rani") and the glue that conditional code is written in.
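
A minimal sketch of where logical values come from and how they are used. The variable names (x, name, is_big, same_name, answers) are chosen only for illustration.

```r
# Comparisons always return logical values
x <- 5
name <- "Rani"

is_big <- x > 3               # TRUE
same_name <- name == "Rani"   # TRUE

typeof(is_big)                # "logical"

# A logical vector can also hold NA for an unknown truth value
answers <- c(TRUE, FALSE, NA)
typeof(answers)               # "logical"

# Logical values are what if() tests
if (is_big) {
  message("x is greater than 3")
}
```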

Write TRUE and FALSE in Full

R also accepts T and F as shortcuts. As noted in Chapter 4, those shortcuts are ordinary variables that can be reassigned, and silently breaking them is a classic production incident. Always write TRUE and FALSE in full.
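
The short sketch below illustrates the risk. It reassigns T on purpose and then removes the copy again; t_now is a name used only for this example.

```r
# T is an ordinary variable that merely starts out equal to TRUE
T <- FALSE          # legal, and R raises no warning
t_now <- T
t_now               # FALSE: any later code that used T would now misbehave

# TRUE is a reserved word, so the line below would be rejected by the parser:
# TRUE <- FALSE

rm(T)               # delete our copy so T falls back to base R's definition
isTRUE(T)           # TRUE
```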

6.3 Integer vs Double

The Default Is Double

Any plain number you type into R, such as 5 or 3.14, is stored as a double. To get an integer you must either append the suffix L or use as.integer().
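
A quick check of these defaults, using only base R functions:

```r
typeof(5)               # "double"
typeof(3.14)            # "double"
typeof(5L)              # "integer"
typeof(as.integer(5))   # "integer"

# Both kinds of number count as "numeric"
is.numeric(5)           # TRUE
is.numeric(5L)          # TRUE

# Printing does not reveal the difference
print(5)                # [1] 5
print(5L)               # [1] 5
```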

Why Two Numeric Types Exist

Doubles carry roughly 15 to 16 significant decimal digits and can represent very large and very small numbers, but they cannot store every integer exactly beyond 2^53 (about 9 × 10^15). Integers use 32 bits, so they are always exact within their range of roughly ±2.1 billion and take half the memory of doubles. For most analysis work, doubles are fine; integer types matter when interfacing with C, when memory is tight, or when a value is inherently a whole number, such as a count or an index.
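
These limits can be inspected directly. The sketch below uses base R only; too_big is an illustrative name, and the warning from the overflowing sum is suppressed so the script runs quietly.

```r
.Machine$integer.max          # 2147483647, the largest value an integer can hold

# Beyond 2^53 a double can no longer tell neighbouring integers apart
2^53 == 2^53 + 1              # TRUE

# Integer arithmetic that overflows yields NA (normally with a warning)
too_big <- suppressWarnings(.Machine$integer.max + 1L)
is.na(too_big)                # TRUE

# The same sum in double arithmetic is fine
.Machine$integer.max + 1      # 2147483648

# Memory: a million integers take about half the space of a million doubles
object.size(integer(1e6))
object.size(double(1e6))
```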

Common Mistake: Floating-Point Surprises

Doubles cannot represent every decimal number exactly, which sometimes produces surprising comparisons.
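
The classic illustration, together with one common remedy: compare with a tolerance using all.equal() rather than ==.

```r
0.1 + 0.2 == 0.3                     # FALSE
print(0.1 + 0.2, digits = 17)        # [1] 0.30000000000000004

# all.equal() compares within a small tolerance; wrap it in isTRUE()
isTRUE(all.equal(0.1 + 0.2, 0.3))    # TRUE

sqrt(2)^2 == 2                       # FALSE
isTRUE(all.equal(sqrt(2)^2, 2))      # TRUE
```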

6.4 Character Strings

Text Lives in Character Vectors

Character values are strings, delimited with single ' or double " quotes. There is no separate “char” vs “string” distinction in R; a single letter and a paragraph are both character vectors of length 1.
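
A small sketch of these points; the strings and variable names are made up for illustration.

```r
letter <- "a"
sentence <- "R treats this whole sentence as one string."

typeof(letter)       # "character"
typeof(sentence)     # "character"

# length() counts elements, nchar() counts characters
length(letter)       # 1
length(sentence)     # 1: one string, however long
nchar(letter)        # 1
nchar(sentence) > 1  # TRUE

# Single quotes are handy when the text itself contains double quotes
quoted <- 'She said "hello"'
identical(quoted, "She said \"hello\"")   # TRUE
```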

Best Practice: Prefer Double Quotes

R accepts both quote styles, but the Tidyverse style guide and most R code in the wild use double quotes for strings. Reserve single quotes for strings that themselves contain double-quote characters.

6.5 Complex and Raw: The Two You Will Rarely Meet

When They Matter

complex is used in signal processing, physics, and some statistical work; it appears occasionally in fft() and related functions. raw holds bytes and is used when reading binary files, interacting with C code, or working with cryptographic hashes.
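
For completeness, here is a brief look at both types using base R and the stats package that R attaches by default. The names z and bytes are illustrative.

```r
# Complex numbers
z <- complex(real = 1, imaginary = 2)   # the same value as 1+2i
typeof(z)                     # "complex"
Re(z)                         # 1
Im(z)                         # 2
sqrt(as.complex(-1))          # 0+1i
typeof(fft(c(1, 2, 3, 4)))    # "complex"

# Raw bytes
bytes <- charToRaw("R")
typeof(bytes)                 # "raw"
bytes                         # 52 (hexadecimal)
as.integer(bytes)             # 82
rawToChar(bytes)              # "R"
as.raw(255)                   # ff
```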

6.6 Checking a Value’s Type

The typeof(), class(), and is.X() Family

R gives you several ways to ask what a value is. Each answers a slightly different question.

Tool	Question It Answers
`typeof(x)`	What is the internal storage type? (`logical`, `integer`, `double`, `character`, …)
`class(x)`	What class (user-facing category) does `x` belong to? (`numeric`, `Date`, `data.frame`, …)
`is.logical(x)`, `is.integer(x)`, `is.double(x)`, `is.numeric(x)`, `is.character(x)`	Yes/no tests for a specific type.
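
The tools in the table can be tried on two similar-looking values. The names x and y are chosen for illustration.

```r
x <- 42L
typeof(x)            # "integer"
class(x)             # "integer"
is.integer(x)        # TRUE
is.double(x)         # FALSE
is.numeric(x)        # TRUE

y <- 42
typeof(y)            # "double"
class(y)             # "numeric"
is.integer(y)        # FALSE
is.double(y)         # TRUE
is.numeric(y)        # TRUE

is.character("42")   # TRUE: quotes make it text, not a number
is.logical(NA)       # TRUE: a plain NA is logical
```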

Expert Insight: typeof() vs class()

typeof() is about how R stores the value in memory. class() is about how R treats the value for dispatching methods. For simple atomic values the two often agree (a double is the main exception: its class is numeric), but for objects such as Date, factor, and data.frame they differ. When in doubt, use class() to reason about behaviour and typeof() to reason about storage.
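
The difference is easiest to see on the three object kinds just mentioned. The example values (d, f, df) are arbitrary.

```r
d <- as.Date("2024-01-15")
typeof(d)    # "double": a Date is stored as a number of days
class(d)     # "Date"

f <- factor(c("low", "high", "low"))
typeof(f)    # "integer": a factor is stored as integer codes
class(f)     # "factor"

df <- data.frame(id = 1:3)
typeof(df)   # "list": a data frame is stored as a list of columns
class(df)    # "data.frame"
```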

6.7 Type Coercion (Implicit Conversion)

Core Concept: Mixing Types Triggers Coercion

Atomic vectors in R can hold only one type. When you combine values of different types in a single vector, R silently converts (coerces) everything to the most “general” type in this hierarchy:

logical → integer → double → complex → character

The type furthest to the right wins. Logical becomes integer when mixed with integers; numbers become character when mixed with strings.
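
A few combinations show the rule in action. The variable mixed is an illustrative name.

```r
typeof(c(TRUE, 2L))      # "integer"
typeof(c(1L, 2.5))       # "double"
typeof(c(1, "a"))        # "character"

mixed <- c(TRUE, 2L, 3.5, "four")
mixed                    # "TRUE" "2" "3.5" "four"

# The same coercion happens in arithmetic: TRUE counts as 1, FALSE as 0
TRUE + TRUE                        # 2
sum(c(TRUE, FALSE, TRUE))          # 2
mean(c(TRUE, FALSE, TRUE, TRUE))   # 0.75
```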

Common Mistake: “My Numbers Are Not Numbers”

A frequent source of bugs is reading a CSV where one column contains a stray string. R then reads the entire column as character: arithmetic on it stops with an error such as “non-numeric argument to binary operator”, while sorting and comparisons quietly switch to text order. Always check str(df) on newly loaded data so you catch type surprises early.
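
The sketch below reproduces the problem with a tiny in-memory CSV. The data and the names csv_text, df, res, and marks_num are invented for illustration; stringsAsFactors = FALSE is passed explicitly so the result does not depend on the R version.

```r
# One stray "n/a" in an otherwise numeric column
csv_text <- "name,marks\nAsha,81\nRavi,n/a\nMeena,92"
df <- read.csv(text = csv_text, stringsAsFactors = FALSE)

str(df)              # marks shows up as chr, not num
class(df$marks)      # "character"

# Arithmetic on the column fails; capture the error message instead of stopping
res <- tryCatch(df$marks + 1, error = function(e) conditionMessage(e))
res

# Casting explicitly turns the stray entry into NA
marks_num <- suppressWarnings(as.numeric(df$marks))
marks_num            # 81 NA 92
```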

6.8 Type Casting (Explicit Conversion)

The as.X() Family

To move a value deliberately from one type to another, use the matching as.X() function.

Function	Converts To
`as.logical(x)`	`logical` (non-zero numbers → `TRUE`, zero → `FALSE`; strings such as `"TRUE"`/`"FALSE"` → the matching value; other strings → `NA`)
`as.integer(x)`	`integer` (truncates toward zero; non-numeric strings → `NA`)
`as.numeric(x)` / `as.double(x)`	`double`
`as.character(x)`	`character`
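
Each function in the table can be tried on a few representative inputs:

```r
as.logical(c(0, 1, -3))                 # FALSE TRUE TRUE
as.logical(c("TRUE", "false", "yes"))   # TRUE FALSE NA

as.integer(c(3.9, -3.9))                # 3 -3 (truncated, not rounded)
as.integer("42")                        # 42
as.integer(TRUE)                        # 1

as.numeric("3.14")                      # 3.14
as.character(c(1.5, 2))                 # "1.5" "2"
as.character(TRUE)                      # "TRUE"
```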

When Casting Fails

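If R cannot convert a value, it returns NA. Numeric casts such as as.numeric() and as.integer() also emit a warning (“NAs introduced by coercion”), but as.logical() returns NA silently. Always check the result before using it.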
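
One possible way to check a cast is sketched below. The input vector and the names raw_input, nums, failed, and msg are hypothetical.

```r
raw_input <- c("12", "7.5", "abc")

# The cast warns about "abc"; the warning is suppressed here and checked by hand
nums <- suppressWarnings(as.numeric(raw_input))
nums                          # 12.0 7.5 NA

# Entries that were present in the input but became NA are cast failures
failed <- is.na(nums) & !is.na(raw_input)
raw_input[failed]             # "abc"

# The warning can also be captured rather than suppressed
msg <- tryCatch(as.numeric("abc"), warning = function(w) conditionMessage(w))
msg                           # "NAs introduced by coercion"

# as.logical() gives NA without any warning
as.logical("abc")             # NA
```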

Best Practice: Cast at the Boundary, Not in the Middle

Convert inputs to the right type as soon as they enter your script (at the “boundary” where file reads and user input happen), then rely on consistent types everywhere else. This keeps the core of your code simple and pushes type defensiveness to the edges, where it belongs.

6.9 The Three Missing or Absent Values

NA, NaN, and NULL Are Three Different Things

R makes a careful distinction between three values that all mean “nothing is here”, and treating them as synonyms is a common source of bugs.

Value	Meaning	Tested With
`NA`	Missing data. One atomic value whose content is unknown.	`is.na(x)`
`NaN`	“Not a Number”. The result of an undefined numeric operation like `0/0`.	`is.nan(x)`
`NULL`	The absence of a value. A zero-length object used to mean “no argument” or “empty slot”.	`is.null(x)`
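
The three values behave differently in practice, as this short sketch shows (x is an illustrative vector):

```r
# NA: a placeholder for one unknown value
x <- c(1, NA, 3)
is.na(x)                  # FALSE TRUE FALSE
mean(x)                   # NA: the unknown value makes the mean unknown
mean(x, na.rm = TRUE)     # 2
NA == NA                  # NA: two unknowns cannot be compared

# NaN: the result of an undefined numeric operation
0 / 0                     # NaN
is.nan(0 / 0)             # TRUE
is.na(0 / 0)              # TRUE: is.na() also reports NaN
is.nan(NA)                # FALSE

# NULL: nothing at all
length(NA)                # 1
length(NULL)              # 0
is.null(NULL)             # TRUE
length(c(1, NULL, 3))     # 2: NULL vanishes inside c()
length(c(1, NA, 3))       # 3: NA keeps its slot
```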

Expert Insight: Typed NA Values Exist Too

For rare cases where the type of the NA matters (e.g. initialising a vector you plan to fill later), R provides NA_integer_, NA_real_, NA_character_, and NA_complex_. Most of the time the plain NA is enough.
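
A short sketch of the typed constants, including the initialisation case mentioned above. The vector name scores is illustrative.

```r
typeof(NA)              # "logical": the plain NA is a logical value
typeof(NA_integer_)     # "integer"
typeof(NA_real_)        # "double"
typeof(NA_character_)   # "character"
typeof(NA_complex_)     # "complex"

# Initialise a double vector that will be filled in later
scores <- rep(NA_real_, 3)
typeof(scores)          # "double" from the start
scores[1] <- 7.5
scores                  # 7.5 NA NA
```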

6.10 A Worked Example: Cleaning Mixed Input

Putting It Together

Consider a realistic cleanup: a set of marks entered as strings, some of them missing or malformed, has to be cast to numbers and summarised.
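
One way such a cleanup could look is sketched here. The raw_marks values are invented for illustration, and the names marks, n_bad, and avg are hypothetical.

```r
# Marks as they might arrive from a form or spreadsheet: all strings
raw_marks <- c("78", "85.5", "", "absent", "92", NA, " 64 ")

# Cast at the boundary: trim stray spaces, then convert.
# Entries that cannot be parsed become NA (the warning is suppressed here).
marks <- suppressWarnings(as.numeric(trimws(raw_marks)))
marks                          # 78.0 85.5 NA NA 92.0 NA 64.0

# Count the entries that could not be used, and see which ones they were
n_bad <- sum(is.na(marks))
n_bad                          # 3
raw_marks[is.na(marks)]        # "" "absent" NA

# Summarise the usable values
avg <- mean(marks, na.rm = TRUE)
avg                            # 79.875
max(marks, na.rm = TRUE)       # 92
```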

This is the pattern you will see repeatedly in later chapters: cast at the boundary, mark failures with NA, and use na.rm = TRUE when summarising.

Summary

Concept	Description
Atomic Types
Six Atomic Types	logical, integer, double, character, complex, raw — every value starts atomic
Logical (TRUE/FALSE)	Truth values written as TRUE and FALSE in full
Integer	Whole numbers; suffix L makes a literal integer
Double	Default numeric type — floating-point with precision caveats
Character	Strings, each one a single-element character vector
Complex and Raw	Specialised numeric types used less often in everyday work
Coercion and Casting
Coercion Rules	When types mix, R coerces upward: logical -> integer -> double -> complex -> character
Type Casting Functions	as.integer(), as.numeric(), as.character(), as.logical() convert types explicitly