4  Basic Syntax, Variables, and Naming Conventions

Note: What This Chapter Covers

This chapter introduces the fundamental rules of writing R code. You will learn how R reads and evaluates code line by line, how to write comments and multi-line expressions, how R handles spaces and case, how to create variables using the three assignment operators, and what names you can legally give those variables. You will also see the naming conventions professional R users follow, the reserved words the language will not let you use, and the built-in functions that let you inspect or clear variables in your workspace. By the end of this chapter you will be able to write short R programs that assign, inspect, and reassign values with confidence.

flowchart LR
    S["Source Script <br> (your .R or .qmd file)"] --> E["R Parser <br> reads one statement"]
    E --> EV["R Evaluator <br> computes the value"]
    EV --> B["Binding <br> name -> value in environment"]
    B --> O["Output or next statement"]
    style S fill:#e3f2fd,stroke:#1976D2
    style E fill:#fff3e0,stroke:#F57C00
    style EV fill:#fff3e0,stroke:#F57C00
    style B fill:#f3e5f5,stroke:#8E24AA
    style O fill:#e8f5e9,stroke:#388E3C


4.1 R Is an Expression-Driven Language

Note: Core Concept: Every Line Returns a Value

Unlike languages where statements and expressions are different things, almost every piece of R code is an expression that evaluates to a value. When you type 2 + 2 at the console and press Enter, R computes the result and prints it. When you type x <- 5, R also produces a value (the number 5), but it hides the output because assignment is treated as an invisible operation. This design keeps the language small and composable; you can put nearly any piece of code inside any other piece of code.
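A minimal sketch of this behaviour, using only base R (the names x and y are illustrative), might look like this:

```r
# A bare expression is evaluated and its value is printed at the console.
2 + 2

# Assignment also returns a value, but invisibly, so nothing is printed.
x <- 5

# Wrapping the assignment in parentheses makes the returned value visible.
(x <- 5)

# withVisible() reports whether a value would be auto-printed.
withVisible(x <- 5)$visible

# Because assignment is itself an expression, it can sit inside other code.
y <- (x <- 5) * 2
y
```

The last two lines show the composability the text describes: the inner assignment yields 5, which is then multiplied and bound to y.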

Tip: Expert Insight: The Console as a Calculator

New users often treat R like a scripting language where they must put everything inside a file. The R console is also a calculator. When you want to check an expression, compute a quick number, or test a function call, type it at the console. That habit shortens the learning loop enormously.


4.2 Statements, Expressions, and Comments

Note: How To: Write Clean R Statements

An R statement is a single expression that R can evaluate. You end a statement by pressing Enter or by writing a semicolon ;. You do not need to end every line with a semicolon in R; line breaks are the natural separator. Comments begin with the hash symbol # and continue to the end of the line.
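As a small illustration (the variable names radius and area are assumptions made for this sketch), comments and a statement that spans two lines could be written as follows:

```r
# A full-line comment: R ignores everything after the hash.
radius <- 2  # a trailing comment on the same line as code

# An expression that is incomplete at the end of a line continues on the next.
# Ending the line with an operator tells R that more is coming.
area <- pi *
  radius^2

area
```

Ending the first line with the operator is what keeps R reading; if the line already formed a complete expression, R would evaluate it immediately and treat the next line as a new statement.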

Note: Semicolons and Multi-Statement Lines

You can place more than one statement on a single line by separating them with semicolons. This is sometimes useful for tight scripts but is usually discouraged because it reduces readability.
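A brief sketch of both layouts, with illustrative names a and b:

```r
# Two statements on one line, separated by a semicolon (legal, but terse).
a <- 1; b <- 2

# The same code written one statement per line.
a <- 1
b <- 2

a + b
```

Both versions produce the same bindings; the second is simply easier to scan and to comment.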

Tip: Best Practice: One Idea per Line

Write one statement per line, comment the why (not the what), and let R’s own output do the talking. Future-you reading a script six months from now will thank present-you.

Warning: Common Mistake: Forgetting That R Has No Block Comments

R does not have a native multi-line comment syntax the way C uses /* ... */. To comment out a block of code, prefix every line with #. Most IDEs do this for you with a keyboard shortcut: Ctrl+Shift+C in RStudio.


4.3 Case Sensitivity and Whitespace

Note: Core Concept: R Is Case Sensitive

Age, age, and AGE are three different names in R. The same applies to function names: mean() works, Mean() does not. Treat capitalisation as meaningful information.
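The following sketch, with made-up values, shows that the three spellings are independent bindings:

```r
age <- 30
Age <- 41
AGE <- 52

# Three distinct bindings, each with its own value.
c(age, Age, AGE)

# mean() exists in base R. Mean() does not, unless a package or script
# in your session happens to define it.
exists("mean")
exists("Mean")
```

In a fresh session with only the default packages attached, the last call is expected to report that no object called Mean exists.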

Note: Whitespace Is Mostly Ignored

R ignores spaces around operators, inside parentheses, and between tokens. Use spaces to improve readability; do not use them in a way that hides meaning.

| Readable | Legal but Hard to Read |
|---|---|
| x <- 5 + 3 | x<-5+3 |
| mean(c(1, 2, 3)) | mean(c(1,2,3)) |
| y <- (a + b) / 2 | y<-(a+b)/2 |

Warning: Common Mistake: x<-5 Is Assignment, Not x < -5

R's parser is never in doubt here: x<-5 is always read as the assignment x <- 5. The ambiguity is on the human side. If you meant the comparison x < -5 ("is x less than negative five") and left out the spaces, R silently overwrites x with 5 instead of returning TRUE or FALSE. Always write a space on both sides of <-, and a space between < and a negative number, to avoid the trap.
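A minimal demonstration of the trap, assuming x starts out as 10:

```r
x <- 10

# With a space between < and -, this is a comparison: is x less than -5?
x < -5      # FALSE; x is unchanged

# Without spaces, R reads "<-" as the assignment operator.
x<-5        # silently rebinds x to 5
x
```

The second statement prints nothing, which is exactly why the mistake is easy to miss.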


4.4 Variables and the Three Assignment Operators

Note: Core Concept: A Variable Is a Name for a Value

In R, a variable is not a box that holds a value. It is a name that is bound to a value stored somewhere in memory. When you write x <- 10, R creates the number 10 and binds the name x to it. When you reassign x to something else, the old binding is replaced; the old value may be garbage collected.

R provides three operators for assignment. All three work, but they are not identical in where you can use them or how they read.

| Operator | Direction | Typical Use |
|---|---|---|
| <- | right-to-left | The idiomatic choice in scripts and the R community. |
| = | right-to-left | Common inside function calls (for named arguments), sometimes used for assignment. |
| -> | left-to-right | Legal but rare; occasionally useful at the end of a pipeline. |

Note: Seeing All Three in Action
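A short sketch of the three operators side by side; the names price, quantity, and total are illustrative:

```r
# Leftward assignment with <- (the idiomatic form).
price <- 20

# Leftward assignment with = (works at the top level of a script).
quantity = 3

# Rightward assignment with -> (the value comes first, the name last).
price * quantity -> total

total

# Inside a call, = names an argument; it does not create a variable
# in your workspace.
mean(x = c(1, 2, 3))
```
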
Tip: Best Practice: Prefer <- for Assignment

The R community, including the Tidyverse style guide and most textbooks, prefers <- for assignment and reserves = for named arguments inside function calls. This convention makes scripts easier to scan, because the eye can distinguish “this creates a variable” (<-) from “this passes an argument” (=). In RStudio, the keyboard shortcut Alt + - (hyphen) inserts <- with spaces on both sides.


4.5 Reassigning and Updating Variables

Note: How To: Update a Variable's Value

Reassignment simply binds the name to a new value; the old value is discarded. The variable can even change type across reassignments because R is dynamically typed.
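For instance, a variable called score (the values here are invented for illustration) can be updated and then rebound to a value of a different type:

```r
score <- 72                # score is bound to a number
class(score)

score <- score + 5         # update using the current value; score is now 77
score

score <- "seventy-seven"   # the same name now refers to a character string
class(score)
```

R raises no warning at the final step, which is why the type change is easy to overlook.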

Tip: Expert Insight: Dynamic Typing Is Powerful and Dangerous

R letting you rebind score from number to string is convenient in interactive work and treacherous in larger scripts. A common source of bugs is reusing a variable name for something of a different type mid-way through a script. Pick fresh names when the meaning changes.


4.6 Rules for Naming Variables

Note: The Hard Rules R Enforces

| Rule | Example of Legal Name | Example of Illegal Name |
|---|---|---|
| Must start with a letter or a dot (.) that is not followed by a digit. | income, .private | 2ndRound, .3rd |
| May contain letters, digits, underscore (_), and dot (.). | mean_score_2024, price.usd | mean-score, price usd |
| Cannot contain operators or spaces. | total_cost | total cost, total+cost |
| Cannot be a reserved word. | result, count | TRUE, if, function |

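One way to check these rules programmatically is with base R's make.names(), which rewrites any string into a syntactically valid name. The helper is_valid_name() below is a hypothetical convenience wrapper written for this sketch, not a built-in function:

```r
# Hypothetical helper: a name is syntactically valid when make.names()
# returns it unchanged.
is_valid_name <- function(name) {
  make.names(name) == name
}

# Names from the "legal" column of the table.
is_valid_name(c("income", ".private", "mean_score_2024", "price.usd"))

# Names from the "illegal" column, plus one reserved word.
is_valid_name(c("2ndRound", ".3rd", "mean-score", "price usd", "if"))

# make.names() also shows how R would repair an invalid name.
make.names(c("2ndRound", "mean-score"))
```
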
Note: Reserved Words You Cannot Use as Names

R refuses to let you assign a value to any of these reserved words. Trying to use one as a variable name produces an error.

| Category | Reserved Words |
|---|---|
| Logical constants | TRUE, FALSE |
| Missing and special values | NA, NA_integer_, NA_real_, NA_character_, NA_complex_, NULL, NaN, Inf |
| Control flow | if, else, for, in, while, repeat, break, next |
| Declaration | function |

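The sketch below tries a few assignments and reports which ones R rejects. The helper fails() is hypothetical, written only for this illustration; it parses and evaluates a line of code inside a throwaway environment so that nothing leaks into the workspace:

```r
# Hypothetical helper: run one line of code and report whether it errored.
fails <- function(code) {
  tryCatch({
    eval(parse(text = code), envir = new.env())
    FALSE
  }, error = function(e) TRUE)
}

fails("TRUE <- 1")     # reserved constant: R raises an error
fails("if <- 1")       # reserved keyword: R raises an error
fails("result <- 1")   # ordinary name: no error
```
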
Warning: T and F Are Reassignable, But Do Not Do It

The letters T and F are shortcuts for TRUE and FALSE. Unlike TRUE and FALSE themselves, they are ordinary variables that happen to be pre-assigned. You can overwrite them with T <- 0, and disaster follows for every piece of code that assumed T meant TRUE. Write TRUE and FALSE in full and never reassign T or F.
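A minimal demonstration of the hazard, followed by the cleanup that restores normal behaviour:

```r
T                 # TRUE, found in the base package

T <- 0            # legal: T is an ordinary variable, not a reserved word
isTRUE(T)         # FALSE, so code relying on T now misbehaves

rm(T)             # remove the user-level binding
T                 # TRUE again, because base R's T is visible once more
```
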


4.7 Naming Conventions: What the R Community Uses

Note: Common Styles You Will See in the Wild

| Style | Example | Who Uses It |
|---|---|---|
| snake_case | mean_income, customer_id | Tidyverse, modern R, this book. |
| camelCase | meanIncome, customerId | Some older R packages, developers from a Java background. |
| dot.case | mean.income, customer.id | Base R (e.g. data.frame, read.csv), older code. |
| PascalCase | MeanIncome | Rare in R; more common for function objects in some codebases. |
| UPPER_SNAKE | MAX_SCORE, N_RUNS | Constants and configuration values. |

Tip: Best Practice: Pick One Style and Stick With It

Consistency matters more than the style you pick. A script that mixes meanIncome, mean.income, and mean_income is much harder to scan than a script that picks any one style and uses it everywhere. This book and the Tidyverse both use snake_case throughout.

Warning: Avoid Names That Shadow Built-in Functions

Names like data, df, c, t, mean, and sum already exist as functions in base R and its default packages. If you reuse them for your own objects, you shadow the originals and confuse your future self and any collaborator. Use customer_df instead of df, avg_score instead of mean, and so on.
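The sketch below shows shadowing at its most disruptive, when the replacement is itself a function. The stand-in definition of sum() is deliberately useless and exists only for illustration:

```r
# Shadowing a built-in: this user-defined sum() hides base R's version.
sum <- function(...) "not a real sum"

sum(1, 2, 3)         # calls the user-defined function
base::sum(1, 2, 3)   # the original is still reachable by its full name

# Removing the user-defined binding makes the built-in visible again.
rm(sum)
sum(1, 2, 3)
```
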


4.8 Inspecting and Managing Your Variables

Note: Core Tools for Workspace Management

R provides a handful of built-in functions that let you see, check, and remove the variables in your current session’s environment.

| Function | Purpose |
|---|---|
| ls() | List all objects in the current environment. |
| exists("name") | Return TRUE if a variable with that name exists. |
| class(x) | Report the class of the value bound to x. |
| str(x) | Show the structure and type of x compactly. |
| rm(x) | Remove the binding for x from the environment. |
| rm(list = ls()) | Remove every user-created object. Use with caution. |

Note: Removing Variables
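A short session using these tools might look like the sketch below; the variable names and values are assumptions made for illustration:

```r
customer_id <- 101L
avg_score <- 88.5

ls()                    # the listing includes "avg_score" and "customer_id"
exists("avg_score")     # TRUE
class(customer_id)      # "integer"
str(avg_score)          # compact one-line summary

rm(avg_score)           # remove one binding
exists("avg_score")     # FALSE
```
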
Tip: Expert Insight: Start Each Analysis with a Clean Session, Not rm(list = ls())

A common pattern in old R tutorials is to start every script with rm(list = ls()). That clears your workspace but not the packages that are already attached, and it does not reset random seeds or option settings. A truly clean start comes from restarting R itself (in RStudio: Session > Restart R, or the keyboard shortcut Ctrl+Shift+F10). Restarting R is the modern, reproducible way to start fresh.


4.9 A Small Worked Example

Note: Putting the Chapter Together

The snippet below applies every idea from this chapter: three assignment operators, readable names, comments that explain the why, a reassignment, and a workspace inspection at the end.
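A minimal sketch of such a snippet follows. The scenario (a small order total) and every name and value in it are hypothetical, chosen only to exercise the ideas listed above:

```r
# Unit price comes from the supplier quote and is fixed for this order.
unit_price <- 12.5

# Quantity is assigned with = purely to show that it works.
units_ordered = 4

# Rightward assignment: compute first, name the result last.
unit_price * units_ordered -> order_total

# The customer added one item, so the total must be recomputed.
units_ordered <- units_ordered + 1
order_total <- unit_price * units_ordered

# Inspect what now exists in the workspace.
order_total
class(order_total)
ls()
```
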

Every one of those three assignment operators works. In production code, the entire block would be written with <- for consistency.


4.10 Summary

Note: Key Concepts at a Glance

| Concept | Key Takeaway |
|---|---|
| Expression-driven | Almost every line in R is an expression that returns a value. |
| Case sensitive | Age, age, and AGE are three different names. |
| Whitespace | Spaces are mostly ignored; use them for readability, and always around <-. |
| Three assignment operators | <-, =, and -> all assign. The R community prefers <-. |
| Naming rules | Start with a letter or dot; use letters, digits, underscore (_), and dot (.). Never use a reserved word. |
| Naming conventions | Pick one style (snake_case is idiomatic) and apply it consistently. |
| Avoid shadowing | Do not name variables mean, sum, data, df, c, or t. |
| Workspace tools | ls(), exists(), class(), str(), and rm() manage the current environment. |
| Reproducible starts | Prefer restarting R over rm(list = ls()) for a truly clean session. |

Tip: Applying This in Practice

The habits you build in this chapter will repeat themselves thousands of times across every R project. Commit to <- for assignment, snake_case for names, one statement per line, and a fresh R session at the start of every meaningful piece of work. In the next chapter you will start reading input into your programs and writing output back out, using R’s core I/O functions.