25 Functions: Purpose, Types, and Creation

What this chapter covers

A function wraps a block of logic behind a name and a clear interface: what goes in (arguments), what comes out (return value). Functions let you name an idea, reuse it, test it, and hide the details. This chapter covers why functions matter, how to write them, the difference between positional, named, and default arguments, variable-length arguments with ..., and the rules R uses to find variables inside a function (scoping). You’ll learn to turn a rough snippet of analysis code into a clean, reusable function.

25.1 Why functions?

Take any script longer than a page and you’ll find the same three or four lines of logic appearing in slightly different form over and over. A function is how you name that logic once and use the name afterwards.
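
A sketch of the idea, using made-up scores. The text only shows the call pct(82, 100), so the argument names and the digits default below are assumptions:

```r
# Without a function: the same percentage logic is retyped for every case.
round(82 / 100 * 100, 1)
round(45 / 60 * 100, 1)

# With a function: name the logic once, then reuse the name.
# pct() and its argument names are illustrative assumptions.
pct <- function(part, whole, digits = 1) {
  round(part / whole * 100, digits)
}

pct(82, 100)  # 82
pct(45, 60)   # 75
```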

Functions buy you four things:

Names. pct(82, 100) tells the reader what’s happening; round(82/100*100, 1) makes them decode it.
Reuse. One edit updates every caller.
Isolation. Variables inside a function don’t leak out, so you can experiment without polluting the workspace.
Testing. A function with clear inputs and outputs is something you can hand examples to and verify.

25.2 The anatomy of a function

Every R function has the same skeleton:

name <- function(arg1, arg2, ...) {
  body                     # one or more expressions
  return_value             # the last expression is the result
}

A live example:
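
As a minimal sketch of that skeleton (rect_area and its argument names are illustrative, not taken from the original example):

```r
# A function with two arguments and a two-line body.
rect_area <- function(width, height) {
  area <- width * height   # intermediate result, local to the function
  area                     # last expression: this is what the caller gets
}

rect_area(3, 4)  # 12
```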

Three things to notice:

function(…) { … } is itself an expression. We assign it to a name with <-.
The braces {} wrap the body. You can omit them for a one-line body, but including them is never wrong.
The last expression is the return value; no explicit return() call is required.

25.3 return(): explicit vs implicit

Both forms below work identically:
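
For instance, with two hypothetical helpers (the names add_implicit and add_explicit are made up for this sketch):

```r
# Implicit return: the value of the last expression is the result.
add_implicit <- function(a, b) {
  a + b
}

# Explicit return: same result, spelled out with return().
add_explicit <- function(a, b) {
  return(a + b)
}

add_implicit(2, 3)  # 5
add_explicit(2, 3)  # 5
```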

R convention is to rely on the implicit return for the normal exit and to use return() only for early returns: jumping out before the end when a short-circuit condition fires.
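
A small sketch of that convention, assuming a hypothetical safe_divide() that treats division by zero as the short-circuit case:

```r
# safe_divide() is an illustrative name; returning NA for b == 0 is an assumption.
safe_divide <- function(a, b) {
  if (b == 0) {
    return(NA_real_)   # early exit: nothing sensible to compute
  }
  a / b                # normal exit: implicit return
}

safe_divide(10, 2)  # 5
safe_divide(10, 0)  # NA
```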

25.4 Arguments: positional, named, default

Arguments can be passed by position or by name. Named passing is more explicit and safer once a function has more than two or three arguments.
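
To illustrate, here is a sketch with a hypothetical power() function. Swapping positional arguments changes the result, while named arguments can appear in any order:

```r
# power() and its argument names are illustrative.
power <- function(base, exponent) {
  base^exponent
}

power(2, 3)                    # 8: matched by position
power(3, 2)                    # 9: same values, different order, different result
power(exponent = 3, base = 2)  # 8: matched by name, order no longer matters
```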

Default values let you omit arguments that usually take the same value:
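
A minimal sketch, assuming a hypothetical greet() function whose greeting argument is usually "Hello":

```r
# greet() is an illustrative name; greeting has a default, name does not.
greet <- function(name, greeting = "Hello") {
  paste0(greeting, ", ", name, "!")
}

greet("Ada")                        # "Hello, Ada!"
greet("Ada", greeting = "Welcome")  # "Welcome, Ada!"
```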

Defaults can reference other arguments, because R evaluates them lazily inside the function. This is handy for computed defaults:
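
One way this might look, using a hypothetical rescale() whose bounds default to the range of the data itself:

```r
# rescale() is an illustrative name. lo and hi default to values
# computed from x, the first argument.
rescale <- function(x, lo = min(x), hi = max(x)) {
  (x - lo) / (hi - lo)
}

rescale(c(10, 20, 30))                   # 0.0 0.5 1.0
rescale(c(10, 20, 30), lo = 0, hi = 40)  # 0.25 0.50 0.75
```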

Argument-order discipline

A healthy convention: put the data first, then required parameters, then optional parameters with defaults. This matches R’s own functions (mean(x, na.rm = FALSE)) and plays well with the pipe |>.
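
A sketch of that ordering with a hypothetical scale_by() function. The native pipe |> requires R 4.1 or later and feeds its left-hand side into the first argument:

```r
# Data first, then a required parameter, then an optional one with a default.
# scale_by() is an illustrative name.
scale_by <- function(x, factor, digits = 2) {
  round(x * factor, digits)
}

scaled <- c(1.234, 5.678) |> scale_by(10)
scaled  # 12.34 56.78
```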

25.5 Variable arguments: `...`

... lets a function accept an unknown number of extra arguments. It collects them so that you can pass them through to another function unchanged.
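
A minimal sketch of the pass-through pattern, assuming a hypothetical mean_rounded() wrapper that forwards anything it does not recognise to mean():

```r
# mean_rounded() is an illustrative name. Extra arguments such as
# na.rm are handed to mean() untouched via ...
mean_rounded <- function(x, digits = 1, ...) {
  round(mean(x, ...), digits)
}

mean_rounded(c(1, 2, NA, 4))                # NA: mean() keeps its default na.rm = FALSE
mean_rounded(c(1, 2, NA, 4), na.rm = TRUE)  # 2.3
```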

Inside the function, ... can be inspected by wrapping it in list(...):
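
For example, a hypothetical describe_args() could report how many extra arguments it received and what they were called:

```r
# describe_args() is an illustrative name.
describe_args <- function(...) {
  args <- list(...)   # capture the extra arguments as an ordinary list
  list(n = length(args), names = names(args))
}

info <- describe_args(a = 1, b = "two", c = TRUE)
info$n      # 3
info$names  # "a" "b" "c"
```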

... is how most plotting and summary functions let you pass through graphical or statistical options you didn’t anticipate.

25.6 Returning multiple values: return a list

R functions return exactly one object. To return several things, bundle them in a named list.
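
A short sketch, assuming a hypothetical summarise_vec() that reports three statistics at once:

```r
# summarise_vec() is an illustrative name.
summarise_vec <- function(x) {
  list(
    mean = mean(x),
    sd   = sd(x),
    n    = length(x)
  )
}

stats <- summarise_vec(c(2, 4, 6))
stats$mean    # 4
stats[["sd"]] # 2
stats$n       # 3
```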

The caller pulls values out with $ or [[. This idiom is everywhere: model-fitting functions such as lm() return a list-based object of this kind.

25.7 Scoping: where does a name come from?

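Inside a function, R looks for a variable in a specific order: first among the function’s local variables, then in the enclosing environment (where the function was defined), then in that environment’s parents, ending with the global workspace and the attached packages.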
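
A sketch of that lookup order. The function names are hypothetical, and the global variable is called multiplier to match the note later in this section:

```r
multiplier <- 10             # lives in the global workspace

scale_value <- function(x) {
  x * multiplier             # not local, so R looks outward and finds the global
}
scale_value(2)   # 20

shadowed <- function(x) {
  multiplier <- 3            # a local variable wins over the global one
  x * multiplier
}
shadowed(2)      # 6
multiplier       # still 10: the local assignment did not touch the global
```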

Variables created inside a function are local: they disappear when the function returns.
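
A minimal sketch, assuming the workspace has no existing variable named running_total (compute_total is an illustrative name):

```r
compute_total <- function(x) {
  running_total <- sum(x)   # exists only while the function runs
  running_total
}

compute_total(1:3)          # 6
exists("running_total")     # FALSE: the local variable is gone
```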

Avoid reaching out for inputs

A function that reads a workspace variable such as multiplier, instead of taking it as an argument, works but is fragile. If multiplier changes, the function’s behaviour changes silently. Prefer passing values in as arguments: functions should read their inputs from arguments, not from the surrounding workspace.

25.8 Functions are first-class

A function is an ordinary R object. You can store it in a variable, pass it to another function, return it from a function, or put it in a list.
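
Each of those four uses can be sketched in a few lines (all names below are illustrative):

```r
square <- function(x) x^2

# Store it in another variable.
sq <- square
sq(4)                     # 16

# Pass it to another function.
apply_twice <- function(f, x) f(f(x))
apply_twice(square, 3)    # 81

# Return it from a function.
make_adder <- function(n) {
  function(x) x + n
}
add5 <- make_adder(5)
add5(10)                  # 15

# Put it in a list.
ops <- list(sq = square, neg = function(x) -x)
ops$neg(4)                # -4
```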

This is the property that makes the apply family (Chapter 24) possible: sapply(x, mean) passes the function mean as a value.

25.9 Types of functions you’ll meet

Four categories worth naming, even though there’s nothing syntactically different between them:

Built-in functions ship with R: mean(), sum(), paste(), lm().
Package functions are loaded via library(): dplyr::mutate(), stringr::str_detect().
User-defined functions are the kind we’re writing in this chapter.
Anonymous functions are defined on the spot without a name: \(x) x^2 used inside an apply call.

The rules are identical for all four. The distinction is social, not technical.
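
As a quick illustration of the anonymous form, the two calls below are equivalent. The \(x) shorthand requires R 4.1 or later, and the variable names are illustrative:

```r
# Full form and backslash shorthand define the same anonymous function.
squares_long  <- sapply(1:3, function(x) x^2)
squares_short <- sapply(1:3, \(x) x^2)

squares_long   # 1 4 9
squares_short  # 1 4 9
```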

25.10 Pure vs side-effect functions

A pure function returns a value and does nothing else: no printing, no plotting, no writing to files. A side-effect function changes the outside world.
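
A sketch of the contrast, using two hypothetical temperature helpers. One only computes, the other only reports:

```r
# Pure: same input always gives the same output, and nothing else happens.
to_fahrenheit <- function(celsius) {
  celsius * 9 / 5 + 32
}

# Side effect: its job is to print. It returns NULL invisibly.
report_fahrenheit <- function(celsius) {
  cat("Temperature:", to_fahrenheit(celsius), "F\n")
  invisible(NULL)
}

to_fahrenheit(100)      # 212
report_fahrenheit(100)  # prints "Temperature: 212 F"
```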

Rule of thumb: pure functions are easier to test and combine. Reserve printing, messaging, and file I/O for functions whose job is precisely that.

25.11 Worked example: a reusable grading function

Package the grading logic from Chapter 21 as a proper function: explicit arguments, default rules, a clean return value, and support for a vector input via ifelse.
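
One possible shape for such a function is sketched below. Chapter 21's actual rules are not shown here, so the name grade(), the cutoff values, and the letter labels are all assumptions:

```r
# grade() is a sketch: cutoffs and letters are assumed, not taken from Chapter 21.
# score   : a single number or a numeric vector
# cutoffs : named vector of minimum scores for each letter
grade <- function(score, cutoffs = c(A = 90, B = 80, C = 70, D = 60)) {
  # Nested ifelse() is vectorised, so scalars and vectors share one code path.
  ifelse(score >= cutoffs[["A"]], "A",
  ifelse(score >= cutoffs[["B"]], "B",
  ifelse(score >= cutoffs[["C"]], "C",
  ifelse(score >= cutoffs[["D"]], "D", "F"))))
}

grade(85)                                               # "B"
grade(c(95, 72, 40))                                    # "A" "C" "F"
grade(85, cutoffs = c(A = 85, B = 75, C = 65, D = 55))  # "A"
```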

Look at what the function gained:

A default cutoff vector matches the common case; callers override for a custom scheme.
The function accepts scalar or vector input: one implementation, two use cases.
The behaviour is documented through the parameter names, not a paragraph of comments.

Summary

Concept	Description
Why and How
Why Functions	Encapsulate logic for reuse, testing, and clarity
Anatomy of a Function	name <- function(args) { body } is the canonical shape
return() Implicit vs Explicit	Last expression is returned implicitly; return() exits early
Arguments and Return
Positional and Named Arguments	Arguments can be passed by position or by name
Default Arguments	Set default values to make arguments optional
... Variadic Arguments	... lets you forward extra arguments to inner functions
Returning Multiple Values	Wrap multiple results in a list and return that
Functions are First-Class	Functions can be passed as arguments and returned from other functions