2  Purpose and Advantages of Using R

Note: What This Chapter Covers

Chapter 1 introduced R as a language and an environment. This chapter answers the next natural question: why would you actually pick R for a piece of work? You will see the specific purposes R was designed for, the concrete advantages it gives you over spreadsheets and general-purpose languages, the situations where those advantages matter most, and the trade-offs you should understand before committing to R for a project. By the end you will be able to defend a “why R?” decision to a colleague or supervisor with specific, evidence-based reasoning.

flowchart LR
    P["Purpose <br> (What R is built for)"] --> A["Advantages <br> (Why it beats alternatives)"]
    A --> U["Use Cases <br> (Where it shines)"]
    U --> T["Trade-offs <br> (When to choose something else)"]
    style P fill:#e3f2fd,stroke:#1976D2
    style A fill:#fff3e0,stroke:#F57C00
    style U fill:#f3e5f5,stroke:#8E24AA
    style T fill:#ffebee,stroke:#C62828


2.1 The Purpose R Was Built For

Note: Core Concept: A Language Designed for Data Analysis

R was not designed as a general-purpose programming language that was later adapted for data. It was designed, from day one, for statistical computing, data analysis, and the interactive exploration that analytical work demands. Every language-level decision, from how vectors behave, to how functions handle missing values, to how data frames are indexed, reflects that purpose. Understanding this intent is the single most useful lens for making sense of R’s design choices.
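A few lines at the console make these design decisions visible. The sketch below is illustrative only: the values and the `scores` data frame are made up for the example and do not come from any real dataset.

```r
# A single number is already a vector of length one.
x <- 42
is.vector(x)   # TRUE
length(x)      # 1

# Missing values are part of the language, not an afterthought.
heights <- c(1.62, NA, 1.80)
mean(heights)                 # NA: the missing value propagates
mean(heights, na.rm = TRUE)   # 1.71: explicitly drop the NA first

# Data frames are indexed by a row condition and a column name.
scores <- data.frame(name = c("Ana", "Ben", "Cy"), score = c(71, 88, 94))
high_scorers <- scores[scores$score > 80, "name"]
high_scorers
```

Each of these behaviours would need extra machinery in many general-purpose languages; in R they are the defaults.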

Note: The Four Design Priorities Inherited From S

When Ihaka and Gentleman began R in the early 1990s, they inherited four priorities from the S language they were re-implementing. These priorities are still visible in every line of R code you will write.

| Priority | What It Means for You |
| --- | --- |
| Interactive first | You can try an idea at the console, see the result, and iterate. R is built to support thinking-while-analysing, not just running pre-written scripts. |
| Vectors as a primitive | A single number is stored as a length-one vector. Operations work on whole vectors at once, which matches how statistics and data analysis actually think about data. |
| Functional and expressive | Data transformations compose cleanly, and most tasks can be expressed in one or two lines that a reader can follow. |
| Extensible through packages | The base language is deliberately small, and almost any new technique is delivered as a package, which is how R keeps pace with modern statistical methods. |

Tip: Why This Matters in Your Own Work

When you find yourself fighting R, for example trying to write an explicit loop that an experienced user would replace with a one-line vectorised expression, the fight is usually a signal that you are working against the language’s design rather than with it. Re-reading the priorities above often points to a cleaner solution.
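As a minimal sketch of that signal, here is the same conversion written twice. The temperature values are hypothetical; the point is only the contrast between the two styles.

```r
# Hypothetical temperatures in degrees Celsius.
celsius <- c(18.5, 21.0, 25.4, 30.1)

# Loop version: it works, but it fights the language.
fahrenheit_loop <- numeric(length(celsius))
for (i in seq_along(celsius)) {
  fahrenheit_loop[i] <- celsius[i] * 9 / 5 + 32
}

# Vectorised version: one expression applied to the whole vector.
fahrenheit <- celsius * 9 / 5 + 32

all.equal(fahrenheit_loop, fahrenheit)   # TRUE
```

Both versions give the same result; the second has no index variable to get wrong and no result vector to pre-allocate.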


2.2 The Core Advantages of R

Note: Advantage One: Free, Open Source, and Cross-Platform

R is released under the GNU General Public Licence, which means you can install, inspect, modify, and redistribute it at no cost, provided that redistributed versions remain under the same licence terms. This matters in three concrete ways. For students, it removes the cost barrier to professional-grade analysis tools. For organisations, it removes vendor lock-in and the risk of licence-fee increases. For scientists and auditors, it makes every computation fully inspectable, because you can read the source of every function, not just the documentation.

Note: Advantage Two: Vectorised, Expressive Syntax

In most general-purpose languages, you would write a loop to apply an operation to every element of a list. In R, the operation applies to the entire vector in one expression. The result is fewer lines of code, fewer bugs, and code that reads much closer to the mathematical or statistical idea you are expressing.
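One way to see how closely vectorised code tracks the underlying formula is to write the sample standard deviation, s = sqrt( sum((x_i - mean)^2) / (n - 1) ), by hand and compare it with R's built-in `sd()`. The numbers below are arbitrary illustrative values.

```r
x <- c(4, 8, 15, 16, 23, 42)   # arbitrary illustrative values
n <- length(x)

# The formula, written almost symbol for symbol, with no loop.
s <- sqrt(sum((x - mean(x))^2) / (n - 1))

all.equal(s, sd(x))   # TRUE: matches the built-in implementation
```

The expression `x - mean(x)` subtracts the mean from every element at once, and `^2` squares every deviation at once, so the code has the same shape as the mathematics.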

Note: Advantage Three: A Statistical Library Unmatched in Any Other Language

Most modern statistical techniques, from generalised linear models to Bayesian multilevel models, from survival analysis to mixed-effects models, are available in R, often in packages written or reviewed by the statisticians who developed the technique. This depth is not matched by any general-purpose language.
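As a small illustration of how little ceremony a standard method needs, the sketch below fits a logistic regression (one kind of generalised linear model) using only base R and the `mtcars` dataset that ships with it. The choice of variables is ours, made purely for demonstration.

```r
# mtcars ships with R; am is 1 for manual and 0 for automatic transmission,
# and wt is the car's weight in units of 1,000 lb.
fit <- glm(am ~ wt, data = mtcars, family = binomial)

summary(fit)   # coefficients, standard errors, deviance
coef(fit)      # just the estimated coefficients

# Predicted probability of a manual transmission for a hypothetical
# car weighing 3,000 lb.
p_manual <- predict(fit, newdata = data.frame(wt = 3), type = "response")
p_manual
```

The model formula `am ~ wt` reads as "transmission type explained by weight", and diagnostics, predictions, and summaries all come from the same fitted object.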

Note: Advantage Four: Publication-Quality Graphics

R produces graphics that are good enough to appear directly in peer-reviewed journals and professional reports. The ggplot2 package in particular provides a grammar of graphics that lets you describe a plot in terms of data, aesthetics, and geometric layers, rather than in terms of low-level drawing commands.
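The sketch below shows what "data, aesthetics, and geometric layers" looks like in practice. It assumes the ggplot2 package is installed and uses the built-in `mtcars` dataset; the axis labels and the choice of a linear smoother are our own, for illustration.

```r
library(ggplot2)   # assumes ggplot2 is installed: install.packages("ggplot2")

p <- ggplot(mtcars, aes(x = wt, y = mpg)) +        # data and aesthetics
  geom_point() +                                   # layer 1: raw observations
  geom_smooth(method = "lm", formula = y ~ x) +    # layer 2: fitted straight line
  labs(x = "Weight (1,000 lb)", y = "Miles per gallon")

p   # printing the object draws the plot
```

Nothing in the code says where to draw an axis or how to place a point; you describe the mapping from data to visual properties and ggplot2 handles the drawing.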

Note: Advantage Five: Reproducible Research Built Into the Workflow

R, together with R Markdown and Quarto, lets you weave code, results, tables, and narrative into a single document that re-runs end to end whenever the data changes. This is how reproducible research is supposed to work: the report is generated by the code, and the code is open to audit. In regulated settings such as clinical trials or financial reporting, a traceable path from raw data to reported result is often a formal requirement rather than a nice-to-have.

Note: Advantage Six: A Global, Active Community

CRAN hosts roughly twenty thousand packages. Stack Overflow has hundreds of thousands of answered R questions. Posit Community, R-Ladies chapters, R-bloggers, the useR! conference, and local meetups in most major cities mean that when you get stuck, someone has almost certainly been stuck on the same thing before, and their solution is one search away.

Tip: Expert Insight: Advantages Compound

These six advantages look like a list, but they compound in practice. Because R is free, universities teach it, which produces graduates who use it, which produces demand for R packages, which grows CRAN, which makes R more attractive for new projects, which brings more contributors to the community. A decision to learn R in 2026 sits on three decades of this compounding.

Warning: A Common Misconception: R Is Too Slow for Real Work

Readers sometimes hear that R is slow compared to C++ or Java. The base R interpreter is indeed slower than a compiled language for tight numerical loops, but in practice this almost never matters, because the heavy lifting in most R workflows happens inside compiled C or Fortran code that R simply calls. When true speed is needed, packages such as data.table, Rcpp, and arrow close the gap dramatically.
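You can check this on your own machine. The rough sketch below sums a million random numbers twice, once with an interpreted loop and once with the built-in `sum()`. The helper `loop_sum()` is a hypothetical function written for this comparison, and the timings will differ from machine to machine, so none are shown.

```r
x <- runif(1e6)   # one million random numbers, purely illustrative

# Interpreted loop: every iteration passes through the R interpreter.
loop_sum <- function(v) {
  total <- 0
  for (value in v) total <- total + value
  total
}

system.time(loop_sum(x))   # timings vary by machine
system.time(sum(x))        # sum() hands the work to compiled code

is.primitive(sum)                  # TRUE: implemented in C inside R itself
all.equal(loop_sum(x), sum(x))     # TRUE, up to floating-point rounding
```

The two approaches agree on the answer; the difference is where the work happens, which is why idiomatic R leans on built-in vectorised functions.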


2.3 Where R Shines: Primary Use Cases

Note: Five Workloads Where R Is the Default Choice

| Use Case | Why R Fits | Example Roles That Use It |
| --- | --- | --- |
| Statistical modelling and inference | Classical and modern methods are native, with mature diagnostics and reporting. | Biostatisticians, applied researchers, quantitative analysts. |
| Exploratory data analysis | Interactive console, vectorised syntax, and tidyverse make “poke at the data” very fast. | Data analysts, journalists, social scientists. |
| Reproducible reports and dashboards | Quarto and R Markdown turn an analysis into a live, re-runnable document; Shiny builds interactive apps. | Research teams, regulatory reporting, business intelligence. |
| Data visualisation at publication quality | ggplot2 and its extensions produce charts that are ready for a journal or an annual report. | Scientific publishing, newsroom analytics, consulting. |
| Teaching and learning statistics | Small code expresses large ideas, which is why R is the classroom standard in many statistics and data science programmes. | Educators, students, self-learners. |

Note: A Quick Industry Tour

| Industry | Typical R Use |
| --- | --- |
| Pharmaceuticals and clinical research | Clinical trial analysis, pharmacokinetics, regulatory submissions under FDA and EMA guidance. |
| Banking and insurance | Credit risk models, actuarial pricing, fraud pattern analysis, stress testing. |
| Technology and e-commerce | A/B test analysis, experimentation platforms, attribution modelling, recommender evaluation. |
| Government and public policy | Official statistics, census analysis, economic forecasting, public health monitoring. |
| Academia and research | Almost every quantitative field uses R somewhere in its workflow. |
| Journalism | Data journalism at outlets such as the BBC, FiveThirtyEight, and The New York Times relies heavily on R. |

Tip: Case Example: Why a Hospital Chose R for Its Outcomes Dashboard

Consider a mid-sized hospital that wants a weekly dashboard of patient-outcome metrics. The analysts need to pull data from the hospital database, clean and reshape it, compute risk-adjusted outcome rates, visualise trends by department, and publish the result as an interactive web page without exposing raw patient data. An R pipeline using DBI for the database, dplyr for reshaping, ggplot2 for the visuals, and Quarto or Shiny for the front end delivers this end to end in one language, with the entire workflow audit-ready. Replacing R would mean stitching together three or four different tools and giving up the single, reproducible document.


2.4 When R Is Not the Best Choice

Note: Honest Trade-offs

No language is optimal for every task. The table below highlights situations where R may not be the best first choice, along with the tool that typically fits better.

| Situation | Better Tool | Reason |
| --- | --- | --- |
| Building a production web application with heavy user traffic | Python, Go, or Node.js | General-purpose runtime, stronger web frameworks, better concurrency. |
| Deep learning at scale on GPUs | Python with TensorFlow or PyTorch | Deep learning ecosystems matured around Python; R interfaces exist but lag the frontier. |
| Tight numerical loops measured in microseconds | C, C++, or Rcpp inside R | Compiled languages win on raw speed; use Rcpp if you want to stay inside R. |
| Streaming data pipelines and orchestration | Python, Scala, or Kafka-centric tools | Most streaming ecosystems target Python or JVM languages first. |
| Quick, one-off spreadsheet tasks for non-technical users | Excel or Google Sheets | Grid-editing, conditional formatting, and colleague hand-off are simpler in a spreadsheet. |

Warning: The Learning Curve Is Real, Just Different

R is friendly for data-oriented work but unusual if you are coming from Python, Java, or C. Expressions such as x <- c(1, 2, 3), df$col, and the pipe operator look foreign at first. Most of the steepness comes in the first two weeks, after which productivity rises quickly. Plan your learning with this curve in mind: expect the early sessions to feel unlike a tutorial for a general-purpose language, and do not judge R by any single hour of them.
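For orientation, here are those three idioms side by side in a minimal sketch. The data frame and its column names are invented for the example, and the pipe shown is the native `|>` operator, which assumes R 4.1 or later (the magrittr `%>%` pipe is the other common form).

```r
x <- c(1, 2, 3)   # `<-` assigns; c() combines values into one vector

df <- data.frame(col = x, label = c("a", "b", "c"))
df$col            # `$` extracts a single column as a vector

# The pipe passes the left-hand side as the first argument of the next call,
# so the steps read left to right instead of inside out.
piped  <- x |> sqrt() |> sum()
nested <- sum(sqrt(x))
all.equal(piped, nested)   # TRUE: two spellings of the same computation
```

None of these is difficult once seen; they are simply different from the conventions of most general-purpose languages.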

Tip: Expert Insight: Choose the Tool, Not the Tribe

Professionals who have worked with R for many years rarely argue that R is always the right answer. The mature view is to treat R as an excellent default for statistical, analytical, and reporting work, and to reach for another tool when the task clearly demands it. The reticulate package even lets you call Python from R when a specific Python library is the right fit, so the two languages can live inside a single workflow.


2.5 A Side-by-Side Example: The Same Task in R and a Spreadsheet

Note: The Task

Given monthly sales figures for two products, compute the total, the mean, and the growth rate from the first to the last month, and produce a simple trend chart.
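One possible R version of this task is sketched below, using only base R. The sales figures, the product names, and the helper `pct_growth()` are hypothetical, invented here so the example is runnable; substitute your own data.

```r
# Hypothetical monthly unit sales for two products over six months.
sales <- data.frame(
  month     = month.abb[1:6],
  product_a = c(120, 135, 150, 160, 172, 180),
  product_b = c(200, 195, 210, 205, 220, 230)
)
products <- c("product_a", "product_b")

# Growth from the first to the last month, as a percentage.
pct_growth <- function(v) (v[length(v)] / v[1] - 1) * 100

summary_table <- data.frame(
  product    = products,
  total      = unname(colSums(sales[products])),
  mean       = unname(colMeans(sales[products])),
  growth_pct = unname(sapply(sales[products], pct_growth))
)
summary_table

# Simple trend chart with base graphics: one line per product.
line_cols <- c("steelblue", "darkorange")
matplot(as.matrix(sales[products]), type = "b", pch = 1:2, lty = 1,
        col = line_cols, xaxt = "n", xlab = "Month", ylab = "Units sold")
axis(1, at = seq_len(nrow(sales)), labels = sales$month)
legend("topleft", legend = products, col = line_cols, pch = 1:2, lty = 1)
```

With these made-up figures, product A totals 917 units and grows 50 percent, while product B totals 1,260 units and grows 15 percent. Changing the numbers in `sales` and rerunning the script refreshes both the table and the chart.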

Tip: What a Spreadsheet Would Need for the Same Task

A spreadsheet version would require at least a dozen manual formula entries, a chart wizard, and named ranges to keep the logic clean. Any change to the data would mean verifying that every formula still points to the right cells. In R, the whole analysis is a short script: rerun it with new data and the summary and the chart update together, with no cell references to break.

Warning: Where Spreadsheets Still Win

Spreadsheets remain excellent for ad-hoc data entry, simple budgets, and quick what-if analysis that you want a non-technical colleague to edit directly. R replaces spreadsheets when the analysis needs to be repeated, audited, or scaled, not every time data is involved.


2.6 Summary

Note: Key Concepts at a Glance

| Concept | Key Takeaway |
| --- | --- |
| Purpose of R | Designed specifically for statistical computing, data analysis, and interactive exploration, inheriting S’s priorities. |
| Free and open source | GNU GPL licensing removes cost and lock-in, and makes every function inspectable. |
| Vectorised syntax | Operations apply to whole vectors at once, so code is shorter, clearer, and closer to statistical thinking. |
| Statistical depth | Classical and cutting-edge methods are first-class, often authored by the statisticians who invented them. |
| Graphics | Publication-quality visuals via base R graphics and ggplot2. |
| Reproducibility | R Markdown and Quarto let code, results, and narrative live in a single auditable document. |
| Community | CRAN, Stack Overflow, Posit Community, and meetups make help easy to find. |
| When not to use R | Production web apps, large-scale deep learning, tight numerical loops, and streaming pipelines typically fit other tools better. |

Tip: Applying This in Practice

When you are asked to justify using R on a new project, resist the temptation to argue from generalities. Instead, map the project to the six advantages in this chapter. If the task involves statistical modelling, custom visualisation, reproducible reporting, and a team that values transparency, R is almost always the right default. If the task is mainly production engineering or deep learning, a different language may serve you better, and recognising that early is a sign of judgement, not weakness. The next chapter takes this practical attitude further by walking you through installing R, choosing an IDE, and getting your environment ready for the rest of the book.