Effective data visualization is as much an art as it is a science; color choice is an important example. From manual color selection, to basic functions, to entire packages dedicated to color, there are multiple methods to specify the colors used in visualizations made with R. We will explore the following:
-
Individual colors included in base R
-
Hexadecimal codes
-
Color palettes included in base R
-
Color palettes from packages
Base R - Individual Colors
Base R defines 657 colors, which can be called by name or by number. You can print a complete list of the 657 colors in your R console using:
colors()
## [1] "white" "aliceblue" "antiquewhite"
## [4] "antiquewhite1" "antiquewhite2" "antiquewhite3"
## [7] "antiquewhite4" "aquamarine" "aquamarine1"
## [10] "aquamarine2" "aquamarine3" "aquamarine4"
## [13] "azure" "azure1" "azure2"
....
## [646] "wheat" "wheat1" "wheat2"
## [649] "wheat3" "wheat4" "whitesmoke"
## [652] "yellow" "yellow1" "yellow2"
## [655] "yellow3" "yellow4" "yellowgreen"
Selecting Colors by Hex Codes
The hexadecimal color system uses a 6-digit code to describe color. Red, green, and blue are each represented by 2 digits (#rrggbb). Hex codes allow for precise color selection, which is particularly useful when designing visualizations that will be featured on a poster, website, report, or digital dashboard. In this way, the figure can be easily integrated into the overall color palette of the medium to showcase that it was designed for a specific purpose. You may have noticed that many large companies use specific color palettes, which keeps company materials and presentations looking cohesive across offices and the web.
There are many online tools for identifying hex codes from images or websites, and most design tools (e.g., Canva) include built-in color ID tools. To find the hex code for a base R color, you can visit this helpful site or run the following code directly in your R console.
library(dplyr) # dplyr is required to use pipes (%>%)
col2rgb("darkorchid4") %>%
t() %>%
rgb(maxColorValue = 255)
## [1] "#68228B"
Assigning a single color to a base R plot using the “col” argument is simple. The following three plots use the same color, which were be called by different methods: name (specific to R), number (specific to R), and hexadecimal code.
par(mfrow = c(1, 3))
plot(iris$Sepal.Length, iris$Petal.Length, col = "darkorchid4")
plot(iris$Sepal.Length, iris$Petal.Length, col = colors()[99])
plot(iris$Sepal.Length, iris$Petal.Length, col = "#68228B")

When exploring data, it can also be helpful to use color to provide quick explorations of the data.
plot(
x = iris$Sepal.Length,
y = iris$Petal.Length,
col = ifelse(iris$Sepal.Width < 3, "gray", "darkorchid4")
)

However, more often than not, a visualization will require more than one color. If have not been assigned a color palette for your project, there are numerous pre-built color palettes to help choose complimentary colors for your viz. However, not all palettes are created equal, so it is worth taking the time to select a palette best suited to your task.
Base R - Color Palettes
The grDevices package (included in base R) defines numerous color
palettes. Possibly the most
infamous
is the rainbow() palette. In 2019, the palette options were improved
with the addition of numerous HCL (hue-chroma-luminance)
palettes
to supplement the existing RGB/HSV (hus-saturation-value)-based palettes
grDevices. As the release
blog
explains, RGB/HSV-based colors are more difficult for humans to perceive
than HCL colors.
Starting from any existing palette, you can create a custom palette
based on the number of colors needed. For example, creating a 3-color
palette from the “heat.colors” palette to distinguish a categorical
variable with 3 categories (e.g., the species variable of the iris
dataset), as shown below. Mapping a color palette to categorical
variables in base R requires transforming the “variable of interest in
an integer that will be used to call the appropriate color.” Mapping a
color palette to a continuous variable in base R requires manually
dividing the variable of interest into bins. Plot examples adapted from
R Graph Gallery. An
easier method for mapping a color palette to variable, using ggplot2,
is explored in the section R Packages Related to
Color.
# Create a color palette with the desired number of colors
colors <- heat.colors(n = 3)
# Transform the categorical variable into integers
class <- unclass(iris$Species)
# "unclass" returns a copy of its argument with its class attribute removed
# For a categorical variable, it returns a list of integers.
# Scatterplot with each species a different color
plot(
x = iris$Sepal.Length,
y = iris$Petal.Length,
bg = colors[class],
cex = 3,
pch = 21)

# Create a color palette with the desired number of colors
cont_colors <- heat.colors(n = 20)
# Transform the numeric variable into bins
rank <- as.factor(as.numeric(cut(iris$Petal.Width, 20)))
# Scatterplot with color gradient
plot(
x = iris$Sepal.Length,
y = iris$Petal.Length,
bg = cont_colors[rank],
cex = 3,
pch=21
)

R Packages Related to Color
ggplot2
The best way to integrate color into your visuals in a compelling way is
to use ggplot2. Mapping a color palette to a variable is much easier
with ggplot grammar than with base R. As seen below, ggplot2
automatically assigns a color to each class of a categorical variable
without requiring the user to transform the variable.
library(ggplot2)
ggplot(data = iris, aes(x = Sepal.Length, y = Petal.Length, color = Species)) +
geom_point(size = 5)

When no color palette is assigned, ggplot2 uses a default palette.
However, this can be changed using a number of functions. Below, I will
demonstrate the functions housed in the paletteer package, though
ggplot2 can natively call other color palettes; for more information,
refer to the
r-graph-gallery page
on the subject.
paletteer (by Emil Hvitfeldt).
A compilation of color palettes from numerous R packages (full list here). The latest version (v_1.6.0) compiles 2728 palettes from 75 packages. Some notable ones:
RcolorBrewer, one of the most well-known collection of palettes in R.viridis, which includes 8 color palettes designed with colorblindness/desaturation in mind.grDevices, if you want to stick with baseR palettes.
The palettes are classified as:
- Discrete (fixed width): have a set amount of colors which does not change even when a number of colors is requested. E.g., “nord::frost” will always be limited to 4 colors.
- Discrete (dynamic): the colors of the palette depend on the number of colors required.
- Continuous: a smooth transition best for continuous variables.
Each type of palette is called by a unique function when plotting with ggplot2:
-
Discrete (fixed width):
scale_color_paletteer_d("packagename::palettename") -
Discrete (dynamic):
scale_color_paletteer_d("packagename::palettename", dynamic = TRUE) -
Continuous:
scale_color_paletteer_c("packagename::palettename")library(ggplot2) library(paletteer) ggplot(data = iris, aes(x = Sepal.Length, y = Sepal.Width, color = Species)) + geom_point(size = 3) + scale_color_paletteer_d("nord::frost")
library(ggplot2) library(paletteer) ggplot(data = iris, aes(x = Sepal.Length, y = Sepal.Width, color = Sepal.Width)) + geom_point(size = 3) + scale_color_paletteer_c("ggthemes::Orange-Blue Diverging")
Even more color palettes can be found in the cpt-city
package, which was
excluded from paletteer due to the sheer number of palettes
(7140!). The cpt-ctiy
project is an archive
of colour gradients for cartography, technical illustration and design;
the R package is maintained by Sergio
Ibarra-Espinosa.
colorBlindness (by Jianhong Ou).
This package contains several useful tools to ensure your visualizations are accessible to viewers colorblindness, namely: deuteranopia and protanopia (two forms of red-green colorblindness) and achromatopsia (total absence of color perception). In addition, stimulating the appearance of desaturated plots (which represents viewers with achromatopsia) is a helpful way to check if your viz will be legible when printed in B&W.
-
Color-vision-deficiency (CVD) simulation is possible with the “cvdPlot” function. This function was adapted from the
colorblindrpackage, written by Claire D. McWhite and Claus O. Wilke. Note that this function is only compatible with a ggplot grob, you cannot use a plot made with base R.library(ggplot2) library(paletteer) library(colorBlindness) p <- ggplot(data = iris, aes(x = Sepal.Length, y = Sepal.Width, color = Species)) + geom_point() + scale_color_paletteer_d("lisa::OskarSchlemmer") colorBlindness::cvdPlot(p)
-
CVD stimulation is also possible with individual colors or colors in a palette with the “displayAllColors” function. In the examples below, note how much better the viridis palette performs when desaturated, compared to the rainbow palette.
library(paletteer) library(colorBlindness) colorBlindness::displayAllColors(rainbow(n = 10))
colorBlindness::displayAllColors(paletteer_c("viridis::inferno", n = 10))
General Considerations for Color in Data Viz
- Keep colors consistent. If you are creating multiple plots using the same data groupings, make sure that groups are always represented by the same colors. This avoids confusion for readers, especially if they are not paying close attention to the text of the labels.
- Accommodate for colorblind viewers. Commonly, research style
guides will advise avoiding red and green in the same figure. While
this is a starting point, it does not guarantee that your
visualizations will be accessible. My preferred tool is the
colorBlindnesspackage, which is described in the section R Packages Related to Color. - Don’t go overboard. Colors are a tool to help distinguish groups or sections of your graph. They should not be overused to the point of overwhelming your viewers or making your plots appear unprofessional. Not everything needs an eye-catching color; some figure features (e.g., titles, labels, grid lines) are better left black.
- Some colors carry social connotations. For example, if displaying monetary values or profits in a figure, it’s important to remember that most people will interpret red as a negative or a loss, so avoid using red to denote the positive end of your scale. In addition, if you are making a comparison, avoid colors that imply “good” or “bad.” For example, when comparing the efficacy of two products, if you make data points from one product red, it could lead viewers to interpret that product as bad, harmful, or less useful, even if the data does not support this opinion.
Summary
In R, individual colors can be called by name, number, or hexadecimal
code. Thousands of color palettes exist, some of which are available in
base R, but most can be found in the paletter package. Though changing
the color of plots with base R is possible, the process is more
efficient and flexible with the use of ggplot2. There are also useful
packages, such as colorBlindness that can be used to ensure your
figures are accessible to colorblind viewers. The most effective color
choice will depend on the purpose of the visualization and the type of
data. For example, choosing between a discrete or continuous palette
will depend on the type of data you are visualizing. Colors help engage
viewers and make visual data easier to comprehend.
Software/Languages Used
R version 4.3.2
RStudio version 2023.09.1+494
R Packages: ggplot2
(v_3.4.4); paletteer
(v_1.6.0);
colorBlindness
(v_0.1.9)
Learning Roundup
Though I was familiar with all of the tools described in this post,
finding resources to link to this post ended up teaching me several new
tricks. For example, I discovered the cvdPlot function in the
colorBlindness package, which can be used to simulate
color-vision-deficiency of finished plots. Until now, I had only used
the package to see if my color selections were distinguishable to
colorblind viewers, but seeing the final plots actually makes it much
easier to know how readable or unreadable the plots would be.
On a more general note, I wanted to truncate the lengthy output of an rmd chunk, which led me to the rmarkdown cookbook, where I learned how to edit the default knittr hook. The resulting code can be found in the rmd document used to created this page, found in my GitHub repository.
Resources
Color choice in data viz:
- What to consider when choosing colors for data visualization. A great list of considerations for designing visuals with helpful sample visuals, from Datawrapper.
- Color Theory terminology
Modifying color with R:
Individual colors and palettes available with R:
- Colors
- paletteer starter guide and comprehensive list of palettes created by the package author
- Samples of Palettes included in paletteer package, with additional hex code and code samples, as well as the ability to search palette names in a browser with CRTL+F: Palettes and paletteer gallery
Accessibility:
- Web Accessibility Guidelines: Contrast & Color
- colorBlindness package guide