library(plotly)
library(ggplot2)
R Visualizations - plotly
Statistics 506
Note: Due to the heavy reliance on images in this document, I suggest toggling to light mode via the button in the top-right corner of the document.
Introduction
Plotly is a commercial company that makes open source graphing libraries available for R, Python, Julia, and other languages. In particular, the plots generated are interactive, so users can modify their view of the plot to make things clearer.
There are two primary ways to interface with plotly. The first is to pass any object created via ggplot() into the ggplotly() function. The second is to use plot_ly to generate a plot from scratch.
ggplotly
Any gg object created by ggplot can be passed into ggplotly.
# This code taken from the previous set of notes
nnmaps <- read.csv("data/chicago-nmmaps.csv")
nnmaps$date <- as.Date(nnmaps$date)
suppressWarnings(nnmaps_month <-
  aggregate(nnmaps, by = list(nnmaps$month_numeric,
                              nnmaps$year),
            FUN = mean, na.rm = TRUE))
nnmaps_month <- nnmaps_month[order(nnmaps_month$year,
                                   nnmaps_month$month_numeric), ]
nnmaps_month$time <- seq_len(nrow(nnmaps_month))

g <- ggplot(nnmaps, aes(x = date, y = o3, color = season)) +
  geom_point() +
  geom_line(data = nnmaps_month, color = "red", linewidth = 3)
# Convert the ggplot object into an interactive plotly plot
ggplotly(g)
Play around with the above plot. You can …
- mouse over individual points
- zoom in and out or move the plot around (changing the tooltip type)
- hide or isolate the subgroups by clicking/double clicking on the legend
However, with more advanced plots, the conversion from a ggplot into a plotly plot can cause issues. Recall this plot from the previous set of notes:
data(mpg)
g <- ggplot(mpg) +
  geom_jitter(aes(x = hwy, y = cty, color = displ, shape = drv, size = cyl)) +
  scale_radius()
g
Note the three separate scales in the legends. Observe what happens when we convert to a plotly object.
ggplotly(g)
plot_ly
For more advanced plots, we’re better off creating plots directly in plotly with the plot_ly function. To be honest, plotly is not the easiest package to use, but with a little work (and a lot of online resources) you can figure out how to generate plots.
Let’s start by replicating the NMMAPS seasonal data.
p <- plot_ly(nnmaps, x = ~ date, y = ~ o3, color = ~ season,
             type = "scatter", mode = "markers")
p
This shows the basics of a plot: the x, y and color arguments being passed into ..., the type argument defining what kind of plot to draw, and the mode being an argument that scatterplots support.
A plotly object consists of some number of “traces” and a “layout”. Each trace represents one plotted object, and more specifically the transformation that connects the data to the visual plot.
Behind the scenes, R plotly objects are converted into JSON lists before being passed to the browser for drawing. We can examine these objects to get information about the traces.
pb <- plotly_build(p) # Convert the plot into the JSON list
sapply(pb$x$data, "[[", "name")
[1] "Autumn" "Spring" "Summer" "Winter"
So we see that each of the four seasons generated its own trace. There is obviously a massive amount of information inside the pb object. If you are really adventurous, you can manipulate it directly to modify the plotted object.
pb$x$data[[1]]$mode <- "markers+lines"
pb
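For readers without the CSV at hand, the same inspect-and-modify steps can be sketched on a built-in dataset. This is only an illustration: `iris` and the names `p_iris`, `pb_iris`, and `trace_names` are stand-ins chosen here, not part of the original notes.

```r
library(plotly)

# Built-in `iris` data stands in for nnmaps so this runs without the CSV
p_iris <- plot_ly(iris, x = ~Sepal.Length, y = ~Sepal.Width, color = ~Species,
                  type = "scatter", mode = "markers")
pb_iris <- plotly_build(p_iris)

# One trace is generated per level of the color variable
trace_names <- sapply(pb_iris$x$data, "[[", "name")
trace_names

# Each trace is a plain list, so single attributes can be read or replaced
pb_iris$x$data[[1]]$mode
pb_iris$x$data[[1]]$mode <- "markers+lines"
pb_iris
```

Only the first trace is changed here, so one species is drawn with connecting lines while the others remain plain markers.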
We’ll add the averaged line by adding another trace to the plot.
p2 <- add_trace(p, x = ~ date, y = ~ o3,
                data = nnmaps_month,
                type = "scatter",
                mode = "lines",
                color = "red",
                name = "Average",
                line = list(width = 10))
sapply(plotly_build(p2)$x$data, "[[", "name")
[1] "Autumn" "Spring" "Summer" "Winter" "Average"
p2
add_trace is a generic version that requires specifying the type (or letting plotly guess); there exist specific add_* functions as well.
# Not run
add_lines(p, x = ~ date, y = ~ o3,
          data = nnmaps_month,
          color = "red",
          name = "Average",
          line = list(width = 10))
Often it is best to “manually” add each trace to an empty plot.
plot_ly() |>
  add_markers(x = ~ date, y = ~ o3, color = ~ season, data = nnmaps) |>
  add_lines(x = ~ date, y = ~ o3,
            data = nnmaps_month,
            color = "red",
            name = "Average",
            line = list(width = 10))
Plotly has a version of inheritance similar to ggplot2. I do find it more fickle and liable to error.
plot_ly(x = ~ date, y = ~ o3, color = ~ season, data = nnmaps,
        type = "scatter", mode = "markers") |>
  add_lines(data = nnmaps_month, # no `x=` or `y=` argument
            color = "red",
            name = "Average",
            line = list(width = 10))
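When inheritance gets in the way, one option is to switch it off for a single trace with the `inherit = FALSE` argument that the `add_*` functions accept. The sketch below assumes the built-in `iris` data as a stand-in; `p_base`, `iris_mean`, and `p_explicit` are illustrative names only.

```r
library(plotly)

p_base <- plot_ly(iris, x = ~Sepal.Length, y = ~Sepal.Width, color = ~Species,
                  type = "scatter", mode = "markers")

# Mean sepal width at each observed sepal length (illustrative summary only)
iris_mean <- aggregate(Sepal.Width ~ Sepal.Length, data = iris, FUN = mean)

# inherit = FALSE: nothing is taken from plot_ly(), so x and y are restated
p_explicit <- add_lines(p_base, data = iris_mean,
                        x = ~Sepal.Length, y = ~Sepal.Width,
                        name = "Mean width", inherit = FALSE)

line_names <- sapply(plotly_build(p_explicit)$x$data, "[[", "name")
line_names
p_explicit
```

Restating every mapping is more verbose, but it makes explicit which attributes the added trace uses, which can help when an inherited `color` or `y` produces a surprising result.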
3d-plots can be easily created by adding a third dimension and switching to the 3d scatterplot type.
plot_ly(z = ~ date, x = ~ o3, y = ~ temp, color = ~ season, data = nnmaps,
        type = "scatter3d", mode = "markers")
Performance of ggplotly vs plot_ly
library(microbenchmark)
mb <- microbenchmark(
  ggplot = ggplotly(ggplot(nnmaps, aes(x = date, y = o3, color = season)) +
                      geom_point() +
                      geom_line(data = nnmaps_month,
                                color = "red", linewidth = 3)),
  plot_ly =
    plot_ly() |>
    add_markers(x = ~ date, y = ~ o3, color = ~ season, data = nnmaps) |>
    add_lines(x = ~ date, y = ~ o3,
              data = nnmaps_month,
              color = "red",
              name = "Average",
              line = list(width = 10))
)
Warning in microbenchmark(ggplot = ggplotly(ggplot(nnmaps, aes(x = date, : less
accurate nanosecond times to avoid potential integer overflows
print(mb, unit = "s", signif = 4)
Unit: seconds
expr min lq mean median uq max neval cld
ggplot 0.0277200 0.0284200 0.030130 0.028910 0.0309800 0.078110 100 a
plot_ly 0.0002064 0.0002208 0.000262 0.000245 0.0002541 0.002408 100 b
This implies that ggplotly is over 114 times slower! This could have a big impact on larger documents with many embedded interactive plots.
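As a quick check of that figure, the ratio can be recomputed from the printed table. The numbers below are copied by hand from the mean and median columns of the output above; the variable names are illustrative.

```r
# Mean and median run times in seconds, copied from the benchmark output above
ggplotly_mean   <- 0.030130
plot_ly_mean    <- 0.000262
ggplotly_median <- 0.028910
plot_ly_median  <- 0.000245

# How many times longer ggplotly() takes than the direct plot_ly() pipeline
ratio_mean   <- ggplotly_mean / plot_ly_mean
ratio_median <- ggplotly_median / plot_ly_median
round(c(mean = ratio_mean, median = ratio_median), 1)
```

Both ratios come out above 114, so the claim holds whether the mean or the median is used. Exact timings will of course differ from machine to machine and from run to run.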
Layout
The layout function allows manipulation of the non-trace attributes of the plot. It takes in a plotly object and returns one as well, so it’s appropriate in a pipe chain.
p <- plot_ly(x = ~ date, y = ~ o3, color = ~ season, data = nnmaps,
             type = "scatter", mode = "markers")
p2 <- p |> layout(title = "O3 over time",
                  yaxis = list(type = "log",
                               title = "log(o3)"))
p2
Similar to how we extracted or could modify aspects of the JSON to affect the trace, we likewise can do so to affect the layout.
plotly_build(p)$x$layout$title
NULL
plotly_build(p2)$x$layout$title
[1] "O3 over time"
p3 <- plotly_build(p2)
p3$x$layout$title <- "My new title!"
p3
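Several layout attributes can be set in a single call. The following is a minimal sketch using the built-in `airquality` data (New York, May to September 1973) as a stand-in for nnmaps; the `date` column and the names `aq`, `p_aq`, and `built` are constructed here for illustration.

```r
library(plotly)

# Built-in `airquality` data stands in for the notes' nnmaps data (assumption)
aq <- airquality
aq$date <- as.Date(ISOdate(1973, aq$Month, aq$Day))

p_aq <- plot_ly(aq, x = ~date, y = ~Temp, type = "scatter", mode = "markers") |>
  layout(title = "Daily temperature, New York, 1973",
         xaxis = list(title = "Date"),
         yaxis = list(title = "Temperature (F)"),
         showlegend = FALSE)

# layout() returns a plotly object, so the result can be piped further
class(p_aq)

# The layout settings are stored next to the traces in the built object
built <- plotly_build(p_aq)
names(built$x$layout)
p_aq
```

Setting everything through `layout()` keeps the object a regular plotly plot, whereas editing the built list directly is better reserved for attributes that are awkward to reach otherwise.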
Subplots
The subplot function can easily combine multiple plots into a single output.
p1 <- plot_ly() |> add_markers(x = ~ date, y = ~ o3,
                               data = nnmaps, name = "o3")
# This trace plots temp, so label it "temp" in the legend
p2 <- plot_ly() |> add_markers(x = ~ date, y = ~ temp,
                               data = nnmaps, name = "temp")
subplot(p1, p2)
subplot(p1, p2, nrows = 2)
The shareX and shareY arguments control whether axes are linked.
subplot(p1, p2, shareY = TRUE)
subplot(p1, p2, nrows = 2, shareX = TRUE)
While both subplots in this example are scatterplots, they can contain any type of trace.
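To make that last point concrete, here is a sketch that stacks a marker trace on top of a line trace with a linked x-axis. It assumes the built-in `airquality` data as a stand-in; `aq`, `p_temp`, `p_wind`, and `sp` are illustrative names.

```r
library(plotly)

aq <- airquality
aq$date <- as.Date(ISOdate(1973, aq$Month, aq$Day))

p_temp <- plot_ly(aq) |> add_markers(x = ~date, y = ~Temp, name = "Temp")
p_wind <- plot_ly(aq) |> add_lines(x = ~date, y = ~Wind, name = "Wind")

# Stack the panels and link their x-axes, so zooming one zooms the other
sp <- subplot(p_temp, p_wind, nrows = 2, shareX = TRUE, titleY = TRUE)
sp

# The combined object carries one trace per panel
n_traces <- length(plotly_build(sp)$x$data)
n_traces
```

`titleY = TRUE` keeps each panel's y-axis title, which subplot otherwise drops when the y-axes are not shared.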
Adding additional interactivity
So far all the interactivity we’ve seen has been zooming or panning or hovering. However, we can also fairly easily add functionality to let users modify parts of the plot. For example, let’s allow users to choose what predictor goes on the Y axis. This takes place in the layout as well.
p <- plot_ly(data = nnmaps) |>
  add_markers(x = ~ date, y = ~ o3) |>
  add_markers(x = ~ date, y = ~ temp, visible = FALSE)
p
We’ve added a second trace, but we’ve made it not visible. Now we can add a menu where a user can toggle which trace is visible.
p |> layout(updatemenus = list(
  list(
    y = 0.5,
    buttons = list(
      list(method = "update",
           args = list(list(visible = list(TRUE, FALSE)),
                       list(yaxis = list(title = "o3"))),
           label = "o3"),
      list(method = "update",
           args = list(list(visible = list(FALSE, TRUE)),
                       list(yaxis = list(title = "temp"))),
           label = "temp"))
  )
))
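The button definitions are nearly identical, so they can be generated by a small helper. The sketch below is one possible approach, not something plotly provides: `make_toggle_button` is a hypothetical function defined here, and the built-in `airquality` data stands in for nnmaps.

```r
library(plotly)

# Hypothetical helper (not part of plotly): build one "update" button that
# shows only trace number `show` out of `n_traces` and retitles the y-axis
make_toggle_button <- function(label, show, n_traces) {
  visible <- as.list(seq_len(n_traces) == show)
  list(method = "update",
       args = list(list(visible = visible),
                   list(yaxis = list(title = label))),
       label = label)
}

aq <- airquality
aq$date <- as.Date(ISOdate(1973, aq$Month, aq$Day))

p_menu <- plot_ly(aq) |>
  add_markers(x = ~date, y = ~Temp, name = "Temp") |>
  add_markers(x = ~date, y = ~Wind, name = "Wind", visible = FALSE) |>
  layout(updatemenus = list(
    list(y = 0.5,
         buttons = list(make_toggle_button("Temp", 1, 2),
                        make_toggle_button("Wind", 2, 2)))
  ))
p_menu
```

Each button sets `visible` for every trace in order, so the helper only needs the position of the trace to show and the total number of traces.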
We can also let users control whether to draw lines or markers, but this time let’s make them buttons rather than a menu.
p |> layout(updatemenus = list(
  list(
    y = .8,
    buttons = list(
      list(method = "update",
           args = list(list(visible = list(TRUE, FALSE)),
                       list(yaxis = list(title = "o3"))),
           label = "o3"),
      list(method = "update",
           args = list(list(visible = list(FALSE, TRUE)),
                       list(yaxis = list(title = "temp"))),
           label = "temp"))),
  list(
    type = "buttons",
    y = .7,
    buttons = list(
      list(method = "update",
           args = list(list(mode = "markers")),
           label = "markers"),
      list(method = "update",
           args = list(list(mode = "lines")),
           label = "lines"))
  )
))
Finally, let’s let users zoom in and out of the range in the x-axis more naturally.
p |> layout(
  xaxis = list(
    rangeslider = list(type = "date")
  ),
  updatemenus = list(
    list(
      y = .8,
      buttons = list(
        list(method = "update",
             args = list(list(visible = list(TRUE, FALSE)),
                         list(yaxis = list(title = "o3"))),
             label = "o3"),
        list(method = "update",
             args = list(list(visible = list(FALSE, TRUE)),
                         list(yaxis = list(title = "temp"))),
             label = "temp"))),
    list(
      type = "buttons",
      y = .7,
      buttons = list(
        list(method = "update",
             args = list(list(mode = "markers")),
             label = "markers"),
        list(method = "update",
             args = list(list(mode = "lines")),
             label = "lines"))
    )
  ))