Introduction to R Graphics

Le Yan



R Graphic Systems

  • There are at least three plotting systems in R
    • base
    • lattice
    • ggplot2
  • Today we will touch on “base” briefly then focus on “ggplot2”


  • Base plot system
  • ggplot2 plot system
    • Basic concepts
    • Geom and stat functions
    • Title, axis labels and legends
    • Themes
    • Scale functions
    • Coordinate systems
    • Faceting


  • Datasets are from the packages datasets, gcookbook and ggplot2
  • To learn more about those datasets, run help(package='<name>')
  • For information on individual datasets, run ?<dataset>
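
A minimal sketch of these lookups, assuming only the datasets package that ships with R (the variable name available is illustrative):

```r
# List the datasets shipped with the built-in "datasets" package.
# data(package = ...) returns a list whose "results" matrix has an "Item" column.
available <- data(package = "datasets")$results[, "Item"]
head(available)

# Check that the dataset used in the next slides is among them.
"pressure" %in% available

# Help pages, for interactive use:
# help(package = "datasets")   # package-level index
# ?pressure                    # a single dataset
```
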
First Plot in R

  • We will use the “pressure” dataset for our first plot.

First, let's examine the data:

str(pressure)
'data.frame':   19 obs. of  2 variables:
 $ temperature: num  0 20 40 60 80 100 120 140 160 180 ...
 $ pressure   : num  0.0002 0.0012 0.006 0.03 0.09 0.27 0.75 1.85 4.2 8.8 ...

summary(pressure)
  temperature     pressure       
 Min.   :  0   Min.   :  0.0002  
 1st Qu.: 90   1st Qu.:  0.1800  
 Median :180   Median :  8.8000  
 Mean   :180   Mean   :124.3367  
 3rd Qu.:270   3rd Qu.:126.5000  
 Max.   :360   Max.   :806.0000  
First Plot in R

We can use the plot() function in the base plot system to create a scatter plot:

# Simply specify the x and y variables.
plot(pressure$temperature, pressure$pressure)

Since there are only two variables, we can simply run:


plot(pressure)

More Plot Types

  • The type argument of plot() can be used to specify plot type

Line plot with “l”:


plot(pressure, type="l")

Or dot and line with “b”:


plot(pressure, type="b")

Adding More Layers

There are a few functions that can be used to add more elements/layers to the plot

  • Points
  • Lines
  • Texts
# Create the plot with title and axis labels.
plot(pressure, type="l",
     main="Vapor Pressure of Mercury",
     ylab="Vapor Pressure")

# Add points
points(pressure)

# Add annotation
text(150, 700, "Source: Weast, R. C., ed. (1973) Handbook\nof Chemistry and Physics. CRC Press.")


  • Use boxplot() for boxplots


boxplot(hwy ~ cyl, data=mpg, xlab="", ylab="") # Leave the axis labels to title()

# Use the title function to add title and labels.
title("Highway Mileage per Gallon",
      xlab = "Number of cylinders",
      ylab = "Mileage (per gallon)")



  • The hist() function can be used to create histograms

hist(mpg$hwy)

hist(mpg$hwy, breaks=c(5,15,25,30,50))



The curve() function draws a function over a specified range.

curve(cos,-3*pi, 3*pi)
title("Cosine Function")

# The abline() function adds one or more straight lines to the current plot.
# h = 0 (the x axis) is an assumed position for the line.
abline(h = 0,
       col = 2, lty = 2, lwd = 1.5)

Panel Grid of Plots

  • When datasets with multiple variables are passed to plot() without X and Y being specified, it will generate a panel grid of plots.
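
A small sketch of this behavior. The iris dataset and the variable name measurements are assumptions chosen for illustration, not taken from the slides:

```r
# iris ships with R; keep only its four numeric columns.
measurements <- iris[, 1:4]

# With a data frame of more than two columns and no explicit x and y,
# plot() dispatches to plot.data.frame(), which draws a grid of pairwise
# scatter plots (the same picture pairs() produces).
plot(measurements)
```
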

Saving Plots to Files

  • Steps to save a plot to a file
    • Open a device
    • Create the plot
    • Close the device
  • Supported devices
    • Vector: svg
    • Bitmap: jpeg,tiff,png,bmp
    • PDF: pdf
    • PostScript: postscript

Example: saving to a PNG file

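png("pressure.png")       # Open a device (file name is illustrative)
plot(pressure, type="l")  # Create the plot
dev.off()                 # Close the device

Here is the saved graph: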

ggplot2 Package

  • “gg” stands for grammar of graphics
  • Any data graphic can be described by specifying
    • A dataset
    • Visual marks that represent data points
    • A coordinate system
  • The ggplot2 package is an R implementation of this grammar
    • Versatile
    • Clear and consistent interface
    • Beautiful output

qplot Function

  • The qplot() function from the ggplot2 package is similar to the plot() function in the base system (qplot() is deprecated as of ggplot2 3.4.0).

Examine the data:

Scatterplot with qplot Function

  • Data represented by points
qplot(weightLb, heightIn, data=heightweight, geom="point")


  • Data represented by labels
qplot(weightLb, heightIn, data=heightweight, geom ="text", label=ageYear)


A Fancier Plot

This is what is under the hood:

ggplot(heightweight, aes(x=weightLb, y=heightIn, color=sex, shape=sex)) + 
  geom_point(size=3.5) +
  ggtitle("School Children\nHeight ~ Weight") +
  labs(y="Height (inch)", x="Weight (lbs)") +
  stat_smooth(method="loess", se=TRUE, color="black", fullrange=TRUE) +
  annotate("text",x=145,y=75,label="Locally weighted polynomial fit with 95% CI",color="Green",size=6) +
  scale_color_brewer(palette = "Set1", labels=c("Female", "Male")) +
  guides(shape="none") +
  theme_bw() +
  theme(plot.title = element_text(size=20, hjust=0.5), 
        legend.position = c(0.9,0.2),
        axis.title.x = element_text(size=20), axis.title.y = element_text(size=20),
        legend.title = element_text(size=15),legend.text = element_text(size=15))

Don't Panic!!!

Basic Concepts of ggplot2

Grammar of Graphics components:

  • Data: Use the ggplot function to indicate what data to use
  • Visual marks: Use geom_xxx functions to indicate what types of visual marks to use
    • Points, lines, area, etc.
  • Mapping: Use aesthetic properties (aes() function) to map variables to visual marks
    • Color, shape, size, x, y, etc.
ggplot(heightweight, # What data to use
       aes(x=weightLb,y=heightIn)) + # Aesthetic specifies variables
  geom_point() # Geom specifies visual marks 


This is equivalent to:

qplot(weightLb, heightIn, data=heightweight, geom="point")

Histogram with ggplot2

ggplot(mpg,aes(x=hwy)) + geom_histogram(binwidth=5, fill="white", color="black")


Contour Plots with ggplot2

ggplot(faithfuld, aes(waiting, eruptions, z = density))+
  geom_raster(aes(fill = density)) +
  geom_contour(colour = "white")


Maps with ggplot2

  • Combined with the maps package, one can create geographical graphs
east_asia <- map_data("world", 
                      region=c("Japan","China","North Korea","South Korea"))
ggplot(east_asia, aes(x=long,y=lat,group=group, fill=region)) + 
  geom_polygon(color="black")

List of Geoms in ggplot2

There are more than 30 geoms in ggplot2

  • One variable
    • geom_bar
    • geom_area
  • Two variables
    • geom_point
    • geom_smooth
    • geom_text
    • geom_boxplot
  • Graphic primitives
    • geom_path
    • geom_polygon
  • Error visualization
    • geom_errorbar
  • Special
    • geom_map
    • geom_contour
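
To see which geoms an installed version provides, one option is to list the exported names directly. This is a rough sketch; the variable name geoms is illustrative, and the count varies between ggplot2 versions:

```r
library(ggplot2)

# Collect every exported ggplot2 object whose name starts with "geom_".
geoms <- ls("package:ggplot2", pattern = "^geom_")
length(geoms)   # the exact count depends on the installed ggplot2 version
head(geoms)
```
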

Customizing Appearance of Data Points

  • The appearance of data points can be customized with the geom functions
    • Color
    • Shape (symbol)
    • Size
    • Alpha (transparency)
ggplot(heightweight, aes(x=weightLb,y=heightIn)) + 
  geom_point(color="blue", shape=17, size=4, alpha=0.5) # Illustrative values

List of Symbols

  • There are 26 built-in shapes, numbered 0 to 25
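
A possible way to draw a chart of the built-in symbols with their codes. The grid layout and the names shapes, gx, gy and p_shapes are assumptions made for this sketch:

```r
library(ggplot2)

# One row per built-in symbol, laid out on a 7-column grid.
codes  <- 0:25
shapes <- data.frame(code = codes,
                     gx   = codes %% 7,
                     gy   = -(codes %/% 7))

# scale_shape_identity() uses the numeric codes as given instead of
# mapping them through the default shape palette.
p_shapes <- ggplot(shapes, aes(x = gx, y = gy)) +
  geom_point(aes(shape = code), size = 5, fill = "grey70") +
  geom_text(aes(label = code), nudge_y = -0.35) +
  scale_shape_identity() +
  theme_void()
p_shapes
```

Shapes 21 to 25 are the only ones that use the fill setting; the others ignore it.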


Notes on Colors

  • A list of possible color names can be obtained with the function colors()
  • Can also use hex values
    • Starts with a “#”
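
A short sketch of both ways to name a color, using base R only (the variable name color_names is illustrative):

```r
# Named colors known to R
color_names <- colors()
length(color_names)
head(color_names)

# A hex string encodes red, green and blue as two hexadecimal digits each.
# col2rgb() decodes either form into 0-255 components.
col2rgb("#24FA22")
col2rgb("rosybrown2")

# rgb() goes the other way and builds a hex string.
rgb(36, 250, 34, maxColorValue = 255)
```
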
Adding More Layers to A Plot

  • New layers can be added to a plot by using geom_xxx functions
ggplot(heightweight, aes(x=weightLb,y=heightIn)) + 
  geom_point() + 
  geom_quantile(quantiles = c(0.25,0.5,0.75)) +
  geom_text(label=rownames(heightweight), vjust=-0.5)


More on Aesthetic Mapping

  • Aesthetic mappings describe how variables in the data are mapped to visual properties
    • Colors, shapes, sizes, transparency etc.
    • Controlled by the aes() function
    • Can be specified in either the ggplot() function or individual layers
    • Aesthetic mappings specified in ggplot() are the defaults, but can be overridden in individual layers
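
A minimal sketch of a default mapping and a per-layer override. It uses the mpg dataset from ggplot2 rather than the slides' own data, and the name p_aes is illustrative:

```r
library(ggplot2)

# color = drv is set in ggplot(), so every layer inherits it by default.
p_aes <- ggplot(mpg, aes(x = displ, y = hwy, color = drv)) +
  geom_point() +                       # inherits color = drv
  geom_smooth(aes(color = NULL),       # drops the inherited color mapping,
              method = "lm",           # so a single line is fitted
              se = FALSE)              # to all points
p_aes
```
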

Mapping Discrete Variables to Aesthetic Properties

  • Discrete data values can be mapped to an aesthetic value to group data points

Example: use the “sex” variable to group data points by shape and color:

ggplot(heightweight, aes(x=weightLb,y=heightIn)) + 
  geom_point(aes(shape=sex,color=sex))

Specify the shapes and colors manually (more on scale functions later):

ggplot(heightweight, aes(x=weightLb,y=heightIn)) + 
  geom_point(aes(shape=sex,color=sex),size=4) +
  scale_shape_manual(values=c(1,4)) +
  scale_color_manual(values=c("blue","green"))

Mapping Continuous variables to Aesthetic Properties

  • Continuous variables can be mapped to aesthetic values too
ggplot(heightweight, aes(x=weightLb,y=heightIn)) + 
  geom_point(aes(color=ageYear)) # Illustrative choice of variable

Adding Fitted Models

Use stat_smooth function to add a fitted model to the plot:

ggplot(heightweight, aes(x=weightLb,y=heightIn)) + 
  geom_point(aes(shape=sex,color=sex),size=4) +
  scale_shape_manual(values=c(1,4)) +
  scale_color_manual(values=c("blue","green")) +
  stat_smooth(method = lm, level=0.95)


Moving the color=sex mapping into the ggplot() call makes stat_smooth() inherit it, so one line is fitted per sex:

ggplot(heightweight, aes(x=weightLb,y=heightIn, color=sex)) + 
  geom_point(aes(shape=sex),size=4) +
  scale_shape_manual(values=c(1,4)) +
  scale_color_manual(values=c("blue","green")) +
  stat_smooth(method = lm, level=0.95)


Labeling individual points

To label data points, use either annotate or geom_text

ggplot(heightweight, aes(x=weightLb,y=heightIn)) + 
  geom_point(aes(shape=sex,color=sex,size=ageYear)) +
  annotate("text",x=150,y=68,label="Some label",color="darkgreen",size=12)


ggplot(heightweight, aes(x=weightLb,y=heightIn)) + 
  geom_point(aes(shape=sex,color=sex,size=ageYear)) +
  geom_text(aes(label=ageYear), vjust=-1) # Label choice is illustrative

Stat Functions

  • Some plots visualize a transformation of the original dataset.
  • Use a stat_xxx function to choose a common transformation to visualize.

We have seen the stat_smooth() function:

ggplot(heightweight, aes(x=weightLb,y=heightIn, color=sex)) + 
  geom_point(aes(shape=sex),size=4) +
  scale_shape_manual(values=c(1,4)) +
  scale_color_manual(values=c("blue","green")) +
  stat_smooth(method = lm, level=0.95)


Stat Functions

Another example: stat_bin() function creates a frequency count:

ggplot(mpg,aes(x=hwy)) + stat_bin(binwidth = 5)


This is equivalent to:

ggplot(mpg,aes(x=hwy)) + geom_histogram(binwidth = 5)


ggplot(mpg,aes(x=hwy)) + 
  geom_histogram(stat="bin", binwidth = 5) 
# The "bin" stat is the implied default for histogram
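
To look at the transformed data that the bin stat hands to the geom, one option is layer_data(). A small sketch (the names p_bin and binned are illustrative):

```r
library(ggplot2)

p_bin <- ggplot(mpg, aes(x = hwy)) + stat_bin(binwidth = 5)

# layer_data() returns the data frame a layer draws from, i.e. the
# output of the stat rather than the original observations.
binned <- layer_data(p_bin, 1)
binned[, c("xmin", "xmax", "count")]

# Every observation lands in exactly one bin.
sum(binned$count) == nrow(mpg)
```
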

Stat Functions

Density plot with stat_density:

ggplot(mpg,aes(x=hwy)) + stat_density()


Or the same plot with geom_histogram:

ggplot(mpg,aes(x=hwy)) + geom_histogram(stat="density")


Saving Plot to An Object

  • A ggplot plot can be saved in an object
  • More convenient when you are experimenting
p <- ggplot(heightweight, aes(x=weightLb,y=heightIn))
p + geom_point(aes(shape=sex,color=sex,size=ageYear))


# Here we use the saved plot object "p"
p + geom_smooth(method=lm)


Saving Plots to Files

With ggplot2 one can use the ggsave() function to save a plot:

ggplot(heightweight, aes(x=weightLb,y=heightIn, color=sex)) + 
  geom_point(aes(shape=sex),size=4) +
  scale_shape_manual(values=c(1,4)) +
  scale_color_manual(values=c("blue","green")) +
  stat_smooth(method = lm, level=0.99)
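
A minimal sketch of a ggsave() call. The mpg dataset, the temporary file path and the size settings are assumptions made for illustration; they are not the values used in the original slides:

```r
library(ggplot2)

p_save <- ggplot(mpg, aes(x = displ, y = hwy)) + geom_point()

# tempfile() keeps the sketch from writing into the working directory;
# in practice a path such as "myplot.png" would be used instead.
out_file <- tempfile(fileext = ".png")

# ggsave() picks the graphics device from the file extension.
# Width and height are in inches by default.
ggsave(out_file, plot = p_save, width = 6, height = 4, dpi = 150)

file.exists(out_file)
```

If the plot argument is omitted, ggsave() saves the last plot that was displayed.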

Plot Titles

To add a title, use either ggtitle or labs(title=)

p <- ggplot(heightweight, aes(x=weightLb,y=heightIn, color=sex)) + 
  geom_point(aes(shape=sex),size=4) +
  scale_shape_manual(values=c(1,4)) +
  scale_color_manual(values=c("blue","green"))
p + ggtitle("Height ~ weight of school children")

Note the title is left-aligned by default.

Axis Labels

To add axis labels, use either xlab() and ylab() or labs(x=, y=)

p + ggtitle("Height ~ weight of school children") +
  xlab("Weight (lbs)") + ylab("Height (inch)")


Legend Titles

  • Use labs(<aes>=) to specify legend titles
p + ggtitle("Height ~ weight of school children") +
  xlab("Weight (lbs)") + ylab("Height (inch)") +
  labs(color='Gender', shape='Gender')



  • Use the guides() function to set the legend type for each aesthetic property.


p <- ggplot(heightweight, aes(x=weightLb,y=heightIn,color=ageYear)) + 
  geom_point(aes(shape=sex), size=4) # Illustrative layer
p


p + guides(shape='none',color='legend')

Themes


  • Themes decide the appearance of a plot
  • ggplot2 provides a few pre-defined themes for users to choose from

The classic theme:

p <- ggplot(heightweight, aes(x=weightLb,y=heightIn, color=sex)) + 
  geom_point(aes(shape=sex),size=4) # Illustrative layer
p + theme_classic()

The dark theme:

p + theme_dark()


Package ggthemes

  • Additional themes are available from the ggthemes package

Example: Excel theme

p + theme_excel()


Fine-tuning the Theme

  • Most elements related to appearance are controlled by the theme() function.
    • Fonts (family, size, color etc.)
    • Background color
    • Grid lines
    • Axis ticks

Removing the grid lines:

p + theme_bw() +
  theme(panel.grid = element_blank())


Or just removing the vertical ones:

p + theme_bw() +
  theme(panel.grid.major.x = element_blank(),
        panel.grid.minor.x = element_blank())


Customizing Fonts

Change the base size and font family:

p + theme_bw(base_size = 24, base_family = "Times")


Or fine tune each element:

p + theme_bw(base_size = 24, base_family = "Times") +
  theme(legend.title = element_text(size=20,color="blue"),# Legend title
        legend.text = element_text(size=18,color="red"), # Legend text
        axis.title.x = element_text(size=18,color="red"), # X axis label
        axis.title.y = element_blank()) # Remove Y axis label

The element_blank() function can be used to remove undesired elements.

Changing Legend Position

p + theme_bw(base_size = 24, base_family = "Times") +
  theme(legend.position = "bottom")


p + theme_bw(base_size = 24, base_family = "Times") +
  theme(legend.position = c(0.9,0.1))


List of Theme Elements

Elements that can be adjusted with the theme() function:
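
One way to list the element names from R itself is to inspect a complete theme. A rough sketch (the variable name elements is illustrative, and the exact set depends on the installed ggplot2 version):

```r
library(ggplot2)

# A complete theme such as theme_grey() carries an entry for the
# elements it sets, so its names show what theme() can adjust.
elements <- names(theme_grey())
length(elements)
head(elements, 10)

# Narrow the list down, e.g. to everything related to the legend.
grep("^legend", elements, value = TRUE)
```
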

Reset the default theme

  • The default theme is theme_grey()
  • Use theme_set() to change the default
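
A minimal sketch of switching the default and restoring it afterwards. The mpg dataset and the names p_theme and old_theme are assumptions made for illustration:

```r
library(ggplot2)

p_theme <- ggplot(mpg, aes(x = displ, y = hwy)) + geom_point()

# theme_set() installs a new default and invisibly returns the old one.
old_theme <- theme_set(theme_bw())
p_theme                 # now drawn with theme_bw()

# Restore the previous default when done.
theme_set(old_theme)
p_theme                 # drawn with the previous default again
```
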

With old default:



With new default:



Creating Your Own Theme

  • You can create your own theme and reuse it later:
mytheme <- theme_bw(base_size = 24, base_family = "Times") +
  theme(legend.title = element_text(size=20,color="blue"),# Legend title
        legend.text = element_text(size=18,color="red"), # Legend text
        axis.title.x = element_text(size=18,color="red"), # X axis label
        axis.title.y = element_blank()) # Remove Y axis label
p + mytheme


Coordinate systems

Functions that control the coordinate system

  • coord_cartesian - the default Cartesian coordinates
  • coord_flip - flip X and Y
  • coord_polar - polar coordinates
  • coord_trans - transform Cartesian coordinates

Coordinate systems


g <- ggplot(mpg,aes(x=hwy)) + geom_histogram(binwidth=5, fill="white", color="black")


With flipped coordinates:

g + coord_flip()


Coordinate systems

With transformed Y coordinate:

g + coord_trans(y="sqrt")


Axis Limits

  • Use the xlim() and ylim() functions to set the range of axes:
p + theme_light() +
  xlim(0,200) +
  ylim(0,100) # Illustrative Y range

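
Note that xlim() and ylim() set scale limits, so observations outside the range are dropped before any statistic is computed, whereas coord_cartesian() only zooms the view. A hedged sketch of the difference, using the mpg dataset and an illustrative range of 20 to 40 (the names h, limited and zoomed are also illustrative):

```r
library(ggplot2)

h <- ggplot(mpg, aes(x = hwy)) + geom_histogram(binwidth = 5)

# xlim() sets scale limits: observations outside 20-40 become NA and are
# dropped (with a warning) before the histogram is binned.
limited <- suppressWarnings(layer_data(h + xlim(20, 40)))

# coord_cartesian() only zooms the view: all observations are still binned.
zoomed <- layer_data(h + coord_cartesian(xlim = c(20, 40)))

sum(limited$count)   # fewer than nrow(mpg)
sum(zoomed$count)    # equal to nrow(mpg)
```
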


Scale Functions

  • The scale_<aes>_(continuous|discrete|manual|identity|...) family of functions controls how data values are mapped to aesthetic values
    • Color
    • Shape
    • Size
    • Alpha (transparency)
    • X and Y location

X and Y scales

scale_x_continuous: scale for X, which is a continuous variable

p + theme_bw() +
  ylim(50,100) +
  scale_x_continuous(limits=c(0,200)) # Illustrative limits

X and Y scales

p + theme_economist_white() +
  scale_x_log10(limits=c(5,500)) + # Plot X on a log10 scale
  scale_y_reverse() # Reverse the Y scale

Legend Labels

  • Scale functions can be used to customize legend labels
    • Color, shape, size, fill etc.
ggplot(mpg,aes(x=drv,y=cty,fill=drv)) +
  geom_boxplot()

ggplot(mpg,aes(x=drv,y=cty,fill=drv)) +
  geom_boxplot() +
  scale_fill_discrete(limits=c("f","r","4"), # Fix the legend order so the labels line up
                      labels=c("Front","Rear","4 Wheel Drive"))

Other scales

By default:

ggplot(mpg,aes(x=displ,y=hwy,
               color=drv,shape=fl)) +
  geom_point()


ggplot(mpg,aes(x=displ,y=hwy,size=cyl,color=drv, alpha=cty)) +
  geom_point() +
  scale_size_identity() + # Use the values of "cyl" variable for size
  scale_color_manual(values=c("darkblue","rosybrown2","#24FA22"))


Faceting

  • Facets divide a plot into subplots based on the values of one or more discrete variables.
  • Faceting in ggplot2 is managed by the functions facet_grid and facet_wrap.

facet_grid: creates a row of panels defined by the variable “drv”:

ggplot(mpg, aes(displ, hwy)) + 
  geom_point() +
  facet_grid(. ~ drv)



facet_grid: creates a column of panels defined by the variable “fl”:

ggplot(mpg, aes(displ, hwy)) + 
  geom_point() +
  facet_grid(fl ~ .)



facet_grid: creates a matrix of panels defined by the variables “fl” and “drv”:

ggplot(mpg, aes(displ, hwy)) + 
  geom_point() +
  facet_grid(fl ~ drv)



facet_wrap: wraps a 1D sequence of panels into a 2D grid:

ggplot(mpg, aes(displ, hwy)) + 
  geom_point() +
  facet_wrap(~class, nrow=3)


Online Interactive Graphics

Open your browser and try:

The code:

g <- ggplot(nmmaps, aes(date, temp, color=factor(season)))+  geom_point() +
  scale_color_manual(values=c("dodgerblue4", "darkolivegreen4",
                              "darkorchid3", "goldenrod1"))
#ggplotly(g) # offline
api_create(g, filename = NULL, fileopt = "new", sharing = "public")

Further Reading