Introduction to R Graphics

Le Yan
HPC@LSU

Disclaimer

R Graphic Systems

  • There are at least three plotting systems in R
    • base
    • lattice
    • ggplot2
  • Today we will touch on “base” briefly then focus on “ggplot2”

Outline

  • Base plot system
  • ggplot2 plot system
    • Basic concepts
    • Geom and stat functions
    • Title, axis labels and legends
    • Themes
    • Scale functions
    • Coordinate systems
    • Faceting

DataSets

  • Datasets are from the packages datasets, gcookbook and ggplot2
  • To learn more about the datasets in a package, run library(help='<name>')
  • For information on individual datasets, run ?<dataset>
library(help='datasets')
        Information on package 'datasets'

Description:

Package:       datasets
Version:       3.3.3
Priority:      base
Title:         The R Datasets Package
Author:        R Core Team and contributors worldwide
Maintainer:    R Core Team <R-core@r-project.org>
Description:   Base R datasets.
License:       Part of R 3.3.3
Built:         R 3.3.3; ; 2017-03-06 14:15:22 UTC; windows

Index:

AirPassengers           Monthly Airline Passenger Numbers 1949-1960
BJsales                 Sales Data with Leading Indicator
BOD                     Biochemical Oxygen Demand
CO2                     Carbon Dioxide Uptake in Grass Plants
ChickWeight             Weight versus age of chicks on different diets
DNase                   Elisa assay of DNase
EuStockMarkets          Daily Closing Prices of Major European Stock
...
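
The package index can also be queried from code. The following is a minimal sketch using only base R; the variable name info is an illustrative choice.

```r
# data(package = ...) returns an object whose $results element is a
# character matrix with the columns Package, LibPath, Item and Title.
info <- data(package = "datasets")$results

# Show the first few dataset names and their one-line descriptions.
head(info[, c("Item", "Title")])

# Search the descriptions for a keyword (drop = FALSE keeps the matrix
# shape even when only one row matches).
hits <- grepl("pressure", info[, "Title"], ignore.case = TRUE)
info[hits, c("Item", "Title"), drop = FALSE]
```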

First Plot in R

  • We will use the “pressure” dataset for our first plot.

First, let's examine the data:

str(pressure)
'data.frame':   19 obs. of  2 variables:
 $ temperature: num  0 20 40 60 80 100 120 140 160 180 ...
 $ pressure   : num  0.0002 0.0012 0.006 0.03 0.09 0.27 0.75 1.85 4.2 8.8 ...
summary(pressure)
  temperature     pressure       
 Min.   :  0   Min.   :  0.0002  
 1st Qu.: 90   1st Qu.:  0.1800  
 Median :180   Median :  8.8000  
 Mean   :180   Mean   :124.3367  
 3rd Qu.:270   3rd Qu.:126.5000  
 Max.   :360   Max.   :806.0000  
?pressure
pressure {datasets} R Documentation
Vapor Pressure of Mercury as a Function of Temperature

Description

Data on the relation between temperature in degrees Celsius and vapor pressure of mercury in millimeters (of mercury).

Usage

pressure
Format

A data frame with 19 observations on 2 variables.

[, 1]    temperature     numeric     temperature (deg C)
[, 2]    pressure    numeric     pressure (mm)
Source

Weast, R. C., ed. (1973) Handbook of Chemistry and Physics. CRC Press.

References

McNeil, D. R. (1977) Interactive Data Analysis. New York: Wiley.

First Plot in R

We can use the plot() function in the base plot system to create a scatter plot:

# Simply specify the x and y variables.
plot(pressure$temperature,pressure$pressure) 

plot of chunk unnamed-chunk-5

Since there are only two variables, we can simply run:

plot(pressure)

plot of chunk unnamed-chunk-6

More Plot Types

  • The type argument of plot() can be used to specify plot type

Line plot with “l”:

plot(pressure,type="l")

plot of chunk unnamed-chunk-7

Or dot and line with “b”:

plot(pressure,type="b")

plot of chunk unnamed-chunk-8
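
To compare several values of the type argument side by side, one option is to loop over them in a 2 x 2 layout. This is only a sketch; the selection of types and the titles are illustrative choices.

```r
# A few of the values accepted by the `type` argument of plot().
types <- c(p = "points", l = "lines", b = "both", s = "steps")

# Split the device into a 2 x 2 grid and remember the old settings.
old_par <- par(mfrow = c(2, 2))

for (t in names(types)) {
  plot(pressure, type = t,
       main = paste0('type = "', t, '" (', types[[t]], ")"))
}

# Restore the previous layout.
par(old_par)
```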

Adding More Layers

There are a few functions that can be used to add more elements/layers to the plot

  • Points: points()
  • Lines: lines()
  • Text: text()
# Create the plot with title and axis labels.
plot(pressure,type="l",
     main="Vapor Pressure of Mercury",
     xlab="Temperature", 
     ylab="Vapor Pressure")

# Add points
points(pressure,cex=1.5,col='red') # cex scales the symbol size; base points() has no size argument

# Add annotation
text(150,700,
     "Source: Weast, R. C., ed. (1973) Handbook\nof Chemistry and Physics. CRC Press.")

plot of chunk unnamed-chunk-9
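
The lines() function works the same way as points(). As a sketch, the example below overlays a lowess smooth on the scatter plot and adds a legend; the smoother and the legend entries are illustrative choices, not part of the original slides.

```r
# Start with a scatter plot of the raw observations.
plot(pressure, type = "p",
     main = "Vapor Pressure of Mercury",
     xlab = "Temperature (deg C)",
     ylab = "Vapor Pressure (mm Hg)")

# lowess() returns a list with x and y components, which lines()
# accepts directly and draws on top of the existing plot.
smoothed <- lowess(pressure$temperature, pressure$pressure)
lines(smoothed, col = "blue", lwd = 2)

# legend() is another function that adds an element to the current plot.
legend("topleft", legend = c("Observed", "Lowess smooth"),
       col = c("black", "blue"), pch = c(1, NA), lty = c(NA, 1))
```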

Boxplot

  • Use boxplot() for boxplots

First, examine the mpg dataset (from the ggplot2 package):

str(mpg)
Classes 'tbl_df', 'tbl' and 'data.frame':   234 obs. of  11 variables:
 $ manufacturer: chr  "audi" "audi" "audi" "audi" ...
 $ model       : chr  "a4" "a4" "a4" "a4" ...
 $ displ       : num  1.8 1.8 2 2 2.8 2.8 3.1 1.8 1.8 2 ...
 $ year        : int  1999 1999 2008 2008 1999 1999 2008 1999 1999 2008 ...
 $ cyl         : int  4 4 4 4 6 6 6 4 4 4 ...
 $ trans       : chr  "auto(l5)" "manual(m5)" "manual(m6)" "auto(av)" ...
 $ drv         : chr  "f" "f" "f" "f" ...
 $ cty         : int  18 21 20 21 16 18 18 18 16 20 ...
 $ hwy         : int  29 29 31 30 26 26 27 26 25 28 ...
 $ fl          : chr  "p" "p" "p" "p" ...
 $ class       : chr  "compact" "compact" "compact" "compact" ...
summary(mpg)
 manufacturer          model               displ            year     
 Length:234         Length:234         Min.   :1.600   Min.   :1999  
 Class :character   Class :character   1st Qu.:2.400   1st Qu.:1999  
 Mode  :character   Mode  :character   Median :3.300   Median :2004  
                                       Mean   :3.472   Mean   :2004  
                                       3rd Qu.:4.600   3rd Qu.:2008  
                                       Max.   :7.000   Max.   :2008  
      cyl           trans               drv                 cty       
 Min.   :4.000   Length:234         Length:234         Min.   : 9.00  
 1st Qu.:4.000   Class :character   Class :character   1st Qu.:14.00  
 Median :6.000   Mode  :character   Mode  :character   Median :17.00  
 Mean   :5.889                                         Mean   :16.86  
 3rd Qu.:8.000                                         3rd Qu.:19.00  
 Max.   :8.000                                         Max.   :35.00  
      hwy             fl               class          
 Min.   :12.00   Length:234         Length:234        
 1st Qu.:18.00   Class :character   Class :character  
 Median :24.00   Mode  :character   Mode  :character  
 Mean   :23.44                                        
 3rd Qu.:27.00                                        
 Max.   :44.00                                        
?mpg
Fuel economy data from 1999 and 2008 for 38 popular models of car

Description

This dataset contains a subset of the fuel economy data that the EPA makes available on http://fueleconomy.gov. It contains only models which had a new release every year between 1999 and 2008 - this was used as a proxy for the popularity of the car.

Usage

mpg
Format

A data frame with 234 rows and 11 variables

manufacturer
model
model name

displ
engine displacement, in litres

year
year of manufacture

cyl
number of cylinders

trans
type of transmission

drv
f = front-wheel drive, r = rear wheel drive, 4 = 4wd

cty
city miles per gallon

hwy
highway miles per gallon

fl
fuel type

class
"type" of car

Boxplot

# Suppress the default axis labels so that title() does not overprint them.
boxplot(hwy ~ cyl, data=mpg, xlab="", ylab="")

# Use the title function to add title and labels.
title("Highway Mileage per Gallon",
      xlab = "Number of cylinders",
      ylab = "Mileage (per gallon)")

plot of chunk unnamed-chunk-13

Histogram

  • The hist() function can be used to create histograms
hist(mpg$hwy)

plot of chunk unnamed-chunk-14

hist(mpg$hwy, breaks=c(5,15,25,30,50))

plot of chunk unnamed-chunk-15
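
When the breaks are not equally spaced, hist() plots densities rather than counts by default, so that bar areas stay comparable. The sketch below inspects the object returned by hist(); it assumes the base mtcars dataset (instead of mpg) so that it runs without ggplot2, and the break points are illustrative.

```r
# Unequal bin widths: 5, 5, 5 and 10.
breaks <- c(10, 15, 20, 25, 35)

# plot = FALSE returns the computed histogram without drawing it.
h <- hist(mtcars$mpg, breaks = breaks, plot = FALSE)

h$counts    # number of observations in each bin
h$density   # counts divided by (n * bin width)
h$equidist  # FALSE because the bins differ in width

# The bar areas (density * width) add up to 1.
sum(h$density * diff(h$breaks))
```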

Curve

The curve() function draws a function over a specified range.

curve(cos,-3*pi, 3*pi)
title("Cosine Function")

# The abline() function adds one or more straight lines to the current plot
abline(h=c(-1,0,1), 
       col = 2, lty = 2, lwd = 1.5) 

plot of chunk unnamed-chunk-16
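
Besides a function name, curve() also accepts an expression written in terms of x, and add = TRUE overlays a second curve on the current plot. A small sketch follows; the functions plotted are arbitrary examples.

```r
# Plot an expression in x. n = 200 gives an even number of evaluation
# points, so x = 0 (where sin(x) / x is undefined) is not sampled.
xy <- curve(sin(x) / x, from = -3 * pi, to = 3 * pi, n = 200,
            ylab = "sin(x) / x", main = "Two Curves on One Plot")

# add = TRUE draws on the existing plot instead of starting a new one.
curve(cos(x) / 4, add = TRUE, col = "red", lty = 2)
abline(h = 0, col = "gray")

# curve() invisibly returns the evaluated points as a list.
str(xy)
```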

Panel Grid of Plots

  • When a data frame with more than two variables is passed to plot() without x and y being specified, it generates a scatterplot matrix: a panel grid with one plot for each pair of variables.
str(airquality)
'data.frame':   153 obs. of  6 variables:
 $ Ozone  : int  41 36 12 18 NA 28 23 19 8 NA ...
 $ Solar.R: int  190 118 149 313 NA NA 299 99 19 194 ...
 $ Wind   : num  7.4 8 12.6 11.5 14.3 14.9 8.6 13.8 20.1 8.6 ...
 $ Temp   : int  67 72 74 62 56 66 65 59 61 69 ...
 $ Month  : int  5 5 5 5 5 5 5 5 5 5 ...
 $ Day    : int  1 2 3 4 5 6 7 8 9 10 ...
plot(airquality)

plot of chunk unnamed-chunk-18
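
The same kind of grid can be requested directly with pairs(), which makes it easy to restrict the plot to a few columns. This is a sketch; the choice of columns and the color are illustrative.

```r
# Restrict the grid to three variables of interest.
vars <- c("Ozone", "Wind", "Temp")
pairs(airquality[, vars], main = "Selected airquality variables")

# pairs() also has a formula interface.
pairs(~ Ozone + Wind + Temp, data = airquality, col = "steelblue")
```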

Saving Plots to Files

  • Steps to save a plot to a file
    • Open a device
    • Create the plot
    • Close the device
  • Supported devices
    • Vector: svg
    • Bitmap: jpeg,tiff,png,bmp
    • PDF: pdf
    • PostScript: postscript

Example: saving to a PNG file

png("test.png",width=5*240,height=3*240)
plot(pressure, type="l")
points(pressure,col="red")
dev.off()

Here is the saved graph:
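
The same open, plot, close sequence applies to the other devices. Below is a sketch for the pdf device; note that pdf() takes its width and height in inches rather than pixels. The file name and the use of tempdir() are illustrative choices.

```r
# Write to a temporary directory so that the example leaves no files behind.
out_file <- file.path(tempdir(), "pressure.pdf")

pdf(out_file, width = 5, height = 3)   # 1. open the device (size in inches)
plot(pressure, type = "l")             # 2. create the plot
points(pressure, col = "red")
dev.off()                              # 3. close the device

# Confirm that the file was written.
file.exists(out_file)
```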

ggplot2 Package

  • “gg” stands for grammar-of-graphics
  • Any data graphics can be described by specifying
    • A dataset
    • Visual marks that represent data points
    • A coordinate system
  • The ggplot2 package is an R implementation of this grammar
    • Versatile
    • Clear and consistent interface
    • Beautiful output

qplot Function

  • The qplot() function from the ggplot2 package is similar to the plot() function in the base system (recent ggplot2 releases deprecate qplot() in favor of ggplot()).

Examine the data:

str(heightweight)
'data.frame':   236 obs. of  5 variables:
 $ sex     : Factor w/ 2 levels "f","m": 1 1 1 1 1 1 1 1 1 1 ...
 $ ageYear : num  11.9 12.9 12.8 13.4 15.9 ...
 $ ageMonth: int  143 155 153 161 191 171 185 142 160 140 ...
 $ heightIn: num  56.3 62.3 63.3 59 62.5 62.5 59 56.5 62 53.8 ...
 $ weightLb: num  85 105 108 92 112 ...
summary(heightweight)
 sex        ageYear         ageMonth        heightIn        weightLb    
 f:111   Min.   :11.58   Min.   :139.0   Min.   :50.50   Min.   : 50.5  
 m:125   1st Qu.:12.33   1st Qu.:148.0   1st Qu.:58.73   1st Qu.: 85.0  
         Median :13.58   Median :163.0   Median :61.50   Median :100.5  
         Mean   :13.67   Mean   :164.1   Mean   :61.34   Mean   :101.0  
         3rd Qu.:14.83   3rd Qu.:178.0   3rd Qu.:64.30   3rd Qu.:112.0  
         Max.   :17.50   Max.   :210.0   Max.   :72.00   Max.   :171.5  
?heightweight
heightweight {gcookbook}    R Documentation
Height and weight of schoolchildren

Description

Height and weight of schoolchildren

Variables

sex

ageYear: Age in years.

ageMonth: Age in months.

heightIn: Height in inches.

weightLb: Weight in pounds.

Source

Lewis, T., & Taylor, L.R. (1967), Introduction to Experimental Ecology, Academic Press.

Scatterplot with qplot Function

  • Data represented by points
qplot(weightLb, heightIn, data=heightweight, geom="point")

plot of chunk unnamed-chunk-23

  • Data represented by labels
qplot(weightLb, heightIn, data=heightweight, geom ="text", label=ageYear)

plot of chunk unnamed-chunk-24

A Fancier Plot

plot of chunk unnamed-chunk-25

A Fancier Plot

This is what is under the hood:

ggplot(heightweight, aes(x=weightLb, y=heightIn, color=sex, shape=sex)) + 
  geom_point(size=3.5) +
  ggtitle("School Children\nHeight ~ Weight") +
  labs(y="Height (inch)", x="Weight (lbs)") +
  stat_smooth(method=loess, se=TRUE, color="black", fullrange=TRUE) +
  annotate("text",x=145,y=75,label="Locally weighted polynomial fit with 95% CI",color="Green",size=6) +
  scale_color_brewer(palette = "Set1", labels=c("Female", "Male")) +
  guides(shape="none") +
  theme_bw() +
  theme(plot.title = element_text(size=20, hjust=0.5), 
        legend.position = c(0.9,0.2),
        axis.title.x = element_text(size=20), axis.title.y = element_text(size=20),
        legend.title = element_text(size=15),legend.text = element_text(size=15))

Don't Panic!!!

Basic Concepts of ggplot2

Grammar of Graphics components:

  • Data: Use the ggplot function to indicate what data to use
  • Visual marks: Use geom_xxx functions to indicate what types of visual marks to use
    • Points, lines, area, etc.
  • Mapping: Use aesthetic properties (aes() function) to map variables to visual marks
    • Color, shape, size, x, y, etc.
ggplot(heightweight, # What data to use
       aes(x=weightLb,y=heightIn)) + # Aesthetic specifies variables
  geom_point() # Geom specifies visual marks 

plot of chunk unnamed-chunk-27

This is equivalent to:

qplot(weightLb, heightIn, data=heightweight, geom="point")

Histogram with ggplot2

ggplot(mpg,aes(x=hwy)) + geom_histogram(binwidth=5, fill="white", color="black")

plot of chunk unnamed-chunk-29

Contour Plots with ggplot2

ggplot(faithfuld, aes(waiting, eruptions, z = density))+
  geom_raster(aes(fill = density)) +
  geom_contour(colour = "white")

plot of chunk unnamed-chunk-30

Maps with ggplot2

  • Combined with the maps package, one can create geographical graphs
east_asia <- map_data("world", 
                      region=c("Japan","China","North Korea","South Korea"))
ggplot(east_asia, aes(x=long,y=lat,group=group, fill=region)) + 
  geom_polygon(color="black") + 
  scale_fill_brewer(palette="Set2")

plot of chunk unnamed-chunk-32

List of Geoms in ggplot2

There are more than 30 geoms in ggplot2

  • One variable
    • geom_bar
    • geom_area
  • Two variables
    • geom_point
    • geom_smooth
    • geom_text
    • geom_boxplot
  • Graphic primitives
    • geom_path
    • geom_polygon
  • Error visualization
    • geom_errorbar
  • Special
    • geom_map
    • geom_contour

Customizing Appearance of Data Points

  • The appearance of data points can be customized through arguments to the geom functions
    • Color
    • Shape (symbol)
    • Size
    • Alpha (transparency)
ggplot(heightweight, aes(x=weightLb,y=heightIn)) + 
  geom_point(shape=13,size=5,color='red',alpha=0.5)

plot of chunk unnamed-chunk-33

List of Symbols

  • The built-in point shapes are numbered 0 to 25

plot of chunk unnamed-chunk-34
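
A chart of the shape codes can be drawn with a few lines of base R. This is only a sketch (the layout and colors are arbitrary), and the slide's original figure may have been produced differently.

```r
# Shape codes 0 through 25, laid out in rows of six.
shape_ids <- 0:25
col_pos <- shape_ids %% 6
row_pos <- shape_ids %/% 6

# The y-axis limits are reversed so that code 0 appears at the top left.
# `bg` sets the fill color, which only applies to shapes 21 to 25.
plot(col_pos, row_pos, pch = shape_ids, cex = 2,
     col = "black", bg = "lightblue",
     axes = FALSE, xlab = "", ylab = "",
     xlim = c(-0.5, 5.5), ylim = c(4.8, -0.5),
     main = "Point shapes 0 to 25")

# Print each code just below its symbol.
text(col_pos, row_pos + 0.35, labels = shape_ids, cex = 0.8)
```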

Notes on Colors

  • A list of possible color names can be obtained with the function colors()
  • Can also use hex values
    • Starts with a “#”
colors()
  [1] "white"                "aliceblue"            "antiquewhite"        
  [4] "antiquewhite1"        "antiquewhite2"        "antiquewhite3"       
  [7] "antiquewhite4"        "aquamarine"           "aquamarine1"         
 [10] "aquamarine2"          "aquamarine3"          "aquamarine4"         
 [13] "azure"                "azure1"               "azure2"              
 [16] "azure3"               "azure4"               "beige"               
 [19] "bisque"               "bisque1"              "bisque2"             
 [22] "bisque3"              "bisque4"              "black"               
 [25] "blanchedalmond"       "blue"                 "blue1"               
 [28] "blue2"                "blue3"                "blue4"               
 [31] "blueviolet"           "brown"                "brown1"              
 [34] "brown2"               "brown3"               "brown4"              
 [37] "burlywood"            "burlywood1"           "burlywood2"          
 [40] "burlywood3"           "burlywood4"           "cadetblue"           
 [43] "cadetblue1"           "cadetblue2"           "cadetblue3"          
 [46] "cadetblue4"           "chartreuse"           "chartreuse1"         
 [49] "chartreuse2"          "chartreuse3"          "chartreuse4"         
 [52] "chocolate"            "chocolate1"           "chocolate2"          
 [55] "chocolate3"           "chocolate4"           "coral"               
 [58] "coral1"               "coral2"               "coral3"              
 [61] "coral4"               "cornflowerblue"       "cornsilk"            
 [64] "cornsilk1"            "cornsilk2"            "cornsilk3"           
 [67] "cornsilk4"            "cyan"                 "cyan1"               
 [70] "cyan2"                "cyan3"                "cyan4"               
 [73] "darkblue"             "darkcyan"             "darkgoldenrod"       
 [76] "darkgoldenrod1"       "darkgoldenrod2"       "darkgoldenrod3"      
 [79] "darkgoldenrod4"       "darkgray"             "darkgreen"           
 [82] "darkgrey"             "darkkhaki"            "darkmagenta"         
 [85] "darkolivegreen"       "darkolivegreen1"      "darkolivegreen2"     
 [88] "darkolivegreen3"      "darkolivegreen4"      "darkorange"          
 [91] "darkorange1"          "darkorange2"          "darkorange3"         
 [94] "darkorange4"          "darkorchid"           "darkorchid1"         
 [97] "darkorchid2"          "darkorchid3"          "darkorchid4"         
[100] "darkred"              "darksalmon"           "darkseagreen"        
[103] "darkseagreen1"        "darkseagreen2"        "darkseagreen3"       
[106] "darkseagreen4"        "darkslateblue"        "darkslategray"       
[109] "darkslategray1"       "darkslategray2"       "darkslategray3"      
[112] "darkslategray4"       "darkslategrey"        "darkturquoise"       
[115] "darkviolet"           "deeppink"             "deeppink1"           
[118] "deeppink2"            "deeppink3"            "deeppink4"           
[121] "deepskyblue"          "deepskyblue1"         "deepskyblue2"        
[124] "deepskyblue3"         "deepskyblue4"         "dimgray"             
[127] "dimgrey"              "dodgerblue"           "dodgerblue1"         
[130] "dodgerblue2"          "dodgerblue3"          "dodgerblue4"         
[133] "firebrick"            "firebrick1"           "firebrick2"          
[136] "firebrick3"           "firebrick4"           "floralwhite"         
[139] "forestgreen"          "gainsboro"            "ghostwhite"          
[142] "gold"                 "gold1"                "gold2"               
[145] "gold3"                "gold4"                "goldenrod"           
[148] "goldenrod1"           "goldenrod2"           "goldenrod3"          
[151] "goldenrod4"           "gray"                 "gray0"               
[154] "gray1"                "gray2"                "gray3"               
[157] "gray4"                "gray5"                "gray6"               
[160] "gray7"                "gray8"                "gray9"               
[163] "gray10"               "gray11"               "gray12"              
[166] "gray13"               "gray14"               "gray15"              
[169] "gray16"               "gray17"               "gray18"              
[172] "gray19"               "gray20"               "gray21"              
[175] "gray22"               "gray23"               "gray24"              
[178] "gray25"               "gray26"               "gray27"              
[181] "gray28"               "gray29"               "gray30"              
[184] "gray31"               "gray32"               "gray33"              
[187] "gray34"               "gray35"               "gray36"              
[190] "gray37"               "gray38"               "gray39"              
[193] "gray40"               "gray41"               "gray42"              
[196] "gray43"               "gray44"               "gray45"              
[199] "gray46"               "gray47"               "gray48"              
[202] "gray49"               "gray50"               "gray51"              
[205] "gray52"               "gray53"               "gray54"              
[208] "gray55"               "gray56"               "gray57"              
[211] "gray58"               "gray59"               "gray60"              
[214] "gray61"               "gray62"               "gray63"              
[217] "gray64"               "gray65"               "gray66"              
[220] "gray67"               "gray68"               "gray69"              
[223] "gray70"               "gray71"               "gray72"              
[226] "gray73"               "gray74"               "gray75"              
[229] "gray76"               "gray77"               "gray78"              
[232] "gray79"               "gray80"               "gray81"              
[235] "gray82"               "gray83"               "gray84"              
[238] "gray85"               "gray86"               "gray87"              
[241] "gray88"               "gray89"               "gray90"              
[244] "gray91"               "gray92"               "gray93"              
[247] "gray94"               "gray95"               "gray96"              
[250] "gray97"               "gray98"               "gray99"              
[253] "gray100"              "green"                "green1"              
[256] "green2"               "green3"               "green4"              
[259] "greenyellow"          "grey"                 "grey0"               
[262] "grey1"                "grey2"                "grey3"               
[265] "grey4"                "grey5"                "grey6"               
[268] "grey7"                "grey8"                "grey9"               
[271] "grey10"               "grey11"               "grey12"              
[274] "grey13"               "grey14"               "grey15"              
[277] "grey16"               "grey17"               "grey18"              
[280] "grey19"               "grey20"               "grey21"              
[283] "grey22"               "grey23"               "grey24"              
[286] "grey25"               "grey26"               "grey27"              
[289] "grey28"               "grey29"               "grey30"              
[292] "grey31"               "grey32"               "grey33"              
[295] "grey34"               "grey35"               "grey36"              
[298] "grey37"               "grey38"               "grey39"              
[301] "grey40"               "grey41"               "grey42"              
[304] "grey43"               "grey44"               "grey45"              
[307] "grey46"               "grey47"               "grey48"              
[310] "grey49"               "grey50"               "grey51"              
[313] "grey52"               "grey53"               "grey54"              
[316] "grey55"               "grey56"               "grey57"              
[319] "grey58"               "grey59"               "grey60"              
[322] "grey61"               "grey62"               "grey63"              
[325] "grey64"               "grey65"               "grey66"              
[328] "grey67"               "grey68"               "grey69"              
[331] "grey70"               "grey71"               "grey72"              
[334] "grey73"               "grey74"               "grey75"              
[337] "grey76"               "grey77"               "grey78"              
[340] "grey79"               "grey80"               "grey81"              
[343] "grey82"               "grey83"               "grey84"              
[346] "grey85"               "grey86"               "grey87"              
[349] "grey88"               "grey89"               "grey90"              
[352] "grey91"               "grey92"               "grey93"              
[355] "grey94"               "grey95"               "grey96"              
[358] "grey97"               "grey98"               "grey99"              
[361] "grey100"              "honeydew"             "honeydew1"           
[364] "honeydew2"            "honeydew3"            "honeydew4"           
[367] "hotpink"              "hotpink1"             "hotpink2"            
[370] "hotpink3"             "hotpink4"             "indianred"           
[373] "indianred1"           "indianred2"           "indianred3"          
[376] "indianred4"           "ivory"                "ivory1"              
[379] "ivory2"               "ivory3"               "ivory4"              
[382] "khaki"                "khaki1"               "khaki2"              
[385] "khaki3"               "khaki4"               "lavender"            
[388] "lavenderblush"        "lavenderblush1"       "lavenderblush2"      
[391] "lavenderblush3"       "lavenderblush4"       "lawngreen"           
[394] "lemonchiffon"         "lemonchiffon1"        "lemonchiffon2"       
[397] "lemonchiffon3"        "lemonchiffon4"        "lightblue"           
[400] "lightblue1"           "lightblue2"           "lightblue3"          
[403] "lightblue4"           "lightcoral"           "lightcyan"           
[406] "lightcyan1"           "lightcyan2"           "lightcyan3"          
[409] "lightcyan4"           "lightgoldenrod"       "lightgoldenrod1"     
[412] "lightgoldenrod2"      "lightgoldenrod3"      "lightgoldenrod4"     
[415] "lightgoldenrodyellow" "lightgray"            "lightgreen"          
[418] "lightgrey"            "lightpink"            "lightpink1"          
[421] "lightpink2"           "lightpink3"           "lightpink4"          
[424] "lightsalmon"          "lightsalmon1"         "lightsalmon2"        
[427] "lightsalmon3"         "lightsalmon4"         "lightseagreen"       
[430] "lightskyblue"         "lightskyblue1"        "lightskyblue2"       
[433] "lightskyblue3"        "lightskyblue4"        "lightslateblue"      
[436] "lightslategray"       "lightslategrey"       "lightsteelblue"      
[439] "lightsteelblue1"      "lightsteelblue2"      "lightsteelblue3"     
[442] "lightsteelblue4"      "lightyellow"          "lightyellow1"        
[445] "lightyellow2"         "lightyellow3"         "lightyellow4"        
[448] "limegreen"            "linen"                "magenta"             
[451] "magenta1"             "magenta2"             "magenta3"            
[454] "magenta4"             "maroon"               "maroon1"             
[457] "maroon2"              "maroon3"              "maroon4"             
[460] "mediumaquamarine"     "mediumblue"           "mediumorchid"        
[463] "mediumorchid1"        "mediumorchid2"        "mediumorchid3"       
[466] "mediumorchid4"        "mediumpurple"         "mediumpurple1"       
[469] "mediumpurple2"        "mediumpurple3"        "mediumpurple4"       
[472] "mediumseagreen"       "mediumslateblue"      "mediumspringgreen"   
[475] "mediumturquoise"      "mediumvioletred"      "midnightblue"        
[478] "mintcream"            "mistyrose"            "mistyrose1"          
[481] "mistyrose2"           "mistyrose3"           "mistyrose4"          
[484] "moccasin"             "navajowhite"          "navajowhite1"        
[487] "navajowhite2"         "navajowhite3"         "navajowhite4"        
[490] "navy"                 "navyblue"             "oldlace"             
[493] "olivedrab"            "olivedrab1"           "olivedrab2"          
[496] "olivedrab3"           "olivedrab4"           "orange"              
[499] "orange1"              "orange2"              "orange3"             
[502] "orange4"              "orangered"            "orangered1"          
[505] "orangered2"           "orangered3"           "orangered4"          
[508] "orchid"               "orchid1"              "orchid2"             
[511] "orchid3"              "orchid4"              "palegoldenrod"       
[514] "palegreen"            "palegreen1"           "palegreen2"          
[517] "palegreen3"           "palegreen4"           "paleturquoise"       
[520] "paleturquoise1"       "paleturquoise2"       "paleturquoise3"      
[523] "paleturquoise4"       "palevioletred"        "palevioletred1"      
[526] "palevioletred2"       "palevioletred3"       "palevioletred4"      
[529] "papayawhip"           "peachpuff"            "peachpuff1"          
[532] "peachpuff2"           "peachpuff3"           "peachpuff4"          
[535] "peru"                 "pink"                 "pink1"               
[538] "pink2"                "pink3"                "pink4"               
[541] "plum"                 "plum1"                "plum2"               
[544] "plum3"                "plum4"                "powderblue"          
[547] "purple"               "purple1"              "purple2"             
[550] "purple3"              "purple4"              "red"                 
[553] "red1"                 "red2"                 "red3"                
[556] "red4"                 "rosybrown"            "rosybrown1"          
[559] "rosybrown2"           "rosybrown3"           "rosybrown4"          
[562] "royalblue"            "royalblue1"           "royalblue2"          
[565] "royalblue3"           "royalblue4"           "saddlebrown"         
[568] "salmon"               "salmon1"              "salmon2"             
[571] "salmon3"              "salmon4"              "sandybrown"          
[574] "seagreen"             "seagreen1"            "seagreen2"           
[577] "seagreen3"            "seagreen4"            "seashell"            
[580] "seashell1"            "seashell2"            "seashell3"           
[583] "seashell4"            "sienna"               "sienna1"             
[586] "sienna2"              "sienna3"              "sienna4"             
[589] "skyblue"              "skyblue1"             "skyblue2"            
[592] "skyblue3"             "skyblue4"             "slateblue"           
[595] "slateblue1"           "slateblue2"           "slateblue3"          
[598] "slateblue4"           "slategray"            "slategray1"          
[601] "slategray2"           "slategray3"           "slategray4"          
[604] "slategrey"            "snow"                 "snow1"               
[607] "snow2"                "snow3"                "snow4"               
[610] "springgreen"          "springgreen1"         "springgreen2"        
[613] "springgreen3"         "springgreen4"         "steelblue"           
[616] "steelblue1"           "steelblue2"           "steelblue3"          
[619] "steelblue4"           "tan"                  "tan1"                
[622] "tan2"                 "tan3"                 "tan4"                
[625] "thistle"              "thistle1"             "thistle2"            
[628] "thistle3"             "thistle4"             "tomato"              
[631] "tomato1"              "tomato2"              "tomato3"             
[634] "tomato4"              "turquoise"            "turquoise1"          
[637] "turquoise2"           "turquoise3"           "turquoise4"          
[640] "violet"               "violetred"            "violetred1"          
[643] "violetred2"           "violetred3"           "violetred4"          
[646] "wheat"                "wheat1"               "wheat2"              
[649] "wheat3"               "wheat4"               "whitesmoke"          
[652] "yellow"               "yellow1"              "yellow2"             
[655] "yellow3"              "yellow4"              "yellowgreen"         
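
Named colors and hex values are interchangeable. As a sketch, the helper below converts color names to their hex codes with base R; the function name to_hex is a hypothetical choice for illustration.

```r
# Number of built-in color names.
length(colors())

# col2rgb() returns a 3 x n matrix of red, green and blue values (0 to 255);
# rgb() accepts its transpose and builds the "#RRGGBB" strings.
to_hex <- function(name) {
  rgb(t(col2rgb(name)), maxColorValue = 255)
}

to_hex(c("red", "steelblue", "darkgreen"))
```

Either form can be passed wherever a color is expected, for example color="#4682B4" instead of color="steelblue".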

Adding More Layers to A Plot

  • New layers can be added to a plot by using geom_xxx functions
ggplot(heightweight, aes(x=weightLb,y=heightIn)) + 
  geom_point() + 
  geom_quantile(quantiles = c(0.25,0.5,0.75)) +
  geom_text(label=rownames(heightweight), vjust=-0.5)

plot of chunk unnamed-chunk-36

More on Aesthetic Mapping

  • Aesthetic mappings describe how variables in the data are mapped to visual properties
    • Colors, shapes, sizes, transparency etc.
    • Controlled by the aes() function
    • Can be specified in either ggplot function or individual layers
    • Aesthetic mappings specified in ggplot are the defaults, but can be overridden in individual layers
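
A short sketch of how defaults and overrides interact. It assumes the mpg dataset from ggplot2 (rather than heightweight) so that only ggplot2 needs to be installed; the variable choices are illustrative.

```r
library(ggplot2)

p <- ggplot(mpg, aes(x = displ, y = hwy, color = drv)) +  # default mappings
  geom_point() +                                          # inherits x, y and color
  # Setting an aesthetic to NULL inside a layer removes the inherited
  # mapping for that layer only, so a single line is fitted to all points.
  geom_smooth(aes(color = NULL), method = "lm", formula = y ~ x, se = FALSE)
p

# layer_data() shows what each layer actually draws:
length(unique(layer_data(p, 1)$colour))  # one color per drv level in the points
length(unique(layer_data(p, 2)$group))   # a single group in the smooth layer
```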

Mapping Discrete Variables to Aesthetic Properties

  • Discrete data values can be mapped to an aesthetic value to group data points

Example: use the “sex” variable to group data points by shape and color:

ggplot(heightweight, aes(x=weightLb,y=heightIn)) + 
  geom_point(aes(shape=sex,color=sex))

plot of chunk unnamed-chunk-37

Specify the shapes and colors manually (more on scale functions later):

ggplot(heightweight, aes(x=weightLb,y=heightIn)) + 
  geom_point(aes(shape=sex,color=sex),size=4) +
  scale_shape_manual(values=c(1,4)) +
  scale_color_manual(values=c("blue","green"))

plot of chunk unnamed-chunk-38

Mapping Continuous Variables to Aesthetic Properties

  • Continuous variables can be mapped to aesthetic values too
ggplot(heightweight, aes(x=weightLb,y=heightIn)) + 
  geom_point(aes(shape=sex,color=sex,size=ageYear))

plot of chunk unnamed-chunk-39

Adding Fitted Models

Use the stat_smooth() function to add a fitted model to the plot:

ggplot(heightweight, aes(x=weightLb,y=heightIn)) + 
  geom_point(aes(shape=sex,color=sex),size=4) +
  scale_shape_manual(values=c(1,4)) +
  scale_color_manual(values=c("blue","green")) +
  stat_smooth(method = lm, level=0.95)

plot of chunk unnamed-chunk-40

Moving the color=sex mapping into the ggplot() call makes it the default for every layer, so stat_smooth() fits one model per sex and the plot shows two lines:

ggplot(heightweight, aes(x=weightLb,y=heightIn, color=sex)) + 
  geom_point(aes(shape=sex),size=4) +
  scale_shape_manual(values=c(1,4)) +
  scale_color_manual(values=c("blue","green")) +
  stat_smooth(method = lm, level=0.95)

plot of chunk unnamed-chunk-41
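
With method = lm, the line drawn by stat_smooth() is the ordinary least-squares fit that lm() would produce. The sketch below checks this numerically; it assumes the mpg dataset from ggplot2 in place of heightweight, and the object names are illustrative.

```r
library(ggplot2)

p <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point() +
  stat_smooth(method = "lm", formula = y ~ x, level = 0.95)

# Fit the same model directly.
fit <- lm(hwy ~ displ, data = mpg)
coef(fit)

# layer_data() returns the points that the smooth layer draws;
# compare them with predictions from the lm() fit at the same x values.
smooth_df <- layer_data(p, 2)
max_diff <- max(abs(smooth_df$y -
                    predict(fit, data.frame(displ = smooth_df$x))))
max_diff   # numerically zero
```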

Labeling Individual Points

To label data points, use either annotate or geom_text

ggplot(heightweight, aes(x=weightLb,y=heightIn)) + 
  geom_point(aes(shape=sex,color=sex,size=ageYear)) +
  annotate("text",x=150,y=68,label="Some label",color="darkgreen",size=12)

plot of chunk unnamed-chunk-42

ggplot(heightweight, aes(x=weightLb,y=heightIn)) + 
  geom_point(aes(shape=sex,color=sex,size=ageYear)) +
  geom_text(aes(label=ageYear),vjust=0.5)

plot of chunk unnamed-chunk-43
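
Labeling every point quickly gets crowded. One possible workaround, sketched here with the mpg dataset from ggplot2 (the cut-off of 40 and the names top_cars and p_lab are illustrative assumptions), is to pass a subset of the data to geom_text() so that only a few points are labeled:

```r
library(ggplot2)

# Hypothetical subset: cars with highway mileage above 40 mpg
top_cars <- mpg[mpg$hwy > 40, ]

p_lab <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point() +
  # The text layer gets its own data, so only the subset is labeled
  geom_text(data = top_cars, aes(label = model), vjust = -0.8)
p_lab
```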

Stat Functions

  • Some plots visualize a transformation of the original dataset.
  • Use a stat_xxx function to choose a common transformation to visualize.

We have seen the stat_smooth() function:

ggplot(heightweight, aes(x=weightLb,y=heightIn, color=sex)) + 
  geom_point(aes(shape=sex),size=4) +
  scale_shape_manual(values=c(1,4)) +
  scale_color_manual(values=c("blue","green")) +
  stat_smooth(method = lm, level=0.95)

plot of chunk unnamed-chunk-44

Stat Functions

Another example: stat_bin() function creates a frequency count:

ggplot(mpg,aes(x=hwy)) + stat_bin(binwidth = 5)

plot of chunk unnamed-chunk-45

This is equivalent to:

ggplot(mpg,aes(x=hwy)) + geom_histogram(binwidth = 5)

Or:

ggplot(mpg,aes(x=hwy)) + 
  geom_histogram(stat="bin", binwidth = 5) 
# The "bin" stat is the implied default for histogram

Stat Functions

Density plot with stat_density:

ggplot(mpg,aes(x=hwy)) + stat_density()

plot of chunk unnamed-chunk-48

Or the same plot with geom_histogram:

ggplot(mpg,aes(x=hwy)) + geom_histogram(stat="density")

plot of chunk unnamed-chunk-49
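
Beyond the stats shown above, stat_summary() applies a summary function to y within each x group. The sketch below is only an illustration: it assumes ggplot2 3.3.0 or later (where the argument is called fun) and uses the hypothetical object name p_mean.

```r
library(ggplot2)

# Plot the mean highway mileage of each vehicle class as a bar
p_mean <- ggplot(mpg, aes(x = class, y = hwy)) +
  stat_summary(fun = mean, geom = "bar")
p_mean
```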

Saving Plot to An Object

  • A ggplot plot can be saved in an object
  • More convenient when you are experimenting

p <- ggplot(heightweight, aes(x=weightLb,y=heightIn))
p + geom_point(aes(shape=sex,color=sex,size=ageYear))

plot of chunk unnamed-chunk-50

# Here we use the saved plot object "p"
p + geom_smooth(method=lm)

plot of chunk unnamed-chunk-51

Saving Plots to Files

With ggplot2 one can use the ggsave() function to save a plot; by default it saves the most recently displayed plot:

ggplot(heightweight, aes(x=weightLb,y=heightIn, color=sex)) + 
  geom_point(aes(shape=sex),size=4) +
  scale_shape_manual(values=c(1,4)) +
  scale_color_manual(values=c("blue","green")) +
  stat_smooth(method = lm, level=0.99)
ggsave("hw.png",width=6,height=6)
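
When the plot is stored in an object, it can also be handed to ggsave() explicitly through the plot argument, which avoids relying on the last displayed plot. A minimal sketch, assuming the mpg dataset and a hypothetical file name in a temporary directory:

```r
library(ggplot2)

p_save <- ggplot(mpg, aes(x = displ, y = hwy)) + geom_point()

# Hypothetical output path; the file type is inferred from the extension
out_file <- file.path(tempdir(), "mpg_scatter.png")
ggsave(out_file, plot = p_save, width = 6, height = 6, dpi = 300)
file.exists(out_file)
```

Width and height are in inches unless the units argument says otherwise.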

Plot Titles

To add a title, use either ggtitle or labs(title=)

p <- ggplot(heightweight, aes(x=weightLb,y=heightIn, color=sex)) + 
  geom_point(aes(shape=sex),size=4) +
  scale_shape_manual(values=c(1,4)) +
  scale_color_manual(values=c("blue","green"))
p + ggtitle("Height ~ weight of school children")

plot of chunk unnamed-chunk-53

Note the title is left-aligned by default.

Axis Labels

To add axis labels, use either (x|y)lab or labs(x=,y=)

p + ggtitle("Height ~ weight of school children") +
  xlab("Weight (lbs)") + ylab("Height (inch)")

plot of chunk unnamed-chunk-54

Legend Titles

  • Use labs(<aes>=) to specify legend titles

p + ggtitle("Height ~ weight of school children") +
  xlab("Weight (lbs)") + ylab("Height (inch)") +
  labs(color='Gender', shape='Gender')

plot of chunk unnamed-chunk-55
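
Since labs() accepts the title, the axis labels and the legend titles, all of them can be set in a single call. The sketch below uses the mpg dataset, and the label strings are made up for illustration:

```r
library(ggplot2)

p_labs <- ggplot(mpg, aes(x = displ, y = hwy, color = drv)) +
  geom_point() +
  labs(title = "Highway mileage vs. engine displacement",  # plot title
       x = "Displacement (litres)",                        # X axis label
       y = "Highway mileage (mpg)",                        # Y axis label
       color = "Drive train")                              # legend title
p_labs
```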

Legends

  • Use the guides() function to set the legend type for each aesthetic property.

Before:

p <- ggplot(heightweight, aes(x=weightLb,y=heightIn,color=ageYear)) + 
  geom_point(aes(shape=sex)) 
p

plot of chunk unnamed-chunk-56

After:

p + guides(shape='none',color='legend')

plot of chunk unnamed-chunk-57

Themes

  • Themes decide the appearance of a plot
  • ggplot2 provides a few pre-defined themes for users to choose from

The classic theme:

p <- ggplot(heightweight, aes(x=weightLb,y=heightIn, color=sex)) + 
  geom_point(aes(shape=sex),size=4)
p + theme_classic()

plot of chunk unnamed-chunk-58

The dark theme:

p + theme_dark()

plot of chunk unnamed-chunk-59

Package ggthemes

  • Additional themes are available from the ggthemes package

Example: Excel theme

library(ggthemes)
p + theme_excel()

plot of chunk unnamed-chunk-61

Fine-tuning the Theme

  • Most elements related to appearance are controlled by the theme() function.
    • Fonts (family, size, color etc.)
    • Background color
    • Grid lines
    • Axis ticks

Removing the grid lines:

p + theme_bw() +
  theme(panel.grid = element_blank())

plot of chunk unnamed-chunk-62

Or just removing the vertical ones:

p + theme_bw() +
  theme(panel.grid.major.x = element_blank(),
        panel.grid.minor.x = element_blank())

plot of chunk unnamed-chunk-63
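
The other element types work the same way as the grid lines: each theme element is set with the matching element_*() function. A small sketch, assuming the mpg dataset and arbitrary colors and sizes chosen only for illustration:

```r
library(ggplot2)

p_theme <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point() +
  theme_bw() +
  theme(panel.background = element_rect(fill = "grey95"),  # background color
        axis.ticks = element_line(color = "red"),          # axis tick color
        axis.ticks.length = unit(0.25, "cm"))              # axis tick length
p_theme
```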

Customizing Fonts

Change the base size and font family:

p + theme_bw(base_size = 24, base_family = "Times")

plot of chunk unnamed-chunk-64

Or fine tune each element:

p + theme_bw(base_size = 24, base_family = "Times") +
  theme(legend.title = element_text(size=20,color="blue"), # Legend title
        legend.text = element_text(size=18,color="red"), # Legend text
        axis.title.x = element_text(size=18,color="red"), # X axis label
        axis.title.y = element_blank()) # Remove Y axis label

plot of chunk unnamed-chunk-65

The element_blank() function can be used to remove undesired elements.

Changing Legend Position

p + theme_bw(base_size = 24, base_family = "Times") +
  theme(legend.position = "bottom")

plot of chunk unnamed-chunk-66

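# A numeric c(x, y) pair in relative (0 to 1) coordinates places the legend inside the plot
# (ggplot2 3.5.0 and later prefer legend.position = "inside" with legend.position.inside)
p + theme_bw(base_size = 24, base_family = "Times") +
  theme(legend.position = c(0.9,0.1))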

plot of chunk unnamed-chunk-67

List of Theme Elements

Elements that can be adjusted with the theme() function:

theme(line, rect, text, title, aspect.ratio, axis.title, axis.title.x,
  axis.title.x.top, axis.title.y, axis.title.y.right, axis.text, axis.text.x,
  axis.text.x.top, axis.text.y, axis.text.y.right, axis.ticks, axis.ticks.x,
  axis.ticks.y, axis.ticks.length, axis.line, axis.line.x, axis.line.y,
  legend.background, legend.margin, legend.spacing, legend.spacing.x,
  legend.spacing.y, legend.key, legend.key.size, legend.key.height,
  legend.key.width, legend.text, legend.text.align, legend.title,
  legend.title.align, legend.position, legend.direction, legend.justification,
  legend.box, legend.box.just, legend.box.margin, legend.box.background,
  legend.box.spacing, panel.background, panel.border, panel.spacing,
  panel.spacing.x, panel.spacing.y, panel.grid, panel.grid.major,
  panel.grid.minor, panel.grid.major.x, panel.grid.major.y, panel.grid.minor.x,
  panel.grid.minor.y, panel.ontop, plot.background, plot.title, plot.subtitle,
  plot.caption, plot.margin, strip.background, strip.placement, strip.text,
  strip.text.x, strip.text.y, strip.switch.pad.grid, strip.switch.pad.wrap, ...,
  complete = FALSE, validate = TRUE)

Reset the default theme

  • The default theme is theme_grey()
  • Use theme_set() to change the default

With old default:

p

plot of chunk unnamed-chunk-68

With new default:

theme_set(theme_light())
p

plot of chunk unnamed-chunk-69
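
theme_set() returns the previous default, so it can be kept and restored later. A minimal sketch (the name old_theme is just an illustration):

```r
library(ggplot2)

# Switch the default theme and remember the previous one
old_theme <- theme_set(theme_light())

# ... plots created here use theme_light() ...

# Restore the earlier default
theme_set(old_theme)
```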

Creating Your Own Theme

  • You can create your own theme and reuse it later:

mytheme <- theme_bw(base_size = 24, base_family = "Times") +
  theme(legend.title = element_text(size=20,color="blue"), # Legend title
        legend.text = element_text(size=18,color="red"), # Legend text
        axis.title.x = element_text(size=18,color="red"), # X axis label
        axis.title.y = element_blank()) # Remove Y axis label
p + mytheme

plot of chunk unnamed-chunk-70

Coordinate systems

Functions that control the coordinate system:

  • coord_cartesian - the default Cartesian coordinates
  • coord_flip - flip X and Y
  • coord_polar - polar coordinates
  • coord_trans - transformed Cartesian coordinates
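
As an illustration of coord_polar, a stacked bar chart can be bent into a pie chart by mapping the bar height to the angle. This is a sketch using the mpg dataset; the object name p_polar is hypothetical:

```r
library(ggplot2)

# One stacked bar of counts per drive train, drawn in polar coordinates
p_polar <- ggplot(mpg, aes(x = factor(1), fill = drv)) +
  geom_bar(width = 1) +
  coord_polar(theta = "y")   # map the y axis (counts) to the angle
p_polar
```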

Coordinate systems

Original:

g <- ggplot(mpg,aes(x=hwy)) + geom_histogram(binwidth=5, fill="white", color="black")
g

plot of chunk unnamed-chunk-71

With flipped coordinates:

g + coord_flip()

plot of chunk unnamed-chunk-72

Coordinate systems

Original:

g

plot of chunk unnamed-chunk-73

With transformed Y coordinate:

g + coord_trans(y="sqrt")

plot of chunk unnamed-chunk-74

Axis Limits

  • Use the xlim() and ylim() functions to set the range of axes (observations outside the range are dropped):

p + theme_light() +
  xlim(0,200) +
  ylim(50,100)

plot of chunk unnamed-chunk-75
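
Because xlim() and ylim() discard out-of-range observations, any statistic computed afterwards sees less data. To zoom without dropping anything, the limits can instead be passed to coord_cartesian(). The comparison below is a sketch on the mpg dataset with arbitrarily chosen limits:

```r
library(ggplot2)

p_base <- ggplot(mpg, aes(x = displ, y = hwy)) + geom_point()

p_clip <- p_base + ylim(20, 30)                       # out-of-range values become missing
p_zoom <- p_base + coord_cartesian(ylim = c(20, 30))  # only the view changes

# Count the observations that still carry a y value in each version
n_clip <- sum(!is.na(layer_data(p_clip)$y))
n_zoom <- sum(!is.na(layer_data(p_zoom)$y))
c(clipped = n_clip, zoomed = n_zoom)
```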

Scales

  • The scale_<aes>_(continuous|discrete|manual|identity|...) family of functions controls how data points are mapped to aesthetic values
    • Color
    • Shape
    • Size
    • Alpha (transparency)
    • X and Y location

X and Y scales

scale_x_continuous: scale for X, which is a continuous variable

p + theme_bw() +
  ylim(50,100) +
  scale_x_continuous(limits=c(0,200),
                     breaks=c(50,110,170),
                     labels=c("Thin","Medium\nSize","Chubby"))

plot of chunk unnamed-chunk-77

X and Y scales

# theme_economist_white() comes from the ggthemes package
p + theme_economist_white() +
  scale_x_log10(breaks=c(10,20,50,100,200),
                limits=c(5,500)) + # Plot X on a log10 scale
  scale_y_reverse() # Reverse the Y scale

plot of chunk unnamed-chunk-78

Legend Labels

  • Scale functions can be used to customize legend labels
    • Color, shape, size, fill etc.

ggplot(mpg,aes(x=drv,y=cty,fill=drv)) +
  geom_boxplot()

plot of chunk unnamed-chunk-79

ggplot(mpg,aes(x=drv,y=cty,fill=drv)) +
  geom_boxplot() +
  scale_fill_discrete(limits=c("f","r","4"),
                      labels=c("Front","Rear","4 Wheel Drive"))

plot of chunk unnamed-chunk-80

Other scales

By default:

ggplot(mpg,aes(x=displ,y=hwy,size=cyl,
               color=drv,shape=fl)) +
  geom_point(aes(alpha=cty))

plot of chunk unnamed-chunk-81

Re-scaled:

ggplot(mpg,aes(x=displ,y=hwy,size=cyl,color=drv, alpha=cty)) +
  geom_point() +
  scale_size_identity() + # Use the values of "cyl" variable for size
  scale_color_manual(values=c("darkblue","rosybrown2","#24FA22")) +
  scale_alpha_continuous(range=c(0.1,1))

plot of chunk unnamed-chunk-82

Faceting

  • Facets divide a plot into subplots based on the values of one or more discrete variables.
  • Faceting in ggplot2 is managed by the functions facet_grid and facet_wrap.

facet_grid: creates a row of panels defined by the variable “drv”:

ggplot(mpg, aes(displ, hwy)) + 
  geom_point() +
  facet_grid(. ~ drv)

plot of chunk unnamed-chunk-83

Facet_grid

facet_grid: creates a column of panels defined by the variable “fl”:

ggplot(mpg, aes(displ, hwy)) + 
  geom_point() +
  facet_grid(fl ~ .)

plot of chunk unnamed-chunk-84

Facet_grid

facet_grid: creates a matrix of panels defined by the variables “fl” and “drv”:

ggplot(mpg, aes(displ, hwy)) + 
  geom_point() +
  facet_grid(fl ~ drv)

plot of chunk unnamed-chunk-85

Facet_wrap

facet_wrap: wraps a 1D sequence of panels into 2D:

ggplot(mpg, aes(displ, hwy)) + 
  geom_point() +
  facet_wrap(~class, nrow=3)

plot of chunk unnamed-chunk-86
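
By default all panels share the same axes. If the groups cover very different ranges, the scales argument of facet_wrap lets each panel choose its own. A sketch, assuming a free Y axis is what is wanted here:

```r
library(ggplot2)

p_free <- ggplot(mpg, aes(displ, hwy)) +
  geom_point() +
  facet_wrap(~class, nrow = 3, scales = "free_y")  # each panel gets its own Y range
p_free
```

Free scales make each panel easier to read but harder to compare with its neighbors, so the shared default is usually the safer starting point.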

Online Interactive Graphics

Plot.ly

Open your browser and try: https://plot.ly/~lyan1/2/

The code:

library(plotly)  # also attaches ggplot2
# Read the Chicago NMMAPS data, keeping strings as character vectors
nmmaps <- read.csv("chicago-nmmaps.csv", as.is=TRUE)
nmmaps$date <- as.Date(nmmaps$date)
# Keep observations from 1997 onwards
nmmaps <- nmmaps[nmmaps$date > as.Date("1996-12-31"),]
nmmaps$year <- substring(nmmaps$date,1,4)
g <- ggplot(nmmaps, aes(date, temp, color=factor(season))) +
  geom_point() +
  scale_color_manual(values=c("dodgerblue4", "darkolivegreen4",
                              "darkorchid3", "goldenrod1"))
#ggplotly(g) # offline
# Upload the plot to a plot.ly account (requires API credentials)
api_create(g, filename = NULL, fileopt = "new", sharing = "public")

Further Reading