Plotting Spatial objects in R

Plotting Spatial Objects in R

Here I am going to cover some of the low-level plotting options for spatial objects. There are several specialized packages (e.g., ggplot2, lattice, rasterVis) that allow high level plotting of spatial objects but for quick visualization or general mapmaking, the basic plot functions can be used quite effectively. Plotting points and polygons with no attributes is as easy a just plotting the object. However, when you want to display factorial and numeric variables as choropleth maps things get more complicated.

Add required libraries. You will need to install some new ones.

require(sp)
require(spdep)
require(raster)
require(maptools)
require(classInt)
require(RColorBrewer)

Let’s start simple, with point locations. Due to the discrete x,y nature of points they are the easiest feature class to plot.

We can once again use the meuse data from sp.

data(meuse)
coordinates(meuse) = ~x+y

Simply passing the meuse object to plot will display the spatial locations

plot(meuse)

All of the par arguments that you can pass to plot still apply to spatial objects in the plot function. Here we change the point symbol (pch), size (cex) and color (col).

plot(meuse, pch=19, cex=1.5, col="red")

Now we can start thinking about plotting values contained in the data. Let’s look at the attributes associated with meuse

str(meuse@data)

We will use two columns, the factor “soil” and the numeric data “copper”. First we will plot the factorial variable. The data summary tells us that there are three levels in soil. We can easily assign colors to these three levels by creating a new vector of color labels and passing it to the col argument.

( soil.lab <- factor(meuse@data$soil, labels=c("red", "green", "blue")) )

plot(meuse, pch = 19, col = soil.lab)

Now we will focus on plotting a continuous numeric variable “copper”. There are many ways to do this but they all involve assigning breaks and associated colors to the data. We will start with explicit assignment showing two examples, vector assignment and ifelse assignment. Both of these approaches results in exactly the same result and are a matter of preference.

Vector assignment of dummy color vector that will be passed to plot

    color <- rep("xx", nrow(meuse@data))
    color[(meuse@data$copper > 0)&(meuse@data$copper <= 31)] <- "blue"
    color[(meuse@data$copper > 31)&(meuse@data$copper <= 49.50)] <- "yellow"
    color[(meuse@data$copper > 49.50)] <- "red"

ifelse assignment of dummy color vector

    color <- meuse@data$copper
    ( color <- ifelse(color > 0 & color <= 31, "blue",
            ifelse(color > 31 & color <= 49.5, "yellow",
                        ifelse(color > 49.5, "red", NA))) )

Plot results

plot(meuse, col=color, pch=19)

Now we will get into a more complicated example where we find or define breaks and interpolate colors. There are some remarkable useful tools that simplify this process in the classInt library. The classInt library provides methods for finding breaks in a continuous distribution, provides an object that can be used to “cut” the data into classes as well as interpolate and assign colors based on defined breaks. We will also use the RColorBrewer library for access to its nice color pallets.

We will start with fixed (defined) breaks using the quartiles. We can conveniently retrieve these values to pass to the classIntervals function using summary.

summary(meuse@data$copper)

( cuts <- classIntervals(meuse@data$copper, style="fixed", fixedBreaks=c(summary(meuse@data$copper))) )

Now we will define our color pallet using the spectral pallet from RColorBrewer

plotclr <- rev(brewer.pal(5, "Spectral"))

Using the findColours function we can assign colors to our points without breaking the associated order

colcode <- findColours(cuts, plotclr)

Now we are ready to plot the results

plot(meuse, col=colcode, pch=19)

Let’s also add a legend. These can take some practice but one you get the hang if it they are not too difficult. This is a function that you pass to a plot that has already been called. Dissecting the below example, the first argument “bottomright” is a symbolic representation of where the plot will be located. You can also use “topright”, bottomleft” “bottomright”, “right”, “center” explicit x,y plot coordinates or locator(1) for point-and-click placement. The “legend” argument is the text that will appear in the legend, “fill” is used to define the color gradient from the defined color vector. After the few necessary specific legend specific arguments you can use standard arguments from par (cex, pch, lty, col, etc…). The vector must be the same length and order as the legend contents. For example, say you have three point types that you have defined the argument for pch would be pch=c(2,5,19). The same goes for the col argument.

legend("bottomright", legend=names(attr(colcode, "table")), fill=attr(colcode, "palette"), bty="n")

I do not care for how the legend names are being displayed when pulled from the cuts object so I am going to define my own as well as add a title. Note that for spatial object plot you cannot use the "main" argument in the plot function, you must use the title function.

plot(meuse, col=colcode, pch=19)

box()
        legend("bottomright", legend=c("14-23","23-31","31-40.32","40.32-49.5","49.5-128"),
                  fill=attr(colcode, "palette"), bty="n")
      title(main="Meuse - Copper")

Now let’s look at plotting polygons. We can use the state data distributed with the maptools library. We will add the North Carolina sids data and display it’s attributes. We can use all of the above methods we used with plotting points for polygons.

    nc <- readShapePoly(system.file("shapes/sids.shp", package="maptools")[1],
           IDvar="FIPSNO", proj4string=CRS("+proj=longlat +ellps=clrk66"))
    str(nc@data)

Using the choynowski model available in spdep, let’s bring in some simple probabilistic analysis of disease rates into our choropleth maps. Choynowski’s approach folds the two tails of the measured probabilities together, so that small values, for a chosen α, occur for spatial units with either unusually high or low rates. This necessitates plotting the high and low counties separately. This is a case where spplot can come handy. The spplot function is a wrapper for the lattice plotting library and is designed specifically for spatial objects.

    ch <- choynowski(nc$SID74, nc$BIR74)
        nc$ch_pmap_low <- ifelse(ch$type, ch$pmap, NA)
        nc$ch_pmap_high <- ifelse(!ch$type, ch$pmap, NA)
        prbs <- c(0,.001,.01,.05,.1,1)
        nc$high = cut(nc$ch_pmap_high, prbs)
        nc$low = cut(nc$ch_pmap_low,prbs )

Now that we have our probability distribution defined we can plot the results using spplot. The col.regions argument is where you can define a color scheme. The spplot function automatically adds a legend. I am using colorRampPalette to automatically create a color vector based on a range of colors (blue and red) and a number of colors (which matches the number of factoral levels in the low and high columns.

spplot(nc, c("low", "high"), col.regions= colorRampPalette(c("blue","red"), space = "Lab")(5))

We will now look at some simple overlay plotting. First we will create some coincidental point and raster data.

xy <- SpatialPointsDataFrame(SpatialPoints(cbind(-50, seq(-80, 80, by=20))), data.frame(ID=seq(1,9,1), x=runif(9)))
r <- raster(ncol=100, nrow=100)

r[] <- runif(ncell(r))

For raster class object we can just pass the object directly to the plot function

plot(r)

We can apply much of what we have already learned about breaks and color vectors to raster display. A word of caution about plotting rasters. the raster library was specifically designed to hold large rasters out of memory to facilitate processing them. In order to plot a raster R needs to read them into memory which is exceedingly slow and can exceed system memory limitations.

Let’s define some break values for displaying our raster. We can then manually create a breaks list object that can be passed directly to plot.

    brk <- c(0, 0.25, 0.75, 1)
    arg <- list(at=c(0.12,0.5,0.87), labels=c("Low","Med.","High"))
    plot(r, col=c("blue", "yellow", "red"), breaks=brk, , axis.args=arg)

Using colorRampPalette we can create a color ramp for displaying continuous values. In this case we will create a 255 value color vector ramping the colors blue to yellow to red.

plot(r, col=colorRampPalette(c("blue", "yellow", "red"))(255))

In addition to using the add=TRUE argument in plot, we can also overlay points on an existing plot using the points command.

plot(r)

points(xy, pch=19)

We can also overlay polygons or lines on an existing plot using the add=TRUE plot argument.

    e <- extent(r)
    plot(r)
    plot(e, add=TRUE, col="red", lwd=4)
    plot(e <- e / 2, add=TRUE, col="red")

This also goes for overlaying points on polygons.

plot(e, col="red", lwd=4)

points(xy, pch=19)

As I mentioned previously, there are specific libraries for high level and specialty plotting. If you would like more sophisticated options for plotting rasters I would recommend the rasterVis library. I honestly find spplot quite obtuse but, when you can get it to work, it can be very useful. In general, it is a worthwhile endeavor to learn both the lattice and ggplot2 libraries. These both are capable of producing very complex, publication quality graphics. The Deducer R GUI has a module for creating spatial plots. As most things in R this barley scratches the surface but hopefully provides a place to start exploring creating plots with spatial data.