Last week, we learned about data cleaning and management and how to do it in R. This week, we will explore spatial analysis in R with data about WiFi signal strength (WSS) from the Brisbane Botanic Gardens.
1 Aims
Specifically, we will be comparing how different sampling effort in collecting WiFi signal strength data: 1) randomly collected in 2019 by students, and 2) strategically collected by a single person. We will be reading different file types into R, creating xy table data into a spatial object, and checking comparability with these spatial data i.e., checking coordinate reference systems.
2 Project set-up
First things first, let’s get organized and either use your existing RStudio project from a previous week or set-up a new RStudio project following the steps below.
Click the “File” menu button (top left corner), then “New Project”
Click “New Directory”
Click “New Project”
In “Directory name”, type the name of your project, for example “EVB203”
Browse and select a folder where to locate your project (~ is your home directory). For example, a folder called “R_tutorials” in “EVB203”: ~/EVB203/R_tutorials.
Click the “Create Project” button
You should already have “scripts”, “data”, and “plots” sub-folders in your project. If you don’t, create these folders by running:
Or click on the New Folder button in the Files pane in the lower right quadrant of RStudio and type in the folder names.
3 Download the data
Go ahead and download the Week 4 data from Canvas and move into the data folder.
4 Explore the wifi signal strength (WSS) data
For these exercises we will use the soloeffort and classeffort datasets. You should notice straight away that we have two different file formats: a shapefile and a csv.
First let’s read in the shapefile which we have done before.
#install.packages("sf") # install the library if you do not have itlibrary(sf) # load the libraryclasseffort <-st_read("data/classeffort.shp") # read in the data
Reading layer `classeffort' from data source
`C:\Users\kimc7\OneDrive - Queensland University of Technology\Teaching\EVB203_2023\R_content\EVB203_R\data\classeffort.shp'
using driver `ESRI Shapefile'
Simple feature collection with 1325 features and 6 fields
Geometry type: MULTIPOINT
Dimension: XY
Bounding box: xmin: 502705.5 ymin: 6960568 xmax: 503147.2 ymax: 6961255
Projected CRS: GDA94 / MGA zone 56
Check the coordinate reference system (CRS) like we always should do using spatial data we haven’t collected.
st_crs(classeffort)
Coordinate Reference System:
User input: GDA94 / MGA zone 56
wkt:
PROJCRS["GDA94 / MGA zone 56",
BASEGEOGCRS["GDA94",
DATUM["Geocentric Datum of Australia 1994",
ELLIPSOID["GRS 1980",6378137,298.257222101,
LENGTHUNIT["metre",1]]],
PRIMEM["Greenwich",0,
ANGLEUNIT["degree",0.0174532925199433]],
ID["EPSG",4283]],
CONVERSION["Map Grid of Australia zone 56",
METHOD["Transverse Mercator",
ID["EPSG",9807]],
PARAMETER["Latitude of natural origin",0,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8801]],
PARAMETER["Longitude of natural origin",153,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8802]],
PARAMETER["Scale factor at natural origin",0.9996,
SCALEUNIT["unity",1],
ID["EPSG",8805]],
PARAMETER["False easting",500000,
LENGTHUNIT["metre",1],
ID["EPSG",8806]],
PARAMETER["False northing",10000000,
LENGTHUNIT["metre",1],
ID["EPSG",8807]]],
CS[Cartesian,2],
AXIS["(E)",east,
ORDER[1],
LENGTHUNIT["metre",1]],
AXIS["(N)",north,
ORDER[2],
LENGTHUNIT["metre",1]],
USAGE[
SCOPE["Engineering survey, topographic mapping."],
AREA["Australia - onshore and offshore between 150°E and 156°E."],
BBOX[-58.96,150,-13.87,156]],
ID["EPSG",28356]]
Plot it with the base plot() function. Do you remember how to plot a specific column, not all of them?
plot(classeffort) # plot all columns
plot(classeffort["WSS"]) # plot only the WiFi signal strength column
Now let’s read in the soloeffort.xlsx file. Base R (and many R packages can read in .csv files). For excel files, we will need to use a different package for this aptly names {readxl}.
Remember, install the package if you don’t have it yet and we always need to load the package with library() before using any of the functions in the package.
#install.packages("readxl")library(readxl)soloeffort <-read_excel(path ="data/soloeffort.xlsx")soloeffort # View it
Now let’s plot these data. This data is a table so we need to define the x and y coordinates to make a scatter plot.
plot(x = soloeffort$X, y = soloeffort$Y)
We have 128 points in this dataset. Much fewer than the classeffort dataset. This is not a spatial object so it is a basic scatter plot.
Challenge plot with ggplot2
Plot both of these datasets using {ggplot2}. Hint: Look at previous week’s tutorials for code.
Before we move onto the spatial analysis bit, take a moment to compare the two datasets. How do the number and pattern of points compare?
5 Thiessen polygons or Voronoi diagram
Watch this video for a simple explanation about Thiessen Polygons.
5.1 Class data
Full disclosure, this code is based off of Valentin_Stefan’s stack exchange response to this question.
First we need to get the points ready.
set.seed(12345) # so results are reproduciblelibrary(dplyr) # to use pipeclass_voronoi <- classeffort %>%st_geometry() %>%# get only the geometry of the sf objectdo.call(c, .) %>%# c() or concatenate all the geometries togetherst_voronoi() %>%# create the Thiessen/Voronoi polygonsst_collection_extract() # return only the geometries of the polygonsclass_voronoi # view it
Geometry set for 1292 features
Geometry type: POLYGON
Dimension: XY
Bounding box: xmin: 502018.9 ymin: 6959881 xmax: 503833.8 ymax: 6961941
CRS: NA
First 5 geometries:
We can see that there is no more CRS defined - reset the CRS to the original classeffort CRS.
st_crs(class_voronoi) <-st_crs(classeffort)
Now we will match the polygons to the points to get the WSS column for coloring by creating a new “pols” column in classeffort using $ and set the values as the geometries for the newly created Voronoi polygons.
Then we need set the geometry as the new polygons and the values as the WSS column.
Challenge clip the Voronoi polygons to the extent of the park
How would we tidy up this plot by clipping the polygons to the extent of the Brisbane Botanic Gardens?
Hint: Use the CBGBoundary2.shp file and look at last week’s tutorial for inspiration.
Reading layer `CBGBoundary2' from data source
`C:\Users\kimc7\OneDrive - Queensland University of Technology\Teaching\EVB203_2023\R_content\EVB203_R\data\CBGBoundary2.shp'
using driver `ESRI Shapefile'
Simple feature collection with 1 feature and 2 fields
Geometry type: POLYGON
Dimension: XY
Bounding box: xmin: 502696.8 ymin: 6960578 xmax: 503150.8 ymax: 6961272
Projected CRS: GDA94 / MGA zone 56
[1] "sf" "data.frame"
Coordinate Reference System:
User input: GDA94 / MGA zone 56
wkt:
PROJCRS["GDA94 / MGA zone 56",
BASEGEOGCRS["GDA94",
DATUM["Geocentric Datum of Australia 1994",
ELLIPSOID["GRS 1980",6378137,298.257222101,
LENGTHUNIT["metre",1]]],
PRIMEM["Greenwich",0,
ANGLEUNIT["degree",0.0174532925199433]],
ID["EPSG",4283]],
CONVERSION["Map Grid of Australia zone 56",
METHOD["Transverse Mercator",
ID["EPSG",9807]],
PARAMETER["Latitude of natural origin",0,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8801]],
PARAMETER["Longitude of natural origin",153,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8802]],
PARAMETER["Scale factor at natural origin",0.9996,
SCALEUNIT["unity",1],
ID["EPSG",8805]],
PARAMETER["False easting",500000,
LENGTHUNIT["metre",1],
ID["EPSG",8806]],
PARAMETER["False northing",10000000,
LENGTHUNIT["metre",1],
ID["EPSG",8807]]],
CS[Cartesian,2],
AXIS["(E)",east,
ORDER[1],
LENGTHUNIT["metre",1]],
AXIS["(N)",north,
ORDER[2],
LENGTHUNIT["metre",1]],
USAGE[
SCOPE["Engineering survey, topographic mapping."],
AREA["Australia - onshore and offshore between 150°E and 156°E."],
BBOX[-58.96,150,-13.87,156]],
ID["EPSG",28356]]
Good work!
5.2 Solo effort data
Now repeat the process with the solo effort data. We have to do some more coding gymnastics to get our table of xy coordinates in shape to do the Voronoi diagram.
To start, we need to convert our xy data frame into an sf object using st_as_sf().
set.seed(5678) # set different seed for reproducible resultssolo_sf <-st_as_sf(soloeffort, coords =c("X", "Y"))
our soloeffort data was read in as an excel so there is no CRS associated with it. We can set the CRS the same as our classeffort data.
However there is an issue with doing that. HINT: What are the geometry units for each object?
Our soloeffort spatial data is in decimal degrees while our classeffort data is in meters. We will have the transform the soloeffort data to convert the decimal degrees latitude/longitude into universal transverse Mercator units of meters.
First, we must set the CRS to something with decimal degrees - we can use “EPSG:4326”.
Geometry set for 128 features
Geometry type: POLYGON
Dimension: XY
Bounding box: xmin: 502086.5 ymin: 6960003 xmax: 503761.1 ymax: 6961866
Projected CRS: GDA94 / MGA zone 56
First 5 geometries:
Match the Voronoi polygons to the original soloeffort points for plotting with color.
I challenge you to figure out how to clip the extent of the Voronoi polygons to the extent of the Gardens.
6 Summary
We have essentially done the first spatial analysis of the WiFi workflow. How do the two Voronoi polygon plots compare to each other? Are the WiFi patterns the same or different?
See if you can find some tutorials on doing Average Nearest Neighbor and spatial interpolation in R.