EVB 203 Week 4: Spatial Analysis

Author

Catherine Kim

Published

March 21, 2023

Last week, we learned about data cleaning and management and how to do it in R. This week, we will explore spatial analysis in R with data about WiFi signal strength (WSS) from the Brisbane Botanic Gardens.

1 Aims

Specifically, we will be comparing how different sampling effort in collecting WiFi signal strength data: 1) randomly collected in 2019 by students, and 2) strategically collected by a single person. We will be reading different file types into R, creating xy table data into a spatial object, and checking comparability with these spatial data i.e., checking coordinate reference systems.

2 Project set-up

First things first, let’s get organized and either use your existing RStudio project from a previous week or set-up a new RStudio project following the steps below.

  • Click the “File” menu button (top left corner), then “New Project”
  • Click “New Directory”
  • Click “New Project”
  • In “Directory name”, type the name of your project, for example “EVB203”
  • Browse and select a folder where to locate your project (~ is your home directory). For example, a folder called “R_tutorials” in “EVB203”: ~/EVB203/R_tutorials.
  • Click the “Create Project” button

You should already have “scripts”, “data”, and “plots” sub-folders in your project. If you don’t, create these folders by running:

dir.create("scripts")
dir.create("data")
dir.create("plots")

Or click on the New Folder button in the Files pane in the lower right quadrant of RStudio and type in the folder names.

3 Download the data

Go ahead and download the Week 4 data from Canvas and move into the data folder.

4 Explore the wifi signal strength (WSS) data

For these exercises we will use the soloeffort and classeffort datasets. You should notice straight away that we have two different file formats: a shapefile and a csv.

First let’s read in the shapefile which we have done before.

#install.packages("sf")  # install the library if you do not have it
library(sf) # load the library
classeffort <- st_read("data/classeffort.shp") # read in the data
Reading layer `classeffort' from data source 
  `C:\Users\kimc7\OneDrive - Queensland University of Technology\Teaching\EVB203_2023\R_content\EVB203_R\data\classeffort.shp' 
  using driver `ESRI Shapefile'
Simple feature collection with 1325 features and 6 fields
Geometry type: MULTIPOINT
Dimension:     XY
Bounding box:  xmin: 502705.5 ymin: 6960568 xmax: 503147.2 ymax: 6961255
Projected CRS: GDA94 / MGA zone 56
classeffort # view the data
Simple feature collection with 1325 features and 6 fields
Geometry type: MULTIPOINT
Dimension:     XY
Bounding box:  xmin: 502705.5 ymin: 6960568 xmax: 503147.2 ymax: 6961255
Projected CRS: GDA94 / MGA zone 56
First 10 features:
            id       authorId metadataSt WSS name       date
1   1246447135 n10216839, IOS        iOS   0 WiFi 17/05/2019
2   1473978009 n10216839, IOS        iOS   0 WiFi 17/05/2019
3  -2133502039 n10216839, IOS        iOS   0 WiFi 17/05/2019
4  -1117280325 n10216839, IOS        iOS   0 WiFi 17/05/2019
5   -648261611 n10216839, IOS        iOS   0 WiFi 17/05/2019
6  -1556534246 n10216839, IOS        iOS   0 WiFi 17/05/2019
7   -577382290 n10216839, IOS        iOS   0 WiFi 17/05/2019
8   -507973632 n10216839, IOS        iOS   0 WiFi 17/05/2019
9   -194155426 n10216839, IOS        iOS   0 WiFi 17/05/2019
10   976206635 n10216839, IOS        iOS   0 WiFi 17/05/2019
                         geometry
1  MULTIPOINT ((502912.4 69607...
2   MULTIPOINT ((502928 6960752))
3  MULTIPOINT ((502956.3 69607...
4   MULTIPOINT ((502955 6960748))
5  MULTIPOINT ((502922.2 69607...
6  MULTIPOINT ((503121.3 69608...
7  MULTIPOINT ((503122.4 69607...
8  MULTIPOINT ((503098.8 69607...
9   MULTIPOINT ((503101 6960738))
10 MULTIPOINT ((503100.2 69607...

Check the coordinate reference system (CRS) like we always should do using spatial data we haven’t collected.

st_crs(classeffort)
Coordinate Reference System:
  User input: GDA94 / MGA zone 56 
  wkt:
PROJCRS["GDA94 / MGA zone 56",
    BASEGEOGCRS["GDA94",
        DATUM["Geocentric Datum of Australia 1994",
            ELLIPSOID["GRS 1980",6378137,298.257222101,
                LENGTHUNIT["metre",1]]],
        PRIMEM["Greenwich",0,
            ANGLEUNIT["degree",0.0174532925199433]],
        ID["EPSG",4283]],
    CONVERSION["Map Grid of Australia zone 56",
        METHOD["Transverse Mercator",
            ID["EPSG",9807]],
        PARAMETER["Latitude of natural origin",0,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8801]],
        PARAMETER["Longitude of natural origin",153,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8802]],
        PARAMETER["Scale factor at natural origin",0.9996,
            SCALEUNIT["unity",1],
            ID["EPSG",8805]],
        PARAMETER["False easting",500000,
            LENGTHUNIT["metre",1],
            ID["EPSG",8806]],
        PARAMETER["False northing",10000000,
            LENGTHUNIT["metre",1],
            ID["EPSG",8807]]],
    CS[Cartesian,2],
        AXIS["(E)",east,
            ORDER[1],
            LENGTHUNIT["metre",1]],
        AXIS["(N)",north,
            ORDER[2],
            LENGTHUNIT["metre",1]],
    USAGE[
        SCOPE["Engineering survey, topographic mapping."],
        AREA["Australia - onshore and offshore between 150°E and 156°E."],
        BBOX[-58.96,150,-13.87,156]],
    ID["EPSG",28356]]

Plot it with the base plot() function. Do you remember how to plot a specific column, not all of them?

plot(classeffort)  # plot all columns

plot(classeffort["WSS"])  # plot only the WiFi signal strength column

Now let’s read in the soloeffort.xlsx file. Base R (and many R packages can read in .csv files). For excel files, we will need to use a different package for this aptly names {readxl}.

Remember, install the package if you don’t have it yet and we always need to load the package with library() before using any of the functions in the package.

#install.packages("readxl")
library(readxl)
soloeffort <- read_excel(path = "data/soloeffort.xlsx")
soloeffort # View it
# A tibble: 128 × 6
      id   WSS elevation_     X     Y       Z
   <dbl> <dbl>      <dbl> <dbl> <dbl>   <dbl>
 1    62     0     21.2    153. -27.5  21.2  
 2    63     2     10.3    153. -27.5  10.3  
 3    64     0      5.90   153. -27.5   5.90 
 4    65     0      5.00   153. -27.5   5.00 
 5    66     2     -7.38   153. -27.5  -7.38 
 6    67     3    -12.5    153. -27.5 -12.5  
 7    68     2     -0.892  153. -27.5  -0.892
 8    69     0      9.51   153. -27.5   9.51 
 9    70     0     -0.377  153. -27.5  -0.377
10    71     2      0.717  153. -27.5   0.717
# ℹ 118 more rows

Now let’s plot these data. This data is a table so we need to define the x and y coordinates to make a scatter plot.

plot(x = soloeffort$X, y = soloeffort$Y)

We have 128 points in this dataset. Much fewer than the classeffort dataset. This is not a spatial object so it is a basic scatter plot.

Plot both of these datasets using {ggplot2}. Hint: Look at previous week’s tutorials for code.

Before we move onto the spatial analysis bit, take a moment to compare the two datasets. How do the number and pattern of points compare?

5 Thiessen polygons or Voronoi diagram

Watch this video for a simple explanation about Thiessen Polygons.

5.1 Class data

Full disclosure, this code is based off of Valentin_Stefan’s stack exchange response to this question.

First we need to get the points ready.

set.seed(12345) # so results are reproducible

library(dplyr) # to use pipe
class_voronoi <- classeffort %>% 
  st_geometry() %>%             # get only the geometry of the sf object
  do.call(c, .) %>%             # c() or concatenate all the geometries together
  st_voronoi() %>%              # create the Thiessen/Voronoi polygons
  st_collection_extract()       # return only the geometries of the polygons

class_voronoi # view it
Geometry set for 1292 features 
Geometry type: POLYGON
Dimension:     XY
Bounding box:  xmin: 502018.9 ymin: 6959881 xmax: 503833.8 ymax: 6961941
CRS:           NA
First 5 geometries:

We can see that there is no more CRS defined - reset the CRS to the original classeffort CRS.

st_crs(class_voronoi) <- st_crs(classeffort)

Now we will match the polygons to the points to get the WSS column for coloring by creating a new “pols” column in classeffort using $ and set the values as the geometries for the newly created Voronoi polygons.

Then we need set the geometry as the new polygons and the values as the WSS column.

classeffort$pols <- st_intersects(classeffort, class_voronoi) %>% 
  unlist %>% 
  class_voronoi[.]
class_voronoi_wss <- st_set_geometry(classeffort, "pols")["WSS"]

How would we tidy up this plot by clipping the polygons to the extent of the Brisbane Botanic Gardens?

Hint: Use the CBGBoundary2.shp file and look at last week’s tutorial for inspiration.

Reading layer `CBGBoundary2' from data source 
  `C:\Users\kimc7\OneDrive - Queensland University of Technology\Teaching\EVB203_2023\R_content\EVB203_R\data\CBGBoundary2.shp' 
  using driver `ESRI Shapefile'
Simple feature collection with 1 feature and 2 fields
Geometry type: POLYGON
Dimension:     XY
Bounding box:  xmin: 502696.8 ymin: 6960578 xmax: 503150.8 ymax: 6961272
Projected CRS: GDA94 / MGA zone 56

[1] "sf"         "data.frame"
Coordinate Reference System:
  User input: GDA94 / MGA zone 56 
  wkt:
PROJCRS["GDA94 / MGA zone 56",
    BASEGEOGCRS["GDA94",
        DATUM["Geocentric Datum of Australia 1994",
            ELLIPSOID["GRS 1980",6378137,298.257222101,
                LENGTHUNIT["metre",1]]],
        PRIMEM["Greenwich",0,
            ANGLEUNIT["degree",0.0174532925199433]],
        ID["EPSG",4283]],
    CONVERSION["Map Grid of Australia zone 56",
        METHOD["Transverse Mercator",
            ID["EPSG",9807]],
        PARAMETER["Latitude of natural origin",0,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8801]],
        PARAMETER["Longitude of natural origin",153,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8802]],
        PARAMETER["Scale factor at natural origin",0.9996,
            SCALEUNIT["unity",1],
            ID["EPSG",8805]],
        PARAMETER["False easting",500000,
            LENGTHUNIT["metre",1],
            ID["EPSG",8806]],
        PARAMETER["False northing",10000000,
            LENGTHUNIT["metre",1],
            ID["EPSG",8807]]],
    CS[Cartesian,2],
        AXIS["(E)",east,
            ORDER[1],
            LENGTHUNIT["metre",1]],
        AXIS["(N)",north,
            ORDER[2],
            LENGTHUNIT["metre",1]],
    USAGE[
        SCOPE["Engineering survey, topographic mapping."],
        AREA["Australia - onshore and offshore between 150°E and 156°E."],
        BBOX[-58.96,150,-13.87,156]],
    ID["EPSG",28356]]

Good work!

5.2 Solo effort data

Now repeat the process with the solo effort data. We have to do some more coding gymnastics to get our table of xy coordinates in shape to do the Voronoi diagram.

To start, we need to convert our xy data frame into an sf object using st_as_sf().

set.seed(5678) # set  different seed for reproducible results

solo_sf <- st_as_sf(soloeffort, coords = c("X", "Y"))

our soloeffort data was read in as an excel so there is no CRS associated with it. We can set the CRS the same as our classeffort data.

solo_sf
Simple feature collection with 128 features and 4 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 153.0274 ymin: -27.47795 xmax: 153.0318 ymax: -27.47234
CRS:           NA
# A tibble: 128 × 5
      id   WSS elevation_       Z             geometry
   <dbl> <dbl>      <dbl>   <dbl>              <POINT>
 1    62     0     21.2    21.2   (153.0278 -27.47507)
 2    63     2     10.3    10.3   (153.0276 -27.47469)
 3    64     0      5.90    5.90   (153.028 -27.47443)
 4    65     0      5.00    5.00  (153.0284 -27.47417)
 5    66     2     -7.38   -7.38  (153.0287 -27.47395)
 6    67     3    -12.5   -12.5     (153.029 -27.4735)
 7    68     2     -0.892  -0.892 (153.0293 -27.47346)
 8    69     0      9.51    9.51  (153.0296 -27.47314)
 9    70     0     -0.377  -0.377 (153.0301 -27.47277)
10    71     2      0.717   0.717 (153.0305 -27.47241)
# ℹ 118 more rows
classeffort
Simple feature collection with 1325 features and 6 fields
Active geometry column: geometry
Geometry type: MULTIPOINT
Dimension:     XY
Bounding box:  xmin: 502705.5 ymin: 6960568 xmax: 503147.2 ymax: 6961255
Projected CRS: GDA94 / MGA zone 56
First 10 features:
            id       authorId metadataSt WSS name       date
1   1246447135 n10216839, IOS        iOS   0 WiFi 17/05/2019
2   1473978009 n10216839, IOS        iOS   0 WiFi 17/05/2019
3  -2133502039 n10216839, IOS        iOS   0 WiFi 17/05/2019
4  -1117280325 n10216839, IOS        iOS   0 WiFi 17/05/2019
5   -648261611 n10216839, IOS        iOS   0 WiFi 17/05/2019
6  -1556534246 n10216839, IOS        iOS   0 WiFi 17/05/2019
7   -577382290 n10216839, IOS        iOS   0 WiFi 17/05/2019
8   -507973632 n10216839, IOS        iOS   0 WiFi 17/05/2019
9   -194155426 n10216839, IOS        iOS   0 WiFi 17/05/2019
10   976206635 n10216839, IOS        iOS   0 WiFi 17/05/2019
                         geometry                           pols
1  MULTIPOINT ((502912.4 69607... POLYGON ((502922.3 6960763,...
2   MULTIPOINT ((502928 6960752)) POLYGON ((502923.6 6960745,...
3  MULTIPOINT ((502956.3 69607... POLYGON ((502953.5 6960751,...
4   MULTIPOINT ((502955 6960748)) POLYGON ((502955.3 6960745,...
5  MULTIPOINT ((502922.2 69607... POLYGON ((502924.7 6960732,...
6  MULTIPOINT ((503121.3 69608... POLYGON ((503106.8 6960812,...
7  MULTIPOINT ((503122.4 69607... POLYGON ((503115.6 6960748,...
8  MULTIPOINT ((503098.8 69607... POLYGON ((503091.6 6960729,...
9   MULTIPOINT ((503101 6960738)) POLYGON ((503102.9 6960733,...
10 MULTIPOINT ((503100.2 69607... POLYGON ((503089.8 6960748,...

However there is an issue with doing that. HINT: What are the geometry units for each object?

Our soloeffort spatial data is in decimal degrees while our classeffort data is in meters. We will have the transform the soloeffort data to convert the decimal degrees latitude/longitude into universal transverse Mercator units of meters.

First, we must set the CRS to something with decimal degrees - we can use “EPSG:4326”.

st_crs(solo_sf) <- st_crs("EPSG:4326")

solo_sf <- st_transform(solo_sf, crs = st_crs(classeffort))
solo_sf
Simple feature collection with 128 features and 4 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 502707.6 ymin: 6960624 xmax: 503140.1 ymax: 6961245
Projected CRS: GDA94 / MGA zone 56
# A tibble: 128 × 5
      id   WSS elevation_       Z           geometry
 * <dbl> <dbl>      <dbl>   <dbl>        <POINT [m]>
 1    62     0     21.2    21.2   (502746.8 6960943)
 2    63     2     10.3    10.3   (502729.8 6960985)
 3    64     0      5.90    5.90  (502763.5 6961013)
 4    65     0      5.00    5.00  (502806.2 6961042)
 5    66     2     -7.38   -7.38  (502832.9 6961067)
 6    67     3    -12.5   -12.5   (502867.4 6961117)
 7    68     2     -0.892  -0.892 (502897.8 6961121)
 8    69     0      9.51    9.51  (502928.3 6961157)
 9    70     0     -0.377  -0.377 (502970.3 6961198)
10    71     2      0.717   0.717 (503011.5 6961238)
# ℹ 118 more rows

Now we have our solo_sf object in meters. Now, we can use the same code as above using our newly minted sf object.

solo_voronoi <- solo_sf %>% 
  st_geometry() %>% 
  do.call(c, .) %>% 
  st_voronoi() %>% 
  st_collection_extract()
st_crs(solo_voronoi) <- st_crs(solo_sf)
solo_voronoi
Geometry set for 128 features 
Geometry type: POLYGON
Dimension:     XY
Bounding box:  xmin: 502086.5 ymin: 6960003 xmax: 503761.1 ymax: 6961866
Projected CRS: GDA94 / MGA zone 56
First 5 geometries:

Match the Voronoi polygons to the original soloeffort points for plotting with color.

solo_sf$pols <- st_intersects(solo_sf, solo_voronoi) %>% 
  unlist %>% 
  solo_voronoi[.]
st_crs(solo_sf)
Coordinate Reference System:
  User input: GDA94 / MGA zone 56 
  wkt:
PROJCRS["GDA94 / MGA zone 56",
    BASEGEOGCRS["GDA94",
        DATUM["Geocentric Datum of Australia 1994",
            ELLIPSOID["GRS 1980",6378137,298.257222101,
                LENGTHUNIT["metre",1]]],
        PRIMEM["Greenwich",0,
            ANGLEUNIT["degree",0.0174532925199433]],
        ID["EPSG",4283]],
    CONVERSION["Map Grid of Australia zone 56",
        METHOD["Transverse Mercator",
            ID["EPSG",9807]],
        PARAMETER["Latitude of natural origin",0,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8801]],
        PARAMETER["Longitude of natural origin",153,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8802]],
        PARAMETER["Scale factor at natural origin",0.9996,
            SCALEUNIT["unity",1],
            ID["EPSG",8805]],
        PARAMETER["False easting",500000,
            LENGTHUNIT["metre",1],
            ID["EPSG",8806]],
        PARAMETER["False northing",10000000,
            LENGTHUNIT["metre",1],
            ID["EPSG",8807]]],
    CS[Cartesian,2],
        AXIS["(E)",east,
            ORDER[1],
            LENGTHUNIT["metre",1]],
        AXIS["(N)",north,
            ORDER[2],
            LENGTHUNIT["metre",1]],
    USAGE[
        SCOPE["Engineering survey, topographic mapping."],
        AREA["Australia - onshore and offshore between 150°E and 156°E."],
        BBOX[-58.96,150,-13.87,156]],
    ID["EPSG",28356]]
solo_voronoi_wss <- st_set_geometry(solo_sf, "pols")["WSS"]

plot(solo_voronoi_wss)

I challenge you to figure out how to clip the extent of the Voronoi polygons to the extent of the Gardens.

6 Summary

We have essentially done the first spatial analysis of the WiFi workflow. How do the two Voronoi polygon plots compare to each other? Are the WiFi patterns the same or different?

See if you can find some tutorials on doing Average Nearest Neighbor and spatial interpolation in R.

7 Resources