compiled on 23 February, 2019
The crawl package
The crawl
package is an R package that provides analytical tools for estimating the movement paths of animals from telemetry data. The package is specifically designed with marine mammals in mind, but other species in other habitats with similar telemetry devices will also work. This package compendium provides step-by-step instructions for importing and setting up the telemetry data, running the crwMLE()
, crwPredict()
, and (eventually, when I get around to writing it up) crwSimulator()
+ crwPostIS()
functions and visualizing results. The write up also discusses use of the fix_path()
function to re-route predicted tracks around land. A few examples of data visualization plots are also presented. This compendium was originally developed in support of a training course at the NOAA Fisheries Protected Species Assessment Workshop in La Jolla, California, USA.
The crawl
package in R is designed primarily with telemetry data from
satellite-linked bio-loggers in mind. Locations derived from the ARGOS
satellite system or the GPS system (with data either recovered post-deployment
or transmitted during deployment) are the most common data sources. In this
example, the data are all locations derived from the ARGOS satellite system.
The ARGOS system can have significant error associated with each location estimate and those errors are described as either location qualities (3,2,1,0,A,B) or, in more recent data, values that describe the ellipse error for each location. For this example, we will focus on the location quality values and use a fairly generic set of values to describe each error quality.
The data for this example are available for download from the original source on the DataONE network. These were all deployments of bio-loggers from Wildlife Computers (Redmond, Washington, USA) and each is a zipped archive pulled from the Wildlife Computers Data Portal. Each archive contains both location and behavior data. For this example, we will focus on just the location data.
Please cite any use of the original data provided here with the following citation:
Michael Cameron, Josh London, Kathy Frost, Alex Whiting, and Peter Boveng. Satellite Telemetry Dataset (Raw): Juvenile Bearded and Spotted Seals, 2004-2006, Kotzebue, Alaska. Research Workspace. 10.24431/rw1k118.
And, if you are interested in learning more about the science, ecology, and behavior of bearded (and spotted) seals, I recommend reading this article available on PLOS ONE.
All of the data are stored within the data-raw
folder of our root
repository. We will use list.files
and the walk
function from
purrr
to unzip everything into a temp directory.
library(tidyverse)
## ── Attaching packages ───────────────────────────────── tidyverse 1.2.1 ──
## ✔ ggplot2 3.1.0 ✔ purrr 0.3.0
## ✔ tibble 2.0.1 ✔ dplyr 0.8.0.1
## ✔ tidyr 0.8.2 ✔ stringr 1.4.0
## ✔ readr 1.3.1 ✔ forcats 0.4.0
## ── Conflicts ──────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
library(here)
## here() starts at /Users/josh.london/_projects/r/crwexampleakbs
# here::here() provides a path relative to the project root directory
data_path <- here::here("data-raw")
td <- tempdir()
list.files(file.path(data_path), full.names=TRUE, pattern = "*.zip") %>%
purrr::walk(~ unzip(.x, exdir = td, overwrite = TRUE))
The next step is to read just the *-Locations.csv file into R. The
readr::read_csv()
function allows us to set the column data types
in advance so we don’t have to mess with converting types afterwards. If you
have telemetry data from the Wildlife Computers data portal, these
specifications should work for you as well.
my_cols <- cols(
DeployID = col_character(),
Ptt = col_integer(),
Instr = col_character(),
Date = col_datetime("%H:%M:%S %d-%b-%Y"),
Type = col_character(),
Quality = col_character(),
Latitude = col_double(),
Longitude = col_double(),
`Error radius` = col_integer(),
`Error Semi-major axis` = col_integer(),
`Error Semi-minor axis` = col_integer(),
`Error Ellipse orientation` = col_integer(),
Offset = col_character(),
`Offset orientation` = col_character(),
`GPE MSD` = col_character(),
`GPE U` = col_character(),
Count = col_character(),
Comment = col_character()
)
Now, we can combine the power of readr::read_csv()
with purrr::map_df()
to read in all of the *-Locations.csv files into a single tibble (data.frame).
tbl_locs <- list.files(file.path(td), full.names = TRUE,
pattern = "*-Locations.csv") %>%
purrr::map_df( ~ readr::read_csv(.x, col_types = my_cols))
A key tenet of tidy data is that each row represents a single observation. For
location data, regardless of the data source, this is typical of the source data
structure. Each line in a file usually represents a single location estimate at
a specific time. So, there’s very little we need to do in order to get our
tbl_locs
into an acceptable structure.
One thing we can do, is to adjust the column names so there are no spaces or
hyphens. We can also drop everything to lower-case to avoid typos. To do this
we will rely on the clean_names()
function within the janitor
package.
We also want to change the date
column to a more appropriate and less
confusing date_time
. And, just because, I prefer deployid
over deploy_id
.
library(janitor)
tbl_locs <- tbl_locs %>%
janitor::clean_names() %>%
dplyr::rename(date_time = date,
deployid = deploy_id) %>%
dplyr::arrange(deployid, date_time)
Let’s get a quick summary of our deployments
tbl_locs %>% dplyr::group_by(deployid) %>%
dplyr::summarise(num_locs = n(),
start_date = min(date_time),
end_date = max(date_time))
## # A tibble: 30 x 4
## deployid num_locs start_date end_date
## <chr> <int> <dttm> <dttm>
## 1 EB2004_5972_04L0016 1804 2004-10-06 01:13:47 2005-02-01 02:24:54
## 2 EB2004_5973_04L0035 711 2004-10-13 01:11:54 2004-12-01 16:16:54
## 3 EB2005_5950_04L0168 1435 2005-06-17 23:51:25 2006-02-22 10:29:09
## 4 EB2005_5952_04L0041 2176 2005-08-02 17:09:56 2006-02-07 19:26:01
## 5 EB2005_5954_04L0043 829 2005-08-02 17:10:07 2005-11-26 01:08:56
## 6 EB2005_5956_04L0165 1984 2005-10-02 21:34:41 2005-12-27 17:43:01
## 7 EB2005_5957_04L0047 921 2005-10-03 01:29:28 2006-01-15 01:05:09
## 8 EB2005_5958_04L0164 4022 2005-10-04 02:13:50 2006-04-20 05:01:58
## 9 EB2005_5959_04L0166 3918 2005-10-05 02:47:41 2006-03-29 05:56:04
## 10 EB2005_5960_04L0167 1400 2005-10-08 01:04:36 2006-01-07 11:13:12
## # … with 20 more rows
There are some locations that were from test transmissions at the Wildlife Computers office in Redmond, Washington, USA and at St. Andrews in Scotland for a few SMRU tags. We can filter those out based on the longitude values.
tbl_locs <- tbl_locs %>%
dplyr::filter(!between(longitude, -125, -120),
!between(longitude, -50,0))
To further explore our imported data and confirm there aren’t any remaining
issues to address, we can create visual representations of the data. The first
plot will rely on the spatial features within ggplot2
and a custom ptolemy
package with world coastline data to create a plot of the tbl_locs
object.
You can read up more on the ptolemy
package on
the GitHub repo and use
devtools::install_github('jmlondon/ptolemy')
to install.
The sf
package provides the bulk of our spatial data framework. This is not
intended to be a full sf
tutorial, but, where possible, we will highlight a few
features and functions to explain the workflow. Users are encouraged to spend
significant time exploring and understanding the sf
package through existing
documentation and several examples and blog posts available online. We also load
the ggspatial
package for some convenience elements when plotting maps with
ggplot
. If you are going to be making a lot of maps with sf
and ggplot2
,
I strongly suggest learning and adopting the functions within ggspatial
. You
will save yourself a lot of time futzing with bounding boxes and other
parameters.
library(sf)
## Linking to GEOS 3.7.1, GDAL 2.4.0, PROJ 5.2.0
library(ggspatial)
# library(devtools)
# devtools::install_github('jmlondon/ptolemy')
# after install, load and follow prompts to download source data
library(ptolemy)
Here, we are simply creating an sf_locs
object via the sf::st_as_sf()
function. We specify the coordinate columns and then indicate the projection
of the data is geographic (lat/lon) as indicated by the 4326 epsg code.
In addition to points, it is often useful to organize the point data into
separate tracks for each deployment. The code to derive tracks from point data
within sf
is a bit more complicated but is demonstrated below. It is important
that you include the do_union = FALSE
argument to the dplyr::summarise()
function to maintain point order when creating the lines (and, in general, when
in doubt, insert a dplyr::arrange(deployid, date_time)
call). The
sf::st_cast("MULTILINESTRING")
step at the end converts the final geometry for
our object to MULTILINESTRING.
sf_locs <- sf::st_as_sf(tbl_locs, coords = c("longitude","latitude")) %>%
sf::st_set_crs(4326)
sf_lines <- sf_locs %>%
dplyr::arrange(deployid, date_time) %>%
dplyr::group_by(deployid) %>%
dplyr::summarise(do_union = FALSE) %>%
sf::st_cast("MULTILINESTRING")
The sf_lines
object is, essentially, a data frame with each row representing
a single line from a single deployment.
sf_lines
These data are still in the WGS84 geographic coordinate system. Projecting the
data into a more appropriate coordinate reference system is strongly
encouraged and needed for some functionality in crawl
. This is especially
important when/if data cross the 180 line, are spread over a large
latitudinal range, or within the polar regions.
# EPSG 3571 is a North Pole Lambert Azimuthal Equal Area (Bering Sea) projection,
# well suited to tracks that cross the 180 line
sf_locs <- sf_locs %>%
  sf::st_transform(3571)
sf_lines <- sf_lines %>%
sf::st_transform(3571)
The ptolemy
package provides a number of pre-built collections of coastline
data as sf
objects. In this case, we are interested in Alaska and the
Bering Sea.
bering <- ptolemy::bering()
## importGSHHS status:
## --> Pass 1: complete: 13265 bounding boxes within limits.
## --> Pass 2: complete.
## --> Clipping...
## Warning in refocusWorld(as.PolySet(as.data.frame(xres), projection = "LL"), : Removed duplicates of following polygons (Antarctica?): 0, 15
## importGSHHS: input xlim was (0, 360) and the longitude range of the extracted data is (0, 359.442861).
## Warning in st_buffer.sfc(st_geometry(x), dist, nQuadSegs, endCapStyle =
## endCapStyle, : st_buffer does not correctly buffer longitude/latitude data
## dist is assumed to be in decimal degrees (arc_degrees).
And, now, we build the ggplot
using our sf_lines
object. The
annotation_spatial()
and layer_spatial()
functions are helpers for
handling some of the complexities for plotting spatial data with ggplot2
.
Most notably, it auto-picks a sensible bounding box.
ggplot() +
annotation_spatial(bering, fill = "grey", lwd = 0) +
layer_spatial(sf_lines, size = 0.75,aes(color = deployid)) +
theme(legend.position = "none") +
scale_color_viridis_d() +
scale_x_continuous(breaks = seq(-180, 180, by = 5)) +
ggtitle("Observed Argos Location Paths",
subtitle = "bearded and spotted seals, Bering Sea")
We can also plot the individual points themselves. However, if we’re not
careful, the ggplot
and geom_sf()
process can be overwhelmed and take
several minutes to plot (this will be fixed in upcoming updates to ggplot2
and
sf
). We have 50916 POINT records in our
sf_locs
object. Instead of sending an object of many POINT records to
geom_sf
it is better to send fewer MULTIPOINT records that each contain
many points. In this case, we can group our points into MULTIPOINT
features based on the deployid
column.
ggplot() +
annotation_spatial(bering, fill = "grey", lwd = 0) +
layer_spatial(data = sf_locs %>%
dplyr::group_by(deployid) %>%
dplyr::summarise(),
size = 0.75, aes(color = deployid)) +
theme(legend.position = "none") +
scale_color_viridis_d() +
scale_x_continuous(breaks = seq(-180, 180, by = 5)) +
ggtitle("Observed Argos Location Points",
subtitle = "bearded and spotted seals, Bering Sea")
Argos data often contain near duplicate records. These are
identified by location estimates with the same date-time but differing
coordinate or error values. In theory, crawl::crwMLE()
can handle these
situations, but we have found it is more reliable to fix these records. The
first option for fixing the records would be to eliminate one of the duplicate
records. However, it is often not possible to reliably identify which record is
more appropriate to discard. For this reason, we advocate adjusting the
date-time value for one of the records and increasing the value by 10 seconds.
To facilitate this, we will rely on the xts::make.time.unique()
function.
# nudge duplicate date-times forward in 10-second increments so every
# location within a deployment has a unique time stamp
make_unique <- function(x) {
  suppressWarnings(
    xts::make.time.unique(x$date_time, eps = 10)
  )
}
library(xts)
tbl_locs <- tbl_locs %>%
dplyr::arrange(deployid,date_time) %>%
dplyr::group_by(deployid) %>% tidyr::nest() %>%
dplyr::mutate(unique_time = purrr::map(data, make_unique)) %>%
tidyr::unnest() %>%
dplyr::select(-date_time) %>% rename(date_time = unique_time)
This step is optional, but, as we see in this dataset, it is very common for Argos data to include obviously wrong locations (locations that are many kilometers away from the study area). Including such obviously wrong locations in the analysis can result in unexpected issues or problems fitting. For this reason, we recommend a coarse speed filter to remove these obviously wrong locations. A typical speed filter might use a value of 2.5 m/s as a biologically reasonable value for pinnipeds.
For this application, we will use 7.5 m/s (very conservative) and rely on the
argosfilter
package. The example code below follows a typical
split-apply-combine approach using functions from the dplyr
, tidyr
, and
purrr
packages. This analysis can be run in series, however, the process lends
itself nicely to parallel processing. The parallel
package (included with the
distribution of R) would be one option for taking advantage of multiple
processors. However, we want to maintain the purrr
approach so will use
furrr
which brings the future
parallel processing functionality to purrr
.
First, let's print out the number of records in each deployment
tbl_locs %>%
dplyr::arrange(deployid, date_time) %>%
dplyr::group_by(deployid) %>%
tidyr::nest() %>%
mutate(n_records = map_int(data,nrow))
## # A tibble: 30 x 3
## deployid data n_records
## <chr> <list> <int>
## 1 EB2004_5972_04L0016 <tibble [1,804 × 17]> 1804
## 2 EB2004_5973_04L0035 <tibble [711 × 17]> 711
## 3 EB2005_5950_04L0168 <tibble [1,430 × 17]> 1430
## 4 EB2005_5952_04L0041 <tibble [2,176 × 17]> 2176
## 5 EB2005_5954_04L0043 <tibble [829 × 17]> 829
## 6 EB2005_5956_04L0165 <tibble [1,984 × 17]> 1984
## 7 EB2005_5957_04L0047 <tibble [921 × 17]> 921
## 8 EB2005_5958_04L0164 <tibble [4,022 × 17]> 4022
## 9 EB2005_5959_04L0166 <tibble [3,918 × 17]> 3918
## 10 EB2005_5960_04L0167 <tibble [1,400 × 17]> 1400
## # … with 20 more rows
And, now, speed filter across multiple processors
library(purrr)
library(furrr)
future::plan(multiprocess)
tbl_locs <- tbl_locs %>%
dplyr::arrange(deployid, date_time) %>%
dplyr::group_by(deployid) %>%
tidyr::nest() %>%
dplyr::mutate(filtered = furrr::future_map(data, ~ argosfilter::sdafilter(
lat = .x$latitude,
lon = .x$longitude,
dtime = .x$date_time,
lc = .x$quality,
vmax = 7.5
))) %>%
tidyr::unnest() %>%
dplyr::filter(filtered %in% c("not", "end_location")) %>%
dplyr::select(-filtered) %>%
dplyr::arrange(deployid,date_time)
Let’s see if there are any differences from our previous count of records
tbl_locs %>%
dplyr::arrange(deployid, date_time) %>%
dplyr::group_by(deployid) %>%
tidyr::nest() %>%
mutate(n_records = map_int(data,nrow))
## # A tibble: 30 x 3
## deployid data n_records
## <chr> <list> <int>
## 1 EB2004_5972_04L0016 <tibble [1,623 × 17]> 1623
## 2 EB2004_5973_04L0035 <tibble [650 × 17]> 650
## 3 EB2005_5950_04L0168 <tibble [1,233 × 17]> 1233
## 4 EB2005_5952_04L0041 <tibble [1,818 × 17]> 1818
## 5 EB2005_5954_04L0043 <tibble [763 × 17]> 763
## 6 EB2005_5956_04L0165 <tibble [1,763 × 17]> 1763
## 7 EB2005_5957_04L0047 <tibble [836 × 17]> 836
## 8 EB2005_5958_04L0164 <tibble [3,597 × 17]> 3597
## 9 EB2005_5959_04L0166 <tibble [3,479 × 17]> 3479
## 10 EB2005_5960_04L0167 <tibble [1,195 × 17]> 1195
## # … with 20 more rows
Note, the previous two steps were done on our tbl_locs
object which is
not an sf
type and still in the geographic coordinate system. The
argosfilter
package requires geographic data and cannot accept an sf
object. So, we used tbl_locs
. But, now, we need to convert and project.
sf_locs <- sf::st_as_sf(tbl_locs, coords = c("longitude","latitude")) %>%
sf::st_set_crs(4326) %>%
sf::st_transform(3571)
sf_lines <- sf_locs %>%
dplyr::arrange(deployid, date_time) %>%
dplyr::group_by(deployid) %>%
dplyr::summarise(do_union = FALSE) %>%
sf::st_cast("MULTILINESTRING")
And, then, we can plot our data again
ggplot() +
annotation_spatial(bering, fill = "grey", lwd = 0) +
layer_spatial(sf_lines, size = 0.75,aes(color = deployid)) +
theme(legend.position = "none") +
scale_color_viridis_d() +
scale_x_continuous(breaks = seq(-180, 180, by = 5)) +
ggtitle("Observed Argos Location Paths (speed filtered)",
subtitle = "bearded and spotted seals, Bering Sea")
The mapview
package provides an easy means for creating interactive
maps of your spatial data. The one caveat to mapview
is that it is based
on the Web Mercator projection (a la Google Maps, etc) and there are some
extra steps required if your tracks cross the dateline at 180 (which ours do …
yay! fun!)
What we need to do is transform our data back to geographic and, then, convert the coordinates from -180:180 to 0:360. We will use a custom written function to handle this for us.
# shift longitudes from -180:180 to 0:360 (latitudes are unchanged) so
# geometries that cross the dateline display correctly in mapview
st_to_360 <- function(g) {
  coords <- (sf::st_geometry(g) + c(360, 90)) %% c(360) - c(0, 90)
  g <- sf::st_set_geometry(g, coords) %>% sf::st_set_crs(4326)
  return(g)
}
We will use the ESRI Ocean Basemap for our background layer. The default view
for mapview
would create a legend with the colors for each deployid
listed.
My preference is for something less cluttered. We want each deployid
to show up
as a layer that can be turned on/off. To do this, we specify burst = TRUE
and
turn off the legend
and homebutton
.
library(mapview)
sf::st_transform(sf_lines,4326) %>% st_to_360() %>%
mapview::mapview(map.types = "Esri.OceanBasemap",zcol = "deployid",
burst = TRUE, legend = FALSE, homebutton = FALSE)
When dealing with large numbers of deployments (and, potentially, wanting to take advantage of parallel processing), book-keeping and data management are important.
We suggest organizing your data and analysis results into nested tibbles. This is
easily created with the tidyr::nest()
function. In this case, we will organize
based on the deployid
group.
tbl_data <- sf_locs %>%
dplyr::group_by(deployid) %>%
nest()
tbl_data
## # A tibble: 30 x 2
## deployid data
## <chr> <list>
## 1 EB2004_5972_04L0016 <tibble [1,623 × 16]>
## 2 EB2004_5973_04L0035 <tibble [650 × 16]>
## 3 EB2005_5950_04L0168 <tibble [1,233 × 16]>
## 4 EB2005_5952_04L0041 <tibble [1,818 × 16]>
## 5 EB2005_5954_04L0043 <tibble [763 × 16]>
## 6 EB2005_5956_04L0165 <tibble [1,763 × 16]>
## 7 EB2005_5957_04L0047 <tibble [836 × 16]>
## 8 EB2005_5958_04L0164 <tibble [3,597 × 16]>
## 9 EB2005_5959_04L0166 <tibble [3,479 × 16]>
## 10 EB2005_5960_04L0167 <tibble [1,195 × 16]>
## # … with 20 more rows
crawl::crwMLE()
To fit a movement model to each collection of observed points, we will use
the crawl::crwMLE()
function. The parameters below are a fairly generic
set of values. Researchers may need to customize these further for their
particular deployment and species characteristics.
The fit_func()
is a wrapper that simplifies the downstream purrr::map()
function call.
Also, readers paying close attention may wonder if the furrr
approach
described previously for running the speed filter would also work for
running crwMLE
. The answer is, yes. But, we’ll leave that for you to work
out on your own. I will advise that you read up on the purrr::safely()
function as a way to make sure a misfit deployment doesn’t break the entire
process.
# this doc and example requires the devel branch of `crawl` available
# on GitHub and installed with `devtools`
#
# library(devtools)
# devtools::install_github('NMML/crawl@devel')
library(crawl)
fixPar <- c(log(250), log(500), log(1500), rep(NA,3), rep(NA,2))
constr = list(lower=c(rep(log(1500),3), rep(-Inf,2)), upper=rep(Inf,5))
fit_func <- function(data,fixPar,constr) {
suppressWarnings(
crwMLE(mov.model = ~1,
err.model=list(x=~quality-1),
data=data,
Time.name="date_time",
fixPar=fixPar,
constr=constr,
control=list(maxit=2000, trace=0, REPORT=1),
attempts=40,
initialSANN=list(maxit=200, trace=0, REPORT=1))
)
}
# fit movement model
tbl_data <- tbl_data %>%
dplyr::mutate(fit = purrr::map(data,~fit_func(data = .x, constr = constr,
fixPar = fixPar)))
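As a hedged aside, the purrr::safely() approach mentioned above might look something like the sketch below (not run here). safe_fit is just a hypothetical wrapper around the fit_func() defined above; any deployment that errors out returns NULL in the result slot instead of halting the whole pipeline.
# wrap fit_func() so fitting errors are captured rather than raised
safe_fit <- purrr::safely(fit_func)
# tbl_data <- tbl_data %>%
#   dplyr::mutate(fit = purrr::map(data, ~ safe_fit(data = .x, constr = constr,
#                                                   fixPar = fixPar)$result))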
Now that we have a model fit for each deployment, we want to predict hourly
locations from the fitted movement model. This is the payoff for all the work to this
point. By creating regular (in time) predictions for our deployments, we can
open the door to additional analysis (e.g. see the momentuHMM
and ctmcmove
packages).
The time interval you choose for predicting is up to you. You can even provide
a vector of prediction times. For convenience, predTime
accepts a character
string describing the time interval. In this case we use 1 hour.
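For example, a vector of prediction times might be supplied as in the hedged sketch below (the 6-hour interval and the date range are made up for illustration; depending on your version of crawl, predTime may need to match the class of your date_time column).
# a hypothetical 6-hour prediction grid
pred_times <- seq(from = as.POSIXct("2005-10-01 00:00:00", tz = "UTC"),
                  to = as.POSIXct("2006-04-01 00:00:00", tz = "UTC"),
                  by = "6 hours")
# crwPredict(fit, predTime = pred_times)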
Lastly, you’ll want to make sure the previous fit process was successful
before predicting. The crwMLE()
function should only return a crwFit
object if the fit was fully successful. That’s what we are checking for in the
filter
step below.
tbl_data <- tbl_data %>%
dplyr::filter(map_lgl(fit, ~inherits(.x,"crwFit"))) %>%
dplyr::mutate(pred_pts = purrr::map(fit,
~crwPredict(.x, predTime = "1 hour")))
The output from crwPredict()
represents predictions along the most likely
track. But, we might also want to explore (or analyze) the broader range of
possible tracks. The combination of crwSimulator()
and crwPostIS
functions
can do this.
At some point in the near future, I'll revisit this section and expand it with example code and plots for simulated tracks.
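Until then, a very rough sketch of the general pattern is shown below (untested here; the argument names come from the crawl documentation and the predTime syntax may differ across versions).
# draw a single posterior track sample for the first deployment (sketch only)
fit1 <- tbl_data$fit[[1]]
sim_obj <- crawl::crwSimulator(fit1, predTime = "1 hour")
post_track <- crawl::crwPostIS(sim_obj, fullPost = FALSE)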
sf tracks
Now that we have our tracks fit and predicted paths created, we want to plot them and explore the results.
The first thing we need to do is convert the crwPredict
results to sf
objects. We can do that with a convenience function within the latest (remember,
you’ll need the devel branch from GitHub) version of crawl
. Note that we
specify locType = "p"
. By default, crwPredict
returns predictions at both
the specified interval (in this case, hourly) and the time point for each
corresponding observed location. By specifying locType = "p"
we only get the
predicted locations at the regular interval of 1 hour.
tbl_data <- tbl_data %>%
dplyr::mutate(
pts_sf = purrr::map(pred_pts, ~ crawl::crw_as_sf(.x, ftype = "POINT",
locType ="p")),
line_sf = purrr::map(pred_pts, ~ crawl::crw_as_sf(.x, ftype = "LINESTRING",
locType = "p"))
)
tbl_data
## # A tibble: 29 x 6
## deployid data fit pred_pts pts_sf line_sf
## <chr> <list> <list> <list> <list> <list>
## 1 EB2004_5972_… <tibble [1… <S3: cr… <data.frame [… <tibble [2… <tibble […
## 2 EB2004_5973_… <tibble [6… <S3: cr… <data.frame [… <tibble [1… <tibble […
## 3 EB2005_5950_… <tibble [1… <S3: cr… <data.frame [… <tibble [3… <tibble […
## 4 EB2005_5952_… <tibble [1… <S3: cr… <data.frame [… <tibble [4… <tibble […
## 5 EB2005_5954_… <tibble [7… <S3: cr… <data.frame [… <tibble [2… <tibble […
## 6 EB2005_5956_… <tibble [1… <S3: cr… <data.frame [… <tibble [2… <tibble […
## 7 EB2005_5957_… <tibble [8… <S3: cr… <data.frame [… <tibble [2… <tibble […
## 8 EB2005_5958_… <tibble [3… <S3: cr… <data.frame [… <tibble [4… <tibble […
## 9 EB2005_5959_… <tibble [3… <S3: cr… <data.frame [… <tibble [4… <tibble […
## 10 EB2005_5960_… <tibble [1… <S3: cr… <data.frame [… <tibble [2… <tibble […
## # … with 19 more rows
Let's unnest our lines and plot them with mapview
.
tbl_data %>%
dplyr::select(deployid,line_sf) %>%
tidyr::unnest() %>%
dplyr::select(-id) %>%
sf::st_as_sf(crs = 3571) %>%
sf::st_transform(4326) %>% st_to_360() %>%
mapview::mapview(map.types = "Esri.OceanBasemap", zcol = "deployid",
burst = TRUE, legend = FALSE, homebutton = FALSE)
Bearded and spotted seals are coastal during some of their behavior and it isn’t unusual for the predicted tracks to cross land at times.
tbl_data %>%
dplyr::select(deployid, pts_sf) %>%
tidyr::unnest() %>%
sf::st_as_sf(crs = 3571) %>%
sf::st_intersects(bering) %>%
purrr::map_lgl(~ length(.x) > 0) %>%
sum()
## [1] 2393
We, ideally, would include a land mask as part of the crwMLE model fit. But,
that becomes computationally challenging. As a compromise, crawl
includes a
fix_path()
function (which is very much in beta development) that will re-route
portions of the predicted path around land.
tbl_data <- tbl_data %>%
dplyr::mutate(fix = purrr::map2(
pred_pts,
fit,
~ crawl::fix_path(.x,
vector_mask = bering,
crwFit = .y)
))
Now, let’s update our nested tibble with the new *_fix columns. We can also take a moment to examine the number of points that may still intersect with our land polygon (fingers crossed the resulting value is 0).
tbl_data <- tbl_data %>%
dplyr::mutate(
pts_fix = purrr::map(fix, ~ crawl::crw_as_sf(.x, ftype = "POINT",
locType = "p")),
line_fix = purrr::map(fix, ~ crawl::crw_as_sf(.x, ftype = "LINESTRING",
locType = "p"))
)
tbl_data %>%
dplyr::select(deployid,pts_fix) %>%
tidyr::unnest() %>%
sf::st_as_sf(crs = 3571) %>%
sf::st_intersects(bering) %>%
purrr::map_lgl(~ length(.x) > 0) %>%
sum()
## [1] 0
Now, let’s go ahead and re-create our mapview
and zoom in closely to some
of the coastline and see how things look. One thing to keep in mind is that
the tracklines may still cross land at some locations since we only predicted
every hour. There’s a tradeoff: the more frequent the predictions, the smoother
the path and the better the fix around land features. But, the computational
effort will also increase.
If you want to challenge your skills, go ahead and create a mapview
that plots
pts_fix
instead of line_fix
. But, remember the cautionary bit earlier about
creating MULTIPOINT features instead of sending large POINT features to the
map.
tbl_data %>%
dplyr::select(deployid,line_fix) %>%
tidyr::unnest() %>%
dplyr::select(-id) %>%
sf::st_as_sf(crs = 3571) %>%
sf::st_transform(4326) %>% st_to_360() %>%
mapview::mapview(map.types = "Esri.OceanBasemap", zcol = "deployid",
burst = TRUE, legend = FALSE, homebutton = FALSE)
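And, if you would like one possible answer to the pts_fix challenge above, here is a hedged sketch that follows the same MULTIPOINT grouping pattern we used earlier for the observed points.
tbl_data %>%
  dplyr::select(deployid, pts_fix) %>%
  tidyr::unnest() %>%
  sf::st_as_sf(crs = 3571) %>%
  dplyr::group_by(deployid) %>%
  dplyr::summarise() %>%   # collapses each deployment's points to a MULTIPOINT
  sf::st_transform(4326) %>% st_to_360() %>%
  mapview::mapview(map.types = "Esri.OceanBasemap", zcol = "deployid",
                   burst = TRUE, legend = FALSE, homebutton = FALSE)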
Now, let’s make something really cool and animate the tracks. We’ll use the
gganimate
package. This will probably take about 7-10 minutes to render.
library(gganimate)
library(ggspatial)
pts_by_day <- tbl_data %>%
dplyr::select(deployid,pts_fix) %>%
tidyr::unnest() %>%
dplyr::filter(locType == "p") %>%
dplyr::select(deployid,date_time, geometry) %>%
dplyr::mutate(anim_date = format(date_time, format="%B %d"),
yday = lubridate::yday(date_time)) %>%
sf::st_as_sf(crs = 3571) %>%
dplyr::arrange(deployid,yday) %>%
dplyr::group_by(deployid,yday) %>%
dplyr::summarise(do_union = FALSE)
animated_plot <- ggplot() +
annotation_spatial(bering, fill = "grey", lwd = 0) +
layer_spatial(pts_by_day, size = 0.75,aes(color = deployid)) +
scale_color_viridis_d() +
scale_x_continuous(breaks = seq(-180, 180, by = 5)) +
theme(legend.position = "none") +
ggtitle("Predicted Movements, Day of Year: {as.integer(frame_time)}",
subtitle = "bearded and spotted seals, Bering Sea") +
transition_time(yday) +
shadow_wake(wake_length = 0.1, alpha = FALSE)
gganimate::animate(animated_plot)