This document is under active development and subject to significant change at any time. Please do not cite or use the information in any form without contacting the author(s) for additional details.
This document describes the technical details of tidying satellite telemetry data associated with the Aleutian harbor seal research project. Data from the locations, histos, and behavior tables are tidy’d and joined with seal data from the PEP enterprise database. Data will be made available via an R package and also as *.csv files organized into a ‘data package’ and, eventually, be uploaded to the Arctic Data Center.
The wcUtils
package is available for install from GitHub and the other
packages are available for install from CRAN.
if (!require('devtools')) install.packages('devtools')
if (!require('wcUtils')) {
devtools::install_github('jmlondon/wcUtils')
}
library(tidyverse)
library(here)
library(RPostgreSQL)
library(xts)
The deployment data from Wildlife Computers does not include much of the detailed information regarding age, sex, species, etc of the the seal. The DeployID field provides a unique key for associating each deployment with a seal. These data are stored in a local, enterprise database. So, first thing we will do is setup our connection to this database.
library(DBI)
con <- dbConnect(RPostgres::Postgres(),
host = Sys.getenv("PEP_IP"),
user = keyringr::get_kc_account("pgpep_londonj"),
password = keyringr::decrypt_kc_pw("pgpep_londonj"),
dbname = "pep")
After downloading the deployment archives from the data portal, there are some additional steps we need to take in order to get things ready for analysis. Since each deployment is downloaded as a separate entity, we will combine all deployments into a single data.frame. There are some additional columns (e.g. gpe-* columns) that are not relevant for our work so we’ll remove them.
We also need to extract each of these archive zip files into a temporary
directory so we can then pull out the *.csv
files of interest
zipfiles <- list.files(here::here("datapack-raw"),
pattern = ".zip",full.names = TRUE)
temp_dir <- file.path(tempdir())
for (zipfile in zipfiles) {
unzip(file.path(zipfile),overwrite = TRUE, exdir = temp_dir)
}
All deployment archives will have a *-Locations.csv
that includes all of the
Argos locations for that deployment. Some deployments, however, also include
a FastLoc GPS sensor and the archive also includes a *-Locations.csv
with
GPS quality locations. In those cases where FastLoc GPS data exists, we want
to include those data. So, we will need to do some checking of file names within
each deployment to sort this out.
# column data type definitions
my_cols <- cols(
DeployID = col_character(),
Ptt = col_integer(),
Instr = col_character(),
Date = col_datetime("%H:%M:%S %d-%b-%Y"),
Type = col_character(),
Quality = col_character(),
Latitude = col_double(),
Longitude = col_double(),
`Error radius` = col_integer(),
`Error Semi-major axis` = col_integer(),
`Error Semi-minor axis` = col_integer(),
`Error Ellipse orientation` = col_integer(),
Offset = col_character(),
`Offset orientation` = col_character(),
`GPE MSD` = col_character(),
`GPE U` = col_character(),
Count = col_character(),
Comment = col_character()
)
# local functions
# identify ptt value from filename
extract_ptt <- function(fstring) {
if (nchar(fstring) > 20) {
ptt <- strsplit(fstring,"-")[[1]][2]
} else {
ptt <- strsplit(fstring,"-")[[1]][1]
}
return(ptt)
}
# identify gps fastloc deployments
fastloc_gps <- function(fstring) {
if (nchar(fstring) > 20) {
gps <- TRUE
} else {
gps <- FALSE
}
return(gps)
}
# make our column names consistent
make_names <- function(x) {
new_names <- make.names(colnames(x))
new_names <- gsub("\\.", "_", new_names)
new_names <- tolower(new_names)
colnames(x) <- new_names
x
}
loc_files <- list.files(temp_dir, pattern = "*-Locations.csv")
# read csv data into a nested dataframe
tbl_locs <- tibble(filename = loc_files) %>%
mutate(
file_ptt = map_chr(filename, ~ extract_ptt(.)),
file_gps = map_lgl(filename, ~ fastloc_gps(.)),
file_contents = map(filename,
~ read_csv(file.path(temp_dir, .),
col_types = my_cols))
)
# filter out argos location files that also have gps data files
gps_files <- tbl_locs %>% filter(file_gps)
argos_files <- tbl_locs %>% filter(!file_gps,
!file_ptt %in% gps_files$file_ptt)
# merge the data back together and clean up for our final table
tbl_locs <- dplyr::bind_rows(gps_files,argos_files) %>%
tidyr::unnest() %>%
dplyr::select(-(filename),-(file_ptt),-(file_gps),-(Ptt)) %>%
make_names() %>%
dplyr::rename(date_time = date) %>%
dplyr::arrange(deployid, date_time)
tbl_locs
## # A tibble: 168,777 x 17
## deployid instr date_time type quality latitude longitude
## <chr> <chr> <dttm> <chr> <chr> <dbl> <dbl>
## 1 PV2014_… Mk10 2014-09-02 22:12:21 Argos 0 51.9 -177.
## 2 PV2014_… Mk10 2014-09-02 22:47:59 Argos 1 51.9 -177.
## 3 PV2014_… Mk10 2014-09-02 23:09:44 Argos 2 51.9 -177.
## 4 PV2014_… Mk10 2014-09-02 23:56:41 Argos 1 51.9 -177.
## 5 PV2014_… Mk10 2014-09-03 00:22:46 Argos 2 51.9 -177.
## 6 PV2014_… Mk10 2014-09-03 01:05:42 Argos B 51.9 -177.
## 7 PV2014_… Mk10 2014-09-03 02:13:23 Argos 2 51.9 -177.
## 8 PV2014_… Mk10 2014-09-03 02:22:55 Argos 3 51.9 -177.
## 9 PV2014_… Mk10 2014-09-03 02:40:24 Argos 3 51.9 -177.
## 10 PV2014_… Mk10 2014-09-03 04:03:57 Argos 2 51.9 -177.
## # … with 168,767 more rows, and 10 more variables: error_radius <int>,
## # error_semi_major_axis <int>, error_semi_minor_axis <int>,
## # error_ellipse_orientation <int>, offset <chr>,
## # offset_orientation <chr>, gpe_msd <chr>, gpe_u <chr>, count <chr>,
## # comment <chr>
The locations data is subject to having consecutive records with duplicate times but slightly different coordinates. We don’t really want to throw out any records, so we’ll just add 1 second to the duplicate record.
make_unique <- function(x) {
xts::make.time.unique(x$date_time,eps = 1)
}
tbl_locs <- tbl_locs %>%
dplyr::arrange(deployid,date_time) %>%
dplyr::group_by(deployid) %>% tidyr::nest() %>%
dplyr::mutate(unique_time = purrr::map(data, make_unique)) %>%
tidyr::unnest() %>%
dplyr::select(-date_time) %>% rename(date_time = unique_time)
Let’s take a look at some basic summary stats on each deployment
tbl_locs %>% group_by(deployid) %>%
summarise(num_locs = n(),
start_date = min(date_time),
end_date = max(date_time))
## # A tibble: 143 x 4
## deployid num_locs start_date end_date
## <chr> <int> <dttm> <dttm>
## 1 PV2014_2001_10A0193 4917 2014-09-02 22:12:21 2015-04-08 04:31:34
## 2 PV2014_2001_13U0082 526 2014-09-02 22:13:04 2014-12-30 14:32:18
## 3 PV2014_2002_13A0249 5563 2014-09-02 22:00:00 2015-03-30 07:58:21
## 4 PV2014_2002_13U0144 146 2014-09-02 23:56:01 2014-09-19 21:30:38
## 5 PV2014_2003_13U0076 582 2014-09-03 12:39:49 2015-01-25 23:55:45
## 6 PV2014_2003_14A0117 4636 2014-09-02 22:44:00 2015-03-26 01:09:41
## 7 PV2014_2004_14U0268 129 2014-09-03 17:49:47 2015-07-10 17:06:07
## 8 PV2014_2005_10A0199 3821 2014-09-03 01:05:55 2015-04-02 03:40:16
## 9 PV2014_2005_15A0888 440 2016-09-13 03:18:00 2016-10-03 08:57:03
## 10 PV2014_2005_R1_14U0356 226 2016-10-03 04:02:39 2017-05-17 21:33:30
## # … with 133 more rows
The information downloaded from the WCDP does not contain any details regarding
the seal the tag was deployed on (e.g. age, sex, release date, end
date). We have that data stored seperately and link the information via the
assigned deployid value which is present in both data sources. The enddate
values have been specified by PEP researchers based on examination of the data.
In most cases, the enddate corresponds with the last transmission from the
tag. However, some tags may fall of on shore and continue to transmit well after
detaching from the seal. We also need to standardise the naming scheme for
columns and rely on a custom make_names
function to do this.
tbl_locs <- con %>% tbl(dbplyr::in_schema("telem","tag_deployments")) %>%
dplyr::collect() %>%
dplyr::right_join(tbl_locs, by = 'deployid')
Check to make sure deployment start and end dates are set
check_dates <- tbl_locs %>% filter(is.na(deploy_dt) | is.na(end_dt)) %>%
group_by(deployid) %>%
summarise(nlocs = n())
check_dates
## # A tibble: 2 x 2
## deployid nlocs
## <chr> <int>
## 1 PV2014_2005_15A0888 440
## 2 PV2015_2020_14U0358 622
if(nrow(check_dates) > 0) {
warning("Some deployments lack start or end dates in the database")
}
tbl_locs %>% dplyr::filter(!is.na(deploy_dt) & !is.na(end_dt)) %>%
rowwise() %>%
dplyr::filter(between(date_time,deploy_dt, end_dt))
## Source: local data frame [166,463 x 24]
## Groups: <by row>
##
## # A tibble: 166,463 x 24
## speno species deployid serial_num ptt tag_family deploy_dt
## <chr> <chr> <chr> <chr> <int> <chr> <dttm>
## 1 PV20… Pv PV2014_… 10A0193 37521 SPLA 2014-09-02 14:29:00
## 2 PV20… Pv PV2014_… 10A0193 37521 SPLA 2014-09-02 14:29:00
## 3 PV20… Pv PV2014_… 10A0193 37521 SPLA 2014-09-02 14:29:00
## 4 PV20… Pv PV2014_… 10A0193 37521 SPLA 2014-09-02 14:29:00
## 5 PV20… Pv PV2014_… 10A0193 37521 SPLA 2014-09-02 14:29:00
## 6 PV20… Pv PV2014_… 10A0193 37521 SPLA 2014-09-02 14:29:00
## 7 PV20… Pv PV2014_… 10A0193 37521 SPLA 2014-09-02 14:29:00
## 8 PV20… Pv PV2014_… 10A0193 37521 SPLA 2014-09-02 14:29:00
## 9 PV20… Pv PV2014_… 10A0193 37521 SPLA 2014-09-02 14:29:00
## 10 PV20… Pv PV2014_… 10A0193 37521 SPLA 2014-09-02 14:29:00
## # … with 166,453 more rows, and 17 more variables: end_dt <dttm>,
## # date_time <dttm>, instr <chr>, type <chr>, quality <chr>,
## # latitude <dbl>, longitude <dbl>, error_radius <int>,
## # error_semi_major_axis <int>, error_semi_minor_axis <int>,
## # error_ellipse_orientation <int>, offset <chr>,
## # offset_orientation <chr>, gpe_msd <chr>, gpe_u <chr>, count <chr>,
## # comment <chr>
The percent-dry timeline data tracks the percentage of each hour during the deployment that the tag was dry. This data is mostly used to study haul-out behavior of seals. Since these data are stored within the histos data file structure, some processing needs to happend in order to make these data more analysis friendly.
# column data type definitions
my_cols <- readr::cols(
DeployID = readr::col_character(),
Ptt = readr::col_character(),
DepthSensor = readr::col_character(),
Source = readr::col_character(),
Instr = readr::col_character(),
HistType = readr::col_character(),
Date = readr::col_datetime("%H:%M:%S %d-%b-%Y"),
`Time Offset` = readr::col_double(),
Count = readr::col_integer(),
BadTherm = readr::col_integer(),
LocationQuality = readr::col_character(),
Latitude = readr::col_double(),
Longitude = readr::col_double(),
NumBins = readr::col_integer(),
Sum = readr::col_integer(),
Bin1 = readr::col_double(), Bin2 = readr::col_double(),
Bin3 = readr::col_double(), Bin4 = readr::col_double(),
Bin5 = readr::col_double(), Bin6 = readr::col_double(),
Bin7 = readr::col_double(), Bin8 = readr::col_double(),
Bin9 = readr::col_double(), Bin10 = readr::col_double(),
Bin11 = readr::col_double(), Bin12 = readr::col_double(),
Bin13 = readr::col_double(), Bin14 = readr::col_double(),
Bin15 = readr::col_double(), Bin16 = readr::col_double(),
Bin17 = readr::col_double(), Bin18 = readr::col_double(),
Bin19 = readr::col_double(), Bin20 = readr::col_double(),
Bin21 = readr::col_double(), Bin22 = readr::col_double(),
Bin23 = readr::col_double(), Bin24 = readr::col_double(),
Bin25 = readr::col_double(), Bin26 = readr::col_double(),
Bin27 = readr::col_double(), Bin28 = readr::col_double(),
Bin29 = readr::col_double(), Bin30 = readr::col_double(),
Bin31 = readr::col_double(), Bin32 = readr::col_double(),
Bin33 = readr::col_double(), Bin34 = readr::col_double(),
Bin35 = readr::col_double(), Bin36 = readr::col_double(),
Bin37 = readr::col_double(), Bin38 = readr::col_double(),
Bin39 = readr::col_double(), Bin40 = readr::col_double(),
Bin41 = readr::col_double(), Bin42 = readr::col_double(),
Bin43 = readr::col_double(), Bin44 = readr::col_double(),
Bin45 = readr::col_double(), Bin46 = readr::col_double(),
Bin47 = readr::col_double(), Bin48 = readr::col_double(),
Bin49 = readr::col_double(), Bin50 = readr::col_double(),
Bin51 = readr::col_double(), Bin52 = readr::col_double(),
Bin53 = readr::col_double(), Bin54 = readr::col_double(),
Bin55 = readr::col_double(), Bin56 = readr::col_double(),
Bin57 = readr::col_double(), Bin58 = readr::col_double(),
Bin59 = readr::col_double(), Bin60 = readr::col_double(),
Bin61 = readr::col_double(), Bin62 = readr::col_double(),
Bin63 = readr::col_double(), Bin64 = readr::col_double(),
Bin65 = readr::col_double(), Bin66 = readr::col_double(),
Bin67 = readr::col_double(), Bin68 = readr::col_double(),
Bin69 = readr::col_double(), Bin70 = readr::col_double(),
Bin71 = readr::col_double(), Bin72 = readr::col_double()
)
tbl_percent <- list.files(temp_dir, pattern = "*-Histos.csv",
full.names = TRUE) %>%
purrr::map(read_csv,col_types = my_cols) %>%
dplyr::bind_rows() %>%
make_names() %>%
dplyr::filter(histtype %in% c("Percent")) %>%
dplyr::select(-(bin25:bin72)) %>%
dplyr::select(-(ptt)) %>%
dplyr::arrange(deployid, date)
tbl_percent
## # A tibble: 18,288 x 38
## deployid depthsensor source instr histtype date
## <chr> <chr> <chr> <chr> <chr> <dttm>
## 1 PV2014_… 0.500000 Trans… Mk10 Percent 2014-09-02 21:00:00
## 2 PV2014_… 0.500000 Trans… Mk10 Percent 2014-09-03 00:00:00
## 3 PV2014_… 0.500000 Trans… Mk10 Percent 2014-09-04 00:00:00
## 4 PV2014_… 0.500000 Trans… Mk10 Percent 2014-09-05 00:00:00
## 5 PV2014_… 0.500000 Trans… Mk10 Percent 2014-09-06 00:00:00
## 6 PV2014_… 0.500000 Trans… Mk10 Percent 2014-09-07 00:00:00
## 7 PV2014_… 0.500000 Trans… Mk10 Percent 2014-09-08 00:00:00
## 8 PV2014_… 0.500000 Trans… Mk10 Percent 2014-09-09 00:00:00
## 9 PV2014_… 0.500000 Trans… Mk10 Percent 2014-09-10 00:00:00
## 10 PV2014_… 0.500000 Trans… Mk10 Percent 2014-09-11 00:00:00
## # … with 18,278 more rows, and 32 more variables: time_offset <dbl>,
## # count <int>, badtherm <int>, locationquality <chr>, latitude <dbl>,
## # longitude <dbl>, numbins <int>, sum <int>, bin1 <dbl>, bin2 <dbl>,
## # bin3 <dbl>, bin4 <dbl>, bin5 <dbl>, bin6 <dbl>, bin7 <dbl>,
## # bin8 <dbl>, bin9 <dbl>, bin10 <dbl>, bin11 <dbl>, bin12 <dbl>,
## # bin13 <dbl>, bin14 <dbl>, bin15 <dbl>, bin16 <dbl>, bin17 <dbl>,
## # bin18 <dbl>, bin19 <dbl>, bin20 <dbl>, bin21 <dbl>, bin22 <dbl>,
## # bin23 <dbl>, bin24 <dbl>
Hourly percent-dry data are stored within the first 24 bin columns. Each bin column refers to the hour of the day (bin1 = 00:00-01:00) in UTC time. So, we’ll do some tidying, gathering, mutating and summarizing to create a more meaningful data frame.
## Create a tbl_df that Relates Bin Columns to Day Hours
bins <- tibble(bin = paste("bin",1:24,sep = ""),hour = 0:23)
## Chain Together Multiple Commands to Create Our Tidy Dataset
tbl_percent <- tbl_percent %>%
tidyr::gather(bin,percent_dry, starts_with('bin')) %>%
dplyr::left_join(bins, by = "bin") %>%
dplyr::rename(date_hour = date) %>%
dplyr::mutate(date_hour = date_hour + lubridate::hours(hour)) %>%
dplyr::select(deployid,date_hour,percent_dry) %>%
group_by(deployid, date_hour) %>%
summarize(percent_dry = mean(percent_dry)) %>%
ungroup() %>%
dplyr::arrange(deployid,date_hour)
tbl_percent <- con %>% tbl(dbplyr::in_schema("telem","tag_deployments")) %>%
dplyr::collect() %>%
dplyr::right_join(tbl_percent, by = 'deployid')
tbl_percent
## # A tibble: 438,035 x 10
## speno species deployid serial_num ptt tag_family deploy_dt
## <chr> <chr> <chr> <chr> <int> <chr> <dttm>
## 1 PV20… Pv PV2014_… 10A0193 37521 SPLA 2014-09-02 14:29:00
## 2 PV20… Pv PV2014_… 10A0193 37521 SPLA 2014-09-02 14:29:00
## 3 PV20… Pv PV2014_… 10A0193 37521 SPLA 2014-09-02 14:29:00
## 4 PV20… Pv PV2014_… 10A0193 37521 SPLA 2014-09-02 14:29:00
## 5 PV20… Pv PV2014_… 10A0193 37521 SPLA 2014-09-02 14:29:00
## 6 PV20… Pv PV2014_… 10A0193 37521 SPLA 2014-09-02 14:29:00
## 7 PV20… Pv PV2014_… 10A0193 37521 SPLA 2014-09-02 14:29:00
## 8 PV20… Pv PV2014_… 10A0193 37521 SPLA 2014-09-02 14:29:00
## 9 PV20… Pv PV2014_… 10A0193 37521 SPLA 2014-09-02 14:29:00
## 10 PV20… Pv PV2014_… 10A0193 37521 SPLA 2014-09-02 14:29:00
## # … with 438,025 more rows, and 3 more variables: end_dt <dttm>,
## # date_hour <dttm>, percent_dry <dbl>
The behavior data stream contains all of the information we have regarding dive behavior. There are only a few additional steps that need to be dealt with. We will rename a few columns (count, start, end) so as not to conflict with reservered words in the database.
# column data type definitions
my_cols <- readr::cols_only(
DeployID = readr::col_character(),
Ptt = readr::col_character(),
DepthSensor = readr::col_character(),
Source = readr::col_character(),
Instr = readr::col_character(),
Count = readr::col_integer(),
Start = readr::col_datetime("%H:%M:%S %d-%b-%Y"),
End = readr::col_datetime("%H:%M:%S %d-%b-%Y"),
What = readr::col_character(),
Number = readr::col_integer(),
Shape = readr::col_character(),
DepthMin = readr::col_double(),
DepthMax = readr::col_double(),
DurationMin = readr::col_double(),
DurationMax = readr::col_double(),
Shallow = readr::col_integer(),
Deep = readr::col_integer()
)
tbl_behav <- list.files(temp_dir, pattern = "*-Behavior.csv",
full.names = TRUE) %>%
purrr::map(read_csv,col_types = my_cols) %>%
dplyr::bind_rows() %>%
make_names() %>%
dplyr::rename(behav_start = start,
behav_end = end,
msg_count = count) %>%
dplyr::select(-(ptt)) %>%
dplyr::arrange(deployid, behav_start)
tbl_behav <- con %>% tbl(dbplyr::in_schema("telem","tag_deployments")) %>%
dplyr::collect() %>%
dplyr::right_join(tbl_behav, by = 'deployid')
tbl_behav
## # A tibble: 1,171,940 x 23
## speno species deployid serial_num ptt tag_family deploy_dt
## <chr> <chr> <chr> <chr> <int> <chr> <dttm>
## 1 PV20… Pv PV2014_… 10A0193 37521 SPLA 2014-09-02 14:29:00
## 2 PV20… Pv PV2014_… 10A0193 37521 SPLA 2014-09-02 14:29:00
## 3 PV20… Pv PV2014_… 10A0193 37521 SPLA 2014-09-02 14:29:00
## 4 PV20… Pv PV2014_… 10A0193 37521 SPLA 2014-09-02 14:29:00
## 5 PV20… Pv PV2014_… 10A0193 37521 SPLA 2014-09-02 14:29:00
## 6 PV20… Pv PV2014_… 10A0193 37521 SPLA 2014-09-02 14:29:00
## 7 PV20… Pv PV2014_… 10A0193 37521 SPLA 2014-09-02 14:29:00
## 8 PV20… Pv PV2014_… 10A0193 37521 SPLA 2014-09-02 14:29:00
## 9 PV20… Pv PV2014_… 10A0193 37521 SPLA 2014-09-02 14:29:00
## 10 PV20… Pv PV2014_… 10A0193 37521 SPLA 2014-09-02 14:29:00
## # … with 1,171,930 more rows, and 16 more variables: end_dt <dttm>,
## # depthsensor <chr>, source <chr>, instr <chr>, msg_count <int>,
## # behav_start <dttm>, behav_end <dttm>, what <chr>, number <int>,
## # shape <chr>, depthmin <dbl>, depthmax <dbl>, durationmin <dbl>,
## # durationmax <dbl>, shallow <int>, deep <int>
The behavior table includes 3 types of data related to the behavior timeline: Message,Surface, and Dive. To make it easier to work with, we will create a nested tibble that is split by these three behavior message types.
tbl_behav_nest <- tbl_behav %>% group_by(what) %>% nest()
tbl_behav_dive <- tbl_behav_nest %>%
dplyr::filter(what == "Dive") %>%
tidyr::unnest() %>%
dplyr::select(-(ptt))
tbl_behav_dive <- tbl_behav_dive %>%
dplyr::mutate(end_dt = ifelse(is.na(end_dt), Sys.time(), end_dt)) %>%
dplyr::mutate(end_dt = as.POSIXct(end_dt,
tz = "UTC",
origin = "1970-01-01"))
tbl_behav_dive
## # A tibble: 532,805 x 22
## what speno species deployid serial_num tag_family deploy_dt
## <chr> <chr> <chr> <chr> <chr> <chr> <dttm>
## 1 Dive PV20… Pv PV2014_… 10A0193 SPLA 2014-09-02 14:29:00
## 2 Dive PV20… Pv PV2014_… 10A0193 SPLA 2014-09-02 14:29:00
## 3 Dive PV20… Pv PV2014_… 10A0193 SPLA 2014-09-02 14:29:00
## 4 Dive PV20… Pv PV2014_… 10A0193 SPLA 2014-09-02 14:29:00
## 5 Dive PV20… Pv PV2014_… 10A0193 SPLA 2014-09-02 14:29:00
## 6 Dive PV20… Pv PV2014_… 10A0193 SPLA 2014-09-02 14:29:00
## 7 Dive PV20… Pv PV2014_… 10A0193 SPLA 2014-09-02 14:29:00
## 8 Dive PV20… Pv PV2014_… 10A0193 SPLA 2014-09-02 14:29:00
## 9 Dive PV20… Pv PV2014_… 10A0193 SPLA 2014-09-02 14:29:00
## 10 Dive PV20… Pv PV2014_… 10A0193 SPLA 2014-09-02 14:29:00
## # … with 532,795 more rows, and 15 more variables: end_dt <dttm>,
## # depthsensor <chr>, source <chr>, instr <chr>, msg_count <int>,
## # behav_start <dttm>, behav_end <dttm>, number <int>, shape <chr>,
## # depthmin <dbl>, depthmax <dbl>, durationmin <dbl>, durationmax <dbl>,
## # shallow <int>, deep <int>
tbl_behav_surf <- tbl_behav_nest %>%
dplyr::filter(what == "Surface") %>%
tidyr::unnest() %>%
dplyr::select(-(ptt))
tbl_behav_surf <- tbl_behav_surf %>%
dplyr::mutate(end_dt = ifelse(is.na(end_dt), Sys.time(), end_dt)) %>%
dplyr::mutate(end_dt = as.POSIXct(end_dt,
tz = "UTC",
origin = "1970-01-01"))
tbl_behav_surf
## # A tibble: 532,532 x 22
## what speno species deployid serial_num tag_family deploy_dt
## <chr> <chr> <chr> <chr> <chr> <chr> <dttm>
## 1 Surf… PV20… Pv PV2014_… 10A0193 SPLA 2014-09-02 14:29:00
## 2 Surf… PV20… Pv PV2014_… 10A0193 SPLA 2014-09-02 14:29:00
## 3 Surf… PV20… Pv PV2014_… 10A0193 SPLA 2014-09-02 14:29:00
## 4 Surf… PV20… Pv PV2014_… 10A0193 SPLA 2014-09-02 14:29:00
## 5 Surf… PV20… Pv PV2014_… 10A0193 SPLA 2014-09-02 14:29:00
## 6 Surf… PV20… Pv PV2014_… 10A0193 SPLA 2014-09-02 14:29:00
## 7 Surf… PV20… Pv PV2014_… 10A0193 SPLA 2014-09-02 14:29:00
## 8 Surf… PV20… Pv PV2014_… 10A0193 SPLA 2014-09-02 14:29:00
## 9 Surf… PV20… Pv PV2014_… 10A0193 SPLA 2014-09-02 14:29:00
## 10 Surf… PV20… Pv PV2014_… 10A0193 SPLA 2014-09-02 14:29:00
## # … with 532,522 more rows, and 15 more variables: end_dt <dttm>,
## # depthsensor <chr>, source <chr>, instr <chr>, msg_count <int>,
## # behav_start <dttm>, behav_end <dttm>, number <int>, shape <chr>,
## # depthmin <dbl>, depthmax <dbl>, durationmin <dbl>, durationmax <dbl>,
## # shallow <int>, deep <int>
We are going to create two types of export data files: comma-separated files and
R data files. The comma-separated files will be stored within the data package
(and, eventually, archived with the Arctic Data Center). The R data files will
be stored within the data directory of this R package so data can be easily
accessilbe by R users. Future versions of the R package will include an
install_data
function that will pull data form the Arctic Data Center instead
of distributing with the R package.
readr::write_csv(tbl_locs, path = 'aleutpv_tbl_locs.csv')
readr::write_csv(tbl_percent, path = 'aleutpv_tbl_percent.csv')
readr::write_csv(tbl_behav_dive, path = 'aleutpv_tbl_behav_dive.csv')
readr::write_csv(tbl_behav_surf, path = 'aleutpv_tbl_behav_surf.csv')
save(tbl_locs, file = '../data/tbl_locs.rda')
save(tbl_percent, file = '../data/tbl_percent.rda')
save(tbl_behav_dive, file = '../data/tbl_behav_dive.rda')
save(tbl_behav_surf, file = '../data/tbl_behav_surf.rda')