Generate SIS Connect files for multiple plant species
Source:vignettes/Generate-csvs.Rmd
The aim of LCr is to speed up the process of adding Least Concern (LC) species to the IUCN Red List. We’ll start with a list of plant species and end up with a zip file of CSV files that contain the minimal required information to support a Least Concern assessment. The zip file can then be uploaded into the IUCN Species Information System (SIS) via SIS Connect (requires registration), where the draft assessments can be edited, reviewed and hopefully published on the IUCN Red List in due course. LCr also generates spatial data to support the assessment.
Before you start, make sure you have the rWCVP package installed along with the associated data package rWCVPdata. For more information see the Getting Started guide.
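Before running the steps in this guide, it can help to confirm that the required packages are available. The following is a minimal sketch using base R only. It reports which of the packages named above are missing and does not try to install them, since the installation source may differ between setups (see the Getting Started guide).

```r
# Packages named in this guide; adjust the vector if your setup differs.
needed <- c("LCr", "rWCVP", "rWCVPdata")

# requireNamespace() returns FALSE (rather than erroring) when a package is absent.
installed <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)

missing_pkgs <- needed[!installed]
if (length(missing_pkgs) > 0) {
  message("Install these packages before continuing: ",
          paste(missing_pkgs, collapse = ", "))
} else {
  message("All required packages are available.")
}
```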
Load the LCr library:
library(LCr)
Get name keys from a species list
The first step is to determine the list of LC species that you want to document. A study predicted extinction risk for all species of flowering plants (Angiosperms). You can filter on species that are confidently predicted to be LC, or manually generate a data frame from a list as shown below.
Note that predictions can be wrong, so please verify that your selected species are genuine LC species. A useful way to help determine whether your species is LC is to check whether it has previously been assessed. The ThreatSearch resource maintained by BGCI contains evidence-based plant conservation assessments compiled from digital resources including national/regional Red Lists.
lc_species <-
  data.frame(sp = c(
    "Crabbea acaulis", "Crabbea cirsioides", "Crabbea nana", "Crabbea velutina"
  ))
print(lc_species)
#> sp
#> 1 Crabbea acaulis
#> 2 Crabbea cirsioides
#> 3 Crabbea nana
#> 4 Crabbea velutina
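If you start from a table of predictions rather than a hand-typed list, the filtering step mentioned above might look like the sketch below. The table, its column names (`taxon_name`, `predicted_category`, `confidence`) and the cut-off value are all hypothetical placeholders for illustration; a real prediction dataset will have its own structure.

```r
# Toy prediction table: the names, column names, and values are invented
# placeholders, not output from any published study.
predictions <- data.frame(
  taxon_name = c("Genus alpha", "Genus beta", "Genus gamma", "Genus delta"),
  predicted_category = c("LC", "LC", "threatened", "LC"),
  confidence = c(0.97, 0.62, 0.91, 0.88),
  stringsAsFactors = FALSE
)

# Hypothetical cut-off for a "confident" prediction; choose one that suits your data.
min_confidence <- 0.85

keep <- predictions$predicted_category == "LC" &
  predictions$confidence >= min_confidence

# Build a single-column data frame in the same shape as lc_species.
lc_candidates <- data.frame(
  sp = predictions$taxon_name[keep],
  stringsAsFactors = FALSE
)
print(lc_candidates)
```

The result has the same one-column layout as the manually built data frame, so it could be passed on to the name matching step in the same way.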
Next we can reconcile the names against other data sources using existing packages such as rWCVP and rgbif. This will provide us with name keys that can be used to search for data on those species.
In this case we want to enforce a single matching name for every name in our list, so we set the match parameter to ‘single’, but you can set this to ‘multiple’ if you wish to allow multiple matches. Similarly, we want to focus only on accepted species, so we set the tax_status parameter to ‘accepted’, but you can set this to ‘any’ if you wish to return names with other taxonomic statuses.
lc_keys <-
  get_name_keys(
    df = lc_species,
    name_column = "sp",
    tax_status = "accepted",
    match = "single"
  )
# take a look at the resulting table (glimpse() comes from the dplyr package)
dplyr::glimpse(lc_keys)
The name matching went well. I’ve checked the authors and I’m happy that the names match to the correct concept for these taxa. Note that it is worth spending time on the name matching step to ensure you are using a consistent concept for the taxa you are working with across GBIF and WCVP. Adjust the tax_status and match parameters for fuzzier searching.
We now have some useful fields to help us find more data for these species. The GBIF_usageKey is an identifier for species according to the GBIF name backbone and the wcvp_ipni_id identifies the species according to the WCVP name backbone.
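A quick sanity check on the two identifier fields can catch names that failed to match or that collapsed onto the same taxon. The sketch below runs on a toy stand-in for the keys table so that it is self-contained. The identifier values are placeholders, and the presence of an `sp` column in the real output is an assumption.

```r
# Toy stand-in for lc_keys: the identifier values are placeholders, and the
# presence of an `sp` column in the real output is an assumption.
keys_example <- data.frame(
  sp = c("Genus alpha", "Genus beta", "Genus gamma"),
  GBIF_usageKey = c(1001, NA, 1003),
  wcvp_ipni_id = c("id-1", "id-2", "id-2"),
  stringsAsFactors = FALSE
)

# Rows lacking either identifier cannot be used to look up data later on.
no_key <- is.na(keys_example$GBIF_usageKey) | is.na(keys_example$wcvp_ipni_id)

# The same WCVP identifier on several rows suggests that two input names
# resolved to one taxon concept.
dup_id <- duplicated(keys_example$wcvp_ipni_id) |
  duplicated(keys_example$wcvp_ipni_id, fromLast = TRUE)

# Show the rows that deserve a second look.
print(keys_example[no_key | dup_id, ])
```

Replacing `keys_example` with your own keys table applies the same checks to real matches, provided the two identifier columns are named as described above.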
Generate an LC point file
Use the make_LC_points function to kick off a download from GBIF, clean the downloaded occurrences, and reformat them to IUCN spatial standards. Name information is required for the standard point file. Set the range_check parameter to TRUE if you want to exclude points outside the native range according to WCVP, otherwise standard cleaning protocols are applied using the CoordinateCleaner package.
Note that we will use the rgbif package to obtain the occurrence data. You will need to set up your GBIF credentials to obtain the downloads. After you have set up an account at GBIF, you need to register your credentials in the R environment; see this post for an explanation.
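As a rough check that the credentials are registered, you can look for the environment variables that rgbif reads for download requests (`GBIF_USER`, `GBIF_PWD` and `GBIF_EMAIL`). This sketch only tests whether they are set in the current session. It does not validate them against GBIF, and it assumes the download step relies on these variables rather than on credentials passed some other way.

```r
# Environment variables that rgbif looks up for download requests.
gbif_vars <- c("GBIF_USER", "GBIF_PWD", "GBIF_EMAIL")

# Sys.getenv() returns "" for unset variables; nzchar() flags the non-empty ones.
is_set <- nzchar(Sys.getenv(gbif_vars))

if (all(is_set)) {
  message("GBIF credentials found.")
} else {
  message("Not set: ", paste(gbif_vars[!is_set], collapse = ", "),
          ". Add them to your .Renviron file and restart R.")
}
```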
The IUCN spatial point standards require some fields relating to the data compiler and associated institution. These can be manually entered as below.
lc_sis_occs <-
  make_LC_points(
    keys_df = lc_keys,
    first_name = "Steven",
    second_name = "Bachman",
    institution = "Royal Botanic Gardens, Kew",
    range_check = TRUE
  )
# save the file - you'll need to submit this to IUCN
library(readr)
write_csv(lc_sis_occs$points, "lc_points.csv")
Some occurrence records were removed as they did not pass the cleaning tests or were outside the native range. Note that the output from make_LC_points will likely be a large list with two or three objects: a data frame of occurrences called points, a data frame called citation that holds the citation details for this download, and, if you set range_check = TRUE, the native ranges as a data frame. The GBIF citation is important to retain as we need to cite the use of this data in the Red List assessment, and we’ll add this to the reference file later on.
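Because the number of elements in the returned list depends on range_check, it is safer to test for the native ranges than to assume they are present. The sketch below uses a toy list in place of the real output. The element names `points`, `citation` and `native_ranges` follow this guide, while the columns and values inside them are invented placeholders.

```r
# Toy stand-in for the list returned by make_LC_points(); the column names
# and values inside each element are invented placeholders.
occs_example <- list(
  points = data.frame(
    id = 1:3,
    lat = c(-1.2, -3.4, -2.0),
    lon = c(36.8, 35.1, 37.3)
  ),
  citation = data.frame(reference = "placeholder GBIF download citation")
)

# native_ranges is only expected when range_check = TRUE, so test for it
# rather than assuming it exists.
has_ranges <- "native_ranges" %in% names(occs_example)

message("Elements returned: ", paste(names(occs_example), collapse = ", "))
message("Occurrence records kept: ", nrow(occs_example$points))
if (!has_ranges) {
  message("No native_ranges element found in this result.")
}
```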
Generate LC CSV files
We can now generate the LC CSV files. We can use the wcvp_ipni_id field as a unique identifier. We need to add the assessor details again as these are registered in the credits CSV file; note that the email address is also required. Now we can add our GBIF citation back in so that it goes in the references CSV file. Finally, the native ranges we generated earlier can be used to help define the list of countries that our species are native to.
lc_sis_files <- make_sis_csvs(
  unique_id = lc_keys$wcvp_ipni_id,
  wcvp_ipni_id = lc_keys$wcvp_ipni_id,
  first_name = "Steven",
  second_name = "Bachman",
  email = "s.bachman@kew.org",
  institution = "Royal Botanic Gardens, Kew",
  gbif_ref = lc_sis_occs$citation,
  native_ranges = lc_sis_occs$native_ranges
)
The final step is to make a ZIP file as this is needed to import the data into SIS Connect. With the lc_sis_files list object we just created, we run the make_zip function, which saves the zip file to your current working directory.
All done! Send those LC species to the Red List and start working on the next batch.
make_zip(lc_sis_files)
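Before uploading to SIS Connect, you may want to confirm that an archive was written and see which files it contains. This base R sketch makes no assumption about the name of the zip file. It simply inspects the most recently modified zip file in the working directory, which may not be the right one if other archives are present.

```r
# Find zip archives in the working directory; the archive's name is not
# assumed, so the most recently modified one is inspected.
zip_files <- list.files(getwd(), pattern = "\\.zip$", full.names = TRUE)

if (length(zip_files) == 0) {
  message("No zip file found in ", getwd())
} else {
  newest <- zip_files[which.max(file.mtime(zip_files))]
  # list = TRUE reports the archive contents without extracting anything.
  contents <- utils::unzip(newest, list = TRUE)
  print(contents$Name)
}
```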