This package was developed to facilitate the processes of reptile nomenclature update based on a search for species synonyms according to the Reptile Database website (Uetz et al., 2025).
Currently, the package accesses many species information from the Reptile Database using R interface.
It is useful to people trying to harmonize reptile nomenclature in databases from different sources (IUCN, species traits database, etc), or trying to get summaries from a given higher taxa or region (e.g.: Snakes from Brazil). But it can also just print single species accounts directly in R for quick information check.
The package is available on CRAN, check if the installed version is 1.1.0 or above. If not, the development version can be installed from GitHub.
To install the stable version of this package, users must run:
# Install CRAN version (check package version, must be 1.1.0 or above)
install.packages("letsRept")
#OR
# Install development version from GitHub
# install.packages("devtools")
devtools::install_github("joao-svalencar/letsRept")
library(letsRept)
NOTE: Some functions use parallel processing by default. Parallel processing may affect system performance. Adjust the number of cores according to your needs. A snippet showing how to access the number of available cores in your computer is provided along with the functions examples.
reptSearch
:
Retrieves species information from The Reptile Database using a binomial species name.
# single species:
reptSearch(binomial = "Apostolepis adhara")
# If user wants to check the list of references related to species, as listed in RDB:
reptSearch(binomial = "Bothrops pauloensis", getRef=TRUE)
reptSearch()
supports synonym-based queries. If the
provided binomial does not match any currently valid species name, the
function automatically passes the query to
reptAdvancedSearch(synonym = binomial)
. In such cases, if
the synonym can be unambiguously resolved to a valid species, the
function will return the corresponding species information. Otherwise,
it provides a link (which can be accessed using
reptSpecies(url = link)
) to a list of all species that
include the queried synonym in their synonymy.
reptRefs
:
Retrieves the reference list for a given species from the Reptile
Database. By default, it returns both the citation and a link to the
reference source (if available). If the user sets
getLink = FALSE
, the function returns only the plain-text
references.
This function is useful to extract citation metadata for taxonomic or historical analyses.
reptAdvancedSearch
:
Creates a link for a page as derived from an Advanced Search in RD (multiple species in a page):
# create multiple species link:
link_boa <- reptAdvancedSearch(genus = "Boa", exact = TRUE) #returns a link to a list of all Boa species.
link_apo <- reptAdvancedSearch(genus = "Apostolepis") #returns a link to a list of all Apostolepis species
link <- reptAdvancedSearch(higher = "snakes", location = "Brazil") #returns a link to a list of all snake species in Brazil
link <- reptAdvancedSearch(year = "2010 OR 2011 OR 2012") #returns a link to a list of all species described from 2010 to 2012
reptAdvancedSearch(synonym = "example")
will return the
species information directly (via reptSearch("example")
) if
the synonym uniquely matches a single valid species. The
exact = TRUE
argument in the Boa example parses
the information strictly as “Boa”, avoiding to return genus like
Pseudoboa, or Boaedon.
⚠️ Note:
The argument exact
does not work properly for searches
using logical arguments (e.g. AND/OR
). If you want to force
an exact match (e.g., “Boa” as a phrase) with multiple terms
(e.g., “Boa OR Apostolepis”), you must manually
include quotes in the input string:
link <- reptAdvancedSearch(synonym = "\"Boa\" OR Apostolepis")
reptSpecies
:
Retrieve species data from the species link created by
reptAdvancedSearch
. User can also copy and paste the url
from a RD webpage containing the list of species derived from an
advanced search. It includes higher taxa information, authors, year of
description, and the species url. By default this function uses parallel
processing, with default set to use half of the available cores. If user
wants to use just one core, there are options for backup saving based on
the checkpoint
argument.
#check number of available cores:
install.packages("parallel")
parallel::detectCores()
# sample multiple species data:
#Returns higher taxa information and species url:
boa <- reptSpecies(link_boa, taxonomicInfo = TRUE, getLink = TRUE) #link from reptAdvancedSearch(genus = "Boa")
#example 2
apo <- reptSpecies(link_apo, taxonomicInfo = TRUE, getLink = TRUE) #link from reptAdvancedSearch(genus = "Apostolepis")
#Returns only species url - Faster and recommended for large datasets:
boa <- reptSpecies(link_boa, taxonomicInfo = FALSE, getLink = TRUE) #link from reptAdvancedSearch(genus = "Boa")
#Returns only species names - Faster and recommended for large datasets for just nomenclatural check:
boa <- reptSpecies(link_boa, taxonomicInfo = FALSE, getLink = FALSE) #link from reptAdvancedSearch(genus = "Boa")
#example 2
apo <- reptSpecies(link_apo, taxonomicInfo = FALSE, getLink = TRUE) #link from reptAdvancedSearch(genus = "Apostolepis")
#With checkpoint and backups (only without parallel sampling: a lot slower but safer):
path <- "path/to/save/backup_file.rds" #not to run just example
apo <- reptSpecies(link_apo, taxonomicInfo = FALSE, getLink = TRUE, checkpoint=6, backup_file=path) #link from reptAdvancedSearch(genus = "Apostolepis")
⚠️ Note:
All console messages, warnings, and progress updates can be silenced.
However, reptSpecies()
issues a helpful warning if any
species information fails to be retrieved, along with instructions on
how to access those entries from the function output.
As many functions in letsRept
relies on internet
connection errors or failed species information sampling might happen
with no regards on the package coding. From reptSpecies()
a
given warning might appear:
Warning message:
In reptSpecies(dataList = allReptilesLinks, getLink = FALSE, taxonomicInfo = TRUE, :
Data sampling completed with errors for the following species:
- Ablepharus himalayanus: cannot open the connection
- Acanthodactylus tristrami: cannot open the connection
- Amphisbaena leucocephala: cannot open the connection
- Amphisbaena vanzolinii: cannot open the connection
- Amyda ornata: cannot open the connection
... and 10 others
In this case, all species have been tried to be sampled once, and for
some (in the example: 15 species) loading their page in R failed. This
will provide a dataframe with the usual 7 columns desired when
taxonomicInfo = TRUE
, plus a column to report the occurence
of an error, the message related to the error and the species url
regardeless of if getLink = FALSE
and the code necessary to
extract the failed species from the output in order to retry their
sampling:
Where df
is object named in the first run of
reptSpecies()
. This line samples the object selecting all
species that got any errors in the previous run along with their url,
the current requeried entries to run reptSpecies()
with
argument dataList
:
failed <- reptSpecies(dataList = failed_spp, taxonomicInfo = TRUE)
To merge the original df
object with the ressampled
species we first remove the failed species from df
, then we
rbind
the objects and reorder the species:
df <- df[!df$species %in% df$species[allRept$error == TRUE], c(-8,-9,-10)]
dfNew <- rbind(df, failed)
dfNew <- dfNew[order(dfNew$species),]
A similar approach is possible for any of the functions tied to web scraping.
A new version of reptSpecies()
allowing users to ask for
automated retries is planned for a future version.
This function summarizes the taxonomic content of a species list,
typically an object created with reptSpecies()
with higher
taxa information. If no object is provided it summarizes the internal
dataset allReptiles
.
# summary of internal database allReptiles
reptStats()
# filter by family and return summary table (internal database)
reptStats(family = "Elapidae")
reptStats(family = "Elapidae", verbose = TRUE)
# creating a subset from RDB
link <- reptAdvancedSearch(higher="snakes", location = "Brazil")
snakes_br <- reptSpecies(link, taxonomicInfo = TRUE)
# summarizing subset
reptStats(snakes_br)
reptStats(snakes_br, family = "Viperidae", verbose = TRUE)
reptSynonyms
:
Samples species synonyms either using binomial or a data frame with
species names and the species link (e.g.: the result of
reptSpecies(link, getLink=TRUE)
). By default this function
uses parallel processing, with default set to use half of the available
cores.
I benchmarked the options using just a vector of species names and using a dataframe with species url. In general using the dataframe is faster, but for a small list of species using just species names should be enough.
#check number of available cores:
install.packages("parallel")
parallel::detectCores()
# sample species synonyms
boa_syn <- reptSynonyms(boa) # using data frame created, with reptSpecies(boa_link, getLink = TRUE)
Bconstrictor_syn <- reptSynonyms(x = "Boa constrictor") # using species binomial
#example 2
apo_syn <- reptSynonyms(apo)
⚠️ Note:
All console messages, warnings, and progress updates can be silenced.
However, reptSynonyms()
issues a helpful warning if any
species information fails to be retrieved, along with instructions on
how to access those entries from the original data frame.
The complex regex
pattern used to sample synonyms from
The Reptile Database is quite efficient, but still samples about 0.2% in
an incorrect format. Most cases represent unusual nomenclature so users
might not face any problems trying to match current valid names. In any
case, I fixed (potentially) all unusual synonym formats in the internal
dataset allSynonyms
(last update: 23rd May, 2025)
reptCompare
:
Compares the nomenclature of a user provided vector of species
(x
) with the RDB nomenclature. User may provide vector of
current names sampled from RDB with reptSpecies(), otherwise the names
in x
will be compared with those in the internal dataset
allReptiles
(the dataset documentation to see the RDB
version (current is May, 2025).
The function returns species that are either unmatched (“review”) or matched with the RDB list, or both.
my_species <- data.frame(species = c("Boa constrictor", "Pantherophis guttatus", "Fake species"))
reptCompare(my_species)
reptCompare(my_species, filter = "review")
reptCompare(my_species, filter = "matched")
Species that requires nomenclature review are usually queried to
reptSync()
, while matched species is ideally queried to
reptSplitCheck
with a reference date parsed to the argument
pubDate
.
reptSync
:
Initially inspired in function aswSync from package AmphiNom (Liedtke, 2018).
This is the most recursive function of the package, using all the previous functions in order to provide the most likely updated nomenclature for the queried species. By default this function uses parallel processing, with default set to use half of the available cores.
The function is divided in two main steps. Here is how it works:
Step 1
The function queries a vector of species (e.g.: IUCN, or a regional
list), check their validity through reptSearch
and returns
a data frame with current valid species names. When
reptSearch
finds a species page it assumes that is the
valid name for the queried species and returns the status “up_to_date”.
When reptSearch
doesn’t find a species it parses the
binomial to reptAdvancedSearch
using the synonym filter. If
reptAvancedSearch
returns a link for a species page that
species name is considered valid for the synonym queried and the
function returns the status "updated"
. Otherwise,
reptAvancedSearch
will return a link for a page with a list
of species, then the function assumes that the queried synonym could be
assigned to any of those valid names and returns the status:
"ambiguous"
. If the queried species does not return a
species page nor a page for multiple species the function returns to
columns "RDB"
and "status"
the sentence
"not_found"
.
Step 2
Step 2 is activated only if solveAmbiguity = TRUE
. When
reptAvancedSearch
returns a link for a page with a list of
species, that link is parsed to reptSpecies
which collects
species names and urls
and automatically parses the
resulting data frame to reptSynonyms
. Finally, with the
result of reptSynonyms
the function compares the queried
species with all listed synonyms. If the queried species is actually
listed as a synonym of only one of the searched species (e.g. the
queried name is not a synonym, but is mentioned in the comments
section), the function will return that valid name and status will be
"updated"
. If the queried species is actually a synonym of
more than one valid species, then the function will return both species
names and the status will still be "ambiguous"
.
#check number of available cores:
install.packages("parallel")
parallel::detectCores()
# comparing synonyms:
query <- c("Vieira-Alencar authoristicus",
"Boa atlantica",
"Boa diviniloqua",
"Boa imperator",
"Boa constrictor longicauda")
reptSync(query)
#example 2:
query <- c("Vieira-Alencar authorisensis",
"Apostolepis ambiniger",
"Apostolepis cerradoensis",
"Elapomorphus assimilis",
"Apostolepis tertulianobeui",
"Apostolepis goiasensis")
reptSync(query)
The column "RDB"
shows current valid name according to
The Reptile Database.
Pay special attention to the "status"
column:
status:
"up_to_date"
- Species name provided is the current
valid name found in The Reptile Database
"updated"
- Species name provided is a synonym, and
the current valid name is reported unambiguously.
"ambiguous"
- Species name provided is considered a
synonym of more than one current valid species, likely from a split in
taxonomy.
"not_found"
- Species name provided in query is not
a current valid name nor synonym according to The Reptile Database. This
status could be derived from a typo within species name. A very rare
situation where this status can pop up is when the current valid name is
updated in the query but is not found in the list of synonyms.
"merge"
- Multiple names in the query are now
considered synonyms of a single valid species, likely from a
synonymization.
⚠️ ATTENTION!⚠️
letsRept
does not make authoritative taxonomic
decisions. It matches input names against currently accepted names in
the Reptile Database (RDB).
A name marked as "up_to_date"
may still refer to a taxon
that has been split, and thus may not reflect the most recent
population-level taxonomy, see function reptSplitCheck
below.
reptSplitCheck
:
Species names in recent databases are most likely marked with the
status "up_to_date"
. However, the function
reptSync
only indicates whether the queried binomial is
currently valid. In some cases, a species may have undergone a taxonomic
split, where part of its original populations retains the name while
others have been described as new species. reptSync
does
not account for such cases, so it is recommended to review all
"up_to_date"
species for potential taxonomic splits.
To assist with this, the function reptSplitCheck
queries
binomial names as synonyms using reptAdvancedSearch
, and
checks whether any associated species were described after a
user-defined date (e.g., the publication date of the dataset being
used). By default this function uses parallel processing, with default
set to use half of the available cores.
query <- c("Atractus dapsilis",
"Atractus trefauti",
"Atractus snethlageae",
"Tantilla melanocephala",
"Oxybelis aeneus",
"Oxybelis rutherfordi")
reptSplitCheck(query, pubDate = 2019) # pubDate of Nogueira et al., Atlas of Brazilian Snakes
reptSplitCheck(query, pubDate = 2019, includeAll = TRUE)
Pay special attention to the "status"
column:
status:
"up_to_date"
– The species name provided is not a
synonym of any species described after pubDate
.
"check_split"
– The species name provided is a
synonym of at least one valid species described in or after
pubDate
, suggesting a possible taxonomic split.
"not_found"
- This is a temporary status to detect
species that are not listed as synonyms of themselves, so it does not
appear in the advanced search (e.g.: to correct in RDB).
OBS: With argument includeAll
set to
default (FALSE
), if the queried species is a synonym of
only species described in the same year pubDate
that are
already included in the queried species list, the queried species will
receive the status "up_to_date"
. To check every possible
taxonomic split regardless of whether the new species is already present
in the queried species list, change the argument value to
TRUE
.
Attention: The column "RDB"
shows only
the current valid names of species described in, or after,
pubDate
according to The Reptile Database. It will NOT show
the queried species, but be aware that reptSplitCheck
is
for valid names only (e.g.: “matched”, from reptCompare
).
Therefore, the "check_split"
status is only to highlight
that users must check if the data related to the queried species could
be used (e.g.: species traits) or divided (e.g.: distribution) to the
"RDB"
column species. In other words, in the example above,
it does not mean that Tantilla melanocephala current valid name
is T. selmae. It means that users must ensure that the use of
the queried nomenclature will not overlook the possibility that the
data, or part of it, should be assigned to T. selmae
instead.
reptTidySyn
:
This function was developed exclusively to improve the visualization
of reptSync
and reptSplitCheck
outcomes.
Queried species with many current valid names would break the data frame
visualization in the R console. reptTidySyn
stacks current
valid names and improves data visualization. Moreover, the argument
filter
, allows users to filter the printed data frame by
“status” so users can focus only in the status that they want to
evaluate.
query <- c("Vieira-Alencar authorisensis",
"Apostolepis ambiniger",
"Apostolepis cerradoensis",
"Elapomorphus assimilis",
"Apostolepis tertulianobeui",
"Apostolepis goiasensis")
df <- reptSync(query)
reptTidySyn(df)
The package counts with a full list of current valid species
(allReptiles
- 12,500 species) with their respective higher
taxa information (updated to September 15th, 2025);
A dataset with all unique synonyms and chresonyms for each
current valid species (allSynonyms
- 53,159 entries -
updated to May 23rd, 2025);
Another synonyms and chresonyms dataset with all entries
considering their respective references (allSynonymsRef
-
110,413 entries - updated to May 23rd, 2025).
To get the full reference to cite this package in publications, run to get the most up to date version reference:
citation("letsRept")
⚠️ Important note
letsRept
retrieves valuable taxonomic and synonymy data
directly from The Reptile
Database.
When citing this package, please also cite the original database as a data source.
Liedtke, H. C. (2018). AmphiNom: an amphibian systematic tool. Systematics and Biodiversity, 17(1) 1-6. https://doi.org/10.1080/14772000.2018.1518935
Uetz, P., Freed, P, Aguilar, R., Reyes, F., Kudera, J. & Hošek, J. (eds.) (2025). The Reptile Database