SpeciesSearchR provides simple tools for searching the scientific literature for specific species using the OpenAlex API. The package is designed to make it easy to retrieve journal articles mentioning a focal species (and optionally common names or synonyms).

This can be useful for:

  • literature discovery
  • evidence mapping
  • rapid review scoping
  • species bibliographies
  • ecological meta-research

The package builds on the excellent openalexR package and provides a convenient wrapper for species-based queries.

Installation

You can install the development version of SpeciesSearchR from GitHub with:

# install.packages("pak")
pak::pak("DrMattG/SpeciesSearchR")

Alternatively:

# install.packages("remotes")
remotes::install_github("DrMattG/SpeciesSearchR")

Example

Search for literature mentioning the gray wolf (Canis lupus):

library(SpeciesSearchR)
wolf <- search_species_openalex(
  species = "Canis lupus",
  synonyms = c("gray wolf", "grey wolf"),
  pages = 1
)
#> Using basic paging...
wolf
#> # A tibble: 199 × 48
#>    species     title   journal  year doi   cited_by_count publication_date id   
#>    <chr>       <chr>   <chr>   <int> <chr>          <int> <date>           <chr>
#>  1 Canis lupus Grey W… Advanc…  2014 http…          17614 2014-01-22       http…
#>  2 Canis lupus Binary… Neuroc…  2015 http…           1344 2015-08-06       http…
#>  3 Canis lupus Multi-… Expert…  2015 http…           1677 2015-11-22       http…
#>  4 Canis lupus Mitoch… Molecu…  1999 http…            362 1999-12-01       http…
#>  5 Canis lupus An imp… Expert…  2020 http…           1068 2020-09-16       http…
#>  6 Canis lupus Grey w… Neural…  2017 http…           1065 2017-11-25       http…
#>  7 Canis lupus Gray w… Veteri…  2011 http…            204 2011-05-23       http…
#>  8 Canis lupus Hypoxi… PLoS G…  2014 http…            189 2014-07-31       http…
#>  9 Canis lupus How ef… Applie…  2015 http…            657 2015-01-15       http…
#> 10 Canis lupus A nove… Swarm …  2018 http…            430 2018-01-05       http…
#> # ℹ 189 more rows
#> # ℹ 40 more variables: display_name <chr>, authorships <list>, abstract <chr>,
#> #   publication_year <int>, relevance_score <dbl>, fwci <dbl>,
#> #   counts_by_year <list>, ids <list>, type <chr>, is_oa <lgl>,
#> #   is_oa_anywhere <lgl>, oa_status <chr>, oa_url <chr>,
#> #   any_repository_has_fulltext <lgl>, source_display_name <chr>,
#> #   source_id <chr>, issn_l <chr>, host_organization <chr>, …

The function returns a tibble of bibliographic information. dplyr::glimpse() lists every column:

dplyr::glimpse(wolf)
#> Rows: 199
#> Columns: 48
#> $ species                       <chr> "Canis lupus", "Canis lupus", "Canis lup…
#> $ title                         <chr> "Grey Wolf Optimizer", "Binary grey wolf…
#> $ journal                       <chr> "Advances in Engineering Software", "Neu…
#> $ year                          <int> 2014, 2015, 2015, 1999, 2020, 2017, 2011…
#> $ doi                           <chr> "https://doi.org/10.1016/j.advengsoft.20…
#> $ cited_by_count                <int> 17614, 1344, 1677, 362, 1068, 1065, 204,…
#> $ publication_date              <date> 2014-01-22, 2015-08-06, 2015-11-22, 199…
#> $ id                            <chr> "https://openalex.org/W2061438946", "htt…
#> $ display_name                  <chr> "Grey Wolf Optimizer", "Binary grey wolf…
#> $ authorships                   <list> [<tbl_df[3 x 7]>], [<tbl_df[3 x 7]>], […
#> $ abstract                      <chr> NA, NA, NA, "The grey wolf (Canis lupus)…
#> $ publication_year              <int> 2014, 2015, 2015, 1999, 2020, 2017, 2011…
#> $ relevance_score               <dbl> 5304.5570, 1202.7178, 1132.1453, 962.802…
#> $ fwci                          <dbl> 306.44260, 65.96240, 49.79470, 6.10620, …
#> $ counts_by_year                <list> [<data.frame[13 x 2]>], [<data.frame[11…
#> $ ids                           <list> <"https://openalex.org/W2061438946", "h…
#> $ type                          <chr> "article", "article", "article", "articl…
#> $ is_oa                         <lgl> FALSE, FALSE, FALSE, TRUE, FALSE, FALSE,…
#> $ is_oa_anywhere                <lgl> TRUE, FALSE, FALSE, TRUE, FALSE, TRUE, F…
#> $ oa_status                     <chr> "green", "closed", "closed", "bronze", "…
#> $ oa_url                        <chr> "http://hdl.handle.net/10072/66188", NA,…
#> $ any_repository_has_fulltext   <lgl> TRUE, FALSE, FALSE, TRUE, FALSE, TRUE, F…
#> $ source_display_name           <chr> "Advances in Engineering Software", "Neu…
#> $ source_id                     <chr> "https://openalex.org/S16540516", "https…
#> $ issn_l                        <chr> "0965-9978", "0925-2312", "0957-4174", "…
#> $ host_organization             <chr> "https://openalex.org/P4310320990", "htt…
#> $ host_organization_name        <chr> "Elsevier BV", "Elsevier BV", "Elsevier …
#> $ landing_page_url              <chr> "https://doi.org/10.1016/j.advengsoft.20…
#> $ pdf_url                       <chr> NA, NA, NA, "https://onlinelibrary.wiley…
#> $ license                       <chr> NA, NA, NA, NA, NA, NA, NA, "cc-by", NA,…
#> $ version                       <chr> "publishedVersion", "publishedVersion", …
#> $ referenced_works              <list> <"https://openalex.org/W84428182", "htt…
#> $ referenced_works_count        <int> 87, 35, 77, 92, 94, 142, 38, 71, 46, 63,…
#> $ related_works                 <list> <"https://openalex.org/W867937275", "ht…
#> $ concepts                      <list> [<data.frame[8 x 5]>], [<data.frame[22 …
#> $ topics                        <list> [<tbl_df[12 x 5]>], [<tbl_df[12 x 5]>],…
#> $ keywords                      <list> [<data.frame[8 x 3]>], [<data.frame[15 …
#> $ is_paratext                   <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE…
#> $ is_retracted                  <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE…
#> $ language                      <chr> "en", "en", "en", "en", "en", "en", "en"…
#> $ sustainable_development_goals <list> NA, NA, NA, [<data.frame[1 x 3]>], NA, …
#> $ awards                        <list> NA, NA, NA, NA, NA, NA, NA, NA, NA, <"h…
#> $ funders                       <list> NA, NA, NA, [<data.frame[1 x 3]>], NA, …
#> $ apc                           <list> [<data.frame[2 x 5]>], [<data.frame[2 x…
#> $ first_page                    <chr> "46", "371", "106", "2089", "113917", "4…
#> $ last_page                     <chr> "61", "381", "119", "2103", "113917", "4…
#> $ volume                        <chr> "69", "172", "47", "8", "166", "30", "18…
#> $ issue                         <chr> NA, NA, NA, "12", NA, "2", "2-4", "7", "…
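Because the result is an ordinary tibble, it can be trimmed and sorted with dplyr. The following is a minimal sketch rather than part of the package: `wolf_toy` is a hand-built stand-in for `wolf` so that the code runs without a network call. Its column names are taken from the output above, but the titles are placeholders, and `recent_top` is just an illustrative name.

```r
library(dplyr)

# Toy stand-in for the `wolf` tibble: real column names, placeholder rows.
wolf_toy <- tibble::tibble(
  species        = "Canis lupus",
  title          = c("Paper A", "Paper B", "Paper C", "Paper D"),
  year           = c(1999L, 2011L, 2015L, 2020L),
  cited_by_count = c(362L, 204L, 657L, 1068L),
  is_oa          = c(TRUE, FALSE, FALSE, TRUE)
)

# Keep records from 2010 onwards, retain a few core columns,
# and list the most-cited records first.
recent_top <- wolf_toy |>
  filter(year >= 2010) |>
  select(species, title, year, cited_by_count, is_oa) |>
  arrange(desc(cited_by_count))

recent_top
```

The same pipeline should work on the real `wolf` object, since it only relies on columns shown in the glimpse() output.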

Searching multiple species

The function can be called inside a loop or a purrr workflow to search for several species in one go.

library(purrr)

# Each element lists the scientific name first, then common names or synonyms
species_list <- list(
  wolf = c("Canis lupus", "gray wolf", "grey wolf"),
  lynx = c("Lynx lynx", "Eurasian lynx"),
  bear = c("Ursus arctos", "brown bear")
)

results <- purrr::imap_dfr(
  species_list,
  ~ search_species_openalex(
      species = .x[1],
      synonyms = .x[-1],
      pages = 1
    )
)
#> Using basic paging...
#> Using basic paging...
#> Using basic paging...

results
#> # A tibble: 597 × 48
#>    species     title   journal  year doi   cited_by_count publication_date id   
#>    <chr>       <chr>   <chr>   <int> <chr>          <int> <date>           <chr>
#>  1 Canis lupus Grey W… Advanc…  2014 http…          17614 2014-01-22       http…
#>  2 Canis lupus Binary… Neuroc…  2015 http…           1344 2015-08-06       http…
#>  3 Canis lupus Multi-… Expert…  2015 http…           1677 2015-11-22       http…
#>  4 Canis lupus Mitoch… Molecu…  1999 http…            362 1999-12-01       http…
#>  5 Canis lupus An imp… Expert…  2020 http…           1068 2020-09-16       http…
#>  6 Canis lupus Grey w… Neural…  2017 http…           1065 2017-11-25       http…
#>  7 Canis lupus Gray w… Veteri…  2011 http…            204 2011-05-23       http…
#>  8 Canis lupus Hypoxi… PLoS G…  2014 http…            189 2014-07-31       http…
#>  9 Canis lupus How ef… Applie…  2015 http…            657 2015-01-15       http…
#> 10 Canis lupus A nove… Swarm …  2018 http…            430 2018-01-05       http…
#> # ℹ 587 more rows
#> # ℹ 40 more variables: display_name <chr>, authorships <list>, abstract <chr>,
#> #   publication_year <int>, relevance_score <dbl>, fwci <dbl>,
#> #   counts_by_year <list>, ids <list>, type <chr>, is_oa <lgl>,
#> #   is_oa_anywhere <lgl>, oa_status <chr>, oa_url <chr>,
#> #   any_repository_has_fulltext <lgl>, source_display_name <chr>,
#> #   source_id <chr>, issn_l <chr>, host_organization <chr>, …
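The combined tibble keeps one `species` value per row, so it can be summarised per species, and the OpenAlex `id` column can be used to spot works returned for more than one species. The sketch below assumes only those two columns. `results_toy` is an invented stand-in for `results` (the ids are placeholders, not real OpenAlex identifiers), used so the example runs offline.

```r
library(dplyr)

# Toy stand-in for `results`: one row per (species, work) hit.
results_toy <- tibble::tibble(
  species = c("Canis lupus", "Canis lupus", "Lynx lynx", "Ursus arctos"),
  id      = c("W1", "W2", "W2", "W3"),
  year    = c(2014L, 2015L, 2015L, 2020L)
)

# Number of records retrieved for each species.
per_species <- count(results_toy, species, name = "n_records")

# Works that were returned for more than one species.
shared <- results_toy |>
  group_by(id) |>
  filter(n_distinct(species) > 1) |>
  ungroup()

per_species
shared
```

Whether shared works should be kept once or once per species depends on the analysis, which is why this sketch only flags them.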

Notes

OpenAlex search is not equivalent to the structured database searches used in systematic reviews (e.g. Web of Science or Scopus). It is, however, well suited to automated literature discovery and exploratory evidence synthesis.
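One practical consequence shows up in the wolf example, where the top hit is "Grey Wolf Optimizer" in Advances in Engineering Software, so a common-name match can return records that are not about the animal. A crude first-pass screen, sketched below, flags whether the scientific name appears in the title or abstract. This is an assumption-laden heuristic and not a feature of the package: the helper `mentions()` and the toy rows are hypothetical (only the first title comes from the output above), and because `abstract` is often `NA`, unflagged rows still need manual checking.

```r
library(dplyr)
library(stringr)

sci_name <- "Canis lupus"

# Toy stand-in for a search result; rows 2 and 3 are invented.
toy <- tibble::tibble(
  species  = sci_name,
  title    = c("Grey Wolf Optimizer",
               "Diet of wolves in a boreal forest",
               "Pack size and territory"),
  abstract = c(NA, "We studied Canis lupus scats collected in winter.", NA)
)

# Hypothetical helper: case-insensitive literal match, treating NA text as no match.
mentions <- function(text, pattern) {
  coalesce(str_detect(text, fixed(pattern, ignore_case = TRUE)), FALSE)
}

# Flag rather than drop, so that records can still be reviewed by hand.
screened <- toy |>
  mutate(mentions_name = mentions(title, sci_name) | mentions(abstract, sci_name))

screened
```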