Package {scopusflow}


Title: A Reproducible Workflow Layer for 'Scopus' Bibliographic Searches
Version: 0.1.0
Description: A coherent, quota-aware workflow layer over the Elsevier 'Scopus' Search 'API' https://dev.elsevier.com/sc_apis.html. It builds reproducible search plans, retrieves records with rate-limit handling, retry with back-off and optional resumable caching, normalises results to a stable tidy schema, extracts and tracks changes in Digital Object Identifiers (DOIs), compares publication trends across topics and exports to formats compatible with downstream bibliometric tools. Network and 'API' errors are surfaced as typed conditions so that callers can respond to them programmatically. 'Scopus' is a trademark of Elsevier. This package is an independent client and is not affiliated with or endorsed by Elsevier.
License: MIT + file LICENSE
URL: https://github.com/pablobernabeu/scopusflow, https://pablobernabeu.github.io/scopusflow/
BugReports: https://github.com/pablobernabeu/scopusflow/issues
Depends: R (≥ 4.1.0)
Imports: cli, httr2 (≥ 1.0.0), jsonlite, rlang (≥ 1.0.0), stats, tibble, tools, utils
Suggests: ggplot2, knitr, rmarkdown, spelling, testthat (≥ 3.0.0), withr
VignetteBuilder: knitr
Config/testthat/edition: 3
Encoding: UTF-8
LazyData: true
Language: en-GB
Config/roxygen2/version: 8.0.0
NeedsCompilation: no
Packaged: 2026-06-14 22:23:30 UTC; PabloB
Author: Pablo Bernabeu ORCID iD [aut, cre, cph]
Maintainer: Pablo Bernabeu <pcbernabeu@gmail.com>
Repository: CRAN
Date/Publication: 2026-06-20 13:50:02 UTC

scopusflow: A Reproducible Workflow Layer for 'Scopus' Bibliographic Searches

Description

logo

A coherent, quota-aware workflow layer over the Elsevier 'Scopus' Search 'API' https://dev.elsevier.com/sc_apis.html. It builds reproducible search plans, retrieves records with rate-limit handling, retry with back-off and optional resumable caching, normalises results to a stable tidy schema, extracts and tracks changes in Digital Object Identifiers (DOIs), compares publication trends across topics and exports to formats compatible with downstream bibliometric tools. Network and 'API' errors are surfaced as typed conditions so that callers can respond to them programmatically. 'Scopus' is a trademark of Elsevier. This package is an independent client and is not affiliated with or endorsed by Elsevier.

Author(s)

Maintainer: Pablo Bernabeu pcbernabeu@gmail.com (ORCID) [copyright holder]

Authors:

See Also

Useful links:


Convert records to a bibliometrix-compatible data frame

Description

Re-maps a scopus_records tibble to the tagged column layout used by the bibliometrix package (and the wider ISI/Web of Science convention), so results can flow into downstream science-mapping workflows.

Usage

as_bibliometrix(x)

Arguments

x

A scopus_records tibble.

Details

This produces the shape bibliometrix expects from the core descriptive fields. It reconstructs only what the 'Scopus' Search API returns, so richer fields that some bibliometrix analyses use, such as full author affiliations or cited references, are left out. To obtain those, export a full 'Scopus' CSV or BibTeX file from the web interface and read it with bibliometrix::convert2df().

Value

A data frame (classed bibliometrixDB) with the standard tag columns AU (authors), TI (title), SO (source or publication), DI (DOI), PY (publication year), TC (times cited), UT (record id) and DB ("SCOPUS"). Character tag fields are upper-cased to match the bibliometrix convention.

Examples

recs <- scopus_records(list(entry = list(
  list(`dc:identifier` = "SCOPUS_ID:1", `prism:doi` = "10.1/a",
       `dc:title` = "A study", `dc:creator` = "Doe J.",
       `prism:publicationName` = "Journal", `prism:coverDate` = "2020-01-01",
       `citedby-count` = "3")
)))
as_bibliometrix(recs)

Normalise raw 'Scopus' entries to a stable tidy schema

Description

Converts the nested list returned by the 'Scopus' Search API into a flat, predictable tibble with one row per record. This shape is the common currency of the package. Both scopus_fetch() and scopus_fetch_plan() return it, and the DOI, comparison and export helpers all consume it.

Usage

## S3 method for class 'scopus_records'
as_tibble(x, ...)

## S3 method for class 'scopus_records'
as.data.frame(x, ...)

scopus_records(x, query = NA_character_)

is_scopus_records(x)

Arguments

x

An object to test.

...

Ignored, for S3 compatibility.

query

Optional character scalar recording the query that produced the entries, kept in the query column for provenance.

Details

The 'Scopus' API signals an empty result set with a single sentinel entry that carries an error field and no identifier. This is detected and turned into a zero-row result rather than a spurious record, while a genuine record that also carries a per-entry error annotation is kept.

Value

The coercion methods return a plain tibble or data frame with the same columns and the scopus_records class removed.

A tibble of class scopus_records with the columns entry_number (integer), scopus_id (character), doi (character), title (character), authors (character, the creator names joined with "; " when several are listed), year (integer, the leading four digits of the cover date), date (character, the ISO cover date), publication (character, the source title), citations (integer) and query (character). A missing field becomes NA, and an empty result set yields a zero-row tibble with the same columns.

is_scopus_records() returns a length-one logical.

Examples

# A minimal entry as the API would return it.
raw <- list(entry = list(
  list(
    `dc:identifier` = "SCOPUS_ID:1",
    `prism:doi` = "10.1000/abc",
    `dc:title` = "An example",
    `dc:creator` = "Doe J.",
    `prism:publicationName` = "Journal of Examples",
    `prism:coverDate` = "2020-05-01",
    `citedby-count` = "7"
  )
))
scopus_records(raw, query = "TITLE(example)")

Example set of normalised 'Scopus' records

Description

A small set of six illustrative records spanning several disciplines (genome editing, machine learning, climate science, materials, oncology and physics), in the shape that scopus_fetch() returns. It is provided so that the package can be explored and its examples run without a 'Scopus' API key. The records were normalised from the static page fixture bundled in inst/extdata.

Usage

example_records

Format

A scopus_records tibble with six rows and the standard schema:

entry_number

Position within the retrieval.

scopus_id

The 'Scopus' record identifier.

doi

Digital Object Identifier.

title

Document title.

authors

First or corresponding author.

year

Publication year.

date

Cover date in ISO form.

publication

Source title.

citations

Citation count.

query

The query that produced the record.

Source

Synthetic example data, illustrative only.

Examples

example_records
summary(example_records)

Plot a topic comparison

Description

Draws a line chart of each comparison topic's share of the reference literature over time, from the output of scopus_compare_topics(). The chart uses integer year breaks, a colour-blind-safe palette and, for a handful of topics, labels the lines directly so the reader need not consult a legend. Shaded bands convey how stable each yearly share is.

Usage

plot_scopus_comparison(
  x,
  pub_count_in_legend = TRUE,
  highlight = NULL,
  interval = TRUE,
  ...
)

## S3 method for class 'scopus_comparison'
autoplot(object, ...)

Arguments

x

A scopus_comparison object from scopus_compare_topics().

pub_count_in_legend

Logical. When TRUE (the default), each topic's label carries its total record count, for example ⁠effect size (n = 1,204)⁠.

highlight

Optional character scalar naming one comparison topic to draw the eye to. The named topic is drawn in an accent colour, and the others in grey, which is useful when one topic is the focus of a figure.

interval

Logical. When TRUE (the default), a shaded band around each line shows a Wilson interval on the yearly share. See Details for how to read it.

...

Currently unused, present for S3 consistency.

object

A scopus_comparison object (for the autoplot() method).

Details

This needs the suggested package ggplot2 and raises an informative error when it is absent. The chart shows the comparison topics alone, since the reference is the 100% denominator against which they are measured. A year for which the reference has no records carries no defined share and is omitted, which is noted in the caption.

The shaded band is a Wilson score interval computed from the comparison count and the reference count for each year. 'Scopus' returns exact counts rather than a sample, so the band is not a confidence interval in the inferential sense. It is best read as an illustrative stability range: it is wide where the reference set for a year is small, and so the share would move easily, and narrow where the reference set is large. It says nothing about query wording, indexing lag or coverage, which are the larger real uncertainties.

Value

A ggplot2::ggplot object. Printing it draws the plot.

See Also

scopus_compare_topics()

Examples


cmp <- tibble::tibble(
  query = "q", query_type = "comparison",
  abridged_query = rep(c("computer vision", "drug discovery"), each = 4),
  year = rep(2017:2020, 2), n = c(220, 280, 360, 430, 30, 55, 90, 150),
  reference_n = rep(1500, 8),
  comparison_percentage = c(14.7, 18.7, 24, 28.7, 2, 3.7, 6, 10),
  average_comparison_percentage = rep(c(21.5, 5.4), each = 4)
)
class(cmp) <- c("scopus_comparison", class(cmp))
plot_scopus_comparison(cmp)
plot_scopus_comparison(cmp, highlight = "drug discovery")


Clear the scopusflow managed cache

Description

Deletes the cache files written under scopus_cache_dir(). A cache you created in a directory of your own is left untouched.

Usage

scopus_cache_clear()

Value

Invisibly, TRUE once the managed cache directory is removed or found to be absent.

Examples

# Safe to call even when nothing is cached.
scopus_cache_clear()

Managed cache directory for scopusflow

Description

Returns (and creates on request) a per-user cache directory under tools::R_user_dir(), suitable for passing to cache_dir in scopus_fetch_plan(). The cache is entirely optional and can be cleared with scopus_cache_clear().

Usage

scopus_cache_dir(create = FALSE)

Arguments

create

Logical. When TRUE, the directory is created if it is absent.

Value

The cache directory path, invisibly when create = TRUE.

Examples

scopus_cache_dir(create = FALSE)

Combine record sets into one

Description

Binds several scopus_records objects into a single one, renumbering entry_number across the result and, optionally, dropping duplicates. This is the safe way to merge separate fetches: plain rbind() would leave duplicate entry numbers, and c() would return a list.

Usage

scopus_combine(..., dedupe = FALSE)

## S3 method for class 'scopus_records'
c(x, ...)

Arguments

...

Two or more scopus_records objects, or a single list of them.

dedupe

Logical. When TRUE, records sharing a 'Scopus' identifier, or failing that a DOI (compared case-insensitively), are kept once.

x

A scopus_records object (for the c() method).

Value

A scopus_records tibble. Per-retrieval attributes such as total_results are not carried over, since they describe a single fetch.

See Also

scopus_fetch_plan(), which combines plan cells the same way.

Examples

# Merging a set with itself and de-duplicating recovers the distinct records.
scopus_combine(example_records, example_records, dedupe = TRUE)

Compare publication trends across topics

Description

Compares how often a set of comparison topics co-occur with a reference topic over time. For each year and each comparison term, the number of records matching the reference combined with that term is expressed as a percentage of the records matching the reference alone. This reveals which sub-topics are growing or shrinking within a literature.

Usage

scopus_compare_topics(
  reference_query,
  comparison_terms,
  years,
  field = NULL,
  view = c("STANDARD", "COMPLETE"),
  api_key = NULL,
  inst_token = NULL,
  verbose = FALSE
)

Arguments

reference_query

Character scalar. The reference topic that anchors the comparison (for example "language learning").

comparison_terms

Character vector of topics to compare against the reference (for example c("effect size", "Bayesian")). Each is combined with the reference using a logical AND.

years

Integer vector of publication years to span (for example 2015:2020).

field

Optional 'Scopus' field tag applied to every component of every query (see scopus_plan()).

view

Either "STANDARD" or "COMPLETE".

api_key, inst_token

Optional credentials (see scopus_has_key()).

verbose

Logical. When TRUE, progress is reported.

Value

A tibble of class scopus_comparison with the columns query (the full query used), query_type ("reference" or "comparison"), abridged_query (the topic label for plotting), year, n (records that year), reference_n (reference records that year), comparison_percentage (100 * n / reference_n, or NA when reference_n is 0) and average_comparison_percentage (the same ratio computed on period totals). Comparison rows are sorted by descending average percentage.

API access

This performs one count request per term per year, so it requires a valid API key and internet access. The API access section of scopus_count() gives the details. A modest number of terms and years keeps the call within quota.

See Also

plot_scopus_comparison() to visualise the result.

Examples


cmp <- scopus_compare_topics(
  reference_query = "deep learning",
  comparison_terms = c("computer vision", "drug discovery", "medical imaging"),
  years = 2015:2022,
  field = "TITLE-ABS-KEY"
)
cmp


Count 'Scopus' results for a query

Description

Retrieves only the total number of records matching a query, without downloading them. This is the inexpensive way to size a retrieval before committing quota. The count can guide how to partition a scopus_plan(), or simply report how large a topic is.

Usage

scopus_count(
  query,
  years = NULL,
  field = NULL,
  view = c("STANDARD", "COMPLETE"),
  api_key = NULL,
  inst_token = NULL
)

Arguments

query

Character scalar. The base search expression.

years

Optional integer vector of publication years to restrict to.

field

Optional 'Scopus' field tag to wrap the query in (see scopus_plan()).

view

Either "STANDARD" or "COMPLETE".

api_key, inst_token

Optional credentials, resolved by default from options or environment variables (see scopus_has_key()).

Value

A single number giving the total number of matching records, or NA when the API reports no total. It is returned as a double so that very large totals are represented exactly rather than overflowing, with the parsed quota (see scopus_quota()) attached as the quota attribute so a workflow can pace itself off a count.

API access

This function performs a network request and therefore requires a valid API key and internet access. When no key is configured it raises a scopus_error_no_key condition, and other failures raise typed scopus_error subclasses such as scopus_error_rate_limit. A tryCatch() around the call lets a workflow handle these gracefully.

Examples


scopus_count("CRISPR", years = 2015:2020, field = "TITLE-ABS-KEY")


Compare two DOI retrievals

Description

Identifies which DOIs were added, removed or unchanged between an earlier and a later retrieval. This supports change tracking: re-running a search later and seeing exactly what is new.

Usage

scopus_diff_dois(old, new)

Arguments

old, new

scopus_records objects or character vectors of DOIs, representing the earlier (old) and later (new) retrievals.

Value

A tibble of class scopus_doi_diff with columns doi and status, where status is an ordered factor with levels "added" (in new only), "removed" (in old only) and "unchanged" (in both). Rows are sorted by status then DOI, and printing shows the counts in each category.

See Also

scopus_extract_dois()

Examples

old <- c("10.1/a", "10.1/b")
new <- c("10.1/b", "10.1/c")
scopus_diff_dois(old, new)

Extract, clean and optionally export DOIs

Description

Pulls Digital Object Identifiers from a scopus_records object (or a bare character vector), normalises them and removes missing values. The resulting list can be imported into a reference manager such as Zotero to assemble a bibliography.

Usage

scopus_extract_dois(x, dedupe = TRUE, file = NULL)

Arguments

x

A scopus_records tibble, or a character vector of DOIs.

dedupe

Logical, dropping duplicate DOIs by default.

file

Optional path at which to write the DOIs as a single-column CSV. A file is written only when this argument is supplied, and only to the exact path given, so the package always leaves the working directory untouched unless asked. Parent directories are assumed to exist already.

Details

Normalisation trims surrounding whitespace and strips common resolver prefixes (⁠https://doi.org/⁠, ⁠http://dx.doi.org/⁠, ⁠doi:⁠) so that the same article is counted once even when its DOI is formatted differently in two records. Because DOIs are case-insensitive, comparison and deduplication ignore case, while the output keeps the original casing.

Value

A character vector of cleaned DOIs, returned invisibly when file is written.

See Also

scopus_diff_dois() to compare two retrievals.

Examples

recs <- scopus_records(list(entry = list(
  list(`prism:doi` = "10.1/AAA"),
  list(`prism:doi` = "https://doi.org/10.1/aaa"),
  list(`prism:doi` = NULL)
)))
scopus_extract_dois(recs)

# Write to a temporary file (never the working directory).
path <- tempfile(fileext = ".csv")
scopus_extract_dois(recs, file = path)

Fetch 'Scopus' records for a query

Description

Retrieves records page by page, accumulating them and returning a single normalised scopus_records tibble. Pagination, the API's hard start < 5000 ceiling, rate-limit handling and retry with back-off are all managed for you.

Usage

scopus_fetch(
  query,
  max_results = Inf,
  view = c("STANDARD", "COMPLETE"),
  page_size = NULL,
  field = NULL,
  years = NULL,
  api_key = NULL,
  inst_token = NULL,
  verbose = FALSE
)

Arguments

query

Character scalar. The base search expression.

max_results

Maximum number of records to retrieve. Defaults to Inf, meaning all available records up to the API ceiling. The 'Scopus' Search API refuses offsets of 5000 or more, so a single query yields at most 5000 records. To go beyond that, partition the search by year with scopus_plan().

view

Either "STANDARD" or "COMPLETE".

page_size

Integer records per page, or NULL (default) to use the most quota-efficient page the view allows (200 for STANDARD, 25 for COMPLETE). See scopus_plan() for why larger pages cost less quota.

field

Optional 'Scopus' field tag to wrap the query in (see scopus_plan()).

years

Optional integer vector of publication years to restrict to.

api_key, inst_token

Optional credentials, resolved by default from options or environment variables (see scopus_has_key()).

verbose

Logical. When TRUE, progress is reported as the retrieval proceeds.

Value

A scopus_records tibble. The reported total and the most recent parsed quota are attached as the total_results and quota attributes.

API access

Requires a valid API key and internet access. The API access section of scopus_count() lists the conditions that may be raised.

See Also

scopus_fetch_plan() for cached, resumable, partitioned retrieval.

Examples


recs <- scopus_fetch("graphene", field = "TITLE-ABS-KEY", max_results = 50)
recs


Execute a 'Scopus' search plan, with optional caching and resume

Description

Runs every cell of a scopus_plan() in turn, optionally caching each cell's result so that an interrupted or quota-limited retrieval can resume without re-spending quota on the cells already fetched. Results are accumulated and bound once into a single scopus_records tibble.

Usage

scopus_fetch_plan(
  plan,
  max_results = Inf,
  cache_dir = NULL,
  resume = TRUE,
  api_key = NULL,
  inst_token = NULL,
  verbose = FALSE
)

Arguments

plan

A scopus_plan object from scopus_plan().

max_results

Maximum records to retrieve per cell (default Inf).

cache_dir

Optional directory for per-cell cache files. The default of NULL performs no caching. Pass an explicit path you control, or scopus_cache_dir() to use a managed, clearable cache under tools::R_user_dir(). Caching happens only when you opt in through this argument.

resume

Logical. When TRUE and cache_dir is set, a cell whose cache file already exists is loaded from disk rather than fetched again.

api_key, inst_token

Optional credentials (see scopus_has_key()).

verbose

Logical. When TRUE, per-cell progress is reported.

Value

A scopus_records tibble combining all cells, with the originating plan attached as the plan attribute.

API access

Any cell not served from cache requires a valid API key and internet access. The API access section of scopus_count() gives the details.

See Also

scopus_cache_dir(), scopus_cache_clear()

Examples


plan <- scopus_plan("renewable energy", years = 2015:2022, partition = "year")
dir <- file.path(tempdir(), "energy-cache")
recs <- scopus_fetch_plan(plan, cache_dir = dir, resume = TRUE)


Recognised 'Scopus' field tags

Description

Lists the field tags that scopus_plan(), scopus_fetch() and scopus_compare_topics() understand, together with a short note on what each one searches. Passing one of these tags as field restricts a query to the corresponding part of a record, so TITLE-ABS-KEY looks in the title, abstract and keywords while AUTH looks only at author names. Other valid 'Scopus' tags are accepted too. This is a guide to the common ones.

Usage

scopus_field_tags()

Value

A tibble with a tag column and a searches column describing the scope of each tag.

See Also

scopus_plan()

Examples

scopus_field_tags()

Locate the 'Scopus' API key and institutional token

Description

scopus_has_key() reports whether an API key can be found, without revealing it. The key itself is resolved internally and is never printed by the package.

Usage

scopus_has_key()

Details

The key is looked up first from the api_key argument of whichever function is being called, then from the scopusflow.api_key option, and finally from the SCOPUS_API_KEY environment variable. An optional institutional token, used for off-campus access to subscriber content, is resolved the same way from the inst_token argument, the scopusflow.inst_token option, or the SCOPUS_INST_TOKEN environment variable.

A key is a secret. The safest home for it is ⁠~/.Renviron⁠, as in SCOPUS_API_KEY=xxxx, rather than a script, and it should stay out of version control.

Value

A length-one logical that is safe to print, TRUE when a non-empty key is available and FALSE otherwise.

See Also

scopus_count(), scopus_fetch()

Examples

# Does the current session have a key configured?
scopus_has_key()

Build a reproducible 'Scopus' search plan

Description

A plan is a fully specified, inspectable description of one or more 'Scopus' queries to run. Splitting the act of describing a search from executing it makes workflows reproducible (the plan can be saved, reviewed and version controlled) and lets large retrievals be partitioned, for example one cell per year, so they can be cached and resumed.

Usage

scopus_plan(
  query,
  years = NULL,
  field = NULL,
  view = c("STANDARD", "COMPLETE"),
  page_size = NULL,
  partition = c("none", "year")
)

is_scopus_plan(x)

Arguments

query

Character scalar. The base search expression, without field tags or year filters (these are added through field and years).

years

Optional integer vector of publication years to restrict to, for example 2015:2020. When partition = "year", one plan cell is created for each distinct year. Otherwise the minimum and maximum define a single date range.

field

Optional character scalar naming a 'Scopus' field tag to wrap the query in, for example "TITLE-ABS-KEY", "TITLE", "AUTH" or "AFFIL". When NULL, the query is used verbatim. See scopus_field_tags() for the common tags.

view

Either "STANDARD" or "COMPLETE". COMPLETE returns more fields but requires a subscriber entitlement and is limited to a smaller page size.

page_size

Integer number of records to request per page, or NULL (the default) to use the largest page the view allows. The 'Scopus' Search API permits up to 200 records per page for the STANDARD view but only 25 for COMPLETE. Because the weekly quota is charged per request, requesting the maximum page size keeps the number of requests, and so the quota, as low as possible for a given result set. Lower it only where you have a reason to.

partition

Either "none" (a single query cell) or "year" (one cell per year in years). Partitioning by year is the recommended way to stay under the API's hard limit of start < 5000.

x

An object to test or print.

Value

A tibble of class scopus_plan, one row per cell, with columns cell, query (field-wrapped), date (year range string or NA), year (integer or NA), view and page_size. Plan-level settings are stored as attributes.

is_scopus_plan() returns a length-one logical.

See Also

scopus_fetch_plan() to execute a plan, scopus_count() to size it.

Examples

scopus_plan("quantum computing", years = 2015:2022, field = "TITLE-ABS-KEY")
scopus_plan("immunotherapy", years = 2010:2020, partition = "year")

Build a field-tagged 'Scopus' query

Description

Combines several terms into one 'Scopus' query string, optionally wrapping each in a field tag and joining them with a boolean operator. It is a tidier alternative to pasting query fragments together by hand, which is where field-tag and bracket mistakes tend to creep in.

Usage

scopus_query(..., .op = c("AND", "OR", "AND NOT"), .field = NULL)

Arguments

...

Character terms to combine, for example "language learning" and "effect size".

.op

The boolean operator joining the terms, one of "AND", "OR" or "AND NOT".

.field

Optional field tag applied to every term (see scopus_field_tags()).

Value

A length-one character string suitable for scopus_count(), scopus_fetch() or the query of scopus_plan().

See Also

scopus_field_tags(), scopus_plan()

Examples

scopus_query("climate change", "adaptation", .field = "TITLE-ABS-KEY")
scopus_query("graphene", "supercapacitor", .op = "AND")
scopus_query("CRISPR", "Cas9", "Cas12", .op = "OR")

Parse 'Scopus' quota and rate-limit headers

Description

Elsevier returns the caller's weekly quota and short-term rate-limit status in response headers. scopus_quota() extracts them into a tidy list so a workflow can pause, schedule or report on the remaining allowance.

Usage

scopus_quota(resp)

Arguments

resp

An httr2::response object, typically captured during a request.

Details

The relevant headers are X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset (epoch seconds), X-ELS-Status and Retry-After. When the API raises a quota or rate-limit error, the parsed quota is also attached to the resulting condition, where it is available as cnd$quota.

Value

A list with elements limit, remaining, reset (a POSIXct time at which the rate-limit window resets, or NA), status and retry_after (seconds, or NA). A missing header yields NA.

Examples

# Build a fake response to show the shape of the output (no network used).
resp <- httr2::response(
  status_code = 200,
  headers = list(
    `X-RateLimit-Limit` = "20000",
    `X-RateLimit-Remaining` = "19987",
    `X-RateLimit-Reset` = "1700000000"
  )
)
scopus_quota(resp)

Summarise a set of 'Scopus' records

Description

Gives a compact overview of a scopus_records object, reporting how many records it holds, the span of publication years they cover, how many distinct sources and Digital Object Identifiers appear among them and how widely they have been cited. It is a convenient way to take stock of a retrieval before any closer analysis.

Usage

## S3 method for class 'scopus_records'
summary(object, ...)

Arguments

object

A scopus_records tibble.

...

Ignored, present for compatibility with the summary() generic.

Value

A list of class scopus_records_summary, with elements n_records, years (the earliest and latest year present, each NA when no year is known), n_sources, n_with_doi, total_citations, median_citations, top_cited (the title of the most-cited record) and top_source (the most frequent source title). Printing it produces a short readable report.

Examples

summary(example_records)

Read and write 'Scopus' record sets

Description

Save a scopus_records tibble to disk and read it back, with a stable round-trip. The file extension selects the format. An .rds file preserves the types and class exactly, while a .csv file is portable plain text.

Usage

write_scopus_records(x, path)

read_scopus_records(path)

Arguments

x

A scopus_records tibble to write.

path

Explicit file path. The functions read from, or write to, exactly this path and leave the working directory alone. Parent directories are assumed to exist already.

Value

write_scopus_records() returns x invisibly. read_scopus_records() returns a scopus_records tibble.

Examples

recs <- scopus_records(list(entry = list(
  list(`dc:identifier` = "SCOPUS_ID:1", `prism:doi` = "10.1/a",
       `dc:title` = "A study", `prism:coverDate` = "2020-01-01")
)))
path <- tempfile(fileext = ".csv")
write_scopus_records(recs, path)
read_scopus_records(path)