pmcbioc_db() connects to a new or existing DuckDB database used to store article metadata and an index of individual articles. The database can be explored effectively using 'dbplyr' / 'dplyr'.

db_disconnect() disconnects from the DuckDB database, closing the connection and shutting down the DuckDB instance.

db_dir() returns the path to the database.

db_tables() lists the tables defined in the database.

Use tbl() to create a dbplyr tibble from a table in the database.

Usage

pmcbioc_db(db_dir, read_only = TRUE)

# S3 method for pmcbioc_db
print(x, ...)

db_disconnect(db)

db_dir(db)

db_tables(db)

# S3 method for pmcbioc_db
tbl(src, from, ...)

Arguments

db_dir

character(1) file path to an existing or new DuckDB database. If the path exists, the database is opened 'read only' (by default) to avoid corruption of existing data.

read_only

logical(1). The default, TRUE, opens an existing database read-only; FALSE allows an existing database to be opened for updating (e.g., adding the article XML index as a step separate from parsing the metadata).

x

for print.pmcbioc_db, a pmcbioc_db object

...

for tbl(), additional arguments passed to duckdb:::tbl.duckdb_connection().

db

a database object returned by pmcbioc_db().

src

an object created with pmcbioc_db().

from

the name of the table to be used.

Value

pmcbioc_db() returns a pmcbioc_db object that can be used to open metadata and index tables.

db_dir() returns the path to the database as a scalar character.

db_tables() returns a character vector of tables defined in the database.

tbl() returns a dbplyr tibble representing the DuckDB table.
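
The functions above might be combined as in the following minimal sketch. It assumes that a database was already created at the hypothetical location stored in db_path, and that 'dplyr' is attached to provide the tbl() generic, head(), and collect(). Table names are not listed on this page, so the sketch uses the first entry of db_tables() rather than assuming a specific name.

```r
library(pmcbioc)
library(dplyr)

# Hypothetical location of a database created earlier; adjust as needed.
db_path <- "path/to/pmcbioc_db"

# Open the existing database; read_only = TRUE is the default.
db <- pmcbioc_db(db_path)
db                        # auto-printing dispatches to print.pmcbioc_db()

db_dir(db)                # path to the database, as a scalar character
tables <- db_tables(db)   # character vector of table names

# Table names depend on how the database was populated, so use the
# first available table instead of assuming a particular name.
if (length(tables) > 0L) {
    tbl(db, tables[[1]]) |>
        head(5) |>        # limit rows in the database before retrieval
        collect()         # bring the result into R as a tibble
}

# Close the connection and shut down the DuckDB instance.
db_disconnect(db)
```

To update an existing database rather than only query it, the same path can be reopened with pmcbioc_db(db_path, read_only = FALSE).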