Title: | Calculate Persistent Homology with Ripser-Based Engines |
---|---|
Description: | Ports the Ripser <arXiv:1908.02518> and Cubical Ripser <arXiv:2005.12692> persistent homology calculation engines from C++. Can be used as a rapid calculation tool in topological data analysis pipelines. |
Authors: | Raoul Wadhwa [aut, cre] |
Maintainer: | Raoul Wadhwa <[email protected]> |
License: | GPL-3 |
Version: | 0.2.0 |
Built: | 2025-03-09 05:12:16 UTC |
Source: | https://github.com/tdaverse/ripserr |
A geographic dataset of known occurrences of Aedes aegypti mosquitoes in Brazil, derived from peer-reviewed and unpublished literature and reverse-geocoded to states.
aegypti
A tibble of 4411 observations and 13 variables:
species identification (aegypti versus albopictus)
unique occurrence identifier
published versus unpublished, with reference identifier
point or polygon location
admin level or polygon size; -999 for point locations
latitudinal coordinate of point or polygon centroid
longitudinal coordinate of point or polygon centroid
established versus transient population
name of reverse-geolocated state
two-letter state code
http://dx.doi.org/10.5061/dryad.47v3c
# calculate persistence data for occurrences in Acre
acre_coord <- aegypti[aegypti$state_code == "AC", c("x", "y"), drop = FALSE]
acre_rips <- vietoris_rips(acre_coord)
plot.new()
plot.window(
  xlim = c(0, max(acre_rips$death)),
  ylim = c(0, max(acre_rips$death)),
  asp = 1
)
axis(1L)
axis(2L)
abline(a = 0, b = 1)
points(acre_rips[acre_rips$dim == 0L, c("birth", "death")], pch = 16L)
points(acre_rips[acre_rips$dim == 1L, c("birth", "death")], pch = 17L)
Converts valid objects to PHom instances.
as.PHom(x, dim_col = 1, birth_col = 2, death_col = 3)
x | object being converted to |
dim_col | either |
birth_col | either |
death_col | either |
PHom instance
# construct data frame with valid persistence data
df <- data.frame(dimension = c(0, 0, 1, 1, 1, 2),
                 birth = rnorm(6),
                 death = rnorm(6, mean = 15))

# convert to `PHom` instance and print
df_phom <- as.PHom(df)
df_phom

# print feature details to confirm accuracy
print.data.frame(df_phom)
A data set of numbers of cases of Dengue in each state of Brazil in 2013 and three state-level variables used in a predictive model.
case_predictors
A data frame of 27 observations and 4 variables:
state population in 2013
average temperature across state municipalities
average precipitation across state municipalities
number of state Dengue cases in 2013
https://web.archive.org/web/20210209122713/https://www.gov.br/saude/pt-br/assuntos/boletins-epidemiologicos-1/por-assunto, http://www.ipeadata.gov.br/Default.aspx, https://ftp.ibge.gov.br/Estimativas_de_Populacao/, https://www.ibge.gov.br/geociencias/organizacao-do-territorio/estrutura-territorial/15761-areas-dos-municipios.html?edicao=30133&t=acesso-ao-produto
Data pre-processing: After acquiring the data from the links above, we converted any dataset embedded in PDF format to CSV. Using built-in functionality for CSV files, we sorted all datasets alphabetically by state name to simplify later processing. Where temperature was documented by quarter, we also calculated the annual average temperature and added it to the original dataset.
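The two pre-processing steps described above (averaging quarterly temperatures and sorting by state name) could be reproduced in R roughly as follows. This is only a sketch: the column names (`temp_q1` to `temp_q4`, `temp_annual`) and the temperature values are made up for illustration and do not come from the original sources.

```r
# Hypothetical quarterly temperature table; column names and values are
# invented for illustration only.
quarterly <- data.frame(
  state = c("Sao Paulo", "Acre", "Bahia"),
  temp_q1 = c(24.1, 26.3, 27.0),
  temp_q2 = c(20.4, 25.1, 25.2),
  temp_q3 = c(18.9, 25.6, 23.8),
  temp_q4 = c(22.6, 26.6, 26.0),
  stringsAsFactors = FALSE
)

# annual average temperature = mean of the four quarterly values
quarter_cols <- c("temp_q1", "temp_q2", "temp_q3", "temp_q4")
quarterly$temp_annual <- rowMeans(quarterly[, quarter_cols])

# sort alphabetically by state name and reset the row names
quarterly <- quarterly[order(quarterly$state), ]
rownames(quarterly) <- NULL
quarterly
```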
This function is an R wrapper for the CubicalRipser C++ library to calculate persistent homology. For more information on the C++ library, see https://github.com/CubicalRipser. For more information on how objects of different classes are evaluated by cubical, read the Details section below.
cubical(dataset, ...)

## S3 method for class 'array'
cubical(dataset, threshold = 9999, method = "lj", ...)

## S3 method for class 'matrix'
cubical(dataset, ...)
dataset | object on which to calculate persistent homology |
... | other relevant parameters |
threshold | maximum simplicial complex diameter to explore |
method | either |
cubical.array assumes dataset is a lattice, with each element containing the value of the lattice at the point represented by the indices of the element in the array.
cubical.matrix is redundant for versions of R at or after 4.0. For previous versions of R, in which objects with class matrix do not necessarily also have class array, dataset is converted to an array and persistent homology is then calculated using cubical.array.
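As a small illustration of the lattice layout described above, the sketch below builds a 10 x 10 lattice whose value at position (i, j) is stored at `lattice[i, j]`. The function used to fill the lattice is arbitrary and chosen only to make the values deterministic. The call to `cubical()` is guarded so that the snippet also runs where ripserr is not installed.

```r
# Build a 10 x 10 lattice; the value at grid point (i, j) is lattice[i, j].
# The filling function is arbitrary and used for illustration only.
lattice <- outer(1:10, 1:10, function(i, j) sin(i / 2) + cos(j / 2))

# a 2-dimensional lattice is a matrix, which is also an array
is.array(lattice)
dim(lattice)

# on R >= 4.0, a matrix additionally inherits from class "array",
# so the array method handles it directly
inherits(lattice, "array")

# calculate persistent homology of the lattice if ripserr is available
if (requireNamespace("ripserr", quietly = TRUE)) {
  lattice_phom <- ripserr::cubical(lattice)
  print(head(lattice_phom))
}
```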
PHom object
# 2-dim example
dataset <- rnorm(10 ^ 2)
dim(dataset) <- rep(10, 2)
cubical_hom2 <- cubical(dataset)

# 3-dim example
dataset <- rnorm(8 ^ 3)
dim(dataset) <- rep(8, 3)
cubical_hom3 <- cubical(dataset)

# 4-dim example
dataset <- rnorm(5 ^ 4)
dim(dataset) <- rep(5, 4)
cubical_hom4 <- cubical(dataset)
Returns the first part of a PHom instance.
## S3 method for class 'PHom'
head(x, ...)
x | object of class |
... | other parameters |
# create sample persistence data
df <- data.frame(dimension = c(0, 0, 1, 1, 1, 2),
                 birth = rnorm(6),
                 death = rnorm(6, mean = 15))
df_phom <- as.PHom(df)

# look at the first features
head(df_phom)

# look at the last features
tail(df_phom)
Tests if objects are valid PHom instances.
is.PHom(x)
x | object whose |
TRUE if x is a valid PHom object; FALSE otherwise
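One validity rule that this page states explicitly is that a feature's birth cannot come after its death. The sketch below checks only that rule with a hypothetical helper, `births_precede_deaths()`, which is not part of ripserr and is not meant to reproduce everything that is.PHom verifies.

```r
# Hypothetical helper (NOT part of ripserr): checks a single rule, namely
# that no feature is born after it dies. Column positions default to the
# same positions used by PHom() and as.PHom().
births_precede_deaths <- function(x, birth_col = 2, death_col = 3) {
  birth <- x[[birth_col]]
  death <- x[[death_col]]
  is.numeric(birth) && is.numeric(death) && all(birth <= death)
}

# persistence data in which every birth precedes the matching death
valid <- data.frame(dimension = c(0, 0, 1),
                    birth = c(0, 0, 0.5),
                    death = c(1, 2, 0.9))

# same data, but the first feature is now born after it dies
broken <- valid
broken$birth[1] <- 50

births_precede_deaths(valid)
births_precede_deaths(broken)
```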
# create sample persistence data
df <- data.frame(dimension = c(0, 0, 1, 1, 1, 2),
                 birth = rnorm(6),
                 death = rnorm(6, mean = 15))
df <- as.PHom(df)

# confirm that persistence data is valid
is.PHom(df)

# mess up df object (feature birth cannot be after death)
df$birth[1] <- rnorm(1, mean = 50)

# confirm that persistence data is NOT valid
is.PHom(df)
PHom() creates instances of PHom objects, which are convenient containers for persistence data. Generally, data frame (or similar) objects are used to create PHom instances with users specifying which columns contain dimension, birth, and death details for each feature.
PHom(x, dim_col = 1, birth_col = 2, death_col = 3)
x | object used to create |
dim_col | either |
birth_col | either |
death_col | either |
PHom instance
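When the input columns are not in the default order (dimension, birth, death), the column arguments can point at the right positions. The sketch below uses integer column indices, matching the integer defaults in the usage above; the data values are invented for illustration, and the ripserr call is guarded so the snippet also runs where the package is not installed.

```r
# persistence data stored with the columns in a non-default order
# (values invented for illustration)
df <- data.frame(birth = c(0, 0, 0.2),
                 death = c(1, 1.5, 0.8),
                 dimension = c(0, 0, 1))

# tell PHom() where each piece of information lives, by column index
if (requireNamespace("ripserr", quietly = TRUE)) {
  df_phom <- ripserr::PHom(df, dim_col = 3, birth_col = 1, death_col = 2)
  print(df_phom)
}
```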
# construct data frame with valid persistence data
df <- data.frame(dimension = c(0, 0, 1, 1, 1, 2),
                 birth = rnorm(6),
                 death = rnorm(6, mean = 15))

# create `PHom` instance and print
df_phom <- PHom(df)
df_phom

# print feature details to confirm accuracy
print.data.frame(df_phom)
Print a PHom object.
## S3 method for class 'PHom'
print(x, ...)
x | object of class |
... | other parameters; ignored |
# create circle dataset
angles <- runif(25, 0, 2 * pi)
circle <- cbind(cos(angles), sin(angles))

# calculate persistent homology
circle_phom <- vietoris_rips(circle)

# print persistence data
print(circle_phom)
Ports Ripser-based persistent homology calculation engines from C++ to R using the Rcpp package.
Maintainer: Raoul Wadhwa (ORCID)
Authors:
Matt Piekenbrock
Jacob Scott (ORCID)
Jason Cory Brunson (ORCID)
Xinyi Zhang
Other contributors:
Emily Noble [contributor]
Takeki Sudo (Takeki Sudo is a copyright holder for Cubical Ripser (GPL-3 license), which was refactored prior to inclusion in ripserr.) [copyright holder, contributor]
Kazushi Ahara (Kazushi Ahara is a copyright holder for Cubical Ripser (GPL-3 license), which was refactored prior to inclusion in ripserr.) [copyright holder, contributor]
Ulrich Bauer (Ulrich Bauer holds the copyright to Ripser (MIT license), which was refactored prior to inclusion in ripserr.) [copyright holder, contributor]
Returns the last part of a PHom instance.
## S3 method for class 'PHom'
tail(x, ...)
x | object of class |
... | other parameters |
# create sample persistence data
df <- data.frame(dimension = c(0, 0, 1, 1, 1, 2),
                 birth = rnorm(6),
                 death = rnorm(6, mean = 15))
df_phom <- as.PHom(df)

# look at the first features
head(df_phom)

# look at the last features
tail(df_phom)
This function is an R wrapper for the Ripser C++ library to calculate persistent homology. For more information on the C++ library, see https://github.com/Ripser/ripser. For more information on how objects of different classes are evaluated by vietoris_rips, read the Details section below.
vietoris_rips(dataset, ...)

## S3 method for class 'data.frame'
vietoris_rips(dataset, ...)

## S3 method for class 'matrix'
vietoris_rips(dataset, max_dim = 1L, threshold = -1, p = 2L, dim = NULL, ...)

## S3 method for class 'dist'
vietoris_rips(dataset, max_dim = 1L, threshold = -1, p = 2L, dim = NULL, ...)

## S3 method for class 'numeric'
vietoris_rips(
  dataset,
  data_dim = 2L,
  dim_lag = 1L,
  sample_lag = 1L,
  method = "qa",
  ...
)

## S3 method for class 'ts'
vietoris_rips(dataset, ...)

## Default S3 method:
vietoris_rips(dataset, ...)
dataset | object on which to calculate persistent homology |
... | other relevant parameters |
max_dim | maximum dimension of persistent homology features to be calculated |
threshold | maximum simplicial complex diameter to explore |
p | prime field in which to calculate persistent homology |
dim | deprecated; passed to |
data_dim | desired end data dimension |
dim_lag | time series lag factor between dimensions |
sample_lag | time series lag factor between samples (rows) |
method | currently only allows |
vietoris_rips.data.frame assumes dataset is a point cloud, with each row representing a point and each column representing a dimension.
vietoris_rips.matrix currently assumes dataset is a point cloud (similar to vietoris_rips.data.frame). Support for network representations in this method is in progress.
vietoris_rips.dist takes a dist object and calculates persistent homology based on pairwise distances. The dist object could have been calculated from a point cloud, network, or any object containing elements from a finite metric space.
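As a sketch of the dist workflow described above, the snippet below computes pairwise Euclidean distances for 12 evenly spaced points on the unit circle with base R's `dist()` and, if ripserr is installed, passes the result to `vietoris_rips()`. The point cloud is an arbitrary illustrative choice.

```r
# 12 evenly spaced points on the unit circle (illustrative point cloud)
angles <- seq(0, 2 * pi, length.out = 13)[-13]
circle <- cbind(cos(angles), sin(angles))

# pairwise Euclidean distances, stored as a `dist` object
circle_dist <- dist(circle)
class(circle_dist)

# calculate persistent homology from the distances if ripserr is available
if (requireNamespace("ripserr", quietly = TRUE)) {
  circle_phom <- ripserr::vietoris_rips(circle_dist)
  print(circle_phom)
}
```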
vietoris_rips.numeric and vietoris_rips.ts both calculate persistent homology of a time series object. The time series object is converted to a matrix using the quasi-attractor method detailed in Umeda (2017) doi:10.1527/tjsai.D-G72. Persistent homology of the resulting matrix is then calculated.
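To give a feel for how a time series can become a matrix of points, here is a generic time-delay embedding in base R. It is a sketch only: the helper `delay_embed()` is hypothetical, it borrows the argument names `data_dim`, `dim_lag`, and `sample_lag` from the usage above, and the exact indexing used by ripserr's quasi-attractor implementation may differ.

```r
# Hypothetical helper (NOT part of ripserr): generic time-delay embedding.
# Each row is one point; column j holds the series value (j - 1) * dim_lag
# steps after the row's starting index, and consecutive rows start
# sample_lag steps apart.
delay_embed <- function(x, data_dim = 2L, dim_lag = 1L, sample_lag = 1L) {
  span <- (data_dim - 1L) * dim_lag
  stopifnot(length(x) > span)
  starts <- seq(1L, length(x) - span, by = sample_lag)
  offsets <- (seq_len(data_dim) - 1L) * dim_lag
  index <- outer(starts, offsets, "+")
  matrix(x[as.vector(index)], nrow = length(starts), ncol = data_dim)
}

# embed a short series with the default settings: rows are (x[i], x[i + 1])
emb_default <- delay_embed(1:6)

# three columns, lag of 2 between columns, every second starting index
emb_lagged <- delay_embed(1:10, data_dim = 3L, dim_lag = 2L, sample_lag = 2L)

emb_default
emb_lagged
```

The resulting matrix can then be treated as a point cloud, in the same way as the matrix input described earlier.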
PHom object
# create a 2-d point cloud of a circle (100 points)
num.pts <- 100
rand.angle <- runif(num.pts, 0, 2*pi)
pt.cloud <- cbind(cos(rand.angle), sin(rand.angle))

# calculate persistent homology (returns a PHom object)
pers.hom <- vietoris_rips(pt.cloud)