API and Modules

Dimensionality Reduction: sklearn-Compatible Estimators

TorchDR provides a set of estimator classes compatible with the sklearn API. For example, torchdr.TSNE can be run in exactly the same way as sklearn.manifold.TSNE, with the same parameters. The TorchDR classes work seamlessly with both NumPy arrays and PyTorch tensors.
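A minimal sketch of this workflow (the data and parameter values are illustrative; perplexity and n_components follow the TSNE signature listed below):

```python
import numpy as np
from torchdr import TSNE

X = np.random.randn(1000, 50)  # works with NumPy arrays or PyTorch tensors

# Same interface as sklearn.manifold.TSNE: fit_transform returns the embedding.
Z = TSNE(perplexity=30, n_components=2).fit_transform(X)
print(Z.shape)  # (1000, 2)
```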

For all methods, TorchDR supports GPU acceleration via device='cuda'. It also supports KeOps LazyTensor objects (enabled with keops=True), which make it possible to fit large-scale models directly on the GPU without memory overflows.
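As a sketch, assuming a CUDA-capable GPU and the pykeops package are installed:

```python
import torch
from torchdr import TSNE

X = torch.randn(100_000, 128)

# device='cuda' runs the solver on the GPU; keops=True switches pairwise
# computations to KeOps LazyTensors, so the full n x n affinity matrix is
# never materialized in GPU memory.
Z = TSNE(device="cuda", keops=True).fit_transform(X)
```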

TorchDR supports a variety of dimensionality reduction methods. They are presented in the following sections.

Spectral Embedding

These classes perform classical spectral embedding from a torchdr.Affinity object defined on the input data. They yield the same output as torchdr.AffinityMatcher used with the same torchdr.Affinity in input space and a torchdr.ScalarProductAffinity in embedding space. However, torchdr.AffinityMatcher relies on a gradient-based solver, whereas the spectral embedding classes rely on the eigendecomposition of the affinity matrix. A sketch contrasting the two routes follows the class list below.

PCA([n_components, device, verbose, ...])

Principal Component Analysis module.

KernelPCA(affinity, n_components, device, ...)

Kernel Principal Component Analysis module.
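Below is a minimal sketch contrasting the two routes, assuming a Gaussian input affinity; it illustrates the correspondence rather than exact numerical agreement:

```python
import torch
from torchdr import KernelPCA, AffinityMatcher, GaussianAffinity, ScalarProductAffinity

X = torch.randn(500, 20)
affinity = GaussianAffinity()

# Spectral route: eigendecomposition of the affinity matrix.
Z_spectral = KernelPCA(affinity=affinity, n_components=2).fit_transform(X)

# Gradient-based route: match the same input affinity against a
# scalar-product affinity in the embedding space.
Z_gradient = AffinityMatcher(
    affinity_in=affinity,
    affinity_out=ScalarProductAffinity(),
    n_components=2,
).fit_transform(X)
```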

Neighbor Embedding

TorchDR supports the following neighbor embedding methods; a short usage sketch follows the list.

SNE([perplexity, n_components, lr, ...])

Implementation of Stochastic Neighbor Embedding (SNE) introduced in [1].

TSNE([perplexity, n_components, lr, ...])

Implementation of t-Stochastic Neighbor Embedding (t-SNE) introduced in [2].

TSNEkhorn([perplexity, n_components, lr, ...])

Implementation of the TSNEkhorn algorithm introduced in [3].

InfoTSNE([perplexity, n_components, lr, ...])

Implementation of the InfoTSNE algorithm introduced in [15].

LargeVis([perplexity, n_components, lr, ...])

Implementation of the LargeVis algorithm introduced in [13].

UMAP([n_neighbors, n_components, min_dist, ...])

Implementation of UMAP introduced in [8] and further studied in [12].
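As a quick illustration, the sketch below runs two of the estimators above; parameter values are arbitrary and follow the signatures listed:

```python
import torch
from torchdr import TSNE, UMAP

X = torch.randn(5000, 50)

Z_tsne = TSNE(perplexity=50).fit_transform(X)
Z_umap = UMAP(n_neighbors=15, min_dist=0.1).fit_transform(X)
```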

Advanced Dimensionality Reduction with TorchDR

TorchDR provides a set of generic classes that can be used to implement new dimensionality reduction methods. These classes provide a modular and extensible framework that allows you to focus on the core components of your method.

Base Classes

torchdr.DRModule is the base class from which all dimensionality reduction estimators in TorchDR inherit.

torchdr.AffinityMatcher is the base class for all DR methods that use gradient-based optimization to minimize a loss constructed from two affinities, one in input space and one in embedding space; a usage sketch follows the class list below.

DRModule([n_components, device, keops, ...])

Base class for DR methods.

AffinityMatcher(affinity_in, affinity_out[, ...])

Perform dimensionality reduction by matching two affinity matrices.
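For instance, an SNE-like estimator can be assembled directly from the affinity classes documented later on this page. The following is a conceptual sketch that relies on AffinityMatcher's default loss; it is not a drop-in replacement for torchdr.SNE:

```python
import torch
from torchdr import AffinityMatcher, EntropicAffinity, GaussianAffinity

X = torch.randn(1000, 30)

# Custom DR method: an entropic affinity in input space matched against
# a Gaussian affinity in the embedding space.
model = AffinityMatcher(
    affinity_in=EntropicAffinity(perplexity=30),
    affinity_out=GaussianAffinity(),
    n_components=2,
)
Z = model.fit_transform(X)
```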

Base Neighbor Embedding Modules

Neighbor embedding base modules inherit from the torchdr.AffinityMatcher class and implement strategies that are common to all neighbor embedding methods, such as early exaggeration.

In particular, torchdr.SparseNeighborEmbedding relies on the sparsity of the input affinity to compute the attractive term in linear time. torchdr.SampledNeighborEmbedding inherits from this class and adds the option of approximating the repulsive term of the loss via negative samples; a standalone sketch of this idea follows the class list below.

NeighborEmbedding(affinity_in, affinity_out)

Solves the neighbor embedding problem.

SparseNeighborEmbedding(affinity_in, ...[, ...])

Solves the neighbor embedding problem with a sparse input affinity matrix.

SampledNeighborEmbedding(affinity_in, ...[, ...])

Solves the neighbor embedding problem with both sparsity and sampling.
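To build intuition for the strategies above, here is a standalone PyTorch sketch of a neighbor embedding loss split into an attractive term over sparse nearest-neighbor pairs and a repulsive term estimated from random negative samples. It only illustrates the idea and does not reproduce TorchDR's internal implementation:

```python
import torch

def ne_loss(Z, knn_index, n_negatives=5):
    """Toy neighbor embedding loss on an embedding Z of shape (n, d).

    knn_index holds, for each point, the indices of its k nearest
    neighbors in input space (shape (n, k)).
    """
    n = Z.shape[0]

    # Attractive term: sums over the O(n * k) nearest-neighbor pairs only,
    # hence linear in n for a fixed number of neighbors k.
    d_pos = ((Z.unsqueeze(1) - Z[knn_index]) ** 2).sum(-1)
    attraction = d_pos.mean()

    # Repulsive term: estimated from a few random negative samples per
    # point instead of all O(n^2) pairs.
    neg = torch.randint(0, n, (n, n_negatives))
    d_neg = ((Z.unsqueeze(1) - Z[neg]) ** 2).sum(-1)
    repulsion = -torch.log1p(d_neg).mean()

    return attraction + repulsion

Z = torch.randn(1000, 2, requires_grad=True)
knn_index = torch.randint(0, 1000, (1000, 15))  # stand-in for real kNN indices
loss = ne_loss(Z, knn_index)
loss.backward()
```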

Affinity Classes

The following classes are used to compute the affinities between the data points. Broadly speaking, they define a notion of similarity between samples.
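As a quick example, the sketch below computes a Gaussian affinity matrix. The call convention affinity(X) is an assumption for illustration; check the class documentation for the exact entry point:

```python
import torch
from torchdr import GaussianAffinity

X = torch.randn(200, 10)

# Assumed call convention: evaluating the affinity object on the data
# returns the (200, 200) affinity matrix.
affinity = GaussianAffinity(sigma=1.0)
K = affinity(X)
```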

Simple Affinities

GaussianAffinity([sigma, metric, zero_diag, ...])

Compute the Gaussian affinity matrix.

StudentAffinity([degrees_of_freedom, ...])

Compute the Student affinity matrix based on the Student-t distribution.

ScalarProductAffinity([device, keops, verbose])

Compute the scalar product affinity matrix.

NormalizedGaussianAffinity([sigma, metric, ...])

Compute the Gaussian affinity matrix which can be normalized along a dimension.

NormalizedStudentAffinity([...])

Compute the Student affinity matrix which can be normalized along a dimension.

Affinities Normalized by kNN Distances

SelfTuningAffinity([K, normalization_dim, ...])

Compute the self-tuning affinity introduced in [22].

MAGICAffinity([K, metric, zero_diag, ...])

Compute the MAGIC affinity introduced in [23].

Entropic Affinities

SinkhornAffinity([eps, tol, max_iter, ...])

Compute the symmetric doubly stochastic affinity matrix.

EntropicAffinity([perplexity, tol, ...])

Solve the directed entropic affinity problem introduced in [1].

SymmetricEntropicAffinity([perplexity, lr, ...])

Compute the symmetric entropic affinity (SEA) introduced in [3].

Quadratic Affinities

DoublyStochasticQuadraticAffinity([eps, ...])

Compute the symmetric doubly stochastic affinity.

UMAP Affinities

UMAPAffinityIn([n_neighbors, tol, max_iter, ...])

Compute the input affinity used in UMAP [8].

UMAPAffinityOut([min_dist, spread, a, b, ...])

Compute the affinity used in embedding space in UMAP [8].

Utils

The following functions perform various operations, such as computing pairwise distances between data points and solving root-finding problems; a usage sketch follows the list.

pairwise_distances(X[, Y, metric, keops])

Compute pairwise distances matrix between points in two datasets.

binary_search(f, n[, begin, end, max_iter, ...])

Implement the binary search root finding method.

false_position(f, n[, begin, end, max_iter, ...])

Implement the false position root finding method.
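A brief sketch of both utilities. The import path torchdr.utils, the metric value, and the scalar begin/end bounds are assumptions to adapt to the installed version:

```python
import torch
from torchdr.utils import binary_search, pairwise_distances  # assumed path

X = torch.randn(100, 5)
Y = torch.randn(50, 5)

# (100, 50) distance matrix between two datasets.
D = pairwise_distances(X, Y, metric="sqeuclidean")

# Vectorized root finding: solve f(x) = 0 for n independent problems at
# once; here f(x) = x^2 - 2, so each root should approach sqrt(2).
roots = binary_search(lambda x: x**2 - 2, n=3, begin=0.0, end=2.0)
```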

References