API and Modules
Dimensionality Reduction sklearn Compatible Estimators
TorchDR provides a set of classes that are compatible with the sklearn API. For example, TSNE can be run in exactly the same way as sklearn.manifold.TSNE, with the same parameters. Note that the TorchDR classes work seamlessly with both NumPy arrays and PyTorch tensors.
For all methods, TorchDR supports GPU acceleration with device='cuda', as well as LazyTensor objects, enabled with keops=True, which make it possible to fit large-scale models directly in GPU memory without overflows.
TorchDR supports a variety of dimensionality reduction methods. They are presented in the following sections.
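As a minimal sketch of the sklearn-style workflow described above, the helper below wraps TSNE in a single function. The name embed_with_tsne is hypothetical and not part of TorchDR. The sketch assumes that torchdr is installed, that TSNE can be imported from the top-level package, and that it accepts n_components and perplexity like sklearn.manifold.TSNE, in addition to the device and keops arguments mentioned above.

```python
def embed_with_tsne(X, n_components=2, perplexity=30, device="cpu", keops=False):
    """Fit TSNE on X (NumPy array or PyTorch tensor) and return the embedding.

    Hypothetical convenience wrapper, written for illustration only.
    """
    # Imported lazily so that this module can be loaded without torchdr installed.
    from torchdr import TSNE

    model = TSNE(
        n_components=n_components,  # assumed to mirror sklearn.manifold.TSNE
        perplexity=perplexity,      # assumed to mirror sklearn.manifold.TSNE
        device=device,              # e.g. "cuda" for GPU acceleration
        keops=keops,                # True to rely on LazyTensor objects
    )
    return model.fit_transform(X)
```

On a machine with a GPU, a call such as embed_with_tsne(X, device="cuda", keops=True) would then request both options at once.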
Spectral Embedding
These classes perform classical spectral embedding from a torchdr.Affinity object defined on the input data. They give the same output as torchdr.AffinityMatcher used with the same torchdr.Affinity in the input space and a torchdr.ScalarProductAffinity in the embedding space. However, torchdr.AffinityMatcher relies on a gradient-based solver, while the spectral embedding classes rely on the eigendecomposition of the affinity matrix.
| Principal Component Analysis module. |
| Kernel Principal Component Analysis module. |
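The eigendecomposition route can be sketched on a toy scalar-product affinity matrix. The code below is an illustration under stated assumptions and not TorchDR's implementation: it uses only the standard library, and it swaps in power iteration with deflation where a real implementation would call a library eigensolver. All function names are hypothetical.

```python
import math


def matvec(K, v):
    """Multiply the square matrix K (list of rows) by the vector v."""
    return [sum(k * x for k, x in zip(row, v)) for row in K]


def top_eigenpairs(K, n_components, n_iter=500):
    """Leading eigenpairs of a symmetric PSD matrix (power iteration + deflation)."""
    n = len(K)
    K = [row[:] for row in K]  # work on a copy, since deflation modifies it
    pairs = []
    for c in range(n_components):
        v = [1.0 / (i + 1 + c) for i in range(n)]  # deterministic start vector
        for _ in range(n_iter):
            w = matvec(K, v)
            norm = math.sqrt(sum(x * x for x in w))
            v = [x / norm for x in w]
        lam = sum(x * y for x, y in zip(v, matvec(K, v)))  # Rayleigh quotient
        pairs.append((lam, v))
        for i in range(n):  # deflation: remove the component just found
            for j in range(n):
                K[i][j] -= lam * v[i] * v[j]
    return pairs


def spectral_embedding(K, n_components=2):
    """Coordinate c of point i is sqrt(lambda_c) * v_c[i]."""
    pairs = top_eigenpairs(K, n_components)
    return [[math.sqrt(max(lam, 0.0)) * v[i] for lam, v in pairs]
            for i in range(len(K))]


# Centred toy data and its scalar-product (Gram) affinity matrix.
X = [[2.0, 0.0], [0.0, 1.0], [-2.0, 0.0], [0.0, -1.0]]
K = [[sum(a * b for a, b in zip(xi, xj)) for xj in X] for xi in X]
Z = spectral_embedding(K, n_components=2)
```

Because the toy affinity matrix has rank 2, the scalar products between the rows of Z reproduce K up to numerical error, which is the sense in which the embedding matches the input affinity.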
Neighbor Embedding
TorchDR supports the following neighbor embedding methods.
| Implementation of Stochastic Neighbor Embedding (SNE) introduced in [1]. |
| Implementation of t-Stochastic Neighbor Embedding (t-SNE) introduced in [2]. |
| Implementation of the TSNEkhorn algorithm introduced in [3]. |
| Implementation of the InfoTSNE algorithm introduced in [15]. |
| Implementation of the LargeVis algorithm introduced in [13]. |
| Implementation of UMAP introduced in [8] and further studied in [12]. |
Advanced Dimensionality Reduction with TorchDR
TorchDR provides a set of generic classes that can be used to implement new dimensionality reduction methods. These classes provide a modular and extensible framework that allows you to focus on the core components of your method.
Base Classes
torchdr.DRModule is the base class for all dimensionality reduction (DR) estimators in TorchDR. torchdr.AffinityMatcher is the base class for all DR methods that use gradient-based optimization to minimize a loss function constructed from two affinities, one in the input space and one in the embedding space.
| Base class for DR methods. |
| Perform dimensionality reduction by matching two affinity matrices. |
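The idea of matching two affinity matrices with a gradient-based solver can be sketched in a few lines. The example below is not the AffinityMatcher implementation: it assumes a squared Frobenius loss between a fixed input affinity P and a scalar-product affinity in the embedding space, a choice made here purely for illustration, and it uses plain gradient descent with a hand-picked step size. All names are hypothetical.

```python
def scalar_product_affinity(Z):
    """Matrix of scalar products between the rows of Z."""
    return [[sum(a * b for a, b in zip(zi, zj)) for zj in Z] for zi in Z]


def matching_loss(P, Z):
    """Squared Frobenius distance between P and the embedding affinity."""
    Q = scalar_product_affinity(Z)
    return sum((p - q) ** 2 for rp, rq in zip(P, Q) for p, q in zip(rp, rq))


def gradient_step(P, Z, lr):
    """One gradient descent step on matching_loss with respect to Z."""
    Q = scalar_product_affinity(Z)
    n, d = len(Z), len(Z[0])
    # For symmetric P, the gradient of ||P - Z Z^T||_F^2 is -4 (P - Z Z^T) Z.
    grad = [[-4.0 * sum((P[i][j] - Q[i][j]) * Z[j][k] for j in range(n))
             for k in range(d)] for i in range(n)]
    return [[Z[i][k] - lr * grad[i][k] for k in range(d)] for i in range(n)]


# Input affinity built from toy data, and a small arbitrary initial embedding.
X = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
P = scalar_product_affinity(X)
Z = [[0.1, 0.2], [0.3, -0.1], [-0.2, 0.1], [0.05, -0.3]]

initial_loss = matching_loss(P, Z)
for _ in range(500):
    Z = gradient_step(P, Z, lr=0.02)
final_loss = matching_loss(P, Z)
```

With a scalar-product affinity on both sides, this iterative solver targets the same affinity that an eigendecomposition would recover directly, up to a rotation of the embedding.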
Base Neighbor Embedding Modules
Neighbor embedding base modules inherit from the torchdr.AffinityMatcher class and implement strategies that are common to all neighbor embedding methods, such as early exaggeration.
In particular, torchdr.SparseNeighborEmbedding relies on the sparsity of the input affinity to compute the attractive term in linear time. torchdr.SampledNeighborEmbedding inherits from this class and adds the possibility of approximating the repulsive term of the loss via negative samples.
| Solves the neighbor embedding problem. |
| Solves the neighbor embedding problem with a sparse input affinity matrix. |
| Solves the neighbor embedding problem with both sparsity and sampling. |
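The three strategies named above (early exaggeration, a sparse attractive term, and a sampled repulsive term) can be illustrated with schematic loss terms. This is a sketch under assumptions, not TorchDR's loss: it assumes a Student-t kernel in the embedding space, stores the sparse input affinity as a dictionary of nonzero entries, and uses an illustrative exaggeration factor of 12. All names are hypothetical.

```python
import math
import random


def student_kernel(zi, zj):
    """Unnormalized Student-t (Cauchy) similarity 1 / (1 + ||zi - zj||^2)."""
    d2 = sum((a - b) ** 2 for a, b in zip(zi, zj))
    return 1.0 / (1.0 + d2)


def attractive_term(sparse_P, Z, exaggeration=1.0):
    """Attraction summed over stored (nonzero) affinities only.

    The cost is linear in the number of stored entries, not quadratic in n.
    """
    total = sum(p * math.log(student_kernel(Z[i], Z[j]))
                for (i, j), p in sparse_P.items())
    return -exaggeration * total


def sampled_repulsive_term(Z, n_negatives, rng):
    """Monte Carlo estimate of the sum of kernel(z_i, z_j) over all i != j."""
    n = len(Z)
    total = 0.0
    for i in range(n):
        for _ in range(n_negatives):
            j = rng.randrange(n - 1)
            if j >= i:
                j += 1  # skip i itself, so j is uniform over the other points
            total += student_kernel(Z[i], Z[j])
    # Rescale: each point has n - 1 partners but only n_negatives were sampled.
    return total * (n - 1) / n_negatives


Z = [[0.0, 0.0], [1.0, 0.0]]
sparse_P = {(0, 1): 0.5, (1, 0): 0.5}  # only nonzero affinities are stored

plain = attractive_term(sparse_P, Z)
early = attractive_term(sparse_P, Z, exaggeration=12.0)  # early exaggeration phase
repulsion = sampled_repulsive_term(Z, n_negatives=3, rng=random.Random(0))
```

Early exaggeration simply rescales the attractive term during the first iterations, while sampling replaces the full sum over pairs by a rescaled sum over a few random partners per point.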
Affinity Classes
The following classes are used to compute the affinities between the data points. Broadly speaking, they define a notion of similarity between samples.
Simple Affinities
| Compute the Gaussian affinity matrix. |
| Compute the Student affinity matrix based on the Student-t distribution. |
| Compute the scalar product affinity matrix. |
| Compute the Gaussian affinity matrix which can be normalized along a dimension. |
| Compute the Student affinity matrix which can be normalized along a dimension. |
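To make these kernels concrete, here is a small standard-library sketch of Gaussian and Student affinities and of a row normalization. The exact parametrizations (dividing squared distances by sigma, the dof exponent) are assumptions chosen for illustration and may differ from the conventions used by the TorchDR classes. All function names are hypothetical.

```python
import math


def pairwise_sq_distances(X, Y):
    """Squared Euclidean distances between the rows of X and the rows of Y."""
    return [[sum((a - b) ** 2 for a, b in zip(x, y)) for y in Y] for x in X]


def gaussian_affinity(X, sigma=1.0):
    """exp(-||x_i - x_j||^2 / sigma); this parametrization is an assumption."""
    return [[math.exp(-d / sigma) for d in row]
            for row in pairwise_sq_distances(X, X)]


def student_affinity(X, dof=1.0):
    """(1 + ||x_i - x_j||^2 / dof) ** (-(dof + 1) / 2), a Student-t kernel."""
    return [[(1.0 + d / dof) ** (-(dof + 1.0) / 2.0) for d in row]
            for row in pairwise_sq_distances(X, X)]


def normalize_rows(A):
    """Divide each row by its sum so that every row sums to one."""
    return [[a / sum(row) for a in row] for row in A]


X = [[0.0], [1.0], [3.0]]
G = gaussian_affinity(X)
S = student_affinity(X)
G_rows = normalize_rows(G)
```

Normalizing along rows turns each row into a probability distribution over neighbors, which is the form used by many neighbor embedding losses.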
Affinities Normalized by kNN Distances
| Compute the self-tuning affinity introduced in [22]. |
| Compute the MAGIC affinity introduced in [23]. |
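As an illustration of normalizing by kNN distances, the sketch below scales each squared distance by the product of two local bandwidths, each taken as the distance to a point's K-th nearest neighbor. This is the local-scaling formula usually associated with self-tuning affinities. Whether the TorchDR class uses exactly this normalization is an assumption, the function name is hypothetical, and the code assumes distinct points so that no bandwidth is zero.

```python
import math


def self_tuning_affinity(X, K=1):
    """exp(-d_ij^2 / (sigma_i * sigma_j)), sigma_i = distance to the K-th neighbour."""
    D2 = [[sum((a - b) ** 2 for a, b in zip(x, y)) for y in X] for x in X]
    # sorted(row)[0] is the zero self-distance, so index K is the K-th neighbour.
    sigma = [math.sqrt(sorted(row)[K]) for row in D2]
    n = len(X)
    return [[math.exp(-D2[i][j] / (sigma[i] * sigma[j])) for j in range(n)]
            for i in range(n)]


X = [[0.0], [1.0], [3.0]]
A = self_tuning_affinity(X, K=1)
```

Here the isolated point at 3.0 gets a larger bandwidth than the two close points, so its affinities decay more slowly with distance.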
Entropic Affinities
| Compute the symmetric doubly stochastic affinity matrix. |
| Solve the directed entropic affinity problem introduced in [1]. |
| Compute the symmetric entropic affinity (SEA) introduced in [3]. |
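The directed entropic affinity problem can be sketched for a single row: given the squared distances from one point to its neighbors, find the inverse bandwidth beta such that the resulting distribution has a prescribed perplexity. The code below is a standard-library illustration that uses bisection. It is not TorchDR's solver, and the function names, the search bracket and the iteration count are assumptions.

```python
import math


def conditional_probabilities(sq_dists, beta):
    """Weights proportional to exp(-beta * d_j^2), shifted for numerical stability."""
    d_min = min(sq_dists)
    weights = [math.exp(-beta * (d - d_min)) for d in sq_dists]
    total = sum(weights)
    return [w / total for w in weights]


def perplexity_of(probs):
    """Exponential of the Shannon entropy: the effective number of neighbours."""
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
    return math.exp(entropy)


def entropic_affinity_row(sq_dists, perplexity, beta_max=1e4, n_iter=100):
    """Bisection on beta: the perplexity decreases monotonically as beta grows."""
    lo, hi = 0.0, beta_max
    for _ in range(n_iter):
        beta = 0.5 * (lo + hi)
        if perplexity_of(conditional_probabilities(sq_dists, beta)) > perplexity:
            lo = beta  # distribution too flat: sharpen it
        else:
            hi = beta  # distribution too peaked: flatten it
    return conditional_probabilities(sq_dists, 0.5 * (lo + hi))


# Squared distances from one point to its four neighbours (self excluded).
row = entropic_affinity_row([1.0, 4.0, 9.0, 16.0], perplexity=2.0)
```

Repeating this search for every point yields a row-stochastic, generally non-symmetric affinity matrix in which each point has the same effective number of neighbors.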
Quadratic Affinities
| Compute the symmetric doubly stochastic affinity. |
UMAP Affinities
| Compute the input affinity used in UMAP [8]. |
| Compute the affinity used in embedding space in UMAP [8]. |
Utils
The following utilities compute pairwise distances between data points and solve root-finding problems.
| Compute pairwise distances matrix between points in two datasets. |
| Implement the binary search root finding method. |
| Implement the false position root finding method. |
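For reference, the two root-finding strategies can be sketched for a scalar function with a sign change on an interval. The functions below are textbook versions written with the standard library only. Their names, signatures and tolerances are hypothetical and do not describe the TorchDR utilities.

```python
def bisect_root(f, lo, hi, tol=1e-12, max_iter=200):
    """Binary search: halve [lo, hi] while keeping a sign change of f inside it."""
    f_lo = f(lo)
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_mid == 0.0 or hi - lo < tol:
            return mid
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid  # the root lies in the upper half
        else:
            hi = mid               # the root lies in the lower half
    return 0.5 * (lo + hi)


def false_position_root(f, lo, hi, tol=1e-12, max_iter=200):
    """False position: use the root of the secant through the two end points."""
    f_lo, f_hi = f(lo), f(hi)
    x = lo
    for _ in range(max_iter):
        x = (lo * f_hi - hi * f_lo) / (f_hi - f_lo)
        f_x = f(x)
        if abs(f_x) < tol:
            return x
        if (f_x > 0) == (f_lo > 0):
            lo, f_lo = x, f_x
        else:
            hi, f_hi = x, f_x
    return x


def shifted_square(x):
    """Toy function whose positive root is the square root of 2."""
    return x * x - 2.0


root_bisect = bisect_root(shifted_square, 0.0, 2.0)
root_false_position = false_position_root(shifted_square, 0.0, 2.0)
```

Both methods need a bracket on which the function changes sign. Binary search halves the bracket at every step, whereas false position uses the function values to place the next guess and often needs fewer evaluations on smooth functions.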