SymmetricEntropicAffinity#

class torchdr.SymmetricEntropicAffinity(perplexity: float = 30, lr: float = 0.1, eps_square: bool = True, tol: float = 0.001, max_iter: int = 500, check_interval: int = 50, optimizer: str = 'Adam', metric: str = 'sqeuclidean', zero_diag: bool = True, device: str = 'auto', backend: str | None = None, verbose: bool = False, compile: bool = False, _pre_processed: bool = False)[source]#

Bases: LogAffinity

Compute the symmetric entropic affinity (SEA) introduced in [Van Assel et al., 2024].

Compute the solution \(\mathbf{P}^{\mathrm{se}}\) to the symmetric entropic affinity (SEA) problem.

The algorithm computes the optimal dual variables \(\mathbf{\mu}^\star \in \mathbb{R}^n\) and \(\mathbf{\varepsilon}^\star \in \mathbb{R}^n_{>0}\) using dual ascent. The affinity matrix is then given by

\[\forall (i,j), \: P^{\mathrm{se}}_{ij} = \exp \left( \frac{\mu^\star_{i} + \mu^\star_j - 2 C_{ij}}{\varepsilon^\star_i + \varepsilon^\star_j} \right) \:\]

Convex problem. It amounts to the following convex optimization problem:

\[\begin{split}\mathbf{P}^{\mathrm{se}} \in \mathop{\arg\min}_{\mathbf{P} \in \mathbb{R}_+^{n \times n}} \: &\langle \mathbf{C}, \mathbf{P} \rangle \\ \text{s.t.} \quad &\mathbf{P} \mathbf{1} = \mathbf{1} \\ &\forall i, \: \mathrm{h}(\mathbf{P}_{i:}) \geq \log (\xi) + 1 \\ &\mathbf{P} = \mathbf{P}^\top \:.\end{split}\]

where :

\(\mathbf{C}\): symmetric pairwise distance matrix between the samples.
\(\xi\): perplexity parameter.
\(\mathrm{h}\): (row-wise) Shannon entropy such that \(\mathrm{h}(\mathbf{p}) = - \sum_{i} p_{i} (\log p_{i} - 1)\).
\(\mathbf{1} := (1,...,1)^\top\): all-ones vector.

It is a symmetric version of EntropicAffinity, where a symmetry constraint is added in the optimization problem.

Note

Unlike \((\mathbf{P}^{\mathrm{e}} + (\mathbf{P}^{\mathrm{e}})^\top )/ 2\) used in TSNE where \(\mathbf{P}^{\mathrm{e}}\) is the EntropicAffinity matrix, SymmetricEntropicAffinity allows to control the entropy and mass of each row/column of the affinity matrix.

Parameters:

perplexity (float) – Perplexity parameter, related to the number of ‘effective’ nearest neighbors. Consider selecting a value between 2 and the number of samples.
lr (float, optional) – Learning rate used to update dual variables.
eps_square (bool, optional) – Whether to optimize on the square of the dual variables. May be more stable in practice.
tol (float, optional) – Precision threshold at which the algorithm stops, by default 1e-5.
max_iter (int, optional) – Number of maximum iterations for the algorithm, by default 500.
check_interval (int, optional) – Interval for logging progress.
optimizer (str , optional) – Which pytorch optimizer to use (default ‘Adam’).
metric (str, optional) – Metric to use for computing distances, by default “sqeuclidean”.
zero_diag (bool, optional) – Whether to set the diagonal of the distance matrix to 0.
device (str, optional) – Device to use for computation.
backend ({"keops", "faiss", None}, optional) – Which backend to use for handling sparsity and memory efficiency. Default is None.
verbose (bool, optional) – Verbosity. Default is False.
compile (bool, optional) – If True, use torch compile. Default is False.
_pre_processed (bool, optional) – If True, assumes inputs are already torch tensors on the correct device and skips the to_torch conversion. Default is False.

SymmetricEntropicAffinity#

This Page