SymmetricEntropicAffinity

class torchdr.SymmetricEntropicAffinity(perplexity: float = 30, lr: float = 0.1, eps_square: bool = True, tol: float = 0.001, max_iter: int = 500, optimizer: str = 'Adam', tolog: bool = False, metric: str = 'sqeuclidean', zero_diag: bool = True, device: str = 'auto', keops: bool = False, verbose: bool = False)[source]

Bases: LogAffinity

Compute the symmetric entropic affinity (SEA) introduced in [3].

Compute the solution \(\mathbf{P}^{\mathrm{se}}\) to the symmetric entropic affinity (SEA) problem described in [3].

The algorithm computes the optimal dual variables \(\mathbf{\mu}^\star \in \mathbb{R}^n\) and \(\mathbf{\varepsilon}^\star \in \mathbb{R}^n_{>0}\) using dual ascent. The affinity matrix is then given by

\[\forall (i,j), \: P^{\mathrm{se}}_{ij} = \exp \left( \frac{\mu^\star_{i} + \mu^\star_j - 2 C_{ij}}{\varepsilon^\star_i + \varepsilon^\star_j} \right) \:\]

Convex problem. It amounts to the following convex optimization problem:

\[\begin{split}\mathbf{P}^{\mathrm{se}} \in \mathop{\arg\min}_{\mathbf{P} \in \mathbb{R}_+^{n \times n}} \: &\langle \mathbf{C}, \mathbf{P} \rangle \\ \text{s.t.} \quad &\mathbf{P} \mathbf{1} = \mathbf{1} \\ &\forall i, \: \mathrm{h}(\mathbf{P}_{i:}) \geq \log (\xi) + 1 \\ &\mathbf{P} = \mathbf{P}^\top \:.\end{split}\]

where :

  • \(\mathbf{C}\): symmetric pairwise distance matrix between the samples.

  • \(\xi\): perplexity parameter.

  • \(\mathrm{h}\): (row-wise) Shannon entropy such that \(\mathrm{h}(\mathbf{p}) = - \sum_{i} p_{i} (\log p_{i} - 1)\).

  • \(\mathbf{1} := (1,...,1)^\top\): all-ones vector.

It is a symmetric version of EntropicAffinity, where a symmetry constraint is added in the optimization problem.

Note

Unlike \((\mathbf{P}^{\mathrm{e}} + (\mathbf{P}^{\mathrm{e}})^\top )/ 2\) used in TSNE where \(\mathbf{P}^{\mathrm{e}}\) is the EntropicAffinity matrix, SymmetricEntropicAffinity allows to control the entropy and mass of each row/column of the affinity matrix.

Parameters:
  • perplexity (float) – Perplexity parameter, related to the number of ‘effective’ nearest neighbors. Consider selecting a value between 2 and the number of samples.

  • lr (float, optional) – Learning rate used to update dual variables.

  • eps_square (bool, optional) – Whether to optimize on the square of the dual variables. May be more stable in practice.

  • tol (float, optional) – Precision threshold at which the algorithm stops, by default 1e-5.

  • max_iter (int, optional) – Number of maximum iterations for the algorithm, by default 500.

  • optimizer ({'SGD', 'Adam', 'NAdam', 'LBFGS}, optional) – Which pytorch optimizer to use (default ‘Adam’).

  • tolog (bool, optional) – Whether to store intermediate result in a dictionary (default False).

  • metric (str, optional) – Metric to use for computing distances, by default “sqeuclidean”.

  • zero_diag (bool, optional) – Whether to set the diagonal of the distance matrix to 0.

  • device (str, optional) – Device to use for computation.

  • keops (bool, optional) – Whether to use KeOps for computation.

  • verbose (bool, optional) – Verbosity. Default is False.

References