SymmetricEntropicAffinity
- class torchdr.SymmetricEntropicAffinity(perplexity: float = 30, lr: float = 0.1, eps_square: bool = True, tol: float = 0.001, max_iter: int = 500, optimizer: str = 'Adam', tolog: bool = False, metric: str = 'sqeuclidean', zero_diag: bool = True, device: str = 'auto', keops: bool = False, verbose: bool = False)[source]
Bases:
LogAffinity
Compute the symmetric entropic affinity (SEA) introduced in [3].
Compute the solution \(\mathbf{P}^{\mathrm{se}}\) to the symmetric entropic affinity (SEA) problem described in [3].
The algorithm computes the optimal dual variables \(\mathbf{\mu}^\star \in \mathbb{R}^n\) and \(\mathbf{\varepsilon}^\star \in \mathbb{R}^n_{>0}\) using dual ascent. The affinity matrix is then given by
\[\forall (i,j), \: P^{\mathrm{se}}_{ij} = \exp \left( \frac{\mu^\star_{i} + \mu^\star_j - 2 C_{ij}}{\varepsilon^\star_i + \varepsilon^\star_j} \right) \:\]Convex problem. It amounts to the following convex optimization problem:
\[\begin{split}\mathbf{P}^{\mathrm{se}} \in \mathop{\arg\min}_{\mathbf{P} \in \mathbb{R}_+^{n \times n}} \: &\langle \mathbf{C}, \mathbf{P} \rangle \\ \text{s.t.} \quad &\mathbf{P} \mathbf{1} = \mathbf{1} \\ &\forall i, \: \mathrm{h}(\mathbf{P}_{i:}) \geq \log (\xi) + 1 \\ &\mathbf{P} = \mathbf{P}^\top \:.\end{split}\]where :
\(\mathbf{C}\): symmetric pairwise distance matrix between the samples.
\(\xi\): perplexity parameter.
\(\mathrm{h}\): (row-wise) Shannon entropy such that \(\mathrm{h}(\mathbf{p}) = - \sum_{i} p_{i} (\log p_{i} - 1)\).
\(\mathbf{1} := (1,...,1)^\top\): all-ones vector.
It is a symmetric version of
EntropicAffinity
, where a symmetry constraint is added in the optimization problem.Note
Unlike \((\mathbf{P}^{\mathrm{e}} + (\mathbf{P}^{\mathrm{e}})^\top )/ 2\) used in
TSNE
where \(\mathbf{P}^{\mathrm{e}}\) is theEntropicAffinity
matrix,SymmetricEntropicAffinity
allows to control the entropy and mass of each row/column of the affinity matrix.- Parameters:
perplexity (float) – Perplexity parameter, related to the number of ‘effective’ nearest neighbors. Consider selecting a value between 2 and the number of samples.
lr (float, optional) – Learning rate used to update dual variables.
eps_square (bool, optional) – Whether to optimize on the square of the dual variables. May be more stable in practice.
tol (float, optional) – Precision threshold at which the algorithm stops, by default 1e-5.
max_iter (int, optional) – Number of maximum iterations for the algorithm, by default 500.
optimizer ({'SGD', 'Adam', 'NAdam', 'LBFGS}, optional) – Which pytorch optimizer to use (default ‘Adam’).
tolog (bool, optional) – Whether to store intermediate result in a dictionary (default False).
metric (str, optional) – Metric to use for computing distances, by default “sqeuclidean”.
zero_diag (bool, optional) – Whether to set the diagonal of the distance matrix to 0.
device (str, optional) – Device to use for computation.
keops (bool, optional) – Whether to use KeOps for computation.
verbose (bool, optional) – Verbosity. Default is False.
References