silhouette_samples
- torchdr.silhouette_samples(X: Tensor | ndarray, labels: Tensor | ndarray, weights: Tensor | ndarray | None = None, metric: str = 'euclidean', device: str | None = None, backend: str | FaissConfig | None = None, warn: bool = True)
Compute the silhouette coefficients for each data sample.
Each coefficient is calculated using the mean intra-cluster distance (\(a\)) and the mean nearest-cluster distance (\(b\)) of the sample, according to the formula \((b - a) / \max(a, b)\). The best value is 1 and the worst value is -1. Values near 0 indicate overlapping clusters. Negative values generally indicate that a sample has been assigned to the wrong cluster, as a different cluster is more similar.
See [Rousseeuw, 1987] for more information.
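The formula above can be sketched in plain Python to make the roles of \(a\) and \(b\) concrete. This is an illustrative, unweighted, Euclidean-only reference and not torchdr's implementation; the name `silhouette_reference` is hypothetical, and scoring singleton clusters as 0 is an assumption borrowed from the scikit-learn convention.

```python
import math


def silhouette_reference(points, labels):
    """Unweighted Euclidean silhouette coefficient for each sample (O(n^2))."""
    n = len(points)
    clusters = sorted(set(labels))
    coefficients = []
    for i in range(n):
        # Sum and count of distances from sample i to the other samples, per cluster.
        totals = {c: 0.0 for c in clusters}
        counts = {c: 0 for c in clusters}
        for j in range(n):
            if j != i:
                totals[labels[j]] += math.dist(points[i], points[j])
                counts[labels[j]] += 1
        own = labels[i]
        if counts[own] == 0 or len(clusters) < 2:
            # Assumption: singleton clusters and single-cluster inputs score 0.
            coefficients.append(0.0)
            continue
        a = totals[own] / counts[own]  # mean intra-cluster distance
        # Mean distance to the nearest other cluster.
        b = min(totals[c] / counts[c] for c in clusters if c != own)
        denom = max(a, b)
        coefficients.append((b - a) / denom if denom > 0 else 0.0)
    return coefficients


# Two tight, well-separated clusters: every coefficient is roughly 0.9.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
labels = [0, 0, 1, 1]
scores = silhouette_reference(points, labels)
```

Relabelling a sample so that its own cluster is farther away than a neighbouring one makes \(a > b\), which is what produces the negative values described above.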
- Parameters:
X (torch.Tensor or np.ndarray of shape (n_samples_x, n_samples_x) if metric="precomputed", else (n_samples_x, n_features)) – Input data as a pairwise distance matrix or a feature matrix.
labels (torch.Tensor or np.ndarray of shape (n_samples_x,)) – Labels associated with X.
weights (torch.Tensor or np.ndarray of shape (n_samples_x,), optional) – Probability vector taking into account the relative importance of samples in X. The default is None and considers uniform weights.
metric (str, optional) – The distance to use for computing pairwise distances. Must be an element of [“euclidean”, “manhattan”, “hyperbolic”, “precomputed”]. The default is ‘euclidean’.
device (str, optional) – Device to use for computations.
backend ({"keops", "faiss", None} or FaissConfig, optional) – Which backend to use for handling sparsity and memory efficiency. Default is None.
  - “keops”: Memory-efficient symbolic computations
  - “faiss”: Fast approximate nearest neighbors
  - None: Standard PyTorch operations
  - FaissConfig object: FAISS with custom configuration
warn (bool, optional) – Whether to output warnings when edge cases are identified. The default is True.
- Returns:
coefficients – Silhouette coefficients for each sample.
- Return type:
torch.Tensor or np.ndarray of shape (n_samples_x,)
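The difference between the two accepted shapes of X can be illustrated with a small plain-Python sketch that turns a feature matrix of shape (n_samples_x, n_features) into a square matrix of shape (n_samples_x, n_samples_x). The helper `pairwise_distances` is hypothetical and not part of torchdr; once converted to a torch.Tensor or np.ndarray, such a matrix is the kind of input expected when metric="precomputed".

```python
import math


def pairwise_distances(points, metric="euclidean"):
    """Return an (n, n) nested list of pairwise distances between rows of `points`."""
    if metric == "euclidean":
        dist = math.dist
    elif metric == "manhattan":
        def dist(p, q):
            return sum(abs(x - y) for x, y in zip(p, q))
    else:
        raise ValueError(f"unsupported metric in this sketch: {metric!r}")
    return [[dist(p, q) for q in points] for p in points]


features = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]    # shape (3, 2): feature matrix
D_euclidean = pairwise_distances(features)          # shape (3, 3): distance matrix
D_manhattan = pairwise_distances(features, "manhattan")
```

A precomputed matrix built this way is symmetric with a zero diagonal, and row i holds the distances from sample i to every sample.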