silhouette_score
- torchdr.silhouette_score(X: Tensor | ndarray, labels: Tensor | ndarray, weights: Tensor | ndarray | None = None, metric: str = 'euclidean', device: str | None = None, backend: str | None = None, sample_size: int | None = None, random_state: int | None = None, warn: bool = True)
Compute the Silhouette score as the mean of silhouette coefficients.
Each coefficient is calculated from the sample's mean intra-cluster distance (\(a\)) and its mean nearest-cluster distance (\(b\)), according to the formula \((b - a) / \max(a, b)\). The best value is 1 and the worst value is -1. Values near 0 indicate overlapping clusters. Negative values generally indicate that a sample has been assigned to the wrong cluster, as a different cluster is more similar.
See [Rousseeuw, 1987] for more information.
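The formula above can be illustrated with a minimal, unweighted sketch in plain Python using Euclidean distances. This is not TorchDR's implementation: the helper names `silhouette_coefficients` and `mean_silhouette` are hypothetical, and setting the coefficient of a singleton cluster to 0 is an assumed convention rather than behaviour documented here.

```python
import math


def silhouette_coefficients(points, labels):
    """Return one silhouette coefficient per sample (unweighted, Euclidean)."""
    # Group sample indices by cluster label.
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("at least two clusters are required")

    coefficients = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            # Assumed convention: a singleton cluster gets a coefficient of 0.
            coefficients.append(0.0)
            continue
        # a: mean distance to the other members of the sample's own cluster.
        a = sum(math.dist(points[i], points[j]) for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        coefficients.append((b - a) / denom if denom > 0 else 0.0)
    return coefficients


def mean_silhouette(points, labels):
    """Silhouette score: the mean of the per-sample coefficients."""
    coeffs = silhouette_coefficients(points, labels)
    return sum(coeffs) / len(coeffs)


# Two well-separated pairs of points.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = mean_silhouette(points, [0, 0, 1, 1])  # labels match the geometry
bad = mean_silhouette(points, [0, 1, 0, 1])   # labels cut across the geometry
```

With labels that match the geometry the score is close to 1, while labels that split each natural pair give a negative score, in line with the interpretation given above.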
- Parameters:
X (torch.Tensor or np.ndarray of shape (n_samples_x, n_samples_x) if metric="precomputed", else (n_samples_x, n_features)) – Input data, given either as a pairwise distance matrix or as a feature matrix.
labels (torch.Tensor or np.ndarray of shape (n_samples_x,)) – Cluster labels associated with the samples in X.
weights (torch.Tensor or np.ndarray of shape (n_samples_x,), optional) – Probability vector taking into account the relative importance of samples in X. The default is None and considers uniform weights.
metric (str, optional) – The distance to use for computing pairwise distances. Must be one of ["euclidean", "manhattan", "hyperbolic", "precomputed"]. The default is "euclidean".
device (str, optional) – Device to use for computations.
backend ({"keops", "faiss", None}, optional) – Which backend to use for handling sparsity and memory efficiency. Default is None.
sample_size (int, optional) – Number of samples to use when computing the score on a random subset. If sample_size is None, no sampling is used.
random_state (int, optional) – Random state for selecting a subset of samples. Used when sample_size is not None.
warn (bool, optional) – Whether to output warnings when edge cases are identified.
- Returns:
silhouette_score – Mean silhouette coefficient over all samples.
- Return type: