silhouette_samples

Compute the silhouette coefficients for each data sample.

Each coefficient is calculated using the mean intra-cluster distance (\(a\)) and the mean nearest-cluster distance (\(b\)) of the sample, according to the formula \((b - a) / \max(a, b)\). The best value is 1 and the worst value is -1. Values near 0 indicate overlapping clusters. Negative values generally indicate that a sample has been assigned to the wrong cluster, as a different cluster is more similar.
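As a rough illustration of the formula, the following NumPy sketch computes unweighted Euclidean silhouette coefficients directly from the definitions of \(a\) and \(b\). It is not this library's implementation: the helper name `silhouette_samples_sketch` is hypothetical, weights and backends are ignored, and assigning 0 to samples in singleton clusters is an assumed convention.

```python
import numpy as np


def silhouette_samples_sketch(X, labels):
    """Unweighted Euclidean silhouette coefficients (illustrative only).

    Hypothetical helper; requires at least two distinct labels.
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    # Pairwise Euclidean distances, shape (n_samples, n_samples).
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    unique = np.unique(labels)
    coeffs = np.zeros(X.shape[0])
    for i in range(X.shape[0]):
        same = labels == labels[i]
        n_same = same.sum()
        if n_same == 1:
            # Assumed convention: a singleton cluster gets coefficient 0.
            continue
        # a: mean distance to the other members of the sample's own cluster.
        # D[i, i] is 0, so dividing by (n_same - 1) excludes the sample itself.
        a = D[i, same].sum() / (n_same - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(D[i, labels == k].mean() for k in unique if k != labels[i])
        coeffs[i] = (b - a) / max(a, b)
    return coeffs


# Two tight, well-separated clusters give coefficients close to 1.
X_demo = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]])
labels_demo = np.array([0, 0, 1, 1])
coeffs_demo = silhouette_samples_sketch(X_demo, labels_demo)
```

For the first sample here, \(a = 1\) and \(b\) is the mean of its distances to the two points of the other cluster, so \(b\) is much larger than \(a\) and the coefficient is close to 1.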

See [Rousseeuw, 1987] for more information.

Parameters:

X (torch.Tensor or np.ndarray of shape (n_samples_x, n_samples_x) if metric="precomputed", else (n_samples_x, n_features)) – Input data as a pairwise distance matrix or a feature matrix.
labels (torch.Tensor or np.ndarray of shape (n_samples_x,)) – Cluster labels associated with the samples in X.
weights (torch.Tensor or np.ndarray of shape (n_samples_x,), optional) – Probability vector taking into account the relative importance of samples in X. The default is None and considers uniform weights.
metric (str, optional) – The distance to use for computing pairwise distances. Must be one of "euclidean", "manhattan", "hyperbolic" or "precomputed". The default is "euclidean".
device (str, optional) – Device to use for computations.
backend ({"keops", "faiss", None} or FaissConfig, optional) – Which backend to use for handling sparsity and memory efficiency:
    - "keops": Memory-efficient symbolic computations
    - "faiss": Fast approximate nearest neighbors
    - None: Standard PyTorch operations
    - FaissConfig object: FAISS with custom configuration
  Default is None.
warn (bool, optional) – Whether to output warnings when edge cases are identified.
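The two accepted shapes of X, and the shapes of labels and weights, can be illustrated with plain NumPy. This sketch only prepares inputs and does not call silhouette_samples; the variable names are hypothetical, and the explicit uniform weights are written out on the assumption that they match what weights=None stands for.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples_x, n_features = 6, 3

# Feature matrix of shape (n_samples_x, n_features), usable with a named metric.
X_features = rng.normal(size=(n_samples_x, n_features))

# Pairwise Manhattan distance matrix of shape (n_samples_x, n_samples_x),
# the kind of input expected when metric="precomputed".
X_precomputed = np.abs(X_features[:, None, :] - X_features[None, :, :]).sum(axis=-1)

# One cluster label per sample, shape (n_samples_x,).
labels = np.array([0, 0, 0, 1, 1, 1])

# Explicit uniform probability vector over the samples, shape (n_samples_x,).
weights = np.full(n_samples_x, 1.0 / n_samples_x)
```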

Returns:

coefficients – Silhouette coefficients for each sample.

Return type:

torch.Tensor or np.ndarray of shape (n_samples_x,)
