
sklearn.cluster: Clustering

sklearn.cluster.KMeans

The sklearn.cluster.KMeans class provides a K-Means clustering model.

Important

Currently, this model gathers all the data on a single node and then builds the K-Means model there. Make sure the first node in your hostfile has enough memory to hold the data.

Methods

sklearn.cluster.KMeans.fit

  • sklearn.cluster.KMeans.fit(X, y=None, sample_weight=None)

    Supported Arguments

    * X: NumPy Array, Pandas DataFrame, or CSR sparse matrix.
    * sample_weight: Numeric NumPy Array

    Note

    Bodo ignores y, which is consistent with scikit-learn.
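As a minimal sketch of what the `sample_weight` and `y` arguments do, the snippet below calls `fit` through plain scikit-learn rather than inside a `bodo.jit` function, so it only illustrates the semantics the supported-arguments list refers to. The weights, `n_init`, `random_state`, and the `sorted_centers` helper are assumptions made for illustration and do not come from this page.

```python
import numpy as np
from sklearn.cluster import KMeans

# Two well-separated groups of points (same data as this page's usage example).
X = np.array([[1, 2], [1, 4], [1, 0], [10, 2], [10, 4], [10, 0]], dtype=np.float64)

# Hypothetical weights: points in the right-hand group count twice as much.
weights = np.array([1.0, 1.0, 1.0, 2.0, 2.0, 2.0])

# n_init and random_state are fixed only to make the sketch reproducible.
weighted = KMeans(n_clusters=2, n_init=10, random_state=0)
weighted.fit(X, sample_weight=weights)

# Passing y changes nothing: the argument is accepted and ignored.
with_y = KMeans(n_clusters=2, n_init=10, random_state=0)
with_y.fit(X, y=np.zeros(len(X)), sample_weight=weights)


def sorted_centers(model):
    """Return the cluster centers ordered by their first coordinate."""
    centers = model.cluster_centers_
    return centers[np.argsort(centers[:, 0])]


print(sorted_centers(weighted))
print(sorted_centers(with_y))
```

Sorting the centers before comparing them sidesteps the fact that cluster numbering is arbitrary between runs.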

sklearn.cluster.KMeans.predict

  • sklearn.cluster.KMeans.predict(X, sample_weight=None)

    Supported Arguments

    - X: NumPy Array, Pandas DataFrame, or CSR sparse matrix.
    - sample_weight: Numeric NumPy Array
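The following sketch shows how `predict` assigns new points to the nearest learned center. It runs against plain scikit-learn, not through Bodo, and `n_init` and `random_state` are assumptions added for reproducibility. `sample_weight` is left out here because newer scikit-learn releases no longer accept it for `predict`.

```python
import numpy as np
from sklearn.cluster import KMeans

X = np.array([[1, 2], [1, 4], [1, 0], [10, 2], [10, 4], [10, 0]], dtype=np.float64)
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

# Points to classify: one near each group.
new_points = np.array([[0, 0], [12, 3]], dtype=np.float64)
labels = kmeans.predict(new_points)

# Label numbering is arbitrary, so look up the center each label refers to
# instead of relying on a specific 0/1 assignment.
assigned_centers = kmeans.cluster_centers_[labels]

print(labels)
print(assigned_centers)
```

Comparing assigned centers rather than raw labels keeps downstream checks stable when the initialization changes.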

sklearn.cluster.KMeans.score

  • sklearn.cluster.KMeans.score(X, y=None, sample_weight=None)

    Supported Arguments

    - X: NumPy Array, Pandas DataFrame, or CSR sparse matrix.
    - sample_weight: Numeric NumPy Array

    Note

    Bodo ignores y, which is consistent with scikit-learn.
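In scikit-learn, `score` returns the negative of the K-Means objective (the weighted sum of squared distances to the nearest center), so values closer to zero indicate a tighter fit. The sketch below checks this with plain scikit-learn rather than inside a `bodo.jit` function; the data, `n_init`, `random_state`, and the uniform weights are illustrative assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans

X = np.array([[1, 2], [1, 4], [1, 0], [10, 2], [10, 4], [10, 0]], dtype=np.float64)
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

# Unweighted score: the negative sum of squared distances to the closest center.
score = kmeans.score(X)

# Giving every sample a weight of 2 doubles each sample's contribution.
doubled = kmeans.score(X, sample_weight=np.full(len(X), 2.0))

print(score, -kmeans.inertia_)
print(doubled)
```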

sklearn.cluster.KMeans.transform

  • sklearn.cluster.KMeans.transform(X)

    Supported Arguments

    - X: NumPy Array, Pandas DataFrame, or CSR sparse matrix.
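In scikit-learn, `transform` maps each sample to its Euclidean distances from every cluster center, giving an array of shape `(n_samples, n_clusters)`. The sketch below, run with plain scikit-learn rather than through Bodo, compares that output with a manual distance computation; the `manual` array, `n_init`, and `random_state` are assumptions introduced for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans

X = np.array([[1, 2], [1, 4], [1, 0], [10, 2], [10, 4], [10, 0]], dtype=np.float64)
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

# One row per sample, one column per cluster center.
distances = kmeans.transform(X)

# The same distances computed by hand with broadcasting.
manual = np.linalg.norm(
    X[:, None, :] - kmeans.cluster_centers_[None, :, :], axis=2
)

# The column with the smallest distance is the predicted cluster.
nearest = distances.argmin(axis=1)

print(distances.shape)
print(nearest)
```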

Example Usage

>>> import bodo
>>> from sklearn.cluster import KMeans
>>> import numpy as np
>>> @bodo.jit
... def test_kmeans(X):
...   kmeans = KMeans(n_clusters=2)
...   kmeans.fit(X)
...   ans = kmeans.predict([[0, 0], [12, 3]])
...   print(ans)
...
>>> X = np.array([[1, 2], [1, 4], [1, 0], [10, 2], [10, 4], [10, 0]])
>>> test_kmeans(X)  # which cluster is labeled 0 or 1 depends on initialization
[1 0]