sklearn.cluster: Clustering¶
sklearn.cluster.KMeans¶
This class provides a K-Means clustering model.
Important
Currently, this model works by gathering all the data onto a single node and then fitting the K-Means model there. Make sure the first node in your hostfile has enough memory to hold the full dataset.
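Because all rows end up on one node, it can help to estimate the size of the gathered array before fitting. The following is a rough sketch, not a Bodo API: `gathered_nbytes` is a hypothetical helper, and it assumes a dense array with a fixed-width dtype. It counts only the raw input, so the result is best read as a lower bound on what the first node needs.

```python
import numpy as np

def gathered_nbytes(n_rows, n_cols, dtype=np.float64):
    """Bytes needed to hold a dense (n_rows, n_cols) array of `dtype`.

    Hypothetical helper for illustration; not part of Bodo or scikit-learn.
    """
    return n_rows * n_cols * np.dtype(dtype).itemsize

# Assumed dataset shape: 1,000,000 rows by 20 float64 features.
total = gathered_nbytes(1_000_000, 20)
print(f"{total / 1e6:.0f} MB")  # prints "160 MB"
```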
Methods¶
sklearn.cluster.KMeans.fit¶
- sklearn.cluster.KMeans.fit(X, y=None, sample_weight=None)
Supported Arguments
- X: NumPy array, pandas DataFrame, or CSR sparse matrix.
- sample_weight: Numeric NumPy array.
Note
Bodo ignores y, which is consistent with scikit-learn.
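As an illustration of what `sample_weight` does during fitting, here is a minimal sketch in plain scikit-learn, run outside `bodo.jit`. The data, the weights, and the explicit `init` and `n_init=1` arguments are assumptions chosen to make the result deterministic; this page does not say whether those constructor arguments are supported inside a jitted function.

```python
import numpy as np
from sklearn.cluster import KMeans

X = np.array([[1, 2], [1, 4], [1, 0], [10, 2], [10, 4], [10, 0]], dtype=np.float64)
# Assumed weights: the two points with y == 0 count twice.
w = np.array([1.0, 1.0, 2.0, 1.0, 1.0, 2.0])

# Fixed starting centers so that the cluster numbering is reproducible.
init = np.array([[0.0, 0.0], [12.0, 3.0]])
kmeans = KMeans(n_clusters=2, init=init, n_init=1)
kmeans.fit(X, sample_weight=w)

# Each center is the weighted mean of its cluster, so its y coordinate
# is pulled toward 0: (1*2 + 1*4 + 2*0) / 4 = 1.5
print(kmeans.cluster_centers_)
```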
sklearn.cluster.KMeans.predict¶
- sklearn.cluster.KMeans.predict(X, sample_weight=None)
Supported Arguments
- X: NumPy array, pandas DataFrame, or CSR sparse matrix.
- sample_weight: Numeric NumPy array.
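`predict` assigns each row the index of its nearest learned center. A minimal sketch in plain scikit-learn (outside `bodo.jit`) follows. The fixed `init` is an assumption used only to pin down which cluster gets index 0, since the numbering otherwise depends on initialization.

```python
import numpy as np
from sklearn.cluster import KMeans

X = np.array([[1, 2], [1, 4], [1, 0], [10, 2], [10, 4], [10, 0]], dtype=np.float64)
init = np.array([[0.0, 0.0], [12.0, 3.0]])  # assumed starting centers
kmeans = KMeans(n_clusters=2, init=init, n_init=1).fit(X)

new_points = np.array([[0.0, 0.0], [12.0, 3.0]])
labels = kmeans.predict(new_points)

# Manual check: index of the closest center for each new point.
diffs = new_points[:, None, :] - kmeans.cluster_centers_[None, :, :]
nearest = np.linalg.norm(diffs, axis=2).argmin(axis=1)
print(labels, nearest)
```

With a different initialization the two cluster indices can come out swapped, so downstream code should not rely on a particular numbering.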
sklearn.cluster.KMeans.score¶
- sklearn.cluster.KMeans.score(X, y=None, sample_weight=None)
Supported Arguments
- X: NumPy array, pandas DataFrame, or CSR sparse matrix.
- sample_weight: Numeric NumPy array.
Note
Bodo ignores y, which is consistent with scikit-learn.
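In scikit-learn, `score` returns the negative of the K-Means objective, that is, minus the sum of squared distances from each row to its nearest center. The sketch below, in plain scikit-learn outside `bodo.jit`, recomputes that value by hand on the training data. The fixed `init` is an assumption added for reproducibility.

```python
import numpy as np
from sklearn.cluster import KMeans

X = np.array([[1, 2], [1, 4], [1, 0], [10, 2], [10, 4], [10, 0]], dtype=np.float64)
init = np.array([[0.0, 0.0], [12.0, 3.0]])  # assumed starting centers
kmeans = KMeans(n_clusters=2, init=init, n_init=1).fit(X)

score = kmeans.score(X)

# Squared distance from every row to every center, then keep the closest.
sq_dists = ((X[:, None, :] - kmeans.cluster_centers_[None, :, :]) ** 2).sum(axis=2)
manual = -sq_dists.min(axis=1).sum()

print(score, manual, -kmeans.inertia_)
```

A score closer to zero means the rows sit closer to their assigned centers.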
sklearn.cluster.KMeans.transform¶
- sklearn.cluster.KMeans.transform(X)
Supported Arguments
- X: NumPy array, pandas DataFrame, or CSR sparse matrix.
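`transform` maps each row to its distances from the learned centers, giving an array of shape `(n_samples, n_clusters)`. The following sketch uses plain scikit-learn outside `bodo.jit`, with a fixed `init` as an assumption for reproducibility, and shows how the output relates to `predict`.

```python
import numpy as np
from sklearn.cluster import KMeans

X = np.array([[1, 2], [1, 4], [1, 0], [10, 2], [10, 4], [10, 0]], dtype=np.float64)
init = np.array([[0.0, 0.0], [12.0, 3.0]])  # assumed starting centers
kmeans = KMeans(n_clusters=2, init=init, n_init=1).fit(X)

D = kmeans.transform(X)
print(D.shape)  # (6, 2): one row per sample, one column per cluster

# The column holding the smallest distance is the label predict returns.
same = np.array_equal(D.argmin(axis=1), kmeans.predict(X))
print(same)
```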
Example Usage¶
>>> import bodo
>>> from sklearn.cluster import KMeans
>>> import numpy as np
>>> @bodo.jit
... def test_kmeans(X):
...     kmeans = KMeans(n_clusters=2)
...     kmeans.fit(X)
...     ans = kmeans.predict([[0, 0], [12, 3]])
...     print(ans)
...
>>> X = np.array([[1, 2], [1, 4], [1, 0], [10, 2], [10, 4], [10, 0]])
>>> test_kmeans(X)
[1 0]