KMeans
KMeans[data, seeds] returns a clustering of data by minimizing the within-cluster sums-of-squares, given initial estimates for the cluster centers, seeds.
KMeans[data, seeds, weights] returns a clustering of data by minimizing the within-cluster sums-of-squares, given initial estimates for the cluster centers, seeds, and weights for each element of the measurement vectors in data.
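The following Python sketch illustrates one plausible reading of a weighted within-cluster sum of squares, in which each coordinate's squared deviation from the cluster mean is multiplied by that coordinate's weight. How KMeans applies weights internally is not stated here, so this formula is an assumption, and the name weighted_within_ss is hypothetical.

```python
def weighted_within_ss(clusters, weights):
    """Sum over clusters of the weighted squared deviations from each cluster mean.

    Assumption: weights holds one factor per coordinate of a measurement
    vector and multiplies that coordinate's squared deviation.
    """
    total = 0.0
    for members in clusters:
        if not members:
            continue  # an empty cluster contributes nothing
        # Coordinate-wise mean of the cluster members.
        center = [sum(col) / len(members) for col in zip(*members)]
        for point in members:
            total += sum(w * (x - c) ** 2
                         for w, x, c in zip(weights, point, center))
    return total


clusters = [[(0.0, 0.0), (2.0, 0.0)], [(5.0, 5.0), (5.0, 9.0)]]
unweighted = weighted_within_ss(clusters, (1.0, 1.0))
reweighted = weighted_within_ss(clusters, (1.0, 0.25))
print(unweighted, reweighted)
```

Lowering the second weight reduces the contribution of the spread in the second coordinate, so a clustering that minimizes the weighted sum tolerates more spread along that coordinate.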
The argument data is a list of n-element measurement vectors, seeds is a list of n-element vectors denoting the initial estimates of the cluster centers, and weights, if given, is an n-element vector. The length of the argument seeds determines the number of clusters returned by KMeans.
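As a rough illustration of how seeds drive the result, the Python sketch below runs the standard assign-and-update iteration (Lloyd's algorithm) starting from the given seeds. It is not the package's implementation: the name kmeans_sketch, the max_iterations parameter, the return format, and the handling of empty clusters are all assumptions made for this example.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_sketch(data, seeds, max_iterations=15):
    """Group data around len(seeds) centers; returns (clusters, centers)."""
    centers = [tuple(s) for s in seeds]
    clusters = []
    for _ in range(max_iterations):
        # Assignment step: each point joins the cluster of its nearest center.
        clusters = [[] for _ in centers]
        for point in data:
            nearest = min(range(len(centers)),
                          key=lambda k: squared_distance(point, centers[k]))
            clusters[nearest].append(tuple(point))
        # Update step: move each center to the mean of its members.
        # An empty cluster keeps its previous center (a choice made here).
        new_centers = [
            tuple(sum(col) / len(members) for col in zip(*members))
            if members else center
            for members, center in zip(clusters, centers)
        ]
        if new_centers == centers:
            break  # assignments can no longer change
        centers = new_centers
    return clusters, centers


data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
        (9.0, 9.0), (9.0, 10.0), (10.0, 9.0)]
seeds = [(1.0, 1.0), (8.0, 8.0)]
clusters, centers = kmeans_sketch(data, seeds)
print(len(clusters))  # one cluster per seed
```

Because one cluster is created per seed, passing two seeds yields two clusters regardless of how the data are distributed.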
The default option DistanceFunction → EuclideanDistance selects the Euclidean distance metric for use in KMeans. Other common distance functions include ManhattanDistance, ChebyshevDistance, CanberraDistance, and MinkowskiDistance.
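For orientation, the Python sketch below gives the usual textbook definitions of the metrics named above. The function names are illustrative and are not the package's own functions; in particular, the exponent parameter p of minkowski and the treatment of 0/0 terms in canberra are assumptions, since the package's conventions are not described here.

```python
def euclidean(u, v):
    """Square root of the summed squared coordinate differences."""
    return sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5


def manhattan(u, v):
    """Sum of the absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(u, v))


def chebyshev(u, v):
    """Largest absolute coordinate difference."""
    return max(abs(a - b) for a, b in zip(u, v))


def canberra(u, v):
    """Sum of |a - b| / (|a| + |b|); terms that would be 0/0 are skipped."""
    return sum(abs(a - b) / (abs(a) + abs(b))
               for a, b in zip(u, v) if abs(a) + abs(b) > 0)


def minkowski(u, v, p=3):
    """p-th root of the summed p-th powers of the absolute differences."""
    return sum(abs(a - b) ** p for a, b in zip(u, v)) ** (1.0 / p)


u, v = (1.0, 2.0), (4.0, 6.0)
for metric in (euclidean, manhattan, chebyshev, canberra, minkowski):
    print(metric.__name__, metric(u, v))
```

With p = 1 the Minkowski distance reduces to the Manhattan distance and with p = 2 to the Euclidean distance, so the choice of metric mainly changes how strongly large differences in a single coordinate dominate the result.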
The option MaxRecursion → 15 limits the algorithm to at most 15 iterations.
See also User's Guide 7.5 and the tutorial Partitioning Data into Clusters.
Modified in Version 2.2.
Example
This loads the package.
In[1]:=
This defines three seeds.
In[2]:=
This generates the example data.
In[3]:=
Here is a scatter plot of the data.
In[4]:=
Out[4]=
This finds a grouping of the data that minimizes the within-cluster Euclidean distance.
In[5]:=
This plots the result using different colors for each of the clusters.
In[6]:=
Out[6]=
This computes two alternative groupings of the data.
In[7]:=
This shows the two results.
In[9]:=
Out[9]=
© 2010 Wolfram Research, Inc.