k-Means Clustering
Using k-Means Clustering in XLMiner™:
In XLMiner™, select Data Reduction and Exploration -> k-Means Clustering, enter the data range that needs to be processed, and move the variables of interest to the Selected variables box.
Click Next to open the next dialog box, which contains the following settings.

Normalize the Data: Normalizing the data ensures that the distance measure gives equal weight to each variable. Without normalization, the variable with the largest scale dominates the measure.

# Clusters: The number of final clusters to form. This is the parameter k in k-means clustering. It must be at least 2 and at most the number of observations in the data range. Set it to your best estimate of how many clusters the data contains; it is a good idea to repeat the procedure with several different values.

# Iterations: How many times the program starts from an initial partition and follows through with the clustering algorithm. The configuration of clusters (and how well they separate the data) may differ from one starting partition to another. The program runs the specified number of iterations and selects the cluster configuration that minimizes the distance measure.

Options: With Fixed start, XLMiner™ builds the model from a single fixed starting point. With Random starts, the algorithm begins from randomly chosen starting points: you specify the No. of starts, and XLMiner™ generates that many cluster sets, determines which is best, and produces the output from that best cluster set. When Random starts is selected, you can also fix the seed.

Click Next again to open the following dialog, where you select the output to be displayed.
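For readers who want to see what these settings do, here is a rough Python sketch of the ideas described above: normalizing each variable, running k-means from several random starts with a fixed seed, and keeping the run with the smallest total distance. It is not XLMiner™'s implementation. The function names, the parameters `n_starts`, `seed` and `max_iter`, the z-score normalization, the squared Euclidean distance, and the sample data are all assumptions made for illustration.

```python
import math
import random


def normalize(rows):
    """Rescale each column to mean 0 and standard deviation 1 (z-scores)."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # Population standard deviation; use 1.0 for a constant column.
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def sq_dist(a, b):
    """Squared Euclidean distance (an assumed choice of distance measure)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_once(rows, k, rng, max_iter=100):
    """One k-means run, starting from k randomly chosen rows as centers."""
    centers = [list(r) for r in rng.sample(rows, k)]
    labels = None
    for _ in range(max_iter):
        # Assign every row to its nearest center.
        new_labels = [min(range(k), key=lambda j: sq_dist(r, centers[j]))
                      for r in rows]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Move each center to the mean of its members.
        for j in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = [sum(c) / len(members) for c in zip(*members)]
    total = sum(sq_dist(r, centers[lab]) for r, lab in zip(rows, labels))
    return labels, centers, total


def kmeans_best_of(rows, k, n_starts=10, seed=12345):
    """Run k-means from n_starts random starts and keep the lowest total."""
    rng = random.Random(seed)  # fixing the seed makes the result repeatable
    runs = (kmeans_once(rows, k, rng) for _ in range(n_starts))
    return min(runs, key=lambda run: run[2])


# Made-up data: the second variable is on a much larger scale than the first.
data = [[1.0, 1000.0], [1.2, 1100.0], [0.8, 900.0],
        [5.0, 5000.0], [5.2, 5100.0], [4.8, 4900.0]]
scaled = normalize(data)
labels, centers, total = kmeans_best_of(scaled, k=2, n_starts=5, seed=12345)
print(labels, round(total, 4))
```

Because the seed is fixed, calling `kmeans_best_of` again with the same arguments gives the same cluster assignment. Changing `k` and comparing the resulting totals is one way to try several values for the number of clusters.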
Click Finish and the output will be displayed.