The Hierarchical Clustering block enables you to apply a hierarchical clustering model to a dataset.
The following example demonstrates how to use the Hierarchical Clustering block to group the observations in the basketball_players.csv dataset (which describes basketball players in a national league) into a specified number of clusters:
- Import the basketball_players.csv dataset onto a Workflow canvas using the Text File Import block.
- Expand the Model Training group in the Workflow palette, then click and drag a Hierarchical Clustering block onto the Workflow canvas.
- Click the Output port of the basketball_players dataset block and drag a connection towards the Input port of the Hierarchical Clustering block.
- Double-click the Hierarchical Clustering block to display the Clustering view along with the hierarchical clustering Preferences dialog box.
- In the hierarchical clustering Preferences dialog box:
  - In the Unselected Variables list, press and hold CTRL and select the goals_scored, height, and weight variables.
  - Click Select to move the specified variables to the Selected Independent Variables list.
  - Click OK to save the configuration and close the hierarchical clustering Preferences dialog box.
The Clustering view displays the clusters in the model.
- Close the Hierarchical Clustering view and save the configuration when prompted.
A green execution status is displayed in the Output port of the Hierarchical Clustering block and of the new Working Dataset. The Working Dataset contains the input dataset with a new variable that identifies the cluster to which each observation is allocated.
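
For readers who want to see what this kind of model does to the data, the following is a minimal Python sketch of agglomerative (bottom-up) hierarchical clustering on the three selected variables. It is not the block's implementation. The sample rows are invented stand-ins for basketball_players.csv, and the choice of standardised variables, Euclidean distance, average linkage, and two clusters are all assumptions made for illustration; the block's actual settings are not described here.

```python
import math

# Hypothetical rows standing in for basketball_players.csv (invented values).
players = [
    {"goals_scored": 30, "height": 210.0, "weight": 110.0},
    {"goals_scored": 35, "height": 208.0, "weight": 108.0},
    {"goals_scored": 28, "height": 212.0, "weight": 112.0},
    {"goals_scored": 150, "height": 185.0, "weight": 80.0},
    {"goals_scored": 160, "height": 183.0, "weight": 78.0},
    {"goals_scored": 145, "height": 187.0, "weight": 82.0},
]
FEATURES = ["goals_scored", "height", "weight"]  # the selected variables


def standardise(rows, features):
    """Scale each variable to zero mean and unit variance (an assumption)."""
    columns = []
    for name in features:
        values = [row[name] for row in rows]
        mean = sum(values) / len(values)
        sd = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values)) or 1.0
        columns.append([(v - mean) / sd for v in values])
    return [tuple(col[i] for col in columns) for i in range(len(rows))]


def average_linkage(a, b, points):
    """Mean Euclidean distance over all pairs drawn from clusters a and b."""
    total = sum(math.dist(points[i], points[j]) for i in a for j in b)
    return total / (len(a) * len(b))


def hierarchical_clusters(points, n_clusters):
    """Merge the two closest clusters until n_clusters remain."""
    clusters = [[i] for i in range(len(points))]  # start with one per observation
    while len(clusters) > n_clusters:
        _, i, j = min(
            (average_linkage(clusters[i], clusters[j], points), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    labels = [0] * len(points)
    for cluster_id, members in enumerate(clusters, start=1):
        for member in members:
            labels[member] = cluster_id
    return labels


points = standardise(players, FEATURES)
labels = hierarchical_clusters(points, n_clusters=2)

# Mirror the Working Dataset: the input rows plus a new cluster variable.
working_dataset = [dict(row, cluster=label) for row, label in zip(players, labels)]
for row in working_dataset:
    print(row)
```

As with the Working Dataset described above, each output row is the original observation with one extra variable (here named cluster, a hypothetical name) recording its cluster.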