In data mining and statistics, hierarchical clustering (also called hierarchical cluster analysis or HCA) is an unsupervised method of cluster analysis which seeks to build a hierarchy of clusters, ordered from top to bottom. It creates clusters from data using either a top-down or a bottom-up approach: it either starts with all samples in the dataset as one cluster and keeps dividing that cluster into more clusters, or it starts with each single sample as its own cluster and then merges clusters based on some criterion. Strategies for hierarchical clustering therefore generally fall into two types, which are the same procedure run in opposite directions:

Agglomerative: This is a 'bottom-up' approach.
Divisive: This is a 'top-down' approach.

The question: I would like the clusters to maximize the classification results: if one class is mainly recognized as another one, then both classes should end up in the same cluster.

I'm not sure I understand WHY you are doing this, but, based on the comment which you posted above, it seems that you'd like to cluster 10 objects ('zero', 'one', ..., 'nine') by comparing their values in your confusion matrix, generated by some other algorithm. So, looking at your data, object 'eight' and object 'nine' might be in the same cluster because they both have mostly low values and one relatively high value for the 'eight' column.

To do this, you can treat each of the 10 objects as having 10 arbitrary properties; then this is a standard setup. Perhaps Euclidean distance is appropriate to determine the distance between objects; you would know best.

It sounds like you'd like to do some hierarchical clustering, which you can do with the example below. I didn't want to type up your data by hand, so I just randomly generated a matrix. To avoid confusion I'm calling the objects 'zero' through 'nine' (spelled out) and I'm using numerals '1' through '9' as the object's properties. I'm using Euclidean distance, and the single-link agglomerative method.

Step 1: Import the necessary libraries for the hierarchical clustering.

    from scipy.cluster import hierarchy
    from scipy.spatial.distance import pdist

    Y = pdist(data.to_numpy(), metric='euclidean')
    Z = hierarchy.linkage(Y, method='single')
    ax = hierarchy.dendrogram(Z, show_contracted=True, labels=data.index.
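
The approach described above (treat each object as a row of properties, compute Euclidean distances, merge with the single-link agglomerative method) can be sketched end to end roughly as follows. This is only a minimal sketch under stated assumptions: the random matrix, the fixed seed, the column names '0' to '9', the use of `scipy.spatial.distance.pdist` for the distance step, and the cut into 3 flat clusters are all illustrative choices, not part of the original data or answer.

```python
import numpy as np
import pandas as pd
from scipy.cluster import hierarchy
from scipy.spatial.distance import pdist

names = ['zero', 'one', 'two', 'three', 'four',
         'five', 'six', 'seven', 'eight', 'nine']

# Hypothetical stand-in for the confusion matrix: random counts, fixed seed.
rng = np.random.default_rng(0)
data = pd.DataFrame(rng.integers(0, 100, size=(10, 10)),
                    index=names,
                    columns=[str(i) for i in range(10)])  # assumed column names

# Condensed vector of pairwise Euclidean distances between rows (objects).
Y = pdist(data.to_numpy(), metric='euclidean')

# Single-link agglomerative clustering; each row of Z records one merge.
Z = hierarchy.linkage(Y, method='single')

# no_plot=True returns the dendrogram layout without needing matplotlib.
dn = hierarchy.dendrogram(Z, labels=data.index.tolist(), no_plot=True)
print(dn['ivl'])  # leaf labels in dendrogram order

# Cut the tree into at most 3 flat clusters (3 is an arbitrary choice).
flat = hierarchy.fcluster(Z, t=3, criterion='maxclust')
print(dict(zip(names, flat)))
```

With a real confusion matrix you would load it into `data` instead of generating random values. Dropping `no_plot=True` draws the tree, provided matplotlib is installed.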